At Catalysts we often hear from customers that high availability and fault tolerance are important requirements for their applications.
When a product is used almost around the clock by many people, and any interruption keeps them from continuing their work, availability is obviously especially important.
Given any kind of web-enabled business application and the requirement of being as available as possible, what can we do?
The general solution
Before jumping to any technical solutions, focus on what you already have and make sure that it is solid.
In the shortest possible form that means:
- Employ safe coding practices.
- Keep track of your resource consumption (resource pools, etc.).
If you want to know more about what I mean, read the book “Release It!” by Michael Nygard.
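To make the second point a bit more concrete, here is a minimal sketch of a bounded resource pool that refuses to hand out more resources than it owns and keeps a counter of how many are in use. The class name `BoundedPool`, the `borrow` method, and the `conn-1`/`conn-2` placeholders are hypothetical names chosen for illustration, not taken from any particular library.

```python
import threading
from contextlib import contextmanager


class BoundedPool:
    """Hypothetical pool: a fixed set of resources plus a usage counter."""

    def __init__(self, resources):
        self._free = list(resources)
        self._lock = threading.Lock()
        # One semaphore slot per resource, so borrowers block when the pool is empty.
        self._slots = threading.BoundedSemaphore(len(self._free))
        self.in_use = 0

    @contextmanager
    def borrow(self, timeout=1.0):
        # Fail fast instead of waiting forever when the pool is exhausted.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("pool exhausted")
        with self._lock:
            resource = self._free.pop()
            self.in_use += 1
        try:
            yield resource
        finally:
            # Always hand the resource back, even if the caller raised.
            with self._lock:
                self._free.append(resource)
                self.in_use -= 1
            self._slots.release()


pool = BoundedPool(["conn-1", "conn-2"])
with pool.borrow() as a:
    with pool.borrow() as b:
        peak = pool.in_use
        try:
            with pool.borrow(timeout=0.01):
                exhausted = False
        except TimeoutError:
            exhausted = True
after = pool.in_use
```

The timeout is the important design choice here: a caller that cannot get a resource gets an error quickly instead of hanging, and the `in_use` counter is something you could expose to monitoring.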
With the basics out of the way, let's say that we want to prevent the following two things from causing downtime:
- The server / VM which our application is running on crashes or has to be rebooted.
- We have to deploy a new version of our software.
A common solution strategy is to cluster the application: set up multiple machines that each run the application, and put a load balancer in front of them that can route traffic to any available instance.
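The routing idea can be sketched in a few lines. The following is only an illustration of the principle, assuming a simple round-robin strategy and a `healthy` flag per node; real load balancers determine health via checks against the instances, and the names `Node`, `RoundRobinBalancer`, `app-1` and `app-2` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Hands out healthy nodes in rotation and skips nodes marked as down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0

    def pick(self):
        # Try each node at most once, starting where the last call stopped.
        for offset in range(len(self.nodes)):
            index = (self._next + offset) % len(self.nodes)
            node = self.nodes[index]
            if node.healthy:
                self._next = (index + 1) % len(self.nodes)
                return node
        raise RuntimeError("no healthy node available")


nodes = [Node("app-1"), Node("app-2")]
balancer = RoundRobinBalancer(nodes)

# With both nodes up, requests alternate between them.
first = [balancer.pick().name for _ in range(4)]

# Simulate a crash of app-1, or taking it out of rotation for a deployment.
nodes[0].healthy = False
during_outage = [balancer.pick().name for _ in range(3)]
```

While `app-1` is marked as down, every request goes to `app-2`, which covers both scenarios from the list above: a crashed machine and a node taken out of rotation while a new version is deployed.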
Taking a peek at what our target is, here is a common setup of what we want to build:
Coming back to the previous argument: if you haven't kept it simple and have built complex caching and coordination logic into the application that relies on only one instance ever running, you now have to refactor that code into something that behaves like a team player.
Now that we have multiple nodes, we need some way to exchange transient data (such as session information) and maybe do some coordination between these instances. One frequently used technology for this is Redis. Many of the biggest websites on the internet, such as Stack Overflow or GitHub, use it.
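As a rough sketch of what "exchanging session information" means in practice, the example below keeps sessions in a store that all nodes share instead of in each node's memory. To stay runnable without a server, it uses a hypothetical `InMemoryStore` as a stand-in; its `setex(key, ttl_seconds, value)` and `get(key)` methods are modelled on the commands of the same name in Redis, and swapping in a real Redis client is left as an assumption. `SessionRepository` and the 30-minute timeout are likewise invented for illustration.

```python
import json
import time


class InMemoryStore:
    """Stand-in for a shared key-value store with expiring keys."""

    def __init__(self):
        self._data = {}

    def setex(self, key, ttl_seconds, value):
        # Store the value together with its expiry time.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value


class SessionRepository:
    """What each application node would use instead of local session state."""

    def __init__(self, store, ttl_seconds=1800):
        self.store = store
        self.ttl_seconds = ttl_seconds

    def save(self, session_id, data):
        self.store.setex(f"session:{session_id}", self.ttl_seconds, json.dumps(data))

    def load(self, session_id):
        raw = self.store.get(f"session:{session_id}")
        return json.loads(raw) if raw is not None else None


# Two nodes, one shared store: a session written by one node is visible to the other.
shared_store = InMemoryStore()
node_a = SessionRepository(shared_store)
node_b = SessionRepository(shared_store)

node_a.save("abc123", {"user": "alice"})
seen_by_b = node_b.load("abc123")
missing = node_b.load("unknown")
```

Because no node holds the session itself, the load balancer is free to send a user's next request to any instance, and a node can be restarted without logging anyone out.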
In part 2 we will focus on the technical details and some of the challenges.