Resilience
If your business depends on IT systems, it is worth spending a bit of time and money planning for the day something goes wrong with your hardware. What would it cost your business to be without email for a week, or for your e-commerce website to be unavailable just before Christmas? Node4 offer a range of resiliency options to ensure that your business keeps going even if the worst happens. Let’s look at some of the options:
Website Load Balancing
Website load balancing does what it says: it spreads the load between 2 or more web servers so that more people can use your website at once. It also allows you to take a web server offline (for maintenance or because of a fault) and still serve pages to your customers.
What you need:
- 2 or more webservers
- Cisco 11500 content switch (or 2 for failover)
Load balancing is only suited to “static” content, like web pages, because you can’t access shared storage from 2 or more servers at the same time. If you want to provide resilience for dynamic content (like a database), you need to use server clustering (see below).
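To illustrate the idea, here is a minimal Python sketch of round-robin load balancing. It is not how the Cisco content switch is implemented; the class name, method names and server names (`web1`, `web2`) are hypothetical and chosen only for the example.

```python
class RoundRobinBalancer:
    """Hands each request to the next available web server in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.offline = set()   # servers taken out for maintenance or faults
        self._next = 0         # position of the next server to try

    def take_offline(self, server):
        self.offline.add(server)

    def bring_online(self, server):
        self.offline.discard(server)

    def pick(self):
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.offline:
                return server
        raise RuntimeError("no web servers available")


lb = RoundRobinBalancer(["web1", "web2"])
print([lb.pick() for _ in range(4)])  # alternates between web1 and web2
lb.take_offline("web1")               # e.g. for maintenance
print([lb.pick() for _ in range(2)])  # all requests now go to web2
```

With one server offline the site stays up, but the remaining server carries the whole load.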
Server Clustering
Clustering is a common way of providing a high availability environment. The most frequently used form of clustering is called active-passive. In this scenario, you use 2 identical servers (called nodes) and a shared storage device (a SAN). Data is stored on the SAN and the application runs on one node (the active node). If that server fails, the cluster “fails over” to the second, passive node. You can have multiple nodes and even multiple SANs, making clustering a highly scalable solution. Because only one server is accessing the data at any one time, you can use clustering for dynamic content like databases, but you don’t get the load balancing benefits described above.
What you need:
- 2 or more servers running Windows 2003 enterprise edition
- SAN device
- SQL server 2000 / 2005 enterprise
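The active-passive behaviour can be sketched in a few lines of Python. This is a simplified model of the failover decision only, not Windows clustering itself; the class and node names are hypothetical, and the shared SAN is assumed rather than modelled.

```python
class ActivePassiveCluster:
    """Only one node runs the application; the others wait as standbys."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self.active = self.nodes[0]   # the first node starts as the active node

    def report_failure(self, node):
        self.healthy.discard(node)
        if node == self.active:
            self._fail_over()

    def _fail_over(self):
        # Promote the first healthy standby; None means the cluster is down.
        for node in self.nodes:
            if node in self.healthy:
                self.active = node
                return
        self.active = None


cluster = ActivePassiveCluster(["node-a", "node-b"])
cluster.report_failure("node-a")
print(cluster.active)  # node-b has taken over
```

Because the data lives on the SAN rather than on either node, the standby can pick up where the failed node left off.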
SQL Mirroring
Mirroring is a relatively new development and is similar to clustering, but works on an individual SQL database rather than at the operating system level. With mirroring, every transaction that happens on one database is mirrored in near real time to the other. If the primary database fails, the mirror takes over almost instantly. If your application supports mirroring properly, end users will notice very little interruption in service. Mirroring can be a useful way of providing high availability for databases without the cost of clustering.
What you need:
- 2 SQL Servers running SQL2005 SP1 or later
- We recommend a DL360 or higher for SQL server applications
Failover firewalls
In a similar way to servers, you can cluster firewalls so that in the event of one failing, the second will take over until you fix the other one. A firewall failure will knock out everything to do with internet access on your network, so it makes sense to invest in resilience for this critical component.
What you need:
- Cisco ASA5510 bundle in failover mode
HSRP (Hot Standby Router Protocol)
HSRP is a service offered by Node4 to provide resilience in our core routers. We provide you with 2 physical connections to the internet and a virtual gateway address. In the unlikely event of a failure in one of our routers, the other router will take over with minimal interruption to your network connection.
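As a rough illustration of how a virtual gateway stays reachable, the sketch below models the standard HSRP rule that the router with the highest priority becomes active, with the highest IP address breaking ties. The router names, addresses and priorities are hypothetical examples, not Node4's actual configuration.

```python
from dataclasses import dataclass
import ipaddress


@dataclass
class Router:
    name: str
    ip: str
    priority: int = 100   # 100 is the HSRP default priority
    up: bool = True


def active_router(routers):
    """Return the router that should answer for the virtual gateway address."""
    candidates = [r for r in routers if r.up]
    if not candidates:
        return None
    # Highest priority wins; the highest IP address breaks a tie.
    return max(candidates, key=lambda r: (r.priority, ipaddress.ip_address(r.ip)))


core_a = Router("core-a", "192.0.2.2", priority=110)
core_b = Router("core-b", "192.0.2.3")
print(active_router([core_a, core_b]).name)  # core-a
core_a.up = False                            # simulate a router failure
print(active_router([core_a, core_b]).name)  # core-b
```

Your equipment keeps pointing at the same virtual gateway address throughout; only the router answering for it changes.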
Dual power feeds
If your server has dual power supplies, it makes sense to run them from separate power feeds. Node4 can provide A and B feeds into your rack with separate PDUs.
That way you are protected from a failure in the server power supplies, the PDU or our main supply.
Hardware RAID
Hard disks have moving parts and, as such, can be prone to failure. A failed hard disk can be disastrous for your business if it contains your accounts or customer database. Regular backups are one solution, but you may still lose data and time whilst the backups are restored. RAID provides a way of spreading your data over multiple hard drives so that if one drive (or, with some RAID levels, more than one) fails, things carry on as normal until you get a chance to replace it.
The two most common types of RAID configuration are RAID1 (mirroring) and RAID5 (striping with parity). In RAID1, data is written to 2 hard drives at the same time and can be read from either, so if one fails, the other continues by itself. In RAID5, data is spread over 3 or more drives in such a way that if any one drive fails, the other drives contain enough data to carry on as normal. Adding more drives to a RAID5 array increases capacity but not fault tolerance: the array can still only survive the loss of one drive at a time.
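The parity idea behind RAID5 can be shown with a short Python sketch. It is a simplification: a real controller works on fixed-size stripes and rotates the parity between drives, and the block contents and function name here are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks on two drives, with their parity stored on a third drive.
drive_1 = b"ACCOUNTS"
drive_2 = b"CUSTOMER"
parity = xor_blocks([drive_1, drive_2])

# Drive 2 fails: its contents are rebuilt from the surviving drive and the parity.
rebuilt = xor_blocks([drive_1, parity])
print(rebuilt == drive_2)  # True
```

The same calculation works whichever single drive is lost, which is why the array keeps running while the failed drive is replaced.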
RAID also gives performance improvements, as data can be read more quickly from multiple drives, but you do lose some usable disk space as a trade-off (50% in RAID1, 33% in a 3-disk RAID5 array).
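The space figures come from simple arithmetic, sketched below in Python. The function name is hypothetical, and the sketch assumes all drives in the array are the same size.

```python
def redundancy_overhead(level, drives):
    """Fraction of raw disk space given up to redundancy, for equal-sized drives."""
    if level == 1:
        if drives < 2:
            raise ValueError("RAID1 needs at least 2 drives")
        return 1 - 1 / drives   # every drive holds a full copy of the data
    if level == 5:
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return 1 / drives       # one drive's worth of space holds parity
    raise ValueError("only RAID1 and RAID5 are covered in this sketch")


print(f"{redundancy_overhead(1, 2):.0%}")  # 50%
print(f"{redundancy_overhead(5, 3):.0%}")  # 33%
print(f"{redundancy_overhead(5, 6):.0%}")  # 17%
```

In RAID5 the share of space lost to parity shrinks as drives are added.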
Other high availability RAID options are available on some of our SAN storage devices, with redundant RAID controllers and power built in, meaning that there is no single point of failure.
What you need:
- All our recommended servers support at least RAID1, with RAID5 being available on DL360 and above.