Microsoft Clustering Basics
Microsoft Cluster Server
Microsoft Cluster Server ("MSCS") is clustering software that first shipped with Microsoft Windows NT Server - Enterprise Edition. MSCS 1.0 (codenamed "Wolfpack") was released in 1997. Since then, MSCS has been upgraded to version 1.1 in Windows 2000 Advanced Server and Datacenter Server and to version 1.2 in Windows Server 2003 Enterprise Edition and Datacenter Edition.
MSCS supports cluster nodes, which are specially linked servers running the Cluster Service. MSCS does its main work when one server in a cluster fails or is taken offline: the other server in the cluster takes over the failed server’s operations. Clients using server resources experience little or no interruption of their work because the resource functions move from one server to the other. The primary purpose of clustering is to provide failover and reinstantiation of services and resources, thereby increasing the availability of those services (e.g., messaging, database, file and print).
MSCS consists of two main components: the clustering software and the management tools, namely the Cluster Administrator (cluadmin.exe, a GUI) and cluster.exe (a command-line management tool). The clustering software enables the two servers of a cluster to exchange specific types of messages that trigger the transfer of resources at the appropriate times. The clustering software itself has two primary components: the Cluster Service and the Resource Monitor. The Cluster Service runs on each cluster server. It controls cluster activity, communication between cluster servers, and failure operations. The Resource Monitor handles communication between the Cluster Service and the application resources. The Cluster Administrator is a graphical application that is used to manage a cluster. It runs on any version of NT (Server or Workstation) that has Service Pack 3 or later installed, as well as on Windows 2000, Windows XP, and Windows Server 2003.
In MSCS, a cluster is a configuration of two nodes, each of which is an independent computer system. Together, these independent servers create a "server cluster." The cluster appears to users as a single server. For MSCS, both nodes must be running NT Server - Enterprise Edition, Windows 2000 Advanced Server or Datacenter Server, or Windows Server 2003 Enterprise Edition or Datacenter Edition. The network applications, data files, and other tools you install on the nodes are the cluster resources, which provide services to network clients. A resource is hosted on only one node at any time. The figure below shows the relationship between nodes, groups, and resources.

Picture Source: Microsoft Cluster Server Administrator's Guide
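The relationship between nodes, groups, and resources can also be sketched in a few lines of Python. This is a toy model under stated assumptions, not the MSCS API: the names `Group`, `Cluster`, and `fail_node`, as well as the node and resource names, are hypothetical, and real failover also involves arbitration and the Resource Monitor.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Group:
    """A unit of failover: a named set of resources hosted on one node."""
    name: str
    resources: List[str]
    owner: str  # name of the node currently hosting the group


@dataclass
class Cluster:
    """Toy two-node cluster (hypothetical model, not the MSCS API)."""
    nodes: Dict[str, bool]  # node name -> is the node online?
    groups: List[Group] = field(default_factory=list)

    def fail_node(self, failed: str) -> None:
        """Mark a node offline and move its groups to a surviving node."""
        self.nodes[failed] = False
        survivors = [name for name, up in self.nodes.items() if up]
        if not survivors:
            raise RuntimeError("no surviving node to take over")
        for group in self.groups:
            if group.owner == failed:
                # The whole group moves as a unit; its resources stay together.
                group.owner = survivors[0]


cluster = Cluster(
    nodes={"NODE1": True, "NODE2": True},
    groups=[
        Group("File Share Group", ["Disk F:", "IP Address", "Network Name"], "NODE1"),
        Group("Print Group", ["Disk G:", "Print Spooler"], "NODE2"),
    ],
)
cluster.fail_node("NODE1")
owners = {group.name: group.owner for group in cluster.groups}
print(owners)
```

Each group has exactly one owner at a time, which mirrors the rule that a resource is hosted on only one node at any time.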
Windows Cluster Terminology
Clustering introduces several new terms which should be thoroughly understood before clusters of any kind are implemented.
Node. The term used to refer to a server that is a member of a cluster.
Resource. A hardware or software component that exists in a cluster, such as a disk, an IP address, a network name, or an instance of an Exchange 2000 component.
Group. A combination of resources that are managed as a unit of failover. Groups are also known as resource groups or failover groups.
Dependency. A relationship between two or more resources in the cluster architecture in which one resource relies on another being online before it can start. You’ll need to understand cluster resource dependencies when installing a cluster.
Failover/failback. The process of moving resources from one server to another. Failover can occur when one server experiences a failure of some sort or when you, the administrator, initiate a proactive failover. Failback moves the resources back to the original server once it is available again.
Quorum resource. This is a special type of cluster resource that provides persistent arbitration mechanisms by allowing one node to gain control of it and then defending that node’s control. In addition, it provides physical storage that can be accessed by any node in the cluster (only one node can access the quorum at any given time). The quorum also maintains access to the most current version of the cluster database, and if a failure occurs, the quorum writes the changes to the cluster database.
Heartbeat. The network and Remote Procedure Call (RPC) traffic that flows between servers in a cluster. Windows 2000 and Windows 2003 clusters communicate by using RPC calls on IP sockets with User Datagram Protocol (UDP) packets. Heartbeats are single UDP packets sent between the nodes every 1.2 seconds. These packets are used to confirm that each node’s network interface is still active.
Membership. This term is used to describe the orderly addition and removal of active nodes to and from the cluster.
Global update. This term refers to the propagation of cluster configuration changes to all members. The cluster registry is maintained through this mechanism.
Cluster registry. Inside the Windows 2000 registry is the cluster registry—also known as the cluster database. This maintains configuration information on each member in the cluster, as well as on resources and parameters. This information is stored on the quorum resource.
Virtual server. A virtual server is a combination of configuration information and cluster resources, such as an IP address, network name and application resource.
Active/Active. From a software perspective, this describes applications (or resources) that can exist as multiple instances in a cluster. This means that both nodes can be actively servicing clients.
Active/Passive. This term describes applications that run as a single instance in a cluster. This generally also means that one node sits idle until a failover occurs. However, you can have an Active/Passive implementation of an application in an Active/Active cluster. An example of this would be a cluster that contains clustered file and print sharing resources and a single Exchange or SQL virtual server.
Shared storage. This refers to the external SCSI or Fibre Channel storage enclosure and the disks it contains. Shared storage is a requirement for multi-node clusters. Although this storage is shared, only one node can access an external storage resource at any given time.
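Two of the terms above, Dependency and Heartbeat, lend themselves to a short Python sketch. It is illustrative only. The function names `online_order` and `node_is_down`, the example dependency tree, and the `MISSED_LIMIT` threshold are assumptions made for this example, not MSCS constants or APIs. Only the 1.2-second heartbeat interval comes from the definition above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

HEARTBEAT_INTERVAL = 1.2  # seconds, from the Heartbeat definition above
MISSED_LIMIT = 5          # assumption: illustrative threshold, not an MSCS value


def online_order(depends_on):
    """Order resources so each one comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


def node_is_down(last_heartbeat, now):
    """Treat a node as down once MISSED_LIMIT intervals pass with no heartbeat."""
    return (now - last_heartbeat) > HEARTBEAT_INTERVAL * MISSED_LIMIT


# Hypothetical dependency tree for a file-share virtual server:
# each key lists the resources that must be online before it.
deps = {
    "Physical Disk": [],
    "IP Address": [],
    "Network Name": ["IP Address"],
    "File Share": ["Network Name", "Physical Disk"],
}
order = online_order(deps)
print(order)

# Timestamps are plain seconds; no sockets are opened in this sketch.
print(node_is_down(last_heartbeat=100.0, now=103.0))  # 3.0 s of silence
print(node_is_down(last_heartbeat=100.0, now=107.0))  # 7.0 s of silence
```

Bringing resources online in dependency order is why a network name, for example, cannot start before the IP address it relies on. The heartbeat check shows the general idea of detecting a failed node from missing packets, with the threshold chosen arbitrarily here.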