New Features of Windows Server 2012 Failover Clustering
Windows IT Pro
Aug 22, 2012
Prepare for easier management, increased scalability, and
more flexibility
In “Troubleshooting Windows Server 2008 R2 Failover Clusters,” I discussed troubleshooting failover clusters—specifically, where to find the data you need in order to troubleshoot a problem. The Microsoft Program Management Team looked at quite a few of the top problems and worked to address them in Windows Server 2012 Failover Clustering. So this month, I’ll talk about the new features and functionality of Server 2012 Failover Clustering. The changes to failover clustering offer easier management, increased scalability, and more flexibility.
Scalability Limits
One of the first things to talk about is scalability limits. With Server 2012 clustering, you now have a maximum of 64 nodes per cluster. If you’re running a Hyper-V cluster with highly available virtual machines (VMs), the limit has increased to 4,000 VMs per cluster and 1,024 VMs per node. To help you manage at these increased limits, Server Manager has been bolstered with the ability to discover clusters and manage them remotely. When you’ve configured a cluster, Server Manager shows all the nodes, the name of the cluster, and any VMs on the cluster, and indicates where remote management can be accomplished, as Figure 1 shows. With this capability, you can enable additional roles and features remotely.
Figure 1: Configuring remote management
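For example, to add the Failover Clustering feature to a remote node from PowerShell, a command along the following lines should work (a minimal sketch; the node name is just an illustration):
Install-WindowsFeature -Name Failover-Clustering -ComputerName WIN2012-Node1 -IncludeManagementTools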
A New Level of AD Integration
When you’re creating a cluster, you’ll see a new level of intelligence about where the cluster creates its object in Active Directory (AD). When you allow the cluster to create the object, it detects the organizational unit (OU) where the nodes reside and creates the object in the same OU. It will still use the logged-on user account to create the Cluster Name Object (CNO), so this account needs to have Read and Create permissions on this OU. If you want to bypass this detection, or place the object in a separate OU, you can specify that during the creation. For example, if I want to place the cluster name in an OU called Cluster, during creation I would input the data that Figure 2 shows.
Figure 2: Placing the cluster name in an OU called Cluster
If
you’re doing it through PowerShell, the command would be
New-Cluster "CN=WIN2012-CLUSTER,OU=Cluster,DC=Contoso,DC=com" -Node WIN2012-Node1,WIN2012-Node2,WIN2012-Node3,WIN2012-Node4
Quorum Configuration
The
quorum configuration has been simplified, and a new dynamic quorum model is
available—now the default when you’re creating a cluster. You can also manually
remove nodes from participating in the voting. When you go through the
Configure Cluster Quorum Wizard, you’re provided with three options:
· Use typical settings (recommended)—The cluster determines quorum management options and, if necessary, selects the quorum witness.
· Add or change the quorum witness—You can select the quorum witness; the cluster determines quorum management options.
· Advanced quorum configuration and witness selection—You determine the quorum management options and the quorum witness.
When you choose the typical settings, the wizard will select the quorum type as dynamic. With a dynamic quorum, the number of votes changes depending on the number of participating nodes. The way it works is that, to keep a cluster up, you must have a quorum, or majority, of votes. Each node that participates in the cluster is a vote. If you have also chosen to have a witness disk or share, that’s an additional vote. To keep the cluster going, more than half of the votes must remain online. You can use the formula (total votes + 1)/2, rounded up. Say I have nine total nodes in a cluster without a witness disk. Using the formula, (9+1)/2 = 5, so five votes are needed to keep the cluster up.
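To see how the votes are currently assigned, you can query the node properties from PowerShell. This is a minimal sketch; I’m assuming the DynamicWeight property, which reflects a node’s vote as adjusted by dynamic quorum:
Get-ClusterNode | Format-Table Name, NodeWeight, DynamicWeight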
So, for example, consider what occurs with a Server 2008 R2 cluster versus a Server 2012 cluster. In a Server 2008 R2 cluster with the same nine nodes, I have nine total votes and need five votes (nodes) to remain online to keep the cluster up. If there are only four nodes up, the cluster service will terminate because there aren’t enough remaining votes, and the administrator would need to take manual action to force the cluster to start and get back to production. In a new Server 2012 failover cluster, when a node goes down, the number of votes needed to remain up also dynamically goes down. With my nine nodes (votes), if one node (vote) goes down, the total vote count becomes eight. If another two nodes go down, the vote count becomes six. The Server 2012 cluster will continue running and stay in production without intervention needed. Dynamic quorum is the default and recommended quorum configuration for Server 2012 clusters.
Figure 3: Changing the quorum configuration
To change the quorum witness configuration, you can right-click the name of the cluster in the far left pane, choose More Actions, and select Configure Cluster Quorum Settings, as Figure 3 shows. The wizard will let you set a disk witness, set a file share witness, or leave the configuration as dynamic. If you choose the advanced settings, one of the first things you’ll determine is which nodes actually have a vote in the cluster, as Figure 4 shows. By default, all nodes participate with a vote to achieve quorum; if you de-select a node, it won’t have a vote. Using the earlier example of nine nodes, removing one vote leaves only eight voting members, so for a Server 2008 R2 cluster a witness disk or share would need to be added. In both Server 2008 R2 and Server 2012 clusters, if the non-voting node is the only node left, the cluster service will stop and manual intervention will be necessary.
Figure 4: Determining which cluster nodes have a vote
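You can also remove a node’s vote from PowerShell by setting its NodeWeight property to 0. A minimal sketch, with the node name purely as an illustration:
(Get-ClusterNode -Name "WIN2012-Node4").NodeWeight = 0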
When you’re going through the Configure Cluster Quorum Wizard, the next screen shows the option to select or de-select dynamic quorum management, as Figure 5 shows. As you can see, the default action is selected and is also recommended. If you want to change the quorum configuration to add a witness disk or share, the next screen in the wizard, Select Quorum Witness, is where you can choose the witness disk or share to use.
Figure 5: The Configure Quorum Management page
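The same change can be made with the Set-ClusterQuorum cmdlet. For example, to configure a file share witness, a command along these lines should work (the share path is just an illustration):
Set-ClusterQuorum -NodeAndFileShareMajority \\FileServer\Witness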
Cluster Validation
There are some new cluster-validation enhancements. One of the big benefits is that the storage tests run significantly faster. The storage tests check things such as which nodes can see the drives, fail the drives over individually and as groups to all nodes, verify that a drive can be taken away from each node by the other nodes, and so on. In Server 2008 R2 failover clusters, if you had a large number of disks, the storage tests took a lot of time to complete. With Server 2012 cluster validation, the tests have been streamlined in their execution and complete much faster. A new option for the storage tests is that you can target specific LUNs to run the tests against, as you see in Figure 6. If you want to test a single LUN or a specific set of LUNs, just select the ones you want. There are also new tests for Cluster Shared Volumes (CSV) as well as for Hyper-V and the VMs. These tests check that your networking is configured with the recommended settings, that network connectivity can be made between machines, that quick and live migrations are set up to work, that the same virtual switches are created on all nodes, and so on.
Figure 6: Reviewing your storage status
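You can run a targeted validation from PowerShell as well. This is a minimal sketch, assuming the -Disk parameter that Server 2012 adds for limiting the storage tests to particular disks; the disk resource name is just an illustration:
Test-Cluster -Disk "Cluster Disk 2" -Include Storage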
Cluster Virtual Machine Monitoring
When
you’re running highly available VMs with the Hyper-V role in a cluster, you can
take advantage of a new feature called Virtual Machine Monitoring. With this
new monitoring, you can actually have Failover Clustering monitor specific
services within the VM and react if there is a problem with a service. For
example, if you’re running a VM that provides print services, you can monitor
the Print Spooler service. To set this up, you can:
1. Right-click the VM in Failover Clustering.
2. Choose More Actions.
3. Choose Configure Monitoring.
4. Choose the service or services that you would like to monitor.
If
you want to set this up with PowerShell, the command would be
Add-ClusterVMMonitoredItem -VirtualMachine "VM Name" -Service Spooler
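To confirm what’s being monitored afterward, Get-ClusterVMMonitoredItem should show the entry (using the same placeholder VM name):
Get-ClusterVMMonitoredItem -VirtualMachine "VM Name"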
Failover Clustering will then “monitor” the VM and service through periodic health checks. If it determines that the monitored service is unhealthy, it will consider the VM to be in a critical state and will first log the following event on the host:
Event ID: 1250
Source: FailoverClustering
Description: Cluster Resource “Virtual Machine Name” in clustered role “Virtual Machine Name” has received a critical state notification. For a virtual machine this indicates that an application or service inside the virtual machine is in an unhealthy state. Verify the functionality of the service or application being monitored within the virtual machine.
It will then restart the VM (a forced but graceful shutdown) on the host that it’s currently running on. If the service fails again, the VM will be moved to another node and started there. Virtual Machine Monitoring gives you finer granularity in the kind of monitoring you want to have for your VMs. It also brings the added benefit of additional health checking, as well as availability. Without Virtual Machine Monitoring, if a particular service has a problem, it would continue in that state and user intervention would be required to get it back up.
Cluster Aware Updating
Cluster
Aware Updating (CAU) is new to Server 2012 Failover Clustering. This feature
automates software updating (security patches) while maintaining availability.
With CAU, you have the following actions available:
· Apply updates to this cluster
· Preview updates to this cluster
· Create or modify the Updating Run Profile
· Generate a report on past Updating Runs
· Configure cluster self-updating options
· Analyze cluster updating readiness
CAU
will work in conjunction with your existing Windows Update Agent (WUA) and
Windows Server Update Services (WSUS) infrastructures to apply important
Microsoft updates. When CAU begins to update, it will go through the following
steps:
1. Put each node of the cluster into node maintenance mode.
2. Move the clustered roles off the node. In the case of
highly available VMs, it will perform a live migration of the VMs.
3. Install updates and any dependent updates.
4. Perform a reboot of the node, if necessary.
5. Bring the node out of maintenance mode.
6. Restore the clustered roles on the node.
7. Move to update the next node.
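To kick off an updating run from PowerShell, the CAU cmdlets can be used. A minimal sketch, reusing the example cluster name from earlier:
Invoke-CauRun -ClusterName WIN2012-CLUSTER -Force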
You
can start CAU from Server Manager, Failover Cluster Manager, or a remote
machine. The recommendations and considerations for setting this up are as follows:
· Don’t configure the nodes for automatic updating either from Windows Update or a WSUS server.
· All cluster nodes should be uniformly configured to use the same update source (WSUS server, Windows Update, or Microsoft Update).
· If you update using Microsoft System Center Configuration Manager 2007 and Microsoft System Center Virtual Machine Manager 2008, exclude cluster nodes from all required or automatic updates.
· If you use internal software distribution servers (e.g., WSUS servers) to contain and deploy the updates, ensure that those servers correctly identify the approved updates for the cluster nodes.
· Review any preferred owner settings for clustered roles. Configure these settings so that when the software-update process is complete, the clustered roles will be distributed across the cluster nodes.
Alternative Connections
In previous versions, the only way you could connect to a share was with the Client Access Point (the network name in the group), because the shares were scoped to only this name. More information about this behavior is explained in the blog post “File Share ‘Scoping’ in Windows Server 2008 Failover Clusters.” Having only one way to connect limited administrators and, in some cases, made server consolidations more difficult and time consuming because additional steps needed to be taken into consideration—which, in turn, led to longer downtimes to perform the consolidations. Because of this, Server 2012 Failover Clustering now gives you the ability to connect to shares via the virtual network name, the virtual IP address, or a CNAME that is created in DNS. One caveat is that when you use a CNAME, additional configuration is needed on the network name. For example, suppose you had a file share with the clustered network name TXFILESERVER and you wanted to set up a CNAME of TEXAS in DNS to connect. Through PowerShell, you would execute
Get-ClusterResource "TXFILESERVER" | Set-ClusterParameter Aliases TEXAS
When this is done, you must take the network name offline and bring it back online before the change will take effect and the name will answer connections.
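You can cycle the resource from PowerShell as well. A minimal sketch (note that taking the name offline briefly interrupts any connections to it):
Stop-ClusterResource "TXFILESERVER"
Start-ClusterResource "TXFILESERVER"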
You need to consider the repercussions of connecting via the IP address or alias. When you connect via these methods, Kerberos won’t be the authentication method used; the connection drops to NTLM security. So, although connecting via these alternative methods does bring flexibility, the security trade-off of NTLM authentication must be taken into consideration.
CSV Updates
CSV has been updated with the following capabilities. These features provide easier setup, support for broader workloads, enhanced security and performance in a wider variety of deployments, and greater availability.
· Storage capabilities for Scale-Out File Servers (more on this later), not just highly available VMs
· A new CSV Proxy File System (CSVFS) to provide a single, consistent filename space
· Support for BitLocker Drive Encryption
· Direct I/O for file data access, enhancing VM creation and copy performance
· Removal of external authentication dependencies for when a domain controller (DC) might not be available
· Integration with the new Server Message Block (SMB) 3.0 protocol to support file servers, Hyper-V VMs, and applications such as SQL Server
· Use of SMB Multichannel and SMB Direct to allow CSV traffic to stream across multiple networks and to use network adapters that support Remote Direct Memory Access (RDMA)
· The ability to scan and correct volumes with zero offline time, as NTFS identifies, logs, and repairs issues without affecting the availability of CSV drives
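Adding a clustered disk to CSV remains a one-liner in PowerShell. A minimal sketch, with the disk resource name purely as an illustration:
Add-ClusterSharedVolume -Name "Cluster Disk 1"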
Scale-Out File Servers
A Scale-Out File Server hosts continuously available and scalable storage by using the SMB 3.0 protocol, and it utilizes CSV for the storage. Benefits of Scale-Out File Servers include the following:
· Provides active-active file shares in which all nodes accept and serve SMB client requests. This functionality provides transparent failover to other cluster nodes during planned maintenance and unplanned failures.
· Increases the total bandwidth to that of all the file server nodes combined. There is no longer the bandwidth concern of all network client connections going to a single node; instead, a Scale-Out File Server can transparently move a client connection to another node and continue servicing that client without any network disruption. The limit for a Scale-Out File Server at this time is eight nodes.
· CSV takes the improved Chkdsk times a step further by eliminating the offline phase. With CSVFS, you can run Chkdsk without affecting applications.
· Another new CSV feature is CSV Cache, which can improve performance in some scenarios, such as Virtual Desktop Infrastructure (VDI). CSV Cache allows you to allocate system memory (RAM) as a write-through cache. This provides caching of read-only unbuffered I/O that isn’t cached by the Windows Cache Manager. CSV Cache boosts the performance of read requests, with write-through so that write requests aren’t cached. (See the sketch after this list for how to configure it.)
· Eases management tasks. It’s no longer necessary to create multiple clustered file servers, where multiple disks and placement strategies are needed.
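Configuring CSV Cache is done through cluster properties. This is a minimal sketch, assuming the Server 2012 property names SharedVolumeBlockCacheSizeInMB (the cluster-wide cache size) and CsvEnableBlockCache (enables the cache on a specific CSV disk); the disk name is just an illustration:
# Allocate 512MB of RAM on each node for the CSV cache
(Get-Cluster).SharedVolumeBlockCacheSizeInMB = 512
# Enable the cache on a particular CSV disk
Get-ClusterSharedVolume "Cluster Disk 1" | Set-ClusterParameter CsvEnableBlockCache 1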
Scale-Out File Servers are ideal for SQL Server and Hyper-V configurations. The design behind Scale-Out File Servers targets applications that keep files open for long periods of time and perform mostly data operations. A SQL Server database or a Hyper-V VM .vhd file performs a lot of data operations (changes to the file itself) but doesn’t perform a lot of actual metadata updates. A Scale-Out File Server shouldn’t be used as a user data share, where the workload has a high number of NTFS metadata updates. With NTFS, metadata updates are operations that make changes to the file system of the drive, such as opening and closing files, creating new files, renaming existing files, and deleting files.
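Creating the role itself is straightforward from PowerShell. A minimal sketch; the role name SOFS is just an illustration:
Add-ClusterScaleOutFileServerRole -Name SOFS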
Always Improving
There are further enhancements, but I just don’t have the space to list them all. Our Microsoft Program Management Team has worked hard and listened to users about the features they’ve been wanting from failover clusters, and the team has delivered. We’ve also taken some of the top issues seen in previous versions and turned them into strengths in the new version.