Monday, July 28, 2008

Hyper-V and Failover Clustering Quirk

While setting up a Hyper-V failover cluster on Server 2008 Datacenter edition, we kept hitting an issue where a VM's configuration object got stuck in "Offline Pending" when we moved the virtual machine between servers. We were also getting an error when trying to add more virtual machines to the cluster:
"An error was encountered while loading the list of available virtual machines. The value cannot be null Parameter name ManagementObject", along with an Event ID 1183 error.
After I banged my head against this issue for a while, the cause turned out to be rather simple. The Failover Cluster Management MMC doesn't automatically add the drive that the Virtual Machine Configuration is stored on as a dependency of the Virtual Machine Configuration object. That's strange, because it does add it as a dependency of the Virtual Machine object itself. The result is that the cluster service can take the disk holding the configuration offline before it has finished writing to the configuration file, which causes the problems.

The solution, of course, is to add the disk that holds the configuration as a dependency of the Virtual Machine Configuration resource (in the resource's properties, on the Dependencies tab).
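To make the reasoning concrete, here is a minimal sketch in Python of why the missing dependency matters. It is only a toy model, not how the cluster service is actually implemented: the resource names and the `depends_on` helper are made up for illustration, and the only assumption is that a disk is guaranteed to stay online until a resource has gone offline only if that resource depends on the disk, directly or through another resource.

```python
def depends_on(dependencies, resource, target):
    """Return True if `resource` depends on `target`, directly or transitively."""
    stack = list(dependencies.get(resource, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(dependencies.get(current, ()))
    return False


# Hypothetical resource names, chosen for illustration only.
DISK = "Cluster Disk 1"
CONFIG = "Virtual Machine Configuration"
VM = "Virtual Machine"

# What the MMC set up for us: only the Virtual Machine depends on the disk.
default_deps = {VM: [DISK], CONFIG: []}

# After the fix: the configuration resource depends on the disk as well.
fixed_deps = {VM: [DISK], CONFIG: [DISK]}


def disk_outlives_config(dependencies):
    """In this toy model, the disk is only guaranteed to stay online until the
    configuration resource is offline if the configuration depends on it."""
    return depends_on(dependencies, CONFIG, DISK)


print(disk_outlives_config(default_deps))  # False
print(disk_outlives_config(fixed_deps))    # True
```

With the default dependencies, nothing stops the disk from going offline while the configuration resource is still writing; with the extra dependency in place, the ordering is guaranteed.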
Update:
If you've already deleted the VM Configuration, you can add it back by right-clicking the Virtual Machine Group in the Failover Cluster Management MMC and choosing Add Resource -> More Resources -> Add Virtual Machine Configuration. Alternatively, you could reload the VM on one of the cluster nodes and then go through the process of making it highly available again.

Update x2:
Microsoft has fixed the root problem in the Hyper-V Failover Cluster Management Hotfix; more info here:
http://hypervoria.com/hyper-v/hyper-v-failover-cluster-hotfix-now-public.aspx

Thursday, July 24, 2008

SCCM 2007 & KB948109

We've got a single System Center Configuration Manager 2007 server running all of our SCCM roles, as well as MS SQL 2005 Express Edition SP2. When we installed the latest MSSQL update, KB948109, we found that it completely hosed the security on the SQL instance. It appears that this is a rather widespread problem, and in general this update looks to be fraught with issues. Luckily, running SCCM's Repair Site wizard on the server, followed by a reboot, cleaned up the mess. The more you know.