Hi!

I do not know your exact cluster config, but basically two scenarios are 
possible:
1. whole zoneroot is on failover disk
2. each cluster node has a complete separate zoneroot and the only thing 
failing over is application/database data

First case:
Migrate all failover zones to one node and patch that node and its zones.
On the other, now-empty node, temporarily edit /etc/zones/index so that the 
node knows nothing about the zones. Then apply the exact same patches to the 
empty node. After that, restore /etc/zones/index and the zones *should* be 
able to fail over again.
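A minimal sketch of that index edit (the backup filename is illustrative, and the sed expression assumes the usual zonename:state:zonepath line format of /etc/zones/index):

```shell
# Sketch only -- test on a non-production node first.
# Keep a pristine copy so the index can be restored after patching.
cp /etc/zones/index /etc/zones/index.prepatch

# Comment out every entry except the global zone so this node
# temporarily "knows nothing" about the failover zones.
sed '/^global:/!s/^[^#]/#&/' /etc/zones/index.prepatch > /etc/zones/index

# ... apply the exact same patches here as on the other node ...

# Restore the original index so the zones can fail over again.
cp /etc/zones/index.prepatch /etc/zones/index
```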

Second case:
Migrate all zones to one node. On the empty node, remove all references to 
failover datasets from the /etc/zones/yourzonename.xml files, then patch the 
empty node. The problem is that a zone cannot boot up to the maintenance 
state when its failover datasets are not available; once you have removed all 
references to them, the patching can proceed. Finally, restore the 
yourzone.xml files to their original state and the zones *should* be able to 
migrate again. If they migrate, patch the first node using the same method.
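A sketch of the xml edit for the second case. The zone name and backup filename are illustrative, and this assumes the failover storage shows up as <dataset> and <filesystem> entries in the zone's configuration file:

```shell
# Sketch only -- verify against your actual zone xml before relying on it.
# Keep a pristine copy for restoring after the patch run.
cp /etc/zones/myzone.xml /etc/zones/myzone.xml.prepatch

# Drop the dataset/filesystem entries so the zone can be brought to the
# patching state without the failover storage being attached.
sed -e '/<dataset /d' -e '/<filesystem /d' \
    /etc/zones/myzone.xml.prepatch > /etc/zones/myzone.xml

# ... patch the empty node ...

# Restore the original configuration so the zone can migrate back.
cp /etc/zones/myzone.xml.prepatch /etc/zones/myzone.xml
```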

The second case is better because it lets you apply kernel patches and reboot 
the empty node before migrating anything back, and if one node fails you 
still have a complete zoneroot available on the other node. The downside is 
the extra overhead when administering the zones.

I strongly recommend taking backups and testing the above methods in a 
non-production environment first.

These are home-made methods and may not be the best or recommended ways to 
patch zones+cluster, but they have done the trick for me :) I have not yet 
used the zone detach/upgrade-on-attach method, so I won't comment on that :)

HTH,
Andero


-----Original Message-----
From: pca-boun...@lists.univie.ac.at [mailto:pca-boun...@lists.univie.ac.at] On 
Behalf Of David Stark
Sent: Tuesday, July 13, 2010 12:52 PM
To: PCA (Patch Check Advanced) Discussion
Subject: [pca] Cluster, zones, noreboot.

Hi List.

A bit off-topic, but PCA's involved, so I'm going to push my luck.

We've got a number of Sun Cluster installs using (lots of) failover 
zones and with no LiveUpgrade alt. boot space set up, which we need to 
patch. If we were to patch the clusters node-by-node (including the 
kernel patches that don't like zones being booted when they're applied), 
the zones would fail over between the nodes and never get patched 
themselves, and since kernel (and some other?) patches can't be applied 
from inside zones we would end up in a situation where the zones' patch 
databases are out of sync with the Global zone. I know from very bitter 
experience that this is a Bad Thing, so to avoid that we're currently 
bringing the clusters down to patch them. This obviously isn't optimal - 
people expecting 100% uptime from the clusters are naturally a bit 
annoyed at having their applications down for several hours while we 
unleash the mighty PCA.

So, to minimise downtime I'd like to apply the noreboot patches, say, 
the night before, and have a more minimal patch run with the clusters 
down. This brings me to the question:

Has anyone ever had any problems with noreboot patches applied to live 
systems? Any weirdness at all? I've patched plenty of test machines in 
multi-user mode, but never busy production boxes - these are fairly 
large Oracle and SAP environments for the most part.

Anyone with experience patching Sun Cluster care to share any top tips?

Cheers!

Dave

