[pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

Hi List.

A bit off-topic, but PCA's involved, so I'm going to push my luck.

We've got a number of Sun Cluster installs using (lots of) failover 
zones and with no LiveUpgrade alt. boot space set up, which we need to 
patch. If we were to patch the clusters node-by-node (including the 
kernel patches that don't like zones being booted when they're applied), 
the zones would fail over between the nodes and never get patched 
themselves, and since kernel (and some other?) patches can't be applied 
from inside zones we would end up in a situation where the zones' patch 
databases are out of sync with the Global zone. I know from very bitter 
experience that this is a Bad Thing, so to avoid that we're currently 
bringing the clusters down to patch them. This obviously isn't optimal - 
people expecting 100% uptime from the clusters are naturally a bit 
annoyed at having their applications down for several hours while we 
unleash the mighty PCA.


So, to minimise downtime I'd like to apply the noreboot patches, say, 
the night before, and have a more minimal patch run with the clusters 
down. This brings me to the question:


Has anyone ever had any problems with noreboot patches applied to live 
systems? Any weirdness at all? I've patched plenty of test machines in 
multi-user mode, but never busy production boxes - these are fairly 
large Oracle and SAP environments for the most part.
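
A minimal sketch of that staging run, for illustration only. It assumes PCA's `--noreboot` option limits the run to patches that need no reboot and that the `missing` operand selects patches not yet installed; check both against the PCA documentation for your version. The `run` wrapper and `DRY_RUN` variable are hypothetical helpers, and by default the command is only printed.

```shell
#!/bin/sh
# Sketch only: stage the no-reboot patches ahead of the outage window.
# Assumption: "pca --noreboot --install missing" installs only missing
# patches that do not require a reboot. Verify before using it for real.

# DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

stage_noreboot_patches() {
    run pca --noreboot --install missing
}

stage_noreboot_patches
```

Running it with `DRY_RUN=0` on a test machine first would show how many patches are left for the outage window.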


Anyone with experience patching Sun Cluster care to share any top tips?

Cheers!

Dave



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Filip Francis

Hi there,


I have already done quite a few upgrades of Sun Clusters.
Can you give me some more details on which version of the cluster and which
version of the OS you are running?


Regards
Filip



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

On 13/07/2010 12:31, Filip Francis wrote:

> Hi there,

Hi!

> I have already done quite a few upgrades of Sun Clusters.
> Can you give me some more details on which version of the cluster and which
> version of the OS you are running?

Sure. It's Sun Cluster 3.2 on Solaris 10 5/08. Mainly 2 node clusters, 
but there's a 3 node and a 6 node as well. Mainly T5220 machines hooked 
up to EMC Symmetrix storage. The hosts' / slices are all UFS, all the 
zones and data filesystems are ZFS.

> Regards
> Filip

Cheers!
Dave



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Glenn Satchell
When a zone migrates back to a patched system, doesn't it normally 
update itself as it starts up the first time?


Have a look at the zoneadm man page, in particular the attach and detach 
sub-commands. I don't know for sure, but Sun Cluster may be smart enough 
to do the right thing and use zoneadm attach -u to bring the zone up to 
date when it attaches to the patched system.
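
As a rough illustration of the manual detach and attach sequence, outside any cluster framework: `myzone` is a hypothetical zone name, the `run` wrapper and `DRY_RUN` variable are illustrative helpers, and the sketch assumes the zone is already configured on the patched node and its zonepath storage has been moved there.

```shell
#!/bin/sh
# Sketch only: move one zone to an already patched node by hand.
# "myzone" is a hypothetical zone name; DRY_RUN=1 (default) just prints.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# On the unpatched node: halt the zone and detach it.
detach_zone() {
    run zoneadm -z "$1" halt
    run zoneadm -z "$1" detach
}

# On the patched node, once the zonepath storage is visible there:
# attach with -u so the zone is updated to match the new global zone,
# then boot it.
attach_zone_updated() {
    run zoneadm -z "$1" attach -u
    run zoneadm -z "$1" boot
}

detach_zone myzone
attach_zone_updated myzone
```

Whether the cluster's own zone resource handling issues the `-u` itself is exactly the open question here, so this only shows what would be typed by hand.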


Perhaps a quick chat with your local Sun SE to help plan things might be 
time well spent?


regards,
-glenn


Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

Hi all.

Thanks Filip, Glenn - I'd forgotten all about Update on Attach.

It looks like Update on Attach is part of kernel patch 137137-09, which 
we've actually got applied on some recently patched clusters. The 
zoneadm man page lists the '-u' option on those hosts too, so it looks 
like this should be usable on the next patching run.

Since we've already got parallel patching for zones, I'm not sure if 
this will really save a lot of downtime, though - is a zone 
update-attach any faster than a normal patch run?
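
One way to check hosts for that patch level is sketched below. It is an illustration only: `has_patch` is a hypothetical helper, and it assumes `showrev -p` prints one line per installed patch starting with `Patch: <id>-<rev>`. A later kernel patch that supersedes 137137 would not match, so a negative result is not proof that the feature is missing.

```shell
#!/bin/sh
# Sketch only: succeed if a patch at or above a given revision appears in
# "showrev -p" style output read from stdin.
# Note: uses awk -v; on Solaris 10 that means nawk rather than /usr/bin/awk.
has_patch() {
    # $1 = patch id (e.g. 137137), $2 = minimum revision (e.g. 9)
    awk -v id="$1" -v min="$2" '
        $1 == "Patch:" {
            split($2, p, "-")
            if (p[1] == id && p[2] + 0 >= min + 0) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a real host (not run here):
#   showrev -p | has_patch 137137 9 && echo "137137-09 or later is installed"
```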


Dave

On 13/07/2010 13:05, Filip Francis wrote:

> No, not unless you have a certain version of Solaris; I think this option
> is only available from 10u7 or 10u8.
> This is his problem.
> The cluster will not do this for you; I think this is scheduled for the
> next release of Sun Cluster later this year.
> Filip



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Filip Francis


Dave,

If you want to do parallel patching you need to edit the /etc/patch/pdo.conf file.
That will make patching work on several zones at the same time.
Filip
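
A small sketch of that edit, for illustration. It assumes the file carries a `num_proc=<n>` line that sets how many zones are patched at once; the comments in your own /etc/patch/pdo.conf are the authority on that. `set_num_proc` is a hypothetical helper and is best tried on a copy first.

```shell
#!/bin/sh
# Sketch only: set num_proc in a pdo.conf-style file.
# Assumption: a "num_proc=<n>" line controls the number of zones patched
# in parallel.
set_num_proc() {
    # $1 = path to the config file, $2 = number of parallel processes
    conf=$1
    n=$2
    tmp="$conf.tmp.$$"
    if grep '^num_proc=' "$conf" >/dev/null 2>&1; then
        # Replace the existing setting.
        sed "s/^num_proc=.*/num_proc=$n/" "$conf" > "$tmp"
    else
        # No setting yet: append one.
        { cat "$conf" 2>/dev/null; echo "num_proc=$n"; } > "$tmp"
    fi
    # Write back in place so ownership and permissions are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Usage (on a copy first):
#   cp /etc/patch/pdo.conf /tmp/pdo.conf && set_num_proc /tmp/pdo.conf 4
```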


Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Andero Belov
Hi!

I do not know your exact cluster config, but basically two scenarios are 
possible:
1. whole zoneroot is on failover disk
2. each cluster node has a complete separate zoneroot and the only thing 
failing over is application/database data

First case
Migrate all failover zones to one node and patch the node and zones.
On the other, empty node, temporarily change /etc/zones/index so that the node 
knows nothing about the zones. Then apply the exact same patches to the empty 
node. After that, restore /etc/zones/index and the zones *should* be able to 
fail over again.

Second case
Migrate all zones to one node. On the empty node, remove all references to 
failover datasets from the /etc/zones/yourzonename.xml files. Then patch the 
empty node. The problem is that a zone cannot boot to the maintenance state 
when its failover datasets are not available. Now that you have removed all 
references to them, the patching can proceed. Finally, restore the 
yourzonename.xml files to their original state and the zones *should* be able 
to migrate again. If they migrate, patch the first node (using the same method?).

The second case is better because it allows you to apply kernel patches and 
reboot the empty node before migrating stuff back. In case of a failure on one 
node you still have a complete zoneroot available on other node. Downside is 
that it creates overhead when administering zones.

I strongly recommend taking a backup and testing the above methods in a 
non-production environment.

These are home-made methods and may not be the best or recommended ways 
to patch zones+cluster, but they have done the trick for me :) I have not yet 
used the zone detach/upgrade-on-attach method, so I won't comment on that :)

HTH,
Andero



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

Ah, sorry Filip, my question wasn't very clear.

I'm wondering if the downtime for each zone would be much less using 
Update On Attach compared with just bringing the whole cluster down and 
patching all the zones in parallel?



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

Hi!

On 13/07/2010 14:05, Andero Belov wrote:

> I do not know your exact cluster config, but basically two scenarios are
> possible:
> 1. whole zoneroot is on failover disk
> 2. each cluster node has a complete separate zoneroot and the only thing
> failing over is application/database data

Yup, we're using option 1 here.

> First case
> Migrate all failover zones to one node and patch the node and zones.
> On the other, empty node, temporarily change /etc/zones/index so that the
> node knows nothing about the zones. Then apply the exact same patches to
> the empty node. After that, restore /etc/zones/index and the zones
> *should* be able to fail over again.


This is basically what we're doing at the moment. We leave the zones on 
the nodes where they're currently running, and just comment out the 
zones not running on each node in /etc/zones/index before we patch - it 
makes the patching prep a bit more complicated, but makes the patch run 
a bit quicker with the zones spread out across the nodes (maybe - I'm 
not sure it makes a lot of difference with parallel patching switched on).
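
The commenting-out step could be scripted along these lines. This is a sketch under stated assumptions: each index entry is a `zonename:state:zonepath[:uuid]` line, the zone names are hypothetical, and `comment_out_zones` is an illustrative helper. It only prints the modified copy, so the original file stays available for the restore step.

```shell
#!/bin/sh
# Sketch only: print a copy of a zones index file with the named zones
# commented out. The global zone is never touched.
# Note: uses awk -v; on Solaris 10 that means nawk rather than /usr/bin/awk.
comment_out_zones() {
    # $1 = index file, remaining arguments = zone names to hide
    index=$1
    shift
    awk -F: -v zones="$*" '
        BEGIN {
            n = split(zones, z, " ")
            for (i = 1; i <= n; i++) hide[z[i]] = 1
        }
        ($1 in hide) && $1 != "global" { print "#" $0; next }
        { print }
    ' "$index"
}

# Usage: keep the original for the restore step and inspect the result
# before putting it in place.
#   cp -p /etc/zones/index /etc/zones/index.prepatch
#   comment_out_zones /etc/zones/index.prepatch zoneB zoneC > /tmp/index.new
```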




Cheers!

Dave



Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread French, David
I can't speak to Sun's Cluster software, but here we use Veritas VCS and
do update on attach. The advantage is not so much the time saved as being
able to schedule the work. For example, if you patch them all at once, they
are all down while this is being done. If you migrate zones to another
system, leaving just the global zone on one of the systems, you can
patch that system. Then, when it has been rebooted and checked out, you
can migrate zones as needed using update on attach, until they have
all been updated. Then patch the other node, which now has just a global
zone, and roll some of the zones back to that box.

This may cause 2-3 reboots for some zones depending on where they
live, but zones usually boot fast compared to actual HW. Just to give
you an idea, a system needing a hundred patches may take 1 hr to
patch. An update on attach will run in a fraction of that time. It
also allows you to schedule the time with your customer instead of
requiring they all be down for hours at the same time. I find people
are more apt to accept quick reboots and the short time for an update
on attach than to accept being down for an extended period while you patch
everyone.

But if all zones are related then having them down at the same time may
not be an issue and a parallel patch may be more acceptable.

Now, I will say this.  Before patching validate the current packages and
patches.  I mention this as I ran into an issue on one of my systems
(non cluster, but global and 4 container/zones) where the SUNWcsl
package was missing pkginfo under /var/sadm/pkg/SUNWcsl.   Not only did
it cause patch issues, but it also broke update on attach so much that
the only option I had was to rebuild the zones.  Sun (at the time)
wasn't much help.  The issue was deeper than just the pkginfo file but
also the version of that file and the files in the package used in
patching the zones.  The  patch utilities mangled them and I didn't
catch it before patching.

I mention this as the version you are running is in the period I was at
when there were issues with the pkg/patch utilities, so better to check
in advance.
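
A quick pre-check for the missing pkginfo symptom might look like the sketch below. `missing_pkginfo` is a hypothetical helper; it only detects a missing file, not the deeper version mismatch described above, and it would need to be run in the global zone and against each zone's own package directory.

```shell
#!/bin/sh
# Sketch only: list package directories that have no pkginfo file.
# The default directory is the one named in the post; pass another
# directory to check a zone's package database or to test the function.
missing_pkginfo() {
    pkgdir=${1:-/var/sadm/pkg}
    for d in "$pkgdir"/*/; do
        [ -d "$d" ] || continue
        [ -f "${d}pkginfo" ] || basename "$d"
    done
}

# Usage: an empty result means every package directory has a pkginfo file.
#   missing_pkginfo
#   missing_pkginfo /path/to/zoneroot/root/var/sadm/pkg
```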

--Dave




Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Filip Francis

David,


Not sure. I have done both and they were quite fast.
I did not really look at the time, so I cannot really tell.
The only difference that I can see is that with update on attach it will 
remove some stuff and then add it.


Filip

On 07/13/10 15:09, David Stark wrote:

Ah, sorry Filip, my question wasn't very clear.

I'm wondering if the downtime for each zone would be much less using 
Update On Attach compared with just bringing the whole cluster down 
and patching all the zones in parallel?


On 13/07/2010 13:56, Filip Francis wrote:


Dave,

If you want to do parallel patching you need to edit the
/etc/patch/pdo.conf file.
That will make parallel patching work on several zones at the same
time.

Filip
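
A minimal sketch of the pdo.conf edit mentioned above, assuming the file uses a num_proc=N line to set how many zones are patched at once. The helper name set_num_proc is hypothetical, and it avoids "sed -i" so it also works with older sed versions. Try it on a scratch copy first.

```shell
#!/bin/sh
# Sketch: set num_proc in a pdo.conf-style file.
# Assumption: the parallel-patching setting is a line of the form num_proc=N.

set_num_proc() {
    # $1 = config file, $2 = number of zones to patch in parallel
    conf=$1
    n=$2
    tmp="$conf.tmp.$$"
    if grep '^num_proc=' "$conf" >/dev/null 2>&1; then
        # Replace the existing setting via a temp file (no sed -i).
        sed "s/^num_proc=.*/num_proc=$n/" "$conf" > "$tmp" && mv "$tmp" "$conf"
    else
        # No setting present yet: append one.
        echo "num_proc=$n" >> "$conf"
    fi
}
```

Usage would be something like: set_num_proc /etc/patch/pdo.conf 4 (run as root, after taking a backup of the file).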

On 07/13/10 14:28, David Stark wrote:

Hi all.

Thanks Filip, Glenn - I'd forgotten all about Update on Attach.

It looks like Update On Attach is part of kernel patch 137137-09,
which we've actually got applied on some recently patched clusters.
The zoneadm man page lists the '-u' option on those hosts too, so it
looks like this should be usable on the next patching run.
Since we've already got Parallel patching for zones, I'm not sure if
this will really save a lot of down time, though - is a zone
update-attach any faster than a normal patch run?

Dave

On 13/07/2010 13:05, Filip Francis wrote:

No, not unless you have a certain version of Solaris; I think it is only
from version 10u7 or 10u8 that you have this option.
This is his problem.
The cluster will not do this for you; I think this is scheduled for the
next release of Sun Cluster later this year.
Filip


On 07/13/10 13:54, Glenn Satchell wrote:

When a zone migrates back to a patched system, doesn't it normally
update itself as it starts up the first time?

Have a look at the zoneadm man page, in particular the attach and
detach sub-commands. I don't know for sure, but Sun Cluster may be
smart enough to do the right thing and use zoneadm attach -u to bring
the zone up to date when it attaches to the patched system.

Perhaps a quick chat with your local Sun SE to help plan things might
be time well spent?

regards,
-glenn

On 07/13/10 21:41, David Stark wrote:

On 13/07/2010 12:31, Filip Francis wrote:

Hi there,


Hi!


I have done already quite a few upgrades of Sun clusters.
Can you give me some more details on what version of cluster +
version of OS.


Sure. It's Sun Cluster 3.2 on Solaris 10 5/08. Mainly 2 node
clusters, but there's a 3 node and a 6 node as well. Mainly T5220
machines hooked up to EMC Symmetrix storage. The hosts' / slices are
all UFS, all the zones and data filesystems are ZFS.


Regards
Filip


Cheers!
Dave


On 07/13/10 11:52, David Stark wrote:

Hi List.

A bit off-topic, but PCA's involved, so I'm going to push my luck.

We've got a number of Sun Cluster installs using (lots of) failover
zones and with no LiveUpgrade alt. boot space set up, which we need to
patch. If we were to patch the clusters node-by-node (including the
kernel patches that don't like zones being booted when they're
applied), the zones would fail over between the nodes and never get
patched themselves, and since kernel (and some other?) patches can't
be applied from inside zones we would end up in a situation where the
zones' patch databases are out of sync with the Global zone. I know
from very bitter experience that this is a Bad Thing, so to avoid that
we're currently bringing the clusters down to patch them. This
obviously isn't optimal - people expecting 100% uptime from the
clusters are naturally a bit annoyed at having their applications down
for several hours while we unleash the mighty PCA.

So, to minimise downtime I'd like to apply the noreboot patches, say,
the night before, and have a more minimal patch run with the clusters
down. This brings me to the question:

Has anyone ever had any problems with noreboot patches applied to live
systems? Any weirdness at all? I've patched plenty of test machines in
multi-user mode, but never busy production boxes - these are fairly
large Oracle and SAP environments for the most part.

Anyone with experience patching Sun Cluster care to share any top
tips?

Cheers!

Dave

Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread David Stark

Hi Dave.

On 13/07/2010 14:30, French, David wrote:

I can't speak to Sun's Cluster software, but here we use Veritas VCS and
do update on attach. The advantage is not so much the time saved as being
able to schedule the work. For example, if you patch them all at once,
they are all down while this is being done. If you migrate zones to
another system, leaving just the global zone on one of the systems, you
can patch that system. Then, when it has been rebooted and checked out,
you can migrate systems as needed using update on attach, until they have
all been updated. Then patch the other node that has just a global zone
and roll back some of the zones to that box.
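
The rolling approach above could be planned with something like the dry-run sketch below. It only prints the zoneadm steps per zone and executes nothing. The function name plan_zone_moves and the node labels are hypothetical. It assumes each zone is already configured on both nodes with its zonepath on shared storage, and that the cluster software (VCS or Sun Cluster) is not moving the zone at the same time.

```shell
#!/bin/sh
# Dry-run sketch: print the per-zone steps for moving zones onto a freshly
# patched node using update on attach. Nothing is executed.

plan_zone_moves() {
    # $1 = source node, $2 = patched target node, remaining args = zone names
    src=$1
    dst=$2
    shift 2
    for zone in "$@"; do
        echo "$src: zoneadm -z $zone halt"
        echo "$src: zoneadm -z $zone detach"
        # attach -u updates the zone's packages and patches to match the
        # (patched) global zone on the target node
        echo "$dst: zoneadm -z $zone attach -u"
        echo "$dst: zoneadm -z $zone boot"
    done
}
```

For example, plan_zone_moves nodeA nodeB zone1 zone2 prints four steps per zone, which can be reviewed or handed to whoever schedules the downtime with each customer.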



This may cause 2-3 reboots for some zones depending on where they
live, but zones usually boot fast compared to actual HW. Just to give
you an idea, a system needing a hundred patches may take 1 hr to
patch. An update on attach will run in a fraction of that time. It
also allows you to schedule the time with your customer instead of
requiring they all be down for hours at the same time. I find people
are more apt to accept quick reboots and the short time for an update
on attach than accept being down for an extended period while you patch
everyone.


Ah, excellent. Anything that reduces downtime on the zones would be a 
win for us.



But if all zones are related then having them down at the same time may
not be an issue and a parallel patch may be more acceptable.


Yeah, most of our clusters are single-application, but then there's the 
dreaded 'Unix Consolidation Cluster' with 20-odd business units' stuff 
on it. I have a feeling Update on Attach will come in handy. 
Unfortunately, I'll have to do an old-school patch run to get Update On 
Attach installed :( .



Now, I will say this. Before patching, validate the current packages and
patches. I mention this as I ran into an issue on one of my systems
(non cluster, but global and 4 container/zones) where the SUNWcsl
package was missing pkginfo under /var/sadm/pkg/SUNWcsl. Not only did
it cause patch issues, but it also broke update on attach so badly that
the only option I had was to rebuild the zones. Sun (at the time)
wasn't much help. The issue went deeper than just the pkginfo file: it
also involved the version of that file and the files in the package used
in patching the zones. The patch utilities mangled them and I didn't
catch it before patching.
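
One cheap pre-patch check for the specific failure described above (a package directory with no pkginfo) could look like the sketch below. The function name find_missing_pkginfo is hypothetical. It only catches a missing or empty pkginfo file, not the deeper version mismatches mentioned, so it is a first pass and not a full validation.

```shell
#!/bin/sh
# Sketch: list package directories that have no pkginfo file, or an empty one.

find_missing_pkginfo() {
    # $1 = package database directory (defaults to /var/sadm/pkg)
    pkgdir=${1:-/var/sadm/pkg}
    for d in "$pkgdir"/*/; do
        # Skip the unexpanded glob when the directory has no subdirectories.
        [ -d "$d" ] || continue
        if [ ! -s "${d}pkginfo" ]; then
            basename "$d"
        fi
    done
}
```

Run it in the global zone and against each zone's own package database before the patch run; any name it prints is worth investigating before patching.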


Yeesh. Broken core libs package? Ouch.


I mention this because the version you are running dates from the same
period as mine, when there were issues with the pkg/patch utilities, so
better to check in advance.


We've been OK so far (4 clusters patched already). Fingers crossed.


--Dave


Cheers.

Dave





Re: [pca] Cluster, zones, noreboot.

2010-07-13 Thread Don O'Malley

Thoughts from a colleague - Enda O'Connor - inline...

Best,
-Don

Enda O'Connor wrote:

Hi

Some people have recommended Update On Attach; see
http://wikis.sun.com/display/BluePrints/Maintaining+Solaris+with+Live+Upgrade+and+Update+On+Attach

and also the following for a description of how update on attach works
in conjunction with patching:

http://www.sun.com/bigadmin/features/articles/zone_attach_patch.jsp#Patching

It is important to first read the BigAdmin article to understand how
it works before going down this route.


Enda
On 13/07/2010 16:32, Enda O'Connor wrote:

Hi David
The major issue with patching a live system with a failover zone is what
happens if the zone fails over for any reason during patching. This would
cause patch corruption, so one would need to suspend the HA container
resource, i.e.

clrg suspend the resource group
detach zones on remaining node
apply patch
attach zone on other node
clrg resume the resource group
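
The order of the steps above can be sketched as a dry run that only prints the commands and executes nothing. The function name plan_failover_zone_patch is hypothetical. The patch command is supplied by the caller, and "attach -u" is an assumption that Update On Attach is available on the node. Which node each step runs on, and halting the zone before the detach, are left to the operator.

```shell
#!/bin/sh
# Dry-run sketch of the suspend / detach / patch / attach / resume order.
# It only prints commands; nothing is executed.

plan_failover_zone_patch() {
    # $1 = resource group, $2 = zone name, $3 = patch command (caller-supplied)
    rg=$1
    zone=$2
    patch_cmd=$3
    # Suspend first so the cluster cannot fail the zone over mid-patch.
    echo "clrg suspend $rg"
    echo "zoneadm -z $zone detach"
    echo "$patch_cmd"
    echo "zoneadm -z $zone attach -u"
    # Hand control of the resource group back to the cluster.
    echo "clrg resume $rg"
}
```

Called as plan_failover_zone_patch app-rg appzone "patchadd /var/tmp/PATCHDIR" (all three values are placeholders), it prints the five steps in order so they can be reviewed before anything is run by hand.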

But one would need to take some care to identify patches that can be
applied in such fashion.

The following doc has a section on applying patches that require Single
User Mode in a failover zone environment.
http://docs.sun.com/app/docs/doc/819-2971/z476997776?a=view

I have cc'ed Chris who has lots of experience in this area.

But the main concern is that the zone might fail over during such
patching.


Enda
On 13/07/2010 10:56, Don O'Malley wrote:

Hey Enda/Ed,

Any thoughts on this?

In addition to the no reboot question, is it better to detach your
zones and use update on attach to bring the local zones back in sync,
or is there no difference between the two (I thought update on attach
was quicker)?

Best,
-Don

