[pca] Cluster, zones, noreboot.
Hi List.

A bit off-topic, but PCA's involved, so I'm going to push my luck.

We've got a number of Sun Cluster installs using (lots of) failover zones and with no LiveUpgrade alt. boot space set up, which we need to patch. If we were to patch the clusters node-by-node (including the kernel patches that don't like zones being booted when they're applied), the zones would fail over between the nodes and never get patched themselves, and since kernel (and some other?) patches can't be applied from inside zones we would end up in a situation where the zones' patch databases are out of sync with the Global zone. I know from very bitter experience that this is a Bad Thing, so to avoid that we're currently bringing the clusters down to patch them.

This obviously isn't optimal - people expecting 100% uptime from the clusters are naturally a bit annoyed at having their applications down for several hours while we unleash the mighty PCA.

So, to minimise downtime I'd like to apply the noreboot patches, say, the night before, and have a more minimal patch run with the clusters down. This brings me to the question: Has anyone ever had any problems with noreboot patches applied to live systems? Any weirdness at all? I've patched plenty of test machines in multi-user mode, but never busy production boxes - these are fairly large Oracle and SAP environments for the most part.

Anyone with experience patching Sun Cluster care to share any top tips?

Cheers!
Dave
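A minimal sketch of the two-phase idea described above: one PCA run restricted to patches that need no reboot, and a second, full run during the outage. It only builds the command lines and does not execute anything. The helper name pca_command is hypothetical, and the long options used (--noreboot, --pretend, --install) and the "missing" operand should be checked against "pca --help" on the installed PCA version before relying on them.

```python
from typing import List


def pca_command(noreboot: bool, pretend: bool = False) -> List[str]:
    """Build (but do not run) a pca argv for one patching phase.

    noreboot=True  -> the "night before" run, limited to patches that
                      do not require a reboot.
    pretend=True   -> ask pca to only pretend to install, as a dry run.
    """
    argv = ["pca"]
    if noreboot:
        argv.append("--noreboot")
    argv.append("--pretend" if pretend else "--install")
    argv.append("missing")  # operand: patches missing from this host
    return argv


if __name__ == "__main__":
    # Phase 1: live system, no-reboot patches only (dry run first).
    print(" ".join(pca_command(noreboot=True, pretend=True)))
    print(" ".join(pca_command(noreboot=True)))
    # Phase 2: cluster down, everything that is still missing.
    print(" ".join(pca_command(noreboot=False)))
```

Running the dry run first gives a list of what the live run would touch, which can be reviewed with the application owners before anything is installed.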
Re: [pca] Cluster, zones, noreboot.
Hi there,

I have already done quite a few upgrades of Sun Clusters. Can you give me some more details: which version of Sun Cluster, and which version of the OS?

Regards,
Filip
Re: [pca] Cluster, zones, noreboot.
On 13/07/2010 12:31, Filip Francis wrote:
> Can you give me some more details on what version of cluster + version of OS.

Hi!

Sure. It's Sun Cluster 3.2 on Solaris 10 5/08. Mainly 2-node clusters, but there's a 3-node and a 6-node as well. Mainly T5220 machines hooked up to EMC Symmetrix storage. The hosts' / slices are all UFS; all the zones and data filesystems are ZFS.

Cheers!
Dave
Re: [pca] Cluster, zones, noreboot.
When a zone migrates back to a patched system, doesn't it normally update itself as it starts up the first time? Have a look at the zoneadm man page, in particular the attach and detach sub-commands.

I don't know for sure, but Sun Cluster may be smart enough to do the right thing and use zoneadm attach -u to bring the zone up to date when it attaches to the patched system.

Perhaps a quick chat with your local Sun SE to help plan things might be time well spent?

regards,
-glenn
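The detach and attach sub-commands mentioned above can be sketched as an ordered plan of zoneadm invocations. This is only an illustration of the sequence, not something Sun Cluster is known to do on its own: the function name and the node and zone names are hypothetical, nothing is executed, and moving the zone's storage between nodes (normally the cluster framework's job) is left out.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (node to run on, argv)


def update_on_attach_plan(zone: str, old_node: str, new_node: str) -> List[Step]:
    """Return the zoneadm steps to move one zone to an already-patched node.

    Assumes the zone is already configured on new_node and that its
    zonepath storage is made available there between detach and attach.
    """
    return [
        (old_node, ["zoneadm", "-z", zone, "halt"]),
        (old_node, ["zoneadm", "-z", zone, "detach"]),
        # -u asks attach to update the zone's packages and patches to
        # match the (newer) global zone it is being attached to.
        (new_node, ["zoneadm", "-z", zone, "attach", "-u"]),
        (new_node, ["zoneadm", "-z", zone, "boot"]),
    ]


if __name__ == "__main__":
    for node, argv in update_on_attach_plan("zone1", "nodeA", "nodeB"):
        print(f"{node}: {' '.join(argv)}")
```

The zone is down only between the halt and the boot, so the length of the attach -u step is what determines per-zone downtime.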
Re: [pca] Cluster, zones, noreboot.
Hi all.

Thanks Filip, Glenn - I'd forgotten all about Update on Attach.

It looks like Update on Attach is part of kernel patch 137137-09, which we've actually got applied on some recently patched clusters. The zoneadm man page lists the '-u' option on those hosts too, so it looks like this should be usable on the next patching run.

Since we've already got parallel patching for zones, I'm not sure if this will really save a lot of downtime, though - is a zone update-attach any faster than a normal patch run?

Dave

On 13/07/2010 13:05, Filip Francis wrote:
> No, unless you have a certain version of Solaris; I think this option is only available from 10u7 or 10u8. This is his problem. The cluster will not do this for you; I think this is scheduled for the next release of Sun Cluster later this year.
>
> Filip
>
> On 07/13/10 13:54, Glenn Satchell wrote:
>> When a zone migrates back to a patched system, doesn't it normally update itself as it starts up the first time?
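One way to check hosts for the kernel patch level mentioned above (137137-09) is to scan saved "showrev -p" output. The sketch below assumes each installed patch appears on a line starting with "Patch: NNNNNN-RR"; the function name and the sample text are made up for illustration. It does not follow "Obsoletes:" chains, so a host running a later kernel patch that obsoletes 137137 would be reported as not having it.

```python
import re


def has_patch(showrev_output: str, base: str, min_rev: int) -> bool:
    """True if patch `base` is listed at revision >= min_rev.

    Limitation: only looks at the "Patch:" field of each line, not at
    patches that obsolete `base`.
    """
    pattern = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b", re.MULTILINE)
    for match in pattern.finditer(showrev_output):
        if match.group(1) == base and int(match.group(2)) >= min_rev:
            return True
    return False


# Illustrative sample only, not captured from a real host.
SAMPLE = (
    "Patch: 137137-09 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWcsu\n"
    "Patch: 119254-66 Obsoletes:  Requires:  Incompatibles:  Packages: SUNWswmt\n"
)

if __name__ == "__main__":
    print(has_patch(SAMPLE, "137137", 9))
```

Checking that the zoneadm man page on the host documents '-u', as described above, remains the more direct test of whether the feature is usable.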
Re: [pca] Cluster, zones, noreboot.
Dave,

If you want to do parallel patching, you need to edit the /etc/patch/pdo.conf file. That makes parallel patching work on several zones at the same time.

Filip
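As an illustration of the /etc/patch/pdo.conf edit, the sketch below rewrites the num_proc setting in the file's text, on the assumption that the file uses a single "num_proc=<n>" line to set how many zones are patched in parallel. The function name is hypothetical, and it works on a string so it can be tried on a copy before touching the real file.

```python
import re


def set_num_proc(conf_text: str, num_proc: int) -> str:
    """Return conf_text with num_proc set to the given value.

    Existing active num_proc lines are replaced; commented-out lines are
    left alone; if no active line exists, one is appended.
    """
    if num_proc < 1:
        raise ValueError("num_proc must be at least 1")
    new_line = f"num_proc={num_proc}"
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if re.match(r"^\s*num_proc\s*=", line):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = "# parallel zone patching\nnum_proc=1\n"
    print(set_num_proc(original, 4), end="")
```

A sensible value depends on how many CPUs the node has online, so it is worth choosing per host rather than copying one file everywhere.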
Re: [pca] Cluster, zones, noreboot.
Hi!

I do not know your exact cluster config, but basically two scenarios are possible:

1. The whole zoneroot is on a failover disk.
2. Each cluster node has a completely separate zoneroot and the only thing failing over is application/database data.

First case:
Migrate all failover zones to one node and patch the node and zones. On the other, empty node, change /etc/zones/index temporarily so that the node knows nothing about the zones. Then apply the exact same patches to the empty node. After that, restore /etc/zones/index and the zones *should* be able to fail over again.

Second case:
Migrate all zones to one node. On the empty node, remove all references to failover datasets from the /etc/zones/yourzonename.xml files, then patch the empty node. The problem is that a zone cannot boot up to maintenance state when its failover datasets are not available; once you have removed all references to them, the patching can proceed. Finally, restore the yourzonename.xml files to their original state and the zones *should* be able to migrate again. If they migrate, patch the first node (using the same method?).

The second case is better because it allows you to apply kernel patches and reboot the empty node before migrating stuff back. In case of a failure on one node you still have a complete zoneroot available on the other node. The downside is that it creates overhead when administering zones.

I strongly recommend taking backups and testing the above methods in a non-production environment. These are home-made methods and may not be the best or recommended ways to patch zones+cluster, but they have done the trick for me :)

I have not yet used the zone detach/upgrade-on-attach method, so I won't comment on that :)

HTH,
Andero
Re: [pca] Cluster, zones, noreboot.
Ah, sorry Filip, my question wasn't very clear. I'm wondering if the downtime for each zone would be much less using Update On Attach compared with just bringing the whole cluster down and patching all the zones in parallel?
Re: [pca] Cluster, zones, noreboot.
Hi!

On 13/07/2010 14:05, Andero Belov wrote:
> I do not know your exact cluster config, but basically two scenarios are possible:
> 1. whole zoneroot is on failover disk
> 2. each cluster node has a complete separate zoneroot and the only thing failing over is application/database data

Yup, we're using option 1 here.

> First case
> Migrate all failover zones to one node and patch the node and zones. On the other, empty node, change /etc/zones/index temporarily that the node knows nothing about the zones. Then apply the exact same patches to the empty node. After that restore /etc/zones/index and the zones *should* be able to fail over again.

This is basically what we're doing at the moment. We leave the zones on the nodes where they're currently running, and just comment out the zones not running on each node in /etc/zones/index before we patch. It makes the patching prep a bit more complicated, but makes the patch run a bit quicker with the zones spread out across the nodes (maybe - I'm not sure it makes a lot of difference with parallel patching switched on).

Cheers!
Dave
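The /etc/zones/index trick discussed above (hiding the zones that are not running on a node before patching it) can be sketched as a text transformation. This assumes the index file has one colon-separated entry per zone with the zone name in the first field and '#' for comments; the function name and the sample entries are hypothetical. Hand-editing this file is a home-made method, as noted in the thread, so keep a copy of the original to restore afterwards.

```python
from typing import Iterable


def comment_out_zones(index_text: str, keep: Iterable[str]) -> str:
    """Comment out every zone entry except `global` and those in `keep`."""
    keep_set = set(keep)
    out = []
    for line in index_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)  # blank lines and existing comments unchanged
            continue
        zone_name = stripped.split(":", 1)[0]
        if zone_name == "global" or zone_name in keep_set:
            out.append(line)
        else:
            out.append("#" + line)  # hide this zone from the patch tools
    return "\n".join(out) + "\n"


# Illustrative contents only; paths and UUID fields are placeholders.
SAMPLE_INDEX = (
    "# sample header\n"
    "global:installed:/\n"
    "zone1:installed:/zones/zone1:uuid-1\n"
    "zone2:installed:/zones/zone2:uuid-2\n"
)

if __name__ == "__main__":
    # On the node where only zone1 is currently running:
    print(comment_out_zones(SAMPLE_INDEX, keep=["zone1"]), end="")
```

Restoring the saved original after the patch run, rather than trying to uncomment lines, avoids accidentally re-enabling entries that were already commented out beforehand.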
Re: [pca] Cluster, zones, noreboot.
I can't speak to Sun's Cluster software, but here we use Veritas VCS and do update on attach. The advantage is not so much the time saved as being able to schedule the work.

For example, if you patch them all at once, they are all down while this is being done. If you migrate zones to another system, leaving just the global zone on one of the systems, you can patch that system. Then, when it has been rebooted and checked out, you can migrate zones as needed using update on attach, until they have all been updated. Then patch the other node that has just a global zone and roll some of the zones back to that box. This may cause 2-3 reboots for some zones depending on where they live, but zones usually boot fast compared to actual HW.

Just to give you an idea, a system needing a hundred patches may take 1 hr to patch. An update on attach will run in a fraction of that time. It also allows you to schedule the time with your customer instead of requiring that they all be down for hours at the same time. I find people are more apt to accept quick reboots and the short time for an update on attach than to accept being down for an extended period while you patch everyone. But if all zones are related, then having them down at the same time may not be an issue and a parallel patch may be more acceptable.

Now, I will say this: before patching, validate the current packages and patches. I mention this because I ran into an issue on one of my systems (non-cluster, but global and 4 containers/zones) where the SUNWcsl package was missing its pkginfo file under /var/sadm/pkg/SUNWcsl. Not only did it cause patch issues, but it also broke update on attach so badly that the only option I had was to rebuild the zones. Sun (at the time) wasn't much help. The issue went deeper than just the missing pkginfo file: it also involved the version of that file and of the files in the package used in patching the zones. The patch utilities mangled them and I didn't catch it before patching. The version you are running dates from the same period in which I hit these issues with the pkg/patch utilities, so it's better to check in advance.

--Dave
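The pre-patch validation described above (a package directory under /var/sadm/pkg with no pkginfo file) can be checked with a small sketch like the one below. The function name is hypothetical, and this covers only the one symptom mentioned in the thread, not the deeper version mismatches; it would need to be run against the global zone and against each zone's own package directory.

```python
import os
from typing import List


def packages_missing_pkginfo(pkg_root: str = "/var/sadm/pkg") -> List[str]:
    """List package directories under pkg_root that have no pkginfo file."""
    missing = []
    for name in sorted(os.listdir(pkg_root)):
        pkg_dir = os.path.join(pkg_root, name)
        if not os.path.isdir(pkg_dir):
            continue  # ignore stray files in the package database
        if not os.path.isfile(os.path.join(pkg_dir, "pkginfo")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "/var/sadm/pkg"
    for pkg in packages_missing_pkginfo(root):
        print(f"missing pkginfo: {pkg}")
```

An empty result does not prove the package database is healthy, but a non-empty one is a good reason to stop and investigate before patching or attempting an update on attach.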
Re: [pca] Cluster, zones, noreboot.
David, Not sure i have done both and they where quit fast. Did not realy looked at the time. So can not realy tell. The only difference that i can see is that with update on attach it will remove some stuff and then add it. Filip On 07/13/10 15:09, David Stark wrote: Ah, sorry Filip, my question wasn't very clear. I'm wondering if the downtime for each zone would be much less using Update On Attach compared with just bringing the whole cluster down and patching all the zones in parallel? On 13/07/2010 13:56, Filip Francis wrote: Dave, If you want todo parallel patchen you need to edit /etc/patch/pdo.conf file. That will make parallel patching to work on several zones at the same time Filip On 07/13/10 14:28, David Stark wrote: Hi all. Thanks Filip, Glen - I'd forgotten all about Update on Attach. It looks like Update On Attach is part of kernel patch 137137-09, which we've actually got applied on some recently patched clusters. The zoneadm man page lists the '-u' option on those hosts too, so it looks like this should be usable on the next patching run. Since we've already got Parallel patching for zones, I'm not sure if this will really save a lot of down time, though - is a zone update-attach any faster than a normal patch run? Dave On 13/07/2010 13:05, Filip Francis wrote: No unless you have a certain version off Solaris i think this is only from version 10u7 or 10u8 that you have this option. This is his problem. The cluster will not do this for you it think this is scheduled in the next release of sun cluster later this year Filip On 07/13/10 13:54, Glenn Satchell wrote: When a zone migrates back to a patched system, doesn't it normally update itself as it starts up the first time? Have a look at the zoneadm man page, in particular the attach and detach sub-commands. I don't know for sure, but Sun Cluster may be smart enough to do the right thing and use zoneadm attach -u to bring the zone up to date when it attaches to the patched system. 
Perhaps a quick chat with your local Sun SE to help plan things might be time well spent?

regards,
-glenn

On 07/13/10 21:41, David Stark wrote:
On 13/07/2010 12:31, Filip Francis wrote:
Hi there,

Hi!

I have already done quite a few upgrades of Sun clusters. Can you give me some more details on which version of the cluster and which version of the OS you are running?

Sure. It's Sun Cluster 3.2 on Solaris 10 5/08. Mainly 2-node clusters, but there's a 3-node and a 6-node as well. Mainly T5220 machines hooked up to EMC Symmetrix storage. The hosts' / slices are all UFS; all the zones and data filesystems are ZFS.

Regards
Filip

Cheers!
Dave

On 07/13/10 11:52, David Stark wrote:
Hi List.
A bit off-topic, but PCA's involved, so I'm going to push my luck. We've got a number of Sun Cluster installs using (lots of) failover zones and with no LiveUpgrade alt. boot space set up, which we need to patch. If we were to patch the clusters node-by-node (including the kernel patches that don't like zones being booted when they're applied), the zones would fail over between the nodes and never get patched themselves, and since kernel (and some other?) patches can't be applied from inside zones we would end up in a situation where the zones' patch databases are out of sync with the Global zone. I know from very bitter experience that this is a Bad Thing, so to avoid that we're currently bringing the clusters down to patch them. This obviously isn't optimal - people expecting 100% uptime from the clusters are naturally a bit annoyed at having their applications down for several hours while we unleash the mighty PCA. So, to minimise downtime I'd like to apply the noreboot patches, say, the night before, and have a more minimal patch run with the clusters down. This brings me to the question: Has anyone ever had any problems with noreboot patches applied to live systems? Any weirdness at all?
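A minimal sketch of the /etc/patch/pdo.conf edit Filip mentions. The thread names only the file; the `num_proc` key used below is an assumption about that file's layout and should be checked against the copy on the host. The helper `set_num_proc` is hypothetical, works on the file's text only, and leaves reading and writing the real file to the caller.

```python
def set_num_proc(conf_text: str, num_proc: int) -> str:
    """Return conf_text with a single 'num_proc=<n>' setting.

    Assumption: pdo.conf holds 'key=value' lines and '#' comments, and the
    number of zones patched in parallel is controlled by 'num_proc'.
    """
    if num_proc < 1:
        raise ValueError("num_proc must be at least 1")

    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_setting = (
            not stripped.startswith("#")
            and stripped.split("=", 1)[0].strip() == "num_proc"
        )
        if is_setting:
            # Rewrite the first occurrence and drop any later duplicates.
            if not replaced:
                out.append(f"num_proc={num_proc}")
                replaced = True
            continue
        out.append(line)

    if not replaced:
        # No active setting found: append one at the end.
        out.append(f"num_proc={num_proc}")
    return "\n".join(out) + "\n"
```

Comment lines are left alone, so a commented-out setting stays in place and an active one is appended after it.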
Re: [pca] Cluster, zones, noreboot.
Hi Dave.

On 13/07/2010 14:30, French, David wrote:
I can't speak to Sun's Cluster software, but here we use Veritas VCS and do update on attach. The advantage is not so much the time saved as being able to schedule the work. For example, if you patch them all at once, they are all down while this is being done. If you migrate zones to another system, leaving just the global zone on one of the systems, you can patch that system. Then, when it has been rebooted and checked out, you can migrate systems as needed using update on attach, until they have all been updated. Then patch the other node that has just a global zone and roll back some of the zones to that box. This may cause 2-3 reboots for some zones depending on where they live, but zones usually boot fast compared to actual HW. Just to give you an idea, a system needing a hundred patches may take 1 hr to patch. An update on attach will run in a fraction of that time. It also allows you to schedule the time with your customer instead of requiring they all be down for hours at the same time. I find people are more apt to accept quick reboots and the short time for an update on attach than accept being down for an extended period while you patch everyone.

Ah, excellent. Anything that reduces downtime on the zones would be a win for us.

But if all zones are related then having them down at the same time may not be an issue and a parallel patch may be more acceptable.

Yeah, most of our clusters are single-application, but then there's the dreaded 'Unix Consolidation Cluster' with 20-odd business units' stuff on it. I have a feeling Update on Attach will come in handy. Unfortunately, I'll have to do an old-school patch run to get Update On Attach installed :( .

Now, I will say this. Before patching, validate the current packages and patches. I mention this as I ran into an issue on one of my systems (non-cluster, but global and 4 containers/zones) where the SUNWcsl package was missing pkginfo under /var/sadm/pkg/SUNWcsl.
Not only did it cause patch issues, but it also broke update on attach so badly that the only option I had was to rebuild the zones. Sun (at the time) wasn't much help. The issue was deeper than just the pkginfo file; it also involved the version of that file and the files in the package used in patching the zones. The patch utilities mangled them and I didn't catch it before patching.

Yeesh. Broken core libs package? Ouch.

I mention this as the version you are running is from the same period I was on when there were issues with the pkg/patch utilities, so better to check in advance.

We've been OK so far (4 clusters patched already). Fingers crossed.

--Dave

Cheers.
Dave

-Original Message-
From: pca-boun...@lists.univie.ac.at [mailto:pca-boun...@lists.univie.ac.at] On Behalf Of David Stark
Sent: Tuesday, July 13, 2010 6:09 AM
To: PCA (Patch Check Advanced) Discussion
Subject: Re: [pca] Cluster, zones, noreboot.

Ah, sorry Filip, my question wasn't very clear. I'm wondering if the downtime for each zone would be much less using Update On Attach compared with just bringing the whole cluster down and patching all the zones in parallel?

On 13/07/2010 13:56, Filip Francis wrote:
Dave,
If you want to do parallel patching you need to edit the /etc/patch/pdo.conf file. That will make parallel patching work on several zones at the same time.
Filip

On 07/13/10 14:28, David Stark wrote:
Hi all.
Thanks Filip, Glenn - I'd forgotten all about Update on Attach. It looks like Update On Attach is part of kernel patch 137137-09, which we've actually got applied on some recently patched clusters. The zoneadm man page lists the '-u' option on those hosts too, so it looks like this should be usable on the next patching run. Since we've already got parallel patching for zones, I'm not sure if this will really save a lot of downtime, though - is a zone update-attach any faster than a normal patch run?
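The pre-patch validation David French recommends can start with a check for exactly the symptom he describes: a package directory under /var/sadm/pkg with no pkginfo file. The sketch below is only that narrow check, not a full package audit. The function name `packages_missing_pkginfo` is hypothetical, and treating an empty pkginfo as broken is an assumption.

```python
import os


def packages_missing_pkginfo(pkg_root="/var/sadm/pkg"):
    """List package directories under pkg_root that lack a usable pkginfo.

    Assumption: each installed package has its own directory under pkg_root
    containing a non-empty 'pkginfo' file, as in the SUNWcsl case described
    in the thread.
    """
    missing = []
    for name in sorted(os.listdir(pkg_root)):
        pkg_dir = os.path.join(pkg_root, name)
        if not os.path.isdir(pkg_dir):
            continue  # skip stray files; only package directories count
        info = os.path.join(pkg_dir, "pkginfo")
        if not os.path.isfile(info) or os.path.getsize(info) == 0:
            missing.append(name)
    return missing
```

Run it once against the global zone's /var/sadm/pkg and once per zone with `pkg_root` pointed at that zone's own package directory. An empty list only means this one symptom is absent; it says nothing about the deeper file-version problems he also mentions.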
Re: [pca] Cluster, zones, noreboot.
Thoughts from a colleague - Enda O'Connor - inline...

Best,
-Don

Enda O'Connor wrote:
Hi
Some people have recommended Update On Attach; see
http://wikis.sun.com/display/BluePrints/Maintaining+Solaris+with+Live+Upgrade+and+Update+On+Attach
and also the following for a description of how update on attach works in conjunction with patching:
http://www.sun.com/bigadmin/features/articles/zone_attach_patch.jsp#Patching
It is important to first read the BigAdmin article to understand how it works before going down this route.
Enda

On 13/07/2010 16:32, Enda O'Connor wrote:
Hi David
The major issue with patching a live system with a failover zone is what happens if the zone fails over for any reason during patching. This would cause patch corruption. One would need to suspend the HA container resource, i.e.
clrg suspend the resource group
detach zones on remaining node
apply patch
attach zone on other node
clrg resume the resource group
But one would need to take some care to identify patches that can be applied in such fashion. The following doc has a section on applying patches that require Single User Mode in a failover zone environment:
http://docs.sun.com/app/docs/doc/819-2971/z476997776?a=view
I have cc'ed Chris, who has lots of experience in this area. But the main concern is that the zone might fail over during such patching.
Enda

On 13/07/2010 10:56, Don O'Malley wrote:
Hey Enda/Ed,
Any thoughts on this? In addition to the no-reboot question, is it better to detach your zones and use update on attach to bring the local zones back in sync, or is there no difference between the two (I thought update on attach was quicker)?
Best,
-Don

David Stark wrote:
Hi List.
A bit off-topic, but PCA's involved, so I'm going to push my luck. We've got a number of Sun Cluster installs using (lots of) failover zones and with no LiveUpgrade alt. boot space set up, which we need to patch.
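The suspend / detach / patch / attach / resume sequence Enda outlines can be written down as a dry-run plan and reviewed before anything is typed on a cluster node. This is a minimal sketch under stated assumptions, not a tested procedure. The function `failover_zone_patch_plan`, the resource group name and the zone names are hypothetical, and the patch step is a caller-supplied placeholder. Using `attach -u` (update on attach) for the attach step is an assumption drawn from the rest of the thread; Enda's list says only "attach zone". The sketch does not record which node each command runs on, and it executes nothing.

```python
def failover_zone_patch_plan(resource_group, zones, patch_command):
    """Return the ordered command list for one failover resource group.

    Nothing is executed. Assumption: the zones are already halted, since
    zoneadm detach only works on a zone that is not running.
    """
    plan = [f"clrg suspend {resource_group}"]  # stop automatic failover
    plan += [f"zoneadm -z {zone} detach" for zone in zones]
    plan.append(patch_command)  # caller-supplied patch step
    # '-u' updates the zone's packages and patches to match the host.
    plan += [f"zoneadm -z {zone} attach -u" for zone in zones]
    plan.append(f"clrg resume {resource_group}")  # re-enable failover
    return plan


if __name__ == "__main__":
    # Hypothetical names, printed for review only.
    for step in failover_zone_patch_plan(
        "example-rg", ["zone-a", "zone-b"], "PATCH_COMMAND_HERE"
    ):
        print(step)
```

Keeping the plan as plain strings means it can go into a change request or be checked against the Sun Cluster documentation Enda links before anyone runs it.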