Re: [Linux-cluster] mixing OS versions?
Hi,

On Friday 25 of April 2014 12:42:59 Steven Whitehouse wrote:
> Hi,
>
> On 24/04/14 17:29, Alan Brown wrote:
> > On 30/03/14 12:34, Steven Whitehouse wrote:
> >> Well that is not entirely true. We have done a great deal of
> >> investigation into this issue. We do test quotas (among many other
> >> things) on each release to ensure that they are working. Our tests have
> >> all passed correctly, and to date you have provided the only report of
> >> this particular issue via our support team. So it is certainly not
> >> something that lots of people are hitting.
> >
> > Someone else reported it on this list (on CentOS), so we're not an
> > isolated case.
> >
> >> We do now have a good idea of where the issue is. However it is clear
> >> that simply exceeding quotas is not enough to trigger it. Instead quotas
> >> need to be exceeded in a particular way.
> >
> > My suspicion is that it's some kind of interaction between quotas and
> > NFS, but it'd be good if you could provide a fuller explanation.
>
> Yes, that's what we thought to start with... however that turned out to
> be a bit of a red herring. Or at least the issue has nothing
> specifically to do with NFS. The problem was related to when quota was
> exceeded, and specifically what operation was in progress. You could
> write to files as often as you wanted to, and exceeding quota would be
> handled correctly. The problem was a specific code path within the inode
> creation code; if it didn't result in quota being exceeded on that one
> specific code path, then everything would work as expected.

Could you please provide a (somewhat reliable) test case to reproduce
this bug?

I have looked at the patch, and found nothing obviously related to
quotas (it seems the patch only changes the fail-path of the
posix_acl_create() call, which doesn't appear to have anything to do
with quotas).

I have been facing a possibly quota-related oops in GFS2 for some time,
which I am unable to reproduce without switching my cluster to
production use (which means potentially facing the anger of my users,
which I'd rather not do without at least a chance of the issue being
fixed).

Sadly, I don't have a Red Hat support subscription (nor do I use RHEL or
derivatives); my kernel is mostly upstream.

Thanks,
Pavel Herrmann

> Also, quite often when the problem did appear, it did not actually
> trigger a problem until later, making it difficult to track down.
>
> You are correct that someone else reported the issue on the list,
> however I'm not aware of any other reports beyond yours and theirs.
> Also, this was specific to certain versions of GFS2, and not something
> that relates to all versions.
>
> The upstream patch is here:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=059788039f1e6343f34f46d202f8d9f2158c2783
>
> It should be available in RHEL shortly - please ping support via the
> ticket for updates,
>
> Steve.
>
> >> Returning to the original point however, it is certainly not recommended
> >> to have mixed RHEL or CentOS versions running in the same cluster. It is
> >> much better to keep everything the same, even though the GFS2 on-disk
> >> format has not changed between the versions.
> >
> > More specifically (for those who are curious): Whilst the on-disk
> > format has not changed between EL5 and EL6, the way that RH cluster
> > members communicate with each other has.
> >
> > I ran a quick test some time back and the 2 different OS cluster
> > versions didn't see each other for LAN heartbeating.

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] mixing OS versions?
Hi,

On 24/04/14 17:29, Alan Brown wrote:
> On 30/03/14 12:34, Steven Whitehouse wrote:
>> Well that is not entirely true. We have done a great deal of
>> investigation into this issue. We do test quotas (among many other
>> things) on each release to ensure that they are working. Our tests have
>> all passed correctly, and to date you have provided the only report of
>> this particular issue via our support team. So it is certainly not
>> something that lots of people are hitting.
>
> Someone else reported it on this list (on CentOS), so we're not an
> isolated case.
>
>> We do now have a good idea of where the issue is. However it is clear
>> that simply exceeding quotas is not enough to trigger it. Instead quotas
>> need to be exceeded in a particular way.
>
> My suspicion is that it's some kind of interaction between quotas and
> NFS, but it'd be good if you could provide a fuller explanation.

Yes, that's what we thought to start with... however that turned out to
be a bit of a red herring. Or at least the issue has nothing
specifically to do with NFS. The problem was related to when quota was
exceeded, and specifically what operation was in progress. You could
write to files as often as you wanted to, and exceeding quota would be
handled correctly. The problem was a specific code path within the inode
creation code; if it didn't result in quota being exceeded on that one
specific code path, then everything would work as expected.

Also, quite often when the problem did appear, it did not actually
trigger a problem until later, making it difficult to track down.

You are correct that someone else reported the issue on the list,
however I'm not aware of any other reports beyond yours and theirs.
Also, this was specific to certain versions of GFS2, and not something
that relates to all versions.

The upstream patch is here:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=059788039f1e6343f34f46d202f8d9f2158c2783

It should be available in RHEL shortly - please ping support via the
ticket for updates,

Steve.

>> Returning to the original point however, it is certainly not recommended
>> to have mixed RHEL or CentOS versions running in the same cluster. It is
>> much better to keep everything the same, even though the GFS2 on-disk
>> format has not changed between the versions.
>
> More specifically (for those who are curious): Whilst the on-disk
> format has not changed between EL5 and EL6, the way that RH cluster
> members communicate with each other has.
>
> I ran a quick test some time back and the 2 different OS cluster
> versions didn't see each other for LAN heartbeating.
Re: [Linux-cluster] mixing OS versions?
Hi,

On Fri, 2014-03-28 at 22:07 +, Alan Brown wrote:
> On 28/03/14 19:31, Fabio M. Di Nitto wrote:
> > > Are there any known issues, guidelines, or recommendations for having
> > > a single RHCS cluster with different OS releases on the nodes?
> > Only one answer.. don't do it. It's not supported and it's only asking
> > for trouble.
>
> Seconded. There are _substantial_ differences between CentOS/RHEL 5
> and 6 clustering.
>
> You can run one or the other OS, but you can't mix them. The on-disk
> format isn't affected.
>
> Best path is to set up a cluster in 6, shut down the 5 cluster, attach
> disks to the 6 cluster and bring it all back up. The 5 boxes can be
> converted to version 6 afterwards.
>
> (I'm going through this at the moment, as I have 2 EL5 clusters and 1
> EL6 cluster.)
>
> TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for prime-time
> - if you enable GFS2 quotas and someone busts his quota the machine
> will panic.

Well that is not entirely true. We have done a great deal of
investigation into this issue. We do test quotas (among many other
things) on each release to ensure that they are working. Our tests have
all passed correctly, and to date you have provided the only report of
this particular issue via our support team. So it is certainly not
something that lots of people are hitting.

We do now have a good idea of where the issue is. However it is clear
that simply exceeding quotas is not enough to trigger it. Instead quotas
need to be exceeded in a particular way. Abhi is working on a fix which
should be available very shortly now.

Returning to the original point however, it is certainly not recommended
to have mixed RHEL or CentOS versions running in the same cluster. It is
much better to keep everything the same, even though the GFS2 on-disk
format has not changed between the versions.

I hope that answers a few questions - let us know if you need more info,

Steve.
Re: [Linux-cluster] mixing OS versions?
> In the message dated: Sat, 29 Mar 2014 09:00:04 -,
> The pithy ruminations from "Masopust, Christian" on
> were:
> => > => TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for
> => > => prime-time - if you enable GFS2 quotas and someone busts his
> => > => quota the machine will panic.
> => >
> => > That's an example of why I no longer use GFS2. :)
> => >
> => > Thanks,
> => >
> => > Mark
> =>
> => Hi Mark,
> =>
> => what instead of GFS2?
>
> GPFS, as I wrote in the message to which you replied:

sorry, my fault... didn't notice it, as GPFS has not been on my radar up
to now :)
Re: [Linux-cluster] mixing OS versions?
In the message dated: Sat, 29 Mar 2014 09:00:04 -,
The pithy ruminations from "Masopust, Christian" on
were:
=> > => TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for
=> > => prime-time - if you enable GFS2 quotas and someone busts his
=> > => quota the machine will panic.
=> >
=> > That's an example of why I no longer use GFS2. :)
=> >
=> > Thanks,
=> >
=> > Mark
=>
=> Hi Mark,
=>
=> what instead of GFS2?

GPFS, as I wrote in the message to which you replied:

---------------
From: berg...@merctech.com
To: linux clustering
Subject: Re: [Linux-cluster] mixing OS versions?
Date: Fri, 28 Mar 2014 18:35:42 -0400

[SNIP!]

For clarification, we're not using RHCS to manage any shared storage.
The only 'disk' component is the quorum disk. We're using GPFS as the
storage layer.
---------------

=>
=> br,
=> christian

--
Mark Bergman
Re: [Linux-cluster] mixing OS versions?
> => TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for
> => prime-time - if you enable GFS2 quotas and someone busts his quota
> => the machine will panic.
>
> That's an example of why I no longer use GFS2. :)
>
> Thanks,
>
> Mark

Hi Mark,

what instead of GFS2?

br,
christian
Re: [Linux-cluster] mixing OS versions?
In the message dated: Fri, 28 Mar 2014 22:07:48 -,
The pithy ruminations from Alan Brown on
were:
=> On 28/03/14 19:31, Fabio M. Di Nitto wrote:
=> > > Are there any known issues, guidelines, or recommendations for having
=> > > a single RHCS cluster with different OS releases on the nodes?
=> > Only one answer.. don't do it. It's not supported and it's only asking
=> > for trouble.

Thanks for all the warnings... not what I wanted to hear, but it's good
to get a clear, consistent message.

=> Seconded. There are _substantial_ differences between CentOS/RHEL 5 and
=> 6 clustering.
=>
=> You can run one or the other OS, but you can't mix them. The on-disk
=> format isn't affected.

For clarification, we're not using RHCS to manage any shared storage.
The only 'disk' component is the quorum disk. We're using GPFS as the
storage layer. RHCS manages several services, such as:

	httpd
	mysql
	nis
	pgsql

=> Best path is to set up a cluster in 6, shut down the 5 cluster, attach
=> disks to the 6 cluster and bring it all back up. The 5 boxes can be
=> converted to version 6 afterwards.

That's what I was expecting, unfortunately. I'll probably take a more
gradual approach... bring up a CentOS6 cluster with its own quorum disk,
and one by one add services (httpd, nis, etc.) to that, bringing them
down on the old cluster. Add in some CNAMEs and coordination with the
network group and it should be relatively transparent to the users.

=> (I'm going through this at the moment, as I have 2 EL5 clusters and 1
=> EL6 cluster.)
=>
=> TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for prime-time -
=> if you enable GFS2 quotas and someone busts his quota the machine will
=> panic.

That's an example of why I no longer use GFS2. :)

Thanks,

Mark
Re: [Linux-cluster] mixing OS versions?
On 28/03/14 19:31, Fabio M. Di Nitto wrote:
> > Are there any known issues, guidelines, or recommendations for having
> > a single RHCS cluster with different OS releases on the nodes?
> Only one answer.. don't do it. It's not supported and it's only asking
> for trouble.

Seconded. There are _substantial_ differences between CentOS/RHEL 5 and
6 clustering.

You can run one or the other OS, but you can't mix them. The on-disk
format isn't affected.

Best path is to set up a cluster in 6, shut down the 5 cluster, attach
disks to the 6 cluster and bring it all back up. The 5 boxes can be
converted to version 6 afterwards.

(I'm going through this at the moment, as I have 2 EL5 clusters and 1
EL6 cluster.)

TAKE NOTE: RHEL/CentOS6 clustering is not quite ready for prime-time -
if you enable GFS2 quotas and someone busts his quota the machine will
panic.
Re: [Linux-cluster] mixing OS versions?
You can get by, for a short time, with a minor revision difference, say
5.7 and 5.8, but mixing 5 and 6 will not work. Period.

On Fri, Mar 28, 2014 at 12:31 PM, Fabio M. Di Nitto wrote:
> On 03/28/2014 05:37 PM, berg...@merctech.com wrote:
> > I've got a 3-node cluster under CentOS5.
> >
> > I'd like to add 3 additional nodes, running CentOS6.
> >
> > Are there any known issues, guidelines, or recommendations for having
> > a single RHCS cluster with different OS releases on the nodes?
>
> Only one answer.. don't do it. It's not supported and it's only asking
> for trouble.
>
> Fabio

--
- jim
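
As a rough illustration of the rule of thumb above (a minor revision
difference can be tolerated briefly, a major one cannot), a pre-flight
check could group the nodes by major release before anything is joined
to the cluster. This is only a sketch: the node names are made up, the
release strings are merely modelled on the style of /etc/redhat-release,
and none of this reflects what RHCS itself checks.

```python
import re
from collections import defaultdict


def major_versions(node_releases):
    """Group node names by the major OS version in their release string.

    node_releases maps a (hypothetical) node name to a release string
    such as "CentOS release 5.8 (Final)".
    """
    groups = defaultdict(list)
    for node, release in node_releases.items():
        match = re.search(r"(\d+)\.(\d+)", release)
        if match is None:
            raise ValueError(f"no version number in {release!r} for {node}")
        groups[int(match.group(1))].append(node)
    return dict(groups)


def is_mixed_major(node_releases):
    """Return True if the nodes span more than one major OS version."""
    return len(major_versions(node_releases)) > 1


# Hypothetical inventory: three existing nodes plus three proposed ones.
nodes = {
    "node1": "CentOS release 5.7 (Final)",
    "node2": "CentOS release 5.8 (Final)",
    "node3": "CentOS release 5.8 (Final)",
    "node4": "CentOS release 6.5 (Final)",
    "node5": "CentOS release 6.5 (Final)",
    "node6": "CentOS release 6.5 (Final)",
}

if is_mixed_major(nodes):
    print("Mixed major versions, do not put these in one cluster:")
    for major, members in sorted(major_versions(nodes).items()):
        print(f"  EL{major}: {', '.join(sorted(members))}")
```

A 5.7/5.8 mix passes this check, since only the major number is
compared; whether such a mix is acceptable for more than a short
upgrade window is a separate question.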
Re: [Linux-cluster] mixing OS versions?
On 03/28/2014 05:37 PM, berg...@merctech.com wrote:
> I've got a 3-node cluster under CentOS5.
>
> I'd like to add 3 additional nodes, running CentOS6.
>
> Are there any known issues, guidelines, or recommendations for having
> a single RHCS cluster with different OS releases on the nodes?

Only one answer.. don't do it. It's not supported and it's only asking
for trouble.

Fabio
[Linux-cluster] mixing OS versions?
I've got a 3-node cluster under CentOS5.

I'd like to add 3 additional nodes, running CentOS6.

Are there any known issues, guidelines, or recommendations for having
a single RHCS cluster with different OS releases on the nodes?

Thanks,

Mark