Re: [Openais] detecting cpg joiners
On Wed, May 06, 2009 at 02:10:27PM -0700, Steven Dake wrote: > On Wed, 2009-05-06 at 15:04 -0500, David Teigland wrote: > > On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote: > > > On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote: > > > > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote: > > > > > 0. configure token timeout to some long time that is longer than all > > > > > the > > > > >following steps take > > > > > > > > > > 1. cluster members are nodeid's: 1,2,3,4 > > > > > > > > > > 2. cpg foo has the following members: > > > > >nodeid 1, pid 10 > > > > >nodeid 2, pid 20 > > > > >nodeid 3, pid 30 > > > > >nodeid 4, pid 40 > > > > > > > > > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40 > > > > >(optionally reboot this node now) > > > > > > > > > > 4. nodeid 4: ifup eth0, start corosync > > > > > > > > > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg > > > > >showing that 4:40 is not a member > > > > > > > > > > 6. nodeid 4: start process pid 41 that joins cpg foo > > > > > > > > > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg > > > > >showing that 4:41 is a member > > > > > > > > > > (Steps 6 and 7 should work the same even if the process started in > > > > > step 6 > > > > > has pid 40 instead of pid 41.) > > > > > > > 100% agree that is how it should work. If it doesn't, we will fix it. > > > > The only thing that may be strange is if pid in step 6 is the same pid > > > > as 40. Are you certain the test case which fails has a differing pid at > > > > step 6? > > > > > > If you fix step 5, then I suspect steps 6,7 will "just work". After the > > > test > > > failed at step 5 I didn't pay too much attention to 6,7... but I'm sure > > > that > > > the pid in step 6 was different (I didn't reboot the node). > > > > It's not clear what the plan was for this, any recent related changes I > > should > > try? 
> > Dave
> >
>
> I haven't tried corosync with this test case, but it should work now.
> Did you try latest corosync on this case? If it still fails Jan can
> address before 1.0.

Just tried it, and I get the same behavior as before.

Dave
___
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Wed, 2009-05-06 at 15:04 -0500, David Teigland wrote: > On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote: > > On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote: > > > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote: > > > > 0. configure token timeout to some long time that is longer than all the > > > >following steps take > > > > > > > > 1. cluster members are nodeid's: 1,2,3,4 > > > > > > > > 2. cpg foo has the following members: > > > >nodeid 1, pid 10 > > > >nodeid 2, pid 20 > > > >nodeid 3, pid 30 > > > >nodeid 4, pid 40 > > > > > > > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40 > > > >(optionally reboot this node now) > > > > > > > > 4. nodeid 4: ifup eth0, start corosync > > > > > > > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg > > > >showing that 4:40 is not a member > > > > > > > > 6. nodeid 4: start process pid 41 that joins cpg foo > > > > > > > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg > > > >showing that 4:41 is a member > > > > > > > > (Steps 6 and 7 should work the same even if the process started in step > > > > 6 > > > > has pid 40 instead of pid 41.) > > > > > 100% agree that is how it should work. If it doesn't, we will fix it. > > > The only thing that may be strange is if pid in step 6 is the same pid > > > as 40. Are you certain the test case which fails has a differing pid at > > > step 6? > > > > If you fix step 5, then I suspect steps 6,7 will "just work". After the > > test > > failed at step 5 I didn't pay too much attention to 6,7... but I'm sure that > > the pid in step 6 was different (I didn't reboot the node). > > It's not clear what the plan was for this, any recent related changes I should > try? > Dave > I haven't tried corosync with this test case, but it should work now. Did you try latest corosync on this case? If it still fails Jan can address before 1.0. 
Regards
-steve
Re: [Openais] detecting cpg joiners
On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote: > On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote: > > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote: > > > 0. configure token timeout to some long time that is longer than all the > > >following steps take > > > > > > 1. cluster members are nodeid's: 1,2,3,4 > > > > > > 2. cpg foo has the following members: > > >nodeid 1, pid 10 > > >nodeid 2, pid 20 > > >nodeid 3, pid 30 > > >nodeid 4, pid 40 > > > > > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40 > > >(optionally reboot this node now) > > > > > > 4. nodeid 4: ifup eth0, start corosync > > > > > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg > > >showing that 4:40 is not a member > > > > > > 6. nodeid 4: start process pid 41 that joins cpg foo > > > > > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg > > >showing that 4:41 is a member > > > > > > (Steps 6 and 7 should work the same even if the process started in step 6 > > > has pid 40 instead of pid 41.) > > > 100% agree that is how it should work. If it doesn't, we will fix it. > > The only thing that may be strange is if pid in step 6 is the same pid > > as 40. Are you certain the test case which fails has a differing pid at > > step 6? > > If you fix step 5, then I suspect steps 6,7 will "just work". After the test > failed at step 5 I didn't pay too much attention to 6,7... but I'm sure that > the pid in step 6 was different (I didn't reboot the node). It's not clear what the plan was for this, any recent related changes I should try? Dave ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote: > On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote: > > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote: > > > 0. configure token timeout to some long time that is longer than all the > > >following steps take > > > > > > 1. cluster members are nodeid's: 1,2,3,4 > > > > > > 2. cpg foo has the following members: > > >nodeid 1, pid 10 > > >nodeid 2, pid 20 > > >nodeid 3, pid 30 > > >nodeid 4, pid 40 > > > > > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40 > > >(optionally reboot this node now) > > > > > > 4. nodeid 4: ifup eth0, start corosync > > > > > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg > > >showing that 4:40 is not a member > > > > > > 6. nodeid 4: start process pid 41 that joins cpg foo > > > > > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg > > >showing that 4:41 is a member > > > > > > (Steps 6 and 7 should work the same even if the process started in step 6 > > > has pid 40 instead of pid 41.) > > > 100% agree that is how it should work. If it doesn't, we will fix it. > > The only thing that may be strange is if pid in step 6 is the same pid > > as 40. Are you certain the test case which fails has a differing pid at > > step 6? > > If you fix step 5, then I suspect steps 6,7 will "just work". After the test > failed at step 5 I didn't pay too much attention to 6,7... but I'm sure that > the pid in step 6 was different (I didn't reboot the node). Yeah, if we reliably get "4:40 leaves; 4:40 joins", we still have the information we need. We need the event. The pid-wrap concern was based on the assumption that 4:40 leaving and a new process 4:40 joining would be considered as a steady-state and we would get no leave event. Joel -- "Anything that is too stupid to be spoken is sung." 
- Voltaire

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote:
> On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote:
> > 0. configure token timeout to some long time that is longer than all the
> >    following steps take
> >
> > 1. cluster members are nodeid's: 1,2,3,4
> >
> > 2. cpg foo has the following members:
> >    nodeid 1, pid 10
> >    nodeid 2, pid 20
> >    nodeid 3, pid 30
> >    nodeid 4, pid 40
> >
> > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40
> >    (optionally reboot this node now)
> >
> > 4. nodeid 4: ifup eth0, start corosync
> >
> > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg
> >    showing that 4:40 is not a member
> >
> > 6. nodeid 4: start process pid 41 that joins cpg foo
> >
> > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg
> >    showing that 4:41 is a member
> >
> > (Steps 6 and 7 should work the same even if the process started in step 6
> > has pid 40 instead of pid 41.)
>
> 100% agree that is how it should work. If it doesn't, we will fix it.
> The only thing that may be strange is if pid in step 6 is the same pid
> as 40. Are you certain the test case which fails has a differing pid at
> step 6?

If you fix step 5, then I suspect steps 6,7 will "just work". After the test
failed at step 5 I didn't pay too much attention to 6,7... but I'm sure that
the pid in step 6 was different (I didn't reboot the node).

Dave
Re: [Openais] detecting cpg joiners
On Mon, 2009-04-13 at 09:41 -0700, Joel Becker wrote:
> On Thu, Apr 09, 2009 at 09:58:15PM -0700, Joel Becker wrote:
> > On Thu, Apr 09, 2009 at 06:06:13PM -0700, Steven Dake wrote:
> > > I'd like to clear up that when Andrew talks about the membership not
> > > generating a leave event for totem processes in this scenario (which he
> > > integrates directly with), this is true. But cpg should generate a
> > > leave event.
> >
> > Even if the pid is the same? That is, if my node reboots very
> > fast, and my daemon comes back. What happens in cpg if a) my daemon has
> > a different pid, b) my daemon has the same pid? I'd like to see a) a
> > leave event for the old nodeid+pid and a join event for the new
> > nodeid+pid, b) a leave and a join event for the nodeid+pid.
>
> Steve,
> I never got a reply for this. I want to clarify cpg behavior
> before I fix up my daemon's routines.
>

My reply was this:
http://marc.info/?l=openais&m=123932549923230&w=2

I also recently posted about the weakness in pid reuse on a rebooting node,
which seems like a pretty serious problem.

Dave gives an excellent outline of the events we expect to see in a followup
message. It matches the events you said you would like to see.

Regards
-steve

> Joel
Re: [Openais] detecting cpg joiners
On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 06:02:38PM -0700, Steven Dake wrote:
> > The issue that Dave is talking about I believe is described in the
> > following bugzilla:
> > https://bugzilla.redhat.com/show_bug.cgi?id=489451
>
> No, not at all.
>
> > IMO you should get a leave event for any process that leaves the process
> > group independent of how totem works underneath. CPG should provide the
> > guarantees you seek, and if it doesn't, it is defective.
>
> OK, good. Here's what we expect:
>
> 0. configure token timeout to some long time that is longer than all the
>    following steps take
>
> 1. cluster members are nodeid's: 1,2,3,4
>
> 2. cpg foo has the following members:
>    nodeid 1, pid 10
>    nodeid 2, pid 20
>    nodeid 3, pid 30
>    nodeid 4, pid 40
>
> 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40
>    (optionally reboot this node now)
>
> 4. nodeid 4: ifup eth0, start corosync
>
> 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg
>    showing that 4:40 is not a member
>
> 6. nodeid 4: start process pid 41 that joins cpg foo
>
> 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg
>    showing that 4:41 is a member
>
> (Steps 6 and 7 should work the same even if the process started in step 6 has
> pid 40 instead of pid 41.)
>
> Dave

100% agree that is how it should work. If it doesn't, we will fix it.
The only thing that may be strange is if pid in step 6 is the same pid
as 40. Are you certain the test case which fails has a differing pid at
step 6?

This points out a weakness in the current cpg protocol which could be
addressed by adding a pid start time to the multicast message to
uniquely identify node restarts with the same pid startup order.
Unfortunately this would have to be done in some backward compatible
fashion.

Regards
-steve
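[Editor's sketch] The start-time idea in the message above can be illustrated with a small model. This is only a sketch under stated assumptions: ProcId, its start_time field and stale_entry() are hypothetical names invented for illustration and are not part of the cpg protocol or API. It shows how a joiner that reuses a nodeid and pid could still be told apart from the old process if each member carried a start time.

```python
from typing import NamedTuple, Optional, Set


class ProcId(NamedTuple):
    """Hypothetical member identity: nodeid and pid plus a start time."""
    nodeid: int
    pid: int
    start_time: int  # assumed extra field, not present in cpg today


def stale_entry(members: Set[ProcId], joiner: ProcId) -> Optional[ProcId]:
    """Return the existing member that the joiner replaces, if any.

    A member is stale when it has the same nodeid and pid as the joiner
    but a different start time, i.e. the pid was reused after a restart.
    """
    for member in members:
        same_pid = (member.nodeid, member.pid) == (joiner.nodeid, joiner.pid)
        if same_pid and member.start_time != joiner.start_time:
            return member
    return None


members = {ProcId(1, 10, 100), ProcId(4, 40, 500)}
# Node 4 restarted and the new process got pid 40 again, started later.
print(stale_entry(members, ProcId(4, 40, 900)))
```

With only nodeid and pid as the key, the old and new 4:40 would compare equal and the restart would go unnoticed; the extra field is what makes the old entry detectable.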
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 06:02:38PM -0700, Steven Dake wrote:
> The issue that Dave is talking about I believe is described in the
> following bugzilla:
> https://bugzilla.redhat.com/show_bug.cgi?id=489451

No, not at all.

> IMO you should get a leave event for any process that leaves the process
> group independent of how totem works underneath. CPG should provide the
> guarantees you seek, and if it doesn't, it is defective.

OK, good. Here's what we expect:

0. configure token timeout to some long time that is longer than all the
   following steps take

1. cluster members are nodeid's: 1,2,3,4

2. cpg foo has the following members:
   nodeid 1, pid 10
   nodeid 2, pid 20
   nodeid 3, pid 30
   nodeid 4, pid 40

3. nodeid 4: ifdown eth0, kill corosync, kill pid 40
   (optionally reboot this node now)

4. nodeid 4: ifup eth0, start corosync

5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg
   showing that 4:40 is not a member

6. nodeid 4: start process pid 41 that joins cpg foo

7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg
   showing that 4:41 is a member

(Steps 6 and 7 should work the same even if the process started in step 6 has
pid 40 instead of pid 41.)

Dave
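[Editor's sketch] The expected sequence in steps 5 and 7 above can be written down as data, which may help when turning the scenario into an automated test. This is a minimal model, not corosync code: Confchg and expected_confchgs() are hypothetical names, and members are plain (nodeid, pid) tuples.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

Member = Tuple[int, int]  # (nodeid, pid)


class Confchg(NamedTuple):
    """Hypothetical record of one configuration change seen by cpg foo."""
    members: FrozenSet[Member]
    joined: FrozenSet[Member]
    left: FrozenSet[Member]


# Step 2: initial membership of cpg foo.
START = frozenset({(1, 10), (2, 20), (3, 30), (4, 40)})


def expected_confchgs(new_pid: int = 41) -> List[Confchg]:
    """Return the confchgs expected at step 5 and step 7."""
    old, new = (4, 40), (4, new_pid)
    after_leave = START - {old}
    # Step 5: 4:40 is reported as having left.
    step5 = Confchg(after_leave, frozenset(), frozenset({old}))
    # Step 7: the new process on node 4 is reported as having joined.
    step7 = Confchg(after_leave | {new}, frozenset({new}), frozenset())
    return [step5, step7]


for change in expected_confchgs():
    print(sorted(change.members), sorted(change.joined), sorted(change.left))
```

Calling expected_confchgs(new_pid=40) models the parenthetical case: the final membership equals the starting one, yet two separate events are still expected.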
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 09:58:15PM -0700, Joel Becker wrote:
> On Thu, Apr 09, 2009 at 06:06:13PM -0700, Steven Dake wrote:
> > I'd like to clear up that when Andrew talks about the membership not
> > generating a leave event for totem processes in this scenario (which he
> > integrates directly with), this is true. But cpg should generate a
> > leave event.
>
> Even if the pid is the same? That is, if my node reboots very
> fast, and my daemon comes back. What happens in cpg if a) my daemon has
> a different pid, b) my daemon has the same pid? I'd like to see a) a
> leave event for the old nodeid+pid and a join event for the new
> nodeid+pid, b) a leave and a join event for the nodeid+pid.

Steve,
I never got a reply for this. I want to clarify cpg behavior
before I fix up my daemon's routines.

Joel

--

Life's Little Instruction Book #30

"Never buy a house without a fireplace."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 06:06:13PM -0700, Steven Dake wrote:
> I'd like to clear up that when Andrew talks about the membership not
> generating a leave event for totem processes in this scenario (which he
> integrates directly with), this is true. But cpg should generate a
> leave event.

Even if the pid is the same? That is, if my node reboots very
fast, and my daemon comes back. What happens in cpg if a) my daemon has
a different pid, b) my daemon has the same pid? I'd like to see a) a
leave event for the old nodeid+pid and a join event for the new
nodeid+pid, b) a leave and a join event for the nodeid+pid.

Joel

--

Life's Little Instruction Book #306

"Take a nap on Sunday afternoons."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
> guarantees you seek, and if it doesn't, it is defective. The only
> exception might be if the new process reuses the same PID since the
> pid/nodeid/group are the uniqifiers and if pid is the same, there is no
> way to detect the new process (and remove the old one).

PID reuse happens more often than you may think. We finally started
to use a PID/starttime tuple to get unique process identifiers.

- Dietmar
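[Editor's sketch] One way to obtain a PID/starttime tuple on Linux is to read field 22 (starttime, in clock ticks since boot) from /proc/<pid>/stat, as documented in proc(5). The message above does not say how its author obtained the start time, so this is an assumption, and parse_starttime() and process_identity() are hypothetical helper names.

```python
import os
from typing import Tuple


def parse_starttime(stat_line: str) -> int:
    """Extract field 22 (starttime) from the text of /proc/<pid>/stat.

    Field 2 (comm) is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' before counting.
    """
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 22 is at index 22 - 3.
    return int(after_comm[22 - 3])


def process_identity(pid: int) -> Tuple[int, int]:
    """Return (pid, starttime) for a running process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return (pid, parse_starttime(stat_file.read()))


if os.path.exists("/proc/self/stat"):
    print(process_identity(os.getpid()))
```

Two processes that happen to receive the same pid will normally differ in the second element. Note that the value is relative to boot, so it is only comparable on the same node within the same boot.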
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 17:17 -0700, Joel Becker wrote: > On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote: > > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker > > > > wrote: > > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > > > >> even appear to have left (or rejoined) the cluster. > > > > >> At the next totem confchg event, It will simply just be there again > > > > >> with no indication that anything happened. > > > > > > > > > > ? ? ? ?This had BETTER not happen. > > > > > > > > It does, I've seen it enough times that Pacemaker has code to deal with > > > > it. > > > > > > I'd call that a serious flaw we need to get fixed. I'll see if I can > > > make it > > > happen here. > > > > That was pretty simple. > > - set token to 5 minutes > > - nodes 1,2,3,4 are cluster members and members of a cpg > > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync > > - nodes 1,2,3 seem completely unaware that 4 ever went away > > > > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think > > that > > a new fifth process/node is joining the cpg. The cpg on node 4 shows itself > > being added as a new fourth cpg member. > > Steve, > If node 4's old process went away, shouldn't we be getting a > 'leave' for that, rather than it persisting in the member list? > > Joel > I'd like to clear up that when Andrew talks about the membership not generating a leave event for totem processes in this scenario (which he integrates directly with), this is true. But cpg should generate a leave event. ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 17:17 -0700, Joel Becker wrote: > On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote: > > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker > > > > wrote: > > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > > > >> even appear to have left (or rejoined) the cluster. > > > > >> At the next totem confchg event, It will simply just be there again > > > > >> with no indication that anything happened. > > > > > > > > > > ? ? ? ?This had BETTER not happen. > > > > > > > > It does, I've seen it enough times that Pacemaker has code to deal with > > > > it. > > > > > > I'd call that a serious flaw we need to get fixed. I'll see if I can > > > make it > > > happen here. > > > > That was pretty simple. > > - set token to 5 minutes > > - nodes 1,2,3,4 are cluster members and members of a cpg > > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync > > - nodes 1,2,3 seem completely unaware that 4 ever went away > > > > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think > > that > > a new fifth process/node is joining the cpg. The cpg on node 4 shows itself > > being added as a new fourth cpg member. > > Steve, > If node 4's old process went away, shouldn't we be getting a > 'leave' for that, rather than it persisting in the member list? > The issue that Dave is talking about I believe is described in the following bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=489451 The bugzilla is a little misleading. I think sync prior to this bug fix didn't work at all. IMO you should get a leave event for any process that leaves the process group independent of how totem works underneath. CPG should provide the guarantees you seek, and if it doesn't, it is defective. 
The only exception might be if the new process reuses the same PID since
the pid/nodeid/group are the uniqifiers and if pid is the same, there is no
way to detect the new process (and remove the old one).

How it works in reality, I am not sure. Have you tried Dave's test case
with a recent whitetank?

Honza and I are working on a rework of the cpg service engine which
should have correct behavior in whitetank and corosync when it is
finished, as well as fix race condition crashes.

Regards
-steve

> Joel
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote: > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > > >> even appear to have left (or rejoined) the cluster. > > > >> At the next totem confchg event, It will simply just be there again > > > >> with no indication that anything happened. > > > > > > > > ? ? ? ?This had BETTER not happen. > > > > > > It does, I've seen it enough times that Pacemaker has code to deal with > > > it. > > > > I'd call that a serious flaw we need to get fixed. I'll see if I can make > > it > > happen here. > > That was pretty simple. > - set token to 5 minutes > - nodes 1,2,3,4 are cluster members and members of a cpg > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync > - nodes 1,2,3 seem completely unaware that 4 ever went away > > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that > a new fifth process/node is joining the cpg. The cpg on node 4 shows itself > being added as a new fourth cpg member. Steve, If node 4's old process went away, shouldn't we be getting a 'leave' for that, rather than it persisting in the member list? Joel -- "I don't want to achieve immortality through my work; I want to achieve immortality through not dying." - Woody Allen Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127 ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> >> For added fun, a node that restarts quickly enough (think a VM) won't
> >> even appear to have left (or rejoined) the cluster.
> >> At the next totem confchg event, It will simply just be there again
> >> with no indication that anything happened.
> >
> > This had BETTER not happen.
>
> It does, I've seen it enough times that Pacemaker has code to deal with it.

Andrew, I'm mad at you. This is death for filesystems. Next
time, please let us know when the system is this bad :-)

Joel

--

"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
 - Red Smith

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:17:47PM -0700, Steven Dake wrote:
> A proper system using this model doesn't care - it synchronizes every
> time regardless of who left or joined based upon whether it has state to
> sync that is unique.

Dave,
If we're going to use cpg for our membership, we need to come up
with a scheme to detect these node downs. We probably should do this
together, so we don't reinvent it.

Joel

--

"If you are ever in doubt as to whether or not to kiss a pretty girl,
 give her the benefit of the doubt"
 -Thomas Carlyle

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:17:47PM -0700, Steven Dake wrote:
> You want a guarantee that virtual synchrony doesn't provide. Virtual
> synchrony doesn't provide indications of join or left, but only the
> current membership. It has no way of knowing who joined, or left other
> then to take the previous membership list and compare it to the current.
> Keep that in mind when looking at the joined and left list in your
> callbacks.
>
> A proper system using this model doesn't care - it synchronizes every
> time regardless of who left or joined based upon whether it has state to
> sync that is unique.
>
> I was tempted long ago to remove the join and left lists from the
> callbacks, since they don't really make any sense, but the community
> said they could deal with this quirk.

Hmm, I don't think any of us in the world of dlms realized this.
You're providing the level-triggered case, and we mostly only care about
the edges. ocfs2, for example, doesn't really care who the members are.
It just needs to know when one died. And if we can't reliably detect
that, we're dead in the water.

Joel

--

Life's Little Instruction Book #497

"Go down swinging."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, 2009-04-09 at 16:09 -0500, David Teigland wrote: > On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > > >> even appear to have left (or rejoined) the cluster. > > > >> At the next totem confchg event, It will simply just be there again > > > >> with no indication that anything happened. > > > > > > > > ? ? ? ?This had BETTER not happen. > > > > > > It does, I've seen it enough times that Pacemaker has code to deal with > > > it. > > > > I'd call that a serious flaw we need to get fixed. I'll see if I can make > > it > > happen here. > > That was pretty simple. > - set token to 5 minutes > - nodes 1,2,3,4 are cluster members and members of a cpg > - on node4: ifdown eth0, kill corosync, ifup eth0, start corosync > - nodes 1,2,3 seem completely unaware that 4 ever went away > > When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that > a new fifth process/node is joining the cpg. The cpg on node 4 shows itself > being added as a new fourth cpg member. > > Dave > You want a guarantee that virtual synchrony doesn't provide. Virtual synchrony doesn't provide indications of join or left, but only the current membership. It has no way of knowing who joined, or left other then to take the previous membership list and compare it to the current. Keep that in mind when looking at the joined and left list in your callbacks. A proper system using this model doesn't care - it synchronizes every time regardless of who left or joined based upon whether it has state to sync that is unique. I was tempted long ago to remove the join and left lists from the callbacks, since they don't really make any sense, but the community said they could deal with this quirk. 
regards
-steve
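[Editor's sketch] The point made in the message above, that joined and left lists can only be derived by comparing the previous membership with the current one, can be shown in a few lines. This is an illustrative model only: diff_membership() is a hypothetical helper, not a corosync function, and members are (nodeid, pid) tuples.

```python
from typing import Iterable, List, Tuple

Member = Tuple[int, int]  # (nodeid, pid)


def diff_membership(
    previous: Iterable[Member], current: Iterable[Member]
) -> Tuple[List[Member], List[Member]]:
    """Derive (joined, left) purely from two membership snapshots."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)


before = [(1, 10), (2, 20), (3, 30), (4, 40)]

# Node 4's process is gone in the next snapshot: the leave is visible.
print(diff_membership(before, before[:3]))

# Node 4 restarted and rejoined with the same pid between snapshots:
# both lists come out empty, so the restart leaves no trace.
print(diff_membership(before, before))
```

The second call is the level-triggered case discussed in this thread: a consumer that needs the edge (the death of 4:40) gets nothing from a snapshot comparison alone.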
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: > On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: > > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote: > > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: > > >> For added fun, a node that restarts quickly enough (think a VM) won't > > >> even appear to have left (or rejoined) the cluster. > > >> At the next totem confchg event, It will simply just be there again > > >> with no indication that anything happened. > > > > > > ? ? ? ?This had BETTER not happen. > > > > It does, I've seen it enough times that Pacemaker has code to deal with it. > > I'd call that a serious flaw we need to get fixed. I'll see if I can make it > happen here. Yeah, if this is the way it works, ocfs2's going to have to go drop openais, and I don't want to do that. Joel -- "All alone at the end of the evening When the bright lights have faded to blue. I was thinking about a woman who had loved me And I never knew" Joel Becker Principal Software Developer Oracle E-mail: joel.bec...@oracle.com Phone: (650) 506-8127 ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> > On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> > >> For added fun, a node that restarts quickly enough (think a VM) won't
> > >> even appear to have left (or rejoined) the cluster.
> > >> At the next totem confchg event, It will simply just be there again
> > >> with no indication that anything happened.
> > >
> > > This had BETTER not happen.
> >
> > It does, I've seen it enough times that Pacemaker has code to deal with it.
>
> I'd call that a serious flaw we need to get fixed. I'll see if I can make it
> happen here.

That was pretty simple.
- set token to 5 minutes
- nodes 1,2,3,4 are cluster members and members of a cpg
- on node4: ifdown eth0, kill corosync, ifup eth0, start corosync
- nodes 1,2,3 seem completely unaware that 4 ever went away

When node 4 joins the cpg after coming back, the cpg on nodes 1,2,3 think that
a new fifth process/node is joining the cpg. The cpg on node 4 shows itself
being added as a new fourth cpg member.

Dave
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> > On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> >> For added fun, a node that restarts quickly enough (think a VM) won't
> >> even appear to have left (or rejoined) the cluster.
> >> At the next totem confchg event, It will simply just be there again
> >> with no indication that anything happened.
> >
> > This had BETTER not happen.
>
> It does, I've seen it enough times that Pacemaker has code to deal with it.

I'd call that a serious flaw we need to get fixed. I'll see if I can make it
happen here.

Dave
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 19:15, David Teigland wrote:
> On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
>> For added fun, a node that restarts quickly enough (think a VM) won't
>> even appear to have left (or rejoined) the cluster.
>> At the next totem confchg event, It will simply just be there again
>> with no indication that anything happened.
>>
>> At least this is true for the raw corosync/openais membership data,
>> perhaps CPG can infer this some other way.
>
> Cpg should not let a node go away and come back without notice.  In practice
> I'd expect back to back confchg's: one showing it leave and another showing it
> join.

If you mean the raw confchg's that lcrsos see, then nope.
Try this, set token: to longer than your node takes to reboot and reboot
a node.

For physical nodes this isn't a realistic scenario, but VMs can easily
boot in 10 seconds or so.

> As Chrissie mentioned earlier, cpg shouldn't show the same node both
> leaving and joining in a single confchg.  In theory I think it would be
> legitimate.
>
> Consider a couple examples.
> m: member list, j: joined list, l: left list
>
> 1. nodes A and B join at once
>    A gets confchg: m=A,B j=A,B l=
>    B gets confchg: m=A,B j=A,B l=
>
> 2. node C joins
>    A gets confchg: m=A,B,C j=C l=
>    B gets confchg: m=A,B,C j=C l=
>    C gets confchg: m=A,B,C j=C l=
>
> 3. node C leaves and quickly rejoins in a single confchg
>    A gets confchg: m=A,B,C j=C l=C
>    B gets confchg: m=A,B,C j=C l=C
>    C gets confchg: m=A,B,C j=C l=C
>
> 4. node D joins and quickly leaves (or fails) in a single confchg
>    A gets confchg: m=A,B,C j=D l=D
>    B gets confchg: m=A,B,C j=D l=D
>    C gets confchg: m=A,B,C j=D l=D
>    D gets confchg: m=A,B,C j=D l=D ?*
>
> * if D does a quick join+leave it may expect to see this confchg showing it in
> the joined list, the left list, and not in the member list.
>
> Again, the examples in 3 and 4 are, I think, legitimate in theory.  In
> practice it sounds like they won't occur.
>
> If a quick leave+join is guaranteed to be visible through cpg, then it must be
> possible to observe at the lower level from raw corosync data.
>
> Dave
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 20:49, Joel Becker wrote:
> On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
>> For added fun, a node that restarts quickly enough (think a VM) won't
>> even appear to have left (or rejoined) the cluster.
>> At the next totem confchg event, It will simply just be there again
>> with no indication that anything happened.
>
> This had BETTER not happen.

It does, I've seen it enough times that Pacemaker has code to deal with it.

> If it does, we can't recover the
> dead+restarted node, and our filesystems are going to corrupt all the
> time.
>
> Joel
>
> --
>
> "If you are ever in doubt as to whether or not to kiss a pretty girl,
> give her the benefit of the doubt"
> -Thomas Carlyle
>
> Joel Becker
> Principal Software Developer
> Oracle
> E-mail: joel.bec...@oracle.com
> Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 08:37:00AM +0100, Chrissie Caulfield wrote:
> 1) If member_count == join_count, then it's a safe bet that they are all
> new nodes, and yes, it is true that all nodes should see the same
> confchg messages
>
> 2) if join_count > 0 then leave_count will always be zero. That's a
> consequence of how CPG sends its messages really, join and leave
> messages are always separate. Don't rely on this behaviour though!
> Although I can't see any reason to change it, I'd rather not have it
> burned into the de facto specification.

I agree we shouldn't rely on this.  What I'm more concerned about is
whether, if member_count == join_count and leave_count > 0, we can
still rely on members == joiners, and thus treat it as a newly created
group (all members are in the "just joined" state).

Joel

--

"War doesn't determine who's right; war determines who's left."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
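
One cautious way to act on the question above is to compare the two lists by identity instead of by length. The sketch below assumes each list entry is a (nodeid, pid) pair and uses a hypothetical helper, `all_members_just_joined`. The input is the theoretical case raised elsewhere in this thread, where a process joins and leaves within a single confchg. It is not something CPG has been confirmed to deliver.

```python
def all_members_just_joined(members, joined):
    """True when the member list and the joined list name the same processes."""
    return set(members) == set(joined)


# Hypothetical confchg: 1:10 is an old member, 4:41 joined and left at once.
members = [(1, 10)]
joined = [(4, 41)]
left = [(4, 41)]

counts_match = len(members) == len(joined)
print(counts_match)                               # the counts agree
print(all_members_just_joined(members, joined))   # the identities do not
```

If such a confchg can never occur, the two checks are equivalent. The set comparison simply does not depend on that guarantee.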
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, It will simply just be there again
> with no indication that anything happened.

This had BETTER not happen.  If it does, we can't recover the
dead+restarted node, and our filesystems are going to corrupt all the
time.

Joel

--

"If you are ever in doubt as to whether or not to kiss a pretty girl,
 give her the benefit of the doubt"
	-Thomas Carlyle

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
Re: [Openais] detecting cpg joiners
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote:
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, It will simply just be there again
> with no indication that anything happened.
>
> At least this is true for the raw corosync/openais membership data,
> perhaps CPG can infer this some other way.

Cpg should not let a node go away and come back without notice.  In practice
I'd expect back to back confchg's: one showing it leave and another showing it
join.  As Chrissie mentioned earlier, cpg shouldn't show the same node both
leaving and joining in a single confchg.  In theory I think it would be
legitimate.

Consider a couple examples.
m: member list, j: joined list, l: left list

1. nodes A and B join at once
   A gets confchg: m=A,B j=A,B l=
   B gets confchg: m=A,B j=A,B l=

2. node C joins
   A gets confchg: m=A,B,C j=C l=
   B gets confchg: m=A,B,C j=C l=
   C gets confchg: m=A,B,C j=C l=

3. node C leaves and quickly rejoins in a single confchg
   A gets confchg: m=A,B,C j=C l=C
   B gets confchg: m=A,B,C j=C l=C
   C gets confchg: m=A,B,C j=C l=C

4. node D joins and quickly leaves (or fails) in a single confchg
   A gets confchg: m=A,B,C j=D l=D
   B gets confchg: m=A,B,C j=D l=D
   C gets confchg: m=A,B,C j=D l=D
   D gets confchg: m=A,B,C j=D l=D ?*

* if D does a quick join+leave it may expect to see this confchg showing it in
the joined list, the left list, and not in the member list.

Again, the examples in 3 and 4 are, I think, legitimate in theory.  In
practice it sounds like they won't occur.

If a quick leave+join is guaranteed to be visible through cpg, then it must be
possible to observe at the lower level from raw corosync data.

Dave
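
The four examples can be told apart mechanically from the three lists. The sketch below is an illustration only: `classify` is a hypothetical helper, nodes are plain strings, and cases 3 and 4 are the theoretical ones that, per the message, may never occur in practice. For a node that is in both the joined and the left list, the member list is what separates "left then rejoined" from "joined then left".

```python
def classify(members, joined, left):
    """Classify every node named in the joined/left lists of one confchg."""
    result = {}
    for node in sorted(set(joined) | set(left)):
        if node in joined and node in left:
            # In both lists: the member list tells the two orders apart.
            result[node] = "rejoined" if node in members else "joined-then-left"
        elif node in joined:
            result[node] = "joined"
        else:
            result[node] = "left"
    return result


case1 = classify(["A", "B"], ["A", "B"], [])      # A and B join at once
case2 = classify(["A", "B", "C"], ["C"], [])      # C joins
case3 = classify(["A", "B", "C"], ["C"], ["C"])   # C leaves and rejoins
case4 = classify(["A", "B", "C"], ["D"], ["D"])   # D joins and leaves
for case in (case1, case2, case3, case4):
    print(case)
```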
Re: [Openais] detecting cpg joiners
Robert Wipfel wrote:
> On 4/9/2009 at 5:50 AM, in message
> <26ef5e70904090450s40e92dcfgea0fc34826360...@mail.gmail.com>, Andrew Beekhof
> wrote:
>> On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
>>> Joel Becker wrote:
>>>> Steve, Dave, etc,
>>>> Someone told me a while back that a node joining a cpg group
>>>> would be by its lonesome in the join message.  That is, when the node
>>>> gets its first confchg, it will be the only node in the list of joins.
>>>> I've been using this to detect the first joiner of the group ("I joined,
>>>> and the member count is 1").
>>>> Dave's since told me that this assumption is not valid (if it
>>>> ever was).  So two or more nodes can join in parallel, and each can see
>>>> more than one node in the list of joins for its first confchg.  I'm now
>>>> trying to figure out an algorithm for "first joiner".  I have a couple
>>>> of questions:
>>>>
>>>> 1) If I see member_count == join_count, does that mean every member has
>>>> just joined, and all the members are receiving the same join message?
>>>>
>>>> 2) If member_count == join_count, can leave_count be non-zero?  If it
>>>> is, am I guaranteed that we're looking at "all old members left, all new
>>>> members joined"?
>>>>
>>>> If these both are true, I can simply isolate a "first joiner" by
>>>> checking member_count == join_count and selecting the lowest node
>>>> number.
>>>
>>> I don't think you can detect a first-joiner using CPG. cman does it by
>>> reading the totem confchg messages. It is quite possible for two nodes
>>> to join at the same time ... during the same SYNC phase so you certainly
>>> can't rely on that.
>>>
>>> 1) If member_count == join_count, then it's a safe bet that they are all
>>> new nodes, and yes, it is true that all nodes should see the same
>>> confchg messages
>>>
>>> 2) if join_count > 0 then leave_count will always be zero. That's a
>>> consequence of how CPG sends its messages really, join and leave
>>> messages are always separate. Don't rely on this behaviour though!
>>> Although I can't see any reason to change it, I'd rather not have it
>>> burned into the de facto specification.
>>
>> For added fun, a node that restarts quickly enough (think a VM) won't
>> even appear to have left (or rejoined) the cluster.
>> At the next totem confchg event, It will simply just be there again
>> with no indication that anything happened.
>>
>> At least this is true for the raw corosync/openais membership data,
>> perhaps CPG can infer this some other way.
>
> When a new node joins the group does it also create the group?
> e.g. http://www.opengroup.org/RI/technologies/cords/gipc.pdf
> has an epoch number with each join/leave message, the group is
> created by whoever joined in epoch 0.

That would work but it would also break the wire-protocol AND the API!

--

Chrissie
Re: [Openais] detecting cpg joiners
>>> On 4/9/2009 at 5:50 AM, in message
<26ef5e70904090450s40e92dcfgea0fc34826360...@mail.gmail.com>, Andrew Beekhof
wrote:
> On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
>> Joel Becker wrote:
>>> Steve, Dave, etc,
>>> Someone told me a while back that a node joining a cpg group
>>> would be by its lonesome in the join message.  That is, when the node
>>> gets its first confchg, it will be the only node in the list of joins.
>>> I've been using this to detect the first joiner of the group ("I joined,
>>> and the member count is 1").
>>> Dave's since told me that this assumption is not valid (if it
>>> ever was).  So two or more nodes can join in parallel, and each can see
>>> more than one node in the list of joins for its first confchg.  I'm now
>>> trying to figure out an algorithm for "first joiner".  I have a couple
>>> of questions:
>>>
>>> 1) If I see member_count == join_count, does that mean every member has
>>> just joined, and all the members are receiving the same join message?
>>>
>>> 2) If member_count == join_count, can leave_count be non-zero?  If it
>>> is, am I guaranteed that we're looking at "all old members left, all new
>>> members joined"?
>>>
>>> If these both are true, I can simply isolate a "first joiner" by
>>> checking member_count == join_count and selecting the lowest node
>>> number.
>>
>> I don't think you can detect a first-joiner using CPG. cman does it by
>> reading the totem confchg messages. It is quite possible for two nodes
>> to join at the same time ... during the same SYNC phase so you certainly
>> can't rely on that.
>>
>> 1) If member_count == join_count, then it's a safe bet that they are all
>> new nodes, and yes, it is true that all nodes should see the same
>> confchg messages
>>
>> 2) if join_count > 0 then leave_count will always be zero. That's a
>> consequence of how CPG sends its messages really, join and leave
>> messages are always separate. Don't rely on this behaviour though!
>> Although I can't see any reason to change it, I'd rather not have it
>> burned into the de facto specification.
>
> For added fun, a node that restarts quickly enough (think a VM) won't
> even appear to have left (or rejoined) the cluster.
> At the next totem confchg event, It will simply just be there again
> with no indication that anything happened.
>
> At least this is true for the raw corosync/openais membership data,
> perhaps CPG can infer this some other way.

When a new node joins the group does it also create the group?
e.g. http://www.opengroup.org/RI/technologies/cords/gipc.pdf
has an epoch number with each join/leave message, the group is
created by whoever joined in epoch 0.

Hth,
Robert
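
The epoch idea can be pictured with a few lines of code. This is a sketch of the concept as stated in the message, not of the linked paper's design and not of anything CPG provides. The `GroupEvent` record, its field names and `group_creators` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupEvent:
    epoch: int    # hypothetical counter carried by every join/leave message
    kind: str     # "join" or "leave"
    nodeid: int


def group_creators(events):
    """Return the node ids that joined in epoch 0, i.e. created the group."""
    return sorted(e.nodeid for e in events if e.kind == "join" and e.epoch == 0)


# Nodes 2 and 3 join in parallel in epoch 0; node 4 arrives later.
events = [
    GroupEvent(0, "join", 2),
    GroupEvent(0, "join", 3),
    GroupEvent(1, "join", 4),
    GroupEvent(2, "leave", 3),
]
print(group_creators(events))
```

With an epoch attached to each message, a parallel join is no longer ambiguous: everyone who joined in epoch 0 is a creator, and picking one of them (for example the lowest node id) is a local decision.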
Re: [Openais] detecting cpg joiners
On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield wrote:
> Joel Becker wrote:
>> Steve, Dave, etc,
>> Someone told me a while back that a node joining a cpg group
>> would be by its lonesome in the join message.  That is, when the node
>> gets its first confchg, it will be the only node in the list of joins.
>> I've been using this to detect the first joiner of the group ("I joined,
>> and the member count is 1").
>> Dave's since told me that this assumption is not valid (if it
>> ever was).  So two or more nodes can join in parallel, and each can see
>> more than one node in the list of joins for its first confchg.  I'm now
>> trying to figure out an algorithm for "first joiner".  I have a couple
>> of questions:
>>
>> 1) If I see member_count == join_count, does that mean every member has
>> just joined, and all the members are receiving the same join message?
>>
>> 2) If member_count == join_count, can leave_count be non-zero?  If it
>> is, am I guaranteed that we're looking at "all old members left, all new
>> members joined"?
>>
>> If these both are true, I can simply isolate a "first joiner" by
>> checking member_count == join_count and selecting the lowest node
>> number.
>
> I don't think you can detect a first-joiner using CPG. cman does it by
> reading the totem confchg messages. It is quite possible for two nodes
> to join at the same time ... during the same SYNC phase so you certainly
> can't rely on that.
>
> 1) If member_count == join_count, then it's a safe bet that they are all
> new nodes, and yes, it is true that all nodes should see the same
> confchg messages
>
> 2) if join_count > 0 then leave_count will always be zero. That's a
> consequence of how CPG sends its messages really, join and leave
> messages are always separate. Don't rely on this behaviour though!
> Although I can't see any reason to change it, I'd rather not have it
> burned into the de facto specification.

For added fun, a node that restarts quickly enough (think a VM) won't
even appear to have left (or rejoined) the cluster.
At the next totem confchg event, It will simply just be there again
with no indication that anything happened.

At least this is true for the raw corosync/openais membership data,
perhaps CPG can infer this some other way.
Re: [Openais] detecting cpg joiners
Joel Becker wrote:
> Steve, Dave, etc,
> Someone told me a while back that a node joining a cpg group
> would be by its lonesome in the join message.  That is, when the node
> gets its first confchg, it will be the only node in the list of joins.
> I've been using this to detect the first joiner of the group ("I joined,
> and the member count is 1").
> Dave's since told me that this assumption is not valid (if it
> ever was).  So two or more nodes can join in parallel, and each can see
> more than one node in the list of joins for its first confchg.  I'm now
> trying to figure out an algorithm for "first joiner".  I have a couple
> of questions:
>
> 1) If I see member_count == join_count, does that mean every member has
> just joined, and all the members are receiving the same join message?
>
> 2) If member_count == join_count, can leave_count be non-zero?  If it
> is, am I guaranteed that we're looking at "all old members left, all new
> members joined"?
>
> If these both are true, I can simply isolate a "first joiner" by
> checking member_count == join_count and selecting the lowest node
> number.

I don't think you can detect a first-joiner using CPG. cman does it by
reading the totem confchg messages. It is quite possible for two nodes
to join at the same time ... during the same SYNC phase so you certainly
can't rely on that.

1) If member_count == join_count, then it's a safe bet that they are all
new nodes, and yes, it is true that all nodes should see the same
confchg messages

2) if join_count > 0 then leave_count will always be zero. That's a
consequence of how CPG sends its messages really, join and leave
messages are always separate. Don't rely on this behaviour though!
Although I can't see any reason to change it, I'd rather not have it
burned into the de facto specification.

--

Chrissie
[Openais] detecting cpg joiners
Steve, Dave, etc,

Someone told me a while back that a node joining a cpg group
would be by its lonesome in the join message.  That is, when the node
gets its first confchg, it will be the only node in the list of joins.
I've been using this to detect the first joiner of the group ("I joined,
and the member count is 1").

Dave's since told me that this assumption is not valid (if it
ever was).  So two or more nodes can join in parallel, and each can see
more than one node in the list of joins for its first confchg.  I'm now
trying to figure out an algorithm for "first joiner".  I have a couple
of questions:

1) If I see member_count == join_count, does that mean every member has
just joined, and all the members are receiving the same join message?

2) If member_count == join_count, can leave_count be non-zero?  If it
is, am I guaranteed that we're looking at "all old members left, all new
members joined"?

If these both are true, I can simply isolate a "first joiner" by
checking member_count == join_count and selecting the lowest node
number.

Joel

--

Life's Little Instruction Book #444

	"Never underestimate the power of a kind word or deed."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
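
The rule proposed above can be written down as a small predicate. This is only a sketch of the proposal as worded in the message, with a hypothetical function name and plain integer node ids. It assumes both questions are answered "yes", which the message itself leaves open.

```python
def is_first_joiner(my_nodeid, member_nodeids, joined_nodeids):
    """Apply the proposed rule to the first confchg a node receives.

    The caller is assumed to be in member_nodeids, so the list is non-empty.
    """
    if len(member_nodeids) != len(joined_nodeids):
        # Some members were already in the group before this confchg.
        return False
    # Everyone just joined: break the tie with the lowest node number.
    return my_nodeid == min(member_nodeids)


# Nodes 2 and 3 join in parallel and receive the same first confchg.
print(is_first_joiner(2, [2, 3], [2, 3]))
print(is_first_joiner(3, [2, 3], [2, 3]))
# Node 4 joins later, when 2 and 3 are already members.
print(is_first_joiner(4, [2, 3, 4], [4]))
```

Because every node evaluates the same lists, all of them reach the same answer without exchanging any further messages, which is the property the proposal relies on.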