On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote:
> On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote:
> > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote:
> > > 0. configure token timeout to some long time that is longer than all the
> > >    following steps take
> > >
> > > 1. cluster members are nodeid's: 1,2,3,4
> > >
> > > 2. cpg foo has the following members:
> > >    nodeid 1, pid 10
> > >    nodeid 2, pid 20
> > >    nodeid 3, pid 30
> > >    nodeid 4, pid 40
> > >
> > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40
> > >    (optionally reboot this node now)
> > >
> > > 4. nodeid 4: ifup eth0, start corosync
> > >
> > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg
> > >    showing that 4:40 is not a member
> > >
> > > 6. nodeid 4: start process pid 41 that joins cpg foo
> > >
> > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg
> > >    showing that 4:41 is a member
> > >
> > > (Steps 6 and 7 should work the same even if the process started in step 6
> > > has pid 40 instead of pid 41.)
>
> > 100% agree that is how it should work.  If it doesn't, we will fix it.
> > The only thing that may be strange is if pid in step 6 is the same pid
> > as 40.  Are you certain the test case which fails has a differing pid at
> > step 6?
>
> If you fix step 5, then I suspect steps 6,7 will "just work".  After the test
> failed at step 5 I didn't pay too much attention to 6,7... but I'm sure that
> the pid in step 6 was different (I didn't reboot the node).

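For reference, the expected membership changes in steps 2 through 7 above can be written down as a small model. This is only a toy sketch in Python of the behaviour described in the steps, not corosync code: `CpgModel`, `join` and `node_restarted` are hypothetical names invented for illustration, and the tuple returned by each call (members, joined, left) is an assumed stand-in for what a confchg would report.

```python
from dataclasses import dataclass, field


@dataclass
class CpgModel:
    """Toy model of one cpg's membership. Hypothetical; not the corosync API."""
    members: set = field(default_factory=set)

    def _confchg(self, joined, left):
        # What every remaining member would be told: (members, joined, left).
        return (sorted(self.members), sorted(joined), sorted(left))

    def join(self, nodeid, pid):
        self.members.add((nodeid, pid))
        return self._confchg([(nodeid, pid)], [])

    def node_restarted(self, nodeid):
        # Drop every process that belonged to the node's previous incarnation.
        left = [m for m in self.members if m[0] == nodeid]
        self.members.difference_update(left)
        return self._confchg([], left)


cpg = CpgModel()
for nodeid, pid in [(1, 10), (2, 20), (3, 30), (4, 40)]:  # step 2
    cpg.join(nodeid, pid)

step5 = cpg.node_restarted(4)  # steps 3-5: 4:40 must be reported as gone
step7 = cpg.join(4, 41)        # steps 6-7: 4:41 must be reported as joined

print(step5)
print(step7)
```

In this model, starting the new process with pid 40 instead of pid 41 in step 6 gives a result of the same shape, because the old 4:40 entry has already been removed in step 5.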
It's not clear what the plan was for this.  Are there any recent related
changes I should try?

Dave