> -----Original Message-----
> From: Saha Sayandeb-G19428 [mailto:[EMAIL PROTECTED]
> Sent: 28 November 2007 18:23
> To: Hans Feldt
> Cc: [email protected]
> Subject: RE: [Users] Controller HA mechanisms
>
> Hans,
>
> Comments below ...
>
> > How does OpenSAF handle the following scenario:
> >
> > - Controller 1 (C1) power on
> > - C1 RDE starts and decides to be active since it is alone in the
> >   cluster
> > - C1 PSR or AMF dies due to some reason
> > - Controller 2 (C2) power on
> > - C2 RDE starts and gets the role standby from RDE on C1
> > - C2 waits forever to get synced from C1
> >
> > Some issues:
> > C1 RDE claims to be active although it is not
> > C1 does not reboot
> > C2 does not reboot when it loses contact with the active controller
> > and is not in sync.
> > C2 cannot become active if we reboot C1
> >
> > Comments?
>
> [SS] I simulated this condition quite easily by simply killing the
> ncs_scap process in the one and only active controller and then
> running the get_ha_state command. As you say, the RDE in this
> controller still keeps thinking that it is active, which prevents the
> second controller from obtaining the active state. So this is a hole,
> as the RDE has no clue that the Avd+AvM has crashed. I guess we could
> add a role heart-beat from the Avd+AvM to the RDE to ensure that the
> RDE is always in sync with what's going on and can relinquish the
> active state, so that the other controller can become active under
> such a circumstance.
> But this whole scenario of having only one controller which crashes,
> and then the second one that tries to come up, is probably not so
> common. Or do you think it will be, because of the way OpenSAF waits
> 3 minutes before rebooting payload blades when AvD goes down?

No, I just stumbled on this since we're doing a lot of controller power
on/off and fail-overs at the moment.

As a solution, what if nid stays alive and supervises its children? If rde
or scap dies, nid reboots the system.

Cheers,
Hans

>
> Sayan
>
> > Regards,
> > Hans
> > _______________________________________________
> > Users mailing list
> > [email protected]
> > http://list.opensaf.org/maillist/listinfo/users
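
To make the nid supervision idea concrete, here is a rough sketch in Python. It is not OpenSAF code and not how nid works today. The names start_children, supervise and reboot_node are made up for illustration, the child commands are stand-ins for rde and scap, and the reboot is only a print statement.

```python
import subprocess
import sys
import time
from typing import Callable, Dict, List


def start_children(commands: Dict[str, List[str]]) -> Dict[str, subprocess.Popen]:
    """Spawn each supervised service and keep its process handle."""
    return {name: subprocess.Popen(argv) for name, argv in commands.items()}


def supervise(
    children: Dict[str, subprocess.Popen],
    on_failure: Callable[[str, int], None],
    poll_interval: float = 0.1,
) -> str:
    """Block until any child exits, report it, and return its name."""
    while True:
        for name, proc in children.items():
            # poll() returns None while the child is still running.
            if proc.poll() is not None:
                on_failure(name, proc.returncode)
                return name
        time.sleep(poll_interval)


def reboot_node(name: str, returncode: int) -> None:
    # Assumption: a real supervisor would trigger a node reboot here.
    # This sketch only logs the decision.
    print(f"{name} exited with status {returncode}; rebooting node")


if __name__ == "__main__":
    # Hypothetical stand-ins for the rde and scap processes.
    commands = {
        "rde": [sys.executable, "-c", "import time; time.sleep(60)"],
        "scap": [sys.executable, "-c", "import time; time.sleep(0.2)"],
    }
    children = start_children(commands)
    try:
        supervise(children, reboot_node)
    finally:
        # Clean up any stand-in child that is still running.
        for proc in children.values():
            if proc.poll() is None:
                proc.kill()
            proc.wait()
```

The point of the sketch is only that the parent stays alive and treats the exit of any child as fatal for the node, so a controller whose rde or scap has died cannot keep claiming the active role.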
