> -----Original Message-----
> From: Saha Sayandeb-G19428 [mailto:[EMAIL PROTECTED] 
> Sent: 28 November 2007 18:23
> To: Hans Feldt
> Cc: [email protected]
> Subject: RE: [Users] Controller HA mechanisms
> 
> Hans,
> 
> Comments below ... 
> 
> > How does OpenSAF handle the following scenario:
> > 
> > - Controller 1 (C1) power on
> > - C1 RDE starts and decides to be active since it is alone in the 
> > cluster
> > - C1 PSR or AMF dies due to some reason
> > - Controller 2 (C2) power on
> > - C2 RDE starts and gets the role standby from RDE on C1
> > - C2 waits forever to get synced from C1
> > 
> > Some issues:
> > C1 RDE claims to be active although it is not
> > C1 does not reboot
> > C2 does not reboot when it loses contact with the active 
> > controller and is not in sync.
> > C2 cannot become active if we reboot C1
> > 
> > Comments?
> 
> [SS] I simulated this condition quite easily by simply 
> killing the ncs_scap process on the one and only active 
> controller and then running the get_ha_state command. As 
> you say, the RDE on this controller still thinks that 
> it is active, which prevents the second controller from 
> obtaining the active state. So this is a hole, as the RDE 
> has no clue that the AvD+AvM has crashed. I guess we could 
> add a role heartbeat from the AvD+AvM to the RDE to ensure 
> that the RDE is always in sync with what's going on and can 
> relinquish the active state so that the other controller 
> can become active under such a circumstance.
> But this whole scenario of having only one controller which 
> crashes and then a second one that tries to come up is 
> probably not so common. Or do you think it will be, because 
> of the way OpenSAF waits 3 minutes before rebooting payload 
> blades when AvD goes down?
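The role heartbeat suggested above could look roughly like the following. This is only a sketch of the idea, not OpenSAF code: the class name RoleWatchdog, its methods and the timeout value are all assumptions made for illustration. The RDE side records each heartbeat from the director and gives up the active role once heartbeats have been missing for longer than the timeout.

```python
import time


class RoleWatchdog:
    """Hypothetical RDE-side check; names and timeout are illustrative only."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.active = True          # this node currently claims the active role
        self.last_beat = clock()

    def heartbeat(self):
        """Called whenever the director (AvD+AvM in the thread) reports in."""
        self.last_beat = self.clock()

    def poll(self):
        """Return True while the active role is still held.

        The role is relinquished once no heartbeat has arrived for more
        than timeout_s seconds. In this sketch it is never taken back.
        """
        if self.active and self.clock() - self.last_beat > self.timeout_s:
            self.active = False
        return self.active
```

With something like this in place, the RDE on C1 would stop claiming the active role some time after the director dies, which is what C2 needs in order to take over.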

No, I just stumbled on this since we're doing a lot of power on/off of
controllers and fail-overs at the moment.

As a solution, what if nid stays alive and supervises its children? If
rde or scap dies, nid reboots the system.
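
A minimal sketch of that supervision idea, to make it concrete. It is not how nid is implemented: the function name supervise, the polling interval and the on_failure callback are assumptions made for illustration. The parent starts its children, stays alive, and reacts as soon as any one of them exits; a real supervisor would reboot the node where the callback is invoked.

```python
import subprocess
import time


def supervise(commands, on_failure, poll_s=0.1):
    """Start each command, then block until one child exits.

    commands maps a label to an argv list. on_failure(label, status) is
    called for the first child seen to exit; a real supervisor would
    trigger the reboot there. Returns the label of that child.
    """
    children = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    try:
        while True:
            for name, proc in children.items():
                status = proc.poll()        # None while the child is running
                if status is not None:
                    on_failure(name, status)
                    return name
            time.sleep(poll_s)
    finally:
        # Stop the surviving children so the sketch leaves nothing behind.
        for proc in children.values():
            if proc.poll() is None:
                proc.kill()
            proc.wait()
```

Polling keeps the sketch portable; on a POSIX system the parent could instead block in os.wait() and be woken exactly when a child dies.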

Cheers,
Hans

> 
> Sayan
> 
> > Regards,
> > Hans
> > _______________________________________________
> > Users mailing list
> > [email protected]
> > http://list.opensaf.org/maillist/listinfo/users
> > 
> 