Thanks Praveen,

-----Original Message-----
From: praveen malviya [mailto:[email protected]]
Sent: Monday, August 10, 2015 11:00 AM
To: Girish Nagaraj; [email protected]
Subject: Re: [users] Question on SU Failover in opensaf version 4.6

On 07-Aug-15 4:20 PM, Girish Nagaraj wrote:
> Hi,
>
> We have a few applications (all under one single service unit) modeled
> as opensaf components.
>
> In the 2N model, the failure recovery action is configured as suFailover.
>
> Behavior in 4.3.1
>
> In the 2N model, when one of the components fails, the failed component
> is restarted and assigned STANDBY HA state, and all other components'
> SIs are assigned STANDBY HA state.
>
> Behavior in 4.6.0
>
> In the 2N model, when one of the components fails, the failed component
> is not restarted and all other components are terminated abruptly.
>
> Why is there this difference in behavior? Has anything changed in 4.6?
> We are using the same model files in both.

From the 4.4 release onwards, the sufailover feature is supported. In any
application configuration in which saAmfSUFailover or saAmfSutDefSUFailover
is enabled, sufailover is performed for the faulted SU instead of
comp-failover. I think that in the present case one of these flags was
enabled in the original configuration. To observe the old behavior,
dynamically disable the saAmfSUFailover or saAmfSutDefSUFailover flag using
the immcfg command:

immcfg -a saAmfSUFailover=0 <name of su>

Thanks
Praveen

> // 4.3.1 log
>
> Aug 7 16:08:30 localhost abrt[10373]: Saved core dump of pid 10080 (/usr/local/sbin/chkptapp) to /var/spool/abrt/ccpp-2015-08-07-16:08:30-10080 (3244032 bytes)
> Aug 7 16:08:30 localhost abrtd: Directory 'ccpp-2015-08-07-16:08:30-10080' creation detected
> Aug 7 16:08:31 localhost avahi-daemon[523]: Withdrawing workstation service for svlan0.1.
> Aug 7 16:08:31 localhost osafamfnd[9852]: NO 'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' faulted due to 'avaDown' : Recovery is 'suFailover'
> Aug 7 16:08:31 localhost osafamfnd[9852]: NO 'safSu=SU1,safSg=app2N,safApp=app' Presence State INSTANTIATED => TERMINATING
> Aug 7 16:08:31 localhost appn_amf_script: Stopping chkptapp
> Aug 7 16:08:31 localhost osafamfnd[9852]: NO Assigning 'safSi=appSI,safApp=app' QUIESCED to 'safSu=SU1,safSg=app2N,safApp=app'
>
> // 4.6 log
>
> Aug 7 06:56:52 ha-node-1 abrt[10462]: Saved core dump of pid 10191 (/usr/local/sbin/chkptapp) to /var/spool/abrt/ccpp-2015-08-07-06:56:52-10191 (3309568 bytes)
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO saAmfSUFailover is true for 'safSu=SU1,safSg=zebos2N,safApp=app'
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO SU failover probation timer started (timeout: 1200000000000 ns)
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO Performing failover of 'safSu=SU1,safSg=app2N,safApp=app' (SU failover count: 1)
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO 'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' recovery action escalated from 'componentFailover' to 'suFailover'
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO 'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' faulted due to 'avaDown' : Recovery is 'suFailover'
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO Terminating components of 'safSu=SU1,safSg=app2N,safApp=app'(abruptly & unordered)
> Aug 7 06:56:52 ha-node-1 osafamfnd[9897]: NO 'safSu=SU1,safSg=app2N,safApp=app' Presence State INSTANTIATED => TERMINATING
>
> Regards,
>
> Girish

--
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
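
For anyone applying the workaround from Praveen's reply, here is a minimal sketch of the commands with a concrete SU name filled in. The DN is taken from the logs in this thread and is only an example. The helper function names are made up for illustration, and the immlist check is my assumption about how to inspect the current value; it is not part of the original reply. The script only prints the commands so they can be reviewed before being run on a controller node.

```shell
#!/bin/sh
# Sketch only: prints the IMM commands, does not execute them.
# SU_DN is copied from the logs in this thread; substitute your own SU.
SU_DN='safSu=SU1,safSg=app2N,safApp=app'

# Hypothetical helper: command to display the current flag on the SU
# (assumes immlist's -a option for selecting a single attribute).
show_cmd() {
    printf 'immlist -a saAmfSUFailover %s\n' "$1"
}

# Hypothetical helper: the command from the reply that disables
# SU failover on the given SU at runtime.
disable_cmd() {
    printf 'immcfg -a saAmfSUFailover=0 %s\n' "$1"
}

show_cmd "$SU_DN"
disable_cmd "$SU_DN"
```

If the flag comes from the SU type rather than the SU itself, the same immcfg form would presumably be applied to saAmfSutDefSUFailover on the SU type object instead.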
