Can someone please check this?


*From:* Girish Nagaraj [mailto:[email protected]]
*Sent:* Friday, August 07, 2015 4:21 PM
*To:* '[email protected]'
*Subject:* Question on SU Failover in opensaf version 4.6



Hi,



We have a few applications (all under a single service unit) modeled as
OpenSAF components.



In the 2N model, the failure recovery action is configured as suFailover.



Behavior in 4.3.1

  In the 2N model, when one of the components fails, the failed component is
restarted and assigned the STANDBY HA state, and the SIs of all other
components are assigned the STANDBY HA state.



Behavior in 4.6.0

  In the 2N model, when one of the components fails, the failed component is
not restarted and all other components are terminated abruptly.



Why is there this difference in behavior? Has anything changed in 4.6? We are
using the same model files in both.



// 4.3.1 log

Aug  7 16:08:30 localhost abrt[10373]: Saved core dump of pid 10080
(/usr/local/sbin/chkptapp) to
/var/spool/abrt/ccpp-2015-08-07-16:08:30-10080 (3244032 bytes)

Aug  7 16:08:30 localhost abrtd: Directory 'ccpp-2015-08-07-16:08:30-10080'
creation detected

Aug  7 16:08:31 localhost avahi-daemon[523]: Withdrawing workstation
service for svlan0.1.

Aug  7 16:08:31 localhost osafamfnd[9852]: NO
'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' faulted due to
'avaDown' : Recovery is 'suFailover'

Aug  7 16:08:31 localhost osafamfnd[9852]: NO
'safSu=SU1,safSg=app2N,safApp=app' Presence State INSTANTIATED =>
TERMINATING

Aug  7 16:08:31 localhost appn_amf_script: Stopping chkptapp

Aug  7 16:08:31 localhost osafamfnd[9852]: NO Assigning
'safSi=appSI,safApp=app' QUIESCED to 'safSu=SU1,safSg=app2N,safApp=app'



// 4.6 log

Aug  7 06:56:52 ha-node-1 abrt[10462]: Saved core dump of pid 10191
(/usr/local/sbin/chkptapp) to
/var/spool/abrt/ccpp-2015-08-07-06:56:52-10191 (3309568 bytes)

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO saAmfSUFailover is true for
'safSu=SU1,safSg=zebos2N,safApp=app'

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO SU failover probation timer
started (timeout: 1200000000000 ns)

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO Performing failover of
'safSu=SU1,safSg=app2N,safApp=app' (SU failover count: 1)

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO
'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' recovery action
escalated from 'componentFailover' to 'suFailover'

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO
'safComp=chkptapp,safSu=SU1,safSg=app2N,safApp=app' faulted due to
'avaDown' : Recovery is 'suFailover'

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO Terminating components of
'safSu=SU1,safSg=app2N,safApp=app'(abruptly & unordered)

Aug  7 06:56:52 ha-node-1 osafamfnd[9897]: NO
'safSu=SU1,safSg=app2N,safApp=app' Presence State INSTANTIATED =>
TERMINATING
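
To line the two traces up side by side, a minimal Python sketch along these
lines may help. It assumes the syslog excerpts are pasted as plain text with
records wrapped across lines by the mail client, as above. The names
join_wrapped and amfnd_messages are hypothetical helpers written for this
illustration; they are not part of OpenSAF.

```python
import re

# A syslog record starts with a timestamp such as "Aug  7 16:08:31".
RECORD_START = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")
# Matches "osafamfnd[<pid>]: <severity> <message>" inside a joined record.
AMFND = re.compile(r"osafamfnd\[\d+\]: (\w+) (.*)")


def join_wrapped(lines):
    """Re-join records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank separator lines carry no information
        if RECORD_START.match(line) or not records:
            records.append(line)         # a new record begins here
        else:
            records[-1] += " " + line    # continuation of the previous record
    return records


def amfnd_messages(text):
    """Return the message part of every osafamfnd record, in order."""
    messages = []
    for record in join_wrapped(text.splitlines()):
        match = AMFND.search(record)
        if match:
            messages.append(match.group(2))
    return messages
```

Calling amfnd_messages() on each excerpt gives two ordered lists of AMF node
director messages, which makes it easier to see where the 4.6 sequence
(escalation, abrupt termination) departs from the 4.3.1 one.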



Regards,

Girish

_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users