Hi David,

sc-2 (primary PBE):

Mar 19 16:43:52 controller-sc2 osafimmnd[6014]: WA update of PERSISTENT 
runtime attributes in object 'safSu=SU1,safSg=app4,safApp=app4' 
REVERTED. PBE rc:18

Mar 19 16:43:54 controller-sc2 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:100000b1b/4294970139

Mar 19 16:44:04 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
100000b1b/4294970139 towards slave PBE returned: '5' from Immsv

Mar 19 16:44:04 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
PRTA update Ccb:100000b1b/4294970139 towards PBE-B

Mar 19 16:44:04 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
update (ccbId:100000b1b)

sc-1(slave PBE):

Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at
PBE slave ccbId:100000b1b/4294970139 numOps:1

Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare
ccb:100000b1b/4294970139 received at Pbe slave when Prior Ccb 4294970138
still processing


Here the slave PBE replies with TRY_AGAIN to the primary PBE, but this reply 
is not received at the primary PBE. The result is a TIMEOUT (5), which is 
converted into SA_AIS_ERR_NO_RESOURCES (18).

/Neel.


>
>
> -------- Forwarded Message --------
> Subject:     RE: [users] opensaf internal admin state not propagated
> Date:     Mon, 27 Mar 2017 18:49:25 +0000
> From:     David Hoyt <[email protected]>
> To:     praveen malviya <[email protected]>, 
> [email protected] <[email protected]>
>
>
>
> Hi Praveen,
>
> First off, thank-you for your reply.
>
> Secondly, the lab was recovered.
>
> I issued the following command:
>
> [root@payload-4 ~]# immadm -o 1 safSu=SU1,safSg=app4,safApp=app4
>
> [root@payload-4 ~]#
>
> [root@payload-4 ~]#  date;  amf-state su | grep -A 5 app4
>
> Fri Mar 24 11:41:36 EDT 2017
>
> safSu=SU1,safSg=app4,safApp=app4
>
>          saAmfSUAdminState=*UNLOCKED*(1)
>
>          saAmfSUOperState=ENABLED(1)
>
>          saAmfSUPresenceState=INSTANTIATED(3)
>
>          saAmfSUReadinessState=IN-SERVICE(2)
>
> [root@payload-4 ~]#
>
> [root@payload-4 ~]#  date; amf-adm  lock safSu=SU1,safSg=app4,safApp=app4
>
> Fri Mar 24 11:43:46 EDT 2017
>
> [root@payload-4 ~]#  date;  amf-state su | grep -A 5 app4
>
> Fri Mar 24 11:44:56 EDT 2017
>
> safSu=SU1,safSg=app4,safApp=app4
>
>          saAmfSUAdminState=*LOCKED*(2)
>
>          saAmfSUOperState=ENABLED(1)
>
>          saAmfSUPresenceState=INSTANTIATED(3)
>
>          saAmfSUReadinessState=OUT-OF-SERVICE(1)
>
> [root@payload-4 ~]#
>
> Thirdly: we found we had a network issue between PL-4 and SC-2, which 
> likely caused our problem(s).
>
> Having said that, when amf and imm end up with a different view of the 
> opensaf SU state, how can you recover from this?
>
> Is there some audit that runs periodically to ensure the state info is 
> in-sync between the various opensaf processes?
>
> Below is a summary after reviewing the logs.
>
> SU: app4-SU-1
>
> Node: PL-4
>
> -As stated, PL-4 is able to ping SC-1 but NOT SC-2
>
> -all was fine with app4-SU-1 for several days. It was running 
> unlocked-enabled-in-service-active
>
> -On March 18, PL-4 went for a reboot
>
> -Upon recovery, opensaf on PL-4 failed to start – it appears it still 
> could not communicate with SC-2:
>
> Mar 18 22:30:49 payload-4 opensafd: Starting OpenSAF Services (Using TCP)
>
> Mar 18 22:30:49 payload-4 osafdtmd[1570]: Started
>
> Mar 18 22:30:49 payload-4 osafimmnd[1588]: Started
>
> Mar 18 22:30:49 payload-4 osafimmnd[1588]: NO Persistent Back-End 
> capability configured, Pbe file:imm.db (suffix may get added)
>
> Mar 18 22:30:49 payload-4 osafdtmd[1570]: NO Established contact with 
> ' controller-sc1'
>
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Timed-out for response 
> from IMMND
>
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER
>
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Going for recovery
>
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Trying To RESPAWN 
> /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
>
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Sending SIGKILL to IMMND, 
> pid=1576
>
> Mar 18 22:31:49 payload-4 osafimmnd[1588]: exiting for shutdown
>
> -On March 19, with opensaf on PL-4 still failing to start, the user 
> issued a lock of app4-SU-1 from SC-2. Here are the logs from SC-2:
>
> Mar 19 16:43:31 controller-sc2 Running cmd: 'amf-adm -t 900 lock 
> safSu=SU1,safSg=app4,safApp=app4'
>
> Mar 19 16:43:41 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1a/4294970138
>
> Mar 19 16:43:51 controller-sc2 osafimmnd[6014]: WA Failed to retrieve 
> search continuation, client died ?
>
> Mar 19 16:43:51 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1a/4294970138 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:43:51 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1a/4294970138 towards PBE-B
>
> Mar 19 16:43:51 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1a)
>
> Mar 19 16:43:52 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:43:54 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1b/4294970139
>
> Mar 19 16:44:04 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1b/4294970139 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:44:04 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1b/4294970139 towards PBE-B
>
> Mar 19 16:44:04 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1b)
>
> Mar 19 16:44:05 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:44:06 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1c/4294970140
>
> Mar 19 16:44:10 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 16:44:16 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1c/4294970140 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:44:16 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1c/4294970140 towards PBE-B
>
> Mar 19 16:44:16 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1c)
>
> Mar 19 16:44:17 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:44:19 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1d/4294970141
>
> Mar 19 16:44:29 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1d/4294970141 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:44:29 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1d/4294970141 towards PBE-B
>
> Mar 19 16:44:29 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1d)
>
> Mar 19 16:44:30 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:44:32 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1e/4294970142
>
> Mar 19 16:44:42 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1e/4294970142 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:44:42 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1e/4294970142 towards PBE-B
>
> Mar 19 16:44:42 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1e)
>
> Mar 19 16:44:42 controller-sc2 osafimmnd[6014]: WA Failed to retrieve 
> search continuation, client died ?
>
> Mar 19 16:44:42 controller-sc2 osafimmnd[6014]: WA Failed to retrieve 
> search continuation, client died ?
>
> Mar 19 16:44:43 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:44:44 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b1f/4294970143
>
> Mar 19 16:44:54 controller-sc2 osafimmnd[6014]: WA Failed to retrieve 
> search continuation, client died ?
>
> Mar 19 16:44:54 controller-sc2 osafimmnd[6014]: WA Failed to retrieve 
> search continuation, client died ?
>
> Mar 19 16:44:54 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b1f/4294970143 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:44:54 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b1f/4294970143 towards PBE-B
>
> Mar 19 16:44:54 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1f)
>
> Mar 19 16:44:55 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:44:57 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b20/4294970144
>
> Mar 19 16:45:07 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b20/4294970144 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:45:07 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b20/4294970144 towards PBE-B
>
> Mar 19 16:45:07 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b20)
>
> Mar 19 16:45:08 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:45:09 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b21/4294970145
>
> Mar 19 16:45:13 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 16:45:19 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b21/4294970145 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:45:19 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b21/4294970145 towards PBE-B
>
> Mar 19 16:45:19 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b21)
>
> Mar 19 16:45:20 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:45:22 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b22/4294970146
>
> Mar 19 16:45:31 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:32 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b22/4294970146 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:45:32 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b22/4294970146 towards PBE-B
>
> Mar 19 16:45:32 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b22)
>
> Mar 19 16:45:33 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:45:34 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:34 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b23/4294970147
>
> Mar 19 16:45:37 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:40 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:43 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:44 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b23/4294970147 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:45:44 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b23/4294970147 towards PBE-B
>
> Mar 19 16:45:44 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b23)
>
> Mar 19 16:45:45 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:45:46 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:47 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b24/4294970148
>
> Mar 19 16:45:49 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:52 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:55 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:57 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b24/4294970148 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:45:57 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b24/4294970148 towards PBE-B
>
> Mar 19 16:45:57 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b24)
>
> Mar 19 16:45:58 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:45:58 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:45:59 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b25/4294970149
>
> Mar 19 16:46:01 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:04 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:07 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:09 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b25/4294970149 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:46:09 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b25/4294970149 towards PBE-B
>
> Mar 19 16:46:09 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b25)
>
> Mar 19 16:46:10 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:10 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:11 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b26/4294970150
>
> Mar 19 16:46:13 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:16 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:19 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:21 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b26/4294970150 towards slave PBE returned: '5' from Immsv
>
> Mar 19 16:46:21 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b26/4294970150 towards PBE-B
>
> Mar 19 16:46:21 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b26)
>
> Mar 19 16:46:22 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:22 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:24 controller-sc2 osafimmpbed: IN Starting distributed 
> PBE commit for PRTA update Ccb:100000b27/4294970151
>
> Mar 19 16:46:25 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for 
> AdminOp was lost, probably due to timeout
>
> Mar 19 16:46:26 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:26 controller-sc2 osafimmnd[6014]: ER PBE PRTAttrs Update 
> continuation missing! invoc:2842
>
> Mar 19 16:46:26 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:27 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:27 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:28 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:28 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv 
> (6) replied with transient error on prepare for ccb:100000b27/4294970151
>
> Mar 19 16:46:28 controller-sc2 osafimmpbed: WA Start prepare for ccb: 
> 100000b27/4294970151 towards slave PBE returned: '6' from standby PBE
>
> Mar 19 16:46:28 controller-sc2 osafimmpbed: WA PBE-A failed to prepare 
> PRTA update Ccb:100000b27/4294970151 towards PBE-B
>
> Mar 19 16:46:28 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b27)
>
> Mar 19 16:46:28 controller-sc2 osafimmnd[6014]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:28 controller-sc2 osafamfd[6091]: ER exec: update FAILED
>
> -Meanwhile, SC-1 is generating the following logs:
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1a
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1a/4294970138 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1b/4294970139 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b1b/4294970139 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1c/4294970140 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b1c/4294970140 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1d/4294970141 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b1d/4294970141 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1e/4294970142 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b1e/4294970142 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b1f/4294970143 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b1f/4294970143 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b20/4294970144 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b20/4294970144 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b21/4294970145 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b21/4294970145 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b22/4294970146 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b22/4294970146 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223131, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b23/4294970147 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223133, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223135, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b23/4294970147 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b24/4294970148 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b24/4294970148 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223137, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223139, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223141, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b25/4294970149 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b25/4294970149 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223143, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve 
> search continuation for 565213407223145, client died ?
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b26/4294970150 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b26/4294970150 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b27/4294970151 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:26 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:26 controller-sc1 osafimmpbed: NO Prepare 
> ccb:100000b27/4294970151 received at Pbe slave when Prior Ccb 
> 4294970138 still processing
>
> Mar 19 16:46:26 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:26 controller-sc1 osafimmnd[6057]: ER PBE PRTAttrs Update 
> continuation missing! invoc:2842
>
> Mar 19 16:46:26 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:26 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:27 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:27 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:27 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:27 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:28 controller-sc1 osafimmpbed: IN ccb-prepare received at 
> PBE slave ccbId:100000b27/4294970151 numOps:1
>
> Mar 19 16:46:28 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:28 controller-sc1 osafimmnd[6057]: WA update of 
> PERSISTENT runtime attributes in object 
> 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
>
> Mar 19 16:46:28 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:29 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:29 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:30 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:30 controller-sc1 osafimmpbed: IN PBE slave waiting for 
> prepare from primary on PRTA update ccb:100000b1b
>
> Mar 19 16:46:31 controller-sc1 osafimmpbed: NO Slave PBE time-out in 
> waiting on prepare for PRTA update ccb:100000b1b 
> dn:safSu=SU1,safSg=app4,safApp=app4
>
> Mar 19 16:46:31 controller-sc1 osafimmpbed: NO 2PBE Error (18) in PRTA 
> update (ccbId:100000b1b)
>
> -A short time later, I see several of the following logs being 
> generated from SC-2:
>
> Mar 19 17:15:19 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 17:16:22 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 17:17:25 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 17:18:28 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :Connection timed out
>
> Mar 19 17:18:46 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 17:18:49 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 17:18:52 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 17:18:56 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Mar 19 17:18:59 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed 
> (connect()) err :No route to host
>
> Thoughts?
>
>
> Regards,
>
> /David/
>
> *From:*praveen malviya [mailto:[email protected]]
> *Sent:* Monday, March 27, 2017 1:02 AM
> *To:* David Hoyt <[email protected]>; 
> [email protected]
> *Subject:* Re: [users] opensaf internal admin state not propagated
>
> ------------------------------------------------------------------------
>
> NOTICE: This email was received from an EXTERNAL sender
>
> ------------------------------------------------------------------------
>
>
> Hi David,
>
> What kind of operations were performed before trying to lock the SU?
> There is a command to see internal states of AMF entities in AMFD:
> "immadm -a safAmfService -o 99 safAmfService"
>
> This command dumps internal states of AMFD with all its entities in a
> file in /tmp/ directory. Please check the file name in syslog on active
> controller.
> More details of this command can be found in AMF PR doc.
>
>
>
> Thanks,
> Praveen
>
> On 27-Mar-17 7:18 AM, David Hoyt wrote:
>  > Hi all,
>  >
>  > I've run into this issue a few times now where I attempt to lock an 
> SU, but opensaf rejects the request stating that it's already locked.
>  > However, when I display the SU state, it shows it as unlocked (see 
> commands & output below).
>  >
>  > It appears internally, that the opensaf state of an SU is not 
> propagated to all parts of opensaf.
>  > In talking with Alex Jones, he believes this issue has been 
> resolved but is unsure what the actual fix is.
>  > Can somebody confirm this issue has been fixed and provide the 
> ticket number?
>  >
>  > Setup:
>  >
>  > - Opensaf 4.6.0 running on RHEL 6.6 VMs with TCP
>  >
>  > - 2 controllers, 4 payloads
>  >
>  > Here's the status of the SU from the amf-state command where it 
> shows the admin state as unlocked:
>  >
>  > [root@payload-4 ~]# date; amf-state su | grep -A 5 app4
>  > Fri Mar 24 09:41:26 EDT 2017
>  > safSu=SU1,safSg=app4,safApp=app4
>  > saAmfSUAdminState=UNLOCKED(1)
>  > saAmfSUOperState=ENABLED(1)
>  > saAmfSUPresenceState=INSTANTIATED(3)
>  > saAmfSUReadinessState=OUT-OF-SERVICE(1)
>  > [root@payload-4 ~]#
>  >
>  > When I issue the lock request, it fails with the following error:
>  > [root@payload-4 ~]# date; amf-adm lock 
> safSu=SU1,safSg=app4,safApp=app4
>  > Fri Mar 24 09:42:12 EDT 2017
>  > error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: 
> SA_AIS_ERR_NO_OP (28)
>  > error-string: Admin operation (2) has no effect on current state (2)
>  > [root@payload-4 ~]#
>  >
>  > Regards,
>  > David
>  > 
> ------------------------------------------------------------------------------
>  > Check out the vibrant tech community on one of the world's most
>  > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>  
>
>  > _______________________________________________
>  > Opensaf-users mailing list
>  > [email protected]
>  > https://lists.sourceforge.net/lists/listinfo/opensaf-users
>  >
>

