Hi David,

sc-2 (primary PBE):

Mar 19 16:43:52 controller-sc2 osafimmnd[6014]: WA update of PERSISTENT runtime attributes in object 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
Mar 19 16:43:54 controller-sc2 osafimmpbed: IN Starting distributed PBE commit for PRTA update Ccb:100000b1b/4294970139
Mar 19 16:44:04 controller-sc2 osafimmpbed: WA Start prepare for ccb: 100000b1b/4294970139 towards slave PBE returned: '5' from Immsv
Mar 19 16:44:04 controller-sc2 osafimmpbed: WA PBE-A failed to prepare PRTA update Ccb:100000b1b/4294970139 towards PBE-B
Mar 19 16:44:04 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000b1b)

sc-1 (slave PBE):

Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at PBE slave ccbId:100000b1b/4294970139 numOps:1
Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare ccb:100000b1b/4294970139 received at Pbe slave when Prior Ccb 4294970138 still processing

Here the slave PBE replies with TRY_AGAIN to the primary PBE, but that reply is not received at the primary PBE. This results in a TIMEOUT (5), which is converted into SA_AIS_ERR_NO_RESOURCES (18).

/Neel.

> -------- Forwarded Message --------
> Subject: RE: [users] opensaf internal admin state not propagated
> Date: Mon, 27 Mar 2017 18:49:25 +0000
> From: David Hoyt <[email protected]>
> To: praveen malviya <[email protected]>, [email protected] <[email protected]>
>
> Hi Praveen,
>
> First off, thank you for your reply.
>
> Secondly, the lab was recovered. I issued the following command:
>
> [root@payload-4 ~]# immadm -o 1 safSu=SU1,safSg=app4,safApp=app4
> [root@payload-4 ~]#
> [root@payload-4 ~]# date; amf-state su | grep -A 5 app4
> Fri Mar 24 11:41:36 EDT 2017
> safSu=SU1,safSg=app4,safApp=app4
> saAmfSUAdminState=UNLOCKED(1)
> saAmfSUOperState=ENABLED(1)
> saAmfSUPresenceState=INSTANTIATED(3)
> saAmfSUReadinessState=IN-SERVICE(2)
> [root@payload-4 ~]#
> [root@payload-4 ~]# date; amf-adm lock safSu=SU1,safSg=app4,safApp=app4
> Fri Mar 24 11:43:46 EDT 2017
> [root@payload-4 ~]# date; amf-state su | grep -A 5 app4
> Fri Mar 24 11:44:56 EDT 2017
> safSu=SU1,safSg=app4,safApp=app4
> saAmfSUAdminState=LOCKED(2)
> saAmfSUOperState=ENABLED(1)
> saAmfSUPresenceState=INSTANTIATED(3)
> saAmfSUReadinessState=OUT-OF-SERVICE(1)
> [root@payload-4 ~]#
>
> Thirdly: we found we had a network issue between PL-4 and SC-2, which likely caused our problem(s).
>
> Having said that, when AMF and IMM end up with a different view of the OpenSAF SU state, how can you recover from this?
> Is there some audit that runs periodically to ensure the state info is in sync between the various OpenSAF processes?
>
> Below is a summary after reviewing the logs.
>
> SU: app4-SU-1
> Node: PL-4
>
> - As stated, PL-4 is able to ping SC-1 but NOT SC-2
> - All was fine with app4-SU-1 for several days.
> It was running unlocked-enabled-in-service-active.
> - On March 18, PL-4 went for a reboot.
> - Upon recovery, opensaf on PL-4 failed to start – it appears it still could not communicate with SC-2:
>
> Mar 18 22:30:49 payload-4 opensafd: Starting OpenSAF Services (Using TCP)
> Mar 18 22:30:49 payload-4 osafdtmd[1570]: Started
> Mar 18 22:30:49 payload-4 osafimmnd[1588]: Started
> Mar 18 22:30:49 payload-4 osafimmnd[1588]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
> Mar 18 22:30:49 payload-4 osafdtmd[1570]: NO Established contact with 'controller-sc1'
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Timed-out for response from IMMND
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Going for recovery
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
> Mar 18 22:31:49 payload-4 opensafd[1562]: ER Sending SIGKILL to IMMND, pid=1576
> Mar 18 22:31:49 payload-4 osafimmnd[1588]: exiting for shutdown
>
> - On March 19, with opensaf on PL-4 still failing to start, the user issued a lock of app4-SU-1 from SC-2. Here are the logs from SC-2:
>
> Mar 19 16:43:31 controller-sc2 Running cmd: 'amf-adm -t 900 lock safSu=SU1,safSg=app4,safApp=app4'
> Mar 19 16:43:41 controller-sc2 osafimmpbed: IN Starting distributed PBE commit for PRTA update Ccb:100000b1a/4294970138
> Mar 19 16:43:51 controller-sc2 osafimmnd[6014]: WA Failed to retrieve search continuation, client died ?
> Mar 19 16:43:51 controller-sc2 osafimmpbed: WA Start prepare for ccb: 100000b1a/4294970138 towards slave PBE returned: '5' from Immsv
> Mar 19 16:43:51 controller-sc2 osafimmpbed: WA PBE-A failed to prepare PRTA update Ccb:100000b1a/4294970138 towards PBE-B
> Mar 19 16:43:51 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000b1a)
> Mar 19 16:43:52 controller-sc2 osafimmnd[6014]: WA update of PERSISTENT runtime attributes in object 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
> Mar 19 16:43:54 controller-sc2 osafimmpbed: IN Starting distributed PBE commit for PRTA update Ccb:100000b1b/4294970139
> Mar 19 16:44:04 controller-sc2 osafimmpbed: WA Start prepare for ccb: 100000b1b/4294970139 towards slave PBE returned: '5' from Immsv
> Mar 19 16:44:04 controller-sc2 osafimmpbed: WA PBE-A failed to prepare PRTA update Ccb:100000b1b/4294970139 towards PBE-B
> Mar 19 16:44:04 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000b1b)
> Mar 19 16:44:05 controller-sc2 osafimmnd[6014]: WA update of PERSISTENT runtime attributes in object 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18
> Mar 19 16:44:06 controller-sc2 osafimmpbed: IN Starting distributed PBE commit for PRTA update Ccb:100000b1c/4294970140
> Mar 19 16:44:10 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :Connection timed out
> Mar 19 16:44:16 controller-sc2 osafimmpbed: WA Start prepare for ccb: 100000b1c/4294970140 towards slave PBE returned: '5' from Immsv
> Mar 19 16:44:16 controller-sc2 osafimmpbed: WA PBE-A failed to prepare PRTA update Ccb:100000b1c/4294970140 towards PBE-B
> Mar 19 16:44:16 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000b1c)
> Mar 19 16:44:17 controller-sc2 osafimmnd[6014]: WA update of PERSISTENT runtime attributes in object 'safSu=SU1,safSg=app4,safApp=app4' REVERTED.
PBE rc:18 > > Mar 19 16:44:19 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b1d/4294970141 > > Mar 19 16:44:29 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b1d/4294970141 towards slave PBE returned: '5' from Immsv > > Mar 19 16:44:29 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b1d/4294970141 towards PBE-B > > Mar 19 16:44:29 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b1d) > > Mar 19 16:44:30 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:44:32 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b1e/4294970142 > > Mar 19 16:44:42 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b1e/4294970142 towards slave PBE returned: '5' from Immsv > > Mar 19 16:44:42 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b1e/4294970142 towards PBE-B > > Mar 19 16:44:42 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b1e) > > Mar 19 16:44:42 controller-sc2 osafimmnd[6014]: WA Failed to retrieve > search continuation, client died ? > > Mar 19 16:44:42 controller-sc2 osafimmnd[6014]: WA Failed to retrieve > search continuation, client died ? > > Mar 19 16:44:43 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:44:44 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b1f/4294970143 > > Mar 19 16:44:54 controller-sc2 osafimmnd[6014]: WA Failed to retrieve > search continuation, client died ? > > Mar 19 16:44:54 controller-sc2 osafimmnd[6014]: WA Failed to retrieve > search continuation, client died ? > > Mar 19 16:44:54 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b1f/4294970143 towards slave PBE returned: '5' from Immsv > > Mar 19 16:44:54 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b1f/4294970143 towards PBE-B > > Mar 19 16:44:54 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b1f) > > Mar 19 16:44:55 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:44:57 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b20/4294970144 > > Mar 19 16:45:07 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b20/4294970144 towards slave PBE returned: '5' from Immsv > > Mar 19 16:45:07 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b20/4294970144 towards PBE-B > > Mar 19 16:45:07 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b20) > > Mar 19 16:45:08 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:45:09 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b21/4294970145 > > Mar 19 16:45:13 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :Connection timed out > > Mar 19 16:45:19 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b21/4294970145 towards slave PBE returned: '5' from Immsv > > Mar 19 16:45:19 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b21/4294970145 towards PBE-B > > Mar 19 16:45:19 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b21) > > Mar 19 16:45:20 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:45:22 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b22/4294970146 > > Mar 19 16:45:31 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:32 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b22/4294970146 towards slave PBE returned: '5' from Immsv > > Mar 19 16:45:32 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b22/4294970146 towards PBE-B > > Mar 19 16:45:32 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b22) > > Mar 19 16:45:33 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:45:34 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:34 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b23/4294970147 > > Mar 19 16:45:37 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:40 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:43 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:44 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b23/4294970147 towards slave PBE returned: '5' from Immsv > > Mar 19 16:45:44 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b23/4294970147 towards PBE-B > > Mar 19 16:45:44 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b23) > > Mar 19 16:45:45 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:45:46 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:47 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b24/4294970148 > > Mar 19 16:45:49 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:52 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:55 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:57 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b24/4294970148 towards slave PBE returned: '5' from Immsv > > Mar 19 16:45:57 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b24/4294970148 towards PBE-B > > Mar 19 16:45:57 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b24) > > Mar 19 16:45:58 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:45:58 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:45:59 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b25/4294970149 > > Mar 19 16:46:01 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:04 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:07 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:09 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b25/4294970149 towards slave PBE returned: '5' from Immsv > > Mar 19 16:46:09 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b25/4294970149 towards PBE-B > > Mar 19 16:46:09 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b25) > > Mar 19 16:46:10 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:10 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:11 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b26/4294970150 > > Mar 19 16:46:13 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:16 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:19 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:21 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b26/4294970150 towards slave PBE returned: '5' from Immsv > > Mar 19 16:46:21 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b26/4294970150 towards PBE-B > > Mar 19 16:46:21 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b26) > > Mar 19 16:46:22 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:22 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:46:24 controller-sc2 osafimmpbed: IN Starting distributed > PBE commit for PRTA update Ccb:100000b27/4294970151 > > Mar 19 16:46:25 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed > (connect()) err :No route to host > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:25 controller-sc2 osafimmnd[6014]: WA Continuation for > AdminOp was lost, probably due to timeout > > Mar 19 16:46:26 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:26 controller-sc2 osafimmnd[6014]: ER PBE PRTAttrs Update > continuation missing! invoc:2842 > > Mar 19 16:46:26 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:27 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:27 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:28 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:28 controller-sc2 osafimmpbed: NO Slave PBE 1 or Immsv > (6) replied with transient error on prepare for ccb:100000b27/4294970151 > > Mar 19 16:46:28 controller-sc2 osafimmpbed: WA Start prepare for ccb: > 100000b27/4294970151 towards slave PBE returned: '6' from standby PBE > > Mar 19 16:46:28 controller-sc2 osafimmpbed: WA PBE-A failed to prepare > PRTA update Ccb:100000b27/4294970151 towards PBE-B > > Mar 19 16:46:28 controller-sc2 osafimmpbed: NO 2PBE Error (18) in PRTA > update (ccbId:100000b27) > > Mar 19 16:46:28 controller-sc2 osafimmnd[6014]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:46:28 controller-sc2 osafamfd[6091]: ER exec: update FAILED > > -Meanwhile, SC-1 is generating the following logs: > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1a > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1a/4294970138 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1b/4294970139 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b1b/4294970139 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1c/4294970140 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b1c/4294970140 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1d/4294970141 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b1d/4294970141 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1e/4294970142 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b1e/4294970142 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b1f/4294970143 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b1f/4294970143 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b20/4294970144 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b20/4294970144 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b21/4294970145 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b21/4294970145 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b22/4294970146 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b22/4294970146 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223131, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b23/4294970147 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223133, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223135, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b23/4294970147 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b24/4294970148 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b24/4294970148 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223137, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223139, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223141, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b25/4294970149 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b25/4294970149 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223143, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA Failed to retrieve > search continuation for 565213407223145, client died ? > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b26/4294970150 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b26/4294970150 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:25 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:25 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b27/4294970151 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:26 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:26 controller-sc1 osafimmpbed: NO Prepare > ccb:100000b27/4294970151 received at Pbe slave when Prior Ccb > 4294970138 still processing > > Mar 19 16:46:26 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1b > > Mar 19 16:46:26 controller-sc1 osafimmnd[6057]: ER PBE PRTAttrs Update > continuation missing! invoc:2842 > > Mar 19 16:46:26 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:26 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1b > > Mar 19 16:46:27 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:27 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1b > > Mar 19 16:46:27 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:27 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1b > > Mar 19 16:46:28 controller-sc1 osafimmpbed: IN ccb-prepare received at > PBE slave ccbId:100000b27/4294970151 numOps:1 > > Mar 19 16:46:28 controller-sc1 osafimmpbed: IN PBE slave waiting for > prepare from primary on PRTA update ccb:100000b1b > > Mar 19 16:46:28 controller-sc1 osafimmnd[6057]: WA update of > PERSISTENT runtime attributes in object > 'safSu=SU1,safSg=app4,safApp=app4' REVERTED. 
PBE rc:18
> Mar 19 16:46:28 controller-sc1 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000b1b
> Mar 19 16:46:29 controller-sc1 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000b1b
> Mar 19 16:46:29 controller-sc1 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000b1b
> Mar 19 16:46:30 controller-sc1 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000b1b
> Mar 19 16:46:30 controller-sc1 osafimmpbed: IN PBE slave waiting for prepare from primary on PRTA update ccb:100000b1b
> Mar 19 16:46:31 controller-sc1 osafimmpbed: NO Slave PBE time-out in waiting on prepare for PRTA update ccb:100000b1b dn:safSu=SU1,safSg=app4,safApp=app4
> Mar 19 16:46:31 controller-sc1 osafimmpbed: NO 2PBE Error (18) in PRTA update (ccbId:100000b1b)
>
> - A short time later, I see several of the following logs being generated from SC-2:
>
> Mar 19 17:15:19 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :Connection timed out
> Mar 19 17:16:22 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :Connection timed out
> Mar 19 17:17:25 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :Connection timed out
> Mar 19 17:18:28 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :Connection timed out
> Mar 19 17:18:46 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :No route to host
> Mar 19 17:18:49 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :No route to host
> Mar 19 17:18:52 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :No route to host
> Mar 19 17:18:56 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :No route to host
> Mar 19 17:18:59 controller-sc2 osafdtmd[5961]: ER DTM :Connect failed (connect()) err :No route to host
>
> Thoughts?
>
> Regards,
> David
>
> From: praveen malviya [mailto:[email protected]]
> Sent: Monday, March 27, 2017 1:02 AM
> To: David Hoyt <[email protected]>; [email protected]
> Subject: Re: [users] opensaf internal admin state not propagated
>
> Hi David,
>
> What kind of operations were performed before trying to lock the SU?
>
> There is a command to see the internal states of AMF entities in AMFD:
> "immadm -a safAmfService -o 99 safAmfService"
>
> This command dumps the internal states of AMFD, with all its entities, into a file in the /tmp/ directory. Please check the file name in the syslog on the active controller.
> More details of this command can be found in the AMF PR doc.
>
> Thanks,
> Praveen
>
> On 27-Mar-17 7:18 AM, David Hoyt wrote:
> > Hi all,
> >
> > I've run into this issue a few times now where I attempt to lock an SU, but opensaf rejects the request, stating that it's already locked.
> > However, when I display the SU state, it shows it as unlocked (see commands & output below).
> >
> > It appears that, internally, the opensaf state of an SU is not propagated to all parts of opensaf.
> > In talking with Alex Jones, he believes this issue has been resolved but is unsure what the actual fix is.
> > Can somebody confirm this issue has been fixed and provide the ticket number?
> > Setup:
> >
> > - Opensaf 4.6.0 running on RHEL 6.6 VMs with TCP
> > - 2 controllers, 4 payloads
> >
> > Here's the status of the SU from the amf-state command, where it shows the admin state as unlocked:
> >
> > [root@payload-4 ~]# date; amf-state su | grep -A 5 app4
> > Fri Mar 24 09:41:26 EDT 2017
> > safSu=SU1,safSg=app4,safApp=app4
> > saAmfSUAdminState=UNLOCKED(1)
> > saAmfSUOperState=ENABLED(1)
> > saAmfSUPresenceState=INSTANTIATED(3)
> > saAmfSUReadinessState=OUT-OF-SERVICE(1)
> > [root@payload-4 ~]#
> >
> > When I issue the lock request, it fails with the following error:
> >
> > [root@payload-4 ~]# date; amf-adm lock safSu=SU1,safSg=app4,safApp=app4
> > Fri Mar 24 09:42:12 EDT 2017
> > error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_NO_OP (28)
> > error-string: Admin operation (2) has no effect on current state (2)
> > [root@payload-4 ~]#
> >
> > Regards,
> > David
