- **status**: assigned --> accepted
- **Comment**:

Steps to reproduce:
1. Create two classes, class1 and class2.
2. Create an object obj of class1.
3. Create an application with a class applier for class2 (@app1).
4. Create an application with an object applier for object obj (@app2).
5. Modify the object obj and insert a sleep before the apply.
6. Close the application with the class applier for class2 (@app1) and try to 
reconnect.

Sep 17 16:43:27.265185 osafimmnd [16872:ImmModel.cc:12223] >> implementerSet
Sep 17 16:43:27.265211 osafimmnd [16872:ImmModel.cc:12264] T7 Re-using 
implementer for @app1
Sep 17 16:43:27.265219 osafimmnd [16872:ImmModel.cc:12307] TR TRY_AGAIN: ccb 4 
is active on object 'obj' bound to object applier '@app1'. Can not re-attach 
applier
Sep 17 16:43:27.265225 osafimmnd [16872:ImmModel.cc:12409] << implementerSet

Currently, implementerSet returns TRY_AGAIN whenever an object in an active 
CCB has an object applier, regardless of which applier is reconnecting. The 
fix: when implementerSet is called, check the active CCBs and return TRY_AGAIN 
only when the object applier has the same implementer info as the newly 
connecting implementer.
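
The proposed check can be sketched roughly as follows. This is a minimal, 
standalone illustration and not the actual ImmModel.cc code: the types 
`ObjectMutation`, `Ccb`, `SetResult` and the function `checkApplierReattach` 
are hypothetical stand-ins, and implementer info is reduced to the applier 
name for simplicity.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the IMM model's bookkeeping.
struct ObjectMutation {
    std::string objectName;
    std::string objectApplierName;  // empty if no object applier is bound
};

struct Ccb {
    unsigned id = 0;
    bool active = false;
    std::vector<ObjectMutation> mutations;
};

enum class SetResult { Ok, TryAgain };

// Return TryAgain only if an active CCB touches an object whose object
// applier is the same implementer that is now trying to re-attach.
// Object appliers belonging to other implementers do not block the call.
SetResult checkApplierReattach(const std::string& implName,
                               const std::vector<Ccb>& ccbs) {
    for (const Ccb& ccb : ccbs) {
        if (!ccb.active) {
            continue;  // finished CCBs cannot conflict
        }
        for (const ObjectMutation& m : ccb.mutations) {
            if (!m.objectApplierName.empty() &&
                m.objectApplierName == implName) {
                return SetResult::TryAgain;
            }
        }
    }
    return SetResult::Ok;
}
```

Under these assumptions, with an active CCB on obj whose object applier is 
@app2, re-attaching @app1 would succeed, while re-attaching @app2 would 
still get TRY_AGAIN.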




---

**[tickets:#1078] Quiesced controller fails to become Active**

**Status:** accepted
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 09:39 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 09:41 AM UTC
**Owner:** Neelakanta Reddy

The issue is seen on SLES x86 with 2PBE and 50k objects. OpenSAF is running 
with changeset 5697+#946 patch.

IMM applications were running while a switchover was in progress. After SC-2 
moved to Quiesced, SC-1 rebooted because of #1067. SC-2, which was in 
Quiesced, tried to become Active, but the implementer set timed out for amfd 
and the cluster rebooted.

Syslog of SC-1:

Sep 15 09:38:48 SLES-64BIT-SLOT1 osafclmd[2471]: ER saImmOiClassImplementerSet 
failed for class SaClmNode rc:9, exiting
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: ER 
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
Sep 15 09:38:48 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60

This is a known issue, #1067.

Syslog of SC-2:


Sep 15 09:38:52 SLES-64BIT-SLOT2 osafimmnd[2340]: NO Epoch set to 55 in ImmModel
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER saImmOiImplementerSet 
failed 5
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER avd_imm_applier_set FAILED
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: role.cc:592: 
avd_mds_qsd_role_evh: Assertion '0' failed.
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: ER AMF director unexpectedly 
crashed
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) 
received, OwnNodeId = 131599, SupervisionTime = 60
Sep 15 09:38:56 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
timeout=60

AMFD Traces on SC-2:

Sep 15  9:38:45.443707 osafamfd [2400:imm.cc:1252] >> avd_imm_applier_set
Sep 15  9:38:50.128084 osafamfd [2400:mds.cc:0453] TR avnd 2010f97b02025 down
Sep 15  9:38:50.128319 osafamfd [2400:mbcsv_mds.c:0435] T1 RED_DOWN event. 
pwe_hdl: 65537, anchor:564116434460706
Sep 15  9:38:50.128343 osafamfd [2400:mbcsv_pwe_anc.c:0122] >> 
mbcsv_rmv_pwe_anc_entry
Sep 15  9:38:50.128359 osafamfd [2400:mbcsv_pwe_anc.c:0144] << 
mbcsv_rmv_pwe_anc_entry
......

Sep 15  9:38:54.535469 osafamfd [2400:timer.cc:0169] << avd_tmr_exp
Sep 15  9:38:56.456335 osafamfd [2400:imm.cc:1256] ER saImmOiImplementerSet 
failed 5
Sep 15  9:38:56.456351 osafamfd [2400:role.cc:0591] ER avd_imm_applier_set 
FAILED
Sep 15  9:41:03.730331 osafamfd [2439:main.cc:0464] >> initialize

IMMND Traces on SC-2:

Sep 15  9:38:45.447989 osafimmnd [2340:ImmModel.cc:12117] >> implementerSet
Sep 15  9:38:45.448038 osafimmnd [2340:ImmModel.cc:12158] T7 Re-using 
implementer for @safAmfService2020f
Sep 15  9:38:45.448091 osafimmnd [2340:ImmModel.cc:12201] TR TRY_AGAIN: ccb 27 
is active on object 'attrName_testMA_verifyObjApplRejModifyCallback_101' bound 
to object applier '@safAmfService2020f'. Can not re-attach applier
Sep 15  9:38:45.448129 osafimmnd [2340:ImmModel.cc:12303] << implementerSet

attrName_testMA_verifyObjApplRejModifyCallback_101 is an application class 
object. Syslog and IMMND traces for both the controllers are attached.


