- **summary**: Quiesced controller fails to become Active --> imm: Return
TRY_AGAIN only when object applier matches the Re-using implementerset info
---
** [tickets:#1078] imm: Return TRY_AGAIN only when object applier matches the
Re-using implementerset info**
**Status:** review
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 09:39 AM UTC by Sirisha Alla
**Last Updated:** Wed Sep 17, 2014 11:58 AM UTC
**Owner:** Neelakanta Reddy
The issue is seen on SLES x86 with 2PBE and 50k objects. OpenSAF is running
changeset 5697 plus the #946 patch.
IMM applications were running while a switchover was in progress. After SC-2
moved to Quiesced, SC-1 rebooted because of #1067. SC-2, which was in Quiesced,
tried to become Active, but the implementer set timed out for amfd and the
cluster rebooted.
Syslog of SC-1:
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafclmd[2471]: ER saImmOiClassImplementerSet
failed for class SaClmNode rc:9, exiting
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: NO
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: ER
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: Rebooting OpenSAF NodeId =
131343 EE Name = , Reason: Component faulted: recovery is node failfast,
OwnNodeId = 131343, SupervisionTime = 60
Sep 15 09:38:48 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node;
timeout=60
This is a known issue, #1067.
Syslog of SC-2:
Sep 15 09:38:52 SLES-64BIT-SLOT2 osafimmnd[2340]: NO Epoch set to 55 in ImmModel
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER saImmOiImplementerSet
failed 5
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER avd_imm_applier_set FAILED
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: role.cc:592:
avd_mds_qsd_role_evh: Assertion '0' failed.
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: ER AMF director unexpectedly
crashed
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: Rebooting OpenSAF NodeId =
131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest)
received, OwnNodeId = 131599, SupervisionTime = 60
Sep 15 09:38:56 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node;
timeout=60
AMFD Traces on SC-2:
Sep 15 9:38:45.443707 osafamfd [2400:imm.cc:1252] >> avd_imm_applier_set
Sep 15 9:38:50.128084 osafamfd [2400:mds.cc:0453] TR avnd 2010f97b02025 down
Sep 15 9:38:50.128319 osafamfd [2400:mbcsv_mds.c:0435] T1 RED_DOWN event.
pwe_hdl: 65537, anchor:564116434460706
Sep 15 9:38:50.128343 osafamfd [2400:mbcsv_pwe_anc.c:0122] >>
mbcsv_rmv_pwe_anc_entry
Sep 15 9:38:50.128359 osafamfd [2400:mbcsv_pwe_anc.c:0144] <<
mbcsv_rmv_pwe_anc_entry
......
Sep 15 9:38:54.535469 osafamfd [2400:timer.cc:0169] << avd_tmr_exp
Sep 15 9:38:56.456335 osafamfd [2400:imm.cc:1256] ER saImmOiImplementerSet
failed 5
Sep 15 9:38:56.456351 osafamfd [2400:role.cc:0591] ER avd_imm_applier_set
FAILED
Sep 15 9:41:03.730331 osafamfd [2439:main.cc:0464] >> initialize
IMMND Traces on SC-2:
Sep 15 9:38:45.447989 osafimmnd [2340:ImmModel.cc:12117] >> implementerSet
Sep 15 9:38:45.448038 osafimmnd [2340:ImmModel.cc:12158] T7 Re-using
implementer for @safAmfService2020f
Sep 15 9:38:45.448091 osafimmnd [2340:ImmModel.cc:12201] TR TRY_AGAIN: ccb 27
is active on object 'attrName_testMA_verifyObjApplRejModifyCallback_101' bound
to object applier '@safAmfService2020f'. Can not re-attach applier
Sep 15 9:38:45.448129 osafimmnd [2340:ImmModel.cc:12303] << implementerSet
attrName_testMA_verifyObjApplRejModifyCallback_101 is an object of an
application class. Syslog and IMMND traces for both controllers are attached.
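
The check that the revised summary asks for can be sketched as below. This is only an illustration of the intended rule, not the actual ImmModel.cc code: `Result`, `CcbObject`, `Ccb` and `canReattachApplier` are hypothetical, simplified stand-ins. The assumption is that re-attaching an applier should be answered with TRY_AGAIN only when an active CCB touches an object whose object applier really is the implementer being re-used.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for IMMND model state.
// These are NOT the real ImmModel types.
enum class Result { Ok, TryAgain };

struct CcbObject {
    std::string objectName;     // object touched by the CCB
    std::string objectApplier;  // object applier bound to it, empty if none
};

struct Ccb {
    unsigned id;
    bool active;
    std::vector<CcbObject> objects;
};

// Decide whether re-attaching `applierName` has to be deferred.
// TRY_AGAIN is returned only if an active CCB touches an object whose
// object applier matches the implementer being re-used.
Result canReattachApplier(const std::string& applierName,
                          const std::vector<Ccb>& ccbs) {
    assert(!applierName.empty());
    for (const Ccb& ccb : ccbs) {
        if (!ccb.active) continue;  // finished CCBs never block re-attach
        for (const CcbObject& obj : ccb.objects) {
            if (obj.objectApplier == applierName) {
                return Result::TryAgain;
            }
        }
    }
    return Result::Ok;
}
```

Under this rule, a CCB on an application class object that is not bound to '@safAmfService2020f' would not delay the applier set by amfd.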