- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar
- **Milestone**: future --> 4.4.FC
---
** [tickets:#426] IMM returns ERR_TRY_AGAIN for saImmOiRtObjectUpdate() in an
IMM initiated callback**
**Status:** assigned
**Created:** Fri May 31, 2013 06:31 AM UTC by Nagendra Kumar
**Last Updated:** Fri May 31, 2013 06:31 AM UTC
**Owner:** Nagendra Kumar
Migrated from http://devel.opensaf.org/ticket/2821
ChangeSet?: 3730
Redundancy model: NWAY
Through SMF campaign performed lock, lock-in of SU and deleted its components
and SupportedCSTypes of Class CompCSType.
Observed saImmOiRtObjectUpdate of below attributes is being performed by AMF
which should not happen as these are not run-time modifiable
1. saAmfCompNumCurrActiveCSIs
2. saAmfSURestartCount
3. saAmfSUNumCurrStandbySIs
4. saAmfSUNumCurrActiveSIs
/var/log/messages show :
==============================
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSupportedCsType=safVersion=4.0.0\,safCSType=NWAYCSBASETYPE_PI,safComp=COMP1SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompNumCurrActiveCSIs failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSupportedCsType=safVersion=4.0.0\,safCSType=NWAYCSBASETYPE_PI,safComp=COMP2SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompNumCurrActiveCSIs failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSupportedCsType=safVersion=4.0.0\,safCSType=NWAYCSBASETYPE_PI,safComp=COMP3SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompNumCurrActiveCSIs failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP1SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompRestartCount failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP1SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompCurrProxyName failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP2SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompRestartCount failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP2SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompCurrProxyName failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP3SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompRestartCount failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP3SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompCurrProxyName failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSu=SU5,safSg=SGONE,safApp=NWAYAPP' saAmfSURestartCount failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSu=SU5,safSg=SGONE,safApp=NWAYAPP' saAmfSUNumCurrStandbySIs failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safSu=SU5,safSg=SGONE,safApp=NWAYAPP' saAmfSUNumCurrActiveSIs failed with 6
Sep 24 17:59:30 SLES11-SLOT-1 osafimmnd[22297]: NO Ccb 140 COMMITTED
(SMFSERVICE)
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: ER job_exec_imm_objupdate:
update FAILED 12
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: ER job_exec_imm_objupdate:
update FAILED 12
Sep 24 17:59:30 SLES11-SLOT-1 osafimmnd[22297]: NO Create of PERSISTENT runtime
object
'smfRollbackData=00000001,smfRollbackElement=ccb_00000009,smfRollbackElement=ProcWrapup?,safSmfProc=amfClusterProc-1,safSmfCampaign=Campaign,safApp=safSmfService'
by Impl safSmfCampaign=Campaign,safApp=safSmfService
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: ER job_exec_imm_objupdate:
update FAILED 12
Changed 8 months ago by hafe ¶
■version changed from 4.2.2 to 4.2.1
Kind of a duplicate of http://devel.opensaf.org/ticket/2227
Changed 8 months ago by hafe ¶
Could you please rerun the test case with amfd trace on?
Changed 8 months ago by hrishikesh.chenna
■attachment osafamfd.tgz added
Attached amfd trace file for the same time stamp given in ticket log snippet.
Changed 8 months ago by hafe ¶
■summary changed from Non modifiable runtime attributes are being modified by
AMF and returns ERR_TRY_AGAIN. to IMM returns ERR_TRY_AGAIN for
saImmOiRtObjectUpdate() in an IMM initiated callback
The AMF model contains classes with pure runtime objects. What is shown in the
trace is just a normal read of such an object which results in a callback to
avd to return values to IMM. Thus the ticket summary is wrong and now changed.
The problem here is that IMM responds with TRYAGAIN in context of an IMM
callback! The complete syslog and immnd trace is not there so there is not much
more to be done. If they were provided it would be nice.
However this log:
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: ER job_exec_imm_objupdate:
update FAILED 12
Could be avoided since it means that a deferred object update (due to an
earlier TRYAGAIN) fails since the object does not exist in IMM. At object
deletion avd could walk the deferred IMM job list and prune those jobs that are
affected.
Changed 8 months ago by anders ¶
Severity of:
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: ER job_exec_imm_objupdate:
update FAILED 12
should not be ER unless AMFD is terminating due to this, which would seem
excessive.
Severity of this:
Sep 24 17:59:30 SLES11-SLOT-1 osafamfd[22351]: WA saImmOiRtObjectUpdate of
'safComp=COMP3SU5NWAYAPP,safSu=SU5,safSg=SGONE,safApp=NWAYAPP'
saAmfCompCurrProxyName
failed with 6
should not be WA and not even be logged since the error is simply TRY_AGAIN.
When/if the service/OI has to give up its retry attempts due to its
realtime behavior requirements and duties towards other tasks, then
it should simply geive up and return ERR_NO_RESOURCES on the callback.
(Ideal would be to return ERR_TRY_AGAIN but that is not allowed according to
hte spec).
Changed 8 months ago by anders ¶
■milestone changed from future_releases to 4.2.3
follow-up: ↓ 7 Changed 8 months ago by anders ¶
The description for this ticket suggests that the local IMMND has crashed and
restarted.
If that is the case, then that should of course be investigated.
If it is not a known and fixed problem, then a ticket should be written.
It could I suppose be an effect of a node being shut down, as part of the SMF
campaign.
That would be covered by http://devel.opensaf.org/ticket/2099
in reply to: ↑ 6 Changed 8 months ago by hafe ¶
Replying to anders:
The description for this ticket suggests that the local IMMND has crashed and
restarted.
If that is the case, then that should of course be investigated.
If it is not a known and fixed problem, then a ticket should be written.
It could I suppose be an effect of a node being shut down, as part of the SMF
campaign.
That would be covered by http://devel.opensaf.org/ticket/2099
From the traces I can see that IMMND has not crashed. It is alive and executing
a CCB where objects are deleted. What is strange is that an object delete is
followed by read of that same object. The CCB is still not applied. The object
contains pure runtime objects so AMFDs SaImmOiRtAttrUpdateCallbackT is invoked.
The callback tries to return attribute values to IMMND with
saImmOiRtObjectUpdate() but receives TRYAGAIN. This is logged at Warning level
since it is perceived as strange! TRYAGAIN causes the object to be put into
AMFD's job queue. Later when processed, the object no longer exist (for real)
but a logging bug in job_exec_imm_objupdate() causes this to be logged at Error
level.
So my conclusion:
1) AMFD's logging could be changed and corrected (but that would only hide this)
2) IMMND is returning TRYAGAIN at the same time calling an implementers
SaImmOiRtAttrUpdateCallbackT, weird... Why bother calling the implementer if it
is anyway not allowed to reply?
3) Most likely SMF is constantly reading deleted (CCB not applied) objects! Why
is that? If really necessary it is probably only interested in config
attributes anyway?
Would it be possible to get SMFD and IMMND traces also?
Changed 5 months ago by anders ¶
Went back to look at this unsolved ticket because it has similarities with
#2922.
It could be (its likely) that the TRY_AGAIN from IMMND on the RtUpdate? is due
to IMMD (active)
being down.
Unfortunately, the only info attached to this ticket is the osafamfd trace.
But it appears that the AMFD gets stuck in its progress of performing the
switchover, The key question is if the AMFD can get *blocked* on imm requests
during
the switchover? My understanding was that the AMFD when getting try-again on
such
a job would park the imm-rt-update in the AMFDs job queue. So it should not get
stuck
and the switchover should get completed.
Indeed perhaps that is the case.
It is not totally clear what problem this ticket is reporting.
The slogan complains that an imm downcall inside an imm callback gets TRY_AGAIN
from the imm. But that is the way it is. The cluster state may change in the
time
beween the immnd sending the callback request to the imma client and the imma
client
acting on that callback. This is a posibillity and not an error.
The implementer of the callback could return an error on the callback as a way
to get rid of that job.
Changed 4 months ago by hafe ¶
■milestone changed from 4.2.3 to future_releases
removing milestone, ticket is not accepted thus not scheduled for 4.2.3
---
Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is
subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at
https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a
mailing list, you can unsubscribe from the mailing list.
------------------------------------------------------------------------------
Learn the latest--Visual Studio 2012, SharePoint 2013, SQL 2012, more!
Discover the easy way to master current and previous Microsoft technologies
and advance your career. Get an incredible 1,500+ hours of step-by-step
tutorial videos with LearnDevNow. Subscribe today and save!
http://pubads.g.doubleclick.net/gampad/clk?id=58041391&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets