- **status**: accepted --> review
- **Comment**:

A suspend is actually not being issued here. The state machine code is 
implemented such that the suspend is only done in states Executing, Suspending, 
or RollingBack.
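As a rough illustration of that rule, the sketch below models a state check in which a suspend request is only acted on in three states. The enum and the function `suspendIsActedOn` are hypothetical names made up for this example; they are not the actual smfd state classes, and the state list is abbreviated.

```cpp
// Hypothetical, abbreviated list of campaign states (not the smfd classes).
enum class CampaignState {
  Initial,
  Executing,
  Suspending,
  Completed,
  Committed,
  RollingBack
};

// Returns true only for the states in which a suspend request is acted on.
// In every other state the request is ignored, so no suspend is issued.
bool suspendIsActedOn(CampaignState state) {
  switch (state) {
    case CampaignState::Executing:
    case CampaignState::Suspending:
    case CampaignState::RollingBack:
      return true;
    default:
      return false;
  }
}
```

Under this model, a suspend request arriving in the Completed or Committed state is simply dropped, which matches the observation that no suspend is issued here.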

After getting some more logs from Rafael, it is clear this is a race condition 
between an async failure in AMF and the campaign commit being executed. Here is 
what is happening:

Campaign commit is performed. Before smfd clears the suMaintenanceCampaign 
attribute for the SU, a component in that SU fails. This sends an NTF event 
with the maintenance campaign name. At the same time, the poll routine in smfd 
processes the TERMINATE upgrade thread event. When that call returns, the 
upgrade campaign thread has been deleted and m_running has been set to false, 
but the NTF file descriptor has not been processed yet. The poll routine then 
processes the NTF event, which tries to deliver the asyncFailure event through 
the upgrade thread. The thread is already gone, hence the crash.

The solution should be to always have "processEvt" last in the poll routine, so 
that if m_running is set to false, no other processing will be done, and the 
poll loop will finish.
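The ordering issue can be sketched with a small model of one poll iteration in which both the mailbox and the NTF descriptor are ready. This is only an illustration under stated assumptions: `Daemon`, `UpgradeThread`, `processNtf` and `pollOnce` are hypothetical names, and the model counts a dropped NTF event where the real daemon would dereference a deleted thread. Only `processEvt`, `m_running` and `asyncFailure` are taken from the description above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the upgrade campaign thread.
struct UpgradeThread {
  std::vector<std::string> delivered;
  void deliver(const std::string& evt) { delivered.push_back(evt); }
};

// Hypothetical model of the daemon state touched by the poll routine.
struct Daemon {
  std::unique_ptr<UpgradeThread> thread{new UpgradeThread};
  bool m_running = true;
  int ntf_handled = 0;
  int ntf_dropped = 0;

  // Mailbox handler: a TERMINATE event deletes the thread and stops the loop.
  void processEvt(bool terminate_pending) {
    if (terminate_pending) {
      thread.reset();
      m_running = false;
    }
  }

  // NTF handler: forwards an async failure to the upgrade thread.
  void processNtf(bool ntf_pending) {
    if (!ntf_pending) return;
    if (thread) {
      thread->deliver("asyncFailure");
      ++ntf_handled;
    } else {
      ++ntf_dropped;  // the real code would use a deleted thread here
    }
  }
};

// One poll iteration with both descriptors ready.
// evt_last == true models the proposed ordering (processEvt handled last).
Daemon pollOnce(bool evt_last) {
  Daemon d;
  if (evt_last) {
    d.processNtf(true);
    d.processEvt(true);
  } else {
    d.processEvt(true);
    d.processNtf(true);
  }
  assert(!d.m_running);  // either way the loop exits after this iteration
  return d;
}
```

With `pollOnce(false)` the NTF event arrives after the thread has been deleted, which is the crash scenario; with `pollOnce(true)` every other descriptor is handled while the thread still exists, and the loop then stops because `m_running` is false.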



---

**[tickets:#2413] smf: coredump, suspend is issued at completed state**

**Status:** review
**Milestone:** 5.2.0
**Created:** Wed Apr 05, 2017 12:39 PM UTC by Rafael
**Last Updated:** Thu Apr 06, 2017 03:33 PM UTC
**Owner:** Alex Jones
**Attachments:**

- [osafsmfd.9276.SC-2.core.txt](https://sourceforge.net/p/opensaf/tickets/2413/attachment/osafsmfd.9276.SC-2.core.txt) (15.4 kB; text/plain)


Ticket #2145 looks to be causing this issue.

A coredump printout is attached.

Steps to reproduce: run a campaign and have an AMF component fail at the 
campaign completed state. This triggers an event in SMF which tries to suspend 
a completed campaign.

Function handleAmfObjectStateChangeNotification will try to call 
asyncFailure(), which is the same as suspend(). Because the campaign is 
completed and committed, this is not a valid transition. The campaign state 
instance has most likely been deleted, so we get a coredump.

For reference, see figures 5, 6, and 7 in the SMF AIS, starting from section 5.1.3.

