- **status**: unassigned --> duplicate
- **assigned_to**: Nagendra Kumar



---

**[tickets:#395] amfnd deadlocks with immnd**

**Status:** duplicate
**Created:** Fri May 31, 2013 05:14 AM UTC by Nagendra Kumar
**Last Updated:** Fri May 31, 2013 05:14 AM UTC
**Owner:** Nagendra Kumar

Migrated from http://devel.opensaf.org/ticket/2601

Have seen a system crash where amfnd is trying to read IMM and immnd is trying 
to register with AMF.


http://devel.opensaf.org/ticket/1713 exists to improve things on the IMM side. 
This ticket should address and reduce the risk on the amfnd side.


With the current design there will always be a risk that an immnd crash leads 
to a system crash. But in the normal case the deadlock should be avoided by 
design.


The core dump from the crash below shows that amfnd is trying to read component 
related info in the context of an API response. This information (component 
capability) can instead be read when the component is initialized.
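As a rough illustration of that idea, the sketch below reads the capability once when the component is initialized and answers later checks from the cached value, so the API response path never calls into IMM. All names here (comp_t, comp_init, fake_imm_read_cap and so on) are hypothetical and are not the actual amfnd code; the IMM read is replaced by a stub, and a single capability value per component is assumed for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability values, not the real SAF/OpenSAF enum. */
typedef enum { CAP_UNKNOWN = 0, CAP_X_ACTIVE, CAP_1_ACTIVE, CAP_OTHER } comp_cap_t;

/* Stand-in for a (potentially blocking) IMM read. Signature is an assumption. */
typedef comp_cap_t (*cap_reader_fn)(const char *comp_type);

typedef struct {
    const char *comp_type;
    comp_cap_t cap;      /* cached at init time */
    bool cap_loaded;
} comp_t;

/* Counts how often the stub "IMM" is consulted. */
int fake_imm_reads = 0;

/* Stub reader: always reports CAP_X_ACTIVE and records the access. */
comp_cap_t fake_imm_read_cap(const char *comp_type)
{
    (void)comp_type;
    fake_imm_reads++;
    return CAP_X_ACTIVE;
}

/* Component initialization: the only place where the reader is called. */
void comp_init(comp_t *comp, const char *comp_type, cap_reader_fn read_cap)
{
    assert(comp != NULL && read_cap != NULL);
    comp->comp_type = comp_type;
    comp->cap = read_cap(comp_type);
    comp->cap_loaded = true;
}

/* Used in the assignment/response path: answers from the cache only. */
bool comp_cap_x_act_or_1_act(const comp_t *comp)
{
    assert(comp != NULL && comp->cap_loaded);
    return comp->cap == CAP_X_ACTIVE || comp->cap == CAP_1_ACTIVE;
}
```

With this shape, a blocked or crashed immnd can only stall component initialization, not the handling of an API response.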


I also realized there is a slight change in the protocol between amfd and amfnd 
that was not intentional and probably reduces the risk: amfd immediately sends 
the instantiate request without waiting for the REGSU response.


#0 0x00007f35bc93efd3 in select () from /lib64/libc.so.6
#1 0x00007f35bdae6b89 in ncs_sel_obj_select (highest_sel_obj=<optimized out>, 
rfds=0x7fff428fa1e0, wfds=0x0, efds=0x0, timeout_in_10ms=0x7fff428fa26c) at 
os_defs.c:2622
#2 0x00007f35bd67111a in imma_sync_with_immnd (cb=<optimized out>) at 
imma_init.c:79
#3 imma_create (sv_id=<optimized out>) at imma_init.c:165
#4 imma_startup (sv_id=NCSMDS_SVC_ID_IMMA_OM) at imma_init.c:278
#5 0x00007f35bd66d94d in initialize_common (immHandle=0x7fff428fa7c0, 
cl_node=0x695200, version=0x7fff428fa5e0) at imma_om_api.c:194
#6 0x00007f35bd66e0b3 in saImmOmInitialize (immHandle=0x7fff428fa7c0, 
immCallbacks=0x0, inout_version=<optimized out>) at imma_om_api.c:177
#7 0x00000000004067fd in immutil_saImmOmInitialize (immHandle=0x7fff428fa7c0, 
immCallbacks=0x0, version=0x7fff428fa7e0) at 
../../../../../osaf/tools/safimm/src/immutil.c:1051
#8 0x000000000043760d in avnd_imm_init (immHandle=0x7fff428fa7c0, 
immVersion=0x7fff428fa7e0) at avnd_util.c:199
#9 0x000000000041df25 in avnd_comp_cap_x_act_or_1_act_check 
(comp_type=0x68cbca, csi_type=0x6999d2) at avnd_comp.c:1105
#10 0x000000000041e3cb in avnd_comp_csi_assign (cb=0x657960, comp=0x68ca90, 
csi=0x6998a0) at avnd_comp.c:1210
#11 0x000000000041e650 in assign_all_csis_at_rank (si=<optimized out>, rank=1, 
single_csi=true) at avnd_comp.c:1632
#12 0x000000000041e7f0 in avnd_comp_csi_assign_done (cb=0x657960, 
comp=0x68f1c0, csi=0x6720e0) at avnd_comp.c:1751
#13 0x000000000040abca in avnd_evt_ava_resp_evh (cb=0x657960, evt=<optimized 
out>) at avnd_cbq.c:440
#14 0x000000000042fdb0 in avnd_evt_process (evt=<optimized out>) at 
avnd_proc.c:279
#15 avnd_main_process () at avnd_proc.c:220
#16 0x00000000004086b5 in main (argc=1, argv=0x7fff428fac58) at amfnd_main.c:53




Changed 14 months ago by hafe
* owner changed from ravisekhar to hafe
* status changed from new to accepted

Changed 14 months ago by hafe

The protocol change mentioned is between 3.0 and 4.0. In 4.0 the REG_COMP 
message is not used at all and is dead code. It should be removed in both amfd 
and amfnd.


When REG_COMP is not used, code is triggered that instantiates SUs before they 
are even registered properly! See the bottom of avd_node_up_evh(): since 
comp_sent is always false, avd_nd_reg_comp_evt_hdl() is called at this point. 
Instead, the response from REG_SU should be awaited, and only then should SUs 
be instantiated. Also interesting here is the error handling when REG_SU fails. 
Suppose immnd crashes so that amfnd cannot read from IMM during the REG_SU 
handling: should amfnd crash or respond with an error code to amfd? And what 
should amfd do with the failed REG_SU response?
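The ordering argued for above can be sketched as a small per-node state machine on the amfd side. This is only an illustration under assumptions: node_t, the state names and the handler names are made up and do not correspond to the real avd_* functions, message sending is replaced by a counter, and the failure state is a placeholder because the policy for a failed REG_SU is exactly the open question raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node registration states. */
typedef enum {
    NODE_ABSENT,
    NODE_REG_SU_SENT,   /* REG_SU sent, response not yet received */
    NODE_READY,         /* REG_SU acknowledged, SUs may be instantiated */
    NODE_REG_FAILED     /* REG_SU failed; recovery policy left open */
} node_state_t;

typedef struct {
    node_state_t state;
    int instantiate_requests;   /* stand-in for "instantiate" messages sent */
} node_t;

/* Node-up handler: send REG_SU only; do not instantiate anything yet. */
void on_node_up(node_t *node)
{
    assert(node != NULL);
    node->state = NODE_REG_SU_SENT;
}

/* REG_SU response handler: instantiate only after a successful response. */
void on_reg_su_resp(node_t *node, bool ok)
{
    assert(node != NULL);
    if (node->state != NODE_REG_SU_SENT)
        return;                      /* unexpected or duplicate response */
    if (ok) {
        node->state = NODE_READY;
        node->instantiate_requests++;
    } else {
        node->state = NODE_REG_FAILED;   /* no instantiation on failure */
    }
}
```

The point of the sketch is only the ordering: nothing is instantiated between node up and the REG_SU response, and a failed response never leads to instantiation.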


Reading from IMM in amfnd needs to be minimized and, if possible, confined to 
the handling of REG_SU. Today amfnd reads from IMM during the handling of an SI 
assignment. One problem is that the cstype for a CSI is unknown. Another 
problem is the component capability, which was moved (in B.04) into association 
objects that are children of comptype objects. Those objects can be read at 
REG_SU handling time and put into the comp object. But in order to know the 
capability for a specific cstype, the cstype needs to be known when the 
assignment comes.


Can the SI assignment message be extended with cstype information?
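If the message were extended that way, the assignment handling could look roughly like the sketch below: capabilities are cached per cstype in the comp object (for example at REG_SU time), and the cstype carried in the assignment message is used as the lookup key, with no IMM access. The struct layouts, field names and functions are assumptions for illustration only and do not reflect the real amfd/amfnd message format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CS_NAME_LEN 64
#define MAX_CSTYPES 8

/* Hypothetical capability values, not the real SAF/OpenSAF enum. */
typedef enum {
    COMP_CAP_UNKNOWN = 0,
    COMP_CAP_X_ACTIVE,
    COMP_CAP_1_ACTIVE,
    COMP_CAP_OTHER
} comp_cap_t;

/* One cached (cstype, capability) pair. */
typedef struct {
    char cstype[CS_NAME_LEN];
    comp_cap_t cap;
} cstype_cap_t;

/* Comp object holding the capabilities cached at REG_SU time. */
typedef struct {
    cstype_cap_t caps[MAX_CSTYPES];
    size_t n_caps;
} comp_t;

/* Assignment message; the cstype field is the hypothetical extension. */
typedef struct {
    char csi_name[CS_NAME_LEN];
    char cstype[CS_NAME_LEN];
} csi_assign_msg_t;

/* Cache one capability; returns 0 on success, -1 if the table is full. */
int comp_cache_cap(comp_t *comp, const char *cstype, comp_cap_t cap)
{
    assert(comp != NULL && cstype != NULL);
    if (comp->n_caps >= MAX_CSTYPES)
        return -1;
    snprintf(comp->caps[comp->n_caps].cstype, CS_NAME_LEN, "%s", cstype);
    comp->caps[comp->n_caps].cap = cap;
    comp->n_caps++;
    return 0;
}

/* Look up the capability for the cstype named in the assignment message. */
comp_cap_t comp_cap_for_assignment(const comp_t *comp, const csi_assign_msg_t *msg)
{
    assert(comp != NULL && msg != NULL);
    for (size_t i = 0; i < comp->n_caps; i++) {
        if (strcmp(comp->caps[i].cstype, msg->cstype) == 0)
            return comp->caps[i].cap;
    }
    return COMP_CAP_UNKNOWN;   /* cstype not cached for this component */
}
```

How an unknown cstype should be handled (reject the assignment, report an error to amfd, or something else) is not decided in this ticket, so the sketch simply returns COMP_CAP_UNKNOWN.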


Changed 14 months ago by hafe
* patch_waiting changed from no to yes

Changed 13 months ago by hafe

changeset: 3523:b750a1a063cc
branch: opensaf-4.2.x
parent: 3521:ed09cbfa05dd
user: Hans Feldt <hans.feldt@…>
date: Fri Apr 27 15:41:43 2012 +0200
summary: avsv/avd: instantiate SUs after registration (#2601)


changeset: 3525:aa57d1e2ad6f
tag: tip
user: Hans Feldt <hans.feldt@…>
date: Fri Apr 27 15:41:43 2012 +0200
summary: avsv/avd: instantiate SUs after registration (#2601)


remote: rev b750a1a063cc1720b9fc8e433d5e9b5f0d1fe5da sent
remote: rev aa57d1e2ad6f6edbab077126e2e0ddad53b56fdc sent


Patch 2 does not build; I am looking into that and will push it separately.


Changed 13 months ago by hafe
* milestone changed from 4.2.1 to future_releases

The urgent problem has been solved in http://devel.opensaf.org/ticket/1713 and 
this ticket.


Future (after 4.2.1 release) intended work is to:
* update and push "[PATCH 2 of 2] avsv: remove reg_comp code (#2601)"
* read from IMM only in REGSU context.




