Hi Tran,

Here is a patch you can overlay to fix this crash.

Alex

On 09/21/2018 06:31 AM, Tran Thuan wrote:

__________________________________________________________________
NOTICE: This email was received from an EXTERNAL sender
__________________________________________________________________

Hi Alex,

I think you need to update samples/amf/container/README.
Also, when I try the following steps, AMFD crashes.

root@SC-1:/opt/amf_demo# immcfg -f AppConfig-container.xml
root@SC-1:/opt/amf_demo# immcfg -f AppConfig-contained-2N.xml
root@SC-1:/opt/amf_demo# amf-adm unlock-in safSu=SU1,safSg=Container,safApp=Container
root@SC-1:/opt/amf_demo# amf-adm unlock safSu=SU1,safSg=Container,safApp=Container
root@SC-1:/opt/amf_demo# amf-adm lock safSu=SU1,safSg=Container,safApp=Container

2018-09-21 16:58:38.338 SC-1 osafamfnd[512]: NO Assigning 'safSi=Container,safApp=Container' ACTIVE to 'safSu=SU1,safSg=Container,safApp=Container'
2018-09-21 16:58:38.339 SC-1 amf_container_demo[739]: csi set callback for comp: safComp=Container,safSu=SU1,safSg=Container,safApp=Container
2018-09-21 16:58:38.339 SC-1 amf_container_demo[739]: CSI Set - add 'safCsi=Container1,safSi=Container,safApp=Container' HAState Active
2018-09-21 16:58:38.339 SC-1 amf_container_demo[739]: name: Contained1, value: AAAA
2018-09-21 16:58:38.339 SC-1 amf_container_demo[739]: name: Contained1, value: BBBB
2018-09-21 16:58:38.340 SC-1 osafamfnd[512]: NO Assigned 'safSi=Container,safApp=Container' ACTIVE to 'safSu=SU1,safSg=Container,safApp=Container'
2018-09-21 16:58:38.413 SC-1 osafamfnd[512]: NO 'safSu=SU1,safSg=Contained_2N,safApp=Contained_2N' Presence State UNINSTANTIATED => INSTANTIATING
2018-09-21 16:58:38.414 SC-1 amf_container_demo[739]: =====Contained Instantiate Callback====>
2018-09-21 16:58:38.414 SC-1 amf_container_demo[739]: comp:safComp=Contained_1,safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
2018-09-21 16:58:38.414 SC-1 amf_container_demo[739]: responding with TRY_AGAIN
2018-09-21 16:58:38.417 SC-1 amf_container_demo[739]: <===========================================
2018-09-21 16:58:38.418 SC-1 amf_container_demo[739]: =====Contained Clean Up Callback====>
2018-09-21 16:58:38.418 SC-1 amf_container_demo[739]: comp:safComp=Contained_1,safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
2018-09-21 16:58:38.418 SC-1 amf_container_demo[739]: <===========================================
2018-09-21 16:58:38.422 SC-1 amf_container_demo[739]: =====Contained Instantiate Callback====>
2018-09-21 16:58:38.422 SC-1 amf_container_demo[739]: comp:safComp=Contained_1,safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
2018-09-21 16:58:38.422 SC-1 amf_container_demo[739]: <===========================================
2018-09-21 16:58:38.424 SC-1 osafamfnd[512]: NO 'safSu=SU1,safSg=Contained_2N,safApp=Contained_2N' Presence State INSTANTIATING => INSTANTIATED
2018-09-21 17:19:07.776 SC-1 osafamfnd[512]: ER AMFD has unexpectedly crashed. Rebooting node
2018-09-21 17:19:07.776 SC-1 osafamfnd[512]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60
2018-09-21 17:19:07.778 SC-1 osafimmnd[438]: NO Implementer locally disconnected. Marking it as doomed 5 <23, 2010f> (safAmfService)
2018-09-21 17:19:07.799 SC-1 opensaf_reboot: Rebooting local node; timeout=60

Best Regards,
Thuan

-----Original Message-----
From: [1]nagen...@hasolutions.in [2]<nagen...@hasolutions.in>
Sent: Friday, September 7, 2018 12:57 PM
To: Alex Jones [3]<ajo...@rbbn.com>; Gary Lee [4]<gary....@dektech.com.au>; [5]hans.nordeb...@ericsson.com; [6]ravisekhar.ko...@oracle.com
Cc: [7]opensaf-devel@lists.sourceforge.net
Subject: Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

Hi Alex,

Thanks for the patch. From my side Ack.
I wish that I could have tested the following areas (I assume you would have covered them):
- Headless enabled test cases
- CSI Dep, SI Dep testing (in 2N red model)
- Combinations of Admin operations on Container and contained (in all 5 red models for contained) with fault scenarios.
- Escalations of contained components.

Thanks,
Nagendra, 91-9866424860
High Availability Solutions Pvt. Ltd. ([8]www.hasolutions.in)
- OpenSAF Support and Services

--------- Original Message ---------
Subject: Re: [PATCH 1/1] amf: add support for container/contained [#70]
From: "Alex Jones" [9]<ajo...@rbbn.com>
Date: 9/6/18 8:52 pm
To: [10]nagen...@hasolutions.in, "Gary Lee" [11]<gary....@dektech.com.au>, [12]hans.nordeb...@ericsson.com, [13]ravisekhar.ko...@oracle.com
Cc: [14]opensaf-devel@lists.sourceforge.net

Hi Nagu,

Here's a patch that fixes your issue in test #1.

For the other code review issues, is it OK if I just add them when I push the final patch? Or do you want to review them now?

Alex

On 08/30/2018 01:44 AM, [15]nagen...@hasolutions.in wrote:

Hi Alex,

Thanks for your response.

For Test #2, I had configured all SUs on the single node SC-1. So, 2 container SUs and 2 contained SUs are on the same node.
In such cases, the implementation could have only one SU of that node (perhaps the higher-ranked SU) be the container for all the contained SUs of that node.

Thanks,
Nagendra, 91-9866424860
High Availability Solutions Pvt. Ltd. ([16]www.hasolutions.in)
- OpenSAF Support and Services

--------- Original Message ---------
Subject: Re: [PATCH 1/1] amf: add support for container/contained [#70]
From: "Alex Jones" [17]<ajo...@rbbn.com>
Date: 8/29/18 9:29 pm
To: [18]nagen...@hasolutions.in, "Gary Lee" [19]<gary....@dektech.com.au>, [20]hans.nordeb...@ericsson.com, [21]ravisekhar.ko...@oracle.com
Cc: [22]opensaf-devel@lists.sourceforge.net

Hi Nagu,

I have a fix for your issue test #1.
I will send out a patch along with changes for code review #1 and #2.

For issue test #2, I think this needs to be handled in the configuration. In this case, because there is no explicit node set for the contained SUs, su.cc:map_su_to_node will assign a node in the node group. The code is assigning it to SC-2 in this case, because another SU has been assigned to SC-1, even though there is no container on SC-2.

I'm not sure how we can get around this without explicitly setting the contained host node in the configuration. Since the container csi has not yet been assigned, we can't map it to a container, and so we can't figure out which container we should be on the same node as.

Am I right here?

Alex

On 08/28/2018 09:56 AM, [23]nagen...@hasolutions.in wrote:

Hi Alex,

Code review:
1. Headers for a few functions are missing.
2. Clc.cc: Need to add '0' in place of avnd_comp_clc_inst_try_again_hdler in other fsm states.

Testing:
1. Uploaded AppConfig-container.xml and AppConfig-contained-2N.xml
Performed:
amf-adm unlock-in safSu=SU1,safSg=Container,safApp=Container
amf-adm unlock safSu=SU1,safSg=Container,safApp=Container

Even if I don't perform the following, the contained components are instantiated.
amf-adm unlock-in safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
amf-adm unlock safSu=SU1,safSg=Contained_2N,safApp=Contained_2N

Aug 28 19:15:11 nags-VirtualBox osafamfnd[28278]: NO 'safSu=SU1,safSg=Contained_2N,safApp=Contained_2N' Presence State UNINSTANTIATED => INSTANTIATING

immlist safSu=SU1,safSg=Contained_2N,safApp=Contained_2N will show saAmfSUPresenceState 3(instantiated) and saAmfSUAdminState 3(locked-in)

Now further admin operations on safSu=SU1,safSg=Contained_2N,safApp=Contained_2N will fail:

root@nags-VirtualBox:/home/nags/views/ajones-review/samples/amf/container# amf-adm unlock-in safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
error - saImmOmAdminOperationInvoke_2 admin-op RETURNED: SA_AIS_ERR_BAD_OPERATION (20)
error-string: Can't instantiate 'safSu=SU1,safSg=Contained_2N,safApp=Contained_2N', whose presence state is '3'

2. This is related to Specs 6.2.2 Assignment of the Container CSI:
"If there are multiple container components on a node which have the active HA state for a particular container CSI, and one or more service units on the same node whose contained components are configured with the same container CSI, it is implementation-defined how the Availability Management Framework selects container components to handle the life cycle of the contained components of these service units. However, all contained components of a service unit must have the same associated container component."

Uploaded AppConfig-container.xml and AppConfig-contained-2N.xml with one difference: all SUs of container and contained are configured on SC-1.
Perform the following operations, but safSu=SU2,safSg=Contained_2N,safApp=Contained_2N will not get assignments.
amf-adm unlock-in safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
amf-adm unlock safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
amf-adm unlock-in safSu=SU2,safSg=Contained_2N,safApp=Contained_2N
amf-adm unlock safSu=SU2,safSg=Contained_2N,safApp=Contained_2N
amf-adm unlock-in safSu=SU1,safSg=Container,safApp=Container
amf-adm unlock safSu=SU1,safSg=Container,safApp=Container
amf-adm unlock-in safSu=SU2,safSg=Container,safApp=Container
amf-adm unlock safSu=SU2,safSg=Container,safApp=Container

root@nags-VirtualBox:/home/nags/views/ajones-review/samples/amf/container# amf-state siass
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU1\,safSg=Contained_2N\,safApp=Contained_2N,safSi=Contained_2N_1,safApp=Contained_2N
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU1\,safSg=Container\,safApp=Container,safSi=Container,safApp=Container
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU2\,safSg=Container\,safApp=Container,safSi=Container,safApp=Container
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

I will do further testing. The documentation needs to be done if you haven't tested:
- Headless enabled
- CSI Dep, SI Dep testing
- Etc.

Thanks,
Nagendra, 91-9866424860
High Availability Solutions Pvt. Ltd.
([24]www.hasolutions.in)
- OpenSAF Support and Services

--------- Original Message ---------
Subject: Re: [PATCH 1/1] amf: add support for container/contained [#70]
From: "Alex Jones" [25]<ajo...@rbbn.com>
Date: 8/15/18 11:10 pm
To: "Gary Lee" [26]<gary....@dektech.com.au>, [27]hans.nordeb...@ericsson.com, [28]ravisekhar.ko...@oracle.com, [29]nagen...@hasolutions.in
Cc: [30]opensaf-devel@lists.sourceforge.net

G'day Gary,

I see you were adding the XML file dynamically with "immcfg -f". I hadn't tried that. I hadn't tried killing the sample app, either.

Here is a patch that should fix both issues. Apply it on top of the latest big one I sent.

Alex

On 08/13/2018 10:37 PM, Gary Lee wrote:

Hi Alex

I modified AppConfig-container.xml and changed saAmfSgtRedundancyModel from 4 (NwayAct) to 1 (2N).

The xml still loads and I could unlock, resulting in:

root@SC-1:/var/log# immlist safVersion=1,safSgType=Container
Name                       Type          Value(s)
========================================================================
safVersion                 SA_STRING_T   safVersion=1
saAmfSgtValidSuTypes       SA_NAME_T     safVersion=1,safSuType=Container (32)
saAmfSgtRedundancyModel    SA_UINT32_T   1 (0x1)

safSISU=safSu=SU2\,safSg=Container\,safApp=Container,safSi=Container,safApp=Container
        saAmfSISUHAState=STANDBY(2)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
safSISU=safSu=SU1\,safSg=Container\,safApp=Container,safSi=Container,safApp=Container
        saAmfSISUHAState=ACTIVE(1)
        saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

Also, have you tried killing the amf_container_demo binary?

Thanks
Gary

On 14/08/18 05:00, Alex Jones wrote:

Hi Gary,

I just resubmitted a new patch which breaks out the different components, and addresses the other comments here.

But, #2 (rejecting all but NWay-active for container) should already be in there. Is there a specific test you ran that didn't work?
Alex

On 08/13/2018 02:43 AM, Gary Lee wrote:

Hi Alex

Some initial comments:

0. Is it possible to split up the patch into amfd / amfnd / common / samples. Just makes it easier to reply inline.

1. Please compile the container demo by default, and make amf_container_script world executable. Eg.

diff --git a/samples/amf/Makefile.am b/samples/amf/Makefile.am
index 447dedd..7ebf9c3 100644
--- a/samples/amf/Makefile.am
+++ b/samples/amf/Makefile.am
@@ -19,5 +19,5 @@ include $(top_srcdir)/Makefile.common
 MAINTAINERCLEANFILES = Makefile.in
-SUBDIRS = sa_aware non_sa_aware wrapper proxy api_demo
+SUBDIRS = sa_aware non_sa_aware wrapper proxy api_demo container
diff --git a/samples/amf/container/amf_container_script b/samples/amf/container/amf_container_script
old mode 100644
new mode 100755
diff --git a/samples/configure.ac b/samples/configure.ac
index 7cf803e..9765d54 100644
--- a/samples/configure.ac
+++ b/samples/configure.ac
@@ -67,6 +67,7 @@ AC_CONFIG_FILES([ \
         amf/wrapper/Makefile \
         amf/proxy/Makefile \
         amf/api_demo/Makefile \
+        amf/container/Makefile \
         cpsv/Makefile \
         cpsv/ckpt_demo/Makefile \
         cpsv/ckpt_track_demo/Makefile \

2. We should probably reject CCBs that set saAmfSgtRedundancyModel to anything other than NWayActive, for Containers.

3. Do we need to bump the msg format version to AVSV_AVD_AVND_MSG_FMT_VER_8? An old amfnd will assert if it gets an AVSV_D2N_CONTAINED_SU_MSG_INFO msg.

Thanks
Gary

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! [31]http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
[32]Opensaf-devel@lists.sourceforge.net
[33]https://lists.sourceforge.net/lists/listinfo/opensaf-devel

References

   1. mailto:nagen...@hasolutions.in
   2. mailto:nagen...@hasolutions.in
   3. mailto:ajo...@rbbn.com
   4. mailto:gary....@dektech.com.au
   5. mailto:hans.nordeb...@ericsson.com
   6. mailto:ravisekhar.ko...@oracle.com
   7. mailto:opensaf-devel@lists.sourceforge.net
   8. https://protect-us.mimecast.com/s/Mly8C2k90ki1Dprf2bnqw?domain=hasolutions.in
   9. mailto:ajo...@rbbn.com
  10. mailto:nagen...@hasolutions.in
  11. mailto:gary....@dektech.com.au
  12. mailto:hans.nordeb...@ericsson.com
  13. mailto:ravisekhar.ko...@oracle.com
  14. mailto:opensaf-devel@lists.sourceforge.net
  15. mailto:nagen...@hasolutions.in
  16. https://protect-us.mimecast.com/s/Mly8C2k90ki1Dprf2bnqw?domain=hasolutions.in
  17. mailto:ajo...@rbbn.com
  18. mailto:nagen...@hasolutions.in
  19. mailto:gary....@dektech.com.au
  20. mailto:hans.nordeb...@ericsson.com
  21. mailto:ravisekhar.ko...@oracle.com
  22. mailto:opensaf-devel@lists.sourceforge.net
  23. mailto:nagen...@hasolutions.in
  24. https://protect-us.mimecast.com/s/Mly8C2k90ki1Dprf2bnqw?domain=hasolutions.in
  25. mailto:ajo...@rbbn.com
  26. mailto:gary....@dektech.com.au
  27. mailto:hans.nordeb...@ericsson.com
  28. mailto:ravisekhar.ko...@oracle.com
  29. mailto:nagen...@hasolutions.in
  30. mailto:opensaf-devel@lists.sourceforge.net
  31. https://protect-us.mimecast.com/s/JURxC319A1cqrp3HQaEHW?domain=sdm.link
  32. mailto:Opensaf-devel@lists.sourceforge.net
  33. https://protect-us.mimecast.com/s/JgMIC4x9BxSgwBPCMiz58?domain=lists.sourceforge.net
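
Gary's review comment 2 in the quoted thread asks that CCBs setting saAmfSgtRedundancyModel to anything other than NWayActive be rejected for containers. The following is only a rough sketch of such a check, not the code in the patch: the helper name and its boolean parameter are hypothetical, and the only values taken from the thread are 1 (2N) and 4 (NwayAct).

```cpp
#include <cstdint>

// saAmfSgtRedundancyModel values as quoted in the thread:
// 1 is 2N, 4 is NWayActive.
constexpr uint32_t kRedModel2N = 1;
constexpr uint32_t kRedModelNWayActive = 4;

// Hypothetical helper (not an OpenSAF function): returns true when a CCB
// may set `model` on an SG type. For SG types that hold container SUs only
// NWayActive is accepted; other SG types are not restricted by this check.
bool redundancy_model_allowed(bool sg_type_is_container, uint32_t model) {
  if (!sg_type_is_container) {
    return true;
  }
  return model == kRedModelNWayActive;
}
```

With a check like this, the change Gary tried (switching the Container SG type from 4 to 1) would be refused at CCB validation time instead of loading successfully.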
commit 430dbfdd4fcd713a9fdac6371b09f90563a67023
Author: Alex Jones <ajo...@rbbn.com>
Date:   Fri Sep 21 13:57:44 2018 -0400

    amf: fix crash of amfd [#70]

    Don't try to bring down or wait for contained components, if none have susi
    assignments.

diff --git a/samples/amf/container/README b/samples/amf/container/README
index c76e32da5..74815d1ea 100644
--- a/samples/amf/container/README
+++ b/samples/amf/container/README
@@ -1,36 +1,41 @@
-This directory contains a sample implementation of an
-SA-Aware AMF component.
-
-amf_demo.c contains the skeleton of an SA-Aware AMF component. All required
-callbacks are implemented and responds OK when requested. To be used with
-any of the configuration files mentioned below. This implementation can be
-used as a starting point to see what happens when you do admin operations
-such as lock, lock-instantiation etc on SU or any other level containing
-the component.
+This directory contains a sample implementation of container and contained
+SA-Aware AMF components.
+
+amf_container_demo.c contains the skeleton of both a container and a contained
+SA-Aware AMF component. All required callbacks are implemented and responds OK
+when requested. The contained application will respond with TRY_AGAIN for the
+first instantiation to demonstrate the use of TRY_AGAIN when instantiating a
+contained component. To be used with any of the configuration files mentioned
+below. This implementation can be used as a starting point to see what happens
+when you do admin operations such as lock, lock-instantiation, etc on SU or any
+other level containing the component.

 Logging output is done to the system log (normally /var/log/message).

-Appconfig-2N.xml: This file contains the AMF model for an application running
-in a 2N redundancy model. The amf_demo is configured to run on the 2 controllers.
+AppConfig-contained-2N.xml: This file contains the AMF model for a contained
+application running in a 2N redundancy model. The amf_container_demo is
+configured to run on the 2 controllers.

-Appconfig-nwayactive.xml: This file contains the AMF model for an application
-running in an NWayActive redundancy model. The configuration contains 5 SUs
-That will run on the "allnodes" AMF node group.
+AppConfig-container.xml: This file contains the AMF model for the container
+application running in an NWayActive redundancy model.

 Note that the SU admin state has the value UNLOCKED-INSTANTIATION(3). This
 means that the SUs needs to be unlocked after the file has been loaded.

 Some steps to follow:

-1. Install amf_demo into /opt/amf_demo
-2. Install amf_demo_script into /opt/amf_demo
-3. Load configuration: immcfg -f AppConfig-2N.xml
+1. Install amf_container_demo into /opt/amf_demo
+2. Install amf_container_script into /opt/amf_demo
+3. Load configuration:
+    immcfg -f AppConfig-contained-2N.xml
+    immcfg -f AppConfig-container.xml
 4. Unlock instantiation:
-    amf-adm unlock-in safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
-    amf-adm unlock-in safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
+    amf-adm unlock-in safSu=SU1,safSg=Container,safApp=Container
+    amf-adm unlock-in safSu=SU2,safSg=Container,safApp=Container
+    amf-adm unlock-in safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
+    amf-adm unlock-in safSu=SU2,safSg=Contained_2N,safApp=Contained_2N
 5. Unlock:
-    amf-adm unlock safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1
-    amf-adm unlock safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
-6. Run below command for invocation of CSI Attribute Change Callback :
-    immcfg -a saAmfCSIAttriValue+=CCCC safCsiAttr=AmfDemo1,safCsi=AmfDemo,safSi=AmfDemo,safApp=AmfDemo1
-
+    amf-adm unlock safSu=SU1,safSg=Container,safApp=Container
+    amf-adm unlock safSu=SU2,safSg=Container,safApp=Container
+    amf-adm unlock safSu=SU1,safSg=Contained_2N,safApp=Contained_2N
+    amf-adm unlock safSu=SU2,safSg=Contained_2N,safApp=Contained_2N
diff --git a/src/amf/amfd/sgproc.cc b/src/amf/amfd/sgproc.cc
index 7fba62ba7..780421a92 100644
--- a/src/amf/amfd/sgproc.cc
+++ b/src/amf/amfd/sgproc.cc
@@ -2493,7 +2493,7 @@ static uint32_t shutdown_contained_sus(AVD_CL_CB *cb, AVD_SU *container_su,
                                        SaAmfHAStateT state) {
   TRACE_ENTER();

-  uint32_t rc(NCSCC_RC_FAILURE);
+  uint32_t rc(NCSCC_RC_NO_OBJECT);

   // get the container csi
   AVD_COMP_CSI_REL *container_csi_rel(nullptr);
@@ -2526,6 +2526,11 @@ static uint32_t shutdown_contained_sus(AVD_CL_CB *cb, AVD_SU *container_su,
       if (su->list_of_comp.front()->saAmfCompContainerCsi == container_csi) {
         su->set_readiness_state(SA_AMF_READINESS_OUT_OF_SERVICE);

+        if (!su->list_of_susi) {
+          // nothing to do
+          continue;
+        }
+
         if (su->list_of_susi->state == SA_AMF_HA_ACTIVE)
           rc = avd_sg_su_si_mod_snd(cb, su, SA_AMF_HA_QUIESCED);
         else
@@ -2590,8 +2595,14 @@ uint32_t avd_sg_su_si_mod_snd(AVD_CL_CB *cb, AVD_SU *su, SaAmfHAStateT state) {
   if (su->container() && !su->wait_for_contained_to_quiesce) {
     TRACE("this is a container su; need to shut down contained sus");
     rc = shutdown_contained_sus(cb, su, state);
-    su->wait_for_contained_to_quiesce = true;
-    goto done;
+
+    if (rc == NCSCC_RC_NO_OBJECT) {
+      // there were no contained sus to shutdown
+    }
+    else {
+      su->wait_for_contained_to_quiesce = true;
+      goto done;
+    }
   }

   /* change the state for all assignments to the specified state. */
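
For readers following the sgproc.cc hunks, here is a stripped-down model of the control flow the patch introduces. It is only an illustrative sketch with made-up types (ContainedSu, ContainerSu, Rc, and both function names are hypothetical stand-ins, not the amfd classes): a contained SU without an assignment is skipped, and the container waits only if at least one contained SU was actually asked to shut down.

```cpp
#include <vector>

// Hypothetical stand-ins for the amfd return codes and SU objects.
enum class Rc { kSuccess, kNoObject };

struct ContainedSu {
  bool has_assignment = false;      // models su->list_of_susi != nullptr
  bool out_of_service = false;
  bool shutdown_requested = false;
};

struct ContainerSu {
  std::vector<ContainedSu*> contained;
  bool wait_for_contained_to_quiesce = false;
  bool own_state_changed = false;   // set once the container's own
                                    // assignments are modified
};

// Models shutdown_contained_sus(): starts from "no object" and only
// reports success if some contained SU had an assignment to bring down.
Rc shutdown_contained(ContainerSu& container) {
  Rc rc = Rc::kNoObject;
  for (ContainedSu* su : container.contained) {
    su->out_of_service = true;
    if (!su->has_assignment) {
      continue;  // nothing to do; previously this case was dereferenced
    }
    su->shutdown_requested = true;
    rc = Rc::kSuccess;
  }
  return rc;
}

// Models the container branch of avd_sg_su_si_mod_snd(): wait for the
// contained SUs only when there was something to shut down.
Rc change_container_state(ContainerSu& container) {
  if (!container.wait_for_contained_to_quiesce) {
    if (shutdown_contained(container) != Rc::kNoObject) {
      container.wait_for_contained_to_quiesce = true;
      return Rc::kSuccess;  // resume later, after contained SUs quiesce
    }
  }
  container.own_state_changed = true;
  return Rc::kSuccess;
}
```

In the scenario from Tran's report the contained SU is instantiated but has no assignment, so in this model the container's state change proceeds immediately instead of waiting on (or dereferencing) an assignment that does not exist.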