Summary: Version-1 with minimal testing [#1235].
Review request for Trac Ticket(s): #1235
Peer Reviewer(s): Hans N., Nagendra, Bertil, Mathivanan
Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
Affected branch(es): default
Development branch: <<IF ANY GIVE THE REPO URL>>
--------------------------------
Impacted area       Impact y/n
--------------------------------
 Docs                    n
 Build system            n
 RPM/packaging           n
 Configuration files     n
 Startup scripts         n
 SAF services            y
 OpenSAF services        n
 Core libraries          n
 Samples                 n
 Tests                   n
 Other                   n

Comments (indicate scope for each "y" above):
---------------------------------------------
Please see the commit log below for details of the changes.
Node group shutdown, lock and unlock work fine with the 2N, NoRed and
NWay_Active models.
Note: Apply this patch and, after installation, work with the new imm.xml.

changeset 37292d239e8af54c5dc3cf8de885109a676aff00
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:30:58 +0530

    amfd : support shutdown, lock and unlock for 2N, NoRed and NWay_Active on NG [#1235]

    Description of important changes in files:
    *)comp.cc:
      -Utility function to check if any admin operation is going on for a
       component.
    *)node.cc:
      -A node can belong to several nodegroups. A utility function is
       added to check if all the nodegroups of a node are unlocked. If any
       nodegroup of a node is not unlocked, then SUs on any node of that
       nodegroup cannot be assigned any role, irrespective of the node
       admin state.
    *)nodegroup.cc:
      -Support for saAmfNGAdminState.
      -Support for shutdown, lock and unlock operations for the 2N, NoRed
       and NWay_Active models.
      -Shutdown and lock operations execute in parallel on all the nodes.
       Individual nodes also move to LOCKED state along with the nodegroup
       during shutdown and lock operations.
      -Before accepting any of the three supported operations, AMF
       verifies that the operation can be executed. AMF rejects the
       operation on the following conditions:
       --If any AMF entity deployed on any one of the nodes is unstable.
       --If a node contains an SU of the NWay or NpM redundancy models.
       Also, if the shutdown or lock operation is going to cause a
       complete service outage in an SG on the nodegroup, AMF only logs a
       notice to syslog, but the operation will continue.
      -For shutdown and lock operations, AMF picks each node of the
       nodegroup and calls the nodegroup admin operation handler
       (ng_admin()) for each SU of that node. This handler is written for
       the presently supported redundancy models in sg_2n_fsm.cc,
       sg_nored_fsm.cc and sg_nwayact_fsm.cc.
      -For the unlock operation, AMF unlocks all the nodes and enables the
       hosted SUs. It then calls the assignment algorithm for the
       redundancy model to which each SU belongs. Thus SUs will be
       assigned according to their ranks.
    *)role.cc:
      -If a controller switchover occurs while an admin operation is going
       on for the nodegroup, AMF clears all the parameters related to the
       admin operation.
    *)sg.cc:
      -Member function (is_sg_assigned_only_in_ng()) to check if the SG
       has assignments only on the nodes of the nodegroup.
      -Member function (check_sg_stability()) to check if any admin
       operation is going on for the SG.
      -Member function (is_sg_serviceable_outside_ng()) to check if the SG
       will have assignments outside the nodegroup after completion of the
       shutdown or lock operation.
    *)sgproc.cc:
      -Whenever AMFD gets an assignment response from AMFND, it runs the
       SG FSM for that SU. After this, AMF evaluates whether it has to
       respond to IMM for a pending admin operation on AMF entities.
       process_su_si_response_for_ng() does the same for a nodegroup.
       This function also moves individual nodes and the nodegroup from
       SHUTTING_DOWN to LOCKED state.
    *)su.cc:
      -SU in-service availability criteria and SU instantiability criteria
       also include a nodegroup admin state check.
      -Member functions (any_susi_fsm_in_<*>()) to check if any SUSI of
       the SU is undergoing modification or removal of assignments.
      -Member function (check_su_stability()) to check if any admin
       operation is going on for the SU.

changeset 0ad73bac0187236a7bc25a46b327e6eff310e125
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:31:23 +0530

    amfd: modify or remove assignments of 2N SU during admin op on NG [#1235]

    Handles modification of assignments in an SU of a 2N SG because of a
    lock or shutdown operation on a node group.
    If the SU does not have any SIs assigned to it, AMF will try to
    instantiate new SUs in the SG. If the SU has assignments, they are
    handled according to the following cases:
    a)If both the active SU and the standby SU are part of the nodegroup,
      the case is handled as equivalent to an SG lock or shutdown.
    b)If only the active or only the standby SU is part of the nodegroup,
      the case is handled as equivalent to an SU lock or shutdown.

changeset 6b047af92de3be258f8979b7696ba585acc1d36e
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:31:50 +0530

    amfd: modify assignments of NWay_Active SU during admin op on NG [#1235].

    Handles modification of assignments in an SU of an NWay_Active SG
    because of a lock or shutdown operation on a node group.
    If the SU does not have any SIs assigned to it, AMF will try to
    instantiate new SUs in the SG. If the SU has assignments, then
    depending on whether the operation is lock or shutdown, the quiesced
    or quiescing state will be sent for the SU.

changeset e3500887cbe894d46b41b4baada81ad85d231ca6
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:32:17 +0530

    amfd: modify assignments of NoRed SU during admin op on NG [#1235].

    Handles modification of assignments in an SU of a NoRed SG because of
    a lock or shutdown operation on a node group.
    If the SU does not have any SIs assigned to it, AMF will try to
    instantiate new SUs in the SG. If the SU has assignments, then
    depending on whether the operation is lock or shutdown, the quiesced
    or quiescing state will be sent for the SU.

changeset d3e8bf3b023da6d5e1e6fc03cd9fded28dd32fd4
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:32:58 +0530

    amfd : checkpoint saAmfNGAdminState of NG [#1235].

    Checkpointing of saAmfNGAdminState is needed because of two cases:
    1)If a controller fail-over or switch-over occurs during a shutdown
      operation, the new active controller will have to move the NG to
      LOCKED admin state after completion of the shutdown operation.
    2)For the 2N model (in future also for NpM and NWay), AMF performs an
      admin operation equivalent to SG shutdown or lock using
      saAmfSGAdminState if both the active and standby SUs are hosted on
      nodes of the nodegroup. So after completion of the operation AMF
      will have to revert saAmfSGAdminState.

    In brief, the changes are:
    -The patch raises the AMF MBCSV sub part version to 7, as a new async
     update for saAmfNGAdminState is added.
    -New async update counter for updates related to nodegroups
     (cb->async_updt_cnt.ng_updt).
    -Encode and decode utility for saAmfNGAdminState.
    -No async update is sent for saAmfNGAdminState if the peer AMFD has a
     lower AVD_MBCSV_SUB_PART_VERSION than the updated version 7
     (AVD_MBCSV_SUB_PART_VERSION_7). This is the case when the new AMFD
     is active with version AVD_MBCSV_SUB_PART_VERSION_7 and the old AMFD
     is standby.
    -Similarly, during cold sync, if the new AMFD comes up as standby with
     AVD_MBCSV_SUB_PART_VERSION_7 and the old AMFD is active with a
     version < AVD_MBCSV_SUB_PART_VERSION_7, then
     cb->async_updt_cnt.ng_updt will not be matched.

changeset a8e73de68f92d238730fc29a44ccfe4ee088f5ee
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:33:22 +0530

    amfd: send state change notification for saAmfNGAdminState [#1235].

    The patch sends a state change notification for saAmfNGAdminState
    with minorId (0x72) for three admin state values:
    SA_AMF_ADMIN_UNLOCKED = 1,
    SA_AMF_ADMIN_LOCKED = 2,
    SA_AMF_ADMIN_SHUTTING_DOWN = 4

changeset 4163de3d0df4a425960e296c9fd5d12822e21b57
Author: praveen.malv...@oracle.com
Date:   Mon, 02 Mar 2015 18:33:51 +0530

    amfd: show nodegroups and their admin state in amf-state command [#1235].
    amf-state ng
    safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster
            saAmfNGAdminState=UNLOCKED(1)
    safAmfNodeGroup=PLs,safAmfCluster=myAmfCluster
            saAmfNGAdminState=UNLOCKED(1)
    safAmfNodeGroup=SCs,safAmfCluster=myAmfCluster
            saAmfNGAdminState=UNLOCKED(1)
    safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster
            saAmfNGAdminState=LOCKED(2)

Complete diffstat:
------------------
 osaf/services/saf/amf/amfd/chkop.cc           |   13 +
 osaf/services/saf/amf/amfd/ckpt_dec.cc        |   83 +++++++++++-
 osaf/services/saf/amf/amfd/ckpt_edu.cc        |    2 +
 osaf/services/saf/amf/amfd/ckpt_enc.cc        |   24 +++-
 osaf/services/saf/amf/amfd/comp.cc            |   14 ++
 osaf/services/saf/amf/amfd/include/ckpt.h     |    4 +-
 osaf/services/saf/amf/amfd/include/ckpt_msg.h |    1 +
 osaf/services/saf/amf/amfd/include/comp.h     |    1 +
 osaf/services/saf/amf/amfd/include/node.h     |   16 ++-
 osaf/services/saf/amf/amfd/include/ntf.h      |    2 +
 osaf/services/saf/amf/amfd/include/sg.h       |   27 +++-
 osaf/services/saf/amf/amfd/include/su.h       |    3 +
 osaf/services/saf/amf/amfd/node.cc            |   17 ++
 osaf/services/saf/amf/amfd/nodegroup.cc       |  418 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 osaf/services/saf/amf/amfd/role.cc            |   17 ++-
 osaf/services/saf/amf/amfd/sg.cc              |   82 +++++++++++
 osaf/services/saf/amf/amfd/sg_2n_fsm.cc       |  190 +++++++++++++++++++++++++-
 osaf/services/saf/amf/amfd/sg_nored_fsm.cc    |   40 +++++
 osaf/services/saf/amf/amfd/sg_nwayact_fsm.cc  |   43 ++++++
 osaf/services/saf/amf/amfd/sgproc.cc          |  122 ++++++++++++++++-
 osaf/services/saf/amf/amfd/su.cc              |   72 +++++++++-
 osaf/services/saf/amf/config/amf_classes.xml  |    8 +
 osaf/tools/scripts/amf-state                  |   27 +++
 23 files changed, 1186 insertions(+), 40 deletions(-)

Testing Commands:
-----------------
For now, use the new imm.xml generated after installation of opensaf with
these patches applied.
1)On a four-node cluster configuration, bring up all three application
  configurations (2N, NoRed and NWay_Active) attached to ticket #1235.
2)Create a node group like this:
  immcfg -c SaAmfNodeGroup safAmfNodeGroup=ng,safAmfCluster=myAmfCluster -a saAmfNGNodeList=safAmfNode=SC-1,safAmfCluster=myAmfCluster
3)Add one more node to the node group:
  immcfg safAmfNodeGroup=ng,safAmfCluster=myAmfCluster -a saAmfNGNodeList+=safAmfNode=SC-2,safAmfCluster=myAmfCluster
4)Perform shutdown of the node group:
  amf-adm shutdown safAmfNodeGroup=ng,safAmfCluster=myAmfCluster
  SC-1, SC-2 and this nodegroup will move to locked state and all
  application assignments will be removed.
5)Perform unlock of the nodegroup:
  amf-adm unlock safAmfNodeGroup=ng,safAmfCluster=myAmfCluster
  SC-1, SC-2 and this nodegroup will move to unlocked state and the
  application SUs will be assigned.
6)Check the state of the nodegroups:
  amf-state ng
  safAmfNodeGroup=AllNodes,safAmfCluster=myAmfCluster
          saAmfNGAdminState=UNLOCKED(1)
  safAmfNodeGroup=PLs,safAmfCluster=myAmfCluster
          saAmfNGAdminState=UNLOCKED(1)
  safAmfNodeGroup=SCs,safAmfCluster=myAmfCluster
          saAmfNGAdminState=UNLOCKED(1)
  safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster
          saAmfNGAdminState=UNLOCKED(1)

Similarly, lock and unlock can also be performed.

After step 4 the nodegroup can be deleted with this command:
  immcfg -d "safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster"
After this the nodegroup can be created again in locked state:
  immcfg -c SaAmfNodeGroup safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster -a saAmfNGNodeList=safAmfNode=SC-1,safAmfCluster=myAmfCluster -a saAmfNGAdminState=2
  immcfg safAmfNodeGroup=TestNG,safAmfCluster=myAmfCluster -a saAmfNGNodeList+=safAmfNode=SC-2,safAmfCluster=myAmfCluster
If step 5 is now performed, it will unlock the nodegroup and its nodes.

Note: If a node group is locked and then deleted, its nodes will remain in
locked state. If the same or another nodegroup is then created with the
same nodes without setting the attribute -a saAmfNGAdminState=2, the
nodegroup will be created in unlocked state. So, to unlock all the nodes,
first lock the nodegroup.
After this, unlocking the nodegroup will unlock all of its locked nodes.

Testing, Expected Results:
--------------------------
Pass

Conditions of Submission:
-------------------------
Ack from reviewers and completion of some more unit testing.

Arch      Built     Started    Linux distro
-------------------------------------------
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n

Reviewer Checklist:
-------------------
[Submitters: make sure that your review doesn't trigger any checkmarks!]

Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
    that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
    (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
    Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
    like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
    cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
    too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
    Instead you should place your content in a public tree to be pulled.
___ You have too many commits attached to an e-mail; resend as threaded
    commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
    of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
    comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)

___ Your computer has a badly configured date and time, confusing the
    threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
    for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
    do not contain the patch that updates the Doxygen manual.
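
Illustration for reviewers: the 2N handling described for changeset
0ad73bac0187 (an SU with no SIs assigned, case a, case b) can be read as a
small classification step. The sketch below is only an illustration under
stated assumptions and is not code from the patch: NgAction, Su2N and
classify_2n_su() are hypothetical names, and it assumes the SG currently
has exactly one active and one standby SU.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical names throughout; none of these are amfd identifiers.
enum class NgAction {
  kInstantiateNewSus,   // SU has no SIs assigned: try to instantiate new SUs
  kTreatAsSgOperation,  // case a): active and standby SUs are both in the NG
  kTreatAsSuOperation   // case b): only the active or only the standby is in the NG
};

struct Su2N {
  std::string node;      // name of the node hosting this SU
  bool has_assignments;  // true if at least one SI is assigned to this SU
};

// Decide how a lock or shutdown of a nodegroup is handled for one SU of a
// 2N SG. 'su' must be hosted on a node of 'ng_nodes'. 'active' and 'standby'
// are the SG's currently assigned SUs (assumed to exist in this sketch).
NgAction classify_2n_su(const Su2N& su, const Su2N& active,
                        const Su2N& standby,
                        const std::set<std::string>& ng_nodes) {
  assert(ng_nodes.count(su.node) > 0);  // precondition of this sketch

  if (!su.has_assignments) {
    return NgAction::kInstantiateNewSus;
  }
  const bool active_in_ng = ng_nodes.count(active.node) > 0;
  const bool standby_in_ng = ng_nodes.count(standby.node) > 0;
  if (active_in_ng && standby_in_ng) {
    return NgAction::kTreatAsSgOperation;
  }
  return NgAction::kTreatAsSuOperation;
}
```

With a nodegroup of SC-1 and SC-2, an assigned SU whose peer is also hosted
in the nodegroup falls under case a), while an assigned SU whose peer is
hosted on a node outside the nodegroup falls under case b).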