Yes, we could see if this can be done (for the PI components) in AMF without deviating from the state machine.

Mathi,

----- [email protected] wrote:

> Ack with comment:
>
> I think this is a temporary solution which solves the problem reported
> in the ticket (the case of restarting OpenSAF components). However, the
> same situation can happen also with an application. The proper solution
> as I see it would be to let AMF wait for both the reply to the terminate
> callback as well as waiting for the process to disappear, before trying
> to instantiate the component again. Such a solution could be implemented
> as an enhancement ticket.
>
> / Anders Widell
>
> On 05/06/15 16:47, [email protected] wrote:
> > Summary: osaf: During adminrestart of node directors, before re-instantiating kill them [#1326]
> > Review request for Trac Ticket(s): #1326
> > Peer Reviewer(s): AndersW, Ramesh - General aspects. HansN,Nagendra, Praveen from AMF perspective
> > Pull request to: <<LIST THE PERSON WITH PUSH ACCESS HERE>>
> > Affected branch(es): opensaf-4.5.x, 4.6.x, default
> > Development branch: <<IF ANY GIVE THE REPO URL>>
> >
> > --------------------------------
> > Impacted area       Impact y/n
> > --------------------------------
> >  Docs                    n
> >  Build system            n
> >  RPM/packaging           n
> >  Configuration files     n
> >  Startup scripts         n
> >  SAF services            y
> >  OpenSAF services        n
> >  Core libraries          n
> >  Samples                 n
> >  Tests                   n
> >  Other                   n
> >
> >
> > Comments (indicate scope for each "y" above):
> > ---------------------------------------------
> > Since this is a common code for all opensaf components.
> > A general review from AndersW and Ramesh would be good.
> > And, review from an AMF component perspective by the AMF maintainers.
> >
> > changeset 41c5c64465057b2fe896ead1d931925ba1077d0f
> > Author: Mathivanan N.P.<[email protected]>
> > Date: Wed, 06 May 2015 20:11:05 +0530
> >
> > osaf: During adminrestart of node directors, before re-instantiating kill
> > them [#1326].
> > The command $ amf-adm restart <DN name> is one way of
> > administratively restarting an AMF component. As a part of this admin
> > operation, AMF sends the component terminate callback to the PI components.
> > It is up to the component to release all its resources and respond to AMF
> > with the status of its self-termination before exiting (typically) the process
> > itself. After receiving the response from the component, AMF invokes the
> > instantiation script of the component. During this time, it is possible that
> > the previously running instance of the process (of this component) has not
> > yet exited. This situation, where there is already a running daemon/process
> > and a new instantiation is being attempted, can cause the instantiation
> > script to return failure. This patch creates a temporary term_state_file from
> > inside the component terminate callback of the node directors. In the
> > instantiation scripts, a check is done to distinguish a fresh
> > instantiation from an instantiation after a termination. If the
> > term_state_file exists, then it is an instantiation after
> > termination. If so, just attempt to kill (using killproc) the process again
> > before calling start_daemon.
> >
> > Note: There has been mention of using the start_daemon -f option, which will
> > create another copy of the daemon if the previous daemon is still running.
> > Using this option may not be ideal for us as it can create inconsistency
> > between the two daemons when using any resources and also, there is no proof
> > or documentation of start_daemon -f working successfully. This is even more
> > significant given that some distros are really slow in becoming LSB
> > compliant, particularly the start_daemon and the likes of it.
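
For readers following the thread, the check described in the commit message above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: the function name instantiate, the marker path in TERM_STATE_FILE, the KILL_CMD/START_CMD override variables, and the removal of the marker after the kill attempt are all hypothetical. Only killproc, start_daemon and the term_state_file idea come from the message itself.

```shell
#!/bin/sh
# Sketch only: the names below are illustrative, not taken from the patch.
# TERM_STATE_FILE: marker the terminate callback is assumed to have created.
# KILL_CMD / START_CMD: default to the LSB helpers killproc and start_daemon
# (usually provided by /lib/lsb/init-functions); overridable for testing.
TERM_STATE_FILE="${TERM_STATE_FILE:-/tmp/osaf_example_term_state_file}"
KILL_CMD="${KILL_CMD:-killproc}"
START_CMD="${START_CMD:-start_daemon}"

# instantiate <binary> [args...]
instantiate() {
    binary="$1"
    shift

    if [ -e "$TERM_STATE_FILE" ]; then
        # Instantiation after a termination: the previous process may not
        # have exited yet, so try to kill it again. A failure is ignored
        # because the process may already be gone.
        "$KILL_CMD" "$binary" || true
        # Assumption: the marker is cleared so that the next start is
        # treated as a fresh instantiation.
        rm -f "$TERM_STATE_FILE"
    fi

    # Fresh instantiation, or the old instance has been dealt with.
    "$START_CMD" "$binary" "$@"
}
```

In an instantiation script this would stand in for the direct start_daemon call, for example: instantiate "$binary" $args. It only narrows the window; it does not replace the AMF-side wait that Anders suggests.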
> >
> >
> > Complete diffstat:
> > ------------------
> >  osaf/services/saf/cpsv/cpnd/cpnd_amf.c               |  14 +++++++++++++-
> >  osaf/services/saf/cpsv/cpnd/scripts/osaf-ckptnd.in   |  13 +++++++++++++
> >  osaf/services/saf/glsv/glnd/glnd_amf.c               |  14 ++++++++++++++
> >  osaf/services/saf/glsv/glnd/scripts/osaf-lcknd.in    |  13 +++++++++++++
> >  osaf/services/saf/immsv/immnd/immnd_amf.c            |  11 +++++++++++
> >  osaf/services/saf/immsv/immnd/scripts/osaf-immnd.in  |  18 ++++++++++++++++++
> >  osaf/services/saf/mqsv/mqnd/mqnd_amf.c               |  11 +++++++++++
> >  osaf/services/saf/mqsv/mqnd/scripts/osaf-msgnd.in    |  13 +++++++++++++
> >  osaf/services/saf/smfsv/smfnd/scripts/osaf-smfnd.in  |  13 +++++++++++++
> >  osaf/services/saf/smfsv/smfnd/smfnd_amf.c            |  11 +++++++++++
> >  10 files changed, 130 insertions(+), 1 deletions(-)
> >
> >
> > Testing Commands:
> > -----------------
> > amf-adm restart <DN name of opensaf Node director component>
> >
> > Testing, Expected Results:
> > --------------------------
> > There should not be any error observed during re-instantiation
> > of that component.
> > Note: The test may be mixed up with a combination of
> > amf-adm restart <DN name>
> > and
> > kill -9 <pid>
> > commands.
> >
> > Conditions of Submission:
> > -------------------------
> > Ack from AndersW/Ramesh.
> >
> > Arch      Built     Started    Linux distro
> > -------------------------------------------
> > mips        n          n
> > mips64      n          n
> > x86         n          n
> > x86_64      y          y
> > powerpc     n          n
> > powerpc64   n          n
> >
> >
> > Reviewer Checklist:
> > -------------------
> > [Submitters: make sure that your review doesn't trigger any checkmarks!]
> >
> >
> > Your checkin has not passed review because (see checked entries):
> >
> > ___ Your RR template is generally incomplete; it has too many blank entries
> >     that need proper data filled in.
> >
> > ___ You have failed to nominate the proper persons for review and push.
> >
> > ___ Your patches do not have proper short+long header
> >
> > ___ You have grammar/spelling in your header that is unacceptable.
> >
> > ___ You have exceeded a sensible line length in your headers/comments/text.
> >
> > ___ You have failed to put in a proper Trac Ticket # into your commits.
> >
> > ___ You have incorrectly put/left internal data in your comments/files
> >     (i.e. internal bug tracking tool IDs, product names etc)
> >
> > ___ You have not given any evidence of testing beyond basic build tests.
> >     Demonstrate some level of runtime or other sanity testing.
> >
> > ___ You have ^M present in some of your files. These have to be removed.
> >
> > ___ You have needlessly changed whitespace or added whitespace crimes
> >     like trailing spaces, or spaces before tabs.
> >
> > ___ You have mixed real technical changes with whitespace and other
> >     cosmetic code cleanup changes. These have to be separate commits.
> >
> > ___ You need to refactor your submission into logical chunks; there is
> >     too much content into a single commit.
> >
> > ___ You have extraneous garbage in your review (merge commits etc)
> >
> > ___ You have giant attachments which should never have been sent;
> >     Instead you should place your content in a public tree to be pulled.
> >
> > ___ You have too many commits attached to an e-mail; resend as threaded
> >     commits, or place in a public tree for a pull.
> >
> > ___ You have resent this content multiple times without a clear indication
> >     of what has changed between each re-send.
> >
> > ___ You have failed to adequately and individually address all of the
> >     comments and change requests that were proposed in the initial review.
> >
> > ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
> >
> > ___ Your computer has a badly configured date and time; confusing
> >     the threaded patch review.
> >
> > ___ Your changes affect IPC mechanism, and you don't present any results
> >     for in-service upgradability test.
> >
> > ___ Your changes affect user manual and documentation, your patch series
> >     do not contain the patch that updates the Doxygen manual.
> >

_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
