Hi Minh & Thang,

Have you guys got some time to test this as well?

Thanks
Anand Sundararaj
Senior Solutions Architect | +1 480 686 4772
www.GetHighAvailability.com
(https://am2.myprofessionalmail.com/appsuite/www.GetHighAvailability.com)
Get High Availability Today!
NJ, USA: +1 508-507-6507

> On 07/28/2020 8:43 AM Anand Sundararaj <s.an...@gethighavailability.com> wrote:
>
> Hi Minh/Thang/Nagendra/Paul,
>
> I am planning to push the patch by 30th July (Thursday).
> Please kindly find some time to review by 29th July (tomorrow) and
> send your comments or Ack.
>
> Thanks
> Anand Sundararaj
> Senior Solutions Architect | +1 480 686 4772
> www.GetHighAvailability.com
> (https://am2.myprofessionalmail.com/appsuite/www.GetHighAvailability.com)
> Get High Availability Today!
> NJ, USA: +1 508-507-6507
>
> > On 07/23/2020 7:28 PM s.an...@gethighavailability.com wrote:
> >
> > From: Anand Sundararaj <s.an...@gethighavailability.com>
> >
> > Summary: amf: support error report on non local component [#109]
> > Review request for Ticket(s): 109
> > Peer Reviewer(s): Minh, Thang, Nagendra, Paul
> > Pull request to: Amf Maintainers
> > Affected branch(es): develop
> > Development branch: ticket-109
> > Base revision: 59ded7cdf6a431e522229afd5ecb989e4a61c7d8
> > Personal repository: git://git.code.sf.net/u/s-anand-has/review
> >
> > --------------------------------
> > Impacted area       Impact y/n
> > --------------------------------
> >  Docs                    n
> >  Build system            n
> >  RPM/packaging           n
> >  Configuration files     n
> >  Startup scripts         n
> >  SAF services            y
> >  OpenSAF services        n
> >  Core libraries          n
> >  Samples                 n
> >  Tests                   n
> >  Other                   n
> >
> > NOTE: Patch(es) contain lines longer than 80 characters
> >
> > Comments (indicate scope for each "y" above):
> > ---------------------------------------------
> > *** EXPLAIN/COMMENT THE PATCH SERIES HERE ***
> >
> > revision fbdef9a140a12d1ca301658537f28bc0dc719a22
> > Author: Anand Sundararaj <s.an...@gethighavailability.com>
> > Date: Fri, 24 Jul 2020 07:53:50 +0530
> >
> > amf: support error report on non local component [#109]
> >
> >
> > Complete diffstat:
> > ------------------
> >  src/amf/amfnd/amfnd.cc    |  22 ++++--
> >  src/amf/amfnd/avnd_cb.h   |   2 +
> >  src/amf/amfnd/avnd_comp.h |   2 +
> >  src/amf/amfnd/clm.cc      |  33 ++++++++-
> >  src/amf/amfnd/err.cc      | 182 +++++++++++++++++++++++++++++++++++++++++++++-
> >  5 files changed, 228 insertions(+), 13 deletions(-)
> >
> >
> > Testing Commands:
> > -----------------
> > Configure amf demo on Comp1/SU1 (on SC-1) and Comp2/SU2 (on PL-3).
> >
> > 1. Report an error (saAmfComponentErrorReport_4) from Comp1 running on SC-1
> >    for Comp2 running on PL-3 with recommendedRecovery as
> >    SA_AMF_COMPONENT_RESTART. Comp2 restarts.
> >    osafamfnd[2450]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
> >    osafamfnd[2450]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'errorReport' : Recovery is 'componentRestart'
> >    osafamfnd[2450]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATED => RESTARTING
> >
> > 2. Repeat tc #1 with an unconfigured component name such as
> >    safComp=AmfDem5. The return is SA_AIS_ERR_NOT_EXIST(12).
> >    osafamfnd[3970]: NO Component 'safComp=AmfDem5,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' is not configured
> >    amf_demo[14441]: saAmfComponentErrorReport_4 FAILED - 12 on safComp=AmfDem5,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 3. Stop PL-3 and rerun tc #1. The return will be
> >    SA_AIS_ERR_UNAVAILABLE(31).
> >    amf_demo[14922]: saAmfComponentErrorReport_4 FAILED - 31 on safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 4. Repeat tc #1. When the error report call reaches amfnd on PL-3, pause
> >    amfnd in gdb and stop PL-3. The return will be SA_AIS_ERR_TIMEOUT(5).
> >    amf_demo[15503]: saAmfComponentErrorReport_4 FAILED - 5 on safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 5. Lock PL-3 and repeat tc #1. The component will restart on PL-3.
> >
> > 6. Lock and lock-in PL-3 and repeat tc #1. The error report will return
> >    SA_AIS_ERR_INVALID_PARAM(7).
> >    amf_demo[15773]: saAmfComponentErrorReport_4 FAILED - 7 on safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 7. Lock CLM node PL-3 and repeat tc #1. The error report will return
> >    SA_AIS_ERR_UNAVAILABLE(31).
> >    amf_demo[15873]: saAmfComponentErrorReport_4 FAILED - 31 on safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 8. Kill the component on PL-3 and return non-zero from the cleanup
> >    command, so that it goes into TERMINATION_FAILED. Now repeat tc #1.
> >    The return will be SA_AIS_ERR_INVALID_PARAM(7).
> >    amf_demo[16016]: saAmfComponentErrorReport_4 FAILED - 7 on safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1
> >
> > 9. Repeat tc #8 for INSTANTIATION_FAILED. The same result.
> >
> > 10. Repeat tc #1 with recommendedRecovery as SA_AMF_NODE_SWITCHOVER.
> >    osafamfnd[2419]: NO 'safComp=AmfDemo,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' faulted due to 'errorReport' : Recovery is 'nodeSwitchover'
> >    osafamfnd[2419]: NO Informing director of Nodeswitchover
> >
> > 11. Repeat tc #1 while an SU unlock operation is in progress on SU2 of
> >    PL-3. While the admin unlock is in progress on SU2 (i.e. when it gets
> >    the Act cbk, hold the response for 5 seconds), call
> >    saAmfComponentErrorReport_4() from Comp1 (running on SC-1) as in tc #1.
> >    Comp2 will restart and get the Act assignment again.
> >
> >    osafamfnd[3258]: NO Assigning 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
> >    osafamfnd[3258]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' component restart probation timer started (timeout: 400000000000 ns)
> >    osafamfnd[3258]: NO Restarting a component of 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' (comp restart count: 1)
> >    osafamfnd[3258]: NO 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1' Presence State RESTARTING => INSTANTIATED
> >    osafamfnd[3258]: NO Assigned 'safSi=AmfDemo,safApp=AmfDemo1' ACTIVE to 'safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1'
> >
> > 12. Repeat tc #11 for SU shutdown/lock, node & SG lock/unlock/shutdown, and
> >    SI lock/unlock. The same result.
> > 13. Repeat tc #1 for an NPI component. The NPI component gets restarted.
> > 14. Repeat tc #3 for NPI. The same result.
> > 15. Repeat tc #4 for NPI. The same result.
> >
> > Testing, Expected Results:
> > --------------------------
> > As described above
> >
> > Conditions of Submission:
> > -------------------------
> > Ack from any amf maintainers. Timeout in 3 days
> >
> > Arch        Built     Started    Linux distro
> > -------------------------------------------
> > mips          n          n
> > mips64        n          n
> > x86           n          n
> > x86_64        y          y
> > powerpc       n          n
> > powerpc64     n          n
> >
> >
> > Reviewer Checklist:
> > -------------------
> > [Submitters: make sure that your review doesn't trigger any checkmarks!]
> >
> >
> > Your checkin has not passed review because (see checked entries):
> >
> > ___ Your RR template is generally incomplete; it has too many blank entries
> >     that need proper data filled in.
> >
> > ___ You have failed to nominate the proper persons for review and push.
> >
> > ___ Your patches do not have proper short+long header
> >
> > ___ You have grammar/spelling in your header that is unacceptable.
> >
> > ___ You have exceeded a sensible line length in your headers/comments/text.
> >
> > ___ You have failed to put in a proper Trac Ticket # into your commits.
> >
> > ___ You have incorrectly put/left internal data in your comments/files
> >     (i.e. internal bug tracking tool IDs, product names etc)
> >
> > ___ You have not given any evidence of testing beyond basic build tests.
> >     Demonstrate some level of runtime or other sanity testing.
> >
> > ___ You have ^M present in some of your files. These have to be removed.
> >
> > ___ You have needlessly changed whitespace or added whitespace crimes
> >     like trailing spaces, or spaces before tabs.
> >
> > ___ You have mixed real technical changes with whitespace and other
> >     cosmetic code cleanup changes. These have to be separate commits.
> >
> > ___ You need to refactor your submission into logical chunks; there is
> >     too much content in a single commit.
> >
> > ___ You have extraneous garbage in your review (merge commits etc)
> >
> > ___ You have giant attachments which should never have been sent;
> >     instead you should place your content in a public tree to be pulled.
> >
> > ___ You have too many commits attached to an e-mail; resend as threaded
> >     commits, or place in a public tree for a pull.
> >
> > ___ You have resent this content multiple times without a clear indication
> >     of what has changed between each re-send.
> >
> > ___ You have failed to adequately and individually address all of the
> >     comments and change requests that were proposed in the initial review.
> >
> > ___ You have a misconfigured ~/.gitconfig file (i.e. user.name, user.email
> >     etc)
> >
> > ___ Your computer has a badly configured date and time, confusing the
> >     threaded patch review.
> >
> > ___ Your changes affect IPC mechanism, and you don't present any results
> >     for in-service upgradability test.
> >
> > ___ Your changes affect user manual and documentation, your patch series
> >     do not contain the patch that updates the Doxygen manual.
> >
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
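
For anyone re-running the quoted test cases, the expected return codes can be checked mechanically against the amf_demo log output. The sketch below is only an illustration and is not part of the patch: the helper names (`parse_failure`, `check_test_case`) are hypothetical, and the table of expected codes is copied from the test case descriptions quoted above (tc #9, #14 and #15 are listed as "the same result" as tc #8, #3 and #4). It assumes the log line format shown in the quoted output.

```python
import re

# SaAisErrorT values as they appear in the quoted test descriptions.
SA_AIS_ERR = {
    5: "SA_AIS_ERR_TIMEOUT",
    7: "SA_AIS_ERR_INVALID_PARAM",
    12: "SA_AIS_ERR_NOT_EXIST",
    31: "SA_AIS_ERR_UNAVAILABLE",
}

# Expected return code per test case number, for the cases where the
# error report is expected to fail (taken from the quoted list).
EXPECTED = {2: 12, 3: 31, 4: 5, 6: 7, 7: 31, 8: 7, 9: 7, 14: 31, 15: 5}

# Matches lines such as:
#   amf_demo[14441]: saAmfComponentErrorReport_4 FAILED - 12 on safComp=...
FAILURE_RE = re.compile(r"saAmfComponentErrorReport_4 FAILED - (\d+) on (\S+)")


def parse_failure(line):
    """Return (return_code, component_dn) or None if the line is not a failure line."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def check_test_case(tc_number, line):
    """True if the line reports the return code expected for the given test case."""
    parsed = parse_failure(line)
    if parsed is None:
        return False
    return parsed[0] == EXPECTED.get(tc_number)


if __name__ == "__main__":
    sample = ("amf_demo[14441]: saAmfComponentErrorReport_4 FAILED - 12 on "
              "safComp=AmfDem5,safSu=SU2,safSg=AmfDemo,safApp=AmfDemo1")
    code, dn = parse_failure(sample)
    print(SA_AIS_ERR[code], dn, check_test_case(2, sample))
```

Test cases #1, #5, #10, #11, #12 and #13 expect the call to succeed and the recovery to run, so they do not appear in the table; those are verified from the osafamfnd log lines instead.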