On 12-Feb-15 2:21 AM, santosh satapathy wrote:
> Hi,
>
> I have integrated one of our applications with amfpm for passive
> monitoring. I executed amf-adm unlock-in and unlock command in sequence
> while amfpm binary was not available. Forgot to install it to the node. And
> after that, I see the su state as below and its stays there forever, not
> responding to any of the amf-adm commands, reason being "WA Admin
> operation is already going on " . I tried to start the SU after installing
> and putting required bins but even after 1 hour I found the same thing. How
> to get out of this state?
>
> [root@mgt-a bin]# amf-state su all safSu=testsu,safSg=testsg,safApp=TestApp
> safSu=testsu,safSg=testsg,safApp=TestApp
> saAmfSUAdminState=UNLOCKED(1)
> saAmfSUOperState=ENABLED(1)
> saAmfSUPresenceState=TERMINATION-FAILED(7)
> saAmfSUReadinessState=IN-SERVICE(2)

I think it is an NPI application. If amfpm is missing, AMF will declare that
component faulty and will then try to clean it up. Here the SU is marked
TERM_FAILED, which means AMF could not clean up the component successfully.
Please check why the cleanup failed. There is an open ticket, #538, for this
reported case.

However, to allow AMF to perform recovery and repair automatically whenever
an SU moves to the TERM_FAILED state, enable the node-level attribute
saAmfNodeFailfastOnTerminationFailure=1 along with saAmfNodeAutoRepair and
saAmfSGAutoRepair. When all of these attributes are enabled, AMF will perform
node failfast recovery whenever an SU enters the TERM_FAILED state.

In this reported case the SG is unstable, so AMF will not accept any admin
operation on the SU, including the repair admin operation. So restore the
amfpm binary and reboot the node manually. When the node joins the cluster
again, the SU will get instantiated.

Thanks,
Praveen

> [root@mgt-a bin]#
>
> Logs:
> =====
>
> Feb 11 14:38:44 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:45 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:46 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:47 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:48 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:49 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
>
>
>
> Feb 11 15:31:29 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:30 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:31 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:32 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:33 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:34 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:35 mgt-a osafamfd[23083]: WA Admin operation is already going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
>

_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
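
For reference, the three attribute settings mentioned in the reply could be applied with immcfg roughly as follows. This is only a sketch under assumptions: the node DN (safAmfNode=PL-3,safAmfCluster=myAmfCluster) is a placeholder, since the thread does not give the AMF node DN for mgt-a; the SG DN is taken from the SU DN in the report; the SG attribute is spelled saAmfSGAutoRepair here, which is how I believe the IMM class names it; and the attributes are assumed to be writable at runtime in your release. The run helper and the DRY_RUN variable are hypothetical names for this example. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: enable automatic failfast recovery for TERMINATION-FAILED SUs.
# NODE_DN is a placeholder (assumption); replace it with the AMF node DN
# from your own model. SG_DN is derived from the SU DN in the report.
NODE_DN="${NODE_DN:-safAmfNode=PL-3,safAmfCluster=myAmfCluster}"
SG_DN="${SG_DN:-safSg=testsg,safApp=TestApp}"
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

enable_termfail_recovery() {
    # Node failfast when cleanup of a component on this node fails.
    run immcfg -a saAmfNodeFailfastOnTerminationFailure=1 "$NODE_DN"
    # Let AMF repair the node automatically after the failfast.
    run immcfg -a saAmfNodeAutoRepair=1 "$NODE_DN"
    # Let AMF repair SUs of this SG automatically.
    run immcfg -a saAmfSGAutoRepair=1 "$SG_DN"
}

enable_termfail_recovery
```

Review the printed commands first, then rerun with DRY_RUN=0 on a controller node. Note that this only helps with future termination failures; for the SU that is already stuck, the manual reboot described in the reply still applies.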
