The application has gone into the termination-failed state. From here on,
AMF (OpenSAF) does not know, and cannot determine, what exactly is causing
the problem with the application.
Manual intervention is therefore necessary to fix the problem and restart
OpenSAF or the node.

Mathi.

-----Original Message-----
From: praveen malviya 
Sent: Thursday, February 12, 2015 10:05 AM
To: santosh satapathy; [email protected]
Subject: Re: [users] amf-adm issue on amfpm



On 12-Feb-15 2:21 AM, santosh satapathy wrote:
> Hi,
>
> I have integrated one of our applications with amfpm for passive
> monitoring. I executed the amf-adm unlock-in and unlock commands in
> sequence while the amfpm binary was not available; I forgot to install
> it on the node. After that, I see the SU state as below, and it stays
> there forever, not responding to any amf-adm commands, the reason
> being "WA Admin operation is already going on". I tried to start
> the SU after installing the required binaries, but even after 1
> hour I found the same thing. How do I get out of this state?
>
> [root@mgt-a bin]# amf-state su all safSu=testsu,safSg=testsg,safApp=TestApp
> safSu=testsu,safSg=testsg,safApp=TestApp
>          saAmfSUAdminState=UNLOCKED(1)
>          saAmfSUOperState=ENABLED(1)
>          saAmfSUPresenceState=TERMINATION-FAILED(7)
>          saAmfSUReadinessState=IN-SERVICE(2)

I think it is an NPI application.
If amfpm is missing, AMF will declare the component faulty and then try to
clean it up. Here the SU is marked TERM_FAILED, which means AMF could not
clean up the component successfully. Please check why the cleanup failed.
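
To spot SUs stuck in this state, the amf-state output quoted above can be
filtered mechanically. The following is only a minimal sketch in Python,
assuming the output always has the shape of the quoted sample (a DN line
followed by indented attribute=VALUE(n) lines). The function name
find_su_in_state and the SAMPLE text are illustrative and not part of OpenSAF.

```python
import re


def find_su_in_state(amf_state_output, presence_state="TERMINATION-FAILED"):
    """Return the DNs of SUs whose saAmfSUPresenceState equals presence_state.

    Assumes the layout seen in the quoted report: an unindented
    'safSu=...' DN line followed by indented 'attribute=VALUE(n)' lines.
    """
    matches = []
    current_dn = None
    for raw in amf_state_output.splitlines():
        line = raw.strip()
        if line.startswith("safSu="):
            # A new SU block starts with its DN.
            current_dn = line
        elif line.startswith("saAmfSUPresenceState=") and current_dn:
            # Value looks like TERMINATION-FAILED(7); drop the numeric suffix.
            value = line.split("=", 1)[1]
            name = re.sub(r"\(\d+\)$", "", value)
            if name == presence_state:
                matches.append(current_dn)
    return matches


# Sample text copied from the report in this thread.
SAMPLE = """\
safSu=testsu,safSg=testsg,safApp=TestApp
         saAmfSUAdminState=UNLOCKED(1)
         saAmfSUOperState=ENABLED(1)
         saAmfSUPresenceState=TERMINATION-FAILED(7)
         saAmfSUReadinessState=IN-SERVICE(2)
"""

print(find_su_in_state(SAMPLE))
```

In practice the text would come from running amf-state on the node. The
sketch only reports which SUs are affected; the reason the cleanup failed
still has to be found in the logs.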

There is an open ticket, #538, for this reported case.

However, to let AMF automatically perform recovery and repair whenever an
SU moves to the TERM_FAILED state, enable the node-level attribute
saAmfNodeFailfastOnTerminationFailure=1 along with saAmfNodeAutoRepair and
saAmfSGAutoRepair. When all these attributes are enabled, AMF will perform
node failfast recovery whenever an SU enters the TERM_FAILED state.
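
As a rough illustration of the attribute settings just described, here is a
small Python sketch that only builds the command lines and does not run
them. It assumes the attributes are changed with immcfg -a, the usual way to
modify IMM configuration attributes, and that the SG attribute is spelled
saAmfSGAutoRepair as in the AMF class definitions. The helper name and the
node DN are hypothetical; the SG DN is taken from the SU DN in this report.
Please check that each attribute is writable on your release.

```python
def failfast_immcfg_commands(node_dn, sg_dn):
    """Build immcfg argument lists for the three attributes named above.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        ["immcfg", "-a", "saAmfNodeFailfastOnTerminationFailure=1", node_dn],
        ["immcfg", "-a", "saAmfNodeAutoRepair=1", node_dn],
        ["immcfg", "-a", "saAmfSGAutoRepair=1", sg_dn],
    ]


# Hypothetical node DN; replace with the AMF node hosting the SU.
NODE_DN = "safAmfNode=PL-3,safAmfCluster=myAmfCluster"
# SG DN derived from the SU DN in the report.
SG_DN = "safSg=testsg,safApp=TestApp"

for argv in failfast_immcfg_commands(NODE_DN, SG_DN):
    print(" ".join(argv))
```

Printing the commands first makes it easy to review the DNs before applying
anything to a live cluster.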

In this reported case the SG is unstable, so AMF will not accept any admin
operation on the SU, such as the repair admin op. Restore the amfpm binary
and reboot the node manually. When the node joins the cluster again, the SU
will get instantiated.

Thanks,
Praveen


> [root@mgt-a bin]#
>
> Logs:
> =====
>
> Feb 11 14:38:44 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:45 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:46 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:47 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:48 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 14:38:49 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
>
>
>
> Feb 11 15:31:29 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:30 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:31 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:32 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:33 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:34 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
> Feb 11 15:31:35 mgt-a osafamfd[23083]: WA Admin operation is already 
> going on (su'safSu=testsu,safSg=testsg,safApp=TestApp')
>

------------------------------------------------------------------------------
Dive into the World of Parallel Programming. The Go Parallel Website, sponsored 
by Intel and developed in partnership with Slashdot Media, is your hub for all 
things parallel software development, from weekly thought leadership blogs to 
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
