On Wed, Dec 19, 2012 at 4:38 AM,  <laurent+pacema...@u-picardie.fr> wrote:
> laurent+pacema...@u-picardie.fr writes:
>
>> David Vossel <dvos...@redhat.com> writes:
>>
>>>> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]:   notice:
>>>> dynamic_list_search_cb: Disabling port list queries for
>>>> stonith-xen-eddu (1): failed:  255
>>>
>>> We discover which hosts an agent can fence by running this command internally
>>> in stonith.
>>>
>>> # agent -o list
>>>
>>> From there we expect an exit code of 0 and the list of nodes to be in the
>>> output.
>>> https://fedorahosted.org/cluster/wiki/FenceAgentAPI
>>>
>>> Looking at your logs, stonith-xen-eddu is returning -1 (255) as the return 
>>> code when we issue the 'list' action.  That means we don't try to get the 
>>> dynamic list again, we assume the 'list' action isn't supported. From there 
>>> we fall back to using the 'status' action to dynamically determine if the agent
>>> can fence a particular host.  I'm guessing the 'status' action is returning 
>>> true (return codes 0 or 2) for hosts you wouldn't expect the agent to be 
>>> able to fence for some reason.
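
For anyone following along, the fallback David describes can be sketched
roughly as below. This is only an illustration of the behaviour discussed
in this thread, not the actual stonith-ng code: `can_fence`, `run_agent`
and `always_ok_agent` are hypothetical names, and the whitespace-separated
host list is an assumption about the agent's output format.

```python
from typing import Callable, Tuple

# Hypothetical hook: runs the fence agent with an action and a target host
# and returns (exit_code, stdout). Not a real Pacemaker API.
AgentRunner = Callable[[str, str], Tuple[int, str]]


def can_fence(run_agent: AgentRunner, target: str) -> bool:
    """Decide whether an agent is considered able to fence `target`."""
    rc, output = run_agent("list", "")
    if rc == 0:
        # 'list' worked: the target must appear in the reported hosts.
        # Assumption: hosts are separated by whitespace in the output.
        return target in output.split()
    # 'list' failed (e.g. rc=255): treat it as unsupported and fall back
    # to 'status', where 0 or 2 counts as "can fence this host".
    rc, _ = run_agent("status", target)
    return rc in (0, 2)


def always_ok_agent(action: str, target: str) -> Tuple[int, str]:
    """Fake agent: 'list' fails with 255, 'status' always returns 0."""
    if action == "list":
        return (255, "")
    return (0, "")


if __name__ == "__main__":
    # Under these assumptions the agent claims it can fence any host.
    print(can_fence(always_ok_agent, "some-unrelated-host"))
```

With an agent like `always_ok_agent`, every host looks fenceable once the
'list' query has failed, which matches the symptom in the logs above.
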
>>
>> Hi,
>>
>> OK, that makes sense.
>> The FenceAgentAPI doc gives extra information on top of this one:
>> http://hg.linux-ha.org/glue/file/67224d37df80/doc/stonith/README.external
>>
>> Returning 1 when the hostlist is empty does the trick (gethosts action),
>> and so does returning 1 from the status action.
>>
>> So I guess that explains both of my issues:
>> - after the timeout issue, the port list queries were disabled,
>>   falling back to the status action that was always returning rc=0
>> - gethosts returning rc=0 with an empty hostlist also disables the
>>   port list queries
>>
>> So I guess there's no need to file a new ticket :)
>> Thanks,
>
> Hmm, it still feels like there's something funny with this issue.
> Is the FenceAgentAPI relevant to Pacemaker?
>
> I don't see why the fencing agent should return 1 when called with
> "gethosts": it's reachable and working properly. It's just returning
> an empty hostlist.

Agreed.

>
> As for the status action, it also feels like it should return 0 (or 2
> if Pacemaker supports it), as the device is reachable.
>
> In the end I'm going to file a bug.
>
> --
> Laurent
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
