Andrew Beekhof wrote:
> 
> On Feb 21, 2008, at 2:05 PM, Andreas Kurz wrote:
> 
>> On Thu, Feb 21, 2008 at 12:22 PM, Dejan Muhamedagic
>> <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>>
>>>
>>> On Thu, Feb 21, 2008 at 12:19:55AM +0100, Johan Hoeke wrote:
>>>> LS,
>>>>
>>>> Running a 2 node cluster, heartbeat-2.1.3-3 centos rpms, RH AS 4.6
>>>>
>>>> While testing a "maintenance scenario" for the cluster I set all
>>>> resources to is_managed=false,
>>>>
>>>> Feb 20 21:09:41 sierpinski pengine: [15725]: notice: native_print:
>>>> R_BB10PRD_DB     (heartbeat::ocf:oracle):        Started
>>>> sierpinski.uvt.nl (unmanaged)
>>>>
>>>>
>>>> and proceeded to shut oracle by hand, oracle being one of the
>>>> resources.
>>>>
>>>> Feb 20 21:12:03 sierpinski oracle[23120]: [23145]: INFO: Oracle
>>>> instance
>>>> BB10PRD is down
>>>>
>>>>
>>>> Within minutes, the node was stonithed. The log shows that this was
>>>> right after the monitor operation for the oracle resource came back
>>>> with
>>>> return code 7:
>>>>
>>>> Feb 20 21:12:03 sierpinski crmd: [4584]: info: process_lrm_event: LRM
>>>> operation R_BB10PRD_DB_monitor_120000 (call=31, rc=7) complete
>>>>
>>>> Feb 20 21:12:03 mandelbrot stonithd: [4580]: info:
>>>> stonith_operate_locally::2375: sending fencing op (RESET) for
>>>> sierpinski.uvt.nl to device external (rsc_id=R_ilo_sierpinski:0,
>>>> pid=5414)
>>>> Feb 20 21:12:03 mandelbrot stonithd: [4580]: info: Node
>>>> mandelbrot.uvt.nl try to help node sierpinski.uvt.nl to fence node
>>>> sierpinski.uvt.nl.
>>>>
>>>> Conclusion: the monitor operation was still running even though the
>>>> resource was unmanaged, and it forced a fencing action.
>>>
>>> Oops. So there's an on_fail=fence for this monitor operation. Is
>>> that necessary?
>>>
>>>
>>>> I then made a script which, in addition to changing the resources to
>>>> is_managed=false, also sets the monitor operations to disabled=true.
>>>> This worked: I am now able to shut down oracle by hand without a
>>>> fencing action being triggered.
>>>>
>>>> Questions:
>>>>
>>>> Is this expected behavior? Should monitor operations keep running even
>>>> though the resources are set to is_managed=false?
>>>
>>> Yes. There was some discussion about it and the majority of
>>> votes went this way, i.e. that monitoring should continue even
>>> for the unmanaged resources.
>>
>> I also agree that it is a good idea to continue monitoring
>> unmanaged resources, but I would see this behaviour as a bug if the
>> "on_fail" action is executed when it is "fence".

I wouldn't want an on_fail=restart to be executed either.
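
To make the behaviour being asked for concrete, here is a rough sketch in
Python. It is only an illustration of the rule discussed in this thread, not
how the CRM actually decides; the function name and the choice of "ignore" as
the result for unmanaged resources are my own assumptions.

```python
def action_on_monitor_failure(is_managed: bool, on_fail: str) -> str:
    """Return the recovery action to take after a failed monitor.

    Hypothetical rule: monitoring continues for unmanaged resources,
    but a failure never triggers the configured on_fail action.
    """
    if not is_managed:
        # Assumption: the failure is only recorded, nothing is recovered.
        return "ignore"
    return on_fail


# Unmanaged resource: neither fence nor restart should be carried out.
print(action_on_monitor_failure(False, "fence"))
print(action_on_monitor_failure(False, "restart"))
# Managed resource: the configured action still applies.
print(action_on_monitor_failure(True, "fence"))
```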

> 
> right, unmanaged resources probably shouldn't have their on_fail action
> executed when they fail.
> did someone log a bug for this yet?

not me, sorry :|
I did reopen bug 1768 as mentioned in this thread.

I have changed my clusters since then, so I'm afraid I can't easily
recreate the error here.
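
For anyone who needs the workaround in the meantime, this is roughly what
the maintenance script mentioned earlier in the thread does, sketched in
Python. The XML below is a simplified, hypothetical resource fragment and not
the exact CIB schema (in a real CIB, is_managed may need to be set through an
nvpair rather than as an attribute), and the helper name is made up. Check it
against your own CIB before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified resource definition; a real CIB entry may differ.
SAMPLE = """
<primitive id="R_BB10PRD_DB" class="ocf" provider="heartbeat" type="oracle">
  <operations>
    <op id="R_BB10PRD_DB_start" name="start" timeout="120s"/>
    <op id="R_BB10PRD_DB_monitor" name="monitor" interval="120s" on_fail="fence"/>
  </operations>
</primitive>
"""


def enter_maintenance(primitive):
    """Mark a resource unmanaged and disable its monitor operations."""
    # Assumption: is_managed is modelled here as a plain attribute.
    primitive.set("is_managed", "false")
    for op in primitive.iter("op"):
        if op.get("name") == "monitor":
            # A disabled monitor can no longer trigger on_fail=fence.
            op.set("disabled", "true")
    return primitive


root = ET.fromstring(SAMPLE)
enter_maintenance(root)
print(ET.tostring(root, encoding="unicode"))
```

The resulting XML would still have to be loaded back into the cluster with
whatever tool you normally use to update the CIB; that step is left out here.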

Hope you enjoyed your vacation Andrew, and welcome back!

regards,

Johan



_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
