Well, exactly what I expected happened!
I set the 2nd node to standby - it had no resources running. We stopped 
Heartbeat on the 2nd node and did some maintenance. When we started 
Heartbeat again it joined the cluster as Online-standby and guess what!

The resources on node 01 were getting stopped and restarted by heartbeat!

Now why the hell did heartbeat do this and how can I stop heartbeat from 
doing this in the future?

Another very weird thing was that it did not stop all the resources.
We have configured one resource group only, containing 6 resources in 
the following order:
mount filesystem
virtual ip
afd
cups
nfs
mailto notification

It stopped mailto and tried to stop NFS, which failed because NFS was 
in use. Instead of going into an unmanaged state, it just left NFS 
running and started mailto again.
No error was shown in crm_mon and the cluster luckily for us kept on 
running. But we did get 2 emails from mailto.
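
For what it's worth, the order of the stop attempts matches how a 
resource group is processed: members start in the listed order and stop 
in reverse, so mailto goes first and NFS second. A minimal Python sketch 
of that ordering (illustration only, not Heartbeat code; the names are 
just labels for the six resources above, and both helper functions are 
made up for this example):

```python
# Illustration only: models the ordering inside a resource group,
# not Heartbeat itself. Names are labels for the six group members.
GROUP = ["filesystem", "virtual_ip", "afd", "cups", "nfs", "mailto"]


def stop_sequence(group):
    """Return the members in stop order (the reverse of start order)."""
    return list(reversed(group))


def members_stopped_before(group, resource):
    """Return the members that get stopped before `resource` is stopped."""
    order = stop_sequence(group)
    return order[:order.index(resource)]
```

With this model, `members_stopped_before(GROUP, "nfs")` contains only 
mailto, which would explain why exactly one resource was stopped and 
restarted (hence the 2 emails) while the other four were never touched.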

Now why did Heartbeat behave like this? We even had a constraint in 
place which forces the resource group onto node 01 (score INFINITY).

If anyone can shed any light on this matter, please do. This is 
essential for me.

Regards,
Tobi


Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel <tap...@eso.org> wrote:
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run on one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
> 
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
> 
> Cluster resources must never be started automatically by the node at boot 
> time.
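
One way to check this on a node might be to compare the boot 
configuration against the services the cluster manages. The Python 
sketch below assumes output in the style of `chkconfig --list` (one 
service per line, followed by fields such as `3:on`). The sample text is 
invented for illustration, and the init script names (`cups`, `nfs`) are 
assumptions that may differ per distribution; `boot_enabled` and 
`conflicts` are hypothetical helpers, not part of Heartbeat.

```python
def boot_enabled(chkconfig_output):
    """Return the set of services that are 'on' in at least one runlevel."""
    enabled = set()
    for line in chkconfig_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, levels = fields[0], fields[1:]
        for field in levels:
            runlevel, _, state = field.partition(":")
            # Only fields like "3:on" count; anything else is ignored.
            if runlevel.isdigit() and state == "on":
                enabled.add(name)
                break
    return enabled


def conflicts(chkconfig_output, cluster_services):
    """Return cluster-managed services that the node also starts at boot."""
    return sorted(boot_enabled(chkconfig_output) & set(cluster_services))


# Invented sample in the assumed format, not taken from a real node.
SAMPLE = """\
cups            0:off   1:off   2:on    3:on    4:on    5:on    6:off
nfs             0:off   1:off   2:off   3:off   4:off   5:off   6:off
heartbeat       0:off   1:off   2:off   3:on    4:off   5:on    6:off
"""
```

Here `conflicts(SAMPLE, ["cups", "nfs"])` would flag cups, since it is 
switched on for several runlevels and is also a cluster resource.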
> 
>> 3. nothing happens
>>
>> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
>> mailing list from someone with a similar problem. Is there any way to
>> make sure heartbeat does NOT touch the resources, especially not
>> restarting or re-locating them?
>>
>> Thanks in advance,
>> Tobi
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
