Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Andrew Beekhof

On Fri, May 29, 2009 at 10:30 AM, Tobias Appel  wrote:
> Well, exactly what I expected happened!
> I set the 2nd node to standby - it had no resources running. We stopped
> Heartbeat on the 2nd node and did some maintenance. When we started
> Heartbeat again it joined the cluster as Online-standby and guess what!
>
> The resources on node 01 were getting stopped and restarted by heartbeat!
>
> Now why the hell did heartbeat do this and how can I stop heartbeat from
> doing this in the future?

Attach a hb_report archive to a bugzilla entry so that the developers
have a chance to fix it :-)

I've seen this with clones where the PE isn't always smart enough to
do the right thing, but never for groups.

>
> Another very weird thing was that it did not stop all the resources.
> We have configured one resource group only, containing 6 resources in
> the following order:
> mount filesystem
> virtual ip
> afd
> cups
> nfs
> mailto notification
>
> it stopped the mailto and tried to stop NFS, which failed since NFS was
> in use. Instead of going into an unmanaged state, it just left it
> running and started mailto again.
> No error was shown in crm_mon and the cluster luckily for us kept on
> running. But we did get 2 emails from mailto.
>
> Now why did Heartbeat behave like this? We even had a constraint in
> place which forces the resource group on node 01 (score infinity).
>
> If anyone can shed any light on this matter, please do. This is
> essential for me.
>
> Regards,
> Tobi
>
>
> Andrew Beekhof wrote:
>> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>>> Hi,
>>>
>>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>>
>>> 2-Node Cluster, all resources run on one node - no location constraints
>>> Now I restarted the "standby" node (which had no resources running but
>>> was still active inside the cluster).
>>> When it came back online and joined the cluster again 3 different
>>> scenarios happened:
>>>
>>> 1. all resources failed over to the newly joined node
>>> 2. all resources stay on the current node but get restarted!
>>
>> Usually 1 and 2 occur when services are started by the node when it
>> boots up (ie. not by the cluster).
>> The cluster then detects this, stops them everywhere and starts them
>> on just one node.
>>
>> Cluster resources must never be started automatically by the node at boot 
>> time.
>>
>>> 3. nothing happens
>>>
>>> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
>>> mailing list from someone with a similar problem. Is there any way to
>>> make sure heartbeat does NOT touch the resources, especially not
>>> restarting or re-locating them?
>>>
>>> Thanks in advance,
>>> Tobi
>>> ___
>>> Linux-HA mailing list
>>> Linux-HA@lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Tobias Appel

Well, exactly what I expected happened!
I set the 2nd node to standby - it had no resources running. We stopped 
Heartbeat on the 2nd node and did some maintenance. When we started 
Heartbeat again it joined the cluster as Online-standby and guess what!

The resources on node 01 were getting stopped and restarted by heartbeat!

Now why the hell did heartbeat do this and how can I stop heartbeat from 
doing this in the future?

Another very weird thing was that it did not stop all the resources.
We have configured one resource group only, containing 6 resources in 
the following order:
mount filesystem
virtual ip
afd
cups
nfs
mailto notification

it stopped the mailto and tried to stop NFS, which failed since NFS was
in use. Instead of going into an unmanaged state, it just left it
running and started mailto again.
No error was shown in crm_mon and the cluster luckily for us kept on 
running. But we did get 2 emails from mailto.
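
The sequence described above (mailto stopped, NFS stop failed, everything else left alone) is consistent with how a resource group is ordered: members start in the listed order and stop in reverse. A toy sketch of that ordering follows. It is not Heartbeat's policy engine; the member names are shortened from the list above, and `stop_group` and `fake_stop` are hypothetical helpers invented for illustration.

```python
# Group members in start order, as listed in the message above.
GROUP = ["filesystem", "virtual_ip", "afd", "cups", "nfs", "mailto"]


def stop_group(members, stop):
    """Stop members in reverse start order and halt at the first failure.

    `stop` is a callable returning True when a member stopped cleanly.
    Returns the names that were actually stopped.
    """
    stopped = []
    for name in reversed(members):
        if not stop(name):
            break  # members earlier in the group are never touched
        stopped.append(name)
    return stopped


def fake_stop(name):
    # Assumption for this sketch: only the NFS stop fails (it was in use).
    return name != "nfs"


stopped = stop_group(GROUP, fake_stop)
print(stopped)  # ['mailto']
```

In this model only mailto is stopped and later started again, which would account for the two notification emails. It does not explain why the stop was scheduled in the first place.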

Now why did Heartbeat behave like this? We even had a constraint in 
place which forces the resource group on node 01 (score infinity).

If anyone can shed any light on this matter, please do. This is
essential for me.

Regards,
Tobi

Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run on one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
> 
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
> 
> Cluster resources must never be started automatically by the node at boot 
> time.
> 
>> 3. nothing happens
>>
>> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
>> mailing list from someone with a similar problem. Is there any way to
>> make sure heartbeat does NOT touch the resources, especially not
>> restarting or re-locating them?
>>
>> Thanks in advance,
>> Tobi

Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Jan Kalcic

Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>   
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run on one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
>> 
>
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
>
> Cluster resources must never be started automatically by the node at boot 
> time.
>
>   
I noticed the same behaviour. Once the standby node is activated
again, the resources stay on the same node but get restarted. The
standby server is not restarted at all and no services are started along
with it. In my case the resources were Xen domains.

Thanks,
Jan

Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-28 Thread Tobias Appel

Thanks Andrew, I'll double check that nothing gets started automatically.

Wish me luck :)

Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run on one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
> 
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
> 
> Cluster resources must never be started automatically by the node at boot 
> time.
> 



Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-27 Thread Andrew Beekhof

On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
> Hi,
>
> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>
> 2-Node Cluster, all resources run on one node - no location constraints
> Now I restarted the "standby" node (which had no resources running but
> was still active inside the cluster).
> When it came back online and joined the cluster again 3 different
> scenarios happened:
>
> 1. all resources failed over to the newly joined node
> 2. all resources stay on the current node but get restarted!

Usually 1 and 2 occur when services are started by the node when it
boots up (i.e. not by the cluster).
The cluster then detects this, stops them everywhere and starts them
on just one node.

Cluster resources must never be started automatically by the node at boot time.
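
One way to check for this is to compare the services the cluster manages against what init starts at boot. Below is a minimal sketch, assuming a SysV-init distribution where `chkconfig --list` prints lines such as `nfs 0:off 1:off 2:on ...`. The function name, the sample output and the choice of runlevels 3 and 5 are assumptions made for illustration; `cups` and `nfs` are taken from the resource group mentioned in this thread.

```python
def boot_enabled_services(chkconfig_output, cluster_services, runlevels=("3", "5")):
    """Return cluster-managed services that init would also start at boot.

    chkconfig_output: text in the format printed by `chkconfig --list`.
    cluster_services: names of init scripts the cluster is supposed to control.
    """
    enabled = []
    for line in chkconfig_output.splitlines():
        fields = line.split()
        if not fields or fields[0] not in cluster_services:
            continue
        # Remaining fields look like "3:on" or "5:off".
        states = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if any(states.get(level) == "on" for level in runlevels):
            enabled.append(fields[0])
    return enabled


# Made-up sample output, not taken from the poster's nodes.
sample = """\
cups    0:off 1:off 2:on 3:on 4:on 5:on 6:off
nfs     0:off 1:off 2:off 3:off 4:off 5:off 6:off
sshd    0:off 1:off 2:on 3:on 4:on 5:on 6:off
"""

print(boot_enabled_services(sample, {"cups", "nfs"}))  # ['cups']
```

Any service the function reports should be disabled at boot on every node, so that only the cluster starts it.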

> 3. nothing happens
>
> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
> mailing list from someone with a similar problem. Is there any way to
> make sure heartbeat does NOT touch the resources, especially not
> restarting or re-locating them?
>
> Thanks in advance,
> Tobi

[Linux-HA] Resources get restarted when a node joins the cluster

2009-05-26 Thread Tobias Appel

Hi,

In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:

2-Node Cluster, all resources run on one node - no location constraints
Now I restarted the "standby" node (which had no resources running but 
was still active inside the cluster).
When it came back online and joined the cluster again 3 different 
scenarios happened:

1. all resources failed over to the newly joined node
2. all resources stay on the current node but get restarted!
3. nothing happens

Now I don't know why 1. or 2. happen but I remember seeing a mail on the 
mailing list from someone with a similar problem. Is there any way to
make sure heartbeat does NOT touch the resources, especially not 
restarting or re-locating them?

Thanks in advance,
Tobi
