Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Dimitri Maziuk
On Friday 29 May 2009 14:44:55 Bernie Wu wrote:
> Hi Les,
> Currently, there is no database or filesystem on the active system that
> needs to be sync'ed.  Apache currently just serves up static pages.  It's
> too early in our POC for something fancy.

Syncing only really works if people upload files and expect them to persist 
past their current "session". With throw-away uploads (i.e. they click on 
upload and get back some output) it's not going to do anything, and with 
fancier stuff like data entry via multiple CGI forms you'll want DRBD. And if 
you have stateful stuff like applet-servlet apps, those will break during 
failover.
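
If DRBD does enter the picture on a v1 setup, the haresources line grows
roughly like this (node name, IP, DRBD resource and mount point are all
placeholders, not a recommendation):

node1 192.168.1.10 drbddisk::r0 Filesystem::/dev/drbd0::/var/www::ext3 httpd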

Dima
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Bernie Wu
Hi Les,
Currently, there is no database or filesystem on the active system that needs 
to be sync'ed.  Apache currently just serves up static pages.  It's too early 
in our POC for something fancy.

Bernie

-Original Message-
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Les Mikesell
Sent: Friday, May 29, 2009 11:25 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster 
for Proof of Concept

Dimitri Maziuk wrote:
> Bernie Wu wrote:
>> The company I work for wants us to start investigating HA.
>> My first POC setup was a 2 node cluster with a floating IP and that worked 
>> out quite well.
>> Now the second POC was to work with an application, in this case, apache2 in 
>> a active/passive configuration.
>> My question is this.  Do I need to setup a third node to serve as the quorum 
>> node or can I work with 2 nodes.
>
> If your setup is v1-style active/passive, all you need to do is add
> httpd to haresources line (and make sure it's not started by init).
> That's for proof of concept. IRL you may want to throw in at least mon,
> possibly stonith -- although I haven't seen a split brain problem in my
> setup (but then again, mine usually fail over when I upgrade the kernel).

Apache typically won't do anything until a request hits it, which won't
happen on the machine that doesn't have the floating IP, so it isn't
likely to have a split-brain issue and doesn't even matter if it is
running all the time on the passive machine.  However, you need to think
about any state information that is updated/maintained on the active
system.  Is there a backend database or anything written to the
filesystem that has to be kept in sync?

--
   Les Mikesell
lesmikes...@gmail.com


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Bernie Wu
Hi Dima,
Thanks for your reply.  However, I'm using Heartbeat V2 which doesn't use the 
haresources file.

Bernie

-Original Message-
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Dimitri Maziuk
Sent: Friday, May 29, 2009 10:52 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster 
for Proof of Concept

Bernie Wu wrote:
> The company I work for wants us to start investigating HA.
> My first POC setup was a 2 node cluster with a floating IP and that worked 
> out quite well.
> Now the second POC was to work with an application, in this case, apache2 in 
> a active/passive configuration.
> My question is this.  Do I need to setup a third node to serve as the quorum 
> node or can I work with 2 nodes.

If your setup is v1-style active/passive, all you need to do is add
httpd to haresources line (and make sure it's not started by init).
That's for proof of concept. IRL you may want to throw in at least mon,
possibly stonith -- although I haven't seen a split brain problem in my
setup (but then again, mine usually fail over when I upgrade the kernel).

Dima

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Les Mikesell
Dimitri Maziuk wrote:
> Bernie Wu wrote:
>> The company I work for wants us to start investigating HA.
>> My first POC setup was a 2 node cluster with a floating IP and that worked 
>> out quite well.
>> Now the second POC was to work with an application, in this case, apache2 in 
>> a active/passive configuration.
>> My question is this.  Do I need to setup a third node to serve as the quorum 
>> node or can I work with 2 nodes.
> 
> If your setup is v1-style active/passive, all you need to do is add 
> httpd to haresources line (and make sure it's not started by init). 
> That's for proof of concept. IRL you may want to throw in at least mon, 
> possibly stonith -- although I haven't seen a split brain problem in my 
> setup (but then again, mine usually fail over when I upgrade the kernel).

Apache typically won't do anything until a request hits it, which won't 
happen on the machine that doesn't have the floating IP, so it isn't 
likely to have a split-brain issue and doesn't even matter if it is 
running all the time on the passive machine.  However, you need to think 
about any state information that is updated/maintained on the active 
system.  Is there a backend database or anything written to the 
filesystem that has to be kept in sync?

-- 
   Les Mikesell
lesmikes...@gmail.com


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Dimitri Maziuk
Bernie Wu wrote:
> The company I work for wants us to start investigating HA.
> My first POC setup was a 2 node cluster with a floating IP and that worked 
> out quite well.
> Now the second POC was to work with an application, in this case, apache2 in 
> a active/passive configuration.
> My question is this.  Do I need to setup a third node to serve as the quorum 
> node or can I work with 2 nodes.

If your setup is v1-style active/passive, all you need to do is add 
httpd to haresources line (and make sure it's not started by init). 
That's for proof of concept. IRL you may want to throw in at least mon, 
possibly stonith -- although I haven't seen a split brain problem in my 
setup (but then again, mine usually fail over when I upgrade the kernel).
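
For the proof of concept the whole thing can be as small as this (host name
and address are placeholders):

node1 192.168.1.10 httpd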

Dima

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] resource not failover

2009-05-29 Thread joe tse
Hi Dejan
After changing to the OCF RA, it is working fine. However, I would like to
know what "action failed (rc=1)" means, and why there is a problem with the
LSB script? Thanks for your help.



Best regards

Joe



-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Dejan Muhamedagic
Sent: Tuesday, May 26, 2009 11:23 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] resource not failover

Hi,

On Thu, May 21, 2009 at 06:34:06PM +0800, joe tse wrote:
> I'm a newbie of Heartbeat. Currently, I have configured a two-node cluster
> with postgresql. Everything is working fine. However, once I stop
> postgresql from the shell with "rcpostgres stop", it does not fail over,
> and crm_mon just shows the output below. Please help; is there a way to
> make it work? Thanks everyone.
>
> Last updated: Thu May 21 18:32:51 2009
> Current DC: node-1 (2d3f1c8c-1f57-48d6-94fe-634efde56e9f)
> 2 Nodes configured.
> 2 Resources configured.
>
> Node: node-1 (2d3f1c8c-1f57-48d6-94fe-634efde56e9f): online
> Node: node-2 (07ef1d66-c3ec-45b3-a3d7-bd144ea3a320): online
>
> Master/Slave Set: r0
> drbd_module:0   (ocf::heartbeat:drbd):  Master node-1
> drbd_module:1   (ocf::heartbeat:drbd):  Started node-2
>
> Failed actions:
> postgresql_status_6 (node=hklnxdvm27-2, call=111, rc=7): complete
> postgresql_stop_0 (node=hklnxdvm27-2, call=114, rc=1): complete

The postgresql stop action failed (rc=1). Any reason not to use
the packaged pgsql OCF RA?

Thanks,

Dejan

> Regards,
> Joe
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] usage heartbeat x Oracle XE

2009-05-29 Thread Rafael - Uol
Morning,

I am trying to configure heartbeat for Oracle, but I cannot configure it
correctly. I read the script /usr/lib/ocf/resource.d/heartbeat/oracle and it
shows this as an example:

node1  10.0.0.170 oracle::RK1::/oracle/10.2::orark1

What is RK1?
What is orark1?

Thanks!

Rafael - Uol
rn.n...@uol.com.br
skype: rn.neto



___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Resources won't run anywhere when PingNode is unreachable

2009-05-29 Thread firewall

Hi all, 

I'm having problems getting a split-brain to work (for demo, yes that's
right). Basically, I have an active/passive cluster both configured with
pingd like below. When the ping node is not reachable on the primary/active
(i.e. I unplug the cable..), a failover occurs to the backup and vice-versa.
However, I expect that when the ping node is not reachable from both nodes,
a split-brain should occur resulting in the cluster resource running on both
nodes at the same time. I need this to happen for demonstration. I'm pretty
sure it's to do with my config, but not sure what to change so that a
split-brain would occur when both cluster nodes cannot ping the ping node. My
cib.xml file is attached and my ha.cf file is displayed below.

http://www.nabble.com/file/p23762887/cib.xml cib.xml 

My ha.cf file:
use_logd on
udpport 694
keepalive 2  #2 seconds interval between heartbeats
deadtime 10
initdead 20
bcast eth0 bond0 #send heartbeats on eth0 and bond0 for redundancy
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s  # failover on network failures
ping 1.2.3.4
#apiauth default uid=root
node primary backup
crm yes  #enable Heartbeat V2-style w/ cluster manager
#auto_failback directive only applies to R1 style haresources. The
corresponding attribute in R2 is the resource_stickiness (in the CIB file)
#auto_failback yes
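
For reference, the usual pingd location rule looks like the sketch below in
cib.xml (ids and resource name are examples, not necessarily my actual
config); a -INFINITY rule like this is exactly what keeps a resource from
running anywhere once no node can reach the ping node:

<rsc_location id="loc-connected" rsc="my_resource">
  <rule id="loc-connected-rule" score="-INFINITY" boolean_op="or">
    <expression id="loc-expr-1" attribute="pingd" operation="not_defined"/>
    <expression id="loc-expr-2" attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>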

PS. Surprisingly, a failback still occurs even though auto_failback is
commented out in the ha.cf file.

Any help will be greatly appreciated.

-- 
View this message in context: 
http://www.nabble.com/Resources-wont-run-anywhere-when-PingNode-is-unreachable-tp23762887p23762887.html
Sent from the Linux-HA mailing list archive at Nabble.com.

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] allow_migrate does not allow to migrate LIVE a Xen VM

2009-05-29 Thread Florian Haas
This must be a meta attribute not an instance attribute, and AFAICS in
Pacemaker 1.0.3 it's now named "allow-migrate", not "allow_migrate".
Andrew, please correct me if I'm wrong.

Cheers,
Florian

On 2009-05-27 23:30, Jan Kalcic wrote:
> Hi All,
> 
> as in subject, I configured a simple Xen resource with the attribute
> "allow_migrate" "true" in order to migrate it live but it's always
> stopped and re-started. Strange is that with the xm tool I can migrate
> it live without any problem.
> 
> pacemaker-1.0.3 on SLES 11
> 
> Any hints?
> 
> Thanks,
> Jan



signature.asc
Description: OpenPGP digital signature
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

[Linux-HA] Looking for impressive success stories

2009-05-29 Thread Michael Schwartzkopff
Hi,

Since the "Success Stories" link of linux-ha.org is quite outdated we are 
looking for some recent and impressive stories that could convice the readers 
to use pacemaker cluster resource manager. So we could add these to 
clusterlabs.org.

Anybody please feel free to mail me your stories. Alternatively you can mail 
it to Andrew.

Greetings,


-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] clean CIB

2009-05-29 Thread Michael Schwartzkopff
On Friday, 29 May 2009 at 15:06:29, Koen Verwimp wrote:
> Hi,
>
>  
>
> What is the easiest way to clear the whole CIB configuration?
>
>  
>
> Thanks,
>
> Koen

cibadmin -E --force

Beware if you have a heartbeat based cluster. The node information is stored 
elsewhere, too.
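
On heartbeat that usually means the hostcache as well; a sketch, with the
path from memory (verify on your installation):

rm /var/lib/heartbeat/hostcache   # heartbeat's node list outside the CIB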

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] clean CIB

2009-05-29 Thread Koen Verwimp

Hi, 

  

What is the easiest way to clear the whole CIB configuration? 

  

Thanks, 

Koen 

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Node Selection on Failover

2009-05-29 Thread Andrew Beekhof
On Tue, May 26, 2009 at 10:07 PM, Kevin Harms  wrote:
>
>   I have set up an 8-node cluster. The cluster has 15 resources. I
> set up the system such that all 15 resources are distributed across the 7
> primary nodes when the cluster starts up. I would like it such that
> when a node fails, the resources are migrated to the backup node first,
> with the resource distribution subsequently rebalanced among the
> remaining nodes. I'm running HB 2.1.4. It is my understanding that if
> the score values are equal then the node with the fewest resources
> will be selected as the target node. During my testing I don't find
> this to be the case. It seems like the node selected is the first node
> listed in the crm_mon output:
>
> Node: fs24 (1ff11fab-a613-4335-b302-33dd812bf99b): online
> Node: fs23 (85a8a44b-9e1e-4aaa-bedf-d023398b5553): online
> Node: fs22 (bf727f13-62d9-450b-b389-86353994ffe1): online
> Node: fs21 (4d36cf33-7540-45e5-b6ae-d3d34661de75): online
> Node: fs20 (11d49119-471f-4ce1-bbed-2603edee32fc): online
> Node: fs19 (f10a906e-5ad2-4e49-af91-f9ebbfa2b52d): online
> Node: fs18 (d538b892-ddee-4dac-aaf1-e29dcf85817b): online
> Node: fs17 (9073d06c-e040-46f9-9a69-9c3e7719333c): online
>
>   Can anyone confirm the behavior of Heartbeat 2.1.4 for node
> selection in the case where scores are equal?

I'd have to see your config to be able to comment
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Recovering a Fragile CIB after Debian Lenny upgrade

2009-05-29 Thread Andrew Beekhof
On Wed, May 27, 2009 at 7:57 PM, Imran Chaudhry wrote:

> One out-standing question I have is that if I reboot foo, then the
> resources will migrate to bar but when foo comes back up the resources
> migrate back to foo. I did not expect this to happen since I have
> "auto_failback off" in ha.cf. Is this because I have "crm on" so that it
> ignores ha.cf?

Not all of it, but yes, the auto_failback option has no meaning in a
crm cluster.

>
> In my production scenario this was actually OK and did not cause a
> problem when it happened (because other services on foo were configured
> correctly). However, I would not like the resources to failback
> automatically,

This will stop it from moving automatically

# crm configure property default-resource-stickiness=INFINITY

> I would like to do so manually via hb_gui/CLI

Use crm_resource --migrate for this purpose.
I think there is also a crm shell command too.
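
From memory, roughly like this (resource and node names are examples; check
crm_resource --help for the exact option spelling in your version):

crm_resource --migrate --resource group_1 --node bar   # pin it to bar
crm_resource --un-migrate --resource group_1           # let it move again

The crm shell equivalent should be: crm resource migrate group_1 bar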

> or at least
> have a toggle for this behaviour. How do I do this? Any pointers
> welcome.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] allow_migrate does not allow to migrate LIVE a Xen VM

2009-05-29 Thread Jan Kalcic
I finally got it: replaced the underscore with a dash, as Andrew indicated a
few messages ago.

"In 1.0.x most underscores were replaced with dashes.
So try changing allow_migrate to allow-migrate"

The meta attribute "allow-migrate" does the job for me. Thanks Andrew.

I think an update to the Xen RA would be really handy for those who
have this issue (I eventually found a lot of people who do). The information
contained in the OCF RA causes big headaches. A documentation update
would be really appreciated too.

HTH

Thanks,
Jan



Jan Kalcic wrote:
> The following is the xml for the resource ("..." marks attributes lost in
> the archive):
>
> <primitive ...>
>   <meta_attributes ...>
>     <nvpair ... name="target-role" value="started"/>
>   </meta_attributes>
>   <operations>
>     <op ... start-delay="20" timeout="60"/>
>     <op ... start-delay="10" timeout="60"/>
>     <op ... timeout="60"/>
>     <op ... start-delay="10" timeout="120"/>
>   </operations>
>   <instance_attributes ...>
>     <nvpair ... name="xmfile" value="/etc/xen/vm/sles11"/>
>     <nvpair ... name="allow_migrate" value="true"/>
>   </instance_attributes>
> </primitive>
>
> Thanks,
> Jan
>
> Jan Kalcic wrote:
>   
>> Hi All,
>>
>> as in subject, I configured a simple Xen resource with the attribute
>> "allow_migrate" "true" in order to migrate it live but it's always
>> stopped and re-started. Strange is that with the xm tool I can migrate
>> it live without any problem.
>>
>> pacemaker-1.0.3 on SLES 11
>>
>> Any hints?
>>
>> Thanks,
>> Jan
>>
>>   
>> 
>
>
>   

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Andrew Beekhof
On Fri, May 29, 2009 at 10:30 AM, Tobias Appel  wrote:
> Well, exactly what I expected happened!
> I set the 2nd node to standby - it had no resources running. We stopped
> Heartbeat on the 2nd node and did some maintenance. When we started
> Heartbeat again it joined the cluster as Online-standby and guess what!
>
> The resources on node 01 were getting stopped and restarted by heartbeat!
>
> Now why the hell did heartbeat do this and how can I stop heartbeat from
> doing this in the future?

Attach a hb_report archive to a bugzilla entry so that the developers
have a chance to fix it :-)
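
Something along these lines (the start time and destination are examples):

hb_report -f "2009-05-29 10:00" /tmp/restart-issue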

I've seen this with clones where the PE isn't always smart enough to
do the right thing, but never for groups.

>
> Another very weird thing was that it did not stop all the resources.
> We have configured one resource group only, containing 6 resources in
> the following order:
> mount filesystem
> virtual ip
> afd
> cups
> nfs
> mailto notification
>
> it stopped the mailto and tried to stop NFS, which failed since NFS was
> in use; instead of going into an unmanaged state, it just left it
> running and started mailto again.
> No error was shown in crm_mon and the cluster luckily for us kept on
> running. But we did get 2 emails from mailto.
>
> Now why did Heartbeat behave like this? We even had a constraint in
> place which forces the resource group on node 01 (score infinity).
>
> If anyone can shed any light on this matter, please do. This is
> essential for me.
>
> Regards,
> Tobi
>
>
> Andrew Beekhof wrote:
>> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>>> Hi,
>>>
>>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>>
>>> 2-Node Cluster, all resources run one node - no location constraints
>>> Now I restarted the "standby" node (which had no resources running but
>>> was still active inside the cluster).
>>> When it came back online and joined the cluster again 3 different
>>> scenarios happened:
>>>
>>> 1. all resources failed over to the newly joined node
>>> 2. all resources stay on the current node but get restarted!
>>
>> Usually 1 and 2 occur when services are started by the node when it
>> boots up (ie. not by the cluster).
>> The cluster then detects this, stops them everywhere and starts them
>> on just one node.
>>
>> Cluster resources must never be started automatically by the node at boot 
>> time.
>>
>>> 3. nothing happens
>>>
>>> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
>>> mailing list from someone with a similiar problem. Is there any way to
>>> make sure heartbeat does NOT touch the resources, especially not
>>> restarting or re-locating them?
>>>
>>> Thanks in advance,
>>> Tobi
>>> ___
>>> Linux-HA mailing list
>>> Linux-HA@lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>
>> ___
>> Linux-HA mailing list
>> Linux-HA@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] How to setup a 2 node active/passive apache2 cluster for Proof of Concept

2009-05-29 Thread Dejan Muhamedagic
Hi,

On Thu, May 28, 2009 at 09:03:10PM -0400, Bernie Wu wrote:
> Hi Listers,
> I am a complete newbie when it comes to HA.
> The company I work for wants us to start investigating HA.
> My first POC setup was a 2 node cluster with a floating IP and that worked 
> out quite well.
> Now the second POC was to work with an application, in this case, apache2 in 
> a active/passive configuration.
> My question is this.  Do I need to setup a third node to serve
> as the quorum node or can I work with 2 nodes.

You can use two nodes, but make sure that you have stonith
configured.
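
For a first test, an ssh-based stonith clone is the quickest thing to try.
A sketch for the heartbeat 2.1 CIB, with example ids and hostnames (ssh
stonith is for testing only, never for production):

<clone id="DoFencing">
  <primitive id="st-ssh" class="stonith" type="external/ssh">
    <instance_attributes id="st-ssh-ia">
      <attributes>
        <nvpair id="st-ssh-hostlist" name="hostlist" value="node1,node2"/>
      </attributes>
    </instance_attributes>
  </primitive>
</clone>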

> If I went with
> 2 nodes, do I still use pingd and can I use the same interface
> that heartbeat uses ?

Yes.

> Also how many interfaces do I need for HA ?

At least two. One can work, but it won't be supported.
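
In ha.cf that can be as simple as two communication paths plus pingd; a
sketch reusing your interface names (the ping target is a placeholder):

bcast hsi0
bcast eth0
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
ping 10.0.0.254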

> Currently, each node has eth0 and hsi0.  The floating IP uses eth0. Heartbeat 
> uses hsi0.
> Our setup is :
> SLES10- SP2,  zVM 5.3 and heartbeat-2.1.3-0.9.
> Any advice will be much appreciated.

Thanks,

Dejan

> 
> TIA
> Bernie
> 
> 
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] new doc about stonith/fencing

2009-05-29 Thread Jan Kalcic
Really interesting. I would have appreciated some more examples (they are
always welcome), but it is still very interesting.

Thanks,
Jan

Dejan Muhamedagic wrote:
> Hi,
>
> Trying to make it a bit less mysterious, I wrote something about
> fencing and stonith quite a while ago and then forgot to share
> the link. Sorry about that.
>
> Here it is:
>
> http://www.clusterlabs.org/mediawiki/images/f/f2/Crm_fencing.pdf
>
> As usual, constructive criticism/suggestions/etc are welcome.
> I won't be able to read your impressions for the next two weeks,
> but will surely look forward to seeing them afterwards.
>
> Cheers,
>
> Dejan
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>   

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Antwort: Re: Resources get restarted when a node joins the cluster

2009-05-29 Thread Robert . Koeppl
I once had similar behaviour on one of my clusters. The reason was that
the LSB script of one of the resources (for which no OCF agent has been
written yet) reported the resource as running even when it was not. That
caused heartbeat to think that the resources were running on both nodes,
which in turn caused the system to stop them on all nodes and afterwards
start them on the right one again.
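
A quick way to catch such a script (names are examples) is to check its
status exit code against the LSB spec, which requires 3 for a stopped
service rather than 0:

/etc/init.d/myservice stop
/etc/init.d/myservice status; echo $?   # must print 3 when stopped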

Mit freundlichen Grüßen / Best regards,
 
Robert Köppl
System Administration
---
Phone: +43 3842 805-910
Fax: +43 3842 805-500 
robert.koeppl@@knapp.com 
www.KNAPP.com 
---
KNAPP Systemintegration GmbH 
Waltenbachstrasse 9 
8700 Leoben, Austria 
---
Commercial register number: FN 138870x
Commercial register court: Leoben
---
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Tobias Appel
Well, exactly what I expected happened!
I set the 2nd node to standby - it had no resources running. We stopped 
Heartbeat on the 2nd node and did some maintenance. When we started 
Heartbeat again it joined the cluster as Online-standby and guess what!

The resources on node 01 were getting stopped and restarted by heartbeat!

Now why the hell did heartbeat do this and how can I stop heartbeat from 
doing this in the future?

Another very weird thing was that it did not stop all the resources.
We have configured one resource group only, containing 6 resources in 
the following order:
mount filesystem
virtual ip
afd
cups
nfs
mailto notification

it stopped the mailto and tried to stop NFS, which failed since NFS was
in use; instead of going into an unmanaged state, it just left it
running and started mailto again.
No error was shown in crm_mon and the cluster luckily for us kept on 
running. But we did get 2 emails from mailto.

Now why did Heartbeat behave like this? We even had a constraint in 
place which forces the resource group on node 01 (score infinity).

If anyone can shed any light on this matter, please do. This is
essential for me.

Regards,
Tobi


Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
> 
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
> 
> Cluster resources must never be started automatically by the node at boot 
> time.
> 
>> 3. nothing happens
>>
>> Now I don't know why 1. or 2. happen but I remember seeing a mail on the
>> mailing list from someone with a similiar problem. Is there any way to
>> make sure heartbeat does NOT touch the resources, especially not
>> restarting or re-locating them?
>>
>> Thanks in advance,
>> Tobi
>> ___
>> Linux-HA mailing list
>> Linux-HA@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Resource doesn't come online after failure-timeout has expired

2009-05-29 Thread Koen Verwimp

Hi, 

  

I’m using Heartbeat 2.99.1 and Pacemaker 1.0.2. 

  

I have configured a resource with the migration-threshold=2 and 
failure-timeout=60s. When I simulate 2 resource failures per server, the 
resource is automatically migrated to the other node (docucluster04) as
expected.

  

group_color: virtual-ip-alfresco allocation score on docucluster03: -100 

group_color: virtual-ip-alfresco allocation score on docucluster04: 0 

native_color: virtual-ip-alfresco allocation score on docucluster03: -100 

native_color: virtual-ip-alfresco allocation score on docucluster04: 0 

  

After 60s (the failure-timeout) I can see the following scores, but the
resource is not started back on the original node (docucluster03).

  

group_color: virtual-ip-alfresco allocation score on docucluster03: 100 

group_color: virtual-ip-alfresco allocation score on docucluster04: 0 

native_color: virtual-ip-alfresco allocation score on docucluster03: 100 

native_color: virtual-ip-alfresco allocation score on docucluster04: 0 

  

I have tried to set the ‘cluster-recheck-interval’ option, which tells the
cluster to periodically recalculate the ideal state of the cluster … but the
resource still isn’t automatically migrated back to the original node
(docucluster03).
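
For reference, I set it roughly like this via the crm shell (the value is an
example):

crm configure property cluster-recheck-interval=60s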

  

After executing cibadmin -B (bump), the resource is migrated to docucluster03.

  

Any idea what to do to get an automatic migration back after the
failure-timeout expires?

  

Best regards, 

Koen 

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Resources get restarted when a node joins the cluster

2009-05-29 Thread Jan Kalcic
Andrew Beekhof wrote:
> On Tue, May 26, 2009 at 2:56 PM, Tobias Appel  wrote:
>   
>> Hi,
>>
>> In the past sometimes the following happened on my Heartbeat 2.1.14 cluster:
>>
>> 2-Node Cluster, all resources run one node - no location constraints
>> Now I restarted the "standby" node (which had no resources running but
>> was still active inside the cluster).
>> When it came back online and joined the cluster again 3 different
>> scenarios happened:
>>
>> 1. all resources failed over to the newly joined node
>> 2. all resources stay on the current node but get restarted!
>> 
>
> Usually 1 and 2 occur when services are started by the node when it
> boots up (ie. not by the cluster).
> The cluster then detects this, stops them everywhere and starts them
> on just one node.
>
> Cluster resources must never be started automatically by the node at boot 
> time.
>
>   
I noticed the same behaviour. Once the standby node is activated
again, the resources stay on the same node but get restarted. The
standby server is not restarted at all and no services are started along
with it. In my case the resources were Xen domains.

Thanks,
Jan
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] show scores per resource/node

2009-05-29 Thread Michael Schwartzkopff
On Friday, 29 May 2009 at 09:39:25, Koen Verwimp wrote:
> Hi!
>
>  
>
> In Heartbeat/Pacemaker, scores are calculated on a per-resource basis, and any
> node with a negative score can’t run that resource. After calculating the
> scores for a resource, the cluster then chooses the node with the highest
> one.
>
>  
>
> Is it possible to show the calculated score on a node for a specific
> resource?
>
>  
>
> Thanks,
>
> Koen

ptest -s -L
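
The same works against a saved copy of the CIB, e.g.:

cibadmin -Q > /tmp/cib.xml
ptest -s -x /tmp/cib.xml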

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] show scores per resource/node

2009-05-29 Thread Koen Verwimp

Hi! 

  

In Heartbeat/Pacemaker, scores are calculated on a per-resource basis, and any
node with a negative score can’t run that resource. After calculating the 
scores for a resource, the cluster then chooses the node with the highest one. 

  

Is it possible to show the calculated score on a node for a specific resource? 

  

Thanks, 

Koen 

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems