Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 26.04.2010 16:37, Oliver Heinz wrote: Still segfaults. Urgh... I cannot reproduce this. I'm using virtualized 64-bit KVM machines with ppa:ubuntu-ha/lucid-cluster and I get no segfaults. Your CIB is almost identical to mine (I have no IP failover, but I do have DRBD). Could you add rele

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Monday, 26 April 2010, at 18:50:24, Lars Ellenberg wrote: > On Mon, Apr 26, 2010 at 04:37:37PM +0200, Oliver Heinz wrote: > > On Monday, 26 April 2010, 15:58:51, Ante Karamatić wrote: > > > On 26.04.2010 14:42, Oliver Heinz wrote: > > > > Thanks for that information. I rebuild the complete st

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Tuesday, 27 April 2010, at 09:15:58, Ante Karamatić wrote: > On 26.04.2010 16:37, Oliver Heinz wrote: > > Still segfaults. > > Urgh... I cannot reproduce this. I'm using virtualized 64-bit KVM > machines with ppa:ubuntu-ha/lucid-cluster and I get no segfaults. Your > CIB is almost identical

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-27 Thread renayama19661014
Hi Andrew, > > Done. > > http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f7da9d09ebd2 It definitely seems to work with your patch. However, the following error appears in the log. Is this error a problem? Apr 27 16:37:00 srv01 pengine: [5839]: ERROR: native_merge_weights: Appl

[Pacemaker] Promote first, then try restart

2010-04-27 Thread Lubo Drobny
Hello, I have this situation: 2 nodes, M/S resource. On the first node the Master resource crashed completely; pacemaker tried to restart this resource, but it could not run. After the start timeout expired, pacemaker promoted the Master on the second node. Is it possible to set up pacemaker to first promote the Master on the second

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 27.04.2010 09:41, Oliver Heinz wrote: Pål Simensen reported that he has the same segfault error, so it would be interesting if the new packages fixed it for him. Did they? We talked yesterday and it seems that the segfault happens on only one machine, but I'm sure he'll have more details

Re: [Pacemaker] Is it possible for ocf:heartbeat:IPaddr2 to be on different NICs?

2010-04-27 Thread Dejan Muhamedagic
Hi, On Mon, Apr 26, 2010 at 10:38:21AM -0400, daniel qian wrote: > > On 2010-04-26, at 3:11 AM, Dejan Muhamedagic wrote: > > > Hi, > > > > On Fri, Apr 23, 2010 at 12:13:31PM -0400, daniel qian wrote: > >> > >> On 2010-04-23, at 12:04 PM, Dejan Muhamedagic wrote: > >> > >>> Hi, > >>> > >>> On

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 27.04.2010 09:41, Oliver Heinz wrote: Pål Simensen reported that he has the same segfault error, so it would be interesting if the new packages fixed it for him. Did they? What's your network setup? Do you use bonding or DHCP?

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Tuesday, 27 April 2010, 12:44:57, Ante Karamatić wrote: > On 27.04.2010 09:41, Oliver Heinz wrote: > > Pål Simensen reported that he has the same segfault error, so it would > > be interesting if the new packages fixed it for him. Did they? > > What's your network setup? Do you

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Lars Ellenberg
On Tue, Apr 27, 2010 at 09:33:27AM +0200, Oliver Heinz wrote: > On Monday, 26 April 2010, at 18:50:24, Lars Ellenberg wrote: > > On Mon, Apr 26, 2010 at 04:37:37PM +0200, Oliver Heinz wrote: > > > On Monday, 26 April 2010, 15:58:51, Ante Karamatić wrote: > > > > On 26.04.2010 14:42, Oliver Heinz

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 27.04.2010 12:58, Oliver Heinz wrote: I use bonding, VLANs and bridge interfaces. No DHCP, just fixed addresses. Two other people with this issue also use bonding, and after disabling bonding the issue was gone. Could you please verify this?

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 27.04.2010 12:58, Oliver Heinz wrote: I use bonding, VLANs and bridge interfaces. No DHCP, just fixed addresses. As a workaround, you could run: sudo update-rc.d -f corosync disable S and then add 'post-up /etc/init.d/corosync start' to the bonding interface in /etc/network/interfaces.
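The workaround quoted above might look roughly like the following sketch. Only the update-rc.d invocation and the post-up line come from the message; the interface name bond0, the address values and the scratch-file path are assumptions made up for illustration, and the bonding options a real stanza would carry are left out. The script only writes a candidate stanza to a scratch file for review and does not touch /etc/network/interfaces.

```shell
#!/bin/sh
# Sketch only: bond0, the addresses and the scratch path are hypothetical.
set -eu

stanza="${TMPDIR:-/tmp}/corosync-postup-stanza.txt"

# Candidate stanza to merge by hand into /etc/network/interfaces.
# The post-up line (quoted from the thread) starts corosync once the
# interface has been brought up.
cat > "$stanza" <<'EOF'
auto bond0
iface bond0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    post-up /etc/init.d/corosync start
EOF

echo "Review $stanza, merge it into /etc/network/interfaces, then run:"
echo "  sudo update-rc.d -f corosync disable S"
```

The idea, as far as the thread describes it, is that corosync is no longer started by the normal boot sequence and is instead started from the interface hook.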

[Pacemaker] Need an idea for dynamic configuration dependng on resource distribution

2010-04-27 Thread Andreas Mock
Hi all, I need an idea how I could achieve the following with corosync/pacemaker. 2 servers in a cluster. Each server is running a resource on its own by pacemaker configuration in the default case (everything o.k.). In this scenario the resources shall take as much "server estate" as possible.

Re: [Pacemaker] Need an idea for dynamic configuration dependng on resource distribution

2010-04-27 Thread Michael Schwartzkopff
On Tuesday, 27 April 2010, 14:45:21, Andreas Mock wrote: > Hi all, > > I need an idea how I could achieve the following with corosync/pacemaker. > 2 servers in a cluster. Each server is running a resource on its own by > pacemaker configuration in the default case (everything o.k.). In this > sce

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Tuesday, 27 April 2010, 13:01:36, Ante Karamatić wrote: > On 27.04.2010 12:58, Oliver Heinz wrote: > > I use bonding, VLANs and bridge interfaces. No DHCP, just fixed addresses. > > Two other people with this issue also use bonding, and after disabling > bonding the issue was gone. Could you please

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Fabio M. Di Nitto
On 4/27/2010 3:26 PM, Oliver Heinz wrote: > On Tuesday, 27 April 2010, 13:01:36, Ante Karamatić wrote: >> On 27.04.2010 12:58, Oliver Heinz wrote: >>> I use bonding, VLANs and bridge interfaces. No DHCP, just fixed addresses. >> >> Two other people with this issue also use bonding, and after disabli

Re: [Pacemaker] Need an idea for dynamic configuration dependng on resource distribution

2010-04-27 Thread Andreas Mock
-Original Message- From: Michael Schwartzkopff Sent: 27.04.2010 14:51:58 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Need an idea for dynamic configuration dependng on resource distribution >On Tuesday, 27 April 2010, 14:45:21, Andreas Mock wrote: >>

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Tuesday, 27 April 2010, 14:18:27, Ante Karamatić wrote: > On 27.04.2010 12:58, Oliver Heinz wrote: > > I use bonding, VLANs and bridge interfaces. No DHCP, just fixed addresses. > > As a workaround, you could: > > sudo update-rc.d -f corosync disable S > > add 'post-up /etc/init.d/corosync st

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Oliver Heinz
On Tuesday, 27 April 2010, 13:00:27, Lars Ellenberg wrote: > On Tue, Apr 27, 2010 at 09:33:27AM +0200, Oliver Heinz wrote: .. > > I installed every -dbg package that is available for any package > > installed on the system (just to be sure). There are no debug packages > > for most of the cluster

[Pacemaker] pacemaker + drbd + mysql = confusion

2010-04-27 Thread Oliver Hoffmann
Hi all! I had a working two-node-drbd-cluster with the following config. (Ubuntu 10.04 Server amd64, upgraded today) node $id="05570095-f264-41ae-a609-768fd4a3b7e8" store2 node $id="e790fa15-1f91-4442-94e9-bf411519c4f8" store1 primitive drbd0 ocf:linbit:drbd \ params drbd_resource="raid"

[Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Michael Brown
Greetings, I've discovered that if I have ocf::hb:Xen resources as primitives, they migrate to other nodes on failover. But if they are members of a group, the domUs stop/start. Am I doing it wrong? I'd like to use the groups to minimize the amount of config I need for each VM. Here's my current

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Ante Karamatić
On 27.04.2010 15:26, Oliver Heinz wrote: bad news is: it's not only bonding related. It's not bonding at all. It's an upstart issue: corosync is started before all network interfaces are up. This issue is most visible with bridging, because bridged interfaces become functional much later than

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2010-04-27 Thread Andrew Beekhof
Suggested patch: diff --git a/group/dlm_controld/pacemaker.c b/group/dlm_controld/pacemaker.c index c661343..93c1841 100644 --- a/group/dlm_controld/pacemaker.c +++ b/group/dlm_controld/pacemaker.c @@ -123,7 +123,7 @@ void dlm_process_node(gpointer key, gpointer value, gpointer user_data) } e

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 6:39 PM, Michael Brown wrote: > Greetings, > > I've discovered that if I have ocf::hb:Xen resources as primitives, they > migrate to other nodes on failover. > > But if they are members of a group, the domUs stop/start. Am I doing it > wrong? No. The semantics of groups p

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Roberto Giordani
Hello, does this mean that if I create a group with a file system + Xen, live migration doesn't work? Regards, Roberto. On 04/27/2010 08:41 PM, Andrew Beekhof wrote: On Tue, Apr 27, 2010 at 6:39 PM, Michael Brown wrote: Greetings, I've discovered that if I have ocf::hb:Xen resources as pri

Re: [Pacemaker] pacemaker + drbd + mysql = confusion

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 6:21 PM, Oliver Hoffmann wrote: > Hi all! > > > I had a working two-node-drbd-cluster with the following config. (Ubuntu > 10.04 Server amd64, upgraded today) > > node $id="05570095-f264-41ae-a609-768fd4a3b7e8" store2 > node $id="e790fa15-1f91-4442-94e9-bf411519c4f8" store1

Re: [Pacemaker] Need an idea for dynamic configuration dependng on resource distribution

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 2:45 PM, Andreas Mock wrote: > Hi all, > > I need an idea how I could achieve the following with corosync/pacemaker. > 2 servers in a cluster. Each server is running a resource on its own by > pacemaker > configuration in the default case (everything o.k.). In this scenari

Re: [Pacemaker] Promote first, then try restart

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 9:51 AM, Lubo Drobny wrote: > Hello, > > I have this situation: > 2 nodes, M/S resource. > > On the first node the Master resource crashed completely; pacemaker tried to restart this > resource, but it could not run. After the start timeout expired, pacemaker promoted the > Master on the second node. >

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 9:42 AM, wrote: > Hi Andrew, > >> > Done. >> > http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f7da9d09ebd2 > > > It definitely seems to work with your patch. > > However, the following error appears in the log. > Is this error a problem? > > Apr 27 16:37:0

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 8:54 PM, Roberto Giordani wrote: > Hello, > does this mean that if I create a group with a file system + Xen, live > migration doesn't work? Correct. You'd need to use a cluster filesystem (or some other arrangement that could be cloned sanely) and colocate the xen instance w

Re: [Pacemaker] SLES11+HAE: Resources on a single node with two configured?

2010-04-27 Thread Andrew Beekhof
2010/4/26 Aleksey Zholdak: > Andrew Beekhof: > What do you mean here? Need logs? I am pleased to show you! Only their size > will occupy much space here, tell me what to choose and what to look ... Better to just send everything and compress them first.

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Michael Brown
Is there A Better Way to do this, or is the best approach to colocate each domU with the group? M. On 04/27/2010 02:41 PM, Andrew Beekhof wrote: > On Tue, Apr 27, 2010 at 6:39 PM, Michael Brown wrote: > >> Greetings, >> >> I've discovered that if I have ocf::hb:Xen resources as primitives, th

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 9:12 PM, Michael Brown wrote: > Is there A Better Way to do this, or is the best approach to colocate > each domU with the group? In order for the domU to migrate, everything it needs must be on the current host AND already at the destination. ONLY cloned resources satisfy
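One possible reading of this advice, sketched in crm shell syntax: clone a cluster filesystem so it is active on every node, then colocate and order each domU with the clone instead of placing it in a group. Every resource name, device, directory and xmfile path below is hypothetical, the fragment has not been checked against the posters' clusters, and it omits the lock-manager layers that a real OCFS2 setup would also need. The script only writes the fragment to a scratch file for review.

```shell
#!/bin/sh
# Sketch only: every resource name, device and path below is hypothetical.
set -eu

cfg="${TMPDIR:-/tmp}/xen-migrate-sketch.crm"

# crm shell fragment: a cloned cluster filesystem plus one migratable domU.
cat > "$cfg" <<'EOF'
primitive sharedfs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/srv/xen" fstype="ocfs2" \
    op monitor interval="20s"
clone sharedfs-clone sharedfs meta interleave="true"
primitive vm1 ocf:heartbeat:Xen \
    params xmfile="/etc/xen/vm1.cfg" \
    meta allow-migrate="true" \
    op monitor interval="30s"
colocation vm1-with-fs inf: vm1 sharedfs-clone
order fs-before-vm1 inf: sharedfs-clone vm1
EOF

echo "Wrote $cfg; after review it could be loaded with:"
echo "  crm configure load update $cfg"
```

Because the filesystem clone is already running on the destination node, the domU's dependency is satisfied on both sides, which is the condition described above for migration rather than stop/start.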

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Roberto Giordani
Hi Andrew, the mistake is probably mine. I'd like to add a second file system mounted in the Xen guest. This is not the main file system on which the OS runs, but one that the Xen machine uses, for example, for file sharing. Is that not possible? Regards, Roberto On 04/27/2010

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Sander van Vugt
Andrew, On Tue, 2010-04-27 at 21:09 +0200, Andrew Beekhof wrote: > On Tue, Apr 27, 2010 at 8:54 PM, Roberto Giordani > wrote: > > Hello, > > does this mean that if I create a group with a file system + Xen, live > > migration doesn't work? > > Correct. > You'd need to use a cluster filesystem (or

[Pacemaker] Resource monitoring stops suddenly

2010-04-27 Thread Michael Brown
I've written a custom plugin to monitor RAID status and it's working great. Except for some reason, pacemaker seems to stop dispatching checks: xenhost1:~ # grep RaidStatus /var/log/messages | tail -n4 Apr 27 15:38:14 xenhost1 RaidStatus[25972]: [25977]: INFO: status optimal Apr 27 15:38:44 xenhos

Re: [Pacemaker] Resource monitoring stops suddenly

2010-04-27 Thread Florian Haas
On 04/27/2010 10:20 PM, Michael Brown wrote: > I've written a custom plugin to monitor RAID status and it's working > great. Except for some reason, pacemaker seems to stop dispatching checks: > > [...] > > Any idea what's happening? Without seeing the resource agent? Unlikely. Version informat

Re: [Pacemaker] Resource monitoring stops suddenly

2010-04-27 Thread Dejan Muhamedagic
Hi, On Tue, Apr 27, 2010 at 04:20:16PM -0400, Michael Brown wrote: > I've written a custom plugin to monitor RAID status and it's working > great. Except for some reason, pacemaker seems to stop dispatching checks: > > xenhost1:~ # grep RaidStatus /var/log/messages | tail -n4 > Apr 27 15:38:14 xe

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-27 Thread renayama19661014
Hi Andrew, > Oh, that was some development logging I forgot to remove. > I'll backport that fix in a moment too. All right. Thanks! Best Regards, Hideo Yamauchi. --- Andrew Beekhof wrote: > On Tue, Apr 27, 2010 at 9:42 AM, wrote: > > Hi Andrew, > > > >> > Done. > >> > > >> > http://hg.c

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 9:56 PM, Sander van Vugt wrote: > Andrew, > > On Tue, 2010-04-27 at 21:09 +0200, Andrew Beekhof wrote: >> On Tue, Apr 27, 2010 at 8:54 PM, Roberto Giordani >> wrote: >> > Hello, >> > this mean that if i create a group with a file system + xen the live >> > migration doen'

Re: [Pacemaker] Only Xen primitives not in groups migrate on failover

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 9:24 PM, Roberto Giordani wrote: > Hi Andrew, > the mistake is probably mine. > I'd like to add a second file system mounted in the Xen guest. This file > system is not the main file system on which the OS runs, but a file system > that the Xen machine uses for file sharing

Re: [Pacemaker] Promote first, then try restart

2010-04-27 Thread Lubo Drobny
> > > > Is it possible to set up pacemaker to first promote the Master on the second node > > and > > then try to restart the resource on the first node. > > Sorry, pacemaker can't do this. > Is there any way to know earlier if the start operation will fail? > Unfortunately, I don't know. But this is quite impor