[Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Hi folks, I'm trying to build a test HA cluster on Solaris 5.11 using libqb 0.14.4, corosync 2.3.0 and pacemaker 1.1.8, and I'm facing a strange problem while starting pacemaker. The log shows the following errors: Mar 25 09:21:26 [33720] lrmd:error: mainloop_add_ipc_server: Could not

[Pacemaker] Patrik Rapposch is out of the office

2013-03-25 Thread Patrik . Rapposch
I will be out of the office from 25.03.2013 and will return on 27.03.2013. Dear Sir or Madam, I am on a business trip until 27.03 inclusive. Nevertheless, I will try to answer your request as quickly as possible. Please always put ksi.network in copy. Please note, that I

Re: [Pacemaker] solaris problem

2013-03-25 Thread LGL Extern
With solaris/openindiana you should use this setting: export PCMK_ipc_type=socket Andreas -Original Message- From: Andrei Belov [mailto:defana...@gmail.com] Sent: Monday, 25 March 2013 10:43 To: pacemaker@oss.clusterlabs.org Subject: [Pacemaker] solaris problem Hi folks,
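A minimal sketch of the suggested setting, run before starting pacemaker in the foreground (pacemakerd -fV, as used elsewhere in this thread):

    export PCMK_ipc_type=socket
    pacemakerd -fV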

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas, just tried pacemakerd -fV with PCMK_ipc_type=socket - a bunch of additional event_send errors appeared: Mar 25 11:15:55 [33641] ha1 corosync error [MAIN ] event_send returned -32, expected 256! Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send returned -32, expected 217!

Re: [Pacemaker] CMAN, corosync pacemaker

2013-03-25 Thread Lars Marowsky-Bree
On 2013-03-21T15:28:17, Leon Fauster leonfaus...@googlemail.com wrote: I believe the preferred pacemaker based HA configuration in RHEL 6.4 uses all three packages and the preferred configuration in SLES11 SP2 is just corosync/pacemaker (I do not believe CMAN is even available in

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-25 Thread Viacheslav Dubrovskyi
On 23.03.2013 08:27, Viacheslav Dubrovskyi wrote: Hi. I'm building a package for my distribution. Everything is built, but the package does not pass our internal tests. I get errors like this: verify-elf: ERROR: ./usr/lib/libpe_status.so.4.1.0: undefined symbol: get_object_root This means that
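For reference, unresolved symbols in a built shared library can be listed with ldd -r, which checks both data objects and functions:

    ldd -r ./usr/lib/libpe_status.so.4.1.0 2>&1 | grep 'undefined symbol'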

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
I've rebuilt libqb using a separate SOCKETDIR (/var/run/qb) and set hacluster:haclient ownership on this dir. After that, pacemakerd started successfully with all its children: [root@ha1 /var/run/qb]# pacemakerd -fV Could not establish pacemakerd connection: Connection refused (146)
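A sketch of the rebuild described above; the --with-socket-dir option is an assumption to be checked against ./configure --help in the libqb tree:

    ./configure --with-socket-dir=/var/run/qb
    make && make install
    chown hacluster:haclient /var/run/qb   # match pacemaker's user and group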

[Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
Hi, I'm currently trying to create a two-node redundant NFS setup on CentOS 6.4 using pacemaker and crmsh. I used this document as a starting point: https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha_techguides/book_sleha_techguides.html The first issue is that using these
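For context, the tech guide's layout boils down to a crm configuration roughly like the following sketch (resource and device names here are assumptions, not the poster's):

    primitive p_drbd_nfs ocf:linbit:drbd \
        params drbd_resource="nfs" op monitor interval="30"
    ms ms_drbd_nfs p_drbd_nfs \
        meta master-max="1" clone-max="2" notify="true"
    primitive p_fs_nfs ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/srv/nfs" fstype="ext4"
    colocation col_fs_on_drbd inf: p_fs_nfs ms_drbd_nfs:Master
    order o_drbd_before_fs inf: ms_drbd_nfs:promote p_fs_nfs:start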

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
I just found the following in the dmesg output which might or might not add to understanding the problem: device-mapper: table: 253:2: linear: dm-linear: Device lookup failed device-mapper: ioctl: error adding target to table Regards, Dennis On 25.03.2013 13:04, Dennis Jacobfeuerborn wrote:

Re: [Pacemaker] racing crm commands... last write wins?

2013-03-25 Thread Dejan Muhamedagic
On Wed, Mar 20, 2013 at 10:40:10AM -0700, Bob Haxo wrote: Regarding the replace triggering a DC election ... which is causing issues with scripted installs ... how do I determine which crm commands will NOT trigger this election? It seems like every configure commit could possibly result in

[Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo
Hello, I am a newbie with pacemaker (and, generally, with HA clusters). I have configured a two-node cluster. Both nodes are virtual machines (VMware ESX) and use shared storage (provided by a SAN, although access to the SAN is from the ESX infrastructure and the VMs see it as a SCSI disk). I

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread emmanuel segura
I have a production cluster using two VMs on an ESX cluster; for stonith I'm using sbd, and everything works fine 2013/3/25 emmanuel segura emi2f...@gmail.com I have a production cluster using two VMs on an ESX cluster; for stonith I'm using sbd, and everything works fine 2013/3/25 Angel L. Mateo

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread emmanuel segura
I have a production cluster using two VMs on an ESX cluster; for stonith I'm using sbd, and everything works fine 2013/3/25 Angel L. Mateo ama...@um.es Hello, I am a newbie with pacemaker (and, generally, with HA clusters). I have configured a two-node cluster. Both nodes are virtual machines
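A rough sketch of such an sbd setup (the shared device path is an assumption):

    sbd -d /dev/mapper/shared_disk create   # initialize the sbd slots once
    # and in the crm configuration:
    primitive stonith_sbd stonith:external/sbd \
        params sbd_device="/dev/mapper/shared_disk"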

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Ok, I fixed this issue with the following patch against libqb 0.14.4: --- lib/unix.c.orig 2013-03-25 12:30:50.445762231 + +++ lib/unix.c 2013-03-25 12:49:59.322276376 + @@ -83,7 +83,7 @@ #if defined(QB_LINUX) || defined(QB_CYGWIN) snprintf(path, PATH_MAX,
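A sketch of applying the fix before rebuilding, assuming the diff above is saved to a file (the file name and the --with-socket-dir option are assumptions):

    cd libqb-0.14.4
    patch -p0 < solaris-socketdir.patch
    ./configure --with-socket-dir=/var/run/qb
    make && make install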

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Jacek Konieczny
On Mon, 25 Mar 2013 13:54:22 +0100 My problem is how to avoid a split brain situation with this configuration, without configuring a 3rd node. I have read about quorum disks, the external/sbd stonith plugin and other references, but I'm too confused by all this. For example, [1]

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo
Jacek Konieczny jaj...@jajcus.net wrote: On Mon, 25 Mar 2013 13:54:22 +0100 My problem is how to avoid a split brain situation with this configuration, without configuring a 3rd node. I have read about quorum disks, the external/sbd stonith plugin and other references, but I'm too

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas, thank you for sharing this link and your start script! My goal is to make it possible to build these tools in a more convenient way via NetBSD's pkgsrc system. Perhaps using something like --localstatedir=${VARBASE}/cluster for libqb, corosync and pacemaker, and setting the
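The idea sketched above, applied to each of the three packages (only the state-dir flag is shown; VARBASE as defined by pkgsrc):

    ./configure --localstatedir="${VARBASE}/cluster"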

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Jacek Konieczny
On Mon, 25 Mar 2013 20:01:28 +0100 Angel L. Mateo ama...@um.es wrote:

    quorum {
        provider: corosync_votequorum
        expected_votes: 2
        two_node: 1
    }

Corosync will then manage quorum for the two-node cluster and Pacemaker I'm using corosync 1.1 which is the one provided with
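For reference, with corosync 2.x the resulting votequorum state can be verified like this (corosync 1.x, as used by the poster, does not ship votequorum):

    corosync-quorumtool -s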

Re: [Pacemaker] OCF Resource agent promote question

2013-03-25 Thread Andreas Kurz
Hi Steve, On 2013-03-25 18:44, Steven Bambling wrote: All, I'm trying to work on an OCF resource agent that uses postgresql streaming replication. I'm running into a few issues that I hope might be answered, or at least some pointers given to steer me in the right direction. Why are you
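For orientation, a promote action in such an agent typically boils down to something like this minimal sketch (paths and values are assumptions, not the poster's agent; the OCF_* constants come from the sourced ocf-shellfuncs):

    pgsql_promote() {
        # Leave standby mode (PostgreSQL 9.1+).
        su - postgres -c "pg_ctl promote -D /var/lib/pgsql/data" \
            || return $OCF_ERR_GENERIC
        # Raise this node's master preference so the cluster records the new role.
        crm_master -l reboot -v 100
        return $OCF_SUCCESS
    }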

Re: [Pacemaker] Resource is Too Active (on both nodes)

2013-03-25 Thread Andreas Kurz
On 2013-03-22 21:35, Mohica Jasha wrote: Hey, I have two cluster nodes. I have a service process which is prone to crashing and takes a very long time to start. Since it takes so long to start, I keep the service process running on both nodes, but only the active node

Re: [Pacemaker] issues when installing on pxe booted environment

2013-03-25 Thread Andreas Kurz
On 2013-03-22 19:31, John White wrote: Hello Folks, We're trying to get a corosync/pacemaker instance going on a 4-node cluster that boots via PXE. There have been a number of state/file system issues, but those appear to be *mostly* taken care of thus far. We're running into an

Re: [Pacemaker] pacemaker node stuck offline

2013-03-25 Thread Andreas Kurz
On 2013-03-22 03:39, pacema...@feystorm.net wrote: On 03/21/2013 11:15 AM, Andreas Kurz wrote: On 2013-03-21 14:31, Patrick Hemmer wrote: I've got a 2-node cluster where it seems last night one of the nodes went offline, and I can't see any reason why. Attached are the logs from the 2

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's interesting is that things get fixed when I issue a simple cleanup for the filesystem resource. This is what my current config looks like: node
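The cleanup step the poster mentions, assuming the Filesystem resource is named p_fs_nfs (the name is an assumption):

    crm resource cleanup p_fs_nfs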