Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread andreas graeper
(in addition) n1 (former active/master node) is still stopped. n2: corosync stop + start. now drbd is slave on n2 and nothing else starts. there is a location-constraint: rsc_location rsc=ms_drbd_r0 id=drbd-fence-by-handler-r0-ms_drbd_r0 rule role=Master score=-INFINITY

Re: [Pacemaker] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-26 Thread Jacek Konieczny
On Wed, 26 Jun 2013 14:35:03 +1000 Andrew Beekhof and...@beekhof.net wrote: Urgh: info Jun 25 13:40:10 lrmd_ipc_connect(913):0: Connecting to lrmd trace Jun 25 13:40:10 pick_ipc_buffer(670):0: Using max message size of 51200 error Jun 25 13:40:10 qb_sys_mmap_file_open(92):2147483648:

Re: [Pacemaker] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-26 Thread Andrew Beekhof
Sent from a mobile device On 26/06/2013, at 5:44 PM, Jacek Konieczny jaj...@jajcus.net wrote: On Wed, 26 Jun 2013 14:35:03 +1000 Andrew Beekhof and...@beekhof.net wrote: Urgh: info Jun 25 13:40:10 lrmd_ipc_connect(913):0: Connecting to lrmd trace Jun 25 13:40:10

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Tue, 25 Jun 2013 13:34:30 -0400 (EDT) Jake Smith jsm...@argotec.com wrote: I'm guessing if you run: grep user /sys/fs/ocfs2/loaded_cluster_plugins 2>&1; rc=$? you're going to return a 1 or something other than 0. Hi Jake, true, as long as ocfs2_stack_user isn't loaded. I loaded it and now
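The check Jake suggests can be sketched in Python roughly as follows. This is only an illustration, assuming the file lists one loaded plugin name per line; the function names `stack_plugin_loaded` and `grep_exit_code` are made up here and mirror grep's substring match rather than any OCFS2 API.

```python
def stack_plugin_loaded(plugins_text: str, name: str = "user") -> bool:
    """Return True if any line of the text contains `name` (substring match, like grep)."""
    return any(name in line for line in plugins_text.splitlines())


def grep_exit_code(plugins_text: str, name: str = "user") -> int:
    """Mimic grep's exit status: 0 when a line matches, 1 when none does."""
    return 0 if stack_plugin_loaded(plugins_text, name) else 1
```

On a real node the text would come from reading /sys/fs/ocfs2/loaded_cluster_plugins; a non-zero result corresponds to the case where ocfs2_stack_user is not loaded.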

Re: [Pacemaker] [OT] MySQL Replication

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 12:35:33 +1000 Andrew Beekhof and...@beekhof.net wrote: System is Debian Wheezy which means version 0.11.1-2 for libqb-dev. rpm errors on debian? I'm confused. When you run ./autogen.sh it tries to start an rpm command, this failed because I didn't have rpm installed.

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Lars Marowsky-Bree
On 2013-06-25T17:08:36, Denis Witt denis.w...@concepts-and-training.de wrote: My cluster.conf (I added this later to be able to run tunefs.ocfs2 --update-cluster-stack): This indicates you have a 'wrong stack' on disk still. You need to run mkfs.ocfs2/tunefs.ocfs2 while the o2cb cluster resource

Re: [Pacemaker] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-26 Thread Jacek Konieczny
On Wed, 26 Jun 2013 18:38:37 +1000 Andrew Beekhof and...@beekhof.net wrote: trace Jun 25 13:40:10 gio_read_socket(366):0: 0xa6c140.4 1 (ref=1) trace Jun 25 13:40:10 lrmd_ipc_accept(89):0: Connection 0xa6d110 info Jun 25 13:40:10 crm_client_new(276):0: Connecting 0xa6d110 for uid=17

Re: [Pacemaker] GPU Processing

2013-06-26 Thread Lars Marowsky-Bree
On 2013-06-25T12:03:22, Colin Blair cbl...@technicacorp.com wrote: Andrew, Does Pacemaker support GPU processes? Pacemaker is not very CPU intensive; what would it use a GPU for? Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Lars Marowsky-Bree
On 2013-06-25T20:28:29, Andrew Beekhof and...@beekhof.net wrote: Perhaps a numbering scheme like the Linux kernel would fit better than a stable/unstable branch distinction. Changes that deserve the unstable term are really really rare (and I'm sure we've all learned from them), so it may

Re: [Pacemaker] GPU Processing

2013-06-26 Thread Vladislav Bogdanov
26.06.2013 12:07, Lars Marowsky-Bree wrote: On 2013-06-25T12:03:22, Colin Blair cbl...@technicacorp.com wrote: Andrew, Does Pacemaker support GPU processes? Pacemaker is not very CPU intensive; what would it use a GPU for? Finding strict optimal solution for utilization-based placement

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 11:07:05 +0200 Lars Marowsky-Bree l...@suse.com wrote: This indicates you have a 'wrong stack' on disk still. You need to run mkfs.ocfs2/tunefs.ocfs2 while the o2cb cluster resource is running, or to set it to pcmk manually. Hi Lars, at the moment I assume I have a

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 12:24 AM, Digimer li...@alteeve.ca wrote: On 06/25/2013 07:29 AM, andreas graeper wrote: hi, maybe again and again the same question, please excuse. two nodes (n1 active / n2 passive) and `service corosync stop` on active. does the node that is going down tell the

Re: [Pacemaker] [OT] MySQL Replication

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 6:51 PM, Denis Witt denis.w...@concepts-and-training.de wrote: On Wed, 26 Jun 2013 12:35:33 +1000 Andrew Beekhof and...@beekhof.net wrote: System is Debian Wheezy which means version 0.11.1-2 for libqb-dev. rpm errors on debian? I'm confused. When you run

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 7:30 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-25T20:28:29, Andrew Beekhof and...@beekhof.net wrote: Perhaps a numbering scheme like the Linux kernel would fit better than a stable/unstable branch distinction. Changes that deserve the unstable term are really

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread jsmith
Original message From: Denis Witt denis.w...@concepts-and-training.de Date: 06/26/2013 6:46 AM (GMT-05:00) To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] ERROR: Wrong stack o2cb On Wed, 26 Jun 2013 11:07:05 +0200 Lars

[Pacemaker] Questions regarding STONITH

2013-06-26 Thread Paul Walsh
Situation: Two HP DL380 G7 (with ILO3) servers running pacemaker and heartbeat (yes, I know it's deprecated but I haven't got round to using corosync yet) on RHEL 6.4 (NOT using the Red Hat Cluster suite). The servers are in different data centres. node-a

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 07:53:37 -0400 (EDT) jsmith jsm...@argotec.com wrote: You could start ocfs2 in the cluster just disable/remove the filesystem resource for now. Once pacemaker has started ocfs2 I believe you can do what you need?  Hi Jake, Node test4: standby Online: [ test4-node1

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Lars Marowsky-Bree
On 2013-06-26T21:31:14, Andrew Beekhof and...@beekhof.net wrote: Distributions can take care of them when they integrate them; basically they'll trickle through until the whole stack the distributions ship builds again. If we let 2.0.x be anything like 1.1.x, I suspect this would be rather

Re: [Pacemaker] weird drbd/cluster behaviour

2013-06-26 Thread Саша Александров
Any ideas? :-( 2013/6/25 Саша Александров shurr...@gmail.com Hi all! I am setting up a new cluster on OracleLinux 6.4 (well, it is CentOS 6.4). I went through http://clusterlabs.org/quickstart-redhat.html Then I installed DRBD 8.4.2 from elrepo. This setup is unusable :-( with DRBD 8.4.2.

[Pacemaker] Enhancement requests for Pacemaker

2013-06-26 Thread Michael Furman
Dear Pacemaker community! We have almost completed our evaluation of Pacemaker + Corosync + Cman. We think it is a great HA framework! Also, great support from the community! Thank you for this! We think that the following features will help for the easier integration of Pacemaker with complex

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread Michael Schwartzkopff
On Wednesday, 26 June 2013, 16:36:43, andreas graeper wrote: hi and thanks. a primitive can be moved to another node. how can i move (change roles) of drbd:master to the other node ? Switch off the other node. and will all the depending resources follow? Depends on your

Re: [Pacemaker] weird drbd/cluster behaviour

2013-06-26 Thread Digimer
I don't see fencing/stonith configured. Without it, your cluster will not be stable. You will get DRBD split-brains easily and depending on what you use DRBD for, you could corrupt your data. On 06/25/2013 09:25 AM, Саша Александров wrote: Hi all! I am setting up a new cluster on OracleLinux

Re: [Pacemaker] Questions regarding STONITH

2013-06-26 Thread Digimer
You have a few issues here. First up, a node can *never* be expected to self-fence. You can intentionally crash a node to see why this is a bad idea; 'echo c > /proc/sysrq-trigger' (this will immediately crash a node!). You need to make sure each node can reach the *other* node's iLO interface.

Re: [Pacemaker] ERROR: Wrong stack o2cb

2013-06-26 Thread Jake Smith
- Original Message - From: Denis Witt denis.w...@concepts-and-training.de To: pacemaker@oss.clusterlabs.org Cc: jsmith jsm...@argotec.com Sent: Wednesday, June 26, 2013 8:35:08 AM Subject: Re: [Pacemaker] ERROR: Wrong stack o2cb On Wed, 26 Jun 2013 07:53:37 -0400 (EDT) jsmith

Re: [Pacemaker] [OT] MySQL Replication

2013-06-26 Thread Denis Witt
On Wed, 26 Jun 2013 21:33:30 +1000 Andrew Beekhof and...@beekhof.net wrote: When you run ./autogen.sh it tries to start an rpm command, this failed because I didn't have rpm installed. How did it fail? That whole block is intended to be skipped if rpm isn't available. if [ -e

Re: [Pacemaker] Pacemaker error trying to add Apache resource

2013-06-26 Thread Jake Smith
- Original Message - From: Colin Blair cbl...@technicacorp.com To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Wednesday, June 26, 2013 10:56:49 AM Subject: [Pacemaker] Pacemaker error trying to add Apache resource All, Couldn't find a solution in

Re: [Pacemaker] weird drbd/cluster behaviour

2013-06-26 Thread Саша Александров
Hi! Fencing is disabled for now, the issue is not with fencing: the question is - why only one out of three DRBD master-slave sets is recognized by pacemaker, even though all three drbd resources are active and configured properly... 2013/6/26 Digimer li...@alteeve.ca I don't see

Re: [Pacemaker] KVM live migration and multipath

2013-06-26 Thread Sven Arnold
Hi, I did not trigger file system corruptions again. So, at this moment it looks like it is important to: - turn caching off - use native aio - *and* use an up-to-date machine type Failure to meet any of these criteria would result in fs corruption. Does this make sense at all? Later part is

Re: [Pacemaker] Pacemaker error trying to add Apache resource

2013-06-26 Thread Colin Blair
Hi Jake, Thank you for the info. I was using OCF. Lsb worked. R, CB -Original Message- From: Jake Smith [mailto:jsm...@argotec.com] Sent: Wednesday, June 26, 2013 12:08 PM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Pacemaker error trying to add Apache resource

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread andreas graeper
thanks for your answer. but the question is still open. when i switch off the active node: though this is done reliably for me, the still passive node wants to know for sure and will kill the (already dead) former active node. i have no stonith-hardware (and i could not find till now stonith:null what

[Pacemaker] Problem with dual-PDU fencing node with redundant PSUs

2013-06-26 Thread Digimer
This question appears to be the same issue asked here: http://oss.clusterlabs.org/pipermail/pacemaker/2013-June/018650.html In my case, I have two fence methods per node; IPMI first with action=reboot and, if that fails, two PDUs (one backing each side of the node's redundant PSUs). Initially I
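The ordering described here (IPMI first, then both PDUs together) can be sketched as below. This is a hypothetical model, not Pacemaker's or any fence agent's code: `fence_node`, the list of methods, and the callables standing in for devices are all assumptions made for illustration. The point it shows is that a method made of two PDUs only counts as successful when every device in it succeeds, since a node with redundant PSUs stays up while one side still has power.

```python
from typing import Callable, Sequence

# A device is modelled as a callable that returns True when its fence action succeeded.
Device = Callable[[], bool]


def fence_node(methods: Sequence[Sequence[Device]]) -> int:
    """Try each method in order.

    Returns the index of the first method whose devices all succeed,
    or -1 if no method fully succeeds.
    """
    for index, devices in enumerate(methods):
        results = [device() for device in devices]  # run every device in this method
        if results and all(results):
            return index
    return -1
```

With one IPMI method followed by one two-PDU method, a result of 1 would mean IPMI failed and both PDU outlets were switched successfully; -1 would mean the node could not be confirmed fenced.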

Re: [Pacemaker] pcs and ping location rule

2013-06-26 Thread Chris Feist
On 06/24/13 16:33, Mailing List SVR wrote: Hi, I defined this clone resource for connectivity check: pcs resource create ping ocf:pacemaker:ping host_list=10.0.2.2 multiplier=1000 dampen=10s op monitor interval=60s pcs resource clone ping ping_clone globally-unique=false this works, but now

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 10:37 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-26T21:31:14, Andrew Beekhof and...@beekhof.net wrote: Distributions can take care of them when they integrate them; basically they'll trickle through until the whole stack the distributions ship builds again.