Re: [Openais] problem to delete resource

2015-02-23 Thread Andrew Beekhof
Looks like the resource is badly configured - to the point that the RA doesn't know how to stop it. That's what this means: > p_drbd_ora_stop_0 on node1 'not configured' (6): call=6, status=complete, > last-rc-change='Mon Feb 2 16:54:19 2015', queued=0ms, exec=26ms > p_drbd_ora_stop_0 on

Re: [Openais] pgsql troubles.

2014-12-03 Thread Andrew Beekhof
You're probably better off taking this to the pacemaker list. I don't think the guys that wrote the postgres agent subscribe here. > On 2 Dec 2014, at 11:50 pm, steve wrote: > > Good Afternoon, > > Sending again now that the holidays are over. > > I am having loads of trouble with pacemaker/coros

Re: [Openais] unmanaged resource failed - how to get back?

2014-06-30 Thread Andrew Beekhof
On 30 Jun 2014, at 9:25 pm, Senftleben, Stefan (itsc) wrote: > Hello, > > I put the cluster into maintenance mode with: crm configure property > maintenance-mode=true . > Afterwards I stopped one resource manually, but after turning off the > maintenance mode, the resource is in status „u

Re: [Openais] Error: -> Need help! cib: [1539]: WARN: cib_peer_callback: Discarding cib_modify message (3) from lxds05: not in our membership

2014-05-19 Thread Andrew Beekhof
s-for-ubuntu-10-04/ > Can somebody confirm that upgrade procedure? Looks reasonable > > Regards > Stefan > > -Original Message- > From: Jan Friesse [mailto:jfrie...@redhat.com] > Sent: Monday, 19 May 2014 10:44 > To: Andrew Beekhof; Senftl

Re: [Openais] Error: -> Need help! cib: [1539]: WARN: cib_peer_callback: Discarding cib_modify message (3) from lxds05: not in our membership

2014-05-18 Thread Andrew Beekhof
On 16 May 2014, at 11:11 pm, Senftleben, Stefan (itsc) wrote: > Hello, > > I hope that someone can help me… > I have a two node pacemaker cluster, with two corosync rings. > Ubuntu 10.04, 64 bit. Pacemaker 1.0.8+hg15494-2ubuntu2, corosync > 1.2.0-0ubuntu1. It _could_ be a pacemaker issue, bu

Re: [Openais] Failure in failover, trouble determining cause and how to correct

2014-04-28 Thread Andrew Beekhof
On 29 Apr 2014, at 4:49 am, Joey D. wrote: > dc-version="1.1.8-7.el6-394e906" \ > cluster-infrastructure="classic openais (with plugin)" \ Please don't use the custom plugin on RHEL6 (and clones), it's likely to go away RealSoonNow(tm). See: http://clusterlabs.org/quickstart-red

Re: [Openais] very slow pacemaker/corosync shutdown

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 8:25 AM, David Lang wrote: > I have been using heartbeat for many years, but am now setting up some new > clusters with pacemaker/corosync. I'm not sure which component is having > problems so I'm sending to both lists. > > These are two machine clusters, configured per the

Re: [Openais] Heartbeat to Openais conversion. cib.xml verification errors

2013-01-03 Thread Andrew Beekhof
On Thu, Jan 3, 2013 at 3:43 AM, Nick Hoare wrote: > Hi > > I am updating SLES 10 SP3 to SLES 11 SP2 and am trying to upgrade a > heartbeat configuration to openais. > > I am following the conversion process as documented and have run the test > conversion $/usr/lib/heartbeat/hb2openais.sh -T /tmp/

Re: [Openais] Corosync 2.0 Feature Request: Replace objdb/confdb with something easier to use

2011-09-07 Thread Andrew Beekhof
On Thu, Aug 25, 2011 at 12:56 PM, Angus Salkeld wrote: > On Mon, Aug 08, 2011 at 09:41:10AM +0200, Jan Friesse wrote: >> Current objdb/confdb is really hard to use, because of all iterationing, >> ... It would be nice to replace it by hash table and thus for simple get >> item or set item, no iter

Re: [Openais] Installing corosync from source

2011-09-07 Thread Andrew Beekhof
On Wed, Sep 7, 2011 at 4:01 PM, Dan Frincu wrote: > Hi, > > On Wed, Sep 7, 2011 at 4:05 AM, Nick Khamis wrote: >> Hello Everyone, >> >> We are moving everything over from heartbeat, after the last update >> brought the cluster to its knees... What we are interested in is >> using corosync, pacem

Re: [Openais] corosync didn't do what I expected

2011-07-31 Thread Andrew Beekhof
Read up on the no-quorum-policy setting. On Sat, Jul 30, 2011 at 5:36 AM, Keith Stevens wrote: > I have the following configuration on two servers netbox1 and netbox2: > > crm(live)configure# show > node netbox1 \ >         attributes standby="off" > node netbox2 > primitive failover-ip ocf:heart

Re: [Openais] Corosync Compatability

2011-07-28 Thread Andrew Beekhof
On Wed, Jul 27, 2011 at 1:28 PM, wrote: > >    Thank you Steve, >    We are currently using corosync-1.2.1 and pacemaker 1.0.10 >    Can we use the same version of pacemaker with corosync-1.4 I'd say it's likely. Try on one node - you'll find out pretty quickly if it's not going to work > > >

Re: [Openais] Corosync quesion - ps auxf output

2011-07-25 Thread Andrew Beekhof
2011/7/26 José Pablo Méndez Soto : > Hello, > > According to http://www.clusterlabs.org/wiki/Debian_Lenny_HowTo, if one > installs pacemaker package alone on a debian based distro, it will install > on top of Corosync, but if one installs as: > > aptitude install pacemaker heartbeat > > then Pacema

Re: [Openais] Compilation error in HEAD

2011-07-04 Thread Andrew Beekhof
borked there. I found it simpler to just allow pacemaker to build without corosync. > > Regards, >  Honza > > Andrew Beekhof wrote: >> >> On Mon, Jul 4, 2011 at 12:16 PM, Andrew Beekhof >> wrote: >>> >>> [12:06 pm] beekhof@iMac ~/Developmen

Re: [Openais] Where can we find information on Corosync/OpenAIS w/out Pacemaker?

2011-07-03 Thread Andrew Beekhof
On Fri, Jul 1, 2011 at 11:23 PM, Whit Blauvelt wrote: > On Fri, Jul 01, 2011 at 11:45:22AM +1000, Andrew Beekhof wrote: > >> Being a generic cluster manager, rather than purpose-built for a specific >> app like a filesystem or database, we don't get involved with things l

Re: [Openais] Compilation error in HEAD

2011-07-03 Thread Andrew Beekhof
On Mon, Jul 4, 2011 at 12:16 PM, Andrew Beekhof wrote: > [12:06 pm] beekhof@iMac ~/Development/cluster/corosync # make > Making all in include > Making all in lcr > Built Live Component Replacement System > Making all in lib > Built shared libs > Making all in exec >

[Openais] Compilation error in HEAD

2011-07-03 Thread Andrew Beekhof
[12:06 pm] beekhof@iMac ~/Development/cluster/corosync # make Making all in include Making all in lcr Built Live Component Replacement System Making all in lib Built shared libs Making all in exec cd .. && /bin/sh /Users/beekhof/Development/cluster/corosync/missing --run automake-1.10 --gnu exec/

Re: [Openais] Where can we find information on Corosync/OpenAIS w/out Pacemaker?

2011-06-30 Thread Andrew Beekhof
On Fri, Jul 1, 2011 at 2:58 AM, Steven Dake wrote: > On 06/30/2011 06:57 AM, Digimer wrote: >> On 06/30/2011 08:48 AM, Whit Blauvelt wrote: >>> On Thu, Jun 30, 2011 at 01:30:43PM +1000, Andrew Beekhof wrote: >>> >>>> I'll agree that Pacemaker isn

Re: [Openais] Where can we find information on Corosync/OpenAIS w/out Pacemaker?

2011-06-29 Thread Andrew Beekhof
On Tue, Jun 28, 2011 at 1:12 AM, Whit Blauvelt wrote: > Hi, > > While the corosync.org site is sparse, some of the presentation slides > linked from there promise that Corosync is designed to approach an > ideal of simplicity and clarity, to allow a variety of HA projects to be > developed a

Re: [Openais] Meatware Configuration Errors

2011-06-26 Thread Andrew Beekhof
Looks like the metadata this agent is returning is borked. Can you use stonith_admin -M to check the output? On Fri, Jun 3, 2011 at 1:09 PM, imnotpc wrote: > Hi again, > > I've got a 3 node cluster running with correct firewall rules this time. I set > up a meatware fence device using crm: > >  

Re: [Openais] Remote Access not Working

2011-06-26 Thread Andrew Beekhof
On Thu, May 19, 2011 at 8:29 PM, wrote: > Hi, > >   I have configured 2-Node(Linux1, Linux2) cluster that is working fine. > But I am not able to remotely access(From Linux3) the cluster. > >  I have configured both parameter remote-tls-port and remote-clear-port > in cib.xml file > >  On Linux3(

Re: [Openais] Corosync goes into endless loop when same hostname is used on more than one node

2011-05-12 Thread Andrew Beekhof
On Thu, May 12, 2011 at 4:04 PM, Dan Frincu wrote: > Hi, > When using the same hostname on 2 nodes Don't do that. Ever. > (debian squeeze, corosync 1.3.0-3 > from unstable) the following happens: > May 12 08:36:27 debian cib: [3125]: info: cib_process_request: Operation > complete: op cib_sync f

Re: [Openais] cibadmin usages....

2011-04-18 Thread Andrew Beekhof
On Mon, Apr 18, 2011 at 10:40 AM, wrote: > Hi , > >    Using cibadmin or any other command I want to extract all the > configured node for ResgGrp1 here(server150, server151).. Try the --xpath option? Or perhaps use ptest --show-scores >   >       >                operation="eq" value="server15

Re: [Openais] Corosync failover too long with many resources

2011-04-14 Thread Andrew Beekhof
On Thu, Apr 14, 2011 at 12:09 PM, Jonathan Amiez wrote: > Hello, > > I need some advice on configuring a 2-node cluster of load balancers > with > Pacemaker/Corosync. > > I already have that cluster set up, but it does not work as expected. > > It's running Haproxy/Nginx in active/passive s

Re: [Openais] Issues with order of fencing

2011-04-11 Thread Andrew Beekhof
On Thu, Apr 7, 2011 at 4:12 AM, Richard wrote: > Hi, >    I'm rather new to openais and have run into some issues with the order of > fencing plus refusal to fail over once one fencing method fails. Any help > would be much appreciated. >    Even though I've set priority lower on my fence_node2_ipm

Re: [Openais] How to add bind as a resource?

2011-04-09 Thread Andrew Beekhof
Sounds like the init script is not LSB compliant (wrong return code) On Sat, Apr 9, 2011 at 5:43 AM, Neil Aggarwal wrote: > Hello: > > I would like to add bind as a resource to my cluster. > > I tried this command: > crm configure primitive named lsb:named op monitor interval="30s" > timeout="20s

Re: [Openais] Resource start/stop linear dependency

2011-04-07 Thread Andrew Beekhof
On Thu, Apr 7, 2011 at 9:12 AM, wrote: > Hi, > >    I have configured three resource(X,Y,Z) in a one resource-group. > >    Order of resource in cib.xml file X then Y then Z. > >    No rsc-order constraint is added in the cib.xml file Yes there is - ordering is implied by the use of a group. >

Re: [Openais] need help for stonith configuration on RHEL6

2011-03-11 Thread Andrew Beekhof
Are you using pacemaker or rhcs/rgmanager? On Fri, Mar 11, 2011 at 12:01 PM, Amit Jathar wrote: > Hi, > > > > I am working on RHEL6. > > I am using two-node corosync cluster. I have configured it for Apache, > tomcat & Mysql. It is running fine & doing good job in the failover > scenarios. > > I

Re: [Openais] corosync shutdown process

2011-03-09 Thread Andrew Beekhof
Not enough information. Create and attach a hb_report for the shutdown case. On Tue, Mar 8, 2011 at 8:08 PM, Beau Sapach wrote: > Hello everyone, > > I’ve got a 2-node cluster that exposes iSCSI targets backed by LVM volumes > on top of a DRBD device.  For the most part I’ve got everything workin

Re: [Openais] [Pacemaker] Finalizing installation

2011-02-24 Thread Andrew Beekhof
On Thu, Feb 24, 2011 at 6:18 PM, Alessio Gennari wrote: > Hello, > I installed Cluster-glue, agents, > Corosync, Openais and Pacemaker on an Ubuntu Server 10.10 64bit machine. I was able to start services and configure a > ClusterIP resource that responds correctly. I installed two nodes (openais and >

Re: [Openais] Problems with Pacemaker + Corosync after reboot

2011-01-18 Thread Andrew Beekhof
On Mon, Dec 20, 2010 at 12:55 AM, Daniel Bareiro wrote: > Hi all! > > I hope this is the right group to discuss my problem. > > I'm beginning to test HA clusters with GNU/Linux and for that I decided > to try Pacemaker + Corosync in Debian Lenny following this [1] howto. > > Both packages were ins

Re: [Openais] Large delay when restarting active node

2010-12-03 Thread Andrew Beekhof
This looks like a drbd issue, you might have more luck on that list. On Fri, Dec 3, 2010 at 4:15 PM, Dan Frincu wrote: > Hi, > > Don't know how to summarize what I've encountered, therefore the rather lame > subject. I'm running a HA setup of 2 nodes on RHEL5U3, and I have done the > following te

Re: [Openais] Pb with ais Library Error

2010-12-01 Thread Andrew Beekhof
On Wed, Dec 1, 2010 at 7:49 AM, Alain.Moulle wrote: > Hi Steve, > > I have some difficulty following the developments ... what > exactly is the "MCP deployment model" ? http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for > > Thanks > Alain

Re: [Openais] pingd - monitoring different subnet

2010-11-30 Thread Andrew Beekhof
On Thu, Nov 25, 2010 at 8:25 PM, Luc Paulin wrote: > Hi, I am looking to set up monitoring of the gre/tunnel interface of our > clustered firewall. > > Both firewalls have a gre/ipsec tunnel to another host/site. > What I would like to do is to add a pingd resource which will monitor the > oth

Re: [Openais] Child process of corosync outputs a core

2010-11-11 Thread Andrew Beekhof
On Thu, Nov 11, 2010 at 4:39 PM, Steven Dake wrote: > On 11/11/2010 02:35 AM, Andrew Beekhof wrote: >> On Wed, Oct 27, 2010 at 5:15 PM, Steven Dake wrote: >>> On 10/26/2010 11:17 PM, Andrew Beekhof wrote: >>>> >>>> On Wed, Oct 27, 2010 at 7:32

Re: [Openais] Child process of corosync outputs a core

2010-11-11 Thread Andrew Beekhof
On Wed, Oct 27, 2010 at 5:15 PM, Steven Dake wrote: > On 10/26/2010 11:17 PM, Andrew Beekhof wrote: >> >> On Wed, Oct 27, 2010 at 7:32 AM, nozawat  wrote: >>> >>> Hi Andrew, >>> >>>  I send two log files of terminal.log and ha.log. >>>

Re: [Openais] All Resources shutting down on the master in a two node cluster when corosync is stopped on the slave

2010-10-27 Thread Andrew Beekhof
On Wed, Oct 27, 2010 at 12:42 PM, Andrew Beekhof wrote: > On Wed, Oct 20, 2010 at 7:06 PM, Tom Pride wrote: >> Hi there, >> Could someone please help me diagnose this problem, where if I run "service >> corosync stop" on the slave server of a 2 node cluster, all of

Re: [Openais] Child process of corosync outputs a core

2010-10-26 Thread Andrew Beekhof
rosync[6695]: [pcmk ] plugin.c:1526 ERROR: send_cluster_msg_raw: Message not sent (-1): totem_mcast(&iovec, 1, TOTEMPG_SAFE); is returning -1 Steve: would this happen if membership was in flux? I thought only IPC got stopped. > > Regards, > Tomo > > > 2010/10/27 Andrew Beekhof &

Re: [Openais] Child process of corosync outputs a core

2010-10-26 Thread Andrew Beekhof
On Tue, Oct 26, 2010 at 11:22 AM, nozawat wrote: > Hi all, > > My environment is as follows. >  * cluster-glue-1.0.6 >  * resource-agents-1.0.3 >  * corosync-1.2.8 (svn revision '3059') >  * pacemaker-1.1.3-2f0326468a33acb1ada8fa744c7d36d0b315bd35 > > Core file was output by corosync of the DC nod

Re: [Openais] superfluous dependency in corosync spec file

2010-10-14 Thread Andrew Beekhof
On Thu, Oct 14, 2010 at 2:08 PM, Vadym Chepkov wrote: > > On Oct 14, 2010, at 2:12 AM, Andrew Beekhof wrote: >> >> Since when was common sense a basis for reading distro packaging policies? >> I'm just grateful they don't make us create a separate subpackage for

Re: [Openais] superfluous dependency in corosync spec file

2010-10-13 Thread Andrew Beekhof
On Wed, Oct 13, 2010 at 10:44 PM, Vadym Chepkov wrote: > > On Oct 12, 2010, at 6:14 PM, Vadym Chepkov wrote: > >> >> On Oct 12, 2010, at 1:43 PM, Fabio M. Di NItto wrote: >> >>> >>> what distribution are you looking at? In Fedora, where the spec file was >>> first done as template for others to us

Re: [Openais] Corosync failing to start

2010-09-26 Thread Andrew Beekhof
On Sat, Sep 25, 2010 at 5:58 AM, Steven Dake wrote: > On 09/24/2010 05:55 PM, Lars Kellogg-Stedman wrote: >>> pacemaker is waiting for something in nanosleep.  Not sure what. >> >> Should I ping the pacemaker list separately?  I'm not sure how much >> overlap there is between here and there. >> >>

Re: [Openais] Announcement: Perl bindings for Corosync's CPG

2010-09-13 Thread Andrew Beekhof
On Mon, Sep 13, 2010 at 2:36 PM, Florian Haas wrote: > On 2010-09-13 11:21, Chase Venters wrote: >> On Monday 13 September 2010 3:45:50 am Florian Haas wrote: >>> I realize I may be asking for a lot, but is there any chance you could >>> rewrite your module to use SWIG, thereby making it more easi

Re: [Openais] openais trunk - change shutdown priority to 80

2010-09-07 Thread Andrew Beekhof
On Tue, Sep 7, 2010 at 3:36 PM, Ryan O'Hara wrote: > On Sat, Sep 04, 2010 at 09:42:28PM +0200, Fabio M. Di NItto wrote: >> On 09/04/2010 07:23 PM, Steven Dake wrote: >> > On 09/03/2010 09:33 PM, Fabio M. Di NItto wrote: >> >> On 09/03/2010 09:13 PM, Ryan O'Hara wrote: >> >>> >> >>> Same as Steve's

Re: [Openais] corosync & syslog dependencies

2010-08-09 Thread Andrew Beekhof
On Sat, Aug 7, 2010 at 3:08 AM, Angus Salkeld wrote: > On Fri, Aug 06, 2010 at 11:06:14AM +0200, Alain.Moulle wrote: >> Hi, >> >> About corosync & Pacemaker use : >> >> in my current release on RHEL6 : corosync-1.2.1-2.el6.x86_64 , >> the start of corosync requires the service syslog-ng to be star

Re: [Openais] [Corosync] The corosync shared memory keeps increasing

2010-08-05 Thread Andrew Beekhof
On Wed, Aug 4, 2010 at 8:24 PM, Steven Dake wrote: > On 08/03/2010 10:02 AM, hj lee wrote: >> Hi, >> >> I tried the latest version corosync 1.2.7 rpms from clusterlabs. The >> problem is still there. Actually the latest version gets worse. In old >> 1.1.2 version, the shared memory increases only

Re: [Openais] >>: drbd + pacemaker failback problems

2010-07-29 Thread Andrew Beekhof
On Thu, Jul 29, 2010 at 2:53 PM, Пленкин Алексей wrote: > Hi, > > I am using drbd 8.3.4 , pacemaker 1.0.1 and openais 0.80.3 > all works well, but I can't stop fail-back > I tried to use resource-stickiness 100, but nothing changes and when the failed > node comes online, resources migrate back

Re: [Openais] stonithd

2010-07-12 Thread Andrew Beekhof
On Mon, Jul 12, 2010 at 5:02 PM, Steven Dake wrote: > On 07/12/2010 07:09 AM, morphium wrote: >> Hi, >> >> I today installed pacemaker from Debian squeeze and configured it, but >> my syslog is filling with >> >> Jul 12 15:53:47 host crmd: [2174]: ERROR: stonithd_signon: Can't >> initiate connecti

Re: [Openais] corosync offline

2010-07-06 Thread Andrew Beekhof
On Tue, Jul 6, 2010 at 1:53 PM, wrote: > > Hello, > > I've built a cluster with just two nodes, both of them see each other, but >  they don't go online. This is my config: > > interface { >         bindnetaddr:    172.28.87.0 >         mcastaddr:      226.94.1.1 >                 mcastpo

Re: [Openais] Corosync 1.2.5 still hangs on startup

2010-07-01 Thread Andrew Beekhof
On Thu, Jul 1, 2010 at 3:09 PM, Keisuke MORI wrote: > Bad news... > > 2010/6/30 Andrew Beekhof : >> On Wed, Jun 30, 2010 at 12:06 PM, Keisuke MORI >> wrote: >>> 2010/6/29 Andrew Beekhof : >>>> On Mon, Jun 28, 2010 at 2:20 PM, Keisuke MORI >>&

Re: [Openais] Corosync 1.2.5 still hangs on startup

2010-06-30 Thread Andrew Beekhof
On Wed, Jun 30, 2010 at 12:06 PM, Keisuke MORI wrote: > 2010/6/29 Andrew Beekhof : >> On Mon, Jun 28, 2010 at 2:20 PM, Keisuke MORI >> wrote: >>> I've upgraded to pacemaker-1.0.9.1 / corosync-1.2.5 from clusterlabs on >>> CentOS 5.5 using yum but

Re: [Openais] Corosync 1.2.5 still hangs on startup

2010-06-28 Thread Andrew Beekhof
On Mon, Jun 28, 2010 at 2:20 PM, Keisuke MORI wrote: > I've upgraded to pacemaker-1.0.9.1 / corosync-1.2.5 from clusterlabs on > CentOS 5.5 using yum but it still hangs on its startup sometimes. > > The symptom is exactly the same as this: >  https://lists.linux-foundation.org/pipermail/openais/2010-Jun

Re: [Openais] recover from corosync daemon restart and cpg_finalize timing

2010-06-24 Thread Andrew Beekhof
On Thu, Jun 24, 2010 at 9:16 AM, Steven Dake wrote: > On 06/23/2010 11:35 PM, Andrew Beekhof wrote: >> >> On Thu, Jun 24, 2010 at 1:50 AM, dan clark<2cla...@gmail.com>  wrote: >>> >>> Dear Gentle Reader >>> >>> Attached is a

Re: [Openais] recover from corosync daemon restart and cpg_finalize timing

2010-06-23 Thread Andrew Beekhof
On Thu, Jun 24, 2010 at 1:50 AM, dan clark <2cla...@gmail.com> wrote: > Dear Gentle Reader > > Attached is a small test program to stress initializing and finalizing > communication between a corosync cpg client and the corosync daemon. > The test was run under version 1.2.4.  Initial testing w

Re: [Openais] corosync 1.2.5 still doesn't shutdown properly

2010-06-22 Thread Andrew Beekhof
On Wed, Jun 23, 2010 at 8:22 AM, Alain.Moulle wrote: > Hi, > With whatever release (i.e. currently with corosync-1.2.1-2.el6.x86_64), > I always have trouble with the stop of corosync. And each > time it failed when there were some failed actions reported > by crm_mon. That would seem to be a dif

Re: [Openais] {patch] Corosync hangs on startup

2010-06-18 Thread Andrew Beekhof
Checked in as r2948. Please backport to 1.2 On Fri, Jun 11, 2010 at 6:23 PM, Steven Dake wrote: > On 06/11/2010 09:00 AM, Andrew Beekhof wrote: >> >> This is a bit convoluted, but hang in there. >> >> >> So there is this bug: >>   http://developerbugs.li

[Openais] {patch] Corosync hangs on startup

2010-06-11 Thread Andrew Beekhof
This is a bit convoluted, but hang in there. So there is this bug: http://developerbugs.linux-foundation.org/show_bug.cgi?id=2379 Essentially, to reproduce, you stop syslog but leave it enabled in corosync.conf. Here is the logging section I used: logging { debug: on fileline: off to_s

Re: [Openais] [announce] corosync 1.2.4 released

2010-06-11 Thread Andrew Beekhof
On Fri, Jun 11, 2010 at 4:03 PM, Colin wrote: > On Thu, Jun 10, 2010 at 12:22 AM, Steven Dake wrote: >> >> This version has the following changes: >> * Fixes defects in logsys which are crashing pacemaker installations. > > Hm, don't know whether I did something wrong, but I just compiled > coros

Re: [Openais] [Pacemaker] corosync/openais fails to start

2010-05-30 Thread Andrew Beekhof
On Thu, May 27, 2010 at 5:50 PM, Steven Dake wrote: > On 05/27/2010 08:40 AM, Diego Remolina wrote: >> >> Is there any workaround for this? Perhaps a slightly older version of >> the rpms? If so where do I find those? >> > > Corosync 1.2.1 doesn't have this issue apparently.  With corosync 1.2.1,

Re: [Openais] fusion-io card, drbd, corosync, pacemaker stop issue

2010-05-25 Thread Andrew Beekhof
On Sat, May 22, 2010 at 1:13 AM, Dean Patterson wrote: > We are using the following to create a 2-node highly-available cluster: > > Disk device - fusion-io cards (PCIe SSD's) > DRBD/Corosync/Pacemaker > > [r...@motest16 log]# rpm -qa | egrep "drbd|corosync|pacemaker" > drbd-pacemaker-8.3.7-1 > dr

Re: [Openais] Corosync Node not rejoining cluster.

2010-05-20 Thread Andrew Beekhof
On Wed, May 19, 2010 at 6:42 PM, James Mackie wrote: > I have had this small 2 node cluster running since February. This morning > one of the servers (Node2) stopped responding on the external network > interface. To remedy this the server was rebooted at the console. (Not by > me). When the node

Re: [Openais] plan for resolving corosync services unloading, problem blocking shutdown on opensuse

2010-05-10 Thread Andrew Beekhof
On Tue, May 11, 2010 at 7:52 AM, Steven Dake wrote: > On Tue, 2010-05-11 at 07:48 +0200, Alain.Moulle wrote: >> Hi, >> FYI : me too, I have debug : on and I faced the problem on RHEL5 as well >> as on fc12. >> Alain > > I have found the root cause I believe is related to your issues. > Basically w

Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

2010-05-10 Thread Andrew Beekhof
On Mon, May 10, 2010 at 8:31 AM, Alain.Moulle wrote: > > I meant  "/etc/init.d/corosync stop" never returns. Ok. Can you show us the logs and "ps axf" please?

Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

2010-05-07 Thread Andrew Beekhof
On Fri, May 7, 2010 at 2:10 PM, Alain.Moulle wrote: > Hi, > > good news I think : I got a good clue to identify the "unload stalled" > problem , you'll > tell me if it really helps : > in fact, I got again the message : > "Waiting for corosync services to unload:..." > and from there I did fro

Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

2010-05-04 Thread Andrew Beekhof
Alain, clusterlabs has 1.2.1 now. Could you try updating? On Tue, May 4, 2010 at 2:48 PM, Jan Friesse wrote: > Hi, > 1.2.0 has some shutdown issues. Try to upgrade to 1.2.1 (1.2.2 when > released), and the problem should disappear. > > Regards, >  Honza > > > Alain.Moulle wrote: >> Hi everybody, >

Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

2010-05-04 Thread Andrew Beekhof
On Tue, May 4, 2010 at 9:10 AM, Alain.Moulle wrote: > Hi, > > When stopping corosync with "/etc/init.d/corosync stop", I'm from time to > time stalled > while services unload: > Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ] > Waiting for corosync services to unload:.

Re: [Openais] What can I do when facing "Waiting for corosync services to unload:........."

2010-05-04 Thread Andrew Beekhof
On Tue, May 4, 2010 at 9:41 AM, Andreas Mock wrote: > -Original Message- > From: "Alain.Moulle" >>What could I do to avoid this ? > > Don't use it.  ;-) That sort of comment isn't going to win many friends on this list, is it? Even with a smiley face. It may not even be a corosync

Re: [Openais] Failover constraint problem

2010-04-19 Thread Andrew Beekhof
: >     nfs_client_stop_0 (node=node0, call=21, rc=1, status=complete): unknown > error > node1:~# > > Here is the relevant part of daemon.log http://pastebin.com/L9scU4fy > > Thank you ! > > Andrew Beekhof wrote: > > On Sat, Apr 17, 2010 at 12:21 AM, Sandor Feher w

Re: [Openais] Missing shutdown messages with corosync 1.2.1 and pacemaker

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 12, 2010 at 3:19 PM, Andreas Mock wrote: > -Original Message- > From: Andrew Beekhof > Sent: 12.04.2010 08:58:44 > To: Andreas Mock > Subject: Re: [Openais] Missing shutdown messages with corosync 1.2.1 and > pacemaker > > Hi all, >

Re: [Openais] Failover constraint problem

2010-04-19 Thread Andrew Beekhof
On Sat, Apr 17, 2010 at 12:21 AM, Sandor Feher wrote: > Hi, > > First of all my goal is to set up a two-node cluster with pacemaker to > serve our webhosting service. > This config sits on two vmware virtual machines for testing purposes > now. Both of them run Debian Lenny. > > Here are the bas

Re: [Openais] [solved] Re: problem running ocfs2/o2cb with openais/pacemaker

2010-04-16 Thread Andrew Beekhof
On Wed, Apr 14, 2010 at 1:06 PM, Jürgen Herrmann wrote: > > On Tue, 13 Apr 2010 16:52:01 +0200, Andrew Beekhof > wrote: >> On Tue, Apr 13, 2010 at 3:33 PM, Jürgen Herrmann >> wrote: >>> >>> On Mon, 12 Apr 2010 14:46:39 +0200, Andrew Beekhof >>>

Re: [Openais] Failover problem

2010-04-16 Thread Andrew Beekhof
On Fri, Apr 16, 2010 at 3:28 PM, Haussecker, Armin wrote: > Hi, > > we have a 2-node-cluster based on SLES11 , openais (0.80.3-26.8.1) and > pacemaker (1.0.5-0.5.6). You're best off contacting Novell support for older versions. There's really not enough in the log fragments below to make any mea

Re: [Openais] corosync shutdown timeout

2010-04-16 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 8:06 PM, Vadym Chepkov wrote: > pacemaker-1.0.8-4.el5 :-( Can you create a bug for this please: http://developerbugs.linux-foundation.org/ Also, please include a hb_report for the period just before shutdown began.

Re: [Openais] corosync shutdown timeout

2010-04-15 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 6:34 PM, Vadym Chepkov wrote: > In case of a shutdown yes, but in this particular case I did > > crm configure property is-managed-default=false. > > and it seems to bring the shutdown procedure to a stupor. Then that's definitely a bug in pacemaker. What version of pacemaker are

Re: [Openais] corosync shutdown timeout

2010-04-15 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 5:29 PM, Vadym Chepkov wrote: > Hi, > > Is there a way to configure corosync timeout shutdown? > > # grep 'Still waiting' /var/log/messages > Apr 15 15:13:37 ashlin02 corosync[3017]:   [pcmk  ] notice: pcmk_shutdown: > Still waiting for crmd (pid=3029, seq=6) to terminate.

Re: [Openais] problem running ocfs2/o2cb with openais/pacemaker

2010-04-13 Thread Andrew Beekhof
On Tue, Apr 13, 2010 at 3:33 PM, Jürgen Herrmann wrote: > > On Mon, 12 Apr 2010 14:46:39 +0200, Andrew Beekhof > wrote: >> Please keep all replies on the list. >> >> On Apr 12, 2010, at 2:44 PM, Jürgen Herrmann wrote: >> >>> >>> On Mon, 12

Re: [Openais] problem running ocfs2/o2cb with openais/pacemaker

2010-04-12 Thread Andrew Beekhof
Please keep all replies on the list. On Apr 12, 2010, at 2:44 PM, Jürgen Herrmann wrote: > > On Mon, 12 Apr 2010 14:25:55 +0200, Andrew Beekhof > wrote: >> What versions of openais (corosync?) and pacemaker are you using? > > app1a:~# apt-show-versions |grep pace

Re: [Openais] problem running ocfs2/o2cb with openais/pacemaker

2010-04-12 Thread Andrew Beekhof
What versions of openais (corosync?) and pacemaker are you using? On Mon, Apr 12, 2010 at 2:00 PM, Jürgen Herrmann wrote: > > hi! > > i'm on debian lenny and trying to run ocfs2 on a dual primary > drbd device. the drbd device is already set up as msDRBD0. > > to get dlm_controld.pcmk i installed

Re: [Openais] Corosync Patch: Fix the default for COROSYNC_RUN_DIR

2010-04-12 Thread Andrew Beekhof
On Mon, Apr 12, 2010 at 12:46 AM, Steven Dake wrote: > On Sun, 2010-04-11 at 10:30 +0200, Andrew Beekhof wrote: >> On Sun, Apr 11, 2010 at 1:59 AM, Steven Dake wrote: >> > On Sat, 2010-04-10 at 13:35 +0200, Andrew Beekhof wrote: >> >> On Sat, Apr 10, 2010

Re: [Openais] Missing shutdown messages with corosync 1.2.1 and pacemaker

2010-04-12 Thread Andrew Beekhof
You might want to include your corosync config file. Does the same happen if you configure log-to-file? On Fri, Apr 9, 2010 at 11:43 PM, Andreas Mock wrote: > Hi all, > > while trying to test corosync 1.2.1 and pacemaker 1.0.8 with CTS I found the > following > problem. The expected shutdown mes

Re: [Openais] Corosync Patch: Fix the default for COROSYNC_RUN_DIR

2010-04-11 Thread Andrew Beekhof
On Sun, Apr 11, 2010 at 1:59 AM, Steven Dake wrote: > On Sat, 2010-04-10 at 13:35 +0200, Andrew Beekhof wrote: >> On Sat, Apr 10, 2010 at 6:18 AM, Fabio M. Di Nitto >> wrote: >> > On 4/9/2010 8:17 PM, Steven Dake wrote: >> >> On Fri, 2010-04-09 at 15:05 +020

Re: [Openais] Corosync Patch: Fix the default for COROSYNC_RUN_DIR

2010-04-10 Thread Andrew Beekhof
On Sat, Apr 10, 2010 at 6:18 AM, Fabio M. Di Nitto wrote: > On 4/9/2010 8:17 PM, Steven Dake wrote: >> On Fri, 2010-04-09 at 15:05 +0200, Andrew Beekhof wrote: >>> This looks like a copy/paste error to me... >>> >>> The "RUN" in COROSYNC_RUN_DIR wou

[Openais] Corosync Patch: Fix the default for COROSYNC_RUN_DIR

2010-04-09 Thread Andrew Beekhof
This looks like a copy/paste error to me... The "RUN" in COROSYNC_RUN_DIR would seem to imply /var/run. Also /var/lib is persistent and doesn't need to be created at startup. On the other hand, LSB states that the contents of /var/run are blown away at boot time. So I'm reasonably sure the following

Re: [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)

2010-03-25 Thread Andrew Beekhof
On Thu, Mar 25, 2010 at 9:32 AM, Andreas Mock wrote: > -Original Message- > From: Andrew Beekhof > Sent: 25.03.2010 09:15:11 > To: Andreas Mock > Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop > >>On Tue, Mar 23, 2010 at 12:42

Re: [Openais] Unusual exit code with /etc/init.d/corosync stop

2010-03-25 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock wrote: > Hi all, > > I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE > 11.2. > A correct /etc/init.d/corosync stop issues a return code of 1 The rc code isn't coming from corosync at all. It's coming from the last command in

Re: [Openais] Corosync can't start pacemaker due to syslog and creates a lots of corosync child processes

2010-03-25 Thread Andrew Beekhof
On Thu, Mar 25, 2010 at 2:50 AM, Thomas Guthmann wrote: > Hey Steven, > >> This is a distro specific bug.  Please file a bugzilla with the >> appropriate distro to work out the runlevels on their system.  For >> fedora which I test on mostly, rsyslog is runlevel 12.  Other distros >> may be differ

Re: [Openais] Strange behaviour of corosync

2010-03-23 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 9:07 PM, Andreas Mock wrote: > -Original Message- > From: Andrew Beekhof > Sent: 23.03.2010 20:37:12 > To: Andreas Mock > Subject: Re: [Openais] Strange behaviour of corosync >> >>Because the amount of time is determined b

Re: [Openais] Strange behaviour of corosync

2010-03-23 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 8:19 PM, Andreas Mock wrote: > -Original Message- > From: Andrew Beekhof > Sent: 23.03.2010 16:35:01 > To: sd...@redhat.com > Subject: Re: [Openais] Strange behaviour of corosync > >>> Andrew really did all the work on t

Re: [Openais] Strange behaviour of corosync

2010-03-23 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 12:28 AM, Steven Dake wrote: > On Tue, 2010-03-23 at 00:18 +0100, Andreas Mock wrote: >> -Original Message- >> From: Steven Dake >> Sent: 22.03.2010 22:56:03 >> To: Andreas Mock >> Subject: Re: [Openais] Strange behaviour of corosync >> >> > >> >Thank yo

Re: [Openais] Corosync cluster stack won't start

2010-03-22 Thread Andrew Beekhof
dhat Enterprise: > > yum install -y pacemaker corosync heartbeat > > Is it just that there are some shared scripts or binaries or libraries that > pacemaker needs from heartbeat? > > Cheers, > Tom > > > > On Mon, Mar 22, 2010 at 2:35 PM, Andrew Beekhof wrote: >

Re: [Openais] Corosync cluster stack won't start

2010-03-22 Thread Andrew Beekhof
On Sat, Mar 20, 2010 at 1:06 AM, Thomas Guthmann wrote: > Hi Tom, > >> heartbeat-libs-3.0.2-2.el5.x86_64.rpm >> heartbeat-3.0.2-2.el5.x86_64.rpm >> openais-1.1.0-1.el5.x86_64.rpm >> openaislib-1.1.0-1.el5.x86_64.rpm > > I reckon it could be due to the presence of openais _and_ corosync. > If you w

Re: [Openais] Quorum Debian Lenny

2010-03-15 Thread Andrew Beekhof
Do you have both nodes up and running? On Mon, Mar 15, 2010 at 8:31 PM, Olivier BATARD wrote: > Hi, > > > I'm trying to setup an active/passive cluster. > > > I follow the cluster from scratch and Debian Lenny howto but I'm having some > errors : > > > #crm_mon --one-shot -V > > > Last updated:

Re: [Openais] How Corosync identifies a new node?

2010-03-15 Thread Andrew Beekhof
On Sat, Mar 13, 2010 at 6:02 AM, Steven Dake wrote: > On Fri, 2010-03-12 at 16:02 +0530, S, Prashanth wrote: >> Hi! >> >> I need to clarify my understanding on how corosync handles addition of a new >> node. >> I think whenever a new node is up it will multicast about its arrival.  This >> will

Re: [Openais] How to debug?

2010-03-10 Thread Andrew Beekhof
osync's log. It's > this a corosync(openais) error or pacemaker error? It's a resource agent error. Usually you'll find the details in /var/log/messages. If in doubt, look in the resource agent itself and look for places that might return that error. > Thanks. > > On

Re: [Openais] How to debug?

2010-03-08 Thread Andrew Beekhof
You might want to cross reference the return codes from the failed operations with: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-ocf-return-codes.html Looks like you have some missing packages and invalid configuration options. On Tue, Mar 9, 2010 at 1:38 AM,

Re: [Openais] resource restart

2010-03-08 Thread Andrew Beekhof
On Wed, Mar 3, 2010 at 8:21 AM, Haussecker, Armin wrote: > Hi, > > we have an openais cluster consisting of two nodes, a resource is started on > first node, and this resource should remain on first node by suitable > location constraint, and also it should be started on the same node as > another

Re: [Openais] constraint problem

2010-02-24 Thread Andrew Beekhof
On Wed, Feb 24, 2010 at 3:05 PM, Haussecker, Armin wrote: > Hi, > > calling command cibadmin to create resource constraints in a CIB, the > following problem occurred: > > if file constraints.xml (containing xml snippets) contains more than one > constraint definition, for example: > > >   score

Re: [Openais] [PATCH corosync_trunk] Add a test harness to corosync that uses CTS from pacemaker.

2010-02-23 Thread Andrew Beekhof
On Tue, Feb 23, 2010 at 4:00 AM, Angus Salkeld wrote: > On Mon, 2010-02-22 at 15:23 -0700, Steven Dake wrote: >> On Thu, 2010-02-18 at 11:17 +1100, Angus Salkeld wrote: >> > Hi >> > >> > This adds a test harness to corosync. It reuses the >> > Cluster Test System (CTS) from pacemaker. It also >> >

Re: [Openais] does self-fencing makes sense?

2010-02-23 Thread Andrew Beekhof
On Tue, Feb 23, 2010 at 8:29 AM, Steven Dake wrote: > On Tue, 2010-02-23 at 08:25 +0100, Dietmar Maurer wrote: >> > > There are thousands of interactions with power fencing and every one >> > > of them needs to work perfectly for power fencing to work. >> > >> > Thats not the problem. >> > Its the

Re: [Openais] does self-fencing makes sense?

2010-02-22 Thread Andrew Beekhof
On Fri, Feb 19, 2010 at 11:31 PM, Steven Dake wrote: > On Fri, 2010-02-19 at 18:41 +0100, Andrew Beekhof wrote: >> On Fri, Feb 19, 2010 at 5:36 PM, Dietmar Maurer wrote: >> > Hi all, I just found a whitepaper from XenServer - seem they implement some >>
