Re: [Pacemaker] Some questions about corosync-cfgtool

2011-01-18 Thread Dan Frincu
Hi, https://lists.linux-foundation.org/pipermail/openais/2011-January/015626.html xin.li...@cs2c.com.cn wrote: hi everybody! I have some questions about corosync-cfgtool 1. What should I do when "corosync-cfgtool -s" return "Could not initialize corosync configuration API error" ?

Re: [Pacemaker] Colocation of multistate resource and group

2011-01-18 Thread Florian Haas
On 01/18/2011 10:15 PM, Evgeniy Ivanov wrote: >>> Is it expected (PM 1.0.3)? What's correct way to achieve this? >> >> Consider upgrading. > > Yeah, I will. But for now I need to make it work on 1.0.3. IIRC there were a _bunch_ of issues of exactly the type you mentioned that were fixed in the 1.

[Pacemaker] start resource timeout

2011-01-18 Thread jiaju liu
I use the Lustre filesystem in a cluster. By default, the start, stop, and monitor operations in a Filesystem resource time out after 20 sec. Since some mounts in Lustre require up to 5 minutes or more, the default timeouts for these operations must be modified. I want to change it to 10 min, is it o

[Pacemaker] Some questions about corosync-cfgtool

2011-01-18 Thread xin . liang
hi everybody! I have some questions about corosync-cfgtool 1. What should I do when "corosync-cfgtool -s" return "Could not initialize corosync configuration API error" ? Restart corosync ?(I don't think it's a good idea) 2. How can the process happen automatically when network problem

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Jean-Francois Malouin
* Vladislav Bogdanov [20110118 08:41]: > > Unless clustered LVM locking is enabled and working: > > # sed -ri 's/^([ \t]+locking_type).*/locking_type = 3/' > > /etc/lvm/lvm.conf > > # sed -ri 's/^([ \t]+fallback_to_local_locking).*/ > > fall

Re: [Pacemaker] Colocation of multistate resource and group

2011-01-18 Thread Evgeniy Ivanov
On Tue, Jan 18, 2011 at 6:37 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jan 18, 2011 at 02:52:07PM +0300, Evgeniy Ivanov wrote: >> Hello, >> >> I have a group Gr and multistate resource MasterRsc. I need MasterRsc >> to be master on the node where Gr works. >> In MasterRsc::start() I use crm_m

Re: [Pacemaker] Oracle 10g Express edition OCF cluster script

2011-01-18 Thread Florian Haas
On 01/18/2011 04:02 PM, Ladislav Jech wrote: > Hi, > > it's already some time ago, when I was asked to configure a cluster for > Oracle Database 10g Express Edition. The included scripts for oracle in > repository didn't work well, Interesting. I did the exact same thing a while ago, and found that t

Re: [Pacemaker] stonith problems RHEL6

2011-01-18 Thread Andrew Beekhof
On Tue, Jan 4, 2011 at 5:03 PM, Dejan Muhamedagic wrote: > Hi, > > On Sun, Jan 02, 2011 at 03:05:58PM +0100, Jocke M wrote: >> Hello, >> >> I just tested to setup a cluster (RHEL6) using "Clusters from Scratch" found >> at >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Clusters_

Re: [Pacemaker] crm configure load update

2011-01-18 Thread Dejan Muhamedagic
On Tue, Jan 18, 2011 at 06:39:52PM +0200, Vladislav Bogdanov wrote: > 18.01.2011 18:22, Dejan Muhamedagic wrote: > > Hi, > > > > On Tue, Jan 18, 2011 at 03:35:15PM +0200, Vladislav Bogdanov wrote: > >> Hi all, > >> > >> It looks like configuration lines are pushed to running CIB line-by-line > >>

Re: [Pacemaker] crm configure load update

2011-01-18 Thread Vladislav Bogdanov
18.01.2011 18:22, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jan 18, 2011 at 03:35:15PM +0200, Vladislav Bogdanov wrote: >> Hi all, >> >> It looks like configuration lines are pushed to running CIB line-by-line >> during 'crm configure load update', rather than edit-all/commit. >> >> I just observ

Re: [Pacemaker] Oracle 10g Express edition OCF cluster script

2011-01-18 Thread Dejan Muhamedagic
Hi, On Tue, Jan 18, 2011 at 04:02:56PM +0100, Ladislav Jech wrote: > Hi, > > it's already some time ago, when I was asked to configure a cluster for > Oracle Database 10g Express Edition. The included scripts for oracle in > repository didn't work well, so I created my own script and while Wonderful

Re: [Pacemaker] crm configure load update

2011-01-18 Thread Dejan Muhamedagic
Hi, On Tue, Jan 18, 2011 at 03:35:15PM +0200, Vladislav Bogdanov wrote: > Hi all, > > It looks like configuration lines are pushed to running CIB line-by-line > during 'crm configure load update', rather than edit-all/commit. > > I just observed this while pushing a dozen new primitives togethe

Re: [Pacemaker] Colocation of multistate resource and group

2011-01-18 Thread Dejan Muhamedagic
Hi, On Tue, Jan 18, 2011 at 02:52:07PM +0300, Evgeniy Ivanov wrote: > Hello, > > I have a group Gr and multistate resource MasterRsc. I need MasterRsc > to be master on the node where Gr works. > In MasterRsc::start() I use crm_master to set higher value if Gr works > on this node and it works fi

Re: [Pacemaker] ccm returning with exit code 100 and system rebooting

2011-01-18 Thread Dejan Muhamedagic
On Tue, Jan 18, 2011 at 04:55:38PM +0530, akshay punja wrote: > Hi, > > Thanks for the help. > > As suggested, I have changed "crm on" to "crm respawn"; after the configuration > change, the rebooting has stopped. > > I am using tomcat, apache httpd and mysql master-slave replication. I > have set this

[Pacemaker] Oracle 10g Express edition OCF cluster script

2011-01-18 Thread Ladislav Jech
Hi, it's already some time ago, when I was asked to configure a cluster for Oracle Database 10g Express Edition. The included scripts for oracle in repository didn't work well, so I created my own script and while playing with pacemaker and corosync application blocks and communicating with Andrew Bee

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vladislav Bogdanov
18.01.2011 16:00, Vladislav Bogdanov wrote: > 18.01.2011 15:41, Vadym Chepkov wrote: > > ... > I have tried it myself, but concluded it's impossible to do it reliably with the current code. For the live migration to work you have to remove any colocation constraints (gr

Re: [Pacemaker] Unordered groups (was Re: [Linux-HA] Is 'resource_set' still experimental?)

2011-01-18 Thread Rasto Levrinc
On Tue, January 18, 2011 1:42 pm, Florian Haas wrote: > On 01/18/2011 11:49 AM, RaSca wrote: > >> As discussed yesterday on IRC with Andrew, there is no way of creating >> a group with independent resources. I was hoping that setting the options >> you mentioned can do the trick, but I've just tes

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vladislav Bogdanov
18.01.2011 15:41, Vadym Chepkov wrote: ... >>> >>> I have tried it myself, but concluded it's impossible to do it reliably >>> with the current code. >>> For the live migration to work you have to remove any colocation >>> constraints (group included) with the Xen resource. >>> drbd code includ

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vadym Chepkov
On Jan 18, 2011, at 8:16 AM, Vladislav Bogdanov wrote: > 18.01.2011 14:45, Vadym Chepkov wrote: >> >> On Jan 17, 2011, at 6:44 PM, Jean-Francois Malouin wrote: >> >>> Back again to setup an active/passive cluster for Xen with live migration >>> but so far, no go. Xen DomU is shutdown and restar

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vladislav Bogdanov
> Unless clustered LVM locking is enabled and working: > # sed -ri 's/^([ \t]+locking_type).*/locking_type = 3/' > /etc/lvm/lvm.conf > # sed -ri 's/^([ \t]+fallback_to_local_locking).*/ > fallback_to_local_locking = 1/' /etc/lvm/lvm.conf > # vgchange -cy VG_NAME > # service clvmd start > # vgs|
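The first `sed` substitution quoted above can be modelled in a few lines of Python to see what it does to a configuration file. This is only a sketch: the sample text is a hypothetical fragment, not a real `/etc/lvm/lvm.conf`, and the function name is made up for illustration. It mirrors the quoted pattern, which matches only an indented, uncommented `locking_type` line and replaces the whole line (indentation included) with `locking_type = 3`.

```python
import re

# Hypothetical fragment standing in for /etc/lvm/lvm.conf (illustration only).
SAMPLE = (
    "global {\n"
    "    # locking_type = 1\n"
    "    locking_type = 1\n"
    "}\n"
)


def set_clustered_locking(text):
    """Mimic: sed -r 's/^([ \\t]+locking_type).*/locking_type = 3/'.

    Only lines that start with whitespace followed directly by
    'locking_type' are rewritten; commented lines are left alone.
    """
    return re.sub(r"^[ \t]+locking_type.*$", "locking_type = 3",
                  text, flags=re.MULTILINE)


if __name__ == "__main__":
    print(set_clustered_locking(SAMPLE))
```

Trying the pattern on a copy of the file first, as here, is a cheap way to check that the commented-out default is not touched.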

[Pacemaker] crm configure load update

2011-01-18 Thread Vladislav Bogdanov
Hi all, It looks like configuration lines are pushed to running CIB line-by-line during 'crm configure load update', rather than edit-all/commit. I just observed this while pushing a dozen new primitives together with colocation/order constraints - primitives tried to start (and failed) on a nod

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vladislav Bogdanov
18.01.2011 14:45, Vadym Chepkov wrote: > > On Jan 17, 2011, at 6:44 PM, Jean-Francois Malouin wrote: > >> Back again to setup an active/passive cluster for Xen with live migration >> but so far, no go. Xen DomU is shutdown and restarted when I move the >> Xen resource. >> >> I'm using Debian Sque

Re: [Pacemaker] alignment issues on arm (debian armel arch)

2011-01-18 Thread Greg Walton
thanks, i did manage to figure it out and then steve dake came up with a patch for it. It should eventually go into trunk. the message buffer is an aligned chunk of mmap'd memory and the first message written to it is aligned, but subsequent ones probably not depending on the length of the alre
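The alignment problem described in this message can be illustrated with a small model. This is not the Corosync IPC code and not the patch mentioned above; it is a hypothetical Python sketch that assumes an 8-byte alignment requirement and made-up message lengths. It shows that messages written back to back start at unaligned offsets after the first one, and that rounding each start offset up to the alignment boundary keeps every message aligned.

```python
ALIGNMENT = 8  # assumed alignment requirement, chosen for illustration only


def align_up(offset, alignment=ALIGNMENT):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def packed_offsets(lengths):
    """Start offsets when messages are written back to back."""
    offsets, pos = [], 0
    for n in lengths:
        offsets.append(pos)
        pos += n
    return offsets


def aligned_offsets(lengths, alignment=ALIGNMENT):
    """Start offsets when each message is padded to the alignment boundary."""
    offsets, pos = [], 0
    for n in lengths:
        pos = align_up(pos, alignment)
        offsets.append(pos)
        pos += n
    return offsets


if __name__ == "__main__":
    lengths = [13, 22, 8]  # hypothetical message lengths in bytes
    print("packed: ", packed_offsets(lengths))
    print("aligned:", aligned_offsets(lengths))
```

The cost of the aligned layout is a few padding bytes per message, which is usually acceptable on architectures where unaligned access is slow or faults.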

Re: [Pacemaker] Problem with Xen live migration

2011-01-18 Thread Vadym Chepkov
On Jan 17, 2011, at 6:44 PM, Jean-Francois Malouin wrote: > Back again to setup an active/passive cluster for Xen with live migration > but so far, no go. Xen DomU is shutdown and restarted when I move the > Xen resource. > > I'm using Debian Squeeze, pacemaker 1.0.9.1, corosync 1.2.1-4 with Xen

[Pacemaker] Unordered groups (was Re: [Linux-HA] Is 'resource_set' still experimental?)

2011-01-18 Thread Florian Haas
On 01/18/2011 11:49 AM, RaSca wrote: > As discussed yesterday on IRC with Andrew, there is no way of creating a > group with independent resources. > I was hoping that setting the options you mentioned can do the trick, > but I've just tested: > > If you declare a group like this: > > group group

Re: [Pacemaker] Reboot host when service fails

2011-01-18 Thread Andrew Beekhof
On Tue, Dec 21, 2010 at 8:55 AM, Marko Potocnik wrote: > setting on-fail parameter does nothing. I still have to define a stonith > agent and enable stonith. correct, on-fail simply triggers the normal fencing process. > I'm a little lost here. I don't know which stonith > agent to use and I don'

[Pacemaker] Colocation of multistate resource and group

2011-01-18 Thread Evgeniy Ivanov
Hello, I have a group Gr and multistate resource MasterRsc. I need MasterRsc to be master on the node where Gr works. In MasterRsc::start() I use crm_master to set higher value if Gr works on this node and it works fine. But I have a problem with failovers and colocation: I did failover of Gr a

Re: [Pacemaker] ccm returning with exit code 100 and system rebooting

2011-01-18 Thread akshay punja
Hi, Thanks for the help. As suggested, I have changed "crm on" to "crm respawn"; after the configuration change, the rebooting has stopped. I am using tomcat, apache httpd and mysql master-slave replication. I have set this up in multiple environments and it's working fine. I am seeing this issue only in o

Re: [Pacemaker] ccm returning with exit code 100 and system rebooting

2011-01-18 Thread Dejan Muhamedagic
Hi, On Tue, Jan 18, 2011 at 08:34:57AM +0530, akshay punja wrote: > Please let me know if any one has solved this issue. CCM exiting with return > code 100 and system rebooting Either bad installation or some kind of security mechanism preventing heartbeat/ccm from operating normally. For insta

Re: [Pacemaker] [PATCH] Build: fix fedora package interdependencies

2011-01-18 Thread Andrew Beekhof
On Fri, Dec 3, 2010 at 6:48 PM, Vadym Chepkov wrote: > On Fri, Dec 3, 2010 at 2:17 AM, Andrew Beekhof wrote: >> Libraries we link against are automatically added as dependencies by >> rpm, there's no need (and it is discouraged) to list them explicitly >> > > I totally agree, the spec file is jus

Re: [Pacemaker] ccm returning with exit code 100 and system rebooting

2011-01-18 Thread Andrew Beekhof
On Tue, Jan 18, 2011 at 4:04 AM, akshay punja wrote: > Please let me know if any one has solved this issue. Can you try "crm respawn" instead of "crm on" so the node stays up long enough to see why the ccm is unhappy. Lars, you really ought to think about changing the default behavior and adding

Re: [Pacemaker] Strange expected-quorum-votes

2011-01-18 Thread Andrew Beekhof
On Fri, Jan 7, 2011 at 1:10 PM, Michael Schwartzkopff wrote: > On Friday 07 January 2011 13:04:27 Michael Schwartzkopff wrote: >> Hi, >> >> I just installed pacemaker-1.1.4 from clusterlabs.org on my opensuse-11.3 >> >> When I start my cluster I get the following picture: >> >> # crm configure sho

Re: [Pacemaker] Speed up resource failover?

2011-01-18 Thread Andrew Beekhof
On Fri, Jan 14, 2011 at 12:45 PM, Dejan Muhamedagic wrote: > Hi, > > On Wed, Jan 12, 2011 at 02:41:31PM -0700, Patrick H. wrote: >> >> >>Oh, and its not waiting for the resource to stop on the other >> >>node  before it starts it up either. >> >>Here's the lrmd log for resource vip_55.63 from the

Re: [Pacemaker] attrd_updater - how works?

2011-01-18 Thread Andrew Beekhof
On Thu, Jan 6, 2011 at 9:28 PM, Michael Schwartzkopff wrote: > Hi, > > I have a question about the attrd_updater. How does it work exactly? Ok, I > understand that it uses hysteresis updating the values in the CIB. But what > happens if the values change just a little bit. Changed is changed. Wh

Re: [Pacemaker] alignment issues on arm (debian armel arch)

2011-01-18 Thread Andrew Beekhof
Long story short... the buffer is created by the Corosync IPC code. See:

gboolean ais_dispatch(int sender, gpointer user_data)
{
    int rc = CS_OK;
    char *buffer = NULL;
    gboolean good = TRUE;
    gboolean (*dispatch)(AIS_Message*,char*,int) = user_data;

    rc = coroipcc_dispatch_get (ais

Re: [Pacemaker] Node doesn't rejoin automatically after reboot - POSSIBLE CAUSE

2011-01-18 Thread Andrew Beekhof
On Fri, Jan 14, 2011 at 4:59 PM, Bob Haxo wrote: > >> Where there (m)any logs containing the text "crm_abort" ... > Sorry Andrew, > > Since I'm testing installations, all of the nodes in the cluster have > been installed several times since I solved this issue, and the original > log files are gon

Re: [Pacemaker] ERROR: handle_request: Unexpected request (clear_failcount) sent to non-DC node

2011-01-18 Thread Andrew Beekhof
That's odd. Can you file a bug and attach a hb_report please? On Thu, Jan 13, 2011 at 3:23 AM, Bart Coninckx wrote: > Hi, > > are these "ERROR: handle_request: Unexpected request (clear_failcount) sent to > non-DC node" messages important and indicative of problems? The failcount for > all resour

Re: [Pacemaker] [Ubuntu-ha] startup problem DLM on ubuntu lucid

2011-01-18 Thread Andrew Beekhof
On Thu, Jan 13, 2011 at 5:34 PM, Jake Smith wrote: > I read the thread related to this startup problem (dlm segfaults when server > comes up with corosync auto starting up).  I just have one follow-up > question: > > > > The 3.07 package in Ubuntu-HA has not been patched for Lucid yet and there >

Re: [Pacemaker] Logging level problem

2011-01-18 Thread Andrew Beekhof
Pacemaker doesn't observe syslog_priority, arguably it should. Either way, syslog filtering is a better path forward (since not every CLI tool reads corosync.conf) On Mon, Jan 17, 2011 at 4:01 PM, Jake Smith wrote: > I wanted to change the log level to reduce the amount of logging to syslog. > I

Re: [Pacemaker] How to set up logging for CTSlab

2011-01-18 Thread Andrew Beekhof
On Mon, Jan 17, 2011 at 3:26 PM, Simon Jansen wrote: >> On Mon, Jan 10, 2011 at 5:40 PM, Dejan Muhamedagic >> wrote: >> > Hi, >> > >> > On Mon, Jan 10, 2011 at 10:46:58AM +0100, Simon Jansen wrote: >> >> Hi, >> >> >> >> I would like to test my cluster with the cluster test suite. I followed >> >>