Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 10:56 am, Gianluca Cecchi wrote: > On Wed, Mar 12, 2014 at 12:37 AM, Andrew Beekhof wrote: > >> It was put in when drbd called: >> >> fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; >> >> When and why it called that is not my area of expertise though. >> > > The constraint

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Vladislav Bogdanov
12.03.2014 00:37, Andrew Beekhof wrote: ... > I'm somewhat confused at this point if crmsh is using --replace, then why > is it doing diff calculations? > Or are replace operations only for the load operation? It uses one of two methods depending on pacemaker version.

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Yusuke Iida
Hi, Andrew 2014-03-12 6:37 GMT+09:00 Andrew Beekhof : >> Mar 07 13:24:14 [2528] vm01 crmd: (te_callbacks:493 ) error: >> te_update_diff: Ingoring create operation for /cib 0xf91c10, >> configuration > > That's interesting... is that with the fixes mentioned above? I'm sorry. The above-ment

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Gianluca Cecchi
On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof wrote: > > On 8 Mar 2014, at 11:31 am, Gianluca Cecchi wrote: > >> I provoke power off of ovirteng01. Fencing agent works ok on >> ovirteng02 and reboots it. >> I stop boot of ovirteng01 at grub prompt to simulate problem in boot >> (for example sys

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Gianluca Cecchi
On Wed, Mar 12, 2014 at 12:37 AM, Andrew Beekhof wrote: > It was put in when drbd called: > > fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; > > When and why it called that is not my area of expertise though. > The constraint put by crm-fence-peer.sh was and I think it was g

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 10:32 am, Gianluca Cecchi wrote: > On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof wrote: >> >> On 8 Mar 2014, at 11:31 am, Gianluca Cecchi >> wrote: >> >>> I provoke power off of ovirteng01. Fencing agent works ok on >>> ovirteng02 and reboots it. >>> I stop boot ofovir

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 8 Mar 2014, at 11:31 am, Gianluca Cecchi wrote: > I provoke power off of ovirteng01. Fencing agent works ok on > ovirteng02 and reboots it. > I stop boot of ovirteng01 at grub prompt to simulate problem in boot > (for example system put in console mode due to filesystem problem) > In the mean

Re: [Pacemaker] hangs pending

2014-03-11 Thread Andrew Beekhof
Sorry for the delay, sometimes it takes a while to rebuild the necessary context On 5 Mar 2014, at 4:42 pm, Andrey Groshev wrote: > > > 05.03.2014, 04:04, "Andrew Beekhof" : >> On 25 Feb 2014, at 8:30 pm, Andrey Groshev wrote: >> >>> 21.02.2014, 12:04, "Andrey Groshev" : 21.02.2014, 0

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 8:40 am, Andrew Beekhof wrote: > > On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov wrote: > >> 07.03.2014 10:30, Vladislav Bogdanov wrote: >>> 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov wrote: > 18.02.20

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Andrew Beekhof
On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov wrote: > 07.03.2014 10:30, Vladislav Bogdanov wrote: >> 07.03.2014 05:43, Andrew Beekhof wrote: >>> >>> On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov wrote: >>> 18.02.2014 03:49, Andrew Beekhof wrote: > > On 31 Jan 2014, at 6:20 pm

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Andrew Beekhof
On 11 Mar 2014, at 6:51 pm, Yusuke Iida wrote: > Hi, Andrew > > 2014-03-11 14:21 GMT+09:00 Andrew Beekhof : >> >> On 11 Mar 2014, at 4:14 pm, Andrew Beekhof wrote: >> >> [snip] >> >>> If I do this however: >>> >>> # cp start.xml 1.xml; tools/cibadmin --replace -o configuration --xml-file

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 1:54 am, Attila Megyeri wrote: >> >> -Original Message- >> From: Andrew Beekhof [mailto:and...@beekhof.net] >> Sent: Tuesday, March 11, 2014 12:48 AM >> To: The Pacemaker cluster resource manager >> Subject: Re: [Pacemaker] Pacemaker/corosync freeze >> >> >> On 7 Ma

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-11 Thread Attila Megyeri
> -Original Message- > From: Andrew Beekhof [mailto:and...@beekhof.net] > Sent: Tuesday, March 11, 2014 12:48 AM > To: The Pacemaker cluster resource manager > Subject: Re: [Pacemaker] Pacemaker/corosync freeze > > > On 7 Mar 2014, at 5:54 pm, Attila Megyeri > wrote: > > > Thanks for t

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Yusuke Iida
Hi, Andrew 2014-03-11 14:21 GMT+09:00 Andrew Beekhof : > > On 11 Mar 2014, at 4:14 pm, Andrew Beekhof wrote: > > [snip] > >> If I do this however: >> >> # cp start.xml 1.xml; tools/cibadmin --replace -o configuration --xml-file >> replace.some -V >> >> I start to see what you see: >> >> (

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

2014-03-11 Thread Vladislav Bogdanov
07.03.2014 10:30, Vladislav Bogdanov wrote: > 07.03.2014 05:43, Andrew Beekhof wrote: >> >> On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov wrote: >> >>> 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida wrote: > Hi, all > > I measure the pe