[Linux-HA] OpenVZ CTs on NFS troubles

2013-04-03 Thread Roman Haefeli
Hi all We are using pacemaker and the RA heartbeat:ManageVE to manage a number of OpenVZ CTs on a three-node cluster. The CTs are hosted on an NFS mount shared by all three nodes, which is managed by the heartbeat:Filesystem RA. We get errors in certain situations, for instance when starting al

[Linux-HA] Problem with migration of OpenVZ CTs

2013-03-22 Thread Roman Haefeli
Hi, I encountered a problem when performing a live migration of some OpenVZ CTs. Although the migration didn't trigger any messages in 'crm_mon' and was initially performed without any trouble, the resource was restarted on the target node 'unnecessarily'. From the logs it looks as if after the ac

Re: [Linux-HA] RA heartbeat/exportfs hangs sporadically

2013-03-15 Thread Roman Haefeli
On Mon, 2013-03-11 at 16:28 +0100, Dejan Muhamedagic wrote: > Hi, > > On Mon, Mar 11, 2013 at 10:53:55AM +0100, Roman Haefeli wrote: > > On Fri, 2013-03-08 at 14:15 +0100, Dejan Muhamedagic wrote: > > > Hi, > > > > > > On Fri, Mar 08, 2013 at 01:39:27PM

Re: [Linux-HA] RA heartbeat/exportfs hangs sporadically

2013-03-11 Thread Roman Haefeli
On Fri, 2013-03-08 at 14:15 +0100, Dejan Muhamedagic wrote: > Hi, > > On Fri, Mar 08, 2013 at 01:39:27PM +0100, Roman Haefeli wrote: > > On Fri, 2013-03-08 at 13:28 +0100, Roman Haefeli wrote: > > > On Fri, 2013-03-08 at 12:02 +0100, Lars Marowsky-Bree wrote: > > >

Re: [Linux-HA] RA heartbeat/exportfs hangs sporadically

2013-03-08 Thread Roman Haefeli
On Fri, 2013-03-08 at 13:28 +0100, Roman Haefeli wrote: > On Fri, 2013-03-08 at 12:02 +0100, Lars Marowsky-Bree wrote: > > On 2013-03-08T11:56:12, Roman Haefeli wrote: > > > > > Googling "TrackedProcTimeoutFunction exportfs" didn't reveal any > > >

Re: [Linux-HA] RA heartbeat/exportfs hangs sporadically

2013-03-08 Thread Roman Haefeli
On Fri, 2013-03-08 at 12:02 +0100, Lars Marowsky-Bree wrote: > On 2013-03-08T11:56:12, Roman Haefeli wrote: > > > Googling "TrackedProcTimeoutFunction exportfs" didn't reveal any > > results, which makes me think we are alone with this specific problem. > >

[Linux-HA] RA heartbeat/exportfs hangs sporadically

2013-03-08 Thread Roman Haefeli
Hi all In a two-node cluster that manages an NFS server on top of a DRBD replication, we experienced fencing of the active node due to exportfs monitor failure: Mar 8 03:08:42 vicestore1 lrmd: [1550]: WARN: p_exportfs_virtual:monitor process (PID 4120) timed out (try 1). Killing with signal S

Re: [Linux-HA] How to identify reason for fencing

2013-02-07 Thread Roman Haefeli
On Wed, 2013-02-06 at 11:24 +0100, Michael Schwartzkopff wrote: > Am Mittwoch, 6. Februar 2013, 11:06:23 schrieb Roman Haefeli: > > Hi all > > > > We are running a pacemaker/corosync cluster with three nodes that > > manages ~30 OpenVZ containers. > > > >

[Linux-HA] How to identify reason for fencing

2013-02-06 Thread Roman Haefeli
Hi all We are running a pacemaker/corosync cluster with three nodes that manages ~30 OpenVZ containers. We recently had the situation where one node fenced the other two nodes (sbd is configured as a stonith device). In the system logs I was able to spot the line where the node gives the death

[Linux-HA] prevent unwanted fencing

2012-04-04 Thread Roman Haefeli
Hi all We're running a two-node cluster with a bunch of OpenVZ containers as resources and use SBD as a fencing method. We're still in testing mode and performed some IO benchmarks on NFS with tiobench. While we were performing those tests, the node fenced itself as soon as tiobench was finished

Re: [Linux-HA] order troubles

2012-03-26 Thread Roman Haefeli
On Thu, 2012-03-22 at 15:06 +0100, Florian Haas wrote: > On Thu, Mar 22, 2012 at 10:34 AM, Lars Ellenberg > wrote: > >> order o_nfs_before_vz 0: cl_fs_nfs cl_vz > >> order o_vz_before_ve992 0: cl_vz ve992 > > > > a score of "0" is roughly equivalent to > > "if you happen to plan to do both operati

[Linux-HA] order troubles

2012-03-22 Thread Roman Haefeli
Hi all I only started diving into linux-ha recently and am currently working on a test setup where a 2-node cluster manages a bunch of OpenVZ containers. The containers are running on a cloned NFS mount, which is also managed by the cluster. In normal operation mode, everything works fine. I can