Re: [Linux-ha-dev] possible deadlock in lrmd?

2011-11-09 Thread Dejan Muhamedagic
On Wed, Nov 09, 2011 at 09:26:09AM +0100, Ante Karamatic wrote: > On 03.11.2010 11:34, Dejan Muhamedagic wrote: > > > Any news here? > > Yes. I know, it's been a while... > > At the time patch '2444'[1] was provided, glib had a bug[2] that was > resolved a few months later. 2444 patch did resolve

Re: [Linux-ha-dev] possible deadlock in lrmd?

2011-01-28 Thread Andres Rodriguez
Hi all, Any updates on this issue? On Fri, Nov 26, 2010 at 9:27 AM, Dave Williams <d...@opensourcesolutions.co.uk> wrote: > > > But this is what Senko's patch (2444.diff) fixes - so with that added it cures > > > the abort in both situations above. Now time to look at his potential ref leak

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-26 Thread Dave Williams
> > But this is what Senko's patch (2444.diff) fixes - so with that added it cures > > the abort in both situations above. Now time to look at his potential ref leak. > > That patch doesn't cure the cause, just works around it. lrmd > would just keep accumulating open IPC sockets. > > T

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-26 Thread Dejan Muhamedagic
Hi, On Thu, Nov 25, 2010 at 10:51:56PM +, Dave Williams wrote: > > To follow up it appears that lrmd is aborting with the same error > > message after executing both "crm configure verify" AND "lrmadmin -C" > > > > Strace yields the following: > > > > lrmd: [336]: debug: on_receive_cmd: the

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-25 Thread Dave Williams
> To follow up it appears that lrmd is aborting with the same error > message after executing both "crm configure verify" AND "lrmadmin -C" > > Strace yields the following: > > lrmd: [336]: debug: on_receive_cmd: the IPC to client [pid:342] > disconnected.\n > \nGThread-ERROR **: Trying to recur

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-25 Thread Dave Williams
>
> Adding the following patch
>
> --- lrmd.c 2010-11-19 17:51:44.0 +
> +++ ../../../lrmd.c 2010-11-24 20:40:49.351322794 +
> @@ -1104,6 +1104,9 @@
>
> register_pid(FALSE, sigterm_action);
>
> +
> + g_thread_init(NULL);
> +
> /* load RA plugins

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-25 Thread Andrew Beekhof
On Thu, Nov 25, 2010 at 10:35 AM, Dejan Muhamedagic wrote: > On Thu, Nov 25, 2010 at 09:47:51AM +0100, Andrew Beekhof wrote: >> On Wed, Nov 24, 2010 at 2:18 PM, Dejan Muhamedagic >> wrote: >> > Hi, >> > >> > On Wed, Nov 24, 2010 at 10:52:23AM +, Dave Williams wrote: >> >> On 10:35, Wed 24 No

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-25 Thread Dejan Muhamedagic
On Thu, Nov 25, 2010 at 09:47:51AM +0100, Andrew Beekhof wrote: > On Wed, Nov 24, 2010 at 2:18 PM, Dejan Muhamedagic > wrote: > > Hi, > > > > On Wed, Nov 24, 2010 at 10:52:23AM +, Dave Williams wrote: > >> On 10:35, Wed 24 Nov 10, Dejan Muhamedagic wrote: > >> > Hi, > >> > > >> > On Tue, Nov

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-25 Thread Andrew Beekhof
On Wed, Nov 24, 2010 at 2:18 PM, Dejan Muhamedagic wrote: > Hi, > > On Wed, Nov 24, 2010 at 10:52:23AM +, Dave Williams wrote: >> On 10:35, Wed 24 Nov 10, Dejan Muhamedagic wrote: >> > Hi, >> > >> > On Tue, Nov 23, 2010 at 11:03:33PM +, Dave Williams wrote: >> > > Hi, >> > > I have a probl

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dave Williams
> Adding the following patch
>
> --- lrmd.c 2010-11-19 17:51:44.0 +
> +++ ../../../lrmd.c 2010-11-24 20:40:49.351322794 +
> @@ -1104,6 +1104,9 @@
>
> register_pid(FALSE, sigterm_action);
>
> +
> + g_thread_init(NULL);
> +
> /* load RA plugins */

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dave Williams
On 16:16, Wed 24 Nov 10, Ante Karamatić wrote: > On Wed, 24 Nov 2010, at 10:52 +, Dave Williams wrote: > > > I currently have a production clustered server down because of this and > > the fact that Ubuntu (I'm advised) have an inconsistently compiled set > > of HA components. Certainly

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dave Williams
On 16:16, Wed 24 Nov 10, Ante Karamatić wrote: > On Wed, 24 Nov 2010, at 10:52 +, Dave Williams wrote: > > > I currently have a production clustered server down because of this and > > the fact that Ubuntu (I'm advised) have an inconsistently compiled set > > of HA components. Certainly

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Ante Karamatić
On Wed, 24 Nov 2010, at 10:52 +, Dave Williams wrote: > I currently have a production clustered server down because of this and > the fact that Ubuntu (I'm advised) have an inconsistently compiled set > of HA components. Certainly both lucid and maverick released packages > leave defu

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dave Williams
On 14:18, Wed 24 Nov 10, Dejan Muhamedagic wrote: > > The most plausible explanation is in this thread: > http://marc.info/?l=linux-ha-dev&m=128765996706209&w=2 > Thanks for your quick response Dejan, I understand what the thread says - which is why I posted my findings on the list. As I said it

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dejan Muhamedagic
Hi, On Wed, Nov 24, 2010 at 10:52:23AM +, Dave Williams wrote: > On 10:35, Wed 24 Nov 10, Dejan Muhamedagic wrote: > > Hi, > > > > On Tue, Nov 23, 2010 at 11:03:33PM +, Dave Williams wrote: > > > Hi, > > > I have a problem that looks similar to that reported "possible deadlock > > > in lr

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dave Williams
On 10:35, Wed 24 Nov 10, Dejan Muhamedagic wrote: > Hi, > > On Tue, Nov 23, 2010 at 11:03:33PM +, Dave Williams wrote: > > Hi, > > I have a problem that looks similar to the one reported as "possible deadlock > > in lrmd" on 21st Oct > > > > When running lrmadmin -C to list classes the first time it

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-24 Thread Dejan Muhamedagic
Hi, On Tue, Nov 23, 2010 at 11:03:33PM +, Dave Williams wrote: > Hi, > I have a problem that looks similar to the one reported as "possible deadlock > in lrmd" on 21st Oct > > When running lrmadmin -C to list classes the first time it comes back > immediately with the expected list e.g. > > r...@no

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-23 Thread Dave Williams
Hi, I have a problem that looks similar to the one reported as "possible deadlock in lrmd" on 21st Oct. When running lrmadmin -C to list classes the first time, it comes back immediately with the expected list, e.g. r...@node1:/home# lrmadmin -C There are 5 RA classes supported: lsb ocf stonith upstart hea

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-11-03 Thread Dejan Muhamedagic
Hi, On Thu, Oct 21, 2010 at 01:20:29PM +0200, Dejan Muhamedagic wrote: > On Thu, Oct 21, 2010 at 11:57:56AM +0200, Senko Rasic wrote: > > Hi, > > > > I'm replying directly to you with this, feel free to forward to the > > list. For further discussions I could also join the list. > > Yes, it woul

Re: [Linux-ha-dev] possible deadlock in lrmd?

2010-10-21 Thread Dejan Muhamedagic
On Thu, Oct 21, 2010 at 11:57:56AM +0200, Senko Rasic wrote: > Hi, > > I'm replying directly to you with this, feel free to forward to the > list. For further discussions I could also join the list. Yes, it would be good. > On 10/21/2010 11:34 AM, Dejan Muhamedagic wrote: > >Senko Rasic proposed

[Linux-ha-dev] possible deadlock in lrmd?

2010-10-21 Thread Dejan Muhamedagic
Hi, Senko Rasic proposed a patch for the client unregister which would prevent a double unref of glib sources (IPC channel). However, I cannot recall any deadlocks in lrmd. See http://www.init.hr/dev/cluster/patches/2444.diff Senko, is this something you observed or you just thought it might occ
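
For readers unfamiliar with the "double unref" problem mentioned in this thread, here is a minimal sketch of the general idea in plain C. It is not the content of 2444.diff (which is not reproduced in this listing) and it does not use glib; source_t, client_t, source_new, source_unref and client_unregister are all hypothetical names invented for illustration. The point is only that an unregister path which clears its pointer after dropping the reference can be called twice without releasing the same object twice.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ref-counted event source.
 * This is NOT glib's GSource; it only mimics the ref/unref idea. */
typedef struct {
    int refs;        /* current reference count */
    int *destroyed;  /* optional counter, incremented when the object is freed */
} source_t;

/* Create a source holding one reference; returns NULL if allocation fails. */
source_t *source_new(int *destroyed)
{
    source_t *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    s->destroyed = destroyed;
    return s;
}

/* Drop one reference; free the object when the count reaches zero. */
void source_unref(source_t *s)
{
    if (--s->refs == 0) {
        if (s->destroyed != NULL)
            (*s->destroyed)++;
        free(s);
    }
}

/* Hypothetical client record that owns one reference to its IPC source. */
typedef struct {
    source_t *src;
} client_t;

/* Idempotent unregister: release the reference once, then clear the
 * pointer so that a second call has nothing left to unref.
 * Returns 1 if a reference was released, 0 if there was none. */
int client_unregister(client_t *c)
{
    if (c->src == NULL)
        return 0;
    source_unref(c->src);
    c->src = NULL;
    return 1;
}
```

With this guard, calling client_unregister twice on the same client releases the source exactly once; the second call is a no-op. Whether a guard of this kind treats the cause or only the symptom in lrmd is exactly what is debated elsewhere in the thread.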