On 06/25/2015 02:22 PM, Павел Лобанов wrote:
> OK, we will definitely test your fixes on our systems.
> After testing we will report the results to you. If they are successful,
> we will be glad to see these bug fixes in the next OpenIPMI library release! :-)

Certainly :).


>
> So we are waiting for your bug fixes.
>
> Thanks.
>
> P.S. Why don't you use native GNU recursive mutexes, i.e.
> pthread_mutexattr_settype(attr, PTHREAD_MUTEX_RECURSIVE)?
>

Well, I thought that was a Linux extension, but I appear to be wrong. 
That would be a better solution by far.

I'll fix it that way.  Do you want to wait for that patch?
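
For reference, here is a minimal sketch of what a natively recursive POSIX
mutex looks like. This is not the actual OpenIPMI patch; the function names
recursive_mutex_init and recursive_mutex_demo are made up for illustration,
and only standard pthread calls are used:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_RECURSIVE on strict builds */
#endif
#include <pthread.h>

/* Hypothetical helper (not OpenIPMI code): initialize *mutex as recursive.
 * Returns 0 on success or a pthread error number. */
int recursive_mutex_init(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rv;

    rv = pthread_mutexattr_init(&attr);
    if (rv)
        return rv;

    rv = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (!rv)
        rv = pthread_mutex_init(mutex, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rv;
}

/* Hypothetical demo: claim the same mutex twice from one thread, then
 * release it twice.  Returns 0 if every step succeeded. */
int recursive_mutex_demo(void)
{
    pthread_mutex_t mutex;
    int rv;

    rv = recursive_mutex_init(&mutex);
    if (rv)
        return rv;

    rv = pthread_mutex_lock(&mutex);
    if (rv)
        goto out;

    /* Nested claim by the owner: succeeds instead of deadlocking. */
    rv = pthread_mutex_lock(&mutex);
    if (rv) {
        pthread_mutex_unlock(&mutex);
        goto out;
    }

    /* One unlock per lock; the mutex is free after the second one. */
    pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);

out:
    pthread_mutex_destroy(&mutex);
    return rv;
}
```

With the mutex type set this way, the pthread library itself tracks the
owner and the nesting count, so the hand-rolled id->owner / id->lock_count
checks in lock() and unlock() would presumably no longer be needed.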

Thanks,

-corey

>
>
>
> 2015-06-24 4:01 GMT+03:00 Corey Minyard <[email protected]>:
>
>     Well, unless I can reproduce this or have someone to test it, I
>     can't really fix it.  I believe your change will cause problems
>     elsewhere, though it may be that no more users of recursive
>     mutexes exist.  But I'm not sure.
>
>     So I can apply my fix, which is better, but I am unsure it fixes
>     your problem.  I would feel better with SMP barriers, but C offers
>     no general implementation.
>
>     -corey
>
>     On Jun 23, 2015 2:51 PM, "Павел Лобанов" <[email protected]> wrote:
>     >
>     > I am not sure that we really fixed this bug. But after our changes
>     > we no longer see any problems with the OpenIPMI module in collectd.
>     >
>     > We don't have time to study your recursive mutex implementation
>     > and fix this bug in the OpenIPMI library properly. Please look into
>     > this bug. We eagerly await your fixes.
>     >
>     > Thanks.
>     >
>     > 2015-06-18 21:16 GMT+03:00 Corey Minyard <[email protected]>:
>     >>
>     >> Are you sure that really fixes the problem?  The mutexes are supposed to
>     >> be recursive (so the same thread can claim the same mutex multiple
>     >> times) and you are likely causing a deadlock someplace with this change.
>     >>
>     >> Actually I see a bug here, but it's more subtle.  The id->owner of a
>     >> mutex needs to be set to an invalid value when the lock count reaches
>     >> zero.  Otherwise the check:
>     >>
>     >>  if ((id->lock_count == 0) || (pthread_self() != id->owner)) {
>     >>
>     >> can race with other threads claiming the mutex.  Can you try setting
>     >>
>     >>  id->owner = 0
>     >>
>     >> right after (before the unlock):
>     >>
>     >>  if (id->lock_count == 0) {
>     >>
>     >> Thanks,
>     >>
>     >> -corey
>     >>
>     >> On 06/12/2015 03:03 PM, Павел Лобанов wrote:
>     >> > I am using the ipmi module in collectd, which is based on the
>     >> > OpenIPMI library. In some cases I have observed a deadlock between
>     >> > OpenIPMI library threads. The deadlock occurred here:
>     >> >
>     >> > Thread 5 (Thread 0x2b2909116700 (LWP 19188)):
>     >> > #0  0x0000003b9540e264 in __lll_lock_wait () from /lib64/libpthread.so.0
>     >> > #1  0x0000003b95409508 in _L_lock_854 () from /lib64/libpthread.so.0
>     >> > #2  0x0000003b954093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
>     >> > #3  0x00002b2906320284 in lock (handler=<value optimized out>, id=0x2b290c00b6d0) at posix_thread_os_hnd.c:445
>     >> > #4  0x00002b2906772498 in _ipmi_domain_get (domain=0x2b290c00d3a0) at domain.c:1313
>     >> > #5  0x00002b29067727c1 in ipmi_domain_pointer_cb (id=<value optimized out>, handler=0x2b290677f060 <mc_ptr_cb>, cb_data=0x2b2909115d50) at domain.c:4033
>     >> > #6  0x00002b290677d6bd in ipmi_mc_pointer_cb (id=..., handler=<value optimized out>, cb_data=<value optimized out>) at mc.c:2610
>     >> > #7  0x00002b2906795a92 in ipmi_sensor_pointer_cb (id=..., handler=<value optimized out>, cb_data=<value optimized out>) at sensor.c:390
>     >> > #8  0x00002b2906795b56 in ipmi_sensor_id_get_reading (sensor_id=..., done=<value optimized out>, cb_data=<value optimized out>) at sensor.c:5915
>     >> > #9  0x00002b2906115851 in sensor_list_read_all () at ipmi.c:522
>     >> > #10 c_ipmi_read () at ipmi.c:775
>     >> > #11 0x000000000041bbe2 in plugin_read_thread (args=<value optimized out>) at plugin.c:526
>     >> > #12 0x0000003b954079d1 in start_thread () from /lib64/libpthread.so.0
>     >> > #13 0x0000003b950e8b6d in clone () from /lib64/libc.so.6
>     >> >
>     >> > This bug is still present in openipmi-2.0.21, and we fixed it.
>     >> >
>     >> >
>     >> > Patch to fix it:
>     >> >
>     >> > --- unix/posix_thread_os_hnd.c	2013-10-10 23:09:17.000000000 +0400
>     >> > +++ unix/posix_thread_os_hnd.c	2015-05-21 20:54:45.000000000 +0300
>     >> > @@ -439,12 +439,9 @@ static int
>     >> >  lock(os_handler_t  *handler,
>     >> >       os_hnd_lock_t *id)
>     >> >  {
>     >> > -    int rv;
>     >> > -
>     >> > -    if ((id->lock_count == 0) || (pthread_self() != id->owner)) {
>     >> > -	rv = pthread_mutex_lock(&id->mutex);
>     >> > -	if (rv)
>     >> > -	    return rv;
>     >> > +    int rv = pthread_mutex_lock(&id->mutex);
>     >> > +    if (rv) {
>     >> > +        return rv;
>     >> >      }
>     >> >      id->owner = pthread_self();
>     >> >      id->lock_count++;
>     >> > @@ -462,12 +459,10 @@ unlock(os_handler_t  *handler,
>     >> >      if (pthread_self() != id->owner)
>     >> >  	handler->log(handler, IPMI_LOG_FATAL, "lock release by non-owner");
>     >> >      id->lock_count--;
>     >> > -    if (id->lock_count == 0) {
>     >> > -	rv = pthread_mutex_unlock(&id->mutex);
>     >> > -	if (rv) {
>     >> > -	    id->lock_count++;
>     >> > -	    return rv;
>     >> > -	}
>     >> > +    rv = pthread_mutex_unlock(&id->mutex);
>     >> > +    if (rv) {
>     >> > +        id->lock_count++;
>     >> > +        return rv;
>     >> >      }
>     >> >      return 0;
>     >> >  }
>     >> >
>     >> >
>     >> >
>     >> >
>     >> > ------------------------------------------------------------------------------
>     >> >
>     >> >
>     >> > _______________________________________________
>     >> > Openipmi-developer mailing list
>     >> > [email protected]
>     >> > https://lists.sourceforge.net/lists/listinfo/openipmi-developer
>     >>
>     >
>
>

