OK, we will certainly test your fix on our systems and report the
results to you. If the tests succeed, we will be glad to see this bug fix
in the next OpenIPMI library release! :-)
So we are waiting for your fix.
Thanks.
P.S. Why don't you use native POSIX recursive mutexes, i.e.
pthread_mutexattr_settype(attr, PTHREAD_MUTEX_RECURSIVE)?
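To make the P.S. concrete, here is a minimal sketch of what I mean. It is only an illustration under my own assumptions: the helper names init_recursive_mutex() and recursive_lock_demo() are hypothetical and are not part of OpenIPMI, and I have not checked how this would fit into posix_thread_os_hnd.c.

```c
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_RECURSIVE under strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenIPMI function): initialise *m as a
 * recursive mutex. Returns 0 on success or a pthread error code. */
static int init_recursive_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rv;

    assert(m != NULL);

    rv = pthread_mutexattr_init(&attr);
    if (rv)
        return rv;

    rv = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rv == 0)
        rv = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rv;
}

/* Hypothetical demo: the same thread claims the mutex twice and releases
 * it twice. Returns 0 if every step succeeded. */
static int recursive_lock_demo(void)
{
    pthread_mutex_t m;
    int rv = init_recursive_mutex(&m);

    if (rv)
        return rv;

    rv = pthread_mutex_lock(&m);          /* first claim */
    if (rv == 0) {
        rv = pthread_mutex_lock(&m);      /* second claim by the same thread */
        if (rv == 0)
            pthread_mutex_unlock(&m);     /* release the second claim */
        pthread_mutex_unlock(&m);         /* release the first claim */
    }

    pthread_mutex_destroy(&m);
    return rv;
}
```

With a mutex created this way, the pthread library itself keeps the owner and the lock count, so a wrapper would not have to inspect lock_count and owner before taking the mutex. Whether that suits the os_handler interface is something only you can judge.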
2015-06-24 4:01 GMT+03:00 Corey Minyard <[email protected]>:
> Well, unless I can reproduce this or have someone to test it, I can't
> really fix it. I believe your change will cause problems elsewhere, though
> it may be that no more users of recursive mutexes exist. But I'm not sure.
>
> So I can apply my fix, which is better, but I am unsure it fixes your
> problem. I would feel better with SMP barriers, but C offers no general
> implementation.
>
> -corey
>
> On Jun 23, 2015 2:51 PM, "Павел Лобанов" <[email protected]> wrote:
> >
> > I am not sure that we really fixed this bug. But after our changes we
> > no longer see any problems with the OpenIPMI module in collectd.
> >
> > We don't have time to study your recursive mutex implementation and
> > fix this bug in the OpenIPMI library properly. Please look into this
> > bug. We eagerly await your fix.
> >
> > Thanks.
> >
> > 2015-06-18 21:16 GMT+03:00 Corey Minyard <[email protected]>:
> >>
> >> Are you sure that really fixes the problem? The mutexes are supposed to
> >> be recursive (so the same thread can claim the same mutex multiple
> >> times) and you are likely causing a deadlock someplace with this change.
> >>
> >> Actually I see a bug here, but it's more subtle. The id->owner of a
> >> mutex needs to be set to an invalid value when the lock count reaches
> >> zero. Otherwise the check:
> >>
> >> if ((id->lock_count == 0) || (pthread_self() != id->owner)) {
> >>
> >> can race with other threads claiming the mutex. Can you try setting
> >>
> >> id->owner = 0
> >>
> >> right after (before the unlock):
> >>
> >> if (id->lock_count == 0) {
> >>
> >> Thanks,
> >>
> >> -corey
> >>
> >> On 06/12/2015 03:03 PM, Павел Лобанов wrote:
> >> > I am using the collectd ipmi module, which is based on the OpenIPMI
> >> > library. In some cases I have observed a deadlock between OpenIPMI
> >> > library threads. The deadlock occurs here:
> >> >
> >> > Thread 5 (Thread 0x2b2909116700 (LWP 19188)):
> >> > #0 0x0000003b9540e264 in __lll_lock_wait () from /lib64/libpthread.so.0
> >> > #1 0x0000003b95409508 in _L_lock_854 () from /lib64/libpthread.so.0
> >> > #2 0x0000003b954093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> >> > #3 0x00002b2906320284 in lock (handler=<value optimized out>, id=0x2b290c00b6d0) at posix_thread_os_hnd.c:445
> >> > #4 0x00002b2906772498 in _ipmi_domain_get (domain=0x2b290c00d3a0) at domain.c:1313
> >> > #5 0x00002b29067727c1 in ipmi_domain_pointer_cb (id=<value optimized out>, handler=0x2b290677f060 <mc_ptr_cb>, cb_data=0x2b2909115d50) at domain.c:4033
> >> > #6 0x00002b290677d6bd in ipmi_mc_pointer_cb (id=..., handler=<value optimized out>, cb_data=<value optimized out>) at mc.c:2610
> >> > #7 0x00002b2906795a92 in ipmi_sensor_pointer_cb (id=..., handler=<value optimized out>, cb_data=<value optimized out>) at sensor.c:390
> >> > #8 0x00002b2906795b56 in ipmi_sensor_id_get_reading (sensor_id=..., done=<value optimized out>, cb_data=<value optimized out>) at sensor.c:5915
> >> > #9 0x00002b2906115851 in sensor_list_read_all () at ipmi.c:522
> >> > #10 c_ipmi_read () at ipmi.c:775
> >> > #11 0x000000000041bbe2 in plugin_read_thread (args=<value optimized out>) at plugin.c:526
> >> > #12 0x0000003b954079d1 in start_thread () from /lib64/libpthread.so.0
> >> > #13 0x0000003b950e8b6d in clone () from /lib64/libc.so.6
> >> >
> >> > This bug is still present in openipmi-2.0.21, and we have fixed it.
> >> >
> >> > Patch to fix it:
> >> >
> >> > --- unix/posix_thread_os_hnd.c 2013-10-10 23:09:17.000000000 +0400
> >> > +++ unix/posix_thread_os_hnd.c 2015-05-21 20:54:45.000000000 +0300
> >> > @@ -439,12 +439,9 @@ static int
> >> > lock(os_handler_t *handler,
> >> > os_hnd_lock_t *id)
> >> > {
> >> > - int rv;
> >> > -
> >> > - if ((id->lock_count == 0) || (pthread_self() != id->owner)) {
> >> > -rv = pthread_mutex_lock(&id->mutex);
> >> > -if (rv)
> >> > - return rv;
> >> > + int rv = pthread_mutex_lock(&id->mutex);
> >> > + if (rv) {
> >> > + return rv;
> >> > }
> >> > id->owner = pthread_self();
> >> > id->lock_count++;
> >> > @@ -462,12 +459,10 @@ unlock(os_handler_t *handler,
> >> > if (pthread_self() != id->owner)
> >> > handler->log(handler, IPMI_LOG_FATAL, "lock release by non-owner");
> >> > id->lock_count--;
> >> > - if (id->lock_count == 0) {
> >> > -rv = pthread_mutex_unlock(&id->mutex);
> >> > -if (rv) {
> >> > - id->lock_count++;
> >> > - return rv;
> >> > -}
> >> > + rv = pthread_mutex_unlock(&id->mutex);
> >> > + if (rv) {
> >> > + id->lock_count++;
> >> > + return rv;
> >> > }
> >> > return 0;
> >> > }
> >> >
> >> > _______________________________________________
> >> > Openipmi-developer mailing list
> >> > [email protected]
> >> > https://lists.sourceforge.net/lists/listinfo/openipmi-developer
> >>
> >
>