Hello Steve,

On Tuesday, 03.01.2012, at 22:14 -0600, Steve M. Robbins wrote:
> tags 653922 + more-info
> thanks
>
> On Sun, Jan 01, 2012 at 03:57:56PM +0100, Tobias Frost wrote:
>
> > for solarpowerlog I implemented a worker thread to schedule tasks to be
> > executed at a specified time.
> > It receives its work by pushing it to a list and then calling
> > thread.interrupt(). [1] line 79
> >
> > This scheme has worked fine since boost 1.38, but on a recent recompile I
> > found that occasionally (e.g. after the program had been running for
> > several hours) the flow stopped.
> >
> > Debugging the issue, I found that the program always halted "inside" of
> > libboost, obviously trying to obtain some mutex.
>
> This gives folks very little information to work with.  If you find
> out more, please update this bug.

I know, sorry, but I merely wanted to document it, as I found a solution,
and I think the work needed to provide a simple test case outweighs the
importance of this.
...

However, here is some more information which might help a little:

- I also encountered the bug when compiling solarpowerlog for Win32 (using
  cygwin). However, I only saw the symptoms; I did not debug on Win32.

I can provide a backtrace using gdb (linux):

Thread 1 is the thread which calls the interrupt function.
Thread 2 is the one to be interrupted.
(Thread 3 is also a boost thread, but unrelated to the issue, and it is also
not being interrupted by boost::thread::interrupt.)

Command name abbreviations are allowed if unambiguous.
(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0xb70ffb70 (LWP 8301) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
  2    Thread 0xb7a0ab70 (LWP 8300) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
* 1    Thread 0xb7a0c710 (LWP 8297) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fe2424 in __kernel_vsyscall ()
#1  0xb7c38f02 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2  0xb7c3439b in _L_lock_728 () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3  0xb7c341c1 in __pthread_mutex_lock (mutex=0x80c7c70) at pthread_mutex_lock.c:61
#4  0xb7ddf371 in boost::thread::interrupt() () from /usr/lib/libboost_thread.so.1.46.1
#5  0x0805d83b in CTimedWork::ScheduleWork (this=0x80c77c8, Command=0xb71015c0, ts=...) at interfaces/CTimedWork.cpp:115
#6  0x08068ce1 in CCSVOutputFilter::ExecuteCommand (this=0xb7100818, cmd=0xb7100e80) at DataFilters/CCSVOutputFilter.cpp:233
#7  0x080a12da in ICommand::execute (this=0xb7100e80) at patterns/ICommand.cpp:64
#8  0x0805bf67 in CWorkScheduler::DoWork (this=0x80c7400, block=true) at interfaces/CWorkScheduler.cpp:96
#9  0x08059c3f in main (argc=-1223685708, argv=0x80c7104) at solarpowerlog.cpp:569
(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0xb70ffb70 (LWP 8301) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
  2    Thread 0xb7a0ab70 (LWP 8300) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
* 1    Thread 0xb7a0c710 (LWP 8297) "solarpowerlog" 0xb7fe2424 in __kernel_vsyscall ()
(gdb) thread 2
[Switching to thread 2 (Thread 0xb7a0ab70 (LWP 8300))]
#0  0xb7fe2424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fe2424 in __kernel_vsyscall ()
#1  0xb7c38f02 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2  0xb7c3439b in _L_lock_728 () from /lib/i386-linux-gnu/i686/cmov/libpthread.so.0
#3  0xb7c341c1 in __pthread_mutex_lock (mutex=0x80c7bf8) at pthread_mutex_lock.c:61
#4  0xb7ddf6cb in boost::this_thread::interruption_point() () from /usr/lib/libboost_thread.so.1.46.1
#5  0xb7de15fc in boost::this_thread::sleep(boost::posix_time::ptime const&) () from /usr/lib/libboost_thread.so.1.46.1
#6  0x0805e1d6 in sleep<boost::posix_time::time_duration> (rel_time=<synthetic pointer>) at /usr/include/boost/thread/pthread/thread_data.hpp:138
#7  CTimedWork::_main (this=0x80c77c8) at interfaces/CTimedWork.cpp:181
#8  0xb7ddeebc in thread_proxy () from /usr/lib/libboost_thread.so.1.46.1
#9  0xb7c31c39 in start_thread (arg=0xb7a0ab70) at pthread_create.c:304
#10 0xb7b9e98e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Backtrace stopped: Not enough registers or memory available to unwind further
(gdb)

Finally, to document the versions I use:

tobi@mordor:~/workspace/solarpowerlog/tools/sputnik_simulator$ dpkg -l libboost*-dev | grep ^ii
ii  libboost-date-time1.46-dev        1.46.1-8  set of date-time libraries based on generic programming concepts
ii  libboost-dev                      1.48.0.2  Boost C++ Libraries development files (default version)
ii  libboost-filesystem1.46-dev       1.46.1-8  filesystem operations (portable paths, iteration over directories, etc) in C++
ii  libboost-graph-parallel1.46-dev   1.46.1-8  generic graph components and algorithms in C++
ii  libboost-graph1.46-dev            1.46.1-8  generic graph components and algorithms in C++
ii  libboost-iostreams1.46-dev        1.46.1-8  Boost.Iostreams Library development files
ii  libboost-math1.46-dev             1.46.1-8  Boost.Math Library development files
ii  libboost-mpi1.46-dev              1.46.1-8  C++ interface to the Message Passing Interface (MPI)
ii  libboost-program-options1.46-dev  1.46.1-8  program options library for C++
ii  libboost-python1.46-dev           1.46.1-8  Boost.Python Library development files
ii  libboost-random1.46-dev           1.46.1-8  Boost Random Number Library
ii  libboost-regex1.46-dev            1.46.1-8  regular expression library for C++
ii  libboost-serialization1.46-dev    1.46.1-8  serialization library for C++
ii  libboost-signals1.46-dev          1.46.1-8  managed signals and slots library for C++
ii  libboost-system1.46-dev           1.46.1-8  Operating system (e.g. diagnostics support) library
ii  libboost-test1.46-dev             1.46.1-8  components for writing and executing test suites
ii  libboost-thread1.46-dev           1.46.1-8  portable C++ multi-threading
ii  libboost-wave1.46-dev             1.46.1-8  C99/C++ preprocessor library
ii  libboost1.46-dev                  1.46.1-8  Boost C++ Libraries development files

> > Recompilation with 1.48 fixes the problem (at least after 10hours for 10
> > instances), so I assume only 1.46 is affected.
>
> OK.  Since upstream is already working on 1.49, I will not forward
> this unless you find out that 1.48 is affected.
>
> Thanks,
> -Steve

I am *quite sure* that 1.48 does not show this issue, as I have again put
several hundred hours of execution time into it and the hang did not show up
anymore.

So I think it is not worth the effort to dig deeper or to report it upstream
(unless one wants to document the issue).
If you think it is worth taking a look, let me know and I will prepare
instructions on how to reproduce it (basically by running solarpowerlog and
waiting...).

Best regards,
Tobi
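
For reference, the scheduling scheme described in this report (push work onto a list, then wake the sleeping worker thread) can also be sketched without boost::thread::interrupt(). The following is only a minimal sketch under assumptions. It is not solarpowerlog's CTimedWork code, and it has not been checked against this bug. It swaps in a C++11 std::condition_variable for the interrupt-based wakeup, and the class name TimedWorkSketch and its schedule() method are hypothetical names chosen for illustration.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a timed-work scheduler (NOT solarpowerlog's
// CTimedWork). Assumption: waking the worker through a condition variable
// replaces the boost::thread::interrupt() call used in the original scheme.
class TimedWorkSketch {
public:
    using Clock = std::chrono::steady_clock;
    using Task = std::function<void()>;

    TimedWorkSketch() : stop_(false), worker_(&TimedWorkSketch::run, this) {}

    // Stops the worker; tasks that are still pending are discarded.
    ~TimedWorkSketch() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wakeup_.notify_one();
        worker_.join();
    }

    // Queue a task to run after `delay`, then wake the worker so that it
    // can recompute how long it has to sleep.
    void schedule(Clock::duration delay, Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.emplace(Clock::now() + delay, std::move(task));
        }
        wakeup_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stop_) {
            if (pending_.empty()) {
                wakeup_.wait(lock);            // nothing queued: sleep until notified
                continue;
            }
            const Clock::time_point due = pending_.begin()->first;
            if (Clock::now() < due) {
                wakeup_.wait_until(lock, due); // sleep until due or until notified
                continue;                      // re-check: an earlier task may have arrived
            }
            Task task = std::move(pending_.begin()->second);
            pending_.erase(pending_.begin());
            lock.unlock();                     // never run user code while holding the lock
            task();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    std::multimap<Clock::time_point, Task> pending_;  // ordered by due time
    bool stop_;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

A caller would construct one TimedWorkSketch and call, for example, schedule(std::chrono::seconds(5), task). The queue and the wakeup share a single mutex in this sketch, so the scheduling thread and the sleeping worker do not have to take two separate library-internal locks, as the interrupt() and interruption_point() frames in the backtraces above do.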