OK, I can reproduce it. I will work on a fix by next Tuesday or Wednesday.

Thanks
Christoph

2012/8/9 BigQiao <[email protected]>

> This deadlock still exists in 0.9.6.0 when deleting a TableScanner.
>
> The TableScanner destructor locks IndexScannerCallback, then TableScannerAsync.
> A Database Working Thread locks TableScannerAsync, then IndexScannerCallback.
>
> Thread 14 (Thread 0x7fffee266700 (LWP 10936)): //Database Working Thread
> #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
> #1 0x00007ffff79c8179 in _L_lock_953 () from /lib/libpthread.so.0
> #2 0x00007ffff79c7f9b in __pthread_mutex_lock (mutex=0xc08630) at pthread_mutex_lock.c:61
> #3 0x0000000000477886 in boost::mutex::lock (this=0xc08630) at /usr/include/boost/thread/pthread/mutex.hpp:50
> #4 0x000000000047e790 in boost::unique_lock<boost::mutex>::lock (this=0x7fffee2638e0) at /usr/include/boost/thread/locks.hpp:349
> #5 0x000000000047d51d in unique_lock (this=0x7fffee2638e0, m_=...) at /usr/include/boost/thread/locks.hpp:227
> #6 0x00000000005f0a07 in Hypertable::IndexScannerCallback::scan_ok(Hypertable::TableScannerAsync*, boost::intrusive_ptr<Hypertable::ScanCells>&) ()
> #7 0x00000000005ed180 in Hypertable::TableScannerAsync::maybe_callback_ok (this=0x10e7b50, scanner_id=1, next=true, do_callback=true, cells=...)
>     at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:522
> #8 0x00000000005ec5cc in Hypertable::TableScannerAsync::handle_result (this=0x10e7b50, scanner_id=1, event=..., is_create=true)
>     at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:459
> #9 0x00000000006286d2 in Hypertable::TableScannerHandler::run (this=0x7fffe8049e30) at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerHandler.cc:40
> #10 0x000000000047b625 in Hypertable::ApplicationQueue::Worker::operator()() ()
> #11 0x000000000048dbd2 in boost::detail::thread_data<Hypertable::ApplicationQueue::Worker>::run() ()
> #12 0x00007ffff77b5200 in thread_proxy () from /usr/lib/libboost_thread.so.1.42.0
> #13 0x00007ffff79c58ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
> #14 0x00007ffff4978b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #15 0x0000000000000000 in ?? ()
>
> Thread 27 (Thread 0x7fffe33ee700 (LWP 10949)): //TableScanner Destructor Thread
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
> #1 0x000000000047d87f in boost::condition_variable_any::wait<boost::unique_lock<boost::mutex> > (this=0x10e7c18, m=...)
>     at /usr/include/boost/thread/pthread/condition_variable.hpp:84
> #2 0x00000000005ed224 in Hypertable::TableScannerAsync::wait_for_completion (this=0x10e7b50)
>     at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:535
> #3 0x00000000005eb370 in ~TableScannerAsync (this=0x10e7b50, __in_chrg=<value optimized out>)
>     at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:318
> #4 0x00000000005f01d3 in Hypertable::IndexScannerCallback::~IndexScannerCallback() ()
> #5 0x00000000005eb579 in ~TableScannerAsync (this=0xc04ca0, __in_chrg=<value optimized out>)
>     at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:324
> #6 0x0000000000444847 in Hypertable::intrusive_ptr_release (rc=0xc04ca0) at /opt/hypertable/0.9.6.0/include/Common/ReferenceCount.h:73
> #7 0x00000000005e70e3 in boost::intrusive_ptr<Hypertable::TableScannerAsync>::~intrusive_ptr() ()
> #8 0x00000000005e6cf3 in Hypertable::TableScanner::~TableScanner() ()
> #9 0x000000000043c943 in DBRecycled::run (this=0xa95c60) at /home/qiao/Project/Bingo/DistributedSpider/DBRecycled.cpp:48
> #10 0x000000000046eed7 in thread_proc (param=0x7fffe805ae00) at /home/qiao/Project/Bingo/DistributedSpider/shared/Threading/ThreadPool.cpp:331
> #11 0x00007ffff79c58ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
> #12 0x00007ffff4978b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #13 0x0000000000000000 in ?? ()
>
>> Sorry for the delay - I was finally able to reproduce it, and I also fixed it.
>>
>> The commit is a bit larger than my first try.
>>
>> https://github.com/cruppstahl/hypertable/commits/v0.9.5
>>
>> commit b45ba15b701373c3a1f689f8997f31bde8ff5165
>> Author: Christoph Rupp <[email protected]>
>> Date: Wed Apr 25 18:54:11 2012 +0200
>>
>>     issue 827: fixed deadlock when scanning secondary indices
>>
>> Thanks again for your great help!
>>
>> Best regards
>> Christoph
>>
>> 2012/4/26 gcc.lua <[email protected]>
>>
>>> Hi,
>>>
>>> Thanks for the quick reply, but the commit just removes m_mutex inside
>>> virtual ~IndexScannerCallback(). I tried it, and a new problem occurred
>>> (see the end of this report). Here is some additional information on how
>>> I reproduced this issue before your commit:
>>>
>>> void run()
>>> {
>>>   TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
>>>
>>>   while( aScanner->next( gotCell ) )
>>>   {
>>>     ....
>>>     if(condition)
>>>       break; // more results are still pending here, so the internal scanner thread is still running
>>>     ....
>>>   }
>>>   return; // triggers the TableScanner destructor; for what happens next, please see my first post
>>> }
>>>
>>> //////////////////////////////////////////////////////////////////////////////////////
>>>
>>> pure virtual method called
>>> terminate called without an active exception
>>>
>>> Program received signal SIGABRT, Aborted.
>>> [Switching to Thread 0x7fffe6ff5700 (LWP 23887)]
>>> 0x00007ffff48db1b5 in raise () from /lib/libc.so.6
>>>
>>> (gdb) where
>>> #0 0x00007ffff48db1b5 in raise () from /lib/libc.so.6
>>> #1 0x00007ffff48ddfc0 in abort () from /lib/libc.so.6
>>> #2 0x00007ffff516fdc5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
>>> #3 0x00007ffff516e166 in ?? () from /usr/lib/libstdc++.so.6
>>> #4 0x00007ffff516e193 in std::terminate() () from /usr/lib/libstdc++.so.6
>>> #5 0x00007ffff516ea6f in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6
>>> #6 0x00000000005c43c6 in Hypertable::TableScannerAsync::maybe_callback_ok (this=0x7fffb432ecd0, scanner_id=19373, next=true, do_callback=true, cells=...)
>>>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:520
>>> #7 0x00000000005c393f in Hypertable::TableScannerAsync::handle_result (this=0x7fffb432ecd0, scanner_id=19373, event=..., is_create=true)
>>>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:464
>>> #8 0x00000000005fdc5e in Hypertable::TableScannerHandler::run (this=0x7fff99915850) at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerHandler.cc:40
>>> #9 0x000000000045f2c5 in Hypertable::ApplicationQueue::Worker::operator() (this=0xaaa120) at /root/qiao/Project/hypertable-0.9.5.6/src/cc/AsyncComm/ApplicationQueue.h:173
>>> #10 0x0000000000470f04 in boost::detail::thread_data<Hypertable::ApplicationQueue::Worker>::run (this=0xaa9ff0) at /usr/include/boost/thread/detail/thread.hpp:56
>>> #11 0x00007ffff77b5200 in thread_proxy () from /usr/lib/libboost_thread.so.1.42.0
>>> #12 0x00007ffff79c58ca in start_thread () from /lib/libpthread.so.0
>>> #13 0x00007ffff497892d in clone () from /lib/libc.so.6
>>> #14 0x0000000000000000 in ?? ()
>>>
>>> On April 26, 12:56 AM, Christoph Rupp <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > thanks for the great bug report.
>>> >
>>> > I am not able to reproduce this issue, but I think I came up with a fix.
>>> > If you want to check out the sources, you can get them here:
>>> > https://github.com/cruppstahl/hypertable branch "v0.9.5"
>>> >
>>> > This is the commit:
>>> > commit 2572b5dcb524e1c36dc23307c37784fd34c1bdde
>>> > Author: Christoph Rupp <[email protected]>
>>> > Date: Wed Apr 25 18:54:11 2012 +0200
>>> >
>>> >     issue 827: fixed deadlock when scanning secondary indices
>>> >
>>> > And here's the diff:
>>> >
>>> > diff --git a/src/cc/Hypertable/Lib/IndexScannerCallback.h b/src/cc/Hypertable/Li
>>> > index 70ffda7..1b37127 100644
>>> > --- a/src/cc/Hypertable/Lib/IndexScannerCallback.h
>>> > +++ b/src/cc/Hypertable/Lib/IndexScannerCallback.h
>>> > @@ -118,13 +118,12 @@ static String last;
>>> >      }
>>> >
>>> >      virtual ~IndexScannerCallback() {
>>> > -      ScopedLock lock(m_mutex);
>>> > -      if (m_mutator)
>>> > -        delete m_mutator;
>>> >        foreach (TableScannerAsync *s, m_scanners)
>>> >          delete s;
>>> >        m_scanners.clear();
>>> >        sspecs_clear();
>>> > +      if (m_mutator)
>>> > +        delete m_mutator;
>>> >
>>> > Can you please give it a try and see if this helps?
>>> >
>>> > Thanks
>>> > Christoph
>>> >
>>> > 2012/4/24 gcc.lua <[email protected]>
>>> >
>>> > > The user thread logic looks like this:
>>> > >   TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
>>> > >   while( aScanner->next( gotCell ) )
>>> > >   {
>>> > >     .....
>>> > >   }
>>> > >
>>> > > There is a deadlock between the user thread and the scanner thread:
>>> > >
>>> > > 1. User thread (TableScanner)
>>> > >
>>> > > TableScannerAsync::~TableScannerAsync() {
>>> > >   try {
>>> > >     cancel();
>>> > >     wait_for_completion();
>>> > >   }
>>> > >   catch (Exception &e) {
>>> > >     HT_ERROR_OUT << e << HT_END;
>>> > >   }
>>> > >   if (m_use_index) {
>>> > >     delete m_cb; // <=== deadlock entry
>>> > >     m_cb = 0;
>>> > >   }
>>> > > }
>>> > > /////////////////////////////////////////
>>> > > virtual ~IndexScannerCallback() {
>>> > >   ScopedLock lock(m_mutex); // <=== user thread holds IndexScannerCallback::m_mutex
>>> > >   if (m_mutator)
>>> > >     delete m_mutator;
>>> > >
>>> > >   foreach (TableScannerAsync *s, m_scanners)
>>> > >     delete s; // deadlock 1 <=== user thread waits for TableScannerAsync::m_mutex
>>> > >
>>> > > 2. Scanner thread
>>> > >
>>> > > void TableScannerAsync::handle_result(int scanner_id, EventPtr &event, bool is_create) {
>>> > >
>>> > >   bool cancelled = is_cancelled();
>>> > >   ScopedLock lock(m_mutex); // <=== scanner thread holds TableScannerAsync::m_mutex
>>> > >   ScanCellsPtr cells;
>>> > >
>>> > >   . . . . . .
>>> > >   maybe_callback_ok(); // <=== calls m_cb->scan_ok(this, cells);
>>> > >
>>> > > }
>>> > > //////////////////////////////
>>> > > class IndexScannerCallback : public ResultCallback {
>>> > >
>>> > >   virtual void scan_ok(TableScannerAsync *scanner, ScanCellsPtr &scancells) {
>>> > >     bool is_eos = scancells->get_eos();
>>> > >     String table_name = scanner->get_table_name();
>>> > >
>>> > >     ScopedLock lock(m_mutex); // deadlock 2 <=== scanner thread waits for IndexScannerCallback::m_mutex
>>> > >
>>> > > --
>>> > > You received this message because you are subscribed to the Google Groups
>>> > > "Hypertable Development" group.
>>> > > To post to this group, send email to hyperta...@googlegroups.com.
>>> > > To unsubscribe from this group, send email to
>>> > > hypertable-de...@googlegroups.com.
>>> > > For more options, visit this group at
>>> > > http://groups.google.com/group/hypertable-dev?hl=en.
>
> To view this discussion on the web visit
> https://groups.google.com/d/msg/hypertable-dev/-/sERE6hok0i0J.
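
For readers who want the lock-order inversion from the traces in isolation, here is a minimal sketch. It is not the fix from the commits mentioned in this discussion, and it is simplified: the real destructor path also waits on a condition variable in wait_for_completion(), which is not modelled here. SharedState, worker_path, destructor_path and run_both_paths are hypothetical names invented for this example; the two mutexes only stand in for TableScannerAsync::m_mutex and IndexScannerCallback::m_mutex. The sketch assumes C++11 and uses std::lock, which acquires several mutexes with a deadlock-avoidance algorithm, so the two paths can name the mutexes in opposite order without blocking each other forever.

```cpp
// Illustrative sketch only: SharedState is a hypothetical stand-in, not
// Hypertable's TableScannerAsync / IndexScannerCallback classes.
#include <mutex>
#include <thread>

struct SharedState {
  std::mutex scanner_mutex;   // plays the role of TableScannerAsync::m_mutex
  std::mutex callback_mutex;  // plays the role of IndexScannerCallback::m_mutex
  int handled = 0;            // only modified while both mutexes are held
};

// Worker path: in the reported traces this side took the scanner mutex
// first and then blocked on the callback mutex.
void worker_path(SharedState &s) {
  std::lock(s.scanner_mutex, s.callback_mutex);  // acquires both without deadlock
  std::lock_guard<std::mutex> g1(s.scanner_mutex, std::adopt_lock);
  std::lock_guard<std::mutex> g2(s.callback_mutex, std::adopt_lock);
  ++s.handled;
}

// Destructor path: in the reported traces this side took the callback mutex
// first and then waited on the scanner.
void destructor_path(SharedState &s) {
  std::lock(s.callback_mutex, s.scanner_mutex);  // opposite order is still safe
  std::lock_guard<std::mutex> g1(s.callback_mutex, std::adopt_lock);
  std::lock_guard<std::mutex> g2(s.scanner_mutex, std::adopt_lock);
  ++s.handled;
}

// Runs both paths concurrently and returns how many critical sections ran.
int run_both_paths(int iterations) {
  SharedState s;
  std::thread worker([&s, iterations] {
    for (int i = 0; i < iterations; ++i) worker_path(s);
  });
  std::thread destroyer([&s, iterations] {
    for (int i = 0; i < iterations; ++i) destructor_path(s);
  });
  worker.join();
  destroyer.join();
  return s.handled;
}
```

Calling run_both_paths(1000) should return 2000 once both threads have finished. If each path instead locked its two mutexes one after the other with separate ScopedLock-style guards, the same test could hang in the way the traces show.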
