This deadlock still exists in 0.9.6.0. It occurs when a TableScanner is deleted.

The TableScanner destructor locks IndexScannerCallback, then TableScannerAsync.
A database worker thread locks TableScannerAsync, then IndexScannerCallback.
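
The two lock orders above are the classic lock-order inversion. As a minimal sketch only (this is not Hypertable code; `callback_mutex`, `scanner_mutex` and the function names are hypothetical stand-ins for the two mutexes involved), the following shows one common way to make two such paths safe: each path asks for both mutexes at once through `std::scoped_lock` (C++17), which applies a deadlock-avoidance algorithm regardless of the order in which the mutexes are named.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins, not the real Hypertable members.
std::mutex callback_mutex;  // plays the role of IndexScannerCallback::m_mutex
std::mutex scanner_mutex;   // plays the role of TableScannerAsync::m_mutex
int shared_counter = 0;     // state protected by both mutexes

// Destructor-like path: names the callback mutex first.
void destructor_path(int iterations) {
  for (int i = 0; i < iterations; ++i) {
    // scoped_lock acquires both mutexes without risking deadlock.
    std::scoped_lock lock(callback_mutex, scanner_mutex);
    ++shared_counter;
  }
}

// Worker-like path: names the scanner mutex first (the opposite order).
void worker_path(int iterations) {
  for (int i = 0; i < iterations; ++i) {
    std::scoped_lock lock(scanner_mutex, callback_mutex);
    ++shared_counter;
  }
}

// Runs both paths concurrently and returns the final counter value.
int run_both(int iterations) {
  shared_counter = 0;
  std::thread a(destructor_path, iterations);
  std::thread b(worker_path, iterations);
  a.join();
  b.join();
  return shared_counter;
}
```

Calling `run_both(10000)` finishes and returns 20000. If each path instead took its two mutexes one after the other with separate `std::lock_guard` objects, the same program could hang in the way the traces below show.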

Thread 14 (Thread 0x7fffee266700 (LWP 10936)):      //Database Working Thread
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007ffff79c8179 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007ffff79c7f9b in __pthread_mutex_lock (mutex=0xc08630) at pthread_mutex_lock.c:61
#3  0x0000000000477886 in boost::mutex::lock (this=0xc08630) at /usr/include/boost/thread/pthread/mutex.hpp:50
#4  0x000000000047e790 in boost::unique_lock<boost::mutex>::lock (this=0x7fffee2638e0) at /usr/include/boost/thread/locks.hpp:349
#5  0x000000000047d51d in unique_lock (this=0x7fffee2638e0, m_=...) at /usr/include/boost/thread/locks.hpp:227
#6  0x00000000005f0a07 in Hypertable::IndexScannerCallback::scan_ok(Hypertable::TableScannerAsync*, boost::intrusive_ptr<Hypertable::ScanCells>&) ()
#7  0x00000000005ed180 in Hypertable::TableScannerAsync::maybe_callback_ok (this=0x10e7b50, scanner_id=1, next=true, do_callback=true, cells=...)
    at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:522
#8  0x00000000005ec5cc in Hypertable::TableScannerAsync::handle_result (this=0x10e7b50, scanner_id=1, event=..., is_create=true)
    at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:459
#9  0x00000000006286d2 in Hypertable::TableScannerHandler::run (this=0x7fffe8049e30) at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerHandler.cc:40
#10 0x000000000047b625 in Hypertable::ApplicationQueue::Worker::operator()() ()
#11 0x000000000048dbd2 in boost::detail::thread_data<Hypertable::ApplicationQueue::Worker>::run() ()
#12 0x00007ffff77b5200 in thread_proxy () from /usr/lib/libboost_thread.so.1.42.0
#13 0x00007ffff79c58ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#14 0x00007ffff4978b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#15 0x0000000000000000 in ?? ()


Thread 27 (Thread 0x7fffe33ee700 (LWP 10949)):         //TableScanner Destructor Thread
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x000000000047d87f in boost::condition_variable_any::wait<boost::unique_lock<boost::mutex> > (this=0x10e7c18, m=...)
    at /usr/include/boost/thread/pthread/condition_variable.hpp:84
#2  0x00000000005ed224 in Hypertable::TableScannerAsync::wait_for_completion (this=0x10e7b50)
    at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:535
#3  0x00000000005eb370 in ~TableScannerAsync (this=0x10e7b50, __in_chrg=<value optimized out>)
    at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:318
#4  0x00000000005f01d3 in Hypertable::IndexScannerCallback::~IndexScannerCallback() ()
#5  0x00000000005eb579 in ~TableScannerAsync (this=0xc04ca0, __in_chrg=<value optimized out>)
    at /home/hadoop/temp/hypertable-0.9.6.0/src/cc/Hypertable/Lib/TableScannerAsync.cc:324
#6  0x0000000000444847 in Hypertable::intrusive_ptr_release (rc=0xc04ca0) at /opt/hypertable/0.9.6.0/include/Common/ReferenceCount.h:73
#7  0x00000000005e70e3 in boost::intrusive_ptr<Hypertable::TableScannerAsync>::~intrusive_ptr() ()
#8  0x00000000005e6cf3 in Hypertable::TableScanner::~TableScanner() ()
#9  0x000000000043c943 in DBRecycled::run (this=0xa95c60) at /home/qiao/Project/Bingo/DistributedSpider/DBRecycled.cpp:48
#10 0x000000000046eed7 in thread_proc (param=0x7fffe805ae00) at /home/qiao/Project/Bingo/DistributedSpider/shared/Threading/ThreadPool.cpp:331
#11 0x00007ffff79c58ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#12 0x00007ffff4978b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#13 0x0000000000000000 in ?? ()
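
Reading the two traces together: Thread 27 sits in wait_for_completion() inside ~IndexScannerCallback, while Thread 14 cannot finish its callback because it is blocked on a mutex in IndexScannerCallback::scan_ok. One common way out of this shape is to detach the children while holding the lock and to wait for them only after releasing it. The sketch below illustrates that pattern under my own assumptions; `ChildScanner`, `Callback`, `shutdown` and `run_teardown` are hypothetical names, and this is not the actual Hypertable fix.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical child scanner: "completes" once a worker calls finish().
class ChildScanner {
public:
  void finish() {
    std::lock_guard<std::mutex> lock(mutex_);
    done_ = true;
    cond_.notify_all();
  }
  void wait_for_completion() {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this] { return done_; });
  }
private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool done_ = false;
};

// Hypothetical callback object that owns its child scanners.
class Callback {
public:
  ChildScanner* add_child() {
    std::lock_guard<std::mutex> lock(mutex_);
    children_.push_back(std::make_unique<ChildScanner>());
    return children_.back().get();
  }
  // Worker path: needs the callback mutex before the child can complete.
  void scan_ok(ChildScanner* child) {
    std::lock_guard<std::mutex> lock(mutex_);
    ++results_;
    child->finish();
  }
  // Teardown path: detach the children under the lock, then wait for them
  // WITHOUT holding the lock, so scan_ok() can still make progress.
  int shutdown() {
    std::vector<std::unique_ptr<ChildScanner>> detached;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      detached.swap(children_);
    }
    for (auto& child : detached)
      child->wait_for_completion();
    std::lock_guard<std::mutex> lock(mutex_);
    return results_;
  }
private:
  std::mutex mutex_;
  std::vector<std::unique_ptr<ChildScanner>> children_;
  int results_ = 0;
};

// One worker delivers one result while the owner tears down.
int run_teardown() {
  Callback cb;
  ChildScanner* child = cb.add_child();
  std::thread worker([&cb, child] { cb.scan_ok(child); });
  int results = cb.shutdown();
  worker.join();
  return results;
}
```

If shutdown() kept `mutex_` locked across wait_for_completion(), the worker would block in scan_ok() and the wait would never return, which is the situation in the two traces above.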


> Sorry for the delay - I was finally able to reproduce it, and I also fixed
> it.
>
> The commit is a bit larger than my first try.
>
> https://github.com/cruppstahl/hypertable/commits/v0.9.5
>
> commit b45ba15b701373c3a1f689f8997f31bde8ff5165
> Author: Christoph Rupp <[email protected]>
> Date:   Wed Apr 25 18:54:11 2012 +0200
>
>     issue 827: fixed deadlock when scanning secondary indices
>
> Thanks again for your great help!
>
> Best regards
> Christoph
>
> 2012/4/26 gcc.lua <[email protected]>
>
>> Hi,
>>
>> Thanks for the quick reply, but the commit just removes the m_mutex lock
>> inside virtual ~IndexScannerCallback().
>> I tried it, and a new problem occurs; see the end of this report.
>> Here is some additional information on how to reproduce the issue as it
>> was before your commit:
>>
>> void run()
>> {
>>   TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
>>
>>   while( aScanner->next( gotCell ) )
>>   {
>>     ....
>>     if(condition)
>>       break; // more results are pending, but we break while the internal scanner thread is still running
>>     ....
>>   }
>>   return; // triggers the TableScanner destructor; see my first post for what happens next
>> }
>>
>>
>> //////////////////////////////////////////////////////////////////////////////////////
>>
>>
>> pure virtual method called
>> terminate called without an active exception
>>
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 0x7fffe6ff5700 (LWP 23887)]
>> 0x00007ffff48db1b5 in raise () from /lib/libc.so.6
>>
>>
>> (gdb) where
>> #0  0x00007ffff48db1b5 in raise () from /lib/libc.so.6
>> #1  0x00007ffff48ddfc0 in abort () from /lib/libc.so.6
>> #2  0x00007ffff516fdc5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
>> #3  0x00007ffff516e166 in ?? () from /usr/lib/libstdc++.so.6
>> #4  0x00007ffff516e193 in std::terminate() () from /usr/lib/libstdc++.so.6
>> #5  0x00007ffff516ea6f in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6
>> #6  0x00000000005c43c6 in Hypertable::TableScannerAsync::maybe_callback_ok (this=0x7fffb432ecd0, scanner_id=19373, next=true, do_callback=true, cells=...)
>>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:520
>> #7  0x00000000005c393f in Hypertable::TableScannerAsync::handle_result (this=0x7fffb432ecd0, scanner_id=19373, event=..., is_create=true)
>>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:464
>> #8  0x00000000005fdc5e in Hypertable::TableScannerHandler::run (this=0x7fff99915850) at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerHandler.cc:40
>> #9  0x000000000045f2c5 in Hypertable::ApplicationQueue::Worker::operator() (this=0xaaa120) at /root/qiao/Project/hypertable-0.9.5.6/src/cc/AsyncComm/ApplicationQueue.h:173
>> #10 0x0000000000470f04 in boost::detail::thread_data<Hypertable::ApplicationQueue::Worker>::run (this=0xaa9ff0) at /usr/include/boost/thread/detail/thread.hpp:56
>> #11 0x00007ffff77b5200 in thread_proxy () from /usr/lib/libboost_thread.so.1.42.0
>> #12 0x00007ffff79c58ca in start_thread () from /lib/libpthread.so.0
>> #13 0x00007ffff497892d in clone () from /lib/libc.so.6
>> #14 0x0000000000000000 in ?? ()
>>
>> On April 26, 12:56 AM, Christoph Rupp <[email protected]> wrote:
>> > Hi,
>> >
>> > thanks for the great bug report.
>> >
>> > I am not able to reproduce this issue, but I think I came up with a fix. If
>> > you want to check out the sources then you can get them here:
>> > https://github.com/cruppstahl/hypertable branch "v0.9.5"
>> >
>> > This is the commit:
>> > commit 2572b5dcb524e1c36dc23307c37784fd34c1bdde
>> > Author: Christoph Rupp <[email protected]>
>> > Date:   Wed Apr 25 18:54:11 2012 +0200
>> >
>> >     issue 827: fixed deadlock when scanning secondary indices
>> >
>> > And here's the diff:
>> >
>> > diff --git a/src/cc/Hypertable/Lib/IndexScannerCallback.h
>> > b/src/cc/Hypertable/Li
>> > index 70ffda7..1b37127 100644
>> > --- a/src/cc/Hypertable/Lib/IndexScannerCallback.h
>> > +++ b/src/cc/Hypertable/Lib/IndexScannerCallback.h
>> > @@ -118,13 +118,12 @@ static String last;
>> >      }
>> >
>> >      virtual ~IndexScannerCallback() {
>> > -      ScopedLock lock(m_mutex);
>> > -      if (m_mutator)
>> > -        delete m_mutator;
>> >        foreach (TableScannerAsync *s, m_scanners)
>> >          delete s;
>> >        m_scanners.clear();
>> >        sspecs_clear();
>> > +      if (m_mutator)
>> > +        delete m_mutator;
>> >
>> > Can you please give it a try and see if this helps?
>> >
>> > Thanks
>> > Christoph
>> >
>> > 2012/4/24 gcc.lua <[email protected]>
>> >
>> > > The user thread logic is as follows:
>> > > TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
>> > >  while( aScanner->next( gotCell ) )
>> > >  {
>> > >         .....
>> > >  }
>> >
>> > > Deadlock between the user thread and the scanner thread:
>> >
>> > > 1. User thread (TableScanner destructor)
>> >
>> > >    TableScannerAsync::~TableScannerAsync() {
>> > >  try {
>> > >    cancel();
>> > >    wait_for_completion();
>> > >  }
>> > >  catch (Exception &e) {
>> > >    HT_ERROR_OUT << e << HT_END;
>> > >  }
>> > >  if (m_use_index) {
>> > >    delete m_cb; // <== deadlock entry point
>> > >    m_cb = 0;
>> > >  }
>> > > }
>> > > /////////////////////////////////////////
>> > >   virtual ~IndexScannerCallback() {
>> > >  ScopedLock lock(m_mutex); // <== user thread holds IndexScannerCallback::m_mutex
>> > >      if (m_mutator)
>> > >        delete m_mutator;
>> >
>> > >      foreach (TableScannerAsync *s, m_scanners)
>> > >        delete s; // deadlock 1 <== user thread waits for TableScannerAsync::m_mutex
>> >
>> > > 2. Scanner thread
>> >
>> > >  void TableScannerAsync::handle_result(int scanner_id, EventPtr &event, bool is_create) {
>> >
>> > >  bool cancelled = is_cancelled();
>> > >  ScopedLock lock(m_mutex); // <== scanner thread holds TableScannerAsync::m_mutex
>> > >  ScanCellsPtr cells;
>> >
>> > >    . . . . . .
>> > >  maybe_callback_ok(); // <== calls m_cb->scan_ok(this, cells);
>> >
>> > > }
>> > > //////////////////////////////
>> > >  class IndexScannerCallback : public ResultCallback {
>> >
>> > >    virtual void scan_ok(TableScannerAsync *scanner, ScanCellsPtr &scancells) {
>> > >      bool is_eos = scancells->get_eos();
>> > >      String table_name = scanner->get_table_name();
>> >
>> > >      ScopedLock lock(m_mutex); // deadlock 2 <== scanner thread waits for IndexScannerCallback::m_mutex
>> >
>> > > --
>> > > You received this message because you are subscribed to the Google Groups
>> > > "Hypertable Development" group.
>> > > To post to this group, send email to
>> > > [email protected].
>> > > To unsubscribe from this group, send email to
>> > > [email protected].
>> > > For more options, visit this group at
>> > > http://groups.google.com/group/hypertable-dev?hl=en.
>>
>
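
A note on the "pure virtual method called" abort in the quoted trace, where __cxa_pure_virtual is reached from maybe_callback_ok: libstdc++ prints that message when a virtual call lands on a pure virtual slot, which typically happens when the call is made on an object whose derived part is not alive (still under construction, or already being destroyed). My assumption, not confirmed by the trace alone, is that the worker thread invoked the callback while it was being torn down. The sketch below shows the underlying dispatch rule with hypothetical types; it uses a non-pure base method so the behavior stays defined and observable instead of aborting.

```cpp
#include <cassert>
#include <string>

// Records which override ran most recently.
std::string last_dispatch;

// Hypothetical base class, standing in for a callback interface.
struct ResultHandler {
  // The destructor makes a virtual call while the object is being destroyed.
  virtual ~ResultHandler() { notify(); }
  virtual void on_result() { last_dispatch = "base"; }
  void notify() { on_result(); }
};

// Hypothetical derived class, standing in for a concrete callback.
struct IndexHandler : ResultHandler {
  void on_result() override { last_dispatch = "derived"; }
};

// While the object is fully alive, the derived override runs.
std::string dispatch_while_alive() {
  IndexHandler h;
  h.notify();
  return last_dispatch;  // copied before h is destroyed
}

// Once ~IndexHandler has finished, virtual calls resolve to the base class.
std::string dispatch_during_destruction() {
  { IndexHandler h; }  // ~IndexHandler runs first, then ~ResultHandler calls notify()
  return last_dispatch;
}
```

Here dispatch_while_alive() yields "derived" and dispatch_during_destruction() yields "base". If on_result() were pure virtual in the base class, the second case would have no function to call, and the runtime would abort in the way shown in the quoted trace.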

-- 
You received this message because you are subscribed to the Google Groups 
"Hypertable Development" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/hypertable-dev/-/sERE6hok0i0J.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/hypertable-dev?hl=en.
