felixwluo opened a new pull request, #45747:
URL: https://github.com/apache/doris/pull/45747

   ### What problem does this PR solve?
   
   Core Dump
   ```
   (gdb) bt
   #0  0x000055f476bcda1d in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f1187acbb00)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:168
   #1  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f12bbeaac98)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:702
   #2  std::__shared_ptr<doris::MetricEntity, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f12bbeaac90)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1149
   #3  doris::BaseTablet::~BaseTablet (this=0x7f12bbeaac10) at /root/be/src/olap/base_tablet.cpp:53
   #4  0x000055f476beabbb in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f12bbeaac00)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:168
   #5  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f11b8d046c8)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:702
   #6  std::__shared_ptr<doris::Tablet, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f11b8d046c0)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1149
   #7  std::destroy_at<std::shared_ptr<doris::Tablet> > (__location=0x7f11b8d046c0)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:88
   #8  std::_Destroy<std::shared_ptr<doris::Tablet> > (__pointer=0x7f11b8d046c0)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:138
   #9  std::_Destroy_aux<false>::__destroy<std::shared_ptr<doris::Tablet>*> (__first=0x7f11b8d046c0, __last=0x7f11b8d04c80)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:152
   #10 std::_Destroy<std::shared_ptr<doris::Tablet>*> (__first=<optimized out>, __last=0x7f11b8d04c80)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:184
   #11 std::_Destroy<std::shared_ptr<doris::Tablet>*, std::shared_ptr<doris::Tablet> > (__first=<optimized out>, __last=0x7f11b8d04c80)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:746
   #12 std::vector<std::shared_ptr<doris::Tablet>, std::allocator<std::shared_ptr<doris::Tablet> > >::~vector (this=<optimized out>)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:680
   #13 doris::TabletManager::start_trash_sweep()::$_2::operator()() const (this=<optimized out>) at /root/be/src/olap/tablet_manager.cpp:1105
   #14 doris::TabletManager::start_trash_sweep (this=0x7f17fc2d1d00) at /root/be/src/olap/tablet_manager.cpp:1110
   #15 0x000055f4761ac0c6 in doris::StorageEngine::start_trash_sweep (this=0x7f17fbef7000, usage=0x7f150f1bf3d0, ignore_guard=<optimized out>)
       at /root/be/src/olap/storage_engine.cpp:803
   #16 0x000055f476a355e6 in doris::StorageEngine::_garbage_sweeper_thread_callback (this=0x7f17fbef7000) at /root/be/src/olap/olap_server.cpp:300
   #17 0x000055f47707da51 in std::function<void ()>::operator()() const (this=0x7f1187acbb00)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
   #18 doris::Thread::supervise_thread (arg=0x7f17fbf7da40) at /root/be/src/util/thread.cpp:498
   #19 0x00007f182d17fea5 in start_thread () from /lib64/libpthread.so.0
   #20 0x00007f182dbae9fd in clone () from /lib64/libc.so.6
   ```
   
   Cause of occurrence
   `The crash occurred while the BaseTablet destructor was releasing _metric_entity. Judging from the memory contents, the reference count of _metric_entity is already 0, but one weak reference still remains. In a multithreaded environment, a race condition may occur between deregister_entity and reset_metric_entity.`
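
To make the kind of race described above concrete, here is a minimal sketch, not the code from this PR. `Entity`, `Owner`, `release_entity` and `count_releases` are hypothetical names that only stand in for the idea of several threads trying to drop the same `std::shared_ptr` member. It assumes that the problem is unsynchronized access to one `shared_ptr` object, and shows one common way to serialize it with a mutex.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a metric entity; not a Doris type.
struct Entity {
    int value = 0;
};

// Hypothetical owner of a shared_ptr that several threads may release.
class Owner {
public:
    Owner() : _entity(std::make_shared<Entity>()) {}

    // Returns true only for the caller that actually dropped the reference.
    // The mutex serializes access to the shared_ptr object itself; without
    // it, concurrent reset() calls on the same shared_ptr are a data race.
    bool release_entity() {
        std::lock_guard<std::mutex> guard(_lock);
        if (!_entity) {
            return false;
        }
        _entity.reset();
        return true;
    }

private:
    std::mutex _lock;
    std::shared_ptr<Entity> _entity;
};

// Races several threads on release_entity() and counts successful releases.
int count_releases(int num_threads) {
    assert(num_threads > 0);
    Owner owner;
    std::atomic<int> released{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&owner, &released] {
            if (owner.release_entity()) {
                released.fetch_add(1);
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return released.load();
}
```

With the lock in place, `count_releases(8)` reports exactly one successful release no matter how the threads interleave. Whether a mutex, an atomic exchange, or a change of ownership is the right remedy in Doris depends on code that is not shown here.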
   
   GDB
   ```
   (gdb) f 0
   #0  0x000055f476bcda1d in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f1187acbb00)
       at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:168
   168     in /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h
   (gdb) p *this
   $18 = {<std::_Mutex_base<(__gnu_cxx::_Lock_policy)2>> = {<No data fields>}, _vptr$_Sp_counted_base = 0x55f46f61696a, _M_use_count = 0,
     _M_weak_count = 1}
   ```
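
The `p *this` output above shows a control block with `_M_use_count = 0` and `_M_weak_count = 1`. As a rough illustration of what a zero use count means, the sketch below uses only the public `std::weak_ptr` API; `Entity`, `Counts` and `observe_counts` are hypothetical names, and the code does not attempt to reproduce the crash. It shows that once the last `shared_ptr` is gone, a remaining `weak_ptr` reports a use count of 0 and can no longer be locked.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a metric entity; not a Doris type.
struct Entity {
    int value = 0;
};

// Observations taken before and after the last strong reference is dropped.
struct Counts {
    long use_before;
    long use_after;
    bool expired_after;
    bool lock_is_null;
};

Counts observe_counts() {
    auto strong = std::make_shared<Entity>();
    std::weak_ptr<Entity> weak = strong;
    assert(!weak.expired());  // the object is still alive here

    Counts c{};
    c.use_before = weak.use_count();  // one strong owner

    strong.reset();  // drop the only strong reference; Entity is destroyed

    c.use_after = weak.use_count();             // no strong owners remain
    c.expired_after = weak.expired();           // the weak reference is stale
    c.lock_is_null = (weak.lock() == nullptr);  // cannot be revived
    return c;
}
```

In a consistent program no `shared_ptr` should still refer to a control block whose use count is 0, so a destructor running `_M_release` on such a block suggests that the same reference was released through another path first.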


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

