[ 
https://issues.apache.org/jira/browse/KUDU-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315825#comment-15315825
 ] 

zhangsong commented on KUDU-1472:
---------------------------------

@todd, I hit this crash again, with a slightly different backtrace:
(gdb) bt
#0  kudu::BlockIdPB::set_has_id (this=<optimized out>) at 
/export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1016
#1  kudu::BlockIdPB::set_id (value=1909031780344067001, this=0xd00) at 
/export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1030
#2  kudu::BlockId::CopyToPB (this=this@entry=0x42a70848, pb=0xd00) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/fs/block_id.cc:44
#3  0x00000000008e7e9b in kudu::tablet::RowSetMetadata::ToProtobuf 
(this=0x42a70820, pb=0x1234fe100) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/rowset_metadata.cc:129
#4  0x00000000008e208f in kudu::tablet::TabletMetadata::ToSuperBlockUnlocked 
(this=this@entry=0x42ab6480, super_block=super_block@entry=0x7fe0bcc8cdf0, 
rowsets=...)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:540
#5  0x00000000008e26ac in kudu::tablet::TabletMetadata::Flush 
(this=this@entry=0x42ab6480) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:433
#6  0x00000000008e3889 in kudu::tablet::TabletMetadata::UpdateAndFlush 
(this=0x42ab6480, to_remove=..., to_add=..., 
last_durable_mrs_id=last_durable_mrs_id@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:354
#7  0x000000000086c423 in kudu::tablet::Tablet::FlushMetadata 
(this=this@entry=0x88266dc0, to_remove=..., to_add=..., 
mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1228
#8  0x000000000086d736 in kudu::tablet::Tablet::DoCompactionOrFlush 
(this=this@entry=0x88266dc0, input=..., 
mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1410
#9  0x000000000086eb35 in kudu::tablet::Tablet::FlushInternal 
(this=this@entry=0x88266dc0, input=..., old_ms=...) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:777
#10 0x000000000086ef27 in kudu::tablet::Tablet::FlushUnlocked 
(this=this@entry=0x88266dc0) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:712
#11 0x0000000000903c0c in kudu::tablet::FlushMRSOp::Perform (this=0xac5588c0) 
at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_peer_mm_ops.cc:127
#12 0x00000000008b83fa in kudu::MaintenanceManager::LaunchOp (this=0x3896300, 
op=0xac5588c0) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/maintenance_manager.cc:360
#13 0x0000000001901c3e in boost::function0<void>::operator() (this=<optimized 
out>) at /usr/local/include/boost/function/function_template.hpp:767
#14 kudu::FunctionRunnable::Run (this=<optimized out>) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:48
#15 kudu::ThreadPool::DispatchThread (this=0x3917380, permanent=true) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:343
#16 0x00000000018fc7ba in boost::function0<void>::operator() (this=0x3880368) 
at /usr/local/include/boost/function/function_template.hpp:767
#17 kudu::Thread::SuperviseThread (arg=0x3880340) at 
/export/ldb/kudu_build/kudu-gitlab/src/kudu/util/thread.cc:586
#18 0x0000003296a079d1 in start_thread () from 
/export/servers/kudu/lib64/libpthread.so.0
#19 0x00000032966e8b6d in clone () from /export/servers/kudu/lib64/libc.so.6

> kudu-tserver crash unexpected
> -----------------------------
>
>                 Key: KUDU-1472
>                 URL: https://issues.apache.org/jira/browse/KUDU-1472
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: zhangsong
>            Priority: Critical
>
> kudu-tserver crashes in some cases; in the jd.com 200-node environment it 
> occurs frequently.
> Some crash info from the core file:
> (gdb) bt
> #0  0x0000000000a2489f in kudu::tablet::RowSetDataPB::SharedDtor 
> (this=0x58fb5b180)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:815
> #1  kudu::tablet::RowSetDataPB::~RowSetDataPB (this=0x58fb5b180, 
> __in_chrg=<optimized out>)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:809
> #2  kudu::tablet::RowSetDataPB::~RowSetDataPB (this=0x58fb5b180, 
> __in_chrg=<optimized out>)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:810
> #3  
> google::protobuf::internal::GenericTypeHandler<kudu::tablet::RowSetDataPB>::Delete
>  (value=0x58fb5b180)
>    at 
> /export/ldb/kudu-master/thirdparty/installed-deps/include/google/protobuf/repeated_field.h:363
> #4  
> google::protobuf::internal::RepeatedPtrFieldBase::Destroy<google::protobuf::RepeatedPtrField<kudu::tablet::RowSetDataPB>::TypeHandler>
>  (
>    this=<optimized out>, this=<optimized out>) at 
> /export/ldb/kudu-master/thirdparty/installed-deps/include/google/protobuf/repeated_field.h:869
> Backtrace stopped: Cannot access memory at address 0x7fc1f230fd08
> After the crash, kudu-tserver cannot be restarted successfully because some PB 
> validation checks fail, for example:
>  check failed: _s.ok() Bad status: IO error: Could not init Tablet Manager: 
> Failed to open tablet metadata for tablet: 260359a41a134c1f91631e9094847bcf: 
> Failed to load tablet metadata for tablet id 
> 260359a41a134c1f91631e9094847bcf: Could not load tablet metadata from 
> /export/servers/kudu/tserver_data_7052/tablet-meta/260359a41a134c1f91631e9094847bcf:
>  Unable to parse PB from path: 
> /export/servers/kudu/tserver_data_7052/tablet-meta/260359a41a134c1f91631e9094847bcf
> Kudu version is 0.9.0-snapshot, last commit id: 
> be10f8514c48950b64c7d59bbce848f3792ec52d 
> Workload: several write tasks keep inserting into a Kudu table, some 
> using the Java API and others using Impala.
> The Kudu table is scanned while those tasks are running.
> A crash occurs almost every day, with the same symptoms as described 
> above.



