YifanZhang created KUDU-3108:
--------------------------------

             Summary: Tablet server crashes when handling scan request
                 Key: KUDU-3108
                 URL: https://issues.apache.org/jira/browse/KUDU-3108
             Project: Kudu
          Issue Type: Bug
    Affects Versions: 1.10.1
            Reporter: YifanZhang


When we used the {{KuduBackup}} Spark job to back up tables in a cluster with
20 tservers, 3 tservers crashed. The core dump stacks are the same:
{code:java}
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /home/work/app/kudu/zjyprc-hadoop/tablet_server/package/libstdc++.so.6
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/b3/d9128bcf6786292a339a477953167d0ddab5ba.debug
Core was generated by `/home/work/app/kudu/zjyprc-hadoop/tablet_server/package/kudu_tablet_server -tse'.
Program terminated with signal 11, Segmentation fault.
#0  kudu::Schema::Compare<kudu::RowBlockRow, kudu::RowBlockRow> (this=0x25b883680, lhs=..., rhs=...)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/rowblock.h:267
267     /home/zhangyifan8/work/kudu-xm/src/kudu/common/rowblock.h: No such file or directory.
Missing separate debuginfos, use: debuginfo-install
    bzip2-libs-1.0.6-13.el7.x86_64 cyrus-sasl-gssapi-2.1.26-20.el7_2.x86_64
    cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 cyrus-sasl-md5-2.1.26-20.el7_2.x86_64
    cyrus-sasl-plain-2.1.26-20.el7_2.x86_64 elfutils-libelf-0.166-2.el7.x86_64
    elfutils-libs-0.166-2.el7.x86_64 glibc-2.17-157.el7_3.1.x86_64
    keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.14.1-27.el7_3.x86_64
    libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64
    libcom_err-1.42.9-9.el7.x86_64 libdb-5.3.21-19.el7.x86_64
    libgcc-4.8.5-28.el7_5.1.x86_64 libselinux-2.5-6.el7.x86_64
    ncurses-libs-5.9-13.20130511.el7.x86_64
    nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64
    openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64
    systemd-libs-219-30.el7_3.8.x86_64 xz-libs-5.2.2-1.el7.x86_64
    zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  kudu::Schema::Compare<kudu::RowBlockRow, kudu::RowBlockRow> (this=0x25b883680, lhs=..., rhs=...)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/rowblock.h:267
#1  0x0000000001da51fb in kudu::MergeIterator::RefillHotHeap (this=this@entry=0x78f6ec500)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/generic_iterators.cc:720
#2  0x0000000001da622b in kudu::MergeIterator::AdvanceAndReheap (this=this@entry=0x78f6ec500, state=0xd1661a000, num_rows_to_advance=num_rows_to_advance@entry=1)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/generic_iterators.cc:690
#3  0x0000000001da7927 in kudu::MergeIterator::MaterializeOneRow (this=this@entry=0x78f6ec500, dst=dst@entry=0x7f0d5cc9ffc0, dst_row_idx=dst_row_idx@entry=0x7f0d5cc9fbb0)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/generic_iterators.cc:894
#4  0x0000000001da7de3 in kudu::MergeIterator::NextBlock (this=0x78f6ec500, dst=0x7f0d5cc9ffc0)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/common/generic_iterators.cc:796
#5  0x0000000000a9ff19 in kudu::tablet::Tablet::Iterator::NextBlock (this=<optimized out>, dst=<optimized out>)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/tablet/tablet.cc:2499
#6  0x000000000095475c in kudu::tserver::TabletServiceImpl::HandleContinueScanRequest (this=this@entry=0x53b5a90, req=req@entry=0x7f0d5cca0720, rpc_context=rpc_context@entry=0x5e512a460, result_collector=result_collector@entry=0x7f0d5cca0a00, has_more_results=has_more_results@entry=0x7f0d5cca0886, error_code=error_code@entry=0x7f0d5cca0888)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/tserver/tablet_service.cc:2565
#7  0x0000000000966564 in kudu::tserver::TabletServiceImpl::HandleNewScanRequest (this=this@entry=0x53b5a90, replica=0xf5c0189c0, req=req@entry=0x2a15c240, rpc_context=rpc_context@entry=0x5e512a460, result_collector=result_collector@entry=0x7f0d5cca0a00, scanner_id=scanner_id@entry=0x7f0d5cca0940, snap_timestamp=snap_timestamp@entry=0x7f0d5cca0950, has_more_results=has_more_results@entry=0x7f0d5cca0886, error_code=error_code@entry=0x7f0d5cca0888)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/tserver/tablet_service.cc:2476
#8  0x0000000000967f4b in kudu::tserver::TabletServiceImpl::Scan (this=0x53b5a90, req=0x2a15c240, resp=0x56f9be6c0, context=0x5e512a460)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/tserver/tablet_service.cc:1674
#9  0x0000000001d2e449 in operator() (__args#2=0x5e512a460, __args#1=0x56f9be6c0, __args#0=<optimized out>, this=0x497ecdd8)
    at /usr/include/c++/4.8.2/functional:2471
#10 kudu::rpc::GeneratedServiceIf::Handle (this=0x53b5a90, call=<optimized out>)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/rpc/service_if.cc:139
#11 0x0000000001d2eb49 in kudu::rpc::ServicePool::RunThread (this=0x2ab69560)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/rpc/service_pool.cc:225
#12 0x0000000001e9e924 in operator() (this=0x90fb52e8)
    at /home/zhangyifan8/work/kudu-xm/thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:771
#13 kudu::Thread::SuperviseThread (arg=0x90fb52c0)
    at /home/zhangyifan8/work/kudu-xm/src/kudu/util/thread.cc:657
#14 0x00007f103b20cdc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f103956673d in clone () from /lib64/libc.so.6
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)