Dan Burkert created KUDU-1657:
---------------------------------

             Summary: read-only FsManager::Open on active tablet can crash
                 Key: KUDU-1657
                 URL: https://issues.apache.org/jira/browse/KUDU-1657
             Project: Kudu
          Issue Type: Bug
            Reporter: Dan Burkert
            Assignee: Dan Burkert


alter_table-randomized-test.cc is currently flaky due to a crash in the 
LogVerifier that happens because FsManager is not robust to running in 
read-only mode against an actively writing tablet. The root of the issue is a 
stale data container length that is used after reading new metadata. The 
failure results in log messages such as:

{code}
F0927 19:37:39.883033 22107 log_block_manager.cc:535] Found malformed block 
record in data file: 
/tmp/kudutest-4348/insert-verify-itest.InsertVerifyITest.TestInsertAndVerify.1475030222707874-17327/minicluster-data/ts-0/data/e4ade118175d48cabd2085014a6d762e.data
Record: block_id {
  id: 1525
}
op_type: CREATE
timestamp_us: 1475030259882913
offset: 5840896
length: 279030

Data file size: 6119892
*** Check failure stack trace: ***
    @     0x7f86ce57bf5d  google::LogMessage::Fail() at ??:0
    @     0x7f86ce57de5d  google::LogMessage::SendToLog() at ??:0
    @     0x7f86ce57ba99  google::LogMessage::Flush() at ??:0
    @     0x7f86ce57e8ff  google::LogMessageFatal::~LogMessageFatal() at ??:0
    @     0x7f86cfe4e32b  
kudu::fs::internal::LogBlockContainer::CheckBlockRecord() at ??:0
    @     0x7f86cfe4dc8d  
kudu::fs::internal::LogBlockContainer::ReadContainerRecords() at ??:0
    @     0x7f86cfe5731a  kudu::fs::LogBlockManager::OpenRootPath() at ??:0
    @     0x7f86cfe69023  kudu::internal::RunnableAdapter<>::Run() at ??:0
    @     0x7f86cfe66959  kudu::internal::InvokeHelper<>::MakeItSo() at ??:0
    @     0x7f86cfe63a77  kudu::internal::Invoker<>::Run() at ??:0
    @     0x7f86d598b542  kudu::Callback<>::Run() at ??:0
    @     0x7f86d598fe61  boost::_mfi::cmf0<>::operator()() at ??:0
    @     0x7f86d598f93e  boost::_bi::list1<>::operator()<>() at ??:0
    @     0x7f86d598f05d  boost::_bi::bind_t<>::operator()() at ??:0
    @     0x7f86d598e860  
boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @     0x7f86d1296732  boost::function0<>::operator()() at ??:0
    @     0x7f86cf402124  kudu::FunctionRunnable::Run() at ??:0
    @     0x7f86cf401556  kudu::ThreadPool::DispatchThread() at ??:0
    @     0x7f86cf405824  boost::_mfi::mf1<>::operator()() at ??:0
    @     0x7f86cf40542b  boost::_bi::list2<>::operator()<>() at ??:0
    @     0x7f86cf404ecd  boost::_bi::bind_t<>::operator()() at ??:0
    @     0x7f86cf4047fe  
boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @     0x7f86d1296732  boost::function0<>::operator()() at ??:0
    @     0x7f86cf3f8717  kudu::Thread::SuperviseThread() at ??:0
    @       0x3ae0e079d1  (unknown) at ??:0
    @       0x3ae0ae88fd  (unknown) at ??:0
    @              (nil)  (unknown)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to