Hello Tidy Bot, Kudu Jenkins, helifu, Adar Dembo,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14061

to look at the new patch set (#4).

Change subject: [tablet] Fixed the bug of DeltaTracker::CountDeletedRows
......................................................................

[tablet] Fixed the bug of DeltaTracker::CountDeletedRows

When Tablet::CountLiveRows is called in a multi-threaded workload, there's
a chance we'll see the following failure.

User stack:
F0814 12:05:51.975797 96375 diskrowset.cc:759] Check failed: *count >= 0 (-3 vs. 0)
*** Check failure stack trace: ***
*** Aborted at 1565755551 (unix time) try "date -d @1565755551" if you are using GNU date ***
PC: @     0x7f9bd20425f7 __GI_raise
*** SIGABRT (@0x70900017872) received by PID 96370 (TID 0x7f9bce2d7700) from PID 96370; stack trace: ***
    @     0x7f9bdaff6100 (unknown)
    @     0x7f9bd20425f7 __GI_raise
    @     0x7f9bd2043ce8 __GI_abort
    @     0x7f9bd4540c99 google::logging_fail()
    @     0x7f9bd454246d google::LogMessage::Fail()
    @     0x7f9bd45443c3 google::LogMessage::SendToLog()
    @     0x7f9bd4541fc9 google::LogMessage::Flush()
    @     0x7f9bd4544d4f google::LogMessageFatal::~LogMessageFatal()
    @     0x7f9bddc9aabe kudu::tablet::DiskRowSet::CountLiveRows()
    @     0x7f9bddbdeb79 kudu::tablet::Tablet::CountLiveRows()
    @           0x49891f kudu::tablet::MultiThreadedTabletTest<>::CollectStatisticsThread()
    @           0x4ae34b boost::_mfi::mf1<>::operator()()
    @           0x4add25 boost::_bi::list2<>::operator()<>()
    @           0x4acfe9 boost::_bi::bind_t<>::operator()()
    @           0x4ac8a6 boost::detail::function::void_function_obj_invoker0<>::invoke()
    @     0x7f9bd7116492 boost::function0<>::operator()()
    @     0x7f9bd62e5324 kudu::Thread::SuperviseThread()
    @     0x7f9bdafeedc5 start_thread
    @     0x7f9bd2103ced __clone

This happens because DeltaTracker does not hold a lock across both updating
the number of live rows in rowset_metadata_ and resetting
deleted_row_count_. As a result, deleted_row_count_ can be subtracted twice
when calculating the number of live rows of a DRS. Consider the following
sequence:
| T1                                | T2
|----------                         |----------
|+ In DT::Flush                     |
|  Take compact_flush_lock_ (excl)  |
|  Take component_lock_ (excl)      |
|  deleted_row_count_ = ...         |
|  Release component_lock_          |
|  + In DT::FlushDMS                |
|    Call RSMD::IncrementLiveRows   |
|    --> RSMD::live_row_count - deleted_row_count_
|                                   |+ In DRS::CountLiveRows
|                                   |  Take component_lock_ (shared)
|                                   |  Call RSMD::live_row_count - DT::CountDeletedRows
|                                   |  --> RSMD::live_row_count - deleted_row_count_
|                                   |  --> we double counted deleted_row_count_ !!!
|  Take component_lock_ (excl)      |
|  deleted_row_count_ = 0           |
|  Release component_lock_          |
|  Release compact_flush_lock_      |
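
The sequence above can be illustrated with a minimal sketch of one way to
close the window: fold the pending deletes into the persisted count and reset
the in-memory counter inside a single critical section, so a reader never
observes both at once. This is not the code in this patch. RowSetSketch and
its members are hypothetical names, and a plain std::mutex stands in for the
shared/exclusive component_lock_.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a rowset and its delta state; not Kudu's classes.
class RowSetSketch {
 public:
  explicit RowSetSketch(int64_t live_rows) : persisted_live_rows_(live_rows) {}

  // Records a delete that has not been flushed yet.
  void DeleteRow() {
    std::lock_guard<std::mutex> l(lock_);
    ++unflushed_deletes_;
  }

  // Applies the unflushed deletes to the persisted count and resets the
  // in-memory counter under one lock acquisition. Splitting these two steps
  // across separate critical sections is what opens the double-count window.
  void Flush() {
    std::lock_guard<std::mutex> l(lock_);
    persisted_live_rows_ -= unflushed_deletes_;
    unflushed_deletes_ = 0;
  }

  // Mirrors the shape of the failing check: persisted count minus the
  // deletes that are still only in memory must never go negative.
  int64_t CountLiveRows() const {
    std::lock_guard<std::mutex> l(lock_);
    int64_t count = persisted_live_rows_ - unflushed_deletes_;
    assert(count >= 0);
    return count;
  }

 private:
  mutable std::mutex lock_;
  int64_t persisted_live_rows_;
  int64_t unflushed_deletes_ = 0;
};
```

With this shape, CountLiveRows returns the same value immediately before and
immediately after Flush, since a reader sees the deletes either in the
in-memory counter or in the persisted count, never in both.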

Change-Id: I9bb4456123087778c9dc799777c5990938a84fdf
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/mt-tablet-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
10 files changed, 145 insertions(+), 76 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/14061/4
--
To view, visit http://gerrit.cloudera.org:8080/14061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9bb4456123087778c9dc799777c5990938a84fdf
Gerrit-Change-Number: 14061
Gerrit-PatchSet: 4
Gerrit-Owner: Yao Xu <ocla...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yao Xu <ocla...@gmail.com>
Gerrit-Reviewer: helifu <hzhel...@corp.netease.com>
