Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8951

to look at the new patch set (#6).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure during rowset compaction planning in
tablets whose rowsets contain more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)
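
To illustrate the class of bug described above, here is a minimal sketch,
not the code from this patch. It assumes a rowset's size is a sum of
per-file byte counts and that the result ends up in a 32-bit signed int.
TotalSizeBytes and FitsInInt32 are hypothetical names chosen for this
example; they are not Kudu APIs.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not a Kudu API): sums per-file sizes using a 64-bit
// accumulator, so the total cannot wrap at the 2GiB boundary.
uint64_t TotalSizeBytes(const std::vector<uint64_t>& file_sizes) {
  uint64_t total = 0;
  for (uint64_t size : file_sizes) {
    total += size;
  }
  return total;
}

// Hypothetical helper (not a Kudu API): returns true if a byte count can be
// stored in a 32-bit signed int without overflowing.
bool FitsInInt32(uint64_t size_bytes) {
  return size_bytes <=
         static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
}
```

Under these assumptions, three 1GiB deltafiles add up to a total that
FitsInInt32 rejects, while a single 1GiB file is accepted. Keeping the size
in a 64-bit type end to end avoids the narrowing step entirely.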

This commit contains two repro tests: a targeted unit test that
reproduces the overflow quickly and deterministically, and an
integration test that gives us more coverage of the update-heavy write
workloads required to hit this bug. The integration test triggers the
overflow inconsistently: the outcome depends on the speed of the machine
running it (particularly disk throughput), and thus on how fast the
maintenance manager (MM) can do compactions. On my laptop it triggered
the overflow nearly 100% of the time, but on dist-test it triggered
nearly 0% of the time.

Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-itest.cc
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
5 files changed, 260 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/8951/6
--
To view, visit http://gerrit.cloudera.org:8080/8951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74975cdab605b51617d93d1ae98ef72ce87e35cb
Gerrit-Change-Number: 8951
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
