Hello David Ribeiro Alves, Jean-Daniel Cryans, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8941

to look at the new patch set (#7).

Change subject: KUDU-2251: rowset size can overflow int in RowSetInfo
......................................................................

KUDU-2251: rowset size can overflow int in RowSetInfo

This overflow causes a CHECK failure from rowset compaction planning in
tablets with rowsets with more than 2GiB of REDO deltafiles:

*** SIGABRT (@0x3ce00007614) received by PID 30228 (TID 0x7fbb52a5e700) from 
PID 30228; stack trace: ***
    @     0x7fbb977cb100 (unknown)
    @     0x7fbb95a985f7 __GI_raise
    @     0x7fbb95a99ce8 __GI_abort
    @          0x1af56d9 (unknown)
    @           0x8baf3d google::LogMessage::Fail()
    @           0x8bce93 google::LogMessage::SendToLog()
    @           0x8baa99 google::LogMessage::Flush()
    @           0x8bd81f google::LogMessageFatal::~LogMessageFatal()
    @           0x9f71d6 kudu::tablet::RowSetInfo::CollectOrdered()
    @           0x9d42d9 
kudu::tablet::BudgetedCompactionPolicy::SetupKnapsackInput()
    @           0x9d5a3a kudu::tablet::BudgetedCompactionPolicy::PickRowSets()
    @           0x98e28f kudu::tablet::Tablet::UpdateCompactionStats()
    @           0x9aff08 kudu::tablet::CompactRowSetsOp::UpdateStats()
    @          0x1ae02b5 kudu::MaintenanceManager::FindBestOp()
    @          0x1ae2bce kudu::MaintenanceManager::RunSchedulerThread()
    @          0x1b27eda kudu::Thread::SuperviseThread()
    @     0x7fbb977c3dc5 start_thread
    @     0x7fbb95b5921d __clone
    @                0x0 (unknown)

This commit contains two repro tests: one targeted unit-test which
reproduces the overflow quickly and deterministically, and an
integration test that gives us more coverage of the update-heavy write
workloads required to hit this bug. The integration test has mixed
success on triggering the overflow, depending on how fast the machine
it's running on is (particularly disk throughput), and thus how fast the
MM can do compactions. On my laptop it triggered the overflow nearly
100% of the time, but on dist-test it triggered nearly 0% of the time.

Change-Id: I598344e22fc1ecbb482bfe85ea3867ddf63963b4
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/heavy-update-compaction-itest.cc
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset_info.h
5 files changed, 264 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8941/7
--
To view, visit http://gerrit.cloudera.org:8080/8941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I598344e22fc1ecbb482bfe85ea3867ddf63963b4
Gerrit-Change-Number: 8941
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to