Hello Dan Burkert, Jean-Daniel Cryans,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/5780

to review the following change.

Change subject: WIP: KUDU-1848. in-memory dictionary for binary columns
......................................................................

WIP: KUDU-1848. in-memory dictionary for binary columns

Quick prototype to test out this idea. In the runs below it saves about 9-10%
of MRS (MemRowSet) memory usage for kudu-ts data (which has one compressible
string column). Worth trying with some other datasets/workloads, and maybe
adding some smarts to auto-disable the dictionary on columns determined to be
non-compressible.
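
For readers unfamiliar with the idea, here is a minimal sketch of a bounded
in-memory dictionary with a hit-rate check for auto-disabling. This is not
the code in this patch: the class name BoundedDict, its methods, and the
ShouldDisable heuristic are all hypothetical, and the real change may store
and look up values quite differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical bounded dictionary for one binary column.
// Illustrative only; not Kudu's actual MemRowSet implementation.
class BoundedDict {
 public:
  explicit BoundedDict(size_t capacity) : capacity_(capacity) {
    values_.reserve(capacity);
  }

  // Returns the code for 'value', adding it if there is still room.
  // Returns std::nullopt when the value is absent and the dictionary is
  // full; the caller would then store the raw bytes inline.
  std::optional<uint32_t> Encode(std::string_view value) {
    ++lookups_;
    auto it = codes_.find(std::string(value));
    if (it != codes_.end()) {
      ++hits_;
      return it->second;
    }
    if (values_.size() >= capacity_) {
      return std::nullopt;
    }
    const uint32_t code = static_cast<uint32_t>(values_.size());
    values_.emplace_back(value);
    codes_.emplace(values_.back(), code);
    return code;
  }

  // Maps a code back to the original bytes. Throws std::out_of_range
  // if the code was never handed out.
  const std::string& Decode(uint32_t code) const { return values_.at(code); }

  // Fraction of Encode() calls that found an existing entry.
  double HitRate() const {
    return lookups_ == 0 ? 0.0 : static_cast<double>(hits_) / lookups_;
  }

  // Assumed heuristic: once enough values have been seen, a low hit rate
  // suggests the column is not compressible and the dictionary is wasted.
  bool ShouldDisable(uint64_t min_lookups, double min_hit_rate) const {
    return lookups_ >= min_lookups && HitRate() < min_hit_rate;
  }

 private:
  const size_t capacity_;
  std::vector<std::string> values_;                  // code -> value
  std::unordered_map<std::string, uint32_t> codes_;  // value -> code
  uint64_t lookups_ = 0;
  uint64_t hits_ = 0;
};
```

In this sketch, a repeated value costs one small code per row instead of a
full copy of the string, which is where any memory saving would come from.
The capacity argument plays the role of the "64 dict" and "512 dict" settings
in the measurements below.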

without patch
------------------
memory consumption: 1.59GB
backoffs took a total of 0.000000sec of runtime
loaded 10000000 items in 59.522394sec with 6 workers (mean rate 168003.996280/sec)

with patch (64 dict) after loading 10M points:
------------------
memory consumption: 1.44GB
backoffs took a total of 0.000000sec of runtime
loaded 10000000 items in 57.729048sec with 6 workers (mean rate 173223.020620/sec)

with patch (512 dict) after loading 10M points:
------------------
memory consumption: 1.43GB
backoffs took a total of 0.000000sec of runtime
loaded 10000000 items in 62.279514sec with 6 workers (mean rate 160566.443468/sec)

Change-Id: Ic03ef0473383b1dcf22ebefee029ff29d2dae813
---
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
2 files changed, 73 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/5780/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5780
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic03ef0473383b1dcf22ebefee029ff29d2dae813
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
