Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13749 )

Change subject: Extend benchmark for ColumnarRowBlockToPB.
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc
File src/kudu/common/wire_protocol-test.cc:

http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@91
PS2, Line 91:   void ResetBenchMarkSchema(int n) {
nit: use 'num_columns' instead of 'n' so it's more descriptive


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@98
PS2, Line 98: UINT32
nit: the UINT type is more or less deprecated, we only use signed ints in the 
actual client APIs


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@99
PS2, Line 99:       }
I think it'd be more readable to just write the above 6 lines as :

column_schemas.emplace_back(Substitute("col$0", i), i % 2 ? STRING : INT32);


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@115
PS2, Line 115:           *reinterpret_cast<Slice*>(row.mutable_cell_ptr(j)) = 
col;
nit: use memcpy here instead of *reinterpret_cast


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@117
PS2, Line 117:           *reinterpret_cast<uint32_t*>(row.mutable_cell_ptr(j)) 
= i;
same


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@118
PS2, Line 118:         }
else { LOG(FATAL); }


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@123
PS2, Line 123: Row
nit: Rows


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@137
PS2, Line 137:         int next = rand.Uniform(block->nrows());
nit: this loop would run very long when rate approaches 1.

Instead of ensuring you select exactly N rows, why not just loop over the rows, 
and for each row, generate a random double between 0 and 1, and if it's less 
than rate, set the row to selected? You might get not exactly N, but should be 
close enough for the benchmark

Another option would be something like:

vector<int> indexes(block->nrows());
std::iota(indexes.begin(), indexes.end(), 0);
std::shuffle(indexes.begin(), indexes.end());
indexes.resize(select_count);


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@149
PS2, Line 149: RunBenchMark
nit: "RunBenchmark" (Benchmark is one word, so not camel case)


http://gerrit.cloudera.org:8080/#/c/13749/2/src/kudu/common/wire_protocol-test.cc@426
PS2, Line 426:   // select_rates = {1.0, 0.8, 0.5, 0.2} for benchmark.
maybe set these values if AllowSlowTests is true?



--
To view, visit http://gerrit.cloudera.org:8080/13749
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie29937c316be624151e6c51e54545c4f023e603d
Gerrit-Change-Number: 13749
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao <triplesheep0...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Fri, 28 Jun 2019 06:08:50 +0000
Gerrit-HasComments: Yes

Reply via email to