[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. wire_protocol: some simplification and optimization for rowwise encoding * Re-implement GetSelectedRows based on the new ForEachSetBit(...) utility, which operates word-by-word instead of byte-by-byte. * use a boolean return value which indicates the common case of "all rows are selected". Currently the rowwise serialization code path doesn't use this special value (and just reproduces the old std::iota() call to generate the sequence of all indexes), but the columnar code path will special case this as a memcpy. * Avoid one call to CountSelected() in SerializeRowBlock() by calculating num_rows from the size of the row index vector. * Change SerializeRowBlock() to return an int indicating the number of rows selected, instead of accumulating it into the protobuf. This value can then be re-used to eliminate one extra call to CountSelected in ScanResultCopier::HandleRowBlock(). After this change, the protobuf is no longer used by SerializeRowBlock, so I removed the parameter, which required a bit of fixup in the tests. Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Reviewed-on: http://gerrit.cloudera.org:8080/15550 Reviewed-by: Grant Henke Tested-by: Todd Lipcon --- M src/kudu/common/rowblock-test.cc M src/kudu/common/rowblock.cc M src/kudu/common/rowblock.h M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/common/wire_protocol.h M src/kudu/tserver/tablet_service.cc 7 files changed, 89 insertions(+), 67 deletions(-) Approvals: Grant Henke: Looks good to me, approved Todd Lipcon: Verified -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Todd Lipcon has removed a vote on this change. Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Todd Lipcon has removed a vote on this change. Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Removed -Verified by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Patch Set 3: Verified+1 k, skipping IWYU for now -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:42 + Gerrit-HasComments: No
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Patch Set 3: Code-Review+2 There is an IWYU failure but you could probably skip it given Adar is fixing it right now. -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 25 Mar 2020 17:31:02 + Gerrit-HasComments: No
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Andrew Wong, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15550 to look at the new patch set (#3). Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. wire_protocol: some simplification and optimization for rowwise encoding * Re-implement GetSelectedRows based on the new ForEachSetBit(...) utility, which operates word-by-word instead of byte-by-byte. * use a boolean return value which indicates the common case of "all rows are selected". Currently the rowwise serialization code path doesn't use this special value (and just reproduces the old std::iota() call to generate the sequence of all indexes), but the columnar code path will special case this as a memcpy. * Avoid one call to CountSelected() in SerializeRowBlock() by calculating num_rows from the size of the row index vector. * Change SerializeRowBlock() to return an int indicating the number of rows selected, instead of accumulating it into the protobuf. This value can then be re-used to eliminate one extra call to CountSelected in ScanResultCopier::HandleRowBlock(). After this change, the protobuf is no longer used by SerializeRowBlock, so I removed the parameter, which required a bit of fixup in the tests. Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd --- M src/kudu/common/rowblock-test.cc M src/kudu/common/rowblock.cc M src/kudu/common/rowblock.h M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/common/wire_protocol.h M src/kudu/tserver/tablet_service.cc 7 files changed, 89 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/3 -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h File src/kudu/common/rowblock.h: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108 PS2, Line 108: // In the case that all rows are selected, sets *selected to 'boost::none' > I felt similarly, but I also don't mind it if it's documented. Alternativel yea, I couldn't find a great way to do this. I can change to a bool return if you guys prefer http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc File src/kudu/common/rowblock.cc: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc@91 PS2, Line 91: CHECK_LE(n_rows_, std::numeric_limits::max()); > What prevent's n_rows_ > 65535? just in practice we don't set rowblock sizes that large since the point of a RowBlock is to fit all the rows into CPU cache. I found some small perf improvement using uint16s here, -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 25 Mar 2020 05:44:55 + Gerrit-HasComments: Yes
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h File src/kudu/common/rowblock.h: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108 PS2, Line 108: // In the case that all rows are selected, sets *selected to 'boost::none' > It feels a little unintuitive that none == all, but as long as we document I felt similarly, but I also don't mind it if it's documented. Alternatively, we could forego optionality entirely and return a bool indicating whether all rows were selected or something, but I don't feel strongly about it. http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/wire_protocol.h File src/kudu/common/wire_protocol.h: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/wire_protocol.h@164 PS2, Line 164: protobuf and nit: this no longer applies -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 25 Mar 2020 04:12:04 + Gerrit-HasComments: Yes
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15550 ) Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. Patch Set 2: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h File src/kudu/common/rowblock.h: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108 PS2, Line 108: // In the case that all rows are selected, sets *selected to 'boost::none' It feels a little unintuitive that none == all, but as long as we document it everywhere I think it make sense. http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc File src/kudu/common/rowblock.cc: http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc@91 PS2, Line 91: CHECK_LE(n_rows_, std::numeric_limits::max()); What prevent's n_rows_ > 65535? -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 25 Mar 2020 04:08:19 + Gerrit-HasComments: Yes
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15550 to look at the new patch set (#2). Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. wire_protocol: some simplification and optimization for rowwise encoding * Re-implement GetSelectedRows based on the new ForEachSetBit(...) utility, which operates word-by-word instead of byte-by-byte. * use boost::none as a special case return value which indicates the common case of "all rows are selected". Currently the rowwise serialization code path doesn't use this special value (and just reproduces the old std::iota() call to generate the sequence of all indexes), but the columnar code path will special case this as a memcpy. * Avoid one call to CountSelected() in SerializeRowBlock() by calculating num_rows from the size of the row index vector. * Change SerializeRowBlock() to return an int indicating the number of rows selected, instead of accumulating it into the protobuf. This value can then be re-used to eliminate one extra call to CountSelected in ScanResultCopier::HandleRowBlock(). After this change, the protobuf is no longer used by SerializeRowBlock, so I removed the parameter, which required a bit of fixup in the tests. Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd --- M src/kudu/common/rowblock-test.cc M src/kudu/common/rowblock.cc M src/kudu/common/rowblock.h M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/common/wire_protocol.h M src/kudu/tserver/tablet_service.cc 7 files changed, 94 insertions(+), 64 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/2 -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding
Hello Andrew Wong, Grant Henke, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/15550 to review the following change. Change subject: wire_protocol: some simplification and optimization for rowwise encoding .. wire_protocol: some simplification and optimization for rowwise encoding * Re-implement GetSelectedRows based on the new ForEachSetBit(...) utility, which operates word-by-word instead of byte-by-byte. * use boost::none as a special case return value which indicates the common case of "all rows are selected". Currently the rowwise serialization code path doesn't use this special value (and just reproduces the old std::iota() call to generate the sequence of all indexes), but the columnar code path will special case this as a memcpy. * Avoid one call to CountSelected() in SerializeRowBlock() by calculating num_rows from the size of the row index vector. * Change SerializeRowBlock() to return an int indicating the number of rows selected, instead of accumulating it into the protobuf. This value can then be re-used to eliminate one extra call to CountSelected in ScanResultCopier::HandleRowBlock(). After this change, the protobuf is no longer used by SerializeRowBlock, so I removed the parameter, which required a bit of fixup in the tests. Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd --- M src/kudu/common/rowblock-test.cc M src/kudu/common/rowblock.cc M src/kudu/common/rowblock.h M src/kudu/common/wire_protocol-test.cc M src/kudu/common/wire_protocol.cc M src/kudu/common/wire_protocol.h M src/kudu/tserver/tablet_service.cc 7 files changed, 95 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/1 -- To view, visit http://gerrit.cloudera.org:8080/15550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd Gerrit-Change-Number: 15550 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke