[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..

wire_protocol: some simplification and optimization for rowwise encoding

* Re-implement GetSelectedRows based on the new ForEachSetBit(...)
  utility, which operates word-by-word instead of byte-by-byte.

* use a boolean return value which indicates the common case of "all
  rows are selected". Currently the rowwise serialization code path
  doesn't use this special value (and just reproduces the old
  std::iota() call to generate the sequence of all indexes), but the
  columnar code path will special case this as a memcpy.

* Avoid one call to CountSelected() in SerializeRowBlock() by calculating
  num_rows from the size of the row index vector.

* Change SerializeRowBlock() to return an int indicating the number of
  rows selected, instead of accumulating it into the protobuf. This
  value can then be re-used to eliminate one extra call to CountSelected
  in ScanResultCopier::HandleRowBlock().

  After this change, the protobuf is no longer used by
  SerializeRowBlock, so I removed the parameter, which required a bit of
  fixup in the tests.

Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Reviewed-on: http://gerrit.cloudera.org:8080/15550
Reviewed-by: Grant Henke 
Tested-by: Todd Lipcon 
---
M src/kudu/common/rowblock-test.cc
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/tablet_service.cc
7 files changed, 89 insertions(+), 67 deletions(-)

Approvals:
  Grant Henke: Looks good to me, approved
  Todd Lipcon: Verified

--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Todd Lipcon (Code Review)
Todd Lipcon has removed a vote on this change.

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Todd Lipcon (Code Review)
Todd Lipcon has removed a vote on this change.

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Removed -Verified by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Patch Set 3: Verified+1

k, skipping IWYU for now


--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:42 +
Gerrit-HasComments: No


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Patch Set 3: Code-Review+2

There is an IWYU failure but you could probably skip it given Adar is fixing it 
right now.


-- 
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 25 Mar 2020 17:31:02 +
Gerrit-HasComments: No


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-25 Thread Todd Lipcon (Code Review)
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Andrew Wong, Grant Henke,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15550

to look at the new patch set (#3).

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..

wire_protocol: some simplification and optimization for rowwise encoding

* Re-implement GetSelectedRows based on the new ForEachSetBit(...)
  utility, which operates word-by-word instead of byte-by-byte.

* use a boolean return value which indicates the common case of "all
  rows are selected". Currently the rowwise serialization code path
  doesn't use this special value (and just reproduces the old
  std::iota() call to generate the sequence of all indexes), but the
  columnar code path will special case this as a memcpy.

* Avoid one call to CountSelected() in SerializeRowBlock() by calculating
  num_rows from the size of the row index vector.

* Change SerializeRowBlock() to return an int indicating the number of
  rows selected, instead of accumulating it into the protobuf. This
  value can then be re-used to eliminate one extra call to CountSelected
  in ScanResultCopier::HandleRowBlock().

  After this change, the protobuf is no longer used by
  SerializeRowBlock, so I removed the parameter, which required a bit of
  fixup in the tests.

Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
---
M src/kudu/common/rowblock-test.cc
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/tablet_service.cc
7 files changed, 89 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/3
--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-24 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108
PS2, Line 108:   // In the case that all rows are selected, sets *selected to 
'boost::none'
> I felt similarly, but I also don't mind it if it's documented. Alternativel
yea, I couldn't find a great way to do this. I can change to a bool return if 
you guys prefer


http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc
File src/kudu/common/rowblock.cc:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc@91
PS2, Line 91:   CHECK_LE(n_rows_, std::numeric_limits::max());
> What prevent's n_rows_ > 65535?
just in practice we don't set rowblock sizes that large since the point of a 
RowBlock is to fit all the rows into CPU cache. I found some small perf 
improvement using uint16s here,



--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 25 Mar 2020 05:44:55 +
Gerrit-HasComments: Yes


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-24 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108
PS2, Line 108:   // In the case that all rows are selected, sets *selected to 
'boost::none'
> It feels a little unintuitive that none == all, but as long as we document
I felt similarly, but I also don't mind it if it's documented. Alternatively, 
we could forego optionality entirely and return a bool indicating whether all 
rows were selected or something, but I don't feel strongly about it.


http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/wire_protocol.h
File src/kudu/common/wire_protocol.h:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/wire_protocol.h@164
PS2, Line 164: protobuf and
nit: this no longer applies



--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 25 Mar 2020 04:12:04 +
Gerrit-HasComments: Yes


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-24 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15550 )

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.h@108
PS2, Line 108:   // In the case that all rows are selected, sets *selected to 
'boost::none'
It feels a little unintuitive that none == all, but as long as we document it 
everywhere I think it make sense.


http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc
File src/kudu/common/rowblock.cc:

http://gerrit.cloudera.org:8080/#/c/15550/2/src/kudu/common/rowblock.cc@91
PS2, Line 91:   CHECK_LE(n_rows_, std::numeric_limits::max());
What prevent's n_rows_ > 65535?



--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 25 Mar 2020 04:08:19 +
Gerrit-HasComments: Yes


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-24 Thread Todd Lipcon (Code Review)
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Grant Henke,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15550

to look at the new patch set (#2).

Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..

wire_protocol: some simplification and optimization for rowwise encoding

* Re-implement GetSelectedRows based on the new ForEachSetBit(...)
  utility, which operates word-by-word instead of byte-by-byte.

* use boost::none as a special case return value which indicates the
  common case of "all rows are selected". Currently the rowwise
  serialization code path doesn't use this special value (and just
  reproduces the old std::iota() call to generate the sequence of all
  indexes), but the columnar code path will special case this as a
  memcpy.

* Avoid one call to CountSelected() in SerializeRowBlock() by calculating
  num_rows from the size of the row index vector.

* Change SerializeRowBlock() to return an int indicating the number of
  rows selected, instead of accumulating it into the protobuf. This
  value can then be re-used to eliminate one extra call to CountSelected
  in ScanResultCopier::HandleRowBlock().

  After this change, the protobuf is no longer used by
  SerializeRowBlock, so I removed the parameter, which required a bit of
  fixup in the tests.

Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
---
M src/kudu/common/rowblock-test.cc
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/tablet_service.cc
7 files changed, 94 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/2
--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] wire protocol: some simplification and optimization for rowwise encoding

2020-03-24 Thread Todd Lipcon (Code Review)
Hello Andrew Wong, Grant Henke,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/15550

to review the following change.


Change subject: wire_protocol: some simplification and optimization for rowwise 
encoding
..

wire_protocol: some simplification and optimization for rowwise encoding

* Re-implement GetSelectedRows based on the new ForEachSetBit(...)
  utility, which operates word-by-word instead of byte-by-byte.

* use boost::none as a special case return value which indicates the
  common case of "all rows are selected". Currently the rowwise
  serialization code path doesn't use this special value (and just
  reproduces the old std::iota() call to generate the sequence of all
  indexes), but the columnar code path will special case this as a
  memcpy.

* Avoid one call to CountSelected() in SerializeRowBlock() by calculating
  num_rows from the size of the row index vector.

* Change SerializeRowBlock() to return an int indicating the number of
  rows selected, instead of accumulating it into the protobuf. This
  value can then be re-used to eliminate one extra call to CountSelected
  in ScanResultCopier::HandleRowBlock().

  After this change, the protobuf is no longer used by
  SerializeRowBlock, so I removed the parameter, which required a bit of
  fixup in the tests.

Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
---
M src/kudu/common/rowblock-test.cc
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/tablet_service.cc
7 files changed, 95 insertions(+), 62 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/50/15550/1
--
To view, visit http://gerrit.cloudera.org:8080/15550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I24dfb1bd036fde514ca6494bae0ddc171dd225dd
Gerrit-Change-Number: 15550
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke