Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20850 )

Change subject: IMPALA-12533: Support row materialization outside of fetch lock
......................................................................


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG@13
PS1, Line 13: FetchLockWaitTimer and ResultFlushTimer and have been added to 
account
nit: extra "and" after ResultFlushTimer


http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG@15
PS1, Line 15: materializing results outside of the fetch lock. New query option
nit: A line break would be nice somewhere around here. Big wall of text.


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/exec/buffered-plan-root-sink.cc
File be/src/exec/buffered-plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/exec/buffered-plan-root-sink.cc@240
PS1, Line 240:                 num_rows_to_read, output_expr_evals_, state));
I think this needs a comment. It's not obvious why this uses num_rows_to_read 
until you realize InitDelayedMaterialization only does something the first time 
it's called, so you're just making sure it happens before the first AddRows.

Could this be done before the loop instead?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@469
PS1, Line 469: void RowBatch::CopyRows(RowBatch* dst, int start_idx, int 
num_rows) {
nit: can we put this after DeepCopyTo so the ordering is consistent with the 
header file?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@471
PS1, Line 471:   DCHECK_LE(num_rows, num_rows_);
This feels a little redundant with the next line. What we actually care about 
is that we won't read past the end, which start_idx+num_rows <= num_rows_ 
checks.


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@472
PS1, Line 472:   DCHECK_LE(start_idx + num_rows, num_rows_);
Maybe also DCHECK start_idx >= 0, and num_rows > 0?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc@208
PS1, Line 208:   MonotonicStopWatch fetch_lock_timer;
nit: No reason this needs to be outside the block where it's used.


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc@248
PS1, Line 248:     fetch_lock_timer.Start();
When does this get stopped again? Or used?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc
File be/src/service/query-result-set.cc:

http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@61
PS1, Line 61:     pendingRowBatch_(),
nit: you don't really need to default initialize all of these. It happens 
automatically. Which I'm guessing you know, so where'd this habit come from?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@82
PS1, Line 82:     DCHECK(!pendingRowBatch_);
nit: could some of this happen in the constructor? You're asserting things are 
uninitialized then initializing them. If they can't fail, they could happen in 
the constructor and you wouldn't need these checks.


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@190
PS1, Line 190:   virtual Status InitDelayedMaterialization(const RowDescriptor& 
row_desc,
Please try to match style (spacing, functions descriptions).


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@198
PS1, Line 198:       RETURN_IF_ERROR(AddRows(row_batch_context_->GetExprEvals(),
Re-using AddRows here makes the AddRows conditional feel awkward. How about 
breaking out the work it does into a private function we can call here, and get 
rid of the "GetRowBatch() != batch" check?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@457
PS1, Line 457:   if (!row_batch_context_) {
Should we DCHECK that delay_materialization_ is true?


http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@502
PS1, Line 502:   DCHECK(o->row_batch_context_ == nullptr);
Why not DCHECK_EQ?



--
To view, visit http://gerrit.cloudera.org:8080/20850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9512aa6022dbcf81e7eb5f559853b89fe80bd23
Gerrit-Change-Number: 20850
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Comment-Date: Wed, 03 Jan 2024 23:58:23 +0000
Gerrit-HasComments: Yes

Reply via email to