Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20850 )
Change subject: IMPALA-12533: Support row materialization outside of fetch lock ...................................................................... Patch Set 1: (14 comments) http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG@13 PS1, Line 13: FetchLockWaitTimer and ResultFlushTimer and have been added to account nit: extra "and" after ResultFlushTimer http://gerrit.cloudera.org:8080/#/c/20850/1//COMMIT_MSG@15 PS1, Line 15: materializing results outside of the fetch lock. New query option nit: A line break would be nice somewhere around here. Big wall of text. http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/exec/buffered-plan-root-sink.cc File be/src/exec/buffered-plan-root-sink.cc: http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/exec/buffered-plan-root-sink.cc@240 PS1, Line 240: num_rows_to_read, output_expr_evals_, state)); I think this needs a comment. It's not obvious why this uses num_rows_to_read until you realize InitDelayedMaterialization only does something the first time it's called, so you're just making sure it happens before the first AddRows. Could this be done before the loop instead? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@469 PS1, Line 469: void RowBatch::CopyRows(RowBatch* dst, int start_idx, int num_rows) { nit: can we put this after DeepCopyTo so the ordering is consistent with the header file? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@471 PS1, Line 471: DCHECK_LE(num_rows, num_rows_); This feels a little redundant with the next line. What we actually care about is that we won't read past the end, which start_idx+num_rows <= num_rows_ checks. http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/runtime/row-batch.cc@472 PS1, Line 472: DCHECK_LE(start_idx + num_rows, num_rows_); Maybe also DCHECK start_idx >= 0, and num_rows > 0? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc@208 PS1, Line 208: MonotonicStopWatch fetch_lock_timer; nit: No reason this needs to be outside the block where it's used. http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/impala-hs2-server.cc@248 PS1, Line 248: fetch_lock_timer.Start(); When does this get stopped again? Or used? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc File be/src/service/query-result-set.cc: http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@61 PS1, Line 61: pendingRowBatch_(), nit: you don't really need to default initialize all of these. It happens automatically. Which I'm guessing you know, so where'd this habit come from? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@82 PS1, Line 82: DCHECK(!pendingRowBatch_); nit: could some of this happen in the constructor? You're asserting things are uninitialized then initializing them. If they can't fail, they could happen in the constructor and you wouldn't need these checks. http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@190 PS1, Line 190: virtual Status InitDelayedMaterialization(const RowDescriptor& row_desc, Please try to match style (spacing, functions descriptions). http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@198 PS1, Line 198: RETURN_IF_ERROR(AddRows(row_batch_context_->GetExprEvals(), Re-using AddRows here makes the AddRows conditional feel awkward. How about breaking out the work it does into a private function we can call here, and get rid of the "GetRowBatch() != batch" check? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@457 PS1, Line 457: if (!row_batch_context_) { Should we DCHECK that delay_materialization_ is true? http://gerrit.cloudera.org:8080/#/c/20850/1/be/src/service/query-result-set.cc@502 PS1, Line 502: DCHECK(o->row_batch_context_ == nullptr); Why not DCHECK_EQ? -- To view, visit http://gerrit.cloudera.org:8080/20850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If9512aa6022dbcf81e7eb5f559853b89fe80bd23 Gerrit-Change-Number: 20850 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Comment-Date: Wed, 03 Jan 2024 23:58:23 +0000 Gerrit-HasComments: Yes