Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14039 )

Change subject: IMPALA-8818: Replace deque with spillable queue in BufferedPRS
......................................................................


Patch Set 7:

(9 comments)

Nice! Had some relatively minor comments.

I think a few of them come from a philosophical thing about error handling - I 
think I'd prefer if SpillableRowBatchQueue's invariants were stricter about 
when it was valid to call different methods in the lifecycle - currently there 
are a few runtime checks that would only be triggered if there was a bug in the 
calling code.

There's a couple of reasons I prefer it this way (and why we've generally done 
it this way in Impala). First, if the error-handling code isn't actually 
reachable, then we can't really test that it works correctly. Second, too many 
defensive checks add complexity and runtime overhead if the pattern is repeated 
- and it does actually get confusing whether something is an invariant of the 
system or a runtime error.

http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/exec/exec-node.h@196
PS7, Line 196:   const std::string label() const;
I think the const is unnecessary for a return by value (if you're returning by 
value, the caller gets its own copy anyway, so can choose whether it wants to 
mutate it).


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/buffered-tuple-stream.h
File be/src/runtime/buffered-tuple-stream.h:

http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/buffered-tuple-stream.h@370
PS7, Line 370:   int64_t BytesUnpinned() const {
I'd probably just call it bytes_unpinned() cause it's a plain accessor (there's 
a bit of a grey area in naming conventions here, but plain accessors usually 
have are lower case with underscores).


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h
File be/src/runtime/spillable-row-batch-queue.h:

http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h@71
PS7, Line 71:   /// successfully added, returns an error Status if the queue is 
full, has already been
Why not just make it invalid to append to the queue when it's full? We could 
then make it a DCHECK instead of a runtime check


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.h@79
PS7, Line 79:   /// Returns and removes the RowBatch at the head of the queue. 
Returns Status::OK() if
Same thing - if it's not valid to call GetBatch() on an empty queue, just make 
it part of the contract and use a DCHECK instead of the runtime check.


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@67
PS7, Line 67:   if (UNLIKELY(closed_)) return Status("SpillableRowBatchQueue is 
already closed.");
If it's a bug to call this when it's closed, let's just make it a DCHECK.


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@112
PS7, Line 112:   if (UNLIKELY(closed_)) return false;
I think it would be a bug to call this when it's closed, so we could document 
it as a precondition and make it a DCHECK


http://gerrit.cloudera.org:8080/#/c/14039/7/be/src/runtime/spillable-row-batch-queue.cc@130
PS7, Line 130:   if (batch_queue_ != nullptr)
Missing braces for multi-line if


http://gerrit.cloudera.org:8080/#/c/14039/7/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/14039/7/common/thrift/ImpalaService.thrift@415
PS7, Line 415:   // The maximum amount of pinned memory used when spooling 
query results. If this value
It's probably better to avoid renumbering things in thrift files. I think it's 
ok in this case since it's not really exposed in public APIs, but it's probably 
a good habit to avoid doing it.


http://gerrit.cloudera.org:8080/#/c/14039/7/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/14039/7/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@88
PS7, Line 88:       maxMemReservationBytes = Math.max(maxMemReservationBytes, 
minMemReservationBytes);
Move this up to l74 so that the calculation is all in one place?



--
To view, visit http://gerrit.cloudera.org:8080/14039
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10f9e72374cdf9501c0e5e2c5b39c13688ae65a9
Gerrit-Change-Number: 14039
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Aug 2019 21:26:51 +0000
Gerrit-HasComments: Yes

Reply via email to