Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14039 )

Change subject: IMPALA-8818: Replace deque with spillable queue in BufferedPRS
......................................................................


Patch Set 8:

(8 comments)

I took a look at the resource management part of this

http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc
File be/src/runtime/spillable-row-batch-queue.cc:

http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc@80
PS8, Line 80:       RETURN_IF_ERROR(state_->StartSpilling(mem_tracker_));
Add something to the profile to indicate it spilled?


http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc@81
PS8, Line 81:       // The pin should be stream at this point.
            :       DCHECK(batch_queue_->is_pinned());
> It makes me wonder if the logic will be simpler if we just always use  an u
This wouldn't work with the current spilling infrastructure, which really 
assumes that you don't unpin things until you actually need to spill them. 
There's a few things:

* Spilling can be disabled and we need to call StartSpilling() to check whether 
it's ok to spill things
* Data is flushed to disk eagerly, even if there's plenty of memory
* Errors in writing to disk (e.g. hitting a scratch limit, no configured 
scratch, etc) will result in a query failure

You could add a mode where unpinned things don't get flushed until there is 
memory pressure (and errors from spilling being disabled are propagated in a 
different way), but it would be additional work and complexity in a different 
part of the code.


http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc@907
PS8, Line 907:   int64_t max_pinned_mem = 
query_options->max_pinned_result_spooling_memory;
I'd suggest just making this max_result_spooling_mem or 
max_result_spooling_memory. We haven't exposed the pinned/unpinned concept in 
user-facing options or documentation for the most part, and I think it would be 
confusing. For accounting purposes unpinned pages don't count against a query's 
memory limits.


http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc@911
PS8, Line 911: max_unpinned_result_spooling_memory
We should use "spilled" instead of "unpinned" in user-facing terminology to be 
consistent with elsewhere.


http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@84
PS8, Line 84:         long perInstanceInputCardinality =
There will only ever be one instance of PLAN_ROOT_SINK, so we can remote the 
numInstances calculations.


http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS8, Line 87:             (long) Math.ceil(perInstanceInputCardinality * 
inputNode.getAvgRowSize());
We should cap this at the max result spooling memory, otherwise it could be a 
huge overestimate.


http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java:

http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java@76
PS8, Line 76:     memEstimateBytes_ = (maxMemReservationBytes > 0) ?
This isn't generally correct for all plan nodes, the memory estimate can 
include non-reserved memory. You need to do this calculation in PlanRootSink 
since it depends on the node how much non-reserved memory there might be.


http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
File fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java:

http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java@78
PS8, Line 78:     Preconditions.checkState(memEstimateBytes_ >= 0, "Mem 
estimate must be set");
Maybe check that maxMemReservationBytes_ >= minMemReservationBytes_



--
To view, visit http://gerrit.cloudera.org:8080/14039
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10f9e72374cdf9501c0e5e2c5b39c13688ae65a9
Gerrit-Change-Number: 14039
Gerrit-PatchSet: 8
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Wed, 21 Aug 2019 00:47:18 +0000
Gerrit-HasComments: Yes

Reply via email to