Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14039 )
Change subject: IMPALA-8818: Replace deque with spillable queue in BufferedPRS ...................................................................... Patch Set 8: (8 comments) I took a look at the resource management part of this http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc File be/src/runtime/spillable-row-batch-queue.cc: http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc@80 PS8, Line 80: RETURN_IF_ERROR(state_->StartSpilling(mem_tracker_)); Add something to the profile to indicate it spilled? http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/runtime/spillable-row-batch-queue.cc@81 PS8, Line 81: // The pin should be stream at this point. : DCHECK(batch_queue_->is_pinned()); > It makes me wonder if the logic will be simpler if we just always use an u This wouldn't work with the current spilling infrastructure, which really assumes that you don't unpin things until you actually need to spill them. There's a few things: * Spilling can be disabled and we need to call StartSpilling() to check whether it's ok to spill things * Data is flushed to disk eagerly, even if there's plenty of memory * Errors in writing to disk (e.g. hitting a scratch limit, no configured scratch, etc) will result in a query failure You could add a mode where unpinned things don't get flushed until there is memory pressure (and errors from spilling being disabled are propagated in a different way), but it would be additional work and complexity in a different part of the code. http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc@907 PS8, Line 907: int64_t max_pinned_mem = query_options->max_pinned_result_spooling_memory; I'd suggest just making this max_result_spooling_mem or max_result_spooling_memory. We haven't exposed the pinned/unpinned concept in user-facing options or documentation for the most part, and I think it would be confusing. For accounting purposes unpinned pages don't count against a query's memory limits. http://gerrit.cloudera.org:8080/#/c/14039/8/be/src/service/query-options.cc@911 PS8, Line 911: max_unpinned_result_spooling_memory We should use "spilled" instead of "unpinned" in user-facing terminology to be consistent with elsewhere. http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java: http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@84 PS8, Line 84: long perInstanceInputCardinality = There will only ever be one instance of PLAN_ROOT_SINK, so we can remote the numInstances calculations. http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87 PS8, Line 87: (long) Math.ceil(perInstanceInputCardinality * inputNode.getAvgRowSize()); We should cap this at the max result spooling memory, otherwise it could be a huge overestimate. http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java: http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java@76 PS8, Line 76: memEstimateBytes_ = (maxMemReservationBytes > 0) ? This isn't generally correct for all plan nodes, the memory estimate can include non-reserved memory. You need to do this calculation in PlanRootSink since it depends on the node how much non-reserved memory there might be. http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java File fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java: http://gerrit.cloudera.org:8080/#/c/14039/8/fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java@78 PS8, Line 78: Preconditions.checkState(memEstimateBytes_ >= 0, "Mem estimate must be set"); Maybe check that maxMemReservationBytes_ >= minMemReservationBytes_ -- To view, visit http://gerrit.cloudera.org:8080/14039 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I10f9e72374cdf9501c0e5e2c5b39c13688ae65a9 Gerrit-Change-Number: 14039 Gerrit-PatchSet: 8 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 21 Aug 2019 00:47:18 +0000 Gerrit-HasComments: Yes
