Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Michael 
Smith, Wenzhe Zhou, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18126

to look at the new patch set (#7).

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.
......................................................................

IMPALA-11068: Add query option to reduce scanner thread launch.

Under heavy decompression workloads, Impala running with scanner thread
parallelism (MT_DOP=0) can still hit OOM errors because it launches too
many threads too soon. ScannerMemLimiter already limits the number of
scanner threads by calculating each thread's memory requirement and
estimating the memory growth rate of all threads. However, it does not
prevent a scanner node from quickly launching many threads and
immediately exhausting the memtracker's spare capacity. Even after
ScannerMemLimiter rejects a new thread launch, some existing threads
might continue increasing their non-reserved memory for decompression
work until the memory limit is exceeded.

IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a
heuristic to account for non-reserved memory growth. Increasing this
flag value can help reduce the thread count, but might severely regress
other queries that do not have heavy decompression characteristics. The
same applies to lowering the NUM_SCANNER_THREADS query option.

This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES,
as an alternative way to mitigate OOM. It is intended to offer the same
control as hdfs_scanner_thread_max_estimated_bytes, but as a query
option so that tuning can be done at per-query granularity. If this
query option is not set, or is set to 0 or a negative value, the backend
falls back to the value of the hdfs_scanner_thread_max_estimated_bytes
flag.
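
The fallback rule described above can be sketched roughly as follows.
This is only an illustration, not code from the patch: the helper name
EffectiveNonReservedBytes and its parameters are hypothetical, and it
assumes an unset query option is represented as 0.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from the patch): chooses the per-thread
// estimate of non-reserved memory used when deciding on thread launches.
//   query_option_bytes: value of HDFS_SCANNER_NON_RESERVED_BYTES
//                       (assumed to be 0 when the option is not set).
//   flag_bytes: value of the hdfs_scanner_thread_max_estimated_bytes flag.
inline int64_t EffectiveNonReservedBytes(int64_t query_option_bytes,
                                         int64_t flag_bytes) {
  // Unset, zero, or negative: fall back to the process-wide flag.
  // Positive: the per-query value overrides the flag.
  return query_option_bytes > 0 ? query_option_bytes : flag_bytes;
}
```

Under this sketch, only a positive per-query value overrides the flag,
so queries that never set the option keep their existing behavior.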

Testing:
- Add test case in query-options-test.cc and
  TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.

Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
8 files changed, 89 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/7
--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
