Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10550

to look at the new patch set (#7).

Change subject: IMPALA-7078: Part 1: improve memory consumption of wide Avro scans
......................................................................

IMPALA-7078: Part 1: improve memory consumption of wide Avro scans

Revert to the pre-IMPALA-3905 algorithm for deciding when to return a
batch from an Avro scan. The post-IMPALA-3905 algorithm is bad for
wide tables where there are only a small number of rows per Avro block.

Optimise memory transfer for selective scans - don't attach unused
decompression buffers to the output batch. Combined with the previous
change, this dramatically reduces the amount of memory transferred out
of scanner threads for selective scans of wide tables.

Includes some observability improvements including additional counters
that will help diagnose issues like this more easily:
* Add counters to give some insight into row batch queue. Here's an
  excerpt:
   - RowBatchBytesEnqueued: 20.89 MB (21903380)
   - RowBatchQueueCapacity: 5 (5)
   - RowBatchQueueGetWaitTime: 59.187ms
   - RowBatchQueuePeakMemoryUsage: 8.85 MB (9279347)
   - RowBatchQueuePutWaitTime: 0.000ns
   - RowBatchesEnqueued: 6 (6)
* Don't create AverageScannerThreadConcurrency for MT scan node where
  it's not actually used.
* Track the row batch queue memory consumption against a sub-tracker
    HDFS_SCAN_NODE (id=2): Reservation=48.00 MB OtherMemory=588.00 KB Total=48.57 MB Peak=48.62 MB
      Queued Batches: Total=588.00 KB Peak=637.00 KB

Ran the repro in the JIRA. Memory consumption was reduced from ~500MB
to ~220MB on my system.

Testing:
* Ran stress test for an hour on uncompressed and 3 hours on
  snappy-compressed avro.
* Debug exhaustive tests passed.
* ASAN core tests passed.
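
For readers unfamiliar with the sub-tracker idea mentioned above, here is a minimal sketch of how a child tracker can roll its consumption up into a parent while keeping its own total and peak. It is only an illustration under stated assumptions: `SimpleTracker` and its methods are hypothetical names and are not Impala's MemTracker API or the code in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical, simplified tracker for illustration only.
// This is NOT Impala's MemTracker API.
class SimpleTracker {
 public:
  explicit SimpleTracker(std::string label, SimpleTracker* parent = nullptr)
      : label_(std::move(label)), parent_(parent) {}

  // Charge 'bytes' to this tracker and to every ancestor, updating peaks.
  void Consume(int64_t bytes) {
    assert(bytes >= 0);
    for (SimpleTracker* t = this; t != nullptr; t = t->parent_) {
      t->current_ += bytes;
      t->peak_ = std::max(t->peak_, t->current_);
    }
  }

  // Return 'bytes' from this tracker and every ancestor. Peaks are unchanged.
  void Release(int64_t bytes) {
    assert(bytes >= 0);
    for (SimpleTracker* t = this; t != nullptr; t = t->parent_) {
      t->current_ -= bytes;
    }
  }

  int64_t current() const { return current_; }
  int64_t peak() const { return peak_; }
  const std::string& label() const { return label_; }

 private:
  std::string label_;
  SimpleTracker* parent_;
  int64_t current_ = 0;
  int64_t peak_ = 0;
};
```

In this sketch a scan node would own one tracker and create a child labelled "Queued Batches"; the child is charged when a batch is enqueued and released when the batch is dequeued, so the child's total and peak isolate queue memory while the parent still reports the overall figure.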

Perf:
- Parquet TPC-H scale factor 60 on one impalad showed no change in perf
- Avro/Snappy scale factor 20 showed a small improvement:

+----------+---------------------+---------+------------+------------+----------------+
| Workload | File Format         | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+---------------------+---------+------------+------------+----------------+
| TPCH(20) | avro / snap / block | 9.86    | -2.23%     | 7.83       | -2.37%         |
+----------+---------------------+---------+------------+------------+----------------+

+----------+----------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload | Query    | File Format          | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+----------+----------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TPCH(20) | TPCH-Q6  | avro / block / block | 5.59   | 5.17        | +8.10%     | 0.75%     | 0.71%          | 1           | 30    |
| TPCH(20) | TPCH-Q14 | avro / block / block | 6.31   | 5.89        | +7.21%     | 0.68%     | 0.74%          | 1           | 30    |
| TPCH(20) | TPCH-Q15 | avro / block / block | 11.32  | 10.64       | +6.41%     | 0.37%     | 0.46%          | 1           | 30    |
| TPCH(20) | TPCH-Q12 | avro / block / block | 8.57   | 8.14        | +5.23%     | 0.67%     | 0.81%          | 1           | 30    |
| TPCH(20) | TPCH-Q13 | avro / block / block | 6.72   | 6.54        | +2.72%     | 0.77%     | 0.73%          | 1           | 30    |
| TPCH(20) | TPCH-Q4  | avro / block / block | 11.76  | 11.61       | +1.32%     | 0.60%     | 0.61%          | 1           | 30    |
| TPCH(20) | TPCH-Q7  | avro / block / block | 14.43  | 14.26       | +1.21%     | 1.14%     | 0.35%          | 1           | 30    |
| TPCH(20) | TPCH-Q21 | avro / block / block | 34.12  | 34.25       | -0.36%     | 0.27%     | 0.24%          | 1           | 30    |
| TPCH(20) | TPCH-Q20 | avro / block / block | 8.49   | 8.52        | -0.38%     | 0.45%     | 0.54%          | 1           | 30    |
| TPCH(20) | TPCH-Q1  | avro / block / block | 6.99   | 7.02        | -0.38%     | 0.96%     | 0.65%          | 1           | 30    |
| TPCH(20) | TPCH-Q22 | avro / block / block | 2.44   | 2.47        | -1.09%     | 1.81%     | 1.47%          | 1           | 30    |
| TPCH(20) | TPCH-Q11 | avro / block / block | 1.99   | 2.02        | -1.57%     | 1.95%     | 1.90%          | 1           | 30    |
| TPCH(20) | TPCH-Q17 | avro / block / block | 13.57  | 13.79       | -1.63%     | 1.53%     | 1.31%          | 1           | 30    |
| TPCH(20) | TPCH-Q18 | avro / block / block | 21.93  | 22.31       | -1.72%     | 0.31%     | 0.34%          | 1           | 30    |
| TPCH(20) | TPCH-Q8  | avro / block / block | 9.05   | 9.31        | -2.81%     | 0.85%     | 0.72%          | 1           | 30    |
| TPCH(20) | TPCH-Q19 | avro / block / block | 7.20   | 7.41        | -2.91%     | 0.72%     | 0.52%          | 1           | 30    |
| TPCH(20) | TPCH-Q9  | avro / block / block | 14.25  | 14.73       | -3.29%     | 0.45%     | 0.33%          | 1           | 30    |
| TPCH(20) | TPCH-Q2  | avro / block / block | 2.69   | 2.88        | -6.66%     | 1.17%     | 1.52%          | 1           | 30    |
| TPCH(20) | TPCH-Q16 | avro / block / block | 2.12   | 2.30        | -7.82%     | 2.56%     | 2.10%          | 1           | 30    |
| TPCH(20) | TPCH-Q3  | avro / block / block | 9.68   | 11.24       | -13.85%    | 0.46%     | 0.50%          | 1           | 30    |
| TPCH(20) | TPCH-Q10 | avro / block / block | 8.92   | 10.66       | -16.33%    | 0.75%     | 0.49%          | 1           | 30    |
| TPCH(20) | TPCH-Q5  | avro / block / block | 8.76   | 10.69       | -18.08%    | 0.64%     | 0.49%          | 1           | 30    |
+----------+----------+----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/scan-node.h
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
13 files changed, 229 insertions(+), 71 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10550/7
--
To view, visit http://gerrit.cloudera.org:8080/10550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
Gerrit-Change-Number: 10550
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>