Hello Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8991

to look at the new patch set (#3).

Change subject: IMPALA-5528: Upgrade GPerfTools to 2.6.3 and tune TCMalloc for KRPC
......................................................................

IMPALA-5528: Upgrade GPerfTools to 2.6.3 and tune TCMalloc for KRPC

KRPC in general tends to put more pressure on the thread caches
because it allocates more small objects (i.e. <1MB). While some of
these allocations are being addressed in KUDU-1865, we found that the
following TCMalloc workarounds provide reasonable performance with
KRPC:

- TCMALLOC_TRANSFER_NUM_OBJ:
  - maximum number of objects per class type to transfer between the
    thread caches and the central cache.
  - the default value of 512 in 2.5.2 seems to cause the spin lock in
    the central cache to be held for too long with KRPC. 2.6.0 and
    later revert this value to 32 by default.

- TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES:
  - total amount of memory allocated to all thread caches, in bytes.
  - the default value is 32MB. We need to bump it to 1GB, which is the
    internal cap in TCMalloc.

This change upgrades GPerfTools/TCMalloc to 2.6.3 to pick up the new
default value of TCMALLOC_TRANSFER_NUM_OBJ. In addition, when KRPC is
enabled and FLAGS_TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES has the default
value of 0, we automatically bump the total thread cache size to 1GB.

Without these workarounds, the stress test with KRPC grinds to a halt
due to contention for the spinlock in TCMalloc's central cache. With
these workarounds, the stress test completes within the same ballpark
as Thrift.

Also did a perf run with Thrift. The regression in TPCH-Q2 is mostly
due to sensitivity to runtime filter timing: the average can be dragged
up by a single bad run in which filters arrive late. No regression as
measured in targeted-perf.

+------------+-----------------------+---------+------------+------------+----------------+
| Workload   | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+------------+-----------------------+---------+------------+------------+----------------+
| TPCH(_300) | parquet / none / none | 18.93   | -0.84%     | 10.08      | +1.45%         |
+------------+-----------------------+---------+------------+------------+----------------+

+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| Workload   | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| TPCH(_300) | TPCH-Q2  | parquet / none / none | 6.28   | 3.25        | R +93.41%  | * 49.77% * | * 12.47% *     | 1           | 3     |
| TPCH(_300) | TPCH-Q4  | parquet / none / none | 5.00   | 4.77        |   +4.83%   |   0.41%    |   0.03%        | 1           | 3     |
| TPCH(_300) | TPCH-Q13 | parquet / none / none | 21.29  | 20.69       |   +2.90%   |   0.55%    |   0.37%        | 1           | 3     |
| TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.73   | 1.71        |   +0.94%   |   1.69%    |   2.85%        | 1           | 3     |
| TPCH(_300) | TPCH-Q14 | parquet / none / none | 6.03   | 5.99        |   +0.76%   |   0.00%    |   0.95%        | 1           | 3     |
| TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.97   | 6.93        |   +0.58%   |   0.74%    |   0.73%        | 1           | 3     |
| TPCH(_300) | TPCH-Q3  | parquet / none / none | 29.15  | 29.03       |   +0.40%   |   1.63%    |   1.39%        | 1           | 3     |
| TPCH(_300) | TPCH-Q1  | parquet / none / none | 14.01  | 13.96       |   +0.34%   |   1.28%    |   0.51%        | 1           | 3     |
| TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.27   | 1.27        |   -0.03%   |   3.69%    |   0.07%        | 1           | 3     |
| TPCH(_300) | TPCH-Q9  | parquet / none / none | 30.99  | 31.13       |   -0.45%   |   0.54%    |   0.19%        | 1           | 3     |
| TPCH(_300) | TPCH-Q5  | parquet / none / none | 48.03  | 48.33       |   -0.63%   |   4.72%    |   0.11%        | 1           | 3     |
| TPCH(_300) | TPCH-Q7  | parquet / none / none | 46.85  | 47.41       |   -1.18%   |   1.59%    |   0.46%        | 1           | 3     |
| TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.92   | 8.03        |   -1.39%   |   3.67%    |   5.63%        | 1           | 3     |
| TPCH(_300) | TPCH-Q19 | parquet / none / none | 30.98  | 31.51       |   -1.67%   |   1.33%    |   0.82%        | 1           | 3     |
| TPCH(_300) | TPCH-Q18 | parquet / none / none | 33.55  | 34.13       |   -1.71%   |   1.15%    |   1.46%        | 1           | 3     |
| TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.46   | 9.64        |   -1.82%   |   0.63%    |   0.75%        | 1           | 3     |
| TPCH(_300) | TPCH-Q22 | parquet / none / none | 6.00   | 6.16        |   -2.58%   |   0.08%    |   5.12%        | 1           | 3     |
| TPCH(_300) | TPCH-Q15 | parquet / none / none | 3.41   | 3.50        |   -2.60%   |   1.40%    |   0.46%        | 1           | 3     |
| TPCH(_300) | TPCH-Q12 | parquet / none / none | 3.24   | 3.33        |   -2.86%   |   1.36%    |   1.55%        | 1           | 3     |
| TPCH(_300) | TPCH-Q17 | parquet / none / none | 4.65   | 4.83        |   -3.58%   |   1.17%    |   0.42%        | 1           | 3     |
| TPCH(_300) | TPCH-Q21 | parquet / none / none | 96.15  | 100.63      |   -4.45%   |   0.29%    |   3.18%        | 1           | 3     |
| TPCH(_300) | TPCH-Q20 | parquet / none / none | 3.40   | 3.64        |   -6.63%   |   4.82%    | * 12.70% *     | 1           | 3     |
+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+

+---------------------+-----------------------+---------+------------+------------+----------------+
| Workload            | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+---------------------+-----------------------+---------+------------+------------+----------------+
| TARGETED-PERF(_300) | parquet / none / none | 59.31   | -1.40%     | 8.80       | -2.24%         |
+---------------------+-----------------------+---------+------------+------------+----------------+

+---------------------+--------------------------------------------------------+-----------------------+---------+-------------+------------+------------+----------------+-------------+-------+
| Workload            | Query                                                  | File Format           | Avg(s)  | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+---------------------+--------------------------------------------------------+-----------------------+---------+-------------+------------+------------+----------------+-------------+-------+
| TARGETED-PERF(_300) | primitive_conjunct_ordering_2                          | parquet / none / none | 36.27   | 30.52       |   +18.87%  | * 17.02% * |   2.42%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_1                             | parquet / none / none | 1.17    | 1.02        |   +14.59%  | * 12.82% * |   0.02%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_bigint_in_list                        | parquet / none / none | 1.03    | 0.92        |   +11.54%  |   2.51%    |   2.53%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_bigint_selective                      | parquet / none / none | 0.37    | 0.34        |   +6.93%   |   7.94%    |   1.32%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_selective                      | parquet / none / none | 0.47    | 0.44        |   +5.97%   |   6.11%    |   1.12%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_highndv                       | parquet / none / none | 24.35   | 23.87       |   +1.99%   |   0.82%    |   0.63%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_many_fragments                               | parquet / none / none | 61.64   | 60.93       |   +1.17%   |   3.25%    |   1.07%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_top-n_all                                    | parquet / none / none | 36.63   | 36.31       |   +0.87%   |   0.35%    |   4.62%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_4                          | parquet / none / none | 0.87    | 0.86        |   +0.66%   |   0.47%    |   0.03%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_long_predicate                               | parquet / none / none | 28.83   | 28.69       |   +0.49%   |   0.24%    |   0.10%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_topn_bigint                                  | parquet / none / none | 5.53    | 5.51        |   +0.34%   |   2.23%    |   0.07%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_3                             | parquet / none / none | 58.41   | 58.27       |   +0.24%   |   2.51%    |   0.02%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_decimal_arithmetic                           | parquet / none / none | 96.67   | 96.59       |   +0.09%   |   0.41%    |   0.08%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_count_star                                   | parquet / none / none | 0.09    | 0.09        |   +0.03%   |   0.72%    |   0.73%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_decimal_lowndv.test                  | parquet / none / none | 3.26    | 3.26        |   -0.00%   |   0.09%    |   1.57%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_broadcast_join_2                             | parquet / none / none | 4.40    | 4.41        |   -0.10%   |   0.01%    |   1.24%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_shuffle_join_union_all_with_groupby          | parquet / none / none | 67.43   | 67.58       |   -0.21%   |   0.31%    |   0.38%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_intrinsic_appx_median                        | parquet / none / none | 35.14   | 35.27       |   -0.34%   |   0.39%    |   0.47%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_decimal_highndv                      | parquet / none / none | 25.94   | 26.07       |   -0.51%   |   5.33%    |   2.30%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_exchange_broadcast                           | parquet / none / none | 76.54   | 76.96       |   -0.54%   |   4.50%    |   4.70%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_like                           | parquet / none / none | 5.52    | 5.56        |   -0.68%   |   0.92%    |   3.17%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_many_independent_fragments                   | parquet / none / none | 254.76  | 256.50      |   -0.68%   |   5.95%    |   2.19%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_shuffle_join_one_to_many_string_with_groupby | parquet / none / none | 228.43  | 230.08      |   -0.72%   |   0.62%    |   1.34%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_empty_build_join_1                           | parquet / none / none | 1.90    | 1.92        |   -1.26%   |   1.19%    |   2.74%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_exchange_shuffle                             | parquet / none / none | 78.99   | 80.26       |   -1.59%   |   0.75%    |   1.61%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_shuffle_1mb_rows                             | parquet / none / none | 1008.91 | 1027.39     |   -1.80%   |   2.33%    |   0.72%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_pk                            | parquet / none / none | 96.58   | 98.62       |   -2.07%   |   1.08%    |   1.98%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_groupby_bigint_lowndv                        | parquet / none / none | 3.26    | 3.33        |   -2.10%   |   3.00%    |   0.06%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_small_join_1                                 | parquet / none / none | 0.42    | 0.43        |   -2.54%   |   0.23%    |   1.54%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_string_non_selective                  | parquet / none / none | 0.90    | 0.93        |   -2.54%   |   0.18%    |   2.45%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_intrinsic_to_date                            | parquet / none / none | 77.56   | 79.81       |   -2.82%   |   0.39%    |   2.79%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_decimal_non_selective                 | parquet / none / none | 0.80    | 0.83        |   -3.56%   |   0.12%    |   2.68%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_bigint_non_selective                  | parquet / none / none | 1.00    | 1.05        |   -4.60%   |   0.31%    |   5.18%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_1                          | parquet / none / none | 4.91    | 5.16        |   -4.89%   |   0.44%    |   0.41%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_orderby_all                                  | parquet / none / none | 54.67   | 58.30       |   -6.23%   |   0.45%    |   0.81%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_5                          | parquet / none / none | 11.91   | 12.70       |   -6.24%   |   1.04%    |   0.53%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_decimal_selective                     | parquet / none / none | 0.86    | 0.94        |   -8.58%   | * 24.14% * | * 36.04% *     | 1           | 3     |
| TARGETED-PERF(_300) | primitive_orderby_bigint                               | parquet / none / none | 15.06   | 16.57       |   -9.14%   |   3.50%    |   0.10%        | 1           | 3     |
| TARGETED-PERF(_300) | primitive_filter_in_predicate                          | parquet / none / none | 1.11    | 1.24        |   -10.09%  | * 11.28% * | * 12.71% *     | 1           | 3     |
| TARGETED-PERF(_300) | primitive_orderby_bigint_expression                    | parquet / none / none | 18.16   | 24.86       | I -26.97%  |   0.96%    | * 20.73% *     | 1           | 3     |
| TARGETED-PERF(_300) | primitive_conjunct_ordering_3                          | parquet / none / none | 0.94    | 1.71        | I -44.74%  |   2.68%    | * 42.73% *     | 1           | 3     |
+---------------------+--------------------------------------------------------+-----------------------+---------+-------------+------------+------------+----------------+-------------+-------+

Change-Id: I5be574435af51fb7a875b16888cca260b341190e
---
M be/src/common/init.cc
M be/src/runtime/exec-env.cc
M bin/impala-config.sh
3 files changed, 11 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/8991/3
--
To view, visit http://gerrit.cloudera.org:8080/8991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5be574435af51fb7a875b16888cca260b341190e
Gerrit-Change-Number: 8991
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
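
Illustrative note on the commit message above: the defaulting rule it describes (when KRPC is enabled and the thread cache flag is left at 0, use 1GB) can be sketched roughly as follows. This is not the code in the patch. The names `ResolveThreadCacheBytes` and `kKrpcThreadCacheBytes`, and the convention that a return value of 0 means "leave TCMalloc's own default in place", are assumptions made for illustration only. In a real process the resulting value would presumably be handed to gperftools, for example via `MallocExtension::instance()->SetNumericProperty("tcmalloc.max_total_thread_cache_bytes", bytes)`.

```cpp
#include <cstdint>

// Hypothetical constant: the 1GB value that the commit message describes as
// TCMalloc's internal cap on the total thread cache size.
constexpr int64_t kKrpcThreadCacheBytes = int64_t{1} << 30;

// Hypothetical helper, not taken from the patch.
//   use_krpc:   whether KRPC is enabled.
//   flag_value: user-supplied setting in bytes; 0 means "not set".
// Returns the size to request in bytes, or 0 to keep TCMalloc's default
// (an assumed convention for this sketch).
constexpr int64_t ResolveThreadCacheBytes(bool use_krpc, int64_t flag_value) {
  // An explicit setting always wins; otherwise only KRPC gets the 1GB bump.
  return flag_value > 0 ? flag_value
                        : (use_krpc ? kKrpcThreadCacheBytes : int64_t{0});
}
```

Keeping the decision in a pure function separates the policy from the allocator call, so the rule can be checked without linking against gperftools.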