Hello Michael Ho, Joe McDonnell, Tim Armstrong, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8758

to look at the new patch set (#18).

Change subject: IMPALA-6190/6246: Add instances tab and event sequence
......................................................................

IMPALA-6190/6246: Add instances tab and event sequence

This change adds tracking of the current state during the execution of a
fragment instance. The current state is then reported back to the
coordinator and exposed to users via a new tab in the query detail debug
webpage.

This change also adds an event timeline to fragment instances in the
query profile. For each event, the timeline shows the time since
backend-local query start at which the event completed, followed in
parentheses by the time since the previous event. Events are derived from
the current state of the execution of a fragment instance. For example:

  - Prepare Finished: 13.436ms (13.436ms)
  - First Batch Produced: 1s022ms (1s008ms)
  - First Batch Sent: 1s022ms (455.558us)
  - ExecInternal Finished: 2s783ms (1s760ms)

I added automated tests for both extensions and additionally verified the
change by manual inspection.

Here are the TPCH performance comparison results between this change and
the previous commit on a 16 node cluster.

+------------+-----------------------+---------+------------+------------+----------------+
| Workload   | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+------------+-----------------------+---------+------------+------------+----------------+
| TPCH(_300) | parquet / none / none | 18.47   | -0.94%     | 9.72       | -1.08%         |
+------------+-----------------------+---------+------------+------------+----------------+

+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| Workload   | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| TPCH(_300) | TPCH-Q5  | parquet / none / none | 48.88  | 46.93       | +4.15%     | 0.14%      | 3.61%          | 1           | 3     |
| TPCH(_300) | TPCH-Q13 | parquet / none / none | 21.64  | 21.15       | +2.29%     | 2.06%      | 1.84%          | 1           | 3     |
| TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.71   | 1.70        | +1.12%     | 0.54%      | 2.51%          | 1           | 3     |
| TPCH(_300) | TPCH-Q18 | parquet / none / none | 33.15  | 32.79       | +1.09%     | 0.13%      | 2.03%          | 1           | 3     |
| TPCH(_300) | TPCH-Q14 | parquet / none / none | 5.95   | 5.90        | +0.82%     | 2.19%      | 0.49%          | 1           | 3     |
| TPCH(_300) | TPCH-Q1  | parquet / none / none | 13.99  | 13.90       | +0.63%     | 0.25%      | 1.39%          | 1           | 3     |
| TPCH(_300) | TPCH-Q2  | parquet / none / none | 3.44   | 3.44        | +0.00%     | * 20.29% * | * 20.76% *     | 1           | 3     |
| TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.21   | 1.22        | -0.01%     | 0.06%      | 0.06%          | 1           | 3     |
| TPCH(_300) | TPCH-Q20 | parquet / none / none | 3.51   | 3.51        | -0.11%     | 7.15%      | 7.30%          | 1           | 3     |
| TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.89   | 6.91        | -0.21%     | 0.65%      | 0.55%          | 1           | 3     |
| TPCH(_300) | TPCH-Q4  | parquet / none / none | 4.78   | 4.80        | -0.38%     | 0.06%      | 0.59%          | 1           | 3     |
| TPCH(_300) | TPCH-Q19 | parquet / none / none | 30.78  | 31.04       | -0.83%     | 0.45%      | 1.03%          | 1           | 3     |
| TPCH(_300) | TPCH-Q22 | parquet / none / none | 6.06   | 6.12        | -1.02%     | 1.51%      | 2.12%          | 1           | 3     |
| TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.43   | 9.58        | -1.54%     | 0.69%      | 3.30%          | 1           | 3     |
| TPCH(_300) | TPCH-Q21 | parquet / none / none | 93.41  | 95.18       | -1.86%     | 0.08%      | 0.81%          | 1           | 3     |
| TPCH(_300) | TPCH-Q15 | parquet / none / none | 3.40   | 3.47        | -1.99%     | 0.72%      | 1.27%          | 1           | 3     |
| TPCH(_300) | TPCH-Q7  | parquet / none / none | 44.98  | 46.24       | -2.71%     | 1.83%      | 1.27%          | 1           | 3     |
| TPCH(_300) | TPCH-Q3  | parquet / none / none | 28.06  | 29.11       | -3.61%     | 1.62%      | 1.23%          | 1           | 3     |
| TPCH(_300) | TPCH-Q12 | parquet / none / none | 3.15   | 3.28        | -3.80%     | 0.96%      | 1.32%          | 1           | 3     |
| TPCH(_300) | TPCH-Q9  | parquet / none / none | 29.47  | 30.80       | -4.30%     | 0.29%      | 0.34%          | 1           | 3     |
| TPCH(_300) | TPCH-Q17 | parquet / none / none | 4.37   | 4.62        | -5.33%     | 0.63%      | 0.54%          | 1           | 3     |
| TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.99   | 8.46        | -5.53%     | 7.95%      | 1.11%          | 1           | 3     |
+------------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+

Here are the TPCDS performance comparison results between this change and
the previous commit on a 16 node cluster. I inspected the Q2 results and
concluded that the variability is unrelated to this change.

+--------------+-----------------------+---------+------------+------------+----------------+
| Workload     | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+--------------+-----------------------+---------+------------+------------+----------------+
| TPCDS(_1000) | parquet / none / none | 13.07   | +0.51%     | 4.27       | +1.83%         |
+--------------+-----------------------+---------+------------+------------+----------------+

+--------------+------------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| Workload     | Query      | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+--------------+------------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| TPCDS(_1000) | TPCDS-Q2   | parquet / none / none | 8.36   | 4.25        | R +96.81%  | * 48.88% * | 0.42%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q8   | parquet / none / none | 1.59   | 1.35        | +17.86%    | * 13.91% * | 4.01%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q73  | parquet / none / none | 1.81   | 1.71        | +5.92%     | 5.53%      | 0.15%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q28  | parquet / none / none | 7.26   | 6.95        | +4.47%     | 1.09%      | 1.11%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q46  | parquet / none / none | 2.36   | 2.30        | +2.62%     | 1.45%      | 0.40%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q7   | parquet / none / none | 2.78   | 2.73        | +1.98%     | 1.21%      | 2.23%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q55  | parquet / none / none | 1.05   | 1.03        | +1.91%     | 1.16%      | 2.20%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q42  | parquet / none / none | 1.05   | 1.04        | +1.71%     | 0.90%      | 2.63%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q19  | parquet / none / none | 1.67   | 1.65        | +1.55%     | 1.12%      | 1.96%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q23  | parquet / none / none | 151.75 | 149.94      | +1.20%     | 3.23%      | 1.83%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q64  | parquet / none / none | 40.25  | 39.79       | +1.16%     | 0.43%      | 0.28%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q96  | parquet / none / none | 2.25   | 2.22        | +1.05%     | 1.00%      | 0.11%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q53  | parquet / none / none | 1.60   | 1.58        | +1.01%     | 1.28%      | 0.04%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q79  | parquet / none / none | 4.17   | 4.13        | +0.94%     | 0.89%      | 0.06%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q59  | parquet / none / none | 5.74   | 5.71        | +0.60%     | 1.22%      | 2.56%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q52  | parquet / none / none | 0.89   | 0.89        | +0.14%     | 0.03%      | 0.63%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q88  | parquet / none / none | 7.10   | 7.12        | -0.23%     | 0.43%      | 0.47%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q3   | parquet / none / none | 1.10   | 1.11        | -0.40%     | 0.58%      | 0.36%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q98  | parquet / none / none | 2.30   | 2.31        | -0.49%     | 3.58%      | 1.04%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q61  | parquet / none / none | 1.87   | 1.89        | -1.08%     | 1.68%      | 0.14%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q27a | parquet / none / none | 2.93   | 2.96        | -1.18%     | 1.74%      | 1.54%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q34  | parquet / none / none | 2.23   | 2.27        | -1.73%     | 1.91%      | 1.32%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q63  | parquet / none / none | 1.56   | 1.60        | -1.96%     | 1.91%      | 3.33%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q89  | parquet / none / none | 2.64   | 2.70        | -2.20%     | 1.93%      | 1.88%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q47  | parquet / none / none | 30.41  | 31.17       | -2.41%     | 1.09%      | 1.52%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q1   | parquet / none / none | 3.77   | 3.86        | -2.46%     | 1.91%      | 0.61%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q6   | parquet / none / none | 61.67  | 63.34       | -2.65%     | 3.77%      | 0.31%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q4   | parquet / none / none | 31.11  | 31.96       | -2.66%     | 0.61%      | 0.77%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q43  | parquet / none / none | 4.10   | 4.22        | -2.87%     | 1.40%      | 2.85%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q5   | parquet / none / none | 8.30   | 8.56        | -3.13%     | 1.55%      | 0.47%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q27  | parquet / none / none | 2.28   | 2.35        | -3.13%     | 1.17%      | 1.56%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q65  | parquet / none / none | 31.74  | 32.77       | -3.15%     | 1.47%      | 1.11%          | 1           | 3     |
| TPCDS(_1000) | TPCDS-Q68  | parquet / none / none | 1.56   | 1.62        | -3.58%     | 9.37%      | * 11.93% *     | 1           | 3     |
+--------------+------------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+

(R) Regression: TPCDS(_1000) TPCDS-Q2 [parquet / none / none] (4.25s -> 8.36s [+96.81%])
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Rows   | Est #Rows |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+
| 27:MERGING-EXCHANGE | 22.48%     | 6.97s    | 2.85s    | +144.40%   | * 58.44% * | 11.05s   | 2.86s    | +286.33%   | 1      | 2.51K   | 2.56K     |
| 26:EXCHANGE         | 7.65%      | 2.37s    | 2.43s    | -2.16%     | 1.82%      | 2.46s    | 2.50s    | -1.65%     | 14     | 365     | 2.56K     |
| 23:EXCHANGE         | 8.58%      | 2.66s    | 2.70s    | -1.46%     | 1.67%      | 2.74s    | 2.78s    | -1.47%     | 14     | 516     | 10.64K    |
| 13:AGGREGATE        | 4.21%      | 1.31s    | 1.30s    | +0.65%     | 0.06%      | 1.47s    | 1.43s    | +2.38%     | 14     | 516     | 10.64K    |
| 12:HASH JOIN        | 2.89%      | 896.20ms | 885.79ms | +1.17%     | 1.43%      | 1.06s    | 1.01s    | +4.77%     | 14     | 433.27M | 2.16B     |
| 06:SCAN HDFS        | 2.83%      | 877.34ms | 886.93ms | -1.08%     | 1.23%      | 888.16ms | 906.88ms | -2.06%     | 1      | 365     | 373       |
| 19:EXCHANGE         | 23.20%     | 7.20s    | 3.12s    | +130.58%   | * 56.73% * | 11.33s   | 3.17s    | +256.92%   | 14     | 520     | 10.64K    |
| 05:AGGREGATE        | 12.06%     | 3.74s    | 1.34s    | +178.49%   | * 64.53% * | 6.33s    | 1.53s    | +314.84%   | 14     | 520     | 10.64K    |
| 04:HASH JOIN        | 7.71%      | 2.39s    | 956.81ms | +149.90%   | * 60.36% * | 4.04s    | 1.13s    | +256.75%   | 14     | 442.29M | 2.16B     |
| 03:SCAN HDFS        | 2.83%      | 878.97ms | 894.11ms | -1.69%     | 1.34%      | 890.78ms | 910.22ms | -2.14%     | 1      | 371     | 73.05K    |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+

Change-Id: I626456b6afa9101eeeeffd5cda10c4096d63d7f9
---
M be/src/common/atomic.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/stopwatch.h
M common/thrift/ImpalaInternalService.thrift
M tests/query_test/test_observability.py
M tests/webserver/test_web_pages.py
M www/query_detail_tabs.tmpl
A www/query_finstances.tmpl
19 files changed, 594 insertions(+), 81 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/8758/18
--
To view, visit http://gerrit.cloudera.org:8080/8758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I626456b6afa9101eeeeffd5cda10c4096d63d7f9
Gerrit-Change-Number: 8758
Gerrit-PatchSet: 18
Gerrit-Owner: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>