Hello Michael Ho, Lars Volker, Philip Zeyliger,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11967

to look at the new patch set (#6).

Change subject: IMPALA-1048: show sinks in exec summary
......................................................................

IMPALA-1048: show sinks in exec summary

The exec summary now includes the total time taken and memory
consumed by the data sink at the root of each fragment. Previously
the exec summary could hide where time and memory went while
executing a query.

The high-level changes are:
* Generalising logic in the exec summary and runtime profile to
  handle data sinks, not just plan nodes, including adding richer
  metadata to runtime profile nodes.
* Threading through metadata about the data sinks, like names and
  estimates, so that it can appear in the exec summary.

The main potential downside is that the new timings reported for
the data stream sender can overlap with the receiver's timings,
which may cause confusion.

[localhost:21000] default> select count(distinct l_comment) from tpch_parquet.lineitem; summary;
Query: select count(distinct l_comment) from tpch_parquet.lineitem
Query submitted at: 2018-11-20 16:47:03 (Coordinator: http://tarmstrong-box:25000)
Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=f5464383a3bb6878:54b5252b00000000
+---------------------------+
| count(distinct l_comment) |
+---------------------------+
| 4580667                   |
+---------------------------+
Fetched 1 row(s) in 4.13s
+---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| Operator            | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail                |
+---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
| F02:ROOT            | 1      | 59.11ms  | 59.11ms  |       |            | 0 B       | 0 B           |                       |
| 06:AGGREGATE        | 1      | 274.24us | 274.24us | 1     | 1          | 16.00 KB  | 10.00 MB      | FINALIZE              |
| 05:EXCHANGE         | 1      | 75.16us  | 75.16us  | 3     | 1          | 32.00 KB  | 16.00 KB      | UNPARTITIONED         |
| F01:EXCHANGE SENDER | 3      | 119.53us | 146.28us |       |            | 16.00 KB  | 0 B           |                       |
| 02:AGGREGATE        | 3      | 19.26ms  | 19.89ms  | 3     | 1          | 16.00 KB  | 10.00 MB      |                       |
| 04:AGGREGATE        | 3      | 1.06s    | 1.07s    | 4.58M | 4.65M      | 96.02 MB  | 62.63 MB      |                       |
| 03:EXCHANGE         | 3      | 243.91ms | 246.44ms | 5.01M | 4.65M      | 464.00 KB | 10.12 MB      | HASH(l_comment)       |
| F00:EXCHANGE SENDER | 3      | 2.41s    | 2.55s    |       |            | 337.53 KB | 0 B           |                       |
| 01:AGGREGATE        | 3      | 1.05s    | 1.14s    | 5.01M | 4.65M      | 97.20 MB  | 121.17 MB     | STREAMING             |
| 00:SCAN HDFS        | 3      | 37.88ms  | 41.28ms  | 6.00M | 6.00M      | 27.88 MB  | 80.00 MB      | tpch_parquet.lineitem |
+---------------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------+
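
For anyone post-processing this output, the sketch below shows one way the
new sink rows could be told apart from plan node rows. It is not part of
this patch. It assumes, based only on the sample output above, that sink
rows are labelled "F<nn>:<name>" and plan node rows "<nn>:<name>". The
helper name split_summary_rows is hypothetical, and the sample is an
abbreviated three-column excerpt of the table above.

```python
import re

# Assumption: sink rows carry a fragment-style label such as "F02:ROOT",
# while plan node rows start with a bare node id such as "06:AGGREGATE".
SINK_LABEL = re.compile(r"^F\d+:")


def split_summary_rows(summary_text):
    """Hypothetical helper: return (sink_rows, node_rows) as lists of cells."""
    sinks, nodes = [], []
    for line in summary_text.splitlines():
        line = line.strip()
        # Skip the +---+ border lines and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row.
        if not cells or cells[0] == "Operator":
            continue
        if SINK_LABEL.match(cells[0]):
            sinks.append(cells)
        else:
            nodes.append(cells)
    return sinks, nodes


# Abbreviated excerpt (first three columns only) of the summary above.
SAMPLE = """\
+---------------------+--------+----------+
| Operator            | #Hosts | Avg Time |
+---------------------+--------+----------+
| F02:ROOT            | 1      | 59.11ms  |
| 06:AGGREGATE        | 1      | 274.24us |
| F01:EXCHANGE SENDER | 3      | 119.53us |
+---------------------+--------+----------+
"""

sinks, nodes = split_summary_rows(SAMPLE)
print("sinks:", [row[0] for row in sinks])
print("nodes:", [row[0] for row in nodes])
```

Keeping the two groups separate may help when summing times, since the
sender timings can overlap with the receiver's.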

Testing:
Added a basic observability test.

Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
---
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/summary-util.cc
M common/thrift/DataSinks.thrift
M common/thrift/ExecStats.thrift
M common/thrift/RuntimeProfile.thrift
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M shell/impala_client.py
M tests/beeswax/impala_beeswax.py
M tests/query_test/test_observability.py
36 files changed, 321 insertions(+), 122 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/11967/6
--
To view, visit http://gerrit.cloudera.org:8080/11967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
Gerrit-Change-Number: 11967
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
