Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11967 )

Change subject: IMPALA-1048: show sinks in exec summary
......................................................................


Patch Set 8:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@58
PS8, Line 58: Otherwise
            :   /// 'fragment_idx' should be -1.
Are there any other cases than join builders that are sinks but not at the root 
of the fragment?


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@60
PS8, Line 60: fragment_idx
I think this could be easier to understand if it was named "data_sink_id" 
because that's what we use it for. The comment could still point out that it 
needs to be unique and that by convention we pass the fragment index here. The 
.cc file could also become easier to read.


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc@72
PS8, Line 72:   int fragment_idx = fragment_ctx.fragment.idx;
While you're here, could you add a DCHECK that this is not a JOIN_BUILD_SINK?


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@501
PS8, Line 501:   // TODO: why do this every time we get an updated instance 
profile?
While you're here, do you know why we do this?


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@518
PS8, Line 518: node_exec_summary
rename to exec_summary?


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h@285
PS8, Line 285: TPlanNodeId
Maybe create a separate TDataSinkId (see comment elsewhere)?


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc@246
PS8, Line 246: the
nit: be


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc@870
PS8, Line 870:   // that rely on the plan node id being set there.
Should we create a follow up jira to drop this in Impala 4.0 / on the next 
compatibility breaking release?


http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift
File common/thrift/RuntimeProfile.thrift:

http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@68
PS8, Line 68: struct TRuntimeProfileNodeMetadata {
Out of curiosity, is this approach preferable to storing the id and the type in 
two separate fields, with the type identified by some enum? Or a tagged union 
with TPlanNodeId and TDataSinkId?


http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@70
PS8, Line 70: i32
We also have Types.TPlanNodeId, would that fit here?


http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java@84
PS8, Line 84: toThrift
I find overloads harder to follow when they have different visibility and such 
and would call this toThriftInternal(), but I don't feel strongly about it.


http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py@211
PS8, Line 211:             "" if is_sink else prettyprint_units(cardinality),
nit: you can write this as

  is_sink and "" or prettyprint_units(...)

It is closer to c++ ternary expressions and I find it easier to read. I don't 
feel strongly about it though.


http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@97
PS8, Line 97: result.exec_summary[0]
You could name this part to root_sink to make the code less repetitive and 
possibly easier to read.


http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@109
PS8, Line 109: *
nit: + instead of *



--
To view, visit http://gerrit.cloudera.org:8080/11967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
Gerrit-Change-Number: 11967
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Fri, 30 Nov 2018 02:42:33 +0000
Gerrit-HasComments: Yes

Reply via email to