Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/11967 )
Change subject: IMPALA-1048: show sinks in exec summary ...................................................................... Patch Set 8: (14 comments) http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h File be/src/exec/data-sink.h: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@58 PS8, Line 58: Otherwise : /// 'fragment_idx' should be -1. Are there any other cases than join builders that are sinks but not at the root of the fragment? http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@60 PS8, Line 60: fragment_idx I think this could be easier to understand if it was named "data_sink_id" because that's what we use it for. The comment could still point out that it needs to be unique and that by convention we pass the fragment index here. The .cc file could also become easier to read. http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc File be/src/exec/data-sink.cc: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc@72 PS8, Line 72: int fragment_idx = fragment_ctx.fragment.idx; While you're here, could you add a DCHECK that this is not a JOIN_BUILD_SINK? http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@501 PS8, Line 501: // TODO: why do this every time we get an updated instance profile? While you're here, do you know why we do this? http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@518 PS8, Line 518: node_exec_summary rename to exec_summary? http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h@285 PS8, Line 285: TPlanNodeId Maybe create a separate TDataSinkId (see comment elsewhere)? http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc@246 PS8, Line 246: the nit: be http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc@870 PS8, Line 870: // that rely on the plan node id being set there. Should we create a follow up jira to drop this in Impala 4.0 / on the next compatibility breaking release? http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@68 PS8, Line 68: struct TRuntimeProfileNodeMetadata { Out of curiosity, is this approach preferable to storing the id and the type in two separate fields, with the type identified by some enum? Or a tagged union with TPlanNodeId and TDataSinkId? http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@70 PS8, Line 70: i32 We also have Types.TPlanNodeId, would that fit here? http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java File fe/src/main/java/org/apache/impala/planner/DataSink.java: http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java@84 PS8, Line 84: toThrift I find overloads harder to follow when they have different visibility and such and would call this toThriftInternal(), but I don't feel strongly about it. http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py@211 PS8, Line 211: "" if is_sink else prettyprint_units(cardinality), nit: you can write this as is_sink and "" or prettyprint_units(...) It is closer to c++ ternary expressions and I find it easier to read. I don't feel strongly about it though. http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@97 PS8, Line 97: result.exec_summary[0] You could name this part to root_sink to make the code less repetitive and possibly easier to read. http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@109 PS8, Line 109: * nit: + instead of * -- To view, visit http://gerrit.cloudera.org:8080/11967 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544 Gerrit-Change-Number: 11967 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 30 Nov 2018 02:42:33 +0000 Gerrit-HasComments: Yes