[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11556 ) Change subject: IMPALA-6323 Allow constant analytic window expressions. .. Patch Set 6: Should there also be an end-to-end test for this ? It may be worth augmenting some of the queries in testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test ? -- To view, visit http://gerrit.cloudera.org:8080/11556 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c Gerrit-Change-Number: 11556 Gerrit-PatchSet: 6 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 25 Oct 2018 23:57:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 6: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Oct 2018 02:18:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1402 PS18, Line 1402: bigint, float You forgot to update result type for the extra column in the select list. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 18 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Oct 2018 02:17:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/11615/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11615/5//COMMIT_MSG@14 PS5, Line 14: coordinator backend states. In addition, due to the lack : of coordinator between query fragment instances, a query may end : without collecting the profiles from all fragment instances when : one of them hits an error before some other fragment instance manages : to finish calling Prepare(), leading to missing > Does this mean that we've fixed the problem in IMPALA-7148 and can re-enabl Yes. Uncommented some of the tests in bloom_filters.test. http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@236 PS1, Line 236: if (overall_status_.IsCancelled()) { > Right, I was thinking that CANCELLED_INTERNALLY shouldn't leak out of fragm Keep it as-is for now. http://gerrit.cloudera.org:8080/#/c/11615/5/testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test File testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test: http://gerrit.cloudera.org:8080/#/c/11615/5/testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test@18 PS5, Line 18: QUERY > This test doesn't really have anything to do with nondeterminism. Would you Done -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Oct 2018 02:13:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 20: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 20 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 25 Oct 2018 02:12:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 19: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/19/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/19/be/src/runtime/query-state.cc@363 PS19, Line 363: LOG(DFATAL) << FromKuduStatus(sidecar_status, "Failed to add sidecar").GetDetail(); > line too long (91 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 19 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 25 Oct 2018 02:12:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Hello Thomas Marshall, Tim Armstrong, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11615 to look at the new patch set (#6). Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly concurrent workloads, causing lock contention in the coordinator's backend states when multiple fragment instances send them at the same time. In addition, due to the lack of coordination between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before another fragment instance manages to finish Prepare(), leading to missing profiles for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread) when starting fragment instances. With this change, the status reporting threads are also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/control-service.cc M be/src/service/control-service.h M common/protobuf/control_service.proto M common/thrift/RuntimeProfile.thrift M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test A testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test D testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test M testdata/workloads/functional-query/queries/QueryTest/udf.test M tests/query_test/test_udfs.py 21 files changed, 419 insertions(+), 473 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/6 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#20). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 65 files changed, 1,299 insertions(+), 769 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/20 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 20 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 17 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 21:44:27 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Download 3.0.1
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11766 ) Change subject: Download 3.0.1 .. Patch Set 1: Code-Review+2 Thanks for doing it. -- To view, visit http://gerrit.cloudera.org:8080/11766 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I044f119788e5c228bf41ed241ae6d23e5eb00a25 Gerrit-Change-Number: 11766 Gerrit-PatchSet: 1 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 24 Oct 2018 17:56:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: Code-Review+2 (2 comments) LGTM. Two minor comments which need to be addressed. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h@40 PS16, Line 40: nit: blank space http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818 PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))), : r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x)) : select * from r > If you plan to keep this test case, this can be simplified as: Should this be addressed ? -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 06:41:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7351: Add estimates to Exchange node
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11692 ) Change subject: IMPALA-7351: Add estimates to Exchange node .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java: http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@192 PS2, Line 192: There is also a deferred rpc queue which queues at max : // one rpc payload (containing the rowbatch) per sender in-case the row : // batch queue hits the previously mentioned soft limit In the long run, we should consider either of the followings: - enable spilling in exchange node - throttle the sender until space opens up on exchange node. May waste a bit of network bandwidth in some cases due to resend. I suppose either of the above may eventually get the memory consumption of an exchange node under control. http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@214 PS2, Line 214: // TODO: will this be different for a broadcast exchange? Yes, for broadcasting, each exchange node will get the aggregate of all senders' produced row batches. May be worth having that distinction here. Also, does the following line make assumption that there is no skew in distribution during data shuffling ? May be worth documenting that. http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@223 PS2, Line 223: cardinality_ How is this different from getCardinality() ? http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/11692/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@74 PS2, Line 74: nit: extra blank line -- To view, visit http://gerrit.cloudera.org:8080/11692 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b577f9511abc48b992e814d50bba4959f23f7fd Gerrit-Change-Number: 11692 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 23 Oct 2018 00:56:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#19). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 65 files changed, 1,298 insertions(+), 769 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/19 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 19 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/18/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/18/be/src/runtime/query-state.cc@363 PS18, Line 363: ERROR > I'd use DFATAL so that it would at least crash in debug builds. (DFATAL tur Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 18 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sat, 20 Oct 2018 01:52:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 19: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 19 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sat, 20 Oct 2018 01:52:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (3 comments) Hey sorry for not getting back earlier. Bogged down again with some users' issues which I have to help with. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1384 PS16, Line 1384: (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T Please see comments in joins.test http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@801 PS16, Line 801: (VALUES((1.6 x, 0 y), (3.2, 1), (5.4,2), (0.5, 3), (0.5, 4), (-0.5, 5))) XX), The problem with this kind of query with few number VALUES() is that codegen will be disabled as the planner knows the number of rows will be small. I think it may make sense to have another test cases to scan some sizable table. This is also a good test case to keep as this exercises the interpretation path. Of course, one can also set the query option DISABLE_CODEGEN_ROWS_THRESHOLD to a small value but it seems better to have a more realistic test query with scans in there instead of joining two union nodes of constants. You can check the query profile to see codegen is enabled in the HASH JOIN node. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818 PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))), : r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x)) : select * from r If you plan to keep this test case, this can be simplified as: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))) select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x) -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 19 Oct 2018 18:18:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 ) Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11650 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29 Gerrit-Change-Number: 11650 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Thu, 18 Oct 2018 05:49:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 ) Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter .. Patch Set 3: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11650/3//COMMIT_MSG@29 PS3, Line 29: : MemPool::Allocate. Can you please consider adding this query as a test as a regression test ? Although it may not be reproducible as standalone, it will still exercise the path in question. Since it's also timing dependent, it may be more readily reproducible in our regular testing environment where we run multiple queries concurrently. http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc File be/src/util/min-max-filter-test.cc: http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter-test.cc@355 PS3, Line 355: EXPECT_EQ(TimestampValue::FromTColumnValue(tFilter2.max), t3); Should we call Close() on the filters above too ? http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/11650/3/be/src/util/min-max-filter.h@188 PS3, Line 188: MemPool mem_pool_; As mentioned previously, please add a comment for this. -- To view, visit http://gerrit.cloudera.org:8080/11650 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29 Gerrit-Change-Number: 11650 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Thu, 18 Oct 2018 02:34:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7351: Add estimates to Exchange node
Michael Ho has removed Michael Ho from this change. ( http://gerrit.cloudera.org:8080/11692 ) Change subject: IMPALA-7351: Add estimates to Exchange node .. Removed reviewer Michael Ho. -- To view, visit http://gerrit.cloudera.org:8080/11692 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I5b577f9511abc48b992e814d50bba4959f23f7fd Gerrit-Change-Number: 11692 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT > This test case is here in response to your comment: "It may warrant a test Sorry, I wasn't clear in my comment. I meant to say a test case which exercise the join condition on Nan <=> Nan out of caution. hash-table.cc logic is used both by join and aggregation. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 01:51:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/11615/5/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/5/be/src/runtime/query-state.cc@495 PS5, Line 495: error: : // This point is reached if there were general errors to start query fragment instances. : // Wait for all running fragment instances to finish preparing and report status to the : // coordinator to start query cancellation. > Since we are changing some error handling for thread creation, a way to tes Thanks for pointing that out. We already have a test to exercise the thread creation failure here (https://github.com/apache/impala/blob/master/tests/failure/test_failpoints.py#L159-L173) but I think it's also a good idea to exercise the paths stress option you mentioned. Thanks for the suggestion. -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 00:31:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT This is not the right test to exercise join code. You can check the query plan but it doesn't have a join in it. You can check the query plan by doing explain of the query in Impala shell and see if it matches your expectation. You can do a self-join like the following. Also, this test belongs to joins.test. with y as (select t1.float_col as v from functional.alltypessmall t1, functional.alltypessmall t2 where sqrt(3.5-t1.float_col) <=> sqrt(3.5-t2.float_col)) select count(*) from y where is_nan(v); -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 00:21:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Revert "IMPALA-6910/IMPALA-7070: Increase log level for HDFS S3 code"
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11699 ) Change subject: Revert "IMPALA-6910/IMPALA-7070: Increase log level for HDFS S3 code" .. Patch Set 2: Lars, what's the status of this patch ? Should it be abandoned ? -- To view, visit http://gerrit.cloudera.org:8080/11699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5ab7f2f6317f1281928af5ea6cf3fd7d0c6e0a09 Gerrit-Change-Number: 11699 Gerrit-PatchSet: 2 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Tue, 16 Oct 2018 23:41:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 ) Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11650/2//COMMIT_MSG@26 PS2, Line 26: I have been unable to repro the actual crash despite trying a large : variety of different things. However, with additional logging added : its clear that the MemPool is being used concurrently, which is : incorrect. Can you reproduce the problem now with DFAKE_MUTEX ? http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/11650/2/be/CMakeLists.txt@234 PS2, Line 234: SET(CLANG_IR_CXX_FLAGS "${CLANG_IR_CXX_FLAGS}" "-DNDEBUG") I think this makes sense. I wonder what the historical context for not doing it in the first place. I checked the code on some older releases from more than 4 years ago and it was there all along. May be the codegen time was too slow back then to include all the debug code ? How much slow down (if any) did you notice in debug builds with this change ? http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h File be/src/runtime/mem-pool.h: http://gerrit.cloudera.org:8080/#/c/11650/2/be/src/runtime/mem-pool.h@135 PS2, Line 135: DFAKE_SCOPED_LOCK(mutex_); nit: Feel free to ignore but it seems easier to just add DFAKE_SCOPED_LOCK() directly in Allocate(). That said, the current code also looks fine as-is. -- To view, visit http://gerrit.cloudera.org:8080/11650 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29 Gerrit-Change-Number: 11650 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 16 Oct 2018 23:40:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/fragment-instance-state.cc@444 PS3, Line 444: if (fragment_ctx_.fragment.output_sink.type != TDataSinkType::PLAN_ROOT_SINK) { : // if we haven't already release this thread token in Prepare(), release it now : ReleaseThreadToken(); : } > this is called at only one site and now does only one thing, should we just This still seems like a conceptual step in the fragment life cycle so it makes sense to encapsulate it as a function. http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/query-exec-mgr.h@78 PS3, Line 78: an instances hit : /// an error or > nit: unless an instance hits an error they are cancelled Done http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/query-state.cc@397 PS3, Line 397: void QueryState::ErrorDuringExecute(const Status& status, const TUniqueId& finst_id) { > previous patch had a racy check on the status, was there any benefit that e It's okay to skip the racy call for now until evidence suggests otherwise. It's not a big deal if multiple fragment threads have to block when setting the error status. We don't expect this to be a common case so it's better to keep the code simpler and less error prone. http://gerrit.cloudera.org:8080/#/c/11615/3/be/src/runtime/query-state.cc@509 PS3, Line 509: } > nit: retain this comment here from previously at L506: Done -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 16 Oct 2018 00:47:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Hello Thomas Marshall, Tim Armstrong, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11615 to look at the new patch set (#5). Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly concurrent workloads, causing lock contention in the coordinator's backend states when multiple fragment instances send them at the same time. In addition, due to the lack of coordination between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before another fragment instance manages to finish Prepare(), leading to missing profiles for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread) when starting fragment instances. With this change, the status reporting threads are also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/control-service.cc M be/src/service/control-service.h M common/protobuf/control_service.proto M common/thrift/RuntimeProfile.thrift M testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test M testdata/workloads/functional-query/queries/QueryTest/udf.test 18 files changed, 389 insertions(+), 436 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/5 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY > This test should cover that... it compares NaN values and considers them t Yes I think t1.c1 = t2.c1 is not the same as t1.c1 <=> t2.c1; Since your patch touches some logic in the hash table which indirectly affects <=> comparison case, it may be worth double checking nothing unexpected happens. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 21:13:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY Still missing a test case for nan <=> nan being false. Please see earlier comments. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 18:35:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 18: (8 comments) http://gerrit.cloudera.org:8080/#/c/10855/17//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10855/17//COMMIT_MSG@40 PS17, Line 40: 2. Added some targeted test cases for profile serialization failure > nit: long line Done http://gerrit.cloudera.org:8080/#/c/10855/17/be/src/rpc/rpc-mgr.inline.h File be/src/rpc/rpc-mgr.inline.h: http://gerrit.cloudera.org:8080/#/c/10855/17/be/src/rpc/rpc-mgr.inline.h@35 PS17, Line 35: std::unique_ptr* proxy) { > hrm, it still feels like we are going through a lot of gymnastics to avoid OK. Reverted to sharing the connection for now in this patch. The kudu patch (https://gerrit.cloudera.org/#/c/11681/) is already out for review. http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/rpc/thrift-util-test.cc File be/src/rpc/thrift-util-test.cc: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/rpc/thrift-util-test.cc@58 PS14, Line 58: > Should we add a test for SerializeToString as well? Done http://gerrit.cloudera.org:8080/#/c/10855/17/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/17/be/src/runtime/query-state.cc@356 PS17, Line 356: > should we handle errors like these and log it instead? Yes, while Todd previously commented that this should be a CHECK(), I think it's better to not crash Impala on a non-fatal issue. Admittedly, the system is most likely in a bad state if this fails but not being able to send the profile shouldn't be fatal either. http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/service/control-service.h@52 PS16, Line 52: e* > nit: with Done http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/testutil/in-process-servers.cc File be/src/testutil/in-process-servers.cc: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/testutil/in-process-servers.cc@51 PS14, Line 51: // Thrift server c > can you document the reason here? it was not clear to me as to why only thi Done http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/util/runtime-profile.cc@251 PS14, Line 251: if (UNLIKELY(nodes.size()) == 0) return; > does this only happen when thrift de-serialization fails? Or serialization failure on the executor side. http://gerrit.cloudera.org:8080/#/c/10855/14/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/14/common/protobuf/control_service.proto@27 PS14, Line 27: message ParquetDmlStatsPB { > nit: maybe retain the comment in thrift file Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 18 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sun, 14 Oct 2018 02:21:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 18: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 18 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sun, 14 Oct 2018 02:21:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.cc@243 PS2, Line 243: backend_exec_state_ = cur_state == BackendExecState::PREPARING ? : BackendExecState::EXECUTING : BackendExecState::FINISHED; > Sorry, to be clearer - I was suggesting leaving the logic around 'overall_s There is one already at line 523. -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sun, 14 Oct 2018 02:21:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Hello Thomas Marshall, Tim Armstrong, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11615 to look at the new patch set (#4). Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly concurrent workloads, causing lock contention in the coordinator's backend states as each fragment instance tries to them at the same time. In addition, due to the lack of coordination between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before another fragment instance manages to finish calling Prepare(), leading to missing profiles for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread) when starting fragment instances. With this change, the status reporting threads are also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/control-service.cc M be/src/service/control-service.h M common/protobuf/control_service.proto M common/thrift/RuntimeProfile.thrift M testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test M testdata/workloads/functional-query/queries/QueryTest/udf.test 18 files changed, 387 insertions(+), 436 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/4 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#18). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 65 files changed, 1,298 insertions(+), 769 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/18 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 18 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25 PS9, Line 25: group by Z > The new test case still seems to be group-by a single column. May be you ca This is not yet addressed http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS11, Line 12: GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), sqrt(0.5-x) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T : group by Z : RESULTS : 3, NaN : TYPES : bigint, double : : QUERY : # GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T : group by Z order by Z : RESULTS : 3, NaN : 2, 0 : 1, 1 : TYPES : bigint, float : : QUERY > These tests are testing the behavior of group-by so they should belong to a This comment is not yet addressed -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 13 Oct 2018 04:42:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature flag"
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11671 ) Change subject: IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature flag" .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11671 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf0a64d6ec747275e3ecd6e801e054f81095591a Gerrit-Change-Number: 11671 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Sat, 13 Oct 2018 01:00:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature flag"
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11671 ) Change subject: IMPALA-7704: Revert "IMPALA-7644: Hide Parquet page index writing with feature flag" .. Patch Set 1: Is this a clean revert ? -- To view, visit http://gerrit.cloudera.org:8080/11671 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf0a64d6ec747275e3ecd6e801e054f81095591a Gerrit-Change-Number: 11671 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Sat, 13 Oct 2018 00:40:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 16: (3 comments) http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/rpc/rpc-mgr.h@146 PS16, Line 146: service_name > admittedly it's clever that you were able to hack the username in order to I didn't realize the existence of static_service_name(). Thanks for pointing that out. I added a new template parameter S for this as the static_service_name or service_name doesn't seem available from the proxy. I kind of implemented your suggestion for enum in this patch by defining GetProxy() in DataStreamService and ControlService respectively. http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/runtime/coordinator-backend-state.cc@506 PS16, Line 506: { > why is this extra indentation block here? doesn't seem like there is any RA Fixed. Forgot to revert the change after undoing the move for exec_summary_->lock http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/10855/16/be/src/runtime/fragment-instance-state.h@107 PS16, Line 107: const > nit: const non-reference parameters dont make much sense Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 16 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 12 Oct 2018 06:12:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 66 files changed, 1,269 insertions(+), 771 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/17 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 17 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: (8 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748 PS9, Line 748: llvm::Value* cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); : if (!inclusive_equality) return cmp_raw; > I need to generate that comparison in either case, so if I followed you sug Good point. Missed that it's used twice. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: > There is no "inclusive_equality" flag here. The restriction to probe-only The probe-only behavior here doesn't match the behavior of the non-codegen'd version of the code (i.e. EvalRow which is called by EvalBuildRow() and EvalProbeRow()) which unconditionally converts Nan to its canonical form. http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@237 PS11, Line 237: INCLUSIVE_EQUALITY This change in the logic of hash table may have implication to Join too. It may warrant a test case to verify the behavior of Nan <=> Nan (which should evaluate to false isn't changed after this patch in this file: https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/joins.test Please see the section "Handling NULLs in Join Columns" here (https://www.cloudera.com/documentation/enterprise/latest/topics/impala_joins.html) for background. Can you please also verify the behavior of join on columns with Nan value doesn't change after this patch ? create table nan_test (c1 float, c2 float); insert into nan_test select cast(sqrt(0.5-x) as FLOAT), cast(sqrt(3.5-y) as FLOAT) from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T; select t1.c1,t2.c1 from nan_test t1 right outer join nan_test t2 on t1.c1 = t2.c1; http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@256 PS11, Line 256: val_is_ambiguous nit: val_is_nan http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@257 PS11, Line 257: local_is_ambiguous nit: local_is_nan http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h@40 PS11, Line 40: nit: extra space http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25 PS9, Line 25: group by Z > Done The new test case still seems to be group-by a single column. May be you can try something like: select count(*), cast(sqrt(0.5-x) as FLOAT) as Z, cast(sqrt(0.5+x) as FLOAT) as Y from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T group by Y,Z; http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS11, Line 12: GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), sqrt(0.5-x) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T : group by Z : RESULTS : 3, NaN : TYPES : bigint, double : : QUERY : # GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T : group by Z order by Z : RESULTS : 3, NaN : 2, 0 : 1, 1 : TYPES : bigint, float : : QUERY These tests are testing the behavior of group-by so they should belong to aggregation.test -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski
[Impala-ASF-CR] IMPALA-7677: Fix DCHECK failure in GroupingAggregator
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11626 ) Change subject: IMPALA-7677: Fix DCHECK failure in GroupingAggregator .. Patch Set 2: Code-Review+1 (1 comment) Not super familiar with the code but the fix makes sense to me. Please feel free to get one more review or I can +2 it too. http://gerrit.cloudera.org:8080/#/c/11626/2/be/src/exec/streaming-aggregation-node.cc File be/src/exec/streaming-aggregation-node.cc: http://gerrit.cloudera.org:8080/#/c/11626/2/be/src/exec/streaming-aggregation-node.cc@176 PS2, Line 176: child_batch_.reset(); Not necessary this change but I wonder if this line is necessary given that we usually do this in Close() ? -- To view, visit http://gerrit.cloudera.org:8080/11626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I851007a60472d0e53081c076c863c866c516677c Gerrit-Change-Number: 11626 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 11 Oct 2018 07:43:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 ) Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h File be/src/runtime/mem-pool.h: http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260 PS1, Line 260: uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept { > After a first pass, it looks like this isn't our only mis-use of MemPools ( Good to hear that DFAKE_SCOPED_LOCK() is catching the problem. How many other different instances are we talking about here ? My preference would be to fix all of them in this patch as there is a workaround for the bug in the min/max filter. Of course, if fixing some of those new cases requires a lot of refactoring or if there are too many places which suffer from this problem (which would be surprising), we can consider deferring the adding of DFAKE_SCOPED_LOCK() here. -- To view, visit http://gerrit.cloudera.org:8080/11650 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29 Gerrit-Change-Number: 11650 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 10 Oct 2018 23:12:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.h@185 PS2, Line 185: status reports periodically : /// to > nit: status reports periodically to the Done http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.h@428 PS2, Line 428: Should only be called when : /// the current state is a non-terminal state. T > A little confusing - its required to currently be in a non-terminal state w Done http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.cc@227 PS2, Line 227: { > any reason to do this here, when its only used within the lock below? Done http://gerrit.cloudera.org:8080/#/c/11615/2/be/src/runtime/query-state.cc@243 PS2, Line 243: backend_exec_state_ = cur_state == BackendExecState::PREPARING ? : BackendExecState::EXECUTING : BackendExecState::FINISHED; > This feels a little bit weird - would it work to make this more explicit by The state transition logic is determined by 'overall_status_'. So, making UpdateBackendExecState() take a param of the next state means moving the logic encoded in the if-stmt above elsewhere and UpdateBackendExecState() simply becomes a setter of backend_exec_state_ with some DCHECKs(). I can see how it may be slightly clearer for the transition from PREPARING state to the next legal state but it may make the code slightly more hairy at the multiple call sites. Will it be clearer if we update the DCHECK at line 231 to make it more explicit which valid states we could be at so it's easier to reason about the state transition logic ? http://gerrit.cloudera.org:8080/#/c/11615/2/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/11615/2/common/protobuf/control_service.proto@150 PS2, Line 150: // The fragment instance id of the first failed fragment instance. > Maybe worth making it more explicit that the reason we chose the 'first' he It's not set for "general" error. Comments clarified to update its relationship with "overall_status". -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 10 Oct 2018 21:57:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#16). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 66 files changed, 1,287 insertions(+), 787 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/16 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 16 Gerrit-Owner: Michael Ho Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/15/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/15/common/protobuf/control_service.proto@113 PS15, Line 113: int32 int64 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 15 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 10 Oct 2018 20:48:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7272: Fix crash in StringMinMaxFilter
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11650 ) Change subject: IMPALA-7272: Fix crash in StringMinMaxFilter .. Patch Set 1: (3 comments) The fix makes sense to me. Can you please try the DFAKE_SCOPED_LOCK() suggestion and use a query with multiple joins inside the same fragment to see if you can trigger the crash ? http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h File be/src/runtime/mem-pool.h: http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/runtime/mem-pool.h@260 PS1, Line 260: uint8_t* ALWAYS_INLINE Allocate(int64_t size, int alignment) noexcept { May benefit from DFAKE_SCOPED_LOCK(). See https://gerrit.cloudera.org/#/c/10855/15/be/src/runtime/fragment-instance-state.cc@396 Same for other functions. http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@80 PS1, Line 80: /// Returns a new MinMaxFilter with the given type, allocated from 'pool'. Please add a comment for 'mem_tracker' http://gerrit.cloudera.org:8080/#/c/11650/1/be/src/util/min-max-filter.h@186 PS1, Line 186: MemPool mem_pool_; Please add a comment for it. -- To view, visit http://gerrit.cloudera.org:8080/11650 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I751cad7e6b75c9d95d7ad029bbd1e52fe09e8a29 Gerrit-Change-Number: 11650 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 10 Oct 2018 19:23:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11615/2/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/11615/2/common/protobuf/control_service.proto@113 PS2, Line 113: int32 int64. Will update in PS3. -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 10 Oct 2018 18:21:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: IsAmbiguous > The ambiguity is intentional so as to not define float/double-specific mech I don't see the point of keeping this ambiguity unless you can come up with reasonable examples which are applicable to other types. The general rule of operation is that we try not to generalize things too much until there are needs for it. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 9 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 10 Oct 2018 17:52:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#15). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. To avoid unnecessary delays due to sharing the network connections between DataStream service and Control service, this change added the service name as part of the user credentials for the ConnectionId so each service will use a separate connection. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 66 files changed, 1,287 insertions(+), 787 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/15 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 15 Gerrit-Owner: Michael Ho Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 15: (6 comments) http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/coordinator-backend-state.cc@275 PS14, Line 275: // Hold the exec_summary's lock to avoid exposing it half-way through > Can we document the lock order in coordinator.h (which already references E Done http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/fragment-instance-state.h@177 PS14, Line 177: Monotonically > Typo: Monotonically Done http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/fragment-instance-state.h@179 PS14, Line 179: int64_t report_seq_no_ = 0; > I imagine you already thought about this and concluded that overflows were Done http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/testutil/in-process-servers.cc File be/src/testutil/in-process-servers.cc: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/testutil/in-process-servers.cc@51 PS14, Line 51: FLAGS_krpc_port = > I guess we can't set this to 0 to automatically choose an ephemeral port? W Some tests were failing without this. I don't recall which one. Apparently, something is sourcing FLAGS_krpc_port so we have to set it to be consistent. http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/testutil/in-process-servers.cc@51 PS14, Line 51: ; > Extra semicolon Done http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto@50 PS13, Line 50: } > Yeah, I think the rename makes sense to do, though not a big deal, and othe Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 15 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 10 Oct 2018 08:11:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Hello Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11615 to look at the new patch set (#2). Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly concurrent workloads, causing lock contention in the coordinator's backend states as each fragment instance tries to them at the same time. In addition, due to the lack of coordination between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before another fragment instance manages to finish calling Prepare(), leading to missing profiles for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread) when starting fragment instances. With this change, the status reporting threads are also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/control-service.cc M common/protobuf/control_service.proto M common/thrift/RuntimeProfile.thrift M testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test M testdata/workloads/functional-query/queries/QueryTest/udf.test 17 files changed, 375 insertions(+), 426 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/2 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@93 PS1, Line 93: /// Called periodically to get the current status of this fragment instance. > Since we're describing the usage pattern, might as well mention which threa Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@159 PS1, Line 159: bool final_report_sent_ = false; > What's the thread safety story here? Added some comments. This is exclusively accessed by the query state thread only. http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@97 PS1, Line 97: intance > instance Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@239 PS1, Line 239: > nit: unnecessary blank line Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@243 PS1, Line 243: void FragmentInstanceState::GetStatusReport(FragmentInstanceExecStatusPB* instance_status, > nit: could probably reduce vertical whitespace in this function. Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc@128 PS1, Line 128: void QueryExecMgr::StartQueryHelper(QueryState* qs) { > I feel like we should rename this function since it now runs until the quer Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@353 PS1, Line 353: /// 'overall_status_' will be set once this is unblocked and so will 'failed_instance_id_' > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@418 PS1, Line 418: bool IsStatusSet() const { > Maybe something like "HasErrorStatus()". It's confusing that setting overal Done http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@419 PS1, Line 419: return !overall_status_.ok() && !overall_status_.IsCancelled(); > This is called without holding status_lock_, right? I don't think IsCancell Good point. I think the optimization doesn't really matter that much given it's only exercised in the error path. It's okay to have a minor bit of lock contention if multiple fragment instances hit errors at the same time. http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@a553 PS1, Line 553: : The deletion of this line warrants some explanation: In the old code, if there is any error returned from fis->Exec(), the final report for "fis" would have been sent already so the coordinator will already record the failure of "fis" before the cancellation is issued. In the new code, the reporting is done periodically by the query state thread so the final report may or may not be sent at this point after "fis" hit an error above. Calling Cancel() here may actually cause the coordinator fragment (if there is one) to prematurely set the backend status at the coordinator, causing Coordinator::BackendState::ApplyExecStatusReport() to ignore subsequent updates from "fis". As a result, the actual error status was masked. The implicit contract in the code (even before this change) is that the final report of the first fragment instance to hit an (non-cancellation) error must have been sent before initiating the cancellation. Will add a comment here. http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@236 PS1, Line 236: if (overall_status_.IsCancelled()) { > Can we refine this with the internal/client-driven cancellation distinction Sorry, I don't quite get this comment. In either cases, won't the state still transit to BackendExecState::CANCELLED ? -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG@16 PS4, Line 16: This patch also introduces a new service pool for all query execution : control related RPCs in the future so that control commands from : coordinators aren't blocked by long-running DataStream services' RPCs. > Thanks for the suggestion. I think it's fine to do it as a follow-up or par Actually, after IMPALA-7585, we can achieve this by just setting a different username in the UserCredentials for different services. No change in Kudu code is needed. http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/10855/14/be/src/runtime/fragment-instance-state.h@179 PS14, Line 179: /// Event sequence tracking the completion of various stages of this fragment instance. > I imagine you already thought about this and concluded that overflows were Good idea. With 64-bit and 1 millisecond reporting interval, it will take about 292471208 years for it to overflow. With 32-bit values, it takes about 24 days. I believe no legitimate query will run for 24 days but then I guess using 64-bit provides the peace of mind. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 09 Oct 2018 17:55:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-7555: Set socket timeout in impala-shell
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11540 ) Change subject: [WIP] IMPALA-7555: Set socket timeout in impala-shell .. Patch Set 5: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/11540/5/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/11540/5/shell/impala_client.py@255 PS5, Line 255: # Setting a timeout of None disables timeouts on socket operations This comment can be moved to above line 264. http://gerrit.cloudera.org:8080/#/c/11540/5/shell/impala_client.py@258 PS5, Line 258: else: : sock.setTimeout(None) Given sock is just returned from self._get_socket_and_transport(), I don't think this line is needed. http://gerrit.cloudera.org:8080/#/c/11540/5/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/11540/5/shell/option_parser.py@227 PS5, Line 227: " if it fails to connect to Impala server") Set to 0 to disable any timeout. -- To view, visit http://gerrit.cloudera.org:8080/11540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Gerrit-Change-Number: 11540 Gerrit-PatchSet: 5 Gerrit-Owner: Anuj Phadke Gerrit-Reviewer: Anuj Phadke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 09 Oct 2018 01:42:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/coordinator-backend-state.cc@277 PS1, Line 277: lock_guard l1(exec_summary->lock); > Isn't this just reverting the change that the previous patch made? Yes. Will revert that change in the previous patch and add a comment instead. -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 09 Oct 2018 01:21:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 9: (25 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236 PS9, Line 236: ensure ensures http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239 PS9, Line 239: /// value of the value. May want to clarify that this is a no-op for types other than FLOAT and DOUBLE or if the value itself is not a Nan to begin with. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240 PS9, Line 240: ConvertToCanonicalForm ConvertNanToCanonicalForm(); http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249 PS9, Line 249: inclusive_equality Warrants a quick comment. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476 PS9, Line 476: llvm::Value * llvm::Value* . Same below. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477 PS9, Line 477: NULL Seems unnecessary to initialize this value here as it's set in all paths which can be taken below, http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483 PS9, Line 483: cond "is_nan" is a more fitting name. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487 PS9, Line 487: default: : ; May as well remove these lines. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747 PS9, Line 747: llvm::Value * llvm::Value* . Same below. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748 PS9, Line 748: llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); : if (!inclusive_equality) return cmp_raw; nit: seems slightly easier to follow: if (!inclusive_equality) { return builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); } http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464 PS9, Line 464: broadens the meaning of equality to null means "NULL == NULL" and "Nan == Nan" http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198 PS9, Line 198: const ColumnType & nit: const ColumnType& expr_type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@253 PS9, Line 253: const ColumnType & nit: const Columntype& expr_type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257 PS9, Line 257: loc huh ? Did you mean val ? Also, the entire thing seems easier to parse if it's broken down: if (!RawValue::Eq(loc, val, expr_type)) { if (INCLUSIVE_EQUALITY) { // Check for the case of Nan == Nan bool loc_is_nan = RawValue::IsNan(loc, expr_type) ; bool val_is_nan = RawValue::IsNan(val, expr_type); if (loc_is_nan && val_is_nan) continue; } return false; } http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: if (!build) { Don't we need to check for inclusive_equality flag here ? Also, why is the conversion to canonical form limited to probe only ? (i.e. !build) ? http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149 PS9, Line 1149: if (inclusive_equality) { : result.ConvertToCanonicalForm(); : } nit: one line http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179 PS9, Line 1179: nit: extra blank line http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40 PS9, Line 40: static const double CANONICAL_DOUBLE_NAN; : static const float CANONICAL_FLOAT_NAN; These deserve some comments on their purposes. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: IsAmbiguous IsNan() seems to be a more appropriate name. The name and the description of this function is a bit ambiguous (punt unintended). The comment seems clearer to just say: Returns true if 'val' is of type float or double and it has a Nan value. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112: const ColumnType & nit: const ColumnType& type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112:
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11615 ) Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11615/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11615/1//COMMIT_MSG@12 PS1, Line 12: conurrent concurrent http://gerrit.cloudera.org:8080/#/c/11615/1//COMMIT_MSG@15 PS1, Line 15: coordinator coordination -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 08 Oct 2018 22:24:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-4063
Michael Ho has abandoned this change. ( http://gerrit.cloudera.org:8080/11185 ) Change subject: WIP: IMPALA-4063 .. Abandoned New patch is at https://gerrit.cloudera.org/#/c/11615/ -- To view, visit http://gerrit.cloudera.org:8080/11185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I82dd6d6014f5219550082c3cc59e88bfb7d84ef8 Gerrit-Change-Number: 11185 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11615 Change subject: IMPALA-4063: Merge report of query fragment instances per executor .. IMPALA-4063: Merge report of query fragment instances per executor Previously, each fragment instance executing on an executor will independently report its status to the coordinator periodically. This creates a huge amount of RPCs to the coordinator under highly conurrent workloads, causing lock contention in the coordinator's backend states as each fragment instance independently tries to update the coordinator backend states. In addition, due to the lack of coordinator between query fragment instances, a query may end without collecting the profiles from all fragment instances when one of them hits an error before some other fragment instance manages to finish calling Prepare(), leading to missing profile for certain fragment instances. This change fixes the problem above by making a thread per QueryState (started by QueryExecMgr) to be responsible for periodically reporting the status and profiles of all fragment instances of a query running on a backend. As part of this refactoring, each query fragment instance will not report their errors individually. Instead, there is a cumulative status maintained per QueryState. It's set to the error status of the first fragment instance which hits an error or any general error (e.g. failure to start a thread). With this change, the status reporting thread is also removed. Testing done: exhaustive tests This patch is based on a patch by Sailesh Mukil Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/control-service.cc M common/protobuf/control_service.proto M common/thrift/RuntimeProfile.thrift M testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test M testdata/workloads/functional-query/queries/QueryTest/udf.test 16 files changed, 377 insertions(+), 420 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/1 -- To view, visit http://gerrit.cloudera.org:8080/11615 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa Gerrit-Change-Number: 11615 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 14: (8 comments) http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/runtime/coordinator-backend-state.h@152 PS13, Line 152: /// Updates 'this' with exec_status and the fragment intance's thrift prof > comment is out of date Done http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/runtime/coordinator-backend-state.cc@523 PS13, Line 523: if (rows_counter != nullptr) { > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/runtime/coordinator-backend-state.cc@530 PS13, Line 530: } > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/service/control-service.cc@42 PS13, Line 42: QUEUE_LIMIT_MSG > Looks like a negative value here gives no limit? Maybe worth mentioning. Thanks for pointing that out. The handling of negative value is fixed in new PS. ParseUtil::ParseMemSpec() already distinguishes the input value between absolute value and percentage. If the input is percentage, it will compute the memory limit based on the reference value passed to ParseMemSpec(). http://gerrit.cloudera.org:8080/#/c/10855/13/be/src/service/control-service.cc@46 PS13, Line 46: if left at default value 0 > or negative Wording fixed. http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto@50 PS13, Line 50: optional KuduDmlStatsPB kudu_stats = 3; > is this, and elsewhere, referring to anything other than renaming, eg. to D Looking at (https://gerrit.cloudera.org/#/c/4985/1/common/thrift/ImpalaInternalService.thrift@430), it seems to suggest more than just renaming the struct. It also involves some sort of refactoring. That said, if we don't really know the meaning of this TODO statement, we can drop it too. What's your take ? http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto@107 PS13, Line 107: : FIRST_BATCH_PRODUCED > is this true if an error is encountered? Copied and pasted from Thrift Implementation. Fixed. http://gerrit.cloudera.org:8080/#/c/10855/13/common/protobuf/control_service.proto@154 PS13, Line 154: individual fra > I guess this and elsewhere was copied from the original thrift definition. I am not 100% sure about the historical context but I believe the rationale behind all the RPC versioning is that we can have backward compatibility between different versions of Impala (See also https://gerrit.cloudera.org/#/c/6535/4/common/thrift/ImpalaInternalService.thrift@698) That said, I don't think we ever implemented any true support for mixed versions of Impala (which is very useful for rolling upgrade). May be it's fair to drop the version string for now until we commit to supporting mixed version of Impala. In which case, we can take whatever snapshot of the structs then as V1. Before that, having the version string seems like some cruft that we don't need at the moment. Also, I don't quite understand the rationale behind the required fields in some of the existing Thrift structs as that seems to make it hard to change the definition of the struct later. So, for now, I am marking all fields to be optional (and latter version of protobuf actually gets rid of required/optional so everything is optional anyway). -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 14 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 08 Oct 2018 06:14:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 60 files changed, 1,211 insertions(+), 764 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/14 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 14 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has removed Dan Hecht from this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Removed reviewer Dan Hecht. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 13 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has removed Sailesh Mukil from this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Removed reviewer Sailesh Mukil. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 13 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 12 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Fri, 05 Oct 2018 18:34:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Fri, 05 Oct 2018 18:33:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 10: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11425/10/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/10/be/src/util/runtime-profile.cc@41 PS10, Line 41: nit: blank line -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 10 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Fri, 05 Oct 2018 18:24:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.h@481 PS8, Line 481: /// Redaction rules are applied on the info string if 'redact' is true. > Done Did you mean to push a new patch ? Still the same in PS9. http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc@40 PS8, Line 40: #include "gutil/strings/strip.h" > Done Did you mean to push a new patch ? Still the same in PS9. -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 9 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Wed, 03 Oct 2018 21:32:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-7555: Set socket timeout in impala-shell
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11540 ) Change subject: [WIP] IMPALA-7555: Set socket timeout in impala-shell .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py@225 PS3, Line 225: "-t", "--client_connect_timeout_ms" > I'm in favor of sticking to 0 to disable it, which is in line with what we That's also a fine option. It seems the current behavior with 0 is not well defined anyway so may as well special case it to disable the timeout switch. -- To view, visit http://gerrit.cloudera.org:8080/11540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Gerrit-Change-Number: 11540 Gerrit-PatchSet: 3 Gerrit-Owner: Anuj Phadke Gerrit-Reviewer: Anuj Phadke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 03 Oct 2018 21:28:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-7555: Set socket timeout in impala-shell
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11540 ) Change subject: [WIP] IMPALA-7555: Set socket timeout in impala-shell .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py@225 PS3, Line 225: "-t", "--client_connect_timeout_ms" > I tried setting it here I suppose one way to disable the timeout is to have a very large timeout value. So, in that sense, it's probably okay to not have a value to disable the timeout. -- To view, visit http://gerrit.cloudera.org:8080/11540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Gerrit-Change-Number: 11540 Gerrit-PatchSet: 3 Gerrit-Owner: Anuj Phadke Gerrit-Reviewer: Anuj Phadke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 03 Oct 2018 21:18:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 8: (1 comment) The new change looks better now. Looks like you may have missed my comments in PS8. Can you please address them ? http://gerrit.cloudera.org:8080/#/c/11425/9/tests/query_test/test_cancellation.py File tests/query_test/test_cancellation.py: http://gerrit.cloudera.org:8080/#/c/11425/9/tests/query_test/test_cancellation.py@193 PS9, Line 193: the close rpc should have succeeded. Not sure why this is needed. -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 8 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Wed, 03 Oct 2018 21:13:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-7555: Set socket timeout in impala-shell
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11540 ) Change subject: [WIP] IMPALA-7555: Set socket timeout in impala-shell .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/11540/4/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/11540/4/shell/option_parser.py@227 PS4, Line 227: connect connect to Impala server. http://gerrit.cloudera.org:8080/#/c/11540/4/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/11540/4/tests/shell/test_shell_commandline.py@749 PS4, Line 749: '-q "select foo; select bar;" -t 2000 ' + '-i localhost:%d' % (test_port) nit: no need for '+': '-q "select foo; select bar;" -t 2000 -i localhost:%d' -- To view, visit http://gerrit.cloudera.org:8080/11540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Gerrit-Change-Number: 11540 Gerrit-PatchSet: 4 Gerrit-Owner: Anuj Phadke Gerrit-Reviewer: Anuj Phadke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 03 Oct 2018 00:43:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 5: (2 comments) I plan to take a look. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@104 PS3, Line 104: van typo http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h File be/src/runtime/raw-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@55 PS3, Line 55: ColumnType & nit: ColumnType& -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 03 Oct 2018 00:05:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc@551 PS5, Line 551: // If there is a newline-like character, remove duplicates > The test for double newlines effectively was a test that info strings aren' Actually, it's possible to get the value of individual field with the Thrift profile. So, it's definitely possible to just extract the "Query Status" field from the final profile. Copying some code from test_observability.py: tree = self.impalad_test_service.get_thrift_profile(query_id) if tree: query_status = tree.nodes[1].info_strings["Query Status"] assert query_status == query_status.rstrip() -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Tue, 02 Oct 2018 19:30:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc@551 PS5, Line 551: // If there is a newline-like character, remove duplicates > That's fine. The only reason the double newline is being stripped is becau I guess a more meaningful test is to verify that all "Query Status" (not just Status: Cancelled) (or in general any info string) don't end with whitespace / newline. I can see either incorporating that verification into some test infrastructure (e.g. somewhere in test_result_verifier.py) or you write a test which triggers different types of query statuses (e.g. OK, mem limit exceeded, scan error, cancelled etc) to verify that they don't end with whitespaces. The former seems to be more future-proof. -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Tue, 02 Oct 2018 17:56:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7585: Always set user credentials after creating RPC proxy
Hello Thomas Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11477 to look at the new patch set (#2). Change subject: IMPALA-7585: Always set user credentials after creating RPC proxy .. IMPALA-7585: Always set user credentials after creating RPC proxy kudu::rpc::Proxy() ctor may fail in GetLoggedInUser() for various reasons (e.g. missing certain libraries). This resulted in an empty username being used in kudu::rpc::ConnectionId. With plaintext SASL (e.g. in an insecure Impala cluster), this may result in the following error during connection negotiation: Not authorized: Client connection negotiation failed: client connection to 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username. In Impala, we don't consider failing GetLoggedInUser() a fatal error. This change fixes the issue above by always explicitly setting the username after creating the proxy. The username is "impala". Please note that this username is not really used anywhere for authorization for RPC services. Authorization is only done when authentication is enabled with Kerberos. With Kerberos enabled, the username is derived from the Kerberos principal instead of the user credentials set in the ConnectionId. It's there mostly to satisfy the SASL plaintext case. rpc-mgr-test has been updated to test for this string when Kerberos is disabled. Testing done: core test; rpc-mgr-test; rpc-mgr-kerberized-test Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6 --- M be/src/exec/kudu-util.h M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.inline.h 3 files changed, 15 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/11477/2 -- To view, visit http://gerrit.cloudera.org:8080/11477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6 Gerrit-Change-Number: 11477 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7585: Always set user credentials after creating RPC proxy
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11477 ) Change subject: IMPALA-7585: Always set user credentials after creating RPC proxy .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11477/1/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: http://gerrit.cloudera.org:8080/#/c/11477/1/be/src/exec/kudu-util.h@32 PS1, Line 32: > why was this necessary? Oops. Meant to include-what-you-use. Replaced with: #include -- To view, visit http://gerrit.cloudera.org:8080/11477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6 Gerrit-Change-Number: 11477 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 02 Oct 2018 03:04:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 8: (6 comments) http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.h@481 PS8, Line 481: /// Redaction rules are applied on the info string if 'redact' is true. Please add a comment that any trailing whitespace in "value" will be stripped before being inserted. http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc@551 PS5, Line 551: info_strings_.emplace(key, std::move(value)); > The test code I added gets tripped up because there the runtime profile inc I applied the patch and looked at the query profile output at the debug webpage. It appears that stripping all "\n\n" (even those embedded inside the info string) is more than what IMPALA-2063 calls for. For one thing, the "\n\n" embedded inside "value" could be there for legitimate formatting reasons. Stripping them here seems to make the profile potentially less legible or at least break the formatting expected by the creator of "value". I would recommend keeping StripTrailingWhiteSpace() above only. http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc@40 PS8, Line 40: #include "gutil/strings/strip.h" nit: #insert is sorted by alphabetical order http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc@541 PS8, Line 541: if (redact) { : Redact(); : } nit: can fit in one line http://gerrit.cloudera.org:8080/#/c/11425/8/be/src/util/runtime-profile.cc@546 PS8, Line 546: StripDupCharacters(, '\n', 0); Please see comments elsewhere. This change in behavior seems to be more than what IMPALA-2063 calls for. It may actually affect the legibility of the profile in an unexpected way. http://gerrit.cloudera.org:8080/#/c/11425/8/tests/query_test/test_cancellation.py File tests/query_test/test_cancellation.py: http://gerrit.cloudera.org:8080/#/c/11425/8/tests/query_test/test_cancellation.py@188 PS8, Line 188: assert "\n\n" not in profile, \ : "Profile for query %s contains redundant newlines." % handle This assert seems boarder than necessary. I am not sure it's legitimate to enforce that no "\n\n" can never be present in the profile string. For instance, one may add a blank line between various sections of the profile for legibility reasons. So, this assert doesn't seem future proof. -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 8 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Comment-Date: Tue, 02 Oct 2018 02:37:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-7555: Set socket timeout in impala-shell
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11540 ) Change subject: [WIP] IMPALA-7555: Set socket timeout in impala-shell .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/11540/3/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/11540/3/shell/impala_client.py@81 PS3, Line 81: float(client_connect_timeout_ms) Any idea what happens if the user passes a negative value ? Also, it seems a bit weird that it's a float. Does int not work ? http://gerrit.cloudera.org:8080/#/c/11540/3/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/11540/3/shell/impala_shell.py@a864 PS3, Line 864: Why was this removed ? http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py File shell/option_parser.py: http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py@225 PS3, Line 225: "-t", "--client_connect_timeout_ms" Do you think it makes sense to have a value (e.g. 0) which allows user to disable the timeout behavior ? Actually, have you confirmed the behavior of setting this to 0 ? Does it timeout all the time ? http://gerrit.cloudera.org:8080/#/c/11540/3/shell/option_parser.py@227 PS3, Line 227: if nit: missing space before "if" http://gerrit.cloudera.org:8080/#/c/11540/3/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/11540/3/tests/shell/test_shell_commandline.py@740 PS3, Line 740: nit: extra whitespace http://gerrit.cloudera.org:8080/#/c/11540/3/tests/shell/test_shell_commandline.py@742 PS3, Line 742: indefinately indefinitely ? -- To view, visit http://gerrit.cloudera.org:8080/11540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813 Gerrit-Change-Number: 11540 Gerrit-PatchSet: 3 Gerrit-Owner: anujphadke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 02 Oct 2018 01:07:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7585: Always set user credentials after creating RPC proxy
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11477 Change subject: IMPALA-7585: Always set user credentials after creating RPC proxy .. IMPALA-7585: Always set user credentials after creating RPC proxy kudu::rpc::Proxy() ctor may fail in GetLoggedInUser() for various reasons (e.g. missing certain libraries). This resulted in an empty username being used in kudu::rpc::ConnectionId. With plaintext SASL (e.g. in an insecure Impala cluster), this may result in the following error during connection negotiation: Not authorized: Client connection negotiation failed: client connection to 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username. In Impala, we don't consider failing GetLoggedInUser() a fatal error. This change fixes the issue above by always explicitly setting the username after creating the proxy. The username is "impala". Please note that this username is not really used anywhere for authorization for RPC services. Authorization is only done when authentication is enabled with Kerberos. With Kerberos enabled, the username is derived from the Kerberos principal instead of the user credentials set in the ConnectionId. It's there mostly to satisfy the SASL plaintext case. rpc-mgr-test has been updated to test for this string when Kerberos is disabled. Testing done: core test; rpc-mgr-test; rpc-mgr-kerberized-test Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6 --- M be/src/exec/kudu-util.h M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.inline.h 3 files changed, 16 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/11477/1 -- To view, visit http://gerrit.cloudera.org:8080/11477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I75059f55bcdb8f95916610100cad4d8280daf3f6 Gerrit-Change-Number: 11477 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 12: (9 comments) http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/coordinator-backend-state.cc@497 PS12, Line 497: last_report_seq_no_ = exec_status.report_seq_no(); > Can you add a CHECK here that it isn't decreasing? It's verified above, but Done. Used a DCHECK instead for documenting the invariant. http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/coordinator-backend-state.cc@508 PS12, Line 508: lock_guard l1(exec_summary->lock); > is this critical section extending farther than it needs to? should it end Fixed. http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/fragment-instance-state.h@128 PS12, Line 128: int32_t ReportSeqNo() { > nit: maybe rename to AdvanceReportSeqNo or NextReportSeqNo or something so Done http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/fragment-instance-state.h@178 PS12, Line 178: DFAKE_MUTEX(report_seq_no_lock_); > I think this fake mutex should be held around the whole function that gener Good point. Moved to SendReport() instead. http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/12/be/src/runtime/query-state.cc@306 PS12, Line 306: REPORT_EXEC_STATUS > hmm, I dont see this one used. did I miss something? Nice catch. I meant to rename this to REPORT_EXEC_STATUS_PROFILE. http://gerrit.cloudera.org:8080/#/c/10855/12/common/protobuf/CMakeLists.txt File common/protobuf/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/10855/12/common/protobuf/CMakeLists.txt@47 PS12, Line 47: KRPC_GENERATE(CONTROL_SVC_PROTO_SRCS CONTROL_SVC_PROTO_HDRS > You could factor out the replication of the protobuf generation by putting Good idea. Adopted something similar to your suggested code. http://gerrit.cloudera.org:8080/#/c/10855/12/common/protobuf/common.proto File common/protobuf/common.proto: http://gerrit.cloudera.org:8080/#/c/10855/12/common/protobuf/common.proto@33 PS12, Line 33: int64 > nit: i think fixed64 is more appropriate for these fields since they dont t Agreed that they don't tend to be small. Does using fixed64 make the encoding slightly faster ? http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py File tests/custom_cluster/test_rpc_timeout.py: http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42 PS12, Line 42: > flake8: E251 unexpected spaces around keyword / parameter equals Done http://gerrit.cloudera.org:8080/#/c/10855/12/tests/custom_cluster/test_rpc_timeout.py@42 PS12, Line 42: > flake8: E251 unexpected spaces around keyword / parameter equals Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 19 Sep 2018 19:01:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht, Michal Ostrowski, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#13). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/CMakeLists.txt M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 60 files changed, 1,212 insertions(+), 758 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/13 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 13 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-2063 Remove newline characters in query status.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 ) Change subject: IMPALA-2063 Remove newline characters in query status. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11425/3/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/3/be/src/util/runtime-profile.cc@547 PS3, Line 547: size_t size = value.size(); : if (size > 0 && value.at(size - 1) == '\n') { : value.resize(size - 1); : } It seems more robust to strip any trailing whitespace by using find_last_not_of(" \t\f\v\n\r") to find the last non-whitespace char and use substr() to extract the part which is valid. Just a suggestion, not dictating the implementation. -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 13 Sep 2018 20:47:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 12: (14 comments) http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h File be/src/exec/hdfs-parquet-table-writer.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h@199 PS10, Line 199: > nit: should #include the appropriate .pb.h here ("include-what-you-use") Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@60 PS10, Line 60: const Coordinator& > I think this can probably change back to being const if you take the sugges Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@176 PS10, Line 176: last_report_ti > nit: I think the term "sequence number" is more usual here -- "version" to Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@220 PS10, Line 220: /// Backend exec params, owned by the QuerySchedule and has query lifetime. > This "back pointer" still seems error-prone to me. I think the object lifet Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@267 PS10, Line 267: return num_remaining_instances_ == 0 || !status_.ok(); > I think a VLOG_QUERY about the skipped RPC is probably useful Done http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@294 PS10, Line 294: DCHECK(!instance_stats->done_); > nit: why not: The ctor was marked explicit so not sure it's allowed: "explicit Status(const StatusPB& status);" http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287 PS10, Line 287: atus = report > Ah, I missed that we join on the reporter thread first. Good idea about using DFAKE_MUTEX(). Also switched to using a non-atomic. Also simplified the logic in Coordinator::BackendState::ApplyExecStatusReport() as we can rely purely on the sequence number as you suggested. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@375 PS10, Line 375: ReportExecStatusResponsePB resp; > should we have a failure injection point on the RPC itself? I only saw fail Please find the tests in test_rpc_timeout.py which: 1. inject delays in the RPC handler to induce timeout 2. run with a very short service queue to emulate a busy server. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@379 PS10, Line 379: reak; > should we backoff? I will refrain from changing the logic here too much. There will be a follow up patch after IMPALA-4063 which will change the retry logic. TODO added. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc@202 PS10, Line 202: } > the method doc says that new_errors is cleared, but it's actually written i This was lost after refactoring this function. Fixed now. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/service/control-service.cc File be/src/service/control-service.cc: PS10: > This is a general krpc-in-Impala question: I can't find where you set up au Very good point. This is definitely a bug and it's now fixed in this commit here (https://github.com/apache/impala/commit/5c541b960491ba91533712144599fb3b6d99521d) http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h File be/src/util/uid-util.h: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h@79 PS10, Line 79: DCHECK(uid_pb.IsInitialized()); > worth DCHECKs here that the fields are set by calling uid_pb.IsInitialized( Done http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh@562 PS10, Line 562: export HBASE_CONF_DIR="$IMPALA_FE_DIR/src/test/resources" > why's this necessary? Can we change cmake to invoke it from the full path i FindProtobuf should have set PROTOBUF_PROTOC_EXECUTABLE. Not sure why I needed to set it before. http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py@97 PS10, Line 97: > can we change this flag to be in millis instead of seconds? Or do we advert I don't
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#12). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: 1. Exhaustive build. 2. Added some targeted test cases for profile serialization failure and RPC retries/timeout. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_timeout.py 59 files changed, 1,155 insertions(+), 695 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/12 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 12 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7517. Fix hang in scanner threads when soft limit is exceeded
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11369 ) Change subject: IMPALA-7517. Fix hang in scanner threads when soft limit is exceeded .. Patch Set 1: Code-Review+2 (1 comment) Thanks for fixing it. This makes sense to me. I checked https://gerrit.cloudera.org/#/c/11103/ and didn't see the same problem in Kudu scan node. Please double check if you haven't done so already. http://gerrit.cloudera.org:8080/#/c/11369/1/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/11369/1/be/src/exec/hdfs-scan-node.cc@440 PS1, Line 440: "HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT").ok())) { > I considered adding this fault injection more globally in MemTracker, but w I think it's reasonable to add it here for now but please feel free to do a follow on change to add it to MemTracker::AnyLimitExceeded(). At the bare minimum, we can enable it in debug builds only. This will definitely help exercise rarely exercised error handling paths in code. -- To view, visit http://gerrit.cloudera.org:8080/11369 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3dc1a2ec79c823575d7d40e7b52216dea5b0ddde Gerrit-Change-Number: 11369 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 31 Aug 2018 02:24:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 7: Code-Review+2 Carry Todd's +2 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 30 Aug 2018 00:31:31 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 7: PS7 adds 2 lines of code in rpc-mgr-kerberized-test.cc to set / clear FLAGS_krb5_ccname. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 30 Aug 2018 00:31:03 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#7). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let authenticated principal which matches FLAGS_principal / FLAGS_be_principal to access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 332 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/7 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: Will merge the patch after more testing. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 22:11:03 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: Carry Sailesh's +1 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 19:09:11 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc@734 PS5, Line 734: return Status(Substitute("Bad --krb5_ccname value: $0 is not an absolute file path", > nit: use the user-facing name --krb5_ccname instead of FLAGS_... which is i Done http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc File be/src/rpc/rpc-mgr-kerberized-test.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc@114 PS5, Line 114: controller.Reset(); : rpc_status = : FromKuduStatus(scan_proxy > So, this is passing only because the Authorize() function for ScanMemServic Yes, I have some comments about it on line 99-100 but yes, I should move them to a location closer to the actual execution of the test to make it clearer. http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc@172 PS5, Line 172: LOG(ERROR) << Substitute("Rejecting unauthorized access to $0 from $1. Expected " > sorry, think I wasn't entirely clear. I think the error message should stil Lol. It's indeed a bit scary to not have the word "Reject" in the error message. Fixed. Thanks. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 19:08:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 19:09:17 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#6). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let authenticated principal which matches FLAGS_principal / FLAGS_be_principal to access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 330 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/6 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > That's fair. Maybe we should add some validation, then, that the path is an Good point. Validation added in AuthManager::InitKerberosEnv() http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc@174 PS4, Line 174: mem_tracker->Release(context->GetTransferSize()); > the log message probably include the 'requestor_string' which includes the Good point. This definitely helps debugging. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84 PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* service_name, > In terms of whether to keep the change in the patch, I figured that since t Fair enough. Reverted the change and left a TODO. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 05:00:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#5). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 328 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/5 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11339 ) Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG@21 PS2, Line 21: (verified via code inspection) > I'd appreciate the code reviewer to verify this as well. Did some manual verification to ensure all resources released in QueryState::ReleaseExecResources() are only referenced by backend executors except for query_mem_tracker_ which is a control structure. http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc@93 PS2, Line 93: set_query_exec_finished(); This is apparently beyond the scope of this change: Does it make sense for some code in MemTracker to DCHECK / CHECK against this value to make sure query_mem_tracker_ is not used after this flag is set ? -- To view, visit http://gerrit.cloudera.org:8080/11339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d Gerrit-Change-Number: 11339 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Aug 2018 01:20:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11339 ) Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out .. Patch Set 2: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@161 PS2, Line 161: ReleaseExecResourceRefcount() ReleaseBackendResourceRefcount() http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@186 PS2, Line 186: void AcquireBackendResourceRefcount(); Should we add a remark that this should not be called from coordinator and cite this JIRA as a problem ? -- To view, visit http://gerrit.cloudera.org:8080/11339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d Gerrit-Change-Number: 11339 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Aug 2018 01:09:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > Should this be hidden? I dont think we'd want to support users changing it, I actually prefer to keep it public so the user can configure it for cases such as KUDU-2545. I will add a remark about its being advanced option. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h@25 PS3, Line 25: #include "rpc/authentication.h" : #include "rpc/impala-service-pool.h" : #include "util/auth-util.h" > are these new includes necessary? I don't see any new usages of types in th Left over from previous patch. Removed. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@172 PS3, Line 172: mem_tracker->Release(context->GetTransferSize()); > slight nit (though I guess this happens elsewhere also): shouldn't we wait Sure. Will do it as a clean-up in a follow-on patch. That said, my understanding is that the reply itself is processed asynchronously. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@173 PS3, Line 173: context->RespondFailure(kudu::Status::NotAuthorized( > should we also log the unauthorized access attempt (perhaps including the e Done http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84 PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* service_name, > I wonder whether we should just be using krb5_parse_name here instead of im When writing this patch, I noticed the previous implementation made two calls split() which seems unnecessary. I fixed it as part of this patch as I was calling it in the previous version of this patch but this is not needed anymore in the new version. It doesn't hurt to keep it though. I am not sure why krb5_parse_name() wasn't used here in the previous implementation. Given the Kudu code has better infrastructure, it seems easier for me to add a utility function to Kudu and call it here instead. Filed IMPALA-7504 for this. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 00:24:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#4). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 315 insertions(+), 146 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/4 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > nit: Maybe krb5_ccpath ? Or krb5ccname_path Seems more intuitive to have a name which resembles the env var KRB5CCNAME. That env variable indicates the location of the credentials cache so it seems a bit redundant to add path to its name as we also don't use krb_conf_path below. Please let me know if you feel strongly about it though. KRB5CCNAME Location of the Kerberos 5 credentials (ticket) cache. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h@216 PS2, Line 216: > Just to add, if we're going to keep the service name fixed, this should pro Variable removed as this is no longer necessary per suggestion to use GetLoggedInUserNameFromKeytab(). http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@118 PS2, Line 118: nused_realm)); > I think this should actually be hardcoded to "impala" or something rather t For all practical purposes, that's true. However, doing so will also require hardcoding part of FLAGS_principal / FLAGS_be_principal as that's what's used for kinit. Hardcoding the principal is almost the same as deprecating FLAGS_principal / FLAGS_be_principal and switching to using a boolean flag FLAGS_kerberos_enabled. This seems to be a larger scope than I feel comfortable with for this change. I need more thought about this as I have to understand the historical context for having FLAGS_principal / FLAGS_be_principal to begin with. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@161 PS2, Line 161: // Authorization is enforced iff Kerberos is enabled. > invert this condition so that you can un-nest the majority of the function. Done http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@164 PS2, Line 164: ername matches that of the kinit'ed principal. : const RemoteUser& remote_user = context->remote_user > sorry I skipped half of what I meant to say in this comment. What I meant i Good point. Done. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h File be/src/testutil/mini-kdc-wrapper.h: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h@43 PS2, Line 43: static Status SetupAndStartMiniKDC(std::string realm, : std::string ticket_lifetime, std::string renew_lifetime, > why these changes? it's better to take the arguments by copy, and then std: Done -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 28 Aug 2018 18:27:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#3). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 311 insertions(+), 142 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/3 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon