[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc@734 PS5, Line 734: return Status(Substitute("Bad --krb5_ccname value: $0 is not an absolute file path", > nit: use the user-facing name --krb5_ccname instead of FLAGS_... which is i Done http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc File be/src/rpc/rpc-mgr-kerberized-test.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc@114 PS5, Line 114: controller.Reset(); : rpc_status = : FromKuduStatus(scan_proxy > So, this is passing only because the Authorize() function for ScanMemServic Yes, I have some comments about it on line 99-100 but yes, I should move them to a location closer to the actual execution of the test to make it clearer. http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc@172 PS5, Line 172: LOG(ERROR) << Substitute("Rejecting unauthorized access to $0 from $1. Expected " > sorry, think I wasn't entirely clear. I think the error message should stil Lol. It's indeed a bit scary to not have the word "Reject" in the error message. Fixed. Thanks. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 19:08:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 19:09:17 + Gerrit-HasComments: No
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#6). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamService. The authorization method will only let authenticated principal which matches FLAGS_principal / FLAGS_be_principal to access the service. Also added a new startup flag --krb5_ccname to allow users to customize the locations of the Kerberos credentials cache. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 330 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/6 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > That's fair. Maybe we should add some validation, then, that the path is an Good point. Validation added in AuthManager::InitKerberosEnv() http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc@174 PS4, Line 174: mem_tracker->Release(context->GetTransferSize()); > the log message probably include the 'requestor_string' which includes the Good point. This definitely helps debugging. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84 PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* service_name, > In terms of whether to keep the change in the patch, I figured that since t Fair enough. Reverted the change and left a TODO. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 05:00:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#5). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 328 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/5 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11339 ) Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG@21 PS2, Line 21: (verified via code inspection) > I'd appreciate the code reviewer to verify this as well. Did some manual verification to ensure all resources released in QueryState::ReleaseExecResources() are only referenced by backend executors except for query_mem_tracker_ which is a control structure. http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc@93 PS2, Line 93: set_query_exec_finished(); This is apparently beyond the scope of this change: Does it make sense for some code in MemTracker to DCHECK / CHECK against this value to make sure query_mem_tracker_ is not used after this flag is set ? -- To view, visit http://gerrit.cloudera.org:8080/11339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d Gerrit-Change-Number: 11339 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Aug 2018 01:20:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11339 ) Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out .. Patch Set 2: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@161 PS2, Line 161: ReleaseExecResourceRefcount() ReleaseBackendResourceRefcount() http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@186 PS2, Line 186: void AcquireBackendResourceRefcount(); Should we add a remark that this should not be called from coordinator and cite this JIRA as a problem ? -- To view, visit http://gerrit.cloudera.org:8080/11339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d Gerrit-Change-Number: 11339 Gerrit-PatchSet: 2 Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Aug 2018 01:09:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > Should this be hidden? I dont think we'd want to support users changing it, I actually prefer to keep it public so the user can configure it for cases such as KUDU-2545. I will add a remark about its being advanced option. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h@25 PS3, Line 25: #include "rpc/authentication.h" : #include "rpc/impala-service-pool.h" : #include "util/auth-util.h" > are these new includes necessary? I don't see any new usages of types in th Left over from previous patch. Removed. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@172 PS3, Line 172: mem_tracker->Release(context->GetTransferSize()); > slight nit (though I guess this happens elsewhere also): shouldn't we wait Sure. Will do it as a clean-up in a follow-on patch. That said, my understanding is that the reply itself is processed asynchronously. http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@173 PS3, Line 173: context->RespondFailure(kudu::Status::NotAuthorized( > should we also log the unauthorized access attempt (perhaps including the e Done http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84 PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* service_name, > I wonder whether we should just be using krb5_parse_name here instead of im When writing this patch, I noticed the previous implementation made two calls split() which seems unnecessary. I fixed it as part of this patch as I was calling it in the previous version of this patch but this is not needed anymore in the new version. It doesn't hurt to keep it though. I am not sure why krb5_parse_name() wasn't used here in the previous implementation. Given the Kudu code has better infrastructure, it seems easier for me to add a utility function to Kudu and call it here instead. Filed IMPALA-7504 for this. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 29 Aug 2018 00:24:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#4). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 315 insertions(+), 146 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/4 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50 PS2, Line 50: krb5_ccname > nit: Maybe krb5_ccpath ? Or krb5ccname_path Seems more intuitive to have a name which resembles the env var KRB5CCNAME. That env variable indicates the location of the credentials cache so it seems a bit redundant to add path to its name as we also don't use krb_conf_path below. Please let me know if you feel strongly about it though. KRB5CCNAME Location of the Kerberos 5 credentials (ticket) cache. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h@216 PS2, Line 216: > Just to add, if we're going to keep the service name fixed, this should pro Variable removed as this is no longer necessary per suggestion to use GetLoggedInUserNameFromKeytab(). http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@118 PS2, Line 118: nused_realm)); > I think this should actually be hardcoded to "impala" or something rather t For all practical purposes, that's true. However, doing so will also require hardcoding part of FLAGS_principal / FLAGS_be_principal as that's what's used for kinit. Hardcoding the principal is almost the same as deprecating FLAGS_principal / FLAGS_be_principal and switching to using a boolean flag FLAGS_kerberos_enabled. This seems to be a larger scope than I feel comfortable with for this change. I need more thought about this as I have to understand the historical context for having FLAGS_principal / FLAGS_be_principal to begin with. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@161 PS2, Line 161: // Authorization is enforced iff Kerberos is enabled. > invert this condition so that you can un-nest the majority of the function. Done http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@164 PS2, Line 164: ername matches that of the kinit'ed principal. : const RemoteUser& remote_user = context->remote_user > sorry I skipped half of what I meant to say in this comment. What I meant i Good point. Done. http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h File be/src/testutil/mini-kdc-wrapper.h: http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h@43 PS2, Line 43: static Status SetupAndStartMiniKDC(std::string realm, : std::string ticket_lifetime, std::string renew_lifetime, > why these changes? it's better to take the arguments by copy, and then std: Done -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 28 Aug 2018 18:27:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#3). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 311 insertions(+), 142 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/3 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11331 to look at the new patch set (#2). Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M bin/rat_exclude_files.txt M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 19 files changed, 315 insertions(+), 143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/2 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc@608 PS1, Line 608: sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, APP_NAME.c_str()); > Todd pointed out during an offline discussion that KRPC internally implemen Wasn't very clear in my previous update. I was trying to say: The per-connection authorization (instead of per-RPC authorization) may save some CPU overhead by avoiding the need to parse the principal on every RPC call. It's unclear how much we are saving in the grand scheme of things though. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 27 Aug 2018 19:05:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11331 ) Change subject: Add missing authorization in KRPC .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc@608 PS1, Line 608: sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, APP_NAME.c_str()); > Alternatively, you could add these callbacks to the Kudu callback list for Todd pointed out during an offline discussion that KRPC internally implements authorization by overriding the authz_method defined in the RPC service. The authz_method is invoked for all RPCs. So, adding a SASL callback in some sense is adding another path which authorization is done. The argument for it is that it's a per-connection authorization instead of a per-RPC authorization. The latter may be needed if we need finer grained authorization (e.g. RPC method level authorization) so we can save some CPU overhead of having to parse the principal repeatedly for each RPC. It's unclear how much we are saving in the grand scheme of things. I will explore the alternative of overriding the SASL callbacks but IMHO, using the existing mechanism in KRPC to do the authorization (which is what's implemented in this patch) seems to be the easiest path for now. May help if Todd or others can chime in on their opinion. I think it may make sense to explore the option of adding a way for caller to specify Sasl callbacks as part of constructing a messenger. -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 27 Aug 2018 18:57:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add missing authorization in KRPC
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11331 Change subject: Add missing authorization in KRPC .. Add missing authorization in KRPC In 2.12.0, Impala adopted Kudu RPC library for certain backened services (TransmitData(), EndDataStream()). While the implementation uses Kerberos for authenticating users connecting to the backend services, there is no authorization implemented. This is a regression from the Thrift based implementation because it registered a SASL callback (SaslAuthorizeInternal) to be invoked during the connection negotiation. With this regression, an unauthorized but authenticated user may invoke RPC calls to Impala backend services. This change fixes the issue above by overriding the default authorization method for the DataStreamServices. The authorization method will only let authenticated principal which matches the expected service name to access the service. Testing done: 1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an unauthorized user is not allowed to access the service. 2. Ran some queries in a Kerberos enabled cluster to make sure there is no error. 3. Exhaustive builds. Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix. Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 --- M be/src/common/global-flags.cc M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test.cc R be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/thrift-server-test.cc M be/src/runtime/data-stream-test.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/testutil/mini-kdc-wrapper.cc M be/src/testutil/mini-kdc-wrapper.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M common/protobuf/data_stream_service.proto A common/protobuf/kudu M common/protobuf/rpc_test.proto 18 files changed, 314 insertions(+), 143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/1 -- To view, visit http://gerrit.cloudera.org:8080/11331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5 Gerrit-Change-Number: 11331 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@265 PS10, Line 265: DCHECK(!instance_stats->done_ || : report_version == instance_stats->last_report_version_); > isn't it possible (though unlikely) that you have some old report sitting i Yup. That's actually a DCHECK which Sailesh ran into when testing IMPALA-4063. Removed in patch 11. http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287 PS10, Line 287: ReportVersion > if I understand the threading correctly, there are two entry points for sen To answer the question about your second race, I think it's impossible today because we explicitly turn off the reporting thread before sending the final profile (https://github.com/apache/impala/blob/master/be/src/runtime/fragment-instance-state.cc#L479-L496) The first race is also kind of impossible today because we wait for the report thread to exit before sending the final report but yes, it's possible that the RPC in the report thread fails somehow and we just drop those error statuses on the floor. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 22 Aug 2018 01:58:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#11). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: core build. Added some targeted test cases for profile serialization failure and RPC retry. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 60 files changed, 1,106 insertions(+), 670 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/11 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11270 ) Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. Patch Set 4: GVO failed due to IMPALA-6776 / IMPALA-7061 -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 21 Aug 2018 06:21:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7433: reduce logging on executors
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11202 ) Change subject: IMPALA-7433: reduce logging on executors .. Patch Set 4: Lars, do you also want to take a look ? -- To view, visit http://gerrit.cloudera.org:8080/11202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff Gerrit-Change-Number: 11202 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 21 Aug 2018 00:06:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11270 ) Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. Patch Set 3: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 20 Aug 2018 22:05:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11270 ) Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@344 PS2, Line 344: instances_prepare_promise_ > This variable doesn't exist. Fixed. http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@345 PS2, Line 345: accessor > reader? Since writes by the startup thread are allowed, and in fact require Yes, was following the terminology accessor/mutator. Using the term "readers" make it clearer. http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@351 PS2, Line 351: accessor > readers Done -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 20 Aug 2018 22:05:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11270 to look at the new patch set (#3). Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. IMPALA-7422: Fix a race in QueryState::StartFInstances() A recent commit for IMPALA-7163 (cbc8c63) introduced a race between insertion into QueryState::fragment_map_ and the thread creation. In particular, after the aforementioned commit, the counting barrier 'instances_prepared_barrier_' is used for synchronizing callers of Cancel()/PublishFilter() and the PREPARE phase of fragment instances. Cancel()/PublishFilter() cannot proceed until all fragment instances have finished preparing; 'instances_prepared_barrier_' is updated by fragment instances once each of them is done preparing. The race is due to the fact that QueryState::StartFInstances() doesn't insert the fragment instance into 'fragment_map_' until after the fragment instance thread has been spawned. So, it's possible for the newly spawned thread to finish preparing and update the counting barrier before the insertion into 'fragment_map_' happens. It's therefore possible for PublishFilter() to have gotten unblocked before a fragment is inserted into 'fragment_map_', triggering the DCHECK() in IMPALA-7422. This change fixes the race by moving the insertion into fragment_map_ before the thread is spawned. Testing done: Exhaustive debug + release builds which previously ran into this race Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e --- M be/src/runtime/query-state.cc M be/src/runtime/query-state.h 2 files changed, 14 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/3 -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7433: reduce logging on executors
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11202 ) Change subject: IMPALA-7433: reduce logging on executors .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc@356 PS3, Line 356: VLOG(2) > We do already have log messages before and after this. I might be missing s Yes, please do add logging for errors in QueryState::Init(). -- To view, visit http://gerrit.cloudera.org:8080/11202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff Gerrit-Change-Number: 11202 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 20 Aug 2018 21:54:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7433: reduce logging on executors
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11202 ) Change subject: IMPALA-7433: reduce logging on executors .. Patch Set 4: Code-Review+1 Thanks for the explanation. -- To view, visit http://gerrit.cloudera.org:8080/11202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff Gerrit-Change-Number: 11202 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 20 Aug 2018 21:45:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11270 ) Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11270/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11270/1//COMMIT_MSG@14 PS1, Line 14: cnanot > nit: type Done http://gerrit.cloudera.org:8080/#/c/11270/1/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11270/1/be/src/runtime/query-state.cc@403 PS1, Line 403: // Update fragment_map_. Has to happen before the thread is spawned below or > Isn't a race still possible between this thread modifying the map in a subs 'fragment_map_' is supposed to be modified by the thread which calls StartFInstances(). Other threads may not access 'fragment_map_' until all fragment instances have finished preparing (i.e. they are blocked on WaitForPrepare()). At this point, there is at least one fragment instance which hasn't yet been started so no other thread should be accessing fragment_map_. Added some comments in query-state.h. -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 20 Aug 2018 20:39:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11270 to look at the new patch set (#2). Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. IMPALA-7422: Fix a race in QueryState::StartFInstances() A recent commit for IMPALA-7163 (cbc8c63) introduced a race between insertion into QueryState::fragment_map_ and the thread creation. In particular, after the aforementioned commit, the counting barrier 'instances_prepared_barrier_' is used for synchronizing callers of Cancel() and PublishFilter() and the PREPARE phase of fragment instances. Cancel() and PublishFilter() cannot proceed until all fragment instances have finished preparing; 'instances_prepared_barrier_' is updated by fragment instances once each of them is done preparing. The race is due to the fact that QueryState::StartFInstances() doesn't insert the fragment instance into 'fragment_map_' until after the fragment instance thread has been spawned. So, it's possible for the newly spawned thread to finish preparing and update the counting barrier before the insertion into 'fragment_map_' happens. It's therefore possible for, PublishFilter() to have gotten unblocked before a fragment is inserted into 'fragment_map_', triggering the DCHECK() in IMPALA-7422. This change fixes the race by moving the insertion into fragment_map_ before the thread is spawned. Testing done: Exhaustive debug + release builds which previously ran into this race Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e --- M be/src/runtime/query-state.cc M be/src/runtime/query-state.h 2 files changed, 13 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/2 -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7433: reduce logging on executors
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11202 ) Change subject: IMPALA-7433: reduce logging on executors .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-exec-mgr.cc@155 PS3, Line 155: VLOG(2) This may be useful for debugging issue such as IMPALA-7194. Not sure if the newly added line below is sufficient as it may help to see when each fragment instance finishes. http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc@356 PS3, Line 356: VLOG(2) This one may be useful to keep to indicate whether we manage to get to the point of starting fragment instances after creating the QueryState. While we may consider bumping the log level when problem occurs, this may be a bit hard in practice if the problem is due to a race which doesn't occur consistently. -- To view, visit http://gerrit.cloudera.org:8080/11202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff Gerrit-Change-Number: 11202 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Mon, 20 Aug 2018 20:18:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11270 Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances() .. IMPALA-7422: Fix a race in QueryState::StartFInstances() A recent commit for IMPALA-7163 (cbc8c63) introduced a race between insertion into QueryState::fragment_map_ and the thread creation. In particular, after the aforementioned commit, the counting barrier 'instances_prepared_barrier_' is used for synchronizing callers of Cancel() and PublishFilter() and the PREPARE phase of fragment instances. Cancel() and PublishFilter() cnanot proceed until all fragment instances have finished preparing; 'instances_prepared_barrier_' is updated by fragment instances once each of them is done preparing. The race is due to the fact that QueryState::StartFInstances() doesn't insert the fragment instance into 'fragment_map_' until after the fragment instance thread has been spawned. So, it's possible for the newly spawned thread to finish preparing and update the counting barrier before the insertion into 'fragment_map_' happens. It's therefore possible for, PublishFilter() to have gotten unblocked before a fragment is inserted into 'fragment_map_', triggering the DCHECK() in IMPALA-7422. This change fixes the race by moving the insertion into fragment_map_ before the thread is spawned. Testing done: Exhaustive debug + release builds which previously ran into this race Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e --- M be/src/runtime/query-state.cc 1 file changed, 8 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/1 -- To view, visit http://gerrit.cloudera.org:8080/11270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e Gerrit-Change-Number: 11270 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11234 ) Change subject: IMPALA-7444: Improve logging of opening/closing/expiring sessions. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a Gerrit-Change-Number: 11234 Gerrit-PatchSet: 4 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Sat, 18 Aug 2018 00:32:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender
Hello Lars Volker, Csaba Ringhofer, Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11241 to look at the new patch set (#3). Change subject: IMPALA-7449: Fix network throughput calculation of DataStreamSender .. IMPALA-7449: Fix network throughput calculation of DataStreamSender Currently, the network throughput presented in the query profile for DataStreamSender is computed by dividing the total bytes sent by the total network time which is the sum of observed network time of all individual RPCs. This is wrong in general and may only make sense if the network throughput is fixed. In addition, RPCs are asynchronous and they overlap with each other. So, dividing the total byte sent by network throughput may result in time which exceeds the wall clock time, making it impossible to interpret. This change fixes the problem by measuring the network throughput of each individual RPC and uses a summary counter to track avg/min/max of network throughputs instead. Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24 --- M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h 2 files changed, 14 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11241/3 -- To view, visit http://gerrit.cloudera.org:8080/11241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24 Gerrit-Change-Number: 11241 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11234 ) Change subject: IMPALA-7444: Improve logging of opening/closing/expiring sessions. .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11234/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11234/3/be/src/service/impala-server.cc@1975 PS3, Line 1975: LOG_IF(INFO, expired_cnt > 0) << "Expired sessions. Count: " << expired_cnt : << ". Time: " << UnixMillis() - now << " milliseconds."; nit: LOG_IF(INFO, expired_cnt > 0) << Substitute("Expired $0 sessions.", expired_cnt); I believe the timestamp at the log line is a reasonable proxy for the time when the above critical section completed so there may not be a need to log the time here. -- To view, visit http://gerrit.cloudera.org:8080/11234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a Gerrit-Change-Number: 11234 Gerrit-PatchSet: 3 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 17 Aug 2018 22:49:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11241 ) Change subject: IMPALA-7449: Fix network throughput calculation of DataStreamSender .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11241/2/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/11241/2/be/src/runtime/krpc-data-stream-sender.cc@401 PS2, Line 401: row_batch_size * NANOS_PER_SEC > This will overflow around a few gigabyte row_batch_size - is it possible to Good point. The max of our row batch currently is limited to numeric_limits::max(); (https://github.com/apache/impala/blob/master/be/src/rpc/rpc-mgr.cc#L90) so it should be fine as far as I can tell. It probably warrants a DCHECK. -- To view, visit http://gerrit.cloudera.org:8080/11241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24 Gerrit-Change-Number: 11241 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Fri, 17 Aug 2018 18:36:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11234 ) Change subject: IMPALA-7444: Improve logging of opening/closing/expiring sessions. .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc@1962 PS2, Line 1962: expired_cnt++; ++expired_cnt; http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc@1975 PS2, Line 1975: LOG_IF(INFO, expired_cnt > 0) << "Expired sessions. Count: " << expired_cnt : << ". Time: " << UnixMillis() - now << " milliseconds."; What's the purpose of this line ? Won't we be able to tell that from the "Expiring session..." line above ? -- To view, visit http://gerrit.cloudera.org:8080/11234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a Gerrit-Change-Number: 11234 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Thu, 16 Aug 2018 21:50:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11241 Change subject: IMPALA-7449: Fix network throughput calculation of DataStreamSender .. IMPALA-7449: Fix network throughput calculation of DataStreamSender Currently, the network throughput presented in the query profile for DataStreamSender is computed by dividing the total bytes sent by the total network time which is the sum of observed network time of all individual RPCs. This is wrong in general and may only make sense if the network throughput is fixed. In addition, RPCs are asynchronous and they overlap with each other. So, dividing the total byte sent by network throughput may result in time which exceeds the wall clock time, making it impossible to interpret. This change fixes the problem by measuring the network throughput of each individual RPC and uses a summary counter to track avg/min/max of network throughputs instead. Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24 --- M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h 2 files changed, 11 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11241/1 -- To view, visit http://gerrit.cloudera.org:8080/11241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24 Gerrit-Change-Number: 11241 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11234 ) Change subject: IMPALA-7444: Improve logging of opening/closing/expiring sessions. .. Patch Set 1: Just being paranoid here but I hope this doesn't make the log too verbose. -- To view, visit http://gerrit.cloudera.org:8080/11234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a Gerrit-Change-Number: 11234 Gerrit-PatchSet: 1 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 15 Aug 2018 19:25:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11234 ) Change subject: IMPALA-7444: Improve logging of opening/closing/expiring sessions. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-hs2-server.cc@367 PS1, Line 367: Done opening Opened http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-server.cc@1236 PS1, Line 1236: Done closing Closed -- To view, visit http://gerrit.cloudera.org:8080/11234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a Gerrit-Change-Number: 11234 Gerrit-PatchSet: 1 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 15 Aug 2018 19:25:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7434: Heimdal Kerberos is not supported in Impala
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11204 ) Change subject: IMPALA-7434: Heimdal Kerberos is not supported in Impala .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f Gerrit-Change-Number: 11204 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 15 Aug 2018 01:11:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7434: Limitation in kinit with auth to local and Heimdal kerberos
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11204 ) Change subject: IMPALA-7434: Limitation in kinit with auth_to_local and Heimdal kerberos .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11204/3/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: http://gerrit.cloudera.org:8080/#/c/11204/3/docs/topics/impala_known_issues.xml@323 PS3, Line 323: auth_to_local rules not supported with Heimdal Kerberos Sorry for the confusion but my understanding when discussing with Sailesh is that we don't support auth_to_local in any implementation of Kerberos. Also, Heimdal Kerberos (an implementation of Kerberos) is also not supported either. We only support MIT Kerberos. Let me double check with Michael Yoder. -- To view, visit http://gerrit.cloudera.org:8080/11204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f Gerrit-Change-Number: 11204 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Tue, 14 Aug 2018 23:48:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7434: Limitation in kinit with auth to local and Heimdal kerberos
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11204 ) Change subject: IMPALA-7434: Limitation in kinit with auth_to_local and Heimdal kerberos .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11204/2/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: http://gerrit.cloudera.org:8080/#/c/11204/2/docs/topics/impala_known_issues.xml@323 PS2, Line 323: KRPC Unavailable auth_to_local and Heimdal Kerberos are never supported even before KRPC. So, I think we should remove any wording related to KRPC. -- To view, visit http://gerrit.cloudera.org:8080/11204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f Gerrit-Change-Number: 11204 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Tue, 14 Aug 2018 02:08:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11055 ) Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation .. Patch Set 4: (1 comment) May help to add a unit test to exercise retrying logic when sending a filter to the aggregator. http://gerrit.cloudera.org:8080/#/c/11055/4/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/11055/4/common/thrift/ImpalaInternalService.thrift@878 PS4, Line 878: optional map> backend_list May help to add a comment on what this is for. -- To view, visit http://gerrit.cloudera.org:8080/11055 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c Gerrit-Change-Number: 11055 Gerrit-PatchSet: 4 Gerrit-Owner: Rahul Shivu Mahadev Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Rahul Shivu Mahadev Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 10 Aug 2018 21:50:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h@33 PS9, Line 33: #include "runtime/coordina > Not needed. Done http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc@52 PS9, Line 52: > This should have been moved to line 48. Done http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto@137 PS9, Line 137: 5 > Skipped 5. Done -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 09 Aug 2018 02:46:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#10). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: core build. Added some targeted test cases for profile serialization failure and RPC retry. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 60 files changed, 1,110 insertions(+), 673 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/10 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc@52 PS9, Line 52: DCHECK_NOTNULL(coord); This should have been moved to line 48. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 09 Aug 2018 01:49:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@373 PS8, Line 373: MonoDelta::FromMilliseconds(FLAGS_backend_client_rpc_timeout_ms)); > We have some logging already below at line 385. Actually, mis-interpreted your comment. Will add a LOG(WARNING). -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 09 Aug 2018 00:49:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h@33 PS9, Line 33: #include "common/atomic.h" Not needed. http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto@137 PS9, Line 137: 6 Skipped 5. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 09 Aug 2018 00:22:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. Patch Set 8: (16 comments) http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/data-sink.cc File be/src/exec/data-sink.cc: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/data-sink.cc@49 PS8, Line 49: const char* DataSink::ROOT_PARTITION_KEY = ""; > this should be const char* const -- ie it's a const pointer to const charac True. Same comment probably applies to other classes too (e.g. https://github.com/apache/impala/blob/master/be/src/exec/hash-table.h#L196) http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/hdfs-table-writer.h File be/src/exec/hdfs-table-writer.h: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/hdfs-table-writer.h@91 PS8, Line 91: const InsertStatsPB& stats() { return stats_; } > probably should be a const function as well? Done http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.h@211 PS8, Line 211: Coordinator* coord_; /// Coordinator object that owns this BackendState > this went from const to non-const. does this mutate the coordinator that ow Indirectly. We are updating the coord_->dml_exec_state_(). The locking is done inside dml_exec_state_ itself so it should be thread safe. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@50 PS8, Line 50: : coord_(coord), > nit: you could just do coord_(DHECK_NOTNULL(coord)) here since DCHECK_NOTNU Done http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@262 PS8, Line 262: lock.unlock(); > I'm nervous about this optimization. Do we really have data that deserializ The reason for doing it here is to handle the case in which we can have 0 to # fragment instances (after IMPALA-4063) entries in "backend_exec_status. instance_exec_status". We can have 0 entries if say we failed to start a thread. So, while we can do it in the RPC handler, it's a bit awkward to handle the zero entry case. The new patch moves the profile sidecar idx into backend_exec_status from backend_exec_status.instance_exec_status. That allows us to keep the deserialization in the ControlService::ReportExecStatus(). In the patch for IMPALA-4063, we will just replace TRuntimeProfileTree with a list instead. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@310 PS8, Line 310: coord_ > I feel that allowing for a non-const pointer to the Coordinator object to t Done http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@317 PS8, Line 317: MergeErrorMaps(instance_exec_status.error_log(), &error_log_); > This is less severe, but I guess that this is already wrong too. We potenti The new report has a monotonically increasing version number. This is a variant of IMPALA-7241. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@516 PS8, Line 516: lock_guard l1(exec_summary->lock); > Are we violating any lock ordering here? Yes, I also looked and didn't see anything. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@318 PS8, Line 318: bool done > trying to understand the purpose of this parameter. I don't think this shou I agree in general. Let me add a TODO. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@331 PS8, Line 331: ThriftSerializer serializer(true); > Why construct this here and pass it into ConstructReport? It seems Construc The serialization output is owned by ThriftSerializer. My mistake to use a string below. I was using (buffer, len) earlier to avoid the extra copying. Also added comments about the lifetime of the buffer. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@359 PS8, Line 359: CHECK(sidecar_status.ok()) : << FromKuduStatus(sidecar_status, "Failed to add sidecar").GetDetail(); > is there any chance that the profile would extend past max RPC size? I gues Added a CHECK for it in ConstructReport(). http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@373 PS8, Line 373: if (i < 2) SleepForMs(FLAGS_report_status_retry_interval_ms); > worth a LOG(WARNING) here? We have some logging already below at line 385. http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/service/control-service.cc File be/src/service/control-service.cc: http://
[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#9). Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. This patch also fixes IMPALA-7241 which may lead to duplicated dml stats being applied. The fix is by adding a monotonically increasing version number for fragment instances' reports. The coordinator will ignore any report smaller than or equal to the version in the last report. Testing done: core build. Added some targeted test cases for profile serialization failure and RPC retry. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 60 files changed, 1,110 insertions(+), 673 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/9 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/kudu/rpc/rpc_context.h File be/src/kudu/rpc/rpc_context.h: http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/kudu/rpc/rpc_context.h@166 PS8, Line 166: Status GetInboundSidecar(int idx, Slice* slice) const; Changes done on Kudu side already: https://github.com/apache/kudu/commit/37d0c35bcf43aef92fdde82c629f732f1466d8ae -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 08 Aug 2018 18:14:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 8: (22 comments) http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/data-sink.h File be/src/exec/data-sink.h: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/data-sink.h@100 PS5, Line 100: static const char* ROOT_PARTITION_KEY; > can this be a const char* const ROOT_PARTITION_KEY instead? usually static Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/hdfs-table-writer.h File be/src/exec/hdfs-table-writer.h: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/hdfs-table-writer.h@91 PS5, Line 91: const InsertS > style: is this meant to be returning a mutable reference or should this be Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/coordinator-backend-state.cc@254 PS5, Line 254: // If this backend completed previously, don't apply the update. : if (IsDone()) return false; : for (const FragmentInstanceExecStatusPB& instance_ > you can use a C++11 style loop with protobuf repeated elements too Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h@276 PS5, Line 276: > "... to construct a status report ..." Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h@277 PS5, Line 277: > "If 'fis' is not 'nullptr', the runtime profile ..." Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@233 PS5, Line 233: return state == BackendExecState::FINISHED : || state == BackendExecState::CANCELLED : || state == BackendExecState::E > I thought I saw some utility code for this elsewhere Good point about adding this as utility function which we don't have one right now. The utility function may not be needed eventually once we convert other ExecFInstance() RPC to KRPC too. http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@243 PS5, Line 243: DCHECK(!IsTerminalState(backend_exec_state_)) : << " Current State: " << BackendExecStateToString(backend_exec_state_) : << " | Current Status: " << query_status_.Get > and here Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@256 PS5, Line 256: > Shouldn't we print the fragment instance ID instead? Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@295 PS5, Line 295: tus serialize_status = serializer-> > The profile is a thrift structure serialized to a string ... Done http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@306 PS5, Line 306: Only send updates to insert status if > We could probably consider moving 'profile_str' to the 'sidecar_buf' to avo I guess it's a tradeoff because moving the 'profile_str' means having to redo the serialization of the profile on when retrying the RPC. I'm a bit worried about the overhead of doing so once we merge the profile of multiple fragment instances together. Timeout is more likely to occur once the coordinator is under stress. http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@304 PS5, Line 304: } : : // Only send updates to insert status if fragment is finished, the coordinator waits : // until query execution is done to use them anyhow. : RuntimeState* state = fis->runtime_state(); : if (done) { : state->dml_exec_state()->ToProto(instance_status->mutable_insert_exec_status()); : } : : // Send new errors to coordinator : state->GetUnreportedErrors(instance_status->mutable_error_log()); : } : } : > Can't we just do this once before the loop starts? Not really. We need to re-arm the sidecar on every retry. The ownership of the buffer is transferred to the OutboundCall so we need to re-arm the sidecar. http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/runtime-state.cc@200 PS5, Line 200: g_lock_); > this is weird here because there is no 'exec_status' in this function Typo. Meant 'new_errors'. http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#8). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/kudu/rpc/rpc_context.cc M be/src/kudu/rpc/rpc_context.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 62 files changed, 1,096 insertions(+), 666 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/8 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#7). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/kudu/rpc/rpc_context.cc M be/src/kudu/rpc/rpc_context.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 62 files changed, 1,095 insertions(+), 666 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/7 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#6). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/kudu/rpc/rpc_context.cc M be/src/kudu/rpc/rpc_context.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_rpc_exception.py M tests/failure/test_failpoints.py 62 files changed, 1,096 insertions(+), 666 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/6 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 16 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Aug 2018 21:01:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/10813/15/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/15/be/src/runtime/fragment-instance-state.cc@105 PS15, Line 105: current_state_.Load() <= TFInstanceExecState::WAITING_FOR_PREPARE) { : DCHECK(!is_prepared); Seems easier to understand with the following (with the same effect): if (!is_prepared) { DCHECK_LT(current_state.Load(), ); } else { } -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 15 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Aug 2018 18:15:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 14: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107 PS13, Line 107: er > I think it's better to rely on the enum ordering. Otherwise, the new boolea My understanding is that explicitly check the current state during state transition but don't rely on the ordering of the enum in any way. That said, it'd be great if the enum is ordered in the expected state transition order. May help to have a static DCHECK somewhere to confirm the ordering of the enum entries matches the expected state transition order if we don't have one already. http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@522 PS13, Line 522: "First batch produced", // FIRST_BATCH_PRODUCED : "Producing Data", // PRODUCING_DATA This needs to be changed. I think this is dependent on the enum ordering. http://gerrit.cloudera.org:8080/#/c/10813/14/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/10813/14/common/thrift/ImpalaInternalService.thrift@626 PS14, Line 626: // the debug webpages. Given the assumption in this patch about the ordering of the fields in this enum, can you please add a comment and ideally a DCHECK for it ? -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 14 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 06 Aug 2018 19:55:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107 PS13, Line 107: <= > Hmm, the enums are ordered incorrectly. I think the right thing to do would A simpler alternative to relying on this enum is to save a boolean above which is marked as true if we get past Prepare() successfully. This avoids the reliance that the code actually enforces the state transition based on the order in which it's defined in the enum. -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 13 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Sun, 05 Aug 2018 17:43:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7394: Don't print stack trace in ExpireSessions()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/7 ) Change subject: IMPALA-7394: Don't print stack trace in ExpireSessions() .. Patch Set 1: Build failed due to IMPALA-7328 -- To view, visit http://gerrit.cloudera.org:8080/7 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa Gerrit-Change-Number: 7 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Sat, 04 Aug 2018 18:11:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7394: Don't print stack trace in ExpireSessions()
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/7 Change subject: IMPALA-7394: Don't print stack trace in ExpireSessions() .. IMPALA-7394: Don't print stack trace in ExpireSessions() When sessions expire, ExpireSessions() may create excessive amount of log spew due to printing of the stack traces esp when there are a massive amount of sessions expiring at the same time. This stack crawling also unnecessarily extends the critical section under the session_state_map_lock_, increasing the contention for that lock. This change converts the use of Status() to Status::Expected() to remove the unnecessary stack crawl. Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa --- M be/src/service/impala-server.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/7/1 -- To view, visit http://gerrit.cloudera.org:8080/7 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa Gerrit-Change-Number: 7 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 12: (3 comments) http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107 PS13, Line 107: yS == Apparently WAITING_FOR_CODEGEN < WAITING_FOR_PREPARE http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h@244 PS12, Line 244: Status WaitForFinish(); > We call it towards the end of StartFInstances(). We update the query state Oh right. Missed that. http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py File tests/failure/test_failpoints.py: http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py@162 PS12, Line 162: i = 0 > When I remove this, it gives me this error: It should work with for i in range(50): -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 12 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 03 Aug 2018 19:10:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7390: Pin rpc-mgr-kerberized-test to localhost.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/3 ) Change subject: IMPALA-7390: Pin rpc-mgr-kerberized-test to localhost. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/3/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/3/1//COMMIT_MSG@11 PS1, Line 11: failing due to DNS resolution reasons Does it mean in general, to run Impala with docker, one has to specify the hostname ? Is there anything we can do at this line (https://github.com/apache/impala/blob/master/be/src/common/init.cc#L191) to fix the issue ? http://gerrit.cloudera.org:8080/#/c/3/1/be/src/rpc/rpc-mgr-kerberized-test.cc File be/src/rpc/rpc-mgr-kerberized-test.cc: http://gerrit.cloudera.org:8080/#/c/3/1/be/src/rpc/rpc-mgr-kerberized-test.cc@86 PS1, Line 86: FLAGS_hostname = "localhost"; nit: may help to add a comment on why InitCommonRuntime() doesn't set this up correctly. -- To view, visit http://gerrit.cloudera.org:8080/3 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91003cbc86177feb7f99563f61297f7da7fabab4 Gerrit-Change-Number: 3 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 03 Aug 2018 16:48:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 12: (3 comments) http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/fragment-instance-state.cc@106 PS12, Line 106: if (!status.ok() && current_state_.Load() <= TFInstanceExecState::WAITING_FOR_PREPARE) { : // Tell the managing 'QueryState' that we hit an error during Prepare(). : query_state_->ErrorDuringPrepare(status, instance_id()); : } else if (!status.ok()) { : // Tell the managing 'QueryState' that we hit an error during execution. : query_state_->ErrorDuringExecute(status, instance_id()); : } Seems easier to read if written like the following: if (!status.ok()) if (...) query_state_->Error... else query_state_->ErrorDuring http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h@244 PS12, Line 244: Status WaitForFinish(); Is this actually still needed in this patch ? It doesn't seem to be called anywhere. http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py File tests/failure/test_failpoints.py: http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py@162 PS12, Line 162: i = 0 This line isn't needed if you use for i in range(50) below. -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 12 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 02 Aug 2018 18:26:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 10: (10 comments) http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/fragment-instance-state.cc@100 PS10, Line 100: if (!status.ok()) { : // Tell the managing 'QueryState' that we hit an error during execution. : query_state_->ErrorDuringExecute(status, instance_id()); : goto done; : } : // Tell the managing 'QueryState' that we're done with executing and that we've stopped : // the reporting thread. : query_state_->DoneExecuting(); Should this be included moved to point after label done ? Otherwise, we may hang in case Open() fails ? May be this warrants a debug action which triggers failure() in Open(). http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@216 PS10, Line 216: ret_status; not used ? http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@217 PS10, Line 217: BackendExecState old_state; BackendExecState old_state = backend_exec_state_; and no need for line 220 http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@230 PS10, Line 230: BackendExecState::EXECUTING : : BackendExecState::FINISHED; nit: one line http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@408 PS10, Line 408: // We failed to start 'num_unstarted_instances', so make sure to notify : // 'instances_prepared_barrier_' 'num_unstarted_instances - 1' times, to unblock : // 'WaitForPrepare(). The last remaining notification will be set by the error : // handling logic once we've exited this loop so that the we may have the query : // wide status reflect the 'thread_create_status'. : while (num_unstarted_instances > 1) { : DonePreparing(); : --num_unstarted_instances; : } A minor suggestion is to group this loop with ErrorDuringPrepare() below so it's easier to see that we have updated num_unstarted_instances times. http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@435 PS10, Line 435: ErrorDuringPrepare(thread_create_status, TUniqueId()); We used to wait for all fragment instances to finish preparing before reporting errors. This patch stops doing that. Does it matter ? http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@442 PS10, Line 442: all_fis_status; not actually used? WaitForPrepare() below can become discard_result(WaitForPrepare()); http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py File tests/failure/test_failpoints.py: http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@151 PS10, Line 151: self.execute_query_expect_failure(self.client, query, : query_options={'debug_action':debug_action}) Do we care about verifying the failure actually occurred ? http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@157 PS10, Line 157: x = 0 : # Since there's only a 50% chance of fragment instance thread creation failure, : # we attempt to catch a query failure up to a very conservative maximum of 50 tries. : while x in range(0, 50): for i in range(50): http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@165 PS10, Line 165: if 'Query aborted:Debug Action: FIS_FAIL_THREAD_CREATION:FAIL@0.5' in str(e): : break : else: : assert False, str(e) assert 'Query aborted:Debug Action: FIS_FAIL_THREAD_CREATION:FAIL@0.5' in str(e), str(e) -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 10 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 02 Aug 2018 00:06:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7376: DCHECK hit if a fragment instance fails to initialize the filter bank
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11096 ) Change subject: IMPALA-7376: DCHECK hit if a fragment instance fails to initialize the filter bank .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/11096/2/tests/failure/test_failpoints.py File tests/failure/test_failpoints.py: http://gerrit.cloudera.org:8080/#/c/11096/2/tests/failure/test_failpoints.py@147 PS2, Line 147: query_options={'debug_action':debug_action}) Should we also verify that the query actually failed ? -- To view, visit http://gerrit.cloudera.org:8080/11096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1d6e7afc18fe2f0e1e29d2bd8a5f804a78f7043a Gerrit-Change-Number: 11096 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 01 Aug 2018 22:46:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 4: (30 comments) http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG@16 PS4, Line 16: This patch also introduces a new service pool for all query execution : control related RPCs in the future so that control commands from : coordinators aren't blocked by long-running DataStream services' RPCs. > yep, that's essentially what I meant. Thanks for explaining it better than Thanks for the suggestion. I think it's fine to do it as a follow-up or parallel work on Kudu side. This doesn't necessarily block the merging of this patch IMHO as ReportExecStatus() can already be late today for various reasons such as an overloaded coordinator. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/coordinator.h@31 PS4, Line 31: #include "kudu/util/slice.h" > Not needed? Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc File be/src/runtime/dml-exec-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc@404 PS4, Line 404: void DmlExecState::ToProto(InsertExecStatusPB* dml_status) { > do you want to clear dml_status first just in case? Similarly for other ToP Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc@455 PS4, Line 455:if (!kudu_stats->has_num_row_errors()) { : kudu_stats->set_num_row_errors(0); : } > I don't think this is necessary- it's not illegal to access an "unset" fiel Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/exec-env.h@136 PS4, Line 136: DataStreamService* data_svc() const { return data_svc_.get(); } > Not used, we can remove this. Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@268 PS4, Line 268: rpc_controller.AddOutboundSidecar(move(sidecar), &sidecar_idx).ok > should this be a CHECK_OK? Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@273 PS4, Line 273: "final" > need a space here Oops. Done. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@274 PS4, Line 274: ERROR > should this be a DFATAL? do you expect this to ever actually happen? I don't think this is expected to happen very often but it seems more robust to not crash Impala because of failure to serialize the query profile. The query can still run to completion without the profile. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@283 PS4, Line 283: req.clear_error_log(); > isnt the error log already clear because this is a new instance? Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@284 PS4, Line 284: state->GetUnreportedErrors(req.mutable_error_log()); > it's a shame that this mutates the state, making retries a bit more difficu The new patch creates a single instance of the RPC parameter and reuse it on retry. My understanding is that the RPC layer shouldn't mutate the input RPC request parameter. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@289 PS4, Line 289: rpc_mgr->GetProxy > clang-tidy failure: Missing status check. Done http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@293 PS4, Line 293: kudu::Status rpc_status = proxy->ReportExecStatus(req, &resp, &rpc_controller); > do you want any timeout on this RPC? Yes. Switched back to the behavior of existing code of using FLAGS_backend_client_rpc_timeout_ms. This is not ideal but I guess it's simplest to just preserve the existing behavior and fix IMPALA-2990 in a separate patch. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@305 PS4, Line 305: // We can retry the RPC if the payload hasn't been sent yet. > are these reports idempotent? seems like it shoudl be easy to make them ide Yes, I believe they are idempotent. So, I was worried about the connection being reset in the middle of transmission so only partial payload was sent. In that case, I think the server side will also drop that connection so it should be fine I suppose. Comments removed. http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@306 PS4, Line 306: RpcMgr::IsServerTooBusy(rpc_controller > I'm a bit fuzzy on the context of this code, but if we get TOO_BUSY, it see Restored to the old behavior of retrying for at most 2 ti
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#5). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift 58 files changed, 1,050 insertions(+), 634 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/5 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10977 ) Change subject: IMPALA-7296: bytes limit for row batch queue .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/10977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681 Gerrit-Change-Number: 10977 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 30 Jul 2018 18:53:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 8: (9 comments) http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@149 PS8, Line 149: single thread Seems clearer to also state which thread it is. http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@215 PS8, Line 215: /// during Execute(). Update 'query_status_' and 'failed_instance_id_' if it's not set already. Also notify all waiters of 'instances_finished_barrier_'. http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@260 PS8, Line 260: }; nit: blank line missing. http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@261 PS8, Line 261: // nit: /// http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@268 PS8, Line 268: single QueryState thread ? http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@328 PS8, Line 328: we recieve the finst ID and the status of the : /// FIS that hit the error. the error status and failing fragment instance ID are set in 'query_status_' and 'failed_instance_id_' respectively. http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@421 PS8, Line 421: TUniqueId()) Can we also record the fragment instance id above which failed to start thread ? http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@422 PS8, Line 422: Status updated_query_status = UpdateBackendExecState(); : instances_prepared_barrier_->NotifyRemaining(); Actually, there may be some problem calling NotifyRemaining() here. The problem is that some fragment instances may not have finished preparing yet at this point. In which case, if a Cancel() sneaks in after this point and before all fragment instances finished preparing, Impala may crash. Looks like we need to think through how to structure this code more carefully with WaitForPrepare() below. http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@425 PS8, Line 425: all_fis_status Seems clearer to use thread_create_status here like the old code. -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 8 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 30 Jul 2018 18:42:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@479 PS7, Line 479: { : unique_lock l(status_lock_); : query_status_ = Status::CANCELLED; : } > No, it makes sense to preemptively set this since a Cancel() is a query wid As discussed in person, we this may overwrite earlier error status during prepare / execution. Also, if we don't have these statements here, the first fragment to cancel will set query_status_ to Status::CANCELLED anyway. -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe Gerrit-Change-Number: 10813 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 30 Jul 2018 18:08:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11062 ) Change subject: IMPALA-7360: sequence scanners sometimes skip blocks .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a Gerrit-Change-Number: 11062 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Fri, 27 Jul 2018 01:32:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 7: (9 comments) Looking good. Some more comments. http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc@125 PS7, Line 125: DCHECK( Does DCHECK_EQ() not work ? http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@67 PS7, Line 67: however, : /// if any fragment instance hits an error or cancellation, then we immediately change the : /// state of the query to the ERROR or CANCELLED state accordingly. This is not true for PREPARING phase, right ? Seems easier to separately describe the behavior in PREPARING phase and EXECUTING phase instead of making a general statement like this and talking about exception in the next paragraph. http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@70 PS7, Line 70: EXECUTING PREPARING ? http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@71 PS7, Line 71: all the underlying FISes will be in Prepare() This is not necessarily true, right ? In PREPARING state, it only means at least one fragment instances is running Prepare(), right ? Some fragment instances could already be past Prepare() and executing Open(), Exec() etc, right ? http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@264 PS7, Line 264: BackendExecState backend_exec_state_ = BackendExecState::PREPARING; May want to document the thread safety of this variable. I suppose it's only updated by the "query-state thread" only, right ? http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@287 PS7, Line 287: /// ID of first fragment instance to hit an error. Please consider documenting the thread safety of this variable too. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308 PS6, Line 308: : /// Number of active fragment instances and coordinators for this query that may consume : /// resources for query execution (i.e. threads, memory) on the Impala daemon. : /// Query-wide execution resources for this query are released once this goes to zero. : AtomicInt32 exec_resource_refcnt_; : : /// Temporary files for this query (owned by obj_pool_). Non-null if spilling is : /// enabled. Set in Pr > I feel that it's not ideal having the fragment instance threads modify the Now that the patch has sunk in more, I think the state machine the query status and the query state has a slightly counter intuitive relationship which exists today already. query_status_ represents the error status which any fragment instances can hit either in PREPARING or EXECUTING phase. The query state means whether at least one fragment instances is still in that said state. So, a query state of PREPARING only means at least one fragment instance is still preparing but other fragment instances could already be executing. The problem is that each fragment instance will directly report to the coordinator on failure today independent of other fragment instances' states. In the next patch, the status reporting will be driven solely by the query state thread so we may actually track the status of PREPARING phase and EXECUTING phase separately like you did here. The change in behavior will be that we will report any error hit in PREPARING phase first before any error hit in EXECUTING phase. Do you think that will make the system easier to reason about ? http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@230 PS7, Line 230: BackendExecState::EXECUTING : : BackendExecState::FINISHED; nit: one line http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@479 PS7, Line 479: { : unique_lock l(status_lock_); : query_status_ = Status::CANCELLED; : } Should this happen after line 483 below ? Actually, do we need to set the query_status_ here ? Won't the first thread which notices the cancellation set it when calling ErrorDuringExec() set it ? Do we have to worry about overwriting the error status if query_status_ already contains some errors ? -- To view, visit http://gerrit.cloudera.org:8080/10813 To unsubscribe, vis
[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10977 ) Change subject: IMPALA-7296: bytes limit for row batch queue .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h File be/src/util/blocking-queue.h: http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@117 PS5, Line 117: ElemBytesFn()(*out); > It's a bit unfortunate that each element goes through two indirect function Actually, this may not actually be an indirect call after all, depending on how the compiler ends up generating the code. May be safe to double check that's actually the case with the compiled code. -- To view, visit http://gerrit.cloudera.org:8080/10977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681 Gerrit-Change-Number: 10977 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 25 Jul 2018 05:37:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10977 ) Change subject: IMPALA-7296: bytes limit for row batch queue .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue-test.cc File be/src/util/blocking-queue-test.cc: http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue-test.cc@97 PS5, Line 97: LONG_TIMEOUT_MICROS); Does it make the test timing sensitive ? Should we use a larger timeout ? http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h File be/src/util/blocking-queue.h: http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@117 PS5, Line 117: ElemBytesFn()(*out); It's a bit unfortunate that each element goes through two indirect function calls (one in BlockingPut() and one in BlockingGet()) before it's consumed. Is it a reasonable alternative to modify BlockingPut() to take an extra parameter for the element size (with default value set to 0 for callers which don't care about size limit of the queue) and store the size for each entry as part of the queue ? I think it's reasonable to cap each entry to no more than 2GB so we can represent the size as int32_t. Would that be too much space overhead in the queue ? http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@152 PS5, Line 152: if (val_bytes > 0) Not sure why the need for if() here. I can only assume this is to avoid accessing put_bytes_enqueued_. Given the access to total_put_wait_time_ above, put_bytes_enqueued_ is most likely in the cache already. This complication is most likely not warranted given the speculative execution of modern CPU. If anything, this just adds unnecessary pressure to the branch predicator. http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@185 PS5, Line 185: if (val_bytes > 0) See comments above. http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@237 PS5, Line 237: bool HasCapacityInternal(const boost::unique_lock& lock, int64_t val_bytes) { nit: long line -- To view, visit http://gerrit.cloudera.org:8080/10977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681 Gerrit-Change-Number: 10977 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 25 Jul 2018 01:41:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@289 PS4, Line 289: rpc_mgr->GetProxy clang-tidy failure: Missing status check. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 25 Jul 2018 00:43:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#4). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift 58 files changed, 1,019 insertions(+), 616 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/4 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 3: Still working on some targeted BE tests for some error cases but please feel free to go ahead to do another pass. -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 25 Jul 2018 00:01:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10855 to look at the new patch set (#3). Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. IMPALA-7213: Port ReportExecStatus() RPC to use KRPC This change converts ReportExecStatus() RPC from thrift based RPC to KRPC. This is done in part of the preparation for fixing IMPALA-2990 as we can take advantage of TCP connection multiplexing in KRPC to avoid overwhelming the coordinator with too many connections by reducing the number of TCP connection to one for each executor. This patch also introduces a new service pool for all query execution control related RPCs in the future so that control commands from coordinators aren't blocked by long-running DataStream services' RPCs. The majority of this patch is mechanical conversion of some Thrift structures used in ReportExecStatus() RPC to Protobuf. Note that the runtime profile is still retained as a Thrift structure as Impala clients will still fetch query profiles using Thrift RPCs. This also avoids duplicating the serialization implementation in both Thrift and Protobuf for the runtime profile. The Thrift runtime profiles are serialized and sent as a sidecar in ReportExecStatus() RPC. Change-Id: I7638583b433dcac066b87198e448743d90415ebe --- M be/src/benchmarks/expr-benchmark.cc M be/src/catalog/catalog-util.cc M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/rpc/jni-thrift-util.h M be/src/rpc/thrift-util-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-test.cc M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/scheduler-test-util.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h A be/src/service/control-service.cc A be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M be/src/util/container-util.h A be/src/util/error-util-internal.h M be/src/util/error-util-test.cc M be/src/util/error-util.cc M be/src/util/error-util.h M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M be/src/util/uid-util.h M bin/impala-config.sh M common/protobuf/CMakeLists.txt M common/protobuf/common.proto A common/protobuf/control_service.proto M common/protobuf/data_stream_service.proto M common/protobuf/row_batch.proto M common/protobuf/rpc_test.proto M common/thrift/ImpalaInternalService.thrift 59 files changed, 1,017 insertions(+), 617 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/3 -- To view, visit http://gerrit.cloudera.org:8080/10855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe Gerrit-Change-Number: 10855 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 ) Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC .. Patch Set 2: (36 comments) http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@17 PS2, Line 17: they > nit: that Done http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@19 PS2, Line 19: convertion > nit: conversion Done http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@25 PS2, Line 25: > Not to be pedantic, but a lot of Thrift structs are being removed as part o Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/exec/hdfs-parquet-table-writer.cc@1183 PS2, Line 1183: (*parquet_insert_stats_.mutable_per_column_size())[col_name] += : col_writer->total_compressed_size(); > nit: Should we have a level of indirection here? It's not very obvious what Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h File be/src/rpc/thrift-util.h: http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h@93 PS2, Line 93: Serialize > Should we explicitly qualify the names of these functions to avoid confusio Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h@103 PS2, Line 103: uint8_t* buffer; : uint32_t len; : mem_buffer_->getBuffer(&buffer, &len); : result->assign_copy(buffer, len); > I think you can avoid the copy here if you do: I believe faststring manages its own buffer internally so there is no easy way to share the internal buffer inside mem_buffer_ with the faststring. If I understand it correctly, mem_buffer_->getBuffer() shouldn't do any data copying so there should be only one copy into the faststring. http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h@35 PS2, Line 35: class TInsertResult; : class TFinalizeParams; : class TUpdateCatalogRequest; : class RuntimeProfile; : class HdfsTableDescriptor; > nit: Not your change, but the ordering is not alphabetical. Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h@126 PS2, Line 126: std::map > I'm wondering if it just makes sense to use google::protobuf::Map here inst The comment explicitly says "Uses ordered map so that iteration order is deterministic." so I am sticking to map for now. According to the specification of protobuf, google::protobuf::Map acts more like an unordered_map (https://developers.google.com/protocol-buffers/docs/proto#maps-features) - Wire format ordering and map iteration ordering of map values is undefined, so you cannot rely on your map items being in a particular order. http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc File be/src/runtime/dml-exec-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@83 PS2, Line 83: for (auto i = parquet_stats.per_column_size().begin(); :i != parquet_stats.per_column_size().end(); ++i) { > Range based for-loop with const ref. Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@95 PS2, Line 95: auto > Let's try to stay explicit about data types where we can. Also take it as a Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@95 PS2, Line 95: new_partition_status > nit: new_per_partition_status_map Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@96 PS2, Line 96: (auto iter = new_partition_status.begin(); iter != new_partition_status.end(); :++iter) > Why not use a range based for loop? It's much more readable. Also, referenc Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@404 PS2, Line 404: for (auto iter = files_to_move_.begin(); iter != files_to_move_.end(); ++iter) { > Range based for loop, and take iterator as const ref. Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@407 PS2, Line 407: for (auto iter = per_partition_status_.begin(); > Same here Done http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@453 PS2, Line 453: KuduDmlStatsPB* kudu_stats > This can be a reference, no reason to use a pointer here. kudu_stats is actually being modified here so using a pointer seems to fit the coding convention. http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10835 ) Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. Patch Set 6: Code-Review+2 Fix clang-tidy error. Carry Sailesh's +2. -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 23 Jul 2018 23:29:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Hello Sailesh Mukil, Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10835 to look at the new patch set (#6). Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. IMPALA-7212: Removes --use_krpc flag and remove old DataStream services This change removes the flag --use_krpc which allows users to fall back to using Thrift based implementation of DataStream services. This flag was originally added during development of IMPALA-2567. It has served its purpose. As we port more ImpalaInternalServices to use KRPC, it's becoming increasingly burdensome to maintain parallel implementation of the RPC handlers. Therefore, going forward, KRPC is always enabled. This change removes the Thrift based implemenation of DataStreamServices and also simplifies some of the tests which were skipped when KRPC is disabled. Testing done: core debug build. Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 --- M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/thrift-server-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/backend-client.h D be/src/runtime/data-stream-mgr-base.h D be/src/runtime/data-stream-mgr.h D be/src/runtime/data-stream-recvr-base.h D be/src/runtime/data-stream-recvr.cc D be/src/runtime/data-stream-recvr.h D be/src/runtime/data-stream-sender.cc D be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M tests/common/custom_cluster_test_suite.py M tests/common/skip.py D tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_krpc_metrics.py M tests/custom_cluster/test_rpc_exception.py M tests/query_test/test_codegen.py M tests/webserver/test_web_pages.py 44 files changed, 88 insertions(+), 2,143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/6 -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10835 ) Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. Patch Set 5: Code-Review+1 Rebased. Carry +1. -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 23 Jul 2018 18:22:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Hello Sailesh Mukil, Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10835 to look at the new patch set (#5). Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. IMPALA-7212: Removes --use_krpc flag and remove old DataStream services This change removes the flag --use_krpc which allows users to fall back to using Thrift based implementation of DataStream services. This flag was originally added during development of IMPALA-2567. It has served its purpose. As we port more ImpalaInternalServices to use KRPC, it's becoming increasingly burdensome to maintain parallel implementation of the RPC handlers. Therefore, going forward, KRPC is always enabled. This change removes the Thrift based implemenation of DataStreamServices and also simplifies some of the tests which were skipped when KRPC is disabled. Testing done: core debug build. Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 --- M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/thrift-server-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/backend-client.h D be/src/runtime/data-stream-mgr-base.h D be/src/runtime/data-stream-mgr.h D be/src/runtime/data-stream-recvr-base.h D be/src/runtime/data-stream-recvr.cc D be/src/runtime/data-stream-recvr.h D be/src/runtime/data-stream-sender.cc D be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M tests/common/custom_cluster_test_suite.py M tests/common/skip.py D tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_krpc_metrics.py M tests/custom_cluster/test_rpc_exception.py M tests/query_test/test_codegen.py M tests/webserver/test_web_pages.py 44 files changed, 88 insertions(+), 2,143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/5 -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 5 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10813 ) Change subject: IMPALA-7163: Implement a state machine for the QueryState class .. Patch Set 6: (13 comments) http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc@a113 PS6, Line 113: May make sense to replace this by checking the state. Doing a racy read of the state should be fine. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@198 PS6, Line 198: struct FInstanceStatus { If you take the suggestion to only keep a single 'query_status_', there doesn't seem to be need for this struct. We can replace it with a single class member to record the fragment instance ID of the first failed fragment instance. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@212 PS6, Line 212: TUniqueId finst_id_; This is only meaningful when there is an error status, right ? Please document it. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@234 PS6, Line 234: FragmentInstanceState fragment instance http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@271 PS6, Line 271: INITIALIZED, Consider calling it "PREPARING" http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@274 PS6, Line 274: PREPARED, Consider calling it "EXECUTING" http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308 PS6, Line 308: /// The overall status of the Prepare phase for this query state. : FInstanceStatus prepare_status_; : : /// The overall status of the Exec phase (after Prepare()) for this query state. : FInstanceStatus exec_status_; : : /// Protects 'prepare_status_' and 'exec_status_'. : SpinLock status_lock_; It seems simpler to keep track of a single status for the first error hit instead of tracking the errors in different phases. The behavior in the existing code is that the first fragment instance which hits the error will report it to the coordinator directly. In theory, a fragment instance F1 can hit an error during exec after prepare phase is done while another fragment instance F2 can hit error during its prepare phase after F1 hits the error. The new patch seems to set query_status_ to the error of F2 while the coordinator will get the error of F1. So, this seems to cause divergence with the coordinator. So, in that sense, it's necessary to keep a single status only. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@320 PS6, Line 320: FInstanceStatus query_status_; Mind documenting the thread safety about this variable ? http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@238 PS6, Line 238: case BackendExecState::INITIALIZED: : case BackendExecState::PREPARED: : DCHECK(query_status_.ok()) << query_status_.ToString(); : if (!status.ok()) { : // Error while executing - go to ERROR state. : query_status_ = finst_status; : backend_exec_state_ = BackendExecState::ERROR; Can this be merged with TransitionNonTerminalState() above as: if (query_status_.IsCancelled()) { new_state = BackendExecState::CANCELLED; } else if (!query_status_.ok()) { new_state = BackendExecState::ERROR; } else { new_state = state == PREPARING ? EXECUTING : FINISHED; } http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@250 PS6, Line 250: case BackendExecState::FINISHED Under what circumstances would we call QueryState::UpdateBackendExecState() after entering a terminal state ? http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@457 PS6, Line 457: ErrorDuringPrepare( Won't it be a problem if we call ErrorDuringPrepare() here as it only bump the instances_prepared_barrier_ by one instead of the number of remaining fragment instances ? In that sense, won't caller of WaitForPrepare() wait forever ? Also, the thread creation failure path warrants a targeted test case. May be a debug action can be added in Thread::Create(). http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@460 PS6, Line 460: return; Seems wrong to return here. Don't we need to call ReportExecStatusAux() like we did in the past. http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@463 PS6, Line 463: all_fis_status = WaitFor
[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10977 ) Change subject: IMPALA-7296: bytes limit for row batch queue .. Patch Set 5: Hi Tim, will it be simpler in general to just use max_bytes instead of both (max_batches, max_bytes) to control the queue capacity ? Can you please elaborate on the reasoning for not doing so ? -- To view, visit http://gerrit.cloudera.org:8080/10977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681 Gerrit-Change-Number: 10977 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Jul 2018 17:27:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7298: Stop passing IP address as hostname in Kerberos principal
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10980 ) Change subject: IMPALA-7298: Stop passing IP address as hostname in Kerberos principal .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/10980/1/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/10980/1/be/src/runtime/krpc-data-stream-sender.cc@598 PS1, Line 598: destinations[i].thrift_backend.hostname, > Instead of adding this new parameter, would it be possible to ensure that " destinations[i].krpc_backend.hostname is actually set to the resolved IP address of hostname so we have to pull it from thrift_backend.hosname. As we remove Thrift from communications between Impala backends in the future, we need to move thrift_backend.hostname into a separate field in TPlanFragmentDestination. -- To view, visit http://gerrit.cloudera.org:8080/10980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e Gerrit-Change-Number: 10980 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 19 Jul 2018 00:47:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7298: Stop passing IP address as hostname in Kerberos principal
Michael Ho has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10980 Change subject: IMPALA-7298: Stop passing IP address as hostname in Kerberos principal .. IMPALA-7298: Stop passing IP address as hostname in Kerberos principal Previously, we pass the resolved IP address of a KRPC destination host as the hostname when creating a proxy for making KRPC calls. This may lead to connection negotiation failure in KRPC when Kerberos is enabled. In particular, if reversed DNS isn't enabled in Kerberos, KDC may fail to look up the principal of the destination host if the principal includes the hostname instead of resolved IP address. This change fixes the problem above by passing the actual hostname of the destination host when calling RpcMgr::GetProxy(). rpc-mgr-kerberized-test.cc is also updated to use hostname instead of the resolved IP address as Kerberos principal. Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e --- M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/scheduling/scheduler.cc M common/thrift/ImpalaInternalService.thrift 10 files changed, 33 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/10980/1 -- To view, visit http://gerrit.cloudera.org:8080/10980 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e Gerrit-Change-Number: 10980 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Hello Sailesh Mukil, Joe McDonnell, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10835 to look at the new patch set (#4). Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. IMPALA-7212: Removes --use_krpc flag and remove old DataStream services This change removes the flag --use_krpc which allows users to fall back to using Thrift based implementation of DataStream services. This flag was originally added during development of IMPALA-2567. It has served its purpose. As we port more ImpalaInternalServices to use KRPC, it's becoming increasingly burdensome to maintain parallel implementation of the RPC handlers. Therefore, going forward, KRPC is always enabled. This change removes the Thrift based implemenation of DataStreamServices and also simplifies some of the tests which were skipped when KRPC is disabled. Testing done: core debug build. Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 --- M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/thrift-server-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/backend-client.h D be/src/runtime/data-stream-mgr-base.h D be/src/runtime/data-stream-mgr.h D be/src/runtime/data-stream-recvr-base.h D be/src/runtime/data-stream-recvr.cc D be/src/runtime/data-stream-recvr.h D be/src/runtime/data-stream-sender.cc D be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M tests/common/custom_cluster_test_suite.py M tests/common/skip.py D tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_krpc_metrics.py M tests/custom_cluster/test_rpc_exception.py M tests/query_test/test_codegen.py M tests/webserver/test_web_pages.py 44 files changed, 88 insertions(+), 2,143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/4 -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 4 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10835 ) Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream services .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG@7 PS3, Line 7: Deprecate > Nit: I think it would be clearer to say "Remove" rather than "Deprecate". Done http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG@9 PS3, Line 9: deprecates > Same here. Done http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc@40 PS3, Line 40: DEFINE_int32_hidden(krpc_port, 27000, : "port on which KRPC based ImpalaInternalService is exported"); > Now that we're exposing this, we should probably document this (if we're no Internal doc bug filed. http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc@260 PS3, Line 260: REMOVED_FLAG(disable_mem_pools); > What are our rules about removing startup flags? Should use_krpc be here? ( Done http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py@a73 PS1, Line 73: > So that goes to say that we won't benefit much from running with this for R Yes -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Jul 2018 05:33:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10933 ) Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule() .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/10933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0 Gerrit-Change-Number: 10933 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Fri, 13 Jul 2018 00:35:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10933 ) Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule() .. Patch Set 1: Another idea is to inject fault at expression codegen functions to randomly fail codegen. This may not cover cases like scanners but should still provide reasonable coverage for other exec nodes. -- To view, visit http://gerrit.cloudera.org:8080/10933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0 Gerrit-Change-Number: 10933 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 12 Jul 2018 23:06:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10933 ) Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule() .. Patch Set 1: > > Change LGTM. May be it helps to add some end-to-end test which > > exercises the hashT table paths with CHAR type to confirm it's > > fixed. Same can be said about the scanner changes. > > > > In the long run, every codegen changes which include handcrafted > IR > > should include CHAR type as part of the tests. > > CHAR type was a part of the manual testing that i did for > IMPALA-6177. The thing that I am concerned about is that if we > decide to add tests for CHAR for only the changes in this patch, > then to get full coverage we would also have to add test cases that > exercise all handcrafted IR generation code paths that have failure > modes for CHAR There may be ways to automate it but I agree that's a lot of cases to cover. We may not need to take on all of them in this patch. My suspicion is that we already added some test cases when codegen logic was introduced in exec nodes so they may need to be extended only to cover CHAR. That said, it is a bit tedious and ideally we want some form of query generator to generate these queries. -- To view, visit http://gerrit.cloudera.org:8080/10933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0 Gerrit-Change-Number: 10933 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 12 Jul 2018 23:04:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10933 ) Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule() .. Patch Set 1: Change LGTM. May be it helps to add some end-to-end test which exercises the hashT table paths with CHAR type to confirm it's fixed. Same can be said about the scanner changes. In the long run, every codegen changes which include handcrafted IR should include CHAR type as part of the tests. -- To view, visit http://gerrit.cloudera.org:8080/10933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0 Gerrit-Change-Number: 10933 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 12 Jul 2018 22:52:03 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Added KRPC port to the list of Impala ports
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10930 ) Change subject: [DOCS] Added KRPC port to the list of Impala ports .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic42930296fded6dcd4b60fba9efa1c60a0ee4944 Gerrit-Change-Number: 10930 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 12 Jul 2018 20:44:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4669: [KUTIL] Add kudu util library to the build.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10758 ) Change subject: IMPALA-4669: [KUTIL] Add kudu_util library to the build. .. Patch Set 12: Code-Review+2 I looked at the diff between v10 and v12. -- To view, visit http://gerrit.cloudera.org:8080/10758 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2 Gerrit-Change-Number: 10758 Gerrit-PatchSet: 12 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Thu, 12 Jul 2018 20:43:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10835 to look at the new patch set (#3). Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services .. IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services This change deprecates the flag --use_krpc which allows users to fall back to using Thrift based implementation of DataStream services. This flag was originally added during development of IMPALA-2567. It has served its purpose. As we port more ImpalaInternalServices to use KRPC, it's becoming increasingly burdensome to maintain parallel implementation of the RPC handlers. Therefore, going forward, KRPC is always enabled. This change removes the Thrift based implemenation of DataStreamServices and also simplifies some of the tests which were skipped when KRPC is disabled. Testing done: core debug build. Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 --- M be/src/common/global-flags.cc M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/thrift-server-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/backend-client.h D be/src/runtime/data-stream-mgr-base.h D be/src/runtime/data-stream-mgr.h D be/src/runtime/data-stream-recvr-base.h D be/src/runtime/data-stream-recvr.cc D be/src/runtime/data-stream-recvr.h D be/src/runtime/data-stream-sender.cc D be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M tests/common/custom_cluster_test_suite.py M tests/common/skip.py D tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_krpc_metrics.py M tests/custom_cluster/test_rpc_exception.py M tests/query_test/test_codegen.py M tests/webserver/test_web_pages.py 44 files changed, 87 insertions(+), 2,142 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/3 -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10835 to look at the new patch set (#2). Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services .. IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services This change deprecates the flag --use_krpc which allows users to fall back to using Thrift based implementation of DataStream services. This flag was originally added during development of IMPALA-2567. It has served its purpose. As we port more ImpalaInternalServices to use KRPC, it's becoming increasingly burdensome to maintain parallel implementation of the RPC handlers. Therefore, going forward, KRPC is always enabled. This change removes the Thrift based implemenation of DataStreamServices and also simplifies some of the tests which were skipped when KRPC is disabled. Testing done: core debug build. Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 --- M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/rpc/authentication.cc M be/src/rpc/rpc-mgr-kerberized-test.cc M be/src/rpc/thrift-server-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/backend-client.h D be/src/runtime/data-stream-mgr-base.h D be/src/runtime/data-stream-mgr.h D be/src/runtime/data-stream-recvr-base.h D be/src/runtime/data-stream-recvr.cc D be/src/runtime/data-stream-recvr.h D be/src/runtime/data-stream-sender.cc D be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/scheduler.cc M be/src/service/data-stream-service.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M bin/run-all-tests.sh M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M tests/common/custom_cluster_test_suite.py M tests/common/skip.py D tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_krpc_metrics.py M tests/custom_cluster/test_rpc_exception.py M tests/query_test/test_codegen.py M tests/webserver/test_web_pages.py 43 files changed, 86 insertions(+), 2,141 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/2 -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10835 ) Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc File be/src/runtime/data-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@23 PS1, Line 23: #include "common/status.h" > I think I've found all the dead code in this patch, but if you want a refer Thanks for checking. http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@128 PS1, Line 128: class DataStreamTest : public testing::Test { : protected: : DataStreamTest() : next_val_(0) { : // Stop tests that rely on mismatched sender / receiver pairs timing out from failing. : > No longer needed. Done http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@134 PS1, Line 134: ~DataStreamTest() { runtime_state_->ReleaseResources(); } : : virtual void SetUp() { : > No longer needed. Done http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@139 PS1, Line 139: erPool(32 * 1024, 1024 * 1024 * 1024, 32 * 1024); > You can replace this with: Done http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py@a73 PS1, Line 73: > Can't this happen without TransmitData() today? Don't we also call it on R ReportExecStatus() can be very timing sensitive as the number of iterations it's invoked depends on how long the query runs for. -- To view, visit http://gerrit.cloudera.org:8080/10835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3 Gerrit-Change-Number: 10835 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Jul 2018 22:02:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7006: Pick parts of recent Kudu gutil changes
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10769 ) Change subject: IMPALA-7006: Pick parts of recent Kudu gutil changes .. Patch Set 14: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc File be/src/gutil/strings/substitute.cc: http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc@15 PS14, Line 15: > I only picked those parts of that commit that had an effect on Impala's cod I see. Makes sense. -- To view, visit http://gerrit.cloudera.org:8080/10769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb Gerrit-Change-Number: 10769 Gerrit-PatchSet: 14 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Jul 2018 21:45:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7095: clean up scan node profiles
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10810 ) Change subject: IMPALA-7095: clean up scan node profiles .. Patch Set 8: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/10810/7/be/src/exec/kudu-scan-node-base.h File be/src/exec/kudu-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/10810/7/be/src/exec/kudu-scan-node-base.h@41 PS7, Line 41: : virtual Status Prepare(RuntimeState* state) override; : virtual Status Open(RuntimeState* state) override; : virtual Status GetNext(RuntimeState* state, RowBatch* row_batch, bool* eos) : override = 0; : protected: : virtual void DebugString(int indentation_level, std::stringstream* out) const override; > It's not really necessary now that the clang precommit builds detect droppe It seems that we have them in some places (hdfs-scan-node-base.h) in this patch but not others. Just a bit inconsistent. http://gerrit.cloudera.org:8080/#/c/10810/8/be/src/exec/scan-node.h File be/src/exec/scan-node.h: http://gerrit.cloudera.org:8080/#/c/10810/8/be/src/exec/scan-node.h@227 PS8, Line 227: Not thread-safe. Is it expected that the caller is supposed to hold certain locks before calling this function or is it called from a particular thread context only ? -- To view, visit http://gerrit.cloudera.org:8080/10810 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787 Gerrit-Change-Number: 10810 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 11 Jul 2018 19:33:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7006: Pick parts of recent Kudu gutil changes
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10769 ) Change subject: IMPALA-7006: Pick parts of recent Kudu gutil changes .. Patch Set 14: (2 comments) http://gerrit.cloudera.org:8080/#/c/10769/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10769/14//COMMIT_MSG@11 PS14, Line 11: Rename constants Can you please also specify the commit (e719b5ef) ? http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc File be/src/gutil/strings/substitute.cc: http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc@15 PS14, Line 15: Is the change in gutil/strings/escaping.cc missing ?(https://github.com/apache/kudu/commit/e719b5eff2a47f61f9e6ee1f18d2055247f45847#diff-94c169876895d2a2b452a78355d51e57) -- To view, visit http://gerrit.cloudera.org:8080/10769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb Gerrit-Change-Number: 10769 Gerrit-PatchSet: 14 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Jul 2018 18:51:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7006: [KSECURITY] Update security library integration
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10759 ) Change subject: IMPALA-7006: [KSECURITY] Update security library integration .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10759 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifab51d887f5e771ad62eeddc14b9c47f42c3130d Gerrit-Change-Number: 10759 Gerrit-PatchSet: 9 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 11 Jul 2018 01:58:28 + Gerrit-HasComments: No