[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/authentication.cc@734
PS5, Line 734: return Status(Substitute("Bad --krb5_ccname value: $0 is not 
an absolute file path",
> nit: use the user-facing name --krb5_ccname instead of FLAGS_... which is i
Done


http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc
File be/src/rpc/rpc-mgr-kerberized-test.cc:

http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr-kerberized-test.cc@114
PS5, Line 114:   controller.Reset();
 :   rpc_status =
 :   FromKuduStatus(scan_proxy
> So, this is passing only because the Authorize() function for ScanMemServic
Yes, I have some comments about it on line 99-100 but yes, I should move them 
to a location closer to the actual execution of the test to make it clearer.


http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11331/5/be/src/rpc/rpc-mgr.cc@172
PS5, Line 172: LOG(ERROR) << Substitute("Rejecting unauthorized access to 
$0 from $1. Expected "
> sorry, think I wasn't entirely clear. I think the error message should stil
Lol. It's indeed a bit scary to not have the word "Reject" in the error 
message. Fixed. Thanks.



--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 29 Aug 2018 19:08:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 6: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 29 Aug 2018 19:09:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-29 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11331

to look at the new patch set (#6).

Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamService. The authorization method will only let authenticated
principal which matches FLAGS_principal / FLAGS_be_principal to access the 
service.
Also added a new startup flag --krb5_ccname to allow users to customize the 
locations
of the Kerberos credentials cache.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M bin/rat_exclude_files.txt
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
19 files changed, 330 insertions(+), 135 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/6
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50
PS2, Line 50: krb5_ccname
> That's fair. Maybe we should add some validation, then, that the path is an
Good point. Validation added in AuthManager::InitKerberosEnv()


http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11331/4/be/src/rpc/rpc-mgr.cc@174
PS4, Line 174: mem_tracker->Release(context->GetTransferSize());
> the log message probably include the 'requestor_string' which includes the
Good point. This definitely helps debugging.


http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc
File be/src/util/auth-util.cc:

http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84
PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* 
service_name,
> In terms of whether to keep the change in the patch, I figured that since t
Fair enough. Reverted the change and left a TODO.



--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 29 Aug 2018 05:00:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11331

to look at the new patch set (#5).

Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamServices. The authorization method will only let authenticated
principal which matches the expected service name to access the service.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M bin/rat_exclude_files.txt
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
19 files changed, 328 insertions(+), 135 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/5
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out

2018-08-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11339 )

Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11339/2//COMMIT_MSG@21
PS2, Line 21: (verified via code inspection)
> I'd appreciate the code reviewer to verify this as well.
Did some manual verification to ensure all resources released in 
QueryState::ReleaseExecResources() are only referenced by backend executors 
except for query_mem_tracker_ which is a control structure.


http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.cc@93
PS2, Line 93: set_query_exec_finished();
This is apparently beyond the scope of this change:

Does it make sense for some code in MemTracker to DCHECK / CHECK against this 
value to make sure query_mem_tracker_ is not used after this flag is set ?



--
To view, visit http://gerrit.cloudera.org:8080/11339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d
Gerrit-Change-Number: 11339
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Aug 2018 01:20:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7464: fix race when ExecFInstance() RPC times out

2018-08-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11339 )

Change subject: IMPALA-7464: fix race when ExecFInstance() RPC times out
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@161
PS2, Line 161: ReleaseExecResourceRefcount()
ReleaseBackendResourceRefcount()


http://gerrit.cloudera.org:8080/#/c/11339/2/be/src/runtime/query-state.h@186
PS2, Line 186:   void AcquireBackendResourceRefcount();
Should we add a remark that this should not be called from coordinator and cite 
this JIRA as a problem ?



--
To view, visit http://gerrit.cloudera.org:8080/11339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If60d983e0e68b00e6557185db1f86757ab8b3f2d
Gerrit-Change-Number: 11339
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Aug 2018 01:09:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50
PS2, Line 50: krb5_ccname
> Should this be hidden? I dont think we'd want to support users changing it,
I actually prefer to keep it public so the user can configure it for cases such 
as KUDU-2545. I will add a remark about its being advanced option.


http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.h@25
PS3, Line 25: #include "rpc/authentication.h"
: #include "rpc/impala-service-pool.h"
: #include "util/auth-util.h"
> are these new includes necessary? I don't see any new usages of types in th
Left over from previous patch. Removed.


http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@172
PS3, Line 172: mem_tracker->Release(context->GetTransferSize());
> slight nit (though I guess this happens elsewhere also): shouldn't we wait
Sure. Will do it as a clean-up in a follow-on patch. That said, my 
understanding is that the reply itself is processed asynchronously.


http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/rpc/rpc-mgr.cc@173
PS3, Line 173: context->RespondFailure(kudu::Status::NotAuthorized(
> should we also log the unauthorized access attempt (perhaps including the e
Done


http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc
File be/src/util/auth-util.cc:

http://gerrit.cloudera.org:8080/#/c/11331/3/be/src/util/auth-util.cc@84
PS3, Line 84: Status ParseKerberosPrincipal(const string& principal, string* 
service_name,
> I wonder whether we should just be using krb5_parse_name here instead of im
When writing this patch, I noticed the previous implementation made two calls 
split() which seems unnecessary. I fixed it as part of this patch as I was 
calling it in the previous version of this patch but this is not needed anymore 
in the new version. It doesn't hurt to keep it though.

I am not sure why krb5_parse_name() wasn't used here in the previous 
implementation. Given the Kudu code has better infrastructure, it seems easier 
for me to add a utility function to Kudu and call it here instead. Filed 
IMPALA-7504 for this.



--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 29 Aug 2018 00:24:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11331

to look at the new patch set (#4).

Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamServices. The authorization method will only let authenticated
principal which matches the expected service name to access the service.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M bin/rat_exclude_files.txt
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
19 files changed, 315 insertions(+), 146 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/4
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/common/global-flags.cc@50
PS2, Line 50: krb5_ccname
> nit: Maybe krb5_ccpath ? Or krb5ccname_path
Seems more intuitive to have a name which resembles the env var KRB5CCNAME. 
That env variable indicates the location of the credentials cache so it seems a 
bit redundant to add path to its name as we also don't use krb_conf_path below. 
Please let me know if you feel strongly about it though.

 KRB5CCNAME  Location  of  the  Kerberos  5   credentials
 (ticket) cache.


http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.h@216
PS2, Line 216:
> Just to add, if we're going to keep the service name fixed, this should pro
Variable removed as this is no longer necessary per suggestion to use 
GetLoggedInUserNameFromKeytab().


http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@118
PS2, Line 118: nused_realm));
> I think this should actually be hardcoded to "impala" or something rather t
For all practical purposes, that's true.

However, doing so will also require hardcoding part of FLAGS_principal / 
FLAGS_be_principal as that's what's used for kinit. Hardcoding the principal is 
almost the same as deprecating FLAGS_principal / FLAGS_be_principal and 
switching to using a boolean flag FLAGS_kerberos_enabled. This seems to be a 
larger scope than I feel comfortable with for this change. I need more thought 
about this as I have to understand the historical context for having 
FLAGS_principal / FLAGS_be_principal to begin with.


http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@161
PS2, Line 161:   // Authorization is enforced iff Kerberos is enabled.
> invert this condition so that you can un-nest the majority of the function.
Done


http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/rpc/rpc-mgr.cc@164
PS2, Line 164: ername matches that of the kinit'ed principal.
 :   const RemoteUser& remote_user = context->remote_user
> sorry I skipped half of what I meant to say in this comment. What I meant i
Good point. Done.


http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h
File be/src/testutil/mini-kdc-wrapper.h:

http://gerrit.cloudera.org:8080/#/c/11331/2/be/src/testutil/mini-kdc-wrapper.h@43
PS2, Line 43:   static Status SetupAndStartMiniKDC(std::string realm,
:   std::string ticket_lifetime, std::string renew_lifetime,
> why these changes? it's better to take the arguments by copy, and then std:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 28 Aug 2018 18:27:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-28 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11331

to look at the new patch set (#3).

Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamServices. The authorization method will only let authenticated
principal which matches the expected service name to access the service.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M bin/rat_exclude_files.txt
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
19 files changed, 311 insertions(+), 142 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/3
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-27 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11331

to look at the new patch set (#2).

Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamServices. The authorization method will only let authenticated
principal which matches the expected service name to access the service.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M bin/rat_exclude_files.txt
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
19 files changed, 315 insertions(+), 143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/2
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-27 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc@608
PS1, Line 608: sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, 
APP_NAME.c_str());
> Todd pointed out during an offline discussion that KRPC internally implemen
Wasn't very clear in my previous update. I was trying to say:

The per-connection authorization (instead of per-RPC authorization) may save 
some CPU overhead by avoiding the need to parse the principal on every RPC 
call. It's unclear how much we are saving in the grand scheme of things though.



--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 27 Aug 2018 19:05:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-27 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11331 )

Change subject: Add missing authorization in KRPC
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/11331/1/be/src/rpc/authentication.cc@608
PS1, Line 608: sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, 
APP_NAME.c_str());
> Alternatively, you could add these callbacks to the Kudu callback list for
Todd pointed out during an offline discussion that KRPC internally implements 
authorization by overriding the authz_method defined in the RPC service. The 
authz_method is invoked for all RPCs.

So, adding a SASL callback in some sense is adding another path which 
authorization is done. The argument for it is that it's a per-connection 
authorization instead of a per-RPC authorization. The latter may be needed if 
we need finer grained authorization (e.g. RPC method level authorization) so we 
can save some CPU overhead of having to parse the principal repeatedly for each 
RPC. It's unclear how much we are saving in the grand scheme of things.

I will explore the alternative of overriding the SASL callbacks but IMHO, using 
the existing mechanism in KRPC to do the authorization (which is what's 
implemented in this patch) seems to be the easiest path for now.

May help if Todd or others can chime in on their opinion. I think it may make 
sense to explore the option of adding a way for caller to specify Sasl 
callbacks as part of constructing a messenger.



--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 27 Aug 2018 18:57:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add missing authorization in KRPC

2018-08-25 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11331


Change subject: Add missing authorization in KRPC
..

Add missing authorization in KRPC

In 2.12.0, Impala adopted Kudu RPC library for certain backened services
(TransmitData(), EndDataStream()). While the implementation uses Kerberos
for authenticating users connecting to the backend services, there is no
authorization implemented. This is a regression from the Thrift based
implementation because it registered a SASL callback (SaslAuthorizeInternal)
to be invoked during the connection negotiation. With this regression,
an unauthorized but authenticated user may invoke RPC calls to Impala backend
services.

This change fixes the issue above by overriding the default authorization method
for the DataStreamServices. The authorization method will only let authenticated
principal which matches the expected service name to access the service.

Testing done:
1. Added a new test case in rpc-mgr-kerberized-test.cc to confirm an 
unauthorized
user is not allowed to access the service.
2. Ran some queries in a Kerberos enabled cluster to make sure there is no 
error.
3. Exhaustive builds.

Thanks to Todd Lipcon for pointing out the problem and his guidance on the fix.

Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
---
M be/src/common/global-flags.cc
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
R be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/data-stream-test.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/testutil/mini-kdc-wrapper.cc
M be/src/testutil/mini-kdc-wrapper.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M common/protobuf/data_stream_service.proto
A common/protobuf/kudu
M common/protobuf/rpc_test.proto
18 files changed, 314 insertions(+), 143 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11331/1
--
To view, visit http://gerrit.cloudera.org:8080/11331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2f82dee5e721f2ed23e75fd91abbc6ab7addd4c5
Gerrit-Change-Number: 11331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@265
PS10, Line 265:   DCHECK(!instance_stats->done_ ||
  :   report_version == 
instance_stats->last_report_version_);
> isn't it possible (though unlikely) that you have some old report sitting i
Yup. That's actually a DCHECK which Sailesh ran into when testing IMPALA-4063. 
Removed in patch 11.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287
PS10, Line 287: ReportVersion
> if I understand the threading correctly, there are two entry points for sen
To answer the question about your second race, I think it's impossible today 
because we explicitly turn off the reporting thread before sending the final 
profile 
(https://github.com/apache/impala/blob/master/be/src/runtime/fragment-instance-state.cc#L479-L496)

The first race is also kind of impossible today because we wait for the report 
thread to exit before sending the final report but yes, it's possible that the 
RPC in the report thread fails somehow and we just drop those error statuses on 
the floor.



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 22 Aug 2018 01:58:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-21 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#11).

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241 which may lead to duplicated
dml stats being applied. The fix is by adding a monotonically
increasing version number for fragment instances' reports. The
coordinator will ignore any report smaller than or equal to the
version in the last report.

Testing done: core build. Added some targeted test cases for profile
serialization failure and RPC retry.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
60 files changed, 1,106 insertions(+), 670 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/11
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11270 )

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..


Patch Set 4:

GVO failed due to IMPALA-6776 / IMPALA-7061


--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 21 Aug 2018 06:21:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7433: reduce logging on executors

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11202 )

Change subject: IMPALA-7433: reduce logging on executors
..


Patch Set 4:

Lars, do you also want to take a look ?


--
To view, visit http://gerrit.cloudera.org:8080/11202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff
Gerrit-Change-Number: 11202
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 21 Aug 2018 00:06:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11270 )

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..


Patch Set 3: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 20 Aug 2018 22:05:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11270 )

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@344
PS2, Line 344: instances_prepare_promise_
> This variable doesn't exist.
Fixed.


http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@345
PS2, Line 345: accessor
> reader? Since writes by the startup thread are allowed, and in fact require
Yes, was following the terminology accessor/mutator. Using the term "readers" 
make it clearer.


http://gerrit.cloudera.org:8080/#/c/11270/2/be/src/runtime/query-state.h@351
PS2, Line 351: accessor
> readers
Done



--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 20 Aug 2018 22:05:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11270

to look at the new patch set (#3).

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..

IMPALA-7422: Fix a race in QueryState::StartFInstances()

A recent commit for IMPALA-7163 (cbc8c63) introduced a race between
insertion into QueryState::fragment_map_ and the thread creation.
In particular, after the aforementioned commit, the counting barrier
'instances_prepared_barrier_' is used for synchronizing callers of
Cancel()/PublishFilter() and the PREPARE phase of fragment instances.
Cancel()/PublishFilter() cannot proceed until all fragment instances
have finished preparing; 'instances_prepared_barrier_' is updated by
fragment instances once each of them is done preparing.

The race is due to the fact that QueryState::StartFInstances() doesn't
insert the fragment instance into 'fragment_map_' until after the fragment
instance thread has been spawned. So, it's possible for the newly spawned
thread to finish preparing and update the counting barrier before the insertion
into 'fragment_map_' happens. It's therefore possible for PublishFilter() to
have gotten unblocked before a fragment is inserted into 'fragment_map_',
triggering the DCHECK() in IMPALA-7422.

This change fixes the race by moving the insertion into fragment_map_
before the thread is spawned.

Testing done: Exhaustive debug + release builds which previously ran into this 
race

Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
---
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
2 files changed, 14 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/3
-- 
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7433: reduce logging on executors

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11202 )

Change subject: IMPALA-7433: reduce logging on executors
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc@356
PS3, Line 356: VLOG(2)
> We do already have log messages before and after this. I might be missing s
Yes, please do add logging for errors in QueryState::Init().



--
To view, visit http://gerrit.cloudera.org:8080/11202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff
Gerrit-Change-Number: 11202
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 20 Aug 2018 21:54:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7433: reduce logging on executors

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11202 )

Change subject: IMPALA-7433: reduce logging on executors
..


Patch Set 4: Code-Review+1

Thanks for the explanation.


--
To view, visit http://gerrit.cloudera.org:8080/11202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff
Gerrit-Change-Number: 11202
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 20 Aug 2018 21:45:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11270 )

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11270/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11270/1//COMMIT_MSG@14
PS1, Line 14: cnanot
> nit: type
Done


http://gerrit.cloudera.org:8080/#/c/11270/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11270/1/be/src/runtime/query-state.cc@403
PS1, Line 403: // Update fragment_map_. Has to happen before the thread is 
spawned below or
> Isn't a race still possible between this thread modifying the map in a subs
'fragment_map_' is supposed to be modified by the thread which calls 
StartFInstances(). Other threads may not access 'fragment_map_' until all 
fragment instances have finished preparing (i.e. they are blocked on 
WaitForPrepare()). At this point, there is at least one fragment instance which 
hasn't yet been started so no other thread should be accessing fragment_map_.

Added some comments in query-state.h.



--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 20 Aug 2018 20:39:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11270

to look at the new patch set (#2).

Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..

IMPALA-7422: Fix a race in QueryState::StartFInstances()

A recent commit for IMPALA-7163 (cbc8c63) introduced a race between
insertion into QueryState::fragment_map_ and the thread creation.
In particular, after the aforementioned commit, the counting barrier
'instances_prepared_barrier_' is used for synchronizing callers of
Cancel() and PublishFilter() and the PREPARE phase of fragment instances.
Cancel() and PublishFilter() cannot proceed until all fragment instances
have finished preparing; 'instances_prepared_barrier_' is updated by
fragment instances once each of them is done preparing.

The race is due to the fact that QueryState::StartFInstances() doesn't
insert the fragment instance into 'fragment_map_' until after the fragment
instance thread has been spawned. So, it's possible for the newly spawned
thread to finish preparing and update the counting barrier before the insertion
into 'fragment_map_' happens. It's therefore possible for, PublishFilter() to
have gotten unblocked before a fragment is inserted into 'fragment_map_',
triggering the DCHECK() in IMPALA-7422.

This change fixes the race by moving the insertion into fragment_map_
before the thread is spawned.

Testing done: Exhaustive debug + release builds which previously ran into this 
race

Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
---
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
2 files changed, 13 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/2
--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7433: reduce logging on executors

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11202 )

Change subject: IMPALA-7433: reduce logging on executors
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-exec-mgr.cc@155
PS3, Line 155: VLOG(2)
This may be useful for debugging issue such as IMPALA-7194. Not sure if the 
newly added line below is sufficient as it may help to see when each fragment 
instance finishes.


http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11202/3/be/src/runtime/query-state.cc@356
PS3, Line 356: VLOG(2)
This one may be useful to keep to indicate whether we manage to get to the 
point of starting fragment instances after creating the QueryState. While we 
may consider bumping the log level when problem occurs, this may be a bit hard 
in practice if the problem is due to a race which doesn't occur consistently.



--
To view, visit http://gerrit.cloudera.org:8080/11202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c1db44acc6def2b05a4fd032c63716e08cdf5ff
Gerrit-Change-Number: 11202
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Mon, 20 Aug 2018 20:18:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7422: Fix a race in QueryState::StartFInstances()

2018-08-20 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11270


Change subject: IMPALA-7422: Fix a race in QueryState::StartFInstances()
..

IMPALA-7422: Fix a race in QueryState::StartFInstances()

A recent commit for IMPALA-7163 (cbc8c63) introduced a race between
insertion into QueryState::fragment_map_ and the thread creation.
In particular, after the aforementioned commit, the counting barrier
'instances_prepared_barrier_' is used for synchronizing callers of
Cancel() and PublishFilter() and the PREPARE phase of fragment instances.
Cancel() and PublishFilter() cnanot proceed until all fragment instances
have finished preparing; 'instances_prepared_barrier_' is updated by
fragment instances once each of them is done preparing.

The race is due to the fact that QueryState::StartFInstances() doesn't
insert the fragment instance into 'fragment_map_' until after the fragment
instance thread has been spawned. So, it's possible for the newly spawned
thread to finish preparing and update the counting barrier before the insertion
into 'fragment_map_' happens. It's therefore possible for, PublishFilter() to
have gotten unblocked before a fragment is inserted into 'fragment_map_',
triggering the DCHECK() in IMPALA-7422.

This change fixes the race by moving the insertion into fragment_map_
before the thread is spawned.

Testing done: Exhaustive debug + release builds which previously ran into this 
race

Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
---
M be/src/runtime/query-state.cc
1 file changed, 8 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/11270/1
--
To view, visit http://gerrit.cloudera.org:8080/11270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I35f2a5b0ea5143703850ffc229cec0e4294e6a3e
Gerrit-Change-Number: 11270
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.

2018-08-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11234 )

Change subject: IMPALA-7444: Improve logging of opening/closing/expiring 
sessions.
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a
Gerrit-Change-Number: 11234
Gerrit-PatchSet: 4
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Sat, 18 Aug 2018 00:32:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender

2018-08-17 Thread Michael Ho (Code Review)
Hello Lars Volker, Csaba Ringhofer, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11241

to look at the new patch set (#3).

Change subject: IMPALA-7449: Fix network throughput calculation of 
DataStreamSender
..

IMPALA-7449: Fix network throughput calculation of DataStreamSender

Currently, the network throughput presented in the query profile
for DataStreamSender is computed by dividing the total bytes sent
by the total network time which is the sum of observed network time
of all individual RPCs. This is wrong in general and may only make
sense if the network throughput is fixed. In addition, RPCs are
asynchronous and they overlap with each other. So, dividing the
total byte sent by network throughput may result in time which exceeds
the wall clock time, making it impossible to interpret.

This change fixes the problem by measuring the network throughput
of each individual RPC and uses a summary counter to track avg/min/max
of network throughputs instead.

Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
---
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
2 files changed, 14 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11241/3
--
To view, visit http://gerrit.cloudera.org:8080/11241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
Gerrit-Change-Number: 11241
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.

2018-08-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11234 )

Change subject: IMPALA-7444: Improve logging of opening/closing/expiring 
sessions.
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11234/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11234/3/be/src/service/impala-server.cc@1975
PS3, Line 1975: LOG_IF(INFO, expired_cnt > 0) << "Expired sessions. Count: 
" << expired_cnt
  :   << ". Time: " << UnixMillis() 
- now << " milliseconds.";
nit: LOG_IF(INFO, expired_cnt > 0) << Substitute("Expired $0 sessions.", 
expired_cnt);

I believe the timestamp at the log line is a reasonable proxy for the time when 
the above critical section completed so there may not be a need to log the time 
here.



--
To view, visit http://gerrit.cloudera.org:8080/11234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a
Gerrit-Change-Number: 11234
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 17 Aug 2018 22:49:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender

2018-08-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11241 )

Change subject: IMPALA-7449: Fix network throughput calculation of 
DataStreamSender
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11241/2/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/11241/2/be/src/runtime/krpc-data-stream-sender.cc@401
PS2, Line 401: row_batch_size * NANOS_PER_SEC
> This will overflow around a few gigabyte row_batch_size - is it possible to
Good point. The max of our row batch currently is limited to 
numeric_limits::max(); 
(https://github.com/apache/impala/blob/master/be/src/rpc/rpc-mgr.cc#L90) so it 
should be fine as far as I can tell.

It probably warrants a DCHECK.



--
To view, visit http://gerrit.cloudera.org:8080/11241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
Gerrit-Change-Number: 11241
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Fri, 17 Aug 2018 18:36:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.

2018-08-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11234 )

Change subject: IMPALA-7444: Improve logging of opening/closing/expiring 
sessions.
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc@1962
PS2, Line 1962: expired_cnt++;
++expired_cnt;


http://gerrit.cloudera.org:8080/#/c/11234/2/be/src/service/impala-server.cc@1975
PS2, Line 1975: LOG_IF(INFO, expired_cnt > 0) << "Expired sessions. Count: 
" << expired_cnt
  :   << ". Time: " << UnixMillis() 
- now << " milliseconds.";
What's the purpose of this line ? Won't we be able to tell that from the 
"Expiring session..." line above ?



--
To view, visit http://gerrit.cloudera.org:8080/11234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a
Gerrit-Change-Number: 11234
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 16 Aug 2018 21:50:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7449: Fix network throughput calculation of DataStreamSender

2018-08-15 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11241


Change subject: IMPALA-7449: Fix network throughput calculation of 
DataStreamSender
..

IMPALA-7449: Fix network throughput calculation of DataStreamSender

Currently, the network throughput presented in the query profile
for DataStreamSender is computed by dividing the total bytes sent
by the total network time which is the sum of observed network time
of all individual RPCs. This is wrong in general and may only make
sense if the network throughput is fixed. In addition, RPCs are
asynchronous and they overlap with each other. So, dividing the
total byte sent by network throughput may result in time which exceeds
the wall clock time, making it impossible to interpret.

This change fixes the problem by measuring the network throughput
of each individual RPC and uses a summary counter to track avg/min/max
of network throughputs instead.

Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
---
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
2 files changed, 11 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11241/1
--
To view, visit http://gerrit.cloudera.org:8080/11241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
Gerrit-Change-Number: 11241
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.

2018-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11234 )

Change subject: IMPALA-7444: Improve logging of opening/closing/expiring 
sessions.
..


Patch Set 1:

Just being paranoid here but I hope this doesn't make the log too verbose.


--
To view, visit http://gerrit.cloudera.org:8080/11234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a
Gerrit-Change-Number: 11234
Gerrit-PatchSet: 1
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 15 Aug 2018 19:25:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7444: Improve logging of opening/closing/expiring sessions.

2018-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11234 )

Change subject: IMPALA-7444: Improve logging of opening/closing/expiring 
sessions.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-hs2-server.cc@367
PS1, Line 367: Done opening
Opened


http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11234/1/be/src/service/impala-server.cc@1236
PS1, Line 1236: Done closing
Closed



--
To view, visit http://gerrit.cloudera.org:8080/11234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecf2d3ce707cc36c21da8a2459c2f68cf0b56a4a
Gerrit-Change-Number: 11234
Gerrit-PatchSet: 1
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 15 Aug 2018 19:25:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7434: Heimdal Kerberos is not supported in Impala

2018-08-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11204 )

Change subject: IMPALA-7434: Heimdal Kerberos is not supported in Impala
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f
Gerrit-Change-Number: 11204
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 15 Aug 2018 01:11:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7434: Limitation in kinit with auth to local and Heimdal kerberos

2018-08-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11204 )

Change subject: IMPALA-7434: Limitation in kinit with auth_to_local and Heimdal 
kerberos
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11204/3/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/11204/3/docs/topics/impala_known_issues.xml@323
PS3, Line 323: auth_to_local rules not supported with Heimdal Kerberos
Sorry for the confusion but my understanding when discussing with Sailesh is 
that we don't support auth_to_local in any implementation of Kerberos. Also, 
Heimdal Kerberos (an implementation of Kerberos) is also not supported either. 
We only support MIT Kerberos.

Let me double check with Michael Yoder.



--
To view, visit http://gerrit.cloudera.org:8080/11204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f
Gerrit-Change-Number: 11204
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 14 Aug 2018 23:48:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7434: Limitation in kinit with auth to local and Heimdal kerberos

2018-08-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11204 )

Change subject: IMPALA-7434: Limitation in kinit with auth_to_local and Heimdal 
kerberos
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11204/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/11204/2/docs/topics/impala_known_issues.xml@323
PS2, Line 323: KRPC Unavailable
auth_to_local and Heimdal Kerberos are never supported even before KRPC. So, I 
think we should remove any wording related to KRPC.



--
To view, visit http://gerrit.cloudera.org:8080/11204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f
Gerrit-Change-Number: 11204
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 14 Aug 2018 02:08:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-08-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11055 )

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..


Patch Set 4:

(1 comment)

May help to add a unit test to exercise retrying logic when sending a filter to 
the aggregator.

http://gerrit.cloudera.org:8080/#/c/11055/4/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/11055/4/common/thrift/ImpalaInternalService.thrift@878
PS4, Line 878: optional map> backend_list
May help to add a comment on what this is for.



--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 4
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 10 Aug 2018 21:50:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h@33
PS9, Line 33: #include "runtime/coordina
> Not needed.
Done


http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc@52
PS9, Line 52:
> This should have been moved to line 48.
Done


http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto@137
PS9, Line 137: 5
> Skipped 5.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 09 Aug 2018 02:46:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#10).

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241 which may lead to duplicated
dml stats being applied. The fix is by adding a monotonically
increasing version number for fragment instances' reports. The
coordinator will ignore any report smaller than or equal to the
version in the last report.

Testing done: core build. Added some targeted test cases for profile
serialization failure and RPC retry.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
60 files changed, 1,110 insertions(+), 673 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/10
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.cc@52
PS9, Line 52:  DCHECK_NOTNULL(coord);
This should have been moved to line 48.



-- 
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 09 Aug 2018 01:49:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@373
PS8, Line 373: 
MonoDelta::FromMilliseconds(FLAGS_backend_client_rpc_timeout_ms));
> We have some logging already below at line 385.
Actually, mis-interpreted your comment. Will add a LOG(WARNING).



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 09 Aug 2018 00:49:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/9/be/src/runtime/coordinator-backend-state.h@33
PS9, Line 33: #include "common/atomic.h"
Not needed.


http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/10855/9/common/protobuf/control_service.proto@137
PS9, Line 137: 6
Skipped 5.



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 09 Aug 2018 00:22:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 8:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/data-sink.cc@49
PS8, Line 49: const char* DataSink::ROOT_PARTITION_KEY = "";
> this should be const char* const -- ie it's a const pointer to const charac
True. Same comment probably applies to other classes too (e.g. 
https://github.com/apache/impala/blob/master/be/src/exec/hash-table.h#L196)


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/hdfs-table-writer.h
File be/src/exec/hdfs-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/exec/hdfs-table-writer.h@91
PS8, Line 91:   const InsertStatsPB& stats() { return stats_; }
> probably should be a const function as well?
Done


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.h@211
PS8, Line 211:   Coordinator* coord_; /// Coordinator object that owns this 
BackendState
> this went from const to non-const. does this mutate the coordinator that ow
Indirectly. We are updating the coord_->dml_exec_state_(). The locking is done 
inside dml_exec_state_ itself so it should be thread safe.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@50
PS8, Line 50:   : coord_(coord),
> nit: you could just do coord_(DHECK_NOTNULL(coord)) here since DCHECK_NOTNU
Done


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@262
PS8, Line 262: lock.unlock();
> I'm nervous about this optimization. Do we really have data that deserializ
The reason for doing it here is to handle the case in which we can have 0 to # 
fragment instances (after IMPALA-4063) entries in "backend_exec_status. 
instance_exec_status". We can have 0 entries if say we failed to start a 
thread. So, while we can do it in the RPC handler, it's a bit awkward to handle 
the zero entry case.

The new patch moves the profile sidecar idx into backend_exec_status from 
backend_exec_status.instance_exec_status. That allows us to keep the 
deserialization in the ControlService::ReportExecStatus().

In the patch for IMPALA-4063, we will just replace TRuntimeProfileTree with a 
list instead.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@310
PS8, Line 310: coord_
> I feel that allowing for a non-const pointer to the Coordinator object to t
Done


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@317
PS8, Line 317: MergeErrorMaps(instance_exec_status.error_log(), &error_log_);
> This is less severe, but I guess that this is already wrong too. We potenti
The new report has a monotonically increasing version number. This is a variant 
of IMPALA-7241.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/coordinator-backend-state.cc@516
PS8, Line 516: lock_guard l1(exec_summary->lock);
> Are we violating any lock ordering here?
Yes, I also looked and didn't see anything.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@318
PS8, Line 318: bool done
> trying to understand the purpose of this parameter. I don't think this shou
I agree in general. Let me add a TODO.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@331
PS8, Line 331:   ThriftSerializer serializer(true);
> Why construct this here and pass it into ConstructReport? It seems Construc
The serialization output is owned by ThriftSerializer. My mistake to use a 
string below. I was using (buffer, len) earlier to avoid the extra copying. 
Also added comments about the lifetime of the buffer.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@359
PS8, Line 359:   CHECK(sidecar_status.ok())
 :   << FromKuduStatus(sidecar_status, "Failed to add 
sidecar").GetDetail();
> is there any chance that the profile would extend past max RPC size? I gues
Added a CHECK for it in ConstructReport().


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/runtime/query-state.cc@373
PS8, Line 373: if (i < 2) SleepForMs(FLAGS_report_status_retry_interval_ms);
> worth a LOG(WARNING) here?
We have some logging already below at line 385.


http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://

[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#9).

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241 which may lead to duplicated
dml stats being applied. The fix is by adding a monotonically
increasing version number for fragment instances' reports. The
coordinator will ignore any report smaller than or equal to the
version in the last report.

Testing done: core build. Added some targeted test cases for profile
serialization failure and RPC retry.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
60 files changed, 1,110 insertions(+), 673 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/9
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/kudu/rpc/rpc_context.h
File be/src/kudu/rpc/rpc_context.h:

http://gerrit.cloudera.org:8080/#/c/10855/8/be/src/kudu/rpc/rpc_context.h@166
PS8, Line 166: Status GetInboundSidecar(int idx, Slice* slice) const;
Changes done on Kudu side already: 
https://github.com/apache/kudu/commit/37d0c35bcf43aef92fdde82c629f732f1466d8ae



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 08 Aug 2018 18:14:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 8:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/data-sink.h@100
PS5, Line 100:   static const char* ROOT_PARTITION_KEY;
> can this be a const char* const ROOT_PARTITION_KEY instead? usually static
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/hdfs-table-writer.h
File be/src/exec/hdfs-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/exec/hdfs-table-writer.h@91
PS5, Line 91: const InsertS
> style: is this meant to be returning a mutable reference or should this be
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/coordinator-backend-state.cc@254
PS5, Line 254:   // If this backend completed previously, don't apply the 
update.
 :   if (IsDone()) return false;
 :   for (const FragmentInstanceExecStatusPB& instance_
> you can use a C++11 style loop with protobuf repeated elements too
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h@276
PS5, Line 276:
> "... to construct a status report ..."
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.h@277
PS5, Line 277:
> "If 'fis' is not 'nullptr', the runtime profile ..."
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@233
PS5, Line 233:   return state == BackendExecState::FINISHED
 :   || state == BackendExecState::CANCELLED
 :   || state == BackendExecState::E
> I thought I saw some utility code for this elsewhere
Good point about adding this as utility function which we don't have one right 
now. The utility function may not be needed eventually once we convert other 
ExecFInstance() RPC to KRPC too.


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@243
PS5, Line 243:   DCHECK(!IsTerminalState(backend_exec_state_))
 :   << " Current State: " << 
BackendExecStateToString(backend_exec_state_)
 :   << " | Current Status: " << query_status_.Get
> and here
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@256
PS5, Line 256:
> Shouldn't we print the fragment instance ID instead?
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@295
PS5, Line 295: tus serialize_status = serializer->
> The profile is a thrift structure serialized to a string ...
Done


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@306
PS5, Line 306:  Only send updates to insert status if
> We could probably consider moving 'profile_str' to the 'sidecar_buf' to avo
I guess it's a tradeoff because moving the 'profile_str' means having to redo 
the serialization of the profile on when retrying the RPC. I'm a bit worried 
about the overhead of doing so once we merge the profile of multiple fragment 
instances together. Timeout is more likely to occur once the coordinator is 
under stress.


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/query-state.cc@304
PS5, Line 304: }
 :
 : // Only send updates to insert status if fragment is 
finished, the coordinator waits
 : // until query execution is done to use them anyhow.
 : RuntimeState* state = fis->runtime_state();
 : if (done) {
 :   
state->dml_exec_state()->ToProto(instance_status->mutable_insert_exec_status());
 : }
 :
 : // Send new errors to coordinator
 : 
state->GetUnreportedErrors(instance_status->mutable_error_log());
 :   }
 : }
 :
> Can't we just do this once before the loop starts?
Not really. We need to re-arm the sidecar on every retry. The ownership of the 
buffer is transferred to the OutboundCall so we need to re-arm the sidecar.


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/runtime/runtime-state.cc@200
PS5, Line 200: g_lock_);
> this is weird here because there is no 'exec_status' in this function
Typo. Meant 'new_errors'.


http://gerrit.cloudera.org:8080/#/c/10855/5/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:

[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#8).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/kudu/rpc/rpc_context.cc
M be/src/kudu/rpc/rpc_context.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
62 files changed, 1,096 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/8
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#7).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/kudu/rpc/rpc_context.cc
M be/src/kudu/rpc/rpc_context.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
62 files changed, 1,095 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/7
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-08 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#6).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/kudu/rpc/rpc_context.cc
M be/src/kudu/rpc/rpc_context.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
62 files changed, 1,096 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/6
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 16: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 16
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Aug 2018 21:01:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10813/15/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/15/be/src/runtime/fragment-instance-state.cc@105
PS15, Line 105: current_state_.Load() <= 
TFInstanceExecState::WAITING_FOR_PREPARE) {
  :   DCHECK(!is_prepared);
Seems easier to understand with the following (with the same effect):

 if (!is_prepared) {
 DCHECK_LT(current_state.Load(), );
 } else {
 
 }



-- 
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 15
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Aug 2018 18:15:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-06 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 14: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107
PS13, Line 107: er
> I think it's better to rely on the enum ordering. Otherwise, the new boolea
My understanding is that explicitly check the current state during state 
transition but don't rely on the ordering of the enum in any way. That said, 
it'd be great if the enum is ordered in the expected state transition order. 
May help to have a static DCHECK somewhere to confirm the ordering of the enum 
entries matches the expected state transition order if we don't have one 
already.


http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@522
PS13, Line 522:   "First batch produced", // FIRST_BATCH_PRODUCED
  :   "Producing Data",   // PRODUCING_DATA
This needs to be changed. I think this is dependent on the enum ordering.


http://gerrit.cloudera.org:8080/#/c/10813/14/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/10813/14/common/thrift/ImpalaInternalService.thrift@626
PS14, Line 626: // the debug webpages.
Given the assumption in this patch about the ordering of the fields in this 
enum, can you please add a comment and ideally a DCHECK for it ?



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 14
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Aug 2018 19:55:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107
PS13, Line 107: <=
> Hmm, the enums are ordered incorrectly. I think the right thing to do would
A simpler alternative to relying on this enum is to save a boolean above which 
is marked as true if we get past Prepare() successfully.

This avoids the reliance that the code actually enforces the state transition 
based on the order in which it's defined in the enum.



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 13
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Sun, 05 Aug 2018 17:43:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7394: Don't print stack trace in ExpireSessions()

2018-08-04 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7 )

Change subject: IMPALA-7394: Don't print stack trace in ExpireSessions()
..


Patch Set 1:

Build failed due to IMPALA-7328


--
To view, visit http://gerrit.cloudera.org:8080/7
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa
Gerrit-Change-Number: 7
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Sat, 04 Aug 2018 18:11:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7394: Don't print stack trace in ExpireSessions()

2018-08-03 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/7


Change subject: IMPALA-7394: Don't print stack trace in ExpireSessions()
..

IMPALA-7394: Don't print stack trace in ExpireSessions()

When sessions expire, ExpireSessions() may create excessive amount
of log spew due to printing of the stack traces esp when there are
a massive amount of sessions expiring at the same time. This stack
crawling also unnecessarily extends the critical section under the
session_state_map_lock_, increasing the contention for that lock.
This change converts the use of Status() to Status::Expected() to
remove the unnecessary stack crawl.

Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa
---
M be/src/service/impala-server.cc
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/7/1
--
To view, visit http://gerrit.cloudera.org:8080/7
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I29cb143af96e2eef45479a03365d8f49a2ee2dfa
Gerrit-Change-Number: 7
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/13/be/src/runtime/fragment-instance-state.cc@107
PS13, Line 107: yS
==

Apparently WAITING_FOR_CODEGEN < WAITING_FOR_PREPARE


http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h@244
PS12, Line 244: Status WaitForFinish();
> We call it towards the end of StartFInstances(). We update the query state
Oh right. Missed that.


http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py
File tests/failure/test_failpoints.py:

http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py@162
PS12, Line 162: i = 0
> When I remove this, it gives me this error:
It should work with for i in range(50):



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 12
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 03 Aug 2018 19:10:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7390: Pin rpc-mgr-kerberized-test to localhost.

2018-08-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/3 )

Change subject: IMPALA-7390: Pin rpc-mgr-kerberized-test to localhost.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/3/1//COMMIT_MSG@11
PS1, Line 11: failing due to DNS resolution reasons
Does it mean in general, to run Impala with docker, one has to specify the 
hostname ? Is there anything we can do at this line 
(https://github.com/apache/impala/blob/master/be/src/common/init.cc#L191) to 
fix the issue ?


http://gerrit.cloudera.org:8080/#/c/3/1/be/src/rpc/rpc-mgr-kerberized-test.cc
File be/src/rpc/rpc-mgr-kerberized-test.cc:

http://gerrit.cloudera.org:8080/#/c/3/1/be/src/rpc/rpc-mgr-kerberized-test.cc@86
PS1, Line 86: FLAGS_hostname = "localhost";
nit: may help to add a comment on why InitCommonRuntime() doesn't set this up 
correctly.



--
To view, visit http://gerrit.cloudera.org:8080/3
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91003cbc86177feb7f99563f61297f7da7fabab4
Gerrit-Change-Number: 3
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 03 Aug 2018 16:48:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/fragment-instance-state.cc@106
PS12, Line 106:   if (!status.ok() && current_state_.Load() <= 
TFInstanceExecState::WAITING_FOR_PREPARE) {
  : // Tell the managing 'QueryState' that we hit an error 
during Prepare().
  : query_state_->ErrorDuringPrepare(status, instance_id());
  :   } else if (!status.ok()) {
  : // Tell the managing 'QueryState' that we hit an error 
during execution.
  : query_state_->ErrorDuringExecute(status, instance_id());
  :   }
Seems easier to read if written like the following:

   if (!status.ok())
  if (...)
query_state_->Error...
  else
query_state_->ErrorDuring


http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/12/be/src/runtime/query-state.h@244
PS12, Line 244: Status WaitForFinish();
Is this actually still needed in this patch ? It doesn't seem to be called 
anywhere.


http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py
File tests/failure/test_failpoints.py:

http://gerrit.cloudera.org:8080/#/c/10813/12/tests/failure/test_failpoints.py@162
PS12, Line 162: i = 0
This line isn't needed if you use for i in range(50) below.



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 12
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 02 Aug 2018 18:26:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-08-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/fragment-instance-state.cc@100
PS10, Line 100:   if (!status.ok()) {
  : // Tell the managing 'QueryState' that we hit an error 
during execution.
  : query_state_->ErrorDuringExecute(status, instance_id());
  : goto done;
  :   }
  :   // Tell the managing 'QueryState' that we're done with 
executing and that we've stopped
  :   // the reporting thread.
  :   query_state_->DoneExecuting();
Should this be included moved to point after label done ? Otherwise, we may 
hang in case Open() fails ?

May be this warrants a debug action which triggers failure() in Open().


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@216
PS10, Line 216: ret_status;
not used ?


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@217
PS10, Line 217: BackendExecState old_state;
BackendExecState old_state = backend_exec_state_; and no need for line 220


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@230
PS10, Line 230: BackendExecState::EXECUTING :
  : BackendExecState::FINISHED;
nit: one line


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@408
PS10, Line 408:   // We failed to start 'num_unstarted_instances', so make 
sure to notify
  :   // 'instances_prepared_barrier_' 'num_unstarted_instances 
- 1' times, to unblock
  :   // 'WaitForPrepare(). The last remaining notification 
will be set by the error
  :   // handling logic once we've exited this loop so that the 
we may have the query
  :   // wide status reflect the 'thread_create_status'.
  :   while (num_unstarted_instances > 1) {
  : DonePreparing();
  : --num_unstarted_instances;
  :   }
A minor suggestion is to group this loop with ErrorDuringPrepare() below so 
it's easier to see that we have updated num_unstarted_instances times.


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@435
PS10, Line 435: ErrorDuringPrepare(thread_create_status, TUniqueId());
We used to wait for all fragment instances to finish preparing before reporting 
errors. This patch stops doing that. Does it matter ?


http://gerrit.cloudera.org:8080/#/c/10813/10/be/src/runtime/query-state.cc@442
PS10, Line 442: all_fis_status;
not actually used?

WaitForPrepare() below can become discard_result(WaitForPrepare());


http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py
File tests/failure/test_failpoints.py:

http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@151
PS10, Line 151: self.execute_query_expect_failure(self.client, query,
  : query_options={'debug_action':debug_action})
Do we care about verifying the failure actually occurred ?


http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@157
PS10, Line 157: x = 0
  : # Since there's only a 50% chance of fragment instance 
thread creation failure,
  : # we attempt to catch a query failure up to a very 
conservative maximum of 50 tries.
  : while x in range(0, 50):
for i in range(50):


http://gerrit.cloudera.org:8080/#/c/10813/10/tests/failure/test_failpoints.py@165
PS10, Line 165: if 'Query aborted:Debug Action: 
FIS_FAIL_THREAD_CREATION:FAIL@0.5' in str(e):
  :   break
  : else:
  :   assert False, str(e)
assert 'Query aborted:Debug Action: FIS_FAIL_THREAD_CREATION:FAIL@0.5' in 
str(e), str(e)



-- 
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 02 Aug 2018 00:06:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7376: DCHECK hit if a fragment instance fails to initialize the filter bank

2018-08-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11096 )

Change subject: IMPALA-7376: DCHECK hit if a fragment instance fails to 
initialize the filter bank
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11096/2/tests/failure/test_failpoints.py
File tests/failure/test_failpoints.py:

http://gerrit.cloudera.org:8080/#/c/11096/2/tests/failure/test_failpoints.py@147
PS2, Line 147: query_options={'debug_action':debug_action})
Should we also verify that the query actually failed ?



--
To view, visit http://gerrit.cloudera.org:8080/11096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1d6e7afc18fe2f0e1e29d2bd8a5f804a78f7043a
Gerrit-Change-Number: 11096
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 01 Aug 2018 22:46:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 4:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG@16
PS4, Line 16: This patch also introduces a new service pool for all query 
execution
: control related RPCs in the future so that control commands from
: coordinators aren't blocked by long-running DataStream services' 
RPCs.
> yep, that's essentially what I meant. Thanks for explaining it better than
Thanks for the suggestion. I think it's fine to do it as a follow-up or 
parallel work on Kudu side. This doesn't necessarily block the merging of this 
patch IMHO as ReportExecStatus() can already be late today for various reasons 
such as an overloaded coordinator.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/coordinator.h@31
PS4, Line 31: #include "kudu/util/slice.h"
> Not needed?
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc
File be/src/runtime/dml-exec-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc@404
PS4, Line 404: void DmlExecState::ToProto(InsertExecStatusPB* dml_status) {
> do you want to clear dml_status first just in case? Similarly for other ToP
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/dml-exec-state.cc@455
PS4, Line 455:if (!kudu_stats->has_num_row_errors()) {
 :   kudu_stats->set_num_row_errors(0);
 : }
> I don't think this is necessary- it's not illegal to access an "unset" fiel
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/exec-env.h@136
PS4, Line 136: DataStreamService* data_svc() const { return data_svc_.get(); }
> Not used, we can remove this.
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@268
PS4, Line 268: rpc_controller.AddOutboundSidecar(move(sidecar), &sidecar_idx).ok
> should this be a CHECK_OK?
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@273
PS4, Line 273: "final"
> need a space here
Oops. Done.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@274
PS4, Line 274: ERROR
> should this be a DFATAL? do you expect this to ever actually happen?
I don't think this is expected to happen very often but it seems more robust to 
not crash Impala because of failure to serialize the query profile. The query 
can still run to completion without the profile.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@283
PS4, Line 283: req.clear_error_log();
> isnt the error log already clear because this is a new instance?
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@284
PS4, Line 284: state->GetUnreportedErrors(req.mutable_error_log());
> it's a shame that this mutates the state, making retries a bit more difficu
The new patch creates a single instance of the RPC parameter and reuse it on 
retry. My understanding is that the RPC layer shouldn't mutate the input RPC 
request parameter.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@289
PS4, Line 289: rpc_mgr->GetProxy
> clang-tidy failure: Missing status check.
Done


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@293
PS4, Line 293:   kudu::Status rpc_status = proxy->ReportExecStatus(req, &resp, 
&rpc_controller);
> do you want any timeout on this RPC?
Yes. Switched back to the behavior of existing code of using 
FLAGS_backend_client_rpc_timeout_ms. This is not ideal but I guess it's 
simplest to just preserve the existing behavior and fix IMPALA-2990 in a 
separate patch.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@305
PS4, Line 305: // We can retry the RPC if the payload hasn't been sent yet.
> are these reports idempotent? seems like it shoudl be easy to make them ide
Yes, I believe they are idempotent. So, I was worried about the connection 
being reset in the middle of transmission so only partial payload was sent. In 
that case, I think the server side will also drop that connection so it should 
be fine I suppose. Comments removed.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@306
PS4, Line 306: RpcMgr::IsServerTooBusy(rpc_controller
> I'm a bit fuzzy on the context of this code, but if we get TOO_BUSY, it see
Restored to the old behavior of retrying for at most 2 ti

[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-08-01 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#5).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
58 files changed, 1,050 insertions(+), 634 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/5
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 6: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 30 Jul 2018 18:53:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@149
PS8, Line 149:  single thread
Seems clearer to also state which thread it is.


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@215
PS8, Line 215:   /// during Execute().
Update 'query_status_' and 'failed_instance_id_' if it's not set already. Also 
notify all waiters of 'instances_finished_barrier_'.


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@260
PS8, Line 260:   };
nit: blank line missing.


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@261
PS8, Line 261: //
nit: ///


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@268
PS8, Line 268:  single
QueryState thread ?


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.h@328
PS8, Line 328:  we recieve the finst ID and the status of the
 :   /// FIS that hit the error.
the error status and failing fragment instance ID are set in 'query_status_' 
and 'failed_instance_id_' respectively.


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@421
PS8, Line 421: TUniqueId())
Can we also record the fragment instance id above which failed to start thread ?


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@422
PS8, Line 422: Status updated_query_status = UpdateBackendExecState();
 : instances_prepared_barrier_->NotifyRemaining();
Actually, there may be some problem calling NotifyRemaining() here. The problem 
is that some fragment instances may not have finished preparing yet at this 
point. In which case, if a Cancel() sneaks in after this point and before all 
fragment instances finished preparing, Impala may crash.

Looks like we need to think through how to structure this code more carefully 
with WaitForPrepare() below.


http://gerrit.cloudera.org:8080/#/c/10813/8/be/src/runtime/query-state.cc@425
PS8, Line 425: all_fis_status
Seems clearer to use thread_create_status here like the old code.



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 8
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 30 Jul 2018 18:42:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@479
PS7, Line 479:   {
 : unique_lock l(status_lock_);
 : query_status_ = Status::CANCELLED;
 :   }
> No, it makes sense to preemptively set this since a Cancel() is a query wid
As discussed in person, we this may overwrite earlier error status during 
prepare / execution. Also, if we don't have these statements here, the first 
fragment to cancel will set query_status_ to Status::CANCELLED anyway.



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 7
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 30 Jul 2018 18:08:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Fri, 27 Jul 2018 01:32:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 7:

(9 comments)

Looking good. Some more comments.

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc@125
PS7, Line 125: DCHECK(
Does DCHECK_EQ() not work ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@67
PS7, Line 67: however,
: /// if any fragment instance hits an error or cancellation, then 
we immediately change the
: /// state of the query to the ERROR or CANCELLED state 
accordingly.
This is not true for PREPARING phase, right ? Seems easier to separately 
describe the behavior in PREPARING phase and EXECUTING phase instead of making 
a general statement like this and talking about exception in the next paragraph.


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@70
PS7, Line 70: EXECUTING
PREPARING ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@71
PS7, Line 71: all the underlying FISes will be in Prepare()
This is not necessarily true, right ? In PREPARING state, it only means at 
least one fragment instances is running Prepare(), right ? Some fragment 
instances could already be past Prepare() and executing Open(), Exec() etc, 
right ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@264
PS7, Line 264: BackendExecState backend_exec_state_ = 
BackendExecState::PREPARING;
May want to document the thread safety of this variable. I suppose it's only 
updated by the "query-state thread" only, right ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@287
PS7, Line 287: /// ID of first fragment instance to hit an error.
Please consider documenting the thread safety of this variable too.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308
PS6, Line 308:
 :   /// Number of active fragment instances and coordinators for 
this query that may consume
 :   /// resources for query execution (i.e. threads, memory) on 
the Impala daemon.
 :   /// Query-wide execution resources for this query are released 
once this goes to zero.
 :   AtomicInt32 exec_resource_refcnt_;
 :
 :   /// Temporary files for this query (owned by obj_pool_). 
Non-null if spilling is
 :   /// enabled. Set in Pr
> I feel that it's not ideal having the fragment instance threads modify the
Now that the patch has sunk in more, I think the state machine the query status 
and the query state has a slightly counter intuitive relationship which exists 
today already.

query_status_ represents the error status which any fragment instances can hit 
either in PREPARING or EXECUTING phase.

The query state means whether at least one fragment instances is still in that 
said state. So, a query state of PREPARING only means at least one fragment 
instance is still preparing but other fragment instances could already be 
executing.

The problem is that each fragment instance will directly report to the 
coordinator on failure today independent of other fragment instances' states.

In the next patch, the status reporting will be driven solely by the query 
state thread so we may actually track the status of PREPARING phase and 
EXECUTING phase separately like you did here. The change in behavior will be 
that we will report any error hit in PREPARING phase first before any error hit 
in EXECUTING phase. Do you think that will make the system easier to reason 
about ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@230
PS7, Line 230: BackendExecState::EXECUTING :
 : BackendExecState::FINISHED;
nit: one line


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@479
PS7, Line 479:   {
 : unique_lock l(status_lock_);
 : query_status_ = Status::CANCELLED;
 :   }
Should this happen after line 483 below ?

Actually, do we need to set the query_status_ here ? Won't the first thread 
which notices the cancellation set it when calling ErrorDuringExec() set it ?

Do we have to worry about overwriting the error status if query_status_ already 
contains some errors ?



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, vis

[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@117
PS5, Line 117:  ElemBytesFn()(*out);
> It's a bit unfortunate that each element goes through two indirect function
Actually, this may not actually be an indirect call after all, depending on how 
the compiler ends up generating the code. May be safe to double check that's 
actually the case with the compiled code.



--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 25 Jul 2018 05:37:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue-test.cc
File be/src/util/blocking-queue-test.cc:

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue-test.cc@97
PS5, Line 97:  LONG_TIMEOUT_MICROS);
Does it make the test timing sensitive ? Should we use a larger timeout ?


http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@117
PS5, Line 117:  ElemBytesFn()(*out);
It's a bit unfortunate that each element goes through two indirect function 
calls (one in BlockingPut() and one in BlockingGet()) before it's consumed.

Is it a reasonable alternative to modify BlockingPut() to take an extra 
parameter for the element size (with default value set to 0 for callers which 
don't care about size limit of the queue) and store the size for each entry as 
part of the queue ? I think it's reasonable to cap each entry to no more than 
2GB so we can represent the size as int32_t. Would that be too much space 
overhead in the queue ?


http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@152
PS5, Line 152: if (val_bytes > 0)
Not sure why the need for if() here. I can only assume this is to avoid 
accessing put_bytes_enqueued_.

Given the access to total_put_wait_time_ above, put_bytes_enqueued_ is most 
likely in the cache already. This complication is most likely not warranted 
given the speculative execution of modern CPU. If anything, this just adds 
unnecessary pressure to the branch predicator.


http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@185
PS5, Line 185: if (val_bytes > 0)
See comments above.


http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@237
PS5, Line 237:   bool HasCapacityInternal(const 
boost::unique_lock& lock, int64_t val_bytes) {
nit: long line



--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Jul 2018 01:41:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@289
PS4, Line 289: rpc_mgr->GetProxy
clang-tidy failure: Missing status check.



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 25 Jul 2018 00:43:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-24 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#4).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
58 files changed, 1,019 insertions(+), 616 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/4
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 3:

Still working on some targeted BE tests for some error cases but please feel 
free to go ahead to do another pass.


--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 25 Jul 2018 00:01:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-24 Thread Michael Ho (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#3).

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..

IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/uid-util.h
M bin/impala-config.sh
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
59 files changed, 1,017 insertions(+), 617 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/3
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 2:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@17
PS2, Line 17: they
> nit: that
Done


http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@19
PS2, Line 19: convertion
> nit: conversion
Done


http://gerrit.cloudera.org:8080/#/c/10855/2//COMMIT_MSG@25
PS2, Line 25:
> Not to be pedantic, but a lot of Thrift structs are being removed as part o
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/exec/hdfs-parquet-table-writer.cc@1183
PS2, Line 1183: 
(*parquet_insert_stats_.mutable_per_column_size())[col_name] +=
  : col_writer->total_compressed_size();
> nit: Should we have a level of indirection here? It's not very obvious what
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h
File be/src/rpc/thrift-util.h:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h@93
PS2, Line 93: Serialize
> Should we explicitly qualify the names of these functions to avoid confusio
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/rpc/thrift-util.h@103
PS2, Line 103: uint8_t* buffer;
 : uint32_t len;
 : mem_buffer_->getBuffer(&buffer, &len);
 : result->assign_copy(buffer, len);
> I think you can avoid the copy here if you do:
I believe faststring manages its own buffer internally so there is no easy way 
to share the internal buffer inside mem_buffer_ with the faststring.

If I understand it correctly, mem_buffer_->getBuffer() shouldn't do any data 
copying so there should be only one copy into the faststring.


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h
File be/src/runtime/dml-exec-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h@35
PS2, Line 35: class TInsertResult;
: class TFinalizeParams;
: class TUpdateCatalogRequest;
: class RuntimeProfile;
: class HdfsTableDescriptor;
> nit: Not your change, but the ordering is not alphabetical.
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.h@126
PS2, Line 126: std::map
> I'm wondering if it just makes sense to use google::protobuf::Map here inst
The comment explicitly says "Uses ordered map so that iteration order is 
deterministic." so I am sticking to map for now. According to the specification 
of protobuf, google::protobuf::Map acts more like an unordered_map 
(https://developers.google.com/protocol-buffers/docs/proto#maps-features)

  - Wire format ordering and map iteration ordering of map values is undefined, 
so you cannot rely on your map items being in a particular order.


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc
File be/src/runtime/dml-exec-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@83
PS2, Line 83: for (auto i = parquet_stats.per_column_size().begin();
:i != parquet_stats.per_column_size().end(); ++i) {
> Range based for-loop with const ref.
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@95
PS2, Line 95: auto
> Let's try to stay explicit about data types where we can. Also take it as a
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@95
PS2, Line 95: new_partition_status
> nit: new_per_partition_status_map
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@96
PS2, Line 96: (auto iter = new_partition_status.begin(); iter != 
new_partition_status.end();
:++iter)
> Why not use a range based for loop? It's much more readable. Also, referenc
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@404
PS2, Line 404: for (auto iter = files_to_move_.begin(); iter != 
files_to_move_.end(); ++iter) {
> Range based for loop, and take iterator as const ref.
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@407
PS2, Line 407:   for (auto iter = per_partition_status_.begin();
> Same here
Done


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/dml-exec-state.cc@453
PS2, Line 453: KuduDmlStatsPB* kudu_stats
> This can be a reference, no reason to use a pointer here.
kudu_stats is actually being modified here so using a pointer seems to fit the 
coding convention.


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.

[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10835 )

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..


Patch Set 6: Code-Review+2

Fix clang-tidy error. Carry Sailesh's +2.


--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 23 Jul 2018 23:29:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-23 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10835

to look at the new patch set (#6).

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..

IMPALA-7212: Removes --use_krpc flag and remove old DataStream services

This change removes the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implemenation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
---
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/backend-client.h
D be/src/runtime/data-stream-mgr-base.h
D be/src/runtime/data-stream-mgr.h
D be/src/runtime/data-stream-recvr-base.h
D be/src/runtime/data-stream-recvr.cc
D be/src/runtime/data-stream-recvr.h
D be/src/runtime/data-stream-sender.cc
D be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/scheduling/scheduler.cc
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/ImpalaInternalService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
D tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_krpc_metrics.py
M tests/custom_cluster/test_rpc_exception.py
M tests/query_test/test_codegen.py
M tests/webserver/test_web_pages.py
44 files changed, 88 insertions(+), 2,143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/6
--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10835 )

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..


Patch Set 5: Code-Review+1

Rebased. Carry +1.


--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 23 Jul 2018 18:22:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-23 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10835

to look at the new patch set (#5).

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..

IMPALA-7212: Removes --use_krpc flag and remove old DataStream services

This change removes the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implemenation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
---
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/backend-client.h
D be/src/runtime/data-stream-mgr-base.h
D be/src/runtime/data-stream-mgr.h
D be/src/runtime/data-stream-recvr-base.h
D be/src/runtime/data-stream-recvr.cc
D be/src/runtime/data-stream-recvr.h
D be/src/runtime/data-stream-sender.cc
D be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/scheduling/scheduler.cc
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/ImpalaInternalService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
D tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_krpc_metrics.py
M tests/custom_cluster/test_rpc_exception.py
M tests/query_test/test_codegen.py
M tests/webserver/test_web_pages.py
44 files changed, 88 insertions(+), 2,143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/5
--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 6:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc@a113
PS6, Line 113:
May make sense to replace this by checking the state. Doing a racy read of the 
state should be fine.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@198
PS6, Line 198:  struct FInstanceStatus {
If you take the suggestion to only keep a single 'query_status_', there doesn't 
seem to be need for this struct. We can replace it with a single class member 
to record the fragment instance ID of the first failed fragment instance.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@212
PS6, Line 212: TUniqueId finst_id_;
This is only meaningful when there is an error status, right ? Please document 
it.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@234
PS6, Line 234: FragmentInstanceState
fragment instance


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@271
PS6, Line 271: INITIALIZED,
Consider calling it "PREPARING"


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@274
PS6, Line 274: PREPARED,
Consider calling it "EXECUTING"


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308
PS6, Line 308:   /// The overall status of the Prepare phase for this query 
state.
 :   FInstanceStatus prepare_status_;
 :
 :   /// The overall status of the Exec phase (after Prepare()) for 
this query state.
 :   FInstanceStatus exec_status_;
 :
 :   /// Protects 'prepare_status_' and 'exec_status_'.
 :   SpinLock status_lock_;
It seems simpler to keep track of a single status for the first error hit 
instead of tracking the errors in different phases.

The behavior in the existing code is that the first fragment instance which 
hits the error will report it to the coordinator directly. In theory, a 
fragment instance F1 can hit an error during exec after prepare phase is done 
while another fragment instance F2 can hit error during its prepare phase after 
F1 hits the error. The new patch seems to set query_status_ to the error of F2 
while the coordinator will get the error of F1. So, this seems to cause 
divergence with the coordinator.

So, in that sense, it's necessary to keep a single status only.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@320
PS6, Line 320:   FInstanceStatus query_status_;
Mind documenting the thread safety about this variable ?


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@238
PS6, Line 238: case BackendExecState::INITIALIZED:
 : case BackendExecState::PREPARED:
 :   DCHECK(query_status_.ok()) << query_status_.ToString();
 :   if (!status.ok()) {
 : // Error while executing - go to ERROR state.
 : query_status_ = finst_status;
 : backend_exec_state_ = BackendExecState::ERROR;
Can this be merged with TransitionNonTerminalState() above as:

 if (query_status_.IsCancelled()) {
 new_state = BackendExecState::CANCELLED;
 } else if (!query_status_.ok()) {
 new_state = BackendExecState::ERROR;
 } else {
 new_state = state == PREPARING ? EXECUTING : FINISHED;
 }


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@250
PS6, Line 250: case BackendExecState::FINISHED
Under what circumstances would we call QueryState::UpdateBackendExecState() 
after entering a terminal state ?


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@457
PS6, Line 457: ErrorDuringPrepare(
Won't it be a problem if we call ErrorDuringPrepare() here as it only bump the 
instances_prepared_barrier_ by one instead of the number of remaining fragment 
instances ? In that sense, won't caller of WaitForPrepare() wait forever ?

Also, the thread creation failure path warrants a targeted test case. May be a 
debug action can be added in Thread::Create().


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@460
PS6, Line 460: return;
Seems wrong to return here. Don't we need to call ReportExecStatusAux() like we 
did in the past.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@463
PS6, Line 463:   all_fis_status = WaitFor

[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

Hi Tim, will it be simpler in general to just use max_bytes instead of both 
(max_batches, max_bytes) to control the queue capacity ? Can you please 
elaborate on the reasoning for not doing so ?


--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:27:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7298: Stop passing IP address as hostname in Kerberos principal

2018-07-18 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10980 )

Change subject: IMPALA-7298: Stop passing IP address as hostname in Kerberos 
principal
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10980/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/10980/1/be/src/runtime/krpc-data-stream-sender.cc@598
PS1, Line 598: destinations[i].thrift_backend.hostname,
> Instead of adding this new parameter, would it be possible to ensure that "
destinations[i].krpc_backend.hostname is actually set to the resolved IP 
address of hostname so we have to pull it from thrift_backend.hosname. As we 
remove Thrift from communications between Impala backends in the future, we 
need to move thrift_backend.hostname into a separate field in 
TPlanFragmentDestination.



--
To view, visit http://gerrit.cloudera.org:8080/10980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e
Gerrit-Change-Number: 10980
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 19 Jul 2018 00:47:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7298: Stop passing IP address as hostname in Kerberos principal

2018-07-18 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10980


Change subject: IMPALA-7298: Stop passing IP address as hostname in Kerberos 
principal
..

IMPALA-7298: Stop passing IP address as hostname in Kerberos principal

Previously, we pass the resolved IP address of a KRPC destination
host as the hostname when creating a proxy for making KRPC calls.
This may lead to connection negotiation failure in KRPC when Kerberos
is enabled. In particular, if reversed DNS isn't enabled in Kerberos,
KDC may fail to look up the principal of the destination host if the
principal includes the hostname instead of resolved IP address.

This change fixes the problem above by passing the actual hostname
of the destination host when calling RpcMgr::GetProxy().

rpc-mgr-kerberized-test.cc is also updated to use hostname
instead of the resolved IP address as Kerberos principal.

Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e
---
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test-base.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/scheduling/scheduler.cc
M common/thrift/ImpalaInternalService.thrift
10 files changed, 33 insertions(+), 27 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/10980/1
--
To view, visit http://gerrit.cloudera.org:8080/10980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3e3e978746cf03766eee151835aad5877d9ed63e
Gerrit-Change-Number: 10980
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-17 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10835

to look at the new patch set (#4).

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..

IMPALA-7212: Removes --use_krpc flag and remove old DataStream services

This change removes the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implemenation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
---
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/backend-client.h
D be/src/runtime/data-stream-mgr-base.h
D be/src/runtime/data-stream-mgr.h
D be/src/runtime/data-stream-recvr-base.h
D be/src/runtime/data-stream-recvr.cc
D be/src/runtime/data-stream-recvr.h
D be/src/runtime/data-stream-sender.cc
D be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/scheduling/scheduler.cc
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/ImpalaInternalService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
D tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_krpc_metrics.py
M tests/custom_cluster/test_rpc_exception.py
M tests/query_test/test_codegen.py
M tests/webserver/test_web_pages.py
44 files changed, 88 insertions(+), 2,143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/4
--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7212: Removes --use krpc flag and remove old DataStream services

2018-07-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10835 )

Change subject: IMPALA-7212: Removes --use_krpc flag and remove old DataStream 
services
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG@7
PS3, Line 7: Deprecate
> Nit: I think it would be clearer to say "Remove" rather than "Deprecate".
Done


http://gerrit.cloudera.org:8080/#/c/10835/3//COMMIT_MSG@9
PS3, Line 9: deprecates
> Same here.
Done


http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc@40
PS3, Line 40: DEFINE_int32_hidden(krpc_port, 27000,
: "port on which KRPC based ImpalaInternalService is exported");
> Now that we're exposing this, we should probably document this (if we're no
Internal doc bug filed.


http://gerrit.cloudera.org:8080/#/c/10835/3/be/src/common/global-flags.cc@260
PS3, Line 260: REMOVED_FLAG(disable_mem_pools);
> What are our rules about removing startup flags? Should use_krpc be here? (
Done


http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py
File tests/custom_cluster/test_rpc_exception.py:

http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py@a73
PS1, Line 73:
> So that goes to say that we won't benefit much from running with this for R
Yes



--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Jul 2018 05:33:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10933 )

Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule()
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0
Gerrit-Change-Number: 10933
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Fri, 13 Jul 2018 00:35:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10933 )

Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule()
..


Patch Set 1:

Another idea is to inject fault at expression codegen functions to randomly 
fail codegen. This may not cover cases like scanners but should still provide 
reasonable coverage for other exec nodes.


--
To view, visit http://gerrit.cloudera.org:8080/10933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0
Gerrit-Change-Number: 10933
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 12 Jul 2018 23:06:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10933 )

Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule()
..


Patch Set 1:

> > Change LGTM. May be it helps to add some end-to-end test which
 > > exercises the hashT table paths with CHAR type to confirm it's
 > > fixed. Same can be said about the scanner changes.
 > >
 > > In the long run, every codegen changes which include handcrafted
 > IR
 > > should include CHAR type as part of the tests.
 >
 > CHAR type was a part of the manual testing that i did for
 > IMPALA-6177. The thing that I am concerned about is that if we
 > decide to add tests for CHAR for only the changes in this patch,
 > then to get full coverage we would also have to add test cases that
 > exercise all handcrafted IR generation code paths that have failure
 > modes for CHAR

There may be ways to automate it but I agree that's a lot of cases to cover. We 
may not need to take on all of them in this patch. My suspicion is that we 
already added some test cases when codegen logic was introduced in exec nodes 
so they may need to be extended only to cover CHAR. That said, it is a bit 
tedious and ideally we want some form of query generator to generate these 
queries.


--
To view, visit http://gerrit.cloudera.org:8080/10933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0
Gerrit-Change-Number: 10933
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 12 Jul 2018 23:04:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7288: Fix Codegen Crash in FinalizeModule()

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10933 )

Change subject: IMPALA-7288: Fix Codegen Crash in FinalizeModule()
..


Patch Set 1:

Change LGTM. May be it helps to add some end-to-end test which exercises the 
hashT table paths with CHAR type to confirm it's fixed. Same can be said about 
the scanner changes.

In the long run, every codegen changes which include handcrafted IR should 
include CHAR type as part of the tests.


--
To view, visit http://gerrit.cloudera.org:8080/10933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0
Gerrit-Change-Number: 10933
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 12 Jul 2018 22:52:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Added KRPC port to the list of Impala ports

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10930 )

Change subject: [DOCS] Added KRPC port to the list of Impala ports
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic42930296fded6dcd4b60fba9efa1c60a0ee4944
Gerrit-Change-Number: 10930
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 12 Jul 2018 20:44:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4669: [KUTIL] Add kudu util library to the build.

2018-07-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10758 )

Change subject: IMPALA-4669: [KUTIL] Add kudu_util library to the build.
..


Patch Set 12: Code-Review+2

I looked at the diff between v10 and v12.


--
To view, visit http://gerrit.cloudera.org:8080/10758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2
Gerrit-Change-Number: 10758
Gerrit-PatchSet: 12
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 12 Jul 2018 20:43:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services

2018-07-11 Thread Michael Ho (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10835

to look at the new patch set (#3).

Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old 
DataStream services
..

IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services

This change deprecates the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implemenation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
---
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/backend-client.h
D be/src/runtime/data-stream-mgr-base.h
D be/src/runtime/data-stream-mgr.h
D be/src/runtime/data-stream-recvr-base.h
D be/src/runtime/data-stream-recvr.cc
D be/src/runtime/data-stream-recvr.h
D be/src/runtime/data-stream-sender.cc
D be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/scheduling/scheduler.cc
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/ImpalaInternalService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
D tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_krpc_metrics.py
M tests/custom_cluster/test_rpc_exception.py
M tests/query_test/test_codegen.py
M tests/webserver/test_web_pages.py
44 files changed, 87 insertions(+), 2,142 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/3
--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services

2018-07-11 Thread Michael Ho (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10835

to look at the new patch set (#2).

Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old 
DataStream services
..

IMPALA-7212: Deprecate --use_krpc flag and remove old DataStream services

This change deprecates the flag --use_krpc which allows users
to fall back to using Thrift based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.

As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementation of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift based implemenation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
is disabled.

Testing done: core debug build.

Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
---
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/backend-client.h
D be/src/runtime/data-stream-mgr-base.h
D be/src/runtime/data-stream-mgr.h
D be/src/runtime/data-stream-recvr-base.h
D be/src/runtime/data-stream-recvr.cc
D be/src/runtime/data-stream-recvr.h
D be/src/runtime/data-stream-sender.cc
D be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/scheduling/scheduler.cc
M be/src/service/data-stream-service.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M common/thrift/ImpalaInternalService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
D tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_krpc_metrics.py
M tests/custom_cluster/test_rpc_exception.py
M tests/query_test/test_codegen.py
M tests/webserver/test_web_pages.py
43 files changed, 86 insertions(+), 2,141 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/10835/2
--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7212: Deprecate --use krpc flag and remove old DataStream services

2018-07-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10835 )

Change subject: IMPALA-7212: Deprecate --use_krpc flag and remove old 
DataStream services
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@23
PS1, Line 23: #include "common/status.h"
> I think I've found all the dead code in this patch, but if you want a refer
Thanks for checking.


http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@128
PS1, Line 128: class DataStreamTest : public testing::Test {
 :  protected:
 :   DataStreamTest() : next_val_(0) {
 : // Stop tests that rely on mismatched sender / receiver 
pairs timing out from failing.
 :
> No longer needed.
Done


http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@134
PS1, Line 134:   ~DataStreamTest() { runtime_state_->ReleaseResources(); }
 :
 :   virtual void SetUp() {
 :
> No longer needed.
Done


http://gerrit.cloudera.org:8080/#/c/10835/1/be/src/runtime/data-stream-test.cc@139
PS1, Line 139: erPool(32 * 1024, 1024 * 1024 * 1024, 32 * 1024);
> You can replace this with:
Done


http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py
File tests/custom_cluster/test_rpc_exception.py:

http://gerrit.cloudera.org:8080/#/c/10835/1/tests/custom_cluster/test_rpc_exception.py@a73
PS1, Line 73:
> Can't this happen without TransmitData() today?  Don't we also call it on R
ReportExecStatus() can be very timing sensitive as the number of iterations 
it's invoked depends on how long the query runs for.



--
To view, visit http://gerrit.cloudera.org:8080/10835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Gerrit-Change-Number: 10835
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Jul 2018 22:02:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7006: Pick parts of recent Kudu gutil changes

2018-07-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10769 )

Change subject: IMPALA-7006: Pick parts of recent Kudu gutil changes
..


Patch Set 14: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc
File be/src/gutil/strings/substitute.cc:

http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc@15
PS14, Line 15:
> I only picked those parts of that commit that had an effect on Impala's cod
I see. Makes sense.



--
To view, visit http://gerrit.cloudera.org:8080/10769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb
Gerrit-Change-Number: 10769
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Jul 2018 21:45:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7095: clean up scan node profiles

2018-07-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10810 )

Change subject: IMPALA-7095: clean up scan node profiles
..


Patch Set 8: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10810/7/be/src/exec/kudu-scan-node-base.h
File be/src/exec/kudu-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/10810/7/be/src/exec/kudu-scan-node-base.h@41
PS7, Line 41:
:   virtual Status Prepare(RuntimeState* state) override;
:   virtual Status Open(RuntimeState* state) override;
:   virtual Status GetNext(RuntimeState* state, RowBatch* 
row_batch, bool* eos)
:   override = 0;
:  protected:
:   virtual void DebugString(int indentation_level, 
std::stringstream* out) const override;
> It's not really necessary now that the clang precommit builds detect droppe
It seems that we have them in some places (hdfs-scan-node-base.h) in this patch 
but not others. Just a bit inconsistent.


http://gerrit.cloudera.org:8080/#/c/10810/8/be/src/exec/scan-node.h
File be/src/exec/scan-node.h:

http://gerrit.cloudera.org:8080/#/c/10810/8/be/src/exec/scan-node.h@227
PS8, Line 227: Not thread-safe.
Is it expected that the caller is supposed to hold certain locks before calling 
this function or is it called from a particular thread context only ?



--
To view, visit http://gerrit.cloudera.org:8080/10810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
Gerrit-Change-Number: 10810
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 11 Jul 2018 19:33:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7006: Pick parts of recent Kudu gutil changes

2018-07-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10769 )

Change subject: IMPALA-7006: Pick parts of recent Kudu gutil changes
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10769/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10769/14//COMMIT_MSG@11
PS14, Line 11: Rename constants
Can you please also specify the commit (e719b5ef) ?


http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc
File be/src/gutil/strings/substitute.cc:

http://gerrit.cloudera.org:8080/#/c/10769/14/be/src/gutil/strings/substitute.cc@15
PS14, Line 15:
Is the change in gutil/strings/escaping.cc missing 
?(https://github.com/apache/kudu/commit/e719b5eff2a47f61f9e6ee1f18d2055247f45847#diff-94c169876895d2a2b452a78355d51e57)



--
To view, visit http://gerrit.cloudera.org:8080/10769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb
Gerrit-Change-Number: 10769
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Jul 2018 18:51:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7006: [KSECURITY] Update security library integration

2018-07-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10759 )

Change subject: IMPALA-7006: [KSECURITY] Update security library integration
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10759
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab51d887f5e771ad62eeddc14b9c47f42c3130d
Gerrit-Change-Number: 10759
Gerrit-PatchSet: 9
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 11 Jul 2018 01:58:28 +
Gerrit-HasComments: No


<    1   2   3   4   5   6   7   8   9   10   >