[Impala-ASF-CR] Revert "IMPALA-8146: Remove make {debug,release,asan}.sh"

2019-02-01 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12335 )

Change subject: Revert "IMPALA-8146: Remove make_{debug,release,asan}.sh"
..

Revert "IMPALA-8146: Remove make_{debug,release,asan}.sh"

This reverts commit e8ea4b0525fb3223da9291775eff4b7fbd15c547.
Apparently some users of Impala have some workflow which rely
on these scripts. Reverting this change to unblock those users.

Change-Id: Ibcb921224071eef2dc77cf7b9f4c618f88aaa7c1
Reviewed-on: http://gerrit.cloudera.org:8080/12335
Reviewed-by: Philip Zeyliger 
Tested-by: Michael Ho 
---
A bin/make_asan.sh
A bin/make_debug.sh
A bin/make_release.sh
3 files changed, 60 insertions(+), 0 deletions(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Michael Ho: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibcb921224071eef2dc77cf7b9f4c618f88aaa7c1
Gerrit-Change-Number: 12335
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Revert "IMPALA-8146: Remove make {debug,release,asan}.sh"

2019-02-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12335 )

Change subject: Revert "IMPALA-8146: Remove make_{debug,release,asan}.sh"
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcb921224071eef2dc77cf7b9f4c618f88aaa7c1
Gerrit-Change-Number: 12335
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 01 Feb 2019 19:31:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-8146: Remove make {debug,release,asan}.sh"

2019-02-01 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12335


Change subject: Revert "IMPALA-8146: Remove make_{debug,release,asan}.sh"
..

Revert "IMPALA-8146: Remove make_{debug,release,asan}.sh"

This reverts commit e8ea4b0525fb3223da9291775eff4b7fbd15c547.
Apparently some users of Impala have some workflow which rely
on these scripts. Reverting this change to unblock those users.

Change-Id: Ibcb921224071eef2dc77cf7b9f4c618f88aaa7c1
---
A bin/make_asan.sh
A bin/make_debug.sh
A bin/make_release.sh
3 files changed, 60 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12335/1
--
To view, visit http://gerrit.cloudera.org:8080/12335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibcb921224071eef2dc77cf7b9f4c618f88aaa7c1
Gerrit-Change-Number: 12335
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-2990: timeout unresponsive queries in coordinator

2019-01-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12299 )

Change subject: IMPALA-2990: timeout unresponsive queries in coordinator
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator-backend-state.h@137
PS2, Line 137: boost::lock_guard l(lock_);
In theory, this doesn't need any locking as read / write of a 64-bit word 
should be atomic as long as its' 64-bit aligned.


http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator-backend-state.h@153
PS2, Line 153:   /// Returns a timestamp for use in tracking arrival of status 
reports.
May help to state explicitly that this uses the monotonic time instead of Unix 
time.


http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator-backend-state.cc@89
PS2, Line 89: last_report_time_ms_ = GenerateReportTimestamp();
Should this clock not be started until we have issued all the initial 
ExecFInstance() RPC ? I can see there may be cases in which starting a query in 
a large cluster can take a bit of time so we may be more prone to false 
positive.


http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/runtime/coordinator.cc@747
PS2, Line 747: for (BackendState* backend_state : backend_states_) {
Is there any race with the init code which populates backend_states_ ? 
Shouldn't the detection be skipped until the backend_states_ be fully populated 
?

Does this handle the case in which a backend is done so no more new report will 
be received ?


http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12299/2/be/src/service/impala-server.cc@215
PS2, Line 215: (max_report_lag_s, 30,
Shouldn't the lag be dependent on the reporting interval ? Ideally, we should 
mark the query as hung after x * reporting interval has elapsed without 
receiving a report.

In addition, we need to handle the case of periodic reporting being disabled by 
the user. In that case, we probably need to disable this detection.



--
To view, visit http://gerrit.cloudera.org:8080/12299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I196c8c6a5633b1960e2c3a3884777be9b3824987
Gerrit-Change-Number: 12299
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 Jan 2019 20:18:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2019-01-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 4:

(9 comments)

Looking good. Some more minor comments.

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/coordinator-backend-state.cc@344
PS4, Line 344:   for (const auto& stateful_report : 
instance_exec_status.stateful_report()) {
DCHECK_LE(stateful_report.report_seq_no(), report_seq_no());


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.h@170
PS4, Line 170: recieved
typo


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.h@171
PS4, Line 171: stateful_reports_
nit: prev_stateful_reports_;


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.cc@253
PS4, Line 253: if (final_report_saved_) {
Please see comments below about some potential simplification we can do.


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.cc@279
PS4, Line 279:   {stateful_reports_.begin(), stateful_reports_.end()};
nit: indent 4


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/fragment-instance-state.cc@298
PS4, Line 298: if (instance_exec_status.done()) {
As discussed in person, this could have been simpler by not storing the final 
profile here. We just have to keep track of the fact that we don't need to bump 
the sequence number once the final report has been generated at least once.


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/query-state.h@410
PS4, Line 410: report_
stale ?


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/query-state.h@411
PS4, Line 411: profiles_forest_'
stale ?


http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/4/be/src/runtime/query-state.cc@563
PS4, Line 563: FLAGS_status_report_interval_ms * (num_failed_reports_ + 1)
Given this is quite clunky to write and used at more than one places, may be it 
makes sense to factor it into a function itself. This also makes changing the 
logic of sleep time (e.g. exponential backoff) in the future easier.



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 Jan 2019 19:46:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8111: [DOCS] Take 2: Removed the Fix Version for KUDU-2198

2019-01-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12311 )

Change subject: IMPALA-8111: [DOCS] Take 2: Removed the Fix Version for 
KUDU-2198
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I299eeda31de3cea768c1a1627702527501240a8b
Gerrit-Change-Number: 12311
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 31 Jan 2019 19:05:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.

2019-01-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12260 )

Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC.
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/12260/4/be/src/service/control-service.h
File be/src/service/control-service.h:

http://gerrit.cloudera.org:8080/#/c/12260/4/be/src/service/control-service.h@79
PS4, Line 79:   /// Retry the Rpc 'rpc_call' up to 'times_to_try' times.
May also want to mention each RPC has a timeout of 'timeout' in some time units.

Also, there is no sleeping between retries.


http://gerrit.cloudera.org:8080/#/c/12260/4/be/src/service/control-service.h@84
PS4, Line 84: MonoDelta krpc_timeout
Instead of passing the MonoDelta object itself, how about just defining this as 
timeout in millisecond ?


http://gerrit.cloudera.org:8080/#/c/12260/4/be/src/service/control-service.h@85
PS4, Line 85: DCHECK(times_to_try > 0);
DCHECK_GT()


http://gerrit.cloudera.org:8080/#/c/12260/4/be/src/service/control-service.h@92
PS4, Line 92:   if (!rpc_status.ok()) continue;
Please add a TODO about sleeping between retries in case the server is 
overloaded. Honestly though, I don't see why we cannot add a sleep time between 
retries in this patch if the remote server's service queue fills up. Otherwise, 
the retry loop won't be too effective as we burst sending the same RPCs in such 
a short period of time.


http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc@170
PS3, Line 170:   FAULT_INJECTION_RPC_DELAY(RPC_CANCELQUERYFINSTANCES);
> These tests are setting delays on server-side commands so as to test RPC ti
This can easily be replaced by DebugAction. Please see 
https://github.com/apache/impala/blob/master/tests/custom_cluster/test_rpc_timeout.py#L145-L150

In general, we should converge on using DebugAction for any type of fault 
injection instead of building on the old infrastructure.


http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc@184
PS3, Line 184: FAULT_INJECTION_RPC_DELAY(RPC_REMOTESHUTDOWN);
> These tests are setting delays on server-side commands so as to test RPC ti
Please see replies above.



--
To view, visit http://gerrit.cloudera.org:8080/12260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
Gerrit-Change-Number: 12260
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 Jan 2019 19:00:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7980: Fix spinning because of buggy num unqueued files .

2019-01-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning because of buggy num_unqueued_files_.
..


Patch Set 7: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12097/7/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/12097/7/be/src/exec/hdfs-scan-node-base.cc@452
PS7, Line 452: MakeScopeExitTrigger([&](){ 
UpdateRemainingScanRangeIssuances(-1); });
nit: indent 4


http://gerrit.cloudera.org:8080/#/c/12097/7/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/7/be/src/exec/hdfs-scan-node.cc@79
PS7, Line 79:
nit: blank line



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 30 Jan 2019 22:00:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8138: Reintroduce rpc debugging options

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12297 )

Change subject: IMPALA-8138: Reintroduce rpc debugging options
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12297/1/be/src/runtime/krpc-backend-client.h
File be/src/runtime/krpc-backend-client.h:

http://gerrit.cloudera.org:8080/#/c/12297/1/be/src/runtime/krpc-backend-client.h@27
PS1, Line 27:
Please add a class level comment.


http://gerrit.cloudera.org:8080/#/c/12297/1/be/src/runtime/krpc-backend-client.h@28
PS1, Line 28: class KrpcBackendClient : public ControlServiceProxy
Hmm...what about DataStreamService ? This seems to be a client for 
ControlService's RPCs only ?


http://gerrit.cloudera.org:8080/#/c/12297/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/12297/1/be/src/runtime/query-state.h@314
PS1, Line 314: proxy_
Will client_ be a more appropriate name ? Same for other places.



--
To view, visit http://gerrit.cloudera.org:8080/12297
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2046cb9dadf846ea90c04e95781b2bbde3325941
Gerrit-Change-Number: 12297
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 30 Jan 2019 00:55:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7980: Fix spinning because of buggy num unqueued files .

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning because of buggy num_unqueued_files_.
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h@465
PS5, Line 465: remaining_scan_range_submission
> I am now trying remaining_scan_range_submissions_. I'd argue that it's all
Thanks for the explanation. I suppose the initial source of confusion had to do 
with the way to interpret this variable. In particular, there is a minor 
distinction between (1) the remaining scan ranges to be issued to the IO 
subsystem for the rest of the scan nodes' lifetime and (2) the "new work" to be 
picked up by scanner threads. I believe this variables refer to (2). In other 
words, the scanner thread which picks up the footer range may still submit one 
or more scan ranges for the actual split it's responsible for after this 
variable has gone to 0. As far as the scanner thread loop is concerned, there 
is no more new work so it's safe to exit.

Not sure if there is a better name to explain this here but I guess some 
(hopefully succinct) comments may help.


http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc@79
PS5, Line 79:
: Status HdfsScanNode::GetNext(RuntimeState* state, RowBatch* 
row_batch, bool* eos) {
:   SCOPED_TIMER(runtime_profile_->total_time_counter());
:   Sco
> You were right and we can move this into Close() once the threads have been
I was just wondering if the if-stmt body is empty as DCHECK is not compiled in 
release build, the if-check automatically becomes dead code. That said, I am 
not sure the compiler is always smart enough to remove the if-check as the 
conjuncts being evaluated may have side-effects.

Anyhow, this is just a minor point. Feel free to ignore.


http://gerrit.cloudera.org:8080/#/c/12097/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/6/be/src/exec/hdfs-scan-node.cc@196
PS6, Line 196: !(ranges_issued_barrier_.pending() != 0)
Why not ranges_issued_barrier_.pending() == 0 ?


http://gerrit.cloudera.org:8080/#/c/12097/6/be/src/exec/hdfs-scan-node.cc@197
PS6, Line 197:   && initial_ranges_issued_
 :   && progress_.done()) {
nit: one line



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 30 Jan 2019 00:43:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 14:

(2 comments)

This patch seems to have pretty through review from Phil and Tim and others 
already. Please feel free to proceed without waiting for more comments from me. 
Thanks.

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc@76
PS11, Line 76: profile_(RuntimeProfile::Create(obj_pool(), "")),
> To make it more consistent. The name is ignored since the profile itself is
That's fine I guess.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h@419
PS11, Line 419:  Samples
  :   /// are not re-sampled into larger intervals, instead owners 
of this profile can call
  :   /// ClearChunkedTimeSeriesCounters() to reset the sample 
buffers of all chunked time
  :   /// series counters
> Done
Yes, capping it makes sense but warning about it "once" when it becomes full 
will be good too.



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 29 Jan 2019 23:35:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12260 )

Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC.
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc@653
PS2, Line 653:   // KRPC relies on resolved IP address, so convert hostname.
 :   IpAddr ip_address;
 :   Status ip_status = HostnameToIpAddr(request.backend.hostname, 
_address);
 :   if (!ip_status.ok()) {
 : VLOG(1) << "Could not convert hostname " << 
request.backend.hostname
 : << " to ip address, error: " << 
ip_status.GetDetail();
 : return ip_status;
 :   }
 :   TNetworkAddress addr = MakeNetworkAddress(ip_address, port);
> In my original (unpublished) prototype I was keeping the original UX where 
Thanks for the explanation. That's a reasonable scenario in which the backend 
being shut down may not be found in the backend descriptors.


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc@692
PS2, Line 692:   return Status(Substitute("Rpc to $0 failed with error 
'$1'. This may be because "
 :"the port specified is a thrift 
port, which :shutdown() "
 :"can no longer use. Make sure the 
port is a KRPC port.",
 :   TNetworkAddressToString(addr), msg));
 : }
 : return Status(Substitute(
 : "Rpc to $0 failed with error '$1'.", 
TNetworkAddressToString(addr), msg));
Not sure if a typical user has any understanding of the difference between 
Thrift port and KPRC port as they are internal to Impala. May we can simplify 
the statement to be like the following:

   string err_string = Substitute("Rpc to $0 failed with error '$1'",
 TNetworkAddressToString(addr), msg);
   if (msg.find(...) != string::npos) {
   err_string.append(" Please make sure the right port is being used or 
don't specify any port in the :shutdown() command");
   }
   return Status(err_string);}


http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc@170
PS3, Line 170:   FAULT_INJECTION_RPC_DELAY(RPC_CANCELQUERYFINSTANCES);
Shouldn't this use a DebugAction instead ?


http://gerrit.cloudera.org:8080/#/c/12260/3/be/src/service/control-service.cc@184
PS3, Line 184: FAULT_INJECTION_RPC_DELAY(RPC_REMOTESHUTDOWN);
Shouldn't this use a DebugAction instead ?



--
To view, visit http://gerrit.cloudera.org:8080/12260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
Gerrit-Change-Number: 12260
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 29 Jan 2019 23:11:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8111: [DOCS] Added KRPC-related known issues

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12291 )

Change subject: IMPALA-8111: [DOCS] Added KRPC-related known issues
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12291/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/12291/2/docs/topics/impala_known_issues.xml@254
PS2, Line 254: is
was



--
To view, visit http://gerrit.cloudera.org:8080/12291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I806bedcb9d16140413bb6a1f42798be9d1b6371c
Gerrit-Change-Number: 12291
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 29 Jan 2019 20:58:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8111: [DOCS] Added KRPC-related known issues

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12291 )

Change subject: IMPALA-8111: [DOCS] Added KRPC-related known issues
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml@243
PS1, Line 243: impala-7585
IMPALA ?


http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml@244
PS1, Line 244:   The impala user not added to /etc/passwd when LDAP 
is
 : enabled
one liner ?


http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml@247
PS1, Line 247: When using Impala with LDAP enabled, a user may hit 
the
 :   following:
 : Not authorized: Client connection negotiation 
failed: client connection to 127.0.0.1:27000: SASL(-1): generic failure: 
All-whitespace username.
 : The following sequence can lead to the 
impala user
 :   not being created in /etc/passwd.
 : Time 1: The impala user is not 
in LDAP. Impala
 :   was installed, and the user 
impala is created in
 : /etc/passwd.
 : Time 2: The impala user is 
added to LDAP.
 : Time 3: New machine is added.
My original wording wasn't that clear. I rewrote them as follows. Please edit 
as you see fit:

The following sequence can lead to Impala user not being created in /etc/passwd 
on some machines in the cluster.

Time 1: The user "impala" is not in LDAP. Impala service was installed on 
machine 1 and the user "impala" was created in /etc/passwd

Time 2: The user "impala" is added to LDAP

Time 3: A new machine is added to the cluster. When adding Impala service to 
this new machine, adding the user "impala" will fail as it already exists in 
LDAP.

The consequence is that "impala" doesn't exist in /etc/passwd on the new 
machine, leading to the error seen above.


http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml@266
PS1, Line 266: IMPALLA-7298
typo ?


http://gerrit.cloudera.org:8080/#/c/12291/1/docs/topics/impala_known_issues.xml@292
PS1, Line 292: kudu-2198
KUDU ?



--
To view, visit http://gerrit.cloudera.org:8080/12291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I806bedcb9d16140413bb6a1f42798be9d1b6371c
Gerrit-Change-Number: 12291
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 29 Jan 2019 19:30:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.

2019-01-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12260 )

Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC.
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h
File be/src/service/control-service.h:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@79
PS2, Line 79:   /// Retry the Rpc 'rpc_call' up to 3 times.
:   /// Pass 'debug_action' to DebugAction() to potentially inject 
errors.
:   template 
:   static Status DoRpcWithRetry(F&& rpc_call, const TQueryCtx& 
query_ctx,
:   const char* debug_action, const char* error_msg) {
> Thanks. The idea (suggested by Thomas) was for DoRpcWithRetry to be reusabl
I am fine with doing the refactoring with a separate JIRA as long as it doesn't 
become a back burner somehow.  That said, it seems to make sense to add a sleep 
time between retries in this patch at least for the case of remote server being 
too busy (see the example in KrpcDataStreamSender).


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@87
PS2, Line 87: MonoDelta::FromSeconds(10)
> Yes, I wondered about this too. I decided that while my 2 clients are using
Given what you stated above about DoRpcWithRetry() being reusable, I am not 
sure it makes sense to make assumption about the clients and hardcode the 
timeout / number of retries / sleep time between retries.


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc@2489
PS2, Line 2489: ShutdownStatusPB
> Is that safe? GetShutdownStatus() allocates the ShutdownStatusPB on the sta
It is an official C++ feature to extend the life time of a temporary object to 
the life time of the const reference which refers to it.



--
To view, visit http://gerrit.cloudera.org:8080/12260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
Gerrit-Change-Number: 12260
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 29 Jan 2019 19:12:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7980: Fix spinning because of buggy num unqueued files .

2019-01-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning because of buggy num_unqueued_files_.
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h@465
PS5, Line 465: remaining_scan_range_issuances_
> Do you have any suggestions?
remaining_initial_scan_ranges_?


http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc@79
PS5, Line 79: auto remaining = remaining_scan_range_issuances_.Load();
: if (remaining != 0) {
:   LOG(INFO) << "XXX Assertion Failed " << remaining << " " << 
GetStackTrace();
: }
> D'oh. I don't know how this one got by me.
If you take the suggestion of using the above DCHECK, I don't think you need 
the #ifndef NDEBUG.



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 29 Jan 2019 07:59:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.

2019-01-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12260 )

Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC.
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/12260/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12260/1//COMMIT_MSG@13
PS1, Line 13: imapald
typo


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc@653
PS2, Line 653:   // KRPC relies on resolved IP address, so convert hostname.
 :   IpAddr ip_address;
 :   Status ip_status = HostnameToIpAddr(request.backend.hostname, 
_address);
 :   if (!ip_status.ok()) {
 : VLOG(1) << "Could not convert hostname " << 
request.backend.hostname
 : << " to ip address, error: " << 
ip_status.GetDetail();
 : return ip_status;
 :   }
 :   TNetworkAddress addr = MakeNetworkAddress(ip_address, port);
Instead of resolving the IP for the backend, it seems more consistent to re-use 
some of the logic at the scheduler which already contains a map of all the 
backend state:

https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L354-L357

May need some refactoring in the scheduler code to expose the necessary backend 
states. It's unfortunate that the existing 'executor_config_' uses 'hostname, 
be_port' as lookup key. May be it makes sense to implement a new function which 
looks up if there exists a known backend with 'hostname and krpc_address which 
exposes the port specified' and if so, returns the krpc_address_. If there 
isn't a known backend which maps to the specified 'host + krpc port', then we 
should just fail the shutdown request without making the actual RPC.

Please feel free to come up with other ideas too.


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc@689
PS2, Line 689: if (msg.find("RemoteShutdown() RPC failed: Timed out: connection 
negotiation to")
 : != string::npos) {
If we take the route of using existing backend states suggested above, the case 
of invalid port should be impossible so we shouldn't need to call this out like 
below here.


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h
File be/src/service/control-service.h:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@79
PS2, Line 79:   /// Retry the Rpc 'rpc_call' up to 3 times.
:   /// Pass 'debug_action' to DebugAction() to potentially inject 
errors.
:   template 
:   static Status DoRpcWithRetry(F&& rpc_call, const TQueryCtx& 
query_ctx,
:   const char* debug_action, const char* error_msg) {
Not specifically related to this change but it may be useful to factor this 
retry function to generic RPC code in rpc-mgr.inline.h.

Also, if the RPC fails due to being overloaded, it may make sense to have some 
sleep time between retries.

FWIW, we had some retry logic in KrpcDataStreamSender too and I believe we also 
had similar retry logic for the status reporting code. It may be worth the 
effort to consider refactoring these code so we don't implement the same retry 
logic at multiple places:

https://github.com/apache/impala/blob/master/be/src/runtime/krpc-data-stream-sender.cc#L374-L386

https://github.com/apache/impala/blob/master/be/src/runtime/query-state.cc#L332-L336


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@87
PS2, Line 87: MonoDelta::FromSeconds(10)
Not sure if it makes sense to hard code the timeout here as it seems to be 
utility function used by different RPC calls. Would it be better for the 
timeout to be one of the function parameter ? At the minimum, if we decide that 
10 seconds is the timeout for all RPCs in ControlService, we should at least 
define it as some constant or #define to make it more explicit.

I have a similar question about the number of retries too (which is hardcoded 
to 3).


http://gerrit.cloudera.org:8080/#/c/12260/1/be/src/service/control-service.h
File be/src/service/control-service.h:

http://gerrit.cloudera.org:8080/#/c/12260/1/be/src/service/control-service.h@81
PS1, Line 81: name F>
typo ? DoRpcWithRetry ?


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc@182
PS2, Line 182: class
Why is this needed ?


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc@183
PS2, Line 183: class
Why is this needed ?


http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:


[Impala-ASF-CR] IMPALA-7980: Fix spinning because of buggy num unqueued files .

2019-01-25 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning because of buggy num_unqueued_files_.
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node-base.h@465
PS5, Line 465: remaining_scan_range_issuances_
This feels a bit of a misnomer as parquet scanner will actually issue more scan 
ranges although it does so without calling AddDiskIoRanges().


http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc@79
PS5, Line 79: auto remaining = remaining_scan_range_issuances_.Load();
: if (remaining != 0) {
:   LOG(INFO) << "XXX Assertion Failed " << remaining << " " << 
GetStackTrace();
: }
DCHECK_EQ(remaining_scan_range_issuances_.Load(), 0) << GetStackTrace();


http://gerrit.cloudera.org:8080/#/c/12097/5/be/src/exec/hdfs-scan-node.cc@212
PS5, Line 212: thread_state_.Close(this);
I wonder if you can DCHECK remaining_scan_range_issuances_ to 0 after 
thread_state_.Close() as it seems to wait for all scaner threads to exit. 
However, I can imagine that we can potentially reach this point due to 
cancellation or reaching the limit so not all header ranges for the sequence 
files may have been handled yet ?



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 26 Jan 2019 03:50:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8097: mt dop for all queries via hidden flag

2019-01-25 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12257 )

Change subject: IMPALA-8097: mt_dop for all queries via hidden flag
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12257/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12257/3//COMMIT_MSG@13
PS3, Line 13: because these are not executable.
Would you mind elaborating for latter references ?


http://gerrit.cloudera.org:8080/#/c/12257/3/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/12257/3/be/src/exec/blocking-join-node.cc@206
PS3, Line 206: && state->query_options().mt_dop == 0
Seems better to do this check before trying to acquire the thread token. 
Otherwise, we would leak the token.


http://gerrit.cloudera.org:8080/#/c/12257/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12257/3/be/src/runtime/coordinator.cc@a108
PS3, Line 108:
Do you recall what the limitation was ?


http://gerrit.cloudera.org:8080/#/c/12257/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/12257/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@161
PS3, Line 161:* - MT_DOP > 0 is not supported for plans with base table 
joins or table sinks.
 :* Throws a NotImplementedException if plan validation fails.
Comments need updating



--
To view, visit http://gerrit.cloudera.org:8080/12257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I72f0b02a005e8bf22fd17b8fb5aabf8c0d9b6b15
Gerrit-Change-Number: 12257
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Fri, 25 Jan 2019 21:55:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7565: Set TAcceptQueueServer connection setup pool to be multi-threaded by default

2019-01-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12249 )

Change subject: IMPALA-7565: Set TAcceptQueueServer connection_setup_pool to be 
multi-threaded by default
..


Patch Set 1: Code-Review+1

(2 comments)

Given the subtlety of this change, it may make sense to have a second reviewer 
take a look too.

http://gerrit.cloudera.org:8080/#/c/12249/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12249/1//COMMIT_MSG@16
PS1, Line 16: Ran exhaustive tests with a thread pool of 10.
Did you run with Kerberos enabled ? It may make sense to run stress test in a 
secure cluster with FLAGS_accepted_cnxn_setup_thread_pool_size and 
FLAGS_fe_service_threads set to some values such as 128.


http://gerrit.cloudera.org:8080/#/c/12249/1/be/src/transport/TSaslServerTransport.cpp
File be/src/transport/TSaslServerTransport.cpp:

http://gerrit.cloudera.org:8080/#/c/12249/1/be/src/transport/TSaslServerTransport.cpp@179
PS1, Line 179: lock_guard l(transportMap_mutex_);
Does it make sense to DCHECK(transportMap_.find(trans) == transportMap_.end()); 
here ?



--
To view, visit http://gerrit.cloudera.org:8080/12249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I053120d4c3153ddbe5261acd28388be6cd191908
Gerrit-Change-Number: 12249
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Fri, 25 Jan 2019 00:30:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2019-01-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc@300
PS2, Line 300: nstance_exec_status :
> Yes, but instance_stats->done_ can be set by an intermediate report (if tha
Thanks for the explanation. This seems a bit subtle indeed.

Would things be simpler if we just stop bumping the sequence number after 
generating the last report for a fragment instance ? In other words, we need to 
record the fact that the final report may have been generated already, which is 
different from whether final report has been sent.


http://gerrit.cloudera.org:8080/#/c/12049/3/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/3/be/src/runtime/query-state.h@394
PS3, Line 394: succeeds
nit: succeeded, matching "failed" below.


http://gerrit.cloudera.org:8080/#/c/12049/3/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/3/be/src/runtime/query-state.cc@382
PS3, Line 382: it = report_.mutable_instance_exec_status()->erase(it);
 : profile_it = 
profiles_forest_.profile_trees.erase(profile_it);
Reading the documentation of vector::erase(), it's not exactly an efficient 
operation as it seems to do quite a bit of copying. Doing this in a list may 
end up being O(n^2) here.

I suppose the reason for doing this dance here is to make UpdateReport() work 
on both code paths of ReportSuccessful() and ReportFailed().

Assuming the common case is that the report succeeds, it seems to be doing 
quite a bit of extra work just to erase each element one by one. Shouldn't the 
penalty of copying be paid instead when the report fails, which is presumably 
not too common ? In other words, should we just copy / move the stateful report 
into the fragment instance state instead ?

A slightly more efficient approach of what this loop is trying to achieve could 
be swapping elements at the end of the list to this spot being erased.  In 
which case, it would be O(n) worst case. However, I am not sure if this 
complexity is warranted.


http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@117
PS2, Line 117:   // Sequence number prevents out-of-order or duplicated updates 
from being applied.
 :   optional int64 report_seq_no = 1;
May be worth documenting the relationship of this seq_no with that in 
FragmentInstanceExecStatusPB. If I understand it correctly, this one should be 
<= that in FragmentInstanceExecStatusPB.


http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@138
PS2, Line 138:   // Cumulative structural changes made by the table sink of 
this fragment
 :   // instance. This is sent only when 'done' above is true. Not 
idempotent.
 :   optional DmlExecStatusPB dml_exec_status = 5;
> Its not quite straight-forward to do, so given this patch is already reason
Sounds good to me.



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 24 Jan 2019 21:55:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 11:

(12 comments)

Still wrapping my head around this big change. Some initial comments for now.

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h@67
PS11, Line 67: RuntimeProfile* host_profiles
nit: Please add a comment on the role of this parameter.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h@249
PS11, Line 249: host_profile_ = nullptr;
nit: A brief statement of what this contains.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/query-state.cc@296
PS11, Line 296:   // Add profile to report
  :   host_profile_->ToThrift(_forest->host_profile);
  :   profiles_forest->__isset.host_profile = true;
  :   // Free resources in chunked counters in the profile
  :   host_profile_->ClearChunkedTimeSeriesCounters();
I wonder how this may reconcile with https://gerrit.cloudera.org/#/c/12049/ ? 
In particular, with the mentioned patch, on a failure to send the profile 
report, the query state thread will back off, and retry sending a new profile.

So, the host profile is potentially another non-idempotent state added to the 
report. That said, how tolerant is the coordinator's side logic on handling 
missing chunks in this time series ? In other words, if we fail to send a 
report, how bad is it to let the host profile of this epoch to be dropped ?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc@76
PS11, Line 76: profile_(RuntimeProfile::Create(obj_pool(), "")),
Why this change ?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile-counters.h@443
PS11, Line 443: previous_sample_count_ = 0;
This deserves a comment.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h@419
PS11, Line 419:  Samples
  :   /// are not re-sampled into larger intervals, instead owners 
of this profile can call
  :   /// ClearChunkedTimeSeriesCounters() to reset the sample 
buffers of all chunked time
  :   /// series counters
So, misuse of this kind of counter (e.g. never calling 
ClearChunkedTimeSeriesCounters) may lead to large untracked memory usage, right 
?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1216
PS11, Line 1216:
nit: blank space


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1231
PS11, Line 1231:
nit: blank space


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1279
PS11, Line 1279: DCHECK(
DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h
File be/src/util/system-state-info.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h@66
PS11, Line 66:   /// CaptureSystemStateSnapshot().
'cpu_ratios_' is updated after returning from this function.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h@69
PS11, Line 69: NUM_CPU_VALUES = 5;
Seems more future proof to have this as the last value in the enum below like:

   enum PROC_STAT_CPU_VALUES {
CPU_USER = 0,
CPU_NICE,
CPU_SYSTEM,
CPU_IDLE,
CPU_IOWAIT,
NUM_CPU_VALUES
  };


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.cc
File be/src/util/system-state-info.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.cc@44
PS11, Line 44: int64_t SystemStateInfo::ParseInt64(const string& val) const {
 :   StringParser::ParseResult result;
 :   int64_t parsed = 
StringParser::StringToInt(val.c_str(), val.size(), );
 :   if (result == StringParser::PARSE_SUCCESS) return parsed;
 :   return -1;
 : }
Wonder if this fits better as a utility function in StringParser ?



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment

[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-01-22 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12226 )

Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..


Patch Set 6:

I meant "the timeout can also be turned off by setting it to 0 or something by 
user". We will have a non-zero timeout by default.


--
To view, visit http://gerrit.cloudera.org:8080/12226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
Gerrit-Change-Number: 12226
Gerrit-PatchSet: 6
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 23 Jan 2019 01:51:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2018-12-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc@563
PS2, Line 563: FLAGS_status_report_interval_ms * (num_failed_reports_ + 1))) {
> This backoff time seems way larger than the retry interval used when failin
Some more thoughts on this:

The behavior in this patch is that we will back off on the intermediate reports 
with at least 5 seconds (or whatever FLAGS_status_report_interval_ms is set to) 
but then we only spin for about 300ms (or whatever 
FLAGS_report_status_retry_interval_ms is set to) when it fails to be sent. This 
behavior seems rather complicated without much benefit. Having 2 different 
flags controlling the behavior also makes it hard to use and reason about the 
behavior.

An executor is merely using this periodic status report as a proxy to determine 
the liveness of the coordinator.  So, wouldn't the behavior be simpler and more 
consistent if an executor just considers the coordinator as dysfunctional after 
 FLAGS_status_report_max_retries failed reports in a row. In other words, an 
executor provides a total time allowance of  FLAGS_status_report_max_retries * 
FLAGS_report_interval_ms (or potentially longer with exponential backoff) for 
coordinator to respond before giving up.


With the above policy, we don't distinguish between intermediate report and 
final report for the backoff behavior. A clock will be started on the first 
failure and an executor gives up after the time allowance has passed.

As a side note, an executor should ideally be able to rely on statestore to 
provide proper cluster membership but the reality is that it's insufficient for 
determining the health of a node (see IMPALA-7872). So, we fall back to this 
poor-man approach of using periodic status report as a heartbeat to probe the 
coordinator.



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 20 Dec 2018 20:15:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Use separate network plane for DataStream and Control services

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12107 )

Change subject: IMPALA-7213: Use separate network plane for DataStream and 
Control services
..

IMPALA-7213: Use separate network plane for DataStream and Control services

This change is a follow up for a review comment in
https://gerrit.cloudera.org/#/c/10855/ about
separating the TCP connections of DataStream and Control
services so that control commands don't get blocked behind
large payloads being sent in the DataStream services.

Testing done: exhaustive build

Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Reviewed-on: http://gerrit.cloudera.org:8080/12107
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins 
---
M be/src/service/control-service.cc
M be/src/service/data-stream-service.cc
2 files changed, 8 insertions(+), 2 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Gerrit-Change-Number: 12107
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12106 )

Change subject: Add "network_plane" as part of ConnectionId
..

Add "network_plane" as part of ConnectionId

The motivation for doing so is to allow N services on the same host to
be multiplexed on M different connections. For instance, a server may
host multiple KRPC based services: one for control command and one for
data transfer. Separating the connections between the control channel
and the data channel prevents unnecessary delays of the control commands
due to being stuck behind large data transfers from client to server.

By default, the network_plane of a new ConnectionId is not set.
A user can change it to a different value by calling
Proxy::set_network_plane() on the ConnectionId.

Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Reviewed-on: http://gerrit.cloudera.org:8080/11681
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/12106
Reviewed-by: Thomas Marshall 
Tested-by: Michael Ho 
---
M be/src/kudu/rpc/connection_id.cc
M be/src/kudu/rpc/connection_id.h
M be/src/kudu/rpc/messenger.h
M be/src/kudu/rpc/proxy.cc
M be/src/kudu/rpc/proxy.h
M be/src/kudu/rpc/reactor.h
M be/src/kudu/rpc/rpc-test.cc
7 files changed, 138 insertions(+), 31 deletions(-)

Approvals:
  Thomas Marshall: Looks good to me, approved
  Michael Ho: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12106 )

Change subject: Add "network_plane" as part of ConnectionId
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 20 Dec 2018 00:27:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12106 )

Change subject: Add "network_plane" as part of ConnectionId
..


Patch Set 1:

Verified by https://jenkins.impala.io/job/gerrit-verify-dryrun/3588/


--
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 20 Dec 2018 00:27:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7213: Use separate network plane for DataStream and Control services

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12107 )

Change subject: IMPALA-7213: Use separate network plane for DataStream and 
Control services
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12107/1/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/12107/1/be/src/service/control-service.cc@83
PS1, Line 83:   (*proxy)->set_network_plane("control");
> Maybe add a brief comment about the motivation here
Done



--
To view, visit http://gerrit.cloudera.org:8080/12107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Gerrit-Change-Number: 12107
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 19 Dec 2018 19:30:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213: Use separate network plane for DataStream and Control services

2018-12-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12107 )

Change subject: IMPALA-7213: Use separate network plane for DataStream and 
Control services
..


Patch Set 2: Code-Review+2

Carry Thomas' +2.


--
To view, visit http://gerrit.cloudera.org:8080/12107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Gerrit-Change-Number: 12107
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 19 Dec 2018 19:30:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7213: Use separate network plane for DataStream and Control services

2018-12-19 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12107

to look at the new patch set (#2).

Change subject: IMPALA-7213: Use separate network plane for DataStream and 
Control services
..

IMPALA-7213: Use separate network plane for DataStream and Control services

This change is a follow up for a review comment in
https://gerrit.cloudera.org/#/c/10855/ about
separating the TCP connections of DataStream and Control
services so that control commands don't get blocked behind
large payloads being sent in the DataStream services.

Testing done: exhaustive build

Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
---
M be/src/service/control-service.cc
M be/src/service/data-stream-service.cc
2 files changed, 8 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/12107/2
--
To view, visit http://gerrit.cloudera.org:8080/12107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Gerrit-Change-Number: 12107
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7931: fix executor shutdown races

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12082 )

Change subject: IMPALA-7931: fix executor shutdown races
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h
File be/src/service/cancellation-work.h:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h@18
PS7, Line 18: #pragma once
> There was a discussion on dev@ a while ago and that was the consensus: http
I see. Thanks for reminding me.



--
To view, visit http://gerrit.cloudera.org:8080/12082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c1a80304cb6695d228aca8314e2231727ab1998
Gerrit-Change-Number: 12082
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 19 Dec 2018 01:50:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7931: fix executor shutdown races

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12082 )

Change subject: IMPALA-7931: fix executor shutdown races
..


Patch Set 7:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/runtime/coordinator-backend-state.h@291
PS7, Line 291: boost::unique_lock
Did we decide on moving to std::unique_lock in the long run ? I cannot quite 
recall what we agreed on in the last discussion.


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/runtime/coordinator.cc@835
PS7, Line 835: :
nit: space


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/runtime/coordinator.cc@835
PS7, Line 835: (BackendState* backend_state: backend_states_) {
This may be very unlikely in practice but is there a chance this loop race with 
InitBackendStates() ?


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h
File be/src/service/cancellation-work.h:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h@18
PS7, Line 18: #pragma once
Interesting. So, that's the new convention we are gonna use in the future ?


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h@31
PS7, Line 31: EXPIRED,
Will "LIMIT_EXCEEDED" be a better name ? Sounds to me the reason for 
cancellation is either "time limit exceeded" or "some resource limit exceeded".


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/cancellation-work.h@80
PS7, Line 80: TUniqueId query_id_;
const TUniqueId query_id_; ?

Same for other fields below.


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/impala-server.cc@1406
PS7, Line 1406:   vector active_backends =
const


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/service/impala-server.cc@1800
PS7, Line 1800:   cancellation_entry->first, 
cancellation_entry->second));
nit: wrong indent


http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/util/impalad-metrics.h
File be/src/util/impalad-metrics.h:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/util/impalad-metrics.h@49
PS7, Line 49: executed
nit: started. Precisely speaking, the counter is incremented when a query 
starts on a backend. It may not have completed. This may just be a minor 
distinction. Feel free to ignore.


http://gerrit.cloudera.org:8080/#/c/12082/7/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/12082/7/common/thrift/metrics.json@483
PS7, Line 483: ..
nit: duplicated "."



--
To view, visit http://gerrit.cloudera.org:8080/12082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c1a80304cb6695d228aca8314e2231727ab1998
Gerrit-Change-Number: 12082
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 19 Dec 2018 00:54:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7931: fix executor shutdown races

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12082 )

Change subject: IMPALA-7931: fix executor shutdown races
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/common/status.cc
File be/src/common/status.cc:

http://gerrit.cloudera.org:8080/#/c/12082/7/be/src/common/status.cc@179
PS7, Line 179: Status(code);
May be I misread the code but isn't this calling the version of Status ctor 
which will actually log and dump the stack trace ?!



--
To view, visit http://gerrit.cloudera.org:8080/12082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c1a80304cb6695d228aca8314e2231727ab1998
Gerrit-Change-Number: 12082
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 18 Dec 2018 19:31:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/12106 )

Change subject: Add "network_plane" as part of ConnectionId
..


Removed reviewer Kudu Jenkins.
--
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has removed Adar Dembo from this change.  ( 
http://gerrit.cloudera.org:8080/12106 )

Change subject: Add "network_plane" as part of ConnectionId
..


Removed reviewer Adar Dembo.
--
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Kudu Jenkins (120)


[Impala-ASF-CR] Add "network plane" as part of ConnectionId

2018-12-18 Thread Michael Ho (Code Review)
Hello Kudu Jenkins, Adar Dembo,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/12106

to review the following change.


Change subject: Add "network_plane" as part of ConnectionId
..

Add "network_plane" as part of ConnectionId

The motivation for doing so is to allow N services on the same host to
be multiplexed on M different connections. For instance, a server may
host multiple KRPC based services: one for control command and one for
data transfer. Separating the connections between the control channel
and the data channel prevents unnecessary delays of the control commands
due to being stuck behind large data transfers from client to server.

By default, the network_plane of a new ConnectionId is not set.
A user can change it to a different value by calling
Proxy::set_network_plane() on the ConnectionId.

Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Reviewed-on: http://gerrit.cloudera.org:8080/11681
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M be/src/kudu/rpc/connection_id.cc
M be/src/kudu/rpc/connection_id.h
M be/src/kudu/rpc/messenger.h
M be/src/kudu/rpc/proxy.cc
M be/src/kudu/rpc/proxy.h
M be/src/kudu/rpc/reactor.h
M be/src/kudu/rpc/rpc-test.cc
7 files changed, 138 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/12106/1
--
To view, visit http://gerrit.cloudera.org:8080/12106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6767e631fd9530ea54f5ed63ff4c8c179ab216b2
Gerrit-Change-Number: 12106
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins (120)


[Impala-ASF-CR] IMPALA-7213: Use separate network plane for DataStream and Control services

2018-12-18 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12107


Change subject: IMPALA-7213: Use separate network plane for DataStream and 
Control services
..

IMPALA-7213: Use separate network plane for DataStream and Control services

This change is a follow up for a review comment in
https://gerrit.cloudera.org/#/c/10855/ about
separating the TCP connections of DataStream and Control
services so that control commands don't get blocked behind
large payloads being sent in the DataStream services.

Testing done: exhaustive build

Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
---
M be/src/service/control-service.cc
M be/src/service/data-stream-service.cc
2 files changed, 6 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/12107/1
--
To view, visit http://gerrit.cloudera.org:8080/12107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I774f4a0e2cfedc4dba72cde4f5d28898cdbdc236
Gerrit-Change-Number: 12107
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2018-12-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 2:

(8 comments)

Some trivial comments for now. I have yet to think through the retry logic to 
see if they can be simplified and more consistent.

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc@300
PS2, Line 300: || instance_stats->done_
If we send the final report multiple times, won't report_seq_no == 
last_report_seq_no here ?


http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc@320
PS2, Line 320: auto
const auto&


http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/fragment-instance-state.h@101
PS2, Line 101: recieved
nit: received


http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.h@387
PS2, Line 387: The number of consecutive failed intermediate reports.
This is reset to 0 upon a successful RPC, right ? So, will it be clearer to say 
"The number of failed intermediate reports since the last successfully sent 
report" ?


http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc@386
PS2, Line 386: TUniqueId
const TUniqueId&


http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc@563
PS2, Line 563: FLAGS_status_report_interval_ms * (num_failed_reports_ + 1))) {
This backoff time seems way larger than the retry interval used when failing to 
send the final report. This seems a bit inconsistent.


http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@116
PS2, Line 116: FInstanceExecStatusStatefulPB
Is there a more succinct name for it ? StatefulStatusPB ?


http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@138
PS2, Line 138:   // Cumulative structural changes made by the table sink of 
this fragment
 :   // instance. This is sent only when 'done' above is true. Not 
idempotent.
 :   optional DmlExecStatusPB dml_exec_status = 5;
Why isn't this moved into FInstanceExecStatusStatefulPB



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 18 Dec 2018 00:43:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2018-12-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12069/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12069/7//COMMIT_MSG@15
PS7, Line 15: This mechanism adds a new time series counter class that collects 
all
: measured values and does not re-sample them. It will re-sample 
values
: when printing them into a string profile to a max of 64 values, 
but
: Thrift profiles will contain the full list of values.
How hard is it to factor this subset of changes out as a patch itself ? It 
seems easier to review that way.


http://gerrit.cloudera.org:8080/#/c/12069/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12069/7/be/src/service/impala-server.cc@1051
PS7, Line 1051:   if (rand() < trace_ratio * (RAND_MAX + 1L)) 
query_ctx->__set_trace_resource_usage(true);
> I think rand() isn't thread safe. (And do we initialize it with srand() som
Seems helpful to have a log statement that tracing is enabled.


http://gerrit.cloudera.org:8080/#/c/12069/7/be/src/util/periodic-counter-updater.h
File be/src/util/periodic-counter-updater.h:

http://gerrit.cloudera.org:8080/#/c/12069/7/be/src/util/periodic-counter-updater.h@33
PS7, Line 33: /// Singleton utility class that updates counter values. This is 
used to sample some
: /// metric (e.g. memory used) at regular intervals. The samples 
can be summarized in
: /// a few ways (e.g. averaged, stored as histogram, kept as a 
time series data, etc).
: /// This class has one thread that will wake up at a regular 
period and update all
: /// the registered counters.
: /// Typically, the counter updates should be stopped as early as 
possible to prevent
: /// future stale samples from polluting the useful values.
Should this be updated to include the role of UpdateFn() below ?



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Mon, 17 Dec 2018 23:23:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7980: Fix spinning threads because of buggy handling of num unqueued files .

2018-12-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning threads because of buggy handling of 
num_unqueued_files_.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12097/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12097/1//COMMIT_MSG@48
PS1, Line 48: Michael Ho.
I didn't do much so please feel free to leave me out.


http://gerrit.cloudera.org:8080/#/c/12097/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/12097/1/be/src/exec/hdfs-scan-node-base.h@313
PS1, Line 313:   /// Calls RangeComplete() with skipped=true for all the splits 
of the file
nit: May also want to update the side effect of decrementing 
num_unqueued_files_;


http://gerrit.cloudera.org:8080/#/c/12097/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12097/1/be/src/exec/hdfs-scan-node.cc@469
PS1, Line 469:   // File ranges haven't been issued yet, skip entire file
 :   SkipFile(partition->file_format(), desc);
I guess this is slightly subtle here as we never decrement 
"num_unqueued_files_" for reading the sequence file's header in 
BaseSequenceScanner::IssueInitialRanges(), unlike other file formats which 
usually decrement that counter in their IssueInitialRanges().



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Dec 2018 22:51:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2018-12-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@58
PS1, Line 58: DEFINE_int32(status_report_max_failures, 3,
: "Max number of consecutive failed status reports to allow 
before cancelling");
> I thought we want to use a fixed timeout approach for the maximum retries ?
Actually, max_retries seems to be safer in the sense that it guarantees a 
minimum amount of time the thread will sleep before giving up as we know the 
sleep time between each retry.

With an absolute timeout, there is no guarantee on the number of retries we 
will do. If the system is overloaded, the query state thread may not get to run 
very often before expiration so the number of retries is non-deterministic.


http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@368
PS1, Line 368: fis_map_[id]->runtime_state()->ClearUnreportedErrors();
There is a race here: we may be clearing newly added errors we added after the 
profile was computed.



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 11 Dec 2018 22:25:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7212: Remove dead code data-stream-mgr.cc

2018-12-10 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12064


Change subject: IMPALA-7212: Remove dead code data-stream-mgr.cc
..

IMPALA-7212: Remove dead code data-stream-mgr.cc

Dead code which was accidentally left out in the last
patch of IMPALA-7212.

Testing done: Built Impala debug and release builds

Change-Id: I047e2a01b835936f1066d4d7f87194dcc6857542
---
D be/src/runtime/data-stream-mgr.cc
1 file changed, 0 insertions(+), 298 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/12064/1
--
To view, visit http://gerrit.cloudera.org:8080/12064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I047e2a01b835936f1066d4d7f87194dcc6857542
Gerrit-Change-Number: 12064
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-4555: Make QueryState's status reporting more robust

2018-12-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12049 )

Change subject: IMPALA-4555: Make QueryState's status reporting more robust
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/coordinator-backend-state.cc@299
PS1, Line 299: report_seq_no <= instance_stats->last_report_seq_no_ || 
instance_stats->done_) {
Wouldn't this be a problem for intermediate report ? If the coordinator is slow 
to respond to the RPC, the client side (i.e. the executor) may assume that the 
intermediate reports were never applied even if it was actually applied by the 
coordinator. In which case, the new report may come with a larger sequence 
number which is larger than the old one and that may cause duplicated report.

May be it makes sense to keep track of whether a FIS' report was successfully 
applied at the executor side and bump the sequence number if the last one was 
applied successfully. Not sure if there is any corner case which may not have 
been considered in this alternate approach ?!


http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/fragment-instance-state.h@99
PS1, Line 99: void SetFinalReportSent() { final_report_sent_ = true; }
May help to add comments on the expected caller and on when this is expected to 
be called ?


http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.h@417
PS1, Line 417: (bool final
Comment on the input parameter.


http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@58
PS1, Line 58: DEFINE_int32(status_report_max_failures, 3,
: "Max number of consecutive failed status reports to allow 
before cancelling");
I thought we want to use a fixed timeout approach for the maximum retries ? I 
am okay with max_retries too but I think it may make sense to have larger 
number of retries than 3.


http://gerrit.cloudera.org:8080/#/c/12049/1/be/src/runtime/query-state.cc@324
PS1, Line 324: Try to send the RPC 3 times before failing. Sleep for 100ms 
between retries.
May need to be updated.



--
To view, visit http://gerrit.cloudera.org:8080/12049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971
Gerrit-Change-Number: 12049
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Mon, 10 Dec 2018 21:56:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7943: Bump the default client timeout set on impala-shell

2018-12-06 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12051 )

Change subject: IMPALA-7943: Bump the default client timeout set on impala-shell
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc40069e86cbf93634320804efba003fb5551afe
Gerrit-Change-Number: 12051
Gerrit-PatchSet: 1
Gerrit-Owner: Anuj Phadke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Fri, 07 Dec 2018 02:58:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4475: part 1, reduce size of TExecQueryFInstancesParams

2018-12-06 Thread Michael Ho (Code Review)
Michael Ho has restored this change. ( http://gerrit.cloudera.org:8080/9251 )

Change subject: IMPALA-4475: part 1, reduce size of TExecQueryFInstancesParams
..


Restored

Relevant for IMPALA-7467 as we may still keep a lot of the Plan related 
structures from Frontend in Thrift.
--
To view, visit http://gerrit.cloudera.org:8080/9251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: restore
Gerrit-Change-Id: I195c59efc73e5fd4c310ccfc96b480d2209bde09
Gerrit-Change-Number: 9251
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-4475: part 1, reduce size of TExecQueryFInstancesParams

2018-12-06 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9251 )

Change subject: IMPALA-4475: part 1, reduce size of TExecQueryFInstancesParams
..


Patch Set 1:

We may consider reviving it as we look into IMPALA-7467. Let's keep it opened 
for now.


--
To view, visit http://gerrit.cloudera.org:8080/9251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I195c59efc73e5fd4c310ccfc96b480d2209bde09
Gerrit-Change-Number: 9251
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Dec 2018 18:54:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-12-03 Thread Michael Ho (Code Review)
Hello Lars Volker, Balazs Jeszenszky, Zoram Thanga, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12000

to look at the new patch set (#6).

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between when a fragment
instance failed to send status reports to the coordinator
for various reasons (e.g. IMPALA-2990) or a truly stuck
fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that it's possible that there is delay
between when the status was created on the executor and when
it arrived at the coordinator. Given that the clocks are not
necessarily synchronized across all executors, the receiving
time of the update at the coordinator seems easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (:<# splits>/): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M tests/query_test/test_observability.py
5 files changed, 66 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/12000/6
--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-12-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 6: Code-Review+2

Carry Lars' +2.


--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 03 Dec 2018 21:49:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-12-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12000/5/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12000/5/tests/query_test/test_observability.py@28
PS5, Line 28: =
> flake8: E225 missing whitespace around operator
Done


http://gerrit.cloudera.org:8080/#/c/12000/5/tests/query_test/test_observability.py@28
PS5, Line 28: MAX_WAIT
> Now that this is removed from the code, I suggest naming it something like
Done



--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 03 Dec 2018 21:49:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6955: fix test query concurrency and server startup sequence

2018-12-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12019 )

Change subject: IMPALA-6955: fix test_query_concurrency and server startup 
sequence
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12019/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12019/1/be/src/service/impala-server.cc@2202
PS1, Line 2202:   http_handler_.reset(new ImpalaHttpHandler(this));
  :   http_handler_->RegisterHandlers(exec_env_->webserver());
Can you please add a comment that this needs to happen after the ImpalaServer 
has registered with ExecEnv or it may result in some sort of races ?



--
To view, visit http://gerrit.cloudera.org:8080/12019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If22f71ab6edaf9a6b46afc0985c73dc4625b5103
Gerrit-Change-Number: 12019
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 03 Dec 2018 19:54:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-12-03 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12000/4/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12000/4/tests/query_test/test_observability.py@19
PS4, Line 19: from tests.common.impala_cluster import ImpalaCluster
> flake8: F401 'math.ceil' imported but unused
Done


http://gerrit.cloudera.org:8080/#/c/12000/4/tests/query_test/test_observability.py@74
PS4, Line 74:
> The other test below uses 300 seconds here. I suggest to use the same time,
Done



--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 03 Dec 2018 19:38:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-12-03 Thread Michael Ho (Code Review)
Hello Lars Volker, Balazs Jeszenszky, Zoram Thanga, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12000

to look at the new patch set (#5).

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between when a fragment
instance failed to send status reports to the coordinator
for various reasons (e.g. IMPALA-2990) or a truly stuck
fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that it's possible that there is delay
between when the status was created on the executor and when
it arrived at the coordinator. Given that the clocks are not
necessarily synchronized across all executors, the receiving
time of the update at the coordinator seems easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (:<# splits>/): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M tests/query_test/test_observability.py
5 files changed, 63 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/12000/5
--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6656: BufferAllocator observability

2018-11-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11947 )

Change subject: IMPALA-6656: BufferAllocator observability
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11947
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12b740b8ea7773b3215681531dfa62db55cfdf18
Gerrit-Change-Number: 11947
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 01 Dec 2018 00:31:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12000/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12000/3/be/src/runtime/coordinator-backend-state.cc@571
PS3, Line 571:
> Maybe clamp this at 0 to make sure we won't return a negative value if some
Done


http://gerrit.cloudera.org:8080/#/c/12000/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12000/3/tests/query_test/test_observability.py@83
PS3, Line 83: rt_time_str, '%Y-%
> get_thrift_profile takes a timeout, so you can call it once before the loop
Done


http://gerrit.cloudera.org:8080/#/c/12000/3/tests/query_test/test_observability.py@90
PS3, Line 90:
> Can you make this a constant and add a comment. Otherwise this test will fa
Iterate through all the nodes in the new PS.


http://gerrit.cloudera.org:8080/#/c/12000/3/tests/query_test/test_observability.py@108
PS3, Line 108: assert result.exec_summary[scan_idx]['detail'] == 
'functional_kudu.alltypestiny'
> I found this part tricky to read. Why is it 8 seconds? Can you elaborate in
The test in new PS is simplified to not check for monotonicity of the timestamp 
as that's not part of the scope.


http://gerrit.cloudera.org:8080/#/c/12000/3/tests/query_test/test_observability.py@112
PS3, Line 112: result = self.execute_query(quer
> Should we just call close_query() here?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Sat, 01 Dec 2018 00:01:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-30 Thread Michael Ho (Code Review)
Hello Lars Volker, Balazs Jeszenszky, Zoram Thanga, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12000

to look at the new patch set (#4).

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between when a fragment
instance failed to send status reports to the coordinator
for various reasons (e.g. IMPALA-2990) or a truly stuck
fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that it's possible that there is delay
between when the status was created on the executor and when
it arrived at the coordinator. Given that the clocks are not
necessarily synchronized across all executors, the receiving
time of the update at the coordinator seems easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (:<# splits>/): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M tests/query_test/test_observability.py
5 files changed, 62 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/12000/4
--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6656: BufferAllocator observability

2018-11-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11947 )

Change subject: IMPALA-6656: BufferAllocator observability
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11947/4/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

http://gerrit.cloudera.org:8080/#/c/11947/4/be/src/runtime/bufferpool/buffer-allocator.cc@37
PS4, Line 37: //
nit: /// Same below.


http://gerrit.cloudera.org:8080/#/c/11947/4/be/src/runtime/bufferpool/buffer-pool-test.cc
File be/src/runtime/bufferpool/buffer-pool-test.cc:

http://gerrit.cloudera.org:8080/#/c/11947/4/be/src/runtime/bufferpool/buffer-pool-test.cc@75
PS4, Line 75: test_env_->DisableBufferPool();
Can you please add a comment on why this is necessary ?


http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json@1330
PS4, Line 1330: allocating from this system
in system allocator ?


http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json@1340
PS4, Line 1340: Counts the number of times
"Number of times" may be more succinct. Same below.


http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json@1350
PS4, Line 1350: the number
the number of times


http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json@1350
PS4, Line 1350:  allocated
directly allocated


http://gerrit.cloudera.org:8080/#/c/11947/4/common/thrift/metrics.json@1370
PS4, Line 1370: a free buffer was recycled within same NUMA node to fulfi
a recycled buffer within the same NUMA node was used to fufil... ?



--
To view, visit http://gerrit.cloudera.org:8080/11947
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12b740b8ea7773b3215681531dfa62db55cfdf18
Gerrit-Change-Number: 11947
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 30 Nov 2018 22:10:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1048: show sinks in exec summary

2018-11-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11967 )

Change subject: IMPALA-1048: show sinks in exec summary
..


Patch Set 8: Code-Review+1

I suppose it's safe to assume that no external software depends on the format 
of the exec summary, right ? I suppose not but just wanna double check.


--
To view, visit http://gerrit.cloudera.org:8080/11967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
Gerrit-Change-Number: 11967
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 29 Nov 2018 18:52:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12000/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12000/1/be/src/runtime/coordinator-backend-state.cc@47
PS1, Line 47: "Last report received time";
> I am nit-picking somewhat, but this is really 'Last received time' from the
How about "Last report received time". "received time" by itself may not be 
very clear on what was received.


http://gerrit.cloudera.org:8080/#/c/12000/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12000/1/tests/query_test/test_observability.py@95
PS1, Line 95:   if report_time < report_time_dict.get(node.name, 
datetime.min):
> What happens here the first time through, when node.name does not yet exist
The second argument (datetime.min) is the default value if the key doesn't 
exist.


http://gerrit.cloudera.org:8080/#/c/12000/1/tests/query_test/test_observability.py@111
PS1, Line 111: assert num_time_backward <= ceil(elapsed_time / 
MIN_NTP_POLL_PERIOD)
> Should this be num_time_backward? time_backward is a boolean, and shouldn't
Nice catch. Fixed.


http://gerrit.cloudera.org:8080/#/c/12000/1/tests/query_test/test_observability.py@112
PS1, Line 112: s
> flake8: F841 local variable 'results' is assigned to but never used
Done



--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 29 Nov 2018 00:04:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-28 Thread Michael Ho (Code Review)
Hello Balazs Jeszenszky, Zoram Thanga, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12000

to look at the new patch set (#2).

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between when a fragment
instance failed to send status reports to the coordinator
for various reasons (e.g. IMPALA-2990) or a truly stuck
fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that it's possible that there is delay
between when the status was created on the executor and when
it arrived at the coordinator. Given that the clocks are not
necessarily synchronized across all executors, the receiving
time of the update at the coordinator seems easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (:<# splits>/): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M tests/query_test/test_observability.py
5 files changed, 85 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/12000/2
--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-1048: show sinks in exec summary

2018-11-28 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11967 )

Change subject: IMPALA-1048: show sinks in exec summary
..


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/exec/plan-root-sink.cc@79
PS7, Line 79: while (results_ == nullptr && !state->is_cancelled()) 
sender_cv_.Wait(l);
Should we count the time spent here as inactive_time ? This seems to be the 
convention for other plan nodes and DataStreamSender. Didn't dive into other 
hdfs table writer sinks to see if inactive time also applies there.


http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/runtime/coordinator.cc@234
PS7, Line 234: optional in the thrift
Should they become required then ?


http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/runtime-profile.h@307
PS7, Line 307: plan node
data sink


http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/runtime-profile.cc@962
PS7, Line 962: void RuntimeProfile::SetPlanNodeId(int node_id) {
Should we DCHECK that "data_sink_id" is not set here ?


http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/runtime-profile.cc@966
PS7, Line 966: void RuntimeProfile::SetDataSinkId(int sink_id) {
Should we DCHECK that "plan_node_id" is not set here ?


http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/summary-util.cc
File be/src/util/summary-util.cc:

http://gerrit.cloudera.org:8080/#/c/11967/7/be/src/util/summary-util.cc@79
PS7, Line 79: LOG(INFO) << label_ss.str() << " " << node.node_id;
Is this for debugging only ?


http://gerrit.cloudera.org:8080/#/c/11967/7/common/thrift/RuntimeProfile.thrift
File common/thrift/RuntimeProfile.thrift:

http://gerrit.cloudera.org:8080/#/c/11967/7/common/thrift/RuntimeProfile.thrift@67
PS7, Line 67:   // Set if this node corresponds to a plan node.
:   1: optional i32 plan_node_id
:
:   // Set if this node corresponds to a data sink.
:   2: optional i32 data_sink_id
Are these two not supposed to be set at the same time ? if so, can you please 
add a comment ?



--
To view, visit http://gerrit.cloudera.org:8080/11967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
Gerrit-Change-Number: 11967
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 28 Nov 2018 23:53:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-27 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12000 )

Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12000/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12000/1/be/src/runtime/coordinator-backend-state.cc@502
PS1, Line 502: ToStringFromUnixMillis(last_report_time_ms_)
An alternate approach would be to just store 'last_report_time_ms_' as string 
and convert it using ToStringFromUnixMillis() when exporting the profile as 
Thrift object or when we are pretty printing it.

That said, it's unclear whether the complication is worth it given this is not 
in the critical path of returning query results.



--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 28 Nov 2018 01:24:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6741: Add timestamp of fragment instance's status updates

2018-11-27 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12000


Change subject: IMPALA-6741: Add timestamp of fragment instance's status updates
..

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between when a fragment
instance failed to send status reports to the coordinator
for various reasons (e.g. IMPALA-2990) or a truly stuck
fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that it's possible that there is delay
between when the status was created on the executor and when
it arrived at the coordinator. Given that the clocks are not
necessarily synchronized across all executors, the receiving
time of the update at the coordinator seems easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report time: 2018-11-27 16:57:30.014
Hdfs split stats (:<# splits>/): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M tests/query_test/test_observability.py
5 files changed, 85 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/12000/1
--
To view, visit http://gerrit.cloudera.org:8080/12000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Gerrit-Change-Number: 12000
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7882: Remove StringValue padding validation

2018-11-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11989 )

Change subject: IMPALA-7882: Remove StringValue padding validation
..


Patch Set 1: Code-Review+1

Make sense to me. I suppose it was only caught in ASAN builds, right ?


--
To view, visit http://gerrit.cloudera.org:8080/11989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e7247b416ae9e1bc6da299572f2b6f326ad1c71
Gerrit-Change-Number: 11989
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 27 Nov 2018 00:46:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11964 )

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py
File tests/beeswax/impala_beeswax.py:

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py@206
PS1, Line 206:   # Get the profile after unregistering the query so 
ExecSummary is included in it.
> Thanks. I wonder if we should track the profile issue as a query lifecycle
Yes, we should consider adding the ExecSummary to the profile once all backends 
completed. Filed IMPALA-7879 for it.



--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 21 Nov 2018 21:23:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Lars Volker, Tim Armstrong, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11964

to look at the new patch set (#3).

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..

IMPALA-7852: Fix some flakiness in test_hash_join_timer.py

test_hash_join_timer.py aims to verify the timers in the
join nodes are functioning correctly. It does so by parsing
the query profile for certain patterns after the query has
finished.

Before IMPALA-4063, each individual fragment instance will
post its profile to the coordinator upon completion. After
IMPALA-4063, the profiles of all fragment instances are sent
together periodically and the final profile is sent once all
fragment instances on a backend are done.

The problem with the existing implementation of the test is
that it doesn't actually fetch results before closing the
query. As a result of it, the coordinator fragment never gets
a chance to complete as it will block forever when inserting
into the plan root sink. The lack of completion of the
coordinator fragment causes the final profiles of fragment
instances on the coordinator to be not sent before the query
is closed. As a result, the profile of a fragment instance
on the coordinator could be stale if it completes between
two periodic updates, leading to random test failure.

This change fixes the flankiness by always fetching results
before closing the query. Ideally, if we fix IMPALA-539 and
wait for all backends' final profiles before completing query
unregistration, we should get the final profile from the
coordinator fragment too.

Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
---
M be/src/runtime/query-state.cc
M tests/query_test/test_hash_join_timer.py
2 files changed, 47 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/11964/3
--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11964 )

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11964/2/tests/query_test/test_hash_join_timer.py
File tests/query_test/test_hash_join_timer.py:

http://gerrit.cloudera.org:8080/#/c/11964/2/tests/query_test/test_hash_join_timer.py@148
PS2, Line 148: ;
> flake8: E703 statement ends with a semicolon
Done



--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 21 Nov 2018 00:14:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11964 )

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py
File tests/beeswax/impala_beeswax.py:

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py@206
PS1, Line 206:   # Get the profile after unregistering the query so 
ExecSummary is included in it.
> Good point although in practice, given the default size of query log and th
I rewrote part of test_hash_join_timer.py in PS2 to use the exec_summary from 
the result set. Also did a bit of clean up there so there is no need for the 
change in this file here.



--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 21 Nov 2018 00:11:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Lars Volker, Tim Armstrong, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11964

to look at the new patch set (#2).

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..

IMPALA-7852: Fix some flakiness in test_hash_join_timer.py

test_hash_join_timer.py aims to verify the timers in the
join nodes are functioning correctly. It does so by parsing
the query profile for certain patterns after the query has
finished.

Before IMPALA-4063, each individual fragment instance will
post its profile to the coordinator upon completion. After
IMPALA-4063, the profiles of all fragment instances are sent
together periodically and the final profile is sent once all
fragment instances on a backend are done.

The problem with the existing implementation of the test is
that it doesn't actually fetch results before closing the
query. As a result of it, the coordinator fragment never gets
a chance to complete as it will block forever when inserting
into the plan root sink. The lack of completion of the
coordinator fragment causes the final profiles of fragment
instances on the coordinator to be not sent before the query
is closed. As a result, the profile of a fragment instance
on the coordinator could be stale if it completes between
two periodic updates, leading to random test failure.

This change fixes the flankiness by always fetching results
before closing the query. Ideally, if we fix IMPALA-539 and
wait for all backends' final profiles before completing query
unregistration, we should get the final profile from the
coordinator fragment too.

Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
---
M be/src/runtime/query-state.cc
M tests/query_test/test_hash_join_timer.py
2 files changed, 43 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/11964/2
--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7829: Mark a fragment instance as done only after Close() is called

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11939 )

Change subject: IMPALA-7829: Mark a fragment instance as done only after 
Close() is called
..


Patch Set 2: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/11939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
Gerrit-Change-Number: 11939
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Nov 2018 19:45:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7829: Mark a fragment instance as done only after Close() is called

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11939 )

Change subject: IMPALA-7829: Mark a fragment instance as done only after 
Close() is called
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11939/1/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/11939/1/be/src/runtime/fragment-instance-state.cc@100
PS1, Line 100: ReleaseThreadToken();
> Consider moving to Close()? It's acquired in Prepare() so that feels symmet
Done



--
To view, visit http://gerrit.cloudera.org:8080/11939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
Gerrit-Change-Number: 11939
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Nov 2018 19:44:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7829: Mark a fragment instance as done only after Close() is called

2018-11-20 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11939

to look at the new patch set (#2).

Change subject: IMPALA-7829: Mark a fragment instance as done only after 
Close() is called
..

IMPALA-7829: Mark a fragment instance as done only after Close() is called

As shown in IMPALA-7828. there is some non-determinism on whether the errors
detected in FragmentInstanceState::Close() will show up in the final profile
sent to the coordinator. The reason is that the current code marks a fragment
instance as "done" after ExecInternal() completes but before Close() is called.
There is a window between when the final status report is sent and when Close()
finishes.

This change fixes the problem by not sending the final report until Close()
is called. This has no implication on the first row available time for normal
queries. It may slightly lengthen the first row available time for DML queries.

Testing done: Updated udf-no-expr-rewrite.test to exercise this test

Perf run on an 8 node clusters didn't show any regression:

TPCH-300
++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
++---+-++++
| TPCH(_300) | parquet / none / none | 23.94   | -2.05% | 12.55  | 
-2.62% |
++---+-++++

Small concurrency
+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TPCDS-UNMODIFIED(_1000) | parquet / none / none | 6.89| -0.66% | 6.62 
  | +0.41% |
+-+---+-++++

Medium concurrency
+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TPCDS-UNMODIFIED(_1000) | parquet / none / none | 55.57   | -1.04% | 
55.27  | -0.98% |
+-+---+-++++

Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
---
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
3 files changed, 12 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/11939/2
--
To view, visit http://gerrit.cloudera.org:8080/11939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
Gerrit-Change-Number: 11939
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11964 )

Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py
File tests/beeswax/impala_beeswax.py:

http://gerrit.cloudera.org:8080/#/c/11964/1/tests/beeswax/impala_beeswax.py@206
PS1, Line 206:   # Get the profile after unregistering the query so 
ExecSummary is included in it.
> I think this is introducing a new race between the query being evicted from
Good point although in practice, given the default size of query log and the 
size of the race window, it's unlikely to occur. Let me add a check in the 
test_hash_join_timer.py to make sure we check if the profile is not None.



--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Nov 2018 19:24:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7852: Fix some flakiness in test hash join timer.py

2018-11-20 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11964


Change subject: IMPALA-7852: Fix some flakiness in test_hash_join_timer.py
..

IMPALA-7852: Fix some flakiness in test_hash_join_timer.py

test_hash_join_timer.py aims to verify the timers in the
join nodes are functioning correctly. It does so by parsing
the query profile for certain patterns after the query has
finished.

Before IMPALA-4063, each individual fragment instance will
post its profile to the coordinator upon completion. After
IMPALA-4063, the profiles of all fragment instances are sent
together periodically and the final profile is sent once all
fragment instances on a backend are done.

The problem with the existing implementation of the test is
that it doesn't actually fetch results before closing the
query. As a result of it, the coordinator fragment never gets
a chance to complete as it will block forever when inserting
into the plan root sink. The lack of completion of the
coordinator fragment causes the final profiles of fragment
instances on the coordinator to be not sent before the query
is closed. As a result, the profile of a fragment instance
on the coordinator could be stale if it completes between
two periodic updates, leading to random test failure.

This change fixes the flankiness by always fetching results
before closing the query. Ideally, if we fix IMPALA-539 and
wait for all backends' final profiles before completing query
unregistration, we should get the final profile from the
coordinator fragment too.

Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
---
M be/src/runtime/query-state.cc
M tests/beeswax/impala_beeswax.py
M tests/query_test/test_hash_join_timer.py
3 files changed, 9 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/11964/1
--
To view, visit http://gerrit.cloudera.org:8080/11964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I851824dffb78c7731e60793d90f1e57050c54955
Gerrit-Change-Number: 11964
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7829: Mark a fragment instance as done only after Close() is called

2018-11-15 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11939


Change subject: IMPALA-7829: Mark a fragment instance as done only after 
Close() is called
..

IMPALA-7829: Mark a fragment instance as done only after Close() is called

As shown in IMPALA-7828. there is some non-determinism on whether the errors
detected in FragmentInstanceState::Close() will show up in the final profile
sent to the coordinator. The reason is that the current code marks a fragment
instance as "done" after ExecInternal() completes but before Close() is called.
There is a window between when the final status report is sent and when Close()
finishes.

This change fixes the problem by not sending the final report until Close()
is called. This has no implication on the first row available time for normal
queries. It may slightly lengthen the first row available time for DML queries.

Testing done: Updated udf-no-expr-rewrite.test to exercise this test

Perf run on an 8 node clusters didn't show any regression:

TPCH-300
++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
++---+-++++
| TPCH(_300) | parquet / none / none | 23.94   | -2.05% | 12.55  | 
-2.62% |
++---+-++++

Small concurrency
+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TPCDS-UNMODIFIED(_1000) | parquet / none / none | 6.89| -0.66% | 6.62 
  | +0.41% |
+-+---+-++++

Medium concurrency
+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TPCDS-UNMODIFIED(_1000) | parquet / none / none | 55.57   | -1.04% | 
55.27  | -0.98% |
+-+---+-++++

Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
---
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
3 files changed, 11 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/11939/1
--
To view, visit http://gerrit.cloudera.org:8080/11939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I61618854ae3f4e7ef20028dcb0ff5cbcfa8adb01
Gerrit-Change-Number: 11939
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7738: Implement timeouts for HDFS open calls

2018-11-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11874 )

Change subject: IMPALA-7738: Implement timeouts for HDFS open calls
..


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11874/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11874/4//COMMIT_MSG@17
PS4, Line 17: 5 minutes
How is this determined ? In this patch, only hdfsOpen() has been ported over to 
have a timeout. Do we anticipate more HDFS operations to be ported over in the 
future ? If so, will they need a different timeout periods ?


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/handle-cache.inline.h@33
PS4, Line 33: /* TODO: check return code */
nit: I believe we tend to use // for inline comments


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.h
File be/src/runtime/io/hdfs-monitored-ops.h:

http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.h@18
PS4, Line 18: IMPALA_RUNTIME_IO_HDFS_SHIM_H
nit: should match file name


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.h@30
PS4, Line 30: //
nit: ///


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.cc
File be/src/runtime/io/hdfs-monitored-ops.cc:

http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.cc@32
PS4, Line 32: HDFS operation should wait before timing out and failing
Please see comments below in HdfsMonitor::OpenFileHandle(). It appears that a 
call to Open() can potentially block for longer than this timeout period.


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.cc@55
PS4, Line 55: virtual Status ExecuteImpl();
Why not just make it a pure virtual function ? and we don't need line 70 - line 
73 below?


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.cc@66
PS4, Line 66:   // Set when ExecuteImpl() completes
Set to return status of ExecuteImpl() upon completion.


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/runtime/io/hdfs-monitored-ops.cc@139
PS4, Line 139: FLAGS_hdfs_operation_timeout_sec * MILLIS_PER_SEC);
Both the Offer() and Wait() calls can block for almost 
FLAGS_hdfs_operation_timeout_sec seconds, right ? So, this function can in 
theory block for 2 * FLAGS_hdfs_operation_timeout_sec seconds, right ?

So, should we be more clear about exactly what a HDFS operation in the 
definition (hdfs_operation_timeout_sec) means ?


http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

http://gerrit.cloudera.org:8080/#/c/11874/4/be/src/util/thread-pool.h@116
PS4, Line 116: bool Offer(V&& work, int64_t timeout_millis) {
May warrant a test in thread-pool-test.cc



--
To view, visit http://gerrit.cloudera.org:8080/11874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia14403ca5f3f19c6d5f61b9ab2306b0ad3267454
Gerrit-Change-Number: 11874
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 08 Nov 2018 20:19:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7148: Make test profile fragment instances() more robust

2018-11-07 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11906


Change subject: IMPALA-7148: Make test_profile_fragment_instances() more robust
..

IMPALA-7148: Make test_profile_fragment_instances() more robust

test_profile_fragment_instances() makes assumption about number
of instances of each scan node in the distributed query plan.
The number of instances for each scan nodes can actually vary
depending on how data is loaded and scheduler's decision.

This change relaxes the check for number of instances of each
scan node is a multiple of 3 which is the number of scan nodes
in the plan.

Change-Id: I08b068c21e9637575c85f4d54be9f7c56c106bf1
---
M tests/query_test/test_observability.py
1 file changed, 5 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/11906/1
--
To view, visit http://gerrit.cloudera.org:8080/11906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I08b068c21e9637575c85f4d54be9f7c56c106bf1
Gerrit-Change-Number: 11906
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-7828: A temporary workaround for flaky UDF test (test mem leak())

2018-11-07 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11900


Change subject: IMPALA-7828: A temporary workaround for flaky UDF test 
(test_mem_leak())
..

IMPALA-7828: A temporary workaround for flaky UDF test (test_mem_leak())

Before IMPALA-4063, the error message detected during
FragmentInstanceState::Close() was always lost. After
IMPALA-4063, we may sometimes get the error message in
FragmentInstanceState::Close(). It's non-deterministic
as the fragment instance thread may race with the query
state thread which reports the final status. The UDF test
currently tries to handle this non-determinism by using
"row_regex:.*" in the ERRORS section but it doesn't
always seem to work.

This change workarounds the issue by commenting out the
ERRORS section in udf-no-expr-rewrite.text for now.
The actual fix will be done in IMPALA-7829.

Change-Id: I6a55d5ad1a5a7278e7390f60854a8df28c1b9f28
---
M testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/11900/1
--
To view, visit http://gerrit.cloudera.org:8080/11900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6a55d5ad1a5a7278e7390f60854a8df28c1b9f28
Gerrit-Change-Number: 11900
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11556 )

Change subject: IMPALA-6323 Allow constant analytic window expressions.
..

IMPALA-6323 Allow constant analytic window expressions.

The constraint imposed by IMPALA-1354 was artificial.
If there are constant "partition by" expressions, simply drop them,
they are no-ops.

Constant "order by" expressions can be ignored as well, though in effect
they should be accounted for as null expressions in the backend, with the
effect that combine all rows in the same window (i.e. no window breaks).

Change-Id: Idf129026c45120e9470df601268863634037908c
Reviewed-on: http://gerrit.cloudera.org:8080/11556
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 
---
M be/src/exec/analytic-eval-node.cc
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
6 files changed, 77 insertions(+), 26 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Ho: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c
Gerrit-Change-Number: 11556
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11556 )

Change subject: IMPALA-6323 Allow constant analytic window expressions.
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c
Gerrit-Change-Number: 11556
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 06 Nov 2018 07:54:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11556 )

Change subject: IMPALA-6323 Allow constant analytic window expressions.
..


Patch Set 14:

Michal, can you please look into the failures here: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3524/

Please do a private build with the new patch to make sure everything is sane.


--
To view, visit http://gerrit.cloudera.org:8080/11556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c
Gerrit-Change-Number: 11556
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 05 Nov 2018 23:50:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7565: Add startup flag to set thrift connection setup thread pool size

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11873 )

Change subject: IMPALA-7565: Add startup flag to set thrift connection setup 
thread pool size
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11873/2/be/src/rpc/TAcceptQueueServer.cpp
File be/src/rpc/TAcceptQueueServer.cpp:

http://gerrit.cloudera.org:8080/#/c/11873/2/be/src/rpc/TAcceptQueueServer.cpp@30
PS2, Line 30: Impala internal, Beeswax and HiveServer2 connections
nit: could be simpler to just say "Impala internal and external connections".


http://gerrit.cloudera.org:8080/#/c/11873/2/be/src/rpc/TAcceptQueueServer.cpp@35
PS2, Line 35: internal, Beeswax and HiveServer2 connections
internal and external



--
To view, visit http://gerrit.cloudera.org:8080/11873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I31344321a5f9e840a399ccb0f963c0759e2ab234
Gerrit-Change-Number: 11873
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Mon, 05 Nov 2018 23:12:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11556 )

Change subject: IMPALA-6323 Allow constant analytic window expressions.
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c
Gerrit-Change-Number: 11556
Gerrit-PatchSet: 13
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 05 Nov 2018 19:16:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-05 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11615/8/testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
File 
testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test:

http://gerrit.cloudera.org:8080/#/c/11615/8/testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test@26
PS8, Line 26: #A fragment instance's last status report may be sent before 
calling Close() which is
> This in fact may be racy as the final profile may be sent before Close() is
So, workaround this flakiness in test by changing this to row_regex:.* for now.



--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 05 Nov 2018 18:57:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-05 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Tim Armstrong, Joe McDonnell, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11615

to look at the new patch set (#9).

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..

IMPALA-4063: Merge report of query fragment instances per executor

Previously, each fragment instance executing on an executor will
independently report its status to the coordinator periodically.
This creates a huge amount of RPCs to the coordinator under highly
concurrent workloads, causing lock contention in the coordinator's
backend states when multiple fragment instances send them at the
same time. In addition, due to the lack of coordination between query
fragment instances, a query may end without collecting the profiles
from all fragment instances when one of them hits an error before
another fragment instance manages to finish Prepare(), leading to
missing profiles for certain fragment instances.

This change fixes the problem above by making a thread per QueryState
(started by QueryExecMgr) to be responsible for periodically reporting
the status and profiles of all fragment instances of a query running
on a backend. As part of this refactoring, each query fragment instance
will not report their errors individually. Instead, there is a cumulative
status maintained per QueryState. It's set to the error status of the first
fragment instance which hits an error or any general error (e.g. failure
to start a thread) when starting fragment instances. With this change,
the status reporting threads are also removed.

Testing done: exhaustive tests

This patch is based on a patch by Sailesh Mukil

Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M common/protobuf/control_service.proto
M common/thrift/RuntimeProfile.thrift
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
A testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
D 
testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/query_test/test_observability.py
M tests/query_test/test_udfs.py
22 files changed, 423 insertions(+), 475 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/9
--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7565: Add startup flag to set thrift connection setup thread pool size

2018-11-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11873 )

Change subject: IMPALA-7565: Add startup flag to set thrift connection setup 
thread pool size
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11873/1/be/src/rpc/TAcceptQueueServer.cpp
File be/src/rpc/TAcceptQueueServer.cpp:

http://gerrit.cloudera.org:8080/#/c/11873/1/be/src/rpc/TAcceptQueueServer.cpp@30
PS1, Line 30: internal connections
nit: not necessarily Impala internal connections. This class is used for 
Beeswax and HS2 servers too.


http://gerrit.cloudera.org:8080/#/c/11873/1/be/src/rpc/TAcceptQueueServer.cpp@32
PS1, Line 32: connection
May be "accepted_cnxn_setup_thread_pool_size" so it's consistent with the other 
knob.


http://gerrit.cloudera.org:8080/#/c/11873/1/be/src/rpc/TAcceptQueueServer.cpp@34
PS1, Line 34: internal connections
nit: not necessarily Impala internal connections. This class is used for 
Beeswax and HS2 servers too.



--
To view, visit http://gerrit.cloudera.org:8080/11873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I31344321a5f9e840a399ccb0f963c0759e2ab234
Gerrit-Change-Number: 11873
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Sat, 03 Nov 2018 00:54:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7775: fix some lifecycle issues in statestore/session tests

2018-11-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11864 )

Change subject: IMPALA-7775: fix some lifecycle issues in statestore/session 
tests
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/statestore/statestore-test.cc
File be/src/statestore/statestore-test.cc:

http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/statestore/statestore-test.cc@40
PS2, Line 40: static ObjectPool* perm_objects
> Yeah, it does - if we actually run the destructors stuff can crash while be
This may be a bit subtle. Can you please add a comment about it ?



--
To view, visit http://gerrit.cloudera.org:8080/11864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b25c8b8a96bfa1183ce273b3bb4debde234dd01
Gerrit-Change-Number: 11864
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 03 Nov 2018 00:10:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7775: fix some lifecycle issues in statestore/session tests

2018-11-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11864 )

Change subject: IMPALA-7775: fix some lifecycle issues in statestore/session 
tests
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/service/session-expiry-test.cc
File be/src/service/session-expiry-test.cc:

http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/service/session-expiry-test.cc@53
PS2, Line 53:  scoped_ptr metrics(new MetricGroup("statestore"));
Does this need to live in an ObjectPool too like what was done in 
statestore-test.cc ?


http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/statestore/statestore-test.cc
File be/src/statestore/statestore-test.cc:

http://gerrit.cloudera.org:8080/#/c/11864/2/be/src/statestore/statestore-test.cc@40
PS2, Line 40: static ObjectPool* perm_objects
Why not just static ObjectPool perm_object; ? Does it have anything to do with 
the IMPALA-5291 comment in main() ?



--
To view, visit http://gerrit.cloudera.org:8080/11864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b25c8b8a96bfa1183ce273b3bb4debde234dd01
Gerrit-Change-Number: 11864
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Fri, 02 Nov 2018 22:54:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6323 Allow constant analytic window expressions.

2018-11-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11556 )

Change subject: IMPALA-6323 Allow constant analytic window expressions.
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf129026c45120e9470df601268863634037908c
Gerrit-Change-Number: 11556
Gerrit-PatchSet: 12
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 02 Nov 2018 17:52:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Reviewed-on: http://gerrit.cloudera.org:8080/11535
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Ho: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 24
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Nov 2018 02:58:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11615/8/testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
File 
testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test:

http://gerrit.cloudera.org:8080/#/c/11615/8/testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test@26
PS8, Line 26: UDF WARNING: Memory leaked via FunctionContext::Allocate(), 100 
bytes leaked via FunctionContext::TrackAl
This in fact may be racy as the final profile may be sent before Close() is 
called on a Fragment Instance. So, this is not 100% deterministic.



--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Nov 2018 00:49:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..


Patch Set 7: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 16:56:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..


Patch Set 21: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 21
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 01 Nov 2018 16:56:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11615/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11615/5//COMMIT_MSG@14
PS5, Line 14: In addition, due to the lack of coordination between query
: fragment instances, a query may end without collecting the 
profiles
: from all fragment instances when one of them hits an error before
: another fragment instance manages to finish Prepare(), leading to
: missing profiles for certain fragment instances.
> Great. Could you also remove the xfail from test_profile_fragment_instances
Done


http://gerrit.cloudera.org:8080/#/c/11615/6/tests/query_test/test_udfs.py
File tests/query_test/test_udfs.py:

http://gerrit.cloudera.org:8080/#/c/11615/6/tests/query_test/test_udfs.py@290
PS6, Line 290:   self.run_test_case('QueryTest/udf-codegen-required', 
vector, use_db=unique_database)
> This comment is outdated, and below
Done



--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 16:55:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

2018-11-01 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Todd Lipcon, Tim Armstrong, Bikramjeet Vig, Impala 
Public Jenkins, Michal Ostrowski,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10855

to look at the new patch set (#21).

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use 
KRPC
..

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts ReportExecStatus() RPC from thrift
based RPC to KRPC. This is done in part of the preparation
for fixing IMPALA-2990 as we can take advantage of TCP connection
multiplexing in KRPC to avoid overwhelming the coordinator
with too many connections by reducing the number of TCP connection
to one for each executor.

This patch also introduces a new service pool for all query execution
control related RPCs in the future so that control commands from
coordinators aren't blocked by long-running DataStream services' RPCs.
To avoid unnecessary delays due to sharing the network connections
between DataStream service and Control service, this change added the
service name as part of the user credentials for the ConnectionId
so each service will use a separate connection.

The majority of this patch is mechanical conversion of some Thrift
structures used in ReportExecStatus() RPC to Protobuf. Note that the
runtime profile is still retained as a Thrift structure as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in ReportExecStatus() RPC.

This patch also fixes IMPALA-7241 which may lead to duplicated
dml stats being applied. The fix is by adding a monotonically
increasing version number for fragment instances' reports. The
coordinator will ignore any report smaller than or equal to the
version in the last report.

Testing done:
1. Exhaustive build.
2. Added some targeted test cases for profile serialization failure
   and RPC retries/timeout.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/catalog/catalog-util.cc
M be/src/common/global-flags.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/jni-thrift-util.h
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-util-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
A be/src/service/control-service.cc
A be/src/service/control-service.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M be/src/util/container-util.h
A be/src/util/error-util-internal.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/runtime-profile.cc
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
A common/protobuf/control_service.proto
M common/protobuf/data_stream_service.proto
M common/protobuf/row_batch.proto
M common/protobuf/rpc_test.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_rpc_timeout.py
65 files changed, 1,299 insertions(+), 769 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/10855/21
--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 21
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: 

[Impala-ASF-CR] IMPALA-4063: Merge report of query fragment instances per executor

2018-11-01 Thread Michael Ho (Code Review)
Hello Thomas Marshall, Tim Armstrong, Joe McDonnell, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11615

to look at the new patch set (#7).

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
..

IMPALA-4063: Merge report of query fragment instances per executor

Previously, each fragment instance executing on an executor will
independently report its status to the coordinator periodically.
This creates a huge amount of RPCs to the coordinator under highly
concurrent workloads, causing lock contention in the coordinator's
backend states when multiple fragment instances send them at the
same time. In addition, due to the lack of coordination between query
fragment instances, a query may end without collecting the profiles
from all fragment instances when one of them hits an error before
another fragment instance manages to finish Prepare(), leading to
missing profiles for certain fragment instances.

This change fixes the problem above by making a thread per QueryState
(started by QueryExecMgr) to be responsible for periodically reporting
the status and profiles of all fragment instances of a query running
on a backend. As part of this refactoring, each query fragment instance
will not report their errors individually. Instead, there is a cumulative
status maintained per QueryState. It's set to the error status of the first
fragment instance which hits an error or any general error (e.g. failure
to start a thread) when starting fragment instances. With this change,
the status reporting threads are also removed.

Testing done: exhaustive tests

This patch is based on a patch by Sailesh Mukil

Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M common/protobuf/control_service.proto
M common/thrift/RuntimeProfile.thrift
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
A testdata/workloads/functional-query/queries/QueryTest/udf-no-expr-rewrite.test
D 
testdata/workloads/functional-query/queries/QueryTest/udf-non-deterministic.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/query_test/test_observability.py
M tests/query_test/test_udfs.py
22 files changed, 420 insertions(+), 475 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/11615/7
--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-30 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 06:00:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7727: Fix TStatusCode to TErrorCode mapping

2018-10-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11778 )

Change subject: IMPALA-7727: Fix TStatusCode to TErrorCode mapping
..


Patch Set 4:

OK. Will do a pass tonight.


--
To view, visit http://gerrit.cloudera.org:8080/11778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62527734aa73c1524c731773638590bdac9e789
Gerrit-Change-Number: 11778
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:23:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 19:

There seems to be other test failures in addition to the one you fixed: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3421/

Please run a private build to confirm all newly added tests actually passed.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 19
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Oct 2018 22:24:26 +
Gerrit-HasComments: No


<    1   2   3   4   5   6   7   8   9   10   >