[Impala-ASF-CR] IMPALA-7102(Part 1): Disable reading of erasure coding by default

2018-06-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10646 )

Change subject: IMPALA-7102(Part 1): Disable reading of erasure coding by 
default
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2751/


--
To view, visit http://gerrit.cloudera.org:8080/10646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Gerrit-Change-Number: 10646
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Adrian Ng
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Jun 2018 05:42:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7215: Implement a templatized CountingBarrier

2018-06-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10827 )

Change subject: IMPALA-7215: Implement a templatized CountingBarrier
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h
File be/src/util/counting-barrier.h:

http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@34
PS3, Line 34:   /// Sends one notification, decrementing the number of pending 
notifications by one.
Add something like:
If this is the final notifier, sets the value returned to the waiter to 
'promise_value'


http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@36
PS3, Line 36: int32_t
looks like no caller actually uses this return value and so you've annotated 
them with discard_result(). You could also just make this return void since no 
one wants the value. I'm fine either way.


http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@42
PS3, Line 42:   /// Sets the number of pending notifications to 0 and unblocks 
Wait().
similarly, add comment about 'promise_value'


http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@54
PS3, Line 54:   /// Blocks until all notifications are received.
comment the return value


http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@55
PS3, Line 55: T
would be good to return "const T&" here and below (to match Promise.Get())


http://gerrit.cloudera.org:8080/#/c/10827/3/be/src/util/counting-barrier.h@58
PS3, Line 58:   /// case '*timed_out' will be true.
comment return value



--
To view, visit http://gerrit.cloudera.org:8080/10827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
Gerrit-Change-Number: 10827
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 28 Jun 2018 04:32:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2752/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 28 Jun 2018 04:12:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 7
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 28 Jun 2018 04:12:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@336
PS1, Line 336: ) {
> I'm curious why you feel this is less readable.
As we spoke, my basic reasoning was that if we have a function signature like 
so:
def Func(param..)

then that would signify that 'param' is used to do 'Func'.

In this case:
bool Cancel(query_ctx)

that would signify, use 'query_ctx' to 'Cancel'.

However, 'query_ctx' is used for an orthogonal use case within the function 
(which is important, but is just not doing any real 'cancel' work).

There are cases where this pattern is unavoidable, but I feel like we could do 
without it here. The second patch set looks much better though. Thanks for 
doing that.



--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 28 Jun 2018 02:58:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7215: Implement a templatized CountingBarrier

2018-06-27 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10827 )

Change subject: IMPALA-7215: Implement a templatized CountingBarrier
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10827/1/be/src/util/counting-barrier.h
File be/src/util/counting-barrier.h:

http://gerrit.cloudera.org:8080/#/c/10827/1/be/src/util/counting-barrier.h@39
PS1, Line 39: ;
> rather than have a default_promise_value_, why not just pass it as a parame
Done


http://gerrit.cloudera.org:8080/#/c/10827/1/be/src/util/counting-barrier.h@46
PS1, Line 46: eturn;  // count_ can legitimately drop below 0
:   if (count_.CompareAndSwap(value, 0)) {
: promise_.Set(promise_value);
> this would be simpler if we just require the promise value for the non-bool
Good point. I implemented a TypedCountingBarrier and made CountingBarrier a 
user of TypedCountingBarrier.



--
To view, visit http://gerrit.cloudera.org:8080/10827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
Gerrit-Change-Number: 10827
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 28 Jun 2018 02:49:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7215: Implement a templatized CountingBarrier

2018-06-27 Thread Sailesh Mukil (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10827

to look at the new patch set (#3).

Change subject: IMPALA-7215: Implement a templatized CountingBarrier
..

IMPALA-7215: Implement a templatized CountingBarrier

Currently, our CountingBarrier util only notifies a 'bool' value
and uses an underlying Promise.

We're seeing cases in code where we might want to be notified of a
different kind of Promise (other than bool). We add a templatized
class TypedCountingBarrier and convert CountingBarrier to use the
TypedCountingBarrier internally.

This was identified while working on IMPALA-7163.

Testing: Ran 'core' tests.

Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
---
M be/src/exec/hdfs-scan-node.cc
M be/src/rpc/rpc-mgr-test.cc
M be/src/runtime/coordinator.cc
M be/src/util/counting-barrier.h
M be/src/util/hdfs-bulk-ops.cc
5 files changed, 54 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/10827/3
--
To view, visit http://gerrit.cloudera.org:8080/10827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
Gerrit-Change-Number: 10827
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-7215: Implement a templatized CountingBarrier

2018-06-27 Thread Sailesh Mukil (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10827

to look at the new patch set (#2).

Change subject: IMPALA-7215: Implement a templatized CountingBarrier
..

IMPALA-7215: Implement a templatized CountingBarrier

Currently, our CountingBarrier util only notifies a 'bool' value
and uses an underlying Promise.

We're seeing cases in code where we might want to be notified of a
different kind of Promise (other than bool). We add a templatized
class TypedCountingBarrier and convert CountingBarrier to use the
TypedCountingBarrier internally.

This was identified while working on IMPALA-7163.

Testing: Ran 'core' tests.

Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
---
M be/src/exec/hdfs-scan-node.cc
M be/src/rpc/rpc-mgr-test.cc
M be/src/runtime/coordinator.cc
M be/src/util/counting-barrier.h
M be/src/util/hdfs-bulk-ops.cc
5 files changed, 59 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/10827/2
--
To view, visit http://gerrit.cloudera.org:8080/10827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I05fc79228250408ae16481ae7ff3491a90d26b8e
Gerrit-Change-Number: 10827
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-7102(Part 1): Disable reading of erasure coding by default

2018-06-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10646 )

Change subject: IMPALA-7102(Part 1): Disable reading of erasure coding by 
default
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Gerrit-Change-Number: 10646
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Adrian Ng
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Jun 2018 02:16:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7102(Part 1): Disable reading of erasure coding by default

2018-06-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10646 )

Change subject: IMPALA-7102(Part 1): Disable reading of erasure coding by 
default
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2751/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/10646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Gerrit-Change-Number: 10646
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Adrian Ng
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Jun 2018 02:16:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7095: clean up scan node profiles

2018-06-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10810 )

Change subject: IMPALA-7095: clean up scan node profiles
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/hdfs-scan-node-base.h@489
PS5, Line 489: per_read_thread
> I guess it's rather per_thread_read_throughput , just like in its comment.
The name of the counter is "PerReadThreadRawHdfsThroughput", which isn't a 
great name, but the names of the variables reflect the counter name. Not sure 
if it's worth changing the counter name - I didn't have a great alternative.


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h
File be/src/exec/scan-node.h:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@46
PS5, Line 46: layet
> Nit: layer
Done


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@95
PS5, Line 95:
> Nit: too much spaces
Done


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@234
PS5, Line 234: &&
> Nit: I think it's not necessary here and it would be nice if it was consist
Done


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@256
PS5, Line 256: &&
> Since this function does not always take ownership of 'row_batch', I think
Yeah we had an extended discussion about this pattern at one point: 
https://gerrit.cloudera.org/#/c/6527/5/be/src/exec/kudu-scan-node.cc@179

I'd find lvalue references confusing since then there's no indication that 
ownership might be transferred at the callsite. I switched this to just pass in 
a pointer to the unique_ptr, since that seems to avoid both potential 
confusions. I think the pattern is harder to avoid in BlockingQueue itself, 
since it's generic and meant to work for copyable value types too.


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.cc@143
PS5, Line 143:   /// Starts counters that should start measuring in the Open() 
phase.
 :   void StartOpenCounters();
> It seems it is some leftover code.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
Gerrit-Change-Number: 10810
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 28 Jun 2018 00:41:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7095: clean up scan node profiles

2018-06-27 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10810

to look at the new patch set (#6).

Change subject: IMPALA-7095: clean up scan node profiles
..

IMPALA-7095: clean up scan node profiles

Add counters to scan node implementations where they make sense but were
missing (e.g. row batch queue counters for multithread Kudu scans) and
remove them where they don't make sense (e.g. scanner thread counters
for non-multithreaded scans).

Refactors the multithreaded Kudu and HDFS scans to share logic via
composition (single inheritance doesn't work for this case),
which enables the same set of counters to be maintained with shared
code. The row batch queueing and thread tracking is now shared. I looked
at combining the logic around 'status_', 'lock_' and 'done_' between the
two but the details were different enough that it didn't seem worth
abstracting.

Adds a PeakScannerThreadConcurrency counter - this answers a common
question.

Fixes RowsRead for data source scans.

Fix some of the comments to be more accurate/useful.

Testing:
Ran exhaustive tests. Ran various types of scans (HDFS, Kudu, HBase,
Data source) and inspected the profile output manually.

Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
---
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hbase-table-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node-mt.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/fragment-instance-state.cc
M be/src/util/blocking-queue.h
M be/src/util/thread.h
20 files changed, 466 insertions(+), 351 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10810/6
--
To view, visit http://gerrit.cloudera.org:8080/10810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
Gerrit-Change-Number: 10810
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7102(Part 1): Disable reading of erasure coding by default

2018-06-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10646 )

Change subject: IMPALA-7102(Part 1): Disable reading of erasure coding by 
default
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd3b1754541262467a6e67068b0b447882a40fb3
Gerrit-Change-Number: 10646
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Adrian Ng
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 28 Jun 2018 00:26:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-110 (part 2): Refactor PartitionedAggregationNode

2018-06-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10394 )

Change subject: IMPALA-110 (part 2): Refactor PartitionedAggregationNode
..


Patch Set 6: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10394/6/be/src/exec/aggregator.h
File be/src/exec/aggregator.h:

http://gerrit.cloudera.org:8080/#/c/10394/6/be/src/exec/aggregator.h@26
PS6, Line 26: #include "runtime/mem-tracker.h"
Nit: Can we forward-declare MemTracker instead of pulling in the whole header?


http://gerrit.cloudera.org:8080/#/c/10394/6/be/src/exec/grouping-aggregator-partition.cc
File be/src/exec/grouping-aggregator-partition.cc:

PS6: 
This way of organising the code is much saner, thank you.



--
To view, visit http://gerrit.cloudera.org:8080/10394
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e7bb583f54aa4add3738bde7f57cf3511ac567e
Gerrit-Change-Number: 10394
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 28 Jun 2018 00:22:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 6: Code-Review+1

Fix whitespace glitch, carry.


--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 27 Jun 2018 23:39:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10815

to look at the new patch set (#6).

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..

IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query 
execution has terminated

Otherwise, if the coordinator to backend CancelFInstances() RPC had failed,
the query can hang (and/or finstances can continue running until the
query is closed.

Testing:
- the modified test reproduces the hang without the impalad fix

Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M tests/query_test/test_cancellation.py
4 files changed, 51 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10815/6
--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 5: Code-Review+1

(2 comments)

Carry Lars' +1

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@366
PS1, Line 366:
> Thanks for the explanation. I think the code could be more readable if we r
Agree, done.


http://gerrit.cloudera.org:8080/#/c/10815/4/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

http://gerrit.cloudera.org:8080/#/c/10815/4/tests/query_test/test_cancellation.py@136
PS4, Line 136: "|".join(filter
> remove one "debug_action = "
Oops, done.



--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 27 Jun 2018 23:37:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10815

to look at the new patch set (#5).

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..

IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query 
execution has terminated

Otherwise, if the coordinator to backend CancelFInstances() RPC had failed,
the query can hang (and/or finstances can continue running until the
query is closed.

Testing:
- the modified test reproduces the hang without the impalad fix

Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M tests/query_test/test_cancellation.py
4 files changed, 51 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10815/5
--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 5
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans

2018-06-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10704 )

Change subject: IMPALA-6625: Skip computing parquet conjuncts for non-Parquet 
scans
..


Patch Set 9:

More tests fails, we should fix those first. You can do a dry run of the 
precommit job with https://jenkins.impala.io/job/gerrit-verify-dryrun-external/


--
To view, visit http://gerrit.cloudera.org:8080/10704
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d6c26d42db090c8a15c602f6419ad6399c329e7
Gerrit-Change-Number: 10704
Gerrit-PatchSet: 9
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Jun 2018 23:23:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5981: [DOCS] Documented SET=""

2018-06-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10816 )

Change subject: IMPALA-5981: [DOCS] Documented SET=""
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10816/1/docs/topics/impala_set.xml
File docs/topics/impala_set.xml:

http://gerrit.cloudera.org:8080/#/c/10816/1/docs/topics/impala_set.xml@57
PS1, Line 57: 
> After checking the impala-shell command and start option doc, I removed imp
The other difference between the two sets is that the SQL set has quotes around 
the strings

SQL:

  set foo="bar";
  set foo="";

Shell:

  set foo=bar;
  unset foo;


http://gerrit.cloudera.org:8080/#/c/10816/2/docs/topics/impala_set.xml
File docs/topics/impala_set.xml:

http://gerrit.cloudera.org:8080/#/c/10816/2/docs/topics/impala_set.xml@142
PS2, Line 142: 
This is true for any interface that isn't impala-shell, e.g. Hue. Remove the 
qualification here?



--
To view, visit http://gerrit.cloudera.org:8080/10816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7211405d5cc0a548c05ea5218798591873c14417
Gerrit-Change-Number: 10816
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Jun 2018 23:22:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 4: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@366
PS1, Line 366:   rpc_status = 
DebugAction(query_ctx().client_request.query_options,
> No, I wanted to simulate what happens when a single RPC fails. That way we
Thanks for the explanation. I think the code could be more readable if we 
reduced the indent by using continue, both for !client_status.ok() above and 
for !rpc_status.ok() here. I don't feel that strongly about it though.


http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@367
PS1, Line 367: COORD_CANCEL_QUERY_FINSTANCES_RPC
> The convention I've been using for the global debug action labels is to pre
I agree, with the COORD prefix it actually is clear enough.


http://gerrit.cloudera.org:8080/#/c/10815/4/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

http://gerrit.cloudera.org:8080/#/c/10815/4/tests/query_test/test_cancellation.py@136
PS4, Line 136: debug_action =
remove one "debug_action = "



--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 27 Jun 2018 23:06:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10815

to look at the new patch set (#4).

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..

IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query 
execution has terminated

Otherwise, if the coordinator to backend CancelFInstances() RPC had failed,
the query can hang (and/or finstances can continue running until the
query is closed.

Testing:
- the modified test reproduces the hang without the impalad fix

Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M tests/query_test/test_cancellation.py
4 files changed, 49 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10815/4
--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@366
PS1, Line 366:   rpc_status = 
DebugAction(query_ctx.client_request.query_options,
> Wouldn't we need to break; the loop if this was not ok()?
No, I wanted to simulate what happens when a single RPC fails. That way we can 
even exercise the retry logic with e.g. 
COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL@0.3

In the test for failing the RPC altogether, I give it a failure probability of 
100%, so it loops 3 times and then gives up, just like what would happen if the 
RPC really failed.


http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@367
PS1, Line 367: COORD_CANCEL_QUERY_FINSTANCES_RPC
> Should we name this actions so that it's clear that it's in the sender side
The convention I've been using for the global debug action labels is to prefix 
with the containing module name. So, I think the "COORD" prefix is sufficient 
for knowing this is the sender side. If we want to add one to the handler, it'd 
probably be called QUERYSTATE_CANCEL_QUERY_FINSTANCES_RPC. LMK if you feel it's 
still not clear.


http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py@139
PS1, Line 139: if fail_rpc_action != None:
> If wait_action != None and fail_rpc_action == None, then debug_action will
Sure, if you feel that's more readable to a native python speaker. (I'd have to 
try it out to know what it's doing but I don't consider myself the most 
proficient python coder). Note that the extraneous "|" was harmless, but happy 
to change this, it does seem more python-y


http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py@206
PS1, Line 206: if wait_action is None and fail_rpc_action is None \
> Then this could be "if not debug_action and ('count'..."
Done



--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 27 Jun 2018 22:24:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Dan Hecht (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10815

to look at the new patch set (#3).

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..

IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query 
execution has terminated

Otherwise, if the coordinator to backend CancelFInstances() RPC had failed,
the query can hang (and/or finstances can continue running until the
query is closed.

Testing:
- the modified test reproduces the hang without the impalad fix

Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M tests/query_test/test_cancellation.py
4 files changed, 50 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10815/3
--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED if query execution has terminated

2018-06-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10815 )

Change subject: IMPALA-7205: Respond to ReportExecStatus() RPC with CANCELLED 
if query execution has terminated
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@366
PS1, Line 366:   rpc_status = 
DebugAction(query_ctx.client_request.query_options,
Wouldn't we need to break; the loop if this was not ok()?


http://gerrit.cloudera.org:8080/#/c/10815/1/be/src/runtime/coordinator-backend-state.cc@367
PS1, Line 367: COORD_CANCEL_QUERY_FINSTANCES_RPC
Should we name this actions so that it's clear that it's in the sender side of 
the rpc?


http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py@139
PS1, Line 139: if fail_rpc_action != None:
If wait_action != None and fail_rpc_action == None, then debug_action will and 
in a "|". You could do this instead:

  debug_action = "|".join(filter(None, [wait_action, fail_rpc_action]))


http://gerrit.cloudera.org:8080/#/c/10815/1/tests/query_test/test_cancellation.py@206
PS1, Line 206: if wait_action is None and fail_rpc_action is None \
Then this could be "if not debug_action and ('count'..."



--
To view, visit http://gerrit.cloudera.org:8080/10815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb2c26edace89853f14a329f891d1f9a065a991
Gerrit-Change-Number: 10815
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 27 Jun 2018 21:43:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3040: Remove cache directive before dropping a table

2018-06-27 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10792 )

Change subject: IMPALA-3040: Remove cache directive before dropping a table
..


Patch Set 2:

Also, thinking a bit more about your theory, are you able to reproduce it by 
adding Thread.sleep() s in the required places?


--
To view, visit http://gerrit.cloudera.org:8080/10792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7701a499405e961456adea63f3592b43bd69170
Gerrit-Change-Number: 10792
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 27 Jun 2018 16:58:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7178: Add the possibility to reduce logging for common data errors

2018-06-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10793 )

Change subject: WIP IMPALA-7178: Add the possibility to reduce logging for 
common data errors
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10793/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10793/1//COMMIT_MSG@22
PS1, Line 22: fragments
> My problem with changing RuntimeState::LogError() to log only the first err
So it sounds like the motivation is to be able to define a "scope" for what is 
consider to be redundant errors? I agree doing that in LogError() would not be 
straightforward.

My main concern with building more logic on top of LogError() is that the error 
log stuff seems a bit broken and so adding more complication on top is hard to 
reason about. Specifically, the use of max_errors query option seems 
inconsistent. So wondering if we need to step back and rethink how the warning 
collection works generally rather than adding another layer on top.

Anyway, I'm okay with adding this scoped warning collector thingy if you feel 
it's needed but it needs more explanation in the class comment.


http://gerrit.cloudera.org:8080/#/c/10793/1/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/10793/1/be/src/runtime/runtime-state.h@234
PS1, Line 234:   /// when a new error arrives or when errors need to be flushed.
this comment needs more explanation. With the current comment, it's not easy 
for someone to determine whether they should use this class or just call 
LogError() directly. You should explain how this differs from just calling 
LogError().


http://gerrit.cloudera.org:8080/#/c/10793/1/be/src/runtime/runtime-state.h@241
PS1, Line 241: if there is no error
isn't it: return true if there is already an error with this error code?



--
To view, visit http://gerrit.cloudera.org:8080/10793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3b7c1fd020a7ba5e0d9c619e1b67236dce198aa
Gerrit-Change-Number: 10793
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Jun 2018 16:46:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-06-27 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1435
PS18, Line 1435: / Only reload file metadata of partitions speci
> I think patch set 20 reflects something like this. Could you please check a
not really. the code in that patch loops over the partitions to delete, and for 
each such partition, checks if the partition's location is shared amongst more 
than one. The sharing test depends on that new map in hdfstable, which is 
maintained incrementally. I'm expecting something like this:

1) build delete partitions index, mapping location to partition
2) for each partition in table
  (location, delete_partition) = location_map.get(partition)
  if partition != delete_partition then we have sharing, error.

Might want to collect all cases of shared locations violations.



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Jun 2018 16:07:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7095: clean up scan node profiles

2018-06-27 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10810 )

Change subject: IMPALA-7095: clean up scan node profiles
..


Patch Set 5:

(6 comments)

Did an initial pass over the code. Found some minor issues.

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/hdfs-scan-node-base.h@489
PS5, Line 489: per_read_thread
I guess it's rather per_thread_read_throughput , just like in its comment.


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h
File be/src/exec/scan-node.h:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@46
PS5, Line 46: layet
Nit: layer


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@95
PS5, Line 95:
Nit: too much spaces


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@234
PS5, Line 234: &&
Nit: I think it's not necessary here and it would be nice if it was consistent 
with EnqueueBatch().


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.h@256
PS5, Line 256: &&
Since this function does not always take ownership of 'row_batch', I think it 
is confusing to see std::move at the call sites.
Maybe a non-const l-value reference or a pointer would be a little bit more 
straightforward.


http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/10810/5/be/src/exec/scan-node.cc@143
PS5, Line 143:   /// Starts counters that should start measuring in the Open() 
phase.
 :   void StartOpenCounters();
It seems it is some leftover code.



--
To view, visit http://gerrit.cloudera.org:8080/10810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
Gerrit-Change-Number: 10810
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Jun 2018 16:02:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-7178: Add the possibility to reduce logging for common data errors

2018-06-27 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10793 )

Change subject: WIP IMPALA-7178: Add the possibility to reduce logging for 
common data errors
..


Patch Set 1:

(2 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/10793/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10793/1//COMMIT_MSG@12
PS1, Line 12: the logs
: will contain a new line for every occurrence
> But isn't that the point of the change (as you state below).  I guess it wo
The log lines to eliminate are the ones that report the exact same information. 
These would be merged to max two lines:

1. one at the first occurrence of the error
2. one when the error collector is flushed - this would contain the number of 
errors since 1.

This logic is probably more complicated then it should be, but it ensures that 
minimal information is lost.

It would be simpler to avoid the flushing logic by logging only the first 
errors, but as RuntimeState reports the number of occurrences per error code, 
some way is needed to increment the counters.


http://gerrit.cloudera.org:8080/#/c/10793/1//COMMIT_MSG@22
PS1, Line 22: fragments
> As the class is written today, it's not really tied to RuntimeState. You co
My problem with changing RuntimeState::LogError() to log only the first error 
for a code is that it would completely hide if an error occurs in more than one 
files/columns. Having one LogCollector per ColumnReader means that the error in 
one column cannot hide the one in another.

An alternate approach would be to keep a LogCollector per HdfsScanner, and keep 
the last message, and log only if the message was changed. This would simplify 
the interface and work well with column readers.

Another possibility would be to create a class to collect specially data 
errors. Its LogError() function could expect table/file/column parameters, and 
use them as the key of a map and count errors separately.



--
To view, visit http://gerrit.cloudera.org:8080/10793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3b7c1fd020a7ba5e0d9c619e1b67236dce198aa
Gerrit-Change-Number: 10793
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Jun 2018 14:14:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7190: Remove unsupported format writer support

2018-06-27 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10823 )

Change subject: IMPALA-7190: Remove unsupported format writer support
..


Patch Set 1: Code-Review+1

(2 comments)

Some minor comments for tests, otherwise lgtm.

http://gerrit.cloudera.org:8080/#/c/10823/1/tests/query_test/test_compressed_formats.py
File tests/query_test/test_compressed_formats.py:

http://gerrit.cloudera.org:8080/#/c/10823/1/tests/query_test/test_compressed_formats.py@150
PS1, Line 150:   def test_unsupported_writers(self, vector):
I would prefer to use unique_database, and remove the drops from 
unsupported-writers.test


http://gerrit.cloudera.org:8080/#/c/10823/1/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/10823/1/tests/shell/test_shell_interactive.py@a404
PS1, Line 404:
This line could be left here, but check that "... not in result.stduot". This 
would make it easier to restore the original test when there will be new 
deprecated query options.



--
To view, visit http://gerrit.cloudera.org:8080/10823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I821dc7495a901f1658daa500daf3791b386c7185
Gerrit-Change-Number: 10823
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 27 Jun 2018 13:13:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-06-27 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1435
PS18, Line 1435: / Only reload file metadata of partitions speci
> yes, I think drop partitions complexity should be O(http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 20
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 27 Jun 2018 07:27:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3040: Remove cache directive before dropping a table

2018-06-27 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10792 )

Change subject: IMPALA-3040: Remove cache directive before dropping a table
..


Patch Set 2:

Your theory seems plausible to me

> I think table is dropped concurrently with

Do you know what in test_caching_ddl() is calling this drop (drop db 
cascade/drop table etc.) ? It does not seem to be using unique_db_fixture and 
runs serially. So I'm wondering what is triggering a race between load() and 
drop().

> Now the questions is whether listPartitionNames() returns an empty list if 
> the table doesn't exist. The first thing to notices is that 
> listPartitionNames() might throw NoSuchObjectException, so intuitively that 
> should happen if the table doesn't exist, which is not true. The relevant 
> code is at 
> https://github.com/apache/hive/blob/966b83e3b9123bb455572d47878601d60b86999e/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4717.
>  The NoSuchObjectException is only thrown by fireReadTablePreEvent(), which 
> is some kind of hook mechanism and might be a no-op in most cases. The 
> backend implementation is at 
> https://github.com/apache/hive/blob/966b83e3b9123bb455572d47878601d60b86999e/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L3247.
>  It merely executes a select query which doesn't check if the table exists at 
> all. So yes, it will return an empty list.

You are right about this. I wrote a quick HMSClient class to confirm this. 
Following prints 0.

$javac -cp "fe/target/dependency/*" TestListPartitionNames.java
$java -cp "fe/target/dependency/*":$HADOOP_CONF_DIR:. TestListPartitionNames

import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.conf.HiveConf;
import java.util.List;

public class TestListPartitionNames {
  public static void main(String[] args) throws Exception {
HiveMetaStoreClient client = new HiveMetaStoreClient(
new HiveConf(), null);
List parts = client.listPartitionNames("non_existent_db_blah_blah", 
"foo", (short) -1);
System.out.println(parts.size());
  }
}


--
To view, visit http://gerrit.cloudera.org:8080/10792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7701a499405e961456adea63f3592b43bd69170
Gerrit-Change-Number: 10792
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 27 Jun 2018 06:03:21 +
Gerrit-HasComments: No