[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5643: Add total number of threads created per group to 
/threadz
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5643: Add total number of threads created per group to 
/threadz
..


IMPALA-5643: Add total number of threads created per group to /threadz

Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
Reviewed-on: http://gerrit.cloudera.org:8080/7390
Reviewed-by: Henry Robinson 
Tested-by: Impala Public Jenkins
---
M be/src/util/thread.cc
M www/threadz.tmpl
2 files changed, 15 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Henry Robinson: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/843/

-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#25).

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..

IMPALA-4674: Part 2: port backend exec to BufferPool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.

min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.

Port ExecNodes to use BufferPool:
  * Each ExecNode has to claim its reservation during Open()
  * Port Sorter to use BufferPool.
  * Switch from BufferedTupleStream to BufferedTupleStreamV2
  * Port HashTable to use BufferPool via a Suballocator.

This also makes PAGG memory consumption more efficient (avoid wasting buffers)
and improve the spilling algorithm:
* Allow preaggs to execute with 0 reservation - if streams and hash tables
  cannot be allocated, it will pass through rows.
* Halve the buffer requirement for spilling aggs - avoid allocating
  buffers for aggregated and unaggregated streams simultaneously.
* Rebuild spilled partitions instead of repartitioning (IMPALA-2708)

TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Implement max_row_size query option.

Testing:
* Updated tests to reflect new memory requirements

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/partitioned-hash-join-node.inline.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/buffered-block-mgr-test.cc
D be/src/runtime/buffered-block-mgr.cc
D be/src/runtime/buffered-block-mgr.h
D be/src/runtime/buffered-tuple-stream-test.cc
D be/src/runtime/buffered-tuple-stream.cc
D be/src/runtime/buffered-tuple-stream.h
D be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/initial-reservations.cc
A be/src/runtime/initial-reservations.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/client-request-state.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.h
M be/src/util/static-asserts.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M 

[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 23:

(27 comments)

http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

PS23, Line 194: !buffer_pool_client_.is_registered()
> I guess that's needed for subplans?
Yeah exactly. It's a little ugly but I couldn't think of a cleaner way to deal 
with this.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.h
File be/src/exec/analytic-eval-node.h:

PS23, Line 336: reservation
> the stream will try to grow the reservation first, right?
Done


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 209: _pool_client_, resource_profile_.min_reservation);
> do we somewhere inside there dcheck that the exec-node has at least min_res
No. I'm not sure that it's a useful check either - if the ExecNode 
implementation wants to give up reservation earlier that seems OK.


Line 226:   resource_profile_.spillable_buffer_size < 
buffer_pool->min_buffer_len()) {
> when can that happen?
It's possible if the coordinator has a different configuration from the 
backends. It seems unlikely that anyone would configure it in that way but who 
knows.


Line 233:   // to get some reservation.
> I'm not sure what that comment is saying.
Yeah it was garbled. I don't think it's needed.


PS23, Line 240: if (resource_profile_.min_reservation > 0) {
> maybe delete that to exercise this path unconditionally
Done


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

PS23, Line 948: 
> is that solved or was it not a problem?
This was IMPALA-3899, which is fixed. This comment just didn't get updated.


PS23, Line 532: Either a pointer to the single tuple of this row 
> where does the tuple live in that case?
Reworded the comment to be clearer (hopefully).


PS23, Line 605: success
> that's kind of confusing given the OK on status means success also. Maybe m
Yeah got_memory is clearer.


PS23, Line 617:   
> blank
Done


PS23, Line 625: Status* status
> do we need to pass that indirectly for codegen/perf still? i thought we mad
I tried to do this a while back with BTS::AddRow() but it was a can of worms.

The problem was mainly codegen time, IIRC the runtime difference wasn't 
significant. In a lot of cases the compiler couldn't infer that the destructor 
for the automatic Status variable was dead code so the IR ends up with lots of 
unnecessary branches from the Status destructors.


PS23, Line 688: success
> same comment
Done


PS23, Line 854: Status* status
> same question about status perf. otherwise, seems better to pass HtData**
See above comment.


PS23, Line 862: success
> same comment
changed to got_memory


PS23, Line 879: status
> same comment vs DuplicateNode**
See above comment.


PS23, Line 895: tatus* status)
> this is opposite all the "bool* success" interfaces. reason for that?
Same as the other ones - this gets inlined into the IR and the automatic Status 
causes problems.


PS23, Line 923: stores_tuples_
> this could probably use a comment (the comment above seems to apply to the 
Yeah Tuple* is significantly faster at least until GetTupleRow() is codegen'd.

Added a comment.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/partitioned-aggregation-node-ir.cc
File be/src/exec/partitioned-aggregation-node-ir.cc:

Line 224:   if (hash_tbl == nullptr) return false; // Hash table was not 
created - pass through.
> why does that happen now? or is it a bug fix?
This was required so that preaggs can execute with zero reservation.

Before we "got away" with allocating small hash tables and streams because the 
hash table and small buffer memory wasn't counted against the reservations.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS23, Line 88: page
> should we express that as min_buffer_size instead?  maybe easy to understan
Yeah that would be more consistent.

There was already a min_buffer_size flag in disk-io-mgr so I combined the two - 
they'll be merged anyway when we switch the io-mgr over to the buffer pool.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.cc
File be/src/runtime/initial-reservations.cc:

Line 69:   DCHECK(success);
> Explain why success is expected (planner computation).
Done


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.h
File be/src/runtime/initial-reservations.h:

PS23, Line 40:  This creates trackers for initial reservations under those.
> This seems similar to a SubReservation, no?
Right, they're both wrappers around ReservationTrackers but a SubReservation is 
associated with a BufferPool Client, and the operations are a bit different.


PS23, Line 47: Claim
> claim 

[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7267/4/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 527:   if (!insertStmt.hasNoClusteredHint() && 
!ctx_.isSingleNodeExec()) {
Why is this limited to Kudu?

Inserting into Partitioned HDFS tables is a very common use case.


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..


Patch Set 3: Code-Review+2

(1 comment)

I can run the pre-commit job once you fix the longline. +2'ing since the patch 
is trivial.

http://gerrit.cloudera.org:8080/#/c/7188/3/shell/impala_shell.py
File shell/impala_shell.py:

Line 1333: print_to_stderr("Option --ldap_password_cmd requires using LDAP 
authentication mechanism (-l)")
long line


-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7223

to look at the new patch set (#19).

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..

IMPALA-4862: make resource profile consistent with backend behaviour

This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.

The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
  and released in Close().
* Blocking nodes call Open() on their child before acquiring
  their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
  from Open().
* The peak resource consumption of the query is the sum of the
  independent fragments (except for the parallel join build plans
  where we can assume there will be synchronisation). This is
  conservative but we don't synchronise fragment Open() and Close()
  across exchanges so can't make stronger assumptions in general.

Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.

Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
  pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
  more complex plan trees inside a single fragment, which is useful
  for testing the peak resource requirement logic.

Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
A fe/src/main/java/org/apache/impala/util/MathUtil.java
A fe/src/test/java/org/apache/impala/util/BitUtilTest.java
A fe/src/test/java/org/apache/impala/util/MathUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 

[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 18: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/839/

-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5629: avoid expensive list::size() call
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5629: avoid expensive list::size() call
..


IMPALA-5629: avoid expensive list::size() call

As a workaround until we move to GCC5+, explicitly track the pages_
list size. This is not too bad in practice since it is only mutated
in 3 places.

Testing:
Ran buffered-tuple-stream-v2-test (the only coverage of
BufferedTupleStreamV2 currently).

Reran the query with the perf issue, confirmed that it was no longer
spending lots of time in BufferedTupleStreamV2::AdvanceWritePage().

Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Reviewed-on: http://gerrit.cloudera.org:8080/7382
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
2 files changed, 21 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Better benchmark heuristic
..


Patch Set 1: -Code-Review

-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5582: Null-aware anti-join can take a long time to cancel

2017-07-10 Thread anujphadke (Code Review)
anujphadke has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7393

Change subject: IMPALA-5582: Null-aware anti-join can take a long time to cancel
..

IMPALA-5582: Null-aware anti-join can take a long time to cancel

Queries with a null-aware anti-join joining on a large number of NULLs
can take a long time to cancel if threads are stuck in
PartitionedHashJoinNode::EvaluateNullProbe(). This change adds the
RETURN_IF_CANCELLED macro to the function.

Testing:
Added logs to PartitionedHashJoinNode::EvaluateNullProbe and made sure
that the function returns right away on cancellation.

Change-Id: I0800754d4ad31cbadbdfadc630c640963f3f6053
---
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
2 files changed, 12 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/7393/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0800754d4ad31cbadbdfadc630c640963f3f6053
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 


[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem
..


IMPALA-5623: Fix lag() on STRING cols to release UDF mem

IMPALA-4120 fixed an issue where lead/lag was potentially
operating on memory that the UDA didn't own, resulting in
potentially wrong results. As part of that fix, lead and lag
started allocating 'global' UDF memory (e.g. via Allocate()
rather than AllocateLocal()) in Init() which needs to be freed
in Serialize() or Finalize(), but only lead() was updated to
free the memory. This memory is eventually freed when the
fragment is torn down, but as a result of not freeing the
memory in Serialize or Finalize, the memory may be allocated
longer than necessary.

Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3
Reviewed-on: http://gerrit.cloudera.org:8080/7371
Reviewed-by: Matthew Jacobs 
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
2 files changed, 19 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth

2017-07-10 Thread Donghui Xu (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7188

to look at the new patch set (#4).

Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used 
without LDAP auth
..

IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth

When only with ldap_password_cmd option, impala-shell runs successfully.

I solved this problem by throwing an error when --ldap_password_cmd is
used without LDAP auth, that is, ldap_password_cmd option will only
take effect if ldap option presents.

Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
---
M shell/impala_shell.py
1 file changed, 5 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7188/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5643: Add total number of threads created per group to 
/threadz
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/841/

-- 
To view, visit http://gerrit.cloudera.org:8080/7390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-10 Thread Donghui Xu (Code Review)
Hello Impala Public Jenkins, Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7179

to look at the new patch set (#11).

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..

IMPALA-5507: Add clear description to help information of KEYVAL option

Help information of KEYVAL option in impala-shell is not clear enough.

I fix this issue by adding clear description to help information of
KEYVAL option.

Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
---
M shell/option_parser.py
1 file changed, 6 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2: Code-Review+1

I think this makes sense (bash flags are weird). It'd be good to get someone 
more familiar with bash to check as well.

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7391/2/functions.sh
File functions.sh:

Line 380: if set +o | grep -q "set -o xtrace"; then
Should we parse "set -o" here instead?


-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


IMPALA-5164: Bloom filter benchmark fails to complete

Due to large allocations during benchmark, we spend a lot of time
under memory pressure, which triggers the context switch detection.
Not using micro-benchmark heuristics for these benchmarks fixes the
problem.

Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Reviewed-on: http://gerrit.cloudera.org:8080/7381
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/bloom-filter-benchmark.cc
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3603 [DOCS] Document handling of NaN values
..


IMPALA-3603 [DOCS] Document handling of NaN values

Added information in the "DOUBLE Data Type" (impala_double.html)
and the "FLOAT Data Type" (impala_float.html) topics about
how Impala handles NaN values.

Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0
Reviewed-on: http://gerrit.cloudera.org:8080/7098
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins
---
M docs/shared/impala_common.xml
M docs/topics/impala_double.xml
M docs/topics/impala_float.xml
3 files changed, 22 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Brown: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Will Berkeley 


[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3603 [DOCS] Document handling of NaN values
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars
..


IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars

In the "Restrictions" section of the "APPX_MEDIAN Function"
topic, added information about how the function truncates
string values.

Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2
Reviewed-on: http://gerrit.cloudera.org:8080/7094
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins
---
M docs/topics/impala_appx_median.xml
1 file changed, 6 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Brown: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Peter Brejcak


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has submitted this change and it was merged.

Change subject: Prevent leaking AWS credentials to the log.
..


Prevent leaking AWS credentials to the log.

Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
---
M functions.sh
1 file changed, 11 insertions(+), 0 deletions(-)

Approvals:
  David Knupp: Looks good to me, approved
  Lars Volker: Verified
  Matthew Jacobs: Looks good to me, but someone else must approve
  Tim Wood: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Wood 


[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3603 [DOCS] Document handling of NaN values
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/143/

-- 
To view, visit http://gerrit.cloudera.org:8080/7098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Will Berkeley 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Peter Brejcak
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2:

Thank you all for the quick reviews!

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: Prevent leaking AWS credentials to the log.
..


Patch Set 2: Code-Review+1

Lars asked me to look at this as well, and I feel the same way. 

Tim Wood is better at bash than me. I'll confer with him, then +2 if things 
check out.

-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec

2017-07-10 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and 
compression codec
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7245/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4863/IMPALA-5311: Correctly account the file type and 
compression codec
> It would be easier to review if we included the IMPALA-5311 fix in this, si
Done


http://gerrit.cloudera.org:8080/#/c/7245/2/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS2, Line 553: 
> It seems like this will remove useful information for file formats where we
Done


Line 771: 
> missing space - maybe run it it through clang-format before posting?
Done


http://gerrit.cloudera.org:8080/#/c/7245/2/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS2, Line 250: t 
> should have spaces around " = "
Done


PS2, Line 462: 
> don't need space between >> in C++11
Not needed anymore.
Moved std::pair to a tuple


-- 
To view, visit http://gerrit.cloudera.org:8080/7245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/141/

-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: single_node_perf_run.py: clean up newly-added testdata
..


single_node_perf_run.py: clean up newly-added testdata

In single_node_perf_run.py, restore_workloads() can make the tree
"dirty", and when a tree is dirty, git won't let you switch branches
in a way that clobbers the dirty file contents:

$ cd $(mktemp -d)
$ git init .
Initialized empty Git repository in /tmp/tmp.H0NxzTXLUj/.git/
$ touch foo && git add foo && git commit -a -m "foo"
[master (root-commit) 3776149] foo
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 foo
$ git checkout -b ok_foo && echo "ok" >> foo && git commit -a -m "foo is ok"
Switched to a new branch 'ok_foo'
[ok_foo 9fd5bde] foo is ok
 1 file changed, 1 insertion(+)
$ git checkout master && echo "not ok" >> foo
Switched to branch 'master'
$ git checkout ok_foo
error: Your local changes to the following files would be overwritten by 
checkout:
foo
Please, commit your changes or stash them before you can switch branches.
Aborting

Discovered when testing single_node_perf_run with
https://gerrit.cloudera.org/#/c/7153/; after this commit, that patch
works with single_node_perf_run.py

Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4
Reviewed-on: http://gerrit.cloudera.org:8080/7291
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M bin/single_node_perf_run.py
1 file changed, 5 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/142/

-- 
To view, visit http://gerrit.cloudera.org:8080/7094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Peter Brejcak
Gerrit-HasComments: No


[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: single_node_perf_run.py: clean up newly-added testdata
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 24:

Rebased and fixed a bunch of tests.

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 24
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7391

Change subject: Prevent leaking AWS credentials to the log.
..

Prevent leaking AWS credentials to the log.

Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
---
M functions.sh
1 file changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/91/7391/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[native-toolchain-CR] Prevent leaking AWS credentials to the log.

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: Prevent leaking AWS credentials to the log.
..

Prevent leaking AWS credentials to the log.

Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
---
M functions.sh
1 file changed, 11 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/91/7391/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-07-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 23:

(27 comments)

First batch of comments. Will continue but figure we can pipeline somewhat.

http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

PS23, Line 194: !buffer_pool_client_.is_registered()
I guess that's needed for subplans?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.h
File be/src/exec/analytic-eval-node.h:

PS23, Line 336: reservation
the stream will try to grow the reservation first, right?
Maybe reword to be more accurate:
When the tuple stream cannot acquire additional reservation, input_stream_ is 
unpinned... 
or something.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 209: _pool_client_, resource_profile_.min_reservation);
do we somewhere inside there dcheck that the exec-node has at least 
min_reservation reservation still?


Line 226:   resource_profile_.spillable_buffer_size < 
buffer_pool->min_buffer_len()) {
when can that happen?


Line 233:   // to get some reservation.
I'm not sure what that comment is saying.


PS23, Line 240: if (resource_profile_.min_reservation > 0) {
maybe delete that to exercise this path unconditionally


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

PS23, Line 948: 
is that solved or was it not a problem?


PS23, Line 532: Either a pointer to the single tuple of this row 
where does the tuple live in that case?


PS23, Line 605: success
that's kind of confusing given the OK on status means success also. Maybe make 
that 'got_memory' or 'got_reservation' (whichever the case is)?


PS23, Line 617:   
blank


PS23, Line 625: Status* status
do we need to pass that indirectly for codegen/perf still? i thought we made 
status faster on the success path.


PS23, Line 688: success
same comment


PS23, Line 854: Status* status
same question about status perf. otherwise, seems better to pass HtData**


PS23, Line 862: success
same comment


PS23, Line 879: status
same comment vs DuplicateNode**


PS23, Line 895: tatus* status)
this is opposite all the "bool* success" interfaces. reason for that?


PS23, Line 923: stores_tuples_
this could probably use a comment (the comment above seems to apply to the set 
of variables since this one is not dependent on join vs agg, but instead the 
number of tuples, right?). Also a short motivation for this optimization would 
be helpful.  I suppose it's still worthwhile even with the new BTS format.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/partitioned-aggregation-node-ir.cc
File be/src/exec/partitioned-aggregation-node-ir.cc:

Line 224:   if (hash_tbl == nullptr) return false; // Hash table was not 
created - pass through.
why does that happen now? or is it a bug fix?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS23, Line 88: page
should we express that as min_buffer_size instead?  maybe easy to understand by 
users that way?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.cc
File be/src/runtime/initial-reservations.cc:

Line 69:   DCHECK(success);
Explain why success is expected (planner computation).


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.h
File be/src/runtime/initial-reservations.h:

PS23, Line 40:  This creates trackers for initial reservations under those.
This seems similar to a SubReservation, no?


PS23, Line 47: Claim
claim it from where? How is this different than Claim() claiming? also what do 
we mean by "peak" here?


PS23, Line 48: that be claimed
garbled


PS23, Line 48: but not
 :   /// returned
why not returned?


PS23, Line 50: Status
explain failure modes


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS23, Line 403: 
in order to not regress w.r.t. IMPALA-1575, don't we need to release the 
initial reservations when the last fragment instance completes?  i think this 
shared_ptr was effectively doing that in the BufferedBlockMgr world, right?


http://gerrit.cloudera.org:8080/#/c/5801/23/fe/src/main/java/org/apache/impala/util/MathUtil.java
File fe/src/main/java/org/apache/impala/util/MathUtil.java:

Line 45: }
look familar


-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 23
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 

[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

2017-07-10 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per 
disk type
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS9, Line 53: string num_io_threads_per_rotational_disk_help_msg = 
Substitute("Number of I/O threads"
: " per rotational disk. Has priority over 
num_threads_per_disk. If neither is set, "
: "defaults to $0 thread(s) per rotational disk", 
THREADS_PER_ROTATIONAL_DISK);
: DEFINE_int32(num_io_threads_per_rotational_disk, 0,
: num_io_threads_per_rotational_disk_help_msg.c_str());
@Matthew: as discussed, switched to a static const string instead which works 
as intended.
@Tim: I wanted to avoid using macro constants as they are also being used 
during member initialization in the constructor


PS9, Line 271: deafult_val
> typo: default
Done


http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61: if (disk_id >= disks_.size()) return false;
> It shouldn't be in practice, but this is necessary for the tests which test
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#24).

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..

IMPALA-4674: Part 2: port backend exec to BufferPool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.

min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.

Port ExecNodes to use BufferPool:
  * Each ExecNode has to claim its reservation during Open()
  * Port Sorter to use BufferPool.
  * Switch from BufferedTupleStream to BufferedTupleStreamV2
  * Port HashTable to use BufferPool via a Suballocator.

This also makes PAGG memory consumption more efficient (avoid wasting buffers)
and improve the spilling algorithm:
* Allow preaggs to execute with 0 reservation - if streams and hash tables
  cannot be allocated, it will pass through rows.
* Halve the buffer requirement for spilling aggs - avoid allocating
  buffers for aggregated and unaggregated streams simultaneously.
* Rebuild spilled partitions instead of repartitioning (IMPALA-2708)

TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Implement max_row_size query option.

Testing:
* Updated tests to reflect new memory requirements

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/partitioned-hash-join-node.inline.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/buffered-block-mgr-test.cc
D be/src/runtime/buffered-block-mgr.cc
D be/src/runtime/buffered-block-mgr.h
D be/src/runtime/buffered-tuple-stream-test.cc
D be/src/runtime/buffered-tuple-stream.cc
D be/src/runtime/buffered-tuple-stream.h
D be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/initial-reservations.cc
A be/src/runtime/initial-reservations.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/client-request-state.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.h
M be/src/util/static-asserts.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M 

[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

2017-07-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
..


Patch Set 2:

(1 comment)

After some discussion, I think two issues are being conflated here.

1. If a complete KRB principal is not supplied via --principal, Impala tries to 
guess at the hostname part using gethostname(). This might not return the FQDN 
which is needed for Kerberos. The linked bug (IMPALA-4978) is really talking 
about this issue.

2. However, the more important issue is that IMPALA-5253 made it critical that 
for TLS-enabled deployments, every service thinks its hostname is the FQDN. 
That hostname is controlled by --hostname. If that is not set, Impala falls 
back to gethostname() - and this is where IMPALA-4978 comes in, because it 
should also use the getaddrinfo() call first. The workaround for this issue is 
to set --hostname correctly to the FQDN (or ensure that 'hostname' returns the 
FQDN by configuring name resolution correctly). We found this at Cloudera 
because in some situations our cluster management software would set --hostname 
to the shortname.

http://gerrit.cloudera.org:8080/#/c/7388/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

PS2, Line 156: This problem might occur immediately after an upgrade, due to 
changes in cluster
 :   management software that supplies the 
--hostname flag
 :   automatically to the Impala-related daemons.
It's a bit hard to understand *what* kind of changes would cause this, and 
without that knowledge this is a bit vague. In my opinion it makes sense to 
either link directly to the CM known issue (I think it's perfectly ok to point 
to third-party software that breaks Impala during upgrades), or to remove this 
entirely.


-- 
To view, visit http://gerrit.cloudera.org:8080/7388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


[DOCS] Advise setting vm.overcommit_memory=1 in various places

Reusing the same advice under "Known Issues", scalability
considerations, and in the Impala + Kerberos section.

Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Reviewed-on: http://gerrit.cloudera.org:8080/7349
Reviewed-by: Mostafa Mokhtar 
Tested-by: Impala Public Jenkins
---
M docs/shared/impala_common.xml
M docs/topics/impala_kerberos.xml
M docs/topics/impala_known_issues.xml
M docs/topics/impala_scalability.xml
4 files changed, 59 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Mostafa Mokhtar: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7267/4/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

Line 111: *eos = true;
Do we need to update num_rows_returned_ here?


Line 148: 
We can't have a partial sort in a subplan, right? It's probably best to just 
put a DCHECK(false) here since there's no way to test the code.


PS4, Line 149: :Res
Let's use nullptr consistently in the new code.


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

Line 96:   Sorter(const TupleRowComparator& compare_less_than,
> This will be included in a follow up patch.
WFM. If you want to wait for https://gerrit.cloudera.org/#/c/5801/ to land 
there will be a way to directly set a maximum on bytes of buffers in the 
planner, which would accomplish the same thing.


http://gerrit.cloudera.org:8080/#/c/7267/4/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

PS4, Line 99: RuntimeProfile* profile, Run
I feel like this is breaking some of the encapsulation of the sort 
implementation. This detail leaks out into the planner anyway so I guess it's 
not really encapsulated.

How about we just document more clearly what value is expected. The class 
comment talks about the number of blocks per run, so it could be extended to 
talk about the minimum requirement for spilling and non-spilling sorts.


http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

PS3, Line 349:  highest/lowest
> My problem with that wording is that it could be interpreted as "take the f
WFM


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


IMPALA-5583: [DOCS] Document default_join_distribution_mode query option

New page for the query option.

Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Reviewed-on: http://gerrit.cloudera.org:8080/7300
Reviewed-by: Mostafa Mokhtar 
Tested-by: Impala Public Jenkins
---
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
A docs/topics/impala_default_join_distribution_mode.xml
3 files changed, 136 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Mostafa Mokhtar: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


IMPALA-5605: [DOCS] New known issue for upping thread resource limits

Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Reviewed-on: http://gerrit.cloudera.org:8080/7348
Reviewed-by: Mostafa Mokhtar 
Tested-by: Impala Public Jenkins
---
M docs/topics/impala_known_issues.xml
1 file changed, 33 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Mostafa Mokhtar: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/140/

-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/139/

-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: remove old aggs and joins

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: remove old aggs and joins
..


Patch Set 8: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5ce2236d37c0ced188a4a81f7e00d4b8ac98e7e9
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5629: avoid expensive list::size() call
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/840/

-- 
To view, visit http://gerrit.cloudera.org:8080/7382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 18:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/839/

-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#4).

Change subject: IMPALA-5498: Support for partial sorts
..

IMPALA-5498: Support for partial sorts

Impala currently supports total sorts (the entire set of data
is sorted) and top-n sorts (only the highest/lowest n elements
are sorted). This patch adds the ability to do partial sorts,
where the data is divided up into some number of subsets, each
of which is sorted individually.

It accomplishes this by adding a new exec node, PartialSortNode.
When PartialSortNode::GetNext() is called, it retrieves input
up to its memory limit, uses the existing Sorter class to sort
it, and outputs it. This is faster than a total sort with SortNode
as it avoids the need to spill if the input is larger than the
memory limit.

Future work will look into setting a more restrictive memory limit
on the PartialSortNode.

In the planner, the SortNode plan node is used, with an enum value
indicating if it is a total or partial sort.

As a first use case, partial sort is used where a total sort was
used previously for inserts into Kudu.

Testing:
- E2E test with a large INSERT into a Kudu table with a mem limit.
  Checks that no spills occurred.
- Updated planner tests.
- Existing E2E tests and stress test verify correctness of INSERT.
- Perf tests on the 10 node cluster: inserting tpch_100.lineitem
  into a Kudu table with mem_limit=3gb:
  Previously: 5 runs are spilled, sort took 7m33s
  Now: no spills, sort takes 6m19s, for ~18% speedup

Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/partial-sort-node.cc
A be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
16 files changed, 458 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7267/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

Line 32: sorter_(NULL),
> This doesn't need to handle offset?
Done


Line 127: 
> Since this only requires a partial sort, you can exit early when the cumula
As noted elsewhere, removing limit-related code.


Line 140: *eos = input_eos_;
> I guess it is not clear to me why allowing a limit without an offset is use
Right, so 'offset' never really makes sense with partial-sort-node, since as 
you say there's not a total ordering and you're not guaranteed that one 
instance of partial-sort-node with an offset can pick up where another instance 
left off with a limit.

A limit is potentially useful just for cutting off the number of rows returned 
in a single partial-sort-node, but with the initial use case for this there's 
no way to end up with a partial-sort-node with a limit, so in the interest of 
keeping things simple and not having untested code paths, I think I'll just 
remove the limit code.


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.h
File be/src/exec/partial-sort-node.h:

Line 58:   virtual Status QueryMaintenance(RuntimeState* state);
> Just for general niceness, can you also update be/src/exec/sort-node.h's pr
Done


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

Line 1384:   expr_mem_pool, _tuple_expr_evals_));
> Need to update this calculation for partial sorts.
Done


Line 1434:   if (sorted_runs_.size() == 1) {
> Unnecessary - this is unconditionally dereferenced
Done


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

Line 96:   Sorter(const TupleRowComparator& compare_less_than,
> Wasn't the plan to apply the partition sort memory limit to the sorter itse
This will be included in a follow up patch.


http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

PS3, Line 349: e {
> Should this be "first" since the order is specified elsewhere.
My problem with that wording is that it could be interpreted as "take the first 
N elements of input and then sort them"

Maybe: "Return the first N sorted elements"?


Line 360: struct TSortNode {
> Please ignore my earlier comments about offset handling (although a comment
Added to partial-sort-node.h


http://gerrit.cloudera.org:8080/#/c/7267/3/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

Line 58:   private final long PARTIAL_SORT_MEM_LIMIT = 128 * 1024 * 1024;
> It might be clearer to specify this in bytes, then just make sure that its 
Done


Line 286:   // The memory limit cannot be less than the size of the 
required blocks.
> This looks good to me.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 17: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-10 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7300/3/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 52:   on the right-hand side of the join is broadcast. This behavior
> I believe the word right-hand side is reserved to the position of the table
I knew 'RHS' was going to get me in trouble. :-)


-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz

2017-07-10 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7390

Change subject: IMPALA-5643: Add total number of threads created per group to 
/threadz
..

IMPALA-5643: Add total number of threads created per group to /threadz

Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
---
M be/src/util/thread.cc
M www/threadz.tmpl
2 files changed, 15 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/7390/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read
..


Patch Set 4:

FWIW in practice it's not possible to re-use a ScanRange object because the 
lifecycle is a bit messed-up - IMPALA-4249

-- 
To view, visit http://gerrit.cloudera.org:8080/7294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 16:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7223/16/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

PS16, Line 394: total
> total claims
Done


http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java:

Line 89:   // We already recursed on the join build fragment in 
createBuildPlan().
> is this a bug fix independent of this patch? this statement isn't obviously
Yes, the problem was that createBuildPlan() was not correctly updating the 
fragment tree, and this function was implicitly depending on that bug.

Before these fixes we got obviously incorrect output in the planner tests - the 
resource consumption of a query was greater than the sum of all operators 
because operators were double-counted.


http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

PS16, Line 210: Compute the resource profile of the fragment
> I think this should say something about being host wide (i.e. for all insta
Done


Line 224:   // finish.
> where does that happen?
JoinNode.computeTreeResourceProfiles(). Added a reference in the comment.


Line 385: builder.append("Per-Host Resources: ");
> did you and Alex already decide that this should be output regardless of ex
Maybe I misunderstood the question, but we only output the fragment header at 
EXTENDED explain level and above anyway - this patch doesn't change that.

Yeah Per-Host was meant to differentiate that.


http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

PS16, Line 617: resources
> peak resources?
Done


http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS16, Line 342: the given plans
> obvious*
I augmented the comment to explicitly mention fragment instances.


PS16, Line 354: Sum of per-host minimum reservations over all plan nodes and 
sinks.
> that's still the ambiguous way of describing it. How about rewording that t
I thought it was useful to describe how it was computed. Reworded the comment a 
bit to be more concise and explain what it is before how it is computed.


PS16, Line 366: so we are
  :   // forced to
> forced to what?
Oops didn't finish writing this comment (or left it in an editor buffer).


http://gerrit.cloudera.org:8080/#/c/7223/16/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

PS16, Line 141: mem-estimate=15.68KB mem-reservation=136.00MB
> does it make sense for the estimate ever less than the reservation? i guess
In the current patch, no, since the reservations aren't really enforced.

With the switch-over to the buffer pool that would make sense. Filed 
IMPALA-5641 to keep track of it - it will require updating a bunch of planner 
tests so probably best to do it as a separate patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#17).

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..

IMPALA-4862: make resource profile consistent with backend behaviour

This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.

The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
  and released in Close().
* Blocking nodes call Open() on their child before acquiring
  their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
  from Open().
* The peak resource consumption of the query is the sum of the
  independent fragments (except for the parallel join build plans
  where we can assume there will be synchronisation). This is
  conservative but we don't synchronise fragment Open() and Close()
  across exchanges so can't make stronger assumptions in general.

Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.

Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
  pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
  more complex plan trees inside a single fragment, which is useful
  for testing the peak resource requirement logic.

Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
A fe/src/main/java/org/apache/impala/util/MathUtil.java
A fe/src/test/java/org/apache/impala/util/BitUtilTest.java
A fe/src/test/java/org/apache/impala/util/MathUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 

[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7389

Change subject: IMPALA-5164: Better benchmark heuristic
..

IMPALA-5164: Better benchmark heuristic

If any benchmark took longer than 50ms to run, the test suite would
bail.  This doesn't seem reasonable for tests which have large data
structures.  Just increase the timeout in that case while still
trying to measure context-switch free runs.

Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/util/benchmark.cc
2 files changed, 9 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/7389/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read

2017-07-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7294/2/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS2, Line 584:  is_cancelled_ = false;
> Nevermind about my comment of Reset() being used only in test code. It's us
Would you comment on IMPALA-5634 to that effect?


-- 
To view, visit http://gerrit.cloudera.org:8080/7294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS16, Line 342: the given plans
> I don't think it's opposite that this incorporates DOP.
obvious*


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read

2017-07-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7294/2/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS2, Line 584:  is_cancelled_ = false;
> Done:
Nevermind about my comment of Reset() being used only in test code. It's used 
here too:  
https://github.com/apache/incubator-impala/blob/master/be/src/exec/hdfs-scan-node-base.cc#L614

Those bool flags are currently being initialized in InitInternal(). We should 
consider moving their initialization to Reset() instead.


-- 
To view, visit http://gerrit.cloudera.org:8080/7294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5164: Better benchmark heuristic
..


Patch Set 1:

I won't have time to review this for a few days; you might find a quicker 
review from whomever reviewed your first patch that changed how benchmarking 
worked.

Also, it's pretty rare to +anything one's own patch except on rebase.

-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Better benchmark heuristic
..


Patch Set 1: Code-Review+1

Turns out this approach works even better and avoids any special changes to 
benchmarks.

-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec

2017-07-10 Thread anujphadke (Code Review)
anujphadke has uploaded a new patch set (#3).

Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and 
compression codec
..

IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec

If a scan range is skipped at runtime the scan node skips reading
the range and never figures out the underlying compression codec used
to compress the files. In such a scenario we default the compression
codec to NONE which can be misleading. This change marks these files
as filtered in the scan node profile

e.g. - File Formats: PARQUET/SNAPPY:364 PARQUET(Skipped):1460

Change-Id: I797916505f62e568f4159e07099481b8ff571da2
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M 
testdata/workloads/functional-query/queries/QueryTest/hdfs_scanner_profile.test
6 files changed, 42 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7245/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/837/

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 1:

@jbapple, let's go ahead with this diff and if there are more failures, I'll 
attach more fixes.

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Zach Amsden (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7381

to look at the new patch set (#2).

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..

IMPALA-5164: Bloom filter benchmark fails to complete

Due to large allocations during benchmark, we spend a lot of time
under memory pressure, which triggers the context switch detection.
Not using micro-benchmark heuristics for these benchmarks fixes the
problem.

Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
---
M be/src/benchmarks/bloom-filter-benchmark.cc
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7381/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: single_node_perf_run.py: clean up newly-added testdata
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/836/

-- 
To view, visit http://gerrit.cloudera.org:8080/7291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#16).

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..

IMPALA-4862: make resource profile consistent with backend behaviour

This moves away from the PipelinedPlanNodeSet approach of enumerating
sets of concurrently-executing nodes because unions would force
creating many overlapping sets of nodes. The new approach computes
the peak resources during Open() and the peak resources between Open()
and Close() (i.e. while calling GetNext()) bottom-up for each plan node
in a fragment. The fragment resources are then combined to produce the
query resources.

The basic assumptions for the new resource estimates are:
* resources are acquired during or after the first call to Open()
  and released in Close().
* Blocking nodes call Open() on their child before acquiring
  their own resources (this required some backend changes).
* Blocking nodes call Close() on their children before returning
  from Open().
* The peak resource consumption of the query is the sum of the
  independent fragments (except for the parallel join build plans
  where we can assume there will be synchronisation). This is
  conservative but we don't synchronise fragment Open() and Close()
  across exchanges so can't make stronger assumptions in general.

Also compute the sum of minimum reservations. This will be useful
in the backend to determine exactly when all of the initial
reservations have been claimed from a shared pool of initial reservations.

Testing:
* Updated planner tests to reflect behavioural changes.
* Added extra resource requirement planner tests for unions, subplans,
  pipelines of blocking operators, and bushy join plans.
* Added single-node plans to resource-requirements tests. These have
  more complex plan trees inside a single fragment, which is useful
  for testing the peak resource requirement logic.

Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
A fe/src/main/java/org/apache/impala/util/MathUtil.java
A fe/src/test/java/org/apache/impala/util/BitUtilTest.java
A fe/src/test/java/org/apache/impala/util/MathUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 

[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour

2017-07-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7223/14/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

PS14, Line 401: node
> Yes, that'd be more meaningful and make it clearer. Let's go with that.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

2017-07-10 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
..

IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

Because there was no obvious subcategory of known issues to
put this one in, I made a new subcategory 'startup issues'.

Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
---
M docs/topics/impala_known_issues.xml
1 file changed, 43 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7388/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7349/1/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

Line 1335: On a kerberized cluster with high memory utilization, 
kinit commands executed after
> On a kerberized cluster, kinit commands executed after every 'kerberos_rein
Done. I included the extra detail from Sailesh's comment too.


Line 1335: On a kerberized cluster with high memory utilization, 
kinit commands executed after
> This issue doesn't happen during startup. 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

2017-07-10 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7388/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

PS1, Line 146: from hostname command
> I think the precise issue is that Impala must have the --hostname flag set 
OK, trying an alternative wording to convey the symptom in the title in a way 
that somebody suffering from the problem will recognize. I think repeating the 
JIRA title wouldn't be clear in the context of "what issue did I encounter".


PS1, Line 155: 
 : 
Mention that the problem might occur after an upgrade on a system where it 
formerly worked. (Due AIUI to changes in how Cloudera Manager supplies 
--hostname. But find a way not to mention Cloudera Manager by name.)


-- 
To view, visit http://gerrit.cloudera.org:8080/7388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-10 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#4).

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..

IMPALA-5583: [DOCS] Document default_join_distribution_mode query option

New page for the query option.

Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
---
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
A docs/topics/impala_default_join_distribution_mode.xml
3 files changed, 136 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/7300/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5629: avoid expensive list::size() call
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/835/

-- 
To view, visit http://gerrit.cloudera.org:8080/7382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

2017-07-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per 
disk type
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61: if (disk_id >= disks_.size()) return false;
> why is this the right thing to assume? why should disk_id ever be more than
It shouldn't be in practice, but this is necessary for the tests which test 
having # disk queues greater than the number of physical disks. Bikram filed a 
separate JIRA to make the tests more realistic. We should put a TODO with that 
JIRA number here to switch this back to a DCHECK later.


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type

2017-07-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5240: Allow config of number of disk I/O threads per 
disk type
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS9, Line 631: This constructor is only used for testing.
it's unfortunate that we have this at all. you don't have to fix it in this 
change, but I wonder why we can't just have the test set the FLAG_ variables, 
like in the real impalad environment.


http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 61: if (disk_id >= disks_.size()) return false;
why is this the right thing to assume? why should disk_id ever be more than the 
number of disks?


-- 
To view, visit http://gerrit.cloudera.org:8080/7232
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..

[DOCS] Advise setting vm.overcommit_memory=1 in various places

Reusing the same advice under "Known Issues", scalability
considerations, and in the Impala + Kerberos section.

Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
---
M docs/shared/impala_common.xml
M docs/topics/impala_kerberos.xml
M docs/topics/impala_known_issues.xml
M docs/topics/impala_scalability.xml
4 files changed, 59 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/7349/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7179/10/shell/option_parser.py
File shell/option_parser.py:

PS10, Line 88: "-",
Quotes inside quotes must be escaped using \.

When you do so, don't forget to keep the lines under 90 characters.


-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7348/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

Line 571:   Impala could encounter a serious error due to resource 
usage from multithreading.
> This might mean that Impala supports multi threads. 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..

IMPALA-5605: [DOCS] New known issue for upping thread resource limits

Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
---
M docs/topics/impala_known_issues.xml
1 file changed, 33 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/7348/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

2017-07-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7388/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

PS1, Line 146: from hostname command
I think the precise issue is that Impala must have the --hostname flag set to 
the FQDN. If the user does not set --hostname explicitly to the FQDN, Impala 
will use gethostname() (not the command 'hostname') to determine the value of 
the flag. If that call returns a non-FQ DN, Kerberos will not work. 

If there's a problem with gethostname(), that can be detected by the user 
running 'hostname' which calls the gethostname() system call.


-- 
To view, visit http://gerrit.cloudera.org:8080/7388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option

2017-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5507: Add clear description to help information of 
KEYVAL option
..


Patch Set 10: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/832/

-- 
To view, visit http://gerrit.cloudera.org:8080/7179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

2017-07-10 Thread John Russell (Code Review)
John Russell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7388

Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
..

IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue

Because there was no obvious subcategory of known issues to
put this one in, I made a new subcategory 'startup issues'.

Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
---
M docs/topics/impala_known_issues.xml
1 file changed, 40 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7388/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call

2017-07-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5629: avoid expensive list::size() call
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7382/2//COMMIT_MSG
Commit Message:

Line 11: in 3 places.
or you could define a small list wrapper, of course. But I'm fine with this 
given the limited scope.


-- 
To view, visit http://gerrit.cloudera.org:8080/7382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7349/1/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

Line 1335: During Impala startup on a Kerberized cluster, high memory 
usage from
> This issue doesn't happen during startup. 
On a kerberized cluster, kinit commands executed after every 
'kerberos_reinit_interval' may cause out-of-memory errors since executing the 
command involves a fork of the Impala process.


-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7348/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

Line 571:   Impala could encounter a serious error due to resource 
usage from multithreading.
This might mean that Impala supports multi threads. 
I would use "under very high concurrency"


-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


  1   2   >