[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState

2016-11-01 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4410: Safer tear-down of RuntimeState
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS1, Line 300: RuntimeState::Close()
> The ownership is actually really simple - TestEnv owns the RuntimeStates an
Oh, I see - no the problem is that TestEnv a) calls them query states and b) 
allocates and returns a naked pointer before finally wrapping it in a 
shared_ptr at the end of the method. I can make that change.


-- 
To view, visit http://gerrit.cloudera.org:8080/4893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-11-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Patch Set 1:

(11 comments)

Still working on a minor test fix, sending out to continue CR.

http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
File fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java:

Line 31:  * Extracts common conjuncts from disjunctive CompoundPredicates.
> Might be worth commenting on how this handles > 2 disjuncts. I.e. you need 
Expanded comments on ExprRewriter and modified comment here. 

(I don't think we should document how the framework works in every rule.)


Line 37: public class ExtractCommonConjunctRule implements ExprRewriteRule {
> where is this being applied?
At the end of analysis, made those changes (had forgotten this after 
rebasing/cleaning)


Line 42: if (!Expr.IS_OR_PREDICATE.apply(expr)) return expr;
> Isn't this check a little weak in the sense that it only checks the root's 
Having separate rules for normalization certainly makes sense, but I see them 
as being independent of this change. I don't think we'd change the 
code/behavior of the current rule even if we had normalization.

In your example, we would not return 'a' because the single conjunct of child0 
would be (a OR b) which is not common to any of the child1 conjuncts "a" and 
"c".

Maybe you can come up with another example that breaks?


Line 49:   if (child1Conjuncts.contains(conjunct)) {
> This is n^2 so will get very slow if we have long lists of disjuncts, e.g. 
Agree that this is problematic. Unfortunately, also non-trivial to fix and make 
sure the hashCode() and equals() functions are correct.

I added an arbitrary threshold on the maximum number of Expr.equals() 
comparisons to bail on these pathological cases.

It's also not clear whether this rule would be much worse than the codegen in 
the BE. Not a good argument to avoid the improvement, but the improvement here 
alone might not make a material impact for the overall query.


Line 72: if (!child0Conjuncts.isEmpty()) {
> At this point both child0Conjuncts and child1Conjuncts are non-empty becaus
Right, forgot to correct after some cleanup. Done.


PS1, Line 84: newDisjuncts.size() > 1
> See above - this should always be true.
You are right, thanks. Done.


http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 146: "(!(int_col=5 or tinyint_col > 9) and double_col = 7) or " +
> Slightly tangential, but should we apply De Morgan's first to normalise the
Completely agree about having additional rules for normalization. I view those 
as somewhat independent of the current rule. I don't think the code/approach of 
the current rule would change much even if we had normalization. Of course, 
normalizing would make the current rule useful in more cases.

Let's continue this train of thought and come up with useful normalization 
rules in a separate change set. Definitely needed.


Line 153: "(int_col < 10 and id between 5 and 6) or " +
> Does common conjunct extraction work if you have a BETWEEN expression and t
Good point, there was a bug. Fixed bug and added tests.


http://gerrit.cloudera.org:8080/#/c/4877/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test:

Line 3384:   p_brand = 'Brand#12'
> i agree, i think we need normalization as a separate transformation step. t
Agree.


Line 3384:   p_brand = 'Brand#12'
> I'm not really sure how this patch handles a bushy predicate. Do we flatten
Agree we need normalization as a separate set of rules.

I'm not sure normalization is required for your specific example though. The 
interesting conjuncts are those that bubble up all the way to the top so that, 
e.g., we can assign them as scan predicates. I think the current implementation 
does that, although some opportunities for simplifying disjunctive 
subexpressions are missed.


Line 3393:   p_brand = 'Brand#23'
> additional idea for transformation rule: derive additional, non-essential d
Agree. Sounds like IMPALA-2108.


-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-11-01 Thread Alex Behm (Code Review)
Alex Behm has restored this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Restored

-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: restore
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1

2016-11-01 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4415: Fix unassigned scan range of size 1
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster

2016-11-01 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
..


Patch Set 9:

Ignore remote_load_data.py in patch set 8. I accidentally pushed an interim 
in-progress version of the file in that patch set. Patch set 9 has the right 
code.

Sorry about the confusion.

-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-01 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4873/1/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

Line 407: col.is_primary_key = True
col.is_primary_key = col_name in primary_key_names


-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3710: Kudu DML should ignore conflicts by default

2016-11-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4911

Change subject: IMPALA-3710: Kudu DML should ignore conflicts by default
..

IMPALA-3710: Kudu DML should ignore conflicts by default

Removes the non-standard IGNORE syntax that was allowed for
DML into Kudu tables to indicate that certain errors should
be ignored, i.e. not fail the query and continue. However,
because there is no way to 'roll back' mutations that
occurred before an error occurs, tables are left in an
inconsistent state and it's difficult to know what rows were
successfully modified vs which rows were not. Instead, this
change makes it so that we always 'ignore' these conflicts,
i.e. a 'best effort'. In the future, when Kudu will provide
the mechanisms Impala needs to provide a notion of isolation
levels, then Impala will be able to provide options for more
traditional semantics.

After this change, the following errors are ignored:
* INSERT where the PK already exists
* UPDATE/DELETE where the PK doesn't exist

Another follow-up patch will change other violations to be
handled in this way as well, e.g. nulls inserted in
non-nullable cols.

Reporting:
The number of rows inserted is reported to the coordinator,
which makes the aggregate available to the shell and via the
profile.
TODO: Return rows modified for INSERT via HS2 (IMPALA-1789).
TODO: Return rows modified for other CRUD (beeswax+hs2) (IMPALA-3713).
TODO: Return error counts for specific warnings (IMPALA-4416).

Testing:
Updated tests. Ran all functional tests. More tests will be
needed when other conflicts are handled in the same way.

Change-Id: I83b5beaa982d006da4997a2af061ef7c22cad3f1
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M common/thrift/DataSinks.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
15 files changed, 70 insertions(+), 146 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4911/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I83b5beaa982d006da4997a2af061ef7c22cad3f1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4410: Safer tear-down of RuntimeState
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS1, Line 300: RuntimeState::Close()
> Fair point. I'm going to leave TestEnv for now - looking into that code it'
The ownership is actually really simple - TestEnv owns the RuntimeStates and 
destroys them in TestEnv::TearDownQueryStates(). So you could just close them 
in that function.

Maybe the red herring is that TestEnv::query_states_ uses shared_ptr. It could 
actually be switched to unique_ptr - the ownership is actually exclusive, the 
code was just written before C++11 support.


-- 
To view, visit http://gerrit.cloudera.org:8080/4893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1

2016-11-01 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4415: Fix unassigned scan range of size 1
..


Patch Set 1:

(2 comments)

Started a GVO, but won't commit until I know that one of the Local / S3 builds 
are passing.

http://gerrit.cloudera.org:8080/#/c/4907/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 518:   // reached, move onto the next instance.
> "move on to"
Done


Line 530: // If this assignment pushes this instance past the 
threshold, move onto the next
> on to
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1

2016-11-01 Thread Henry Robinson (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4907

to look at the new patch set (#2).

Change subject: IMPALA-4415: Fix unassigned scan range of size 1
..

IMPALA-4415: Fix unassigned scan range of size 1

ComputeScanRangeAssignment() computes the per-node scan-range load as
the total number of bytes to be scanned / number of nodes, and casts
that to a float.

However, that computation suffers from a precision issue where the
average * num_nodes may be 1 byte less than the total.

The scan range assignment loop continues until at least average *
num_nodes bytes have been assigned. If the last scan range has only 1
byte, it will not be assigned (if it has 2 or more bytes, it will be
considered for assignment before the loop exit condition is met).

The fix is to make sure that all instances are assigned in the last
iteration of the assignment loop, even if the per-node threshold is
already met.

Testing: No local repro was found - only S3 and LocalFS builds have
it. A unit test requires a lot of infrastructure from
simple-scheduler-util that doesn't exist
yet (e.g. ComputeScanRangeAssignment() is not called). S3 and LocalFS
full test builds are in progress.

Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
---
M be/src/scheduling/simple-scheduler.cc
1 file changed, 14 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/4907/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4415: Fix unassigned scan range of size 1
..


Patch Set 1: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4907/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 518:   // reached, move onto the next instance.
"move on to"


Line 530: // If this assignment pushes this instance past the 
threshold, move onto the next
on to


-- 
To view, visit http://gerrit.cloudera.org:8080/4907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState

2016-11-01 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4410: Safer tear-down of RuntimeState
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS1, Line 300: RuntimeState::Close()
> Agreed. TestEnv also creates RuntimeStates so we should call Close() there 
Fair point. I'm going to leave TestEnv for now - looking into that code it's a 
rat's nest, and we leak RuntimeStates everywhere as far as I can tell. Added it 
to fe-support - I added a utility to call Close() on scope exit, because there 
are so many paths out of that method...


Line 304:   
ExecEnv::GetInstance()->thread_mgr()->UnregisterPool(resource_pool_);
> Any reason for changing the order in which you are closing now? (Unregister
No reason, don't think they're dependent.


-- 
To view, visit http://gerrit.cloudera.org:8080/4893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState

2016-11-01 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4910

Change subject: IMPALA-4410: Safer tear-down of RuntimeState
..

IMPALA-4410: Safer tear-down of RuntimeState

* Add RuntimeState::Close() which is guaranteed to release resources
  safely, rather than having PFE::Close() do the same piecemeal.
* Fix for crash where PFE::Prepare() fails before descriptor table is
  created.

Change-Id: I5f8244f41b247a82464f1e102a3b82e2001ccc1e
Testing: Found by debug-actions, which has a reproducible test where
PFE::Prepare() fails. Manually tested on master.
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/fe-support.cc
A be/src/util/scope-exit-trigger.h
5 files changed, 61 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/4910/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5f8244f41b247a82464f1e102a3b82e2001ccc1e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

(1 comment)

Augmented the comment. Also explained the relationship to the Coordinator locks.

http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 215:   /// holding this lock.
> what about this?
Updated. The condition was kind-of trivially true since it's only acquired in a 
couple of short functions.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker, Matthew Jacobs, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4896

to look at the new patch set (#5).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 66 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1

2016-11-01 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4907

Change subject: IMPALA-4415: Fix unassigned scan range of size 1
..

IMPALA-4415: Fix unassigned scan range of size 1

ComputeScanRangeAssignment() computes the per-node scan-range load as
the total number of bytes to be scanned / number of nodes, and casts
that to a float.

However, that computation suffers from a precision issue where the
average * num_nodes may be 1 byte less than the total.

The scan range assignment loop continues until at least average *
num_nodes bytes have been assigned. If the last scan range has only 1
byte, it will not be assigned (if it has 2 or more bytes, it will be
considered for assignment before the loop exit condition is met).

The fix is to make sure that all instances are assigned in the last
iteration of the assignment loop, even if the per-node threshold is
already met.

Testing: No local repro was found - only S3 and LocalFS builds have
it. A unit test requires a lot of infrastructure from
simple-scheduler-util that doesn't exist
yet (e.g. ComputeScanRangeAssignment() is not called). S3 and LocalFS
full test builds are in progress.

Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
---
M be/src/scheduling/simple-scheduler.cc
1 file changed, 14 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/4907/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..


Patch Set 7:

Fixed the bug found with IMPALA-4414

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..


Patch Set 7: Code-Review+2

Carry

-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool

2016-11-01 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4631

to look at the new patch set (#6).

Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool
..

IMPALA-3202: DiskIoMgr improvements for new buffer pool

The main goal of this patch is to add support to the DiskIoMgr
to read into client-provided buffers, instead of the DiskIoMgr
allocating intermediate buffers. This is important for the new
buffer pool because we want it to entirely manage its own memory,
rather than delegating part of its memory management to the DiskIoMgr.

Also do some cleanup:
* Consolidate some correlated ScanRange parameters into a parameter
  struct.
* Remove the "untracked" buffers mem tracker, which is no longer
  necessary.
* Change the buffer type in DiskIoMgr to use uint8_t* to be consistent
  with the rest of Impala.

Testing:
Added a unit test. We also get coverage from the BufferedBlockMgr unit
tests, any spilling tests and the Parquet tests with large footers.

Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress-test.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
15 files changed, 404 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster

2016-11-01 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
..


Patch Set 7:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/4769/7/bin/remote_data_load.py
File bin/remote_data_load.py:

PS7, Line 17: This script is a setup script
> How about simply "this is a setup script"
Done


PS7, Line 21: The script expects a setup Impala development environment.
> How about:
Done


PS7, Line 22: appropriately
> It's not clear what this means. Maybe expand on this a little?
Done


PS7, Line 47: import logging
> import statements should be above from x import y
Done


PS7, Line 111: ,
> remove
Done


PS7, Line 116: that,
> remove
Done


PS7, Line 127: not
> having a not here makes it harder to reason about, how about:
Done


PS7, Line 130: password=options.cm_pass)
> Is this indent correct according to the style guide? I thought it was suppo
This passes PEP-8 linter inspection.


PS7, Line 204: Only pick the first cluster configured
> This comment does not really add anything, it's obvious what the code is do
Done


PS7, Line 209: for service in cluster.get_all_services():
> Just an observation: it looks like we don't check that all services we need
I think that confirming the readiness of the cluster is a good idea. I'm adding 
a method to do this as part of the init procedure. Right now, all it does is 
check that the required services are installed and running. It can be extended 
later to do other checks as well.


Line 281: # See https://issues.cloudera.org/browse/IMPALA-4365
> It looks like the issue is resolved, is this still necessary?
Removed.


PS7, Line 338: "database_name"
> Why would it ever be called "database_name"? Where is this constant coming 
I figured out it's the header from the beeline output. (Much of this code, I've 
inherited, and I'll confess I didn't catch every line.) I'll add a comment.


PS7, Line 432: wmay
> may
Done


PS7, Line 451: # "--skip_local_tests",
> why is this commented out?
Bascially, this is another remainder from Martin was last working on the 
script. In fact, this entire method may go away (it's not currently working to 
use runtest to run remote cluster tests, but you can do it calilng 
impala-py.test directly.)


http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

Line 492: run-step "Loading nested data" load-nested.log \
> The command is very similar in both cases. The only difference is --cm-host
Done


http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

Line 411: if os.getenv("REMOTE_LOAD"):
> base_load_dir = os.getenv("REMOTE_LOAD") or os.getenv("IMPALA_HOME")
Done


http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/load_nested.py
File testdata/bin/load_nested.py:

Line 301:   parser.add_argument("-s", "--source-db", default="tpch_parquet")
> don't we also need cm-host parameter here?
Yup, but we pick that up from L300:


  cli_options.add_cluster_options(parser)


Good eye though. I'll add a comment.


http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/setup-hdfs-env.sh
File testdata/bin/setup-hdfs-env.sh:

Line 20: set -e
> we are already setting -euo pipefail below
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster

2016-11-01 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#8).

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
..

IMPALA-4365: Enabling end-to-end tests on a remote cluster

This patch lays the groundwork for loading data and running end-to-end
tests on a remote CDH cluster. The requirements for the cluster to run
the tests are:

  - Managed by Cloudera Manager (CM)
  - GPL Extras need to be installed
  - KMS and KeyTrustee installed and available as a service
  - SERDEPROPERTIES in the Hive DB modified to accept wide tables
  - Hive warehouse dir points to /test-warehouse

The actual data loading is done via a new script, remote_data_load.py,
which takes the CM host as an argument. It can be run from a client
machine that is not a node of the cluster, but it needs to have the
Impala repo checked out and Impala built. This insures that all of the
necessary data load scripts are available, as well as setting up the
environment properly (client binaries like beeline and the hbase shell
are available, python libraries like cm_api are installed, necessary
environment variables are defined, etc.)

It should be noted that running remote_data_load.py will overwrite
any local XML config files with the configurations downloaded from
the remote cluster.

Usage: remote_data_load.py [options] 

Options:
  -h, --helpshow this help message and exit
  --snapshot-file=SNAPSHOT_FILE
Path to the test-warehouse archive
  --cm-user=CM_USER Cloudera Manager admin user
  --cm-pass=CM_PASS Cloudera Manager admin user password
  --gateway=GATEWAY Gateway host to upload the data from. If not
set, uses the CM host as gateway.
  --ssh-user=SSH_USER   System user on the remote machine with
passwordless SSH configured.
  --no-load Do not try to load the snapshot
  --exploration-strategy=EXPLORATION_STRATEGY
  --testRun end-to-end tests against cluster

Testing:

This patch is being submitted with the understanding that there are
still problems to work out with the remote data load script itself.

However, since many of the existing build scripts also had to be
modified, it is more important to make sure that no regressions were
inadvertently introduced into the existing data load process. Loading
data to a local mini-cluster was checked repeatedly while this patch
was being developed, as well as running it against the Jenkins job
that provides the test-warehouse snapshot used by the many other
Impala CI builds that run daily.

Remote data loading is working for the most part, although recent
Kudu-related changes have introduced unforeseen problems:

https://github.com/apache/incubator-impala/commit/041fa6d

In the meantime, setting KUDU_IS_SUPPORTED to false provides a
temporary workaround.

Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
---
M bin/load-data.py
A bin/remote_data_load.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/create-table-many-blocks.sh
M testdata/bin/generate-schema-statements.py
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/load_nested.py
M testdata/bin/run-step.sh
M testdata/bin/setup-hdfs-env.sh
10 files changed, 794 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/4769/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4229: Download aux test data if Impala-aux repo is available.

2016-11-01 Thread David Knupp (Code Review)
David Knupp has abandoned this change.

Change subject: IMPALA-4229: Download aux test data if Impala-aux repo is 
available.
..


Abandoned

After a further discussion with Jim, it was decided that we need fewer (read: 
zero) references to Impala-aux in our ASF repo, so the fix may be larger than 
just this one line change.

-- 
To view, visit http://gerrit.cloudera.org:8080/4901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I089e61477a71f7f248aec74d01db6655fd203801
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-3853: More RAT cleaning.

2016-11-01 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4904

Change subject: IMPALA-3853: More RAT cleaning.
..

IMPALA-3853: More RAT cleaning.

Apache RAT is a tool to audit code repositories for the ASF copyright
rules. Our wrapper script around it found a few more things; this
patch fixes those things.

Change-Id: I01367ea26feaf6a3e2cf4ac04f1c6a63f6e66195
---
M bin/check-rat-report.py
M bin/rat_exclude_files.txt
M testdata/datasets/tpcds/tpcds_kudu_template.sql
M testdata/datasets/tpch/tpch_kudu_template.sql
4 files changed, 41 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4904/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4904
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I01367ea26feaf6a3e2cf4ac04f1c6a63f6e66195
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

(1 comment)

if you could augment the comment to include the locks that aren't part of the 
hierarchy yet, that would be helpful.

http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 215:   /// holding this lock.
what about this?


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

this needs a +2 from an owner.

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review-1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4896

to look at the new patch set (#4).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 50 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Revert "IMPALA-4314: Standardize on MT-related data structures"

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: Revert "IMPALA-4314: Standardize on MT-related data structures"
..


Patch Set 1: Code-Review-1

Henry is working on a fix.

-- 
To view, visit http://gerrit.cloudera.org:8080/4903
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review+2

Thanks for adding the comment. Will be very helpful in the future.

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h
File be/src/service/impala-server.h:

PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or
 : /// QueryExecState::lock_ if they will be held 
simulataneously.
> Instead of just having this above this lock, it would be better to spell ou
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h
File be/src/service/impala-server.h:

PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or
 : /// QueryExecState::lock_ if they will be held 
simulataneously.
Instead of just having this above this lock, it would be better to spell out 
the lock ordering of the main locks in question at the top of this header file, 
based on the DAG you posted in the JIRA.

I have a feeling there might be other places where we mess up the lock ordering 
too that we don't know of yet.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Revert "IMPALA-4314: Standardize on MT-related data structures"

2016-11-01 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4903

Change subject: Revert "IMPALA-4314: Standardize on MT-related data structures"
..

Revert "IMPALA-4314: Standardize on MT-related data structures"

This reverts commit 0d857237a8d5231caf9af4c30d4134b151cd1e0d.

Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test-util.cc
M be/src/scheduling/simple-scheduler-test-util.h
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/Frontend.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
23 files changed, 1,174 insertions(+), 637 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/4903/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4903
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4411: Kudu inserts violate lock ordering and could deadlock

2016-11-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4411: Kudu inserts violate lock ordering and could 
deadlock
..


Patch Set 2: Code-Review+2

carrying +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4895
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2aa36fce78525a80a6b880e1b668105006bc1425
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4411: Kudu inserts violate lock ordering and could deadlock

2016-11-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4411: Kudu inserts violate lock ordering and could 
deadlock
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4895/1//COMMIT_MSG
Commit Message:

PS1, Line 17: Running an exhaustive test job, and stress tests
: manually.
> Did you reproduce the hang? It should be easy to do if you add a sleep call
I wasn't able to reproduce it easily. I tried inserting sleeps and running 
stress and cancellation tests, but no luck. I think it may have required a rare 
combination of events across a number of threads, e.g. like the one mentioned 
by Tim in his related patch for fixing another case of mis-ordering the same 
locks.


http://gerrit.cloudera.org:8080/#/c/4895/1/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

PS1, Line 532: // This should be safe to access on coord_ after Wait() has been 
called.
> This is a bit vague. Is it safe or not?
Done


PS1, Line 535: "Updating session latest observed Kudu timestamp: " << 
latest_kudu_ts
> Add the session ID here, otherwise this message is not going to be useful (
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4895
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2aa36fce78525a80a6b880e1b668105006bc1425
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-11-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 3: Code-Review-1

Hits DCHECK now, need to debug.

-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4229: Download auxiliary test data if Impala-auxiliary-tests is available.

2016-11-01 Thread David Knupp (Code Review)
David Knupp has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4900

Change subject: IMPALA-4229: Download auxiliary test data if 
Impala-auxiliary-tests is available.
..

IMPALA-4229: Download auxiliary test data if Impala-auxiliary-tests is 
available.

This was a slight annoyance for internal Cloudera devs setting up new
environments. If the Impala-auxiliary-tests repo is checked out, our
data load scripts assume that the auxiliary test data archives have
been downloaded and unpacked. But this requires devs to run another
script by hand.

This patch just inserts the download step into the data load process.
If the test data is in fact already present, nothing is downloaded.

Change-Id: Iebe605bfb610c2f0529100d79ceecb07cc7c6165
---
M testdata/bin/create-load-data.sh
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4900/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iebe605bfb610c2f0529100d79ceecb07cc7c6165
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3202: refactor scratch file management into TmpFileMgr

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4898

Change subject: IMPALA-3202: refactor scratch file management into TmpFileMgr
..

IMPALA-3202: refactor scratch file management into TmpFileMgr

This is a pure refactoring patch that moves all of the logic
for allocating scratch file ranges into TmpFileMgr in anticipation of
this logic being used by the new BufferPool.

There should be no behavioural changes.

Also remove a bunch of TODOs that we're not going to fix.

Testing:
Covered by TmpFileMgr and BufferedBlockMgr unit tests. Code
is exercised heavily by spilling system tests and the
stress test.

Change-Id: I0c56c195f3f28d520034f8c384494e566635fc62
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
6 files changed, 193 insertions(+), 203 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4898/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c56c195f3f28d520034f8c384494e566635fc62
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4410: Safer tear-down of RuntimeState
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS1, Line 300: RuntimeState::Close()
> Nothing very pressing but wanted to bring this up:
Agreed. TestEnv also creates RuntimeStates so we should call Close() there too.


-- 
To view, visit http://gerrit.cloudera.org:8080/4893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4381: Incorrect AVX version of BloomFilter::Or

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4381: Incorrect AVX version of BloomFilter::Or
..


Patch Set 4: Code-Review+2

Sorry about that - had this patch on a branch for testing and accidentally 
pushed it.

-- 
To view, visit http://gerrit.cloudera.org:8080/4866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I18480c598fbf6b842398581acde6fc719c40ce27
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

PS1, Line 833: if (check_inflight) {
 :   DCHECK(session_lock.owns_lock());
 :   if (session_->inflight_queries.find(query_id()) ==
 :   session_->inflight_queries.end()) {
 : return Status("Query not yet running");
 :   }
> Are you aware of any reason we can't just do this check before taking lock_
It doesn't look like it would protect against anything - I did a pass through 
the code.

Based on the history of the code,  'check_inflight' is meant to guard against 
the query being cancelled early when it is in a state with partially set-up 
data structures, so the race where it is removed from 'inflight_queries' during 
cancellation should be benign.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
2 files changed, 13 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs