[Impala-ASF-CR] IMPALA-4000: Disable Sentry column-level privileges for Kudu

2016-11-10 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4000: Disable Sentry column-level privileges for Kudu
..


Patch Set 1:

I don't think disabling column-level privileges is the desired behavior. Can 
you check with Alex for the requirements for Kudu tables? If I recall 
correctly, we decided on enabling Sentry for Kudu tables and requiring ALL 
SERVER privileges for creating external tables.

-- 
To view, visit http://gerrit.cloudera.org:8080/5047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

2016-11-10 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged.

Change subject: IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. 
alignof
..


IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

This was testing all memory alignments up to and including
sizeof(max_align_t), but the standard says nothing about that. It does
say things about alignof(max_align_t), including that malloc() returns
memory at least that aligned.

In both gcc and clang on our currently supported platforms,
max_align_t has sizeof == 32 and alignof == 16, so this test expected
an alignment that malloc was not guaranteed to provide.

Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
Reviewed-on: http://gerrit.cloudera.org:8080/5022
Reviewed-by: Henry Robinson 
Reviewed-by: Dan Hecht 
Tested-by: Jim Apple 
---
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
5 files changed, 34 insertions(+), 13 deletions(-)

Approvals:
  Jim Apple: Verified
  Henry Robinson: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4452: Always call AggFnEvaluator::Open() before AggFnEvaluator::Init()

2016-11-10 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5049

Change subject: IMPALA-4452: Always call AggFnEvaluator::Open() before 
AggFnEvaluator::Init()
..

IMPALA-4452: Always call AggFnEvaluator::Open() before AggFnEvaluator::Init()

As part of the fix for IMPALA-2379, the expression contexts of
aggregation function evaluators are expected to be opened before
their initFn() are called so \ constant arguments can be accessed
in initFn(). However, the legacy aggregation node wasn't updated
to follow this order for singleton result tuple (i.e. no group-by).

This patch fixes the problem by deferring the creation of the
singleton tuple to a point in AggregationNode::Open() after the
expression contexts of all aggregate function evaluators have
been opened. PartitionedAggregationNode() was already updated
to follow this order.

This patch also fixes a minor bug in which uninitialized entries
of agg_fn_ctxs_[] may be accessed in AggregationNode::Close()
if AggregationNode::Prepare() fails.

Change-Id: I2f261dee47821c517d8dbe1babf4112462d85807
---
M be/src/exec/aggregation-node.cc
1 file changed, 19 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/5049/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2f261dee47821c517d8dbe1babf4112462d85807
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-4000: Disable Sentry column-level privileges for Kudu

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5047

Change subject: IMPALA-4000: Disable Sentry column-level privileges for Kudu
..

IMPALA-4000: Disable Sentry column-level privileges for Kudu

Since Kudu does not support security, we do not want to mislead anyone
into thinking this is not the case by being able to set up column level
privileges.

Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c
---
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java
2 files changed, 9 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3812: Fix error message for unsupported types

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-3812: Fix error message for unsupported types
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4859/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

Line 214:   if (!expr.getType().isSupported() || 
expr.getType().isInvalid()) {
> make isSupported() also cover invalid types
Done


http://gerrit.cloudera.org:8080/#/c/4859/1/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java:

Line 307: // Unsupported type binary in a star expansion.
> update comment
Done


http://gerrit.cloudera.org:8080/#/c/4859/1/testdata/workloads/functional-query/queries/QueryTest/misc.test
File testdata/workloads/functional-query/queries/QueryTest/misc.test:

Line 143:  TYPES
> can we remove these empty sections?
Done


Line 146: ImpalaBeeswaxException: INNER EXCEPTION:  MESSAGE: AnalysisException: Unsupported 
type 'DATE' in 'functional.unsupported_types.date_col'.
> can shrink this to just "Unsupported type ..." right?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9019b4bfd219f94e554c795befd3ff5e39706ea9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3812: Fix error message for unsupported types

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-3812: Fix error message for unsupported types
..

IMPALA-3812: Fix error message for unsupported types

Before this patch an unclear error message was returned if DATE or
DATETIME appeared in the select list after a star expansion. This was
because DATE and DATETIME PrimitiveType was serialized as INVALID_TYPE.
This is fixed by serializing correctly.

Change-Id: I9019b4bfd219f94e554c795befd3ff5e39706ea9
---
M fe/src/main/java/org/apache/impala/catalog/PrimitiveType.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/UnsupportedTypes/data.csv
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/misc.test
6 files changed, 34 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/4859/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9019b4bfd219f94e554c795befd3ff5e39706ea9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-3882: Simplify some query exec state locking

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3882: Simplify some query exec state locking
..


Patch Set 4:

(9 comments)

Here's how I think about the lock safety (sorry, long, but hopefully helps to 
convince):

1. For callers that passed 'false' to GetQES(), nothing changes. That code path 
stays the same. 

2. For callers that passed 'true' to GetQES(), there's a new set of possible 
executions. Previously no two 'true' callers could obtain a reference to the 
QES simultaneously. That has now changed. 

However, apart from registration or UnregisterQuery()*, there's no reason I can 
think of to care about whether there are other references to the QES; rather 
the concern is whether the QES is accessed safely under that concurrency. 
Registration and UnregisterQuery() are safe anyhow.

Safety under concurrent references to the same QES is managed by QES::lock(), 
so the important thing in this patch is making sure that all 'true' callers 
correctly obtain the lock when they need to. Conveniently, that's the same 
point at which they all used to *adopt* the same lock, so it's pretty easy to 
check that we've got that right.

I added some more defensive locking even to the 'false' callers when they read 
from the QES in the latest patch version. 

3. Previously, no 'true' caller could obtain a reference to the QES 
simultaneously with planning. That's changed (effectively by passing 'false' in 
all those GetQES() calls), and planning doesn't hold the lock now anyhow.

* One might think that UnregisterQuery() erased the QES from the map while 
holding both locks, so that 'true' callers couldn't get at the QES after it was 
removed - guaranteeing that all 'true' callers were finished with the QES - but 
that was not the case even before this patch.

I ran this patch for ~10k queries under the stress test. I'll leave it running.

http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

Line 279:   lock_guard l(*exec_state->lock());
> There still is a race between GetQES() and this lock(). What if it the quer
I think that race exists currently, and is benign. If you look at 
https://github.com/apache/incubator-impala/blob/master/be/src/service/impala-server.cc#L904,
 UnregisterQuery() doesn't do anything to make sure that other references to 
the exec state have gone away, so it's totally possible for some other caller 
to have an existing shared_ptr and to read from the QES while it calls Done().

So the question becomes: is it safe to do anything to a QES if Done() might be 
called concurrently? Depends on the definition of 'safe':

* safe == free from concurrent access bugs -> both caller and Done() need to 
take lock_, which they [should] do.
* safe == free from *logic* bugs, where Done() destroys state that caller wants 
to read -> that's not what Done() does. Mostly Done() fills in some query 
profile information, and tells the coordinator it can stop executing (and the 
coordinator doesn't get destroyed otherwise).


http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

PS4, Line 625: .get()
> I don't feel that strongly, but comparisons with NULL for scoped_ptr/unique
I don't feel that strongly either, but went with this because a) it's 
consistent with other usages in this file and b) it's explicit, rather than 
relying on operator overloading.


Line 641:   return;
> Unneeded return.
Done


PS4, Line 708: shared_ptr exec_state = 
GetQueryExecState(query_id);
> If this line is moved above L701, we wouldn't need to call GetQueryExecStat
Done


http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 789:   RETURN_IF_ERROR((*exec_state)->ExecQuery());
> What happens if Cancel() was called before we reach here?
Do you mean this comment to be on line 785? Cancel() could happen before this 
because the query is 'visible' after RegisterQuery() returns, and anyone that 
looks at e.g. the debug web page can find out the query id. 

But there is a new window for cancellation between RegisterQuery() and 
PlanQuery(). PlanQuery() has code to deal with that. Let me know if I'm not 
seeing the same race that you are.


http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-server.h
File be/src/service/impala-server.h:

Line 82: /// 2. session_state_map_lock_
> Curious - do we have a similar problem with the session map lock?
Similar-ish, but not as acute. We don't have the problem where we have this 
weird lock-inversion like we do when calling GetQES(..., true). 
GetSessionState(). But there is one place where we get s_s_m_l_ and then the 
s_s_l_, and other places where we don't always get the s_s_m_l_ first. 

This can lead to deadlock if the latter case then calls GetSessionState() with 
the session ID 

[Impala-ASF-CR] IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

2016-11-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. 
alignof
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4465: Don't hold process wide lock while serializing Runtime Profile in GetRuntimeProfileStr()

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4465: Don't hold process wide lock while serializing 
Runtime Profile in GetRuntimeProfileStr()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5035/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS1, Line 568: true
> There's a race there, if I take the lock after this call, the exec_state co
Yes, I don't think that's a problem. That's possible everywhere that 
GetQES(..., false) is called.


-- 
To view, visit http://gerrit.cloudera.org:8080/5035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad8a1d6644259f177dfb3b29b3ba1ad6a76210a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4465: Don't hold process wide lock while serializing Runtime Profile in GetRuntimeProfileStr()

2016-11-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#2).

Change subject: IMPALA-4465: Don't hold process wide lock while serializing 
Runtime Profile in GetRuntimeProfileStr()
..

IMPALA-4465: Don't hold process wide lock while serializing Runtime Profile in 
GetRuntimeProfileStr()

This patch changes the code so that the query_exec_state_map_lock_
is not held while serializing the RuntimeProfile, since that is a
pretty expensive operation and happens at least once per query. This
change makes a lot of client calls have less lock contention including
Webserver calls and query registration/unregistration.

Change-Id: I3ad8a1d6644259f177dfb3b29b3ba1ad6a76210a
---
M be/src/service/impala-server.cc
1 file changed, 5 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5035/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ad8a1d6644259f177dfb3b29b3ba1ad6a76210a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-4461: Make sure data gets loaded for wide hbase tables.

2016-11-10 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4461: Make sure data gets loaded for wide hbase tables.
..


IMPALA-4461: Make sure data gets loaded for wide hbase tables.

Ths patch reverts a change that broke the exhaustive suite of Impala
tests. The change was introduced here:

https://github.com/apache/incubator-impala/commit/ce4c5f67433ebe050261710920972621d625c81c

The orginal problem was that data load was failing when run against a
remote cluster, due to a 4000 byte max for SERDEPROPERTIES.PARAM_VALUE,
a limitation that is well described in HIVE-1364. Locally, when we load
data, we work around the issue here:

https://github.com/apache/incubator-impala/blob/master/bin/create-test-configuration.sh#L99

When testing on CDH remote cluster however, this "fix" never gets applied.
(It also assumes the database will always by postgres.)

I made this change without realizing its full effect, or appreciating
exactly how exhaustive our exhaustive test suite really is. Another
solution will need to be found for the case of remote cluster testing,
but this should unblock the local build for now.

As far as testing, I ran the full suite of tests in query_test/
test_scanners.py, and they all pass after removing these lines.

Change-Id: If2148d6546789c6c53c8e045717081b24ce76689
Reviewed-on: http://gerrit.cloudera.org:8080/5033
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M testdata/datasets/functional/schema_constraints.csv
1 file changed, 0 insertions(+), 5 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If2148d6546789c6c53c8e045717081b24ce76689
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-4454: test kudu.TestShowCreateTable flaky

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4454: test_kudu.TestShowCreateTable flaky
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5040/1/tests/conftest.py
File tests/conftest.py:

PS1, Line 331:   use_unique_conn = __call_cls_method_if_exists(request.cls, 
"auto_create_db")
> Instead of hard coding 45 in several places, can you create a constant for 
Added a constant, but it seems weird to mention the JIRA here since this is 
fairly general purpose and mentioning the JIRA in the actual test should be 
enough for anyone who's curious.


http://gerrit.cloudera.org:8080/#/c/5040/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

PS1, Line 205: # For IMPALA-4454
> Can you mention the Jira here, so that when someone else comes across this 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2727c27ff66140ac4043bcad332cd4e1d72b255f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4968/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

Line 44: #from tests.util.get_parquet_metadata import get_parquet_metadata
I'll remove this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4968/4/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 77: 
I'll remove the empty lines I accidentally added here and in other files.


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Preview: IMPALA-4363: Add timestamp validation

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: Preview: IMPALA-4363: Add timestamp validation
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4968/2/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 473: if (NeedsValidation() && !ValidateSlot(val_ptr).ok()) {
> I don't think this will behave quite right with abort_on_error=false. Retur
I rewrote this section, and added a comment.


Line 492:   /// Similar to NeedsCoversion, most column readers never require 
validation, so
> NeedsConversion()
Done


Line 507:   /// Ensures the data is valid. If it is discovered that data is not 
valid,
> Sets the slot to NULL if the slot value is invalid, e.g., due to being out 
Done


Line 510: DCHECK(false);
> If we ever hit this by accident in a release build we might not want to ret
I changed the interface, but False is returned instead of true now.


Line 594:   // Conversion should only happen when this flag is enabled.
> I'll delete this comment.
Done


Line 599:   return Status::OK();
> Setting a slot to NULL means flipping the null bit in the containing tuple.
Fixed. Null is being set properly now.


http://gerrit.cloudera.org:8080/#/c/4968/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 146:   bool Validate();
> Can you change this to something like:
Done


http://gerrit.cloudera.org:8080/#/c/4968/2/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 307:   ("PARQUET_TIMESTAMP_INVALID", 100,
> PARQUET_TIMESTAMP_OUT_OF_RANGE?
Done


Line 308:"File '$0' column '$1' contains invalid timestamp."),
> "Invalid" should be qualified a little more. The value is actually out of r
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4).

Change subject: IMPALA-4363: Add Parquet timestamp validation
..

IMPALA-4363: Add Parquet timestamp validation

Before this patch, we would simply read the INT96 Parquet timestamp
representation and assume that it's valid. However, not all bit
permutations represent a valid timestamp. One of the boost functions
raised an exception (that we did't catch) when passed an invalid
boost date object, which resulted in a crash. This patch fixes
problem by validating that the timestamp falls into 1400.. date
range as we are scanning Parquet.

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/bad_parquet_data/README
A testdata/bad_parquet_data/out-of-range-timestamp.parq
M testdata/bin/create-load-data.sh
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
10 files changed, 120 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4466: Improve Kudu UPSERT test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4953

to look at the new patch set (#8).

Change subject: IMPALA-4466: Improve Kudu UPSERT test coverage
..

IMPALA-4466: Improve Kudu UPSERT test coverage

Test results in kudu_upsert.test were verified by hand.

This patch also introduces a new test section 'DML_RESULTS', which
takes the name of a table as a comment and the contents of the
table as its body and then verifies that the body matches the
actual contents of the table. This makes it easy to check that a
DML operation has the desired effect on the contents of a table,
rather than always having to add another test case that runs a
select on the table. For now, this section cannot be used in a
test along with the RESULTS or ERRORS sections.

TODO: Refactor the DML test case handling (IMPALA-4471)

Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
---
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
A testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
M tests/util/test_file_parser.py
6 files changed, 532 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4953/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4466: Improve Kudu UPSERT test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4953

to look at the new patch set (#7).

Change subject: IMPALA-4466: Improve Kudu UPSERT test coverage
..

IMPALA-4466: Improve Kudu UPSERT test coverage

This patch also introduces a new test section 'DML_RESULTS', which
takes the name of a table as a comment and the contents of the
table as its body and then verifies that the body matches the
actual contents of the table. This makes it easy to check that a
DML operation has the desired effect on the contents of a table,
rather than always having to add another test case that runs a
select on the table. For now, this section cannot be used in a
test along with the RESULTS or ERRORS sections.

TODO: Refactor the DML test case handling (IMPALA-4471)

Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
---
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
A testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
M tests/util/test_file_parser.py
6 files changed, 532 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4953/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4466: Improve Kudu UPSERT test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4466: Improve Kudu UPSERT test coverage
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4953/6/testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test:

PS6, Line 216: error
> typo
Done


http://gerrit.cloudera.org:8080/#/c/4953/6/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

Line 344: # different result sets to consider (IMPALA-4471).
> to consider (IMPALA-4471).
Done


http://gerrit.cloudera.org:8080/#/c/4953/6/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

Line 303:   to allow regular RESULTS/ERRORS sections in tests with DML_RESULTS 
(IMPALA-4471).
> Reference IMPALA-4471
Done


http://gerrit.cloudera.org:8080/#/c/4953/6/tests/util/test_file_parser.py
File tests/util/test_file_parser.py:

PS6, Line 230: # The DML_RESULTS section is used to
> We need a comment explaining a summary of what this is. Can you add this:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3726: Add support for Kudu-specific column options

2016-11-10 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#4).

Change subject: IMPALA-3726: Add support for Kudu-specific column options
..

IMPALA-3726: Add support for Kudu-specific column options

This commit adds support for Kudu-specific column options in CREATE
TABLE statements. The syntax is:
CREATE TABLE tbl_name ([col_name type [PRIMARY KEY] [option [...]]] [, ])
where option is:
| NULL
| NOT NULL
| ENCODING encoding_val
| COMPRESSION compression_algorithm
| DEFAULT expr
| BLOCK_SIZE num

The output of the SHOW CREATE TABLE statement was altered to include all the 
specified
column options for Kudu tables.

Change-Id: I727b9ae1b7b2387db752b58081398dd3f3449c02
---
M be/src/exec/kudu-table-sink.cc
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java
M fe/src/main/java/org/apache/impala/util/AvroSchemaUtils.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M tests/query_test/test_kudu.py
24 files changed, 884 insertions(+), 167 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/5026/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I727b9ae1b7b2387db752b58081398dd3f3449c02
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-4454: test kudu.TestShowCreateTable flaky

2016-11-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4454: test_kudu.TestShowCreateTable flaky
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5040/1/tests/conftest.py
File tests/conftest.py:

PS1, Line 331:   timeout = __call_cls_method_if_exists(request.cls, 
"get_conn_timeout") or 45
Instead of hard coding 45 in several places, can you create a constant for use 
in all the places 45 is hardcoded? Also please have a comment pointing to the 
Jira so a user understands why it's here.


http://gerrit.cloudera.org:8080/#/c/5040/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

PS1, Line 205: return 60 * 5 # 5 minutes
Can you mention the Jira here, so that when someone else comes across this 
code, he has a reference for the intent?


-- 
To view, visit http://gerrit.cloudera.org:8080/5040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2727c27ff66140ac4043bcad332cd4e1d72b255f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4454: test kudu.TestShowCreateTable flaky

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5040

Change subject: IMPALA-4454: test_kudu.TestShowCreateTable flaky
..

IMPALA-4454: test_kudu.TestShowCreateTable flaky

The cause of the flakiness is Kudu CREATE TABLE operations
that are sometimes taking a long time, leading to timeouts
in the hiveserver2 connection. This patch adds the ability
for tests using the 'conn' pytest fixture to specify a
timeout to connect(), and sets a timeout of 5 minutes for
this test.

Change-Id: I2727c27ff66140ac4043bcad332cd4e1d72b255f
---
M tests/conftest.py
M tests/query_test/test_kudu.py
2 files changed, 11 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/5040/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5040
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2727c27ff66140ac4043bcad332cd4e1d72b255f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4456: Change query exec state lock to a reader-writer lock

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4456: Change query_exec_state_lock_ to a reader-writer 
lock
..


Patch Set 3:

Once concern I have is about fairness. It's pretty critical that writers don't 
get starved out of this lock because that prevents queries from starting or 
finishing. 

If there's enough demand for the reader-side of the lock such that there's 
really no period at which the lock is free for writers, I can see a starvation 
situation happening. 

This doesn't happen right now because all once the lock is released, there is 
always a period (however small) where the lock is available for taking. You 
might need to think about an external mechanism to prevent new readers from 
taking the lock when a writer is waiting. That would make things more 
complicated, so let me know if you have any better ideas.

-- 
To view, visit http://gerrit.cloudera.org:8080/5021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I790a95e7179f07aa7ba188d5422c5e054353ba0b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable

2016-11-10 Thread Bharath Vissapragada (Code Review)
Hello Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4867

to look at the new patch set (#11).

Change subject: IMPALA-3552: Make incremental stats max serialized size 
configurable
..

IMPALA-3552: Make incremental stats max serialized size configurable

The fix "IMPALA-2648/IMPALA-2664" introduced a conservative limitation
on the maximum serialized size of incremental stats. As a side effect,
some users with very large tables are experiencing regressions
especially when they upgrade impala and the serialized size goes
beyond 200MB.

To mitigate the issue, the change introduces a new gflag,
'inc_stats_size_limit_bytes' to make the max serialized size
configurable, which allows impala users to specify their own maximum
serialized size. Default value for inc_stats_size_limit_bytes is
200MB.

The change introduces a TBackendGflags class to pass the gflags from
backend to the Frontend and the Catalog via thrift.  This also revamps
existing query options to use the TBackendConfig.

Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/catalog/catalog.cc
M be/src/common/global-flags.cc
M be/src/service/fe-support.cc
M be/src/service/frontend.cc
M be/src/util/CMakeLists.txt
A be/src/util/backend-gflag-util.cc
A be/src/util/backend-gflag-util.h
A common/thrift/BackendGflags.thrift
M common/thrift/CMakeLists.txt
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/authorization/User.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
24 files changed, 280 insertions(+), 199 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4867/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 


[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable

2016-11-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3552: Make incremental stats max serialized size 
configurable
..


Patch Set 10:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4867/10//COMMIT_MSG
Commit Message:

PS10, Line 21: amends the TBackendConfig
> introduces a TBackendGflags
Done


http://gerrit.cloudera.org:8080/#/c/4867/10/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

Line 76: }
> it's too bad we can't autogenerate this file (and the thrift structure) so 
Yep, we should probably do some kind of scripting work for that. May we can do 
it as a follow up patch sometime.


http://gerrit.cloudera.org:8080/#/c/4867/10/common/thrift/BackendGflags.thrift
File common/thrift/BackendGflags.thrift:

PS10, Line 22: Attributes without comments correspond to gflags
> I'd prefer them to be 1:1 with gflags. Bharath, do you think that's a doabl
Made them 1:1.


PS10, Line 22: Attributes without comments correspond to gflags
> If these aren't 1:1 with gflags, let's call the struct TBackendFlags.
Made them 1:1 as per Alex's comment.


http://gerrit.cloudera.org:8080/#/c/4867/10/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

PS10, Line 42: Preconditions.checkState(INSTANCE == null);
@Alex, I removed this Preconditions check since the 'be' tests instantiate 
multiple Frontend classes and this check doesn't hold true after the first 
initializaion. We could ideally do something like 
Preconditions.checkState(isTestEnv() || INSTANCE == null) but I think its fine 
even if we overwrite it the second time.


-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Preview: IMPALA-4363: Add timestamp validation

2016-11-10 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3).

Change subject: Preview: IMPALA-4363: Add timestamp validation
..

Preview: IMPALA-4363: Add timestamp validation

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/bad_parquet_data/README
A testdata/bad_parquet_data/out-of-range-timestamp.parq
M testdata/bin/create-load-data.sh
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-abort-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M tests/query_test/test_scanners.py
11 files changed, 113 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable

2016-11-10 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3552: Make incremental stats max serialized size 
configurable
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4867/10/common/thrift/BackendGflags.thrift
File common/thrift/BackendGflags.thrift:

PS10, Line 22: Attributes without comments correspond to gflags
> If these aren't 1:1 with gflags, let's call the struct TBackendFlags.
I'd prefer them to be 1:1 with gflags. Bharath, do you think that's a doable 
change?


-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4456: Change query exec state lock to a reader-writer lock

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4456: Change query_exec_state_lock_ to a reader-writer 
lock
..


Patch Set 3:

(4 comments)

Did you consider boost::scoped_mutex?

http://gerrit.cloudera.org:8080/#/c/5021/2/be/src/gutil/read_write_lock.cc
File be/src/gutil/read_write_lock.cc:

PS2, Line 2: // Use of this source code is governed by a BSD-style license that 
can be
   : // found in the LICENSE file.
   : 
   : // This file has had some code modified and removed from the 
original version.
Same comments as header.


http://gerrit.cloudera.org:8080/#/c/5021/2/be/src/gutil/read_write_lock.h
File be/src/gutil/read_write_lock.h:

PS2, Line 3: LICENSE
Remove this sentence, I think, because it's covered by LICENSE.txt (but please 
check that the LICENSE referred to here is the same as that in LICENSE.txt for 
util/).


PS2, Line 5: This file has had some code modified and removed from the original 
version
By whom - by us? Or was this line present when you copied the file?


http://gerrit.cloudera.org:8080/#/c/5021/2/be/src/util/rw_lock.h
File be/src/util/rw_lock.h:

PS2, Line 25: class ReadWriteLock {
:  public:
:   ReadWriteLock() {}
: 
:   // Reader lock functions.
:   void ReadAcquire() { l.ReadAcquire(); }
:   void ReadRelease() { l.ReadRelease(); }
: 
:   // Writer lock functions.
:   void WriteAcquire() { l.WriteAcquire(); }
:   void WriteRelease() { l.WriteRelease(); }
: 
:  private:
:   base::subtle::ReadWriteLock l;
: 
:   DISALLOW_COPY_AND_ASSIGN(ReadWriteLock);
: };
> I noticed that we don't have the habit of using objects from other namespac
I think that's just because the majority of support classes we use come from 
boost or std. We use gutil's classes where appropriate - that's why we have it 
in the codebase! So this wrapper seems unnecessary to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/5021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I790a95e7179f07aa7ba188d5422c5e054353ba0b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

2016-11-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. 
alignof
..


Patch Set 3:

Data load failed with 

#7  0x01853932 in impala::MemPool::TryAllocateAligned 
(this=0x89c6640, size=32, alignment=32) at 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/runtime/mem-pool.h:117
#8  0x01853134 in impala::AllocateAnyVal (state=0x77dc3200, 
pool=0x89c6640, type=..., mem_limit_exceeded_msg=..., result=0x7fdc5cc1efa8) at 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/exprs/anyval-util.cc:38

This test should fix that, but I am doing a data load to check. It'll take a 
few hours.

-- 
To view, visit http://gerrit.cloudera.org:8080/5022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

2016-11-10 Thread Jim Apple (Code Review)
Hello Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5022

to look at the new patch set (#3).

Change subject: IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. 
alignof
..

IMPALA-4455: MemPoolTest.TryAllocateAligned failure: sizeof v. alignof

This was testing all memory alignments up to and including
sizeof(max_align_t), but the standard says nothing about that. It does
say things about alignof(max_align_t), including that malloc() returns
memory at least that aligned.

In both gcc and clang on our currently supported platforms,
max_align_t has sizeof == 32 and alignof == 16, so this test expected
an alignment that malloc was not guaranteed to provide.

Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
---
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
5 files changed, 34 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/5022/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic2dbabcb9af2874d8ed354738243dfca9c492b08
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4466: Improve Kudu UPDATE test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5036

Change subject: IMPALA-4466: Improve Kudu UPDATE test coverage
..

IMPALA-4466: Improve Kudu UPDATE test coverage

Change-Id: Ib697c64d37bba61f07fe7d7a30e5345909070b8c
---
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
A testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M tests/query_test/test_kudu.py
3 files changed, 312 insertions(+), 157 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5036/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib697c64d37bba61f07fe7d7a30e5345909070b8c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3552: Make incremental stats max serialized size 
configurable
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4867/10/common/thrift/BackendGflags.thrift
File common/thrift/BackendGflags.thrift:

PS10, Line 22: Attributes without comments correspond to gflags
If these aren't 1:1 with gflags, let's call the struct TBackendFlags.


-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4465: Don't hold process wide lock while serializing Runtime Profile in GetRuntimeProfileStr()

2016-11-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4465: Don't hold process wide lock while serializing 
Runtime Profile in GetRuntimeProfileStr()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5035/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS1, Line 568: true
why true? Prefer taking the lock when the method returns.


-- 
To view, visit http://gerrit.cloudera.org:8080/5035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad8a1d6644259f177dfb3b29b3ba1ad6a76210a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries

2016-11-10 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5034

Change subject: IMPALA-4450: qgen: use string concatenation operator for 
postgres queries
..

IMPALA-4450: qgen: use string concatenation operator for postgres queries

The random query generator writes a logical query Python object into
Impala or PostgreSQL dialects. When the CONCAT() function is chosen,
Impala and PostgreSQL's CONCAT() implementations behave differently.
PostgreSQL has a || operator that functions like Impala's CONCAT().

The method added here overrides the default behavior for the
PostgresqlSqlWriter. It prevents CONCAT(arg1, arg2, ..., argN) from
being written and instead causes the SQL to be written as
'arg1 || arg2 || ... || argN'.

Testing:

It's difficult to fully test the effects here. I made sure that we
generate syntactically valid queries still on the PostgreSQL side. This
includes queries that made use of string concatenation. I also re-run
some failed queries that previously produced different results. They now
produce the same results.

Change-Id: I149b695889addfd7df4ca5f40dc991456da51687
---
M tests/comparison/model_translator.py
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/5034/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-4461: Make sure data gets loaded for wide hbase tables.

2016-11-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4461: Make sure data gets loaded for wide hbase tables.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If2148d6546789c6c53c8e045717081b24ce76689
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4461: Make sure data gets loaded for wide hbase tables.

2016-11-10 Thread David Knupp (Code Review)
David Knupp has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5033

Change subject: IMPALA-4461: Make sure data gets loaded for wide hbase tables.
..

IMPALA-4461: Make sure data gets loaded for wide hbase tables.

Ths patch reverts a change that broke the exhaustive suite of Impala
tests. The change was introduced here:

https://github.com/apache/incubator-impala/commit/ce4c5f67433ebe050261710920972621d625c81c

The orginal problem was that data load was failing when run against a
remote cluster, due to a 4000 byte max for SERDEPROPERTIES.PARAM_VALUE,
a limitation that is well described in HIVE-1364. Locally, when we load
data, we work around the issue here:

https://github.com/apache/incubator-impala/blob/master/bin/create-test-configuration.sh#L99

When testing on CDH remote cluster however, this "fix" never gets applied.
(It also assumes the database will always by postgres.)

I made this change without realizing its full effect, or appreciating
exactly how exhaustive our exhaustive test suite really is. Another
solution will need to be found for the case of remote cluster testing,
but this should unblock the local build for now.

As far as testing, I ran the full suite of tests in query_test/
test_scanners.py, and they all pass after removing these lines.

Change-Id: If2148d6546789c6c53c8e045717081b24ce76689
---
M testdata/datasets/functional/schema_constraints.csv
1 file changed, 0 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/5033/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If2148d6546789c6c53c8e045717081b24ce76689
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 


[Impala-ASF-CR] Improve Kudu UPSERT test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4953

to look at the new patch set (#5).

Change subject: Improve Kudu UPSERT test coverage
..

Improve Kudu UPSERT test coverage

This patch also introduces a new test section 'DML_RESULTS', which
takes the name of a table as a comment and the contents of the
table as its body and then verifies that the body matches the
actual contents of the table. This makes it easy to check that a
DML operation has the desired effect on the contents of a table,
rather than always having to add another test case that runs a
select on the table. For now, this section cannot be used in a
test along with the RESULTS or ERRORS sections.

Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
---
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
A testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
M tests/util/test_file_parser.py
6 files changed, 526 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4953/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] Improve Kudu UPSERT test coverage

2016-11-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: Improve Kudu UPSERT test coverage
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4953/4//COMMIT_MSG
Commit Message:

PS4, Line 9: This patch also introduces a new test section 'DML_RESULTS', which
   : takes the name of a table as a comment and the contents of the
   : table as its body and then verifies that the body 
> let's remove this; sorry if my previous comments were confusing
Done


http://gerrit.cloudera.org:8080/#/c/4953/4/testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test:

PS4, Line 23: NumModifiedRows: 1
> I realized in my own change that all of these regex checks can be simplifie
Done


http://gerrit.cloudera.org:8080/#/c/4953/4/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

PS4, Line 341:  Combining 'RESULTS' with 'DML_RESULTS"
> add a comment that this isn't supported for now because __verify_results_an
Done


PS4, Line 357: UNTIME_PRO
> Please add a comment about the limit, e.g .to make sure the queries aren't 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9e7afbef60186edb00a9d11fbe5a8c64931add6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow.

2016-11-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4431: Add audit event log control mechanism to prevent 
disk  overflow.
..


Patch Set 6:

(4 comments)

There are still some minor formatting issues left. Have you actually run 
git-clang-format on your change? That should fix them automatically.

Also, have you thought about how we can test this new behaviour, both to make 
sure it works as intended and does not break other features?

http://gerrit.cloudera.org:8080/#/c/4971/3/be/src/common/init.cc
File be/src/common/init.cc:

Line 69:  "to retain. The most recent audit event log files are 
retained. If set to 0, "
> Please follow the indentation of the surrounding code
This still needs to be done.


http://gerrit.cloudera.org:8080/#/c/4971/6/be/src/common/logging.cc
File be/src/common/logging.cc:

Line 186:   ImpalaServer::GetAuditEventLogFilePrefix());
much better :)


http://gerrit.cloudera.org:8080/#/c/4971/6/be/src/service/impala-server.h
File be/src/service/impala-server.h:

Line 271:   /// Get the prefix of audit event log filename
nit: dot.


Line 272:   static std::string GetAuditEventLogFilePrefix();  
Please remove trailing whitespace. Also check other files.


-- 
To view, visit http://gerrit.cloudera.org:8080/4971
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes