[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


IMPALA-3771: Expose kudu client timeout and set default

The Kudu client timeout was too low for Impala usage. This
sets the default timeout to 3 minutes and exposes it as a
gflag.

New timeout tests were added.

Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Reviewed-on: http://gerrit.cloudera.org:8080/4849
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/catalog/catalog.cc
M be/src/common/global-flags.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/service/frontend.cc
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_kudu.py
21 files changed, 209 insertions(+), 101 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4435: Fix in-predicate-benchmark link error in release mode

2016-11-04 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4964

Change subject: IMPALA-4435: Fix in-predicate-benchmark link error in release 
mode
..

IMPALA-4435: Fix in-predicate-benchmark link error in release mode

in-predicate-benchmark would not link in release mode only after a
recent change fixed an include to target a .h file rather that an *ir.cc
file. Unfortunately, this also prevented the required symbols from being
visible to the linker.

There might be a more principled approach than reverting the change and
asking clang-tidy to ignore it, but since the benchmarks are not
production code and are rarely run, simple seems best.

Change-Id: I7e0091b25e4d10780f963b74734e0ee81b7f1c39
---
M be/src/benchmarks/in-predicate-benchmark.cc
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4964/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e0091b25e4d10780f963b74734e0ee81b7f1c39
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4391: fix dropped statuses in scanners
..


IMPALA-4391: fix dropped statuses in scanners

As far as I'm aware we haven't seen any failures related to these in
practice, but fixing them as a preventative measure.

Testing:
I was unable to reproduce this easily by adding a failpoint. I suspect
that stress testing of the affected file formats would have eventually
found this, because of the similarity to IMPALA-3962.

Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Reviewed-on: http://gerrit.cloudera.org:8080/4938
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
7 files changed, 19 insertions(+), 15 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4391: fix dropped statuses in scanners
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Patch Set 4:

Please take another quick look at expr-test.cc

I needed to do an interesting fix because some exprs get simplified and change 
their result type.

I'm thinking of adding a query option for disabling the FE expr rewrites, so 
that we can run the BE expr-test without the rewrites.
Might not be a bad idea to have such a switch for disabling the rewrites in 
case bugs sneak in as we add more complex rules. Thoughts?

-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-11-04 Thread Alex Behm (Code Review)
Hello Marcel Kornacker, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4877

to look at the new patch set (#4).

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..

IMPALA-1286: Extract common conjuncts from disjunctions.

Adds a new ExprRewriteRule to extract common conjuncts from
disjunctions.

Examples:
(a AND b AND c) OR (b AND d) ==> b AND ((a AND c) OR (d))
(a AND b) OR (a AND b) ==> a AND b
(a AND b AND c) OR (c) ==> c

Testing: Added a new unit test in ExprRewriteRulesTest.

Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
---
M be/src/exprs/expr-test.cc
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
13 files changed, 266 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/4877/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


IMPALA-3725 Support Kudu UPSERT in Impala

This patch introduces a new query statement, UPSERT, for Kudu
tables which operates like an INSERT and uses all of the analysis,
planning, and execution machinery as INSERT, except that if
there's a primary key collision instead of returning an error an
update is performed.

New syntax:
[with_clause] UPSERT INTO [TABLE] table_name [(column list)]
{
  query_stmt
 | VALUES (value [, value...]) [, (value [, (value...)]) ...]
}

where column list must contain all of the key columns in
table_name, if specified, and table_name must be a Kudu table.

This patch also improves the behavior of INSERTing into Kudu
tables without specifying all of the key columns - this now
results in an analysis exception, rather than attempting the
INSERT and receiving an error back from Kudu.

Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Reviewed-on: http://gerrit.cloudera.org:8080/4047
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-table-sink.cc
M common/thrift/DataSinks.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
17 files changed, 665 insertions(+), 84 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 20
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 19: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 437:   BlockLocation[] locations = fs.getFileBlockLocations(file, 0, 
file.getLen());
> still using the old API?
I don't think this change is complete.
This call stack need to be moved to the new API as well.

com.cloudera.impala.catalog.TableLoader.load(Db,String)
com.cloudera.impala.catalog.HdfsTable.load(boolean,IMetaStoreClient,Table)
com.cloudera.impala.catalog.HdfsTable.load(boolean,IMetaStoreClient,Table,boolean,boolean,Set)
com.cloudera.impala.catalog.HdfsTable.loadAllPartitions(List,Table)
com.cloudera.impala.catalog.HdfsTable.createPartition(StorageDescriptor,Partition,Map)
com.cloudera.impala.catalog.HdfsTable.updatePartitionFds(Path,boolean,HdfsFileFormat,Map)
com.cloudera.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem,FileStatus,HdfsPartition$FileDescriptor,HdfsFileFormat,Map)
org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(FileStatus,long,long)
org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(Path,long,long)
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystem,Path)
org.apache.hadoop.hdfs.DistributedFileSystem$1.doCall(Path)
org.apache.hadoop.hdfs.DistributedFileSystem$1.doCall(Path)
org.apache.hadoop.hdfs.DFSClient.getBlockLocations(String,long,long)
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(String,long,long)
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(ClientProtocol,String,long,long)


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4048: Misc. improvements to /sessions
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Matthew Jacobs (Code Review)
Hello Lars Volker, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4849

to look at the new patch set (#6).

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..

IMPALA-3771: Expose kudu client timeout and set default

The Kudu client timeout was too low for Impala usage. This
sets the default timeout to 3 minutes and exposes it as a
gflag.

New timeout tests were added.

Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
---
M be/src/catalog/catalog.cc
M be/src/common/global-flags.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/service/frontend.cc
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_kudu.py
21 files changed, 209 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4849/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4849/5/testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test:

Line 7: Unable to initialize the Kudu scan node
> #2 is fine with me, as long as there are comments.
Ok, I put a comment in the python test's class header since it is meant to 
apply to both the impalad and catalogd tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


IMPALA-3586: Clean up union-node.h/cc to enable improvements.

This patch does not address IMPALA-3586, but it cleans up the
code in union-node.h/cc to make it easier to implement those
perf improvements.

The major simplification is to remove conjunct evaluation since
the planner does not assigns conjuncts to a union-node anymore.
Conjuncts are always pushed to the union operands.

Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Reviewed-on: http://gerrit.cloudera.org:8080/4817
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
2 files changed, 95 insertions(+), 133 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-4303: Do not reset() qualifier of union operands.

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4963

Change subject: IMPALA-4303: Do not reset() qualifier of union operands.
..

IMPALA-4303: Do not reset() qualifier of union operands.

The bug: We used to reset() the qualifier of union operands
to their original value obtained during parsing. This leads to
problems when union operands are unnested and we need to rewrite
Subqueries. In particular, the first union operand of a nested union
was reset() to a null qualifier, but that operand could be somewhere
in the middle of the list of unnested operands in the parent. At that
point, we've lost information about the qualifier of the unnested
operand.

The fix: The simplest solution is to not reset() the qualifier.
The other alternative is be to reset() the qualifier, but also
undo any unnesting. That seems unnecessary and wasteful.

Change-Id: I157bb0f08c4a94fd779487d7c23edd64a537a1f6
---
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
2 files changed, 70 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/4963/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I157bb0f08c4a94fd779487d7c23edd64a537a1f6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3882: Simplify some query exec state locking

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3882: Simplify some query exec state locking
..


Patch Set 4:

(3 comments)

I did an initial pass over it. I'm still struggling with reasoning about 
whether this introduces new dangerous races - the get_metadata() one is 
obviously fixed, but it's difficult to anticipate what other races may be 
possible. If you have any suggestions about how to approach this it would be 
welcome.

Have you done any stress testing of this? It would be good to do some local 
stress testing against the minicluster. E.g.

  ./tests/stress/concurrent_select.py --cancel-probability=0.5 --max-queries 
1 --tpch-db=tpch_parquet --fail-upon-successive-errors=100

http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

PS4, Line 625: .get()
I don't feel that strongly, but comparisons with NULL for 
scoped_ptr/unique_ptr/shared_ptr should work without the .get().


Line 641:   return;
Unneeded return.


http://gerrit.cloudera.org:8080/#/c/4935/4/be/src/service/impala-server.h
File be/src/service/impala-server.h:

Line 82: /// 2. session_state_map_lock_
Curious - do we have a similar problem with the session map lock?

There's a somewhat similar pattern in GetSessionState() to the one this patch 
fixes.


-- 
To view, visit http://gerrit.cloudera.org:8080/4935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I516357d2b5e9eb83e8209872cbe4c078c778a629
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4849/5/testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test:

Line 7: Unable to initialize the Kudu scan node
> Hmm yes, I suppose this (and in the other file) could be flaky. I'm not sur
#2 is fine with me, as long as there are comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4391: fix dropped statuses in scanners
..


Patch Set 2: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4938

to look at the new patch set (#2).

Change subject: IMPALA-4391: fix dropped statuses in scanners
..

IMPALA-4391: fix dropped statuses in scanners

As far as I'm aware we haven't seen any failures related to these in
practice, but fixing them as a preventative measure.

Testing:
I was unable to reproduce this easily by adding a failpoint. I suspect
that stress testing of the affected file formats would have eventually
found this, because of the similarity to IMPALA-3962.

Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
7 files changed, 19 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/4938/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4391: fix dropped statuses in scanners
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> why would we ship the uuid around, or a 16-byte int for that matter? why no
Agree, that's even better. We need to group the UUIDs by host and then assign 
the disk ids.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4174: Refine PK/FK detection in JoinNode.getJoinCardinality().

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4174: Refine PK/FK detection in 
JoinNode.getJoinCardinality().
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4925
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe652366352964db96c2f772905ec53fdb9954dd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Hello Bharath Vissapragada, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4941

to look at the new patch set (#4).

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..

IMPALA-4266: Java udf returning string can give incorrect results

The memory management of string results was wrong: strings returned from
Exprs must live until the next time FreeLocalAllocations() is called.
Otherwise the buffer holding the string is freed or reused by the next
UDF call. The fix is to copy string values into a buffer with the
right lifetime.

Testing:
Added a regression test based on Bharath's example that reproduced the
bug reliably.

Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
---
M be/src/exprs/hive-udf-call.cc
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
M testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test
A tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
4 files changed, 70 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/4941/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4941/3/tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
File tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java:

Line 1: package org.apache.impala;
> copyright
Done


Line 8: {
> move to prev line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 4: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Add "Effective Coding Practices" doc to site

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add "Effective Coding Practices" doc to site
..


Patch Set 2:

I can see it being a blog post. Mainly I didn't think it fit on the wiki since 
it should be a static document. How would you feel about that, Henry?

-- 
To view, visit http://gerrit.cloudera.org:8080/4961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Add "Effective Coding Practices" doc to site

2016-11-04 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: Add "Effective Coding Practices" doc to site
..


Patch Set 2:

Would this make more sense as a blog post, rather than a top level document?

If it's a top level document, it's worth noting that when this patch ships...

https://gerrit.cloudera.org/#/c/4944/

...then the nav header bar on this page will need to be updated.

-- 
To view, visit http://gerrit.cloudera.org:8080/4961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> Agree and that's basically what I think I recommended in my comments here. 
why would we ship the uuid around, or a 16-byte int for that matter? why not 
stick to 1-byte ints right after we get the disk uuids and convert them?


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 3: Code-Review+1

(2 comments)

Thanks!

http://gerrit.cloudera.org:8080/#/c/4941/3/tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
File tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java:

Line 1: package org.apache.impala;
copyright


Line 8: {
move to prev line


-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> there is no need to do what you just described.
Agree and that's basically what I think I recommended in my comments here. We 
can shop the UUID around as a TUniqueId though (and not as a string).


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4941/1/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

Line 334:   FunctionContext* fn_ctx = ctx->fn_context(fn_context_index_);
> Sounds good. Might still be good to add in the comment here that the underl
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> the UUIDs are generated randomly and we hash those, so it's possible we get
there is no need to do what you just described.

what we should do:
- map uuids to per-node ordinals. in your example, that machine would see disk 
ordinals 0-7
- use the disk ordinal to directly address the corresponding disk queue

with this approach, only servers with >=127 disks would see degradation.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4941

to look at the new patch set (#3).

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..

IMPALA-4266: Java udf returning string can give incorrect results

The memory management of string results was wrong: strings returned from
Exprs must live until the next time FreeLocalAllocations() is called.
Otherwise the buffer holding the string is freed or reused by the next
UDF call. The fix is to copy string values into a buffer with the
right lifetime.

Testing:
Added a regression test based on Bharath's example that reproduced the
bug reliably.

Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
---
M be/src/exprs/hive-udf-call.cc
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
M testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test
A tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
4 files changed, 54 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/4941/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> it's very unusual for a server to have more than 128 disks (i don't think w
the UUIDs are generated randomly and we hash those, so it's possible we get a 
series of disk IDs which when modded all land on the same disk queue

for example, we could get hashes 8, 16, 24, 32, etc. on a machine with 8 disks

Of course, in aggregate these collisions are unlikely, but like you said a 
single machine has few disks, so I think this effect can easily happen on a 
single machine.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) Add "Effective Coding Practices" doc to site

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: Add "Effective Coding Practices" doc to site
..

Add "Effective Coding Practices" doc to site

This is a useful document that had floated around internally at
Cloudera. It would be useful to have as a reference doc on the website
as a guide for new and old contributors.

Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
---
A coding-practices.html
1 file changed, 339 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/4961/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 8:

i'll pick this up once it has an overall +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> In the BE we mod the disk id to determine which disk queue to put reads on,
it's very unusual for a server to have more than 128 disks (i don't think we've 
ever seen one), so from that perspective i'm not worried about using fewer bits 
to represent the disk "ordinal".

keep in mind that we really don't care about distinguishing disks globally. 
using a designated "out of range" value (such as 127) if a node should really 
exceed 127 disks, and then placing scan ranges with that value on random disk 
queues should degrade performance gracefully in cases where the number of disks 
is just a bit above that threshold.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4838/4//COMMIT_MSG
Commit Message:

Line 18: arguments for ScalarFnCall expressions on both the codegen'd and
> missing words
Done


PS4, Line 36: temporary
> automatic?  at least I would have understood this a little quicker with tha
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS4, Line 158: ",
> is it worth differentiating these messages to make debugging easier?
Can't hurt.


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

PS4, Line 297: The returned AnyVal is not initialized.
> this sentence is repeated
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/expr-context.h
File be/src/exprs/expr-context.h:

PS4, Line 66: fragment
> do you mean fragment instance?
It's kind-of unclear what the direction here is. There's a FRAGMENT_LOCAL 
concept in the udf interface that this corresponds to, so I think this is 
consistent at the moment (although maybe it should be FRAGMENT_INSTANCE_LOCAL 
everywhere).


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS4, Line 188: fragment
> fragment or fragment instance?
Reworded to avoid making the distinction and be clearer. It's not clear that 
the FRAGMENT_LOCAL terminology is accurate, but I think that's out of scope for 
this fix.


PS4, Line 189: cases
> not sure what this sentence is saying. what cases?
Reworded to be clearer. It's copied when the ExprContext is cloned.


Line 205: // children.
> do we document the calling convention somewhere? I think a brief explanatio
Updated the comment. The UDF calling convention is in udf/udf.h and the 
contents of the buffers are in udf/udf-internal.h


PS4, Line 340: optimise out the buffer
> is this also enabling constant propagation for constant args, or was that a
The constant was propagated but the side-effect of storing to the varargs 
buffer couldn't be optimised out.


PS4, Line 370: Either no varargs or arguments before varargs begin
> this comment seems unnecessary with the new conditional
Done


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/runtime/free-pool.h
File be/src/runtime/free-pool.h:

Line 73: size = std::max(8, size);
> seems fine, but why is it required?
Not required but it's pointless making smaller allocations, since it's aligned 
to 8 bytes by the MemPool anyway. Added a comment


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..

IMPALA-4302,IMPALA-2379: constant expr arg fixes

This patch fixes two issues around handling of constant expr args.
The patches are combined because they touch some of the same code
and depend on some of the same memory management cleanup.

First, it fixes IMPALA-2379, where constant expr args were not visible
to UDAFs. The issue is that the input exprs need to be opened before
calling the UDAF Init() function.

Second, it avoids overhead from repeated evaluation of constant
arguments for ScalarFnCall expressions on both the codegen'd and
interpreted paths. A common example is an IN predicate with a
long list of constant values.

The interpreted path was inefficient because it always evaluated all
children expressions. Instead in this patch constant args are
evaluated once and cached. The memory management of the AnyVal*
objects was somewhat nebulous - adjusted it so that they're allocated
from ExprContext::mem_pool_, which has the correct lifetime.

The codegen'd path was inefficient only with varargs - with fixed
arguments the LLVM optimiser is able to infer after inlining that
the expressions are constant and remove all evaluation. However,
for varargs it stores the vararg values into a heap-allocated buffer.
The LLVM optimiser is unable to remove these stores because they
have a side-effect that is visible to code outside the function.

The codegen'd path is improved by evaluating varargs into an automatic
buffer that can be optimised out. We also make a small related change
to bake the string constants into the codegen'd code.

Testing:
Ran exhaustive build.

Added regression test for IMPALA-2379 and MemPool test for aligned
allocation. Added a test for in predicates with constant strings.

Perf:
Added a targeted query that demonstrates the improvement. Also manually
validated the non-codegend perf. Also ran TPC-H and targeted perf
queries locally - didn't see any significant changes.

++---+---++-++---++-+---+
| Workload   | Query | File Format   | 
Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | 
Iters |
++---+---++-++---++-+---+
| TARGETED-PERF(_20) | primitive_filter_in_predicate | parquet / none / none | 
1.19   | 9.82| I -87.85%  |   3.82%   |   0.71%| 1   | 
10|
++---+---++-++---++-+---+

(I) Improvement: TARGETED-PERF(_20) primitive_filter_in_predicate [parquet / 
none / none] (9.82s -> 1.19s [-87.85%])
+--++--+--++---+--+--++++---+
| Operator | % of Query | Avg  | Base Avg | Delta(Avg) | StdDev(%) | 
Max  | Base Max | Delta(Max) | #Hosts | #Rows  | Est #Rows |
+--++--+--++---+--+--++++---+
| 01:AGGREGATE | 14.39% | 155.88ms | 214.61ms | -27.37%|   2.68%   | 
163.38ms | 227.53ms | -28.19%| 1  | 1  | 1 |
| 00:SCAN HDFS | 85.60% | 927.46ms | 9.43s| -90.16%|   4.49%   | 
1.01s| 9.50s| -89.42%| 1  | 13.77K | 14.05K|
+--++--+--++---+--+--++++---+

Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
---
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/case-expr.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/in-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/testutil/test-udas.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf-test-harness.cc
M be/src/udf/udf.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A testdata/workloads/targeted-perf/queries/primiti

[Impala-ASF-CR](asf-site) Add "Effective Coding Practices" doc to site

2016-11-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add "Effective Coding Practices" doc to site
..


Patch Set 1:

Can you word-wrap the file so it's possible to review? Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4941/1/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

Line 334:   FunctionContext* fn_ctx = ctx->fn_context(fn_context_index_);
> I'd rather avoid that because Java has to make those allocations via sun.mi
Sounds good. Might still be good to add in the comment here that the underlying 
buffer is allocated and owned by the Java UDfExecutor and that the buffer is 
reused in subsequent invocations. Just to motivate more clearly why the copy is 
needed here.


-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4849/5/be/src/exec/kudu-table-sink.h
File be/src/exec/kudu-table-sink.h:

Line 103:   /// This uses 'kudu::client::sp::shared_ptr' as that is the type 
expected by Kudu.
> this comment can probably be removed now that kudu has it's own namespace w
Done


http://gerrit.cloudera.org:8080/#/c/4849/5/testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test:

Line 7: Unable to initialize the Kudu scan node
> will these always give the error? couldn't it possibly work with 1ms timeou
Hmm yes, I suppose this (and in the other file) could be flaky. I'm not sure 
there'll be any way to _ensure_ that any of these operations take longer than 
1ms.

I can think of these options:
1) take these tests out. not the end of the world but nice to have this 
coverage.
2) leave them in (and add comments), maybe it'll never be an issue in practice. 
I can remove/xfail them later if necessary, or see if the Kudu folks have ideas 
about how to make it time out.

I guess i'd prefer to stick with #2 for now since it seems likely that all of 
these operations will actually timeout (all metadata ops involving the master, 
constructing a response with a bunch of strings to serialize, etc), and the 
coverage is nice. What do you think?


-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> could you go into detail about the performance problems?
In the BE we mod the disk id to determine which disk queue to put reads on, so 
I'm worried about pathological cases with "collisions" where many ranges are 
assigned to only one disk thread. That will be quite hard to debug.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 379:   private static int getDiskId(String storageId) {
> As much as I'd like to use this simpler alternative, I think it could lead 
could you go into detail about the performance problems?

the goal is to switch to an *8-bit* node-relative disk id in order to reduce 
the amount of storage space required for recording this info.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) Add "Effective Coding Practices" doc to site

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4961

Change subject: Add "Effective Coding Practices" doc to site
..

Add "Effective Coding Practices" doc to site

This is a useful document that had floated around internally at
Cloudera. It would be useful to have as a reference doc on the website
as a guide for new and old contributors.

Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
---
A coding-practices.html
1 file changed, 227 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/4961/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I21fdce898a71be836b658e0c914e05a6868d6263
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4960

Change subject: IMPALA-3167: Fix assignment of WHERE-clause predicate through 
grouping agg + outer join.
..

IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + 
outer join.

Background: We generally allow the assignment of predicates below the nullable 
side
of a left/right outer join, explained as follows using an example:

SELECT * FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id
WHERE t2.int_col < 10

The scan of 't2' will pick up 't2.int_col < 10' via 
Analyzer.getBoundPredicates()
and recognizes that the predicate must also be evaluated by a join later, so the
predicate is not marked as assigned. The join then picks up the unassigned
predicate via Analyzer.getUnassignedConjuncts().

The bug was that our logic for detecting whether a bound predicate must also be
evaluated at a join node was flawed because it only considered whether the 
tuples
of the source or destination predicate were outer joined (plus other 
conditions).
The underlying assumption is that either the source or destination tuple are 
bound
by a tuple produced by a TableRef, but in the buggy query the source predicate
is bound by an aggregation tuple, so we incorrectly marked the bound predicate
as assigned in Analyzer.getBoundPredicates().

The fix is to conservatively not mark bound predicates as assigned if there 
exists
equivalent tuples that are outer joined. This conservative fix leads to some
duplicate assignments of predicates. Those are simply deduped now.

Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
4 files changed, 52 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/4960/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..


Patch Set 4: Code-Review+2

Looks good to me.

-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-11-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 19: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-11-04 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#18).

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..

IMPALA-3725 Support Kudu UPSERT in Impala

This patch introduces a new query statement, UPSERT, for Kudu
tables which operates like an INSERT and uses all of the analysis,
planning, and execution machinery as INSERT, except that if
there's a primary key collision instead of returning an error an
update is performed.

New syntax:
[with_clause] UPSERT INTO [TABLE] table_name [(column list)]
{
  query_stmt
 | VALUES (value [, value...]) [, (value [, (value...)]) ...]
}

where column list must contain all of the key columns in
table_name, if specified, and table_name must be a Kudu table.

This patch also improves the behavior of INSERTing into Kudu
tables without specifying all of the key columns - this now
results in an analysis exception, rather than attempting the
INSERT and receiving an error back from Kudu.

Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
---
M be/src/exec/kudu-table-sink.cc
M common/thrift/DataSinks.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
17 files changed, 665 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/4047/18
-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4941

to look at the new patch set (#2).

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..

IMPALA-4266: Java udf returning string can give incorrect results

The memory management of string results was wrong: strings returned from
Exprs must live until the next time FreeLocalAllocations() is called.
Otherwise the buffer holding the string is freed or reused by the next
UDF call. The fix is to copy string values into a buffer with the
right lifetime.

Testing:
Added a regression test based on Bharath's example that reproduced the
bug reliably.

Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
---
M be/src/exprs/hive-udf-call.cc
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
M testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test
A tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
4 files changed, 54 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/4941/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-4266: Java udf returning string can give incorrect results

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4266: Java udf returning string can give incorrect 
results
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4941/1/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

Line 334:   FunctionContext* fn_ctx = ctx->fn_context(fn_context_index_);
> Just wondering whether you've considered the following alternative: Take th
I'd rather avoid that because Java has to make those allocations via 
sun.misc.Unsafe, which I think calls malloc() (although the docs don't say). 
Normally UDF allocations come from a FreePool.

We'd need a separate list of pointers that came from malloc() instead of the 
FreePool, which further complicates the local allocation mechanism, which is 
confusing anyway.

I suspect that the FreePool alloc + copy is probably comparable in cost to the 
malloc anyway.


http://gerrit.cloudera.org:8080/#/c/4941/1/testdata/workloads/functional-query/queries/QueryTest/java-udf.test
File testdata/workloads/functional-query/queries/QueryTest/java-udf.test:

Line 299: w
> extraneous w?
Done


http://gerrit.cloudera.org:8080/#/c/4941/1/tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java
File tests/test-hive-udfs/src/main/java/org/apache/impala/ReplaceStringUdf.java:

Line 3: import org.apache.hadoop.hive.ql.exec.UDF;
> nit: order of imports
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I705d271814cb1143f67d8a12f4fd87bab7a8e161
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 17: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(1 comment)

An additional high-level comment

http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 526:   if (SUPPORTS_STORAGE_ID) {
> This seems fine as a first cut, but we should consider taking this a step f
On Hadoop 3 the old API is not available at all, so the old API calls won't 
even compile.

Can we factor out the logic into two different implementation classes with a 
common interface for the two different APIs? That will be necessary anyway and 
I think will make the logic easier to understand.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3724: Support Kudu non-covering range partitions

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3724: Support Kudu non-covering range partitions
..


Patch Set 9: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3724: Support Kudu non-covering range partitions

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3724: Support Kudu non-covering range partitions
..


IMPALA-3724: Support Kudu non-covering range partitions

This commit adds support for non-covering range partitions in Kudu
tables. The SPLIT ROWS clause is now deprecated and no longer supported.
The following new syntax provides more flexibility in creating range
partitions and it supports bounded and unbounded ranges as well as single value
partitions; multi-column range partitions are supported as well.

The new syntax is:
DISTRIBUTE BY RANGE (col_list)
(
 PARTITION lower_1 <[=] VALUES <[=] upper_1,
 PARTITION lower_2 <[=] VALUES <[=] upper_2,
 
 PARTITION lower_n <[=] VALUES <[=] upper_n,
 PARTITION VALUE = val_1,
 
 PARTITION VALUE = val_n
)

Multi-column range partitions are specified as follows:
DISTRIBUTE BY RANGE (col1, col2,..., coln)
(
 PARTITION VALUE = (col1_val, col2_val, ..., coln_val),
 
 PARTITION VALUE = (col1_val, col2_val, ..., coln_val)
)

Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Reviewed-on: http://gerrit.cloudera.org:8080/4856
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Internal Jenkins
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
A fe/src/main/java/org/apache/impala/analysis/RangePartition.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
D fe/src/test/java/org/apache/impala/util/KuduUtilTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M tests/query_test/test_kudu.py
17 files changed, 858 insertions(+), 494 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

(11 comments)

Some high-level comments before digging in deeper.

http://gerrit.cloudera.org:8080/#/c/4914/1//COMMIT_MSG
Commit Message:

Line 9: This change enables Impala to use BlockLocation#getStorageIds,
nit: we usually use () after names to denote they are e function/method

i.e. getStorageIds -> getStorageIds()

getFileBlockLocations -> getFileBlockLocations()


Line 12: getFileBlockLocations call which will be  deprecated in Hadoop-3.
extra space after "be"


http://gerrit.cloudera.org:8080/#/c/4914/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 287:   clazz = null;
remove


Line 290: if ( clazz != null ) {
style: extra spaces, we do

if (clazz != null) {
...
}


Line 295: m = null;
remove


Line 311: LOG.info("SUPPORTS_STORAGE_ID is: " + (SUPPORTS_STORAGE_ID ? 
"true" : "false"));
no need for this ternary logic, just:

LOG.info("SUPPORTS_STORAGE_ID is: " + SUPPORTS_STORAGE_ID);


Line 352:   private static int getSequentialDiskId(String storageId)
While this solution is minimally invasive, I'm not a big fan of the extra 
locking and memory consumption of this map.

An overall better design is to convert the UUID to a 128-bit int and use that 
everywhere. Also see my comments below.

Let me know what you think.


Line 353:   {
move to previous line


Line 379:   private static int getDiskId(String storageId) {
As much as I'd like to use this simpler alternative, I think it could lead to 
serious performance problems that could be very hard to debug. I don't think we 
can use this.

Perhaps we should consider switching to a 128-bit integer to represent the UUID 
everywhere (including all thrift structures and the BE). The BE only needs a 
number that it can mod against the number of local data dirs (see 
DiskIoMgr::AssignQueue()).

Using a 128-bit int would also work with the old volume id API.

We have an existing TUniqueId which may be suitable.


Line 437:   BlockLocation[] locations = fs.getFileBlockLocations(file, 0, 
file.getLen());
still using the old API?

My understanding is that switching to a new API for these calls here will make 
the extra loadDiskIds() unnecessary.


Line 526:   if (SUPPORTS_STORAGE_ID) {
This seems fine as a first cut, but we should consider taking this a step 
further in a follow-on change.

This is still grabbing the storage ids partition-by-partition, but ideally we 
could get all the blocks and their locations for an entire table. We'd need to 
handle partitions with non-standard locations specially.


-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#4).

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..

IMPALA-4338: test infra data migrator: include tables' primary keys in 
PostgreSQL

This patch adds the ability for the test infrastructure's
Impala-to-PostgreSQL data migration tool to recognize whether the Impala
source tables have primary keys, and if so, CREATE the tables in
PostgreSQL with the same primary keys. This is needed especially for
performing CRUD operations by the random query generator for comparison
with Impala/Kudu tables and equivalent PostgreSQL tables.

I modified the make_create_table_sql() implementation to check the
"universal" Python object model of the table's columns. We generate
CREATE TABLE statements with, or without, a PRIMARY KEY clause. For
Impala-side tables that this tool may create, we also ensure that we
only write such a clause when the table's format supports primary keys
(currently Kudu).

When the random query generator runs, it needs to know that the tables
it's examining in both databases are equivalent. It does this by
examining the tables' names, column names, and column types. I have
added whether the column is a primary key as part of this equivalence
test.

Testing:
- The patch includes some unit and system tests for the tool.
- Actually migrated a few small Kudu and HDFS tables from Impala into
  both PostgreSQL 9.3 and 9.5 and examined the tables in PostgreSQL to
  make sure they had primary keys (or not) as expected.
- Very short discrepancy_searcher.py --explain-only runs to test positive
  and negative cases of Impala/PostgreSQL equivalency.

Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
---
M tests/comparison/db_connection.py
M tests/comparison/tests/fake_query.py
A tests/comparison/tests/test_cursor.py
3 files changed, 271 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4951/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..


Patch Set 4:

(3 comments)

Thanks for the review. Please see patch set 4.

http://gerrit.cloudera.org:8080/#/c/4951/2/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS2, Line 797: r
> why 3?
Done


http://gerrit.cloudera.org:8080/#/c/4951/2/tests/comparison/tests/test_cursor.py
File tests/comparison/tests/test_cursor.py:

PS2, Line 43: class,
> class
Done


Line 172: def test_postgres_table_reading(postgresql_cursor, 
sql_primary_key_map):
> Are we connecting to a real Postgres instance in this test? If so, where do
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4302,IMPALA-2379: constant expr arg fixes

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4302,IMPALA-2379: constant expr arg fixes
..


Patch Set 4:

(11 comments)

Looks good. I think just comments about comments.

http://gerrit.cloudera.org:8080/#/c/4838/4//COMMIT_MSG
Commit Message:

Line 18: arguments for ScalarFnCall expressions on both the codegen'd and
missing words


PS4, Line 36: temporary
automatic?  at least I would have understood this a little quicker with that 
terminology.


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS4, Line 158: ",
is it worth differentiating these messages to make debugging easier?


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

PS4, Line 297: The returned AnyVal is not initialized.
this sentence is repeated


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/expr-context.h
File be/src/exprs/expr-context.h:

PS4, Line 66: fragment
do you mean fragment instance?


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS4, Line 188: fragment
fragment or fragment instance?


PS4, Line 189: cases
not sure what this sentence is saying. what cases?


Line 205: // children.
do we document the calling convention somewhere? I think a brief explanation 
would help casual readers understand why the interpreted path and codegen path 
set things up the way they do.


PS4, Line 340: optimise out the buffer
is this also enabling constant propagation for constant args, or was that 
already happening?


PS4, Line 370: Either no varargs or arguments before varargs begin
this comment seems unnecessary with the new conditional


http://gerrit.cloudera.org:8080/#/c/4838/4/be/src/runtime/free-pool.h
File be/src/runtime/free-pool.h:

Line 73: size = std::max(8, size);
seems fine, but why is it required?


-- 
To view, visit http://gerrit.cloudera.org:8080/4838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I45c3ed8c9d7a61e94a9b9d6c316e8a53d9ff6c24
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change.

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..


Patch Set 1:

First pass on the review to make sure it's going in the right direction. I also 
wonder about the best way to test it in the current dev environment; the 
minicluster seems to be too constrained in instances and (simulated) spindles.
The old code path is still preserved in case the new HDFS method is not present 
on the platform.

-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4371: Incorrect DCHECK-s in hdfs-parquet-table-writer

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4371: Incorrect DCHECK-s in hdfs-parquet-table-writer
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4371: Incorrect DCHECK-s in hdfs-parquet-table-writer

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4371: Incorrect DCHECK-s in hdfs-parquet-table-writer
..


IMPALA-4371: Incorrect DCHECK-s in hdfs-parquet-table-writer

Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Reviewed-on: http://gerrit.cloudera.org:8080/4835
Reviewed-by: Tim Armstrong 
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-table-writer.cc
1 file changed, 1 insertion(+), 3 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved
  Tim Armstrong: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/4835
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 12:

Rebased. Had to make some fixes to the UDF tests because of logical conflicts 
with one of Michael's codegen changes - I started running the tests with and 
without codegen, but some of the tests can now not be run with 
disable_codegen=1.

-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172: Switch to BlockLocation methods for disk IDs

2016-11-04 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4914

Change subject: IMPALA-4172: Switch to BlockLocation methods for disk IDs
..

IMPALA-4172: Switch to BlockLocation methods for disk IDs

This change enables Impala to use BlockLocation#getStorageIds,
the new call in Hadoop 3 for getting HDFS data block location
info from the NameNode. This call supercedes the old
getFileBlockLocations call which will be  deprecated in Hadoop-3.

The presence of BlockLocation#getStorageIds is determined dynamically.
This is necessary because the implementation was backported to
various Hadoop-2 releases after it appeared in Hadoop-3, so simple
version checking is not suitable to decide whether the call can be
used. In cases where both getFileBlockLocations and
BlockLocation#getStorageIds are supported, the code prefers the
latter for performance reasons: getStorageIds does not have to
query the DataNodes for this information.

BlockLocation#getStorageIds returns disk IDs as UUID-based strings,
which would be too expensive to ship around and store. This patch
maps these strings to small integers, preserving compatibility with the
existing representation of diskIDs.

Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
1 file changed, 155 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/4914/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibbff94cef9a9db7b3945f8e7b0286866d2cc3b61
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-11-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#12).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.
The tests are run with both codegen enabled and disabled. To avoid
flaky tests, we switch the UDF tests to use "unique_database".

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
M 
testdata/workloads/functional-query/queries/QueryTest/libs_with_same_filenames.test
M testdata/workloads/functional-query/queries/QueryTest/load-java-udfs.test
M testdata/workloads/functional-query/queries/QueryTest/uda-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/uda.test
A 
testdata/workloads/functional-query/queries/QueryTest/udf-codegen-required.test
M testdata/workloads/functional-query/queries/QueryTest/udf-errors.test
M testdata/workloads/functional-query/queries/QueryTest/udf-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
47 files changed, 973 insertions(+), 747 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 7:

Thanks for the review, Taras. This *does* need GVO, which I'll make sure to do.

-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 3:

> There is no point in sending it through GVO. Just submit it directly

I agree. As a non-committer, I can't do it, which is why I asked you. Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required

This patch now requires users of the Leopard framework to supply a
DOCKER_IMAGE_NAME. The cloudera/impala image isn't being maintained, and
until Apache Impala (incubating) decides to publish its own image, no
default image name is possible. It is still possible to run the Leopard
framework against a homegrown Docker image that has Apache Impala
(incubating) and PostgreSQL installed: simply build such an image and
export the DOCKER_IMAGE_NAME environment variable before running the
controller.

While here, fix some flake8 non-indent problems.

Testing: short Leopard controller / query generator run.

Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Reviewed-on: http://gerrit.cloudera.org:8080/4936
Reviewed-by: Matthew Jacobs 
Reviewed-by: Jim Apple 
Reviewed-by: Taras Bobrovytsky 
Tested-by: Jim Apple 
---
M tests/comparison/leopard/controller.py
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 3 insertions(+), 6 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Jim Apple: Looks good to me, but someone else must approve; Verified
  Matthew Jacobs: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 4:

> There is no point in sending it through GVO. Just submit it
 > directly

Only committers can do so. I did so.

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 7: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4260: Alter table add column drops all the column stats

2016-11-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4260: Alter table add column drops all the column stats
..


Patch Set 2: Code-Review+2

carrying dimitris' +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4845
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 3:

There is no point in sending it through GVO. Just submit it directly

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4951/2/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS2, Line 797: 3
why 3?


http://gerrit.cloudera.org:8080/#/c/4951/2/tests/comparison/tests/test_cursor.py
File tests/comparison/tests/test_cursor.py:

PS2, Line 43: cllass
class


Line 172: def test_postgres_table_reading(postgresql_cursor, 
sql_primary_key_map):
Are we connecting to a real Postgres instance in this test? If so, where do we 
configure the host and port, or do we just always connect to the local instance?


-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 7: Code-Review+1

(1 comment)

Thanks for the review. I made the rename and ran the tests locally.

http://gerrit.cloudera.org:8080/#/c/4873/5/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS5, Line 532: _fet
> Maybe it would be better to call this something other than get? Maybe 'load
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Michael Brown (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4873

to look at the new patch set (#7).

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..

IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

Test infrastructure, including the random query generator and the data
migrator, needs to know the primary keys of Impala/Kudu tables. This
test infrastructure keeps Python object models of the tables and
columns. This patch adds the ability to read from source Impala/Kudu
tables via SHOW CREATE TABLE and store primary keys as proper
attributes. The patch also adds tests that ensure the test
infrastructure is always able to read and store the primary keys. This
helps find breakages sooner rather than later. For example, if a
regression to "SHOW CREATE TABLE" or the test infrastructure makes us no
longer able to parse primary keys, GVO or other CI will find the
breakage faster than running the query generator.

I also fixed some flake8 issues in files I touched. There were several
files that had a lot of white space warnings, and I wanted to keep the
patch from getting too large.

Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
---
M tests/comparison/common.py
M tests/comparison/db_connection.py
M tests/conftest.py
M tests/metadata/test_show_create_table.py
4 files changed, 175 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/4873/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 6: Code-Review+1

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-11-04 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4048: Misc. improvements to /sessions
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 3:

Thanks! Do you mind submitting this in the Gerrit WebUI since

1. GVO won't touch paths affected by this patch

2. We are generally having GVO backlog and other problems - don't want to waste 
resources or create more trouble for a patch that wouldn't benefit from GVO

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-11-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 8:

(3 comments)

Couple of naming questions, otherwise looks good.

http://gerrit.cloudera.org:8080/#/c/4633/8/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS8, Line 42: #include "runtime/mem-tracker.h"
what do you need this for?


http://gerrit.cloudera.org:8080/#/c/4633/8/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS8, Line 137: PlanFragmentThreads
Does it work if you pass an empty string? I think "PlanFragmentThreads" is 
redundant, given the suffixes attached to all the counters.


http://gerrit.cloudera.org:8080/#/c/4633/8/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS8, Line 323: plan_fragment_counters_
how about calling this total_thread_statistics_ or similar?

plan_fragment_counters_ doesn't tell me anything about what it contains.


-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4427: leopard: make DOCKER IMAGE NAME required

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4427: leopard: make DOCKER_IMAGE_NAME required
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1cb96cb5c9a894f40e0892f3bdd3f3d0158e887
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default

2016-11-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3771: Expose kudu client timeout and set default
..


Patch Set 5: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4849/5/be/src/exec/kudu-table-sink.h
File be/src/exec/kudu-table-sink.h:

Line 103:   /// This uses 'kudu::client::sp::shared_ptr' as that is the type 
expected by Kudu.
this comment can probably be removed now that kudu has it's own namespace which 
makes it clear what's going on.


http://gerrit.cloudera.org:8080/#/c/4849/5/testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test:

Line 7: Unable to initialize the Kudu scan node
will these always give the error? couldn't it possibly work with 1ms timeout?


-- 
To view, visit http://gerrit.cloudera.org:8080/4849
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model

2016-11-04 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in 
object model
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4873/5/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS5, Line 532: _get
Maybe it would be better to call this something other than get? Maybe 'load' or 
'fetch'.

To me get sounds like it's returning some value in a local varaible.


-- 
To view, visit http://gerrit.cloudera.org:8080/4873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4421: Send custom cluster & process failure test results to logs/

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4421: Send custom cluster & process failure test results 
to logs/
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I475f61b4ba8a693324bbefed8819d029674b66dd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4421: Send custom cluster & process failure test results to logs/

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4421: Send custom cluster & process failure test results 
to logs/
..


IMPALA-4421: Send custom cluster & process failure test results to logs/

Improperly nested quotes in tests/run-custom-cluster-tests.sh caused a
directory named "${RESULTS_DIR}" to be created in the tests directory,
which in turned interfered with run-tests.py.

Additonally noticed that run-process-failure-tests.sh was creating a
results directory within tests/. We shoud not be polluting the tests/
dir with random test artifacts, so changed the result directory path
for those tests as well.

Tested by running both scripts, and ensuring that results wound up in
logs/, and nothing was created in tests/.

Change-Id: I475f61b4ba8a693324bbefed8819d029674b66dd
Reviewed-on: http://gerrit.cloudera.org:8080/4918
Reviewed-by: Jim Apple 
Tested-by: Internal Jenkins
---
M tests/run-custom-cluster-tests.sh
M tests/run-process-failure-tests.sh
2 files changed, 9 insertions(+), 8 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I475f61b4ba8a693324bbefed8819d029674b66dd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-11-04 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 8: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/419/

-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..


Patch Set 2:

Note this patched is based on https://gerrit.cloudera.org/#/c/4873/

-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4338: test infra data migrator: include tables' primary keys in PostgreSQL

2016-11-04 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4951

Change subject: IMPALA-4338: test infra data migrator: include tables' primary 
keys in PostgreSQL
..

IMPALA-4338: test infra data migrator: include tables' primary keys in 
PostgreSQL

This patch adds the ability for the test infrastructure's
Impala-to-PostgreSQL data migration tool to recognize whether the Impala
source tables have primary keys, and if so, CREATE the tables in
PostgreSQL with the same primary keys. This is needed especially for
performing CRUD operations by the random query generator for comparison
with Impala/Kudu tables and equivalent PostgreSQL tables.

I modified the make_create_table_sql() implementation to check the
"universal" Python object model of the table's columns. We generate
CREATE TABLE statements with, or without, a PRIMARY KEY clause. For
Impala-side tables that this tool may create, we also ensure that we
only write such a clause when the table's format supports primary keys
(currently Kudu).

When the random query generator runs, it needs to know that the tables
it's examining in both databases are equivalent. It does this by
examining the tables' names, column names, and column types. I have
added whether the column is a primary key as part of this equivalence
test.

Testing:
- The patch includes some unit and system tests for the tool.
- Actually migrated a few small Kudu and HDFS tables from Impala into
  both PostgreSQL 9.3 and 9.5 and examined the tables in PostgreSQL to
  make sure they had primary keys (or not) as expected.
- Very short discrepancy_searcher.py --explain-only runs to test positive
  and negative cases of Impala/PostgreSQL equivalency.

Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
---
M tests/comparison/db_connection.py
M tests/comparison/tests/fake_query.py
A tests/comparison/tests/test_cursor.py
3 files changed, 268 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4951/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I447f022e2dc3d4fc8373b7f388c7875a869921b8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-3552: Make incremental stats max serialized size configurable

2016-11-04 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3552: Make incremental stats max serialized size 
configurable
..


Patch Set 5:

(12 comments)

Nice cleanup

http://gerrit.cloudera.org:8080/#/c/4867/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 150: const {
move into line above (90 chars per line max)


Line 152:   cfg.load_catalog_in_background = FLAGS_load_catalog_in_background;
use the thrift setter functions


http://gerrit.cloudera.org:8080/#/c/4867/5/common/thrift/Types.thrift
File common/thrift/Types.thrift:

Line 236:   1: required string sentry_config
Why are some of these required and some optional? I couldn't make out a pattern.


Line 242:   3: required i32 other_log_lvl
non_impala_java_vlog


Line 246:   5: required i64 inc_stats_size_limit_bytes
your gflag is a uint64, I suggest making the gflag signed as well


Line 249:   6: required bool compute_lineage
if possible, I'd prefer to pass the gflags directly, i.e., instead of a bool 
here, pass the lineage_event_log_dir, and then the FE BackendConfig can 
implement a function computeLineage() based on that


http://gerrit.cloudera.org:8080/#/c/4867/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 1601: (statsSizeEstimate < 
BackendConfig.INSTANCE.getIncStatMaxSize());
getIncStatsMaxSize() (Stats vs. Stat)


http://gerrit.cloudera.org:8080/#/c/4867/5/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java:

Line 30: public class RuntimeEnv {
much better!


http://gerrit.cloudera.org:8080/#/c/4867/5/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

Line 41:   }
How about adding initializing the singleton this way:

public static void create(TBackendConfig cfg) {
  Preconditions.checkNotNull(cfg);
  Preconditions.checkState(INSTANCE == null);
  INSTANCE = new TBackendConfig(cfg);
}


Line 43:   public void setBackendConfig(TBackendConfig cfg) {
indentation


http://gerrit.cloudera.org:8080/#/c/4867/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 83:   public JniCatalog(byte[] thriftBEConfig) throws InternalException,
thriftBackendConfig or thriftBeConfig


http://gerrit.cloudera.org:8080/#/c/4867/5/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

Line 120:   public JniFrontend(byte[] thriftBEConfig) throws InternalException,
remove InternalException because that's already covered by ImpalaException


-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4406: Add cryptography export control notice

2016-11-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4406: Add cryptography export control notice
..


Patch Set 3: Code-Review+2 Verified+1

(2 comments)

Carry +2.

No code changes, so verifying and submitting manually.

http://gerrit.cloudera.org:8080/#/c/4940/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4406: Add cryptography export control notice
> JIRA #
Done


http://gerrit.cloudera.org:8080/#/c/4940/2/README.md
File README.md:

Line 39: Please refer to EXPORT\_CONTROL.md for more information.
> escape the underscore for Markdown?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I527a5c07a13c207c33544887d3f20501a0294c01
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4406: Add cryptography export control notice

2016-11-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has submitted this change and it was merged.

Change subject: IMPALA-4406: Add cryptography export control notice
..


IMPALA-4406: Add cryptography export control notice

The ASF guidelines on crypto requires projects that use any form of
asymmetric cryptography to include an export control notice to notify
the users of the same.

Refer: https://www.apache.org/dev/crypto.html

Change-Id: I527a5c07a13c207c33544887d3f20501a0294c01
Reviewed-on: http://gerrit.cloudera.org:8080/4940
Reviewed-by: Sailesh Mukil 
Tested-by: Sailesh Mukil 
---
A EXPORT_CONTROL.md
M README.md
2 files changed, 51 insertions(+), 1 deletion(-)

Approvals:
  Sailesh Mukil: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I527a5c07a13c207c33544887d3f20501a0294c01
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4406: Add cryptography export control notice

2016-11-04 Thread Sailesh Mukil (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4940

to look at the new patch set (#3).

Change subject: IMPALA-4406: Add cryptography export control notice
..

IMPALA-4406: Add cryptography export control notice

The ASF guidelines on crypto requires projects that use any form of
asymmetric cryptography to include an export control notice to notify
the users of the same.

Refer: https://www.apache.org/dev/crypto.html

Change-Id: I527a5c07a13c207c33544887d3f20501a0294c01
---
A EXPORT_CONTROL.md
M README.md
2 files changed, 51 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/4940/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I527a5c07a13c207c33544887d3f20501a0294c01
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-4391: fix dropped statuses in scanners

2016-11-04 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4391: fix dropped statuses in scanners
..


Patch Set 1: Code-Review+1

Thanks for fixing this!

-- 
To view, visit http://gerrit.cloudera.org:8080/4938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7c47f8b29a20dc74ad9e9e1c58b3d7ed95de4d35
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-11-04 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-4048: Misc. improvements to /sessions
..

IMPALA-4048: Misc. improvements to /sessions

* Make table searchable and sortable
* Fix 'last accessed time' being printed in UTC
* Made table contents more compact so that it mostly fits on screen
* Clarify summary text re: active and inactive sessions
* Include fix in mustache-cpp required to print 64-bit integers
  correctly (see
  
https://github.com/henryr/cpp-mustache/commit/29768bf0e84f5a1e95e006fc64996d375499dbda)

Testing: Visual inspection and manual sorting, searching etc.

Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
---
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/thirdparty/mustache/mustache.cc
M www/sessions.tmpl
6 files changed, 78 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/4880/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-11-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4048: Misc. improvements to /sessions
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4880/2/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

Line 466: "start_time_sort", session.second->last_accessed_ms, 
document->GetAllocator());
> session.second->start_time_ms?
Oops, thanks.


http://gerrit.cloudera.org:8080/#/c/4880/2/be/src/thirdparty/mustache/mustache.cc
File be/src/thirdparty/mustache/mustache.cc:

Line 281: (*out) << context->GetInt64();
> GetInt()?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


  1   2   >