[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition 
pruning
..


Patch Set 9:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/525/

-- 
To view, visit http://gerrit.cloudera.org:8080/6575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition 
pruning
..


Patch Set 9: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 189
> Fair point. Let me think about this a little more.
How about running this query and asserting that the result is sorted?

select last_value(random(2)) over (order by random(2)) lv from 
functional.alltypessmall order by lv;

It's not perfect, but better than nothing imo.
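
A minimal sketch of the check suggested above, assuming the test has already fetched the lv column of the query as a list of numbers. The helper is_sorted and the sample values are hypothetical and are not part of test_sort.py.

```python
def is_sorted(values):
    """Return True if values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical stand-in for the 'lv' column returned by the query above.
lv_column = [0.12, 0.12, 0.47, 0.80, 0.93]
assert is_sorted(lv_column)

# A result that came back out of order would be rejected.
assert not is_sorted([0.5, 0.1, 0.9])
```

Because the random values themselves are not known in advance, the check only asserts ordering, not specific values.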


-- 
To view, visit http://gerrit.cloudera.org:8080/6775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


IMPALA-5162,IMPALA-5163: stress test support on secure clusters

This patch adds support for running the stress test
(concurrent_select.py) and loading nested data (load_nested.py) into a
Kerberized, SSL-enabled Impala cluster. It assumes the calling user
already has a valid Kerberos ticket. One way to do that is:

1. Get access to a keytab and krb5.config
2. Set KRB5_CONFIG and KRB5CCNAME appropriately
3. Run kinit(1)
4. Run load_nested.py and/or concurrent_select.py within this
   environment.
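
The steps above can be sketched as follows. This is an illustration only: the helper names and all paths are hypothetical, and the sketch just builds the environment and the kinit(1) command line without running anything.

```python
import os


def kerberos_env(krb5_config, ccache, base_env=None):
    """Return a copy of the environment with the Kerberos variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KRB5_CONFIG"] = krb5_config  # location of the Kerberos config file
    env["KRB5CCNAME"] = ccache        # location of the credential cache
    return env


def kinit_command(keytab, principal):
    """Build a kinit(1) invocation that obtains a ticket from a keytab."""
    return ["kinit", "-k", "-t", keytab, principal]


# Hypothetical paths and principal, for illustration only.
env = kerberos_env("/path/to/krb5.conf", "/tmp/krb5cc_stress", base_env={})
cmd = kinit_command("/path/to/user.keytab", "user@EXAMPLE.COM")
# Passing cmd and env to subprocess.check_call(cmd, env=env) would run kinit;
# load_nested.py or concurrent_select.py would then be started with the same env.
```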

Because our Python clients already support Kerberos and SSL, we simply
need to make sure to use the correct options when calling the entry
points and initializing the clients:

Impala: Impyla
Hive: Impyla
HDFS: hdfs.ext.kerberos.KerberosClient

With this patch, I was able to manually do a short concurrent_select.py
run against a secure cluster without connection or auth errors, and I
was able to do the same with load_nested.py for a cluster that already
had TPC-H loaded.

Follow-ons for future cleanup work:

IMPALA-5263: support CA bundles when running stress test against SSL'd
 Impala

IMPALA-5264: fix InsecurePlatformWarning under stress test with SSL

Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Reviewed-on: http://gerrit.cloudera.org:8080/6763
Reviewed-by: Matthew Mulder 
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M testdata/bin/load_nested.py
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
5 files changed, 61 insertions(+), 18 deletions(-)

Approvals:
  Matthew Mulder: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4866: Hash join node does not apply limits correctly

2017-05-01 Thread anujphadke (Code Review)
anujphadke has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6778

Change subject: IMPALA-4866: Hash join node does not apply limits correctly
..

IMPALA-4866: Hash join node does not apply limits correctly

Hash join node currently does not apply the limits correctly.
This issue gets masked most of the time since the planner sticks
an exchange node on top of most of the joins. This issue gets
exposed when NUM_NODES=1.
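
To illustrate what applying the limit inside the operator means, here is a small sketch. It is not the code in partitioned-hash-join-node.cc; apply_limit and the sample batches are hypothetical, and row batches are modeled as plain lists.

```python
def apply_limit(batches, limit):
    """Yield batches until `limit` rows have been returned in total.

    The final batch is truncated, so the operator itself never returns more
    than `limit` rows, regardless of which node sits above it in the plan.
    """
    returned = 0
    for batch in batches:
        if returned >= limit:
            break
        remaining = limit - returned
        out = batch[:remaining]
        returned += len(out)
        yield out


# Hypothetical join output: three batches of joined rows.
join_output = [[1, 2, 3], [4, 5, 6], [7, 8]]
limited = list(apply_limit(join_output, 4))
assert limited == [[1, 2, 3], [4]]
```

Without such a check in the join itself, the extra rows are only trimmed if some node above the join (such as an exchange) enforces the limit.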

Change-Id: I414124f8bb6f8b2af2df468e1c23418d05a0e29f
---
M be/src/exec/partitioned-hash-join-node.cc
M tests/common/test_dimensions.py
2 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6778/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I414124f8bb6f8b2af2df468e1c23418d05a0e29f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..


IMPALA-3742: Partitions and sort INSERTs for Kudu tables

Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu
are currently painful because we just send rows randomly,
which creates a lot of work for Kudu since it partitions
and sorts data before writing, causing writes to be slow
and leading to timeouts.

We can alleviate this by sending the rows to Kudu already
partitioned and sorted. This patch partitions and sorts
rows according to Kudu's partitioning scheme for INSERTs
and UPSERTs. A followup patch will handle UPDATE and DELETE.

It accomplishes this by inserting an exchange node and a sort
node into the plan before the operation. Both the exchange and
the sort are given a KuduPartitionExpr which takes a row and
calls into the Kudu client to return its partition number.
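
The idea can be sketched as below. This is only an illustration: partition_of is a hypothetical stand-in for KuduPartitionExpr (the real expression calls into the Kudu client), rows are modeled as dicts with an assumed integer key "id", and sorting each group by that key is an assumption of the sketch.

```python
from collections import defaultdict


def partition_of(row, num_partitions):
    """Hypothetical stand-in for KuduPartitionExpr: row -> partition number."""
    return row["id"] % num_partitions


def partition_and_sort(rows, num_partitions):
    """Group rows by partition number (the exchange), then sort each group."""
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[partition_of(row, num_partitions)].append(row)
    return {
        part: sorted(group, key=lambda r: r["id"])
        for part, group in sorted(by_partition.items())
    }


rows = [{"id": 7}, {"id": 2}, {"id": 4}, {"id": 1}]
result = partition_and_sort(rows, 2)
assert result == {0: [{"id": 2}, {"id": 4}], 1: [{"id": 1}, {"id": 7}]}
```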

It also disallows INSERT hints for Kudu tables, since the
hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer
make sense.

Testing:
- Updated planner tests.
- Ran the Kudu functional tests.
- Ran performance tests demonstrating that we can now handle much
  larger inserts without having timeouts.

Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Reviewed-on: http://gerrit.cloudera.org:8080/6559
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/CMakeLists.txt
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
A be/src/exprs/kudu-partition-expr.cc
A be/src/exprs/kudu-partition-expr.h
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/scheduling/scheduler.cc
M common/thrift/Exprs.thrift
M common/thrift/Partitions.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
26 files changed, 616 insertions(+), 170 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..


Patch Set 9: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

PS10, Line 287:   /// TODO: Move these into the new query-wide state, indexed 
by partition id.
Remove.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 189
> Do you have any suggestions?
Fair point. Let me think about this a little more.


-- 
To view, visit http://gerrit.cloudera.org:8080/6775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has submitted this change and it was merged.

Change subject: IMPALA-5003: Constant propagation in scan conjuncts
..


IMPALA-5003: Constant propagation in scan conjuncts

Implements constant propagation within conjuncts and applies the
optimization to scan conjuncts and collection conjuncts within Hdfs
scan nodes.  The optimization is applied during planning.  At scan
nodes in particular, we want to optimize to enable partition pruning.
In certain cases, we might end up with a FALSE conditional, which
now will convert to an EmptySet node.
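
A toy model of the optimization, for illustration only. It is not Impala's implementation: conjuncts are modeled as hypothetical (column, op, operand) tuples, only a single propagation pass is made, and a return value of None stands in for the EmptySet case.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def propagate_constants(conjuncts):
    """Substitute `col = constant` bindings into the other conjuncts.

    An operand is either an int constant or the name of another column.
    Returns the rewritten conjunct list, or None if a conjunct folds to FALSE.
    """
    bindings = {}
    for col, op, operand in conjuncts:
        if op == "=" and isinstance(operand, int):
            if bindings.setdefault(col, operand) != operand:
                return None  # contradictory bindings, e.g. a = 1 AND a = 2
    result = []
    for col, op, operand in conjuncts:
        if op == "=" and isinstance(operand, int):
            result.append((col, op, operand))  # keep the binding itself
            continue
        # Replace a column operand with its known constant value, if any.
        if isinstance(operand, str) and operand in bindings:
            operand = bindings[operand]
        if col in bindings and isinstance(operand, int):
            # Both sides are constants now: fold the predicate.
            if not OPS[op](bindings[col], operand):
                return None
            continue  # always TRUE, so the conjunct can be dropped
        result.append((col, op, operand))
    return result


# year = 2010 AND month < year  becomes  year = 2010 AND month < 2010
assert propagate_constants([("year", "=", 2010), ("month", "<", "year")]) == [
    ("year", "=", 2010),
    ("month", "<", 2010),
]
# a = 5 AND a < 3 folds to FALSE
assert propagate_constants([("a", "=", 5), ("a", "<", 3)]) is None
```

Rewriting a predicate on a partition column into a comparison against a constant is what makes partition pruning possible in this model.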

Testing: Expanded the test cases for the planner to achieve constant
propagation.  Added Kudu, datasource, Hdfs and HBase tests to validate
we can create EmptySetNodes.

Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Reviewed-on: http://gerrit.cloudera.org:8080/6389
Tested-by: Impala Public Jenkins
Reviewed-by: Alex Behm 
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
19 files changed, 636 insertions(+), 93 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 26
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan conjuncts
..


Patch Set 25: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 25
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/524/

-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan conjuncts
..


Patch Set 25: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 25
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 189
> Do we have any end-to-end tests that exercise non-deterministic exprs in an
Do you have any suggestions?

It's hard to check the output in a reliable way unless we're able to have the 
random() values returned so that we can check that their order is correct, but 
I don't know of any way to do that (other than use an inline view, which 
doesn't work, see IMPALA-5270)

We can run the query without checking its output just to ensure that it doesn't 
crash, though the crash that was happening here was rare so that doesn't give 
us much coverage. We could fix that by adding it to the query generator.

I also haven't actually repro-ed this, so I could also investigate why it's 
actually failing and it may be something related to local filesystem that could 
get fixed, but the entire premise of the test is faulty and it's likely to just 
continue to be flaky (e.g. if IMPALA-660 gets addressed).


-- 
To view, visit http://gerrit.cloudera.org:8080/6775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in 
Parquet
..


Patch Set 10: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/521/

-- 
To view, visit http://gerrit.cloudera.org:8080/5939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6775/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 189
Do we have any end-to-end tests that exercise non-deterministic exprs in an 
analytic sort? If not, I suggest we try to fix this test instead.

I know there are PlannerTests, but having at least one end-to-end test seems 
prudent.


-- 
To view, visit http://gerrit.cloudera.org:8080/6775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5262: test_analytic_order_by_random fails with assert

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6775

Change subject: IMPALA-5262: test_analytic_order_by_random fails with assert
..

IMPALA-5262: test_analytic_order_by_random fails with assert

This was a poorly written test that relies on assumptions about
the behavior of 'rand' and the order that rows get processed in
a table that Impala doesn't actually guarantee.

It's also unnecessary as test_order_by_random verifies that sort
expr materialization behaves as expected and the PlannerTest
sort-expr-materialization verifies that exprs are materialized
for analytic functions as appropriate.

Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
---
M tests/query_test/test_sort.py
1 file changed, 0 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/6775/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If1ba8154c2b6a8d508916d85391b95885ef915a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5137: pt1, Refactor TimestampValue constructors

2017-05-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5137: pt1, Refactor TimestampValue constructors
..


Patch Set 7: Code-Review+2

sorry for all the rebasing noise, I'm holding off on committing this until the 
TIMESTAMP patch gets reviewed in case this needs to change:

https://gerrit.cloudera.org/#/c/6526/5

-- 
To view, visit http://gerrit.cloudera.org:8080/6510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id25e19f7984e5ebf9073d9c569faf69cec142fa1
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..


Patch Set 9:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/523/

-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..


Patch Set 9: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6559

to look at the new patch set (#9).

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..

IMPALA-3742: Partitions and sort INSERTs for Kudu tables

Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu
are currently painful because we just send rows randomly,
which creates a lot of work for Kudu since it partitions
and sorts data before writing, causing writes to be slow
and leading to timeouts.

We can alleviate this by sending the rows to Kudu already
partitioned and sorted. This patch partitions and sorts
rows according to Kudu's partitioning scheme for INSERTs
and UPSERTs. A followup patch will handle UPDATE and DELETE.

It accomplishes this by inserting an exchange node and a sort
node into the plan before the operation. Both the exchange and
the sort are given a KuduPartitionExpr which takes a row and
calls into the Kudu client to return its partition number.

It also disallows INSERT hints for Kudu tables, since the
hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer
make sense.

Testing:
- Updated planner tests.
- Ran the Kudu functional tests.
- Ran performance tests demonstrating that we can now handle much
  larger inserts without having timeouts.

Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/CMakeLists.txt
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
A be/src/exprs/kudu-partition-expr.cc
A be/src/exprs/kudu-partition-expr.h
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/scheduling/scheduler.cc
M common/thrift/Exprs.thrift
M common/thrift/Partitions.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
26 files changed, 616 insertions(+), 170 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/6559/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5266 Impala ABM / LZCNT support

2017-05-01 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5266 Impala ABM / LZCNT support
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5821/5/be/src/util/bit-util.h
File be/src/util/bit-util.h:

Line 88: return (value >> bits) | (value << (64 - bits));
This is undefined behavior when bits is 0 or 64:

"The behavior is undefined if the right operand is negative, or greater than or 
equal to the length in bits of the promoted left operand."
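
One common way to avoid the problem is to mask both shift counts so they stay in the range [0, 63]. The sketch below models that idiom in Python, where shifts have no undefined behavior, purely to show the arithmetic. The name rotate_right and the explicit 64-bit mask are assumptions of this sketch, not code from bit-util.h.

```python
MASK64 = (1 << 64) - 1


def rotate_right(value, bits):
    """Rotate a 64-bit value right by `bits`, with both shift counts masked.

    `bits & 63` and `-bits & 63` are always in [0, 63], so the same
    expression on an unsigned 64-bit integer never shifts by 64 or more.
    """
    value &= MASK64
    return ((value >> (bits & 63)) | (value << (-bits & 63))) & MASK64


# The edge cases called out in the comment above are now the identity.
assert rotate_right(0x0123456789ABCDEF, 0) == 0x0123456789ABCDEF
assert rotate_right(0x0123456789ABCDEF, 64) == 0x0123456789ABCDEF
# An ordinary rotation by one nibble.
assert rotate_right(0x0123456789ABCDEF, 4) == 0xF0123456789ABCDE
```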


-- 
To view, visit http://gerrit.cloudera.org:8080/5821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:   "0.0.0.0:50070")
> I can only speculate: My guess it has to do with supporting Mini vs. real c
Let's not make the changes in this patch to avoid breaking functionality. Just 
wanted to get your take on this pattern.


Line 412:   local_shell(pip_path + " install pykerberos==1.1.14 
requests-kerberos==0.11.0",
> I can only speculate: I've seen this pattern in a few places and it's likel
Thanks. If you agree there is questionable/little benefit to this lazy 
install+import, we should consider simplifying it - but not in this patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:   "0.0.0.0:50070")
> Independent question:
I can only speculate: My guess is that it has to do with supporting Mini vs.
real clusters, where the port numbers differ, and dev environments that are
half-set up or whatever. Do you want me to alter get_hadoop_config() in this
patch and remove the uses of default values?


Line 412:   local_shell(pip_path + " install pykerberos==1.1.14 
requests-kerberos==0.11.0",
> Not your change, but this flow strikes me as odd. We have packages+versions
I can only speculate: I've seen this pattern in a few places and it's likely an 
attempt at micro-optimization to prevent the Impala Python virtual environment 
from having unnecessary packages. Pre-commit tests won't go through this path, 
for example, and thus don't need the packages.


-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Matthew Mulder (Code Review)
Matthew Mulder has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1:

(2 comments)

Changes look reasonable, but I'm not super familiar with this code.

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:   "0.0.0.0:50070")
Independent question:
Does it even make sense to plug in default values here?
Seems like a misconfiguration might be hard to debug if we plug in default 
values, instead of throwing an error.


Line 412:   local_shell(pip_path + " install pykerberos==1.1.14 
requests-kerberos==0.11.0",
Not your change, but this flow strikes me as odd. We have packages+versions 
baked into the code here. What's the benefit of doing this lazy install+import 
as opposed to requiring these to be installed up-front?


-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..


Patch Set 8: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6559/7/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

Line 37:  * a given row. Returns -1 for rows that do not correspond to a 
partition. The children of
> is it documented in some class header that values outside the legal range r
Done


Line 74: for (int i = 0; i < children_.size(); ++i) {
> single line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables

2017-05-01 Thread Thomas Tauber-Marshall (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6559

to look at the new patch set (#8).

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
..

IMPALA-3742: Partitions and sort INSERTs for Kudu tables

Bulk DMLs (INSERT, UPSERT, UPDATE, and DELETE) for Kudu
are currently painful because we just send rows randomly,
which creates a lot of work for Kudu since it partitions
and sorts data before writing, causing writes to be slow
and leading to timeouts.

We can alleviate this by sending the rows to Kudu already
partitioned and sorted. This patch partitions and sorts
rows according to Kudu's partitioning scheme for INSERTs
and UPSERTs. A followup patch will handle UPDATE and DELETE.

It accomplishes this by inserting an exchange node and a sort
node into the plan before the operation. Both the exchange and
the sort are given a KuduPartitionExpr which takes a row and
calls into the Kudu client to return its partition number.
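
The ordering this produces can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: the Row type, the PartitionAndSort name, and the partition_of callback, which stands in for the call into the Kudu client. Breaking ties on a key column is also an assumption and is not stated above.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical row type with a single key column.
struct Row {
  int64_t key;
};

// Orders rows by partition number so that rows for the same partition are
// contiguous. 'partition_of' is a stand-in for the call into the Kudu client.
// Ties are broken on 'key', which is an assumption of this sketch.
inline void PartitionAndSort(
    std::vector<Row>* rows,
    const std::function<int(const Row&)>& partition_of) {
  std::stable_sort(rows->begin(), rows->end(),
                   [&](const Row& a, const Row& b) {
                     const int pa = partition_of(a);
                     const int pb = partition_of(b);
                     if (pa != pb) return pa < pb;
                     return a.key < b.key;
                   });
}
```

In the real plan the exchange node routes rows between nodes by partition before the sort runs; the sketch only shows the resulting per-node row order.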

It also disallows INSERT hints for Kudu tables, since the
hints that we support (SHUFFLE, CLUSTER, SORTBY) no longer
make sense.

Testing:
- Updated planner tests.
- Ran the Kudu functional tests.
- Ran performance tests demonstrating that we can now handle much
  larger inserts without having timeouts.

Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/CMakeLists.txt
M be/src/exprs/expr-context.h
M be/src/exprs/expr.cc
A be/src/exprs/kudu-partition-expr.cc
A be/src/exprs/kudu-partition-expr.h
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/scheduling/scheduler.cc
M bin/impala-config.sh
M common/thrift/Exprs.thrift
M common/thrift/Partitions.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
A fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
27 files changed, 617 insertions(+), 171 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/6559/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS1, Line 404: 
> Why is this removed?
It's an unsupported parameter. 
http://hdfscli.readthedocs.io/en/latest/api.html#hdfs.ext.kerberos.KerberosClient


-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

2017-05-01 Thread Matthew Mulder (Code Review)
Matthew Mulder has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS1, Line 404: 
Why is this removed?


-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan conjuncts
..


Patch Set 25:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/522/

-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 25
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in 
Parquet
..


Patch Set 10:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/521/

-- 
To view, visit http://gerrit.cloudera.org:8080/5939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5266 Impala ABM / LZCNT support

2017-05-01 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#5).

Change subject: IMPALA-5266 Impala ABM / LZCNT support
..

IMPALA-5266 Impala ABM / LZCNT support

I recently added some code that needs to round a value up to
the next power of 2.  It turns out this can be done much more
quickly in hardware.  It isn't on a perf-critical code path yet,
but it still seems like a decent idea.

PopcountNoHw was absolutely atrocious as it contains a totally
unpredictable loop that can be computed much more efficiently,
so I fixed that as well.

Testing: Added a perf test to verify this is faster (it is)
and updated the bit-util-test to add better test coverage.
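
For readers unfamiliar with the two techniques mentioned, here is a minimal sketch, not the code from this patch. The function names are hypothetical, and __builtin_clzll is a GCC/Clang builtin that typically compiles to LZCNT or BSR on x86-64.

```cpp
#include <cstdint>

// Loop-free population count (SWAR bit trick). Hypothetical name; shown as an
// example of a software fallback with no data-dependent branches.
inline int PopcountSwar(uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);                            // 2-bit sums
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);  // 4-bit sums
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                            // 8-bit sums
  return static_cast<int>((x * 0x0101010101010101ULL) >> 56);  // add all bytes
}

// Rounds v up to the next power of two using a count-leading-zeros builtin.
// Returns 1 for v <= 1 and 0 if the result would not fit in 64 bits.
inline uint64_t RoundUpToPowerOfTwo(uint64_t v) {
  if (v <= 1) return 1;
  if (v > (1ULL << 63)) return 0;  // next power of two overflows uint64_t
  // v - 1 is nonzero here, so the builtin's argument is valid, and the shift
  // count 64 - clz is in [1, 63].
  return 1ULL << (64 - __builtin_clzll(v - 1));
}
```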

Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bit-intrinsics-benchmark.cc
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
M be/src/util/fixed-size-hash-table.h
M be/src/util/sse-util.h
8 files changed, 287 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/5821/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] Bump Kudu version to 238249c

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Bump Kudu version to 238249c
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu version to 238249c

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: Bump Kudu version to 238249c
..


Bump Kudu version to 238249c

This will pull in the Kudu client partitioner API, which is needed for
IMPALA-3742.

Change-Id: I92587a8061ce70ecd9dac4889bda550636982767
Reviewed-on: http://gerrit.cloudera.org:8080/6718
Reviewed-by: Matthew Jacobs 
Tested-by: Impala Public Jenkins
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] Impala ABM / LZCNT support

2017-05-01 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Impala ABM / LZCNT support
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5821/3//COMMIT_MSG
Commit Message:

Line 7: Impala ABM / LZCNT support
> Can you file a tracking JIRA for this? We mostly have standardised on alway
Done - IMPALA-5266


-- 
To view, visit http://gerrit.cloudera.org:8080/5821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

2017-05-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#5).

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
..

IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP

Adds Impala support for TIMESTAMP types stored in Kudu.

Impala stores TIMESTAMP values in 96 bits and has nanosecond
precision. Kudu's timestamp is a 64-bit microsecond delta
from the Unix epoch (called UNIXTIME_MICROS), so a conversion
is necessary.

When writing to Kudu, TIMESTAMP values in nanoseconds are
rounded to the nearest microsecond.
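
As a rough illustration of the nanosecond-to-microsecond step, the sketch below rounds to the nearest microsecond. It assumes the value is available as a signed 64-bit nanosecond offset from the Unix epoch and that halves round away from zero; neither detail is confirmed by this change, and NanosToNearestMicros is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: converts a nanosecond offset from the Unix epoch to
// microseconds, rounding to the nearest microsecond (halves away from zero).
inline int64_t NanosToNearestMicros(int64_t nanos) {
  const int64_t micros = nanos / 1000;  // integer division truncates toward zero
  const int64_t rem = nanos % 1000;     // remainder has the same sign as nanos
  if (rem >= 500) return micros + 1;
  if (rem <= -500) return micros - 1;
  return micros;
}
```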

When reading from Kudu, the KuduScanner returns
UNIXTIME_MICROS with 8 bytes of padding so Impala can convert
the value to a TimestampValue in-line and copy the entire
row.
TODO: Kudu still needs to provide a knob to enable this:
  https://gerrit.cloudera.org/#/c/6624/

Testing:
Updated the functional_kudu schema to use TIMESTAMPs instead
of converting to STRING, so this provides some decent
coverage. Some BE tests were added, and some EE tests as
well.
TODO: More testing of boundary values, and some basic perf.

TODO: Support pushing down TIMESTAMP predicates
TODO: Support TIMESTAMPs in range partitioning expressions

Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/impala-config.sh
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
A testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts-abort-on-error.test
A testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M tests/query_test/test_kudu.py
M tests/query_test/test_queries.py
M tests/query_test/test_scanners.py
16 files changed, 312 insertions(+), 138 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6526/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5003: Constant propagation in scan conjuncts

2017-05-01 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan conjuncts
..


Patch Set 21:

Got a green light test run for this:

http://sandbox.jenkins.cloudera.com/view/Impala/view/Private-Utility/job/impala-private-build-and-test/5527/console

-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 21
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu version to 238249c

2017-05-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Bump Kudu version to 238249c
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/520/

-- 
To view, visit http://gerrit.cloudera.org:8080/6718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I92587a8061ce70ecd9dac4889bda550636982767
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] Experiment: glibc strncmp/memcmp appears much faster than SSE4.2

2017-05-01 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Experiment: glibc strncmp/memcmp appears much faster than SSE4.2
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6768/1//COMMIT_MSG
Commit Message:

PS1, Line 17: memcmp is sse4.1-based
> does it fall back to non-SSE4.1, if the CPU doesn't have SSE4.1? (maybe usi
Yes, it looks like it:

https://github.com/bminor/glibc/blob/ee19f1de0d0da24114be554fdf94243c0ec6b86c/sysdeps/x86_64/multiarch/memcmp.S

Any suggestions on how to check the codegen case? Do you think the benchmark 
showing 5x improvement in running a particular query in the shell covers 
that?


-- 
To view, visit http://gerrit.cloudera.org:8080/6768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4786a4a75fdaffedd6e17cf076b5368ba4b4e3e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Experiment: glibc strncmp/memcmp appears much faster than SSE4.2

2017-05-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: Experiment: glibc strncmp/memcmp appears much faster than SSE4.2
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6768/1//COMMIT_MSG
Commit Message:

PS1, Line 17: memcmp is sse4.1-based
does it fall back to non-SSE4.1, if the CPU doesn't have SSE4.1? (maybe using 
IFUNC)?

If so, then I think it makes sense to switch to gcc's version if it's always 
faster. You'll probably want to check codegen case too (make sure the cpu 
dependent dispatch all works with clang too).


-- 
To view, visit http://gerrit.cloudera.org:8080/6768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4786a4a75fdaffedd6e17cf076b5368ba4b4e3e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes