[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 4: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/197/

-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse

2016-09-13 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4411

Change subject: Make gen_build_version.py resilient to a failing git rev-parse
..

Make gen_build_version.py resilient to a failing git rev-parse

It was noticed that some build processes did not checkout Impala and
instead built it from a tarball. This would cause our gen_build_version
script to write a blank version info everytime to the version.info file.

This patch takes care of the case where if there is an already existing
version.info file and we cannot get the git rev-parse output, we use
the old file instead. Blank version info is written only when we don't
have an old version.info file and we cannot do a git rev-parse.

Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
---
M bin/gen_build_version.py
M bin/save-version.sh
2 files changed, 23 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4411/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests.
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests.
..


IMPALA-4091: Fix backend unit to log in logs/be_tests.

1. Many backend unit tests did not follow proper initialization
using InitCommonRuntime(), and as a result did not write their
logs into the logs/be_tests directory.

2. Added an IMPALA_TEST_MAIN() macro that stamps out the common
main() function used in most gtest unit tests.

3. Tests added via ADD_UDF_TEST in a CMakeLists.txt did not
have the logging dir set up properly.

Testing: I validated that every test produces a corresponding
.INFO file in logs/be_tests. The only exception is promise-test
for which I added a TODO since the fix seems non-trivial.

Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Reviewed-on: http://gerrit.cloudera.org:8080/4352
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/CMakeLists.txt
M be/src/codegen/instruction-counter-test.cc
M be/src/common/atomic-test.cc
M be/src/common/init.h
M be/src/exec/delimited-text-parser-test.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/incr-stats-util-test.cc
M be/src/exec/incr-stats-util.cc
M be/src/exec/old-hash-table-test.cc
M be/src/exec/parquet-plain-test.cc
M be/src/exec/parquet-version-test.cc
M be/src/exec/read-write-util-test.cc
M be/src/exec/row-batch-list-test.cc
M be/src/exec/zigzag-test.cc
M be/src/experiments/string-search-sse-test.cc
M be/src/exprs/aggregate-functions-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/hdfs-fs-cache-test.cc
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/parallel-executor-test.cc
M be/src/runtime/raw-value-test.cc
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/string-buffer-test.cc
M be/src/runtime/string-value-test.cc
M be/src/runtime/thread-resource-mgr-test.cc
M be/src/runtime/timestamp-test.cc
M be/src/scheduling/backend-config-test.cc
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/hs2-util-test.cc
M be/src/service/query-options-test.cc
M be/src/service/session-expiry-test.cc
M be/src/testutil/gtest-util.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-test.cc
M be/src/util/benchmark-test.cc
M be/src/util/bit-util-test.cc
M be/src/util/bitmap-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/bloom-filter-test.cc
M be/src/util/coding-util-test.cc
M be/src/util/debug-util-test.cc
M be/src/util/decompress-test.cc
M be/src/util/dict-test.cc
M be/src/util/error-util-test.cc
M be/src/util/filesystem-util-test.cc
M be/src/util/fixed-size-hash-table-test.cc
M be/src/util/internal-queue-test.cc
M be/src/util/logging-support-test.cc
M be/src/util/lru-cache-test.cc
M be/src/util/metrics-test.cc
M be/src/util/parse-util-test.cc
M be/src/util/perf-counters-test.cc
M be/src/util/pretty-printer-test.cc
M be/src/util/promise-test.cc
M be/src/util/redactor-config-parser-test.cc
M be/src/util/redactor-test.cc
M be/src/util/redactor-unconfigured-test.cc
M be/src/util/rle-test.cc
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/string-parser-test.cc
M be/src/util/symbols-util-test.cc
M be/src/util/thread-pool-test.cc
M be/src/util/uid-util-test.cc
M be/src/util/webserver-test.cc
73 files changed, 169 insertions(+), 421 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> yeah, but ideally any table change should trigger a new analyzing just like
You raise a good point. Maybe we should consider how to ensure at least 
table-level consistency during analysis. For example, one way could be to add 
referenced Tables into the Analyzer such that subsequent references will see 
the same Table object (instead of going to the Catalog every time).

Seems relatively easy to do.

Dimitris also has a good point that distinguishing the drop+add case from the 
invalidate case may be hard, but I think the proposed Analyzer change above 
should handle that, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


IMPALA-4008: don't bake in hash table and hash join pointers

This fixes some of the cases where memory addresses are baked into
codegen'd code.

Testing:
Ran exhaustive build.

Perf:
Ran a local perf run. No significant changes. I was able to see some
small improvements on microbenchmarks.


+---+---+-++++
| Workload  | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |

+---+---+-++++
| TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88   | 
+0.34% |

+---+---+-++++


+---+--+---++-++++-+---+
| Workload  | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |

+---+--+---++-++++-+---+
| TPCH(_20) | TPCH-Q2  | parquet / none / none | 2.12   | 1.89|   
+12.29%  | * 10.85% * | * 20.30% * | 1   | 10|
| TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84   | 9.34|   
+5.39%   |   9.01%|   3.79%| 1   | 10|
| TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61  | 14.19   |   
+2.97%   |   2.15%|   1.52%| 1   | 10|
| TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76  | 14.35   |   
+2.82%   |   3.20%|   2.64%| 1   | 10|
| TPCH(_20) | TPCH-Q9  | parquet / none / none | 13.72  | 13.54   |   
+1.30%   |   1.75%|   0.70%| 1   | 10|
| TPCH(_20) | TPCH-Q8  | parquet / none / none | 5.71   | 5.64|   
+1.30%   |   1.21%|   1.23%| 1   | 10|
| TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35  | 46.75   |   
+1.28%   |   2.39%|   1.88%| 1   | 10|
| TPCH(_20) | TPCH-Q5  | parquet / none / none | 4.57   | 4.52|   
+1.20%   |   1.30%|   0.88%| 1   | 10|
| TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07   | 2.05|   
+1.12%   |   2.59%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45   | 1.45|   
+0.15%   |   2.69%|   2.06%| 1   | 10|
| TPCH(_20) | TPCH-Q3  | parquet / none / none | 4.65   | 4.65|   
-0.09%   |   2.12%|   2.17%| 1   | 10|
| TPCH(_20) | TPCH-Q4  | parquet / none / none | 3.22   | 3.23|   
-0.26%   |   1.03%|   1.33%| 1   | 10|
| TPCH(_20) | TPCH-Q7  | parquet / none / none | 15.84  | 15.92   |   
-0.50%   |   0.91%|   1.15%| 1   | 10|
| TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29   | 3.31|   
-0.59%   |   3.31%|   1.58%| 1   | 10|
| TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65   | 2.67|   
-0.78%   |   3.03%|   1.46%| 1   | 10|
| TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50   | 4.55|   
-1.19%   |   2.87%|   2.45%| 1   | 10|
| TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84   | 3.91|   
-1.76%   |   2.20%|   1.94%| 1   | 10|
| TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58   | 5.70|   
-2.00%   |   1.01%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84  | 23.42   |   
-2.47%   |   0.68%|   0.56%| 1   | 10|
| TPCH(_20) | TPCH-Q1  | parquet / none / none | 11.25  | 11.60   |   
-3.06%   |   0.48%|   1.74%| 1   | 10|
| TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81   | 3.98|   
-4.38%   |   1.62%|   1.14%| 1   | 10|
| TPCH(_20) | TPCH-Q6  | parquet / none / none | 1.94   | 2.04|   
-4.85%   |   2.40%|   1.58%| 1   | 10|

+---+--+---++-++++-+---+


++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |

++---+-++++

[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.

2016-09-13 Thread Anonymous Coward (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4309

to look at the new patch set (#4).

Change subject: IMPALA-4028: Trim sentry config file path spaces while impala 
start.
..

IMPALA-4028: Trim sentry config file path spaces while impala start.

When sentry config file path is wrong input which end contains spaces, 
impala start up failed.

Trimming sentry config file path spaces and checking file again while
the original file has been checked failed will solve this problem.

Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634
---
M fe/src/main/java/com/cloudera/impala/authorization/SentryConfig.java
1 file changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/4309/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: davy...@163.com
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: davy...@163.com


[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.

2016-09-13 Thread Anonymous Coward (Code Review)
davy...@163.com has posted comments on this change.

Change subject: IMPALA-4028: Trim sentry config file path spaces while impala 
start.
..


Patch Set 3:

We have developed a configuration management system through which command line 
parameters can be set.
Thank you for your advice. I will use quotes to show the file name path in the 
error message.

-- 
To view, visit http://gerrit.cloudera.org:8080/4309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: davy...@163.com
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: davy...@163.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 4:

> > I'd like you to run the small suite of cluster tests, but you
 > can't
 > > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/
 > > I realize there aren't that many, and they may not find anything,
 > > but it's a habit we need to start. Thanks.
 > 
 > Will do, looks like your patch just got merged. I will rebase my
 > patch and then run the tests. Will update this review once I'm
 > done.

@Michael, rebased this patch and ran all the units tests (test_cluster.py and 
test_query_objects.py). All tests passed.

-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Huaisi Xu (Code Review)
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> I think it may be hard to differentiate between a table dropped + added and
why we want to differentiate this? we do not do this now as well right?


Line 188: private final HashMap tblIdMap_ =
> I think that behavior is actually desired. If a table with the same name wa
yeah, but ideally any table change should trigger a new analyzing just like 
incomplete table would do in the middle of analyze. so we do not have the case 
when for 100 sub queries, half of them are analyzed based on table version a, 
the second half on version b. and backend ends up with two version of the same 
table as well. If we keep them the same table ID for the same table name, then 
at least backend can do things on a single version of table? I know that may 
not matter.. anyway, I will do what you guys suggested... thanks


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests.
..


Patch Set 4: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.

2016-09-13 Thread Alex Behm (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4352

to look at the new patch set (#4).

Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests.
..

IMPALA-4091: Fix backend unit to log in logs/be_tests.

1. Many backend unit tests did not follow proper initialization
using InitCommonRuntime(), and as a result did not write their
logs into the logs/be_tests directory.

2. Added an IMPALA_TEST_MAIN() macro that stamps out the common
main() function used in most gtest unit tests.

3. Tests added via ADD_UDF_TEST in a CMakeLists.txt did not
have the logging dir set up properly.

Testing: I validated that every test produces a corresponding
.INFO file in logs/be_tests. The only exception is promise-test
for which I added a TODO since the fix seems non-trivial.

Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
---
M be/CMakeLists.txt
M be/src/codegen/instruction-counter-test.cc
M be/src/common/atomic-test.cc
M be/src/common/init.h
M be/src/exec/delimited-text-parser-test.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/incr-stats-util-test.cc
M be/src/exec/incr-stats-util.cc
M be/src/exec/old-hash-table-test.cc
M be/src/exec/parquet-plain-test.cc
M be/src/exec/parquet-version-test.cc
M be/src/exec/read-write-util-test.cc
M be/src/exec/row-batch-list-test.cc
M be/src/exec/zigzag-test.cc
M be/src/experiments/string-search-sse-test.cc
M be/src/exprs/aggregate-functions-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/hdfs-fs-cache-test.cc
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/parallel-executor-test.cc
M be/src/runtime/raw-value-test.cc
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/string-buffer-test.cc
M be/src/runtime/string-value-test.cc
M be/src/runtime/thread-resource-mgr-test.cc
M be/src/runtime/timestamp-test.cc
M be/src/scheduling/backend-config-test.cc
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/hs2-util-test.cc
M be/src/service/query-options-test.cc
M be/src/service/session-expiry-test.cc
M be/src/testutil/gtest-util.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-test.cc
M be/src/util/benchmark-test.cc
M be/src/util/bit-util-test.cc
M be/src/util/bitmap-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/bloom-filter-test.cc
M be/src/util/coding-util-test.cc
M be/src/util/debug-util-test.cc
M be/src/util/decompress-test.cc
M be/src/util/dict-test.cc
M be/src/util/error-util-test.cc
M be/src/util/filesystem-util-test.cc
M be/src/util/fixed-size-hash-table-test.cc
M be/src/util/internal-queue-test.cc
M be/src/util/logging-support-test.cc
M be/src/util/lru-cache-test.cc
M be/src/util/metrics-test.cc
M be/src/util/parse-util-test.cc
M be/src/util/perf-counters-test.cc
M be/src/util/pretty-printer-test.cc
M be/src/util/promise-test.cc
M be/src/util/redactor-config-parser-test.cc
M be/src/util/redactor-test.cc
M be/src/util/redactor-unconfigured-test.cc
M be/src/util/rle-test.cc
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/string-parser-test.cc
M be/src/util/symbols-util-test.cc
M be/src/util/thread-pool-test.cc
M be/src/util/uid-util-test.cc
M be/src/util/webserver-test.cc
73 files changed, 169 insertions(+), 421 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/4352/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4361/6/be/.impala.doxy
File be/.impala.doxy:

Line 17
should we really delete all these existing comments?


-- 
To view, visit http://gerrit.cloudera.org:8080/4361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 4:

> I'd like you to run the small suite of cluster tests, but you can't
 > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/
 > I realize there aren't that many, and they may not find anything,
 > but it's a habit we need to start. Thanks.

Will do, looks like your patch just got merged. I will rebase my patch and then 
run the tests. Will update this review once I'm done.

-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..


IMPALA-4122: qgen: fix bitrotted cluster unit tests

There's a small set of pytest-style tests and associated conftest for
testing some of the cluster-related test infrastructure Python objects.
Going forward, I want unit tests for the query generator to be run as
part of patch acceptance (CI isn't necessary at this time).

This patch fixes a bitrotted test and moves the tests into the
tests-for-qgen directory. The moves were performed thusly:

 $ git mv conftest.py tests/
 $ git mv cluster_tests.py tests/test_cluster.py

Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Reviewed-on: http://gerrit.cloudera.org:8080/4404
Reviewed-by: Michael Brown 
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
D tests/comparison/conftest.py
A tests/comparison/tests/conftest.py
R tests/comparison/tests/test_cluster.py
3 files changed, 66 insertions(+), 50 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4118: extract encryption utils from BufferedBlockMgr

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-4118: extract encryption utils from BufferedBlockMgr
..

IMPALA-4118: extract encryption utils from BufferedBlockMgr

As groundwork for IMPALA-4118, extract encryption and integrity
verification routines into a separate module that can be used
by the new BufferPool.

Simplify the logic in BufferedBlockMgr by not distinguishing between the
integrity check and encryption options, which are controlled by the same
command line flag anyway.

This patch changes how the OpenSSL RNG is seeded. I consulted with the
original author of the code (Michael Yoder), who suggested that the new
approach would be appropriate: "I'd pick whatever implementation is
easiest for you. You could add entropy once per query, or once every X
keys (100 keys seems reasonable), or once every "convenient place". 4kb
is probably overkill, you could use 512b for example."

Testing:
Added some unit tests for the utilities. We already have coverage of the
BufferedBlockMgr with encryption in buffered-block-mgr-test. Also reduce
the number of iterations in the buffered-block-mgr-test to save some
testing time.

Change-Id: Ibebe431f69c3c569cbff68171beaa32ef2ab03b6
---
M be/src/common/init.cc
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
M be/src/util/CMakeLists.txt
A be/src/util/openssl-util-test.cc
A be/src/util/openssl-util.cc
A be/src/util/openssl-util.h
8 files changed, 461 insertions(+), 209 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/4389/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibebe431f69c3c569cbff68171beaa32ef2ab03b6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Matthew Jacobs (Code Review)
Hello Michael Brown, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4175

to look at the new patch set (#4).

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..

IMPALA-3718: Support subset of functional-query for Kudu

Adds initial support for the functional-query test workload
for Kudu tables.

There are a few issues that make loading the functional
schema difficult on Kudu:
 1) Kudu tables must have one or more columns that together
constitute a unique primary key.
   a) Primary key columns must currently be the first columns
  in the table definition (KUDU-1271).
   b) Primary key columns cannot be nullable (KUDU-1570).
 2) Kudu tables must be specified with distribution
parameters.

(1) limits the tables that can be loaded without ugly
workarounds. This patch only includes important tables that
are used for relevant tests, most notably the alltypes*
family. In particular, alltypesagg is important but it does
not have a set of columns that are non-nullable and form a unique
primary key. As a result, that table is created in Kudu with
a different name and an additional BIGINT column for a PK
that is a unique index and is generated at data loading time
using the ROW_NUMBER analytic function. A view is then
wrapped around the underlying table that matches the
alltypesagg schema exactly. When KUDU-1570 is resolved, this
can be simplified.

(2) requires some additional considerations and custom
syntax. As a result, the DDL to create the tables is
explicitly specified in CREATE_KUDU sections in the
functional_schema_constraints.csv, and an additional
DEPENDENT_LOAD_KUDU section was added to specify custom data
loading DML that differs from the existing DEPENDENT_LOAD.

TODO: IMPALA-4005: generate_schema_statements.py needs refactoring

Tests that are not relevant or not yet supported have been
marked with xfail and a skip where appropriate.

TODO: Support remaining functional tables/tests when possible.

Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
---
M bin/impala-config.sh
M testdata/bin/compute-table-stats.sh
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M tests/common/skip.py
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_exprs.py
M tests/query_test/test_queries.py
M tests/query_test/test_runtime_filters.py
M tests/query_test/test_scanners.py
M tests/query_test/test_tpcds_queries.py
18 files changed, 329 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/4175/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 4: Code-Review+2

Fixed the commit msg, carrying the +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4110: Apache RAT script on Impala tarballs.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder

2016-09-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
..


Patch Set 14:

(25 comments)

Here's a first set of comments, mostly focused on phjn.h.  I need to go back 
through phjn.cc, builder.h/cc.

http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 207: state, Substitute(PREPARE_FAILED_ERROR_MSG, "write"));
old code and agg uses state_->block_mgr()->MemLimitTooLowError().  Any reason 
this uses MemLimitExceeded() directly instead? Just wondering the reasoning.


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

Line 79:   /// Logically, the algorithm has the following modes:.
I think this comment could be a lot more useful if we explain how the phases 
work together to give the full algorithm.  Probably that could replace the 
state transition graph which I find confusing.


PS14, Line 81: partition them into hash_partitions_. The input can 
this comment is kind of disjoint since it talks about what the phase does, then 
where the input comes from, and then explains more about what the phase does.  
Let's format each phase consistently, maybe describe inputs first, then 
actions, then outputs?


PS14, Line 86: PROCESSING
why do we use PROCESSING term here rather than PARTITIONING, which would be 
more consistent?


PS14, Line 87: spilled
single spilled
(or remove single from #3 since having it in one place but not the other makes 
the reader wonder if there's a difference).


Line 89:   ///  table in memory, we perform the join, otherwise we spill 
the probe row.
this last sentence is confusing since it mostly restates what the first says 
adds a small amount of new info. Let's combine them.


PS14, Line 90: PROBING_SPILLED_PARTITION
it would be good to be more upfront about when this phase is used and how, for 
a spilled partition, we either do this phase or REPARTITIONING_PROBE phase.


PS14, Line 90: construct
what does this phase construct? isn't the hash table already constructed?


PS14, Line 91: walking
what does this mean? processing? 

also, from these comments it's not clear how these stages relate. You could 
read this to mean that #2 a fallback if we can't do #3, or that #3 is the 
probing that "perform the join" of #2.


PS14, Line 97: *
what does this star mean to say? We can go from REPARTITIONING_PROBE back to 
PROBING_SPILLED_PARTITION, right?


PS14, Line 143: buffer
write buffer?


PS14, Line 264: the
its


PS14, Line 267: the
that


PS14, Line 291: Walks
What does this mean. Iterates over?


PS14, Line 291: hash partitions
is this talking about the probe hash partitions or the builder's?


PS14, Line 384: builder_->hash_partitions
is this referring to the entries in the builder_->hash_partition_ array?


PS14, Line 385: This is not used when processing a single partition.
what does this mean?


PS14, Line 410:  we then iterate over all the probe rows
I can't tell if this means literally all the probe rows, or all the rows in 
this stream.  clarify this paragraph.


PS14, Line 432: and
this makes it sound like there are two conditions required to create a probe 
partition, but I think you mean this is implied by the first part.  "because"?


PS14, Line 436: preallocated
what does this mean?


PS14, Line 437: and
double and.  And is this saying the constructor prepares it or the caller 
should have already?  is this right:

... should be an unpinned stream that has been prepared for writing with ...


PS14, Line 443: should
will


PS14, Line 464: meaning
  : /// it has to call Close() on it) but transferred to the 
parent exec node
this doesn't seem right. don't we either Close() it or transfer it (not both), 
in PartitionedHashJoinNode::ProbePartition::Close()?


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS14, Line 409: DCHECK
DCHECK_EQ


Line 412:   children_.insert(children_.begin(), child_entry);
rather than having two special cases (here and line 395-399, how about making 
AddChildLocked() take the position iterator? This case passes begin() and then 
AddChild() case passes either end() or the middle iterator.


-- 
To view, visit http://gerrit.cloudera.org:8080/3873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88: LOGGER.debug("Success: " + str(zk_client))
> That's a good catch, but I think it might be OK as-is. This is the output o
Wfm.


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4175/3//COMMIT_MSG
Commit Message:

Line 45: marked with xfail and a message where appropriate.
marked with skip


-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 6: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88: LOGGER.debug("Success: " + str(zk_client))
> LOGGER.info? Otherwise it looks like we're trying to connect but never succ
That's a good catch, but I think it might be OK as-is. This is the output of a 
typical run:


Contents of HDFS root: [u'hbase', u'home', u'test-warehouse', u'tmp', 
u'user']
Connecting to Zookeeper host(s).
Waiting for HBase node: /hbase/master
Success: /hbase/master
Waiting for HBase node: /hbase/rs
Success: /hbase/rs 
[etc...]


The missing line would have been:


Success: 


If the connection attempt didn't succeed, I don't think it would be a mystery 
-- you'd get the exception msg, and the script would exit with an error. Is 
that acceptable? (Honestly, I think might have intended to take that line out.)


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88: LOGGER.debug("Success: " + str(zk_client))
LOGGER.info? Otherwise it looks like we're trying to connect but never succeed


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> I think that behavior is actually desired. If a table with the same name wa
I think it may be hard to differentiate between a table dropped + added and a 
table reloaded after an invalidate metadata.


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.

2016-09-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests.
..


Patch Set 3: Code-Review+1

Thanks for fixing this.

-- 
To view, visit http://gerrit.cloudera.org:8080/4352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> Isn't that ok? Once you drop+create a table with same name, it isn't essent
I think that behavior is actually desired. If a table with the same name was 
dropped+added it *should* have a different table id.


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> Alex, Bharath, another possibility is that given our asymmetry architecture
Isn't that ok? Once you drop+create a table with same name, it isn't 
essentially the "same" table anymore, for ex: its schema could've changed. IMO 
thats fine. I think this is what Alex meant when he mentioned that drop+create 
should create a new table ID.


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..


Patch Set 3:

(3 comments)

> (3 comments)
 > 
 > Looks good to me.
 > 
 > Can we please separate the tool integration from the fixes in two
 > separate patches?

Done, see https://gerrit.cloudera.org/#/c/4405

http://gerrit.cloudera.org:8080/#/c/4361/3/bin/check-rat-report.py
File bin/check-rat-report.py:

Line 21: # Usage:
> Can you describe the purpose of this tool? Where did this script come from?
Done. Moved to https://gerrit.cloudera.org/#/c/4405


Line 24: #  `git archive --prefix=Foo/ -o baz.tar.gz HEAD`
> can we make the examples more concrete?
Done


http://gerrit.cloudera.org:8080/#/c/4361/3/testdata/bin/cache_tables.py
File testdata/bin/cache_tables.py:

Line 22: # query.  This only works on a mini-dfs cluster.  This is remarkably 
difficult to do 
> fix whitespace while here
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#6).

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..

IMPALA-4110: Clean up issues found by Apache RAT.

The script and support files are in another patch:

https://gerrit.cloudera.org/#/c/4405/

Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
---
M LICENSE.txt
M be/.astylerc
M be/.impala.doxy
M be/src/benchmarks/expr-benchmark.cc
M be/src/experiments/data-provider.cc
M be/src/experiments/hashing/cache-hash-test.cc
M be/src/experiments/hashing/growing-test.cc
M be/src/experiments/hashing/interface/cache-hash-test.cc
M be/src/experiments/hashing/interface/growing-test.cc
M be/src/experiments/hashing/multilevel/cache-hash-test.cc
M be/src/experiments/hashing/multilevel/growing-test.cc
M be/src/experiments/hashing/prefetch/cache-hash-test.cc
M be/src/experiments/hashing/prefetch/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc
M be/src/experiments/hashing/split-benchmarks/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc
M be/src/experiments/hashing/streaming/cache-hash-test.cc
M be/src/experiments/hashing/streaming/growing-test.cc
M be/src/testutil/certificates-info.txt
M be/src/util/asan.h
M bin/README-RUNNING-BENCHMARKS
D bin/cpplint.py
M bin/impala-ipython
M bin/impala-py.test
M bin/impala-python
M cmake_modules/FindAvro.cmake
M cmake_modules/FindBreakpad.cmake
M cmake_modules/FindBzip2.cmake
M cmake_modules/FindGFlags.cmake
M cmake_modules/FindGLog.cmake
M cmake_modules/FindHDFS.cmake
M cmake_modules/FindJNI.cmake
M cmake_modules/FindLdap.cmake
M cmake_modules/FindLlvm.cmake
M cmake_modules/FindLlvmBinaries.cmake
M cmake_modules/FindLz4.cmake
M cmake_modules/FindOpenSSL.cmake
M cmake_modules/FindPProf.cmake
M cmake_modules/FindRapidJson.cmake
M cmake_modules/FindRe2.cmake
M cmake_modules/FindSnappy.cmake
M cmake_modules/FindThrift.cmake
M cmake_modules/FindZlib.cmake
M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java
M fe/src/main/java/com/cloudera/impala/catalog/MapType.java
M fe/src/main/java/com/cloudera/impala/catalog/StructType.java
M fe/src/main/java/com/cloudera/impala/planner/Planner.java
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hive-log4j.properties.template
M fe/src/test/resources/log4j.properties.template
M infra/python/deps/download_requirements
M infra/python/deps/requirements.txt
M testdata/avro_schema_resolution/create_table.sql
M testdata/bin/cache_tables.py
M testdata/bin/create-mini.sql
M testdata/bin/load-dependent-tables.sql
M testdata/bin/load-hive-builtins.sh
M testdata/bin/run-hive.sh
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver
M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode
M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode
M testdata/cluster/node_templates/common/etc/init.d/yarn-common
M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager
M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager
D testdata/data/mstr/eatwh1/EATWH1-DDL.sql
D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql
D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/FT1.TXT
D testdata/data/mstr/eatwh1/csv/FT10.TXT
D testdata/data/mstr/eatwh1/csv/FT11.TXT
D testdata/data/mstr/eatwh1/csv/FT12.TXT
D testdata/data/mstr/eatwh1/csv/FT13.TXT
D testdata/data/mstr/eatwh1/csv/FT14.TXT
D testdata/data/mstr/eatwh1/csv/FT15.TXT
D testdata/data/mstr/eatwh1/csv/FT17.TXT
D testdata/data/mstr/eatwh1/csv/FT2.TXT
D testdata/data/mstr/eatwh1/csv/FT3.TXT
D testdata/data/mstr/eatwh1/csv/FT4.TXT
D testdata/data/mstr/eatwh1/csv/FT5.TXT
D testdata/data/mstr/eatwh1/csv/FT6.TXT
D testdata/data/mstr/eatwh1/csv/FT7.TXT
D testdata/data/mstr/eatwh1/csv/FT8.TXT
D testdata/data/mstr/eatwh1/csv/FT9.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_CURR.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_ORDERS.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_Q1_1997.TXT
D testdata/data/mstr/eatwh1/csv/INVENTOR

[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4110: Apache RAT script on Impala tarballs.
..


Patch Set 1:

Separated from https://gerrit.cloudera.org/#/c/4361/ following a request by 
Alex.

-- 
To view, visit http://gerrit.cloudera.org:8080/4405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#5).

Change subject: IMPALA-4110: Clean up issues found by Apache RAT.
..

IMPALA-4110: Clean up issues found by Apache RAT.

The script and support files are in another patch:

https://gerrit.cloudera.org/#/c/4405/

Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
---
M LICENSE.txt
M be/.astylerc
M be/.impala.doxy
M be/src/benchmarks/expr-benchmark.cc
M be/src/experiments/data-provider.cc
M be/src/experiments/hashing/cache-hash-test.cc
M be/src/experiments/hashing/growing-test.cc
M be/src/experiments/hashing/interface/cache-hash-test.cc
M be/src/experiments/hashing/interface/growing-test.cc
M be/src/experiments/hashing/multilevel/cache-hash-test.cc
M be/src/experiments/hashing/multilevel/growing-test.cc
M be/src/experiments/hashing/prefetch/cache-hash-test.cc
M be/src/experiments/hashing/prefetch/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc
M be/src/experiments/hashing/split-benchmarks/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc
M be/src/experiments/hashing/streaming/cache-hash-test.cc
M be/src/experiments/hashing/streaming/growing-test.cc
M be/src/testutil/certificates-info.txt
M be/src/util/asan.h
M bin/README-RUNNING-BENCHMARKS
D bin/cpplint.py
M bin/impala-ipython
M bin/impala-py.test
M bin/impala-python
M cmake_modules/FindAvro.cmake
M cmake_modules/FindBreakpad.cmake
M cmake_modules/FindBzip2.cmake
M cmake_modules/FindGFlags.cmake
M cmake_modules/FindGLog.cmake
M cmake_modules/FindHDFS.cmake
M cmake_modules/FindJNI.cmake
M cmake_modules/FindLdap.cmake
M cmake_modules/FindLlvm.cmake
M cmake_modules/FindLlvmBinaries.cmake
M cmake_modules/FindLz4.cmake
M cmake_modules/FindOpenSSL.cmake
M cmake_modules/FindPProf.cmake
M cmake_modules/FindRapidJson.cmake
M cmake_modules/FindRe2.cmake
M cmake_modules/FindSnappy.cmake
M cmake_modules/FindThrift.cmake
M cmake_modules/FindZlib.cmake
M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java
M fe/src/main/java/com/cloudera/impala/catalog/MapType.java
M fe/src/main/java/com/cloudera/impala/catalog/StructType.java
M fe/src/main/java/com/cloudera/impala/planner/Planner.java
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hive-log4j.properties.template
M fe/src/test/resources/log4j.properties.template
M infra/python/deps/download_requirements
M infra/python/deps/requirements.txt
M testdata/avro_schema_resolution/create_table.sql
M testdata/bin/cache_tables.py
M testdata/bin/create-mini.sql
M testdata/bin/load-dependent-tables.sql
M testdata/bin/load-hive-builtins.sh
M testdata/bin/run-hive.sh
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver
M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode
M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode
M testdata/cluster/node_templates/common/etc/init.d/yarn-common
M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager
M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager
D testdata/data/mstr/eatwh1/EATWH1-DDL.sql
D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql
D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/FT1.TXT
D testdata/data/mstr/eatwh1/csv/FT10.TXT
D testdata/data/mstr/eatwh1/csv/FT11.TXT
D testdata/data/mstr/eatwh1/csv/FT12.TXT
D testdata/data/mstr/eatwh1/csv/FT13.TXT
D testdata/data/mstr/eatwh1/csv/FT14.TXT
D testdata/data/mstr/eatwh1/csv/FT15.TXT
D testdata/data/mstr/eatwh1/csv/FT17.TXT
D testdata/data/mstr/eatwh1/csv/FT2.TXT
D testdata/data/mstr/eatwh1/csv/FT3.TXT
D testdata/data/mstr/eatwh1/csv/FT4.TXT
D testdata/data/mstr/eatwh1/csv/FT5.TXT
D testdata/data/mstr/eatwh1/csv/FT6.TXT
D testdata/data/mstr/eatwh1/csv/FT7.TXT
D testdata/data/mstr/eatwh1/csv/FT8.TXT
D testdata/data/mstr/eatwh1/csv/FT9.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_CURR.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_ORDERS.TXT
D testdata/data/mstr/eatwh1/csv/INVENTORY_Q1_1997.TXT
D testdata/data/mstr/eatwh1/csv/INVENTOR

[Impala-ASF-CR] IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#4).

Change subject: IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.
..

IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.

While I'm here, fix as many of the warnings as possible.

The instructions on how to run this are in check-rat-report.py:

1. Either download a release candidate or create an archive from your
   git repo using `git archive --prefix=Foo/ -o baz.tar.gz HEAD`

2. Untar and run `java -jar apache-rat-0.12.jar -x . >baz.xml`. Only
   RAT version 0.12 is known to work at this time.

3. `bin/check-rat-report.py bin/rat_exclude_files.txt baz.xml`

Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
---
M LICENSE.txt
M be/.astylerc
M be/.impala.doxy
M be/src/benchmarks/expr-benchmark.cc
M be/src/experiments/data-provider.cc
M be/src/experiments/hashing/cache-hash-test.cc
M be/src/experiments/hashing/growing-test.cc
M be/src/experiments/hashing/interface/cache-hash-test.cc
M be/src/experiments/hashing/interface/growing-test.cc
M be/src/experiments/hashing/multilevel/cache-hash-test.cc
M be/src/experiments/hashing/multilevel/growing-test.cc
M be/src/experiments/hashing/prefetch/cache-hash-test.cc
M be/src/experiments/hashing/prefetch/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc
M be/src/experiments/hashing/split-benchmarks/growing-test.cc
M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc
M be/src/experiments/hashing/streaming/cache-hash-test.cc
M be/src/experiments/hashing/streaming/growing-test.cc
M be/src/testutil/certificates-info.txt
M be/src/util/asan.h
M bin/README-RUNNING-BENCHMARKS
D bin/cpplint.py
M bin/impala-ipython
M bin/impala-py.test
M bin/impala-python
M cmake_modules/FindAvro.cmake
M cmake_modules/FindBreakpad.cmake
M cmake_modules/FindBzip2.cmake
M cmake_modules/FindGFlags.cmake
M cmake_modules/FindGLog.cmake
M cmake_modules/FindHDFS.cmake
M cmake_modules/FindJNI.cmake
M cmake_modules/FindLdap.cmake
M cmake_modules/FindLlvm.cmake
M cmake_modules/FindLlvmBinaries.cmake
M cmake_modules/FindLz4.cmake
M cmake_modules/FindOpenSSL.cmake
M cmake_modules/FindPProf.cmake
M cmake_modules/FindRapidJson.cmake
M cmake_modules/FindRe2.cmake
M cmake_modules/FindSnappy.cmake
M cmake_modules/FindThrift.cmake
M cmake_modules/FindZlib.cmake
M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java
M fe/src/main/java/com/cloudera/impala/catalog/MapType.java
M fe/src/main/java/com/cloudera/impala/catalog/StructType.java
M fe/src/main/java/com/cloudera/impala/planner/Planner.java
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hive-log4j.properties.template
M fe/src/test/resources/log4j.properties.template
M infra/python/deps/download_requirements
M infra/python/deps/requirements.txt
M testdata/avro_schema_resolution/create_table.sql
M testdata/bin/cache_tables.py
M testdata/bin/create-mini.sql
M testdata/bin/load-dependent-tables.sql
M testdata/bin/load-hive-builtins.sh
M testdata/bin/run-hive.sh
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master
M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver
M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode
M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode
M testdata/cluster/node_templates/common/etc/init.d/yarn-common
M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager
M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager
D testdata/data/mstr/eatwh1/EATWH1-DDL.sql
D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql
D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT
D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT
D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT
D testdata/data/mstr/eatwh1/csv/FT1.TXT
D testdata/data/mstr/eatwh1/csv/FT10.TXT
D testdata/data/mstr/eatwh1/csv/FT11.TXT
D testdata/data/mstr/eatwh1/csv/FT12.TXT
D testdata/data/mstr/eatwh1/csv/FT13.TXT
D testdata/data/mstr/eatwh1/csv/FT14.TXT
D testdata/data/mstr/eatwh1/csv/FT15.TXT
D testdata/data/mstr/eatwh1/csv/FT17.TXT
D testdata/data/mstr/eatwh1/csv/FT2.TXT
D testdata/data/mstr/eatwh1/csv/FT3.TXT
D testdata/data/mstr/eatwh1/csv/FT4.TXT
D testdata/data/mstr/eatwh1/csv/FT5

[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3491: Use unique db in test_scanners.py and 
test_aggregation.py
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3491: Use unique db in test_scanners.py and 
test_aggregation.py
..


IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py

Testing: Ran the tests locally in a loop on exhaustive.
Did a private debug/exhaustive run.

Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c
Reviewed-on: http://gerrit.cloudera.org:8080/4339
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M 
testdata/workloads/functional-query/queries/QueryTest/aggregation_no_codegen_only.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_scanners.py
3 files changed, 20 insertions(+), 35 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4405

Change subject: IMPALA-4110: Apache RAT script on Impala tarballs.
..

IMPALA-4110: Apache RAT script on Impala tarballs.

Apache RAT is a tool for license auditing. It will be used as part of
the Apache release process. This patch includes a script for parsing
its output and a file containing a list of filename globs that should
be ignored. The script takes two command line parameters as input -
the filename of the ignored file globs, and the filename of the RAT
xml output.

Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
---
A bin/check-rat-report.py
A bin/rat_exclude_files.txt
2 files changed, 191 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/4405/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Huaisi Xu (Code Review)
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> this is a good idea! I will try to do this. but for 64bit thing I think it 
Alex, Bharath, another possibility is that given our asymmetry architecture, 
one impalad can drop and create a new table with the same name. if this table 
gets updated in a single catalog update to another impalad, then we might see 
same table with different table ID as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder

2016-09-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
..


Patch Set 14:

(31 comments)

Some comments and questions.

http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

PS14, Line 297: SCOPED_TIMER(build_timer_);
We used to have two different timers in PHJ for measuring the time to hash the 
input rows into different partitions and the time to build the hash tables. It 
would be good to be able to retain that fined grained tracking PHJBuilder.


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

Line 179: largest_fraction = max(largest_fraction,
DCHECK_GT(num_build_rows, 0);


PS14, Line 318: a hash table
hash tables


Line 341:   // building hash tables to avoid building hash tables for 
partitions we'll spill anyway.
That's an interesting comment. If I understand it correctly, it means that 
calling InitSpilledPartitionProbeStreams() may cause more partitions to be 
spilled as it needs to pin write buffers for the spilled streams so we should 
make sure we allocate all the write streams to trigger all the extra spills 
which may happen before building hash tables. If so, would you mind elaborating 
a bit in the comment:

"Allocate probe buffers for all partitions that are already spilled. Do this 
before building hash tables as allocating probe buffers may cause some more 
partitions to be spilled. This avoids wasted work on building hash tables for 
partitions which end up being spilled eventually."

Do you know how often this case happens ? I suppose this complication will be 
gone eventually once reservation is in place.


Line 401:   RETURN_IF_ERROR(SpillPartition());
May help to document that failure to find any partition to spill (e.g. all 
partitions are spilled) will return an error code. It looks as if we may be in 
an infinite loop on the first glance.


PS14, Line 529: PhjBuilder::HashTableStoresNulls()
This seems to belong to PartitionedHashJoinNode conceptually.


Line 646:   do {
Please see comments in BlockingJoinNode,  it would be great to retain timers 
for InsertBatch() and ProcessBuildInput() separately for finer grain tracking.


PS14, Line 782: process_build_batch_fn_level0
'process_build_batch_fn_level0_'


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

Line 425: 
nit: unnecessary blank line ?


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

PS14, Line 151: 
  : 
  : 
This may be important information to retain.


PS14, Line 87: Expr::CreateExprTree(pool_, eq_join_conjunct.right
Does this overlap with CreateExprTree() in PhjBuilder::Init() ? Same question 
for build_expr_ctxs_. If they are not redundant, mind commenting on how they 
are used differently ?


Line 213:   for (unique_ptr& partition : spilled_partitions_)
missing { }


PS14, Line 494: to output
outputting


PS14, Line 585: hash_partitions_
'hash_partitions_'


Line 594:   continue;
Is there a reason why we cannot call OutputUnmatchedBuild() directly at this 
point ?


PS14, Line 996: s)
next_state ?


http://gerrit.cloudera.org:8080/#/c/3873/12/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

PS12, Line 81: hash_partitions_
'hash_partitions_'


PS12, Line 88: probe_hash_partitions_
'probe_hash_partitions_'


PS12, Line 112: input_partition_
'input_partition_'


http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

PS14, Line 437: 
Is this merged into build_timer_ in BlockingJoinNode now ? It may be helpful 
for debugging to maintain two separate timers.


PS14, Line 81: hash_partitions_
'hash_partitions_'


PS14, Line 84: This is the only phase 
These are the only phases


PS14, Line 88: probe_hash_partitions_
'probe_hash_partitions_'


PS14, Line 112: input_partition_
'input_partition_'


PS14, Line 118: input_partition_
'input_partition_'


PS14, Line 298: spilled_partitions_
'spilled_partitions_'


PS14, Line 302: probe_batch_pos_
'probe_batch_pos_'


PS14, Line 306: probe_batch_pos_
'probe_batch_pos_'


PS14, Line 309: input_partition_
'input_partition_'


PS14, Line 408: null_aware_
'null_aware_'


Line 445: /// block cannot be acquired. "delete_on_read" mode is used, so 
blocks in the stream
... used for the buffered tuple stream, so..


-- 
To view, visit http://gerrit.cloudera.org:8080/3873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit

[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Huaisi Xu (Code Review)
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> +1. Given this happens only with "invalidate metadata", how about we change
this is a good idea! I will try to do this. but for 64bit thing I think it is 
not necessary if we reuse table ID(like Alex suggested). the reason is that the 
new largest possible IDs is lower bounded by our current implementation. we do 
not have a problem now so probably that can go to another patch. For this one, 
I think I will only focus on enforcing unique table ID.

Thank you both!


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> +1. Given this happens only with "invalidate metadata", how about we change
This seems like the simplest solution to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3949: Log the error message in 
FileSystemUtil.copyToLocal()
..


IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()

To improve supportability, this commit logs the actual error stack trace
that can cause FileSystemUtil.copyToLocal() to fail. Additionaly this
also cleans up FileSystemUtil.isPathReachable() method to throw an
exception on failures rather than returning false and then returning
the error message as a string to callers.

Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8
Reviewed-on: http://gerrit.cloudera.org:8080/4125
Reviewed-by: Bharath Vissapragada 
Tested-by: Internal Jenkins
---
M fe/src/main/java/com/cloudera/impala/analysis/HdfsUri.java
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/common/FileSystemUtil.java
M fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java
4 files changed, 19 insertions(+), 33 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Brock Noland 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3949: Log the error message in 
FileSystemUtil.copyToLocal()
..


Patch Set 10: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Brock Noland 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
> Why do we need another map? Don't we already have a map of existing tables 
+1. Given this happens only with "invalidate metadata", how about we change 
just the reset() to re-use existing Ids (by looking up from the existing 
dbCache, rather than resetting to 0) and other places just uses a new unique 
id. This solves two issues.

1. Invalidate won't change the Id.
2. drop+create of same table creates a new Id.

Going by Tim's stats, if we use a 64-bit, we can use an ever-increasing Id. 
Thoughts?


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 2: Code-Review+1

> My thought with xfail was that I wanted them to show up as xfail so
 > we see them and have an incentive to fix them in the future.

That's fair. I was coming from the standpoint of:

1. No one really reads the ends of these tests reports and notices an increase 
or decrease in XFAILs or SKIPs. Nothing consumes those counts, or tracks them 
on a wallboard, for instance.

2. When reading test code, it's odd to see explicit xfail() calls. I associate 
xfail() with "I am executing this test and expect it to fail." Typically I see 
it as a mark on a test.

However, I don't feel strongly about these things. If the xfails incentivize 
you, that's fine. They can stay.

Thanks too for running an exhaustive test. I think you should probably rebase 
this before you submit and run GVO.

-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4326/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 308: uint8_t* ALWAYS_INLINE cur_expr_values() const { return 
cur_expr_values_; }
> let's add a short comment here too:
Done


PS4, Line 310: .
> for the current row.
Done


PS4, Line 400: into
> This one should have stayed "in" (or "of"), right? i.e. it doesn't write *e
Done


Line 427:   uint32_t HashVariableLenRow(uint8_t* expr_values, uint8_t* 
expr_values_null) const;
> how about making these const as well.
Done


Line 441:   TupleRow* build_row, uint8_t* expr_values, uint8_t* 
expr_values_null) const;
> these too could be const to clarify they are "in" params.
Done. Also made the various TupleRow arguments const too for consistency. This 
all led to a few additional consts.


-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4326

to look at the new patch set (#5).

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..

IMPALA-4008: don't bake in hash table and hash join pointers

This fixes some of the cases where memory addresses are baked into
codegen'd code.

Testing:
Ran exhaustive build.

Perf:
Ran a local perf run. No significant changes. I was able to see some
small improvements on microbenchmarks.


+---+---+-++++
| Workload  | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |

+---+---+-++++
| TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88   | 
+0.34% |

+---+---+-++++


+---+--+---++-++++-+---+
| Workload  | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |

+---+--+---++-++++-+---+
| TPCH(_20) | TPCH-Q2  | parquet / none / none | 2.12   | 1.89|   
+12.29%  | * 10.85% * | * 20.30% * | 1   | 10|
| TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84   | 9.34|   
+5.39%   |   9.01%|   3.79%| 1   | 10|
| TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61  | 14.19   |   
+2.97%   |   2.15%|   1.52%| 1   | 10|
| TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76  | 14.35   |   
+2.82%   |   3.20%|   2.64%| 1   | 10|
| TPCH(_20) | TPCH-Q9  | parquet / none / none | 13.72  | 13.54   |   
+1.30%   |   1.75%|   0.70%| 1   | 10|
| TPCH(_20) | TPCH-Q8  | parquet / none / none | 5.71   | 5.64|   
+1.30%   |   1.21%|   1.23%| 1   | 10|
| TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35  | 46.75   |   
+1.28%   |   2.39%|   1.88%| 1   | 10|
| TPCH(_20) | TPCH-Q5  | parquet / none / none | 4.57   | 4.52|   
+1.20%   |   1.30%|   0.88%| 1   | 10|
| TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07   | 2.05|   
+1.12%   |   2.59%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45   | 1.45|   
+0.15%   |   2.69%|   2.06%| 1   | 10|
| TPCH(_20) | TPCH-Q3  | parquet / none / none | 4.65   | 4.65|   
-0.09%   |   2.12%|   2.17%| 1   | 10|
| TPCH(_20) | TPCH-Q4  | parquet / none / none | 3.22   | 3.23|   
-0.26%   |   1.03%|   1.33%| 1   | 10|
| TPCH(_20) | TPCH-Q7  | parquet / none / none | 15.84  | 15.92   |   
-0.50%   |   0.91%|   1.15%| 1   | 10|
| TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29   | 3.31|   
-0.59%   |   3.31%|   1.58%| 1   | 10|
| TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65   | 2.67|   
-0.78%   |   3.03%|   1.46%| 1   | 10|
| TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50   | 4.55|   
-1.19%   |   2.87%|   2.45%| 1   | 10|
| TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84   | 3.91|   
-1.76%   |   2.20%|   1.94%| 1   | 10|
| TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58   | 5.70|   
-2.00%   |   1.01%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84  | 23.42   |   
-2.47%   |   0.68%|   0.56%| 1   | 10|
| TPCH(_20) | TPCH-Q1  | parquet / none / none | 11.25  | 11.60   |   
-3.06%   |   0.48%|   1.74%| 1   | 10|
| TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81   | 3.98|   
-4.38%   |   1.62%|   1.14%| 1   | 10|
| TPCH(_20) | TPCH-Q6  | parquet / none / none | 1.94   | 2.04|   
-4.85%   |   2.40%|   1.58%| 1   | 10|

+---+--+---++-++++-+---+


++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |

+---

[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 2:

> (8 comments)
 > 
 > Thanks for the feedback.
 > 
 > My thought with xfail was that I wanted them to show up as xfail so
 > we see them and have an incentive to fix them in the future. We
 > should be able to add support for the remaining tables once kudu
 > addresses the JIRAs I mentioned in the commit comment. I can change
 > them to skips if you still prefer.

BTW I forgot to mention, I tested this with an exhaustive test run which 
completed successfully.

-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 8: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.
..


Patch Set 3:

(3 comments)

Looks good to me.

Can we please separate the tool integration from the fixes in two separate 
patches?

http://gerrit.cloudera.org:8080/#/c/4361/3/bin/check-rat-report.py
File bin/check-rat-report.py:

Line 21: # Usage:
Can you describe the purpose of this tool? Where did this script come from?


Line 24: #  `git archive --prefix=Foo/ -o baz.tar.gz HEAD`
can we make the examples more concrete?


http://gerrit.cloudera.org:8080/#/c/4361/3/testdata/bin/cache_tables.py
File testdata/bin/cache_tables.py:

Line 22: # query.  This only works on a mini-dfs cluster.  This is remarkably 
difficult to do 
fix whitespace while here


-- 
To view, visit http://gerrit.cloudera.org:8080/4361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..

IMPALA-3718: Support subset of functional-query for Kudu

Adds initial support for the functional-query test workload
for Kudu tables.

There are a few issues that make loading the functional
schema difficult on Kudu:
 1) Kudu tables must have one or more columns that together
constitute a unique primary key.
   a) Primary key columns must currently be the first columns
  in the table definition (KUDU-1271).
   b) Primary key columns cannot be nullable (KUDU-1570).
 2) Kudu tables must be specified with distribution
parameters.

(1) limits the tables that can be loaded without ugly
workarounds. This patch only includes important tables that
are used for relevant tests, most notably the alltypes*
family. In particular, alltypesagg is important but it does
not have a set of columns that are non-nullable and form a unique
primary key. As a result, that table is created in Kudu with
a different name and an additional BIGINT column for a PK
that is a unique index and is generated at data loading time
using the ROW_NUMBER analytic function. A view is then
wrapped around the underlying table that matches the
alltypesagg schema exactly. When KUDU-1570 is resolved, this
can be simplified.

(2) requires some additional considerations and custom
syntax. As a result, the DDL to create the tables is
explicitly specified in CREATE_KUDU sections in the
functional_schema_constraints.csv, and an additional
DEPENDENT_LOAD_KUDU section was added to specify custom data
loading DML that differs from the existing DEPENDENT_LOAD.

TODO: IMPALA-4005: generate_schema_statements.py needs refactoring

Tests that are not relevant or not yet supported have been
marked with xfail and a message where appropriate.

TODO: Support remaining functional tables/tests when possible.

Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
---
M bin/impala-config.sh
M testdata/bin/compute-table-stats.sh
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M tests/common/skip.py
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_exprs.py
M tests/query_test/test_queries.py
M tests/query_test/test_runtime_filters.py
M tests/query_test/test_scanners.py
M tests/query_test/test_tpcds_queries.py
18 files changed, 329 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/4175/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


Patch Set 18:

I carried the +2 and pushed to asf. Thanks, Zoltan.

-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has submitted this change and it was merged.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


IMPALA-3973: add position and occurrence to instr()

Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Reviewed-on: http://gerrit.cloudera.org:8080/4094
Tested-by: Internal Jenkins
Reviewed-by: Matthew Jacobs 
---
M be/src/experiments/CMakeLists.txt
R be/src/experiments/string-search-sse-test.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/string-search-test.cc
M be/src/runtime/string-search.h
M common/function-registry/impala_functions.py
9 files changed, 343 insertions(+), 20 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce unique table ID
..


Patch Set 2:

(1 comment)

I think we should avoid both forms of id inconsistency: Different tables with 
the same id, or the same table with different ids.

http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java:

Line 188: private final HashMap tblIdMap_ =
Why do we need another map? Don't we already have a map of existing tables that 
we can use to get the current id?

I think it could actually be problematic to only rely on the TTableName for 
identification. The user could drop+add the same table and I think in that case 
a new id should be assigned. Thoughts?


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


Patch Set 17: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu

2016-09-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3718: Support subset of functional-query for Kudu
..


Patch Set 1:

(8 comments)

Thanks for the feedback.

My thought with xfail was that I wanted them to show up as xfail so we see them 
and have an incentive to fix them in the future. We should be able to add 
support for the remaining tables once kudu addresses the JIRAs I mentioned in 
the commit comment. I can change them to skips if you still prefer.

http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

Line 594: print "Ignore partitions on Kudu"
> Include the db and table name here so we know what we're ignoring.
Done


http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

PS1, Line 598: SELECT row_number() over (order by year, month, id, day),
> For my education, was the ordering of "id, day" in the ORDER BY intentional
It doesn't really matter too much. This row_number() column gets hidden anyway. 
If anything I might have put id first. There is only 1 distinct year and 1 
distinct month, so they don't change the order.


PS1, Line 1063: text_comma_backslash_newline
> This table is defined as a kudu table in schema_constraints.csv L181, but i
Good catch it shouldn't be there. Thanks.


http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

PS1, Line 177: table_name:testtbl, constraint:only, table_format:kudu/none/none
> How is it that this constraint was already defined but only just now had a 
generate_schema_statements will generate a create table statement if it's not 
defined (using the first col as the PK and hash expr), but I wanted to have 
them explicitly defined.


http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/workloads/functional-query/queries/QueryTest/aggregation.test
File testdata/workloads/functional-query/queries/QueryTest/aggregation.test:

Line 836: select min(distinct NULL), max(distinct NULL) from alltypes
> Why this change? Thanks.
IMPALA-4042
alltypesagg is a view for kudu so this wouldn't work. The test still exercises 
the target functionality on a different table.


http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/workloads/functional-query/queries/QueryTest/create_kudu.test
File testdata/workloads/functional-query/queries/QueryTest/create_kudu.test:

PS1, Line 4: drop table if exists managed_kudu;
> Fwiw, this won't be needed once the commit lies atop https://gerrit.clouder
Thanks! Removing...


http://gerrit.cloudera.org:8080/#/c/4175/1/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

PS1, Line 333: # TINYINT). Bypass the type # checking by ignoring the 
actual types of the Avro
> Nit: You left the # when joining the line.
Done


PS1, Line 336:   if 'TIMESTAMP' in expected_types:
 : LOG.info("TIMESTAMP columns unsupported in %s, skipping 
verification." %\
 : file_format)
 : return
> It's weird to see this logic again.
this can be removed, thanks


-- 
To view, visit http://gerrit.cloudera.org:8080/4175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


Patch Set 17:

Dan, could you please +2 again? I added the missing test case and I also had to 
rebase to allow jenkins to verify the change. Thanks.

 > Please see comment about possible missing test case, but if nothing
 > else changes, this looks good.

-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..


Patch Set 5: Code-Review+1

Thanks for the review. Patch set 5 adds an ASF license notice to a file that 
previously didn't have it. Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4404/4/tests/comparison/tests/test_cluster.py
File tests/comparison/tests/test_cluster.py:

Line 51: ls = cluster.hdfs.create_client().list("/")
Ha, I just noticed this. Look familiar, mikeb?


-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Michael Brown (Code Review)
Hello David Knupp,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4404

to look at the new patch set (#5).

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..

IMPALA-4122: qgen: fix bitrotted cluster unit tests

There's a small set of pytest-style tests and associated conftest for
testing some of the cluster-related test infrastructure Python objects.
Going forward, I want unit tests for the query generator to be run as
part of patch acceptance (CI isn't necessary at this time).

This patch fixes a bitrotted test and moves the tests into the
tests-for-qgen directory. The moves were performed thusly:

 $ git mv conftest.py tests/
 $ git mv cluster_tests.py tests/test_cluster.py

Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
---
D tests/comparison/conftest.py
A tests/comparison/tests/conftest.py
R tests/comparison/tests/test_cluster.py
3 files changed, 66 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4404/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4326/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 308: uint8_t* ALWAYS_INLINE cur_expr_values() const { return 
cur_expr_values_; }
let's add a short comment here too:
/// Returns the current row's expression buffer. 
or something.


PS4, Line 310: .
for the current row.


PS4, Line 400: into
This one should have stayed "in" (or "of"), right? i.e. it doesn't write 
*expr_values.
(Could make them 'const uint8_t*' to clarify that too).


Line 427:   uint32_t HashVariableLenRow(uint8_t* expr_values, uint8_t* 
expr_values_null) const;
how about making these const as well.


Line 441:   TupleRow* build_row, uint8_t* expr_values, uint8_t* 
expr_values_null) const;
these too could be const to clarify they are "in" params.


-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 4:

I'd like you to run the small suite of cluster tests, but you can't yet until 
you rebase on top of https://gerrit.cloudera.org/#/c/4404/ I realize there 
aren't that many, and they may not find anything, but it's a habit we need to 
start. Thanks.

-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4404

Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests
..

IMPALA-4122: qgen: fix bitrotted cluster unit tests

There's a small set of pytest-style tests and associated conftest for
testing some of the cluster-related test infrastructure Python objects.
Going forward, I want unit tests for the query generator to be run as
part of patch acceptance (CI isn't necessary at this time).

This patch fixes a bitrotted test and moves the tests into the
tests-for-qgen directory. The moves were performed thusly:

 $ git mv conftest.py tests/
 $ git mv cluster_tests.py tests/test_cluster.py

Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
---
R tests/comparison/tests/conftest.py
R tests/comparison/tests/test_cluster.py
2 files changed, 6 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4404/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4326/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

PS3, Line 300: ExprValuesHash
> CurExprValuesHash()  [see below for motiviation]
Done


PS3, Line 303: SetExprValuesHash
> Let's rename this to more closely match cur_expr_values() rather than match
Done


PS3, Line 309: cur_expr_values_null
> comment explaining that there is one byte per null value (i.e. these are us
Done.

I'm not actually sure why we need this for codegen, since bool and uint8_t 
should both be represented as int8 internally in LLVM, but I don't want to get 
distracted digging into that, so I left a TODO.

There's some weirdness with vector where it uses bitfields but otherwise 
C++ bools should be 8 bits - maybe that was the reason.


PS3, Line 403: in
> the first time I read this, I read it as this functions stores the values t
Makes sense - done.


PS3, Line 403: in
> same
Done


PS3, Line 413: in
> same
Done


PS3, Line 431: This will be replaced by
 :   /// codegen.
> nit: move this sentence to end of the comment so it's standardized with sim
Done


PS3, Line 438: with 'cur_expr_values_'
> "to compare against the current row."  (since it also uses cur_expr_values_
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..


Patch Set 4: Code-Review+1

Carry Michael's +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers

2016-09-13 Thread Tim Armstrong (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4326

to look at the new patch set (#4).

Change subject: IMPALA-4008: don't bake in hash table and hash join pointers
..

IMPALA-4008: don't bake in hash table and hash join pointers

This fixes some of the cases where memory addresses are baked into
codegen'd code.

Testing:
Ran exhaustive build.

Perf:
Ran a local perf run. No significant changes. I was able to see some
small improvements on microbenchmarks.


+---+---+-++++
| Workload  | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |

+---+---+-++++
| TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88   | 
+0.34% |

+---+---+-++++


+---+--+---++-++++-+---+
| Workload  | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |

+---+--+---++-++++-+---+
| TPCH(_20) | TPCH-Q2  | parquet / none / none | 2.12   | 1.89|   
+12.29%  | * 10.85% * | * 20.30% * | 1   | 10|
| TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84   | 9.34|   
+5.39%   |   9.01%|   3.79%| 1   | 10|
| TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61  | 14.19   |   
+2.97%   |   2.15%|   1.52%| 1   | 10|
| TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76  | 14.35   |   
+2.82%   |   3.20%|   2.64%| 1   | 10|
| TPCH(_20) | TPCH-Q9  | parquet / none / none | 13.72  | 13.54   |   
+1.30%   |   1.75%|   0.70%| 1   | 10|
| TPCH(_20) | TPCH-Q8  | parquet / none / none | 5.71   | 5.64|   
+1.30%   |   1.21%|   1.23%| 1   | 10|
| TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35  | 46.75   |   
+1.28%   |   2.39%|   1.88%| 1   | 10|
| TPCH(_20) | TPCH-Q5  | parquet / none / none | 4.57   | 4.52|   
+1.20%   |   1.30%|   0.88%| 1   | 10|
| TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07   | 2.05|   
+1.12%   |   2.59%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45   | 1.45|   
+0.15%   |   2.69%|   2.06%| 1   | 10|
| TPCH(_20) | TPCH-Q3  | parquet / none / none | 4.65   | 4.65|   
-0.09%   |   2.12%|   2.17%| 1   | 10|
| TPCH(_20) | TPCH-Q4  | parquet / none / none | 3.22   | 3.23|   
-0.26%   |   1.03%|   1.33%| 1   | 10|
| TPCH(_20) | TPCH-Q7  | parquet / none / none | 15.84  | 15.92   |   
-0.50%   |   0.91%|   1.15%| 1   | 10|
| TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29   | 3.31|   
-0.59%   |   3.31%|   1.58%| 1   | 10|
| TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65   | 2.67|   
-0.78%   |   3.03%|   1.46%| 1   | 10|
| TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50   | 4.55|   
-1.19%   |   2.87%|   2.45%| 1   | 10|
| TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84   | 3.91|   
-1.76%   |   2.20%|   1.94%| 1   | 10|
| TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58   | 5.70|   
-2.00%   |   1.01%|   1.79%| 1   | 10|
| TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84  | 23.42   |   
-2.47%   |   0.68%|   0.56%| 1   | 10|
| TPCH(_20) | TPCH-Q1  | parquet / none / none | 11.25  | 11.60   |   
-3.06%   |   0.48%|   1.74%| 1   | 10|
| TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81   | 3.98|   
-4.38%   |   1.62%|   1.14%| 1   | 10|
| TPCH(_20) | TPCH-Q6  | parquet / none / none | 1.94   | 2.04|   
-4.85%   |   2.40%|   1.58%| 1   | 10|

+---+--+---++-++++-+---+


++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |

+---

[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


Patch Set 17: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3873: Add QueryStateAccessor

2016-09-13 Thread Henry Robinson (Code Review)
Henry Robinson has abandoned this change.

Change subject: IMPALA-3873: Add QueryStateAccessor
..


Abandoned

No time to work on this. Will pick up when I've implemented Marcel's 
suggestions as a comparison.

-- 
To view, visit http://gerrit.cloudera.org:8080/3851
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Iec3dae66a81988c99cde1516ff511186e17dd8c0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4111: backend death tests should not produce minidumps

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4111: backend death tests should not produce minidumps
..


Patch Set 4:

Somehow Lars' earlier comments got deleted. Reproducing from my email:



Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4353/3/be/src/util/minidump.h
File be/src/util/minidump.h:

Lars Volker has posted comments on this change.

Change subject: IMPALA-4111: backend death tests should not produce minidumps
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4353/3/be/src/util/minidump.h
File be/src/util/minidump.h:

PS3, Line 29:  for death test
nit: maybe remove or reword like "for example during death test" or "during 
tests"

-- 
To view, visit http://gerrit.cloudera.org:8080/4353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33037f4f77c8188fc2ec46a77083a0a64f6ea404
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4026: Implement double-buffering for BlockingQueue.

2016-09-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4026: Implement double-buffering for BlockingQueue.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4350/3/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

Line 99:   bool BlockingPut(const T& val) {
> I think there's a pretty bad bug with how the signalling works. 
Yes, I have thought about it before. I have tested various locations for 
notification:

1. notify_one once an entry is consumed from get_list by BlockingGet()
2. notify_all when the get_list is empty
3. notify_one when the get_list is empty

Apparently, option (3) seems to give the best performance.

I have investigated option (2) and it appears that the thundering herd effect 
causes all scanner threads to start immediately leading to contention in memory 
allocation (e.g. memset takes 6x longer), causing some TPCH queries to regress.

I suspect the slow down in option (1) may have to do the extra signaling 
overhead per entry or it could be a side effect of the poor struct layout 
before so it may be worth retrying again.

In practice, a consumer can consume a row batch faster than a scanner thread 
can produce it so the effect of option (3) could be that it scatters out when 
the scanner threads get unblocked.


-- 
To view, visit http://gerrit.cloudera.org:8080/4350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Chen Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4318/2/community.html
File community.html:

Line 118:   https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute
> Good idea!
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

2016-09-13 Thread Jim Apple (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4318

to look at the new patch set (#4).

Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
..

IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

While I'm in here, remove outdated roadmap.

Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
---
M bylaws.html
M community.html
R impala-docs.html
M index.html
M overview.html
5 files changed, 26 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4318/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

2016-09-13 Thread Jim Apple (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4318

to look at the new patch set (#3).

Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
..

IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

While I'm in here, remove outdated roadmap.

Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
---
M bylaws.html
M community.html
R impala-docs.html
M index.html
M overview.html
5 files changed, 26 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4318/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4011/3/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS3, Line 56: DEFAULT_HIVE_PORT = 11050
> This should be reverted back to 11050 to match what's already in use in the
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has uploaded a new patch set (#4).

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..

IMPALA-3980: qgen: re-enable Hive as a target database

Changes:

* Added hive cli options back in (removed in commit "Stress test: Various 
changes")
* Modifications so that if --use-hive is specified, a Hive connection is 
actually created
* A few minor bug fixes so that the RQG can be run locally
* Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a 
hard-coded
file under IMPALA_HOME

Testing:

* Hive integration tested locally by invoking the data generator via the 
command:

./data-generator.py \
--db-name=functional \
--use-hive \
--min-row-count=50 \
--max-row-count=100 \
--storage-file-formats textfile \
--use-postgresql \
--postgresql-user stakiar

and the discrepancy checker via the command:

./discrepancy-checker.py \
--db-name=functional \
--use-hive \
--use-postgresql \
--postgresql-user stakiar \
--test-db-type HIVE \
--timeout 300 \
--query-count 50 \
--profile hive

* The output of the above two commands is essentially the same as the Impala 
output,
however, about 20% of the queries will fail when the discrepancy checker is run
* Regression testing done by running Leopard in a local VM running Ubuntu 
14.04, and by
running the discrepancy checker against Impala while inside an Impala Docker 
container

Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
---
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/data_generator.py
3 files changed, 61 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com


[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py

2016-09-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3491: Use unique db in test_scanners.py and 
test_aggregation.py
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()

2016-09-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3949: Log the error message in 
FileSystemUtil.copyToLocal()
..


Patch Set 10: Code-Review+2

Rebased Carrying Alex's +2. Thanks Alex, Matt and Brock for reviews.

-- 
To view, visit http://gerrit.cloudera.org:8080/4125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Brock Noland 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4011/3/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS3, Line 56: DEFAULT_HIVE_PORT = 1
This should be reverted back to 11050 to match what's already in use in the 
Impala dev environment (for good or ill).


-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has posted comments on this change.

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4011/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPALA-3980: qgen: re-enable Hive as a target database
> I prefer commit messages that, after the bug, speak of the code section cha
Done


PS2, Line 19: * Hive integration tested locally by invoking the data generator 
via the command:
: 
: ./data-generator.py \
: --db-name=functional \
: --use-hive \
: --min-row-count=50 \
> While it makes the commit message have more lines, it is more readable to w
Done


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

PS2, Line 150:   group.add_argument('--hive-host', default=DEFAULT_HIVE_HOST,
 :   help="The name of the host running the HS2")
 :   group.add_argument("--hive-port", default=DEFAULT_HIVE_PORT, 
type=int,
 :   help="The port of HiveServer2")
> Why not derive these defaults from symbols imported from cluster.py? (The c
Done


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS2, Line 166: class MiniCluster(Cluster):
> This changes the MiniCluster constructor interface for all callers. It also
Done


PS2, Line 178:   def _init_local_hadoop_conf_dir(self):
> Thanks for doing the cleanups like this one.
No problem! It makes the code a lot easier to run locally.


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/data_generator.py
File tests/comparison/data_generator.py:

PS2, Line 101:   def populate_db(self, table_count, postgresql_conn=None, 
is_hive=False)
> We don't use variable names of the format "isHive" typically, so can you ch
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

2016-09-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4318/2/community.html
File community.html:

Line 118:   https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute
> So, perhaps I could rename "books" to "docs", then have that page link to t
Good idea!


-- 
To view, visit http://gerrit.cloudera.org:8080/4318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database

2016-09-13 Thread Anonymous Coward (Code Review)
stak...@cloudera.com has uploaded a new patch set (#3).

Change subject: IMPALA-3980: qgen: re-enable Hive as a target database
..

IMPALA-3980: qgen: re-enable Hive as a target database

Changes:

* Added hive cli options back in (removed in commit "Stress test: Various 
changes")
* Modifications so that if --use-hive is specified, a Hive connection is 
actually created
* A few minor bug fixes so that the RQG can be run locally
* Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a 
hard-coded
file under IMPALA_HOME

Testing:

* Hive integration tested locally by invoking the data generator via the 
command:

./data-generator.py \
--db-name=functional \
--use-hive \
--min-row-count=50 \
--max-row-count=100 \
--storage-file-formats textfile \
--use-postgresql \
--postgresql-user stakiar

and the discrepancy checker via the command:

./discrepancy-checker.py \
--db-name=functional \
--use-hive \
--use-postgresql \
--postgresql-user stakiar \
--test-db-type HIVE \
--timeout 300 \
--query-count 50 \
--profile hive

* The output of the above two commands is essentially the same as the Impala 
output,
however, about 20% of the queries will fail when the discrepancy checker is run
* Regression testing done by running Leopard in a local VM running Ubuntu 
14.04, and by
running the discrepancy checker against Impala while inside an Impala Docker 
container

Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
---
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/data_generator.py
3 files changed, 61 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com


[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py

2016-09-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3491: Use unique db in test_scanners.py and 
test_aggregation.py
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 8: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#8).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 173 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#7).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 173 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

2016-09-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in 
run-hbase.sh
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4348/6/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

PS6, Line 59: help=('Comma-delineated string of hosts 
in host:PORT format, '
> When is more than one zk host needed?
Don't know, but that's the way the API is written.


PS6, Line 135: # Confirm that HDFS is available. There is a pathological 
case where the HBase master
 : # can start up briefly if HDFS is not available, and then 
quit immediately, but that can
 : # be long enough to give a false positive that the HBase 
master is running.
 : #
 : try:
 : hdfs_client = InsecureClient('http://' + args.hdfs_host)
 : hdfs_client.list('/')
 : except requests.exceptions.ConnectionError as e:
 : msg = 'Could not connect to HDFS web host http://{0} - 
{1}'.format(args.hdfs_host, e)
 : LOGGER.error(msg)
 : sys.exit(1)
 : 
 : errors = check_znodes_list_for_errors(args.nodes, 
args.zookeeper_hosts, args.timeout)
> I meant for this section also to be a method, with errors returned.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Ishaan Joshi 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4026: Implement double-buffering for BlockingQueue.

2016-09-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4026: Implement double-buffering for BlockingQueue.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4350/3/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

Line 99:   bool BlockingPut(const T& val) {
I think there's a pretty bad bug with how the signalling works. 

Consider a queue with a limit of N. If the queue gets into a state where 
get_list has N elements and put_list has 0 elements with all of the producer 
threads waiting on it, then the caller to BlockingGet() will drain get_list 
before signalling put_cv,  even though it should wake up a producer thread once 
there is space in the queue. Worse, it will only wake up one producer thread 
each time BlockingGet() is called, which means that one only producer thread 
can run at a time.


-- 
To view, visit http://gerrit.cloudera.org:8080/4350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Chen Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3980: Re-enable Hive as a target database for the Random Query Generator

2016-09-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3980: Re-enable Hive as a target database for the Random 
Query Generator
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4011/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPALA-3980: Re-enable Hive as a target database for the Random 
Query Generator
I prefer commit messages that, after the bug, speak of the code section 
changed. So something like:

"IMPALA-3890: qgen: reenable Hive as a target database"


PS2, Line 19: * Hive integration tested locally by invoking the data generator 
(./data-generator.py
: --db-name=functional --use-hive --min-row-count=50 
--max-row-count=100
: --storage-file-formats textfile --use-postgresql 
--postgresql-user stakiar) and the
: discrepancy checker (./discrepancy-checker.py 
--db-name=functional --use-hive
: --use-postgresql --postgresql-user stakiar --test-db-type HIVE 
--timeout 300
: --query-count 50 --profile hive)
While it makes the commit message have more lines, it is more readable to write 
the commands with breaks across options, like this:

  ./data_generator.py \
--db-name=functional \
--use-hive \
--min-row-count=50 \
--max-row-count=100 \
--storage-file-formats textfile \
--use-postgresql \
--postgresql-user stakiar


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

PS2, Line 150:   group.add_argument('--hive-host', default='localhost',
 :   help="The name of the host running the HS2")
 :   group.add_argument("--hive-port", default=1, type=int,
 :   help="The port of HiveServer2")
Why not derive these defaults from symbols imported from cluster.py? (The 
constants I've asked you to create in that file)


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS2, Line 166:   def __init__(self, hive_host, hive_port, num_impalads=3):
This changes the MiniCluster constructor interface for all callers. It also 
implicitly changes the cli_options.create_cluster() call, which has several 
callers.

What do you think about defining 127.0.0.1 and 11050 (from the old code L179) 
as default hive host and port as constants at the top of this module? Then make 
hive_host and hive_port on this line optional, defaulting to those constants.


PS2, Line 178: node_conf_dir = os.environ["HADOOP_CONF_DIR"]
Thanks for doing the cleanups like this one.


http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/data_generator.py
File tests/comparison/data_generator.py:

PS2, Line 101:   def populate_db(self, table_count, postgresql_conn=None, 
isHive=False):
We don't use variable names of the format "isHive" typically, so can you change 
this to is_hive?


-- 
To view, visit http://gerrit.cloudera.org:8080/4011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: stak...@cloudera.com
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: stak...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
..


Patch Set 16: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/190/

-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.

2016-09-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4318/2/community.html
File community.html:

Line 118:   https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute
> So, perhaps I could rename "books" to "docs", then have that page link to t
Any more thoughts on this, Henry?


-- 
To view, visit http://gerrit.cloudera.org:8080/4318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-09-13 Thread Zoltan Ivanfi (Code Review)
Hello Lars Volker, Matthew Jacobs, Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4094

to look at the new patch set (#16).

Change subject: IMPALA-3973: add position and occurrence to instr()
..

IMPALA-3973: add position and occurrence to instr()

Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
---
M be/src/experiments/CMakeLists.txt
R be/src/experiments/string-search-sse-test.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/string-search-test.cc
M be/src/runtime/string-search.h
M common/function-registry/impala_functions.py
9 files changed, 343 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4094/16
-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Zoltan Ivanfi