[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/197/ -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] Make gen build version.py resilient to a failing git rev-parse
Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/4411 Change subject: Make gen_build_version.py resilient to a failing git rev-parse .. Make gen_build_version.py resilient to a failing git rev-parse It was noticed that some build processes did not checkout Impala and instead built it from a tarball. This would cause our gen_build_version script to write a blank version info everytime to the version.info file. This patch takes care of the case where if there is an already existing version.info file and we cannot get the git rev-parse output, we use the old file instead. Blank version info is written only when we don't have an old version.info file and we cannot do a git rev-parse. Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b --- M bin/gen_build_version.py M bin/save-version.sh 2 files changed, 23 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4411/1 -- To view, visit http://gerrit.cloudera.org:8080/4411 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id7af33502bbb70185dc15ffca6219436a616f25b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests. .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests. .. IMPALA-4091: Fix backend unit to log in logs/be_tests. 1. Many backend unit tests did not follow proper initialization using InitCommonRuntime(), and as a result did not write their logs into the logs/be_tests directory. 2. Added an IMPALA_TEST_MAIN() macro that stamps out the common main() function used in most gtest unit tests. 3. Tests added via ADD_UDF_TEST in a CMakeLists.txt did not have the logging dir set up properly. Testing: I validated that every test produces a corresponding .INFO file in logs/be_tests. The only exception is promise-test for which I added a TODO since the fix seems non-trivial. Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Reviewed-on: http://gerrit.cloudera.org:8080/4352 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M be/CMakeLists.txt M be/src/codegen/instruction-counter-test.cc M be/src/common/atomic-test.cc M be/src/common/init.h M be/src/exec/delimited-text-parser-test.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/incr-stats-util-test.cc M be/src/exec/incr-stats-util.cc M be/src/exec/old-hash-table-test.cc M be/src/exec/parquet-plain-test.cc M be/src/exec/parquet-version-test.cc M be/src/exec/read-write-util-test.cc M be/src/exec/row-batch-list-test.cc M be/src/exec/zigzag-test.cc M be/src/experiments/string-search-sse-test.cc M be/src/exprs/aggregate-functions-test.cc M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/collection-value-builder-test.cc M be/src/runtime/decimal-test.cc M be/src/runtime/free-pool-test.cc M be/src/runtime/free-pool.h M be/src/runtime/hdfs-fs-cache-test.cc M be/src/runtime/mem-pool-test.cc M be/src/runtime/mem-tracker-test.cc M be/src/runtime/multi-precision-test.cc M be/src/runtime/parallel-executor-test.cc M be/src/runtime/raw-value-test.cc M be/src/runtime/row-batch-serialize-test.cc M be/src/runtime/string-buffer-test.cc M be/src/runtime/string-value-test.cc M be/src/runtime/thread-resource-mgr-test.cc M be/src/runtime/timestamp-test.cc M be/src/scheduling/backend-config-test.cc M be/src/scheduling/simple-scheduler-test.cc M be/src/service/hs2-util-test.cc M be/src/service/query-options-test.cc M be/src/service/session-expiry-test.cc M be/src/testutil/gtest-util.h M be/src/udf/uda-test.cc M be/src/udf/udf-test.cc M be/src/util/benchmark-test.cc M be/src/util/bit-util-test.cc M be/src/util/bitmap-test.cc M be/src/util/blocking-queue-test.cc M be/src/util/bloom-filter-test.cc M be/src/util/coding-util-test.cc M be/src/util/debug-util-test.cc M be/src/util/decompress-test.cc M be/src/util/dict-test.cc M be/src/util/error-util-test.cc M be/src/util/filesystem-util-test.cc M be/src/util/fixed-size-hash-table-test.cc M be/src/util/internal-queue-test.cc M be/src/util/logging-support-test.cc M be/src/util/lru-cache-test.cc M be/src/util/metrics-test.cc M be/src/util/parse-util-test.cc M be/src/util/perf-counters-test.cc M be/src/util/pretty-printer-test.cc M be/src/util/promise-test.cc M be/src/util/redactor-config-parser-test.cc M be/src/util/redactor-test.cc M be/src/util/redactor-unconfigured-test.cc M be/src/util/rle-test.cc M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/string-parser-test.cc M be/src/util/symbols-util-test.cc M be/src/util/thread-pool-test.cc M be/src/util/uid-util-test.cc M be/src/util/webserver-test.cc 73 files changed, 169 insertions(+), 421 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Alex Behm has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > yeah, but ideally any table change should trigger a new analyzing just like You raise a good point. Maybe we should consider how to ensure at least table-level consistency during analysis. For example, one way could be to add referenced Tables into the Analyzer such that subsequent references will see the same Table object (instead of going to the Catalog every time). Seems relatively easy to do. Dimitris also has a good point that distinguishing the drop+add case from the invalidate case may be hard, but I think the proposed Analyzer change above should handle that, right? -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. IMPALA-4008: don't bake in hash table and hash join pointers This fixes some of the cases where memory addresses are baked into codegen'd code. Testing: Ran exhaustive build. Perf: Ran a local perf run. No significant changes. I was able to see some small improvements on microbenchmarks. +---+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +---+---+-++++ | TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88 | +0.34% | +---+---+-++++ +---+--+---++-++++-+---+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +---+--+---++-++++-+---+ | TPCH(_20) | TPCH-Q2 | parquet / none / none | 2.12 | 1.89| +12.29% | * 10.85% * | * 20.30% * | 1 | 10| | TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84 | 9.34| +5.39% | 9.01%| 3.79%| 1 | 10| | TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61 | 14.19 | +2.97% | 2.15%| 1.52%| 1 | 10| | TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76 | 14.35 | +2.82% | 3.20%| 2.64%| 1 | 10| | TPCH(_20) | TPCH-Q9 | parquet / none / none | 13.72 | 13.54 | +1.30% | 1.75%| 0.70%| 1 | 10| | TPCH(_20) | TPCH-Q8 | parquet / none / none | 5.71 | 5.64| +1.30% | 1.21%| 1.23%| 1 | 10| | TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35 | 46.75 | +1.28% | 2.39%| 1.88%| 1 | 10| | TPCH(_20) | TPCH-Q5 | parquet / none / none | 4.57 | 4.52| +1.20% | 1.30%| 0.88%| 1 | 10| | TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07 | 2.05| +1.12% | 2.59%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45 | 1.45| +0.15% | 2.69%| 2.06%| 1 | 10| | TPCH(_20) | TPCH-Q3 | parquet / none / none | 4.65 | 4.65| -0.09% | 2.12%| 2.17%| 1 | 10| | TPCH(_20) | TPCH-Q4 | parquet / none / none | 3.22 | 3.23| -0.26% | 1.03%| 1.33%| 1 | 10| | TPCH(_20) | TPCH-Q7 | parquet / none / none | 15.84 | 15.92 | -0.50% | 0.91%| 1.15%| 1 | 10| | TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29 | 3.31| -0.59% | 3.31%| 1.58%| 1 | 10| | TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65 | 2.67| -0.78% | 3.03%| 1.46%| 1 | 10| | TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50 | 4.55| -1.19% | 2.87%| 2.45%| 1 | 10| | TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84 | 3.91| -1.76% | 2.20%| 1.94%| 1 | 10| | TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58 | 5.70| -2.00% | 1.01%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84 | 23.42 | -2.47% | 0.68%| 0.56%| 1 | 10| | TPCH(_20) | TPCH-Q1 | parquet / none / none | 11.25 | 11.60 | -3.06% | 0.48%| 1.74%| 1 | 10| | TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81 | 3.98| -4.38% | 1.62%| 1.14%| 1 | 10| | TPCH(_20) | TPCH-Q6 | parquet / none / none | 1.94 | 2.04| -4.85% | 2.40%| 1.58%| 1 | 10| +---+--+---++-++++-+---+ ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4309 to look at the new patch set (#4). Change subject: IMPALA-4028: Trim sentry config file path spaces while impala start. .. IMPALA-4028: Trim sentry config file path spaces while impala start. When sentry config file path is wrong input which end contains spaces, impala start up failed. Trimming sentry config file path spaces and checking file again while the original file has been checked failed will solve this problem. Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 --- M fe/src/main/java/com/cloudera/impala/authorization/SentryConfig.java 1 file changed, 3 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/4309/4 -- To view, visit http://gerrit.cloudera.org:8080/4309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: davy...@163.com
[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.
davy...@163.com has posted comments on this change. Change subject: IMPALA-4028: Trim sentry config file path spaces while impala start. .. Patch Set 3: We have developed a configuration management system through which command line parameters can be set. Thank you for your advice. I will use quotes to show the file name path in the error message. -- To view, visit http://gerrit.cloudera.org:8080/4309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: davy...@163.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: > > I'd like you to run the small suite of cluster tests, but you > can't > > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/ > > I realize there aren't that many, and they may not find anything, > > but it's a habit we need to start. Thanks. > > Will do, looks like your patch just got merged. I will rebase my > patch and then run the tests. Will update this review once I'm > done. @Michael, rebased this patch and ran all the units tests (test_cluster.py and test_query_objects.py). All tests passed. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Huaisi Xu has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > I think it may be hard to differentiate between a table dropped + added and why we want to differentiate this? we do not do this now as well right? Line 188: private final HashMap tblIdMap_ = > I think that behavior is actually desired. If a table with the same name wa yeah, but ideally any table change should trigger a new analyzing just like incomplete table would do in the middle of analyze. so we do not have the case when for 100 sub queries, half of them are analyzed based on table version a, the second half on version b. and backend ends up with two version of the same table as well. If we keep them the same table ID for the same table name, then at least backend can do things on a single version of table? I know that may not matter.. anyway, I will do what you guys suggested... thanks -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.
Alex Behm has posted comments on this change. Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests. .. Patch Set 4: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/4352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4352 to look at the new patch set (#4). Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests. .. IMPALA-4091: Fix backend unit to log in logs/be_tests. 1. Many backend unit tests did not follow proper initialization using InitCommonRuntime(), and as a result did not write their logs into the logs/be_tests directory. 2. Added an IMPALA_TEST_MAIN() macro that stamps out the common main() function used in most gtest unit tests. 3. Tests added via ADD_UDF_TEST in a CMakeLists.txt did not have the logging dir set up properly. Testing: I validated that every test produces a corresponding .INFO file in logs/be_tests. The only exception is promise-test for which I added a TODO since the fix seems non-trivial. Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b --- M be/CMakeLists.txt M be/src/codegen/instruction-counter-test.cc M be/src/common/atomic-test.cc M be/src/common/init.h M be/src/exec/delimited-text-parser-test.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/incr-stats-util-test.cc M be/src/exec/incr-stats-util.cc M be/src/exec/old-hash-table-test.cc M be/src/exec/parquet-plain-test.cc M be/src/exec/parquet-version-test.cc M be/src/exec/read-write-util-test.cc M be/src/exec/row-batch-list-test.cc M be/src/exec/zigzag-test.cc M be/src/experiments/string-search-sse-test.cc M be/src/exprs/aggregate-functions-test.cc M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/collection-value-builder-test.cc M be/src/runtime/decimal-test.cc M be/src/runtime/free-pool-test.cc M be/src/runtime/free-pool.h M be/src/runtime/hdfs-fs-cache-test.cc M be/src/runtime/mem-pool-test.cc M be/src/runtime/mem-tracker-test.cc M be/src/runtime/multi-precision-test.cc M be/src/runtime/parallel-executor-test.cc M be/src/runtime/raw-value-test.cc M be/src/runtime/row-batch-serialize-test.cc M be/src/runtime/string-buffer-test.cc M be/src/runtime/string-value-test.cc M be/src/runtime/thread-resource-mgr-test.cc M be/src/runtime/timestamp-test.cc M be/src/scheduling/backend-config-test.cc M be/src/scheduling/simple-scheduler-test.cc M be/src/service/hs2-util-test.cc M be/src/service/query-options-test.cc M be/src/service/session-expiry-test.cc M be/src/testutil/gtest-util.h M be/src/udf/uda-test.cc M be/src/udf/udf-test.cc M be/src/util/benchmark-test.cc M be/src/util/bit-util-test.cc M be/src/util/bitmap-test.cc M be/src/util/blocking-queue-test.cc M be/src/util/bloom-filter-test.cc M be/src/util/coding-util-test.cc M be/src/util/debug-util-test.cc M be/src/util/decompress-test.cc M be/src/util/dict-test.cc M be/src/util/error-util-test.cc M be/src/util/filesystem-util-test.cc M be/src/util/fixed-size-hash-table-test.cc M be/src/util/internal-queue-test.cc M be/src/util/logging-support-test.cc M be/src/util/lru-cache-test.cc M be/src/util/metrics-test.cc M be/src/util/parse-util-test.cc M be/src/util/perf-counters-test.cc M be/src/util/pretty-printer-test.cc M be/src/util/promise-test.cc M be/src/util/redactor-config-parser-test.cc M be/src/util/redactor-test.cc M be/src/util/redactor-unconfigured-test.cc M be/src/util/rle-test.cc M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/string-parser-test.cc M be/src/util/symbols-util-test.cc M be/src/util/thread-pool-test.cc M be/src/util/uid-util-test.cc M be/src/util/webserver-test.cc 73 files changed, 169 insertions(+), 421 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/4352/4 -- To view, visit http://gerrit.cloudera.org:8080/4352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.
Alex Behm has posted comments on this change. Change subject: IMPALA-4110: Clean up issues found by Apache RAT. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/4361/6/be/.impala.doxy File be/.impala.doxy: Line 17 should we really delete all these existing comments? -- To view, visit http://gerrit.cloudera.org:8080/4361 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: > I'd like you to run the small suite of cluster tests, but you can't > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/ > I realize there aren't that many, and they may not find anything, > but it's a habit we need to start. Thanks. Will do, looks like your patch just got merged. I will rebase my patch and then run the tests. Will update this review once I'm done. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. IMPALA-4122: qgen: fix bitrotted cluster unit tests There's a small set of pytest-style tests and associated conftest for testing some of the cluster-related test infrastructure Python objects. Going forward, I want unit tests for the query generator to be run as part of patch acceptance (CI isn't necessary at this time). This patch fixes a bitrotted test and moves the tests into the tests-for-qgen directory. The moves were performed thusly: $ git mv conftest.py tests/ $ git mv cluster_tests.py tests/test_cluster.py Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Reviewed-on: http://gerrit.cloudera.org:8080/4404 Reviewed-by: Michael Brown Reviewed-by: Tim Armstrong Tested-by: Internal Jenkins --- D tests/comparison/conftest.py A tests/comparison/tests/conftest.py R tests/comparison/tests/test_cluster.py 3 files changed, 66 insertions(+), 50 deletions(-) Approvals: Michael Brown: Looks good to me, but someone else must approve Internal Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4118: extract encryption utils from BufferedBlockMgr
Tim Armstrong has uploaded a new patch set (#6). Change subject: IMPALA-4118: extract encryption utils from BufferedBlockMgr .. IMPALA-4118: extract encryption utils from BufferedBlockMgr As groundwork for IMPALA-4118, extract encryption and integrity verification routines into a separate module that can be used by the new BufferPool. Simplify the logic in BufferedBlockMgr by not distinguishing between the integrity check and encryption options, which are controlled by the same command line flag anyway. This patch changes how the OpenSSL RNG is seeded. I consulted with the original author of the code (Michael Yoder), who suggested that the new approach would be appropriate: "I'd pick whatever implementation is easiest for you. You could add entropy once per query, or once every X keys (100 keys seems reasonable), or once every "convenient place". 4kb is probably overkill, you could use 512b for example." Testing: Added some unit tests for the utilities. We already have coverage of the BufferedBlockMgr with encryption in buffered-block-mgr-test. Also reduce the number of iterations in the buffered-block-mgr-test to save some testing time. Change-Id: Ibebe431f69c3c569cbff68171beaa32ef2ab03b6 --- M be/src/common/init.cc M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/buffered-block-mgr.h M be/src/util/CMakeLists.txt A be/src/util/openssl-util-test.cc A be/src/util/openssl-util.cc A be/src/util/openssl-util.h 8 files changed, 461 insertions(+), 209 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/4389/6 -- To view, visit http://gerrit.cloudera.org:8080/4389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibebe431f69c3c569cbff68171beaa32ef2ab03b6 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Hello Michael Brown, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4175 to look at the new patch set (#4). Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. IMPALA-3718: Support subset of functional-query for Kudu Adds initial support for the functional-query test workload for Kudu tables. There are a few issues that make loading the functional schema difficult on Kudu: 1) Kudu tables must have one or more columns that together constitute a unique primary key. a) Primary key columns must currently be the first columns in the table definition (KUDU-1271). b) Primary key columns cannot be nullable (KUDU-1570). 2) Kudu tables must be specified with distribution parameters. (1) limits the tables that can be loaded without ugly workarounds. This patch only includes important tables that are used for relevant tests, most notably the alltypes* family. In particular, alltypesagg is important but it does not have a set of columns that are non-nullable and form a unique primary key. As a result, that table is created in Kudu with a different name and an additional BIGINT column for a PK that is a unique index and is generated at data loading time using the ROW_NUMBER analytic function. A view is then wrapped around the underlying table that matches the alltypesagg schema exactly. When KUDU-1570 is resolved, this can be simplified. (2) requires some additional considerations and custom syntax. As a result, the DDL to create the tables is explicitly specified in CREATE_KUDU sections in the functional_schema_constraints.csv, and an additional DEPENDENT_LOAD_KUDU section was added to specify custom data loading DML that differs from the existing DEPENDENT_LOAD. TODO: IMPALA-4005: generate_schema_statements.py needs refactoring Tests that are not relevant or not yet supported have been marked with xfail and a skip where appropriate. TODO: Support remaining functional tables/tests when possible. Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc --- M bin/impala-config.sh M testdata/bin/compute-table-stats.sh M testdata/bin/generate-schema-statements.py M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test M testdata/workloads/functional-query/functional-query_core.csv M testdata/workloads/functional-query/functional-query_exhaustive.csv M testdata/workloads/functional-query/functional-query_pairwise.csv M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M tests/common/skip.py M tests/common/test_result_verifier.py M tests/query_test/test_aggregation.py M tests/query_test/test_exprs.py M tests/query_test/test_queries.py M tests/query_test/test_runtime_filters.py M tests/query_test/test_scanners.py M tests/query_test/test_tpcds_queries.py 18 files changed, 329 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/4175/4 -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 4: Code-Review+2 Fixed the commit msg, carrying the +2 -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.
Alex Behm has posted comments on this change. Change subject: IMPALA-4110: Apache RAT script on Impala tarballs. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
Dan Hecht has posted comments on this change. Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder .. Patch Set 14: (25 comments) Here's a first set of comments, mostly focused on phjn.h. I need to go back through phjn.cc, builder.h/cc. http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: Line 207: state, Substitute(PREPARE_FAILED_ERROR_MSG, "write")); old code and agg uses state_->block_mgr()->MemLimitTooLowError(). Any reason this uses MemLimitExceeded() directly instead? Just wondering the reasoning. http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: Line 79: /// Logically, the algorithm has the following modes:. I think this comment could be a lot more useful if we explain how the phases work together to give the full algorithm. Probably that could replace the state transition graph which I find confusing. PS14, Line 81: partition them into hash_partitions_. The input can this comment is kind of disjoint since it talks about what the phase does, then where the input comes from, and then explains more about what the phase does. Let's format each phase consistently, maybe describe inputs first, then actions, then outputs? PS14, Line 86: PROCESSING why do we use PROCESSING term here rather than PARTITIONING, which would be more consistent? PS14, Line 87: spilled single spilled (or remove single from #3 since having it in one place but not the other makes the reader wonder if there's a difference). Line 89: /// table in memory, we perform the join, otherwise we spill the probe row. this last sentence is confusing since it mostly restates what the first says adds a small amount of new info. Let's combine them. PS14, Line 90: PROBING_SPILLED_PARTITION it would be good to be more upfront about when this phase is used and how, for a spilled partition, we either do this phase or REPARTITIONING_PROBE phase. PS14, Line 90: construct what does this phase construct? isn't the hash table already constructed? PS14, Line 91: walking what does this mean? processing? also, from these comments it's not clear how these stages relate. You could read this to mean that #2 a fallback if we can't do #3, or that #3 is the probing that "perform the join" of #2. PS14, Line 97: * what does this star mean to say? We can go from REPARTITIONING_PROBE back to PROBING_SPILLED_PARTITION, right? PS14, Line 143: buffer write buffer? PS14, Line 264: the its PS14, Line 267: the that PS14, Line 291: Walks What does this mean. Iterates over? PS14, Line 291: hash partitions is this talking about the probe hash partitions or the builder's? PS14, Line 384: builder_->hash_partitions is this referring to the entries in the builder_->hash_partition_ array? PS14, Line 385: This is not used when processing a single partition. what does this mean? PS14, Line 410: we then iterate over all the probe rows I can't tell if this means literally all the probe rows, or all the rows in this stream. clarify this paragraph. PS14, Line 432: and this makes it sound like there are two conditions required to create a probe partition, but I think you mean this is implied by the first part. "because"? PS14, Line 436: preallocated what does this mean? PS14, Line 437: and double and. And is this saying the constructor prepares it or the caller should have already? is this right: ... should be an unpinned stream that has been prepared for writing with ... PS14, Line 443: should will PS14, Line 464: meaning : /// it has to call Close() on it) but transferred to the parent exec node this doesn't seem right. don't we either Close() it or transfer it (not both), in PartitionedHashJoinNode::ProbePartition::Close()? http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: PS14, Line 409: DCHECK DCHECK_EQ Line 412: children_.insert(children_.begin(), child_entry); rather than having two special cases (here and line 395-399, how about making AddChildLocked() take the position iterator? This case passes begin() and then AddChild() case passes either end() or the middle iterator. -- To view, visit http://gerrit.cloudera.org:8080/3873 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
Alex Behm has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py File testdata/bin/check-hbase-nodes.py: Line 88: LOGGER.debug("Success: " + str(zk_client)) > That's a good catch, but I think it might be OK as-is. This is the output o Wfm. -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Alex Behm has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4175/3//COMMIT_MSG Commit Message: Line 45: marked with xfail and a message where appropriate. marked with skip -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 6: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
David Knupp has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py File testdata/bin/check-hbase-nodes.py: Line 88: LOGGER.debug("Success: " + str(zk_client)) > LOGGER.info? Otherwise it looks like we're trying to connect but never succ That's a good catch, but I think it might be OK as-is. This is the output of a typical run: Contents of HDFS root: [u'hbase', u'home', u'test-warehouse', u'tmp', u'user'] Connecting to Zookeeper host(s). Waiting for HBase node: /hbase/master Success: /hbase/master Waiting for HBase node: /hbase/rs Success: /hbase/rs [etc...] The missing line would have been: Success: If the connection attempt didn't succeed, I don't think it would be a mystery -- you'd get the exception msg, and the script would exit with an error. Is that acceptable? (Honestly, I think might have intended to take that line out.) -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Dan Hecht has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
Alex Behm has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py File testdata/bin/check-hbase-nodes.py: Line 88: LOGGER.debug("Success: " + str(zk_client)) LOGGER.info? Otherwise it looks like we're trying to connect but never succeed -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > I think that behavior is actually desired. If a table with the same name wa I think it may be hard to differentiate between a table dropped + added and a table reloaded after an invalidate metadata. -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4091: Fix backend unit to log in logs/be tests.
Lars Volker has posted comments on this change. Change subject: IMPALA-4091: Fix backend unit to log in logs/be_tests. .. Patch Set 3: Code-Review+1 Thanks for fixing this. -- To view, visit http://gerrit.cloudera.org:8080/4352 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaff0acf09bf192d54baeb0bb347e895fce6ed23b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Alex Behm has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > Isn't that ok? Once you drop+create a table with same name, it isn't essent I think that behavior is actually desired. If a table with the same name was dropped+added it *should* have a different table id. -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > Alex, Bharath, another possibility is that given our asymmetry architecture Isn't that ok? Once you drop+create a table with same name, it isn't essentially the "same" table anymore, for ex: its schema could've changed. IMO thats fine. I think this is what Alex meant when he mentioned that drop+create should create a new table ID. -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.
Jim Apple has posted comments on this change. Change subject: IMPALA-4110: Clean up issues found by Apache RAT. .. Patch Set 3: (3 comments) > (3 comments) > > Looks good to me. > > Can we please separate the tool integration from the fixes in two > separate patches? Done, see https://gerrit.cloudera.org/#/c/4405 http://gerrit.cloudera.org:8080/#/c/4361/3/bin/check-rat-report.py File bin/check-rat-report.py: Line 21: # Usage: > Can you describe the purpose of this tool? Where did this script come from? Done. Moved to https://gerrit.cloudera.org/#/c/4405 Line 24: # `git archive --prefix=Foo/ -o baz.tar.gz HEAD` > can we make the examples more concrete? Done http://gerrit.cloudera.org:8080/#/c/4361/3/testdata/bin/cache_tables.py File testdata/bin/cache_tables.py: Line 22: # query. This only works on a mini-dfs cluster. This is remarkably difficult to do > fix whitespace while here Done -- To view, visit http://gerrit.cloudera.org:8080/4361 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.
Jim Apple has uploaded a new patch set (#6). Change subject: IMPALA-4110: Clean up issues found by Apache RAT. .. IMPALA-4110: Clean up issues found by Apache RAT. The script and support files are in another patch: https://gerrit.cloudera.org/#/c/4405/ Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 --- M LICENSE.txt M be/.astylerc M be/.impala.doxy M be/src/benchmarks/expr-benchmark.cc M be/src/experiments/data-provider.cc M be/src/experiments/hashing/cache-hash-test.cc M be/src/experiments/hashing/growing-test.cc M be/src/experiments/hashing/interface/cache-hash-test.cc M be/src/experiments/hashing/interface/growing-test.cc M be/src/experiments/hashing/multilevel/cache-hash-test.cc M be/src/experiments/hashing/multilevel/growing-test.cc M be/src/experiments/hashing/prefetch/cache-hash-test.cc M be/src/experiments/hashing/prefetch/growing-test.cc M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc M be/src/experiments/hashing/split-benchmarks/growing-test.cc M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc M be/src/experiments/hashing/streaming/cache-hash-test.cc M be/src/experiments/hashing/streaming/growing-test.cc M be/src/testutil/certificates-info.txt M be/src/util/asan.h M bin/README-RUNNING-BENCHMARKS D bin/cpplint.py M bin/impala-ipython M bin/impala-py.test M bin/impala-python M cmake_modules/FindAvro.cmake M cmake_modules/FindBreakpad.cmake M cmake_modules/FindBzip2.cmake M cmake_modules/FindGFlags.cmake M cmake_modules/FindGLog.cmake M cmake_modules/FindHDFS.cmake M cmake_modules/FindJNI.cmake M cmake_modules/FindLdap.cmake M cmake_modules/FindLlvm.cmake M cmake_modules/FindLlvmBinaries.cmake M cmake_modules/FindLz4.cmake M cmake_modules/FindOpenSSL.cmake M cmake_modules/FindPProf.cmake M cmake_modules/FindRapidJson.cmake M cmake_modules/FindRe2.cmake M cmake_modules/FindSnappy.cmake M cmake_modules/FindThrift.cmake M cmake_modules/FindZlib.cmake M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java M fe/src/main/java/com/cloudera/impala/catalog/MapType.java M fe/src/main/java/com/cloudera/impala/catalog/StructType.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/test/resources/authz-policy.ini.template M fe/src/test/resources/hive-log4j.properties.template M fe/src/test/resources/log4j.properties.template M infra/python/deps/download_requirements M infra/python/deps/requirements.txt M testdata/avro_schema_resolution/create_table.sql M testdata/bin/cache_tables.py M testdata/bin/create-mini.sql M testdata/bin/load-dependent-tables.sql M testdata/bin/load-hive-builtins.sh M testdata/bin/run-hive.sh M testdata/cluster/node_templates/cdh5/etc/init.d/kms M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application M testdata/cluster/node_templates/common/etc/init.d/common.tmpl M testdata/cluster/node_templates/common/etc/init.d/hdfs-common M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode M testdata/cluster/node_templates/common/etc/init.d/yarn-common M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager D testdata/data/mstr/eatwh1/EATWH1-DDL.sql D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT D testdata/data/mstr/eatwh1/csv/FT1.TXT D testdata/data/mstr/eatwh1/csv/FT10.TXT D testdata/data/mstr/eatwh1/csv/FT11.TXT D testdata/data/mstr/eatwh1/csv/FT12.TXT D testdata/data/mstr/eatwh1/csv/FT13.TXT D testdata/data/mstr/eatwh1/csv/FT14.TXT D testdata/data/mstr/eatwh1/csv/FT15.TXT D testdata/data/mstr/eatwh1/csv/FT17.TXT D testdata/data/mstr/eatwh1/csv/FT2.TXT D testdata/data/mstr/eatwh1/csv/FT3.TXT D testdata/data/mstr/eatwh1/csv/FT4.TXT D testdata/data/mstr/eatwh1/csv/FT5.TXT D testdata/data/mstr/eatwh1/csv/FT6.TXT D testdata/data/mstr/eatwh1/csv/FT7.TXT D testdata/data/mstr/eatwh1/csv/FT8.TXT D testdata/data/mstr/eatwh1/csv/FT9.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_CURR.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_ORDERS.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_Q1_1997.TXT D testdata/data/mstr/eatwh1/csv/INVENTOR
[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.
Jim Apple has posted comments on this change. Change subject: IMPALA-4110: Apache RAT script on Impala tarballs. .. Patch Set 1: Separated from https://gerrit.cloudera.org/#/c/4361/ following a request by Alex. -- To view, visit http://gerrit.cloudera.org:8080/4405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4110: Clean up issues found by Apache RAT.
Jim Apple has uploaded a new patch set (#5). Change subject: IMPALA-4110: Clean up issues found by Apache RAT. .. IMPALA-4110: Clean up issues found by Apache RAT. The script and support files are in another patch: https://gerrit.cloudera.org/#/c/4405/ Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 --- M LICENSE.txt M be/.astylerc M be/.impala.doxy M be/src/benchmarks/expr-benchmark.cc M be/src/experiments/data-provider.cc M be/src/experiments/hashing/cache-hash-test.cc M be/src/experiments/hashing/growing-test.cc M be/src/experiments/hashing/interface/cache-hash-test.cc M be/src/experiments/hashing/interface/growing-test.cc M be/src/experiments/hashing/multilevel/cache-hash-test.cc M be/src/experiments/hashing/multilevel/growing-test.cc M be/src/experiments/hashing/prefetch/cache-hash-test.cc M be/src/experiments/hashing/prefetch/growing-test.cc M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc M be/src/experiments/hashing/split-benchmarks/growing-test.cc M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc M be/src/experiments/hashing/streaming/cache-hash-test.cc M be/src/experiments/hashing/streaming/growing-test.cc M be/src/testutil/certificates-info.txt M be/src/util/asan.h M bin/README-RUNNING-BENCHMARKS D bin/cpplint.py M bin/impala-ipython M bin/impala-py.test M bin/impala-python M cmake_modules/FindAvro.cmake M cmake_modules/FindBreakpad.cmake M cmake_modules/FindBzip2.cmake M cmake_modules/FindGFlags.cmake M cmake_modules/FindGLog.cmake M cmake_modules/FindHDFS.cmake M cmake_modules/FindJNI.cmake M cmake_modules/FindLdap.cmake M cmake_modules/FindLlvm.cmake M cmake_modules/FindLlvmBinaries.cmake M cmake_modules/FindLz4.cmake M cmake_modules/FindOpenSSL.cmake M cmake_modules/FindPProf.cmake M cmake_modules/FindRapidJson.cmake M cmake_modules/FindRe2.cmake M cmake_modules/FindSnappy.cmake M cmake_modules/FindThrift.cmake M cmake_modules/FindZlib.cmake M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java M fe/src/main/java/com/cloudera/impala/catalog/MapType.java M fe/src/main/java/com/cloudera/impala/catalog/StructType.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/test/resources/authz-policy.ini.template M fe/src/test/resources/hive-log4j.properties.template M fe/src/test/resources/log4j.properties.template M infra/python/deps/download_requirements M infra/python/deps/requirements.txt M testdata/avro_schema_resolution/create_table.sql M testdata/bin/cache_tables.py M testdata/bin/create-mini.sql M testdata/bin/load-dependent-tables.sql M testdata/bin/load-hive-builtins.sh M testdata/bin/run-hive.sh M testdata/cluster/node_templates/cdh5/etc/init.d/kms M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application M testdata/cluster/node_templates/common/etc/init.d/common.tmpl M testdata/cluster/node_templates/common/etc/init.d/hdfs-common M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode M testdata/cluster/node_templates/common/etc/init.d/yarn-common M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager D testdata/data/mstr/eatwh1/EATWH1-DDL.sql D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT D testdata/data/mstr/eatwh1/csv/FT1.TXT D testdata/data/mstr/eatwh1/csv/FT10.TXT D testdata/data/mstr/eatwh1/csv/FT11.TXT D testdata/data/mstr/eatwh1/csv/FT12.TXT D testdata/data/mstr/eatwh1/csv/FT13.TXT D testdata/data/mstr/eatwh1/csv/FT14.TXT D testdata/data/mstr/eatwh1/csv/FT15.TXT D testdata/data/mstr/eatwh1/csv/FT17.TXT D testdata/data/mstr/eatwh1/csv/FT2.TXT D testdata/data/mstr/eatwh1/csv/FT3.TXT D testdata/data/mstr/eatwh1/csv/FT4.TXT D testdata/data/mstr/eatwh1/csv/FT5.TXT D testdata/data/mstr/eatwh1/csv/FT6.TXT D testdata/data/mstr/eatwh1/csv/FT7.TXT D testdata/data/mstr/eatwh1/csv/FT8.TXT D testdata/data/mstr/eatwh1/csv/FT9.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_CURR.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_ORDERS.TXT D testdata/data/mstr/eatwh1/csv/INVENTORY_Q1_1997.TXT D testdata/data/mstr/eatwh1/csv/INVENTOR
[Impala-ASF-CR] IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.
Jim Apple has uploaded a new patch set (#4). Change subject: IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs. .. IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs. While I'm here, fix as many of the warnings as possible. The instructions on how to run this are in check-rat-report.py: 1. Either download a release candidate or create an archive from your git repo using `git archive --prefix=Foo/ -o baz.tar.gz HEAD` 2. Untar and run `java -jar apache-rat-0.12.jar -x . >baz.xml`. Only RAT version 0.12 is known to work at this time. 3. `bin/check-rat-report.py bin/rat_exclude_files.txt baz.xml` Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 --- M LICENSE.txt M be/.astylerc M be/.impala.doxy M be/src/benchmarks/expr-benchmark.cc M be/src/experiments/data-provider.cc M be/src/experiments/hashing/cache-hash-test.cc M be/src/experiments/hashing/growing-test.cc M be/src/experiments/hashing/interface/cache-hash-test.cc M be/src/experiments/hashing/interface/growing-test.cc M be/src/experiments/hashing/multilevel/cache-hash-test.cc M be/src/experiments/hashing/multilevel/growing-test.cc M be/src/experiments/hashing/prefetch/cache-hash-test.cc M be/src/experiments/hashing/prefetch/growing-test.cc M be/src/experiments/hashing/split-benchmarks/cache-hash-test.cc M be/src/experiments/hashing/split-benchmarks/growing-test.cc M be/src/experiments/hashing/split-benchmarks/partitioning-throughput-test.cc M be/src/experiments/hashing/streaming/cache-hash-test.cc M be/src/experiments/hashing/streaming/growing-test.cc M be/src/testutil/certificates-info.txt M be/src/util/asan.h M bin/README-RUNNING-BENCHMARKS D bin/cpplint.py M bin/impala-ipython M bin/impala-py.test M bin/impala-python M cmake_modules/FindAvro.cmake M cmake_modules/FindBreakpad.cmake M cmake_modules/FindBzip2.cmake M cmake_modules/FindGFlags.cmake M cmake_modules/FindGLog.cmake M cmake_modules/FindHDFS.cmake M cmake_modules/FindJNI.cmake M cmake_modules/FindLdap.cmake M cmake_modules/FindLlvm.cmake M cmake_modules/FindLlvmBinaries.cmake M cmake_modules/FindLz4.cmake M cmake_modules/FindOpenSSL.cmake M cmake_modules/FindPProf.cmake M cmake_modules/FindRapidJson.cmake M cmake_modules/FindRe2.cmake M cmake_modules/FindSnappy.cmake M cmake_modules/FindThrift.cmake M cmake_modules/FindZlib.cmake M fe/src/main/java/com/cloudera/impala/analysis/AggregateInfoBase.java M fe/src/main/java/com/cloudera/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java M fe/src/main/java/com/cloudera/impala/catalog/ArrayType.java M fe/src/main/java/com/cloudera/impala/catalog/MapType.java M fe/src/main/java/com/cloudera/impala/catalog/StructType.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/test/resources/authz-policy.ini.template M fe/src/test/resources/hive-log4j.properties.template M fe/src/test/resources/log4j.properties.template M infra/python/deps/download_requirements M infra/python/deps/requirements.txt M testdata/avro_schema_resolution/create_table.sql M testdata/bin/cache_tables.py M testdata/bin/create-mini.sql M testdata/bin/load-dependent-tables.sql M testdata/bin/load-hive-builtins.sh M testdata/bin/run-hive.sh M testdata/cluster/node_templates/cdh5/etc/init.d/kms M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-master M testdata/cluster/node_templates/cdh5/etc/init.d/kudu-tserver M testdata/cluster/node_templates/cdh5/etc/init.d/llama-application M testdata/cluster/node_templates/common/etc/init.d/common.tmpl M testdata/cluster/node_templates/common/etc/init.d/hdfs-common M testdata/cluster/node_templates/common/etc/init.d/hdfs-datanode M testdata/cluster/node_templates/common/etc/init.d/hdfs-namenode M testdata/cluster/node_templates/common/etc/init.d/yarn-common M testdata/cluster/node_templates/common/etc/init.d/yarn-nodemanager M testdata/cluster/node_templates/common/etc/init.d/yarn-resourcemanager D testdata/data/mstr/eatwh1/EATWH1-DDL.sql D testdata/data/mstr/eatwh1/LoadTablesEATWH1.sql D testdata/data/mstr/eatwh1/csv/COST_MARKET_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_MARKET_DIV.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_CLASS.TXT D testdata/data/mstr/eatwh1/csv/COST_REGION_ITEM.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_DEP.TXT D testdata/data/mstr/eatwh1/csv/COST_STORE_ITEM.TXT D testdata/data/mstr/eatwh1/csv/FT1.TXT D testdata/data/mstr/eatwh1/csv/FT10.TXT D testdata/data/mstr/eatwh1/csv/FT11.TXT D testdata/data/mstr/eatwh1/csv/FT12.TXT D testdata/data/mstr/eatwh1/csv/FT13.TXT D testdata/data/mstr/eatwh1/csv/FT14.TXT D testdata/data/mstr/eatwh1/csv/FT15.TXT D testdata/data/mstr/eatwh1/csv/FT17.TXT D testdata/data/mstr/eatwh1/csv/FT2.TXT D testdata/data/mstr/eatwh1/csv/FT3.TXT D testdata/data/mstr/eatwh1/csv/FT4.TXT D testdata/data/mstr/eatwh1/csv/FT5
[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py .. IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py Testing: Ran the tests locally in a loop on exhaustive. Did a private debug/exhaustive run. Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c Reviewed-on: http://gerrit.cloudera.org:8080/4339 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M testdata/workloads/functional-query/queries/QueryTest/aggregation_no_codegen_only.test M tests/query_test/test_aggregation.py M tests/query_test/test_scanners.py 3 files changed, 20 insertions(+), 35 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4110: Apache RAT script on Impala tarballs.
Jim Apple has uploaded a new change for review. http://gerrit.cloudera.org:8080/4405 Change subject: IMPALA-4110: Apache RAT script on Impala tarballs. .. IMPALA-4110: Apache RAT script on Impala tarballs. Apache RAT is a tool for license auditing. It will be used as part of the Apache release process. This patch includes a script for parsing its output and a file containing a list of filename globs that should be ignored. The script takes two command line parameters as input - the filename of the ignored file globs, and the filename of the RAT xml output. Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714 --- A bin/check-rat-report.py A bin/rat_exclude_files.txt 2 files changed, 191 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/4405/1 -- To view, visit http://gerrit.cloudera.org:8080/4405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic95bd38fbb90f9a901602dd91cee541b16bf4714 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Huaisi Xu has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > this is a good idea! I will try to do this. but for 64bit thing I think it Alex, Bharath, another possibility is that given our asymmetry architecture, one impalad can drop and create a new table with the same name. if this table gets updated in a single catalog update to another impalad, then we might see same table with different table ID as well. -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder
Michael Ho has posted comments on this change. Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder .. Patch Set 14: (31 comments) Some comments and questions. http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: PS14, Line 297: SCOPED_TIMER(build_timer_); We used to have two different timers in PHJ for measuring the time to hash the input rows into different partitions and the time to build the hash tables. It would be good to be able to retain that fined grained tracking PHJBuilder. http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: Line 179: largest_fraction = max(largest_fraction, DCHECK_GT(num_build_rows, 0); PS14, Line 318: a hash table hash tables Line 341: // building hash tables to avoid building hash tables for partitions we'll spill anyway. That's an interesting comment. If I understand it correctly, it means that calling InitSpilledPartitionProbeStreams() may cause more partitions to be spilled as it needs to pin write buffers for the spilled streams so we should make sure we allocate all the write streams to trigger all the extra spills which may happen before building hash tables. If so, would you mind elaborating a bit in the comment: "Allocate probe buffers for all partitions that are already spilled. Do this before building hash tables as allocating probe buffers may cause some more partitions to be spilled. This avoids wasted work on building hash tables for partitions which end up being spilled eventually." Do you know how often this case happens ? I suppose this complication will be gone eventually once reservation is in place. Line 401: RETURN_IF_ERROR(SpillPartition()); May help to document that failure to find any partition to spill (e.g. all partitions are spilled) will return an error code. It looks as if we may be in an infinite loop on the first glance. PS14, Line 529: PhjBuilder::HashTableStoresNulls() This seems to belong to PartitionedHashJoinNode conceptually. Line 646: do { Please see comments in BlockingJoinNode, it would be great to retain timers for InsertBatch() and ProcessBuildInput() separately for finer grain tracking. PS14, Line 782: process_build_batch_fn_level0 'process_build_batch_fn_level0_' http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: Line 425: nit: unnecessary blank line ? http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: PS14, Line 151: : : This may be important information to retain. PS14, Line 87: Expr::CreateExprTree(pool_, eq_join_conjunct.right Does this overlap with CreateExprTree() in PhjBuilder::Init() ? Same question for build_expr_ctxs_. If they are not redundant, mind commenting on how they are used differently ? Line 213: for (unique_ptr& partition : spilled_partitions_) missing { } PS14, Line 494: to output outputting PS14, Line 585: hash_partitions_ 'hash_partitions_' Line 594: continue; Is there a reason why we cannot call OutputUnmatchedBuild() directly at this point ? PS14, Line 996: s) next_state ? http://gerrit.cloudera.org:8080/#/c/3873/12/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: PS12, Line 81: hash_partitions_ 'hash_partitions_' PS12, Line 88: probe_hash_partitions_ 'probe_hash_partitions_' PS12, Line 112: input_partition_ 'input_partition_' http://gerrit.cloudera.org:8080/#/c/3873/14/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: PS14, Line 437: Is this merged into build_timer_ in BlockingJoinNode now ? It may be helpful for debugging to maintain two separate timers. PS14, Line 81: hash_partitions_ 'hash_partitions_' PS14, Line 84: This is the only phase These are the only phases PS14, Line 88: probe_hash_partitions_ 'probe_hash_partitions_' PS14, Line 112: input_partition_ 'input_partition_' PS14, Line 118: input_partition_ 'input_partition_' PS14, Line 298: spilled_partitions_ 'spilled_partitions_' PS14, Line 302: probe_batch_pos_ 'probe_batch_pos_' PS14, Line 306: probe_batch_pos_ 'probe_batch_pos_' PS14, Line 309: input_partition_ 'input_partition_' PS14, Line 408: null_aware_ 'null_aware_' Line 445: /// block cannot be acquired. "delete_on_read" mode is used, so blocks in the stream ... used for the buffered tuple stream, so.. -- To view, visit http://gerrit.cloudera.org:8080/3873 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Huaisi Xu has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > +1. Given this happens only with "invalidate metadata", how about we change this is a good idea! I will try to do this. but for 64bit thing I think it is not necessary if we reuse table ID(like Alex suggested). the reason is that the new largest possible IDs is lower bounded by our current implementation. we do not have a problem now so probably that can go to another patch. For this one, I think I will only focus on enforcing unique table ID. Thank you both! -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Alex Behm has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > +1. Given this happens only with "invalidate metadata", how about we change This seems like the simplest solution to me. -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal() .. IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal() To improve supportability, this commit logs the actual error stack trace that can cause FileSystemUtil.copyToLocal() to fail. Additionaly this also cleans up FileSystemUtil.isPathReachable() method to throw an exception on failures rather than returning false and then returning the error message as a string to callers. Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8 Reviewed-on: http://gerrit.cloudera.org:8080/4125 Reviewed-by: Bharath Vissapragada Tested-by: Internal Jenkins --- M fe/src/main/java/com/cloudera/impala/analysis/HdfsUri.java M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/com/cloudera/impala/common/FileSystemUtil.java M fe/src/main/java/com/cloudera/impala/util/AvroSchemaUtils.java 4 files changed, 19 insertions(+), 33 deletions(-) Approvals: Bharath Vissapragada: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Brock Noland Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal() .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Brock Noland Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = > Why do we need another map? Don't we already have a map of existing tables +1. Given this happens only with "invalidate metadata", how about we change just the reset() to re-use existing Ids (by looking up from the existing dbCache, rather than resetting to 0) and other places just uses a new unique id. This solves two issues. 1. Invalidate won't change the Id. 2. drop+create of same table creates a new Id. Going by Tim's stats, if we use a 64-bit, we can use an ever-increasing Id. Thoughts? -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Michael Brown has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 2: Code-Review+1 > My thought with xfail was that I wanted them to show up as xfail so > we see them and have an incentive to fix them in the future. That's fair. I was coming from the standpoint of: 1. No one really reads the ends of these tests reports and notices an increase or decrease in XFAILs or SKIPs. Nothing consumes those counts, or tracks them on a wallboard, for instance. 2. When reading test code, it's odd to see explicit xfail() calls. I associate xfail() with "I am executing this test and expect it to fail." Typically I see it as a mark on a test. However, I don't feel strongly about these things. If the xfails incentivize you, that's fine. They can stay. Thanks too for running an exhaustive test. I think you should probably rebase this before you submit and run GVO. -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/4326/4/be/src/exec/hash-table.h File be/src/exec/hash-table.h: Line 308: uint8_t* ALWAYS_INLINE cur_expr_values() const { return cur_expr_values_; } > let's add a short comment here too: Done PS4, Line 310: . > for the current row. Done PS4, Line 400: into > This one should have stayed "in" (or "of"), right? i.e. it doesn't write *e Done Line 427: uint32_t HashVariableLenRow(uint8_t* expr_values, uint8_t* expr_values_null) const; > how about making these const as well. Done Line 441: TupleRow* build_row, uint8_t* expr_values, uint8_t* expr_values_null) const; > these too could be const to clarify they are "in" params. Done. Also made the various TupleRow arguments const too for consistency. This all led to a few additional consts. -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4326 to look at the new patch set (#5). Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. IMPALA-4008: don't bake in hash table and hash join pointers This fixes some of the cases where memory addresses are baked into codegen'd code. Testing: Ran exhaustive build. Perf: Ran a local perf run. No significant changes. I was able to see some small improvements on microbenchmarks. +---+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +---+---+-++++ | TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88 | +0.34% | +---+---+-++++ +---+--+---++-++++-+---+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +---+--+---++-++++-+---+ | TPCH(_20) | TPCH-Q2 | parquet / none / none | 2.12 | 1.89| +12.29% | * 10.85% * | * 20.30% * | 1 | 10| | TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84 | 9.34| +5.39% | 9.01%| 3.79%| 1 | 10| | TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61 | 14.19 | +2.97% | 2.15%| 1.52%| 1 | 10| | TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76 | 14.35 | +2.82% | 3.20%| 2.64%| 1 | 10| | TPCH(_20) | TPCH-Q9 | parquet / none / none | 13.72 | 13.54 | +1.30% | 1.75%| 0.70%| 1 | 10| | TPCH(_20) | TPCH-Q8 | parquet / none / none | 5.71 | 5.64| +1.30% | 1.21%| 1.23%| 1 | 10| | TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35 | 46.75 | +1.28% | 2.39%| 1.88%| 1 | 10| | TPCH(_20) | TPCH-Q5 | parquet / none / none | 4.57 | 4.52| +1.20% | 1.30%| 0.88%| 1 | 10| | TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07 | 2.05| +1.12% | 2.59%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45 | 1.45| +0.15% | 2.69%| 2.06%| 1 | 10| | TPCH(_20) | TPCH-Q3 | parquet / none / none | 4.65 | 4.65| -0.09% | 2.12%| 2.17%| 1 | 10| | TPCH(_20) | TPCH-Q4 | parquet / none / none | 3.22 | 3.23| -0.26% | 1.03%| 1.33%| 1 | 10| | TPCH(_20) | TPCH-Q7 | parquet / none / none | 15.84 | 15.92 | -0.50% | 0.91%| 1.15%| 1 | 10| | TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29 | 3.31| -0.59% | 3.31%| 1.58%| 1 | 10| | TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65 | 2.67| -0.78% | 3.03%| 1.46%| 1 | 10| | TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50 | 4.55| -1.19% | 2.87%| 2.45%| 1 | 10| | TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84 | 3.91| -1.76% | 2.20%| 1.94%| 1 | 10| | TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58 | 5.70| -2.00% | 1.01%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84 | 23.42 | -2.47% | 0.68%| 0.56%| 1 | 10| | TPCH(_20) | TPCH-Q1 | parquet / none / none | 11.25 | 11.60 | -3.06% | 0.48%| 1.74%| 1 | 10| | TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81 | 3.98| -4.38% | 1.62%| 1.14%| 1 | 10| | TPCH(_20) | TPCH-Q6 | parquet / none / none | 1.94 | 2.04| -4.85% | 2.40%| 1.58%| 1 | 10| +---+--+---++-++++-+---+ ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +---
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 2: > (8 comments) > > Thanks for the feedback. > > My thought with xfail was that I wanted them to show up as xfail so > we see them and have an incentive to fix them in the future. We > should be able to add support for the remaining tables once kudu > addresses the JIRAs I mentioned in the commit comment. I can change > them to skips if you still prefer. BTW I forgot to mention, I tested this with an exhaustive test run which completed successfully. -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs.
Alex Behm has posted comments on this change. Change subject: IMPALA-4110: PREVIEW: Make RAT run on Impala tarballs. .. Patch Set 3: (3 comments) Looks good to me. Can we please separate the tool integration from the fixes in two separate patches? http://gerrit.cloudera.org:8080/#/c/4361/3/bin/check-rat-report.py File bin/check-rat-report.py: Line 21: # Usage: Can you describe the purpose of this tool? Where did this script come from? Line 24: # `git archive --prefix=Foo/ -o baz.tar.gz HEAD` can we make the examples more concrete? http://gerrit.cloudera.org:8080/#/c/4361/3/testdata/bin/cache_tables.py File testdata/bin/cache_tables.py: Line 22: # query. This only works on a mini-dfs cluster. This is remarkably difficult to do fix whitespace while here -- To view, visit http://gerrit.cloudera.org:8080/4361 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5bfe77f9a871018e7a67553ed270e2df53006962 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Matthew Jacobs has uploaded a new patch set (#2). Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. IMPALA-3718: Support subset of functional-query for Kudu Adds initial support for the functional-query test workload for Kudu tables. There are a few issues that make loading the functional schema difficult on Kudu: 1) Kudu tables must have one or more columns that together constitute a unique primary key. a) Primary key columns must currently be the first columns in the table definition (KUDU-1271). b) Primary key columns cannot be nullable (KUDU-1570). 2) Kudu tables must be specified with distribution parameters. (1) limits the tables that can be loaded without ugly workarounds. This patch only includes important tables that are used for relevant tests, most notably the alltypes* family. In particular, alltypesagg is important but it does not have a set of columns that are non-nullable and form a unique primary key. As a result, that table is created in Kudu with a different name and an additional BIGINT column for a PK that is a unique index and is generated at data loading time using the ROW_NUMBER analytic function. A view is then wrapped around the underlying table that matches the alltypesagg schema exactly. When KUDU-1570 is resolved, this can be simplified. (2) requires some additional considerations and custom syntax. As a result, the DDL to create the tables is explicitly specified in CREATE_KUDU sections in the functional_schema_constraints.csv, and an additional DEPENDENT_LOAD_KUDU section was added to specify custom data loading DML that differs from the existing DEPENDENT_LOAD. TODO: IMPALA-4005: generate_schema_statements.py needs refactoring Tests that are not relevant or not yet supported have been marked with xfail and a message where appropriate. TODO: Support remaining functional tables/tests when possible. Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc --- M bin/impala-config.sh M testdata/bin/compute-table-stats.sh M testdata/bin/generate-schema-statements.py M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test M testdata/workloads/functional-query/functional-query_core.csv M testdata/workloads/functional-query/functional-query_exhaustive.csv M testdata/workloads/functional-query/functional-query_pairwise.csv M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M tests/common/skip.py M tests/common/test_result_verifier.py M tests/query_test/test_aggregation.py M tests/query_test/test_exprs.py M tests/query_test/test_queries.py M tests/query_test/test_runtime_filters.py M tests/query_test/test_scanners.py M tests/query_test/test_tpcds_queries.py 18 files changed, 329 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/4175/2 -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() .. Patch Set 18: I carried the +2 and pushed to asf. Thanks, Zoltan. -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Matthew Jacobs has submitted this change and it was merged. Change subject: IMPALA-3973: add position and occurrence to instr() .. IMPALA-3973: add position and occurrence to instr() Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Reviewed-on: http://gerrit.cloudera.org:8080/4094 Tested-by: Internal Jenkins Reviewed-by: Matthew Jacobs --- M be/src/experiments/CMakeLists.txt R be/src/experiments/string-search-sse-test.cc M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M be/src/runtime/CMakeLists.txt A be/src/runtime/string-search-test.cc M be/src/runtime/string-search.h M common/function-registry/impala_functions.py 9 files changed, 343 insertions(+), 20 deletions(-) Approvals: Matthew Jacobs: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi
[Impala-ASF-CR] IMPALA-1702: Enforce unique table ID
Alex Behm has posted comments on this change. Change subject: IMPALA-1702: Enforce unique table ID .. Patch Set 2: (1 comment) I think we should avoid both forms of id inconsistency: Different tables with the same id, or the same table with different ids. http://gerrit.cloudera.org:8080/#/c/4349/2/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 188: private final HashMap tblIdMap_ = Why do we need another map? Don't we already have a map of existing tables that we can use to get the current id? I think it could actually be problematic to only rely on the TTableName for identification. The user could drop+add the same table and I think in that case a new id should be assigned. Thoughts? -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3718: Support subset of functional-query for Kudu
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3718: Support subset of functional-query for Kudu .. Patch Set 1: (8 comments) Thanks for the feedback. My thought with xfail was that I wanted them to show up as xfail so we see them and have an incentive to fix them in the future. We should be able to add support for the remaining tables once kudu addresses the JIRAs I mentioned in the commit comment. I can change them to skips if you still prefer. http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: Line 594: print "Ignore partitions on Kudu" > Include the db and table name here so we know what we're ignoring. Done http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: PS1, Line 598: SELECT row_number() over (order by year, month, id, day), > For my education, was the ordering of "id, day" in the ORDER BY intentional It doesn't really matter too much. This row_number() column gets hidden anyway. If anything I might have put id first. There is only 1 distinct year and 1 distinct month, so they don't change the order. PS1, Line 1063: text_comma_backslash_newline > This table is defined as a kudu table in schema_constraints.csv L181, but i Good catch it shouldn't be there. Thanks. http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/datasets/functional/schema_constraints.csv File testdata/datasets/functional/schema_constraints.csv: PS1, Line 177: table_name:testtbl, constraint:only, table_format:kudu/none/none > How is it that this constraint was already defined but only just now had a generate_schema_statements will generate a create table statement if it's not defined (using the first col as the PK and hash expr), but I wanted to have them explicitly defined. http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: Line 836: select min(distinct NULL), max(distinct NULL) from alltypes > Why this change? Thanks. IMPALA-4042 alltypesagg is a view for kudu so this wouldn't work. The test still exercises the target functionality on a different table. http://gerrit.cloudera.org:8080/#/c/4175/1/testdata/workloads/functional-query/queries/QueryTest/create_kudu.test File testdata/workloads/functional-query/queries/QueryTest/create_kudu.test: PS1, Line 4: drop table if exists managed_kudu; > Fwiw, this won't be needed once the commit lies atop https://gerrit.clouder Thanks! Removing... http://gerrit.cloudera.org:8080/#/c/4175/1/tests/common/test_result_verifier.py File tests/common/test_result_verifier.py: PS1, Line 333: # TINYINT). Bypass the type # checking by ignoring the actual types of the Avro > Nit: You left the # when joining the line. Done PS1, Line 336: if 'TIMESTAMP' in expected_types: : LOG.info("TIMESTAMP columns unsupported in %s, skipping verification." %\ : file_format) : return > It's weird to see this logic again. this can be removed, thanks -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Zoltan Ivanfi has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() .. Patch Set 17: Dan, could you please +2 again? I added the missing test case and I also had to rebase to allow jenkins to verify the change. Thanks. > Please see comment about possible missing test case, but if nothing > else changes, this looks good. -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Michael Brown has posted comments on this change. Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. Patch Set 5: Code-Review+1 Thanks for the review. Patch set 5 adds an ASF license notice to a file that previously didn't have it. Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
David Knupp has posted comments on this change. Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4404/4/tests/comparison/tests/test_cluster.py File tests/comparison/tests/test_cluster.py: Line 51: ls = cluster.hdfs.create_client().list("/") Ha, I just noticed this. Look familiar, mikeb? -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Hello David Knupp, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4404 to look at the new patch set (#5). Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. IMPALA-4122: qgen: fix bitrotted cluster unit tests There's a small set of pytest-style tests and associated conftest for testing some of the cluster-related test infrastructure Python objects. Going forward, I want unit tests for the query generator to be run as part of patch acceptance (CI isn't necessary at this time). This patch fixes a bitrotted test and moves the tests into the tests-for-qgen directory. The moves were performed thusly: $ git mv conftest.py tests/ $ git mv cluster_tests.py tests/test_cluster.py Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 --- D tests/comparison/conftest.py A tests/comparison/tests/conftest.py R tests/comparison/tests/test_cluster.py 3 files changed, 66 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4404/5 -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Dan Hecht has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/4326/4/be/src/exec/hash-table.h File be/src/exec/hash-table.h: Line 308: uint8_t* ALWAYS_INLINE cur_expr_values() const { return cur_expr_values_; } let's add a short comment here too: /// Returns the current row's expression buffer. or something. PS4, Line 310: . for the current row. PS4, Line 400: into This one should have stayed "in" (or "of"), right? i.e. it doesn't write *expr_values. (Could make them 'const uint8_t*' to clarify that too). Line 427: uint32_t HashVariableLenRow(uint8_t* expr_values, uint8_t* expr_values_null) const; how about making these const as well. Line 441: TupleRow* build_row, uint8_t* expr_values, uint8_t* expr_values_null) const; these too could be const to clarify they are "in" params. -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Michael Brown has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: I'd like you to run the small suite of cluster tests, but you can't yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/ I realize there aren't that many, and they may not find anything, but it's a habit we need to start. Thanks. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4122: qgen: fix bitrotted cluster unit tests
Michael Brown has uploaded a new change for review. http://gerrit.cloudera.org:8080/4404 Change subject: IMPALA-4122: qgen: fix bitrotted cluster unit tests .. IMPALA-4122: qgen: fix bitrotted cluster unit tests There's a small set of pytest-style tests and associated conftest for testing some of the cluster-related test infrastructure Python objects. Going forward, I want unit tests for the query generator to be run as part of patch acceptance (CI isn't necessary at this time). This patch fixes a bitrotted test and moves the tests into the tests-for-qgen directory. The moves were performed thusly: $ git mv conftest.py tests/ $ git mv cluster_tests.py tests/test_cluster.py Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 --- R tests/comparison/tests/conftest.py R tests/comparison/tests/test_cluster.py 2 files changed, 6 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4404/4 -- To view, visit http://gerrit.cloudera.org:8080/4404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3e855e265ae245ebe3691d077284ac5761909e00 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/4326/3/be/src/exec/hash-table.h File be/src/exec/hash-table.h: PS3, Line 300: ExprValuesHash > CurExprValuesHash() [see below for motiviation] Done PS3, Line 303: SetExprValuesHash > Let's rename this to more closely match cur_expr_values() rather than match Done PS3, Line 309: cur_expr_values_null > comment explaining that there is one byte per null value (i.e. these are us Done. I'm not actually sure why we need this for codegen, since bool and uint8_t should both be represented as int8 internally in LLVM, but I don't want to get distracted digging into that, so I left a TODO. There's some weirdness with vector where it uses bitfields but otherwise C++ bools should be 8 bits - maybe that was the reason. PS3, Line 403: in > the first time I read this, I read it as this functions stores the values t Makes sense - done. PS3, Line 403: in > same Done PS3, Line 413: in > same Done PS3, Line 431: This will be replaced by : /// codegen. > nit: move this sentence to end of the comment so it's standardized with sim Done PS3, Line 438: with 'cur_expr_values_' > "to compare against the current row." (since it also uses cur_expr_values_ Done -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. Patch Set 4: Code-Review+1 Carry Michael's +1 -- To view, visit http://gerrit.cloudera.org:8080/4326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie353666dbb5c958f0094d169306fe930ec3014c5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4008: don't bake in hash table and hash join pointers
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4326 to look at the new patch set (#4). Change subject: IMPALA-4008: don't bake in hash table and hash join pointers .. IMPALA-4008: don't bake in hash table and hash join pointers This fixes some of the cases where memory addresses are baked into codegen'd code. Testing: Ran exhaustive build. Perf: Ran a local perf run. No significant changes. I was able to see some small improvements on microbenchmarks. +---+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +---+---+-++++ | TPCH(_20) | parquet / none / none | 9.07| +0.46% | 5.88 | +0.34% | +---+---+-++++ +---+--+---++-++++-+---+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +---+--+---++-++++-+---+ | TPCH(_20) | TPCH-Q2 | parquet / none / none | 2.12 | 1.89| +12.29% | * 10.85% * | * 20.30% * | 1 | 10| | TPCH(_20) | TPCH-Q13 | parquet / none / none | 9.84 | 9.34| +5.39% | 9.01%| 3.79%| 1 | 10| | TPCH(_20) | TPCH-Q17 | parquet / none / none | 14.61 | 14.19 | +2.97% | 2.15%| 1.52%| 1 | 10| | TPCH(_20) | TPCH-Q18 | parquet / none / none | 14.76 | 14.35 | +2.82% | 3.20%| 2.64%| 1 | 10| | TPCH(_20) | TPCH-Q9 | parquet / none / none | 13.72 | 13.54 | +1.30% | 1.75%| 0.70%| 1 | 10| | TPCH(_20) | TPCH-Q8 | parquet / none / none | 5.71 | 5.64| +1.30% | 1.21%| 1.23%| 1 | 10| | TPCH(_20) | TPCH-Q19 | parquet / none / none | 47.35 | 46.75 | +1.28% | 2.39%| 1.88%| 1 | 10| | TPCH(_20) | TPCH-Q5 | parquet / none / none | 4.57 | 4.52| +1.20% | 1.30%| 0.88%| 1 | 10| | TPCH(_20) | TPCH-Q16 | parquet / none / none | 2.07 | 2.05| +1.12% | 2.59%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q11 | parquet / none / none | 1.45 | 1.45| +0.15% | 2.69%| 2.06%| 1 | 10| | TPCH(_20) | TPCH-Q3 | parquet / none / none | 4.65 | 4.65| -0.09% | 2.12%| 2.17%| 1 | 10| | TPCH(_20) | TPCH-Q4 | parquet / none / none | 3.22 | 3.23| -0.26% | 1.03%| 1.33%| 1 | 10| | TPCH(_20) | TPCH-Q7 | parquet / none / none | 15.84 | 15.92 | -0.50% | 0.91%| 1.15%| 1 | 10| | TPCH(_20) | TPCH-Q14 | parquet / none / none | 3.29 | 3.31| -0.59% | 3.31%| 1.58%| 1 | 10| | TPCH(_20) | TPCH-Q22 | parquet / none / none | 2.65 | 2.67| -0.78% | 3.03%| 1.46%| 1 | 10| | TPCH(_20) | TPCH-Q15 | parquet / none / none | 4.50 | 4.55| -1.19% | 2.87%| 2.45%| 1 | 10| | TPCH(_20) | TPCH-Q20 | parquet / none / none | 3.84 | 3.91| -1.76% | 2.20%| 1.94%| 1 | 10| | TPCH(_20) | TPCH-Q10 | parquet / none / none | 5.58 | 5.70| -2.00% | 1.01%| 1.79%| 1 | 10| | TPCH(_20) | TPCH-Q21 | parquet / none / none | 22.84 | 23.42 | -2.47% | 0.68%| 0.56%| 1 | 10| | TPCH(_20) | TPCH-Q1 | parquet / none / none | 11.25 | 11.60 | -3.06% | 0.48%| 1.74%| 1 | 10| | TPCH(_20) | TPCH-Q12 | parquet / none / none | 3.81 | 3.98| -4.38% | 1.62%| 1.14%| 1 | 10| | TPCH(_20) | TPCH-Q6 | parquet / none / none | 1.94 | 2.04| -4.85% | 2.40%| 1.58%| 1 | 10| +---+--+---++-++++-+---+ ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +---
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() .. Patch Set 17: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3873: Add QueryStateAccessor
Henry Robinson has abandoned this change. Change subject: IMPALA-3873: Add QueryStateAccessor .. Abandoned No time to work on this. Will pick up when I've implemented Marcel's suggestions as a comparison. -- To view, visit http://gerrit.cloudera.org:8080/3851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Iec3dae66a81988c99cde1516ff511186e17dd8c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4111: backend death tests should not produce minidumps
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4111: backend death tests should not produce minidumps .. Patch Set 4: Somehow Lars' earlier comments got deleted. Reproducing from my email: Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4353/3/be/src/util/minidump.h File be/src/util/minidump.h: Lars Volker has posted comments on this change. Change subject: IMPALA-4111: backend death tests should not produce minidumps .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4353/3/be/src/util/minidump.h File be/src/util/minidump.h: PS3, Line 29: for death test nit: maybe remove or reword like "for example during death test" or "during tests" -- To view, visit http://gerrit.cloudera.org:8080/4353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I33037f4f77c8188fc2ec46a77083a0a64f6ea404 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4026: Implement double-buffering for BlockingQueue.
Michael Ho has posted comments on this change. Change subject: IMPALA-4026: Implement double-buffering for BlockingQueue. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4350/3/be/src/util/blocking-queue.h File be/src/util/blocking-queue.h: Line 99: bool BlockingPut(const T& val) { > I think there's a pretty bad bug with how the signalling works. Yes, I have thought about it before. I have tested various locations for notification: 1. notify_one once an entry is consumed from get_list by BlockingGet() 2. notify_all when the get_list is empty 3. notify_one when the get_list is empty Apparently, option (3) seems to give the best performance. I have investigated option (2) and it appears that the thundering herd effect causes all scanner threads to start immediately leading to contention in memory allocation (e.g. memset takes 6x longer), causing some TPCH queries to regress. I suspect the slow down in option (1) may have to do the extra signaling overhead per entry or it could be a side effect of the poor struct layout before so it may be worth retrying again. In practice, a consumer can consume a row batch faster than a scanner thread can produce it so the effect of option (3) could be that it scatters out when the scanner threads get unblocked. -- To view, visit http://gerrit.cloudera.org:8080/4350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Chen Huang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
Jim Apple has posted comments on this change. Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4318/2/community.html File community.html: Line 118: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute > Good idea! Done -- To view, visit http://gerrit.cloudera.org:8080/4318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4318 to look at the new patch set (#4). Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. .. IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. While I'm in here, remove outdated roadmap. Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 --- M bylaws.html M community.html R impala-docs.html M index.html M overview.html 5 files changed, 26 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4318/4 -- To view, visit http://gerrit.cloudera.org:8080/4318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4318 to look at the new patch set (#3). Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. .. IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. While I'm in here, remove outdated roadmap. Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 --- M bylaws.html M community.html R impala-docs.html M index.html M overview.html 5 files changed, 26 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4318/3 -- To view, visit http://gerrit.cloudera.org:8080/4318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/4011/3/tests/comparison/cluster.py File tests/comparison/cluster.py: PS3, Line 56: DEFAULT_HIVE_PORT = 11050 > This should be reverted back to 11050 to match what's already in use in the Done -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#4). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 3 files changed, 61 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/4 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py
Alex Behm has posted comments on this change. Change subject: IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal()
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-3949: Log the error message in FileSystemUtil.copyToLocal() .. Patch Set 10: Code-Review+2 Rebased Carrying Alex's +2. Thanks Alex, Matt and Brock for reviews. -- To view, visit http://gerrit.cloudera.org:8080/4125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5664a75aa837951de1d5dcc90e43bd8f313736b8 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Brock Noland Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Michael Brown has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4011/3/tests/comparison/cluster.py File tests/comparison/cluster.py: PS3, Line 56: DEFAULT_HIVE_PORT = 1 This should be reverted back to 11050 to match what's already in use in the Impala dev environment (for good or ill). -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/4011/2//COMMIT_MSG Commit Message: PS2, Line 7: IMPALA-3980: qgen: re-enable Hive as a target database > I prefer commit messages that, after the bug, speak of the code section cha Done PS2, Line 19: * Hive integration tested locally by invoking the data generator via the command: : : ./data-generator.py \ : --db-name=functional \ : --use-hive \ : --min-row-count=50 \ > While it makes the commit message have more lines, it is more readable to w Done http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cli_options.py File tests/comparison/cli_options.py: PS2, Line 150: group.add_argument('--hive-host', default=DEFAULT_HIVE_HOST, : help="The name of the host running the HS2") : group.add_argument("--hive-port", default=DEFAULT_HIVE_PORT, type=int, : help="The port of HiveServer2") > Why not derive these defaults from symbols imported from cluster.py? (The c Done http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cluster.py File tests/comparison/cluster.py: PS2, Line 166: class MiniCluster(Cluster): > This changes the MiniCluster constructor interface for all callers. It also Done PS2, Line 178: def _init_local_hadoop_conf_dir(self): > Thanks for doing the cleanups like this one. No problem! It makes the code a lot easier to run locally. http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/data_generator.py File tests/comparison/data_generator.py: PS2, Line 101: def populate_db(self, table_count, postgresql_conn=None, is_hive=False) > We don't use variable names of the format "isHive" typically, so can you ch Done -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4318/2/community.html File community.html: Line 118: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute > So, perhaps I could rename "books" to "docs", then have that page link to t Good idea! -- To view, visit http://gerrit.cloudera.org:8080/4318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#3). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 3 files changed, 61 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/3 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3491: Use unique db in test scanners.py and test aggregation.py
David Knupp has posted comments on this change. Change subject: IMPALA-3491: Use unique db in test_scanners.py and test_aggregation.py .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ided0848c138bdc1d43694a1010c48e23ee1c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
Michael Brown has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
David Knupp has uploaded a new patch set (#8). Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh We used to include a step in run-hbase.sh for calling a python script that queried Zookeeper to see if the HBase master was up. The original script was problematic, so we stopped using it during our mini-cluster HBase start up procedure. HBase start up issues continue to plague us, however. This patch reintroduces a Zookeeper check, with the following updates: - replace the original script with check-hbase-nodes.py - query the correct node /hbase/master, not just /hbase/rs - use the python Zookeeper library kazoo, rather than calling out to the shell and parsing the return string - since we are moving toward testing on a remote cluster, also add the capability to pass in the address for the host that provides the Zookeeper and HBase services - add an additional check that the HDFS service is running, because of an edge case where the HBase master can briefly start without a cluster running. In addition to the expected tests, this script was also tested under the conditions of IMPALA-4088, whereby the HBase RegionServer is running, but the master fails because another listening process has already taken its TCP port (60010) during startup. Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 --- M infra/python/deps/requirements.txt A testdata/bin/check-hbase-nodes.py M testdata/bin/run-hbase.sh D testdata/bin/wait-for-hbase-master.py 4 files changed, 173 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/8 -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
David Knupp has uploaded a new patch set (#7). Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh We used to include a step in run-hbase.sh for calling a python script that queried Zookeeper to see if the HBase master was up. The original script was problematic, so we stopped using it during our mini-cluster HBase start up procedure. HBase start up issues continue to plague us, however. This patch reintroduces a Zookeeper check, with the following updates: - replace the original script with check-hbase-nodes.py - query the correct node /hbase/master, not just /hbase/rs - use the python Zookeeper library kazoo, rather than calling out to the shell and parsing the return string - since we are moving toward testing on a remote cluster, also add the capability to pass in the address for the host that provides the Zookeeper and HBase services - add an additional check that the HDFS service is running, because of an edge case where the HBase master can briefly start without a cluster running. In addition to the expected tests, this script was also tested under the conditions of IMPALA-4088, whereby the HBase RegionServer is running, but the master fails because another listening process has already taken its TCP port (60010) during startup. Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 --- M infra/python/deps/requirements.txt A testdata/bin/check-hbase-nodes.py M testdata/bin/run-hbase.sh D testdata/bin/wait-for-hbase-master.py 4 files changed, 173 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/7 -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
David Knupp has posted comments on this change. Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/4348/6/testdata/bin/check-hbase-nodes.py File testdata/bin/check-hbase-nodes.py: PS6, Line 59: help=('Comma-delineated string of hosts in host:PORT format, ' > When is more than one zk host needed? Don't know, but that's the way the API is written. PS6, Line 135: # Confirm that HDFS is available. There is a pathological case where the HBase master : # can start up briefly if HDFS is not available, and then quit immediately, but that can : # be long enough to give a false positive that the HBase master is running. : # : try: : hdfs_client = InsecureClient('http://' + args.hdfs_host) : hdfs_client.list('/') : except requests.exceptions.ConnectionError as e: : msg = 'Could not connect to HDFS web host http://{0} - {1}'.format(args.hdfs_host, e) : LOGGER.error(msg) : sys.exit(1) : : errors = check_znodes_list_for_errors(args.nodes, args.zookeeper_hosts, args.timeout) > I meant for this section also to be a method, with errors returned. Done -- To view, visit http://gerrit.cloudera.org:8080/4348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4026: Implement double-buffering for BlockingQueue.
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4026: Implement double-buffering for BlockingQueue. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4350/3/be/src/util/blocking-queue.h File be/src/util/blocking-queue.h: Line 99: bool BlockingPut(const T& val) { I think there's a pretty bad bug with how the signalling works. Consider a queue with a limit of N. If the queue gets into a state where get_list has N elements and put_list has 0 elements with all of the producer threads waiting on it, then the caller to BlockingGet() will drain get_list before signalling put_cv, even though it should wake up a producer thread once there is space in the queue. Worse, it will only wake up one producer thread each time BlockingGet() is called, which means that one only producer thread can run at a time. -- To view, visit http://gerrit.cloudera.org:8080/4350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib9f4cf351455efefb0f3bb791cf9bc82d1421d54 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Chen Huang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: Re-enable Hive as a target database for the Random Query Generator
Michael Brown has posted comments on this change. Change subject: IMPALA-3980: Re-enable Hive as a target database for the Random Query Generator .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/4011/2//COMMIT_MSG Commit Message: PS2, Line 7: IMPALA-3980: Re-enable Hive as a target database for the Random Query Generator I prefer commit messages that, after the bug, speak of the code section changed. So something like: "IMPALA-3890: qgen: reenable Hive as a target database" PS2, Line 19: * Hive integration tested locally by invoking the data generator (./data-generator.py : --db-name=functional --use-hive --min-row-count=50 --max-row-count=100 : --storage-file-formats textfile --use-postgresql --postgresql-user stakiar) and the : discrepancy checker (./discrepancy-checker.py --db-name=functional --use-hive : --use-postgresql --postgresql-user stakiar --test-db-type HIVE --timeout 300 : --query-count 50 --profile hive) While it makes the commit message have more lines, it is more readable to write the commands with breaks across options, like this: ./data_generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cli_options.py File tests/comparison/cli_options.py: PS2, Line 150: group.add_argument('--hive-host', default='localhost', : help="The name of the host running the HS2") : group.add_argument("--hive-port", default=1, type=int, : help="The port of HiveServer2") Why not derive these defaults from symbols imported from cluster.py? (The constants I've asked you to create in that file) http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cluster.py File tests/comparison/cluster.py: PS2, Line 166: def __init__(self, hive_host, hive_port, num_impalads=3): This changes the MiniCluster constructor interface for all callers. It also implicitly changes the cli_options.create_cluster() call, which has several callers. What do you think about defining 127.0.0.1 and 11050 (from the old code L179) as default hive host and port as constants at the top of this module? Then make hive_host and hive_port on this line optional, defaulting to those constants. PS2, Line 178: node_conf_dir = os.environ["HADOOP_CONF_DIR"] Thanks for doing the cleanups like this one. http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/data_generator.py File tests/comparison/data_generator.py: PS2, Line 101: def populate_db(self, table_count, postgresql_conn=None, isHive=False): We don't use variable names of the format "isHive" typically, so can you change this to is_hive? -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() .. Patch Set 16: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/190/ -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala.
Jim Apple has posted comments on this change. Change subject: IMPALA-4068: Focus pages on Apache Impala, not Cloudera Impala. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4318/2/community.html File community.html: Line 118: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala";>Contribute > So, perhaps I could rename "books" to "docs", then have that page link to t Any more thoughts on this, Henry? -- To view, visit http://gerrit.cloudera.org:8080/4318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0492bff6306089f0745bbff46f070cf3ad2d9c82 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()
Hello Lars Volker, Matthew Jacobs, Internal Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4094 to look at the new patch set (#16). Change subject: IMPALA-3973: add position and occurrence to instr() .. IMPALA-3973: add position and occurrence to instr() Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd --- M be/src/experiments/CMakeLists.txt R be/src/experiments/string-search-sse-test.cc M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc M be/src/exprs/string-functions.h M be/src/runtime/CMakeLists.txt A be/src/runtime/string-search-test.cc M be/src/runtime/string-search.h M common/function-registry/impala_functions.py 9 files changed, 343 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4094/16 -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zoltan Ivanfi