[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14748 ) Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5252/ -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 20 Nov 2019 07:57:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14611 ) Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5080/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 20 Nov 2019 07:34:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9110: Add table loading time break-down metrics for HdfsTable
Hello Quanlong Huang, Yongzhi Chen, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14611 to look at the new patch set (#6). Change subject: IMPALA-9110: Add table loading time break-down metrics for HdfsTable .. IMPALA-9110: Add table loading time break-down metrics for HdfsTable A. Problem: Catalog table loading currently only records the total loading time. We will need some break-down times, i.e. more detailed time recording on each loading function. Also, the table schema loading is not taken into account for load-duration. We will need to add some more metrics for that. B. Solution: - We added "hms-load-tbl-schema", "load-duration.all-column-stats", "load-duration.all-partitions.total-time", "load-duration.all-partitions.file-metadata". Also, we logged the loadValidWriteIdList() time. So now we have a more detailed breakdown time for table loading info. The table loading time metrics for HDFS tables are in the following hierarchy: - Table Schema Loading - Table Metadata Loading - total time - all column stats loading time - ValidWriteIds loading time - all partitions loading time - total time - file metadata loading time - storage-metadata-loading-time(standalone metric) 1. Table Schema Loading: * Meaning: The time for HMS to fetch table object and the real schema loading time. Normally, the code path is "msClient.getHiveClient().getTable(dbName, tblName)" * Metric : hms-load-tbl-schema 2. Table Metadata Loading -- total time * Meaning: The time to load all the table metadata. The code path is load() function in HdfsTable.load() function. * Metric: load-duration.total-time 2.1 Table Metadata Loading -- all column stats * Meaning: load all column stats, this is part of table metadata loading The code path is HdfsTable.loadAllColumnStats() * Metric: load-duration.all-column-stats 2.2 Table Metadata Loading -- loadValidWriteIdList * Meaning: fetch ValidWriteIds from HMS The code path is HdfsTable.loadValidWriteIdList() * Metric: no metric recorded for this one. Instead, a debug log is generated. 2.3 Table Metadata Loading -- storage metadata loading(standalone metric) * Meaning: Storage related to file system operations during metadata loading.(The amount of time spent loading metadata from the underlying storage layer.) * Metric: we rename it to load-duration.storage-metadata. This is a metric introduced by IMPALA-7322 2.4 Table Metadata Loading -- load all partitions * Meaning: Load all partitions time, including fetching all partitions from HMS and loading all partitions. The code path is MetaStoreUtil.fetchAllPartitions() and HdfsTable.loadAllPartitions() * Metric: load-duration.all-partitions 2.4.1 Table Metadata Loading -- load all partitions -- load file metadata * Meaning: The file metadata loading for all all partitions. (This is part of 2.4). Code path: loadFileMetadataForPartitions() inside loadAllPartitions() * Metric: load-duration.all-partitions.file-metadata C. Extra thing in this commit: 1. Add PrintUtils.printTimeNs for PrettyPrint time in FrontEnd 2. Add explanation for table loading manager D. Test: 1. Add Unit tests for PrintUtils.printTime() function 2. Manual describe table and verify the table loading metrics are correct. Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/common/PrintUtils.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/util/PrintUtilsTest.java 10 files changed, 188 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14611/6 -- To view, visit http://gerrit.cloudera.org:8080/14611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10 Gerrit-Change-Number: 14611 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Oh I meant something simpler - same option values, but a plan that has an e Ah, sure will add that. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 05:40:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Good suggestion. Just want to clarify..the first negative test case can be Oh I meant something simpler - same option values, but a plan that has an estimate of < broadcast_bytes_limit so we're testing both sides of the boundary. We kinda get that coverage from all the other planner tests but it'd be nice to have it explicitly tested here. It should have read "Can you also add a negative test where it is below the threshold and stays as a broadcast join." Sorry about that. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 05:20:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5253/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 20 Nov 2019 04:44:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9090: Add name of table being scanned in scan node profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14660 ) Change subject: IMPALA-9090: Add name of table being scanned in scan node profile .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5079/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 Gerrit-Change-Number: 14660 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 20 Nov 2019 03:27:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9092: Add support for creating external Kudu table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14750 ) Change subject: IMPALA-9092: Add support for creating external Kudu table .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5078/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76f81d41db0cf2269ee1b365857164a43677e14d Gerrit-Change-Number: 14750 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 20 Nov 2019 02:51:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9090: Add name of table being scanned in scan node profile
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14660 ) Change subject: IMPALA-9090: Add name of table being scanned in scan node profile .. IMPALA-9090: Add name of table being scanned in scan node profile Before this change, the only way to figure out the table being scanned by a scan node in the profile is to pull the string out of the explain plan or execsummary. This is awkward, both for manual and automated analysis of the profiles. We should include the table name as a string in the SCAN_NODE implementation. After this change, a new line "Table Name: database.table" will be added in first line of scan node profile. Also fix a bug that frontend pass hbase and kudu table name incorrectly to thrift. Before this change, native name of hbase and kudu table are passed in and there is no way to get hms table name from TableDescriptor in backend. After this change, for HBaseTableDescriptor and KuduTableDescriptor, function name() would return hms table name, function table_name() would return hbase or kudu native table name. Manually tested on mini-cluster with: 1. hdfs and s3 table with file format text and parquet. 2. hbase table. 3. kudu table. Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 --- M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-table-writer.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/kudu-scan-node-base.cc M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java 6 files changed, 8 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/14660/4 -- To view, visit http://gerrit.cloudera.org:8080/14660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 Gerrit-Change-Number: 14660 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Can you also add a negative test where it is below the threshold and doesn' Good suggestion. Just want to clarify..the first negative test case can be something like setting the 'mem_limit' to a small number but leave the broadcast_bytes_limit as default (or sufficiently high). In this case, the planner should not pick broadcast. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 02:16:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9092: Add support for creating external Kudu table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14750 ) Change subject: IMPALA-9092: Add support for creating external Kudu table .. Patch Set 1: (43 comments) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@397 PS1, Line 397: private void analyzeSynchronizedKuduTableParams(Analyzer analyzer) throws AnalysisException { line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@441 PS1, Line 441: private void analyzeSynchronizedKuduTableName(Analyzer analyzer) throws AnalysisException { line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2239 PS1, Line 2239: !KuduTable.isSynchronizedTable(newTable) || !isKuduHmsIntegrationEnabled(newTable); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java: http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@44 PS1, Line 44: return super.AnalyzesOk(appendSynchronizedTblProps(stmt, isSynchronizedTbl), errorStr); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@65 PS1, Line 65: super.AnalysisError(appendSynchronizedTblProps(stmt, isSynchronizedTbl), expectedError); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@97 PS1, Line 97: "partition value = (cast (30 as int), factorial(5))) stored as kudu", isSynchronizedTable); line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@171 PS1, Line 171: "partitioning columns: (2 vs 1). Range partition: 'PARTITION VALUE = (1, 2)'", isSynchronizedTable); line too long (112 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@196 PS1, Line 196: "(partition value = false, partition value = true) stored as kudu", isSynchronizedTable); line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@212 PS1, Line 212: "(PARTITION VALUE = 'abc')' is not a key column. Only key columns can be used " + line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@238 PS1, Line 238: "Unpartitioned Kudu tables are inefficient for large data sizes.", isSynchronizedTable); line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@242 PS1, Line 242: "Table property 'kudu.num_tablet_replicas' must be an integer.", isSynchronizedTable); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@252 PS1, Line 252: "Unpartitioned Kudu tables are inefficient for large data sizes.", isSynchronizedTable); line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@265 PS1, Line 265: "partition by range(a, b) (partition (0, 0) < values <= (1, 1)) stored as kudu", isSynchronizedTable); line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@273 PS1, Line 273: "partitioning columns: (1 vs 2). Range partition: 'PARTITION 0 < VALUES <= 1'", isSynchronizedTable); line too long (113 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@315 PS1, Line 315: AnalysisError(createTblStr, "Primary key columns cannot be nullable", isSynchronizedTable); line too long (107 > 90) http://gerrit.cloudera.org:8080/#/c/14750/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@355 PS1, Line 355: "algorithms are: " + Joiner.on(", ").join(CompressionAlgorithm.values()), isSynchronizedTable); line too long (107 > 90)
[Impala-ASF-CR] IMPALA-9092: Add support for creating external Kudu table
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14750 Change subject: IMPALA-9092: Add support for creating external Kudu table .. IMPALA-9092: Add support for creating external Kudu table In HMS-3 the translation layer converts a managed kudu table into a external kudu table and adds additional table property 'external.table.purge' to 'true'. This means any installation which is using HMS-3 (or a Hive version which has HIVE-22158) will always create Kudu tables as external tables. This is problematic since the the output of show create table will now be different and may confuse the users. In order to improve the user experience of such synchronized tables (external tables with external.table.purge property set to true), this patch adds support in Impala to create external Kudu tables. Previous versions of Impala disallowed creating a external Kudu table if the Kudu table did not exist. After this patch, Impala will check if the Kudu table exists and if it does not it will create a Kudu table based on the schema provided in the create table statement. However, this applies to only the synchronized tables. Previous way to create a pure external table behaves the same. Following syntax of creating a synchronized table is now allowed: CREATE EXTERNAL TABLE foo ( id int PRIMARY KEY, name string) PARTITION BY HASH PARTITIONS 8 STORED AS KUDU TBLPROPERTIES ('external.table.purge'='true') The syntax is very similar to creating a managed table, except for the EXTERNAL keyword and additional table property. A synchronized table will behave similar to managed Kudu tables (drops and renames are allowed). The output of show create table on a synchronized table will display the full column and partition spec similar to the managed tables. Testing: 1. After the CDP version bump all of the existing Kudu tables now create synchronized tables so there is good coverage there. 2. Added additional tests which create synchronized tables and compares the show create table output. Change-Id: I76f81d41db0cf2269ee1b365857164a43677e14d --- M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java M fe/src/test/java/org/apache/impala/common/FrontendFixture.java M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/common/skip.py M tests/metadata/test_show_create_table.py M tests/query_test/test_kudu.py 15 files changed, 703 insertions(+), 256 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/14750/1 -- To view, visit http://gerrit.cloudera.org:8080/14750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I76f81d41db0cf2269ee1b365857164a43677e14d Gerrit-Change-Number: 14750 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5251/ -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 20 Nov 2019 01:30:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14720 ) Change subject: IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server. .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5077/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8942dfbbd4a3be244eed1c61ac2ce17069960477 Gerrit-Change-Number: 14720 Gerrit-PatchSet: 8 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 20 Nov 2019 01:26:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/14720 ) Change subject: IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server. .. Patch Set 8: (13 comments) The developer docs page says clang can be used only with the C++ part of the code. I was manually doing this and still miss a few code style errors. Clang documentation says it can be used for Java too. Maybe I should try and modify our clang tool to work with Java. Thank you for catching these errors. http://gerrit.cloudera.org:8080/#/c/14720/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14720/7//COMMIT_MSG@38 PS7, Line 38: This can be : tracked in IMPAL > Is there a JIRA for it? This was first mentioned in https://gerrit.cloudera.org/#/c/14106/. I could not find a JIRA, so I filed IMPALA-9172. Updated the commit message. http://gerrit.cloudera.org:8080/#/c/14720/7/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/14720/7/common/thrift/Frontend.thrift@576 PS7, Line 576: GET_CROSS_REFERENC > GET_CROSS_REFERENCE Done http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/analysis/TableDef.java File fe/src/main/java/org/apache/impala/analysis/TableDef.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/analysis/TableDef.java@547 PS7, Line 547: i > nit: extra whitespace Done http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/FeTable.java File fe/src/main/java/org/apache/impala/catalog/FeTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/FeTable.java@99 PS7, Line 99: > I don't think this is necessary, here and elsewhere This is obsolete. I reverted the NotImplementedException change. http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@56 PS7, Line 56: import org.apache.impala.thrift.CatalogObjectsConstants; > I don't think you need this Done http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@216 PS7, Line 216: : > nit: you can get rid of the '+' by putting this all on the second line Not needed anymore since I reverted the NotImplementedException change. http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/Frontend.java@907 PS7, Line 907: blic List getDbs(PatternMatcher matcher, User user) : throws InternalException { : List dbs = getCatalog().getDbs(matcher); : // If authorization is enabled > My thinking in suggesting the NotImplementedException was that we would bub I see your point. Since the IMPALA-9158 patch is in review, depending on which patch lands first, I will either create a cleanup patch or modify this patch to clean things. If it is okay with you, to avoid changing the tests more than once, I will revert to the earlier behavior of returning an empty PK/FK list in LocalCatlogMode. http://gerrit.cloudera.org:8080/#/c/14720/5/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/14720/5/fe/src/main/java/org/apache/impala/service/MetadataOp.java@736 PS5, Line 736: eName == null) { > Good question - I just know if it as our usual convention, eg. if you look Thanks for the explanation. I reverted it to use the "diamond" style. http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java@650 PS7, Line 650: if (LOG.isTraceEnabled()) { : LOG.trace("Returning {} primary k > formatting Done http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java@676 PS7, Line 676: tableM > formatting Done http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java@720 PS7, Line 720: } : if (LOG.isTraceEnabled()) { > formatting Done
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread Modifies HdfsFileReader so that it calls hdfsPreadFully instead of hdfsPread. hdfsPreadFully is a new libhdfs API introduced by HDFS-14564 (Add libhdfs APIs for readFully; add readFully to ByteBufferPositionedReadable). hdfsPreadFully improves performance of preads, especially when reading data from S3. The major difference between hdfsPread and hdfsPreadFully is that hdfsPreadFully is guaranteed to read all the requested bytes, whereas hdfsPread is only guaranteed to read up to the number of requested bytes. hdfsPreadFully reduces the amount of JNI array allocations necessary when reading data from S3. When any read method in libhdfs is called, the method allocates an array whose size is equal to the amount of data requested. The issue is that Java's InputStream#read only guarantees that it will read up to the amount of data requested. This can lead to issues where a libhdfs read request allocates a large Java array, even though the read request only partially fills it up. PositionedReadable#readFully on the other hand, guarantees that all requested data will be read, thus preventing any unnecessary JNI array allocations. hdfsPreadFully improves the effectiveness of fs.s3a.experimental.input.fadvise=RANDOM (HADOOP-13203). S3A recommends setting fadvise=RANDOM when doing random reads, which is common in Impala when reading Parquet or ORC files. fadvise=RANDOM causes the HTTP GET request that reads the S3 data to simply request the data bounded by the parameters of the current read request (e.g. for 'read(long position, ..., int length)' it requests 'length' bytes). The chunk-size optimization in HdfsFileReader hurts performance when fadvise=RANDOM because each HTTP GET request will only request 'chunk-size' amount of bytes at a time. Which is why this patch removes the chunk-size optimization as well. hdfsPreadFully helps here because all the data in the scan range will be requested by a single HTTP GET request. Since hdfsPreadFully improves S3 read performance, this patch enables preads for S3A files by default. Even if fadvise=SEQUENTIAL, hdfsPreadFully still improves performance since it avoids unnecessary JNI allocation overhead. The chunk-size optimization (added in https://gerrit.cloudera.org/#/c/63/) is no longer necessary after this patch. hdfsPreadFully prevents any unnecessary array allocations. Furthermore, it is likely the chunk-size optimization was added due to overhead fixed by HDFS-14285. Fixes a bug in IMPALA-8884 where the 'impala-server.io-mgr.queue-$i.read-size' statistics were being updated with the chunk-size passed to HdfsFileReader::ReadFromPosInternal, which is not necessarily equivalent to the amount of data actually read. Testing: * Ran core tests * Ran core tests on S3 * Ad-hoc functional and performance testing on ABFS; no perf regression observed; planning to further investigate the interaction between hdfsPreadFully + ABFS in a future JIRA Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Reviewed-on: http://gerrit.cloudera.org:8080/14635 Reviewed-by: Sahil Takiar Tested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc M be/src/runtime/io/hdfs-file-reader.cc M be/src/runtime/io/hdfs-file-reader.h M be/src/runtime/io/local-file-reader.cc M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc 6 files changed, 20 insertions(+), 52 deletions(-) Approvals: Sahil Takiar: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 8 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 00:44:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server.
Anurag Mantripragada has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/14720 ) Change subject: IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server. .. IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server. The goal is to let JDBC clients get constraint information from Impala tables. We implement two new metadata operations in impala-hs2-server, GetPrimaryKeys and GetCrossReference, whose thrift definitions are taken from Hive. For these to work, new TCLIService methods were added. In FE, these two operations are implemented to get the information from tables in the catalog. Much like GetColumns(), tables need to be loaded in order to be able to get PK/FK information. We wait for the PK table/FK table to load. In the implementation, PK/FK information is returned ONLY if the user has access to ALL the columns involved in the PK/FK relationship. Testing: - Added three test tables to our test datasets since most of our FE tests relied on dummy tables or testdata. It was difficult to test PK/FK with these methods. Also, we can build on this testdata in future when we make optimizer improvements. - Added unit tests in AuthorizationTest and JDBCtest. - Added e2e test in test_hs2.py Caveats: - Ranger needs OWNER user information for authorization. Since this is HMS metadata that we do not aggresively load, this information is not available for IncompleteTables. Some foreign key tables (fact tables for example) might have FK/PK relationships with several PK tables some of which might not be loaded in catalog. Currently we have no way to check column previleges without owner user information tables. We do not return keys involving such columns. Therefore, when Ranger is used, there maybe missing PK/FK relationships for parent tables that are not loaded. This can be tracked in IMPALA-9172. - Retrieval of constraints is not yet supported in LocalCatalog mode. See IMPALA-9158. Change-Id: I8942dfbbd4a3be244eed1c61ac2ce17069960477 --- M be/src/rpc/kerberos-test.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M bin/rat_exclude_files.txt M common/thrift/Frontend.thrift M common/thrift/hive-1-api/TCLIService.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTest.java M fe/src/test/java/org/apache/impala/service/JdbcTest.java M testdata/data/README A testdata/data/child_table.txt A testdata/data/parent_table.txt A testdata/data/parent_table_2.txt M testdata/datasets/functional/functional_schema_template.sql M tests/hs2/test_hs2.py 24 files changed, 803 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/14720/8 -- To view, visit http://gerrit.cloudera.org:8080/14720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8942dfbbd4a3be244eed1c61ac2ce17069960477 Gerrit-Change-Number: 14720 Gerrit-PatchSet: 8 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8863: Add support to run tests over HTTP/HS2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14059 ) Change subject: IMPALA-8863: Add support to run tests over HTTP/HS2 .. Patch Set 30: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5249/ -- To view, visit http://gerrit.cloudera.org:8080/14059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Gerrit-Change-Number: 14059 Gerrit-PatchSet: 30 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 00:22:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. IMPALA-9134: pipeline data stream sender flush This changes the logic so that the last TransmitData and EndDataStream RPCs can be sent to receivers in parallel, instead of them being serialized in FlushAndSendEos(). This was a problem because each call to FlushAndSendEos() would wait for one or two network round-trips, so previously the flush could easily take sum(2 * RTT). After this change the sending and waiting for RPCs is pipeline so the time taken should be closer to 2 * RTT. Testing: This code is exercised by existing tests. Ran exhaustive tests. Perf: No measurable change on local TPC-H (where network latency isn't a significant factor). +--+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +--+---+-++++ | TPCH(30) | parquet / none / none | 7.64| -1.37% | 5.00 | +0.29% | +--+---+-++++ +--+--+---++-++++---++-+---+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval | +--+--+---++-++++---++-+---+ | TPCH(30) | TPCH-Q14 | parquet / none / none | 3.75 | 3.39| +10.48% | * 19.43% * | 8.88%| 7 | +7.26% | 0.48| 1.19 | | TPCH(30) | TPCH-Q13 | parquet / none / none | 13.55 | 12.78 | +6.05% | 5.43%| 4.75%| 7 | +5.09% | 1.76| 2.14 | | TPCH(30) | TPCH-Q2 | parquet / none / none | 1.30 | 1.26| +3.63% | 5.53%| 3.27%| 7 | +3.93% | 1.28| 1.46 | | TPCH(30) | TPCH-Q12 | parquet / none / none | 2.65 | 2.56| +3.44% | 4.23%| 2.92%| 7 | +3.68% | 1.28| 1.73 | | TPCH(30) | TPCH-Q15 | parquet / none / none | 3.80 | 3.76| +0.91% | 0.81%| 1.80%| 7 | +1.42% | 1.12| 1.22 | | TPCH(30) | TPCH-Q8 | parquet / none / none | 5.23 | 5.13| +2.11% | 5.41%| 0.48%| 7 | +0.04% | 0.16| 1.01 | | TPCH(30) | TPCH-Q10 | parquet / none / none | 5.77 | 5.62| +2.70% | 9.84%| 2.79%| 7 | -0.80% | -0.96 | 0.68 | | TPCH(30) | TPCH-Q21 | parquet / none / none | 28.82 | 28.62 | +0.70% | 0.59%| 0.37%| 7 | +0.68% | 1.92| 2.67 | | TPCH(30) | TPCH-Q6 | parquet / none / none | 1.32 | 1.31| +0.82% | 0.36%| 1.53%| 7 | +0.23% | 1.28| 1.38 | | TPCH(30) | TPCH-Q11 | parquet / none / none | 1.30 | 1.29| +0.60% | 5.11%| 2.94%| 7 | +0.14% | 0.16| 0.27 | | TPCH(30) | TPCH-Q19 | parquet / none / none | 4.52 | 4.49| +0.57% | 2.98%| 1.58%| 7 | +0.12% | 0.32| 0.44 | | TPCH(30) | TPCH-Q16 | parquet / none / none | 2.78 | 2.77| +0.35% | 3.21%| 2.02%| 7 | +0.17% | 0.64| 0.25 | | TPCH(30) | TPCH-Q3 | parquet / none / none | 5.65 | 5.65| -0.04% | 1.62%| 1.91%| 7 | -0.06% | -0.32 | -0.04 | | TPCH(30) | TPCH-Q1 | parquet / none / none | 10.88 | 10.89 | -0.11% | 0.76%| 0.90%| 7 | +0.01% | 0.00| -0.26 | | TPCH(30) | TPCH-Q4 | parquet / none / none | 2.18 | 2.18| -0.38% | 1.33%| 2.48%| 7 | -0.04% | -0.16 | -0.36 | | TPCH(30) | TPCH-Q5 | parquet / none / none | 3.94 | 3.95| -0.27% | 2.37%| 1.33%| 7 | -0.26% | -1.12 | -0.26 | | TPCH(30) | TPCH-Q7 | parquet / none / none | 22.37 | 22.41 | -0.19% | 2.94%| 1.92%| 7 | -0.66% | -0.80 | -0.15 | | TPCH(30) | TPCH-Q22 | parquet / none / none | 3.68 | 3.72| -1.14% | 4.95%| 3.99%| 7 | +0.13% | 0.00| -0.48 | | TPCH(30) | TPCH-Q20 | parquet / none / none | 2.83 | 2.86| -1.21% | 2.69%| 2.20%| 7 | -1.60% | -0.64 | -0.93 | | TPCH(30) | TPCH-Q17 | parquet / none / none | 4.92 | 5.04| -2.46% | 3.11%| 2.63%| 7 | -2.00% | -1.44 | -1.62 | |
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 23:41:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5076/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 23:28:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5075/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 22:50:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Tim Armstrong has restored this change. ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. Restored -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: restore Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14716 to look at the new patch set (#5). Change subject: IMPALA-9126: part 2: no hj probe structures in build .. IMPALA-9126: part 2: no hj probe structures in build This is actually independent of part 1, and can be merged ahead of it if needed. This cleans a up a bit of tech debt, where the hash join builder allocated probe-side streams. This was implemented before we had reliable memory reservations. Now we can simply transfer reservation. The reason things are this way is because the separation of PhjBuilder from PartitionedHashJoinNode (IMPALA-3567) happened before we switched to the new BufferPool (IMPALA-4674). It wasn't possible to reliably transfer reservations, instead the workaround of allocating and transferring probe streams was necessary. After this change, PartitionedHashJoinBuilder does not explicitly touch any probe-side data structures. There is still some implicit sharing of things like the buffer pool client, which is expected as long as the builder belongs to the ExecNode. Testing: Ran exhaustive tests. We should already have adequate coverage for spilling and non-spilling hash joins. Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d --- M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h 6 files changed, 176 insertions(+), 131 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/14716/5 -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14716/4/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14716/4/be/src/exec/partitioned-hash-join-builder.cc@414 PS4, Line 414: bool need_read_buffer = ht_ctx_->level() > 0; I thought this was a bug, because we need a read buffer when we're probing a single spilled partition, even if that partition's level is 0. But actually it works for a non-obvious reason - we don't currently call this function when we're probing a single spilled partition. Changed the code around so this is more explicit. -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 22:44:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/14741/2/bin/run-all-tests.sh File bin/run-all-tests.sh: http://gerrit.cloudera.org:8080/#/c/14741/2/bin/run-all-tests.sh@125 PS2, Line 125: "run-all-tests.sh "$0"? http://gerrit.cloudera.org:8080/#/c/14741/2/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/14741/2/testdata/bin/create-load-data.sh@118 PS2, Line 118: create-load-data.sh "$0"? -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 22:41:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14632 ) Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5074/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 22:38:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14632 ) Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5073/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 22:37:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. Abandoned Noticed a bug - will abandon for now. -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14748 ) Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5072/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 22:32:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 2: no hj probe structures in build
Tim Armstrong has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14716 ) Change subject: IMPALA-9126: part 2: no hj probe structures in build .. IMPALA-9126: part 2: no hj probe structures in build This is actually independent of part 1, and can be merged ahead of it if needed. This cleans a up a bit of tech debt, where the hash join builder allocated probe-side streams. This was implemented before we had reliable memory reservations. Now we can simply transfer reservation. The reason things are this way is because the separation of PhjBuilder from PartitionedHashJoinNode (IMPALA-3567) happened before we switched to the new BufferPool (IMPALA-4674). It wasn't possible to reliably transfer reservations, instead the workaround of allocating and transferring probe streams was necessary. After this change, PartitionedHashJoinBuilder does not explicitly touch any probe-side data structures. There is still some implicit sharing of things like the buffer pool client, which is expected as long as the builder belongs to the ExecNode. Testing: Ran exhaustive tests. We should already have adequate coverage for spilling and non-spilling hash joins. Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d --- M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h 6 files changed, 175 insertions(+), 131 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/14716/4 -- To view, visit http://gerrit.cloudera.org:8080/14716 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0065f7f44f44f02b7616b1f694178ca42341c42d Gerrit-Change-Number: 14716 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-8867: Further deflake test auto scaling
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14740 ) Change subject: IMPALA-8867: Further deflake test_auto_scaling .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/14740/1/tests/custom_cluster/test_auto_scaling.py File tests/custom_cluster/test_auto_scaling.py: http://gerrit.cloudera.org:8080/#/c/14740/1/tests/custom_cluster/test_auto_scaling.py@104 PS1, Line 104: min_query_rate = 1.1 * EXECUTOR_SLOTS I looked through the failures for this, and I've seen a couple failures on an older Centos 6 machine that was slightly below this: custom_cluster/test_auto_scaling.py:116: in test_single_workload assert max_query_rate >= min_query_rate, "Query rate did not reach %s within %s " \ E AssertionError: Query rate did not reach 3.6 within 45 s. Maximum was 3.2. Cluster size is 5. E assert 3.2002 >= 3.5996 That's the lowest I saw it go in the failures I reviewed. (Most of the current failures are above 3.3 and your fix would be good.) Would a 1.05 multiplier be ok? -- To view, visit http://gerrit.cloudera.org:8080/14740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29808982cc6226152c544cb99f76961b582975a7 Gerrit-Change-Number: 14740 Gerrit-PatchSet: 1 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 22:05:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14748 ) Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 21:57:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14748 ) Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5252/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 21:57:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Hello Thomas Tauber-Marshall, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14632 to look at the new patch set (#6). Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. IMPALA-9126: part 1: hash join build partition cleanup Clarify some invariants (e.g. which join modes actually require build partition data to be attached to row batches). Move some logic for maintenance of the hash partitions to the builder. Testing: Ran exhaustive tests. We should already have adequate coverage for spilling and non-spilling hash joins. Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h A be/src/exec/join-op.h M be/src/exec/nested-loop-join-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h 8 files changed, 166 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/14632/6 -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14632 ) Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@119 PS4, Line 119: > nit: formatting Done http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@124 PS4, Line 124: // We don't need to pass in a batch because the anti-join only returns tuple data > Is it possible to DCHECK for this? I guess it would be null_aware_partition I don't think there's a direct way to assert on it. It follows from the join node not including this tuple in its output row descriptor. I added that fact to the comment. Seems like an improvement since the reader can verify that fact (e.g. see the DCHECK in BlockingJoinNode::Prepare()). http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@163 PS4, Line 163: /// Needs to be log2(PARTITION_FANOUT). > Might make sense to move this and MAX_PARTITION_DEPTH up too, to keep the r good point http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.cc@483 PS4, Line 483: stream->Close(nullptr, RowBatch::FlushMode::NO_FLUSH_RESOURCES); > Maybe comment/DCHECK why we don't need to pass row_batch here Done http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-node.cc@339 PS4, Line 339: if (state_ == PROBING_SPILLED_PARTITION && NeedToProcessUnmatchedBuildRows(join_op_)) { > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-node.cc@1154 PS4, Line 1154: hash_tbl_iterator_ = output_build_partitions_.front()->hash_tbl()->FirstUnmatched(ht_ctx_.get()); > line too long (101 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 21:55:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Hello Thomas Tauber-Marshall, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14632 to look at the new patch set (#5). Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. IMPALA-9126: part 1: hash join build partition cleanup Clarify some invariants (e.g. which join modes actually require build partition data to be attached to row batches). Move some logic for maintenance of the hash partitions to the builder. Testing: Ran exhaustive tests. We should already have adequate coverage for spilling and non-spilling hash joins. Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h A be/src/exec/join-op.h M be/src/exec/nested-loop-join-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h 8 files changed, 164 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/14632/5 -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14632 ) Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/14632/5/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14632/5/be/src/exec/partitioned-hash-join-node.cc@339 PS5, Line 339: if (state_ == PROBING_SPILLED_PARTITION && NeedToProcessUnmatchedBuildRows(join_op_)) { line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14632/5/be/src/exec/partitioned-hash-join-node.cc@1154 PS5, Line 1154: hash_tbl_iterator_ = output_build_partitions_.front()->hash_tbl()->FirstUnmatched(ht_ctx_.get()); line too long (101 > 90) -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 21:54:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14748 ) Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. Patch Set 1: Code-Review+2 Thanks for the quick revert -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 21:50:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14748 Change subject: IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" .. IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1" This reverts commit 7b135949d94eb1e530a525e6a0ea264965762dd4. Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 --- M infra/python/deps/compiled-requirements.txt 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/14748/1 -- To view, visit http://gerrit.cloudera.org:8080/14748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899 Gerrit-Change-Number: 14748 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5071/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 20:23:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9126: part 1: hash join build partition cleanup
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14632 ) Change subject: IMPALA-9126: part 1: hash join build partition cleanup .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@119 PS4, Line 119: nit: formatting http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@124 PS4, Line 124: // We don't need to pass in a batch because the anti-join only returns tuple data Is it possible to DCHECK for this? I guess it would be null_aware_partition_->build_rows()->num_rows() == 0? Maybe unnecessarily defensive http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.h@163 PS4, Line 163: /// Needs to be log2(PARTITION_FANOUT). Might make sense to move this and MAX_PARTITION_DEPTH up too, to keep the related static const stuff all together. http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/14632/4/be/src/exec/partitioned-hash-join-builder.cc@483 PS4, Line 483: stream->Close(nullptr, RowBatch::FlushMode::NO_FLUSH_RESOURCES); Maybe comment/DCHECK why we don't need to pass row_batch here -- To view, visit http://gerrit.cloudera.org:8080/14632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife8d0fa5dd14c7d3f3d726dd38c07d8cbceabadb Gerrit-Change-Number: 14632 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 20:08:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5251/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 20:09:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Hello Laszlo Gaal, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14741 to look at the new patch set (#2). Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. IMPALA-9165: Add timeout for create-load-data.sh This converts the existing bin/run-all-tests-timeout-check.sh to a more generic bin/script-timeout-check.sh. It uses this new script for both bin/run-all-tests.sh and testdata/bin/create-load-data.sh. The new script takes two arguments: -timeout : timeout in minutes -script_name : name of the calling script The script_name is used in debugging output / output filenames to make it clear what timed out. The run-all-tests.sh timeout remains the same. testdata/bin/create-load-data.sh uses a 2.5 hour timeout. This should help debug the issue in IMPALA-9165, because at least the logs would be preserved on the Jenkins job. Testing: - Tested the timeout script by hand with a caller script that sleeps longer than the timeout - Ran a gerrit-verify-dryrun-external Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 --- D bin/run-all-tests-timeout-check.sh M bin/run-all-tests.sh A bin/script-timeout-check.sh M testdata/bin/create-load-data.sh 4 files changed, 129 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/14741/2 -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 1: (2 comments) Made some changes to the script's argument parsing to add a script_name to use in debugging messages / filenames. http://gerrit.cloudera.org:8080/#/c/14741/1/bin/script-timeout-check.sh File bin/script-timeout-check.sh: http://gerrit.cloudera.org:8080/#/c/14741/1/bin/script-timeout-check.sh@74 PS1, Line 74: --step "test_run" > does this parameter, or the --phase parameter need to be changed depending Good point, added a script_name argument to this script so that we can have better messages / filenames. http://gerrit.cloudera.org:8080/#/c/14741/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/14741/1/testdata/bin/create-load-data.sh@106 PS1, Line 106: minute > nit: minutes Done -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 20:07:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9158: Support loading primary key/foreign key constraints in LocalCatalog Mode.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14731 ) Change subject: IMPALA-9158: Support loading primary key/foreign key constraints in LocalCatalog Mode. .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/14731/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14731/3//COMMIT_MSG@27 PS3, Line 27: Testing: Can you add some end-to-end tests for show create table, since this is adding that functionality for local catalog mode. I assume this wouldn't have worked before for local catalog http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@167 PS3, Line 167: public Pair, List> loadConstraints( Optional: it seems like basically everywhere that we want one list, we also want to store the other. So it might be worth defining a simple class (SQLConstraints or something like that) as a container for these two lists. It's a little cleaner now, and if we ever add another kind of constraint, you're going to have to do this refactor anyway. http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@114 PS3, Line 114:* List of primary keys associated with the table. Can you document the invariants around these two variables - i.e. when is it null, when is it empty? When are they mutated? Some of the methods below have various checks around null and empty and I'm not sure which are actually necessary (and if they are, why they are). http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@554 PS3, Line 554: || primaryKeys_.isEmpty() Shouldn't an empty list mean that the pks are loaded and that there are no primary keys? http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@558 PS3, Line 558: LOG.warn("Table {} is not loaded, SHOW CREATE TABLE will not show primary key " Is this code reachable? If so, shouldn't we load the table instead of returning bogus output? If not, can we document the invariant and turn this into a precondition? I also think this logging doesn't fit, for two reasons: * this would only really be relevant/actionable to the end user, who won't be looking at the logs. * warning logs should really be reserved for service health issues or problems that are actionable by administrators. http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@569 PS3, Line 569: if (foreignKeys_ == null || foreignKeys_.isEmpty()) { Same comments apply here. http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java File fe/src/test/java/org/apache/impala/catalog/CatalogTest.java: http://gerrit.cloudera.org:8080/#/c/14731/3/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@452 PS3, Line 452: // Order of PKs is not guaranteed. It's worth thinking about putting the primary and foreign keys into a canonical order, if the HMS is returning them in inconsistent orders. Would simplify testing and reduce chances of future flaky tests. http://gerrit.cloudera.org:8080/#/c/14731/3/testdata/data/child_table.txt File testdata/data/child_table.txt: PS3: Please add description of these data files to the README in this directory - we can't add comments inline and it is important to document what their purpose is. -- To view, visit http://gerrit.cloudera.org:8080/14731 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ea7e1bacf6eb502c67caf310a847b32687e0d58 Gerrit-Change-Number: 14731 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 19 Nov 2019 20:06:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5250/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 20:04:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 7: Code-Review+2 Carrying +2. -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 20:03:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: Bump CDP build number to 1617729
Sahil Takiar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14732 ) Change subject: IMPALA-8525: Bump CDP build number to 1617729 .. IMPALA-8525: Bump CDP build number to 1617729 This pulls in some changes to libhdfs that will help with Impala-on-S3 performance. Specifically, HDFS-14564: "Add libhdfs APIs for readFully; add readFully to ByteBufferPositionedReadable" Includes the change from https://gerrit.cloudera.org/#/c/14739/, which fixes a Hive 3 issue with test_ddl.py. Testing: * Ran core tests Change-Id: I11260dc959890c69f353eaa2c311b5b02d2a915e Reviewed-on: http://gerrit.cloudera.org:8080/14732 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M tests/metadata/test_ddl.py 2 files changed, 9 insertions(+), 7 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I11260dc959890c69f353eaa2c311b5b02d2a915e Gerrit-Change-Number: 14732 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14677 ) Change subject: IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c Gerrit-Change-Number: 14677 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 20:00:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: Bump CDP build number to 1617729
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14732 ) Change subject: IMPALA-8525: Bump CDP build number to 1617729 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11260dc959890c69f353eaa2c311b5b02d2a915e Gerrit-Change-Number: 14732 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 19:55:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5069/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:55:21 + Gerrit-HasComments: No
[Impala-ASF-CR] Fixup for test ddl.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14739 ) Change subject: Fixup for test_ddl.py .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5070/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee8f9a99547b4a0cc458c44bd53b7f367babc5db Gerrit-Change-Number: 14739 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 19:54:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8863: Add support to run tests over HTTP/HS2
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/14059 ) Change subject: IMPALA-8863: Add support to run tests over HTTP/HS2 .. Patch Set 29: Build hit IMPALA-9148 (https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8877/) -- To view, visit http://gerrit.cloudera.org:8080/14059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Gerrit-Change-Number: 14059 Gerrit-PatchSet: 29 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:48:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8863: Add support to run tests over HTTP/HS2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14059 ) Change subject: IMPALA-8863: Add support to run tests over HTTP/HS2 .. Patch Set 30: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Gerrit-Change-Number: 14059 Gerrit-PatchSet: 30 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:48:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8863: Add support to run tests over HTTP/HS2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14059 ) Change subject: IMPALA-8863: Add support to run tests over HTTP/HS2 .. Patch Set 30: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5249/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Gerrit-Change-Number: 14059 Gerrit-PatchSet: 30 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:48:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14720 ) Change subject: IMPALA-9104: Support retrieval of PK/FK information through impala-hs2-server. .. Patch Set 7: (13 comments) Btw, not sure if you're familiar with clang-format, some instructions here: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala, but it can help a lot to reduce the number of formatting mistakes and make the review process easier http://gerrit.cloudera.org:8080/#/c/14720/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14720/7//COMMIT_MSG@38 PS7, Line 38: We should fix : this seperately. Is there a JIRA for it? http://gerrit.cloudera.org:8080/#/c/14720/7/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/14720/7/common/thrift/Frontend.thrift@576 PS7, Line 576: GET_CROSS_REFERECE GET_CROSS_REFERENCE http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/analysis/TableDef.java File fe/src/main/java/org/apache/impala/analysis/TableDef.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/analysis/TableDef.java@547 PS7, Line 547: nit: extra whitespace http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/FeTable.java File fe/src/main/java/org/apache/impala/catalog/FeTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/FeTable.java@99 PS7, Line 99: throws NotImplementedException; I don't think this is necessary, here and elsewhere http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@56 PS7, Line 56: import org.apache.impala.common.NotImplementedException; I don't think you need this http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@216 PS7, Line 216: "getPrimaryKeys() is not yet implemented in " : + "LocalCatalog Mode" nit: you can get rid of the '+' by putting this all on the second line http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/Frontend.java@907 PS7, Line 907: } catch (NotImplementedException e) { : // getForeignKeys() is not supported in LocalCatalog mode. Until IMPALA-9158 is : // fixed, return an empty list. : return Lists.newArrayList(); My thinking in suggesting the NotImplementedException was that we would bubble it up to users so that they know that its not something that's expected to work yet. I'm not certain that would actually work, eg. because of the way this is getting called through JNI, and I suppose it doesn't really matter since I see that you've already got a patch out for IMPALA-9158. Feel free to leave this the way it is, just as long as you clean it up in that patch. http://gerrit.cloudera.org:8080/#/c/14720/5/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/14720/5/fe/src/main/java/org/apache/impala/service/MetadataOp.java@736 PS5, Line 736: Lists.newArrayList > Done. Just curious: How is this different from new ArrayList<>()? Good question - I just know if it as our usual convention, eg. if you look at the rest of this file. Apparently though Lists.newArrayList() is supposed to be deprecated as of Java 7, see: https://guava.dev/releases/19.0/api/docs/com/google/common/collect/Lists.html#newArrayList() and the way you had it previously is now considered the preferable way Of course, it can be a bit of a judgement call whether to go with the "preferred" style or to match the existing style in the file you're modifying, but some quick grepping suggests that we already use the "diamond" style more overall across the FE than Lists.newArrayList, so you can feel free to put this back the way that it was. http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/14720/7/fe/src/main/java/org/apache/impala/service/MetadataOp.java@650 PS7, Line 650: if (LOG.isTraceEnabled()) LOG.trace("Returning {} primary keys for
[Impala-ASF-CR] IMPALA-6894: Use an internal representation of query states in ClientRequestState
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14744 ) Change subject: IMPALA-6894: Use an internal representation of query states in ClientRequestState .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5068/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 Gerrit-Change-Number: 14744 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 19:45:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (3 comments) Code looks good, just had a few asks as far as additional tests. http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG@30 PS4, Line 30: - Added a new unit test in PlannerTest that sets the Can you also add the option to query-options-test just to test the validations. I think you can just add it to the table of byte options in SetByteOptions() since its behaviour is consistent with the other options. http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc@914 PS4, Line 914: ParseMemValue(value, "broadcast bytes limit for join operations", _bytes_limit)); > line too long (103 > 90) Assume you'll fix in next patch. You can run these checks locally the HEAD commit with: ./bin/jenkins/critique-gerrit-review.py --dryrun If you want to check whether they're resolved. Also ok to rely on the bot. http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: Can you also add a negative test where it is below the threshold and doesn't switch to broadcast. And a negative test where it's above the threshold but the /* +broadcast */ hint overrides it. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:39:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8525: Bump CDP build number to 1617729
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14732 ) Change subject: IMPALA-8525: Bump CDP build number to 1617729 .. Patch Set 3: Code-Review+2 This makes sense to me. Good to get the CDP GBN updated. -- To view, visit http://gerrit.cloudera.org:8080/14732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11260dc959890c69f353eaa2c311b5b02d2a915e Gerrit-Change-Number: 14732 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 19:11:01 + Gerrit-HasComments: No
[Impala-ASF-CR] Fixup for test ddl.py
Joe McDonnell has abandoned this change. ( http://gerrit.cloudera.org:8080/14739 ) Change subject: Fixup for test_ddl.py .. Abandoned This was incorporated into https://gerrit.cloudera.org/#/c/14732/ -- To view, visit http://gerrit.cloudera.org:8080/14739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Iee8f9a99547b4a0cc458c44bd53b7f367babc5db Gerrit-Change-Number: 14739 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:09:41 + Gerrit-HasComments: No
[Impala-ASF-CR] Fixup for test ddl.py
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14739 Change subject: Fixup for test_ddl.py .. Fixup for test_ddl.py Change-Id: Iee8f9a99547b4a0cc458c44bd53b7f367babc5db --- M tests/metadata/test_ddl.py 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/14739/1 -- To view, visit http://gerrit.cloudera.org:8080/14739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iee8f9a99547b4a0cc458c44bd53b7f367babc5db Gerrit-Change-Number: 14739 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] Fixup for test ddl.py
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14739 ) Change subject: Fixup for test_ddl.py .. Patch Set 1: This is not for review. Published it for posterity. -- To view, visit http://gerrit.cloudera.org:8080/14739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee8f9a99547b4a0cc458c44bd53b7f367babc5db Gerrit-Change-Number: 14739 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 19:09:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5248/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:10:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/14717/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/14717/3/be/src/runtime/krpc-data-stream-sender.cc@161 PS3, Line 161: This function : // blocks until the EOS RPC is complete. > ? Thanks for catching. -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:09:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Hello Michael Ho, Thomas Tauber-Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14717 to look at the new patch set (#4). Change subject: IMPALA-9134: pipeline data stream sender flush .. IMPALA-9134: pipeline data stream sender flush This changes the logic so that the last TransmitData and EndDataStream RPCs can be sent to receivers in parallel, instead of them being serialized in FlushAndSendEos(). This was a problem because each call to FlushAndSendEos() would wait for one or two network round-trips, so previously the flush could easily take sum(2 * RTT). After this change the sending and waiting for RPCs is pipeline so the time taken should be closer to 2 * RTT. Testing: This code is exercised by existing tests. Ran exhaustive tests. Perf: No measurable change on local TPC-H (where network latency isn't a significant factor). +--+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +--+---+-++++ | TPCH(30) | parquet / none / none | 7.64| -1.37% | 5.00 | +0.29% | +--+---+-++++ +--+--+---++-++++---++-+---+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval | +--+--+---++-++++---++-+---+ | TPCH(30) | TPCH-Q14 | parquet / none / none | 3.75 | 3.39| +10.48% | * 19.43% * | 8.88%| 7 | +7.26% | 0.48| 1.19 | | TPCH(30) | TPCH-Q13 | parquet / none / none | 13.55 | 12.78 | +6.05% | 5.43%| 4.75%| 7 | +5.09% | 1.76| 2.14 | | TPCH(30) | TPCH-Q2 | parquet / none / none | 1.30 | 1.26| +3.63% | 5.53%| 3.27%| 7 | +3.93% | 1.28| 1.46 | | TPCH(30) | TPCH-Q12 | parquet / none / none | 2.65 | 2.56| +3.44% | 4.23%| 2.92%| 7 | +3.68% | 1.28| 1.73 | | TPCH(30) | TPCH-Q15 | parquet / none / none | 3.80 | 3.76| +0.91% | 0.81%| 1.80%| 7 | +1.42% | 1.12| 1.22 | | TPCH(30) | TPCH-Q8 | parquet / none / none | 5.23 | 5.13| +2.11% | 5.41%| 0.48%| 7 | +0.04% | 0.16| 1.01 | | TPCH(30) | TPCH-Q10 | parquet / none / none | 5.77 | 5.62| +2.70% | 9.84%| 2.79%| 7 | -0.80% | -0.96 | 0.68 | | TPCH(30) | TPCH-Q21 | parquet / none / none | 28.82 | 28.62 | +0.70% | 0.59%| 0.37%| 7 | +0.68% | 1.92| 2.67 | | TPCH(30) | TPCH-Q6 | parquet / none / none | 1.32 | 1.31| +0.82% | 0.36%| 1.53%| 7 | +0.23% | 1.28| 1.38 | | TPCH(30) | TPCH-Q11 | parquet / none / none | 1.30 | 1.29| +0.60% | 5.11%| 2.94%| 7 | +0.14% | 0.16| 0.27 | | TPCH(30) | TPCH-Q19 | parquet / none / none | 4.52 | 4.49| +0.57% | 2.98%| 1.58%| 7 | +0.12% | 0.32| 0.44 | | TPCH(30) | TPCH-Q16 | parquet / none / none | 2.78 | 2.77| +0.35% | 3.21%| 2.02%| 7 | +0.17% | 0.64| 0.25 | | TPCH(30) | TPCH-Q3 | parquet / none / none | 5.65 | 5.65| -0.04% | 1.62%| 1.91%| 7 | -0.06% | -0.32 | -0.04 | | TPCH(30) | TPCH-Q1 | parquet / none / none | 10.88 | 10.89 | -0.11% | 0.76%| 0.90%| 7 | +0.01% | 0.00| -0.26 | | TPCH(30) | TPCH-Q4 | parquet / none / none | 2.18 | 2.18| -0.38% | 1.33%| 2.48%| 7 | -0.04% | -0.16 | -0.36 | | TPCH(30) | TPCH-Q5 | parquet / none / none | 3.94 | 3.95| -0.27% | 2.37%| 1.33%| 7 | -0.26% | -1.12 | -0.26 | | TPCH(30) | TPCH-Q7 | parquet / none / none | 22.37 | 22.41 | -0.19% | 2.94%| 1.92%| 7 | -0.66% | -0.80 | -0.15 | | TPCH(30) | TPCH-Q22 | parquet / none / none | 3.68 | 3.72| -1.14% | 4.95%| 3.99%| 7 | +0.13% | 0.00| -0.48 | | TPCH(30) | TPCH-Q20 | parquet / none / none | 2.83 | 2.86| -1.21% | 2.69%| 2.20%| 7 | -1.60% | -0.64 | -0.93 | | TPCH(30) | TPCH-Q17 | parquet / none / none | 4.92 | 5.04| -2.46%
[Impala-ASF-CR] IMPALA-6894: Use an internal representation of query states in ClientRequestState
Hello Michael Ho, Thomas Tauber-Marshall, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14744 to look at the new patch set (#2). Change subject: IMPALA-6894: Use an internal representation of query states in ClientRequestState .. IMPALA-6894: Use an internal representation of query states in ClientRequestState Re-factors the state machine of ClientRequestState so that it uses an internal state represetation rather than using the one defined by TOperationState. The possible states are defined in ClientRequestState::ExecState and the possible state transitions are outlined in client-request-state.h and enforced in ClientRequestState::UpdateNonErrorExecState. The states defined in ClientRequestState::ExecState are the same states currently used in TOperationState. This patch simply makes it easy to define new states in the future. The value of ClientRequestState::ExecState is exposed to clients via the entry "Impala Query State" in the runtime profile. It is meant to be the Impala specific version of "Query State" (which is the Beeswax state). This allows Impala to expose its internal state without breaking existing clients that might rely on the value of "Query State". Additional Bug Fixes: * Fixed a possible race condition where a query could transition from RUNNING to PENDING * The ClientRequestState state is now tracked using an AtomicEnum, which fixes a few possible race conditions where the state was being read without holding the ClientRequestState::lock_ Testing: * Ran core tests * Added test to make sure "Impala Query State" is populated Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M tests/query_test/test_observability.py M tests/util/cancel_util.py 8 files changed, 185 insertions(+), 95 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/14744/2 -- To view, visit http://gerrit.cloudera.org:8080/14744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 Gerrit-Change-Number: 14744 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6894: Use an internal representation of query states in ClientRequestState
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14744 ) Change subject: IMPALA-6894: Use an internal representation of query states in ClientRequestState .. Patch Set 2: Re-based and added some more tests. -- To view, visit http://gerrit.cloudera.org:8080/14744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 Gerrit-Change-Number: 14744 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 19:00:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9134: pipeline data stream sender flush
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14717 ) Change subject: IMPALA-9134: pipeline data stream sender flush .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/14717/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/14717/3/be/src/runtime/krpc-data-stream-sender.cc@161 PS3, Line 161: This function : // blocks until the EOS RPC is complete. ? -- To view, visit http://gerrit.cloudera.org:8080/14717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If66add99e4370f8faf22761b336205cc9f9f1867 Gerrit-Change-Number: 14717 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 18:55:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5067/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 18:38:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14688 ) Change subject: IMPALA-9127: explicit probe state machine in hash join .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5066/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 18:22:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#4). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a new unit test in PlannerTest that sets the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 8 files changed, 72 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/4 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc@914 PS4, Line 914: ParseMemValue(value, "broadcast bytes limit for join operations", _bytes_limit)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/14690/4/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14690/4/common/thrift/ImpalaService.thrift@488 PS4, Line 488: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 18:21:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6894: Use an internal representation of query states in ClientRequestState
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14744 ) Change subject: IMPALA-6894: Use an internal representation of query states in ClientRequestState .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5065/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 Gerrit-Change-Number: 14744 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 17:53:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14688 ) Change subject: IMPALA-9127: explicit probe state machine in hash join .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.h@204 PS5, Line 204: beein > typo Done http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.cc@522 PS5, Line 522: case ProbeState::PROBING_IN_BATCH: { > optional: I think that it would be nice to move this and the next case into Done. It was pretty easy to extract some of the logic. I kept the logic relating to state transitions in this function so that the state machine is defined in this lop as much as possible. -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 17:37:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Hello Michael Ho, Thomas Tauber-Marshall, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14688 to look at the new patch set (#6). Change subject: IMPALA-9127: explicit probe state machine in hash join .. IMPALA-9127: explicit probe state machine in hash join This refactors the main loop in PartitionedHashJoinNode::GetNext() to use an explicit state machine, rather than the hard-to-follow implicit state machine previously used. A new state variable 'probe_state_' is used to drive the loop, with DCHECKs enforcing invariants of other member variables. I deliberately tried to minimise changes to other functions (including any attempts to factor logic out of GetNext()) to minimise the scope of this patch. The new logic is mostly equivalent to the old logic, although there may be a different number of trips through the loop because of the way the cascading checks in the old version worked. A few notable changes: * DoneProbing() is consistently called when probing is finished, including in cases, like probing a single spilled partition, where it wasn't previously. * The repeated AtCapacity() checks are consolidated into a single check that happens at the end of the loop. Resources attached to batches should still be flushed at the appropriate points, since each previous "if (out_batch->AtCapacity()) break;" corresponds to a new loop iteration in the new code. * OutputNullAwareNullProbe() and OutputNullAwareProbeRows() now explicitly signal when they are done using an output argument, instead of implicitly via AtCapacity(), which is incredibly error-prone. Testing: We have adequate coverage for different join modes, including with spilling. * Ran exhaustive tests. * Ran a single node stress test with TPC-H and TPC-DS * Ran a single node stress test with larger scale factor Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 --- M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h 3 files changed, 354 insertions(+), 202 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/14688/6 -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6894: Use an internal representation of query states in ClientRequestState
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14744 Change subject: IMPALA-6894: Use an internal representation of query states in ClientRequestState .. IMPALA-6894: Use an internal representation of query states in ClientRequestState Re-factors the state machine of ClientRequestState so that it uses an internal state represetation rather than using the one defined by TOperationState. The possible states are defined in ClientRequestState::ExecState and the possible state transitions are outlined in client-request-state.h and enforced in ClientRequestState::UpdateNonErrorExecState. The states defined in ClientRequestState::ExecState are the same states currently used in TOperationState. This patch simply makes it easy to define new states in the future. The value of ClientRequestState::ExecState is exposed to clients via the entry "Impala Query State" in the runtime profile. It is meant to be the Impala specific version of "Query State" (which is the Beeswax state). This allows Impala to expose its internal state without breaking existing clients that might rely on the value of "Query State". Additional Bug Fixes: * Fixed a possible race condition where a query could transition from RUNNING to PENDING * The ClientRequestState state is now tracked using an AtomicEnum, which fixes a few possible race conditions where the state was being read without holding the ClientRequestState::lock_ Testing: * Ran core tests * Added test to make sure "Impala Query State" is populated Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M tests/query_test/test_observability.py 7 files changed, 123 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/14744/1 -- To view, visit http://gerrit.cloudera.org:8080/14744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1ce70bd2e964b309ebfc9d6ff6d900485db4d630 Gerrit-Change-Number: 14744 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14688 ) Change subject: IMPALA-9127: explicit probe state machine in hash join .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.h@204 PS5, Line 204: beein typo http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: http://gerrit.cloudera.org:8080/#/c/14688/5/be/src/exec/partitioned-hash-join-node.cc@522 PS5, Line 522: case ProbeState::PROBING_IN_BATCH: { optional: I think that it would be nice to move this and the next case into different functions, e.g. ProcessCurrentProbeBatch() and NextProbeBatch() (and probably rename NextProbeRowBatch() to get NextProbeRowBatchFromChild() or something similar). The functions could be ordered like the old logic to reduce the diff. -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 16:24:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6660: Change -0/+0 floating point to compare as equal in hash table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14588 ) Change subject: IMPALA-6660: Change -0/+0 floating point to compare as equal in hash table .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14588/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14588/4//COMMIT_MSG@15 PS4, Line 15: Added relevant tests please elaborate which tests did you extend -- To view, visit http://gerrit.cloudera.org:8080/14588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6bb1a817c81c452d041238c19cb6c9f602a5d565 Gerrit-Change-Number: 14588 Gerrit-PatchSet: 4 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 19 Nov 2019 16:24:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9165: Add timeout for create-load-data.sh
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14741 ) Change subject: IMPALA-9165: Add timeout for create-load-data.sh .. Patch Set 1: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/14741/1/bin/script-timeout-check.sh File bin/script-timeout-check.sh: http://gerrit.cloudera.org:8080/#/c/14741/1/bin/script-timeout-check.sh@74 PS1, Line 74: --step "test_run" does this parameter, or the --phase parameter need to be changed depending on if this is calling during test execution vs. data load? http://gerrit.cloudera.org:8080/#/c/14741/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/14741/1/testdata/bin/create-load-data.sh@106 PS1, Line 106: minute nit: minutes -- To view, visit http://gerrit.cloudera.org:8080/14741 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19d76bd8850c7d4b5a4d21f32d8715a382c6 Gerrit-Change-Number: 14741 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 19 Nov 2019 16:24:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 ) Change subject: [WIP]IMPALA-8778: Support Apache Hudi Read Optimized Table .. Patch Set 3: Thanks, Yanija. I think that makes sense, so I recommend you to continue with the implementation and testing. -- To view, visit http://gerrit.cloudera.org:8080/14711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf Gerrit-Change-Number: 14711 Gerrit-PatchSet: 3 Gerrit-Owner: Yanjia Gary Li Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yanjia Gary Li Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 19 Nov 2019 16:04:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14714 ) Change subject: IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3 .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@200 PS3, Line 200: (IsUsedToken("MM") && (IsUsedToken("MONTH") || IsUsedToken("MON"))) || : (IsUsedToken("MONTH") && IsUsedToken("MON")) Probably it would be simpler to do this check with a counter, similarly to L215-219. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h File be/src/runtime/datetime-parser-common.h: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@22 PS3, Line 22: #include : #include : #include nit: order includes alphabetically. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@75 PS3, Line 75: TUE Closing apostrophe is missing. http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.h@315 PS3, Line 315: month or day nit: month or day name http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc File be/src/runtime/datetime-parser-common.cc: http://gerrit.cloudera.org:8080/#/c/14714/3/be/src/runtime/datetime-parser-common.cc@223 PS3, Line 223: // If it's ambiguous where the end of the month token is in the input then try to find : // the end of the month token by probing the input string. : int search_len = token_len; : while (search_len >= 3) { : const auto iter = MONTH_NAME_TO_INDEX.find(buff.substr(0, search_len)); : if (iter != MONTH_NAME_TO_INDEX.end()) { : *month = iter->second; : *token_end = token_start + search_len; : return true; : } : --search_len; : } I think, we could do this faster by taking advantage of the fact that checking the first 3 letters of the month is enough to decide what month to look for. const unordered_map month_prefix_to_suffix = { {"jan", "uary"}, {"feb", "ruary"}, .. {"dec", "ember"} }; if (token_len >= 3) { const auto& prefix = buff.substr(0, 3); const auto it = month_prefix_to_suffix.find(prefix); if (it != month_prefix_to_suffix.end()) { const auto& suffix = it->second; // // Test that 'suffix' matches the rest of 'buff'. // .. } } return false; What do you think? -- To view, visit http://gerrit.cloudera.org:8080/14714 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic797f19a1311b54e5d00d01d0a7afe1f0f21fb8f Gerrit-Change-Number: 14714 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 16:02:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14677 ) Change subject: IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5247/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c Gerrit-Change-Number: 14677 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 15:26:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14677 ) Change subject: IMPALA-9137, IMPALA-9138: Mark failed RPCs as retryable and update blacklist .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5245/ -- To view, visit http://gerrit.cloudera.org:8080/14677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c Gerrit-Change-Number: 14677 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 19 Nov 2019 13:30:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: Bump CDP build number to 1617729
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14732 ) Change subject: IMPALA-8525: Bump CDP build number to 1617729 .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5244/ -- To view, visit http://gerrit.cloudera.org:8080/14732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11260dc959890c69f353eaa2c311b5b02d2a915e Gerrit-Change-Number: 14732 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 19 Nov 2019 13:24:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14714 ) Change subject: IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3 .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5064/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14714 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic797f19a1311b54e5d00d01d0a7afe1f0f21fb8f Gerrit-Change-Number: 14714 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 12:12:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8863: Add support to run tests over HTTP/HS2
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14059 ) Change subject: IMPALA-8863: Add support to run tests over HTTP/HS2 .. Patch Set 29: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5242/ -- To view, visit http://gerrit.cloudera.org:8080/14059 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018 Gerrit-Change-Number: 14059 Gerrit-PatchSet: 29 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 11:04:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14714 ) Change subject: IMPALA-8705: ISO:SQL:2016 datetime patterns - Milestone 3 .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5063/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14714 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic797f19a1311b54e5d00d01d0a7afe1f0f21fb8f Gerrit-Change-Number: 14714 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 09:51:52 + Gerrit-HasComments: No