[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has removed a vote on this change. Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Removed Code-Review+1 by Aman Sinha -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. IMPALA-9741: Support querying Iceberg table by impala This patch implements querying of Iceberg tables through Impala. An external Iceberg table can be created with the following SQL: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or by specifying only the table name and location: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format used by Iceberg. Currently only PARQUET is supported; other formats will be supported in the future. If this property is not specified in the SQL, the default file format is PARQUET. This is implemented by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then pass this information to the BE to do the actual scan. 
Testing: - Unit test for Iceberg in FileMetadataLoaderTest - Create table tests in functional_schema_template.sql - Iceberg table query test in test_scanners.py Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M bin/rat_exclude_files.txt M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M 
testdata/data/README A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-1-5dbd44ad-18bc-40f2-9dd6-aeb2cc23457c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-3-27db2521-1e8b-40c1-b846-552cd620abce-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-4-f1b55628-0544-4833-8b11-1b4add53dfd6-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-6-f75530ef-93b6-4994-b3c8-db957d44848c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-7-8d9b22da-5f10-4cbf-8e4d-160f829b5e48-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-9-f029a1f7-9024-4bc3-a030-e20861586146-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-11-f07814ae-56cd-486b-af81-18541437da7d-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-12-967c70a4-bf4d-4a82-8c97-c90e2b4d9dcf-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-14-d0cdca7f-c050-407e-b70c-2bd076f83e4e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-15-0e931a1f-309e-43b3-a5cf-3ef82fa4a87c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-17-43138078-244c-4b38-8127-04a5bfbc4695-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-19-52569895-df25-4ad8-b64d-49c4540d36c9-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-20-f160c1ea-a2f5-4109-81ec-3ff9c155430f-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-22-c1f61b8c-9d9a
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: Code-Review+1 Carry Shant's +1 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS10, Line 95: public static final com.google.common.base.Predicate IS_RANGE_PREDICATE = line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234 PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, candidates, keepConjuncts, analyzer); line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 06:48:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#10). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. 
Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 401 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/10 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
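The range propagation described in the commit message above can be illustrated with a small sketch. This is not the code in ConstantPredicateHandler.java; the class and method names below are hypothetical, and java.time types stand in for Impala's DATE and TIMESTAMP. It relies on the fact that casting a timestamp to a date never reverses ordering, so inclusive bounds on timestamp_col can be cast to inclusive bounds on date_col.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical illustration only; not Impala's implementation.
final class RangePropagationSketch {

    /** Inclusive DATE bounds derived from inclusive TIMESTAMP bounds. */
    static final class DateRange {
        final LocalDate low;
        final LocalDate high;

        DateRange(LocalDate low, LocalDate high) {
            this.low = low;
            this.high = high;
        }

        /** True if d satisfies date_col >= low AND date_col <= high. */
        boolean contains(LocalDate d) {
            return !d.isBefore(low) && !d.isAfter(high);
        }
    }

    /**
     * Given timestamp_col BETWEEN tsLow AND tsHigh and
     * date_col = CAST(timestamp_col AS DATE), derive the bounds on date_col.
     * The cast is monotonic, so the bounds can simply be cast as well.
     */
    static DateRange propagate(LocalDateTime tsLow, LocalDateTime tsHigh) {
        return new DateRange(tsLow.toLocalDate(), tsHigh.toLocalDate());
    }

    public static void main(String[] args) {
        DateRange range = propagate(
                LocalDateTime.parse("2019-01-01T00:00:00"),
                LocalDateTime.parse("2020-01-01T00:00:00"));
        // A partition for 2019-06-15 would be kept, one for 2020-01-02 pruned.
        System.out.println(range.contains(LocalDate.parse("2019-06-15")));
        System.out.println(range.contains(LocalDate.parse("2020-01-02")));
    }
}
```

Because the derived bounds apply to the date column, a planner could in principle use them to prune date partitions, which is the benefit the commit message describes.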
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 25: > That's great news! It looks like some of the Iceberg-related tests > failed - https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11883/ > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11884/ > https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3067/ > > But the good news is that it loaded the data successfully. Yes Tim, I regenerated the test files but forgot to update the test cases in iceberg-query.test, which contain 'show files in xxx' results that depend on specific file names. I will adjust the code and submit it to Jenkins for verification as soon as possible. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 25 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 01 Sep 2020 06:32:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 ) Change subject: IMPALA-9382: part 1: transposed profile prototype .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift@249 PS16, Line 249: an averaged profile : // for the fragment is also included with averaged counter values. > Does it increase the serialized size for V1 by a noticeable amount? That wo It shouldn't change the serialized size because of how the thrift encoding works: it just doesn't include unset fields. From the generated code: if (this->__isset.aggregated) { xfer += oprot->writeFieldBegin("aggregated", ::apache::thrift::protocol::T_STRUCT, 13); xfer += this->aggregated.write(oprot); xfer += oprot->writeFieldEnd(); } -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 04:54:42 + Gerrit-HasComments: Yes
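The point in the comment above, that an unset optional field adds nothing to the serialized output, can be shown with a toy encoder. This is a hand-rolled sketch and not the Thrift library or its wire format: the class name, the one-byte id/length framing, and the string-valued "aggregated" field are all assumptions made for illustration. Only the field id 13 and the "write only if set" check mirror the quoted generated code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Toy encoder, loosely modelled on the "if (__isset.aggregated)" check
// in the quoted generated code. Not the real Thrift encoding.
final class OptionalFieldSizeSketch {

    /** Writes one field as: 1-byte id, 1-byte length, UTF-8 payload (assumes length < 256). */
    static void writeField(ByteArrayOutputStream out, int fieldId, String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        out.write(fieldId);
        out.write(payload.length);
        out.write(payload, 0, payload.length);
    }

    /** Serializes a profile node; 'aggregated' is optional and skipped when null. */
    static byte[] serialize(String name, String aggregated) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeField(out, 1, name);
        if (aggregated != null) {
            // Only set fields are written, so unset ones cost zero bytes.
            writeField(out, 13, aggregated);
        }
        out.write(0); // stop marker
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(serialize("fragment", null).length);
        System.out.println(serialize("fragment", "avg").length);
    }
}
```

In this sketch a writer that never sets the new field produces the same bytes as one compiled before the field existed, which is the property the comment relies on for V1 profiles.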
[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16379 ) Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77 Gerrit-Change-Number: 16379 Gerrit-PatchSet: 3 Gerrit-Owner: abeltian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 04:23:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16379 ) Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6371/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77 Gerrit-Change-Number: 16379 Gerrit-PatchSet: 3 Gerrit-Owner: abeltian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 04:23:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16379 ) Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported .. Patch Set 2: Code-Review+2 Slightly cleaned up the commit message but looks good. Thank you for contributing! I'd be interested to talk more about how to test against Alluxio if you have time. -- To view, visit http://gerrit.cloudera.org:8080/16379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77 Gerrit-Change-Number: 16379 Gerrit-PatchSet: 2 Gerrit-Owner: abeltian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 04:22:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
Tim Armstrong has uploaded a new patch set (#2) to the change originally created by abeltian. ( http://gerrit.cloudera.org:8080/16379 ) Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported .. IMPALA-10087: IMPALA-6050 causes alluxio not to be supported This change adds file type support for alluxio. Alluxio URLs have a different prefix, such as: alluxio://zk@zk-1:2181,zk-2:2181,zk-3:2181/path/ Testing: Add unit test for alluxio file system type checks. Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77 --- M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java 2 files changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/16379/2 -- To view, visit http://gerrit.cloudera.org:8080/16379 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77 Gerrit-Change-Number: 16379 Gerrit-PatchSet: 2 Gerrit-Owner: abeltian Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
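A file system type check keyed on the URL prefix, as the commit message above describes, might look roughly like the following. This is a sketch under assumptions and not the code in FileSystemUtil.java: the class and method names are hypothetical, and the scheme is extracted with plain string handling rather than Hadoop's Path or FileSystem APIs.

```java
import java.util.Locale;

// Hypothetical helper; not Impala's FileSystemUtil.
final class FsSchemeSketch {

    /** Returns the lower-cased scheme before "://", or "" if the location has none. */
    static String scheme(String location) {
        int idx = location.indexOf("://");
        if (idx <= 0) {
            return "";
        }
        return location.substring(0, idx).toLowerCase(Locale.ROOT);
    }

    /** True if the location uses the alluxio:// prefix. */
    static boolean isAlluxio(String location) {
        return "alluxio".equals(scheme(location));
    }

    public static void main(String[] args) {
        // URL taken from the commit message; the hdfs one is a made-up contrast case.
        System.out.println(isAlluxio("alluxio://zk@zk-1:2181,zk-2:2181,zk-3:2181/path/"));
        System.out.println(isAlluxio("hdfs://namenode:8020/path/"));
    }
}
```

Matching on the scheme alone keeps the check independent of the authority part, which for Alluxio can be a comma-separated ZooKeeper quorum as in the example URL.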
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 25: That's great news! It looks like some of the Iceberg-related tests failed - https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11883/ https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11884/ https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3067/ But the good news is that it loaded the data successfully. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 25 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 01 Sep 2020 04:09:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: Filed IMPALA-10119 for the flaky test. -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 03:59:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6370/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 13 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 03:43:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 13 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 03:43:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: Looks unrelated, will rerun and file a JIRA -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 03:42:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: > Patch Set 12: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/ Flaky test? shell.test_shell_interactive.TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt[table_format_and_file_extension: ('textfile', '.txt') | protocol: hs2] (from pytest) E TIMEOUT: Timeout exceeded. E E version: 3.3 E command: /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell E args: ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', '--protocol=hs2', '-ilocalhost:21050'] E searcher: E buffer (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] default> ' E before (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] default> ' E after: E match: None E match_index: None E exitstatus: None E flag_eof: False E pid: 12993 E child_fd: 24 E closed: False E timeout: 30 E delimiter: E logfile: None E logfile_read: None E logfile_send: None E maxread: 2000 E ignorecase: False E searchwindowsize: None E delaybeforesend: 0.05 E delayafterclose: 0.1 E delayafterterminate: 0.1 -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 03:07:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', timestamp_col >= TIMESTAMP '2009-12-01 00:00:00' > Don't we still need to keep the date_col = cast(timestamp_col as date) pred Good point. All the use cases I have seen so far were ones where date_col was derived from the timestamp column. Yeah, for your example, we need to keep the cast predicate if the constant is a range predicate. I think the code change isn't much but I need to think about how to create a test data set for this. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: functional_parquet > We generally don't include database names in the test files, since the infr I would be ok with running with other data sets but I had some struggles in loading the alltypes_date_partition table and had offline discussion with Shant. 
For Text format loading, the following error occurred since it went through the Hive load process rather than Impala. The load-functional-planner-core-hive-generated-text-none-none.sql.log had the following error: "Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict" Setting the partition mode to nonstrict got past that but ran into a default limit on the number of dynamic partitions: "The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to 100 partitions per node, number of dynamic partitions on this node: 101" I could bump this up too, but the Tez job does take much longer to execute, so I wasn't sure if it is worthwhile. I could move this to TestQueriesParquetTables unless you have other suggestions. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 02:20:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java: http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55 PS9, Line 55:* predicates. Mention how 'candidates' is used? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61 PS9, Line 61: (E nit: these parens prob aren't needed, right? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66 PS9, Line 66: !(bp.getOp() == BinaryPredicate.Operator.EQ) can't this be !=? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128 PS9, Line 128: Map.Entry Can't this be Map.Entry to keep it type-safe and avoid cast? http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132 PS9, Line 132: Map.Entry Map.Entry>? http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', timestamp_col >= TIMESTAMP '2009-12-01 00:00:00' Don't we still need to keep the date_col = cast(timestamp_col as date) predicate for this to be correct in cases where this isn't guaranteed to be true in the underlying data set? E.g. 
one counter-example would be a row with date_col = 2009-12-01 and timestamp_col = 2009-12-02 00:00:00. I.e. I think we need to keep the equality predicate around. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: functional_parquet We generally don't include database names in the test files, since the infra should switch to the appropriate functional database. We can move it to TestQueriesParquetTables if we only want it to run on the parquet data set (not kudu, etc). -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 01:17:03 + Gerrit-HasComments: Yes
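The counter-example in the review comment above can be checked mechanically. The sketch below is illustrative only: the class and method names are hypothetical, java.time types stand in for Impala's DATE and TIMESTAMP, and the bounds are the ones quoted from constant-propagation.test. It evaluates the row (date_col = 2009-12-01, timestamp_col = 2009-12-02 00:00:00) against the range predicates alone and against the range predicates plus the original equality.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical illustration of the reviewer's counter-example.
final class KeepEqualityPredicateSketch {

    // Bounds from the predicates quoted in the review comment.
    static final LocalDateTime TS_LOW = LocalDateTime.parse("2009-12-01T00:00:00");
    static final LocalDateTime TS_HIGH = LocalDateTime.parse("2010-12-01T00:00:00");

    /** Range predicates on both columns, with the equality predicate dropped. */
    static boolean rangePredicatesOnly(LocalDate dateCol, LocalDateTime tsCol) {
        boolean tsInRange = !tsCol.isBefore(TS_LOW) && !tsCol.isAfter(TS_HIGH);
        boolean dateInRange = !dateCol.isBefore(TS_LOW.toLocalDate())
                && !dateCol.isAfter(TS_HIGH.toLocalDate());
        return tsInRange && dateInRange;
    }

    /** Range predicates plus date_col = CAST(timestamp_col AS DATE). */
    static boolean withEqualityPredicate(LocalDate dateCol, LocalDateTime tsCol) {
        return dateCol.equals(tsCol.toLocalDate()) && rangePredicatesOnly(dateCol, tsCol);
    }

    public static void main(String[] args) {
        LocalDate dateCol = LocalDate.parse("2009-12-01");
        LocalDateTime tsCol = LocalDateTime.parse("2009-12-02T00:00:00");
        // The row passes the range predicates but not the original WHERE clause.
        System.out.println(rangePredicatesOnly(dateCol, tsCol));
        System.out.println(withEqualityPredicate(dateCol, tsCol));
    }
}
```

The row satisfies both range predicates yet violates the equality, so dropping the equality would return a row the original query excludes. That is why the derived range predicates should be added alongside the equality predicate and not in place of it.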
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. IMPALA-10030: Remove unnecessary jar dependencies Remove the dependency on hadoop-hdfs. This jar file contains the core code for implementing HDFS, and thus pulls in a bunch of unnecessary transitive dependencies. Impala currently only requires this jar for some configuration key names. Most of these configuration key names have been moved to the appropriate HDFS client jars, and some others are deprecated altogether. Removing this jar required making a few code changes to move the location of the referenced configuration keys. Removes all transitive Kafka dependencies from the Apache Ranger dependency. Previously, Impala only excluded Kafka jars with binary version kafka_2.11; however, it seems that Ranger recently upgraded the dependency version to kafka_2.12. Now all Kafka dependencies are excluded, regardless of artifact name. Removes all transitive dependencies from the Apache Ozone dependency. Impala has a dependency on the Ozone client shaded-jar, which already includes all required transitive dependencies. For some reason, Ozone still pulls in some transitive dependencies even though they are not needed. Made some other minor cleanup / improvements in the fe/pom.xml file. This saves about 70 MB of space in the Docker images. 
Testing: * Ran exhaustive tests * Ran on-prem cluster E2E tests Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Reviewed-on: http://gerrit.cloudera.org:8080/16311 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/pom.xml M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 5 files changed, 51 insertions(+), 102 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 6 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 01:15:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/ -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 00:54:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 01 Sep 2020 00:09:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer In HIVE-19064 the class of GenericHiveLexer was introduced as an intermediate class between the classes of HiveLexer and Lexer. In order for ToSqlUtils.java to be compiled once we bump up CDP_BUILD_NUMBER that includes this change on the Hive side, this patch updates shaded-deps/hive-exec/pom.xml to include the jar of GenericHiveLexer so that Impala could be successfully built. Testing: - Verified that Impala could compile in a local development environment after applying this patch. Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Reviewed-on: http://gerrit.cloudera.org:8080/16390 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M shaded-deps/hive-exec/pom.xml 1 file changed, 1 insertion(+), 0 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 3 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism .. Patch Set 1: (4 comments) I think this seems basically OK. I'm on the fence about whether we should do some additional cluster testing, but leaning towards no because the complexity is all in the Kudu layer and I don't think we'd learn much from testing at small scale. http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@24 PS1, Line 24: Testing > Did you do any performance testing to gauge the impact (good or bad)? Yeah it'd be good to do some sanity checks to confirm that it gives the speedup expected. E.g. TPC-H or similar. Maybe just a single query would be fine, e.g. TPC-H Q1. http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@25 PS1, Line 25: - Added e2e tests Do we have any other tests that are going to exercise this code path, e.g. kudu e2e tests with multithreading? http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py@1442 PS1, Line 1442: union union all would be a little more efficient, no? http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py@1468 PS1, Line 1468: assert regular_num_inst < with_mt_dop_and_disabled_range_len_num_inst < \ :with_mt_dop_num_inst < with_mt_dop_and_low_range_len_num_inst I don't think the < operator works this way - you're going to be comparing a bool with an int. Probably best to have each inequality as a separate assert. 
-- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 23:44:46 + Gerrit-HasComments: Yes
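A note on the last comment in the message above, as a minimal Python sketch. The instance counts are made-up values standing in for the test's variables, and `strictly_increasing` is a hypothetical helper. In Python, a chained comparison such as `a < b < c` is evaluated as `a < b and b < c`, so it is a conjunction of pairwise comparisons; only an explicitly parenthesised form like `(a < b) < c` compares a bool with an int. Separate asserts are still a reasonable choice because a failure then points at the specific inequality.

```python
# Hypothetical instance counts; in the real test these come from query profiles.
regular_num_inst = 3
with_mt_dop_and_disabled_range_len_num_inst = 6
with_mt_dop_num_inst = 9
with_mt_dop_and_low_range_len_num_inst = 12


def strictly_increasing(values):
    """Return True if every element is smaller than the one after it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Chained form: each adjacent pair is compared, joined with `and`.
chained = (regular_num_inst < with_mt_dop_and_disabled_range_len_num_inst <
           with_mt_dop_num_inst < with_mt_dop_and_low_range_len_num_inst)

# Parenthesised form: this one really does compare a bool with an int.
chained_out_of_order = 3 < 1 < 2       # 3 < 1 and 1 < 2
bool_vs_int = (3 < 1) < 2              # False < 2, i.e. 0 < 2

# Separate asserts: a failure names the exact inequality that broke.
assert regular_num_inst < with_mt_dop_and_disabled_range_len_num_inst
assert with_mt_dop_and_disabled_range_len_num_inst < with_mt_dop_num_inst
assert with_mt_dop_num_inst < with_mt_dop_and_low_range_len_num_inst
```
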
[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/16389 ) Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH clauses .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/16389/2/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/16389/2/shell/impala_shell.py@1280 PS2, Line 1280: if self.DML_REGEX.match(query_type.lower()): Looks like there were failed tests in the dry-run. Nit: this code can be simplified as follows: is_dml = self.DML_REGEX.match(query_type.lower()) return self._execute_stmt(query, is_dml=is_dml, print_web_link=True) -- To view, visit http://gerrit.cloudera.org:8080/16389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad Gerrit-Change-Number: 16389 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 22:06:18 + Gerrit-HasComments: Yes
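The simplification suggested in the comment above can be sketched in a self-contained form. This is only an illustration: `ShellSketch`, its `run` method, the regex pattern, and the stub `_execute_stmt` are all hypothetical stand-ins, not the actual contents of shell/impala_shell.py. One detail that is an addition here rather than part of the suggestion: `re.match` returns a match object or `None`, so the sketch wraps it in `bool()` to pass a plain boolean.

```python
import re


class ShellSketch(object):
    # Hypothetical pattern; the real DML_REGEX in impala_shell.py may differ.
    DML_REGEX = re.compile(r"^(insert|upsert|update|delete)$")

    def _execute_stmt(self, query, is_dml=False, print_web_link=False):
        # Stub standing in for the real method: echo the arguments back.
        return (query, is_dml, print_web_link)

    def run(self, query, query_type):
        # Compute the flag once instead of branching into two nearly
        # identical _execute_stmt calls.
        is_dml = bool(self.DML_REGEX.match(query_type.lower()))
        return self._execute_stmt(query, is_dml=is_dml, print_web_link=True)


shell = ShellSketch()
dml_result = shell.run("insert into t values (1)", "INSERT")
with_result = shell.run("with v as (select 1) select * from v", "WITH")
```
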
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 9: Code-Review+1 (1 comment) LGTM nice little fix. http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461 PS7, Line 461: timestamp_col <= '2010-12-01'; > Done Done -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 21:39:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16320 ) Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images .. Patch Set 4: I had thought about doing that split in the past - it seems like it would be useful. I don't see any obvious issues with doing it aside from tests making assumptions about scheduling. -- To view, visit http://gerrit.cloudera.org:8080/16320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Gerrit-Change-Number: 16320 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 21:02:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/16320 ) Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images .. Patch Set 4: > (1 comment) > > The code changes basically look good. Let me know how testing goes, > I can approve it then. Is there a change that we could make to the dockerized tests that would help us test the coordinator-only and executor-only images? At the moment, the dockerized tests use the impalad_coord_exec docker image, and that has been fine for coverage because impalad_coord_exec, impalad_executor, and impalad_coordinator have been so similar. I wonder how much work it would be to migrate to using one impalad_coordinator and three impalad_executor nodes (or one impalad_coord_exec and two impalad_executor nodes). Would this be a useful direction? -- To view, visit http://gerrit.cloudera.org:8080/16320 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd Gerrit-Change-Number: 16320 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 20:10:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:57:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6369/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:57:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 ) Change subject: IMPALA-9382: part 1: transposed profile prototype .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift@249 PS16, Line 249: an averaged profile : // for the fragment is also included with averaged counter values. > It does. It will make the deserialized objects larger because of the extra Does it increase the serialized size for V1 by a noticeable amount? That would be my main concern, since that corresponds to disk usage for the profile log. -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:52:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 4: Code-Review+2 Thanks for the update, appreciate the due diligence on it! -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:42:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 12 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 11 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7054/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 11 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:28:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7053/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:17:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 11: (10 comments) Yeah I guess some of those comments in the tests came from different patches, all cleaned up now. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98 PS10, Line 98: adjustNumD > Maybe renamed as getNumDistinctValuesAdjusted(). Made it adjustNumDistinctValues; since this is a private method, I want to avoid the get/set verbs so as not to conflate it with the standard conventions for public methods. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191 PS10, Line 191: > nit: seems like a move of the method in the module. Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211 PS10, Line 211: verifySelectCol("nullrows", "null_str", > This comment can be removed. 
Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182 PS10, Line 182: // NDV(blanks) = 1, add 1 for nulls : // Bug: See IMPALA-7310, IMPALA-8094 : //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 2); > Seems like these lines can be removed. Actually no this is a different issue, just adjusted the references. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132 PS10, Line 132: group_str h > This comment is not accurate. Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136 PS10, Line 136: null_str is al > Seems like the reference to c is not right here. Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138 PS10, Line 138: i > same here. Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140 PS10, Line 140: (g > same Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140 PS10, Line 140: i > same here Done http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182 PS10, Line 182: = 1 > Maybe as // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26. 
Done -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 11 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 19:11:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16349 to look at the new patch set (#11). Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. IMPALA-7310: Partial fix for NDV cardinality with NULLs. This fix only handles the case where a column's cardinality is zero but the column is nullable and the null stats indicate that there are null values; in that case we adjust the cardinality from 0 to 1. The cardinality of zero was especially problematic when calculating cardinalities for multiple predicates with multiplication. The 0 would propagate up the plan tree and result in poor plan choices, such as always using broadcast joins where shuffle would've been more optimal. Testing: * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11 - Q4 172s -> 80s - Q11 103s -> 77s * CardinalityTest * TpcdsPlannerTest Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 --- M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test 6 files changed, 795 insertions(+), 784 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/11 -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 11 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
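The adjustment described in the IMPALA-7310 commit message above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the Java code in SlotRef.java: the function names, the `nullable` and `num_nulls` parameters, and the plain product used to combine NDVs are all hypothetical simplifications. The numbers follow the review thread's example of NDV(id) = 26 and ndv(null_str) = 1 after the fix.

```python
def adjust_num_distinct_values(ndv, num_nulls, nullable=True):
    """Bump an NDV of 0 to 1 when the stats say the column holds NULLs.

    Any other NDV value is returned unchanged.
    """
    if ndv == 0 and nullable and num_nulls > 0:
        return 1
    return ndv


def combined_ndv(ndvs):
    """Combine per-column NDVs by multiplication (simplified model)."""
    product = 1
    for ndv in ndvs:
        product *= ndv
    return product


# A column containing only NULLs: stats report NDV 0 and a positive null count.
ndv_id = 26
ndv_null_str = 0
num_nulls_null_str = 26  # hypothetical null count

# Without the adjustment the zero wipes out the whole estimate.
before = combined_ndv([ndv_id, ndv_null_str])

# With the adjustment the all-NULL column counts as one distinct value.
after = combined_ndv(
    [ndv_id, adjust_num_distinct_values(ndv_null_str, num_nulls_null_str)])
```

In this toy model `before` is 0 and `after` is 26, which mirrors how a single zero could propagate through a multiplied estimate.
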
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 1: This seems like a safe change and the reasoning makes sense. -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6367/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Fang-Yu Rao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer In HIVE-19064 the class of GenericHiveLexer was introduced as an intermediate class between the classes of HiveLexer and Lexer. In order for ToSqlUtils.java to be compiled once we bump up CDP_BUILD_NUMBER that includes this change on the Hive side, this patch updates shaded-deps/hive-exec/pom.xml to include the jar of GenericHiveLexer so that Impala could be successfully built. Testing: - Verified that Impala could compile in a local development environment after applying this patch. Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc --- M shaded-deps/hive-exec/pom.xml 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/16390/1 -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16390 ) Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc Gerrit-Change-Number: 16390 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16350 ) Change subject: IMPALA-10099: Push down DISTINCT in Set operations .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5 Gerrit-Change-Number: 16350 Gerrit-PatchSet: 4 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:34:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16350 ) Change subject: IMPALA-10099: Push down DISTINCT in Set operations .. IMPALA-10099: Push down DISTINCT in Set operations INTERSECT/EXCEPT are not duplicate preserving operations. The distinct aggregations can happen in each operand, the leftmost operand only, or after all the operands in a separate aggregation step. Except for a couple special cases we would use the last strategy most often. This change pushes the distinct aggregation down to the leftmost operand in cases where there are no analytic functions, or when a distinct or grouping operation already eliminates duplicates. In general DISTINCT placement such as in this case should be done throughout the entire plan tree in a cost based manner as described in IMPALA-5260 Testing: * TpcdsPlannerTest * PlannerTest * TPC-DS 30TB Perf run for any affected queries - Q14-1 180s -> 150s - Q14-2 109s -> 90s - Q8 no significant change * SetOperation Planner Tests * Analyzer tests * Tpcds Functional Workload Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5 Reviewed-on: http://gerrit.cloudera.org:8080/16350 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong --- M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test 6 files changed, 2,049 insertions(+), 1,806 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF 
Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5 Gerrit-Change-Number: 16350 Gerrit-PatchSet: 5 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
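The DISTINCT placement described in the IMPALA-10099 commit message above can be illustrated with a minimal Python sketch. It is not the StmtRewriter.java logic; the function names and the row representation (tuples in lists) are assumptions made for illustration. It only shows why deduplicating the leftmost operand gives the same INTERSECT/EXCEPT result as deduplicating in a separate final step.

```python
# Hypothetical model: each operand is a list of rows (tuples), duplicates allowed.

def intersect_distinct_last(left, right):
    """INTERSECT with the distinct aggregation as a separate final step."""
    right_rows = set(right)
    matched = [row for row in left if row in right_rows]  # may contain duplicates
    return set(matched)                                   # deduplicate at the end

def intersect_distinct_leftmost(left, right):
    """INTERSECT with the distinct aggregation pushed to the leftmost operand."""
    left_distinct = set(left)                             # deduplicate first
    right_rows = set(right)
    return {row for row in left_distinct if row in right_rows}

def except_distinct_last(left, right):
    """EXCEPT with the distinct aggregation as a separate final step."""
    right_rows = set(right)
    return set(row for row in left if row not in right_rows)

def except_distinct_leftmost(left, right):
    """EXCEPT with the distinct aggregation pushed to the leftmost operand."""
    right_rows = set(right)
    return {row for row in set(left) if row not in right_rows}

left = [(1, "a"), (1, "a"), (2, "b"), (3, "c"), (3, "c")]
right = [(1, "a"), (3, "c"), (3, "c"), (4, "d")]

print(intersect_distinct_last(left, right) == intersect_distinct_leftmost(left, right))
print(except_distinct_last(left, right) == except_distinct_leftmost(left, right))
```

Both strategies return the same rows; the difference is only how many rows flow through the set operation, which is presumably where the reported TPC-DS runtime changes come from.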
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103 PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null values. > Yes the intent of this patch per earlier notes is to only address the =0 ca Done -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 10 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:25:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 10: I can +2 once you've addressed Qifan's comments. -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 10 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 18:02:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103 PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null values. > When the numDistinctValues > 0, such adjustment is not performed. I wonder Yes, the intent of this patch, per earlier notes, is to address only the =0 cases, as the general fix is a bit involved at the moment. -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 10 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 17:28:24 + Gerrit-HasComments: Yes
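The adjustment discussed in this thread (an NDV of zero becomes 1 when the stats show nulls, while a positive NDV is left alone) can be written as a small Python sketch. This is not the code in SlotRef.java; the function name and the parameter names are hypothetical.

```python
def adjust_ndv(num_distinct_values, num_nulls):
    """Partial fix only: bump an NDV of 0 to 1 when the column has nulls.

    Hypothetical helper; a positive NDV is returned unchanged, matching the
    reply above that only the =0 case is addressed.
    """
    if num_distinct_values == 0 and num_nulls > 0:
        return 1
    return num_distinct_values

print(adjust_ndv(0, 26))   # all-null column: 0 is adjusted to 1
print(adjust_ndv(0, 0))    # no nulls recorded: stays 0
print(adjust_ndv(3, 5))    # positive NDV: not adjusted
```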
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 4: I ran exhaustive tests + on-prem E2E tests (L0s). I haven't run any K8s tests, but I can. I think the hadoop-hdfs dependency change should be covered by the Impala exhaustive tests + L0s. The Ranger dependency change is actually already present internally, so I don't think this actually does anything. I confirmed with the Ozone team that this change is safe. -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 16:21:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 25: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6366/ -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 25 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 16:20:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Hello Tim Armstrong, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16311 to look at the new patch set (#4). Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. IMPALA-10030: Remove unnecessary jar dependencies Remove the dependency on hadoop-hdfs. This jar file contains the core code for implementing HDFS, and thus pulls in a bunch of unnecessary transitive dependencies. Impala currently only requires this jar for some configuration key names. Most of these configuration key names have been moved to the appropriate HDFS client jars, and some others are deprecated altogether. Removing this jar required making a few code changes to move the location of the referenced configuration keys. Removes all transitive Kafka dependencies from the Apache Ranger dependency. Previously, Impala only excluded Kafka jars with binary version kafka_2.11; however, it seems that Ranger recently upgraded the dependency version to kafka_2.12. Now all Kafka dependencies are excluded, regardless of artifact name. Removes all transitive dependencies from the Apache Ozone dependency. Impala has a dependency on the Ozone client shaded-jar, which already includes all required transitive dependencies. For some reason, Ozone still pulls in some transitive dependencies even though they are not needed. Made some other minor cleanup / improvements in the fe/pom.xml file. This saves about 70 MB of space in the Docker images. 
Testing: * Ran exhaustive tests * Ran on-prem cluster E2E tests Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc --- M fe/pom.xml M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java 5 files changed, 51 insertions(+), 102 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/16311/4 -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16389 ) Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH clauses .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6365/ -- To view, visit http://gerrit.cloudera.org:8080/16389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad Gerrit-Change-Number: 16389 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 15:29:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. Patch Set 16: (5 comments) Looks good to me! http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG@9 PS16, Line 9: As a general rule, an outer join can be converted to an inner join if : there is a condition on the inner table that filters out non‑matching : rows. In a left outer join, the right table is the inner table, while : it is the left table in a right outer join. In a full outer join, both : tables are inner tables. Conditions that are FALSE for nulls are : referred to as null filtering conditions, and these are the conditions : that enable the outer‑to‑inner join conversion to be made. Maybe reworded as "Outer joins in SQL can return rows with certain columns filled with NULLs when a match can not be found. However, such rows can be rejected by null-rejecting predicates. The conditions in a null-rejecting predicate that are always evaluated to FALSE for NULLs are referred to as null-filtering conditions. In general, an outer join can be converted to an inner join if there exist null-filtering conditions on the inner tables. In a left outer join, the right table is the inner table, while in a right outer join it is the left table. In a full outer join, both tables are inner tables." http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG@50 PS16, Line 50: I think we need to add a high-level description of what work is done in this commit. And also what will be the part 2 work. 
http://gerrit.cloudera.org:8080/#/c/16266/13/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/16266/13/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3277 PS13, Line 3277:*/ : private boolean isNullableConjunct(Expr e, List tupleIds) { : // A clause like "t1.v1 IS NOT NULL OR t2.v2 IS NOT NULL" and t1 in 'tupleIds' does : // not prove that t1.v1 can't be NULL, because when t2.v2 IS NOT NULL, t1.v1 can be : // null. But a clause like "t1.v1 IS NOT NULL OR t1.v2 IS NOT NULL" proves that the : // t1 row as a whole can't be all-NULL. : Lis > I changed to use the set retainAll method, but we should collect all of the OK. It sounds like test in one shot for t1.id>10 and t2.id<10 or t2.id>50 or t2.name='a' will not work. http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@321 PS12, Line 321: Condition nit. "Conditional" http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@327 PS12, Line 327: f > OK, I see. I changed this as the doc. But I think the 'case' is not Functio Done -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 16 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Mon, 31 Aug 2020 15:13:09 + Gerrit-HasComments: Yes
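The rule quoted from the commit message in this review (a null-filtering condition on the inner table lets an outer join be converted to an inner join) can be checked on toy data. The sketch below is only an illustration in Python, not the Analyzer.java implementation; the table contents, the column names `id` and `val`, and the helper names are all assumptions.

```python
def left_outer_join(left, right, key):
    """Left outer join; unmatched left rows are paired with None (the NULL row)."""
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    return out

def inner_join(left, right, key):
    """Inner join; unmatched left rows are dropped."""
    return [(l, r) for l in left for r in right if r[key] == l[key]]

def inner_val_greater_than(pair, threshold):
    """Null-filtering condition on the inner (right) table: never true for NULLs."""
    _, r = pair
    return r is not None and r["val"] > threshold

t1 = [{"id": 1}, {"id": 2}, {"id": 3}]
t2 = [{"id": 1, "val": 20}, {"id": 2, "val": 5}]

outer_filtered = [p for p in left_outer_join(t1, t2, "id") if inner_val_greater_than(p, 10)]
inner_filtered = [p for p in inner_join(t1, t2, "id") if inner_val_greater_than(p, 10)]
print(outer_filtered == inner_filtered)
```

The null-extended row for `id` 3 is produced by the outer join but rejected by the condition, so both plans return the same rows. A predicate such as `r is None or ...` would not be null-filtering, and the conversion would not be valid.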
[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 ) Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs. .. Patch Set 10: (14 comments) Looks good! http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98 PS10, Line 98: adjustNdv( Maybe rename it to getNumDistinctValuesAdjusted(). http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103 PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null values. When the numDistinctValues > 0, such adjustment is not performed. I wonder whether making the adjustment unconditional would hurt anything. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191 PS10, Line 191: tNumDistinctValues() { return numDistinctValues_; } nit: this looks like just a move of the method within the module. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211 PS10, Line 211: // Bug: NDV should be 1 to include nulls This comment can be removed. 
http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@176 PS10, Line 176: a id http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@178 PS10, Line 178: f some_nulls http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@180 PS10, Line 180: c null_str http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182 PS10, Line 182: // NDV(b) = 1, add 1 for nulls : // Bug: See IMPALA-7310, IMPALA-8094 : //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 2); Seems like these lines can be removed. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132 PS10, Line 132: f has NDV=3 This comment is not accurate. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136 PS10, Line 136: c is all nulls Seems like the reference to c is not right here. http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138 PS10, Line 138: a same here. 
http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140 PS10, Line 140: a same here http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140 PS10, Line 140: f) same http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182 PS10, Line 182: = 1 Maybe as // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26. -- To view, visit http://gerrit.cloudera.org:8080/16349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032 Gerrit-Change-Number: 16349 Gerrit-PatchSet: 10 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 14:28:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16385 ) Change subject: IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism .. Patch Set 1: (4 comments) Awesome! I am super interested to see the performance impact of this change. http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@15 PS1, Line 15: TARGETED_KUDU_SCAN_RANGE_LENGTH nit can you add the default chosen to the commit message here. http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@24 PS1, Line 24: Testing Did you do any performance testing to gauge the impact (good or bad)? http://gerrit.cloudera.org:8080/#/c/16385/1/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/16385/1/be/src/service/query-options.h@a50 PS1, Line 50: Was this change an accident? http://gerrit.cloudera.org:8080/#/c/16385/1/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/16385/1/common/thrift/ImpalaService.thrift@574 PS1, Line 574: mt_dop >= 2 nit: "mt_dop > 1" to simplify and to match the commit message. -- To view, visit http://gerrit.cloudera.org:8080/16385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8 Gerrit-Change-Number: 16385 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 31 Aug 2020 13:31:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16351 ) Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda Gerrit-Change-Number: 16351 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 31 Aug 2020 13:28:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16351 ) Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state .. IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state A query will come into the FINISHED state when some rows are available, even when some fragment instances are still executing. When a retryable query comes into the FINISHED state and the client hasn't fetched any results, we are still able to retry it for any retryable failures. This patch fixes a DCHECK when retrying a FINISHED state query. Tests: - Add a test in test_query_retries.py for retrying a query in FINISHED state. Change-Id: I11d82bf80640760a47325833463def8a3791bdda Reviewed-on: http://gerrit.cloudera.org:8080/16351 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/runtime/query-driver.cc M tests/custom_cluster/test_query_retries.py 2 files changed, 25 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda Gerrit-Change-Number: 16351 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar
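The retry condition described in the IMPALA-10065 commit message can be summarized in a short Python sketch. This is a simplified model, not the logic in query-driver.cc: the reduced state set, the function name, and the `rows_fetched` counter are assumptions made for illustration.

```python
from enum import Enum

class QueryState(Enum):
    # Hypothetical, reduced set of states for illustration only.
    RUNNING = 1
    FINISHED = 2   # some rows are available; fragment instances may still be executing

def can_retry(state, is_retryable, rows_fetched):
    """Return True if a retryable failure may still trigger a retry.

    Per the commit message, a query in FINISHED state can still be retried
    as long as the client has not fetched any results.
    """
    if not is_retryable:
        return False
    if rows_fetched > 0:
        return False
    return state in (QueryState.RUNNING, QueryState.FINISHED)

print(can_retry(QueryState.FINISHED, True, 0))    # FINISHED, nothing fetched yet
print(can_retry(QueryState.FINISHED, True, 10))   # client already fetched rows
```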
[Impala-ASF-CR] IMPALA-10115: Impala should check file schema as well to check full ACIDv2 files
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16383 ) Change subject: IMPALA-10115: Impala should check file schema as well to check full ACIDv2 files .. Patch Set 1: Code-Review+2 (1 comment) The patch seems good to me, my only concern is about losing test coverage in the future. http://gerrit.cloudera.org:8080/#/c/16383/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16383/1//COMMIT_MSG@20 PS1, Line 20: * tested manually on a file that has ACIDv2 schema, but : 'hive.acid.version' is missing I would prefer to have an automatic test with a specific file, as Hive may set "hive.acid.version" during query-based compaction in the future, but should still be able to handle files written by older versions. -- To view, visit http://gerrit.cloudera.org:8080/16383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52642c1755599efd28fa2c90f13396cfe0f5fa14 Gerrit-Change-Number: 16383 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 12:01:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 25: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7052/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 25 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 11:30:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 25: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6366/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 25 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 11:15:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10108: Implement ds_kll_stringify function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16370 ) Change subject: IMPALA-10108: Implement ds_kll_stringify function .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6362/ -- To view, visit http://gerrit.cloudera.org:8080/16370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I97f654a4838bf91e3e0bed6a00d78b2c7aa96f75 Gerrit-Change-Number: 16370 Gerrit-PatchSet: 7 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 11:14:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has uploaded a new patch set (#25). ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. IMPALA-9741: Support querying Iceberg table by impala This patch mainly implements querying of Iceberg tables through Impala. We can use the following SQL to create an external Iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or include just the table name and location, like this: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format in Iceberg; currently only PARQUET is supported, and other formats will be supported in the future. If you don't specify this property in your SQL, the default file format is PARQUET. We achieve this by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then transfer this information to the BE to do the real scan operation. 
Testing: - Unit test for Iceberg in FileMetadataLoaderTest - Create table tests in functional_schema_template.sql - Iceberg table query test in test_scanners.py Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M bin/rat_exclude_files.txt M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M 
testdata/data/README A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-1-5dbd44ad-18bc-40f2-9dd6-aeb2cc23457c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-3-27db2521-1e8b-40c1-b846-552cd620abce-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-4-f1b55628-0544-4833-8b11-1b4add53dfd6-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-6-f75530ef-93b6-4994-b3c8-db957d44848c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-7-8d9b22da-5f10-4cbf-8e4d-160f829b5e48-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-9-f029a1f7-9024-4bc3-a030-e20861586146-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-11-f07814ae-56cd-486b-af81-18541437da7d-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-12-967c70a4-bf4d-4a82-8c97-c90e2b4d9dcf-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-14-d0cdca7f-c050-407e-b70c-2bd076f83e4e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-15-0e931a1f-309e-43b3-a5cf-3ef82fa4a87c-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-17-43138078-244c-4b38-8127-04a5bfbc4695-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-19-52569895-df25-4ad8-b64d-49c4540d36c9-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-20-f160c1ea-a2f5-4109-81ec-3ff9c155430f-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-22-c1f61b8c-9d9a
[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16389 ) Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH clauses .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6365/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad Gerrit-Change-Number: 16389 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 10:17:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16389 ) Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH clauses .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7051/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad Gerrit-Change-Number: 16389 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 31 Aug 2020 10:13:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses
Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16389 ) Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH clauses .. IMPALA-10051: impala-shell exits with ValueError with WITH clauses When a query contains a WITH clause, impala-shell tries to identify whether it is a DML query so that it can later provide appropriate result messages. Previously, shlex was used to create tokens and assess the query type from them. However, shlex can misinterpret some query strings where whitespace characters are mixed with quotes, because it splits the string on whitespace characters. In some scenarios a 'ValueError: No closing quotation' error can occur. This change moves the tokenization from shlex to sqlparse. Testing: - Added unit test to cover queries that contain mixed whitespace and strings Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 21 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/16389/2 -- To view, visit http://gerrit.cloudera.org:8080/16389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad Gerrit-Change-Number: 16389 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate
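The failure mode described in the IMPALA-10051 message above can be reproduced with the standard library alone. The sketch below is only an illustration: the helper name try_shlex and both sample queries are hypothetical, and the failing input is one string that shlex rejects, not necessarily the one from the bug report. It does not show the sqlparse-based replacement.

```python
import shlex


def try_shlex(query):
    """Tokenize a query with shlex; return (tokens, error_message).

    Hypothetical helper for illustration only; it is not part of impala-shell.
    """
    try:
        return shlex.split(query), None
    except ValueError as e:
        return None, str(e)


# A WITH query without string literals tokenizes cleanly.
plain = "WITH t AS (SELECT 1) SELECT * FROM t"

# A SQL string literal with a backslash-escaped quote. shlex applies shell
# quoting rules, where a backslash is literal inside single quotes, so the
# quotes look unbalanced to it.
escaped = r"WITH t AS (SELECT 'it\'s' AS s) SELECT s FROM t"

plain_tokens, plain_err = try_shlex(plain)
escaped_tokens, escaped_err = try_shlex(escaped)

print(plain_tokens[0], plain_err)
print(escaped_tokens, escaped_err)
```

The second call fails because shlex follows shell quoting conventions rather than SQL ones, which is the kind of mismatch a SQL-aware tokenizer avoids.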
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 24: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6364/ -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 24 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 09:44:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7050/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 24 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 08:45:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 24: Code-Review+1 Thanks for the changes! I've just restarted the verify job on PS24. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 24 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 08:30:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 24: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6364/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 24 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 31 Aug 2020 08:30:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. IMPALA-9741: Support querying Iceberg table by impala This patch implements querying of Iceberg tables through Impala. The following SQL creates an external Iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Alternatively, specify only the table name and location: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format used by Iceberg. Currently only PARQUET is supported; other formats will be supported in the future. If this property is not specified in the SQL, the default file format is PARQUET. This is achieved by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then transfer this information to the BE to do the actual scan. 
Testing: - Unit test for Iceberg in FileMetadataLoaderTest - Create table tests in functional_schema_template.sql - Iceberg table query test in test_scanners.py Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M bin/rat_exclude_files.txt M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M 
testdata/data/README A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v1.metadata.json A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v2.metadata.json A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/version-hint.text A testdata/data/iceberg_test/iceberg_partitioned/metadata/v1.metadata.json A testdata/data/iceberg_test/iceberg_partitioned/metadata/v2.metadata.json A testdata/data/iceberg_test/iceberg_partitioned/metadata/version-hint.text M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv R testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-profile.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test R tests/query_test/test_iceberg.py M tests/query_test/test_scanners.py 45 files changed, 1,436 insertions(+), 200 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/16143/24 -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 24 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Ger
[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16351 ) Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6363/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda Gerrit-Change-Number: 16351 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 31 Aug 2020 08:17:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16351 ) Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state .. Patch Set 4: > Patch Set 4: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6361/ Hit IMPALA-9351. Rerun the test. -- To view, visit http://gerrit.cloudera.org:8080/16351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda Gerrit-Change-Number: 16351 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 31 Aug 2020 08:17:22 + Gerrit-HasComments: No