[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Reviewed-on: http://gerrit.cloudera.org:8080/17575 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/custom_cluster/test_event_processing.py 20 files changed, 181 insertions(+), 180 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 15 Jul 2021 15:15:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9096/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 15 Jul 2021 09:17:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7301/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 15 Jul 2021 08:59:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 15 Jul 2021 08:59:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 7: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 15 Jul 2021 08:58:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17575 to look at the new patch set (#7). Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/custom_cluster/test_event_processing.py 20 files changed, 181 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/7 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7299/ -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 14 Jul 2021 20:24:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7299/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 14 Jul 2021 14:10:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 14 Jul 2021 14:10:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 14 Jul 2021 14:07:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9080/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 13 Jul 2021 09:04:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 4: (4 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG@33 PS4, Line 33: makes Impala to use > typo: makes Impala use Done http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@291 PS4, Line 291: transformType.startsWit > Not your change, but why is startsWith() used instead of equals() for BUCKE Because transformType might contain the parameters as well (e.g. "BUCKET[5]") when we process the partition spec loaded from iceberg. http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@302 PS4, Line 302: "Unsupported iceberg partition type: " > Do we have a test that exercises this error message? Added a test case to iceberg-negative.test http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@296 PS4, Line 296: switch (transformType) { : case "HOUR": case "HOURS": return TIcebergPartitionTransformType.HOUR; : case "DAY": case "DAYS": return TIcebergPartitionTransformType.DAY; : case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH; : case "YEAR": case "YEARS": return TIcebergPartitionTransformType.YEAR; : default: : throw new TableLoadingException("Unsupported iceberg partition type: " + : transformType); : } > nit: Maybe adding these transform type strings and the ones above to a Stri 'transformType' for BUCKET and TRUNCATE might contain the parameters as well, see above. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 13 Jul 2021 08:44:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17575 to look at the new patch set (#5). Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/custom_cluster/test_event_processing.py 20 files changed, 181 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/5 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG@33 PS4, Line 33: makes Impala to use typo: makes Impala use http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@291 PS4, Line 291: transformType.startsWit Not your change, but why is startsWith() used instead of equals() for BUCKET and TRUNCATE transports? http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@302 PS4, Line 302: "Unsupported iceberg partition type: " Do we have a test that exercises this error message? http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@296 PS4, Line 296: switch (transformType) { : case "HOUR": case "HOURS": return TIcebergPartitionTransformType.HOUR; : case "DAY": case "DAYS": return TIcebergPartitionTransformType.DAY; : case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH; : case "YEAR": case "YEARS": return TIcebergPartitionTransformType.YEAR; : default: : throw new TableLoadingException("Unsupported iceberg partition type: " + : transformType); : } nit: Maybe adding these transform type strings and the ones above to a String -> TIcebergPartitionTransformType immutable map would make the code shorter and simpler. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 07 Jul 2021 14:04:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 4: Code-Review+1 Ok, I understand, thanks for your explanation! -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 22 Jun 2021 01:50:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8957/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 21 Jun 2021 13:01:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 3: (4 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG@35 PS3, Line 35: We might later consider making 'SPEC' optional. > I think making 'SPEC' optional maybe confused users. It's better to removed Hive uses PARTITIONED BY SPEC, Spark uses PARTITIONED BY only, so it would make sense to support both in the future. Anyway, I removed this sentence because in the near future we'll only support the Hive-like PARTITIONED BY SPEC. Supporting PARTITIONED BY would require bigger changes in the SQL parser. http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java: http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@97 PS3, Line 97: builder.append(transformType_.toString() + "("); > Maybe this is better: Done http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@99 PS3, Line 99: builder.append(transformParam_.toString() + ", "); > Same as above Done http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@297 PS3, Line 297: case "HOUR": case "HOURS": return TIcebergPartitionTransformType.HOUR; : case "DAY": case "DAYS": return TIcebergPartitionTransformType.DAY; : case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH; : case "YEAR": case "YEARS": return TIcebergPartitionTransformType.YEAR; > I see that both Hive and Spark use HOURS/DAYS/MONTHS/YEARS in your comment I contacted the Hive team, and actually Hive supports both: https://github.com/apache/hive/blob/8ef538c6d84d0c9d7b610ca446bdc1083d62fa1b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java#L182 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 21 Jun 2021 12:39:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17575 to look at the new patch set (#4). Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala to use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/custom_cluster/test_event_processing.py 20 files changed, 174 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/4 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 3: (4 comments) Sorry for my late reply, Zoltan. Thanks for this feature, it's quite important to keep consistent syntax with other engines. Here is some of my opinions. http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG@35 PS3, Line 35: We might later consider making 'SPEC' optional. I think making 'SPEC' optional maybe confused users. It's better to removed this key word or keeping it as necessary in create statement. How do you think? http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java: http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@97 PS3, Line 97: builder.append(transformType_.toString() + "("); Maybe this is better: builder.append(transformType_.toString()).append("("); http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@99 PS3, Line 99: builder.append(transformParam_.toString() + ", "); Same as above http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@297 PS3, Line 297: case "HOUR": case "HOURS": return TIcebergPartitionTransformType.HOUR; : case "DAY": case "DAYS": return TIcebergPartitionTransformType.DAY; : case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH; : case "YEAR": case "YEARS": return TIcebergPartitionTransformType.YEAR; I see that both Hive and Spark use HOURS/DAYS/MONTHS/YEARS in your comment example, maybe we do not need to support HOUR/DAY/MONTH/YEAR. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 17 Jun 2021 08:16:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8903/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 14 Jun 2021 13:15:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17575 to look at the new patch set (#3). Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala to use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. We might later consider making 'SPEC' optional. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/custom_cluster/test_event_processing.py 20 files changed, 174 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/3 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8896/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 11 Jun 2021 12:56:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17575 to look at the new patch set (#2). Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala to use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. We might later consider making 'SPEC' optional. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 19 files changed, 173 insertions(+), 179 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7216/ -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 10 Jun 2021 21:32:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8882/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 10 Jun 2021 15:57:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7216/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 10 Jun 2021 15:35:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17575/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/17575/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4861 PS1, Line 4861: "PARTITIONED BY SPEC (BUCKET(10, p1), TRUNCATE(5, p1), DAY(p2)) STORED AS ICEBERG" + line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 10 Jun 2021 15:35:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17575 Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. IMPALA-10732: Use consistent DDL for specifying Iceberg partitions Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL: https://iceberg.apache.org/spark-ddl/#partitioned-by E.g. Impala is using the following syntax: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR) STORED AS ICEBERG; The same in Spark is: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) USING ICEBERG PARTITIONED BY (bucket(5, i), months(ts), years(d)) HIVE-25179 added the following syntax for Hive: CREATE TABLE ice_t (i int, s string, ts timestamp, d date) PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d)) STORED BY ICEBERG; I.e. the same syntax as Spark, but adding the keyword "SPEC". This patch makes Impala to use Hive's syntax, i.e. we will also use the PARTITIONED BY SPEC clause + the unified partition transform syntax. We might later consider making 'SPEC' optional. Testing: * existing tests has been rewritten with the new syntax Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 19 files changed, 171 insertions(+), 178 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/1 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy