[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Reviewed-on: http://gerrit.cloudera.org:8080/17575
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/custom_cluster/test_event_processing.py
20 files changed, 181 insertions(+), 180 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 8: Verified+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 15 Jul 2021 15:15:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9096/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 15 Jul 2021 09:17:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7301/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 15 Jul 2021 08:59:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 8: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 15 Jul 2021 08:59:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 7: Code-Review+2

Carry +2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 15 Jul 2021 08:58:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-15 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17575

to look at the new patch set (#7).

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/custom_cluster/test_event_processing.py
20 files changed, 181 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/7
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7299/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 14 Jul 2021 20:24:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7299/ 
DRY_RUN=false


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 14 Jul 2021 14:10:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 6: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 14 Jul 2021 14:10:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-14 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 5: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 14 Jul 2021 14:07:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9080/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 13 Jul 2021 09:04:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 4:

(4 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG@33
PS4, Line 33: makes Impala to use
> typo: makes Impala use
Done


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@291
PS4, Line 291: transformType.startsWit
> Not your change, but why is startsWith() used instead of equals() for BUCKE
Because transformType might contain the parameters as well (e.g. "BUCKET[5]") 
when we process the partition spec loaded from iceberg.
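
The reason for the prefix match can be illustrated with a minimal sketch. It assumes, as the reply above states, that a transform string loaded from Iceberg may carry its parameter in brackets (e.g. "BUCKET[5]"). The class TransformPrefixDemo, the enum Kind, the helper names, and the upper-casing step are hypothetical and are not taken from Impala's IcebergUtil.

```java
import java.util.Locale;

class TransformPrefixDemo {
  // Hypothetical stand-in for the Thrift enum used in IcebergUtil.
  enum Kind { BUCKET, TRUNCATE, HOUR, DAY, MONTH, YEAR }

  // equals("BUCKET") would reject "BUCKET[5]", so the parameterized
  // transforms are matched by prefix before the fixed names are compared.
  // Upper-casing the input is an assumption made for this sketch.
  static Kind parseKind(String transformType) {
    String t = transformType.toUpperCase(Locale.ROOT);
    if (t.startsWith("BUCKET")) return Kind.BUCKET;
    if (t.startsWith("TRUNCATE")) return Kind.TRUNCATE;
    switch (t) {
      case "HOUR":  case "HOURS":  return Kind.HOUR;
      case "DAY":   case "DAYS":   return Kind.DAY;
      case "MONTH": case "MONTHS": return Kind.MONTH;
      case "YEAR":  case "YEARS":  return Kind.YEAR;
      default:
        throw new IllegalArgumentException(
            "Unsupported iceberg partition type: " + transformType);
    }
  }

  // Returns the bracketed integer parameter, or -1 when none is present.
  static int parseParam(String transformType) {
    int open = transformType.indexOf('[');
    int close = transformType.indexOf(']');
    if (open < 0 || close <= open) return -1;
    return Integer.parseInt(transformType.substring(open + 1, close).trim());
  }

  public static void main(String[] args) {
    System.out.println(parseKind("BUCKET[5]") + " " + parseParam("BUCKET[5]"));
    System.out.println(parseKind("months") + " " + parseParam("months"));
  }
}
```

With an exact equals() comparison, the first call in main would fall through to the default branch and throw, which is the case the reply describes.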


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@302
PS4, Line 302: "Unsupported iceberg partition type: "
> Do we have a test that exercises this error message?
Added a test case to iceberg-negative.test


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@296
PS4, Line 296:   switch (transformType) {
 :   case "HOUR":  case "HOURS":  return TIcebergPartitionTransformType.HOUR;
 :   case "DAY":   case "DAYS":   return TIcebergPartitionTransformType.DAY;
 :   case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH;
 :   case "YEAR":  case "YEARS":  return TIcebergPartitionTransformType.YEAR;
 :   default:
 :     throw new TableLoadingException("Unsupported iceberg partition type: " +
 :         transformType);
 : }
> nit: Maybe adding these transform type strings and the ones above to a Stri
'transformType' for BUCKET and TRUNCATE might contain the parameters as well, 
see above.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 13 Jul 2021 08:44:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-13 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17575

to look at the new patch set (#5).

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/custom_cluster/test_event_processing.py
20 files changed, 181 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/5
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-07-07 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG@33
PS4, Line 33: makes Impala to use
typo: makes Impala use


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@291
PS4, Line 291: transformType.startsWit
Not your change, but why is startsWith() used instead of equals() for BUCKET 
and TRUNCATE transforms?


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@302
PS4, Line 302: "Unsupported iceberg partition type: "
Do we have a test that exercises this error message?


http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@296
PS4, Line 296:   switch (transformType) {
 :   case "HOUR":  case "HOURS":  return TIcebergPartitionTransformType.HOUR;
 :   case "DAY":   case "DAYS":   return TIcebergPartitionTransformType.DAY;
 :   case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH;
 :   case "YEAR":  case "YEARS":  return TIcebergPartitionTransformType.YEAR;
 :   default:
 :     throw new TableLoadingException("Unsupported iceberg partition type: " +
 :         transformType);
 : }
nit: Maybe adding these transform type strings and the ones above to a String 
-> TIcebergPartitionTransformType immutable map would make the code shorter and 
simpler.
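
A rough sketch of what this suggestion could look like is given here. It is only an illustration under assumptions: the class TransformLookupDemo, the enum Kind, and the method lookup are hypothetical names, Map.of requires Java 9 or later, and only the fixed-name time transforms are included. BUCKET and TRUNCATE are left out because, as the author notes elsewhere in this thread, their strings may carry parameters (e.g. "BUCKET[5]") and so cannot be plain map keys.

```java
import java.util.Map;

class TransformLookupDemo {
  // Hypothetical stand-in for TIcebergPartitionTransformType.
  enum Kind { HOUR, DAY, MONTH, YEAR }

  // Map.of returns an unmodifiable map; singular and plural spellings
  // both map to the same transform kind.
  static final Map<String, Kind> TIME_TRANSFORMS = Map.of(
      "HOUR", Kind.HOUR, "HOURS", Kind.HOUR,
      "DAY", Kind.DAY, "DAYS", Kind.DAY,
      "MONTH", Kind.MONTH, "MONTHS", Kind.MONTH,
      "YEAR", Kind.YEAR, "YEARS", Kind.YEAR);

  // Replaces the switch statement with a single lookup plus an error path.
  static Kind lookup(String transformType) {
    Kind kind = TIME_TRANSFORMS.get(transformType);
    if (kind == null) {
      throw new IllegalArgumentException(
          "Unsupported iceberg partition type: " + transformType);
    }
    return kind;
  }

  public static void main(String[] args) {
    System.out.println(lookup("MONTHS"));
    System.out.println(lookup("DAY"));
  }
}
```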



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 07 Jul 2021 14:04:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-21 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 4: Code-Review+1

Ok, I understand, thanks for your explanation!


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 22 Jun 2021 01:50:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8957/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 21 Jun 2021 13:01:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-21 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 3:

(4 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG@35
PS3, Line 35: We might later consider making 'SPEC' optional.
> I think making 'SPEC' optional maybe confused users. It's better to removed
Hive uses PARTITIONED BY SPEC, Spark uses PARTITIONED BY only, so it would make 
sense to support both in the future.

Anyway, I removed this sentence because in the near future we'll only support 
the Hive-like PARTITIONED BY SPEC. Supporting PARTITIONED BY would require 
bigger changes in the SQL parser.


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java:

http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@97
PS3, Line 97: builder.append(transformType_.toString() + "(");
> Maybe this is better:
Done


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@99
PS3, Line 99: builder.append(transformParam_.toString() + ", ");
> Same as above
Done
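
The reviewer's proposed replacement is cut off in this digest, so the following is only a guess at one common alternative: chaining append() calls rather than concatenating strings inside append(). The class TransformToSqlDemo and the method toSql with its parameters are hypothetical and do not reproduce IcebergPartitionTransform.

```java
class TransformToSqlDemo {
  // Renders a transform in the unified syntax, e.g. BUCKET(5, i) or MONTH(ts).
  // param is null for transforms that take no parameter.
  static String toSql(String transformType, Integer param, String fieldName) {
    StringBuilder builder = new StringBuilder();
    // Chained appends avoid building a temporary String for each piece.
    builder.append(transformType).append("(");
    if (param != null) {
      builder.append(param).append(", ");
    }
    builder.append(fieldName).append(")");
    return builder.toString();
  }

  public static void main(String[] args) {
    System.out.println(toSql("BUCKET", 5, "i"));
    System.out.println(toSql("MONTH", null, "ts"));
  }
}
```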


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@297
PS3, Line 297:   case "HOUR":  case "HOURS":  return TIcebergPartitionTransformType.HOUR;
 :   case "DAY":   case "DAYS":   return TIcebergPartitionTransformType.DAY;
 :   case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH;
 :   case "YEAR":  case "YEARS":  return TIcebergPartitionTransformType.YEAR;
> I see that both Hive and Spark use HOURS/DAYS/MONTHS/YEARS in your comment
I contacted the Hive team, and actually Hive supports both:
https://github.com/apache/hive/blob/8ef538c6d84d0c9d7b610ca446bdc1083d62fa1b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java#L182



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 21 Jun 2021 12:39:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-21 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17575

to look at the new patch set (#4).

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/custom_cluster/test_event_processing.py
20 files changed, 174 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-17 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 3:

(4 comments)

Sorry for my late reply, Zoltan. Thanks for this feature; it's quite important 
to keep the syntax consistent with other engines. Here are some of my opinions.

http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17575/3//COMMIT_MSG@35
PS3, Line 35: We might later consider making 'SPEC' optional.
I think making 'SPEC' optional may confuse users. It would be better either to 
remove this keyword or to keep it mandatory in the CREATE statement. What do 
you think?


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java:

http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@97
PS3, Line 97: builder.append(transformType_.toString() + "(");
Maybe this is better:
builder.append(transformType_.toString()).append("(");


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java@99
PS3, Line 99: builder.append(transformParam_.toString() + ", ");
Same as above
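
To illustrate the suggestion, here is a minimal sketch of a toSql-style method 
that uses only chained appends. The class PartitionTransformSketch and its 
fields are hypothetical stand-ins for illustration, not the actual code in 
IcebergPartitionTransform.java.

```java
// Hypothetical stand-in for a partition transform; not Impala's actual class.
final class PartitionTransformSketch {
  private final String transformType;    // e.g. "BUCKET" or "MONTHS"
  private final Integer transformParam;  // null when the transform takes no parameter

  PartitionTransformSketch(String transformType, Integer transformParam) {
    this.transformType = transformType;
    this.transformParam = transformParam;
  }

  // Renders the transform in the function-call style, e.g. BUCKET(5, i).
  String toSql(String columnName) {
    StringBuilder builder = new StringBuilder();
    // Chained appends avoid building a temporary String with '+' first.
    builder.append(transformType).append("(");
    if (transformParam != null) {
      builder.append(transformParam).append(", ");
    }
    builder.append(columnName).append(")");
    return builder.toString();
  }
}
```

With this sketch, new PartitionTransformSketch("BUCKET", 5).toSql("i") yields 
BUCKET(5, i). The output is the same as with '+' concatenation; the chained 
form is mainly a readability and consistency choice.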


http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/17575/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@297
PS3, Line 297:   case "HOUR":  case "HOURS":  return TIcebergPartitionTransformType.HOUR;
             :   case "DAY":   case "DAYS":   return TIcebergPartitionTransformType.DAY;
             :   case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH;
             :   case "YEAR":  case "YEARS":  return TIcebergPartitionTransformType.YEAR;
I see that both Hive and Spark use HOURS/DAYS/MONTHS/YEARS in your example, so 
maybe we do not need to support HOUR/DAY/MONTH/YEAR.
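
For reference, the lookup under discussion can be sketched in isolation as 
below. TransformTypeSketch and TransformNameSketch are hypothetical names used 
only for illustration; the real code returns TIcebergPartitionTransformType, 
and the case-insensitive matching and the exception on unknown names are 
assumptions of this sketch.

```java
import java.util.Locale;

// Hypothetical enum standing in for TIcebergPartitionTransformType.
enum TransformTypeSketch { HOUR, DAY, MONTH, YEAR }

final class TransformNameSketch {
  // Maps a transform name to its type, accepting singular and plural spellings.
  static TransformTypeSketch fromName(String name) {
    switch (name.toUpperCase(Locale.ROOT)) {
      case "HOUR":  case "HOURS":  return TransformTypeSketch.HOUR;
      case "DAY":   case "DAYS":   return TransformTypeSketch.DAY;
      case "MONTH": case "MONTHS": return TransformTypeSketch.MONTH;
      case "YEAR":  case "YEARS":  return TransformTypeSketch.YEAR;
      default:
        // Assumption for this sketch: unknown names are rejected outright.
        throw new IllegalArgumentException("Unsupported partition transform: " + name);
    }
  }
}
```

Dropping the singular spellings would amount to deleting the first label on 
each case line; everything else in the sketch would stay the same.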




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 17 Jun 2021 08:16:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8903/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 14 Jun 2021 13:15:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-14 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17575

to look at the new patch set (#3).

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax. We might later consider making 'SPEC' optional.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/custom_cluster/test_event_processing.py
20 files changed, 174 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8896/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 11 Jun 2021 12:56:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-11 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17575

to look at the new patch set (#2).

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax. We might later consider making 'SPEC' optional.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
19 files changed, 173 insertions(+), 179 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7216/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 10 Jun 2021 21:32:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8882/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 10 Jun 2021 15:57:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7216/ 
DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 10 Jun 2021 15:35:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17575 )

Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17575/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/17575/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4861
PS1, Line 4861: "PARTITIONED BY SPEC (BUCKET(10, p1), TRUNCATE(5, p1), DAY(p2)) STORED AS ICEBERG" +
line too long (92 > 90)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 10 Jun 2021 15:35:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

2021-06-10 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17575 )


Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg 
partitions
..

IMPALA-10732: Use consistent DDL for specifying Iceberg partitions

Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax. We might later consider making 'SPEC' optional.

Testing:
 * existing tests have been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionTransform.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
19 files changed, 171 insertions(+), 178 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/17575/1

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Gerrit-Change-Number: 17575
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy