[jira] [Comment Edited] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809762#comment-17809762
 ] 

Butao Zhang edited comment on HIVE-28015 at 1/23/24 2:59 PM:
-

Spark-iceberg uses this *alter set* syntax to add identifier-field-ids. Should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id -- single column}}

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data -- multiple columns}}

Or should we use the *primary key( i )* syntax, as in your example?

*create table ice_pk (i int, j int, primary key( i )) stored by iceberg;*


was (Author: zhangbutao):
Spark-iceberg uses this *alter set* syntax to add identifier-field-ids. Should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id -- single column}}

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data -- multiple columns}}

Or should we use the *primary key(i)* syntax, as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration
> Affects Versions: 4.0.0
> Reporter: Denys Kuzmenko
> Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables; however, it 
> simply ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}
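
To illustrate the point in the description that only the PK columns end up in equality-delete files, here is a minimal conceptual sketch. It is not Hive or Iceberg code: the `Field` class, the field IDs 1 and 2, and both helper functions are hypothetical names made up for this illustration, and the table is modelled as a plain list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema column with a numeric field ID."""
    field_id: int
    name: str


# Assumed layout of the ice_pk table from the description (IDs are illustrative).
SCHEMA = [Field(1, "i"), Field(2, "j")]


def identifier_field_ids(schema, pk_columns):
    """Translate primary-key column names into field IDs, in the order given."""
    by_name = {f.name: f.field_id for f in schema}
    missing = [c for c in pk_columns if c not in by_name]
    if missing:
        raise ValueError(f"unknown primary key column(s): {missing}")
    return [by_name[c] for c in pk_columns]


def equality_delete_record(schema, id_field_ids, row):
    """Project a row onto the identifier columns only."""
    wanted = set(id_field_ids)
    return {f.name: row[f.name] for f in schema if f.field_id in wanted}


ids = identifier_field_ids(SCHEMA, ["i"])
delete_entry = equality_delete_record(SCHEMA, ids, {"i": 7, "j": 42})
print(ids, delete_entry)
```

In this sketch the non-key column `j` is dropped from the delete entry, which is why a writer that relies on equality deletes needs the key columns to be recorded on the table rather than ignored.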



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-22 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809762#comment-17809762
 ] 

Butao Zhang edited comment on HIVE-28015 at 1/23/24 6:55 AM:
-

Spark-iceberg uses this *alter set* syntax to add identifier-field-ids. Should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id -- single column}}

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data -- multiple columns}}

Or should we use the *primary key(i)* syntax, as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*


was (Author: zhangbutao):
Spark-iceberg uses this *alter set* syntax to add identifier-field-ids. Should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id -- single column}}

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data -- multiple columns}}

Or should we use the *primary key(i)* syntax, as in your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
> Issue Type: Improvement
> Reporter: Denys Kuzmenko
> Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables; however, it 
> simply ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}


