[ 
https://issues.apache.org/jira/browse/FLINK-32596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817533#comment-17817533
 ] 

Vallari Rastogi commented on FLINK-32596:
-----------------------------------------

[~luoyuxia] 

Hive Metastore expects partition columns to come last when inserting 
data. See [hiveql - Hive partition column - Stack 
Overflow|https://stackoverflow.com/questions/60510174/hive-partition-column]

So Flink uses the last 'n' columns as the partition columns: 
[https://github.com/apache/flink/blob/403694e7b9c213386f3ed9cff21ce2664030ebc2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveTableUtil.java#L515]
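
To make the effect concrete, here is a minimal sketch of the "last 'n' columns" rule applied to the table from the issue description. It is not the code in HiveTableUtil; the class and both helper methods are hypothetical names used only for illustration. The second helper shows one possible way to move the declared partition keys to the end first, which is an assumption and not necessarily what the test commit does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical helpers, not Flink's HiveTableUtil.
public class PartitionKeySketch {

    /** Treats the last n columns as the partition keys, as described above. */
    static List<String> lastNAsPartitionKeys(List<String> columns, int n) {
        return new ArrayList<>(columns.subList(columns.size() - n, columns.size()));
    }

    /** Moves the declared partition keys to the end, keeping the other columns in order. */
    static List<String> movePartitionKeysLast(List<String> columns, List<String> partitionKeys) {
        List<String> reordered = new ArrayList<>(columns);
        reordered.removeAll(partitionKeys);
        reordered.addAll(partitionKeys);
        return reordered;
    }

    public static void main(String[] args) {
        // Schema from the issue: t1(`date`, `geo_altitude`) partitioned by (`date`)
        List<String> columns = List.of("date", "geo_altitude");
        List<String> declared = List.of("date");

        // Taking the last column as-is picks the wrong key.
        System.out.println(lastNAsPartitionKeys(columns, declared.size())); // [geo_altitude]

        // Reordering first yields the declared key.
        List<String> reordered = movePartitionKeysLast(columns, declared);
        System.out.println(lastNAsPartitionKeys(reordered, declared.size())); // [date]
    }
}
```

With the partition column declared first, the unmodified rule returns geo_altitude, which matches the failing assertion in the issue description.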

 

SELECT and INSERT queries follow the same logic and expect the partition 
columns to come last.

 

As a test, I made a change here:

[https://github.com/apache/flink/commit/df47ceaba82a3d4f3392c1b53bb52f34d520cc3d]

 

Results:

!image-2024-02-15-03-05-22-541.png|width=600,height=106!

!image-2024-02-15-03-06-28-175.png|width=468,height=267!

The partition columns will always come last because of HMS, for example when 
we use an INSERT statement like: 

_INSERT INTO testHive2 PARTITION (ts='22:16:46', active='TRUE') SELECT 1, 46, 
'false';_

_SELECT query output:_

_!image-2024-02-15-03-08-50-029.png|width=525,height=328!_
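
As a rough sketch of the partitions-last layout for the INSERT statement above: the values from the SELECT fill the non-partition columns, and the static partition values are appended after them. The class and method names are hypothetical, and this models only the column ordering, not how Flink or Hive actually write the row.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models column ordering, not the real write path.
public class StaticPartitionRowSketch {

    /** Builds the logical row: selected values first, static partition values last. */
    static List<Object> appendPartitionValues(List<Object> selected, List<Object> partitionValues) {
        List<Object> row = new ArrayList<>(selected);
        row.addAll(partitionValues);
        return row;
    }

    public static void main(String[] args) {
        // SELECT 1, 46, 'false' with PARTITION (ts='22:16:46', active='TRUE')
        List<Object> row = appendPartitionValues(
                List.of(1, 46, "false"),
                List.of("22:16:46", "TRUE"));
        System.out.println(row); // [1, 46, false, 22:16:46, TRUE]
    }
}
```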

> The partition key will be wrong when use Flink dialect to create Hive table
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-32596
>                 URL: https://issues.apache.org/jira/browse/FLINK-32596
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.15.0, 1.16.0, 1.17.0
>            Reporter: luoyuxia
>            Assignee: Vallari Rastogi
>            Priority: Major
>         Attachments: image-2024-02-14-16-06-13-126.png, 
> image-2024-02-15-03-05-22-541.png, image-2024-02-15-03-06-28-175.png, 
> image-2024-02-15-03-08-50-029.png
>
>
> Can be reproduced by the following SQL:
>  
> {code:java}
> tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
> tableEnv.executeSql(
>         "create table t1(`date` string, `geo_altitude` FLOAT) partitioned by (`date`)"
>                 + " with ('connector' = 'hive', 'sink.partition-commit.delay'='1 s', "
>                 + "'sink.partition-commit.policy.kind'='metastore,success-file')");
> CatalogTable catalogTable =
>         (CatalogTable) hiveCatalog.getTable(ObjectPath.fromString("default.t1"));
> // the following assertion will fail
> assertThat(catalogTable.getPartitionKeys().toString()).isEqualTo("[date]");{code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
