[ 
https://issues.apache.org/jira/browse/IMPALA-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-11034:
---------------------------------------
    Description: 
When external tables are converted to Iceberg, the data files remain intact.
This means that the old data files don't have field id information which is 
essential for schema evolution.

However, there is a workaround for this, see: 
[https://github.com/trinodb/trino/issues/9843]

We need to create a NameMapping that maps field ids to column names; we can 
then resolve columns in the legacy files with the help of the name mapping.

  was:
When external tables are converted to Iceberg, the data files remain intact.
This means that the old data files don't have field id information which is 
essential for schema evolution.

However, there is a workaround for this, see: 
https://github.com/trinodb/trino/issues/9843

Basically we need to translate the current schema to the first schema of the 
table using the field ids, then we can use name-based or position-based schema 
resolution in the data files.


> Resolve schema of old data files in migrated Iceberg tables
> -----------------------------------------------------------
>
>                 Key: IMPALA-11034
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11034
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> When external tables are converted to Iceberg, the data files remain intact.
> This means that the old data files don't have field id information which is 
> essential for schema evolution.
> However, there is a workaround for this, see: 
> [https://github.com/trinodb/trino/issues/9843]
> We need to create a NameMapping that maps field ids to column names; we can 
> then resolve columns in the legacy files with the help of the name mapping.
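
For illustration only, here is a minimal Python sketch of the idea in the
description: legacy data files carry column names but no field ids, so a name
mapping can stand in for the missing ids during column resolution. This is not
Impala's implementation and not the Iceberg library API. The function names,
the sample columns, and the flat (top-level only) schema are all assumptions
made for the example; only the "field-id"/"names" entry shape follows the JSON
form of Iceberg name mappings.

```python
# Hypothetical sketch: every name below is illustrative, not Impala code.
from typing import Dict, List, Optional

# Name mapping entries: one field id plus the names that id may carry
# in data files written before the table was converted to Iceberg.
NAME_MAPPING: List[dict] = [
    {"field-id": 1, "names": ["id"]},
    {"field-id": 2, "names": ["name", "full_name"]},
    {"field-id": 3, "names": ["ts"]},
]


def build_name_index(mapping: List[dict]) -> Dict[str, int]:
    """Invert the mapping: legacy column name -> field id."""
    index: Dict[str, int] = {}
    for entry in mapping:
        for name in entry["names"]:
            index[name] = entry["field-id"]
    return index


def resolve_columns(
    file_columns: List[str],
    table_schema: Dict[int, str],
    mapping: List[dict],
) -> Dict[int, Optional[int]]:
    """Map each field id of the current schema to a column position in a
    legacy file, or to None when the file has no matching column."""
    name_to_id = build_name_index(mapping)
    id_to_pos: Dict[int, int] = {}
    for pos, column in enumerate(file_columns):
        field_id = name_to_id.get(column)
        if field_id is not None:
            # Keep the first match if a file somehow repeats a mapped name.
            id_to_pos.setdefault(field_id, pos)
    return {field_id: id_to_pos.get(field_id) for field_id in table_schema}


# Assumed scenario: after migration, "full_name" was renamed to "name" and a
# new column "email" (field id 4) was added. The legacy file predates both.
legacy_file_columns = ["id", "full_name", "ts"]
current_schema = {1: "id", 2: "name", 3: "ts", 4: "email"}

print(resolve_columns(legacy_file_columns, current_schema, NAME_MAPPING))
# prints {1: 0, 2: 1, 3: 2, 4: None}
```

In this sketch the renamed column still resolves through its old name, and the
column added after migration resolves to None, which a reader would normally
surface as NULL values for that file.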



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

