[
https://issues.apache.org/jira/browse/IMPALA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515607#comment-17515607
]
ASF subversion and git services commented on IMPALA-11053:
----------------------------------------------------------
Commit 4d5530e762ca52763c7eb5acff62517507126925 in impala's branch
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4d5530e ]
IMPALA-11214: Impala reloads Iceberg tables per each data file
Due to a bug in the IMPALA-11053 change, Impala reloads the Iceberg table
once for each data file. This causes a serious performance regression for
table loads.
This patch avoids reloading the Iceberg table for each data file.
Testing:
* added exhaustive e2e test
Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Reviewed-on: http://gerrit.cloudera.org:8080/18371
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
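
The kind of regression described in the commit message can be illustrated with a minimal sketch. This is not Impala's actual code: `FakeCatalog`, `load_table`, and `load_file_descriptors` are hypothetical names made up for this example. It only shows the general idea of loading table metadata once, outside the per-file loop, instead of once per data file.

```python
class FakeCatalog:
    """Hypothetical stand-in for a metadata store; it counts table loads."""

    def __init__(self):
        self.load_count = 0

    def load_table(self, name):
        # In a real system this would be an expensive metadata fetch.
        self.load_count += 1
        return {"name": name, "partition_spec": ["year"]}


def load_file_descriptors(catalog, table_name, data_files):
    # Load the table once and reuse it for every data file.
    # Calling catalog.load_table() inside the loop would cost one
    # load per file, which is the pattern the commit message describes.
    table = catalog.load_table(table_name)
    return [(path, table["partition_spec"]) for path in data_files]


catalog = FakeCatalog()
descs = load_file_descriptors(
    catalog, "db.tbl", ["a.parquet", "b.parquet", "c.parquet"]
)
```

With the load hoisted out of the loop, the number of table loads stays constant no matter how many data files the table has.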
> Impala should be able to read migrated partitioned Iceberg tables
> -----------------------------------------------------------------
>
> Key: IMPALA-11053
> URL: https://issues.apache.org/jira/browse/IMPALA-11053
> Project: IMPALA
> Issue Type: Bug
> Reporter: Zoltán Borók-Nagy
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: impala-iceberg
> Fix For: Impala 4.1.0
>
>
> When Hive (and probably other engines as well) converts a legacy Hive table
> to Iceberg, it doesn't rewrite the data files.
> This means that the data files don't have write ids; moreover, they don't
> contain the partition columns either.
> Currently Impala expects the partition columns to be present in the data
> files, so it won't be able to read converted partitioned tables.
> So we need to inject partition values from the Iceberg metadata, and resolve
> columns correctly (position-based resolution needs an offset).
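
The two requirements in the description (injecting partition values from metadata, and applying an offset during position-based resolution) can be sketched as follows. This is an illustrative example only, not Impala's implementation: `read_row` and its parameters are hypothetical, and it assumes that each data-file row holds the non-partition columns in table-schema order.

```python
def read_row(table_columns, partition_values, file_row):
    """Rebuild a full table row from a data-file row that lacks
    the partition columns.

    table_columns:    column names in table-schema order
    partition_values: partition column name -> value taken from metadata
    file_row:         values physically stored in the data file
    """
    row = []
    offset = 0  # number of partition columns seen so far
    for pos, name in enumerate(table_columns):
        if name in partition_values:
            # Partition column: not in the file, so inject the metadata value.
            row.append(partition_values[name])
            offset += 1
        else:
            # Regular column: its position in the file is shifted by the
            # number of partition columns that precede it in the schema.
            row.append(file_row[pos - offset])
    return row


full_row = read_row(
    table_columns=["year", "id", "name"],
    partition_values={"year": 2021},
    file_row=(7, "alice"),
)
print(full_row)  # [2021, 7, 'alice']
```

Without the offset, position 1 (`id`) would be looked up at file index 1 and would wrongly return the `name` value.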
--
This message was sent by Atlassian Jira
(v8.20.1#820001)