[
https://issues.apache.org/jira/browse/IMPALA-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Csaba Ringhofer updated IMPALA-14734:
-------------------------------------
Description: Noticed that planning on large Iceberg tables can be faster
when using Iceberg's plan files compared to Impala's "optimized" path using
cached file descriptors. The reason seems to be that planning time is dominated
by sorting file descriptors, which decodes UTF-8 paths in the backing flat
buffer structure n log(n) times:
https://github.com/apache/impala/blob/3be15fd3598071eaeddd9b4d29e0883b95fdd14a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L116
> Planning on large Iceberg tables can be dominated by sorting
> ------------------------------------------------------------
>
> Key: IMPALA-14734
> URL: https://issues.apache.org/jira/browse/IMPALA-14734
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Csaba Ringhofer
> Priority: Major
> Labels: iceberg, performance
>
> Noticed that planning on large Iceberg tables can be faster when using
> Iceberg's plan files compared to Impala's "optimized" path using cached file
> descriptors. The reason seems to be that planning time is dominated by
> sorting file descriptors, which decodes UTF-8 paths in the backing flat
> buffer structure n log(n) times:
> https://github.com/apache/impala/blob/3be15fd3598071eaeddd9b4d29e0883b95fdd14a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L116
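
To illustrate the cost described above, here is a minimal sketch of one possible mitigation: decode each path once before sorting, so the comparator works on already-decoded strings. This is not Impala's code. Descriptor, Keyed, getPath() and the decodeCount counter are hypothetical stand-ins invented for this example, and the real file descriptor classes may behave differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DecodeOnceSort {
    /** Hypothetical stand-in for a descriptor whose path is stored as UTF-8 bytes. */
    static final class Descriptor {
        static int decodeCount = 0;  // counts decodes, for illustration only
        private final byte[] utf8Path;

        Descriptor(String path) {
            this.utf8Path = path.getBytes(StandardCharsets.UTF_8);
        }

        /** Decodes the bytes on every call, like a lazy accessor over a buffer. */
        String getPath() {
            decodeCount++;
            return new String(utf8Path, StandardCharsets.UTF_8);
        }
    }

    /** Pairs a descriptor with its path, decoded exactly once. */
    static final class Keyed {
        final String path;
        final Descriptor desc;

        Keyed(Descriptor desc) {
            this.desc = desc;
            this.path = desc.getPath();
        }
    }

    /** Naive sort: the comparator decodes both paths on every comparison. */
    static List<Descriptor> sortNaive(List<Descriptor> in) {
        List<Descriptor> out = new ArrayList<>(in);
        out.sort(Comparator.comparing(Descriptor::getPath));
        return out;
    }

    /** Decode-once sort: n decodes up front, then compare cached strings. */
    static List<Descriptor> sortDecodeOnce(List<Descriptor> in) {
        List<Keyed> keyed = new ArrayList<>(in.size());
        for (Descriptor d : in) {
            keyed.add(new Keyed(d));
        }
        keyed.sort(Comparator.comparing((Keyed k) -> k.path));
        List<Descriptor> out = new ArrayList<>(in.size());
        for (Keyed k : keyed) {
            out.add(k.desc);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Descriptor> files = new ArrayList<>();
        for (String p : new String[] {"data/c.parquet", "data/a.parquet", "data/b.parquet"}) {
            files.add(new Descriptor(p));
        }
        Descriptor.decodeCount = 0;
        sortNaive(files);
        System.out.println("naive decodes: " + Descriptor.decodeCount);
        Descriptor.decodeCount = 0;
        sortDecodeOnce(files);
        System.out.println("decode-once decodes: " + Descriptor.decodeCount);
    }
}
```

The trade-off in this sketch is memory: the decoded strings for all n descriptors are held for the duration of the sort. Whether that is acceptable for very large tables is a judgment call for the actual fix.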