[ 
https://issues.apache.org/jira/browse/IMPALA-14734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-14734:
-------------------------------------
    Description: Noticed that planning on large Iceberg tables can be faster 
when using Iceberg's plan files than when using Impala's "optimized" path with 
cached file descriptors. The reason seems to be that planning time is dominated 
by sorting file descriptors, which decodes UTF-8 paths in the backing 
FlatBuffer structure O(n log n) times:
https://github.com/apache/impala/blob/3be15fd3598071eaeddd9b4d29e0883b95fdd14a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L116
  (was: Noticed that planning on large Iceberg tables can be faster when using 
Iceberg's plan files compared to Impala's "optimized" path using cached file 
descriptors. The reason seems to be that planning time is dominated by sorting 
file descriptors, which decodes utf8 pathes in the backing flat buffer 
structure n log(n) times 
https://github.com/apache/impala/blob/3be15fd3598071eaeddd9b4d29e0883b95fdd14a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L116)

> Planning on large iceberg tables can be dominated by sorting
> ------------------------------------------------------------
>
>                 Key: IMPALA-14734
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14734
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: iceberg, performance
>
> Noticed that planning on large Iceberg tables can be faster when using 
> Iceberg's plan files than when using Impala's "optimized" path with cached 
> file descriptors. The reason seems to be that planning time is dominated by 
> sorting file descriptors, which decodes UTF-8 paths in the backing 
> FlatBuffer structure O(n log n) times:
> https://github.com/apache/impala/blob/3be15fd3598071eaeddd9b4d29e0883b95fdd14a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java#L116
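
The effect described above can be illustrated with a minimal, self-contained sketch. It is not Impala's code and not a proposed patch: `Descriptor`, `Keyed`, `sortNaive`, `sortWithCachedKeys` and `decodeCount` are hypothetical names, and a plain byte array stands in for the FlatBuffer-backed path. It only shows why a comparator that decodes the path on every comparison performs O(n log n) decodes, while decoding each path once before sorting performs n.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SortKeyCacheSketch {
  // Counts how often a path is decoded from its UTF-8 byte form.
  static final AtomicLong decodeCount = new AtomicLong();

  // Hypothetical stand-in for a file descriptor whose path is stored as
  // UTF-8 bytes (an assumption modelling a FlatBuffer-backed string).
  static final class Descriptor {
    private final byte[] utf8Path;

    Descriptor(String path) {
      this.utf8Path = path.getBytes(StandardCharsets.UTF_8);
    }

    // Every call allocates and decodes a fresh String.
    String getPath() {
      decodeCount.incrementAndGet();
      return new String(utf8Path, StandardCharsets.UTF_8);
    }
  }

  // Pairs a descriptor with its path, decoded exactly once.
  static final class Keyed {
    final String path;
    final Descriptor desc;

    Keyed(String path, Descriptor desc) {
      this.path = path;
      this.desc = desc;
    }
  }

  // Naive variant: the comparator decodes both paths on every comparison.
  static List<Descriptor> sortNaive(List<Descriptor> in) {
    List<Descriptor> out = new ArrayList<>(in);
    out.sort(Comparator.comparing(Descriptor::getPath));
    return out;
  }

  // Cached-key variant: decode once per element, then sort on the cached key.
  static List<Descriptor> sortWithCachedKeys(List<Descriptor> in) {
    List<Keyed> keyed = new ArrayList<>(in.size());
    for (Descriptor d : in) {
      keyed.add(new Keyed(d.getPath(), d));
    }
    keyed.sort((Keyed a, Keyed b) -> a.path.compareTo(b.path));
    List<Descriptor> out = new ArrayList<>(in.size());
    for (Keyed k : keyed) {
      out.add(k.desc);
    }
    return out;
  }

  public static void main(String[] args) {
    int n = 100_000;
    List<Descriptor> input = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      // Multiplying by a prime scrambles the order while keeping paths unique.
      input.add(new Descriptor("/tbl/data/file-" + ((i * 7919L) % n) + ".parquet"));
    }

    decodeCount.set(0);
    sortNaive(input);
    System.out.println("naive decodes:  " + decodeCount.get());

    decodeCount.set(0);
    sortWithCachedKeys(input);
    System.out.println("cached decodes: " + decodeCount.get());
  }
}
```

The cached-key variant trades one extra `String` per descriptor for the duration of the sort against repeated decoding. Whether that trade-off is acceptable for tables with very many files would need to be measured in the planner itself.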



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
