[ https://issues.apache.org/jira/browse/IMPALA-11784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645368#comment-17645368 ]
Zoltán Borók-Nagy commented on IMPALA-11784:
--------------------------------------------

[~LiPenglin] most of the checks in icebergTableFormatCheck() are not needed anymore. There's also a Jira about it: IMPALA-11620 with a Draft CR.

Cc. [~noemi]

> Don't call Iceberg's planFiles redundantly during table load
> ------------------------------------------------------------
>
>                 Key: IMPALA-11784
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11784
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> Iceberg's planFiles() API is very expensive because it involves reading the
> Avro manifest files. It's especially expensive on object stores, though
> manifest caching can help here.
> Currently we invoke this API two times during table loading (via
> IcebergUtil.getIcebergFiles()), once in loadAllPartition() and once in
> loadPartitionStats().
> We should just invoke IcebergUtil.getIcebergFiles() once, then pass the
> result object to loadAllPartition() and loadPartitionStats().

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
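
The change proposed in the issue description (call IcebergUtil.getIcebergFiles() once and hand the result to both loaders) could be sketched roughly as follows. This is only an illustration, not Impala's actual code: the class SinglePlanFilesSketch, the IcebergFiles holder, the call counter, and all method signatures and return types are assumptions made for the example. Only the method names getIcebergFiles(), loadAllPartition() and loadPartitionStats() come from the issue text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SinglePlanFilesSketch {
  /** Hypothetical holder for whatever getIcebergFiles() returns. */
  static final class IcebergFiles {
    final List<String> dataFiles;
    IcebergFiles(List<String> dataFiles) { this.dataFiles = dataFiles; }
  }

  /** Counts how many times the expensive scan planning is triggered. */
  static final AtomicInteger planFilesCalls = new AtomicInteger();

  /**
   * Stand-in for IcebergUtil.getIcebergFiles(). In the real system this is
   * the call that ends up in Iceberg's planFiles(); here it only bumps a
   * counter and returns made-up file names.
   */
  static IcebergFiles getIcebergFiles() {
    planFilesCalls.incrementAndGet();
    return new IcebergFiles(Arrays.asList("a.parquet", "b.parquet"));
  }

  /** Assumed signature: takes the precomputed files instead of re-planning. */
  static int loadAllPartition(IcebergFiles files) {
    return files.dataFiles.size();
  }

  /** Assumed signature: reuses the same result object. */
  static int loadPartitionStats(IcebergFiles files) {
    return files.dataFiles.size();
  }

  /** Table load that plans files once and shares the result. */
  static int loadTable() {
    IcebergFiles files = getIcebergFiles();  // single expensive call
    int loaded = loadAllPartition(files);
    loadPartitionStats(files);
    return loaded;
  }

  public static void main(String[] args) {
    int loaded = loadTable();
    System.out.println("files loaded: " + loaded
        + ", planFiles calls: " + planFilesCalls.get());
  }
}
```

The point of the sketch is only the data flow: the expensive call happens in one place, and both loaders become consumers of its result rather than callers of it.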