[ https://issues.apache.org/jira/browse/HUDI-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit updated HUDI-3717:
------------------------------
    Fix Version/s: 0.12.1
                       (was: 0.12.0)

> Avoid double-listing w/in BaseHoodieTableFileIndex
> --------------------------------------------------
>
>                 Key: HUDI-3717
>                 URL: https://issues.apache.org/jira/browse/HUDI-3717
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Major
>             Fix For: 0.12.1
>
>         Attachments: Screen Shot 2022-03-25 at 7.05.09 PM.png, Screen Shot 2022-03-25 at 7.05.43 PM.png, Screen Shot 2022-03-25 at 7.14.20 PM.png
>
>
> Currently, `BaseHoodieTableFileIndex::loadPartitionPathFiles` essentially
> does file-listing twice:
> * Once when `getAllQueryPartitionPaths` is invoked
> * A second time when `getFilesInPartitions` is invoked
>
> While this will not result in double-listing of the files on the FS (because of
> `FileStatusCache`, if any), it does lead to the metadata table (MT) being queried twice:
> !Screen Shot 2022-03-25 at 7.14.20 PM.png!
>
> !Screen Shot 2022-03-25 at 7.05.09 PM.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
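
The double query described in the issue can be illustrated with a small self-contained sketch. This is not Hudi code: `FakeMetadataTable`, `loadTwice`, and `loadOnce` are hypothetical names invented for the illustration, and `loadOnce` shows only one possible way to avoid the second query, not necessarily the fix that was applied for HUDI-3717.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ListingSketch {

    /** Hypothetical stand-in for the metadata table; it counts how often it is queried. */
    static class FakeMetadataTable {
        private final Map<String, List<String>> filesByPartition = new LinkedHashMap<>();
        int queryCount = 0;

        FakeMetadataTable() {
            filesByPartition.put("2022/03/24", Arrays.asList("a.parquet", "b.parquet"));
            filesByPartition.put("2022/03/25", Arrays.asList("c.parquet"));
        }

        /** Every call counts as one (potentially expensive) metadata-table query. */
        Map<String, List<String>> listAll() {
            queryCount++;
            return filesByPartition;
        }
    }

    /** Mirrors the reported pattern: partitions and files are resolved by separate queries. */
    static Map<String, List<String>> loadTwice(FakeMetadataTable mt) {
        List<String> partitions = new ArrayList<>(mt.listAll().keySet()); // first query
        Map<String, List<String>> all = mt.listAll();                     // second query
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String partition : partitions) {
            result.put(partition, all.get(partition));
        }
        return result;
    }

    /** Assumed alternative: query once and derive both partitions and files from the result. */
    static Map<String, List<String>> loadOnce(FakeMetadataTable mt) {
        Map<String, List<String>> all = mt.listAll();                     // single query
        return new LinkedHashMap<>(all);
    }

    public static void main(String[] args) {
        FakeMetadataTable first = new FakeMetadataTable();
        FakeMetadataTable second = new FakeMetadataTable();
        Map<String, List<String>> twice = loadTwice(first);
        Map<String, List<String>> once = loadOnce(second);
        System.out.println("same result: " + twice.equals(once));
        System.out.println("queries (double-listing): " + first.queryCount);
        System.out.println("queries (single listing): " + second.queryCount);
    }
}
```

Both methods return the same partition-to-files mapping; only the number of calls to the stand-in metadata table differs, which is the cost the issue is concerned with.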