[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
[ https://issues.apache.org/jira/browse/HUDI-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhaojing Yu updated HUDI-3717:
    Fix Version/s: 0.13.0 (was: 0.12.1)

> Avoid double-listing w/in BaseHoodieTableFileIndex
> --------------------------------------------------
>
> Key: HUDI-3717
> URL: https://issues.apache.org/jira/browse/HUDI-3717
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Alexey Kudinkin
> Assignee: Alexey Kudinkin
> Priority: Major
> Fix For: 0.13.0
>
> Attachments: Screen Shot 2022-03-25 at 7.05.09 PM.png, Screen Shot 2022-03-25 at 7.05.43 PM.png, Screen Shot 2022-03-25 at 7.14.20 PM.png
>
>
> Currently, `BaseHoodieTableFileIndex::loadPartitionPathFiles` essentially does file-listing twice:
> * Once when `getAllQueryPartitionPaths` is invoked
> * A second time when `getFilesInPartitions` is invoked
>
> While this will not result in double-listing of the files on the FS (because of `FileStatusCache`, if any), it does lead to the metadata table (MT) being queried twice:
> !Screen Shot 2022-03-25 at 7.14.20 PM.png!
>
> !Screen Shot 2022-03-25 at 7.05.09 PM.png!

--
This message was sent by Atlassian Jira (v8.20.10#820010)
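The double query described in the issue can be illustrated with a minimal, self-contained Java sketch. It does not use Hudi's code or API: `CountingMetadataTable`, `listAll`, `loadTwice`, and `loadOnce` are hypothetical names invented for illustration, and the single-pass variant is only one possible remedy, not necessarily the fix the assignee has in mind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DoubleListingSketch {

    /** Hypothetical stand-in for the metadata table; counts how often it is queried. */
    static class CountingMetadataTable {
        private final Map<String, List<String>> filesByPartition = new LinkedHashMap<>();
        int queryCount = 0;

        CountingMetadataTable() {
            // Made-up sample data: two partitions with a few files each.
            filesByPartition.put("2022/03/24", List.of("a.parquet", "b.parquet"));
            filesByPartition.put("2022/03/25", List.of("c.parquet"));
        }

        /** Returns the full listing; every call counts as one query. */
        Map<String, List<String>> listAll() {
            queryCount++;
            return filesByPartition;
        }
    }

    /** Mirrors the reported shape: partitions and files come from two separate listings. */
    static Map<String, List<String>> loadTwice(CountingMetadataTable mt) {
        List<String> partitions = new ArrayList<>(mt.listAll().keySet()); // first query
        Map<String, List<String>> all = mt.listAll();                      // second query
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String partition : partitions) {
            result.put(partition, all.get(partition));
        }
        return result;
    }

    /** One possible remedy: list once and derive partitions and files from the same result. */
    static Map<String, List<String>> loadOnce(CountingMetadataTable mt) {
        Map<String, List<String>> all = mt.listAll(); // single query
        return new LinkedHashMap<>(all);
    }

    public static void main(String[] args) {
        CountingMetadataTable first = new CountingMetadataTable();
        loadTwice(first);
        System.out.println("two-step load queries: " + first.queryCount);

        CountingMetadataTable second = new CountingMetadataTable();
        loadOnce(second);
        System.out.println("single-pass load queries: " + second.queryCount);
    }
}
```

Both variants return the same partition-to-files mapping; only the number of queries against the stand-in metadata table differs.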
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Raymond Xu updated HUDI-3717:
    Sprint: (was: 2022/09/19)
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Raymond Xu updated HUDI-3717:
    Sprint: 2022/09/19 (was: 2022/09/05)
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Raymond Xu updated HUDI-3717:
    Sprint: 2022/09/05 (was: 2022/08/22)
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Sagar Sumit updated HUDI-3717:
    Sprint: 2022/08/22
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Sagar Sumit updated HUDI-3717:
    Fix Version/s: 0.12.1 (was: 0.12.0)
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Raymond Xu updated HUDI-3717:
    Fix Version/s: 0.12.0 (was: 0.11.0)
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Alexey Kudinkin updated HUDI-3717:
    Attachment: Screen Shot 2022-03-25 at 7.14.20 PM.png
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Alexey Kudinkin updated HUDI-3717:
    Description:
Currently, `BaseHoodieTableFileIndex::loadPartitionPathFiles` essentially does file-listing twice:
 * Once when `getAllQueryPartitionPaths` is invoked
 * A second time when `getFilesInPartitions` is invoked

While this will not result in double-listing of the files on the FS (because of `FileStatusCache`, if any), it does lead to the metadata table (MT) being queried twice:
!Screen Shot 2022-03-25 at 7.14.20 PM.png!

!Screen Shot 2022-03-25 at 7.05.09 PM.png!

  was:
Currently, `BaseHoodieTableFileIndex::loadPartitionPathFiles` essentially does file-listing twice:
 * Once when `getAllQueryPartitionPaths` is invoked
 * A second time when `getFilesInPartitions` is invoked

While this will not result in double-listing of the files on the FS (because of `FileStatusCache`, if any), it does lead to the metadata table (MT) being queried twice:
!Screen Shot 2022-03-25 at 7.05.09 PM.png!
!Screen Shot 2022-03-25 at 7.05.09 PM.png!
[jira] [Updated] (HUDI-3717) Avoid double-listing w/in BaseHoodieTableFileIndex
Alexey Kudinkin updated HUDI-3717:
    Fix Version/s: 0.11.0