[ https://issues.apache.org/jira/browse/HUDI-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raymond Xu updated HUDI-5611:
-----------------------------
    Component/s: metadata

> Revisit metadata-table-based file listing calls and use batch lookup instead
> ----------------------------------------------------------------------------
>
>                 Key: HUDI-5611
>                 URL: https://issues.apache.org/jira/browse/HUDI-5611
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: metadata
>            Reporter: Ethan Guo
>            Priority: Critical
>             Fix For: 0.13.1
>
>
> We discovered a performance issue with savepoint when the metadata table is
> enabled. The cause is unnecessary scanning of the metadata table: during the
> savepoint operation, the metadata table is scanned once for each partition,
> which leads to a large number of S3 requests when the number of partitions is
> large. The solution is to batch the list calls of all partitions (HUDI-5485).
>
> We need to revisit the other metadata-table-based file listing calls in a
> similar fashion and replace them with batch lookup where needed.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)