yihua opened a new pull request, #7561: URL: https://github.com/apache/hudi/pull/7561
### Change Logs Before this change, the Hudi archived timeline is always loaded during the metastore sync process if the last sync time is given. Besides, the archived timeline is not cached inside the meta client if the start instant time is given. These cause performance issues and read timeout on cloud storage due to rate limiting on requests because of loading archived timeline from the storage, when the archived timeline is huge, e.g., hundreds of log files in `.hoodie/archived` folder. This PR improves the timeline loading by (1) only reading active timeline if the last sync time is the same as or after the start of the active timeline; (2) caching the archived timeline based on the start instant time in the meta client, to avoid unnecessary repeated loading of the same archived timeline. ### Impact This PR improves the performance of metastore sync. ### Risk level low ### Documentation Update N/A ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org