[ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:
----------------------------
    Description: During the metastore sync process, the Hudi archived timeline 
is always loaded if the last sync time is given. In addition, the archived 
timeline is not cached inside the meta client if the start instant time is 
given. When the archived timeline is huge, e.g., hundreds of log files in the 
{{.hoodie/archived}} folder, loading it from storage causes performance issues 
and read timeouts on cloud storage, because the requests are rate limited.
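
One possible shape for the optimization, as a minimal self-contained sketch and 
not the actual Hudi change: skip the archived timeline when the last sync time 
is still covered by the active timeline, and cache any archived timeline that 
does get loaded, keyed by its start instant. All names here (MetaClientSketch, 
timelineSince, getArchivedTimeline) are hypothetical and are not Hudi APIs. The 
sketch also assumes instant times are fixed-width timestamp strings that sort 
lexicographically.

```java
import java.util.HashMap;
import java.util.Map;

class TimelineLoadingSketch {

    /** Hypothetical stand-in for a meta client; this is not Hudi's API. */
    static class MetaClientSketch {
        private final String earliestActiveInstant;
        // Cache of archived timelines, keyed by the requested start instant.
        private final Map<String, String> archivedCache = new HashMap<>();
        // Counts simulated reads of the archived folder from storage.
        int archivedLoads = 0;

        MetaClientSketch(String earliestActiveInstant) {
            this.earliestActiveInstant = earliestActiveInstant;
        }

        /** Loads the archived timeline at most once per start instant. */
        String getArchivedTimeline(String startInstant) {
            return archivedCache.computeIfAbsent(startInstant, ts -> {
                archivedLoads++; // stands in for the expensive storage listing
                return "archived-from-" + ts;
            });
        }

        /** Touches the archived timeline only if the active one is not enough. */
        String timelineSince(String lastSyncTime) {
            if (lastSyncTime.compareTo(earliestActiveInstant) >= 0) {
                return "active-only";
            }
            return getArchivedTimeline(lastSyncTime) + "+active";
        }
    }

    public static void main(String[] args) {
        MetaClientSketch client = new MetaClientSketch("20230105000000");
        // Last sync is inside the active timeline: no archived load.
        System.out.println(client.timelineSince("20230106000000"));
        // Last sync predates the active timeline: one load, then a cache hit.
        client.timelineSince("20230101000000");
        client.timelineSince("20230101000000");
        System.out.println("archived loads: " + client.archivedLoads);
    }
}
```

With these assumptions, a sync whose last sync time falls inside the active 
timeline issues no requests against the archived folder, and repeated syncs 
from the same older instant read it only once.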

> Optimize timeline loading in Hudi sync client
> ---------------------------------------------
>
>                 Key: HUDI-5477
>                 URL: https://issues.apache.org/jira/browse/HUDI-5477
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Ethan Guo
>            Priority: Major


