[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2023-02-15 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5477:
-
Fix Version/s: 0.12.3

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0, 0.12.3
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.
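
The two problems in the description (unconditional loading and missing caching) can be illustrated with a minimal Java sketch. This is not Hudi's implementation: the class SyncTimelineSketch, its method names, and the loader function are hypothetical, and it assumes instant times are fixed-width timestamp strings whose lexicographic order matches chronological order.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch, not Hudi's actual classes or API. */
class SyncTimelineSketch {
    // Oldest instant still present in the active (non-archived) timeline.
    private final String oldestActiveInstant;
    // Stand-in for the expensive read of .hoodie/archived from storage.
    private final Function<String, List<String>> archivedLoader;
    // Cache of archived instants, keyed by the start instant time requested.
    private final Map<String, List<String>> archivedCache = new ConcurrentHashMap<>();

    SyncTimelineSketch(String oldestActiveInstant,
                       Function<String, List<String>> archivedLoader) {
        this.oldestActiveInstant = oldestActiveInstant;
        this.archivedLoader = archivedLoader;
    }

    /** Returns the archived instants needed for a sync that resumes after lastSyncTime. */
    List<String> archivedInstantsSince(String lastSyncTime) {
        // Assumption: fixed-width timestamps, so string order equals time order.
        if (lastSyncTime.compareTo(oldestActiveInstant) >= 0) {
            // The active timeline already covers everything after the last sync,
            // so the archived timeline is not read at all.
            return Collections.emptyList();
        }
        // Read from storage at most once per start instant, then reuse the result.
        return archivedCache.computeIfAbsent(lastSyncTime, archivedLoader);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        SyncTimelineSketch sketch = new SyncTimelineSketch("20230101000000", start -> {
            loads.incrementAndGet(); // counts simulated storage reads
            return Arrays.asList("20221220000000", "20221228000000");
        });
        sketch.archivedInstantsSince("20230105000000"); // covered by the active timeline
        sketch.archivedInstantsSince("20221215000000"); // first read from storage
        sketch.archivedInstantsSince("20221215000000"); // served from the cache
        System.out.println("storage loads: " + loads.get()); // prints "storage loads: 1"
    }
}
```

The sketch separates the two decisions: whether the archived timeline is needed at all for a given last sync time, and whether a previous load for the same start instant can be reused.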



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-29 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Status: Patch Available  (was: In Progress)

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-29 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Status: In Progress  (was: Open)

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-29 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Story Points: 2

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-29 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Reviewers: Danny Chen

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-29 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Sprint: 0.13.0 Final Sprint

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-5477:
-
Labels: pull-request-available  (was: )

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Component/s: meta-sync

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving, meta-sync
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Component/s: archiving

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: archiving
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Description: During metastore sync, the Hudi archived timeline is always
loaded when the last sync time is given. In addition, the archived timeline
is not cached inside the meta client when a start instant time is given. When
the archived timeline is huge (e.g., hundreds of log files in the
{{.hoodie/archived}} folder), loading it from storage causes performance
issues and, on cloud storage, read timeouts due to request rate limiting.

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Priority: Major
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Priority: Blocker  (was: Major)

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.





[jira] [Updated] (HUDI-5477) Optimize timeline loading in Hudi sync client

2022-12-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5477:

Fix Version/s: 0.13.0

> Optimize timeline loading in Hudi sync client
> -
>
> Key: HUDI-5477
> URL: https://issues.apache.org/jira/browse/HUDI-5477
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.13.0
>
>
> During metastore sync, the Hudi archived timeline is always loaded when the
> last sync time is given. In addition, the archived timeline is not cached
> inside the meta client when a start instant time is given. When the archived
> timeline is huge (e.g., hundreds of log files in the {{.hoodie/archived}}
> folder), loading it from storage causes performance issues and, on cloud
> storage, read timeouts due to request rate limiting.


