[ 
https://issues.apache.org/jira/browse/HUDI-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaojing Yu updated HUDI-2860:
------------------------------
    Fix Version/s: 0.13.0
                       (was: 0.12.1)

> Make timeline server work with concurrent/async table service
> -------------------------------------------------------------
>
>                 Key: HUDI-2860
>                 URL: https://issues.apache.org/jira/browse/HUDI-2860
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: table-service, writer-core
>            Reporter: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.13.0
>
>
> Make the timeline server work with multiple concurrent writers.
> As of now, if an executor lags behind the timeline server (the timeline
> server refreshes its state on every call if the timeline has moved), we
> throw an exception and the executor falls back to the secondary view, which
> lists the file system.
>  
> Related ticket: https://issues.apache.org/jira/browse/HUDI-2761
>  
> We want to revisit this code and see how we can make the timeline server
> work with multi-writer scenarios.
>  
> A few points to consider:
> 1. Executors should try to call getLatestBaseFilesOnOrBefore() instead of
> getLatestBaseFiles(). Not all calls have to be fixed; the ones doing
> conflict resolution might always have to get the latest snapshot.
> 2. Fix async services to use a separate write client in the deltastreamer
> flow.
> 3. Revisit every call from the executor and set the "REFRESH" param only
> when it matters.
> 4. Sharing the embedded timeline server.
> 5. Check for any holes. When C100 and C101 start concurrently and C101
> finishes early, if C100 calls getLatestBaseFilesOnOrBefore(), do we return
> base files from C101?
>  
>  
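The scenario in point 5 can be pictured with a small toy model. The sketch below is not Hudi code: ToyFileSystemView, BaseFile, and both method names are hypothetical stand-ins, and it assumes instant times are equal-width strings compared lexicographically. It only contrasts an unbounded "latest" lookup with one bounded by the caller's own instant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseFile:
    file_group: str
    instant: str  # instant time of the commit that wrote this base file


class ToyFileSystemView:
    """Hypothetical in-memory stand-in; not Hudi's file system view."""

    def __init__(self):
        self._files = []
        self._completed = set()

    def add_base_file(self, file_group, instant):
        self._files.append(BaseFile(file_group, instant))

    def complete(self, instant):
        # Only files from completed instants are visible to readers.
        self._completed.add(instant)

    def _latest_per_group(self, candidates):
        latest = {}
        for f in candidates:
            current = latest.get(f.file_group)
            if current is None or f.instant > current.instant:
                latest[f.file_group] = f
        return sorted(latest.values(), key=lambda f: f.file_group)

    def get_latest_base_files(self):
        # Unbounded: newest completed base file per file group.
        return self._latest_per_group(
            f for f in self._files if f.instant in self._completed
        )

    def get_latest_base_files_on_or_before(self, max_instant):
        # Bounded: ignore anything written after max_instant (assumed rule).
        return self._latest_per_group(
            f for f in self._files
            if f.instant in self._completed and f.instant <= max_instant
        )


view = ToyFileSystemView()
view.add_base_file("fg-1", "099")
view.add_base_file("fg-2", "099")
view.complete("099")

# C100 and C101 start concurrently; C101 rewrites fg-1 and finishes first.
view.add_base_file("fg-1", "101")
view.complete("101")

snapshot = [(f.file_group, f.instant) for f in view.get_latest_base_files()]
bounded = [
    (f.file_group, f.instant)
    for f in view.get_latest_base_files_on_or_before("100")
]
print(snapshot)
print(bounded)
```

In this toy model the bounded call made by C100 does not see the file written by C101, while the unbounded call does. Whether the real timeline server behaves the same way under concurrent writers is the open question the ticket asks to check.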



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
