hudi-bot opened a new issue, #14925: URL: https://github.com/apache/hudi/issues/14925
Make timeline server work with multiple concurrent writers.

As of now, if an executor lags behind the timeline server (the timeline server refreshes its state on every call if the timeline has moved), we throw an exception and the executor falls back to the secondary view, which lists the file system. Related ticket: https://issues.apache.org/jira/browse/HUDI-2761

We want to revisit this code and see how we can make the timeline server work in a multi-writer scenario. A few points to consider:

1. Executors should try to call getLatestBaseFilesOnOrBefore() instead of getLatestBaseFiles(). Not all calls have to be fixed; the ones doing conflict resolution might always have to get the latest snapshot.
2. Fix async services to use a separate write client in the deltastreamer flow.
3. Revisit every call from the executor and set the "REFRESH" param only when it matters.
4. Sharing the embedded timeline server.
5. Check for any holes. When C100 and C101 start concurrently and C101 finishes early, if C100 calls getLatestBaseFileOnOrBefore(), do we return base files from C101?

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-2860
- Type: Improvement
- Epic: https://issues.apache.org/jira/browse/HUDI-3248
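
To make the question in point 5 concrete, here is a minimal toy model of an "on or before" lookup. It is only a sketch and does not use Hudi's actual file system view classes: `TimelineViewSketch`, `BaseFile`, and both method bodies are hypothetical stand-ins, and instant times are assumed to be fixed-width strings that compare lexicographically. It shows one possible reading of the desired behaviour, in which a writer at C100 does not see base files written by C101 even though C101 has already completed. Whether the timeline server behaves this way today is the open question in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these types are Hudi classes.
final class TimelineViewSketch {

    /** Hypothetical stand-in for a base file: its file group and the instant that wrote it. */
    static final class BaseFile {
        final String fileGroupId;
        final String commitTime;

        BaseFile(String fileGroupId, String commitTime) {
            this.fileGroupId = fileGroupId;
            this.commitTime = commitTime;
        }

        @Override
        public String toString() {
            return fileGroupId + "@" + commitTime;
        }
    }

    /**
     * For each file group, keep the newest base file whose commit time is
     * on or before maxInstant. Files written by later instants are skipped,
     * even if those instants have already completed.
     */
    static List<BaseFile> latestBaseFilesOnOrBefore(List<BaseFile> all, String maxInstant) {
        Map<String, BaseFile> latest = new TreeMap<>();
        for (BaseFile f : all) {
            if (maxInstant != null && f.commitTime.compareTo(maxInstant) > 0) {
                continue; // written by a later instant, invisible to this caller
            }
            latest.merge(f.fileGroupId, f,
                (a, b) -> a.commitTime.compareTo(b.commitTime) >= 0 ? a : b);
        }
        return new ArrayList<>(latest.values());
    }

    /** Unbounded variant: the newest base file per file group, whatever instant wrote it. */
    static List<BaseFile> latestBaseFiles(List<BaseFile> all) {
        return latestBaseFilesOnOrBefore(all, null);
    }

    public static void main(String[] args) {
        // C101 started after C100 but finished first and rewrote file group fg-1.
        List<BaseFile> files = List.of(
            new BaseFile("fg-1", "099"),
            new BaseFile("fg-2", "099"),
            new BaseFile("fg-1", "101"));

        // View bounded by C100's instant: C101's file is excluded.
        System.out.println(latestBaseFilesOnOrBefore(files, "100"));
        // Unbounded view: C101's file shows up for fg-1.
        System.out.println(latestBaseFiles(files));
    }
}
```

In this model the bounded call returns only the files from instant 099 for both file groups, while the unbounded call picks up the file from 101 for `fg-1`. A check for "holes" along these lines would compare the two views while a later instant completes ahead of an earlier one.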
