[ 
https://issues.apache.org/jira/browse/HUDI-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5176:
----------------------------
    Fix Version/s: 0.12.2

> Incremental source may miss commits if there are inflight commits before 
> completed commits
> ------------------------------------------------------------------------------------------
>
>                 Key: HUDI-5176
>                 URL: https://issues.apache.org/jira/browse/HUDI-5176
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Major
>             Fix For: 0.12.2
>
>
> Consider the following scenario of concurrent writers. Writer 1 starts a 
> commit at t1 and later writer 2 starts another commit at t2 (t2 > t1). Commit 
> t2 finishes earlier than t1.
> {code:java}
> ---------------------------------------------------------> t
>  instant t1 |------------------------------| (writer 1)
>  instant t2         |--------------|         (writer 2) {code}
> This leaves an inflight commit (t1) before a completed commit (t2) on the 
> Hudi timeline. Because the incremental pull uses only completed commits to 
> determine the start and end instants of the incremental query and to advance 
> the checkpoint, the checkpoint can move past t1 while t1 is still inflight. 
> The data written by t1 may then never be pulled from the incremental source.
>  
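
The scenario above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not Hudi code: the Instant class, the pull method, the "t0" starting checkpoint, and the use of plain string comparison on instant times are all hypothetical stand-ins chosen for illustration. It shows a pull that considers only completed instants after the checkpoint, run once while t1 is inflight and once after t1 completes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IncrementalPullSketch {

    /** Hypothetical stand-in for a timeline entry; not a Hudi class. */
    static final class Instant {
        final String time;
        final boolean completed;

        Instant(String time, boolean completed) {
            this.time = time;
            this.completed = completed;
        }
    }

    /** Naive pull: completed instants strictly after the checkpoint, in timeline order. */
    public static List<String> pull(List<Instant> timeline, String checkpoint) {
        List<String> pulled = new ArrayList<>();
        for (Instant instant : timeline) {
            if (instant.completed && instant.time.compareTo(checkpoint) > 0) {
                pulled.add(instant.time);
            }
        }
        return pulled;
    }

    /** Runs two pulls around the completion of t1 and returns what each one read. */
    public static List<List<String>> simulate() {
        String checkpoint = "t0"; // assumed starting checkpoint, before both commits

        // First pull: t1 is still inflight, t2 has already completed.
        List<Instant> before = Arrays.asList(new Instant("t1", false), new Instant("t2", true));
        List<String> first = pull(before, checkpoint);
        if (!first.isEmpty()) {
            // Advance the checkpoint to the last completed instant that was read.
            checkpoint = first.get(first.size() - 1);
        }

        // Second pull: t1 has now completed, but the checkpoint already sits at t2.
        List<Instant> after = Arrays.asList(new Instant("t1", true), new Instant("t2", true));
        List<String> second = pull(after, checkpoint);

        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        List<List<String>> pulls = simulate();
        System.out.println("first pull:  " + pulls.get(0));
        System.out.println("second pull: " + pulls.get(1));
    }
}
```

In this model the first pull reads only t2 and the second pull reads nothing, so t1 is skipped even though it eventually completes. One possible mitigation in such a model would be to stop the pull range at the earliest inflight instant; whether that matches the actual fix for this issue is not stated here.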



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
