[ https://issues.apache.org/jira/browse/HUDI-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yue Zhang updated HUDI-5517:
----------------------------
    Fix Version/s:     (was: 0.13.1)

> HoodieTimeline support filter instants by state transition time
> ---------------------------------------------------------------
>
>                 Key: HUDI-5517
>                 URL: https://issues.apache.org/jira/browse/HUDI-5517
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: core, incremental-query
>            Reporter: Hui An
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>
> The Hudi timeline can miss instants when we pull incrementally from an
> upstream Hudi table that is written by several writers.
> For example, say two writers are writing to the Hudi table, and the end
> timestamp of the last successful incremental pull is 001.
> w1 is writing 002 and w2 is writing 003. If w2 finishes before w1, the
> incremental pull end timestamp is updated to 003, and w1's commit 002 is
> then skipped, because its instant time is earlier than w2's.
> We need to filter commits by commit end time (state transition time) when
> pulling incrementally. Because w2's state transition time is earlier than
> w1's, w1's data won't be filtered out.
> This relates to HUDI-1623, but instead of adding an end time to the end of
> each commit, it uses `FileStatus.getModificationTime` to represent the end
> time.
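
The two-writer scenario from the description can be sketched in plain Java. This is only an illustration under stated assumptions, not Hudi code: `CompletedInstant`, `pullByInstantTime`, and `pullByStateTransitionTime` are hypothetical names, and the millisecond values stand in for whatever `FileStatus.getModificationTime` would return for each commit file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StateTransitionFilterSketch {

    // Hypothetical stand-in for a completed commit on the timeline (not Hudi's own class).
    static final class CompletedInstant {
        final String instantTime;        // assigned when the write starts
        final long stateTransitionTime;  // when the commit completed (assumed: file modification time, ms)

        CompletedInstant(String instantTime, long stateTransitionTime) {
            this.instantTime = instantTime;
            this.stateTransitionTime = stateTransitionTime;
        }
    }

    // Checkpoint on instant time: return instants whose instant time is after the checkpoint.
    static List<String> pullByInstantTime(List<CompletedInstant> completed, String lastInstantTime) {
        return completed.stream()
                .filter(i -> i.instantTime.compareTo(lastInstantTime) > 0)
                .map(i -> i.instantTime)
                .collect(Collectors.toList());
    }

    // Checkpoint on state transition time: return instants that completed after the checkpoint.
    static List<String> pullByStateTransitionTime(List<CompletedInstant> completed, long lastTransitionTime) {
        return completed.stream()
                .filter(i -> i.stateTransitionTime > lastTransitionTime)
                .map(i -> i.instantTime)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // w2 (003) finishes at t=1000, before w1 (002), which finishes at t=2000.
        CompletedInstant w1 = new CompletedInstant("002", 2_000L);
        CompletedInstant w2 = new CompletedInstant("003", 1_000L);

        // The first pull ran after w2 finished but before w1 did, so the
        // checkpoint is "003" by instant time, or 1000 by state transition time.
        // The second pull runs once both commits are complete.
        List<CompletedInstant> timeline = Arrays.asList(w1, w2);

        System.out.println(pullByInstantTime(timeline, "003"));           // [] : 002 is skipped
        System.out.println(pullByStateTransitionTime(timeline, 1_000L));  // [002] : 002 is picked up
    }
}
```

With the instant-time checkpoint, the second pull never sees 002. With the state-transition-time checkpoint, 002 is returned because it completed after the previous pull's checkpoint.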