danny0405 commented on issue #11016: URL: https://github.com/apache/hudi/issues/11016#issuecomment-2066279351
> but the issue is that we can't access older data.

If your table is ingested with streaming `upsert`, you can just specify `read.start-commit` as the first commit instant time on the timeline and skip the compaction. Only instants that have not been cleaned can be consumed. It also depends on how the history dataset was written: `bulk_insert` does not guarantee the payload sequence of one key, so if the table was bootstrapped with `bulk_insert`, the only way is to consume from `earliest`.
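As a sketch, the streaming-read setup described above might look like the following Flink SQL (table name, schema, path, and the instant timestamp are placeholders, not from the original thread):

```sql
-- Hypothetical Hudi streaming source; replace path/schema with your own.
CREATE TABLE hudi_source (
  id STRING,
  ts TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///tmp/hudi_table',
  'table.type' = 'MERGE_ON_READ',
  'read.streaming.enabled' = 'true',
  -- start from the first instant still on the timeline;
  -- for a table bootstrapped with bulk_insert, use 'earliest' instead
  'read.start-commit' = '20240101000000000'
);
```

Note that `'read.start-commit' = 'earliest'` is the option the comment recommends for `bulk_insert`-bootstrapped tables, since only that mode replays the full history without relying on per-key payload ordering.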