[ https://issues.apache.org/jira/browse/HUDI-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HUDI-6738:
---------------------------------
    Labels: pull-request-available  (was: )

> Apply object filter before checkpoint batching in GcsEventsHoodieIncrSource
> ----------------------------------------------------------------------------
>
>                 Key: HUDI-6738
>                 URL: https://issues.apache.org/jira/browse/HUDI-6738
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Lokesh Lingarajan
>            Priority: Major
>              Labels: pull-request-available
>
> A recent refactoring to support batching within a commit for the GCS
> incremental job moved the filtering of objects to after the checkpoint
> batching. The problem shows up in bootstrap scenarios where we are looking
> for only the latest commits: we have to go through the entire set of
> commits, as bounded by the source limit, instead of skipping directly to
> the latest commit.
> The fix is to apply filtering before we start checkpoint batching. This
> change makes the GCS job behave like the S3 job.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
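
The ordering problem described in the issue can be illustrated with a toy sketch. This is not Hudi code: the object names, the count-based `SOURCE_LIMIT`, the suffix filter `keep`, and both helper functions are hypothetical stand-ins chosen only to show why filtering before batching matters.

```python
from typing import Callable, List

# Hypothetical stand-ins: object names ordered oldest to newest, a count-based
# source limit, and a suffix filter. None of these mirror Hudi's real classes.
OBJECTS: List[str] = ["a.tmp", "b.tmp", "c.tmp", "d.parquet", "e.parquet"]
SOURCE_LIMIT = 3


def keep(name: str) -> bool:
    """Toy object filter: only Parquet files are considered relevant."""
    return name.endswith(".parquet")


def batch_then_filter(
    objects: List[str], keep_fn: Callable[[str], bool], limit: int
) -> List[str]:
    """Order described as the bug: cut the batch first, filter afterwards."""
    batch = objects[:limit]
    return [o for o in batch if keep_fn(o)]


def filter_then_batch(
    objects: List[str], keep_fn: Callable[[str], bool], limit: int
) -> List[str]:
    """Order described as the fix: filter first, then cut the batch."""
    relevant = [o for o in objects if keep_fn(o)]
    return relevant[:limit]


after_bug = batch_then_filter(OBJECTS, keep, SOURCE_LIMIT)
after_fix = filter_then_batch(OBJECTS, keep, SOURCE_LIMIT)
print(after_bug)  # []
print(after_fix)  # ['d.parquet', 'e.parquet']
```

In this toy run, batching first spends the whole limit on objects that the filter then discards, so the round makes no useful progress. Filtering first lets the same limit cover only relevant objects.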