[ https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicholas Jiang updated HUDI-6317:
---------------------------------
    Description: At present, read.streaming.skip_clustering defaults to false, so a streaming read can read file slices that clustering has replaced. For example, while the data of day T-1 is being clustered, the streaming read may read the T-1 data again, which produces duplicate data. Streaming read should therefore skip clustering instants in all cases to avoid reading the replaced file slices.  (was: At present, read.streaming.skip_clustering defaults to false, so a streaming read can read file slices that clustering has replaced. For example, while the data of day T-1 is being clustered, the streaming read may read the T-1 data again. Streaming read should therefore skip clustering instants in all cases to avoid reading the replaced file slices.)

> Streaming read should skip clustering instants
> ----------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
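
While the default is still false, a streaming job can set the option explicitly. The sketch below only renders a Flink SQL DDL string with read.streaming.skip_clustering turned on; the helper name, table name, path, and two-column schema are hypothetical and chosen purely for illustration, and the statement is not executed against a Flink cluster here.

```python
# Hypothetical helper: renders a Flink SQL CREATE TABLE statement for a
# Hudi streaming source. Table name, path, and schema are illustrative only.
def hudi_streaming_source_ddl(table_name, path, skip_clustering=True):
    options = {
        "connector": "hudi",
        "path": path,
        # Read the table as an unbounded stream.
        "read.streaming.enabled": "true",
        # The option discussed in this issue, set explicitly instead of
        # relying on the default.
        "read.streaming.skip_clustering": str(skip_clustering).lower(),
    }
    with_clause = ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())
    return (
        f"CREATE TABLE {table_name} (\n"
        "  id BIGINT,\n"
        "  ts TIMESTAMP(3),\n"
        "  PRIMARY KEY (id) NOT ENFORCED\n"
        ") WITH (\n"
        f"{with_clause}\n"
        ")"
    )


ddl = hudi_streaming_source_ddl("hudi_source", "file:///tmp/hudi/example_table")
print(ddl)
```

The resulting string could be passed to a Flink SQL client or to a table environment's SQL execution call; that step is omitted because it needs a running Flink setup with the Hudi connector on the classpath.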