[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants to avoid duplicated reading

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Status: In Progress  (was: Open)

> Streaming read should skip clustering instants to avoid duplicated reading
> --------------------------------------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering, so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day to cause duplicated data. Therefore 
> streaming read should skip clustering instants for all cases to avoid reading 
> the replaced file slices.
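
The duplication described above can be illustrated with a small, self-contained sketch. It is a toy model under stated assumptions, not Hudi's implementation: the Instant class, the streaming_read function and the sample timeline are all hypothetical names invented for illustration. The only facts taken from Hudi are that clustering rewrites existing records into new file slices and is recorded on the timeline as a replacecommit instant. The skip_clustering parameter stands in for the read.streaming.skip_clustering option.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instant:
    """One completed action on a toy timeline (hypothetical, not Hudi's class)."""
    time: str                  # instant timestamp; sorts lexicographically
    action: str                # "commit" for a write, "replacecommit" for clustering
    records: Tuple[str, ...]   # record keys in the file slices this instant wrote


def streaming_read(timeline: List[Instant], skip_clustering: bool) -> List[str]:
    """Emit the records written by each instant, in timeline order.

    When skip_clustering is True, replacecommit instants are ignored, so the
    rewritten copies of already-consumed records are never emitted.
    """
    emitted: List[str] = []
    for instant in sorted(timeline, key=lambda i: i.time):
        if skip_clustering and instant.action == "replacecommit":
            continue
        emitted.extend(instant.records)
    return emitted


# T-1 day data arrives in two commits, is clustered on day T, then day T data arrives.
timeline = [
    Instant("20230603100000", "commit", ("a", "b")),
    Instant("20230603110000", "commit", ("c",)),
    Instant("20230604010000", "replacecommit", ("a", "b", "c")),  # clustering rewrite
    Instant("20230604020000", "commit", ("d",)),
]

with_dupes = streaming_read(timeline, skip_clustering=False)
without_dupes = streaming_read(timeline, skip_clustering=True)

print(with_dupes)     # ['a', 'b', 'c', 'a', 'b', 'c', 'd']
print(without_dupes)  # ['a', 'b', 'c', 'd']
```

With skip_clustering=False the T-1 day records a, b and c are emitted a second time when the clustering instant is reached, which is the duplicated read the issue reports. Skipping the clustering instant loses nothing in this model because clustering only rewrites records that earlier commits already delivered.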



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants to avoid duplicated reading

2023-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6317:
---------------------------------
Labels: pull-request-available  (was: )

> Streaming read should skip clustering instants to avoid duplicated reading
> --------------------------------------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering, so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day to cause duplicated data. Therefore 
> streaming read should skip clustering instants for all cases to avoid reading 
> the replaced file slices.





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants to avoid duplicated reading

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Summary: Streaming read should skip clustering instants to avoid duplicated 
reading  (was: Streaming read should skip clustering instants to avoid 
deplicated reading)

> Streaming read should skip clustering instants to avoid duplicated reading
> --------------------------------------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering, so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day to cause duplicated data. Therefore 
> streaming read should skip clustering instants for all cases to avoid reading 
> the replaced file slices.





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants to avoid deplicated reading

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Summary: Streaming read should skip clustering instants to avoid deplicated 
reading  (was: Streaming read should skip clustering instants)

> Streaming read should skip clustering instants to avoid deplicated reading
> --------------------------------------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering, so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day to cause duplicated data. Therefore 
> streaming read should skip clustering instants for all cases to avoid reading 
> the replaced file slices.





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Description: At present, the default value of 
read.streaming.skip_clustering is false, which could cause the situation that 
streaming reading reads the replaced file slices of clustering, so that 
streaming reading may read T-1 day data when clustering the data of T-1 day to 
cause duplicated data. Therefore streaming read should skip clustering instants 
for all cases to avoid reading the replaced file slices.  (was: At present, the 
default value of read.streaming.skip_clustering is false, which could cause the 
situation that streaming reading reads the replaced file slices of clustering 
so that streaming reading may read T-1 day data when clustering the data of T-1 
day. Therefore streaming read should skip clustering instants for all cases to 
avoid reading the replaced file slices.)

> Streaming read should skip clustering instants
> ----------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering, so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day to cause duplicated data. Therefore 
> streaming read should skip clustering instants for all cases to avoid reading 
> the replaced file slices.





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Description: At present, the default value of 
read.streaming.skip_clustering is false, which could cause the situation that 
streaming reading reads the replaced file slices of clustering so that 
streaming reading may read T-1 day data when clustering the data of T-1 day. 
Therefore   (was: At present, the default value of 
read.streaming.skip_clustering is false, which could cause the situation that 
streaming reading reads the replaced file slices of clustering so that 
streaming reading may read T-1 day data when clustering the data of T-1 day. 
Therefore read.streaming.skip_clustering should be true.)

> Streaming read should skip clustering instants
> ----------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day. Therefore 





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Description: At present, the default value of 
read.streaming.skip_clustering is false, which could cause the situation that 
streaming reading reads the replaced file slices of clustering so that 
streaming reading may read T-1 day data when clustering the data of T-1 day. 
Therefore streaming read should skip clustering instants for all cases to avoid 
reading the replaced file slices.  (was: At present, the default value of 
read.streaming.skip_clustering is false, which could cause the situation that 
streaming reading reads the replaced file slices of clustering so that 
streaming reading may read T-1 day data when clustering the data of T-1 day. 
Therefore )

> Streaming read should skip clustering instants
> ----------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day. Therefore streaming read should skip 
> clustering instants for all cases to avoid reading the replaced file slices.





[jira] [Updated] (HUDI-6317) Streaming read should skip clustering instants

2023-06-04 Thread Nicholas Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Jiang updated HUDI-6317:
---------------------------------
Summary: Streaming read should skip clustering instants  (was: The default 
value of read.streaming.skip_clustering should be true)

> Streaming read should skip clustering instants
> ----------------------------------------------
>
>                 Key: HUDI-6317
>                 URL: https://issues.apache.org/jira/browse/HUDI-6317
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: flink
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>
>
> At present, the default value of read.streaming.skip_clustering is false, 
> which could cause the situation that streaming reading reads the replaced 
> file slices of clustering so that streaming reading may read T-1 day data 
> when clustering the data of T-1 day. Therefore read.streaming.skip_clustering 
> should be true.


