ashulin commented on issue #2127:
URL:
https://github.com/apache/incubator-seatunnel/issues/2127#issuecomment-1175749813
> > > `org.apache.seatunnel.api.source.SourceReader#snapshotState` looks
similar to
`org.apache.seatunnel.api.source.SourceSplitEnumerator#snapshotState`, but
they return different types (`List` and `StateT`). The comment describes it as
the split checkpoint state, while it actually returns a `List`; to my mind
these are not the same. Could we unify the snapshot behavior?
> >
> >
> > `SourceSplitEnumerator#snapshotState` and `SourceReader#snapshotState`
are different. `SourceSplitEnumerator` assumes the role of coordinator, which
may require information beyond the snapshot split. `SourceReader` is designed
to only need splits to run, so the snapshot returns `List<SplitT>`.
>
> In my opinion, `SplitT` is the type of a `SourceSplit`, not the state of a
checkpoint. I wonder if `List<StateT>` would be more appropriate?
The `List<SplitT>` returned by `SourceReader#snapshotState` is handed back to
the enumerator through `SourceSplitEnumerator#addSplitsBack`. This lets the
job keep running normally when the parallelism of the source changes, so the
reader's state needs to be convertible to splits.
To avoid ambiguity, we can add
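```java
public interface SourceSplitState {
    // Converts this reader-side state back into a split that can be
    // handed to SourceSplitEnumerator#addSplitsBack.
    SourceSplit toSplit();
}
```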
and have `SourceReader#snapshotState` return `List<SourceSplitState>`.
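
As a rough sketch of how this could fit together (not the actual SeaTunnel code): `SourceSplit` below is a simplified stand-in for the real interface, and `FileSplit`, `FileSplitState`, and `SplitStates.toSplits` are hypothetical names used only for illustration. The idea is that the reader snapshots its per-split progress, and the restore path converts each state back into a split before the splits are given to `addSplitsBack`.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for org.apache.seatunnel.api.source.SourceSplit (assumption).
interface SourceSplit {
    String splitId();
}

// The interface proposed in this comment.
interface SourceSplitState {
    SourceSplit toSplit();
}

// Hypothetical split: a file plus the offset to start reading from.
final class FileSplit implements SourceSplit {
    final String path;
    final long startOffset;

    FileSplit(String path, long startOffset) {
        this.path = path;
        this.startOffset = startOffset;
    }

    @Override
    public String splitId() {
        return path + "@" + startOffset;
    }
}

// Hypothetical reader-side state: the split being read plus current progress.
final class FileSplitState implements SourceSplitState {
    private final String path;
    private long currentOffset;

    FileSplitState(String path, long startOffset) {
        this.path = path;
        this.currentOffset = startOffset;
    }

    // Records that the reader consumed `bytes` more bytes of this split.
    void advance(long bytes) {
        currentOffset += bytes;
    }

    // The remaining work becomes a new split that starts at the current offset.
    @Override
    public SourceSplit toSplit() {
        return new FileSplit(path, currentOffset);
    }
}

final class SplitStates {
    // Converts snapshotted reader states into splits, e.g. before passing
    // them to SourceSplitEnumerator#addSplitsBack after a parallelism change.
    static List<SourceSplit> toSplits(List<? extends SourceSplitState> states) {
        List<SourceSplit> splits = new ArrayList<>(states.size());
        for (SourceSplitState state : states) {
            splits.add(state.toSplit());
        }
        return splits;
    }
}
```

With this shape the checkpoint stores reader state rather than raw splits, while the enumerator still only ever receives `SourceSplit` instances.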