Hi Peter,

Maybe you can try Flink new source[1]? It allows you to customize your
own `SplitEnumerator` which can control how to assign splits.

DefaultInputSplitAssigner is not a public interface. And currently there
is no requirement of the order of splits by Flink itself. Therefore, it
looks to me more an outdated documentation problem. Furthermore,
DefaultInputSplitAssigner is used for legacy sources which will be
deprecated in the future.

[1] 
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/sources/#the-data-source-api

Thanks,
Zhu

Péter Váry <peter.vary.apa...@gmail.com> 于2023年4月19日周三 20:27写道:
>
> Hi Team,
>
> Recently I ran into the DefaultInputSplitAssigner [1].
> The javadoc documentation states:
>
>
> /**
>
>  * This is the default implementation of the {@link InputSplitAssigner}
> interface. The default input
>
>  * split assigner simply returns all input splits of an input vertex *in
> the order they were*
>
> * * originally computed*.
>
>  */
>
> The highlighted part says the order of the elements are kept and we wanted
> to rely on this.
>
> OTOH the code does this:
>
>
>         synchronized (this.splits) {
>
>             if (this.splits.size() > 0) {
>
>                 next = this.splits.remove(this.splits.size() - 1);
>
>             }
>
>         }
>
> Which is exactly opposite of the stated intent and returns the splits in
> the reverse order :D
> The change got committed almost a decade ago [2]
>
> I can see 2 solutions here:
> - Change the comment by removing the restrictions for the ordering of the
> splits. Since the "Feature" is there 20th Sept 2014, and fixing it would
> possibly cause issues where someone rely on the current behaviour
> - Fix the return order
>
> What do you think we should do in this case?
>
> Thanks,
> Peter
>
> [1]
> https://github.com/apache/flink/blob/9e5f39e36a9b20a60573ff4051a9e8e8e54a78a9/flink-core/src/main/java/org/apache/flink/api/common/io/DefaultInputSplitAssigner.java
> [2] FLINK-1094: Reworked, improved, and testes split assigners:
> https://github.com/apache/flink/commit/c32569aed12ffa968e2c2289c2d56db262c0eba4

Reply via email to