Hi!

The source runs in parallel (n tasks), but the sink has a parallelism of 1.
The sink therefore has to merge the parallel streams from the source. The
merge order depends on the arrival speed of the streams, i.e., it is not
deterministic. That's why you see the lines being mixed.

Try running source and sink with the same parallelism, so that no merge of
streams needs to happen. You'll then see that, within each output file, the
lines are in the expected order.
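
To illustrate the effect without a cluster, here is a minimal plain-Java
sketch (not Flink code). It assumes that each parallel subtask emits its own
"line1"/"line2" pairs in order and that the single sink takes the next
element from whichever stream happens to deliver first, which is modelled
here with a seeded random choice. The class and method names (MergeSketch,
subtaskOutput, merge, adjacentRepeats) are made up for this example. A real
runtime may hand over records in batches rather than one by one, so it would
likely break far fewer pairs than this per-element simulation does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class MergeSketch {

    // Output of one parallel subtask: "line1" then "line2" for every record.
    static List<String> subtaskOutput(int records) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            out.add("line1");
            out.add("line2");
        }
        return out;
    }

    // Merge the streams into one. Each stream keeps its internal order, but
    // the stream that delivers the next element is chosen at random
    // (assumption: this stands in for "whichever stream arrives first").
    static List<String> merge(List<List<String>> streams, long seed) {
        Random rnd = new Random(seed);
        int[] pos = new int[streams.size()];
        int remaining = 0;
        for (List<String> s : streams) {
            remaining += s.size();
        }
        List<String> merged = new ArrayList<>();
        while (remaining > 0) {
            int i = rnd.nextInt(streams.size());
            List<String> s = streams.get(i);
            if (pos[i] < s.size()) {
                merged.add(s.get(pos[i]));
                pos[i]++;
                remaining--;
            }
        }
        return merged;
    }

    // Count positions where a line equals the line directly before it,
    // e.g. "line2" followed by "line2".
    static int adjacentRepeats(List<String> lines) {
        int repeats = 0;
        for (int i = 1; i < lines.size(); i++) {
            if (lines.get(i).equals(lines.get(i - 1))) {
                repeats++;
            }
        }
        return repeats;
    }

    public static void main(String[] args) {
        List<List<String>> one = new ArrayList<>();
        one.add(subtaskOutput(1000));

        List<List<String>> four = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            four.add(subtaskOutput(1000));
        }

        // A single stream needs no merge, so no pair is ever broken.
        System.out.println("1 stream:  " + adjacentRepeats(merge(one, 42L)));
        // Several streams merged into one sink interleave arbitrarily.
        System.out.println("4 streams: " + adjacentRepeats(merge(four, 42L)));
    }
}
```

With a single stream the count of adjacent repeats is always zero, which
corresponds to source and sink having the same parallelism. With several
streams merged into one, repeated lines appear even though every individual
stream is perfectly ordered.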

Stephan


On Thu, Aug 11, 2016 at 2:29 PM, Yassin Marzouki <yassmar...@gmail.com>
wrote:

> Hi all,
>
> When I use out.collect() twice inside a flatMap, the output is sometimes
> randomly out of order. Take this example:
>
> final StreamExecutionEnvironment env =
>     StreamExecutionEnvironment.createLocalEnvironment();
> env.generateSequence(1, 100000)
>     .flatMap((Long t, Collector<String> out) -> {
>         out.collect("line1");
>         out.collect("line2");
>     })
>     .writeAsText("test", FileSystem.WriteMode.OVERWRITE)
>     .setParallelism(1);
> env.execute("Test");
>
> I expect the output to be
> line1
> line2
> line1
> line2
> ...
>
> But some resulting lines (18 out of 200000) were:
> line2
> line2
> and the same for line1.
>
> What could be the reason for this?
>
> Best,
> Yassine
>
