Hi Pawel,
does it happen only with the Flink runner ? I bet it happens with any
runner.
Let me take a look.
Regards
JB
On 05/30/2016 01:38 AM, Pawel Szczur wrote:
Hi,
I'm running a pipeline with Flink backend, Beam bleeding edge, Oracle
Java 1.8, maven 3.3.3, linux64.
The pipeline is run with --parallelism=6.
Adding .withoutSharding()causes a TextIO sink to write only one of the
shards.
Example use:
data.apply(TextIO.Write.named("write-debug-csv").to("/tmp/some-stats"));
vs.
data.apply(TextIO.Write.named("write-debug-csv").to("/tmp/some-stats")*.withoutSharding()*);
Result:
Only part of data is written to file. After comparing to sharded output,
it seems to be just one of shard files.
Cheers,
Pawel
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com