Re: How to force the parallelism on small streams?

2015-09-02 Thread Matthias J. Sax
Hi, If I understand you correctly, you want to have 100 mappers. Thus you need to apply .setParallelism() after .map() > addSource(myFileSource).rebalance().map(myFileMapper).setParallelism(100) The order of commands you used sets the dop for the source to 100 (which might be ignored, if the

RE: How to force the parallelism on small streams?

2015-09-02 Thread LINZ, Arnaud
6 To: user@flink.apache.org Subject: Re: How to force the parallelism on small streams? Hi, If I understand you correctly, you want to have 100 mappers. Thus you need to apply the .setParallelism() after .map() > addSource(myFileSource).rebalance().map(myFileMapper).setParallelism(100) The order

Re: How to force the parallelism on small streams?

2015-09-02 Thread Matthias J. Sax
Original Message- > From: Matthias J. Sax [mailto:mj...@apache.org] > Sent: Wednesday, 2 September 2015 17:56 > To: user@flink.apache.org > Subject: Re: How to force the parallelism on small streams? > > Hi, > > If I understand you correctly, you want to have 100 map

Re: How to force the parallelism on small streams?

2015-09-03 Thread Aljoscha Krettek
alancing > evenly between the mappers. > > > > Greetings, > > Arnaud > > > > > > -Original Message- > > From: Matthias J. Sax [mailto:mj...@apache.org] > > Sent: Wednesday, 2 September 2015 17:56 > > To: user@flink.apache.org >

Re: How to force the parallelism on small streams?

2015-09-03 Thread Matthias J. Sax
:mj...@apache.org > <mailto:mj...@apache.org>] > > Sent: Wednesday, 2 September 2015 17:56 > > To: user@flink.apache.org <mailto:user@flink.apache.org> > > Subject: Re: How to force the parallelism on small streams? > > > > Hi

Re: How to force the parallelism on small streams?

2015-09-03 Thread Fabian Hueske
mappers. > > > > > > Greetings, > > > Arnaud > > > > > > > > > -Original Message- > > > From: Matthias J. Sax [mailto:mj...@apache.org > > <mailto:mj...@apache.org>] > >

Re: How to force the parallelism on small streams?

2015-09-03 Thread Fabian Hueske
e() with shuffle(). > > > > > > But I found a workaround: setting parallelism to 1 for the source > > (I don't need 100 directory scanners anyway) forces the > > rebalancing evenly between the mappers. > > > > > > Greetings, > > >

Re: How to force the parallelism on small streams?

2015-09-03 Thread Fabian Hueske
my problem, since I have > >>> 100 parallelism everywhere. Each of my 100 sources gives only a few > lines > >>> (say 14 max), and only the first 14 next nodes will receive data. > >>>> Same problem by replacing rebalance() with shuffle(). > >>>>
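
To make the behaviour reported in this thread easier to picture, here is a small stand-alone Java model. It is not Flink code and does not use any Flink API. It only simulates round-robin distribution under one assumption that is not confirmed by the thread: every parallel sender starts its round-robin at the same first channel. The class and method names (RebalanceSketch, distribute, busyMappers) are made up for this illustration. With 100 senders emitting 14 records each, only 14 of 100 mappers receive data; with a single sender emitting the same 1400 records, all 100 mappers do, which matches the workaround of setting the source parallelism to 1.

```java
import java.util.Arrays;

public class RebalanceSketch {

    /**
     * Simulates `sources` parallel senders, each emitting `recordsPerSource`
     * records round-robin over `mappers` receivers.
     * ASSUMPTION (illustration only): each sender starts at receiver 0.
     */
    public static int[] distribute(int sources, int recordsPerSource, int mappers) {
        int[] received = new int[mappers];
        for (int s = 0; s < sources; s++) {
            int next = 0; // every sender keeps its own round-robin cursor
            for (int r = 0; r < recordsPerSource; r++) {
                received[next]++;
                next = (next + 1) % mappers;
            }
        }
        return received;
    }

    /** Counts the receivers that got at least one record. */
    public static long busyMappers(int[] received) {
        return Arrays.stream(received).filter(count -> count > 0).count();
    }

    public static void main(String[] args) {
        // 100 sources with 14 records each: the situation described in the thread.
        int[] manySources = distribute(100, 14, 100);
        // One source emitting the same total number of records: the workaround.
        int[] oneSource = distribute(1, 1400, 100);

        System.out.println("100 sources: " + busyMappers(manySources) + " busy mappers");
        System.out.println("1 source:    " + busyMappers(oneSource) + " busy mappers");
    }
}
```

Whether a real Flink job behaves like this depends on the Flink version and on how each sender picks its first channel, so treat the numbers as a picture of the reported symptom rather than as a statement about Flink internals.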