It is a single mapper operator with a parallelism of 16.
The same goes for the sources and sinks: one operator each, with a parallelism of 4.

The settings are:
env.setParallelism(4);
mapper.setParallelism(env.getParallelism() * 4);

The intent is to have X mapper tasks per source / sink (here X = 4).

The mapper does some heavy computation and we have only 4 Kafka partitions.
That's why we need more mappers than sources / sinks.
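
To make the numbers concrete, here is a minimal plain-Java sketch of the arithmetic behind those settings. It does not use the Flink API. The class and method names are made up for illustration, and it assumes one slot per subtask, which is what the 24 fully used slots in the original question suggest with chaining disabled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: it only reproduces the numbers from this thread.
final class SlotBudget {
    static final int NODES = 4;
    static final int SOURCE_PARALLELISM = 4;               // env.setParallelism(4)
    static final int MAPPER_FACTOR = 4;                    // X mapper tasks per source / sink
    static final int SINK_PARALLELISM = SOURCE_PARALLELISM;

    // mapper.setParallelism(env.getParallelism() * 4)
    static int mapperParallelism() {
        return SOURCE_PARALLELISM * MAPPER_FACTOR;
    }

    // Total number of subtasks, assuming each one occupies its own slot.
    static int totalSubtasks() {
        return SOURCE_PARALLELISM + mapperParallelism() + SINK_PARALLELISM;
    }

    // The layout we would like on every node: 1x source => 4x map => 1x sink.
    static Map<String, Integer> desiredPerNode() {
        Map<String, Integer> perNode = new LinkedHashMap<>();
        perNode.put("source", SOURCE_PARALLELISM / NODES);
        perNode.put("map", mapperParallelism() / NODES);
        perNode.put("sink", SINK_PARALLELISM / NODES);
        return perNode;
    }

    public static void main(String[] args) {
        System.out.println("mapper parallelism: " + mapperParallelism());
        System.out.println("total subtasks:     " + totalSubtasks());
        System.out.println("desired per node:   " + desiredPerNode());
    }
}
```

Under these assumptions the total matches the 24 slots of the cluster, and an even spread would put 6 subtasks on each of the 4 nodes.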


-----Original Message-----
From: Aljoscha Krettek [mailto:aljos...@apache.org] 
Sent: Wednesday, 3 February 2016 16:26
To: user@flink.apache.org
Subject: Re: Distribution of sinks among the nodes

Hi Gwenhäel,
when you say 16 maps, are we talking about one mapper with parallelism 16 or 16 
unique map operators?

Regards,
Aljoscha
> On 03 Feb 2016, at 15:48, Gwenhael Pasquiers 
> <gwenhael.pasqui...@ericsson.com> wrote:
> 
> Hi,
>  
> We try to deploy an application with the following “architecture” :
>  
> 4 kafka sources => 16 maps => 4 kafka sinks, on 4 nodes, with 24 slots (we 
> disabled operator chaining).
>  
> So we’d like on each node :
> 1x source => 4x map => 1x sink
>  
> That way there are no exchanges between different instances of Flink and 
> performance would be optimal.
>  
> But we get (according to the flink GUI and the Host column when looking at 
> the details of each task) :
>  
> Node 1 : 1 source =>  2 map
> Node 2 : 1 source =>  1 map
> Node 3 : 1 source =>  1 map
> Node 4 : 1 source =>  12 maps => 4 sinks
>  
> (I think no comments are needed :))
>  
> The Web UI says that there are 24 slots and they are all used, but they 
> don’t seem evenly distributed …
>  
> How could we make Flink deploy the tasks the way we want ?
>  
> B.R.
>  
> Gwen’
