Thanks Gary. Understood the behavior.

I am leaning towards running 7 TM on each machine(8 core), I have 4 nodes, that 
will end up 28 taskmanagers and 1 job manager. I was wondering if this can 
bring additional burden on jobmanager? Is it recommended?

Thanks,

Jins George

On 2/14/19 8:49 AM, Gary Yao wrote:
Hi Jins George,

This has been asked before [1]. The bottom line is that you currently cannot
pre-allocate TMs and distribute your tasks evenly. You might be able to
achieve a better distribution across hosts by configuring fewer slots in your
TMs.

Best,
Gary

[1] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-1-5-job-distribution-over-cluster-nodes-td21588.html


On Wed, Feb 13, 2019 at 6:20 AM Tzu-Li (Gordon) Tai 
<tzuli...@apache.org<mailto:tzuli...@apache.org>> wrote:
Hi,

I'm forwarding this question to Gary (CC'ed), who most likely would have an 
answer for your question here.

Cheers,
Gordon

On Wed, Feb 13, 2019 at 8:33 AM Jins George 
<jins.geo...@aeris.net<mailto:jins.geo...@aeris.net>> wrote:

Hello community,

I am trying to  upgrade a  Flink Yarn session cluster running BEAM pipelines  
from version 1.2.0 to 1.6.3.

Here is my session start command: yarn-session.sh -d -n 4  -jm 1024 -tm 3072 -s 
7

Because of the dynamic resource allocation,  no taskmanager gets created 
initially. Now once I submit a job with parallelism 5, I see that 1 
task-manager gets created and all 5 parallel instances are scheduled on the 
same taskmanager( because I have 7 slots).  This can create hot spot as only 
one physical node ( out of 4 in my case) is utilized for processing.

I noticed the legacy mode, which would provision all task managers at cluster 
creation, but since legacy mode is expected to go away soon, I didn't want to 
try that route.

Is there a way I can configure the multiple jobs or parallel instances of same 
job spread across all the available Yarn nodes and continue using the 'new' 
mode ?

Thanks,

Jins George

Reply via email to