CCing @Weihua Hu<mailto:huweihua....@gmail.com>, who is an expert on this. Do 
you have any ideas about the phenomenon described here?

Best,
Zhanghao Chen
________________________________
From: Lu Niu <qqib...@gmail.com>
Sent: Tuesday, August 29, 2023 12:11:35 PM
To: Chen Zhanghao <zhanghao.c...@outlook.com>
Cc: Kenan Kılıçtepe <kkilict...@gmail.com>; user <user@flink.apache.org>
Subject: Re: Uneven TM Distribution of Flink on YARN

Thanks for your reply.

Interestingly, we also manage Spark on YARN, but only the Flink cluster has 
this issue. I am wondering whether there is a difference in the implementation 
on the Flink side.

Best
Lu

On Mon, Aug 28, 2023 at 8:38 PM Chen Zhanghao 
<zhanghao.c...@outlook.com<mailto:zhanghao.c...@outlook.com>> wrote:
Hi Lu Niu,

TM distribution on YARN nodes is managed by YARN RM and is out of the scope of 
Flink. On the other hand, cluster.evenly-spread-out-slots forces even 
distribution of tasks among Flink TMs, and has nothing to do with your 
concerns. Also, the config currently only supports Standalone mode Flink 
clusters, and does not take effect on a Flink cluster on YARN.
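
To illustrate the distinction drawn above, here is a minimal toy sketch of 
what "spreading slots evenly among TMs" means. It is not Flink's actual slot 
manager; the function name and the TM ids are hypothetical. Note that it only 
balances slots across TaskManagers that already exist and says nothing about 
which YARN node each TaskManager container lands on.

```python
from typing import Dict


def assign_slots_evenly(task_managers: Dict[str, int], num_slots: int) -> Dict[str, int]:
    """Toy model: give each requested slot to the least-utilized TaskManager.

    task_managers maps a (hypothetical) TM id to its total number of slots.
    Returns a mapping of TM id to the number of slots in use.
    """
    used = {tm: 0 for tm in task_managers}
    for _ in range(num_slots):
        # Only TMs that still have a free slot are candidates.
        candidates = [tm for tm in task_managers if used[tm] < task_managers[tm]]
        if not candidates:
            raise ValueError("not enough free slots")
        # Pick the lowest utilization; break ties by TM id for determinism.
        target = min(candidates, key=lambda tm: (used[tm] / task_managers[tm], tm))
        used[target] += 1
    return used
```

With three 4-slot TMs and six requested slots, this toy model places two 
slots on each TM, regardless of where those TMs run.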

Best,
Zhanghao Chen
________________________________
From: Lu Niu <qqib...@gmail.com<mailto:qqib...@gmail.com>>
Sent: August 29, 2023 4:30
To: Kenan Kılıçtepe <kkilict...@gmail.com<mailto:kkilict...@gmail.com>>
Cc: user <user@flink.apache.org<mailto:user@flink.apache.org>>
Subject: Re: Uneven TM Distribution of Flink on YARN

Thanks for the reply. We've already set cluster.evenly-spread-out-slots = true

Best
Lu

On Mon, Aug 28, 2023 at 1:23 PM Kenan Kılıçtepe 
<kkilict...@gmail.com<mailto:kkilict...@gmail.com>> wrote:
Have you checked the config param cluster.evenly-spread-out-slots?


On Mon, Aug 28, 2023 at 10:31 PM Lu Niu 
<qqib...@gmail.com<mailto:qqib...@gmail.com>> wrote:
Hi, Flink users

We have recently observed that the allocation of Flink TaskManagers in our YARN 
cluster is not evenly distributed. We would like to hear your thoughts on this 
matter.

1. Our setup includes Flink version 1.15.1 and Hadoop 2.10.0.
2. The uneven distribution: in a 370-node YARN cluster, 16 nodes have only 0 
or 1 vCore available, while 110 nodes have more than 10 vCores available.
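
A skew like the one in item 2 can be summarized with a short script. The 
sketch below is only an illustration: it assumes the free vCore count per 
node has already been collected (for example from the YARN ResourceManager's 
node report), and the function name, node names, and thresholds are 
hypothetical.

```python
from typing import Dict, Mapping


def summarize_free_vcores(
    free_vcores: Mapping[str, int], low: int = 1, high: int = 10
) -> Dict[str, int]:
    """Count nearly full nodes (<= low free vCores) and mostly free nodes (> high)."""
    nearly_full = sum(1 for v in free_vcores.values() if v <= low)
    mostly_free = sum(1 for v in free_vcores.values() if v > high)
    return {
        "total": len(free_vcores),
        "nearly_full": nearly_full,
        "mostly_free": mostly_free,
    }


# Made-up sample input: node name -> vCores currently available.
sample = {"node-a": 0, "node-b": 1, "node-c": 12, "node-d": 5}
summary = summarize_free_vcores(sample)
```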

Is such behavior expected? If not, is there a fix provided in Flink? Thanks!

Best
Lu
