Thanks, Yangze, for providing these links. I'll try it!

-----Original Message-----
From: Yangze Guo [mailto:karma...@gmail.com]
Sent: Tuesday, August 18, 2020 12:57
To: 范超 <fanc...@mgtv.com>
Cc: user (user@flink.apache.org) <user@flink.apache.org>
Subject: Re: How to specify the number of TaskManagers in Yarn Cluster using Per-Job Mode

The number of TMs mainly depends on the parallelism and the job graph.
Flink now allows you to set the maximum number of slots
(slotmanager.number-of-slots.max [1]). There is also a plan to support setting
the minimum number of slots [2].

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#slotmanager-number-of-slots-max
[2] https://issues.apache.org/jira/browse/FLINK-15959
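
To make the dependence on parallelism concrete, here is a rough sketch of the arithmetic. It assumes default slot sharing, so the job needs as many slots as its highest operator parallelism, and that each TM offers a fixed number of slots. The values 12 and 4 are hypothetical, and the -ys flag (slots per TM) should be checked against `flink run --help` for your version. The command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch: estimate how many TaskManagers a per-job YARN deployment requests.
# Assumption: default slot sharing, so required slots == highest parallelism.

# expected_tms PARALLELISM SLOTS_PER_TM  ->  ceil(PARALLELISM / SLOTS_PER_TM)
expected_tms() {
  local parallelism=$1 slots_per_tm=$2
  echo $(( (parallelism + slots_per_tm - 1) / slots_per_tm ))
}

parallelism=12   # hypothetical value, passed to the job with -p
slots_per_tm=4   # hypothetical value, i.e. taskmanager.numberOfTaskSlots

echo "expected TaskManagers: $(expected_tms "$parallelism" "$slots_per_tm")"

# Print (do not run) a matching submission command for the WordCount example.
echo "./bin/flink run -m yarn-cluster -p $parallelism -ys $slots_per_tm ./examples/batch/WordCount.jar"
```

With these numbers the estimate is 3 TMs. Which machines those containers land on is still decided by Yarn.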

Best,
Yangze Guo

On Tue, Aug 18, 2020 at 12:21 PM 范超 <fanc...@mgtv.com> wrote:
>
> Thanks, Yangze.
>
> 1. Did you run into any problems when deploying on Yarn or running the Flink job?
> My job works well.
>
> 2. Why do you need to start TMs on all three machines?
> From a cluster perspective, I would like the processing load to be balanced
> across all 3 machines.
>
> 3. Flink can control how many TMs to start, but where the TMs are started
> depends on Yarn.
> Yes, where the TMs are started depends on Yarn.
> Could you please tell me which parameter controls how many TMs to start? The
> -yn parameter was removed in 1.10; the sample below from the 1.9 docs [1] still uses it.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/cli.html
>
> Run example program using a per-job YARN cluster with 2 TaskManagers:
>
> ./bin/flink run -m yarn-cluster -yn 2 \
>                        ./examples/batch/WordCount.jar \
>                        --input hdfs:///user/hamlet.txt --output hdfs:///user/wordcount_out
>
> -----Original Message-----
> From: Yangze Guo [mailto:karma...@gmail.com]
> Sent: Tuesday, August 18, 2020 11:31
> To: 范超 <fanc...@mgtv.com>
> Cc: user (user@flink.apache.org) <user@flink.apache.org>
> Subject: Re: How to specify the number of TaskManagers in Yarn Cluster using Per-Job Mode
>
> Hi,
>
> Flink can control how many TMs to start, but where the TMs are started
> depends on Yarn.
>
> Did you run into any problems when deploying on Yarn or running the Flink job?
> Why do you need to start TMs on all three machines?
>
> Best,
> Yangze Guo
>
> On Tue, Aug 18, 2020 at 11:25 AM 范超 <fanc...@mgtv.com> wrote:
> >
> > Thanks, Yangze.
> > The reason I am not deploying a standalone cluster is that Kafka, Kudu,
> > Hadoop, and ZooKeeper also run on these machines, so using Yarn to manage
> > resources is probably the best choice for me right now.
> > If Flink cannot control how many TMs to start, could anyone share some
> > best practices for deploying on Yarn? I read [1] and it is still not very
> > clear to me.
> >
> > [1] https://www.ververica.com/blog/how-to-size-your-apache-flink-cluster-general-guidelines
> >
> > -----Original Message-----
> > From: Yangze Guo [mailto:karma...@gmail.com]
> > Sent: Tuesday, August 18, 2020 10:50
> > To: 范超 <fanc...@mgtv.com>
> > Cc: user (user@flink.apache.org) <user@flink.apache.org>
> > Subject: Re: How to specify the number of TaskManagers in Yarn Cluster using Per-Job Mode
> >
> > Hi,
> >
> > I think that is only related to the Yarn scheduling strategy. AFAIK, Flink
> > cannot control it. You could check the RM log to figure out why it did
> > not schedule the containers on all three machines. BTW, if you have a
> > specific requirement to start TMs on all three machines, how about
> > deploying a standalone cluster instead?
> >
> > Best,
> > Yangze Guo
> >
> > On Tue, Aug 18, 2020 at 10:24 AM 范超 <fanc...@mgtv.com> wrote:
> > >
> > > Thanks, Yangze.
> > >
> > > The NodeManager is started on all 3 machines.
> > >
> > > I just don't know why the three machines are not each running a Flink
> > > TaskManager, or how to achieve this.
> > >
> > > -----Original Message-----
> > > From: Yangze Guo [mailto:karma...@gmail.com]
> > > Sent: Tuesday, August 18, 2020 10:10
> > > To: 范超 <fanc...@mgtv.com>
> > > Cc: user (user@flink.apache.org) <user@flink.apache.org>
> > > Subject: Re: How to specify the number of TaskManagers in Yarn Cluster using Per-Job Mode
> > >
> > > Hi,
> > >
> > > Did you start the NodeManager on all three machines? If so, could you
> > > check that all the NMs are correctly connected to the ResourceManager?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Aug 18, 2020 at 10:01 AM 范超 <fanc...@mgtv.com> wrote:
> > > >
> > > > Hi, Dev and Users,
> > > > I have 3 machines, each with 8 cores and 16GB of memory.
> > > > Below is a screenshot of my Resource Manager; the cluster has 36GB in
> > > > total.
> > > > I set the parallelism to 3 and even up to 12, but the TaskManagers
> > > > always run on two nodes rather than on all three machines; the third
> > > > node does not start a TaskManager.
> > > > I tried setting the -p, -tm and -jm parameters, but the result is always
> > > > the same. The only difference is that more containers run on the two
> > > > machines; still not all three machines start a TaskManager.
> > > > My question is how to set the CLI parameters so that TaskManagers start
> > > > on all three of my machines.
> > > >
> > > > Thanks a lot
> > > > [cid:image001.png@01D67546.62291B70]
> > > >
> > > >
> > > > Chao fan
> > > >
