Hi Lu,

Yang is right. Enabling cgroup isolation is probably what you are
looking for to control how Flink utilizes CPU resources.
One more idea is to enable DominantResourceCalculator[1], which I think
you have probably already done.
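
To make the difference Yang describes below more concrete, here is a
rough sketch of how I think about it. It is a simplified model, not
YARN's or the kernel's actual algorithm: the function names, the
8-core node, the demand figures, and the 1024-per-vcore share weight
are all assumptions chosen for illustration.

```python
# Simplified model of soft CPU shares vs. a hard CPU quota.
# All names and numbers are illustrative assumptions, not measurements.

def cpu_under_shares(demands, shares, total_cores):
    """Split total_cores by share weight. A container never receives more
    than it demands; unused capacity is redistributed to the others."""
    alloc = {name: 0.0 for name in demands}
    active = set(demands)
    remaining = float(total_cores)
    while active and remaining > 1e-9:
        total_weight = sum(shares[n] for n in active)
        # Containers whose demand fits inside their weighted fair share.
        satisfied = {
            n for n in active
            if demands[n] <= remaining * shares[n] / total_weight + 1e-9
        }
        if not satisfied:
            # Everyone wants more than a fair share: weights decide the split.
            for n in active:
                alloc[n] = remaining * shares[n] / total_weight
            break
        for n in satisfied:
            alloc[n] = demands[n]
            remaining -= demands[n]
        active -= satisfied
    return alloc


def cpu_under_quota(demands, vcores, cores_per_vcore=1.0):
    """Hard cap: each container gets at most its allocated vcores.
    Assumes the sum of all quotas fits on the node."""
    return {n: min(demands[n], vcores[n] * cores_per_vcore) for n in demands}


# Three TaskManagers with 1 vcore each on a hypothetical 8-core node.
shares = {"busy": 1024, "a": 1024, "b": 1024}
vcores = {"busy": 1, "a": 1, "b": 1}
mostly_idle = {"busy": 6.0, "a": 0.5, "b": 0.5}
all_busy = {"busy": 6.0, "a": 6.0, "b": 6.0}

print(cpu_under_shares(mostly_idle, shares, 8))  # busy TM takes the spare cores
print(cpu_under_shares(all_busy, shares, 8))     # weights only matter here
print(cpu_under_quota(mostly_idle, vcores))      # busy TM is capped at its vcores
```

In this model, shares only come into play when the node is saturated,
which would be consistent with one TaskManager showing several hundred
percent CPU while its neighbours are mostly idle. The quota case is the
behaviour you get with strict resource usage enabled.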

Found an interesting read[2] about this if you would like to dig deeper.

Thanks,
Rong

[1]
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/apidocs/org/apache/hadoop/yarn/util/resource/DominantResourceCalculator.html
[2] https://developer.ibm.com/hadoop/2017/06/30/deep-dive-yarn-cgroups/

On Fri, Nov 8, 2019 at 3:51 AM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi Lu Niu,
>
> Yes, you can use `yarn.containers.vcores` to set the number of vcores for
> each TaskManager. However, that does not guarantee that applications will
> not affect each other. By default, a YARN cluster uses cgroup CPU shares.
> That means a TaskManager can use more CPU than it was allocated. When the
> machine is heavily loaded, the Linux kernel uses the CPU shares as weights
> to divide CPU time among the competing processes.
>
> If you want to limit each TaskManager to only the CPU it was allocated,
> setting
> `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true`
> is the only way. The YARN NodeManager will then set a CPU quota for each
> TaskManager.
>
>
>
>
> Best,
> Yang
>
> On Thu, Nov 7, 2019 at 1:15 AM Lu Niu <qqib...@gmail.com> wrote:
>
>> Hi,
>>
>> Thanks for replying! Basically, I want to limit CPU usage so that
>> different applications don't affect each other. What is the current best
>> practice? It looks like
>> `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true`
>> is one way. How do I set how many CPU resources to use? Is it
>> `yarn.containers.vcores`?
>>
>> It should be -ys, not -yn, in my original post. Sorry for the typo.
>>
>> Best
>> Lu
>>
>> On Wed, Nov 6, 2019 at 1:41 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>>> Limiting the CPU usage of the TaskManager container depends on your
>>> YARN cluster configuration.
>>> By default, YARN only uses CPU shares. You need to set
>>> `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true`
>>> in the yarn-site.xml of all YARN NodeManagers.
>>>
>>>
>>> Best,
>>> Yang
>>>
>>> On Wed, Nov 6, 2019 at 5:02 PM Victor Wong <jiasheng.w...@outlook.com> wrote:
>>>
>>>> Hi Lu,
>>>>
>>>>
>>>>
>>>> You can check which operator thread causes the high CPU usage and
>>>> assign it a unique slot sharing group name [1] to prevent too many operator
>>>> threads from running in the same TM.
>>>>
>>>> Hope this will be helpful😊
>>>>
>>>>
>>>>
>>>> [1].
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/#task-chaining-and-resource-groups
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>> Victor
>>>>
>>>>
>>>>
>>>> *From: *Vino Yang <yanghua1...@gmail.com>
>>>> *Date: *Wednesday, 6 November 2019 at 4:26 PM
>>>> *To: *Lu Niu <qqib...@gmail.com>
>>>> *Cc: *user <user@flink.apache.org>
>>>> *Subject: *Re: Limit max cpu usage per TaskManager
>>>>
>>>>
>>>>
>>>> Hi Lu,
>>>>
>>>>
>>>>
>>>> Flink on YARN relies on YARN's resource management capabilities; Flink
>>>> itself cannot currently limit CPU usage.
>>>>
>>>> Also, which version of Flink do you use? As far as I know, since Flink
>>>> 1.8, the -yn parameter no longer works.
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>> Vino
>>>>
>>>>
>>>>
>>>> On Wed, Nov 6, 2019 at 1:29 PM Lu Niu <qqib...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> When running a Flink application in YARN mode, is there a way to limit
>>>> the maximum CPU usage per TaskManager?
>>>>
>>>>
>>>>
>>>> I tried this with an application that has just a source and a sink
>>>> operator. The parallelism of the source is 60 and the parallelism of the
>>>> sink is 1. With the default config, 60 TaskManagers are assigned. I notice
>>>> that one TaskManager process's CPU usage can reach 200% while the rest stay
>>>> below 50%.
>>>>
>>>>
>>>>
>>>> When I set -yn = 2 (default is 1), the number of TaskManagers dropped
>>>> to 30, and one TaskManager process's CPU usage could reach 600% while the
>>>> rest stayed below 50%.
>>>>
>>>>
>>>>
>>>> I tried setting yarn.containers.vcores = 2, but all tasks stay in the
>>>> start state forever and the application never reaches the running state.
>>>>
>>>>
>>>>
>>>> Best
>>>>
>>>> Lu
>>>>
>>>>
