[ https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630357#comment-14630357 ]

Jie Yu commented on MESOS-2652:
-------------------------------

I did another experiment in which I used a non-flattened cgroups layout. To be 
more specific, the layout is the following:
{noformat}
/mesos
    |------ regular_container1/      # cpu.shares = 1024 * nr_cpus
    |------ regular_container2/      # cpu.shares = 1024 * nr_cpus
    |------ revocable/               # cpu.shares = 2 (minimal share)
                   |------- revocable_container1/
                   |------- revocable_container2/
{noformat}
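
For reference, here is a minimal sketch (not actual Mesos isolator code) of how the shares above could be applied by hand. It assumes the cgroup v1 cpu subsystem is mounted at /sys/fs/cgroup/cpu and that the /mesos hierarchy and the container cgroups already exist; the paths and the write_control helper are only for illustration.
{code}
// Sketch only: write the cpu.shares values from the layout above,
// assuming a cgroup v1 cpu hierarchy mounted at /sys/fs/cgroup/cpu.
#include <fstream>
#include <iostream>
#include <string>
#include <thread>

// Write a single value into a cgroup control file.
static bool write_control(const std::string& path, unsigned long value)
{
  std::ofstream file(path);
  if (!file.is_open()) {
    std::cerr << "Failed to open " << path << std::endl;
    return false;
  }
  file << value << std::endl;
  return file.good();
}

int main()
{
  const std::string root = "/sys/fs/cgroup/cpu/mesos";
  const unsigned long nr_cpus = std::thread::hardware_concurrency();

  // Regular containers get the full share: 1024 per cpu.
  write_control(root + "/regular_container1/cpu.shares", 1024 * nr_cpus);
  write_control(root + "/regular_container2/cpu.shares", 1024 * nr_cpus);

  // The revocable subtree gets the minimal share (2), so its children
  // only run when the regular containers leave cpus idle.
  write_control(root + "/revocable/cpu.shares", 2);

  return 0;
}
{code}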

Surprisingly, the performance using a non-flattened cgroups layout is similar 
to that using a flattened cgroups layout (using 10 cpu.shares per cpu for 
revocable containers). See the attached graph (I did the cgroups layout change 
at around 11:45am July 16).

My interpretation of the above results is that workload characteristics outweigh 
the impact of the cgroups layout (as long as cpu.shares is set low enough). To 
be more specific, if some of a benchmark's threads regularly go into a wait 
state (waiting on events, locks, etc.), as in facesim, the benchmark is more 
vulnerable to interference from revocable tasks. On the other hand, if all of a 
benchmark's threads are always runnable, as in ferret, it is less vulnerable to 
interference because revocable tasks don't get a chance to run.
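
To make the distinction concrete, below is a hypothetical micro-benchmark (not facesim or ferret themselves) contrasting the two thread shapes: a "waiter" thread that regularly blocks, freeing its cpu for a revocable task regardless of cpu.shares, and a "runner" thread that is always runnable and therefore keeps its cpu under CFS.
{code}
// Toy illustration of the two workload shapes discussed above.
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<unsigned long long> waiterWork{0};
std::atomic<unsigned long long> runnerWork{0};

// Waiter: computes briefly, then blocks (like a thread waiting on an
// event or lock). While it is blocked, a low-share revocable task can
// be scheduled on the freed cpu.
void waiter(std::chrono::seconds duration)
{
  auto deadline = std::chrono::steady_clock::now() + duration;
  while (std::chrono::steady_clock::now() < deadline) {
    for (int i = 0; i < 1000000; ++i) {
      waiterWork++;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

// Runner: always runnable; with 1024 * nr_cpus shares it keeps the cpu,
// so a revocable task with minimal shares rarely gets a chance to run.
void runner(std::chrono::seconds duration)
{
  auto deadline = std::chrono::steady_clock::now() + duration;
  while (std::chrono::steady_clock::now() < deadline) {
    runnerWork++;
  }
}

int main()
{
  std::thread t1(waiter, std::chrono::seconds(5));
  std::thread t2(runner, std::chrono::seconds(5));
  t1.join();
  t2.join();

  std::cout << "waiter iterations: " << waiterWork
            << ", runner iterations: " << runnerWork << std::endl;
  return 0;
}
{code}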

> Update Mesos containerizer to understand revocable cpu resources
> ----------------------------------------------------------------
>
>                 Key: MESOS-2652
>                 URL: https://issues.apache.org/jira/browse/MESOS-2652
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Vinod Kone
>            Assignee: Ian Downes
>              Labels: twitter
>             Fix For: 0.23.0
>
>         Attachments: Abnormal performance with 3 additional revocable tasks 
> (1).png, Abnormal performance with 3 additional revocable tasks (2).png, 
> Abnormal performance with 3 additional revocable tasks (3).png, Abnormal 
> performance with 3 additional revocable tasks (4).png, Abnormal performance 
> with 3 additional revocable tasks (5).png, Abnormal performance with 3 
> additional revocable tasks (6).png, Abnormal performance with 3 additional 
> revocable tasks (7).png, Performance improvement after reducing cpu.share to 
> 2 for revocable tasks (1).png, Performance improvement after reducing 
> cpu.share to 2 for revocable tasks (10).png, Performance improvement after 
> reducing cpu.share to 2 for revocable tasks (2).png, Performance improvement 
> after reducing cpu.share to 2 for revocable tasks (3).png, Performance 
> improvement after reducing cpu.share to 2 for revocable tasks (4).png, 
> Performance improvement after reducing cpu.share to 2 for revocable tasks 
> (5).png, Performance improvement after reducing cpu.share to 2 for revocable 
> tasks (6).png, Performance improvement after reducing cpu.share to 2 for 
> revocable tasks (7).png, Performance improvement after reducing cpu.share to 
> 2 for revocable tasks (8).png, Performance improvement after reducing 
> cpu.share to 2 for revocable tasks (9).png, cpu.share from 1024 to 10 for 
> revocable tasks (1).png, cpu.share from 1024 to 10 for revocable tasks 
> (2).png, flattened vs non-flattened cgroups layout (1).png, flattened vs 
> non-flattened cgroups layout (2).png
>
>
> The CPU isolator needs to properly set limits for revocable and non-revocable 
> containers.
> The proposed strategy is to use a two-way split of the cpu cgroup hierarchy 
> -- normal (non-revocable) and low priority (revocable) subtrees -- and to use 
> a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split 
> (TBD). Containers would be present in only one of the subtrees. CFS quotas 
> will *not* be set on subtree roots, only cpu.shares. Each container would set 
> CFS quota and shares as done currently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
