[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-24 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640207#comment-14640207
 ] 

Adam B commented on MESOS-2652:
---

[~jieyu] Is this still an open issue?
If so, please remove the "Fix Version" field, since it was not actually fixed 
yet.
If not, please resolve this ticket.

> Update Mesos containerizer to understand revocable cpu resources
> 
>
> Key: MESOS-2652
> URL: https://issues.apache.org/jira/browse/MESOS-2652
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Ian Downes
>  Labels: twitter
> Fix For: 0.23.0
>
> Attachments: Abnormal performance with 3 additional revocable tasks 
> (1).png, Abnormal performance with 3 additional revocable tasks (2).png, 
> Abnormal performance with 3 additional revocable tasks (3).png, Abnormal 
> performance with 3 additional revocable tasks (4).png, Abnormal performance 
> with 3 additional revocable tasks (5).png, Abnormal performance with 3 
> additional revocable tasks (6).png, Abnormal performance with 3 additional 
> revocable tasks (7).png, Performance improvement after reducing cpu.share to 
> 2 for revocable tasks (1).png, Performance improvement after reducing 
> cpu.share to 2 for revocable tasks (10).png, Performance improvement after 
> reducing cpu.share to 2 for revocable tasks (2).png, Performance improvement 
> after reducing cpu.share to 2 for revocable tasks (3).png, Performance 
> improvement after reducing cpu.share to 2 for revocable tasks (4).png, 
> Performance improvement after reducing cpu.share to 2 for revocable tasks 
> (5).png, Performance improvement after reducing cpu.share to 2 for revocable 
> tasks (6).png, Performance improvement after reducing cpu.share to 2 for 
> revocable tasks (7).png, Performance improvement after reducing cpu.share to 
> 2 for revocable tasks (8).png, Performance improvement after reducing 
> cpu.share to 2 for revocable tasks (9).png, cpu.share from 1024 to 10 for 
> revocable tasks (1).png, cpu.share from 1024 to 10 for revocable tasks 
> (2).png, flattened vs non-flattened cgroups layout (1).png, flattened vs 
> non-flattened cgroups layout (2).png
>
>
> The CPU isolator needs to properly set limits for revocable and non-revocable 
> containers.
> The proposed strategy is to use a two-way split of the cpu cgroup hierarchy 
> -- normal (non-revocable) and low priority (revocable) subtrees -- and to use 
> a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split 
> (TBD). Containers would be present in only one of the subtrees. CFS quotas 
> will *not* be set on subtree roots, only cpu.shares. Each container would set 
> CFS quota and shares as done currently.





[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-16 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630357#comment-14630357
 ] 

Jie Yu commented on MESOS-2652:
---

I did another experiment in which I used a non-flattened cgroups layout. To be 
more specific, the layout is the following:
{noformat}
/mesos
|-- regular_container1/   # cpu.shares = 1024 * nr_cpus
|-- regular_container2/   # cpu.shares = 1024 * nr_cpus
|-- revocable/            # cpu.shares = 2 (minimal share)
    |--- revocable_container1/
    |--- revocable_container2/
{noformat}
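
For comparison, the flattened layout referenced below (my sketch, not taken from the patches) keeps every container directly under /mesos and relies only on per-container shares:
{noformat}
/mesos
|-- regular_container1/     # cpu.shares = 1024 * nr_cpus
|-- regular_container2/     # cpu.shares = 1024 * nr_cpus
|-- revocable_container1/   # cpu.shares = 10 * nr_revocable_cpus
|-- revocable_container2/   # cpu.shares = 10 * nr_revocable_cpus
{noformat}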

Surprisingly, the performance using the non-flattened cgroups layout is similar 
to that using the flattened cgroups layout (with 10 cpu.shares per cpu for 
revocable containers). See the attached graph (I made the cgroups layout change 
at around 11:45am July 16).

My interpretation of the above results is that workload characteristics outweigh 
the impact of the cgroups layout (as long as cpu.shares is set low enough). To 
be more specific: if some of a benchmark's threads regularly go into a wait 
state (waiting for events, locks, etc.), as in facesim, it is more vulnerable to 
interference from revocable tasks. On the other hand, if all of a benchmark's 
threads are always runnable, as in ferret, it is less vulnerable to interference 
because the revocable tasks never get a chance to run.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-16 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630351#comment-14630351
 ] 

Jie Yu commented on MESOS-2652:
---

I did a few more experiments using the Parsec based CPU benchmark to further 
quantify the interferences from revocable tasks.

I launched 16 instances of the benchmark (using Aurora) on 16 slaves, with each 
instance taking 16 cpus (all available cpus on the slave). I configured the 
fixed resource estimator such that instance N has N revocable tasks running 
(each revocable task runs a while(1) loop burning cpus).

Initially, all revocable containers have cpu.share=1024 and use SCHED_IDLE as 
the scheduling policy. As you can see in the graph, the interference is 
proportional to the number of revocable tasks for almost all benchmarks.

Later, I changed their cpu.share to 10. As you can see, setting cpu.share to 10 
reduces the interference a lot. Also, interestingly, the interference is not 
always proportional to the number of revocable tasks on the slave after I 
changed the cpu.share from 1024 to 10. For some benchmarks, there is no 
interference (or very little) no matter how many revocable tasks are running on 
the same slave.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627174#comment-14627174
 ] 

Jie Yu commented on MESOS-2652:
---

Absolutely! I think a smart QoS controller will help as well. For instance, a 
QoS controller can monitor an application-specific SLA, or some general 
indicator like CPI, to predict potential interference and kill revocable tasks 
if needed.
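
A minimal sketch of what such a control loop could look like (all helpers here are hypothetical stand-ins, and this is not the Mesos QoS controller module API): watch a CPI-style signal on the production containers and evict a revocable task when it degrades past a threshold.
{noformat}
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for real metric collection and task eviction.
double measureCpi(const std::string&) { return 1.2; }                   // e.g., from perf counters
std::vector<std::string> productionContainers() { return {"prod1"}; }
std::vector<std::string> revocableContainers() { return {"rev1", "rev2"}; }
void killRevocableTask(const std::string& c) { std::cout << "kill " << c << "\n"; }

int main() {
  const double cpiThreshold = 2.0;  // assumption: derived from the application SLA
  while (true) {
    for (const std::string& prod : productionContainers()) {
      // If a production workload looks interfered with, evict one revocable task.
      if (measureCpi(prod) > cpiThreshold && !revocableContainers().empty()) {
        killRevocableTask(revocableContainers().front());
      }
    }
    std::this_thread::sleep_for(std::chrono::seconds(5));
  }
}
{noformat}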



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-14 Thread Christos Kozyrakis (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627096#comment-14627096
 ] 

Christos Kozyrakis commented on MESOS-2652:
---

You are correct Jie, the lower you set the shares for the BE tasks, the better 
it will be. If you give the BE group 1:1024, the BE tasks will get 0.1% of the 
CPU time over the long term (assuming other tasks can consume the remaining 
99.9%). This is a perfectly good solution to begin with; the high-priority 
throughput tasks will never even notice. Some low-latency tasks will, but it's 
not a bad starting point at all.
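
A quick sanity check on that 0.1% figure (assuming a single HP group at 1024 shares against the BE group at 1, with CFS dividing CPU time proportionally to shares):
{noformat}
BE fraction = 1 / (1024 + 1) ≈ 0.00098 ≈ 0.1%
{noformat}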

However, every now and then you will see some glitches on low-latency tasks. 
Even if the HP tasks are 100% busy, the 1:1024 setting will allow the BE task 
to run eventually, introducing a glitch of a few msec. Keep this in mind. If at 
some point this becomes an issue, there are several ways to deal with it:
- disallow oversubscription for the (hopefully small percentage of) low-latency 
apps that care about it
- fix SCHED_IDLE somehow
- use cpusets
- ...




[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627077#comment-14627077
 ] 

Jie Yu commented on MESOS-2652:
---

[~kozyraki] Thank you for pointing me to the paper. I looked at figure 5. It 
looks like you assigned equal shares to both the BE task and the HP task ("Both 
memcached and the antagonist are assigned 50% share of the CPU"). I am wondering 
if you have tested the scenario where the BE task has a very low share compared 
to the HP task (e.g., 1:100)?

It was mentioned in the paper that:
{quote}Coming back to Fig. 5, this fully explains why memcached achieves good 
quality of service when its load is lower than 12%; it is accumulating virtual 
runtime more slowly than the square-wave workload and always staying behind, so 
it never gets preempted when the square-wave workload wakes{quote}

I am wondering: if you assign a low share to the BE task, will its vruntime 
advance much faster and stay ahead of the HP task?
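
(For reference, my mental model of how CFS accrues virtual runtime per scheduling entity, with the weight derived from cpu.shares:
{noformat}
delta_vruntime = delta_exec * NICE_0_LOAD / weight
{noformat}
so a group with a very low weight should accumulate vruntime much faster and tend to stay ahead.)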



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627007#comment-14627007
 ] 

Jie Yu commented on MESOS-2652:
---

Christos, I think the current situation is: SCHED_IDLE does not work across 
cgroups. We have to find an alternative solution.

I would say lowering the cpu.share is a *best-effort* way to mitigate cpu 
interference without a QoS controller. And it looks like this is the only 
plausible solution without one.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-14 Thread Christos Kozyrakis (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626981#comment-14626981
 ] 

Christos Kozyrakis commented on MESOS-2652:
---

Ian is correct. If we just rely on shares, we will run into CFS pitfalls no 
matter what. See 
http://csl.stanford.edu/~christos/publications/2014.mutilate.eurosys.pdf figure 
5 for a memcached example.

For non-latency-critical, non-revocable tasks, this will not be an issue. But 
for latency-critical tasks, you will see a slowdown.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-10 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623136#comment-14623136
 ] 

Jie Yu commented on MESOS-2652:
---

{quote}E.g., high share ratio, revokable is idle, non-revokable consumes a ton 
of cpu time (more than, say, the 1000:1 ratio), then goes idle, revokable then 
has something to do and starts running ==> now what happens if the 
non-revokable wants to run? Won't the revokable task continue to run until the 
share ratio is equalized?{quote}

As far as I know, no such preemption mechanism exists in the kernel that we can 
use. Real-time priority allows preemption, but realtime priority is not 
compatible with cgroups 
(http://www.novell.com/support/kb/doc.php?id=7012851).

{quote}I don't know the answer without reading the scheduler source code but 
given that my assumption about SCHED_IDLE turned out to be incomplete/incorrect 
then let's understand the preemption behavior before committing another 
incorrect mechanism{quote}

Yeah, I am using the benchmark I mentioned above to see if the new hierarchy 
works as expected or not. I'll probably add another latency benchmark (e.g. 
http://parsa.epfl.ch/cloudsuite/memcached.html) to see if latency will be 
affected or not.

But given that we don't have a way to make the kernel preempt revocable tasks, 
setting shares seems to be the only solution.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-10 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623126#comment-14623126
 ] 

Ian Downes commented on MESOS-2652:
---

I'm not sure I agree with using only shares because we haven't determined the 
preemption behavior... if we only use shares then I don't know when the 
non-revokable task will preempt a running revokable task. 

E.g., high share ratio, revokable is idle, non-revokable consumes a ton of cpu 
time (more than, say, the 1000:1 ratio), then goes idle, revokable then has 
something to do and starts running ==> now what happens if the non-revokable 
wants to run? Won't the revokable task continue to run until the share ratio is 
equalized? Furthermore, with a flattened hierarchy, *all* revokable tasks with 
unequalized shares will run... I don't know the answer without reading the 
scheduler source code but given that my assumption about SCHED_IDLE turned out 
to be incomplete/incorrect then let's understand the preemption behavior before 
committing another incorrect mechanism :-)



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-10 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623105#comment-14623105
 ] 

Jie Yu commented on MESOS-2652:
---

https://reviews.apache.org/r/36410/
https://reviews.apache.org/r/36411/
https://reviews.apache.org/r/36412/
https://reviews.apache.org/r/36413/



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-10 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622947#comment-14622947
 ] 

Jie Yu commented on MESOS-2652:
---

As originally suggested in this ticket, a preferred way is to use a flattened 
cgroups layout (easy to roll forward and roll back) and set the shares of a 
revocable cgroup to be very small.
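
As an illustration of what "very small shares" means concretely in the flattened layout (a sketch only, not the actual code in the reviews above), using the values discussed in this thread: 1024 shares per regular cpu, 10 per revocable cpu, and the kernel minimum of 2:
{noformat}
#include <algorithm>
#include <cstdint>
#include <iostream>

constexpr uint64_t CPU_SHARES_PER_CPU = 1024;
constexpr uint64_t CPU_SHARES_PER_CPU_REVOCABLE = 10;
constexpr uint64_t MIN_CPU_SHARES = 2;  // smallest value the cpu cgroup accepts

uint64_t cpuShares(double cpus, bool revocable) {
  const uint64_t perCpu =
      revocable ? CPU_SHARES_PER_CPU_REVOCABLE : CPU_SHARES_PER_CPU;
  return std::max(static_cast<uint64_t>(perCpu * cpus), MIN_CPU_SHARES);
}

int main() {
  std::cout << cpuShares(16, false) << "\n";  // regular container:   16384
  std::cout << cpuShares(16, true) << "\n";   // revocable container: 160
}
{noformat}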



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-07-10 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622804#comment-14622804
 ] 

Jie Yu commented on MESOS-2652:
---

Re-opening this ticket because we observed that setting a process's scheduling 
policy to SCHED_IDLE does not work as expected when cgroups are used.

Here is my testing environment:

(1) I used a widely used open-source cpu benchmark for multi-processors, called 
Parsec (http://parsec.cs.princeton.edu/), to test cpu performance. The idea is 
to launch a job (using Aurora) with each instance continuously running the 
Parsec benchmark and reporting statistics.

(2) Each instance of the job uses 16 threads (by configuring Parsec). Each 
instance of the job is scheduled on a box with 16 cores. That means no other 
regular job can land on those boxes.

(3) I used a fixed resource estimator on each slave and launched revocable tasks 
using no_executor_framework. Each revocable task simply runs a 'while(true)' 
loop burning cpus.

There is one interesting observation: one instance of the benchmark job landed 
on a slave that happened to have 11 revocable tasks running (each using 1 
revocable cpu), while all other slaves had 8 revocable tasks running. That 
instance of the benchmark job performed consistently worse than the other 
instances. However, after I killed the 3 extra revocable tasks, the performance 
improved immediately and matched that of the other instances. See the attached 
result.

To be continued...



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-21 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555265#comment-14555265
 ] 

Joris Van Remoortere commented on MESOS-2652:
-

[~vinodkone] Sort of:
An executor using some revocable resources is definitely BE.
An executor using no revocable resources might be intended to be BE.
Therefore:
An executor using no revocable resources is not always PR.




[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-21 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554840#comment-14554840
 ] 

Vinod Kone commented on MESOS-2652:
---

[~jvanremoortere] I agree. Are you saying that the fact that the executor is 
using some revocable resources is not a good enough signal that an executor is 
intended for BE?



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-20 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552772#comment-14552772
 ] 

Joris Van Remoortere commented on MESOS-2652:
-

[~idownes] [~vinodkone] I chatted with [~nnielsen] about the weird behavior 
when an executor that is *intended to be BE* is first started with purely 
non-revocable resources (because the revocable resources it is interested in 
are not currently available).

My thought was to be more explicit, when an executor starts up, that it is 
intended to be BE. This way we are not **guessing** about how we should be 
isolating this container. I agree with [~nnielsen] that we should still fail 
hard if we try to add revocable resources to a PR executor. I think being 
explicit about the executor (as well as the resources) reduces the surface area 
for bugs and hard-to-diagnose isolation issues.

Thoughts?



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-19 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551462#comment-14551462
 ] 

Joris Van Remoortere commented on MESOS-2652:
-

Review for setting core affinity:
https://reviews.apache.org/r/34442

Will base the SCHED_OTHER over SCHED_IDLE pre-emption test on this.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-19 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551381#comment-14551381
 ] 

Ian Downes commented on MESOS-2652:
---

Borg does prod and non-prod as coarse prioritization bands but supports 
different priorities within each.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548342#comment-14548342
 ] 

Timothy Chen commented on MESOS-2652:
-

I see, and you also set SCHED_IDLE on the revocable tasks, right?
I was just wondering whether SCHED_IDLE becomes a limiting factor: any other 
SCHED_OTHER task, even one that is not more important, can easily overwhelm the 
tasks running on oversubscribed resources, since there isn't a way to express 
task priorities when we launch anything.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548394#comment-14548394
 ] 

Timothy Chen commented on MESOS-2652:
-

Just chatted with Ian offline. In the future we should consider letting 
frameworks express some priority, so that even tasks using non-revocable 
resources can be put at a low priority as well. That strikes a nice balance, 
since I think drawing the line only at [non]revocable might be too limiting.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548302#comment-14548302
 ] 

Ian Downes commented on MESOS-2652:
---

CFS bandwidth quota provides an upper bound on CPU time for a task. If the 
non-revocable workload is variable then we can increase utilization by removing 
that bound for revocable CPU, given that we immediately preempt for 
non-revocable. Then, we just use cpu shares to balance between the revocable 
tasks.
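
To make the idea concrete, this is roughly what that configuration amounts to at the cgroup (v1) filesystem level; a sketch only, with an illustrative cgroup path:
{noformat}
#include <fstream>
#include <string>

void writeControl(const std::string& path, const std::string& value) {
  std::ofstream file(path);
  file << value;  // cgroup control files take a single value
}

int main() {
  const std::string cgroup = "/sys/fs/cgroup/cpu/mesos/revocable_container1";
  writeControl(cgroup + "/cpu.cfs_quota_us", "-1");  // -1 removes the CFS bandwidth bound
  writeControl(cgroup + "/cpu.shares", "10");        // low relative weight (illustrative value)
  return 0;
}
{noformat}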



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-15 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546563#comment-14546563
 ] 

Timothy Chen commented on MESOS-2652:
-

Can you clarify what you mean in your last sentence? What does "batch style 
jobs" mean here? And are you suggesting that we use cpu shares instead when 
there are batch-style jobs running?



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-15 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546470#comment-14546470
 ] 

Ian Downes commented on MESOS-2652:
---

Reviews for using SCHED_IDLE:

https://reviews.apache.org/r/34309/
https://reviews.apache.org/r/34310/



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-14 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544685#comment-14544685
 ] 

Ian Downes commented on MESOS-2652:
---

Actually, a much better way to do this would be to use separate scheduling 
policies with CFS. Specifically, run normal tasks with SCHED_OTHER (the default) 
and run tasks with revokable CPU under SCHED_IDLE.

This creates separate run queues, and tasks (in the Linux sense) under 
SCHED_IDLE will only run if there's nothing on the SCHED_OTHER queue, i.e., at 
the resolution of the scheduler we will always run tasks from non-revokable 
containers over tasks in revokable containers.

Further, if non-revokable containers are running batch-style jobs we could *not* 
use CFS bandwidth quotas for revokable containers and use only cpu shares to set 
relative weights. These containers would then balance idle cycles appropriately, 
consuming whatever is left after the needs of the non-revokable containers.
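
At the syscall level, putting a revocable task's processes under SCHED_IDLE is roughly the following (a minimal sketch, not the code in the reviews elsewhere in this thread):
{noformat}
#include <sched.h>
#include <cstdio>

int main() {
  struct sched_param param = {};
  param.sched_priority = 0;  // must be 0 for SCHED_IDLE

  // pid 0 == the calling process; children inherit the policy.
  if (sched_setscheduler(0, SCHED_IDLE, &param) != 0) {
    std::perror("sched_setscheduler");
    return 1;
  }

  while (true) {}  // this busy loop now only consumes otherwise-idle cycles
}
{noformat}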



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-06 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530993#comment-14530993
 ] 

Ian Downes commented on MESOS-2652:
---

IIUC, that would require constantly updating cpu.shares values across all 
containers any time a container changed / was added / removed, to ensure the 
relative weights were preserved, unless you use very extreme values. I recall 
Ben relating that [~kozyraki] had observed strange behavior in CFS with very 
small values?

The two-way split means the aggregate of the non-revocable containers dominates 
the revocable; then, within each subtree, there's a weighting (proportional to 
the cpus) between containers of the same type.

We should investigate both options and determine how each behaves.



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-06 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530971#comment-14530971
 ] 

Jie Yu commented on MESOS-2652:
---

Instead of using a hierarchical structure, I am wondering if it's possible to 
still keep the existing flattened structure? We can set the cpu.shares of those 
revokable containers to be very small (or the shares of the normal containers to 
be very large) so that their impact is still negligible even if there are many 
revokable containers.
