[ 
https://issues.apache.org/jira/browse/MESOS-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174787#comment-16174787
 ] 

Qian Zhang commented on MESOS-6162:
-----------------------------------

I ran more tests for this performance issue with Mesos itself (rather than only 
running {{dd}} manually, as in my previous post). I used {{mesos-execute}} to 
launch a task that runs {{dd}} like this:
{code}mesos-execute --master=192.168.1.6:5050 --name=test --command="dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync"{code}
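For anyone who wants to reproduce the timing without {{dd}}, here is a minimal Python sketch of the same write pattern. It assumes Linux, where {{os.O_DSYNC}} is available; the function name {{timed_dsync_writes}} is made up for illustration and is not part of Mesos.

```python
import os
import time


def timed_dsync_writes(path, block_size=512, count=1000):
    """Write `count` zero-filled blocks with O_DSYNC; return elapsed seconds."""
    block = bytes(block_size)
    # O_DSYNC makes each write() return only after the data is on stable
    # storage, which is what dd's oflag=dsync asks for.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
    fd = os.open(path, flags, 0o644)
    try:
        start = time.monotonic()
        for _ in range(count):
            os.write(fd, block)
        return time.monotonic() - start
    finally:
        os.close(fd)
```

Calling {{timed_dsync_writes("test.bin")}} from the directory under test should give a number roughly comparable to the {{dd}} elapsed time, although interpreter overhead means the two will not match exactly.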
I found that this performance issue *always* happens when the combination 
{{ext4/ext3 with the data=ordered option}} + {{cfq IO scheduler}} is met, 
*whether or not `cgroups/blkio` isolation is enabled*. That is, when the 
combination is met, the task takes much longer to complete (~16s) than when it 
is not met (~1.2s), regardless of whether `cgroups/blkio` is enabled.
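
To check whether a host meets the combination, something like the following sketch may help. It only parses text: the file system type and mount options as they appear in a {{/proc/mounts}} entry, and the contents of {{/sys/block/<dev>/queue/scheduler}}, where the active scheduler is shown in square brackets. The function names are hypothetical. One assumption to note: a missing {{data=}} option is treated as "not met", which may under-report on kernels that omit default mount options.

```python
def data_mode(mount_options):
    """Return the value of the data= mount option, or None if not listed."""
    for opt in mount_options.split(","):
        if opt.startswith("data="):
            return opt[len("data="):]
    return None


def active_scheduler(scheduler_text):
    """Return the scheduler shown in [brackets] in a queue/scheduler file."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def combination_met(fs_type, mount_options, scheduler_text):
    """True for ext3/ext4 mounted with data=ordered on a cfq-scheduled device."""
    return (
        fs_type in ("ext3", "ext4")
        and data_mode(mount_options) == "ordered"
        and active_scheduler(scheduler_text) == "cfq"
    )
```

For example, {{combination_met("ext4", "rw,relatime,data=ordered", "noop deadline [cfq]")}} evaluates to true, while the same mount on a device whose scheduler file reads {{noop [deadline] cfq}} does not.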

So it seems this performance issue has nothing to do with `cgroups/blkio`, since 
it happens even when `cgroups/blkio` is not enabled at all. However, one odd 
thing I found is that if the process is assigned to the *root* blkio cgroup, the 
performance issue does *not* happen even when that combination is met:
{code}
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs 
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync         
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.19546 s, 428 kB/s    <--- No performance issue.
{code}

So the conclusion is that, when the combination is met:
# If the process is not assigned to any blkio cgroup (i.e., `cgroups/blkio` 
isolation is not enabled), the performance issue happens.
# If the process is assigned to a sub blkio cgroup (i.e., `cgroups/blkio` 
isolation is enabled), the performance issue happens.
# If the process is assigned to the root blkio cgroup, the performance issue 
does not happen.
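
To see which blkio cgroup a given process is actually in, a sketch like the one below can parse the contents of {{/proc/<pid>/cgroup}}. It assumes the cgroup v1 line format {{hierarchy-ID:controller-list:path}}; the function names and the returned labels are hypothetical and only meant to help tell the cases above apart.

```python
def blkio_cgroup_path(proc_cgroup_text):
    """Return the blkio cgroup path from /proc/<pid>/cgroup text, or None."""
    for line in proc_cgroup_text.splitlines():
        # cgroup v1 format: hierarchy-ID:controller-list:path
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        if "blkio" in parts[1].split(","):
            return parts[2]
    return None


def classify_blkio_cgroup(proc_cgroup_text):
    """Label the blkio membership as 'root', 'sub', or 'none' (no blkio line)."""
    path = blkio_cgroup_path(proc_cgroup_text)
    if path is None:
        return "none"
    return "root" if path == "/" else "sub"
```

Reading {{/proc/self/cgroup}} from inside the task and passing its contents to {{classify_blkio_cgroup}} gives a quick way to record the cgroup placement alongside each timing result.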

I think cases 1 and 2 can happen in the Mesos context but not case 3, since a 
container launched by Mesos will never be assigned to the root blkio cgroup. 
Originally I thought we should add a note about the performance issue to the 
`cgroups/blkio` doc, but now I think that may not be the right place to mention 
it; instead we should add the note to {{mesos-containerizer.md}} and 
{{persistent-volume.md}}.


> Add support for cgroups blkio subsystem blkio statistics.
> ---------------------------------------------------------
>
>                 Key: MESOS-6162
>                 URL: https://issues.apache.org/jira/browse/MESOS-6162
>             Project: Mesos
>          Issue Type: Task
>          Components: cgroups, containerization
>            Reporter: haosdent
>            Assignee: Jason Lai
>              Labels: cgroups, containerizer, mesosphere
>             Fix For: 1.4.0
>
>
> Note that the cgroups blkio subsystem may have a performance issue; refer to 
> https://github.com/opencontainers/runc/issues/861



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
