[ https://issues.apache.org/jira/browse/MESOS-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174787#comment-16174787 ]
Qian Zhang commented on MESOS-6162:
-----------------------------------

I ran more tests for this performance issue with Mesos, rather than only testing it manually with {{dd}} as in my previous post. I used {{mesos-execute}} to launch a task that runs {{dd}}, like this:

{code}
mesos-execute --master=192.168.1.6:5050 --name=test --command="dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync"
{code}

I found that this performance issue *always* happens as long as the combination {{ext4/ext3 with the data=ordered option}} + {{cfq IO scheduler}} is met, *whether or not `cgroups/blkio` isolation is enabled*. That is, when the combination is met, the task takes much longer to complete (~16s) than when it is not met (~1.2s), regardless of whether `cgroups/blkio` is enabled. So it seems this performance issue has nothing to do with `cgroups/blkio`, since it happens even when `cgroups/blkio` is not enabled at all.

However, one odd thing I found is that if the process is assigned to the *root* blkio cgroup, the performance issue does *not* happen even when that combination is met:

{code}
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.19546 s, 428 kB/s    <--- No performance issue.
{code}

So the conclusion is that, when the combination is met:
# If the process is not assigned to any blkio cgroup (i.e., `cgroups/blkio` isolation is not enabled), the performance issue will happen.
# If the process is assigned to a sub blkio cgroup (i.e., `cgroups/blkio` isolation is enabled), the performance issue will happen.
# If the process is assigned to the root blkio cgroup, the performance issue will not happen.

I think 1 and 2 will happen in the Mesos context but 3 will not, since a container launched by Mesos will never be assigned to the root blkio cgroup.
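
To check which of the cases above applies on a given agent host, something like the following could be used. This is only a rough sketch and was not part of the tests described here: the helper names ({{active_scheduler}}, {{is_ordered_ext}}, {{blkio_cgroup_path}}, {{report}}) are hypothetical, a cgroup v1 layout is assumed, and {{data=ordered}} may not be listed in {{/proc/mounts}} on kernels that omit default mount options.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; cgroup v1 layout assumed.

# Print the active I/O scheduler given the contents of
# /sys/block/<dev>/queue/scheduler, e.g. "noop deadline [cfq]" -> "cfq".
active_scheduler() {
  printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Succeed if the filesystem type is ext3/ext4 and the comma-separated
# mount options (as shown in /proc/mounts) include data=ordered.
is_ordered_ext() {
  case "$1" in
    ext3|ext4) ;;
    *) return 1 ;;
  esac
  case ",$2," in
    *,data=ordered,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the blkio cgroup path given the contents of /proc/<pid>/cgroup.
# A result of "/" means the process is in the root blkio cgroup.
blkio_cgroup_path() {
  printf '%s\n' "$1" | awk -F: '$2 ~ /(^|,)blkio(,|$)/ { print $3 }'
}

# Print a short report for a block device name (e.g. sda) and a PID.
# Defined but not called here; run it manually on the host under test.
report() {
  dev=$1
  pid=$2
  echo "scheduler: $(active_scheduler "$(cat "/sys/block/$dev/queue/scheduler")")"
  echo "blkio cgroup: $(blkio_cgroup_path "$(cat "/proc/$pid/cgroup")")"
  while read -r _src mnt fstype opts _rest; do
    if is_ordered_ext "$fstype" "$opts"; then
      echo "ext3/ext4 data=ordered mount: $mnt"
    fi
  done < /proc/mounts
}
```

For example, {{report sda $$}} prints the active scheduler for {{sda}}, the blkio cgroup of the current shell, and any ext3/ext4 mounts that list {{data=ordered}}. A scheduler of {{cfq}} together with such a mount and a blkio cgroup path other than {{/}} would correspond to the slow cases described above.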

Originally I thought we should add a note about this performance issue to the doc for `cgroups/blkio`, but now I think that may not be the right place to mention it. Instead, we should add the note to {{mesos-containerizer.md}} and {{persistent-volume.md}}.

> Add support for cgroups blkio subsystem blkio statistics.
> ---------------------------------------------------------
>
>                 Key: MESOS-6162
>                 URL: https://issues.apache.org/jira/browse/MESOS-6162
>             Project: Mesos
>          Issue Type: Task
>          Components: cgroups, containerization
>            Reporter: haosdent
>            Assignee: Jason Lai
>              Labels: cgroups, containerizer, mesosphere
>             Fix For: 1.4.0
>
>
> Noted that cgroups blkio subsystem may have performance issue, refer to
> https://github.com/opencontainers/runc/issues/861

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)