[ https://issues.apache.org/jira/browse/MESOS-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143301#comment-16143301 ]
Qian Zhang commented on MESOS-6162:
-----------------------------------

For the [performance issue|https://github.com/opencontainers/runc/issues/861] mentioned in the description of this ticket, after some experiments I found that it occurs only when the disk's IO scheduler is set to {{cfq}} and the filesystem is {{ext4}}/{{ext3}} mounted with the {{data=ordered}} option.
{code}
# pwd
/mnt
# mount | grep mnt
/dev/sdb on /mnt type ext4 (rw,relatime,data=ordered)
# cat /sys/block/sdb/queue/scheduler
noop deadline [cfq]
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.51425 s, 338 kB/s
# echo $$ > /sys/fs/cgroup/blkio/test/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 16.0301 s, 31.9 kB/s  <--- Performance degradation when we put the process into the "test" blkio cgroup
{code}
If we change the IO scheduler to {{deadline}}, we will not have this performance issue (see [this doc|https://www.kernel.org/doc/Documentation/block/switching-sched.txt] for how to switch the IO scheduler, and the [CFQ scheduler|https://www.kernel.org/doc/Documentation/block/cfq-iosched.txt] and [deadline scheduler|https://www.kernel.org/doc/Documentation/block/deadline-iosched.txt] docs for more info about the two schedulers).
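Since the degradation only shows up with this specific scheduler/filesystem combination, it can be detected up front. Below is a small shell sketch (the helper names {{active_scheduler}} and {{is_vulnerable}} are mine, not from any kernel tooling) that parses the bracketed entry in {{/sys/block/<dev>/queue/scheduler}} and checks the mount options:
{code}
#!/bin/sh
# Extract the active scheduler (the bracketed entry) from a
# /sys/block/<dev>/queue/scheduler line such as "noop deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Check one block device (e.g. "sdb"); returns 0 if it matches the
# problematic cfq + ext3/ext4 + data=ordered combination described above.
is_vulnerable() {
    dev="$1"
    sched=$(active_scheduler "$(cat /sys/block/$dev/queue/scheduler)")
    [ "$sched" = "cfq" ] || return 1
    # Is this device mounted as ext3/ext4 with data=ordered?
    mount | grep "^/dev/$dev " | grep -E 'type ext[34]' | grep -q 'data=ordered'
}

active_scheduler "noop deadline [cfq]"    # prints: cfq
{code}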
{code}
# echo deadline > /sys/block/sdb/queue/scheduler
# cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq
# echo $$ > /sys/fs/cgroup/blkio/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.21094 s, 423 kB/s
# echo $$ > /sys/fs/cgroup/blkio/test/cgroup.procs
# dd if=/dev/zero of=test.bin bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 1.19367 s, 429 kB/s  <--- No performance degradation
{code}
I also tested with the disk formatted as other filesystems (e.g., {{xfs}}, {{btrfs}}) and with the disk mounted without the {{data=ordered}} option ({{data=ordered}} is the default for {{ext4}} and {{ext3}}; it can be changed by specifying a different mount option, e.g., {{data=journal}}); in both cases the performance issue does not occur. See [this doc|https://www.ibm.com/developerworks/library/l-fs8/index.html] for the difference between {{data=ordered}} and {{data=journal}}:
{quote}
Theoretically, data=journal mode is the slowest journaling mode of all, since data gets written to disk twice rather than once. However, it turns out that in certain situations, data=journal mode can be blazingly fast.
{quote}
It seems only SUSE hits this performance issue, since by default it sets the disk's IO scheduler to {{cfq}} and formats the filesystem as {{ext4}} with the {{data=ordered}} option. I tested other distros (CoreOS, CentOS 7.2 and Ubuntu 16.04) and they do not have the issue, since some set the disk's IO scheduler to {{deadline}} by default (Ubuntu 16.04) and some format the disk as {{xfs}} by default (CentOS 7.2). So I think this is not a general performance issue, since most distros are not affected, and it can be worked around on the fly by switching the IO scheduler to {{deadline}}.
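Note that switching the scheduler through sysfs only lasts until reboot. If a host should stay on {{deadline}}, the setting can be pinned with a udev rule or via the kernel command line. Both are standard mechanisms, but the file paths below are typical rather than universal, so treat this as a sketch to adapt per distro:
{code}
# /etc/udev/rules.d/60-io-scheduler.rules  (path is typical, adjust per distro)
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="deadline"

# Or, for all disks at boot, via the kernel command line in GRUB
# (edit /etc/default/grub, then regenerate the config with
# update-grub or grub2-mkconfig):
#   GRUB_CMDLINE_LINUX_DEFAULT="... elevator=deadline"
{code}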
But in the future, when we support the blkio control functionality ([MESOS-7843|https://issues.apache.org/jira/browse/MESOS-7843]), setting the IO scheduler to {{deadline}} will be a problem: the blkio proportional weight policy requires the {{cfq}} scheduler, so if the scheduler is set to {{deadline}}, none of the {{blkio.weight}}, {{blkio.weight_device}} and {{blkio.leaf_weight\[_device\]}} proportional weight policy files will take effect.

> Add support for cgroups blkio subsystem blkio statistics.
> ---------------------------------------------------------
>
>                 Key: MESOS-6162
>                 URL: https://issues.apache.org/jira/browse/MESOS-6162
>             Project: Mesos
>          Issue Type: Task
>          Components: cgroups, containerization
>            Reporter: haosdent
>            Assignee: Jason Lai
>              Labels: cgroups, containerizer, mesosphere
>             Fix For: 1.4.0
>
>
> Noted that the cgroups blkio subsystem may have a performance issue; refer to
> https://github.com/opencontainers/runc/issues/861

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)