From: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com>

Use cgroup_get_e_ve_css to get correct blkcg_css for writeback instances.
https://jira.sw.ru/browse/PSBM-131253

Signed-off-by: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktk...@virtuozzo.com>

v2: khorenko@: introduce a wrapper for getting blkcg_css from memcg_css.

==========================
mm/writeback: Adopt cgroup-v2 writeback (limit per-memcg dirty memory)

In cgroup-v1 all writeback IO is accounted to the root blkcg by design. With
cgroup-v2 it became possible to link memcg and blkcg, so the writeback code
was enhanced to
1) consider balancing dirty pages per memory cgroup
2) account writeback-generated IO to blkcg

In vz7 writeback was balanced by means of the beancounter cgroup. However, we
dropped it.

In vz8 @aryabinin tried to enable cgroup-v2 writeback with 5cc286c98ee20
("mm, cgroup, writeback: Enable per-cgroup writeback for v1 cgroup."), but
cgroup_get_e_css(), which is used to find the blkcg based on the memcg, does
not work well with cgroup-v1 and always returns the root blkcg.

However, we can implement a new function to associate a blkcg with a memcg
via the ve css_set.

Test results with 256M container without patch:
===============================================
# echo "253:22358 100000000" > /sys/fs/cgroup/blkio/machine.slice/1/blkio.throttle.write_bps_device
# vzctl exec 1 dd if=/dev/zero of=/test bs=1M count=1000
# 1048576000 bytes (1.0 GB, 1000 MiB) copied, 1.35522 s, 774 MB/s

Since dirty balancing is global, a container can dirty more than its RAM and
blkio limits are not respected.

With patch:
===========
# echo "253:22765 100000000" > /sys/fs/cgroup/blkio/machine.slice/1/blkio.throttle.write_bps_device
# vzctl exec 1 dd if=/dev/zero of=/test bs=1M count=1000
# 1048576000 bytes (1.0 GB, 1000 MiB) copied, 10.2267 s, 103 MB/s

Per-ve dirty balancing and throttling work as expected.

v2:
Since ve->ve_ns points to the task nsproxy, it can change during the ve
lifetime. We already have a helper, ve_get_init_css(), that handles this
case, so I decided to reuse its code in the new cgroup_get_e_ve_css().
Additionally, I have added two patches that improve the current code:
1) drop 'get' from the css_get_local_root() name, since 'get' in css function
   names usually implies taking a reference
2) drop duplicate code and reuse the css_local_root() helper in
   ve_get_init_css()

Andrey Zhadchenko (4):
  kernel/cgroup: rename css_get_local_root
  kernel/ve: simplify ve_get_init_css
  kernel/cgroup: implement cgroup_get_e_ve_css
  mm/backing-dev: associate writeback with correct blkcg
---
 mm/backing-dev.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index f5561ea7d90a..9c1a128199e6 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -434,6 +434,22 @@ static void cgwb_remove_from_bdi_list(struct bdi_writeback *wb)
 	spin_unlock_irq(&cgwb_lock);
 }
 
+static inline struct cgroup_subsys_state *
+cgroup_get_e_css_virtialized(struct cgroup *cgroup,
+			     struct cgroup_subsys *ss)
+{
+	struct cgroup_subsys_state *css;
+
+#ifdef CONFIG_VE
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		css = cgroup_get_e_ve_css(cgroup, ss);
+	else
+#endif
+		css = cgroup_get_e_css(cgroup, ss);
+
+	return css;
+}
+
 static int cgwb_create(struct backing_dev_info *bdi,
 		       struct cgroup_subsys_state *memcg_css, gfp_t gfp)
 {
@@ -446,7 +462,8 @@ static int cgwb_create(struct backing_dev_info *bdi,
 	int ret = 0;
 
 	memcg = mem_cgroup_from_css(memcg_css);
-	blkcg_css = cgroup_get_e_css(memcg_css->cgroup, &io_cgrp_subsys);
+	blkcg_css = cgroup_get_e_css_virtialized(memcg_css->cgroup,
+						 &io_cgrp_subsys);
 	blkcg = css_to_blkcg(blkcg_css);
 	memcg_cgwb_list = &memcg->cgwb_list;
 	blkcg_cgwb_list = &blkcg->cgwb_list;
@@ -566,7 +583,8 @@ struct bdi_writeback *wb_get_lookup(struct backing_dev_info *bdi,
 		struct cgroup_subsys_state *blkcg_css;
 
 		/* see whether the blkcg association has changed */
-		blkcg_css = cgroup_get_e_css(memcg_css->cgroup, &io_cgrp_subsys);
+		blkcg_css = cgroup_get_e_css_virtialized(memcg_css->cgroup,
+							 &io_cgrp_subsys);
 		if (unlikely(wb->blkcg_css != blkcg_css || !wb_tryget(wb)))
 			wb = NULL;
 		css_put(blkcg_css);
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel