[Devel] [PATCH RHEL7 COMMIT] mm/filemap: fix potential memcg->cache charge leak
The commit is pushed to "branch-rh7-3.10.0-1127.18.2.vz7.163.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.18.2.vz7.163.35
-->
commit 79a5642e9d9a6bdbb56d9e0ee990fd96b7c8625c
Author: Andrey Ryabinin
Date: Thu Oct 8 21:35:13 2020 +0300

    mm/filemap: fix potential memcg->cache charge leak

    __add_to_page_cache_locked() after mem_cgroup_try_charge_cache()
    uses mem_cgroup_cancel_charge() in one of the error paths.
    This may lead to leaking a few memcg->cache charges.

    Use mem_cgroup_cancel_cache_charge() to fix this.

    https://jira.sw.ru/browse/PSBM-121046

    Signed-off-by: Andrey Ryabinin
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 53db13f..2bd5ca4 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -732,7 +732,7 @@ static int __add_to_page_cache_locked(struct page *page,
 	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (error) {
 		if (!huge)
-			mem_cgroup_cancel_charge(page, memcg);
+			mem_cgroup_cancel_cache_charge(page, memcg);
 		return error;
 	}

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
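To illustrate why the cancel call has to match the charge call, here is a minimal userspace sketch. It is not the vzkernel implementation: `toy_memcg`, `toy_try_charge_cache()` and the other `toy_*` names are hypothetical, and the assumption that a cache charge bumps a separate `cache` counter on top of the common one is inferred from the commit message only.

```c
/* Hypothetical model: a cache charge is accounted in two counters. */
struct toy_memcg {
	long memory;	/* all charged pages */
	long cache;	/* page-cache pages only */
};

/* Stand-in for a "try charge cache" step: touches both counters. */
static void toy_try_charge_cache(struct toy_memcg *cg)
{
	cg->memory++;
	cg->cache++;
}

/* Generic cancel: knows nothing about the cache counter. */
static void toy_cancel_charge(struct toy_memcg *cg)
{
	cg->memory--;
}

/* Cache-aware cancel: undoes both halves of toy_try_charge_cache(). */
static void toy_cancel_cache_charge(struct toy_memcg *cg)
{
	cg->memory--;
	cg->cache--;
}

/*
 * Simulate one insertion that fails after the charge was taken.
 * Returns the number of cache charges left behind (0 means no leak).
 */
static long toy_failed_insert(int use_cache_cancel)
{
	struct toy_memcg cg = { 0, 0 };

	toy_try_charge_cache(&cg);
	/* pretend the preload step failed here */
	if (use_cache_cancel)
		toy_cancel_cache_charge(&cg);
	else
		toy_cancel_charge(&cg);
	return cg.cache;
}
```

In this model `toy_failed_insert(0)` leaves one cache charge behind per failed insertion, while `toy_failed_insert(1)` leaves none, which is the kind of imbalance the one-line fix is described as removing.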
[Devel] [PATCH RHEL7 COMMIT] ploop: kaio: Clear swapfile flag
The commit is pushed to "branch-rh7-3.10.0-1127.18.2.vz7.163.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.18.2.vz7.163.35
-->
commit 709e776b4e0d20b5d4f33b56a9a91d46b4d400ea
Author: Kirill Tkhai
Date: Thu Oct 8 21:35:19 2020 +0300

    ploop: kaio: Clear swapfile flag

    This allows to call defrag on image file.

    https://jira.sw.ru/browse/PSBM-107743

    Signed-off-by: Kirill Tkhai
---
 drivers/block/ploop/io_kaio.c     | 2 --
 drivers/block/ploop/io_kaio_map.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/block/ploop/io_kaio.c b/drivers/block/ploop/io_kaio.c
index 2c2fb90..4c4a0c6 100644
--- a/drivers/block/ploop/io_kaio.c
+++ b/drivers/block/ploop/io_kaio.c
@@ -1119,9 +1119,7 @@ static int __kaio_truncate(struct ploop_io * io, struct file * file, u64 pos)
 	newattrs.ia_valid = ATTR_SIZE;

 	mutex_lock(&io->files.inode->i_mutex);
-	io->files.inode->i_flags &= ~S_SWAPFILE;
 	err = notify_change(F_DENTRY(file), &newattrs, NULL);
-	io->files.inode->i_flags |= S_SWAPFILE;
 	mutex_unlock(&io->files.inode->i_mutex);

 	if (err) {
diff --git a/drivers/block/ploop/io_kaio_map.c b/drivers/block/ploop/io_kaio_map.c
index 09add48..d4ff39d9 100644
--- a/drivers/block/ploop/io_kaio_map.c
+++ b/drivers/block/ploop/io_kaio_map.c
@@ -58,7 +58,6 @@ int ploop_kaio_open(struct file * file, int rdonly)
 	pm->readers = rdonly ? 1 : -1;
 	list_add(&pm->list, &ploop_mappings);
 	pm = NULL;
-	mapping->host->i_flags |= S_SWAPFILE;

 kaio_open_done:
 	spin_unlock(&ploop_mappings_lock);
@@ -82,7 +81,6 @@ int ploop_kaio_close(struct address_space * mapping, int rdonly)
 	}

 	if (m->readers == 0) {
-		mapping->host->i_flags &= ~S_SWAPFILE;
 		list_del(&m->list);
 		pm = m;
 	}
[Devel] [PATCH RHEL8 COMMIT] kernel/sched/fair: Fix 'releasing a pinned lock'
The commit is pushed to "branch-rh8-4.18.0-193.6.3.vz8.4.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh8-4.18.0-193.6.3.vz8.4.12
-->
commit a9f90d574027e70025b6401cc3dcb4135c645893
Author: Andrey Ryabinin
Date: Thu Oct 8 19:00:23 2020 +0300

    kernel/sched/fair: Fix 'releasing a pinned lock'

    Lockdep complains that after rq_repin_lock() the lock wasn't
    unpinned before rq->lock release. Add rq_unpin_lock(); call
    to fix this. Also for consistency use 'busiest' instead of
    'env.src_rq' which is the same.

    https://jira.sw.ru/browse/PSBM-120800

    Signed-off-by: Andrey Ryabinin
    Reviewed-by: Kirill Tkhai
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fc87dee4fd0e..23a2f2452474 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9178,9 +9178,10 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		env.loop = 0;
 		local_irq_save(rf.flags);
 		double_rq_lock(env.dst_rq, busiest);
-		rq_repin_lock(env.src_rq, &rf);
+		rq_repin_lock(busiest, &rf);
 		update_rq_clock(env.dst_rq);
 		cur_ld_moved = ld_moved = move_task_groups(&env);
+		rq_unpin_lock(busiest, &rf);
 		double_rq_unlock(env.dst_rq, busiest);
 		local_irq_restore(rf.flags);
 	}
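The pin/unpin pairing that lockdep enforces here can be pictured with a small userspace model. This is only a sketch under stated assumptions: `toy_rq` and every `toy_*` function are hypothetical, and real lockdep pinning uses cookies and far more state than a single counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a runqueue lock with pin tracking. */
struct toy_rq {
	bool locked;
	int pins;	/* outstanding pins; expected to be 0 at unlock time */
};

static void toy_rq_lock(struct toy_rq *rq)  { rq->locked = true; }
static void toy_rq_pin(struct toy_rq *rq)   { rq->pins++; }
static void toy_rq_unpin(struct toy_rq *rq) { rq->pins--; }

/* Returns false in the situation reported as "releasing a pinned lock". */
static bool toy_rq_unlock(struct toy_rq *rq)
{
	bool clean = (rq->pins == 0);

	rq->locked = false;
	return clean;
}

/*
 * Follows the order of calls in the load_balance() hunk:
 * lock, pin, do the work, optionally unpin, unlock.
 * Returns true if the unlock happened with no pin outstanding.
 */
static bool toy_balance_step(bool unpin_before_unlock)
{
	struct toy_rq busiest = { false, 0 };

	toy_rq_lock(&busiest);
	toy_rq_pin(&busiest);
	/* ... tasks would be moved here ... */
	if (unpin_before_unlock)
		toy_rq_unpin(&busiest);
	return toy_rq_unlock(&busiest);
}
```

In the model, `toy_balance_step(false)` corresponds to the code before the patch and reports an unclean unlock, while `toy_balance_step(true)` corresponds to the patched order.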
[Devel] [PATCH rh8 5/5] ve/time: rework times() syscall and /proc/[pid]/stat to handle u64 time offsets
ve_struct.{start_time,real_start_time} are u64 now, change the code
correspondingly.
Drop duplicated fields start_timespec/real_start_timespec in ve_struct.

Fixes: f2716576136d ("ve/time: Use ve_relative_clock in times() syscall and /proc/[pid]/stat")

Signed-off-by: Konstantin Khorenko
---
 fs/proc/array.c    |  7 ++-
 include/linux/ve.h |  4
 kernel/sys.c       | 19 ---
 3 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index e85d0caa6efa..e9b6e403858a 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -546,11 +546,8 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,

 #ifdef CONFIG_VE
 	if (!is_super) {
-		struct timespec *ve_start_ts =
-				&get_exec_env()->real_start_timespec;
-		start_time -=
-			(unsigned long long)ve_start_ts->tv_sec * NSEC_PER_SEC
-			+ ve_start_ts->tv_nsec;
+		u64 offset = get_exec_env()->real_start_time;
+		start_time -= (unsigned long long)offset;
 	}
 	/* tasks inside a CT can have negative start time e.g. if the CT was
 	 * migrated from another hw node, in which case we will report 0 in
diff --git a/include/linux/ve.h b/include/linux/ve.h
index b659e779cb49..0db98e8e08c1 100644
--- a/include/linux/ve.h
+++ b/include/linux/ve.h
@@ -52,10 +52,6 @@ struct ve_struct {
 	struct net_device	*venet_dev;
 #endif

-/* per VE CPU stats*/
-	struct timespec		start_timespec;		/* monotonic time */
-	struct timespec		real_start_timespec;	/* boot based time */
-
 	/* see vzcalluser.h for VE_FEATURE_XXX definitions */
 	__u64			features;

diff --git a/kernel/sys.c b/kernel/sys.c
index 2644090f8d4b..df02329b0e5c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -957,16 +957,13 @@ static void do_sys_times(struct tms *tms)
 }

 #ifdef CONFIG_VE
-unsigned long long ve_relative_clock(struct timespec * ts)
+static u64 ve_relative_clock(u64 time)
 {
-	unsigned long long offset = 0;
+	u64 offset = 0;
+	struct ve_struct *ve = get_exec_env();

-	if (ts->tv_sec > get_exec_env()->start_timespec.tv_sec ||
-	    (ts->tv_sec == get_exec_env()->start_timespec.tv_sec &&
-	     ts->tv_nsec >= get_exec_env()->start_timespec.tv_nsec))
-		offset = (unsigned long long)(ts->tv_sec -
-			get_exec_env()->start_timespec.tv_sec) * NSEC_PER_SEC
-			+ ts->tv_nsec - get_exec_env()->start_timespec.tv_nsec;
+	if (time > ve->start_time)
+		offset = time - ve->start_time;
 	return nsec_to_clock_t(offset);
 }
 #endif
@@ -974,7 +971,7 @@ unsigned long long ve_relative_clock(struct timespec * ts)
 SYSCALL_DEFINE1(times, struct tms __user *, tbuf)
 {
 #ifdef CONFIG_VE
-	struct timespec now;
+	u64 now;
 #endif

 	if (tbuf) {
@@ -989,9 +986,9 @@ SYSCALL_DEFINE1(times, struct tms __user *, tbuf)
 	return (long) jiffies_64_to_clock_t(get_jiffies_64());
 #else
 	/* Compare to calculation in fs/proc/array.c */
-	ktime_get_ts(&now);
+	now = ktime_get_ns();
 	force_successful_syscall_return();
-	return ve_relative_clock(&now);
+	return (long) ve_relative_clock(now);
 #endif
 }

--
2.28.0
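For readers comparing the two shapes of ve_relative_clock(), the following userspace sketch puts the old timespec-based comparison next to the new u64 one. It is an illustration only: `toy_timespec`, `toy_ts()`, `toy_relative_ts()` and `toy_relative_ns()` are hypothetical names, and the final nsec_to_clock_t() conversion done in the kernel is left out, so both helpers return nanoseconds.

```c
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the kernel's struct timespec. */
struct toy_timespec {
	int64_t tv_sec;
	long	tv_nsec;
};

static struct toy_timespec toy_ts(int64_t sec, long nsec)
{
	struct toy_timespec ts;

	ts.tv_sec = sec;
	ts.tv_nsec = nsec;
	return ts;
}

/* Old shape: compare seconds, then nanoseconds, then rebuild the offset. */
static uint64_t toy_relative_ts(struct toy_timespec now,
				struct toy_timespec start)
{
	uint64_t offset = 0;

	if (now.tv_sec > start.tv_sec ||
	    (now.tv_sec == start.tv_sec && now.tv_nsec >= start.tv_nsec))
		offset = (uint64_t)(now.tv_sec - start.tv_sec) * TOY_NSEC_PER_SEC
			+ now.tv_nsec - start.tv_nsec;
	return offset;
}

/* New shape: both values are plain nanosecond counters, clamp at zero. */
static uint64_t toy_relative_ns(uint64_t now_ns, uint64_t start_ns)
{
	uint64_t offset = 0;

	if (now_ns > start_ns)
		offset = now_ns - start_ns;
	return offset;
}
```

Both helpers return 0 when the current time is not later than the container start, so a clock that appears to run "before" the start never yields a huge unsigned value.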
[Devel] [PATCH rh8 4/5] ve/time: Add comment in ve_start_container() on start time initialization
Fixes: e931118f8139 ("ve: Add ve cgroup and ve_hook subsys")

Signed-off-by: Konstantin Khorenko
---
 kernel/ve/ve.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index 1688407562d4..ac2252445841 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -398,6 +398,11 @@ static int ve_start_container(struct ve_struct *ve)
 	if (task_active_pid_ns(tsk) != tsk->nsproxy->pid_ns_for_children)
 		return -ECHILD;

+	/*
+	 * Setup uptime for new containers only, if restored
+	 * the value won't be zero here already but setup from
+	 * cgroup write while resuming the container.
+	 */
 	if (ve->start_time == 0) {
 		ve->start_time = tsk->start_time;
 		ve->real_start_time = tsk->real_start_time;
--
2.28.0
[Devel] [PATCH rh8 3/5] ve/time: Fix VE uptime virtualization to use u64 start_time
From: Cyrill Gorcunov

Fixes: a3c4d1d8f383 ("ve/time: Customize VE uptime")
Signed-off-by: Konstantin Khorenko

+++
ve: Use @real_start_timespec in uptime_proc_show

uptime_proc_show uses bootbased clocks so we should use
@real_start_timespec here instead. Seems was a typo while
converting from pcs6 code.

In scope of https://jira.sw.ru/browse/PSBM-41406

Signed-off-by: Cyrill Gorcunov
Reviewed-by: Vladimir Davydov

vdavydov@: This hunk was a part of
diff-cpt-record-ct-boot-based-start-time-to-show-correct-uptime
which was skipped during rebase to RH7 because it was considered
cpt-related.

(cherry picked from vz7 commit 55b9202e39282f2a21773fd1fd99317bc6e07ddd)
Signed-off-by: Konstantin Khorenko
---
 fs/proc/uptime.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
index 9a08b8a92c13..bc07d42ce9f5 100644
--- a/fs/proc/uptime.c
+++ b/fs/proc/uptime.c
@@ -38,7 +38,7 @@ FIXME:to be reworked anyway in

 static int uptime_proc_show(struct seq_file *m, void *v)
 {
-	struct timespec uptime;
+	struct timespec uptime, offset;
 	struct timespec64 idle;

 	if (ve_is_super(get_exec_env()))
@@ -58,9 +58,10 @@ FIXME: to be reworked anyway in
 	get_monotonic_boottime(&uptime);
 #ifdef CONFIG_VE
 	if (!ve_is_super(get_exec_env())) {
+		offset = ns_to_timespec(get_exec_env()->real_start_time);
 		set_normalized_timespec(&uptime,
-			uptime.tv_sec - get_exec_env()->start_timespec.tv_sec,
-			uptime.tv_nsec - get_exec_env()->start_timespec.tv_nsec);
+			uptime.tv_sec - offset.tv_sec,
+			uptime.tv_nsec - offset.tv_nsec);
 	}
 #endif
 	seq_printf(m, "%lu.%02lu %lu.%02lu\n",
--
2.28.0
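The subtraction in the second hunk relies on normalization: the nanosecond difference may go negative and has to borrow from the seconds. A minimal userspace sketch of that step follows. All `toy_*` names are hypothetical, the struct is a local stand-in rather than the kernel's struct timespec, and the borrow logic only covers the single-step case that arises when both inputs are already normalized.

```c
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000LL

/* Hypothetical userspace stand-in for the kernel's struct timespec. */
struct toy_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Split a nanosecond counter into seconds and a nanosecond remainder. */
static struct toy_timespec toy_ns_to_timespec(uint64_t ns)
{
	struct toy_timespec ts;

	ts.tv_sec = (int64_t)(ns / (uint64_t)TOY_NSEC_PER_SEC);
	ts.tv_nsec = (int64_t)(ns % (uint64_t)TOY_NSEC_PER_SEC);
	return ts;
}

/*
 * Host uptime minus the container's boot-based start offset,
 * with tv_nsec brought back into [0, 1e9) by borrowing one second.
 */
static struct toy_timespec toy_uptime_in_ve(int64_t up_sec, int64_t up_nsec,
					    uint64_t ve_real_start_ns)
{
	struct toy_timespec off = toy_ns_to_timespec(ve_real_start_ns);
	struct toy_timespec res;

	res.tv_sec = up_sec - off.tv_sec;
	res.tv_nsec = up_nsec - off.tv_nsec;
	if (res.tv_nsec < 0) {
		res.tv_sec--;
		res.tv_nsec += TOY_NSEC_PER_SEC;
	}
	return res;
}
```

For a host uptime of 100.2 s and a container start offset of 40.7 s, the raw field-wise difference is 60 s and -0.5 s, which the borrow step turns into 59 s and 500000000 ns.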
[Devel] [PATCH rh8 2/5] sched: Account task_group::start_time
From: Kirill Tkhai

Extracted from "Initial patch".

Signed-off-by: Kirill Tkhai

(cherry picked from vz7 commit bad04073f185d257f6a3290523ca02c095837e8b)
Signed-off-by: Konstantin Khorenko

Rebase to vz8 notes:
 * moved from struct timespec to u64 (nsec)
---
 kernel/sched/core.c  | 4
 kernel/sched/sched.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 92062773e632..8a57956d64d6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6095,6 +6095,7 @@ void __init sched_init(void)
 #ifdef CONFIG_CFS_CPULIMIT
 	root_task_group.topmost_limited_ancestor = &root_task_group;
 #endif
+	root_task_group.start_time = 0;
 #endif /* CONFIG_CGROUP_SCHED */

 	for_each_possible_cpu(i) {
@@ -6413,6 +6414,9 @@ struct task_group *sched_create_group(struct task_group *parent)
 	if (!alloc_rt_sched_group(tg, parent))
 		goto err;

+	/* start_timespec is saved CT0 uptime */
+	tg->start_time = ktime_get_boot_ns();
+
 	return tg;

 err:
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4dbf03a3242f..b2f0c26b2c50 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -404,6 +404,9 @@ struct task_group {
 	struct autogroup	*autogroup;
 #endif

+	/* Monotonic time in nsecs: */
+	u64			start_time;
+
 	struct cfs_bandwidth	cfs_bandwidth;
 #ifdef CONFIG_CFS_CPULIMIT
 #define MAX_CPU_RATE 1024
--
2.28.0
[Devel] [PATCH rh8 1/5] ve: Virtualize sysinfo
From: Kirill Tkhai

Extracted from "Initial patch".

Signed-off-by: Kirill Tkhai

(cherry picked from vz7 commit e55cd51304b3271a2adaf43de9b9a5a7be34541e)
Signed-off-by: Konstantin Khorenko

Port to vz8 notes:
 * virtinfo_notifier_call (bc_fill_sysinfo()) is substituted by direct call
   to si_meminfo_ve() and only for not VE0.
 * "avenrun" is not virtualized yet - need to port first
   commit 715f311fdb4a ("sched: Account task_group::cpustat,taskstats,avenrun")
 * ve_struct.real_start_time is u64 now instead of timespec
---
 kernel/sys.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index cfde07d0ba9f..2644090f8d4b 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2545,6 +2545,8 @@ SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
 	return err ? -EFAULT : 0;
 }

+extern void si_meminfo_ve(struct sysinfo *si, struct ve_struct *ve);
+
 /**
  * do_sysinfo - fill in sysinfo struct
  * @info: pointer to buffer to fill
@@ -2554,18 +2556,34 @@ static int do_sysinfo(struct sysinfo *info)
 	unsigned long mem_total, sav_total;
 	unsigned int mem_unit, bitcount;
 	struct timespec tp;
+	struct ve_struct *ve;

 	memset(info, 0, sizeof(struct sysinfo));

+	si_meminfo(info);
+	si_swapinfo(info);
+
 	get_monotonic_boottime(&tp);
-	info->uptime = tp.tv_sec + (tp.tv_nsec ? 1 : 0);

-	get_avenrun(info->loads, 0, SI_LOAD_SHIFT - FSHIFT);
+	ve = get_exec_env();
+	if (ve_is_super(ve)) {
+		info->uptime = tp.tv_sec + (tp.tv_nsec ? 1 : 0);
+		get_avenrun(info->loads, 0, SI_LOAD_SHIFT - FSHIFT);
+
+		info->procs = nr_threads;
+	} else {
+		si_meminfo_ve(info, ve);
+		info->uptime = tp.tv_sec + (tp.tv_nsec ? 1 : 0) -
+			       ve->real_start_time / NSEC_PER_SEC;

-	info->procs = nr_threads;
+		info->procs = nr_threads_ve(ve);

-	si_meminfo(info);
-	si_swapinfo(info);
+#if 0
+FIXME after
+715f311fdb4a ("sched: Account task_group::cpustat,taskstats,avenrun") is ported
+		get_avenrun_ve(info->loads, 0, SI_LOAD_SHIFT - FSHIFT);
+#endif
+	}

 	/*
 	 * If the sum of all the available memory (i.e. ram + swap)
--
2.28.0
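The container branch computes uptime in whole seconds from two values with different rounding: the host boot-based time is rounded up, the container start offset is truncated. A small userspace sketch of that arithmetic, with the hypothetical helper `toy_ve_uptime_sec()` standing in for the expression in the hunk:

```c
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/*
 * Whole-second uptime as seen inside a container:
 *   host boot-based time, rounded up to the next second,
 *   minus the container start offset, truncated to seconds.
 */
static int64_t toy_ve_uptime_sec(int64_t boot_sec, long boot_nsec,
				 uint64_t ve_real_start_ns)
{
	int64_t host_uptime = boot_sec + (boot_nsec ? 1 : 0);

	return host_uptime - (int64_t)(ve_real_start_ns / TOY_NSEC_PER_SEC);
}
```

With a host time of 100 s + 1 ns and a start offset of 40.7 s the helper returns 61, while the exact difference is about 59.3 s. Because one term is rounded up and the other down, the reported value can exceed the exact difference by a little under two seconds in this model.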
[Devel] [PATCH rh8 0/5] ve/time: first patchset for times virtualization in CT
Port several patches related to times virtualization from vz7,
move from "timespec" to ktime (u64) and make it working.

Signed-off-by: Konstantin Khorenko

Cyrill Gorcunov (1):
  ve/time: Fix VE uptime virtualization to use u64 start_time

Kirill Tkhai (2):
  ve: Virtualize sysinfo
  sched: Account task_group::start_time

Konstantin Khorenko (2):
  ve/time: Add comment in ve_start_container() on start time initialization
  ve/time: rework times() syscall and /proc/[pid]/stat to handle u64 time offsets

 fs/proc/array.c      |  7 ++-
 fs/proc/uptime.c     |  7 ---
 include/linux/ve.h   |  4
 kernel/sched/core.c  |  4
 kernel/sched/sched.h |  3 +++
 kernel/sys.c         | 47 +---
 kernel/ve/ve.c       |  5 +
 7 files changed, 49 insertions(+), 28 deletions(-)

--
2.28.0
[Devel] [PATCH RH7] ploop: kaio: Clear swapfile flag
This allows to call defrag on image file.

https://jira.sw.ru/browse/PSBM-107743

Signed-off-by: Kirill Tkhai
---
 drivers/block/ploop/io_kaio.c     | 2 --
 drivers/block/ploop/io_kaio_map.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/block/ploop/io_kaio.c b/drivers/block/ploop/io_kaio.c
index 2c2fb90d2b53..4c4a0c6a908c 100644
--- a/drivers/block/ploop/io_kaio.c
+++ b/drivers/block/ploop/io_kaio.c
@@ -1119,9 +1119,7 @@ static int __kaio_truncate(struct ploop_io * io, struct file * file, u64 pos)
 	newattrs.ia_valid = ATTR_SIZE;

 	mutex_lock(&io->files.inode->i_mutex);
-	io->files.inode->i_flags &= ~S_SWAPFILE;
 	err = notify_change(F_DENTRY(file), &newattrs, NULL);
-	io->files.inode->i_flags |= S_SWAPFILE;
 	mutex_unlock(&io->files.inode->i_mutex);

 	if (err) {
diff --git a/drivers/block/ploop/io_kaio_map.c b/drivers/block/ploop/io_kaio_map.c
index 09add482db8b..d4ff39d95e74 100644
--- a/drivers/block/ploop/io_kaio_map.c
+++ b/drivers/block/ploop/io_kaio_map.c
@@ -58,7 +58,6 @@ int ploop_kaio_open(struct file * file, int rdonly)
 	pm->readers = rdonly ? 1 : -1;
 	list_add(&pm->list, &ploop_mappings);
 	pm = NULL;
-	mapping->host->i_flags |= S_SWAPFILE;

 kaio_open_done:
 	spin_unlock(&ploop_mappings_lock);
@@ -82,7 +81,6 @@ int ploop_kaio_close(struct address_space * mapping, int rdonly)
 	}

 	if (m->readers == 0) {
-		mapping->host->i_flags &= ~S_SWAPFILE;
 		list_del(&m->list);
 		pm = m;
 	}
[Devel] [PATCH rh7] mm/filemap: fix potential memcg->cache charge leak
__add_to_page_cache_locked() after mem_cgroup_try_charge_cache()
uses mem_cgroup_cancel_charge() in one of the error paths.
This may lead to leaking a few memcg->cache charges.

Use mem_cgroup_cancel_cache_charge() to fix this.

https://jira.sw.ru/browse/PSBM-121046

Signed-off-by: Andrey Ryabinin
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 53db13f236da..2bd5ca4e7528 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -732,7 +732,7 @@ static int __add_to_page_cache_locked(struct page *page,
 	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (error) {
 		if (!huge)
-			mem_cgroup_cancel_charge(page, memcg);
+			mem_cgroup_cancel_cache_charge(page, memcg);
 		return error;
 	}

--
2.26.2