[Devel] [PATCH rh7] timers should not get negative argument
From: Vasily Averin

This patch fixes a 25-second delay on login into systemd-based containers.

A userspace application can set a timer for a time in the past and expect
the timer to expire immediately. This may not work as expected inside
migrated containers: the translated argument passed to the timer can become
negative, and the corresponding timer will then sleep for a very long time.

https://jira.sw.ru/browse/PSBM-48475

CC: Vladimir Davydov
CC: Konstantin Khorenko
Signed-off-by: Vasily Averin
Acked-by: Cyrill Gorcunov
---
 kernel/posix-timers.c |    6 ++
 1 file changed, 6 insertions(+)

Index: linux-pcs7.git/kernel/posix-timers.c
===
--- linux-pcs7.git.orig/kernel/posix-timers.c
+++ linux-pcs7.git/kernel/posix-timers.c
@@ -133,6 +133,8 @@ static struct k_clock posix_clocks[MAX_C
 		(which_clock) == CLOCK_MONOTONIC_COARSE)
 
 #ifdef CONFIG_VE
+static struct timespec zero_time;
+
 void monotonic_abs_to_ve(clockid_t which_clock, struct timespec *tp)
 {
 	struct ve_struct *ve = get_exec_env();
@@ -151,6 +153,10 @@ void monotonic_ve_to_abs(clockid_t which
 		set_normalized_timespec(tp,
 				tp->tv_sec + ve->start_timespec.tv_sec,
 				tp->tv_nsec + ve->start_timespec.tv_nsec);
+	if (timespec_compare(tp, &zero_time) <= 0) {
+		tp->tv_sec = 0;
+		tp->tv_nsec = 1;
+	}
 }
 #endif

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
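For readers without the kernel tree at hand, the clamping step can be sketched in userspace C. This is only an illustration under assumptions: normalize_timespec() is a hypothetical stand-in for the kernel's set_normalized_timespec(), ve_to_abs_clamped() is an invented name, and the container start offset is assumed to be negative after migration.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for set_normalized_timespec(): folds nsec into
 * [0, NSEC_PER_SEC) and carries the remainder into sec. */
static void normalize_timespec(struct timespec *ts, long long sec, long long nsec)
{
	while (nsec >= NSEC_PER_SEC) {
		nsec -= NSEC_PER_SEC;
		sec++;
	}
	while (nsec < 0) {
		nsec += NSEC_PER_SEC;
		sec--;
	}
	ts->tv_sec = (time_t)sec;
	ts->tv_nsec = (long)nsec;
}

/* Translate a container-relative time to host time by adding the
 * (assumed, possibly negative) container start offset, then clamp any
 * result <= 0 to the smallest positive value, as the patch does. */
static void ve_to_abs_clamped(struct timespec *tp, const struct timespec *ve_start)
{
	normalize_timespec(tp,
			   (long long)tp->tv_sec + ve_start->tv_sec,
			   (long long)tp->tv_nsec + ve_start->tv_nsec);

	/* After normalization tv_nsec >= 0, so the value is <= 0 exactly
	 * when tv_sec < 0 or both fields are zero. */
	if (tp->tv_sec < 0 || (tp->tv_sec == 0 && tp->tv_nsec == 0)) {
		tp->tv_sec = 0;
		tp->tv_nsec = 1;
	}
}
```

With a start offset of -100 s, a request for t = 5 s translates to -95 s and is clamped to 1 ns, so the timer fires at once instead of sleeping.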
[Devel] [NEW KERNEL] 3.10.0-327.18.2.vz7.14.16 (rhel7)
Changelog:

OpenVZ kernel rh7-3.10.0-327.18.2.vz7.14.16

* sysctl: dropped unused fs.ve-area-access-check, net.ipv4.tcp_max_tw*
* device_cgroup: allow managing devices from inside a Container in
  @pseudosuper state without the usual Container constraints. This is used
  at the CRIU restore stage.
* device_cgroup: allow changing the device mount permission via cgroup.
  Previously this was possible only via an ioctl on a running Container,
  which is inconvenient at the CRIU restore stage.
* device_cgroup: kill ACC_QUOTA permission. Not needed anymore.
* ploop: fixed a couple of bugs introduced during the rebase from 2.6.32-x
* netlink: added the ability to dump and restore netlink sockets with data
  in the receive queue (when no callback execution is in progress)

Generated changelog:

* Fri Jun 17 2016 Konstantin Khorenko [3.10.0-327.18.2.vz7.14.16]
- netlink/diag: report flags for netlink sockets (Andrey Vagin) [PSBM-28386]
- netlink: add an ability to restore messages in a receive queue (Andrey Vagin) [PSBM-28386]
- netlink: allow to set peeking offset for sockets (Andrey Vagin) [PSBM-28386 PSBM-48484 PSBM-28386]
- ploop: io_kaio: fix silly bug in kaio_complete_io_state() (Maxim Patlasov)
- ploop: fix counting bio_qlen (Maxim Patlasov)
- ve/device_cgroup: kill ACC_QUOTA permission (Andrey Ryabinin) [PSBM-48482]
- ve/device_cgroup: allow to change device mount permission via cgroup (Andrey Ryabinin) [PSBM-48431]
- ve/security: device_cgroup -- Allow manage devices in @pseudosuper state (Cyrill Gorcunov) [PSBM-48421]
- ve/sysctl: remove unused fs.ve-area-access-check, net.ipv4.tcp_max_tw* (Pavel Tikhomirov) [PSBM-47061]

Built packages: http://kojistorage.eng.sw.ru/packages/vzkernel/3.10.0/327.18.2.vz7.14.16/
[Devel] [PATCH rh7] mm: memcontrol: fix race between kmem uncharge and charge reparenting
When a cgroup is destroyed, all user memory pages get recharged to the
parent cgroup. Recharging is done by mem_cgroup_reparent_charges, which
keeps looping until res <= kmem. This is supposed to guarantee that by the
time the cgroup gets released, no pages are charged to it. However, the
guarantee might be violated if mem_cgroup_reparent_charges races with a
kmem charge or uncharge.

Currently, kmem is charged before res and uncharged after. As a result,
kmem might become greater than res for a short period of time even if
there are still user memory pages charged to the cgroup. In this case
mem_cgroup_reparent_charges will give up prematurely, and the cgroup might
be released though there are still pages charged to it. Uncharging such a
page will trigger a kernel panic:

  general protection fault: [#1] SMP
  CPU: 0 PID: 972445 Comm: httpd ve: 0 Tainted: G OE 3.10.0-427.10.1.lve1.4.9.el7.x86_64 #1 12.14
  task: 88065d53d8d0 ti: 880224f34000 task.ti: 880224f34000
  RIP: 0010:[] [] mem_cgroup_charge_statistics.isra.16+0x13/0x60
  RSP: 0018:880224f37a80 EFLAGS: 00010202
  RAX: RBX: 8807b26f0110 RCX:
  RDX: 79726f6765746163 RSI: ea000c9c0440 RDI: 8806a55662f8
  RBP: 880224f37a80 R08: R09: 03808000
  R10: 00b8 R11: ea001eaa8980 R12: ea000c9c0440
  R13: 0001 R14: R15: 8806a5566000
  FS: () GS:8807d400() knlGS:
  CS: 0010 DS: ES: CR0: 80050033
  CR2: 7f54289bd74c CR3: 0006638b1000 CR4: 06f0
  DR0: DR1: DR2:
  DR3: DR6: 0ff0 DR7: 0400
  Stack:
   880224f37ac0 811e9ddf 88060001 ea000c9c0440
   0001 037d1000 880224f37c78 0380
   880224f37ad0 811ee99a 880224f37b08 811b9ec9
  Call Trace:
   [] __mem_cgroup_uncharge_common+0xcf/0x320
   [] mem_cgroup_uncharge_page+0x2a/0x30
   [] page_remove_rmap+0xb9/0x160
   [] ? res_counter_uncharge+0x13/0x20
   [] unmap_page_range+0x460/0x870
   [] unmap_single_vma+0x81/0xf0
   [] unmap_vmas+0x49/0x90
   [] exit_mmap+0xac/0x1a0
   [] mmput+0x6b/0x140
   [] flush_old_exec+0x467/0x8d0
   [] load_elf_binary+0x33c/0xde0
   [] ? get_user_pages+0x52/0x60
   [] ? load_elf_library+0x220/0x220
   [] search_binary_handler+0xd5/0x300
   [] do_execve_common.isra.26+0x657/0x720
   [] SyS_execve+0x29/0x30
   [] stub_execve+0x69/0xa0

To prevent this from happening, let's always charge kmem after res and
uncharge before res.

https://bugs.openvz.org/browse/OVZ-6756

Reported-by: Anatoly Stepanov
Signed-off-by: Vladimir Davydov
---
 mm/memcontrol.c | 44
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1c3fbb2d2c48..de7c36295515 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3163,10 +3163,6 @@ int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 	int ret = 0;
 	bool may_oom;
 
-	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
-	if (ret)
-		return ret;
-
 	/*
 	 * Conditions under which we can wait for the oom_killer. Those are
 	 * the same conditions tested by the core page allocator
@@ -3198,8 +3194,33 @@ int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 			res_counter_charge_nofail(&memcg->memsw, size, &fail_res);
 		ret = 0;
-	} else if (ret)
-		res_counter_uncharge(&memcg->kmem, size);
+	}
+
+	if (ret)
+		return ret;
+
+	/*
+	 * When a cgroup is destroyed, all user memory pages get recharged to
+	 * the parent cgroup. Recharging is done by mem_cgroup_reparent_charges
+	 * which keeps looping until res <= kmem. This is supposed to guarantee
+	 * that by the time cgroup gets released, no pages is charged to it.
+	 *
+	 * If kmem were charged before res or uncharged after, kmem might
+	 * become greater than res for a short period of time even if there
+	 * were still user memory pages charged to the cgroup. In this case
+	 * mem_cgroup_reparent_charges would give up prematurely, and the
+	 * cgroup could be released though there were still pages charged to
+	 * it. Uncharge of such a page would trigger kernel panic.
+	 *
+	 * To prevent this from happening, kmem must be charged after res and
+	 * uncharged before res.
+	 */
+	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
+	if (ret) {
+		res_counter_uncharge(&memcg->res, size);
+		if (do_swap_account)
+			res_counter_u
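The ordering rule can be illustrated with a much simplified userspace model. Everything here is an assumption made for the sketch: struct memcg_counters, counter_charge() and the other helper names are invented, plain integers replace struct res_counter, and there is no locking or hierarchy. The point is only that, with this ordering, kmem never exceeds res at any intermediate step.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified counters; the kernel uses struct res_counter. */
struct memcg_counters {
	uint64_t res;        /* user + kernel memory charged */
	uint64_t kmem;       /* kernel memory charged */
	uint64_t res_limit;
	uint64_t kmem_limit;
};

/* Charge one counter, failing if the limit would be exceeded. */
static bool counter_charge(uint64_t *usage, uint64_t limit, uint64_t size)
{
	if (*usage + size > limit)
		return false;
	*usage += size;
	return true;
}

/* Charge kernel memory: res first, kmem second, so kmem <= res holds
 * at every intermediate step. Roll back res if kmem cannot be charged. */
static int charge_kmem(struct memcg_counters *c, uint64_t size)
{
	if (!counter_charge(&c->res, c->res_limit, size))
		return -1;
	if (!counter_charge(&c->kmem, c->kmem_limit, size)) {
		c->res -= size;
		return -1;
	}
	return 0;
}

/* Uncharge in the reverse order: kmem first, then res. */
static void uncharge_kmem(struct memcg_counters *c, uint64_t size)
{
	c->kmem -= size;
	c->res -= size;
}

/* The condition the reparenting loop waits for: no user pages left. */
static bool only_kmem_left(const struct memcg_counters *c)
{
	return c->res <= c->kmem;
}
```

While one user page remains charged, only_kmem_left() stays false after any sequence of charge_kmem() and uncharge_kmem() calls, including a failed kmem charge that rolls res back.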
Re: [Devel] [PATCH rh7] mm: memcontrol: fix race between kmem uncharge and charge reparenting
Kirill, please review.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 06/17/2016 01:35 PM, Vladimir Davydov wrote:
> When a cgroup is destroyed, all user memory pages get recharged to the
> parent cgroup.
> [...]
> To prevent this from happening, let's always charge kmem after res and
> uncharge before res.
>
> https://bugs.openvz.org/browse/OVZ-6756
[Devel] [PATCH RHEL7 COMMIT] ploop: io_kaio: fix silly bug in kaio_complete_io_state()
The commit is pushed to "branch-rh7-3.10.0-327.18.2.vz7.14.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.18.2.vz7.14.15
-->
commit 75347fcbd30151977601d819a56f0a0bb57182f5
Author: Maxim Patlasov
Date:   Fri Jun 17 13:32:34 2016 +0400

    ploop: io_kaio: fix silly bug in kaio_complete_io_state()

    It's useless to check for preq->req_rw & REQ_FUA after:

    	preq->req_rw &= ~REQ_FUA;

    Signed-off-by: Maxim Patlasov
    Acked-by: Dmitry Monakhov

    Note: original code:

    	...
    	preq->req_rw &= ~REQ_FUA;

    	/* Convert requested fua to fsync */
    	if (test_and_clear_bit(PLOOP_REQ_FORCE_FUA, &preq->state) ||
    	    test_and_clear_bit(PLOOP_REQ_KAIO_FSYNC, &preq->state))
    		post_fsync = 1;

    	if (!post_fsync &&
    	    !ploop_req_delay_fua_possible(preq->req_rw, preq) &&
    	    (preq->req_rw & REQ_FUA))
    		post_fsync = 1;

    	preq->req_rw &= ~REQ_FUA;
    	...
---
 drivers/block/ploop/io_kaio.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/block/ploop/io_kaio.c b/drivers/block/ploop/io_kaio.c
index 54f8e21..81da1c5 100644
--- a/drivers/block/ploop/io_kaio.c
+++ b/drivers/block/ploop/io_kaio.c
@@ -78,8 +78,6 @@ static void kaio_complete_io_state(struct ploop_request * preq)
 		return;
 	}
 
-	preq->req_rw &= ~REQ_FUA;
-
 	/* Convert requested fua to fsync */
 	if (test_and_clear_bit(PLOOP_REQ_FORCE_FUA, &preq->state) ||
 	    test_and_clear_bit(PLOOP_REQ_KAIO_FSYNC, &preq->state))
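The effect of the early clear can be reproduced in a few lines of standalone C. This is a sketch, not the ploop code: REQ_FUA_FLAG, needs_fsync_buggy() and needs_fsync_fixed() are invented for illustration, and the delay_possible parameter stands in for the result of ploop_req_delay_fua_possible().

```c
#include <stdbool.h>

#define REQ_FUA_FLAG 0x1u /* hypothetical flag value, illustration only */

/* Ordering before the fix: the flag is cleared first, so the later test
 * can never be true and post_fsync is never set on this path. */
static bool needs_fsync_buggy(unsigned int *req_rw, bool delay_possible)
{
	bool post_fsync = false;

	*req_rw &= ~REQ_FUA_FLAG; /* cleared too early */

	if (!delay_possible && (*req_rw & REQ_FUA_FLAG))
		post_fsync = true; /* unreachable */

	return post_fsync;
}

/* Ordering after the fix: test the flag first, clear it afterwards. */
static bool needs_fsync_fixed(unsigned int *req_rw, bool delay_possible)
{
	bool post_fsync = false;

	if (!delay_possible && (*req_rw & REQ_FUA_FLAG))
		post_fsync = true;

	*req_rw &= ~REQ_FUA_FLAG;

	return post_fsync;
}
```

For a request with the flag set and no delay possible, the buggy variant returns false and the fixed one returns true; both leave the flag cleared on exit.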
[Devel] [PATCH RHEL7 COMMIT] ploop: fix counting bio_qlen
The commit is pushed to "branch-rh7-3.10.0-327.18.2.vz7.14.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.18.2.vz7.14.15
-->
commit 6cf1b457fb7252f4d2ada14f8cff0d3b91c26b5d
Author: Maxim Patlasov
Date:   Fri Jun 17 13:32:34 2016 +0400

    ploop: fix counting bio_qlen

    The commit ec1eeb868 (May 22 2015) ported "separate queue for discard bio"
    patch from RHEL6-based kernel incorrectly. Original patch stated clearly
    that if we want to decrement bio_discard_qlen, bio_qlen must not change:

    @@ -500,7 +502,7 @@ ploop_bio_queue(struct ploop_device * pl
     		    (err = ploop_discard_add_bio(plo->fbd, bio))) {
     			BIO_ENDIO(bio, err);
     			list_add(&preq->list, &plo->free_list);
    -			plo->bio_qlen--;
    +			plo->bio_discard_qlen--;
     			plo->bio_total--;
     			return;
     		}

    but that port did the opposite:

    @@ -521,6 +523,7 @@ ploop_bio_queue(struct ploop_device * plo, struct bio * bio,
     			BIO_ENDIO(plo->queue, bio, err);
     			list_add(&preq->list, &plo->free_list);
     			plo->bio_qlen--;
    +			plo->bio_discard_qlen--;
     			plo->bio_total--;
     			return;
     		}

    Signed-off-by: Maxim Patlasov
    Acked-by: Dmitry Monakhov
---
 drivers/block/ploop/dev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
index 01a5189..2ef1449 100644
--- a/drivers/block/ploop/dev.c
+++ b/drivers/block/ploop/dev.c
@@ -530,7 +530,6 @@ ploop_bio_queue(struct ploop_device * plo, struct bio * bio,
 			}
 			BIO_ENDIO(plo->queue, bio, err);
 			list_add(&preq->list, &plo->free_list);
-			plo->bio_qlen--;
 			plo->bio_discard_qlen--;
 			plo->bio_total--;
 			return;
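A toy model of the accounting may help show why the extra decrement skews bio_qlen. The enqueue side is not shown in the patch, so this sketch assumes that a discard bio is counted in bio_discard_qlen and bio_total only; struct queue_stats and both helpers are invented names.

```c
/* Hypothetical counters mirroring the three fields touched by the patch. */
struct queue_stats {
	int bio_qlen;         /* regular bios queued */
	int bio_discard_qlen; /* discard bios queued */
	int bio_total;        /* all bios queued */
};

/* Assumed enqueue accounting: each bio lands in exactly one queue. */
static void account_enqueue(struct queue_stats *q, int is_discard)
{
	if (is_discard)
		q->bio_discard_qlen++;
	else
		q->bio_qlen++;
	q->bio_total++;
}

/* Error path for a discard bio after the fix: undo only what the
 * discard enqueue did, leaving bio_qlen untouched. */
static void account_discard_failure(struct queue_stats *q)
{
	q->bio_discard_qlen--;
	q->bio_total--;
}
```

Under that assumption, decrementing bio_qlen as well on the error path would leave it one lower than the number of regular bios actually queued.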
[Devel] [PATCH RHEL7 COMMIT] ve/device_cgroup: kill ACC_QUOTA permission
The commit is pushed to "branch-rh7-3.10.0-327.18.2.vz7.14.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.18.2.vz7.14.15
-->
commit 32eb6c887fd633b840453f9011a62d8253ef689c
Author: Andrey Ryabinin
Date:   Fri Jun 17 13:26:15 2016 +0400

    ve/device_cgroup: kill ACC_QUOTA permission

    This is a leftover from PCS6. Currently this code does absolutely nothing,
    so let's remove it.

    https://jira.sw.ru/browse/PSBM-48482

    Signed-off-by: Andrey Ryabinin

    khorenko@: keep MAY_QUOTACTL and ACC_QUOTA defines with comment about
    deprecation.
---
 include/linux/fs.h       |  2 +-
 security/device_cgroup.c | 14 +++---
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index b035f62..7203dba 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -77,7 +77,7 @@ typedef void (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* called from RCU mode, don't block */
 #define MAY_NOT_BLOCK 0x0080
 /* for devgroup-vs-openvz only */
-#define MAY_QUOTACTL 0x0001
+#define MAY_QUOTACTL 0x0001 /* deprecated */
 #define MAY_MOUNT 0x0002
 
 /*
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index fc14cdc..8e77d78 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -22,10 +22,10 @@
 #define ACC_MKNOD 1
 #define ACC_READ 2
 #define ACC_WRITE 4
-#define ACC_QUOTA 8
+#define ACC_QUOTA 8	/* deprecated */
 #define ACC_HIDDEN 16
 #define ACC_MOUNT 64
-#define ACC_MASK (ACC_MKNOD | ACC_READ | ACC_WRITE | ACC_QUOTA | ACC_MOUNT)
+#define ACC_MASK (ACC_MKNOD | ACC_READ | ACC_WRITE | ACC_MOUNT)
 
 #define DEV_BLOCK 1
 #define DEV_CHAR 2
@@ -941,8 +941,6 @@ int __devcgroup_inode_permission(struct inode *inode, int mask)
 		access |= ACC_WRITE;
 	if (mask & MAY_READ)
 		access |= ACC_READ;
-	if (mask & MAY_QUOTACTL)
-		access |= ACC_QUOTA;
 	if (mask & MAY_MOUNT)
 		access |= ACC_MOUNT;
 
@@ -962,8 +960,6 @@ int devcgroup_device_permission(umode_t mode, dev_t dev, int mask)
 		access |= ACC_WRITE;
 	if (mask & MAY_READ)
 		access |= ACC_READ;
-	if (mask & MAY_QUOTACTL)
-		access |= ACC_QUOTA;
 
 	return __devcgroup_check_permission(type, MAJOR(dev), MINOR(dev), access);
 }
@@ -972,7 +968,7 @@ int devcgroup_device_visible(umode_t mode, int major, int start_minor, int nr_mi
 {
 	struct dev_cgroup *dev_cgroup;
 	struct dev_exception_item *ex;
-	short access = ACC_READ | ACC_WRITE | ACC_QUOTA;
+	short access = ACC_READ | ACC_WRITE;
 	bool match = false;
 
 	rcu_read_lock();
@@ -1076,8 +1072,6 @@ static unsigned decode_ve_perms(unsigned perm)
 		mask |= ACC_READ;
 	if (perm & S_IWOTH)
 		mask |= ACC_WRITE;
-	if (perm & S_IXGRP)
-		mask |= ACC_QUOTA;
 	if (perm & S_IXUSR)
 		mask |= ACC_MOUNT;
 
@@ -1092,8 +1086,6 @@ static unsigned encode_ve_perms(unsigned mask)
 		perm |= S_IROTH;
 	if (mask & ACC_WRITE)
 		perm |= S_IWOTH;
-	if (mask & ACC_QUOTA)
-		perm |= S_IXGRP;
 	if (mask & ACC_MOUNT)
 		perm |= S_IXUSR;
[Devel] [PATCH RHEL7 COMMIT] ve/device_cgroup: allow to change device mount permission via cgroup
The commit is pushed to "branch-rh7-3.10.0-327.18.2.vz7.14.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.18.2.vz7.14.15
-->
commit 3c080800a7f1d1a3102c9d9de1b46b80e0fec187
Author: Andrey Ryabinin
Date:   Fri Jun 17 13:08:24 2016 +0400

    ve/device_cgroup: allow to change device mount permission via cgroup

    Currently, in order to allow a Container to mount a device, we call
    ioctl(get_vzctlfd(), VZCTL_SETDEVPERMS, &devperms) with the S_IXUSR bit
    set. In fact, this ioctl() is just a wrapper around the dev cgroup
    interface, which is very odd.

    Instead, let's allow changing the mount permission via the dev cgroup
    interface. Since the letter 'm' is already taken by the mknod permission,
    we will use a capital 'M' for the mount permission. E.g.:

    	$ echo 'b 182:954545 M' > /sys/fs/cgroup/devices/$ID/devices.allow
    	$ cat /sys/fs/cgroup/devices/$ID/devices.list
    	...
    	b 182:954545 rmM

    https://jira.sw.ru/browse/PSBM-48431

    Signed-off-by: Andrey Ryabinin
---
 security/device_cgroup.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index f94d08e..fc14cdc 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -269,7 +269,7 @@ static void devcgroup_css_free(struct cgroup *cgroup)
 #define DEVCG_LIST 3
 
 #define MAJMINLEN 13
-#define ACCLEN 4
+#define ACCLEN 5
 
 static void set_access(char *acc, short access)
 {
@@ -281,6 +281,8 @@ static void set_access(char *acc, short access)
 		acc[idx++] = 'w';
 	if (access & ACC_MKNOD)
 		acc[idx++] = 'm';
+	if (access & ACC_MOUNT)
+		acc[idx++] = 'M';
 }
 
 static char type_to_char(short type)
@@ -771,7 +773,7 @@ static int devcgroup_update_access(struct dev_cgroup *devcgroup,
 	}
 	if (!isspace(*b))
 		return -EINVAL;
-	for (b++, count = 0; count < 3; count++, b++) {
+	for (b++, count = 0; count < ACCLEN - 1; count++, b++) {
 		switch (*b) {
 		case 'r':
 			ex.access |= ACC_READ;
@@ -782,9 +784,12 @@ static int devcgroup_update_access(struct dev_cgroup *devcgroup,
 		case 'm':
 			ex.access |= ACC_MKNOD;
 			break;
+		case 'M':
+			ex.access |= ACC_MOUNT;
+			break;
 		case '\n':
 		case '\0':
-			count = 3;
+			count = ACCLEN - 1;
 			break;
 		default:
 			return -EINVAL;
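The parsing and printing of the access letters can be tried outside the kernel with the sketch below. It follows the loop shape and the ACC_* values shown in the patches in this thread, but parse_access() and format_access() are invented names, and the device type and major:minor parsing that precedes the letters in devices.allow is left out.

```c
#include <string.h>

#define ACC_MKNOD 1
#define ACC_READ  2
#define ACC_WRITE 4
#define ACC_MOUNT 64
#define ACCLEN    5 /* up to four letters plus the terminating NUL */

/* Parse up to ACCLEN - 1 access letters; returns 0 on success and -1 on
 * an unknown letter. A newline or NUL ends the list early. */
static int parse_access(const char *s, short *access)
{
	short acc = 0;
	int count;

	for (count = 0; count < ACCLEN - 1; count++, s++) {
		switch (*s) {
		case 'r':
			acc |= ACC_READ;
			break;
		case 'w':
			acc |= ACC_WRITE;
			break;
		case 'm':
			acc |= ACC_MKNOD;
			break;
		case 'M':
			acc |= ACC_MOUNT;
			break;
		case '\n':
		case '\0':
			count = ACCLEN - 1; /* terminates the loop */
			break;
		default:
			return -1;
		}
	}
	*access = acc;
	return 0;
}

/* Render an access mask as letters; acc must hold ACCLEN bytes. */
static void format_access(char *acc, short access)
{
	int idx = 0;

	memset(acc, 0, ACCLEN);
	if (access & ACC_READ)
		acc[idx++] = 'r';
	if (access & ACC_WRITE)
		acc[idx++] = 'w';
	if (access & ACC_MKNOD)
		acc[idx++] = 'm';
	if (access & ACC_MOUNT)
		acc[idx++] = 'M';
}
```

The buffer size explains the ACCLEN change from 4 to 5: with all four permissions set, "rwmM" needs four letters and a terminator.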
[Devel] [PATCH RHEL7 COMMIT] ve/security: device_cgroup -- Allow manage devices in @pseudosuper state
The commit is pushed to "branch-rh7-3.10.0-327.18.2.vz7.14.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.18.2.vz7.14.15
-->
commit 6504a698d0cb68644ad61f139e528c7fb605a246
Author: Cyrill Gorcunov
Date:   Fri Jun 17 13:06:56 2016 +0400

    ve/security: device_cgroup -- Allow manage devices in @pseudosuper state

    When restoring containers with several disks, it's more convenient to
    mount the device first and then set up the permissions needed. For this
    reason we allow skipping device permission checks inside a VE, but only
    if the @pseudosuper state is enabled.

    https://jira.sw.ru/browse/PSBM-48421

    CC: Vladimir Davydov
    CC: Konstantin Khorenko
    CC: Andrey Vagin
    Signed-off-by: Cyrill Gorcunov
---
 security/device_cgroup.c | 16
 1 file changed, 16 insertions(+)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 0a6d9c4..f94d08e 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -902,8 +902,24 @@ static int __devcgroup_check_permission(short type, u32 major, u32 minor,
 			minor, access);
 	rcu_read_unlock();
 
+#ifdef CONFIG_VE
+	/*
+	 * When restoring container allow everything in
+	 * pseudosuper state. We need this for early
+	 * mounting of second ploop device. Still, don't
+	 * change behaviour on the ve0.
+	 */
+	if (!rc) {
+		struct ve_struct *ve = get_exec_env();
+
+		if (!ve_is_super(ve) && ve->is_pseudosuper)
+			return 0;
+		return -EPERM;
+	}
+#else
 	if (!rc)
 		return -EPERM;
+#endif
 
 	return 0;
 }
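The resulting decision can be written out as a small standalone function. This is a sketch under assumptions: struct ve_ctx and check_permission() are invented stand-ins for struct ve_struct and __devcgroup_check_permission(), and rule_matched plays the role of rc from the exception lookup.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduction of struct ve_struct to the two bits that matter. */
struct ve_ctx {
	bool is_super;       /* true on ve0 (the host) */
	bool is_pseudosuper; /* set while the container is being restored */
};

/* Returns 0 if access is allowed, -EPERM otherwise. */
static int check_permission(bool rule_matched, const struct ve_ctx *ve)
{
	if (rule_matched)
		return 0;

	/* No matching rule: let a container in pseudosuper state through,
	 * but keep the host (ve0) behaviour unchanged. */
	if (!ve->is_super && ve->is_pseudosuper)
		return 0;

	return -EPERM;
}
```

Only a non-host context with is_pseudosuper set bypasses a failed lookup; ve0 still gets -EPERM even if the flag were set.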
Re: [Devel] [vzlin-dev] [PATCH rh7] ploop: io_kaio: fix silly bug in kaio_complete_io_state()
Maxim Patlasov writes:

> It's useless to check for preq->req_rw & REQ_FUA after:
>
> 	preq->req_rw &= ~REQ_FUA;

ACK :)
But to make it clear for others, let's post the original code here:

	...
	preq->req_rw &= ~REQ_FUA;

	/* Convert requested fua to fsync */
	if (test_and_clear_bit(PLOOP_REQ_FORCE_FUA, &preq->state) ||
	    test_and_clear_bit(PLOOP_REQ_KAIO_FSYNC, &preq->state))
		post_fsync = 1;

	if (!post_fsync &&
	    !ploop_req_delay_fua_possible(preq->req_rw, preq) &&
	    (preq->req_rw & REQ_FUA))
		post_fsync = 1;

	preq->req_rw &= ~REQ_FUA;
	...

>
> Signed-off-by: Maxim Patlasov
> ---
>  drivers/block/ploop/io_kaio.c |    2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/block/ploop/io_kaio.c b/drivers/block/ploop/io_kaio.c
> index 79aa9af..de26319 100644
> --- a/drivers/block/ploop/io_kaio.c
> +++ b/drivers/block/ploop/io_kaio.c
> @@ -71,8 +71,6 @@ static void kaio_complete_io_state(struct ploop_request * preq)
>  		return;
>  	}
>
> -	preq->req_rw &= ~REQ_FUA;
> -
>  	/* Convert requested fua to fsync */
>  	if (test_and_clear_bit(PLOOP_REQ_FORCE_FUA, &preq->state) ||
>  	    test_and_clear_bit(PLOOP_REQ_KAIO_FSYNC, &preq->state))
Re: [Devel] [vzlin-dev] [PATCH rh7] ploop: fix counting bio_qlen
Maxim Patlasov writes:

> The commit ec1eeb868 (May 22 2015) ported "separate queue for discard bio"
> patch from RHEL6-based kernel incorrectly. Original patch stated clearly
> that if we want to decrement bio_discard_qlen, bio_qlen must not change:
>
> @@ -500,7 +502,7 @@ ploop_bio_queue(struct ploop_device * pl
>  		    (err = ploop_discard_add_bio(plo->fbd, bio))) {
>  			BIO_ENDIO(bio, err);
>  			list_add(&preq->list, &plo->free_list);
> -			plo->bio_qlen--;
> +			plo->bio_discard_qlen--;
>  			plo->bio_total--;
>  			return;
>  		}
>
> but that port did the opposite:
>
> @@ -521,6 +523,7 @@ ploop_bio_queue(struct ploop_device * plo, struct bio * bio,
>  			BIO_ENDIO(plo->queue, bio, err);
>  			list_add(&preq->list, &plo->free_list);
>  			plo->bio_qlen--;
> +			plo->bio_discard_qlen--;
>  			plo->bio_total--;
>  			return;
>  		}
>
> Signed-off-by: Maxim Patlasov
> ---
>  drivers/block/ploop/dev.c |    1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
> index db55be3..e1fbfcf 100644
> --- a/drivers/block/ploop/dev.c
> +++ b/drivers/block/ploop/dev.c
> @@ -523,7 +523,6 @@ ploop_bio_queue(struct ploop_device * plo, struct bio * bio,
>  			}
>  			BIO_ENDIO(plo->queue, bio, err);
>  			list_add(&preq->list, &plo->free_list);
> -			plo->bio_qlen--;
>  			plo->bio_discard_qlen--;
>  			plo->bio_total--;
>  			return;

ACK