[Devel] [PATCH rh7] cbt: blk_cbt_update_size() must return if cbt->block_max not changed
It's useless to recreate cbt every time we are called for the same block-device size. Actually, it's worthwhile only if cbt->block_max increases. Since commit b8e560a299 (fix cbt->block_max calculation), we calculate cbt->block_max precisely:

> cbt->block_max = (size + blocksize - 1) >> cbt->block_bits;

Hence, the following check:

	if ((new_sz + bsz) >> cbt->block_bits <= cbt->block_max)
		goto err_mtx;

must be corrected accordingly.

Signed-off-by: Maxim Patlasov
---
 block/blk-cbt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-cbt.c b/block/blk-cbt.c
index 3a2b197..4f2ce26 100644
--- a/block/blk-cbt.c
+++ b/block/blk-cbt.c
@@ -440,7 +440,7 @@ void blk_cbt_update_size(struct block_device *bdev)
 		return;
 	}
 	bsz = 1 << cbt->block_bits;
-	if ((new_sz + bsz) >> cbt->block_bits <= cbt->block_max)
+	if ((new_sz + bsz - 1) >> cbt->block_bits <= cbt->block_max)
 		goto err_mtx;
 	new = do_cbt_alloc(q, cbt->uuid, new_sz, bsz);

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
[Devel] [PATCH rh7] cbt: blk_cbt_update_size() should not copy uninitialized data
to_cpy is the number of page pointers to copy from the current cbt to the new one. The following check:

> if ((new_sz + bsz) >> cbt->block_bits <= cbt->block_max)
> 	goto err_mtx;

ensures that the copy is done only when the new cbt is bigger than the current one. So, we have to calculate to_cpy based on the current (smaller) cbt. The rest of the new cbt is OK because it was nullified by do_cbt_alloc().

The bug has existed since the very first version of CBT (commit ad7ba3dfe).

https://jira.sw.ru/browse/PSBM-48120

Signed-off-by: Maxim Patlasov
---
 block/blk-cbt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-cbt.c b/block/blk-cbt.c
index 001dbfd..3a2b197 100644
--- a/block/blk-cbt.c
+++ b/block/blk-cbt.c
@@ -448,7 +448,7 @@ void blk_cbt_update_size(struct block_device *bdev)
 		set_bit(CBT_ERROR, &cbt->flags);
 		goto err_mtx;
 	}
-	to_cpy = NR_PAGES(new->block_max);
+	to_cpy = NR_PAGES(cbt->block_max);
 	set_bit(CBT_NOCACHE, &cbt->flags);
 	cbt_flush_cache(cbt);
 	spin_lock_irq(&cbt->lock);
[Devel] [PATCH] scripts: vz-rst-action -- Restore certain members before creating namespaces
When restoring tasks we call clone() and unshare() with the flags needed, but some VE settings, such as @iptables_mask, affect how creation of a new namespace proceeds. Thus we need to restore these members at the very early pre-restore stage. I put @features here as well; for example, the sitX net_init action depends on it.

Signed-off-by: Cyrill Gorcunov
---

Igor, don't apply it please until explicit Ack from CC'ed list.

 scripts/vz-rst-action.in | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/scripts/vz-rst-action.in b/scripts/vz-rst-action.in
index 4e408f2..f0b6aca 100755
--- a/scripts/vz-rst-action.in
+++ b/scripts/vz-rst-action.in
@@ -68,6 +68,12 @@ fi
 set -e
 
 case "$CRTOOLS_SCRIPT_ACTION" in
+"pre-restore")
+	if [ -n "$VEID" ]; then
+		[ -n "$VE_IPTABLES_MASK" ] && cgset -r ve.iptables_mask="$VE_IPTABLES_MASK" $VEID
+		[ -n "$VE_FEATURES" ] && cgset -r ve.features="$VE_FEATURES" $VEID
+	fi
+	;;
 "setup-namespaces")
 	pid=$(cat $VE_PIDFILE)
 	ln -s /proc/$pid/ns/net $VE_NETNS_FILE
@@ -75,8 +81,6 @@ case "$CRTOOLS_SCRIPT_ACTION" in
 	if [ -n "$VEID" ]; then
 		[ -n "$VE_CLOCK_BOOTBASED" ] && cgset -r ve.clock_bootbased="$VE_CLOCK_BOOTBASED" $VEID
 		[ -n "$VE_CLOCK_MONOTONIC" ] && cgset -r ve.clock_monotonic="$VE_CLOCK_MONOTONIC" $VEID
-		[ -n "$VE_IPTABLES_MASK" ] && cgset -r ve.iptables_mask="$VE_IPTABLES_MASK" $VEID
-		[ -n "$VE_FEATURES" ] && cgset -r ve.features="$VE_FEATURES" $VEID
 		[ -n "$VE_AIO_MAX_NR" ] && cgset -r ve.aio_max_nr="$VE_AIO_MAX_NR" $VEID
 		cgset -r ve.state="START $pid" $VEID || { echo "Failed to start $VEID"; exit 1; }
 	fi
-- 
2.5.5
Re: [Devel] [patch vz7] do not allow rootfs umount
On Mon, Jun 06, 2016 at 03:45:11PM +0300, Vasily Averin wrote:
> In mainline, rootfs is always marked MNT_LOCKED;
> sys_umount checks this flag and fails its processing.
> Our kernels lack the MNT_LOCKED flag, so we use another kind of check
> to prevent the incorrect operation.
>
> https://jira.sw.ru/browse/PSBM-46437
>
> Signed-off-by: Vasily Averin
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 988320b..6f05245 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -1355,6 +1355,8 @@ SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
>  		goto dput_and_out;
>  	if (!check_mnt(mnt))
>  		goto dput_and_out;
> +	if (path.mnt->mnt_parent == path.mnt)

We can use mnt_has_parent() here.

> +		goto dput_and_out;
>
>  	retval = do_umount(mnt, flags);
>  dput_and_out:
Re: [Devel] [vzlin-dev] [PATCH rh7 2/2] ploop: push_backup: rework lockout machinery
Maxim Patlasov writes:

> It was not a very nice idea to reuse plo->lockout_tree for push_backup.
> Because by design only one preq (for any given req_cluster) can sit in
> the lockout tree, but while we're reusing the tree for a WRITE request,
> a READ from the backup tool may come. Such a READ may want to use the
> tree: see how map_index_fault calls add_lockout for snapshot
> configuration.
>
> The patch introduces an ad-hoc separate push_backup lockout tree. This
> fixes the issue (PSBM-47680) and makes the code much easier to
> understand.
>
> https://jira.sw.ru/browse/PSBM-47680

ACK

> Signed-off-by: Maxim Patlasov
> ---
>  drivers/block/ploop/dev.c    | 111 ++
>  drivers/block/ploop/events.h |   1
>  include/linux/ploop/ploop.h  |   3 +
>  3 files changed, 95 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
> index d3f0ec0..27827a8 100644
> --- a/drivers/block/ploop/dev.c
> +++ b/drivers/block/ploop/dev.c
> @@ -1117,20 +1117,25 @@ static int ploop_congested(void *data, int bits)
>  	return ret;
>  }
>
> -static int check_lockout(struct ploop_request *preq)
> +static int __check_lockout(struct ploop_request *preq, bool pb)
>  {
>  	struct ploop_device * plo = preq->plo;
> -	struct rb_node * n = plo->lockout_tree.rb_node;
> +	struct rb_node * n = pb ? plo->lockout_pb_tree.rb_node :
> +				  plo->lockout_tree.rb_node;
>  	struct ploop_request * p;
> +	int lockout_bit = pb ? PLOOP_REQ_PB_LOCKOUT : PLOOP_REQ_LOCKOUT;
>
>  	if (n == NULL)
>  		return 0;
>
> -	if (test_bit(PLOOP_REQ_LOCKOUT, &preq->state))
> +	if (test_bit(lockout_bit, &preq->state))
>  		return 0;
>
>  	while (n) {
> -		p = rb_entry(n, struct ploop_request, lockout_link);
> +		if (pb)
> +			p = rb_entry(n, struct ploop_request, lockout_pb_link);
> +		else
> +			p = rb_entry(n, struct ploop_request, lockout_link);
>
>  		if (preq->req_cluster < p->req_cluster)
>  			n = n->rb_left;
> @@ -1146,19 +1151,51 @@ static int check_lockout(struct ploop_request *preq)
>  	return 0;
>  }
>
> -int ploop_add_lockout(struct ploop_request *preq, int try)
> +static int check_lockout(struct ploop_request *preq)
> +{
> +	if (__check_lockout(preq, false))
> +		return 1;
> +
> +	/* push_backup passes READs intact */
> +	if (!(preq->req_rw & REQ_WRITE))
> +		return 0;
> +
> +	if (__check_lockout(preq, true))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static int __ploop_add_lockout(struct ploop_request *preq, int try, bool pb)
>  {
>  	struct ploop_device * plo = preq->plo;
> -	struct rb_node ** p = &plo->lockout_tree.rb_node;
> +	struct rb_node ** p;
>  	struct rb_node *parent = NULL;
>  	struct ploop_request * pr;
> +	struct rb_node *link;
> +	struct rb_root *tree;
> +	int lockout_bit;
> +
> +	if (pb) {
> +		link = &preq->lockout_pb_link;
> +		tree = &plo->lockout_pb_tree;
> +		lockout_bit = PLOOP_REQ_PB_LOCKOUT;
> +	} else {
> +		link = &preq->lockout_link;
> +		tree = &plo->lockout_tree;
> +		lockout_bit = PLOOP_REQ_LOCKOUT;
> +	}
>
> -	if (test_bit(PLOOP_REQ_LOCKOUT, &preq->state))
> +	if (test_bit(lockout_bit, &preq->state))
>  		return 0;
>
> +	p = &tree->rb_node;
>  	while (*p) {
>  		parent = *p;
> -		pr = rb_entry(parent, struct ploop_request, lockout_link);
> +		if (pb)
> +			pr = rb_entry(parent, struct ploop_request,
> +				      lockout_pb_link);
> +		else
> +			pr = rb_entry(parent, struct ploop_request,
> +				      lockout_link);
>
>  		if (preq->req_cluster == pr->req_cluster) {
>  			if (try)
> @@ -1174,23 +1211,56 @@ int ploop_add_lockout(struct ploop_request *preq, int try)
>
>  	trace_add_lockout(preq);
>
> -	rb_link_node(&preq->lockout_link, parent, p);
> -	rb_insert_color(&preq->lockout_link, &plo->lockout_tree);
> -	__set_bit(PLOOP_REQ_LOCKOUT, &preq->state);
> +	rb_link_node(link, parent, p);
> +	rb_insert_color(link, tree);
> +	__set_bit(lockout_bit, &preq->state);
>  	return 0;
>  }
> +
> +int ploop_add_lockout(struct ploop_request *preq, int try)
> +{
> +	return __ploop_add_lockout(preq, try, false);
> +}
> EXPORT_SYMBOL(ploop_add_lockout);
>
> -void del_lockout(struct ploop_request *preq)
> +static void ploop_add_pb_lockout(struct ploop_request *preq)
> +{
> +	__ploop_add_lockout(preq, 0, true);
> +}
> +
> +static void __del_lockout(struct ploop_request *preq, bool pb)
> {
>  	struct ploop_device * plo = preq->plo;
> +	struct rb_node *link;
> +	struct rb_root *tree;
> +	int lockout_bit;
> +
Re: [Devel] [vzlin-dev] [PATCH rh7 1/2] ploop: push_backup: roll back ALLOW_READS patch
Maxim Patlasov writes:

> The patch reverts:
>
>   Subject: [PATCH rh7] ploop: push_backup must pass READs intact
>
> If push_backup is in progress (doesn't matter "full" or "incremental")
> and the ploop state-machine detects an incoming WRITE request to a
> cluster-block that was not push_backup-ed yet, it suspends the request
> until userspace reports it as "processed".
>
> The above is fine, but while such a WRITE request is suspended, only
> subsequent WRITEs (to the given cluster-block) must be suspended too.
> READs must not. Otherwise the userspace backup tool will be blocked
> infinitely trying to push_backup the given cluster-block.
>
> Passing READs while blocking WRITEs must be OK because: 1) ploop has
> not finalized that first WRITE yet; 2) the given cluster-block will be
> kept intact (non-modified) while the WRITE is suspended.
>
> https://jira.sw.ru/browse/PSBM-46775

ACK

> Signed-off-by: Maxim Patlasov
> ---
>  drivers/block/ploop/dev.c   | 7 ---
>  include/linux/ploop/ploop.h | 1 -
>  2 files changed, 8 deletions(-)
>
> diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
> index 96f7850..d3f0ec0 100644
> --- a/drivers/block/ploop/dev.c
> +++ b/drivers/block/ploop/dev.c
> @@ -1137,11 +1137,6 @@ static int check_lockout(struct ploop_request *preq)
>  		else if (preq->req_cluster > p->req_cluster)
>  			n = n->rb_right;
>  		else {
> -			/* do not block backup tool READs from /dev/ploop */
> -			if (!(preq->req_rw & REQ_WRITE) &&
> -			    test_bit(PLOOP_REQ_ALLOW_READS, &p->state))
> -				return 0;
> -
>  			list_add_tail(&preq->list, &p->delay_list);
>  			plo->st.bio_lockouts++;
>  			trace_preq_lockout(preq, p);
> @@ -2053,7 +2048,6 @@ restart:
>  		ploop_pb_clear_bit(plo->pbd, preq->req_cluster);
>  	} else {
>  		spin_lock_irq(&plo->lock);
> -		__set_bit(PLOOP_REQ_ALLOW_READS, &preq->state);
>  		ploop_add_lockout(preq, 0);
>  		spin_unlock_irq(&plo->lock);
>  		/*
> @@ -2072,7 +2066,6 @@ restart:
>
>  	spin_lock_irq(&plo->lock);
>  	del_lockout(preq);
> -	__clear_bit(PLOOP_REQ_ALLOW_READS, &preq->state);
>  	if (!list_empty(&preq->delay_list))
>  		list_splice_init(&preq->delay_list, plo->ready_queue.prev);
>  	spin_unlock_irq(&plo->lock);
> diff --git a/include/linux/ploop/ploop.h b/include/linux/ploop/ploop.h
> index 0fba25e..77fd833 100644
> --- a/include/linux/ploop/ploop.h
> +++ b/include/linux/ploop/ploop.h
> @@ -470,7 +470,6 @@ enum
>  	PLOOP_REQ_KAIO_FSYNC,	/* force image fsync by KAIO module */
>  	PLOOP_REQ_POST_SUBMIT,	/* preq needs post_submit processing */
>  	PLOOP_REQ_PUSH_BACKUP,	/* preq was ACKed by userspace push_backup */
> -	PLOOP_REQ_ALLOW_READS,	/* READs are allowed for given req_cluster */
>  	PLOOP_REQ_FSYNC_DONE,	/* fsync_thread() performed f_op->fsync() */
>  };
Re: [Devel] [PATCH rh7] sched/core/cfs: don't reset nr_cpus while setting cpu limits
On Tue, Jun 07, 2016 at 04:50:38PM +0300, Andrey Ryabinin wrote:
> Setting cpu limits resets the number of cpus:
>
>   # echo 2 > /sys/fs/cgroup/cpu,cpuacct/101/cpu.nr_cpus
>   # vzctl exec 101 cat /proc/cpuinfo | grep -c processor
>   2
>   # echo 16 > /sys/fs/cgroup/cpu,cpuacct/101/cpu.cfs_quota_us
>   # vzctl exec 101 cat /proc/cpuinfo | grep -c processor
>   4
>   # cat /sys/fs/cgroup/cpu,cpuacct/101/cpu.nr_cpus
>   0
>
> tg_update_cpu_limit() does that without any apparent reason,
> so let's fix it.
>
> https://jira.sw.ru/browse/PSBM-48061
>
> Signed-off-by: Andrey Ryabinin
> ---
>  kernel/sched/core.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2c147c8..51ebed2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8696,7 +8696,6 @@ static void tg_update_cpu_limit(struct task_group *tg)
>  	}
>
>  	tg->cpu_rate = rate;
> -	tg->nr_cpus = 0;

This is incorrect. Suppose nr_cpus = 2 and you set cfs_quota to 4 * cfs_period. If you don't reset nr_cpus, you'll get a cpu limit equal to 400, although it should be min(nr_cpus * 100, cpu_rate) = 200.

>  }
>
>  static int tg_set_cpu_limit(struct task_group *tg,