[Devel] [PATCH rh7 2/2] ploop: push_backup: implement auto destroy
If the userspace backup tool dies unexpectedly (or is killed intentionally),
ploop detects that the last reference to /dev/ploop is gone, aborts
push_backup, and releases the associated resources (pbd and friends).

https://jira.sw.ru/browse/PSBM-45000

Signed-off-by: Maxim Patlasov
---
 drivers/block/ploop/dev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
index a560734..c4d2bc1 100644
--- a/drivers/block/ploop/dev.c
+++ b/drivers/block/ploop/dev.c
@@ -2869,6 +2869,7 @@ static void ploop_release(struct gendisk *disk, fmode_t fmode)
 	mutex_lock(>ctl_mutex);
 	if (atomic_dec_and_test(>open_count)) {
+		ploop_pb_destroy(plo, NULL);
 		ploop_tracker_stop(plo, 1);
 		plo->bdev = NULL;
 	}
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
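The release path in this patch can be modelled in userspace as follows. This is only a sketch under stated assumptions: `struct toy_dev`, `toy_open()`, `toy_release()` and `toy_pb_destroy()` are hypothetical stand-ins, not ploop symbols, and C11 atomics replace the kernel's `atomic_dec_and_test()`. It shows that only the closer that drops the last reference tears the backup state down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ploop_device; not the real layout. */
struct toy_dev {
	atomic_int open_count;  /* number of open handles on the device */
	bool backup_active;     /* models the PLOOP_S_PUSH_BACKUP state bit */
	int destroy_calls;      /* counts teardowns, for demonstration only */
};

/* Models ploop_pb_destroy(plo, NULL): fails if no backup is running. */
static int toy_pb_destroy(struct toy_dev *dev)
{
	if (!dev->backup_active)
		return -1;      /* the real function returns -EINVAL here */
	dev->backup_active = false;
	dev->destroy_calls++;
	return 0;
}

static void toy_open(struct toy_dev *dev)
{
	atomic_fetch_add(&dev->open_count, 1);
}

/* Models ploop_release(): only the last closer destroys the backup. */
static void toy_release(struct toy_dev *dev)
{
	int prev = atomic_fetch_sub(&dev->open_count, 1);

	assert(prev > 0);       /* release without a matching open is a bug */
	if (prev == 1)
		(void)toy_pb_destroy(dev);  /* result ignored, as in the patch */
}
```

The return value is deliberately ignored in the release path: when no backup was running, the failed destroy is harmless.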
[Devel] [PATCH rh7 0/2] ploop: implement push_backup auto destroy
If the backup tool dies or is killed, push_backup cannot continue anyway,
because the next instance of the backup tool will not know which blocks
remain to be pushed. So it is useless to keep the in-kernel push_backup
state after the userspace part has disappeared.

The patch set implements auto-destroy: ploop detects when userspace goes
away, then aborts and releases the in-kernel push_backup.

---

Maxim Patlasov (2):
      ploop: push_backup: factor out destroy
      ploop: push_backup: implement auto destroy

 drivers/block/ploop/dev.c         | 14 ++
 drivers/block/ploop/push_backup.c | 22 ++
 drivers/block/ploop/push_backup.h |  2 ++
 3 files changed, 26 insertions(+), 12 deletions(-)
[Devel] [PATCH rh7 1/2] ploop: push_backup: factor out destroy
The patch is a minor code rearrangement; no logic is changed. The new
function ploop_pb_destroy() will be used by the next patch.

https://jira.sw.ru/browse/PSBM-45000

Signed-off-by: Maxim Patlasov
---
 drivers/block/ploop/dev.c         | 13 +
 drivers/block/ploop/push_backup.c | 22 ++
 drivers/block/ploop/push_backup.h |  2 ++
 3 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
index 6058449..a560734 100644
--- a/drivers/block/ploop/dev.c
+++ b/drivers/block/ploop/dev.c
@@ -4687,18 +4687,7 @@ static int ploop_push_backup_stop(struct ploop_device *plo, unsigned long arg)
 		return -EINVAL;
 	}
-	if (!test_and_clear_bit(PLOOP_S_PUSH_BACKUP, >state))
-		return -EINVAL;
-
-	BUG_ON (!pbd);
-	ctl.status = ploop_pb_stop(pbd);
-
-	ploop_quiesce(plo);
-	ploop_pb_fini(plo->pbd);
-	plo->maintenance_type = PLOOP_MNTN_OFF;
-	ploop_relax(plo);
-
-	return 0;
+	return ploop_pb_destroy(plo, );
 }
 static int ploop_ioctl(struct block_device *bdev, fmode_t fmode, unsigned int cmd,
diff --git a/drivers/block/ploop/push_backup.c b/drivers/block/ploop/push_backup.c
index 10fd55a..50b776c 100644
--- a/drivers/block/ploop/push_backup.c
+++ b/drivers/block/ploop/push_backup.c
@@ -567,3 +567,25 @@ void ploop_pb_put_reported(struct ploop_pushbackup_desc *pbd,
 		spin_unlock_irq(>lock);
 	}
 }
+
+int ploop_pb_destroy(struct ploop_device *plo, __u32 *status)
+{
+	struct ploop_pushbackup_desc *pbd = plo->pbd;
+	unsigned long ret;
+
+	if (!test_and_clear_bit(PLOOP_S_PUSH_BACKUP, >state))
+		return -EINVAL;
+
+	BUG_ON (!pbd);
+	ret = ploop_pb_stop(pbd);
+
+	if (status)
+		*status = ret;
+
+	ploop_quiesce(plo);
+	ploop_pb_fini(plo->pbd);
+	plo->maintenance_type = PLOOP_MNTN_OFF;
+	ploop_relax(plo);
+
+	return 0;
+}
diff --git a/drivers/block/ploop/push_backup.h b/drivers/block/ploop/push_backup.h
index 476ac53..cfb1138 100644
--- a/drivers/block/ploop/push_backup.h
+++ b/drivers/block/ploop/push_backup.h
@@ -17,3 +17,5 @@ bool ploop_pb_check_bit(struct ploop_pushbackup_desc *pbd, cluster_t clu);
 int ploop_pb_preq_add_pending(struct ploop_pushbackup_desc *pbd,
 			       struct ploop_request *preq);
+
+int ploop_pb_destroy(struct ploop_device *plo, __u32 *status);
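The calling convention of the factored-out function, an optional status out-parameter plus a test-and-clear guard, can be sketched in isolation. The names below (`struct toy_dev`, `toy_pb_destroy()`, `stop_status`) are hypothetical and only mirror the shape of ploop_pb_destroy(); the plain bool stands in for the atomic test_and_clear_bit() of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state touched by the patch. */
struct toy_dev {
	bool backup_active;    /* models the PLOOP_S_PUSH_BACKUP bit */
	uint32_t stop_status;  /* what a ploop_pb_stop() analogue would report */
};

/*
 * Mirrors the shape of ploop_pb_destroy(plo, status):
 *  - fails if no backup is in progress (the real code returns -EINVAL),
 *  - reports the stop status only when the caller passed a pointer,
 *  - cannot run twice, because the flag is cleared on first use.
 */
static int toy_pb_destroy(struct toy_dev *dev, uint32_t *status)
{
	assert(dev != NULL);

	if (!dev->backup_active)
		return -1;
	dev->backup_active = false;

	if (status)
		*status = dev->stop_status;

	return 0;
}
```

An ioctl-style caller passes a pointer to receive the status, while a release-style caller passes NULL because nobody is left to read it. A second call fails, which is what makes calling it unconditionally from the release path safe.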
[Devel] [NEW KERNEL] 3.10.0-327.10.1.vz7.12.18 (rhel7)
Changelog:

OpenVZ kernel rh7-3.10.0-327.10.1.vz7.12.18

* kmod: added more kernel modules into autoload whitelist - for CRIU
* cbt: block_max calculation corrected
* ploop: fix reentrance in ploop_pb_get_pending()
* ploop: do not block READ requests even in case cluster is blocked for WRITE

Generated changelog:

* Wed May 11 2016 Konstantin Khorenko [3.10.0-327.10.1.vz7.12.18]
- ploop: push_backup must pass READs intact (Maxim Patlasov) [PSBM-46775]
- cbt: fix cbt->block_max calculation (Maxim Patlasov)
- ploop: push_backup: fix reentrance in ploop_pb_get_pending() (Dmitry Monakhov) [PSBM-45000]
- ve/kmod: Add modules to whitelist for c/r sake (Cyrill Gorcunov) [PSBM-46789]

Built packages: http://kojistorage.eng.sw.ru/packages/vzkernel/3.10.0/327.10.1.vz7.12.18/
[Devel] [PATCH RHEL7 COMMIT] ploop: push_backup must pass READs intact
The commit is pushed to "branch-rh7-3.10.0-327.10.1.vz7.12.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.10.1.vz7.12.17
-->
commit 8193d31f2de13045c14475e949af2e48312d13e3
Author: Maxim Patlasov
Date:   Tue May 10 20:37:00 2016 +0400

    ploop: push_backup must pass READs intact

    If push_backup is in progress (no matter whether "full" or
    "incremental") and the ploop state machine detects an incoming WRITE
    request to a cluster-block that was not push_backup-ed yet, it suspends
    the request until userspace reports the block as "processed".

    The above is fine, but while such a WRITE request is suspended, only
    subsequent WRITEs (to the given cluster-block) must be suspended too.
    READs must not. Otherwise the userspace backup tool will be blocked
    indefinitely while trying to push_backup the given cluster-block.

    Passing READs while blocking WRITEs must be OK because: 1) ploop has
    not finalized that first WRITE yet; 2) the given cluster-block will be
    kept intact (non-modified) while the WRITE is suspended.

    https://jira.sw.ru/browse/PSBM-46775

    Signed-off-by: Maxim Patlasov
---
 drivers/block/ploop/dev.c   | 7 +++
 include/linux/ploop/ploop.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
index c7cc385..6058449 100644
--- a/drivers/block/ploop/dev.c
+++ b/drivers/block/ploop/dev.c
@@ -1137,6 +1137,11 @@ static int check_lockout(struct ploop_request *preq)
 		else if (preq->req_cluster > p->req_cluster)
 			n = n->rb_right;
 		else {
+			/* do not block backup tool READs from /dev/ploop */
+			if (!(preq->req_rw & REQ_WRITE) &&
+			    test_bit(PLOOP_REQ_ALLOW_READS, >state))
+				return 0;
+
 			list_add_tail(>list, >delay_list);
 			plo->st.bio_lockouts++;
 			trace_preq_lockout(preq, p);
@@ -2030,6 +2035,7 @@ restart:
 			ploop_pb_clear_bit(plo->pbd, preq->req_cluster);
 		} else {
 			spin_lock_irq(>lock);
+			__set_bit(PLOOP_REQ_ALLOW_READS, >state);
 			ploop_add_lockout(preq, 0);
 			spin_unlock_irq(>lock);
 			/*
@@ -2048,6 +2054,7 @@ restart:
 			spin_lock_irq(>lock);
 			del_lockout(preq);
+			__clear_bit(PLOOP_REQ_ALLOW_READS, >state);
 			if (!list_empty(>delay_list))
 				list_splice_init(>delay_list, plo->ready_queue.prev);
 			spin_unlock_irq(>lock);
diff --git a/include/linux/ploop/ploop.h b/include/linux/ploop/ploop.h
index 762d2fd..ad36a91 100644
--- a/include/linux/ploop/ploop.h
+++ b/include/linux/ploop/ploop.h
@@ -465,6 +465,7 @@ enum
 	PLOOP_REQ_KAIO_FSYNC,	/*force image fsync by KAIO module */
 	PLOOP_REQ_POST_SUBMIT,	/* preq needs post_submit processing */
 	PLOOP_REQ_PUSH_BACKUP,	/* preq was ACKed by userspace push_backup */
+	PLOOP_REQ_ALLOW_READS,	/* READs are allowed for given req_cluster */
 };
 enum
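The lockout decision described in this commit can be sketched as a pure function. This is a simplified model, not the driver code: the real check_lockout() walks an rb-tree of lockouts, whereas the hypothetical `toy_check_lockout()` below compares an incoming request against a single lockout holder, and `struct toy_req` with its fields is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_verdict { TOY_PASS, TOY_DELAY };

/* Hypothetical, reduced stand-in for struct ploop_request. */
struct toy_req {
	unsigned long cluster;  /* models req_cluster */
	bool is_write;          /* models (req_rw & REQ_WRITE) */
	bool allow_reads;       /* models PLOOP_REQ_ALLOW_READS on the holder */
};

/*
 * holder:   the suspended request that currently locks its cluster,
 *           or NULL if nothing is locked out.
 * incoming: the new request being checked.
 */
static enum toy_verdict toy_check_lockout(const struct toy_req *holder,
					  const struct toy_req *incoming)
{
	assert(incoming != NULL);

	/* No lockout, or a lockout on a different cluster: proceed. */
	if (holder == NULL || holder->cluster != incoming->cluster)
		return TOY_PASS;

	/* Same cluster: READs pass if the holder allows them. */
	if (!incoming->is_write && holder->allow_reads)
		return TOY_PASS;

	/* Everything else waits until the lockout is removed. */
	return TOY_DELAY;
}
```

With a suspended WRITE holding cluster 7 and the flag set, a READ of cluster 7 passes, so the backup tool can still read the block, while a second WRITE to cluster 7 is delayed.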
[Devel] [PATCH RHEL7 COMMIT] ve/kmod: Add modules to whitelist for c/r sake
The commit is pushed to "branch-rh7-3.10.0-327.10.1.vz7.12.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.10.1.vz7.12.17
-->
commit e0914131eeb08e6b1953c682be05b9fbcf185f1f
Author: Cyrill Gorcunov
Date:   Tue May 10 20:19:31 2016 +0400

    ve/kmod: Add modules to whitelist for c/r sake

    When doing checkpoint/restore during migration we use netlink
    sockets with diag functionality to fetch various information
    we need. In particular, when restoring on a machine where, say,
    the netfilter modules are not loaded, we fail with

     | [root@s175 ~]# less /vz/dump/rst-iVS9OC-16.05.04-22.32/criu_restore.11.log
     | (00.151066)      1: Running ip addr restore
     | RTNETLINK answers: File exists
     | RTNETLINK answers: File exists
     | (00.152641)      1: Running ip route restore
     | (00.175144)      1: Running ip route restore
     | (00.184676)      1: Running ip rule delete
     | (00.186448)      1: Running ip rule delete
     | (00.188191)      1: Running ip rule delete
     | (00.190054)      1: Running ip rule restore
     | (00.191964)      1: Running iptables-restore for iptables-restore
     | (00.200958)      1: Running ip6tables-restore for ip6tables-restore
     | >(00.203833)      1: Error (net.c:466): Can't open rtnl sock for net dump: Protocol not supported
     | (00.229107) Error (cr-restore.c:1407): 15091 killed by signal 9: Killed
     | (00.229192) Switching to new ns to clean ghosts
     | (00.241142) uns: calling exit_usernsd (-1, 1)
     | (00.241173) uns: daemon calls 0x454950 (15085, -1, 1)
     | (00.241188) uns: `- daemon exits w/ 0
     | (00.241570) uns: daemon stopped
     | (00.241584) Error (cr-restore.c:2248): Restoring FAILED

    which corresponds to the following criu code

     | sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_NETFILTER);
     | if (sk < 0) {
     |         pr_perror("Can't open rtnl sock for net dump");
     |         goto out_img;
     | }

    We fail because the nfnetlink module is not loaded on the
    destination machine. If we were running on the node, the module
    would be loaded automatically, but restore happens in veX context
    where modules can't be loaded.

    Thus add the modules needed for c/r into the whitelist, so that they
    get loaded automatically when criu needs them.

    https://jira.sw.ru/browse/PSBM-46789

    CC: Vladimir Davydov
    CC: Konstantin Khorenko
    CC: Andrey Vagin
    CC: Pavel Emelyanov
    Signed-off-by: Cyrill Gorcunov
---
 kernel/kmod.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 26b0c33..8df0959 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -377,7 +377,16 @@ static inline int module_payload_iptable_allowed(const char *module)
 /* ve0 allowed modules */
 static const char * const ve0_allowed_mod[] = {
-	"binfmt_misc"
+	"binfmt_misc",
+	"netlink_diag",
+	"inet_diag",
+	"tcp_diag",
+	"udp_diag",
+	"unix_diag",
+	"af_packet_diag",
+	"nfnetlink",
+	"nf_conntrack",
+	"nf_conntrack_netlink",
 };
 /*
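A whitelist of this kind is typically consulted with a linear scan over the array. The sketch below assumes an exact string match, which the patch itself does not show; `toy_allowed_mod` and `toy_module_allowed()` are hypothetical names, and only the module list is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Module names copied from the patch; the array name is hypothetical. */
static const char * const toy_allowed_mod[] = {
	"binfmt_misc",
	"netlink_diag",
	"inet_diag",
	"tcp_diag",
	"udp_diag",
	"unix_diag",
	"af_packet_diag",
	"nfnetlink",
	"nf_conntrack",
	"nf_conntrack_netlink",
};

/* Returns true if 'module' matches a whitelist entry exactly (assumption). */
static bool toy_module_allowed(const char *module)
{
	size_t n = sizeof(toy_allowed_mod) / sizeof(toy_allowed_mod[0]);

	assert(module != NULL);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(toy_allowed_mod[i], module) == 0)
			return true;
	}
	return false;
}
```

With exact matching, "nfnetlink" is accepted while a longer name such as "nfnetlink_log" is not, so every module that restore needs has to be listed explicitly.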
[Devel] [PATCH RHEL7 COMMIT] cbt: fix cbt->block_max calculation
The commit is pushed to "branch-rh7-3.10.0-327.10.1.vz7.12.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.10.1.vz7.12.17
-->
commit c16aa030e3245b595bba98d67c926f0a6f575752
Author: Maxim Patlasov
Date:   Tue May 10 20:26:29 2016 +0400

    cbt: fix cbt->block_max calculation

    When the size of the block device is a multiple of the CBT blocksize,
    the following:

    > cbt->block_max = (size + blocksize) >> cbt->block_bits;

    is incorrect. This may end up allocating one extra page in cbt->map
    and also makes various checks against cbt->block_max error-prone.

    Signed-off-by: Maxim Patlasov
    Acked-by: Dmitry Monakhov
---
 block/blk-cbt.c                   | 2 +-
 drivers/block/ploop/push_backup.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-cbt.c b/block/blk-cbt.c
index 8c52bd8..8cdf1d6 100644
--- a/block/blk-cbt.c
+++ b/block/blk-cbt.c
@@ -252,7 +252,7 @@ static struct cbt_info* do_cbt_alloc(struct request_queue *q, __u8 *uuid,
 		return ERR_PTR(-ENOMEM);
 	cbt->block_bits = ilog2(blocksize);
-	cbt->block_max = (size + blocksize) >> cbt->block_bits;
+	cbt->block_max = (size + blocksize - 1) >> cbt->block_bits;
 	spin_lock_init(>lock);
 	memcpy(cbt->uuid, uuid, sizeof(cbt->uuid));
 	cbt->cache = alloc_percpu(struct cbt_extent);
diff --git a/drivers/block/ploop/push_backup.c b/drivers/block/ploop/push_backup.c
index 056918c..6323243 100644
--- a/drivers/block/ploop/push_backup.c
+++ b/drivers/block/ploop/push_backup.c
@@ -175,7 +175,7 @@ bool ploop_pb_check_bit(struct ploop_pushbackup_desc *pbd, cluster_t clu)
 static int convert_map_to_map(struct ploop_pushbackup_desc *pbd)
 {
 	struct page **from_map = pbd->cbt_map;
-	blkcnt_t from_max = pbd->cbt_block_max - 1;
+	blkcnt_t from_max = pbd->cbt_block_max;
 	blkcnt_t from_bits = pbd->cbt_block_bits;
 	struct page **to_map = pbd->ppb_map;
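The off-by-one can be checked with small numbers. The sketch below puts the two formulas from the diff side by side; the function names are hypothetical, and `size` is simply taken to be in the same unit as the block size. For a block size of 64 (block_bits = 6) and size = 1024, the old formula gives (1024 + 64) >> 6 = 17 while the fixed one gives (1024 + 63) >> 6 = 16, the true number of blocks. For size = 1000, which is not a multiple of 64, both give 16.

```c
#include <assert.h>
#include <stdint.h>

/* Old formula from the diff: over-counts when size is a multiple of blocksize. */
static uint64_t toy_block_max_old(uint64_t size, unsigned int block_bits)
{
	assert(block_bits < 64);
	uint64_t blocksize = UINT64_C(1) << block_bits;

	return (size + blocksize) >> block_bits;
}

/* Fixed formula from the diff: round-up division by a power of two. */
static uint64_t toy_block_max_new(uint64_t size, unsigned int block_bits)
{
	assert(block_bits < 64);
	uint64_t blocksize = UINT64_C(1) << block_bits;

	return (size + blocksize - 1) >> block_bits;
}
```

Adding blocksize - 1 before shifting is the usual way to round up a division by a power of two: it pushes any partial block over the boundary but leaves an exact multiple unchanged.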
[Devel] [PATCH RHEL7 COMMIT] ploop: push_backup: fix reentrance in ploop_pb_get_pending()
The commit is pushed to "branch-rh7-3.10.0-327.10.1.vz7.12.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.10.1.vz7.12.17
-->
commit 7cd35b576e4e45248df9ad7728f207fd1fc5a462
Author: Dmitry Monakhov
Date:   Tue May 10 20:22:50 2016 +0400

    ploop: push_backup: fix reentrance in ploop_pb_get_pending()

    The patch implements what Dima Monakhov suggested:

    > AFAIU you have a re-entrance issue if several tasks want performs ioctls
    > task1:ioctl->wait
    > task2:ioctl->wait
    >
    > Just change wait sequence like this and you are safe:
    > /* blocking case */
    > if (unlikely(pbd->ppb_waiting))
    >         /* Other task is already waitng for event */
    >         err = -EBUSY;
    >         goto get_pending_unlock;
    > }
    > pbd->ppb_waiting = true;
    > spin_unlock(>ppb_lock);
    > mutex_unlock(>ctl_mutex);

    Fixes commit adcff732cabc43be32649055afd8a1aed41c63d9 ("ploop: implement
    PLOOP_IOC_PUSH_BACKUP_IO").

    https://jira.sw.ru/browse/PSBM-45000

    Signed-off-by: Maxim Patlasov
---
 drivers/block/ploop/push_backup.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/block/ploop/push_backup.c b/drivers/block/ploop/push_backup.c
index 05af67c..056918c 100644
--- a/drivers/block/ploop/push_backup.c
+++ b/drivers/block/ploop/push_backup.c
@@ -466,6 +466,11 @@ int ploop_pb_get_pending(struct ploop_pushbackup_desc *pbd,
 	}
 	/* blocking case */
+	if (unlikely(pbd->ppb_waiting)) {
+		/* Other task is already waiting for event */
+		err = -EBUSY;
+		goto get_pending_unlock;
+	}
 	pbd->ppb_waiting = true;
 	spin_unlock(>ppb_lock);
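The guard added here is a single-waiter flag: the first task to block sets it, and any later task that finds it set gets -EBUSY instead of also going to sleep. A minimal single-threaded sketch follows, with hypothetical names (`struct toy_pbd`, `toy_begin_wait()`, `toy_end_wait()`); in the driver the check and the assignment happen under a spinlock, which this model leaves out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the waiter state in the push_backup descriptor. */
struct toy_pbd {
	bool waiting;  /* models ppb_waiting: a task is already blocked */
};

/*
 * Claim the single waiter slot. Returns 0 on success, -EBUSY if another
 * task already holds it. The real code does this with the lock held, so
 * the test and the assignment cannot be interleaved between two tasks.
 */
static int toy_begin_wait(struct toy_pbd *pbd)
{
	if (pbd->waiting)
		return -EBUSY;
	pbd->waiting = true;
	return 0;
}

/* Give the slot back once the waiter has been woken up. */
static void toy_end_wait(struct toy_pbd *pbd)
{
	assert(pbd->waiting);  /* releasing an unclaimed slot is a bug */
	pbd->waiting = false;
}
```

Refusing the second waiter keeps the wake-up logic simple: there is never more than one sleeper to account for.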
Re: [Devel] [PATCH rh7] kmod: Add modules to whitelist for c/r sake
On Thu, May 05, 2016 at 07:13:11PM +0300, Cyrill Gorcunov wrote:
> When doing checkpoint/restore during migration we use netlink
> sockets with diag functionality to fetch various information
> we need. In particular when restoring on the machine where
> say netfilter modules are not loaded we fail with

Ping?
Re: [Devel] [PATCH rh7] cbt: fix cbt->block_max calculation
Maxim Patlasov writes:

> When the size of block device is multiple of CBT blocksize, the following:
>
>> cbt->block_max = (size + blocksize) >> cbt->block_bits;

Pure typo fix. ACK.

>
> is incorrect. This may end up in allocating one extra page in cbt->map and
> also make various checks with cbt->block_max prone to error.
>
> Signed-off-by: Maxim Patlasov
> ---
>  block/blk-cbt.c                   | 2 +-
>  drivers/block/ploop/push_backup.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-cbt.c b/block/blk-cbt.c
> index 8c52bd8..8cdf1d6 100644
> --- a/block/blk-cbt.c
> +++ b/block/blk-cbt.c
> @@ -252,7 +252,7 @@ static struct cbt_info* do_cbt_alloc(struct request_queue
> *q, __u8 *uuid,
>  		return ERR_PTR(-ENOMEM);
>
>  	cbt->block_bits = ilog2(blocksize);
> -	cbt->block_max = (size + blocksize) >> cbt->block_bits;
> +	cbt->block_max = (size + blocksize - 1) >> cbt->block_bits;
>  	spin_lock_init(>lock);
>  	memcpy(cbt->uuid, uuid, sizeof(cbt->uuid));
>  	cbt->cache = alloc_percpu(struct cbt_extent);
> diff --git a/drivers/block/ploop/push_backup.c
> b/drivers/block/ploop/push_backup.c
> index 05af67c..4d671a5 100644
> --- a/drivers/block/ploop/push_backup.c
> +++ b/drivers/block/ploop/push_backup.c
> @@ -175,7 +175,7 @@ bool ploop_pb_check_bit(struct ploop_pushbackup_desc
> *pbd, cluster_t clu)
>  static int convert_map_to_map(struct ploop_pushbackup_desc *pbd)
>  {
>  	struct page **from_map = pbd->cbt_map;
> -	blkcnt_t from_max = pbd->cbt_block_max - 1;
> +	blkcnt_t from_max = pbd->cbt_block_max;
>  	blkcnt_t from_bits = pbd->cbt_block_bits;
>
>  	struct page **to_map = pbd->ppb_map;