Re: [PATCH v2 6/9] hw/scsi: Rename SCSIRequest::resid as 'residual'
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: From: Philippe Mathieu-Daudé The 'resid' field is slightly confusing and could be interpreted as some ID. Rename it as 'residual' which is clearer to review. No logical change. Signed-off-by: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé --- include/hw/scsi/scsi.h | 4 ++-- hw/scsi/megasas.c | 42 +- hw/scsi/scsi-bus.c | 10 +- hw/scsi/scsi-disk.c| 4 ++-- softmmu/dma-helpers.c | 26 +- 5 files changed, 47 insertions(+), 39 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 5/9] hw/rdma/rdma_utils: Rename rdma_pci_dma_map 'len' argument
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: From: Philippe Mathieu-Daudé Various APIs use 'pval' naming for 'pointer to val'. rdma_pci_dma_map() uses 'plen' for 'PCI length', but since 'PCI' is already explicit in the function name, simplify and rename the argument 'len'. No logical change. Signed-off-by: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé --- hw/rdma/rdma_utils.h | 2 +- hw/rdma/rdma_utils.c | 14 +++--- 2 files changed, 8 insertions(+), 8 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 7/9] hw/dma: Fix format string issues using dma_addr_t
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: Signed-off-by: Philippe Mathieu-Daudé --- hw/ide/ahci.c| 2 +- hw/rdma/trace-events | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 4/9] hw/dma: Remove CONFIG_USER_ONLY check
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: From: Philippe Mathieu-Daudé DMA API should not be included in user-mode emulation. If so, build should fail. Remove the CONFIG_USER_ONLY check. Signed-off-by: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 3 --- 1 file changed, 3 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 3/9] hw/pci: Document pci_dma_map()
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: From: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 12 1 file changed, 12 insertions(+) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 5b36334a28a..07f08aa0626 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -876,6 +876,18 @@ PCI_DMA_DEFINE_LDST(q_be, q_be, 64); #undef PCI_DMA_DEFINE_LDST +/** + * pci_dma_map: Map device PCI address space range into host virtual address + * + * May map a subset of the requested range, given by and returned in @plen. + * May return %NULL and set *@plen to zero(0), if resources needed to perform + * the mapping are exhausted. + * + * @dev: #PCIDevice to be accessed + * @addr: address within that device's address space + * @plen: pointer to length of buffer; updated on return + * @dir: indicates the transfer direction + */ As Peter recently mentioned, the ** format is off. The description goes below the arguments. Otherwise, Reviewed-by: Richard Henderson r~
Re: [PATCH v2 2/9] hw/pci: Restrict pci-bus stub to sysemu
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: From: Philippe Mathieu-Daudé Neither tools nor user-mode emulation require the PCI bus stub. Signed-off-by: Philippe Mathieu-Daudé Signed-off-by: Philippe Mathieu-Daudé --- stubs/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 1/9] hw/nvram: Restrict stub to sysemu and tools
On 1/4/22 12:54 AM, Philippe Mathieu-Daudé wrote: +if have_system or have_tools ... if have_system + stub_ss.add(files('fw_cfg.c')) Disconnect in tests? r~
Re: [PATCH v2 1/2] qemu-storage-daemon: Add vhost-user-blk help
On Thu, Dec 23, 2021 at 11:14:25AM +0100, Philippe Mathieu-Daudé wrote: > Add missing vhost-user-blk help: > > $ qemu-storage-daemon -h > ... > --export [type=]vhost-user-blk,id=,node-name=, > addr.type=unix,addr.path=[,writable=on|off] > [,logical-block-size=][,num-queues=] >export the specified block node as a >vhosts-user-blk device over UNIX domain socket Why 'vhosts-' here instead of 'vhost'? > --export [type=]vhost-user-blk,id=,node-name=, > fd,addr.str=[,writable=on|off] > [,logical-block-size=][,num-queues=] >export the specified block node as a >vhosts-user-blk device over file descriptor here too. > ... > > Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API") > Reported-by: Qing Wang > Signed-off-by: Philippe Mathieu-Daudé > --- > storage-daemon/qemu-storage-daemon.c | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/storage-daemon/qemu-storage-daemon.c > b/storage-daemon/qemu-storage-daemon.c > index 52cf17e8ace..0c19e128e3f 100644 > --- a/storage-daemon/qemu-storage-daemon.c > +++ b/storage-daemon/qemu-storage-daemon.c > @@ -104,6 +104,19 @@ static void help(void) > " export the specified block node over FUSE\n" > "\n" > #endif /* CONFIG_FUSE */ > +#ifdef CONFIG_VHOST_USER_BLK_SERVER > +" --export [type=]vhost-user-blk,id=,node-name=,\n" > +" addr.type=unix,addr.path=[,writable=on|off]\n" > +" [,logical-block-size=][,num-queues=]\n" > +" export the specified block node as a\n" > +" vhosts-user-blk device over UNIX domain socket\n" This... > +" --export [type=]vhost-user-blk,id=,node-name=,\n" > +" fd,addr.str=[,writable=on|off]\n" > +" [,logical-block-size=][,num-queues=]\n" > +" export the specified block node as a\n" > +" vhosts-user-blk device over file descriptor\n" ...and this line would need the same tweak. If you agree that it's a typo, then let's fix it, and you can have Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH V3] block/rbd: implement bdrv_co_block_status
>> Nit: I would rename ret to status or something like that to make >>>>>>> it clear(er) that it is an actual value and never an error. Or, >>>>>>> even better, drop it entirely and return one of the two bitmasks >>>>>>> directly. >>>>>>> >>>>>>>> +struct rbd_diff_req req = { .offs = offset }; >>>>>>>> +uint64_t features, flags; >>>>>>>> + >>>>>>>> +assert(offset + bytes <= s->image_size); >>>>>>>> + >>>>>>>> +/* default to all sectors allocated */ >>>>>>>> +ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID; >>>>>>>> +*map = offset; >>>>>>>> +*file = bs; >>>>>>>> +*pnum = bytes; >>>>>>>> + >>>>>>>> +/* check if RBD image supports fast-diff */ >>>>>>>> +r = rbd_get_features(s->image, &features); >>>>>>>> +if (r < 0) { >>>>>>>> +goto out; >>>>>>>> +} >>>>>>>> +if (!(features & RBD_FEATURE_FAST_DIFF)) { >>>>>>>> +goto out; >>>>>>>> +} >>>>>>>> + >>>>>>>> +/* check if RBD fast-diff result is valid */ >>>>>>>> +r = rbd_get_flags(s->image, &flags); >>>>>>>> +if (r < 0) { >>>>>>>> +goto out; >>>>>>>> +} >>>>>>>> +if (flags & RBD_FLAG_FAST_DIFF_INVALID) { >>>>>>>> +goto out; >>>>>>> Nit: I'm not a fan of labels that cover just the return statement. >>>>>>> Feel free to ignore this one but I would get rid of it and replace >>>>>>> these gotos with returns. >>>>>> That would be return with the bitmask directly coded in if I also >>>>>> >>>>>> drop the ret variable. I can change that, no problem. >>>>>> >>>>>> >>>>>>>> +} >>>>>>>> + >>>>>>>> +r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, >>>>>>>> + qemu_rbd_co_block_status_cb, &req); >>>>>>>> +if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { >>>>>>>> +goto out; >>>>>>>> +} >>>>>>>> +assert(req.bytes <= bytes); >>>>>>>> +if (!req.exists) { >>>>>>>> +if (r == 0 && !req.bytes) { >>>>>>>> +/* >>>>>>>> + * rbd_diff_iterate2 does not invoke callbacks for >>>>>>>> unallocated areas >>>>>>>> + * except for the case where an overlay has a hole where >>>>>>>> the parent >>>>>>>> + * has not. This here catches the case where no callback >>>>>>>> was >>>>>>>> + * invoked at all. >>>>>>>> + */ >>>>>>> The above is true in the case of diffing against a snapshot, i.e. when >>>>>>> the "from" snapshot has some data where the "to" revision (whether HEAD >>>>>>> or another snapshot) has a hole. I don't think it is true for child vs >>>>>>> parent (but it has been a while since I looked at the diff code). As >>>>>>> long as NULL is passed for fromsnapname, I would expect the callback to >>>>>>> be invoked only for allocated areas. If I'm right, we could simplify >>>>>>> qemu_rbd_co_block_status_cb() considerably. >>>>>> See my comment in the callback. Can you look at the diff code or give >>>>>> me at least a pointer where I can find it. I am not that familiar with >>>>>> the librbd code yet. >>>>> I assumed that you added !exists handling because it came up in your >>>>> testing. If you don't have a test case then let's proceed under the >>>>> assumption that it doesn't happen for clones. >>>> Hi Ilya, >>>> >>>> >>>> it seems that our assumption was false. I have an image which shows holes >>>> without diffing against >>>> >>>> a snapshot. Do you have an idea why? >>>> >>>> >>>> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven info >>>> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw >>>> rbd image 'c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw': >>>> size 20 GiB in 2 objects >>>> order 20 (1 MiB objects) >>>> snapshot_count: 2 >>>> id: 3d6daa102e4d9f >>>> block_name_prefix: rbd_data.3d6daa102e4d9f >>>> format: 2 >>>> features: layering, exclusive-lock, object-map, fast-diff, >>>> deep-flatten >>>> op_features: >>>> flags: >>>> create_timestamp: Tue Sep 21 14:16:56 2021 >>>> access_timestamp: Thu Jan 6 15:24:46 2022 >>>> modify_timestamp: Thu Jan 6 15:45:42 2022 >>>> >>>> >>>> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven snap ls >>>> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw >>>> SNAPID NAME SIZEPROTECTED TIMESTAMP >>>>12297 dlp-20210921-144509 20 GiB Tue Sep 21 14:45:09 2021 >>>>17745 dlp-20220106-04 20 GiB Thu Jan 6 04:00:01 2022 >>>> >>>> >>>> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven diff --whole-object >>>> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw >>>> | grep zero >>>> 2044723201048576 zero >>>> 1114636288 1048576 zero >>>> 1115684864 1048576 zero >>>> 1116733440 1048576 zero >>>> 1117782016 1048576 zero >>>> 1218445312 1048576 zero >>>> 1219493888 1048576 zero >>>> 1220542464 1048576 zero >>> Hi Peter, >>> >>> Yes, I stumbled upon this just yesterday while looking into another >>> librbd issue surfaced by this patch [1] and was going to email you and >>> the list after wrapping my head around this behavior. I see where it >>> is coming from but I'm not sure what the right fix is. I would prefer >>> to patch librbd but that may turn out to be not feasible. Let me get >>> back on this next week. >>> >>> [1] https://tracker.ceph.com/issues/53784 >> >> Looks like it was a good idea to add all these asserts... >> >> >> The issue I see does occur with a simple qemu-img convert on the given image >> as source. It triggers the assert(exists) in the cb. > Right, "qemu-img convert" issues a block status request. Does your > image have snapshots? This behavior should be caused by a snapshot > that contains a discarded object: > > $ rbd create --size 16M foo > $ sudo rbd map foo > /dev/rbd0 > $ sudo xfs_io -d -c 'pwrite -b 4M 0 16M' /dev/rbd0 > wrote 16777216/16777216 bytes at offset 0 > 16 MiB, 4 ops; 0.4607 sec (34.726 MiB/sec and 8.6814 ops/sec) > $ rbd snap create foo@snap > Creating snap: 100% complete...done. > $ rbd diff foo > OffsetLength Type > 0 4194304 data > 4194304 4194304 data > 8388608 4194304 data > 12582912 4194304 data > > $ blkdiscard -o 4M -l 8M /dev/rbd0 > $ rbd diff foo > OffsetLength Type > 0 4194304 data > 4194304 4194304 zero > 8388608 4194304 zero > 12582912 4194304 data > > If you remove that snapshot, these "holes" should disappear from the > report: > > $ rbd snap rm foo@snap > Removing snap: 100% complete...done. > $ rbd diff foo > OffsetLength Type > 0 4194304 data > 12582912 4194304 data > > Thanks, > > Ilya So we have 2 different issues: 1) For the holes that we did not expect we should use the original version of my patch which was able to handle holes properly. 2) For issue #53784 the right would be to fix the issue in librbd. We could discuss if we try to "detect" this issue, report a warning once and then disable the get_block_status functionality for the session. The question is, is it possible to "detect" the issue? If thats not possible should we patch qemu to report unaligned requests as allocated for affected librbd versions? Best, Peter
Re: [PATCH V3] block/rbd: implement bdrv_co_block_status
>>>>>> +assert(offset + bytes <= s->image_size); > >>>>>> + > >>>>>> +/* default to all sectors allocated */ > >>>>>> +ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID; > >>>>>> +*map = offset; > >>>>>> +*file = bs; > >>>>>> +*pnum = bytes; > >>>>>> + > >>>>>> +/* check if RBD image supports fast-diff */ > >>>>>> +r = rbd_get_features(s->image, &features); > >>>>>> +if (r < 0) { > >>>>>> +goto out; > >>>>>> +} > >>>>>> +if (!(features & RBD_FEATURE_FAST_DIFF)) { > >>>>>> +goto out; > >>>>>> +} > >>>>>> + > >>>>>> +/* check if RBD fast-diff result is valid */ > >>>>>> +r = rbd_get_flags(s->image, &flags); > >>>>>> +if (r < 0) { > >>>>>> +goto out; > >>>>>> +} > >>>>>> +if (flags & RBD_FLAG_FAST_DIFF_INVALID) { > >>>>>> +goto out; > >>>>> Nit: I'm not a fan of labels that cover just the return statement. > >>>>> Feel free to ignore this one but I would get rid of it and replace > >>>>> these gotos with returns. > >>>> That would be return with the bitmask directly coded in if I also > >>>> > >>>> drop the ret variable. I can change that, no problem. > >>>> > >>>> > >>>>>> +} > >>>>>> + > >>>>>> +r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, > >>>>>> + qemu_rbd_co_block_status_cb, &req); > >>>>>> +if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { > >>>>>> +goto out; > >>>>>> +} > >>>>>> +assert(req.bytes <= bytes); > >>>>>> +if (!req.exists) { > >>>>>> +if (r == 0 && !req.bytes) { > >>>>>> +/* > >>>>>> + * rbd_diff_iterate2 does not invoke callbacks for > >>>>>> unallocated areas > >>>>>> + * except for the case where an overlay has a hole where > >>>>>> the parent > >>>>>> + * has not. This here catches the case where no callback > >>>>>> was > >>>>>> + * invoked at all. > >>>>>> + */ > >>>>> The above is true in the case of diffing against a snapshot, i.e. when > >>>>> the "from" snapshot has some data where the "to" revision (whether HEAD > >>>>> or another snapshot) has a hole. I don't think it is true for child vs > >>>>> parent (but it has been a while since I looked at the diff code). As > >>>>> long as NULL is passed for fromsnapname, I would expect the callback to > >>>>> be invoked only for allocated areas. If I'm right, we could simplify > >>>>> qemu_rbd_co_block_status_cb() considerably. > >>>> See my comment in the callback. Can you look at the diff code or give > >>>> me at least a pointer where I can find it. I am not that familiar with > >>>> the librbd code yet. > >>> I assumed that you added !exists handling because it came up in your > >>> testing. If you don't have a test case then let's proceed under the > >>> assumption that it doesn't happen for clones. > >> > >> Hi Ilya, > >> > >> > >> it seems that our assumption was false. I have an image which shows holes > >> without diffing against > >> > >> a snapshot. Do you have an idea why? > >> > >> > >> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven info > >> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > >> rbd image 'c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw': > >> size 20 GiB in 2 objects > >> order 20 (1 MiB objects) > >> snapshot_count: 2 > >> id: 3d6daa102e4d9f > >> block_name_prefix: rbd_data.3d6daa102e4d9f > >> format: 2 > >> features: layering, exclusive-lock, object-map, fast-diff, > >> deep-flatten > >> op_features: > >> flags: > >> create_timestamp: Tue Sep 21 14:16:56 2021 > >> access_timestamp: Thu Jan 6 15:24:46 2022 > >> modify_timestamp: Thu Jan 6 15:45:42 2022 > >> > >> > >> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven snap ls > >> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > >> SNAPID NAME SIZEPROTECTED TIMESTAMP > >>12297 dlp-20210921-144509 20 GiB Tue Sep 21 14:45:09 2021 > >>17745 dlp-20220106-04 20 GiB Thu Jan 6 04:00:01 2022 > >> > >> > >> $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven diff --whole-object > >> dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > >> | grep zero > >> 2044723201048576 zero > >> 1114636288 1048576 zero > >> 1115684864 1048576 zero > >> 1116733440 1048576 zero > >> 1117782016 1048576 zero > >> 1218445312 1048576 zero > >> 1219493888 1048576 zero > >> 1220542464 1048576 zero > > Hi Peter, > > > > Yes, I stumbled upon this just yesterday while looking into another > > librbd issue surfaced by this patch [1] and was going to email you and > > the list after wrapping my head around this behavior. I see where it > > is coming from but I'm not sure what the right fix is. I would prefer > > to patch librbd but that may turn out to be not feasible. Let me get > > back on this next week. > > > > [1] https://tracker.ceph.com/issues/53784 > > > Looks like it was a good idea to add all these asserts... > > > The issue I see does occur with a simple qemu-img convert on the given image > as source. It triggers the assert(exists) in the cb. Right, "qemu-img convert" issues a block status request. Does your image have snapshots? This behavior should be caused by a snapshot that contains a discarded object: $ rbd create --size 16M foo $ sudo rbd map foo /dev/rbd0 $ sudo xfs_io -d -c 'pwrite -b 4M 0 16M' /dev/rbd0 wrote 16777216/16777216 bytes at offset 0 16 MiB, 4 ops; 0.4607 sec (34.726 MiB/sec and 8.6814 ops/sec) $ rbd snap create foo@snap Creating snap: 100% complete...done. $ rbd diff foo OffsetLength Type 0 4194304 data 4194304 4194304 data 8388608 4194304 data 12582912 4194304 data $ blkdiscard -o 4M -l 8M /dev/rbd0 $ rbd diff foo OffsetLength Type 0 4194304 data 4194304 4194304 zero 8388608 4194304 zero 12582912 4194304 data If you remove that snapshot, these "holes" should disappear from the report: $ rbd snap rm foo@snap Removing snap: 100% complete...done. $ rbd diff foo OffsetLength Type 0 4194304 data 12582912 4194304 data Thanks, Ilya
Re: [PATCH V3] block/rbd: implement bdrv_co_block_status
BlockDriverState **file) +{ +BDRVRBDState *s = bs->opaque; +int ret, r; Nit: I would rename ret to status or something like that to make it clear(er) that it is an actual value and never an error. Or, even better, drop it entirely and return one of the two bitmasks directly. +struct rbd_diff_req req = { .offs = offset }; +uint64_t features, flags; + +assert(offset + bytes <= s->image_size); + +/* default to all sectors allocated */ +ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID; +*map = offset; +*file = bs; +*pnum = bytes; + +/* check if RBD image supports fast-diff */ +r = rbd_get_features(s->image, &features); +if (r < 0) { +goto out; +} +if (!(features & RBD_FEATURE_FAST_DIFF)) { +goto out; +} + +/* check if RBD fast-diff result is valid */ +r = rbd_get_flags(s->image, &flags); +if (r < 0) { +goto out; +} +if (flags & RBD_FLAG_FAST_DIFF_INVALID) { +goto out; Nit: I'm not a fan of labels that cover just the return statement. Feel free to ignore this one but I would get rid of it and replace these gotos with returns. That would be return with the bitmask directly coded in if I also drop the ret variable. I can change that, no problem. +} + +r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, + qemu_rbd_co_block_status_cb, &req); +if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { +goto out; +} +assert(req.bytes <= bytes); +if (!req.exists) { +if (r == 0 && !req.bytes) { +/* + * rbd_diff_iterate2 does not invoke callbacks for unallocated areas + * except for the case where an overlay has a hole where the parent + * has not. This here catches the case where no callback was + * invoked at all. + */ The above is true in the case of diffing against a snapshot, i.e. when the "from" snapshot has some data where the "to" revision (whether HEAD or another snapshot) has a hole. I don't think it is true for child vs parent (but it has been a while since I looked at the diff code). As long as NULL is passed for fromsnapname, I would expect the callback to be invoked only for allocated areas. If I'm right, we could simplify qemu_rbd_co_block_status_cb() considerably. See my comment in the callback. Can you look at the diff code or give me at least a pointer where I can find it. I am not that familiar with the librbd code yet. I assumed that you added !exists handling because it came up in your testing. If you don't have a test case then let's proceed under the assumption that it doesn't happen for clones. Hi Ilya, it seems that our assumption was false. I have an image which shows holes without diffing against a snapshot. Do you have an idea why? $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven info dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw rbd image 'c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw': size 20 GiB in 2 objects order 20 (1 MiB objects) snapshot_count: 2 id: 3d6daa102e4d9f block_name_prefix: rbd_data.3d6daa102e4d9f format: 2 features: layering, exclusive-lock, object-map, fast-diff, deep-flatten op_features: flags: create_timestamp: Tue Sep 21 14:16:56 2021 access_timestamp: Thu Jan 6 15:24:46 2022 modify_timestamp: Thu Jan 6 15:45:42 2022 $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven snap ls dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw SNAPID NAME SIZEPROTECTED TIMESTAMP 12297 dlp-20210921-144509 20 GiB Tue Sep 21 14:45:09 2021 17745 dlp-20220106-04 20 GiB Thu Jan 6 04:00:01 2022 $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven diff --whole-object dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw | grep zero 2044723201048576 zero 1114636288 1048576 zero 1115684864 1048576 zero 1116733440 1048576 zero 1117782016 1048576 zero 1218445312 1048576 zero 1219493888 1048576 zero 1220542464 1048576 zero Hi Peter, Yes, I stumbled upon this just yesterday while looking into another librbd issue surfaced by this patch [1] and was going to email you and the list after wrapping my head around this behavior. I see where it is coming from but I'm not sure what the right fix is. I would prefer to patch librbd but that may turn out to be not feasible. Let me get back on this next week. [1] https://tracker.ceph.com/issues/53784 Looks like it was a good idea to add all these asserts... The issue I see does occur with a simple qemu-img convert on the given image as source. It triggers the assert(exists) in the cb. My dev cluster is running on Nautilus 14.2.22. Best, Peter
Re: [PATCH V3] block/rbd: implement bdrv_co_block_status
; >> > >> That would be return with the bitmask directly coded in if I also > >> > >> drop the ret variable. I can change that, no problem. > >> > >> > >>>> +} > >>>> + > >>>> +r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, > >>>> + qemu_rbd_co_block_status_cb, &req); > >>>> +if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { > >>>> +goto out; > >>>> +} > >>>> +assert(req.bytes <= bytes); > >>>> +if (!req.exists) { > >>>> +if (r == 0 && !req.bytes) { > >>>> +/* > >>>> + * rbd_diff_iterate2 does not invoke callbacks for > >>>> unallocated areas > >>>> + * except for the case where an overlay has a hole where > >>>> the parent > >>>> + * has not. This here catches the case where no callback was > >>>> + * invoked at all. > >>>> + */ > >>> The above is true in the case of diffing against a snapshot, i.e. when > >>> the "from" snapshot has some data where the "to" revision (whether HEAD > >>> or another snapshot) has a hole. I don't think it is true for child vs > >>> parent (but it has been a while since I looked at the diff code). As > >>> long as NULL is passed for fromsnapname, I would expect the callback to > >>> be invoked only for allocated areas. If I'm right, we could simplify > >>> qemu_rbd_co_block_status_cb() considerably. > >> See my comment in the callback. Can you look at the diff code or give > >> me at least a pointer where I can find it. I am not that familiar with > >> the librbd code yet. > > I assumed that you added !exists handling because it came up in your > > testing. If you don't have a test case then let's proceed under the > > assumption that it doesn't happen for clones. > > > Hi Ilya, > > > it seems that our assumption was false. I have an image which shows holes > without diffing against > > a snapshot. Do you have an idea why? > > > $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven info > dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > rbd image 'c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw': > size 20 GiB in 2 objects > order 20 (1 MiB objects) > snapshot_count: 2 > id: 3d6daa102e4d9f > block_name_prefix: rbd_data.3d6daa102e4d9f > format: 2 > features: layering, exclusive-lock, object-map, fast-diff, deep-flatten > op_features: > flags: > create_timestamp: Tue Sep 21 14:16:56 2021 > access_timestamp: Thu Jan 6 15:24:46 2022 > modify_timestamp: Thu Jan 6 15:45:42 2022 > > > $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven snap ls > dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > SNAPID NAME SIZEPROTECTED TIMESTAMP > 12297 dlp-20210921-144509 20 GiB Tue Sep 21 14:45:09 2021 > 17745 dlp-20220106-04 20 GiB Thu Jan 6 04:00:01 2022 > > > $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven diff --whole-object > dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw > | grep zero > 2044723201048576 zero > 1114636288 1048576 zero > 1115684864 1048576 zero > 1116733440 1048576 zero > 1117782016 1048576 zero > 1218445312 1048576 zero > 1219493888 1048576 zero > 1220542464 1048576 zero Hi Peter, Yes, I stumbled upon this just yesterday while looking into another librbd issue surfaced by this patch [1] and was going to email you and the list after wrapping my head around this behavior. I see where it is coming from but I'm not sure what the right fix is. I would prefer to patch librbd but that may turn out to be not feasible. Let me get back on this next week. [1] https://tracker.ceph.com/issues/53784 Thanks, Ilya
Re: [PATCH V3] block/rbd: implement bdrv_co_block_status
ret to status or something like that to make it clear(er) that it is an actual value and never an error. Or, even better, drop it entirely and return one of the two bitmasks directly. +struct rbd_diff_req req = { .offs = offset }; +uint64_t features, flags; + +assert(offset + bytes <= s->image_size); + +/* default to all sectors allocated */ +ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID; +*map = offset; +*file = bs; +*pnum = bytes; + +/* check if RBD image supports fast-diff */ +r = rbd_get_features(s->image, &features); +if (r < 0) { +goto out; +} +if (!(features & RBD_FEATURE_FAST_DIFF)) { +goto out; +} + +/* check if RBD fast-diff result is valid */ +r = rbd_get_flags(s->image, &flags); +if (r < 0) { +goto out; +} +if (flags & RBD_FLAG_FAST_DIFF_INVALID) { +goto out; Nit: I'm not a fan of labels that cover just the return statement. Feel free to ignore this one but I would get rid of it and replace these gotos with returns. That would be return with the bitmask directly coded in if I also drop the ret variable. I can change that, no problem. +} + +r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, + qemu_rbd_co_block_status_cb, &req); +if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { +goto out; +} +assert(req.bytes <= bytes); +if (!req.exists) { +if (r == 0 && !req.bytes) { +/* + * rbd_diff_iterate2 does not invoke callbacks for unallocated areas + * except for the case where an overlay has a hole where the parent + * has not. This here catches the case where no callback was + * invoked at all. + */ The above is true in the case of diffing against a snapshot, i.e. when the "from" snapshot has some data where the "to" revision (whether HEAD or another snapshot) has a hole. I don't think it is true for child vs parent (but it has been a while since I looked at the diff code). As long as NULL is passed for fromsnapname, I would expect the callback to be invoked only for allocated areas. If I'm right, we could simplify qemu_rbd_co_block_status_cb() considerably. See my comment in the callback. Can you look at the diff code or give me at least a pointer where I can find it. I am not that familiar with the librbd code yet. I assumed that you added !exists handling because it came up in your testing. If you don't have a test case then let's proceed under the assumption that it doesn't happen for clones. Hi Ilya, it seems that our assumption was false. I have an image which shows holes without diffing against a snapshot. Do you have an idea why? $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven info dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw rbd image 'c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw': size 20 GiB in 2 objects order 20 (1 MiB objects) snapshot_count: 2 id: 3d6daa102e4d9f block_name_prefix: rbd_data.3d6daa102e4d9f format: 2 features: layering, exclusive-lock, object-map, fast-diff, deep-flatten op_features: flags: create_timestamp: Tue Sep 21 14:16:56 2021 access_timestamp: Thu Jan 6 15:24:46 2022 modify_timestamp: Thu Jan 6 15:45:42 2022 $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven snap ls dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw SNAPID NAME SIZE PROTECTED TIMESTAMP 12297 dlp-20210921-144509 20 GiB Tue Sep 21 14:45:09 2021 17745 dlp-20220106-04 20 GiB Thu Jan 6 04:00:01 2022 $ rbd --conf /etc/ceph/ceph-dev01.conf --id lieven diff --whole-object dhp-standard/c4ca7ee9-36ce-4fc9-9d3b-ece8a4f8b83e/c1ad11d0-4f6a-4cc1-8aa3-ff3c413c1471.raw | grep zero 204472320 1048576 zero 1114636288 1048576 zero 1115684864 1048576 zero 1116733440 1048576 zero 1117782016 1048576 zero 1218445312 1048576 zero 1219493888 1048576 zero 1220542464 1048576 zero Best, Peter
[PULL 28/52] hw/vhost-user-blk: turn on VIRTIO_BLK_F_SIZE_MAX feature for virtio blk device
From: Andy Pei Turn on pre-defined feature VIRTIO_BLK_F_SIZE_MAX for virtio blk device to avoid guest DMA request sizes which are too large for hardware spec. Signed-off-by: Andy Pei Message-Id: <1641202092-149677-1-git-send-email-andy@intel.com> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin Acked-by: Raphael Norwitz --- hw/block/vhost-user-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index ab11ce8252..1a42ae9187 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -252,6 +252,7 @@ static uint64_t vhost_user_blk_get_features(VirtIODevice *vdev, VHostUserBlk *s = VHOST_USER_BLK(vdev); /* Turn on pre-defined features */ +virtio_add_feature(&features, VIRTIO_BLK_F_SIZE_MAX); virtio_add_feature(&features, VIRTIO_BLK_F_SEG_MAX); virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY); virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY); -- MST
[PULL 22/52] vhost-user-blk: propagate error return from generic vhost
From: Roman Kagan Fix the only callsite that doesn't propagate the error code from the generic vhost code. Signed-off-by: Roman Kagan Message-Id: <2021153354.18807-11-rvka...@yandex-team.ru> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin Reviewed-by: Raphael Norwitz --- hw/block/vhost-user-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index f9b17f6813..ab11ce8252 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -100,7 +100,7 @@ static int vhost_user_blk_handle_config_change(struct vhost_dev *dev) &local_err); if (ret < 0) { error_report_err(local_err); -return -1; +return ret; } /* valid for resize only */ -- MST
[PULL 14/52] vhost-user-blk: reconnect on any error during realize
From: Roman Kagan vhost-user-blk realize only attempts to reconnect if the previous connection attempt failed on "a problem with the connection and not an error related to the content (which would fail again the same way in the next attempt)". However this distinction is very subtle, and may be inadvertently broken if the code changes somewhere deep down the stack and a new error gets propagated up to here. OTOH now that the number of reconnection attempts is limited it seems harmless to try reconnecting on any error. So relax the condition of whether to retry connecting to check for any error. This patch amends a527e312b5 "vhost-user-blk: Implement reconnection during realize". Signed-off-by: Roman Kagan Message-Id: <2021153354.18807-2-rvka...@yandex-team.ru> Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin Reviewed-by: Raphael Norwitz --- hw/block/vhost-user-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index ba13cb87e5..f9b17f6813 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -511,7 +511,7 @@ static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp) *errp = NULL; } ret = vhost_user_blk_realize_connect(s, errp); -} while (ret == -EPROTO && retries--); +} while (ret < 0 && retries--); if (ret < 0) { goto virtio_err; -- MST
Re: [PULL 0/2] SD/MMC patches for 2022-01-04
On 4/1/22 19:42, Richard Henderson wrote: On 1/3/22 11:54 PM, Philippe Mathieu-Daudé wrote: The following changes since commit b5a3d8bc9146ba22a25116cb748c97341bf99737: Merge tag 'pull-misc-20220103' of https://gitlab.com/rth7680/qemu into staging (2022-01-03 09:34:41 -0800) are available in the Git repository at: https://github.com/philmd/qemu.git tags/sdmmc-20220104 for you to fetch changes up to d666c7b529c503381a714b97d2e174848b5aad8d: hw/sd: Add SDHC support for SD card SPI-mode (2022-01-04 08:50:27 +0100) SD/MMC patches queue - Add SDHC support for SD card SPI-mode (Frank Chang) Frank Chang (1): hw/sd: Add SDHC support for SD card SPI-mode Philippe Mathieu-Daudé (1): hw/sd/sdcard: Rename Write Protect Group variables hw/sd/sd.c | 48 +--- 1 file changed, 29 insertions(+), 19 deletions(-) Lots of failures of the form ../hw/sd/sd.c:842:33: error: 'SDState' {aka 'struct SDState'} has no member named 'wp_groups'; did you mean 'wp_group_bits'? 842 | if (test_bit(wpnum, sd->wp_groups)) { | ^ | wp_group_bits I am really sorry for having wasted your time this way... I was in the middle of migrating my workstation and posted the non-rebased branch :( I'll take extra care this doesn't happen again. Regards, Phil.
Re: [PATCH v2 0/9] hw/dma: Use dma_addr_t type definition when relevant
On Tue, Jan 04, 2022 at 09:54:22AM +0100, Philippe Mathieu-Daudé wrote: > Since v1: > - Addressed David review comment (stick to dma_addr_t type) > - Addressed Peter review comment (incorrect doc string) PCI things: Reviewed-by: Michael S. Tsirkin who's merging all this? Yourself? > Hi, > > This series aims to clarify a bit the DMA API, in particular the > 'size' argument which is not clear enough (as we use multiple types > for it). This helps avoiding build failures on 32-bit host [*] (and > likely overflows calculation too IMO). > > Some units using the DMA API are first removed from user-mode > emulation to avoid build failure (they shouldn't be there in > the first place). > > Then some variables are renamed for clarity (no functional change). > > Finally we replace misuses with dma_addr_t typedef. The previous > patch which failed on 32-bit host applied on top (not failing anymore). > > Regards, > > Phil. > > [*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg858825.html > > Supersedes: <20211231114901.976937-1-phi...@redhat.com> > > Philippe Mathieu-Daudé (9): > hw/nvram: Restrict stub to sysemu and tools > hw/pci: Restrict pci-bus stub to sysemu > hw/pci: Document pci_dma_map() > hw/dma: Remove CONFIG_USER_ONLY check > hw/rdma/rdma_utils: Rename rdma_pci_dma_map 'len' argument > hw/scsi: Rename SCSIRequest::resid as 'residual' > hw/dma: Fix format string issues using dma_addr_t > hw/dma: Use dma_addr_t type definition when relevant > hw/dma: Let dma_buf_read() / dma_buf_write() propagate MemTxResult > > hw/rdma/rdma_utils.h | 2 +- > include/hw/pci/pci.h | 12 ++ > include/hw/scsi/scsi.h | 4 +- > include/sysemu/dma.h | 31 --- > hw/ide/ahci.c | 10 ++--- > hw/nvme/ctrl.c | 6 +-- > hw/rdma/rdma_utils.c | 14 +++ > hw/scsi/megasas.c | 85 +- > hw/scsi/scsi-bus.c | 12 +++--- > hw/scsi/scsi-disk.c| 4 +- > softmmu/dma-helpers.c | 34 +++-- > hw/nvram/meson.build | 6 ++- > hw/rdma/trace-events | 2 +- > stubs/meson.build | 4 +- > 14 files changed, 134 insertions(+), 92 deletions(-) > > -- > 2.33.1 >
Re: [PATCH v3 01/15] pcie: Add support for Single Root I/O Virtualization (SR/IOV)
On Tue, Dec 21, 2021 at 03:32:32PM +0100, Lukasz Maniak wrote: > From: Knut Omang > > This patch provides the building blocks for creating an SR/IOV > PCIe Extended Capability header and register/unregister > SR/IOV Virtual Functions. > > Signed-off-by: Knut Omang > --- > hw/pci/meson.build | 1 + > hw/pci/pci.c| 97 +--- > hw/pci/pcie.c | 5 + > hw/pci/pcie_sriov.c | 287 > hw/pci/trace-events | 5 + > include/hw/pci/pci.h| 12 +- > include/hw/pci/pcie.h | 6 + > include/hw/pci/pcie_sriov.h | 67 + > include/qemu/typedefs.h | 2 + > 9 files changed, 456 insertions(+), 26 deletions(-) > create mode 100644 hw/pci/pcie_sriov.c > create mode 100644 include/hw/pci/pcie_sriov.h > > diff --git a/hw/pci/meson.build b/hw/pci/meson.build > index 5c4bbac817..bcc9c75919 100644 > --- a/hw/pci/meson.build > +++ b/hw/pci/meson.build > @@ -5,6 +5,7 @@ pci_ss.add(files( >'pci.c', >'pci_bridge.c', >'pci_host.c', > + 'pcie_sriov.c', >'shpc.c', >'slotid_cap.c' > )) > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index e5993c1ef5..1892a7e74c 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -239,6 +239,9 @@ int pci_bar(PCIDevice *d, int reg) > { > uint8_t type; > > +/* PCIe virtual functions do not have their own BARs */ > +assert(!pci_is_vf(d)); > + > if (reg != PCI_ROM_SLOT) > return PCI_BASE_ADDRESS_0 + reg * 4; > > @@ -304,10 +307,30 @@ void pci_device_deassert_intx(PCIDevice *dev) > } > } > > -static void pci_do_device_reset(PCIDevice *dev) > +static void pci_reset_regions(PCIDevice *dev) > { > int r; > +if (pci_is_vf(dev)) { > +return; > +} > + > +for (r = 0; r < PCI_NUM_REGIONS; ++r) { > +PCIIORegion *region = &dev->io_regions[r]; > +if (!region->size) { > +continue; > +} > + > +if (!(region->type & PCI_BASE_ADDRESS_SPACE_IO) && > +region->type & PCI_BASE_ADDRESS_MEM_TYPE_64) { > +pci_set_quad(dev->config + pci_bar(dev, r), region->type); > +} else { > +pci_set_long(dev->config + pci_bar(dev, r), region->type); > +} > +} > +} > > +static void pci_do_device_reset(PCIDevice *dev) > +{ > pci_device_deassert_intx(dev); > assert(dev->irq_state == 0); > > @@ -323,19 +346,7 @@ static void pci_do_device_reset(PCIDevice *dev) >pci_get_word(dev->wmask + PCI_INTERRUPT_LINE) | >pci_get_word(dev->w1cmask + > PCI_INTERRUPT_LINE)); > dev->config[PCI_CACHE_LINE_SIZE] = 0x0; > -for (r = 0; r < PCI_NUM_REGIONS; ++r) { > -PCIIORegion *region = &dev->io_regions[r]; > -if (!region->size) { > -continue; > -} > - > -if (!(region->type & PCI_BASE_ADDRESS_SPACE_IO) && > -region->type & PCI_BASE_ADDRESS_MEM_TYPE_64) { > -pci_set_quad(dev->config + pci_bar(dev, r), region->type); > -} else { > -pci_set_long(dev->config + pci_bar(dev, r), region->type); > -} > -} > +pci_reset_regions(dev); > pci_update_mappings(dev); > > msi_reset(dev); > @@ -884,6 +895,15 @@ static void pci_init_multifunction(PCIBus *bus, > PCIDevice *dev, Error **errp) > dev->config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION; > } > > +/* With SR/IOV and ARI, a device at function 0 need not be a > multifunction > + * device, as it may just be a VF that ended up with function 0 in > + * the legacy PCI interpretation. Avoid failing in such cases: > + */ > +if (pci_is_vf(dev) && > +dev->exp.sriov_vf.pf->cap_present & QEMU_PCI_CAP_MULTIFUNCTION) { > +return; > +} > + > /* > * multifunction bit is interpreted in two ways as follows. > * - all functions must set the bit to 1. > @@ -1083,6 +1103,7 @@ static PCIDevice *do_pci_register_device(PCIDevice > *pci_dev, > bus->devices[devfn]->name); > return NULL; > } else if (dev->hotplugged && > + !pci_is_vf(pci_dev) && > pci_get_function_0(pci_dev)) { > error_setg(errp, "PCI: slot %d function 0 already occupied by %s," > " new func %s cannot be exposed to guest.", > @@ -1191,6 +1212,7 @@ void pci_register_bar(PCIDevice *pci_dev, int > region_num, > pcibus_t size = memory_region_size(memory); > uint8_t hdr_type; > > +assert(!pci_is_vf(pci_dev)); /* VFs must use pcie_sriov_vf_register_bar > */ > assert(region_num >= 0); > assert(region_num < PCI_NUM_REGIONS); > assert(is_power_of_2(size)); > @@ -1294,11 +1316,43 @@ pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int > region_num) > return pci_dev->io_regions[region_num].addr; > } > > -static pcibus_t pci_bar_address(PCIDevice *d, > -
Re: [PATCH 00/10] vhost: stick to -errno error return convention
On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > Error propagation between the generic vhost code and the specific backends is > not quite consistent: some places follow "return -1 and set errno" convention, > while others assume "return negated errno". Furthermore, not enough care is > taken not to clobber errno. > > As a result, on certain code paths the errno resulting from a failure may get > overridden by another function call, and then that zero errno inidicating > success is propagated up the stack, leading to failures being lost. In > particular, we've seen errors in the communication with a vhost-user-blk slave > not trigger an immediate connection drop and reconnection, leaving it in a > broken state. > > Rework error propagation to always return negated errno on errors and > correctly pass it up the stack. > > Roman Kagan (10): > vhost-user-blk: reconnect on any error during realize > chardev/char-socket: tcp_chr_recv: don't clobber errno > chardev/char-socket: tcp_chr_sync_read: don't clobber errno > chardev/char-fe: don't allow EAGAIN from blocking read So I dropped this one. If you are so inclined, pls work on this separately. > vhost-backend: avoid overflow on memslots_limit > vhost-backend: stick to -errno error return convention > vhost-vdpa: stick to -errno error return convention > vhost-user: stick to -errno error return convention > vhost: stick to -errno error return convention > vhost-user-blk: propagate error return from generic vhost > > chardev/char-fe.c | 7 +- > chardev/char-socket.c | 17 +- > hw/block/vhost-user-blk.c | 4 +- > hw/virtio/vhost-backend.c | 4 +- > hw/virtio/vhost-user.c| 401 +- > hw/virtio/vhost-vdpa.c| 37 ++-- > hw/virtio/vhost.c | 98 +- > 7 files changed, 307 insertions(+), 261 deletions(-) > > -- > 2.33.1 >