Re: [PATCH v11 00/13] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set
On Dec 9 10:57, Klaus Jensen wrote:
> Hi Dmitry,
>
> By and large, this looks OK to me. There are still some issues here and
> there, and some comments of mine that you did not address, but I will
> follow up with patches to fix that. Let's get this merged.
>
> It looks like the nvme-next you rebased on is slightly old and missing
> two commits:
>
>   "hw/block/nvme: remove superfluous NvmeCtrl parameter" and
>   "hw/block/nvme: pull aio error handling"
>
> It caused a couple of conflicts, but nothing that I couldn't fix up.
>
> Since I didn't manage to convince anyone about the zsze and zcap
> parameters being in terms of LBAs, I'll revert that to be
> 'zoned.zone_size' and 'zoned.zone_capacity'.
>
> Finally, would you accept that we skip "hw/block/nvme: Add injection of
> Offline/Read-Only zones" for now? I'd like to discuss it a bit, since I
> think the random injects feel a bit ad hoc. Back when I did OCSSD
> emulation with Hans, we did something like this for setting up state
> through a descriptor text file - I think we should explore something
> like that before we lock down the two parameters. I'll amend the final
> documentation commit to not include those parameters.
>
> Sounds good?
>
> Otherwise, I think this is mergeable to nvme-next. So, for the series
> (excluding "hw/block/nvme: Add injection of Offline/Read-Only zones"):
>
> Reviewed-by: Klaus Jensen

I've applied this series to my local nvme-next. Our repo host
(infradead.org) is unavailable this morning, but I will push as soon as
possible.

Thanks!
Klaus
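For context on the zsze/zcap discussion above: with 'zoned.zone_size' expressed in bytes rather than LBAs, the device model has to convert internally, since NVMe ZNS reports ZSZE in logical blocks. A rough sketch of that conversion (a hypothetical helper for illustration, not code from the series):

```python
def zone_size_to_lbas(zone_size_bytes: int, lba_shift: int) -> int:
    """Convert a byte-based zone size parameter to LBAs. lba_shift is
    log2 of the LBA size (e.g. 9 for 512-byte, 12 for 4096-byte blocks),
    matching how NVMe ZNS reports ZSZE in logical blocks."""
    if zone_size_bytes % (1 << lba_shift):
        raise ValueError("zone size must be a multiple of the LBA size")
    return zone_size_bytes >> lba_shift

# A 128 MiB zone on a 4096-byte-LBA namespace:
print(zone_size_to_lbas(128 * 1024 * 1024, 12))  # 32768
```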
Re: [Bug 1910586] [NEW] SD card size constraint conceptually wrong
On 1/7/21 8:24 PM, - wrote:
> Public bug reported:
>
> The patch discussed here:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg720833.html
> introduces an artificial size constraint for SD cards
> that has no relation to reality.
>
> I'm trying to use an _actual_ **physical** SD card,
> and qemu tells me its size is "invalid".
>
> Something here appears to be conceptually wrong.
>
> --
> # fdisk -l /dev/sdg
> Disk /dev/sdg: 14.84 GiB, 15931539456 bytes, 31116288 sectors
> Disk model: USB SD Reader
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: dos
> Disk identifier: 0x7a0c8bb0
>
> Device     Boot  Start      End  Sectors  Size Id Type
> /dev/sdg1         2048   524287   522240  255M  c W95 FAT32 (LBA)
> /dev/sdg2       524288 31116287 30592000 14.6G 83 Linux
>
> # qemu-system-aarch64 -M raspi3 -m 1G -kernel vmlinuz-5.4.79-v8 -dtb
> bcm2837-rpi-3-b-plus.dtb -append console=ttyAMA0\ root=/dev/mmcblk0p2\ rw
> -nographic -serial mon:stdio -drive file=/dev/sdg,format=raw
> qemu-system-aarch64: Invalid SD card size: 14.8 GiB
> SD card size has to be a power of 2, e.g. 16 GiB.

Your physical card is likely 16 GiB. The firmware running on it is free
to reserve some amount to replace broken blocks - in your case ~7%. We
chose to restrict the model to the physical layer to simplify the design
and avoid dealing with security issues. Patches to improve the model by
better matching the real world are always welcome!

> You can resize disk images with 'qemu-img resize'
> (note that this will lose data if you make the image smaller than it
> currently is).

Indeed, we can remove this warning for block devices.

> --
>
> The same invocation with a dump of the actual image
> resized to match qemu's odd expectations works fine.
>
> This is on QEMU 5.2.0, as evidenced by the following:
> --
> # qemu-system-aarch64 -version
> QEMU emulator version 5.2.0
> Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
> --
>
> Is there a simple workaround that disables this rather
> arbitrary constraint?

No, but you can send a patch :)

Regards,
Phil.
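The workaround mentioned above - dump the card and pad the image to a size QEMU accepts - needs the next power-of-two target size. A minimal sketch for computing it from the byte size fdisk reports (a hypothetical helper, not part of QEMU):

```python
def next_power_of_two(size: int) -> int:
    """Smallest power of two >= size, i.e. the size QEMU's SD model accepts."""
    return 1 << (size - 1).bit_length()

# The 15931539456-byte card above would be padded to 16 GiB:
print(next_power_of_two(15931539456))  # 17179869184
```

The computed size can then be applied to a raw dump with something like `qemu-img resize -f raw card.img 17179869184` before passing the file via -drive; the padding beyond the original dump is simply never read by the partitions.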
Re: To start multiple KVM guests from one qcow2 image with transient disk option
On Thu, Jan 07, 2021 at 09:05:42AM +0100, Peter Krempa wrote:
> On Tue, Jan 05, 2021 at 15:12:55 +0100, Peter Krempa wrote:
> > On Mon, Jan 04, 2021 at 15:30:19 -0500, Masayoshi Mizuma wrote:
> > > On Sat, Dec 19, 2020 at 11:30:39PM -0500, Masayoshi Mizuma wrote:
> >
> > [...]
> >
> > > {"execute":"cont"}
> >
> > So that is a no-go. Some disk buses such as IDE don't support hotplug:
> >
> > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_hotplug.c#L1074
> >
> > You could try to just instantiate the backend of the disk as read-only,
> > and then create a writable overlay. You just need to make sure that the
> > disk will be writable and that it works even for IDE/SATA, which doesn't
> > support read-only:
> >
> > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_validate.c#L2634
>
> Okay, I've actually tried implementing my suggestion and that doesn't
> work. With scsi disks qemu crashes on an assertion failure when I try
> writing to the disk after setting it up as suggested, and a virtio disk
> returns an I/O error as it remembers that it was read-only when
> starting.
>
> I'm considering whether it's worth implementing this via hotplug then.
>
> Arguably, simple deployments don't need it and complex ones have some
> kind of orchestration which can do it externally.
>
> This specifically comes with a caveat of the above, as currently the
> overlay used to discard writes is created under the same path as the
> image and can't be controlled, which might be a problem for complex
> deployments.
>
> Also, the above algorithm with a constant suffix added to the image
> prevents using it with multiple VMs anyway, since the overlay file name
> will collide (it's generated based on the same rules).
>
> Didn't you run into the collision?

Yes, I needed to set a different filename for each guest with the
blockdev-add QMP command. Would it be a good idea to add the guest name
as a suffix to the filename? Like so:

Guest0 QMP:
{"execute":"blockdev-add","arguments":{"driver":"file","filename":"/var/lib/libvirt/images/guest.TRANSIENT-Guest0","node-name":"storage2","auto-read-only":true,"discard":"unmap"}}

Guest1 QMP:
{"execute":"blockdev-add","arguments":{"driver":"file","filename":"/var/lib/libvirt/images/guest.TRANSIENT-Guest1","node-name":"storage2","auto-read-only":true,"discard":"unmap"}}

Thanks,
Masa
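The per-guest suffix idea above can be sketched as a small command builder (a hypothetical helper mirroring the example QMP commands, not libvirt code):

```python
import json

def transient_blockdev_add(image_path: str, guest_name: str,
                           node_name: str = "storage2") -> str:
    """Build a blockdev-add QMP command whose overlay filename is made
    unique per guest by appending the guest name, avoiding the
    filename collision described in the thread."""
    args = {
        "driver": "file",
        "filename": f"{image_path}.TRANSIENT-{guest_name}",
        "node-name": node_name,
        "auto-read-only": True,
        "discard": "unmap",
    }
    return json.dumps({"execute": "blockdev-add", "arguments": args})

print(transient_blockdev_add("/var/lib/libvirt/images/guest", "Guest0"))
```

Deriving the suffix from the guest name keeps the overlay next to the image (the current behaviour noted above) while making it unique per domain.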
Re: Potential regression in 'qemu-img convert' to LVM
On Tue, Sep 15, 2020 at 2:51 PM Stefan Reiter wrote:
> On 9/15/20 11:08 AM, Nir Soffer wrote:
> > On Mon, Sep 14, 2020 at 3:25 PM Stefan Reiter wrote:
> > >
> > > Hi list,
> > >
> > > the following command fails since 5.1 (tested on kernel 5.4.60):
> > >
> > > # qemu-img convert -p -f raw -O raw /dev/zvol/pool/disk-1 /dev/vg/disk-1
> > > qemu-img: error while writing at byte 2157968896: Device or resource busy
> > >
> > > (source is ZFS here, but doesn't matter in practice, it always fails the
> > > same; the offset changes slightly but consistently hovers around 2^31)
> > >
> > > strace shows the following:
> > >
> > > fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896,
> > > 4608) = -1 EBUSY (Device or resource busy)
> >
> > What is the size of the LV?
>
> Same as the source, 5GB in my test case. Created with:
>
> # lvcreate -ay --size 5242880k --name disk-1 vg
>
> > Does it happen if you change the sparse minimum size (-S)?
> >
> > For example: -S 64k
> >
> > qemu-img convert -p -f raw -O raw -S 64k /dev/zvol/pool/disk-1
> > /dev/vg/disk-1
>
> Tried a few different values, always the same result: EBUSY at byte
> 2157968896.
>
> > > Other fallocate calls leading up to this work fine.
> > >
> > > This happens since commit edafc70c0c "qemu-img convert: Don't pre-zero
> > > images"; before that, all fallocates happened at the start. Reverting the
> > > commit and calling qemu-img exactly the same way on the same data works
> > > fine.
> >
> > But slowly, doing up to 100% more work for fully allocated images.
>
> Of course, I'm not saying the patch is wrong; reverting it just avoids
> triggering the bug.
>
> > > Simply retrying the syscall on EBUSY (like EINTR) does *not* work;
> > > once it fails, it keeps failing with the same error.
> > >
> > > I couldn't find anything related to EBUSY on fallocate, and it only
> > > happens on LVM targets... Any idea or pointers where to look?
> >
> > Is this a thin LV?
>
> No, regular LV. See the command above.
>
> > > This works for us using regular LVs.
> >
> > Which kernel? Which distro?
>
> Reproducible on:
> * PVE w/ kernel 5.4.60 (Ubuntu based)
> * Manjaro w/ kernel 5.8.6
>
> I found that it does not happen with all images; I suppose there must be
> a certain number of smaller holes for it to happen. I am using a VM
> image with a bare-bones Alpine Linux installation, but it's not an
> isolated case - we've had two people report the issue on our bug tracker:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3002

I think that this issue may be fixed by
https://lists.nongnu.org/archive/html/qemu-block/2020-11/msg00358.html

Nir
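The failing call from the strace output above can be reproduced in isolation; a minimal sketch via ctypes (illustrative only - run against a regular file here, where the punch-hole normally succeeds, unlike the LVM target in the report):

```python
import ctypes
import os
import tempfile

# Flag values from <linux/falloc.h>
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

libc = ctypes.CDLL(None, use_errno=True)
# off_t arguments: the strace offset (2157968896) exceeds 32 bits,
# so declare proper 64-bit argument types.
libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                           ctypes.c_long, ctypes.c_long]

def punch_hole(fd: int, offset: int, length: int) -> int:
    """Issue the same fallocate mode seen in the strace; 0 or -errno."""
    ret = libc.fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
                         offset, length)
    return 0 if ret == 0 else -ctypes.get_errno()

with tempfile.NamedTemporaryFile() as f:
    os.ftruncate(f.fileno(), 16 * 1024 * 1024)   # 16 MiB sparse file
    rc = punch_hole(f.fileno(), 8 * 1024 * 1024, 4608)
    print(rc)  # 0 on filesystems that support punch-hole
```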
Re: [PATCH] meson: Propagate gnutls dependency
On Thu, Jan 7, 2021 at 20:36 Roman Bolshakov wrote:
> > No, I think that Meson should simply explode link_whole libraries to
> > their constituent objects. This way duplicates are avoided.
>
> Ok. I've looked through related changes in meson and it flattens object
> files implicitly for link_with/link_whole parameters of static_library:
>
> https://github.com/mesonbuild/meson/pull/6030/files
>
> But qemu adds dependencies to the source set and populates the
> dependencies parameter of static_library and declare_dependency, and we
> get duplicate symbols:
>
> https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00411.html
>
> Perhaps it's a bug then.

No, the same deduplication is not done for executables, because
executables use libraries directly and not their object files.

Paolo
Re: [PATCH] meson: Propagate gnutls dependency
On Thu, Jan 07, 2021 at 07:22:06PM +0100, Paolo Bonzini wrote:
> On 07/01/21 19:18, Roman Bolshakov wrote:
> > > The real issue is that Meson's implementation of link_whole for
> > > library-in-library makes sense for one use case (a convenience
> > > library that is linked into another convenience library) but not for
> > > another (grouping code for subsystems). I cannot blame them for this
> > > because link_with is the more common case for the latter; OTOH QEMU
> > > is using link_whole a lot in order to support the *_init() construct.
> > >
> > > I really think the correct fix is for Meson to use objects instead of
> > > archives for link_whole, similar to how QEMU Makefiles used to do it.
> > > This would also remove the need for the special .fa suffix, so it
> > > would be an improvement all around.
> >
> > Does it mean that we need a kind of object target in meson? Do you
> > think this interface would work?
> >
> > crypto_objs = object_library(..., dependencies: public_deps +
> >                              [aninternaldep])
> > crypto = declare_dependency(link_with: crypto_objs,
> >                             dependencies: public_deps)
>
> No, I think that Meson should simply explode link_whole libraries to
> their constituent objects. This way duplicates are avoided.

Ok. I've looked through related changes in meson and it flattens object
files implicitly for link_with/link_whole parameters of static_library:

https://github.com/mesonbuild/meson/pull/6030/files

But qemu adds dependencies to the source set and populates the
dependencies parameter of static_library and declare_dependency, and we
get duplicate symbols:

https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00411.html

Perhaps it's a bug then.

Regards,
Roman
Re: [PATCH] meson: Propagate gnutls dependency
On 07/01/21 19:18, Roman Bolshakov wrote:
> > The real issue is that Meson's implementation of link_whole for
> > library-in-library makes sense for one use case (a convenience library
> > that is linked into another convenience library) but not for another
> > (grouping code for subsystems). I cannot blame them for this because
> > link_with is the more common case for the latter; OTOH QEMU is using
> > link_whole a lot in order to support the *_init() construct.
> >
> > I really think the correct fix is for Meson to use objects instead of
> > archives for link_whole, similar to how QEMU Makefiles used to do it.
> > This would also remove the need for the special .fa suffix, so it
> > would be an improvement all around.
>
> Does it mean that we need a kind of object target in meson? Do you
> think this interface would work?
>
> crypto_objs = object_library(..., dependencies: public_deps +
>                              [aninternaldep])
> crypto = declare_dependency(link_with: crypto_objs,
>                             dependencies: public_deps)

No, I think that Meson should simply explode link_whole libraries to
their constituent objects. This way duplicates are avoided.

Paolo
Re: [PATCH] meson: Propagate gnutls dependency
On Thu, Jan 07, 2021 at 05:23:54PM +0100, Paolo Bonzini wrote:
> On 07/01/21 16:56, Roman Bolshakov wrote:
> > IMO duplication of dependencies shouldn't be needed for a build
> > system. A meta build system should allow private and public
> > dependencies. Different rules are applied to them. A private
> > dependency is not propagated beyond a target that uses it; a public
> > dependency is propagated.
> >
> > Right now it seems that meson is missing the notion of public and
> > private dependencies and that's where the problem arises. The post [1]
> > (and the related issue) summarizes what I'm trying to say.
>
> Meson doesn't have a concept of public dependencies because it
> separates the private (static library) and the public
> (declare_dependency) view. That is, you'd have:
>
> public_deps = [gnutls, anotherpublicdep]
> lib = static_library(..., dependencies: public_deps + [aninternaldep])
> dep = declare_dependency(link_with: lib, dependencies: public_deps)

Thanks! This wasn't obvious to me. But what's not clear is that CMake
can do both collections of objects (what I provided in the example) and
static libraries, and they're different. I assume what you have shown
would look in CMake like this (please note that STATIC is used instead
of OBJECT):

add_library(crypto STATIC crypto-file1.c ...)
target_link_libraries(crypto PRIVATE aninternaldep
                             PUBLIC gnutls anotherpublicdep)

That explains why an attempt to use dependencies between link_whole
static libraries in meson causes symbol duplication. CMake, on the other
hand, can just make a collection of objects, or even a chain of
collections of objects. They'll be fully linked in only into a final
static library, shared library or executable.

> The real issue is that Meson's implementation of link_whole for
> library-in-library makes sense for one use case (a convenience library
> that is linked into another convenience library) but not for another
> (grouping code for subsystems). I cannot blame them for this because
> link_with is the more common case for the latter; OTOH QEMU is using
> link_whole a lot in order to support the *_init() construct.
>
> I really think the correct fix is for Meson to use objects instead of
> archives for link_whole, similar to how QEMU Makefiles used to do it.
> This would also remove the need for the special .fa suffix, so it would
> be an improvement all around.

Does it mean that we need a kind of object target in meson? Do you think
this interface would work?

crypto_objs = object_library(..., dependencies: public_deps +
                             [aninternaldep])
crypto = declare_dependency(link_with: crypto_objs,
                            dependencies: public_deps)

Regards,
Roman

> Paolo
>
> > If we resolve the issue, then we just specify gnutls as a public
> > dependency of crypto and all users of crypto would get gnutls headers.
> >
> > Here's an example of how clearly CMake approaches the issue [2][3]:
> >
> > add_library(crypto OBJECT crypto-file1.c ...)
> > target_link_libraries(crypto PRIVATE aninternaldep
> >                              PUBLIC gnutls
> >                              anotherpublicdep)
> >
> > 1. https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
> > 2. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
> > 3. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents
Re: [PATCH v3] block: report errno when flock fcntl fails
Ugh, please ignore this one. I didn't remove the test changes, but the
tests passed, which I don't understand.

On Thursday, 2021-01-07 at 17:55:38 GMT, David Edmondson wrote:
> When a call to fcntl(2) for the purpose of manipulating file locks
> fails with an error other than EAGAIN or EACCES, report the error
> returned by fcntl.
>
> EAGAIN or EACCES are elided as they are considered to be common
> failures, indicating that a conflicting lock is held by another
> process.
>
> Signed-off-by: David Edmondson
> ---
> v3:
>  - Remove the now unnecessary updates to the test framework (Max).
>  - Elide the error detail for EAGAIN or EACCES when locking (Kevin,
>    sort-of Max).
>  - Philippe and Vladimir sent Reviewed-by, but things have changed
>    noticeably, so I didn't add them (dme).
>
>  block/file-posix.c         | 52 +-
>  tests/qemu-iotests/153.out | 76 +++---
>  tests/qemu-iotests/182.out |  2 +-
>  3 files changed, 81 insertions(+), 49 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 00cdaaa2d4..c5142f7ffa 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -836,7 +836,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>          if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
>              ret = qemu_lock_fd(fd, off, 1, false);
>              if (ret) {
> -                error_setg(errp, "Failed to lock byte %d", off);
> +                int err = -ret;
> +
> +                if (err == EAGAIN || err == EACCES) {
> +                    error_setg(errp, "Failed to lock byte %d", off);
> +                } else {
> +                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
> +                }
>                  return ret;
>              } else if (s) {
>                  s->locked_perm |= bit;
> @@ -844,7 +850,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>          } else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
>              ret = qemu_unlock_fd(fd, off, 1);
>              if (ret) {
> -                error_setg(errp, "Failed to unlock byte %d", off);
> +                int err = -ret;
> +
> +                if (err == EAGAIN || err == EACCES) {
> +                    error_setg(errp, "Failed to lock byte %d", off);
> +                } else {
> +                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
> +                }
>                  return ret;
>              } else if (s) {
>                  s->locked_perm &= ~bit;
> @@ -857,7 +869,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>          if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
>              ret = qemu_lock_fd(fd, off, 1, false);
>              if (ret) {
> -                error_setg(errp, "Failed to lock byte %d", off);
> +                int err = -ret;
> +
> +                if (err == EAGAIN || err == EACCES) {
> +                    error_setg(errp, "Failed to lock byte %d", off);
> +                } else {
> +                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
> +                }
>                  return ret;
>              } else if (s) {
>                  s->locked_shared_perm |= bit;
> @@ -866,7 +884,7 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>              !(shared_perm_lock_bits & bit)) {
>              ret = qemu_unlock_fd(fd, off, 1);
>              if (ret) {
> -                error_setg(errp, "Failed to unlock byte %d", off);
> +                error_setg_errno(errp, -ret, "Failed to unlock byte %d", off);
>                  return ret;
>              } else if (s) {
>                  s->locked_shared_perm &= ~bit;
> @@ -890,9 +908,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
>              ret = qemu_lock_fd_test(fd, off, 1, true);
>              if (ret) {
>                  char *perm_name = bdrv_perm_names(p);
> -                error_setg(errp,
> -                           "Failed to get \"%s\" lock",
> -                           perm_name);
> +                int err = -ret;
> +
> +                if (err == EAGAIN || err == EACCES) {
> +                    error_setg(errp, "Failed to get \"%s\" lock",
> +                               perm_name);
> +                } else {
> +                    error_setg_errno(errp, err,
> +                                     "Failed to get \"%s\" lock",
> +                                     perm_name);
> +                }
>                  g_free(perm_name);
>                  return ret;
>              }
> @@ -905,9 +930,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
>              ret = qemu_lock_fd_test(fd, off, 1, true);
>              if (ret) {
>                  char *perm_name = bdrv_perm_names(p);
> -                error_setg(errp,
> -
[PATCH v3] block: report errno when flock fcntl fails
When a call to fcntl(2) for the purpose of manipulating file locks
fails with an error other than EAGAIN or EACCES, report the error
returned by fcntl.

EAGAIN or EACCES are elided as they are considered to be common
failures, indicating that a conflicting lock is held by another
process.

Signed-off-by: David Edmondson
---
v3:
 - Remove the now unnecessary updates to the test framework (Max).
 - Elide the error detail for EAGAIN or EACCES when locking (Kevin,
   sort-of Max).
 - Philippe and Vladimir sent Reviewed-by, but things have changed
   noticeably, so I didn't add them (dme).

 block/file-posix.c         | 52 +-
 tests/qemu-iotests/153.out | 76 +++---
 tests/qemu-iotests/182.out |  2 +-
 3 files changed, 81 insertions(+), 49 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 00cdaaa2d4..c5142f7ffa 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -836,7 +836,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
         if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
             ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
-                error_setg(errp, "Failed to lock byte %d", off);
+                int err = -ret;
+
+                if (err == EAGAIN || err == EACCES) {
+                    error_setg(errp, "Failed to lock byte %d", off);
+                } else {
+                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
+                }
                 return ret;
             } else if (s) {
                 s->locked_perm |= bit;
@@ -844,7 +850,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
         } else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
             ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
-                error_setg(errp, "Failed to unlock byte %d", off);
+                int err = -ret;
+
+                if (err == EAGAIN || err == EACCES) {
+                    error_setg(errp, "Failed to lock byte %d", off);
+                } else {
+                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
+                }
                 return ret;
             } else if (s) {
                 s->locked_perm &= ~bit;
@@ -857,7 +869,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
         if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
             ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
-                error_setg(errp, "Failed to lock byte %d", off);
+                int err = -ret;
+
+                if (err == EAGAIN || err == EACCES) {
+                    error_setg(errp, "Failed to lock byte %d", off);
+                } else {
+                    error_setg_errno(errp, err, "Failed to lock byte %d", off);
+                }
                 return ret;
             } else if (s) {
                 s->locked_shared_perm |= bit;
@@ -866,7 +884,7 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
             !(shared_perm_lock_bits & bit)) {
             ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
-                error_setg(errp, "Failed to unlock byte %d", off);
+                error_setg_errno(errp, -ret, "Failed to unlock byte %d", off);
                 return ret;
             } else if (s) {
                 s->locked_shared_perm &= ~bit;
@@ -890,9 +908,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
             ret = qemu_lock_fd_test(fd, off, 1, true);
             if (ret) {
                 char *perm_name = bdrv_perm_names(p);
-                error_setg(errp,
-                           "Failed to get \"%s\" lock",
-                           perm_name);
+                int err = -ret;
+
+                if (err == EAGAIN || err == EACCES) {
+                    error_setg(errp, "Failed to get \"%s\" lock",
+                               perm_name);
+                } else {
+                    error_setg_errno(errp, err,
+                                     "Failed to get \"%s\" lock",
+                                     perm_name);
+                }
                 g_free(perm_name);
                 return ret;
             }
@@ -905,9 +930,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
             ret = qemu_lock_fd_test(fd, off, 1, true);
             if (ret) {
                 char *perm_name = bdrv_perm_names(p);
-                error_setg(errp,
-                           "Failed to get shared \"%s\" lock",
-                           perm_name);
+                int err = -ret;
+
+                if (err == EAGAIN || err == EACCES) {
+                    error_setg(errp, "Failed to get shared \"%s\" lock",
+                               perm_name);
+                } else {
+                    error_setg_errno(errp, err,
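The policy the patch above implements - keep the message terse for the common "lock held elsewhere" errors, append the errno detail otherwise - can be modelled compactly (a Python sketch of the behaviour, not QEMU's error_setg/error_setg_errno themselves):

```python
import errno
import os

def lock_error_message(err: int, off: int) -> str:
    """For EAGAIN/EACCES (another process holds a conflicting lock),
    keep the message short; for anything else, append the strerror
    detail, as error_setg_errno() does."""
    msg = f"Failed to lock byte {off}"
    if err in (errno.EAGAIN, errno.EACCES):
        return msg
    return f"{msg}: {os.strerror(err)}"

print(lock_error_message(errno.EAGAIN, 100))
print(lock_error_message(errno.EBADF, 100))
```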
Re: [PATCH] meson: Propagate gnutls dependency
On 07/01/21 16:56, Roman Bolshakov wrote:
> IMO duplication of dependencies shouldn't be needed for a build system.
> A meta build system should allow private and public dependencies.
> Different rules are applied to them. A private dependency is not
> propagated beyond a target that uses it; a public dependency is
> propagated.
>
> Right now it seems that meson is missing the notion of public and
> private dependencies and that's where the problem arises. The post [1]
> (and the related issue) summarizes what I'm trying to say.

Meson doesn't have a concept of public dependencies because it separates
the private (static library) and the public (declare_dependency) view.
That is, you'd have:

public_deps = [gnutls, anotherpublicdep]
lib = static_library(..., dependencies: public_deps + [aninternaldep])
dep = declare_dependency(link_with: lib, dependencies: public_deps)

The real issue is that Meson's implementation of link_whole for
library-in-library makes sense for one use case (a convenience library
that is linked into another convenience library) but not for another
(grouping code for subsystems). I cannot blame them for this because
link_with is the more common case for the latter; OTOH QEMU is using
link_whole a lot in order to support the *_init() construct.

I really think the correct fix is for Meson to use objects instead of
archives for link_whole, similar to how QEMU Makefiles used to do it.
This would also remove the need for the special .fa suffix, so it would
be an improvement all around.

Paolo

> If we resolve the issue, then we just specify gnutls as a public
> dependency of crypto and all users of crypto would get gnutls headers.
>
> Here's an example of how clearly CMake approaches the issue [2][3]:
>
> add_library(crypto OBJECT crypto-file1.c ...)
> target_link_libraries(crypto PRIVATE aninternaldep
>                              PUBLIC gnutls
>                              anotherpublicdep)
>
> 1. https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
> 2. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
> 3. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents
Re: [PATCH] meson: Propagate gnutls dependency
On Thu, Jan 07, 2021 at 12:41:40PM +0100, Paolo Bonzini wrote:
> On 05/01/21 15:37, Roman Bolshakov wrote:
> > Does it work if you do:
> >
> > crypto_ss.add(authz, qom)
> > libcrypto = static_library('crypto', crypto_ss.sources() + genh,
> >                            dependencies: crypto_ss.dependencies(),
> >                            ...)
> > crypto = declare_dependency(link_whole: libcrypto,
> >                             dependencies: crypto_ss.dependencies())
>
> Ok, so the final attempt is a mix of the three :) Keep the link_whole
> dependencies in the declare_dependency, and add the sourceset
> dependencies there too.

Hi Paolo,

Thanks for the patch, but unfortunately it doesn't resolve the issue. io
and other libraries still can't find gnutls.

I've also tried your meson trans-deps branch and wonder if it's supposed
to fix the issue without any changes to qemu build files? Do you need
any help with the meson changes?

IMO duplication of dependencies shouldn't be needed for a build system.
A meta build system should allow private and public dependencies.
Different rules are applied to them. A private dependency is not
propagated beyond a target that uses it; a public dependency is
propagated. There's also declare_dependency, which has to be always
public because it serves no purpose on its own. declare_dependency is
like an INTERFACE library in CMake.

If a project specifies a dependency that is public, it should be
transitively passed downstream. The build system shouldn't obscurely
hide flags a dependency provides on a case-by-case basis.

Right now it seems that meson is missing the notion of public and
private dependencies and that's where the problem arises. The post [1]
(and the related issue) summarizes what I'm trying to say.

If we resolve the issue, then we just specify gnutls as a public
dependency of crypto and all users of crypto would get gnutls headers.

Here's an example of how clearly CMake approaches the issue [2][3]:

add_library(crypto OBJECT crypto-file1.c ...)
target_link_libraries(crypto PRIVATE aninternaldep
                             PUBLIC gnutls
                             anotherpublicdep)

1. https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
2. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
3. https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents

Regards,
Roman

> diff --git a/meson.build b/meson.build
> index e9bf290966..774df4db8e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1904,7 +1904,8 @@ libqom = static_library('qom', qom_ss.sources() + genh,
>                          dependencies: [qom_ss.dependencies()],
>                          name_suffix: 'fa')
>
> -qom = declare_dependency(link_whole: libqom)
> +qom = declare_dependency(link_whole: libqom,
> +                         dependencies: [qom_ss.dependencies()])
>
>  authz_ss = authz_ss.apply(config_host, strict: false)
>  libauthz = static_library('authz', authz_ss.sources() + genh,
> @@ -1913,7 +1914,7 @@ libauthz = static_library('authz', authz_ss.sources() + genh,
>                            build_by_default: false)
>
>  authz = declare_dependency(link_whole: libauthz,
> -                           dependencies: qom)
> +                           dependencies: [authz_ss.dependencies(), qom])
>
>  crypto_ss = crypto_ss.apply(config_host, strict: false)
>  libcrypto = static_library('crypto', crypto_ss.sources() + genh,
> @@ -1922,7 +1923,7 @@ libcrypto = static_library('crypto', crypto_ss.sources() + genh,
>                             build_by_default: false)
>
>  crypto = declare_dependency(link_whole: libcrypto,
> -                            dependencies: [authz, qom])
> +                            dependencies: [crypto_ss.dependencies(), authz,
> +                                           qom])
>
>  io_ss = io_ss.apply(config_host, strict: false)
>  libio = static_library('io', io_ss.sources() + genh,
> @@ -1931,13 +1932,14 @@ libio = static_library('io', io_ss.sources() + genh,
>                         name_suffix: 'fa',
>                         build_by_default: false)
>
> -io = declare_dependency(link_whole: libio, dependencies: [crypto, qom])
> +io = declare_dependency(link_whole: libio,
> +                        dependencies: [io_ss.dependencies(), crypto, qom])
>
>  libmigration = static_library('migration', sources: migration_files + genh,
>                                name_suffix: 'fa',
>                                build_by_default: false)
>  migration = declare_dependency(link_with: libmigration,
> -                               dependencies: [zlib, qom, io])
> +                               dependencies: [qom, io])
>  softmmu_ss.add(migration)
>
>  block_ss = block_ss.apply(config_host, strict: false)
> @@ -1949,7 +1951,7 @@ libblock = static_library('block', block_ss.sources() + genh,
>
>  block = declare_dependency(link_whole: [libblock],
Re: [PATCH v3 5/5] block/scsi: correctly emulate the VPD block limits page
On 17.12.20 17:56, Maxim Levitsky wrote:
> When the device doesn't support the VPD block limits page, we emulate
> it even for SCSI passthrough. As a part of the emulation we need to
> add it to the 'Supported VPD Pages'.
>
> The code that does this adds it to the page, but it doesn't increase
> the length of the data to be copied to the guest, thus the guest never
> sees the VPD block limits page as supported.

Isn't the problem more generally that if there is a block limits page,
the last supported page is cut off (which, perhaps more likely than not,
is the block limits page, given that it's 0xb0, which is relatively
high)?

> Bump the transfer size by 1 in this case.
>
> Signed-off-by: Maxim Levitsky
> ---
>  hw/scsi/scsi-generic.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Anyway, looks good to me, though I have a hard time following the code,
which yields a rather weak:

Reviewed-by: Max Reitz
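The bug under discussion is easy to model outside QEMU; a minimal sketch (illustrative only, not the actual scsi-generic.c logic) of a Supported VPD Pages response where a page code is appended without bumping the page length field:

```python
def add_supported_page(resp: bytearray, page: int, bump_length: bool) -> bytearray:
    """Append a page code to a Supported VPD Pages response (page 0x00).
    Byte 3 holds the page length; the list of page codes starts at
    byte 4. Forgetting to bump byte 3 hides the appended page."""
    resp = bytearray(resp)
    resp.append(page)
    if bump_length:
        resp[3] += 1
    return resp

def visible_pages(resp: bytes) -> list:
    """Pages the guest actually sees: limited by the length field."""
    return list(resp[4:4 + resp[3]])

# Device reports pages 0x00 and 0x83; we emulate Block Limits (0xb0).
base = bytearray([0x00, 0x00, 0x00, 0x02, 0x00, 0x83])
broken = add_supported_page(base, 0xb0, bump_length=False)
fixed = add_supported_page(base, 0xb0, bump_length=True)
print(visible_pages(broken))  # [0, 131] - 0xb0 is invisible
print(visible_pages(fixed))   # [0, 131, 176]
```

This also illustrates Max's point: whatever page sits past the stale length field is the one cut off, and since 0xb0 sorts high, it is usually the block limits page itself.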
Re: [PATCH v3 4/5] block: use blk_get_max_ioctl_transfer for SCSI passthrough
On 17.12.20 17:56, Maxim Levitsky wrote:
> Switch file-posix to expose only the max_ioctl_transfer limit.
>
> Let the iscsi driver work as it did before, since it is bound by the
> transfer limit in both the regular read/write and the SCSI passthrough
> case.
>
> Switch the scsi-disk and scsi-block drivers to read the SG max transfer
> limits using the new blk_get_max_ioctl_transfer interface.
>
> Fixes: 867eccfed8 ("file-posix: Use max transfer length/segment count
> only for SCSI passthrough")
>
> Signed-off-by: Maxim Levitsky
> ---
>  block/file-posix.c     | 7 ---
>  block/iscsi.c          | 1 +
>  hw/scsi/scsi-generic.c | 4 ++--
>  3 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 2bf4d095a7..c34a7a9599 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1282,13 +1282,14 @@ static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
>                     get_max_transfer_length(s->fd);
>
>      if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> -        bs->bl.max_transfer = pow2floor(ret);
> +        bs->bl.max_ioctl_transfer = pow2floor(ret);
>      }
>
>      ret = bs->sg ? sg_get_max_segments(s->fd) : get_max_segments(s->fd);
>      if (ret > 0) {
> -        bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> -                                           ret * qemu_real_host_page_size);
> +        bs->bl.max_ioctl_transfer =
> +            MIN_NON_ZERO(bs->bl.max_ioctl_transfer,
> +                         ret * qemu_real_host_page_size);
>      }

Do non-SG devices even have a max transfer length then? I would've
thought max_ioctl_transfer simply doesn't apply to them and so could be
left 0.

(Rest looks good - from what I can tell...)

Max

>      raw_refresh_limits(bs, errp);
Re: [PATCH v3 3/5] block: add max_ioctl_transfer to BlockLimits
On 17.12.20 17:56, Maxim Levitsky wrote: Maximum transfer size when accessing a kernel block device is only relevant when using SCSI passthrough (SG_IO ioctl), since only in this case the requests are passed directly to the underlying hardware with no pre-processing. So by “with no pre-processing” you mean in particular no pre-processing by the kernel? I.e., that file-posix for non-SG devices actually has no max transfer limit, because the kernel will split overly long requests by itself? The same is true when using /dev/sg* character devices (which only support SG_IO). Therefore split the block driver's advertised max transfer size into the regular max transfer size and the max transfer size for SCSI passthrough (the new max_ioctl_transfer field). In the next patch, the qemu block drivers that support SCSI passthrough will set the max_ioctl_transfer field, and simultaneously, the block devices that implement SCSI passthrough will switch to 'blk_get_max_ioctl_transfer' to query it and pass it to the guest. 
Signed-off-by: Maxim Levitsky --- block/block-backend.c | 12 block/io.c | 2 ++ include/block/block_int.h | 4 include/sysemu/block-backend.h | 1 + 4 files changed, 19 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index ce78d30794..c1d149a755 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1938,6 +1938,18 @@ uint32_t blk_get_max_transfer(BlockBackend *blk) return MIN_NON_ZERO(max, INT_MAX); } +/* Returns the maximum transfer length, for SCSI passthrough */ +uint32_t blk_get_max_ioctl_transfer(BlockBackend *blk) +{ +BlockDriverState *bs = blk_bs(blk); +uint32_t max = 0; + +if (bs) { +max = bs->bl.max_ioctl_transfer; +} +return MIN_NON_ZERO(max, INT_MAX); +} + int blk_get_max_iov(BlockBackend *blk) { return blk->root->bs->bl.max_iov; diff --git a/block/io.c b/block/io.c index 24205f5168..ac5aea435e 100644 --- a/block/io.c +++ b/block/io.c @@ -126,6 +126,8 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src) { dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer); dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer); +dst->max_ioctl_transfer = MIN_NON_ZERO(dst->max_ioctl_transfer, +src->max_ioctl_transfer); I’d prefer this to be aligned to the opening parenthesis. dst->opt_mem_alignment = MAX(dst->opt_mem_alignment, src->opt_mem_alignment); dst->min_mem_alignment = MAX(dst->min_mem_alignment, diff --git a/include/block/block_int.h b/include/block/block_int.h index 1eeafc118c..c59b0aefc4 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -686,6 +686,10 @@ typedef struct BlockLimits { * clamped down. */ uint32_t max_transfer; +/* Maximal transfer length for SCSI passthrough (ioctl interface) */ +uint32_t max_ioctl_transfer; + + Is there a specific reason you added a double newline here? 
(All other fields in this struct are separated by a single newline) Max /* memory alignment, in bytes so that no bounce buffer is needed */ size_t min_mem_alignment; diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 8203d7f6f9..b019a37b7a 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -203,6 +203,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag); int blk_get_flags(BlockBackend *blk); uint32_t blk_get_request_alignment(BlockBackend *blk); uint32_t blk_get_max_transfer(BlockBackend *blk); +uint32_t blk_get_max_ioctl_transfer(BlockBackend *blk); int blk_get_max_iov(BlockBackend *blk); void blk_set_guest_block_size(BlockBackend *blk, int align); void *blk_try_blockalign(BlockBackend *blk, size_t size);
Re: [PATCH 1/2] file-posix: allow -EBUSY errors during write zeros on raw block devices
On Mon, 2020-11-16 at 15:48 +0100, Kevin Wolf wrote: > Am 11.11.2020 um 16:39 hat Maxim Levitsky geschrieben: > > On Linux, fallocate(fd, FALLOC_FL_PUNCH_HOLE) when it is used on a block > > device, > > without O_DIRECT can return -EBUSY if it races with another write to the > > same page. > > > > Since this is rare and discard is not a critical operation, ignore this > > error > > > > Signed-off-by: Maxim Levitsky > > I'm applying this one for 5.2, it certainly shouldn't hurt and makes > things work at least, even if possibly not in the optimal way. > > Patch 2 seems to be a bit less obvious and discussion is ongoing, so > that's probably more 6.0 material. > > Kevin Any feedback on patch 2? Best regards, Maxim Levitsky
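The behavior the applied patch adds can be sketched like this — a hypothetical wrapper, not QEMU's actual handle_aiocb_discard code: treat -EBUSY from the punch-hole as success, since discard is advisory and the race with a concurrent buffered write is harmless:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical sketch: punch a hole and ignore -EBUSY, which Linux can
 * return for a buffered (non-O_DIRECT) block device when the range
 * races with another write to the same page.  Discard is advisory, so
 * treating that case as success is safe.
 */
static int punch_hole_ignore_ebusy(int fd, off_t offset, off_t len)
{
    int ret;

    do {
        ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                        offset, len);
    } while (ret != 0 && errno == EINTR);

    if (ret != 0 && errno != EBUSY) {
        return -errno;              /* a real error */
    }
    return 0;                       /* success, or an ignored -EBUSY */
}
```

Note that FALLOC_FL_PUNCH_HOLE must be paired with FALLOC_FL_KEEP_SIZE, and that filesystems without punch-hole support still report -EOPNOTSUPP, which a caller would handle separately.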
Re: [PATCH v3 2/5] file-posix: add sg_get_max_segments that actually works with sg
On 17.12.20 17:56, Maxim Levitsky wrote: From: Tom Yan sg devices have different major/minor than their corresponding block devices. Using sysfs to get max segments never really worked for them. Fortunately the sg driver provides an ioctl to get sg_tablesize, which is apparently equivalent to max segments. Signed-off-by: Tom Yan Signed-off-by: Maxim Levitsky --- block/file-posix.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/block/file-posix.c b/block/file-posix.c index cbf1271773..2bf4d095a7 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1179,6 +1179,26 @@ static int sg_get_max_transfer_length(int fd) #endif } +static int sg_get_max_segments(int fd) +{ +/* + * /dev/sg* character devices report 'max_segments' via + * SG_GET_SG_TABLESIZE ioctl + */ + +#ifdef SG_GET_SG_TABLESIZE +long max_segments = 0; + +if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) { As far as I can tell from googling, SG_GET_SG_TABLESIZE takes an int pointer. Apart from that, looks good. Max +return max_segments; +} else { +return -errno; +} +#else +return -ENOSYS; +#endif +} + static int get_max_transfer_length(int fd) { #if defined(BLKSECTGET) @@ -1265,7 +1285,7 @@ static void hdev_refresh_limits(BlockDriverState *bs, Error **errp) bs->bl.max_transfer = pow2floor(ret); } -ret = get_max_segments(s->fd); +ret = bs->sg ? sg_get_max_segments(s->fd) : get_max_segments(s->fd); if (ret > 0) { bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, ret * qemu_real_host_page_size);
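Folding in Max's review note, the helper would look like this (a hedged re-sketch, not the final committed code) — the sg driver reads and writes an int through SG_GET_SG_TABLESIZE, so a long is only accidentally correct on little-endian 64-bit hosts:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#ifdef __linux__
#include <scsi/sg.h>            /* defines SG_GET_SG_TABLESIZE */
#endif

/*
 * Re-sketch with the type fixed: SG_GET_SG_TABLESIZE stores an int
 * (the sg driver's sg_tablesize), so pass an int, not a long.
 */
static int sg_get_max_segments_fixed(int fd)
{
#ifdef SG_GET_SG_TABLESIZE
    int max_segments = 0;

    if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) {
        return max_segments;
    }
    return -errno;
#else
    return -ENOSYS;             /* not a Linux sg build */
#endif
}
```

On a real /dev/sg* node this returns the tablesize; on anything else the ioctl fails (typically -ENOTTY), which the caller treats as "no limit learned".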
Re: [PATCH v3 1/5] file-posix: split hdev_refresh_limits from raw_refresh_limits
On 17.12.20 17:56, Maxim Levitsky wrote: From: Tom Yan We can and should get max transfer length and max segments for all host devices / cdroms (on Linux). Also use MIN_NON_ZERO instead when we clamp max transfer length against max segments. Signed-off-by: Tom Yan Signed-off-by: Maxim Levitsky --- block/file-posix.c | 57 +- 1 file changed, 41 insertions(+), 16 deletions(-) I’m aware that most of my remarks below apply to the pre-patch state just as well, but I feel like now is a good time to raise them, so, here goes: diff --git a/block/file-posix.c b/block/file-posix.c index 9804681d5c..cbf1271773 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1166,6 +1166,10 @@ static int sg_get_max_transfer_length(int fd) int max_bytes = 0; if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) { +/* + * BLKSECTGET for /dev/sg* character devices incorrectly returns + * the max transfer size in bytes (rather than in blocks). + */ return max_bytes; } else { return -errno; @@ -1175,7 +1179,22 @@ static int sg_get_max_transfer_length(int fd) #endif } -static int sg_get_max_segments(int fd) +static int get_max_transfer_length(int fd) +{ +#if defined(BLKSECTGET) +int sect = 0; + +if (ioctl(fd, BLKSECTGET, &sect) == 0) { +return sect << 9; Can this overflow? (I mean, technically it would still be safe, because either the limit is set too low or it isn’t set at all, which would be correct on overflow. But still.) +} else { +return -errno; +} +#else +return -ENOSYS; +#endif +} + +static int get_max_segments(int fd) { #ifdef CONFIG_LINUX char buf[32]; This function stores max_segments (a long) in ret (an int) and returns the latter. Should we guard against overflows? 
@@ -1230,23 +1249,29 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp) { BDRVRawState *s = bs->opaque; -if (bs->sg) { -int ret = sg_get_max_transfer_length(s->fd); +raw_probe_alignment(bs, s->fd, errp); +bs->bl.min_mem_alignment = s->buf_align; +bs->bl.opt_mem_alignment = MAX(s->buf_align, qemu_real_host_page_size); +} + +static void hdev_refresh_limits(BlockDriverState *bs, Error **errp) +{ +BDRVRawState *s = bs->opaque; + +int ret = bs->sg ? sg_get_max_transfer_length(s->fd) : + get_max_transfer_length(s->fd); -if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) { -bs->bl.max_transfer = pow2floor(ret); -} +if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) { +bs->bl.max_transfer = pow2floor(ret); +} -ret = sg_get_max_segments(s->fd); -if (ret > 0) { -bs->bl.max_transfer = MIN(bs->bl.max_transfer, - ret * qemu_real_host_page_size); (1) Can this overflow? (Which I suppose could result in a non-power-of-two result) (2) Even disregarding overflows, is ret * qemu_real_host_page_size guaranteed to be a power of two? 
Max -} +ret = get_max_segments(s->fd); +if (ret > 0) { +bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, + ret * qemu_real_host_page_size); } -raw_probe_alignment(bs, s->fd, errp); -bs->bl.min_mem_alignment = s->buf_align; -bs->bl.opt_mem_alignment = MAX(s->buf_align, qemu_real_host_page_size); +raw_refresh_limits(bs, errp); } static int check_for_dasd(int fd) @@ -3600,7 +3625,7 @@ static BlockDriver bdrv_host_device = { .bdrv_co_pdiscard = hdev_co_pdiscard, .bdrv_co_copy_range_from = raw_co_copy_range_from, .bdrv_co_copy_range_to = raw_co_copy_range_to, -.bdrv_refresh_limits = raw_refresh_limits, +.bdrv_refresh_limits = hdev_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, .bdrv_attach_aio_context = raw_aio_attach_aio_context, @@ -3724,7 +3749,7 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_co_preadv = raw_co_preadv, .bdrv_co_pwritev= raw_co_pwritev, .bdrv_co_flush_to_disk = raw_co_flush_to_disk, -.bdrv_refresh_limits = raw_refresh_limits, +.bdrv_refresh_limits = hdev_refresh_limits, .bdrv_io_plug = raw_aio_plug, .bdrv_io_unplug = raw_aio_unplug, .bdrv_attach_aio_context = raw_aio_attach_aio_context,
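Max's overflow questions (the `sect << 9` in get_max_transfer_length, and the `ret * qemu_real_host_page_size` product here) can both be defused by widening before the arithmetic. A hedged sketch with hypothetical helper names, not the actual file-posix.c code:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical overflow-safe variants of the computations questioned in
 * the review.  Widening to int64_t before shifting or multiplying avoids
 * signed-int overflow (undefined behavior in C); the caller can clamp
 * the result afterwards.
 */
static int64_t sectors_to_bytes(int sect)
{
    return (int64_t)sect << 9;   /* even INT_MAX sectors fits: < 2^40 bytes */
}

static uint32_t clamp_to_int_max(int64_t bytes)
{
    return bytes > INT_MAX ? (uint32_t)INT_MAX : (uint32_t)bytes;
}
```

As for question (2): `ret * qemu_real_host_page_size` is a power of two only when `ret` is, so if a power-of-two limit is actually required, the combined result would additionally need a pow2floor().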
Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
On Thu, Jan 07, 2021 at 10:56:12AM +, Richard W.M. Jones wrote: > On Thu, Jan 07, 2021 at 09:58:17AM +, Richard W.M. Jones wrote: > > On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > Finally to be safe with calculations, to not calculate different > > > maximums for different nodes (depending on cluster size and > > > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30) > > > as absolute maximum bytes length for Qemu. Actually, it's not much less > > > than INT64_MAX. > > > > > +/* > > > + * We want allow aligning requests and disk length up to any 32bit > > > alignment > > > + * and don't afraid of overflow. > > > + * To achieve it, and in the same time use some pretty number as maximum > > > disk > > > + * size, let's define maximum "length" (a limit for any offset/bytes > > > request and > > > + * for disk size) to be the greatest power of 2 less than INT64_MAX. > > > + */ > > > +#define BDRV_MAX_ALIGNMENT (1L << 30) > > > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT)) > > > > This change broke nbdkit tests. > > Actually that's not the only problem. It appears that we're unable to > read or write the last sector of this disk: > > $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c > "r -v $(( 2**63 - 2**30 - 512 )) 512" ' > read failed: Input/output error > > $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w > -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > You can play around with the constants. I found it's possible to read > and write the non-aligned 512 bytes starting at 2^63-2^30-513. Could > be a fencepost error somewhere in qemu? Actually this is a pre-existing bug in qemu. What happens is qemu-io calls qemu_strtosz("9223372035781033472") which returns 0x7fffffffc0000000 and no error. That answer is plain flat out wrong. The reason for that is qemu_strtosz uses floating point for the calculation(!) 
so is limited to 53 bits of precision and silently truncates. It happened we were just lucky that our earlier test with 2^63 - 1024 worked. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
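Richard's diagnosis is easy to reproduce standalone — this hypothetical snippet (not qemu_strtosz itself) shows the rounding: 2^63 - 2^30 - 512 needs 54 significant bits, one more than a double's 53-bit mantissa, so the nearest double is exactly 2^63 - 2^30:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a size string through a double, as qemu_strtosz effectively
 * does.  Only 53 mantissa bits survive, so values needing more
 * precision are silently rounded to a neighboring representable value.
 */
static uint64_t parse_size_via_double(const char *s)
{
    return (uint64_t)strtod(s, NULL);
}
```

With the offset from Richard's test, `parse_size_via_double("9223372035781033472")` (i.e. 2^63 - 2^30 - 512) yields 0x7fffffffc0000000 (2^63 - 2^30): the request lands exactly on the end of the disk, one sector past the last valid offset, hence the I/O error.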
Re: [PATCH] meson: Propagate gnutls dependency
On 05/01/21 15:37, Roman Bolshakov wrote: Does it work if you do: crypto_ss.add(authz, qom) libcrypto = static_library('crypto', crypto_ss.sources() + genh, dependencies: crypto_ss.dependencies(), ...) crypto = declare_dependency(link_whole: libcrypto, dependencies: crypto_ss.dependencies()) Ok, so the final attempt is a mix of the three :) Keep the link_whole dependencies in the declare_dependency, and add the sourceset dependencies there too. diff --git a/meson.build b/meson.build index e9bf290966..774df4db8e 100644 --- a/meson.build +++ b/meson.build @@ -1904,7 +1904,8 @@ libqom = static_library('qom', qom_ss.sources() + genh, dependencies: [qom_ss.dependencies()], name_suffix: 'fa') -qom = declare_dependency(link_whole: libqom) +qom = declare_dependency(link_whole: libqom, + dependencies: [qom_ss.dependencies()]) authz_ss = authz_ss.apply(config_host, strict: false) libauthz = static_library('authz', authz_ss.sources() + genh, @@ -1913,7 +1914,7 @@ libauthz = static_library('authz', authz_ss.sources() + genh, build_by_default: false) authz = declare_dependency(link_whole: libauthz, - dependencies: qom) + dependencies: [authz_ss.dependencies(), qom]) crypto_ss = crypto_ss.apply(config_host, strict: false) libcrypto = static_library('crypto', crypto_ss.sources() + genh, @@ -1922,7 +1923,7 @@ libcrypto = static_library('crypto', crypto_ss.sources() + genh, build_by_default: false) crypto = declare_dependency(link_whole: libcrypto, -dependencies: [authz, qom]) +dependencies: [crypto_ss.dependencies(), authz, qom]) io_ss = io_ss.apply(config_host, strict: false) libio = static_library('io', io_ss.sources() + genh, @@ -1931,13 +1932,14 @@ libio = static_library('io', io_ss.sources() + genh, name_suffix: 'fa', build_by_default: false) -io = declare_dependency(link_whole: libio, dependencies: [crypto, qom]) +io = declare_dependency(link_whole: libio, +dependencies: [io_ss.dependencies(), crypto, qom]) libmigration = static_library('migration', sources: migration_files 
+ genh, name_suffix: 'fa', build_by_default: false) migration = declare_dependency(link_with: libmigration, - dependencies: [zlib, qom, io]) + dependencies: [qom, io]) softmmu_ss.add(migration) block_ss = block_ss.apply(config_host, strict: false) @@ -1949,7 +1951,7 @@ libblock = static_library('block', block_ss.sources() + genh, block = declare_dependency(link_whole: [libblock], link_args: '@block.syms', - dependencies: [crypto, io]) + dependencies: [block_ss.dependencies(), crypto, io]) blockdev_ss = blockdev_ss.apply(config_host, strict: false) libblockdev = static_library('blockdev', blockdev_ss.sources() + genh, @@ -1958,7 +1960,7 @@ libblockdev = static_library('blockdev', blockdev_ss.sources() + genh, build_by_default: false) blockdev = declare_dependency(link_whole: [libblockdev], - dependencies: [block]) + dependencies: [blockdev_ss.dependencies(), block]) qmp_ss = qmp_ss.apply(config_host, strict: false) libqmp = static_library('qmp', qmp_ss.sources() + genh, @@ -1966,7 +1968,8 @@ libqmp = static_library('qmp', qmp_ss.sources() + genh, name_suffix: 'fa', build_by_default: false) -qmp = declare_dependency(link_whole: [libqmp]) +qmp = declare_dependency(link_whole: [libqmp], + dependencies: qmp_ss.dependencies()) libchardev = static_library('chardev', chardev_ss.sources() + genh, name_suffix: 'fa', diff --git a/migration/meson.build b/migration/meson.build index 9645f44005..e1f237b5db 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -9,6 +9,7 @@ migration_files = files( ) softmmu_ss.add(migration_files) +softmmu_ss.add(zlib) softmmu_ss.add(files( 'block-dirty-bitmap.c', 'channel.c',
Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
On Thu, Jan 07, 2021 at 09:58:17AM +, Richard W.M. Jones wrote: > On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote: > > Finally to be safe with calculations, to not calculate different > > maximums for different nodes (depending on cluster size and > > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30) > > as absolute maximum bytes length for Qemu. Actually, it's not much less > > than INT64_MAX. > > > +/* > > + * We want allow aligning requests and disk length up to any 32bit > > alignment > > + * and don't afraid of overflow. > > + * To achieve it, and in the same time use some pretty number as maximum > > disk > > + * size, let's define maximum "length" (a limit for any offset/bytes > > request and > > + * for disk size) to be the greatest power of 2 less than INT64_MAX. > > + */ > > +#define BDRV_MAX_ALIGNMENT (1L << 30) > > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT)) > > This change broke nbdkit tests. Actually that's not the only problem. It appears that we're unable to read or write the last sector of this disk: $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" ' read failed: Input/output error $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' write failed: Input/output error You can play around with the constants. I found it's possible to read and write the non-aligned 512 bytes starting at 2^63-2^30-513. Could be a fencepost error somewhere in qemu? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Re: [PATCH v3 0/5] SCSI: fix transfer limits for SCSI passthrough
On Thu, 2020-12-17 at 18:56 +0200, Maxim Levitsky wrote: > This patch series attempts to provide a solution to the problem of the > transfer > limits of the raw file driver (host_device/file-posix), some of which I > already tried to fix in the past. > > I included 2 patches from Tom Yan which fix two issues with reading the limits > correctly from the */dev/sg* character devices in the first place. > > The only change to these patches is that I tweaked a bit the comments in the > source to better document the /dev/sg quirks. > > The other two patches in this series split the max transfer limits that qemu > block devices expose in two: > One limit is for the regular IO, and another is for the SG_IO (aka > bdrv_*_ioctl), > and the two device drivers (scsi-block and scsi-generic) that use the latter > are switched to the new interface. > > This should ensure that the raw driver can still advertise the unlimited > transfer length, unless it is used for SG_IO, because that yields the highest > performance. > > Also I include a somewhat unrelated fix to a bug I found in qemu's > SCSI passthrough while testing this: > When qemu emulates the VPD block limit page, for a SCSI device that doesn't > implement it, it doesn't really advertise the emulated page to the guest. > > I tested this by doing both regular and SG_IO passthrough of my > USB SD card reader. > > That device turned out to be a perfect device for the task, since it has max > transfer size of 1024 blocks (512K), and it enforces it. > > Also it didn't implement the VPD block limits page, > (transfer size limit probably comes from something USB related) which > triggered > the unrelated bug. > > I was able to see IO errors without the patches, and the wrong max transfer > size in the guest, and with patches both issues were gone. > > I also found an unrelated issue in /dev/sg passthrough in the kernel. 
> It turns out that in-kernel driver has a limitation of 16 requests in flight, > regardless of what underlying device supports. > > With a large multi-threaded fio job and a debug print in qemu, it is easy to > see it, although the errors don't do much harm to the guest as it retries the > IO, and eventually succeed. > It is an open question if this should be solved. > > V2: fixed an issue in a patch from Tom Yan (thanks), and removed > refactoring from last patch according to Paolo's request. > > V3: few cosmetic changes due to the review feedback. > > Maxim Levitsky (3): > block: add max_ioctl_transfer to BlockLimits > block: use blk_get_max_ioctl_transfer for SCSI passthrough > block/scsi: correctly emulate the VPD block limits page > > Tom Yan (2): > file-posix: split hdev_refresh_limits from raw_refresh_limits > file-posix: add sg_get_max_segments that actually works with sg > > block/block-backend.c | 12 ++ > block/file-posix.c | 76 +++--- > block/io.c | 2 + > block/iscsi.c | 1 + > hw/scsi/scsi-generic.c | 13 -- > include/block/block_int.h | 4 ++ > include/sysemu/block-backend.h | 1 + > 7 files changed, 90 insertions(+), 19 deletions(-) > > -- > 2.26.2 > > > Any update on this? Best regards, Maxim Levitsky
Re: [Bug 1910505] [NEW] atomic failure linking with --enable-sanitizers on 32-bit Linux hosts
On 07/01/21 10:10, Philippe Mathieu-Daudé wrote: libblock.fa(block_io.c.o): In function `stat64_max': include/qemu/stats64.h:58: undefined reference to `__atomic_load_8' include/qemu/stats64.h:60: undefined reference to `__atomic_compare_exchange_8' libblock.fa(block_qapi.c.o): In function `stat64_get': include/qemu/stats64.h:40: undefined reference to `__atomic_load_8' libqemuutil.a(util_qsp.c.o): In function `qatomic_set_u64': include/qemu/atomic.h:478: undefined reference to `__atomic_store_8' libqemuutil.a(util_qsp.c.o): In function `qatomic_read_u64': include/qemu/atomic.h:468: undefined reference to `__atomic_load_8' clang: error: linker command failed with exit code 1 (use -v to see invocation) Issue previously reported on the list here: https://www.mail-archive.com/qemu-devel@nongnu.org/msg770128.html Looks like 64-bit atomics are available without sanitizers, but fall back to libatomic with sanitizers (tsan probably?). So CONFIG_ATOMIC64 is not detected properly in configure, because -fsanitize=* flags are added (at "end of CC checks" around line 5000) after the atomic64 test (around 400 lines before that). I'm not sure what breaks if -fsanitize flags are added earlier. Paolo
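The probe in question boils down to something like the following — a hypothetical reconstruction, not QEMU's actual configure fragment:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical reconstruction of a CONFIG_ATOMIC64-style probe: if this
 * compiles and links without -latomic, native 64-bit atomics exist.  On
 * a 32-bit host, compiling the same code with -fsanitize=undefined or
 * -fsanitize=address can instead emit calls to __atomic_load_8 /
 * __atomic_store_8 from libatomic -- which is why the probe has to be
 * built with the final flag set, sanitizer options included.
 */
static uint64_t atomic64_probe(uint64_t *p)
{
    uint64_t v = __atomic_load_n(p, __ATOMIC_RELAXED);
    __atomic_store_n(p, v + 1, __ATOMIC_RELAXED);
    return v;
}
```

Running the probe before appending the -fsanitize flags (as configure currently does, per Paolo's analysis) makes the feature test pass while the real link later fails.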
Re: [PATCH v6 0/3] qcow2: don't leave partially initialized file on image creation
On Thu, 2020-12-17 at 19:09 +0200, Maxim Levitsky wrote: > Use the bdrv_co_delete_file interface to delete the underlying > file if qcow2 initialization fails (e.g. due to bad encryption secret) > > This makes the qcow2 driver behave the same way as the luks driver behaves. > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1845353 > > V3: addressed review feedback and reworked commit messages > > V4: got rid of code duplication by adding bdrv_co_delete_file_noerr > and made the qcow2 driver use this function to delete > both the main and the data file. > > V5: addressed review feedback on reworked version. > > V6: addressed most of the review feedback. > > Best regards, > Maxim Levitsky > > Maxim Levitsky (3): > crypto: luks: Fix tiny memory leak > block: add bdrv_co_delete_file_noerr > block: qcow2: remove the created file on initialization error > > block.c | 22 ++ > block/crypto.c| 13 ++--- > block/qcow2.c | 8 +--- > include/block/block.h | 1 + > 4 files changed, 30 insertions(+), 14 deletions(-) > > -- > 2.26.2 > > > Any update on this? Best regards, Maxim Levitsky
Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH
On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Finally to be safe with calculations, to not calculate different > maximums for different nodes (depending on cluster size and > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30) > as absolute maximum bytes length for Qemu. Actually, it's not much less > than INT64_MAX. > +/* > + * We want allow aligning requests and disk length up to any 32bit alignment > + * and don't afraid of overflow. > + * To achieve it, and in the same time use some pretty number as maximum disk > + * size, let's define maximum "length" (a limit for any offset/bytes request > and > + * for disk size) to be the greatest power of 2 less than INT64_MAX. > + */ > +#define BDRV_MAX_ALIGNMENT (1L << 30) > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT)) This change broke nbdkit tests. We test that qemu can handle a qemu NBD export of size 2^63 - 512, the largest size that (experimentally) we found qemu could safely handle. eg: https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh Before this commit: $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"' image: nbd://localhost:10809 file format: raw virtual size: 8 EiB (9223372036854775296 bytes) disk size: unavailable After this commit: $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"' qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large Can I confirm that this limit is now the new official one and we should adjust nbdkit tests? Or was this change unintentional given that qemu seemed happy to handle 2^63 - 512 disks before? Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not limited to whole sectors). Also the Linux kernel will let you create a /dev/nbdX device of size 2^63 - 1. Rich. 
-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
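For reference, the limit from the quoted patch evaluates to 2^63 - 2^30, which is why the 2^63 - 512 export that used to open now exceeds it. A quick standalone check (macros mirroring the quoted definitions, not the QEMU headers themselves):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the quoted definitions, with explicit 64-bit constants */
#define BDRV_MAX_ALIGNMENT (INT64_C(1) << 30)
#define ALIGN_DOWN(n, m)   ((n) / (m) * (m))   /* like QEMU_ALIGN_DOWN */

static int64_t bdrv_max_length(void)
{
    /* INT64_MAX rounded down to a 1 GiB boundary: 2^63 - 2^30 */
    return ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT);
}
```

Note the result, despite the comment in the patch, is not itself a power of two; it is the largest multiple of 2^30 below INT64_MAX, so any offset up to it can be aligned up to a 1 GiB boundary without overflowing int64_t.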
Re: [Bug 1910505] [NEW] atomic failure linking with --enable-sanitizers on 32-bit Linux hosts
Cc'ing atomic team and qemu-block@ for "qemu/stats64.h". On 1/7/21 9:40 AM, Philippe Mathieu-Daudé wrote: > Public bug reported: > > As of commit 50536341b47, using --enable-sanitizers on 32-bit Linux host: > - displays various warnings > - fails linking > > Using Ubuntu 18.04 (release 20201211.1) and Clang10 on i386: > > [139/675] Compiling C object softmmu.fa.p/softmmu_icount.c.o > In file included from ../softmmu/icount.c:31: > In file included from include/exec/exec-all.h:23: > In file included from ../target/mips/cpu.h:4: > In file included from ../target/mips/cpu-qom.h:23: > In file included from include/hw/core/cpu.h:23: > In file included from include/hw/qdev-core.h:5: > In file included from include/qemu/bitmap.h:16: > In file included from include/qemu/bitops.h:17: > include/qemu/atomic.h:463:12: warning: misaligned atomic operation may > incur significant performance penalty [-Watomic-alignment] > return qatomic_read__nocheck(ptr); >^ > include/qemu/atomic.h:129:5: note: expanded from macro > 'qatomic_read__nocheck' > __atomic_load_n(ptr, __ATOMIC_RELAXED) > ^ > include/qemu/atomic.h:473:5: warning: misaligned atomic operation may > incur significant performance penalty [-Watomic-alignment] > qatomic_set__nocheck(ptr, val); > ^ > include/qemu/atomic.h:138:5: note: expanded from macro > 'qatomic_set__nocheck' > __atomic_store_n(ptr, i, __ATOMIC_RELAXED) > ^ > 2 warnings generated. > [...] 
> > [850/2216] Linking target tests/test-hbitmap > FAILED: tests/test-hbitmap > clang -o tests/test-hbitmap tests/test-hbitmap.p/test-hbitmap.c.o > tests/test-hbitmap.p/iothread.c.o -Wl,--as-needed -Wl,--no-undefined > -pie -Wl,--whole-archive libblock.fa libcrypto.fa libauthz.fa libqom.fa > libio.fa -Wl,--no-whole-archive -Wl,--warn-common -fsanitize=undefined > -fsanitize=address -Wl,-z,relro -Wl,-z,now -m32 -ggdb > -fstack-protector-strong -Wl,--start-group libqemuutil.a > subprojects/libvhost-user/libvhost-user-glib.a > subprojects/libvhost-user/libvhost-user.a libblock.fa libcrypto.fa > libauthz.fa libqom.fa libio.fa @block.syms -lgio-2.0 -lgobject-2.0 > -lglib-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -pthread -lutil -lgnutls > -lm -lgthread-2.0 -lglib-2.0 /usr/lib/i386-linux-gnu/libglib-2.0.so > -liscsi -lgthread-2.0 -lglib-2.0 -laio -lcurl > /usr/lib/i386-linux-gnu/libz.so -lrbd -lrados -lnettle -lgnutls > -Wl,--end-group > libblock.fa(block_io.c.o): In function `stat64_max': > include/qemu/stats64.h:58: undefined reference to `__atomic_load_8' > include/qemu/stats64.h:60: undefined reference to > `__atomic_compare_exchange_8' > libblock.fa(block_qapi.c.o): In function `stat64_get': > include/qemu/stats64.h:40: undefined reference to `__atomic_load_8' > libqemuutil.a(util_qsp.c.o): In function `qatomic_set_u64': > include/qemu/atomic.h:478: undefined reference to `__atomic_store_8' > libqemuutil.a(util_qsp.c.o): In function `qatomic_read_u64': > include/qemu/atomic.h:468: undefined reference to `__atomic_load_8' > clang: error: linker command failed with exit code 1 (use -v to see > invocation) > > Issue previously reported on the list here: > https://www.mail-archive.com/qemu-devel@nongnu.org/msg770128.html > > ** Affects: qemu > Importance: Undecided > Status: New >
Re: To start multiple KVM guests from one qcow2 image with transient disk option
On Tue, Jan 05, 2021 at 15:12:55 +0100, Peter Krempa wrote: > On Mon, Jan 04, 2021 at 15:30:19 -0500, Masayoshi Mizuma wrote: > > On Sat, Dec 19, 2020 at 11:30:39PM -0500, Masayoshi Mizuma wrote: [...] > {"execute":"cont"} > > So that is a no-go. Some disk buses such as IDE don't support hotplug: > > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_hotplug.c#L1074 > > You could try to just instantiate the backend of the disk as read-only, > and then create a writable overlay. You just need to make sure that the > disk will be writable and that it works even for IDE/SATA which doesn't > support read-only: > > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_validate.c#L2634 Okay, I've actually tried implementing my suggestion and that doesn't work. With scsi disks qemu crashes on an assertion failure when I try writing to the disk after setting it up as suggested, and a virtio disk returns an I/O error because it remembers that it was read-only when starting. I'm considering whether it's worth implementing this via hotplug then. Arguably, simple deployments don't need it, and complex ones have some kind of orchestration which can do it externally. This specifically comes with the caveat of the above: currently the overlay used to discard writes is created under the same path as the image and can't be controlled, which might be a problem for complex deployments. Also, the above algorithm with a constant suffix added to the image prevents using it with multiple VMs anyway, since the overlay file name will collide (it's generated based on the same rules). Didn't you run into the collision?