Re: [PATCH v11 00/13] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2021-01-07 Thread Klaus Jensen
On Dec  9 10:57, Klaus Jensen wrote:
> Hi Dmitry,
> 
> By and large, this looks OK to me. There are still some issues here and
> there, and some comments of mine that you did not address, but I will
> follow up with patches to fix that. Let's get this merged.
> 
> It looks like the nvme-next you rebased on is slightly old and missing
> two commits:
> 
>   "hw/block/nvme: remove superfluous NvmeCtrl parameter" and
>   "hw/block/nvme: pull aio error handling"
> 
> It caused a couple of conflicts, but nothing that I couldn't fix up.
> 
> Since I didn't manage to convince anyone about the zsze and zcap
> parameters being in terms of LBAs, I'll revert that to be
> 'zoned.zone_size' and 'zoned.zone_capacity'.
> 
> Finally, would you accept that we skip "hw/block/nvme: Add injection of
> Offline/Read-Only zones" for now? I'd like to discuss it a bit since I
> think the random injects feel a bit ad hoc. Back when I did OCSSD
> emulation with Hans, we did something like this for setting up state
> through a descriptor text file - I think we should explore something
> like that before we lock down the two parameters. I'll amend the final
> documentation commit to not include those parameters.
> 
> Sounds good?
> 
> Otherwise, I think this is mergeable to nvme-next. So, for the series
> (excluding "hw/block/nvme: Add injection of Offline/Read-Only zones"):
> 
> Reviewed-by: Klaus Jensen 
> 

I've applied this series to my local nvme-next. Our repo host is
unavailable this morning (infradead.org), but I will push as soon as
possible.


Thanks!
Klaus




Re: [Bug 1910586] [NEW] SD card size constraint conceptually wrong

2021-01-07 Thread Philippe Mathieu-Daudé
On 1/7/21 8:24 PM, - wrote:
> Public bug reported:
> 
> The patch discussed here:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg720833.html
> introduces an artificial size constraint for SD cards
> that has no relation to reality.
> 
> I'm trying to use an _actual_ **physical** SD card,
> and qemu tells me its size is "invalid".
> 
> Something here appears to be conceptually wrong.
> 
> --
> # fdisk -l /dev/sdg
> Disk /dev/sdg: 14.84 GiB, 15931539456 bytes, 31116288 sectors
> Disk model: USB  SD Reader  
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: dos
> Disk identifier: 0x7a0c8bb0
> 
> Device Boot  Start  End  Sectors  Size Id Type
> /dev/sdg1 2048   524287   522240  255M  c W95 FAT32 (LBA)
> /dev/sdg2   524288 31116287 30592000 14.6G 83 Linux
> # qemu-system-aarch64 -M raspi3 -m 1G -kernel vmlinuz-5.4.79-v8 -dtb bcm2837-rpi-3-b-plus.dtb -append console=ttyAMA0\ root=/dev/mmcblk0p2\ rw -nographic -serial mon:stdio -drive file=/dev/sdg,format=raw
> qemu-system-aarch64: Invalid SD card size: 14.8 GiB
> SD card size has to be a power of 2, e.g. 16 GiB.

Your physical card is likely 16 GiB. The firmware running
on it is free to reserve some capacity for replacing broken
blocks; in your case, ~7%.

We chose to restrict the model to the physical layer to
simplify the design and avoid dealing with security issues.

Patches to improve the model by better matching the real
world are always welcome!
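For reference, the accepted size is simply the card size rounded up to the
next power of two. A sketch of that computation (`sd_pow2_size` is a
hypothetical helper for illustration, not a QEMU function):

```c
#include <stdint.h>

/* Round a card size in bytes up to the next power of two: the size
 * the raspi SD model will accept.  sd_pow2_size() is a hypothetical
 * helper for illustration, not a QEMU function. */
static uint64_t sd_pow2_size(uint64_t size)
{
    uint64_t p = 1;
    while (p < size) {
        p <<= 1;
    }
    return p;
}
```

The 15931539456-byte card above rounds up to 17179869184 bytes (16 GiB),
which is the size a raw dump of the card would need to be grown to with
'qemu-img resize'.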

> You can resize disk images with 'qemu-img resize  '
> (note that this will lose data if you make the image smaller than it 
> currently is).

Indeed, we can remove this warning for block devices.

> --
> 
> The same invocation with a dump of the actual image
> resized to match qemu's odd expectations works fine.
> 
> 
> This is on QEMU 5.2.0, as evidenced by the following:
> --
> # qemu-system-aarch64 -version
> QEMU emulator version 5.2.0
> Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
> --
> 
> Is there a simple workaround that disables this rather
> arbitrary constraint?

No, but you can send a patch :)

Regards,

Phil.



Re: To start multiple KVM guests from one qcow2 image with transient disk option

2021-01-07 Thread Masayoshi Mizuma
On Thu, Jan 07, 2021 at 09:05:42AM +0100, Peter Krempa wrote:
> On Tue, Jan 05, 2021 at 15:12:55 +0100, Peter Krempa wrote:
> > On Mon, Jan 04, 2021 at 15:30:19 -0500, Masayoshi Mizuma wrote:
> > > On Sat, Dec 19, 2020 at 11:30:39PM -0500, Masayoshi Mizuma wrote:
>  
>  [...]
>  
>  >   {"execute":"cont"}
> > 
> > So that is a no-go. Some disk bus-es such as IDE don't support hotplug:
> > 
> > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_hotplug.c#L1074
> > 
> > You could try to just instantiate the backend of the disk as read-only,
> > and then create a writable overlay. You just need to make sure that the
> > disk will be writable and that it works even for IDE/SATA which doesn't
> > support read-only:
> > 
> > https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_validate.c#L2634
> 
> Okay, I've actually tried implementing my suggestion and that doesn't
> work. With SCSI disks, qemu crashes on an assertion failure when I try
> writing to the disk after setting it up as suggested, and a virtio disk
> returns an I/O error, as it remembers that it was read-only when
> starting.
> 
> I'm considering whether it's worth implementing this via hotplug then.
> 
> Arguably simple deployments don't need it and complex ones have some
> kind of orchestration which can do it externally.
> 
> This specifically comes with the caveat noted above: currently the
> overlay used to discard writes is created under the same path as the
> image and can't be controlled, which might be a problem for complex
> deployments.
> 
> Also, the above algorithm with a constant suffix added to the image
> prevents using it with multiple VMs anyway, since the overlay file
> name will collide (it's generated based on the same rules).
> 
> Didn't you run into the collision?

Yes, I needed to set a different filename for each guest with the
blockdev-add QMP command.
Is it a good idea to add the guest name as a suffix to the filename? Like this:

Guest0 QMP:

  
{"execute":"blockdev-add","arguments":{"driver":"file","filename":"/var/lib/libvirt/images/guest.TRANSIENT-Guest0","node-name":"storage2","auto-read-only":true,"discard":"unmap"}}

Guest1 QMP:

  
{"execute":"blockdev-add","arguments":{"driver":"file","filename":"/var/lib/libvirt/images/guest.TRANSIENT-Guest1","node-name":"storage2","auto-read-only":true,"discard":"unmap"}}
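The naming rule proposed above could be computed with a small helper like
this (a sketch of the proposal only; `transient_path` and the
".TRANSIENT-<guest>" scheme are not an existing libvirt or QEMU API):

```c
#include <stdio.h>

/* Build a per-guest transient overlay filename, so that two guests
 * started from the same base image do not collide on the overlay
 * path.  The ".TRANSIENT-<guest>" rule is just the proposal above,
 * not an existing libvirt or QEMU convention. */
static void transient_path(char *buf, size_t len,
                           const char *image, const char *guest)
{
    snprintf(buf, len, "%s.TRANSIENT-%s", image, guest);
}
```

With this rule, Guest0 and Guest1 get distinct overlay files even though
both are generated deterministically from the same base image path.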

Thanks,
Masa



Re: Potential regression in 'qemu-img convert' to LVM

2021-01-07 Thread Nir Soffer
On Tue, Sep 15, 2020 at 2:51 PM Stefan Reiter  wrote:
>
> On 9/15/20 11:08 AM, Nir Soffer wrote:
> > On Mon, Sep 14, 2020 at 3:25 PM Stefan Reiter  wrote:
> >>
> >> Hi list,
> >>
> >> following command fails since 5.1 (tested on kernel 5.4.60):
> >>
> >> # qemu-img convert -p -f raw -O raw /dev/zvol/pool/disk-1 /dev/vg/disk-1
> >> qemu-img: error while writing at byte 2157968896: Device or resource busy
> >>
> >> (source is ZFS here, but it doesn't matter in practice; it always fails
> >> the same way, and the offset changes slightly but consistently hovers
> >> around 2^31)
> >>
> >> strace shows the following:
> >> fallocate(13, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2157968896,
> >> 4608) = -1 EBUSY (Device or resource busy)
> >
> > What is the size of the LV?
> >
>
> Same as the source, 5GB in my test case. Created with:
>
> # lvcreate -ay --size 5242880k --name disk-1 vg
>
> > Does it happen if you change sparse minimum size (-S)?
> >
> > For example: -S 64k
> >
> >  qemu-img convert -p -f raw -O raw -S 64k /dev/zvol/pool/disk-1
> > /dev/vg/disk-1
> >
>
> Tried a few different values, always the same result: EBUSY at byte
> 2157968896.
>
> >> Other fallocate calls leading up to this work fine.
> >>
> >> This happens since commit edafc70c0c "qemu-img convert: Don't pre-zero
> >> images", before that all fallocates happened at the start. Reverting the
> >> commit and calling qemu-img exactly the same way on the same data works
> >> fine.
> >
> > But slowly, doing up to 100% more work for fully allocated images.
> >
>
> Of course, I'm not saying the patch is wrong, reverting it just avoids
> triggering the bug.
>
> >> Simply retrying the syscall on EBUSY (like EINTR) does *not* work,
> >> once it fails it keeps failing with the same error.
> >>
> >> I couldn't find anything related to EBUSY on fallocate, and it only
> >> happens on LVM targets... Any idea or pointers where to look?
> >
> > Is this thin LV?
> >
>
> No, regular LV. See command above.
>
> > This works for us using regular LVs.
> >
> > Which kernel? which distro?
> >
>
> Reproducible on:
> * PVE w/ kernel 5.4.60 (Ubuntu based)
> * Manjaro w/ kernel 5.8.6
>
> I found that it does not happen with all images, I suppose there must be
> a certain number of smaller holes for it to happen. I am using a VM
> image with a bare-bones Alpine Linux installation, but it's not an
> isolated case, we've had two people report the issue on our bug tracker:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3002

I think that this issue may be fixed by
https://lists.nongnu.org/archive/html/qemu-block/2020-11/msg00358.html

Nir




Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Paolo Bonzini
On Thu, Jan 7, 2021, 20:36 Roman Bolshakov  wrote:

> > No I think that Meson should simply explode link_whole libraries to their
> > constituent objects.  This way duplicates are avoided.
> >
>
> Ok. I've looked through related changes in meson and it flattens object
> files implicitly for link_with/link_whole parameters of static_library:
>
>   https://github.com/mesonbuild/meson/pull/6030/files
>
> But qemu adds dependencies to the source set and populates the dependencies
> parameter of static_library and declare_dependency, and we get duplicate
> symbols:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00411.html
>
> Perhaps it's a bug then.
>

No, the same deduplication is not done for executables, because executables
use libraries directly and not their object files.

Paolo

>
> Regards,
> Roman
>
>


Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Roman Bolshakov
On Thu, Jan 07, 2021 at 07:22:06PM +0100, Paolo Bonzini wrote:
> On 07/01/21 19:18, Roman Bolshakov wrote:
> > 
> > > The real issue is that Meson's implementation of link_whole for
> > > library-in-library makes sense for one use case (convenience library that
> > > is linked into another convenience library) but not for another (grouping
> > > code for subsystems).  I cannot blame them for this because link_with is a
> > > more common case for the latter; OTOH QEMU is using link_whole a lot in
> > > order to support the *_init() construct.
> > > 
> > > I really think the correct fix is for Meson to use objects instead of
> > > archives for link_whole, similar to how QEMU Makefiles used to do it. This
> > > would also remove the need for the special .fa suffix, so it would be an
> > > improvement all around.
> > > 
> > Does it mean that we need a kind of object target in meson? Do you think
> > this interface would work?
> > 
> > crypto_objs = object_library(..., dependencies: public_deps + [aninternaldep])
> > crypto = declare_dependency(link_with: crypto_objs, dependencies: public_deps)
> 
> No I think that Meson should simply explode link_whole libraries to their
> constituent objects.  This way duplicates are avoided.
> 

Ok. I've looked through related changes in meson and it flattens object
files implicitly for link_with/link_whole parameters of static_library:

  https://github.com/mesonbuild/meson/pull/6030/files

But qemu adds dependencies to the source set and populates the dependencies
parameter of static_library and declare_dependency, and we get duplicate
symbols:

  https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00411.html

Perhaps it's a bug then.

Regards,
Roman



Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Paolo Bonzini

On 07/01/21 19:18, Roman Bolshakov wrote:



The real issue is that Meson's implementation of link_whole for
library-in-library makes sense for one use case (convenience library that is
linked into another convenience library) but not for another (grouping code
for subsystems).  I cannot blame them for this because link_with is a more
common case for the latter; OTOH QEMU is using link_whole a lot in order to
support the *_init() construct.

I really think the correct fix is for Meson to use objects instead of
archives for link_whole, similar to how QEMU Makefiles used to do it. This
would also remove the need for the special .fa suffix, so it would be an
improvement all around.


Does it mean that we need a kind of object target in meson? Do you think
this interface would work?

crypto_objs = object_library(..., dependencies: public_deps + [aninternaldep])
crypto = declare_dependency(link_with: crypto_objs, dependencies: public_deps)


No I think that Meson should simply explode link_whole libraries to 
their constituent objects.  This way duplicates are avoided.


Paolo




Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Roman Bolshakov
On Thu, Jan 07, 2021 at 05:23:54PM +0100, Paolo Bonzini wrote:
> On 07/01/21 16:56, Roman Bolshakov wrote:
> > IMO duplication of dependencies shouldn't be needed for a build system.
> > A meta build system should allow private and public dependencies, with
> > different rules applied to them. A private dependency is not propagated
> > beyond the target that uses it; a public dependency is propagated.
> > 
> > Right now it seems that meson is missing the notion of public and
> > private dependencies and that's where the problem arises. The post [1] (and
> > the related issue) summarizes what I'm trying to say.
> 
> Meson doesn't have a concept of public dependencies because it separates the
> private (static library) and the public (declare_dependency) view. That is
> you'd have:
> 
> public_deps = [gnutls, anotherpublicdep]
> lib = static_library(..., dependencies: public_deps + [aninternaldep])
> dep = declare_dependency(link_with: lib, dependencies: public_deps)
> 

Thanks! This wasn't obvious to me. But what's not clear is that CMake can
do both collections of objects (what I provided in the example) and
static libraries, and they're different. I assume what you have shown
would look in CMake like this (please note that STATIC is used instead of
OBJECT):

add_library(crypto STATIC crypto-file1.c ...)
target_link_libraries(crypto PRIVATE aninternaldep
  PUBLIC  gnutls
  anotherpublicdep)


That explains why an attempt to use dependencies between link_whole static
libraries in meson causes symbol duplication. CMake, on the other hand, can
just make a collection of objects, or even a chain of collections of
objects. They'll be linked in fully only into a final static library,
shared library, or executable.

> The real issue is that Meson's implementation of link_whole for
> library-in-library makes sense for one use case (convenience library that is
> linked into another convenience library) but not for another (grouping code
> for subsystems).  I cannot blame them for this because link_with is a more
> common case for the latter; OTOH QEMU is using link_whole a lot in order to
> support the *_init() construct.
> 
> I really think the correct fix is for Meson to use objects instead of
> archives for link_whole, similar to how QEMU Makefiles used to do it. This
> would also remove the need for the special .fa suffix, so it would be an
> improvement all around.
> 

Does it mean that we need a kind of object target in meson? Do you think
this interface would work?

crypto_objs = object_library(..., dependencies: public_deps + [aninternaldep])
crypto = declare_dependency(link_with: crypto_objs, dependencies: public_deps)

Regards,
Roman

> Paolo
> 
> > If we resolve the issue, then we just specify gnutls as a public
> > dependency of crypto and all users of crypto would get gnutls headers.
> > 
> > Here's an example of how CMake approaches the issue [2][3]:
> > 
> > add_library(crypto OBJECT crypto-file1.c ...)
> > target_link_libraries(crypto PRIVATE aninternaldep
> >   PUBLIC  gnutls
> >   anotherpublicdep)
> > 
> > 1.https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
> > 2.https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
> > 3.https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents
> 



Re: [PATCH v3] block: report errno when flock fcntl fails

2021-01-07 Thread David Edmondson
Ugh, please ignore this one. I didn't remove the test changes, but the
tests passed, which I don't understand.

On Thursday, 2021-01-07 at 17:55:38 GMT, David Edmondson wrote:

> When a call to fcntl(2) for the purpose of manipulating file locks
> fails with an error other than EAGAIN or EACCES, report the error
> returned by fcntl.
>
> EAGAIN or EACCES are elided as they are considered to be common
> failures, indicating that a conflicting lock is held by another
> process.
>
> Signed-off-by: David Edmondson 
> ---
> v3:
> - Remove the now unnecessary updates to the test framework (Max).
> - Elide the error detail for EAGAIN or EACCES when locking (Kevin,
>   sort-of Max).
> - Philippe and Vladimir sent Reviewed-by, but things have changed
>   noticeably, so I didn't add them (dme).
>
>  block/file-posix.c | 52 +-
>  tests/qemu-iotests/153.out | 76 +++---
>  tests/qemu-iotests/182.out |  2 +-
>  3 files changed, 81 insertions(+), 49 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 00cdaaa2d4..c5142f7ffa 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -836,7 +836,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>  if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
>  ret = qemu_lock_fd(fd, off, 1, false);
>  if (ret) {
> -error_setg(errp, "Failed to lock byte %d", off);
> +int err = -ret;
> +
> +if (err == EAGAIN || err == EACCES) {
> +error_setg(errp, "Failed to lock byte %d", off);
> +} else {
> +error_setg_errno(errp, err, "Failed to lock byte %d", off);
> +}
>  return ret;
>  } else if (s) {
>  s->locked_perm |= bit;
> @@ -844,7 +850,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>  } else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
>  ret = qemu_unlock_fd(fd, off, 1);
>  if (ret) {
> -error_setg(errp, "Failed to unlock byte %d", off);
> +int err = -ret;
> +
> +if (err == EAGAIN || err == EACCES) {
> +error_setg(errp, "Failed to unlock byte %d", off);
> +} else {
> +error_setg_errno(errp, err, "Failed to unlock byte %d", off);
> +}
>  return ret;
>  } else if (s) {
>  s->locked_perm &= ~bit;
> @@ -857,7 +869,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
>  if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
>  ret = qemu_lock_fd(fd, off, 1, false);
>  if (ret) {
> -error_setg(errp, "Failed to lock byte %d", off);
> +int err = -ret;
> +
> +if (err == EAGAIN || err == EACCES) {
> +error_setg(errp, "Failed to lock byte %d", off);
> +} else {
> +error_setg_errno(errp, err, "Failed to lock byte %d", off);
> +}
>  return ret;
>  } else if (s) {
>  s->locked_shared_perm |= bit;
> @@ -866,7 +884,7 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
> !(shared_perm_lock_bits & bit)) {
>  ret = qemu_unlock_fd(fd, off, 1);
>  if (ret) {
> -error_setg(errp, "Failed to unlock byte %d", off);
> +error_setg_errno(errp, -ret, "Failed to unlock byte %d", off);
>  return ret;
>  } else if (s) {
>  s->locked_shared_perm &= ~bit;
> @@ -890,9 +908,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
>  ret = qemu_lock_fd_test(fd, off, 1, true);
>  if (ret) {
>  char *perm_name = bdrv_perm_names(p);
> -error_setg(errp,
> -   "Failed to get \"%s\" lock",
> -   perm_name);
> +int err = -ret;
> +
> +if (err == EAGAIN || err == EACCES) {
> +error_setg(errp, "Failed to get \"%s\" lock",
> +   perm_name);
> +} else {
> +error_setg_errno(errp, err,
> + "Failed to get \"%s\" lock",
> + perm_name);
> +}
>  g_free(perm_name);
>  return ret;
>  }
> @@ -905,9 +930,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
>  ret = qemu_lock_fd_test(fd, off, 1, true);
>  if (ret) {
>  char *perm_name = bdrv_perm_names(p);
> -error_setg(errp,
> -   

[PATCH v3] block: report errno when flock fcntl fails

2021-01-07 Thread David Edmondson
When a call to fcntl(2) for the purpose of manipulating file locks
fails with an error other than EAGAIN or EACCES, report the error
returned by fcntl.

EAGAIN or EACCES are elided as they are considered to be common
failures, indicating that a conflicting lock is held by another
process.

Signed-off-by: David Edmondson 
---
v3:
- Remove the now unnecessary updates to the test framework (Max).
- Elide the error detail for EAGAIN or EACCES when locking (Kevin,
  sort-of Max).
- Philippe and Vladimir sent Reviewed-by, but things have changed
  noticeably, so I didn't add them (dme).

 block/file-posix.c | 52 +-
 tests/qemu-iotests/153.out | 76 +++---
 tests/qemu-iotests/182.out |  2 +-
 3 files changed, 81 insertions(+), 49 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 00cdaaa2d4..c5142f7ffa 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -836,7 +836,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
 if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
 ret = qemu_lock_fd(fd, off, 1, false);
 if (ret) {
-error_setg(errp, "Failed to lock byte %d", off);
+int err = -ret;
+
+if (err == EAGAIN || err == EACCES) {
+error_setg(errp, "Failed to lock byte %d", off);
+} else {
+error_setg_errno(errp, err, "Failed to lock byte %d", off);
+}
 return ret;
 } else if (s) {
 s->locked_perm |= bit;
@@ -844,7 +850,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
 } else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
 ret = qemu_unlock_fd(fd, off, 1);
 if (ret) {
-error_setg(errp, "Failed to unlock byte %d", off);
+int err = -ret;
+
+if (err == EAGAIN || err == EACCES) {
+error_setg(errp, "Failed to unlock byte %d", off);
+} else {
+error_setg_errno(errp, err, "Failed to unlock byte %d", off);
+}
 return ret;
 } else if (s) {
 s->locked_perm &= ~bit;
@@ -857,7 +869,13 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
 if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
 ret = qemu_lock_fd(fd, off, 1, false);
 if (ret) {
-error_setg(errp, "Failed to lock byte %d", off);
+int err = -ret;
+
+if (err == EAGAIN || err == EACCES) {
+error_setg(errp, "Failed to lock byte %d", off);
+} else {
+error_setg_errno(errp, err, "Failed to lock byte %d", off);
+}
 return ret;
 } else if (s) {
 s->locked_shared_perm |= bit;
@@ -866,7 +884,7 @@ static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
!(shared_perm_lock_bits & bit)) {
 ret = qemu_unlock_fd(fd, off, 1);
 if (ret) {
-error_setg(errp, "Failed to unlock byte %d", off);
+error_setg_errno(errp, -ret, "Failed to unlock byte %d", off);
 return ret;
 } else if (s) {
 s->locked_shared_perm &= ~bit;
@@ -890,9 +908,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
 ret = qemu_lock_fd_test(fd, off, 1, true);
 if (ret) {
 char *perm_name = bdrv_perm_names(p);
-error_setg(errp,
-   "Failed to get \"%s\" lock",
-   perm_name);
+int err = -ret;
+
+if (err == EAGAIN || err == EACCES) {
+error_setg(errp, "Failed to get \"%s\" lock",
+   perm_name);
+} else {
+error_setg_errno(errp, err,
+ "Failed to get \"%s\" lock",
+ perm_name);
+}
 g_free(perm_name);
 return ret;
 }
@@ -905,9 +930,16 @@ static int raw_check_lock_bytes(int fd, uint64_t perm, uint64_t shared_perm,
 ret = qemu_lock_fd_test(fd, off, 1, true);
 if (ret) {
 char *perm_name = bdrv_perm_names(p);
-error_setg(errp,
-   "Failed to get shared \"%s\" lock",
-   perm_name);
+int err = -ret;
+
+if (err == EAGAIN || err == EACCES) {
+error_setg(errp, "Failed to get shared \"%s\" lock",
+   perm_name);
+} else {
+error_setg_errno(errp, err,
+ 
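The reporting rule the patch applies in each hunk (elide the errno detail
for the common contended-lock errors, include it otherwise) can be sketched
standalone like this. `format_lock_error` is an illustrative helper, not
the QEMU code, which reports through error_setg()/error_setg_errno():

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Sketch of the patch's reporting rule: EAGAIN/EACCES indicate a
 * conflicting lock held by another process (common, so no errno
 * detail is added); any other error is unexpected, so append the
 * strerror() text.  Illustrative helper, not the QEMU code. */
static void format_lock_error(char *buf, size_t len, int err, int off)
{
    if (err == EAGAIN || err == EACCES) {
        snprintf(buf, len, "Failed to lock byte %d", off);
    } else {
        snprintf(buf, len, "Failed to lock byte %d: %s", off,
                 strerror(err));
    }
}
```

This is why the iotest output changes only for the non-contended failure
paths: the common EAGAIN/EACCES messages stay exactly as before.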

Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Paolo Bonzini

On 07/01/21 16:56, Roman Bolshakov wrote:

IMO duplication of dependencies shouldn't be needed for a build system.
A meta build system should allow private and public dependencies, with
different rules applied to them. A private dependency is not propagated
beyond the target that uses it; a public dependency is propagated.

Right now it seems that meson is missing the notion of public and
private dependencies and that's where the problem arises. The post [1] (and
the related issue) summarizes what I'm trying to say.


Meson doesn't have a concept of public dependencies because it separates 
the private (static library) and the public (declare_dependency) view. 
That is you'd have:


public_deps = [gnutls, anotherpublicdep]
lib = static_library(..., dependencies: public_deps + [aninternaldep])
dep = declare_dependency(link_with: lib, dependencies: public_deps)

The real issue is that Meson's implementation of link_whole for 
library-in-library makes sense for one use case (convenience library 
that is linked into another convenience library) but not for another 
(grouping code for subsystems).  I cannot blame them for this because 
link_with is a more common case for the latter; OTOH QEMU is using 
link_whole a lot in order to support the *_init() construct.


I really think the correct fix is for Meson to use objects instead of 
archives for link_whole, similar to how QEMU Makefiles used to do it. 
This would also remove the need for the special .fa suffix, so it would 
be an improvement all around.


Paolo


If we resolve the issue, then we just specify gnutls as a public
dependency of crypto and all users of crypto would get gnutls headers.

Here's an example of how CMake approaches the issue [2][3]:

add_library(crypto OBJECT crypto-file1.c ...)
target_link_libraries(crypto PRIVATE aninternaldep
  PUBLIC  gnutls
  anotherpublicdep)

1.https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
2.https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
3.https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents





Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Roman Bolshakov
On Thu, Jan 07, 2021 at 12:41:40PM +0100, Paolo Bonzini wrote:
> On 05/01/21 15:37, Roman Bolshakov wrote:
> > Does it work if you do:
> > 
> > crypto_ss.add(authz, qom)
> > libcrypto = static_library('crypto', crypto_ss.sources() + genh,
> > dependencies: crypto_ss.dependencies(),
> > ...)
> > crypto = declare_dependency(link_whole: libcrypto,
> >  dependencies: crypto_ss.dependencies())
> 
> Ok, so the final attempt is a mix of the three :)  Keep the link_whole
> dependencies in the declare_dependency, and add the sourceset dependencies
> there too.

Hi Paolo,

Thanks for the patch, but unfortunately it doesn't resolve the issue.
io and other libraries still can't find gnutls.

I've also tried your meson trans-deps branch and wonder if it's supposed
to fix the issue without any changes to qemu build files?
Do you need any help with meson changes?

IMO duplication of dependencies shouldn't be needed for a build system.
A meta build system should allow private and public dependencies, with
different rules applied to them. A private dependency is not propagated
beyond the target that uses it; a public dependency is propagated. There's
also declare_dependency, which has to be always public because it serves
no purpose on its own. declare_dependency is like an INTERFACE library in
CMake.

If a project specifies a dependency as public, it should be transitively
passed downstream. The build system shouldn't obscurely hide the flags a
dependency provides on a case-by-case basis.

Right now it seems that meson is missing the notion of public and
private dependencies and that's where the problem arises. The post [1] (and
the related issue) summarizes what I'm trying to say.

If we resolve the issue, then we just specify gnutls as a public
dependency of crypto and all users of crypto would get gnutls headers.

Here's an example of how CMake approaches the issue [2][3]:

add_library(crypto OBJECT crypto-file1.c ...)
target_link_libraries(crypto PRIVATE aninternaldep
 PUBLIC  gnutls
 anotherpublicdep)

1. https://github.com/mesonbuild/meson/issues/495#issuecomment-206178570
2. 
https://cmake.org/cmake/help/latest/command/target_link_libraries.html#linking-object-libraries
3. 
https://cmake.org/cmake/help/latest/command/target_link_libraries.html#libraries-for-a-target-and-or-its-dependents

Regards,
Roman

> 
> diff --git a/meson.build b/meson.build
> index e9bf290966..774df4db8e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1904,7 +1904,8 @@ libqom = static_library('qom', qom_ss.sources() +
> genh,
>  dependencies: [qom_ss.dependencies()],
>  name_suffix: 'fa')
> 
> -qom = declare_dependency(link_whole: libqom)
> +qom = declare_dependency(link_whole: libqom,
> + dependencies: [qom_ss.dependencies()])
> 
>  authz_ss = authz_ss.apply(config_host, strict: false)
>  libauthz = static_library('authz', authz_ss.sources() + genh,
> @@ -1913,7 +1914,7 @@ libauthz = static_library('authz', authz_ss.sources()
> + genh,
>build_by_default: false)
> 
>  authz = declare_dependency(link_whole: libauthz,
> -   dependencies: qom)
> +  dependencies: [authz_ss.dependencies(), qom])
> 
>  crypto_ss = crypto_ss.apply(config_host, strict: false)
>  libcrypto = static_library('crypto', crypto_ss.sources() + genh,
> @@ -1922,7 +1923,7 @@ libcrypto = static_library('crypto',
> crypto_ss.sources() + genh,
> build_by_default: false)
> 
>  crypto = declare_dependency(link_whole: libcrypto,
> -dependencies: [authz, qom])
> +dependencies: [crypto_ss.dependencies(), authz,
> qom])
> 
>  io_ss = io_ss.apply(config_host, strict: false)
>  libio = static_library('io', io_ss.sources() + genh,
> @@ -1931,13 +1932,14 @@ libio = static_library('io', io_ss.sources() + genh,
> name_suffix: 'fa',
> build_by_default: false)
> 
> -io = declare_dependency(link_whole: libio, dependencies: [crypto, qom])
> +io = declare_dependency(link_whole: libio,
> +dependencies: [io_ss.dependencies(), crypto, qom])
> 
>  libmigration = static_library('migration', sources: migration_files + genh,
>name_suffix: 'fa',
>build_by_default: false)
>  migration = declare_dependency(link_with: libmigration,
> -   dependencies: [zlib, qom, io])
> +   dependencies: [qom, io])
>  softmmu_ss.add(migration)
> 
>  block_ss = block_ss.apply(config_host, strict: false)
> @@ -1949,7 +1951,7 @@ libblock = static_library('block', block_ss.sources()
> + genh,
> 
>  block = declare_dependency(link_whole: [libblock],
>

Re: [PATCH v3 5/5] block/scsi: correctly emulate the VPD block limits page

2021-01-07 Thread Max Reitz

On 17.12.20 17:56, Maxim Levitsky wrote:

When the device doesn't support the VPD block limits page, we emulate it even
for SCSI passthrough.

As part of the emulation we need to add it to the 'Supported VPD Pages' page.

The code that does this adds it to the page, but it doesn't increase the length
of the data to be copied to the guest, thus the guest never sees the VPD block
limits page as supported.


Isn’t the problem more generally that if there is a block limits page,
the last supported page is cut off (which, perhaps more likely than not,
is the block limits page, given that it’s 0xb0, which is relatively high)?



Bump the transfer size by 1 in this case.

Signed-off-by: Maxim Levitsky 
---
  hw/scsi/scsi-generic.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)


Anyway, looks good to me, though I have a hard time following the code, 
which yields a rather weak:


Reviewed-by: Max Reitz 




Re: [PATCH v3 4/5] block: use blk_get_max_ioctl_transfer for SCSI passthrough

2021-01-07 Thread Max Reitz

On 17.12.20 17:56, Maxim Levitsky wrote:

Switch file-posix to expose only the max_ioctl_transfer limit.

Let the iscsi driver work as it did before, since it is bound by the transfer
limit both for regular reads/writes and in the SCSI passthrough case.

Switch the scsi-disk and scsi-block drivers to read the SG max transfer limits
using the new blk_get_max_ioctl_transfer interface.


Fixes: 867eccfed8 ("file-posix: Use max transfer length/segment count only for SCSI passthrough")
Signed-off-by: Maxim Levitsky 
---
  block/file-posix.c | 7 ---
  block/iscsi.c  | 1 +
  hw/scsi/scsi-generic.c | 4 ++--
  3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 2bf4d095a7..c34a7a9599 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1282,13 +1282,14 @@ static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
 get_max_transfer_length(s->fd);
  
  if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {

-bs->bl.max_transfer = pow2floor(ret);
+bs->bl.max_ioctl_transfer = pow2floor(ret);
  }
  
  ret = bs->sg ? sg_get_max_segments(s->fd) : get_max_segments(s->fd);

  if (ret > 0) {
-bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
-   ret * qemu_real_host_page_size);
+bs->bl.max_ioctl_transfer =
+MIN_NON_ZERO(bs->bl.max_ioctl_transfer,
+ ret * qemu_real_host_page_size);
  }


Do non-SG devices even have a max transfer length then?  I would’ve 
thought max_ioctl_transfer simply doesn’t apply to them and so could be 
left 0.



(Rest looks good – from what I can tell...)

Max

  
  raw_refresh_limits(bs, errp);





Re: [PATCH v3 3/5] block: add max_ioctl_transfer to BlockLimits

2021-01-07 Thread Max Reitz

On 17.12.20 17:56, Maxim Levitsky wrote:

The maximum transfer size when accessing a kernel block device is only relevant
when using SCSI passthrough (the SG_IO ioctl), since only in this case are the
requests passed directly to the underlying hardware with no pre-processing.


So by “with no pre-processing” you mean in particular no pre-processing 
by the kernel?  I.e., that file-posix for non-SG devices actually has no 
max transfer limit, because the kernel will split overly long requests 
by itself?



The same is true when using /dev/sg* character devices (which only support SG_IO).

Therefore split the block driver's advertised max transfer size into the regular
max transfer size and the max transfer size for SCSI passthrough
(the new max_ioctl_transfer field).

In the next patch, the qemu block drivers that support SCSI passthrough
will set the max_ioctl_transfer field, and the block devices that implement
SCSI passthrough will switch to 'blk_get_max_ioctl_transfer' to query it and
pass it to the guest.

Signed-off-by: Maxim Levitsky 
---
  block/block-backend.c  | 12 
  block/io.c |  2 ++
  include/block/block_int.h  |  4 
  include/sysemu/block-backend.h |  1 +
  4 files changed, 19 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index ce78d30794..c1d149a755 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1938,6 +1938,18 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
  return MIN_NON_ZERO(max, INT_MAX);
  }
  
+/* Returns the maximum transfer length, for SCSI passthrough */

+uint32_t blk_get_max_ioctl_transfer(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+uint32_t max = 0;
+
+if (bs) {
+max = bs->bl.max_ioctl_transfer;
+}
+return MIN_NON_ZERO(max, INT_MAX);
+}
+
  int blk_get_max_iov(BlockBackend *blk)
  {
  return blk->root->bs->bl.max_iov;
diff --git a/block/io.c b/block/io.c
index 24205f5168..ac5aea435e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -126,6 +126,8 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
  {
  dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
  dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
+dst->max_ioctl_transfer = MIN_NON_ZERO(dst->max_ioctl_transfer,
+src->max_ioctl_transfer);


I’d prefer this to be aligned to the opening parenthesis.


  dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
   src->opt_mem_alignment);
  dst->min_mem_alignment = MAX(dst->min_mem_alignment,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1eeafc118c..c59b0aefc4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -686,6 +686,10 @@ typedef struct BlockLimits {
   * clamped down. */
  uint32_t max_transfer;
  
+/* Maximal transfer length for SCSI passthrough (ioctl interface) */

+uint32_t max_ioctl_transfer;
+
+


Is there a specific reason you added a double newline here?  (All other 
fields in this struct are separated by a single newline)


Max


  /* memory alignment, in bytes so that no bounce buffer is needed */
  size_t min_mem_alignment;
  
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h

index 8203d7f6f9..b019a37b7a 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -203,6 +203,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
  int blk_get_flags(BlockBackend *blk);
  uint32_t blk_get_request_alignment(BlockBackend *blk);
  uint32_t blk_get_max_transfer(BlockBackend *blk);
+uint32_t blk_get_max_ioctl_transfer(BlockBackend *blk);
  int blk_get_max_iov(BlockBackend *blk);
  void blk_set_guest_block_size(BlockBackend *blk, int align);
  void *blk_try_blockalign(BlockBackend *blk, size_t size);






Re: [PATCH 1/2] file-posix: allow -EBUSY errors during write zeros on raw block devices

2021-01-07 Thread Maxim Levitsky
On Mon, 2020-11-16 at 15:48 +0100, Kevin Wolf wrote:
> Am 11.11.2020 um 16:39 hat Maxim Levitsky geschrieben:
> > On Linux, fallocate(fd, FALLOC_FL_PUNCH_HOLE) when it is used on a block 
> > device,
> > without O_DIRECT can return -EBUSY if it races with another write to the 
> > same page.
> > 
> > Since this is rare and discard is not a critical operation, ignore this 
> > error
> > 
> > Signed-off-by: Maxim Levitsky 
> 
> I'm applying this one for 5.2, it certainly shouldn't hurt and makes
> things work at least, even if possibly not in the optimal way.
> 
> Patch 2 seems to be a bit less obvious and discussion is ongoing, so
> that's probably more 6.0 material.
> 
> Kevin

Any feedback on patch 2?

Best regards,
Maxim Levitsky




Re: [PATCH v3 2/5] file-posix: add sg_get_max_segments that actually works with sg

2021-01-07 Thread Max Reitz

On 17.12.20 17:56, Maxim Levitsky wrote:

From: Tom Yan 

sg devices have different major/minor than their corresponding
block devices. Using sysfs to get max segments never really worked
for them.

Fortunately the sg driver provides an ioctl to get sg_tablesize,
which is apparently equivalent to max segments.

Signed-off-by: Tom Yan 
Signed-off-by: Maxim Levitsky 
---
  block/file-posix.c | 22 +-
  1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index cbf1271773..2bf4d095a7 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1179,6 +1179,26 @@ static int sg_get_max_transfer_length(int fd)
  #endif
  }
  
+static int sg_get_max_segments(int fd)

+{
+/*
+ * /dev/sg* character devices report 'max_segments' via
+ * SG_GET_SG_TABLESIZE ioctl
+ */
+
+#ifdef SG_GET_SG_TABLESIZE
+long max_segments = 0;
+
+if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) {


As far as I can tell from googling, SG_GET_SG_TABLESIZE takes an int 
pointer.


Apart from that, looks good.

Max


+return max_segments;
+} else {
+return -errno;
+}
+#else
+return -ENOSYS;
+#endif
+}
+
  static int get_max_transfer_length(int fd)
  {
  #if defined(BLKSECTGET)
@@ -1265,7 +1285,7 @@ static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.max_transfer = pow2floor(ret);
  }
  
-ret = get_max_segments(s->fd);

+ret = bs->sg ? sg_get_max_segments(s->fd) : get_max_segments(s->fd);
  if (ret > 0) {
  bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
 ret * qemu_real_host_page_size);






Re: [PATCH v3 1/5] file-posix: split hdev_refresh_limits from raw_refresh_limits

2021-01-07 Thread Max Reitz

On 17.12.20 17:56, Maxim Levitsky wrote:

From: Tom Yan 

We can and should get max transfer length and max segments for all host
devices / cdroms (on Linux).

Also use MIN_NON_ZERO instead when we clamp max transfer length against
max segments.

Signed-off-by: Tom Yan 
Signed-off-by: Maxim Levitsky 
---
  block/file-posix.c | 57 +-
  1 file changed, 41 insertions(+), 16 deletions(-)


I’m aware that most of my remarks below apply to the pre-patch state 
just as well, but I feel like now is a good time to raise them, so, here 
goes:



diff --git a/block/file-posix.c b/block/file-posix.c
index 9804681d5c..cbf1271773 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1166,6 +1166,10 @@ static int sg_get_max_transfer_length(int fd)
  int max_bytes = 0;
  
if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {

+/*
+ * BLKSECTGET for /dev/sg* character devices incorrectly returns
+ * the max transfer size in bytes (rather than in blocks).
+ */
  return max_bytes;
  } else {
  return -errno;
@@ -1175,7 +1179,22 @@ static int sg_get_max_transfer_length(int fd)
  #endif
  }
  
-static int sg_get_max_segments(int fd)

+static int get_max_transfer_length(int fd)
+{
+#if defined(BLKSECTGET)
+int sect = 0;
+
+if (ioctl(fd, BLKSECTGET, &sect) == 0) {
+return sect << 9;


Can this overflow?

(I mean, technically it would still be safe, because either the limit is 
set too low or it isn’t set at all, which would be correct on overflow. 
 But still.)



+} else {
+return -errno;
+}
+#else
+return -ENOSYS;
+#endif
+}
+
+static int get_max_segments(int fd)
  {
  #ifdef CONFIG_LINUX
  char buf[32];


This function stores max_segments (a long) in ret (an int) and returns 
the latter.  Should we guard against overflows?



@@ -1230,23 +1249,29 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
  {
  BDRVRawState *s = bs->opaque;
  
-if (bs->sg) {

-int ret = sg_get_max_transfer_length(s->fd);
+raw_probe_alignment(bs, s->fd, errp);
+bs->bl.min_mem_alignment = s->buf_align;
+bs->bl.opt_mem_alignment = MAX(s->buf_align, qemu_real_host_page_size);
+}
+
+static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+BDRVRawState *s = bs->opaque;
+
+int ret = bs->sg ? sg_get_max_transfer_length(s->fd) :
+   get_max_transfer_length(s->fd);
  
-if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {

-bs->bl.max_transfer = pow2floor(ret);
-}
+if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
+bs->bl.max_transfer = pow2floor(ret);
+}
  
-ret = sg_get_max_segments(s->fd);

-if (ret > 0) {
-bs->bl.max_transfer = MIN(bs->bl.max_transfer,
-  ret * qemu_real_host_page_size);


(1) Can this overflow?  (Which I suppose could result in a 
non-power-of-two result)


(2) Even disregarding overflows, is ret * qemu_real_host_page_size 
guaranteed to be a power of two?


Max


-}
+ret = get_max_segments(s->fd);
+if (ret > 0) {
+bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+   ret * qemu_real_host_page_size);
  }
  
-raw_probe_alignment(bs, s->fd, errp);

-bs->bl.min_mem_alignment = s->buf_align;
-bs->bl.opt_mem_alignment = MAX(s->buf_align, qemu_real_host_page_size);
+raw_refresh_limits(bs, errp);
  }
  
  static int check_for_dasd(int fd)

@@ -3600,7 +3625,7 @@ static BlockDriver bdrv_host_device = {
  .bdrv_co_pdiscard   = hdev_co_pdiscard,
  .bdrv_co_copy_range_from = raw_co_copy_range_from,
  .bdrv_co_copy_range_to  = raw_co_copy_range_to,
-.bdrv_refresh_limits = raw_refresh_limits,
+.bdrv_refresh_limits = hdev_refresh_limits,
  .bdrv_io_plug = raw_aio_plug,
  .bdrv_io_unplug = raw_aio_unplug,
  .bdrv_attach_aio_context = raw_aio_attach_aio_context,
@@ -3724,7 +3749,7 @@ static BlockDriver bdrv_host_cdrom = {
  .bdrv_co_preadv = raw_co_preadv,
  .bdrv_co_pwritev= raw_co_pwritev,
  .bdrv_co_flush_to_disk  = raw_co_flush_to_disk,
-.bdrv_refresh_limits = raw_refresh_limits,
+.bdrv_refresh_limits = hdev_refresh_limits,
  .bdrv_io_plug = raw_aio_plug,
  .bdrv_io_unplug = raw_aio_unplug,
  .bdrv_attach_aio_context = raw_aio_attach_aio_context,






Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH

2021-01-07 Thread Richard W.M. Jones
On Thu, Jan 07, 2021 at 10:56:12AM +0000, Richard W.M. Jones wrote:
> On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
> > On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > Finally to be safe with calculations, to not calculate different
> > > maximums for different nodes (depending on cluster size and
> > > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> > > as absolute maximum bytes length for Qemu. Actually, it's not much less
> > > than INT64_MAX.
> > 
> > > +/*
> > > + * We want to allow aligning requests and disk length up to any 32-bit
> > > + * alignment without fear of overflow.
> > > + * To achieve this, and at the same time use a pretty number as the maximum
> > > + * disk size, let's define the maximum "length" (a limit for any offset/bytes
> > > + * request and for disk size) to be the greatest power of 2 less than
> > > + * INT64_MAX.
> > > + */
> > > +#define BDRV_MAX_ALIGNMENT (1L << 30)
> > > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> > 
> > This change broke nbdkit tests.
> 
> Actually that's not the only problem.  It appears that we're unable to
> read or write the last sector of this disk:
> 
> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" '
> read failed: Input/output error
> 
> $ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
> write failed: Input/output error
> 
> You can play around with the constants.  I found it's possible to read
> and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
> be a fencepost error somewhere in qemu?

Actually this is a pre-existing bug in qemu.

What happens is qemu-io calls qemu_strtosz("9223372035781033472")
which returns 0x7fffffffc0000000 and no error.  That answer is plain
flat out wrong.  The reason for that is qemu_strtosz uses floating
point for the calculation(!) so is limited to 53 bits of precision and
silently truncates.

It happened we were just lucky that our earlier test with
2^63 - 1024 worked.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org




Re: [PATCH] meson: Propagate gnutls dependency

2021-01-07 Thread Paolo Bonzini

On 05/01/21 15:37, Roman Bolshakov wrote:

Does it work if you do:

crypto_ss.add(authz, qom)
libcrypto = static_library('crypto', crypto_ss.sources() + genh,
dependencies: crypto_ss.dependencies(),
...)
crypto = declare_dependency(link_whole: libcrypto,
 dependencies: crypto_ss.dependencies())


Ok, so the final attempt is a mix of the three :)  Keep the link_whole 
dependencies in the declare_dependency, and add the sourceset 
dependencies there too.


diff --git a/meson.build b/meson.build
index e9bf290966..774df4db8e 100644
--- a/meson.build
+++ b/meson.build
@@ -1904,7 +1904,8 @@ libqom = static_library('qom', qom_ss.sources() + genh,

 dependencies: [qom_ss.dependencies()],
 name_suffix: 'fa')

-qom = declare_dependency(link_whole: libqom)
+qom = declare_dependency(link_whole: libqom,
+ dependencies: [qom_ss.dependencies()])

 authz_ss = authz_ss.apply(config_host, strict: false)
 libauthz = static_library('authz', authz_ss.sources() + genh,
@@ -1913,7 +1914,7 @@ libauthz = static_library('authz', authz_ss.sources() + genh,

   build_by_default: false)

 authz = declare_dependency(link_whole: libauthz,
-   dependencies: qom)
+  dependencies: [authz_ss.dependencies(), qom])

 crypto_ss = crypto_ss.apply(config_host, strict: false)
 libcrypto = static_library('crypto', crypto_ss.sources() + genh,
@@ -1922,7 +1923,7 @@ libcrypto = static_library('crypto', crypto_ss.sources() + genh,

build_by_default: false)

 crypto = declare_dependency(link_whole: libcrypto,
-dependencies: [authz, qom])
+dependencies: [crypto_ss.dependencies(), authz, qom])


 io_ss = io_ss.apply(config_host, strict: false)
 libio = static_library('io', io_ss.sources() + genh,
@@ -1931,13 +1932,14 @@ libio = static_library('io', io_ss.sources() + genh,
name_suffix: 'fa',
build_by_default: false)

-io = declare_dependency(link_whole: libio, dependencies: [crypto, qom])
+io = declare_dependency(link_whole: libio,
+dependencies: [io_ss.dependencies(), crypto, qom])

 libmigration = static_library('migration', sources: migration_files + genh,

   name_suffix: 'fa',
   build_by_default: false)
 migration = declare_dependency(link_with: libmigration,
-   dependencies: [zlib, qom, io])
+   dependencies: [qom, io])
 softmmu_ss.add(migration)

 block_ss = block_ss.apply(config_host, strict: false)
@@ -1949,7 +1951,7 @@ libblock = static_library('block', block_ss.sources() + genh,


 block = declare_dependency(link_whole: [libblock],
link_args: '@block.syms',
-   dependencies: [crypto, io])
+   dependencies: [block_ss.dependencies(), crypto, io])


 blockdev_ss = blockdev_ss.apply(config_host, strict: false)
 libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
@@ -1958,7 +1960,7 @@ libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,

  build_by_default: false)

 blockdev = declare_dependency(link_whole: [libblockdev],
-  dependencies: [block])
+  dependencies: [blockdev_ss.dependencies(), block])


 qmp_ss = qmp_ss.apply(config_host, strict: false)
 libqmp = static_library('qmp', qmp_ss.sources() + genh,
@@ -1966,7 +1968,8 @@ libqmp = static_library('qmp', qmp_ss.sources() + genh,

 name_suffix: 'fa',
 build_by_default: false)

-qmp = declare_dependency(link_whole: [libqmp])
+qmp = declare_dependency(link_whole: [libqmp],
+ dependencies: qmp_ss.dependencies())

 libchardev = static_library('chardev', chardev_ss.sources() + genh,
 name_suffix: 'fa',
diff --git a/migration/meson.build b/migration/meson.build
index 9645f44005..e1f237b5db 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -9,6 +9,7 @@ migration_files = files(
 )
 softmmu_ss.add(migration_files)

+softmmu_ss.add(zlib)
 softmmu_ss.add(files(
   'block-dirty-bitmap.c',
   'channel.c',




Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH

2021-01-07 Thread Richard W.M. Jones
On Thu, Jan 07, 2021 at 09:58:17AM +0000, Richard W.M. Jones wrote:
> On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > Finally to be safe with calculations, to not calculate different
> > maximums for different nodes (depending on cluster size and
> > request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> > as absolute maximum bytes length for Qemu. Actually, it's not much less
> > than INT64_MAX.
> 
> > +/*
> > + * We want to allow aligning requests and disk length up to any 32-bit
> > + * alignment without fear of overflow.
> > + * To achieve this, and at the same time use a pretty number as the maximum
> > + * disk size, let's define the maximum "length" (a limit for any offset/bytes
> > + * request and for disk size) to be the greatest power of 2 less than
> > + * INT64_MAX.
> > + */
> > +#define BDRV_MAX_ALIGNMENT (1L << 30)
> > +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))
> 
> This change broke nbdkit tests.

Actually that's not the only problem.  It appears that we're unable to
read or write the last sector of this disk:

$ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -r -f raw "$uri" -c "r -v $(( 2**63 - 2**30 - 512 )) 512" '
read failed: Input/output error

$ nbdkit memory $(( 2**63 - 2**30 )) --run 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
write failed: Input/output error

You can play around with the constants.  I found it's possible to read
and write the non-aligned 512 bytes starting at 2^63-2^30-513.  Could
be a fencepost error somewhere in qemu?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




Re: [PATCH v3 0/5] SCSI: fix transfer limits for SCSI passthrough

2021-01-07 Thread Maxim Levitsky
On Thu, 2020-12-17 at 18:56 +0200, Maxim Levitsky wrote:
> This patch series attempts to provide a solution to the problem of the 
> transfer
> limits of the raw file driver (host_device/file-posix), some of which I
> already tried to fix in the past.
> 
> I included 2 patches from Tom Yan which fix two issues with reading the limits
> correctly from the */dev/sg* character devices in the first place.
> 
> The only change to these patches is that I tweaked a bit the comments in the
> source to better document the /dev/sg quirks.
> 
> The other two patches in this series split the max transfer limits that qemu
> block devices expose in two:
> One limit is for the regular IO, and another is for the SG_IO (aka 
> bdrv_*_ioctl),
> and the two device drivers (scsi-block and scsi-generic) that use the later
> are switched to the new interface.
> 
> This should ensure that the raw driver can still advertise the unlimited
> transfer  length, unless it is used for SG_IO, because that yields the highest
> performance.
> 
> Also I include a somewhat unrelated fix to a bug I found in qemu's
> SCSI passthrough while testing this:
> When qemu emulates the VPD block limit page, for a SCSI device that doesn't
> implement it, it doesn't really advertise the emulated page to the guest.
> 
> I tested this by doing both regular and SG_IO passthrough of my
> USB SD card reader.
> 
> That device turned out to be a perfect device for the task, since it has max
> transfer size of 1024 blocks (512K), and it enforces it.
> 
> Also it didn't implement the VPD block limits page,
> (transfer size limit probably comes from something USB related) which 
> triggered
> the unrelated bug.
> 
> I was able to see IO errors without the patches, and the wrong max transfer
> size in the guest, and with patches both issues were gone.
> 
> I also found an unrelated issue in /dev/sg passthrough in the kernel.
> It turns out that the in-kernel driver has a limitation of 16 requests in
> flight, regardless of what the underlying device supports.
> 
> With a large multi-threaded fio job and a debug print in qemu, it is easy to
> see it, although the errors don't do much harm to the guest as it retries the
> IO, which eventually succeeds.
> It is an open question whether this should be solved.
> 
> V2: fixed an issue in a patch from Tom Yan (thanks), and removed
> refactoring from the last patch according to Paolo's request.
> 
> V3: a few cosmetic changes based on review feedback.
> 
> Maxim Levitsky (3):
>   block: add max_ioctl_transfer to BlockLimits
>   block: use blk_get_max_ioctl_transfer for SCSI passthrough
>   block/scsi: correctly emulate the VPD block limits page
> 
> Tom Yan (2):
>   file-posix: split hdev_refresh_limits from raw_refresh_limits
>   file-posix: add sg_get_max_segments that actually works with sg
> 
>  block/block-backend.c  | 12 ++
>  block/file-posix.c | 76 +++---
>  block/io.c |  2 +
>  block/iscsi.c  |  1 +
>  hw/scsi/scsi-generic.c | 13 --
>  include/block/block_int.h  |  4 ++
>  include/sysemu/block-backend.h |  1 +
>  7 files changed, 90 insertions(+), 19 deletions(-)
> 
> -- 
> 2.26.2
> 
> 
> 
Any update on this?
Best regards,
Maxim Levitsky




Re: [Bug 1910505] [NEW] atomic failure linking with --enable-sanitizers on 32-bit Linux hosts

2021-01-07 Thread Paolo Bonzini

On 07/01/21 10:10, Philippe Mathieu-Daudé wrote:

libblock.fa(block_io.c.o): In function `stat64_max':
include/qemu/stats64.h:58: undefined reference to `__atomic_load_8'
include/qemu/stats64.h:60: undefined reference to
`__atomic_compare_exchange_8'
libblock.fa(block_qapi.c.o): In function `stat64_get':
include/qemu/stats64.h:40: undefined reference to `__atomic_load_8'
libqemuutil.a(util_qsp.c.o): In function `qatomic_set_u64':
include/qemu/atomic.h:478: undefined reference to `__atomic_store_8'
libqemuutil.a(util_qsp.c.o): In function `qatomic_read_u64':
include/qemu/atomic.h:468: undefined reference to `__atomic_load_8'
clang: error: linker command failed with exit code 1 (use -v to see
invocation)

Issue previously reported on the list here:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg770128.html


Looks like 64-bit atomics are available without sanitizers, but fall 
back to libatomic with sanitizers (tsan probably?).


So CONFIG_ATOMIC64 is not detected properly in configure, because 
-fsanitize=* flags are added (at "end of CC checks" around line 5000) 
after the atomic64 test (around 400 lines before that).


I'm not sure what breaks if -fsanitize flags are added earlier.

Paolo




Re: [PATCH v6 0/3] qcow2: don't leave partially initialized file on image creation

2021-01-07 Thread Maxim Levitsky
On Thu, 2020-12-17 at 19:09 +0200, Maxim Levitsky wrote:
> Use the bdrv_co_delete_file interface to delete the underlying
> file if qcow2 initialization fails (e.g due to bad encryption secret)
> 
> This makes the qcow2 driver behave the same way as the luks driver behaves.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1845353
> 
> V3: addressed review feedback and reworked commit messages
> 
> V4: got rid of code duplication by adding bdrv_co_delete_file_noerr
> and made the qcow2 driver use this function to delete
> both the main and the data file.
> 
> V5: addresssed review feedback on reworked version.
> 
> V6: addressed most of the review feedback.
> 
> Best regards,
>   Maxim Levitsky
> 
> Maxim Levitsky (3):
>   crypto: luks: Fix tiny memory leak
>   block: add bdrv_co_delete_file_noerr
>   block: qcow2: remove the created file on initialization error
> 
>  block.c   | 22 ++
>  block/crypto.c| 13 ++---
>  block/qcow2.c |  8 +---
>  include/block/block.h |  1 +
>  4 files changed, 30 insertions(+), 14 deletions(-)
> 
> -- 
> 2.26.2
> 
> 
> 
Any update on this?

Best regards,
Maxim Levitsky




Re: [PATCH 4/4] block: introduce BDRV_MAX_LENGTH

2021-01-07 Thread Richard W.M. Jones
On Fri, Dec 04, 2020 at 01:27:13AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Finally to be safe with calculations, to not calculate different
> maximums for different nodes (depending on cluster size and
> request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
> as absolute maximum bytes length for Qemu. Actually, it's not much less
> than INT64_MAX.

> +/*
> + * We want to allow aligning requests and disk length up to any 32-bit
> + * alignment without fear of overflow.
> + * To achieve this, and at the same time use a pretty number as the maximum
> + * disk size, let's define the maximum "length" (a limit for any offset/bytes
> + * request and for disk size) to be the greatest power of 2 less than
> + * INT64_MAX.
> + */
> +#define BDRV_MAX_ALIGNMENT (1L << 30)
> +#define BDRV_MAX_LENGTH (QEMU_ALIGN_DOWN(INT64_MAX, BDRV_MAX_ALIGNMENT))

This change broke nbdkit tests.

We test that qemu can handle a qemu NBD export of size 2^63 - 512, the
largest size that (experimentally) we found qemu could safely handle.
eg:

  
https://github.com/libguestfs/nbdkit/blob/master/tests/test-memory-largest-for-qemu.sh

Before this commit:

  $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
  image: nbd://localhost:10809
  file format: raw
  virtual size: 8 EiB (9223372036854775296 bytes)
  disk size: unavailable

After this commit:

  $ nbdkit memory $(( 2**63 - 512 )) --run './qemu-img info "$uri"'
  qemu-img: Could not open 'nbd://localhost:10809': Could not refresh total sector count: File too large

Can I confirm that this limit is now the new official one and we
should adjust nbdkit tests?  Or was this change unintentional given
that qemu seemed happy to handle 2^63 - 512 disks before?

Note that nbdkit & libnbd support up to 2^63 - 1 bytes (we are not
limited to whole sectors).  Also the Linux kernel will let you create
a /dev/nbdX device of size 2^63 - 1.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org




Re: [Bug 1910505] [NEW] atomic failure linking with --enable-sanitizers on 32-bit Linux hosts

2021-01-07 Thread Philippe Mathieu-Daudé
Cc'ing atomic team and qemu-block@ for "qemu/stats64.h".

On 1/7/21 9:40 AM, Philippe Mathieu-Daudé wrote:
> Public bug reported:
> 
> As of commit 50536341b47, using --enable-sanitizers on 32-bit Linux host:
> - displays various warnings
> - fails linking
> 
> Using Ubuntu 18.04 (release 20201211.1) and Clang10 on i386:
> 
> [139/675] Compiling C object softmmu.fa.p/softmmu_icount.c.o
> In file included from ../softmmu/icount.c:31:
> In file included from include/exec/exec-all.h:23:
> In file included from ../target/mips/cpu.h:4:
> In file included from ../target/mips/cpu-qom.h:23:
> In file included from include/hw/core/cpu.h:23:
> In file included from include/hw/qdev-core.h:5:
> In file included from include/qemu/bitmap.h:16:
> In file included from include/qemu/bitops.h:17:
> include/qemu/atomic.h:463:12: warning: misaligned atomic operation may
> incur significant performance penalty [-Watomic-alignment]
> return qatomic_read__nocheck(ptr);
>^
> include/qemu/atomic.h:129:5: note: expanded from macro
> 'qatomic_read__nocheck'
> __atomic_load_n(ptr, __ATOMIC_RELAXED)
> ^
> include/qemu/atomic.h:473:5: warning: misaligned atomic operation may
> incur significant performance penalty [-Watomic-alignment]
> qatomic_set__nocheck(ptr, val);
> ^
> include/qemu/atomic.h:138:5: note: expanded from macro
> 'qatomic_set__nocheck'
> __atomic_store_n(ptr, i, __ATOMIC_RELAXED)
> ^
> 2 warnings generated.
> [...]
> 
> [850/2216] Linking target tests/test-hbitmap
> FAILED: tests/test-hbitmap
> clang  -o tests/test-hbitmap tests/test-hbitmap.p/test-hbitmap.c.o
> tests/test-hbitmap.p/iothread.c.o -Wl,--as-needed -Wl,--no-undefined
> -pie -Wl,--whole-archive libblock.fa libcrypto.fa libauthz.fa libqom.fa
> libio.fa -Wl,--no-whole-archive -Wl,--warn-common -fsanitize=undefined
> -fsanitize=address -Wl,-z,relro -Wl,-z,now -m32 -ggdb
> -fstack-protector-strong -Wl,--start-group libqemuutil.a
> subprojects/libvhost-user/libvhost-user-glib.a
> subprojects/libvhost-user/libvhost-user.a libblock.fa libcrypto.fa
> libauthz.fa libqom.fa libio.fa @block.syms -lgio-2.0 -lgobject-2.0
> -lglib-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -pthread -lutil -lgnutls
> -lm -lgthread-2.0 -lglib-2.0 /usr/lib/i386-linux-gnu/libglib-2.0.so
> -liscsi -lgthread-2.0 -lglib-2.0 -laio -lcurl
> /usr/lib/i386-linux-gnu/libz.so -lrbd -lrados -lnettle -lgnutls
> -Wl,--end-group
> libblock.fa(block_io.c.o): In function `stat64_max':
> include/qemu/stats64.h:58: undefined reference to `__atomic_load_8'
> include/qemu/stats64.h:60: undefined reference to
> `__atomic_compare_exchange_8'
> libblock.fa(block_qapi.c.o): In function `stat64_get':
> include/qemu/stats64.h:40: undefined reference to `__atomic_load_8'
> libqemuutil.a(util_qsp.c.o): In function `qatomic_set_u64':
> include/qemu/atomic.h:478: undefined reference to `__atomic_store_8'
> libqemuutil.a(util_qsp.c.o): In function `qatomic_read_u64':
> include/qemu/atomic.h:468: undefined reference to `__atomic_load_8'
> clang: error: linker command failed with exit code 1 (use -v to see
> invocation)
> 
> Issue previously reported on the list here:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg770128.html
> 
> ** Affects: qemu
>  Importance: Undecided
>  Status: New
> 




Re: To start multiple KVM guests from one qcow2 image with transient disk option

2021-01-07 Thread Peter Krempa
On Tue, Jan 05, 2021 at 15:12:55 +0100, Peter Krempa wrote:
> On Mon, Jan 04, 2021 at 15:30:19 -0500, Masayoshi Mizuma wrote:
> > On Sat, Dec 19, 2020 at 11:30:39PM -0500, Masayoshi Mizuma wrote:
 
 [...]
 
 >   {"execute":"cont"}
> 
> So that is a no-go. Some disk bus-es such as IDE don't support hotplug:
> 
> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_hotplug.c#L1074
> 
> You could try to just instantiate the backend of the disk as read-only,
> and then create a writable overlay. You just need to make sure that the
> disk will be writable and that it works even for IDE/SATA which doesn't
> support read-only:
> 
> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_validate.c#L2634

Okay, I've actually tried implementing my suggestion and it doesn't
work. With scsi disks, qemu crashes on an assertion failure when I try
writing to the disk after setting it up as suggested, and a virtio disk
returns an I/O error because it remembers that it was read-only when
starting.

I'm considering whether it's worth implementing this via hotplug then.

Arguably simple deployments don't need it and complex ones have some
kind of orchestration which can do it externally.

This specifically comes with a caveat of the above, as currently the
overlay used to discard writes is created under the same path as the
image and can't be controlled, which might be a problem for complex
deployments.

Also, the above algorithm with a constant suffix added to the image name
prevents using it with multiple VMs anyway, since the overlay file
names will collide (they are generated based on the same rules).

Didn't you run into the collision?