from:"\"Peter Krempa\""

Re: [Qemu-devel] [PATCH v4 3/9] cli: add -preconfig option

2018-04-03 Thread Peter Krempa

On Thu, Mar 29, 2018 at 13:57:54 -0300, Eduardo Habkost wrote:
> On Thu, Mar 29, 2018 at 03:01:12PM +0200, Igor Mammedov wrote:
> > On Wed, 28 Mar 2018 16:17:32 -0300
> > Eduardo Habkost  wrote:
> > > On Tue, Mar 27, 2018 at 05:05:41PM +0200, Igor Mammedov wrote:
> > > > On Fri, 23 Mar 2018 18:25:08 -0300
> > > > Eduardo Habkost  wrote:
> > > > > On Mon, Mar 12, 2018 at 02:11:09PM +0100, Igor Mammedov wrote:  

[...]

> > > > > Why exactly it's not possible to use -incoming with -preconfig?  
> > > > there are 2 reasons why I made options mutually exclusive
> > > > 1. (excuse ) '-incoming' is an option with non explicit side effects
> > > >on other parts of code. It's hard to predict behavior
> > > >of preconfig commands in combination with inmigrate.
> > > >I wouldn't try to touch/change anything related to it
> > > >in this series.
> > > >If we need to change how option is handled, it should
> > > >be separate series that focuses on it.
> > > > 2. (main reason) is to expose as minimal interface
> > > >as possible. It's easier to extend/modify it future if
> > > >necessary than cut it down after it was introduced.
> > > > 
> > > >Not counting [1], I don't see a reason to permit
> > > >'preconfig' while migration is in progress.
> > > >Configuration commands that where used during 'preconfig'
> > > >stage on source side, should use corresponding CLI options
> > > >on target side. (it's the same behavior as with hotplugged
> > > >devices, keeping migration work-flow the same)
> > > > 
> > > > In short I'd prefer to keep restriction until there will be
> > > > a real usecase for combo to work together.  
> > > 
> > > I understand the reasons, but I think we already have an
> > > important use case: live-migrating a VM with non-trivial NUMA
> > > config (that needs -preconfig).  Don't we?
> > Not really,
> > whatever we have configured on source side using -preconfig
> > (discovering valid topology in process), we should be able
> > to replicate using only CLI options on target since we
> > already have all necessary values for it from source (it's
> > certainly the case with this series set-numa-node command).
> > 
> > As for the future, I agree it would be much more flexible
> > to allow both -preconfig and -incoming at the same time,
> > so we could start target with empty CLI, and then feed it
> > options from source. It would require audit/refactoring of
> > INMIGRATE state and making 'all' current CLI options
> > available via QMP interface.
> > 
> > But for now I'd prefer to keep using old way to start target.
> 
> Well, if management software developers tell us that -preconfig
> will be already useful without -incoming support, I won't object.


Hmm, that depends on what we will be configured using the new interface.
We usually prefer to use the same approach to set up things when
starting a new VM and when starting a VM for migration.

Since we do have options to setup the vCPU topology stuff on the
command line once we are going to migrate, we certainly can use
-preconfig even when it will collide with -incoming

Ideally -preconfig should replace -incoming, so that we can swithc to
incomming migration operation after we configure everything.

> But it would be very nice for management software if they can
> simply assume that -preconfig and -incoming will work together
> since the first version.  Can we have this as a goal for 2.13?

It usually helps in reducing code clutter.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] [PATCH for-2.12] gluster: Fix blockdev-add with server.N.type=unix

2018-04-04 Thread Peter Krempa

On Tue, Apr 03, 2018 at 08:09:49 -0500, Eric Blake wrote:
> On 04/03/2018 06:08 AM, Kevin Wolf wrote:
> > The legacy command line interface gets the socket path from an option
> > called 'socket'. QAPI in contract uses SocketAddress, where the
> > corresponding option is called 'path'.
> > 
> > Fix the gluster block driver to accept both 'socket' and 'path', with
> > 'path' being the preferred syntax.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1545155
> > 
> > Cc: qemu-sta...@nongnu.org
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block/gluster.c | 21 +
> >  1 file changed, 17 insertions(+), 4 deletions(-)

Thanks for fixing this. I'm using the new syntax in the -blockdev code
in libvirt and since I'm qapi- schema-checking the results, it would
not be possible to use the legacy approach.

> 
> Reviewed-by: Eric Blake 
> 
> Should we also add a deprecation warning for 'socket' and update the
> deprecation documentation, so we can start the clock ticking on getting
> rid of maintaining the back-compat glue forever?

Well, that won't be as easy. Since there is at least one qemu release
which declared this in the QAPI schema but did not support using it it's
hard for libvirt to detect that this was fixed, and thus we can't infer
a capability which would be used to switch to the new-syntax only.

signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC 4/6] CLI: add -paused option

2017-10-17 Thread Peter Krempa

On Tue, Oct 17, 2017 at 12:56:28 +0200, Laszlo Ersek wrote:
> On 10/17/17 10:17, Igor Mammedov wrote:
> > On Mon, 16 Oct 2017 17:35:15 +0100
> > "Daniel P. Berrange"  wrote:
> > 
> >> On Mon, Oct 16, 2017 at 06:22:54PM +0200, Igor Mammedov wrote:
> >>
> >> This really needs to have a commit message that provides justification
> >> for why this option is needed when we already have -S that is used
> >> to allow configuration before the guest starts.
> > Sorry, I've should have added here what I've tried to describe in cover 
> > letter.
> > 
> > -S pauses machine too late as machine is already created by the time
> > it's paused so trying to reconfigure it might require machine to be 
> > recreated.
> > In case of NUMA options it might be possible to hack x86 target to
> > rebuild/override acpi/fw_cfg so it would reflect the new settings set
> > this late but I wouldn't expect that it would work in general.
> > 
> > The cleanest way to configure it is pausing and configuring numa mapping
> > before machine is build.
> 
> Asking from the sideline: if the NUMA mapping has to be configured so
> early, why can't it be done on the QEMU command line?
> 
> (I asked myself the same question when I first saw your patches -- I
> couldn't find an explanation in the blurb --, so I assumed it was
> obvious and/or others would ask the same question.)

Because libvirt needs to be able to query qemu before setting stuff up.
As we already established, it's not okay to run a throwaway qemu process
to do so, so we are getting into the chicken/egg problem zone.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] block: don't add 'driver' to options when refering to backing via node name

2017-10-17 Thread Peter Krempa

On Tue, Oct 17, 2017 at 16:41:00 +0200, Kevin Wolf wrote:
> Am 12.10.2017 um 16:14 hat Peter Krempa geschrieben:
> > When refering to a backing file of an image via node name
> > bdrv_open_backing_file would add the 'driver' option to the option list
> > filling it with the backing format driver. This breaks construction of
> > the backing chain via -blockdev, as bdrv_open_inherit reports an error
> > if both 'reference' and 'options' are provided.
> > 
> > Signed-off-by: Peter Krempa 
> 
> If you don't mind, I'd add a specific example to the commit message:

I certainly don't mind. I was frustrated after some attempts to find
bugs in my code using it before looking into qemu so I forgot to add
the info. (Also it would take me some time to figure out that I have the
backing format specified in the overlay image).

> 
> $ qemu-img create -f raw /tmp/backing.raw 64M
> $ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2
> $ qemu-system-x86_64 \
>   -blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \
>   -blockdev 
> driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing
> qemu-system-x86_64: -blockdev 
> driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing:
>  Could not open backing file: Cannot reference an existing block device with 
> additional options or a new filename
> 
> > diff --git a/block.c b/block.c
> > index 46eb1728da..684cb018da 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2245,7 +2245,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, 
> > QDict *parent_options,
> >  goto free_exit;
> >  }
> > 
> > -if (bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) 
> > {
> > +if (!reference &&
> > +bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) 
> > {
> >  qdict_put_str(options, "driver", bs->backing_format);
> >  }
> 
> Looks good to me.

Thanks.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] Raw notes from a small block layer/QAPI/something pre-christmas meeting

2018-01-08 Thread Peter Krempa

On Wed, Dec 20, 2017 at 19:15:55 +0100, Markus Armbruster wrote:
> "Daniel P. Berrange"  writes:
> 
> > On Fri, Dec 15, 2017 at 05:38:00PM +0100, Max Reitz wrote:
> >
> >> Image creation in qemu-system-* vs. qemu-img:
> >>   In order to get proper introspection for qemu-img create, we need a
> >>   QAPI schema.  If we have a QAPI schema, we might as well add
> >>   blockdev-create to QMP.
> >>   As long as we do not have a really-none (null, void, ...) machine type
> >>   for qemu-system-*, launching such a process just for creating an image
> >>   will bring quite a bit of overhead (e.g. with -M none -accel qtest).
> >>   However, as for libvirt, this is not exactly a regression since
> >>   libvirt currently cannot create images at all (apart from implicitly
> >>   through drive-mirror etc.).  Further work on voidifying qemu-system-*
> >>   will improve performance.
> >
> > In terms of the I/O operations involved, image creation is a already a
> > pretty slow process, particularly if pre-allocation is used which is
> > common.  So even QEMU's current slow (circa 300ms) startup time is a
> > complete non-issue for image creation IMHO - it'll be dwarfed by the
> > time to actually create the image. 
> >
> >>   On the other side, we can also add QAPI introspection to qemu-img.
> >>   (qemu-img already links to QAPI, so this should not be too hard.)
> >>   qemu-img will also need command-line introspection, though.
> >
> > I figure the qapi-ificiation is the hard & time consuming bit of
> > work. Once that's done exporting it via both qemu-img & qemu-system*
> > is quite straighforward.
> 
> qemu-system-*: trivial.
> qemu-img, via command line: straightforward
> qemu-img, via QMP: more difficult, since QMP is entangled with HMP,
> character devices, ...
> 
> If libvirt really wants to use QMP for the job, *and* doesn't want to
> use the qemu-system-* that's running a guest, the easy solution is
> running another qemu-system-* without a guest.

QMP is necessary to have for very-long operations (blockdev-mirror), but
for image creation the command line will be enough. Provided that
interface for -blockdev and "create image" will be similar enough.

Especially the fact that qemu-img create does not really like 'json:{}'
thus some image options are impossible to pass (multiple hosts for
gluster, ...).


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] Raw notes from a small block layer/QAPI/something pre-christmas meeting

2017-12-21 Thread Peter Krempa

On Wed, Dec 20, 2017 at 13:40:18 +, Daniel Berrange wrote:
> On Wed, Dec 20, 2017 at 02:33:54PM +0100, Kevin Wolf wrote:
> > Am 20.12.2017 um 11:57 hat Daniel P. Berrange geschrieben:
> > > On Wed, Dec 20, 2017 at 11:44:36AM +0100, Kashyap Chamarthy wrote:
> > > > On Mon, Dec 18, 2017 at 11:11:00AM +0100, Markus Armbruster wrote:
> > > > > Max Reitz  writes:
> > > > 
> > > > [...]
> > > > 
> > > > Thanks, Max, for the detailed notes.
> > > > 
> > > > > > Image creation in qemu-system-* vs. qemu-img:
> > > > > >   In order to get proper introspection for qemu-img create, we need 
> > > > > > a
> > > > > >   QAPI schema.  If we have a QAPI schema, we might as well add
> > > > > >   blockdev-create to QMP.
> > > > > >   As long as we do not have a really-none (null, void, ...) machine 
> > > > > > type
> > > > > >   for qemu-system-*, launching such a process just for creating an 
> > > > > > image
> > > > > >   will bring quite a bit of overhead (e.g. with -M none -accel 
> > > > > > qtest).
> > > > > >   However, as for libvirt, this is not exactly a regression since
> > > > > >   libvirt currently cannot create images at all (apart from 
> > > > > > implicitly
> > > > > >   through drive-mirror etc.).  Further work on voidifying 
> > > > > > qemu-system-*
> > > > > >   will improve performance.
> > > > > 
> > > > > Another thought: do we want to give qemu-system-* the necessary
> > > > > privileges for creating images?  Two cases: running with and without a
> > > > > guest.
> > > > 
> > > > Related: Just curious -- was it an explicit design decision to not give
> > > > `qemu-system-*` permissions to create disk images?
> > > 
> > > Our security model considers QEMU broadly untrustworthy, and so any 
> > > resources
> > > it needs to use must either be passed in by libvirt, or have permissions
> > > explicitly assigned to permit usage by QEMU. QEMU is allowed to create tmp
> > > files, and create RAM files for memory backing, but in general we don't 
> > > want
> > > to have QEMU able to create arbitrary files, only open things that are
> > > already created.
> > 
> > Which kind of suggests that libvirt should use an external qemu-img
> > process rather than a QMP command to create images?
> 
> That is my gut feeling, though Peter K might have other opinions as the
> person doing most libvirt work in this area.

Well then we need a way to do capability probing for qemu-img. And also
libvirt will then need a way to override the qemu-img binary on a per-VM
basis as we do with qemu binaries, so that users can switch to a more
capable binary.

If that will not be possible, block jobs for a VM might be impossible if
an old qemu-img is installed system-wide.

Also I really doubt it will save any qemu work when forcing us to do it
via qemu-img, since qemu-img still needs to be improved to support e.g.
creating an image on a multi-host gluster server [1], which currently is
not possible (basically we need the same capabilities as when opening an
image).


Peter


[1] You could argue, that if you select the properly functioning volfile
server from the list of servers configured by the user only one is
necessary for creating the image. Libvirt has no way of nowing which one
works though, so let's just have gluster figure it out by themselves.


signature.asc
Description: PGP signature

Re: [PATCH 0/3] Tighten qemu-img rules on missing backing format

2020-02-24 Thread Peter Krempa

On Sat, Feb 22, 2020 at 05:23:38 -0600, Eric Blake wrote:
> In the past, we have had CVEs caused by qemu probing one image type
> when an image started out as another but the guest was able to modify
> content.  The solution to those CVEs was to encode backing format
> information into qcow2, to ensure that once we make a decision, we
> don't have to probe any further.  However, we failed to enforce this
> at the time.  And now that libvirt is switching to -blockdev, it has
> come back to bite us: with -block, libvirt had no easy way (other than

s/-block/-drive/ ?

> json:{} pseudoprotocol) to force a backing file, but with -blockdev,

"json:{}" is basically -blockdev with extra steps. Old -drive usage
didn't have any way to do that apart from rewriting the image. Which is
basically the same since json:{} also needs to be recorded in the image
to take effect.

> libvirt HAS to use blockdev-open on the backing chain and supply a
> backing format there, and thus has to probe images.  If libvirt ever
> probes differently than qemu, we are back to the potential
> guest-visible data corruption or potential host CVEs.

As I've elaborated in [1] I disagree with the host CVE part. The
insecure part is not probing the format itself, but probing format AND
using the backing file of the image if we probed format.

I agree that mis-probing format leads to data corruption though.

> It's time to deprecate images without backing formats.  This patch
> series does two things: 1. record an implicit backing format where one
> is learned (although sadly, not all qemu-img commands are able to
> learn a format), 2. warn to the user any time a probe had ambiguous
> results or a backing format is omitted from an image.  All previous
> images without a backing format are still usable, but hopefully the
> warnings (along with libvirt's complaints about images without a
> backing format) help us pinpoint remaining applications that are

It is not a warning in libvirt though. We just refuse it now because we
don't do probing. Previously we allowed qemu to probe the format and the
only thing that prevented host CVEs was if the host used selinux or any
other security approach which would prevent opening the backing file.

> creating images on their own without recording a backing format.

Re: [PATCH 3/3] qemu-img: Deprecate use of -b without -F

2020-02-24 Thread Peter Krempa

On Sat, Feb 22, 2020 at 05:23:41 -0600, Eric Blake wrote:
> Creating an image that requires format probing of the backing image is
> inherently unsafe (we've had several CVEs over the years based on
> probes leaking information to the guest on a subsequent boot).  If our
> probing algorithm ever changes, or if other tools like libvirt
> determine a different probe result than we do, then subsequent use of
> that backing file under a different format will present corrupted data
> to the guest.  Start a deprecation clock so that future qemu-img can
> refuse to create unsafe backing chains that would rely on probing.
> 
> However, there is one time where probing is safe: when we first create
> an image, no guest has yet used the new image, so as long as we record
> what we probed, all future uses of the image will see the same data -

I disagree. If you are creating an overlay on top of an existing image
it's not safe to probe the format any more generally. (obviously you'd
have to trust the image and express the trust somehow)

The image may have been used in a VM as raw and that means that the VM
might have recorded a valid qcow2 header into it. Creating the overlay
with probing would legitimize this.

Let's assume we have a malicious image written by the guest but we
simulate it by:

$ qemu-img  create -f qcow2 -F raw -b /etc/passwd /tmp/malicious
Formatting '/tmp/malicious', fmt=qcow2 size=2560 backing_file=/etc/passwd 
backing_fmt=raw cluster_size=65536 lazy_refcounts=off refcount_bits=16


Now we want to create an overlay.

a) without this patchset:

$ qemu-img create -f qcow2 -b /tmp/malicious /tmp/pre-patch.qcow2
Formatting '/tmp/pre-patch.qcow2', fmt=qcow2 size=2560 
backing_file=/tmp/malicious cluster_size=65536 lazy_refcounts=off 
refcount_bits=16
$ qemu-img info /tmp/pre-patch.qcow2
image: /tmp/pre-patch.qcow2
file format: qcow2
virtual size: 2.5 KiB (2560 bytes)
disk size: 196 KiB
cluster_size: 65536
backing file: /tmp/malicious
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false

There's no 'backing file format'. When used by libvirt we'd not allow
the VM to touch the backing file of /tmp/malicious in pre-blockdev era
and in libvirt-6.0 we'd report an error right away.

b) Now with this patchset:

$ ./qemu-img create -f qcow2 -b /tmp/malicious /tmp/post-patch.qcow2
qemu-img: warning: Deprecated use of non-raw backing file without explicit 
backing format, using detected format of qcow2
Formatting '/tmp/post-patch.qcow2', fmt=qcow2 size=2560 
backing_file=/tmp/malicious backing_fmt=qcow2 cluster_size=65536 
lazy_refcounts=off refcount_bits=16
$ qemu-img info /tmp/post-patch.qcow2
image: /tmp/post-patch.qcow2
file format: qcow2
virtual size: 2.5 KiB (2560 bytes)
disk size: 196 KiB
cluster_size: 65536
backing file: /tmp/malicious
backing file format: qcow2
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false

You now get a warning, but "backing file format" is now recorded in the
overlay. Now this is WAY worse than it was before. The overlay now
legitimizes the format recorded by the malicious guest which circumvents
libvirt's protections. The warning is very easy to miss, and if you run
it in scripts you might never get to see it. We can't allow that.


> so the code now records the probe results as if the user had passed
> -F.  When this happens, it is unconditionally safe to record a probe
> of 'raw', but any other probe is still worth warning the user in case

While it's safe I don't think it should be encouraged. IMO -F should be
made mandatory with -b.

> our probe differed from their expectations.  Similarly, if the backing
> file name uses the json: psuedo-protocol, the backing name includes
> the format.

Not necessarily. The backing store string can be e.g.:

$ ./qemu-img create -f qcow1 -b 
'json:{"driver":"file","filename":"/tmp/malicious"}' /tmp/json.qcow2
Formatting '/tmp/json.qcow1', fmt=qcow2 size=197120 
backing_file=json:{"driver":"file",,"filename":"/tmp/malicious"} 
cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ qemu-img info /tmp/json.qcow1
image: /tmp/json.qcow1
file format: qcow1
virtual size: 191 KiB (197120 bytes)
disk size: 195 KiB
cluster_size: 65535
backing file: json:{"driver":"file","filename":"/tmp/malicious"}
Format specific information:
compat: 0.1
lazy refcounts: false
refcount bits: 15
corrupt: false

Now this has the old semantics but we didn't even get the warning. But
at least the backing file format is not written into the overlay.


> iotest 114 specifically wants to create an unsafe image for later
> amendment rather than defaulting to our new default of recording a
> probed format, so it needs an update.
> 
> Signed-off-by: Eric Blake 
> ---
>  block.c| 17 -
>  qemu-deprecated.texi   | 12 
>  qemu-img.c |  8 +++-
>  tests/qemu-iotests/114 |

Re: [PATCH 0/3] Tighten qemu-img rules on missing backing format

2020-02-24 Thread Peter Krempa

On Mon, Feb 24, 2020 at 12:01:45 +0100, Peter Krempa wrote:
> On Sat, Feb 22, 2020 at 05:23:38 -0600, Eric Blake wrote:

[...]

> > libvirt HAS to use blockdev-open on the backing chain and supply a
> > backing format there, and thus has to probe images.  If libvirt ever
> > probes differently than qemu, we are back to the potential
> > guest-visible data corruption or potential host CVEs.
> 
> As I've elaborated in [1] I disagree with the host CVE part. The

[1] https://www.redhat.com/archives/libvir-list/2020-February/msg00624.html

> insecure part is not probing the format itself, but probing format AND
> using the backing file of the image if we probed format.

Re: [libvirt] Offline manipulation of Dirty Bitmaps by qemu-img

2019-12-06 Thread Peter Krempa

On Thu, Dec 05, 2019 at 17:37:11 -0500, John Snow wrote:
> This has come up in the past, and I believe we discussed this at KVM
> Forum, too:
> 
> There have been requests from oVirt (via Nir Soffer) to add some offline
> bitmap manipulation functionality. In the past, our stance has generally
> been "Use QEMU without an accelerator, and use QMP to manipulate the
> images."

This is a thing I wanted to do for a long time but never had time for.
I'm not sure whether that will change though.

We have a workaround for this tough: you can start the VM with CPUs
stopped:

virsh start --pause $VMNAME 

(That translates to VIR_DOMAIN_START_PAUSED flag for
virDomainCreateWithFlags).

You can then use any libvirt API which requires a running VM including
blockjobs checkpoints etc.

The VM then can be destroyed. Since the CPUs didn't run the guest
visible image content was nott modified.

Alternatively to make this slightly more official we could introduce a
new flag for the VM starting API which will actually start the VM in the
no-machine mode, will interlock certain operations such as resuming of
the execution or migration perhaps  and the official purpose will be to
allow complex blockjobs without starting the actual VM.

Since starting an actual VM will be impossible anyways until such a VM
is gone it might be a sane thing to do here.

> We like this for a few reasons:
> 
> 1. It keeps bitmap management code tightly centralized
> 2. It allows for the full suite of bitmap manipulations in either
> offline or online mode with one tool
> 3. We wouldn't have to write new code.
> 4. Or design new CLIs and duplicate our existing work.
> 5. Or write even more tests.

In libvirt we'd like to use it because qemu-img has no reasonable
progress reporting and we could reuse the code we have for interacting
with the jobs when the VM is running.

> However, tools like oVirt may or may not be fully equipped to launch
> QEMU in this context, and there is always a desire for qemu-img to be
> able to "do more", so existing management suites could extend
> functionality more easily.
>
> (Or so I am imagining.)
> 
> I am still leaning heavily against adding any more CLI commands or
> options to qemu-img right now. Even if we do add some of the fundamental
> ones like "add" or "remove", it seems only a matter of time before we
> have to add "clear", "merge", etc. Is this just a race to code duplication?
> 
> On the other hand, one of the other suggestions is to have qemu-img
> check --repair optionally delete corrupted bitmaps. I kind of like this
> idea: it's a buyer beware operation that might make management layers
> unhappy, but then again ... repair is always something that could make
> things worse.

Well, dealing with corrupted bitmaps will be possible. I plan to expose
in the checkpoint XML whether a ckeckpoint is invalid (if it contains at
least one corrupted bitmap) and the user will have the option to delete
all previous checkpoints including the corrupted one to clear any
problem.

Note that deleting only the corrupted checkpoint will not be possible
until it's the oldest one as we attempt to merge them into the previous
ones. We could alternatively add a flag to skip merging of the invalid
checkpoint.

> Plus, if you manage to corrupt bitmaps badly enough that they can't even
> be parsed, you might need a heavyweight repair operation.
> 
> Nir, do you think that'd be sufficient for your needs for now, or would
> you still like to see more granular offline management?
> 
> --js
> 
> --
> libvir-list mailing list
> libvir-l...@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list

Re: Insufficiently documented deprecated command arguments

2019-12-11 Thread Peter Krempa

On Wed, Dec 11, 2019 at 09:12:41 +0100, Markus Armbruster wrote:
> I went through the QAPI schema looking for deprecated stuff not
> mentioned in qemu-deprecated.texi.  Here's what I found:
> 
> Commit b33945cfff "block: Accept device model name for
> blockdev-open/close-tray" (v2.8.0) deprecated blockdev-open-tray,
> blockdev-close-tray argument @device.

Libvirt uses these commands only in 'blockdev' mode. We use the qom
name/path [1] as the value for @id field. We never used @device.

> Commit 70e2cb3bd7 "block: Accept device model name for
> blockdev-change-medium" (v2.8.0) deprecated blockdev-change-medium
> argument @device.

Same as above.

> Commit 7a9877a026 "block: Accept device model name for
> block_set_io_throttle" (v2.8.0) deprecated block_set_io_throttle
> argument @device.

This one is more complex. The command is used both in 'blockdev' and in
'drive' mode:

In 'drive' mode we pass the alias of the 'drive' as the @device
argument.

In 'blockdev' mode we pass the qom name as @id

> Commit c01c214b69 "block: remove all encryption handling APIs"
> (v2.10.0) deprecated query-named-block-nodes result field
> encryption_key_missing and query-block result field
> inserted.encryption_key_missing.

We don't extract or use the 'encryption_key_missing' at all.

> Commit c42e8742f5 "block: Use JSON null instead of "" to disable
> backing file" (v2.10.0) deprecated blockdev-add empty string
> argument @backing.

This is used in 'blockdev' mode only and we always pass the JSON null or
a node name string.

> These were missed in commit eb22aeca65 "docs: document deprecation
> policy & deprecated features in appendix" (v2.10.0).
> 
> Commit 3c605f4074 "commit: Add top-node/base-node options" (v3.1.0)
> deprecated block-commit arguments @base and @top.

This command also has two modes:

In 'drive' mode we pass in path strings as @base and @ top.

In 'blockdev' mode we pass in nodenames as @base-node and @top-node.

Starting from qemu-4.2 libvirt uses 'blockdev' mode for VMs unless an SD
card is configured as we didn't convert to the '-device' approach for
those as AFAIK not everything is possible to be converted.

> I can update qemu-deprecated.texi for these.
> 
> Now my question: I wonder whether we want to remove any of them right
> away.

Feel free to delete any unused ones. I'm afraid that -drive mode will
need to be supported for a while until we can convert the sd-cards too.

[1]: The qom name used by libvirt is generated is generated here:

https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_alias.c;h=93bdcb7548757de547b2ccb01d0a2af16d5a7cc6;hb=HEAD#l224

Re: Insufficiently documented deprecated command arguments

2019-12-11 Thread Peter Krempa

On Wed, Dec 11, 2019 at 12:32:10 +, Daniel Berrange wrote:
> On Wed, Dec 11, 2019 at 01:24:17PM +0100, Kevin Wolf wrote:
> > Am 11.12.2019 um 11:51 hat Peter Krempa geschrieben:
> > > On Wed, Dec 11, 2019 at 11:14:39 +0100, Kevin Wolf wrote:

[...]

> > > Well, in some specific cases we could detect the node names
> > > auto-assigned by qemu and use them instead of paths, but in my opinion
> > > it's not worth the effort and extra code.
> > 
> > Well, the question is what to do on the QEMU side then. Deprecation
> > should mean that we have a plan for removing the feature. If we're
> > planning to keep the feature indefinitely because libvirt needs it, we
> > might want to consider removing the deprecation notice.
> 
> Ideally libvirt would stop using -drive entirely, as I hate the idea of
> having to keep this around indefinitely in libvirt too. This needs QEMU
> to close the last gaps wrt SD cards

Yes and also give us guidance how to convert it. Looking at the code
didn't help. There's a plethora of controllers and options to configure
without clear indication what is default behaviour expected.

Trying to convert it blindly would end up worse than just ditching
support for sdcards altogehter.

Re: Insufficiently documented deprecated command arguments

2019-12-11 Thread Peter Krempa

On Wed, Dec 11, 2019 at 11:14:39 +0100, Kevin Wolf wrote:
> Am 11.12.2019 um 10:33 hat Peter Krempa geschrieben:
> > On Wed, Dec 11, 2019 at 09:12:41 +0100, Markus Armbruster wrote:
> > > Commit 7a9877a026 "block: Accept device model name for
> > > block_set_io_throttle" (v2.8.0) deprecated block_set_io_throttle
> > > argument @device.
> > 
> > This one is more complex. The command is used both in 'blockdev' and in
> > 'drive' mode:
> > 
> > In 'drive' mode we pass the alias of the 'drive' as the @device
> > argument.
> > 
> > In 'blockdev' mode we pass the qom name as @id
> 
> Any reason you couldn't use the QOM name even in 'drive' mode for any
> QEMU version that has the @id option?

Honestly, I didn't want to change the existing implementation at all.

If it will help I can change that as it will be pretty easy to do.

> 
> > > Commit c42e8742f5 "block: Use JSON null instead of "" to disable
> > > backing file" (v2.10.0) deprecated blockdev-add empty string
> > > argument @backing.
> > 
> > This is used in 'blockdev' mode only and we always pass the JSON null or
> > a node name string.
> 
> Here the thing to consider might be that JSON null isn't easy to use on
> the command line for manual users.
> 
> > > These were missed in commit eb22aeca65 "docs: document deprecation
> > > policy & deprecated features in appendix" (v2.10.0).
> > > 
> > > Commit 3c605f4074 "commit: Add top-node/base-node options" (v3.1.0)
> > > deprecated block-commit arguments @base and @top.
> > 
> > This command also has two modes:
> > 
> > In 'drive' mode we pass in path strings as @base and @ top.
> > 
> > In 'blockdev' mode we pass in nodenames as @base-node and @top-node.
> > 
> > Starting from qemu-4.2 libvirt uses 'blockdev' mode for VMs unless an SD
> > card is configured as we didn't convert to the '-device' approach for
> > those as AFAIK not everything is possible to be converted.
> 
> Hm... I guess in 'drive' mode, you stil don't assign node names, so you
> actually have to rely on paths?

Well, in some specific cases we could detect the node names
auto-assigned by qemu and use them instead of paths, but in my opinion
it's not worth the effort and extra code.

[PATCH RFC] qapi: Allow getting flat output from 'query-named-block-nodes'

2019-12-13 Thread Peter Krempa

When a management application manages node names there's no reason to
recurse into backing images in the output of query-named-block-nodes.

Add a parameter to the command which will return just the top level
structs.

Signed-off-by: Peter Krempa 
---
 block.c   |  5 +++--
 block/qapi.c  | 10 --
 blockdev.c| 12 ++--
 include/block/block.h |  2 +-
 include/block/qapi.h  |  4 +++-
 monitor/hmp-cmds.c|  2 +-
 qapi/block-core.json  |  6 +-
 7 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 473eb6eeaa..b30bdfa0d3 100644
--- a/block.c
+++ b/block.c
@@ -4766,14 +4766,15 @@ BlockDriverState *bdrv_find_node(const char *node_name)
 }

 /* Put this QMP function here so it can access the static graph_bdrv_states. */
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp)
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat,
+   Error **errp)
 {
 BlockDeviceInfoList *list, *entry;
 BlockDriverState *bs;

 list = NULL;
 QTAILQ_FOREACH(bs, &graph_bdrv_states, node_list) {
-BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, errp);
+BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp);
 if (!info) {
 qapi_free_BlockDeviceInfoList(list);
 return NULL;
diff --git a/block/qapi.c b/block/qapi.c
index 9a5d0c9b27..84048e1a57 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -42,7 +42,9 @@
 #include "qemu/cutils.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-BlockDriverState *bs, Error **errp)
+BlockDriverState *bs,
+bool flat,
+Error **errp)
 {
 ImageInfo **p_image_info;
 BlockDriverState *bs0;
@@ -156,6 +158,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 return NULL;
 }

+/* stop gathering data for flat output */
+if (flat)
+break;
+
 if (bs0->drv && bs0->backing) {
 info->backing_file_depth++;
 bs0 = bs0->backing->bs;
@@ -389,7 +395,7 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo 
**p_info,

 if (bs && bs->drv) {
 info->has_inserted = true;
-info->inserted = bdrv_block_device_info(blk, bs, errp);
+info->inserted = bdrv_block_device_info(blk, bs, false, errp);
 if (info->inserted == NULL) {
 goto err;
 }
diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..5f9c5e258f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3707,9 +3707,17 @@ void qmp_drive_backup(DriveBackup *arg, Error **errp)
 }
 }

-BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
+BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
+ bool flat,
+ Error **errp)
 {
-return bdrv_named_nodes_list(errp);
+bool return_flat = false;
+
+if (has_flat) {
+return_flat = flat;
+}
+
+return bdrv_named_nodes_list(return_flat, errp);
 }

 XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
diff --git a/include/block/block.h b/include/block/block.h
index 1df9848e74..177ba09e3f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -468,7 +468,7 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked);
 void bdrv_eject(BlockDriverState *bs, bool eject_flag);
 const char *bdrv_get_format_name(BlockDriverState *bs);
 BlockDriverState *bdrv_find_node(const char *node_name);
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp);
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, Error **errp);
 XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp);
 BlockDriverState *bdrv_lookup_bs(const char *device,
  const char *node_name,
diff --git a/include/block/qapi.h b/include/block/qapi.h
index cd9410dee3..22c7807c89 100644
--- a/include/block/qapi.h
+++ b/include/block/qapi.h
@@ -29,7 +29,9 @@
 #include "block/snapshot.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-BlockDriverState *bs, Error **errp);
+BlockDriverState *bs,
+bool flat,
+Error **errp);
 int bdrv_query_snapshot_info_list(BlockDriverState *bs,
   SnapshotInfoList **p_list,
   Error **errp);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index b2551c16d1..651969819b 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -620,7 +620,7 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
 }

 /* Prin

Re: [libvirt] [PATCH] virtio-blk: deprecate SCSI passthrough

2019-12-13 Thread Peter Krempa

On Fri, Dec 13, 2019 at 15:56:08 +0100, Paolo Bonzini wrote:
> On 13/12/19 15:46, Stefan Hajnoczi wrote:
> > The Linux virtio_blk.ko guest driver is removing legacy SCSI passthrough
> > support.  Deprecate this feature in QEMU too.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  qemu-deprecated.texi | 11 +++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> > index 4b4b7425ac..ef94d497da 100644
> > --- a/qemu-deprecated.texi
> > +++ b/qemu-deprecated.texi
> > @@ -285,6 +285,17 @@ spec you can use the ``-cpu 
> > rv64gcsu,priv_spec=v1.9.1`` command line argument.
> >  
> >  @section Device options
> >  
> > +@subsection Emulated device options
> > +
> > +@subsubsection -device virtio-blk,scsi=on|off (since 5.0.0)
> > +
> > +The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature.  
> > VIRTIO 1.0
> > +and later do not support it because the virtio-scsi device was introduced 
> > for
> > +full SCSI support.  Use virtio-scsi instead when SCSI passthrough is 
> > required.
> > +
> > +Note this also applies to ``-device virtio-blk-pci,scsi=on|off'', which is 
> > an
> > +alias.
> > +
> >  @subsection Block device options
> >  
> >  @subsubsection "backing": "" (since 2.12.0)
> > 
> 
> Reviewed-by: Paolo Bonzini 

Libvirt still allows and exposes this configuration:


  
  
  
  


which results into the following command line:

-drive file=/dev/sdfake2,format=qcow2,if=none,id=drive-virtio-disk1 \
-device virtio-blk-pci,scsi=on,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,\
id=virtio-disk1

In this case I don't see any possibility how to fix it since it requires
change of controller.

I don't think that we've also added interlocks for this with VIRTIO 1.0

Re: [libvirt] [PATCH v2 82/86] numa: forbid '-numa node, mem' for 5.0 and newer machine types

2020-01-15 Thread Peter Krempa

On Wed, Jan 15, 2020 at 16:07:37 +0100, Igor Mammedov wrote:
> Deprecation period is ran out and it's a time to flip the switch
> introduced by cd5ff8333a.
> Disable legacy option for new machine types and amend documentation.
> 
> Signed-off-by: Igor Mammedov 
> ---
> CC: peter.mayd...@linaro.org
> CC: ehabk...@redhat.com
> CC: marcel.apfelb...@gmail.com
> CC: m...@redhat.com
> CC: pbonz...@redhat.com
> CC: r...@twiddle.net
> CC: da...@gibson.dropbear.id.au
> CC: libvir-l...@redhat.com
> CC: qemu-...@nongnu.org
> CC: qemu-...@nongnu.org
> ---
>  hw/arm/virt.c|  2 +-
>  hw/core/numa.c   |  6 ++
>  hw/i386/pc.c |  1 -
>  hw/i386/pc_piix.c|  1 +
>  hw/i386/pc_q35.c |  1 +
>  hw/ppc/spapr.c   |  2 +-
>  qemu-deprecated.texi | 16 
>  qemu-options.hx  |  8 
>  8 files changed, 14 insertions(+), 23 deletions(-)

I'm afraid nobody bothered to fix it yet:

https://bugzilla.redhat.com/show_bug.cgi?id=1783355

[PATCH v2] qapi: Allow getting flat output from 'query-named-block-nodes'

2020-01-20 Thread Peter Krempa

When a management application manages node names there's no reason to
recurse into backing images in the output of query-named-block-nodes.

Add a parameter to the command which will return just the top level
structs.

Signed-off-by: Peter Krempa 
---

Diff to v1:
 - rewrote setting of 'return_flat' in qmp_query_named_block_nodes
 - tried to clarify the QMP schema docs for the new field

This patch does not aim to fix the rather suboptimal original
documentation of the command as that is going to end up in a bunch of
bikeshedding.

While I know that there are plans for a new command that should fix
this, the plans were already there for quite some time without much
happening. This is a quick fix to a real problem, because if you have
(maybe unpractically) deep backing chains, the returned JSON is getting
huge. (140 nesting levels exceeds 10MiB of JSON)

 block.c   |  5 +++--
 block/qapi.c  | 10 --
 blockdev.c|  8 ++--
 include/block/block.h |  2 +-
 include/block/qapi.h  |  4 +++-
 monitor/hmp-cmds.c|  2 +-
 qapi/block-core.json  |  7 ++-
 7 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index ecd09dbbfd..73d70aec11 100644
--- a/block.c
+++ b/block.c
@@ -4784,14 +4784,15 @@ BlockDriverState *bdrv_find_node(const char *node_name)
 }

 /* Put this QMP function here so it can access the static graph_bdrv_states. */
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp)
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat,
+   Error **errp)
 {
 BlockDeviceInfoList *list, *entry;
 BlockDriverState *bs;

 list = NULL;
 QTAILQ_FOREACH(bs, &graph_bdrv_states, node_list) {
-BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, errp);
+BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp);
 if (!info) {
 qapi_free_BlockDeviceInfoList(list);
 return NULL;
diff --git a/block/qapi.c b/block/qapi.c
index 9a5d0c9b27..84048e1a57 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -42,7 +42,9 @@
 #include "qemu/cutils.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-BlockDriverState *bs, Error **errp)
+BlockDriverState *bs,
+bool flat,
+Error **errp)
 {
 ImageInfo **p_image_info;
 BlockDriverState *bs0;
@@ -156,6 +158,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 return NULL;
 }

+/* stop gathering data for flat output */
+if (flat)
+break;
+
 if (bs0->drv && bs0->backing) {
 info->backing_file_depth++;
 bs0 = bs0->backing->bs;
@@ -389,7 +395,7 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo 
**p_info,

 if (bs && bs->drv) {
 info->has_inserted = true;
-info->inserted = bdrv_block_device_info(blk, bs, errp);
+info->inserted = bdrv_block_device_info(blk, bs, false, errp);
 if (info->inserted == NULL) {
 goto err;
 }
diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..cba2f567aa 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3707,9 +3707,13 @@ void qmp_drive_backup(DriveBackup *arg, Error **errp)
 }
 }

-BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
+BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
+ bool flat,
+ Error **errp)
 {
-return bdrv_named_nodes_list(errp);
+bool return_flat = has_flat && flat;
+
+return bdrv_named_nodes_list(return_flat, errp);
 }

 XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
diff --git a/include/block/block.h b/include/block/block.h
index e9dcfef7fa..afced5249c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -469,7 +469,7 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked);
 void bdrv_eject(BlockDriverState *bs, bool eject_flag);
 const char *bdrv_get_format_name(BlockDriverState *bs);
 BlockDriverState *bdrv_find_node(const char *node_name);
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp);
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, Error **errp);
 XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp);
 BlockDriverState *bdrv_lookup_bs(const char *device,
  const char *node_name,
diff --git a/include/block/qapi.h b/include/block/qapi.h
index cd9410dee3..22c7807c89 100644
--- a/include/block/qapi.h
+++ b/include/block/qapi.h
@@ -29,7 +29,9 @@
 #include "block/snapshot.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-Block

Re: [PATCH 0/2] block: Fix multiple blockdev-snapshot calls

2019-11-18 Thread Peter Krempa

On Fri, Nov 08, 2019 at 09:53:10 +0100, Kevin Wolf wrote:
> Kevin Wolf (2):
>   block: Remove 'backing': null from bs->{explicit_,}options
>   iotests: Test multiple blockdev-snapshot calls

Hi,

I'm not sure how the freeze rules for qemu are at this point thus:

Will this patchset make it into 4.2. I argue it's a pretty important fix
as it ends up in image corruption.

If it won't make into we will need to add a QMP feature flag to detect
presence of the fix so that I can gate libvirt's support of blockdev
with it. (This can also be done if we make it into 4.2 but I'll be okay
infering it from other fixes present in 4.2).

Thanks

Peter

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 0/2] block/snapshot: Restrict set of snapshot nodes

2019-09-17 Thread Peter Krempa

On Tue, Sep 17, 2019 at 13:04:41 +0200, Kevin Wolf wrote:
> This fixes internal snapshots with separately defined protocol nodes
> (like libvirt will do with -blockdev).

The code change is exactly what I thought would be necessary in this
case. I've tested it with my blockdev code in libvirt enabled and all
three internal snapshot commands seem to work as expected even when
fully using blockdev and also only the top layer gets the snapshot.

I'll send patches that attempt adding introspection to allow libvirt
detecting this fix.

Reviewed-by: Peter Krempa 
Tested-by: Peter Krempa 

signature.asc
Description: PGP signature

[Qemu-devel] [PATCH 2/2] qapi: Allow introspecting fix for savevm's cooperation with blockdev

2019-09-17 Thread Peter Krempa

savevm was buggy as it considered all monitor owned block device nodes
for snapshot. With introduction of -blockdev the common usage made all
nodes including protocol nodes monitor owned and thus considered for
snapshot. This was fixed but clients need to be able to detect whether
this fix is present.

Since savevm does not have an QMP alternative add the feature for the
'human-monitor-command' backdoor which is used to call this command in
modern use.

Signed-off-by: Peter Krempa 
---
 qapi/misc.json | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 6bd11f50e6..e2b33c3f8a 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1020,6 +1020,11 @@
 #
 # @cpu-index: The CPU to use for commands that require an implicit CPU
 #
+# Features:
+# @savevm-blockdev-monitor-nodes: If present, the savevm monitor command
+# correctly handles monitor owned block nodes
+# when taking a snapshot.
+#
 # Returns: the output of the command as a string
 #
 # Since: 0.14.0
@@ -1047,7 +1052,8 @@
 ##
 { 'command': 'human-monitor-command',
   'data': {'command-line': 'str', '*cpu-index': 'int'},
-  'returns': 'str' }
+  'returns': 'str',
+  'features' : [ { 'name': 'savevm-blockdev-monitor-nodes' } ] }

 ##
 # @change:
-- 
2.21.0

[Qemu-devel] [PATCH 0/2] qapi: Add detection for the 'savevm' fix for blockdev

2019-09-17 Thread Peter Krempa

Add 'features' field in the schema for commands and add a feature flag
to advertise that the fix for savevm [1] is present.

[1] https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg03487.html

Peter Krempa (2):
  qapi: Add feature flags to commands in qapi introspection
  qapi: Allow introspecting fix for savevm's cooperation with blockdev

 docs/devel/qapi-code-gen.txt   |  4 ++--
 qapi/introspect.json   |  6 -
 qapi/misc.json |  8 ++-
 scripts/qapi/commands.py   |  3 ++-
 scripts/qapi/common.py | 40 +-
 scripts/qapi/doc.py|  3 ++-
 scripts/qapi/introspect.py |  7 +-
 tests/qapi-schema/test-qapi.py |  7 +-
 8 files changed, 65 insertions(+), 13 deletions(-)

-- 
2.21.0

[Qemu-devel] [PATCH 1/2] qapi: Add feature flags to commands in qapi introspection

2019-09-17 Thread Peter Krempa

Similarly to features for struct types introduce the feature flags also
for commands. This will allow notifying management layers of fixes and
compatible changes in the behaviour of an command which may not be
detectable any other way.

The changes were heavily inspired by commit 6a8c0b51025.

Signed-off-by: Peter Krempa 
---
 docs/devel/qapi-code-gen.txt   |  4 ++--
 qapi/introspect.json   |  6 -
 scripts/qapi/commands.py   |  3 ++-
 scripts/qapi/common.py | 40 +-
 scripts/qapi/doc.py|  3 ++-
 scripts/qapi/introspect.py |  7 +-
 tests/qapi-schema/test-qapi.py |  7 +-
 7 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
index e8ec8ac1de..38682daace 100644
--- a/docs/devel/qapi-code-gen.txt
+++ b/docs/devel/qapi-code-gen.txt
@@ -726,8 +726,8 @@ change in the QMP syntax (usually by allowing values or 
operations that
 previously resulted in an error). QMP clients may still need to know
 whether the extension is available.

-For this purpose, a list of features can be specified for a struct type.
-This is exposed to the client as a list of string, where each string
+For this purpose, a list of features can be specified for a command or struct
+type. This is exposed to the client as a list of string, where each string
 signals that this build of QEMU shows a certain behaviour.

 In the schema, features can be specified as simple strings, for example:
diff --git a/qapi/introspect.json b/qapi/introspect.json
index 1843c1cb17..031a954fa9 100644
--- a/qapi/introspect.json
+++ b/qapi/introspect.json
@@ -266,13 +266,17 @@
 # @allow-oob: whether the command allows out-of-band execution,
 # defaults to false (Since: 2.12)
 #
+# @features: names of features associated with the command, in no particular
+#order. (since 4.2)
+#
 # TODO: @success-response (currently irrelevant, because it's QGA, not QMP)
 #
 # Since: 2.5
 ##
 { 'struct': 'SchemaInfoCommand',
   'data': { 'arg-type': 'str', 'ret-type': 'str',
-'*allow-oob': 'bool' } }
+'*allow-oob': 'bool',
+'*features': [ 'str' ] } }

 ##
 # @SchemaInfoEvent:
diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
index b929e07be4..6cfe6cdd9e 100644
--- a/scripts/qapi/commands.py
+++ b/scripts/qapi/commands.py
@@ -276,7 +276,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
 genc.add(gen_registry(self._regy.get_content(), self._prefix))

 def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
-  success_response, boxed, allow_oob, allow_preconfig):
+  success_response, boxed, allow_oob, allow_preconfig,
+  features):
 if not gen:
 return
 # FIXME: If T is a user-defined type, the user is responsible
diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
index d61bfdc526..1502820f46 100644
--- a/scripts/qapi/common.py
+++ b/scripts/qapi/common.py
@@ -838,6 +838,7 @@ def check_type(info, source, value, allow_array=False,
 def check_command(expr, info):
 name = expr['command']
 boxed = expr.get('boxed', False)
+features = expr.get('features')

 args_meta = ['struct']
 if boxed:
@@ -852,6 +853,19 @@ def check_command(expr, info):
expr.get('returns'), allow_array=True,
allow_optional=True, allow_metas=returns_meta)

+if features:
+if not isinstance(features, list):
+raise QAPISemError(info,
+   "Command '%s' requires an array for 'features'" 
%
+   name)
+for f in features:
+assert isinstance(f, dict)
+check_known_keys(info, "feature of command %s" % name, f,
+ ['name'], ['if'])
+
+check_if(f, info)
+check_name(info, "Feature of command %s" % name, f['name'])
+

 def check_event(expr, info):
 name = expr['event']
@@ -1138,8 +1152,10 @@ def check_exprs(exprs):
 meta = 'command'
 check_keys(expr_elem, 'command', [],
['data', 'returns', 'gen', 'success-response',
-'boxed', 'allow-oob', 'allow-preconfig', 'if'])
+'boxed', 'allow-oob', 'allow-preconfig', 'if',
+'features'])
 normalize_members(expr.get('data'))
+normalize_features(expr.get('

Re: [Qemu-devel] [PATCH 2/2] qapi: Allow introspecting fix for savevm's cooperation with blockdev

2019-09-18 Thread Peter Krempa

On Wed, Sep 18, 2019 at 10:22:13 +0200, Kevin Wolf wrote:
> Am 17.09.2019 um 18:33 hat Eric Blake geschrieben:
> > On 9/17/19 10:49 AM, Peter Krempa wrote:
> > > savevm was buggy as it considered all monitor owned block device nodes
> > > for snapshot. With introduction of -blockdev the common usage made all
> > > nodes including protocol nodes monitor owned and thus considered for
> > > snapshot. This was fixed but clients need to be able to detect whether
> > > this fix is present.
> > > 
> > > Since savevm does not have an QMP alternative add the feature for the
> > > 'human-monitor-command' backdoor which is used to call this command in
> > > modern use.
> > > 
> > > Signed-off-by: Peter Krempa 
> > > ---
> > >  qapi/misc.json | 8 +++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/qapi/misc.json b/qapi/misc.json
> > > index 6bd11f50e6..e2b33c3f8a 100644
> > > --- a/qapi/misc.json
> > > +++ b/qapi/misc.json
> > > @@ -1020,6 +1020,11 @@
> > >  #
> > >  # @cpu-index: The CPU to use for commands that require an implicit CPU
> > >  #
> > > +# Features:
> > > +# @savevm-blockdev-monitor-nodes: If present, the savevm monitor command
> > > +# correctly handles monitor owned block 
> > > nodes
> > > +# when taking a snapshot.
> > 
> > Is it worth adding a '(since 4.2)' on when features are added?
> 
> I would also rather describe the change in behaviour ("only includes
> monitor owned block nodes if they have no parents") than saying that the
> behaviour is now correct.

That's a good idea. I'll post it in a v2 if required.

> (Not the least because we know that the current way still isn't quite
> correct, just hopefully correct enough: Block job BlockBackends are
> currently snapshotted, which is unintended and will be changed in the
> future. However, in practice people probably won't use block jobs on
> non-root nodes and internal snapshots together.)
> 
> > > +#
> > >  # Returns: the output of the command as a string
> > >  #
> > >  # Since: 0.14.0
> > > @@ -1047,7 +1052,8 @@
> > >  ##
> > >  { 'command': 'human-monitor-command',
> > >'data': {'command-line': 'str', '*cpu-index': 'int'},
> > > -  'returns': 'str' }
> > > +  'returns': 'str',
> > > +  'features' : [ { 'name': 'savevm-blockdev-monitor-nodes' } ] }
> > 
> > We could, of course, actually implement a QMP 'savevm' and use _that_ as
> > the introspection.  But that's a bigger can of worms, so this is
> > reasonable enough for the 4.2 timeframe.
> 
> I think a QMP savevm would sidestep the whole issue by taking an
> explicit list of nodes to snapshot, and an explicit option to tell which
> node to store the vmstate on.

A straight replacement for savevm would be quite pointless and also such
discussion took already place multiple times in the past. The result
always was that we need something better than just a qmp version of
savevm.

I'll be happy to implement the libvirt support for the "new" savevm if
it ever appears.


signature.asc
Description: PGP signature

[PATCH v2 2/2] qapi: Allow introspecting fix for savevm's cooperation with blockdev

2019-09-20 Thread Peter Krempa

savevm was buggy as it considered all monitor owned block device nodes
for snapshot. With introduction of -blockdev the common usage made all
nodes including protocol nodes monitor owned and thus considered for
snapshot. This was fixed but clients need to be able to detect whether
this fix is present.

Since savevm does not have an QMP alternative add the feature for the
'human-monitor-command' backdoor which is used to call this command in
modern use.

Signed-off-by: Peter Krempa 
---
 qapi/misc.json | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 6bd11f50e6..673445dfa9 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1020,6 +1020,12 @@
 #
 # @cpu-index: The CPU to use for commands that require an implicit CPU
 #
+# Features:
+# @savevm-blockdev-monitor-nodes: If present, the savevm monitor command only
+# snapshots monitor owned nodes if they have no
+# parents. This allows to use 'savevm' with
+# -blockdev. (since 4.2)
+#
 # Returns: the output of the command as a string
 #
 # Since: 0.14.0
@@ -1047,7 +1053,8 @@
 ##
 { 'command': 'human-monitor-command',
   'data': {'command-line': 'str', '*cpu-index': 'int'},
-  'returns': 'str' }
+  'returns': 'str',
+  'features' : [ { 'name': 'savevm-blockdev-monitor-nodes' } ] }

 ##
 # @change:
-- 
2.21.0

[PATCH v2 0/2] qapi: Add detection for the 'savevm' fix for blockdev

2019-09-20 Thread Peter Krempa

Add 'features' field in the schema for commands and add a feature flag
to advertise that the fix for savevm [1] is present.

[1] https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg03487.html

v2:
 - fixed typos pointed out by Eric
 - added 'since' tag
 - reworded to describe the fix this allows to detect properly (Kevin)
 - verified that this can be rebased on top of Markus' series
   automatically

Peter Krempa (2):
  qapi: Add feature flags to commands in qapi introspection
  qapi: Allow introspecting fix for savevm's cooperation with blockdev

 docs/devel/qapi-code-gen.txt   |  4 ++--
 qapi/introspect.json   |  6 -
 qapi/misc.json |  9 +++-
 scripts/qapi/commands.py   |  3 ++-
 scripts/qapi/common.py | 40 +-
 scripts/qapi/doc.py|  3 ++-
 scripts/qapi/introspect.py |  7 +-
 tests/qapi-schema/test-qapi.py |  7 +-
 8 files changed, 66 insertions(+), 13 deletions(-)

-- 
2.21.0

[PATCH v2 1/2] qapi: Add feature flags to commands in qapi introspection

2019-09-20 Thread Peter Krempa

Similarly to features for struct types introduce the feature flags also
for commands. This will allow notifying management layers of fixes and
compatible changes in the behaviour of a command which may not be
detectable any other way.

The changes were heavily inspired by commit 6a8c0b51025.

Signed-off-by: Peter Krempa 
---
 docs/devel/qapi-code-gen.txt   |  4 ++--
 qapi/introspect.json   |  6 -
 scripts/qapi/commands.py   |  3 ++-
 scripts/qapi/common.py | 40 +-
 scripts/qapi/doc.py|  3 ++-
 scripts/qapi/introspect.py |  7 +-
 tests/qapi-schema/test-qapi.py |  7 +-
 7 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
index e8ec8ac1de..d465cc6330 100644
--- a/docs/devel/qapi-code-gen.txt
+++ b/docs/devel/qapi-code-gen.txt
@@ -726,8 +726,8 @@ change in the QMP syntax (usually by allowing values or 
operations that
 previously resulted in an error). QMP clients may still need to know
 whether the extension is available.

-For this purpose, a list of features can be specified for a struct type.
-This is exposed to the client as a list of string, where each string
+For this purpose, a list of features can be specified for a command or struct
+type. This is exposed to the client as a list of strings, where each string
 signals that this build of QEMU shows a certain behaviour.

 In the schema, features can be specified as simple strings, for example:
diff --git a/qapi/introspect.json b/qapi/introspect.json
index 1843c1cb17..031a954fa9 100644
--- a/qapi/introspect.json
+++ b/qapi/introspect.json
@@ -266,13 +266,17 @@
 # @allow-oob: whether the command allows out-of-band execution,
 # defaults to false (Since: 2.12)
 #
+# @features: names of features associated with the command, in no particular
+#order. (since 4.2)
+#
 # TODO: @success-response (currently irrelevant, because it's QGA, not QMP)
 #
 # Since: 2.5
 ##
 { 'struct': 'SchemaInfoCommand',
   'data': { 'arg-type': 'str', 'ret-type': 'str',
-'*allow-oob': 'bool' } }
+'*allow-oob': 'bool',
+'*features': [ 'str' ] } }

 ##
 # @SchemaInfoEvent:
diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
index b929e07be4..6cfe6cdd9e 100644
--- a/scripts/qapi/commands.py
+++ b/scripts/qapi/commands.py
@@ -276,7 +276,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
 genc.add(gen_registry(self._regy.get_content(), self._prefix))

 def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
-  success_response, boxed, allow_oob, allow_preconfig):
+  success_response, boxed, allow_oob, allow_preconfig,
+  features):
 if not gen:
 return
 # FIXME: If T is a user-defined type, the user is responsible
diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
index d61bfdc526..1502820f46 100644
--- a/scripts/qapi/common.py
+++ b/scripts/qapi/common.py
@@ -838,6 +838,7 @@ def check_type(info, source, value, allow_array=False,
 def check_command(expr, info):
 name = expr['command']
 boxed = expr.get('boxed', False)
+features = expr.get('features')

 args_meta = ['struct']
 if boxed:
@@ -852,6 +853,19 @@ def check_command(expr, info):
expr.get('returns'), allow_array=True,
allow_optional=True, allow_metas=returns_meta)

+if features:
+if not isinstance(features, list):
+raise QAPISemError(info,
+   "Command '%s' requires an array for 'features'" 
%
+   name)
+for f in features:
+assert isinstance(f, dict)
+check_known_keys(info, "feature of command %s" % name, f,
+ ['name'], ['if'])
+
+check_if(f, info)
+check_name(info, "Feature of command %s" % name, f['name'])
+

 def check_event(expr, info):
 name = expr['event']
@@ -1138,8 +1152,10 @@ def check_exprs(exprs):
 meta = 'command'
 check_keys(expr_elem, 'command', [],
['data', 'returns', 'gen', 'success-response',
-'boxed', 'allow-oob', 'allow-preconfig', 'if'])
+'boxed', 'allow-oob', 'allow-preconfig', 'if',
+'features'])
 normalize_members(expr.get('data'))
+normalize_features(expr.get('

Re: [libvirt] [PULL 1/9] IDE: deprecate ide-drive

2019-11-01 Thread Peter Krempa

On Thu, Oct 31, 2019 at 23:02:45 +0100, Paolo Bonzini wrote:
> On 31/10/19 11:58, John Snow wrote:
> > It's an old compatibility shim that just delegates to ide-cd or ide-hd.
> > I'd like to refactor these some day, and getting rid of the super-object
> > will make that easier.
> > 
> > Either way, we don't need this.
> 
> Good idea.  I will prepare a similar patch for scsi-disk, even though
> technically we're already in soft freeze; it makes no sense to deprecate
> only one of the two.

I checked in libvirt and you are welcome to do so since we no longer use
it similarly to ide-disk.



signature.asc
Description: PGP signature

Re: [PATCH v2 2/2] qapi: deprecate implicit filters

2019-11-08 Thread Peter Krempa

On Fri, Nov 08, 2019 at 13:16:55 +0300, Vladimir Sementsov-Ogievskiy wrote:
> To get rid of implicit filters related workarounds in future let's
> deprecate them now.
> 
> Deprecation warning breaks some bash iotests output, so fix it here
> too: in most of cases just add filter-node-name in test.
> 
> In 161 add FIXME and deprecation warning into 161.out.
> 
> In 249, the test case is changed, as we don't need to test "default"
> filter node name anymore.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-deprecated.texi   |  6 ++
>  qapi/block-core.json   |  9 ++---
>  include/block/block_int.h  | 10 +-
>  blockdev.c | 10 ++
>  tests/qemu-iotests/094 |  1 +
>  tests/qemu-iotests/095 |  6 --
>  tests/qemu-iotests/109 |  1 +
>  tests/qemu-iotests/127 |  1 +
>  tests/qemu-iotests/141 |  5 -
>  tests/qemu-iotests/144 |  3 ++-
>  tests/qemu-iotests/156 |  1 +
>  tests/qemu-iotests/161 |  7 +++
>  tests/qemu-iotests/161.out |  1 +
>  tests/qemu-iotests/185 |  3 +++
>  tests/qemu-iotests/191 |  2 ++
>  tests/qemu-iotests/229 |  1 +
>  tests/qemu-iotests/247 |  8 +---
>  tests/qemu-iotests/249 |  5 +++--
>  tests/qemu-iotests/249.out |  2 +-
>  19 files changed, 68 insertions(+), 14 deletions(-)
> 
> diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
> index 296bfc93a3..c969faf55a 100644
> --- a/qemu-deprecated.texi
> +++ b/qemu-deprecated.texi
> @@ -204,6 +204,12 @@ and accurate ``query-qmp-schema'' command.
>  Character devices creating sockets in client mode should not specify
>  the 'wait' field, which is only applicable to sockets in server mode
>  
> +@subsection implicit filters in mirror and commit (since 4.2)
> +
> +Mirror and commit jobs insert filters, which becomes implicit if user
> +omitted @option(filter-node-name) parameter. So omitting it is deprecated
> +in ``blockdev-mirror'', ``drive-mirror'' and ``block-commit'', set it always.
> +
>  @section Human Monitor Protocol (HMP) commands
>  
>  @subsection The hub_id parameter of 'hostfwd_add' / 'hostfwd_remove' (since 
> 3.1)
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 93530d3a13..37caed775f 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1659,7 +1659,8 @@
>  # @filter-node-name: the node name that should be assigned to the
>  #filter driver that the commit job inserts into the graph
>  #above @top. If this option is not given, a node name is
> -#autogenerated. (Since: 2.9)
> +#autogenerated. Omitting this option is deprecated, it 
> will
> +#be required in future. (Since: 2.9)
>  #
>  # @auto-finalize: When false, this job will wait in a PENDING state after it 
> has
>  # finished its work, waiting for @block-job-finalize before

Note that 'block-commit' and 'drive-mirror' commands are used by libvirt
in the pre-blockdev era. In those instances we gather statistics of
block devices by nesting in the output of query-blockstats and
query-block rather than selecting the appropriate info by any other
means (e.g. by node name).

This means that the output MUST stay consistend when block jobs are used
and the hack this patch is deprcating will break those.

Note that in libvirt we don't plan to invest time to add workarounds for
non-blockdev cases since blockdev by itself is complex enough and I'd
strongly prefer not having a third code path to care about.

Given that -blockdev can't be used in all cases (e.g. for sd-cards)
which also blocks deprecation of -drive I don't think that hiding of
implicit filter nodes can be deprecated until -drive is deprecated.

Re: [PATCH v2 1/2] qapi: add filter-node-name option to drive-mirror

2019-11-08 Thread Peter Krempa

On Fri, Nov 08, 2019 at 13:16:54 +0300, Vladimir Sementsov-Ogievskiy wrote:
> To correspond to blockdev-mirror command and make it possible to
> deprecate implicit filters in the next commit.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json | 7 +++
>  blockdev.c   | 2 +-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index aa97ee2641..93530d3a13 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1992,6 +1992,12 @@
>  # @on-target-error: the action to take on an error on the target,
>  #   default 'report' (no limitations, since this applies to
>  #   a different block device than @device).
> +#
> +# @filter-node-name: the node name that should be assigned to the
> +#filter driver that the mirror job inserts into the graph
> +#above @device. If this option is not given, a node name 
> is
> +#autogenerated. (Since: 4.2)
> +#

Speaking for libvirt and based on what I've responded to patch 2 there's
no value in adding this parameter for libvirt's use.

This also kind of means that drive-mirror can be deprecated together
with deprecating -drive.

Re: [PATCH v2 2/2] qapi: deprecate implicit filters

2019-11-08 Thread Peter Krempa

On Fri, Nov 08, 2019 at 13:56:03 +, Vladimir Sementsov-Ogievskiy wrote:
> 08.11.2019 16:27, Peter Krempa wrote:
> > On Fri, Nov 08, 2019 at 13:16:55 +0300, Vladimir Sementsov-Ogievskiy wrote:

[...]

> > Note that 'block-commit' and 'drive-mirror' commands are used by libvirt
> > in the pre-blockdev era. In those instances we gather statistics of
> > block devices by nesting in the output of query-blockstats and
> > query-block rather than selecting the appropriate info by any other
> > means (e.g. by node name).
> > 
> > This means that the output MUST stay consistend when block jobs are used
> > and the hack this patch is deprcating will break those.
> > 
> > Note that in libvirt we don't plan to invest time to add workarounds for
> > non-blockdev cases since blockdev by itself is complex enough and I'd
> > strongly prefer not having a third code path to care about.
> > 
> > Given that -blockdev can't be used in all cases (e.g. for sd-cards)
> > which also blocks deprecation of -drive I don't think that hiding of
> > implicit filter nodes can be deprecated until -drive is deprecated.
> > 
> 
> 
> OK, so, we can't deprecate anything around it now.
> 
> What is the problem with sd-cards?

So the problem was that it was impossible to instantiate it via -device,
but looking at the qemu code base this doesn't seem to be true any more.

I'll have a look whether we can rework the instantiation of sd card
frontends in libvirt somehow or whether it actually ever worked.
Unfortunately the documentation seems to be rather sparse.

Re: [PATCH 1/2] block: Remove 'backing': null from bs->{explicit_,}options

2019-11-12 Thread Peter Krempa

On Fri, Nov 08, 2019 at 09:53:11 +0100, Kevin Wolf wrote:
> bs->options and bs->explicit_options shouldn't contain any options for
> child nodes. bdrv_open_inherited() takes care to remove any options that
> match a child name after opening the image and the same is done when
> reopening.
> 
> However, we miss the case of 'backing': null, which is a child option,
> but results in no child being created. This means that a 'backing': null
> remains in bs->options and bs->explicit_options.
> 
> A typical use for 'backing': null is in live snapshots: blockdev-add for
> the qcow2 overlay makes sure not to open the backing file (because it is

Note that we also use '"backing": null' as a terminator for the last
image in the chain if the user configures the chain manually.

This is kind-of a protection from opening the backing file from the
header if it was misconfigured somehow. I think this functionality
should be kept despite probably not making practical sense.

In my testing this scenario worked properly.

> already opened and blockdev-snapshot will attach it). After doing a
> blockdev-snapshot, bs->options and bs->explicit_options become
> inconsistent with the actual state (bs has a backing file now, but the
> options still say null). On the next occasion that the image is
> reopened, e.g. switching it from read-write to read-only when another
> snapshot is taken, the option will take effect again and the node
> incorrectly loses its backing file.
> 
> Fix bdrv_open_inherited() to remove the 'backing' option from
> bs->options and bs->explicit_options even for the case where it
> specifies that no backing file is wanted.
> 
> Reported-by: Peter Krempa 
> Signed-off-by: Kevin Wolf 

The fix looks sane-enough to me and works as expected, but since I'm not
familiar enough with this code I'm comfortable only with a:

Tested-by: Peter Krempa 

> ---
>  block.c | 2 ++
>  1 file changed, 2 insertions(+)

Re: [PATCH 2/2] iotests: Test multiple blockdev-snapshot calls

2019-11-12 Thread Peter Krempa

On Fri, Nov 08, 2019 at 09:53:12 +0100, Kevin Wolf wrote:
> Test that doing a second blockdev-snapshot doesn't make the first
> overlay's backing file go away.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/273 |  76 +
>  tests/qemu-iotests/273.out | 337 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 414 insertions(+)
>  create mode 100755 tests/qemu-iotests/273
>  create mode 100644 tests/qemu-iotests/273.out

Didn't apply cleanly for me.

> 
> diff --git a/tests/qemu-iotests/273 b/tests/qemu-iotests/273
> new file mode 100755
> index 00..60076de7ce
> --- /dev/null
> +++ b/tests/qemu-iotests/273
> @@ -0,0 +1,76 @@
> +#!/usr/bin/env bash
> +#
> +# Test large write to a qcow2 image

Cut&paste?


Rest looks good

Reviewed-by: Peter Krempa 

> +#
> +# Copyright (C) 2019 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +seq=$(basename "$0")
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +_cleanup_test_img
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +# This is a qcow2 regression test
> +_supported_fmt qcow2
> +_supported_proto file
> +_supported_os Linux
> +
> +do_run_qemu()
> +{
> +echo Testing: "$@"
> +$QEMU -nographic -qmp-pretty stdio -nodefaults "$@"

-qmp-pretty, that's useful

> +echo
> +}
> +
> +run_qemu()
> +{
> +do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qemu | _filter_qmp |
> +_filter_generated_node_ids | _filter_imgfmt | 
> _filter_actual_image_size
> +}
> +
> +TEST_IMG="$TEST_IMG.base" _make_test_img 64M
> +TEST_IMG="$TEST_IMG.mid" _make_test_img -b "$TEST_IMG.base"
> +_make_test_img -b "$TEST_IMG.mid"
> +
> +run_qemu \
> +-blockdev file,node-name=base,filename="$TEST_IMG.base" \
> + -blockdev file,node-name=midf,filename="$TEST_IMG.mid" \
> + -blockdev 
> '{"driver":"qcow2","node-name":"mid","file":"midf","backing":null}' \
> + -blockdev file,node-name=topf,filename="$TEST_IMG" \
> + -blockdev 
> '{"driver":"qcow2","file":"topf","node-name":"top","backing":null}' \
> +< +{"execute":"qmp_capabilities"}
> +{"execute":"blockdev-snapshot","arguments":{"node":"base","overlay":"mid"}}
> +{"execute":"blockdev-snapshot","arguments":{"node":"mid","overlay":"top"}}
> +{"execute":"query-named-block-nodes"}
> +{"execute":"x-debug-query-block-graph"}

Oh, this too!

> +{"execute":"quit"}
> +EOF
> +
> +# success, all done
> +echo "*** done"
> +rm -f $seq.full
> +status=0

Re: [PATCH v3 00/13] RFC: [for 5.0]: HMP monitor handlers cleanups

2020-01-27 Thread Peter Krempa

On Mon, Jan 27, 2020 at 14:39:02 -0500, John Snow wrote:
> 
> 
> On 1/27/20 5:36 AM, Maxim Levitsky wrote:
> > This patch series is bunch of cleanups
> > to the hmp monitor code.
> > 
> > This series only touched blockdev related hmp handlers.
> > 
> > No functional changes expected other that
> > light error message changes by the last patch.
> > 
> > This was inspired by this bugzilla:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1719169
> > 
> > Basically some users still parse hmp error messages,
> > and they would like to have them prefixed with 'Error:'
> > 
> 
> HMP isn't meant to be parsed. It's explicitly *not* API or ABI. I do
> like consistency in my UIs (it's useful for human eyes, too), but I'd
> like to know more about the request.

That's true as long as there's an stable replacement ... see below.

> 
> Is this request coming from libvirt? Can we wean them off of this
> interface? What do they need as a replacement?

There are 5 commands that libvirt still has HMP interfaces for:

drive_add
drive_del

savevm
loadvm
delvm

>From upstream point of view there's no value in adding the 'error'
prefix to drive_add/drive_del as libvirt now uses blockdev-add/del QMP
command instead which have implicit error propagation.

There are no replacements for the internal snapshot commands, but they
reported the 'error' prefix for some time even before this series.

Said that, please don't break savevm/loadvm/delvm until a QMP
replacement is added.

The bug was reported at the time when libvirt didn't use blockdev yet,
but at this point it's pointless from our side. This wouldn't even fix
the scenario when old (pre-5.10) libvirt would use new qemu because the
drive-add handler never checked the error prefix.

[1] 
https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_monitor_text.c;h=9135a11f0a3aae718c86bb199112fba8d16d4d80;hb=HEAD

Re: [PATCH RFC] qapi: Allow getting flat output from 'query-named-block-nodes'

2019-12-19 Thread Peter Krempa

On Tue, Dec 17, 2019 at 16:11:37 +0100, Markus Armbruster wrote:
> Eric Blake  writes:
> 
> > On 12/17/19 1:36 AM, Markus Armbruster wrote:
> >
> >> Un-snipping the QAPI schema change:
> >
> > Sorry about that...
> >
> >>
>  diff --git a/qapi/block-core.json b/qapi/block-core.json
>  index 0cf68fea14..bd651106bd 100644
>  --- a/qapi/block-core.json
>  +++ b/qapi/block-core.json
>  @@ -1752,6 +1752,8 @@
>    #
>    # Get the named block driver list
>    #
>  +# @flat: don't recurse into backing images if true. Default is false 
>  (Since 5.0)
>  +#
>    # Returns: the list of BlockDeviceInfo
>    #
>    # Since: 2.0
> >>
> >> What does it mean not to recurse?  Sounds like flat: false asks for a
> >> subset of the full set.  How exactly is the subset defined?
> >
> > Bike-shedding:
> >
> > Would it be easier to explain as:
> >
> > @recurse: If true, include child information in each node (note that
> > this can result in redundant output). Default is true (since 5.0)
> >
> > and then pass false when you don't want recursion, with it being more
> > obvious that using the new flag results in more compact output with no
> > loss of information.
> 
> Adds a bit of information, namely that the information suppressed by
> recurse: false is actually redundant.
> 
> The command's doc comment is weak: it doesn't really specify what
> exactly is returned.  Simply omitting redundant information always
> should still conform to this weak spec.  Of course, what really counts
> isn't spec conformance, but meeting client expectations.  I don't even
> understand what exactly gets returned, let alone how clients use it.

Well I by default expect that if the top level array has data for all
node names the nesting which repeats the information (partially) should
not be there because you can just look elsewhere for a more detailed
output.

Said that two years ago I wrote some code which detects the node names
from running qemu because at that time it was the only way to use the
block write threshold event. This code needs to extract the nesting
topology somehow but I don't remember nor fancy re-understanding the
algorithm for the detection during the hollidays. The only thing I can
say that the nesting is extracted either from the output of query-block
or from query-named-block nodes as both these outputs are fed to that
algorithm.

With blockdev that piece of code thankfully is unused, but unfortunately
I can't say that the nested output of query-named-block-nodes doesn't
have it's use.

Re: [PATCH v2 0/5] fix migration with bitmaps and mirror

2019-12-19 Thread Peter Krempa

On Thu, Dec 19, 2019 at 11:51:01 +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> It's a continuation for
> "bitmap migration bug with -drive while block mirror runs"
> <315cff78-dcdb-a3ce-2742-da3cc9f0c...@redhat.com>
> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg07241.html
> 
> The problem is that bitmaps migrated to node with same node-name or
> blk-parent name. And currently only the latter actually work in libvirt.
> And with mirror-top filter it doesn't work, because
> bdrv_get_device_or_node_name don't go through filters.

I want to point out that since libvirt-5.10 we use -blockdev to
configure the backend of storage devices with qemu-4.2 and later. This
means unfortunately that the BlockBackend of the drive does not have a
name any more and thus the above will not work even if you make the
lookup code to see through filters.

As I've pointed out separately node-names are not good idea to use for
matching either as they can be distinct on the destination of migration.

Having same node names for images during migration was not documented as
a requiremend and even if it was the case when the mirror job is used
the destination is a different image and thus having a different node
name is expected.

Since it's not documented, expect the same situation as with
autogenerated nodenames, the destination may have different node-names
and the same node-name may refer to a different image. Implicit matching
based on node-names is thus impossible.

Re: [PATCH] block: nbd: Fix dirty bitmap context name

2019-12-20 Thread Peter Krempa

On Fri, Dec 20, 2019 at 09:39:17 +, Vladimir Sementsov-Ogievskiy wrote:
> 19.12.2019 18:55, Nir Soffer wrote:
> > On Thu, Dec 19, 2019 at 5:17 PM Vladimir Sementsov-Ogievskiy
> >  wrote:
> >>
> >> 19.12.2019 17:59, Nir Soffer wrote:
> >>> On Thu, Dec 19, 2019 at 4:04 PM Kevin Wolf  wrote:
> 
>  Am 19.12.2019 um 14:41 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > Ahh, I see, it's documented as
> >
> > +# @bitmap: Also export the dirty bitmap reachable from @device, so the
> > +#  NBD client can use NBD_OPT_SET_META_CONTEXT with
> > +#  "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
> >
> > and it is logical to assume that export name (which is @name argument) 
> > is
> > mentioned. But we never mentioned it. This is just documented after
> > removed experimenatl command x-nbd-server-add-bitmap,
> >
> > look at
> >
> > commit 7dc570b3806e5b0a4c9219061556ed5a4a0de80c
> > Author: Eric Blake 
> > Date:   Fri Jan 11 13:47:18 2019 -0600
> >
> >nbd: Remove x-nbd-server-add-bitmap
> >
> > ...
> >
> > -# @bitmap-export-name: How the bitmap will be seen by nbd clients
> > -#  (default @bitmap)
> > -#
> > -# Note: the client must use NBD_OPT_SET_META_CONTEXT with a query of
> > -# "qemu:dirty-bitmap:NAME" (where NAME matches @bitmap-export-name) to 
> > access
> > -# the exposed bitmap.
> >
> >
> > So, this "NAME" is saved and now looks incorrect. What should be fixed, 
> > is Qapi
> > documentation.
> 
>  Hm, I don't know these interfaces very well, but from you explanation it
>  looks to me as if having a bitmap name made sense with
>  x-nbd-server-add-bitmap because it could be called more than once for
>  exporting multiple bitmaps.
> 
>  But now, we have only nbd-server-add, which takes a single bitmap name.
>  As we don't have to distinguish multiple bitmaps any more, wouldn't it
>  make more sense to use "qemu:dirty-bitmap" without any other
>  information? Both export name and bitmap name are already identified by
>  the connection.
> >>>
> >>> We can use empty string (like the default export name), so the bitmap
> >>> would be exposed as:
> >>>
> >>>   "qemu:dirty-bitmap:"
> >>>
> >>> This would solve the issue for users, and keep the API extensible.
> >>
> >> As I already said, we can not. If we really wont such thing, use another 
> >> name,
> >> likq qemu:default-dirty-bitmap..
> >>
> >> Or call bitmap export "default", to produce
> >>"qemu:dirty-bitmaps:default"
> >>
> >> But don't change default behavior of nbd-server-add
> >>
> >>>
>  But if we have to have something there, using the bitmap name (which may
>  or may not be the same as used in the image file) makes a little more
>  sense because it makes the interface extensible for the case that we
>  ever want to re-introduce an nbd-server-add-bitmap.
> >>>
> >>> But using the bitmap name means user of the NBD server need to know this 
> >>> name.
> >>
> >> Why not? What is your case exactly? User knows, what kind of bitmap you are
> >> exporting, so user is in some relation with exporting process anyway. Why
> >> shouldn't it know the name?
> > 
> > Because the user configuring qemu (libvirt) is not the same user
> > accessing qemu NBD
> > server (ovirt, or some backup application).
> > 
> >> This name may be defined in you exporting protocol.. What are you 
> >> exporting?
> > 
> > We export HTTP API, allowing getting dirty extents and reading data:
> > https://www.ovirt.org/develop/release-management/features/storage/incremental-backup.html#map-request
> > (this document is outdated, but it gives the general idea)
> > 
> > Or provide the NBD URL directly to user (future).
> > 
> >> Note also, that client may use NBD_OPT_LIST_META_CONTEXT with query
> >> "qemu:dirty-bitmap:", to get list of all exported bitmaps.
> > 
> > This is another option, I did not try to use this yet. We can use the single
> > exported bitmap and fail if we get more than one. This is probably better
> > than changing the entire stack to support bitmap name.
> > 
> >>> One option is that libvirt would publish the name of the bitmap in the
> >>> xml describing
> >>> the backup, and oVirt will have to propagate this name to the actual
> >>> program accessing
> >>> the NBD server, which may be a user program in the case when we expose 
> >>> the NBD
> >>> URL to users (planned for future version).
> >>>
> >>> Another option is that the user will control this name, and libvirt
> >>> will use the name specified
> >>> by the user. This means oVirt will have to provide API to set this
> >>> name and pass it to libvirt.
> >>>
> >>> Both cases require lot of effort which does not help anyone in the
> >>> task of getting dirty
> >>> extents from an image - which is our current goal. We need to have
> >>> good defaults t

Re: [PATCH] block: nbd: Fix dirty bitmap context name

2019-12-20 Thread Peter Krempa

On Fri, Dec 20, 2019 at 10:06:17 +, Vladimir Sementsov-Ogievskiy wrote:
> 20.12.2019 12:56, Peter Krempa wrote:
> > On Fri, Dec 20, 2019 at 09:39:17 +, Vladimir Sementsov-Ogievskiy wrote:
> >> 19.12.2019 18:55, Nir Soffer wrote:
> >>> On Thu, Dec 19, 2019 at 5:17 PM Vladimir Sementsov-Ogievskiy
> >>>  wrote:
> >>>>
> >>>> 19.12.2019 17:59, Nir Soffer wrote:
> >>>>> On Thu, Dec 19, 2019 at 4:04 PM Kevin Wolf  wrote:
> >>>>>>
> >>>>>> Am 19.12.2019 um 14:41 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >>>>>>> Ahh, I see, it's documented as
> >>>>>>>
> >>>>>>> +# @bitmap: Also export the dirty bitmap reachable from @device, so 
> >>>>>>> the
> >>>>>>> +#  NBD client can use NBD_OPT_SET_META_CONTEXT with
> >>>>>>> +#  "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 
> >>>>>>> 4.0)
> >>>>>>>
> >>>>>>> and it is logical to assume that export name (which is @name 
> >>>>>>> argument) is
> >>>>>>> mentioned. But we never mentioned it. This is just documented after
> >>>>>>> removed experimenatl command x-nbd-server-add-bitmap,
> >>>>>>>
> >>>>>>> look at
> >>>>>>>
> >>>>>>> commit 7dc570b3806e5b0a4c9219061556ed5a4a0de80c
> >>>>>>> Author: Eric Blake 
> >>>>>>> Date:   Fri Jan 11 13:47:18 2019 -0600
> >>>>>>>
> >>>>>>> nbd: Remove x-nbd-server-add-bitmap
> >>>>>>>
> >>>>>>> ...
> >>>>>>>
> >>>>>>> -# @bitmap-export-name: How the bitmap will be seen by nbd clients
> >>>>>>> -#  (default @bitmap)
> >>>>>>> -#
> >>>>>>> -# Note: the client must use NBD_OPT_SET_META_CONTEXT with a query of
> >>>>>>> -# "qemu:dirty-bitmap:NAME" (where NAME matches @bitmap-export-name) 
> >>>>>>> to access
> >>>>>>> -# the exposed bitmap.
> >>>>>>>
> >>>>>>>
> >>>>>>> So, this "NAME" is saved and now looks incorrect. What should be 
> >>>>>>> fixed, is Qapi
> >>>>>>> documentation.
> >>>>>>
> >>>>>> Hm, I don't know these interfaces very well, but from you explanation 
> >>>>>> it
> >>>>>> looks to me as if having a bitmap name made sense with
> >>>>>> x-nbd-server-add-bitmap because it could be called more than once for
> >>>>>> exporting multiple bitmaps.
> >>>>>>
> >>>>>> But now, we have only nbd-server-add, which takes a single bitmap name.
> >>>>>> As we don't have to distinguish multiple bitmaps any more, wouldn't it
> >>>>>> make more sense to use "qemu:dirty-bitmap" without any other
> >>>>>> information? Both export name and bitmap name are already identified by
> >>>>>> the connection.
> >>>>>
> >>>>> We can use empty string (like the default export name), so the bitmap
> >>>>> would be exposed as:
> >>>>>
> >>>>>"qemu:dirty-bitmap:"
> >>>>>
> >>>>> This would solve the issue for users, and keep the API extensible.
> >>>>
> >>>> As I already said, we can not. If we really wont such thing, use another 
> >>>> name,
> >>>> likq qemu:default-dirty-bitmap..
> >>>>
> >>>> Or call bitmap export "default", to produce
> >>>> "qemu:dirty-bitmaps:default"
> >>>>
> >>>> But don't change default behavior of nbd-server-add
> >>>>
> >>>>>
> >>>>>> But if we have to have something there, using the bitmap name (which 
> >>>>>> may
> >>>>>> or may not be the same as used in the image file) makes a little more
> >>>>>> sense because it makes the interface extensible for the case that we
> >>>>

Re: [Qemu-devel] [libvirt] [PATCH for-4.0 v4 0/2] virtio: Provide version-specific variants of virtio PCI devices

2019-03-05 Thread Peter Krempa

On Tue, Mar 05, 2019 at 16:56:43 +0100, Andrea Bolognani wrote:
> On Tue, 2019-03-05 at 15:38 +0100, Gerd Hoffmann wrote:

[...]

> So I agree neither scenario is exactly perfect, but I still think
> adding non-transitional alias devices would overall be more
> user-friendly.

I don't think it makes sense to add it at the qemu level. From libvirt's
point of view users should be shielded from any qemu impl detail or
inconsistency as libvirt is the 'user friendly'[1] layer. In qemu the
devices would be the same and thus does not make sense to do that
because it would be more confusing.

You can argue that we should add the alias at the libvirt level though.

[1] Yes. I'm aware that statement is quite ironical.

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 0/8] file-posix: Make auto-read-only dynamic

2019-03-11 Thread Peter Krempa

On Fri, Mar 08, 2019 at 16:37:49 +0100, Kevin Wolf wrote:
> We introduced the auto-read-only option to fix the problem that block
> jobs that reopen a backing file read-write don't work any more when all
> nodes are created individually with -blockdev. The reason is that
> bs->file of these backing files doesn't inherit the read-only option
> from the format layer node any more if it's created separately.
> 
> The way auto-read-only was designed to fix this is that it just always
> opens the file node read-write if it can, so reopening the format layer
> node is enough to make the backing file writable when necessary.
> 
> This works in principle, but not when libvirt uses sVirt: Then QEMU
> doesn't even have the permissions to open the image file read-write
> until libvirt performs an operation where write access is needed.
> 
> This series changes auto-read-only so that it works dynamically and
> automatically reopens the file read-only or read-write depending on the
> permissions that users attached to the node requested.

While testing this series by attempting a non-active layer block-commit
I've got an assertion failure:

qemu-system-x86_64: block/file-posix.c:2715: raw_check_perm: Assertion 
`!s->perm_change_fd' failed.

Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
0x7f69fe08753f in raise () from target:/lib64/libc.so.6

Thread 1 (Thread 0x7f69f3b08680 (LWP 2694)):
#0  0x7f69fe08753f in raise () from target:/lib64/libc.so.6
#1  0x7f69fe071895 in abort () from target:/lib64/libc.so.6
#2  0x7f69fe071769 in __assert_fail_base.cold.0 ()
   from target:/lib64/libc.so.6
#3  0x7f69fe07f9f6 in __assert_fail () from target:/lib64/libc.so.6
#4  0x5563cfeea7e9 in raw_check_perm (bs=0x5563d1f90100, perm=11, 
shared=21, errp=0x7fffc280c0f0) at block/file-posix.c:2715
#5  0x5563cfe9aab4 in bdrv_check_perm (bs=bs@entry=0x5563d1f90100, 
q=0x5563d2bd0170, q@entry=0xbed88f135e2a5600, cumulative_perms=11, 
cumulative_shared_perms=21, 
ignore_children=ignore_children@entry=0x7f69a47f9710, 
errp=errp@entry=0x7fffc280c0f0) at block.c:1790
#6  0x5563cfe9a875 in bdrv_check_update_perm (bs=0x5563d1f90100, 
q=0xbed88f135e2a5600, q@entry=0x5563d2bd0170, 
new_used_perm=new_used_perm@entry=11, new_shared_perm=new_shared_perm@entry=21, 
ignore_children=ignore_children@entry=0x7f69a47f9710, 
errp=errp@entry=0x7fffc280c0f0) at block.c:1976
#7  0x5563cfe9a9ba in bdrv_child_check_perm (c=c@entry=0x5563d1eb9dd0, 
q=q@entry=0x5563d2bd0170, perm=11, shared=21, ignore_children=0x7f69a47f9710, 
ignore_children@entry=0x5563d1a57810, errp=errp@entry=0x7fffc280c0f0) at 
block.c:1989
#8  0x5563cfe9ab44 in bdrv_check_perm (bs=0x5563d1f943a0, bs@entry=0xb, 
q=0x5563d2bd0170, q@entry=0xbed88f135e2a5600, cumulative_perms=0, 
cumulative_shared_perms=31, 
ignore_children=ignore_children@entry=0x5563d1a57810, errp=0x7fffc280c0f0, 
errp@entry=0x15) at block.c:1806
#9  0x5563cfe9a875 in bdrv_check_update_perm (bs=0xb, q=0xbed88f135e2a5600, 
q@entry=0x5563d2bd0170, new_used_perm=new_used_perm@entry=0, 
new_shared_perm=new_shared_perm@entry=31, 
ignore_children=ignore_children@entry=0x5563d1a57810, errp=0x15, 
errp@entry=0x7fffc280c0f0) at block.c:1976
#10 0x5563cfe9a9ba in bdrv_child_check_perm (c=c@entry=0x5563d1eb9910, 
q=q@entry=0x5563d2bd0170, perm=0, shared=31, ignore_children=0x5563d1a57810, 
ignore_children@entry=0x5563d32dfd40, errp=errp@entry=0x7fffc280c0f0) at 
block.c:1989
#11 0x5563cfe9ab44 in bdrv_check_perm (bs=0x5563d2011320, bs@entry=0x0, 
q=0x5563d2bd0170, q@entry=0xbed88f135e2a5600, cumulative_perms=0, 
cumulative_shared_perms=6, 
ignore_children=ignore_children@entry=0x5563d32dfd40, errp=0x7fffc280c0f0, 
errp@entry=0x1f) at block.c:1806
#12 0x5563cfe9a875 in bdrv_check_update_perm (bs=0x0, q=0xbed88f135e2a5600, 
q@entry=0x5563d2bd0170, new_used_perm=new_used_perm@entry=0, 
new_shared_perm=new_shared_perm@entry=31, 
ignore_children=ignore_children@entry=0x5563d32dfd40, errp=0x1f, 
errp@entry=0x7fffc280c0f0) at block.c:1976
#13 0x5563cfe9a9ba in bdrv_child_check_perm (c=c@entry=0x5563d1d96690, 
q=q@entry=0x5563d2bd0170, perm=0, shared=31, ignore_children=0x5563d32dfd40, 
ignore_children@entry=0x5563d32dfe00, errp=errp@entry=0x7fffc280c0f0) at 
block.c:1989
#14 0x5563cfe9ab44 in bdrv_check_perm (bs=0x5563d2092340, bs@entry=0x0, 
q=0x5563d2bd0170, q@entry=0xbed88f135e2a5600, cumulative_perms=0, 
cumulative_shared_perms=6, 
ignore_children=ignore_children@entry=0x5563d32dfe00, errp=0x7fffc280c0f0, 
errp@entry=0x1f) at block.c:1806
#15 0x5563cfe9a875 in bdrv_check_update_perm (bs=0x0, q=0xbed88f135e2a5600, 
q@entry=0x5563d2bd0170, new_used_perm=new_used_perm@entry=0, 
new_shared_perm=new_shared_perm@entry=31, 
ignore_children=ignore_children@entry=0x5563d32dfe00, errp=0x1f, 
errp@entry=0x7fffc280c0f0) at block.c:1976
#16 0x5563cfe9a9ba in bdrv_child_check_perm (c=c@entry=0x5563d37424d0, 
q=q@entry=0x5563d2bd0170, perm=0, shared=31, ignore_children=0x55

Re: [Qemu-devel] [PATCH v2 10/10] file-posix: Make auto-read-only dynamic

2019-03-11 Thread Peter Krempa

On Mon, Mar 11, 2019 at 12:26:08 -0500, Eric Blake wrote:
> On 3/11/19 11:50 AM, Kevin Wolf wrote:
> > Until now, with auto-read-only=on we tried to open the file read-write
> > first and if that failed, read-only was tried. This is actually not good
> > enough for libvirt, which gives QEMU SELinux permissions for read-write
> > only as soon as it actually intends to write to the image. So we need to
> > be able to switch between read-only and read-write at runtime.
> > 
> > This patch makes auto-read-only dynamic, i.e. the file is opened
> > read-only as long as no user of the node has requested write
> > permissions, but it is automatically reopened read-write as soon as the
> > first writer is attached. Conversely, if the last writer goes away, the
> > file is reopened read-only again.
> > 
> > bs->read_only is no longer set for auto-read-only=on files even if the
> > file descriptor is opened read-only because it will be transparently
> > upgraded as soon as a writer is attached. This changes the output of
> > qemu-iotests 232.
> 
> auto-read-only was introduced in 3.1, at which point we intentionally
> had sufficiently loose wording to permit (but not require) dynamic state
> checking; so you are not breaking the interface.  On the other hand, is
> libvirt going to have problems introspecting whether it can use
> auto-read-only and get the dynamic behavior it needs?  Or is there
> enough else in the way of libvirt's switch to -blockdev that it won't
> attempt anything that needs auto-read-only without other 4.0 interfaces
> anyway, at which point detecting the presence of the field (but not
> whether the field has a guarantee of dynamic behavior) on 3.1 doesn't
> matter?

I think we can use Stefan's capability detection mechanism he introduced
for the migration with cache enabled for local files to add a flag for
this as well.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 00/10] file-posix: Make auto-read-only dynamic

2019-03-11 Thread Peter Krempa

On Mon, Mar 11, 2019 at 17:50:07 +0100, Kevin Wolf wrote:
> We introduced the auto-read-only option to fix the problem that block
> jobs that reopen a backing file read-write don't work any more when all
> nodes are created individually with -blockdev. The reason is that
> bs->file of these backing files doesn't inherit the read-only option
> from the format layer node any more if it's created separately.
> 
> The way auto-read-only was designed to fix this is that it just always
> opens the file node read-write if it can, so reopening the format layer
> node is enough to make the backing file writable when necessary.
> 
> This works in principle, but not when libvirt uses sVirt: Then QEMU
> doesn't even have the permissions to open the image file read-write
> until libvirt performs an operation where write access is needed.
> 
> This series changes auto-read-only so that it works dynamically and
> automatically reopens the file read-only or read-write depending on the
> permissions that users attached to the node requested.
> 
> See also: https://bugzilla.redhat.com/show_bug.cgi?id=1685989
> 
> v2:
> - Added test for and fixed snapshot=on,read-only=on regression [Berto]
> - Added test for and fixed commit regression [Peter]

I've quickly tested it with non-active commit and it works in this
scenario as expected.

I'll give it a bit more thorough test in the morning.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 10/10] file-posix: Make auto-read-only dynamic

2019-03-11 Thread Peter Krempa

On Mon, Mar 11, 2019 at 15:10:36 -0500, Eric Blake wrote:
> On 3/11/19 2:59 PM, Peter Krempa wrote:
> 
> >> auto-read-only was introduced in 3.1, at which point we intentionally
> >> had sufficiently loose wording to permit (but not require) dynamic state
> >> checking; so you are not breaking the interface.  On the other hand, is
> >> libvirt going to have problems introspecting whether it can use
> >> auto-read-only and get the dynamic behavior it needs?  Or is there
> >> enough else in the way of libvirt's switch to -blockdev that it won't
> >> attempt anything that needs auto-read-only without other 4.0 interfaces
> >> anyway, at which point detecting the presence of the field (but not
> >> whether the field has a guarantee of dynamic behavior) on 3.1 doesn't
> >> matter?
> > 
> > I think we can use Stefan's capability detection mechanism he introduced
> > for the migration with cache enabled for local files to add a flag for
> > this as well.
> 
> Except I thought we decided that the most recent version of his QMP
> changes was now fully-introspectible, thanks to using conditional
> compilation.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg02510.html

Oh, bummer, I missed that it was no longer needed. I still think it's
worth adding that for future use once it will be necessary to detect
that certain things were patched and require libvirt to change behaviour
if that's the case.

> Well, that may prove to be a short-lived hiatus, if libvirt would
> happily attempt to use qemu 3.1 and fail without some other
> introspectible hook to know whether auto-read-only has required semantics.
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
> 





signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] Discrad blocks during block-stream operation

2018-11-22 Thread Peter Krempa

On Thu, Nov 22, 2018 at 21:48:02 +0300, Andrey Shinkevich wrote:
> Hello everyone!
> 
> The given feature discards blocks with copy-on-read operation while the
> streaming process runs. Adding the 'discard' argument to the QMP block-stream
> command allows dropping a block in the backing chain after it has been copied
> to the active layer. That will elude the block duplication in the intermediate
> backing file. It saves the disk space while external snapshots are being
> merged.

So you specifically want to merge the snapshot by pulling rather than
commiting? Do you have any specific reasons for that? I'm curious
because I plan to finally finish external snapshots in libvirt.

Allowing to pull into intermediate layers will be (or is?) very welcome
by libvirt since I plan to do external snapshot deletion/merging and
that will be greatly simplified by pulling.

On the other hand libvirt will not be able to always use 'discard' as
libvirt's API allows creating alternate histories for a VM and in such
case when merging a snapshot at a branching point we'll need to pull it
into multiple images. The 'discard' optimization can then be used only
with the last branch.

Libvirt's reasons for using 'block-stream' are mostly as it corresponds
to the operations necessary for not messing up the relationship between
the snapshot and which files on disk belong to it.

signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] [PATCH 3/5] iotests: allow resume_drive by node name

2018-11-22 Thread Peter Krempa

On Thu, Nov 22, 2018 at 21:48:05 +0300, Andrey Shinkevich wrote:
> After node graph changes, we may not be able to resume_drive by device
> name (backing files are not recursively searched). So, lets allow to
> resume by node-name. Set constant name for breakpoints, to avoid
> introducing extra parameters.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

This patch has a mismatch between the author name and the person signing
it off.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] qmp: add query-qemu-capabilities

2019-02-25 Thread Peter Krempa

On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > QMP clients can usually detect the presence of features via schema
> > introspection.  There are rare features that do not involve schema
> > changes and are therefore impossible to detect with schema
> > introspection.
> >
> > This patch adds the query-qemu-capabilities command.  It returns a list
> > of capabilities that this QEMU supports.
> 
> The name "capabilities" could be confusing, because we already have QMP
> capabilities, complete with command qmp_capabilities.  Would "features"
> work?
> 
> > The decision to make this a command rather than something statically
> > defined in the schema is intentional.  It allows QEMU to decide which
> > capabilities are available at runtime, if necessary.
> >
> > This new interface is necessary so that QMP clients can discover that
> > migrating disk image files is safe with cache.direct=off on Linux.
> > There is no other way to detect whether or not QEMU supports this.
> 
> I think what's unsaid here is that we don't want to make a completely
> arbitrary schema change just to carry this bit of information.  We
> could, but we don't want to.  Correct?

One example of such 'unrelated' change was a recent query- command which
is used by libvirt just to detect presence of the feature but the
queried result is never used and not very useful.

> > Cc: Markus Armbruster 
> > Cc: Peter Krempa 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  qapi/misc.json | 42 ++
> >  qmp.c  | 18 ++
> >  2 files changed, 60 insertions(+)

[...]

> > +QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
> > +{
> > +QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
> > +QemuCapabilityList **prev = &caps->capabilities;
> > +QemuCapability cap_val;
> > +
> > +/* Add all QemuCapability enum values defined in the schema */
> > +for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
> > +QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
> > +
> > +cap->value = cap_val;
> > +*prev = cap;
> > +prev = &cap->next;
> > +}
> > +
> > +return caps;
> > +}
> 
> All capabilities known to this build of QEMU are always present.  Okay;
> no need to complicate things until we need to.  I just didn't expect it
> to be this simple after the commit message's "It allows QEMU to decide
> which capabilities are available at runtime, if necessary."

I'm slightly worried of misuse of the possibility to change the behavior
on runtime. In libvirt we cache the capabilities to prevent a "chicken
and egg" problem where we need to know what qemu is able to do when
generating the command line which is obviously prior to starting qemu.
This means that we will want to cache even information determined by
interpreting results of this API.

If any further addition is not as simple as this one it may be
challenging to be able to interpret the result correctly and be able to
cache it.

Libvirt's use of capabilities is split to pre-startup steps and runtime
usage. For the pre-startup case [A]  we obviously want to use the cached
capabilities which are obtained by running same qemu with a different
configuration that will be used later. After qemu is started we use
QMP to interact [B] with it also depending to the capabilities
determined from the cache. Scenario [B] also allows us to re-probe some
things but they need to be strictly usable only with QMP afterwards.

The proposed API with the 'runtime' behaviour allows for these 3
scenarios for a capability:
1) Static capability (as this patch shows)
This is easy to cache and also supports both [A] and [B]

2) Capability depending on configuration
[A] is out of the window but if QMP only use is necessary we can
adapt.

3) Capability depending on host state.
Same commandline might not result in same behaviour. Obviously can't
be cached at all, but libvirt would not do it anyways. [B] is
possible, but it's unpleasant.

I propose that the docs for the function filling the result (and perhaps
also the documentation for the QMP interface) clarify and/or guide devs
to avoid situations 2) and 3) if possible and motivate them to document
the limitations on when the capability is detactable.

Additionally a new field could be added that e.g. pledges that the given
capability is of type 1) as described above and thus can be easily
cached. That way we can make sure programatically that we pre-cache only
the 'good' capabilities.

Other than the above, this is a welcome improvement as I've personally
ran into scenarios where a feature in qemu was fixed but it was
impossible to detect whether given qemu version contains the fix as it
did not have any influence on the QMP schema.

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] qmp: add query-qemu-capabilities

2019-02-25 Thread Peter Krempa

On Mon, Feb 25, 2019 at 17:40:01 +, Stefan Hajnoczi wrote:
> On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> > On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:

[...]

> > I'm slightly worried of misuse of the possibility to change the behavior
> > on runtime. In libvirt we cache the capabilities to prevent a "chicken
> > and egg" problem where we need to know what qemu is able to do when
> > generating the command line which is obviously prior to starting qemu.
> > This means that we will want to cache even information determined by
> > interpreting results of this API.
> > 
> > If any further addition is not as simple as this one it may be
> > challenging to be able to interpret the result correctly and be able to
> > cache it.
> > 
> > Libvirt's use of capabilities is split to pre-startup steps and runtime
> > usage. For the pre-startup case [A]  we obviously want to use the cached
> > capabilities which are obtained by running same qemu with a different
> > configuration that will be used later. After qemu is started we use
> > QMP to interact [B] with it also depending to the capabilities
> > determined from the cache. Scenario [B] also allows us to re-probe some
> > things but they need to be strictly usable only with QMP afterwards.
> > 
> > The proposed API with the 'runtime' behaviour allows for these 3
> > scenarios for a capability:
> > 1) Static capability (as this patch shows)
> > This is easy to cache and also supports both [A] and [B]
> > 
> > 2) Capability depending on configuration
> > [A] is out of the window but if QMP only use is necessary we can
> > adapt.
> 
> Does "configuration" mean "QEMU command-line"?  The result of the query
> command should not depend on command-line arguments.

Yes exactly. There probably is possibility that something would be
detectable only after a shared library load which in turn would depend
on the command line ...

> 
> > 3) Capability depending on host state.
> > Same commandline might not result in same behaviour. Obviously can't
> > be cached at all, but libvirt would not do it anyways. [B] is
> > possible, but it's unpleasant.
> 
> Say the kernel or a library dependency is updated, and this enables a
> feature that QEMU was capable of but couldn't advertise before.  I guess
> this might happen and this is why I noted that the features could be
> selected at runtime.

Yeah, such scenario is really breaking our caching even now. I don't
want to say it's bad. It may even be necessary in some scenarios. Both
this and the above scenario may be necessary eventually. Libvirt
certainly can make use of the detection for QMP use if there is such a
thing.

> > I propose that the docs for the function filling the result (and perhaps
> > also the documentation for the QMP interface) clarify and/or guide devs
> > to avoid situations 2) and 3) if possible and motivate them to document
> > the limitations on when the capability is detactable.
> > 
> > Additionally a new field could be added that e.g. pledges that the given
> > capability is of type 1) as described above and thus can be easily
> > cached. That way we can make sure programatically that we pre-cache only
> > the 'good' capabilities.
> > 
> > Other than the above, this is a welcome improvement as I've personally
> > ran into scenarios where a feature in qemu was fixed but it was
> > impossible to detect whether given qemu version contains the fix as it
> > did not have any influence on the QMP schema.
> 
> I'd like to make things as simpler as possible, but no simpler :).
> 
> The simplest option is that the advertised features are set in stone at
> build time (e.g. selected with #ifdef if necessary).  But then we have
> no way of selecting features at runtime (e.g. based on kernel features).
> 
> What do you think?

I really don't want to limit the possibilities for the API. My goal is
only that it's obvious both from the docs and preferably also from the
returned data (so that we can filter what to cache to prevent mistakes)
that a given capability bit is static or dynamic.

I think adding the field to the returned data which will be set
according to how the capability was detected should be simple enough,
but I will be okay with just documenting any caveats along with the
inidividual capabilities. In that case an comment should encourage those
coming after you to document it properly.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] [PATCH] qemu-nbd: Deprecate qemu-nbd --partition

2019-01-24 Thread Peter Krempa

On Wed, Jan 23, 2019 at 15:19:53 -0600, Eric Blake wrote:
> The existing qemu-nbd --partition code claims to handle logical
> partitions up to 8, since its introduction in 2008 (commit 7a5ca86).
> However, the implementation is bogus (actual MBR logical partitions
> form a sort of linked list, with one partition per extended table
> entry, rather than four logical partitions in a single extended
> table), making the code unlikely to work for anything beyond -P5 on
> actual guest images. What's more, the code does not support GPT
> partitions, which are becoming more popular, and maintaining device
> subsetting in both NBD and the raw device is unnecessary maintenance
> burden.  And nbdkit has just added code to properly handle an
> arbitrary number of MBR partitions, along with its existing code
> for handling GPT partitions.
> 
> Note that obtaining the offsets of a partition can be learned by
> using 'qemu-nbd -c /dev/nbd0 file.qcow2 && sfdisk --dump /dev/nbd0',
> but by the time you've done that, you might as well just mount
> /dev/nbd0p1 that the kernel creates for you.
> 
> Start the clock on the deprecation cycle, with an example of how
> to write device subsetting without using -P.
> 
> Signed-off-by: Eric Blake 
> ---
>  qemu-deprecated.texi | 27 +++
>  qemu-nbd.texi|  6 --
>  qemu-nbd.c   |  2 ++
>  3 files changed, 33 insertions(+), 2 deletions(-)

Libvirt does not use qemu-nbd in any way so it's okay from our POV.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/3] scsi-disk: Add device_id property

2019-01-28 Thread Peter Krempa

On Fri, Jan 25, 2019 at 18:46:52 +0100, Kevin Wolf wrote:
> The new device_id property specifies which value to use for the vendor
> specific designator in the Device Identification VPD page.
> 
> In particular, this is necessary for libvirt to maintain guest ABI
> compatibility when no serial number is given and a VM is switched from
> -drive (where the BlockBackend name is used) to -blockdev (where the
> vendor specific designator is left out by default).
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/scsi/scsi-disk.c | 24 
>  1 file changed, 16 insertions(+), 8 deletions(-)

[...]

> @@ -2904,7 +2910,9 @@ static const TypeInfo scsi_disk_base_info = {
>  DEFINE_PROP_STRING("ver", SCSIDiskState, version),   \
>  DEFINE_PROP_STRING("serial", SCSIDiskState, serial), \
>  DEFINE_PROP_STRING("vendor", SCSIDiskState, vendor), \
> -DEFINE_PROP_STRING("product", SCSIDiskState, product)
> +DEFINE_PROP_STRING("product", SCSIDiskState, product),   \
> +DEFINE_PROP_STRING("device_id", SCSIDiskState, device_id)

This adds the property only to 'scsi-disk', whereas libvirt will use
'scsi-cd' or 'scsi-hd' depending on the media type if the 'scsi-cd'
device is detected. The following logic decides:

https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_command.c;h=3e46f3ced3c1e6a4865e98e2d0cdce3214c4a5d2;hb=HEAD#l1978

This brings multiple questions:

1) Is this necssary also for scsi-cd/scsi-hd and if yes, the property
does not seem to be present for those

2) Is actually using 'scsi-cd'/'scsi-hd' the better option than
'scsi-disk'?

3) Since upstream libvirt supports qemu-1.5 and newer and 'scsi-cd' is
already supported there, can we assume that all newer versions support
it? (Basically the question is whether it can be compiled out by
upstream means).




signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/3] scsi-disk: Add device_id property

2019-01-28 Thread Peter Krempa

On Mon, Jan 28, 2019 at 09:50:41 +0100, Peter Krempa wrote:
> On Fri, Jan 25, 2019 at 18:46:52 +0100, Kevin Wolf wrote:
> > The new device_id property specifies which value to use for the vendor
> > specific designator in the Device Identification VPD page.
> > 
> > In particular, this is necessary for libvirt to maintain guest ABI
> > compatibility when no serial number is given and a VM is switched from
> > -drive (where the BlockBackend name is used) to -blockdev (where the
> > vendor specific designator is left out by default).
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  hw/scsi/scsi-disk.c | 24 
> >  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> [...]
> 
> > @@ -2904,7 +2910,9 @@ static const TypeInfo scsi_disk_base_info = {
> >  DEFINE_PROP_STRING("ver", SCSIDiskState, version),   \
> >  DEFINE_PROP_STRING("serial", SCSIDiskState, serial), \
> >  DEFINE_PROP_STRING("vendor", SCSIDiskState, vendor), \
> > -DEFINE_PROP_STRING("product", SCSIDiskState, product)
> > +DEFINE_PROP_STRING("product", SCSIDiskState, product),   \
> > +DEFINE_PROP_STRING("device_id", SCSIDiskState, device_id)
> 
> This adds the property only to 'scsi-disk', whereas libvirt will use
> 'scsi-cd' or 'scsi-hd' depending on the media type if the 'scsi-cd'
> device is detected. The following logic decides:
> 
> https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_command.c;h=3e46f3ced3c1e6a4865e98e2d0cdce3214c4a5d2;hb=HEAD#l1978
> 
> This brings multiple questions:

I've found:

commit b443ae67130d32ad06b06fc9aa6d04d05ccd93ce
Author: Markus Armbruster 
Date:   Mon May 16 15:04:53 2011 +0200

scsi: Split qdev "scsi-disk" into "scsi-hd" and "scsi-cd"

Which seems to put scsi-hd and scsi-cd as the thing to use.

Libvirt's code obviously did not adapt after that and we still keep to
probe scsi-disk instead of the split up devices, whereas we will never
use scsi-disk.

Also some of our tests seem to look as if scsi-disk was used which
confused me.

Since the new property is present also for scsi-hd and scsi-cd we really
need to adapt our code to that.

Btw, this also means that qemu can deprecate scsi-disk.

Sorry for the noise.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/3] scsi-disk: Add device_id property

2019-01-29 Thread Peter Krempa

On Mon, Jan 28, 2019 at 17:55:14 +0100, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > Am 28.01.2019 um 09:50 hat Peter Krempa geschrieben:
> [...]
> >> 2) Is actually using 'scsi-cd'/'scsi-hd' the better option than
> >> 'scsi-disk'?
> >
> > Yes, scsi-disk is a legacy device. Maybe we should formally deprecate
> > it.
> 
> There's an internal use in scsi_bus_legacy_add_drive(), which in turn
> powers two legacy features:
> 
> 1. -drive if=scsi
> 
>Creates scsi-disk frontends.
> 
>Only works with onboard HBAs since commit 14545097267, v2.12.0.
> 
> 2. -device usb-storage
> 
>Bad magic: usb-storage pretends to be a block device, but it's really
>a SCSI bus that can serve only a single device, which it creates
>automatically.
> 
> If we deprecate scsi-disk, we should deprecate these, too.  Can't say
> whether that's practical right now.

Unfortunately we did not bother replacing usb-storage yet.

scsi-disk was unused for some time (if scsi-hd was supported). I just
deleted any mentions of it from libvirt now.

> 
> >> 3) Since upstream libvirt supports qemu-1.5 and newer and 'scsi-cd' is
> >> already supported there, can we assume that all newer versions support
> >> it? (Basically the question is whether it can be compiled out by
> >> upstream means).
> >
> > I think so.
> 
> Compiling out scsi-hd or scsi-cd, but not scsi-disk would be silly.  All
> three devices are in scsi-disk.c.  You'd have to hack that up to be
> silly.

That would be a downstream modification of qemu thus libvirt will
not want to support that.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] [PATCH 2/3] scsi-disk: Add device_id property

2019-01-29 Thread Peter Krempa

On Tue, Jan 29, 2019 at 08:10:19 +0100, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > Am 28.01.2019 um 17:55 hat Markus Armbruster geschrieben:
> >> Kevin Wolf  writes:
> >> 
> >> > Am 28.01.2019 um 09:50 hat Peter Krempa geschrieben:
> >> [...]
> >> >> 2) Is actually using 'scsi-cd'/'scsi-hd' the better option than
> >> >> 'scsi-disk'?
> >> >
> >> > Yes, scsi-disk is a legacy device. Maybe we should formally deprecate
> >> > it.
> >> 
> >> There's an internal use in scsi_bus_legacy_add_drive(), which in turn
> >> powers two legacy features:
> >> 
> >> 1. -drive if=scsi
> >> 
> >>Creates scsi-disk frontends.
> >> 
> >>Only works with onboard HBAs since commit 14545097267, v2.12.0.
> >> 
> >> 2. -device usb-storage
> >> 
> >>Bad magic: usb-storage pretends to be a block device, but it's really
> >>a SCSI bus that can serve only a single device, which it creates
> >>automatically.
> >> 
> >> If we deprecate scsi-disk, we should deprecate these, too.  Can't say
> >> whether that's practical right now.
> >
> > Most likely not worth the effort anyway. I don't think it's blocking
> > anything.
> 
> We could also wean them off the legacy device models.

In libvirt we should get rid of usb-storage usage, but I remember there
were a few ABI-related problems so that we could not plainly switch to
uas.

> >> >> 3) Since upstream libvirt supports qemu-1.5 and newer and 'scsi-cd' is
> >> >> already supported there, can we assume that all newer versions support
> >> >> it? (Basically the question is whether it can be compiled out by
> >> >> upstream means).
> >> >
> >> > I think so.
> >> 
> >> Compiling out scsi-hd or scsi-cd, but not scsi-disk would be silly.  All
> >> three devices are in scsi-disk.c.  You'd have to hack that up to be
> >> silly.
> >
> > I understood this as a question about libvirt, i.e. whether libvirt can
> > drop/compile out their scsi-disk code and instead assume that scsi-hd/cd
> > are always present. Maybe I misunderstood, though?
> 
> If questions remain, I trust Peter will ask.

I in fact wanted to know whether it's possible to compile it out of qemu
somehow. Removing it from libvirt is then easy.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] [PATCH v3 16/17] qmp: Deprecate query-events in favor of query-qmp-schema

2019-02-07 Thread Peter Krempa

On Wed, Feb 06, 2019 at 12:42:05 -0600, Eric Blake wrote:
> Adding libvirt in cc
> 
> On 2/6/19 12:17 PM, Markus Armbruster wrote:
> > query-events doesn't reflect compile-time configuration.  Instead of
> > fixing that, deprecate the command in favor of query-qmp-schema.
> > 
> > Signed-off-by: Markus Armbruster 
> > ---
> >  monitor.c| 5 +
> >  qapi/misc.json   | 7 +--
> >  qemu-deprecated.texi | 5 +
> >  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> Libvirt uses query-events; we'll need to patch libvirt to start relying
> on query-qmp-schema before this deprecation cycle can complete (although
> I'm in favor of making the change).

Patches to use 'query-qmp-schema' output rather than 'query-events' in
libvirt:

https://www.redhat.com/archives/libvir-list/2019-February/msg00298.html

You can start the deprecation clock.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] [PATCH v3 16/17] qmp: Deprecate query-events in favor of query-qmp-schema

2019-02-08 Thread Peter Krempa

On Thu, Feb 07, 2019 at 14:46:41 +0100, Markus Armbruster wrote:
> Peter Krempa  writes:
> 
> > On Wed, Feb 06, 2019 at 12:42:05 -0600, Eric Blake wrote:
> >> Adding libvirt in cc
> >> 
> >> On 2/6/19 12:17 PM, Markus Armbruster wrote:
> >> > query-events doesn't reflect compile-time configuration.  Instead of
> >> > fixing that, deprecate the command in favor of query-qmp-schema.
> >> > 
> >> > Signed-off-by: Markus Armbruster 
> >> > ---
> >> >  monitor.c| 5 +
> >> >  qapi/misc.json   | 7 +--
> >> >  qemu-deprecated.texi | 5 +
> >> >  3 files changed, 15 insertions(+), 2 deletions(-)
> >> 
> >> Libvirt uses query-events; we'll need to patch libvirt to start relying
> >> on query-qmp-schema before this deprecation cycle can complete (although
> >> I'm in favor of making the change).
> >
> > Patches to use 'query-qmp-schema' output rather than 'query-events' in
> > libvirt:
> >
> > https://www.redhat.com/archives/libvir-list/2019-February/msg00298.html
> >
> > You can start the deprecation clock.
> 
> Awesome!

It's now pushed. Thankfully our existing infrastructure for querying
schema was able to handle it without any changes.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] Qemu migration with vhost-user-blk on top of local storage

2019-01-09 Thread Peter Krempa

On Wed, Jan 09, 2019 at 21:23:25 +0800, wuzhouhui wrote:
> > -Original Messages-
> > From: "Stefan Hajnoczi" 
> > Sent Time: 2019-01-09 20:42:58 (Wednesday)
> > To: wuzhouhui 
> > Cc: qemu-devel@nongnu.org, xieyon...@baidu.com, lili...@baidu.com, 
> > libvir-l...@redhat.com, s...@lists.01.org
> > Subject: Re: [Qemu-devel] Qemu migration with vhost-user-blk on top of 
> > local storage
> > 
> > On Wed, Jan 09, 2019 at 06:23:42PM +0800, wuzhouhui wrote:
> > > Hi everyone,
> > > 
> > > I'm working qemu with vhost target (e.g. spdk), and I attempt to migrate 
> > > VM with
> > > 2 local storages. One local storage is a regular file, e.g. 
> > > /tmp/c74.qcow2, and
> > > the other is a malloc bdev that spdk created. This malloc bdev will 
> > > exported to
> > > VM via vhost-user-blk. When I execute following command:
> > > 
> > >   virsh migrate --live --persistent --unsafe --undefinesource 
> > > --copy-storage-all \
> > >  --p2p --auto-converge --verbose --desturi qemu+tcp:///system vm0
> > > 
> > > The libvirt reports:
> > > 
> > >   qemu-2.12.1: error: internal error: unable to execute QEMU command \
> > > 'nbd-server-add': Cannot find device=drive-virtio-disk1 nor \
> > > node_name=drive-virtio-disk1
> > 
> > Please post your libvirt domain XML.
> 
> My libvirt is based on libvirt-1.1.1-29.el7, and add many patches to satisfy 
> our
> own needs, e.g. add support for vhost-user-blk. Post domain xml may not 
> useful.

So you added support for vhost-user-blk, but did not fix the
NBD migration code. You are on your own, sorry. We can't support
arbitrary downstream changes.

Please either upstream your patches or make sure to skip it for
vhost-user.


signature.asc
Description: PGP signature

[PATCH 2/2] block: stream: Allow users to request only format driver names in backing file format

2023-11-24 Thread Peter Krempa

Introduce a new flag 'backing_file_format_no_protocol' for the
block-commit QMP command which instructs the internals to use 'raw'
instead of the protocol driver in case when a image is used without a
dummy 'raw' wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
---
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 +-
 blockdev.c |  7 +++
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   | 11 ++-
 5 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index c729cbf1eb..28e708a981 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -509,7 +509,7 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 const char *base = qdict_get_try_str(qdict, "base");
 int64_t speed = qdict_get_try_int(qdict, "speed", 0);

-qmp_block_stream(device, device, base, NULL, NULL, NULL,
+qmp_block_stream(device, device, base, NULL, NULL, NULL, false, false,
  qdict_haskey(qdict, "speed"), speed,
  true, BLOCKDEV_ON_ERROR_REPORT, NULL,
  false, false, false, false, &error);
diff --git a/block/stream.c b/block/stream.c
index 01fe7c0f16..42befd6b1d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -39,6 +39,7 @@ typedef struct StreamBlockJob {
 BlockDriverState *target_bs;
 BlockdevOnError on_error;
 char *backing_file_str;
+bool backing_file_format_no_protocol;
 bool bs_read_only;
 } StreamBlockJob;

@@ -95,7 +96,12 @@ static int stream_prepare(Job *job)
 if (unfiltered_base) {
 base_id = s->backing_file_str ?: unfiltered_base->filename;
 if (unfiltered_base->drv) {
-base_fmt = unfiltered_base->drv->format_name;
+if (s->backing_file_format_no_protocol &&
+unfiltered_base->drv->protocol_name) {
+base_fmt = "raw";
+} else {
+base_fmt = unfiltered_base->drv->format_name;
+}
 }
 }

@@ -247,6 +253,7 @@ static const BlockJobDriver stream_job_driver = {

 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
+  bool backing_file_format_no_protocol,
   BlockDriverState *bottom,
   int creation_flags, int64_t speed,
   BlockdevOnError on_error,
@@ -398,6 +405,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 s->base_overlay = base_overlay;
 s->above_base = above_base;
 s->backing_file_str = g_strdup(backing_file_str);
+s->backing_file_format_no_protocol = backing_file_format_no_protocol;
 s->cor_filter_bs = cor_filter_bs;
 s->target_bs = bs;
 s->bs_read_only = bs_read_only;
diff --git a/blockdev.c b/blockdev.c
index 038031bb03..dc477c4f7e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2408,6 +2408,8 @@ void qmp_block_stream(const char *job_id, const char 
*device,
   const char *base,
   const char *base_node,
   const char *backing_file,
+  bool has_backing_file_format_no_protocol,
+  bool backing_file_format_no_protocol,
   const char *bottom,
   bool has_speed, int64_t speed,
   bool has_on_error, BlockdevOnError on_error,
@@ -2443,6 +2445,10 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 return;
 }

+if (!has_backing_file_format_no_protocol) {
+backing_file_format_no_protocol = false;
+}
+
 if (!has_on_error) {
 on_error = BLOCKDEV_ON_ERROR_REPORT;
 }
@@ -2531,6 +2537,7 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 }

 stream_start(job_id, bs, base_bs, backing_file,
+ backing_file_format_no_protocol,
  bottom_bs, job_flags, has_speed ? speed : 0, on_error,
  filter_node_name, &local_err);
 if (local_err) {
diff --git a/include/block/block_int-global-state.h 
b/include/block/block_int-global-state.h
index 4f253ff362..4301061048 100644
-

[PATCH 1/2] block: commit: Allow users to request only format driver names in backing file format

2023-11-24 Thread Peter Krempa

Introduce a new flag 'backing_file_format_no_protocol' for the
block-commit QMP command which instructs the internals to use 'raw'
instead of the protocol driver in case when a image is used without a
dummy 'raw' wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
---
 block.c| 37 +-
 block/commit.c |  6 -
 blockdev.c |  6 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   | 11 +++-
 tests/unit/test-bdrv-drain.c   |  3 ++-
 8 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index bfb0861ec6..986a529941 100644
--- a/block.c
+++ b/block.c
@@ -1309,11 +1309,14 @@ static void bdrv_backing_detach(BdrvChild *c)
 }

 static int bdrv_backing_update_filename(BdrvChild *c, BlockDriverState *base,
-const char *filename, Error **errp)
+const char *filename,
+bool backing_file_format_no_protocol,
+Error **errp)
 {
 BlockDriverState *parent = c->opaque;
 bool read_only = bdrv_is_read_only(parent);
 int ret;
+const char *format_name;
 GLOBAL_STATE_CODE();

 if (read_only) {
@@ -1323,9 +1326,23 @@ static int bdrv_backing_update_filename(BdrvChild *c, 
BlockDriverState *base,
 }
 }

-ret = bdrv_change_backing_file(parent, filename,
-   base->drv ? base->drv->format_name : "",
-   false);
+if (base->drv) {
+/*
+ * If the new base image doesn't have a format driver layer, which we
+ * detect by the fact that @base is a protocol driver, we record
+ * 'raw' as the format instead of putting the protocol name as the
+ * backing format
+ */
+if (backing_file_format_no_protocol && base->drv->protocol_name) {
+format_name = "raw";
+} else {
+format_name = base->drv->format_name;
+}
+} else {
+format_name = "";
+}
+
+ret = bdrv_change_backing_file(parent, filename, format_name, false);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not update backing file link");
 }
@@ -1479,10 +1496,14 @@ static void GRAPH_WRLOCK bdrv_child_cb_detach(BdrvChild 
*child)
 }

 static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
- const char *filename, Error **errp)
+ const char *filename,
+ bool backing_file_format_no_protocol,
+ Error **errp)
 {
 if (c->role & BDRV_CHILD_COW) {
-return bdrv_backing_update_filename(c, base, filename, errp);
+return bdrv_backing_update_filename(c, base, filename,
+backing_file_format_no_protocol,
+errp);
 }
 return 0;
 }
@@ -5961,7 +5982,8 @@ void bdrv_unfreeze_backing_chain(BlockDriverState *bs, 
BlockDriverState *base)
  *
  */
 int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
-   const char *backing_file_str)
+   const char *backing_file_str,
+   bool backing_file_format_no_protocol)
 {
 BlockDriverState *explicit_top = top;
 bool update_inherits_from;
@@ -6027,6 +6049,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,

 if (c->klass->update_filename) {
 ret = c->klass->update_filename(c, base, backing_file_str,
+backing_file_format_no_protocol,
 &local_err);
 if (ret < 0) {
 /*
diff --git a/block/commit.c b/block/commit.c
index 69cc75be0c..5a584b712e 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -42,6 +42,7 @@ typedef struct CommitBlockJob {
 bool base_read_only;
 bool chain_frozen;
 char *backing_file_str;
+boo

[PATCH 0/2] block: commit/stream: Allow users to request only format driver names in backing file format

2023-11-24 Thread Peter Krempa

Please see patches for rationale.

Libvirt patches using this new flag will be posted soon-ish (after
cleanup).

Peter Krempa (2):
  block: commit: Allow users to request only format driver names in
backing file format
  block: stream: Allow users to request only format driver names in
backing file format

 block.c| 37 +-
 block/commit.c |  6 -
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 ++-
 blockdev.c | 13 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  6 +
 qapi/block-core.json   | 22 +--
 tests/unit/test-bdrv-drain.c   |  3 ++-
 10 files changed, 91 insertions(+), 15 deletions(-)

-- 
2.43.0

Re: [PATCH 1/2] block: commit: Allow users to request only format driver names in backing file format

2023-11-28 Thread Peter Krempa

On Tue, Nov 28, 2023 at 20:10:10 +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 24.11.23 17:52, Peter Krempa wrote:
> > Introduce a new flag 'backing_file_format_no_protocol' for the
> > block-commit QMP command which instructs the internals to use 'raw'
> > instead of the protocol driver in case when a image is used without a
> > dummy 'raw' wrapper.
> > 
> > The flag is designed such that it can be always asserted by management
> > tools even when there isn't any update to backing files.
> > 
> > The flag will be used by libvirt so that the backing images still
> > reference the proper format even when libvirt will stop using the dummy
> > raw driver (raw driver with no other config). Libvirt needs this so that
> > the images stay compatible with older libvirt versions which didn't
> > expect that a protocol driver name can appear in the backing file format
> > field.
> > 
> > Signed-off-by: Peter Krempa 
> > ---

[...]

> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index ca390c5700..367e896905 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -1810,6 +1810,14 @@
> >   # Care should be taken when specifying the string, to specify a
> >   # valid filename or protocol.  (Since 2.1)
> >   #
> > +# @backing-file-format-no-protocol: If true always use a 'format' driver 
> > name
> > +# for the 'backing file format' field if updating the image header of 
> > the
> > +# overlay of 'top'. Otherwise the real name of the driver of the 
> > backing
> > +# image may be used which may be a protocol driver.
> > +#
> > +# Can be used also when no image header will be updated.
> > +# (default: false; since: 8.2)
> > +#
> 
> Hi Peter.

Hi,

Firstly to be honest, I consider the fact that qemu can put a protocol
driver into the header field named "backing file format" to be a bug.
After discussion with Kevin I understand the rationale of doing so, but
nevertheless, a backing protocol name is not a format and should have
had it's own header field.

> Hmm. Could this just be @backing-file-format ?

I don't really care too deeply about the name.

Calling it just @backing-file-format though would imply (as you write
below) that as argument the string to write into the metadata. More on
that below.

> As I understand, finally, that's just a string which we are going to put into 
> qcow2 metadata.

Yes.

> And from qcow2 point of view, it's rather strange to set 
> backing_file_format="raw" for backing image which is actually "qcow2".

Indeed, that would be wrong, but this is _NOT_ what this patch
actually does.

This patch ensures that only a *format* driver name is ever written to
the backing image. It overrides the fact that a *protocol* driver name
would be written into the field if the tail of the backing chain is a
raw image, which was instantiated in qemu without the dummy 'raw' driver
on top.

Since a raw driver with no configuration simply passes request to the
*protocol* driver below it it's not needed in most configs. In fact as
stefanha figured out a long time ago it's just simpy overhead.

We need a format name in the backing file format field as libvirt
assumed for a very long time that such a field would contain only format
drivers, and since I want to remove the overhead of the 'raw' driver I
need to ensure that images won't break for older libvirt.

>"raw" say nothing to the reader and may be even misleading. Same for qemu, 
>this seems just a strange thing.
> 
> Also, what I dislike, that new feature sounds like hardcoded "raw" is the 
> only format driver. If go this way, more honest name would be 
> @backing-file-raw.

Once again, that is not what's happening here. The field enables that
only a *format* driver is used. This is only a problem when the final
image is raw, but without the dummy 'raw' driver on top of it. Thus
that's the reason the workaround only ever writes 'raw' into it.
Otherwise the proper format name is in the 'driver' field already and is
not overwritten.

> And, if we allow user to give any backing-file name to be written into qcow2 
> metadata after the operation, why not just allow to set any backing-file-name 
> as well?

This IMO makes no sense to allow, but based on the reimagined design of
the 'backing file /format/' field it might appear that it does.

'block-commit' and 'block-stream, can't change the format of any of the
images, they simply move data, thus for any non-raw image in a chain it
must still

[PATCH v2 1/2] block: commit: Allow users to request only format driver names in backing file format

2023-11-30 Thread Peter Krempa

Introduce a new flag 'backing_file_format_no_protocol' for the
block-commit QMP command which instructs the internals to use 'raw'
instead of the protocol driver in case when a image is used without a
dummy 'raw' wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block.c| 37 +-
 block/commit.c |  6 -
 blockdev.c |  6 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   | 11 +++-
 tests/unit/test-bdrv-drain.c   |  3 ++-
 8 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index bfb0861ec6..986a529941 100644
--- a/block.c
+++ b/block.c
@@ -1309,11 +1309,14 @@ static void bdrv_backing_detach(BdrvChild *c)
 }

 static int bdrv_backing_update_filename(BdrvChild *c, BlockDriverState *base,
-const char *filename, Error **errp)
+const char *filename,
+bool backing_file_format_no_protocol,
+Error **errp)
 {
 BlockDriverState *parent = c->opaque;
 bool read_only = bdrv_is_read_only(parent);
 int ret;
+const char *format_name;
 GLOBAL_STATE_CODE();

 if (read_only) {
@@ -1323,9 +1326,23 @@ static int bdrv_backing_update_filename(BdrvChild *c, 
BlockDriverState *base,
 }
 }

-ret = bdrv_change_backing_file(parent, filename,
-   base->drv ? base->drv->format_name : "",
-   false);
+if (base->drv) {
+/*
+ * If the new base image doesn't have a format driver layer, which we
+ * detect by the fact that @base is a protocol driver, we record
+ * 'raw' as the format instead of putting the protocol name as the
+ * backing format
+ */
+if (backing_file_format_no_protocol && base->drv->protocol_name) {
+format_name = "raw";
+} else {
+format_name = base->drv->format_name;
+}
+} else {
+format_name = "";
+}
+
+ret = bdrv_change_backing_file(parent, filename, format_name, false);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not update backing file link");
 }
@@ -1479,10 +1496,14 @@ static void GRAPH_WRLOCK bdrv_child_cb_detach(BdrvChild 
*child)
 }

 static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
- const char *filename, Error **errp)
+ const char *filename,
+ bool backing_file_format_no_protocol,
+ Error **errp)
 {
 if (c->role & BDRV_CHILD_COW) {
-return bdrv_backing_update_filename(c, base, filename, errp);
+return bdrv_backing_update_filename(c, base, filename,
+backing_file_format_no_protocol,
+errp);
 }
 return 0;
 }
@@ -5961,7 +5982,8 @@ void bdrv_unfreeze_backing_chain(BlockDriverState *bs, 
BlockDriverState *base)
  *
  */
 int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
-   const char *backing_file_str)
+   const char *backing_file_str,
+   bool backing_file_format_no_protocol)
 {
 BlockDriverState *explicit_top = top;
 bool update_inherits_from;
@@ -6027,6 +6049,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,

 if (c->klass->update_filename) {
 ret = c->klass->update_filename(c, base, backing_file_str,
+backing_file_format_no_protocol,
 &local_err);
 if (ret < 0) {
 /*
diff --git a/block/commit.c b/block/commit.c
index 69cc75be0c..5a584b712e 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -42,6 +42,7 @@ typedef struct CommitBlockJob {
 bool base_read_only;
 bool chain_frozen

[PATCH v2 0/2] block: commit/stream: Allow users to request only format driver names in backing file format

2023-11-30 Thread Peter Krempa

Please see patches for rationale.

Libvirt patches using this new flag will be posted soon-ish (after
cleanup).

v2:
 - fixed mistaken argument order in 'hmp_block_stream'
 - changed version in docs to 9.0 as getting this into RC 3 probably
   isn't realistic

Peter Krempa (2):
  block: commit: Allow users to request only format driver names in
backing file format
  block: stream: Allow users to request only format driver names in
backing file format

 block.c| 37 +-
 block/commit.c |  6 -
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 ++-
 blockdev.c | 13 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  6 +
 qapi/block-core.json   | 22 +--
 tests/unit/test-bdrv-drain.c   |  3 ++-
 10 files changed, 91 insertions(+), 15 deletions(-)

-- 
2.43.0

[PATCH v2 2/2] block: stream: Allow users to request only format driver names in backing file format

2023-11-30 Thread Peter Krempa

Introduce a new flag 'backing_file_format_no_protocol' for the
block-commit QMP command which instructs the internals to use 'raw'
instead of the protocol driver in case when a image is used without a
dummy 'raw' wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 +-
 blockdev.c |  7 +++
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   | 11 ++-
 5 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index c729cbf1eb..9080e29d4d 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -509,7 +509,7 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 const char *base = qdict_get_try_str(qdict, "base");
 int64_t speed = qdict_get_try_int(qdict, "speed", 0);

-qmp_block_stream(device, device, base, NULL, NULL, NULL,
+qmp_block_stream(device, device, base, NULL, NULL, false, false, NULL,
  qdict_haskey(qdict, "speed"), speed,
  true, BLOCKDEV_ON_ERROR_REPORT, NULL,
  false, false, false, false, &error);
diff --git a/block/stream.c b/block/stream.c
index 01fe7c0f16..42befd6b1d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -39,6 +39,7 @@ typedef struct StreamBlockJob {
 BlockDriverState *target_bs;
 BlockdevOnError on_error;
 char *backing_file_str;
+bool backing_file_format_no_protocol;
 bool bs_read_only;
 } StreamBlockJob;

@@ -95,7 +96,12 @@ static int stream_prepare(Job *job)
 if (unfiltered_base) {
 base_id = s->backing_file_str ?: unfiltered_base->filename;
 if (unfiltered_base->drv) {
-base_fmt = unfiltered_base->drv->format_name;
+if (s->backing_file_format_no_protocol &&
+unfiltered_base->drv->protocol_name) {
+base_fmt = "raw";
+} else {
+base_fmt = unfiltered_base->drv->format_name;
+}
 }
 }

@@ -247,6 +253,7 @@ static const BlockJobDriver stream_job_driver = {

 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
+  bool backing_file_format_no_protocol,
   BlockDriverState *bottom,
   int creation_flags, int64_t speed,
   BlockdevOnError on_error,
@@ -398,6 +405,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 s->base_overlay = base_overlay;
 s->above_base = above_base;
 s->backing_file_str = g_strdup(backing_file_str);
+s->backing_file_format_no_protocol = backing_file_format_no_protocol;
 s->cor_filter_bs = cor_filter_bs;
 s->target_bs = bs;
 s->bs_read_only = bs_read_only;
diff --git a/blockdev.c b/blockdev.c
index 038031bb03..dc477c4f7e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2408,6 +2408,8 @@ void qmp_block_stream(const char *job_id, const char 
*device,
   const char *base,
   const char *base_node,
   const char *backing_file,
+  bool has_backing_file_format_no_protocol,
+  bool backing_file_format_no_protocol,
   const char *bottom,
   bool has_speed, int64_t speed,
   bool has_on_error, BlockdevOnError on_error,
@@ -2443,6 +2445,10 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 return;
 }

+if (!has_backing_file_format_no_protocol) {
+backing_file_format_no_protocol = false;
+}
+
 if (!has_on_error) {
 on_error = BLOCKDEV_ON_ERROR_REPORT;
 }
@@ -2531,6 +2537,7 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 }

 stream_start(job_id, bs, base_bs, backing_file,
+ backing_file_format_no_protocol,
  bottom_bs, job_flags, has_speed ? speed : 0, on_error,
  filter_node_name, &local_err);
 if (local_err) {
diff --git a/include/block/block_int-global-state.h 
b/include/block/block_int-global-s

Re: [PATCH v2 1/2] block: commit: Allow users to request only format driver names in backing file format

2023-11-30 Thread Peter Krempa

On Thu, Nov 30, 2023 at 13:24:18 -0600, Eric Blake wrote:
> On Thu, Nov 30, 2023 at 05:06:03PM +0100, Peter Krempa wrote:
> > Introduce a new flag 'backing_file_format_no_protocol' for the
> > block-commit QMP command which instructs the internals to use 'raw'
> > instead of the protocol driver in case when a image is used without a
> > dummy 'raw' wrapper.
> > 
> > The flag is designed such that it can be always asserted by management
> > tools even when there isn't any update to backing files.
> > 
> > The flag will be used by libvirt so that the backing images still
> > reference the proper format even when libvirt will stop using the dummy
> > raw driver (raw driver with no other config). Libvirt needs this so that
> > the images stay compatible with older libvirt versions which didn't
> > expect that a protocol driver name can appear in the backing file format
> > field.
> > 
> > Signed-off-by: Peter Krempa 
> > Reviewed-by: Vladimir Sementsov-Ogievskiy 
> > ---
> 
> > +++ b/qapi/block-core.json
> > @@ -1810,6 +1810,14 @@
> >  # Care should be taken when specifying the string, to specify a
> >  # valid filename or protocol.  (Since 2.1)
> >  #
> > +# @backing-file-format-no-protocol: If true always use a 'format' driver 
> > name
> > +# for the 'backing file format' field if updating the image header of 
> > the
> > +# overlay of 'top'. Otherwise the real name of the driver of the 
> > backing
> > +# image may be used which may be a protocol driver.
> > +#
> > +# Can be used also when no image header will be updated.
> > +# (default: false; since: 9.0)


As I've previously stated, I don't really care about a name as long as I
don't have to keep re-sending,

> This is a long name.  What about:

But is the long name really a problem?

> @backing-mask-protocol: If true, replace any protocol mentioned in the
> 'backing file format' with 'raw', rather than storing the protocol
> name as the backing format.  Can be used even when no image header
> will be updated (default false; since 9.0).

Sounds okay to me. In the end, nobody will really see this as libvirt
will be using it internally

[PATCH v3 0/2] block: commit/stream: Allow users to request only format driver names in backing file format

2023-12-05 Thread Peter Krempa

Please see patches for rationale.

Libvirt patches using this new flag will be posted soon-ish (after
cleanup).

v3:
 - changed name of flag to 'backing-mask-protocol' (Eric)
 - decided to keep Vladimir's R-b as he requested shorter name too

v2:
 - fixed mistaken argument order in 'hmp_block_stream'
 - changed version in docs to 9.0 as getting this into RC 3 probably
   isn't realistic

Peter Krempa (2):
  block: commit: Allow users to request only format driver names in
backing file format
  block: stream: Allow users to request only format driver names in
backing file format

 block.c| 37 +-
 block/commit.c |  6 -
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 ++-
 blockdev.c | 13 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  6 +
 qapi/block-core.json   | 17 ++--
 tests/unit/test-bdrv-drain.c   |  3 ++-
 10 files changed, 86 insertions(+), 15 deletions(-)

-- 
2.43.0

[PATCH v3 1/2] block: commit: Allow users to request only format driver names in backing file format

2023-12-05 Thread Peter Krempa

Introduce a new flag 'backing-mask-protocol' for the block-commit QMP
command which instructs the internals to use 'raw' instead of the
protocol driver in case when a image is used without a dummy 'raw'
wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block.c| 37 +-
 block/commit.c |  6 -
 blockdev.c |  6 +
 include/block/block-global-state.h |  3 ++-
 include/block/block_int-common.h   |  4 ++-
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   |  8 +-
 tests/unit/test-bdrv-drain.c   |  3 ++-
 8 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index bfb0861ec6..771a00a14d 100644
--- a/block.c
+++ b/block.c
@@ -1309,11 +1309,14 @@ static void bdrv_backing_detach(BdrvChild *c)
 }

 static int bdrv_backing_update_filename(BdrvChild *c, BlockDriverState *base,
-const char *filename, Error **errp)
+const char *filename,
+bool backing_mask_protocol,
+Error **errp)
 {
 BlockDriverState *parent = c->opaque;
 bool read_only = bdrv_is_read_only(parent);
 int ret;
+const char *format_name;
 GLOBAL_STATE_CODE();

 if (read_only) {
@@ -1323,9 +1326,23 @@ static int bdrv_backing_update_filename(BdrvChild *c, 
BlockDriverState *base,
 }
 }

-ret = bdrv_change_backing_file(parent, filename,
-   base->drv ? base->drv->format_name : "",
-   false);
+if (base->drv) {
+/*
+ * If the new base image doesn't have a format driver layer, which we
+ * detect by the fact that @base is a protocol driver, we record
+ * 'raw' as the format instead of putting the protocol name as the
+ * backing format
+ */
+if (backing_mask_protocol && base->drv->protocol_name) {
+format_name = "raw";
+} else {
+format_name = base->drv->format_name;
+}
+} else {
+format_name = "";
+}
+
+ret = bdrv_change_backing_file(parent, filename, format_name, false);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not update backing file link");
 }
@@ -1479,10 +1496,14 @@ static void GRAPH_WRLOCK bdrv_child_cb_detach(BdrvChild 
*child)
 }

 static int bdrv_child_cb_update_filename(BdrvChild *c, BlockDriverState *base,
- const char *filename, Error **errp)
+ const char *filename,
+ bool backing_mask_protocol,
+ Error **errp)
 {
 if (c->role & BDRV_CHILD_COW) {
-return bdrv_backing_update_filename(c, base, filename, errp);
+return bdrv_backing_update_filename(c, base, filename,
+backing_mask_protocol,
+errp);
 }
 return 0;
 }
@@ -5961,7 +5982,8 @@ void bdrv_unfreeze_backing_chain(BlockDriverState *bs, 
BlockDriverState *base)
  *
  */
 int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
-   const char *backing_file_str)
+   const char *backing_file_str,
+   bool backing_mask_protocol)
 {
 BlockDriverState *explicit_top = top;
 bool update_inherits_from;
@@ -6027,6 +6049,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,

 if (c->klass->update_filename) {
 ret = c->klass->update_filename(c, base, backing_file_str,
+backing_mask_protocol,
 &local_err);
 if (ret < 0) {
 /*
diff --git a/block/commit.c b/block/commit.c
index 69cc75be0c..4c351644c1 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -42,6 +42,7 @@ typedef struct CommitBlockJob {
 bool base_read_only;
 bool chain_frozen;
 char *backing_file_str;
+bool backing_mask_protocol;
 }

[PATCH v3 2/2] block: stream: Allow users to request only format driver names in backing file format

2023-12-05 Thread Peter Krempa

Introduce a new flag 'backing-mask-protocol' for the block-stream QMP
command which instructs the internals to use 'raw' instead of the
protocol driver in case when a image is used without a dummy 'raw'
wrapper.

The flag is designed such that it can be always asserted by management
tools even when there isn't any update to backing files.

The flag will be used by libvirt so that the backing images still
reference the proper format even when libvirt will stop using the dummy
raw driver (raw driver with no other config). Libvirt needs this so that
the images stay compatible with older libvirt versions which didn't
expect that a protocol driver name can appear in the backing file format
field.

Signed-off-by: Peter Krempa 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/monitor/block-hmp-cmds.c |  2 +-
 block/stream.c | 10 +-
 blockdev.c |  7 +++
 include/block/block_int-global-state.h |  3 +++
 qapi/block-core.json   |  9 -
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index c729cbf1eb..9080e29d4d 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -509,7 +509,7 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 const char *base = qdict_get_try_str(qdict, "base");
 int64_t speed = qdict_get_try_int(qdict, "speed", 0);

-qmp_block_stream(device, device, base, NULL, NULL, NULL,
+qmp_block_stream(device, device, base, NULL, NULL, false, false, NULL,
  qdict_haskey(qdict, "speed"), speed,
  true, BLOCKDEV_ON_ERROR_REPORT, NULL,
  false, false, false, false, &error);
diff --git a/block/stream.c b/block/stream.c
index 01fe7c0f16..da62410dde 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -39,6 +39,7 @@ typedef struct StreamBlockJob {
 BlockDriverState *target_bs;
 BlockdevOnError on_error;
 char *backing_file_str;
+bool backing_mask_protocol;
 bool bs_read_only;
 } StreamBlockJob;

@@ -95,7 +96,12 @@ static int stream_prepare(Job *job)
 if (unfiltered_base) {
 base_id = s->backing_file_str ?: unfiltered_base->filename;
 if (unfiltered_base->drv) {
-base_fmt = unfiltered_base->drv->format_name;
+if (s->backing_mask_protocol &&
+unfiltered_base->drv->protocol_name) {
+base_fmt = "raw";
+} else {
+base_fmt = unfiltered_base->drv->format_name;
+}
 }
 }

@@ -247,6 +253,7 @@ static const BlockJobDriver stream_job_driver = {

 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
+  bool backing_mask_protocol,
   BlockDriverState *bottom,
   int creation_flags, int64_t speed,
   BlockdevOnError on_error,
@@ -398,6 +405,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 s->base_overlay = base_overlay;
 s->above_base = above_base;
 s->backing_file_str = g_strdup(backing_file_str);
+s->backing_mask_protocol = backing_mask_protocol;
 s->cor_filter_bs = cor_filter_bs;
 s->target_bs = bs;
 s->bs_read_only = bs_read_only;
diff --git a/blockdev.c b/blockdev.c
index 9ced07a8c4..a5ca1cf3ad 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2408,6 +2408,8 @@ void qmp_block_stream(const char *job_id, const char 
*device,
   const char *base,
   const char *base_node,
   const char *backing_file,
+  bool has_backing_mask_protocol,
+  bool backing_mask_protocol,
   const char *bottom,
   bool has_speed, int64_t speed,
   bool has_on_error, BlockdevOnError on_error,
@@ -2443,6 +2445,10 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 return;
 }

+if (!has_backing_mask_protocol) {
+backing_mask_protocol = false;
+}
+
 if (!has_on_error) {
 on_error = BLOCKDEV_ON_ERROR_REPORT;
 }
@@ -2531,6 +2537,7 @@ void qmp_block_stream(const char *job_id, const char 
*device,
 }

 stream_start(job_id, bs, base_bs, backing_file,
+ backing_mask_protocol,
  bottom_bs, job_flags, has_speed ? speed : 0, on_error,
  filter_node_name, &local_err);
 if (local_err) {
diff --git a/include/block/block_int-global-state.h 
b/include/block/block_int-global-state.h
index 2162269df6..d2201e27f4 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_i

Re: [PATCH v3 0/2] block: commit/stream: Allow users to request only format driver names in backing file format

2023-12-05 Thread Peter Krempa

On Tue, Dec 05, 2023 at 18:14:40 +0100, Peter Krempa wrote:
> Please see patches for rationale.
> 
> Libvirt patches using this new flag will be posted soon-ish (after
> cleanup).

https://lists.libvirt.org/archives/list/de...@lists.libvirt.org/message/GCCZP5ANYTPVXCIEYGSM5NWYCGDL23V6/

Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs

2017-04-03 Thread Peter Krempa

On Mon, Apr 03, 2017 at 10:15:42 +0200, Kevin Wolf wrote:
> Am 31.03.2017 um 19:43 hat Max Reitz geschrieben:
> > On 31.03.2017 18:03, Ciprian Barbu wrote:

[...]

> > So this doesn't work:
> > 
> > $ x86_64-softmmu/qemu-system-x86_64 \
> > -blockdev node-name=image,driver=qcow2,\
> > file.driver=file,file.filename=foo.qcow2 \
> > -device virtio-blk,drive=image \
> > -qmp stdio
> > {"QMP": {"version": {"qemu": {"micro": 92, "minor": 8, "major": 2},
> > "package": " (v2.8.0-2038-g6604c893d0)"}, "capabilities": []}}
> > {'execute':'qmp_capabilities'}
> > {"return": {}}
> > {'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'
> > {"return": {}}
> > {'execute':'nbd-server-add','arguments':{'device':'image','writable':true}}
> > {"error": {"class": "GenericError", "desc": "Conflicts with use by
> > /machine/peripheral-anon/device[0]/virtio-backend as 'root', which does
> > not allow 'write' on image"}
> > 
> > But this works:
> > 
> > $ x86_64-softmmu/qemu-system-x86_64 \
> > -blockdev node-name=image,driver=qcow2,\
> > file.driver=file,file.filename=foo.qcow2 \
> > -device virtio-blk,drive=image,share-rw=on \
> > -qmp stdio
> > {"QMP": {"version": {"qemu": {"micro": 92, "minor": 8, "major": 2},
> > "package": " (v2.8.0-2038-g6604c893d0)"}, "capabilities": []}}
> > {'execute':'qmp_capabilities'}
> > {"return": {}}
> > {'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'
> > {"return": {}}
> > {'execute':'nbd-server-add','arguments':{'device':'image','writable':true}}
> > {"return": {}}
> > 
> > (The difference is the share-rw=on in the -device parameter.)
> > 
> > So in theory all that's necessary is to set share-rw=on for the device
> > in the management layer. But I'm not sure whether that's practical.
> 
> Yes, libvirt needs to provide this option if the guest supports sharing.
> If it doesn't support sharing, rejecting a read-write NBD client seems
> correct to me.
> 
> Peter, Eric, what is the status on the libvirt side here?

Libvirt currently uses the NBD server only for non-shared storage
migration. At that point the disk is not shared (while qemu may think
so) since the other side is not actually running until the mirror
reaches synchronized state.

The fix should be trivial.

> > As for just allowing the NBD server write access to the device... To me
> > that appears pretty difficult from an implementation perspective. We
> > assert that nobody can write without having requested write access and
> > we make sure that nobody can request write access without it being
> > allowed. Making an exception for NBD seems very difficult and would
> > probably mean we'd have to drop the assertion for write accesses altogether.
> 
> Making an exception would simply be wrong.
> 
> Kevin




signature.asc
Description: PGP signature

Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs

2017-04-03 Thread Peter Krempa

On Mon, Apr 03, 2017 at 15:00:41 +0200, Kevin Wolf wrote:
> Am 03.04.2017 um 14:39 hat Max Reitz geschrieben:
> > On 03.04.2017 10:15, Kevin Wolf wrote:
> > > Am 31.03.2017 um 19:43 hat Max Reitz geschrieben:
> > 
> > [...]
> > 
> > >> So in theory all that's necessary is to set share-rw=on for the device
> > >> in the management layer. But I'm not sure whether that's practical.
> > > 
> > > Yes, libvirt needs to provide this option if the guest supports sharing.
> > > If it doesn't support sharing, rejecting a read-write NBD client seems
> > > correct to me.
> > > 
> > > Peter, Eric, what is the status on the libvirt side here?
> > > 
> > >> As for just allowing the NBD server write access to the device... To me
> > >> that appears pretty difficult from an implementation perspective. We
> > >> assert that nobody can write without having requested write access and
> > >> we make sure that nobody can request write access without it being
> > >> allowed. Making an exception for NBD seems very difficult and would
> > >> probably mean we'd have to drop the assertion for write accesses 
> > >> altogether.
> > > 
> > > Making an exception would simply be wrong.
> > 
> > Indeed. That is why it would be so difficult.
> > 
> > The question remains whether it is practical not to make an exception.
> > As far as I know, libvirt is only guaranteed to support older qemu
> > versions, not necessarily future ones. So we should be allowed to break
> > existing use cases here until libvirt is updated (assuming it is
> > possible for libvirt to express "guest device allows shared writes" as
> > an option for its next release).
> 
> If I understand correctly, this is a case of incoming live migration,
> i.e. the virtio-blk device which is blocking the writes to the image
> doesn't really belong to a running guest yet.

Yes, exactly. libvirt starts the NBD server on the destination of
the migration. Until then the VM did not ever run yet. The VM will run
once the memory migration finishes, so the guest front-end will not
write anything at that point.

> So if we need to make an exception (and actually reading the context
> makes it appear so), I guess it would have to be that devices actually
> can share the write permission during incoming migration, but not any
> more later (unless the share-rw flag is set).

Yes, this would be desired to avoid a regression. Libvirt then tears
down the mirror prior to resuming the VM (afaik).


signature.asc
Description: PGP signature

Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs

2017-04-04 Thread Peter Krempa

On Tue, Apr 04, 2017 at 15:19:02 +0200, Kevin Wolf wrote:
> Am 03.04.2017 um 14:51 hat Peter Krempa geschrieben:
> > On Mon, Apr 03, 2017 at 10:15:42 +0200, Kevin Wolf wrote:
> > > Am 31.03.2017 um 19:43 hat Max Reitz geschrieben:
> > > > On 31.03.2017 18:03, Ciprian Barbu wrote:

[...]

> > > Peter, Eric, what is the status on the libvirt side here?
> > 
> > Libvirt currently uses the NBD server only for non-shared storage
> > migration. At that point the disk is not shared (while qemu may think
> > so) since the other side is not actually running until the mirror
> > reaches synchronized state.
> 
> Yes, I misunderstood the situation at first.
> 
> Anyway, is there already a libvirt patch for the cases where the image
> is actually shared?

As said, we use the nbd server only for migration. There's no other
usage I know of and thus no patch.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [libvirt] Fail to start 2nd guest

2017-02-27 Thread Peter Krempa

On Mon, Feb 27, 2017 at 13:31:30 +, Stefan Hajnoczi wrote:
> On Mon, Feb 27, 2017 at 08:50:03PM +0800, Xiong Zhou wrote:
> > On Mon, Feb 27, 2017 at 10:11:04AM +, Stefan Hajnoczi wrote:
> > > On Mon, Feb 27, 2017 at 05:40:50PM +0800, Xiong Zhou wrote:

[...]

> > > > sh-4.2# virsh start 73us
> > > > error: Failed to start domain 73us
> > > > error: internal error: qemu unexpectedly closed the monitor: 
> > > > ((null):11497): Spice-Warning **: reds.c:2499:reds_init_socket: listen: 
> > > > Address already in use
> > > > 2017-02-27T09:33:42.335708Z qemu-kvm: failed to initialize spice server
> > > 
> > > The error message says that the spice remote desktop cannot listen on
> > > -spice port=5900,addr=127.0.0.1.
> > > 
> > > Did you hardcode port 5900 in the domain XML?  That could explain why
> > No.
> > > the second guest fails to launch - you need to use unique port numbers
> > > or let libvirt automatically assign them.  Check the domain XML:
> > > 
> > >   
> > 
> > It looks like:
> > 
> > 
> >   
> >   
> > 
> > 
> > 
> > > 
> > > Another possibility is that a process running on the host is already
> > > using port 5900.  Perhaps a guest or VNC server that was launched
> > > outside of libvirt?  You can check this with:
> > > 
> > >   netstat -alpn | grep 5900
> > # netstat -alpn | grep 5900
> > tcp0  0 127.0.0.1:5900  0.0.0.0:*   LISTEN  
> > 11065/qemu-kvm  
> 
> Please check that 11065/qemu-kvm was launched by the same libvirtd and
> its domain XML also uses autoport='yes'.
> 
> I have CCed the libvirt mailing list because they may be able to explain
> why there is a collision on TCP port 5900.

There was a bug in the code. There could be a race between startup of
two VMs:
https://bugzilla.redhat.com/show_bug.cgi?id=1397440


signature.asc
Description: PGP signature

Re: [PATCH] qdev-monitor: QAPIfy QMP device_add

2024-07-31 Thread Peter Krempa

On Tue, Jul 09, 2024 at 16:27:22 +0200, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > The QMP device_add monitor command converts the QDict arguments to
> > QemuOpts and then back again to QDict. This process only supports scalar
> > types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
> > array of objects) are silently dropped by qemu_opts_from_qdict() during
> > the QemuOpts conversion even though QAPI is capable of validating them.
> > As a result, hotplugging virtio-blk-pci devices with the
> > iothread-vq-mapping property does not work as expected (the property is
> > ignored). It's time to QAPIfy QMP device_add!
> 
> This patch doesn't fully QAPIfy device_add: we still lack a schema
> and use 'gen': false.  It gets us closer, though.
> 
> > Get rid of the QemuOpts conversion in qmp_device_add() and call
> > qdev_device_add_from_qdict() with from_json=true. Using the QMP
> > command's QDict arguments directly allows non-scalar properties.
> >
> > The HMP is also adjusted since qmp_device_add()'s now expects properly
> > typed JSON arguments and cannot be used from HMP anymore. Move the code
> > that was previously in qmp_device_add() (with QemuOpts conversion and
> > from_json=false) into hmp_device_add() so that its behavior is
> > unchanged.
> >
> > This patch changes the behavior of QMP device_add but not HMP
> > device_add. QMP clients that sent incorrectly typed device_add QMP
> > commands no longer work. This is a breaking change but clients should be
> > using the correct types already. See the netdev_add QAPIfication in
> > commit db2a380c8457 for similar reasoning.
> 
> Another one is 9151e59a8b6e: it QAPIfied object-add.
> 
> Both commits eliminated the roundtrip through QemuOpts, and weaned the
> command off 'gen': false.
> 
> This commit eliminates the roundtrip, but keeps 'gen': false.  Best we
> can do now, but I'd like the commit message to make this clear.
> 
> > Markus helped me figure this out and even provided a draft patch. The
> > code ended up very close to what he suggested.
> >
> > Suggested-by: Markus Armbruster 
> > Cc: Daniel P. Berrangé 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  system/qdev-monitor.c | 41 -
> >  1 file changed, 28 insertions(+), 13 deletions(-)

[...]

> Have a look at this TODO in vl.c:
> 
> QTAILQ_FOREACH(opt, &device_opts, next) {
> DeviceState *dev;
> loc_push_restore(&opt->loc);
> /*
>  * TODO Eventually we should call qmp_device_add() here to make sure 
> it
>  * behaves the same, but QMP still has to accept incorrectly typed
>  * options until libvirt is fixed and we want to be strict on the CLI
>  * from the start, so call qdev_device_add_from_qdict() directly for
>  * now.
>  */

So at least this TODO should already be handled on libvirt's side.

With modern qemu libvirt is using JSON for -device and exactly the same
JSON for device_add. From what I remember from the time when I've
converted -device to use JSON, -device already required the correct
types, which would mean that also libvirt uses the correct types now.

Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-04 Thread Peter Krempa

On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > On 03.11.23 18:56, Markus Armbruster wrote:
> > > Kevin Wolf  writes:

[...]

> > > Is the job abstraction a failure?
> > > 
> > > We have
> > > 
> > >  block-job- command  since   job- commandsince
> > >  -
> > >  block-job-set-speed 1.1
> > >  block-job-cancel1.1 job-cancel  3.0
> > >  block-job-pause 1.3 job-pause   3.0
> > >  block-job-resume1.3 job-resume  3.0
> > >  block-job-complete  1.3 job-complete3.0
> > >  block-job-dismiss   2.12job-dismiss 3.0
> > >  block-job-finalize  2.12job-finalize3.0
> > >  block-job-change8.2
> > >  query-block-jobs1.1 query-jobs

[...]

> I consider these strictly optional. We don't really have strong reasons
> to deprecate these commands (they are just thin wrappers), and I think
> libvirt still uses block-job-* in some places.

Libvirt uses 'block-job-cancel' because it has different semantics from
'job-cancel' which libvirt documented as the behaviour of the API that
uses it. (Semantics regarding the expectation of what is written to the
destination node at the point when the job is cancelled).

Libvirt also uses 'block-job-set-speed' and 'query-block-jobs' but those
can be replaced easily and looking at the above table even without any
feature checks.

Thus the plan to deprecate at least 'block-job-cancel' will not work
unless the semantics are ported into 'job-cancel'.

Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-10 Thread Peter Krempa

On Thu, Mar 07, 2024 at 22:42:56 +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 04.03.24 14:09, Peter Krempa wrote:
> > On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> > > Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > On 03.11.23 18:56, Markus Armbruster wrote:
> > > > > Kevin Wolf  writes:

[...]

> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job synchronously.. But 
> looking at mirror code I see it just set s->should_complete = true, which 
> will be then handled asynchronously.. So I doubt that documentation is 
> correct.
> 
> 2. block-job-complete will trigger final graph changes. block-job-cancel will 
> not.
> 
> Is [2] really useful? Seems yes: in case of some failure before starting 
> migration target, we'd like to continue executing source. So, no reason to 
> break block-graph in source, better keep it unchanged.
> 
> But I think, such behavior better be setup by mirror-job start parameter, 
> rather then by special option for cancel (or even compelete) command, useful 
> only for mirror.

Libvirt's API design was based on the qemu quirk and thus we allow users
to do the decision after starting the job, thus anything you pick needs
to allow us to do this at "completion" time.

Libvirt can adapt to any option that will give us the above semantics
(extra parameter at completion time, different completion command or
extra command to switch job properties right before completion), but to
be honest all of these feel like they would be more hassle than keeping
'block-job-cancel' around from qemu's side.

Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-11 Thread Peter Krempa

On Mon, Mar 11, 2024 at 18:51:18 +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 11.03.24 00:07, Peter Krempa wrote:
> > On Thu, Mar 07, 2024 at 22:42:56 +0300, Vladimir Sementsov-Ogievskiy wrote:

[...]

> > Libvirt can adapt to any option that will give us the above semantics
> > (extra parameter at completion time, different completion command or
> > extra command to switch job properties right before completion), but to
> > be honest all of these feel like they would be more hassle than keeping
> > 'block-job-cancel' around from qemu's side.
> > 
> 
> I understand. But still, it would be good to finally resolve the duplication 
> between job-* and block-job-* APIs. We can keep old quirk working for a long 
> time even after making a new consistent API.

Sure, if you decide to go that way it's okay as long as we can avoid the
graph change at 'completion' time. However you decide to implement it
just let me know in advance so that I can prepare the libvirt patches
for it.

Re: [PATCH] chardev: allow specifying finer-grained reconnect timeouts

2024-08-29 Thread Peter Krempa

On Thu, Aug 29, 2024 at 13:56:43 +0200, Markus Armbruster wrote:
> Daniil Tatianin  writes:

[...]

So firstly, libvirt models the timeout in the XML in seconds for now so
making use of this will require some XML design plumbing making use of
it if there'll be any users wanting it.

When libvirt would want to make use of this the amount of work for any
of the options below is almost the same (capability detection, adding
of the new plumbing, etc). The only difference is what to
do if nobody shows interest in exposing the sub-second intervals in
libvirt.

> > @@ -287,7 +292,8 @@
> >  '*telnet': 'bool',
> >  '*tn3270': 'bool',
> >  '*websocket': 'bool',
> > -'*reconnect': 'int' },
> > +'*reconnect': 'int',
> > +'*reconnect-is-ms': 'bool' },
> >'base': 'ChardevCommon' }
> >  
> >  ##
> 
> I don't like this interface.
> 
>PRO: compatible extension; no management application updates needed
>unless they want to support sub-second delays.
> 
>CON: specifying a sub-second delay takes two parameters, which is
>awkward.
> 
>CON: trap in combination with -set.  Before the patch, something like
>-set chardev.ID.reconnect=N means N seconds no matter what.
>Afterwards, it depends on the value of reconnect-is-ms, which may be
>set far away.  Mitigating factor: -set is obscure.

Here we'd have to do nothing.

> Alternatives:
> 
> 1. Change @reconnect to 'number', specify sub-second delays as
>fractions.
> 
>PRO: compatible extension; no management application updates needed
>unless they want to support sub-second delays.
> 
>CON: first use of floating-point for time in seconds in QAPI, as far
>as I can see.
> 
>CON: QemuOpts can't do floating-point.

Same here.

> 
> 2. Deprecate @reconnect in favour of a new member with a more suitable
>unit.  Error if both are present.
> 
>PRO: after @reconnect is gone, the interface is what it arguably
>should've been from the start.
> 
>CON: incompatible change.  Management application updates needed
>within the deprecation grace period.

This one will require change to libvirt including a capability addition
even when libvirt might not want to use the new field. (Add capability
detection, switch to new interface if present. Libvirt doesn't want to
use deprecated interfaces to avoid future breakage.)

But we've done this multiple times so it's not a big deal.

> Let's get additional input from management application developers.  I
> cc'ed some.

From libvirt's point of view I'd say either option is viable. We're okay
with deprecating the old interface libvirt is used to do that.
I'd personally prefer if floating point numbers were avoided.

Re: [PATCH] chardev: introduce 'reconnect-ms' and deprecate 'reconnect'

2024-09-04 Thread Peter Krempa

On Wed, Sep 04, 2024 at 08:19:13 +0300, Daniil Tatianin wrote:
> The 'reconnect' option only allows to specify the time in seconds,
> which is way too long for certain workflows.
> 
> We have a lightweight disk backend server, which takes about 20ms to
> live update, but due to this limitation in QEMU, previously the guest
> disk controller would hang for one second because it would take this
> long for QEMU to reinitialize the socket connection.
> 
> Introduce a new option called 'reconnect-ms', which is the same as
> 'reconnect', except the value is treated as milliseconds. These are
> mutually exclusive and specifying both results in an error.
> 
> 'reconnect' is also deprecated by this commit to make it possible to
> remove it in the future as to not keep two options that control the
> same thing.
> 
> Signed-off-by: Daniil Tatianin 
> ---

[...]

> diff --git a/qapi/char.json b/qapi/char.json
> index ef58445cee..7f117438c6 100644
> --- a/qapi/char.json
> +++ b/qapi/char.json
> @@ -273,7 +273,19 @@
>  #
>  # @reconnect: For a client socket, if a socket is disconnected, then
>  # attempt a reconnect after the given number of seconds.  Setting
> -# this to zero disables this function.  (default: 0) (Since: 2.2)
> +# this to zero disables this function.  The use of this member is
> +# deprecated, use @reconnect-ms instead. (default: 0) (Since: 2.2)
> +#
> +# @reconnect-ms: For a client socket, if a socket is disconnected,
> +# then attempt a reconnect after the given number of milliseconds.
> +# Setting this to zero disables this function.  This member is
> +# mutually exclusive with @reconnect.
> +# (default: 0) (Since: 9.2)
> +#
> +# Features:
> +#
> +# @deprecated: Member @reconnect is deprecated.  Use @reconnect-ms
> +# instead.
>  #
>  # Since: 1.4
>  ##
> @@ -287,7 +299,8 @@
>  '*telnet': 'bool',
>  '*tn3270': 'bool',
>  '*websocket': 'bool',
> -'*reconnect': 'int' },
> +'*reconnect': { 'type': 'int', 'features': [ 'deprecated' ] },
> +'*reconnect-ms': 'int' },
>'base': 'ChardevCommon' }

This is a little off-topic:

So I wanted to make libvirt use the new parameter to stay ahead
deprecation. I've applied this patch to qemu, dumped capabilities and
pretty much expected a bunch of test cases in libvirt fail as they'd be
using a deprecated field as libvirt is supposed to validate everything.

And the test suite passed unexpectedly. I've dug further and noticed
that for some reason libvirt doesn't still use JSON parameters for
-chardev (which is the pre-requisite for validation).

I've also noticed that at some point I attempted to convert it over
witnessed by having an (unused) capability named QEMU_CAPS_CHARDEV_JSON
that I've introduced.

My questions are:
1) Does '-chardev' accept JSON identical to 'chardev-add' QMP command?

If yes:
2) Since when can that be used? (What can I use as a witness)
3) Are there any gotchas?

I wonder this as I'd love to finish that out, but I really don't fancy
digging into qemu to find a gotcha 3/4 of the way there.

Anyways, as I've already stated, this patch is okay for libvirt, but I
didn't review the implementation, thus, on behalf of libvirt:

ACKed-by: Peter Krempa

Re: [PATCH 0/2] ppc: Rename power5+ and power7+ for the new QOM naming rules

2024-01-12 Thread Peter Krempa

On Thu, Jan 11, 2024 at 17:46:50 +0100, Thomas Huth wrote:
> We can get rid of the "power5+" / "power7+" hack in qom/object.c
> by using CPU aliases for those names instead (first patch).
> 
> I think in the long run, we should get rid of the names with a "+"
> in it completely, so the second patch suggests to deprecate those,
> but I'd also be fine if we keep the aliases around, so in that case
> please ignore the second patch.
> 
> Thomas Huth (2):
>   target/ppc/cpu-models: Rename power5+ and power7+ for new QOM naming
> rules
>   docs/about: Deprecate the old "power5+" and "power7+" CPU names

libvirt seems to be explicitly referencing power7+ in the code, so I
guess we'll need code to translate the + versions to the spellt-out
version to preserve compatibility.

Re: [PATCH v3 0/2] block: commit/stream: Allow users to request only format driver names in backing file format

2024-01-03 Thread Peter Krempa

On Tue, Dec 05, 2023 at 18:14:40 +0100, Peter Krempa wrote:
> Please see patches for rationale.
> 
> Libvirt patches using this new flag will be posted soon-ish (after
> cleanup).
> 
> v3:
>  - changed name of flag to 'backing-mask-protocol' (Eric)
>  - decided to keep Vladimir's R-b as he requested shorter name too
> 
> v2:
>  - fixed mistaken argument order in 'hmp_block_stream'
>  - changed version in docs to 9.0 as getting this into RC 3 probably
>isn't realistic
> 
> Peter Krempa (2):
>   block: commit: Allow users to request only format driver names in
> backing file format
>   block: stream: Allow users to request only format driver names in
> backing file format

Polite ping, now that qemu-8.2 was released.

< 1 2 3 4

301 - 378 of 378 matches

Mail list logo