date:20160427

Re: [Qemu-devel] [PATCH v4 00/16] IOMMU: Enable interrupt remapping for Intel IOMMU

2016-04-27 Thread Peter Xu

On Wed, Apr 27, 2016 at 04:31:13PM +0200, Radim Krčmář wrote:

[...]

> >> > I am still looking into guest part codes. Although the above patch
> >> > should solve the issue, there are still issues in guest codes when
> >> > IR is enabled:
> >> > 
> >> > - mismatched "vector" in IOAPIC entry and IRTE entry (this is
> >> >   required in vt-d spec 5.1.5.1, and required to correctly deliver
> >> >   EOI broadcast I guess). See intel_irq_remapping_prepare_irte():
> >> 
> >> "required" is a way of saying that the opposite is undefined.
> >> No need to think about it in IOMMU.
> > 
> > Why? Without correct vector information, IOAPIC will not be able to
> > know which entry to clear the Remote IRR bit (please check
> > ioapic_eoi_broadcast())?
> 
> IOAPIC won't get correct EOI and Intel made it into an OS bug, because
> there was no good action that the hardware could take.  (We have a lot
> more freedom, but I think that partially fixing something that doesn't
> work on real hardware is a wasted effort.)

To make sure I understand this correctly... Do you mean that real
IOAPIC hardware will not handle this EOI broadcast correctly even if
we fill in matched vector in the IOAPIC entry with IRTE one (when IR
is enabled)?

I'd appreciate if there is any link or anything that can provide me
more background on this matter.. TIA.

> 
> Or did you mean that mismatched vector is a possible source of the fixed
> bug?  (I originally dismissed it, because real hardware works.)

Nop. The above patch fixes the hack for "explicit IOAPIC EOI", and I
suppose mismatched vector issue will cause "EOI broadcast" problem.
But IIUC from your above comment, we can temporarily skip this
"issue" for now, if it won't work even on real hardwares and even
vectors are matched.

Anyway, as long as the explicit EOI works, we can survive. And this
gives me the reason to send v5 first.

> 
> >> > - I encountered that level-triggered entries in IOAPIC is marked as
> >> >   edge-triggered interrupt in APIC (which is strange)...
> >> 
> >> What/where do you mean?
> >> (The only difference I know of is that level triggered vectors in LAPIC
> >>  have their respective TMR bit set while edge do not.)
> > 
> > Exactly. Here is what I mean:
> > 
> > static void apic_eoi(APICCommonState *s)
> > {
> > int isrv;
> > isrv = get_highest_priority_int(s->isr);
> > if (isrv < 0)
> > return;
> > apic_reset_bit(s->isr, isrv);
> > if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && apic_get_bit(s->tmr, 
> > isrv)) {
> > ioapic_eoi_broadcast(isrv);
> > }
> > apic_sync_vapic(s, SYNC_FROM_VAPIC | SYNC_TO_VAPIC);
> > apic_update_irq(s);
> > }
> > 
> > APIC will notify IOAPIC only if the corresponding vector in TMR bit
> > is set (in "apic_get_bit(s->tmr, isrv)", or say, it's a
> > level-triggered interrupt in APIC registers). What I have traced is
> > that, the EOI broadcast is missing because this bit is cleared in
> > APIC TMR while it should be set. I need some more tests to double
> > confirm this though, in case I made any mistake.
> 
> (There are two "legal" situations where TMR can be 0 and IOAPIC sets
>  remote IRR -- if edge and level interrupts are assigned to the same
>  vector and if IOAPIC is level while IR and OS edge, both would bug on
>  real hardware too ...)
> 
> Does QEMU bug with TCG?

Gave it a shot today. It happens as well.

Thanks,

-- peterx

Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support

2016-04-27 Thread Yuanhan Liu

On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  docs/specs/vhost-user.txt | 15 +++
>  hw/virtio/vhost-user.c| 16 
>  2 files changed, 31 insertions(+)
> 
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 8a635fa..60d6d13 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -487,3 +487,18 @@ Message types
>request to the master. It is passed in the ancillary data.
>This message is only sent if VHOST_USER_PROTOCOL_F_SLAVE_CHANNEL
>feature is available.
> +
> +Slave message types
> +---
> +
> + * VHOST_USER_SLAVE_SHUTDOWN:
> +  Id: 1
> +  Master payload: N/A
> +  Slave payload: u64
> +
> +  Request the master to shutdown the slave. A 0 reply is for
> +  success, in which case the slave may close all connections
> +  immediately and quit. A non-zero reply cancels the request.
> +
> +  Before a reply comes, the master may make other requests in
> +  order to flush or sync state.

Hi all,

I threw this proposal as well as DPDK's implementation to our customer
(OVS, Openstack and some other teams) who made such request before. I'm
sorry to say that none of them really liked that we can't handle crash.
Making reconnect work from a vhost-user backend crash is exactly something
they are after.

And to handle the crash, I was thinking of the proposal from Michael.
That is to do reset from the guest OS. This would fix this issue
ultimately. However, old kernel will not benefit from this, as well
as other guest other than Linux, making it not that useful for current
usage. 

Thinking of that the VHOST_USER_SLAVE_SHUTDOWN just gives QEMU a chance
to get the vring base (last used idx) from the backend, Huawei suggests
that we could still make it in a consistent state after the crash, if
we get the vring base from vring->used->idx.  That worked as expected
from my test. The only tricky thing might be how to detect a crash,
and we could do a simple compare of the vring base from QEMU with
the vring->used->idx at the initiation stage. If mismatch found, get
it from vring->used->idx instead.

Comments/thoughts? Is it a solid enough solution to you?  If so, we
could make things much simpler, and what's most important, we could
be able to handle crash.

--yliu

Re: [Qemu-devel] [PATCH 2/2] Debug : Add error messages before a call to debug().

2016-04-27 Thread Prerna

Hi Eric,
Thank you for the review.

On Wed, Apr 27, 2016 at 9:30 PM, Eric Blake  wrote:

> On 04/14/2016 09:02 PM, Prerna Saxena wrote:
> > Qemu code has abort() calls in various places which raises a SIGABRT;
> > This patch adds error messages before (most)calls to abort(), so that
> > it is easier to determine why QEMU died.
>
> The subject line says you are adding messages before debug(), but the
> rest of the patch is adding message before abort(). You'll need to fix
> that.  Also, subject lines usually don't end in '.'
>

Sorry, the subject line shouldve said : "Add error messages before abort()".
Will resend.


>
> > +++ b/block.c
> > @@ -3725,6 +3725,7 @@ void
> bdrv_remove_aio_context_notifier(BlockDriverState *bs,
> >  }
> >  }
> >
> > +error_report("Matching context notifier not found for removal.
> Aborting");
> >  abort();
>
> The "Aborting" part of the message is redundant; it's pretty obvious
> that qemu aborted.
>
>
Agree.


> I also wonder if you should be using g_assert_not_reached() instead of
> abort() in some (all?) of the places touched in this patch - at which
> point you don't have to worry about inventing a message that will never
> be printed.  The reason I suggest it is that g_assert_not_reached() is
> self-documenting, and prints a nicer message than abort() if it does
> accidentally get reached.
>
>
Ah, thanks for this suggestion. I will look up g_assert_not_reached() to
see how I can use it (at all places, possibly).

Regards,
Prerna


> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>

Re: [Qemu-devel] [PATCH 04/18] char: add wait support for reconnect

2016-04-27 Thread Yuanhan Liu

On Fri, Apr 01, 2016 at 01:16:14PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Until now, 'wait' was solely used for listening sockets. However, it can
> also be useful for 'reconnect' socket kind, where the first open must
> succeed before continuing.
> 
> This allows for instance (with other support patches) having vhost-user
> wait for an initial connection to setup the vhost-net, before eventually
> accepting new connections.

I have met a tricky issue about this patch. Assume the socket exist in
before we start QEMU, but somehow we can not connect to it, say due to
permission error. In such case, QEMU would wait there forever, without
any error messages. I would assume it's a bug, right?

--yliu
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  qemu-char.c | 31 +++
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/qemu-char.c b/qemu-char.c
> index 8702931..3e25c08 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -3659,7 +3659,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, 
> ChardevBackend *backend,
>Error **errp)
>  {
>  bool is_listen  = qemu_opt_get_bool(opts, "server", false);
> -bool is_waitconnect = is_listen && qemu_opt_get_bool(opts, "wait", true);
> +bool is_wait= qemu_opt_get_bool(opts, "wait", is_listen);
>  bool is_telnet  = qemu_opt_get_bool(opts, "telnet", false);
>  bool do_nodelay = !qemu_opt_get_bool(opts, "delay", true);
>  int64_t reconnect   = qemu_opt_get_number(opts, "reconnect", 0);
> @@ -3696,7 +3696,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, 
> ChardevBackend *backend,
>  sock->has_telnet = true;
>  sock->telnet = is_telnet;
>  sock->has_wait = true;
> -sock->wait = is_waitconnect;
> +sock->wait = is_wait;
>  sock->has_reconnect = true;
>  sock->reconnect = reconnect;
>  sock->tls_creds = g_strdup(tls_creds);
> @@ -4350,7 +4350,7 @@ static CharDriverState *qmp_chardev_open_socket(const 
> char *id,
>  bool do_nodelay = sock->has_nodelay ? sock->nodelay : false;
>  bool is_listen  = sock->has_server  ? sock->server  : true;
>  bool is_telnet  = sock->has_telnet  ? sock->telnet  : false;
> -bool is_waitconnect = sock->has_wait? sock->wait: false;
> +bool is_wait= sock->has_wait? sock->wait: false;
>  int64_t reconnect   = sock->has_reconnect ? sock->reconnect : 0;
>  ChardevCommon *common = qapi_ChardevSocket_base(sock);
>  QIOChannelSocket *sioc = NULL;
> @@ -4424,7 +4424,7 @@ static CharDriverState *qmp_chardev_open_socket(const 
> char *id,
>  }
>  
>  sioc = qio_channel_socket_new();
> -if (s->reconnect_time) {
> +if (s->reconnect_time && !is_wait) {
>  qio_channel_socket_connect_async(sioc, s->addr,
>   qemu_chr_socket_connected,
>   chr, NULL);
> @@ -4433,7 +4433,7 @@ static CharDriverState *qmp_chardev_open_socket(const 
> char *id,
>  goto error;
>  }
>  s->listen_ioc = sioc;
> -if (is_waitconnect) {
> +if (is_wait) {
>  trace_char_socket_waiting(chr->filename);
>  tcp_chr_accept(QIO_CHANNEL(s->listen_ioc), G_IO_IN, chr);
>  }
> @@ -4443,9 +4443,24 @@ static CharDriverState *qmp_chardev_open_socket(const 
> char *id,
>  QIO_CHANNEL(s->listen_ioc), G_IO_IN, tcp_chr_accept, chr, 
> NULL);
>  }
>  } else {
> -if (qio_channel_socket_connect_sync(sioc, s->addr, errp) < 0) {
> -goto error;
> -}
> +do {
> +Error *err = NULL;
> +
> +if (qio_channel_socket_connect_sync(sioc, s->addr, ) < 0) {
> +if (reconnect) {
> +trace_char_socket_reconnect_error(chr->label,
> +  error_get_pretty(err));
> +error_free(err);
> +continue;
> +} else {
> +error_propagate(errp, err);
> +goto error;
> +}
> +} else {
> +break;
> +}
> +} while (true);
> +
>  tcp_chr_new_client(chr, sioc);
>  object_unref(OBJECT(sioc));
>  }
> -- 
> 2.5.5

Re: [Qemu-devel] Hang with migration multi-thread compression under high load

2016-04-27 Thread Li, Liang Z

> I've been testing various features of migration and have hit a problem with
> the multi-thread compression. It works fine when I have 2 or more threads,
> but if I tell it to only use a single thread, then it almost always hangs
> 
> I'm doing a migration between 2 guests on the same machine over a tcp
> localhost socket, using this command line to launch them:
> 
>   /home/berrange/src/virt/qemu/x86_64-softmmu/qemu-system-x86_64
>  -chardev socket,id=mon,path=/var/tmp/qemu-src-4644-monitor.sock
>  -mon chardev=mon,mode=control
>  -display none
>  -vga none
>  -machine accel=kvm
>  -kernel /boot/vmlinuz-4.4.7-300.fc23.x86_64
>  -initrd /home/berrange/src/virt/qemu/tests/migration/initrd-stress.img
>  -append "noapic edd=off printk.time=1 noreplace-smp
> cgroup_disable=memory pci=noearly console=ttyS0 debug ramsize=1"
>  -chardev stdio,id=cdev0
>  -device isa-serial,chardev=cdev0
>  -m 1536
>  -smp 1
> 
> The target VM also gets
> 
> -incoming tcp:localhost:9000
> 
> 
> When the VM hangs, the source QEMU shows this stack trace:
> 

What's the mean of  "VM hangs", the VM has no response?
or just the live migration process can't not complete.

I do the test in my environment, it works for me.

Could you try to exec 'info migrate' in qemu monitor on the source side
to check if the live migration process is ongoing, if the 'transferred ram'
keeps unchanged,  it shows dad lock happen.

Liang

> for some reason it isn't shown in the stack thrace for thread
> 1 above, when initially connecting GDB it says the main thread is at:
> 
> decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000,
> f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
> 2254  for (idx = 0; idx < thread_count; idx++) {
> 
> 
> Looking at the target QEMU, we see  do_data_decompress method is
> waiting in a condition var:
> 
> while (!param->start && !quit_decomp_thread) {
>   qemu_cond_wait(>cond, >mutex);
> do stuff..
>   param->start = false
> }
> 
> 
> Now the decompress_data_with_multi_threads is checking param->start
> without holding the param->mutex lock.
> 
> Changing decompress_data_with_multi_threads to acquire param->mutex
> lock makes it work, but isn't ideal, since that now blocks the
> decompress_data_with_multi_threads() method on the completion of each
> thread, which defeats the point of having multiple threads.
> 
> 
> As mentioned above I'm only seeing the hang when using 1 decompress
> thread. If it let QEMU have multiple decompress threads everything is fine.
> Also, it only happens if I have a very heavy guest workload.
> If the guest is completely idle, it again works fine. So clearly there is some
> kind of race condition I'm unlucky enough to hit here.
> 
> In terms of monitor commands I'm just running
> 
> 
>   migrate_set_capabilities compress on(src + dst)
>   migrate_set_parameters compress-threads 1   (src only)
>   migrate_set_parameters decompress-threads 1 (dst only)
> 
> Then migrate -d tcp:localhost:9000
> 
> Regards,
> Daniel
> --
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] post-copy is broken?

2016-04-27 Thread Li, Liang Z

> -Original Message-
> From: Andrea Arcangeli [mailto:aarca...@redhat.com]
> Sent: Wednesday, April 27, 2016 10:48 PM
> To: Li, Liang Z
> Cc: Dr. David Alan Gilbert; Kirill A. Shutemov; 
> kirill.shute...@linux.intel.com;
> Amit Shah; qemu-devel@nongnu.org; quint...@redhat.com; linux-
> m...@kvack.org
> Subject: Re: post-copy is broken?
> 
> Hello Liang,
> 
> On Mon, Apr 18, 2016 at 10:33:14AM +, Li, Liang Z wrote:
> > If the THP is disabled, no fails.
> > And your test was always passed, even when  real post-copy was failed.
> >
> > In my env, the output of
> > 'cat /sys/kernel/mm/transparent_hugepage/enabled'  is:
> >
> >  [always] ...
> >
> 
> Can you test the fix?
> https://marc.info/?l=linux-mm=146175869123580=2
> 
> This was not a breakage in userfaultfd nor in postcopy. userfaultfd had no
> bugs and is fully rock solid and with zero chances of generating undetected
> memory corruption like it was happening in v4.5.
> 
> As I suspected, the same problem would have happened with any THP
> pmd_trans_huge split (swapping/inflating-balloon etc..). Postcopy just
> makes it easier to reproduce the problem because it does a scattered
> MADV_DONTNEED on the destination qemu guest memory for the pages
> redirtied during the last precopy pass that run, or not transferred (to allow
> THP faults in destination qemu during precopy), just before starting the
> guest in the destination node.
> 
> Other reports of KVM memory corruption happening on v4.5 with THP
> enabled will also be taken care of by the above fix.
> 
> I hope I managed to fix this in time for v4.6 final (current is v4.6-rc5-69), 
> so
> the only kernel where KVM must not be used with THP enabled will be v4.5.
> 
> On a side note, this MADV_DONTEED trigger reminded me as soon as the
> madvisev syscall is merged, loadvm_postcopy_ram_handle_discard should
> start using it to reduce the enter/exit kernel to just 1 (or a few madvisev in
> case we want to give a limit to the temporary buffer to avoid the risk of
> allocating too much temporary RAM for very large
> guests) to do the MADV_DONTNEED scattered zapping. Same thing in
> virtio_balloon_handle_output.
> 

I have test the patch, guest doesn't crash anymore, I think the issue is fixed. 
Thanks!

Liang
> Thanks,
> Andrea

Re: [Qemu-devel] Hang with migration multi-thread compression under high load

2016-04-27 Thread Li, Liang Z

> On Wed, Apr 27, 2016 at 03:29:30PM +0100, Dr. David Alan Gilbert wrote:
> > ccing in Liang Li
> >
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > for some reason it isn't shown in the stack thrace for thread
> > > 1 above, when initially connecting GDB it says the main thread is
> > > at:
> > >
> > > decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000,
> f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
> > > 2254  for (idx = 0; idx < thread_count; idx++) {
> > >
> > >
> > > Looking at the target QEMU, we see  do_data_decompress method is
> > > waiting in a condition var:
> > >
> > > while (!param->start && !quit_decomp_thread) {
> > >   qemu_cond_wait(>cond, >mutex);
> > > do stuff..
> > >   param->start = false
> > > }
> > >
> > >
> > > Now the decompress_data_with_multi_threads is checking param->start
> > > without holding the param->mutex lock.
> > >
> > > Changing decompress_data_with_multi_threads to acquire param-
> >mutex
> > > lock makes it work, but isn't ideal, since that now blocks the
> > > decompress_data_with_multi_threads() method on the completion of
> > > each thread, which defeats the point of having multiple threads.
> 
> FWIW, the following patch also appears to "fix" the issue, presumably by just
> making the race much less likely to hit:
> 
> diff --git a/migration/ram.c b/migration/ram.c index 3f05738..be0233f 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2271,6 +2271,7 @@ static void
> decompress_data_with_multi_threads(QEMUFile *f,
>  if (idx < thread_count) {
>  break;
>  }
> +sched_yield();
>  }
>  }
> 
> 
> Incidentally IIUC, this decompress_data_with_multi_threads is just busy
> waiting for a thread to become free, which seems pretty wasteful of CPU
> resources. I wonder if there's a more effective way to structure this, so that
> instead of having decompress_data_with_multi_threads()
> choose which thread to pass the decompression job to, it just puts the job
> into a queue, and then let all the threads pull from that shared queue. IOW
> whichever thread the kerenl decides to wakeup would get the job, without
> us having to explicitly assign a thread to the job.
> 
> 

Thanks for reporting this issue, I will take a look and get back to you.

Liang

> Regards,
> Daniel
> --
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [Query] Does Linux & Qemu support KVM for ARM32 guest on ARM64 host

2016-04-27 Thread Shannon Zhao



On 2016/4/28 9:50, RAVINDRA KUMAR SANDE wrote:
> 
> What I did  :
> 1) Just for investigation, I took a ARMv8 ( OdroidC2 ) board
> 2)  I compiled Linux 3.14 with KVM support for this ARMv8 ( OdroidC2 )
> board, with modification replacing meson_timer by arm timer in its dts
> file.
> Why Linux 3.14 : I took Linux 3.14 because display drivers for this
> board are officially for this version; and I am interested in seeing
> some Linux guest booting with display on.
> 3)  I see from boot log of  that KVM is initialized successfully, and I
> can see /dev/kvm node.
> 4) I built latest Qemu with --enable-kvm on this board natively.
> 
> What I find :
> 1) running "qemu-system-arm  -enable-kvm   -machine vexpress-a9 "
> gives error :  no accelerator found
> 2) running "qemu-system-aarch64 -enable-kvm  -machine vexpress-a9 "
> gives error : kmv_init_vcpu (IOCtl on /dev/kvm) failed, guest not supported
> ( I experimented some modifications as well to overcome above error,
> such as replacing value assigned to cpu->kvm_target etc, but IOCtl call
> is failing)
> 
> Query:
> 1) Does Arm64 Linux not enable KVM support for Arm32 guest ?
> 2) Can qemu-system-arm not use the KVM feature on Arm64 host ?
> 3) Can qemu-system-aarch64 not use KVM feature for Arm32 guest ?
> 
You can use below command to boot a ARM32 guest on ARM64：

qemu-system-aarch64 -enable-kvm -machine virt,kernel_irqchip=on -cpu
host,aarch64=off 

-- 
Shannon

Re: [Qemu-devel] [PATCH for-2.7 v2 07/17] rbd: Implement image locking

2016-04-27 Thread Fam Zheng

On Wed, 04/27 13:18, Jason Dillaman wrote:
> On Tue, Apr 26, 2016 at 7:20 PM, Fam Zheng  wrote:
> > On Tue, 04/26 10:42, Jason Dillaman wrote:
> >> On Sun, Apr 24, 2016 at 7:42 PM, Fam Zheng  wrote:
> >> > On Fri, 04/22 21:57, Jason Dillaman wrote:
> >> >> Since this cannot automatically recover from a crashed QEMU client with 
> >> >> an
> >> >> RBD image, perhaps this RBD locking should not default to enabled.
> >> >> Additionally, this will conflict with the "exclusive-lock" feature
> >> >> available since the Ceph Hammer-release since both utilize the same 
> >> >> locking
> >> >> construct.
> >> >>
> >> >> As a quick background, the optional exclusive-lock feature can be
> >> >> enabled/disabled on image and safely/automatically handles the case of
> >> >> recovery from a crashed client.  Under normal conditions, the RBD
> >> >> exclusive-lock feature automatically acquires the lock upon the first
> >> >> attempt to write to the image and transparently transitions ownership of
> >> >> the lock between two or more clients -- used for QEMU live-migration.
> >> >
> >> > Is it enabled by default?
> >> >
> >>
> >> Starting with the Jewel release of Ceph it is enabled by default.
> >
> > OK, then I'll leave rbd in this QEMU series for now.
> 
> Without exposing some new API methods, this patch will, unfortunately,
> directly conflict with the new Jewel rbd defaults and will actually
> result in the image becoming read-only since librbd won't be able to
> acquire the exclusive lock when QEMU owns the advisory lock.
> 
> We can probably get the new API methods upstream within a week or two
> [1].  If that's too long of a delay, I'd recommend dropping rbd
> locking from the series for now.

Yes you are right, I tried to mean "drop" with "leave" :)

Thanks,
Fam

> 
> [1] http://tracker.ceph.com/issues/15632
> 
> -- 
> Jason

Re: [Qemu-devel] [PATCH qemu v15 14/17] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO

2016-04-27 Thread David Gibson

On Wed, Apr 27, 2016 at 07:14:15PM +1000, Alexey Kardashevskiy wrote:
> On 04/27/2016 04:39 PM, David Gibson wrote:
> >On Thu, Apr 21, 2016 at 02:22:01PM +1000, Alexey Kardashevskiy wrote:
> >>On 04/21/2016 01:59 PM, David Gibson wrote:
> >>>On Wed, Apr 20, 2016 at 07:15:15PM +1000, Alexey Kardashevskiy wrote:
> On 04/07/2016 10:40 AM, David Gibson wrote:
> >On Mon, Apr 04, 2016 at 07:33:43PM +1000, Alexey Kardashevskiy wrote:
> >>The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
> >>a guest view of the table and a hardware TCE table. If there is no VFIO
> >>presense in the address space, then just the guest view is used, if
> >>this is the case, it is allocated in the KVM. However since there is no
> >>support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
> >>we need to move the guest view from KVM to the userspace; and we need
> >>to do this for every IOMMU on a bus with VFIO devices.
> >>
> >>This adds vfio_start/vfio_stop callbacks in MemoryRegionIOMMUOps to
> >>notifiy IOMMU about changing environment so it can reallocate the table
> >>to/from KVM or (when available) hook the IOMMU groups with the logical
> >>bus (LIOBN) in the KVM.
> >>
> >>This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
> >>path as the new callbacks do this better - they notify IOMMU at
> >>the exact moment when the configuration is changed, and this also
> >>includes the case of PCI hot unplug.
> >>
> >>As there can be multiple containers attached to the same PHB/LIOBN,
> >>this replaces the @need_vfio flag in sPAPRTCETable with the counter
> >>of VFIO users.
> >>
> >>Signed-off-by: Alexey Kardashevskiy 
> >
> >This looks correct, but there's one remaining ugly.
> >
> >>---
> >>Changes:
> >>v15:
> >>* s/need_vfio/vfio-Users/g
> >>---
> >> hw/ppc/spapr_iommu.c   | 30 --
> >> hw/ppc/spapr_pci.c |  6 --
> >> hw/vfio/common.c   |  9 +
> >> include/exec/memory.h  |  4 
> >> include/hw/ppc/spapr.h |  2 +-
> >> 5 files changed, 34 insertions(+), 17 deletions(-)
> >>
> >>diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> >>index c945dba..ea09414 100644
> >>--- a/hw/ppc/spapr_iommu.c
> >>+++ b/hw/ppc/spapr_iommu.c
> >>@@ -155,6 +155,16 @@ static uint64_t 
> >>spapr_tce_get_page_sizes(MemoryRegion *iommu)
> >> return 1ULL << tcet->page_shift;
> >> }
> >>
> >>+static void spapr_tce_vfio_start(MemoryRegion *iommu)
> >>+{
> >>+spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), 
> >>true);
> >>+}
> >>+
> >>+static void spapr_tce_vfio_stop(MemoryRegion *iommu)
> >>+{
> >>+spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), 
> >>false);
> >>+}
> >>+
> >> static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
> >> static void spapr_tce_table_do_disable(sPAPRTCETable *tcet);
> >>
> >>@@ -239,6 +249,8 @@ static const VMStateDescription 
> >>vmstate_spapr_tce_table = {
> >> static MemoryRegionIOMMUOps spapr_iommu_ops = {
> >> .translate = spapr_tce_translate_iommu,
> >> .get_page_sizes = spapr_tce_get_page_sizes,
> >>+.vfio_start = spapr_tce_vfio_start,
> >>+.vfio_stop = spapr_tce_vfio_stop,
> >
> >Ok, so AFAICT these callbacks are called whenever a VFIO context is
> >added / removed from the gIOMMU's address space, and it's up to the
> >gIOMMU code to ref count that to see if there are any current vfio
> >users.  That makes "vfio_start" and "vfio_stop" not great names.
> >
> >But.. better than changing the names would be to move the refcounting
> >to the generic code if you can manage it, so the individual gIOMMU
> >backends don't need to - they just told when they need to start / stop
> >providing VFIO support.
> 
> Everything is manageable...
> 
> This referencing is needed for the case of >=2 containers so
> 2xvfio_listener_region_add will create 2xVFIOGuestIOMMU as they are per
> VFIOContainer so VFIOGuestIOMMU is not the right place for the reference
> counting, VFIOAddressSpace seems to be that place (=> add list of IOMMU 
> MRs
> with refcounter). Or even IOMMU MR. Or move VFIOGuestIOMMU list from
> VFIOContainer to VFIOAddressSpace and then gIOMMU can handle
> refcounting?
> >>>
> >>>I'm having a lot of trouble parsing that.  I think the ref parsing has
> >>>to be per-giommu (because individual giommus could, in theory, be
> >>>mapped or unmapped from an address space).
> >>
> >>
> >>Example 1.
> >>POWER8, no DDW, one QEMU PHB, 2 IOMMU groups, table sharing so just 1
> >>container, one TCE table (aka gIOMMU), one TCE table in KVM, no reference
> >>counting needed at all, simple.
> >>
>

Re: [Qemu-devel] Is anyone able to load a web page from a guest operating system?

2016-04-27 Thread Programmingkid


On Apr 27, 2016, at 2:34 AM, Thomas Huth wrote:

> On 26.04.2016 22:19, Programmingkid wrote:
>> 
>> On Apr 26, 2016, at 4:12 PM, Thomas Huth wrote:
>> 
>>> On 26.04.2016 21:25, Programmingkid wrote:
 
 On Apr 26, 2016, at 3:00 PM, Dr. David Alan Gilbert wrote:
 
> * Programmingkid (programmingk...@gmail.com) wrote:
>> My three guest operating systems can't load a web page. I think this is 
>> a bug with QEMU. Is there anyone who has the latest revision of QEMU 
>> that can access the web from a guest? Or are you experiencing the same 
>> problem?
> ...
> What's the qemu command line you use?
 qemu-system-ppc -hda  -hdb  -m 512 -boot c -M mac99 -netdev 
 user,id=mynet0 -device usb-net,netdev=mynet0 -cpu 750 -prom-env 
 boot-args=-v -device ich9-usb-uhci1,id=newusb -device 
 usb-audio,bus=newusb.0 
 
 and
 
 qemu-system-ppc -hda  -hdb  -m 512 -boot c -M mac99 -netdev 
 user,id=mynet0 -device rtl8139,netdev=mynet0 -cpu 750 -prom-env 
 boot-args=-v -device ich9-usb-uhci1,id=newusb -device 
 usb-audio,bus=newusb.0 
>>> 
>>> Ok, that means you're using user-mode / slirp networking.
>>> I just tried it with a pseries guest, and it seems to be working fine
>>> for me with the current git version of QEMU (f419a626c76bcb266).
>> 
>> So you are saying you can view web pages on your guest?
> 
> Yes. Linux guest on a Linux host, and I was able to access a web page
> from the guest.
> 
>>> 
>>> Now, what kind of host do you use? Mac OS X? 
>> Yes. Mac OS 10.6.
> 
> Ok, at least not Windows ... because there have been some problems with
> Windows recently, which could be the culprit otherwise:
> 
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=3424c8a9c89a3bc0d29ad
> 
>>> Also can you determine a revision when it was still working fine for
>>> you? (and then maybe even bisect the problem?)
>> 
>> I will see what I can find out.
> 
> Thanks! Bisecting the problem is likely the best way to deal with this...

Found out which patch was causing problems. This one: 

commit 5379229a2708df3a1506113315214c3ce5325859
Author: Guillaume Subiron 
Date:   Sat Dec 19 22:24:59 2015 +0100

slirp: Factorizing address translation

This patch factorizes some duplicate code into a new function,
sotranslate_out(). This function perform the address translation when a
packet is transmitted to the host network. If the packet is destinated
to the host, the loopback address is used, and if the packet is
destinated to the virtual DNS, the real DNS address is used. This code
is just a copy of the existent, but factorized and ready to manage the
IPv6 case.

[Qemu-devel] [PATCH v15 19/23] qapi: Split visit_end_struct() into pieces

2016-04-27 Thread Eric Blake

As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.

Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error.  So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().

Generated code in qapi-visit.c has diffs resembling:

|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
| goto out_obj;
| }
| visit_type_ACPIOSTInfo_members(v, obj, );
|-error_propagate(errp, err);
|-err = NULL;
|+if (err) {
|+goto out_obj;
|+}
|+visit_check_struct(v, );
| out_obj:
|-visit_end_struct(v, );
|+visit_end_struct(v);
| out:

and in qapi-event.c:

@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
| goto out;
| }
| visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, , );
|-visit_end_struct(v, err ? NULL : );
|+if (!err) {
|+visit_check_struct(v, );
|+}
|+visit_end_struct(v);
| if (err) {
| goto out;

Signed-off-by: Eric Blake 

---
v15: rebase, simplify user_creatable_add
v14: rebase to master
v13: rebase to earlier changes
v12: rebase to earlier changes, fix bug in spapr_drc not calling
visit_end_struct, fold in docs, fix stray DO_UPCAST from sneaking
back in
[no v10, v11]
v9: rebase to earlier changes, drop Marc-Andre's R-b
v8: rebase to 'name' motion
v7: rebase to earlier changes
v6: new patch, revised version of RFC based on discussion of v5 7/46
---
 include/qapi/visitor.h  | 21 -
 include/qapi/visitor-impl.h |  5 -
 scripts/qapi-commands.py|  7 +--
 scripts/qapi-event.py   |  5 -
 scripts/qapi-visit.py   | 15 +--
 qapi/qapi-visit-core.c  | 11 +--
 block/crypto.c  | 14 --
 hw/ppc/spapr_drc.c  |  3 ++-
 hw/virtio/virtio-balloon.c  | 15 ---
 qapi/opts-visitor.c | 17 +++--
 qapi/qapi-dealloc-visitor.c |  2 +-
 qapi/qmp-input-visitor.c| 35 ---
 qapi/qmp-output-visitor.c   |  2 +-
 qom/object.c|  5 ++---
 qom/object_interfaces.c | 25 +++--
 tests/test-qmp-input-visitor.c  |  3 ++-
 tests/test-qmp-output-visitor.c |  3 ++-
 docs/qapi-code-gen.txt  | 15 ++-
 18 files changed, 129 insertions(+), 74 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index e79c09e..c882da6 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -191,10 +191,11 @@
  *  }
  * outlist:
  *  visit_end_list(v);
+ *  if (!err) {
+ *  visit_check_struct(v, );
+ *  }
  * outobj:
- *  error_propagate(errp, err);
- *  err = NULL;
- *  visit_end_struct(v, );
+ *  visit_end_struct(v);
  *  error_propagate(errp, err);
  *  ...clean up v...
  * 
@@ -247,17 +248,27 @@ void visit_start_struct(Visitor *v, const char *name, 
void **obj,
 size_t size, Error **errp);

 /*
- * Complete an object visit started earlier.
+ * Prepare for completing an object visit.
  *
  * @errp obeys typical error usage, and reports failures such as
  * unparsed keys remaining in the input stream.
  *
+ * Should be called prior to visit_end_struct() if all other
+ * intermediate visit steps were successful, to allow the visitor one
+ * last chance to report errors.  May be skipped on a cleanup path,
+ * where there is no need to check for further errors.
+ */
+void visit_check_struct(Visitor *v, Error **errp);
+
+/*
+ * Complete an object visit started earlier.
+ *
  * Must be called after any successful use of visit_start_struct(),
  * even if intermediate processing was skipped due to errors, to allow
  * the backend to release any resources.  Destroying the visitor early
  * behaves as if this was implicitly called.
  */
-void visit_end_struct(Visitor *v, Error **errp);
+void visit_end_struct(Visitor *v);


 /*** Visiting lists ***/
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 88d27d5..b20a922 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -43,8 +43,11 @@ struct Visitor
 void (*start_struct)(Visitor *v, const char *name, void **obj,
  size_t size, Error **errp);

+/* Optional; intended for input visitors */
+void (*check_struct)(Visitor *v, Error **errp);
+
 /* Must be set to visit structs */
-void

[Qemu-devel] [PATCH v15 22/23] qapi: Simplify semantics of visit_next_list()

2016-04-27 Thread Eric Blake

The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:

start()
for (prev = head; cur = next(prev); prev = ) {
visit(>value)
}

Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance.  It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.

Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py.  That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.

We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:

start(head)
for (tail = *head; tail; tail = next(tail)) {
visit(>value)
}

With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()).  As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).

The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.

The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits.  It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward).  But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.

Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.

Signed-off-by: Eric Blake 

---
v15: address review comments, tweak commit message, rebase atop
fix for string-input list bug, have qmp input push() return entry
rather than taking another parameter
v14: no change
v13: documentation wording tweaks
v12: rebase to earlier changes, drop R-b, improve documentation,
split out qmp-input cleanups into prereq patches
[no v10, v11]
v9: no change
v8: consistent parameter order, fix qmp_input_get_next_type() to not
skip list entries
v7: new patch
---
 include/qapi/visitor.h   | 49 +++-
 include/qapi/visitor-impl.h  |  8 +++---
 scripts/qapi-visit.py| 16 ++--
 include/qapi/opts-visitor.h  |  3 ++-
 include/qapi/string-input-visitor.h  |  3 ++-
 include/qapi/string-output-visitor.h |  3 ++-
 qapi/qapi-visit-core.c   | 12 +
 hw/ppc/spapr_drc.c   |  2 +-
 qapi/opts-visitor.c  | 33 +++-
 qapi/qapi-dealloc-visitor.c  | 35 ++
 qapi/qmp-input-visitor.c | 40 +++--
 qapi/qmp-output-visitor.c| 22 
 qapi/string-input-visitor.c  | 29 +
 qapi/string-output-visitor.c | 41 +++---
 tests/test-string-input-visitor.c|  2 --
 docs/qapi-code-gen.txt   | 16 ++--
 16 files changed, 136 insertions(+), 178 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index c882da6..4f2a460 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -175,7 +175,7 @@
  *  if (err) {
  *  goto outobj;
  *  }
- *  visit_start_list(v, "list", );
+ *  visit_start_list(v, "list", NULL, 0, );
  *  if (err) {
  *  goto outobj;
  *  }
@@ -279,19 +279,27 @@ void visit_end_struct(Visitor *v);
  * @name expresses the relationship of this list to its parent
  * container; see the general description of @name above.
  *
+ * @list must be non-NULL for a real walk, in which case @size
+ * determines how much memory an input visitor will allocate into
+ * *@list (at least

[Qemu-devel] [PATCH v15 23/23] qapi: Change visit_type_FOO() to no longer return partial objects

2016-04-27 Thread Eric Blake

Returning a partial object on error is an invitation for a careless
caller to leak memory.  We already fixed things in an earlier
patch to guarantee NULL if visit_start fails ("qapi: Guarantee
NULL obj on input visitor callback error"), but that does not
help the case where visit_start succeeds but some other failure
happens before visit_end, such that we leak a partially constructed
object outside visit_type_FOO(). As no one outside the testsuite
was actually relying on these semantics, it is cleaner to just
document and guarantee that ALL pointer-based visit_type_FOO()
functions always leave a safe value in *obj during an input visitor
(either the new object on success, or NULL if an error is
encountered), so callers can now unconditionally use
qapi_free_FOO() to clean up regardless of whether an error occurred.

The decision is done by adding visit_is_input(), then updating the
generated code to check if additional cleanup is needed based on
the type of visitor in use.

Note that we still leave *obj unchanged after a scalar-based
visit_type_FOO(); I did not feel like auditing all uses of
visit_type_Enum() to see if the callers would tolerate a specific
sentinel value (not to mention having to decide whether it would
be better to use 0 or ENUM__MAX as that sentinel).

Signed-off-by: Eric Blake 

---
v15: rebase, redo semantics on how generated code learns
about input visitors, hoist unrelated assertions earlier
v14: no change
v13: rebase to latest, documentation tweaks
v12: rebase to latest, don't modify callback signatures, use newer
approach for detecting input visitors, avoid 'allocated' boolean
[no v10, v11]
v9: fix bug in use of errp
v8: rebase to earlier changes
v7: rebase to earlier changes, enhance commit message, also fix
visit_type_str() and visit_type_any()
v6: rebase on top of earlier doc and formatting improvements, mention
that *obj can be uninitialized on entry to an input visitor, rework
semantics to keep valgrind happy on uninitialized input, break some
long lines
---
 include/qapi/visitor.h | 24 
 scripts/qapi-visit.py  | 22 +-
 qapi/qapi-visit-core.c |  6 ++
 tests/test-qmp-commands.c  | 13 ++---
 tests/test-qmp-input-strict.c  | 17 +++--
 tests/test-qmp-input-visitor.c | 10 ++
 docs/qapi-code-gen.txt |  8 
 7 files changed, 58 insertions(+), 42 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 4f2a460..7b97711 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -66,11 +66,14 @@
  * member @name is not present, or is present but not the specified
  * type).
  *
- * FIXME: At present, input visitors may allocate an incomplete *@obj
- * even when visit_type_FOO() reports an error.  Using an output
- * visitor with an incomplete object has undefined behavior; callers
- * must call qapi_free_FOO() (which uses the dealloc visitor, and
- * safely handles an incomplete object) to avoid a memory leak.
+ * If an error is detected during visit_type_FOO() with an input
+ * visitor, then *@obj will be NULL for pointer types, and left
+ * unchanged for scalar types.  Using an output visitor with an
+ * incomplete object has undefined behavior (other than a special case
+ * for visit_type_str() treating NULL like ""), while the dealloc
+ * visitor safely handles incomplete objects.  Since input visitors
+ * never produce an incomplete object, such an object is possible only
+ * by manual construction.
  *
  * For the QAPI object types (structs, unions, and alternates), there
  * is an additional generated function in qapi-visit.h compatible
@@ -105,7 +108,6 @@
  *  v = ...obtain input visitor...
  *  visit_type_Foo(v, NULL, , );
  *  if (err) {
- *  qapi_free_Foo(f);
  *  ...handle error...
  *  } else {
  *  ...use f...
@@ -123,7 +125,6 @@
  *  v = ...obtain input visitor...
  *  visit_type_FooList(v, NULL, , );
  *  if (err) {
- *  qapi_free_FooList(l);
  *  ...handle error...
  *  } else {
  *  for ( ; l; l = l->next) {
@@ -153,7 +154,9 @@
  * helpers that rely on in-tree information to control the walk:
  * visit_optional() for the 'has_member' field associated with
  * optional 'member' in the C struct; and visit_next_list() for
- * advancing through a FooList linked list.  Only the generated
+ * advancing through a FooList linked list.  Similarly, the
+ * visit_is_input() helper makes it possible to write code that is
+ * visitor-agnostic everywhere except for cleanup.  Only the generated
  * visit_type functions need to use these helpers.
  *
  * It is also possible to use the visitors to do a virtual walk, where
@@ -403,6 +406,11 @@ bool visit_optional(Visitor *v, const char *name, bool 
*present);
 void visit_type_enum(Visitor *v, const char *name, int *obj,
  const char *const strings[], Error **errp);

+/*
+ * Check if visitor is an input visitor.
+ */
+bool

[Qemu-devel] [PATCH v15 21/23] qapi: Fix string input visitor handling of invalid list

2016-04-27 Thread Eric Blake

As shown in the previous commit, the string input visitor was
treating bogus input as an empty list rather than an error.
Fix parse_str() to set errp, then the callers to exit early if
an error was reported.  Also, make sure the error message
for visit_type_uint64() gracefully handles a NULL 'name' when
called from the top level or a list context.

Meanwhile, fix the testsuite to use the generated
qapi_free_int16List() instead of rolling our own, and to
validate the fixed behavior, while at the same time documenting
one more change that we'd like to make in a later patch (a
failed visit_start_list should guarantee a NULL pointer,
regardless of what things were on input).

Signed-off-by: Eric Blake 

---
v15: new patch
---
 qapi/string-input-visitor.c   | 19 +--
 tests/test-string-input-visitor.c | 12 +---
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
index 797973a..ad150dc 100644
--- a/qapi/string-input-visitor.c
+++ b/qapi/string-input-visitor.c
@@ -44,7 +44,7 @@ static void free_range(void *range, void *dummy)
 g_free(range);
 }

-static void parse_str(StringInputVisitor *siv, Error **errp)
+static int parse_str(StringInputVisitor *siv, const char *name, Error **errp)
 {
 char *str = (char *) siv->string;
 long long start, end;
@@ -52,7 +52,7 @@ static void parse_str(StringInputVisitor *siv, Error **errp)
 char *endptr;

 if (siv->ranges) {
-return;
+return 0;
 }

 do {
@@ -117,11 +117,14 @@ static void parse_str(StringInputVisitor *siv, Error 
**errp)
 }
 } while (str);

-return;
+return 0;
 error:
 g_list_foreach(siv->ranges, free_range, NULL);
 g_list_free(siv->ranges);
 siv->ranges = NULL;
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
+   "an int64 value or range");
+return -1;
 }

 static void
@@ -129,7 +132,9 @@ start_list(Visitor *v, const char *name, Error **errp)
 {
 StringInputVisitor *siv = to_siv(v);

-parse_str(siv, errp);
+if (parse_str(siv, name, errp) < 0) {
+return;
+}

 siv->cur_range = g_list_first(siv->ranges);
 if (siv->cur_range) {
@@ -195,7 +200,9 @@ static void parse_type_int64(Visitor *v, const char *name, 
int64_t *obj,
 return;
 }

-parse_str(siv, errp);
+if (parse_str(siv, name, errp) < 0) {
+return;
+}

 if (!siv->ranges) {
 goto error;
@@ -222,7 +229,7 @@ static void parse_type_int64(Visitor *v, const char *name, 
int64_t *obj,
 return;

 error:
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name,
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
"an int64 value or range");
 }

diff --git a/tests/test-string-input-visitor.c 
b/tests/test-string-input-visitor.c
index 8114908..f99824d 100644
--- a/tests/test-string-input-visitor.c
+++ b/tests/test-string-input-visitor.c
@@ -92,19 +92,17 @@ static void test_visitor_in_intList(TestInputVisitorData 
*data,
 }
 g_assert(!tmp);

-tmp = res;
-while (tmp) {
-res = res->next;
-g_free(tmp);
-tmp = res;
-}
+qapi_free_int16List(res);

 visitor_input_teardown(data, unused);

 v = visitor_input_test_init(data, "not an int list");

+/* FIXME: res should be NULL on failure, regardless of starting value */
+res = NULL;
 visit_type_int16List(v, NULL, , );
-/* FIXME fix the visitor, then error_free_or_abort() here */
+error_free_or_abort();
+g_assert(!res);
 }

 static void test_visitor_in_bool(TestInputVisitorData *data,
-- 
2.5.5

[Qemu-devel] [PATCH v15 20/23] tests/string-input-visitor: Add negative integer tests

2016-04-27 Thread Eric Blake

From: Markus Armbruster 

Add two negative tests, one for int and one for int16List.  The latter
exposes a bug: nonsensical input results in an empty list instead of
an error.

Signed-off-by: Markus Armbruster 
Message-Id: <1461325048-14122-1-git-send-email-arm...@redhat.com>
Signed-off-by: Eric Blake 

---
v15: new patch
---
 tests/test-string-input-visitor.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tests/test-string-input-visitor.c 
b/tests/test-string-input-visitor.c
index 9e6906a..8114908 100644
--- a/tests/test-string-input-visitor.c
+++ b/tests/test-string-input-visitor.c
@@ -63,6 +63,13 @@ static void test_visitor_in_int(TestInputVisitorData *data,
 visit_type_int(v, NULL, , );
 g_assert(!err);
 g_assert_cmpint(res, ==, value);
+
+visitor_input_teardown(data, unused);
+
+v = visitor_input_test_init(data, "not an int");
+
+visit_type_int(v, NULL, , );
+error_free_or_abort();
 }

 static void test_visitor_in_intList(TestInputVisitorData *data,
@@ -70,6 +77,7 @@ static void test_visitor_in_intList(TestInputVisitorData 
*data,
 {
 int64_t value[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20};
 int16List *res = NULL, *tmp;
+Error *err = NULL;
 Visitor *v;
 int i = 0;

@@ -90,6 +98,13 @@ static void test_visitor_in_intList(TestInputVisitorData 
*data,
 g_free(tmp);
 tmp = res;
 }
+
+visitor_input_teardown(data, unused);
+
+v = visitor_input_test_init(data, "not an int list");
+
+visit_type_int16List(v, NULL, , );
+/* FIXME fix the visitor, then error_free_or_abort() here */
 }

 static void test_visitor_in_bool(TestInputVisitorData *data,
-- 
2.5.5

[Qemu-devel] [PATCH v15 15/23] qmp: Support explicit null during visits

2016-04-27 Thread Eric Blake

Implement the new type_null() callback for the qmp input and
output visitors. While we don't yet have a use for this in QAPI
input (the generator will need some tweaks first), some
potential usages have already been discussed on the list.
Meanwhile, the output visitor could already output explicit null
via type_any, but this gives us finer control.

At any rate, it's easy to test that we can round-trip an explicit
null through manual use of visit_type_null() wrapped by a virtual
visit_start_struct() walk, even if we can't do the visit in a
QAPI type.  Repurpose the test_visitor_out_empty test,
particularly since a future patch will tighten semantics to
forbid use of qmp_output_get_qobject() without at least one
intervening visit_type_*.

Signed-off-by: Eric Blake 

---
v15: hoist stubs to earlier patch, testsuite improvements
v14: rebase to header inclusion cleanups
v13: no change
v12: retitle and merge in output visitor support, move refcount
tests to separate file
[no v10, v11]
v9: new patch
---
 qapi/qmp-input-visitor.c|  8 +++-
 qapi/qmp-output-visitor.c   |  4 ++--
 tests/check-qnull.c | 13 +++--
 tests/test-qmp-input-visitor.c  | 26 ++
 tests/test-qmp-output-visitor.c | 20 +++-
 5 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 739bbd5..7e94be6 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -342,7 +342,13 @@ static void qmp_input_type_any(Visitor *v, const char 
*name, QObject **obj,

 static void qmp_input_type_null(Visitor *v, const char *name, Error **errp)
 {
-error_setg(errp, "FIXME: Implement");
+QmpInputVisitor *qiv = to_qiv(v);
+QObject *qobj = qmp_input_get_object(qiv, name, true);
+
+if (qobject_type(qobj) != QTYPE_QNULL) {
+error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
+   "null");
+}
 }

 static void qmp_input_optional(Visitor *v, const char *name, bool *present)
diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index a67e3e8..5681ad3 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -18,7 +18,6 @@
 #include "qemu/queue.h"
 #include "qemu-common.h"
 #include "qapi/qmp/types.h"
-#include "qapi/error.h"

 typedef struct QStackEntry
 {
@@ -199,7 +198,8 @@ static void qmp_output_type_any(Visitor *v, const char 
*name, QObject **obj,

 static void qmp_output_type_null(Visitor *v, const char *name, Error **errp)
 {
-error_setg(errp, "FIXME: Implement");
+QmpOutputVisitor *qov = to_qov(v);
+qmp_output_add_obj(qov, name, qnull());
 }

 /* Finish building, and return the root object. Will not be NULL. */
diff --git a/tests/check-qnull.c b/tests/check-qnull.c
index 4a1c3d8..fd9c68f 100644
--- a/tests/check-qnull.c
+++ b/tests/check-qnull.c
@@ -11,7 +11,9 @@

 #include "qapi/qmp/qobject.h"
 #include "qemu-common.h"
+#include "qapi/qmp-input-visitor.h"
 #include "qapi/qmp-output-visitor.h"
+#include "qapi/error.h"

 /*
  * Public Interface test-cases
@@ -37,6 +39,7 @@ static void qnull_visit_test(void)
 {
 QObject *obj;
 QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;

 /*
  * Most tests of interactions between QObject and visitors are in
@@ -45,13 +48,19 @@ static void qnull_visit_test(void)
  */

 g_assert(qnull_.refcnt == 1);
+obj = qnull();
+qiv = qmp_input_visitor_new(obj, true);
+qobject_decref(obj);
+visit_type_null(qmp_input_get_visitor(qiv), NULL, _abort);
+qmp_input_visitor_cleanup(qiv);
+
 qov = qmp_output_visitor_new();
-/* FIXME: Empty visits are ugly, we should have a visit_type_null(). */
+visit_type_null(qmp_output_get_visitor(qov), NULL, _abort);
 obj = qmp_output_get_qobject(qov);
 g_assert(obj == _);
 qobject_decref(obj);
-
 qmp_output_visitor_cleanup(qov);
+
 g_assert(qnull_.refcnt == 1);
 }

diff --git a/tests/test-qmp-input-visitor.c b/tests/test-qmp-input-visitor.c
index c039806..3b6ae92 100644
--- a/tests/test-qmp-input-visitor.c
+++ b/tests/test-qmp-input-visitor.c
@@ -279,6 +279,30 @@ static void test_visitor_in_any(TestInputVisitorData *data,
 qobject_decref(res);
 }

+static void test_visitor_in_null(TestInputVisitorData *data,
+ const void *unused)
+{
+Visitor *v;
+Error *err = NULL;
+char *tmp;
+
+/*
+ * FIXME: Since QAPI doesn't know the 'null' type yet, we can't
+ * test visit_type_null() by reading into a QAPI struct then
+ * checking that it was populated correctly.  The best we can do
+ * for now is ensure that we consumed null from the input, proven
+ * by the fact that we can't re-read the key.
+ */
+
+v = visitor_input_test_init(data, "{ 'a': null }");
+visit_start_struct(v, NULL, NULL, 0, _abort);
+visit_type_null(v, "a", _abort);
+visit_type_str(v, "a", , );
+g_assert(err);
+

[Qemu-devel] [PATCH v15 17/23] qmp: Add qmp_output_visitor_reset()

2016-04-27 Thread Eric Blake

Add a new qmp_output_visitor_reset(), to make it easier for a
caller to reset all state while still reusing an existing
visitor, regardless of whether the previous visit was
successfully completed.  Then use it in the testsuite.

The tests needing patching were found by tightening asserts
in the QMP output visitor; see the next patch.  An audit of
all other users of qmp_output_visitor_new() did not find any
other attempts to reuse a visitor.

Signed-off-by: Eric Blake 

---
v15: new patch, split off of v14 13/19
---
 include/qapi/qmp-output-visitor.h | 1 +
 qapi/qmp-output-visitor.c | 8 +++-
 tests/test-qmp-output-visitor.c   | 6 ++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/qapi/qmp-output-visitor.h 
b/include/qapi/qmp-output-visitor.h
index 2266770..5093f0d 100644
--- a/include/qapi/qmp-output-visitor.h
+++ b/include/qapi/qmp-output-visitor.h
@@ -21,6 +21,7 @@ typedef struct QmpOutputVisitor QmpOutputVisitor;

 QmpOutputVisitor *qmp_output_visitor_new(void);
 void qmp_output_visitor_cleanup(QmpOutputVisitor *v);
+void qmp_output_visitor_reset(QmpOutputVisitor *v);

 QObject *qmp_output_get_qobject(QmpOutputVisitor *v);
 Visitor *qmp_output_get_visitor(QmpOutputVisitor *v);
diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index 5681ad3..6c44210 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -221,7 +221,7 @@ Visitor *qmp_output_get_visitor(QmpOutputVisitor *v)
 return >visitor;
 }

-void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
+void qmp_output_visitor_reset(QmpOutputVisitor *v)
 {
 QStackEntry *e, *tmp;

@@ -231,6 +231,12 @@ void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
 }

 qobject_decref(v->root);
+v->root = NULL;
+}
+
+void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
+{
+qmp_output_visitor_reset(v);
 g_free(v);
 }

diff --git a/tests/test-qmp-output-visitor.c b/tests/test-qmp-output-visitor.c
index 8acc229..0e83099 100644
--- a/tests/test-qmp-output-visitor.c
+++ b/tests/test-qmp-output-visitor.c
@@ -139,6 +139,7 @@ static void test_visitor_out_enum(TestOutputVisitorData 
*data,
 g_assert_cmpstr(qstring_get_str(qobject_to_qstring(obj)), ==,
 EnumOne_lookup[i]);
 qobject_decref(obj);
+qmp_output_visitor_reset(data->qov);
 }
 }

@@ -153,6 +154,7 @@ static void 
test_visitor_out_enum_errors(TestOutputVisitorData *data,
 visit_type_EnumOne(data->ov, "unused", _values[i], );
 g_assert(err);
 error_free(err);
+qmp_output_visitor_reset(data->qov);
 }
 }

@@ -262,6 +264,7 @@ static void 
test_visitor_out_struct_errors(TestOutputVisitorData *data,
 visit_type_UserDefOne(data->ov, "unused", , );
 g_assert(err);
 error_free(err);
+qmp_output_visitor_reset(data->qov);
 }
 }

@@ -366,6 +369,7 @@ static void test_visitor_out_any(TestOutputVisitorData 
*data,
 qobject_decref(obj);
 qobject_decref(qobj);

+qmp_output_visitor_reset(data->qov);
 qdict = qdict_new();
 qdict_put(qdict, "integer", qint_from_int(-42));
 qdict_put(qdict, "boolean", qbool_from_bool(true));
@@ -442,6 +446,7 @@ static void 
test_visitor_out_alternate(TestOutputVisitorData *data,
 qapi_free_UserDefAlternate(tmp);
 qobject_decref(arg);

+qmp_output_visitor_reset(data->qov);
 tmp = g_new0(UserDefAlternate, 1);
 tmp->type = QTYPE_QSTRING;
 tmp->u.s = g_strdup("hello");
@@ -455,6 +460,7 @@ static void 
test_visitor_out_alternate(TestOutputVisitorData *data,
 qapi_free_UserDefAlternate(tmp);
 qobject_decref(arg);

+qmp_output_visitor_reset(data->qov);
 tmp = g_new0(UserDefAlternate, 1);
 tmp->type = QTYPE_QDICT;
 tmp->u.udfu.integer = 1;
-- 
2.5.5

Re: [Qemu-devel] [RFC for-2.7 01/11] qdt: IEEE1275-style device tree utility code

2016-04-27 Thread David Gibson

On Wed, Apr 27, 2016 at 09:28:57AM +0200, Markus Armbruster wrote:
> Thomas Huth  writes:
> 
> > On 27.04.2016 08:43, Markus Armbruster wrote:
> >> David Gibson  writes:
> >> 
> >>> On Tue, Apr 26, 2016 at 01:00:06PM +0200, Thomas Huth wrote:
>  On 20.04.2016 04:33, David Gibson wrote:
> > [...]
> > +/*
> > + * Property functions
> > + */
> > +
> > +QDTProperty *qdt_new_property(const gchar *name, gconstpointer val, 
> > gsize len)
> > +{
> > +QDTProperty *prop = g_malloc0(sizeof(*prop) + len);
> > +
> > +prop->name = g_strdup(name);
> > +prop->len = len;
> > +memcpy(prop->val, val, len);
> > +return prop;
> > +}
> > +
> > +static QDTProperty *getprop_(const QDTNode *node, const gchar *name)
> 
>  Underscore at the end looks somewhat strange ... can't you simply drop 
>  that?
> >>>
> >>> Well.. the idea was that the _ versions are the "internal" ones,
> >>> whereas external users will generally use the non-underscore version
> >> 
> >> I've seen that convention used before.  It's fine with me.
> >
> > Can't remember to have seen that convention before ... I know that some
> > people use the underscore at the beginning to mark an internal function,
> > but at the end?
> > So if you really want to use the underscore, what about putting it at
> > the beginning instead?
> 
> C99 7.1.3  Reserved identifiers:
> 
>  -- All identifiers  that  begin  with  an  underscore  are
> always  reserved for use as identifiers with file scope
> in both the ordinary and tag name spaces.

Right.  The kernel uses the _ prefix convention, but it can kind of
get away with it, because it doesn't use the standard library.  For
things in userspace, _ prefixed identifiers are reserved, hence using
the _ suffix instead.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH v15 13/23] tests: Add check-qnull

2016-04-27 Thread Eric Blake

Add a new test, for checking reference counting of qnull(). As
part of the new file, move a previous reference counting change
added in commit a861564 to a more logical place.

Note that while most of the check-q*.c leave visitor stuff to
the test-qmp-*-visitor.c, in this case we actually want the
visitor tests in our new file because we are validating the
reference count of qnull_, which is an internal detail that
test-qmp-*-visitor should not be peeking into (or put another
way, qnull() is the only special case where we don't have
independent allocation of a QObject, so none of the other
visitor tests require the layering violation present in this
test).

Signed-off-by: Eric Blake 

---
v15: add comment, commit message tweak
v14: no change
v13: no change
v12: new patch
---
 tests/check-qnull.c | 66 +
 tests/test-qmp-output-visitor.c |  2 --
 tests/.gitignore|  1 +
 tests/Makefile  |  6 +++-
 4 files changed, 72 insertions(+), 3 deletions(-)
 create mode 100644 tests/check-qnull.c

diff --git a/tests/check-qnull.c b/tests/check-qnull.c
new file mode 100644
index 000..4a1c3d8
--- /dev/null
+++ b/tests/check-qnull.c
@@ -0,0 +1,66 @@
+/*
+ * QNull unit-tests.
+ *
+ * Copyright (C) 2016 Red Hat Inc.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+
+#include "qapi/qmp/qobject.h"
+#include "qemu-common.h"
+#include "qapi/qmp-output-visitor.h"
+
+/*
+ * Public Interface test-cases
+ *
+ * (with some violations to access 'private' data)
+ */
+
+static void qnull_ref_test(void)
+{
+QObject *obj;
+
+g_assert(qnull_.refcnt == 1);
+obj = qnull();
+g_assert(obj);
+g_assert(obj == _);
+g_assert(qnull_.refcnt == 2);
+g_assert(qobject_type(obj) == QTYPE_QNULL);
+qobject_decref(obj);
+g_assert(qnull_.refcnt == 1);
+}
+
+static void qnull_visit_test(void)
+{
+QObject *obj;
+QmpOutputVisitor *qov;
+
+/*
+ * Most tests of interactions between QObject and visitors are in
+ * test-qmp-*-visitor; but these tests live here because they
+ * depend on layering violations to check qnull_ refcnt.
+ */
+
+g_assert(qnull_.refcnt == 1);
+qov = qmp_output_visitor_new();
+/* FIXME: Empty visits are ugly, we should have a visit_type_null(). */
+obj = qmp_output_get_qobject(qov);
+g_assert(obj == _);
+qobject_decref(obj);
+
+qmp_output_visitor_cleanup(qov);
+g_assert(qnull_.refcnt == 1);
+}
+
+int main(int argc, char **argv)
+{
+g_test_init(, , NULL);
+
+g_test_add_func("/public/qnull_ref", qnull_ref_test);
+g_test_add_func("/public/qnull_visit", qnull_visit_test);
+
+return g_test_run();
+}
diff --git a/tests/test-qmp-output-visitor.c b/tests/test-qmp-output-visitor.c
index c709267..fddb5a6 100644
--- a/tests/test-qmp-output-visitor.c
+++ b/tests/test-qmp-output-visitor.c
@@ -484,8 +484,6 @@ static void test_visitor_out_empty(TestOutputVisitorData 
*data,

 arg = qmp_output_get_qobject(data->qov);
 g_assert(qobject_type(arg) == QTYPE_QNULL);
-/* Check that qnull reference counting is sane */
-g_assert(arg->refcnt == 2);
 qobject_decref(arg);
 }

diff --git a/tests/.gitignore b/tests/.gitignore
index 9eed229..a06a8ba 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -3,6 +3,7 @@ check-qfloat
 check-qint
 check-qjson
 check-qlist
+check-qnull
 check-qstring
 check-qom-interface
 check-qom-proplist
diff --git a/tests/Makefile b/tests/Makefile
index 9194f18..9e6 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -16,6 +16,8 @@ check-unit-y += tests/check-qstring$(EXESUF)
 gcov-files-check-qstring-y = qobject/qstring.c
 check-unit-y += tests/check-qlist$(EXESUF)
 gcov-files-check-qlist-y = qobject/qlist.c
+check-unit-y += tests/check-qnull$(EXESUF)
+gcov-files-check-qnull-y = qobject/qnull.c
 check-unit-y += tests/check-qjson$(EXESUF)
 gcov-files-check-qjson-y = qobject/qjson.c
 check-unit-y += tests/test-qmp-output-visitor$(EXESUF)
@@ -382,7 +384,8 @@ GENERATED_HEADERS += tests/test-qapi-types.h 
tests/test-qapi-visit.h \
tests/test-qmp-introspect.h

 test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \
-   tests/check-qlist.o tests/check-qfloat.o tests/check-qjson.o \
+   tests/check-qlist.o tests/check-qfloat.o tests/check-qnull.o \
+   tests/check-qjson.o \
tests/test-coroutine.o tests/test-string-output-visitor.o \
tests/test-string-input-visitor.o tests/test-qmp-output-visitor.o \
tests/test-qmp-input-visitor.o tests/test-qmp-input-strict.o \
@@ -410,6 +413,7 @@ tests/check-qstring$(EXESUF): tests/check-qstring.o 
$(test-util-obj-y)
 tests/check-qdict$(EXESUF): tests/check-qdict.o $(test-util-obj-y)
 tests/check-qlist$(EXESUF): tests/check-qlist.o $(test-util-obj-y)

[Qemu-devel] [PATCH v15 11/23] qmp-input: Refactor when list is advanced

2016-04-27 Thread Eric Blake

In the QMP input visitor, visiting a list traverses two objects:
the QAPI GenericList of the caller (which gets advanced in
visit_next_list() regardless of this patch), and the QList input
that we are converting to QAPI.  For consistency with QDict
visits, we want to consume elements from the input QList during
the visit_type_FOO() for the list element; that is, we want ALL
the code for consuming an input to live in qmp_input_get_object(),
rather than having it split according to whether we are visiting
a dict or a list.  Making qmp_input_get_object() the common point
of consumption will make it easier for a later patch to refactor
visit_start_list() to cover the GenericList * head of a QAPI list,
and in turn will get rid of the 'first' flag (which lived in
qmp_input_next_list() pre-patch, and is hoisted to StackObject
by this patch).

This patch is therefore shifting the post-condition use of
'entry' from:

start_list next_list type_ELT ... next_list type_ELT end_list
 entry  NULL   1st elt   1st elt  last elt  last elt gone

where type_ELT() returns (entry ? entry : 1st elt) and next_list() steps
entry

to this usage:

start_list next_list type_ELT ... next_list type_ELT end_list
 entry  1st elt1nd elt   2nd elt  last elt  NULL gone

where type_ELT() steps entry and returns the old entry, and next_list()
leaves entry alone.

Signed-off-by: Eric Blake 

---
v15: rebase, improve commit message, hoist root handling earlier
so that this patch is more pinpointed
v14: no change
v13: no change
v12: new patch
---
 qapi/qmp-input-visitor.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 6409a83..eb77adc 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -29,6 +29,7 @@ typedef struct StackObject

 GHashTable *h;   /* If obj is dict: unvisited keys */
 const QListEntry *entry; /* If obj is list: unvisited tail */
+bool first;  /* If obj is list: will next_list() visit head */
 } StackObject;

 struct QmpInputVisitor
@@ -80,7 +81,11 @@ static QObject *qmp_input_get_object(QmpInputVisitor *qiv,
 } else {
 assert(qobject_type(qobj) == QTYPE_QLIST);
 assert(!name);
+assert(!tos->first);
 ret = qlist_entry_obj(tos->entry);
+if (consume) {
+tos->entry = qlist_next(tos->entry);
+}
 }

 return ret;
@@ -104,13 +109,16 @@ static void qmp_input_push(QmpInputVisitor *qiv, QObject 
*obj, Error **errp)
 }

 tos->obj = obj;
-tos->entry = NULL;
-tos->h = NULL;
+assert(!tos->h);
+assert(!tos->entry);

 if (qiv->strict && qobject_type(obj) == QTYPE_QDICT) {
 h = g_hash_table_new(g_str_hash, g_str_equal);
 qdict_iter(qobject_to_qdict(obj), qdict_add_key, h);
 tos->h = h;
+} else if (qobject_type(obj) == QTYPE_QLIST) {
+tos->entry = qlist_first(qobject_to_qlist(obj));
+tos->first = true;
 }

 qiv->nb_stack++;
@@ -119,10 +127,11 @@ static void qmp_input_push(QmpInputVisitor *qiv, QObject 
*obj, Error **errp)

 static void qmp_input_pop(QmpInputVisitor *qiv, Error **errp)
 {
+StackObject *tos = >stack[qiv->nb_stack - 1];
 assert(qiv->nb_stack > 0);

 if (qiv->strict) {
-GHashTable * const top_ht = qiv->stack[qiv->nb_stack - 1].h;
+GHashTable * const top_ht = tos->h;
 if (top_ht) {
 GHashTableIter iter;
 const char *key;
@@ -133,6 +142,7 @@ static void qmp_input_pop(QmpInputVisitor *qiv, Error 
**errp)
 }
 g_hash_table_unref(top_ht);
 }
+tos->h = NULL;
 }

 qiv->nb_stack--;
@@ -192,23 +202,15 @@ static GenericList *qmp_input_next_list(Visitor *v, 
GenericList **list,
 QmpInputVisitor *qiv = to_qiv(v);
 GenericList *entry;
 StackObject *so = >stack[qiv->nb_stack - 1];
-bool first;

-if (so->entry == NULL) {
-so->entry = qlist_first(qobject_to_qlist(so->obj));
-first = true;
-} else {
-so->entry = qlist_next(so->entry);
-first = false;
-}
-
-if (so->entry == NULL) {
+if (!so->entry) {
 return NULL;
 }

 entry = g_malloc0(size);
-if (first) {
+if (so->first) {
 *list = entry;
+so->first = false;
 } else {
 (*list)->next = entry;
 }
-- 
2.5.5

[Qemu-devel] [PATCH v15 12/23] qapi: Document visitor interfaces, add assertions

2016-04-27 Thread Eric Blake

The visitor interface for mapping between QObject/QemuOpts/string
and QAPI is scandalously under-documented, making changes to visitor
core, individual visitors, and users of visitors difficult to
coordinate.  Among other questions: when is it safe to pass NULL,
vs. when a string must be provided; which visitors implement which
callbacks; the difference between concrete and virtual visits.

Correct this by retrofitting proper contracts, and document where some
of the interface warts remain (for example, we may want to modify
visit_end_* to require the same 'obj' as the visit_start counterpart,
so the dealloc visitor can be simplified).  Later patches in this
series will tackle some, but not all, of these warts.

Add assertions to (partially) enforce the contract.  Some of these
were only made possible by recent cleanup commits.

Signed-off-by: Eric Blake 

---
v15: wording tweaks based on review, rebase to some assertions going
in earlier
v14: rebase to master context
v13: minor wording tweaks for consistency
v12: major rework based on Markus' comments, drop R-b
[no v10, v11]
v9: no change
v8: rebase to 'name' motion
v7: retitle; more wording changes, add asserts to enforce the
wording, place later in series to rebase on fixes that would
otherwise trip the new assertions
v6: mention that input visitors blindly assign over *obj; wording
improvements
---
 include/qapi/visitor.h   | 445 +--
 include/qapi/visitor-impl.h  |  44 +++-
 include/qapi/dealloc-visitor.h   |   5 +
 include/qapi/opts-visitor.h  |   3 +
 include/qapi/string-input-visitor.h  |   4 +
 include/qapi/string-output-visitor.h |   4 +
 qapi/qapi-visit-core.c   |  18 +-
 7 files changed, 494 insertions(+), 29 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 690589f..0e028ba 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -16,8 +16,194 @@

 #include "qapi/qmp/qobject.h"

+/*
+ * The QAPI schema defines both a set of C data types, and a QMP wire
+ * format.  QAPI objects can contain references to other QAPI objects,
+ * resulting in a directed acyclic graph.  QAPI also generates visitor
+ * functions to walk these graphs.  This file represents the interface
+ * for doing work at each node of a QAPI graph; it can also be used
+ * for a virtual walk, where there is no actual QAPI C struct.
+ *
+ * There are three kinds of visitor classes: input visitors (QMP,
+ * string, and QemuOpts) parse an external representation and build
+ * the corresponding QAPI graph, output visitors (QMP and string) take
+ * a completed QAPI graph and generate an external representation, and
+ * the dealloc visitor can take a QAPI graph (possibly partially
+ * constructed) and recursively free its resources.  While the dealloc
+ * and QMP input/output visitors are general, the string and QemuOpts
+ * visitors have some implementation limitations; see the
+ * documentation for each visitor for more details on what it
+ * supports.  Also, see visitor-impl.h for the callback contracts
+ * implemented by each visitor, and docs/qapi-code-gen.txt for more
+ * about the QAPI code generator.
+ *
+ * All QAPI types have a corresponding function with a signature
+ * roughly compatible with this:
+ *
+ * void visit_type_FOO(Visitor *v, const char *name, T obj, Error **errp);
+ *
+ * where T is FOO for scalar types, and FOO * otherwise.  The scalar
+ * visitors are declared here; the remaining visitors are generated in
+ * qapi-visit.h.
+ *
+ * The @name parameter of visit_type_FOO() describes the relation
+ * between this QAPI value and its parent container.  When visiting
+ * the root of a tree, @name is ignored; when visiting a member of an
+ * object, @name is the key associated with the value; and when
+ * visiting a member of a list, @name is NULL.
+ *
+ * FIXME: Clients must pass NULL for @name when visiting a member of a
+ * list, but this leads to poor error messages; it might be nicer to
+ * require a non-NULL name such as "key.0" for '{ "key": [ "value" ]
+ * }' if an error is encountered on "value" (or to have the visitor
+ * core auto-generate the nicer name).
+ *
+ * The visit_type_FOO() functions expect a non-null @obj argument;
+ * they allocate *@obj during input visits, leave it unchanged on
+ * output visits, and recursively free any resources during a dealloc
+ * visit.  Each function also takes the customary @errp argument (see
+ * qapi/error.h for details), for reporting any errors (such as if a
+ * member @name is not present, or is present but not the specified
+ * type).
+ *
+ * FIXME: At present, input visitors may allocate an incomplete *@obj
+ * even when visit_type_FOO() reports an error.  Using an output
+ * visitor with an incomplete object has undefined behavior; callers
+ * must call qapi_free_FOO() (which uses the dealloc visitor, and
+ * safely handles an incomplete object) to avoid a memory leak.
+ *
+ *

[Qemu-devel] [PATCH v15 04/23] qmp-input: Clean up stack handling

2016-04-27 Thread Eric Blake

Management of the top of stack was a bit verbose; creating a
temporary variable and adding some comments makes the existing
code more legible before the next few patches improve things.
No semantic changes other than asserting that we are always
visiting a QObject, and not a NULL value.  In particular, the
check for 'name && qobject_type(qobj) == QTYPE_QDICT)' is a
bit overkill (a dict visit should always have a name); a later
patch revisits that, while this patch is only changing one
layer of indentation due to dropping 'if (qobj)'.

Signed-off-by: Eric Blake 

---
v15: comment improvements
v14: no change
v13: no change
v12: new patch
---
 qapi/qmp-input-visitor.c | 51 +---
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 77cce8b..8550bc7 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -25,16 +25,23 @@

 typedef struct StackObject
 {
-QObject *obj;
-const QListEntry *entry;
-GHashTable *h;
+QObject *obj; /* Object being visited */
+
+GHashTable *h;   /* If obj is dict: unvisited keys */
+const QListEntry *entry; /* If obj is list: unvisited tail */
 } StackObject;

 struct QmpInputVisitor
 {
 Visitor visitor;
+
+/* Stack of objects being visited.  stack[0] is root of visit,
+ * stack[1..] records the nesting of start_struct()/end_struct()
+ * and start_list()/end_list() pairs. */
 StackObject stack[QIV_STACK_SIZE];
 int nb_stack;
+
+/* True to reject parse in visit_end_struct() if unvisited keys remain. */
 bool strict;
 };

@@ -47,19 +54,29 @@ static QObject *qmp_input_get_object(QmpInputVisitor *qiv,
  const char *name,
  bool consume)
 {
-QObject *qobj = qiv->stack[qiv->nb_stack - 1].obj;
+StackObject *tos = >stack[qiv->nb_stack - 1];
+QObject *qobj = tos->obj;

-if (qobj) {
-if (name && qobject_type(qobj) == QTYPE_QDICT) {
-if (qiv->stack[qiv->nb_stack - 1].h && consume) {
-g_hash_table_remove(qiv->stack[qiv->nb_stack - 1].h, name);
-}
-return qdict_get(qobject_to_qdict(qobj), name);
-} else if (qiv->stack[qiv->nb_stack - 1].entry) {
-return qlist_entry_obj(qiv->stack[qiv->nb_stack - 1].entry);
+assert(qobj);
+
+/* If we have a name, and we're in a dictionary, then return that
+ * value. */
+if (name && qobject_type(qobj) == QTYPE_QDICT) {
+if (tos->h && consume) {
+g_hash_table_remove(tos->h, name);
 }
+return qdict_get(qobject_to_qdict(qobj), name);
 }

+/* If we are in the middle of a list, then return the next element
+ * of the list. */
+if (tos->entry) {
+assert(qobject_type(qobj) == QTYPE_QLIST);
+return qlist_entry_obj(tos->entry);
+}
+
+/* Otherwise, we are at the root of the visit or the start of a
+ * list, and return the object as-is. */
 return qobj;
 }

@@ -72,20 +89,22 @@ static void qdict_add_key(const char *key, QObject *obj, 
void *opaque)
 static void qmp_input_push(QmpInputVisitor *qiv, QObject *obj, Error **errp)
 {
 GHashTable *h;
+StackObject *tos = >stack[qiv->nb_stack];

+assert(obj);
 if (qiv->nb_stack >= QIV_STACK_SIZE) {
 error_setg(errp, "An internal buffer overran");
 return;
 }

-qiv->stack[qiv->nb_stack].obj = obj;
-qiv->stack[qiv->nb_stack].entry = NULL;
-qiv->stack[qiv->nb_stack].h = NULL;
+tos->obj = obj;
+tos->entry = NULL;
+tos->h = NULL;

 if (qiv->strict && qobject_type(obj) == QTYPE_QDICT) {
 h = g_hash_table_new(g_str_hash, g_str_equal);
 qdict_iter(qobject_to_qdict(obj), qdict_add_key, h);
-qiv->stack[qiv->nb_stack].h = h;
+tos->h = h;
 }

 qiv->nb_stack++;
-- 
2.5.5

[Qemu-devel] [PATCH v15 01/23] qapi-visit: Add visitor.type classification

2016-04-27 Thread Eric Blake

We have three classes of QAPI visitors: input, output, and dealloc.
Currently, all implementations of these visitors have one thing in
common based on their visitor type: the implementation used for the
visit_type_enum() callback.  But since we plan to add more such
common behavior, in relation to documenting and further refining
the semantics, it makes more sense to have the visitor
implementations advertise which class they belong to, so the common
qapi-visit-core code can use that information in multiple places.

A later patch will better document the types of visitors directly
in visitor.h.

For this patch, knowing the class of a visitor implementation lets
us make input_type_enum() and output_type_enum() become static
functions, by replacing the callback function Visitor.type_enum()
with the simpler enum member Visitor.type.  Share a common
assertion in qapi-visit-core as part of the refactoring.

Move comments in opts-visitor.c to match the refactored layout.

Signed-off-by: Eric Blake 

---
v15: comment improvements
v14: no change
v13: no change
v12: new patch
---
 include/qapi/visitor.h   | 11 +++
 include/qapi/visitor-impl.h  | 23 ++-
 qapi/qapi-visit-core.c   | 28 +++-
 qapi/opts-visitor.c  | 17 +++--
 qapi/qapi-dealloc-visitor.c  |  7 +--
 qapi/qmp-input-visitor.c |  2 +-
 qapi/qmp-output-visitor.c|  2 +-
 qapi/string-input-visitor.c  |  2 +-
 qapi/string-output-visitor.c |  2 +-
 9 files changed, 52 insertions(+), 42 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 9a8d010..690589f 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -78,8 +78,19 @@ void visit_end_alternate(Visitor *v);
  */
 bool visit_optional(Visitor *v, const char *name, bool *present);

+/*
+ * Visit an enum value.
+ *
+ * @strings expresses the mapping between C enum values and QAPI enum
+ * names; it should be the ENUM_lookup array from visit-types.h.
+ *
+ * May call visit_type_str() under the hood, and the enum visit may
+ * fail even if the corresponding string visit succeeded; this implies
+ * that visit_type_str() must have no unwelcome side effects.
+ */
 void visit_type_enum(Visitor *v, const char *name, int *obj,
  const char *const strings[], Error **errp);
+
 void visit_type_int(Visitor *v, const char *name, int64_t *obj, Error **errp);
 void visit_type_uint8(Visitor *v, const char *name, uint8_t *obj,
   Error **errp);
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 2bd8f29..51c338a 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -14,6 +14,17 @@

 #include "qapi/visitor.h"

+/*
+ * There are three classes of visitors; setting the class determines
+ * how QAPI enums are visited, as well as what additional restrictions
+ * can be asserted.
+ */
+typedef enum VisitorType {
+VISITOR_INPUT,
+VISITOR_OUTPUT,
+VISITOR_DEALLOC,
+} VisitorType;
+
 struct Visitor
 {
 /* Must be set */
@@ -36,10 +47,6 @@ struct Visitor
 void (*end_alternate)(Visitor *v);

 /* Must be set. */
-void (*type_enum)(Visitor *v, const char *name, int *obj,
-  const char *const strings[], Error **errp);
-
-/* Must be set. */
 void (*type_int64)(Visitor *v, const char *name, int64_t *obj,
Error **errp);
 /* Must be set. */
@@ -58,11 +65,9 @@ struct Visitor

 /* May be NULL; most useful for input visitors. */
 void (*optional)(Visitor *v, const char *name, bool *present);
+
+/* Must be set */
+VisitorType type;
 };

-void input_type_enum(Visitor *v, const char *name, int *obj,
- const char *const strings[], Error **errp);
-void output_type_enum(Visitor *v, const char *name, int *obj,
-  const char *const strings[], Error **errp);
-
 #endif
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index fa680c9..3cd7edc 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -72,12 +72,6 @@ bool visit_optional(Visitor *v, const char *name, bool 
*present)
 return *present;
 }

-void visit_type_enum(Visitor *v, const char *name, int *obj,
- const char *const strings[], Error **errp)
-{
-v->type_enum(v, name, obj, strings, errp);
-}
-
 void visit_type_int(Visitor *v, const char *name, int64_t *obj, Error **errp)
 {
 v->type_int64(v, name, obj, errp);
@@ -208,14 +202,13 @@ void visit_type_any(Visitor *v, const char *name, QObject 
**obj, Error **errp)
 v->type_any(v, name, obj, errp);
 }

-void output_type_enum(Visitor *v, const char *name, int *obj,
-  const char *const strings[], Error **errp)
+static void output_type_enum(Visitor *v, const char *name, int *obj,
+ const char *const strings[], Error **errp)
 {
 int i = 0;
 int value = *obj;
 char

[Qemu-devel] [PATCH v15 16/23] spapr_drc: Expose 'null' in qom-get when there is no fdt

2016-04-27 Thread Eric Blake

Now that the QMP output visitor supports an explicit null
output, we should utilize it to make it easier to diagnose
the difference between a missing fdt ('null') vs. a
present-but-empty one ('{}').

(Note that this reverts the behavior of commit ab8bf1d, taking
us back to the behavior of commit 6c2f9a1 [which in turn
stemmed from a crash fix in 1d10b44]; but that this time,
the change is intentional and not an accidental side-effect.)

Signed-off-by: Eric Blake 
Acked-by: David Gibson 

---
v15: no change
v14: no change
v13: no change
v12: tweak commit message
[no v10, v11]
v9: improved commit message
v8: rebase to 'name' motion
v7: new patch, based on discussion about spapr_drc.c
---
 hw/ppc/spapr_drc.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 3173940..72d725c 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -269,11 +269,7 @@ static void prop_get_fdt(Object *obj, Visitor *v, const 
char *name,
 void *fdt;

 if (!drc->fdt) {
-visit_start_struct(v, name, NULL, 0, );
-if (!err) {
-visit_end_struct(v, );
-}
-error_propagate(errp, err);
+visit_type_null(v, NULL, errp);
 return;
 }

-- 
2.5.5

[Qemu-devel] [PATCH v15 08/23] monitor: Let generated code validate arguments

2016-04-27 Thread Eric Blake

Having to manually call out the set of expected arguments in
qmp-commands.hx, in addition to what is already in *.json,
is tedious and prone to error.  The only reason we use
.args_type is for checking if there is any excess arguments
or incorrectly typed arguments during qmp_check_client_args(),
but a strict QMP Input visitor also does that checking.
Simplify things so that .args_type is only needed for the
few remaining commands that don't have a QAPI description
(namely, device_add, netdev_add, qmp_capabilities) or which
use 'gen':false (query-qmp-schema).

There was one case where the generated marshal code has to be
beefed up: for a command that has no (or empty) 'data' in the
.json file, we were just ignoring the passed-in QDict; now we
need to validate that there are no extra arguments.

Generated code changes like:

|@@ -1206,7 +1206,10 @@ void qmp_marshal_cont(QDict *args, QObje
| {
| Error *err = NULL;
|
|-(void)args;
|+if (args && qdict_size(args)) {
|+error_setg(errp, "Command %s does not take arguments", "cont");
|+return;
|+}
|
| qmp_cont();
| error_propagate(errp, err);

QMP behavior for no-arg commands changes from:

{"execute":"query-version","arguments":{"a":1}}
{"error": {"class": "GenericError", "desc": "Invalid parameter 'a'"}}

to:

{"execute":"query-version","arguments":{"a":1}}
{"error": {"class": "GenericError", "desc": "Command 'query-version' does not 
take arguments"}}

and for commands with at least one (possibly-optional) argument,
the output changes from:

{"execute":"query-command-line-options","arguments":{"a":1}}
{"error": {"class": "GenericError", "desc": "Invalid parameter 'a'"}}

to:

{"execute":"query-command-line-options","arguments":{"a":1}}
{"error": {"class": "GenericError", "desc": "QMP input object member 'a' is 
unexpected"}}

Signed-off-by: Eric Blake 

---
v15: new patch
---
 scripts/qapi-commands.py |   8 ++-
 monitor.c|   4 ++
 qmp-commands.hx  | 148 +--
 3 files changed, 12 insertions(+), 148 deletions(-)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 04549fa..beb34c5 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -136,8 +136,12 @@ def gen_marshal(name, arg_type, ret_type):
 else:
 ret += mcgen('''

-(void)args;
-''')
+if (args && qdict_size(args)) {
+error_setg(errp, "Command '%%s' does not take arguments", "%(name)s");
+return;
+}
+''',
+ name=name)

 ret += gen_call(name, arg_type, ret_type)

diff --git a/monitor.c b/monitor.c
index d1c1930..bc8ff5e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3783,6 +3783,7 @@ out:
 /*
  * Client argument checking rules:
  *
+ * 0. If args_type is NULL, rely on the generated QAPI to do the check
  * 1. Client must provide all mandatory arguments
  * 2. Each argument provided by the client must be expected
  * 3. Each argument provided by the client must have the type expected
@@ -3795,6 +3796,9 @@ static void qmp_check_client_args(const mon_cmd_t *cmd, 
QDict *client_args,
 int flags;
 QDict *cmd_args;

+if (!cmd->args_type) {
+return;
+}
 cmd_args = qdict_from_args_type(cmd->args_type);

 flags = 0;
diff --git a/qmp-commands.hx b/qmp-commands.hx
index de896a5..49189ce 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -62,7 +62,6 @@ EQMP

 {
 .name   = "quit",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_quit,
 },

@@ -83,7 +82,6 @@ EQMP

 {
 .name   = "eject",
-.args_type  = "force:-f,device:B",
 .mhandler.cmd_new = qmp_marshal_eject,
 },

@@ -109,7 +107,6 @@ EQMP

 {
 .name   = "change",
-.args_type  = "device:B,target:F,arg:s?",
 .mhandler.cmd_new = qmp_marshal_change,
 },

@@ -145,7 +142,6 @@ EQMP

 {
 .name   = "screendump",
-.args_type  = "filename:F",
 .mhandler.cmd_new = qmp_marshal_screendump,
 },

@@ -168,7 +164,6 @@ EQMP

 {
 .name   = "stop",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_stop,
 },

@@ -189,7 +184,6 @@ EQMP

 {
 .name   = "cont",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_cont,
 },

@@ -210,7 +204,6 @@ EQMP

 {
 .name   = "system_wakeup",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_system_wakeup,
 },

@@ -231,7 +224,6 @@ EQMP

 {
 .name   = "system_reset",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_system_reset,
 },

@@ -252,7 +244,6 @@ EQMP

 {
 .name   = "system_powerdown",
-.args_type  = "",
 .mhandler.cmd_new = qmp_marshal_system_powerdown,
 },

@@ -309,7 +300,6 @@ EQMP

 {
 .name   = "device_del",
-.args_type  = "id:s",

[Qemu-devel] [PATCH v15 05/23] qapi: Use strict QMP input visitor in more places

2016-04-27 Thread Eric Blake

Rather than having two separate ways to create a QMP input
visitor, where the safer approach has the more verbose name,
it is better to consolidate things into a single function
where the caller must explicitly choose whether to be strict
or to ignore excess input.  Use strict mode in more places of
internal code (such as when cloning a QAPI struct in
util/socket.c, where the QObject better not have excess
input since it was just generated by qmp-output), while
documenting in user-facing code a question of whether we
should change our policy about ignoring excess input.

In the case of qmp_object_add(), we intentionally switch to a
strict visitor; this matches the fact that the code for
user_creatable_add_type() is shared by both qmp_object_add()
(QMP input visitor) and by user_creatable_add_opts() (QemuOpts
visitor); the latter is always strict, so our usage in the
former should also be strict, so that both visits will
equally diagnose any excess input in a nested dict.  But in
practice, we don't really have any -object where the
properties are a nested dict, and excess input at the top
level is already caught earlier by object_property_set() on
an unknown key, whether from QemuOpts:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio -object 
secret,id=sec0,data=letmein,format=raw,foo=bar
qemu-system-x86_64: Property '.foo' not found

or from QMP:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 93, "minor": 5, "major": 2}, "package": 
""}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"object-add","arguments":{"qom-type":"secret","id":"sec0","props":{"format":"raw","data":"letmein","a":1}}}
{"error": {"class": "GenericError", "desc": "Property '.a' not found"}}

Signed-off-by: Eric Blake 

---
v15: new patch
---
 scripts/qapi-commands.py   |  2 +-
 include/qapi/qmp-input-visitor.h   |  9 +++--
 qapi/qmp-input-visitor.c   | 13 ++---
 qmp.c  |  2 +-
 qom/qom-qobject.c  |  3 ++-
 replay/replay-input.c  |  2 +-
 tests/test-qmp-commands.c  |  2 +-
 tests/test-qmp-input-strict.c  |  2 +-
 tests/test-qmp-input-visitor.c |  2 +-
 tests/test-visitor-serialization.c |  2 +-
 util/qemu-sockets.c|  2 +-
 docs/qapi-code-gen.txt |  2 +-
 12 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index b570069..6261e44 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -115,7 +115,7 @@ def gen_marshal(name, arg_type, ret_type):

 if arg_type and arg_type.members:
 ret += mcgen('''
-QmpInputVisitor *qiv = qmp_input_visitor_new_strict(QOBJECT(args));
+QmpInputVisitor *qiv = qmp_input_visitor_new(QOBJECT(args), true);
 QapiDeallocVisitor *qdv;
 Visitor *v;
 %(c_name)s arg = {0};
diff --git a/include/qapi/qmp-input-visitor.h b/include/qapi/qmp-input-visitor.h
index 3ed499c..b0624d8 100644
--- a/include/qapi/qmp-input-visitor.h
+++ b/include/qapi/qmp-input-visitor.h
@@ -19,8 +19,13 @@

 typedef struct QmpInputVisitor QmpInputVisitor;

-QmpInputVisitor *qmp_input_visitor_new(QObject *obj);
-QmpInputVisitor *qmp_input_visitor_new_strict(QObject *obj);
+/*
+ * Return a new input visitor that converts QMP to QAPI.
+ *
+ * Set @strict to reject a parse that doesn't consume all keys of a
+ * dictionary; otherwise excess input is ignored.
+ */
+QmpInputVisitor *qmp_input_visitor_new(QObject *obj, bool strict);

 void qmp_input_visitor_cleanup(QmpInputVisitor *v);

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 8550bc7..c3c3271 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -356,7 +356,7 @@ void qmp_input_visitor_cleanup(QmpInputVisitor *v)
 g_free(v);
 }

-QmpInputVisitor *qmp_input_visitor_new(QObject *obj)
+QmpInputVisitor *qmp_input_visitor_new(QObject *obj, bool strict)
 {
 QmpInputVisitor *v;

@@ -376,19 +376,10 @@ QmpInputVisitor *qmp_input_visitor_new(QObject *obj)
 v->visitor.type_number = qmp_input_type_number;
 v->visitor.type_any = qmp_input_type_any;
 v->visitor.optional = qmp_input_optional;
+v->strict = strict;

 qmp_input_push(v, obj, NULL);
 qobject_incref(obj);

 return v;
 }
-
-QmpInputVisitor *qmp_input_visitor_new_strict(QObject *obj)
-{
-QmpInputVisitor *v;
-
-v = qmp_input_visitor_new(obj);
-v->strict = true;
-
-return v;
-}
diff --git a/qmp.c b/qmp.c
index 9d0953b..e784a67 100644
--- a/qmp.c
+++ b/qmp.c
@@ -663,7 +663,7 @@ void qmp_object_add(const char *type, const char *id,
 }
 }

-qiv = qmp_input_visitor_new(props);
+qiv = qmp_input_visitor_new(props, true);
 obj = user_creatable_add_type(type, id, pdict,
   qmp_input_get_visitor(qiv), errp);
 qmp_input_visitor_cleanup(qiv);

[Qemu-devel] [PATCH v15 02/23] qapi: Guarantee NULL obj on input visitor callback error

2016-04-27 Thread Eric Blake

Our existing input visitors were not very consistent on errors
in a function taking 'TYPE **obj' (that is, start_struct(),
start_alternate(), next_list(), type_str(), and type_any()).
While all of them set '*obj' to allocated storage on success,
it was not obvious whether '*obj' was guaranteed safe on failure,
or whether it was left uninitialized.  But a future patch wants
to guarantee that visit_type_FOO() does not leak a partially-
constructed obj back to the caller; it is easier to implement
this if we can reliably state that '*obj' is assigned on exit,
even on failures.  Add assertions to enforce it.

The opts-visitor start_struct() doesn't set an error, but it
also was doing a weird check for 0 size; all callers pass in
non-zero size if obj is non-NULL.

The testsuite has at least one spot where we no longer need
to pre-initialize a variable prior to a visit; valgrind confirms
that the test is still fine with the cleanup.

A later patch will document the design constraint implemented
here.

Signed-off-by: Eric Blake 

---
v15: enhance commit message, hoist assertions from later in series
v14: no change
v13: no change
v12: new patch
---
 qapi/qapi-visit-core.c| 34 ++
 qapi/opts-visitor.c   |  3 ++-
 qapi/qmp-input-visitor.c  |  4 
 qapi/string-input-visitor.c   |  1 +
 tests/test-qmp-input-strict.c |  2 +-
 5 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 3cd7edc..3a131ce 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -23,7 +23,13 @@
 void visit_start_struct(Visitor *v, const char *name, void **obj,
 size_t size, Error **errp)
 {
-v->start_struct(v, name, obj, size, errp);
+Error *err = NULL;
+
+v->start_struct(v, name, obj, size, );
+if (obj && v->type == VISITOR_INPUT) {
+assert(err || *obj);
+}
+error_propagate(errp, err);
 }

 void visit_end_struct(Visitor *v, Error **errp)
@@ -51,9 +57,15 @@ void visit_start_alternate(Visitor *v, const char *name,
GenericAlternate **obj, size_t size,
bool promote_int, Error **errp)
 {
+Error *err = NULL;
+
 assert(obj && size >= sizeof(GenericAlternate));
 if (v->start_alternate) {
-v->start_alternate(v, name, obj, size, promote_int, errp);
+v->start_alternate(v, name, obj, size, promote_int, );
+if (v->type == VISITOR_INPUT) {
+assert(err || *obj);
+}
+error_propagate(errp, err);
 }
 }

@@ -188,7 +200,14 @@ void visit_type_bool(Visitor *v, const char *name, bool 
*obj, Error **errp)

 void visit_type_str(Visitor *v, const char *name, char **obj, Error **errp)
 {
-v->type_str(v, name, obj, errp);
+Error *err = NULL;
+
+assert(obj);
+v->type_str(v, name, obj, );
+if (v->type == VISITOR_INPUT) {
+assert(err || *obj);
+}
+error_propagate(errp, err);
 }

 void visit_type_number(Visitor *v, const char *name, double *obj,
@@ -199,7 +218,14 @@ void visit_type_number(Visitor *v, const char *name, 
double *obj,

 void visit_type_any(Visitor *v, const char *name, QObject **obj, Error **errp)
 {
-v->type_any(v, name, obj, errp);
+Error *err = NULL;
+
+assert(obj);
+v->type_any(v, name, obj, );
+if (v->type == VISITOR_INPUT) {
+assert(err || *obj);
+}
+error_propagate(errp, err);
 }

 static void output_type_enum(Visitor *v, const char *name, int *obj,
diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
index 66aeaed..4cb6436 100644
--- a/qapi/opts-visitor.c
+++ b/qapi/opts-visitor.c
@@ -133,7 +133,7 @@ opts_start_struct(Visitor *v, const char *name, void **obj,
 const QemuOpt *opt;

 if (obj) {
-*obj = g_malloc0(size > 0 ? size : 1);
+*obj = g_malloc0(size);
 }
 if (ov->depth++ > 0) {
 return;
@@ -314,6 +314,7 @@ opts_type_str(Visitor *v, const char *name, char **obj, 
Error **errp)

 opt = lookup_scalar(ov, name, errp);
 if (!opt) {
+*obj = NULL;
 return;
 }
 *obj = g_strdup(opt->str ? opt->str : "");
diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index 02d4233..77cce8b 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -120,6 +120,9 @@ static void qmp_input_start_struct(Visitor *v, const char 
*name, void **obj,
 QObject *qobj = qmp_input_get_object(qiv, name, true);
 Error *err = NULL;

+if (obj) {
+*obj = NULL;
+}
 if (!qobj || qobject_type(qobj) != QTYPE_QDICT) {
 error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
"QDict");
@@ -267,6 +270,7 @@ static void qmp_input_type_str(Visitor *v, const char 
*name, char **obj,
 QString *qstr = qobject_to_qstring(qmp_input_get_object(qiv, name, true));

 if (!qstr) {
+*obj = NULL;
 error_setg(errp,

[Qemu-devel] [PATCH v15 10/23] qmp-input: Require struct push to visit members of top dict

2016-04-27 Thread Eric Blake

Don't embed the root of the visit into the stack of current
containers being visited.  That way, we no longer get confused
on whether the first visit of a dictionary is to the dictionary
itself or to one of the members of the dictionary, based on
whether the caller passed name=NULL; and makes the QMP Input
visitor like other visitors where the value of 'name' is now
ignored on the root visit.  (We may someday want to revisit
the rules on what 'name' should be on a top-level visit,
rather than just ignoring it; but that would be the topic of
another patch).

An audit of all qmp_input_visitor_new() call sites shows that
there were only two places where callers had previously been
visiting to a QDict with a non-NULL name to bypass a call to
visit_start_struct(), and those were fixed in prior patches.

Signed-off-by: Eric Blake 

---
v15: hoist earlier in series, improve variable naming
v14: no change
v13: no change
v12: new patch
---
 qapi/qmp-input-visitor.c | 45 -
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index cae6387..6409a83 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -35,9 +35,11 @@ struct QmpInputVisitor
 {
 Visitor visitor;

-/* Stack of objects being visited.  stack[0] is root of visit,
- * stack[1..] records the nesting of start_struct()/end_struct()
- * and start_list()/end_list() pairs. */
+/* Root of visit at visitor creation. */
+QObject *root;
+
+/* Stack of objects being visited (all entries will be either
+ * QDict or QList). */
 StackObject stack[QIV_STACK_SIZE];
 int nb_stack;

@@ -54,33 +56,34 @@ static QObject *qmp_input_get_object(QmpInputVisitor *qiv,
  const char *name,
  bool consume)
 {
-StackObject *tos = >stack[qiv->nb_stack - 1];
-QObject *qobj = tos->obj;
-QObject *ret;
+StackObject *tos;
+QObject *qobj;
+QObject *ret = NULL;

+if (!qiv->nb_stack) {
+/* Starting at root, name is ignored. */
+return qiv->root;
+}
+
+/* We are in a container; find the next element. */
+tos = >stack[qiv->nb_stack - 1];
+qobj = tos->obj;
 assert(qobj);

-/* If we have a name, and we're in a dictionary, then return that
- * value. */
-if (name && qobject_type(qobj) == QTYPE_QDICT) {
+if (qobject_type(qobj) == QTYPE_QDICT) {
+assert(name);
 ret = qdict_get(qobject_to_qdict(qobj), name);
 if (tos->h && consume && ret) {
 bool removed = g_hash_table_remove(tos->h, name);
 assert(removed);
 }
-return ret;
-}
-
-/* If we are in the middle of a list, then return the next element
- * of the list. */
-if (tos->entry) {
+} else {
 assert(qobject_type(qobj) == QTYPE_QLIST);
-return qlist_entry_obj(tos->entry);
+assert(!name);
+ret = qlist_entry_obj(tos->entry);
 }

-/* Otherwise, we are at the root of the visit or the start of a
- * list, and return the object as-is. */
-return qobj;
+return ret;
 }

 static void qdict_add_key(const char *key, QObject *obj, void *opaque)
@@ -355,7 +358,7 @@ Visitor *qmp_input_get_visitor(QmpInputVisitor *v)

 void qmp_input_visitor_cleanup(QmpInputVisitor *v)
 {
-qobject_decref(v->stack[0].obj);
+qobject_decref(v->root);
 g_free(v);
 }

@@ -381,7 +384,7 @@ QmpInputVisitor *qmp_input_visitor_new(QObject *obj, bool 
strict)
 v->visitor.optional = qmp_input_optional;
 v->strict = strict;

-qmp_input_push(v, obj, NULL);
+v->root = obj;
 qobject_incref(obj);

 return v;
-- 
2.5.5

[Qemu-devel] [PATCH v15 07/23] qapi-commands: Wrap argument visit in visit_start_struct

2016-04-27 Thread Eric Blake

The qmp-input visitor was allowing callers to play rather fast
and loose: when visiting a QDict, you could grab members of the
root dictionary without first pushing into the dict; among the
culprit callers was the generated marshal code on the 'arguments'
dictionary of a QMP command.  But we are about to tighten the
input visitor, at which point the generated marshal code MUST
follow the same paradigms as everyone else, of pushing into the
struct before grabbing its keys.

Generated code grows as follows:

|@@ -515,7 +641,12 @@ void qmp_marshal_blockdev_backup(QDict *
| BlockdevBackup arg = {0};
|
| v = qmp_input_get_visitor(qiv);
|+visit_start_struct(v, NULL, NULL, 0, );
|+if (err) {
|+goto out;
|+}
| visit_type_BlockdevBackup_members(v, , );
|+visit_end_struct(v, err ? NULL : );
| if (err) {
| goto out;
| }
|@@ -527,7 +715,9 @@ out:
| qmp_input_visitor_cleanup(qiv);
| qdv = qapi_dealloc_visitor_new();
| v = qapi_dealloc_get_visitor(qdv);
|+visit_start_struct(v, NULL, NULL, 0, NULL);
| visit_type_BlockdevBackup_members(v, , NULL);
|+visit_end_struct(v, NULL);
| qapi_dealloc_visitor_cleanup(qdv);
| }

The use of 'err ? NULL : ' is temporary; a later patch will
clean that up when it splits visit_end_struct().

Prior to this patch, the fact that there was no final
visit_end_struct() meant that even though we are using a strict
input visit, the marshalling code was not detecting excess input
at the top level (only in nested levels).  Fortunately, we have
code in monitor.c:qmp_check_client_args() that also checks for
no excess arguments at the top level.  But as the generated code
is more compact than the manual check, a later patch will clean
up monitor.c to drop the redundancy added here.

Signed-off-by: Eric Blake 

---
v15: hoist earlier in series, improve commit message, update docs
v14: rebase to master context
v13: rebase to earlier patches
v12: new patch
---
 scripts/qapi-commands.py | 7 +++
 docs/qapi-code-gen.txt   | 7 +++
 2 files changed, 14 insertions(+)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 6261e44..04549fa 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -121,7 +121,12 @@ def gen_marshal(name, arg_type, ret_type):
 %(c_name)s arg = {0};

 v = qmp_input_get_visitor(qiv);
+visit_start_struct(v, NULL, NULL, 0, );
+if (err) {
+goto out;
+}
 visit_type_%(c_name)s_members(v, , );
+visit_end_struct(v, err ? NULL : );
 if (err) {
 goto out;
 }
@@ -150,7 +155,9 @@ out:
 qmp_input_visitor_cleanup(qiv);
 qdv = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(qdv);
+visit_start_struct(v, NULL, NULL, 0, NULL);
 visit_type_%(c_name)s_members(v, , NULL);
+visit_end_struct(v, NULL);
 qapi_dealloc_visitor_cleanup(qdv);
 ''',
  c_name=arg_type.c_name())
diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index 4a917f9..b4ae1be 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -1002,7 +1002,12 @@ Example:
 UserDefOneList *arg1 = NULL;

 v = qmp_input_get_visitor(qiv);
+visit_start_struct(v, NULL, NULL, 0, );
+if (err) {
+goto out;
+}
 visit_type_UserDefOneList(v, "arg1", , );
+visit_end_struct(v, err ? NULL : );
 if (err) {
 goto out;
 }
@@ -1019,7 +1024,9 @@ Example:
 qmp_input_visitor_cleanup(qiv);
 qdv = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(qdv);
+visit_start_struct(v, NULL, NULL, 0, NULL);
 visit_type_UserDefOneList(v, "arg1", , NULL);
+visit_end_struct(v, NULL);
 qapi_dealloc_visitor_cleanup(qdv);
 }

-- 
2.5.5

[Qemu-devel] [PATCH v15 18/23] qmp: Tighten output visitor rules

2016-04-27 Thread Eric Blake

Tighten assertions in the QMP output visitor, so that:

- qmp_output_get_qobject() can only be called after pairing a
visit_end_* for every visit_start_* (rather than allowing it on
a partially built object)

- qmp_output_get_qobject() cannot be called unless at least one
visit_type_* or visit_start/visit_end pair has occurred since
creation/reset (the accidental return of NULL fixed by commit
ab8bf1d7 would have been much easier to diagnose)

- ensure that we are encountering the expected object or list
type, to provide protection against mismatched push(struct)/
pop(list) or push(list)/pop(struct), similar to the qmp-input
protection added in commit bdd8e6b5.

- ensure that except for the root, 'name' is non-null inside a
dict, and NULL inside a list (this may need changing later if
we add "name.0" support for better error messages for a list,
but for now it makes sure all users are at least consistent)

Signed-off-by: Eric Blake 

---
v15: split off qmp_output_visitor_reset(), improve comments,
use QTAILQ_EMPTY
v14: no change
v13: no change
v12: rebase to latest, move type_null() into earlier patches,
don't change signature of pop, don't auto-reset after a single
get_qobject
[no v10, v11]
v9: rebase to added patch, squash in more sanity checks, drop
Marc-Andre's R-b
v8: rename qmp_output_reset to qmp_output_visitor_reset
v7: new patch, based on discussion about spapr_drc.c

Signed-off-by: Eric Blake 
---
 qapi/qmp-output-visitor.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index 6c44210..7155bde 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -82,9 +82,8 @@ static void qmp_output_add_obj(QmpOutputVisitor *qov, const 
char *name,
 QObject *cur = e ? e->value : NULL;

 if (!cur) {
-/* FIXME we should require the user to reset the visitor, rather
- * than throwing away the previous root */
-qobject_decref(qov->root);
+/* Don't allow reuse of visitor on more than one root */
+assert(!qov->root);
 qov->root = value;
 } else {
 switch (qobject_type(cur)) {
@@ -93,6 +92,7 @@ static void qmp_output_add_obj(QmpOutputVisitor *qov, const 
char *name,
 qdict_put_obj(qobject_to_qdict(cur), name, value);
 break;
 case QTYPE_QLIST:
+assert(!name);
 qlist_append_obj(qobject_to_qlist(cur), value);
 break;
 default:
@@ -114,7 +114,8 @@ static void qmp_output_start_struct(Visitor *v, const char 
*name, void **obj,
 static void qmp_output_end_struct(Visitor *v, Error **errp)
 {
 QmpOutputVisitor *qov = to_qov(v);
-qmp_output_pop(qov);
+QObject *value = qmp_output_pop(qov);
+assert(qobject_type(value) == QTYPE_QDICT);
 }

 static void qmp_output_start_list(Visitor *v, const char *name, Error **errp)
@@ -145,7 +146,8 @@ static GenericList *qmp_output_next_list(Visitor *v, 
GenericList **listp,
 static void qmp_output_end_list(Visitor *v)
 {
 QmpOutputVisitor *qov = to_qov(v);
-qmp_output_pop(qov);
+QObject *value = qmp_output_pop(qov);
+assert(qobject_type(value) == QTYPE_QLIST);
 }

 static void qmp_output_type_int64(Visitor *v, const char *name, int64_t *obj,
@@ -202,18 +204,16 @@ static void qmp_output_type_null(Visitor *v, const char 
*name, Error **errp)
 qmp_output_add_obj(qov, name, qnull());
 }

-/* Finish building, and return the root object. Will not be NULL. */
+/* Finish building, and return the root object.
+ * The root object is never null. The caller becomes the object's
+ * owner, and should use qobject_decref() when done with it.  */
 QObject *qmp_output_get_qobject(QmpOutputVisitor *qov)
 {
-/* FIXME: we should require that a visit occurred, and that it is
- * complete (no starts without a matching end) */
-QObject *obj = qov->root;
-if (obj) {
-qobject_incref(obj);
-} else {
-obj = qnull();
-}
-return obj;
+/* A visit must have occurred, with each start paired with end.  */
+assert(qov->root && QTAILQ_EMPTY(>stack));
+
+qobject_incref(qov->root);
+return qov->root;
 }

 Visitor *qmp_output_get_visitor(QmpOutputVisitor *v)
-- 
2.5.5

[Qemu-devel] [PATCH v15 09/23] qom: Wrap prop visit in visit_start_struct

2016-04-27 Thread Eric Blake

The qmp-input visitor was allowing callers to play rather fast
and loose: when visiting a QDict, you could grab members of the
root dictionary without first pushing into the dict; the final
such culprit was the QOM code for converting to and from object
properties.  But we are about to tighten the input visitor, at
which point user_creatable_add_type() as called with a QMP input
visitor via qmp_object_add() MUST follow the same paradigms as
everyone else, of pushing into the struct before grabbing its
keys.

The use of 'err ? NULL : ' is temporary; a later patch will
clean that up when it splits visit_end_struct().

The change has no impact to the testsuite now, but is required to
avoid a failure in tests/test-netfilter once qmp-input is made
stricter to detect inconsistent 'name' arguments on the root visit.

Since user_creatable_add_type() is also called with OptsVisitor
through user_creatable_add_opts(), we must also check that there
is no negative impact there; both pre- and post-patch, we see:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio -object 
secret,id=sec0,data=letmein,format=raw,foo=bar
qemu-system-x86_64: Property '.foo' not found

That is, the only new checking that the new visit_end_struct() can
perform is for excess input, but we already catch excess input
earlier in object_property_set().

Signed-off-by: Eric Blake 

---
v15: hoist earlier in series, improve commit message
v14: no change
v13: no change
v12: new patch
---
 qom/object_interfaces.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
index ab5da35..4a60d6d 100644
--- a/qom/object_interfaces.c
+++ b/qom/object_interfaces.c
@@ -120,12 +120,20 @@ Object *user_creatable_add_type(const char *type, const 
char *id,

 obj = object_new(type);
 if (qdict) {
+visit_start_struct(v, NULL, NULL, 0, _err);
+if (local_err) {
+goto out;
+}
 for (e = qdict_first(qdict); e; e = qdict_next(qdict, e)) {
 object_property_set(obj, v, e->key, _err);
 if (local_err) {
-goto out;
+break;
 }
 }
+visit_end_struct(v, local_err ? NULL : _err);
+if (local_err) {
+goto out;
+}
 }

 object_property_add_child(object_get_objects_root(),
-- 
2.5.5

[Qemu-devel] [PATCH v15 14/23] qapi: Add visit_type_null() visitor

2016-04-27 Thread Eric Blake

Right now, qmp-output-visitor happens to produce a QNull result
if nothing is actually visited between the creation of the visitor
and the request for the resulting QObject.  A stronger protocol
would require that a QMP output visit MUST visit something.  But
to still be able to produce a JSON 'null' output, we need a new
visitor function that states our intentions.  Yes, we could say
that such a visit must go through visit_type_any(), but that
feels clunky.

So this patch introduces the new visit_type_null() interface and
its no-op interface in the dealloc visitor, and stubs in the
qmp visitors (the next patch will finish the implementation).
For the visitors that will not implement the callback, document
the situation. The code in qapi-visit-core unconditionally
dereferences the callback pointer, so that a segfault will inform
a developer if they need to implement the callback for their
choice of visitor.

Note that JSON has a primitive null type, with the single value
null; likewise with the QNull type for QObject; but for QAPI,
we just have the 'null' value without a null type.  We may
eventually want to add more support in QAPI for null (most likely,
we'd use it via an alternate type that permits 'null' or an
object); but we'll create that usage when we need it.

Signed-off-by: Eric Blake 

---
v15: improve commit message, add stubs here
v14: no change
v13: no change
v12: rebase to earlier changes, drop R-b due to better documentation
[no v10, v11]
v9: no change
v8: rebase to 'name' motion
v7: new patch, based on discussion about spapr_drc.c
---
 include/qapi/visitor.h   | 12 
 include/qapi/visitor-impl.h  |  3 +++
 include/qapi/opts-visitor.h  |  3 ++-
 include/qapi/string-input-visitor.h  |  2 +-
 include/qapi/string-output-visitor.h |  2 +-
 qapi/qapi-visit-core.c   |  5 +
 qapi/qapi-dealloc-visitor.c  |  5 +
 qapi/qmp-input-visitor.c |  6 ++
 qapi/qmp-output-visitor.c|  7 +++
 9 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 0e028ba..e79c09e 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -520,4 +520,16 @@ void visit_type_number(Visitor *v, const char *name, 
double *obj,
  */
 void visit_type_any(Visitor *v, const char *name, QObject **obj, Error **errp);

+/*
+ * Visit a JSON null value.
+ *
+ * @name expresses the relationship of the null value to its parent
+ * container; see the general description of @name above.
+ *
+ * Unlike all other visit_type_* functions, no obj parameter is
+ * needed; rather, this is a witness that an explicit null value is
+ * expected rather than any other type.
+ */
+void visit_type_null(Visitor *v, const char *name, Error **errp);
+
 #endif
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 796d180..88d27d5 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -90,6 +90,9 @@ struct Visitor
 void (*type_any)(Visitor *v, const char *name, QObject **obj,
  Error **errp);

+/* Must be set to visit explicit null values.  */
+void (*type_null)(Visitor *v, const char *name, Error **errp);
+
 /* Must be set for input visitors, optional otherwise.  The core
  * takes care of the return type in the public interface. */
 void (*optional)(Visitor *v, const char *name, bool *present);
diff --git a/include/qapi/opts-visitor.h b/include/qapi/opts-visitor.h
index 633aa71..fe37ed9 100644
--- a/include/qapi/opts-visitor.h
+++ b/include/qapi/opts-visitor.h
@@ -31,7 +31,8 @@ typedef struct OptsVisitor OptsVisitor;
  * - values above INT64_MAX or LLONG_MAX are rejected.
  *
  * The Opts input visitor does not implement support for visiting QAPI
- * alternates, numbers (other than integers), or arbitrary QTypes.
+ * alternates, numbers (other than integers), null, or arbitrary
+ * QTypes.
  */
 OptsVisitor *opts_visitor_new(const QemuOpts *opts);
 void opts_visitor_cleanup(OptsVisitor *nv);
diff --git a/include/qapi/string-input-visitor.h 
b/include/qapi/string-input-visitor.h
index fdf33ae..a8d8f67 100644
--- a/include/qapi/string-input-visitor.h
+++ b/include/qapi/string-input-visitor.h
@@ -19,7 +19,7 @@ typedef struct StringInputVisitor StringInputVisitor;

 /*
  * The string input visitor does not implement support for visiting
- * QAPI structs, alternates, or arbitrary QTypes.
+ * QAPI structs, alternates, null, or arbitrary QTypes.
  */
 StringInputVisitor *string_input_visitor_new(const char *str);
 void string_input_visitor_cleanup(StringInputVisitor *v);
diff --git a/include/qapi/string-output-visitor.h 
b/include/qapi/string-output-visitor.h
index 3bb09af..89b7e4b 100644
--- a/include/qapi/string-output-visitor.h
+++ b/include/qapi/string-output-visitor.h
@@ -19,7 +19,7 @@ typedef struct StringOutputVisitor StringOutputVisitor;

 /*
  * The string output visitor does not

[Qemu-devel] [PATCH v15 00/23] qapi visitor cleanups (post-introspection cleanups subset E)

2016-04-27 Thread Eric Blake

2.7 material; hopefully this iteration is close enough for
Markus to stick it in his qapi-next staging branch, so we
can move on to my other pending series.

Based on master, with no prerequisite patches.

Also available as a tag at this location:
git fetch git://repo.or.cz/qemu/ericb.git qapi-cleanupv15e

and will soon be part of my branch with the rest of the v5 series, at:
http://repo.or.cz/qemu/ericb.git/shortlog/refs/heads/qapi

v14 was:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg01486.html

Since then, I've rearranged several patches (including moving hunks
between patches), added a couple of new ones, and in general
addressed a lot of Markus' findings.  The comparison to the previous
posting looks rather big, but a lot of it is due to comment changes
or rebase artifacts from shuffling things around, where a lot of
the end results are still the same.

001/23:[0024] [FC] 'qapi-visit: Add visitor.type classification'
002/23:[0036] [FC] 'qapi: Guarantee NULL obj on input visitor callback error'
003/23:[] [--] 'qmp: Drop dead command->type'
004/23:[0019] [FC] 'qmp-input: Clean up stack handling'
005/23:[down] 'qapi: Use strict QMP input visitor in more places'
006/23:[0007] [FC] 'qmp-input: Don't consume input when checking has_member'
007/23:[0014] [FC] 'qapi-commands: Wrap argument visit in visit_start_struct'
008/23:[down] 'monitor: Let generated code validate arguments'
009/23:[0005] [FC] 'qom: Wrap prop visit in visit_start_struct'
010/23:[0052] [FC] 'qmp-input: Require struct push to visit members of top dict'
011/23:[0032] [FC] 'qmp-input: Refactor when list is advanced'
012/23:[0253] [FC] 'qapi: Document visitor interfaces, add assertions'
013/23:[0006] [FC] 'tests: Add check-qnull'
014/23:[0026] [FC] 'qapi: Add visit_type_null() visitor'
015/23:[0046] [FC] 'qmp: Support explicit null during visits'
016/23:[] [--] 'spapr_drc: Expose 'null' in qom-get when there is no fdt'
017/23:[down] 'qmp: Add qmp_output_visitor_reset()'
018/23:[0024] [FC] 'qmp: Tighten output visitor rules'
019/23:[0040] [FC] 'qapi: Split visit_end_struct() into pieces'
020/23:[down] 'tests/string-input-visitor: Add negative integer tests'
021/23:[down] 'qapi: Fix string input visitor handling of invalid list'
022/23:[0082] [FC] 'qapi: Simplify semantics of visit_next_list()'
023/23:[0108] [FC] 'qapi: Change visit_type_FOO() to no longer return partial 
objects'

Eric Blake (22):
  qapi-visit: Add visitor.type classification
  qapi: Guarantee NULL obj on input visitor callback error
  qmp: Drop dead command->type
  qmp-input: Clean up stack handling
  qapi: Use strict QMP input visitor in more places
  qmp-input: Don't consume input when checking has_member
  qapi-commands: Wrap argument visit in visit_start_struct
  monitor: Let generated code validate arguments
  qom: Wrap prop visit in visit_start_struct
  qmp-input: Require struct push to visit members of top dict
  qmp-input: Refactor when list is advanced
  qapi: Document visitor interfaces, add assertions
  tests: Add check-qnull
  qapi: Add visit_type_null() visitor
  qmp: Support explicit null during visits
  spapr_drc: Expose 'null' in qom-get when there is no fdt
  qmp: Add qmp_output_visitor_reset()
  qmp: Tighten output visitor rules
  qapi: Split visit_end_struct() into pieces
  qapi: Fix string input visitor handling of invalid list
  qapi: Simplify semantics of visit_next_list()
  qapi: Change visit_type_FOO() to no longer return partial objects

Markus Armbruster (1):
  tests/string-input-visitor: Add negative integer tests

 include/qapi/visitor.h   | 492 +--
 include/qapi/visitor-impl.h  |  81 --
 scripts/qapi-commands.py |  20 +-
 scripts/qapi-event.py|   5 +-
 scripts/qapi-visit.py|  53 ++--
 include/qapi/dealloc-visitor.h   |   5 +
 include/qapi/opts-visitor.h  |   5 +
 include/qapi/qmp-input-visitor.h |   9 +-
 include/qapi/qmp-output-visitor.h|   1 +
 include/qapi/qmp/dispatch.h  |   6 -
 include/qapi/string-input-visitor.h  |   5 +
 include/qapi/string-output-visitor.h |   5 +
 qapi/qapi-visit-core.c   | 106 ++--
 block/crypto.c   |  14 +-
 hw/ppc/spapr_drc.c   |  11 +-
 hw/virtio/virtio-balloon.c   |  15 +-
 monitor.c|   4 +
 qapi/opts-visitor.c  |  70 ++---
 qapi/qapi-dealloc-visitor.c  |  43 +--
 qapi/qmp-dispatch.c  |  18 +-
 qapi/qmp-input-visitor.c | 189 --
 qapi/qmp-output-visitor.c|  71 ++---
 qapi/qmp-registry.c  |   1 -
 qapi/string-input-visitor.c  |  51 ++--
 qapi/string-output-visitor.c |  43 ++-
 qmp.c|   2 +-
 qom/object.c |   5 +-
 qom/object_interfaces.c  |  33 ++-
 qom/qom-qobject.c|   3 +-

[Qemu-devel] [PATCH v15 06/23] qmp-input: Don't consume input when checking has_member

2016-04-27 Thread Eric Blake

Commit e8316d7 mistakenly passed consume=true within
qmp_input_optional() when checking if an optional member was
present, but the mistake was silently ignored since the code
happily let us extract a member more than once.  Fix
qmp_input_optional() to not consume anything, then tighten up
the input visitor to ensure that a member is consumed exactly
once (all generated code follows this pattern; and the new
assert will catch any hand-written code that tries to visit
the same key more than once).

Signed-off-by: Eric Blake 

---
v15: change variable name, improve commit message
v14: no change
v13: no change
v12: new patch
---
 qapi/qmp-input-visitor.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index c3c3271..cae6387 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -56,16 +56,19 @@ static QObject *qmp_input_get_object(QmpInputVisitor *qiv,
 {
 StackObject *tos = >stack[qiv->nb_stack - 1];
 QObject *qobj = tos->obj;
+QObject *ret;

 assert(qobj);

 /* If we have a name, and we're in a dictionary, then return that
  * value. */
 if (name && qobject_type(qobj) == QTYPE_QDICT) {
-if (tos->h && consume) {
-g_hash_table_remove(tos->h, name);
+ret = qdict_get(qobject_to_qdict(qobj), name);
+if (tos->h && consume && ret) {
+bool removed = g_hash_table_remove(tos->h, name);
+assert(removed);
 }
-return qdict_get(qobject_to_qdict(qobj), name);
+return ret;
 }

 /* If we are in the middle of a list, then return the next element
@@ -335,7 +338,7 @@ static void qmp_input_type_any(Visitor *v, const char 
*name, QObject **obj,
 static void qmp_input_optional(Visitor *v, const char *name, bool *present)
 {
 QmpInputVisitor *qiv = to_qiv(v);
-QObject *qobj = qmp_input_get_object(qiv, name, true);
+QObject *qobj = qmp_input_get_object(qiv, name, false);

 if (!qobj) {
 *present = false;
-- 
2.5.5

[Qemu-devel] [PATCH v15 03/23] qmp: Drop dead command->type

2016-04-27 Thread Eric Blake

Ever since QMP was first added back in commit 43c20a43, we have
never had any QmpCommandType other than QCT_NORMAL.  It's
pointless to carry around the cruft.

Signed-off-by: Eric Blake 

---
v15: no change
v14: no change
v13: no change
v12: new patch
---
 include/qapi/qmp/dispatch.h |  6 --
 qapi/qmp-dispatch.c | 18 +++---
 qapi/qmp-registry.c |  1 -
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index 4955209..5609946 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -19,11 +19,6 @@

 typedef void (QmpCommandFunc)(QDict *, QObject **, Error **);

-typedef enum QmpCommandType
-{
-QCT_NORMAL,
-} QmpCommandType;
-
 typedef enum QmpCommandOptions
 {
 QCO_NO_OPTIONS = 0x0,
@@ -33,7 +28,6 @@ typedef enum QmpCommandOptions
 typedef struct QmpCommand
 {
 const char *name;
-QmpCommandType type;
 QmpCommandFunc *fn;
 QmpCommandOptions options;
 QTAILQ_ENTRY(QmpCommand) node;
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 510a1ae..08faf85 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -94,17 +94,13 @@ static QObject *do_qmp_dispatch(QObject *request, Error 
**errp)
 QINCREF(args);
 }

-switch (cmd->type) {
-case QCT_NORMAL:
-cmd->fn(args, , _err);
-if (local_err) {
-error_propagate(errp, local_err);
-} else if (cmd->options & QCO_NO_SUCCESS_RESP) {
-g_assert(!ret);
-} else if (!ret) {
-ret = QOBJECT(qdict_new());
-}
-break;
+cmd->fn(args, , _err);
+if (local_err) {
+error_propagate(errp, local_err);
+} else if (cmd->options & QCO_NO_SUCCESS_RESP) {
+g_assert(!ret);
+} else if (!ret) {
+ret = QOBJECT(qdict_new());
 }

 QDECREF(args);
diff --git a/qapi/qmp-registry.c b/qapi/qmp-registry.c
index 4ebfbcc..4332a68 100644
--- a/qapi/qmp-registry.c
+++ b/qapi/qmp-registry.c
@@ -25,7 +25,6 @@ void qmp_register_command(const char *name, QmpCommandFunc 
*fn,
 QmpCommand *cmd = g_malloc0(sizeof(*cmd));

 cmd->name = name;
-cmd->type = QCT_NORMAL;
 cmd->fn = fn;
 cmd->enabled = true;
 cmd->options = options;
-- 
2.5.5

[Qemu-devel] [PATCH] target-mips: Fix RDHWR exception host PC

2016-04-27 Thread James Hogan

Commit b00c72180c36 ("target-mips: add PC, XNP reg numbers to RDHWR")
changed the rdhwr helpers to use check_hwrena() to check the register
being accessed is enabled in CP0_HWREna when used from user mode. If
that check fails an EXCP_RI exception is raised at the host PC
calculated with GETPC().

However check_hwrena() may not be fully inlined as the
do_raise_exception() part of it is common regardless of the arguments.
This causes GETPC() to calculate the address in the call in the helper
instead of the generated code calling the helper. No TB will be found
and the EPC reported with the resulting guest RI exception points to the
beginning of the TB instead of the RDHWR instruction.

We can't reliably force check_hwrena() to be inlined, and converting it
to a macro would be ugly, so instead pass the host PC in as an argument,
with each rdhwr helper passing GETPC(). This should avoid any dependence
on compiler behaviour, and in practice seems to prevent the partial
inlining of check_hwrena() on x86_64.

This issue causes failures when running a MIPS KVM (trap & emulate)
guest in a MIPS QEMU TCG guest, as the inner guest kernel will do a
RDHWR of counter, which is disabled in the outer guest's CP0_HWREna by
KVM so it can emulate the inner guest's counter. The emulation fails and
the RI exception is passed to the inner guest.

Fixes: b00c72180c36 ("target-mips: add PC, XNP reg numbers to RDHWR")
Signed-off-by: James Hogan 
Cc: Leon Alrae 
Cc: Yongbok Kim 
Cc: Aurelien Jarno 
---
 target-mips/op_helper.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 8ec1bef7d034..4417e6ba225f 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2294,29 +2294,29 @@ void helper_deret(CPUMIPSState *env)
 }
 #endif /* !CONFIG_USER_ONLY */
 
-static inline void check_hwrena(CPUMIPSState *env, int reg)
+static inline void check_hwrena(CPUMIPSState *env, int reg, uintptr_t pc)
 {
 if ((env->hflags & MIPS_HFLAG_CP0) || (env->CP0_HWREna & (1 << reg))) {
 return;
 }
-do_raise_exception(env, EXCP_RI, GETPC());
+do_raise_exception(env, EXCP_RI, pc);
 }
 
 target_ulong helper_rdhwr_cpunum(CPUMIPSState *env)
 {
-check_hwrena(env, 0);
+check_hwrena(env, 0, GETPC());
 return env->CP0_EBase & 0x3ff;
 }
 
 target_ulong helper_rdhwr_synci_step(CPUMIPSState *env)
 {
-check_hwrena(env, 1);
+check_hwrena(env, 1, GETPC());
 return env->SYNCI_Step;
 }
 
 target_ulong helper_rdhwr_cc(CPUMIPSState *env)
 {
-check_hwrena(env, 2);
+check_hwrena(env, 2, GETPC());
 #ifdef CONFIG_USER_ONLY
 return env->CP0_Count;
 #else
@@ -2326,19 +2326,19 @@ target_ulong helper_rdhwr_cc(CPUMIPSState *env)
 
 target_ulong helper_rdhwr_ccres(CPUMIPSState *env)
 {
-check_hwrena(env, 3);
+check_hwrena(env, 3, GETPC());
 return env->CCRes;
 }
 
 target_ulong helper_rdhwr_performance(CPUMIPSState *env)
 {
-check_hwrena(env, 4);
+check_hwrena(env, 4, GETPC());
 return env->CP0_Performance0;
 }
 
 target_ulong helper_rdhwr_xnp(CPUMIPSState *env)
 {
-check_hwrena(env, 5);
+check_hwrena(env, 5, GETPC());
 return (env->CP0_Config5 >> CP0C5_XNP) & 1;
 }
 
-- 
2.4.10

Re: [Qemu-devel] [PATCH v14 19/19] qapi: Change visit_type_FOO() to no longer return partial objects

2016-04-27 Thread Eric Blake

On 04/15/2016 08:49 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> Returning a partial object on error is an invitation for a careless
>> caller to leak memory.  As no one outside the testsuite was actually
>> relying on these semantics, it is cleaner to just document and
>> guarantee that ALL pointer-based visit_type_FOO() functions always
>> leave a safe value in *obj during an input visitor (either the new
>> object on success, or NULL if an error is encountered), so callers
>> can now unconditionally use qapi_free_FOO() to clean up regardless
>> of whether an error occurred.
> 
> Hmm, wasn't the "assign null on error" part done in a prior patch?
> Checking...  no, only half of it, in PATCH 03: there, we went from "may
> store an incomplete object on error" to "store either an incomplete
> object or null on error".  Now we go on to just "store null on error".
> Correct?

Yes. I'll tweak the wording to make it clearer.

> 
>> The decision is done by enhancing qapi-visit-core to return true
>> for input visitors (the callbacks themselves do not need
>> modification); since we've documented that visit_end_* must be
>> called after any successful visit_start_*, that is a sufficient
>> point for knowing that something was allocated during start.
> 
> I find this sentence a bit confusing.  Let me try:
> 
> To help visitor-agnostic code, such as the generated qapi-visit.c,
> make the visit_end_FOO() return true when something was allocated.
> Easily done in the visitor core, no need to change the callbacks.
> 
> But see my comments on the visit_end_FOO() inline.

Reply below, where your comments are indeed worth thinking about.

> 
>> Note that we still leave *obj unchanged after a scalar-based
>> visit_type_FOO(); I did not feel like auditing all uses of
>> visit_type_Enum() to see if the callers would tolerate a specific
>> sentinel value (not to mention having to decide whether it would
>> be better to use 0 or ENUM__MAX as that sentinel).
> 
> Should this note be in PATCH 03?
> 
> The inconsistency isn't pretty, but tolerable if it simplifies things.

No. Patch 03 fixed visit_start_struct, not visit_type_FOO.  Since it is
this patch that is tweaking visit_type_FOO, I have to explain why
visit_type_ENUM preserves values, while visit_type_OBJ sets to NULL.


>>   *
>> - * FIXME: At present, input visitors may allocate an incomplete *@obj
>> - * even when visit_type_FOO() reports an error.  Using an output
>> - * visitor with an incomplete object has undefined behavior; callers
>> - * must call qapi_free_FOO() (which uses the dealloc visitor, and
>> - * safely handles an incomplete object) to avoid a memory leak.
>> + * If an error is detected during visit_type_FOO() with an input
>> + * visitor, then *@obj will be NULL for pointer types, and left
>> + * unchanged for scalar types.
> 
> Okay.

And this matches the commit message explaining the difference between
scalar and object (and also applies to visit_type_int being a scalar
that leaves the value unchanged on error).

> 
>> + *  Using an output visitor with an
>> + * incomplete object has undefined behavior (other than a special case
>> + * for visit_type_str() treating NULL like ""), while the dealloc
>> + * visitor safely handles incomplete objects.
> 
> Where do the incomplete objects come from now?  I thought this patch
> gets rid of them.

Still possible to create one by manual means, just no longer possible
from a QAPI input visitor.  I'll tweak the wording.


>> -void visit_end_struct(Visitor *v);
>> +bool visit_end_struct(Visitor *v);
> 
> I generally like functions to return something useful, but not in this
> case, because the function name gives you no clue about its value.
> Consider:
> 
> if (visit_end_struct(v) && err) {
> qapi_free_FOO(*obj);
> *obj = NULL;
> }
> 
> To find out what this means, a reader not familiar with visitors almost
> certainly needs to refer to visit_end_struct()'s contract or code.
> 
> Compare:
> 
> visit_end_struct(v);
> if (err && v->type == VISITOR_INPUT) {

v->type is a layering violation...

> qapi_free_FOO(*obj);
> *obj = NULL;
> }
> 
> Or:
> 
> visit_end_struct(v);
> if (err && visit_is_input(v)) {

...but this is doable by exporting visit_is_input().

> qapi_free_FOO(*obj);
> *obj = NULL;
> }

Makes the generated code have more lines, but who really cares.  So
consider it done in v15.

>> +++ b/qapi/qapi-visit-core.c
>> @@ -23,11 +23,17 @@
>>  void visit_start_struct(Visitor *v, const char *name, void **obj,
>>  size_t size, Error **errp)
>>  {
>> +Error *err = NULL;
>> +
>>  if (obj) {
>>  assert(size);
>>  assert(v->type != VISITOR_OUTPUT || *obj);
>>  }
>> -v->start_struct(v, name, obj, size, errp);
>> +v->start_struct(v, name, obj, size, );
>> +if (obj && v->type == VISITOR_INPUT) {
>> +

Re: [Qemu-devel] [PATCH v14 18/19] qapi: Simplify semantics of visit_next_list()

2016-04-27 Thread Eric Blake

On 04/22/2016 05:35 AM, Markus Armbruster wrote:

  static void
 -start_list(Visitor *v, const char *name, Error **errp)
 +start_list(Visitor *v, const char *name, GenericList **list, size_t size,

 +
 +parse_str(siv, );
 +if (err) {
 +*list = NULL;
 +error_propagate(errp, err);
 +return;
 +}
>>
>> parse_str() never sets an error, and therefore your new error check is
>> dead.  Just as well, because it would be wrong.
>>
>> parse_str() parses a complete string into a non-empty list of uint64_t
>> ranges.  On success, it sets siv->ranges to this list.  On error, it
>> sets it to null.  It could also set an error then, but it doesn't.
>>
>> If it did, then what would start_list() do with it?  Reporting it would
>> be wrong, because the list members need not be integers.

parse_str() is only ever called for start_list and parse_type_int64 - so
the real question is do we ever support the string input visitor on
anything other than an integral list.  Per the comments I'm adding
earlier in the series:

 * The string input visitor does not implement support for visiting
 * QAPI structs, alternates, null, or arbitrary QTypes.

so it sounds like the only lists it supports ARE integer lists.

>>
>> If they aren't, the speculative parse_str()'s failure will be ignored.
>>
>> If they are, parse_type_int64() will call parse_str() again, then use
>> siv->ranges.
>>
>> If the first parse_str() succeeds, the second will do nothing, and we'll
>> use the first one's siv->ranges.  Works.
>>
>> If the first parse_str() fails, the second will fail as well, because
>> its input is the same.  We'll use the second one's failure.  Works.
> 
> No, it doesn't: failure gets interpreted as empty list.  I'll post my
> test case separately.
> 
>> When used outside list context, parse_type_int64() will call parse_str()
>> for the first time, and use its result.  Works.
>>
>> Note that opts-visitor does it differently: opts_start_list() doesn't
>> parse numbers, opts_type_int64() and opts_type_uint64() do.

I like the approach used in opts-visitor (start_list should only check
if a list is present, but save the parsing for when the items are
actually consumed off the list).  But opts-visitor also handles structs,
unlike the string visitor, which forces the separation (at start_list,
we don't know what the list element will be, unlike the string visitor
where we know the list element is integral).

>>
>> Further note the latent bug in parse_type_int64(): we first call
>> parse_str(siv, errp), and goto error if it fails, where we promptly
>> error_setg(errp, ...).  If parse_str() set an error, the error_setg()
>> would fail error_setv()'s assertion.
>>
>> Please drop parse_str()'s unused errp parameter, and add a comment to
>> start_list() explaining the speculative call to parse_str() there.
> 
> Insufficient, doesn't fix the bug.  After parse_str(), we need to be
> able to distinguish empty list from error.  Moving the error_set() into
> parse_str() could work.  Returning succes/failure and dropping the errp
> parameter could also work.

I'm playing with these ideas, but will get the bug fixed (along with
your testsuite addition) as a prerequisite to the list refactoring (to
make sure that the refactored list still passes the fixed test).

> 
>> Alternatively, change the string visitor to work like the opts visitor.

Trickier, but may be my only option if the other approaches don't work.
 Thanks again for spotting yet another ugly corner case worth fixing
(this series seems to have been a never-ending source of them...)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 2/4] Postcopy: Add stats on page requests

2016-04-27 Thread Eric Blake

On 04/27/2016 01:08 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> On the source, add a count of page requests received from the
> destination.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hmp.c |  4 +++
>  include/migration/migration.h |  2 ++
>  migration/migration.c | 59 
> ++-
>  migration/ram.c   |  1 +
>  qapi-schema.json  |  6 -
>  5 files changed, 36 insertions(+), 36 deletions(-)
> 

> +++ b/migration/migration.c
> @@ -561,6 +561,26 @@ static void get_xbzrle_cache_stats(MigrationInfo *info)
>  }
>  }
>  
> +static void populate_ram_info(MigrationInfo *info, MigrationState *s)
> +{
> +info->has_ram = true;
> +info->ram = g_malloc0(sizeof(*info->ram));

> @@ -585,18 +605,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
>  info->has_setup_time = true;
>  info->setup_time = s->setup_time;
>  
> -info->has_ram = true;
> -info->ram = g_malloc0(sizeof(*info->ram));

If you respin, please split the refactoring into one patch, and the
addition of postcopy stats in another, so that the addition is not lost
in the noise.

Otherwise looks fine.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread David Woodhouse

On Wed, 2016-04-27 at 21:17 +0300, Michael S. Tsirkin wrote:
> 
> > Because it's a dirty hack in the *wrong* place.
> 
> No one came up with a better one so far :(

Seriously?

Take a look at drivers/iommu/intel-iommu.c. It has quirks for all kinds
of shitty devices that have to be put in passthrough mode or otherwise
excluded.

We don't actually *need* it for the Intel IOMMU; all we need is for
QEMU to stop lying in its DMAR tables.

We *do* want the same kind of quirks in the relevant POWER and ARM
IOMMU code in the kernel. Do that (hell, a simple quirk for all virtio
devices will suffice, but NOT in the virtio driver) at the same moment
you fix the virtio devices to use the DMA API. Job done.

Some time *later* we can work on *refining* that quirk, and a way for
QEMU to tell the guest (via something generic like fwcfg, maybe) that
some devices are and aren't translated.

Actually, I'm about to look at moving dma_ops into struct device and
cleaning up the way we detect which IOMMU is attached, at device
instantiation time. Perhaps I can shove the virtio-exception quirk in
there while I'm at it...

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature

[Qemu-devel] [PATCH 4/4] tests: fix libqtest socket timeouts

2016-04-27 Thread Dr. David Alan Gilbert (git)

From: Andrea Arcangeli 

I kept getting timeouts and unix socket accept failures under high
load, the patch fixes it.

Signed-off-by: Andrea Arcangeli 
---
 tests/libqtest.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index b12a9e4..57ce292 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -27,7 +27,7 @@
 #include "qapi/qmp/qjson.h"
 
 #define MAX_IRQ 256
-#define SOCKET_TIMEOUT 5
+#define SOCKET_TIMEOUT 50
 
 QTestState *global_qtest;
 
-- 
2.5.5

[Qemu-devel] [PATCH 2/4] Postcopy: Add stats on page requests

2016-04-27 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

On the source, add a count of page requests received from the
destination.

Signed-off-by: Dr. David Alan Gilbert 
---
 hmp.c |  4 +++
 include/migration/migration.h |  2 ++
 migration/migration.c | 59 ++-
 migration/ram.c   |  1 +
 qapi-schema.json  |  6 -
 5 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/hmp.c b/hmp.c
index d510236..cd5fae3 100644
--- a/hmp.c
+++ b/hmp.c
@@ -209,6 +209,10 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
 monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
info->ram->dirty_pages_rate);
 }
+if (info->ram->postcopy_requests) {
+monitor_printf(mon, "postcopy request count: %" PRIu64 "\n",
+   info->ram->postcopy_requests);
+}
 }
 
 if (info->has_disk) {
diff --git a/include/migration/migration.h b/include/migration/migration.h
index ac2c12c..78fa59b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -157,6 +157,8 @@ struct MigrationState
 int64_t xbzrle_cache_size;
 int64_t setup_time;
 int64_t dirty_sync_count;
+/* Count of requests incoming from destination */
+int64_t postcopy_requests;
 
 /* Flag set once the migration has been asked to enter postcopy */
 bool start_postcopy;
diff --git a/migration/migration.c b/migration/migration.c
index 991313a..9d41618 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -561,6 +561,26 @@ static void get_xbzrle_cache_stats(MigrationInfo *info)
 }
 }
 
+static void populate_ram_info(MigrationInfo *info, MigrationState *s)
+{
+info->has_ram = true;
+info->ram = g_malloc0(sizeof(*info->ram));
+info->ram->transferred = ram_bytes_transferred();
+info->ram->total = ram_bytes_total();
+info->ram->duplicate = dup_mig_pages_transferred();
+info->ram->skipped = skipped_mig_pages_transferred();
+info->ram->normal = norm_mig_pages_transferred();
+info->ram->normal_bytes = norm_mig_bytes_transferred();
+info->ram->mbps = s->mbps;
+info->ram->dirty_sync_count = s->dirty_sync_count;
+info->ram->postcopy_requests = s->postcopy_requests;
+
+if (s->state != MIGRATION_STATUS_COMPLETED) {
+info->ram->remaining = ram_bytes_remaining();
+info->ram->dirty_pages_rate = s->dirty_pages_rate;
+}
+}
+
 MigrationInfo *qmp_query_migrate(Error **errp)
 {
 MigrationInfo *info = g_malloc0(sizeof(*info));
@@ -585,18 +605,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 info->has_setup_time = true;
 info->setup_time = s->setup_time;
 
-info->has_ram = true;
-info->ram = g_malloc0(sizeof(*info->ram));
-info->ram->transferred = ram_bytes_transferred();
-info->ram->remaining = ram_bytes_remaining();
-info->ram->total = ram_bytes_total();
-info->ram->duplicate = dup_mig_pages_transferred();
-info->ram->skipped = skipped_mig_pages_transferred();
-info->ram->normal = norm_mig_pages_transferred();
-info->ram->normal_bytes = norm_mig_bytes_transferred();
-info->ram->dirty_pages_rate = s->dirty_pages_rate;
-info->ram->mbps = s->mbps;
-info->ram->dirty_sync_count = s->dirty_sync_count;
+populate_ram_info(info, s);
 
 if (blk_mig_active()) {
 info->has_disk = true;
@@ -624,18 +633,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 info->has_setup_time = true;
 info->setup_time = s->setup_time;
 
-info->has_ram = true;
-info->ram = g_malloc0(sizeof(*info->ram));
-info->ram->transferred = ram_bytes_transferred();
-info->ram->remaining = ram_bytes_remaining();
-info->ram->total = ram_bytes_total();
-info->ram->duplicate = dup_mig_pages_transferred();
-info->ram->skipped = skipped_mig_pages_transferred();
-info->ram->normal = norm_mig_pages_transferred();
-info->ram->normal_bytes = norm_mig_bytes_transferred();
-info->ram->dirty_pages_rate = s->dirty_pages_rate;
-info->ram->mbps = s->mbps;
-info->ram->dirty_sync_count = s->dirty_sync_count;
+populate_ram_info(info, s);
 
 if (blk_mig_active()) {
 info->has_disk = true;
@@ -658,17 +656,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 info->has_setup_time = true;
 info->setup_time = s->setup_time;
 
-info->has_ram = true;
-info->ram = g_malloc0(sizeof(*info->ram));
-info->ram->transferred = ram_bytes_transferred();
-info->ram->remaining = 0;
-info->ram->total = ram_bytes_total();
-info->ram->duplicate = dup_mig_pages_transferred();
-info->ram->skipped = skipped_mig_pages_transferred();
-info->ram->normal =

[Qemu-devel] [PATCH 3/4] test: Postcopy

2016-04-27 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

This is a postcopy test (x86 only) that actually runs the guest
and checks the memory contents.

The test runs from an x86 boot block with the hex embedded in the test;
the source for this is:

...

.code16
.org 0x7c00
.file   "fill.s"
.text
.globl  start
.type   start, @function
start: # at 0x7c00 ?
cli
lgdt gdtdesc
mov $1,%eax
mov %eax,%cr0  # Protected mode enable
data32 ljmp $8,$0x7c20

.org 0x7c20
.code32
# A20 enable - not sure I actually need this
inb $0x92,%al
or  $2,%al
outb %al, $0x92

# set up DS for the whole of RAM (needed on KVM)
mov $16,%eax
mov %eax,%ds

mov $65,%ax
mov $0x3f8,%dx
outb %al,%dx

# bl keeps a counter so we limit the output speed
mov $0, %bl
mainloop:
# Start from 1MB
mov $(1024*1024),%eax
innerloop:
incb (%eax)
add $4096,%eax
cmp $(100*1024*1024),%eax
jl innerloop

inc %bl
jnz mainloop

mov $66,%ax
mov $0x3f8,%dx
outb %al,%dx

jmp mainloop

# GDT magic from old (GPLv2)  Grub startup.S
.p2align2   /* force 4-byte alignment */
gdt:
.word   0, 0
.byte   0, 0, 0, 0

/* -- code segment --
 * base = 0x, limit = 0xF (4 KiB Granularity), present
 * type = 32bit code execute/read, DPL = 0
 */
.word   0x, 0
.byte   0, 0x9A, 0xCF, 0

/* -- data segment --
 * base = 0x, limit 0xF (4 KiB Granularity), present
 * type = 32 bit data read/write, DPL = 0
 */
.word   0x, 0
.byte   0, 0x92, 0xCF, 0

gdtdesc:
.word   0x27/* limit */
.long   gdt /* addr */

/* I'm a bootable disk */
.org 0x7dfe
.byte 0x55
.byte 0xAA

...

and that can be assembled by the following magic:
as --32 -march=i486 fill.s -o fill.o
objcopy -O binary fill.o fill.boot
dd if=fill.boot of=bootsect bs=256 count=2 skip=124
xxd -i bootsect

Signed-off-by: Dr. David Alan Gilbert 
---
 tests/Makefile|   2 +
 tests/postcopy-test.c | 426 ++
 2 files changed, 428 insertions(+)
 create mode 100644 tests/postcopy-test.c

diff --git a/tests/Makefile b/tests/Makefile
index 9194f18..f356f4a 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -224,6 +224,7 @@ endif
 check-qtest-i386-y += tests/test-netfilter$(EXESUF)
 check-qtest-i386-y += tests/test-filter-mirror$(EXESUF)
 check-qtest-i386-y += tests/test-filter-redirector$(EXESUF)
+check-qtest-i386-y += tests/postcopy-test$(EXESUF)
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
 gcov-files-x86_64-y = $(subst 
i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
@@ -579,6 +580,7 @@ tests/usb-hcd-uhci-test$(EXESUF): tests/usb-hcd-uhci-test.o 
$(libqos-usb-obj-y)
 tests/usb-hcd-ehci-test$(EXESUF): tests/usb-hcd-ehci-test.o $(libqos-usb-obj-y)
 tests/usb-hcd-xhci-test$(EXESUF): tests/usb-hcd-xhci-test.o $(libqos-usb-obj-y)
 tests/pc-cpu-test$(EXESUF): tests/pc-cpu-test.o
+tests/postcopy-test$(EXESUF): tests/postcopy-test.o
 tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o qemu-char.o 
qemu-timer.o $(qtest-obj-y) $(test-io-obj-y)
 tests/qemu-iotests/socket_scm_helper$(EXESUF): 
tests/qemu-iotests/socket_scm_helper.o
 tests/test-qemu-opts$(EXESUF): tests/test-qemu-opts.o $(test-util-obj-y)
diff --git a/tests/postcopy-test.c b/tests/postcopy-test.c
new file mode 100644
index 000..c5343ff
--- /dev/null
+++ b/tests/postcopy-test.c
@@ -0,0 +1,426 @@
+/*
+ * QTest testcase for postcopy
+ *
+ * Copyright (c) 2016 Red Hat, Inc. and/or its affiliates
+ *   based on the vhost-user-test.c that is:
+ *  Copyright (c) 2014 Virtual Open Systems Sarl.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include 
+
+#include "libqtest.h"
+#include "qemu/option.h"
+#include "qemu/range.h"
+#include "sysemu/char.h"
+#include "sysemu/sysemu.h"
+
+#include 
+#include 
+#include 
+
+#if defined(__linux__)
+#include 
+#endif
+
+#if defined(__linux__) && defined(__NR_userfaultfd) && defined(CONFIG_EVENTFD)
+#include 
+#include 
+#include 
+
+const unsigned start_address = 1024 * 1024;
+const unsigned end_address = 100 * 1024 * 1024;
+static bool ufd_version_check(void)
+{
+struct uffdio_api api_struct;
+uint64_t ioctl_mask;
+
+int ufd = ufd = syscall(__NR_userfaultfd, O_CLOEXEC);
+
+if (ufd == -1) {
+g_test_message("Skipping test: userfaultfd not available");
+return false;
+}
+
+api_struct.api = UFFD_API;
+

[Qemu-devel] [PATCH 0/4] postcopy (& 1 test) patch for 2.7

2016-04-27 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Hi,
  This is a small set of postcopy changes, the largest of which
is an x86 test for postcopy.

Andrea's libqtest change came about from running my test under very heavy
load.

The test includes a self contained migration workload that rapidly changes
RAM in a predictable fashion allowing us to end up in postcopy mode and
also to be able to check the contents of RAM.  Note this sometimes fails
on Linux kernels 4.5 (and current 4.6) which have a KVM+THP bug.

Dave

Andrea Arcangeli (1):
  tests: fix libqtest socket timeouts

Dr. David Alan Gilbert (3):
  Postcopy: Avoid 0 length discards
  Postcopy: Add stats on page requests
  test: Postcopy

 hmp.c |   4 +
 include/migration/migration.h |   2 +
 migration/migration.c |  59 +++---
 migration/ram.c   |   5 +-
 qapi-schema.json  |   6 +-
 tests/Makefile|   2 +
 tests/libqtest.c  |   2 +-
 tests/postcopy-test.c | 426 ++
 8 files changed, 468 insertions(+), 38 deletions(-)
 create mode 100644 tests/postcopy-test.c

-- 
2.5.5

[Qemu-devel] [PATCH 1/4] Postcopy: Avoid 0 length discards

2016-04-27 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

The discard code in migration/ram.c would send request for
zero length discards in the case where no discards were needed.
It doesn't appear to have had any bad effect.

Signed-off-by: Dr. David Alan Gilbert 
---
 migration/ram.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 3f05738..e96c2af 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1557,7 +1557,9 @@ static int postcopy_send_discard_bm_ram(MigrationState 
*ms,
 } else {
 discard_length = zero - one;
 }
-postcopy_discard_send_range(ms, pds, one, discard_length);
+if (discard_length) {
+postcopy_discard_send_range(ms, pds, one, discard_length);
+}
 current = one + discard_length;
 } else {
 current = one;
-- 
2.5.5

Re: [Qemu-devel] [RFC PATCH v2 2/2] spapr: Memory hot-unplug support

2016-04-27 Thread Michael Roth

Quoting Igor Mammedov (2016-04-27 09:34:53)
> On Wed, 27 Apr 2016 15:59:52 +0200
> Thomas Huth  wrote:
> 
> > On 27.04.2016 15:37, Igor Mammedov wrote:
> > > On Tue, 26 Apr 2016 16:03:37 -0500
> > > Michael Roth  wrote:
> > >   
> > >> Quoting Igor Mammedov (2016-04-26 02:52:36)  
> > >>> On Tue, 26 Apr 2016 10:39:23 +0530
> > >>> Bharata B Rao  wrote:
> > >>> 
> >  On Mon, Apr 25, 2016 at 11:20:50AM +0200, Igor Mammedov wrote:
> > > On Wed, 16 Mar 2016 10:11:54 +0530
> > > Bharata B Rao  wrote:
> > >   
> > >> On Wed, Mar 16, 2016 at 12:36:05PM +1100, David Gibson wrote:  
> > >>> On Tue, Mar 15, 2016 at 10:08:56AM +0530, Bharata B Rao wrote:  
> > >>>   
> >  Add support to hot remove pc-dimm memory devices.
> > 
> >  Signed-off-by: Bharata B Rao 
> > >>>
> > >>> Reviewed-by: David Gibson 
> > >>>
> > >>> Looks correct, but again, needs to wait on the PAPR change.  
> > > [...]  
> > >>
> > >> While we are here, I would also like to get some opinion on the real
> > >> need for memory unplug. Is there anything that memory unplug gives us
> > >> which memory ballooning (shrinking mem via ballooning) can't give ?  
> > >> 
> > > Sure ballooning can complement memory hotplug but turning it on would
> > > effectively reduce hotplug to balloning as it would enable overcommit
> > > capability instead of hard partitioning pc-dimms provides. So one
> > > could just use ballooning only and not bother with hotplug at all.
> > >
> > > On the other hand memory hotplug/unplug (at least on x86) tries
> > > to model real hardware, thus removing need in paravirt ballooning
> > > solution in favor of native guest support.  
> > 
> >  Thanks for your views.
> >  
> > >
> > > PS:
> > > Guest wise, currently hot-unplug is not well supported in linux,
> > > i.e. it's not guarantied that guest will honor unplug request
> > > as it may pin dimm by using it as a non migratable memory. So
> > > there is something to work on guest side to make unplug more
> > > reliable/guarantied.  
> > 
> >  In the above scenario where the guest doesn't allow removal of certain
> >  parts of DIMM memory, what is the expected behaviour as far as QEMU
> >  DIMM device is concerned ? I seem to be running into this situation
> >  very often with PowerPC mem unplug where I am left with a DIMM device
> >  that has only some memory blocks released. In this situation, I would 
> >  like
> >  to block further unplug requests on the same device, but QEMU seems
> >  to allow more such unplug requests to come in via the monitor. So
> >  qdev won't help me here ? Should I detect such condition from the
> >  machine unplug() handler and take required action ?
> > >>> I think offlining is a guests task along with recovering from
> > >>> inability to offline (i.e. offline all + eject or restore original 
> > >>> state).
> > >>> QUEM does it's job by notifying guest what dimm it wants to remove
> > >>> and removes it when guest asks it (at least in x86 world).
> > >>
> > >> In the case of pseries, the DIMM abstraction isn't really exposed to
> > >> the guest, but rather the memory blocks we use to make the backing
> > >> memdev memory available to the guest. During unplug, the guest
> > >> completely releases these blocks back to QEMU, and if it can only
> > >> release a subset of what's requested it does not attempt to recover.
> > >> We can potentially change that behavior on the guest side, since
> > >> partially-freed DIMMs aren't currently useful on the host-side...
> > >>
> > >> But, in the case of pseries, I wonder if it makes sense to maybe go
> > >> ahead and MADV_DONTNEED the ranges backing these released blocks so the
> > >> host can at least partially reclaim the memory from a partially
> > >> unplugged DIMM?  
> > > It's a little bit confusing, one asked to remove device but it's still
> > > there but not completely usable/available.
> > > What will happen when user wants that memory plugged back?  
> > 
> > As far as I've understood MADV_DONTNEED, you can use the memory again at
> > any time - just the previous contents will be gone, which is ok in this
> > case since the guest previously marked this area as unavailable.
> If host gave returned memory to someone else there might not be enough
> resources to give it back (what would happen I can't tell may be VM will
> stall or just get exception).

It's not really an issue for pseries, since once the LMB is released
it's totally gone as far as the guest is concerned, and there's no
way to plug it back in via the still-present DIMM until removal
completes after, say, reset time.

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 04:15:35PM +0100, David Woodhouse wrote:
> On Wed, 2016-04-27 at 18:05 +0300, Michael S. Tsirkin wrote:
> > 
> > I really don't get it.
> > 
> > There's exactly one device that works now and needs the work-around and
> > so that we need to support, and that is virtio. It happens to have
> > exactly the same issue on all platforms.
> 
> False. We have other devices which are currently *not* translated by
> the emulated IOMMU and which aren't going to be in the short term
> either.
> 
> We also have other devices (emulated hardware NICs) to which precisely
> the same "we don't need protection" arguments apply, and which we
> *could* expose to the guest without an IOMMU translation if we really
> wanted to. It makes as much sense as exposing virtio without an IOMMU,
> going forward.

The reasons for virtio are mostly dealing legacy.
We don't need protection is a separate issue
that I'd rather drop for now.

> > Why would we want to work hard to build platform-specific
> > solutions to a problem that can be solved in 5 lines of
> > generic code?
> 
> Because it's a dirty hack in the *wrong* place.

No one came up with a better one so far :(

> -- 
> dwmw2

Re: [Qemu-devel] [PATCH for-2.7 v2 07/17] rbd: Implement image locking

2016-04-27 Thread Jason Dillaman

On Tue, Apr 26, 2016 at 7:20 PM, Fam Zheng  wrote:
> On Tue, 04/26 10:42, Jason Dillaman wrote:
>> On Sun, Apr 24, 2016 at 7:42 PM, Fam Zheng  wrote:
>> > On Fri, 04/22 21:57, Jason Dillaman wrote:
>> >> Since this cannot automatically recover from a crashed QEMU client with an
>> >> RBD image, perhaps this RBD locking should not default to enabled.
>> >> Additionally, this will conflict with the "exclusive-lock" feature
>> >> available since the Ceph Hammer-release since both utilize the same 
>> >> locking
>> >> construct.
>> >>
>> >> As a quick background, the optional exclusive-lock feature can be
>> >> enabled/disabled on image and safely/automatically handles the case of
>> >> recovery from a crashed client.  Under normal conditions, the RBD
>> >> exclusive-lock feature automatically acquires the lock upon the first
>> >> attempt to write to the image and transparently transitions ownership of
>> >> the lock between two or more clients -- used for QEMU live-migration.
>> >
>> > Is it enabled by default?
>> >
>>
>> Starting with the Jewel release of Ceph it is enabled by default.
>
> OK, then I'll leave rbd in this QEMU series for now.

Without exposing some new API methods, this patch will, unfortunately,
directly conflict with the new Jewel rbd defaults and will actually
result in the image becoming read-only since librbd won't be able to
acquire the exclusive lock when QEMU owns the advisory lock.

We can probably get the new API methods upstream within a week or two
[1].  If that's too long of a delay, I'd recommend dropping rbd
locking from the series for now.

[1] http://tracker.ceph.com/issues/15632

-- 
Jason

Re: [Qemu-devel] Wiki account request

2016-04-27 Thread Stefan Weil

Am 27.04.2016 um 19:33 schrieb Bastian Koppelmann:
> Hi,
>
> can someone create an account (username: kbastian) on the qemu wiki for
> me, such that I can update the changelog regularly for TriCore?
>
> Thanks,
> Bastian

Done.

Cheers
Stefan

[Qemu-devel] Wiki account request

2016-04-27 Thread Bastian Koppelmann

Hi,

can someone create an account (username: kbastian) on the qemu wiki for
me, such that I can update the changelog regularly for TriCore?

Thanks,
Bastian

Re: [Qemu-devel] [Qemu-discuss] iolimits for virtio-9p

2016-04-27 Thread Greg Kurz

On Wed, 27 Apr 2016 16:39:58 +0200
Pradeep Kiruvale  wrote:

> On 27 April 2016 at 10:38, Alberto Garcia  wrote:
> 
> > On Wed, Apr 27, 2016 at 09:29:02AM +0200, Pradeep Kiruvale wrote:
> >
> > > Thanks for the reply. I am still in the early phase, I will let you
> > > know if any changes are needed for the APIs.
> > >
> > > We might also have to implement throttle-group.c for 9p devices, if
> > > we want to apply throttle for group of devices.
> >
> > Fair enough, but again please note that:
> >
> > - throttle-group.c is not meant to be generic, but it's tied to
> >   BlockDriverState / BlockBackend.
> > - it is currently being rewritten:
> >   https://lists.gnu.org/archive/html/qemu-block/2016-04/msg00645.html
> >
> > If you can explain your use case with a bit more detail we can try to
> > see what can be done about it.
> >
> >
> We want to use  virtio-9p for block io instead of virtio-blk-pci. But in
> case of

9p is mostly aimed at sharing files... why would you want to use it for
block io instead of a true block device ? And how would you do that ?

> virtio-9p we can just use fsdev devices, so we want to apply throttling
> (QoS)
> on these devices and as of now the io throttling only possible with the
> -drive option.
> 

Indeed.

> As a work around we are doing the throttling using cgroup. It has its own
> costs.

Can you elaborate ?

> So, we want to have throttling for fsdev devices inside the qemu itself. I
> am just
> trying to understand and estimate time required for implementing it for the
> fsdevices.
> 

I still don't clearly understand what you are trying to do... maybe provide
a more detailed scenario.

> 
> -Pradeep

Cheers.

--
Greg

Re: [Qemu-devel] [PATCH for-2.6 2/3] replay: Fix dangling location bug in replay_configure()

2016-04-27 Thread Eduardo Habkost

On Wed, Apr 27, 2016 at 04:29:08PM +0200, Markus Armbruster wrote:
> replay_configure() pushes and pops a Location with automatic storage
> duration.  Except it fails to pop when -icount parameter "rr" isn't
> given.  cur_loc then points to unused stack space, and will most
> likely get clobbered in short order.
> 
> Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
> report bogus locations.
> 
> Broken in commit 890ad55.
> 
> I didn't take the time to find a reproducer.
> 
> Cc: Eduardo Habkost 
> Signed-off-by: Markus Armbruster 

Oops! Thanks for catching it.

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v5] qemu-img: check block status of backing file when converting.

2016-04-27 Thread Max Reitz

On 27.04.2016 18:04, Ren Kimura wrote:
> When converting images, check the block status of its backing file chain
> to avoid needlessly reading zeros.
> 
> Signed-off-by: Ren Kimura 
> ---
>  qemu-img.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)

Thank you, I applied the patch to my block-next tree for inclusion into
qemu after the 2.6 release:

https://github.com/XanClic/qemu/commits/block-next

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/2] Block: Cleanup vvfat.c to remove dead code.

2016-04-27 Thread Alex Bennée


Prerna Saxena  writes:

> Commit 43dc2a64 replaced assert() with abort(), but didnt remove statements
> that followed these calls. So current code still has return values set after
> a call to abort(). Such statements will never execute and need to be cleaned
> up.
>
> Signed-off-by: Prerna Saxena 
> ---
>  block/vvfat.c | 17 +++--
>  1 file changed, 3 insertions(+), 14 deletions(-)
>
> diff --git a/block/vvfat.c b/block/vvfat.c
> index 6b85314..ffe739b 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -1747,8 +1747,7 @@ static uint32_t 
> get_cluster_count_for_direntry(BDRVVVFATState* s,
>   schedule_new_file(s, g_strdup(path), cluster_num);
>   else {
>  abort();
> - return 0;
> - }
> + }
>  }
>
>  while(1) {
> @@ -1768,7 +1767,6 @@ static uint32_t 
> get_cluster_count_for_direntry(BDRVVVFATState* s,
>   * (cluster_num - mapping->begin)) {
>   /* offset of this cluster in file chain has changed */
>  abort();
> - copy_it = 1;
>   } else if (offset == 0) {
>   const char* basename = get_basename(mapping->path);
>
> @@ -1780,7 +1778,6 @@ static uint32_t 
> get_cluster_count_for_direntry(BDRVVVFATState* s,
>   if (mapping->first_mapping_index != first_mapping_index
>   && mapping->info.file.offset > 0) {
>  abort();
> - copy_it = 1;
>   }
>
>   /* need to write out? */
> @@ -1946,8 +1943,6 @@ DLOG(fprintf(stderr, "check direntry %d:\n", i); 
> print_direntry(direntries + i))
>   }
>   } else
>  abort(); /* cluster_count = 0; */
> -
> - ret += cluster_count;

I don't think you meant to do this. The ret += is not part of the else leg.

>   }
>
>   cluster_num = modified_fat_get(s, cluster_num);
> @@ -2578,10 +2573,6 @@ static int handle_commits(BDRVVVFATState* s)
>  for (i = 0; !fail && i < s->commits.next; i++) {
>   commit_t* commit = array_get(&(s->commits), i);
>   switch(commit->action) {
> - case ACTION_RENAME: case ACTION_MKDIR:
> -abort();
> - fail = -2;
> - break;
>   case ACTION_WRITEOUT: {
>  #ifndef NDEBUG
>  /* these variables are only used by assert() below */
> @@ -2639,6 +2630,8 @@ static int handle_commits(BDRVVVFATState* s)
>
>   break;
>   }
> +case ACTION_RENAME:
> +case ACTION_MKDIR:
>   default:
>  abort();
>   }
> @@ -2729,7 +2722,6 @@ static int do_commit(BDRVVVFATState* s)
>  if (ret) {
>   fprintf(stderr, "Error handling renames (%d)\n", ret);
>  abort();
> - return ret;
>  }
>
>  /* copy FAT (with bdrv_read) */
> @@ -2740,21 +2732,18 @@ static int do_commit(BDRVVVFATState* s)
>  if (ret) {
>   fprintf(stderr, "Fatal: error while committing (%d)\n", ret);
>  abort();
> - return ret;
>  }
>
>  ret = handle_commits(s);
>  if (ret) {
>   fprintf(stderr, "Error handling commits (%d)\n", ret);
>  abort();
> - return ret;
>  }
>
>  ret = handle_deletes(s);
>  if (ret) {
>   fprintf(stderr, "Error deleting\n");
>  abort();
> - return ret;
>  }
>
>  if (s->qcow->drv->bdrv_make_empty) {


--
Alex Bennée

[Qemu-devel] [PATCH v5] qemu-img: check block status of backing file when converting.

2016-04-27 Thread Ren Kimura

When converting images, check the block status of its backing file chain
to avoid needlessly reading zeros.

Signed-off-by: Ren Kimura 
---
 qemu-img.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 1697762..cb72f14 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1478,10 +1478,21 @@ static int convert_iteration_sectors(ImgConvertState 
*s, int64_t sector_num)
 } else if (!s->target_has_backing) {
 /* Without a target backing file we must copy over the contents of
  * the backing file as well. */
-/* TODO Check block status of the backing file chain to avoid
+/* Check block status of the backing file chain to avoid
  * needlessly reading zeroes and limiting the iteration to the
  * buffer size */
-s->status = BLK_DATA;
+ret = bdrv_get_block_status_above(blk_bs(s->src[s->src_cur]), NULL,
+  sector_num - s->src_cur_offset,
+  n, , );
+if (ret < 0) {
+return ret;
+}
+
+if (ret & BDRV_BLOCK_ZERO) {
+s->status = BLK_ZERO;
+} else {
+s->status = BLK_DATA;
+}
 } else {
 s->status = BLK_BACKING_FILE;
 }
-- 
2.5.0

Re: [Qemu-devel] [PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled

2016-04-27 Thread Andrea Arcangeli

On Wed, Apr 27, 2016 at 05:57:30PM +0200, Andrea Arcangeli wrote:
> couldn't do a fix as cleaner as this one for 4.6.

ehm "cleaner then"

If you've suggestions for a better name than PageTransCompoundMap I
can respin a new patch though, I considered "CanMap" but I opted for
the short version.

Also I'm not really sure moving transparent_hugepage_adjust will make
much sense. I mentioned it because Andres in another thread said it
was suggested but the real common code knowledge is about
PageTransCompoundMap only, all sort of !mmu_gfn_lpage_is_disallowed
for dirty logging at 4k shadow granularity is KVM internal.

Re: [Qemu-devel] [PATCH 2/2] Debug : Add error messages before a call to debug().

2016-04-27 Thread Eric Blake

On 04/14/2016 09:02 PM, Prerna Saxena wrote:
> Qemu code has abort() calls in various places which raises a SIGABRT;
> This patch adds error messages before (most)calls to abort(), so that
> it is easier to determine why QEMU died.

The subject line says you are adding messages before debug(), but the
rest of the patch is adding message before abort(). You'll need to fix
that.  Also, subject lines usually don't end in '.'

> +++ b/block.c
> @@ -3725,6 +3725,7 @@ void bdrv_remove_aio_context_notifier(BlockDriverState 
> *bs,
>  }
>  }
>  
> +error_report("Matching context notifier not found for removal. 
> Aborting");
>  abort();

The "Aborting" part of the message is redundant; it's pretty obvious
that qemu aborted.

I also wonder if you should be using g_assert_not_reached() instead of
abort() in some (all?) of the places touched in this patch - at which
point you don't have to worry about inventing a message that will never
be printed.  The reason I suggest it is that g_assert_not_reached() is
self-documenting, and prints a nicer message than abort() if it does
accidentally get reached.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled

2016-04-27 Thread Andrea Arcangeli

On Wed, Apr 27, 2016 at 06:18:34PM +0300, Kirill A. Shutemov wrote:
> Okay, I see.
> 
> But do we really want to make PageTransCompoundMap() visiable beyond KVM
> code? It looks like too KVM-specific.

Any other secondary MMU notifier manager (KVM is just one of the many
MMU notifier users) will need the same information if it doesn't want
to run a flood of get_user_pages_fast and it can support multiple
granularity in the secondary MMU mappings, so I think it is justified
to be exposed not just to KVM.

The other option would be to move transparent_hugepage_adjust to
mm/huge_memory.c but that currently has all kind of KVM data
structures in it, so it's definitely not a cut-and-paste work, so I
couldn't do a fix as cleaner as this one for 4.6.

Re: [Qemu-devel] [RFC PATCH resend 00/11] Make CoMutex/CoQueue/CoRwlock thread-safe

2016-04-27 Thread Stefan Hajnoczi

On Tue, Apr 26, 2016 at 11:54:19AM +0100, Stefan Hajnoczi wrote:
> On Fri, Apr 15, 2016 at 01:31:55PM +0200, Paolo Bonzini wrote:
> > [this time including the mailing list]
> > 
> > This is yet another tiny bit of the multiqueue work, this time affecting
> > the synchronization infrastructure for coroutines.  Currently, coroutines
> > synchronize between the main I/O thread and the dataplane iothread through
> > the AioContext lock.  However, for multiqueue a single BDS will be used
> > by multiple iothreads and hence multiple AioContexts.  This calls for
> > a different approach to coroutine synchronization.  This series is my
> > first attempt at it.  Besides multiqueue, I think throttling groups
> > (which can already span multiple AioContexts) could also take advantage
> > of the new CoMutexes.
> > 
> > The series has two main parts:
> > 
> > - it makes CoMutex bind coroutines to an AioContexts.  It is of course
> >   still possible to move coroutines around with explicit yield and
> >   qemu_coroutine_enter, but a CoMutex will now reenter the coroutine
> >   where it was running before, rather than in the AioContext of
> >   the previous owner.  To do this, a new function aio_co_schedule is
> >   introduced to run a coroutine on a given iothread.  I think this could
> >   be used in other places too; for now it lets a CoMutex protect stuff
> >   across multiple AioContexts without them moving around(*).  Of course
> >   currently a CoMutex is generally not used across multiple iothreads,
> >   because you have to acquire/release the AioContext around CoMutex
> >   critical sections.  However...
> > 
> > - ... the second change is exactly to make CoMutex thread-safe and remove
> >   the need for a "thread-based" mutex around it.  The new CoMutex is
> >   exactly the same as a mutex implementation that you'd find in an
> >   operating system.  iothreads are the moral equivalent of CPUs in
> >   a kernel, while coroutines resemble kernel threads running without
> >   preemption on a CPU.  Even if you have to take concurrency into account,
> >   the lack of preemption while running coroutines or bottom halves
> >   keeps the complexity at bay.  For example, it is easy to synchronize
> >   between qemu_co_mutex_lock's yield and the qemu_coroutine_enter in
> >   aio_co_schedule's bottom half.
> > 
> >   Same as before, CoMutex puts coroutines to sleep with
> >   qemu_coroutine_yield and wake them up with the new aio_co_schedule.
> >   I could have wrapped CoMutex's CoQueue with a "regular" thread mutex or
> >   spinlock.  The resulting code would have looked a lot like RFifoLock
> >   (with CoQueue replacing RFifoLock's condition variable).  Rather,
> >   inspired by the parallel between coroutines and non-preemptive kernel
> >   threads, I chose to adopt the same lock-free mutex algorithm as OSv.
> >   The algorithm only needs two to four atomic ops for a lock-unlock pair
> >   (two when uncontended).  To cover CoQueue, each CoQueue is made to
> >   depend on a CoMutex, similar to condition variables.  Most CoQueues
> >   already have a corresponding CoMutex so this is not a big deal;
> >   converting the others is left for a future series.  I did this because
> >   CoQueue looks a lot like OSv's waitqueues; so if necessary, we could
> >   even take OSv's support for wait morphing (which avoids the thundering
> >   herd problem) and add it to CoMutex and CoQueue.  This may be useful
> >   when making tracked_requests thread-safe.
> > 
> > Kevin: this has nothing to do with my old plan from Brno, and it's
> > possibly a lot closer to what you wanted.  Your idea was to automatically
> > release the "thread mutex" when a coroutine yields, I think you should
> > be fine with not having a thread mutex at all!
> > 
> > This will need some changes in the formats because, for multiqueue,
> > CoMutexes would need to be used like "real" thread mutexes.  Code like
> > this:
> > 
> > ...
> > qemu_co_mutex_unlock()
> > ... /* still access shared data, but don't yield */
> > qemu_coroutine_yield()
> > 
> > might be required to use this other pattern:
> > 
> > ... /* access shared data, but don't yield */
> > qemu_co_mutex_unlock()
> > qemu_coroutine_yield()
> > 
> > because "adding a second CPU" is already introducing concurrency that
> > wasn't there before.  The "non-preemptive multitasking" reference only
> > applies to things that access AioContext-local data.  This includes the
> > synchronization primitives implemented in this series, the thread pool,
> > the Linux AIO support, but not much else.  It still simplifies _those_
> > though. :)
> > 
> > Anyhow, we'll always have some BlockDriver that do not support multiqueue,
> > such as the network protocols.  Thus it would be possible to handle the
> > formats one at a time.  raw-posix, raw and qcow2 would already form a
> > pretty good set, and the first two do not use CoMutex at all.
> > 
> > The patch has quite a lot of new code, but

Re: [Qemu-devel] Working on AF_VSOCK packet capture

2016-04-27 Thread Stefan Hajnoczi

On Tue, Apr 26, 2016 at 04:44:25PM +, Gerard wrote:
> My name is Gerard Garcia and I have been funded by GSOC16 to work on a
> device driver to allow capturing host/guest traffic through AF_VSOCK
> sockets: http://qemu-project.org/Features/VirtioVsock
> 
> I'll mostly work on the Linux kernel codebase but the device driver is
> closely related to QEMU so I felt like introducing myself here!
> 
> My IRC nick is gerardgarcia. I'll try to be around!

Welcome Gerard!  Packet capture will be a very useful feature for
everyone working on AF_VSOCK.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 06/17] bochs: Implement .bdrv_co_preadv() interface

2016-04-27 Thread Eric Blake

On 04/27/2016 03:52 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/bochs.c | 46 +-
>  1 file changed, 29 insertions(+), 17 deletions(-)
> 

>  static void bochs_close(BlockDriverState *bs)
> @@ -267,7 +279,7 @@ static BlockDriver bdrv_bochs = {
>  .instance_size   = sizeof(BDRVBochsState),
>  .bdrv_probe  = bochs_probe,
>  .bdrv_open   = bochs_open,
> -.bdrv_read  = bochs_co_read,
> +.bdrv_co_preadv = bochs_co_preadv,
>  .bdrv_close  = bochs_close,
>  };

Alignment is funky here.  I'd rather just get rid of all the extra
spaces, if that's easier than having half but not all of the = aligned.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 05/17] block: Introduce .bdrv_co_preadv/pwritev BlockDriver function

2016-04-27 Thread Eric Blake

On 04/27/2016 03:52 AM, Kevin Wolf wrote:
> Many parts of the block layer are already byte granularity. The block
> driver interface, however, was still missing an interface that allows
> making use of this. This patch introduces a new BlockDriver interface,
> which is based on coroutines, vectored, has flags and uses a byte
> granularity. This is now the preferred interface for new drivers.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/io.c| 28 ++--
>  include/block/block_int.h |  4 
>  2 files changed, 26 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] last call for bugs that need to be fixed for 2.6 release!

2016-04-27 Thread Markus Armbruster

Peter Maydell  writes:

> Hi; looking at http://wiki.qemu.org/Planning/2.6#Known_issues and
> the mailing list we seem to be in reasonable shape for the 2.6 release.
> I would ideally like the 2.6rc3 tarball which we will make later this
> week to be the final one before full release.
>
> This is therefore the last call for any bugs that need to be fixed
> for 2.6 or patches that must go in. If you have anything you think
> should go into 2.6 please either add it to the "still unfixed in
> master" part of the planning wiki page or follow up to this email to
> describe it/link to the relevant patches/etc.

[PATCH for-2.6 0/3] Fix dangling pointers and error message regressions

1+2/3 are small and simple.  We can punt 3/3 to -stable if the patch
looks too big for rc4.

Re: [Qemu-devel] [PATCH 2/2] Debug : Add error messages before a call to debug().

2016-04-27 Thread Stefan Hajnoczi

On Fri, Apr 15, 2016 at 08:32:54AM +0530, Prerna Saxena wrote:
> diff --git a/block/block-backend.c b/block/block-backend.c
> index d74f670..0aa8692 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -407,6 +407,7 @@ BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
>  return blk;
>  }
>  }
> +error_report("Drive Info not found, Aborting.");

error_report() documentation says:

 * Format arguments like sprintf().  The resulting message should be a
 * single phrase, with no newline or trailing punctuation.

>  abort();
>  }
>  
> @@ -463,6 +464,8 @@ int blk_attach_dev(BlockBackend *blk, void *dev)
>  void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
>  {
>  if (blk_attach_dev(blk, dev) < 0) {
> +error_report("Attaching device model to block %s failed. Aborting",

Please use '%s' instead of just %s for consistency with other error
messages.

> +blk->name);
>  abort();
>  }
>  }
> @@ -1143,6 +1146,7 @@ BlockErrorAction blk_get_error_action(BlockBackend 
> *blk, bool is_read,
>  case BLOCKDEV_ON_ERROR_IGNORE:
>  return BLOCK_ERROR_ACTION_IGNORE;
>  default:
> +error_report("Unrecognized Block Error Action %d. Aborting.",on_err);

s/,/, /

Please use scripts/checkpatch.pl to check for coding style violations.

> diff --git a/exec.c b/exec.c
> index c4f9036..1b1d713 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2218,6 +2218,7 @@ static MemTxResult subpage_read(void *opaque, hwaddr 
> addr, uint64_t *data,
>  *data = ldq_p(buf);
>  return MEMTX_OK;
>  default:
> +printf("Unsupported read size %d, Aborting", len);

Why use printf() instead of error_report() in this file?


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.6 3/3] qom: -object error messages lost location, restore it

2016-04-27 Thread Markus Armbruster

"Daniel P. Berrange"  writes:

> On Wed, Apr 27, 2016 at 04:29:09PM +0200, Markus Armbruster wrote:
>> qemu_opts_foreach() runs its callback with the error location set to
>> the option's location.  Any errors the callback reports use the
>> option's location automatically.
>> 
>> Commit 90998d5 moved the actual error reporting from "inside"
>> qemu_opts_foreach() to after it.  Here's a typical hunk:
>> 
>>   if (qemu_opts_foreach(qemu_find_opts("object"),
>> -  object_create,
>> -  object_create_initial, NULL)) {
>> +  user_creatable_add_opts_foreach,
>> +  object_create_initial, )) {
>> +error_report_err(err);
>>   exit(1);
>>   }
>> 
>> Before, object_create() reports from within qemu_opts_foreach(), using
>> the option's location.  Afterwards, we do it after
>> qemu_opts_foreach(), using whatever location happens to be current.
>> Commonly a "none" location.
>
> IMHO this shows a major design flaw with error_report_err() method
> and the location handling. The design pattern we have for "Error *"
> objects is that we can freely propagate them up the caller, because
> it is a self-contained record of the error information. As soon as
> you do that you loose the location information, because it was not
> in fact associated with the Error, but rather stored in a single
> global variable. For that matter, the Location info isn't even
> thread safe AFAICT since its a simple state var, so you better hope
> that there's no code which calls loc_push/pop from a non-main thread :-(

I readily concede that the current state is decidedly sub-optimal.
Error reporting in QEMU has a tortuous history, and it shows.

Locations date back to simpler times.  Threads?  What's a "thread"?

The current location stack was the simplest way to retrofit locations to
most of the errors with the least churn.  If it's a good idea (which is
debatable), it should certainly be thread-local.

Error was created with cavalier disregard for actual error messages.
We've fixed the worst issues, but we haven't attacked location
information.

Instead, we fall back to what error_report() gives us for free: the
current location at the point where we report the error.

Blindly replacing this by the current location at the point where we
detect the error may not always be an improvement.  It depends.

Here's an instructive example:

-drive if=none,cache=none,file=blkdebug:blkdebug.conf:...

with an erroneous blkdebug.conf.

The current location at the point where we detect the error is the bad
spot in blkdebug.conf.  That's useful information.  It currently gets
lost.

The current location at the point where we report the error should be
the -drive (it currently isn't, but that's just a bug).  Also useful
information.

>> Reproducer:
>> 
>> $ qemu-system-x86_64 -nodefaults -display none -object 
>> secret,id=foo,foo=bar
>> qemu-system-x86_64: Property '.foo' not found
>> 
>> Note no location.  This commit restores it:
>> 
>> qemu-system-x86_64: -object secret,id=foo,foo=bar: Property '.foo' not 
>> found
>> 
>> Note that the qemu_opts_foreach() bug just fixed could mask the bug
>> here: if the location it leaves dandling hasn't been clobbered, yet,
>> it's the correct one.
>> 
>> Reported-by: Eric Blake 
>> Cc: Daniel P. Berrange 
>> Signed-off-by: Markus Armbruster 
[...]
>
> Very reluctant
>
>  Reviewed-by: Daniel P. Berrange 

Thanks!

> this really needs fixing properly in 2.7 so that the Error object is
> fully self contained so that later use of it does not rely on any
> global state.

Worthwhile project.

Re: [Qemu-devel] [RFC 00/11] Current MTTCG kvm-unit-test patches

2016-04-27 Thread Andrew Jones

On Wed, Apr 27, 2016 at 04:09:00PM +0100, Alex Bennée wrote:
> 
> Andrew Jones  writes:
> 
> > On Fri, Feb 26, 2016 at 01:15:22PM +, Alex Bennée wrote:
> >> Hi,
> >>
> >> Some of these patches have been posted before and previous patches
> >> have already been accepted upstream so I'm tagging this as a new RFC
> >> series.
> >>
> >> This is a series of tests built around kvm-unit-tests but built with
> >> the express purpose of stressing the TCG, in particular MTTCG builds.
> >>
> >> Changes from previous appearances:
> >>
> >>  * Separated locking and barrier tests
> >>  * Included Drew's IPI patches (used in tcg-test)
> >>  * New TCG chaining test
> >>
> >> The new barrier tests really only fails when running on MTTCG builds on
> >> a weak backend. Many thanks to Will Deacon for helping me get a
> >> working test case at the last Connect.
> >>
> >> I'm mainly posting these for reference for others testing MTTCG as
> >> I've still got to check I've addressed any outstanding review
> >> comments. However there has been enough code churn some of the
> >> comments may no longer be relevant.
> >>
> >> The TCG tests are also useful as benchmarks for comparing the cost of
> >> having chained basic blocks versus exiting the loop every time. The
> >> pathological case is the computed jumps test as all the addresses are
> >> within a PAGE_SIZE boundary the tb_jump_cache has no effect meaning a
> >> full look up each time.
> >>
> >> Alex Bennée (8):
> >>   config/config-arm-common: build-up tests-common target
> >>   lib: add isaac prng library from CCAN
> >>   arm/run: set indentation defaults for emacs
> >>   arm/run: allow aarch64 to start arm binaries
> >>   arm/tlbflush-test: Add TLB torture test
> >>   arm/locking-tests: add comprehensive locking test
> >>   arm/barrier-litmus-tests: add some litmus tests
> >>   arm/tcg-test: some basic TCG exercising tests
> >>
> >> Andrew Jones (3):
> >>   arm/arm64: irq enable/disable
> >>   arm/arm64: Add initial gic support
> >>   arm/arm64: Add IPI test
> >
> > I've actually updated these patches a bit, and started extending the
> > series to also work with a v3 gic. I'll pick that back up and get it
> > posted for you (hopefully next week). Or I'll at least update my
> > arm/ipi-test branch with the changes I've made for gicv2...
> 
> I'm getting ready to post the current iteration and I realised I hadn't
> seen your updates. Have they gone public anywhere?

Sorry. I didn't finish polishing the gicv3 stuff so didn't end up
sending anything. I'll send something tomorrow (same story as last time,
if not gicv3 stuff, at least updated gicv2 :-)

Thanks,
drew

Re: [Qemu-devel] [PATCH for-2.6 3/3] qom: -object error messages lost location, restore it

2016-04-27 Thread Eric Blake

On 04/27/2016 08:29 AM, Markus Armbruster wrote:
> qemu_opts_foreach() runs its callback with the error location set to
> the option's location.  Any errors the callback reports use the
> option's location automatically.
> 
> Commit 90998d5 moved the actual error reporting from "inside"
> qemu_opts_foreach() to after it.  Here's a typical hunk:
> 
>if (qemu_opts_foreach(qemu_find_opts("object"),
> -  object_create,
> -  object_create_initial, NULL)) {
> +  user_creatable_add_opts_foreach,
> +  object_create_initial, )) {
> +error_report_err(err);
>exit(1);
>}
> 
> Before, object_create() reports from within qemu_opts_foreach(), using
> the option's location.  Afterwards, we do it after
> qemu_opts_foreach(), using whatever location happens to be current.
> Commonly a "none" location.

I agree with Dan that Error objects ought to track the Location in
effect at the point the Error is first registered, rather than
concatenating the two back together at the time the Error is eventually
reported; but also that such a change is too big to even consider this
late in 2.6.  So as a band-aid, this particular patch improves the error
message quality back to its useful state.

Reviewed-by: Eric Blake 


> Note that the qemu_opts_foreach() bug just fixed could mask the bug
> here: if the location it leaves dandling hasn't been clobbered, yet,

s/dandling/dangling/

> it's the correct one.
> 
> Reported-by: Eric Blake 
> Cc: Daniel P. Berrange 
> Signed-off-by: Markus Armbruster 
> ---


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 for-2.7 0/8] blockdev: (Nearly) free clean-up work

2016-04-27 Thread Kevin Wolf

Am 08.04.2016 um 19:10 hat Max Reitz geschrieben:
> After a lot has been restructed in the block layer in the past, we can
> now reap at least one of the fruits: Make bdrv_open() return a BDS!
> 
> 
> This series depends on the following series/patches:
> - Revert "block: Forbid I/O throttling on nodes with multiple parents
>   for 2.6"
>   This is something I suppose Kevin will send when the 2.7 development
>   window opens.
> - "block: Move I/O throttling to BlockBackend" by Kevin
> - "block: Remove BlockDriverState.blk" by Kevin

Reviewed-by: Kevin Wolf 

Waiting for the dependencies before merging this into block-next (just
in case you're wondering how you could speed up the process ;-)).

Kevin

Re: [Qemu-devel] [PATCH 1/2] Block: Cleanup vvfat.c to remove dead code.

2016-04-27 Thread Stefan Hajnoczi

On Fri, Apr 15, 2016 at 08:32:53AM +0530, Prerna Saxena wrote:
> Commit 43dc2a64 replaced assert() with abort(), but didnt remove statements
> that followed these calls. So current code still has return values set after
> a call to abort(). Such statements will never execute and need to be cleaned
> up.
> 
> Signed-off-by: Prerna Saxena 
> ---
>  block/vvfat.c | 17 +++--
>  1 file changed, 3 insertions(+), 14 deletions(-)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index 6b85314..ffe739b 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -1747,8 +1747,7 @@ static uint32_t 
> get_cluster_count_for_direntry(BDRVVVFATState* s,
>   schedule_new_file(s, g_strdup(path), cluster_num);
>   else {
>  abort();
> - return 0;
> - }
> + }

The following change breaks indentation:

  ^Ielse {
  ...
  -^I}
  +^I}

Since the else statement only has a single tab character it's wrong to
have mixed tabs and spaces for the closing curly.

Please resend this patch with tab-only indentation.  The diff hunks
below sometimes introduce spaces.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled

2016-04-27 Thread Kirill A. Shutemov

On Wed, Apr 27, 2016 at 04:59:57PM +0200, Andrea Arcangeli wrote:
> On Wed, Apr 27, 2016 at 04:50:30PM +0300, Kirill A. Shutemov wrote:
> > I know nothing about kvm. How do you protect against pmd splitting between
> > get_user_pages() and the check?
> 
> get_user_pages_fast() runs fully lockless and unpins the page right
> away (we need a get_user_pages_fast without the FOLL_GET in fact to
> avoid a totally useless atomic_inc/dec!).
> 
> Then we take a lock that is also taken by
> mmu_notifier_invalidate_range_start. This way __split_huge_pmd will
> block in mmu_notifier_invalidate_range_start if it tries to run again
> (every other mmu notifier like mmu_notifier_invalidate_page will also
> block).
> 
> Then after we serialized against __split_huge_pmd through the MMU
> notifier KVM internal locking, we are able to tell if any mmu_notifier
> invalidate happened in the region just before get_user_pages_fast()
> was invoked, until we call PageCompoundTransMap and we actually map
> the shadow pagetable into the compound page with hugepage
> granularity (to allow real 2MB TLBs if guest also uses trans_huge_pmd
> in the guest pagetables).
> 
> After the shadow pagetable is mapped, we drop the internal MMU
> notifier lock and __split_huge_pmd mmu_notifier_invalidate_range_start
> can continue and drop the shadow pagetable that we just mapped in the
> above paragraph just before dropping the mmu notifier internal lock.
> 
> To be able to tell if any invalidate happened while
> get_user_pages_fast was running and until we grab the lock again and
> we start mapping the shadow pagtable we use:
> 
>   mmu_seq = vcpu->kvm->mmu_notifier_seq;
>   smp_rmb();
> 
>   if (try_async_pf(vcpu, prefault, gfn, v, , write, _writable))
>    this is get_user_pages and does put_page on the page
>and just returns the 
>this is why we need a get_user_pages_fast that won't
>attempt to touch the page->_count at all! we can avoid
>2 atomic ops for each secondary MMU fault that way
>   return 0;
> 
>   spin_lock(>kvm->mmu_lock);
>   if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
>   goto out_unlock;
>   ... here we check PageTransCompoundMap(pfn_to_page(pfn)) and
>   map a 4k or 2MB shadow pagetable on "pfn" ...
> 
> 
> Note mmu_notifier_retry does the other side of the smp_rmb():
> 
>   smp_rmb();
>   if (kvm->mmu_notifier_seq != mmu_seq)
>   return 1;
>   return 0;

Okay, I see.

But do we really want to make PageTransCompoundMap() visiable beyond KVM
code? It looks like too KVM-specific.

-- 
 Kirill A. Shutemov

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Andy Lutomirski

On Wed, Apr 27, 2016 at 7:54 AM, Michael S. Tsirkin  wrote:
> On Wed, Apr 27, 2016 at 07:43:07AM -0700, Andy Lutomirski wrote:
>> On Wed, Apr 27, 2016 at 7:38 AM, Michael S. Tsirkin  wrote:
>> > On Wed, Apr 27, 2016 at 07:31:43AM -0700, Andy Lutomirski wrote:
>> >> On Wed, Apr 27, 2016 at 7:23 AM, Joerg Roedel  wrote:
>> >> > On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
>> >> >> One correction: it's a feature of the device in the system.
>> >> >> There could be a mix of devices bypassing and not
>> >> >> bypassing the IOMMU.
>> >> >
>> >> > No, it really is not. A device can't chose to bypass the IOMMU. But the
>> >> > IOMMU can chose to let the device bypass. So any fix here belongs
>> >> > into the platform/iommu code too and not into some driver.
>> >> >
>> >> >> Sounds good. And a way to detect appropriate devices could
>> >> >> be by looking at the feature flag, perhaps?
>> >> >
>> >> > Again, no! The way to detect that is to look into the iommu description
>> >> > structures provided by the firmware. They provide everything necessary
>> >> > to tell the iommu code which devices are not translated.
>> >> >
>> >>
>> >> Except on PPC and SPARC.  As far as I know, those are the only
>> >> problematic platforms.
>> >>
>> >> Is it too late to *disable* QEMU's q35-iommu thingy until it can be
>> >> fixed to report correct data in the DMAR tables?
>> >>
>> >> --Andy
>> >
>> > Meaning virtio or assigned devices?
>> > For virtio - it's way too late since these are working configurations.
>> > For assigned devices - they don't work on x86 so it doesn't have
>> > to be disabled, it's safe to ignore.
>>
>> I mean actually prevent QEMU from running in q35-iommu mode with any
>> virtio devices attached or maybe even turn off q35-iommu mode entirely
>> [1].  Doesn't it require that the user literally pass the word
>> "experimental" into QEMU right now?  It did at some point IIRC.
>>
>> The reason I'm asking is that, other than q35-iommu, QEMU's virtio
>> devices *don't* bypass the IOMMU except on PPC and SPARC, simply
>> because there is no other configuration AFAICT that has virtio and and
>> IOMMU.  So maybe the right solution is to fix q35-iommu to use DMAR
>> correctly (thus breaking q35-iommu users with older guest kernels,
>> which hopefully don't actually exist) and to come up with a PPC- and
>> SPARC-specific solution, or maybe OpenFirmware-specific solution, to
>> handle PPC and SPARC down the road.
>>
>> [1] I'm pretty sure I emailed the QEMU list before q35-iommu ever
>> showed up in a release asking the QEMU team to please not do that
>> until this issue was resolved.  Sadly, that email was ignored :(
>>
>> --Andy
>
> Sorry, I didn't make myself clear.
> Point is, QEMU is not the only virtio implementation out there.
> So we can't know no virtio implementations have an IOMMU as long as
> linux supports this IOMMU.
> virtio always used physical addresses since it was born and if it
> changes that it must do this in a way that does not break existing
> users.

Is there any non-QEMU virtio implementation can provide an
IOMMU-bypassing virtio device on a platform that has a nontrivial
IOMMU?

--Andy

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread David Woodhouse

On Wed, 2016-04-27 at 18:05 +0300, Michael S. Tsirkin wrote:
> 
> I really don't get it.
> 
> There's exactly one device that works now and needs the work-around and
> so that we need to support, and that is virtio. It happens to have
> exactly the same issue on all platforms.

False. We have other devices which are currently *not* translated by
the emulated IOMMU and which aren't going to be in the short term
either.

We also have other devices (emulated hardware NICs) to which precisely
the same "we don't need protection" arguments apply, and which we
*could* expose to the guest without an IOMMU translation if we really
wanted to. It makes as much sense as exposing virtio without an IOMMU,
going forward.

> Why would we want to work hard to build platform-specific
> solutions to a problem that can be solved in 5 lines of
> generic code?

Because it's a dirty hack in the *wrong* place.

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature

Re: [Qemu-devel] [PATCH v4 1/1] qemu-img: check block status of backing file when converting.

2016-04-27 Thread Ren Kimura

Aha. I realized what bdrv_get_block_status_above(bs, NULL...) meant.
Sorry for many times resendings.
I'll fix it next time then.

Ren

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 04:58:51PM +0200, Joerg Roedel wrote:
> On Wed, Apr 27, 2016 at 05:54:57PM +0300, Michael S. Tsirkin wrote:
> > Point is, QEMU is not the only virtio implementation out there.
> > So we can't know no virtio implementations have an IOMMU as long as
> > linux supports this IOMMU.
> > virtio always used physical addresses since it was born and if it
> > changes that it must do this in a way that does not break existing
> > users.
> 
> FWIW, virtio in qemu can continue to just use physical addresses. But
> qemu needs to advertise that fact correctly to the OS in the DMAR table.
> This way old kernels (where virtio does not use DMA-API) will also
> continue to work on the fixed qemu.
> 
> 
> 
>   Joerg

It's not clear it can do this since DMAR tables seem to assume
a given slot is either bypassing IOMMU or going through it.
QEMU allows reusing same slot for virtio and non-virtio devices.

Besides, this patch is not about it, it's a flag for QEMU to
tell guest that it can trust DMAR tables.

-- 
MST

Re: [Qemu-devel] [RFC 00/11] Current MTTCG kvm-unit-test patches

2016-04-27 Thread Alex Bennée


Andrew Jones  writes:

> On Fri, Feb 26, 2016 at 01:15:22PM +, Alex Bennée wrote:
>> Hi,
>>
>> Some of these patches have been posted before and previous patches
>> have already been accepted upstream so I'm tagging this as a new RFC
>> series.
>>
>> This is a series of tests built around kvm-unit-tests but built with
>> the express purpose of stressing the TCG, in particular MTTCG builds.
>>
>> Changes from previous appearances:
>>
>>  * Separated locking and barrier tests
>>  * Included Drew's IPI patches (used in tcg-test)
>>  * New TCG chaining test
>>
>> The new barrier tests really only fails when running on MTTCG builds on
>> a weak backend. Many thanks to Will Deacon for helping me get a
>> working test case at the last Connect.
>>
>> I'm mainly posting these for reference for others testing MTTCG as
>> I've still got to check I've addressed any outstanding review
>> comments. However there has been enough code churn some of the
>> comments may no longer be relevant.
>>
>> The TCG tests are also useful as benchmarks for comparing the cost of
>> having chained basic blocks versus exiting the loop every time. The
>> pathological case is the computed jumps test as all the addresses are
>> within a PAGE_SIZE boundary the tb_jump_cache has no effect meaning a
>> full look up each time.
>>
>> Alex Bennée (8):
>>   config/config-arm-common: build-up tests-common target
>>   lib: add isaac prng library from CCAN
>>   arm/run: set indentation defaults for emacs
>>   arm/run: allow aarch64 to start arm binaries
>>   arm/tlbflush-test: Add TLB torture test
>>   arm/locking-tests: add comprehensive locking test
>>   arm/barrier-litmus-tests: add some litmus tests
>>   arm/tcg-test: some basic TCG exercising tests
>>
>> Andrew Jones (3):
>>   arm/arm64: irq enable/disable
>>   arm/arm64: Add initial gic support
>>   arm/arm64: Add IPI test
>
> I've actually updated these patches a bit, and started extending the
> series to also work with a v3 gic. I'll pick that back up and get it
> posted for you (hopefully next week). Or I'll at least update my
> arm/ipi-test branch with the changes I've made for gicv2...

I'm getting ready to post the current iteration and I realised I hadn't
seen your updates. Have they gone public anywhere?

>
> drew


--
Alex Bennée

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 04:56:32PM +0200, Joerg Roedel wrote:
> On Wed, Apr 27, 2016 at 05:34:30PM +0300, Michael S. Tsirkin wrote:
> > On Wed, Apr 27, 2016 at 04:23:32PM +0200, Joerg Roedel wrote:
> 
> > QEMU can choose to bypass IOMMU for one device and not the other.
> > IOMMU in QEMU isn't involved when it's bypassed.
> 
> And it is QEMU's task to tell the OS, right? And the correct way to do
> this is via the firmware ACPI tables.

Going forward, this might be reasonable. Of course it didn't in the past
and no one cared because virtio devices used physical addresses. We have
to keep these setups working.

> > Fine but this is beside the point. Almost all virtio devices
> > bypass IOMMU and what this patch does is create a way
> > to detect devices that don't. This code can maybe go into
> > platform.
> 
> Again, the way to detect this is in platform code must not be device
> specific. This is what the DMAR and IVRS tables on x86 are for.
> 
> When there is no way to do this in the firmware (or there is no firmware
> at all), we have to do a quirk in the platform code for it.
> 
> 
> 
>   Joerg

I really don't get it.

There's exactly one device that works now and needs the work-around and
so that we need to support, and that is virtio. It happens to have
exactly the same issue on all platforms.

Why would we want to work hard to build platform-specific
solutions to a problem that can be solved in 5 lines of
generic code?

-- 
MST

Re: [Qemu-devel] [PATCH for-2.7 3/9] s390x/ipl: Extend the IplParameterBlock struct

2016-04-27 Thread Cornelia Huck

On Wed, 27 Apr 2016 15:43:45 +0200
Christian Borntraeger  wrote:

> On 04/25/2016 05:18 PM, Cornelia Huck wrote:
> 
> 
> > +union {
> > +IplBlockCcw ccw;
> > +IplBlockCcw fcp;
> 
> you later fix this up in patch 
>  s390x/ipl: Provide ipl parameter block
> 
> -IplBlockCcw fcp;
> +IplBlockFcp fcp;
> 
> 
> Maybe just move this hunk to patch 3.
> 

Oops, definetly :)

Re: [Qemu-devel] [PATCH for-2.6 2/3] replay: Fix dangling location bug in replay_configure()

2016-04-27 Thread Eric Blake

On 04/27/2016 08:29 AM, Markus Armbruster wrote:
> replay_configure() pushes and pops a Location with automatic storage
> duration.  Except it fails to pop when -icount parameter "rr" isn't
> given.  cur_loc then points to unused stack space, and will most
> likely get clobbered in short order.
> 
> Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
> report bogus locations.
> 
> Broken in commit 890ad55.
> 
> I didn't take the time to find a reproducer.
> 
> Cc: Eduardo Habkost 
> Signed-off-by: Markus Armbruster 
> ---
>  replay/replay.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] Hang with migration multi-thread compression under high load

2016-04-27 Thread Daniel P. Berrange

On Wed, Apr 27, 2016 at 03:29:30PM +0100, Dr. David Alan Gilbert wrote:
> ccing in Liang Li
> 
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > for some reason it isn't shown in the stack thrace for thread
> > 1 above, when initially connecting GDB it says the main thread
> > is at:
> > 
> > decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000, 
> > f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
> > 2254for (idx = 0; idx < thread_count; idx++) {
> > 
> > 
> > Looking at the target QEMU, we see  do_data_decompress method
> > is waiting in a condition var:
> > 
> > while (!param->start && !quit_decomp_thread) {
> > qemu_cond_wait(>cond, >mutex);
> > do stuff..
> > param->start = false
> > }
> > 
> > 
> > Now the decompress_data_with_multi_threads is checking param->start without
> > holding the param->mutex lock.
> > 
> > Changing decompress_data_with_multi_threads to acquire param->mutex
> > lock makes it work, but isn't ideal, since that now blocks the
> > decompress_data_with_multi_threads() method on the completion of
> > each thread, which defeats the point of having multiple threads.

FWIW, the following patch also appears to "fix" the issue, presumably
by just making the race much less likely to hit:

diff --git a/migration/ram.c b/migration/ram.c
index 3f05738..be0233f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2271,6 +2271,7 @@ static void decompress_data_with_multi_threads(QEMUFile 
*f,
 if (idx < thread_count) {
 break;
 }
+sched_yield();
 }
 }
 

Incidentally IIUC, this decompress_data_with_multi_threads is just
busy waiting for a thread to become free, which seems pretty wasteful
of CPU resources. I wonder if there's a more effective way to structure
this, so that instead of having decompress_data_with_multi_threads()
choose which thread to pass the decompression job to, it just puts
the job into a queue, and then let all the threads pull from that
shared queue. IOW whichever thread the kerenl decides to wakeup would
get the job, without us having to explicitly assign a thread to the
job.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH v9 10/11] qemu-iotests: test overlapping block-stream operations

2016-04-27 Thread Alberto Garcia

On Wed 27 Apr 2016 03:48:26 PM CEST, Max Reitz wrote:
>> +# Attach the drive to the VM
>> +self.vm = iotests.VM()
>> +self.vm.add_drive("blkdebug::" + self.imgs[-1], ','.join(opts))
>
> Any special reason for blkdebug? For me it works just fine without.

Oh, I think it's just a remainder from previous tests that did use
blkdebug. I'll remove it.

Berto

Re: [Qemu-devel] [PATCH 1/1] mm: thp: kvm: fix memory corruption in KVM with THP enabled

2016-04-27 Thread Andrea Arcangeli

On Wed, Apr 27, 2016 at 04:50:30PM +0300, Kirill A. Shutemov wrote:
> I know nothing about kvm. How do you protect against pmd splitting between
> get_user_pages() and the check?

get_user_pages_fast() runs fully lockless and unpins the page right
away (we need a get_user_pages_fast without the FOLL_GET in fact to
avoid a totally useless atomic_inc/dec!).

Then we take a lock that is also taken by
mmu_notifier_invalidate_range_start. This way __split_huge_pmd will
block in mmu_notifier_invalidate_range_start if it tries to run again
(every other mmu notifier like mmu_notifier_invalidate_page will also
block).

Then after we serialized against __split_huge_pmd through the MMU
notifier KVM internal locking, we are able to tell if any mmu_notifier
invalidate happened in the region just before get_user_pages_fast()
was invoked, until we call PageCompoundTransMap and we actually map
the shadow pagetable into the compound page with hugepage
granularity (to allow real 2MB TLBs if guest also uses trans_huge_pmd
in the guest pagetables).

After the shadow pagetable is mapped, we drop the internal MMU
notifier lock and __split_huge_pmd mmu_notifier_invalidate_range_start
can continue and drop the shadow pagetable that we just mapped in the
above paragraph just before dropping the mmu notifier internal lock.

To be able to tell if any invalidate happened while
get_user_pages_fast was running and until we grab the lock again and
we start mapping the shadow pagtable we use:

mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();

if (try_async_pf(vcpu, prefault, gfn, v, , write, _writable))
 this is get_user_pages and does put_page on the page
 and just returns the 
 this is why we need a get_user_pages_fast that won't
 attempt to touch the page->_count at all! we can avoid
 2 atomic ops for each secondary MMU fault that way
return 0;

spin_lock(>kvm->mmu_lock);
if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
goto out_unlock;
... here we check PageTransCompoundMap(pfn_to_page(pfn)) and
map a 4k or 2MB shadow pagetable on "pfn" ...

Note mmu_notifier_retry does the other side of the smp_rmb():

smp_rmb();
if (kvm->mmu_notifier_seq != mmu_seq)
return 1;
return 0;

Re: [Qemu-devel] [PATCH v9 05/11] block: allow block jobs in any arbitrary node

2016-04-27 Thread Alberto Garcia

On Wed 27 Apr 2016 02:30:55 PM CEST, Max Reitz wrote:
>> @@ -3694,18 +3696,18 @@ static BlockJob *find_block_job(const char *device, 
>> AioContext **aio_context,
>>  
>>  *aio_context = NULL;
>>  
>> -blk = blk_by_name(device);
>> -if (!blk) {
>> +bs = bdrv_lookup_bs(device_or_node, device_or_node, errp);
>> +if (!bs) {
>
> If we get here, *errp is already set... [1]

Good catch, thanks.

>> -if (!blk_is_available(blk)) {
>> +blk = blk_by_name(device_or_node);
>> +if (blk && !blk_is_available(blk)) {
>
> I'd just drop this. The reason blk_is_available() was added here
> because it became possible for BBs not to have a BDS.
>
> Now that you get the BDS directly through bdrv_lookup_bs(), it's no
> longer necessary.

Ok.

>> @@ -78,7 +79,7 @@ void *block_job_create(const BlockJobDriver *driver, 
>> BlockDriverState *bs,
>>  bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
>>  
>>  job->driver= driver;
>> -job->id= g_strdup(bdrv_get_device_name(bs));
>> +job->id= g_strdup(bdrv_get_device_or_node_name(bs));
>
> Hm, since this is only used for events, it's not too bad that a block
> job will have a different name if it was created on a root BDS by
> specifying its node name. But it still is kind of strange.

The idea is that if you create a block job on a root BDS (which is still
the most common case) you get the same id that you used to get before
this series.

Berto

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Joerg Roedel

On Wed, Apr 27, 2016 at 05:54:57PM +0300, Michael S. Tsirkin wrote:
> Point is, QEMU is not the only virtio implementation out there.
> So we can't know no virtio implementations have an IOMMU as long as
> linux supports this IOMMU.
> virtio always used physical addresses since it was born and if it
> changes that it must do this in a way that does not break existing
> users.

FWIW, virtio in qemu can continue to just use physical addresses. But
qemu needs to advertise that fact correctly to the OS in the DMAR table.
This way old kernels (where virtio does not use DMA-API) will also
continue to work on the fixed qemu.

Joerg

Re: [Qemu-devel] [PATCH v6 01/10] qom: add helpers for UserCreatable object types

2016-04-27 Thread Daniel P. Berrange

On Wed, Apr 27, 2016 at 06:43:43AM -0600, Eric Blake wrote:
> On 04/27/2016 03:58 AM, Daniel P. Berrange wrote:
> > On Wed, Apr 27, 2016 at 11:26:23AM +0200, Markus Armbruster wrote:
> >> This commit regresses error message quality from
> >>
> >> $ qemu-system-x86_64 -nodefaults -display none -object 
> >> secret,id=sec0,data=letmein,format=raw,foo=bar
> >> qemu-system-x86_64: -object 
> >> secret,id=sec0,data=letmein,format=raw,foo=bar: Property '.foo' not found
> >>
> >> to just
> >>
> >> qemu-system-x86_64: Property '.foo' not found
> > 
> > I'm not seeing that behaviour myself in current git master, nor
> > immediately before or after 90998d58964cd17f8b0b03800b0a4508f8b543da
> > is applied. I always just get
> > 
> >  $ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -display none -object 
> > secret,id=sec0,data=letmein,format=raw,foo=bar
> >  qemu-system-x86_64: -object 
> > secret,id=sec0,data=letmein,format=raw,foo=bar: Property '.foo' not found
> > 
> > So it all appears to be working correctly. How reliably reproducable
> > is it for you ?  I'm testing on Fedora 23 x86_64 host and can't
> > see the failure despite many invokations.
> 
> I'm reproducing it on my F23 machine, where 90998d58 indeed flips the
> behavior I'm seeing. Maybe it's a factor of which malloc engine is in
> use, or level of compiler optimization?
> 
> My config.status states:
> exec '/home/eblake/qemu/configure' '--enable-kvm' '--enable-system'
> '--disable-user' '--target-list=x86_64-softmmu,ppc64-softmmu'
> '--enable-debug'

For the record, the --enable-debug flag was the key to making this
bug reproducable


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Joerg Roedel

On Wed, Apr 27, 2016 at 05:34:30PM +0300, Michael S. Tsirkin wrote:
> On Wed, Apr 27, 2016 at 04:23:32PM +0200, Joerg Roedel wrote:

> QEMU can choose to bypass IOMMU for one device and not the other.
> IOMMU in QEMU isn't involved when it's bypassed.

And it is QEMU's task to tell the OS, right? And the correct way to do
this is via the firmware ACPI tables.

> Fine but this is beside the point. Almost all virtio devices
> bypass IOMMU and what this patch does is create a way
> to detect devices that don't. This code can maybe go into
> platform.

Again, the way to detect this is in platform code must not be device
specific. This is what the DMAR and IVRS tables on x86 are for.

When there is no way to do this in the firmware (or there is no firmware
at all), we have to do a quirk in the platform code for it.

Joerg

Re: [Qemu-devel] [Qemu-discuss] iolimits for virtio-9p

2016-04-27 Thread Pradeep Kiruvale

On 27 April 2016 at 10:38, Alberto Garcia  wrote:

> On Wed, Apr 27, 2016 at 09:29:02AM +0200, Pradeep Kiruvale wrote:
>
> > Thanks for the reply. I am still in the early phase, I will let you
> > know if any changes are needed for the APIs.
> >
> > We might also have to implement throttle-group.c for 9p devices, if
> > we want to apply throttle for group of devices.
>
> Fair enough, but again please note that:
>
> - throttle-group.c is not meant to be generic, but it's tied to
>   BlockDriverState / BlockBackend.
> - it is currently being rewritten:
>   https://lists.gnu.org/archive/html/qemu-block/2016-04/msg00645.html
>
> If you can explain your use case with a bit more detail we can try to
> see what can be done about it.
>
>
We want to use  virtio-9p for block io instead of virtio-blk-pci. But in
case of
virtio-9p we can just use fsdev devices, so we want to apply throttling
(QoS)
on these devices and as of now the io throttling only possible with the
-drive option.

As a work around we are doing the throttling using cgroup. It has its own
costs.
So, we want to have throttling for fsdev devices inside the qemu itself. I
am just
trying to understand and estimate time required for implementing it for the
fsdevices.


-Pradeep

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 07:43:07AM -0700, Andy Lutomirski wrote:
> On Wed, Apr 27, 2016 at 7:38 AM, Michael S. Tsirkin  wrote:
> > On Wed, Apr 27, 2016 at 07:31:43AM -0700, Andy Lutomirski wrote:
> >> On Wed, Apr 27, 2016 at 7:23 AM, Joerg Roedel  wrote:
> >> > On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
> >> >> One correction: it's a feature of the device in the system.
> >> >> There could be a mix of devices bypassing and not
> >> >> bypassing the IOMMU.
> >> >
> >> > No, it really is not. A device can't chose to bypass the IOMMU. But the
> >> > IOMMU can chose to let the device bypass. So any fix here belongs
> >> > into the platform/iommu code too and not into some driver.
> >> >
> >> >> Sounds good. And a way to detect appropriate devices could
> >> >> be by looking at the feature flag, perhaps?
> >> >
> >> > Again, no! The way to detect that is to look into the iommu description
> >> > structures provided by the firmware. They provide everything necessary
> >> > to tell the iommu code which devices are not translated.
> >> >
> >>
> >> Except on PPC and SPARC.  As far as I know, those are the only
> >> problematic platforms.
> >>
> >> Is it too late to *disable* QEMU's q35-iommu thingy until it can be
> >> fixed to report correct data in the DMAR tables?
> >>
> >> --Andy
> >
> > Meaning virtio or assigned devices?
> > For virtio - it's way too late since these are working configurations.
> > For assigned devices - they don't work on x86 so it doesn't have
> > to be disabled, it's safe to ignore.
> 
> I mean actually prevent QEMU from running in q35-iommu mode with any
> virtio devices attached or maybe even turn off q35-iommu mode entirely
> [1].  Doesn't it require that the user literally pass the word
> "experimental" into QEMU right now?  It did at some point IIRC.
> 
> The reason I'm asking is that, other than q35-iommu, QEMU's virtio
> devices *don't* bypass the IOMMU except on PPC and SPARC, simply
> because there is no other configuration AFAICT that has virtio and and
> IOMMU.  So maybe the right solution is to fix q35-iommu to use DMAR
> correctly (thus breaking q35-iommu users with older guest kernels,
> which hopefully don't actually exist) and to come up with a PPC- and
> SPARC-specific solution, or maybe OpenFirmware-specific solution, to
> handle PPC and SPARC down the road.
> 
> [1] I'm pretty sure I emailed the QEMU list before q35-iommu ever
> showed up in a release asking the QEMU team to please not do that
> until this issue was resolved.  Sadly, that email was ignored :(
> 
> --Andy

Sorry, I didn't make myself clear.
Point is, QEMU is not the only virtio implementation out there.
So we can't know no virtio implementations have an IOMMU as long as
linux supports this IOMMU.
virtio always used physical addresses since it was born and if it
changes that it must do this in a way that does not break existing
users.

-- 
MST

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 04:23:32PM +0200, Joerg Roedel wrote:
> On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
> > One correction: it's a feature of the device in the system.
> > There could be a mix of devices bypassing and not
> > bypassing the IOMMU.
> 
> No, it really is not. A device can't chose to bypass the IOMMU. But the
> IOMMU can chose to let the device bypass.

QEMU can choose to bypass IOMMU for one device and not the other.
IOMMU in QEMU isn't involved when it's bypassed.

> So any fix here belongs
> into the platform/iommu code too and not into some driver.

Fine but this is beside the point. Almost all virtio devices
bypass IOMMU and what this patch does is create a way
to detect devices that don't. This code can maybe go into
platform.

> > Sounds good. And a way to detect appropriate devices could
> > be by looking at the feature flag, perhaps?
> 
> Again, no! The way to detect that is to look into the iommu description
> structures provided by the firmware. They provide everything necessary
> to tell the iommu code which devices are not translated.
> 
> 
> 
>   Joerg

It would be easy if they did but they don't do this on all systems.
In particular the idea for firmware interface was clearly
that a given bus either is connected through IOMMU or bypassing it.
Whether virtio bypasses the iommu is unrelated to the bus it's on.

-- 
MST

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Andy Lutomirski

On Wed, Apr 27, 2016 at 7:38 AM, Michael S. Tsirkin  wrote:
> On Wed, Apr 27, 2016 at 07:31:43AM -0700, Andy Lutomirski wrote:
>> On Wed, Apr 27, 2016 at 7:23 AM, Joerg Roedel  wrote:
>> > On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
>> >> One correction: it's a feature of the device in the system.
>> >> There could be a mix of devices bypassing and not
>> >> bypassing the IOMMU.
>> >
>> > No, it really is not. A device can't chose to bypass the IOMMU. But the
>> > IOMMU can chose to let the device bypass. So any fix here belongs
>> > into the platform/iommu code too and not into some driver.
>> >
>> >> Sounds good. And a way to detect appropriate devices could
>> >> be by looking at the feature flag, perhaps?
>> >
>> > Again, no! The way to detect that is to look into the iommu description
>> > structures provided by the firmware. They provide everything necessary
>> > to tell the iommu code which devices are not translated.
>> >
>>
>> Except on PPC and SPARC.  As far as I know, those are the only
>> problematic platforms.
>>
>> Is it too late to *disable* QEMU's q35-iommu thingy until it can be
>> fixed to report correct data in the DMAR tables?
>>
>> --Andy
>
> Meaning virtio or assigned devices?
> For virtio - it's way too late since these are working configurations.
> For assigned devices - they don't work on x86 so it doesn't have
> to be disabled, it's safe to ignore.

I mean actually prevent QEMU from running in q35-iommu mode with any
virtio devices attached or maybe even turn off q35-iommu mode entirely
[1].  Doesn't it require that the user literally pass the word
"experimental" into QEMU right now?  It did at some point IIRC.

The reason I'm asking is that, other than q35-iommu, QEMU's virtio
devices *don't* bypass the IOMMU except on PPC and SPARC, simply
because there is no other configuration AFAICT that has virtio and and
IOMMU.  So maybe the right solution is to fix q35-iommu to use DMAR
correctly (thus breaking q35-iommu users with older guest kernels,
which hopefully don't actually exist) and to come up with a PPC- and
SPARC-specific solution, or maybe OpenFirmware-specific solution, to
handle PPC and SPARC down the road.

[1] I'm pretty sure I emailed the QEMU list before q35-iommu ever
showed up in a release asking the QEMU team to please not do that
until this issue was resolved.  Sadly, that email was ignored :(

--Andy

Re: [Qemu-devel] [PATCH for-2.6 3/3] qom: -object error messages lost location, restore it

2016-04-27 Thread Daniel P. Berrange

On Wed, Apr 27, 2016 at 04:29:09PM +0200, Markus Armbruster wrote:
> qemu_opts_foreach() runs its callback with the error location set to
> the option's location.  Any errors the callback reports use the
> option's location automatically.
> 
> Commit 90998d5 moved the actual error reporting from "inside"
> qemu_opts_foreach() to after it.  Here's a typical hunk:
> 
>if (qemu_opts_foreach(qemu_find_opts("object"),
> -  object_create,
> -  object_create_initial, NULL)) {
> +  user_creatable_add_opts_foreach,
> +  object_create_initial, )) {
> +error_report_err(err);
>exit(1);
>}
> 
> Before, object_create() reports from within qemu_opts_foreach(), using
> the option's location.  Afterwards, we do it after
> qemu_opts_foreach(), using whatever location happens to be current.
> Commonly a "none" location.

IMHO this shows a major design flaw with error_report_err() method
and the location handling. The design pattern we have for "Error *"
objects is that we can freely propagate them up the caller, because
it is a self-contained record of the error information. As soon as
you do that you loose the location information, because it was not
in fact associated with the Error, but rather stored in a single
global variable. For that matter, the Location info isn't even
thread safe AFAICT since its a simple state var, so you better hope
that there's no code which calls loc_push/pop from a non-main thread :-(

> 
> Reproducer:
> 
> $ qemu-system-x86_64 -nodefaults -display none -object 
> secret,id=foo,foo=bar
> qemu-system-x86_64: Property '.foo' not found
> 
> Note no location.  This commit restores it:
> 
> qemu-system-x86_64: -object secret,id=foo,foo=bar: Property '.foo' not 
> found
> 
> Note that the qemu_opts_foreach() bug just fixed could mask the bug
> here: if the location it leaves dandling hasn't been clobbered, yet,
> it's the correct one.
> 
> Reported-by: Eric Blake 
> Cc: Daniel P. Berrange 
> Signed-off-by: Markus Armbruster 
> ---
>  include/qom/object_interfaces.h |  5 +++--
>  qemu-img.c  | 39 +++
>  qemu-io.c   |  3 +--
>  qemu-nbd.c  |  3 +--
>  qom/object_interfaces.c |  4 +++-
>  vl.c|  6 ++
>  6 files changed, 21 insertions(+), 39 deletions(-)
> 
> diff --git a/include/qom/object_interfaces.h b/include/qom/object_interfaces.h
> index d579746..8b17f4d 100644
> --- a/include/qom/object_interfaces.h
> +++ b/include/qom/object_interfaces.h
> @@ -140,7 +140,7 @@ typedef bool (*user_creatable_add_opts_predicate)(const 
> char *type);
>   * user_creatable_add_opts_foreach:
>   * @opaque: a user_creatable_add_opts_predicate callback or NULL
>   * @opts: options to create
> - * @errp: if an error occurs, a pointer to an area to store the error
> + * @errp: unused
>   *
>   * An iterator callback to be used in conjunction with
>   * the qemu_opts_foreach() method for creating a list of
> @@ -148,8 +148,9 @@ typedef bool (*user_creatable_add_opts_predicate)(const 
> char *type);
>   *
>   * The @opaque parameter can be passed a user_creatable_add_opts_predicate
>   * callback to filter which types of object are created during iteration.
> + * When it fails, report the error.
>   *
> - * Returns: 0 on success, -1 on error
> + * Returns: 0 on success, -1 when an error was reported.
>   */
>  int user_creatable_add_opts_foreach(void *opaque,
>  QemuOpts *opts, Error **errp);
> diff --git a/qemu-img.c b/qemu-img.c
> index 1697762..46f2a6d 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -435,8 +435,7 @@ static int img_create(int argc, char **argv)
>  
>  if (qemu_opts_foreach(_object_opts,
>user_creatable_add_opts_foreach,
> -  NULL, _err)) {
> -error_report_err(local_err);
> +  NULL, NULL)) {
>  goto fail;
>  }
>  
> @@ -598,7 +597,6 @@ static int img_check(int argc, char **argv)
>  bool writethrough;
>  ImageCheck *check;
>  bool quiet = false;
> -Error *local_err = NULL;
>  bool image_opts = false;
>  
>  fmt = NULL;
> @@ -679,8 +677,7 @@ static int img_check(int argc, char **argv)
>  
>  if (qemu_opts_foreach(_object_opts,
>user_creatable_add_opts_foreach,
> -  NULL, _err)) {
> -error_report_err(local_err);
> +  NULL, NULL)) {
>  return 1;
>  }
>  
> @@ -871,8 +868,7 @@ static int img_commit(int argc, char **argv)
>  
>  if (qemu_opts_foreach(_object_opts,
>user_creatable_add_opts_foreach,
> -  NULL, _err)) {
> -

Re: [Qemu-devel] post-copy is broken?

2016-04-27 Thread Andrea Arcangeli

Hello Liang,

On Mon, Apr 18, 2016 at 10:33:14AM +, Li, Liang Z wrote:
> If the THP is disabled, no fails.
> And your test was always passed, even when  real post-copy was failed. 
> 
> In my env, the output of 
> 'cat /sys/kernel/mm/transparent_hugepage/enabled'  is:
> 
>  [always] ...
> 

Can you test the fix?
https://marc.info/?l=linux-mm=146175869123580=2

This was not a breakage in userfaultfd nor in postcopy. userfaultfd
had no bugs and is fully rock solid and with zero chances of
generating undetected memory corruption like it was happening in v4.5.

As I suspected, the same problem would have happened with any THP
pmd_trans_huge split (swapping/inflating-balloon etc..). Postcopy just
makes it easier to reproduce the problem because it does a scattered
MADV_DONTNEED on the destination qemu guest memory for the pages
redirtied during the last precopy pass that run, or not transferred
(to allow THP faults in destination qemu during precopy), just before
starting the guest in the destination node.

Other reports of KVM memory corruption happening on v4.5 with THP
enabled will also be taken care of by the above fix.

I hope I managed to fix this in time for v4.6 final (current is
v4.6-rc5-69), so the only kernel where KVM must not be used with THP
enabled will be v4.5.

On a side note, this MADV_DONTEED trigger reminded me as soon as the
madvisev syscall is merged, loadvm_postcopy_ram_handle_discard should
start using it to reduce the enter/exit kernel to just 1 (or a few
madvisev in case we want to give a limit to the temporary buffer to
avoid the risk of allocating too much temporary RAM for very large
guests) to do the MADV_DONTNEED scattered zapping. Same thing in
virtio_balloon_handle_output.

Thanks,
Andrea

Re: [Qemu-devel] [PATCH 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev

2016-04-27 Thread Kevin Wolf

Am 27.04.2016 um 16:34 hat Eric Blake geschrieben:
> On 04/27/2016 03:52 AM, Kevin Wolf wrote:
> > It used to be an internal helper function just for implementing
> > bdrv_co_do_readv/writev(), but now that it's a public interface, it
> > deserves a name without "do" in it.
> > 
> > Signed-off-by: Kevin Wolf 

> > +++ b/hw/ide/macio.c
> > @@ -55,8 +55,8 @@ static const int debug_macio = 0;
> >  /*
> >   * Unaligned DMA read/write access functions required for OS X/Darwin which
> >   * don't perform DMA transactions on sector boundaries. These functions are
> > - * modelled on bdrv_co_do_preadv()/bdrv_co_do_pwritev() and so should be
> > - * easy to remove if the unaligned block APIs are ever exposed.
> > + * modelled on bdrv_co_preadv()/bdrv_co_pwritev() and so should be easy to
> > + * remove if the unaligned block APIs are ever exposed.
> >   */
> 
> Is this comment now stale as a result of your series?

No, as I mentioned in the cover letter, bdrv_co_preadv() and
bdrv_co_pwritev() still enforce a minimum alignment of 512. The next
steps towards using unaligned I/O in macio.c are removing that minimum
(which we can now do for all drivers that implement
.bdrv_co_preadv/pwritev) and then using these functions in
dma-helpers.c.

Kevin


pgppoOM3fVU1w.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.6 1/3] QemuOpts: Fix qemu_opts_foreach() dangling location regression

2016-04-27 Thread Eric Blake

On 04/27/2016 08:29 AM, Markus Armbruster wrote:
> qemu_opts_foreach() pushes and pops a Location with automatic storage
> duration.  Except it fails to pop when @func() returns non-zero.
> cur_loc then points to unused stack space, and will most likely get
> clobbered in short order.
> 
> Clobbered cur_loc can make loc_pop() and error_print_loc() crash or
> report bogus locations.
> 
> Affects several qemu command line options as well as qemu-img,
> qemu-io, qemu-nbd -object, and blkdebug's configuration file.
> 
> Broken in commit a4c7367, v2.4.0.

Latent bug means it's not a regression between 2.5 and 2.6, but I agree
that if there is time to get this in 2.6, it is worth having.  It's a
shame that valgrind doesn't catch use of stale stack space.


> cur_loc then points to where qemu_opts_foreach()'s Location used to
> be, i.e. unused stack space.  With optimization, this Location doesn't
> get clobbered for me, and also happens to be the correct location.
> Without optimization, it does get clobbered in a way that makes
> error_report_err() report no location.

And that explains why some people were having problems reproducing.

> 
> Signed-off-by: Markus Armbruster 
> ---
>  util/qemu-option.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Michael S. Tsirkin

On Wed, Apr 27, 2016 at 07:31:43AM -0700, Andy Lutomirski wrote:
> On Wed, Apr 27, 2016 at 7:23 AM, Joerg Roedel  wrote:
> > On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
> >> One correction: it's a feature of the device in the system.
> >> There could be a mix of devices bypassing and not
> >> bypassing the IOMMU.
> >
> > No, it really is not. A device can't chose to bypass the IOMMU. But the
> > IOMMU can chose to let the device bypass. So any fix here belongs
> > into the platform/iommu code too and not into some driver.
> >
> >> Sounds good. And a way to detect appropriate devices could
> >> be by looking at the feature flag, perhaps?
> >
> > Again, no! The way to detect that is to look into the iommu description
> > structures provided by the firmware. They provide everything necessary
> > to tell the iommu code which devices are not translated.
> >
> 
> Except on PPC and SPARC.  As far as I know, those are the only
> problematic platforms.
> 
> Is it too late to *disable* QEMU's q35-iommu thingy until it can be
> fixed to report correct data in the DMAR tables?
> 
> --Andy

Meaning virtio or assigned devices?
For virtio - it's way too late since these are working configurations.
For assigned devices - they don't work on x86 so it doesn't have
to be disabled, it's safe to ignore.

-- 
MST

Re: [Qemu-devel] [PATCH 10/17] vdi: Implement .bdrv_co_pwritev() interface

2016-04-27 Thread Kevin Wolf

Am 27.04.2016 um 16:17 hat Stefan Hajnoczi geschrieben:
> On Wed, Apr 27, 2016 at 11:52:40AM +0200, Kevin Wolf wrote:
> > @@ -703,6 +712,7 @@ static int vdi_co_write(BlockDriverState *bs,
> >  VdiHeader *header = (VdiHeader *) block;
> >  uint8_t *base;
> >  uint64_t offset;
> > +uint32_t n_sectors;
> >  
> >  logout("now writing modified header\n");
> >  assert(VDI_IS_ALLOCATED(bmap_first));
> 
> Unnecessary change?

No, I removed a top-level n_sectors declaration because I wanted the
compiler to complain about any uses in the main part, which should be
byte-aligned now. This final block still uses some sector arithmetics,
though, and therefore needs to reintroduce the variable.

Kevin

pgppC9ARa_m3i.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v2 2/2] spapr: Memory hot-unplug support

2016-04-27 Thread Igor Mammedov

On Wed, 27 Apr 2016 15:59:52 +0200
Thomas Huth  wrote:

> On 27.04.2016 15:37, Igor Mammedov wrote:
> > On Tue, 26 Apr 2016 16:03:37 -0500
> > Michael Roth  wrote:
> >   
> >> Quoting Igor Mammedov (2016-04-26 02:52:36)  
> >>> On Tue, 26 Apr 2016 10:39:23 +0530
> >>> Bharata B Rao  wrote:
> >>> 
>  On Mon, Apr 25, 2016 at 11:20:50AM +0200, Igor Mammedov wrote:
> > On Wed, 16 Mar 2016 10:11:54 +0530
> > Bharata B Rao  wrote:
> >   
> >> On Wed, Mar 16, 2016 at 12:36:05PM +1100, David Gibson wrote:  
> >>> On Tue, Mar 15, 2016 at 10:08:56AM +0530, Bharata B Rao wrote:
>  Add support to hot remove pc-dimm memory devices.
> 
>  Signed-off-by: Bharata B Rao 
> >>>
> >>> Reviewed-by: David Gibson 
> >>>
> >>> Looks correct, but again, needs to wait on the PAPR change.  
> > [...]  
> >>
> >> While we are here, I would also like to get some opinion on the real
> >> need for memory unplug. Is there anything that memory unplug gives us
> >> which memory ballooning (shrinking mem via ballooning) can't give ?
> >>   
> > Sure ballooning can complement memory hotplug but turning it on would
> > effectively reduce hotplug to balloning as it would enable overcommit
> > capability instead of hard partitioning pc-dimms provides. So one
> > could just use ballooning only and not bother with hotplug at all.
> >
> > On the other hand memory hotplug/unplug (at least on x86) tries
> > to model real hardware, thus removing need in paravirt ballooning
> > solution in favor of native guest support.  
> 
>  Thanks for your views.
>  
> >
> > PS:
> > Guest wise, currently hot-unplug is not well supported in linux,
> > i.e. it's not guarantied that guest will honor unplug request
> > as it may pin dimm by using it as a non migratable memory. So
> > there is something to work on guest side to make unplug more
> > reliable/guarantied.  
> 
>  In the above scenario where the guest doesn't allow removal of certain
>  parts of DIMM memory, what is the expected behaviour as far as QEMU
>  DIMM device is concerned ? I seem to be running into this situation
>  very often with PowerPC mem unplug where I am left with a DIMM device
>  that has only some memory blocks released. In this situation, I would 
>  like
>  to block further unplug requests on the same device, but QEMU seems
>  to allow more such unplug requests to come in via the monitor. So
>  qdev won't help me here ? Should I detect such condition from the
>  machine unplug() handler and take required action ?
> >>> I think offlining is a guests task along with recovering from
> >>> inability to offline (i.e. offline all + eject or restore original state).
> >>> QUEM does it's job by notifying guest what dimm it wants to remove
> >>> and removes it when guest asks it (at least in x86 world).
> >>
> >> In the case of pseries, the DIMM abstraction isn't really exposed to
> >> the guest, but rather the memory blocks we use to make the backing
> >> memdev memory available to the guest. During unplug, the guest
> >> completely releases these blocks back to QEMU, and if it can only
> >> release a subset of what's requested it does not attempt to recover.
> >> We can potentially change that behavior on the guest side, since
> >> partially-freed DIMMs aren't currently useful on the host-side...
> >>
> >> But, in the case of pseries, I wonder if it makes sense to maybe go
> >> ahead and MADV_DONTNEED the ranges backing these released blocks so the
> >> host can at least partially reclaim the memory from a partially
> >> unplugged DIMM?  
> > It's a little bit confusing, one asked to remove device but it's still
> > there but not completely usable/available.
> > What will happen when user wants that memory plugged back?  
> 
> As far as I've understood MADV_DONTNEED, you can use the memory again at
> any time - just the previous contents will be gone, which is ok in this
> case since the guest previously marked this area as unavailable.
If host gave returned memory to someone else there might not be enough
resources to give it back (what would happen I can't tell may be VM will
stall or just get exception).

Anyhow I'd suggest ballooning if one needs partial unplug and fix
physical unplug to unplug whole pc-dimm or none instead of
turning pc-dimm device model into some hybrid with balloon device
and making users/mgmt even more confused.

> 
>  Thomas
>

Re: [Qemu-devel] [PATCH 04/17] block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev

2016-04-27 Thread Eric Blake

On 04/27/2016 03:52 AM, Kevin Wolf wrote:
> It used to be an internal helper function just for implementing
> bdrv_co_do_readv/writev(), but now that it's a public interface, it
> deserves a name without "do" in it.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/block-backend.c |  4 ++--
>  block/io.c| 20 ++--
>  block/raw_bsd.c   |  4 ++--
>  hw/ide/macio.c|  4 ++--
>  include/block/block_int.h |  4 ++--
>  5 files changed, 18 insertions(+), 18 deletions(-)
> 

> @@ -1127,7 +1127,7 @@ static int coroutine_fn 
> bdrv_co_do_readv(BlockDriverState *bs,
>  return -EINVAL;
>  }
>  
> -return bdrv_co_do_preadv(bs, sector_num << BDRV_SECTOR_BITS,
> +return bdrv_co_preadv(bs, sector_num << BDRV_SECTOR_BITS,
>   nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
>  }

Missed alignment.


> @@ -1523,7 +1523,7 @@ static int coroutine_fn 
> bdrv_co_do_writev(BlockDriverState *bs,
>  return -EINVAL;
>  }
>  
> -return bdrv_co_do_pwritev(bs, sector_num << BDRV_SECTOR_BITS,
> +return bdrv_co_pwritev(bs, sector_num << BDRV_SECTOR_BITS,
>nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
>  }

and again

> +++ b/hw/ide/macio.c
> @@ -55,8 +55,8 @@ static const int debug_macio = 0;
>  /*
>   * Unaligned DMA read/write access functions required for OS X/Darwin which
>   * don't perform DMA transactions on sector boundaries. These functions are
> - * modelled on bdrv_co_do_preadv()/bdrv_co_do_pwritev() and so should be
> - * easy to remove if the unaligned block APIs are ever exposed.
> + * modelled on bdrv_co_preadv()/bdrv_co_pwritev() and so should be easy to
> + * remove if the unaligned block APIs are ever exposed.
>   */

Is this comment now stale as a result of your series?

> +++ b/include/block/block_int.h
> @@ -517,10 +517,10 @@ extern BlockDriver bdrv_qcow2;
>   */
>  void bdrv_setup_io_funcs(BlockDriver *bdrv);
>  
> -int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
> +int coroutine_fn bdrv_co_preadv(BlockDriverState *bs,
>  int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
>  BdrvRequestFlags flags);
> -int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
> +int coroutine_fn bdrv_co_pwritev(BlockDriverState *bs,
>  int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
>  BdrvRequestFlags flags);

Should alignment be attempted here, while touching it?

My comments are minor, so whether or not you make those changes:
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V2 RFC] fixup! virtio: convert to use DMA api

2016-04-27 Thread Andy Lutomirski

On Wed, Apr 27, 2016 at 7:23 AM, Joerg Roedel  wrote:
> On Wed, Apr 27, 2016 at 04:37:04PM +0300, Michael S. Tsirkin wrote:
>> One correction: it's a feature of the device in the system.
>> There could be a mix of devices bypassing and not
>> bypassing the IOMMU.
>
> No, it really is not. A device can't chose to bypass the IOMMU. But the
> IOMMU can chose to let the device bypass. So any fix here belongs
> into the platform/iommu code too and not into some driver.
>
>> Sounds good. And a way to detect appropriate devices could
>> be by looking at the feature flag, perhaps?
>
> Again, no! The way to detect that is to look into the iommu description
> structures provided by the firmware. They provide everything necessary
> to tell the iommu code which devices are not translated.
>

Except on PPC and SPARC.  As far as I know, those are the only
problematic platforms.

Is it too late to *disable* QEMU's q35-iommu thingy until it can be
fixed to report correct data in the DMAR tables?

--Andy

Re: [Qemu-devel] [PATCH 06/17] bochs: Implement .bdrv_co_preadv() interface

2016-04-27 Thread Kevin Wolf

Am 27.04.2016 um 16:06 hat Stefan Hajnoczi geschrieben:
> On Wed, Apr 27, 2016 at 11:52:36AM +0200, Kevin Wolf wrote:
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block/bochs.c | 46 +-
> >  1 file changed, 29 insertions(+), 17 deletions(-)
> > 
> > diff --git a/block/bochs.c b/block/bochs.c
> > index af8b7ab..d148454 100644
> > --- a/block/bochs.c
> > +++ b/block/bochs.c
> > @@ -104,6 +104,7 @@ static int bochs_open(BlockDriverState *bs, QDict 
> > *options, int flags,
> >  int ret;
> >  
> >  bs->read_only = 1; // no write support yet
> > +bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O 
> > supported */
> 
> Can this be the default in block.c?  Drivers that have other alignment
> characteristics can set it explicitly but most drivers will want
> BDRV_SECTOR_SIZE so it's nice to make it common code.

I absolutely don't expect BDRV_SECTOR_SIZE to be what most drivers will
want. It's basically a sign that a driver is either a protocol that
imposes restrictions (like O_DIRECT on raw-posix) or it is broken and we
don't care enough about the emulation overhead to fix it (like this one).

The expected value for most block drivers is 1.

> >  while (nb_sectors > 0) {
> >  int64_t block_offset = seek_to_sector(bs, sector_num);
> >  if (block_offset < 0) {
> >  return block_offset;
> 
> s->lock must be unlocked.

I can't believe I messed this up in so many places. Thanks for catching
this, I'll have to go through all patches and check the locking.

Kevin

pgpMw2pR8EHXq.pgp
Description: PGP signature

1 2 3 >

1 - 100 of 256 matches

Mail list logo