Re: [Qemu-devel] qemu-system-ppc hangs

2017-11-20 Thread Mark Cave-Ayland
On 21/11/17 00:00, Richard Purdie wrote:

> Hi,
> 
> I work on the Yocto Project and we use qemu to test boot our Linux
> images and run tests against them. We've been noticing some instability
> for ppc where the images sometimes hang, usually around udevd bring up
> time so just after booting into userspace.
> 
> To cut a long story short, I've tracked down what I think is the
> problem. I believe the decrementer timer stops receiving interrupts so
> tasks in our images hang indefinitely as the timer stopped. 
> 
> It can be summed up with this line of debug:
> 
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100req 0004
> 
> It should normally read:
> 
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100req 0002
> 
> The question is why CPU_INTERRUPT_EXITTB ends up being set when the
> lines above this log message clearly sets CPU_INTERRUPT_HARD (via 
> cpu_interrupt() ).
> 
> I note in cpu.h:
> 
> /* updates protected by BQL */
> uint32_t interrupt_request;
> 
> (for struct CPUState)
> 
> The ppc code does "cs->interrupt_request |= CPU_INTERRUPT_EXITTB" in 5
> places, 3 in excp_helper.c and 2 in helper_regs.h. In all cases,  
> g_assert(qemu_mutex_iothread_locked()); fails. If I do something like:
> 
> if (!qemu_mutex_iothread_locked()) {
> qemu_mutex_lock_iothread();
> cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> qemu_mutex_unlock_iothread();
> } else {
> cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> }
> 
> in these call sites then I can no longer lock qemu up with my test
> case.
> 
> I suspect the _HARD setting gets overwritten which stops the 
> decrementer interrupts being delivered.
> 
> I don't know if taking this lock in these situations is going to be bad
> for performance and whether such a patch would be right/wrong.
> 
> At this point I therefore wanted to seek advice on what the real issue
> is here and how to fix it!

Thanks for the report - given that a lot of work has been done on MTTCG
and iothread over the past few releases, it wouldn't be a complete
surprise if something had crept in here.

Firstly let's start off with some basics: what is your host
architecture, QEMU version and full command line being used to launch QEMU?

Would it also be possible for you to make your test image available for
other people to see if they can recreate the same issue?


ATB,

Mark.



Re: [Qemu-devel] [PATCH] vl.c && linux-user/main.c : removed **envp from main() arguments

2017-11-20 Thread Laurent Vivier
Le 21/11/2017 à 07:41, Yoni Bettan a écrit :
> * it was added on 2008 902b3d5c392bb6f48ef340ad8ecc3311705d2800
>   when introduced cache-utils.[ch]
> * since then cache-utils.[ch] were removed but **envp was left
>   behind
> 
> Signed-off-by: Yoni Bettan 
> ---
>  linux-user/main.c | 2 +-
>  vl.c  | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/linux-user/main.c b/linux-user/main.c
> index aa02f25b85..ca5628c1ca 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -4233,7 +4233,7 @@ static int parse_args(int argc, char **argv)
>  return optind;
>  }
>  
> -int main(int argc, char **argv, char **envp)
> +int main(int argc, char **argv)
>  {
>  struct target_pt_regs regs1, *regs = 
>  struct image_info info1, *info = 
> diff --git a/vl.c b/vl.c
> index 1ad1c04637..9667756ccc 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -35,10 +35,10 @@
>  #ifdef CONFIG_SDL
>  #if defined(__APPLE__) || defined(main)
>  #include 
> -int qemu_main(int argc, char **argv, char **envp);
> +int qemu_main(int argc, char **argv);
>  int main(int argc, char **argv)
>  {
> -return qemu_main(argc, argv, NULL);
> +return qemu_main(argc, argv);
>  }
>  #undef main
>  #define main qemu_main

I think this part can be removed now. As it seems it has been added
because of the envp parameter.

Thanks,
Laurent



Re: [Qemu-devel] [Qemu-ppc] [PATCH] spapr: Implement bug in spapr-vty device to be compatible with PowerVM

2017-11-20 Thread Thomas Huth
On 20.11.2017 08:14, David Gibson wrote:
> The spapr-vty device implements the PAPR defined virtual console,
> which is also implemented by IBM's proprietary PowerVM hypervisor.
> 
> PowerVM's implementation has a bug where it inserts an extra \0 after
> every \r going to the guest.  Because of that Linux's guest side
> driver has a workaround which strips \0 characters that appear
> immediately after a \r.

Ouch.

I wonder whether it would make more sense to change the guest side
driver to apply the workaround only if it's really running under
PowerVM...? E.g. what if the PowerVM bug ever gets fixed one day?

> That means that when running under qemu, sending a binary stream from
> host to guest via spapr-vty which happens to include a \r\0 sequence
> will get corrupted by that workaround.
> 
> To deal with that, this patch duplicates PowerVM's bug, inserting an
> extra \0 after each \r.  Ugly, but the best option available.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/char/spapr_vty.c | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c
> index 0fa416ca6b..a95e5e91a7 100644
> --- a/hw/char/spapr_vty.c
> +++ b/hw/char/spapr_vty.c
> @@ -58,6 +58,24 @@ static int vty_getchars(VIOsPAPRDevice *sdev, uint8_t 
> *buf, int max)
>  
>  while ((n < max) && (dev->out != dev->in)) {
>  buf[n++] = dev->buf[dev->out++ % VTERM_BUFSIZE];
> +
> +/* PowerVM's vty implementation has a bug where it inserts a
> + * \0 after every \r going to the guest.  Existing guests have
> + * a workaround for this which removes every \0 immediately
> + * following a \r, so here we make ourselves bug-for-bug
> + * compatible, so that the guest won't drop a real \0-after-\r
> + * that happens to occur in a binary stream. */
> +if (buf[n-1] == '\r') {
> +if (n < max) {
> +buf[n++] = '\0';
> +} else {
> +/* No room for the extra \0, roll back and try again
> + * next time */
> +dev->out--;
> +n--;
> +break;
> +}
> +}
>  }
>  
>  qemu_chr_fe_accept_input(>chardev);
> 

Code looks fine to me, so:

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types

2017-11-20 Thread Dong Jia Shi
* Cornelia Huck  [2017-11-14 11:50:14 +0100]:

Hallo Conny,

After spending some time, just some updates for this one.

> On Tue, 14 Nov 2017 16:25:47 +0800
> Dong Jia Shi  wrote:
> 
> > Dear Conny,
> > 
> > Good day!
> > 
> > Just now, our Libvirt folks pointed out a "usability mess" for the
> > design of differentiating address based on devices classes (real |
> > virtual). The complaints are mainly about the "s390-squash-mcss"
> > property and restrictions to define virtual device in the special 0xFE
> > css, and define real devices in non-0xFE.
> > 
> > We have some discussions internally, but failed to get it cleared. As we
> > think this is about the architecture, so hereby, I as a representative,
> > forward our arguments and questions here to ask you for help:
> > 
> > 1. What benifit do we get to put virtual devices in css 0xFE?
> 
> Some background here (for the benefit of innocent bystanders):
:)

> 
> In the past, I had been involved with some cases where Linux guests
> under z/VM died after a customer followed the recommended procedure to
> vary off a path before applying service. That path was supposed to be a
> path to a disk; unfortunately, z/VM had mapped all kinds of virtual
> paths to it, including the only path to the console device. Oops.
> 
> With that in mind, we wanted to make sure that qemu would not be
> susceptible to the same problem; IOW, we wanted to make sure that
> chpids etc. were not mashed together for devices that did not have
> anything to do with each other. At one point in time, the idea came up
> to use a reserved css for virtio devices, which was deemed an elegant
> solution as 'real' devices were still something far in the future. (And
> I was under the delusion that we would have MCSS-E support in Linux by
> then; that has not happened...)
> 
> So the basic idea of css 0xfe is: Maintain a clear separation between
> devices emulated by qemu and pass-through devices (a more divisive
> separation than by simply separating chpids).
> 
Thanks for the information. I think now everybody are clear about the
background.

[I sometime found it is a pleasure to listen to your story. Clear and
interesting.]

> > 
> > 2. Since we could accept squashing virtual devices into css 0, can we
> > accept to not trading 0xFE as a special css?
> 
> Using css 0xfe seemed like a good idea; but as things worked out
> differently in the meantime, it seems it causes more problems right now
> than it avoids.
> 
Have to agree. In particular after knowing the background.

> > So that we can remove the restrictions for the cssid validation for each
> > type of device. Even we could drop the s390-squash-mcss, and just allow
> > the user to define any device in any css.
> 
> Opening up the different csses for all devices might help, but we need
> to be careful:
> - We still want to keep the chpids separated. Probably not a problem
>   right now.
> - We need to be able to point to a default css, especially as there are
>   no MCSS-E capable OSs around yet.
> - You need to double check if there are further restrictions on the
>   allowed css ids. (I know that 0xff is reserved for special usage as
>   well; but I can't find out more.)
> - Backwards compatibility and migration: We certainly don't want old
>   setups to break, and compat machines need to force the old scheme.
> 
> All best tested out via a prototype :)
> 
After a round of internal discussion, Halil now has a prototype. I think
sooner he will post his patch with our internal agreement, and we can
continuing talking based on that then.

> > 
> > 3. If we have to keep the squash property, then when squashing, it's
> > somewhat like "I don't care for the cssid", so is it possible for us to
> > not check the cssid in the device devno?
> > Libvirt would be benifited with this when automatically generating the
> > addresses.
> 
> I think we still need to keep the squashing around for compatibility,
> but we may be able to give it the chop for something like 3.0.
> 
> (And we probably need to keep the existing restrictions in compat mode.)
> 
Can we just drop the squash property, right after we opened up all the
csses?
We do not support LGM for vfio-ccw, and there is no libvirt user until
now. So what else case could it be to stop us from dropping it?

> > 
> > 4. Error message for devno conflict is not helpful. For the following
> > case:
> >   -M s390-ccw-virtio,s390-squash-mcss=on \
> >   -drive 
> > file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw
> >  \
> >   -device 
> > virtio-blk-ccw,devno=fe.0.,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=0
> >  \
> >   -device 
> > vfio-ccw,devno=0.0.,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920
> >  \
> > 
> > We get this error message:
> > qemu-system-s390x: -device 
> > vfio-ccw,devno=0.0.,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920:
> >  

Re: [Qemu-devel] [PATCH] vl.c && linux-user/main.c : removed **envp from main() arguments

2017-11-20 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] vl.c && linux-user/main.c : removed **envp from 
main() arguments
Type: series
Message-id: 20171121064106.13721-1-ybet...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   patchew/20171121064106.13721-1-ybet...@redhat.com -> 
patchew/20171121064106.13721-1-ybet...@redhat.com
Switched to a new branch 'test'
7657b24e5a vl.c && linux-user/main.c : removed **envp from main() arguments

=== OUTPUT BEGIN ===
Checking PATCH 1/1: vl.c && linux-user/main.c : removed **envp from main() 
arguments...
ERROR: externs should be avoided in .c files
#37: FILE: vl.c:38:
+int qemu_main(int argc, char **argv);

total: 1 errors, 0 warnings, 28 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH] vl.c && linux-user/main.c : removed **envp from main() arguments

2017-11-20 Thread Yoni Bettan
* it was added on 2008 902b3d5c392bb6f48ef340ad8ecc3311705d2800
  when introduced cache-utils.[ch]
* since then cache-utils.[ch] were removed but **envp was left
  behind

Signed-off-by: Yoni Bettan 
---
 linux-user/main.c | 2 +-
 vl.c  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index aa02f25b85..ca5628c1ca 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -4233,7 +4233,7 @@ static int parse_args(int argc, char **argv)
 return optind;
 }
 
-int main(int argc, char **argv, char **envp)
+int main(int argc, char **argv)
 {
 struct target_pt_regs regs1, *regs = 
 struct image_info info1, *info = 
diff --git a/vl.c b/vl.c
index 1ad1c04637..9667756ccc 100644
--- a/vl.c
+++ b/vl.c
@@ -35,10 +35,10 @@
 #ifdef CONFIG_SDL
 #if defined(__APPLE__) || defined(main)
 #include 
-int qemu_main(int argc, char **argv, char **envp);
+int qemu_main(int argc, char **argv);
 int main(int argc, char **argv)
 {
-return qemu_main(argc, argv, NULL);
+return qemu_main(argc, argv);
 }
 #undef main
 #define main qemu_main
@@ -3088,7 +3088,7 @@ static void register_global_properties(MachineState *ms)
 user_register_global_props();
 }
 
-int main(int argc, char **argv, char **envp)
+int main(int argc, char **argv)
 {
 int i;
 int snapshot, linux_boot;
-- 
2.13.6




Re: [Qemu-devel] [PATCH for-2.11] Makefile: use $(MAKE) variable

2017-11-20 Thread Fam Zheng
On Tue, 11/21 00:21, Philippe Mathieu-Daudé wrote:
> @@ -904,7 +904,7 @@ $(patsubst %, check-%, $(check-qapi-schema-y)): 
> check-%.json: $(SRC_PATH)/%.json
>   echo $$? >$*.test.exit, \
>   "TEST","$*.out")
>   @diff -q $(SRC_PATH)/$*.out $*.test.out
> - @# Sanitize error messages (make them independent of build directory)
> + @# Sanitize error messages ($(MAKE) them independent of build directory)

This doesn't look like a good candidate, does it?

Fam

>   @perl -p -e 's|\Q$(SRC_PATH)\E/||g' $*.test.err | diff -q 
> $(SRC_PATH)/$*.err -
>   @diff -q $(SRC_PATH)/$*.exit $*.test.exit
>  
> -- 
> 2.15.0
> 
> 

Fam



Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()

2017-11-20 Thread Zhong Yang
On Mon, Nov 20, 2017 at 02:14:50PM +, Daniel P. Berrange wrote:
> On Mon, Nov 20, 2017 at 04:54:42PM +0800, Zhong Yang wrote:
> > On Fri, Nov 17, 2017 at 02:06:20PM +, Daniel P. Berrange wrote:
> > > On Fri, Nov 17, 2017 at 01:54:09PM +, Stefan Hajnoczi wrote:
> > > > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > > > diff --git a/util/rcu.c b/util/rcu.c
> > > > > index ca5a63e..8d491a6 100644
> > > > > --- a/util/rcu.c
> > > > > +++ b/util/rcu.c
> > > > > @@ -26,6 +26,7 @@
> > > > >   * IBM's contributions to this file may be relicensed under LGPLv2 
> > > > > or later.
> > > > >   */
> > > > >  
> > > > > +#include 
> > > > 
> > > > This header file is not mentioned in the C99 standard or POSIX.  It is
> > > > probably not available on all host OSes that QEMU supports.  Please use
> > > > #ifdef CONFIG_LINUX.
> > > > 
> > > > >  #include "qemu/osdep.h"
> > > > >  #include "qemu-common.h"
> > > > >  #include "qemu/rcu.h"
> > > > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > > > >  node->func(node);
> > > > >  }
> > > > >  qemu_mutex_unlock_iothread();
> > > > > +#ifdef CONFIG_LINUX
> > > > > +malloc_trim(0);
> > > > > +#endif
> > > > 
> > > > It is important that the rcu thread isn't overzealous in minimizing heap
> > > > size if that means ordinary malloc(3) calls will experience latency
> > > > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > > > slow path.
> > > 
> > > If you pass '0' the docs say that the minimum amount is left in the
> > > heap, per M_TOP_PAD, which is 128kb.
> > > 
> > > Strangely the mallopt(3) man page suggests, that free() should 
> > > automatically
> > > trim the heap when its size exceeds M_TOP_TRIM, which is again 128kb by
> > > default.  So I'm puzzelled by malloc_trim() would be needed unless there
> > > are scenarios in which free() won't trim, that aren't mentioned in the
> > > manpage.
> > 
> >   In fact, i firstly adopted mallopt() solution to optimize the heap memory,
> >   but i found this function is NOT useful, which are difference with MAN's
> >   description, so i had to swith to use malloc_trim().
> 
> 
> > > Also, how does malloc_trim interact with tcmalloc.so that people often
> > > use in preference to glibc's built in malloc ?
> >   Thanks, you reminded me to consider tcmalloc or jemalloc.
> >  
> >   Whether below code is more suitable? thanks!
> >   #if defined(CONFIG_LINUX) && defined(__GLIBC__)
> >  malloc_trim(0);
> >   #endif
> 
> Both of those macro symbols will still be defined even when tcmalloc/jemalloc
> are in use.
  
  Hello Daniel,

  If those two macro are NOT useful for this scenario, how about below changes?

  In configure file, 

  if test "$tcmalloc" = "yes" ; then
  echo "CONFIG__TCMALLOC=y" >> $config_host_mak
   

  then use !CONFIG_TCMALLOC macro to check malloc_trim()? thanks!

  Regards,

  Yang

 
> I wonder if tcmalloc/jemalloc even suffer from the same problem that libc's
> builtin malloc has ?
 
  Hello Xulei,

  Did you compiled Qemu with tcmalloc or jemalloc ? If you did , would you 
  please share those datas with different malloc compile option?  Many thanks!

  Regards,

  Yang


> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH] 9pfs: don't ignore O_DIRECT flag in the 9pfs server

2017-11-20 Thread jiangyiwen
On 2017/11/20 18:13, Greg Kurz wrote:
> On Mon, 20 Nov 2017 13:48:59 +0800
> jiangyiwen  wrote:
> 
>> Now v9fs in linux has already supported O_DIRECT(v9fs_direct_IO),
>> when guest user open file with O_DIRECT flag and return success,
>> so user hopes data doesn't pass through page cache, but 9pfs in
>> qemu ignore direct disk access and use host page cache, it is not
>> match to DIRECT_IO semantic, so we should not ignore O_DIRECT in
>> 9pfs unless v9fs in linux don't support DIRECT_IO.
>>
>> And if server fs don't support O_DIRECT, user will receive -EINVAL
>> and know the filesystem don't support O_DIRECT.
>>
>> So in order to ensure semantic consistency, don't ignore direct
>> disk access in 9pfs.
>>
> 
> There are good reasons for us to ignore O_DIRECT. AFAIK, O_DIRECT requires
> the application to take care of the alignment of the buffers with either the
> logical block size of the filesystem (linux <= 2.4) or the logical block size
> of the underlying storage... I don't really see how you can achieve that since
> the linux 9p client only cares to break up requests into msize-chunks, and
> ignores alignment. ie, you're likely to not ensure semantic consistency on the
> host side anyway and linux will fallback to cached I/O.
> 
> Also, this change would silently break existing users of O_DIRECT, which isn't
> acceptable... it would require some fsdev property to enable this new 
> behavior.
> 

Thank you very much, I understand what your mean, v9fs in linux doesn't take 
care
of the alignment, only split as the msize. Then there is another problem,
should v9fs support DIRECT_IO? or delete DIRECT_IO ops in v9fs?

Thanks,
Yiwen.
>> Signed-off-by: Yiwen Jiang 
>> ---
>>  hw/9pfs/9p.c | 4 
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
>> index 52d4663..5ea01c4 100644
>> --- a/hw/9pfs/9p.c
>> +++ b/hw/9pfs/9p.c
>> @@ -155,10 +155,6 @@ static int get_dotl_openflags(V9fsState *s, int oflags)
>>   */
>>  flags = dotl_to_open_flags(oflags);
>>  flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT);
>> -/*
>> - * Ignore direct disk access hint until the server supports it.
>> - */
>> -flags &= ~O_DIRECT;
>>  return flags;
>>  }
>>
> 
> 
> .
> 





[Qemu-devel] [PATCH for-2.11] Makefile: add more targets to the UNCHECKED_GOALS rule

2017-11-20 Thread Philippe Mathieu-Daudé
These targets don't need a full build of git submodules.
(See b8e535ae8af and eaa2ddbb767).

Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index adda3fb63e..5e8d79618c 100644
--- a/Makefile
+++ b/Makefile
@@ -6,7 +6,10 @@ BUILD_DIR=$(CURDIR)
 # Before including a proper config-host.mak, assume we are in the source tree
 SRC_PATH=.
 
-UNCHECKED_GOALS := %clean TAGS cscope ctags docker docker-% help
+UNCHECKED_GOALS := %clean TAGS cscope ctags dist \
+html info pdf txt \
+help check-help \
+docker docker-% vm-test vm-build-%
 
 # All following code might depend on configuration variables
 ifneq ($(wildcard config-host.mak),)
-- 
2.15.0




[Qemu-devel] [PATCH for-2.11] Makefile: use $(MAKE) variable

2017-11-20 Thread Philippe Mathieu-Daudé
For some systems (i.e. FreeBSD) the default 'make' is not compatible with the
GNU extensions used by QEMU makefiles.

Calling the GNU make (gmake) works, however the help displayed refers to the
host 'make' and copy/paste leads to lot of unobvious errors:

  $ gmake check-help
  [...]
   make checkRun all tests

  $ make check
  make: "Makefile" line 28: Missing dependency operator
  make: "Makefile" line 37: Need an operator
  make: "Makefile" line 41: warning: duplicate script for target 
"git-submodule-update" ignored
  make: "rules.mak" line 70: warning: duplicate script for target "%.o" ignored
  make: Unknown modifier ' '
  make: Unclosed substitution for eval modules (= missing)
  make: "tests/Makefile.include" line 24: Variable/Value missing from "export"
  make: "tests/" line 1: warning: Zero byte read from file, skipping rest of 
line.
  make: "tests/" line 1: Need an operator
  make: "Makefile" line 660: warning: duplicate script for target "ifneq" 
ignored
  make: "Makefile" line 78: warning: using previous script for "ifneq" defined 
here
  make: Fatal errors encountered -- cannot continue

Using the $(MAKE) variable, the help displayed is consistent with the 'make'
program used.

Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile   |  6 +++---
 tests/Makefile.include | 22 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 5392cfac4d..5e8d79618c 100644
--- a/Makefile
+++ b/Makefile
@@ -53,7 +53,7 @@ ifneq ($(realpath $(SRC_PATH)),$(realpath .))
 ifneq ($(wildcard $(SRC_PATH)/config-host.mak),)
 $(error This is an out of tree build but your source tree ($(SRC_PATH)) \
 seems to have been used for an in-tree build. You can fix this by running \
-"make distclean && rm -rf *-linux-user *-softmmu" in your source tree)
+"$(MAKE) distclean && rm -rf *-linux-user *-softmmu" in your source tree)
 endif
 endif
 
@@ -306,7 +306,7 @@ endif
else \
  echo "WARNING: $@ out of date.";\
fi; \
-   echo "Run \"make defconfig\" to regenerate."; \
+   echo "Run \"$(MAKE) defconfig\" to regenerate."; \
rm $@.tmp; \
  fi; \
 else \
@@ -940,4 +940,4 @@ ifdef QEMU_GA_MSI_ENABLED
 endif
@echo  ''
 endif
-   @echo  '  make V=0|1 [targets] 0 => quiet build (default), 1 => verbose 
build'
+   @echo  '  $(MAKE) V=0|1 [targets] 0 => quiet build (default), 1 => 
verbose build'
diff --git a/tests/Makefile.include b/tests/Makefile.include
index c002352134..060dcd6f36 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -3,21 +3,21 @@
 check-help:
@echo "Regression testing targets:"
@echo
-   @echo " make checkRun all tests"
-   @echo " make check-qtest-TARGET   Run qtest tests for given target"
-   @echo " make check-qtest  Run qtest tests"
-   @echo " make check-unit   Run qobject tests"
-   @echo " make check-speed  Run qobject speed tests"
-   @echo " make check-qapi-schemaRun QAPI schema tests"
-   @echo " make check-block  Run block tests"
-   @echo " make check-report.htmlGenerates an HTML test report"
-   @echo " make check-clean  Clean the tests"
+   @echo " $(MAKE) checkRun all tests"
+   @echo " $(MAKE) check-qtest-TARGET   Run qtest tests for given target"
+   @echo " $(MAKE) check-qtest  Run qtest tests"
+   @echo " $(MAKE) check-unit   Run qobject tests"
+   @echo " $(MAKE) check-speed  Run qobject speed tests"
+   @echo " $(MAKE) check-qapi-schemaRun QAPI schema tests"
+   @echo " $(MAKE) check-block  Run block tests"
+   @echo " $(MAKE) check-report.htmlGenerates an HTML test report"
+   @echo " $(MAKE) check-clean  Clean the tests"
@echo
@echo "Please note that HTML reports do not regenerate if the unit 
tests"
@echo "has not changed."
@echo
@echo "The variable SPEED can be set to control the gtester speed 
setting."
-   @echo "Default options are -k and (for make V=1) --verbose; they can be"
+   @echo "Default options are -k and (for $(MAKE) V=1) --verbose; they can 
be"
@echo "changed with variable GTESTER_OPTIONS."
 
 ifneq ($(wildcard config-host.mak),)
@@ -904,7 +904,7 @@ $(patsubst %, check-%, $(check-qapi-schema-y)): 
check-%.json: $(SRC_PATH)/%.json
echo $$? >$*.test.exit, \
"TEST","$*.out")
@diff -q $(SRC_PATH)/$*.out $*.test.out
-   @# Sanitize error messages (make them independent of build directory)
+   @# Sanitize error messages ($(MAKE) them independent of build directory)
@perl -p -e 's|\Q$(SRC_PATH)\E/||g' $*.test.err | diff -q 
$(SRC_PATH)/$*.err -
@diff -q $(SRC_PATH)/$*.exit $*.test.exit
 
-- 
2.15.0




Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()

2017-11-20 Thread Zhong Yang
On Mon, Nov 20, 2017 at 02:03:44PM +, Stefan Hajnoczi wrote:
> On Mon, Nov 20, 2017 at 04:41:41PM +0800, Zhong Yang wrote:
> > On Fri, Nov 17, 2017 at 01:54:09PM +, Stefan Hajnoczi wrote:
> > > On Fri, Nov 17, 2017 at 02:23:34PM +0800, Yang Zhong wrote:
> > > > @@ -272,6 +273,9 @@ static void *call_rcu_thread(void *opaque)
> > > >  node->func(node);
> > > >  }
> > > >  qemu_mutex_unlock_iothread();
> > > > +#ifdef CONFIG_LINUX
> > > > +malloc_trim(0);
> > > > +#endif
> > > 
> > > It is important that the rcu thread isn't overzealous in minimizing heap
> > > size if that means ordinary malloc(3) calls will experience latency
> > > spikes.  Please leave a few MB free so that malloc(3) doesn't take the
> > > slow path.
> > > 
> >   Hello Stefan,
> > 
> >   From the Qemu bootup procedure, the qemu malloc chunk memory from OS, not
> >   from glibc free list which is freed before by Qemu. Maybe there are some 
> >   issues in glibc memory mechanism. I will continue to fine this parameter
> >   to get better balance, Many thanks!
> 
> I was suggesting malloc_trim(4 * 1024 * 1024) or similar instead of
> malloc_trim(0).
>  
  Hello Stefan,

  Thanks for your suggestion!
  I tried your suggested changes, malloc_trim(4 * 1024 * 1024), the heap size 
with
  this change almost same with malloc_trim(0), thanks!

  Regards,

  Yang   
 
> Stefan





Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-20 Thread Michael S. Tsirkin
On Tue, Nov 21, 2017 at 01:16:12AM +0100, Paolo Bonzini wrote:
> On 20/11/2017 21:44, Michael S. Tsirkin wrote:
> > Live migrations is supposed to be migrating guest writeable state too.
> > If you mean migrating RO fields like size, then
> > I don't think it's a good idea to reuse SET_CONFIG for that.
> > SET_CONFIG should obey exactly the virtio semantics.
> > 
> > And I agree, it should say that slave must treat it as a write,
> > and get config as a read according to virtio semantics.
> > 
> > If someone needs to pass configuration from qemu to
> > slave, let's add specific messages with precisely defined semantics.
> 
> Fair enough, but I'd add nevertheless a 32-bit flags field to both
> GET_CONFIG and SET_CONFIG, and document that the slave MUST check that
> it is zero and otherwise fail.
> 
> Paolo

We generally just use protocol feature bits for extensions.
But I have no problem with reserved padding if desired
which seems to be what's described here.

-- 
MST



Re: [Qemu-devel] [PATCH] rcu: reduce half heap memory size by malloc_trim()

2017-11-20 Thread Zhong Yang
On Mon, Nov 20, 2017 at 10:28:28PM +0800, Fam Zheng wrote:
> On Fri, 11/17 14:23, Yang Zhong wrote:
> > diff --git a/util/rcu.c b/util/rcu.c
> > index ca5a63e..8d491a6 100644
> > --- a/util/rcu.c
> > +++ b/util/rcu.c
> > @@ -26,6 +26,7 @@
> >   * IBM's contributions to this file may be relicensed under LGPLv2 or 
> > later.
> >   */
> >  
> > +#include 
> 
> BTW, if you respin, the "qemu/osdep.h" header should always be included first 
> as
> per HACKING file.

  Hello Fam,

  Thanks for your reminder!
  I ever thought the header file include in Qemu like linux kernel did, thanks 
again!

  Regards,

  Yang



[Qemu-devel] [PATCH v2 for-2.11 3/4] qemu-iotests: add option in common.qemu for mismatch only

2017-11-20 Thread Jeff Cody
Add option to echo response to QMP / HMP command only on mismatch.

Useful for ignore all normal responses, but catching things like
segfaults.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/common.qemu | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 7b3052d..85f66b8 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -50,6 +50,8 @@ _in_fd=4
 #
 # If $silent is set to anything but an empty string, then
 # response is not echoed out.
+# If $mismatch_only is set, only non-matching responses will
+# be echoed.
 function _timed_wait_for()
 {
 local h=${1}
@@ -58,14 +60,18 @@ function _timed_wait_for()
 QEMU_STATUS[$h]=0
 while IFS= read -t ${QEMU_COMM_TIMEOUT} resp <&${QEMU_OUT[$h]}
 do
-if [ -z "${silent}" ]; then
+if [ -z "${silent}" ] && [ -z "${mismatch_only}" ]; then
 echo "${resp}" | _filter_testdir | _filter_qemu \
| _filter_qemu_io | _filter_qmp | _filter_hmp
 fi
 grep -q "${*}" < <(echo "${resp}")
 if [ $? -eq 0 ]; then
 return
+elif [ -z "${silent}" ] && [ -n "${mismatch_only}" ]; then
+echo "${resp}" | _filter_testdir | _filter_qemu \
+   | _filter_qemu_io | _filter_qmp | _filter_hmp
 fi
+
 done
 QEMU_STATUS[$h]=-1
 if [ -z "${qemu_error_no_exit}" ]; then
-- 
2.9.5




[Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition

2017-11-20 Thread Jeff Cody
Changes from v1 -> v2:

Patch 1: Updated docs in blockjob_int.h (Thanks Stefan)

Patch 2/3: Squashed, and used const char * to hold the __func__ name of
   the original scheduler (Thanks Paolo)

Patch 4: Unchanged.

Patch 5: Dropped qcow format for the test, it was so slow the test times
 out, and it doesn't add any new dimension to the test.


# git-backport-diff -r qemu/master.. -u github/bz1508708

001/4:[0003] [FC] 'blockjob: do not allow coroutine double entry or 
entry-after-completion'
002/4:[down] 'coroutine: abort if we try to schedule or enter a pending 
coroutine'
003/4:[] [--] 'qemu-iotests: add option in common.qemu for mismatch only'
004/4:[0002] [FC] 'qemu-iotest: add test for blockjob coroutine race condition'


This series fixes a race condition segfault when using iothreads with
blockjobs.

The qemu iotest in this series is a reproducer, as is the reproducer
script attached in this bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=1508708

There are two additional patches to try and catch this sort of scenario
with an abort, before a segfault or memory corruption occurs.


Jeff Cody (4):
  blockjob: do not allow coroutine double entry or
entry-after-completion
  coroutine: abort if we try to schedule or enter a pending coroutine
  qemu-iotests: add option in common.qemu for mismatch only
  qemu-iotest: add test for blockjob coroutine race condition

 blockjob.c |  9 ++--
 include/block/blockjob_int.h   |  3 +-
 include/qemu/coroutine_int.h   |  6 +++
 tests/qemu-iotests/200 | 99 ++
 tests/qemu-iotests/200.out | 14 ++
 tests/qemu-iotests/common.qemu |  8 +++-
 tests/qemu-iotests/group   |  1 +
 util/async.c   | 11 +
 util/qemu-coroutine-sleep.c| 11 +
 util/qemu-coroutine.c  | 11 +
 10 files changed, 168 insertions(+), 5 deletions(-)
 create mode 100755 tests/qemu-iotests/200
 create mode 100644 tests/qemu-iotests/200.out

-- 
2.9.5




[Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion

2017-11-20 Thread Jeff Cody
When block_job_sleep_ns() is called, the co-routine is scheduled for
future execution.  If we allow the job to be re-entered prior to the
scheduled time, we present a race condition in which a coroutine can be
entered recursively, or even entered after the coroutine is deleted.

The job->busy flag is used by blockjobs when a coroutine is busy
executing. The function 'block_job_enter()' obeys the busy flag,
and will not enter a coroutine if set.  If we sleep a job, we need to
leave the busy flag set, so that subsequent calls to block_job_enter()
are prevented.

This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708

Also, in block_job_start(), set the relevant job flags (.busy, .paused)
before creating the coroutine, not just before executing it.

Signed-off-by: Jeff Cody 
---
 blockjob.c   | 9 ++---
 include/block/blockjob_int.h | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 3a0c491..e181295 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
 {
 assert(job && !block_job_started(job) && job->paused &&
job->driver && job->driver->start);
-job->co = qemu_coroutine_create(block_job_co_entry, job);
 job->pause_count--;
 job->busy = true;
 job->paused = false;
+job->co = qemu_coroutine_create(block_job_co_entry, job);
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType 
type, int64_t ns)
 return;
 }
 
-job->busy = false;
+/* We need to leave job->busy set here, because when we have
+ * put a coroutine to 'sleep', we have scheduled it to run in
+ * the future.  We cannot enter that same coroutine again before
+ * it wakes and runs, otherwise we risk double-entry or entry after
+ * completion. */
 if (!block_job_should_pause(job)) {
 co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
 }
-job->busy = true;
 
 block_job_pause_point(job);
 }
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index f13ad05..43f3be2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -143,7 +143,8 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
  * @ns: How many nanoseconds to stop for.
  *
  * Put the job to sleep (assuming that it wasn't canceled) for @ns
- * nanoseconds.  Canceling the job will interrupt the wait immediately.
+ * nanoseconds.  Canceling the job will not interrupt the wait, so the
+ * cancel will not process until the coroutine wakes up.
  */
 void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
 
-- 
2.9.5




[Qemu-devel] [PATCH v2 for-2.11 4/4] qemu-iotest: add test for blockjob coroutine race condition

2017-11-20 Thread Jeff Cody
Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/200 | 99 ++
 tests/qemu-iotests/200.out | 14 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 114 insertions(+)
 create mode 100755 tests/qemu-iotests/200
 create mode 100644 tests/qemu-iotests/200.out

diff --git a/tests/qemu-iotests/200 b/tests/qemu-iotests/200
new file mode 100755
index 000..d8787dd
--- /dev/null
+++ b/tests/qemu-iotests/200
@@ -0,0 +1,99 @@
+#!/bin/bash
+#
+# Block job co-routine race condition test.
+#
+# See: https://bugzilla.redhat.com/show_bug.cgi?id=1508708
+#
+# Copyright (C) 2017 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=jc...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1# failure is the default!
+
+_cleanup()
+{
+_cleanup_qemu
+rm -f "${TEST_IMG}" "${BACKING_IMG}"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2 qed
+_supported_proto file
+_supported_os Linux
+
+BACKING_IMG="${TEST_DIR}/backing.img"
+TEST_IMG="${TEST_DIR}/test.img"
+
+${QEMU_IMG} create -f $IMGFMT "${BACKING_IMG}" 512M | _filter_img_create
+${QEMU_IMG} create -f $IMGFMT -F $IMGFMT "${TEST_IMG}" -b "${BACKING_IMG}" 
512M | _filter_img_create
+
+${QEMU_IO} -c "write -P 0xa5 512 300M" "${BACKING_IMG}" | _filter_qemu_io
+
+echo
+echo === Starting QEMU VM ===
+echo
+qemu_comm_method="qmp"
+_launch_qemu -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
+ -object iothread,id=iothread0 \
+ -device 
virtio-scsi-pci,bus=bridge1,addr=0x1f,id=scsi0,iothread=iothread0 \
+ -drive 
file="${TEST_IMG}",media=disk,if=none,cache=none,id=drive_sysdisk,aio=native,format=$IMGFMT
 \
+ -device 
scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0
+h1=$QEMU_HANDLE
+
+_send_qemu_cmd $h1 "{ 'execute': 'qmp_capabilities' }" 'return'
+
+echo
+echo === Sending stream/cancel, checking for SIGSEGV only  ===
+echo
+for (( i=1;i<500;i++ ))
+do
+mismatch_only='y' qemu_error_no_exit='n' _send_qemu_cmd $h1 \
+   "{
+'execute': 'block-stream',
+'arguments': {
+'device': 'drive_sysdisk',
+'speed': 1000,
+'on-error': 'report',
+'job-id': 'job-$i'
+ }
+}
+{
+'execute': 'block-job-cancel',
+'arguments': {
+'device': 'job-$i'
+ }
+}" \
+   "{.*{.*}.*}"  # should match all well-formed QMP 
responses
+done
+
+silent='y' _send_qemu_cmd $h1  "{ 'execute': 'quit' }" 'return'
+
+echo "$i iterations performed"
+
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/200.out b/tests/qemu-iotests/200.out
new file mode 100644
index 000..af6a809
--- /dev/null
+++ b/tests/qemu-iotests/200.out
@@ -0,0 +1,14 @@
+QA output created by 200
+Formatting 'TEST_DIR/backing.img', fmt=IMGFMT size=536870912
+Formatting 'TEST_DIR/test.img', fmt=IMGFMT size=536870912 
backing_file=TEST_DIR/backing.img backing_fmt=IMGFMT
+wrote 314572800/314572800 bytes at offset 512
+300 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Starting QEMU VM ===
+
+{"return": {}}
+
+=== Sending stream/cancel, checking for SIGSEGV only ===
+
+500 iterations performed
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 24e5ad1..25d9adf 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -194,3 +194,4 @@
 194 rw auto migration quick
 195 rw auto quick
 197 rw auto quick
+200 rw auto
-- 
2.9.5




[Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine

2017-11-20 Thread Jeff Cody
The previous patch fixed a race condition, in which there were
coroutines being executing doubly, or after coroutine deletion.

We can detect common scenarios when this happens, and print an error
message and abort before we corrupt memory / data, or segfault.

This patch will abort if an attempt to enter a coroutine is made while
it is currently pending execution, either in a specific AioContext bh,
or pending execution via a timer.  It will also abort if a coroutine
is scheduled, before a prior scheduled run has occured.

We cannot rely on the existing co->caller check for recursive re-entry
to catch this, as the coroutine may run and exit with
COROUTINE_TERMINATE before the scheduled coroutine executes.

(This is the scenario that was occuring and fixed in the previous
patch).

Signed-off-by: Jeff Cody 
---
 include/qemu/coroutine_int.h |  6 ++
 util/async.c | 11 +++
 util/qemu-coroutine-sleep.c  | 11 +++
 util/qemu-coroutine.c| 11 +++
 4 files changed, 39 insertions(+)

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index cb98892..56e4c48 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -53,6 +53,12 @@ struct Coroutine {
 
 /* Only used when the coroutine has yielded.  */
 AioContext *ctx;
+
+/* Used to catch and abort on illegal co-routine entry.
+ * Will contain the name of the function that had first
+ * scheduled the coroutine. */
+const char *scheduled;
+
 QSIMPLEQ_ENTRY(Coroutine) co_queue_next;
 QSLIST_ENTRY(Coroutine) co_scheduled_next;
 };
diff --git a/util/async.c b/util/async.c
index 0e1bd87..49174b3 100644
--- a/util/async.c
+++ b/util/async.c
@@ -388,6 +388,7 @@ static void co_schedule_bh_cb(void *opaque)
 QSLIST_REMOVE_HEAD(, co_scheduled_next);
 trace_aio_co_schedule_bh_cb(ctx, co);
 aio_context_acquire(ctx);
+atomic_set(>scheduled, NULL);
 qemu_coroutine_enter(co);
 aio_context_release(ctx);
 }
@@ -438,6 +439,16 @@ fail:
 void aio_co_schedule(AioContext *ctx, Coroutine *co)
 {
 trace_aio_co_schedule(ctx, co);
+const char *scheduled = atomic_read(>scheduled);
+
+if (scheduled) {
+fprintf(stderr,
+"%s: Co-routine was already scheduled in '%s'\n",
+__func__, scheduled);
+abort();
+}
+atomic_set(>scheduled, __func__);
+
 QSLIST_INSERT_HEAD_ATOMIC(>scheduled_coroutines,
   co, co_scheduled_next);
 qemu_bh_schedule(ctx->co_schedule_bh);
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
index 9c56550..38dc4c8 100644
--- a/util/qemu-coroutine-sleep.c
+++ b/util/qemu-coroutine-sleep.c
@@ -13,6 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/coroutine.h"
+#include "qemu/coroutine_int.h"
 #include "qemu/timer.h"
 #include "block/aio.h"
 
@@ -25,6 +26,7 @@ static void co_sleep_cb(void *opaque)
 {
 CoSleepCB *sleep_cb = opaque;
 
+atomic_set(_cb->co->scheduled, NULL);
 aio_co_wake(sleep_cb->co);
 }
 
@@ -34,6 +36,15 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
QEMUClockType type,
 CoSleepCB sleep_cb = {
 .co = qemu_coroutine_self(),
 };
+const char *scheduled = atomic_read(_cb.co->scheduled);
+
+if (scheduled) {
+fprintf(stderr,
+"%s: Co-routine was already scheduled in '%s'\n",
+__func__, scheduled);
+abort();
+}
+atomic_set(_cb.co->scheduled, __func__);
 sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, _cb);
 timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
 qemu_coroutine_yield();
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index d6095c1..fbfd0ad 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -106,9 +106,20 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine 
*co)
 {
 Coroutine *self = qemu_coroutine_self();
 CoroutineAction ret;
+const char *scheduled = atomic_read(>scheduled);
 
 trace_qemu_aio_coroutine_enter(ctx, self, co, co->entry_arg);
 
+/* if the Coroutine has already been scheduled, entering it again will
+ * cause us to enter it twice, potentially even after the coroutine has
+ * been deleted */
+if (scheduled) {
+fprintf(stderr,
+"%s: Co-routine was already scheduled in '%s'\n",
+__func__, scheduled);
+abort();
+}
+
 if (co->caller) {
 fprintf(stderr, "Co-routine re-entered recursively\n");
 abort();
-- 
2.9.5




Re: [Qemu-devel] [PULL 00/15] late linux-user fixes for 2.11

2017-11-20 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PULL 00/15] late linux-user fixes for 2.11
Type: series
Message-id: cover.1511212753.git.riku.voi...@linaro.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]
patchew/1511201308-23580-1-git-send-email-peter.mayd...@linaro.org -> 
patchew/1511201308-23580-1-git-send-email-peter.mayd...@linaro.org
 t [tag update]patchew/20171117190422.23626-1-ebl...@redhat.com -> 
patchew/20171117190422.23626-1-ebl...@redhat.com
 * [new tag]   patchew/cover.1511212753.git.riku.voi...@linaro.org 
-> patchew/cover.1511212753.git.riku.voi...@linaro.org
Switched to a new branch 'test'
788f1bba45 linux-user: Fix calculation of auxv length
052a0df24b linux-user: Handle rt_sigaction correctly for SPARC
f604e7242e linux-user/sparc: Put address for data faults where linux-user 
expects it
72147cb49c linux-user/ppc: Report correct fault address for data faults
e463bfe01b linux-user/s390x: Mask si_addr for SIGSEGV
453566e577 linux-user: return EINVAL from prctl(PR_*_SECCOMP)
fa4841c61b linux-user: fix 'finshed' typo in comment
8c79f00a62 linux-user/syscall.c: Handle SH4's exceptional alignment for p{read, 
write}64
00aa6fca9b linux-user: Handle TARGET_MAP_STACK and TARGET_MAP_HUGETLB
0404230a13 linux-user/hppa: Fix TARGET_F_RDLCK, TARGET_F_WRLCK, TARGET_F_UNLCK
843644155e linux-user/hppa: Fix TARGET_MAP_TYPE
928786ff34 linux-user/hppa: Fix typo for TARGET_NR_epoll_wait
244abacff6 linux-user/hppa: Fix cpu_clone_regs
63e9a64d06 linux-user/hppa: Fix TARGET_SA_* defines
46e4119671 linux-user: Restrict usage of sa_restorer

=== OUTPUT BEGIN ===
Checking PATCH 1/15: linux-user: Restrict usage of sa_restorer...
Checking PATCH 2/15: linux-user/hppa: Fix TARGET_SA_* defines...
Checking PATCH 3/15: linux-user/hppa: Fix cpu_clone_regs...
Checking PATCH 4/15: linux-user/hppa: Fix typo for TARGET_NR_epoll_wait...
Checking PATCH 5/15: linux-user/hppa: Fix TARGET_MAP_TYPE...
ERROR: code indent should never use tabs
#25: FILE: linux-user/syscall_defs.h:1340:
+#define TARGET_MAP_TYPE 0x03^I^I/* Mask for type of mapping */$

ERROR: code indent should never use tabs
#27: FILE: linux-user/syscall_defs.h:1342:
+#define TARGET_MAP_TYPE 0x0f^I^I/* Mask for type of mapping */$

total: 2 errors, 0 warnings, 12 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 6/15: linux-user/hppa: Fix TARGET_F_RDLCK, TARGET_F_WRLCK, 
TARGET_F_UNLCK...
Checking PATCH 7/15: linux-user: Handle TARGET_MAP_STACK and 
TARGET_MAP_HUGETLB...
Checking PATCH 8/15: linux-user/syscall.c: Handle SH4's exceptional alignment 
for p{read, write}64...
Checking PATCH 9/15: linux-user: fix 'finshed' typo in comment...
Checking PATCH 10/15: linux-user: return EINVAL from prctl(PR_*_SECCOMP)...
Checking PATCH 11/15: linux-user/s390x: Mask si_addr for SIGSEGV...
Checking PATCH 12/15: linux-user/ppc: Report correct fault address for data 
faults...
Checking PATCH 13/15: linux-user/sparc: Put address for data faults where 
linux-user expects it...
Checking PATCH 14/15: linux-user: Handle rt_sigaction correctly for SPARC...
Checking PATCH 15/15: linux-user: Fix calculation of auxv length...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH for-2.11 0/3] block: Error out on load_vm with active dirty bitmaps

2017-11-20 Thread John Snow


On 11/20/2017 09:50 AM, Kevin Wolf wrote:
> Following the discussing at
> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg03572.html
> this implements the error return for loading a snapshot while dirty
> bitmaps are active.
> 
> Kevin Wolf (3):
>   block: Add errp to bdrv_snapshot_goto()
>   block: Add errp to bdrv_all_goto_snapshot()
>   block: Error out on load_vm with active dirty bitmaps
> 
>  include/block/snapshot.h |  6 --
>  block/snapshot.c | 43 +++
>  migration/savevm.c   |  6 +++---
>  qemu-img.c   |  6 +++---
>  4 files changed, 33 insertions(+), 28 deletions(-)
> 

You have enough reviews, but just so it's unambiguous, I'm fine with
this approach in lieu of Vladimir's; so:

Reviewed-by: John Snow 



Re: [Qemu-devel] [PATCH for-2.12 REPOST] spapr_cpu_core: instantiate CPUs separately

2017-11-20 Thread David Gibson
On Mon, Nov 20, 2017 at 03:56:48PM +0100, Igor Mammedov wrote:
> On Mon, 20 Nov 2017 10:19:54 +0100
> Greg Kurz  wrote:
> 
> > The current code assumes that only the CPU core object holds a
> > reference on each individual CPU object, and happily frees their
> > allocated memory when the core is unrealized. This is dangerous
> > as some other code can legitimely keep a pointer to a CPU if it
> > calls object_ref(), but it would end up with a dangling pointer.
> > 
> > Let's allocate all CPUs with object_new() and let QOM frees them
> s/frees/free/
> 
> > when their reference count reaches zero. This greatly simplify the
> > code as we don't have to fiddle with the instance size anymore.
> > 
> > Signed-off-by: Greg Kurz 
> Acked-by: Igor Mammedov 

Applied to ppc-for-2.12 with the typo above fixed.


> 
> > ---
> >  hw/ppc/spapr.c  |   11 +++
> >  hw/ppc/spapr_cpu_core.c |   19 +++
> >  include/hw/ppc/spapr_cpu_core.h |2 +-
> >  3 files changed, 11 insertions(+), 21 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 174e7ff0678d..fc92b9d914a5 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -3173,12 +3173,10 @@ void spapr_core_release(DeviceState *dev)
> >  
> >  if (smc->pre_2_10_has_unused_icps) {
> >  sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> > -sPAPRCPUCoreClass *scc = SPAPR_CPU_CORE_GET_CLASS(OBJECT(cc));
> > -size_t size = object_type_get_instance_size(scc->cpu_type);
> >  int i;
> >  
> >  for (i = 0; i < cc->nr_threads; i++) {
> > -CPUState *cs = CPU(sc->threads + i * size);
> > +CPUState *cs = CPU(sc->threads[i]);
> >  
> >  pre_2_10_vmstate_register_dummy_icp(cs->cpu_index);
> >  }
> > @@ -3224,7 +3222,7 @@ static void spapr_core_plug(HotplugHandler 
> > *hotplug_dev, DeviceState *dev,
> >  sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >  sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
> >  CPUCore *cc = CPU_CORE(dev);
> > -CPUState *cs = CPU(core->threads);
> > +CPUState *cs = CPU(core->threads[0]);
> >  sPAPRDRConnector *drc;
> >  Error *local_err = NULL;
> >  int smt = kvmppc_smt_threads();
> > @@ -3269,15 +3267,12 @@ static void spapr_core_plug(HotplugHandler 
> > *hotplug_dev, DeviceState *dev,
> >  core_slot->cpu = OBJECT(dev);
> >  
> >  if (smc->pre_2_10_has_unused_icps) {
> > -sPAPRCPUCoreClass *scc = SPAPR_CPU_CORE_GET_CLASS(OBJECT(cc));
> > -size_t size = object_type_get_instance_size(scc->cpu_type);
> >  int i;
> >  
> >  for (i = 0; i < cc->nr_threads; i++) {
> >  sPAPRCPUCore *sc = SPAPR_CPU_CORE(dev);
> > -void *obj = sc->threads + i * size;
> >  
> > -cs = CPU(obj);
> > +cs = CPU(sc->threads[i]);
> >  pre_2_10_vmstate_unregister_dummy_icp(cs->cpu_index);
> >  }
> >  }
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > index 3a4c17401226..588f9b45714a 100644
> > --- a/hw/ppc/spapr_cpu_core.c
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -79,13 +79,11 @@ const char *spapr_get_cpu_core_type(const char 
> > *cpu_type)
> >  static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
> >  {
> >  sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> > -sPAPRCPUCoreClass *scc = SPAPR_CPU_CORE_GET_CLASS(OBJECT(dev));
> > -size_t size = object_type_get_instance_size(scc->cpu_type);
> >  CPUCore *cc = CPU_CORE(dev);
> >  int i;
> >  
> >  for (i = 0; i < cc->nr_threads; i++) {
> > -void *obj = sc->threads + i * size;
> > +Object *obj = OBJECT(sc->threads[i]);
> >  DeviceState *dev = DEVICE(obj);
> >  CPUState *cs = CPU(dev);
> >  PowerPCCPU *cpu = POWERPC_CPU(cs);
> > @@ -146,9 +144,8 @@ static void spapr_cpu_core_realize(DeviceState *dev, 
> > Error **errp)
> >  sPAPRCPUCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> >  sPAPRCPUCoreClass *scc = SPAPR_CPU_CORE_GET_CLASS(OBJECT(dev));
> >  CPUCore *cc = CPU_CORE(OBJECT(dev));
> > -size_t size;
> >  Error *local_err = NULL;
> > -void *obj;
> > +Object *obj;
> >  int i, j;
> >  
> >  if (!spapr) {
> > @@ -156,18 +153,16 @@ static void spapr_cpu_core_realize(DeviceState *dev, 
> > Error **errp)
> >  return;
> >  }
> >  
> > -size = object_type_get_instance_size(scc->cpu_type);
> > -sc->threads = g_malloc0(size * cc->nr_threads);
> > +sc->threads = g_new(PowerPCCPU *, cc->nr_threads);
> >  for (i = 0; i < cc->nr_threads; i++) {
> >  char id[32];
> >  CPUState *cs;
> >  PowerPCCPU *cpu;
> >  
> > -obj = sc->threads + i * size;
> > +obj = object_new(scc->cpu_type);
> >  
> > -object_initialize(obj, size, scc->cpu_type);
> >  cs = CPU(obj);
> > -cpu = 

Re: [Qemu-devel] [PATCH for-2.11] hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256

2017-11-20 Thread David Gibson
On Mon, Nov 20, 2017 at 08:44:38AM +0100, Thomas Huth wrote:
> LUNs >= 256 have to be encoded with the so-called "flat space
> addressing method" for virtio-scsi, where an additional bit has to
> be set. SLOF already took care of this with the following commit:
> 
>  https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da
>  (see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details)
> 
> But QEMU does not use this encoding yet for device tree paths
> that have to be handed over to SLOF to deal with the "bootindex"
> property, so SLOF currently fails to boot from virtio-scsi devices
> with LUNs >= 256 in the right boot order. Fix it by using the bit
> to indicate the "flat space addressing method" for LUNs >= 256.
> 
> Signed-off-by: Thomas Huth 

Applied to ppc-for-2.11, thanks.

> ---
>  hw/ppc/spapr.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d682f01..ff2eec4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2643,6 +2643,10 @@ static char *spapr_get_fw_dev_path(FWPathProvider *p, 
> BusState *bus,
>   * swap 0100 or 10 << or 20 << ( target lun-id -- srplun )
>   */
>  unsigned id = 0x100 | (d->id << 16) | d->lun;
> +if (d->lun >= 256) {
> +/* Use the LUN "flat space addressing method" */
> +id |= 0x4000;
> +}
>  return g_strdup_printf("%s@%"PRIX64, qdev_fw_name(dev),
> (uint64_t)id << 32);
>  } else if (usb) {

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] spapr: Implement bug in spapr-vty device to be compatible with PowerVM

2017-11-20 Thread David Gibson
On Sun, Nov 19, 2017 at 11:18:55PM -0800, no-re...@patchew.org wrote:
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Subject: [Qemu-devel] [PATCH] spapr: Implement bug in spapr-vty device to be 
> compatible with PowerVM
> Type: series
> Message-id: 20171120071423.11693-1-da...@gibson.dropbear.id.au
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>  * [new tag]   
> patchew/20171120071423.11693-1-da...@gibson.dropbear.id.au -> 
> patchew/20171120071423.11693-1-da...@gibson.dropbear.id.au
> Switched to a new branch 'test'
> 08e9d8fe82 spapr: Implement bug in spapr-vty device to be compatible with 
> PowerVM
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/1: spapr: Implement bug in spapr-vty device to be compatible 
> with PowerVM...
> ERROR: spaces required around that '-' (ctx:VxV)
> #40: FILE: hw/char/spapr_vty.c:68:
> +if (buf[n-1] == '\r') {

I've corrected this in my tree.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 21:44, Michael S. Tsirkin wrote:
> Live migrations is supposed to be migrating guest writeable state too.
> If you mean migrating RO fields like size, then
> I don't think it's a good idea to reuse SET_CONFIG for that.
> SET_CONFIG should obey exactly the virtio semantics.
> 
> And I agree, it should say that slave must treat it as a write,
> and get config as a read according to virtio semantics.
> 
> If someone needs to pass configuration from qemu to
> slave, let's add specific messages with precisely defined semantics.

Fair enough, but I'd add nevertheless a 32-bit flags field to both
GET_CONFIG and SET_CONFIG, and document that the slave MUST check that
it is zero and otherwise fail.

Paolo



[Qemu-devel] qemu-system-ppc hangs

2017-11-20 Thread Richard Purdie
Hi,

I work on the Yocto Project and we use qemu to test boot our Linux
images and run tests against them. We've been noticing some instability
for ppc where the images sometimes hang, usually around udevd bring up
time so just after booting into userspace.

To cut a long story short, I've tracked down what I think is the
problem. I believe the decrementer timer stops receiving interrupts so
tasks in our images hang indefinitely as the timer stopped. 

It can be summed up with this line of debug:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100req 0004

It should normally read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100req 0002

The question is why CPU_INTERRUPT_EXITTB ends up being set when the
lines above this log message clearly sets CPU_INTERRUPT_HARD (via 
cpu_interrupt() ).

I note in cpu.h:

/* updates protected by BQL */
uint32_t interrupt_request;

(for struct CPUState)

The ppc code does "cs->interrupt_request |= CPU_INTERRUPT_EXITTB" in 5
places, 3 in excp_helper.c and 2 in helper_regs.h. In all cases,  
g_assert(qemu_mutex_iothread_locked()); fails. If I do something like:

if (!qemu_mutex_iothread_locked()) {
qemu_mutex_lock_iothread();
cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
qemu_mutex_unlock_iothread();
} else {
cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
}

in these call sites then I can no longer lock qemu up with my test
case.

I suspect the _HARD setting gets overwritten which stops the 
decrementer interrupts being delivered.

I don't know if taking this lock in these situations is going to be bad
for performance and whether such a patch would be right/wrong.

At this point I therefore wanted to seek advice on what the real issue
is here and how to fix it!

Cheers,

Richard






Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Jeff Cody
On Tue, Nov 21, 2017 at 12:13:46AM +0100, Paolo Bonzini wrote:
> On 21/11/2017 00:08, Jeff Cody wrote:
> > @@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
> > QEMUClockType type,
> >  CoSleepCB sleep_cb = {
> >  .co = qemu_coroutine_self(),
> >  };
> > +if (sleep_cb.co->sleeping == 1 || sleep_cb.co->scheduled == 1) {
> > +   fprintf(stderr, "Cannot sleep a co-routine that is already sleeping 
> > "
> > +   " or scheduled\n");
> > +   abort();
> > +}
> > +sleep_cb.co->sleeping = 1;
> >  sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, 
> > _cb);
> >  timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
> >  qemu_coroutine_yield();
> 
> I understand that this was just an example and not the actual patch, but
> I'll still point out that this loses the benefit (better error message)
> of keeping the flags separate.
> 
> What do you think about making "scheduled" a const char * and assigning
> __func__ to it (i.e. either "aio_co_schedule" or "co_aio_sleep_ns")?
> 

Ohhh, nice.  I'll spin a v2 with that, and merge patches 3 and 5 together.
And then maybe for 2.12 we can look at making it a fsm, like Stefan
suggested (or somehow make coroutine entry thread safe and idempotent).

Jeff



[Qemu-devel] [Bug 1717708] Re: QEMU aarch64 can't run Windows ARM64 iso's

2017-11-20 Thread Andrew Baumann
@Laszlo, I got the binaries from someone else, but they from a parallel
build system... nothing particularly special: some preprocessor rules to
build against the newer WDK, and also it was just those three drivers
(netkvm, vioinput, viostor). I imagine you can get the same result by
tweaking the VS-based build (sln/vcxproj files), but haven't tried.
Sorry that's not terribly helpful.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1717708

Title:
  QEMU aarch64 can't run Windows ARM64 iso's

Status in QEMU:
  New

Bug description:
  Hi,
  recently Windows ARM64 ISOs have been posted on the internet..
  just checked with latest QEMU 2.10 release from 
  https://qemu.weilnetz.de/w64/qemu-w64-setup-20170830.exe 
  "h:\qemu\qemu-system-aarch64.exe" -boot d -cdrom 
h:\iso\16353.1000.170825-1423.RS_PRERELEASE_CLIENTPRO_OEMRET_ARM64FRE_ES-ES.ISO 
-m 2048 -cpu cortex-a57 -smp 1 -machine virt
  seems no video output..
  checked various machine options for example versatilepb (says guest has not 
initialized the guest)..

  so don't know if it's a QEMU bug or lacking feature but can support
  running Windows ARM64 builds (would be nice if you can add a
  Snapdragon835 machine type which is which first machines will be
  running..)

  
  note running a Windows x64 ISO with similar parameters works (removing -cpu 
and -machine as not needed)

  thanks..

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1717708/+subscriptions



Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Paolo Bonzini
On 21/11/2017 00:08, Jeff Cody wrote:
> @@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
> QEMUClockType type,
>  CoSleepCB sleep_cb = {
>  .co = qemu_coroutine_self(),
>  };
> +if (sleep_cb.co->sleeping == 1 || sleep_cb.co->scheduled == 1) {
> +   fprintf(stderr, "Cannot sleep a co-routine that is already sleeping "
> +   " or scheduled\n");
> +   abort();
> +}
> +sleep_cb.co->sleeping = 1;
>  sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, _cb);
>  timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
>  qemu_coroutine_yield();

I understand that this was just an example and not the actual patch, but
I'll still point out that this loses the benefit (better error message)
of keeping the flags separate.

What do you think about making "scheduled" a const char * and assigning
__func__ to it (i.e. either "aio_co_schedule" or "co_aio_sleep_ns")?

Thanks,

Paolo



Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Jeff Cody
On Mon, Nov 20, 2017 at 11:47:09PM +0100, Paolo Bonzini wrote:
> On 20/11/2017 23:35, Jeff Cody wrote:
> >> Is this a different "state" (in Stefan's parlance) than scheduled?  In
> >> practice both means that someone may call qemu_(aio_)coroutine_enter
> >> concurrently, so you'd better not do it yourself.
> >>
> > It is slightly different; it is from sleeping with a timer via
> > co_aio_sleep_ns and waking via co_sleep_cb.  Whereas the 'co->scheduled' is
> > specifically from being scheduled for a specific AioContext, via
> > aio_co_schedule().
> 
> Right; however, that would only make a difference if we allowed
> canceling a co_aio_sleep_ns.  Since we don't want that, they have the
> same transitions.
> 
> > In practice, 'co->schedule' and 'co->sleeping' certainly rhyme, at the very
> > least.
> > 
> > But having them separate will put the abort closer to where the problem 
> > lies,
> > so it should make debugging a bit easier if we hit it.
> 
> What do you mean by closer?  It would print a slightly more informative
> message, but the message is in qemu_aio_coroutine_for both cases.
> 

Sorry, sloppy wording; I meant what you said above, that the error message
is more informative, so by tracking down where co->sleeping is set the
developer is closer to where the problem lies.

> In fact, unifying co->scheduled and co->sleeping means that you can
> easily abort when co_aio_sleep_ns is called on a scheduled coroutine, like
> 
> /* This is valid. */
> aio_co_schedule(qemu_get_current_aio_context(),
> qemu_coroutine_self());
> 
> /* But only if there was a qemu_coroutine_yield here.  */
> co_aio_sleep_ns(qemu_get_current_aio_context(), 1000);
>

That is true.  But we could also check (co->sleeping || co->scheduled) in
co_aio_sleep_ns() though, as well.

Hmm... not checking co->sleeping in co_aio_sleep_ns() is a bug in my
patch.  We don't want to schedule a coroutine on two different timers,
either.

So what do you think about adding this to the patch:

@@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
QEMUClockType type,
 CoSleepCB sleep_cb = {
 .co = qemu_coroutine_self(),
 };
+if (sleep_cb.co->sleeping == 1 || sleep_cb.co->scheduled == 1) {
+   fprintf(stderr, "Cannot sleep a co-routine that is already sleeping "
+   " or scheduled\n");
+   abort();
+}
+sleep_cb.co->sleeping = 1;
 sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, _cb);
 timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
 qemu_coroutine_yield();


Jeff



Re: [Qemu-devel] [SPARC] Qemu failed to display MMU mapping for non memory area.

2017-11-20 Thread Mark Cave-Ayland
On 19/11/17 14:12, Jean-Christophe DUBOIS wrote:

> Hello,
> 
> I am using Qemu to emulate a Leon3 based board.
> 
> In the software I am running on Qemu, I configured the virtual memory
> through MMU programming.
> 
> In particular, I mapped the built-in UART to a 4K page.
> 
> To check that my MMU table was OK I switched on (at compile time) the
> DEBUG_MMU facility in the target/sparc/ldst_helper.c file.
> 
> Then anytime I changed the MMU setting (through software) I got a
> display of it. A typical debug from Qemu console is as follow:
> 
> MMU: mmu change reg[2]: 0x0001 -> 0x0002
> Root ptr: 40003000, ctx: 2
> VA: 4000, PA: 4000 PDE: 04000401
>  VA: 4000, PA: 4000 PDE: 04000421
>   VA: 4000, PA: 4000 PTE: 04ba
>   VA: 40001000, PA: 40001000 PTE: 0400019a
>   VA: 40002000, PA: 40002000 PTE: 0400029a
>   VA: 40006000, PA: 40006000 PTE: 0400069e
>   VA: 40007000, PA: 40007000 PTE: 0400079a
>   VA: 40008000, PA: 40008000 PTE: 0400089e
>  VA: 4080, PA: 4000d000 PDE: 04000411
>   VA: 4080, PA: 4000d000 PTE: 04000db2
>   VA: 40802000, PA: 4000e000 PTE: 04000e82
>   VA: 40804000, PA: 40013000 PTE: 04001386
>   VA: 40806000, PA: 40017000 PTE: 04001786
>   VA: 40808000, PA:  PTE: 0806
>   VA: 4080a000, PA: 4001a000 PTE: 04001a82
>   VA: 4080c000, PA: 40019000 PTE: 04001982
>   VA: 4080e000, PA: 4001c000 PTE: 04001c82
>   VA: 4081, PA: 4001b000 PTE: 04001b82
> 
> As you can see Qemu (debug) is unable to find the physical address
> associated to 0x40808000 (which should be 0x8000 where the UART lives).
> 
> Note: This also has on impact on the ability to explore the memory
> through GDB. Trying to access 0x40808100 (mapped to 0x8100) through
> gdb (connected to Qemu) is impossible.
> 
> (gdb) x 0x40808100
> 0x40808100:    Cannot access memory at address 0x40808100
> (gdb)
> 
> I traced the problem down to the sparc_cpu_get_phys_page_debug()
> function in the target/sparc/mmu_helper.c file.
> 
> By commenting out the last part of the function, the MMU mapping debug
> in Qemu is functional again.
> 
> hwaddr sparc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
> {
>     SPARCCPU *cpu = SPARC_CPU(cs);
>     CPUSPARCState *env = >env;
>     hwaddr phys_addr;
>     int mmu_idx = cpu_mmu_index(env, false);
>     //MemoryRegionSection section;
> 
>     if (cpu_sparc_get_phys_page(env, _addr, addr, 2, mmu_idx) != 0) {
>     if (cpu_sparc_get_phys_page(env, _addr, addr, 0, mmu_idx)
> != 0) {
>     return -1;
>     }
>     }
>     /*
>     section = memory_region_find(get_system_memory(), phys_addr, 1);
>     memory_region_unref(section.mr);
>     if (!int128_nz(section.size)) {
>     printf("%s: failed to int128_nz()\n", __func__);
>     return -1;
>     }
>     */
>     return phys_addr;
> }
> 
> Root ptr: 40003000, ctx: 2
> VA: 4000, PA: 4000 PDE: 04000401
>  VA: 4000, PA: 4000 PDE: 04000421
>   VA: 4000, PA: 4000 PTE: 04ba
>   VA: 40001000, PA: 40001000 PTE: 0400019a
>   VA: 40002000, PA: 40002000 PTE: 0400029a
>   VA: 40006000, PA: 40006000 PTE: 0400069e
>   VA: 40007000, PA: 40007000 PTE: 0400079a
>   VA: 40008000, PA: 40008000 PTE: 0400089e
>  VA: 4080, PA: 4000d000 PDE: 04000411
>   VA: 4080, PA: 4000d000 PTE: 04000db2
>   VA: 40802000, PA: 4000e000 PTE: 04000e82
>   VA: 40804000, PA: 40013000 PTE: 04001386
>   VA: 40806000, PA: 40017000 PTE: 04001786
>   VA: 40808000, PA: 8000 PTE: 0806
>   VA: 4080a000, PA: 4001a000 PTE: 04001a82
>   VA: 4080c000, PA: 40019000 PTE: 04001982
>   VA: 4080e000, PA: 4001c000 PTE: 04001c82
>   VA: 4081, PA: 4001b000 PTE: 04001b82
> 
> Moreover, the GDB memory display is also working again with this change.
> 
> (gdb) x 0x40808100
> 0x40808100:    0x
> (gdb)
> 0x40808104:    0x0006
> (gdb)
> 0x40808108:    0x0002
> 
> I am not sure the proposed change is correct because GDB would then
> display memory result for memory area where there is no device mapped.
> For example accessing 0x40808000 would return 0 when there is no device
> mapped from 0x8000 to 0x80FF.
> 
> (gdb) x 0x40808000
> 0x40808000:    0x
> (gdb)
> 0x40808004:    0x
> (gdb)
> 0x40808008:    0x
> 
> You feed back would be appreciated.

Hi Jean-Christophe,

Thanks for the bug report. I asked on IRC earlier to see if anyone knew
what that particular section of code that you commented was for, and
Peter suggested that it was likely obsolete code from when the debug
path was also used to look up code instructions (apparently it is now
handled by cpu_get_phys_page_nofault()).

Can you send a formal version of your patch to qemu-devel@ as 

Re: [Qemu-devel] [PATCH 10/15] apb: remove pci_apb_init() and instantiate APB device using qdev

2017-11-20 Thread Mark Cave-Ayland
On 20/11/17 17:51, Philippe Mathieu-Daudé wrote:

> Hi Mark,
> 
> On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
>> By making the special_base and mem_base values qdev properties, we can move
>> the remaining parts of pci_apb_init() into the pbm init() and realize()
>> functions.
>>
>> This finally allows us to instantiate the APB directly using standard qdev
>> create/init functions in sun4u.c.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/pci-host/apb.c |  123 
>> ++---
>>  hw/sparc64/sun4u.c|6 ++-
>>  include/hw/pci-host/apb.h |4 +-
>>  3 files changed, 68 insertions(+), 65 deletions(-)
>>
>> diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
>> index 823661a..6c20285 100644
>> --- a/hw/pci-host/apb.c
>> +++ b/hw/pci-host/apb.c
>> @@ -611,41 +611,56 @@ static void apb_pci_bridge_realize(PCIDevice *dev, 
>> Error **errp)
>>  pci_bridge_update_mappings(PCI_BRIDGE(br));
>>  }
>>  
>> -APBState *pci_apb_init(hwaddr special_base,
>> -   hwaddr mem_base)
>> +static void pci_pbm_reset(DeviceState *d)
>>  {
>> -DeviceState *dev;
>> -SysBusDevice *s;
>> -PCIHostState *phb;
>> -APBState *d;
>> -IOMMUState *is;
>> +unsigned int i;
>> +APBState *s = APB_DEVICE(d);
>> +
>> +for (i = 0; i < 8; i++) {
>> +s->pci_irq_map[i] &= PBM_PCI_IMR_MASK;
>> +}
>> +for (i = 0; i < 32; i++) {
>> +s->obio_irq_map[i] &= PBM_PCI_IMR_MASK;
>> +}
>> +
>> +s->irq_request = NO_IRQ_REQUEST;
>> +s->pci_irq_in = 0ULL;
>> +
>> +if (s->nr_resets++ == 0) {
>> +/* Power on reset */
>> +s->reset_control = POR;
>> +}
>> +}
>> +
>> +static const MemoryRegionOps pci_config_ops = {
>> +.read = apb_pci_config_read,
>> +.write = apb_pci_config_write,
>> +.endianness = DEVICE_LITTLE_ENDIAN,
>> +};
>> +
>> +static void pci_pbm_realize(DeviceState *dev, Error **errp)
>> +{
>> +APBState *s = APB_DEVICE(dev);
>> +PCIHostState *phb = PCI_HOST_BRIDGE(dev);
>> +SysBusDevice *sbd = SYS_BUS_DEVICE(s);
>>  PCIDevice *pci_dev;
>> +IOMMUState *is;
>>  
>> -/* Ultrasparc PBM main bus */
>> -dev = qdev_create(NULL, TYPE_APB);
>> -d = APB_DEVICE(dev);
>> -phb = PCI_HOST_BRIDGE(dev);
>> -phb->bus = pci_register_bus(DEVICE(phb), "pci",
>> -pci_apb_set_irq, pci_apb_map_irq, d,
>> ->pci_mmio,
>> ->pci_ioport,
>> -0, 32, TYPE_PCI_BUS);
>> -qdev_init_nofail(dev);
>> -s = SYS_BUS_DEVICE(dev);
>>  /* apb_config */
>> -sysbus_mmio_map(s, 0, special_base);
>> +sysbus_mmio_map(sbd, 0, s->special_base);
> 
> You might add a safety check than special_base is correctly initialize
> (compare to -1?)

I guess strictly speaking this would be good to have, however since the
same value is also hard-coded in OpenBIOS then you'll get a crash if you
change it to any other value anyhow. So for that reason I think it's
okay for the moment as it is.


ATB,

Mark.



Re: [Qemu-devel] [PATCH 12/15] ebus: wire up OBIO interrupts to APB pbm via qdev GPIOs

2017-11-20 Thread Mark Cave-Ayland
On 20/11/17 01:02, Philippe Mathieu-Daudé wrote:

> Hi Mark,
> 
> On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
>> This enables us to remove the static array mapping in the ISA IRQ
>> handler (and the embedded reference to the APB device) by formalising
>> the interrupt wiring via the qdev GPIO API.
>>
>> For more clarity we replace the APB OBIO interrupt numbers with constants
>> designating the interrupt source, and rename isa_irq_handler() to
>> ebus_isa_irq_handler().
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/pci-host/apb.c |2 +-
>>  hw/sparc64/sun4u.c|   48 
>> ++---
>>  include/hw/pci-host/apb.h |8 +++-
>>  3 files changed, 32 insertions(+), 26 deletions(-)
>>
>> diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
>> index 268100e..f092780c 100644
>> --- a/hw/pci-host/apb.c
>> +++ b/hw/pci-host/apb.c
>> @@ -700,7 +700,7 @@ static void pci_pbm_init(Object *obj)
>>  for (i = 0; i < 32; i++) {
>>  s->obio_irq_map[i] = ((0x1f << 6) | 0x20) + i;
>>  }
>> -s->pbm_irqs = qemu_allocate_irqs(pci_apb_set_irq, s, MAX_IVEC);
>> +qdev_init_gpio_in_named(DEVICE(s), pci_apb_set_irq, "pbm-irq", 
>> MAX_IVEC);
>>  qdev_init_gpio_out_named(DEVICE(s), s->ivec_irqs, "ivec-irq", MAX_IVEC);
>>  s->irq_request = NO_IRQ_REQUEST;
>>  s->pci_irq_in = 0ULL;
>> diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
>> index 0a30fb8..da386d3 100644
>> --- a/hw/sparc64/sun4u.c
>> +++ b/hw/sparc64/sun4u.c
>> @@ -86,6 +86,7 @@ typedef struct EbusState {
>>  PCIDevice parent_obj;
>>  
>>  ISABus *isa_bus;
>> +qemu_irq isa_bus_irqs[16];
> 
> Can you use ISA_NUM_IRQS here?

Ah yes, good spot. I've updated this (and the other occurrences you
found) to ISA_NUM_IRQS accordingly.


ATB,

Mark.



Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 23:35, Jeff Cody wrote:
>> Is this a different "state" (in Stefan's parlance) than scheduled?  In
>> practice both means that someone may call qemu_(aio_)coroutine_enter
>> concurrently, so you'd better not do it yourself.
>>
> It is slightly different; it is from sleeping with a timer via
> co_aio_sleep_ns and waking via co_sleep_cb.  Whereas the 'co->scheduled' is
> specifically from being scheduled for a specific AioContext, via
> aio_co_schedule().

Right; however, that would only make a difference if we allowed
canceling a co_aio_sleep_ns.  Since we don't want that, they have the
same transitions.

> In practice, 'co->schedule' and 'co->sleeping' certainly rhyme, at the very
> least.
> 
> But having them separate will put the abort closer to where the problem lies,
> so it should make debugging a bit easier if we hit it.

What do you mean by closer?  It would print a slightly more informative
message, but the message is in qemu_aio_coroutine_for both cases.

In fact, unifying co->scheduled and co->sleeping means that you can
easily abort when co_aio_sleep_ns is called on a scheduled coroutine, like

/* This is valid. */
aio_co_schedule(qemu_get_current_aio_context(),
qemu_coroutine_self());

/* But only if there was a qemu_coroutine_yield here.  */
co_aio_sleep_ns(qemu_get_current_aio_context(), 1000);

Thanks,

Paolo



Re: [Qemu-devel] [PATCH 05/15] sun4u: move initialisation of all ISABus devices into ebus_realize()

2017-11-20 Thread Mark Cave-Ayland
On 20/11/17 00:47, Philippe Mathieu-Daudé wrote:

> On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
>> This belongs in the PCI-ISA bridge rather than at the machine level.
> 
> nice, this helps me in another series (clean out i386/pc, refactor
> superio devices).

Awesome!

>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/sparc64/sun4u.c |   78 
>> +++-
>>  1 file changed, 46 insertions(+), 32 deletions(-)
>>
>> diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
>> index f3203ea..b441f1e 100644
>> --- a/hw/sparc64/sun4u.c
>> +++ b/hw/sparc64/sun4u.c
>> @@ -85,6 +85,7 @@ typedef struct EbusState {
>>  PCIDevice parent_obj;
>>  
>>  ISABus *isa_bus;
>> +uint64_t console_serial_base;
> 
> I'd rather use hwaddr here, anyway:
> Reviewed-by: Philippe Mathieu-Daudé 

Yeah it could potentially be either, however to me it feels nicer to
have the value as uint64_t so it "matches" qdev_prop_set_uint64().


ATB,

Mark.



Re: [Qemu-devel] [PATCH 14/15] sparc64: introduce trace-events for hw/sparc64

2017-11-20 Thread Mark Cave-Ayland
On 19/11/17 15:14, Philippe Mathieu-Daudé wrote:

> Hi Mark,
> 
> On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
>> This is in preparation for switching code in hw/sparc64 from DPRINTF over to
>> trace events.
> 
> This could be squashed with next commit,
> 
> Either way:
> Reviewed-by: Philippe Mathieu-Daudé 

Thanks for the review! I'd prefer to keep this separate as it touches an
area which is outside SPARC just in case anyone wants to handle this
separately.


ATB,

Mark.



Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Jeff Cody
On Mon, Nov 20, 2017 at 11:30:39PM +0100, Paolo Bonzini wrote:
> On 20/11/2017 03:46, Jeff Cody wrote:
> > Once a coroutine is "sleeping", the timer callback will either enter the
> > coroutine, or schedule it for the next AioContext if using iothreads.
> > 
> > It is illegal to enter that coroutine while waiting for this timer
> > event and subsequent callback.  This patch will catch such an attempt,
> > and abort QEMU with an error.
> > 
> > Like with the previous patch, we cannot rely solely on the co->caller
> > check for recursive entry.  The prematurely entered coroutine may exit
> > with COROUTINE_TERMINATE before the timer expires, making co->caller no
> > longer valid.
> > 
> > We can clear co->sleeping in in co_sleep_cb(), because any doubly entry
> > attempt after point should be caught by either the co->scheduled or
> > co->caller checks.
> > 
> > Signed-off-by: Jeff Cody 
> > ---
> >  include/qemu/coroutine_int.h | 2 ++
> >  util/qemu-coroutine-sleep.c  | 3 +++
> >  util/qemu-coroutine.c| 5 +
> >  3 files changed, 10 insertions(+)
> > 
> > diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> > index 931cdc9..b071217 100644
> > --- a/include/qemu/coroutine_int.h
> > +++ b/include/qemu/coroutine_int.h
> > @@ -56,6 +56,8 @@ struct Coroutine {
> >  
> >  int scheduled;
> >  
> > +int sleeping;
> 
> Is this a different "state" (in Stefan's parlance) than scheduled?  In
> practice both means that someone may call qemu_(aio_)coroutine_enter
> concurrently, so you'd better not do it yourself.
> 

It is slightly different; it is from sleeping with a timer via
co_aio_sleep_ns and waking via co_sleep_cb.  Whereas the 'co->scheduled' is
specifically from being scheduled for a specific AioContext, via
aio_co_schedule().

In practice, 'co->schedule' and 'co->sleeping' certainly rhyme, at the very
least.

But having them separate will put the abort closer to where the problem lies,
so it should make debugging a bit easier if we hit it.

> 
> > +
> >  QSIMPLEQ_ENTRY(Coroutine) co_queue_next;
> >  QSLIST_ENTRY(Coroutine) co_scheduled_next;
> >  };
> > diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
> > index 9c56550..11ae95a 100644
> > --- a/util/qemu-coroutine-sleep.c
> > +++ b/util/qemu-coroutine-sleep.c
> > @@ -13,6 +13,7 @@
> >  
> >  #include "qemu/osdep.h"
> >  #include "qemu/coroutine.h"
> > +#include "qemu/coroutine_int.h"
> >  #include "qemu/timer.h"
> >  #include "block/aio.h"
> >  
> > @@ -25,6 +26,7 @@ static void co_sleep_cb(void *opaque)
> >  {
> >  CoSleepCB *sleep_cb = opaque;
> >  
> > +sleep_cb->co->sleeping = 0;
> >  aio_co_wake(sleep_cb->co);
> >  }
> >  
> > @@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
> > QEMUClockType type,
> >  CoSleepCB sleep_cb = {
> >  .co = qemu_coroutine_self(),
> >  };
> > +sleep_cb.co->sleeping = 1;
> >  sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, 
> > _cb);
> >  timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
> >  qemu_coroutine_yield();
> > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> > index 2edab63..1d9f93d 100644
> > --- a/util/qemu-coroutine.c
> > +++ b/util/qemu-coroutine.c
> > @@ -118,6 +118,11 @@ void qemu_aio_coroutine_enter(AioContext *ctx, 
> > Coroutine *co)
> >  abort();
> >  }
> >  
> > +if (co->sleeping == 1) {
> > +fprintf(stderr, "Cannot enter a co-routine that is still 
> > sleeping\n");
> > +abort();
> > +}
> > +
> >  if (co->caller) {
> >  fprintf(stderr, "Co-routine re-entered recursively\n");
> >  abort();
> > 
> 



Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 03:46, Jeff Cody wrote:
> Once a coroutine is "sleeping", the timer callback will either enter the
> coroutine, or schedule it for the next AioContext if using iothreads.
> 
> It is illegal to enter that coroutine while waiting for this timer
> event and subsequent callback.  This patch will catch such an attempt,
> and abort QEMU with an error.
> 
> Like with the previous patch, we cannot rely solely on the co->caller
> check for recursive entry.  The prematurely entered coroutine may exit
> with COROUTINE_TERMINATE before the timer expires, making co->caller no
> longer valid.
> 
> We can clear co->sleeping in in co_sleep_cb(), because any doubly entry
> attempt after point should be caught by either the co->scheduled or
> co->caller checks.
> 
> Signed-off-by: Jeff Cody 
> ---
>  include/qemu/coroutine_int.h | 2 ++
>  util/qemu-coroutine-sleep.c  | 3 +++
>  util/qemu-coroutine.c| 5 +
>  3 files changed, 10 insertions(+)
> 
> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> index 931cdc9..b071217 100644
> --- a/include/qemu/coroutine_int.h
> +++ b/include/qemu/coroutine_int.h
> @@ -56,6 +56,8 @@ struct Coroutine {
>  
>  int scheduled;
>  
> +int sleeping;

Is this a different "state" (in Stefan's parlance) than scheduled?  In
practice both means that someone may call qemu_(aio_)coroutine_enter
concurrently, so you'd better not do it yourself.

Paolo

> +
>  QSIMPLEQ_ENTRY(Coroutine) co_queue_next;
>  QSLIST_ENTRY(Coroutine) co_scheduled_next;
>  };
> diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
> index 9c56550..11ae95a 100644
> --- a/util/qemu-coroutine-sleep.c
> +++ b/util/qemu-coroutine-sleep.c
> @@ -13,6 +13,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/coroutine.h"
> +#include "qemu/coroutine_int.h"
>  #include "qemu/timer.h"
>  #include "block/aio.h"
>  
> @@ -25,6 +26,7 @@ static void co_sleep_cb(void *opaque)
>  {
>  CoSleepCB *sleep_cb = opaque;
>  
> +sleep_cb->co->sleeping = 0;
>  aio_co_wake(sleep_cb->co);
>  }
>  
> @@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, 
> QEMUClockType type,
>  CoSleepCB sleep_cb = {
>  .co = qemu_coroutine_self(),
>  };
> +sleep_cb.co->sleeping = 1;
>  sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, _cb);
>  timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
>  qemu_coroutine_yield();
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 2edab63..1d9f93d 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -118,6 +118,11 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine 
> *co)
>  abort();
>  }
>  
> +if (co->sleeping == 1) {
> +fprintf(stderr, "Cannot enter a co-routine that is still 
> sleeping\n");
> +abort();
> +}
> +
>  if (co->caller) {
>  fprintf(stderr, "Co-routine re-entered recursively\n");
>  abort();
> 




Re: [Qemu-devel] [Qemu-block] [PATCH 1/5] blockjob: do not allow coroutine double entry or entry-after-completion

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 12:16, Stefan Hajnoczi wrote:
> This raises questions about the ability to cancel sleep:
> 
> 1. Does something depend on cancelling sleep?

block_job_cancel does, but in practice the sleep time is so small
(smaller than SLICE_TIME, which is 100 ms) that we probably don't care.

I agree with Jeff that canceling the sleep by force-entering the
coroutine seemed clever but is probably a very bad idea.

Paolo



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] 答复: Re: 答复: Re: 答复: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-20 Thread Michael Roth
Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
> 
> xp professional 2002 service pack 3
> 
> build environment: 
> 
> root@localhost qemu-2.5.0]# cat /etc/redhat-release 
> 
> CentOS Linux release 7.0.1406 (Core) 

I haven't yet figured out why exactly, but with a CentOS 7.4 mingw
environment (via EPEL) I was indeed able to generate a binary that's
compatible with XP.

Still looking into it, but I've posted a pull for your modified patch in
the meantime. One notable change I made was to not make it fatal if we
fail to probe for GetIfStats2, since the stats are an optional return
value.

> 
> 
> 
> 
> 
> 
> 为了让您的VPlat虚拟机故障和docker故障得到高效的处理,请上报故障到: $VPlat技术支
> 持。
> 
> 芦志朋 luzhipeng
> 
> 
> IT开发工程师 IT Development Engineer
> 操作系统产品部/中心研究院/系统产品 OS Product Dept./Central R&D Institute/
> System Product
> 
> 
> [cid]  [cid]
>四川省成都市天府大道中段800号
>E: lu.zhip...@zte.com.cn
>www.zte.com.cn
> 
> 原始邮件
> 发件人: ;
> 收件人:芦志朋10108272;
> 抄送人: ;
> 日期:2017年11月15日 09:22
> 主题:Re: 答复: Re: 答复: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
> >  i test the latest qga in xp , it run ok .
> > 
> > 
> > my qga config :
> > 
> > Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> > x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> > '--target-list=x86_64-softmmu'
> 
> Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
> but I retried with XP x64 (SP2) and I still have the same issue.
> 
> I can only get qemu-ga working if I build on top of something prior to
> commit 12f8def0e.
> 
> What build environment are you using? I've tried Fedora Core 18 and 20
> and have the same issue with both.
> 
> > 
> > used qga version info
> > 
> > [root@ceshi qemu]# git log
> > 
> > commit 533ab83ea074d5fc457769f6ac698524a12f1156
> > 
> > Author: ZhiPeng Lu 
> > 
> > Date:   Fri Nov 10 10:17:14 2017 +0800
> > 
> > 
> > qga: fix some errors for guest_get_network_stats
> > 
> > 
> > 
> > fix some erros:
> > 
> > 1.if building qga on Windows Vista/2008 and newer,
> > 
> > it cann't find the link to GetIfEntry2 in windows xp.
> > 
> > 2. check valid of if_index.
> > 
> > 
> > 
> > Signed-off-by: ZhiPeng Lu 
> > 
> > 
> > commit de597a8b27722ce4f9cc660f930f7dccc712712d
> > 
> > Author: ZhiPeng Lu 
> > 
> > Date:   Fri Nov 3 22:54:20 2017 +0800
> > 
> > 
> > qga: replace GetIfEntry
> > 
> > 
> > 
> >
>  The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus 
> using
> > GetIfEntry2 instead of GetIfEntry.
> > 
> > 
> > 
> > Signed-off-by: ZhiPeng Lu 
> > 
> > *avoid CamelCase variable names
> > 
> > *update field names for MIB_IFROW -> MIB_IF_ROW2
> > 
> > Signed-off-by: Michael Roth 
> > 
> > 
> > commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> > 
> > Merge: a4f0537 10a7b7e
> > 
> > Author: Peter Maydell 
> > 
> > Date:   Tue Nov 7 14:43:35 2017 +
> > 
> > 
> > Merge remote-tracking branch 'remotes/berrange/tags/
> pull-2017-11-06-2' into
> > staging
> > 
> > 
> > 
> > Pull IO 2017/11/06 v2
> > 
> > 
> > 
> > 
> > 为了让您的VPlat虚拟机故障和docker故障得到高效的处理,请上报故障到: $VPlat技术
> 支
> > 持。
> > 
> > 芦志朋 luzhipeng
> > 
> > 
> > IT开发工程师 IT Development Engineer
> > 操作系统产品部/中心研究院/系统产品 OS Product Dept./Central R&D Institute/
> > System Product
> > 
> > 
> > [cid]  [cid]
> >四川省成都市天府大道中段800号
> >E: lu.zhip...@zte.com.cn
> >www.zte.com.cn
> > 
> > 原始邮件
> > 发件人: ;
> > 收件人:芦志朋10108272;
> > 抄送人: ;
> > 日期:2017年11月14日 07:57
> > 主题:Re: 答复: Re: [PATCH v2] qga: replace GetIfEntry
> > Quoting lu.zhip...@zte.com.cn (2017-11-09 05:26:15)
> > >  i think the code is better
> > > 
> > >  if (OSver.dwMajorVersion >= 6) {
> > >   MIB_IF_ROW2 aMib_ifrow;
> > >   typedef NETIOAPI_API (WINAPI *getifentry2_t)(PMIB_IF_ROW2 Row);
> > >   memset(_ifrow, 0, sizeof(aMib_ifrow));
> > >   aMib_ifrow.InterfaceIndex = nicId;
> > >   HMODULE module = GetModuleHandle("iphlpapi");
> > >   PVOID fun = GetProcAddress(module, "GetIfEntry2");
> > >   if (fun == NULL) {
> > >   error_setg(errp, QERR_QGA_COMMAND_FAILED,
> > >  "Failed to get address of GetIfEntry2");
> > >   return NULL;
> > >   }
> > > getifentry2_t getifentry2_ex = (getifentry2_t)fun;
> > > if (NO_ERROR == getifentry2_ex(_ifrow)){
> > > }
> > 
> > I've updated the patch with this change:
> >   https://github.com/mdroth/qemu/commits/qga-if-stats
> > 
> > But I'm a bit confused now: when I tried to test this on XP I realized that
> > that qemu-ga no longer works on XP, and generates the 

Re: [Qemu-devel] [Qemu-stable] [PATCH for-2.11] scripts/make-release: ship u-boot source as a tarball

2017-11-20 Thread Michael Roth
Quoting Michael Roth (2017-11-07 14:52:01)
> The u-boot sources we ship currently cause problems with unpacking on
> a case-insensitive filesystem due to path conflicts. This has been
> fixed in upstream u-boot via commit 610eec7f, but since it is not
> yet included in an official release we implement this approach as a
> temporary workaround.
> 
> Once we move to a u-boot containing commit 610eec7f we should revert
> this patch.
> 
> Cc: qemu-sta...@nongnu.org
> Cc: Alexander Graf 
> Cc: Richard Henderson 
> Cc: Thomas Huth 
> Cc: Peter Maydell 
> Suggested-by: Richard Henderson 
> Signed-off-by: Michael Roth 

Ping.

Peter, would you be able to apply this directly? Not sure who else this
would go through.

> ---
>  scripts/make-release | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/scripts/make-release b/scripts/make-release
> index fa6323fda8..3917df7142 100755
> --- a/scripts/make-release
> +++ b/scripts/make-release
> @@ -20,6 +20,10 @@ git checkout "v${version}"
>  git submodule update --init
>  (cd roms/seabios && git describe --tags --long --dirty > .version)
>  rm -rf .git roms/*/.git dtc/.git pixman/.git
> +# FIXME: The following line is a workaround for avoiding filename collisions
> +# when unpacking u-boot sources on case-insensitive filesystems. Once we
> +# update to something with u-boot commit 610eec7f0 we can drop this line.
> +tar cfj roms/u-boot.tar.bz2 -C roms u-boot && rm -rf roms/u-boot
>  popd
>  tar cfj ${destination}.tar.bz2 ${destination}
>  rm -rf ${destination}
> -- 
> 2.11.0
> 
> 




[Qemu-devel] [PULL for-2.11 1/1] qga: replace GetIfEntry with GetIfEntry2 for interface stats

2017-11-20 Thread Michael Roth
From: ZhiPeng Lu 

The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus
using GetIfEntry2 instead of GetIfEntry.

Signed-off-by: ZhiPeng Lu 
*avoid CamelCase variable names
*update field names for MIB_IFROW -> MIB_IF_ROW2
*dynamically probe for GetIfIndex2 to deal with older OSs
*check return value from get_interface_index
Signed-off-by: Michael Roth 
---
 qga/commands-win32.c | 54 
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 0322188a73..d79974f212 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -1169,24 +1169,46 @@ static DWORD get_interface_index(const char *guid)
 return index;
 }
 }
+
+typedef NETIOAPI_API (WINAPI *GetIfEntry2Func)(PMIB_IF_ROW2 Row);
+
 static int guest_get_network_stats(const char *name,
-   GuestNetworkInterfaceStat *stats)
+   GuestNetworkInterfaceStat *stats)
 {
-DWORD if_index = 0;
-MIB_IFROW a_mid_ifrow;
-memset(_mid_ifrow, 0, sizeof(a_mid_ifrow));
-if_index = get_interface_index(name);
-a_mid_ifrow.dwIndex = if_index;
-if (NO_ERROR == GetIfEntry(_mid_ifrow)) {
-stats->rx_bytes = a_mid_ifrow.dwInOctets;
-stats->rx_packets = a_mid_ifrow.dwInUcastPkts;
-stats->rx_errs = a_mid_ifrow.dwInErrors;
-stats->rx_dropped = a_mid_ifrow.dwInDiscards;
-stats->tx_bytes = a_mid_ifrow.dwOutOctets;
-stats->tx_packets = a_mid_ifrow.dwOutUcastPkts;
-stats->tx_errs = a_mid_ifrow.dwOutErrors;
-stats->tx_dropped = a_mid_ifrow.dwOutDiscards;
-return 0;
+OSVERSIONINFO os_ver;
+
+os_ver.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
+GetVersionEx(_ver);
+if (os_ver.dwMajorVersion >= 6) {
+MIB_IF_ROW2 a_mid_ifrow;
+GetIfEntry2Func getifentry2_ex;
+DWORD if_index = 0;
+HMODULE module = GetModuleHandle("iphlpapi");
+PVOID func = GetProcAddress(module, "GetIfEntry2");
+
+if (func == NULL) {
+return -1;
+}
+
+getifentry2_ex = (GetIfEntry2Func)func;
+if_index = get_interface_index(name);
+if (if_index == (DWORD)~0) {
+return -1;
+}
+
+memset(_mid_ifrow, 0, sizeof(a_mid_ifrow));
+a_mid_ifrow.InterfaceIndex = if_index;
+if (NO_ERROR == getifentry2_ex(_mid_ifrow)) {
+stats->rx_bytes = a_mid_ifrow.InOctets;
+stats->rx_packets = a_mid_ifrow.InUcastPkts;
+stats->rx_errs = a_mid_ifrow.InErrors;
+stats->rx_dropped = a_mid_ifrow.InDiscards;
+stats->tx_bytes = a_mid_ifrow.OutOctets;
+stats->tx_packets = a_mid_ifrow.OutUcastPkts;
+stats->tx_errs = a_mid_ifrow.OutErrors;
+stats->tx_dropped = a_mid_ifrow.OutDiscards;
+return 0;
+}
 }
 return -1;
 }
-- 
2.11.0




[Qemu-devel] [PULL for-2.11 0/1] qemu-ga patch queue for 2.11

2017-11-20 Thread Michael Roth
The following changes since commit b2996bb405e2806725a341c72d80be9e77ed8b82:

  Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171120' 
into staging (2017-11-20 18:00:16 +)

are available in the git repository at:

  git://github.com/mdroth/qemu.git tags/qga-pull-2017-11-20-tag

for you to fetch changes up to df83eabd5245828cbca32060aa191d8b03bc5d50:

  qga: replace GetIfEntry with GetIfEntry2 for interface stats (2017-11-20 
14:45:31 -0600)


qemu-ga patch queue for 2.11

* fix potential overflow in network interface stats reporting


ZhiPeng Lu (1):
  qga: replace GetIfEntry with GetIfEntry2 for interface stats

 qga/commands-win32.c | 54 
 1 file changed, 38 insertions(+), 16 deletions(-)




Re: [Qemu-devel] [PATCH v7 3/5] fw_cfg: do DMA read operation

2017-11-20 Thread Michael S. Tsirkin
On Mon, Nov 20, 2017 at 10:55:17AM +0100, Marc-André Lureau wrote:
> Modify fw_cfg_read_blob() to use DMA if the device supports it.
> Return errors, because the operation may fail.
> 
> The DMA operation is expected to run synchronously with today qemu,
> but the specification states that it may become async, so we run
> "control" field check in a loop for eventual changes.
> 
> We may want to switch all the *buf addresses to use only kmalloc'ed
> buffers (instead of using stack/image addresses with dma=false).
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  drivers/firmware/qemu_fw_cfg.c | 140 
> -
>  1 file changed, 123 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
> index 740df0df2260..e0fe6ff037c3 100644
> --- a/drivers/firmware/qemu_fw_cfg.c
> +++ b/drivers/firmware/qemu_fw_cfg.c
> @@ -33,6 +33,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  MODULE_AUTHOR("Gabriel L. Somlo ");
>  MODULE_DESCRIPTION("QEMU fw_cfg sysfs support");
> @@ -43,12 +45,25 @@ MODULE_LICENSE("GPL");
>  #define FW_CFG_ID 0x01
>  #define FW_CFG_FILE_DIR   0x19
>  
> +#define FW_CFG_VERSION_DMA 0x02
> +#define FW_CFG_DMA_CTL_ERROR   0x01
> +#define FW_CFG_DMA_CTL_READ0x02
> +#define FW_CFG_DMA_CTL_SKIP0x04
> +#define FW_CFG_DMA_CTL_SELECT  0x08
> +#define FW_CFG_DMA_CTL_WRITE   0x10
> +
>  /* size in bytes of fw_cfg signature */
>  #define FW_CFG_SIG_SIZE 4
>  
>  /* fw_cfg "file name" is up to 56 characters (including terminating nul) */
>  #define FW_CFG_MAX_FILE_PATH 56
>  
> +/* platform device for dma mapping */
> +static struct device *dev;
> +
> +/* fw_cfg revision attribute, in /sys/firmware/qemu_fw_cfg top-level dir. */
> +static u32 fw_cfg_rev;
> +
>  /* fw_cfg file directory entry type */
>  struct fw_cfg_file {
>   u32 size;
> @@ -57,6 +72,12 @@ struct fw_cfg_file {
>   char name[FW_CFG_MAX_FILE_PATH];
>  };
>  
> +struct fw_cfg_dma {
> + u32 control;
> + u32 length;
> + u64 address;
> +} __packed;
> +
>  /* fw_cfg device i/o register addresses */
>  static bool fw_cfg_is_mmio;
>  static phys_addr_t fw_cfg_p_base;

Drop __packed please. It causes many gcc versions to do insane things.
Can be a patch on top.


> @@ -75,12 +96,79 @@ static inline u16 fw_cfg_sel_endianness(u16 key)
>   return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
>  }
>  
> +static inline bool fw_cfg_dma_enabled(void)
> +{
> + return fw_cfg_rev & FW_CFG_VERSION_DMA && fw_cfg_reg_dma;
> +}
> +
> +/* qemu fw_cfg device is sync today, but spec says it may become async */
> +static void fw_cfg_wait_for_control(struct fw_cfg_dma *d, dma_addr_t dma)
> +{
> + do {
> + dma_sync_single_for_cpu(dev, dma, sizeof(*d), DMA_FROM_DEVICE);
> + if ((be32_to_cpu(d->control) & ~FW_CFG_DMA_CTL_ERROR) == 0)
> + return;
> +
> + usleep_range(50, 100);

And since in practice we never get to this line,
maybe we should just go back to yield here.

> + } while (true);
> +}
> +
> +static ssize_t fw_cfg_dma_transfer(void *address, u32 length, u32 control)
> +{
> + dma_addr_t dma_addr = 0;
> + static struct fw_cfg_dma d;
> + dma_addr_t dma;
> + ssize_t ret = length;
> + enum dma_data_direction dir =
> + (control & FW_CFG_DMA_CTL_READ ? DMA_FROM_DEVICE : 0);
> +
> + if (address && length) {
> + dma_addr = dma_map_single(dev, address, length, dir);
> + if (dma_mapping_error(NULL, dma_addr)) {
> + WARN(1, "%s: failed to map address\n", __func__);
> + return -EFAULT;
> + }
> + }
> +
> + d = (struct fw_cfg_dma) {
> + .address = cpu_to_be64(dma_addr),
> + .length = cpu_to_be32(length),
> + .control = cpu_to_be32(control)
> + };
> +
> + dma = dma_map_single(dev, , sizeof(d), DMA_BIDIRECTIONAL);
> + if (dma_mapping_error(NULL, dma)) {
> + WARN(1, "%s: failed to map fw_cfg_dma\n", __func__);
> + ret = -EFAULT;
> + goto end;
> + }
> +
> + iowrite32be((u64)dma >> 32, fw_cfg_reg_dma);
> + iowrite32be(dma, fw_cfg_reg_dma + 4);
> +
> + fw_cfg_wait_for_control(, dma);
> +
> + if (be32_to_cpu(d.control) & FW_CFG_DMA_CTL_ERROR) {
> + ret = -EIO;
> + }
> +
> + dma_unmap_single(dev, dma, sizeof(d), DMA_BIDIRECTIONAL);
> +
> +end:
> + if (dma_addr)
> + dma_unmap_single(dev, dma_addr, length, dir);
> +
> + return ret;
> +}
> +
>  /* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> -static inline void fw_cfg_read_blob(u16 key,
> - void *buf, loff_t pos, size_t count)
> +static ssize_t fw_cfg_read_blob(u16 key,
> + void *buf, loff_t pos, size_t count,
> +   

Re: [Qemu-devel] [Libguestfs] [qemu-img] support for XVA

2017-11-20 Thread Gandalf Corvotempesta
I did something different, that will build a raw image directly from a
xenserver export, on the fly.
Compared the resulting file (via MD5) with xenmygrate.py and there is a match.

Currently, this is the faster way to convert a XenServer image to a
raw file. Don't need to wait
for export, tar extract and conversion. It does all of that, at the
same time during the VM export in a single pass.

If someone interested in testing it, it would be apreciated.

2017-11-16 15:02 GMT+01:00 Max Reitz :
> On 2017-11-16 11:08, Gandalf Corvotempesta wrote:
>> 2017-11-15 23:55 GMT+01:00 Max Reitz :
>>> https://xanclic.moe/convert-xva.rb -- does this work?
>>> (It seems to works on the two example images I found...)
>>>
>>> An example is in the code, you use it like this:
>>>
>>> $ ./convert-xva.rb ~/Downloads/stats-appliance-2.36.020502.xva Ref:73
>>
>>
>> It doesn't work on huge images:
>>
>> # time convert-xva.rb 20171115_193814_efff_.xva Ref:10 -O qcow2 disk1.qcow2
>> /usr/bin/convert-xva.rb:119:in `exec': Argument list too long -
>> qemu-img (Errno::E2BIG)
>> from /usr/bin/convert-xva.rb:119:in `'
>>
>> real 9m41.414s
>> user 0m19.280s
>> sys 0m3.260s
>
> Well, there is this:
>
> https://github.com/XanClic/qemu.rb/blob/master/examples/convert-xva-online.rb
>
> That should get around that restriction, but currently it is stupidly
> slow (because it just issues all 1 MB copy operations simultaneously),
> and it doesn't use qemu-img convert.  Instead, it uses a real qemu with
> block jobs.  Ideally, you'd use it like this:
>
> $ ./convert-xva-online.rb ~/Downloads/stats-appliance-2.36.020502.xva
> Ref:73  81936 M
> $ qemu-img create -f qcow2 [whatever options you need] disk1.qcow2 \
> 81936M
> Formatting 'disk1.qcow2', fmt=qcow2 size=85916123136 encryption=off
> cluster_size=65536 lazy_refcounts=off refcount_bits=16
> $ ./convert-xva-online.rb ~/Downloads/stats-appliance-2.36.020502.xva \
> Ref:73 \
> '{"driver":"qcow2",
>   "detect-zeroes":"unmap",
>   "discard":"unmap",
>   "file":{"driver":"file","filename":"disk1.qcow2"}}'
> Adding block devices...
> Starting block jobs...
> 100.00 % of jobs completed
>
> (Maybe I can get it faster.)
>
> Max
>


xva_conv.sh
Description: Bourne shell script


Re: [Qemu-devel] [PATCH v2 for-2.11 2/2] accel/tcg: Handle atomic accesses to notdirty memory correctly

2017-11-20 Thread Richard Henderson
On 11/20/2017 07:08 PM, Peter Maydell wrote:
> To do a write to memory that is marked as notdirty, we need
> to invalidate any TBs we have cached for that memory, and
> update the cpu physical memory dirty flags for VGA and migration.
> The slowpath code in notdirty_mem_write() does all this correctly,
> but the new atomic handling code in atomic_mmu_lookup() doesn't
> do anything at all, it just clears the dirty bit in the TLB.
> 
> The effect of this bug is that if the first write to a notdirty
> page for which we have cached TBs is by a guest atomic access,
> we fail to invalidate the TBs and subsequently will execute
> incorrect code. This can be seen by trying to run 'javac' on AArch64.
> 
> Use the new notdirty_call_before() and notdirty_call_after()
> functions to correctly handle the update to notdirty memory
> in the atomic codepath.
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Maydell 
> ---
>  accel/tcg/atomic_template.h | 12 
>  accel/tcg/cputlb.c  | 38 +-
>  accel/tcg/user-exec.c   |  1 +
>  3 files changed, 38 insertions(+), 13 deletions(-)

Reviewed-by: Richard Henderson 


r~



[Qemu-devel] [PULL 11/15] linux-user/s390x: Mask si_addr for SIGSEGV

2017-11-20 Thread riku . voipio
From: Peter Maydell 

For s390x, the address passed to a signal handler in the
siginfo_t si_addr field is masked (in the kernel this is done in
do_sigbus() and do_sigsegv() in arch/s390/mm/fault.c). Implement
this architecture-specific oddity in linux-user.

This is one of the issues described in
https://bugs.launchpad.net/qemu/+bug/1705118

Reviewed-by: Laurent Vivier 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/main.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index aa02f25b85..b6dd9efd2d 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3238,6 +3238,10 @@ void cpu_loop(CPUAlphaState *env)
 #endif /* TARGET_ALPHA */
 
 #ifdef TARGET_S390X
+
+/* s390x masks the fault address it reports in si_addr for SIGSEGV and SIGBUS 
*/
+#define S390X_FAIL_ADDR_MASK -4096LL
+
 void cpu_loop(CPUS390XState *env)
 {
 CPUState *cs = CPU(s390_env_get_cpu(env));
@@ -3294,7 +3298,7 @@ void cpu_loop(CPUS390XState *env)
 sig = TARGET_SIGSEGV;
 /* XXX: check env->error_code */
 n = TARGET_SEGV_MAPERR;
-addr = env->__excp_addr;
+addr = env->__excp_addr & S390X_FAIL_ADDR_MASK;
 goto do_signal;
 case PGM_EXECUTE:
 case PGM_SPECIFICATION:
-- 
2.14.2




[Qemu-devel] [PULL 13/15] linux-user/sparc: Put address for data faults where linux-user expects it

2017-11-20 Thread riku . voipio
From: Peter Maydell 

In the user-mode-only version of sparc_cpu_handle_mmu_fault(),
we must save the fault address for a data fault into the CPU
state's mmu registers, because the code in linux-user/main.c
expects to find it there in order to populate the si_addr
field of the guest siginfo.

Reviewed-by: Laurent Vivier 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 target/sparc/mmu_helper.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 126ea5e3ee..d5b6c1e48c 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -30,10 +30,18 @@
 int sparc_cpu_handle_mmu_fault(CPUState *cs, vaddr address, int rw,
int mmu_idx)
 {
+SPARCCPU *cpu = SPARC_CPU(cs);
+CPUSPARCState *env = >env;
+
 if (rw & 2) {
 cs->exception_index = TT_TFAULT;
 } else {
 cs->exception_index = TT_DFAULT;
+#ifdef TARGET_SPARC64
+env->dmmu.mmuregs[4] = address;
+#else
+env->mmuregs[4] = address;
+#endif
 }
 return 1;
 }
-- 
2.14.2




[Qemu-devel] [PULL 10/15] linux-user: return EINVAL from prctl(PR_*_SECCOMP)

2017-11-20 Thread riku . voipio
From: James Cowgill 

If an application tries to install a seccomp filter using
prctl(PR_SET_SECCOMP), the filter is likely for the target instead of the host
architecture. This will probably cause qemu to be immediately killed when it
executes another syscall.

Prevent this from happening by returning EINVAL from both seccomp prctl
calls. This is the error returned by the kernel when seccomp support is
disabled.

Fixes: https://bugs.launchpad.net/qemu/+bug/1726394
Reviewed-by: Laurent Vivier 
Signed-off-by: James Cowgill 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 84e123b67b..f31b853bb7 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -10505,6 +10505,12 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 break;
 }
 #endif
+case PR_GET_SECCOMP:
+case PR_SET_SECCOMP:
+/* Disable seccomp to prevent the target disabling syscalls we
+ * need. */
+ret = -TARGET_EINVAL;
+break;
 default:
 /* Most prctl options have no pointer arguments */
 ret = get_errno(prctl(arg1, arg2, arg3, arg4, arg5));
-- 
2.14.2




[Qemu-devel] [PULL 04/15] linux-user/hppa: Fix typo for TARGET_NR_epoll_wait

2017-11-20 Thread riku . voipio
From: Helge Deller 

Reviewed-by: Laurent Vivier 
Signed-off-by: Helge Deller 
Message-Id: <20170311100543.ga29...@ls3530.fritz.box>
Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/hppa/syscall_nr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/hppa/syscall_nr.h b/linux-user/hppa/syscall_nr.h
index 0f396fa1e2..55bdf71d50 100644
--- a/linux-user/hppa/syscall_nr.h
+++ b/linux-user/hppa/syscall_nr.h
@@ -228,7 +228,7 @@
 #define TARGET_NR_lookup_dcookie223
 #define TARGET_NR_epoll_create  224
 #define TARGET_NR_epoll_ctl 225
-#define TARGET_NR_epill_wait226
+#define TARGET_NR_epoll_wait226
 #define TARGET_NR_remap_file_pages  227
 #define TARGET_NR_semtimedop228
 #define TARGET_NR_mq_open   229
-- 
2.14.2




[Qemu-devel] [PULL 14/15] linux-user: Handle rt_sigaction correctly for SPARC

2017-11-20 Thread riku . voipio
From: Peter Maydell 

SPARC is like Alpha in its handling of the rt_sigaction syscall:
it takes an extra parameter 'restorer' which needs to be copied
into the sa_restorer field of the sigaction struct. The order
of the arguments differs slightly between SPARC and Alpha but
the implementation is otherwise the same. (Compare the
rt_sigaction() functions in arch/sparc/kernel/sys_sparc_64.c
and arch/alpha/kernel/signal.c.)

Note that this change is somewhat moot until SPARC acquires
support for actually delivering RT signals.

Reviewed-by: Laurent Vivier 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f31b853bb7..11c9116c4a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8579,8 +8579,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 case TARGET_NR_rt_sigaction:
 {
 #if defined(TARGET_ALPHA)
-struct target_sigaction act, oact, *pact = 0;
+/* For Alpha and SPARC this is a 5 argument syscall, with
+ * a 'restorer' parameter which must be copied into the
+ * sa_restorer field of the sigaction struct.
+ * For Alpha that 'restorer' is arg5; for SPARC it is arg4,
+ * and arg5 is the sigsetsize.
+ * Alpha also has a separate rt_sigaction struct that it uses
+ * here; SPARC uses the usual sigaction struct.
+ */
 struct target_rt_sigaction *rt_act;
+struct target_sigaction act, oact, *pact = 0;
 
 if (arg4 != sizeof(target_sigset_t)) {
 ret = -TARGET_EINVAL;
@@ -8606,18 +8614,29 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 unlock_user_struct(rt_act, arg3, 1);
 }
 #else
+#ifdef TARGET_SPARC
+target_ulong restorer = arg4;
+target_ulong sigsetsize = arg5;
+#else
+target_ulong sigsetsize = arg4;
+#endif
 struct target_sigaction *act;
 struct target_sigaction *oact;
 
-if (arg4 != sizeof(target_sigset_t)) {
+if (sigsetsize != sizeof(target_sigset_t)) {
 ret = -TARGET_EINVAL;
 break;
 }
 if (arg2) {
-if (!lock_user_struct(VERIFY_READ, act, arg2, 1))
+if (!lock_user_struct(VERIFY_READ, act, arg2, 1)) {
 goto efault;
-} else
+}
+#ifdef TARGET_SPARC
+act->sa_restorer = restorer;
+#endif
+} else {
 act = NULL;
+}
 if (arg3) {
 if (!lock_user_struct(VERIFY_WRITE, oact, arg3, 0)) {
 ret = -TARGET_EFAULT;
-- 
2.14.2




[Qemu-devel] [PULL 03/15] linux-user/hppa: Fix cpu_clone_regs

2017-11-20 Thread riku . voipio
From: Richard Henderson 

By failing to return from the syscall in the child, the child
issues another clone syscall and hilarity ensues.

Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/hppa/target_cpu.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/linux-user/hppa/target_cpu.h b/linux-user/hppa/target_cpu.h
index 1a5cecad3c..e50522eae9 100644
--- a/linux-user/hppa/target_cpu.h
+++ b/linux-user/hppa/target_cpu.h
@@ -24,7 +24,11 @@ static inline void cpu_clone_regs(CPUHPPAState *env, 
target_ulong newsp)
 if (newsp) {
 env->gr[30] = newsp;
 }
+/* Indicate child in return value.  */
 env->gr[28] = 0;
+/* Return from the syscall.  */
+env->iaoq_f = env->gr[31];
+env->iaoq_b = env->gr[31] + 4;
 }
 
 static inline void cpu_set_tls(CPUHPPAState *env, target_ulong newtls)
-- 
2.14.2




[Qemu-devel] [PULL 15/15] linux-user: Fix calculation of auxv length

2017-11-20 Thread riku . voipio
From: Peter Maydell 

In commit 7c4ee5bcc82e643 we changed the order in which we construct
the AUXV, but forgot to adjust the calculation of the length. The
result is that we set info->auxv_len to a bogus and negative value,
and then later on the code in open_self_auxv() gets confused and
ends up presenting the guest with an empty file.

Since we now have to calculate the auxv length up-front as part
of figuring out how much we're going to put on the stack, set
info->auxv_len then; this allows us to assert that we put the
same number of entries into auxv as we pre-calculated, rather
than merely having a comment saying we need to do that.

Fixes: https://bugs.launchpad.net/qemu/+bug/1728116

Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/elfload.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 3b857fbc9c..20f3d8c2c3 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1732,6 +1732,8 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, 
int envc,
 #ifdef ELF_HWCAP2
 size += 2;
 #endif
+info->auxv_len = size * n;
+
 size += envc + argc + 2;
 size += 1;  /* argc itself */
 size *= n;
@@ -1760,7 +1762,6 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, 
int envc,
 put_user_ual(val, u_auxv); u_auxv += n; \
 } while(0)
 
-/* There must be exactly DLINFO_ITEMS entries here.  */
 #ifdef ARCH_DLINFO
 /*
  * ARCH_DLINFO must come first so platform specific code can enforce
@@ -1768,6 +1769,9 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, 
int envc,
  */
 ARCH_DLINFO;
 #endif
+/* There must be exactly DLINFO_ITEMS entries here, or the assert
+ * on info->auxv_len will trigger.
+ */
 NEW_AUX_ENT(AT_PHDR, (abi_ulong)(info->load_addr + exec->e_phoff));
 NEW_AUX_ENT(AT_PHENT, (abi_ulong)(sizeof (struct elf_phdr)));
 NEW_AUX_ENT(AT_PHNUM, (abi_ulong)(exec->e_phnum));
@@ -1793,7 +1797,10 @@ static abi_ulong create_elf_tables(abi_ulong p, int 
argc, int envc,
 NEW_AUX_ENT (AT_NULL, 0);
 #undef NEW_AUX_ENT
 
-info->auxv_len = u_argv - info->saved_auxv;
+/* Check that our initial calculation of the auxv length matches how much
+ * we actually put into it.
+ */
+assert(info->auxv_len == u_auxv - info->saved_auxv);
 
 put_user_ual(argc, u_argc);
 
-- 
2.14.2




[Qemu-devel] [PULL 06/15] linux-user/hppa: Fix TARGET_F_RDLCK, TARGET_F_WRLCK, TARGET_F_UNLCK

2017-11-20 Thread riku . voipio
From: Helge Deller 

Signed-off-by: Helge Deller 
Signed-off-by: Richard Henderson 
Message-ID: <20170311175019.ga7...@ls3530.fritz.box>
Signed-off-by: Riku Voipio 
---
 linux-user/syscall_defs.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index a6ed30d70e..daa2a57398 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2361,6 +2361,9 @@ struct target_statfs64 {
 #define TARGET_F_SETOWN24   /*  for sockets. */
 #define TARGET_F_GETOWN23   /*  for sockets. */
 #elif defined(TARGET_HPPA)
+#define TARGET_F_RDLCK 1
+#define TARGET_F_WRLCK 2
+#define TARGET_F_UNLCK 3
 #define TARGET_F_GETLK 5
 #define TARGET_F_SETLK 6
 #define TARGET_F_SETLKW7
-- 
2.14.2




[Qemu-devel] [PULL 08/15] linux-user/syscall.c: Handle SH4's exceptional alignment for p{read, write}64

2017-11-20 Thread riku . voipio
From: James Clarke 

Fixes: https://bugs.launchpad.net/qemu/+bug/1716767
Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-By: John Paul Adrian Glaubitz 
Signed-off-by: James Clarke 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 36 +---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8047bf3aac..9268c3ef69 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -671,18 +671,32 @@ static inline int next_free_host_timer(void)
 
 /* ARM EABI and MIPS expect 64bit types aligned even on pairs or registers */
 #ifdef TARGET_ARM
-static inline int regpairs_aligned(void *cpu_env) {
+static inline int regpairs_aligned(void *cpu_env, int num)
+{
 return CPUARMState *)cpu_env)->eabi) == 1) ;
 }
 #elif defined(TARGET_MIPS) && (TARGET_ABI_BITS == 32)
-static inline int regpairs_aligned(void *cpu_env) { return 1; }
+static inline int regpairs_aligned(void *cpu_env, int num) { return 1; }
 #elif defined(TARGET_PPC) && !defined(TARGET_PPC64)
 /* SysV AVI for PPC32 expects 64bit parameters to be passed on odd/even pairs
  * of registers which translates to the same as ARM/MIPS, because we start with
  * r3 as arg1 */
-static inline int regpairs_aligned(void *cpu_env) { return 1; }
+static inline int regpairs_aligned(void *cpu_env, int num) { return 1; }
+#elif defined(TARGET_SH4)
+/* SH4 doesn't align register pairs, except for p{read,write}64 */
+static inline int regpairs_aligned(void *cpu_env, int num)
+{
+switch (num) {
+case TARGET_NR_pread64:
+case TARGET_NR_pwrite64:
+return 1;
+
+default:
+return 0;
+}
+}
 #else
-static inline int regpairs_aligned(void *cpu_env) { return 0; }
+static inline int regpairs_aligned(void *cpu_env, int num) { return 0; }
 #endif
 
 #define ERRNO_TABLE_SIZE 1200
@@ -6870,7 +6884,7 @@ static inline abi_long target_truncate64(void *cpu_env, 
const char *arg1,
  abi_long arg3,
  abi_long arg4)
 {
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, TARGET_NR_truncate64)) {
 arg2 = arg3;
 arg3 = arg4;
 }
@@ -6884,7 +6898,7 @@ static inline abi_long target_ftruncate64(void *cpu_env, 
abi_long arg1,
   abi_long arg3,
   abi_long arg4)
 {
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, TARGET_NR_ftruncate64)) {
 arg2 = arg3;
 arg3 = arg4;
 }
@@ -10508,7 +10522,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 #endif
 #ifdef TARGET_NR_pread64
 case TARGET_NR_pread64:
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, num)) {
 arg4 = arg5;
 arg5 = arg6;
 }
@@ -10518,7 +10532,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 unlock_user(p, arg2, ret);
 break;
 case TARGET_NR_pwrite64:
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, num)) {
 arg4 = arg5;
 arg5 = arg6;
 }
@@ -11288,7 +11302,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 arg6 = ret;
 #else
 /* 6 args: fd, offset (high, low), len (high, low), advice */
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, num)) {
 /* offset is in (3,4), len in (5,6) and advice in 7 */
 arg2 = arg3;
 arg3 = arg4;
@@ -11307,7 +11321,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 #ifdef TARGET_NR_fadvise64
 case TARGET_NR_fadvise64:
 /* 5 args: fd, offset (high, low), len, advice */
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, num)) {
 /* offset is in (3,4), len in 5 and advice in 6 */
 arg2 = arg3;
 arg3 = arg4;
@@ -11420,7 +11434,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 #ifdef TARGET_NR_readahead
 case TARGET_NR_readahead:
 #if TARGET_ABI_BITS == 32
-if (regpairs_aligned(cpu_env)) {
+if (regpairs_aligned(cpu_env, num)) {
 arg2 = arg3;
 arg3 = arg4;
 arg4 = arg5;
-- 
2.14.2




[Qemu-devel] [PULL 12/15] linux-user/ppc: Report correct fault address for data faults

2017-11-20 Thread riku . voipio
From: Peter Maydell 

For faults on loads and stores, ppc_cpu_handle_mmu_fault() in
target/ppc/user_only_helper.c stores the offending address
in env->spr[SPR_DAR]. Report this correctly to the guest
in si_addr, rather than incorrectly using the address of the
instruction that caused the fault.

This fixes the test case in
https://bugs.launchpad.net/qemu/+bug/1077116
for ppc, ppc64 and ppc64le.

Reviewed-by: Laurent Vivier 
Signed-off-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index b6dd9efd2d..6286661bd3 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -1420,7 +1420,7 @@ void cpu_loop(CPUPPCState *env)
 info.si_code = TARGET_SEGV_MAPERR;
 break;
 }
-info._sifields._sigfault._addr = env->nip;
+info._sifields._sigfault._addr = env->spr[SPR_DAR];
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
 break;
 case POWERPC_EXCP_ISI:  /* Instruction storage exception */
-- 
2.14.2




[Qemu-devel] [PULL 05/15] linux-user/hppa: Fix TARGET_MAP_TYPE

2017-11-20 Thread riku . voipio
From: Helge Deller 

TARGET_MAP_TYPE needs to be 0x03 instead of 0x0f on the hppa
architecture, otherwise it conflicts with MAP_FIXED which is 0x04.

Signed-off-by: Helge Deller 
Signed-off-by: Richard Henderson 
Message-ID: <20170311175019.ga7...@ls3530.fritz.box>
Signed-off-by: Riku Voipio 
---
 linux-user/syscall_defs.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 38339ecb9a..a6ed30d70e 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -1336,7 +1336,11 @@ struct target_winsize {
 /* Common */
 #define TARGET_MAP_SHARED  0x01/* Share changes */
 #define TARGET_MAP_PRIVATE 0x02/* Changes are private */
-#define TARGET_MAP_TYPE0x0f/* Mask for type of 
mapping */
+#if defined(TARGET_HPPA)
+#define TARGET_MAP_TYPE 0x03   /* Mask for type of mapping */
+#else
+#define TARGET_MAP_TYPE 0x0f   /* Mask for type of mapping */
+#endif
 
 /* Target specific */
 #if defined(TARGET_MIPS)
-- 
2.14.2




[Qemu-devel] [PULL 02/15] linux-user/hppa: Fix TARGET_SA_* defines

2017-11-20 Thread riku . voipio
From: Helge Deller 

Reviewed-by: Laurent Vivier 
Signed-off-by: Helge Deller 
Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall_defs.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index e366183419..38339ecb9a 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -473,6 +473,14 @@ int do_sigaction(int sig, const struct target_sigaction 
*act,
 #define TARGET_SA_RESETHAND0x0010
 #define TARGET_SA_NOCLDWAIT0x0020 /* not supported yet */
 #define TARGET_SA_SIGINFO  0x0040
+#elif defined(TARGET_HPPA)
+#define TARGET_SA_ONSTACK   0x0001
+#define TARGET_SA_RESETHAND 0x0004
+#define TARGET_SA_NOCLDSTOP 0x0008
+#define TARGET_SA_SIGINFO   0x0010
+#define TARGET_SA_NODEFER   0x0020
+#define TARGET_SA_RESTART   0x0040
+#define TARGET_SA_NOCLDWAIT 0x0080
 #else
 #define TARGET_SA_NOCLDSTOP0x0001
 #define TARGET_SA_NOCLDWAIT0x0002 /* not supported yet */
-- 
2.14.2




[Qemu-devel] [PULL 01/15] linux-user: Restrict usage of sa_restorer

2017-11-20 Thread riku . voipio
From: Richard Henderson 

Reading and writing to an sa_restorer member that isn't supposed to
exist corrupts user memory.  Introduce TARGET_ARCH_HAS_SA_RESTORER,
similar to the kernel's __ARCH_HAS_SA_RESTORER.

Reported-by: Helge Deller 
Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/signal.c   |  4 ++--
 linux-user/syscall_defs.h | 13 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 7a238aaea1..cf35473671 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -777,7 +777,7 @@ int do_sigaction(int sig, const struct target_sigaction 
*act,
 if (oact) {
 __put_user(k->_sa_handler, >_sa_handler);
 __put_user(k->sa_flags, >sa_flags);
-#if !defined(TARGET_MIPS)
+#ifdef TARGET_ARCH_HAS_SA_RESTORER
 __put_user(k->sa_restorer, >sa_restorer);
 #endif
 /* Not swapped.  */
@@ -787,7 +787,7 @@ int do_sigaction(int sig, const struct target_sigaction 
*act,
 /* FIXME: This is not threadsafe.  */
 __get_user(k->_sa_handler, >_sa_handler);
 __get_user(k->sa_flags, >sa_flags);
-#if !defined(TARGET_MIPS)
+#ifdef TARGET_ARCH_HAS_SA_RESTORER
 __get_user(k->sa_restorer, >sa_restorer);
 #endif
 /* To be swapped in target_to_host_sigset.  */
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 450960bb54..e366183419 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -445,6 +445,7 @@ int do_sigaction(int sig, const struct target_sigaction 
*act,
 #define TARGET_SA_RESTART  2u
 #define TARGET_SA_NODEFER  0x20u
 #define TARGET_SA_RESETHAND4u
+#define TARGET_ARCH_HAS_SA_RESTORER 1
 #elif defined(TARGET_MIPS)
 #define TARGET_SA_NOCLDSTOP0x0001
 #define TARGET_SA_NOCLDWAIT0x0001
@@ -483,6 +484,10 @@ int do_sigaction(int sig, const struct target_sigaction 
*act,
 #define TARGET_SA_RESTORER 0x0400
 #endif
 
+#ifdef TARGET_SA_RESTORER
+#define TARGET_ARCH_HAS_SA_RESTORER 1
+#endif
+
 #if defined(TARGET_ALPHA)
 
 #define TARGET_SIGHUP1
@@ -718,19 +723,27 @@ struct target_sigaction {
abi_ulong   _sa_handler;
 #endif
target_sigset_t sa_mask;
+#ifdef TARGET_ARCH_HAS_SA_RESTORER
+/* ??? This is always present, but ignored unless O32.  */
+abi_ulong sa_restorer;
+#endif
 };
 #else
 struct target_old_sigaction {
 abi_ulong _sa_handler;
 abi_ulong sa_mask;
 abi_ulong sa_flags;
+#ifdef TARGET_ARCH_HAS_SA_RESTORER
 abi_ulong sa_restorer;
+#endif
 };
 
 struct target_sigaction {
 abi_ulong _sa_handler;
 abi_ulong sa_flags;
+#ifdef TARGET_ARCH_HAS_SA_RESTORER
 abi_ulong sa_restorer;
+#endif
 target_sigset_t sa_mask;
 };
 #endif
-- 
2.14.2




[Qemu-devel] [PULL 07/15] linux-user: Handle TARGET_MAP_STACK and TARGET_MAP_HUGETLB

2017-11-20 Thread riku . voipio
From: Helge Deller 

Add the missing defines and for TARGET_MAP_STACK and TARGET_MAP_HUGETLB
for alpha, mips, ppc, x86, hppa.  Fix the mmap_flags translation table
to translate MAP_HUGETLB between host and target architecture, and to
drop MAP_STACK.

Signed-off-by: Helge Deller 
Message-Id: <20170311183016.ga20...@ls3530.fritz.box>
[rth: Drop MAP_STACK instead of translating it, since it is ignored
in the kernel anyway.  Fix tabs to spaces.]
Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c  | 31 ---
 linux-user/syscall_defs.h | 10 ++
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index d4497dec5d..8047bf3aac 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5872,17 +5872,26 @@ static const StructEntry struct_termios_def = {
 };
 
 static bitmask_transtbl mmap_flags_tbl[] = {
-   { TARGET_MAP_SHARED, TARGET_MAP_SHARED, MAP_SHARED, MAP_SHARED },
-   { TARGET_MAP_PRIVATE, TARGET_MAP_PRIVATE, MAP_PRIVATE, MAP_PRIVATE },
-   { TARGET_MAP_FIXED, TARGET_MAP_FIXED, MAP_FIXED, MAP_FIXED },
-   { TARGET_MAP_ANONYMOUS, TARGET_MAP_ANONYMOUS, MAP_ANONYMOUS, 
MAP_ANONYMOUS },
-   { TARGET_MAP_GROWSDOWN, TARGET_MAP_GROWSDOWN, MAP_GROWSDOWN, 
MAP_GROWSDOWN },
-   { TARGET_MAP_DENYWRITE, TARGET_MAP_DENYWRITE, MAP_DENYWRITE, 
MAP_DENYWRITE },
-   { TARGET_MAP_EXECUTABLE, TARGET_MAP_EXECUTABLE, MAP_EXECUTABLE, 
MAP_EXECUTABLE },
-   { TARGET_MAP_LOCKED, TARGET_MAP_LOCKED, MAP_LOCKED, MAP_LOCKED },
-{ TARGET_MAP_NORESERVE, TARGET_MAP_NORESERVE, MAP_NORESERVE,
-  MAP_NORESERVE },
-   { 0, 0, 0, 0 }
+{ TARGET_MAP_SHARED, TARGET_MAP_SHARED, MAP_SHARED, MAP_SHARED },
+{ TARGET_MAP_PRIVATE, TARGET_MAP_PRIVATE, MAP_PRIVATE, MAP_PRIVATE },
+{ TARGET_MAP_FIXED, TARGET_MAP_FIXED, MAP_FIXED, MAP_FIXED },
+{ TARGET_MAP_ANONYMOUS, TARGET_MAP_ANONYMOUS,
+  MAP_ANONYMOUS, MAP_ANONYMOUS },
+{ TARGET_MAP_GROWSDOWN, TARGET_MAP_GROWSDOWN,
+  MAP_GROWSDOWN, MAP_GROWSDOWN },
+{ TARGET_MAP_DENYWRITE, TARGET_MAP_DENYWRITE,
+  MAP_DENYWRITE, MAP_DENYWRITE },
+{ TARGET_MAP_EXECUTABLE, TARGET_MAP_EXECUTABLE,
+  MAP_EXECUTABLE, MAP_EXECUTABLE },
+{ TARGET_MAP_LOCKED, TARGET_MAP_LOCKED, MAP_LOCKED, MAP_LOCKED },
+{ TARGET_MAP_NORESERVE, TARGET_MAP_NORESERVE,
+  MAP_NORESERVE, MAP_NORESERVE },
+{ TARGET_MAP_HUGETLB, TARGET_MAP_HUGETLB, MAP_HUGETLB, MAP_HUGETLB },
+/* MAP_STACK had been ignored by the kernel for quite some time.
+   Recognize it for the target insofar as we do not want to pass
+   it through to the host.  */
+{ TARGET_MAP_STACK, TARGET_MAP_STACK, 0, 0 },
+{ 0, 0, 0, 0 }
 };
 
 #if defined(TARGET_I386)
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index daa2a57398..bec3680b94 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -1353,6 +1353,8 @@ struct target_winsize {
 #define TARGET_MAP_NORESERVE   0x0400  /* don't check for reservations 
*/
 #define TARGET_MAP_POPULATE0x1 /* populate (prefault) 
pagetables */
 #define TARGET_MAP_NONBLOCK0x2 /* do not block on IO */
+#define TARGET_MAP_STACK0x4 /* ignored */
+#define TARGET_MAP_HUGETLB  0x8 /* create a huge page mapping 
*/
 #elif defined(TARGET_PPC)
 #define TARGET_MAP_FIXED   0x10/* Interpret addr exactly */
 #define TARGET_MAP_ANONYMOUS   0x20/* don't use a file */
@@ -1363,6 +1365,8 @@ struct target_winsize {
 #define TARGET_MAP_NORESERVE   0x0040  /* don't check for reservations 
*/
 #define TARGET_MAP_POPULATE0x8000  /* populate (prefault) 
pagetables */
 #define TARGET_MAP_NONBLOCK0x1 /* do not block on IO */
+#define TARGET_MAP_STACK0x2 /* ignored */
+#define TARGET_MAP_HUGETLB  0x4 /* create a huge page mapping 
*/
 #elif defined(TARGET_ALPHA)
 #define TARGET_MAP_ANONYMOUS   0x10/* don't use a file */
 #define TARGET_MAP_FIXED   0x100   /* Interpret addr exactly */
@@ -1373,6 +1377,8 @@ struct target_winsize {
 #define TARGET_MAP_NORESERVE   0x1 /* no check for reservations */
 #define TARGET_MAP_POPULATE0x2 /* pop (prefault) pagetables */
 #define TARGET_MAP_NONBLOCK0x4 /* do not block on IO */
+#define TARGET_MAP_STACK0x8 /* ignored */
+#define TARGET_MAP_HUGETLB  0x10/* create a huge page mapping 
*/
 #elif defined(TARGET_HPPA)
 #define TARGET_MAP_ANONYMOUS   0x10/* don't use a file */
 #define TARGET_MAP_FIXED   0x04/* Interpret addr exactly */
@@ -1383,6 +1389,8 @@ struct target_winsize {
 #define TARGET_MAP_NORESERVE   0x04000 /* no check for reservations */
 

[Qemu-devel] [PULL 00/15] late linux-user fixes for 2.11

2017-11-20 Thread riku . voipio
From: Riku Voipio <riku.voi...@linaro.org>

The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842:

  Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +)

are available in the git repository at:

  git://git.linaro.org/people/riku.voipio/qemu.git tags/pull-linux-user-20171120

for you to fetch changes up to f516511ea84d8bb3395d6ea95a7c7b80dc2a05e9:

  linux-user: Fix calculation of auxv length (2017-11-20 16:15:41 +0200)


late linux-user fixes for Qemu 2.11


Emilio G. Cota (1):
  linux-user: fix 'finshed' typo in comment

Helge Deller (5):
  linux-user/hppa: Fix TARGET_SA_* defines
  linux-user/hppa: Fix typo for TARGET_NR_epoll_wait
  linux-user/hppa: Fix TARGET_MAP_TYPE
  linux-user/hppa: Fix TARGET_F_RDLCK, TARGET_F_WRLCK, TARGET_F_UNLCK
  linux-user: Handle TARGET_MAP_STACK and TARGET_MAP_HUGETLB

James Clarke (1):
  linux-user/syscall.c: Handle SH4's exceptional alignment for p{read, 
write}64

James Cowgill (1):
  linux-user: return EINVAL from prctl(PR_*_SECCOMP)

Peter Maydell (5):
  linux-user/s390x: Mask si_addr for SIGSEGV
  linux-user/ppc: Report correct fault address for data faults
  linux-user/sparc: Put address for data faults where linux-user expects it
  linux-user: Handle rt_sigaction correctly for SPARC
  linux-user: Fix calculation of auxv length

Richard Henderson (2):
  linux-user: Restrict usage of sa_restorer
  linux-user/hppa: Fix cpu_clone_regs

 linux-user/elfload.c |  11 +++--
 linux-user/hppa/syscall_nr.h |   2 +-
 linux-user/hppa/target_cpu.h |   4 
 linux-user/main.c|   8 +--
 linux-user/signal.c  |   4 ++--
 linux-user/syscall.c | 102 
++--
 linux-user/syscall_defs.h|  40 -
 target/sparc/mmu_helper.c|   8 +++
 8 files changed, 144 insertions(+), 35 deletions(-)




[Qemu-devel] [PULL 09/15] linux-user: fix 'finshed' typo in comment

2017-11-20 Thread riku . voipio
From: "Emilio G. Cota" 

Signed-off-by: Emilio G. Cota 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 9268c3ef69..84e123b67b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6258,7 +6258,7 @@ static void *clone_func(void *arg)
 pthread_mutex_lock(>mutex);
 pthread_cond_broadcast(>cond);
 pthread_mutex_unlock(>mutex);
-/* Wait until the parent has finshed initializing the tls state.  */
+/* Wait until the parent has finished initializing the tls state.  */
 pthread_mutex_lock(_lock);
 pthread_mutex_unlock(_lock);
 cpu_loop(env);
-- 
2.14.2




Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 13:50, Peter Maydell wrote:
> More generally, this commit seems to assume that QEMU always
> does:
>  * set exception_index to something
>  * handle that
>  * clear exception_index to -1
> 
> but it's not clear to me that it's actually always the case
> that it gets cleared back to -1.

After returning from cpu_handle_interrupt, cpu_exec goes to
cpu_handle_exception which does

if (cpu->exception_index >= EXCP_INTERRUPT) {
*ret = cpu->exception_index;
if (*ret == EXCP_DEBUG) {
cpu_handle_debug_exception(cpu);
}
cpu->exception_index = -1;
return true;
} else {
CPUClass *cc = CPU_GET_CLASS(cpu);
qemu_mutex_lock_iothread();
cc->do_interrupt(cpu);
qemu_mutex_unlock_iothread();
cpu->exception_index = -1;
}

return false;

Does ARM have a case where cc->do_interrupt can longjmp back to the
beginning of cpu_handle_exception?  But I still do not understand why
you don't eventually clear exception_index to -1.  Maybe there should be
an assertion for that before and after cpu_handle_interrupt.

Thanks,

Paolo



Re: [Qemu-devel] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-20 Thread Max Reitz
On 2017-11-17 20:04, Eric Blake wrote:
> The contents of a qcow2 bitmap are rounded up to a size that
> matches the number of bits available for the granularity, but
> that granularity differs for 32-bit hosts (our default 64k
> cluster allows for 2M bitmap coverage per 'long') and 64-bit
> hosts (4M bitmap per 'long').  If the image is a multiple of
> 2M but not 4M, then the number of bytes occupied by the array
> of longs in memory differs between architecture, thus
> resulting in different SHA256 hashes.
> 
> Furthermore (but untested by me), if our computation of the
> SHA256 hash is at all endian-dependent because of how we store
> data in memory, that's another variable we'd have to account
> for (ideally, we specified the bitmap stored in qcow2 as
> fixed-endian on disk, because the same qcow2 file must be
> usable across any architecture; but that says nothing about
> how we represent things in memory).  But we already have test
> 165 to validate that bitmaps are stored correctly on disk,
> while this test is merely testing that the bitmap exists.
> 
> So for this test, the easiest solution is to filter out the
> actual hash value.  Broken in commit 4096974e.
> 
> Reported-by: Max Reitz 
> Signed-off-by: Eric Blake 
> ---
>  tests/qemu-iotests/176 | 3 ++-
>  tests/qemu-iotests/176.out | 8 
>  2 files changed, 6 insertions(+), 5 deletions(-)

Now that I've failed to keep my tree empty anyway:

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

(No offense taken if **someone** *cough* *cough* were to take it from me)

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2 for-2.11 0/2] Fix TCG atomic writes to nondirty pages

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 19:08, Peter Maydell wrote:
> To do a write to memory that is marked as notdirty, we need
> to invalidate any TBs we have cached for that memory, and
> update the cpu physical memory dirty flags for VGA and migration.
> The slowpath code in notdirty_mem_write() does all this correctly,
> but the new atomic handling code in atomic_mmu_lookup() doesn't
> do anything at all, it just clears the dirty bit in the TLB.
> 
> The effect of this bug is that if the first write to a notdirty
> page for which we have cached TBs is by a guest atomic access,
> we fail to invalidate the TBs and subsequently will execute
> incorrect code. This can be seen by trying to run 'javac' on AArch64.
> 
> The first patch here refactors notdirty_mem_write() to pull out
> the "correctly handle dirty bit updates" parts of the code into
> two new functions memory_notdirty_write_prepare() and
> memory_notdirty_write_complete(). The second patch then uses
> those functions to fix the atomic helpers.

Reviewed-by: Paolo Bonzini 

Thanks!

Paolo

> Changes v1->v2:
>  * add the 'bool active;' flag to NotDirtyInfo in patch 1
>  * change the comment on NotDirtyInfo in patch 1 to document
>the active flag (and to fix incorrect references to function
>names that I forgot to update when I decided on the names for
>the prepare/complete functions)
>  * in patch 2, don't call prepare unless the TLB was notdirty
>  * in patch 2, use the active flag to track whether we need to
>call complete or not
> 
> thanks
> -- PMM
> 
> Peter Maydell (2):
>   exec.c: Factor out before/after actions for notdirty memory writes
>   accel/tcg: Handle atomic accesses to notdirty memory correctly
> 
>  accel/tcg/atomic_template.h| 12 
>  include/exec/memory-internal.h | 62 
>  accel/tcg/cputlb.c | 38 +++-
>  accel/tcg/user-exec.c  |  1 +
>  exec.c | 65 
> --
>  5 files changed, 144 insertions(+), 34 deletions(-)
> 




Re: [Qemu-devel] [PATCH] block: Close a BlockDriverState completely even when bs->drv is NULL

2017-11-20 Thread Max Reitz
On 2017-11-06 15:53, Alberto Garcia wrote:
> bdrv_close() skips much of its logic when bs->drv is NULL. This is
> fine when we're closing a BlockDriverState that has just been created
> (because e.g the initialization process failed), but it's not enough
> in other cases.
> 
> For example, when a valid qcow2 image is found to be corrupted then
> QEMU marks it as such in the file header and then sets bs->drv to
> NULL in order to make the BlockDriverState unusable. When that BDS is
> later closed then many of its data structures are not freed (leaking
> their memory) and none of its children are detached. This results in
> bdrv_close_all() failing to close all BDSs and making this assertion
> fail when QEMU is being shut down:
> 
>bdrv_close_all: Assertion `QTAILQ_EMPTY(_bdrv_states)' failed.
> 
> This patch makes bdrv_close() do the full uninitialization process
> in all cases. This fixes the problem with corrupted images and still
> works fine with freshly created BDSs.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c| 57 
> +++---
>  tests/qemu-iotests/060 | 13 +++
>  tests/qemu-iotests/060.out | 12 ++
>  3 files changed, 53 insertions(+), 29 deletions(-)

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 1/4] vhost-user: add new vhost user messages to support virtio config space

2017-11-20 Thread Michael S. Tsirkin
On Mon, Nov 20, 2017 at 04:26:31PM +, Stefan Hajnoczi wrote:
> On Fri, Nov 17, 2017 at 04:44:37AM +0800, Changpeng Liu wrote:
> > Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
> > used for live migration of vhost user devices, also vhost user devices
> > can benefit from the messages to get/set virtio config space from/to the
> > I/O target. For the purpose to support virtio config space change,
> > VHOST_USER_SET_CONFIG_FD message is added as the event notifier
> > in case virtio config space change in the I/O target.
> > 
> > Signed-off-by: Changpeng Liu 
> > ---
> >  docs/interop/vhost-user.txt   | 39 
> >  hw/virtio/vhost-user.c| 98 
> > +++
> >  hw/virtio/vhost.c | 63 +
> >  include/hw/virtio/vhost-backend.h |  8 
> >  include/hw/virtio/vhost.h | 16 +++
> >  5 files changed, 224 insertions(+)
> > 
> > diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> > index 954771d..1b98388 100644
> > --- a/docs/interop/vhost-user.txt
> > +++ b/docs/interop/vhost-user.txt
> > @@ -116,6 +116,16 @@ Depending on the request type, payload can be:
> >  - 3: IOTLB invalidate
> >  - 4: IOTLB access fail
> >  
> > + * Virtio device config space
> > +   ---
> > +   | offset | size | payload |
> > +   ---
> > +
> > +   Offset: a 32-bit offset of virtio device's configuration space
> 
> s/of/in the/
> 
> > +   Size: a 32-bit size of configuration space that master wanted to change
> 
> Is this also used for GET_CONFIG?  If yes, I suggest "a 32-bit
> configuration space access size in bytes".
> 
> Please mention that Size must be <= 256 bytes.
> 
> > +   Payload: a 256-bytes array holding the contents of the virtio
> > +   device's configuration space
> 
> What about bytes outside the [offset, offset+size) range?  I guess they
> must be 0 and are ignored by the master/slave.
> 
> Would it be cleaner to make Payload a variable-sized field with Size
> bytes?  That way it's not necessary to transfer 0s and memcpy() a subset
> of the payload array.
> 
> > +
> >  In QEMU the vhost-user message is implemented with the following struct:
> >  
> >  typedef struct VhostUserMsg {
> > @@ -129,6 +139,7 @@ typedef struct VhostUserMsg {
> >  VhostUserMemory memory;
> >  VhostUserLog log;
> >  struct vhost_iotlb_msg iotlb;
> > +VhostUserConfig config;
> >  };
> >  } QEMU_PACKED VhostUserMsg;
> >  
> > @@ -596,6 +607,34 @@ Master message types
> >and expect this message once (per VQ) during device configuration
> >(ie. before the master starts the VQ).
> >  
> > + * VHOST_USER_GET_CONFIG
> > +  Id: 24
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master to fetch the contents of the 
> > virtio
> > +  device configuration space. The vhost-user master may cache the 
> > contents
> > +  to avoid repeated VHOST_USER_GET_CONFIG calls.
> > +
> > +* VHOST_USER_SET_CONFIG
> > +  Id: 25
> > +  Equivalent ioctl: N/A
> > +  Master payload: virtio device config space
> > +
> > +  Submitted by the vhost-user master when the Guest changes the virtio
> > +  device configuration space and also can be used for live migration
> > +  on the destination host.
> 
> There might be security issues if the vhost slave cannot tell whether
> SET_CONFIG is coming from the guest driver or from the master process
> (live migration).  Typically certain fields are read-only for the guest
> driver.  Maybe those fields need to be set by the master after live
> migration.
> 
> One way to solve this is adding a flags field to the message.  A special
> flag can be used for live migration so the slave knows that this
> SET_CONFIG message is allowed to write to read-only fields.
> 
> It's also worth documenting that slaves MUST NOT accept SET_CONFIG for
> read-only configuration space fields unless the live migration bit is
> set.  Hopefully this will remind implementors to think through the
> security issues.

Live migrations is supposed to be migrating guest writeable state too.
If you mean migrating RO fields like size, then
I don't think it's a good idea to reuse SET_CONFIG for that.
SET_CONFIG should obey exactly the virtio semantics.

And I agree, it should say that slave must treat it as a write,
and get config as a read according to virtio semantics.

If someone needs to pass configuration from qemu to
slave, let's add specific messages with precisely defined semantics.

Which reminds me.

virtio 1 changed endian-ness for config.

I think we should specify it's all virtio 1 format, and
just disallow vhost blk for virtio 0 guests.

-- 
MST



Re: [Qemu-devel] [Qemu-block] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-20 Thread Eric Blake
On 11/17/2017 04:59 PM, John Snow wrote:
>>> So for this test, the easiest solution is to filter out the
>>> actual hash value.  Broken in commit 4096974e.
>>
>> Of course, if Kevin sends a v2 pull, it's probably better to just squash
>> this in to my original testsuite change (since a v2 would probably
>> invalidate that commit id).
>>
> 
> Whichever way you go,

Commit 4096974e has already landed in master, so this commit message is
correct as-is and belongs in the next pull request (whether for -rc2 or
-rc3 depends on timing).

> 
> Reviewed-by: John Snow 
> 
> I don't know how important it is to nail this down, but the purpose of
> this command is primarily for sanity testing and not necessarily between
> architectures, so this might just be a limitation to document.
> 
> Also, don't use this for anything except testing.

Indeed, the name of x-debug-block-dirty-bitmap-sha256 already implies
its limited portability; but it also gives us free reign to change it's
behavior if we find something we like better for the same ease of bitmap
testing/debugging.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2] tests/bios-tables-test: Fix endianess problems when passing data to iasl

2017-11-20 Thread Michael S. Tsirkin
On Mon, Nov 20, 2017 at 05:55:22PM +0100, Igor Mammedov wrote:
> On Thu, 16 Nov 2017 13:17:02 +0100
> Thomas Huth  wrote:
> 
> > The bios-tables-test was writing out files that we pass to iasl in
> > with the wrong endianness in the header when running on a big endian
> > host. So instead of storing mixed endian information in our structures,
> > let's keep everything in little endian and byte-swap it only when we
> > need a value in the code.
> > 
> > Reported-by: Daniel P. Berrange 
> > Buglink: https://bugs.launchpad.net/qemu/+bug/1724570
> > Suggested-by: Michael S. Tsirkin 
> > Signed-off-by: Thomas Huth 
> > ---
> >  v2: Fixed vmgenid-test which was accidentially broken in v1
> > 
> >  tests/acpi-utils.h   | 27 +--
> >  tests/bios-tables-test.c | 42 ++
> >  tests/vmgenid-test.c | 22 --
> >  3 files changed, 39 insertions(+), 52 deletions(-)
> > 
> > diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
> > index f8d8723..d5ca5b6 100644
> > --- a/tests/acpi-utils.h
> > +++ b/tests/acpi-utils.h
> > @@ -28,24 +28,9 @@ typedef struct {
> >  bool tmp_files_retain;   /* do not delete the temp asl/aml */
> >  } AcpiSdtTable;
> >  
> > -#define ACPI_READ_FIELD(field, addr)   \
> > -do {   \
> > -switch (sizeof(field)) {   \
> > -case 1:\
> > -field = readb(addr);   \
> > -break; \
> > -case 2:\
> > -field = readw(addr);   \
> > -break; \
> > -case 4:\
> > -field = readl(addr);   \
> > -break; \
> > -case 8:\
> > -field = readq(addr);   \
> > -break; \
> > -default:   \
> > -g_assert(false);   \
> > -}  \
> probably it's been discussed but, why not do
>  leXX_to_cpu()
> here, instead of making each place that access read field
> to do leXX_to_cpu() manually.?
> 
> Beside of keeping access to structure in natural host order,
> it should also be less error-prone as field users don't
> have to worry about endianness.

Yes, I suggested that.
The issue is that we don't byte-swap all of the tables
(we can't, that would require a full ACPI parser).
So when byte-swapping we end up with a mixed host/LE
structures.

Keeping it all LE seems cleaner.


-- 
MST



Re: [Qemu-devel] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-20 Thread Max Reitz
On 2017-11-17 20:04, Eric Blake wrote:
> The contents of a qcow2 bitmap are rounded up to a size that
> matches the number of bits available for the granularity, but
> that granularity differs for 32-bit hosts (our default 64k
> cluster allows for 2M bitmap coverage per 'long') and 64-bit
> hosts (4M bitmap per 'long').  If the image is a multiple of
> 2M but not 4M, then the number of bytes occupied by the array
> of longs in memory differs between architecture, thus
> resulting in different SHA256 hashes.
> 
> Furthermore (but untested by me), if our computation of the
> SHA256 hash is at all endian-dependent because of how we store
> data in memory, that's another variable we'd have to account
> for (ideally, we specified the bitmap stored in qcow2 as
> fixed-endian on disk, because the same qcow2 file must be
> usable across any architecture; but that says nothing about
> how we represent things in memory).  But we already have test
> 165 to validate that bitmaps are stored correctly on disk,
> while this test is merely testing that the bitmap exists.
> 
> So for this test, the easiest solution is to filter out the
> actual hash value.  Broken in commit 4096974e.
> 
> Reported-by: Max Reitz 
> Signed-off-by: Eric Blake 
> ---
>  tests/qemu-iotests/176 | 3 ++-
>  tests/qemu-iotests/176.out | 8 
>  2 files changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v7 for-2.12 20/25] block: Generically refresh runtime options

2017-11-20 Thread Max Reitz
Instead of having every block driver which implements
bdrv_refresh_filename() copy all of the significant runtime options over
to bs->full_open_options, implement this process generically in
bdrv_refresh_filename().

This patch only adds this new generic implementation, it does not remove
the old functionality. This is done in a follow-up patch.

With this patch, some superfluous information (that should never have
been there) may be removed from some JSON filenames, as can be seen in
the change to iotest 110's reference output.  In case of 191, backing
nodes that have not been overridden are now removed from the filename.

Signed-off-by: Max Reitz 
---
 block.c| 116 -
 tests/qemu-iotests/110.out |   2 +-
 tests/qemu-iotests/191.out |  24 +-
 3 files changed, 128 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 2e59e77f9d..64ef30bbac 100644
--- a/block.c
+++ b/block.c
@@ -4944,6 +4944,92 @@ out:
 return to_replace_bs;
 }
 
+/**
+ * Iterates through the list of runtime option keys that are said to be
+ * "significant" for a BDS. An option is called "significant" if it changes a
+ * BDS's data. For example, the null block driver's "size" and "read-zeroes"
+ * options are significant, but its "latency-ns" option is not.
+ *
+ * If a key returned by this function ends with a dot, all options starting 
with
+ * that prefix are significant.
+ */
+static const char *const *significant_options(BlockDriverState *bs,
+  const char *const *curopt)
+{
+static const char *const global_options[] = {
+"driver", "filename", "base-directory", NULL
+};
+
+if (!curopt) {
+return _options[0];
+}
+
+curopt++;
+if (curopt == _options[ARRAY_SIZE(global_options) - 1] && bs->drv) {
+curopt = bs->drv->sgfnt_runtime_opts;
+}
+
+return (curopt && *curopt) ? curopt : NULL;
+}
+
+/**
+ * Copies all significant runtime options from bs->options to the given QDict.
+ * The set of significant option keys is determined by invoking
+ * significant_options().
+ *
+ * Returns true iff any significant option was present in bs->options (and thus
+ * copied to the target QDict) with the exception of "filename" and "driver".
+ * The caller is expected to use this value to decide whether the existence of
+ * significant options prevents the generation of a plain filename.
+ */
+static bool append_significant_runtime_options(QDict *d, BlockDriverState *bs)
+{
+bool found_any = false;
+const char *const *option_name = NULL;
+
+if (!bs->drv) {
+return false;
+}
+
+while ((option_name = significant_options(bs, option_name))) {
+bool option_given = false;
+
+assert(strlen(*option_name) > 0);
+if ((*option_name)[strlen(*option_name) - 1] != '.') {
+QObject *entry = qdict_get(bs->options, *option_name);
+if (!entry) {
+continue;
+}
+
+qobject_incref(entry);
+qdict_put_obj(d, *option_name, entry);
+option_given = true;
+} else {
+const QDictEntry *entry;
+for (entry = qdict_first(bs->options); entry;
+ entry = qdict_next(bs->options, entry))
+{
+if (strstart(qdict_entry_key(entry), *option_name, NULL)) {
+qobject_incref(qdict_entry_value(entry));
+qdict_put_obj(d, qdict_entry_key(entry),
+  qdict_entry_value(entry));
+option_given = true;
+}
+}
+}
+
+/* While "driver" and "filename" need to be included in a JSON 
filename,
+ * their existence does not prohibit generation of a plain filename. */
+if (!found_any && option_given &&
+strcmp(*option_name, "driver") && strcmp(*option_name, "filename"))
+{
+found_any = true;
+}
+}
+
+return found_any;
+}
+
 static bool append_open_options(QDict *d, BlockDriverState *bs)
 {
 const QDictEntry *entry;
@@ -5098,9 +5184,37 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 bs->full_open_options = opts;
 }
 
+/* Gather the options QDict */
+opts = qdict_new();
+append_significant_runtime_options(opts, bs);
+
+if (drv->bdrv_gather_child_options) {
+/* Some block drivers may not want to present all of their children's
+ * options, or name them differently from BdrvChild.name */
+drv->bdrv_gather_child_options(bs, opts);
+} else {
+QLIST_FOREACH(child, >children, next) {
+if (child->role == _backing && !bs->backing_overridden) {
+/* We can skip the backing BDS if it has not been overridden */
+continue;
+}
+
+QINCREF(child->bs->full_open_options);
+

[Qemu-devel] [PATCH v7 for-2.12 19/25] block: Add BlockDriver.bdrv_gather_child_options

2017-11-20 Thread Max Reitz
Some follow-up patches will rework the way bs->full_open_options is
refreshed in bdrv_refresh_filename(). The new implementation will remove
the need for the block drivers' bdrv_refresh_filename() implementations
to set bs->full_open_options; instead, it will be generic and use static
information from each block driver.

However, by implementing bdrv_gather_child_options(), block drivers will
still be able to override the way the full_open_options of their
children are incorporated into their own.

We need to implement this function for VMDK because we have to prevent
the generic implementation from gathering the options of all children:
It is not possible to specify options for the extents through the
runtime options.

For quorum, the child names that would be used by the generic
implementation and the ones that we actually (currently) want to use
differ. See quorum_gather_child_options() for more information.

Note that both of these are cases which are not ideal: In case of VMDK
it would probably be nice to be able to specify options for all extents.
In case of quorum, the current runtime option structure is simply broken
and needs to be fixed (but that is left for another patch).

Signed-off-by: Max Reitz 
---
 include/block/block_int.h | 17 +
 block/quorum.c| 38 ++
 block/vmdk.c  | 13 +
 3 files changed, 68 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index d99ded8891..f81516b615 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -136,6 +136,23 @@ struct BlockDriver {
 void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);
 
 /*
+ * Gathers the open options for all children into @target.
+ * A simple format driver (without backing file support) might
+ * implement this function like this:
+ *
+ * QINCREF(bs->file->bs->full_open_options);
+ * qdict_put(target, "file", bs->file->bs->full_open_options);
+ *
+ * If not specified, the generic implementation will simply put
+ * all children's options under their respective name.
+ *
+ * Note that ideally this function would not be needed.  Every
+ * block driver which implements it is probably doing something
+ * shady regarding its runtime option structure.
+ */
+void (*bdrv_gather_child_options)(BlockDriverState *bs, QDict *target);
+
+/*
  * Returns an allocated string which is the directory name of this BDS: It
  * will be used to make relative filenames absolute by prepending this
  * function's return value to them.
diff --git a/block/quorum.c b/block/quorum.c
index 4b38201aa2..9d680addc6 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1095,6 +1095,43 @@ static void quorum_refresh_filename(BlockDriverState 
*bs, QDict *options)
 bs->full_open_options = opts;
 }
 
+static void quorum_gather_child_options(BlockDriverState *bs, QDict *target)
+{
+BDRVQuorumState *s = bs->opaque;
+QList *children_list;
+int i;
+
+/* The generic implementation for gathering child options in
+ * bdrv_refresh_filename() would use the names of the children
+ * as specified for bdrv_open_child() or bdrv_attach_child(),
+ * which is "children.%u" with %u being a value
+ * (s->next_child_index) that is incremented each time a new child
+ * is added (and never decremented).  Since children can be
+ * deleted at runtime, there may be gaps in that enumeration.
+ * When creating a new quorum BDS and specifying the children for
+ * it through runtime options, the enumeration used there may not
+ * have any gaps, though.
+ *
+ * Therefore, we have to create a new gap-less enumeration here
+ * (which we can achieve by simply putting all of the children's
+ * full_open_options into a QList).
+ *
+ * XXX: Note that there are issues with the current child option
+ *  structure quorum uses (such as the fact that children do
+ *  not really have unique permanent names).  Therefore, this
+ *  is going to have to change in the future and ideally we
+ *  want quorum to be covered by the generic implementation.
+ */
+
+children_list = qlist_new();
+qdict_put(target, "children", children_list);
+
+for (i = 0; i < s->num_children; i++) {
+QINCREF(s->children[i]->bs->full_open_options);
+qlist_append(children_list, s->children[i]->bs->full_open_options);
+}
+}
+
 static char *quorum_dirname(BlockDriverState *bs, Error **errp)
 {
 /* In general, there are multiple BDSs with different dirnames below this
@@ -1123,6 +1160,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_file_open = quorum_open,
 .bdrv_close = quorum_close,
 .bdrv_refresh_filename  = quorum_refresh_filename,
+.bdrv_gather_child_options 

[Qemu-devel] [PATCH v7 for-2.12 22/25] block: Do not copy exact_filename from format file

2017-11-20 Thread Max Reitz
If the a format BDS's file BDS is in turn a format BDS, we cannot simply
use the same filename, because when opening a BDS tree based on a
filename alone, qemu will create only one format node on top of one
protocol node (disregarding a potential backing file).

Signed-off-by: Max Reitz 
---
 block.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 275cae460a..526ce43e9b 100644
--- a/block.c
+++ b/block.c
@@ -5104,9 +5104,20 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 
 bs->exact_filename[0] = '\0';
 
-/* If no specific options have been given for this BDS, the filename of
- * the underlying file should suffice for this one as well */
-if (bs->file->bs->exact_filename[0] && !generate_json_filename) {
+/* We can use the underlying file's filename if:
+ * - it has a filename,
+ * - the file is a protocol BDS, and
+ * - opening that file (as this BDS's format) will automatically create
+ *   the BDS tree we have right now, that is:
+ *   - the user did not significantly change this BDS's behavior with
+ * some explicit options
+ *   - no non-file child of this BDS has been overridden by the user
+ *   Both of these conditions are represented by 
generate_json_filename.
+ */
+if (bs->file->bs->exact_filename[0] &&
+bs->file->bs->drv->bdrv_file_open &&
+!generate_json_filename)
+{
 strcpy(bs->exact_filename, bs->file->bs->exact_filename);
 }
 }
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 21/25] block: Purify .bdrv_refresh_filename()

2017-11-20 Thread Max Reitz
Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
Now that we have generic code in the central bdrv_refresh_filename() for
creating BDS.full_open_options, we can drop the latter part from all
BlockDriver.bdrv_refresh_filename() implementations.

This also means that we can drop all of the existing default code for
this from the global bdrv_refresh_filename() itself.

Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
after having set BDS.full_open_options, because the block driver's
implementation should now be allowed to depend on BDS.full_open_options
being set correctly.

Finally, with this patch we can drop the @options parameter from
BlockDriver.bdrv_refresh_filename(); also, add a comment on this
function's purpose in block/block_int.h while touching its interface.

Signed-off-by: Max Reitz 
---
 include/block/block_int.h |   6 +-
 block.c   | 145 +++---
 block/blkdebug.c  |  52 ++---
 block/blkverify.c |  16 +
 block/commit.c|   2 +-
 block/mirror.c|   2 +-
 block/nbd.c   |  23 +---
 block/nfs.c   |  36 +---
 block/null.c  |  23 +---
 block/quorum.c|  30 --
 10 files changed, 63 insertions(+), 272 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index f81516b615..b67f3af599 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -133,7 +133,11 @@ struct BlockDriver {
 int (*bdrv_create)(const char *filename, QemuOpts *opts, Error **errp);
 int (*bdrv_make_empty)(BlockDriverState *bs);
 
-void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);
+/*
+ * Refreshes the bs->exact_filename field. If that is impossible,
+ * bs->exact_filename has to be left empty.
+ */
+void (*bdrv_refresh_filename)(BlockDriverState *bs);
 
 /*
  * Gathers the open options for all children into @target.
diff --git a/block.c b/block.c
index 64ef30bbac..275cae460a 100644
--- a/block.c
+++ b/block.c
@@ -5030,47 +5030,6 @@ static bool append_significant_runtime_options(QDict *d, 
BlockDriverState *bs)
 return found_any;
 }
 
-static bool append_open_options(QDict *d, BlockDriverState *bs)
-{
-const QDictEntry *entry;
-QemuOptDesc *desc;
-BdrvChild *child;
-bool found_any = false;
-const char *p;
-
-for (entry = qdict_first(bs->options); entry;
- entry = qdict_next(bs->options, entry))
-{
-/* Exclude options for children */
-QLIST_FOREACH(child, >children, next) {
-if (strstart(qdict_entry_key(entry), child->name, )
-&& (!*p || *p == '.'))
-{
-break;
-}
-}
-if (child) {
-continue;
-}
-
-/* And exclude all non-driver-specific options */
-for (desc = bdrv_runtime_opts.desc; desc->name; desc++) {
-if (!strcmp(qdict_entry_key(entry), desc->name)) {
-break;
-}
-}
-if (desc->name) {
-continue;
-}
-
-qobject_incref(qdict_entry_value(entry));
-qdict_put_obj(d, qdict_entry_key(entry), qdict_entry_value(entry));
-found_any = true;
-}
-
-return found_any;
-}
-
 /* Updates the following BDS fields:
  *  - exact_filename: A filename which may be used for opening a block device
  *which (mostly) equals the given BDS (even without any
@@ -5088,6 +5047,8 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 BlockDriver *drv = bs->drv;
 BdrvChild *child;
 QDict *opts;
+bool generate_json_filename; /* Whether our default implementation should
+fill exact_filename (false) or not (true) 
*/
 
 if (!drv) {
 return;
@@ -5103,90 +5064,10 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 }
 }
 
-if (drv->bdrv_refresh_filename) {
-/* Obsolete information is of no use here, so drop the old file name
- * information before refreshing it */
-bs->exact_filename[0] = '\0';
-if (bs->full_open_options) {
-QDECREF(bs->full_open_options);
-bs->full_open_options = NULL;
-}
-
-opts = qdict_new();
-append_open_options(opts, bs);
-drv->bdrv_refresh_filename(bs, opts);
-QDECREF(opts);
-} else if (bs->file) {
-/* Try to reconstruct valid information from the underlying file */
-bool has_open_options;
-
-bs->exact_filename[0] = '\0';
-if (bs->full_open_options) {
-QDECREF(bs->full_open_options);
-bs->full_open_options = NULL;
-}
-
-opts = qdict_new();
-has_open_options = append_open_options(opts, bs);
- 

[Qemu-devel] [PATCH v7 for-2.12 12/25] blkverify: Make bdrv_dirname() return NULL

2017-11-20 Thread Max Reitz
blkverify's BDSs have a file BDS, but we do not want this to be
preferred over the raw node. There is no way to decide between the two
(and not really a reason to, either), so just return NULL in blkverify's
implementation of bdrv_dirname().

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 block/blkverify.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/blkverify.c b/block/blkverify.c
index b2ed8cd70d..d5233eeaf9 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -309,6 +309,15 @@ static void blkverify_refresh_filename(BlockDriverState 
*bs, QDict *options)
 }
 }
 
+static char *blkverify_dirname(BlockDriverState *bs, Error **errp)
+{
+/* In general, there are two BDSs with different dirnames below this one;
+ * so there is no unique dirname we could return (unless both are equal by
+ * chance). Therefore, to be consistent, just always return NULL. */
+error_setg(errp, "Cannot generate a base directory for blkverify nodes");
+return NULL;
+}
+
 static BlockDriver bdrv_blkverify = {
 .format_name  = "blkverify",
 .protocol_name= "blkverify",
@@ -320,6 +329,7 @@ static BlockDriver bdrv_blkverify = {
 .bdrv_child_perm  = bdrv_filter_default_perms,
 .bdrv_getlength   = blkverify_getlength,
 .bdrv_refresh_filename= blkverify_refresh_filename,
+.bdrv_dirname = blkverify_dirname,
 
 .bdrv_co_preadv   = blkverify_co_preadv,
 .bdrv_co_pwritev  = blkverify_co_pwritev,
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 15/25] block/nfs: Implement bdrv_dirname()

2017-11-20 Thread Max Reitz
While the basic idea is obvious and could be handled by the default
bdrv_dirname() implementation, we cannot generate a directory name if
the gid or uid are set, so we have to explicitly return NULL in those
cases.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 block/nfs.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/block/nfs.c b/block/nfs.c
index 337fcd9c84..0152a9bd32 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -875,6 +875,19 @@ static void nfs_refresh_filename(BlockDriverState *bs, 
QDict *options)
 bs->full_open_options = opts;
 }
 
+static char *nfs_dirname(BlockDriverState *bs, Error **errp)
+{
+NFSClient *client = bs->opaque;
+
+if (client->uid || client->gid) {
+error_setg(errp, "Cannot generate a base directory for NBD node '%s'",
+   bs->filename);
+return NULL;
+}
+
+return g_strdup_printf("nfs://%s%s/", client->server->host, client->path);
+}
+
 #ifdef LIBNFS_FEATURE_PAGECACHE
 static void nfs_invalidate_cache(BlockDriverState *bs,
  Error **errp)
@@ -908,6 +921,7 @@ static BlockDriver bdrv_nfs = {
 .bdrv_detach_aio_context= nfs_detach_aio_context,
 .bdrv_attach_aio_context= nfs_attach_aio_context,
 .bdrv_refresh_filename  = nfs_refresh_filename,
+.bdrv_dirname   = nfs_dirname,
 
 #ifdef LIBNFS_FEATURE_PAGECACHE
 .bdrv_invalidate_cache  = nfs_invalidate_cache,
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 25/25] block/null: Generate filename even with latency-ns

2017-11-20 Thread Max Reitz
While we cannot represent the latency-ns option in a filename, it is not
a significant option so not being able to should not stop us from
generating a filename nonetheless.

Signed-off-by: Max Reitz 
---
 block/null.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/null.c b/block/null.c
index 65f5d681c0..ca0aa3d528 100644
--- a/block/null.c
+++ b/block/null.c
@@ -250,7 +250,8 @@ static void null_refresh_filename(BlockDriverState *bs)
 {
 /* These options can be ignored */
 if (strcmp(qdict_entry_key(e), "filename") &&
-strcmp(qdict_entry_key(e), "driver"))
+strcmp(qdict_entry_key(e), "driver") &&
+strcmp(qdict_entry_key(e), NULL_OPT_LATENCY))
 {
 return;
 }
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 18/25] block: Add sgfnt_runtime_opts to BlockDriver

2017-11-20 Thread Max Reitz
This new field can be set by block drivers to list the runtime options
they accept that may influence the contents of the respective BDS. As of
a follow-up patch, this list will be used by the common
bdrv_refresh_filename() implementation to decide which options to put
into BDS.full_open_options (and consequently whether a JSON filename has
to be created), thus freeing the drivers of having to implement that
logic themselves.

Additionally, this patch adds the field to all of the block drivers that
need it and sets it accordingly.

Signed-off-by: Max Reitz 
---
 include/block/block_int.h |  7 +++
 block/blkdebug.c  | 17 +
 block/crypto.c|  8 
 block/curl.c  | 21 +
 block/gluster.c   | 19 +++
 block/iscsi.c | 18 ++
 block/nbd.c   | 14 ++
 block/nfs.c   | 11 +++
 block/null.c  |  9 +
 block/qcow.c  |  7 +++
 block/qcow2.c |  7 +++
 block/quorum.c| 11 +++
 block/raw-format.c| 10 +-
 block/rbd.c   | 14 ++
 block/replication.c   |  8 
 block/sheepdog.c  | 12 
 block/ssh.c   | 12 
 block/throttle.c  |  7 +++
 block/vpc.c   |  7 +++
 block/vvfat.c | 12 
 block/vxhs.c  | 11 +++
 21 files changed, 241 insertions(+), 1 deletion(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c07882962d..d99ded8891 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -454,6 +454,13 @@ struct BlockDriver {
 Error **errp);
 
 QLIST_ENTRY(BlockDriver) list;
+
+/* Pointer to a NULL-terminated array of names of significant options that
+ * can be specified for bdrv_open(). A significant option is one that
+ * changes the data of a BDS.
+ * If this pointer is NULL, the array is considered empty.
+ * "filename" and "driver" are always considered significant. */
+const char *const *sgfnt_runtime_opts;
 };
 
 typedef struct BlockLimits {
diff --git a/block/blkdebug.c b/block/blkdebug.c
index e21669979d..f93139da58 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -886,6 +886,21 @@ static int blkdebug_reopen_prepare(BDRVReopenState 
*reopen_state,
 return 0;
 }
 
+static const char *const blkdebug_sgfnt_runtime_opts[] = {
+"config",
+"inject-error.",
+"set-state.",
+"suspend.",
+"align",
+"max-transfer",
+"opt-write-zero",
+"max-write-zero",
+"opt-discard",
+"max-discard",
+
+NULL
+};
+
 static BlockDriver bdrv_blkdebug = {
 .format_name= "blkdebug",
 .protocol_name  = "blkdebug",
@@ -915,6 +930,8 @@ static BlockDriver bdrv_blkdebug = {
 = blkdebug_debug_remove_breakpoint,
 .bdrv_debug_resume  = blkdebug_debug_resume,
 .bdrv_debug_is_suspended= blkdebug_debug_is_suspended,
+
+.sgfnt_runtime_opts = blkdebug_sgfnt_runtime_opts,
 };
 
 static void bdrv_blkdebug_init(void)
diff --git a/block/crypto.c b/block/crypto.c
index 60ddf8623e..492e64e922 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -609,6 +609,12 @@ block_crypto_get_specific_info_luks(BlockDriverState *bs)
 return spec_info;
 }
 
+static const char *const block_crypto_sgfnt_runtime_opts[] = {
+BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
+
+NULL
+};
+
 BlockDriver bdrv_crypto_luks = {
 .format_name= "luks",
 .instance_size  = sizeof(BlockCrypto),
@@ -626,6 +632,8 @@ BlockDriver bdrv_crypto_luks = {
 .bdrv_getlength = block_crypto_getlength,
 .bdrv_get_info  = block_crypto_get_info_luks,
 .bdrv_get_specific_info = block_crypto_get_specific_info_luks,
+
+.sgfnt_runtime_opts = block_crypto_sgfnt_runtime_opts,
 };
 
 static void block_crypto_init(void)
diff --git a/block/curl.c b/block/curl.c
index 2a244e2439..11318a9a29 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -957,6 +957,19 @@ static int64_t curl_getlength(BlockDriverState *bs)
 return s->len;
 }
 
+static const char *const curl_sgfnt_runtime_opts[] = {
+CURL_BLOCK_OPT_URL,
+CURL_BLOCK_OPT_SSLVERIFY,
+CURL_BLOCK_OPT_COOKIE,
+CURL_BLOCK_OPT_COOKIE_SECRET,
+CURL_BLOCK_OPT_USERNAME,
+CURL_BLOCK_OPT_PASSWORD_SECRET,
+CURL_BLOCK_OPT_PROXY_USERNAME,
+CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET,
+
+NULL
+};
+
 static BlockDriver bdrv_http = {
 .format_name= "http",
 .protocol_name  = "http",
@@ -971,6 +984,8 @@ static BlockDriver bdrv_http = {
 
 .bdrv_detach_aio_context= curl_detach_aio_context,
 .bdrv_attach_aio_context= curl_attach_aio_context,
+
+.sgfnt_runtime_opts = curl_sgfnt_runtime_opts,
 };
 

[Qemu-devel] [PATCH v7 for-2.12 09/25] block: Add bdrv_make_absolute_filename()

2017-11-20 Thread Max Reitz
This is a general function for making a filename that is relative to a
certain BDS absolute.

It calls bdrv_get_full_backing_filename_from_filename() for now, but
that will be changed in a follow-up patch.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 block.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 804adceafa..a11cebf0cb 100644
--- a/block.c
+++ b/block.c
@@ -313,13 +313,24 @@ char *bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 }
 }
 
-char *bdrv_get_full_backing_filename(BlockDriverState *bs, Error **errp)
+/* If @filename is empty or NULL, this function returns NULL without
+ * setting @errp.  In all other cases, NULL will only be returned with
+ * @errp set.
+ */
+static char *bdrv_make_absolute_filename(BlockDriverState *relative_to,
+ const char *filename, Error **errp)
 {
-char *backed = bs->exact_filename[0] ? bs->exact_filename : bs->filename;
+char *bs_filename = relative_to->exact_filename[0]
+? relative_to->exact_filename
+: relative_to->filename;
+
+return bdrv_get_full_backing_filename_from_filename(bs_filename,
+filename ?: "", errp);
+}
 
-return bdrv_get_full_backing_filename_from_filename(backed,
-bs->backing_file,
-errp);
+char *bdrv_get_full_backing_filename(BlockDriverState *bs, Error **errp)
+{
+return bdrv_make_absolute_filename(bs, bs->backing_file, errp);
 }
 
 void bdrv_register(BlockDriver *bdrv)
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 14/25] block/nbd: Make bdrv_dirname() return NULL

2017-11-20 Thread Max Reitz
The generic bdrv_dirname() implementation would be able to generate some
form of directory name for many NBD nodes, but it would be always wrong.
Therefore, we have to explicitly make it an error (until NBD has some
form of specification for export paths, if it ever will).

Signed-off-by: Max Reitz 
---
 block/nbd.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index a50d24b50a..ed921fa333 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -565,6 +565,16 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
QDict *options)
 bs->full_open_options = opts;
 }
 
+static char *nbd_dirname(BlockDriverState *bs, Error **errp)
+{
+/* The generic bdrv_dirname() implementation is able to work out some
+ * directory name for NBD nodes, but that would be wrong. So far there is 
no
+ * specification for how "export paths" would work, so NBD does not have
+ * directory names. */
+error_setg(errp, "Cannot generate a base directory for NBD nodes");
+return NULL;
+}
+
 static BlockDriver bdrv_nbd = {
 .format_name= "nbd",
 .protocol_name  = "nbd",
@@ -582,6 +592,7 @@ static BlockDriver bdrv_nbd = {
 .bdrv_detach_aio_context= nbd_detach_aio_context,
 .bdrv_attach_aio_context= nbd_attach_aio_context,
 .bdrv_refresh_filename  = nbd_refresh_filename,
+.bdrv_dirname   = nbd_dirname,
 };
 
 static BlockDriver bdrv_nbd_tcp = {
@@ -601,6 +612,7 @@ static BlockDriver bdrv_nbd_tcp = {
 .bdrv_detach_aio_context= nbd_detach_aio_context,
 .bdrv_attach_aio_context= nbd_attach_aio_context,
 .bdrv_refresh_filename  = nbd_refresh_filename,
+.bdrv_dirname   = nbd_dirname,
 };
 
 static BlockDriver bdrv_nbd_unix = {
@@ -620,6 +632,7 @@ static BlockDriver bdrv_nbd_unix = {
 .bdrv_detach_aio_context= nbd_detach_aio_context,
 .bdrv_attach_aio_context= nbd_attach_aio_context,
 .bdrv_refresh_filename  = nbd_refresh_filename,
+.bdrv_dirname   = nbd_dirname,
 };
 
 static void bdrv_nbd_init(void)
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 24/25] block/curl: Implement bdrv_refresh_filename()

2017-11-20 Thread Max Reitz
Signed-off-by: Max Reitz 
---
 block/curl.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/block/curl.c b/block/curl.c
index 11318a9a29..fe57223fda 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -957,6 +957,20 @@ static int64_t curl_getlength(BlockDriverState *bs)
 return s->len;
 }
 
+static void curl_refresh_filename(BlockDriverState *bs)
+{
+BDRVCURLState *s = bs->opaque;
+
+if (!s->sslverify || s->cookie ||
+s->username || s->password || s->proxyusername || s->proxypassword)
+{
+return;
+}
+
+pstrcpy(bs->exact_filename, sizeof(bs->exact_filename), s->url);
+}
+
+
 static const char *const curl_sgfnt_runtime_opts[] = {
 CURL_BLOCK_OPT_URL,
 CURL_BLOCK_OPT_SSLVERIFY,
@@ -985,6 +999,7 @@ static BlockDriver bdrv_http = {
 .bdrv_detach_aio_context= curl_detach_aio_context,
 .bdrv_attach_aio_context= curl_attach_aio_context,
 
+.bdrv_refresh_filename  = curl_refresh_filename,
 .sgfnt_runtime_opts = curl_sgfnt_runtime_opts,
 };
 
@@ -1003,6 +1018,7 @@ static BlockDriver bdrv_https = {
 .bdrv_detach_aio_context= curl_detach_aio_context,
 .bdrv_attach_aio_context= curl_attach_aio_context,
 
+.bdrv_refresh_filename  = curl_refresh_filename,
 .sgfnt_runtime_opts = curl_sgfnt_runtime_opts,
 };
 
@@ -1021,6 +1037,7 @@ static BlockDriver bdrv_ftp = {
 .bdrv_detach_aio_context= curl_detach_aio_context,
 .bdrv_attach_aio_context= curl_attach_aio_context,
 
+.bdrv_refresh_filename  = curl_refresh_filename,
 .sgfnt_runtime_opts = curl_sgfnt_runtime_opts,
 };
 
@@ -1039,6 +1056,7 @@ static BlockDriver bdrv_ftps = {
 .bdrv_detach_aio_context= curl_detach_aio_context,
 .bdrv_attach_aio_context= curl_attach_aio_context,
 
+.bdrv_refresh_filename  = curl_refresh_filename,
 .sgfnt_runtime_opts = curl_sgfnt_runtime_opts,
 };
 
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 10/25] block: Fix bdrv_find_backing_image()

2017-11-20 Thread Max Reitz
bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or
bdrv_make_absolute_filename() instead of trying to do what those
functions do by itself.

path_combine_deprecated() can now be dropped, so let's do that.

Signed-off-by: Max Reitz 
---
 block.c | 30 ++
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/block.c b/block.c
index a11cebf0cb..ecea78dae0 100644
--- a/block.c
+++ b/block.c
@@ -196,15 +196,6 @@ char *path_combine(const char *base_path, const char 
*filename)
 return result;
 }
 
-static void path_combine_deprecated(char *dest, int dest_size,
-const char *base_path,
-const char *filename)
-{
-char *combined = path_combine(base_path, filename);
-pstrcpy(dest, dest_size, combined);
-g_free(combined);
-}
-
 /*
  * Helper function for bdrv_parse_filename() implementations to remove optional
  * protocol prefixes (especially "file:") from a filename and for putting the
@@ -4133,7 +4124,6 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 
 filename_full = g_malloc(PATH_MAX);
 backing_file_full = g_malloc(PATH_MAX);
-filename_tmp  = g_malloc(PATH_MAX);
 
 is_protocol = path_has_protocol(backing_file);
 
@@ -4162,22 +4152,23 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 } else {
 /* If not an absolute filename path, make it relative to the 
current
  * image's filename path */
-path_combine_deprecated(filename_tmp, PATH_MAX, curr_bs->filename,
-backing_file);
-
-/* We are going to compare absolute pathnames */
-if (!realpath(filename_tmp, filename_full)) {
+filename_tmp = bdrv_make_absolute_filename(curr_bs, backing_file,
+   NULL);
+/* We are going to compare canonicalized absolute pathnames */
+if (!filename_tmp || !realpath(filename_tmp, filename_full)) {
+g_free(filename_tmp);
 continue;
 }
+g_free(filename_tmp);
 
 /* We need to make sure the backing filename we are comparing 
against
  * is relative to the current image filename (or absolute) */
-path_combine_deprecated(filename_tmp, PATH_MAX, curr_bs->filename,
-curr_bs->backing_file);
-
-if (!realpath(filename_tmp, backing_file_full)) {
+filename_tmp = bdrv_get_full_backing_filename(curr_bs, NULL);
+if (!filename_tmp || !realpath(filename_tmp, backing_file_full)) {
+g_free(filename_tmp);
 continue;
 }
+g_free(filename_tmp);
 
 if (strcmp(backing_file_full, filename_full) == 0) {
 retval = curr_bs->backing->bs;
@@ -4188,7 +4179,6 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 
 g_free(filename_full);
 g_free(backing_file_full);
-g_free(filename_tmp);
 return retval;
 }
 
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 13/25] quorum: Make bdrv_dirname() return NULL

2017-11-20 Thread Max Reitz
While the common implementation for bdrv_dirname() should return NULL
for quorum BDSs already (because they do not have a file node and their
exact_filename field should be empty), there is no reason not to make
that explicit.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 block/quorum.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index 2f1a628449..e5a844335e 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1095,6 +1095,16 @@ static void quorum_refresh_filename(BlockDriverState 
*bs, QDict *options)
 bs->full_open_options = opts;
 }
 
+static char *quorum_dirname(BlockDriverState *bs, Error **errp)
+{
+/* In general, there are multiple BDSs with different dirnames below this
+ * one; so there is no unique dirname we could return (unless all are equal
+ * by chance, or there is only one). Therefore, to be consistent, just
+ * always return NULL. */
+error_setg(errp, "Cannot generate a base directory for quorum nodes");
+return NULL;
+}
+
 static BlockDriver bdrv_quorum = {
 .format_name= "quorum",
 .protocol_name  = "quorum",
@@ -1104,6 +1114,7 @@ static BlockDriver bdrv_quorum = {
 .bdrv_file_open = quorum_open,
 .bdrv_close = quorum_close,
 .bdrv_refresh_filename  = quorum_refresh_filename,
+.bdrv_dirname   = quorum_dirname,
 
 .bdrv_co_flush_to_disk  = quorum_co_flush,
 
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 05/25] block: Respect backing bs in bdrv_refresh_filename

2017-11-20 Thread Max Reitz
Basically, bdrv_refresh_filename() should respect all children of a
BlockDriverState. However, generally those children are driver-specific,
so this function cannot handle the general case. On the other hand,
there are only few drivers which use other children than @file and
@backing (that being vmdk, quorum, and blkverify).

Most block drivers only use @file and/or @backing (if they use any
children at all). Both can be implemented directly in
bdrv_refresh_filename.

The user overriding the file's filename is already handled, however, the
user overriding the backing file is not. If this is done, opening the
BDS with the plain filename of its file will not be correct, so we may
not set bs->exact_filename in that case.

iotests 051 and 191 contain test cases for overwriting the backing file,
and so their output changes with this patch applied (which I consider a
good thing). Note that in the case of 191, the implicitly opened
(non-overridden) base file is included in the json:{} filename as well
-- this will be remedied by a later patch.

Signed-off-by: Max Reitz 
---
 block.c   | 12 +++-
 tests/qemu-iotests/051.out|  8 
 tests/qemu-iotests/051.pc.out |  8 
 tests/qemu-iotests/191.out| 24 
 4 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/block.c b/block.c
index eb67dfbcc0..2dc06cb9cb 100644
--- a/block.c
+++ b/block.c
@@ -5016,6 +5016,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 
 opts = qdict_new();
 has_open_options = append_open_options(opts, bs);
+has_open_options |= bs->backing_overridden;
 
 /* If no specific options have been given for this BDS, the filename of
  * the underlying file should suffice for this one as well */
@@ -5027,11 +5028,20 @@ void bdrv_refresh_filename(BlockDriverState *bs)
  * file BDS. The full options QDict of that file BDS should somehow
  * contain a representation of the filename, therefore the following
  * suffices without querying the (exact_)filename of this BDS. */
-if (bs->file->bs->full_open_options) {
+if (bs->file->bs->full_open_options &&
+(!bs->backing || bs->backing->bs->full_open_options))
+{
 qdict_put_str(opts, "driver", drv->format_name);
 QINCREF(bs->file->bs->full_open_options);
 qdict_put(opts, "file", bs->file->bs->full_open_options);
 
+if (bs->backing) {
+QINCREF(bs->backing->bs->full_open_options);
+qdict_put(opts, "backing", bs->backing->bs->full_open_options);
+} else if (bs->backing_overridden && !bs->backing) {
+qdict_put(opts, "backing", qstring_new());
+}
+
 bs->full_open_options = opts;
 } else {
 QDECREF(opts);
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index e3c6eaba57..50d5cd07c8 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -59,7 +59,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 Testing: -drive 
file=TEST_DIR/t.qcow2,driver=qcow2,backing.file.filename=TEST_DIR/t.qcow2.orig,if=none,id=drive0
 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
-drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
+drive0 (NODE_NAME): json:{"backing": {"driver": "qcow2", "file": {"driver": 
"file", "filename": "TEST_DIR/t.qcow2.orig"}}, "driver": "qcow2", "file": 
{"driver": "file", "filename": "TEST_DIR/t.qcow2"}} (qcow2)
 Removable device: not locked, tray closed
 Cache mode:   writeback
 Backing file: TEST_DIR/t.qcow2.orig (chain depth: 1)
@@ -148,7 +148,7 @@ QEMU_PROG: -drive driver=null-co,cache=invalid_value: 
invalid cache option
 Testing: -drive 
file=TEST_DIR/t.qcow2,cache=writeback,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0
 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
-drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
+drive0 (NODE_NAME): json:{"backing": {"driver": "qcow2", "file": {"driver": 
"file", "filename": "TEST_DIR/t.qcow2.base"}}, "driver": "qcow2", "file": 
{"driver": "file", "filename": "TEST_DIR/t.qcow2"}} (qcow2)
 Removable device: not locked, tray closed
 Cache mode:   writeback
 Backing file: TEST_DIR/t.qcow2.base (chain depth: 1)
@@ -168,7 +168,7 @@ backing-file: TEST_DIR/t.qcow2.base (file, read-only)
 Testing: -drive 
file=TEST_DIR/t.qcow2,cache=writethrough,backing.file.filename=TEST_DIR/t.qcow2.base,backing.cache.no-flush=on,backing.node-name=backing,backing.file.node-name=backing-file,file.node-name=file,if=none,id=drive0
 -nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) info block
-drive0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
+drive0 

[Qemu-devel] [PATCH v7 for-2.12 08/25] block: bdrv_get_full_backing_filename's ret. val.

2017-11-20 Thread Max Reitz
Make bdrv_get_full_backing_filename() return an allocated string instead
of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 include/block/block.h |  3 +--
 block.c   | 47 +--
 block/qapi.c  | 12 ++--
 3 files changed, 20 insertions(+), 42 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 46c6190414..3e2ea6c879 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -481,8 +481,7 @@ void bdrv_round_to_clusters(BlockDriverState *bs,
 const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
 void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
-void bdrv_get_full_backing_filename(BlockDriverState *bs,
-char *dest, size_t sz, Error **errp);
+char *bdrv_get_full_backing_filename(BlockDriverState *bs, Error **errp);
 char *bdrv_get_full_backing_filename_from_filename(const char *backed,
const char *backing,
Error **errp);
diff --git a/block.c b/block.c
index f8b3edd5bb..804adceafa 100644
--- a/block.c
+++ b/block.c
@@ -313,24 +313,13 @@ char *bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 }
 }
 
-void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz,
-Error **errp)
+char *bdrv_get_full_backing_filename(BlockDriverState *bs, Error **errp)
 {
 char *backed = bs->exact_filename[0] ? bs->exact_filename : bs->filename;
-char *full_name;
-Error *local_error = NULL;
 
-full_name = bdrv_get_full_backing_filename_from_filename(backed,
- bs->backing_file,
- _error);
-if (full_name) {
-pstrcpy(dest, sz, full_name);
-g_free(full_name);
-} else if (local_error) {
-error_propagate(errp, local_error);
-} else if (sz > 0) {
-*dest = '\0';
-}
+return bdrv_get_full_backing_filename_from_filename(backed,
+bs->backing_file,
+errp);
 }
 
 void bdrv_register(BlockDriver *bdrv)
@@ -2226,7 +2215,7 @@ out:
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
const char *bdref_key, Error **errp)
 {
-char *backing_filename = g_malloc0(PATH_MAX);
+char *backing_filename = NULL;
 char *bdref_key_dot;
 const char *reference = NULL;
 int ret = 0;
@@ -2260,7 +2249,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
  */
 reference = qdict_get_try_str(parent_options, bdref_key);
 if (reference || qdict_haskey(options, "file.filename")) {
-backing_filename[0] = '\0';
+/* keep backing_filename NULL */
 
 /* FIXME: Should also be set to true if @options contains other runtime
  *options which control the data that is read from the backing
@@ -2270,8 +2259,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 QDECREF(options);
 goto free_exit;
 } else {
-bdrv_get_full_backing_filename(bs, backing_filename, PATH_MAX,
-   _err);
+backing_filename = bdrv_get_full_backing_filename(bs, _err);
 if (local_err) {
 ret = -EINVAL;
 error_propagate(errp, local_err);
@@ -2292,9 +2280,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 qdict_put_str(options, "driver", bs->backing_format);
 }
 
-backing_hd = bdrv_open_inherit(*backing_filename ? backing_filename : NULL,
-   reference, options, 0, bs, _backing,
-   errp);
+backing_hd = bdrv_open_inherit(backing_filename, reference, options, 0, bs,
+   _backing, errp);
 if (!backing_hd) {
 bs->open_flags |= BDRV_O_NO_BACKING;
 error_prepend(errp, "Could not open backing file: ");
@@ -4128,7 +4115,6 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 int is_protocol = 0;
 BlockDriverState *curr_bs = NULL;
 BlockDriverState *retval = NULL;
-Error *local_error = NULL;
 
 if (!bs || !bs->drv || !backing_file) {
 return NULL;
@@ -4145,21 +4131,22 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 /* If either of the filename paths is actually a protocol, then
  * compare unmodified paths; otherwise make paths relative */
 if (is_protocol || path_has_protocol(curr_bs->backing_file)) {
+char 

[Qemu-devel] [PATCH v7 for-2.12 23/25] block: Fix FIXME from "Add BDS.backing_overridden"

2017-11-20 Thread Max Reitz
Said commit introduced a FIXME stating that bdrv_open_backing_file()
should set bs->backing_overridden to true not only if the file.filename
option was set, but if the "options" QDict contained any option that is
significant for any node in the BDS tree emerging from the backing BDS.
This behavior is implemented by this patch.

Signed-off-by: Max Reitz 
---
 block.c | 81 -
 1 file changed, 75 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 526ce43e9b..40007544e5 100644
--- a/block.c
+++ b/block.c
@@ -73,6 +73,9 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
const BdrvChildRole *child_role,
Error **errp);
 
+static bool has_significant_runtime_options(BlockDriverState *bs,
+const QDict *d);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -2214,6 +2217,42 @@ out:
 bdrv_refresh_limits(bs, NULL);
 }
 
+/**
+ * Checks whether @options contains any significant option for any of the nodes
+ * in the BDS tree emerging from @bs.
+ */
+static bool is_significant_option_tree(BlockDriverState *bs, QDict *options)
+{
+BdrvChild *child;
+
+if (!qdict_size(options)) {
+/* No need to recurse */
+return false;
+}
+
+if (has_significant_runtime_options(bs, options)) {
+return true;
+}
+
+QLIST_FOREACH(child, >children, next) {
+QDict *child_options;
+char *option_prefix;
+
+option_prefix = g_strdup_printf("%s.", child->name);
+qdict_extract_subqdict(options, _options, option_prefix);
+g_free(option_prefix);
+
+if (is_significant_option_tree(child->bs, child_options)) {
+QDECREF(child_options);
+return true;
+}
+
+QDECREF(child_options);
+}
+
+return false;
+}
+
 /*
  * Opens the backing file for a BlockDriverState if not yet open
  *
@@ -2232,7 +2271,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 const char *reference = NULL;
 int ret = 0;
 BlockDriverState *backing_hd;
-QDict *options;
+QDict *options, *cloned_options = NULL;
 QDict *tmp_parent_options = NULL;
 Error *local_err = NULL;
 
@@ -2262,11 +2301,6 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 reference = qdict_get_try_str(parent_options, bdref_key);
 if (reference || qdict_haskey(options, "file.filename")) {
 /* keep backing_filename NULL */
-
-/* FIXME: Should also be set to true if @options contains other runtime
- *options which control the data that is read from the backing
- *BDS */
-bs->backing_overridden = true;
 } else if (bs->backing_file[0] == '\0' && qdict_size(options) == 0) {
 QDECREF(options);
 goto free_exit;
@@ -2287,6 +2321,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 goto free_exit;
 }
 
+cloned_options = qdict_clone_shallow(options);
+
 if (!reference &&
 bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) {
 qdict_put_str(options, "driver", bs->backing_format);
@@ -2302,6 +2338,10 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 }
 bdrv_set_aio_context(backing_hd, bdrv_get_aio_context(bs));
 
+if (reference || is_significant_option_tree(backing_hd, cloned_options)) {
+bs->backing_overridden = true;
+}
+
 /* Hook up the backing file link; drop our reference, bs owns the
  * backing_hd reference now */
 bdrv_set_backing_hd(bs, backing_hd, _err);
@@ -2317,6 +2357,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 free_exit:
 g_free(backing_filename);
 QDECREF(tmp_parent_options);
+QDECREF(cloned_options);
 return ret;
 }
 
@@ -4973,6 +5014,34 @@ static const char *const 
*significant_options(BlockDriverState *bs,
 }
 
 /**
+ * Returns true if @d contains any options the block driver of @bs considers to
+ * be significant runtime options.
+ */
+static bool has_significant_runtime_options(BlockDriverState *bs,
+const QDict *d)
+{
+const char *const *option_name = NULL;
+
+while ((option_name = significant_options(bs, option_name))) {
+assert(strlen(*option_name) > 0);
+if ((*option_name)[strlen(*option_name) - 1] != '.') {
+if (qdict_haskey(d, *option_name)) {
+return true;
+}
+} else {
+const QDictEntry *entry;
+for (entry = qdict_first(d); entry; entry = qdict_next(d, entry)) {
+if (strstart(qdict_entry_key(entry), *option_name, NULL)) {
+return true;
+}

[Qemu-devel] [PATCH v7 for-2.12 11/25] block: Add bdrv_dirname()

2017-11-20 Thread Max Reitz
This function may be implemented by block drivers to derive a directory
name from a BDS. Concatenating this g_free()-able string with a relative
filename must result in a valid (not necessarily existing) filename, so
this is a function that should generally be not implemented by format
drivers, because this is protocol-specific.

If a BDS's driver does not implement this function, bdrv_dirname() will
fall through to the BDS's file if it exists. If it does not, the
exact_filename field will be used to generate a directory name.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 include/block/block.h |  1 +
 include/block/block_int.h |  7 +++
 block.c   | 26 ++
 3 files changed, 34 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 3e2ea6c879..a053a27148 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -485,6 +485,7 @@ char *bdrv_get_full_backing_filename(BlockDriverState *bs, 
Error **errp);
 char *bdrv_get_full_backing_filename_from_filename(const char *backed,
const char *backing,
Error **errp);
+char *bdrv_dirname(BlockDriverState *bs, Error **errp);
 
 int path_has_protocol(const char *path);
 int path_is_absolute(const char *path);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5e9734d8b5..c07882962d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -135,6 +135,13 @@ struct BlockDriver {
 
 void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);
 
+/*
+ * Returns an allocated string which is the directory name of this BDS: It
+ * will be used to make relative filenames absolute by prepending this
+ * function's return value to them.
+ */
+char *(*bdrv_dirname)(BlockDriverState *bs, Error **errp);
+
 /* aio */
 BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
diff --git a/block.c b/block.c
index ecea78dae0..20e45c85c6 100644
--- a/block.c
+++ b/block.c
@@ -5098,6 +5098,32 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 }
 }
 
+char *bdrv_dirname(BlockDriverState *bs, Error **errp)
+{
+BlockDriver *drv = bs->drv;
+
+if (!drv) {
+error_setg(errp, "Node '%s' is ejected", bs->node_name);
+return NULL;
+}
+
+if (drv->bdrv_dirname) {
+return drv->bdrv_dirname(bs, errp);
+}
+
+if (bs->file) {
+return bdrv_dirname(bs->file->bs, errp);
+}
+
+if (bs->exact_filename[0] != '\0') {
+return path_combine(bs->exact_filename, "");
+}
+
+error_setg(errp, "Cannot generate a base directory for %s nodes",
+   drv->format_name);
+return NULL;
+}
+
 /*
  * Hot add/remove a BDS's child. So the user can take a child offline when
  * it is broken and take a new child online
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 04/25] iotests: Drop explicit base blockdev in 191

2017-11-20 Thread Max Reitz
Overriding the backing image should result in a json:{} pseudo-filename.
Then, you can no longer use the commit block job with filename
parameters.  Therefore, do not explicitly add the base and override the
middle image in iotest 191, since we do not need to anyway.  This will
allow us to continue to use the middle image's filename to identify it.

In the long run, we want block-commit to accept node names for base and
top (just like block-stream does).

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/191 |  3 +-
 tests/qemu-iotests/191.out | 70 +++---
 2 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/tests/qemu-iotests/191 b/tests/qemu-iotests/191
index ad785e10b1..92805e9ca2 100755
--- a/tests/qemu-iotests/191
+++ b/tests/qemu-iotests/191
@@ -67,8 +67,7 @@ qemu_comm_method="qmp"
 qmp_pretty="y"
 
 _launch_qemu \
--blockdev 
"driver=${IMGFMT},file.driver=file,file.filename=${TEST_IMG}.base,node-name=base"
 \
--blockdev 
"driver=${IMGFMT},file.driver=file,file.filename=${TEST_IMG}.mid,node-name=mid,backing=base"
 \
+-blockdev 
"driver=${IMGFMT},file.driver=file,file.filename=${TEST_IMG}.mid,node-name=mid" 
\
 -blockdev 
"driver=${IMGFMT},file.driver=file,file.filename=${TEST_IMG},node-name=top,backing=mid"
 \
 -blockdev 
"driver=${IMGFMT},file.driver=file,file.filename=${TEST_IMG}.ovl2,node-name=top2,backing=mid"
 h=$QEMU_HANDLE
diff --git a/tests/qemu-iotests/191.out b/tests/qemu-iotests/191.out
index 73c0ed454c..26ff0a315d 100644
--- a/tests/qemu-iotests/191.out
+++ b/tests/qemu-iotests/191.out
@@ -220,26 +220,8 @@ wrote 65536/65536 bytes at offset 1048576
 "iops_rd": 0,
 "detect_zeroes": "off",
 "image": {
-"backing-image": {
-"virtual-size": 67108864,
-"filename": "TEST_DIR/t.qcow2.base",
-"cluster-size": 65536,
-"format": "qcow2",
-"actual-size": SIZE,
-"format-specific": {
-"type": "qcow2",
-"data": {
-"compat": "1.1",
-"lazy-refcounts": false,
-"refcount-bits": 16,
-"corrupt": false
-}
-},
-"dirty-flag": false
-},
-"backing-filename-format": "qcow2",
 "virtual-size": 67108864,
-"filename": "TEST_DIR/t.qcow2.mid",
+"filename": "TEST_DIR/t.qcow2.base",
 "cluster-size": 65536,
 "format": "qcow2",
 "actual-size": SIZE,
@@ -252,19 +234,16 @@ wrote 65536/65536 bytes at offset 1048576
 "corrupt": false
 }
 },
-"full-backing-filename": "TEST_DIR/t.qcow2.base",
-"backing-filename": "TEST_DIR/t.qcow2.base",
 "dirty-flag": false
 },
 "iops_wr": 0,
-"ro": false,
-"node-name": "mid",
-"backing_file_depth": 1,
+"ro": true,
+"node-name": "NODE_NAME",
+"backing_file_depth": 0,
 "drv": "qcow2",
 "iops": 0,
 "bps_wr": 0,
 "write_threshold": 0,
-"backing_file": "TEST_DIR/t.qcow2.base",
 "encrypted": false,
 "bps": 0,
 "bps_rd": 0,
@@ -273,7 +252,7 @@ wrote 65536/65536 bytes at offset 1048576
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2.mid",
+"file": "TEST_DIR/t.qcow2.base",
 "encryption_key_missing": false
 },
 {
@@ -281,13 +260,13 @@ wrote 65536/65536 bytes at offset 1048576
 "detect_zeroes": "off",
 "image": {
 "virtual-size": 393216,
-"filename": "TEST_DIR/t.qcow2.mid",
+"filename": "TEST_DIR/t.qcow2.base",
 "format": "file",
 "actual-size": SIZE,
 "dirty-flag": false
 },
 "iops_wr": 0,
-"ro": false,
+"ro": true,
 "node-name": "NODE_NAME",
 "backing_file_depth": 0,
 "drv": "file",
@@ -302,15 +281,33 @@ wrote 65536/65536 bytes at offset 1048576
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2.mid",
+"file": "TEST_DIR/t.qcow2.base",
 "encryption_key_missing": false
 },
 {
 "iops_rd": 0,
 "detect_zeroes": "off",
 "image": {
+"backing-image": {
+"virtual-size": 67108864,
+"filename": 

[Qemu-devel] [PATCH v7 for-2.12 17/25] iotests: Add quorum case to test 110

2017-11-20 Thread Max Reitz
Test 110 tests relative backing filenames for complex BDS trees.  Now
that the originally supposedly failing test passes, let us add a new
failing test: Quorum can never work automatically (without detecting
whether all child nodes have the same base directory, but that would be
rather inconsistent behavior).

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/110 | 26 ++
 tests/qemu-iotests/110.out |  7 +++
 2 files changed, 33 insertions(+)

diff --git a/tests/qemu-iotests/110 b/tests/qemu-iotests/110
index ba1b3c6c7d..08976c8a61 100755
--- a/tests/qemu-iotests/110
+++ b/tests/qemu-iotests/110
@@ -30,6 +30,7 @@ status=1  # failure is the default!
 _cleanup()
 {
_cleanup_test_img
+rm -f "$TEST_IMG.copy"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
 
@@ -87,6 +88,31 @@ echo
 # omit the image size; it should work anyway
 _make_test_img -b "$TEST_IMG_REL.base"
 
+echo
+echo '=== Nodes without a common directory ==='
+echo
+
+cp "$TEST_IMG" "$TEST_IMG.copy"
+
+# Should inform us that the actual path of the backing file cannot be 
determined
+TEST_IMG="json:{
+'driver': '$IMGFMT',
+'file': {
+'driver': 'quorum',
+'vote-threshold': 1,
+'children': [
+{
+'driver': 'file',
+'filename': '$TEST_IMG'
+},
+{
+'driver': 'file',
+'filename': '$TEST_IMG.copy'
+}
+]
+}
+}" _img_info | _filter_img_info
+
 
 # success, all done
 echo '*** done'
diff --git a/tests/qemu-iotests/110.out b/tests/qemu-iotests/110.out
index 5370bc1d26..1d0b2475cc 100644
--- a/tests/qemu-iotests/110.out
+++ b/tests/qemu-iotests/110.out
@@ -19,4 +19,11 @@ backing file: t.IMGFMT.base (actual path: 
TEST_DIR/t.IMGFMT.base)
 === Backing name is always relative to the backed image ===
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
backing_file=t.IMGFMT.base
+
+=== Nodes without a common directory ===
+
+image: json:{"driver": "IMGFMT", "file": {"children": [{"driver": "file", 
"filename": "TEST_DIR/t.IMGFMT"}, {"driver": "file", "filename": 
"TEST_DIR/t.IMGFMT.copy"}], "driver": "quorum", "blkverify": false, 
"rewrite-corrupted": false, "vote-threshold": 1}}
+file format: IMGFMT
+virtual size: 64M (67108864 bytes)
+backing file: t.IMGFMT.base (cannot determine actual path)
 *** done
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 03/25] block: Add BDS.backing_overridden

2017-11-20 Thread Max Reitz
If the backing file is overridden, this most probably does change the
guest-visible data of a BDS. Therefore, we will need to consider this in
bdrv_refresh_filename().

Adding a new field to the BDS is not nice, but it is very simple and
exactly keeps track of whether the backing file has been overridden.

This commit adds a FIXME which will be remedied by a follow-up commit.
Until then, the respective piece of code will not result in any behavior
that is worse than what we currently have.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 include/block/block_int.h |  1 +
 block.c   | 13 +
 block/mirror.c|  4 
 blockdev.c| 16 
 4 files changed, 34 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index a5482775ec..5e9734d8b5 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -619,6 +619,7 @@ struct BlockDriverState {
 char backing_file[PATH_MAX]; /* if non zero, the image is a diff of
 this file image */
 char backing_format[16]; /* if non-zero and backing_file exists */
+bool backing_overridden; /* backing file has been specified by the user */
 
 QDict *full_open_options;
 char exact_filename[PATH_MAX];
diff --git a/block.c b/block.c
index 4aebf590d8..eb67dfbcc0 100644
--- a/block.c
+++ b/block.c
@@ -2233,6 +2233,11 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*parent_options,
 reference = qdict_get_try_str(parent_options, bdref_key);
 if (reference || qdict_haskey(options, "file.filename")) {
 backing_filename[0] = '\0';
+
+/* FIXME: Should also be set to true if @options contains other runtime
+ *options which control the data that is read from the backing
+ *BDS */
+bs->backing_overridden = true;
 } else if (bs->backing_file[0] == '\0' && qdict_size(options) == 0) {
 QDECREF(options);
 goto free_exit;
@@ -2434,6 +2439,9 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 goto out;
 }
 
+bs_snapshot->backing_overridden = true;
+bdrv_refresh_filename(bs_snapshot);
+
 out:
 QDECREF(snapshot_options);
 g_free(tmp_filename);
@@ -2564,6 +2572,7 @@ static BlockDriverState *bdrv_open_inherit(const char 
*filename,
 backing = qdict_get_try_str(options, "backing");
 if (backing && *backing == '\0') {
 flags |= BDRV_O_NO_BACKING;
+bs->backing_overridden = true;
 qdict_del(options, "backing");
 }
 
@@ -4976,6 +4985,10 @@ void bdrv_refresh_filename(BlockDriverState *bs)
  * refresh those first */
 QLIST_FOREACH(child, >children, next) {
 bdrv_refresh_filename(child->bs);
+
+if (child->role == _backing && child->bs->backing_overridden) {
+bs->backing_overridden = true;
+}
 }
 
 if (drv->bdrv_refresh_filename) {
diff --git a/block/mirror.c b/block/mirror.c
index f059981f19..d6e487fb2e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -530,6 +530,10 @@ static void mirror_exit(BlockJob *job, void *opaque)
 error_report_err(local_err);
 data->ret = -EPERM;
 }
+
+/* The target image's file already has been created with the backing
+ * file we just set, so there is no need to set backing_overridden or
+ * call bdrv_refresh_filename(). */
 }
 
 if (s->to_replace) {
diff --git a/blockdev.c b/blockdev.c
index 56a6b24a0b..b0393e1786 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1781,6 +1781,8 @@ static void external_snapshot_commit(BlkActionState 
*common)
 {
 ExternalSnapshotState *state =
  DO_UPCAST(ExternalSnapshotState, common, common);
+TransactionAction *action = common->action;
+bool image_was_existing = false;
 
 /* We don't need (or want) to use the transactional
  * bdrv_reopen_multiple() across all the entries at once, because we
@@ -1789,6 +1791,20 @@ static void external_snapshot_commit(BlkActionState 
*common)
 bdrv_reopen(state->old_bs, state->old_bs->open_flags & ~BDRV_O_RDWR,
 NULL);
 }
+
+if (action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC) {
+BlockdevSnapshotSync *s = action->u.blockdev_snapshot_sync.data;
+if (s->has_mode && s->mode == NEW_IMAGE_MODE_EXISTING) {
+image_was_existing = true;
+}
+} else {
+image_was_existing = true;
+}
+
+if (image_was_existing) {
+state->new_bs->backing_overridden = true;
+bdrv_refresh_filename(state->new_bs);
+}
 }
 
 static void external_snapshot_abort(BlkActionState *common)
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 07/25] block: bdrv_get_full_backing_filename_from_...'s ret. val.

2017-11-20 Thread Max Reitz
Make bdrv_get_full_backing_filename_from_filename() return an allocated
string instead of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 include/block/block.h |  7 +++
 block.c   | 51 +++
 block/vmdk.c  |  9 -
 3 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 4584893c78..46c6190414 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -483,10 +483,9 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
 void bdrv_get_full_backing_filename(BlockDriverState *bs,
 char *dest, size_t sz, Error **errp);
-void bdrv_get_full_backing_filename_from_filename(const char *backed,
-  const char *backing,
-  char *dest, size_t sz,
-  Error **errp);
+char *bdrv_get_full_backing_filename_from_filename(const char *backed,
+   const char *backing,
+   Error **errp);
 
 int path_has_protocol(const char *path);
 int path_is_absolute(const char *path);
diff --git a/block.c b/block.c
index 8786b9b520..f8b3edd5bb 100644
--- a/block.c
+++ b/block.c
@@ -288,20 +288,28 @@ int bdrv_set_read_only(BlockDriverState *bs, bool 
read_only, Error **errp)
 return 0;
 }
 
-void bdrv_get_full_backing_filename_from_filename(const char *backed,
-  const char *backing,
-  char *dest, size_t sz,
-  Error **errp)
+/* If @backing is empty, this function returns NULL without setting
+ * @errp.  In all other cases, NULL will only be returned with @errp
+ * set.
+ *
+ * Therefore, a return value of NULL without @errp set means that
+ * there is no backing file; if @errp is set, there is one but its
+ * absolute filename cannot be generated.
+ */
+char *bdrv_get_full_backing_filename_from_filename(const char *backed,
+   const char *backing,
+   Error **errp)
 {
-if (backing[0] == '\0' || path_has_protocol(backing) ||
-path_is_absolute(backing))
-{
-pstrcpy(dest, sz, backing);
+if (backing[0] == '\0') {
+return NULL;
+} else if (path_has_protocol(backing) || path_is_absolute(backing)) {
+return g_strdup(backing);
 } else if (backed[0] == '\0' || strstart(backed, "json:", NULL)) {
 error_setg(errp, "Cannot use relative backing file names for '%s'",
backed);
+return NULL;
 } else {
-path_combine_deprecated(dest, sz, backed, backing);
+return path_combine(backed, backing);
 }
 }
 
@@ -309,9 +317,20 @@ void bdrv_get_full_backing_filename(BlockDriverState *bs, 
char *dest, size_t sz,
 Error **errp)
 {
 char *backed = bs->exact_filename[0] ? bs->exact_filename : bs->filename;
+char *full_name;
+Error *local_error = NULL;
 
-bdrv_get_full_backing_filename_from_filename(backed, bs->backing_file,
- dest, sz, errp);
+full_name = bdrv_get_full_backing_filename_from_filename(backed,
+ bs->backing_file,
+ _error);
+if (full_name) {
+pstrcpy(dest, sz, full_name);
+g_free(full_name);
+} else if (local_error) {
+error_propagate(errp, local_error);
+} else if (sz > 0) {
+*dest = '\0';
+}
 }
 
 void bdrv_register(BlockDriver *bdrv)
@@ -4593,17 +4612,17 @@ void bdrv_img_create(const char *filename, const char 
*fmt,
 size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, img_size);
 if (backing_file && !(flags & BDRV_O_NO_BACKING)) {
 BlockDriverState *bs;
-char *full_backing = g_new0(char, PATH_MAX);
+char *full_backing;
 int back_flags;
 QDict *backing_options = NULL;
 
-bdrv_get_full_backing_filename_from_filename(filename, backing_file,
- full_backing, PATH_MAX,
- _err);
+full_backing =
+bdrv_get_full_backing_filename_from_filename(filename, 
backing_file,
+ _err);
 if (local_err) {
-g_free(full_backing);
 goto out;
 }
+assert(full_backing);
 
 /* backing files always opened 

[Qemu-devel] [PATCH v7 for-2.12 02/25] block: Use children list in bdrv_refresh_filename

2017-11-20 Thread Max Reitz
bdrv_refresh_filename() should invoke itself recursively on all
children, not just on file.

With that change, we can remove the manual invocations in blkverify,
quorum, commit, and mirror.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 block.c   | 9 +
 block/blkverify.c | 3 ---
 block/commit.c| 1 -
 block/mirror.c| 1 -
 block/quorum.c| 1 -
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 6c8ef98dfa..4aebf590d8 100644
--- a/block.c
+++ b/block.c
@@ -4965,16 +4965,17 @@ static bool append_open_options(QDict *d, 
BlockDriverState *bs)
 void bdrv_refresh_filename(BlockDriverState *bs)
 {
 BlockDriver *drv = bs->drv;
+BdrvChild *child;
 QDict *opts;
 
 if (!drv) {
 return;
 }
 
-/* This BDS's file name will most probably depend on its file's name, so
- * refresh that first */
-if (bs->file) {
-bdrv_refresh_filename(bs->file->bs);
+/* This BDS's file name may depend on any of its children's file names, so
+ * refresh those first */
+QLIST_FOREACH(child, >children, next) {
+bdrv_refresh_filename(child->bs);
 }
 
 if (drv->bdrv_refresh_filename) {
diff --git a/block/blkverify.c b/block/blkverify.c
index 06369f9eac..b2ed8cd70d 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -281,9 +281,6 @@ static void blkverify_refresh_filename(BlockDriverState 
*bs, QDict *options)
 {
 BDRVBlkverifyState *s = bs->opaque;
 
-/* bs->file->bs has already been refreshed */
-bdrv_refresh_filename(s->test_file->bs);
-
 if (bs->file->bs->full_open_options
 && s->test_file->bs->full_open_options)
 {
diff --git a/block/commit.c b/block/commit.c
index 5036eec434..cb529cc2aa 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -241,7 +241,6 @@ static int coroutine_fn 
bdrv_commit_top_preadv(BlockDriverState *bs,
 
 static void bdrv_commit_top_refresh_filename(BlockDriverState *bs, QDict *opts)
 {
-bdrv_refresh_filename(bs->backing->bs);
 pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
 bs->backing->bs->filename);
 }
diff --git a/block/mirror.c b/block/mirror.c
index f995924032..f059981f19 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1061,7 +1061,6 @@ static void 
bdrv_mirror_top_refresh_filename(BlockDriverState *bs, QDict *opts)
  * bdrv_set_backing_hd */
 return;
 }
-bdrv_refresh_filename(bs->backing->bs);
 pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
 bs->backing->bs->filename);
 }
diff --git a/block/quorum.c b/block/quorum.c
index 272f9a5b77..2f1a628449 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1074,7 +1074,6 @@ static void quorum_refresh_filename(BlockDriverState *bs, 
QDict *options)
 int i;
 
 for (i = 0; i < s->num_children; i++) {
-bdrv_refresh_filename(s->children[i]->bs);
 if (!s->children[i]->bs->full_open_options) {
 return;
 }
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 16/25] block: Use bdrv_dirname() for relative filenames

2017-11-20 Thread Max Reitz
bdrv_get_full_backing_filename_from_filename() breaks down when it comes
to JSON filenames. Using bdrv_dirname() as the basis is better because
since we have BDS, we can descend through the BDS tree to the protocol
layer, which gives us a greater probability of finding a non-JSON name;
also, bdrv_dirname() is more correct as it allows block drivers to
override the generation of that directory name in a protocol-specific
way.

We still need to keep bdrv_get_full_backing_filename_from_filename(),
though, because it has valid callers which need it during image creation
when no BDS is available yet.

This makes a test case in qemu-iotest 110, which was supposed to fail,
work. That is actually good, but we need to change the reference output
(and the comment in 110) accordingly.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 block.c| 20 +++-
 tests/qemu-iotests/110 |  3 ++-
 tests/qemu-iotests/110.out |  2 +-
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 20e45c85c6..2e59e77f9d 100644
--- a/block.c
+++ b/block.c
@@ -311,12 +311,22 @@ char *bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 static char *bdrv_make_absolute_filename(BlockDriverState *relative_to,
  const char *filename, Error **errp)
 {
-char *bs_filename = relative_to->exact_filename[0]
-? relative_to->exact_filename
-: relative_to->filename;
+char *dir, *full_name;
 
-return bdrv_get_full_backing_filename_from_filename(bs_filename,
-filename ?: "", errp);
+if (!filename || filename[0] == '\0') {
+return NULL;
+} else if (path_has_protocol(filename) || path_is_absolute(filename)) {
+return g_strdup(filename);
+}
+
+dir = bdrv_dirname(relative_to, errp);
+if (!dir) {
+return NULL;
+}
+
+full_name = g_strconcat(dir, filename, NULL);
+g_free(dir);
+return full_name;
 }
 
 char *bdrv_get_full_backing_filename(BlockDriverState *bs, Error **errp)
diff --git a/tests/qemu-iotests/110 b/tests/qemu-iotests/110
index 9de7369f3a..ba1b3c6c7d 100755
--- a/tests/qemu-iotests/110
+++ b/tests/qemu-iotests/110
@@ -61,7 +61,8 @@ echo '=== Non-reconstructable filename ==='
 echo
 
 # Across blkdebug without a config file, you cannot reconstruct filenames, so
-# qemu is incapable of knowing the directory of the top image
+# qemu is incapable of knowing the directory of the top image from the filename
+# alone. However, using bdrv_dirname(), it should still work.
 TEST_IMG="json:{
 'driver': '$IMGFMT',
 'file': {
diff --git a/tests/qemu-iotests/110.out b/tests/qemu-iotests/110.out
index b3584ff87f..5370bc1d26 100644
--- a/tests/qemu-iotests/110.out
+++ b/tests/qemu-iotests/110.out
@@ -14,7 +14,7 @@ backing file: t.IMGFMT.base (actual path: 
TEST_DIR/t.IMGFMT.base)
 image: json:{"driver": "IMGFMT", "file": {"set-state.0.event": "read_aio", 
"image": {"driver": "file", "filename": "TEST_DIR/t.IMGFMT"}, "driver": 
"blkdebug", "set-state.0.new_state": 42}}
 file format: IMGFMT
 virtual size: 64M (67108864 bytes)
-backing file: t.IMGFMT.base (cannot determine actual path)
+backing file: t.IMGFMT.base (actual path: TEST_DIR/t.IMGFMT.base)
 
 === Backing name is always relative to the backed image ===
 
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 01/25] block/mirror: Small absolute-paths simplification

2017-11-20 Thread Max Reitz
When invoking drive-mirror in absolute-paths mode, the target's backing
BDS is assigned to it in mirror_exit(). The current logic only does so
if the target does not have that backing BDS already; but it actually
cannot have a backing BDS at all (the BDS is opened with O_NO_BACKING in
qmp_drive_mirror()), so just assert that and assign the new backing BDS
unconditionally.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 block/mirror.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 307b6391a8..f995924032 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -523,12 +523,12 @@ static void mirror_exit(BlockJob *job, void *opaque)
 _abort);
 if (s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
 BlockDriverState *backing = s->is_none_mode ? src : s->base;
-if (backing_bs(target_bs) != backing) {
-bdrv_set_backing_hd(target_bs, backing, _err);
-if (local_err) {
-error_report_err(local_err);
-data->ret = -EPERM;
-}
+
+assert(!target_bs->backing);
+bdrv_set_backing_hd(target_bs, backing, _err);
+if (local_err) {
+error_report_err(local_err);
+data->ret = -EPERM;
 }
 }
 
-- 
2.13.6




[Qemu-devel] [PATCH v7 for-2.12 06/25] block: Make path_combine() return the path

2017-11-20 Thread Max Reitz
Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.

In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
---
 include/block/block.h |  4 +--
 block.c   | 85 ---
 block/vmdk.c  |  3 +-
 3 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index c05cac57e5..4584893c78 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -490,9 +490,7 @@ void bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 
 int path_has_protocol(const char *path);
 int path_is_absolute(const char *path);
-void path_combine(char *dest, int dest_size,
-  const char *base_path,
-  const char *filename);
+char *path_combine(const char *base_path, const char *filename);
 
 int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
 int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos);
diff --git a/block.c b/block.c
index 2dc06cb9cb..8786b9b520 100644
--- a/block.c
+++ b/block.c
@@ -147,53 +147,62 @@ int path_is_absolute(const char *path)
 #endif
 }
 
-/* if filename is absolute, just copy it to dest. Otherwise, build a
+/* if filename is absolute, just return its duplicate. Otherwise, build a
path to it by considering it is relative to base_path. URL are
supported. */
-void path_combine(char *dest, int dest_size,
-  const char *base_path,
-  const char *filename)
+char *path_combine(const char *base_path, const char *filename)
 {
+const char *protocol_stripped = NULL;
 const char *p, *p1;
+char *result;
 int len;
 
-if (dest_size <= 0)
-return;
 if (path_is_absolute(filename)) {
-pstrcpy(dest, dest_size, filename);
-} else {
-const char *protocol_stripped = NULL;
+return g_strdup(filename);
+}
 
-if (path_has_protocol(base_path)) {
-protocol_stripped = strchr(base_path, ':');
-if (protocol_stripped) {
-protocol_stripped++;
-}
+if (path_has_protocol(base_path)) {
+protocol_stripped = strchr(base_path, ':');
+if (protocol_stripped) {
+protocol_stripped++;
 }
-p = protocol_stripped ?: base_path;
+}
+p = protocol_stripped ?: base_path;
 
-p1 = strrchr(base_path, '/');
+p1 = strrchr(base_path, '/');
 #ifdef _WIN32
-{
-const char *p2;
-p2 = strrchr(base_path, '\\');
-if (!p1 || p2 > p1)
-p1 = p2;
+{
+const char *p2;
+p2 = strrchr(base_path, '\\');
+if (!p1 || p2 > p1) {
+p1 = p2;
 }
+}
 #endif
-if (p1)
-p1++;
-else
-p1 = base_path;
-if (p1 > p)
-p = p1;
-len = p - base_path;
-if (len > dest_size - 1)
-len = dest_size - 1;
-memcpy(dest, base_path, len);
-dest[len] = '\0';
-pstrcat(dest, dest_size, filename);
+if (p1) {
+p1++;
+} else {
+p1 = base_path;
+}
+if (p1 > p) {
+p = p1;
 }
+len = p - base_path;
+
+result = g_malloc(len + strlen(filename) + 1);
+memcpy(result, base_path, len);
+strcpy(result + len, filename);
+
+return result;
+}
+
+static void path_combine_deprecated(char *dest, int dest_size,
+const char *base_path,
+const char *filename)
+{
+char *combined = path_combine(base_path, filename);
+pstrcpy(dest, dest_size, combined);
+g_free(combined);
 }
 
 /*
@@ -292,7 +301,7 @@ void bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 error_setg(errp, "Cannot use relative backing file names for '%s'",
backed);
 } else {
-path_combine(dest, sz, backed, backing);
+path_combine_deprecated(dest, sz, backed, backing);
 }
 }
 
@@ -4136,8 +4145,8 @@ BlockDriverState 
*bdrv_find_backing_image(BlockDriverState *bs,
 } else {
 /* If not an absolute filename path, make it relative to the 
current
  * image's filename path */
-path_combine(filename_tmp, PATH_MAX, curr_bs->filename,
- backing_file);
+path_combine_deprecated(filename_tmp, PATH_MAX, curr_bs->filename,
+backing_file);
 
 /* We are going to compare absolute pathnames */
 if (!realpath(filename_tmp, filename_full)) {
@@ -4146,8 +4155,8 @@ BlockDriverState 

[Qemu-devel] [PATCH v7 for-2.12 00/25] block: Fix some filename generation issues

2017-11-20 Thread Max Reitz
I'm sparing myself writing this cover letter again, and I'll just give
you a link to the previous version:

http://lists.nongnu.org/archive/html/qemu-block/2017-09/msg01030.html

The only difference is that I dropped patch 16 which added a QAPI
@base-directory option for any node that could be used to override the
base directory to be used for resolving relative filenames.  (Because
Berto and Kevin convinced me it's not that useful now -- and the patch
is pretty stand-alone, so we can always add it later if needed.)

In turn, I had to add patch 4 because test 191 was broken by this
series.


v7: Rebased on block-next, and:
- Patch 4: Added (iotest 191 tries to both override a node's backing
   file and use filenames for node identification at the same
   time -- not a very good idea, but with a simple modification
   we can still get away with it)
- Patch 5: Changes 191's output now
- Patch 10 (was 9): Shorten the code [Berto]
- Old patch 16: Dropped [Berto/Kevin]
- Patch 17: Remove test case for @base-directory
- Patch 18:
  - Never use inline definition of the significant runtime options array
[Kevin]
  - Add encrypt.key-secret options for qcow/qcow2
  - Add transfer limit options for blkdebug
  - s/uid/user/ and s/gid/group/ for NFS
  - Add offset/size for raw
- Patch 19: Sprinkle a couple of notes in about how we actually don't
want this new function [Kevin, kind of]
- Patch 20: Changes 191's output now (and also, there is a rebase
conflict for 110 due to patch 16 being removed and thus
patch 17 having been modified)


git-backport-diff against v6:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/25:[] [--] 'block/mirror: Small absolute-paths simplification'
002/25:[] [--] 'block: Use children list in bdrv_refresh_filename'
003/25:[] [--] 'block: Add BDS.backing_overridden'
004/25:[down] 'iotests: Drop explicit base blockdev in 191'
005/25:[0024] [FC] 'block: Respect backing bs in bdrv_refresh_filename'
006/25:[] [--] 'block: Make path_combine() return the path'
007/25:[] [-C] 'block: bdrv_get_full_backing_filename_from_...'s ret. val.'
008/25:[] [--] 'block: bdrv_get_full_backing_filename's ret. val.'
009/25:[] [--] 'block: Add bdrv_make_absolute_filename()'
010/25:[0014] [FC] 'block: Fix bdrv_find_backing_image()'
011/25:[] [--] 'block: Add bdrv_dirname()'
012/25:[] [--] 'blkverify: Make bdrv_dirname() return NULL'
013/25:[] [--] 'quorum: Make bdrv_dirname() return NULL'
014/25:[] [--] 'block/nbd: Make bdrv_dirname() return NULL'
015/25:[] [--] 'block/nfs: Implement bdrv_dirname()'
016/25:[] [--] 'block: Use bdrv_dirname() for relative filenames'
017/25:[0027] [FC] 'iotests: Add quorum case to test 110'
018/25:[0188] [FC] 'block: Add sgfnt_runtime_opts to BlockDriver'
019/25:[0044] [FC] 'block: Add BlockDriver.bdrv_gather_child_options'
020/25:[0026] [FC] 'block: Generically refresh runtime options'
021/25:[] [-C] 'block: Purify .bdrv_refresh_filename()'
022/25:[] [--] 'block: Do not copy exact_filename from format file'
023/25:[] [-C] 'block: Fix FIXME from "Add BDS.backing_overridden"'
024/25:[] [--] 'block/curl: Implement bdrv_refresh_filename()'
025/25:[] [--] 'block/null: Generate filename even with latency-ns'


Max Reitz (25):
  block/mirror: Small absolute-paths simplification
  block: Use children list in bdrv_refresh_filename
  block: Add BDS.backing_overridden
  iotests: Drop explicit base blockdev in 191
  block: Respect backing bs in bdrv_refresh_filename
  block: Make path_combine() return the path
  block: bdrv_get_full_backing_filename_from_...'s ret. val.
  block: bdrv_get_full_backing_filename's ret. val.
  block: Add bdrv_make_absolute_filename()
  block: Fix bdrv_find_backing_image()
  block: Add bdrv_dirname()
  blkverify: Make bdrv_dirname() return NULL
  quorum: Make bdrv_dirname() return NULL
  block/nbd: Make bdrv_dirname() return NULL
  block/nfs: Implement bdrv_dirname()
  block: Use bdrv_dirname() for relative filenames
  iotests: Add quorum case to test 110
  block: Add sgfnt_runtime_opts to BlockDriver
  block: Add BlockDriver.bdrv_gather_child_options
  block: Generically refresh runtime options
  block: Purify .bdrv_refresh_filename()
  block: Do not copy exact_filename from format file
  block: Fix FIXME from "Add BDS.backing_overridden"
  block/curl: Implement bdrv_refresh_filename()
  block/null: Generate filename even with latency-ns

 include/block/block.h |  15 +-
 include/block/block_int.h |  38 +++-
 block.c   | 507 --
 block/blkdebug.c  |  69 +++---
 block/blkverify.c |  29 +--
 block/commit.c|   3 +-
 block/crypto.c|   

Re: [Qemu-devel] [PATCH v2] tests/bios-tables-test: Fix endianess problems when passing data to iasl

2017-11-20 Thread Thomas Huth
On 20.11.2017 17:55, Igor Mammedov wrote:
> On Thu, 16 Nov 2017 13:17:02 +0100
> Thomas Huth  wrote:
> 
>> The bios-tables-test was writing out files that we pass to iasl in
>> with the wrong endianness in the header when running on a big endian
>> host. So instead of storing mixed endian information in our structures,
>> let's keep everything in little endian and byte-swap it only when we
>> need a value in the code.
>>
>> Reported-by: Daniel P. Berrange 
>> Buglink: https://bugs.launchpad.net/qemu/+bug/1724570
>> Suggested-by: Michael S. Tsirkin 
>> Signed-off-by: Thomas Huth 
>> ---
>>  v2: Fixed vmgenid-test which was accidentially broken in v1
>>
>>  tests/acpi-utils.h   | 27 +--
>>  tests/bios-tables-test.c | 42 ++
>>  tests/vmgenid-test.c | 22 --
>>  3 files changed, 39 insertions(+), 52 deletions(-)
>>
>> diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
>> index f8d8723..d5ca5b6 100644
>> --- a/tests/acpi-utils.h
>> +++ b/tests/acpi-utils.h
>> @@ -28,24 +28,9 @@ typedef struct {
>>  bool tmp_files_retain;   /* do not delete the temp asl/aml */
>>  } AcpiSdtTable;
>>  
>> -#define ACPI_READ_FIELD(field, addr)   \
>> -do {   \
>> -switch (sizeof(field)) {   \
>> -case 1:\
>> -field = readb(addr);   \
>> -break; \
>> -case 2:\
>> -field = readw(addr);   \
>> -break; \
>> -case 4:\
>> -field = readl(addr);   \
>> -break; \
>> -case 8:\
>> -field = readq(addr);   \
>> -break; \
>> -default:   \
>> -g_assert(false);   \
>> -}  \
> probably it's been discussed but, why not do
>  leXX_to_cpu()
> here, instead of making each place that access read field
> to do leXX_to_cpu() manually.?
>
> Beside of keeping access to structure in natural host order,
> it should also be less error-prone as field users don't
> have to worry about endianness.

Actually, the readw/l/q functions already do byte-swapping, so the data
was stored in host endian order. But Michael said that he'd prefer to
store all data from the ACPI tables in little endian mode instead -
that's why I've done the patch that way.

 Thomas



[Qemu-devel] [Bug 1673976] Re: linux-user clone() can't handle glibc posix_spawn() (causes locale-gen to assert)

2017-11-20 Thread Peter Maydell
OK, this can't be as simple as "posix_spawn() fails", because I've just
tried the test program from the posix_spawn manpage
(http://man7.org/linux/man-pages/man3/posix_spawn.3.html) and that works
fine for x86-64 guest, aarch64 guest and armhf guest. In the x86 and
armhf cases the libc I have seems to use the NR_vfork syscall, but for
aarch64 it uses clone(CLONE_VM | CLONE_VFORK | SIGCHLD, ...) which is
what the glibc sources linked in comment #5 do, and that all works fine.

And locale-gen runs fine for my xenial-armhf chroot using current head-
of-git QEMU:

root@e104462:/# locale-gen   
Generating locales (this might take a while)...
  en_GB.UTF-8... done
Generation complete.

So can I ask that people: (1) please try with current head of git (or
with 2.11-rc1, which is almost the same thing); (2) if there's still a
problem with localegen or with programs calling posix_spawn() or other
real-world code, please provide full repro instructions so I can try to
reproduce locally.

I don't think we can make clone() in general work, so oddball demo code
like the example program in the clone(2) manpage is out of scope, but
there may well be specific cases we can address.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1673976

Title:
  linux-user clone() can't handle glibc posix_spawn() (causes locale-gen
  to assert)

Status in QEMU:
  New

Bug description:
  I'm running a command (locale-gen) inside of an armv7h chroot mounted
  on my x86_64 desktop by putting qemu-arm-static into /usr/bin/ of the
  chroot file system and I get a core dump.

  locale-gen
  Generating locales...
    en_US.UTF-8...localedef: ../sysdeps/unix/sysv/linux/spawni.c:360: 
__spawnix: Assertion `ec >= 0' failed.
  qemu: uncaught target signal 6 (Aborted) - core dumped
  /usr/bin/locale-gen: line 41:34 Aborted (core dumped) 
localedef -i $input -c -f $charset -A /usr/share/locale/locale.alias $locale

  I've done this same thing successfully for years, but this breakage
  has appeared some time in the last 3 or so months. Possibly with the
  update to qemu version 2.8.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1673976/+subscriptions



[Qemu-devel] kvm: Failed to flush the L2 table cache: Input/output error

2017-11-20 Thread Stefan Priebe - Profihost AG
Hello,

while using qemu 2.9.1 and doing a backup of a disk:

I have sometimes the following output:

Formatting '/mnt/qemu-249-2017_11_19-04_00_05.qcow2', fmt=qcow2
size=236223201280 encryption=off cluster_size=65536 lazy_refcounts=off
refcount_bits=16


followed by:

kvm: Failed to flush the L2 table cache: Input/output error
kvm: Failed to flush the refcount block cache: Input/output error

If this error happens (Failed to flush) the whole backup is incomplete.
Host kernel is a 4.4 based openSuSE 42.3 kernel.

Is this a known bug?

Greets,
Stefan



Re: [Qemu-devel] [PULL 0/2] ppc-for-2.11 queue 20171120

2017-11-20 Thread Peter Maydell
On 20 November 2017 at 05:42, David Gibson <da...@gibson.dropbear.id.au> wrote:
> The following changes since commit 2e02083438962d26ef9dcc7100f3b378104183db:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2017-11-17 19:08:07 +)
>
> are available in the Git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-2.11-20171120
>
> for you to fetch changes up to 82512483940c756e2db1bd67ea91b02bc29c5e01:
>
>   spapr: reset DRCs after devices (2017-11-20 10:10:56 +1100)
>
> 
> ppc patch queue 2017-11-20
>
> Here's the current queue of ppc patches.  These 2 patches are both
> more complex than I'd ideally like this late in the 2.11 cycle.
> However, they do fix important bugs, so I think it's worth it on
> balance.
>
> 
> Greg Kurz (1):
>   spapr: reset DRCs after devices
>
> Suraj Jitindar Singh (1):
>   target/ppc: Update setting of cpu features to account for compat modes
>
>  hw/ppc/spapr.c | 64 
> +++---
>  hw/ppc/spapr_drc.c |  7 --
>  2 files changed, 42 insertions(+), 29 deletions(-)

Applied, thanks.

-- PMM



  1   2   3   >