date:20140806

Amit Shah  writes:

> On (Mon) 04 Aug 2014 [13:33:56], Markus Armbruster wrote:
>> Amit Shah  writes:
>> 
>> > To ensure two virtserialports don't get added to the system with the
>> > same 'name' parameter, we need to access all the ports on all the
>> > devices added, and compare the names.
>> >
>> > We currently don't have a list of all VirtIOSerial devices added to the
>> > system.  This commit adds a simple linked list in which devices are put
>> > when they're initialized, and removed when they go away.
>
> 
>
>> > +struct VirtIOSerialDevices {
>> > +QLIST_HEAD(, VirtIOSerial) devices;
>> > +} vserdevices;
>> > +
>> 
>> Any particular reason for stuffing the list into a struct?
>
> Not strictly needed for this patch alone, but I think it's cleaner to
> keep it this way, and when something else comes up that needs a
> per-device variable, this list will be around.

Adding the struct when it's needed will be easy, so why pay the
notational overhead now?

> Also, this is also the
> way it's done in the kernel, so that uniformity helps as well.

Two uglies make a pretty?

Anyway, matter of taste.

[...]

Re: [Qemu-devel] [PATCH 0/2] virtio-serial: Prevent addition of ports with same name

Amit Shah  writes:

> Hi,
>
> The 2nd patch prevents adding ports to the system with conflicts in
> the 'name' parameter.  This is done by going through all the ports in
> the system, including ports on other virtio-serial devices.
>
> The first patch prepares for this by creating a linked list of all
> available virtio-serial devices, so they can be walked over to check
> their ports' names.
>
> Please review.

No advice from Andreas on better ways to iterate over devices, yet
(vacation time?).  If there is one, we can always use it on top.

Reviewed-by: Markus Armbruster

[Qemu-devel] [Bug 1353149] Re: qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread Andrey Korolyov

-smp 1,maxcpus=4,sockets=1,cores=4,threads=1

should be

-smp 4,maxcpus=4,sockets=1,cores=4,threads=1

although more human-friendly error is more appropriate there (better
than a silent fallback to either 1- or 4- core topology)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353149

Title:
  qemu 2.1.0 fails to start if number of cores is greater than 1.

Status in QEMU:
  New

Bug description:
  qemu (kvm) 2.1.0 (built from sources) fails to start if number of
  cores is greater than 1.

  relevant part of commandline arguments:

  /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
  i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
  -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

  the error reported is:

  qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets >= 1' failed.
  2014-08-05 21:45:35.825+: shutting down

  however setting 4 sockets with 1 core each allows me to start the
  machine just fine.

  the system is debian wheezy
  Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

  libvirt 1.2.7 (built from sources)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] [PATCH 1/2] virtio-serial: create a linked list of all active devices

2014-08-06 Thread Amit Shah

On (Wed) 06 Aug 2014 [09:27:02], Markus Armbruster wrote:
> Amit Shah  writes:
> 
> > On (Mon) 04 Aug 2014 [13:33:56], Markus Armbruster wrote:
> >> Amit Shah  writes:
> >> 
> >> > To ensure two virtserialports don't get added to the system with the
> >> > same 'name' parameter, we need to access all the ports on all the
> >> > devices added, and compare the names.
> >> >
> >> > We currently don't have a list of all VirtIOSerial devices added to the
> >> > system.  This commit adds a simple linked list in which devices are put
> >> > when they're initialized, and removed when they go away.
> >
> > 
> >
> >> > +struct VirtIOSerialDevices {
> >> > +QLIST_HEAD(, VirtIOSerial) devices;
> >> > +} vserdevices;
> >> > +
> >> 
> >> Any particular reason for stuffing the list into a struct?
> >
> > Not strictly needed for this patch alone, but I think it's cleaner to
> > keep it this way, and when something else comes up that needs a
> > per-device variable, this list will be around.
> 
> Adding the struct when it's needed will be easy, so why pay the
> notational overhead now?

True.

> > Also, this is also the
> > way it's done in the kernel, so that uniformity helps as well.
> 
> Two uglies make a pretty?

Hah!

Amit

Re: [Qemu-devel] [PATCH 0/2] virtio-serial: Prevent addition of ports with same name

2014-08-06 Thread Amit Shah

On (Wed) 06 Aug 2014 [09:27:04], Markus Armbruster wrote:
> Amit Shah  writes:
> 
> > Hi,
> >
> > The 2nd patch prevents adding ports to the system with conflicts in
> > the 'name' parameter.  This is done by going through all the ports in
> > the system, including ports on other virtio-serial devices.
> >
> > The first patch prepares for this by creating a linked list of all
> > available virtio-serial devices, so they can be walked over to check
> > their ports' names.
> >
> > Please review.
> 
> No advice from Andreas on better ways to iterate over devices, yet
> (vacation time?).  If there is one, we can always use it on top.
> 
> Reviewed-by: Markus Armbruster 

Thanks!

I'll wait for a week before submitting a pull req.


Amit

Re: [Qemu-devel] [questions] about qemu log

"Zhang Haoyu"  writes:

 The output is on qemu's stderr.  You are in control of what that
>>> stderr is.
>>>
>>> I don't get why we can configure
>>> -D /path/to/unique/file/name.log
>>>
>>> but we also have to redirect stderr (I didn't checked if the daemonize
>>> option was closing it). What's the purpose of this logfile option?
>>>
>>
>>Well -D will log to file only loggable (i.e. qemu_log()) information
>>(which has all sorts of options and switches). Stderr, is a little
>>more static and should in theory be limited to genuine errors. But if
>>you want a combined log of both you can simply omit -D to default
>>qemu_log output to stderr. This gives you a combined log that you can
>>redirect anywhere. To be honest, this is what I do as a matter of
>>course (2> foo rather than -D foo).
>>
> Maybe we can introduce a new qemu option to specify a error logfile
> where stderr be redirected, like below,
> DEF("elogfile", HAS_ARG, QEMU_OPTION_elogfile, \
> "-elogfile logfile redirect stderr log to logfile(default
> /var/log/qemu/##.log)\n",
> QEMU_ARCH_ALL)
> STEXI
> @item -elogfile @var{logfile}
> @findex -elogfile
> redirect stderr in @var{logfile}
> ETEXI
> then we can set the error log file through qemu command,
> /var/log/qemu/##.log as default.
>

This sounds out-of-scope for QEMU to me and makes a standard flow
non-standard. If prints are going to stderr where should be going
elsewhere they probably should be fixed. Do you have specific examples
of information going to stderr that you would rather go to a log (be
it an error log or something else?).

>>> I use proxmox to manage vm, it dose not redirect qemu's stderr, and
>>> start vm with -daemonize option,
>>> so the error log disappeared.
>>> I want to redirect the error log of qemu to a specified logfile, if
>>> fault happened, I can use the error log to analyze the fault.
>>>
>>> And, why qemu output the error log to stderr instead of a error
>>> logfile which can be configure?
>>
>>Because the code is a mess in that regard.
>>
>>You don't fix that by redirecting stderr wholesale, because that just
>>adds to the mess.  You fix it at the root, one ill-advised fprintf() at
>>a time, as Peter advises:
>>
>
> Sorry, I'm afraid I misunderstand what you mean,
> should I replace all of fprintf(stderr, ...) with qemu_log() ?
> or only some cases where stderr is used where qemu_log should be, as
> Perter advises?

I didn't mean to suggest blind conversion from fprintf() to qemu_log().
Each instance of fprintf() needs to be reviewed before conversion.  I
think that's exactly Peter's advice, too.

> If so, should I still need to redirect the stderr to specified logfile
> in qemu's parent shell/process ?

Probably.  Libvirt does it.

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Il 06/08/2014 07:33, Ming Lei ha scritto:
>> > I played a bit with the following, I hope it's not too naive. I couldn't
>> > see a difference with your patches, but at least one reason for this is
>> > probably that my laptop SSD isn't fast enough to make the CPU the
>> > bottleneck. Haven't tried ramdisk yet, that would probably be the next
>> > thing. (I actually wrote the patch up just for some profiling on my own,
>> > not for comparing throughput, but it should be usable for that as well.)
> This might not be good for the test since it is basically a sequential
> read test, which can be optimized a lot by kernel. And I always use
> randread benchmark.

A microbenchmark already exists in tests/test-coroutine.c, and doesn't
really tell us much; it's obvious that coroutines execute more code, the
question is why it affects the iops performance.

The sequential read should be the right workload.  For fio, you want to
get as many iops as possible to QEMU and so you need randread.  But
qemu-img is not run in a guest and if the kernel optimizes sequential
reads then the bypass should have even more benefits because it makes
userspace proportionally more expensive.

In any case, the patches as written have no hope of being accepted.  If
you "invert" the logic from aio->co to co->aio, that would be much
better even if it's tedious.

Paolo

Re: [Qemu-devel] [PATCH] linux-user: Add missing unlock_user_struct to timer_create.

2014-08-06 Thread Riku Voipio

On Sun, Aug 03, 2014 at 09:45:38AM +1000, Erik de Castro Lopo wrote:
> Signed-off-by: Erik de Castro Lopo 

Thanks,

applied to linux-user tree.

> ---
>  linux-user/syscall.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index a50229d..5f22b37 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -9432,6 +9432,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
> arg1,
>  host_sevp.sigev_signo = tswap32(ptarget_sevp->sigev_signo);
>  host_sevp.sigev_notify = tswap32(ptarget_sevp->sigev_notify);
>  
> +unlock_user_struct(ptarget_sevp, arg2, 0);
>  phost_sevp = &host_sevp;
>  }
>  
> -- 
> 2.0.1
>

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

Il 05/08/2014 20:57, Dr. David Alan Gilbert ha scritto:
> (CC Paolo)
> 
> * William Dauchy (wdau...@gmail.com) wrote:
>> Hello,
>>
>> For a qemu live migration from 2.0 to 2.1 I'm getting:
>>
>> qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 3000 in != 2
>> qemu-system-x86_64: Unknown ramblock "/table-loader", cannot accept migration
>> qemu-system-x86_64: Ack, bad migration stream!
>> qemu-system-x86_64: Illegal RAM offset f0e2d500f06000
>> qemu: warning: error while loading state for instance 0x0 of device 'ram'
>> qemu-system-x86_64: load of migration failed: Invalid argument
>>
>> I was wondering if it was a bug or if I was supposed to do something
>> to avoid this error.
> 
> That should work.
> 
>> I saw the bug report for qemu1.7 but it's not my case.
> 
> Can you confirm this is on the final 2.1 release (there was a fix that
> went in just around rc5).
> 
> What's your command line on both ends?

I suspect he's using "-M pc" on both.  You must use "-M pc-i440fx-2.0"
on both if you're migrating from 2.0 to a different version.

Paolo

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

Il 06/08/2014 08:16, Markus Armbruster ha scritto:
>> for the receiver, I'm using 2.1
>> # qemu-system-x86_64 --version
>> QEMU emulator version 2.1.0, Copyright (c) 2003-2008 Fabrice Bellar
>>
>>
>> for the sender side, here is the qmp result:
>>
>> (QEMU) query-version
>> {   u'return': {   u'package': u'',
>>u'qemu': {   u'major': 2, u'micro': 50, u'minor': 0}}}
>>

This is a development version between 2.0.0 and 2.0.0-rc1; it may have
bugs such as the one that was fixed in 2.0.0-rc5.

Migration from development versions is not supported.  You must either
upgrade to 2.1.0, or downgrade to the released 2.0.0 version.

Paolo

Re: [Qemu-devel] [PATCH v3 for-2.2 0/8] don't use Yoda conditions

"Michael S. Tsirkin"  writes:

> On Wed, Aug 06, 2014 at 08:05:46AM +0200, Markus Armbruster wrote:
>> "Gonglei (Arei)"  writes:
>> 
>> > Hi,
>> >
>> >> >
>> >> > $WHATEVER: don't use 'Yoda conditions'
>> >> >
>> >> > 'Yoda conditions' are not part of idiomatic QEMU coding
>> >> > style, so rewrite them in the more usual order.
>> >> 
>> >> 
>> >> OK but why stop at these files? How about this
>> >> instead?
>> >> 
>> > I just search c files by using key words like "NULL ==" etc.
>> >
>> > I don't think we should change conditional statements like ">" and ">=".
>> 
>> Eric pointed out it's actually incorrect for NaNs.
>> 
>> If you want to touch inequalities, separate patch(es) please, because
>> they need more thorough review, both for correctness and for style.
>> 
>> > BTW, just using like "value == NULL" instead of "NULL == value" in all 
>> > files
>> > is not a good idea, which we have discussed in my patch serials
>> > v2. So, I posted
>> > v3, add change log " imitate nearby code about using '!value' or
>> > value == NULL' at
>> > every patch " .
>> 
>> Re "not a good idea": I think rewriting "NULL == value" to "value ==
>> NULL" *is* a good idea, but rewriting it to "!value" where that blends
>> in with surrounding code is a *better* idea.
>> 
>> Gonglei's patches do that, Michael's don't, but are more complete.
>> Therefore:
>
> Yes but it's unrelated to Yoda: we have x != NULL without Yoda
> in a lot of places. So this seems, to me, an unrelated issue.

Actually, the relation is clear to me.  The patch cleans up style, by
converting Yoda conditionals into the locally appropriate form.

"Locally appropriate" because the optimal choice between "x == NULL" and
"!x" depends on which form the context uses.

Gonglei made those choices.  It's only a small improvement, and I
wouldn't demand it, but since we already have it, let's not throw it
away.

> If people feel this == NULL -> !x is desired, it's better to do it all at
> once IMHO, and do x != NULL -> x at the same time.

Nope, because there's no consensus.  See thread leading to
https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg00033.html

> Easy to run another script to do it on top.
>
>> 
>> > So, maybe you can post patches for those files I have missed in
>> > the serials,
>> > but not simply instead all by semantic script IMO, thanks!
>> 
>> Easy: apply Gonglei's patches before you run the script.
>> 
>> You may have to split patches along subsystem boundaries to get them in.
>> Bothersome, as it involves guessing boundaries.  Not a request from me,
>> just a warning of possible misfortune :)
>
> It's going in through trivial tree, I don't think split-up is necessary.

Hope you have better luck than me, because when I try to route tree-wide
cleanups via -trivial, I usually don't have any.

[Qemu-devel] [PATCH] linux-user: redirect openat calls

2014-08-06 Thread riku . voipio

From: Riku Voipio 

While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.

Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.

Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.

Cc: Mikhail Ilin 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index c8c2b4c..dd77673 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -294,7 +294,6 @@ static int sys_getcwd1(char *buf, size_t size)
   return strlen(buf)+1;
 }
 
-#ifdef TARGET_NR_openat
 static int sys_openat(int dirfd, const char *pathname, int flags, mode_t mode)
 {
   /*
@@ -306,7 +305,6 @@ static int sys_openat(int dirfd, const char *pathname, int 
flags, mode_t mode)
   }
   return (openat(dirfd, pathname, flags));
 }
-#endif
 
 #ifdef TARGET_NR_utimensat
 #ifdef CONFIG_UTIMENSAT
@@ -5274,7 +5272,7 @@ static int open_net_route(void *cpu_env, int fd)
 }
 #endif
 
-static int do_open(void *cpu_env, const char *pathname, int flags, mode_t mode)
+static int do_openat(void *cpu_env, int dirfd, const char *pathname, int 
flags, mode_t mode)
 {
 struct fake_open {
 const char *filename;
@@ -5295,7 +5293,7 @@ static int do_open(void *cpu_env, const char *pathname, 
int flags, mode_t mode)
 
 if (is_proc_myself(pathname, "exe")) {
 int execfd = qemu_getauxval(AT_EXECFD);
-return execfd ? execfd : get_errno(open(exec_path, flags, mode));
+return execfd ? execfd : get_errno(sys_openat(dirfd, exec_path, flags, 
mode));
 }
 
 for (fake_open = fakes; fake_open->filename; fake_open++) {
@@ -5329,7 +5327,7 @@ static int do_open(void *cpu_env, const char *pathname, 
int flags, mode_t mode)
 return fd;
 }
 
-return get_errno(open(path(pathname), flags, mode));
+return get_errno(sys_openat(dirfd, path(pathname), flags, mode));
 }
 
 /* do_syscall() should always have a single exit point at the end so
@@ -5404,22 +5402,19 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 case TARGET_NR_open:
 if (!(p = lock_user_string(arg1)))
 goto efault;
-ret = get_errno(do_open(cpu_env, p,
-target_to_host_bitmask(arg2, fcntl_flags_tbl),
-arg3));
+ret = get_errno(do_openat(cpu_env, AT_FDCWD, p,
+  target_to_host_bitmask(arg2, 
fcntl_flags_tbl),
+  arg3));
 unlock_user(p, arg1, 0);
 break;
-#if defined(TARGET_NR_openat) && defined(__NR_openat)
 case TARGET_NR_openat:
 if (!(p = lock_user_string(arg2)))
 goto efault;
-ret = get_errno(sys_openat(arg1,
-   path(p),
-   target_to_host_bitmask(arg3, 
fcntl_flags_tbl),
-   arg4));
+ret = get_errno(do_openat(cpu_env, arg1, p,
+  target_to_host_bitmask(arg3, 
fcntl_flags_tbl),
+  arg4));
 unlock_user(p, arg2, 0);
 break;
-#endif
 case TARGET_NR_close:
 ret = get_errno(close(arg1));
 break;
-- 
2.0.1

Re: [Qemu-devel] [PATCH] linux-user: /proc/self/maps content

2014-08-06 Thread Riku Voipio

On Tue, Aug 05, 2014 at 05:24:27PM +0400, Mikhail Ilin wrote:
> I've tested the sample for Aarch64 myself and found that the
> approach should also work fine.
> 
> Translation layout:
> 
> $ qemu-aarch64 -strace /tmp/busybox-static cat /proc/self/maps
> 
> startend  size prot
> 0040-005ba000 001ba000 r-x
> 005c9000-005d3000 a000 rw-
> 0040-00401000 1000 ---
> 00401000-004000801000 0080 rw-
> 
> /proc/self/maps output:
> 
> 0040-005ba000 r-xp  08:01 28837016  /tmp/busybox-static
> 005ba000-005c9000 ---p  00:00 0
> 005c9000-005cc000 rw-p 001b9000 08:01 28837016  /tmp/busybox-static
> 005cc000-005f4000 rw-p  00:00 0
> 6000-602eb000 r-xp  08:01 55578769
> /home/michail/my1/bin/qemu-aarch64
> 604eb000-604f6000 rw-p 002eb000 08:01 55578769
> /home/michail/my1/bin/qemu-aarch64
> 604f6000-6054a000 rw-p  00:00 0
> 6054a000-6254b000 rwxp  00:00 0
> 6254b000-62577000 rw-p  00:00 0
> 63396000-633da000 rw-p  00:00 0  [heap]
> 40-401000 ---p  00:00 0
> 401000-4000801000 rw-p  00:00 0
> 7ff830cab000-7ff8348fb000 rw-p  00:00 0
> 7fffb26ed000-7fffb270e000 rw-p  00:00 0  [stack]
> 7fffb27bb000-7fffb27bd000 r-xp  00:00 0  [vdso]
> ff60-ff601000 r-xp  00:00 0  [vsyscall]
> 
> And the reason why it doesn't work for Aarch64 is openat call which is
> used instead of open.
> 
> $ qemu-aarch64 -strace /tmp/busybox-static cat /proc/self/maps
> 
> 483 setgid(1000,0,47,45,0,274886296116) = 0
> 483 setuid(1000,0,47,45,0,274886296116) = 0
> 483 openat(AT_FDCWD,"/proc/self/maps",O_RDONLY) = 3
> 483 read(3,0x7febf0,4096) = 1071
> 
> this call doesn't have additional preprocessing and called directly.
> 
>case TARGET_NR_openat:
> if (!(p = lock_user_string(arg2)))
> goto efault;
> ret = get_errno(sys_openat(arg1,
>path(p),
>target_to_host_bitmask(arg3,
> fcntl_flags_tbl),
>arg4));
> 
> I believe OpenRISC case looks the same.

Thanks for looking into it. I just sent a patch that adds preprocessing
to openat, and seems to clear the issue for both aarch64 and OpenRISC.

Riku

> 
> On 05.08.2014 15:47, Riku Voipio wrote:
> >Hi,
> >
> >On Tue, Aug 05, 2014 at 03:10:07PM +0400, Mikhail Ilyin wrote:
> >>Build /proc/self/maps doing a match against guest memory translation table.
> >>Output only that map records which are valid for guest memory layout.
> >
> >This is clear improvement, for most archs. But seems aarch64, openrisc still
> >leak host maps. It's not a regression, same issue before the patch.
> >
> >+ /home/voipio/linaro/qemu/obj/alpha-linux-user/qemu-alpha 
> >/home/voipio/linaro/qemu-smoke/alpha/busybox cat /proc/self/maps
> >00012000-0001201cc000 r-xp  fe:00 8784862  
> >/home/voipio/linaro/qemu-smoke/alpha/busybox
> >0001201dc000-0001201e rw-p 001cc000 fe:00 8784862  
> >/home/voipio/linaro/qemu-smoke/alpha/busybox
> >0001201e-00012020a000 rw-p  00:00 0
> >0040-00402000 ---p  00:00 0
> >00402000-004000802000 rw-p  00:00 0
> >[stack]
> >+ /home/voipio/linaro/qemu/obj/arm-linux-user/qemu-arm 
> >/home/voipio/linaro/qemu-smoke/armel/busybox cat /proc/self/maps
> >8000-0014b000 r-xp  fe:00 8784905  
> >/home/voipio/linaro/qemu-smoke/armel/busybox
> >00153000-00154000 rw-p 00143000 fe:00 8784905  
> >/home/voipio/linaro/qemu-smoke/armel/busybox
> >00154000-0017b000 rw-p  00:00 0
> >f67ff000-f680 ---p  00:00 0
> >f680-f700 rw-p  00:00 0[stack]
> >+ /home/voipio/linaro/qemu/obj/aarch64-linux-user/qemu-aarch64 
> >/home/voipio/linaro/qemu-smoke/arm64/busybox cat /proc/self/maps
> >0040-00572000 r-xp  fe:00 8784917
> >/home/voipio/linaro/qemu-smoke/arm64/busybox
> >00572000-00581000 ---p  00:00 0
> >00581000-00584000 rw-p 00171000 fe:00 8784917
> >/home/voipio/linaro/qemu-smoke/arm64/busybox
> >00584000-005ac000 rw-p  00:00 0
> >40-401000 ---p  00:00 0
> >401000-4000811000 rw-p  00:00 0
> >7f38e312b000-7f38e6d2b000 rw-p  00:00 0
> >7f38e6d2b000-7f38e6d86000 r-xp  fe:00 5242918
> >/lib/x86_64-linux-gnu/libpcre.so.3.13.1
> >7f38e6d86000-7f38e6f86000 ---p 0005b000 fe:00 5242918
> >/lib/x86_64-linux-gnu/libpcre.so.3.13.1
> >7f38e6f86000-7f38e6f87000 rw-p 0005b000 fe:00 5242918
> >/lib/x86_64-linux-gnu/libpcre.so.3.13.1
> >7f38e6f87000-7f38e7126000 r-xp  fe:00 5248993
> >/lib/x86_64-lin

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini  wrote:
> Il 06/08/2014 07:33, Ming Lei ha scritto:
>>> > I played a bit with the following, I hope it's not too naive. I couldn't
>>> > see a difference with your patches, but at least one reason for this is
>>> > probably that my laptop SSD isn't fast enough to make the CPU the
>>> > bottleneck. Haven't tried ramdisk yet, that would probably be the next
>>> > thing. (I actually wrote the patch up just for some profiling on my own,
>>> > not for comparing throughput, but it should be usable for that as well.)
>> This might not be good for the test since it is basically a sequential
>> read test, which can be optimized a lot by kernel. And I always use
>> randread benchmark.
>
> A microbenchmark already exists in tests/test-coroutine.c, and doesn't
> really tell us much; it's obvious that coroutines execute more code, the
> question is why it affects the iops performance.

Could you take a look at the coroutine benchmark I worte?  The running
result shows coroutine does decrease performance a lot compared with
bypass coroutine like the patchset is doing.

>
> The sequential read should be the right workload.  For fio, you want to
> get as many iops as possible to QEMU and so you need randread.  But
> qemu-img is not run in a guest and if the kernel optimizes sequential
> reads then the bypass should have even more benefits because it makes
> userspace proportionally more expensive.
>
> In any case, the patches as written have no hope of being accepted.  If
> you "invert" the logic from aio->co to co->aio, that would be much
> better even if it's tedious.

Let's not talk the bypass patch, and see if the coroutine is really the cause
of performance drop first.

Thanks,

[Qemu-devel] [Bug 1353346] [NEW] ARMv7-M software-triggered interrupts-- unexpected behaviour

2014-08-06 Thread Boris Feigin

Public bug reported:

The handling of the NVIC "Software Triggered Interrupt Register" in
qemu-2.1.0/hw/intc/armv7m_nvic.c:375 isn't quite right.  As things
stand, writing a zero to the STIR ends up transferring control to vector
table entry zero, which, on ARMv7-M, holds the reset value of the stack
pointer.  That's what I see with lm3s811evb emulation, and that's not
what happens on my STM NUCLEO-F103RB board (Cortex-M3).

Seems like an oversight-- the handler probably wants
armv7m_nvic_set_pending(), not gic_set_pending_private(), and the IRQ
number needs 16 added onto it to get the exception number for the
interrupt.

ARM DUI 0552A (Cortex-M3 Devices: Generic User's Guide), p. 134:
  "Interrupt ID of the interrupt to trigger, in the range 0-239. For example, a 
value of 0x03 specifies interrupt IRQ3."

Cheers,
Boris

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353346

Title:
  ARMv7-M software-triggered interrupts-- unexpected behaviour

Status in QEMU:
  New

Bug description:
  The handling of the NVIC "Software Triggered Interrupt Register" in
  qemu-2.1.0/hw/intc/armv7m_nvic.c:375 isn't quite right.  As things
  stand, writing a zero to the STIR ends up transferring control to
  vector table entry zero, which, on ARMv7-M, holds the reset value of
  the stack pointer.  That's what I see with lm3s811evb emulation, and
  that's not what happens on my STM NUCLEO-F103RB board (Cortex-M3).

  Seems like an oversight-- the handler probably wants
  armv7m_nvic_set_pending(), not gic_set_pending_private(), and the IRQ
  number needs 16 added onto it to get the exception number for the
  interrupt.

  ARM DUI 0552A (Cortex-M3 Devices: Generic User's Guide), p. 134:
"Interrupt ID of the interrupt to trigger, in the range 0-239. For example, 
a value of 0x03 specifies interrupt IRQ3."

  Cheers,
  Boris

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353346/+subscriptions

Re: [Qemu-devel] [RFC PATCH v2 09/10] virtio-scsi-dataplane: Code to run virtio-scsi on iothread

Il 06/08/2014 07:35, Fam Zheng ha scritto:
>  ifeq ($(CONFIG_VIRTIO),y)
> -obj-y += virtio-scsi.o
> +obj-y += virtio-scsi.o virtio-scsi-dataplane.o
>  obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
>  endif

I first thought that this must be conditional on 
CONFIG_VIRTIO_BLK_DATA_PLANE.  However, CONFIG_VIRTIO_BLK_DATA_PLANE is 
itself obsolete:

##
# adjust virtio-blk-data-plane based on linux-aio

if test "$virtio_blk_data_plane" = "yes" -a \
"$linux_aio" != "yes" ; then
  error_exit "virtio-blk-data-plane requires Linux AIO, please try 
--enable-linux-aio"
elif test -z "$virtio_blk_data_plane" ; then
  virtio_blk_data_plane=$linux_aio
fi

and there's no requirement to have Linux AIO anymore.  Can you prepare a
patch to drop CONFIG_VIRTIO_BLK_DATA_PLANE, and replace patch 1 with it?

We can leave --disable-virtio-blk-data-plane and --enable-virtio-blk-data-plane
in for a couple of releases.

Paolo

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
> Hi Kevin,
> 
> On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf  wrote:
> > Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
> >> I have been wondering how to prove that the root cause is the ucontext
> >> coroutine mechanism (stack switching).  Here is an idea:
> >>
> >> Hack your "bypass" code path to run the request inside a coroutine.
> >> That way you can compare "bypass without coroutine" against "bypass with
> >> coroutine".
> >>
> >> Right now I think there are doubts because the bypass code path is
> >> indeed a different (and not 100% correct) code path.  So this approach
> >> might prove that the coroutines are adding the overhead and not
> >> something that you bypassed.
> >
> > My doubts aren't only that the overhead might not come from the
> > coroutines, but also whether any coroutine-related overhead is really
> > unavoidable. If we can optimise coroutines, I'd strongly prefer to do
> > just that instead of introducing additional code paths.
> 
> OK, thank you for taking look at the problem, and hope we can
> figure out the root cause, :-)
> 
> >
> > Another thought I had was this: If the performance difference is indeed
> > only coroutines, then that is completely inside the block layer and we
> > don't actually need a VM to test it. We could instead have something
> > like a simple qemu-img based benchmark and should be observing the same.
> 
> Even it is simpler to run a coroutine-only benchmark, and I just
> wrote a raw one, and looks coroutine does decrease performance
> a lot, please see the attachment patch, and thanks for your template
> to help me add the 'co_bench' command in qemu-img.

Yes, we can look at coroutines microbenchmarks in isolation. I actually
did do that yesterday with the yield test from tests/test-coroutine.c.
And in fact profiling immediately showed something to optimise:
pthread_getspecific() was quite high, replacing it by __thread on
systems where it works is more efficient and helped the numbers a bit.
Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
in qemu-img bench), maybe there's even something that can be done here.

However, I just wasn't sure whether a change on this level would be
relevant in a realistic environment. This is the reason why I wanted to
get a benchmark involving the block layer and some I/O.

> From the profiling data in below link:
> 
> http://pastebin.com/YwH2uwbq
> 
> With coroutine, the running time for same loading is increased
> ~50%(1.325s vs. 0.903s), and dcache load events is increased
> ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
> 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
> 
> The bypass code in the benchmark is very similar with the approach
> used in the bypass patch, since linux-aio with O_DIRECT seldom
> blocks in the the kernel I/O path.
> 
> Maybe the benchmark is a bit extremely, but given modern storage
> device may reach millions of IOPS, and it is very easy to slow down
> the I/O by coroutine.

I think in order to optimise coroutines, such benchmarks are fair game.
It's just not guaranteed that the effects are exactly the same on real
workloads, so we should take the results with a grain of salt.

Anyhow, the coroutine version of your benchmark is buggy, it leaks all
coroutines instead of exiting them, so it can't make any use of the
coroutine pool. On my laptop, I get this (where fixed coroutine is a
version that simply removes the yield at the end):

| bypass| fixed coro| buggy coro
+---+---+--
time| 1.09s | 1.10s | 1.62s
L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
insns per cycle | 2.39  | 2.39  | 1.90

Begs the question whether you see a similar effect on a real qemu and
the coroutine pool is still not big enough? With correct use of
coroutines, the difference seems to be barely measurable even without
any I/O involved.

> > I played a bit with the following, I hope it's not too naive. I couldn't
> > see a difference with your patches, but at least one reason for this is
> > probably that my laptop SSD isn't fast enough to make the CPU the
> > bottleneck. Haven't tried ramdisk yet, that would probably be the next
> > thing. (I actually wrote the patch up just for some profiling on my own,
> > not for comparing throughput, but it should be usable for that as well.)
> 
> This might not be good for the test since it is basically a sequential
> read test, which can be optimized a lot by kernel. And I always use
> randread benchmark.

Yes, I shortly pondered whether I should implement random offsets
instead. But then I realised that a quicker kernel operation would only
help the benchmark because we want it to test the CPU consumption in
userspace. So the faster the kernel gets, the better for us, because it
should make the impact of coroutines bigger.

Kevin

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Il 06/08/2014 10:38, Ming Lei ha scritto:
> On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini  wrote:
>> Il 06/08/2014 07:33, Ming Lei ha scritto:
> I played a bit with the following, I hope it's not too naive. I couldn't
> see a difference with your patches, but at least one reason for this is
> probably that my laptop SSD isn't fast enough to make the CPU the
> bottleneck. Haven't tried ramdisk yet, that would probably be the next
> thing. (I actually wrote the patch up just for some profiling on my own,
> not for comparing throughput, but it should be usable for that as well.)
>>> This might not be good for the test since it is basically a sequential
>>> read test, which can be optimized a lot by kernel. And I always use
>>> randread benchmark.
>>
>> A microbenchmark already exists in tests/test-coroutine.c, and doesn't
>> really tell us much; it's obvious that coroutines execute more code, the
>> question is why it affects the iops performance.
> 
> Could you take a look at the coroutine benchmark I worte?  The running
> result shows coroutine does decrease performance a lot compared with
> bypass coroutine like the patchset is doing.

Your benchmark is synchronous, while disk I/O is asynchronous.

Your benchmark doesn't add much compared to "time tests/test-coroutine
-m perf  -p /perf/yield".  It takes 8 seconds on my machine, and 10^8
function calls obviously take less than 8 seconds.  I've sent a patch to
add a "baseline" function call benchmark to test-coroutine.

>> The sequential read should be the right workload.  For fio, you want to
>> get as many iops as possible to QEMU and so you need randread.  But
>> qemu-img is not run in a guest and if the kernel optimizes sequential
>> reads then the bypass should have even more benefits because it makes
>> userspace proportionally more expensive.

Do you agree with this?

Paolo

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 9:47 AM, Paolo Bonzini  wrote:
> I suspect he's using "-M pc" on both.

Is it the default value? because it's not the case in my command line.

> You must use "-M pc-i440fx-2.0"
> on both if you're migrating from 2.0 to a different version.

wow. I wasn't expecting such behavior; i.e migration between qemu
version does not seem trivial and it removes the benefits; better
using stop/start.
thanks for the info.

-- 
William

Re: [Qemu-devel] [RFC PATCH v2 09/10] virtio-scsi-dataplane: Code to run virtio-scsi on iothread

2014-08-06 Thread Fam Zheng

On Wed, 08/06 10:45, Paolo Bonzini wrote:
> Il 06/08/2014 07:35, Fam Zheng ha scritto:
> >  ifeq ($(CONFIG_VIRTIO),y)
> > -obj-y += virtio-scsi.o
> > +obj-y += virtio-scsi.o virtio-scsi-dataplane.o
> >  obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
> >  endif
> 
> I first thought that this must be conditional on 
> CONFIG_VIRTIO_BLK_DATA_PLANE.  However, CONFIG_VIRTIO_BLK_DATA_PLANE is 
> itself obsolete:
> 
> ##
> # adjust virtio-blk-data-plane based on linux-aio
> 
> if test "$virtio_blk_data_plane" = "yes" -a \
> "$linux_aio" != "yes" ; then
>   error_exit "virtio-blk-data-plane requires Linux AIO, please try 
> --enable-linux-aio"
> elif test -z "$virtio_blk_data_plane" ; then
>   virtio_blk_data_plane=$linux_aio
> fi
> 
> and there's no requirement to have Linux AIO anymore.  Can you prepare a
> patch to drop CONFIG_VIRTIO_BLK_DATA_PLANE, and replace patch 1 with it?
> 
> We can leave --disable-virtio-blk-data-plane and 
> --enable-virtio-blk-data-plane
> in for a couple of releases.
> 

Yes. Sounds a good idea.

Fam

Re: [Qemu-devel] Are -cdrom/-hda (or -drive if=ide) supposed to work in q35?

Am 05.08.2014 um 23:14 hat John Snow geschrieben:
> 
> 
> On 08/05/2014 04:30 AM, Kevin Wolf wrote:
> >Am 01.08.2014 um 22:10 hat John Snow geschrieben:
> >>
> >>On 06/12/2014 05:03 AM, Markus Armbruster wrote:
> >>>-drive mixes up configuration of backend and frontend (a.k.a. device
> >>>  model), as follows:
> >>>
> >>>1. It always defines a backend.  "info block" shows them.
> >>>
> >>>2. It always defines a few frontend configuration bits for the device
> >>>models to pick up.
> >>>
> >>>3. Except with if=none, it posts a request to board code to create a
> >>>suitable frontend.  It's up to the board code to honor, reject or
> >>>ignore the request.  The i440FX boards honor it, the Q35 boards
> >>>ignore it.
> >>>
> >>>Nobody has gotten around to making the Q35 boards honor it, in part
> >>>because there has been some confusion on what if=ide is supposed to
> >>>mean on Q35.  Should it connect an ide-hd / ide-cd in SATA mode or in
> >>>legacy PATA mode?
> >>>
> >>>I've always argued for SATA, because for me if=ide does *not* imply a
> >>>specific kind of HBA any more than if=scsi does, and the "natural"
> >>>HBA for Q35 is AHCI in SATA mode.
> >>>
> >>>Kevin (cc'ed) has argued for a way to make it connect in legacy PATA
> >>>mode.  I'd be fine with that, as long as it's off by default.
> >>>
> >>>Patches welcome.
> >>>
> >>
> >>Kevin, (or anyone else with an opinion for that matter), what is the
> >>reasoning behind wanting -cdrom to use the old PATA interfaces?
> >
> >The assumption that makes it a problem is that sooner or later we'll
> >want to make Q35 the default. Most of the changes this brings in will
> >make the guest see different, but generally still compatible hardware.
> >AHCI however is not compatible with IDE in the sense that an OS that has
> >only an IDE driver won't work with AHCI.
> >
> >It wouldn't be reasonable to break things like '-hda winxp.qcow2'. And
> >in fact, if the internet is right, even newer Windows version give you
> >trouble when you suddenly change from IDE to AHCI. So after an upgrade
> >many users would find their existing guests to be broken.
> >
> >>For at least the immediate future, the AHCI device doesn't support
> >>the mixed-mode SATA/PATA access models, though I suppose we could,
> >>it seems like a more obvious and simple solution to just allow the
> >>shorthand syntactic sugar commands to use the native bus of the
> >>system until you specify otherwise.
> >>
> >>I think I will probably begin writing a patch under this assumption
> >>unless there is a better technical reason not to.
> >
> >I think the solution that was generally agreed on was to introduce a
> >machine option that decides whether to provide an IDE or AHCI interface
> >(similar to the BIOS option that commonly exists on real hardware). We
> >just need someone to implement it.
> >
> >Kevin
> >
> 
> OK, though for what it's worth I think that on real hardware, that
> BIOS switch toggles configuration bits on the AHCI device that
> actually changes its signature into a "new device."

I'm not completely sure, but afaik these bits aren't actually in the
AHCI standard, but vendor-specific extensions. If you toggle them,
apparently the device "magically" turns into a different one, including
different PCI IDs and things like that.

I think we can safely take the shortcut of directly creating the right
device and leaving the BIOS alone.

> I suppose in our case we could create a machine option that toggled
> the behavior of the AHCI device appropriately to be either IDE or
> AHCI and treated the syntactic sugar commands appropriately, though
> you still may run into problems with Windows guests if you ever
> change the default machine type -- I have a hunch that Windows would
> get rather grumpy if you swapped out its IDE device from under it
> with even the emulated ICH9 one in legacy mode, but I suppose we can
> cross that bridge when we get there.
> 
> For now, in the meantime, I will assume that -M q35 also implies the
> usage of AHCI, and treat the shorthands accordingly.

Okay.

Kevin

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 9:49 AM, Paolo Bonzini  wrote:
> This is a development version between 2.0.0 and 2.0.0-rc1; it may have
> bugs such as the one that was fixed in 2.0.0-rc5.

you mean 2.1.0-rc5

> Migration from development versions is not supported.  You must either
> upgrade to 2.1.0, or downgrade to the released 2.0.0 version.

in all case I would have the same issue with 2.0.0 version since the
fix has been pushed in 2.1.0-rc5

-- 
William

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

2014-08-06 Thread Dr. David Alan Gilbert

* William Dauchy (wdau...@gmail.com) wrote:
> On Wed, Aug 6, 2014 at 9:47 AM, Paolo Bonzini  wrote:
> > I suspect he's using "-M pc" on both.
> 
> Is it the default value? because it's not the case in my command line.
> 
> > You must use "-M pc-i440fx-2.0"
> > on both if you're migrating from 2.0 to a different version.
> 
> wow. I wasn't expecting such behavior; i.e migration between qemu
> version does not seem trivial and it removes the benefits; better
> using stop/start.
> thanks for the info.

The trick is to pick a -M value and stick with it; then you should be
able to keep migrating to newer QEMU versions easily (just don't
go with dev versions because things are often broken in them).

-M pc   is special, don't use that if you want to be able to migrate

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 11:17 AM, Dr. David Alan Gilbert
 wrote:
> The trick is to pick a -M value and stick with it; then you should be
> able to keep migrating to newer QEMU versions easily (just don't
> go with dev versions because things are often broken in them).

ok
for dev versions, I'm just cherry picking interesting fixes which are
not yet released in stable.

> -M pc   is special, don't use that if you want to be able to migrate

understood, I guess it's the default value
-- 
William

[Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls

This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 1 iterations: 0.173884 s
   Yield 1 iterations: 8.445064 s
   Lifecycle 100 iterations: 0.098445 s
   Nesting 1 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 1 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 1 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 1 = 738.9

Signed-off-by: Paolo Bonzini 
---
 tests/test-coroutine.c | 24 
 1 files changed, 24 insertions(+)

diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
index 760636d..6e634f4 100644
--- a/tests/test-coroutine.c
+++ b/tests/test-coroutine.c
@@ -288,6 +288,29 @@ static void perf_yield(void)
 maxcycles, duration);
 }
 
+static __attribute__((noinline)) void dummy(unsigned *i)
+{
+(*i)--;
+}
+
+static void perf_baseline(void)
+{
+unsigned int i, maxcycles;
+double duration;
+
+maxcycles = 1;
+i = maxcycles;
+
+g_test_timer_start();
+while (i > 0) {
+dummy(&i);
+}
+duration = g_test_timer_elapsed();
+
+g_test_message("Function call %u iterations: %f s\n",
+maxcycles, duration);
+}
+
 int main(int argc, char **argv)
 {
 g_test_init(&argc, &argv, NULL);
@@ -301,6 +324,7 @@ int main(int argc, char **argv)
 g_test_add_func("/perf/lifecycle", perf_lifecycle);
 g_test_add_func("/perf/nesting", perf_nesting);
 g_test_add_func("/perf/yield", perf_yield);
+g_test_add_func("/perf/function-call", perf_baseline);
 }
 return g_test_run();
 }
-- 
1.9.3

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf  wrote:
> Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
>> Hi Kevin,
>>
>> On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf  wrote:
>> > Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
>> >> I have been wondering how to prove that the root cause is the ucontext
>> >> coroutine mechanism (stack switching).  Here is an idea:
>> >>
>> >> Hack your "bypass" code path to run the request inside a coroutine.
>> >> That way you can compare "bypass without coroutine" against "bypass with
>> >> coroutine".
>> >>
>> >> Right now I think there are doubts because the bypass code path is
>> >> indeed a different (and not 100% correct) code path.  So this approach
>> >> might prove that the coroutines are adding the overhead and not
>> >> something that you bypassed.
>> >
>> > My doubts aren't only that the overhead might not come from the
>> > coroutines, but also whether any coroutine-related overhead is really
>> > unavoidable. If we can optimise coroutines, I'd strongly prefer to do
>> > just that instead of introducing additional code paths.
>>
>> OK, thank you for taking look at the problem, and hope we can
>> figure out the root cause, :-)
>>
>> >
>> > Another thought I had was this: If the performance difference is indeed
>> > only coroutines, then that is completely inside the block layer and we
>> > don't actually need a VM to test it. We could instead have something
>> > like a simple qemu-img based benchmark and should be observing the same.
>>
>> Even it is simpler to run a coroutine-only benchmark, and I just
>> wrote a raw one, and looks coroutine does decrease performance
>> a lot, please see the attachment patch, and thanks for your template
>> to help me add the 'co_bench' command in qemu-img.
>
> Yes, we can look at coroutines microbenchmarks in isolation. I actually
> did do that yesterday with the yield test from tests/test-coroutine.c.
> And in fact profiling immediately showed something to optimise:
> pthread_getspecific() was quite high, replacing it by __thread on
> systems where it works is more efficient and helped the numbers a bit.
> Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
> in qemu-img bench), maybe there's even something that can be done here.

The lock/unlock in dataplane is often from memory_region_find(), and Paolo
should have done lots of work on that.

>
> However, I just wasn't sure whether a change on this level would be
> relevant in a realistic environment. This is the reason why I wanted to
> get a benchmark involving the block layer and some I/O.
>
>> From the profiling data in below link:
>>
>> http://pastebin.com/YwH2uwbq
>>
>> With coroutine, the running time for same loading is increased
>> ~50%(1.325s vs. 0.903s), and dcache load events is increased
>> ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
>> 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
>>
>> The bypass code in the benchmark is very similar with the approach
>> used in the bypass patch, since linux-aio with O_DIRECT seldom
>> blocks in the the kernel I/O path.
>>
>> Maybe the benchmark is a bit extremely, but given modern storage
>> device may reach millions of IOPS, and it is very easy to slow down
>> the I/O by coroutine.
>
> I think in order to optimise coroutines, such benchmarks are fair game.
> It's just not guaranteed that the effects are exactly the same on real
> workloads, so we should take the results with a grain of salt.
>
> Anyhow, the coroutine version of your benchmark is buggy, it leaks all
> coroutines instead of exiting them, so it can't make any use of the
> coroutine pool. On my laptop, I get this (where fixed coroutine is a
> version that simply removes the yield at the end):
>
> | bypass| fixed coro| buggy coro
> +---+---+--
> time| 1.09s | 1.10s | 1.62s
> L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
> insns per cycle | 2.39  | 2.39  | 1.90
>
> Begs the question whether you see a similar effect on a real qemu and
> the coroutine pool is still not big enough? With correct use of
> coroutines, the difference seems to be barely measurable even without
> any I/O involved.

When I comment qemu_coroutine_yield(), looks result of
bypass and fixed coro is very similar as your test, and I am just
wondering if stack is always switched in qemu_coroutine_enter()
without calling qemu_coroutine_yield().

Without the yield, the benchmark can't emulate coroutine usage in
bdrv_aio_readv/writev() path any more, and bypass in the patchset
skips two qemu_coroutine_enter() and one qemu_coroutine_yield()
for each bdrv_aio_readv/writev().

>
>> > I played a bit with the following, I hope it's not too naive. I couldn't
>> > see a difference with your patches, but at least one reason for this is
>> > probably that my laptop SSD isn't fast enough to make the CPU

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

2014-08-06 Thread Stefan Hajnoczi

On Wed, Aug 06, 2014 at 01:33:36PM +0800, Ming Lei wrote:
> With coroutine, the running time for same loading is increased
> ~50%(1.325s vs. 0.903s), and dcache load events is increased

I agree with Paolo about microbenchmarks.  We need to do I/O to get a
realistic picture of performance, since there is little point is
optimizing something that is not a significant factor in overall
performance.

But I also wanted to say that these benchmark durations are so short
that they can be greatly affected by outliers (e.g. scheduler behavior,
system background activity, etc).  Run benchmarks for 2 minutes to
reduce variance and give the system time to "warm up".

pgpdVv9EmCnOH.pgp
Description: PGP signature

[Qemu-devel] [Bug 1353149] Re: qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread asavah

I forgot to mention that VM was created and managed remotely via virt-manager 
0.9.5 from another host.
I used custom cpu topology.

however this config worked fine on qemu 2.0.0 with virt-manager 0.9.5

just tried virt-manager 1.0.1 - it creates the proper argument -smp
4,maxcpus=4,sockets=1,cores=4,threads=1

so this was a virt-manager bug already fixed upstream.

this bug can be closed :)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353149

Title:
  qemu 2.1.0 fails to start if number of cores is greater than 1.

Status in QEMU:
  New

Bug description:
  qemu (kvm) 2.1.0 (built from sources) fails to start if number of
  cores is greater than 1.

  relevant part of commandline arguments:

  /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
  i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
  -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

  the error reported is:

  qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets >= 1' failed.
  2014-08-05 21:45:35.825+: shutting down

  however setting 4 sockets with 1 core each allows me to start the
  machine just fine.

  the system is debian wheezy
  Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

  libvirt 1.2.7 (built from sources)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] [v4][PATCH 4/5] xen:hw:pci-host:piix: create host bridge to passthrough

2014-08-06 Thread Michael S. Tsirkin

On Wed, Aug 06, 2014 at 02:50:34PM +0800, Tiejun Chen wrote:
> Implement a pci host bridge specific to passthrough. Actually
> this just inherits the standard one.
> 
> This is based on http://patchwork.ozlabs.org/patch/363810/.
> 
> Signed-off-by: Tiejun Chen 
> ---
>  hw/pci-host/piix.c   | 39 +++
>  include/hw/i386/pc.h |  2 ++
>  2 files changed, 41 insertions(+)
> 
> v4:
> 
> * Rebase
> 
> v3:
> 
> * Rebase
> 
> v2:
> 
> * Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
> 
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index 4330599..5cac398 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -303,6 +303,17 @@ static int i440fx_initfn(PCIDevice *dev)
>  return 0;
>  }
>  
> +static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
> +{
> +PCII440FXState *d = I440FX_PCI_DEVICE(dev,
> + TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);
> +
> +dev->config[I440FX_SMRAM] = 0x02;
> +
> +cpu_smm_register(&i440fx_set_smm, d);
> +return 0;
> +}
> +
>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>  PCII440FXState **pi440fx_state,
>  int *piix3_devfn,

don't duplicate code.
Reuse the function from regular piix.

> @@ -703,6 +714,33 @@ static const TypeInfo i440fx_info = {
>  .class_init= i440fx_class_init,
>  };
>  
> +static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
> *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +k->init = xen_igd_passthrough_i440fx_initfn;
> +k->vendor_id = PCI_VENDOR_ID_INTEL;
> +k->device_id = PCI_DEVICE_ID_INTEL_82441;
> +k->revision = 0x02;
> +k->class_id = PCI_CLASS_BRIDGE_HOST;
> +dc->desc = "IGD PT XEN Host bridge";
> +dc->vmsd = &vmstate_i440fx;
> +/*
> + * PCI-facing part of the host bridge, not usable without the
> + * host-facing part, which can't be device_add'ed, yet.
> + */
> +dc->cannot_instantiate_with_device_add_yet = true;
> +dc->hotpluggable   = false;
> +}
> +
> +static const TypeInfo xen_igd_passthrough_i440fx_info = {
> +.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
> +.parent= TYPE_PCI_DEVICE,
> +.instance_size = sizeof(PCII440FXState),
> +.class_init= xen_igd_passthrough_i440fx_class_init,
> +};
> +
>  static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
>  PCIBus *rootbus)
>  {
> @@ -744,6 +782,7 @@ static const TypeInfo i440fx_pcihost_info = {
>  static void i440fx_register_types(void)
>  {
>  type_register_static(&i440fx_info);
> +type_register_static(&xen_igd_passthrough_i440fx_info);
>  type_register_static(&piix3_info);
>  type_register_static(&piix3_xen_info);
>  type_register_static(&i440fx_pcihost_info);
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 11fb72f..de34aa6 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -238,6 +238,8 @@ typedef struct PCII440FXState PCII440FXState;
>  #define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
>  #define TYPE_I440FX_PCI_DEVICE "i440FX"
>  
> +#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE 
> "xen-igd-passthrough-i440FX"
> +
>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>  PCII440FXState **pi440fx_state, int *piix_devfn,
>  ISABus **isa_bus, qemu_irq *pic,
> -- 
> 1.9.1

Re: [Qemu-devel] [PATCH v2 00/30] AHCI test suite framework

2014-08-06 Thread Stefan Hajnoczi

On Mon, Aug 04, 2014 at 05:11:01PM -0400, John Snow wrote:
> This patch series introduces a number of small fixes and tweaks to
> help support an AHCI test suite that in the future I hope to expand
> to a fuller regression suite to help guide the development of the
> AHCI device support under, in particular, the Q35 machine type in QEMU.
> 
> Paolo Bonzini has contributed a number of cleanup and refactoring patches
> that support changes to the PIO setup FIS packet construction code, which
> is necessary for testing ths specification adherence of the IDENTIFY command,
> which issues its data exclusively via PIO mechanisms.
> 
> The ahci-test code being checked in represents a minimum of functionality
> needed in order to issue and receive commands from the AHCI HBA under the
> libqos / qtest environment.
> 
> In V2, as detailed below, these tests are not currently expected to pass.
> I will post a complementary patch outside of this set that highlights
> the exact set of tests that will not pass, which can help verify at least
> the portions of these tests that do work correctly.
> 
> Assertions that currently fail:
> - Ordering of PCI capabilities as defined by either AHCI or Intel ICH9
> - Boot-time values of the PxTFD register, which should not have valid
>   data until after a D2H FIS is received, but does in Qemu 2.1
> - Boot-time values of the PxSIG register, which should have a specific
>   placeholder signature until the first D2H FIS is received, but is
>   currently blank.
> - The "Descriptor Processed" interrupt is expected after the IDENTIFY
>   command exhausts the given PRDT, but is not seen.

I guess these are the assertion failures:
ERROR:tests/ahci-test.c:777:ahci_test_pci_spec: assertion failed ((data & 0xFF) 
== PCI_CAP_ID_MSI): (0x0012 == 0x0005)
GTester: last random seed: R02Sd92815a5d013e8433808b903b2b13fb0
**
ERROR:tests/ahci-test.c:1165:ahci_test_port_spec: assertion failed ((reg) & 
((0x01)) == ((0x01))): (0x == 0x0001)
GTester: last random seed: R02S4d6c05e864dc777e64141cdc6d2a18cf
**
ERROR:tests/ahci-test.c:1360:ahci_test_identify: assertion failed ((reg) & 
((0x20)) == ((0x20))): (0x == 0x0020)
GTester: last random seed: R02S2b3b330b83a66badb24da80b48120b1d

Why publish this patch series if the test fails?  We can't merge failing
tests.


pgpxF10PzmRWz.pgp
Description: PGP signature

Re: [Qemu-devel] [v4][PATCH 3/5] I440FX_PCI_DEVICE: add pci_type to index

2014-08-06 Thread Michael S. Tsirkin

On Wed, Aug 06, 2014 at 02:50:33PM +0800, Tiejun Chen wrote:
> We need to use this index to reuse this macro later
> 
> Signed-off-by: Tiejun Chen 

Which index?
Most users don't need to change.
Just open-code OBJECT_CHECK where necessary, or add
a new wrapper.

> ---
>  hw/pci-host/piix.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> v4:
> 
> * New patch to extend I440FX_PCI_DEVICE
> 
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index 0cd82b8..4330599 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -90,8 +90,8 @@ typedef struct PIIX3State {
>  MemoryRegion rcr_mem;
>  } PIIX3State;
>  
> -#define I440FX_PCI_DEVICE(obj) \
> -OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
> +#define I440FX_PCI_DEVICE(obj, type) \
> +OBJECT_CHECK(PCII440FXState, (obj), type)
>  
>  struct PCII440FXState {
>  /*< private >*/
> @@ -155,7 +155,7 @@ static void i440fx_set_smm(int val, void *arg)
>  static void i440fx_write_config(PCIDevice *dev,
>  uint32_t address, uint32_t val, int len)
>  {
> -PCII440FXState *d = I440FX_PCI_DEVICE(dev);
> +PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);
>  
>  /* XXX: implement SMRAM.D_LOCK */
>  pci_default_write_config(dev, address, val, len);
> @@ -295,7 +295,7 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
> Error **errp)
>  
>  static int i440fx_initfn(PCIDevice *dev)
>  {
> -PCII440FXState *d = I440FX_PCI_DEVICE(dev);
> +PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);
>  
>  dev->config[I440FX_SMRAM] = 0x02;
>  
> @@ -333,7 +333,7 @@ PCIBus *i440fx_init(const char *host_type, const char 
> *pci_type,
>  qdev_init_nofail(dev);
>  
>  d = pci_create_simple(b, 0, pci_type);
> -*pi440fx_state = I440FX_PCI_DEVICE(d);
> +*pi440fx_state = I440FX_PCI_DEVICE(d, pci_type);
>  f = *pi440fx_state;
>  f->system_memory = address_space_mem;
>  f->pci_address_space = pci_address_space;
> -- 
> 1.9.1

Re: [Qemu-devel] [v4][PATCH 4/5] xen:hw:pci-host:piix: create host bridge to passthrough

2014-08-06 Thread Chen, Tiejun


On 2014/8/6 17:42, Michael S. Tsirkin wrote:

On Wed, Aug 06, 2014 at 02:50:34PM +0800, Tiejun Chen wrote:

Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one.

This is based on http://patchwork.ozlabs.org/patch/363810/.

Signed-off-by: Tiejun Chen 
---
  hw/pci-host/piix.c   | 39 +++
  include/hw/i386/pc.h |  2 ++
  2 files changed, 41 insertions(+)

v4:

* Rebase

v3:

* Rebase

v2:

* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 4330599..5cac398 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -303,6 +303,17 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }

+static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
+{
+PCII440FXState *d = I440FX_PCI_DEVICE(dev,
+ TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);


Sorry, I don't understand what you mean.

i440fx_initfn(PCIDevice *dev) just have one parameter, dev, how to 
multiplex that here?


Thanks
Tiejun


+
+dev->config[I440FX_SMRAM] = 0x02;
+
+cpu_smm_register(&i440fx_set_smm, d);
+return 0;
+}
+
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state,
  int *piix3_devfn,


don't duplicate code.
Reuse the function from regular piix.







@@ -703,6 +714,33 @@ static const TypeInfo i440fx_info = {
  .class_init= i440fx_class_init,
  };

+static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
*data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = xen_igd_passthrough_i440fx_initfn;
+k->vendor_id = PCI_VENDOR_ID_INTEL;
+k->device_id = PCI_DEVICE_ID_INTEL_82441;
+k->revision = 0x02;
+k->class_id = PCI_CLASS_BRIDGE_HOST;
+dc->desc = "IGD PT XEN Host bridge";
+dc->vmsd = &vmstate_i440fx;
+/*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+dc->cannot_instantiate_with_device_add_yet = true;
+dc->hotpluggable   = false;
+}
+
+static const TypeInfo xen_igd_passthrough_i440fx_info = {
+.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PCII440FXState),
+.class_init= xen_igd_passthrough_i440fx_class_init,
+};
+
  static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
  PCIBus *rootbus)
  {
@@ -744,6 +782,7 @@ static const TypeInfo i440fx_pcihost_info = {
  static void i440fx_register_types(void)
  {
  type_register_static(&i440fx_info);
+type_register_static(&xen_igd_passthrough_i440fx_info);
  type_register_static(&piix3_info);
  type_register_static(&piix3_xen_info);
  type_register_static(&i440fx_pcihost_info);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 11fb72f..de34aa6 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -238,6 +238,8 @@ typedef struct PCII440FXState PCII440FXState;
  #define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
  #define TYPE_I440FX_PCI_DEVICE "i440FX"

+#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "xen-igd-passthrough-i440FX"
+
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state, int *piix_devfn,
  ISABus **isa_bus, qemu_irq *pic,
--
1.9.1

[Qemu-devel] [PATCH v3] po: Add Chinese translation

2014-08-06 Thread Fam Zheng

Signed-off-by: Fam Zheng 
Reviewed-by: Amos Kong 
Reviewed-by: Dongsheng Song 
Reviewed-by: Wei Huang 

---
v3: Add a missing space for
msgid " - Press Ctrl+Alt+G to release grab"
As pointed out by Amos.
Meanwhile adding people's review-by's. [Hope no one in the rev-by
list is against with that change.  To Wei: I changed your Acked-by
to Reviewed-by which I think is more appropriate here.]
Cc'ing qemu-trivial.

v2: Fix typo for "_View".
---
 po/zh_CN.po | 86 +
 1 file changed, 86 insertions(+)
 create mode 100644 po/zh_CN.po

diff --git a/po/zh_CN.po b/po/zh_CN.po
new file mode 100644
index 000..2b1d42e
--- /dev/null
+++ b/po/zh_CN.po
@@ -0,0 +1,86 @@
+# Chinese translation for QEMU.
+# This file is put in the public domain.
+#
+# Fam Zheng , 2014
+msgid ""
+msgstr ""
+"Project-Id-Version: QEMU 2.2\n"
+"Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n"
+"POT-Creation-Date: 2014-07-31 10:03+0800\n"
+"PO-Revision-Date: 2014-07-31 10:00+0800\n"
+"Last-Translator: Fam Zheng \n"
+"Language-Team: Chinese \n"
+"Language: zh\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=2; plural=n != 1;\n"
+"X-Generator: Lokalize 1.4\n"
+
+#: ui/gtk.c:321
+msgid " - Press Ctrl+Alt+G to release grab"
+msgstr " - 按下 Ctrl+Alt+G 取消捕获"
+
+#: ui/gtk.c:325
+msgid " [Paused]"
+msgstr " [已暂停]"
+
+#: ui/gtk.c:1601
+msgid "_Pause"
+msgstr "暂停(_P)"
+
+#: ui/gtk.c:1607
+msgid "_Reset"
+msgstr "重置(_R)"
+
+#: ui/gtk.c:1610
+msgid "Power _Down"
+msgstr "关闭电源(_D)"
+
+#: ui/gtk.c:1616
+msgid "_Quit"
+msgstr "退出(_Q)"
+
+#: ui/gtk.c:1692
+msgid "_Fullscreen"
+msgstr "全屏(_F)"
+
+#: ui/gtk.c:1702
+msgid "Zoom _In"
+msgstr "放大(_I)"
+
+#: ui/gtk.c:1709
+msgid "Zoom _Out"
+msgstr "缩小(_O)"
+
+#: ui/gtk.c:1716
+msgid "Best _Fit"
+msgstr "最合适大小(_F)"
+
+#: ui/gtk.c:1723
+msgid "Zoom To _Fit"
+msgstr "缩放以适应大小(_F)"
+
+#: ui/gtk.c:1729
+msgid "Grab On _Hover"
+msgstr "鼠标经过时捕获(_H)"
+
+#: ui/gtk.c:1732
+msgid "_Grab Input"
+msgstr "捕获输入(_G)"
+
+#: ui/gtk.c:1761
+msgid "Show _Tabs"
+msgstr "显示标签页(_T)"
+
+#: ui/gtk.c:1764
+msgid "Detach Tab"
+msgstr "分离标签页"
+
+#: ui/gtk.c:1778
+msgid "_Machine"
+msgstr "虚拟机(_M)"
+
+#: ui/gtk.c:1783
+msgid "_View"
+msgstr "视图(_V)"
-- 
2.0.3

Re: [Qemu-devel] [questions] about KVM asaMicrosoft-compatiblehypervisor

2014-08-06 Thread Vadim Rozenfeld

On Mon, 2014-08-04 at 14:29 +0800, Zhang Haoyu wrote:
> >Hi Zhang,
> >
> >No I haven't seen such problem
> >Which kernel version are you running?
> Host kernel: RHEL7-RC1(linux-3.10.0).
> 
> >Does it include the latest lazy eli changes?
> >
> lazy eli or lazy eoi?
EOI
> How to confirm whether lazy eli has been included?
> 
not in linux-3.10.0
> >Btw, hv_spinlocks=0xfff is a pretty huge value.
> >
> which value do you advise to use?
MS seems to be using 0x as a default.
best regards,
Vadim.
> 
> Thanks,
> Zhang Haoyu
> >Best regards,
> >Vadim.
> 
>

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
> On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf  wrote:
> > Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
> >> Hi Kevin,
> >>
> >> On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf  wrote:
> >> > Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
> >> >> I have been wondering how to prove that the root cause is the ucontext
> >> >> coroutine mechanism (stack switching).  Here is an idea:
> >> >>
> >> >> Hack your "bypass" code path to run the request inside a coroutine.
> >> >> That way you can compare "bypass without coroutine" against "bypass with
> >> >> coroutine".
> >> >>
> >> >> Right now I think there are doubts because the bypass code path is
> >> >> indeed a different (and not 100% correct) code path.  So this approach
> >> >> might prove that the coroutines are adding the overhead and not
> >> >> something that you bypassed.
> >> >
> >> > My doubts aren't only that the overhead might not come from the
> >> > coroutines, but also whether any coroutine-related overhead is really
> >> > unavoidable. If we can optimise coroutines, I'd strongly prefer to do
> >> > just that instead of introducing additional code paths.
> >>
> >> OK, thank you for taking look at the problem, and hope we can
> >> figure out the root cause, :-)
> >>
> >> >
> >> > Another thought I had was this: If the performance difference is indeed
> >> > only coroutines, then that is completely inside the block layer and we
> >> > don't actually need a VM to test it. We could instead have something
> >> > like a simple qemu-img based benchmark and should be observing the same.
> >>
> >> Even it is simpler to run a coroutine-only benchmark, and I just
> >> wrote a raw one, and looks coroutine does decrease performance
> >> a lot, please see the attachment patch, and thanks for your template
> >> to help me add the 'co_bench' command in qemu-img.
> >
> > Yes, we can look at coroutines microbenchmarks in isolation. I actually
> > did do that yesterday with the yield test from tests/test-coroutine.c.
> > And in fact profiling immediately showed something to optimise:
> > pthread_getspecific() was quite high, replacing it by __thread on
> > systems where it works is more efficient and helped the numbers a bit.
> > Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
> > in qemu-img bench), maybe there's even something that can be done here.
> 
> The lock/unlock in dataplane is often from memory_region_find(), and Paolo
> should have done lots of work on that.
> 
> >
> > However, I just wasn't sure whether a change on this level would be
> > relevant in a realistic environment. This is the reason why I wanted to
> > get a benchmark involving the block layer and some I/O.
> >
> >> From the profiling data in below link:
> >>
> >> http://pastebin.com/YwH2uwbq
> >>
> >> With coroutine, the running time for same loading is increased
> >> ~50%(1.325s vs. 0.903s), and dcache load events is increased
> >> ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
> >> 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
> >>
> >> The bypass code in the benchmark is very similar with the approach
> >> used in the bypass patch, since linux-aio with O_DIRECT seldom
> >> blocks in the the kernel I/O path.
> >>
> >> Maybe the benchmark is a bit extremely, but given modern storage
> >> device may reach millions of IOPS, and it is very easy to slow down
> >> the I/O by coroutine.
> >
> > I think in order to optimise coroutines, such benchmarks are fair game.
> > It's just not guaranteed that the effects are exactly the same on real
> > workloads, so we should take the results with a grain of salt.
> >
> > Anyhow, the coroutine version of your benchmark is buggy, it leaks all
> > coroutines instead of exiting them, so it can't make any use of the
> > coroutine pool. On my laptop, I get this (where fixed coroutine is a
> > version that simply removes the yield at the end):
> >
> > | bypass| fixed coro| buggy coro
> > +---+---+--
> > time| 1.09s | 1.10s | 1.62s
> > L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
> > insns per cycle | 2.39  | 2.39  | 1.90
> >
> > Begs the question whether you see a similar effect on a real qemu and
> > the coroutine pool is still not big enough? With correct use of
> > coroutines, the difference seems to be barely measurable even without
> > any I/O involved.
> 
> When I comment qemu_coroutine_yield(), looks result of
> bypass and fixed coro is very similar as your test, and I am just
> wondering if stack is always switched in qemu_coroutine_enter()
> without calling qemu_coroutine_yield().

Yes, definitely. qemu_coroutine_enter() always involves calling
qemu_coroutine_switch(), which is the stack switch.

> Without the yield, the benchmark can't emulate coroutine usage in
> bdrv_aio_readv/writev() path an

Re: [Qemu-devel] [v4][PATCH 3/5] I440FX_PCI_DEVICE: add pci_type to index

2014-08-06 Thread Chen, Tiejun


On 2014/8/6 17:45, Michael S. Tsirkin wrote:

On Wed, Aug 06, 2014 at 02:50:33PM +0800, Tiejun Chen wrote:

We need to use this index to reuse this macro later

Signed-off-by: Tiejun Chen 


Which index?
Most users don't need to change.
Just open-code OBJECT_CHECK where necessary, or add
a new wrapper.


Okay so what about this?

hw:pci-host:piix: define I440FX_PCI_DEVICE_FROM_TYPE

We need to introduce I440FX_PCI_DEVICE_FROM_TYPE to get
object with type, then we can reuse i440fx_init() simply.

Signed-off-by: Tiejun Chen 

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..8c74653 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -93,6 +93,9 @@ typedef struct PIIX3State {
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)

+#define I440FX_PCI_DEVICE_FROM_TYPE(obj, type) \
+OBJECT_CHECK(PCII440FXState, (obj), type)
+
 struct PCII440FXState {
 /*< private >*/
 PCIDevice parent_obj;
@@ -333,7 +336,7 @@ PCIBus *i440fx_init(const char *host_type, const 
char *pci_type,

 qdev_init_nofail(dev);

 d = pci_create_simple(b, 0, pci_type);
-*pi440fx_state = I440FX_PCI_DEVICE(d);
+*pi440fx_state = I440FX_PCI_DEVICE_FROM_TYPE(d, pci_type);
 f = *pi440fx_state;
 f->system_memory = address_space_mem;
 f->pci_address_space = pci_address_space;


Thanks
Tiejun




---
  hw/pci-host/piix.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

v4:

* New patch to extend I440FX_PCI_DEVICE

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..4330599 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -90,8 +90,8 @@ typedef struct PIIX3State {
  MemoryRegion rcr_mem;
  } PIIX3State;

-#define I440FX_PCI_DEVICE(obj) \
-OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
+#define I440FX_PCI_DEVICE(obj, type) \
+OBJECT_CHECK(PCII440FXState, (obj), type)

  struct PCII440FXState {
  /*< private >*/
@@ -155,7 +155,7 @@ static void i440fx_set_smm(int val, void *arg)
  static void i440fx_write_config(PCIDevice *dev,
  uint32_t address, uint32_t val, int len)
  {
-PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);

  /* XXX: implement SMRAM.D_LOCK */
  pci_default_write_config(dev, address, val, len);
@@ -295,7 +295,7 @@ static void i440fx_pcihost_realize(DeviceState *dev, Error 
**errp)

  static int i440fx_initfn(PCIDevice *dev)
  {
-PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);

  dev->config[I440FX_SMRAM] = 0x02;

@@ -333,7 +333,7 @@ PCIBus *i440fx_init(const char *host_type, const char 
*pci_type,
  qdev_init_nofail(dev);

  d = pci_create_simple(b, 0, pci_type);
-*pi440fx_state = I440FX_PCI_DEVICE(d);
+*pi440fx_state = I440FX_PCI_DEVICE(d, pci_type);
  f = *pi440fx_state;
  f->system_memory = address_space_mem;
  f->pci_address_space = pci_address_space;
--
1.9.1

Re: [Qemu-devel] [edk2] license for binary drivers

2014-08-06 Thread Laszlo Ersek

On 08/06/14 09:40, Reza Jelveh wrote:
> Hello,
>
> EDK2 integrates FAT as a binary driver. What is the license of the FAT driver?

https://svn.code.sf.net/p/edk2/code/trunk/edk2/FatBinPkg/License.txt

> What are the guidelines for use of binary drivers with EDK2? Specifically if
> you want to bundle an OVMF firmware with qemu?

I'm not a lawyer. You should consult a lawyer.

That said, the binary or source code nature of the FAT driver is
inconsequential in this instance. You can actually find the source code
for the FAT driver, and build it yourself, you just need to import it in
the edk2 tree, and patch OvmfPkg*.{dsc,fdf) slightly.

https://svn.code.sf.net/p/edk2/code/trunk/edk2/FatBinPkg/ReadMe.txt
https://github.com/tianocore/edk2-FatPkg/

The main issue is that the FAT driver is not free software. It doesn't
satisfy the four freedom requirements listed here:

https://www.gnu.org/philosophy/free-sw.html

[...]
The freedom to run the program as you wish, for any purpose
(freedom 0).
[...]

The FAT driver license does not give you this freedom.

The FAT driver's license is more restrictive than the 3-clause BSDL that
it is based upon.

The 3-clause BSDL is GPL compatible:

http://en.wikipedia.org/wiki/BSD_licenses#3-clause_license_.28.22Revised_BSD_License.22.2C_.22New_BSD_License.22.2C_or_.22Modified_BSD_License.22.29

But the additional restriction in the FAT driver's license makes it
incompatible with the GPLv2 (which QEMU as a whole us released under,
see LICENSE).

The additional restriction most likely originates from
, the Microsoft EFI FAT32
File System Specification:

Note: The download license agreement permits you to use the
Microsoft EFI FAT32 File System Specification only in connection
with a firmware implementation of the Extensible Firmware Initiative
Specification, v. 1.0. If you plan to implement the FAT32 File
System specification for other purposes, you must obtain an
additional license from Microsoft. For example, you must obtain an
additional license in order to create a file system for reading or
reading and writing FAT32 in digital cameras recording to flash
media, in computer operating systems reading and writing
internal/external hard disks or flash media, or in set-top boxes
reading FAT-formatted media.

(This is precisely what freedom 0 would be about.)

So no, you can't ship an OVMF binary (or source tarball) that contains
the FAT driver, bundled as part of the GPLv2 (+compatible) QEMU
distribution, either in source or in binary form.

Linux distributions have moved OVMF to their non-free channels accordingly.

https://packages.debian.org/sid/ovmf
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745698

http://packages.ubuntu.com/trusty/misc/ovmf

What you can do is build an OVMF binary that doesn't include the FAT
driver. The result will be then covered by 3-clause BSDL (further
restricted by the OpenSSL license, if you include -D SECURE_BOOT), and
that one you can bundle with QEMU. However, such an OVMF binary will be
useless, unless users themselves can augment it somehow at build time or
at runtime with some FAT driver. (The UEFI specification requires FAT32
support.)

UEFI should have standardized a preexistent, but unencumbered, file
system. That horse is out of the barn.

Disclaimer: I am not a lawyer, and this is my personal opinion only, not
necessarily my employer's.

Laszlo

Re: [Qemu-devel] [questions] about qemu log

On Wed, Aug 6, 2014 at 12:40 AM, Peter Crosthwaite
 wrote:
> Well -D will log to file only loggable (i.e. qemu_log()) information
> (which has all sorts of options and switches). Stderr, is a little
> more static and should in theory be limited to genuine errors. But if
> you want a combined log of both you can simply omit -D to default
> qemu_log output to stderr. This gives you a combined log that you can
> redirect anywhere. To be honest, this is what I do as a matter of
> course (2> foo rather than -D foo).

understood; this make it incompatible with -daemonize option.
there should be a possibility to detach the process and also redirect
stderr somewhere.
-- 
William

Re: [Qemu-devel] [PATCH RFC v2 01/12] QEMUSizedBuffer/QEMUFile

2014-08-06 Thread Dr. David Alan Gilbert

* Eric Blake (ebl...@redhat.com) wrote:
> On 07/25/2014 09:39 AM, Sanidhya Kashyap wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > Stefan Berger's to create a QEMUFile that goes to a memory buffer;
> 
> Missing something.  Maybe you meant:
> 
> This is based on Stefan Berger's patch to create ...

I'll update that in my version.

> > from:
> > 
> > http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html
> > 
> > Using the QEMUFile interface, this patch adds support functions for
> > operating
> > on in-memory sized buffers that can be written to or read from.
> 
> Odd line breaking.
> 
> > 
> > Signed-off-by: Stefan Berger 
> > Signed-off-by: Joel Schopp 
> 
> Still looks weird to have David as author but not listed in S-o-B.

I'm thinking of trying to put this code in via a stand-alone patch
(with a testcase use); since we've got 3+ users of this code in unmerged
things and at least 3 implementations of something similar, it would
be best to get it in before someone writes a 4th.

> > ---
> >  include/migration/qemu-file.h |  27 +++
> >  qemu-file.c   | 411 
> > ++
> >  2 files changed, 438 insertions(+)
> > 
> 
> > +/**
> > + * Set the length of the buffer; The primary usage of this
> 
> s/The/the/

> > + * function is to truncate the number of used bytes in the buffer.
> > + * The size will not be extended beyond the current  number of
> 
> no need for double space
> 
> > + * allocated bytes in the QEMUSizedBuffer.
> > + *
> > + * @qsb: A QEMUSizedBuffer
> > + * @new_len : The new length of bytes in the buffer
> > + *
> > + * Returns the number of bytes the buffer was trucated or extended
> 
> s/trucated/truncated/

I've picked up these typos in my world.  Thanks.

> > +
> > +QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input)
> > +{
> > +QEMUBuffer *s;
> > +
> > +if (mode == NULL || (mode[0] != 'r' && mode[0] != 'w') || mode[1] != 
> > 0) {
> > +fprintf(stderr, "qemu_bufopen: Argument validity check failed\n");
> 
> Should this function be directly printing to stderr, or should it be
> converted to use Error **errp semantics?

It's consistent with the open functions for other file types
 ( qemu_fdopen, qemu_popen_cmd, qemu_fopen ).

Dave

> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] QEMU with KVM does not start Win8 on kernel 3.4.67 and core2duo

2014-08-06 Thread Erik Rull

Hi all,

I did already several tests and I'm not completely sure what's going wrong, but
here my scenario:

When I start up QEMU w/ KVM 1.7.0 on a Core2Duo machine running a vanilla kernel
3.4.67 to run a Windows 8.0 guest, the guest freezes at boot without any error.
When I dump the CPU registers via "info registers", nothing changes, that means
the system really stalled. Same happens with QEMU 2.0.0.

But - when I run the very same guest using Kernel 2.6.32.12 and QEMU 1.7.0 on
the host side it works on the Core2Duo. Also the system above but just with an
i3 or i5 CPU it works, too.

I already disabled networking and USB for the guest and changed the graphics
card - no effect. I assume that some mean bits and bytes have to be set up
properly to get the thing running.

Any hint what to change / test would be really appreciated.

Thanks in advance,

Best regards,

Erik

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf  wrote:
> Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
>> On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf  wrote:
>> > Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
>> >> Hi Kevin,
>> >>
>> >> On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf  wrote:
>> >> > Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
>> >> >> I have been wondering how to prove that the root cause is the ucontext
>> >> >> coroutine mechanism (stack switching).  Here is an idea:
>> >> >>
>> >> >> Hack your "bypass" code path to run the request inside a coroutine.
>> >> >> That way you can compare "bypass without coroutine" against "bypass 
>> >> >> with
>> >> >> coroutine".
>> >> >>
>> >> >> Right now I think there are doubts because the bypass code path is
>> >> >> indeed a different (and not 100% correct) code path.  So this approach
>> >> >> might prove that the coroutines are adding the overhead and not
>> >> >> something that you bypassed.
>> >> >
>> >> > My doubts aren't only that the overhead might not come from the
>> >> > coroutines, but also whether any coroutine-related overhead is really
>> >> > unavoidable. If we can optimise coroutines, I'd strongly prefer to do
>> >> > just that instead of introducing additional code paths.
>> >>
>> >> OK, thank you for taking look at the problem, and hope we can
>> >> figure out the root cause, :-)
>> >>
>> >> >
>> >> > Another thought I had was this: If the performance difference is indeed
>> >> > only coroutines, then that is completely inside the block layer and we
>> >> > don't actually need a VM to test it. We could instead have something
>> >> > like a simple qemu-img based benchmark and should be observing the same.
>> >>
>> >> Even it is simpler to run a coroutine-only benchmark, and I just
>> >> wrote a raw one, and looks coroutine does decrease performance
>> >> a lot, please see the attachment patch, and thanks for your template
>> >> to help me add the 'co_bench' command in qemu-img.
>> >
>> > Yes, we can look at coroutines microbenchmarks in isolation. I actually
>> > did do that yesterday with the yield test from tests/test-coroutine.c.
>> > And in fact profiling immediately showed something to optimise:
>> > pthread_getspecific() was quite high, replacing it by __thread on
>> > systems where it works is more efficient and helped the numbers a bit.
>> > Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
>> > in qemu-img bench), maybe there's even something that can be done here.
>>
>> The lock/unlock in dataplane is often from memory_region_find(), and Paolo
>> should have done lots of work on that.
>>
>> >
>> > However, I just wasn't sure whether a change on this level would be
>> > relevant in a realistic environment. This is the reason why I wanted to
>> > get a benchmark involving the block layer and some I/O.
>> >
>> >> From the profiling data in below link:
>> >>
>> >> http://pastebin.com/YwH2uwbq
>> >>
>> >> With coroutine, the running time for same loading is increased
>> >> ~50%(1.325s vs. 0.903s), and dcache load events is increased
>> >> ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
>> >> 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
>> >>
>> >> The bypass code in the benchmark is very similar with the approach
>> >> used in the bypass patch, since linux-aio with O_DIRECT seldom
>> >> blocks in the the kernel I/O path.
>> >>
>> >> Maybe the benchmark is a bit extremely, but given modern storage
>> >> device may reach millions of IOPS, and it is very easy to slow down
>> >> the I/O by coroutine.
>> >
>> > I think in order to optimise coroutines, such benchmarks are fair game.
>> > It's just not guaranteed that the effects are exactly the same on real
>> > workloads, so we should take the results with a grain of salt.
>> >
>> > Anyhow, the coroutine version of your benchmark is buggy, it leaks all
>> > coroutines instead of exiting them, so it can't make any use of the
>> > coroutine pool. On my laptop, I get this (where fixed coroutine is a
>> > version that simply removes the yield at the end):
>> >
>> > | bypass| fixed coro| buggy coro
>> > +---+---+--
>> > time| 1.09s | 1.10s | 1.62s
>> > L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
>> > insns per cycle | 2.39  | 2.39  | 1.90
>> >
>> > Begs the question whether you see a similar effect on a real qemu and
>> > the coroutine pool is still not big enough? With correct use of
>> > coroutines, the difference seems to be barely measurable even without
>> > any I/O involved.
>>
>> When I comment qemu_coroutine_yield(), looks result of
>> bypass and fixed coro is very similar as your test, and I am just
>> wondering if stack is always switched in qemu_coroutine_enter()
>> without calling qemu_coroutine_yield().
>
> Yes, definitely. qemu_coroutine_enter() always involves calli

Re: [Qemu-devel] [PATCH v2 00/30] AHCI test suite framework

Stefan Hajnoczi  writes:

> On Mon, Aug 04, 2014 at 05:11:01PM -0400, John Snow wrote:
>> This patch series introduces a number of small fixes and tweaks to
>> help support an AHCI test suite that in the future I hope to expand
>> to a fuller regression suite to help guide the development of the
>> AHCI device support under, in particular, the Q35 machine type in QEMU.
>> 
>> Paolo Bonzini has contributed a number of cleanup and refactoring patches
>> that support changes to the PIO setup FIS packet construction code, which
>> is necessary for testing ths specification adherence of the IDENTIFY command,
>> which issues its data exclusively via PIO mechanisms.
>> 
>> The ahci-test code being checked in represents a minimum of functionality
>> needed in order to issue and receive commands from the AHCI HBA under the
>> libqos / qtest environment.
>> 
>> In V2, as detailed below, these tests are not currently expected to pass.
>> I will post a complementary patch outside of this set that highlights
>> the exact set of tests that will not pass, which can help verify at least
>> the portions of these tests that do work correctly.
>> 
>> Assertions that currently fail:
>> - Ordering of PCI capabilities as defined by either AHCI or Intel ICH9
>> - Boot-time values of the PxTFD register, which should not have valid
>>   data until after a D2H FIS is received, but does in Qemu 2.1
>> - Boot-time values of the PxSIG register, which should have a specific
>>   placeholder signature until the first D2H FIS is received, but is
>>   currently blank.
>> - The "Descriptor Processed" interrupt is expected after the IDENTIFY
>>   command exhausts the given PRDT, but is not seen.
>
> I guess these are the assertion failures:
> ERROR:tests/ahci-test.c:777:ahci_test_pci_spec: assertion failed
> ((data & 0xFF) == PCI_CAP_ID_MSI): (0x0012 == 0x0005)
> GTester: last random seed: R02Sd92815a5d013e8433808b903b2b13fb0
> **
> ERROR:tests/ahci-test.c:1165:ahci_test_port_spec: assertion failed
> ((reg) & ((0x01)) == ((0x01))): (0x == 0x0001)
> GTester: last random seed: R02S4d6c05e864dc777e64141cdc6d2a18cf
> **
> ERROR:tests/ahci-test.c:1360:ahci_test_identify: assertion failed
> ((reg) & ((0x20)) == ((0x20))): (0x == 0x0020)
> GTester: last random seed: R02S2b3b330b83a66badb24da80b48120b1d
>
> Why publish this patch series if the test fails?  We can't merge failing
> tests.

Correct.

What I do when I want to start some bug fixing work with tests is to
write the tests to expect the actual, incorrect behavior, with a
greppable comment documenting the correct behavior.  Then clean that up
as the bugs get fixed.

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 7:28 PM, Ming Lei  wrote:
> On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf  wrote:

>
> I use the /dev/nullb0 block device to test, which is available in linux kernel
> 3.13+, and follows the difference, which looks not very big(< 10%):
>
> And I added two parameter to your img-bench patch:
>
>   -c CNT  # which is passed to 'data.n'
>   -b   #enable bypass coroutine introduced in this patchset
>
> Another difference is that dataplane uses its own thread, and this
> bench takes main_loop.
>
> ming@:~/git/qemu$ sudo ~/bin/perf stat -e
> L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
> ./qemu-img bench -f raw -t off -n -c 1000  -b /dev/nullb0
> read time: 58024ms
>
>  Performance counter stats for './qemu-img bench -f raw -t off -n -c
> 1000 -b /dev/nullb0':
>
> 34,874,462,357  L1-dcache-loads
>   [40.00%]
>714,018,039  L1-dcache-load-misses #2.05% of all
> L1-dcache hits   [40.00%]
>133,897,794,677  cpu-cycles[40.05%]
>116,714,230,004  instructions  #0.87  insns per
> cycle [50.02%]
> 22,689,223,546  branch-instructions
>   [50.01%]
>391,673,952  branch-misses #1.73% of all
> branches [50.00%]
> 22,726,856,215  branch-loads
>   [50.01%]
> 18,570,766,783  branch-load-misses
>   [49.98%]
> 34,944,839,907  dTLB-loads
>   [39.99%]
> 24,405,944  dTLB-load-misses  #0.07% of all
> dTLB cache hits  [39.99%]
>
>   58.040785989 seconds time elapsed
>
>
> ming@:~/git/qemu$ sudo ~/bin/perf stat -e
> L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
> ./qemu-img bench -f raw -t off -n -c 1000  /dev/nullb0
> read time: 63369ms

BTW, Stefan's coroutine resize patch is applied in both the
tests(qem-img bench).

Thanks,

Re: [Qemu-devel] [questions] about qemu log

On Wed, Aug 6, 2014 at 5:42 PM, Markus Armbruster  wrote:
> "Zhang Haoyu"  writes:
>
> The output is on qemu's stderr.  You are in control of what that
 stderr is.

 I don't get why we can configure
 -D /path/to/unique/file/name.log

 but we also have to redirect stderr (I didn't checked if the daemonize
 option was closing it). What's the purpose of this logfile option?

>>>
>>>Well -D will log to file only loggable (i.e. qemu_log()) information
>>>(which has all sorts of options and switches). Stderr, is a little
>>>more static and should in theory be limited to genuine errors. But if
>>>you want a combined log of both you can simply omit -D to default
>>>qemu_log output to stderr. This gives you a combined log that you can
>>>redirect anywhere. To be honest, this is what I do as a matter of
>>>course (2> foo rather than -D foo).
>>>
>> Maybe we can introduce a new qemu option to specify a error logfile
>> where stderr be redirected, like below,
>> DEF("elogfile", HAS_ARG, QEMU_OPTION_elogfile, \
>> "-elogfile logfile redirect stderr log to logfile(default
>> /var/log/qemu/##.log)\n",
>> QEMU_ARCH_ALL)
>> STEXI
>> @item -elogfile @var{logfile}
>> @findex -elogfile
>> redirect stderr in @var{logfile}
>> ETEXI
>> then we can set the error log file through qemu command,
>> /var/log/qemu/##.log as default.
>>
>
>This sounds out-of-scope for QEMU to me and makes a standard flow
>non-standard. If prints are going to stderr where should be going
>elsewhere they probably should be fixed. Do you have specific examples
>of information going to stderr that you would rather go to a log (be
>it an error log or something else?).
>
 I use proxmox to manage vm, it dose not redirect qemu's stderr, and
 start vm with -daemonize option,
 so the error log disappeared.
 I want to redirect the error log of qemu to a specified logfile, if
 fault happened, I can use the error log to analyze the fault.

 And, why qemu output the error log to stderr instead of a error
 logfile which can be configure?
>>>
>>>Because the code is a mess in that regard.
>>>
>>>You don't fix that by redirecting stderr wholesale, because that just
>>>adds to the mess.  You fix it at the root, one ill-advised fprintf() at
>>>a time, as Peter advises:
>>>
>>
>> Sorry, I'm afraid I misunderstand what you mean,
>> should I replace all of fprintf(stderr, ...) with qemu_log() ?
>> or only some cases where stderr is used where qemu_log should be, as
>> Perter advises?
>
> I didn't mean to suggest blind conversion from fprintf() to qemu_log().
> Each instance of fprintf() needs to be reviewed before conversion.

Yes i'm afraid there is no quick one-shot fix to your problem. The
best is the suggested workarounds using stderr redirection.

>  I
> think that's exactly Peter's advice, too.
>

Correct.

Regards,
Peter

>> If so, should I still need to redirect the stderr to specified logfile
>> in qemu's parent shell/process ?
>
> Probably.  Libvirt does it.
>

Re: [Qemu-devel] [PATCH v5 1/6] exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
> we can handler errors.

"handle"

>
> Signed-off-by: Hu Tao 
> ---
>  exec.c  | 32 +++-
>  include/exec/ram_addr.h |  4 ++--
>  memory.c|  6 +++---
>  3 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index 765bd94..7e60a44 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1224,7 +1224,7 @@ static int memory_try_enable_merging(void *addr, size_t 
> len)
>  return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
>  }
>
> -static ram_addr_t ram_block_add(RAMBlock *new_block)
> +static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
>  {
>  RAMBlock *block;
>  ram_addr_t old_ram_size, new_ram_size;
> @@ -1241,9 +1241,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
>  } else {
>  new_block->host = phys_mem_alloc(new_block->length);
>  if (!new_block->host) {
> -fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
> -new_block->mr->name, strerror(errno));
> -exit(1);
> +error_setg_errno(errp, errno,
> + "cannot set up guest memory '%s'",
> + new_block->mr->name);
> +qemu_mutex_unlock_ramlist();
> +return -1;
>  }
>  memory_try_enable_merging(new_block->host, new_block->length);
>  }
> @@ -1294,6 +1296,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
> MemoryRegion *mr,
>  Error **errp)
>  {
>  RAMBlock *new_block;
> +ram_addr_t addr;
>
>  if (xen_enabled()) {
>  error_setg(errp, "-mem-path not supported with Xen");
> @@ -1323,14 +1326,20 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
> MemoryRegion *mr,
>  return -1;
>  }
>
> -return ram_block_add(new_block);
> +addr = ram_block_add(new_block, errp);
> +if (errp && *errp) {
> +g_free(new_block);

The free being conditional on errp will cause a leak if clients
(validly) pass a NULL errp in. This free needs to be unconditional.
The way to achieve that is the local_err error_propagate pattern.

> +return -1;
> +}
> +return addr;
>  }
>  #endif
>
>  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> -   MemoryRegion *mr)
> +   MemoryRegion *mr, Error **errp)
>  {
>  RAMBlock *new_block;
> +ram_addr_t addr;
>
>  size = TARGET_PAGE_ALIGN(size);
>  new_block = g_malloc0(sizeof(*new_block));
> @@ -1341,12 +1350,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, 
> void *host,
>  if (host) {
>  new_block->flags |= RAM_PREALLOC;
>  }
> -return ram_block_add(new_block);
> +addr = ram_block_add(new_block, errp);
> +if (errp && *errp) {
> +g_free(new_block);

ditto.

Regards,
Peter

> +return -1;
> +}
> +return addr;
>  }
>
> -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
> +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
>  {
> -return qemu_ram_alloc_from_ptr(size, NULL, mr);
> +return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
>  }
>
>  void qemu_ram_free_from_ptr(ram_addr_t addr)
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 6593be1..cf1d4c7 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -26,8 +26,8 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
> MemoryRegion *mr,
>  bool share, const char *mem_path,
>  Error **errp);
>  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> -   MemoryRegion *mr);
> -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
> +   MemoryRegion *mr, Error **errp);
> +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp);
>  int qemu_get_ram_fd(ram_addr_t addr);
>  void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
>  void *qemu_get_ram_ptr(ram_addr_t addr);
> diff --git a/memory.c b/memory.c
> index 64d7176..59d9935 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1169,7 +1169,7 @@ void memory_region_init_ram(MemoryRegion *mr,
>  mr->ram = true;
>  mr->terminates = true;
>  mr->destructor = memory_region_destructor_ram;
> -mr->ram_addr = qemu_ram_alloc(size, mr);
> +mr->ram_addr = qemu_ram_alloc(size, mr, &error_abort);
>  }
>
>  #ifdef __linux__
> @@ -1199,7 +1199,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
>  mr->ram = true;
>  mr->terminates = true;
>  mr->destructor = memory_region_destructor_ram_from_ptr;
> -mr->ram_addr = qemu_ram_alloc_from_ptr(size, ptr,

Re: [Qemu-devel] [PATCH v5 2/6] memory: add parameter errp to memory_region_init_ram

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> Add parameter errp to memory_region_init_ram and update all call sites
> to pass in &error_abort.
>
> Signed-off-by: Hu Tao 

Liking it.

Reviewed-by: Peter Crosthwaite 

> ---
>  backends/hostmem-ram.c   |  2 +-
>  hw/alpha/typhoon.c   |  3 ++-
>  hw/arm/armv7m.c  |  7 ---
>  hw/arm/cubieboard.c  |  2 +-
>  hw/arm/digic_boards.c|  2 +-
>  hw/arm/exynos4210.c  |  9 +
>  hw/arm/highbank.c|  5 +++--
>  hw/arm/integratorcp.c|  5 +++--
>  hw/arm/kzm.c |  4 ++--
>  hw/arm/mainstone.c   |  3 ++-
>  hw/arm/musicpal.c|  6 --
>  hw/arm/omap1.c   |  6 --
>  hw/arm/omap2.c   |  6 --
>  hw/arm/omap_sx1.c|  6 --
>  hw/arm/palm.c|  3 ++-
>  hw/arm/pxa2xx.c  | 11 +++
>  hw/arm/realview.c|  9 ++---
>  hw/arm/spitz.c   |  2 +-
>  hw/arm/strongarm.c   |  3 ++-
>  hw/arm/tosa.c|  2 +-
>  hw/arm/versatilepb.c |  3 ++-
>  hw/arm/vexpress.c| 15 ++-
>  hw/arm/virt.c|  3 ++-
>  hw/arm/xilinx_zynq.c |  6 --
>  hw/block/onenand.c   |  2 +-
>  hw/core/loader.c |  2 +-
>  hw/cris/axis_dev88.c |  6 --
>  hw/display/cg3.c |  6 --
>  hw/display/qxl.c |  6 +++---
>  hw/display/sm501.c   |  2 +-
>  hw/display/tc6393xb.c|  3 ++-
>  hw/display/tcx.c |  5 +++--
>  hw/display/vga.c |  3 ++-
>  hw/display/vmware_vga.c  |  3 ++-
>  hw/i386/kvm/pci-assign.c |  3 ++-
>  hw/i386/pc.c |  3 ++-
>  hw/i386/pc_sysfw.c   |  5 +++--
>  hw/input/milkymist-softusb.c |  4 ++--
>  hw/lm32/lm32_boards.c|  6 --
>  hw/lm32/milkymist.c  |  3 ++-
>  hw/m68k/an5206.c |  4 ++--
>  hw/m68k/dummy_m68k.c |  2 +-
>  hw/m68k/mcf5208.c|  4 ++--
>  hw/microblaze/petalogix_ml605_mmu.c  |  5 +++--
>  hw/microblaze/petalogix_s3adsp1800_mmu.c |  6 --
>  hw/mips/mips_fulong2e.c  |  5 +++--
>  hw/mips/mips_jazz.c  |  8 +---
>  hw/mips/mips_malta.c |  6 --
>  hw/mips/mips_mipssim.c   |  6 --
>  hw/mips/mips_r4k.c   |  5 +++--
>  hw/moxie/moxiesim.c  |  4 ++--
>  hw/net/milkymist-minimac2.c  |  2 +-
>  hw/openrisc/openrisc_sim.c   |  2 +-
>  hw/pci-host/prep.c   |  3 ++-
>  hw/pci/pci.c |  2 +-
>  hw/ppc/mac_newworld.c|  3 ++-
>  hw/ppc/mac_oldworld.c|  3 ++-
>  hw/ppc/ppc405_boards.c   |  8 +---
>  hw/ppc/ppc405_uc.c   |  3 ++-
>  hw/s390x/s390-virtio-ccw.c   |  2 +-
>  hw/s390x/s390-virtio.c   |  2 +-
>  hw/sh4/r2d.c |  2 +-
>  hw/sh4/shix.c|  8 +---
>  hw/sparc/leon3.c |  4 ++--
>  hw/sparc/sun4m.c | 10 ++
>  hw/sparc64/sun4u.c   |  6 --
>  hw/unicore32/puv3.c  |  3 ++-
>  hw/xtensa/sim.c  |  4 ++--
>  hw/xtensa/xtfpga.c   |  8 +---
>  include/exec/memory.h|  4 +++-
>  memory.c |  5 +++--
>  numa.c   |  4 ++--
>  xen-hvm.c|  3 ++-
>  73 files changed, 203 insertions(+), 128 deletions(-)
>
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index d9a8290..e55d066 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
> **errp)
>
>  path = object_get_canonical_path_component(OBJECT(backend));
>  memory_region_init_ram(&backend->mr, OBJECT(backend), path,
> -   backend->size);
> +   backend->size, &error_abort);
>  g_free(path);
>  }
>
> diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
> index 67a1070..058b723 100644
> --- a/hw/alpha/typhoon.c
> +++ b/hw/alpha/typhoon.c
> @@ -843,7 +843,8 @@ PCIBus *typhoon_init(ram_addr_t ram_size, ISABus 
>

Re: [Qemu-devel] [PATCH v5 3/6] memory: add parameter errp to memory_region_init_ram_ptr

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> Add parameter errp to memory_region_init_ram_ptr and update all call
> sites to pass in &error_abort.
>
> Signed-off-by: Hu Tao 

Reviewed-by: Peter Crosthwaite 

> ---
>  hw/display/g364fb.c  | 2 +-
>  hw/i386/kvm/pci-assign.c | 3 ++-
>  hw/misc/ivshmem.c| 5 +++--
>  hw/misc/vfio.c   | 3 ++-
>  hw/ppc/spapr.c   | 2 +-
>  include/exec/memory.h| 4 +++-
>  memory.c | 5 +++--
>  7 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/hw/display/g364fb.c b/hw/display/g364fb.c
> index 46f7b41..cce33ae 100644
> --- a/hw/display/g364fb.c
> +++ b/hw/display/g364fb.c
> @@ -487,7 +487,7 @@ static void g364fb_init(DeviceState *dev, G364State *s)
>
>  memory_region_init_io(&s->mem_ctrl, NULL, &g364fb_ctrl_ops, s, "ctrl", 
> 0x18);
>  memory_region_init_ram_ptr(&s->mem_vram, NULL, "vram",
> -   s->vram_size, s->vram);
> +   s->vram_size, s->vram, &error_abort);
>  vmstate_register_ram(&s->mem_vram, dev);
>  memory_region_set_coalescing(&s->mem_vram);
>  }
> diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
> index 5dcd2d5..d2013af 100644
> --- a/hw/i386/kvm/pci-assign.c
> +++ b/hw/i386/kvm/pci-assign.c
> @@ -456,7 +456,8 @@ static void assigned_dev_register_regions(PCIRegion 
> *io_regions,
>   object_get_typename(OBJECT(pci_dev)), i);
>  memory_region_init_ram_ptr(&pci_dev->v_addrs[i].real_iomem,
> OBJECT(pci_dev), name,
> -   cur_region->size, virtbase);
> +   cur_region->size, virtbase,
> +   &error_abort);
>  vmstate_register_ram(&pci_dev->v_addrs[i].real_iomem,
>   &pci_dev->dev.qdev);
>  }
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index 768e528..0949c15 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -348,7 +348,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int 
> fd) {
>  ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>
>  memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2",
> -   s->ivshmem_size, ptr);
> +   s->ivshmem_size, ptr, &error_abort);
>  vmstate_register_ram(&s->ivshmem, DEVICE(s));
>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>
> @@ -476,7 +476,8 @@ static void ivshmem_read(void *opaque, const uint8_t * 
> buf, int flags)
>  map_ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED,
>  incoming_fd, 0);
>  memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s),
> -   "ivshmem.bar2", s->ivshmem_size, map_ptr);
> +   "ivshmem.bar2", s->ivshmem_size, map_ptr,
> +   &error_abort);
>  vmstate_register_ram(&s->ivshmem, DEVICE(s));
>
>  IVSHMEM_DPRINTF("guest h/w addr = %" PRIu64 ", size = %" PRIu64 "\n",
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 0b9eba0..91d2c95 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2894,7 +2894,8 @@ static int vfio_mmap_bar(VFIODevice *vdev, VFIOBAR *bar,
>  goto empty_region;
>  }
>
> -memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map);
> +memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map,
> +   &error_abort);
>  } else {
>  empty_region:
>  /* Create a zero sized sub-region to make cleanup easy. */
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d01978f..4dfe40a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1342,7 +1342,7 @@ static void ppc_spapr_init(MachineState *machine)
>  if (rma_alloc_size && rma) {
>  rma_region = g_new(MemoryRegion, 1);
>  memory_region_init_ram_ptr(rma_region, NULL, "ppc_spapr.rma",
> -   rma_alloc_size, rma);
> +   rma_alloc_size, rma, &error_abort);
>  vmstate_register_ram_global(rma_region);
>  memory_region_add_subregion(sysmem, 0, rma_region);
>  }
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index ec6299b..caa988d 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -351,12 +351,14 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
>   * @name: the name of the region.
>   * @size: size of the region.
>   * @ptr: memory to be mapped; must contain at least @size bytes.
> + * @errp: pointer to Error*, to store an error if it happens.
>   */
>  void memory_region_init_ram_ptr(MemoryRegion *mr,
>  struct Object

[Qemu-devel] [Bug 1353456] [NEW] qemu-io: Failure on a qcow2 image with the fuzzed refcount table

Public bug reported:

'qemu-io -c write' and 'qemu-io -c aio_write' crashes on a qcow2 image
with a fuzzed refcount table.

Sequence:
 1. Unpack the attached archive, make a copy of test.img
 2. Put copy.img and backing_img.file in the same directory
 3. Execute
qemu-io copy.img -c write 279552 322560
  or
   qemu-io copy.img -c aio_write 836608 166400

Result: qemu-io was killed by SIGIOT with the reason:

qemu-io: block/qcow2-cluster.c:1291: qcow2_alloc_cluster_offset:
Assertion `*host_offset != 0' failed.

qemu.git HEAD 69f87f713069f1f

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "traces.n.images.tar.gz"
   
https://bugs.launchpad.net/bugs/1353456/+attachment/4171103/+files/traces.n.images.tar.gz

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353456

Title:
  qemu-io: Failure on a qcow2 image with the fuzzed refcount table

Status in QEMU:
  New

Bug description:
  'qemu-io -c write' and 'qemu-io -c aio_write' crashes on a qcow2 image
  with a fuzzed refcount table.

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.file in the same directory
   3. Execute
  qemu-io copy.img -c write 279552 322560
or
 qemu-io copy.img -c aio_write 836608 166400

  Result: qemu-io was killed by SIGIOT with the reason:

  qemu-io: block/qcow2-cluster.c:1291: qcow2_alloc_cluster_offset:
  Assertion `*host_offset != 0' failed.

  qemu.git HEAD 69f87f713069f1f

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353456/+subscriptions

Re: [Qemu-devel] [PATCH v2 4/4] ivshmem: check the value returned by fstat()

2014-08-06 Thread Levente Kurusa

> The function fstat() may fail, so check its return value.
> 
> Signed-off-by: zhanghailiang 
> ---
>  hw/misc/ivshmem.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index 768e528..5d939d2 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -324,7 +324,11 @@ static int check_shm_size(IVShmemState *s, int fd) {
>  
>  struct stat buf;
>  
> -fstat(fd, &buf);
> +if (fstat(fd, &buf) < 0) {
> +fprintf(stderr, "ivshmem: exiting for fstat fd '%d' failed: %s\n",
> +fd, strerror(errno));

exiting for fstat?

I would go with something like this:

"ivshmem: exiting: fstat on fd %d failed: %s"

... or something among the lines.

Thanks!
Levente Kurusa

Re: [Qemu-devel] [PATCH v5 4/6] memory: add parameter errp to memory_region_init_rom_device

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> Add parameter errp to memory_region_init_rom_device and update all call
> sites to pass in &error_abort.
>
> Signed-off-by: Hu Tao 

Reviewed-by: Peter Crosthwaite 

> ---
>  hw/block/pflash_cfi01.c | 2 +-
>  hw/block/pflash_cfi02.c | 2 +-
>  include/exec/memory.h   | 4 +++-
>  memory.c| 5 +++--
>  4 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
> index f9507b4..649565d 100644
> --- a/hw/block/pflash_cfi01.c
> +++ b/hw/block/pflash_cfi01.c
> @@ -770,7 +770,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
> **errp)
>  memory_region_init_rom_device(
>  &pfl->mem, OBJECT(dev),
>  pfl->be ? &pflash_cfi01_ops_be : &pflash_cfi01_ops_le, pfl,
> -pfl->name, total_len);
> +pfl->name, total_len, &error_abort);
>  vmstate_register_ram(&pfl->mem, DEVICE(pfl));
>  pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
>  sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
> diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
> index 8d4b828..49db02d 100644
> --- a/hw/block/pflash_cfi02.c
> +++ b/hw/block/pflash_cfi02.c
> @@ -608,7 +608,7 @@ static void pflash_cfi02_realize(DeviceState *dev, Error 
> **errp)
>
>  memory_region_init_rom_device(&pfl->orig_mem, OBJECT(pfl), pfl->be ?
>&pflash_cfi02_ops_be : 
> &pflash_cfi02_ops_le,
> -  pfl, pfl->name, chip_len);
> +  pfl, pfl->name, chip_len, &error_abort);

We probably should take the opportunity to error_propagate in these
cases, to prepare support for hotplug of devs like this. But I think
your blind conversions are a good first step as they will preserve
existing behaviour. So lets call that follow up.

Regards,
Peter

>  vmstate_register_ram(&pfl->orig_mem, DEVICE(pfl));
>  pfl->storage = memory_region_get_ram_ptr(&pfl->orig_mem);
>  pfl->chip_len = chip_len;
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index caa988d..71bed47 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -388,13 +388,15 @@ void memory_region_init_alias(MemoryRegion *mr,
>   * @ops: callbacks for write access handling.
>   * @name: the name of the region.
>   * @size: size of the region.
> + * @errp: pointer to Error*, to store an error if it happens.
>   */
>  void memory_region_init_rom_device(MemoryRegion *mr,
> struct Object *owner,
> const MemoryRegionOps *ops,
> void *opaque,
> const char *name,
> -   uint64_t size);
> +   uint64_t size,
> +   Error **errp);
>
>  /**
>   * memory_region_init_reservation: Initialize a memory region that reserves
> diff --git a/memory.c b/memory.c
> index bcebfd8..06a7e1b 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1223,7 +1223,8 @@ void memory_region_init_rom_device(MemoryRegion *mr,
> const MemoryRegionOps *ops,
> void *opaque,
> const char *name,
> -   uint64_t size)
> +   uint64_t size,
> +   Error **errp)
>  {
>  memory_region_init(mr, owner, name, size);
>  mr->ops = ops;
> @@ -1231,7 +1232,7 @@ void memory_region_init_rom_device(MemoryRegion *mr,
>  mr->terminates = true;
>  mr->rom_device = true;
>  mr->destructor = memory_region_destructor_rom_device;
> -mr->ram_addr = qemu_ram_alloc(size, mr, &error_abort);
> +mr->ram_addr = qemu_ram_alloc(size, mr, errp);
>  }
>
>  void memory_region_init_iommu(MemoryRegion *mr,
> --
> 1.9.3
>
>

Re: [Qemu-devel] [PATCH v5 5/6] hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> When using monitor command object_add to add a memory backend whose
> size is way too big to allocate memory for it, qemu just exits. In
> the case we'd better give an error message and keep guest running.
>
> The problem can be reproduced as follows:
>
> 1. run qemu
> 2. (monitor)object_add memory-backend-ram,size=10G,id=ram0
>
> Signed-off-by: Hu Tao 

Reviewed-by: Peter Crosthwaite 

> ---
>  backends/hostmem-ram.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index e55d066..a67a134 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
> **errp)
>
>  path = object_get_canonical_path_component(OBJECT(backend));
>  memory_region_init_ram(&backend->mr, OBJECT(backend), path,
> -   backend->size, &error_abort);
> +   backend->size, errp);
>  g_free(path);
>  }
>
> --
> 1.9.3
>
>

Re: [Qemu-devel] [PATCH v5 6/6] exec: improve error handling and reporting in file_ram_alloc() and gethugepagesize()

You subject line is excessively long. How about just "improve RAM file
error handling" and elaborate in a commit msg para?

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao  wrote:
> This patch fixes two problems of memory-backend-file:
>

It looks like two self contained changes. Any reason to not split?

> 1. If user adds a memory-backend-file object using object_add command,
>specifying a non-existing directory for property mem-path, qemu
>will core dump with message:
>
>  /nonexistingdir: No such file or directory
>  Bad ram offset f000
>  Aborted (core dumped)
>
>with this patch, qemu reports error message like:
>
>  qemu-system-x86_64: -object 
> memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
>  failed to stat file /nonexistingdir: No such file or directory
>
> 2. If user adds a memory-backend-file object using object_add command,
>specifying a size that is less than huge page size, qemu
>will core dump with message:
>
>  Bad ram offset f000
>  Aborted (core dumped)
>
>with this patch, qemu reports error message like:
>
>  qemu-system-x86_64: -object 
> memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
>  size 0x10 should be euqal or larger than huge page size 0x20

"equal".

>
> Signed-off-by: Hu Tao 
> ---
>  exec.c | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index 7e60a44..6512820 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -996,7 +996,7 @@ void qemu_mutex_unlock_ramlist(void)
>

>  #define HUGETLBFS_MAGIC   0x958458f6
>
> -static long gethugepagesize(const char *path)
> +static long gethugepagesize(const char *path, Error **errp)
>  {
>  struct statfs fs;
>  int ret;
> @@ -1006,7 +1006,7 @@ static long gethugepagesize(const char *path)
>  } while (ret != 0 && errno == EINTR);
>
>  if (ret != 0) {
> -perror(path);
> +error_setg_errno(errp, errno, "failed to get size of file %s", path);

I think your error message is imprecise. It's not the file size you
are trying to get its the page size for that file (or its underlying
file system I think).

>  return 0;
>  }
>
> @@ -1024,17 +1024,20 @@ static void *file_ram_alloc(RAMBlock *block,
>  char *filename;
>  char *sanitized_name;
>  char *c;
> -void *area;
> +void *area = NULL;
>  int fd;
> -unsigned long hpagesize;
> +uint64_t hpagesize;
>
> -hpagesize = gethugepagesize(path);
> -if (!hpagesize) {
> +hpagesize = gethugepagesize(path, errp);
> +if (errp && *errp) {

More flow control dependent on non NULL errp. I think you want a
local_err for safety here.

>  goto error;
>  }
>
>  if (memory < hpagesize) {
> -return NULL;
> +error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be euqal to "

"equal"

Regards,
Peter

> +   "or larger than huge page size 0x%" PRIx64,
> +   memory, hpagesize);
> +goto error;
>  }
>
>  if (kvm_enabled() && !kvm_has_sync_mmu()) {
> @@ -1094,8 +1097,8 @@ static void *file_ram_alloc(RAMBlock *block,
>  return area;
>
>  error:
> -if (mem_prealloc) {
> -exit(1);
> +if (area && area != MAP_FAILED) {
> +munmap(area, memory);
>  }
>  return NULL;
>  }
> --
> 1.9.3
>
>

[Qemu-devel] [PATCH V5 0/5] tests: Add the image fuzzer with qcow2 support


This patch series introduces the image fuzzer, a tool for stability and
reliability testing.
Its approach is to run large amount of tests in background. During every test a
program (e.g. qemu-img) is called to read or modify an invalid test image.
A test image has valid inner structure defined by its format specification with
some fields having random invalid values.

Patch 1 contains documentation for the image fuzzer, patch 2 is the test runner
and remaining ones relate to the image generator for qcow2 format.

This patch series was created for the 'block-next' branch.

v4 -> v5:
Runner:
 * Added a warning message if a backing file failed to be created (based on
   the review of Fam Zheng)
 * Back up a test image before every test command
 * Fixed always logged messages of a program under test
 * Added a warning message if a wrong name of a program under test is specified
 * Made offset and length qemu-io arguments multiple of a sector size
 * Print all errors to stderr

Layout:
 * Fixed estimation of the feature name table length

Fuzz:
 * Added fuzzing vector for 8bit values

Overall:
 * Missed white spaces (based on reviews of Fam Zheng and Stefan Hajnoczi)
 * Simplified attribute calls (based on the review of Stefan Hajnoczi)

Maria Kustova (5):
  docs: Specification for the image fuzzer
  runner: Tool for fuzz tests execution
  fuzz: Fuzzing functions for qcow2 images
  layout: Generator of fuzzed qcow2 images
  package: Public API for image-fuzzer/runner/runner.py

 tests/image-fuzzer/docs/image-fuzzer.txt | 239 ++
 tests/image-fuzzer/qcow2/__init__.py |   1 +
 tests/image-fuzzer/qcow2/fuzz.py | 327 +
 tests/image-fuzzer/qcow2/layout.py   | 369 
 tests/image-fuzzer/runner/runner.py  | 405 +++
 5 files changed, 1341 insertions(+)
 create mode 100644 tests/image-fuzzer/docs/image-fuzzer.txt
 create mode 100644 tests/image-fuzzer/qcow2/__init__.py
 create mode 100644 tests/image-fuzzer/qcow2/fuzz.py
 create mode 100644 tests/image-fuzzer/qcow2/layout.py
 create mode 100755 tests/image-fuzzer/runner/runner.py

-- 
1.9.3

[Qemu-devel] [PATCH V5 5/5] package: Public API for image-fuzzer/runner/runner.py

__init__.py provides the public API required by the test runner

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/__init__.py | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 tests/image-fuzzer/qcow2/__init__.py

diff --git a/tests/image-fuzzer/qcow2/__init__.py 
b/tests/image-fuzzer/qcow2/__init__.py
new file mode 100644
index 000..e2ebe19
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/__init__.py
@@ -0,0 +1 @@
+from layout import create_image
-- 
1.9.3

[Qemu-devel] [PATCH V5 1/5] docs: Specification for the image fuzzer

'Overall fuzzer requirements' chapter contains the current product vision and
features done and to be done. This chapter is still in progress.

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/docs/image-fuzzer.txt | 239 +++
 1 file changed, 239 insertions(+)
 create mode 100644 tests/image-fuzzer/docs/image-fuzzer.txt

diff --git a/tests/image-fuzzer/docs/image-fuzzer.txt 
b/tests/image-fuzzer/docs/image-fuzzer.txt
new file mode 100644
index 000..efe0ed4
--- /dev/null
+++ b/tests/image-fuzzer/docs/image-fuzzer.txt
@@ -0,0 +1,239 @@
+# Specification for the fuzz testing tool
+#
+# Copyright (C) 2014 Maria Kustova 
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+
+Image fuzzer
+
+
+Description
+---
+
+The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img
+by providing to them randomly corrupted images.
+Test images are generated from scratch and have valid inner structure with some
+elements, e.g. L1/L2 tables, having random invalid values.
+
+
+Test runner
+---
+
+The test runner generates test images, executes tests utilizing generated
+images, indicates their results and collects all test related artifacts (logs,
+core dumps, test images).
+The test means execution of all available commands under test with the same
+generated test image.
+By default, the test runner generates new tests and executes them until
+keyboard interruption. But if a test seed is specified via the '--seed' runner
+parameter, then only one test with this seed will be executed, after its finish
+the runner will exit.
+
+The runner uses an external image fuzzer to generate test images. An image
+generator should be specified as a mandatory parameter of the test runner.
+Details about interactions between the runner and fuzzers see "Module
+interfaces".
+
+The runner activates generation of core dumps during test executions, but it
+assumes that core dumps will be generated in the current working directory.
+For comprehensive test results, please, set up your test environment
+properly.
+
+Paths to binaries under test (SUTs) qemu-img and qemu-io are retrieved from
+environment variables. If the environment check fails the runner will
+use SUTs installed in system paths.
+qemu-img is required for creation of backing files, so it's mandatory to set
+the related environment variable if it's not installed in the system path.
+For details about environment variables see qemu-iotests/check.
+
+The runner accepts via the '--config' argument a JSON array of fields expected
+to be fuzzed, e.g.
+
+   '[["feature_name_table"], ["header", "l1_table_offset"]]'
+
+Each sublist can be have one or two strings defining image structure elements.
+In the latter case a parent element should be placed on the first position,
+and a field name on the second one.
+
+The runner accepts a list of commands under test as a JSON array via
+the '--command' argument. Each command is a list containing a SUT and all its
+arguments, e.g.
+
+   runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]'
+ /tmp/test ../qcow2
+
+For variable arguments next aliases can be used:
+- $test_img for a fuzzed img
+- $off for an offset in the fuzzed image
+- $len for a data size
+
+Values for last two aliases will be generated based on the size of the virtal
+disk in the fuzzed image.
+In case when no commands are specified the runner will execute commands from
+the default list:
+- qemu-img check
+- qemu-img info
+- qemu-img convert
+- qemu-io -c read
+- qemu-io -c write
+- qemu-io -c aio_read
+- qemu-io -c aio_write
+- qemu-io -c flush
+- qemu-io -c discard
+- qemu-io -c truncate
+
+
+Qcow2 image generator
+-
+
+The 'qcow2' generator is a Python package providing 'create_image' method as
+a single public API. See details in 'Test runner/image fuzzer' in 'Module
+interfaces'.
+
+Qcow2 contains two submodules: fuzz.py and layout.py.
+
+'fuzz.py' contains all fuzzing functions, one per image field. It's assumed
+that after code analysis every field will have own constraints for its value.
+For now only universal potentially dangerous values are used, e.g. type limits
+for integers or unsafe symbols as '%s' for strings. For bitmasks random amount
+of bits are set to ones. All fuzzed values are checked on non-equality to

[Qemu-devel] [PATCH V5 4/5] layout: Generator of fuzzed qcow2 images

The layout submodule of the qcow2 package creates a random valid image,
randomly selects some amount of its fields, fuzzes them and write the fuzzed
image to the file. Fuzzing process can be controlled by an external
configuration.

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/layout.py | 369 +
 1 file changed, 369 insertions(+)
 create mode 100644 tests/image-fuzzer/qcow2/layout.py

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
new file mode 100644
index 000..4c08202
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -0,0 +1,369 @@
+# Generator of fuzzed qcow2 images
+#
+# Copyright (C) 2014 Maria Kustova 
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import random
+import struct
+import fuzz
+
+MAX_IMAGE_SIZE = 10 * (1 << 20)
+# Standard sizes
+UINT32_S = 4
+UINT64_S = 8
+
+
+class Field(object):
+
+"""Atomic image element (field).
+
+The class represents an image field as quadruple of a data format
+of value necessary for its packing to binary form, an offset from
+the beginning of the image, a value and a name.
+
+The field can be iterated as a list [format, offset, value].
+"""
+
+__slots__ = ('fmt', 'offset', 'value', 'name')
+
+def __init__(self, fmt, offset, val, name):
+self.fmt = fmt
+self.offset = offset
+self.value = val
+self.name = name
+
+def __iter__(self):
+return iter([self.fmt, self.offset, self.value])
+
+def __repr__(self):
+return "Field(fmt='%s', offset=%d, value=%s, name=%s)" % \
+(self.fmt, self.offset, str(self.value), self.name)
+
+
+class FieldsList(object):
+
+"""List of fields.
+
+The class allows access to a field in the list by its name and joins
+several list in one via in-place addition.
+"""
+
+def __init__(self, meta_data=None):
+if meta_data is None:
+self.data = []
+else:
+self.data = [Field(f[0], f[1], f[2], f[3])
+ for f in meta_data]
+
+def __getitem__(self, name):
+return [x for x in self.data if x.name == name]
+
+def __iter__(self):
+return iter(self.data)
+
+def __iadd__(self, other):
+self.data += other.data
+return self
+
+def __len__(self):
+return len(self.data)
+
+
+class Image(object):
+
+""" Qcow2 image object.
+
+This class allows to create qcow2 images with random valid structures and
+values, fuzz them via external qcow2.fuzz module and write the result to
+a file.
+"""
+
+@staticmethod
+def _size_params():
+"""Generate a random image size aligned to a random correct
+cluster size.
+"""
+cluster_bits = random.randrange(9, 21)
+cluster_size = 1 << cluster_bits
+img_size = random.randrange(0, MAX_IMAGE_SIZE + 1, cluster_size)
+return (cluster_bits, img_size)
+
+@staticmethod
+def _header(cluster_bits, img_size, backing_file_name=None):
+"""Generate a random valid header."""
+meta_header = [
+['>4s', 0, "QFI\xfb", 'magic'],
+['>I', 4, random.randint(2, 3), 'version'],
+['>Q', 8, 0, 'backing_file_offset'],
+['>I', 16, 0, 'backing_file_size'],
+['>I', 20, cluster_bits, 'cluster_bits'],
+['>Q', 24, img_size, 'size'],
+['>I', 32, 0, 'crypt_method'],
+['>I', 36, 0, 'l1_size'],
+['>Q', 40, 0, 'l1_table_offset'],
+['>Q', 48, 0, 'refcount_table_offset'],
+['>I', 56, 0, 'refcount_table_clusters'],
+['>I', 60, 0, 'nb_snapshots'],
+['>Q', 64, 0, 'snapshots_offset'],
+['>Q', 72, 0, 'incompatible_features'],
+['>Q', 80, 0, 'compatible_features'],
+['>Q', 88, 0, 'autoclear_features'],
+# Only refcount_order = 4 is supported by current (07.2014)
+# implementation of QEMU
+['>I', 96, 4, 'refcount_order'],
+['>I', 100, 0, 'header_length']
+]
+v_header = FieldsList(meta_header)
+
+if v_header['version'][0].value == 2:
+v_header['header_length'][0].value = 72
+else:
+v_header['incompatible_features'][0].value = random.getrandbits(2)
+

[Qemu-devel] [PATCH V5 2/5] runner: Tool for fuzz tests execution

The purpose of the test runner is to prepare the test environment (e.g. create
a work directory, a test image, etc), execute a program under test with
parameters, indicate a test failure if the program was killed during the test
execution and collect core dumps, logs and other test artifacts.

The test runner doesn't depend on an image format or a program will be tested,
so it can be used with any external image generator and program under test.

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/runner/runner.py | 405 
 1 file changed, 405 insertions(+)
 create mode 100755 tests/image-fuzzer/runner/runner.py

diff --git a/tests/image-fuzzer/runner/runner.py 
b/tests/image-fuzzer/runner/runner.py
new file mode 100755
index 000..3fa7fca
--- /dev/null
+++ b/tests/image-fuzzer/runner/runner.py
@@ -0,0 +1,405 @@
+#!/usr/bin/env python
+
+# Tool for running fuzz tests
+#
+# Copyright (C) 2014 Maria Kustova 
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import sys
+import os
+import signal
+import subprocess
+import random
+import shutil
+from itertools import count
+import getopt
+import StringIO
+import resource
+
+try:
+import json
+except ImportError:
+try:
+import simplejson as json
+except ImportError:
+print >>sys.stderr, \
+"Warning: Module for JSON processing is not found.\n" \
+"'--config' and '--command' options are not supported."
+
+# Backing file sizes in MB
+MAX_BACKING_FILE_SIZE = 10
+MIN_BACKING_FILE_SIZE = 1
+
+
+def multilog(msg, *output):
+""" Write an object to all of specified file descriptors."""
+for fd in output:
+fd.write(msg)
+fd.flush()
+
+
+def str_signal(sig):
+""" Convert a numeric value of a system signal to the string one
+defined by the current operational system.
+"""
+for k, v in signal.__dict__.items():
+if v == sig:
+return k
+
+
+def run_app(fd, q_args):
+"""Start an application with specified arguments and return its exit code
+or kill signal depending on the result of execution.
+"""
+devnull = open('/dev/null', 'r+')
+process = subprocess.Popen(q_args, stdin=devnull,
+   stdout=subprocess.PIPE,
+   stderr=subprocess.PIPE)
+out, err = process.communicate()
+fd.write(out)
+fd.write(err)
+return process.returncode
+
+
+class TestException(Exception):
+"""Exception for errors risen by TestEnv objects."""
+pass
+
+
+class TestEnv(object):
+
+"""Test object.
+
+The class sets up test environment, generates backing and test images
+and executes application under tests with specified arguments and a test
+image provided.
+
+All logs are collected.
+
+The summary log will contain short descriptions and statuses of tests in
+a run.
+
+The test log will include application (e.g. 'qemu-img') logs besides info
+sent to the summary log.
+"""
+
+def __init__(self, test_id, seed, work_dir, run_log,
+ cleanup=True, log_all=False):
+"""Set test environment in a specified work directory.
+
+Path to qemu-img and qemu-io will be retrieved from 'QEMU_IMG' and
+'QEMU_IO' environment variables.
+"""
+if seed is not None:
+self.seed = seed
+else:
+self.seed = str(random.randint(0, sys.maxint))
+random.seed(self.seed)
+
+self.init_path = os.getcwd()
+self.work_dir = work_dir
+self.current_dir = os.path.join(work_dir, 'test-' + test_id)
+self.qemu_img = os.environ.get('QEMU_IMG', 'qemu-img')\
+  .strip().split(' ')
+self.qemu_io = os.environ.get('QEMU_IO', 'qemu-io').strip().split(' ')
+self.commands = [['qemu-img', 'check', '-f', 'qcow2', '$test_img'],
+ ['qemu-img', 'info', '-f', 'qcow2', '$test_img'],
+ ['qemu-io', '$test_img', '-c', 'read $off $len'],
+ ['qemu-io', '$test_img', '-c', 'write $off $len'],
+ ['qemu-io', '$test_img', '-c',
+  'aio_read $off $len'],
+ ['qemu-io', '$test_img', '-c',
+  'aio_write $off $len'],
+ ['qemu-io', '$test_i

[Qemu-devel] [PATCH V5 3/5] fuzz: Fuzzing functions for qcow2 images

The fuzz submodule of the qcow2 image generator contains fuzzing functions for
image fields.
Each fuzzing function contains a list of constraints and a call of a helper
function that randomly selects a fuzzed value satisfied to one of constraints.
For now constraints include only known as invalid or potentially dangerous
values. But after investigation of code coverage by fuzz tests they will be
expanded by heuristic values based on inner checks and flows of a program
under test.

Now fuzzing of a header, header extensions and a backing file name is
supported.

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/fuzz.py | 327 +++
 1 file changed, 327 insertions(+)
 create mode 100644 tests/image-fuzzer/qcow2/fuzz.py

diff --git a/tests/image-fuzzer/qcow2/fuzz.py b/tests/image-fuzzer/qcow2/fuzz.py
new file mode 100644
index 000..a53c84f
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/fuzz.py
@@ -0,0 +1,327 @@
+# Fuzzing functions for qcow2 fields
+#
+# Copyright (C) 2014 Maria Kustova 
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import random
+
+
+UINT8 = 0xff
+UINT32 = 0x
+UINT64 = 0x
+# Most significant bit orders
+UINT32_M = 31
+UINT64_M = 63
+# Fuzz vectors
+UINT8_V = [0, 0x10, UINT8/4, UINT8/2 - 1, UINT8/2, UINT8/2 + 1, UINT8 - 1,
+   UINT8]
+UINT32_V = [0, 0x100, 0x1000, 0x1, 0x10, UINT32/4, UINT32/2 - 1,
+UINT32/2, UINT32/2 + 1, UINT32 - 1, UINT32]
+UINT64_V = UINT32_V + [0x100, 0x1000, 0x1, UINT64/4,
+   UINT64/2 - 1, UINT64/2, UINT64/2 + 1, UINT64 - 1,
+   UINT64]
+STRING_V = ['%s%p%x%d', '.1024d', '%.2049d', '%p%p%p%p', '%x%x%x%x',
+'%d%d%d%d', '%s%s%s%s', '%999s', '%08x', '%%20d', '%%20n',
+'%%20x', '%%20s', '%s%s%s%s%s%s%s%s%s%s', '%p%p%p%p%p%p%p%p%p%p',
+'%#0123456x%08x%x%s%p%d%n%o%u%c%h%l%q%j%z%Z%t%i%e%g%f%a%C%S%08x%%',
+'%s x 129', '%x x 257']
+
+
+def random_from_intervals(intervals):
+"""Select a random integer number from the list of specified intervals.
+
+Each interval is a tuple of lower and upper limits of the interval. The
+limits are included. Intervals in a list should not overlap.
+"""
+total = reduce(lambda x, y: x + y[1] - y[0] + 1, intervals, 0)
+r = random.randint(0, total - 1) + intervals[0][0]
+for x in zip(intervals, intervals[1:]):
+r = r + (r > x[0][1]) * (x[1][0] - x[0][1] - 1)
+return r
+
+
+def random_bits(bit_ranges):
+"""Generate random binary mask with ones in the specified bit ranges.
+
+Each bit_ranges is a list of tuples of lower and upper limits of bit
+positions will be fuzzed. The limits are included. Random amount of bits
+in range limits will be set to ones. The mask is returned in decimal
+integer format.
+"""
+bit_numbers = []
+# Select random amount of random positions in bit_ranges
+for rng in bit_ranges:
+bit_numbers += random.sample(range(rng[0], rng[1] + 1),
+ random.randint(0, rng[1] - rng[0] + 1))
+val = 0
+# Set bits on selected positions to ones
+for bit in bit_numbers:
+val |= 1 << bit
+return val
+
+
+def truncate_string(strings, length):
+"""Return strings truncated to specified length."""
+if type(strings) == list:
+return [s[:length] for s in strings]
+else:
+return strings[:length]
+
+
+def validator(current, pick, choices):
+"""Return a value not equal to the current selected by the pick
+function from choices.
+"""
+while True:
+val = pick(choices)
+if not val == current:
+return val
+
+
+def int_validator(current, intervals):
+"""Return a random value from intervals not equal to the current.
+
+This function is useful for selection from valid values except current one.
+"""
+return validator(current, random_from_intervals, intervals)
+
+
+def bit_validator(current, bit_ranges):
+"""Return a random bit mask not equal to the current.
+
+This function is useful for selection from valid values except current one.
+"""
+return validator(current, random_bits, bit_ranges)
+
+
+def string_validator(current, strings):
+"""Return a random string value from the list not equal to the current.
+
+

Re: [Qemu-devel] [edk2] license for binary drivers

Il 06/08/2014 12:34, Laszlo Ersek ha scritto:
> So no, you can't ship an OVMF binary (or source tarball) that contains
> the FAT driver, bundled as part of the GPLv2 (+compatible) QEMU
> distribution, either in source or in binary form.

What Laszlo said is mostly my understanding too (IANAL etc.).

Just one thing: the GPL does allow you to "merely aggregate" the OVMF
binaries with QEMU sources or QEMU binaries; and a lawyer could also
tell you the if QEMU were to ship OVMF binaries in its tarball, it would
be mere aggregation.  It would then be allowed by the GPL.

I wouldn't disagree; the OVMF binaries are just data as far as QEMU is
concerned.

However, the non-free nature of the OVMF binaries mean that QEMU will
never ever ship OVMF binaries until the license is fixed for the
offending FAT driver.  Not only because we don't want to get into legal
minefields, but also because QEMU is free software and wants to keep its
distributed releases entirely free.

Paolo

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 4:50 PM, Paolo Bonzini  wrote:
> Il 06/08/2014 10:38, Ming Lei ha scritto:
>> On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini  wrote:
>>> Il 06/08/2014 07:33, Ming Lei ha scritto:
>> I played a bit with the following, I hope it's not too naive. I couldn't
>> see a difference with your patches, but at least one reason for this is
>> probably that my laptop SSD isn't fast enough to make the CPU the
>> bottleneck. Haven't tried ramdisk yet, that would probably be the next
>> thing. (I actually wrote the patch up just for some profiling on my own,
>> not for comparing throughput, but it should be usable for that as well.)
 This might not be good for the test since it is basically a sequential
 read test, which can be optimized a lot by kernel. And I always use
 randread benchmark.
>>>
>>> A microbenchmark already exists in tests/test-coroutine.c, and doesn't
>>> really tell us much; it's obvious that coroutines execute more code, the
>>> question is why it affects the iops performance.
>>
>> Could you take a look at the coroutine benchmark I worte?  The running
>> result shows coroutine does decrease performance a lot compared with
>> bypass coroutine like the patchset is doing.
>
> Your benchmark is synchronous, while disk I/O is asynchronous.

It can be thought as asynchronous too, since it won't sleep like
synchronous I/O.

Basically the IO thread is CPU bound type in case of linux-aio
since both submission and completion won't block CPU mostly,
so my benchmark still fits if we thought the completion as nop.

The current problem is that from single coroutine benchmark,
looks it doesn't hurt performance with stack switch, but in Kevin's
block aio benchmark, bypass coroutine can still obtain observable
improvement.

>
> Your benchmark doesn't add much compared to "time tests/test-coroutine
> -m perf  -p /perf/yield".  It takes 8 seconds on my machine, and 10^8
> function calls obviously take less than 8 seconds.  I've sent a patch to
> add a "baseline" function call benchmark to test-coroutine.
>
>>> The sequential read should be the right workload.  For fio, you want to
>>> get as many iops as possible to QEMU and so you need randread.  But
>>> qemu-img is not run in a guest and if the kernel optimizes sequential
>>> reads then the bypass should have even more benefits because it makes
>>> userspace proportionally more expensive.
>
> Do you agree with this?

Yes, I have posted the result of the benchmark, and looks the result
is basically similar with my previous test on dataplane.

Thanks,

[Qemu-devel] [PATCH V2 0/3] image-fuzzer: Support L1/L2 tables in the qcow2 image generator

This patch series adds support of L1/L2 tables to the qcow2 image generator.

This patch series was created for the 'block-next' branch and based on the next
series:
 [PATCH V5 0/5] tests: Add the image fuzzer with qcow2 support.

v1 -> v2:
   * Rebased to the new version of the parent patch series
   * Fixed wrong maximum number of L2 tables
   * Fixed missed whitespaces (based on the review of Stefan Hajnoczi)

Maria Kustova (3):
  docs: Expand the list of supported image elements with L1/L2 tables
  fuzz: Add fuzzing functions for L1/L2 table entries
  layout: Add generators of L1/L2 tables

 tests/image-fuzzer/docs/image-fuzzer.txt |   3 +-
 tests/image-fuzzer/qcow2/fuzz.py |  28 
 tests/image-fuzzer/qcow2/layout.py   | 273 ---
 3 files changed, 240 insertions(+), 64 deletions(-)

-- 
1.9.3

[Qemu-devel] [PATCH V2 1/3] docs: Expand the list of supported image elements with L1/L2 tables

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/docs/image-fuzzer.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/image-fuzzer/docs/image-fuzzer.txt 
b/tests/image-fuzzer/docs/image-fuzzer.txt
index efe0ed4..2e8e3b9 100644
--- a/tests/image-fuzzer/docs/image-fuzzer.txt
+++ b/tests/image-fuzzer/docs/image-fuzzer.txt
@@ -125,8 +125,7 @@ If a fuzzer configuration is specified, then it has the 
next interpretation:
 will be always fuzzed for every test. This case is useful for regression
 testing.
 
-For now only header fields and header extensions are generated.
-
+For now only header fields, header extensions and L1/L2 tables are generated.
 
 Module interfaces
 -
-- 
1.9.3

[Qemu-devel] [PATCH V2 3/3] layout: Add generators of L1/L2 tables

Valid L2 entries contain offsets to image clusters filled with random data.
L2 entries have random positions inside L2 tables. L1 entries contain offsets
to generated L2 tables and also have random positions inside the L1 table.
Clusters for L1/L2 tables and guest data are selected randomly.

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/layout.py | 273 -
 1 file changed, 211 insertions(+), 62 deletions(-)

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
index 4c08202..7839d2c 100644
--- a/tests/image-fuzzer/qcow2/layout.py
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -19,6 +19,8 @@
 import random
 import struct
 import fuzz
+from math import ceil
+from os import urandom
 
 MAX_IMAGE_SIZE = 10 * (1 << 20)
 # Standard sizes
@@ -102,7 +104,66 @@ class Image(object):
 return (cluster_bits, img_size)
 
 @staticmethod
-def _header(cluster_bits, img_size, backing_file_name=None):
+def _get_available_clusters(used, number):
+"""Return a set of indices of not allocated clusters.
+
+'used' contains indices of currently allocated clusters.
+All clusters that cannot be allocated between 'used' clusters will have
+indices appended to the end of 'used'.
+"""
+append_id = max(used) + 1
+free = set(range(1, append_id)) - used
+if len(free) >= number:
+return set(random.sample(free, number))
+else:
+return free | set(range(append_id, append_id + number - len(free)))
+
+@staticmethod
+def _get_adjacent_clusters(used, size):
+"""Return an index of the first cluster in the sequence of free ones.
+
+'used' contains indices of currently allocated clusters. 'size' is the
+length of the sequence of free clusters.
+If the sequence of 'size' is not available between 'used' clusters, its
+first index will be append to the end of 'used'.
+"""
+def get_cluster_id(lst, length):
+"""Return the first index of the sequence of the specified length
+or None if the sequence cannot be inserted in the list.
+"""
+if len(lst) != 0:
+pairs = []
+pair = (lst[0], 1)
+for i in range(1, len(lst)):
+if lst[i] == lst[i-1] + 1:
+pair = (lst[i], pair[1] + 1)
+else:
+pairs.append(pair)
+pair = (lst[i], 1)
+pairs.append(pair)
+random.shuffle(pairs)
+for x, s in pairs:
+if s >= length:
+return x - length + 1
+return None
+
+append_id = max(used) + 1
+free = list(set(range(1, append_id)) - used)
+idx = get_cluster_id(free, size)
+if idx is None:
+return append_id
+else:
+return idx
+
+@staticmethod
+def _alloc_data(img_size, cluster_size):
+"""Return a set of random indices of clusters allocated for guest data.
+"""
+num_of_cls = img_size/cluster_size
+return set(random.sample(range(1, num_of_cls + 1),
+ random.randint(0, num_of_cls)))
+
+def create_header(self, cluster_bits, backing_file_name=None):
 """Generate a random valid header."""
 meta_header = [
 ['>4s', 0, "QFI\xfb", 'magic'],
@@ -110,7 +171,7 @@ class Image(object):
 ['>Q', 8, 0, 'backing_file_offset'],
 ['>I', 16, 0, 'backing_file_size'],
 ['>I', 20, cluster_bits, 'cluster_bits'],
-['>Q', 24, img_size, 'size'],
+['>Q', 24, self.image_size, 'size'],
 ['>I', 32, 0, 'crypt_method'],
 ['>I', 36, 0, 'l1_size'],
 ['>Q', 40, 0, 'l1_table_offset'],
@@ -126,63 +187,59 @@ class Image(object):
 ['>I', 96, 4, 'refcount_order'],
 ['>I', 100, 0, 'header_length']
 ]
-v_header = FieldsList(meta_header)
+self.header = FieldsList(meta_header)
 
-if v_header['version'][0].value == 2:
-v_header['header_length'][0].value = 72
+if self.header['version'][0].value == 2:
+self.header['header_length'][0].value = 72
 else:
-v_header['incompatible_features'][0].value = random.getrandbits(2)
-v_header['compatible_features'][0].value = random.getrandbits(1)
-v_header['header_length'][0].value = 104
-
-max_header_len = struct.calcsize(v_header['header_length'][0].fmt) + \
- v_header['header_length'][0].offset
+self.header['incompatible_features'][0].value = \
+random.getrandbits(2)
+self.header['compatible_features'][0].value = random.getrandbits(1)
+

[Qemu-devel] [PATCH V2 2/3] fuzz: Add fuzzing functions for L1/L2 table entries

Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/fuzz.py | 28 
 1 file changed, 28 insertions(+)

diff --git a/tests/image-fuzzer/qcow2/fuzz.py b/tests/image-fuzzer/qcow2/fuzz.py
index a53c84f..57527f9 100644
--- a/tests/image-fuzzer/qcow2/fuzz.py
+++ b/tests/image-fuzzer/qcow2/fuzz.py
@@ -325,3 +325,31 @@ def feature_name(current):
 truncate_string(STRING_V, 46)  # Fuzz padding (field length = 46)
 ]
 return selector(current, constraints, string_validator)
+
+
+def l1_entry(current):
+"""Fuzz an entry of the L1 table."""
+constraints = UINT64_V
+# Reserved bits are ignored
+# Added a possibility when only flags are fuzzed
+offset = 0x7fff & random.choice([selector(current,
+  constraints),
+ current])
+is_cow = random.randint(0, 1)
+return offset + (is_cow << UINT64_M)
+
+
+def l2_entry(current):
+"""Fuzz an entry of an L2 table."""
+constraints = UINT64_V
+# Reserved bits are ignored
+# Add a possibility when only flags are fuzzed
+offset = 0x3ffe & random.choice([selector(current,
+  constraints),
+ current])
+is_compressed = random.randint(0, 1)
+is_cow = random.randint(0, 1)
+is_zero = random.randint(0, 1)
+value = offset + (is_cow << UINT64_M) + \
+(is_compressed << UINT64_M - 1) + is_zero
+return value
-- 
1.9.3

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 13:28 hat Ming Lei geschrieben:
> On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf  wrote:
> > Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
> >> On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf  wrote:
> >> > Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
> >> >> Hi Kevin,
> >> >>
> >> >> On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf  wrote:
> >> >> > Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
> >> >> >> I have been wondering how to prove that the root cause is the 
> >> >> >> ucontext
> >> >> >> coroutine mechanism (stack switching).  Here is an idea:
> >> >> >>
> >> >> >> Hack your "bypass" code path to run the request inside a coroutine.
> >> >> >> That way you can compare "bypass without coroutine" against "bypass 
> >> >> >> with
> >> >> >> coroutine".
> >> >> >>
> >> >> >> Right now I think there are doubts because the bypass code path is
> >> >> >> indeed a different (and not 100% correct) code path.  So this 
> >> >> >> approach
> >> >> >> might prove that the coroutines are adding the overhead and not
> >> >> >> something that you bypassed.
> >> >> >
> >> >> > My doubts aren't only that the overhead might not come from the
> >> >> > coroutines, but also whether any coroutine-related overhead is really
> >> >> > unavoidable. If we can optimise coroutines, I'd strongly prefer to do
> >> >> > just that instead of introducing additional code paths.
> >> >>
> >> >> OK, thank you for taking look at the problem, and hope we can
> >> >> figure out the root cause, :-)
> >> >>
> >> >> >
> >> >> > Another thought I had was this: If the performance difference is 
> >> >> > indeed
> >> >> > only coroutines, then that is completely inside the block layer and we
> >> >> > don't actually need a VM to test it. We could instead have something
> >> >> > like a simple qemu-img based benchmark and should be observing the 
> >> >> > same.
> >> >>
> >> >> Even it is simpler to run a coroutine-only benchmark, and I just
> >> >> wrote a raw one, and looks coroutine does decrease performance
> >> >> a lot, please see the attachment patch, and thanks for your template
> >> >> to help me add the 'co_bench' command in qemu-img.
> >> >
> >> > Yes, we can look at coroutines microbenchmarks in isolation. I actually
> >> > did do that yesterday with the yield test from tests/test-coroutine.c.
> >> > And in fact profiling immediately showed something to optimise:
> >> > pthread_getspecific() was quite high, replacing it by __thread on
> >> > systems where it works is more efficient and helped the numbers a bit.
> >> > Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
> >> > in qemu-img bench), maybe there's even something that can be done here.
> >>
> >> The lock/unlock in dataplane is often from memory_region_find(), and Paolo
> >> should have done lots of work on that.

qemu-img bench doesn't run that code. We have a few more locks that are
taken, and one of them (the coroutine pool lock) is avoided by your
bypass patches.

> >> >
> >> > However, I just wasn't sure whether a change on this level would be
> >> > relevant in a realistic environment. This is the reason why I wanted to
> >> > get a benchmark involving the block layer and some I/O.
> >> >
> >> >> From the profiling data in below link:
> >> >>
> >> >> http://pastebin.com/YwH2uwbq
> >> >>
> >> >> With coroutine, the running time for same loading is increased
> >> >> ~50%(1.325s vs. 0.903s), and dcache load events is increased
> >> >> ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
> >> >> 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
> >> >>
> >> >> The bypass code in the benchmark is very similar with the approach
> >> >> used in the bypass patch, since linux-aio with O_DIRECT seldom
> >> >> blocks in the the kernel I/O path.
> >> >>
> >> >> Maybe the benchmark is a bit extremely, but given modern storage
> >> >> device may reach millions of IOPS, and it is very easy to slow down
> >> >> the I/O by coroutine.
> >> >
> >> > I think in order to optimise coroutines, such benchmarks are fair game.
> >> > It's just not guaranteed that the effects are exactly the same on real
> >> > workloads, so we should take the results with a grain of salt.
> >> >
> >> > Anyhow, the coroutine version of your benchmark is buggy, it leaks all
> >> > coroutines instead of exiting them, so it can't make any use of the
> >> > coroutine pool. On my laptop, I get this (where fixed coroutine is a
> >> > version that simply removes the yield at the end):
> >> >
> >> > | bypass| fixed coro| buggy coro
> >> > +---+---+--
> >> > time| 1.09s | 1.10s | 1.62s
> >> > L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
> >> > insns per cycle | 2.39  | 2.39  | 1.90
> >> >
> >> > Begs the question whether you see a similar effect on a real qemu and
> >> > the coroutine pool is still not big eno

[Qemu-devel] [PATCH] layout: Reduce number of generator functions in init

Some issues can be found only when a fuzzed image has a partial structure,
e.g. has L1/L2 tables but no refcount ones. Generation of an entirely
defined image limits these cases. Now the Image constructor creates only
a header and a backing file name (if any), other image elements are generated
in the 'create_image' API.

This patch series was created for the 'block-next' branch and based on the next
series:
[PATCH V2 0/3] image-fuzzer: Support L1/L2 tables in the qcow2 image
 generator


Signed-off-by: Maria Kustova 
---
 tests/image-fuzzer/qcow2/layout.py | 92 --
 1 file changed, 49 insertions(+), 43 deletions(-)

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
index 7839d2c..98673c4 100644
--- a/tests/image-fuzzer/qcow2/layout.py
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -156,6 +156,14 @@ class Image(object):
 return idx
 
 @staticmethod
+def _get_cluster_ids(fields_list, cluster_size):
+"""Return indices of clusters allocated by fields of fields_list."""
+ids = set()
+for x in fields_list:
+ids.add(x.offset/cluster_size)
+return ids
+
+@staticmethod
 def _alloc_data(img_size, cluster_size):
 """Return a set of random indices of clusters allocated for guest data.
 """
@@ -196,12 +204,12 @@ class Image(object):
 random.getrandbits(2)
 self.header['compatible_features'][0].value = random.getrandbits(1)
 self.header['header_length'][0].value = 104
-
-max_header_len = struct.calcsize(
+# Extensions starts at the header last field offset and the field size
+self.ext_offset = struct.calcsize(
 self.header['header_length'][0].fmt) + \
 self.header['header_length'][0].offset
 end_of_extension_area_len = 2 * UINT32_S
-free_space = self.cluster_size - max_header_len - \
+free_space = self.cluster_size - self.ext_offset - \
  end_of_extension_area_len
 # If the backing file name specified and there is enough space for it
 # in the first cluster, then it's placed in the very end of the first
@@ -224,24 +232,18 @@ class Image(object):
 [data_fmt, self.header['backing_file_offset'][0].value,
  backing_file_name, 'bf_name']
 ])
-else:
-self.backing_file_name = FieldsList()
 
 def set_backing_file_format(self, backing_file_fmt=None):
 """Generate the header extension for the backing file
 format.
 """
-self.backing_file_format = FieldsList()
-offset = struct.calcsize(self.header['header_length'][0].fmt) + \
- self.header['header_length'][0].offset
-
 if backing_file_fmt is not None:
 # Calculation of the free space available in the first cluster
 end_of_extension_area_len = 2 * UINT32_S
 high_border = (self.header['backing_file_offset'][0].value or
(self.cluster_size - 1)) - \
 end_of_extension_area_len
-free_space = high_border - offset
+free_space = high_border - self.ext_offset
 ext_size = 2 * UINT32_S + ((len(backing_file_fmt) + 7) & ~7)
 
 if free_space >= ext_size:
@@ -249,18 +251,19 @@ class Image(object):
 ext_data_fmt = '>' + str(ext_data_len) + 's'
 ext_padding_len = 7 - (ext_data_len - 1) % 8
 self.backing_file_format = FieldsList([
-['>I', offset, 0xE2792ACA, 'ext_magic'],
-['>I', offset + UINT32_S, ext_data_len, 'ext_length'],
-[ext_data_fmt, offset + UINT32_S * 2, backing_file_fmt,
- 'bf_format']
+['>I', self.ext_offset, 0xE2792ACA, 'ext_magic'],
+['>I', self.ext_offset + UINT32_S, ext_data_len,
+ 'ext_length'],
+[ext_data_fmt, self.ext_offset + UINT32_S * 2,
+ backing_file_fmt, 'bf_format']
 ])
-offset = self.backing_file_format['bf_format'][0].offset + \
- struct.calcsize(self.backing_file_format[
- 'bf_format'][0].fmt) + ext_padding_len
-
-return offset
+self.ext_offset = \
+struct.calcsize(
+self.backing_file_format['bf_format'][0].fmt) + \
+ext_padding_len + \
+self.backing_file_format['bf_format'][0].offset
 
-def create_feature_name_table(self, offset):
+def create_feature_name_table(self):
 """Generate a random header extension for names of features used in
 the image.
 """
@@ -272,7 +275,7 @@ class Image(object):

Re: [Qemu-devel] [PATCH v2 1/2] hw/misc/arm_sp810: Create SP810 device

2014-08-06 Thread Aggeler Fabian


On 06 Aug 2014, at 01:03, Peter Crosthwaite  
wrote:

> On Tue, Aug 5, 2014 at 7:32 PM, Fabian Aggeler  wrote:
>> This adds a device model for the PrimeXsys System Controller (SP810)
>> which is present in the Versatile Express motherboards. It is
>> so far read-only but allows to set the SCCTRL register.
>> 
>> Signed-off-by: Fabian Aggeler 
>> ---
>> default-configs/arm-softmmu.mak |  1 +
>> hw/misc/Makefile.objs   |  1 +
>> hw/misc/arm_sp810.c | 98 
>> +
>> include/hw/misc/arm_sp810.h | 85 +++
>> 4 files changed, 185 insertions(+)
>> create mode 100644 hw/misc/arm_sp810.c
>> create mode 100644 include/hw/misc/arm_sp810.h
>> 
>> diff --git a/default-configs/arm-softmmu.mak 
>> b/default-configs/arm-softmmu.mak
>> index f3513fa..67ba99a 100644
>> --- a/default-configs/arm-softmmu.mak
>> +++ b/default-configs/arm-softmmu.mak
>> @@ -55,6 +55,7 @@ CONFIG_PL181=y
>> CONFIG_PL190=y
>> CONFIG_PL310=y
>> CONFIG_PL330=y
>> +CONFIG_SP810=y
>> CONFIG_CADENCE=y
>> CONFIG_XGMAC=y
>> CONFIG_EXYNOS4=y
>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
>> index 979e532..49d023b 100644
>> --- a/hw/misc/Makefile.objs
>> +++ b/hw/misc/Makefile.objs
>> @@ -13,6 +13,7 @@ common-obj-$(CONFIG_PL310) += arm_l2x0.o
>> common-obj-$(CONFIG_INTEGRATOR_DEBUG) += arm_integrator_debug.o
>> common-obj-$(CONFIG_A9SCU) += a9scu.o
>> common-obj-$(CONFIG_ARM11SCU) += arm11scu.o
>> +common-obj-$(CONFIG_SP810) += arm_sp810.o
>> 
>> # PKUnity SoC devices
>> common-obj-$(CONFIG_PUV3) += puv3_pm.o
>> diff --git a/hw/misc/arm_sp810.c b/hw/misc/arm_sp810.c
>> new file mode 100644
>> index 000..21eb816
>> --- /dev/null
>> +++ b/hw/misc/arm_sp810.c
>> @@ -0,0 +1,98 @@
>> +/*
>> + * ARM PrimeXsys System Controller (SP810)
>> + *
>> + * Copyright (C) 2014 Fabian Aggeler 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +
>> +#include "hw/misc/arm_sp810.h"
>> +
>> +static const VMStateDescription vmstate_arm_sysctl = {
>> +.name = "arm_sp810",
>> +.version_id = 1,
>> +.minimum_version_id = 1,
>> +.fields = (VMStateField[]) {
>> +VMSTATE_UINT32(sc_ctrl, arm_sp810_state),
>> +VMSTATE_END_OF_LIST()
>> +}
>> +};
>> +
>> +static uint64_t arm_sp810_read(void *opaque, hwaddr offset, unsigned size)
>> +{
>> +arm_sp810_state *s = opaque;
>> +
>> +switch (offset) {
>> +case SCCTRL:
>> +return s->sc_ctrl;
>> +default:
>> +qemu_log_mask(LOG_UNIMP,
>> +"arm_sp810_read: Register not yet implemented: %s\n",
>> +HWADDR_PRIx);
>> +return 0;
>> +}
>> +}
>> +
>> +static void arm_sp810_write(void *opaque, hwaddr offset,
>> +uint64_t val, unsigned size)
>> +{
>> +switch (offset) {
>> +default:
>> +qemu_log_mask(LOG_UNIMP,
>> +"arm_sp810_write: Register not yet implemented: %s\n",
>> +HWADDR_PRIx);
>> +return;
>> +}
>> +}
>> +
>> +static const MemoryRegionOps arm_sp810_ops = {
>> +.read = arm_sp810_read,
>> +.write = arm_sp810_write,
>> +.endianness = DEVICE_NATIVE_ENDIAN,
>> +};
>> +
>> +static void arm_sp810_init(Object *obj)
>> +{
>> +arm_sp810_state *s = ARM_SP810(obj);
>> +
>> +memory_region_init_io(&s->iomem, obj, &arm_sp810_ops, s, "arm-sp810",
>> +  0x1000);
>> +sysbus_init_mmio(SYS_BUS_DEVICE(obj), &s->iomem);
>> +}
>> +
>> +static Property arm_sp810_properties[] = {
>> +DEFINE_PROP_UINT32("sc-ctrl", arm_sp810_state, sc_ctrl, 0x09),
>> +DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
> 
> So it looks to me like the SCCTRL contains multiple register fields
> for multiple configurable options. Although i'm having trouble getting
> my head around it as I could not easily find public docco for this
> core (have a link handy?).

Unfortunately not as it seems ARM marked it as obsolete. We found 
the document offline. I could not find it on the web either. 
See also 
http://lists.infradead.org/pipermail/linux-arm-kernel/2011-January/038722.html

Indeed SCCTRL has many configuration bits, not just these 4 configuration 
options. 

> 
> Anyways, the thinking is you should define multiple configurable
> options as invididual QOM properties, rather than have board level
> code init registers. This would mean that you
> 
> DEFINE_PROP_UINT8("timeren0-sel", ...)
> DEFINE_PROP_UINT8("timeren1-sel", ...)
> 
> Have a look

[Qemu-devel] [Bug 1285708] Re: FreeBSD Guest crash on boot due to xsave instruction issue

2014-08-06 Thread Chris J Arges

** Changed in: linux (Ubuntu Precise)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1285708

Title:
  FreeBSD Guest crash on boot due to xsave instruction issue

Status in QEMU:
  New
Status in qemu-kvm:
  New
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Precise:
  Won't Fix
Status in “linux” source package in Trusty:
  In Progress

Bug description:
  When trying to boot a working FreeBSD 9.1/9.2 guest on a kvm/qemu host
  with the following command:

  kvm -m 256 -cdrom FreeBSD-9.2-RELEASE-amd64-disc1.iso -drive
  file=FreeBSD-9.2-RELEASE-amd64.qcow2,if=virtio -net nic,model=virtio
  -net user -nographic -vnc :10 -enable-kvm -balloon virtio -cpu
  core2duo,+xsave

  The FreeBSD Guest will kernel crash on boot with the following error:

  panic: CPU0 does not support X87 or SSE: 0

  When launching the guest without the cpu flags, it works just fine.

  This bug has been resolved in source:
  https://lkml.org/lkml/2014/2/22/58

  Can this fix be included in Precise ASAP!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1285708/+subscriptions

[Qemu-devel] [PULL 02/11] icount: put icount variables into TimerState.

From: KONRAD Frederic 

This puts qemu_icount and qemu_icount_bias into TimerState structure to allow
them to be migrated.

Signed-off-by: KONRAD Frederic 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 29 -
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index 5e7f2cf..127de1c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -102,17 +102,12 @@ static bool all_cpu_threads_idle(void)
 
 /* Protected by TimersState seqlock */
 
-/* Compensate for varying guest execution speed.  */
-static int64_t qemu_icount_bias;
 static int64_t vm_clock_warp_start;
 /* Conversion factor from emulated instructions to virtual clock ticks.  */
 static int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
 #define MAX_ICOUNT_SHIFT 10
 
-/* Only written by TCG thread */
-static int64_t qemu_icount;
-
 static QEMUTimer *icount_rt_timer;
 static QEMUTimer *icount_vm_timer;
 static QEMUTimer *icount_warp_timer;
@@ -129,6 +124,11 @@ typedef struct TimersState {
 int64_t cpu_clock_offset;
 int32_t cpu_ticks_enabled;
 int64_t dummy;
+
+/* Compensate for varying guest execution speed.  */
+int64_t qemu_icount_bias;
+/* Only written by TCG thread */
+int64_t qemu_icount;
 } TimersState;
 
 static TimersState timers_state;
@@ -139,14 +139,14 @@ static int64_t cpu_get_icount_locked(void)
 int64_t icount;
 CPUState *cpu = current_cpu;
 
-icount = qemu_icount;
+icount = timers_state.qemu_icount;
 if (cpu) {
 if (!cpu_can_do_io(cpu)) {
 fprintf(stderr, "Bad clock read\n");
 }
 icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
 }
-return qemu_icount_bias + (icount << icount_time_shift);
+return timers_state.qemu_icount_bias + (icount << icount_time_shift);
 }
 
 int64_t cpu_get_icount(void)
@@ -284,7 +284,8 @@ static void icount_adjust(void)
 icount_time_shift++;
 }
 last_delta = delta;
-qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
+timers_state.qemu_icount_bias = cur_icount
+  - (timers_state.qemu_icount << 
icount_time_shift);
 seqlock_write_unlock(&timers_state.vm_clock_seqlock);
 }
 
@@ -333,7 +334,7 @@ static void icount_warp_rt(void *opaque)
 int64_t delta = cur_time - cur_icount;
 warp_delta = MIN(warp_delta, delta);
 }
-qemu_icount_bias += warp_delta;
+timers_state.qemu_icount_bias += warp_delta;
 }
 vm_clock_warp_start = -1;
 seqlock_write_unlock(&timers_state.vm_clock_seqlock);
@@ -351,7 +352,7 @@ void qtest_clock_warp(int64_t dest)
 int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
 int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
 seqlock_write_lock(&timers_state.vm_clock_seqlock);
-qemu_icount_bias += warp;
+timers_state.qemu_icount_bias += warp;
 seqlock_write_unlock(&timers_state.vm_clock_seqlock);
 
 qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
@@ -1250,7 +1251,8 @@ static int tcg_cpu_exec(CPUArchState *env)
 int64_t count;
 int64_t deadline;
 int decr;
-qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
+timers_state.qemu_icount -= (cpu->icount_decr.u16.low
++ cpu->icount_extra);
 cpu->icount_decr.u16.low = 0;
 cpu->icount_extra = 0;
 deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
@@ -1265,7 +1267,7 @@ static int tcg_cpu_exec(CPUArchState *env)
 }
 
 count = qemu_icount_round(deadline);
-qemu_icount += count;
+timers_state.qemu_icount += count;
 decr = (count > 0x) ? 0x : count;
 count -= decr;
 cpu->icount_decr.u16.low = decr;
@@ -1278,7 +1280,8 @@ static int tcg_cpu_exec(CPUArchState *env)
 if (use_icount) {
 /* Fold pending instructions back into the
instruction counter, and clear the interrupt flag.  */
-qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
+timers_state.qemu_icount -= (cpu->icount_decr.u16.low
++ cpu->icount_extra);
 cpu->icount_decr.u32 = 0;
 cpu->icount_extra = 0;
 }
-- 
1.9.3

[Qemu-devel] [PULL 00/11] KVM, icount changes for 2014-08-06

The following changes since commit 41a1a9c42c4e0fb5f1b94aa8b72e42f66ebde3d9:

  po: Update German translation (2014-07-28 23:37:17 +0200)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 92627748f7f7355bb2ea676a45791bd66b84aee0:

  target-mips: Ignore unassigned accesses with KVM (2014-08-06 17:53:07 +0200)


KVM changes include a MIPS patch and the testdev backend used by the
ARM kvm-unit-tests.  icount include the first part of reverse execution
and Sebastian Tanase's patches to slow down -icount execution to the
desired speed of the target.


James Hogan (1):
  target-mips: Ignore unassigned accesses with KVM

KONRAD Frederic (3):
  icount: put icount variables into TimerState.
  migration: migrate icount fields.
  timer: add cpu_icount_to_ns function.

Paolo Bonzini (1):
  backends: Introduce chr-testdev

Sebastian Tanase (6):
  icount: Fix virtual clock start value on ARM
  icount: Add QemuOpts for icount
  icount: Add align option to icount
  cpu-exec: Add sleeping algorithm
  cpu-exec: Print to console if the guest is late
  monitor: Add drift info to 'info jit'

 backends/Makefile.objs  |   2 +-
 backends/testdev.c  | 131 
 cpu-exec.c  | 116 ++
 cpus.c  | 114 ++---
 include/qemu-common.h   |   8 ++-
 include/qemu/timer.h|   2 +
 include/sysemu/char.h   |   3 ++
 monitor.c   |   1 +
 qapi-schema.json|   3 +-
 qemu-char.c |   4 ++
 qemu-options.hx |  17 +--
 qtest.c |  13 -
 stubs/Makefile.objs |   1 +
 stubs/chr-testdev.c |   7 +++
 target-mips/op_helper.c |  11 
 vl.c|  39 +++---
 16 files changed, 440 insertions(+), 32 deletions(-)
 create mode 100644 backends/testdev.c
 create mode 100644 stubs/chr-testdev.c
-- 
1.9.3

[Qemu-devel] [PULL 03/11] migration: migrate icount fields.

From: KONRAD Frederic 

This fixes a bug where qemu_icount and qemu_icount_bias are not migrated.
It adds a subsection "timer/icount" to vmstate_timers so icount is migrated only
when needed.

Signed-off-by: KONRAD Frederic 
Reviewed-by: Amit Shah 
Reviewed-by: Juan Quintela 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/cpus.c b/cpus.c
index 127de1c..a6b6557 100644
--- a/cpus.c
+++ b/cpus.c
@@ -429,6 +429,25 @@ void qemu_clock_warp(QEMUClockType type)
 }
 }
 
+static bool icount_state_needed(void *opaque)
+{
+return use_icount;
+}
+
+/*
+ * This is a subsection for icount migration.
+ */
+static const VMStateDescription icount_vmstate_timers = {
+.name = "timer/icount",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(qemu_icount_bias, TimersState),
+VMSTATE_INT64(qemu_icount, TimersState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_timers = {
 .name = "timer",
 .version_id = 2,
@@ -438,6 +457,14 @@ static const VMStateDescription vmstate_timers = {
 VMSTATE_INT64(dummy, TimersState),
 VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &icount_vmstate_timers,
+.needed = icount_state_needed,
+}, {
+/* empty */
+}
 }
 };
 
-- 
1.9.3

[Qemu-devel] [PULL 08/11] cpu-exec: Add sleeping algorithm

From: Sebastian Tanase 

The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.

At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.

Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.

As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, 
which
is still acceptable for our use case.

The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm 
versatile-pb
built with buildroot.

Currently, on our test machine, the lowest icount we can achieve that is 
suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) 
are
slower than the cpu tests (using stress).

Signed-off-by: Sebastian Tanase 
Tested-by: Camille Bégué 
Signed-off-by: Paolo Bonzini 
---
 cpu-exec.c   | 79 
 cpus.c   | 17 +++
 include/qemu/timer.h |  1 +
 3 files changed, 97 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 38e5f02..68f82b6 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -22,6 +22,72 @@
 #include "tcg.h"
 #include "qemu/atomic.h"
 #include "sysemu/qtest.h"
+#include "qemu/timer.h"
+
+/* -icount align implementation. */
+
+typedef struct SyncClocks {
+int64_t diff_clk;
+int64_t last_cpu_icount;
+} SyncClocks;
+
+#if !defined(CONFIG_USER_ONLY)
+/* Allow the guest to have a max 3ms advance.
+ * The difference between the 2 clocks could therefore
+ * oscillate around 0.
+ */
+#define VM_CLOCK_ADVANCE 300
+
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+int64_t cpu_icount;
+
+if (!icount_align_option) {
+return;
+}
+
+cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
+sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
+sc->last_cpu_icount = cpu_icount;
+
+if (sc->diff_clk > VM_CLOCK_ADVANCE) {
+#ifndef _WIN32
+struct timespec sleep_delay, rem_delay;
+sleep_delay.tv_sec = sc->diff_clk / 10LL;
+sleep_delay.tv_nsec = sc->diff_clk % 10LL;
+if (nanosleep(&sleep_delay, &rem_delay) < 0) {
+sc->diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 
10LL;
+sc->diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
+} else {
+sc->diff_clk = 0;
+}
+#else
+Sleep(sc->diff_clk / SCALE_MS);
+sc->diff_clk = 0;
+#endif
+}
+}
+
+static void init_delay_params(SyncClocks *sc,
+  const CPUState *cpu)
+{
+if (!icount_align_option) {
+return;
+}
+sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
+   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+   cpu_get_clock_offset();
+sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
+}
+#else
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+}
+
+static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
+{
+}
+#endif /* CONFIG USER ONLY */
 
 void cpu_loop_exit(CPUState *cpu)
 {
@@ -227,6 +293,8 @@ int cpu_exec(CPUArchState *env)
 TranslationBlock *tb;
 uint8_t *tc_ptr;
 uintptr_t next_tb;
+SyncClocks sc;
+
 /* This must be volatile so it is not trashed by longjmp() */
 volatile bool have_tb_lock = false;
 
@@ -283,6 +351,13 @@ int cpu_exec(CPUArchState *env)
 #endif
 cpu->exception_index = -1;
 
+/* Calculate difference between guest clock and host clock.
+ * This delay includes the delay of the last cycle, so
+ * what we have to do is sleep until it is 0. As for the
+ * advance/delay we gain here, we try to fix it next time.
+ */
+init_delay_params(&sc, cpu);
+
 /* prepare setjmp context fo

[Qemu-devel] [PULL 05/11] icount: Fix virtual clock start value on ARM

From: Sebastian Tanase 

When using the icount option on ARM, the virtual
clock starts counting at realtime clock but it
should start at 0.

The reason why the virtual clock starts at realtime clock
is because the first time we call qemu_clock_warp (which
calls icount_warp_rt) in tcg_exec_all, qemu_icount_bias
(which is part of the virtual time computation mechanism)
will increment by realtime - vm_clock_warp_start, with
vm_clock_warp_start being 0 (see icount_warp_rt in cpus.c).

By changing the value of vm_clock_warp_start from 0 to -1,
the first time we call qemu_clock_warp which calls
icount_warp_rt, we will return immediatly because
icount_warp_rt first checks if vm_clock_warp_start is -1
and if it's the case it returns. Therefore, qemu_icount_bias
will first be incremented by the value of a virtual timer
deadline when the virtual cpu goes from active to inactive.

The virtual time will start at 0 and increment based
on the instruction counter when the vcpu is active or
the qemu_icount_bias value when inactive.

Signed-off-by: Sebastian Tanase 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 62636a6..bbb8d4e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -102,7 +102,7 @@ static bool all_cpu_threads_idle(void)
 
 /* Protected by TimersState seqlock */
 
-static int64_t vm_clock_warp_start;
+static int64_t vm_clock_warp_start = -1;
 /* Conversion factor from emulated instructions to virtual clock ticks.  */
 static int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
-- 
1.9.3

[Qemu-devel] [PULL 04/11] timer: add cpu_icount_to_ns function.

From: KONRAD Frederic 

This adds cpu_icount_to_ns function which is needed for reverse execution.

It returns the time for a specific instruction.

Signed-off-by: KONRAD Frederic 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Paolo Bonzini 
---
 cpus.c   | 7 ++-
 include/qemu/timer.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index a6b6557..62636a6 100644
--- a/cpus.c
+++ b/cpus.c
@@ -146,7 +146,7 @@ static int64_t cpu_get_icount_locked(void)
 }
 icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
 }
-return timers_state.qemu_icount_bias + (icount << icount_time_shift);
+return timers_state.qemu_icount_bias + cpu_icount_to_ns(icount);
 }
 
 int64_t cpu_get_icount(void)
@@ -162,6 +162,11 @@ int64_t cpu_get_icount(void)
 return icount;
 }
 
+int64_t cpu_icount_to_ns(int64_t icount)
+{
+return icount << icount_time_shift;
+}
+
 /* return the host CPU cycle counter and handle stop/restart */
 /* Caller must hold the BQL */
 int64_t cpu_get_ticks(void)
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 7f9a074..e12c714 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -745,6 +745,7 @@ static inline int64_t get_clock(void)
 /* icount */
 int64_t cpu_get_icount(void);
 int64_t cpu_get_clock(void);
+int64_t cpu_icount_to_ns(int64_t icount);
 
 /***/
 /* host CPU ticks (if available) */
-- 
1.9.3

[Qemu-devel] [PULL 06/11] icount: Add QemuOpts for icount

From: Sebastian Tanase 

Make icount parameter use QemuOpts style options in order
to easily add other suboptions.

Signed-off-by: Sebastian Tanase 
Tested-by: Camille Bégué 
Signed-off-by: Paolo Bonzini 
---
 cpus.c| 10 +-
 include/qemu-common.h |  3 ++-
 qemu-options.hx   |  4 ++--
 qtest.c   | 13 +++--
 vl.c  | 35 ---
 5 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index bbb8d4e..8291044 100644
--- a/cpus.c
+++ b/cpus.c
@@ -473,13 +473,21 @@ static const VMStateDescription vmstate_timers = {
 }
 };
 
-void configure_icount(const char *option)
+void configure_icount(QemuOpts *opts, Error **errp)
 {
+const char *option;
+
 seqlock_init(&timers_state.vm_clock_seqlock, NULL);
 vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+option = qemu_opt_get(opts, "shift");
 if (!option) {
 return;
 }
+/* When using -icount shift, the shift option will be
+   misinterpreted as a boolean */
+if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) {
+error_setg(errp, "The shift option must be a number or auto");
+}
 
 icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
   icount_warp_rt, NULL);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 6ef8282..04b0769 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include "glib-compat.h"
+#include "qemu/option.h"
 
 #ifdef _WIN32
 #include "sysemu/os-win32.h"
@@ -105,7 +106,7 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 #endif
 
 /* icount */
-void configure_icount(const char *option);
+void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 
 #include "qemu/osdep.h"
diff --git a/qemu-options.hx b/qemu-options.hx
index 1549625..5a1b001 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,11 +3011,11 @@ re-inject them.
 ETEXI
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-"-icount [N|auto]\n" \
+"-icount [shift=N|auto]\n" \
 "enable virtual instruction counter with 2^N clock ticks 
per\n" \
 "instruction\n", QEMU_ARCH_ALL)
 STEXI
-@item -icount [@var{N}|auto]
+@item -icount [shift=@var{N}|auto]
 @findex -icount
 Enable virtual instruction counter.  The virtual cpu will execute one
 instruction every 2^@var{N} ns of virtual time.  If @code{auto} is specified
diff --git a/qtest.c b/qtest.c
index 04a6dc1..ef0d991 100644
--- a/qtest.c
+++ b/qtest.c
@@ -19,6 +19,9 @@
 #include "hw/irq.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/cpus.h"
+#include "qemu/config-file.h"
+#include "qemu/option.h"
+#include "qemu/error-report.h"
 
 #define MAX_IRQ 256
 
@@ -509,10 +512,16 @@ static void qtest_event(void *opaque, int event)
 }
 }
 
-int qtest_init_accel(MachineClass *mc)
+static void configure_qtest_icount(const char *options)
 {
-configure_icount("0");
+QemuOpts *opts  = qemu_opts_parse(qemu_find_opts("icount"), options, 1);
+configure_icount(opts, &error_abort);
+qemu_opts_del(opts);
+}
 
+int qtest_init_accel(MachineClass *mc)
+{
+configure_qtest_icount("0");
 return 0;
 }
 
diff --git a/vl.c b/vl.c
index fe451aa..f2621a5 100644
--- a/vl.c
+++ b/vl.c
@@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = {
 },
 };
 
+static QemuOptsList qemu_icount_opts = {
+.name = "icount",
+.implied_opt_name = "shift",
+.merge_lists = true,
+.head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head),
+.desc = {
+{
+.name = "shift",
+.type = QEMU_OPT_STRING,
+},
+{ /* end of list */ }
+},
+};
+
 /**
  * Get machine options
  *
@@ -2908,13 +2922,12 @@ int main(int argc, char **argv, char **envp)
 {
 int i;
 int snapshot, linux_boot;
-const char *icount_option = NULL;
 const char *initrd_filename;
 const char *kernel_filename, *kernel_cmdline;
 const char *boot_order;
 DisplayState *ds;
 int cyls, heads, secs, translation;
-QemuOpts *hda_opts = NULL, *opts, *machine_opts;
+QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
 QemuOptsList *olist;
 int optind;
 const char *optarg;
@@ -2979,6 +2992,7 @@ int main(int argc, char **argv, char **envp)
 qemu_add_opts(&qemu_msg_opts);
 qemu_add_opts(&qemu_name_opts);
 qemu_add_opts(&qemu_numa_opts);
+qemu_add_opts(&qemu_icount_opts);
 
 runstate_init();
 
@@ -3830,7 +3844,11 @@ int main(int argc, char **argv, char **envp)
 }
 break;
 case QEMU_OPTION_icount:
-icount_option = optarg;
+icount_opts = qemu_opts_parse(qemu_find_opts("icount"),
+  optarg, 1);
+if (!icount_opts) {
+exit(1);
+

[Qemu-devel] [PULL 09/11] cpu-exec: Print to console if the guest is late

From: Sebastian Tanase 

If the align option is enabled, we print to the user whenever
the guest clock is behind the host clock in order for he/she
to have a hint about the actual performance. The maximum
print interval is 2s and we limit the number of messages to 100.
If desired, this can be changed in cpu-exec.c

Signed-off-by: Sebastian Tanase 
Tested-by: Camille Bégué 
Signed-off-by: Paolo Bonzini 
---
 cpu-exec.c | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 68f82b6..3c14502 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
 typedef struct SyncClocks {
 int64_t diff_clk;
 int64_t last_cpu_icount;
+int64_t realtime_clock;
 } SyncClocks;
 
 #if !defined(CONFIG_USER_ONLY)
@@ -37,6 +38,9 @@ typedef struct SyncClocks {
  * oscillate around 0.
  */
 #define VM_CLOCK_ADVANCE 300
+#define THRESHOLD_REDUCE 1.5
+#define MAX_DELAY_PRINT_RATE 20LL
+#define MAX_NB_PRINTS 100
 
 static void align_clocks(SyncClocks *sc, const CPUState *cpu)
 {
@@ -68,16 +72,43 @@ static void align_clocks(SyncClocks *sc, const CPUState 
*cpu)
 }
 }
 
+static void print_delay(const SyncClocks *sc)
+{
+static float threshold_delay;
+static int64_t last_realtime_clock;
+static int nb_prints;
+
+if (icount_align_option &&
+sc->realtime_clock - last_realtime_clock >= MAX_DELAY_PRINT_RATE &&
+nb_prints < MAX_NB_PRINTS) {
+if ((-sc->diff_clk / (float)10LL > threshold_delay) ||
+(-sc->diff_clk / (float)10LL <
+ (threshold_delay - THRESHOLD_REDUCE))) {
+threshold_delay = (-sc->diff_clk / 10LL) + 1;
+printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
+   threshold_delay - 1,
+   threshold_delay);
+nb_prints++;
+last_realtime_clock = sc->realtime_clock;
+}
+}
+}
+
 static void init_delay_params(SyncClocks *sc,
   const CPUState *cpu)
 {
 if (!icount_align_option) {
 return;
 }
+sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
-   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+   sc->realtime_clock +
cpu_get_clock_offset();
 sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
+
+/* Print every 2s max if the guest is late. We limit the number
+   of printed messages to NB_PRINT_MAX(currently 100) */
+print_delay(sc);
 }
 #else
 static void align_clocks(SyncClocks *sc, const CPUState *cpu)
-- 
1.9.3

[Qemu-devel] [PULL 01/11] backends: Introduce chr-testdev

From: Paolo Bonzini 

chr-testdev enables a virtio serial channel to be used for guest
initiated qemu exits. hw/misc/debugexit already enables guest
initiated qemu exits, but only for PC targets. chr-testdev supports
any virtio-capable target. kvm-unit-tests/arm is already making use
of this backend.

Currently there is a single command implemented, "q".  It takes a
(prefix) argument for the exit code, thus an exit is implemented by
writing, e.g. "1q", to the virtio-serial port.

It can be used as:
   $QEMU ... \
 -device virtio-serial-device \
 -device virtserialport,chardev=ctd -chardev testdev,id=ctd

or, use:
   $QEMU ... \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd

to bind it to virtio-serial port0.

Signed-off-by: Paolo Bonzini 
Signed-off-by: Andrew Jones 
Signed-off-by: Paolo Bonzini 
---
 backends/Makefile.objs |   2 +-
 backends/testdev.c | 131 +
 include/sysemu/char.h  |   3 ++
 qapi-schema.json   |   3 +-
 qemu-char.c|   4 ++
 stubs/Makefile.objs|   1 +
 stubs/chr-testdev.c|   7 +++
 7 files changed, 149 insertions(+), 2 deletions(-)
 create mode 100644 backends/testdev.c
 create mode 100644 stubs/chr-testdev.c

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 506a46c..31a3a89 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -1,7 +1,7 @@
 common-obj-y += rng.o rng-egd.o
 common-obj-$(CONFIG_POSIX) += rng-random.o
 
-common-obj-y += msmouse.o
+common-obj-y += msmouse.o testdev.o
 common-obj-$(CONFIG_BRLAPI) += baum.o
 baum.o-cflags := $(SDL_CFLAGS)
 
diff --git a/backends/testdev.c b/backends/testdev.c
new file mode 100644
index 000..70d63b3
--- /dev/null
+++ b/backends/testdev.c
@@ -0,0 +1,131 @@
+/*
+ * QEMU Char Device for testsuite control
+ *
+ * Copyright (c) 2014 Red Hat, Inc.
+ *
+ * Author: Paolo Bonzini 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu-common.h"
+#include "sysemu/char.h"
+
+#define BUF_SIZE 32
+
+typedef struct {
+CharDriverState *chr;
+uint8_t in_buf[32];
+int in_buf_used;
+} TestdevCharState;
+
+/* Try to interpret a whole incoming packet */
+static int testdev_eat_packet(TestdevCharState *testdev)
+{
+const uint8_t *cur = testdev->in_buf;
+int len = testdev->in_buf_used;
+uint8_t c;
+int arg;
+
+#define EAT(c) do { \
+if (!len--) {   \
+return 0;   \
+}   \
+c = *cur++; \
+} while (0)
+
+EAT(c);
+
+while (isspace(c)) {
+EAT(c);
+}
+
+arg = 0;
+while (isdigit(c)) {
+arg = arg * 10 + c - '0';
+EAT(c);
+}
+
+while (isspace(c)) {
+EAT(c);
+}
+
+switch (c) {
+case 'q':
+exit((arg << 1) | 1);
+break;
+default:
+break;
+}
+return cur - testdev->in_buf;
+}
+
+/* The other end is writing some data.  Store it and try to interpret */
+static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
+{
+TestdevCharState *testdev = chr->opaque;
+int tocopy, eaten, orig_len = len;
+
+while (len) {
+/* Complete our buffer as much as possible */
+tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
+
+memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
+testdev->in_buf_used += tocopy;
+buf += tocopy;
+len -= tocopy;
+
+/* Interpret it as much as possible */
+while (testdev->in_buf_used > 0 &&
+   (eaten = testdev_eat_packet(testdev)) > 0) {
+memmove(testdev->in_buf, testdev->in_buf + eaten,
+testdev->in_buf_used - eaten);
+testdev->in_buf_used -= eaten;
+}
+}
+return orig_len;
+}
+
+static void testdev_close(struct CharDriverState *chr)
+{
+TestdevCharState *testdev = chr->opaque;
+
+g_free(testdev);
+}
+
+CharDriverSt

[Qemu-devel] [PULL 07/11] icount: Add align option to icount

From: Sebastian Tanase 

The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.

Signed-off-by: Sebastian Tanase 
Tested-by: Camille Bégué 
Signed-off-by: Paolo Bonzini 
---
 cpus.c| 19 ---
 include/qemu-common.h |  1 +
 qemu-options.hx   | 15 +--
 vl.c  |  4 
 4 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cpus.c b/cpus.c
index 8291044..7e09538 100644
--- a/cpus.c
+++ b/cpus.c
@@ -476,25 +476,30 @@ static const VMStateDescription vmstate_timers = {
 void configure_icount(QemuOpts *opts, Error **errp)
 {
 const char *option;
+char *rem_str = NULL;
 
 seqlock_init(&timers_state.vm_clock_seqlock, NULL);
 vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
 option = qemu_opt_get(opts, "shift");
 if (!option) {
+if (qemu_opt_get(opts, "align") != NULL) {
+error_setg(errp, "Please specify shift option when using align");
+}
 return;
 }
-/* When using -icount shift, the shift option will be
-   misinterpreted as a boolean */
-if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) {
-error_setg(errp, "The shift option must be a number or auto");
-}
-
+icount_align_option = qemu_opt_get_bool(opts, "align", false);
 icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
   icount_warp_rt, NULL);
 if (strcmp(option, "auto") != 0) {
-icount_time_shift = strtol(option, NULL, 0);
+errno = 0;
+icount_time_shift = strtol(option, &rem_str, 0);
+if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
+error_setg(errp, "icount: Invalid shift value");
+}
 use_icount = 1;
 return;
+} else if (icount_align_option) {
+error_setg(errp, "shift=auto and align=on are incompatible");
 }
 
 use_icount = 2;
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 04b0769..5d10ac2 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -108,6 +108,7 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 /* icount */
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
+extern int icount_align_option;
 
 #include "qemu/osdep.h"
 #include "qemu/bswap.h"
diff --git a/qemu-options.hx b/qemu-options.hx
index 5a1b001..96516c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,9 +3011,9 @@ re-inject them.
 ETEXI
 
 DEF("icount", HAS_ARG, QEMU_OPTION_icount, \
-"-icount [shift=N|auto]\n" \
+"-icount [shift=N|auto][,align=on|off]\n" \
 "enable virtual instruction counter with 2^N clock ticks 
per\n" \
-"instruction\n", QEMU_ARCH_ALL)
+"instruction and enable aligning the host and virtual 
clocks\n", QEMU_ARCH_ALL)
 STEXI
 @item -icount [shift=@var{N}|auto]
 @findex -icount
@@ -3026,6 +3026,17 @@ Note that while this option can give deterministic 
behavior, it does not
 provide cycle accurate emulation.  Modern CPUs contain superscalar out of
 order cores with complex cache hierarchies.  The number of instructions
 executed often has little or no correlation with actual performance.
+
+@option{align=on} will activate the delay algorithm which will try to
+to synchronise the host clock and the virtual clock. The goal is to
+have a guest running at the real frequency imposed by the shift option.
+Whenever the guest clock is behind the host clock and if
+@option{align=on} is specified then we print a messsage to the user
+to inform about the delay.
+Currently this option does not work when @option{shift} is @code{auto}.
+Note: The sync algorithm will work for those shift values for which
+the guest clock runs ahead of the host clock. Typically this happens
+when the shift value is high (how high depends on the host machine).
 ETEXI
 
 DEF("watchdog", HAS_ARG, QEMU_OPTION_watchdog, \
diff --git a/vl.c b/vl.c
index f2621a5..a8029d5 100644
--- a/vl.c
+++ b/vl.c
@@ -183,6 +183,7 @@ uint8_t *boot_splash_filedata;
 size_t boot_splash_filedata_size;
 uint8_t qemu_extra_params_fw[2];
 
+int icount_align_option;
 typedef struct FWBootEntry FWBootEntry;
 
 struct FWBootEntry {
@@ -546,6 +547,9 @@ static QemuOptsList qemu_icount_opts = {
 {
 .name = "shift",
 .type = QEMU_OPT_STRING,
+}, {
+.name = "align",
+.type = QEMU_OPT_BOOL,
 },
 { /* end of list */ }
 },
-- 
1.9.3

[Qemu-devel] [PULL 11/11] target-mips: Ignore unassigned accesses with KVM

From: James Hogan 

MIPS registers an unassigned access handler which raises a guest bus
error exception. However this causes QEMU to crash when KVM is enabled
as it isn't called from the main execution loop so longjmp() gets called
without a corresponding setjmp().

Until the KVM API can be updated to trigger a guest exception in
response to an MMIO exit, prevent the bus error exception being raised
from mips_cpu_unassigned_access() if KVM is enabled.

The check is at run time since the do_unassigned_access callback is
initialised before it is known whether KVM will be enabled.

The problem can be triggered with Malta emulation by making the guest
write to the reset region at physical address 0x1bf0, since it is
marked read-only which is treated as unassigned for writes.

Signed-off-by: James Hogan 
Reviewed-by: Aurelien Jarno 
Cc: Peter Maydell 
Cc: Paolo Bonzini 
Cc: Gleb Natapov 
Cc: Christoffer Dall 
Cc: Sanjay Lal 
Signed-off-by: Paolo Bonzini 
---
 target-mips/op_helper.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 27651a4..df97b35 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -21,6 +21,7 @@
 #include "qemu/host-utils.h"
 #include "exec/helper-proto.h"
 #include "exec/cpu_ldst.h"
+#include "sysemu/kvm.h"
 
 #ifndef CONFIG_USER_ONLY
 static inline void cpu_mips_tlb_flush (CPUMIPSState *env, int flush_global);
@@ -2168,6 +2169,16 @@ void mips_cpu_unassigned_access(CPUState *cs, hwaddr 
addr,
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = &cpu->env;
 
+/*
+ * Raising an exception with KVM enabled will crash because it won't be 
from
+ * the main execution loop so the longjmp won't have a matching setjmp.
+ * Until we can trigger a bus error exception through KVM lets just ignore
+ * the access.
+ */
+if (kvm_enabled()) {
+return;
+}
+
 if (is_exec) {
 helper_raise_exception(env, EXCP_IBE);
 } else {
-- 
1.9.3

[Qemu-devel] [PULL 10/11] monitor: Add drift info to 'info jit'

From: Sebastian Tanase 

Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.

Signed-off-by: Sebastian Tanase 
Tested-by: Camille Bégué 
Signed-off-by: Paolo Bonzini 
---
 cpu-exec.c|  6 ++
 cpus.c| 15 +++
 include/qemu-common.h |  4 
 monitor.c |  1 +
 4 files changed, 26 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 3c14502..cbc8067 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -105,6 +105,12 @@ static void init_delay_params(SyncClocks *sc,
sc->realtime_clock +
cpu_get_clock_offset();
 sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
+if (sc->diff_clk < max_delay) {
+max_delay = sc->diff_clk;
+}
+if (sc->diff_clk > max_advance) {
+max_advance = sc->diff_clk;
+}
 
 /* Print every 2s max if the guest is late. We limit the number
of printed messages to NB_PRINT_MAX(currently 100) */
diff --git a/cpus.c b/cpus.c
index 19245e9..a5f78c1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -64,6 +64,8 @@
 #endif /* CONFIG_LINUX */
 
 static CPUState *next_cpu;
+int64_t max_delay;
+int64_t max_advance;
 
 bool cpu_is_stopped(CPUState *cpu)
 {
@@ -1552,3 +1554,16 @@ void qmp_inject_nmi(Error **errp)
 error_set(errp, QERR_UNSUPPORTED);
 #endif
 }
+
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
+{
+cpu_fprintf(f, "Host - Guest clock  %ld(ms)\n",
+(cpu_get_clock() - cpu_get_icount())/SCALE_MS);
+if (icount_align_option) {
+cpu_fprintf(f, "Max guest delay %ld(ms)\n", -max_delay/SCALE_MS);
+cpu_fprintf(f, "Max guest advance   %ld(ms)\n", max_advance/SCALE_MS);
+} else {
+cpu_fprintf(f, "Max guest delay NA\n");
+cpu_fprintf(f, "Max guest advance   NA\n");
+}
+}
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 5d10ac2..bcf7a6a 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -109,6 +109,10 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 extern int icount_align_option;
+/* drift information for info jit command */
+extern int64_t max_delay;
+extern int64_t max_advance;
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf);
 
 #include "qemu/osdep.h"
 #include "qemu/bswap.h"
diff --git a/monitor.c b/monitor.c
index 5bc70a6..cdbaa60 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1047,6 +1047,7 @@ static void do_info_registers(Monitor *mon, const QDict 
*qdict)
 static void do_info_jit(Monitor *mon, const QDict *qdict)
 {
 dump_exec_info((FILE *)mon, monitor_fprintf);
+dump_drift_info((FILE *)mon, monitor_fprintf);
 }
 
 static void do_info_history(Monitor *mon, const QDict *qdict)
-- 
1.9.3

[Qemu-devel] [Bug 1285708] Re: FreeBSD Guest crash on boot due to xsave instruction issue