[Qemu-devel] [Bug 1713408] [NEW] qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread R.Nageswara Sastry
Public bug reported:

# ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
/avocado-
vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
-monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio -net
user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

(process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
18446744073709550568 bytes

>From GDB:
[New Thread 0x3fffb5aceb60 (LWP 12190)]

(process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
18446744073709550568 bytes

Program received signal SIGTRAP, Trace/breakpoint trap.
0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install glib2-2.50.3-3.el7.ppc64le 
glibc-2.17-196.el7.ppc64le gnutls-3.3.26-9.el7.ppc64le 
krb5-libs-1.15.1-8.el7.ppc64le libgcc-4.8.5-16.el7.ppc64le 
libstdc++-4.8.5-16.el7.ppc64le ncurses-libs-5.9-13.20130511.el7.ppc64le 
nss-3.28.4-8.el7.ppc64le nss-softokn-freebl-3.28.3-6.el7.ppc64le 
nss-util-3.28.4-3.el7.ppc64le openldap-2.4.44-5.el7.ppc64le 
openssl-libs-1.0.2k-8.el7.ppc64le p11-kit-0.23.5-3.el7.ppc64le
(gdb) bt
#0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
#1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
#2  0x3fffb796d4c4 in g_log_default_handler () from /lib64/libglib-2.0.so.0
#3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
#4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
#5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
#6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
#7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
#8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
#9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
#10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
(gdb) i r
r0 0xfa 250
r1 0x3fffe450   70368744170576
r2 0x3fffb7608100   70367525765376
r3 0x0  0
r4 0x2f98   12184
r5 0x5  5
r6 0x0  0
r7 0x3fffa820   70367267782688
r8 0x2f98   12184
r9 0x0  0
r100x0  0
r110x0  0
r120x0  0
r130x3fffb64fccb0   70367507893424
r140x0  0
r150x0  0
r160x0  0
r170x0  0
r180x1  1
r190x0  0
r200x3fffb796d3f0   70367529325552
r210x0  0
r220x2000   536870912
r230x1  1
r240x3fffb7a61498   70367530325144
r250x3fffb7a614e8   70367530325224
r260x3fffb7a61488   70367530325128
r270x3fffa80008c0   70367267784896
r280x3fffb79cd2a8   70367529718440
r290x3fffb79cd2a8   70367529718440
r300x   18446744073709551615
r310x1  1
pc 0x3fffb75e5408   0x3fffb75e5408 
msr0x9000d033   10376293541461676083
cr 0x42244842   1109674050
lr 0x3fffb796be9c   0x3fffb796be9c <_g_log_abort+60>
ctr0x0  0
xer0x0  0
orig_r30x2f98   12184
trap   0xc003072

Similar error observed on x86_64 and PPC64LE architectures.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17

[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread R.Nageswara Sastry
3308 static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState 
*machine)
3309 {
3310 int i;
3311 int spapr_max_cores = max_cpus / smp_threads;  << max_cpus is -ve 
and spapr_max_cores will also be -ve

...

3321
3322 machine->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
3323  sizeof(CPUArchId) * spapr_max_cores);

g_malloc0(is getting a -ve value) and then fails with a trap.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread R.Nageswara Sastry
The above I am referring from hw/ppc/spapr.c

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



Re: [Qemu-devel] S390 bios breaks in qemu 2.10.rc3

2017-08-28 Thread Christian Borntraeger


On 08/25/2017 10:29 AM, Cornelia Huck wrote:
> On Fri, 25 Aug 2017 10:21:58 +0200
> Christian Borntraeger  wrote:
> 
>> On 08/25/2017 09:20 AM, Cornelia Huck wrote:
> 
>>> OK, to recap:
>>>
>>> - the current pre-built bios seems fine
>>> - rebuilding the bios may yield a version that fails on some systems
>>>   (different compiler?)
>>> - adding aligned(8) looks like the right thing to do
>>> - it seems to fix the problem, but on at least one system something
>>>   still seems off (under investigation)  
>>
>> Yes. I am out of office today, so for any aligned(8) patch
>> Acked-by: Christian Borntraeger 
>> even for 2.10.
> 
> I fear the 2.10 train has already left the station, but any aligned(8)
> patch should be cc:stable.

I think this could be a topic for QEMU summit. Our process of not allowing
fixes in rcx without requiring an rc(x+1) seems a bit odd. The Linux kernel
style (there are fixes between the last rc and release) seems more balanced
as long as we establish some safety nets.




Re: [Qemu-devel] [PATCH v2] [WIP] [RFC ]Add initial 9pfs support for Windows hosts v2

2017-08-28 Thread Michael Fritscher
Good day,

only a short announcement: Sorry for the very long delay :-( But I'm
working on this again. The biggest issue seems to be the *at stuff. I'll
try to workaround this via getting the directories' path from the file
descriptor with the /proc (as it is already done in the 9pfs_utils) -
luckily,the mingw environment emulates the /proc. If this doesn't work
I've another idea (the file descriptors needs to be "registered" with the
path (and saved in a sparse vector or map with the fd as key and the path
as value). The "big" solution would be to write a 9p_local_windows.c from
scratch, but I would like to avoid it.

Additionally, I changed my approach: Instead of one big patch with
everything in it, I split it into 2 patch series with several patches:

  * The first one only fixes the build via adding #ifdefs, creating stubs
in os-win32.h etc. and introduces no regressions. Status: Ready in my
local repo: It compiles fine with enabled 9pfs and works. If 9pfs is
tried to use there is a clean error message that the fsdev couldn't be
initialized - as expected.
  * The second one actually get 9pfs working. This involves mostly
implementing the *at functions and a few tiny things (O_BINARY and so
on) Status: Not started yet (will be at the next weekend I hope), but is
a combination of parts of the the old patch + the implementation of the
*at functions.

My hope is that the first series could be merged independently from the
second. Meanings? If so I'll clean up the first patch series a bit
(History cleaning) and send it to the list.

Best regards,
Michael Fritscher




Re: [Qemu-devel] [PATCHv4 01/03] qemu-iothread: IOThread supports theGMainContext event loop

2017-08-28 Thread wang.yong155
>Hi Wang Yong,>>To make the discussion easier, please try to fix your email 
>client to:>>1) set In-Reply-To: header when replying>2) use plain text instead 
>of html>3) use monospace fonts to view and compose a reply>4) avoid attaching 
>the original email in the end, just reply inline>5) maybe, use "Re:" in the 
>subject for reply, avoid "答复:">6) include not only email addresses in 
>From:To:/Cc: headers, but also>   the names of recipients, in the form of>>
>   Some Body , Another One ,>  
> ...>Or maybe just switch to a functional email client.

Hi Fam,I am very sorry that our company's mail client does not support the part.






>> I think it's ok, I don't know whether I understand it correctly or 
>> not?>>Your sequence is ok. But remember this is multi-threaded and the 
>> execution order>between two threads are non-deterministic. The sequence I 
>> pointed out is also>"possible" and will cause use-after-free due to TOCTOU 
>> race condition [1].>>[1]: 
>> https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use

Thanks,When iothread_stop is called two times concurrently,this BUG may be 
triggered.

I will submit the patch v5.




WangYong









原始邮件



发件人: 
收件人:王勇10170530
抄送人:
 王广10165992 
 
日 期 :2017年08月23日 16:43
主 题 :Re: 答复: Re: [PATCHv4 01/03] qemu-iothread: IOThread supports 
theGMainContext event loop





Hi Wang Yong,

To make the discussion easier, please try to fix your email client to:

1) set In-Reply-To: header when replying
2) use plain text instead of html
3) use monospace fonts to view and compose a reply
4) avoid attaching the original email in the end, just reply inline
5) maybe, use "Re:" in the subject for reply, avoid "答复:"
6) include not only email addresses in From:To:/Cc: headers, but also
   the names of recipients, in the form of

   Some Body , Another One ,
   ...

Or maybe just switch to a functional email client.

On Wed, 08/23 15:58, wang.yong...@zte.com.cn wrote:
> >> diff --git a/iothread.c b/iothread.c>> index beeb870..fb1c55b 100644>> --- 
> >> a/iothread.c>> +++ b/iothread.c>> @@ -57,6 +57,20 @@ static void 
> >> *iothread_run(void *opaque)>>  >>  while 
> >> (!atomic_read(&iothread->stopping)) {>>  aio_poll(iothread->ctx, 
> >> true)>> +>> +if (atomic_read(&iothread->worker_context)) {>> + 
> >>g_main_context_push_thread_default(iothread->worker_context)>> +
> >> iothread->main_loop =>> +
> >> g_main_loop_new(iothread->worker_context, TRUE)>> +
> >> g_main_loop_run(iothread->main_loop)>> +>> +
> >> g_main_loop_unref(iothread->main_loop)>> +iothread->main_loop 
> >> = NULL>
> 
> >You should clear iothread->main_loop first before calling 
> >g_main_loop_unref(),>to avoid TOCTOU race with iothread_stop():>
> 
> >  iothread_run (in IOThread)  iothread_stop (in main thread)> 
> > >   
> >  if (atomic_read(&iothread->main_loop)) {>  
> > /* frees iothread->main_loop */>  g_main_loop_unref(...)>   
> >   /* Accesses freed memory */>  
> > g_main_loop_quit(iothread->main_loop)>  
> > }>  iothread->main_loop = NULL
> 
> When the g_main_loop_quit function is called, the g_main_loop_run function 
> can exit?
> 
> 
> 
> 
> iothread_run (in IOThread) 
> iothread_stop (in main thread)
> 
> 
> 
>   
>  /*step1: set  loop->is_running = FALSE*/
> 
>   
> g_main_loop_quit(iothread->main_loop)
> 
> /*step2: main loop exit */
>   
>   
> 
> g_main_loop_run()
> 
> /*step3:frees iothread->main_loop memory*/
> 
> g_main_loop_unref(...)
> 
> iothread->main_loop = NULL
> 
> 
> 
> 
> I think it's ok, I don't know whether I understand it correctly or not?

Your sequence is ok. But remember this is multi-threaded and the execution order
between two threads are non-deterministic. The sequence I pointed out is also
"possible" and will cause use-after-free due to TOCTOU race condition [1].

[1]: https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use

Fam

[Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data

2017-08-28 Thread Yi Min Zhao
PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route().
So we don't need to store zpci idx in msix message data to find out the
specific zpci device. Instead, we could use pci device id to find its
corresponding zpci device.

Signed-off-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-bus.c  | 16 +---
 hw/s390x/s390-pci-bus.h  |  2 ++
 hw/s390x/s390-pci-inst.c | 24 
 target/s390x/kvm.c   |  7 +--
 4 files changed, 12 insertions(+), 37 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 61cfd2138f..9e1f7ff5c5 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -209,8 +209,8 @@ static S390PCIBusDevice 
*s390_pci_find_dev_by_uid(S390pciState *s, uint16_t uid)
 return NULL;
 }
 
-static S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
- const char *target)
+S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
+  const char *target)
 {
 S390PCIBusDevice *pbdev;
 
@@ -475,19 +475,13 @@ static void s390_msi_ctrl_write(void *opaque, hwaddr 
addr, uint64_t data,
 unsigned int size)
 {
 S390PCIBusDevice *pbdev = opaque;
-uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
 uint32_t vec = data & ZPCI_MSI_VEC_MASK;
 uint64_t ind_bit;
 uint32_t sum_bit;
-uint32_t e = 0;
 
-DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data, idx, vec);
-
-if (!pbdev) {
-e |= (vec << ERR_EVENT_MVN_OFFSET);
-s390_pci_generate_error_event(ERR_EVENT_NOMSI, idx, 0, addr, e);
-return;
-}
+assert(pbdev);
+DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data,
+pbdev->idx, vec);
 
 if (pbdev->state != ZPCI_FS_ENABLED) {
 return;
diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
index 67af2c12ff..820c7fa52b 100644
--- a/hw/s390x/s390-pci-bus.h
+++ b/hw/s390x/s390-pci-bus.h
@@ -330,6 +330,8 @@ void s390_pci_generate_error_event(uint16_t pec, uint32_t 
fh, uint32_t fid,
 S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx);
 S390PCIBusDevice *s390_pci_find_dev_by_fh(S390pciState *s, uint32_t fh);
 S390PCIBusDevice *s390_pci_find_dev_by_fid(S390pciState *s, uint32_t fid);
+S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
+  const char *target);
 S390PCIBusDevice *s390_pci_find_next_avail_dev(S390pciState *s,
S390PCIBusDevice *pbdev);
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index eba9ffb5f2..8e088f3dc9 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -413,29 +413,6 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 return 0;
 }
 
-static void update_msix_table_msg_data(S390PCIBusDevice *pbdev, uint64_t 
offset,
-   uint64_t *data, uint8_t len)
-{
-uint32_t val;
-uint8_t *msg_data;
-
-if (offset % PCI_MSIX_ENTRY_SIZE != 8) {
-return;
-}
-
-if (len != 4) {
-DPRINTF("access msix table msg data but len is %d\n", len);
-return;
-}
-
-msg_data = (uint8_t *)data - offset % PCI_MSIX_ENTRY_SIZE +
-   PCI_MSIX_ENTRY_VECTOR_CTRL;
-val = pci_get_long(msg_data) |
-((pbdev->fh & FH_MASK_INDEX) << ZPCI_MSI_VEC_BITS);
-pci_set_long(msg_data, val);
-DPRINTF("update msix msg_data to 0x%" PRIx64 "\n", *data);
-}
-
 static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
 {
 if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
@@ -508,7 +485,6 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 if (trap_msix(pbdev, offset, pcias)) {
 offset = offset - pbdev->msix.table_offset;
 mr = &pbdev->pdev->msix_table_mmio;
-update_msix_table_msg_data(pbdev, offset, &data, len);
 } else {
 mr = pbdev->pdev->io_regions[pcias].memory;
 }
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 1c68c36663..e348bfb7cc 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct 
kvm_irq_routing_entry *route,
  uint64_t address, uint32_t data, PCIDevice *dev)
 {
 S390PCIBusDevice *pbdev;
-uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
 uint32_t vec = data & ZPCI_MSI_VEC_MASK;
 
-pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx);
+if (!dev) {
+return -ENODEV;
+}
+
+pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id);
 if (!pbdev) {
 DPRINTF("add_msi_route no dev\n");
 return -ENODEV;
-- 
2.11.0 (Apple Git-81)




[Qemu-devel] [PATCH 0/4] four zpci patches

2017-08-28 Thread Yi Min Zhao
This patch set contains four small zpci patches to fixup different issues.
1) fixup calculation of msix boundary
2) remove zpci idx from msix message, instead we could use PCIDevice's id to
   find zpci device in kvm_arch_fixup_msi_route()
3) fixup ind_offset calculation for adapter interrupt routing entry
4) introduce our own iommu_replay callback

Yi Min Zhao (4):
  s390x/pci: fixup trap_msix()
  s390x/pci: remove idx from msix msg data
  s390x/pci: fixup ind_offset of msix routing entry
  s390x/pci: add iommu replay callback

 hw/s390x/s390-pci-bus.c  | 24 +---
 hw/s390x/s390-pci-bus.h  |  2 ++
 hw/s390x/s390-pci-inst.c | 28 ++--
 target/s390x/kvm.c   | 11 ++-
 4 files changed, 23 insertions(+), 42 deletions(-)

-- 
2.11.0 (Apple Git-81)




[Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry

2017-08-28 Thread Yi Min Zhao
The aibvo of zpci device should be constant after issued mpcifc
registering irqs instruction. Each msix vector should offset from the
aibvo. But for flic adapter interrupt, we should use the absolute
offset within the aibv. So let's use the aibvo+vector to fixup msix
routing entry.

Signed-off-by: Yi Min Zhao 
---
 target/s390x/kvm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index e348bfb7cc..c08b7757e7 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct 
kvm_irq_routing_entry *route,
 return -ENODEV;
 }
 
-pbdev->routes.adapter.ind_offset = vec;
-
 route->type = KVM_IRQ_ROUTING_S390_ADAPTER;
 route->flags = 0;
 route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr;
 route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr;
 route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset;
-route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset;
+route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec;
 route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id;
 return 0;
 }
-- 
2.11.0 (Apple Git-81)




[Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()

2017-08-28 Thread Yi Min Zhao
The function trap_msix() is to check if pcistg instruction would access
msix table entries. The correct boundary condition should be
[table_offset, table_offset+entries*entry_size). But the current
condition calculated misses the last entry. So let's fixup it.

Acked-by: Dong Jia Shi 
Reviewed-by: Pierre Morel 
Signed-off-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index b7beb8c36a..eba9ffb5f2 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t 
offset, uint8_t pcias)
 {
 if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
 offset >= pbdev->msix.table_offset &&
-offset <= pbdev->msix.table_offset +
-  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
+offset < (pbdev->msix.table_offset +
+  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
 return 1;
 } else {
 return 0;
-- 
2.11.0 (Apple Git-81)




[Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback

2017-08-28 Thread Yi Min Zhao
Let's introduce iommu replay callback for s390 pci iommu memory region.
Currently we don't need any dma mapping replay. So let it return
directly. This implementation will avoid meaningless loops calling
translation callback.

Reviewed-by: Pierre Morel 
Reviewed-by: Halil Pasic 
Signed-off-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-bus.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 9e1f7ff5c5..359509ccea 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -407,6 +407,13 @@ static IOMMUTLBEntry 
s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
 return ret;
 }
 
+static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
+  IOMMUNotifier *notifier)
+{
+/* we don't need iommu replay currently */
+return;
+}
+
 static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
 int devfn)
 {
@@ -1055,6 +1062,7 @@ static void 
s390_iommu_memory_region_class_init(ObjectClass *klass, void *data)
 IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
 imrc->translate = s390_translate_iommu;
+imrc->replay = s390_pci_iommu_replay;
 }
 
 static const TypeInfo s390_iommu_memory_region_info = {
-- 
2.11.0 (Apple Git-81)




Re: [Qemu-devel] [RFC v2 4/8] QAPI: new QMP command option "without-bql"

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 10:14:12AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Thu, Aug 24, 2017 at 07:37:32AM +0800, Fam Zheng wrote:
> > > On Wed, 08/23 18:44, Dr. David Alan Gilbert wrote:
> > > > * Peter Xu (pet...@redhat.com) wrote:
> > > > > Introducing this new parameter for QMP commands in general to mark out
> > > > > when the command does not need BQL.  Normally QMP command executions 
> > > > > are
> > > > > done with the protection of BQL in QEMU.  However the truth is that 
> > > > > not
> > > > > all the QMP commands require the BQL.
> > > > > 
> > > > > This new parameter provides a way to allow QMP commands to run in
> > > > > parallel when possible, without the contention on the BQL.
> > > > > 
> > > > > Since the default value of "without-bql" is still false, so now all 
> > > > > QMP
> > > > > commands are still protected by BQL still.
> > > > > 
> > > > > Signed-off-by: Peter Xu 
> > > > 
> > > > We should define what a 'without-bql' command is allowed to do:
> > > >'Commands that have without-bql set _may_ be called without the bql
> > > >being taken.  They must not take the bql or any other lock that may
> > > >become dependent on the bql.'
> > 
> > Sure.
> > 
> > > >(Do we need to say anything about RCU?)
> > 
> > Could I ask how is RCU related?
> 
> My definition above said that anything declared without bql couldn't
> take the bql, so couldn't block on any other thread holding the bql.
> But is our command allowed to use synchronize_rcu or rcu_read_lock
> that could wait for or block other threads doing rcu stuff?
> Because if it did is there any guarantee that it wouldn't block?

I see.  Shall we just ignore RCU for now?  Since currently I don't see
a real synchronize_rcu() user yet in QEMU, except the RCU thread.  And
rcu_read_lock() should not block itself, so IMHO calling it only in
monitor command handlers should always be fine?

> 
> 
> > 
> > > > 
> > > > Also, 'no-bql' is shorter :-)
> > > 
> > > Or rather "need-bql" that defaults to true to avoid double negative (TM) 
> > > with
> > > "no-bql = false"?
> > 
> > Ok let me use "need-bql". :)
> 
> Fine by me.

I'm switching to "need-bql" for QMP only, and used "no-bql" in HMP,
since I failed to find a good way to init mon_cmd_t field to true by
default.

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Cornelia Huck
On Fri, 25 Aug 2017 11:05:30 -0400
Farhan Ali  wrote:

> On 08/25/2017 10:04 AM, Cornelia Huck wrote:
> > On Fri, 25 Aug 2017 09:24:46 -0400
> > Farhan Ali  wrote:
> >  
> >> The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
> >> exposes an alignment issue in ccw bios.
> >>
> >> According to PoP the CCW must be doubleword aligned. Let's fix
> >> this in the bios.
> >>
> >> Cc: qemu-sta...@nongnu.org
> >> Signed-off-by: Farhan Ali 
> >> Reviewed-by: Halil Pasic 
> >> Reviewed-by: Eric Farman 
> >> Acked-by: Christian Borntraeger 
> >> ---
> >>  pc-bios/s390-ccw/cio.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
> >> index f5b4549..55eaeee 100644
> >> --- a/pc-bios/s390-ccw/cio.h
> >> +++ b/pc-bios/s390-ccw/cio.h
> >> @@ -133,7 +133,7 @@ struct ccw1 {
> >>  __u8 flags;
> >>  __u16 count;
> >>  __u32 cda;
> >> -} __attribute__ ((packed));
> >> +} __attribute__ ((packed, aligned(8)));
> >>
> >>  #define CCW_FLAG_DC  0x80
> >>  #define CCW_FLAG_CC  0x40  
> >
> > Currently testing.
> >
> > This looks obviously right, but did you figure out what the (probably
> > unrelated) other failure was?
> >  
> 
> That is still under investigation, for some reason it only fails for an 
> LDL DASD and it works for SCSIs and CDL DASD.

Which are the symptoms of the failure? I'd like to understand this
before I update the (currently working by accident) bios with an
updated version.

I'll just apply the patch for now.



Re: [Qemu-devel] [RFC v2 4/8] QAPI: new QMP command option "without-bql"

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 10:06:27AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Wed, Aug 23, 2017 at 06:44:12PM +0100, Dr. David Alan Gilbert wrote:
> > 
> > [...]
> > 
> > > > +Most of the commands require the Big QEMU Lock (BQL) be held during
> > > > +execution.  However, there is a small subset of the commands that may
> > > > +not really need BQL at all.  To mark out this kind of commands, we can
> > > > +specify "without-bql" to "true".  This parameter is only a hint for
> > > > +internal QMP implementation to provide possiblility to allow commands
> > > > +be run in parallel, or reduce the contention of the lock.  Users of QMP
> > > > +should not really be aware of such information.
> > > 
> > > Well, I think users of these commands might select them specifically
> > > because they know that they won't block.  Those who care about latency 
> > > might
> > > look to use commands that don't take the lock because of a reduced
> > > effect on the performance as well.
> > 
> > What would be the best way to tell user?  I think again this should
> > mostly for HMP only, right?
> 
> It needs to be docuemnted for QMP users as well so that those developing
> management code know what's safe.

I see.  What's the corresponding QMP documentation I should touch up?

> 
> > Maybe we can add a new command to list these lock-free commands.  Or,
> > I can dump something in "help" and "help info" like:
> > 
> > (qemu) help migrate_incoming
> > migrate_incoming uri -- Continue an incoming migration from an -incoming 
> > defer (BQL-less)
> 
> 'lock free' might be better?

I'm ok with it.  But would the word "lock" too general?

-- 
Peter Xu



[Qemu-devel] [PATCH 1/1] s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs

2017-08-28 Thread Christian Borntraeger
Right now there is a potential hang situation for postcopy migrations,
if the guest is enabling storage keys on the target system during the
postcopy process.

For storage key virtualization, we have to forbid the empty zero page as
the storage key is a property of the physical page frame.  As we enable
storage key handling lazily we then drop all mappings for empty zero
pages for lazy refaulting later on.

This does not work with the postcopy migration, which relies on the
empty zero page never triggering a fault again in the future. The reason
is that postcopy migration will simply read a page on the target system
if that page is a known zero page to fault in an empty zero page.  At
the same time postcopy remembers that this page was already transferred
- so any future userfault on that page will be NOT retransmitted again
to avoid races.

If now the guest enters the storage key mode while in postcopy, we will
break this assumption of postcopy.

The solution is to disable the empty zero page for KVM guests early on
and not during storage key enablement. With this change, the postcopy
migration process is guaranteed to start after no zero pages are left.

As guest pages are very likely not empty zero pages anyway the memory
overhead is also pretty small.

While at it this also adds proper page table locking to the zero page
removal.

Signed-off-by: Christian Borntraeger 
Cc: sta...@vger.kernel.org
---
 arch/s390/include/asm/pgtable.h |  2 +-
 arch/s390/mm/gmap.c | 41 ++---
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index bb59a0a..9d7f62a 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -506,7 +506,7 @@ static inline int mm_alloc_pgste(struct mm_struct *mm)
  * In the case that a guest uses storage keys
  * faults should no longer be backed by zero pages
  */
-#define mm_forbids_zeropage mm_use_skey
+#define mm_forbids_zeropage mm_has_pgste
 static inline int mm_use_skey(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 4fb3d3c..ae8bd57 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2121,6 +2121,39 @@ static inline void thp_split_mm(struct mm_struct *mm)
 }
 
 /*
+ * Remove all empty zero pages from the mapping for lazy refaulting
+ * - This must be called after mm->context.has_pgste is set, to avoid
+ *   future creation of zero pages
+ * - This must be called after THP was enabled
+ */
+static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
+  unsigned long end, struct mm_walk *walk)
+{
+   unsigned long addr;
+
+   for (addr = start; addr != end; addr += PAGE_SIZE) {
+   pte_t *ptep;
+   spinlock_t *ptl;
+
+   ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+   if (is_zero_pfn(pte_pfn(*ptep)))
+   ptep_xchg_direct(walk->mm, addr, ptep, 
__pte(_PAGE_INVALID));
+   pte_unmap_unlock(ptep, ptl);
+   }
+
+   return 0;
+
+}
+
+static inline void zap_zero_pages(struct mm_struct *mm)
+{
+   struct mm_walk walk = { .pmd_entry = __zap_zero_pages };
+
+   walk.mm = mm;
+   walk_page_range(0, TASK_SIZE, &walk);
+}
+
+/*
  * switch on pgstes for its userspace process (for kvm)
  */
 int s390_enable_sie(void)
@@ -2137,6 +2170,7 @@ int s390_enable_sie(void)
mm->context.has_pgste = 1;
/* split thp mappings and disable thp for future mappings */
thp_split_mm(mm);
+   zap_zero_pages(mm);
up_write(&mm->mmap_sem);
return 0;
 }
@@ -2149,13 +2183,6 @@ EXPORT_SYMBOL_GPL(s390_enable_sie);
 static int __s390_enable_skey(pte_t *pte, unsigned long addr,
  unsigned long next, struct mm_walk *walk)
 {
-   /*
-* Remove all zero page mappings,
-* after establishing a policy to forbid zero page mappings
-* following faults for that page will get fresh anonymous pages
-*/
-   if (is_zero_pfn(pte_pfn(*pte)))
-   ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
/* Clear storage key */
ptep_zap_key(walk->mm, addr, pte);
return 0;
-- 
2.7.4




[Qemu-devel] [PATCH 0/1] kernel fix for s390 postcopy hang

2017-08-28 Thread Christian Borntraeger
Andrea, David,

here is a patch that fixes my postcopy hang 
(https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg04129.html)

unless there are better ideas or complains, this will go via Martins
tree.

Christian Borntraeger (1):
  s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs

 arch/s390/include/asm/pgtable.h |  2 +-
 arch/s390/mm/gmap.c | 41 ++---
 2 files changed, 35 insertions(+), 8 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [RFC v2 8/8] migration: add incoming mgmt lock

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 10:34:56AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Wed, Aug 23, 2017 at 07:01:35PM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (pet...@redhat.com) wrote:
> > > > Now at least migrate_incoming can be run in parallel.  Let's provide a
> > > > migration lock to protect it.
> > > > 
> > > > Signed-off-by: Peter Xu 
> > > > ---
> > > >  migration/migration.c | 6 ++
> > > >  migration/migration.h | 3 +++
> > > >  2 files changed, 9 insertions(+)
> > > > 
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index c3fe0ed..32058f7 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -145,6 +145,7 @@ MigrationIncomingState 
> > > > *migration_incoming_get_current(void)
> > > >  mis_current.state = MIGRATION_STATUS_NONE;
> > > >  memset(&mis_current, 0, sizeof(MigrationIncomingState));
> > > >  qemu_mutex_init(&mis_current.rp_mutex);
> > > > +qemu_mutex_init(&mis_current.mgmt_mutex);
> > > >  qemu_event_init(&mis_current.main_thread_load_event, false);
> > > >  once = true;
> > > >  }
> > > > @@ -1171,6 +1172,7 @@ void qmp_migrate_incoming(const char *uri, Error 
> > > > **errp)
> > > >  {
> > > >  Error *local_err = NULL;
> > > >  static bool once = true;
> > > > +MigrationIncomingState *mis = migration_incoming_get_current();
> > > 
> > > migration_incoming_get_current isn't actually thread-safe itself unless
> > > you can guarantee the initial allocation has happened - otherwise both
> > > threads can race and do the 'once' code at the same time.
> > 
> > How about I init the incoming object as well in
> > migration_object_init()?
> 
> Yes I think that might work.

This change would suite better for the postcopy recovery series.  Will
add one more patch for it.

> 
> > > 
> > > Similarly, these locks - they don't protect our 'once' - so a second
> > > thread could come in here and both get past the !once check.
> > 
> > Oh I missed this one since actually I am removing that "once" variable
> > in postcopy recovery series. :)
> > 
> > I can put the last two patches into postcopy recovery series, then
> > it'll be fine.
> 
> OK; these thigns just emphasise how hard it is to make a function really
> lock free.

Agreed.

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 4/5] pci: Add INTERFACE_LEGACY_PCI_DEVICE to legacy PCI devices

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 09:39:22 PM CEST, Eduardo Habkost wrote:
> CCing maintainers of affected devices (sorry for not CCing you
> before).

>> diff --git a/hw/ipack/tpci200.c b/hw/ipack/tpci200.c
>> index 4dfa6b3..e380378 100644
>> --- a/hw/ipack/tpci200.c
>> +++ b/hw/ipack/tpci200.c
>> @@ -646,6 +646,10 @@ static const TypeInfo tpci200_info = {
>>  .parent= TYPE_PCI_DEVICE,
>>  .instance_size = sizeof(TPCI200State),
>>  .class_init= tpci200_class_init,
>> +.interfaces = (InterfaceInfo[]) {
>> +{ INTERFACE_LEGACY_PCI_DEVICE },
>> +{ },
>> +},
>>  };

Acked-by: Alberto Garcia 

Berto



[Qemu-devel] [Bug 1713328] Re: Unable to C-a in -nographic if -serial telnet

2017-08-28 Thread Thomas Huth
Well, with your "-serial" setup, you've put the guest serial console on
the telnet port, so there is nothing to switch on the host console here
via the CTRL-a c key combination, i.e. this is the expected behavior.
What exactly were you trying to do here? Access the serial console via
two ways, one time via telnet and one time via the host console? AFAIK
that's not possible.

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713328

Title:
  Unable to C-a in -nographic if -serial telnet

Status in QEMU:
  Invalid

Bug description:
  qemu-system-i386 (version 2.6.1, running on Linux/x86_64) started
  with:

  qemu-system-i386 -m 64M -machine type=pc -rtc
  base=localtime,clock=host -nographic -serial
  telnet:127.0.0.1:1234,server,nowait -net nic,model=ne2k_pci -net
  user,hostfwd=tcp:127.0.0.1:2200-:22,tftp=/

  does not accept the escape key (C-a) to perform functions such as
  switching from monitor to console. Verified both in GNU screen and in
  the Linux console.

  If '-serial telnet:127.0.0.1:1234,server,nowait' is removed from the
  command line, the escape key is accepted (and Qemu doesn't enter the
  monitor immediately).

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713328/+subscriptions



Re: [Qemu-devel] [PATCH 1/2] migration: Reset rather than destroy main_thread_load_event

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 04:51:29PM +0100, Dr. David Alan Gilbert wrote:

[...]

> > PS, in migration_incoming_get_current() we do
> > mis_current.state = MIGRATION_STATUS_NONE;
> > memset(&mis_current, 0, sizeof(MigrationIncomingState));
> > 
> > and the first line there is pointless because the memset
> > blasts zeroes over it anyway.
> 
> Hmm yes.

I'll include this cleanup within my (future) patch to init incoming
migration object in migration_object_init() as well.  Thanks,

-- 
Peter Xu



Re: [Qemu-devel] [Qemu-ppc] [PATCH v2] spapr: fallback to raw mode if best compat mode cannot be set during CAS

2017-08-28 Thread Greg Kurz
On Thu, 17 Aug 2017 13:23:50 +0200
Greg Kurz  wrote:

> KVM PR doesn't allow to set a compat mode. This causes ppc_set_compat_all()
> to fail and we return H_HARDWARE to the guest right away.
> 
> This is excessive: even if we favor compat mode since commit 152ef803ceb19,
> we should at least fallback to raw mode if the guest supports it.
> 
> This patch modifies cas_check_pvr() so that it also reports that the real
> PVR was found in the table supplied by the guest. Note that this is only
> makes sense if raw mode isn't explicitely disabled (ie, the user didn't
> set the machine "max-cpu-compat" property). If this is the case, we can
> simply ignore ppc_set_compat_all() failures, and let the guest run in raw
> mode.
> 
> Signed-off-by: Greg Kurz 
> ---
> v2: - initialize raw_mode_supported to silent patchew
> ---

Ping ?

>  hw/ppc/spapr_hcall.c |   18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 07b3da8dc4cd..2f4c4f59e110 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -1441,7 +1441,8 @@ static target_ulong h_signal_sys_reset(PowerPCCPU *cpu,
>  }
>  
>  static uint32_t cas_check_pvr(sPAPRMachineState *spapr, PowerPCCPU *cpu,
> -  target_ulong *addr, Error **errp)
> +  target_ulong *addr, bool *raw_mode_supported,
> +  Error **errp)
>  {
>  bool explicit_match = false; /* Matched the CPU's real PVR */
>  uint32_t max_compat = spapr->max_compat_pvr;
> @@ -1481,6 +1482,8 @@ static uint32_t cas_check_pvr(sPAPRMachineState *spapr, 
> PowerPCCPU *cpu,
>  return 0;
>  }
>  
> +*raw_mode_supported = explicit_match;
> +
>  /* Parsing finished */
>  trace_spapr_cas_pvr(cpu->compat_pvr, explicit_match, best_compat);
>  
> @@ -1499,8 +1502,9 @@ static target_ulong 
> h_client_architecture_support(PowerPCCPU *cpu,
>  sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates;
>  bool guest_radix;
>  Error *local_err = NULL;
> +bool raw_mode_supported = false;
>  
> -cas_pvr = cas_check_pvr(spapr, cpu, &addr, &local_err);
> +cas_pvr = cas_check_pvr(spapr, cpu, &addr, &raw_mode_supported, 
> &local_err);
>  if (local_err) {
>  error_report_err(local_err);
>  return H_HARDWARE;
> @@ -1510,8 +1514,14 @@ static target_ulong 
> h_client_architecture_support(PowerPCCPU *cpu,
>  if (cpu->compat_pvr != cas_pvr) {
>  ppc_set_compat_all(cas_pvr, &local_err);
>  if (local_err) {
> -error_report_err(local_err);
> -return H_HARDWARE;
> +/* We fail to set compat mode (likely because running with KVM 
> PR),
> + * but maybe we can fallback to raw mode if the guest supports 
> it.
> + */
> +if (!raw_mode_supported) {
> +error_report_err(local_err);
> +return H_HARDWARE;
> +}
> +local_err = NULL;
>  }
>  }
>  
> 
> 



pgp2ZzHtK5GhR.pgp
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/2] migration: Reset rather than destroy main_thread_load_event

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 03:19:39PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> migration_incoming_state_destroy doesn't really destroy, it cleans up.
> After a loadvm it's called, but the loadvm command can be run twice,
> and so destroying an init-once mutex breaks on the second loadvm.
> 
> Reported-by: Stafford Horne 
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Peter Xu 

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 2/2] snapshot/tests: Try loadvm twice

2017-08-28 Thread Peter Xu
On Fri, Aug 25, 2017 at 03:19:40PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> It's legal to loadvm twice, modify the existing save/loadvm test
> to do it twice.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Peter Xu 

(a question below though)

> ---
>  tests/qemu-iotests/068 | 2 +-
>  tests/qemu-iotests/068.out | 4 
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/068 b/tests/qemu-iotests/068
> index cfa0f2aed5..e7fca6a494 100755
> --- a/tests/qemu-iotests/068
> +++ b/tests/qemu-iotests/068
> @@ -78,7 +78,7 @@ for extra_args in \
>  # Give qemu some time to boot before saving the VM state
>  { sleep 1; printf "savevm 0\nquit\n"; } | _qemu $extra_args
>  # Now try to continue from that VM state (this should just work)
> -echo quit | _qemu $extra_args -loadvm 0
> +{ sleep 1; printf "loadvm 0\nloadvm 0\nquit\n"; } | _qemu $extra_args -S

Curious about why there are both "sleep 1" and "-S" added - I thought
"-S" means CPU won't really run, then why wait for one more second?

>  done
>  
>  # success, all done
> diff --git a/tests/qemu-iotests/068.out b/tests/qemu-iotests/068.out
> index aa063cf711..f07a938a38 100644
> --- a/tests/qemu-iotests/068.out
> +++ b/tests/qemu-iotests/068.out
> @@ -7,6 +7,8 @@ QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) savevm 0
>  (qemu) quit
>  QEMU X.Y.Z monitor - type 'help' for more information
> +(qemu) loadvm 0
> +(qemu) loadvm 0
>  (qemu) quit
>  
>  === Saving and reloading a VM state to/from a qcow2 image (-object 
> iothread,id=iothread0 -set device.hba0.iothread=iothread0) ===
> @@ -16,5 +18,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) savevm 0
>  (qemu) quit
>  QEMU X.Y.Z monitor - type 'help' for more information
> +(qemu) loadvm 0
> +(qemu) loadvm 0
>  (qemu) quit
>  *** done
> -- 
> 2.13.5
> 

-- 
Peter Xu



Re: [Qemu-devel] Qemu 2.10 rc4 build issue on BE (luigi burdo)

2017-08-28 Thread Igor Mammedov
On Mon, 28 Aug 2017 06:13:03 +
luigi burdo  wrote:

> Hi,
> 
> this is the log that was attached in my email.
> 
> i will test only i386 softmmu when will return at home
> 
> thanks
> 
> Luigi
> 
> 
> ./configure 
> --target-list=ppc64-softmmu,ppc-softmmu,x86_64-softmmu,arm-softmmu,i386-softmmu
>  --with-sdlabi=2.0 --with-gtkabi=3.0 --audio-drv-list=pa,sdl,alsa 
> --disable-werror
> Install prefix/usr/local
> BIOS directory/usr/local/share/qemu
> binary directory  /usr/local/bin
> library directory /usr/local/lib
> module directory  /usr/local/lib/qemu
> libexec directory /usr/local/libexec
> include directory /usr/local/include
> config directory  /usr/local/etc
> local state directory   /usr/local/var
> Manual directory  /usr/local/share/man
> ELF interp prefix /usr/gnemul/qemu-%M
> Source path   /home/amigaone/Downloads/qemu-2.10.0-rc4
> C compilercc
> Host C compiler   cc
> C++ compiler  c++
> Objective-C compiler clang
> ARFLAGS   rv
> CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
> QEMU_CFLAGS   -I/usr/include/pixman-1  -pthread -I/usr/include/glib-2.0 
> -I/usr/lib64/glib-2.0/include -DNCURSES_WIDECHAR -D_GNU_SOURCE -m64 
> -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
> -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
> -fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels 
> -Wno-shift-negative-value -Wno-missing-include-dirs -Wempty-body 
> -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
> -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
> -Wtype-limits -fstack-protector-strong -I/usr/include/p11-kit-1   
> -I/usr/include/libpng16 -I/usr/include/libdrm  -I/usr/include/libusb-1.0
> LDFLAGS   -Wl,--warn-common -m64 -g
> make  make
> install   install
> pythonpython -B
> smbd  /usr/sbin/smbd
> module supportno
> host CPU  ppc64
> host big endian   yes
> target list   ppc64-softmmu ppc-softmmu x86_64-softmmu arm-softmmu 
> i386-softmmu
> gprof enabled no
> sparse enabledno
> strip binariesyes
> profiler  no
> static build  no
> pixmansystem
> SDL support   yes (2.0.5)
> GTK support   yes (3.22.17)
> GTK GL supportyes
> VTE support   no
> TLS priority  NORMAL
> GNUTLS supportyes
> GNUTLS rndyes
> libgcrypt no
> libgcrypt kdf no
> nettleyes (3.3)
> nettle kdfyes
> libtasn1  yes
> curses supportyes
> virgl support yes
> curl support  no
> mingw32 support   no
> Audio drivers pa sdl alsa
> Block whitelist (rw)
> Block whitelist (ro)
> VirtFS supportyes
> VNC support   yes
> VNC SASL support  no
> VNC JPEG support  yes
> VNC PNG support   yes
> xen support   no
> brlapi supportno
> bluez  supportno
> Documentation no
> PIE   no
> vde support   no
> netmap supportno
> Linux AIO support yes
> ATTR/XATTR support yes
> Install blobs yes
> KVM support   yes
> HAX support   no
> TCG support   yes
> TCG debug enabled no
> TCG interpreter   no
> RDMA support  no
> fdt support   yes
> preadv supportyes
> fdatasync yes
> madvise   yes
> posix_madvise yes
> libcap-ng support no
> vhost-net support yes
> vhost-scsi support yes
> vhost-vsock support yes
> vhost-user support yes
> Trace backendslog
> spice support no
> rbd support   no
> xfsctl supportno
> smartcard support no
> libusbyes
> usb net redir no
> OpenGL supportyes
> OpenGL dmabufsyes
> libiscsi support  no
> libnfs supportno
> build guest agent yes
> QGA VSS support   no
> QGA w32 disk info no
> QGA MSI support   no
> seccomp support   no
> coroutine backend ucontext
> coroutine poolyes
> debug stack usage no
> crypto afalg  no
> GlusterFS support no
> gcov  gcov
> gcov enabled  no
> TPM support   yes
> libssh2 support   no
> TPM passthrough   no
> QOM debugging yes
> Live block migration yes
> lzo support   no
> snappy supportyes
> bzip2 support yes
> NUMA host support yes
> tcmalloc support  no
> jemalloc support  no
> avx2 optimization no
> replication support yes
> VxHS block device no
> 
> 
[...]

below errors look like corrupted source files

>   CC  x86_64-softmmu/hw/i386/pc.o
> /home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c: In function 
> 'pc_dimm_unplug_request':
> /home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c:1744:9: error: unknown 
> type name 'gCto'
>  gCto out;
>  ^~~~
> /home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c:1744:14: warning: 
> unused variable 'out' [-Wunused-variable]
>  gCto out;
>   ^~~
> /home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c:1747:49: error: stray 
> '@' in program
>  if (object_dynamic_cast(OBJECT(dev), TYPE_NV@IMM)) {
>

Re: [Qemu-devel] Qemu 2.10 rc4 build issue on BE (luigi burdo)

2017-08-28 Thread luigi burdo
Hi Igor,


below errors look like corrupted source files

>


i ust dowloaded the compressed file from qemu website,unzipped with tar from 
console (like i usually did)  and run configure and make.

can be the file of rc4 from qemu website not right and need a repack?


Luigi



[Qemu-devel] [Bug 1713434] [NEW] prom-env-test test aborted and core dumped

2017-08-28 Thread R.Nageswara Sastry
Public bug reported:

On ppc64le architecture machine the following test case Aborted and Core
dumped.

# tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
**
ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
Aborted (core dumped)

Steps to re-produce:
clone from the git
configure & compile 
run the unit tests by 'make check'

(gdb) bt
#0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
#1  0x3fff9d61136c in abort () from /lib64/libc.so.6
#2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
#3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
#4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
#5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
(gdb) i r
r0 0xfa 250
r1 0x32144d30   70368510627120
r2 0x3fff9d7b9900   7036709176
r3 0x0  0
r4 0x12a7   4775
r5 0x6  6
r6 0x8  8
r7 0x1  1
r8 0x0  0
r9 0x0  0
r100x0  0
r110x0  0
r120x0  0
r130x3fff9dfa1950   70367099623760
r140x0  0
r150x0  0
r160x0  0
r170x0  0
r180x0  0
r190x0  0
r200x0  0
r210x0  0
r220x0  0
r230x0  0
r240x0  0
r250x0  0
r260x0  0
r270x100287f8   268601336
r280x16841b40   377756480
r290x4c 76
r300x32144de8   70368510627304
r310x6  6
pc 0x3fff9d60eff0   0x3fff9d60eff0 
msr0x9280f033   10376293541503627315
cr 0x42000842   1107298370
lr 0x3fff9d61136c   0x3fff9d61136c 
ctr0x0  0
xer0x0  0
orig_r30x12a7   4775
trap   0xc003072

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  New

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



Re: [Qemu-devel] [Qemu-ppc] Qemu 2.10 rc4 build issue on BE

2017-08-28 Thread Thomas Huth
On 27.08.2017 18:56, luigi burdo wrote:
> Hi all,
> 
> the last rc4 from the qemu website fail in build in some parts on BE
> hardware.
> 
> attached on this email there is the configure and the build log hope it
> helps
> 
> 
> My machine is : PowerMac G5 Quad .
> 
> Distro is Ferdora 25 PPC64.

In your log there is:

/home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c: In function
‘pc_dimm_unplug_request’:
/home/amigaone/Downloads/qemu-2.10.0-rc4/hw/i386/pc.c:1744:9: error:
unknown type name ‘gCto’
 gCto out;

But if you have a look at the freshly unpacked sources, that line
clearly reads "goto out;" and not "gCto out;" ... so it looks like
something messed up your sources very badly.
Can you reproduce this problem when starting again from scratch?

 Thomas



Re: [Qemu-devel] Qemu 2.10 rc4 build issue on BE (luigi burdo)

2017-08-28 Thread Igor Mammedov
On Mon, 28 Aug 2017 09:14:48 +
luigi burdo  wrote:

> Hi Igor,
> 
> 
> below errors look like corrupted source files
> 
> >  
> 
> 
> i ust dowloaded the compressed file from qemu website,unzipped with tar from 
> console (like i usually did)  and run configure and make.
> 
> can be the file of rc4 from qemu website not right and need a repack?
I've just downloaded it from 
http://download.qemu-project.org/qemu-2.10.0-rc4.tar.xz
and unpacked, it works for me (i.e. hw/i386/pc.c isn't corrupted).

> 
> Luigi
> 




Re: [Qemu-devel] [PATCH v9 4/6] block: convert ThrottleGroup to object with QOM

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 03:20:26 PM CEST, Manos Pitsidianakis wrote:
> ThrottleGroup is converted to an object. This will allow the future
> throttle block filter drive easy creation and configuration of throttle
> groups in QMP and cli.

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH] slirp: fix clearing ifq_so from pending packets

2017-08-28 Thread P J P
  Hello Samuel,

+-- On Sat, 26 Aug 2017, Samuel Thibault wrote --+
| So Wjjzhang and PJP, can you confirm that this fixes your uses?

Yes, I confirm the patch fixes the use-after-free issue.

Thank you so much.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F



[Qemu-devel] [Bug 1713434] Re: prom-env-test test aborted and core dumped

2017-08-28 Thread Thomas Huth
When running tests directly, you've got to specify the QEMU binary like
this:

QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 tests/prom-env-test
--quiet --keep-going -m=quick

But the abort() is indeed ugly here and I think we should print out a
user-friendly error message instead.

** Changed in: qemu
 Assignee: (unassigned) => Thomas Huth (th-huth)

** Changed in: qemu
   Status: New => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  In Progress

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll

2017-08-28 Thread Marc-André Lureau
Hi

On Mon, Aug 28, 2017 at 5:05 AM, Peter Xu  wrote:
> On Fri, Aug 25, 2017 at 04:07:34PM +, Marc-André Lureau wrote:
>> On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert 
>> wrote:
>>
>> > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
>> > > Hi
>> > >
>> > > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu  wrote:
>> > >
>> > > > Firstly, introduce Monitor.use_thread, and set it for monitors that are
>> > > > using non-mux typed backend chardev.  We only do this for monitors, so
>> > > > mux-typed chardevs are not suitable (when it connects to, e.g., serials
>> > > > and the monitor together).
>> > > >
>> > > > When use_thread is set, we create standalone thread to poll the monitor
>> > > > events, isolated from the main loop thread.  Here we still need to take
>> > > > the BQL before dispatching the tasks since some of the monitor commands
>> > > > are not allowed to execute without the protection of BQL.  Then this
>> > > > gives us the chance to avoid taking the BQL for some monitor commands
>> > in
>> > > > the future.
>> > > >
>> > > > * Why this change?
>> > > >
>> > > > We need these per-monitor threads to make sure we can have at least one
>> > > > monitor that will never stuck (that can receive further monitor
>> > > > commands).
>> > > >
>> > > > * So when will monitors stuck?  And, how do they stuck?
>> > > >
>> > > > After we have postcopy and remote page faults, it's simple to achieve a
>> > > > stuck in the monitor (which is also a stuck in main loop thread):
>> > > >
>> > > > (1) Monitor deadlock on BQL
>> > > >
>> > > > As we may know, when postcopy is running on destination VM, the vcpu
>> > > > threads can stuck merely any time as long as it tries to access an
>> > > > uncopied guest page.  Meanwhile, when the stuck happens, it is possible
>> > > > that the vcpu thread is holding the BQL.  If the page fault is not
>> > > > handled quickly, you'll find that monitors stop working, which is
>> > trying
>> > > > to take the BQL.
>> > > >
>> > > > If the page fault cannot be handled correctly (one case is a paused
>> > > > postcopy, when network is temporarily down), monitors will hang
>> > > > forever.  Without current patch, that means the main loop hanged.
>> > We'll
>> > > > never find a way to talk to VM again.
>> > > >
>> > >
>> > > Could the BQL be pushed down to the monitor commands level instead? That
>> > > way we wouldn't need a seperate thread to solve the hang on commands that
>> > > do not need BQL.
>> >
>> > If the main thread is stuck though I don't see how that helps you; you
>> > have to be able to run these commands on another thread.
>> >
>>
>> Why would the main thread be stuck? In (1) If the vcpu thread takes the BQL
>> and the command doesn't need it, it would work.  In (2),  info cpus
>> shouldn't keep the BQL (my qapi-async series would probably help here)
>
> (Thanks for joining the discussion)
>
> AFAIK the main thread can be stuck for many reasons.  I have seen one
> stack when the VGA code (IIUC) was trying to writting to guest graphic
> memory in main loop thread but luckily that guest page is still not
> copied yet from source.  As long as the main thread is stuck for any
> reason, no chance for monitor commands, even if the commands support
> async operations.

If that command becomes async (it probably should, any command doing
IO probaly should), then the main loop can keep running.

>
> So IMHO the only solution is doing these things in separate threads,
> rather than all in a single one.

I wouldn't say it's the only solution. I think the monitor can touch
many areas that haven't been written with multi-threading in mind. My
proposal is probably safer, although I don't know how hard it would be
to push the BQL down to QMP commands, and make async existing IO
commands. The benefits of this work are quite interesting imho,
because a stuck mainloop is basically a stuck qemu, and an additional
thread will not solve it...

-- 
Marc-André Lureau



[Qemu-devel] [PATCH v15 0/5] Virtio-balloon Enhancement

2017-08-28 Thread Wei Wang
This patch series enhances the existing virtio-balloon with the following
new features:
1) fast ballooning: transfer ballooned pages between the guest and host in
chunks using sgs, instead of one by one; and
2) free page block reporting: a new virtqueue to report guest free pages
to the host.

The second feature can be used to accelerate live migration of VMs. Here
are some details:

Live migration needs to transfer the VM's memory from the source machine
to the destination round by round. For the 1st round, all the VM's memory
is transferred. From the 2nd round, only the pieces of memory that were
written by the guest (after the 1st round) are transferred. One method
that is popularly used by the hypervisor to track which part of memory is
written is to write-protect all the guest memory.

The second feature  enables the optimization of the 1st round memory
transfer - the hypervisor can skip the transfer of guest free pages in the
1st round. It is not concerned that the memory pages are used after they
are given to the hypervisor as a hint of the free pages, because they will
be tracked by the hypervisor and transferred in the next round if they are
used and written.

Change Log:
v14->v15:
1) mm: make the report callback return a bool value - returning 1 to stop
walking through the free page list.
2) virtio-balloon: batching sgs of balloon pages till the vq is full
3) virtio-balloon: create a new workqueue, rather than using the default
system_wq, to queue the free page reporting work item.
4) virtio-balloon: add a ctrl_vq to be a central control plane which will
handle all the future control related commands between the host and guest.
Add free page report as the first feature controlled under ctrl_vq, and
the free_page_vq is a data plane vq dedicated to the transmission of free
page blocks.

v13->v14:
1) xbitmap: move the code from lib/radix-tree.c to lib/xbitmap.c.
2) xbitmap: consolidate the implementation of xb_bit_set/clear/test into
one xb_bit_ops.
3) xbitmap: add documents for the exported APIs.
4) mm: rewrite the function to walk through free page blocks.
5) virtio-balloon: when reporting a free page blcok to the device, if the
vq is full (less likey to happen in practice), just skip reporting this
block, instead of busywaiting till an entry gets released.
6) virtio-balloon: fail the probe function if adding the signal buf in
init_vqs fails.

v12->v13:
1) mm: use a callback function to handle the the free page blocks from the
report function. This avoids exposing the zone internal to a kernel
module.
2) virtio-balloon: send balloon pages or a free page block using a single
sg each time. This has the benefits of simpler implementation with no new
APIs.
3) virtio-balloon: the free_page_vq is used to report free pages only (no
multiple usages interleaving)
4) virtio-balloon: Balloon pages and free page blocks are sent via input
sgs, and the completion signal to the host is sent via an output sg.

v11->v12:
1) xbitmap: use the xbitmap from Matthew Wilcox to record ballooned pages.
2) virtio-ring: enable the driver to build up a desc chain using vring
desc.
3) virtio-ring: Add locking to the existing START_USE() and END_USE()
macro to lock/unlock the vq when a vq operation starts/ends.
4) virtio-ring: add virtqueue_kick_sync() and virtqueue_kick_async()
5) virtio-balloon: describe chunks of ballooned pages and free pages
blocks directly using one or more chains of desc from the vq.

v10->v11:
1) virtio_balloon: use vring_desc to describe a chunk;
2) virtio_ring: support to add an indirect desc table to virtqueue;
3)  virtio_balloon: use cmdq to report guest memory statistics.

v9->v10:
1) mm: put report_unused_page_block() under CONFIG_VIRTIO_BALLOON;
2) virtio-balloon: add virtballoon_validate();
3) virtio-balloon: msg format change;
4) virtio-balloon: move miscq handling to a task on system_freezable_wq;
5) virtio-balloon: code cleanup.

v8->v9:
1) Split the two new features, VIRTIO_BALLOON_F_BALLOON_CHUNKS and
VIRTIO_BALLOON_F_MISC_VQ, which were mixed together in the previous
implementation;
2) Simpler function to get the free page block.

v7->v8:
1) Use only one chunk format, instead of two.
2) re-write the virtio-balloon implementation patch.
3) commit changes
4) patch re-org


Matthew Wilcox (1):
  lib/xbitmap: Introduce xbitmap

Wei Wang (4):
  lib/xbitmap: add xb_find_next_bit() and xb_zero()
  virtio-balloon: VIRTIO_BALLOON_F_SG
  mm: support reporting free page blocks
  virtio-balloon: VIRTIO_BALLOON_F_CTRL_VQ

 drivers/virtio/virtio_balloon.c | 418 
 include/linux/mm.h  |   5 +
 include/linux/radix-tree.h  |   3 +
 include/linux/xbitmap.h |  64 ++
 include/uapi/linux/virtio_balloon.h |  16 ++
 lib/Makefile|   2 +-
 lib/radix-tree.c|  22 +-
 lib/xbitmap.c   | 215 +++
 mm/page_alloc.c |  65 ++
 9 files changed,

[Qemu-devel] [PATCH v15 1/5] lib/xbitmap: Introduce xbitmap

2017-08-28 Thread Wei Wang
From: Matthew Wilcox 

The eXtensible Bitmap is a sparse bitmap representation which is
efficient for set bits which tend to cluster.  It supports up to
'unsigned long' worth of bits, and this commit adds the bare bones --
xb_set_bit(), xb_clear_bit() and xb_test_bit().

Signed-off-by: Matthew Wilcox 
Signed-off-by: Wei Wang 
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: Michael S. Tsirkin 
---
 include/linux/radix-tree.h |   3 +
 include/linux/xbitmap.h|  61 
 lib/Makefile   |   2 +-
 lib/radix-tree.c   |  22 +-
 lib/xbitmap.c  | 176 +
 5 files changed, 260 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/xbitmap.h
 create mode 100644 lib/xbitmap.c

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 3e57350..e1203b1 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -309,6 +309,8 @@ void radix_tree_iter_replace(struct radix_tree_root *,
const struct radix_tree_iter *, void __rcu **slot, void *entry);
 void radix_tree_replace_slot(struct radix_tree_root *,
 void __rcu **slot, void *entry);
+bool __radix_tree_delete(struct radix_tree_root *root,
+struct radix_tree_node *node, void __rcu **slot);
 void __radix_tree_delete_node(struct radix_tree_root *,
  struct radix_tree_node *,
  radix_tree_update_node_t update_node,
@@ -325,6 +327,7 @@ unsigned int radix_tree_gang_lookup(const struct 
radix_tree_root *,
 unsigned int radix_tree_gang_lookup_slot(const struct radix_tree_root *,
void __rcu ***results, unsigned long *indices,
unsigned long first_index, unsigned int max_items);
+int __radix_tree_preload(gfp_t gfp_mask, unsigned int nr);
 int radix_tree_preload(gfp_t gfp_mask);
 int radix_tree_maybe_preload(gfp_t gfp_mask);
 int radix_tree_maybe_preload_order(gfp_t gfp_mask, int order);
diff --git a/include/linux/xbitmap.h b/include/linux/xbitmap.h
new file mode 100644
index 000..25b05ff
--- /dev/null
+++ b/include/linux/xbitmap.h
@@ -0,0 +1,61 @@
+/*
+ * eXtensible Bitmaps
+ * Copyright (c) 2017 Microsoft Corporation 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * eXtensible Bitmaps provide an unlimited-size sparse bitmap facility.
+ * All bits are initially zero.
+ */
+
+#ifndef __XBITMAP_H__
+#define __XBITMAP_H__
+
+#include 
+
+struct xb {
+   struct radix_tree_root xbrt;
+};
+
+#define XB_INIT {  \
+   .xbrt = RADIX_TREE_INIT(IDR_RT_MARKER | GFP_NOWAIT),\
+}
+#define DEFINE_XB(name)struct xb name = XB_INIT
+
+static inline void xb_init(struct xb *xb)
+{
+   INIT_RADIX_TREE(&xb->xbrt, IDR_RT_MARKER | GFP_NOWAIT);
+}
+
+int xb_set_bit(struct xb *xb, unsigned long bit);
+bool xb_test_bit(struct xb *xb, unsigned long bit);
+void xb_clear_bit(struct xb *xb, unsigned long bit);
+
+/* Check if the xb tree is empty */
+static inline bool xb_is_empty(const struct xb *xb)
+{
+   return radix_tree_empty(&xb->xbrt);
+}
+
+void xb_preload(gfp_t gfp);
+
+/**
+ * xb_preload_end - end preload section started with xb_preload()
+ *
+ * Each xb_preload() should be matched with an invocation of this
+ * function. See xb_preload() for details.
+ */
+static inline void xb_preload_end(void)
+{
+   preempt_enable();
+}
+
+#endif
diff --git a/lib/Makefile b/lib/Makefile
index 40c1837..ea50496 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -18,7 +18,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n
 
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 rbtree.o radix-tree.o dump_stack.o timerqueue.o\
-idr.o int_sqrt.o extable.o \
+idr.o xbitmap.o int_sqrt.o extable.o \
 sha1.o chacha20.o irq_regs.o argv_split.o \
 flex_proportions.o ratelimit.o show_mem.o \
 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 898e879..ee72e2c 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -463,7 +463,7 @@ radix_tree_node_free(struct radix_tree_node *node)
  * To make use of this facility, the radix tree must be initialised without
  * __GFP_DIRECT_RECLAIM being passed to INIT_RADIX_TREE().
  */
-static int __radix_tree_preload(gfp_t gfp_mask, unsigned nr)
+int __radix_tree_preload(gfp_t gfp_mask, unsigned int nr)
 {
struct radix_tree_pre

[Qemu-devel] [PATCH v15 2/5] lib/xbitmap: add xb_find_next_bit() and xb_zero()

2017-08-28 Thread Wei Wang
xb_find_next_bit() is used to find the next "1" or "0" bit in the
given range. xb_zero() is used to zero the given range of bits.

Signed-off-by: Wei Wang 
Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Michal Hocko 
Cc: Michael S. Tsirkin 
---
 include/linux/xbitmap.h |  3 +++
 lib/xbitmap.c   | 39 +++
 2 files changed, 42 insertions(+)

diff --git a/include/linux/xbitmap.h b/include/linux/xbitmap.h
index 25b05ff..0061f7a 100644
--- a/include/linux/xbitmap.h
+++ b/include/linux/xbitmap.h
@@ -38,6 +38,9 @@ static inline void xb_init(struct xb *xb)
 int xb_set_bit(struct xb *xb, unsigned long bit);
 bool xb_test_bit(struct xb *xb, unsigned long bit);
 void xb_clear_bit(struct xb *xb, unsigned long bit);
+void xb_zero(struct xb *xb, unsigned long start, unsigned long end);
+unsigned long xb_find_next_bit(struct xb *xb, unsigned long start,
+  unsigned long end, bool set);
 
 /* Check if the xb tree is empty */
 static inline bool xb_is_empty(const struct xb *xb)
diff --git a/lib/xbitmap.c b/lib/xbitmap.c
index 8c55296..b9e2a0c 100644
--- a/lib/xbitmap.c
+++ b/lib/xbitmap.c
@@ -174,3 +174,42 @@ void xb_preload(gfp_t gfp)
}
 }
 EXPORT_SYMBOL(xb_preload);
+
+/**
+ *  xb_zero - zero a range of bits in the xbitmap
+ *  @xb: the xbitmap that the bits reside in
+ *  @start: the start of the range, inclusive
+ *  @end: the end of the range, inclusive
+ */
+void xb_zero(struct xb *xb, unsigned long start, unsigned long end)
+{
+   unsigned long i;
+
+   for (i = start; i <= end; i++)
+   xb_clear_bit(xb, i);
+}
+EXPORT_SYMBOL(xb_zero);
+
+/**
+ * xb_find_next_bit - find next 1 or 0 in the give range of bits
+ * @xb: the xbitmap that the bits reside in
+ * @start: the start of the range, inclusive
+ * @end: the end of the range, inclusive
+ * @set: the polarity (1 or 0) of the next bit to find
+ *
+ * Return the index of the found bit in the xbitmap. If the returned index
+ * exceeds @end, it indicates that no such bit is found in the given range.
+ */
+unsigned long xb_find_next_bit(struct xb *xb, unsigned long start,
+  unsigned long end, bool set)
+{
+   unsigned long i;
+
+   for (i = start; i <= end; i++) {
+   if (xb_test_bit(xb, i) == set)
+   break;
+   }
+
+   return i;
+}
+EXPORT_SYMBOL(xb_find_next_bit);
-- 
2.7.4




[Qemu-devel] [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-08-28 Thread Wei Wang
Add a new feature, VIRTIO_BALLOON_F_SG, which enables the transfer
of balloon (i.e. inflated/deflated) pages using scatter-gather lists
to the host.

The implementation of the previous virtio-balloon is not very
efficient, because the balloon pages are transferred to the
host one by one. Here is the breakdown of the time in percentage
spent on each step of the balloon inflating process (inflating
7GB of an 8GB idle guest).

1) allocating pages (6.5%)
2) sending PFNs to host (68.3%)
3) address translation (6.1%)
4) madvise (19%)

It takes about 4126ms for the inflating process to complete.
The above profiling shows that the bottlenecks are stage 2)
and stage 4).

This patch optimizes step 2) by transferring pages to the host in
sgs. An sg describes a chunk of guest physically continuous pages.
With this mechanism, step 4) can also be optimized by doing address
translation and madvise() in chunks rather than page by page.

With this new feature, the above ballooning process takes ~597ms
resulting in an improvement of ~86%.

TODO: optimize stage 1) by allocating/freeing a chunk of pages
instead of a single page each time.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Suggested-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio_balloon.c | 171 
 include/uapi/linux/virtio_balloon.h |   1 +
 2 files changed, 155 insertions(+), 17 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f0b3a0b..8ecc1d4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -32,6 +32,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -79,6 +81,9 @@ struct virtio_balloon {
/* Synchronize access/update to this struct virtio_balloon elements */
struct mutex balloon_lock;
 
+   /* The xbitmap used to record balloon pages */
+   struct xb page_xb;
+
/* The array of pfns we tell the Host about. */
unsigned int num_pfns;
__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
@@ -141,13 +146,111 @@ static void set_page_pfns(struct virtio_balloon *vb,
  page_to_balloon_pfn(page) + i);
 }
 
+static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)
+{
+   struct scatterlist sg;
+
+   sg_init_one(&sg, addr, size);
+   return virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
+}
+
+static void send_balloon_page_sg(struct virtio_balloon *vb,
+struct virtqueue *vq,
+void *addr,
+uint32_t size,
+bool batch)
+{
+   unsigned int len;
+   int err;
+
+   err = add_one_sg(vq, addr, size);
+   /* Sanity check: this can't really happen */
+   WARN_ON(err);
+
+   /* If batching is in use, we batch the sgs till the vq is full. */
+   if (!batch || !vq->num_free) {
+   virtqueue_kick(vq);
+   wait_event(vb->acked, virtqueue_get_buf(vq, &len));
+   /* Release all the entries if there are */
+   while (virtqueue_get_buf(vq, &len))
+   ;
+   }
+}
+
+/*
+ * Send balloon pages in sgs to host. The balloon pages are recorded in the
+ * page xbitmap. Each bit in the bitmap corresponds to a page of PAGE_SIZE.
+ * The page xbitmap is searched for continuous "1" bits, which correspond
+ * to continuous pages, to chunk into sgs.
+ *
+ * @page_xb_start and @page_xb_end form the range of bits in the xbitmap that
+ * need to be searched.
+ */
+static void tell_host_sgs(struct virtio_balloon *vb,
+ struct virtqueue *vq,
+ unsigned long page_xb_start,
+ unsigned long page_xb_end)
+{
+   unsigned long sg_pfn_start, sg_pfn_end;
+   void *sg_addr;
+   uint32_t sg_len, sg_max_len = round_down(UINT_MAX, PAGE_SIZE);
+
+   sg_pfn_start = page_xb_start;
+   while (sg_pfn_start < page_xb_end) {
+   sg_pfn_start = xb_find_next_bit(&vb->page_xb, sg_pfn_start,
+   page_xb_end, 1);
+   if (sg_pfn_start == page_xb_end + 1)
+   break;
+   sg_pfn_end = xb_find_next_bit(&vb->page_xb, sg_pfn_start + 1,
+ page_xb_end, 0);
+   sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
+   sg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;
+   while (sg_len > sg_max_len) {
+   send_balloon_page_sg(vb, vq, sg_addr, sg_max_len, 1);
+   sg_addr += sg_max_len;
+   sg_len -= sg_max_len;
+   }
+   send_balloon_page_sg(vb, vq, sg_addr, sg_len, 1);
+   xb_zero(&vb->page_xb, sg_pfn_start, sg_pfn_end);
+   sg

[Qemu-devel] [PATCH v15 5/5] virtio-balloon: VIRTIO_BALLOON_F_CTRL_VQ

2017-08-28 Thread Wei Wang
Add a new vq, ctrl_vq, to handle commands between the host and guest.
With this feature, we will be able to have the control plane and data
plane separated. In other words, the control related data of each
feature will be sent via the ctrl_vq cmds, meanwhile each feature may
have its own data plane vq.

Free page report is the the first new feature controlled via ctrl_vq,
and a new cmd class, VIRTIO_BALLOON_CTRLQ_CLASS_FREE_PAGE, is added.
Currently, this feature has two cmds:
VIRTIO_BALLOON_FREE_PAGE_F_START: This cmd is sent from host to guest
to start the free page reporting work.
VIRTIO_BALLOON_FREE_PAGE_F_STOP: This cmd is used bidirectionally. The
guest would send the cmd to the host to indicate the reporting work is
done. The host would send the cmd to the guest to actively request the
stop of the reporting work.

The free_page_vq is used to transmit the guest free page blocks to the
host.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
---
 drivers/virtio/virtio_balloon.c | 247 +---
 include/uapi/linux/virtio_balloon.h |  15 +++
 2 files changed, 242 insertions(+), 20 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8ecc1d4..1d384a4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -55,7 +55,13 @@ static struct vfsmount *balloon_mnt;
 
 struct virtio_balloon {
struct virtio_device *vdev;
-   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *ctrl_vq,
+*free_page_vq;
+
+   /* Balloon's own wq for cpu-intensive work items */
+   struct workqueue_struct *balloon_wq;
+   /* The work items submitted to the balloon wq are listed here */
+   struct work_struct report_free_page_work;
 
/* The balloon servicing is delegated to a freezable workqueue. */
struct work_struct update_balloon_stats_work;
@@ -65,6 +71,9 @@ struct virtio_balloon {
spinlock_t stop_update_lock;
bool stop_update;
 
+   /* Stop reporting free pages */
+   bool report_free_page_stop;
+
/* Waiting for host to ack the pages we released. */
wait_queue_head_t acked;
 
@@ -93,6 +102,11 @@ struct virtio_balloon {
 
/* To register callback in oom notifier call chain */
struct notifier_block nb;
+
+   /* Host to guest ctrlq cmd buf for free page report */
+   struct virtio_balloon_ctrlq_cmd free_page_cmd_in;
+   /* Guest to Host ctrlq cmd buf for free page report */
+   struct virtio_balloon_ctrlq_cmd free_page_cmd_out;
 };
 
 static struct virtio_device_id id_table[] = {
@@ -177,6 +191,26 @@ static void send_balloon_page_sg(struct virtio_balloon *vb,
}
 }
 
+static void send_free_page_sg(struct virtqueue *vq, void *addr, uint32_t size)
+{
+   unsigned int len;
+   int err = -ENOSPC;
+
+   do {
+   if (vq->num_free) {
+   err = add_one_sg(vq, addr, size);
+   /* Sanity check: this can't really happen */
+   WARN_ON(err);
+   if (!err)
+   virtqueue_kick(vq);
+   }
+
+   /* Release entries if there are */
+   while (virtqueue_get_buf(vq, &len))
+   ;
+   } while (err == -ENOSPC && vq->num_free);
+}
+
 /*
  * Send balloon pages in sgs to host. The balloon pages are recorded in the
  * page xbitmap. Each bit in the bitmap corresponds to a page of PAGE_SIZE.
@@ -525,42 +559,206 @@ static void update_balloon_size_func(struct work_struct 
*work)
queue_work(system_freezable_wq, work);
 }
 
-static int init_vqs(struct virtio_balloon *vb)
+static bool virtio_balloon_send_free_pages(void *opaque, unsigned long pfn,
+  unsigned long nr_pages)
+{
+   struct virtio_balloon *vb = (struct virtio_balloon *)opaque;
+   void *addr = (void *)pfn_to_kaddr(pfn);
+   uint32_t len = nr_pages << PAGE_SHIFT;
+
+   if (vb->report_free_page_stop)
+   return 1;
+
+   send_free_page_sg(vb->free_page_vq, addr, len);
+
+   return 0;
+}
+
+static void ctrlq_add_cmd(struct virtqueue *vq,
+ struct virtio_balloon_ctrlq_cmd *cmd,
+ bool inbuf)
 {
-   struct virtqueue *vqs[3];
-   vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
};
-   static const char * const names[] = { "inflate", "deflate", "stats" };
-   int err, nvqs;
+   struct scatterlist sg;
+   int err;
+
+   sg_init_one(&sg, cmd, sizeof(struct virtio_balloon_ctrlq_cmd));
+   if (inbuf)
+   err = virtqueue_add_inbuf(vq, &sg, 1, cmd, GFP_KERNEL);
+   else
+   err = virtqueue_add_outbuf(vq, &sg, 1, cmd, GFP_KERNEL);
+
+   /* Sanity check: this can't really happen */
+   WARN_O

[Qemu-devel] [PATCH v15 4/5] mm: support reporting free page blocks

2017-08-28 Thread Wei Wang
This patch adds support to walk through the free page blocks in the
system and report them via a callback function. Some page blocks may
leave the free list after zone->lock is released, so it is the caller's
responsibility to either detect or prevent the use of such pages.

One use example of this patch is to accelerate live migration by skipping
the transfer of free pages reported from the guest. A popular method used
by the hypervisor to track which part of memory is written during live
migration is to write-protect all the guest memory. So, those pages that
are reported as free pages but are written after the report function
returns will be captured by the hypervisor, and they will be added to the
next round of memory transfer.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michal Hocko 
Cc: Michael S. Tsirkin 
---
 include/linux/mm.h |  5 +
 mm/page_alloc.c| 65 ++
 2 files changed, 70 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5..3c4267d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1835,6 +1835,11 @@ extern void free_area_init_node(int nid, unsigned long * 
zones_size,
unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
 
+extern void walk_free_mem_block(void *opaque,
+   int min_order,
+   bool (*report_page_block)(void *, unsigned long,
+ unsigned long));
+
 /*
  * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
  * into the buddy system. The freed pages will be poisoned with pattern
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d00f74..81eedc7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4762,6 +4762,71 @@ void show_free_areas(unsigned int filter, nodemask_t 
*nodemask)
show_swap_cache_info();
 }
 
+/**
+ * walk_free_mem_block - Walk through the free page blocks in the system
+ * @opaque: the context passed from the caller
+ * @min_order: the minimum order of free lists to check
+ * @report_page_block: the callback function to report free page blocks
+ *
+ * If the callback returns 1, stop iterating the list of free page blocks.
+ * Otherwise, continue to report.
+ *
+ * Please note that there are no locking guarantees for the callback and
+ * that the reported pfn range might be freed or disappear after the
+ * callback returns so the caller has to be very careful how it is used.
+ *
+ * The callback itself must not sleep or perform any operations which would
+ * require any memory allocations directly (not even GFP_NOWAIT/GFP_ATOMIC)
+ * or via any lock dependency. It is generally advisable to implement
+ * the callback as simple as possible and defer any heavy lifting to a
+ * different context.
+ *
+ * There is no guarantee that each free range will be reported only once
+ * during one walk_free_mem_block invocation.
+ *
+ * pfn_to_page on the given range is strongly discouraged and if there is
+ * an absolute need for that make sure to contact MM people to discuss
+ * potential problems.
+ *
+ * The function itself might sleep so it cannot be called from atomic
+ * contexts.
+ *
+ * In general low orders tend to be very volatile and so it makes more
+ * sense to query larger ones first for various optimizations which like
+ * ballooning etc... This will reduce the overhead as well.
+ */
+void walk_free_mem_block(void *opaque,
+int min_order,
+bool (*report_page_block)(void *, unsigned long,
+  unsigned long))
+{
+   struct zone *zone;
+   struct page *page;
+   struct list_head *list;
+   int order;
+   enum migratetype mt;
+   unsigned long pfn, flags;
+   bool stop = 0;
+
+   for_each_populated_zone(zone) {
+   for (order = MAX_ORDER - 1; order >= min_order; order--) {
+   for (mt = 0; !stop && mt < MIGRATE_TYPES; mt++) {
+   spin_lock_irqsave(&zone->lock, flags);
+   list = &zone->free_area[order].free_list[mt];
+   list_for_each_entry(page, list, lru) {
+   pfn = page_to_pfn(page);
+   stop = report_page_block(opaque, pfn,
+1 << order);
+   if (stop)
+   break;
+   }
+   spin_unlock_irqrestore(&zone->lock, flags);
+   }
+   }
+   }
+}
+EXPORT_SYMBOL_GPL(walk_free_mem_block);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
zoneref->zone = zone;
-- 
2.7.4




Re: [Qemu-devel] [PATCH] vga: stop passing pointers to vga_draw_line* functions

2017-08-28 Thread P J P
+-- On Fri, 25 Aug 2017, Gerd Hoffmann wrote --+
| > Do we have the actual number?
| Not yet, pjp still busy getting one, but will be filled for the final
| version of the patch.

CVE-2017-13672.
 -> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg04684.html


CVE-2017-13673.
 -> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg04685.html

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F



[Qemu-devel] [Bug 1713434] Re: prom-env-test test aborted and core dumped

2017-08-28 Thread R.Nageswara Sastry
The actual failure was the following

  LINKtests/test-hmp
  GTESTER check-qtest-ppc64
**
ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature 
== MAGIC): (0x7c7f1b78 == 0xcafec0de)
GTester: last random seed: R02Sfb567618f7c703a032934c0c11e263c6
make: *** [check-qtest-ppc64] Error 1

But I was going through found the above was aborting. so reported here.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  In Progress

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



[Qemu-devel] [PATCH] tests/libqtest: Use a proper error message if QTEST_QEMU_BINARY is missing

2017-08-28 Thread Thomas Huth
The user can currently still cause an abort() if running certain tests
(like the prom-env-test) without setting the QTEST_QEMU_BINARY first.
A similar problem has been fixed with commit 7c933ad61b8f3f51337
already, but forgot to also take care of the qtest_get_arch() function,
so let's introduce a proper wrapper around getenv("QTEST_QEMU_BINARY")
that can be used in both locations now.

Buglink: https://bugs.launchpad.net/qemu/+bug/1713434
Signed-off-by: Thomas Huth 
---
 tests/libqtest.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index b9a1f18..342a77b 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -150,6 +150,19 @@ void qtest_add_abrt_handler(GHookFunc fn, const void *data)
 g_hook_prepend(&abrt_hooks, hook);
 }
 
+static const char *qtest_qemu_binary(void)
+{
+const char *qemu_bin;
+
+qemu_bin = getenv("QTEST_QEMU_BINARY");
+if (!qemu_bin) {
+fprintf(stderr, "Environment variable QTEST_QEMU_BINARY required\n");
+exit(1);
+}
+
+return qemu_bin;
+}
+
 QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
 {
 QTestState *s;
@@ -157,13 +170,7 @@ QTestState *qtest_init_without_qmp_handshake(const char 
*extra_args)
 gchar *socket_path;
 gchar *qmp_socket_path;
 gchar *command;
-const char *qemu_binary;
-
-qemu_binary = getenv("QTEST_QEMU_BINARY");
-if (!qemu_binary) {
-fprintf(stderr, "Environment variable QTEST_QEMU_BINARY required\n");
-exit(1);
-}
+const char *qemu_binary = qtest_qemu_binary();
 
 s = g_malloc(sizeof(*s));
 
@@ -624,8 +631,7 @@ char *qtest_hmp(QTestState *s, const char *fmt, ...)
 
 const char *qtest_get_arch(void)
 {
-const char *qemu = getenv("QTEST_QEMU_BINARY");
-g_assert(qemu != NULL);
+const char *qemu = qtest_qemu_binary();
 const char *end = strrchr(qemu, '/');
 
 return end + strlen("/qemu-system-");
-- 
1.8.3.1




[Qemu-devel] [Bug 1713434] Re: prom-env-test test aborted and core dumped

2017-08-28 Thread Thomas Huth
The "ERROR:tests/prom-env-test.c:42:check_guest_memory" error is a
timeout error... is it reproducible? Was the host you're testing on very
loaded at that point in time?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  In Progress

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



Re: [Qemu-devel] [PATCH v2 01/54] qapi: fix type_seen key error

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> On Fri, Aug 25, 2017 at 2:57 PM Eduardo Habkost  wrote:
>
>> On Fri, Aug 25, 2017 at 08:02:26AM +0200, Markus Armbruster wrote:
>> > Conflicts with Eduardo's "[PATCH v2] qapi: Fix error handling code on
>> > alternate conflict".
>> > Message-Id: <20170717180926.14924-1-ehabk...@redhat.com>
>> >
>> > Marc-André, could you have a look?  You can rebase your fix on top of
>> > Eduardo's, or merge the two into one commit.
>>
>> Both patches seem to be fixing the same bug.
>>
>
> Yes, it is solved differently. But take Eduardo's patch, it has more
> extensive tests .

Okay.

Too bad I didn't remember Eduardo's patch when I got yours.



Re: [Qemu-devel] [PATCH 0/3] iotests: clean up resources using context managers

2017-08-28 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Fri, Aug 25, 2017 at 03:32:29PM +0800, Fam Zheng wrote:
>> On Thu, 08/24 19:04, Stefan Hajnoczi wrote:
>> > On Thu, Aug 24, 2017 at 04:38:43PM +0800, Fam Zheng wrote:
>> > > On Thu, 08/24 08:21, Stefan Hajnoczi wrote:
>> > > > Tests should declare resources upfront in a with statement.  Resources 
>> > > > are
>> > > > automatically cleaned up whether the test passes or fails:
>> > > > 
>> > > >   with FilePath('test.img') as img_path,
>> > > >VM() as vm:
>> > > >   ...test...
>> > > >   # img_path is deleted and vm is shut down automatically
>> > > 
>> > > Looks good but still requires test writers to learn and remember to use 
>> > > FilePath
>> > > and with.
>> > 
>> > You cannot forget to use FilePath() unless you love typing at
>> > os.path.join(iotests.test_dir, 'test.img').  It's much better than open
>> > coding filename generation!
>> > 
>> > > These are still boilerplates.  Here goes my personal oppinion, so may
>> > > not be plausible:
>> > > 
>> > > - For VM() maybe add an atexit in the launch() method also makes sure 
>> > > the VM is
>> > >   eventually terminated.
>> > > 
>> > >   This means vm.shutdown() is still needed in tearDown() if there are 
>> > > multiple
>> > >   test methods and each of them expects a clean state, but that is 
>> > > probably
>> > >   still less typing (and also indenting) than the with approach, and 
>> > > also easy
>> > >   to remember (otherwise a test will fail).
>> > 
>> > I looked into atexit before going this route.  atexit does not have an
>> > unregister() API in Python 2.  This makes it ugly to use because some
>> > tests do not want the resource to remain for the duration of the
>> > process.
>> > 
>> > A related point is that the Python objects used by atexit handlers live
>> > until the end of the process.  They cannot be garbage collected because
>> > the atexit handler still has a reference to them.
>> 
>> I think this shortcoming can be solved with a clean up list ("all problems in
>> computer science can be solved by another level of indirection"):
>> 
>> _clean_up_list = set()
>> def _clean_up_handler():
>> for i in _clean_up_list:
>> try:
>> i()
>> except:
>> pass
>> 
>> atexit.register(_clean_up_handler)
>> 
>> class VM(...):
>> 
>> def launch():
>> ...
>> _clean_up_list.add(self.launch)
>> 
>> def shutdown():
>> _clean_up_list.remove(self.launch)
>> ...
>
> atexit is still less powerful than context managers because its scope is
> fixed.  Handler functions are only called when the process terminates.
> Many test cases do not want resources (especially the VMs) around
> forever because they run several iterations or sub-tests.
>
> The with statement can be used both for process-lifetime and for more
> fine-grained scoping.  That's why I chose it.

Moreover, context managers are Pythonic.

> If you stick to atexit then sub-tests or iterations require manual
> vm.shutdown() - something that is not necessary using the with
> statement.
>
> Stefan



[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread R.Nageswara Sastry
** Patch added: "0001-cpu-don-t-allow-negative-max_cpus.patch"
   
https://bugs.launchpad.net/qemu/+bug/1713408/+attachment/4940056/+files/0001-cpu-don-t-allow-negative-max_cpus.patch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert 
> wrote:
>
>> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
>> > Hi
>> >
>> > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu  wrote:
>> >
>> > > Firstly, introduce Monitor.use_thread, and set it for monitors that are
>> > > using non-mux typed backend chardev.  We only do this for monitors, so
>> > > mux-typed chardevs are not suitable (when it connects to, e.g., serials
>> > > and the monitor together).
>> > >
>> > > When use_thread is set, we create standalone thread to poll the monitor
>> > > events, isolated from the main loop thread.  Here we still need to take
>> > > the BQL before dispatching the tasks since some of the monitor commands
>> > > are not allowed to execute without the protection of BQL.  Then this
>> > > gives us the chance to avoid taking the BQL for some monitor commands in
>> > > the future.
>> > >
>> > > * Why this change?
>> > >
>> > > We need these per-monitor threads to make sure we can have at least one
>> > > monitor that will never stuck (that can receive further monitor
>> > > commands).
>> > >
>> > > * So when will monitors stuck?  And, how do they stuck?
>> > >
>> > > After we have postcopy and remote page faults, it's simple to achieve a
>> > > stuck in the monitor (which is also a stuck in main loop thread):
>> > >
>> > > (1) Monitor deadlock on BQL
>> > >
>> > > As we may know, when postcopy is running on destination VM, the vcpu
>> > > threads can stuck merely any time as long as it tries to access an
>> > > uncopied guest page.  Meanwhile, when the stuck happens, it is possible
>> > > that the vcpu thread is holding the BQL.  If the page fault is not
>> > > handled quickly, you'll find that monitors stop working, which is trying
>> > > to take the BQL.
>> > >
>> > > If the page fault cannot be handled correctly (one case is a paused
>> > > postcopy, when network is temporarily down), monitors will hang
>> > > forever.  Without current patch, that means the main loop hanged. We'll
>> > > never find a way to talk to VM again.
>> > >
>> >
>> > Could the BQL be pushed down to the monitor commands level instead? That
>> > way we wouldn't need a seperate thread to solve the hang on commands that
>> > do not need BQL.
>>
>> If the main thread is stuck though I don't see how that helps you; you
>> have to be able to run these commands on another thread.
>>
>
> Why would the main thread be stuck? In (1) If the vcpu thread takes the BQL
> and the command doesn't need it, it would work.  In (2),  info cpus
> shouldn't keep the BQL (my qapi-async series would probably help here)

This has been discussed several times[*], but of course not with
everybody, so I'll summarize once more: asynchronous commands are not a
actually *required* for anything.  They are *one* way to package the
"kick off task, receive an asynchronous message when it's done" pattern.
Another way is synchronous command for the kick off, event for the
"done".

For better or worse, synchronous command + event is what we have today.
Whether adding another way to package the the same thing improves the
QMP interface is doubtful.


[*] Try this one:
Message-ID: <87o9yv890z@dusky.pond.sub.org>
https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg05483.html



Re: [Qemu-devel] Qemu 2.10 rc4 build issue on BE (luigi burdo)

2017-08-28 Thread luigi burdo
Ok igor,

i will download and build again and report

Luigi



Da: Igor Mammedov 
Inviato: lunedì 28 agosto 2017 11:32
A: luigi burdo
Cc: Programmingkid; qemu-...@nongnu.org; qemu-devel@nongnu.org qemu-devel
Oggetto: Re: [Qemu-devel] Qemu 2.10 rc4 build issue on BE (luigi burdo)

On Mon, 28 Aug 2017 09:14:48 +
luigi burdo  wrote:

> Hi Igor,
>
>
> below errors look like corrupted source files
>
> >
>
>
> i ust dowloaded the compressed file from qemu website,unzipped with tar from 
> console (like i usually did)  and run configure and make.
>
> can be the file of rc4 from qemu website not right and need a repack?
I've just downloaded it from 
http://download.qemu-project.org/qemu-2.10.0-rc4.tar.xz
and unpacked, it works for me (i.e. hw/i386/pc.c isn't corrupted).

>
> Luigi
>



Re: [Qemu-devel] [Qemu-ppc] Qemu 2.10 rc4 build issue on BE

2017-08-28 Thread luigi burdo
Hi Thomas,

i will check again and report.

Luigi



In your log there is:


But if you have a look at the freshly unpacked sources, that line
clearly reads "goto out;" and not "gCto out;" ... so it looks like
something messed up your sources very badly.
Can you reproduce this problem when starting again from scratch?

 Thomas


Re: [Qemu-devel] [PATCH 04/14] qlit: remove needless type cast

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> Hi
>
> On Fri, Aug 25, 2017 at 8:20 AM Markus Armbruster  wrote:
>
>> Marc-André Lureau  writes:
>>
>> > And misc code style fix.
>> >
>> > Signed-off-by: Marc-André Lureau 
>> > ---
>> >  include/qapi/qmp/qlit.h | 8 
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/include/qapi/qmp/qlit.h b/include/qapi/qmp/qlit.h
>> > index f36bca4554..1e9696988a 100644
>> > --- a/include/qapi/qmp/qlit.h
>> > +++ b/include/qapi/qmp/qlit.h
>> > @@ -36,13 +36,13 @@ struct QLitDictEntry {
>> >  };
>> >
>> >  #define QLIT_QNUM(val) \
>> > -(QLitObject){.type = QTYPE_QNUM, .value.qnum = (val)}
>> > +{ .type = QTYPE_QNUM, .value.qnum = (val) }
>> >  #define QLIT_QSTR(val) \
>> > -(QLitObject){.type = QTYPE_QSTRING, .value.qstr = (val)}
>> > +{ .type = QTYPE_QSTRING, .value.qstr = (val) }
>> >  #define QLIT_QDICT(val) \
>> > -(QLitObject){.type = QTYPE_QDICT, .value.qdict = (val)}
>> > +{ .type = QTYPE_QDICT, .value.qdict = (val) }
>> >  #define QLIT_QLIST(val) \
>> > -(QLitObject){.type = QTYPE_QLIST, .value.qlist = (val)}
>> > +{ .type = QTYPE_QLIST, .value.qlist = (val) }
>> >
>> >  int compare_litqobj_to_qobj(QLitObject *lhs, QObject *rhs);
>>
>> Technically, this isn't a type cast, it's a compound literal.  A
>> compound literal is "an unnamed object whose value is given by the
>> initializer list" (C99 §6.5.2.5).  The standard then cautions "that this
>> differs from a cast expression" (ibid).
>
> You are right, but actually the patch helps with another issue if we keep
> the compound type:
>
> const QLitObject qmp_schema_qlit = QLIT_QLIST(((QLitObject[]) {
> QLIT_QNULL,
> {}
> }));
>
> qmp-introspect.c:17:63: error: initializer element is not constant
>  const QLitObject qmp_schema_qlit = QLIT_QLIST(((QLitObject[]) {
>^
> /home/elmarco/src/qemu/include/qapi/qmp/qlit.h:50:57: note: in definition
> of macro ‘QLIT_QLIST’
>  (QLitObject) { .type = QTYPE_QLIST, .value.qlist = (val) }
>  ^~~
> qmp-introspect.c:17:63: note: (near initialization for ‘(anonymous).value’)
>  const QLitObject qmp_schema_qlit = QLIT_QLIST(((QLitObject[]) {
>^
> /home/elmarco/src/qemu/include/qapi/qmp/qlit.h:50:57: note: in definition
> of macro ‘QLIT_QLIST’
>  (QLitObject) { .type = QTYPE_QLIST, .value.qlist = (val) }
>  ^~~
> Any idea how to fix that?

This one:

[...]
>> The original author's choice of compound literals is clearly
>> intentional.  I'd prefer to respect it unless there's a compelling
>> reason not to.

Rewrite your commit message so it explains your *actual* reason for
converting the macro from compound literal to initializer.



Re: [Qemu-devel] [PATCH v2 06/16] qapi-schema: Collect char device stuff in qapi/char.json

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> On Thu, Aug 24, 2017 at 9:24 PM Markus Armbruster  wrote:
>
>> Cc: Paolo Bonzini 
>> Cc: Marc-André Lureau 
>> Signed-off-by: Markus Armbruster 
>>
>
> Reviewed-by: Marc-André Lureau 
[...]
>> diff --git a/qapi/event.json b/qapi/event.json
>> index 9c6126d..b9aa6ed 100644
>> --- a/qapi/event.json
>> +++ b/qapi/event.json
>> @@ -397,27 +397,6 @@
>>  'sector-num': 'int', 'sectors-count': 'int' } }
>>
>>  ##
>> -# @VSERPORT_CHANGE:
>> -#
>> -# Emitted when the guest opens or closes a virtio-serial port.
>> -#
>> -# @id: device identifier of the virtio-serial port
>> -#
>> -# @open: true if the guest has opened the virtio-serial port
>> -#
>> -# Since: 2.1
>> -#
>> -# Example:
>> -#
>> -# <- { "event": "VSERPORT_CHANGE",
>> -#  "data": { "id": "channel0", "open": true },
>> -#  "timestamp": { "seconds": 1401385907, "microseconds": 422329 } }
>> -#
>> -##
>> -{ 'event': 'VSERPORT_CHANGE',
>> -  'data': { 'id': 'str', 'open': 'bool' } }
>>
>
> That one is a bit special (since it's for virtio-serial), but it is
> char-related stuff, ok.

It made me change title from "backend stuff" to "device stuff" :)

Thanks!

>> -##
>>  # @MEM_UNPLUG_ERROR:
>>  #
>>  # Emitted when memory hot unplug error occurs.
>> --
>> 2.7.5



Re: [Qemu-devel] [PATCH v2 09/16] qapi-schema: Collect migration stuff in qapi/migration.json

2017-08-28 Thread Markus Armbruster
"Dr. David Alan Gilbert"  writes:

> * Markus Armbruster (arm...@redhat.com) wrote:
>> Cc: Juan Quintela 
>> Cc: Dr. David Alan Gilbert 
>> Signed-off-by: Markus Armbruster 
>
>
> Two thoughts:
>   a) Do you actually want that as migration/migration.json?

I'd prefer to keep the QAPI schema together.  But that could be my
schema maintainer bias talking :)

>   b) I'd prefer StrOrNull to be somewhere more central; Migration may be
>   the only user, but it's not logically migration specific.

Makes sense.  I'll move it to common.json.

> Reviewed-by: Dr. David Alan Gilbert 

Thanks!



Re: [Qemu-devel] [PATCH 00/47] add missing entries in MAINTAINERS

2017-08-28 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Hi,
>
> I first prepared this series thinking about 2.10 but then realized if someone
> is calling ./scripts/get_maintainer.pl he better is using an updated git
> clone :)
>
> Distribs do provide docs/ files in /usr/share/doc/qemu* but not the
> MAINTAINERS file so no hurry for this series.
>
> I provided the script I used in the last patch, I plan to improve it so
> patchew/travis can start to use it: if a new file is added to the repository
> without being covered in MAINTAINERS the script remember the submitter about
> it and eventually he'll resend his patch, I think this is a good practice and
> help the next contributors (anyway this is a valuable experience to look at a
> file history to see who has been hacking around when there is no MAINTAINERS
> entry).
>
> I don't want to overburden current maintainers, so the last patches add
> Orphans entries but I tried to select relevant commiters and include them in
> each patch.
>
> The criteria I used is "If I change this file, who should I notify? Who would
> be interested?"

Where do we stand with this series?



Re: [Qemu-devel] [PATCH v2 11/16] qapi-schema: Collect TPM stuff in qapi/tpm.json

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> On Thu, Aug 24, 2017 at 9:23 PM Markus Armbruster  wrote:
>
>> Sadly, we don't have a TPM maintainer, not even a MAINTAINERS entry.
>> Create one, and mark it orphaned.
>>
>>
> This is also proposed in:
> http://patchew.org/QEMU/20170728053610.15770-1-f4...@amsat.org/

Forgot about that one.

My patch additionally has "F: include/hw/acpi/tpm.h", and of course the
new "qapi/tpm.json", and it's not marked "RFC".

> Stefan Berger, any chance you declare yourself a TPM maintainer?

Please?

> Signed-off-by: Markus Armbruster 
>
> Reviewed-by: Marc-André Lureau 

Thanks!



[Qemu-devel] [PATCH] qemu-iotests: Extend non-shared storage migration test (194)

2017-08-28 Thread Kashyap Chamarthy
This is the follow-up patch that was discussed[*] as part of feedback to
qemu-iotest 194.

Changes in this patch:

  - Supply 'job-id' parameter to `drive-mirror` invocation.

  - Issue `block-job-cancel` command on the source QEMU to gracefully
complete the mirroring operation.

  - Stop the NBD server on the destination QEMU.

  - Finally, exit once the event BLOCK_JOB_COMPLETED is emitted.

With the above, the test will also be (almost) in sync with the
procedure outlined in the document live-block-operations.rst[+]
(section: "QMP invocation for live storage migration with
``drive-mirror`` + NBD").

[*] https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg04820.html
-- qemu-iotests: add 194 non-shared storage migration test
[+] 
https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst

Signed-off-by: Kashyap Chamarthy 
---
I wonder:
  - Is it worth printing the MIGRATION event state change?
  - Since we're not checking on the MIGRATION event anymore, can
the migration state change events related code (that is triggerred
by setting 'migrate-set-capabilities') be simply removed?
---
 tests/qemu-iotests/194 | 17 -
 tests/qemu-iotests/194.out | 14 --
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 
8028111e21bed5cf4a2e8e32dc04aa5a9ea9caca..8d746be9d0033f478f11886ee93f95b0fa55bab0
 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -46,16 +46,17 @@ iotests.log('Launching NBD server on destination...')
 iotests.log(dest_vm.qmp('nbd-server-start', addr={'type': 'unix', 'data': 
{'path': nbd_sock_path}}))
 iotests.log(dest_vm.qmp('nbd-server-add', device='drive0', writable=True))
 
-iotests.log('Starting drive-mirror on source...')
+iotests.log('Starting `drive-mirror` on source...')
 iotests.log(source_vm.qmp(
   'drive-mirror',
   device='drive0',
   target='nbd+unix:///drive0?socket={0}'.format(nbd_sock_path),
   sync='full',
   format='raw', # always raw, the server handles the format
-  mode='existing'))
+  mode='existing',
+  job_id='mirror-job0'))
 
-iotests.log('Waiting for drive-mirror to complete...')
+iotests.log('Waiting for `drive-mirror` to complete...')
 iotests.log(source_vm.event_wait('BLOCK_JOB_READY'),
 filters=[iotests.filter_qmp_event])
 
@@ -66,8 +67,14 @@ dest_vm.qmp('migrate-set-capabilities',
 capabilities=[{'capability': 'events', 'state': True}])
 iotests.log(source_vm.qmp('migrate', 
uri='unix:{0}'.format(migration_sock_path)))
 
+iotests.log('Gracefully ending the `drive-mirror` job on source...')
+iotests.log(source_vm.qmp('block-job-cancel', device='mirror-job0'))
+
+iotests.log('Stopping the NBD server on destination...')
+iotests.log(dest_vm.qmp('nbd-server-stop'))
+
 while True:
-event = source_vm.event_wait('MIGRATION')
+event = source_vm.event_wait('BLOCK_JOB_COMPLETED')
 iotests.log(event, filters=[iotests.filter_qmp_event])
-if event['data']['status'] in ('completed', 'failed'):
+if event['event'] == 'BLOCK_JOB_COMPLETED':
 break
diff --git a/tests/qemu-iotests/194.out b/tests/qemu-iotests/194.out
index 
ae501fecacb706b1851cb9063ce9c9d5a28bb7ea..3a0e3a26e5342b0e3f0373623efd9d2b3ee8d2be
 100644
--- a/tests/qemu-iotests/194.out
+++ b/tests/qemu-iotests/194.out
@@ -2,12 +2,14 @@ Launching VMs...
 Launching NBD server on destination...
 {u'return': {}}
 {u'return': {}}
-Starting drive-mirror on source...
+Starting `drive-mirror` on source...
 {u'return': {}}
-Waiting for drive-mirror to complete...
-{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'device': u'drive0', u'type': u'mirror', u'speed': 0, u'len': 1073741824, 
u'offset': 1073741824}, u'event': u'BLOCK_JOB_READY'}
+Waiting for `drive-mirror` to complete...
+{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'device': u'mirror-job0', u'type': u'mirror', u'speed': 0, u'len': 
1073741824, u'offset': 1073741824}, u'event': u'BLOCK_JOB_READY'}
 Starting migration...
 {u'return': {}}
-{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'status': u'setup'}, u'event': u'MIGRATION'}
-{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'status': u'active'}, u'event': u'MIGRATION'}
-{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'status': u'completed'}, u'event': u'MIGRATION'}
+Gracefully ending the `drive-mirror` job on source...
+{u'return': {}}
+Stopping the NBD server on destination...
+{u'return': {}}
+{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': 
{u'device': u'mirror-job0', u'type': u'mirror', u'speed': 0, u'len': 
1073741824, u'offset': 1073741824}, u'event': u'BLOCK_JOB_COMPLETED'}
-- 
2.9.5




Re: [Qemu-devel] [PATCH v2 15/16] qapi-schema: Move queries from common.json to qapi-schema.json

2017-08-28 Thread Markus Armbruster
Marc-André Lureau  writes:

> Hi
>
> On Thu, Aug 24, 2017 at 9:20 PM Markus Armbruster  wrote:
>
>> query-version and query-commands are in common.json for no good
>> reason.  Several similar queries are in qapi-schema.json.  Move them
>> there.
>
> I suppose it was initially meant to be shared with all QMP users, qga etc?

Possibly.

> That didn't happen, and is not planned either

Correct.

>> Signed-off-by: Markus Armbruster 
>
>
> Reviewed-by: Marc-André Lureau 

Thanks!



[Qemu-devel] [PATCH] tests: fix incorrect size_t format in benchmark-crypto

2017-08-28 Thread Philippe Mathieu-Daudé
  $ make check-speed
  tests/benchmark-crypto-hash.c: In function 'test_hash_speed':
  tests/benchmark-crypto-hash.c:44:5: error: format '%ld' expects argument of 
type 'long int', but argument 2 has type 'size_t' [-Werror=format=]
   g_print("Testing chunk_size %ld bytes ", chunk_size);
   ^
  tests/benchmark-crypto-hash.c: In function 'main':
  tests/benchmark-crypto-hash.c:62:9: error: format '%lu' expects argument of 
type 'long unsigned int', but argument 4 has type 'size_t' [-Werror=format=]
   snprintf(name, sizeof(name), "/crypto/hash/speed-%lu", i);
   ^
  cc1: all warnings being treated as errors
  rules.mak:66: recipe for target 'tests/benchmark-crypto-hash.o' failed
  make: *** [tests/benchmark-crypto-hash.o] Error 1

Signed-off-by: Philippe Mathieu-Daudé 
---
testing v2.10.0-rc4

 tests/benchmark-crypto-cipher.c | 4 ++--
 tests/benchmark-crypto-hash.c   | 4 ++--
 tests/benchmark-crypto-hmac.c   | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/benchmark-crypto-cipher.c b/tests/benchmark-crypto-cipher.c
index c6a40929e4..cf98443468 100644
--- a/tests/benchmark-crypto-cipher.c
+++ b/tests/benchmark-crypto-cipher.c
@@ -59,7 +59,7 @@ static void test_cipher_speed(const void *opaque)
 total /= 1024 * 1024; /* to MB */
 
 g_print("cbc(aes128): ");
-g_print("Testing chunk_size %ld bytes ", chunk_size);
+g_print("Testing chunk_size %zu bytes ", chunk_size);
 g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
 g_print("%.2f MB/sec\n", total / g_test_timer_last());
 
@@ -80,7 +80,7 @@ int main(int argc, char **argv)
 
 for (i = 512; i <= (64 * 1204); i *= 2) {
 memset(name, 0 , sizeof(name));
-snprintf(name, sizeof(name), "/crypto/cipher/speed-%lu", i);
+snprintf(name, sizeof(name), "/crypto/cipher/speed-%zu", i);
 g_test_add_data_func(name, (void *)i, test_cipher_speed);
 }
 
diff --git a/tests/benchmark-crypto-hash.c b/tests/benchmark-crypto-hash.c
index 6769d2a11b..122bfb6b85 100644
--- a/tests/benchmark-crypto-hash.c
+++ b/tests/benchmark-crypto-hash.c
@@ -41,7 +41,7 @@ static void test_hash_speed(const void *opaque)
 
 total /= 1024 * 1024; /* to MB */
 g_print("sha256: ");
-g_print("Testing chunk_size %ld bytes ", chunk_size);
+g_print("Testing chunk_size %zu bytes ", chunk_size);
 g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
 g_print("%.2f MB/sec\n", total / g_test_timer_last());
 
@@ -59,7 +59,7 @@ int main(int argc, char **argv)
 
 for (i = 512; i <= (64 * 1204); i *= 2) {
 memset(name, 0 , sizeof(name));
-snprintf(name, sizeof(name), "/crypto/hash/speed-%lu", i);
+snprintf(name, sizeof(name), "/crypto/hash/speed-%zu", i);
 g_test_add_data_func(name, (void *)i, test_hash_speed);
 }
 
diff --git a/tests/benchmark-crypto-hmac.c b/tests/benchmark-crypto-hmac.c
index 72408be987..c30250df3e 100644
--- a/tests/benchmark-crypto-hmac.c
+++ b/tests/benchmark-crypto-hmac.c
@@ -56,7 +56,7 @@ static void test_hmac_speed(const void *opaque)
 total /= 1024 * 1024; /* to MB */
 
 g_print("hmac(sha256): ");
-g_print("Testing chunk_size %ld bytes ", chunk_size);
+g_print("Testing chunk_size %zu bytes ", chunk_size);
 g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
 g_print("%.2f MB/sec\n", total / g_test_timer_last());
 
@@ -74,7 +74,7 @@ int main(int argc, char **argv)
 
 for (i = 512; i <= (64 * 1204); i *= 2) {
 memset(name, 0 , sizeof(name));
-snprintf(name, sizeof(name), "/crypto/hmac/speed-%lu", i);
+snprintf(name, sizeof(name), "/crypto/hmac/speed-%zu", i);
 g_test_add_data_func(name, (void *)i, test_hmac_speed);
 }
 
-- 
2.14.1




Re: [Qemu-devel] [PATCH v3 1/7] block: skip implicit nodes in snapshots, blockjobs

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 03:23:26 PM CEST, Manos Pitsidianakis wrote:

> +static inline BlockDriverState *child_bs(BlockDriverState *bs)
> +{
> +BlockDriverState *backing = backing_bs(bs);
> +BlockDriverState *file = file_bs(bs);
> +assert(!(file && backing));
> +return backing ?: file;
> +}

I'm still not sure how useful it is to have a separate function for
this, instead of putting it inside bdrv_get_first_explicit(), but anyway

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v3 3/7] block: require job-id when device is a node name

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 03:23:28 PM CEST, Manos Pitsidianakis wrote:
> With implicit filter nodes on the top of the graph it is not possible
> to generate job-ids with the name of the device in block_job_create()
> anymore, since the job's bs will not be a child_root.

Reviewed-by: Alberto Garcia 

Berto



[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread Thomas Huth
Please don't add patches to the bug tracker, post them to the qemu-devel
(and qemu-ppc in this case) mailing list instead. You even don't have to
join the mailing lists if you don't like to, posting to them is allowed
for everybody. See https://www.qemu.org/contribute/report-a-bug/ and
http://wiki.qemu.org/Contribute/SubmitAPatch for details. Thanks!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread Thomas Huth
Looking at your patch, I think you should also check for "<= 0" instead
of just "< 0" ... since maxcpus = 0 also does not make much sense.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll

2017-08-28 Thread Marc-André Lureau
Hi

On Mon, Aug 28, 2017 at 1:08 PM Markus Armbruster  wrote:

> Marc-André Lureau  writes:
>
> > On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert <
> dgilb...@redhat.com>
> > wrote:
> >
> >> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> >> > Hi
> >> >
> >> > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu  wrote:
> >> >
> >> > > Firstly, introduce Monitor.use_thread, and set it for monitors that
> are
> >> > > using non-mux typed backend chardev.  We only do this for monitors,
> so
> >> > > mux-typed chardevs are not suitable (when it connects to, e.g.,
> serials
> >> > > and the monitor together).
> >> > >
> >> > > When use_thread is set, we create standalone thread to poll the
> monitor
> >> > > events, isolated from the main loop thread.  Here we still need to
> take
> >> > > the BQL before dispatching the tasks since some of the monitor
> commands
> >> > > are not allowed to execute without the protection of BQL.  Then this
> >> > > gives us the chance to avoid taking the BQL for some monitor
> commands in
> >> > > the future.
> >> > >
> >> > > * Why this change?
> >> > >
> >> > > We need these per-monitor threads to make sure we can have at least
> one
> >> > > monitor that will never stuck (that can receive further monitor
> >> > > commands).
> >> > >
> >> > > * So when will monitors stuck?  And, how do they stuck?
> >> > >
> >> > > After we have postcopy and remote page faults, it's simple to
> achieve a
> >> > > stuck in the monitor (which is also a stuck in main loop thread):
> >> > >
> >> > > (1) Monitor deadlock on BQL
> >> > >
> >> > > As we may know, when postcopy is running on destination VM, the vcpu
> >> > > threads can stuck merely any time as long as it tries to access an
> >> > > uncopied guest page.  Meanwhile, when the stuck happens, it is
> possible
> >> > > that the vcpu thread is holding the BQL.  If the page fault is not
> >> > > handled quickly, you'll find that monitors stop working, which is
> trying
> >> > > to take the BQL.
> >> > >
> >> > > If the page fault cannot be handled correctly (one case is a paused
> >> > > postcopy, when network is temporarily down), monitors will hang
> >> > > forever.  Without current patch, that means the main loop hanged.
> We'll
> >> > > never find a way to talk to VM again.
> >> > >
> >> >
> >> > Could the BQL be pushed down to the monitor commands level instead?
> That
> >> > way we wouldn't need a seperate thread to solve the hang on commands
> that
> >> > do not need BQL.
> >>
> >> If the main thread is stuck though I don't see how that helps you; you
> >> have to be able to run these commands on another thread.
> >>
> >
> > Why would the main thread be stuck? In (1) If the vcpu thread takes the
> BQL
> > and the command doesn't need it, it would work.  In (2),  info cpus
> > shouldn't keep the BQL (my qapi-async series would probably help here)
>
> This has been discussed several times[*], but of course not with
> everybody, so I'll summarize once more: asynchronous commands are not a
> actually *required* for anything.  They are *one* way to package the
> "kick off task, receive an asynchronous message when it's done" pattern.
> Another way is synchronous command for the kick off, event for the
> "done".
>

But you would have to break or duplicate the QMP APIs. My proposal doesn't
need that, a command can reenter the main loop, and keep current QMP API.


> For better or worse, synchronous command + event is what we have today.
> Whether adding another way to package the the same thing improves the
> QMP interface is doubtful.
>

I would argue my series is mostly about internal refactoring for the
benefit mentionned above. The fact that you can do (optionnaly) concurrent
QMP commands is a nice bonus. Furthermore, it simplifies the API compared
to CMD / dummy reply + EVENT. And it gives a meaning to the exisiting
command "id"..


>
> [*] Try this one:
> Message-ID: <87o9yv890z@dusky.pond.sub.org>
> https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg05483.html
>
-- 
Marc-André Lureau


[Qemu-devel] [PATCH v2] vga: stop passing pointers to vga_draw_line* functions

2017-08-28 Thread Gerd Hoffmann
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).

Impact:  DoS for privileged guest users.  qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.

Fixes: CVE-2017-13672
Cc: P J P 
Reported-by: David Buchanan 
Signed-off-by: Gerd Hoffmann 
---
 hw/display/vga-helpers.h | 202 ++-
 hw/display/vga_int.h |   1 +
 hw/display/vga.c |   5 +-
 3 files changed, 114 insertions(+), 94 deletions(-)

diff --git a/hw/display/vga-helpers.h b/hw/display/vga-helpers.h
index 94f6de2046..5a752b3f9e 100644
--- a/hw/display/vga-helpers.h
+++ b/hw/display/vga-helpers.h
@@ -95,20 +95,46 @@ static void vga_draw_glyph9(uint8_t *d, int linesize,
 } while (--h);
 }
 
+static inline uint8_t vga_read_byte(VGACommonState *vga, uint32_t addr)
+{
+return vga->vram_ptr[addr & vga->vbe_size_mask];
+}
+
+static inline uint16_t vga_read_word_le(VGACommonState *vga, uint32_t addr)
+{
+uint32_t offset = addr & vga->vbe_size_mask & ~1;
+uint16_t *ptr = (uint16_t *)(vga->vram_ptr + offset);
+return lduw_le_p(ptr);
+}
+
+static inline uint16_t vga_read_word_be(VGACommonState *vga, uint32_t addr)
+{
+uint32_t offset = addr & vga->vbe_size_mask & ~1;
+uint16_t *ptr = (uint16_t *)(vga->vram_ptr + offset);
+return lduw_be_p(ptr);
+}
+
+static inline uint32_t vga_read_dword_le(VGACommonState *vga, uint32_t addr)
+{
+uint32_t offset = addr & vga->vbe_size_mask & ~3;
+uint32_t *ptr = (uint32_t *)(vga->vram_ptr + offset);
+return ldl_le_p(ptr);
+}
+
 /*
  * 4 color mode
  */
-static void vga_draw_line2(VGACommonState *s1, uint8_t *d,
-   const uint8_t *s, int width)
+static void vga_draw_line2(VGACommonState *vga, uint8_t *d,
+   uint32_t addr, int width)
 {
 uint32_t plane_mask, *palette, data, v;
 int x;
 
-palette = s1->last_palette;
-plane_mask = mask16[s1->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
+palette = vga->last_palette;
+plane_mask = mask16[vga->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
 width >>= 3;
 for(x = 0; x < width; x++) {
-data = ((uint32_t *)s)[0];
+data = vga_read_dword_le(vga, addr);
 data &= plane_mask;
 v = expand2[GET_PLANE(data, 0)];
 v |= expand2[GET_PLANE(data, 2)] << 2;
@@ -124,7 +150,7 @@ static void vga_draw_line2(VGACommonState *s1, uint8_t *d,
 ((uint32_t *)d)[6] = palette[(v >> 4) & 0xf];
 ((uint32_t *)d)[7] = palette[(v >> 0) & 0xf];
 d += 32;
-s += 4;
+addr += 4;
 }
 }
 
@@ -134,17 +160,17 @@ static void vga_draw_line2(VGACommonState *s1, uint8_t *d,
 /*
  * 4 color mode, dup2 horizontal
  */
-static void vga_draw_line2d2(VGACommonState *s1, uint8_t *d,
- const uint8_t *s, int width)
+static void vga_draw_line2d2(VGACommonState *vga, uint8_t *d,
+ uint32_t addr, int width)
 {
 uint32_t plane_mask, *palette, data, v;
 int x;
 
-palette = s1->last_palette;
-plane_mask = mask16[s1->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
+palette = vga->last_palette;
+plane_mask = mask16[vga->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
 width >>= 3;
 for(x = 0; x < width; x++) {
-data = ((uint32_t *)s)[0];
+data = vga_read_dword_le(vga, addr);
 data &= plane_mask;
 v = expand2[GET_PLANE(data, 0)];
 v |= expand2[GET_PLANE(data, 2)] << 2;
@@ -160,24 +186,24 @@ static void vga_draw_line2d2(VGACommonState *s1, uint8_t 
*d,
 PUT_PIXEL2(d, 6, palette[(v >> 4) & 0xf]);
 PUT_PIXEL2(d, 7, palette[(v >> 0) & 0xf]);
 d += 64;
-s += 4;
+addr += 4;
 }
 }
 
 /*
  * 16 color mode
  */
-static void vga_draw_line4(VGACommonState *s1, uint8_t *d,
-   const uint8_t *s, int width)
+static void vga_draw_line4(VGACommonState *vga, uint8_t *d,
+   uint32_t addr, int width)
 {
 uint32_t plane_mask, data, v, *palette;
 int x;
 
-palette = s1->last_palette;
-plane_mask = mask16[s1->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
+palette = vga->last_palette;
+plane_mask = mask16[vga->ar[VGA_ATC_PLANE_ENABLE] & 0xf];
 width >>= 3;
 for(x = 0; x < width; x++) {
-data = ((uint32_t *)s)[0];
+data = vga_read_dword_le(vga, addr);
 data &= plane_mask;
 v = expand4[GET_PLANE(data, 0)];
 v |= expand4[GET_PLANE(data, 1)] << 1;
@@ -192,24 +218,24 @@ static void vga_draw_line4(VGACommonState *s1, uint8_t *d,
 ((uint32_t *)d)[6] = palette[(v >> 4) & 0xf];
 ((uint32_t *)d)[7] = palette[(v >> 0) & 0xf];
 d += 32;
-s += 4;
+addr += 4;
 }
 }
 
 /*
  * 16

[Qemu-devel] [PATCH v2] vga: fix display update region calculation (split screen)

2017-08-28 Thread Gerd Hoffmann
vga display update mis-calculated the region for the dirty bitmap
snapshot in case split screen mode is used.  This can trigger an
assert in cpu_physical_memory_snapshot_get_dirty().

Impact:  DoS for privileged guest users.

Fixes: CVE-2017-13673
Fixes: fec5e8c92becad223df9d972770522f64aafdb72
Cc: P J P 
Reported-by: David Buchanan 
Signed-off-by: Gerd Hoffmann 
---
 hw/display/vga.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/display/vga.c b/hw/display/vga.c
index 63421f9ee8..ab33668402 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -1628,9 +1628,15 @@ static void vga_draw_graphic(VGACommonState *s, int 
full_update)
 y1 = 0;
 
 if (!full_update) {
+ram_addr_t region_start = addr1;
+ram_addr_t region_end = addr1 + line_offset * height;
 vga_sync_dirty_bitmap(s);
-snap = memory_region_snapshot_and_clear_dirty(&s->vram, addr1,
-  line_offset * height,
+if (s->line_compare < height) {
+/* split screen mode */
+region_start = 0;
+}
+snap = memory_region_snapshot_and_clear_dirty(&s->vram, region_start,
+  region_end - 
region_start,
   DIRTY_MEMORY_VGA);
 }
 
-- 
2.9.3




Re: [Qemu-devel] [PATCH v3 4/7] block: remove legacy I/O throttling

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 03:23:29 PM CEST, Manos Pitsidianakis wrote:
> This commit removes all I/O throttling from block/block-backend.c. In
> order to support the existing interface, it is changed to use the
> block/throttle.c filter driver.
>
> The throttle filter node that is created by the legacy interface is
> stored in a 'throttle_node' field in the BlockBackendPublic of the
> device. The legacy throttle node is managed by the legacy interface
> completely. More advanced configurations with the filter drive are
> possible using the QMP API, but these will be ignored by the legacy
> interface.
>
> Signed-off-by: Manos Pitsidianakis 

Reviewed-by: Alberto Garcia 

Berto



[Qemu-devel] [PATCH v2 3/3] qxl_unpack_chunks: codestyle fixups

2017-08-28 Thread Gerd Hoffmann
---
 hw/display/qxl-render.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index b2c98f90c0..90e0865618 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -215,14 +215,17 @@ static void qxl_unpack_chunks(void *dest, size_t size, 
PCIQXLDevice *qxl,
 bytes = MIN(size - offset, chunk->data_size);
 memcpy(dest + offset, chunk->data, bytes);
 offset += bytes;
-if (offset == size)
+if (offset == size) {
 return;
+}
 chunk = qxl_phys2virt(qxl, chunk->next_chunk, group_id);
-if (!chunk)
+if (!chunk) {
 return;
+}
 max_chunks--;
-if (max_chunks == 0)
+if (max_chunks == 0) {
 return;
+}
 }
 }
 
-- 
2.9.3




[Qemu-devel] [PATCH v2 2/3] qxl: add support for chunked cursors.

2017-08-28 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/display/qxl-render.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index e1b3f05ecb..b2c98f90c0 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -204,7 +204,30 @@ void qxl_render_update_area_done(PCIQXLDevice *qxl, 
QXLCookie *cookie)
 g_free(cookie);
 }
 
-static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor)
+static void qxl_unpack_chunks(void *dest, size_t size, PCIQXLDevice *qxl,
+  QXLDataChunk *chunk, uint32_t group_id)
+{
+uint32_t max_chunks = 32;
+size_t offset = 0;
+size_t bytes;
+
+for (;;) {
+bytes = MIN(size - offset, chunk->data_size);
+memcpy(dest + offset, chunk->data, bytes);
+offset += bytes;
+if (offset == size)
+return;
+chunk = qxl_phys2virt(qxl, chunk->next_chunk, group_id);
+if (!chunk)
+return;
+max_chunks--;
+if (max_chunks == 0)
+return;
+}
+}
+
+static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor,
+  uint32_t group_id)
 {
 QEMUCursor *c;
 size_t size;
@@ -215,7 +238,7 @@ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor 
*cursor)
 switch (cursor->header.type) {
 case SPICE_CURSOR_TYPE_ALPHA:
 size = sizeof(uint32_t) * cursor->header.width * cursor->header.height;
-memcpy(c->data, cursor->chunk.data, size);
+qxl_unpack_chunks(c->data, size, qxl, &cursor->chunk, group_id);
 if (qxl->debug > 2) {
 cursor_print_ascii_art(c, "qxl/alpha");
 }
@@ -259,11 +282,7 @@ int qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt 
*ext)
 if (!cursor) {
 return 1;
 }
-if (cursor->chunk.data_size != cursor->data_size) {
-fprintf(stderr, "%s: multiple chunks\n", __FUNCTION__);
-return 1;
-}
-c = qxl_cursor(qxl, cursor);
+c = qxl_cursor(qxl, cursor, ext->group_id);
 if (c == NULL) {
 c = cursor_builtin_left_ptr();
 }
-- 
2.9.3




[Qemu-devel] [PATCH v2 0/3] qxl: add support for chunked cursors.

2017-08-28 Thread Gerd Hoffmann
This series adds support for unpacking qxl chunks, and uses it to
support chunked cursor images.  Windows guest drivers seem to use
that when in HiDPI mode.

Also drop (broken) support for mono cursors.

[ v2: codestyle fixes ]

Gerd Hoffmann (3):
  qxl: drop mono cursor support
  qxl: add support for chunked cursors.
  qxl_unpack_chunks: codestyle fixups

 hw/display/qxl-render.c | 45 +
 1 file changed, 29 insertions(+), 16 deletions(-)

-- 
2.9.3




[Qemu-devel] [PATCH v2 1/3] qxl: drop mono cursor support

2017-08-28 Thread Gerd Hoffmann
The chunk size sanity check in qxl_render_cursor works for
SPICE_CURSOR_TYPE_ALPHA cursors only.  So support for
SPICE_CURSOR_TYPE_MONO cursors must be broken for ages without anyone
noticing.  Most likely it simply isn't used any more by guest drivers.
Drop the dead code.

Signed-off-by: Gerd Hoffmann 
---
 hw/display/qxl-render.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index 9ad9d9e0f5..e1b3f05ecb 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -207,7 +207,6 @@ void qxl_render_update_area_done(PCIQXLDevice *qxl, 
QXLCookie *cookie)
 static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor)
 {
 QEMUCursor *c;
-uint8_t *image, *mask;
 size_t size;
 
 c = cursor_alloc(cursor->header.width, cursor->header.height);
@@ -221,14 +220,6 @@ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor 
*cursor)
 cursor_print_ascii_art(c, "qxl/alpha");
 }
 break;
-case SPICE_CURSOR_TYPE_MONO:
-mask  = cursor->chunk.data;
-image = mask + cursor_get_mono_bpl(c) * c->width;
-cursor_set_mono(c, 0xff, 0x00, image, 1, mask);
-if (qxl->debug > 2) {
-cursor_print_ascii_art(c, "qxl/mono");
-}
-break;
 default:
 fprintf(stderr, "%s: not implemented: type %d\n",
 __FUNCTION__, cursor->header.type);
-- 
2.9.3




[Qemu-devel] [PATCH v3 1/2] qxl: drop mono cursor support

2017-08-28 Thread Gerd Hoffmann
The chunk size sanity check in qxl_render_cursor works for
SPICE_CURSOR_TYPE_ALPHA cursors only.  So support for
SPICE_CURSOR_TYPE_MONO cursors must be broken for ages without anyone
noticing.  Most likely it simply isn't used any more by guest drivers.
Drop the dead code.

Signed-off-by: Gerd Hoffmann 
---
 hw/display/qxl-render.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index 9ad9d9e0f5..e1b3f05ecb 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -207,7 +207,6 @@ void qxl_render_update_area_done(PCIQXLDevice *qxl, 
QXLCookie *cookie)
 static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor)
 {
 QEMUCursor *c;
-uint8_t *image, *mask;
 size_t size;
 
 c = cursor_alloc(cursor->header.width, cursor->header.height);
@@ -221,14 +220,6 @@ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor 
*cursor)
 cursor_print_ascii_art(c, "qxl/alpha");
 }
 break;
-case SPICE_CURSOR_TYPE_MONO:
-mask  = cursor->chunk.data;
-image = mask + cursor_get_mono_bpl(c) * c->width;
-cursor_set_mono(c, 0xff, 0x00, image, 1, mask);
-if (qxl->debug > 2) {
-cursor_print_ascii_art(c, "qxl/mono");
-}
-break;
 default:
 fprintf(stderr, "%s: not implemented: type %d\n",
 __FUNCTION__, cursor->header.type);
-- 
2.9.3




[Qemu-devel] [PATCH v3 2/2] qxl: add support for chunked cursors.

2017-08-28 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/display/qxl-render.c | 36 +---
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index e1b3f05ecb..90e0865618 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -204,7 +204,33 @@ void qxl_render_update_area_done(PCIQXLDevice *qxl, 
QXLCookie *cookie)
 g_free(cookie);
 }
 
-static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor)
+static void qxl_unpack_chunks(void *dest, size_t size, PCIQXLDevice *qxl,
+  QXLDataChunk *chunk, uint32_t group_id)
+{
+uint32_t max_chunks = 32;
+size_t offset = 0;
+size_t bytes;
+
+for (;;) {
+bytes = MIN(size - offset, chunk->data_size);
+memcpy(dest + offset, chunk->data, bytes);
+offset += bytes;
+if (offset == size) {
+return;
+}
+chunk = qxl_phys2virt(qxl, chunk->next_chunk, group_id);
+if (!chunk) {
+return;
+}
+max_chunks--;
+if (max_chunks == 0) {
+return;
+}
+}
+}
+
+static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor *cursor,
+  uint32_t group_id)
 {
 QEMUCursor *c;
 size_t size;
@@ -215,7 +241,7 @@ static QEMUCursor *qxl_cursor(PCIQXLDevice *qxl, QXLCursor 
*cursor)
 switch (cursor->header.type) {
 case SPICE_CURSOR_TYPE_ALPHA:
 size = sizeof(uint32_t) * cursor->header.width * cursor->header.height;
-memcpy(c->data, cursor->chunk.data, size);
+qxl_unpack_chunks(c->data, size, qxl, &cursor->chunk, group_id);
 if (qxl->debug > 2) {
 cursor_print_ascii_art(c, "qxl/alpha");
 }
@@ -259,11 +285,7 @@ int qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt 
*ext)
 if (!cursor) {
 return 1;
 }
-if (cursor->chunk.data_size != cursor->data_size) {
-fprintf(stderr, "%s: multiple chunks\n", __FUNCTION__);
-return 1;
-}
-c = qxl_cursor(qxl, cursor);
+c = qxl_cursor(qxl, cursor, ext->group_id);
 if (c == NULL) {
 c = cursor_builtin_left_ptr();
 }
-- 
2.9.3




[Qemu-devel] [PATCH v3 0/2] qxl: add support for chunked cursors.

2017-08-28 Thread Gerd Hoffmann
This series adds support for unpacking qxl chunks, and uses it to
support chunked cursor images.  Windows guest drivers seem to use
that when in HiDPI mode.

Also drop (broken) support for mono cursors.

[ v2: codestyle fixes ]
[ v3: actually squash the codestyle fixes ... ]

Gerd Hoffmann (2):
  qxl: drop mono cursor support
  qxl: add support for chunked cursors.

 hw/display/qxl-render.c | 45 +
 1 file changed, 29 insertions(+), 16 deletions(-)

-- 
2.9.3




Re: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll

2017-08-28 Thread Peter Xu
On Mon, Aug 28, 2017 at 12:11:38PM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Aug 28, 2017 at 5:05 AM, Peter Xu  wrote:
> > On Fri, Aug 25, 2017 at 04:07:34PM +, Marc-André Lureau wrote:
> >> On Fri, Aug 25, 2017 at 5:33 PM Dr. David Alan Gilbert 
> >> 
> >> wrote:
> >>
> >> > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> >> > > Hi
> >> > >
> >> > > On Wed, Aug 23, 2017 at 8:52 AM Peter Xu  wrote:
> >> > >
> >> > > > Firstly, introduce Monitor.use_thread, and set it for monitors that 
> >> > > > are
> >> > > > using non-mux typed backend chardev.  We only do this for monitors, 
> >> > > > so
> >> > > > mux-typed chardevs are not suitable (when it connects to, e.g., 
> >> > > > serials
> >> > > > and the monitor together).
> >> > > >
> >> > > > When use_thread is set, we create standalone thread to poll the 
> >> > > > monitor
> >> > > > events, isolated from the main loop thread.  Here we still need to 
> >> > > > take
> >> > > > the BQL before dispatching the tasks since some of the monitor 
> >> > > > commands
> >> > > > are not allowed to execute without the protection of BQL.  Then this
> >> > > > gives us the chance to avoid taking the BQL for some monitor commands
> >> > in
> >> > > > the future.
> >> > > >
> >> > > > * Why this change?
> >> > > >
> >> > > > We need these per-monitor threads to make sure we can have at least 
> >> > > > one
> >> > > > monitor that will never stuck (that can receive further monitor
> >> > > > commands).
> >> > > >
> >> > > > * So when will monitors stuck?  And, how do they stuck?
> >> > > >
> >> > > > After we have postcopy and remote page faults, it's simple to 
> >> > > > achieve a
> >> > > > stuck in the monitor (which is also a stuck in main loop thread):
> >> > > >
> >> > > > (1) Monitor deadlock on BQL
> >> > > >
> >> > > > As we may know, when postcopy is running on destination VM, the vcpu
> >> > > > threads can stuck merely any time as long as it tries to access an
> >> > > > uncopied guest page.  Meanwhile, when the stuck happens, it is 
> >> > > > possible
> >> > > > that the vcpu thread is holding the BQL.  If the page fault is not
> >> > > > handled quickly, you'll find that monitors stop working, which is
> >> > trying
> >> > > > to take the BQL.
> >> > > >
> >> > > > If the page fault cannot be handled correctly (one case is a paused
> >> > > > postcopy, when network is temporarily down), monitors will hang
> >> > > > forever.  Without current patch, that means the main loop hanged.
> >> > We'll
> >> > > > never find a way to talk to VM again.
> >> > > >
> >> > >
> >> > > Could the BQL be pushed down to the monitor commands level instead? 
> >> > > That
> >> > > way we wouldn't need a seperate thread to solve the hang on commands 
> >> > > that
> >> > > do not need BQL.
> >> >
> >> > If the main thread is stuck though I don't see how that helps you; you
> >> > have to be able to run these commands on another thread.
> >> >
> >>
> >> Why would the main thread be stuck? In (1) If the vcpu thread takes the BQL
> >> and the command doesn't need it, it would work.  In (2),  info cpus
> >> shouldn't keep the BQL (my qapi-async series would probably help here)
> >
> > (Thanks for joining the discussion)
> >
> > AFAIK the main thread can be stuck for many reasons.  I have seen one
> > stack when the VGA code (IIUC) was trying to writting to guest graphic
> > memory in main loop thread but luckily that guest page is still not
> > copied yet from source.  As long as the main thread is stuck for any
> > reason, no chance for monitor commands, even if the commands support
> > async operations.
> 
> If that command becomes async (it probably should, any command doing
> IO probaly should), then the main loop can keep running.

The problem is that, it's not blocked at "a command", but a task
running on the main thread.  The task can access guest memory, and
when the guest page is not there, the main thread hangs.  Then it
hangs every monitors, and all other tasks that are bounded to main
thread.

> 
> >
> > So IMHO the only solution is doing these things in separate threads,
> > rather than all in a single one.
> 
> I wouldn't say it's the only solution. I think the monitor can touch
> many areas that haven't been written with multi-threading in mind. My
> proposal is probably safer, although I don't know how hard it would be
> to push the BQL down to QMP commands, and make async existing IO
> commands. The benefits of this work are quite interesting imho,
> because a stuck mainloop is basically a stuck qemu, and an additional
> thread will not solve it...
> 
> -- 
> Marc-André Lureau

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Farhan Ali



On 08/28/2017 04:22 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 11:05:30 -0400
Farhan Ali  wrote:


On 08/25/2017 10:04 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 09:24:46 -0400
Farhan Ali  wrote:


The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
exposes an alignment issue in ccw bios.

According to PoP the CCW must be doubleword aligned. Let's fix
this in the bios.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Farhan Ali 
Reviewed-by: Halil Pasic 
Reviewed-by: Eric Farman 
Acked-by: Christian Borntraeger 
---
 pc-bios/s390-ccw/cio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
index f5b4549..55eaeee 100644
--- a/pc-bios/s390-ccw/cio.h
+++ b/pc-bios/s390-ccw/cio.h
@@ -133,7 +133,7 @@ struct ccw1 {
 __u8 flags;
 __u16 count;
 __u32 cda;
-} __attribute__ ((packed));
+} __attribute__ ((packed, aligned(8)));

 #define CCW_FLAG_DC  0x80
 #define CCW_FLAG_CC  0x40


Currently testing.

This looks obviously right, but did you figure out what the (probably
unrelated) other failure was?



That is still under investigation, for some reason it only fails for an
LDL DASD and it works for SCSIs and CDL DASD.


Which are the symptoms of the failure? I'd like to understand this
before I update the (currently working by accident) bios with an
updated version.

I'll just apply the patch for now.



Well it's seems like the failure for LDL DASD could be a disk setup 
failure. We tried the test on a different environment with LDL disks and 
everything worked fine with the patch applied.






Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Cornelia Huck
On Mon, 28 Aug 2017 08:56:42 -0400
Farhan Ali  wrote:

> On 08/28/2017 04:22 AM, Cornelia Huck wrote:
> > On Fri, 25 Aug 2017 11:05:30 -0400
> > Farhan Ali  wrote:
> >  
> >> On 08/25/2017 10:04 AM, Cornelia Huck wrote:  
> >>> On Fri, 25 Aug 2017 09:24:46 -0400
> >>> Farhan Ali  wrote:
> >>>  
>  The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
>  exposes an alignment issue in ccw bios.
> 
>  According to PoP the CCW must be doubleword aligned. Let's fix
>  this in the bios.
> 
>  Cc: qemu-sta...@nongnu.org
>  Signed-off-by: Farhan Ali 
>  Reviewed-by: Halil Pasic 
>  Reviewed-by: Eric Farman 
>  Acked-by: Christian Borntraeger 
>  ---
>   pc-bios/s390-ccw/cio.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
>  index f5b4549..55eaeee 100644
>  --- a/pc-bios/s390-ccw/cio.h
>  +++ b/pc-bios/s390-ccw/cio.h
>  @@ -133,7 +133,7 @@ struct ccw1 {
>   __u8 flags;
>   __u16 count;
>   __u32 cda;
>  -} __attribute__ ((packed));
>  +} __attribute__ ((packed, aligned(8)));
> 
>   #define CCW_FLAG_DC  0x80
>   #define CCW_FLAG_CC  0x40  
> >>>
> >>> Currently testing.
> >>>
> >>> This looks obviously right, but did you figure out what the (probably
> >>> unrelated) other failure was?
> >>>  
> >>
> >> That is still under investigation, for some reason it only fails for an
> >> LDL DASD and it works for SCSIs and CDL DASD.  
> >
> > Which are the symptoms of the failure? I'd like to understand this
> > before I update the (currently working by accident) bios with an
> > updated version.
> >
> > I'll just apply the patch for now.
> >  
> 
> Well it's seems like the failure for LDL DASD could be a disk setup 
> failure. We tried the test on a different environment with LDL disks and 
> everything worked fine with the patch applied.

Odd that it breaks after this change, though. Do you get command
rejects, or what happens?



[Qemu-devel] [PATCH v2] qemu crashes when a negative number used for 'maxcpus'

2017-08-28 Thread Seeteena Thoufeek
---Steps to Reproduce---

When passed a negative number to 'maxcpus' parameter, Qemu aborts
with a core dump.

Run the following command with maxcpus argument as negative number

ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
pseries,accel=kvm,kvm-type=HV -m size=200g -device virtio-blk-pci,
drive=rootdisk -drive file=/home/images/pegas-1.0-ppc64le.qcow2,
if=none,cache=none,id=rootdisk,format=qcow2 -monitor telnet
:127.0.0.1:1234,server,nowait -net nic,model=virtio -net
user -redir tcp:2000::22 -device nec-usb-xhci -smp 8,cores=1,
threads=1,maxcpus=-12

(process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
 18446744073709550568 bytes

Trace/breakpoint trap

Reported-by: R.Nageswara Sastry 
Signed-off-by: Seeteena Thoufeek 
Reviewed-by: Bharata B Rao 
---
 vl.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/vl.c b/vl.c
index 8e247cc..042714f 100644
--- a/vl.c
+++ b/vl.c
@@ -1244,6 +1244,11 @@ static void smp_parse(QemuOpts *opts)
 }
 
 max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+
+   if (max_cpus < 0) {
+   error_report("Invalid max_cpus : %d", max_cpus);
+   exit(1);
+   }
 
 if (max_cpus < cpus) {
 error_report("maxcpus must be equal to or greater than smp");
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v2] qemu crashes when a negative number used for 'maxcpus'

2017-08-28 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v2] qemu crashes when a negative number used for 
'maxcpus'
Message-id: 1503924917-12687-1-git-send-email-s1see...@linux.vnet.ibm.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   
patchew/1503924917-12687-1-git-send-email-s1see...@linux.vnet.ibm.com -> 
patchew/1503924917-12687-1-git-send-email-s1see...@linux.vnet.ibm.com
Switched to a new branch 'test'
bdc7670bb2 qemu crashes when a negative number used for 'maxcpus'

=== OUTPUT BEGIN ===
Checking PATCH 1/1: qemu crashes when a negative number used for 'maxcpus'...
ERROR: code indent should never use tabs
#39: FILE: vl.c:1248:
+^Iif (max_cpus < 0) {$

ERROR: code indent should never use tabs
#40: FILE: vl.c:1249:
+^I^Ierror_report("Invalid max_cpus : %d", max_cpus);$

ERROR: code indent should never use tabs
#41: FILE: vl.c:1250:
+^I^Iexit(1);$

ERROR: code indent should never use tabs
#42: FILE: vl.c:1251:
+^I}$

total: 4 errors, 0 warnings, 11 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Farhan Ali



On 08/28/2017 09:06 AM, Cornelia Huck wrote:

On Mon, 28 Aug 2017 08:56:42 -0400
Farhan Ali  wrote:


On 08/28/2017 04:22 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 11:05:30 -0400
Farhan Ali  wrote:


On 08/25/2017 10:04 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 09:24:46 -0400
Farhan Ali  wrote:


The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
exposes an alignment issue in ccw bios.

According to PoP the CCW must be doubleword aligned. Let's fix
this in the bios.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Farhan Ali 
Reviewed-by: Halil Pasic 
Reviewed-by: Eric Farman 
Acked-by: Christian Borntraeger 
---
 pc-bios/s390-ccw/cio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
index f5b4549..55eaeee 100644
--- a/pc-bios/s390-ccw/cio.h
+++ b/pc-bios/s390-ccw/cio.h
@@ -133,7 +133,7 @@ struct ccw1 {
 __u8 flags;
 __u16 count;
 __u32 cda;
-} __attribute__ ((packed));
+} __attribute__ ((packed, aligned(8)));

 #define CCW_FLAG_DC  0x80
 #define CCW_FLAG_CC  0x40


Currently testing.

This looks obviously right, but did you figure out what the (probably
unrelated) other failure was?



That is still under investigation, for some reason it only fails for an
LDL DASD and it works for SCSIs and CDL DASD.


Which are the symptoms of the failure? I'd like to understand this
before I update the (currently working by accident) bios with an
updated version.

I'll just apply the patch for now.



Well it's seems like the failure for LDL DASD could be a disk setup
failure. We tried the test on a different environment with LDL disks and
everything worked fine with the patch applied.


Odd that it breaks after this change, though. Do you get command
rejects, or what happens?



It's the alignment of the CCW which causes the problem.

The exact error message when starting the guest was:

! No virtio device found !

Since it worked for SCSI and CDL, and failed for LDL disks on that 
particular system, we are not really sure what caused the failure.
Debugging it further showed the CCW for LDL disks were not aligned at 
double word boundary.


Trying the test on a different system with LDL disks worked fine, with 
the aligned(8) fix.





Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Christian Borntraeger



On 08/28/2017 03:18 PM, Farhan Ali wrote:
> 
> 
> On 08/28/2017 09:06 AM, Cornelia Huck wrote:
>> On Mon, 28 Aug 2017 08:56:42 -0400
>> Farhan Ali  wrote:
>>
>>> On 08/28/2017 04:22 AM, Cornelia Huck wrote:
 On Fri, 25 Aug 2017 11:05:30 -0400
 Farhan Ali  wrote:

> On 08/25/2017 10:04 AM, Cornelia Huck wrote:
>> On Fri, 25 Aug 2017 09:24:46 -0400
>> Farhan Ali  wrote:
>>
>>> The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
>>> exposes an alignment issue in ccw bios.
>>>
>>> According to PoP the CCW must be doubleword aligned. Let's fix
>>> this in the bios.
>>>
>>> Cc: qemu-sta...@nongnu.org
>>> Signed-off-by: Farhan Ali 
>>> Reviewed-by: Halil Pasic 
>>> Reviewed-by: Eric Farman 
>>> Acked-by: Christian Borntraeger 
>>> ---
>>>  pc-bios/s390-ccw/cio.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
>>> index f5b4549..55eaeee 100644
>>> --- a/pc-bios/s390-ccw/cio.h
>>> +++ b/pc-bios/s390-ccw/cio.h
>>> @@ -133,7 +133,7 @@ struct ccw1 {
>>>  __u8 flags;
>>>  __u16 count;
>>>  __u32 cda;
>>> -} __attribute__ ((packed));
>>> +} __attribute__ ((packed, aligned(8)));
>>>
>>>  #define CCW_FLAG_DC  0x80
>>>  #define CCW_FLAG_CC  0x40
>>
>> Currently testing.
>>
>> This looks obviously right, but did you figure out what the (probably
>> unrelated) other failure was?
>>
>
> That is still under investigation, for some reason it only fails for an
> LDL DASD and it works for SCSIs and CDL DASD.

 Which are the symptoms of the failure? I'd like to understand this
 before I update the (currently working by accident) bios with an
 updated version.

 I'll just apply the patch for now.

>>>
>>> Well it's seems like the failure for LDL DASD could be a disk setup
>>> failure. We tried the test on a different environment with LDL disks and
>>> everything worked fine with the patch applied.
>>
>> Odd that it breaks after this change, though. Do you get command
>> rejects, or what happens?
>>
> 
> It's the alignment of the CCW which causes the problem.
> 
> The exact error message when starting the guest was:
> 
> ! No virtio device found !
> 
> Since it worked for SCSI and CDL, and failed for LDL disks on that particular 
> system, we are not really sure what caused the failure.
> Debugging it further showed the CCW for LDL disks were not aligned at double 
> word boundary.
> 
> Trying the test on a different system with LDL disks worked fine, with the 
> aligned(8) fix.

Do you happen to have an old s390-ccw.img laying around in the test folder? 
QEMU might pick up 
this one (e.g. when calling it without libvirt from the command line).




Re: [Qemu-devel] [PATCH v15 4/5] mm: support reporting free page blocks

2017-08-28 Thread Michal Hocko
On Mon 28-08-17 18:08:32, Wei Wang wrote:
> This patch adds support to walk through the free page blocks in the
> system and report them via a callback function. Some page blocks may
> leave the free list after zone->lock is released, so it is the caller's
> responsibility to either detect or prevent the use of such pages.
> 
> One use example of this patch is to accelerate live migration by skipping
> the transfer of free pages reported from the guest. A popular method used
> by the hypervisor to track which part of memory is written during live
> migration is to write-protect all the guest memory. So, those pages that
> are reported as free pages but are written after the report function
> returns will be captured by the hypervisor, and they will be added to the
> next round of memory transfer.

OK, looks much better. I still have few nits.

> +extern void walk_free_mem_block(void *opaque,
> + int min_order,
> + bool (*report_page_block)(void *, unsigned long,
> +   unsigned long));
> +

please add names to arguments of the prototype

>  /*
>   * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
>   * into the buddy system. The freed pages will be poisoned with pattern
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6d00f74..81eedc7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4762,6 +4762,71 @@ void show_free_areas(unsigned int filter, nodemask_t 
> *nodemask)
>   show_swap_cache_info();
>  }
>  
> +/**
> + * walk_free_mem_block - Walk through the free page blocks in the system
> + * @opaque: the context passed from the caller
> + * @min_order: the minimum order of free lists to check
> + * @report_page_block: the callback function to report free page blocks

page_block has meaning in the core MM which doesn't strictly match its
usage here. Moreover we are reporting pfn ranges rather than struct page
range. So report_pfn_range would suit better.

[...]
> + for_each_populated_zone(zone) {
> + for (order = MAX_ORDER - 1; order >= min_order; order--) {
> + for (mt = 0; !stop && mt < MIGRATE_TYPES; mt++) {
> + spin_lock_irqsave(&zone->lock, flags);
> + list = &zone->free_area[order].free_list[mt];
> + list_for_each_entry(page, list, lru) {
> + pfn = page_to_pfn(page);
> + stop = report_page_block(opaque, pfn,
> +  1 << order);
> + if (stop)
> + break;

if (stop) {

spin_unlock_irqrestore(&zone->lock, flags);
return;
}

would be both easier and less error prone. E.g. You wouldn't pointlessly
iterate over remaining orders just to realize there is nothing to be
done for those...

> + }
> + spin_unlock_irqrestore(&zone->lock, flags);
> + }
> + }
> + }
> +}
> +EXPORT_SYMBOL_GPL(walk_free_mem_block);

-- 
Michal Hocko
SUSE Labs



Re: [Qemu-devel] [PATCH 3/5] pci: Add INTERFACE_PCIE_DEVICE to all PCIe devices

2017-08-28 Thread Eduardo Habkost
On Sun, Aug 27, 2017 at 11:35:56AM +0300, Marcel Apfelbaum wrote:
> Hi Eduardo,
> 
> On 24/08/2017 1:14, Eduardo Habkost wrote:
> > Change all devices that set is_express=1 to implement
> > INTERFACE_PCIE_DEVICE.
> > 
> 
> Can this interface *replace* is_express field?

It can, but it has to be done carefully: 4 of the 5 hybrid
devices have is_express=0, so their logic need to be changed from
"set QEMU_PCI_CAP_EXPRESS manually if Express" to "clear
QEMU_PCI_CAP_EXPRESS manually if Conventional PCI".

Cleaning up the code on the hybrid devices is on my plans, but I
decided to do that after this series.

-- 
Eduardo



Re: [Qemu-devel] [PATCH] tests: fix incorrect size_t format in benchmark-crypto

2017-08-28 Thread Marc-André Lureau
On Mon, Aug 28, 2017 at 1:38 PM Philippe Mathieu-Daudé 
wrote:

>   $ make check-speed
>   tests/benchmark-crypto-hash.c: In function 'test_hash_speed':
>   tests/benchmark-crypto-hash.c:44:5: error: format '%ld' expects argument
> of type 'long int', but argument 2 has type 'size_t' [-Werror=format=]
>g_print("Testing chunk_size %ld bytes ", chunk_size);
>^
>   tests/benchmark-crypto-hash.c: In function 'main':
>   tests/benchmark-crypto-hash.c:62:9: error: format '%lu' expects argument
> of type 'long unsigned int', but argument 4 has type 'size_t'
> [-Werror=format=]
>snprintf(name, sizeof(name), "/crypto/hash/speed-%lu", i);
>^
>   cc1: all warnings being treated as errors
>   rules.mak:66: recipe for target 'tests/benchmark-crypto-hash.o' failed
>   make: *** [tests/benchmark-crypto-hash.o] Error 1
>
> Signed-off-by: Philippe Mathieu-Daudé 
>

 Reviewed-by: Marc-André Lureau 

---
> testing v2.10.0-rc4
>
>  tests/benchmark-crypto-cipher.c | 4 ++--
>  tests/benchmark-crypto-hash.c   | 4 ++--
>  tests/benchmark-crypto-hmac.c   | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/tests/benchmark-crypto-cipher.c
> b/tests/benchmark-crypto-cipher.c
> index c6a40929e4..cf98443468 100644
> --- a/tests/benchmark-crypto-cipher.c
> +++ b/tests/benchmark-crypto-cipher.c
> @@ -59,7 +59,7 @@ static void test_cipher_speed(const void *opaque)
>  total /= 1024 * 1024; /* to MB */
>
>  g_print("cbc(aes128): ");
> -g_print("Testing chunk_size %ld bytes ", chunk_size);
> +g_print("Testing chunk_size %zu bytes ", chunk_size);
>  g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
>  g_print("%.2f MB/sec\n", total / g_test_timer_last());
>
> @@ -80,7 +80,7 @@ int main(int argc, char **argv)
>
>  for (i = 512; i <= (64 * 1204); i *= 2) {
>  memset(name, 0 , sizeof(name));
> -snprintf(name, sizeof(name), "/crypto/cipher/speed-%lu", i);
> +snprintf(name, sizeof(name), "/crypto/cipher/speed-%zu", i);
>  g_test_add_data_func(name, (void *)i, test_cipher_speed);
>  }
>
> diff --git a/tests/benchmark-crypto-hash.c b/tests/benchmark-crypto-hash.c
> index 6769d2a11b..122bfb6b85 100644
> --- a/tests/benchmark-crypto-hash.c
> +++ b/tests/benchmark-crypto-hash.c
> @@ -41,7 +41,7 @@ static void test_hash_speed(const void *opaque)
>
>  total /= 1024 * 1024; /* to MB */
>  g_print("sha256: ");
> -g_print("Testing chunk_size %ld bytes ", chunk_size);
> +g_print("Testing chunk_size %zu bytes ", chunk_size);
>  g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
>  g_print("%.2f MB/sec\n", total / g_test_timer_last());
>
> @@ -59,7 +59,7 @@ int main(int argc, char **argv)
>
>  for (i = 512; i <= (64 * 1204); i *= 2) {
>  memset(name, 0 , sizeof(name));
> -snprintf(name, sizeof(name), "/crypto/hash/speed-%lu", i);
> +snprintf(name, sizeof(name), "/crypto/hash/speed-%zu", i);
>  g_test_add_data_func(name, (void *)i, test_hash_speed);
>  }
>
> diff --git a/tests/benchmark-crypto-hmac.c b/tests/benchmark-crypto-hmac.c
> index 72408be987..c30250df3e 100644
> --- a/tests/benchmark-crypto-hmac.c
> +++ b/tests/benchmark-crypto-hmac.c
> @@ -56,7 +56,7 @@ static void test_hmac_speed(const void *opaque)
>  total /= 1024 * 1024; /* to MB */
>
>  g_print("hmac(sha256): ");
> -g_print("Testing chunk_size %ld bytes ", chunk_size);
> +g_print("Testing chunk_size %zu bytes ", chunk_size);
>  g_print("done: %.2f MB in %.2f secs: ", total, g_test_timer_last());
>  g_print("%.2f MB/sec\n", total / g_test_timer_last());
>
> @@ -74,7 +74,7 @@ int main(int argc, char **argv)
>
>  for (i = 512; i <= (64 * 1204); i *= 2) {
>  memset(name, 0 , sizeof(name));
> -snprintf(name, sizeof(name), "/crypto/hmac/speed-%lu", i);
> +snprintf(name, sizeof(name), "/crypto/hmac/speed-%zu", i);
>  g_test_add_data_func(name, (void *)i, test_hmac_speed);
>  }
>
> --
> 2.14.1
>
>
> --
Marc-André Lureau


[Qemu-devel] [Bug 1713408] Re: qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value passed to "maxcpus"

2017-08-28 Thread R.Nageswara Sastry
Sure will do the changes and update. Seems one of my colleague did it already 
(sent patch to devel list)
https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg05345.html
I will pass your review comments to her for modification.

Thanks for your review.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713408

Title:
  qemu crashes with "GLib-ERROR **: gmem.c" error when a negative value
  passed to "maxcpus"

Status in QEMU:
  New

Bug description:
  # ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
  pseries,accel=kvm,kvm-type=HV -m size=20g -device virtio-blk-
  pci,drive=rootdisk -drive file=/home/nasastry/avocado-fvt-wrapper/data
  /avocado-
  vt/images/pegas-1.0-ppc64le.qcow2,if=none,cache=none,id=rootdisk,format=qcow2
  -monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio
  -net user -device nec-usb-xhci -smp 8,cores=1,threads=1,maxcpus=-12

  (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  From GDB:
  [New Thread 0x3fffb5aceb60 (LWP 12190)]

  (process:12184): GLib-ERROR **: gmem.c:130: failed to allocate
  18446744073709550568 bytes

  Program received signal SIGTRAP, Trace/breakpoint trap.
  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  Missing separate debuginfos, use: debuginfo-install 
glib2-2.50.3-3.el7.ppc64le glibc-2.17-196.el7.ppc64le 
gnutls-3.3.26-9.el7.ppc64le krb5-libs-1.15.1-8.el7.ppc64le 
libgcc-4.8.5-16.el7.ppc64le libstdc++-4.8.5-16.el7.ppc64le 
ncurses-libs-5.9-13.20130511.el7.ppc64le nss-3.28.4-8.el7.ppc64le 
nss-softokn-freebl-3.28.3-6.el7.ppc64le nss-util-3.28.4-3.el7.ppc64le 
openldap-2.4.44-5.el7.ppc64le openssl-libs-1.0.2k-8.el7.ppc64le 
p11-kit-0.23.5-3.el7.ppc64le
  (gdb) bt
  #0  0x3fffb75e5408 in raise () from /lib64/libpthread.so.0
  #1  0x3fffb796be9c in _g_log_abort () from /lib64/libglib-2.0.so.0
  #2  0x3fffb796d4c4 in g_log_default_handler () from 
/lib64/libglib-2.0.so.0
  #3  0x3fffb796d86c in g_logv () from /lib64/libglib-2.0.so.0
  #4  0x3fffb796db00 in g_log () from /lib64/libglib-2.0.so.0
  #5  0x3fffb796b694 in g_malloc0 () from /lib64/libglib-2.0.so.0
  #6  0x1018fa84 in spapr_possible_cpu_arch_ids (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:3322
  #7  0x1018b444 in spapr_init_cpus (spapr=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2096
  #8  0x1018bc6c in ppc_spapr_init (machine=0x11165660) at 
/home/nasastry/upstream/qemu/hw/ppc/spapr.c:2275
  #9  0x1041ca38 in machine_run_board_init (machine=0x11165660) at 
hw/core/machine.c:760
  #10 0x1037723c in main (argc=24, argv=0x3108, 
envp=0x31d0) at vl.c:4633
  (gdb) i r
  r0 0xfa   250
  r1 0x3fffe450 70368744170576
  r2 0x3fffb7608100 70367525765376
  r3 0x00
  r4 0x2f98 12184
  r5 0x55
  r6 0x00
  r7 0x3fffa820 70367267782688
  r8 0x2f98 12184
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fffb64fccb0 70367507893424
  r140x00
  r150x00
  r160x00
  r170x00
  r180x11
  r190x00
  r200x3fffb796d3f0 70367529325552
  r210x00
  r220x2000 536870912
  r230x11
  r240x3fffb7a61498 70367530325144
  r250x3fffb7a614e8 70367530325224
  r260x3fffb7a61488 70367530325128
  r270x3fffa80008c0 70367267784896
  r280x3fffb79cd2a8 70367529718440
  r290x3fffb79cd2a8 70367529718440
  r300x 18446744073709551615
  r310x11
  pc 0x3fffb75e5408 0x3fffb75e5408 
  msr0x9000d033 10376293541461676083
  cr 0x42244842 1109674050
  lr 0x3fffb796be9c 0x3fffb796be9c <_g_log_abort+60>
  ctr0x00
  xer0x00
  orig_r30x2f98 12184
  trap   0xc00  3072

  Similar error observed on x86_64 and PPC64LE architectures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713408/+subscriptions



[Qemu-devel] [Bug 1713434] Re: prom-env-test test aborted and core dumped

2017-08-28 Thread R.Nageswara Sastry
Host was not loaded at that time. And can be re-producable all the times

  GTESTER check-qtest-ppc64
**
ERROR:tests/prom-env-test.c:42:check_guest_memory: assertion failed (signature 
== MAGIC): (0x7c7f1b78 == 0xcafec0de)
GTester: last random seed: R02S5625099e4ad7700238a4e83dbd6576e0

this is with new seed.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  In Progress

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



Re: [Qemu-devel] [PATCH v3 5/7] block/throttle-groups.c: remove throttle-groups list

2017-08-28 Thread Alberto Garcia
On Fri 25 Aug 2017 03:23:30 PM CEST, Manos Pitsidianakis wrote:
> @@ -1957,12 +1957,18 @@ void blk_io_limits_enable(BlockBackend *blk, const 
> char *group,  Error **errp)
>  BlockDriverState *bs = blk_bs(blk), *throttle_node;
>  QDict *options = qdict_new();
>  Error *local_err = NULL;
> -ThrottleState *ts;
> -
> -bdrv_drained_begin(bs);
>  
>  /* Force creation of group in case it doesn't exist */
> -ts = throttle_group_incref(group);
> +if (!throttle_group_exists(group)) {
> +throttle_group_new_legacy(group, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +}

After this, the reference count of the new group is 2 (as is already
explained in throttle_group_new_legacy()).

> +bdrv_drained_begin(bs);
> +
>  qdict_set_default_str(options, "file", bs->node_name);
>  qdict_set_default_str(options, QEMU_OPT_THROTTLE_GROUP_NAME, group);
>  throttle_node = bdrv_new_open_driver(bdrv_find_format("throttle"), NULL,

And the implicit throttle node will hold and extra reference, making it
3.

How do you delete the legacy throttle group then? block_set_io_throttle
with everything set to 0 disables I/O limits and destroys the node, but
the reference count is still 2 and the group is not destroyed.

> +static int query_throttle_groups_foreach(Object *obj, void *data)
> +{
> +ThrottleGroup *tg;
> +struct ThrottleGroupQuery *query = data;
> +
> +tg = (ThrottleGroup *)object_dynamic_cast(obj, TYPE_THROTTLE_GROUP);
> +if (!tg) {
> +return 0;
> +}
> +
> +if (!g_strcmp0(query->name, tg->name)) {
> +query->result = tg;
> +return 1;
> +}
> +
> +return 0;
> +}

If you want you can make this a bit shorter if you merge both ifs:

   if (tg && !g_strcmp0(query->name, tg->name)) {
   query->result = tg;
   return 1;
   }

   return 0;

> +void throttle_group_new_legacy(const char *name, Error **errp)
> +{
> +ThrottleGroup *tg = NULL;

No need to initialize tg here.

>  ThrottleState *throttle_group_incref(const char *name)
>  {

I think that you can make this one static and remove it from
throttle-groups.h (no one else is using it). Same with
throttle_group_unref.

Berto



Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Farhan Ali



On 08/28/2017 09:24 AM, Christian Borntraeger wrote:




On 08/28/2017 03:18 PM, Farhan Ali wrote:



On 08/28/2017 09:06 AM, Cornelia Huck wrote:

On Mon, 28 Aug 2017 08:56:42 -0400
Farhan Ali  wrote:


On 08/28/2017 04:22 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 11:05:30 -0400
Farhan Ali  wrote:


On 08/25/2017 10:04 AM, Cornelia Huck wrote:

On Fri, 25 Aug 2017 09:24:46 -0400
Farhan Ali  wrote:


The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
exposes an alignment issue in ccw bios.

According to PoP the CCW must be doubleword aligned. Let's fix
this in the bios.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Farhan Ali 
Reviewed-by: Halil Pasic 
Reviewed-by: Eric Farman 
Acked-by: Christian Borntraeger 
---
 pc-bios/s390-ccw/cio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
index f5b4549..55eaeee 100644
--- a/pc-bios/s390-ccw/cio.h
+++ b/pc-bios/s390-ccw/cio.h
@@ -133,7 +133,7 @@ struct ccw1 {
 __u8 flags;
 __u16 count;
 __u32 cda;
-} __attribute__ ((packed));
+} __attribute__ ((packed, aligned(8)));

 #define CCW_FLAG_DC  0x80
 #define CCW_FLAG_CC  0x40


Currently testing.

This looks obviously right, but did you figure out what the (probably
unrelated) other failure was?



That is still under investigation, for some reason it only fails for an
LDL DASD and it works for SCSIs and CDL DASD.


Which are the symptoms of the failure? I'd like to understand this
before I update the (currently working by accident) bios with an
updated version.

I'll just apply the patch for now.



Well it's seems like the failure for LDL DASD could be a disk setup
failure. We tried the test on a different environment with LDL disks and
everything worked fine with the patch applied.


Odd that it breaks after this change, though. Do you get command
rejects, or what happens?



It's the alignment of the CCW which causes the problem.

The exact error message when starting the guest was:

! No virtio device found !

Since it worked for SCSI and CDL, and failed for LDL disks on that particular 
system, we are not really sure what caused the failure.
Debugging it further showed the CCW for LDL disks were not aligned at double 
word boundary.

Trying the test on a different system with LDL disks worked fine, with the 
aligned(8) fix.


Do you happen to have an old s390-ccw.img laying around in the test folder? 
QEMU might pick up
this one (e.g. when calling it without libvirt from the command line).

I explicitly mention the bios to use with '-bios' option and pick up the 
latest bios. Without the aligned fix I see the error and with the fix it 
works fine.





[Qemu-devel] [PATCH v3] qemu crashes when a negative number used for 'maxcpus'

2017-08-28 Thread Seeteena Thoufeek
---Steps to Reproduce---

When passed a negative number to 'maxcpus' parameter, Qemu aborts
with a core dump.

Run the following command with maxcpus argument as negative number

ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
pseries,accel=kvm,kvm-type=HV -m size=200g -device virtio-blk-pci,
drive=rootdisk -drive file=/home/images/pegas-1.0-ppc64le.qcow2,
if=none,cache=none,id=rootdisk,format=qcow2 -monitor telnet
:127.0.0.1:1234,server,nowait -net nic,model=virtio -net
user -redir tcp:2000::22 -device nec-usb-xhci -smp 8,cores=1,
threads=1,maxcpus=-12

(process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
 18446744073709550568 bytes

Trace/breakpoint trap

Reported-by: R.Nageswara Sastry 
Signed-off-by: Seeteena Thoufeek 
Reviewed-by: Bharata B Rao 
---
 vl.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 8e247cc..fb45b6d 100644
--- a/vl.c
+++ b/vl.c
@@ -1244,7 +1244,10 @@ static void smp_parse(QemuOpts *opts)
 }
 
 max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
-
+if (max_cpus <= 0) {
+error_report("Invalid max_cpus : %d", max_cpus);
+exit(1);
+}
 if (max_cpus < cpus) {
 error_report("maxcpus must be equal to or greater than smp");
 exit(1);
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Cornelia Huck
On Mon, 28 Aug 2017 09:52:08 -0400
Farhan Ali  wrote:

> On 08/28/2017 09:24 AM, Christian Borntraeger wrote:
> >
> >
> >
> > On 08/28/2017 03:18 PM, Farhan Ali wrote:  
> >>
> >>
> >> On 08/28/2017 09:06 AM, Cornelia Huck wrote:  
> >>> On Mon, 28 Aug 2017 08:56:42 -0400
> >>> Farhan Ali  wrote:
> >>>  
>  On 08/28/2017 04:22 AM, Cornelia Huck wrote:  
> > On Fri, 25 Aug 2017 11:05:30 -0400
> > Farhan Ali  wrote:
> >  
> >> On 08/25/2017 10:04 AM, Cornelia Huck wrote:  
> >>> On Fri, 25 Aug 2017 09:24:46 -0400
> >>> Farhan Ali  wrote:
> >>>  
>  The commit 198c0d1f9df8c4 s390x/css: check ccw address validity
>  exposes an alignment issue in ccw bios.
> 
>  According to PoP the CCW must be doubleword aligned. Let's fix
>  this in the bios.
> 
>  Cc: qemu-sta...@nongnu.org
>  Signed-off-by: Farhan Ali 
>  Reviewed-by: Halil Pasic 
>  Reviewed-by: Eric Farman 
>  Acked-by: Christian Borntraeger 
>  ---
>   pc-bios/s390-ccw/cio.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h
>  index f5b4549..55eaeee 100644
>  --- a/pc-bios/s390-ccw/cio.h
>  +++ b/pc-bios/s390-ccw/cio.h
>  @@ -133,7 +133,7 @@ struct ccw1 {
>   __u8 flags;
>   __u16 count;
>   __u32 cda;
>  -} __attribute__ ((packed));
>  +} __attribute__ ((packed, aligned(8)));
> 
>   #define CCW_FLAG_DC  0x80
>   #define CCW_FLAG_CC  0x40  
> >>>
> >>> Currently testing.
> >>>
> >>> This looks obviously right, but did you figure out what the (probably
> >>> unrelated) other failure was?
> >>>  
> >>
> >> That is still under investigation, for some reason it only fails for an
> >> LDL DASD and it works for SCSIs and CDL DASD.  
> >
> > Which are the symptoms of the failure? I'd like to understand this
> > before I update the (currently working by accident) bios with an
> > updated version.
> >
> > I'll just apply the patch for now.
> >  
> 
>  Well it's seems like the failure for LDL DASD could be a disk setup
>  failure. We tried the test on a different environment with LDL disks and
>  everything worked fine with the patch applied.  
> >>>
> >>> Odd that it breaks after this change, though. Do you get command
> >>> rejects, or what happens?
> >>>  
> >>
> >> It's the alignment of the CCW which causes the problem.
> >>
> >> The exact error message when starting the guest was:
> >>
> >> ! No virtio device found !
> >>
> >> Since it worked for SCSI and CDL, and failed for LDL disks on that 
> >> particular system, we are not really sure what caused the failure.
> >> Debugging it further showed the CCW for LDL disks were not aligned at 
> >> double word boundary.

This is really, really odd, as the low-level ccw code is the same for
any disk type...

> >>
> >> Trying the test on a different system with LDL disks worked fine, with the 
> >> aligned(8) fix.  
> >
> > Do you happen to have an old s390-ccw.img laying around in the test folder? 
> > QEMU might pick up
> > this one (e.g. when calling it without libvirt from the command line).
> >  
> I explicitly mention the bios to use with '-bios' option and pick up the 
> latest bios. Without the aligned fix I see the error and with the fix it 
> works fine.

Wait, so the fix fixes it? Or am I confused now?



Re: [Qemu-devel] [PATCH v15 4/5] mm: support reporting free page blocks

2017-08-28 Thread Michal Hocko
On Mon 28-08-17 15:33:26, Michal Hocko wrote:
> On Mon 28-08-17 18:08:32, Wei Wang wrote:
> > This patch adds support to walk through the free page blocks in the
> > system and report them via a callback function. Some page blocks may
> > leave the free list after zone->lock is released, so it is the caller's
> > responsibility to either detect or prevent the use of such pages.
> > 
> > One use example of this patch is to accelerate live migration by skipping
> > the transfer of free pages reported from the guest. A popular method used
> > by the hypervisor to track which part of memory is written during live
> > migration is to write-protect all the guest memory. So, those pages that
> > are reported as free pages but are written after the report function
> > returns will be captured by the hypervisor, and they will be added to the
> > next round of memory transfer.
> 
> OK, looks much better. I still have few nits.
> 
> > +extern void walk_free_mem_block(void *opaque,
> > +   int min_order,
> > +   bool (*report_page_block)(void *, unsigned long,
> > + unsigned long));
> > +
> 
> please add names to arguments of the prototype

And one more thing. Your callback returns bool and true usually means a
success while you are using it to break out from the loop. This is
rather confusing. I would expect iterating until false is returned so
the opposite than what you have. You could also change this to int and
return 0 on success and < 0 to break out. 

-- 
Michal Hocko
SUSE Labs



Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Farhan Ali



On 08/28/2017 10:05 AM, Cornelia Huck wrote:

It's the alignment of the CCW which causes the problem.

The exact error message when starting the guest was:

! No virtio device found !

Since it worked for SCSI and CDL, and failed for LDL disks on that particular 
system, we are not really sure what caused the failure.
Debugging it further showed the CCW for LDL disks were not aligned at double 
word boundary.

This is really, really odd, as the low-level ccw code is the same for
any disk type...


Exactly!


Trying the test on a different system with LDL disks worked fine, with the 
aligned(8) fix.

Do you happen to have an old s390-ccw.img laying around in the test folder? 
QEMU might pick up
this one (e.g. when calling it without libvirt from the command line).


I explicitly mention the bios to use with '-bios' option and pick up the
latest bios. Without the aligned fix I see the error and with the fix it
works fine.

Wait, so the fix fixes it? Or am I confused now?



It fixes in my system and one other system we tried on. But fails on a 
system where this issue was first noticed.





Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Halil Pasic


On 08/28/2017 04:15 PM, Farhan Ali wrote:
> 
> 
> On 08/28/2017 10:05 AM, Cornelia Huck wrote:
> It's the alignment of the CCW which causes the problem.
>
> The exact error message when starting the guest was:
>
> ! No virtio device found !
>
> Since it worked for SCSI and CDL, and failed for LDL disks on that 
> particular system, we are not really sure what caused the failure.
> Debugging it further showed the CCW for LDL disks were not aligned at 
> double word boundary.
>> This is really, really odd, as the low-level ccw code is the same for
>> any disk type...
>>
> Exactly!
> 
> Trying the test on a different system with LDL disks worked fine, with 
> the aligned(8) fix.
 Do you happen to have an old s390-ccw.img laying around in the test 
 folder? QEMU might pick up
 this one (e.g. when calling it without libvirt from the command line).

>>> I explicitly mention the bios to use with '-bios' option and pick up the
>>> latest bios. Without the aligned fix I see the error and with the fix it
>>> works fine.
>> Wait, so the fix fixes it? Or am I confused now?
>>
> 
> It fixes in my system and one other system we tried on. But fails on a system 
> where this issue was first noticed.

This is very confusing. So you have tried -bios on the system
where the issue was first noticed and the issue still persists
despite of the fixed bios is specified?




[Qemu-devel] [Bug 1713434] Re: prom-env-test test aborted and core dumped

2017-08-28 Thread Thomas Huth
That works for me - no problems with tests/prom-env-test on a POWER8
little endian system here. What host system are you using? Could you
also check what happens if you run QEMU directly like this, and post the
console output here:

ppc64-softmmu/qemu-system-ppc64 -nographic -M pseries,accel=tcg
-nodefaults -serial stdio -prom-env 'use-nvramrc?=true' -prom-env
'nvramrc=." Hello World!" cr power-off'

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713434

Title:
  prom-env-test test aborted and core dumped

Status in QEMU:
  In Progress

Bug description:
  On ppc64le architecture machine the following test case Aborted and
  Core dumped.

  # tests/prom-env-test --quiet --keep-going -m=quick --GTestLogFD=6
  **
  ERROR:tests/libqtest.c:628:qtest_get_arch: assertion failed: (qemu != NULL)
  Aborted (core dumped)

  Steps to re-produce:
  clone from the git
  configure & compile 
  run the unit tests by 'make check'

  (gdb) bt
  #0  0x3fff9d60eff0 in raise () from /lib64/libc.so.6
  #1  0x3fff9d61136c in abort () from /lib64/libc.so.6
  #2  0x3fff9de1aa04 in g_assertion_message () from /lib64/libglib-2.0.so.0
  #3  0x3fff9de1ab0c in g_assertion_message_expr () from 
/lib64/libglib-2.0.so.0
  #4  0x1000cc30 in qtest_get_arch () at tests/libqtest.c:628
  #5  0x100048f0 in main (argc=5, argv=0x32145538) at 
tests/prom-env-test.c:82
  (gdb) i r
  r0 0xfa   250
  r1 0x32144d30 70368510627120
  r2 0x3fff9d7b9900 7036709176
  r3 0x00
  r4 0x12a7 4775
  r5 0x66
  r6 0x88
  r7 0x11
  r8 0x00
  r9 0x00
  r100x00
  r110x00
  r120x00
  r130x3fff9dfa1950 70367099623760
  r140x00
  r150x00
  r160x00
  r170x00
  r180x00
  r190x00
  r200x00
  r210x00
  r220x00
  r230x00
  r240x00
  r250x00
  r260x00
  r270x100287f8 268601336
  r280x16841b40 377756480
  r290x4c   76
  r300x32144de8 70368510627304
  r310x66
  pc 0x3fff9d60eff0 0x3fff9d60eff0 
  msr0x9280f033 10376293541503627315
  cr 0x42000842 1107298370
  lr 0x3fff9d61136c 0x3fff9d61136c 
  ctr0x00
  xer0x00
  orig_r30x12a7 4775
  trap   0xc00  3072

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1713434/+subscriptions



Re: [Qemu-devel] [PATCH v3] qemu crashes when a negative number used for 'maxcpus'

2017-08-28 Thread Fam Zheng
Hi Seeteena, thanks for sending the patch and fixing the coding style!

I suggest to change subject to "vl: exit if maxcpus is negative". The subject of
a patch email is going to be the summary of the commit message when applied,
therefore it should be worded to summarize the change. Commonly there is a
"subsystem prefix" followed by a colon, like "vl:", "net:". While there is no
hard rule, the most suitable prefix can be guessed by inspecting the log of the
changed files. In the case of this patch: "git log vl.c".

A more detailed guide on submitting "perfect" QEMU patches can be found here,
in case for your future reference:

https://wiki.qemu.org/Contribute/SubmitAPatch

On Mon, 08/28 19:23, Seeteena Thoufeek wrote:
> ---Steps to Reproduce---
> 
> When passed a negative number to 'maxcpus' parameter, Qemu aborts
> with a core dump.
> 
> Run the following command with maxcpus argument as negative number
> 
> ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
> pseries,accel=kvm,kvm-type=HV -m size=200g -device virtio-blk-pci,
> drive=rootdisk -drive file=/home/images/pegas-1.0-ppc64le.qcow2,
> if=none,cache=none,id=rootdisk,format=qcow2 -monitor telnet
> :127.0.0.1:1234,server,nowait -net nic,model=virtio -net
> user -redir tcp:2000::22 -device nec-usb-xhci -smp 8,cores=1,
> threads=1,maxcpus=-12
> 
> (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
>  18446744073709550568 bytes
> 
> Trace/breakpoint trap
> 
> Reported-by: R.Nageswara Sastry 
> Signed-off-by: Seeteena Thoufeek 
> Reviewed-by: Bharata B Rao 
> ---

Would be good if there is a comment below a "---" line on how current revision
differs from the previous one, like:

---

v3: Fix coding style pointed out by patchew.


>  vl.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/vl.c b/vl.c
> index 8e247cc..fb45b6d 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1244,7 +1244,10 @@ static void smp_parse(QemuOpts *opts)
>  }
>  
>  max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
> -
> +if (max_cpus <= 0) {
> +error_report("Invalid max_cpus : %d", max_cpus);

The space before ":" can be dropped, I think.

> +exit(1);
> +}
>  if (max_cpus < cpus) {
>  error_report("maxcpus must be equal to or greater than smp");
>  exit(1);
> -- 
> 1.8.3.1
> 

Fam



Re: [Qemu-devel] [PATCH] s390-ccw: Fix alignment for CCW1

2017-08-28 Thread Farhan Ali



On 08/28/2017 10:19 AM, Halil Pasic wrote:



On 08/28/2017 04:15 PM, Farhan Ali wrote:



On 08/28/2017 10:05 AM, Cornelia Huck wrote:

It's the alignment of the CCW which causes the problem.

The exact error message when starting the guest was:

! No virtio device found !

Since it worked for SCSI and CDL, and failed for LDL disks on that particular 
system, we are not really sure what caused the failure.
Debugging it further showed the CCW for LDL disks were not aligned at double 
word boundary.

This is really, really odd, as the low-level ccw code is the same for
any disk type...


Exactly!


Trying the test on a different system with LDL disks worked fine, with the 
aligned(8) fix.

Do you happen to have an old s390-ccw.img laying around in the test folder? 
QEMU might pick up
this one (e.g. when calling it without libvirt from the command line).


I explicitly mention the bios to use with '-bios' option and pick up the
latest bios. Without the aligned fix I see the error and with the fix it
works fine.

Wait, so the fix fixes it? Or am I confused now?



It fixes in my system and one other system we tried on. But fails on a system 
where this issue was first noticed.


This is very confusing. So you have tried -bios on the system
where the issue was first noticed and the issue still persists
despite of the fixed bios is specified?


Yes.

The system where the issue was first noticed, applying the fix for the 
bios, fixes for:


1) CDL disks
2) SCSI disks

But fails for LDL disk.

On my system and one other system, the fix works for all the disk types, 
CDL, SCSI and LDL and fixes the issue.





[Qemu-devel] How to best handle the reoccurring of rom changes breaking cross version migrations?

2017-08-28 Thread Christian Ehrhardt
Hi,
migration issues due to rom changes seem to occur over and over in past
years [1], [2],[3],[4],[5].
>From the past I know several workarounds (like just truncating to the
bigger size) but all have various deficiencies.

But OTOH rom's will always change due to fixes in them. And recently I
found one such change [6] that affects the next Ubuntu release and wonder
what the ?right?, well lets say best way to fix it would be.
Current issue:
Length mismatch: :00:03.0/virtio-net-pci.rom: 0x4 in != 0x8:
Invalid argument
Due to efi-virtio.rom growing over 256k due to an update to a newer ipxe
from upstream.

I beg your pardon (for not being educated enough on this yet), but I want
to avoid to go a route that is fixing it in a sub-optimal way and ask for
some guidance. There might be better ways in the modern qemu of today than
there were in the past.
TL;DR I look for the best way (if any) to declare in the new qemu: "I know
that the old size was smaller, let me fix that up on migration".

I'll try on my own as I expect the machine type structures/mechanisms (and
we have Ubuntu specific types that could encapsulate the rom status from
the ipxe package to be coupled with the type) have all that is needed.
Yet me almost randomly trying around there surely isn't the best way to go
- so if there is some existing case or a short hint at what in there might
be the best way to fixup a changed rom size on a migration I'd be very
happy to hear about that.

[1]: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1536331
[2]: https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01881.html
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1293566
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1090093
[5]: https://forge.univention.org/bugzilla/show_bug.cgi?id=38877
[6]: https://bugs.launchpad.net/ubuntu/+source/ipxe/+bug/1713490

P.S: As everybody else I don't mind so much on reverse migration to older
releases

-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd


Re: [Qemu-devel] [PATCH v3] qemu crashes when a negative number used for 'maxcpus'

2017-08-28 Thread Bharata B Rao
Seeteena,

On Mon, Aug 28, 2017 at 7:23 PM, Seeteena Thoufeek <
s1see...@linux.vnet.ibm.com> wrote:

> ---Steps to Reproduce---
>
> When passed a negative number to 'maxcpus' parameter, Qemu aborts
> with a core dump.
>
> Run the following command with maxcpus argument as negative number
>
> ppc64-softmmu/qemu-system-ppc64 --nographic -vga none -machine
> pseries,accel=kvm,kvm-type=HV -m size=200g -device virtio-blk-pci,
> drive=rootdisk -drive file=/home/images/pegas-1.0-ppc64le.qcow2,
> if=none,cache=none,id=rootdisk,format=qcow2 -monitor telnet
> :127.0.0.1:1234,server,nowait -net nic,model=virtio -net
> user -redir tcp:2000::22 -device nec-usb-xhci -smp 8,cores=1,
> threads=1,maxcpus=-12
>
> (process:12149): GLib-ERROR **: gmem.c:130: failed to allocate
>  18446744073709550568 bytes
>
> Trace/breakpoint trap
>
> Reported-by: R.Nageswara Sastry 
> Signed-off-by: Seeteena Thoufeek 
> Reviewed-by: Bharata B Rao 
>

In the bugzilla, I was only suggesting to post the fix upstream and it
doesn't mean a Reviewed-by.  You might want to remove this in your next
version.

Regards,
Bharata.


[Qemu-devel] [PATCH] tpm: lookup cancel path under tpm device class

2017-08-28 Thread Marc-André Lureau
Since Linux 4.0 (commit 313d21eeab9282e), tpm devices have their own
device class "tpm" and the cancel path must be looked up under
/sys/class/tpm/ instead of /sys/class/misc/.

Signed-off-by: Marc-André Lureau 
---
 hw/tpm/tpm_passthrough.c | 36 ++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/hw/tpm/tpm_passthrough.c b/hw/tpm/tpm_passthrough.c
index 9234eb3459..e3e9368057 100644
--- a/hw/tpm/tpm_passthrough.c
+++ b/hw/tpm/tpm_passthrough.c
@@ -367,7 +367,8 @@ static TPMVersion 
tpm_passthrough_get_tpm_version(TPMBackend *tb)
  * Unless path or file descriptor set has been provided by user,
  * determine the sysfs cancel file following kernel documentation
  * in Documentation/ABI/stable/sysfs-class-tpm.
- * From /dev/tpm0 create /sys/class/misc/tpm0/device/cancel
+ * From /dev/tpm0 create /sys/class/tpm/tpm0/device/cancel
+ * before 4.0: /sys/class/misc/tpm0/device/cancel
  */
 static int tpm_passthrough_open_sysfs_cancel(TPMBackend *tb)
 {
@@ -379,28 +380,35 @@ static int tpm_passthrough_open_sysfs_cancel(TPMBackend 
*tb)
 if (tb->cancel_path) {
 fd = qemu_open(tb->cancel_path, O_WRONLY);
 if (fd < 0) {
-error_report("Could not open TPM cancel path : %s",
+error_report("Could not open TPM cancel path: %s",
  strerror(errno));
 }
 return fd;
 }
 
 dev = strrchr(tpm_pt->tpm_dev, '/');
-if (dev) {
-dev++;
-if (snprintf(path, sizeof(path), "/sys/class/misc/%s/device/cancel",
- dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
-if (fd >= 0) {
-tb->cancel_path = g_strdup(path);
-} else {
-error_report("tpm_passthrough: Could not open TPM cancel "
- "path %s : %s", path, strerror(errno));
+if (!dev) {
+error_report("tpm_passthrough: Bad TPM device path %s",
+ tpm_pt->tpm_dev);
+return -1;
+}
+
+dev++;
+if (snprintf(path, sizeof(path), "/sys/class/tpm/%s/device/cancel",
+ dev) < sizeof(path)) {
+fd = qemu_open(path, O_WRONLY);
+if (fd < 0) {
+if (snprintf(path, sizeof(path), 
"/sys/class/misc/%s/device/cancel",
+ dev) < sizeof(path)) {
+fd = qemu_open(path, O_WRONLY);
 }
 }
+}
+
+if (fd < 0) {
+error_report("tpm_passthrough: Could not guess TPM cancel path");
 } else {
-   error_report("tpm_passthrough: Bad TPM device path %s",
-tpm_pt->tpm_dev);
+tb->cancel_path = g_strdup(path);
 }
 
 return fd;
-- 
2.14.1.146.gd35faa819




Re: [Qemu-devel] [PATCH 2/2] tests: Make acpid test compile

2017-08-28 Thread Cédric Le Goater
On 08/23/2017 01:53 PM, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Compiler gets confused with the size of the struct, so move form
>> g_new0() to g_malloc0().
>>
>> I *think* that the problem is in gcc (or glib for that matter), but
>> the documentation of the g_new0 states that 1sts first argument is an
>> struct type, and uint32_t is not an struct type.
>>
>> Signed-off-by: Juan Quintela 
>> ---
>>  tests/vmgenid-test.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/vmgenid-test.c b/tests/vmgenid-test.c
>> index 3d5c1c3615..032e1d465a 100644
>> --- a/tests/vmgenid-test.c
>> +++ b/tests/vmgenid-test.c
>> @@ -67,7 +67,7 @@ static uint32_t acpi_find_vgia(void)
>>  g_assert_cmpint(tables_nr, >, 0);
>>  
>>  /* get the addresses of the tables pointed by rsdt */
>> -tables = g_new0(uint32_t, tables_nr);
>> +tables = g_malloc0(sizeof(uint32_t) * tables_nr);
> 
> I think there's an easier fix for this I think;
> try:
> 
> -g_assert_cmpint(tables_nr, >, 0);
> +g_assert(tables_nr > 0);

I fixed that one with :

@@ -40,7 +40,7 @@ static uint32_t acpi_find_vgia(void)
 AcpiRsdpDescriptor rsdp_table;
 uint32_t rsdt;
 AcpiRsdtDescriptorRev1 rsdt_table;
-int tables_nr;
+uint32_t tables_nr;
 uint32_t *tables;
 AcpiTableHeader ssdt_table;
 VgidTable vgid_table;


C. 



> Dave
> 
>>  ACPI_READ_ARRAY_PTR(tables, tables_nr, rsdt);
>>  
>>  for (i = 0; i < tables_nr; i++) {
>> -- 
>> 2.13.5
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 




Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()

2017-08-28 Thread Cornelia Huck
On Mon, 28 Aug 2017 10:04:44 +0200
Yi Min Zhao  wrote:

> The function trap_msix() is to check if pcistg instruction would access
> msix table entries. The correct boundary condition should be
> [table_offset, table_offset+entries*entry_size). But the current
> condition calculated misses the last entry. So let's fixup it.
> 
> Acked-by: Dong Jia Shi 
> Reviewed-by: Pierre Morel 
> Signed-off-by: Yi Min Zhao 
> ---
>  hw/s390x/s390-pci-inst.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index b7beb8c36a..eba9ffb5f2 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t 
> offset, uint8_t pcias)
>  {
>  if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>  offset >= pbdev->msix.table_offset &&
> -offset <= pbdev->msix.table_offset +
> -  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> +offset < (pbdev->msix.table_offset +
> +  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>  return 1;
>  } else {
>  return 0;

What happened before due to the miscalculation? Write to wrong memory
region?



Re: [Qemu-devel] [PATCH] qemu-iotests: Extend non-shared storage migration test (194)

2017-08-28 Thread Eric Blake
On 08/28/2017 06:29 AM, Kashyap Chamarthy wrote:
> This is the follow-up patch that was discussed[*] as part of feedback to
> qemu-iotest 194.
> 
> Changes in this patch:
> 
>   - Supply 'job-id' parameter to `drive-mirror` invocation.
> 
>   - Issue `block-job-cancel` command on the source QEMU to gracefully
> complete the mirroring operation.
> 
>   - Stop the NBD server on the destination QEMU.
> 
>   - Finally, exit once the event BLOCK_JOB_COMPLETED is emitted.
> 
> With the above, the test will also be (almost) in sync with the
> procedure outlined in the document live-block-operations.rst[+]
> (section: "QMP invocation for live storage migration with
> ``drive-mirror`` + NBD").
> 
> [*] https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg04820.html
> -- qemu-iotests: add 194 non-shared storage migration test
> [+] 
> https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst
> 
> Signed-off-by: Kashyap Chamarthy 
> ---
> I wonder:
>   - Is it worth printing the MIGRATION event state change?

I think waiting for both the BLOCK_JOB_COMPLETED and MIGRATION events
make sense (in other words, let's check both events in the expected
order, rather than just one or the other).

>   - Since we're not checking on the MIGRATION event anymore, can
> the migration state change events related code (that is triggerred
> by setting 'migrate-set-capabilities') be simply removed?

If we're going to mirror libvirt's non-shared storage migration
sequence, I think we want to keep everything, rather than drop the
migration half.

> ---
>  tests/qemu-iotests/194 | 17 -
>  tests/qemu-iotests/194.out | 14 --
>  2 files changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
> index 
> 8028111e21bed5cf4a2e8e32dc04aa5a9ea9caca..8d746be9d0033f478f11886ee93f95b0fa55bab0
>  100755
> --- a/tests/qemu-iotests/194
> +++ b/tests/qemu-iotests/194
> @@ -46,16 +46,17 @@ iotests.log('Launching NBD server on destination...')
>  iotests.log(dest_vm.qmp('nbd-server-start', addr={'type': 'unix', 'data': 
> {'path': nbd_sock_path}}))
>  iotests.log(dest_vm.qmp('nbd-server-add', device='drive0', writable=True))
>  
> -iotests.log('Starting drive-mirror on source...')
> +iotests.log('Starting `drive-mirror` on source...')
>  iotests.log(source_vm.qmp(
>'drive-mirror',
>device='drive0',
>target='nbd+unix:///drive0?socket={0}'.format(nbd_sock_path),
>sync='full',
>format='raw', # always raw, the server handles the format
> -  mode='existing'))
> +  mode='existing',
> +  job_id='mirror-job0'))
>  
> -iotests.log('Waiting for drive-mirror to complete...')
> +iotests.log('Waiting for `drive-mirror` to complete...')

So, up to here is okay,

>  iotests.log(source_vm.event_wait('BLOCK_JOB_READY'),
>  filters=[iotests.filter_qmp_event])
>  
> @@ -66,8 +67,14 @@ dest_vm.qmp('migrate-set-capabilities',
>  capabilities=[{'capability': 'events', 'state': True}])
>  iotests.log(source_vm.qmp('migrate', 
> uri='unix:{0}'.format(migration_sock_path)))
>  
> +iotests.log('Gracefully ending the `drive-mirror` job on source...')
> +iotests.log(source_vm.qmp('block-job-cancel', device='mirror-job0'))
> +
> +iotests.log('Stopping the NBD server on destination...')
> +iotests.log(dest_vm.qmp('nbd-server-stop'))
> +
>  while True:
> -event = source_vm.event_wait('MIGRATION')
> +event = source_vm.event_wait('BLOCK_JOB_COMPLETED')

And this event makes sense for catching the block-job-cancel, but I
think you STILL want to keep a while loop for catching migration as well.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] GSOC Report: Moving I/O throttling and write notifiers into block filter drivers

2017-08-28 Thread Manos Pitsidianakis
This is a GSOC project summary required for the project's final 
submission. As part of GSOC 2017, I took the project of moving two hard 
coded block layer features into filter drivers. I/O Throttling is 
implemented in block/throttle.c and before write notifiers are split 
into a driver for each user of the before write notifier API: 
block/backup.c and block/write-threshold.c. Furthermore, work began on 
block-insert-node and block-remove-node commands for the QMP interface 
to allow runtime insertion and removal of filter drivers. [0]


A lot of thanks to my mentors for their help: Alberto Garcia, Stefan 
Hajnoczi and Kevin Wolf.


Terms
=

The BlockBackend struct (block/block-backend.c) represents the backend part 
of a storage device that a VM sees in the QEMU environment. The BlockBackend 
has the responsibility to forward I/O requests from the VM down to the 
actual underlying storage; a network block device, a qcow2 image etc.


In order to allow for polymorphic storage, a BlockBackend forwards the 
requests to an acyclic graph in which the leaves are the terminal I/O 
destination, a file or network connection. The BlockDriverState struct 
represents a node in this graph, and each node is governed by a specific 
driver. Above the leaf-nodes we have format drivers that translate requests 
for each format (ie a qcow2 driver, or a raw driver). Backing files are 
implemented as chains of nodes that forward read requests to their 
children but keep write requests to themselves. This setup allows 
different node drivers to intercept requests before they reach their 
destination by being inserted into points of interest in the graph. We 
call these block filter drivers.


An existing filter driver for example is block/blkverify.c which compares 
two children for content parity and reports content mismatches.


I/O Throttling
==

I/O throttling is done by intercepting I/O requests and throttling them 
based on the configured limits (docs/throttle.txt). The interface was 
refactored into the throttle driver [1] while the throttling primitives 
were left unchanged. The already existing interface of setting limits on 
a BlockBackend device is simulated [2] by inserting a hidden to the user 
throttle filter node at the root of the BlockBackend with the user's set 
limits. Implicitly created filter nodes is not a good solution since 
some of the QEMU internals are written without considering filter nodes.  
Some patches in the throttle-remove-legacy branch are dedicated to 
changing existing behaviour to match the new concept of implicit 
filters. In the future management tools should be expected to explicitly 
add and remove filter nodes like throttle (except for transient block 
job filters which may remain implicit) and there should be no surprises 
about the state of the block layer graph for the user.


Throttle groups are categories of drives sharing the same limits with a 
round-robin algorithm. Additional effort was spent on making throttle groups 
easier to configure by turning them into a separately creatable object (with 
-object syntax on command line invocation or object-add in QMP). Their 
properties can be set with 'qom-set' commands and retrieved with 'qom-get'.



Write Notifiers
===

While a backup block job is running, it is important to have knowledge of 
writes to the relevant image. Before write notifiers pass the write requests 
to the backup job to perform copy on write on the target image with the new 
data. Currently this is done on the BlockBackend level. Other block jobs 
(commit/mirror) already create implicit nodes in the BDS chain and this 
approach was copied and a backup filter driver was created [3], internal to 
block/backup.c


The write-threshold feature once enabled via QMP, watches passing write 
requests and compares them to a user-given threshold offset. When that 
threshold is exceeded an event is delivered to the user and the feature 
disables itself. This is used for management tools that need to know when 
they have to resize their images. Like backup, this was done in the 
BlockBackend level. However it wasn't easy to replace the existing interface 
with an implicit filter node like in throttling, so only a separate driver 
was created [4] in block/write-threshold.c. Like other filter drivers, it 
can be inserted on runtime and removed once it delivers the event and is 
spent and should be removed or replaced.


Branches / Patches
==

The 'throttle' and 'throttle-remove-legacy' patches should be merged soon 
after master unfreezes from the 2.10 release. The rest of the patch series 
are in final stages of review on qemu-devel except for block-insert-node 
which is an RFC [5].


Already merged patches in 2.10
 https://github.com/qemu/qemu/commits/v2.10.0-rc4?author=epilys
Already merged patches for 2.11
 https://www.mail-archive.com/qemu-devel@nongnu.org/msg470461.html

[0] [insert-node] block-insert-node and block-remove-n

  1   2   3   4   >