Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-30 Thread tu bo

Hi Christian:

I got the same crash with qemu master + assertion patch + "[PATCH 0/6] 
virtio: refactor host notifiers" +  Paolo's fix,


(gdb) bt
#0  blk_aio_read_entry (opaque=0x0) at block/block-backend.c:916
#1  0x02aa2e8e88fe in coroutine_trampoline (i0=, 
i1=-1677703696) at util/coroutine-ucontext.c:78

#2  0x03ffa85d150a in __makecontext_ret () from /lib64/libc.so.6


On 03/30/2016 12:27 AM, Christian Borntraeger wrote:

On 03/29/2016 03:50 PM, Paolo Bonzini wrote:



On 29/03/2016 13:45, Cornelia Huck wrote:

Hi Tu Bo,

please always include the assertion patch at
https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
your tests.  Can you include the backtrace from all threads with that patch?


thanks for your reminder about the assertion patch. Here is the
backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio:
refactor host notifiers",


FWIW, I've been running this in a reboot loop for the last 2 1/2 hours.
Could you perhaps share your command line?


 From code inspection, the following is also necessary or at least a
good idea:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6fb29e3..7fa8477 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -258,7 +258,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
  aio_context_acquire(s->ctx);

  /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
+virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, false);

  /* Drain and switch bs back to the QEMU main loop */
  blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());



Right. Tu Bo, you seem to have the best testcase for this.
Does your setup runs fine with this on top?

CHristian






Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-30 Thread tu bo

Hi Christian:

On 03/29/2016 07:54 PM, Christian Borntraeger wrote:

On 03/29/2016 11:14 AM, tu bo wrote:

Hi Paolo:

On 03/29/2016 02:11 AM, Paolo Bonzini wrote:

On 28/03/2016 05:55, TU BO wrote:

Hi Cornelia:

I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host
notifiers",


Hi Tu Bo,

please always include the assertion patch at
https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
your tests.  Can you include the backtrace from all threads with that patch?


thanks for your reminder about the assertion patch. Here is the backtrace with qemu 
master + assertion patch + "[PATCH 0/6] virtio: refactor host notifiers",

I got two crashes,

1. For 1st crash,
(gdb) thread apply all bt

Thread 8 (Thread 0x3ff8daf1910 (LWP 52859)):
#0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
#1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
#2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x3ff88000fa8, ms=) at util/qemu-thread-posix.c:245
#3  0x02aa2d6803e4 in worker_thread (opaque=0x3ff88000f40) at 
thread-pool.c:92
#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 7 (Thread 0x3ff8e679910 (LWP 52856)):
#0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
#1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
#2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x2aa2e1fbfa8, ms=) at util/qemu-thread-posix.c:245
#3  0x02aa2d6803e4 in worker_thread (opaque=0x2aa2e1fbf40) at 
thread-pool.c:92
#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 6 (Thread 0x3ff9497f910 (LWP 52850)):
#0  0x03ff9718c50e in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#1  0x03ff96d19792 in g_cond_wait () from /lib64/libglib-2.0.so.0
#2  0x02aa2d7165d2 in wait_for_trace_records_available () at 
trace/simple.c:147
---Type  to continue, or q  to quit---
#3  writeout_thread (opaque=) at trace/simple.c:165
#4  0x03ff96cfa44c in g_thread_proxy () from /lib64/libglib-2.0.so.0
#5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 5 (Thread 0x3ff8efff910 (LWP 52855)):
#0  0x03ff967f819a in ioctl () from /lib64/libc.so.6
#1  0x02aa2d546f3e in kvm_vcpu_ioctl (cpu=cpu@entry=0x2aa2e239030, 
type=type@entry=44672)
 at /usr/src/debug/qemu-2.5.50/kvm-all.c:1984
#2  0x02aa2d54701e in kvm_cpu_exec (cpu=0x2aa2e239030) at 
/usr/src/debug/qemu-2.5.50/kvm-all.c:1834
#3  0x02aa2d533cd6 in qemu_kvm_cpu_thread_fn (arg=) at 
/usr/src/debug/qemu-2.5.50/cpus.c:1056
#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 4 (Thread 0x3ff951ff910 (LWP 52849)):
#0  0x03ff967fcf56 in syscall () from /lib64/libc.so.6
#1  0x02aa2d755a36 in futex_wait (val=, ev=) 
at util/qemu-thread-posix.c:292
#2  qemu_event_wait (ev=0x2aa2ddb5914 ) at 
util/qemu-thread-posix.c:399
#3  0x02aa2d765002 in call_rcu_thread (opaque=) at 
util/rcu.c:250
#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
---Type  to continue, or q  to quit---

Thread 3 (Thread 0x3ff978e0bf0 (LWP 52845)):
#0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
#1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, __nfds=, 
__fds=) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=-1) at 
qemu-timer.c:313
#3  0x02aa2d688b02 in os_host_main_loop_wait (timeout=-1) at main-loop.c:251
#4  main_loop_wait (nonblocking=) at main-loop.c:505
#5  0x02aa2d4faade in main_loop () at vl.c:1933
#6  main (argc=, argv=, envp=) at 
vl.c:4646

Thread 2 (Thread 0x3ff8910 (LWP 52851)):
#0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
#1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, __nfds=, 
__fds=) at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=-1) at 
qemu-timer.c:313
#3  0x02aa2d68a788 in aio_poll (ctx=0x2aa2de77e00, blocking=) at aio-posix.c:453
#4  0x02aa2d5b909c in iothread_run (opaque=0x2aa2de77220) at iothread.c:46
#5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x3ff8f7ff910 (LWP 52854)):
#0  0x03ff9673b650 in raise () from /lib64/libc.so.6
---Type  to continue, or q  to quit---
#1  0x03ff9673ced8 in abort () from /lib64/libc.so.6
#2  0x03ff96733666 in __assert_fail_base () from /lib64/libc.so.6
#3  0x03ff967336f4 in __assert_fail () from /lib64/libc.so.6
#4  0x02aa2d562608 in virtio_blk_handle_output (vdev=, 
vq=)
 at /usr/src/debug/qemu-2.5.50/hw/block/virtio-blk.c:595


Hmmm, are you sure that 

Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-30 Thread tu bo

Hi Christian:

On 03/30/2016 12:27 AM, Christian Borntraeger wrote:

On 03/29/2016 03:50 PM, Paolo Bonzini wrote:



On 29/03/2016 13:45, Cornelia Huck wrote:

Hi Tu Bo,

please always include the assertion patch at
https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
your tests.  Can you include the backtrace from all threads with that patch?


thanks for your reminder about the assertion patch. Here is the
backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio:
refactor host notifiers",


FWIW, I've been running this in a reboot loop for the last 2 1/2 hours.
Could you perhaps share your command line?


 From code inspection, the following is also necessary or at least a
good idea:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6fb29e3..7fa8477 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -258,7 +258,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
  aio_context_acquire(s->ctx);

  /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
+virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, false);

  /* Drain and switch bs back to the QEMU main loop */
  blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());



Right. Tu Bo, you seem to have the best testcase for this.
Does your setup runs fine with this on top?


My test needs at least four scsi disks, which was broken now because of 
the s38 firmware update. I'll test it when s38 scsi is ready. thx




CHristian






Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Christian Borntraeger
On 03/29/2016 03:50 PM, Paolo Bonzini wrote:
> 
> 
> On 29/03/2016 13:45, Cornelia Huck wrote:
 Hi Tu Bo,

 please always include the assertion patch at
 https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
 your tests.  Can you include the backtrace from all threads with that 
 patch?

>>> thanks for your reminder about the assertion patch. Here is the 
>>> backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio: 
>>> refactor host notifiers",
>>
>> FWIW, I've been running this in a reboot loop for the last 2 1/2 hours.
>> Could you perhaps share your command line?
> 
> From code inspection, the following is also necessary or at least a
> good idea:
> 
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 6fb29e3..7fa8477 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -258,7 +258,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>  aio_context_acquire(s->ctx);
> 
>  /* Stop notifications for new requests from guest */
> -virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
> +virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, false);
> 
>  /* Drain and switch bs back to the QEMU main loop */
>  blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> 

Right. Tu Bo, you seem to have the best testcase for this.
Does your setup runs fine with this on top?

CHristian




Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Paolo Bonzini


On 29/03/2016 13:45, Cornelia Huck wrote:
> > > Hi Tu Bo,
> > >
> > > please always include the assertion patch at
> > > https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
> > > your tests.  Can you include the backtrace from all threads with that 
> > > patch?
> > >
> > thanks for your reminder about the assertion patch. Here is the 
> > backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio: 
> > refactor host notifiers",
> 
> FWIW, I've been running this in a reboot loop for the last 2 1/2 hours.
> Could you perhaps share your command line?

>From code inspection, the following is also necessary or at least a
good idea:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6fb29e3..7fa8477 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -258,7 +258,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_context_acquire(s->ctx);
 
 /* Stop notifications for new requests from guest */
-virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, false, false);
+virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, false);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());



Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Michael S. Tsirkin
On Tue, Mar 29, 2016 at 03:23:57PM +0200, Christian Borntraeger wrote:
> On 03/24/2016 05:15 PM, Cornelia Huck wrote:
> > Here's the next version of my refactoring of the virtio host notifiers.
> > This one actually survives a bit of testing for me (reboot loop).
> > 
> > As this patchset fixes a latent bug exposed by the recent dataplane
> > changes (we have a deassigned ioeventfd for a short period of time
> > during dataplane start, which leads to the virtqueue handler being
> > called in both the vcpu thread and the iothread simultaneously), I'd
> > like to see this in 2.6.
> > 
> > Changes from RFC:
> > - Fixed some silly errors (checking for !disabled instead of disabled,
> >   virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
> > - Completely reworked set_host_notifier(): We only want to set/unset
> >   the actual handler function and don't want to do anything to the
> >   ioeventfd backing, so reduce the function to actually doing only
> >   that.
> > - With the change above, we can lose the 'assign' parameter in
> >   virtio_bus_stop_ioeventfd() again.
> > - Added more comments that hopefully make it clearer what is going on.
> > 
> > I'd appreciate it if people could give it some testing; I'll be back
> > to look at the fallout after Easter.
> > 
> > Cornelia Huck (6):
> >   virtio-bus: common ioeventfd infrastructure
> >   virtio-bus: have callers tolerate new host notifier api
> >   virtio-ccw: convert to ioeventfd callbacks
> >   virtio-pci: convert to ioeventfd callbacks
> >   virtio-mmio: convert to ioeventfd callbacks
> >   virtio-bus: remove old set_host_notifier callback
> > 
> >  hw/block/dataplane/virtio-blk.c |   6 +-
> >  hw/s390x/virtio-ccw.c   | 133 
> > ++--
> >  hw/scsi/virtio-scsi-dataplane.c |   9 ++-
> >  hw/virtio/vhost.c   |  13 ++--
> >  hw/virtio/virtio-bus.c  | 132 
> > +++
> >  hw/virtio/virtio-mmio.c | 128 
> > +-
> >  hw/virtio/virtio-pci.c  | 124 +
> >  include/hw/virtio/virtio-bus.h  |  31 +-
> >  8 files changed, 303 insertions(+), 273 deletions(-)
> > 
> 
> FWIW, I went back to the old F20 installation. Without your patch set qemu 
> crashes 
> pretty soon, with your patch set it runs stable.
> How intrusive would it be to provide the fix 3 times (for each transport) and
> do the refactoring after 2.6? If the result would be big as well I think
> this patch set is still the right thing to do for 2.6  unless MST has a 
> small and beautiful 2.6fix.

Definitely not beautiful but small.
It's hard for me to see what is meant here, but if it's
even bigger than this patch then I'd be worried.

> 
> 
> Christian



Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Christian Borntraeger
On 03/24/2016 05:15 PM, Cornelia Huck wrote:
> Here's the next version of my refactoring of the virtio host notifiers.
> This one actually survives a bit of testing for me (reboot loop).
> 
> As this patchset fixes a latent bug exposed by the recent dataplane
> changes (we have a deassigned ioeventfd for a short period of time
> during dataplane start, which leads to the virtqueue handler being
> called in both the vcpu thread and the iothread simultaneously), I'd
> like to see this in 2.6.
> 
> Changes from RFC:
> - Fixed some silly errors (checking for !disabled instead of disabled,
>   virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
> - Completely reworked set_host_notifier(): We only want to set/unset
>   the actual handler function and don't want to do anything to the
>   ioeventfd backing, so reduce the function to actually doing only
>   that.
> - With the change above, we can lose the 'assign' parameter in
>   virtio_bus_stop_ioeventfd() again.
> - Added more comments that hopefully make it clearer what is going on.
> 
> I'd appreciate it if people could give it some testing; I'll be back
> to look at the fallout after Easter.
> 
> Cornelia Huck (6):
>   virtio-bus: common ioeventfd infrastructure
>   virtio-bus: have callers tolerate new host notifier api
>   virtio-ccw: convert to ioeventfd callbacks
>   virtio-pci: convert to ioeventfd callbacks
>   virtio-mmio: convert to ioeventfd callbacks
>   virtio-bus: remove old set_host_notifier callback
> 
>  hw/block/dataplane/virtio-blk.c |   6 +-
>  hw/s390x/virtio-ccw.c   | 133 
> ++--
>  hw/scsi/virtio-scsi-dataplane.c |   9 ++-
>  hw/virtio/vhost.c   |  13 ++--
>  hw/virtio/virtio-bus.c  | 132 +++
>  hw/virtio/virtio-mmio.c | 128 +-
>  hw/virtio/virtio-pci.c  | 124 +
>  include/hw/virtio/virtio-bus.h  |  31 +-
>  8 files changed, 303 insertions(+), 273 deletions(-)
> 

FWIW, I went back to the old F20 installation. Without your patch set qemu 
crashes 
pretty soon, with your patch set it runs stable.
How intrusive would it be to provide the fix 3 times (for each transport) and
do the refactoring after 2.6? If the result would be big as well I think
this patch set is still the right thing to do for 2.6  unless MST has a 
small and beautiful 2.6fix.



Christian




Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Christian Borntraeger
On 03/29/2016 11:14 AM, tu bo wrote:
> Hi Paolo:
> 
> On 03/29/2016 02:11 AM, Paolo Bonzini wrote:
>> On 28/03/2016 05:55, TU BO wrote:
>>> Hi Cornelia:
>>>
>>> I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host
>>> notifiers",
>>
>> Hi Tu Bo,
>>
>> please always include the assertion patch at
>> https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
>> your tests.  Can you include the backtrace from all threads with that patch?
>>
> thanks for your reminder about the assertion patch. Here is the backtrace 
> with qemu master + assertion patch + "[PATCH 0/6] virtio: refactor host 
> notifiers",
> 
> I got two crashes,
> 
> 1. For 1st crash,
> (gdb) thread apply all bt
> 
> Thread 8 (Thread 0x3ff8daf1910 (LWP 52859)):
> #0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
> #1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
> #2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x3ff88000fa8, 
> ms=) at util/qemu-thread-posix.c:245
> #3  0x02aa2d6803e4 in worker_thread (opaque=0x3ff88000f40) at 
> thread-pool.c:92
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 7 (Thread 0x3ff8e679910 (LWP 52856)):
> #0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
> #1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
> #2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x2aa2e1fbfa8, 
> ms=) at util/qemu-thread-posix.c:245
> #3  0x02aa2d6803e4 in worker_thread (opaque=0x2aa2e1fbf40) at 
> thread-pool.c:92
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 6 (Thread 0x3ff9497f910 (LWP 52850)):
> #0  0x03ff9718c50e in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x03ff96d19792 in g_cond_wait () from /lib64/libglib-2.0.so.0
> #2  0x02aa2d7165d2 in wait_for_trace_records_available () at 
> trace/simple.c:147
> ---Type  to continue, or q  to quit---
> #3  writeout_thread (opaque=) at trace/simple.c:165
> #4  0x03ff96cfa44c in g_thread_proxy () from /lib64/libglib-2.0.so.0
> #5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 5 (Thread 0x3ff8efff910 (LWP 52855)):
> #0  0x03ff967f819a in ioctl () from /lib64/libc.so.6
> #1  0x02aa2d546f3e in kvm_vcpu_ioctl (cpu=cpu@entry=0x2aa2e239030, 
> type=type@entry=44672)
> at /usr/src/debug/qemu-2.5.50/kvm-all.c:1984
> #2  0x02aa2d54701e in kvm_cpu_exec (cpu=0x2aa2e239030) at 
> /usr/src/debug/qemu-2.5.50/kvm-all.c:1834
> #3  0x02aa2d533cd6 in qemu_kvm_cpu_thread_fn (arg=) at 
> /usr/src/debug/qemu-2.5.50/cpus.c:1056
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 4 (Thread 0x3ff951ff910 (LWP 52849)):
> #0  0x03ff967fcf56 in syscall () from /lib64/libc.so.6
> #1  0x02aa2d755a36 in futex_wait (val=, ev= out>) at util/qemu-thread-posix.c:292
> #2  qemu_event_wait (ev=0x2aa2ddb5914 ) at 
> util/qemu-thread-posix.c:399
> #3  0x02aa2d765002 in call_rcu_thread (opaque=) at 
> util/rcu.c:250
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> ---Type  to continue, or q  to quit---
> 
> Thread 3 (Thread 0x3ff978e0bf0 (LWP 52845)):
> #0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
> #1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, __nfds= out>, __fds=) at /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=, nfds=, timeout=-1) at 
> qemu-timer.c:313
> #3  0x02aa2d688b02 in os_host_main_loop_wait (timeout=-1) at 
> main-loop.c:251
> #4  main_loop_wait (nonblocking=) at main-loop.c:505
> #5  0x02aa2d4faade in main_loop () at vl.c:1933
> #6  main (argc=, argv=, envp=) 
> at vl.c:4646
> 
> Thread 2 (Thread 0x3ff8910 (LWP 52851)):
> #0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
> #1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, __nfds= out>, __fds=) at /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=, nfds=, timeout=-1) at 
> qemu-timer.c:313
> #3  0x02aa2d68a788 in aio_poll (ctx=0x2aa2de77e00, blocking= out>) at aio-posix.c:453
> #4  0x02aa2d5b909c in iothread_run (opaque=0x2aa2de77220) at iothread.c:46
> #5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x3ff8f7ff910 (LWP 52854)):
> #0  0x03ff9673b650 in raise () from /lib64/libc.so.6
> ---Type  to continue, or q  to quit---
> #1  0x03ff9673ced8 in abort () from /lib64/libc.so.6
> #2  0x03ff96733666 in __assert_fail_base () from /lib64/libc.so.6
> #3  0x03ff967336f4 in __assert_fail () 

Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Cornelia Huck
On Tue, 29 Mar 2016 17:14:21 +0800
tu bo  wrote:

> Hi Paolo:
> 
> On 03/29/2016 02:11 AM, Paolo Bonzini wrote:
> > On 28/03/2016 05:55, TU BO wrote:
> >> Hi Cornelia:
> >>
> >> I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host
> >> notifiers",
> >
> > Hi Tu Bo,
> >
> > please always include the assertion patch at
> > https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
> > your tests.  Can you include the backtrace from all threads with that patch?
> >
> thanks for your reminder about the assertion patch. Here is the 
> backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio: 
> refactor host notifiers",

FWIW, I've been running this in a reboot loop for the last 2 1/2 hours.
Could you perhaps share your command line?

> 
> I got two crashes,
> 
> 1. For 1st crash,
> (gdb) thread apply all bt

This one looks a lot like the crashes before the rework, which I don't
understand...

> 
> Thread 8 (Thread 0x3ff8daf1910 (LWP 52859)):
> #0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
> #1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
> #2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x3ff88000fa8, 
> ms=) at util/qemu-thread-posix.c:245
> #3  0x02aa2d6803e4 in worker_thread (opaque=0x3ff88000f40) at 
> thread-pool.c:92
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 7 (Thread 0x3ff8e679910 (LWP 52856)):
> #0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
> #1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
> #2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x2aa2e1fbfa8, 
> ms=) at util/qemu-thread-posix.c:245
> #3  0x02aa2d6803e4 in worker_thread (opaque=0x2aa2e1fbf40) at 
> thread-pool.c:92
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 6 (Thread 0x3ff9497f910 (LWP 52850)):
> #0  0x03ff9718c50e in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x03ff96d19792 in g_cond_wait () from /lib64/libglib-2.0.so.0
> #2  0x02aa2d7165d2 in wait_for_trace_records_available () at 
> trace/simple.c:147
> ---Type  to continue, or q  to quit---
> #3  writeout_thread (opaque=) at trace/simple.c:165
> #4  0x03ff96cfa44c in g_thread_proxy () from /lib64/libglib-2.0.so.0
> #5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 5 (Thread 0x3ff8efff910 (LWP 52855)):
> #0  0x03ff967f819a in ioctl () from /lib64/libc.so.6
> #1  0x02aa2d546f3e in kvm_vcpu_ioctl (cpu=cpu@entry=0x2aa2e239030, 
> type=type@entry=44672)
>  at /usr/src/debug/qemu-2.5.50/kvm-all.c:1984
> #2  0x02aa2d54701e in kvm_cpu_exec (cpu=0x2aa2e239030) at 
> /usr/src/debug/qemu-2.5.50/kvm-all.c:1834
> #3  0x02aa2d533cd6 in qemu_kvm_cpu_thread_fn (arg=) 
> at /usr/src/debug/qemu-2.5.50/cpus.c:1056
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 4 (Thread 0x3ff951ff910 (LWP 52849)):
> #0  0x03ff967fcf56 in syscall () from /lib64/libc.so.6
> #1  0x02aa2d755a36 in futex_wait (val=, ev= out>) at util/qemu-thread-posix.c:292
> #2  qemu_event_wait (ev=0x2aa2ddb5914 ) at 
> util/qemu-thread-posix.c:399
> #3  0x02aa2d765002 in call_rcu_thread (opaque=) at 
> util/rcu.c:250
> #4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> ---Type  to continue, or q  to quit---
> 
> Thread 3 (Thread 0x3ff978e0bf0 (LWP 52845)):
> #0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
> #1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, 
> __nfds=, __fds=) at 
> /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=, nfds=, timeout=-1) 
> at qemu-timer.c:313
> #3  0x02aa2d688b02 in os_host_main_loop_wait (timeout=-1) at 
> main-loop.c:251
> #4  main_loop_wait (nonblocking=) at main-loop.c:505
> #5  0x02aa2d4faade in main_loop () at vl.c:1933
> #6  main (argc=, argv=, envp= out>) at vl.c:4646
> 
> Thread 2 (Thread 0x3ff8910 (LWP 52851)):
> #0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
> #1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, 
> __nfds=, __fds=) at 
> /usr/include/bits/poll2.h:77
> #2  qemu_poll_ns (fds=, nfds=, timeout=-1) 
> at qemu-timer.c:313
> #3  0x02aa2d68a788 in aio_poll (ctx=0x2aa2de77e00, 
> blocking=) at aio-posix.c:453
> #4  0x02aa2d5b909c in iothread_run (opaque=0x2aa2de77220) at 
> iothread.c:46
> #5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
> #6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x3ff8f7ff910 (LWP 52854)):
> #0  

Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Paolo Bonzini


On 29/03/2016 10:18, Cornelia Huck wrote:
>> > 
>> > Tested-by: Paolo Bonzini 
>> > 
>> > Resisted 6 minutes versus 10 seconds.  At about 2.5 seconds per reboot, 
>> > that means the failure happened at the fourth reboot before, and 
>> > resisted about 150 reboots with your patches.
> 
> Thanks for testing!
> 
> Is the failure still the same? I thought I had understood the problem
> by now, and I'm wondering which hole we're still missing.

No, I just had to leave after the 6 minutes so I turned it off. :)

Paolo



Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread tu bo

Hi Paolo:

On 03/29/2016 02:11 AM, Paolo Bonzini wrote:

On 28/03/2016 05:55, TU BO wrote:

Hi Cornelia:

I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host
notifiers",


Hi Tu Bo,

please always include the assertion patch at
https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
your tests.  Can you include the backtrace from all threads with that patch?

thanks for your reminder about the assertion patch. Here is the 
backtrace with qemu master + assertion patch + "[PATCH 0/6] virtio: 
refactor host notifiers",


I got two crashes,

1. For 1st crash,
(gdb) thread apply all bt

Thread 8 (Thread 0x3ff8daf1910 (LWP 52859)):
#0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
#1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
#2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x3ff88000fa8, 
ms=) at util/qemu-thread-posix.c:245
#3  0x02aa2d6803e4 in worker_thread (opaque=0x3ff88000f40) at 
thread-pool.c:92

#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 7 (Thread 0x3ff8e679910 (LWP 52856)):
#0  0x03ff9718ec62 in do_futex_timed_wait () from /lib64/libpthread.so.0
#1  0x03ff9718ed76 in sem_timedwait () from /lib64/libpthread.so.0
#2  0x02aa2d755868 in qemu_sem_timedwait (sem=0x2aa2e1fbfa8, 
ms=) at util/qemu-thread-posix.c:245
#3  0x02aa2d6803e4 in worker_thread (opaque=0x2aa2e1fbf40) at 
thread-pool.c:92

#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 6 (Thread 0x3ff9497f910 (LWP 52850)):
#0  0x03ff9718c50e in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0

#1  0x03ff96d19792 in g_cond_wait () from /lib64/libglib-2.0.so.0
#2  0x02aa2d7165d2 in wait_for_trace_records_available () at 
trace/simple.c:147

---Type  to continue, or q  to quit---
#3  writeout_thread (opaque=) at trace/simple.c:165
#4  0x03ff96cfa44c in g_thread_proxy () from /lib64/libglib-2.0.so.0
#5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 5 (Thread 0x3ff8efff910 (LWP 52855)):
#0  0x03ff967f819a in ioctl () from /lib64/libc.so.6
#1  0x02aa2d546f3e in kvm_vcpu_ioctl (cpu=cpu@entry=0x2aa2e239030, 
type=type@entry=44672)

at /usr/src/debug/qemu-2.5.50/kvm-all.c:1984
#2  0x02aa2d54701e in kvm_cpu_exec (cpu=0x2aa2e239030) at 
/usr/src/debug/qemu-2.5.50/kvm-all.c:1834
#3  0x02aa2d533cd6 in qemu_kvm_cpu_thread_fn (arg=) 
at /usr/src/debug/qemu-2.5.50/cpus.c:1056

#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 4 (Thread 0x3ff951ff910 (LWP 52849)):
#0  0x03ff967fcf56 in syscall () from /lib64/libc.so.6
#1  0x02aa2d755a36 in futex_wait (val=, ev=out>) at util/qemu-thread-posix.c:292
#2  qemu_event_wait (ev=0x2aa2ddb5914 ) at 
util/qemu-thread-posix.c:399
#3  0x02aa2d765002 in call_rcu_thread (opaque=) at 
util/rcu.c:250

#4  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#5  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6
---Type  to continue, or q  to quit---

Thread 3 (Thread 0x3ff978e0bf0 (LWP 52845)):
#0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
#1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, 
__nfds=, __fds=) at 
/usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=-1) 
at qemu-timer.c:313
#3  0x02aa2d688b02 in os_host_main_loop_wait (timeout=-1) at 
main-loop.c:251

#4  main_loop_wait (nonblocking=) at main-loop.c:505
#5  0x02aa2d4faade in main_loop () at vl.c:1933
#6  main (argc=, argv=, envp=out>) at vl.c:4646


Thread 2 (Thread 0x3ff8910 (LWP 52851)):
#0  0x03ff967f66e6 in ppoll () from /lib64/libc.so.6
#1  0x02aa2d68928e in ppoll (__ss=0x0, __timeout=0x0, 
__nfds=, __fds=) at 
/usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=-1) 
at qemu-timer.c:313
#3  0x02aa2d68a788 in aio_poll (ctx=0x2aa2de77e00, 
blocking=) at aio-posix.c:453
#4  0x02aa2d5b909c in iothread_run (opaque=0x2aa2de77220) at 
iothread.c:46

#5  0x03ff971884c6 in start_thread () from /lib64/libpthread.so.0
#6  0x03ff96802ec2 in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x3ff8f7ff910 (LWP 52854)):
#0  0x03ff9673b650 in raise () from /lib64/libc.so.6
---Type  to continue, or q  to quit---
#1  0x03ff9673ced8 in abort () from /lib64/libc.so.6
#2  0x03ff96733666 in __assert_fail_base () from /lib64/libc.so.6
#3  0x03ff967336f4 in __assert_fail () from /lib64/libc.so.6
#4  0x02aa2d562608 in virtio_blk_handle_output (vdev=out>, vq=)

at /usr/src/debug/qemu-2.5.50/hw/block/virtio-blk.c:595
#5  0x02aa2d587464 in virtio_ccw_hcall_notify (args=) 
at 

Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-29 Thread Cornelia Huck
On Thu, 24 Mar 2016 18:06:21 +0100
Paolo Bonzini  wrote:

> On 24/03/2016 17:15, Cornelia Huck wrote:
> > Here's the next version of my refactoring of the virtio host notifiers.
> > This one actually survives a bit of testing for me (reboot loop).
> > 
> > As this patchset fixes a latent bug exposed by the recent dataplane
> > changes (we have a deassigned ioeventfd for a short period of time
> > during dataplane start, which leads to the virtqueue handler being
> > called in both the vcpu thread and the iothread simultaneously), I'd
> > like to see this in 2.6.
> 
> Tested-by: Paolo Bonzini 
> 
> Resisted 6 minutes versus 10 seconds.  At about 2.5 seconds per reboot, 
> that means the failure happened at the fourth reboot before, and 
> resisted about 150 reboots with your patches.

Thanks for testing!

Is the failure still the same? I thought I had understood the problem
by now, and I'm wondering which hole we're still missing.




Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-28 Thread Paolo Bonzini
On 28/03/2016 05:55, TU BO wrote:
> Hi Cornelia:
> 
> I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host
> notifiers",

Hi Tu Bo,

please always include the assertion patch at
https://lists.gnu.org/archive/html/qemu-block/2016-03/msg00546.html in
your tests.  Can you include the backtrace from all threads with that patch?

Thanks,

Paolo



Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-27 Thread TU BO

Hi Cornelia:

I got two crash with qemu master + "[PATCH 0/6] virtio: refactor host 
notifiers",


I can get first crash very often.
(gdb) bt
#0  blk_aio_read_entry (opaque=0x0) at block/block-backend.c:922
#1  0x02aa17a65f0e in coroutine_trampoline (i0=, 
i1=-1677713216) at util/coroutine-ucontext.c:78

#2  0x03ffabfd150a in __makecontext_ret () from /lib64/libc.so.6
(gdb) list
917static void blk_aio_read_entry(void *opaque)
918{
919BlkAioEmAIOCB *acb = opaque;
920BlkRwCo *rwco = >rwco;
921
922rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, 
rwco->qiov->size,

923  rwco->qiov, rwco->flags);
924blk_aio_complete(acb);
925}


For 2nd crash,  I just saw it once, and didn't reproduce it later.
(gdb) bt
#0  ioq_submit (s=s@entry=0x2aa3fe2cc80) at block/linux-aio.c:197
#1  0x02aa3f645b36 in qemu_laio_completion_bh (opaque=0x2aa3fe2cc80) 
at block/linux-aio.c:143

#2  0x02aa3f5ffbf0 in aio_bh_call (bh=) at async.c:65
#3  aio_bh_poll (ctx=0x2aa3fdf7e00) at async.c:93
#4  0x02aa3f60a51e in aio_dispatch (ctx=ctx@entry=0x2aa3fdf7e00) at 
aio-posix.c:306
#5  0x02aa3f60a7ca in aio_poll (ctx=0x2aa3fdf7e00, 
blocking=) at aio-posix.c:475
#6  0x02aa3f53903c in iothread_run (opaque=0x2aa3fdf7220) at 
iothread.c:46

#7  0x03ffa86084c6 in start_thread () from /lib64/libpthread.so.0
#8  0x03ffa7c82ec2 in thread_start () from /lib64/libc.so.6
(gdb) list
192struct iocb *iocbs[MAX_QUEUED_IO];
193QSIMPLEQ_HEAD(, qemu_laiocb) completed;
194
195do {
196len = 0;
197QSIMPLEQ_FOREACH(aiocb, >io_q.pending, next) {
198iocbs[len++] = >iocb;
199if (len == MAX_QUEUED_IO) {
200break;
201}


On 16/3/25 上午12:15, Cornelia Huck wrote:

Here's the next version of my refactoring of the virtio host notifiers.
This one actually survives a bit of testing for me (reboot loop).

As this patchset fixes a latent bug exposed by the recent dataplane
changes (we have a deassigned ioeventfd for a short period of time
during dataplane start, which leads to the virtqueue handler being
called in both the vcpu thread and the iothread simultaneously), I'd
like to see this in 2.6.

Changes from RFC:
- Fixed some silly errors (checking for !disabled instead of disabled,
   virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
- Completely reworked set_host_notifier(): We only want to set/unset
   the actual handler function and don't want to do anything to the
   ioeventfd backing, so reduce the function to actually doing only
   that.
- With the change above, we can lose the 'assign' parameter in
   virtio_bus_stop_ioeventfd() again.
- Added more comments that hopefully make it clearer what is going on.

I'd appreciate it if people could give it some testing; I'll be back
to look at the fallout after Easter.

Cornelia Huck (6):
   virtio-bus: common ioeventfd infrastructure
   virtio-bus: have callers tolerate new host notifier api
   virtio-ccw: convert to ioeventfd callbacks
   virtio-pci: convert to ioeventfd callbacks
   virtio-mmio: convert to ioeventfd callbacks
   virtio-bus: remove old set_host_notifier callback

  hw/block/dataplane/virtio-blk.c |   6 +-
  hw/s390x/virtio-ccw.c   | 133 ++--
  hw/scsi/virtio-scsi-dataplane.c |   9 ++-
  hw/virtio/vhost.c   |  13 ++--
  hw/virtio/virtio-bus.c  | 132 +++
  hw/virtio/virtio-mmio.c | 128 +-
  hw/virtio/virtio-pci.c  | 124 +
  include/hw/virtio/virtio-bus.h  |  31 +-
  8 files changed, 303 insertions(+), 273 deletions(-)






Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-25 Thread Fam Zheng
On Thu, 03/24 17:15, Cornelia Huck wrote:
> Here's the next version of my refactoring of the virtio host notifiers.
> This one actually survives a bit of testing for me (reboot loop).
> 
> As this patchset fixes a latent bug exposed by the recent dataplane
> changes (we have a deassigned ioeventfd for a short period of time
> during dataplane start, which leads to the virtqueue handler being
> called in both the vcpu thread and the iothread simultaneously), I'd
> like to see this in 2.6.
> 
> Changes from RFC:
> - Fixed some silly errors (checking for !disabled instead of disabled,
>   virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
> - Completely reworked set_host_notifier(): We only want to set/unset
>   the actual handler function and don't want to do anything to the
>   ioeventfd backing, so reduce the function to actually doing only
>   that.
> - With the change above, we can lose the 'assign' parameter in
>   virtio_bus_stop_ioeventfd() again.
> - Added more comments that hopefully make it clearer what is going on.
> 
> I'd appreciate it if people could give it some testing; I'll be back
> to look at the fallout after Easter.

Tested-by: Fam Zheng 

> 
> Cornelia Huck (6):
>   virtio-bus: common ioeventfd infrastructure
>   virtio-bus: have callers tolerate new host notifier api
>   virtio-ccw: convert to ioeventfd callbacks
>   virtio-pci: convert to ioeventfd callbacks
>   virtio-mmio: convert to ioeventfd callbacks
>   virtio-bus: remove old set_host_notifier callback
> 
>  hw/block/dataplane/virtio-blk.c |   6 +-
>  hw/s390x/virtio-ccw.c   | 133 
> ++--
>  hw/scsi/virtio-scsi-dataplane.c |   9 ++-
>  hw/virtio/vhost.c   |  13 ++--
>  hw/virtio/virtio-bus.c  | 132 +++
>  hw/virtio/virtio-mmio.c | 128 +-
>  hw/virtio/virtio-pci.c  | 124 +
>  include/hw/virtio/virtio-bus.h  |  31 +-
>  8 files changed, 303 insertions(+), 273 deletions(-)
> 
> -- 
> 2.6.5
> 



Re: [Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-24 Thread Paolo Bonzini


On 24/03/2016 17:15, Cornelia Huck wrote:
> Here's the next version of my refactoring of the virtio host notifiers.
> This one actually survives a bit of testing for me (reboot loop).
> 
> As this patchset fixes a latent bug exposed by the recent dataplane
> changes (we have a deassigned ioeventfd for a short period of time
> during dataplane start, which leads to the virtqueue handler being
> called in both the vcpu thread and the iothread simultaneously), I'd
> like to see this in 2.6.

Tested-by: Paolo Bonzini 

Resisted 6 minutes versus 10 seconds.  At about 2.5 seconds per reboot, 
that means the failure happened at the fourth reboot before, and 
resisted about 150 reboots with your patches.

My testcase was to add "systemd.unit=reboot.target" to a Fedora 21's 
kernel command line and run the following

./+build/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -m 512 \
  -smp 4 -serial mon:stdio -display none -object iothread,id=io \
  -drive if=none,id=hd,file=/vm/virt_test/images/jeos-21-64.qcow2 \
  -device virtio-blk-pci,drive=hd,iothread=io \
  -drive if=none,file=null-co://,id=n1 \
  -drive if=none,file=null-co://,id=n2 \
  -drive if=none,file=null-co://,id=n3 \
  -drive if=none,file=null-co://,id=n4 \
  -drive if=none,file=null-co://,id=n5 \
  -drive if=none,file=null-co://,id=n6 \
  -drive if=none,file=null-co://,id=n7 \
  -drive if=none,file=null-co://,id=n8 \
  -device virtio-blk-pci,iothread=io,drive=n1 \
  -device virtio-blk-pci,iothread=io,drive=n2 \
  -device virtio-blk-pci,iothread=io,drive=n3 \
  -device virtio-blk-pci,iothread=io,drive=n4 \
  -device virtio-blk-pci,iothread=io,drive=n5 \
  -device virtio-blk-pci,iothread=io,drive=n6 \
  -device virtio-blk-pci,iothread=io,drive=n7 \
  -device virtio-blk-pci,iothread=io,drive=n8

with the assertion patch applied:

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cb710f1..d0b8248 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -591,6 +591,7 @@
 return;
 }
 
+assert(atomic_fetch_inc(>reentrancy_test) == 0);
 blk_io_plug(s->blk);
 
 while ((req = virtio_blk_get_request(s))) {
@@ -602,6 +603,7 @@
 }
 
 blk_io_unplug(s->blk);
+atomic_dec(>reentrancy_test);
 }
 
 static void virtio_blk_dma_restart_bh(void *opaque)
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index ae84d92..5cb66cd 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -54,6 +54,7 @@ typedef struct VirtIOBlock {
 bool original_wce;
 VMChangeStateEntry *change;
 bool dataplane_started;
+int reentrancy_test;
 struct VirtIOBlockDataPlane *dataplane;
 } VirtIOBlock;
 

Thanks for your help!

Paolo

> Changes from RFC:
> - Fixed some silly errors (checking for !disabled instead of disabled,
>   virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
> - Completely reworked set_host_notifier(): We only want to set/unset
>   the actual handler function and don't want to do anything to the
>   ioeventfd backing, so reduce the function to actually doing only
>   that.
> - With the change above, we can lose the 'assign' parameter in
>   virtio_bus_stop_ioeventfd() again.
> - Added more comments that hopefully make it clearer what is going on.
> 
> I'd appreciate it if people could give it some testing; I'll be back
> to look at the fallout after Easter.
> 
> Cornelia Huck (6):
>   virtio-bus: common ioeventfd infrastructure
>   virtio-bus: have callers tolerate new host notifier api
>   virtio-ccw: convert to ioeventfd callbacks
>   virtio-pci: convert to ioeventfd callbacks
>   virtio-mmio: convert to ioeventfd callbacks
>   virtio-bus: remove old set_host_notifier callback
> 
>  hw/block/dataplane/virtio-blk.c |   6 +-
>  hw/s390x/virtio-ccw.c   | 133 
> ++--
>  hw/scsi/virtio-scsi-dataplane.c |   9 ++-
>  hw/virtio/vhost.c   |  13 ++--
>  hw/virtio/virtio-bus.c  | 132 +++
>  hw/virtio/virtio-mmio.c | 128 +-
>  hw/virtio/virtio-pci.c  | 124 +
>  include/hw/virtio/virtio-bus.h  |  31 +-
>  8 files changed, 303 insertions(+), 273 deletions(-)
> 



[Qemu-devel] [PATCH 0/6] virtio: refactor host notifiers

2016-03-24 Thread Cornelia Huck
Here's the next version of my refactoring of the virtio host notifiers.
This one actually survives a bit of testing for me (reboot loop).

As this patchset fixes a latent bug exposed by the recent dataplane
changes (we have a deassigned ioeventfd for a short period of time
during dataplane start, which leads to the virtqueue handler being
called in both the vcpu thread and the iothread simultaneously), I'd
like to see this in 2.6.

Changes from RFC:
- Fixed some silly errors (checking for !disabled instead of disabled,
  virtio_ccw_stop_ioeventfd() calling virtio_bus_start_ioeventfd()).
- Completely reworked set_host_notifier(): We only want to set/unset
  the actual handler function and don't want to do anything to the
  ioeventfd backing, so reduce the function to actually doing only
  that.
- With the change above, we can lose the 'assign' parameter in
  virtio_bus_stop_ioeventfd() again.
- Added more comments that hopefully make it clearer what is going on.

I'd appreciate it if people could give it some testing; I'll be back
to look at the fallout after Easter.

Cornelia Huck (6):
  virtio-bus: common ioeventfd infrastructure
  virtio-bus: have callers tolerate new host notifier api
  virtio-ccw: convert to ioeventfd callbacks
  virtio-pci: convert to ioeventfd callbacks
  virtio-mmio: convert to ioeventfd callbacks
  virtio-bus: remove old set_host_notifier callback

 hw/block/dataplane/virtio-blk.c |   6 +-
 hw/s390x/virtio-ccw.c   | 133 ++--
 hw/scsi/virtio-scsi-dataplane.c |   9 ++-
 hw/virtio/vhost.c   |  13 ++--
 hw/virtio/virtio-bus.c  | 132 +++
 hw/virtio/virtio-mmio.c | 128 +-
 hw/virtio/virtio-pci.c  | 124 +
 include/hw/virtio/virtio-bus.h  |  31 +-
 8 files changed, 303 insertions(+), 273 deletions(-)

-- 
2.6.5