"Daniel P. Berrange" <berra...@redhat.com> writes:

> When testing with the new "-M none" arg, I've noticed that ~70% of the
> time libvirt starts QEMU, QEMU dies with a SEGV and the following
> stack trace:
>
> (gdb) bt
> #0  0x0000000000000000 in ?? ()
> #1  0x000055555567a37f in json_lexer_feed_char (lexer=0x55555658fb20, ch=123 '{', flush=false) at json-lexer.c:324
> #2  0x000055555567a4aa in json_lexer_feed (lexer=0x55555658fb20, buffer=0x7fffffffe7b7 "{", size=1) at json-lexer.c:356
> #3  0x000055555567c708 in json_message_parser_feed (parser=0x55555658fb18, buffer=0x7fffffffe7b7 "{", size=1) at json-streamer.c:110
> #4  0x0000555555882861 in monitor_control_read (opaque=0x55555658f6a0, buf=0x7fffffffe7b7 "{", size=1) at /home/berrange/src/virt/qemu/monitor.c:4768
> #5  0x000055555579b051 in qemu_chr_be_write (s=0x55555658dc10, buf=0x7fffffffe7b7 "{", len=1) at qemu-char.c:164
> #6  0x000055555579c9c8 in stdio_read (opaque=0x55555658dc10) at qemu-char.c:720
> #7  0x000055555567941f in qemu_iohandler_poll (readfds=0x5555560f17c0, writefds=0x5555560f1840, xfds=0x5555560f18c0, ret=2) at iohandler.c:122
> #8  0x000055555577166a in main_loop_wait (nonblocking=0) at main-loop.c:497
> #9  0x000055555576956b in main_loop () at /home/berrange/src/virt/qemu/vl.c:1643
> #10 0x0000555555770239 in main (argc=10, argv=0x7fffffffeca8, envp=0x7fffffffed00) at /home/berrange/src/virt/qemu/vl.c:3755
>
>
> Stack frame #1 there is doing this:
>
>   lexer->emit(lexer, lexer->token, JSON_ERROR, lexer->x, lexer->y);
>
> GDB confirms that the 'emit' field has not yet been initialized.
>
> In the case of QMP, this is initialized by the following sequence:
>
>  - main
>  - chardev_init_func
>  - qemu_chr_generic_open
>
>  ...async from event loop...
>
>  - main_loop
>  - qemu_chr_generic_open_bh
>  - monitor_control_event
>  - json_message_parser_init
>  - json_lexer_init
>
>
> The problem arises if you try to feed data to QEMU before the bottom
> half has run. There is a race where qemu_chr_be_write can be called
> to process input, before the qemu_chr_generic_open_bh has been
> invoked.
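
The race described above can be modelled with a small standalone sketch. This is not QEMU code: ToyLexer, toy_lexer_init and toy_lexer_feed are hypothetical names, and the NULL check is only one possible way to make early input harmless. It shows why feeding a character before the init step (the bottom half, in the real sequence) ends in a call through a NULL function pointer, which matches frame #0 at address 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lexer: 'emit' stays NULL until init runs. */
typedef struct ToyLexer {
    void (*emit)(struct ToyLexer *lexer, char ch);
    int emitted; /* number of characters passed to emit */
} ToyLexer;

/* Trivial emit callback that only counts what it receives. */
static void toy_emit(ToyLexer *lexer, char ch)
{
    (void)ch;
    lexer->emitted++;
}

/* Models the initialization that happens from the open bottom half. */
static void toy_lexer_init(ToyLexer *lexer)
{
    assert(lexer != NULL);
    lexer->emit = toy_emit;
    lexer->emitted = 0;
}

/*
 * Models the read path. Returns 0 on success, or -1 if input arrived
 * before init. Without the check, the call below would jump to
 * address 0, as in the backtrace.
 */
static int toy_lexer_feed(ToyLexer *lexer, char ch)
{
    assert(lexer != NULL);
    if (lexer->emit == NULL) {
        return -1;
    }
    lexer->emit(lexer, ch);
    return 0;
}
```

Feeding '{' to a zero-initialized ToyLexer returns -1 in this model; after toy_lexer_init() the same call succeeds.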

The char layer really just needs to be thrown away and rewritten :-(  It
really is a giant steaming pile...

I sent a simple patch that fixes this problem for the monitor.

Regards,

Anthony Liguori

>
> This can actually be quite easily demonstrated (at least on my system):
>
>  # echo "{" | qemu-system-x86_64 -nodefaults -nographic -M none -qmp stdio
>  Segmentation fault
>
> If you remove the '-M none' argument, you won't hit this race condition 99%
> of the time, but I have occasionally been able to see it.
>
> It isn't clear to me what to change to solve this race condition. Probably,
> though, the I/O handlers for a char device should not be registered until the
> open bottom half has completed.
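
The idea of holding back the I/O handlers until the open bottom half has run could look roughly like the sketch below. It is a toy model under stated assumptions, not the char layer's real API: ToyChardev, toy_chr_add_handler, toy_chr_open_bh, toy_chr_poll and toy_count_read are all hypothetical names. The front end may request a handler at any time, but the event loop only sees it once the open step has completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*ToyReadFn)(void *opaque, const char *buf, size_t len);

/* Hypothetical char device with a requested and an active handler. */
typedef struct ToyChardev {
    bool opened;            /* set once the open bottom half has run */
    ToyReadFn pending_read; /* handler requested by the front end */
    void *pending_opaque;
    ToyReadFn active_read;  /* handler the event loop may actually call */
    void *active_opaque;
} ToyChardev;

/* Example handler: adds the received length to a size_t counter. */
static void toy_count_read(void *opaque, const char *buf, size_t len)
{
    (void)buf;
    *(size_t *)opaque += len;
}

/* Front end asks for a handler; it goes live only if already open. */
static void toy_chr_add_handler(ToyChardev *chr, ToyReadFn fn, void *opaque)
{
    assert(chr != NULL);
    chr->pending_read = fn;
    chr->pending_opaque = opaque;
    if (chr->opened) {
        chr->active_read = fn;
        chr->active_opaque = opaque;
    }
}

/* Models the open bottom half: mark open, then activate the handler. */
static void toy_chr_open_bh(ToyChardev *chr)
{
    assert(chr != NULL);
    chr->opened = true;
    chr->active_read = chr->pending_read;
    chr->active_opaque = chr->pending_opaque;
}

/* Models the event loop seeing input. Returns the bytes delivered. */
static size_t toy_chr_poll(ToyChardev *chr, const char *buf, size_t len)
{
    assert(chr != NULL);
    if (chr->active_read == NULL) {
        return 0; /* nothing registered yet, so the input is not consumed */
    }
    chr->active_read(chr->active_opaque, buf, len);
    return len;
}
```

In this model a poll before toy_chr_open_bh() delivers nothing, so the front end never sees data it is not ready to parse; the same poll afterwards reaches the handler.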
>
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
