Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-12-11 Thread Paolo Bonzini


On 09/09/2014 17:28, Kirill Batuzov wrote:
 In short: QEMU serial port transmits data as fast as it can ignoring
 baud rate completely. As a result we are stuck in serial8250_interrupt
 ISR in kernel most of the time.
 
 Overall we have a large issue with rate control and flow control for
 virtual serial port implementations. In QEMU we have over a dozen different
 UARTs for different platforms. Among them only one uses baud rate
 (strongarm) and only one implements flow control (16550A).
 
 CC'ing some people to discuss general course of action in regards to
 serial implementation.
 
 We probably want some abstract serial which is able to transmit one
 character with fixed baud rate and maximum retry count. The actual
 serial port implementations should implement interrupts, control
 registers and FIFO around it. With such a design we will not need to
 implement the same bits of rate control and retry logic for every UART
 in QEMU.
 
 Any thoughts on this?

The baud rate used to be there in the 8250 emulation.  It was removed
when flow control was added (why?), in commit fcfb4d6a.

Adding it back would be nice.

I think an abstraction of the serial port concept that is based on the
16550A register set, and with arbitrarily sized FIFOs, would be nice to
have.  All device models would talk to such an abstraction.  (The 16550A is
well known and a lot of devices take inspiration from it.)
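
To make the idea concrete, here is a minimal sketch of what such a core
could look like. It is not QEMU code: the class and method names are
hypothetical, only three line status register bits are modelled (with their
standard 16550A values), and there is no transmit shift register, so THRE
and TEMT always change together here.

```python
from collections import deque

# Line status register bits, using the 16550A bit positions.
LSR_DR = 0x01    # received data ready
LSR_THRE = 0x20  # transmit holding register / TX FIFO empty
LSR_TEMT = 0x40  # transmitter idle


class Uart16550Core:
    """Hypothetical register-level core with configurable FIFO depths."""

    def __init__(self, tx_fifo_size=16, rx_fifo_size=16):
        self.tx_fifo = deque()
        self.rx_fifo = deque()
        self.tx_fifo_size = tx_fifo_size
        self.rx_fifo_size = rx_fifo_size
        self.rx_overruns = 0

    # Guest-facing side: what a device model would forward register
    # accesses to.
    def write_thr(self, byte):
        """Queue a byte for transmission; False if the TX FIFO is full."""
        if len(self.tx_fifo) >= self.tx_fifo_size:
            return False
        self.tx_fifo.append(byte & 0xFF)
        return True

    def read_rbr(self):
        """Pop the oldest received byte, or 0 if nothing is pending."""
        return self.rx_fifo.popleft() if self.rx_fifo else 0

    def read_lsr(self):
        lsr = 0
        if self.rx_fifo:
            lsr |= LSR_DR
        if not self.tx_fifo:
            # No shift register in this sketch, so both bits go together.
            lsr |= LSR_THRE | LSR_TEMT
        return lsr

    # Backend-facing side: what the character backend would call.
    def pop_tx(self):
        """Take one byte to put on the wire, or None if the FIFO is empty."""
        return self.tx_fifo.popleft() if self.tx_fifo else None

    def push_rx(self, byte):
        """Deliver a received byte; count an overrun if the RX FIFO is full."""
        if len(self.rx_fifo) >= self.rx_fifo_size:
            self.rx_overruns += 1
            return False
        self.rx_fifo.append(byte & 0xFF)
        return True
```

A device model with a 64-byte FIFO would simply instantiate
Uart16550Core(tx_fifo_size=64, rx_fifo_size=64) and map its own register
layout onto these calls.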

Paolo



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-12-11 Thread Peter Maydell
On 11 December 2014 at 14:56, Paolo Bonzini pbonz...@redhat.com wrote:
 I think an abstraction of the serial port concept that is based on the
 16550A register set, and with arbitrarily sized FIFOs, would be nice to
 have.  All device models would talk to such an abstraction.  (The 16550A is
 well known and a lot of devices take inspiration from it.)

An abstraction of that from the specifics of the PC's serial port
might be nice, yes. (omap_uart.c has to jump through some ugly hoops
currently, with more ugliness in the out-of-tree omap3 extensions.)
However I don't think it makes sense to force all serial port
device models to go through it -- not all the world is a 16550A
and some UARTs simply are different.

thanks
-- PMM



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-12-11 Thread Paolo Bonzini


On 11/12/2014 16:06, Peter Maydell wrote:
 An abstraction of that from the specifics of the PC's serial port
 might be nice, yes. (omap_uart.c has to jump through some ugly hoops
 currently, with more ugliness in the out-of-tree omap3 extensions.)
 However I don't think it makes sense to force all serial port
 device models to go through it -- not all the world is a 16550A
 and some UARTs simply are different.

No, definitely not.  There's still qemu-char, this would just be an
abstraction providing useful stuff like baud rates and flow control.

Paolo



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-09 Thread Kirill Batuzov
On Fri, 5 Sep 2014, Andrey Korolyov wrote:

 
  Heh, it is kernel- (defaults-) dependent after all. Debian hangs
  always, on 3.10, 3.14 and 3.16, Fedora 20 works fine on 3.15. I'll
  check if there are any 82550-specific patches in Fedora tree a bit
  later.
 
 
 It is a setting-dependent issue, checked this. Though I am still
 searching which option causing such a huge difference, vast majority
 of distros with default kernels hanged completely during test. Stock
 SuSE/CentOS/Debian kernels can be used for testing.


I finally managed to reproduce it with a Debian live image. The resulting
command line was:

qemu-system-x86_64 -enable-kvm -m 512 -smp 12  \
  -cdrom debian-live-7.6.0-amd64-standard.iso

Commands to run in guest console:
# yes > /dev/ttyS0 &
# yes > /dev/ttyS0 &

Looks like it is the old "serial8250: too much work for irq4" bug.

In short: the QEMU serial port transmits data as fast as it can, ignoring the
baud rate completely. As a result we are stuck in the serial8250_interrupt
ISR in the kernel most of the time.
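
For scale, here is a small worked calculation of what a real port at the
configured rate would allow. It is only arithmetic, assuming the usual
framing of one start bit plus data, parity and stop bits; the function names
are made up for this illustration.

```python
def char_time_seconds(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Time one character occupies on the wire, including the start bit."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits
    return frame_bits / baud


def max_chars_per_second(baud, **frame):
    """Upper bound on throughput for the given baud rate and framing."""
    return 1.0 / char_time_seconds(baud, **frame)


if __name__ == "__main__":
    for baud in (9600, 115200):
        t = char_time_seconds(baud)
        rate = max_chars_per_second(baud)
        print(f"{baud} 8N1: {t * 1e6:.1f} us per character, "
              f"about {rate:.0f} characters per second")
```

At 9600n8 that is a little over a millisecond per character, so hardware
would raise at most about a thousand transmit interrupts per second. An
emulated port that completes every write immediately has no such bound.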

Overall we have a large issue with rate control and flow control for
virtual serial port implementations. In QEMU we have over a dozen different
UARTs for different platforms. Among them only one uses baud rate
(strongarm) and only one implements flow control (16550A).

CC'ing some people to discuss general course of action in regards to
serial implementation.

We probably want some abstract serial which is able to transmit one
character with fixed baud rate and maximum retry count. The actual
serial port implementations should implement interrupts, control
registers and FIFO around it. With such a design we will not need to
implement the same bits of rate control and retry logic for every UART
in QEMU.
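
A rough sketch of what I mean, under stated assumptions: the class name,
the callback signature and the virtual clock are all hypothetical, and
charging one full character time per retry is just one possible policy, not
what any existing device model does.

```python
class PacedTransmitter:
    """Sends one byte at a time, charging one character time per attempt."""

    def __init__(self, baud, backend_write, max_retries=4, frame_bits=10):
        # frame_bits=10 assumes 8N1: start bit + 8 data bits + stop bit.
        self.char_time_ns = frame_bits * 1_000_000_000 // baud
        self.backend_write = backend_write  # callable(byte) -> bool
        self.max_retries = max_retries
        self.now_ns = 0    # virtual clock, advanced only by transmit()
        self.dropped = 0   # bytes given up on after max_retries

    def transmit(self, byte):
        """Try to deliver one byte; return True if the backend accepted it."""
        for _attempt in range(1 + self.max_retries):
            self.now_ns += self.char_time_ns
            if self.backend_write(byte):
                return True
        self.dropped += 1
        return False


if __name__ == "__main__":
    sent = []
    tx = PacedTransmitter(9600, lambda b: sent.append(b) or True)
    for b in b"y\n" * 480:
        tx.transmit(b)
    print(len(sent), "bytes took", tx.now_ns / 1e9, "virtual seconds")
```

A UART model would keep its own FIFO and interrupt logic and call
transmit() for each byte it shifts out, raising its "transmitter empty"
interrupt only once the virtual clock has caught up.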

Any thoughts on this?

-- 
Kirill



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-09 Thread Andrey Korolyov
On Tue, Sep 9, 2014 at 7:28 PM, Kirill Batuzov batuz...@ispras.ru wrote:
 On Fri, 5 Sep 2014, Andrey Korolyov wrote:

 
  Heh, it is kernel- (defaults-) dependent after all. Debian hangs
  always, on 3.10, 3.14 and 3.16, Fedora 20 works fine on 3.15. I'll
  check if there are any 82550-specific patches in Fedora tree a bit
  later.


 It is a setting-dependent issue, checked this. Though I am still
 searching which option causing such a huge difference, vast majority
 of distros with default kernels hanged completely during test. Stock
 SuSE/CentOS/Debian kernels can be used for testing.


 I managed to reproduce it finally with debian live image. Resulting
 command line was:

 qemu-system-x86_64 -enable-kvm -m 512 -smp 12  \
   -cdrom debian-live-7.6.0-amd64-standard.iso

 Commands to run in guest console:
 # yes  /dev/ttyS0 
 # yes  /dev/ttyS0 

 Looks like it is the old serial8250: too much work for irq4 bug.

Exactly, maybe I was unclear about this earlier. The only problem is that
the current emulator is very happy to hang with certain guest kernel
settings (I postponed searching for the magic options which allow the Fedora
kernel to work after trying some obvious ones, like the defaults in the timer
subsystem).


 In short: QEMU serial port transmits data as fast as it can ignoring
 baud rate completely. As a result we are stuck in serial8250_interrupt
 ISR in kernel most of the time.

That hardly explains how more than one thread gets locked up, at
least to me. You can see in the surviving Fedora case that just one
core is eaten up, as it should be. Maybe after a couple of NMIs have
fired it is possible to lock up multiple cores, but I don't have a better
explanation.


 Overall we have a large issue with rate control and flow control for
 virtual serial port implementations. In QEMU we have over a dozen different
 UARTs for different platforms. Among them only one uses baud rate
 (strongarm) and only one implements flow control (16550A).

 CC'ing some people to discuss general course of action in regards to
 serial implementation.

 We probably want some abstract serial which is able to transmit one
 character with fixed baud rate and maximum retry count. The actual
 serial port implementations should implement interrupts, control
 registers and FIFO around it. With such a design we will not need to
 implement the same bits of rate control and retry logic for every UART
 in QEMU.

 Any thoughts on this?

 --
 Kirill



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-05 Thread Andrey Korolyov
On Thu, Sep 4, 2014 at 8:03 PM, Andrey Korolyov and...@xdel.ru wrote:
 On Thu, Sep 4, 2014 at 5:33 PM, Kirill Batuzov batuz...@ispras.ru wrote:
 On Thu, 4 Sep 2014, Andrey Korolyov wrote:

 Thanks, the launch string can be borrowed from attach here:
 http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html,
 the same VM is going under test.


 By hang I mean stopping ability to send icmp replies, it is like a
 kind of a watermark for issues I count serious after. Just tested
 again, the ceiling is not exactly representing all available cpu quota
 *every* time but is rounded by seemingly random count of cores from 3.
 to 9 in mine series of tests, with quota limit of 12. VM became
 unresponsive in matter of seconds, consumption raising by 'clicking'
 core count for about a half of minute, stabilizing then. Guest args
 are console=tty0 console=ttyS0,9600n8.



 I modified your command line a bit to match my environment:
  - removed block drive and related options,
  - changed network configuration from vhost to tap,
  - changed bios to default one shipped with QEMU 2.1,
  - added parameters to run aboriginal linux.

 Neither of these should affect logic of serial port, yet I was not able
 to reproduce the bug again. Any ideas what am I missing?

 My command line looks like this:

 qemu-system-x86_64 -enable-kvm -name vm29180 -machine 
 pc-i440fx-2.1,accel=kvm,usb=off \
   -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
   -smp 12,sockets=1,cores=12,threads=12 -numa 
 node,nodeid=0,cpus=0-11,mem=512 \
   -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config 
 -nodefaults \
   -device sga -chardev 
 socket,id=charmonitor,path=vm29180.monitor,server,nowait \
   -mon chardev=charmonitor,id=monitor,mode=control -rtc 
 base=utc,driftfix=slew \
   -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
   -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot 
 strict=on \
   -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
   -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
   -chardev pty,id=charserial0 -device 
 isa-serial,chardev=charserial0,id=serial0 \
   -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
   -device 
 virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
  \
   -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
   -object memory-backend-ram,id=mem0,size=512M -device 
 pc-dimm,id=dimm0,node=0,memdev=mem0 \
   -object memory-backend-ram,id=mem1,size=512M -device 
 pc-dimm,id=dimm1,node=0,memdev=mem1 \
   -object memory-backend-ram,id=mem2,size=512M -device 
 pc-dimm,id=dimm2,node=0,memdev=mem2 \
   -object memory-backend-ram,id=mem3,size=512M -device 
 pc-dimm,id=dimm3,node=0,memdev=mem3 \
   -object memory-backend-ram,id=mem4,size=512M -device 
 pc-dimm,id=dimm4,node=0,memdev=mem4 \
   -object memory-backend-ram,id=mem5,size=512M -device 
 pc-dimm,id=dimm5,node=0,memdev=mem5 \
   -object memory-backend-ram,id=mem6,size=512M -device 
 pc-dimm,id=dimm6,node=0,memdev=mem6 \
   -hda hda.sqf -kernel bzImage \
   -append root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 
 HOST=i686 \
   -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no

 PS: I did not specify baud rate of serial console because init in my
 rootfs does not like it. From linux kernel documentation it should be
 9600n8 by default.

 --
 Kirill

 Heh, it is kernel- (defaults-) dependent after all. Debian hangs
 always, on 3.10, 3.14 and 3.16, Fedora 20 works fine on 3.15. I'll
 check if there are any 82550-specific patches in Fedora tree a bit
 later.


It is a setting-dependent issue, I checked this. Though I am still
searching for the option that causes such a huge difference, the vast majority
of distros with default kernels hung completely during the test. Stock
SuSE/CentOS/Debian kernels can be used for testing.



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-04 Thread Kirill Batuzov
On Wed, 3 Sep 2014, Andrey Korolyov wrote:

 Given 2.1 and isa-serial output, set as ttyS0 for the guest VM with
 9600 baud rate.

 The test case is quite simple - display as much data as possible over
 serial console and do not hang the system. While qemu-1.1 works
 perfectly, with complaining for lost interrupts (known bug for used
 guest kernel), 2.1 just hangs after some seconds, eating up all
 available cpu quota.
 
 Test case is 'while true; do dmesg; done' in serial console. I'd like to
 ask to consider this bug as very serious as VM going completely
 unresponsive in matter of tens of seconds and there are a lot of side
 attacks to produce enough number of printk() to the ttyS0 with serial
 console being set up and default settings for almost any distro in
 such a way that message suppression would not work and VM can be DoSed
 by an unprivileged user.
 


I tried to reproduce the described behaviour with aboriginal linux and
QEMU 2.1.0 but without luck.

The configurations I tried:

qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
  -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686"

qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
  -append "root=/dev/hda rw init=/bin/ash panic=1 console=ttyS0,9600 HOST=i686"

Even with all that output the system did not hang. In particular, I could
always switch to the QEMU monitor and stop the VM from there.

Can you give an exact QEMU command line which leads to the bug?

-- 
Kirill



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-04 Thread Andrey Korolyov
On Thu, Sep 4, 2014 at 1:46 PM, Kirill Batuzov batuz...@ispras.ru wrote:
 On Wed, 3 Sep 2014, Andrey Korolyov wrote:

 Given 2.1 and isa-serial output, set as ttyS0 for the guest VM with
 9600 baud rate.

 The test case is quite simple - display as much data as possible over
 serial console and do not hang the system. While qemu-1.1 works
 perfectly, with complaining for lost interrupts (known bug for used
 guest kernel), 2.1 just hangs after some seconds, eating up all
 available cpu quota.

 Test case is 'while true; do dmesg; done' in serial console. I'd like to
 ask to consider this bug as very serious as VM going completely
 unresponsive in matter of tens of seconds and there are a lot of side
 attacks to produce enough number of printk() to the ttyS0 with serial
 console being set up and default settings for almost any distro in
 such a way that message suppression would not work and VM can be DoSed
 by an unprivileged user.



 I tried to reproduce the described behaviour with aboriginal linux and
 QEMU 2.1.0 but without luck.

 The configurations I tried:

 qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
   -append root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 
 HOST=i686

 qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
   -append root=/dev/hda rw init=/bin/ash panic=1 console=ttyS0,9600 
 HOST=i686

 With all output the system did not hang. In particular I always could
 switch to QEMU monitor and stop the VM from there.

 Can you give an exact QEMU command line which leads to the bug?

 --
 Kirill


Thanks, the launch string can be taken from the attachment here:
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html,
the same VM is under test.


By hang I mean that the VM stops sending ICMP replies; that is the
watermark past which I count an issue as serious. Just tested
again: the ceiling does not represent all of the available CPU quota
*every* time, but is rounded to a seemingly random count of cores, from 3
to 9 in my series of tests, with a quota limit of 12. The VM became
unresponsive in a matter of seconds, with consumption rising by 'clicking'
up the core count for about half a minute and then stabilizing. Guest args
are console=tty0 console=ttyS0,9600n8.



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-04 Thread Kirill Batuzov
On Thu, 4 Sep 2014, Andrey Korolyov wrote:
 
 Thanks, the launch string can be borrowed from attach here:
 http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html,
 the same VM is going under test.
 
 
 By hang I mean stopping ability to send icmp replies, it is like a
 kind of a watermark for issues I count serious after. Just tested
 again, the ceiling is not exactly representing all available cpu quota
 *every* time but is rounded by seemingly random count of cores from 3.
 to 9 in mine series of tests, with quota limit of 12. VM became
 unresponsive in matter of seconds, consumption raising by 'clicking'
 core count for about a half of minute, stabilizing then. Guest args
 are console=tty0 console=ttyS0,9600n8.
 


I modified your command line a bit to match my environment:
 - removed block drive and related options,
 - changed network configuration from vhost to tap,
 - changed bios to default one shipped with QEMU 2.1,
 - added parameters to run aboriginal linux.

None of these should affect the logic of the serial port, yet I was not able
to reproduce the bug again. Any ideas what I am missing?

My command line looks like this:

qemu-system-x86_64 -enable-kvm -name vm29180 \
  -machine pc-i440fx-2.1,accel=kvm,usb=off \
  -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
  -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0-11,mem=512 \
  -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config \
  -nodefaults \
  -device sga \
  -chardev socket,id=charmonitor,path=vm29180.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew \
  -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
  -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on \
  -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
  -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
  -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1 \
  -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
  -object memory-backend-ram,id=mem0,size=512M \
  -device pc-dimm,id=dimm0,node=0,memdev=mem0 \
  -object memory-backend-ram,id=mem1,size=512M \
  -device pc-dimm,id=dimm1,node=0,memdev=mem1 \
  -object memory-backend-ram,id=mem2,size=512M \
  -device pc-dimm,id=dimm2,node=0,memdev=mem2 \
  -object memory-backend-ram,id=mem3,size=512M \
  -device pc-dimm,id=dimm3,node=0,memdev=mem3 \
  -object memory-backend-ram,id=mem4,size=512M \
  -device pc-dimm,id=dimm4,node=0,memdev=mem4 \
  -object memory-backend-ram,id=mem5,size=512M \
  -device pc-dimm,id=dimm5,node=0,memdev=mem5 \
  -object memory-backend-ram,id=mem6,size=512M \
  -device pc-dimm,id=dimm6,node=0,memdev=mem6 \
  -hda hda.sqf -kernel bzImage \
  -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686" \
  -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no

PS: I did not specify the baud rate of the serial console because the init in
my rootfs does not like it. According to the Linux kernel documentation it
should be 9600n8 by default.
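
For reference, a small sketch of how I read the console= option string
(speed, then optional parity letter, optional number of data bits, optional
"r" for RTS flow control). This is my understanding of the documented
format, not the kernel's parser, and the function names are made up.

```python
import re

# Speed, optional parity (n/o/e), optional data bits, optional 'r' for RTS.
_OPTIONS = re.compile(
    r"^(?P<baud>\d+)(?P<parity>[noe])?(?P<bits>[5-8])?(?P<flow>r)?$"
)


def parse_console_options(options=""):
    """Parse e.g. '9600n8' or '115200n8r'; missing parts keep the defaults."""
    settings = {"baud": 9600, "parity": "n", "bits": 8, "flow": False}
    if not options:
        return settings
    match = _OPTIONS.match(options)
    if match is None:
        raise ValueError(f"unrecognised console options: {options!r}")
    settings["baud"] = int(match["baud"])
    if match["parity"]:
        settings["parity"] = match["parity"]
    if match["bits"]:
        settings["bits"] = int(match["bits"])
    settings["flow"] = match["flow"] is not None
    return settings


def parse_console_arg(arg):
    """Split 'console=ttyS0,9600n8' into (device, settings)."""
    value = arg.split("=", 1)[1]
    device, _, options = value.partition(",")
    return device, parse_console_options(options)


if __name__ == "__main__":
    for arg in ("console=ttyS0", "console=ttyS0,9600n8"):
        print(arg, "->", parse_console_arg(arg))
```

With this reading, console=ttyS0 and console=ttyS0,9600n8 describe the same
line settings, so leaving the rate off should not change the test.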

-- 
Kirill



Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console

2014-09-04 Thread Andrey Korolyov
On Thu, Sep 4, 2014 at 5:33 PM, Kirill Batuzov batuz...@ispras.ru wrote:
 On Thu, 4 Sep 2014, Andrey Korolyov wrote:

 Thanks, the launch string can be borrowed from attach here:
 http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html,
 the same VM is going under test.


 By hang I mean stopping ability to send icmp replies, it is like a
 kind of a watermark for issues I count serious after. Just tested
 again, the ceiling is not exactly representing all available cpu quota
 *every* time but is rounded by seemingly random count of cores from 3.
 to 9 in mine series of tests, with quota limit of 12. VM became
 unresponsive in matter of seconds, consumption raising by 'clicking'
 core count for about a half of minute, stabilizing then. Guest args
 are console=tty0 console=ttyS0,9600n8.



 I modified your command line a bit to match my environment:
  - removed block drive and related options,
  - changed network configuration from vhost to tap,
  - changed bios to default one shipped with QEMU 2.1,
  - added parameters to run aboriginal linux.

 Neither of these should affect logic of serial port, yet I was not able
 to reproduce the bug again. Any ideas what am I missing?

 My command line looks like this:

 qemu-system-x86_64 -enable-kvm -name vm29180 -machine 
 pc-i440fx-2.1,accel=kvm,usb=off \
   -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
   -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0-11,mem=512 
 \
   -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config 
 -nodefaults \
   -device sga -chardev 
 socket,id=charmonitor,path=vm29180.monitor,server,nowait \
   -mon chardev=charmonitor,id=monitor,mode=control -rtc 
 base=utc,driftfix=slew \
   -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
   -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on 
 \
   -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
   -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
   -chardev pty,id=charserial0 -device 
 isa-serial,chardev=charserial0,id=serial0 \
   -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
   -device 
 virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
  \
   -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
   -object memory-backend-ram,id=mem0,size=512M -device 
 pc-dimm,id=dimm0,node=0,memdev=mem0 \
   -object memory-backend-ram,id=mem1,size=512M -device 
 pc-dimm,id=dimm1,node=0,memdev=mem1 \
   -object memory-backend-ram,id=mem2,size=512M -device 
 pc-dimm,id=dimm2,node=0,memdev=mem2 \
   -object memory-backend-ram,id=mem3,size=512M -device 
 pc-dimm,id=dimm3,node=0,memdev=mem3 \
   -object memory-backend-ram,id=mem4,size=512M -device 
 pc-dimm,id=dimm4,node=0,memdev=mem4 \
   -object memory-backend-ram,id=mem5,size=512M -device 
 pc-dimm,id=dimm5,node=0,memdev=mem5 \
   -object memory-backend-ram,id=mem6,size=512M -device 
 pc-dimm,id=dimm6,node=0,memdev=mem6 \
   -hda hda.sqf -kernel bzImage \
   -append root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 
 HOST=i686 \
   -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no

 PS: I did not specify baud rate of serial console because init in my
 rootfs does not like it. From linux kernel documentation it should be
 9600n8 by default.

 --
 Kirill

Heh, it is kernel- (defaults-) dependent after all. Debian hangs
always, on 3.10, 3.14 and 3.16, Fedora 20 works fine on 3.15. I'll
check if there are any 82550-specific patches in Fedora tree a bit
later.