Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On 09/09/2014 17:28, Kirill Batuzov wrote:
> In short: the QEMU serial port transmits data as fast as it can, ignoring the baud rate completely. As a result we are stuck in the serial8250_interrupt ISR in the kernel most of the time.
>
> Overall we have a large issue with rate control and flow control for virtual serial port implementations. In QEMU we have over a dozen different UARTs for different platforms. Among them only one uses the baud rate (strongarm) and only one implements flow control (16550A).
>
> CC'ing some people to discuss the general course of action with regard to the serial implementation.
>
> We probably want some abstract serial device which is able to transmit one character with a fixed baud rate and a maximum retry count. The actual serial port implementations should implement interrupts, control registers and the FIFO around it. With such a design we will not need to implement the same bits of rate control and retry logic for every UART in QEMU.
>
> Any thoughts on this?

The baud rate used to be there in the 8250 emulation. It was removed when flow control was added (why?), in commit fcfb4d6a. Adding it back would be nice.

I think an abstraction of the serial port concept that is based on the 16550A register set, and with arbitrarily sized FIFOs, would be nice to have. All device models would talk to such an abstraction. (The 16550A is well known and a lot of devices take inspiration from it.)

Paolo
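To make the "arbitrarily sized FIFOs" idea concrete, here is a minimal C sketch of a byte FIFO whose depth is chosen at run time, with a 16550A-style trigger-level check. It is only an illustration: `ByteFifo` and every `byte_fifo_*` function are hypothetical names invented for this example, not QEMU code and not what anyone in the thread proposed verbatim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ring buffer; the capacity is picked per device model. */
typedef struct {
    uint8_t *data;
    size_t capacity;   /* FIFO depth in bytes */
    size_t head;       /* index of the oldest byte */
    size_t count;      /* bytes currently stored */
} ByteFifo;

/* Allocate a FIFO of the requested depth; returns false on allocation failure. */
static bool byte_fifo_init(ByteFifo *f, size_t capacity)
{
    assert(capacity > 0);
    f->data = malloc(capacity);
    f->capacity = capacity;
    f->head = 0;
    f->count = 0;
    return f->data != NULL;
}

static void byte_fifo_destroy(ByteFifo *f)
{
    free(f->data);
    f->data = NULL;
    f->capacity = f->head = f->count = 0;
}

/* Returns false when the FIFO is full (a device model could flag an overrun). */
static bool byte_fifo_push(ByteFifo *f, uint8_t b)
{
    if (f->count == f->capacity) {
        return false;
    }
    f->data[(f->head + f->count) % f->capacity] = b;
    f->count++;
    return true;
}

/* Returns false when the FIFO is empty. */
static bool byte_fifo_pop(ByteFifo *f, uint8_t *out)
{
    if (f->count == 0) {
        return false;
    }
    *out = f->data[f->head];
    f->head = (f->head + 1) % f->capacity;
    f->count--;
    return true;
}

/* True once the fill level reaches the trigger level, i.e. the point
 * at which a device model would raise its "data available" interrupt. */
static bool byte_fifo_reached(const ByteFifo *f, size_t trigger_level)
{
    return f->count >= trigger_level;
}
```

A real 16550A has 16-byte FIFOs; a device model with a deeper FIFO would simply pass a different capacity to `byte_fifo_init()` and keep its own interrupt and register logic on top.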
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On 11 December 2014 at 14:56, Paolo Bonzini pbonz...@redhat.com wrote:
> I think an abstraction of the serial port concept that is based on the 16550A register set, and with arbitrarily sized FIFOs, would be nice to have. All device models would talk to such an abstraction. (The 16550A is well known and a lot of devices take inspiration from it.)

An abstraction of that from the specifics of the PC's serial port might be nice, yes. (omap_uart.c has to jump through some ugly hoops currently, with more ugliness in the out-of-tree omap3 extensions.) However I don't think it makes sense to force all serial port device models to go through it -- not all the world is a 16550A and some UARTs simply are different.

thanks
-- PMM
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On 11/12/2014 16:06, Peter Maydell wrote:
> An abstraction of that from the specifics of the PC's serial port might be nice, yes. (omap_uart.c has to jump through some ugly hoops currently, with more ugliness in the out-of-tree omap3 extensions.) However I don't think it makes sense to force all serial port device models to go through it -- not all the world is a 16550A and some UARTs simply are different.

No, definitely not. There's still qemu-char; this would just be an abstraction providing useful stuff like baud rates and flow control.

Paolo
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Fri, 5 Sep 2014, Andrey Korolyov wrote:
>> Heh, it is kernel- (defaults-) dependent after all. Debian always hangs, on 3.10, 3.14 and 3.16; Fedora 20 works fine on 3.15. I'll check whether there are any 82550-specific patches in the Fedora tree a bit later.
>
> It is a setting-dependent issue; I checked this. Though I am still searching for the option that causes such a huge difference, the vast majority of distros with default kernels hung completely during the test. Stock SuSE/CentOS/Debian kernels can be used for testing.

I finally managed to reproduce it with a Debian live image. The resulting command line was:

    qemu-system-x86_64 -enable-kvm -m 512 -smp 12 \
        -cdrom debian-live-7.6.0-amd64-standard.iso

Commands to run in guest console:

    # yes /dev/ttyS0
    # yes /dev/ttyS0

Looks like it is the old "serial8250: too much work for irq4" bug.

In short: the QEMU serial port transmits data as fast as it can, ignoring the baud rate completely. As a result we are stuck in the serial8250_interrupt ISR in the kernel most of the time.

Overall we have a large issue with rate control and flow control for virtual serial port implementations. In QEMU we have over a dozen different UARTs for different platforms. Among them only one uses the baud rate (strongarm) and only one implements flow control (16550A).

CC'ing some people to discuss the general course of action with regard to the serial implementation.

We probably want some abstract serial device which is able to transmit one character with a fixed baud rate and a maximum retry count. The actual serial port implementations should implement interrupts, control registers and the FIFO around it. With such a design we will not need to implement the same bits of rate control and retry logic for every UART in QEMU.

Any thoughts on this?

--
Kirill
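The "one character with a fixed baud rate and a maximum retry count" proposal can be sketched roughly as follows. This is an illustrative C model only, under stated assumptions: `PacedSerial`, `paced_serial_xmit()`, the `TxResult` values and the two stub backends are hypothetical names made up for this example and are not part of QEMU. The caller is assumed to supply the current virtual time in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SECOND 1000000000LL

/* Bits on the wire per character: 1 start bit + data + optional parity + stop. */
static int frame_bits(int data_bits, int parity, int stop_bits)
{
    return 1 + data_bits + (parity ? 1 : 0) + stop_bits;
}

/* Time needed to shift out one character at the given baud rate, in ns. */
static int64_t char_time_ns(int baud, int bits)
{
    assert(baud > 0 && bits > 0);
    return (NS_PER_SECOND / baud) * bits;
}

typedef enum { TX_SENT, TX_BUSY, TX_RETRY, TX_DROPPED } TxResult;

/* Hypothetical "abstract serial": paces output and bounds retries. */
typedef struct {
    int64_t char_time;   /* ns per character */
    int64_t next_tx;     /* earliest time the line is free again */
    int max_retries;     /* retries allowed before a character is dropped */
    int retries;         /* retries used for the current character */
} PacedSerial;

static void paced_serial_init(PacedSerial *s, int baud, int bits, int max_retries)
{
    s->char_time = char_time_ns(baud, bits);
    s->next_tx = 0;
    s->max_retries = max_retries;
    s->retries = 0;
}

/* Try to send one character at time `now` (ns).
 * backend_write returns nonzero if the host side accepted the byte. */
static TxResult paced_serial_xmit(PacedSerial *s, int64_t now, uint8_t ch,
                                  int (*backend_write)(uint8_t))
{
    if (now < s->next_tx) {
        return TX_BUSY;                  /* previous character still "on the wire" */
    }
    s->next_tx = now + s->char_time;     /* next attempt is one character time away */
    if (backend_write(ch)) {
        s->retries = 0;
        return TX_SENT;
    }
    if (s->retries < s->max_retries) {
        s->retries++;
        return TX_RETRY;                 /* caller re-arms a timer for next_tx */
    }
    s->retries = 0;
    return TX_DROPPED;                   /* give up so the guest cannot stall forever */
}

/* Stub backends for experimenting with the model. */
static int backend_accept(uint8_t ch) { (void)ch; return 1; }
static int backend_reject(uint8_t ch) { (void)ch; return 0; }
```

With 9600 baud and an 8N1 frame (10 bits), `char_time_ns()` gives about 1.04 ms per character, so a guest driving such a model could not push more than roughly 960 characters per second, however fast it services the interrupt.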
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Tue, Sep 9, 2014 at 7:28 PM, Kirill Batuzov batuz...@ispras.ru wrote:
> On Fri, 5 Sep 2014, Andrey Korolyov wrote:
>>> Heh, it is kernel- (defaults-) dependent after all. Debian always hangs, on 3.10, 3.14 and 3.16; Fedora 20 works fine on 3.15. I'll check whether there are any 82550-specific patches in the Fedora tree a bit later.
>>
>> It is a setting-dependent issue; I checked this. Though I am still searching for the option that causes such a huge difference, the vast majority of distros with default kernels hung completely during the test. Stock SuSE/CentOS/Debian kernels can be used for testing.
>
> I finally managed to reproduce it with a Debian live image. The resulting command line was:
>
>     qemu-system-x86_64 -enable-kvm -m 512 -smp 12 \
>         -cdrom debian-live-7.6.0-amd64-standard.iso
>
> Commands to run in guest console:
>
>     # yes /dev/ttyS0
>     # yes /dev/ttyS0
>
> Looks like it is the old "serial8250: too much work for irq4" bug.

Exactly, maybe I did not make this clear earlier. The only problem is that the current emulator is very happy to hang with certain guest kernel settings (I postponed searching for the magic options which allowed the Fedora kernel to work after trying some obvious ones, like defaults in the timer subsystem).

> In short: the QEMU serial port transmits data as fast as it can, ignoring the baud rate completely. As a result we are stuck in the serial8250_interrupt ISR in the kernel most of the time.

That hardly explains how more than one thread gets locked up, at least for me. In the surviving Fedora case you can see that just one core is eaten up, as it should be. Maybe after a couple of NMIs have fired it is possible to lock up multiple cores, but I don't have a better explanation.

> Overall we have a large issue with rate control and flow control for virtual serial port implementations. In QEMU we have over a dozen different UARTs for different platforms. Among them only one uses the baud rate (strongarm) and only one implements flow control (16550A).
>
> CC'ing some people to discuss the general course of action with regard to the serial implementation.
>
> We probably want some abstract serial device which is able to transmit one character with a fixed baud rate and a maximum retry count. The actual serial port implementations should implement interrupts, control registers and the FIFO around it. With such a design we will not need to implement the same bits of rate control and retry logic for every UART in QEMU.
>
> Any thoughts on this?
>
> --
> Kirill
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Thu, Sep 4, 2014 at 8:03 PM, Andrey Korolyov and...@xdel.ru wrote:
> On Thu, Sep 4, 2014 at 5:33 PM, Kirill Batuzov batuz...@ispras.ru wrote:
>> On Thu, 4 Sep 2014, Andrey Korolyov wrote:
>>> Thanks, the launch string can be borrowed from the attachment here:
>>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html
>>> The same VM is under test. By "hang" I mean losing the ability to send ICMP replies; that is my watermark for issues I count as serious.
>>>
>>> Just tested again: the ceiling does not use all of the available CPU quota *every* time, but is rounded to a seemingly random number of cores, from 3 to 9 in my series of tests, with a quota limit of 12. The VM became unresponsive in a matter of seconds, with consumption rising by "clicking" up the core count for about half a minute, then stabilizing. Guest args are console=tty0 console=ttyS0,9600n8.
>>
>> I modified your command line a bit to match my environment:
>>  - removed the block drive and related options,
>>  - changed the network configuration from vhost to tap,
>>  - changed the BIOS to the default one shipped with QEMU 2.1,
>>  - added parameters to run aboriginal linux.
>>
>> None of these should affect the logic of the serial port, yet I was not able to reproduce the bug again. Any ideas what I am missing?
>>
>> My command line looks like this:
>>
>>     qemu-system-x86_64 -enable-kvm -name vm29180 -machine pc-i440fx-2.1,accel=kvm,usb=off \
>>         -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
>>         -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0-11,mem=512 \
>>         -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config -nodefaults \
>>         -device sga -chardev socket,id=charmonitor,path=vm29180.monitor,server,nowait \
>>         -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew \
>>         -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
>>         -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on \
>>         -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
>>         -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
>>         -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
>>         -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
>>         -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1 \
>>         -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
>>         -object memory-backend-ram,id=mem0,size=512M -device pc-dimm,id=dimm0,node=0,memdev=mem0 \
>>         -object memory-backend-ram,id=mem1,size=512M -device pc-dimm,id=dimm1,node=0,memdev=mem1 \
>>         -object memory-backend-ram,id=mem2,size=512M -device pc-dimm,id=dimm2,node=0,memdev=mem2 \
>>         -object memory-backend-ram,id=mem3,size=512M -device pc-dimm,id=dimm3,node=0,memdev=mem3 \
>>         -object memory-backend-ram,id=mem4,size=512M -device pc-dimm,id=dimm4,node=0,memdev=mem4 \
>>         -object memory-backend-ram,id=mem5,size=512M -device pc-dimm,id=dimm5,node=0,memdev=mem5 \
>>         -object memory-backend-ram,id=mem6,size=512M -device pc-dimm,id=dimm6,node=0,memdev=mem6 \
>>         -hda hda.sqf -kernel bzImage \
>>         -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686" \
>>         -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no
>>
>> PS: I did not specify the baud rate of the serial console because init in my rootfs does not like it. According to the Linux kernel documentation it should be 9600n8 by default.
>>
>> --
>> Kirill
>
> Heh, it is kernel- (defaults-) dependent after all. Debian always hangs, on 3.10, 3.14 and 3.16; Fedora 20 works fine on 3.15. I'll check whether there are any 82550-specific patches in the Fedora tree a bit later.

It is a setting-dependent issue; I checked this. Though I am still searching for the option that causes such a huge difference, the vast majority of distros with default kernels hung completely during the test. Stock SuSE/CentOS/Debian kernels can be used for testing.
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Wed, 3 Sep 2014, Andrey Korolyov wrote:
> Given QEMU 2.1 and isa-serial output, set as ttyS0 for the guest VM with a 9600 baud rate. The test case is quite simple: display as much data as possible over the serial console and do not hang the system. While qemu-1.1 works perfectly, apart from complaining about lost interrupts (a known bug in the guest kernel used), 2.1 just hangs after some seconds, eating up all available CPU quota. The test case is 'while true; do dmesg; done' in the serial console.
>
> I'd like to ask that this bug be considered very serious: the VM goes completely unresponsive in a matter of tens of seconds, and there are a lot of side attacks that can produce enough printk() output to ttyS0. With a serial console set up and default settings on almost any distro, message suppression would not work and the VM can be DoSed by an unprivileged user.

I tried to reproduce the described behaviour with aboriginal linux and QEMU 2.1.0 but without luck. The configurations I tried:

    qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
        -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686"

    qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
        -append "root=/dev/hda rw init=/bin/ash panic=1 console=ttyS0,9600 HOST=i686"

Despite all the output, the system did not hang. In particular I could always switch to the QEMU monitor and stop the VM from there.

Can you give an exact QEMU command line which leads to the bug?

--
Kirill
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Thu, Sep 4, 2014 at 1:46 PM, Kirill Batuzov batuz...@ispras.ru wrote:
> On Wed, 3 Sep 2014, Andrey Korolyov wrote:
>> Given QEMU 2.1 and isa-serial output, set as ttyS0 for the guest VM with a 9600 baud rate. The test case is quite simple: display as much data as possible over the serial console and do not hang the system. While qemu-1.1 works perfectly, apart from complaining about lost interrupts (a known bug in the guest kernel used), 2.1 just hangs after some seconds, eating up all available CPU quota. The test case is 'while true; do dmesg; done' in the serial console.
>>
>> I'd like to ask that this bug be considered very serious: the VM goes completely unresponsive in a matter of tens of seconds, and there are a lot of side attacks that can produce enough printk() output to ttyS0. With a serial console set up and default settings on almost any distro, message suppression would not work and the VM can be DoSed by an unprivileged user.
>
> I tried to reproduce the described behaviour with aboriginal linux and QEMU 2.1.0 but without luck. The configurations I tried:
>
>     qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
>         -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686"
>
>     qemu-system-i386 -cpu pentium3 -no-reboot -kernel bzImage -hda hda.sqf \
>         -append "root=/dev/hda rw init=/bin/ash panic=1 console=ttyS0,9600 HOST=i686"
>
> Despite all the output, the system did not hang. In particular I could always switch to the QEMU monitor and stop the VM from there.
>
> Can you give an exact QEMU command line which leads to the bug?
>
> --
> Kirill

Thanks, the launch string can be borrowed from the attachment here:
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html
The same VM is under test. By "hang" I mean losing the ability to send ICMP replies; that is my watermark for issues I count as serious.

Just tested again: the ceiling does not use all of the available CPU quota *every* time, but is rounded to a seemingly random number of cores, from 3 to 9 in my series of tests, with a quota limit of 12. The VM became unresponsive in a matter of seconds, with consumption rising by "clicking" up the core count for about half a minute, then stabilizing. Guest args are console=tty0 console=ttyS0,9600n8.
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Thu, 4 Sep 2014, Andrey Korolyov wrote:
> Thanks, the launch string can be borrowed from the attachment here:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html
> The same VM is under test. By "hang" I mean losing the ability to send ICMP replies; that is my watermark for issues I count as serious.
>
> Just tested again: the ceiling does not use all of the available CPU quota *every* time, but is rounded to a seemingly random number of cores, from 3 to 9 in my series of tests, with a quota limit of 12. The VM became unresponsive in a matter of seconds, with consumption rising by "clicking" up the core count for about half a minute, then stabilizing. Guest args are console=tty0 console=ttyS0,9600n8.

I modified your command line a bit to match my environment:
 - removed the block drive and related options,
 - changed the network configuration from vhost to tap,
 - changed the BIOS to the default one shipped with QEMU 2.1,
 - added parameters to run aboriginal linux.

None of these should affect the logic of the serial port, yet I was not able to reproduce the bug again. Any ideas what I am missing?

My command line looks like this:

    qemu-system-x86_64 -enable-kvm -name vm29180 -machine pc-i440fx-2.1,accel=kvm,usb=off \
        -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
        -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0-11,mem=512 \
        -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config -nodefaults \
        -device sga -chardev socket,id=charmonitor,path=vm29180.monitor,server,nowait \
        -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew \
        -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
        -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on \
        -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
        -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
        -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
        -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
        -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1 \
        -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
        -object memory-backend-ram,id=mem0,size=512M -device pc-dimm,id=dimm0,node=0,memdev=mem0 \
        -object memory-backend-ram,id=mem1,size=512M -device pc-dimm,id=dimm1,node=0,memdev=mem1 \
        -object memory-backend-ram,id=mem2,size=512M -device pc-dimm,id=dimm2,node=0,memdev=mem2 \
        -object memory-backend-ram,id=mem3,size=512M -device pc-dimm,id=dimm3,node=0,memdev=mem3 \
        -object memory-backend-ram,id=mem4,size=512M -device pc-dimm,id=dimm4,node=0,memdev=mem4 \
        -object memory-backend-ram,id=mem5,size=512M -device pc-dimm,id=dimm5,node=0,memdev=mem5 \
        -object memory-backend-ram,id=mem6,size=512M -device pc-dimm,id=dimm6,node=0,memdev=mem6 \
        -hda hda.sqf -kernel bzImage \
        -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686" \
        -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no

PS: I did not specify the baud rate of the serial console because init in my rootfs does not like it. According to the Linux kernel documentation it should be 9600n8 by default.

--
Kirill
Re: [Qemu-devel] Serial: possible hang during intensive interaction over the console
On Thu, Sep 4, 2014 at 5:33 PM, Kirill Batuzov batuz...@ispras.ru wrote:
> On Thu, 4 Sep 2014, Andrey Korolyov wrote:
>> Thanks, the launch string can be borrowed from the attachment here:
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00482.html
>> The same VM is under test. By "hang" I mean losing the ability to send ICMP replies; that is my watermark for issues I count as serious.
>>
>> Just tested again: the ceiling does not use all of the available CPU quota *every* time, but is rounded to a seemingly random number of cores, from 3 to 9 in my series of tests, with a quota limit of 12. The VM became unresponsive in a matter of seconds, with consumption rising by "clicking" up the core count for about half a minute, then stabilizing. Guest args are console=tty0 console=ttyS0,9600n8.
>
> I modified your command line a bit to match my environment:
>  - removed the block drive and related options,
>  - changed the network configuration from vhost to tap,
>  - changed the BIOS to the default one shipped with QEMU 2.1,
>  - added parameters to run aboriginal linux.
>
> None of these should affect the logic of the serial port, yet I was not able to reproduce the bug again. Any ideas what I am missing?
>
> My command line looks like this:
>
>     qemu-system-x86_64 -enable-kvm -name vm29180 -machine pc-i440fx-2.1,accel=kvm,usb=off \
>         -cpu SandyBridge,+kvm_pv_eoi -bios bios.bin -m 512 -realtime mlock=off \
>         -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0-11,mem=512 \
>         -uuid 9ca88d08-5b89-47f2-bfbf-926efcc500cc -nographic -no-user-config -nodefaults \
>         -device sga -chardev socket,id=charmonitor,path=vm29180.monitor,server,nowait \
>         -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew \
>         -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
>         -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on \
>         -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x4 \
>         -device virtio-serial-pci,id=virtio-serial0,vectors=1,bus=pci.0,addr=0x5 \
>         -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
>         -chardev socket,id=charchannel0,path=vm29180.sock,server,nowait \
>         -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1 \
>         -object iothread,id=vm29180blk0 -m 512,slots=31,maxmem=16384M \
>         -object memory-backend-ram,id=mem0,size=512M -device pc-dimm,id=dimm0,node=0,memdev=mem0 \
>         -object memory-backend-ram,id=mem1,size=512M -device pc-dimm,id=dimm1,node=0,memdev=mem1 \
>         -object memory-backend-ram,id=mem2,size=512M -device pc-dimm,id=dimm2,node=0,memdev=mem2 \
>         -object memory-backend-ram,id=mem3,size=512M -device pc-dimm,id=dimm3,node=0,memdev=mem3 \
>         -object memory-backend-ram,id=mem4,size=512M -device pc-dimm,id=dimm4,node=0,memdev=mem4 \
>         -object memory-backend-ram,id=mem5,size=512M -device pc-dimm,id=dimm5,node=0,memdev=mem5 \
>         -object memory-backend-ram,id=mem6,size=512M -device pc-dimm,id=dimm6,node=0,memdev=mem6 \
>         -hda hda.sqf -kernel bzImage \
>         -append "root=/dev/hda rw init=/sbin/init.sh panic=1 console=ttyS0 HOST=i686" \
>         -net nic,model=e1000 -net tap,ifname=tap0,script=no,downscript=no
>
> PS: I did not specify the baud rate of the serial console because init in my rootfs does not like it. According to the Linux kernel documentation it should be 9600n8 by default.
>
> --
> Kirill

Heh, it is kernel- (defaults-) dependent after all. Debian always hangs, on 3.10, 3.14 and 3.16; Fedora 20 works fine on 3.15. I'll check whether there are any 82550-specific patches in the Fedora tree a bit later.