Re: [Qemu-devel] Re: QEMU-KVM and video performance - Update
Hello,

Some update on this issue. Archive:
http://www.mail-archive.com/kvm@vger.kernel.org/msg32600.html

It seems that cirrus VGA is now ok (>1000MB/s, up to 2000MB/s). But cirrus has only the 320x200x256 colors mode (Mode 13h) implemented in the VESA BIOS. VMWare and std VGA still have the performance issue.

I guess the improvement is related to the following commit:
http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=commitdiff;h=0d14905b5eb8aa1c2e195e13478bb7c74e1776db
Especially, I guess, the change in hw/cirrus_vga.c.

Any idea how to fix:
1.) More VESA modes in cirrus VGA (is VESA emulation done by Seabios or by the KVM cirrus BIOS?)
2.) The performance in VMWare and std VGA modes, too

Versions are the latest dev versions of the KVM user part and Seabios from GIT.

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

On Wed, 12 May 2010, Avi Kivity wrote:
> On 05/12/2010 09:14 AM, Gerhard Wiesinger wrote:
> > On Mon, 10 May 2010, Avi Kivity wrote:
> > > On 05/09/2010 10:35 PM, Gerhard Wiesinger wrote:
> > >
> > > For 256 color mode the first priority is to find out why direct mapping is not used. I'd suggest tracing the code that makes this decision (in hw/*vga.c) and seeing if it's right or not.
> >
> > I think this is because A000 is not initialized for KVM (see log below and logging patch attached).
>
> Why isn't it initialized? Did the guest configure things such that it is impossible to map it directly? Or does the configuration allow direct mapping and qemu incorrectly decides that it cannot direct map?
>
> Best would be to print out all the configuration registers and interpret them according to the specification.
>
> > vga_dirty_log_start
> > vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000
> > BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff
>
> Why does this happen?
>
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
-- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Gerhard Wiesinger wrote:
> Can one switch to the old software vmm in VMWare?

Perhaps you can install a very old version of VMWare. Maybe run it under KVM ;-)

> That was one of the reasons why I was looking for alternatives for
> graphical DOS programs. Overall summary so far:
> 1.) QEMU without KVM: Problem with 286 DOS Extender instruction set, but
> fast VGA
> 2.) QEMU with KVM: 286 DOS Extender apps ok, but slow VGA memory
> performance
> 3.) VMWare Server 2.0 under Linux, application ok, but slow VGA memory
> performance
> 4.) Virtual PC: Problems with 286 DOS Extender
> 5.) Bochs: Works well, but very slow.

I would be interested in the 286 DOS Extender issue, as I'd like to use some 286 programs in QEMU at some point. There were some changes to KVM in the kernel recently. Were those needed to get the 286 apps working?

> Looks like that VMWare Server and QEMU with KVM maybe have the same
> architectural problems going through the whole slow chain from Guest OS to
> virtualization layer for VGA writes.

They do have a similar architecture. The VGA write speed is a bit surprising, as it should be fast in 256-colour non-modeX modes for both. But maybe there's something we've missed that makes it architecturally slow. It will be interesting to see what you find :-)

Thanks,
-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Gerhard Wiesinger wrote:
> On Wed, 21 Apr 2010, Jamie Lokier wrote:
>
> >Gerhard Wiesinger wrote:
> >>Hmmm. I'm very new to QEMU and KVM but at least accessing the virtual HW
> >>of QEMU even from KVM must be possible (e.g. memory and port accesses are
> >>done on nearly every virtual device) and therefore I'm ending in C code in
> >>the QEMU hw/*.c directory. Therefore also the VGA memory area should be
> >>able to be accessible from KVM but with the specialized and fast memory
> >>access of QEMU. Am I missing something?
> >
> >What you're missing is that when KVM calls out to QEMU to handle
> >hw/*.c traps, that call is very slow. It's because the hardware-VM
> >support is a bit slow when the trap happens, and then the call
> >from KVM in the kernel up to QEMU is a bit slow again. Then all the
> >way back. It adds up to a lot, for every I/O operation.
>
> Isn't that then a general problem of KVM virtualization (or hardware
> virtualization) in general? Is this CPU dependent (AMD vs. Intel)?

Yes it is a general problem, but KVM emulates some time-critical things in the kernel (like APIC and CPU instructions), so it's not too bad.

KVM is about 5x faster than TCG for most things, and slower for a few things, so on balance it is usually faster.

The slow 256-colour mode writes sound like just a simple bug, though. No need for complicated changes.

> >In 256-colour mode, KVM should be writing to the VGA memory at high
> >speed a lot like normal RAM, not trapping at the hardware-VM level,
> >and not calling up to the code in hw/*.c for every byte.
>
> Yes, same picture to me: 256 color mode should be only a memory write (16
> color mode is more difficult as pixel/byte mapping is not the same).
> But it looks like this isn't the case in this test scenario.
>
> >You might double-check if your guest is using VGA "Mode X". (See
> >Wikipedia.)
> >
> >That was a way to accelerate VGA on real PCs, but it will be slow in
> >KVM for the same reasons as 16-colour mode.
>
> Which way do you mean?

Look up Mode X on Wikipedia if you're interested, but it isn't relevant to the problem you've reported. Mode X cannot be enabled with a BIOS call; it's a VGA hardware programming trick. It would not be useful in a VM environment.

-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 05/12/2010 09:14 AM, Gerhard Wiesinger wrote:
> On Mon, 10 May 2010, Avi Kivity wrote:
> > On 05/09/2010 10:35 PM, Gerhard Wiesinger wrote:
> >
> > For 256 color mode the first priority is to find out why direct mapping is not used. I'd suggest tracing the code that makes this decision (in hw/*vga.c) and seeing if it's right or not.
>
> I think this is because A000 is not initialized for KVM (see log below and logging patch attached).

Why isn't it initialized? Did the guest configure things such that it is impossible to map it directly? Or does the configuration allow direct mapping and qemu incorrectly decides that it cannot direct map?

Best would be to print out all the configuration registers and interpret them according to the specification.

> vga_dirty_log_start
> vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000
> BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff

Why does this happen?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Mon, 10 May 2010, Avi Kivity wrote:
> On 05/09/2010 10:35 PM, Gerhard Wiesinger wrote:
>
> For 256 color mode the first priority is to find out why direct mapping is not used. I'd suggest tracing the code that makes this decision (in hw/*vga.c) and seeing if it's right or not.

I think this is because A000 is not initialized for KVM (see log below and logging patch attached).

Switches tried without success:
-vga std (log is from this one)
-vga cirrus
-vga vmware

I also tried to force the mapping (see the patch, where it is commented out), but some errors occur (see 2nd log below) and performance is still low at ~1MB/s:
s->lfb_vram_mapped = 1;

On testing, the following lines occur:
vga_dirty_log_start
vga_dirty_log_start
vga_dirty_log_start
vga_dirty_log_start
vga_dirty_log_start
vga_dirty_log_start
...

Any ideas? Can you reproduce it?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100
vga_dirty_log_start
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_map_addr, start=0xF000, len=0x0100 vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start vga_dirty_log_start_mapping_map_addr, start=0xF000, len=0x0100 vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_map_addr, start=0xF000, len=0x0100 vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 -- vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A8000, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a8000-000a vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A8000, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a8000-000a vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A8000, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a8000-000a vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff 
vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A8000, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a8000-000a vga_dirty_log_start_mapping_lfb_vram_mapped, start=0xE000, len=0x0100 vga_dirty_log_start vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A, len=0x8000 BUG: kvm_dirty_pages_log_change: invalid parameters 000a-000a7fff vga_dirty_log_start_mapping_lfb_vram_mapped, start=0x000A8000, len=0x8000 BUG: kvm_dirty_pages_log_chan
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 05/09/2010 10:35 PM, Gerhard Wiesinger wrote:
> > Please run kvm_stat and report output for both tests to confirm.
>
> See below. 2nd column is per second statistic when running the test.
>
>  efer_reload                 0        0
>  exits                18470836   554582
>  fpu_reload            2147833     3469
>  halt_exits               2083        0
>  halt_wakeup              2047        0
>  host_state_reload     2148186     3470
>  hypercalls                  0        0
>  insn_emulation        7688203   554244
>
> > This indicates that kvm is emulating instead of direct mapping. That's probably a bug. If you fix it, performance will increase dramatically.
>
> Where can I start here? Any ideas how to?
> One of my ideas: Move the hw/vga.c functions
> vga_mem_readb
> vga_mem_readw
> vga_mem_readl
> vga_mem_writeb
> vga_mem_writew
> vga_mem_writel
> to KVM to avoid switching from KVM to QEMU (I can write C code, even kernel code, but I'm not comfortable with KVM). Howto?

That is already done (generically), it's called coalesced mmio. You only have 3470 qemu exits/sec compared to 554244 kvm writes/sec.

> > Switching between tcg and kvm is hard, but not needed. For 256 color modes, direct map is possible and should yield good performance. Bank switching can be improved perhaps 3x, but will never be fast.
>
> Where can I start for KVM performance for the bank switching (256 color mode)? (e.g. BIOS writes to VGA window I/O port to switch the bank)
> Any ideas how to improve (architecture for the change)?

For 256 color mode the first priority is to find out why direct mapping is not used. I'd suggest tracing the code that makes this decision (in hw/*vga.c) and seeing if it's right or not.

--
error compiling committee.c: too many arguments to function
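The gap between 554244 emulated writes/sec and 3470 exits to qemu/sec is what batching buys. As a rough illustration only, here is a toy write-batching buffer in C. It is not KVM's coalesced mmio code: the ring size, the struct and every function name are hypothetical, and the batch size of 160 is simply assumed from the ratio 554244 / 3470.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of write coalescing; NOT the KVM implementation. */
#define TOY_RING_SLOTS 160u  /* assumed batch size, roughly 554244 / 3470 */

struct toy_mmio_write {
    uint32_t addr;
    uint8_t  val;
};

struct toy_ring {
    struct toy_mmio_write slot[TOY_RING_SLOTS];
    unsigned used;           /* queued writes not yet handed over */
    unsigned long flushes;   /* stands in for expensive exits to userspace */
};

/* Hand all queued writes over at once; one flush models one exit. */
static void toy_flush(struct toy_ring *r)
{
    if (r->used == 0)
        return;
    r->used = 0;
    r->flushes++;
}

/* Queue one byte write; only a full ring forces a flush. */
static void toy_write8(struct toy_ring *r, uint32_t addr, uint8_t val)
{
    if (r->used == TOY_RING_SLOTS)
        toy_flush(r);
    r->slot[r->used].addr = addr;
    r->slot[r->used].val = val;
    r->used++;
}

/* Number of flushes needed to deliver n byte writes to the A000 window. */
static unsigned long toy_flushes_for(unsigned long n)
{
    struct toy_ring r = { .used = 0, .flushes = 0 };

    for (unsigned long i = 0; i < n; i++)
        toy_write8(&r, 0xA0000u + (uint32_t)(i % 0x10000u), (uint8_t)i);
    toy_flush(&r);  /* deliver whatever is still queued */
    return r.flushes;
}
```

Calling toy_flushes_for(554244) yields a few thousand flushes instead of one per write. Each write still costs an in-kernel emulation step, which is why direct mapping remains the real fix.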
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Thu, 22 Apr 2010, Avi Kivity wrote:
> On 04/22/2010 09:04 AM, Gerhard Wiesinger wrote:
> > On Wed, 21 Apr 2010, Avi Kivity wrote:
> > > On 04/21/2010 09:50 PM, Gerhard Wiesinger wrote:
> > > > I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.
> > >
> > > 1MB/s, 500k-1M changes/s Coincidence? Is it taking a page fault or trap on every write?
> > >
> > > > To clarify:
> > > > Memory performance writing to segment A000 is about 1MB/s.
> > >
> > > That indicates a fault every write (assuming 8-16 bit writes). If you're using 256 color vga and not switching banks, this indicates a bug.
> >
> > Yes, 256 color VGA and no bank switches involved.
> >
> > > > Calling INT 10 set/get window function with different windows (e.g. toggling between window page 0 and 1) is about 500,000 to 1 million function calls per second.
> > >
> > > That's surprisingly fast. I'd expect 100-200k/sec.
> >
> > Sorry, I mixed up the numbers:
> > 1.) QEMU-KVM: ~111k
> > 2.) QEMU only: 500k-1 million
> >
> > > Please run kvm_stat and report output for both tests to confirm.
> >
> > See below. 2nd column is per second statistic when running the test.
> >
> >  efer_reload                 0        0
> >  exits                18470836   554582
> >  fpu_reload            2147833     3469
> >  halt_exits               2083        0
> >  halt_wakeup              2047        0
> >  host_state_reload     2148186     3470
> >  hypercalls                  0        0
> >  insn_emulation        7688203   554244
>
> This indicates that kvm is emulating instead of direct mapping. That's probably a bug. If you fix it, performance will increase dramatically.

Where can I start here? Any ideas how to?
One of my ideas: Move the hw/vga.c functions
vga_mem_readb
vga_mem_readw
vga_mem_readl
vga_mem_writeb
vga_mem_writew
vga_mem_writel
to KVM to avoid switching from KVM to QEMU (I can write C code, even kernel code, but I'm not comfortable with KVM). Howto?

> > > > To get real good VGA performance both parameters should be:
> > > > About >50MB/s for writes to segment A000
> > > > ~500,000 bank switches per second.
> > >
> > > First should be doable easily, second is borderline.
> > >
> > > > I think this is very easy to distinguish:
> > > > 1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
> > > > 2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).
> > > >
> > > > Is that easily possible?
> > >
> > > No. Code can run in either qemu or kvm, not both. You can switch between them based on access statistics (early versions of qemu-kvm did that, without the statistics part), but this isn't trivial.
> >
> > Hmmm. Ok, 2 different opinions about the memory write performance:
> > Easily or not possible?
>
> Switching between tcg and kvm is hard, but not needed. For 256 color modes, direct map is possible and should yield good performance. Bank switching can be improved perhaps 3x, but will never be fast.

Where can I start for KVM performance for the bank switching (256 color mode)? (e.g. BIOS writes to VGA window I/O port to switch the bank)
Any ideas how to improve (architecture for the change)?

Thnx and sorry for the long delay, was busy.

Ciao,
Gerhard

--
http://www.wiesinger.com/
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/22/2010 09:04 AM, Gerhard Wiesinger wrote:
> On Wed, 21 Apr 2010, Avi Kivity wrote:
> > On 04/21/2010 09:50 PM, Gerhard Wiesinger wrote:
> > > I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.
> >
> > 1MB/s, 500k-1M changes/s Coincidence? Is it taking a page fault or trap on every write?
> >
> > > To clarify:
> > > Memory performance writing to segment A000 is about 1MB/s.
> >
> > That indicates a fault every write (assuming 8-16 bit writes). If you're using 256 color vga and not switching banks, this indicates a bug.
>
> Yes, 256 color VGA and no bank switches involved.
>
> > > Calling INT 10 set/get window function with different windows (e.g. toggling between window page 0 and 1) is about 500,000 to 1 million function calls per second.
> >
> > That's surprisingly fast. I'd expect 100-200k/sec.
>
> Sorry, I mixed up the numbers:
> 1.) QEMU-KVM: ~111k
> 2.) QEMU only: 500k-1 million
>
> > Please run kvm_stat and report output for both tests to confirm.
>
> See below. 2nd column is per second statistic when running the test.
>
>  efer_reload                 0        0
>  exits                18470836   554582
>  fpu_reload            2147833     3469
>  halt_exits               2083        0
>  halt_wakeup              2047        0
>  host_state_reload     2148186     3470
>  hypercalls                  0        0
>  insn_emulation        7688203   554244

This indicates that kvm is emulating instead of direct mapping. That's probably a bug. If you fix it, performance will increase dramatically.

> > > To get real good VGA performance both parameters should be:
> > > About >50MB/s for writes to segment A000
> > > ~500,000 bank switches per second.
> >
> > First should be doable easily, second is borderline.
> >
> > > I think this is very easy to distinguish:
> > > 1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
> > > 2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).
> > >
> > > Is that easily possible?
> >
> > No. Code can run in either qemu or kvm, not both. You can switch between them based on access statistics (early versions of qemu-kvm did that, without the statistics part), but this isn't trivial.
>
> Hmmm. Ok, 2 different opinions about the memory write performance:
> Easily or not possible?

Switching between tcg and kvm is hard, but not needed. For 256 color modes, direct map is possible and should yield good performance. Bank switching can be improved perhaps 3x, but will never be fast.

--
error compiling committee.c: too many arguments to function
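The "fault every write" reading can be checked with simple arithmetic: if each guest write of 1 or 2 bytes costs one emulation, the insn_emulation rate of 554244/sec gives roughly 0.5 to 1.1 MB/s, which matches the measured ~1MB/s. A small C sketch of that calculation follows; the function names are made up for illustration, and the model assumes exactly one emulated instruction per write.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second when every guest write costs exactly one emulation. */
static uint64_t trapped_write_bandwidth(uint64_t emulations_per_sec,
                                        unsigned bytes_per_write)
{
    return emulations_per_sec * bytes_per_write;
}

/* Emulations per second that a target bandwidth would need at that rate
 * of one emulation per write (rounded up). */
static uint64_t emulations_needed(uint64_t target_bytes_per_sec,
                                  unsigned bytes_per_write)
{
    assert(bytes_per_write > 0);
    return (target_bytes_per_sec + bytes_per_write - 1) / bytes_per_write;
}
```

With 16-bit writes, trapped_write_bandwidth(554244, 2) is 1108488 bytes/s, while the >50MB/s target would need emulations_needed(50000000, 2) = 25000000 emulations/s. That is about 45 times the observed rate, so the target is only reachable if the writes stop trapping.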
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/22/2010 08:37 AM, Gerhard Wiesinger wrote:
> On Wed, 21 Apr 2010, Avi Kivity wrote:
> > On 04/21/2010 09:14 PM, Gerhard Wiesinger wrote:
> > > Can you explain which code files/functions of KVM are involved in handling the VGA memory window and page switching through the port write to the VGA window register (or is that part handled through QEMU)? A little bit of architecture explanation would be nice.
> >
> > qemu hw/vga.c and hw/cirrus_vga.c. Boring functions like vbe_ioport_write_data() and vga_ioport_write().
>
> Yes, I was already in that code part. Those are very simple functions, as already explained, and are therefore very fast in QEMU only.
> But I meant: How is the calling path from KVM guest OS to hw/vga.c for memory and I/O accesses, and which parts are done in hardware directly (to understand the speed gap and maybe to find a solution)?

The speed gap is mostly due to hardware constraints (it takes ~2000 cycles for an exit from guest mode, plus we need to switch a few msrs to get to userspace).

See vmx_vcpu_run(), the vmresume instruction is where an exit starts.

> > > BTW: In which KVM code parts is decided where "direct code" or an "emulated device code" is used?
> >
> > Same place. Look for calls to cpu_register_physical_memory(). If the last argument was obtained by a call to cpu_register_io_memory(), then all writes trap. Otherwise, it was obtained by qemu_ram_alloc() and writes will not trap (except the first write to a page in a 30ms window, used to note that the page is dirty and needs redrawing).
>
> Ok, that finally ends in:
> cpu_register_physical_memory_offset()
> ...
>     // 0.12.3
>     if (kvm_enabled())
>         kvm_set_phys_mem(start_addr, size, phys_offset);
>     // KVM
>     cpu_notify_set_memory(start_addr, size, phys_offset);
> ...
>
> I/O is always done through:
> cpu_register_io_memory => cpu_register_io_memory_fixed
> cpu_register_io_memory_fixed()
> ...
> No call to KVM?

kvm_set_phys_mem() is a call to kvm.

> ...
> Where is the trap from KVM to QEMU?

See kvm_cpu_exec().

--
error compiling committee.c: too many arguments to function
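To make the difference between the two registration paths concrete, here is a toy model in C of the RAM-backed case: only the first write to a page since the last screen refresh is noticed, every later write to that page is a plain store. This is a sketch of the idea, not qemu or kvm code. The names, the 4 KiB page size and the 64 KiB window are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_PAGES     16u   /* 64 KiB, the size of one A000 window */

struct toy_vram {
    uint8_t       mem[TOY_PAGES * TOY_PAGE_SIZE];
    bool          dirty[TOY_PAGES];
    unsigned long traps;    /* writes that needed help from the hypervisor */
};

/* RAM-backed write: trap once per page per refresh interval, then run free. */
static void toy_write(struct toy_vram *v, uint32_t off, uint8_t val)
{
    uint32_t page = off / TOY_PAGE_SIZE;

    assert(page < TOY_PAGES);
    if (!v->dirty[page]) {
        v->dirty[page] = true;  /* note that the page needs redrawing */
        v->traps++;
    }
    v->mem[off] = val;
}

/* Periodic display refresh: count the dirty pages and re-arm the tracking. */
static unsigned toy_refresh(struct toy_vram *v)
{
    unsigned redrawn = 0;

    for (unsigned i = 0; i < TOY_PAGES; i++)
        redrawn += v->dirty[i];
    memset(v->dirty, 0, sizeof v->dirty);
    return redrawn;
}

/* Fill the whole window byte by byte; returns the cumulative trap count. */
static unsigned long toy_fill_window(struct toy_vram *v)
{
    for (uint32_t off = 0; off < TOY_PAGES * TOY_PAGE_SIZE; off++)
        toy_write(v, off, 0xFF);
    return v->traps;
}
```

Filling the window costs 16 traps per refresh interval in this model. If the same region were registered as I/O memory, each of the 65536 byte writes would trap.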
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Jamie Lokier wrote:
> Gerhard Wiesinger wrote:
> > Hmmm. I'm very new to QEMU and KVM but at least accessing the virtual HW of QEMU even from KVM must be possible (e.g. memory and port accesses are done on nearly every virtual device) and therefore I'm ending in C code in the QEMU hw/*.c directory. Therefore also the VGA memory area should be able to be accessible from KVM but with the specialized and fast memory access of QEMU. Am I missing something?
>
> What you're missing is that when KVM calls out to QEMU to handle hw/*.c traps, that call is very slow. It's because the hardware-VM support is a bit slow when the trap happens, and then the call from KVM in the kernel up to QEMU is a bit slow again. Then all the way back. It adds up to a lot, for every I/O operation.

Isn't that then a general problem of KVM virtualization (or hardware virtualization)? Is this CPU dependent (AMD vs. Intel)?

> When QEMU does the same thing, it's fast because it's inside the same process; it's just a function call.

Yes, that's clear to me.

> That's why the most often called devices are emulated separately in KVM's kernel code, things like the interrupt controller, timer chip etc. It's also why individual instructions that need help are emulated in KVM's kernel code, instead of passing control up to QEMU just for one instruction.
>
> > BTW: Still not clear why performance is low with KVM since there are no window changes in the testcase involved which could cause a (slow) page fault.
>
> It sounds like a bug. Avi gave suggestions about what to look for. If it fixes my OS install speeds too, I'll be very happy :-)

See other post for details.

> In 256-colour mode, KVM should be writing to the VGA memory at high speed a lot like normal RAM, not trapping at the hardware-VM level, and not calling up to the code in hw/*.c for every byte.

Yes, same picture to me: 256 color mode should be only a memory write (16 color mode is more difficult as pixel/byte mapping is not the same).
But it looks like this isn't the case in this test scenario.

> You might double-check if your guest is using VGA "Mode X". (See Wikipedia.)

Code:
inregs.x.ax = 0x4F02;
inregs.x.bx = 0x0000 | 0x101; // bh=bit 15=0 (clear), bit14=0 (windowed)
int86x(INT_SCREEN, &inregs, &outregs, &outsregs); /* Call INT 10h */

I can post the whole code/exes if you want (I already planned to post all my tools, but I have to do some cleanup before publishing the whole package).

> That was a way to accelerate VGA on real PCs, but it will be slow in KVM for the same reasons as 16-colour mode.

Which way do you mean?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
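The BX value passed to VBE function 4F02h packs the mode number together with two flags, so it is worth decoding it when checking a snippet like the one above against its comment. A minimal C sketch, assuming the usual VBE 2.0 layout (bits 0-8 mode number, bit 14 linear framebuffer, bit 15 do not clear display memory). The struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VBE_MODE_NUMBER_MASK 0x01FFu  /* bits 0-8: mode number */
#define VBE_MODE_LINEAR_FB   0x4000u  /* bit 14: 1 = linear framebuffer, 0 = windowed */
#define VBE_MODE_KEEP_MEMORY 0x8000u  /* bit 15: 1 = do not clear display memory */

/* Hypothetical helper type, not part of any BIOS or compiler header. */
struct vbe_mode_request {
    uint16_t mode;         /* e.g. 0x101 = 640x480, 256 colors */
    bool     linear_fb;    /* false: banked access through the A000 window */
    bool     keep_memory;  /* false: video memory is cleared on mode set */
};

/* Split a BX value for INT 10h AX=4F02h into its fields. */
static struct vbe_mode_request vbe_decode_bx(uint16_t bx)
{
    struct vbe_mode_request r;

    r.mode        = (uint16_t)(bx & VBE_MODE_NUMBER_MASK);
    r.linear_fb   = (bx & VBE_MODE_LINEAR_FB) != 0;
    r.keep_memory = (bx & VBE_MODE_KEEP_MEMORY) != 0;
    return r;
}

/* Build the BX value back from the fields. */
static uint16_t vbe_encode_bx(struct vbe_mode_request r)
{
    uint16_t bx = (uint16_t)(r.mode & VBE_MODE_NUMBER_MASK);

    if (r.linear_fb)
        bx |= VBE_MODE_LINEAR_FB;
    if (r.keep_memory)
        bx |= VBE_MODE_KEEP_MEMORY;
    return bx;
}
```

A windowed, cleared mode 101h decodes from BX = 0x0101, while a value with 0xC000 or'ed in requests a linear framebuffer and leaves video memory uncleared.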
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Avi Kivity wrote:
> On 04/21/2010 09:50 PM, Gerhard Wiesinger wrote:
> > I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.
>
> 1MB/s, 500k-1M changes/s Coincidence? Is it taking a page fault or trap on every write?
>
> > To clarify:
> > Memory performance writing to segment A000 is about 1MB/s.
>
> That indicates a fault every write (assuming 8-16 bit writes). If you're using 256 color vga and not switching banks, this indicates a bug.

Yes, 256 color VGA and no bank switches involved.

> > Calling INT 10 set/get window function with different windows (e.g. toggling between window page 0 and 1) is about 500,000 to 1 million function calls per second.
>
> That's surprisingly fast. I'd expect 100-200k/sec.

Sorry, I mixed up the numbers:
1.) QEMU-KVM: ~111k
2.) QEMU only: 500k-1 million

> Please run kvm_stat and report output for both tests to confirm.

See below. 2nd column is per second statistic when running the test.

> > To get real good VGA performance both parameters should be:
> > About >50MB/s for writes to segment A000
> > ~500,000 bank switches per second.
>
> First should be doable easily, second is borderline.
>
> > I think this is very easy to distinguish:
> > 1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
> > 2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).
> >
> > Is that easily possible?
>
> No. Code can run in either qemu or kvm, not both. You can switch between them based on access statistics (early versions of qemu-kvm did that, without the statistics part), but this isn't trivial.

Hmmm. Ok, 2 different opinions about the memory write performance:
Easily or not possible?

Thnx for your help so far.

Ciao,
Gerhard

--
http://www.wiesinger.com/

-
kvm_stat
Please mount debugfs ('mount -t debugfs debugfs /sys/kernel/debug')
and ensure the kvm modules are loaded

lsmod|grep -i kvm
kvm_amd                38276  0
kvm                   162288  1 kvm_amd

mount|grep -i debug
=> mount -t debugfs debugfs /sys/kernel/debug

int10perf: INT10h Performance tests:
kvm statistics

 efer_reload                 0        0
 exits                37648629   456206
 fpu_reload            8512535   455983
 halt_exits               2084        0
 halt_wakeup              2047        0
 host_state_reload     8513213   456011
 hypercalls                  0        0
 insn_emulation       29182065        0
 insn_emulation_fail         0        0
 invlpg                      0        0
 io_exits              8386082   455975
 irq_exits               51713      214
 irq_injections          21797       36
 irq_window                  0        0
 largepages                  0        0
 mmio_exits             242781        0
 mmu_cache_miss            150        0
 mmu_flooded                 0        0
 mmu_pde_zapped              0        0
 mmu_pte_updated             0        0
 mmu_pte_write            8192        0
 mmu_recycled                0        0
 mmu_shadow_zapped         151        0
 mmu_unsync                  0        0
 mmu_unsync_global           0        0
 nmi_injections              0        0
 nmi_window                  0        0
 pf_fixed                16935        0
 pf_guest                    0        0
 remote_tlb_flush            2        0
 request_irq                 0        0
 request_nmi                 0        0
 signal_exits                1        0
 tlb_flush                2251        0

Running VGA memory tests in same VGA page in Video Mode VESA 101h:
kvm statistics

 efer_reload                 0        0
 exits                18470836   554582
 fpu_reload            2147833     3469
 halt_exits               2083        0
 halt_wakeup              2047        0
 host_state_reload     2148186     3470
 hypercalls                  0        0
 insn_emulation        7688203   554244
 insn_emulation_fail         0        0
 invlpg                      0        0
 io_exits             10701583       18
 irq_exits               50781      321
 irq_injections          25251       18
 irq_window                  0        0
 largepages                  0        0
 mmio_exits 1628473241
 mmu_cache_miss            154        0
 mmu_flooded                 0        0
 mmu_pde_zapped              0        0
 mmu_pte_updated             0        0
 mmu_pte_write            8192        0
 mmu_recycled                0        0
 mmu_shadow_zapped         155        0
 mmu_unsync                  0        0
 mmu_unsync_global           0        0
 nmi_injections              0        0
 nmi_window                  0        0
 pf_fixed                16936        0
 pf_guest                    0        0
 remote_tlb_flush            5        0
 request_irq                 0        0
 request_nmi                 0
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Avi Kivity wrote:
> On 04/21/2010 09:39 PM, Jamie Lokier wrote:
> > Avi Kivity wrote:
> > > Writes to vga in 16-color mode don't set a memory location to a value, instead they change multiple memory locations.
> >
> > While code is just writing to the VGA memory, not reading(*) and not touching the VGA I/O registers that control the write latches, is it possible in principle to swizzle the format around in memory to make regular writes work?
>
> Not in software. We can map pages, not cross address lines.
>
> > (*) Reading should be ok for some settings of the write latches, I think.
> >
> > I wonder if guests of interest behave like that.
>
> Guests that use 16 color vga are usually of little interest.

I tested 256 color modes.

> > Is this a case where TCG would run significantly faster for code blocks that have been detected to access the VGA memory?
>
> Yes.
>
> That's why the vmware software vmm was faster than the hardware vmm for the initial iterations of vmx.

On VMWare Server 2.0: same picture: Calling INT10h interrupts is fast, writing to VGA memory is also very slow (1.0MB/s).

Can one switch to the old software vmm in VMWare?

That was one of the reasons why I was looking for alternatives for graphical DOS programs. Overall summary so far:
1.) QEMU without KVM: Problem with 286 DOS Extender instruction set, but fast VGA
2.) QEMU with KVM: 286 DOS Extender apps ok, but slow VGA memory performance
3.) VMWare Server 2.0 under Linux, application ok, but slow VGA memory performance
4.) Virtual PC: Problems with 286 DOS Extender
5.) Bochs: Works well, but very slow.

It looks like VMWare Server and QEMU with KVM may have the same architectural problem: going through the whole slow chain from guest OS to virtualization layer for VGA writes.

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
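The point about 16-color writes changing multiple memory locations can be shown with a toy planar model in C. It is heavily simplified: only a plane-enable mask is modelled, with none of the latches, write modes or bit masks of real VGA hardware, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VGA_PLANES     4u
#define TOY_VGA_PLANE_SIZE 65536u

/* Simplified planar memory: one guest address selects a byte in each plane. */
struct toy_planar_vga {
    uint8_t plane[TOY_VGA_PLANES][TOY_VGA_PLANE_SIZE];
    uint8_t map_mask;  /* bit n set: plane n receives the write */
};

/* One guest byte write fans out to every enabled plane.
 * Returns how many distinct memory locations were changed. */
static unsigned toy_planar_write(struct toy_planar_vga *v,
                                 uint16_t off, uint8_t val)
{
    unsigned touched = 0;

    for (unsigned p = 0; p < TOY_VGA_PLANES; p++) {
        if (v->map_mask & (1u << p)) {
            v->plane[p][off] = val;
            touched++;
        }
    }
    return touched;
}
```

With map_mask = 0x0F a single store lands in four places, which is why such a region cannot simply be mapped as ordinary RAM: a page mapping can only redirect an address to one backing location.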
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Avi Kivity wrote:
> On 04/21/2010 09:14 PM, Gerhard Wiesinger wrote:
> > Can you explain which code files/functions of KVM are involved in handling the VGA memory window and page switching through the port write to the VGA window register (or is that part handled through QEMU)? A little bit of architecture explanation would be nice.
>
> qemu hw/vga.c and hw/cirrus_vga.c. Boring functions like vbe_ioport_write_data() and vga_ioport_write().

Yes, I was already in that code part. Those are very simple functions, as already explained, and are therefore very fast in QEMU only.
But I meant: How is the calling path from KVM guest OS to hw/vga.c for memory and I/O accesses, and which parts are done in hardware directly (to understand the speed gap and maybe to find a solution)?

> > BTW: In which KVM code parts is decided where "direct code" or an "emulated device code" is used?
>
> Same place. Look for calls to cpu_register_physical_memory(). If the last argument was obtained by a call to cpu_register_io_memory(), then all writes trap. Otherwise, it was obtained by qemu_ram_alloc() and writes will not trap (except the first write to a page in a 30ms window, used to note that the page is dirty and needs redrawing).

Ok, that finally ends in:
cpu_register_physical_memory_offset()
...
    // 0.12.3
    if (kvm_enabled())
        kvm_set_phys_mem(start_addr, size, phys_offset);
    // KVM
    cpu_notify_set_memory(start_addr, size, phys_offset);
...

I/O is always done through:
cpu_register_io_memory => cpu_register_io_memory_fixed
cpu_register_io_memory_fixed()
...
No call to KVM?
...
Where is the trap from KVM to QEMU?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Gerhard Wiesinger wrote:
> Hmmm. I'm very new to QEMU and KVM but at least accessing the virtual HW
> of QEMU even from KVM must be possible (e.g. memory and port accesses are
> done on nearly every virtual device) and therefore I'm ending in C code in
> the QEMU hw/*.c directory. Therefore also the VGA memory area should be
> accessible from KVM but with the specialized and fast memory
> access of QEMU. Am I missing something?

What you're missing is that when KVM calls out to QEMU to handle hw/*.c traps, that call is very slow. It's because the hardware-VM support is a bit slow when the trap happens, and then the call from KVM in the kernel up to QEMU is a bit slow again. Then all the way back. It adds up to a lot, for every I/O operation.

When QEMU does the same thing, it's fast because it's inside the same process; it's just a function call.

That's why the most often called devices are emulated separately in KVM's kernel code, things like the interrupt controller, timer chip etc. It's also why individual instructions that need help are emulated in KVM's kernel code, instead of passing control up to QEMU just for one instruction.

> BTW: Still not clear why performance is low with KVM since there are
> no window changes in the testcase involved which could cause a (slow) page
> fault.

It sounds like a bug. Avi gave suggestions about what to look for. If it fixes my OS install speeds too, I'll be very happy :-)

In 256-colour mode, KVM should be writing to the VGA memory at high speed a lot like normal RAM, not trapping at the hardware-VM level, and not calling up to the code in hw/*.c for every byte.

You might double-check if your guest is using VGA "Mode X". (See Wikipedia.) That was a way to accelerate VGA on real PCs, but it will be slow in KVM for the same reasons as 16-colour mode.

-- Jamie
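To illustrate why "Mode X" behaves like the planar 16-colour modes rather than like plain RAM, here is a minimal sketch of the usual pixel addressing. It is not taken from QEMU; the function names are hypothetical, and it assumes the common 320-pixel-wide unchained layout in which every fourth pixel lives in the same plane.

```python
def chained_address(x, y, width=320):
    """Mode 13h style (chained): one pixel is one byte at a linear offset."""
    return y * width + x


def mode_x_address(x, y, width=320):
    """Unchained ("Mode X") layout, assumed here: the pixel's plane is
    chosen by x % 4 and the byte offset inside that plane by the rest."""
    plane = x % 4
    offset = y * (width // 4) + x // 4
    return plane, offset


if __name__ == "__main__":
    print(chained_address(5, 2))
    print(mode_x_address(5, 2))
```

Because the plane is selected through a VGA register and not through the address, the emulator has to interpret each write, so such a window cannot simply be mapped as guest RAM.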
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Avi Kivity wrote:
> On 04/21/2010 09:39 PM, Jamie Lokier wrote:
> >Avi Kivity wrote:
> >
> >>Writes to vga in 16-color mode don't set a memory location to a
> >>value, instead they change multiple memory locations.
> >>
> >While code is just writing to the VGA memory, not reading(*) and not
> >touching the VGA I/O registers that control the write latches, is it
> >possible in principle to swizzle the format around in memory to make
> >regular writes work?
> >
>
> Not in software. We can map pages, not cross address lines.

Hence "swizzle". You rearrange the data inside the page for the crossed address lines, and undo the swizzle later on demand. That doesn't work for other VGA magic though.

> Guests that use 16 color vga are usually of little interest.

Fair enough. We can move on :-)

It's been said that the super-slow VGA writes triggering this thread are in 256-colour mode, so there's a different problem. That should be fast, shouldn't it?

I vaguely recall extremely slow OS installs I've seen in KVM, which were fast in QEMU (and fast in KVM after installing), were using text mode. Possibly it was Windows 2000, or Windows Server 2003.

Text mode should be fast too, shouldn't it? I suppose it's possible that it just looked like text mode and was really 16-colour mode.

-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/21/2010 09:50 PM, Gerhard Wiesinger wrote:
>>> I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.
>>
>> 1MB/s, 500k-1M changes/s. Coincidence? Is it taking a page fault or trap on every write?
>
> To clarify: Memory performance writing to segment A000 is about 1MB/s.

That indicates a fault every write (assuming 8-16 bit writes). If you're using 256 color vga and not switching banks, this indicates a bug.

> Calling INT 10 set/get window function with different windows (e.g. toggling between window page 0 and 1) is about 500,000 to 1 million function calls per second.

That's surprisingly fast. I'd expect 100-200k/sec.

Please run kvm_stat and report output for both tests to confirm.

> To get real good VGA performance both parameters should be:
> About >50MB/s for writes to segment A000
> ~500,000 bank switches per second.

First should be doable easily, second is borderline.

> I think this is very easy to distinguish:
> 1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
> 2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).
>
> Is that easily possible?

No. Code can run in either qemu or kvm, not both. You can switch between them based on access statistics (early versions of qemu-kvm did that, without the statistics part), but this isn't trivial.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
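The "fault on every write" reading of the 1MB/s figure can be checked with back-of-the-envelope arithmetic. The sketch below is only that: it assumes 1MB/s means 1,000,000 bytes per second and that each guest write is 1 or 2 bytes wide, as suggested in the message above; the function names are made up for illustration.

```python
def writes_per_second(throughput_bytes_per_s, write_size_bytes):
    """Number of individual guest writes needed to reach a throughput."""
    return throughput_bytes_per_s / write_size_bytes


def cost_per_write_us(throughput_bytes_per_s, write_size_bytes):
    """Average time budget per write, in microseconds."""
    return 1e6 / writes_per_second(throughput_bytes_per_s, write_size_bytes)


if __name__ == "__main__":
    for size in (1, 2):  # 8-bit and 16-bit writes
        rate = writes_per_second(1_000_000, size)
        cost = cost_per_write_us(1_000_000, size)
        print(f"{size}-byte writes: {rate:,.0f} writes/s, {cost:.1f} us each")
```

With these assumptions the measured throughput corresponds to 500,000 to 1,000,000 writes per second, the same range as the measured window-switch rate, which is consistent with every write taking a trap.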
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/21/2010 09:39 PM, Jamie Lokier wrote:
> Avi Kivity wrote:
>> Writes to vga in 16-color mode don't set a memory location to a value, instead they change multiple memory locations.
>
> While code is just writing to the VGA memory, not reading(*) and not touching the VGA I/O registers that control the write latches, is it possible in principle to swizzle the format around in memory to make regular writes work?

Not in software. We can map pages, not cross address lines.

> (*) Reading should be ok for some settings of the write latches, I think. I wonder if guests of interest behave like that.

Guests that use 16 color vga are usually of little interest.

>>> Is this a case where TCG would run significantly faster for code blocks that have been detected to access the VGA memory?
>>
>> Yes.
>
> $ date
> Wed Apr 21 19:37:38 2015
> $ modprobe ktcg

That's why the vmware software vmm was faster than the hardware vmm for the initial iterations of vmx.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
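The point that a 16-colour write changes multiple memory locations can be shown with a deliberately simplified model. This is a sketch, not QEMU's hw/vga.c: the class and attribute names are hypothetical, and it models only the map mask, leaving out latches, bit mask, set/reset and the other write modes.

```python
class PlanarVGA:
    """Toy model of planar VGA memory: four planes behind one address."""

    def __init__(self, size=65536):
        self.planes = [bytearray(size) for _ in range(4)]
        self.map_mask = 0x0F  # one enable bit per plane (assumed default)

    def write(self, offset, value):
        """One guest byte write lands in every enabled plane.
        Returns how many backing bytes were modified."""
        touched = 0
        for plane in range(4):
            if self.map_mask & (1 << plane):
                self.planes[plane][offset] = value
                touched += 1
        return touched


if __name__ == "__main__":
    vga = PlanarVGA()
    print(vga.write(0, 0xFF))   # all four planes enabled
    vga.map_mask = 0x05         # planes 0 and 2 only
    print(vga.write(1, 0xAA))
```

Since the effect of a store depends on register state rather than on the address alone, the window cannot be backed by a plain page mapping, which is why each write has to be emulated.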
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/21/2010 09:14 PM, Gerhard Wiesinger wrote:
> Can you explain which code files/functions of KVM are involved in handling the VGA memory window and page switching through the port write to the VGA window register (or is that part handled through QEMU)? A little bit of architecture explanation would be nice.

qemu hw/vga.c and hw/cirrus_vga.c. Boring functions like vbe_ioport_write_data() and vga_ioport_write().

> BTW: In which KVM code parts is it decided whether "direct code" or "emulated device code" is used?

Same place. Look for calls to cpu_register_physical_memory(). If the last argument was obtained by a call to cpu_register_io_memory(), then all writes trap. Otherwise, it was obtained by qemu_ram_alloc() and writes will not trap (except the first write to a page in a 30ms window, used to note that the page is dirty and needs redrawing).

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
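The difference between the two registration paths described above can be put into numbers with a small counting model. This is a sketch under stated assumptions, not QEMU or KVM code: `count_traps` is a hypothetical helper, the page size is assumed to be 4 KiB, and all writes are assumed to fall inside a single dirty-tracking window.

```python
PAGE_SIZE = 4096  # assumed page size


def count_traps(write_addresses, io_memory):
    """Count exits to the emulator for a sequence of guest writes.

    io_memory=True models a region backed by I/O callbacks: every write
    traps. io_memory=False models a RAM-backed region with dirty
    tracking: only the first write to each page in the window traps.
    """
    if io_memory:
        return len(write_addresses)
    dirty_pages = set()
    traps = 0
    for addr in write_addresses:
        page = addr // PAGE_SIZE
        if page not in dirty_pages:
            dirty_pages.add(page)
            traps += 1
    return traps


if __name__ == "__main__":
    # Fill the 64 KiB window at A000:0000 one byte at a time.
    window = list(range(0xA0000, 0xB0000))
    print("I/O memory traps:", count_traps(window, io_memory=True))
    print("RAM-backed traps:", count_traps(window, io_memory=False))
```

Under these assumptions a byte-wise fill of the window costs 65536 traps when the region is I/O memory, but only 16 when it is RAM-backed, one per page.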
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Jamie Lokier wrote:
> Gerhard Wiesinger wrote:
>>>> Would it be possible to handle these writes through QEMU directly (without KVM), because performance there is very good (looking at the code there is some pointer arithmetic and some memory write done)?
>>>
>>> I've noticed extremely slow VGA performance too, when installing OSes. It makes the difference between installing in a few minutes, and installing taking hours - just because of the slow VGA.
>>>
>>> So generally I use qemu for installing old versions of Windows, then change to KVM to run them after installing.
>>>
>>> Switching between KVM and qemu automatically based on guest code behaviour, and making both memory models and device models compatible at run time, is a difficult thing. I guess it's not worth the difficulty just to speed up VGA.
>>
>> I think this is very easy to distinguish:
>> 1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
>> 2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).
>>
>> Is that easily possible?
>
> No it isn't. Distinguishing addresses is trivial. You've ignored the hard part, which is switching between different virtualisation architectures...

Hmmm. I'm very new to QEMU and KVM but at least accessing the virtual HW of QEMU even from KVM must be possible (e.g. memory and port accesses are done on nearly every virtual device) and therefore I'm ending in C code in the QEMU hw/*.c directory. Therefore also the VGA memory area should be accessible from KVM but with the specialized and fast memory access of QEMU. Am I missing something?

BTW: Still not clear why performance is low with KVM since there are no window changes in the testcase involved which could cause a (slow) page fault.

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Gerhard Wiesinger wrote:
> >>Would it be possible to handle these writes through QEMU directly
> >>(without KVM), because performance there is very good (looking at the
> >>code there is some pointer arithmetic and some memory write done)?
> >
> >I've noticed extremely slow VGA performance too, when installing OSes.
> >It makes the difference between installing in a few minutes, and
> >installing taking hours - just because of the slow VGA.
> >
> >So generally I use qemu for installing old versions of Windows, then
> >change to KVM to run them after installing.
> >
> >Switching between KVM and qemu automatically based on guest code
> >behaviour, and making both memory models and device models compatible
> >at run time, is a difficult thing. I guess it's not worth the
> >difficulty just to speed up VGA.
>
> I think this is very easy to distinguish:
> 1.) VGA Segment A000 is legacy and should be handled through QEMU
> and not through KVM (because it is much faster). Also 16 color modes
> should be fast enough there.
> 2.) All other flat PCI memory accesses should be handled through KVM
> (there is a specialized driver loaded for that PCI device in the non
> legacy OS).
>
> Is that easily possible?

No it isn't. Distinguishing addresses is trivial. You've ignored the hard part, which is switching between different virtualisation architectures...

-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Jamie Lokier wrote:
> Gerhard Wiesinger wrote:
>> I'm using VESA mode 0x101 (640x480 256 colors), but performance there is very low (~1MB/s). Test is also WITHOUT any vga window change, so there isn't any page switching overhead involved in this test case.
>>
>>>> Any ideas for improvement?
>>>
>>> Currently when the physical memory map changes (which is what happens when the vga window is updated), kvm drops the entire shadow cache. It's possible to do this only for vga memory, but not easy.
>>
>> I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.
>
> 1MB/s, 500k-1M changes/s. Coincidence? Is it taking a page fault or trap on every write?

To clarify: Memory performance writing to segment A000 is about 1MB/s.

Calling INT 10 set/get window function with different windows (e.g. toggling between window page 0 and 1) is about 500,000 to 1 million function calls per second.

To get real good VGA performance both parameters should be:
About >50MB/s for writes to segment A000
~500,000 bank switches per second.

>> Would it be possible to handle these writes through QEMU directly (without KVM), because performance there is very good (looking at the code there is some pointer arithmetic and some memory write done)?
>
> I've noticed extremely slow VGA performance too, when installing OSes. It makes the difference between installing in a few minutes, and installing taking hours - just because of the slow VGA.
>
> So generally I use qemu for installing old versions of Windows, then change to KVM to run them after installing.
>
> Switching between KVM and qemu automatically based on guest code behaviour, and making both memory models and device models compatible at run time, is a difficult thing. I guess it's not worth the difficulty just to speed up VGA.

I think this is very easy to distinguish:
1.) VGA Segment A000 is legacy and should be handled through QEMU and not through KVM (because it is much faster). Also 16 color modes should be fast enough there.
2.) All other flat PCI memory accesses should be handled through KVM (there is a specialized driver loaded for that PCI device in the non legacy OS).

Is that easily possible?

Thnx.

Ciao,
Gerhard
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Avi Kivity wrote:
> Writes to vga in 16-color mode don't set a memory location to a
> value, instead they change multiple memory locations.

While code is just writing to the VGA memory, not reading(*) and not touching the VGA I/O registers that control the write latches, is it possible in principle to swizzle the format around in memory to make regular writes work?

(*) Reading should be ok for some settings of the write latches, I think.

I wonder if guests of interest behave like that.

> >Is this a case where TCG would run significantly faster for code blocks
> >that have been detected to access the VGA memory?
>
> Yes.

$ date
Wed Apr 21 19:37:38 2015
$ modprobe ktcg

;-)

-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Gerhard Wiesinger wrote:
> I'm using VESA mode 0x101 (640x480 256 colors), but performance
> there is very low (~1MB/s). Test is also WITHOUT any vga window change, so
> there isn't any page switching overhead involved in this test case.
>
> >>Any ideas for improvement?
> >
> >Currently when the physical memory map changes (which is what happens
> >when the vga window is updated), kvm drops the entire shadow cache. It's
> >possible to do this only for vga memory, but not easy.
>
> I don't think changing VGA window is a problem because there are
> 500,000 to 1 million changes/s possible.

1MB/s, 500k-1M changes/s. Coincidence?

Is it taking a page fault or trap on every write?

> Would it be possible to handle these writes through QEMU directly (without
> KVM), because performance there is very good (looking at the code there
> is some pointer arithmetic and some memory write done)?

I've noticed extremely slow VGA performance too, when installing OSes. It makes the difference between installing in a few minutes, and installing taking hours - just because of the slow VGA.

So generally I use qemu for installing old versions of Windows, then change to KVM to run them after installing.

Switching between KVM and qemu automatically based on guest code behaviour, and making both memory models and device models compatible at run time, is a difficult thing. I guess it's not worth the difficulty just to speed up VGA.

-- Jamie
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Avi Kivity wrote:
> On 04/21/2010 01:08 PM, Jamie Lokier wrote:
>> Avi Kivity wrote:
>>> On 04/19/2010 10:14 PM, Gerhard Wiesinger wrote:
>>>> Hello,
>>>>
>>>> Finally I got QEMU-KVM to work but video performance under DOS is very low (QEMU 0.12.3 stable and QEMU GIT master branch is fast, QEMU KVM is slow)
>>>>
>>>> I'm measuring 2 performance critical video performance parameters:
>>>> 1.) INT 10h, function AX=4F05h (set same window/set window/get window)
>>>> 2.) Memory performance to segment page A000h
>>>>
>>>> So BIOS performance (which might be port performance to VGA index/value port) is about a factor of 5 slower, memory performance is about a factor of 100 slower.
>>>>
>>>> QEMU 0.12.3 and QEMU GIT performance is the same (within the measurement tolerance) and listed only once, QEMU KVM is much slower (details see below).
>>>>
>>>> Test programs can be provided, source code will be released soon.
>>>>
>>>> Any ideas why KVM is so slow?
>>>
>>> 16-color vga is slow because kvm cannot map the framebuffer to the guest (writes are not interpreted as RAM writes). 256+-color vga should be fast, except when switching the vga window. Note it's only fast on average, the first write into a page will be slow as kvm maps it in.
>>
>> I don't understand: why is 256+-colour mappable and 16-colour not mappable?
>
> Writes to vga in 16-color mode don't set a memory location to a value, instead they change multiple memory locations.
>
>> Is this a case where TCG would run significantly faster for code blocks that have been detected to access the VGA memory?
>
> Yes.
>
>>> Currently when the physical memory map changes (which is what happens when the vga window is updated), kvm drops the entire shadow cache. It's possible to do this only for vga memory, but not easy.
>>
>> If it's a page fault handled in the kernel, I would expect it to be about as fast as those old VGA DOS-extender drivers which provide the illusion of a single flat mapping, and bank switch on page faults - multiplied by the speed of modern CPUs compared with then. For many graphics things those DOS-extender drivers worked perfectly well.
>>
>> If it's a trap out to qemu on every vga window change, perhaps not quite so well.
>
> It's much more complicated.

Can you explain which code files/functions of KVM are involved in handling the VGA memory window and page switching through the port write to the VGA window register (or is that part handled through QEMU)? A little bit of architecture explanation would be nice.

BTW: In which KVM code parts is it decided whether "direct code" or "emulated device code" is used?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On Wed, 21 Apr 2010, Avi Kivity wrote:
> On 04/19/2010 10:14 PM, Gerhard Wiesinger wrote:
>> Hello,
>>
>> Finally I got QEMU-KVM to work but video performance under DOS is very low (QEMU 0.12.3 stable and QEMU GIT master branch is fast, QEMU KVM is slow)
>>
>> I'm measuring 2 performance critical video performance parameters:
>> 1.) INT 10h, function AX=4F05h (set same window/set window/get window)
>> 2.) Memory performance to segment page A000h
>>
>> So BIOS performance (which might be port performance to VGA index/value port) is about a factor of 5 slower, memory performance is about a factor of 100 slower.
>>
>> QEMU 0.12.3 and QEMU GIT performance is the same (within the measurement tolerance) and listed only once, QEMU KVM is much slower (details see below).
>>
>> Test programs can be provided, source code will be released soon.
>>
>> Any ideas why KVM is so slow?
>
> 16-color vga is slow because kvm cannot map the framebuffer to the guest (writes are not interpreted as RAM writes). 256+-color vga should be fast, except when switching the vga window. Note it's only fast on average, the first write into a page will be slow as kvm maps it in.
>
> Which mode are you using?

I'm using VESA mode 0x101 (640x480 256 colors), but performance there is very low (~1MB/s). Test is also WITHOUT any vga window change, so there isn't any page switching overhead involved in this test case.

>> Any ideas for improvement?
>
> Currently when the physical memory map changes (which is what happens when the vga window is updated), kvm drops the entire shadow cache. It's possible to do this only for vga memory, but not easy.

I don't think changing VGA window is a problem because there are 500,000 to 1 million changes/s possible.

Would it be possible to handle these writes through QEMU directly (without KVM), because performance there is very good (looking at the code there is some pointer arithmetic and some memory write done)?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/
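For readers unfamiliar with banked VESA modes, the following sketch shows how a pixel in mode 0x101 (640x480, 256 colours) maps onto a bank number and an offset inside the A000 window. The 64 KiB window size and granularity and the 640-byte pitch are assumptions for illustration, not values reported in this thread; the function names are hypothetical.

```python
WINDOW_SIZE = 64 * 1024  # assumed window size and bank granularity


def bank_and_offset(x, y, pitch=640):
    """Return (bank, offset in the A000 window) for an 8-bit pixel."""
    linear = y * pitch + x
    return divmod(linear, WINDOW_SIZE)


def banks_needed(width=640, height=480):
    """Number of banks needed to cover the whole frame (ceiling division)."""
    return -(-(width * height) // WINDOW_SIZE)


if __name__ == "__main__":
    print(bank_and_offset(0, 0))
    print(bank_and_offset(0, 103))
    print(bank_and_offset(639, 479))
    print(banks_needed())
```

Under these assumptions a full-screen update crosses only a handful of bank boundaries, so a benchmark that writes within a single window, as in the test described above, should involve no bank switches at all.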
Re: [Qemu-devel] Re: QEMU-KVM and video performance
On 04/21/2010 01:08 PM, Jamie Lokier wrote:
> Avi Kivity wrote:
>> On 04/19/2010 10:14 PM, Gerhard Wiesinger wrote:
>>> Hello,
>>>
>>> Finally I got QEMU-KVM to work but video performance under DOS is very low (QEMU 0.12.3 stable and QEMU GIT master branch is fast, QEMU KVM is slow)
>>>
>>> I'm measuring 2 performance critical video performance parameters:
>>> 1.) INT 10h, function AX=4F05h (set same window/set window/get window)
>>> 2.) Memory performance to segment page A000h
>>>
>>> So BIOS performance (which might be port performance to VGA index/value port) is about a factor of 5 slower, memory performance is about a factor of 100 slower.
>>>
>>> QEMU 0.12.3 and QEMU GIT performance is the same (within the measurement tolerance) and listed only once, QEMU KVM is much slower (details see below).
>>>
>>> Test programs can be provided, source code will be released soon.
>>>
>>> Any ideas why KVM is so slow?
>>
>> 16-color vga is slow because kvm cannot map the framebuffer to the guest (writes are not interpreted as RAM writes). 256+-color vga should be fast, except when switching the vga window. Note it's only fast on average, the first write into a page will be slow as kvm maps it in.
>
> I don't understand: why is 256+-colour mappable and 16-colour not mappable?

Writes to vga in 16-color mode don't set a memory location to a value, instead they change multiple memory locations.

> Is this a case where TCG would run significantly faster for code blocks that have been detected to access the VGA memory?

Yes.

>> Currently when the physical memory map changes (which is what happens when the vga window is updated), kvm drops the entire shadow cache. It's possible to do this only for vga memory, but not easy.
>
> If it's a page fault handled in the kernel, I would expect it to be about as fast as those old VGA DOS-extender drivers which provide the illusion of a single flat mapping, and bank switch on page faults - multiplied by the speed of modern CPUs compared with then. For many graphics things those DOS-extender drivers worked perfectly well.
>
> If it's a trap out to qemu on every vga window change, perhaps not quite so well.

It's much more complicated.

--
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] Re: QEMU-KVM and video performance
Avi Kivity wrote:
> On 04/19/2010 10:14 PM, Gerhard Wiesinger wrote:
> >Hello,
> >
> >Finally I got QEMU-KVM to work but video performance under DOS is very
> >low (QEMU 0.12.3 stable and QEMU GIT master branch is fast, QEMU KVM
> >is slow)
> >
> >I'm measuring 2 performance critical video performance parameters:
> >1.) INT 10h, function AX=4F05h (set same window/set window/get window)
> >2.) Memory performance to segment page A000h
> >
> >So BIOS performance (which might be port performance to VGA
> >index/value port) is about a factor of 5 slower, memory performance is
> >about a factor of 100 slower.
> >
> >QEMU 0.12.3 and QEMU GIT performance is the same (within the measurement
> >tolerance) and listed only once, QEMU KVM is much slower (details
> >see below).
> >
> >Test programs can be provided, source code will be released soon.
> >
> >Any ideas why KVM is so slow?
>
> 16-color vga is slow because kvm cannot map the framebuffer to the guest
> (writes are not interpreted as RAM writes). 256+-color vga should be
> fast, except when switching the vga window. Note it's only fast on
> average, the first write into a page will be slow as kvm maps it in.

I don't understand: why is 256+-colour mappable and 16-colour not mappable?

Is this a case where TCG would run significantly faster for code blocks that have been detected to access the VGA memory?

> Which mode are you using?
>
> >Any ideas for improvement?
>
> Currently when the physical memory map changes (which is what happens
> when the vga window is updated), kvm drops the entire shadow cache.
> It's possible to do this only for vga memory, but not easy.

If it's a page fault handled in the kernel, I would expect it to be about as fast as those old VGA DOS-extender drivers which provide the illusion of a single flat mapping, and bank switch on page faults - multiplied by the speed of modern CPUs compared with then. For many graphics things those DOS-extender drivers worked perfectly well.

If it's a trap out to qemu on every vga window change, perhaps not quite so well.

-- Jamie