Avi Kivity wrote: > Jan Kiszka wrote: >> Avi, >> >> [somehow your mails do not get through to my private account, so I'm >> switching] >> >> Avi Kivity wrote: >> >>> Jan Kiszka wrote: >>> >>>> Clarification: I can't precisely tell what code is executed in VM mode, >>>> as I don't have qemu or that guest instrumented. I just see the kernel >>>> entering VM mode and leaving it again more than 300 us later. So I >>>> wonder why this is allowed while some external IRQ is pending. >>>> >>>> >>>> >>> How do you know an external interrupt is pending? >>> >> It's the host timer IRQ, programmed to fire in certain intervals (100 us >> here). Test case is some latency measurement tool like tglx's cyclictest >> or similar programs we use in Xenomai. >> >> >>> kvm programs the hardware to exit when an external interrupt arrives. >>> >>> >> Here is a latency trace I just managed to capture over 2.6.23.1-rt1 with >> latest kvm from git hacked into (kvm generally seems to work fine this way): >> >> ... >> qemu-sys-7543 0...1 13897us : vmcs_write16+0xb/0x20 >> (vmx_save_host_state+0x1a7/0x1c0) >> qemu-sys-7543 0...1 13897us : vmcs_writel+0xb/0x30 (vmcs_write16+0x1e/0x20) >> qemu-sys-7543 0...1 13898us : segment_base+0xc/0x70 >> (vmx_save_host_state+0xa0/0x1c0) >> qemu-sys-7543 0...1 13898us : vmcs_writel+0xb/0x30 >> (vmx_save_host_state+0xb0/0x1c0) >> qemu-sys-7543 0...1 13898us : segment_base+0xc/0x70 >> (vmx_save_host_state+0xbf/0x1c0) >> qemu-sys-7543 0...1 13898us : vmcs_writel+0xb/0x30 >> (vmx_save_host_state+0xcf/0x1c0) >> qemu-sys-7543 0...1 13898us : load_msrs+0xb/0x40 >> (vmx_save_host_state+0xe7/0x1c0) >> qemu-sys-7543 0...1 13898us : kvm_load_guest_fpu+0x8/0x40 >> (kvm_vcpu_ioctl_run+0xbf/0x570) >> qemu-sys-7543 0D..1 13899us : vmx_vcpu_run+0xc/0x110 >> (kvm_vcpu_ioctl_run+0x120/0x570) >> qemu-sys-7543 0D..1 13899us!: vmcs_writel+0xb/0x30 (vmx_vcpu_run+0x22/0x110) >> qemu-sys-7543 0D..1 14344us : vmcs_read32+0xb/0x20 (vmx_vcpu_run+0xc7/0x110) >> qemu-sys-7543 0D..1 14345us : vmcs_readl+0x8/0x10 (vmcs_read32+0x16/0x20) >> qemu-sys-7543 0D..1 14345us : vmcs_read32+0xb/0x20 (vmx_vcpu_run+0xf4/0x110) >> qemu-sys-7543 0D..1 14345us+: vmcs_readl+0x8/0x10 (vmcs_read32+0x16/0x20) >> qemu-sys-7543 0D..1 14349us : irq_enter+0xb/0x30 (do_IRQ+0x45/0xc0) >> qemu-sys-7543 0D.h1 14350us : do_IRQ+0x73/0xc0 (f8caae24 0 0) >> qemu-sys-7543 0D.h1 14351us : handle_level_irq+0xe/0x120 (do_IRQ+0x7d/0xc0) >> qemu-sys-7543 0D.h1 14351us : __spin_lock+0xc/0x30 >> (handle_level_irq+0x24/0x120) >> qemu-sys-7543 0D.h2 14352us : mask_and_ack_8259A+0x14/0x120 >> (handle_level_irq+0x37/0x120) >> qemu-sys-7543 0D.h2 14352us+: __spin_lock_irqsave+0x11/0x60 >> (mask_and_ack_8259A+0x2a/0x120) >> qemu-sys-7543 0D.h3 14357us : __spin_unlock_irqrestore+0xc/0x60 >> (mask_and_ack_8259A+0x7a/0x120) >> qemu-sys-7543 0D.h2 14358us : redirect_hardirq+0x8/0x70 >> (handle_level_irq+0x72/0x120) >> qemu-sys-7543 0D.h2 14358us : __spin_unlock+0xb/0x40 >> (handle_level_irq+0x8e/0x120) >> qemu-sys-7543 0D.h1 14358us : handle_IRQ_event+0xe/0x110 >> (handle_level_irq+0x9a/0x120) >> qemu-sys-7543 0D.h1 14359us : timer_interrupt+0xb/0x60 >> (handle_IRQ_event+0x67/0x110) >> qemu-sys-7543 0D.h1 14359us : hrtimer_interrupt+0xe/0x1f0 >> (timer_interrupt+0x20/0x60) >> ... >> >> One can see 345 us latency between vm-enter and vm-exit in vmx_vcpu_run - >> and this while cyclictest runs at a period of 100 us! >> >> I got the same results over Adeos/I-pipe & Xenomai with the function >> tracer there, also pointing to the period while the CPU is in VM mode. >> >> Anyone any ideas? Greg, I put you on CC as you said you once saw "decent >> latencies" with your patches. Are there still magic bits missing in >> official kvm? >> > > No bits missing as far as I know. It should just work. > > Can you explain some more about the latenct tracer? How does it work?
Ah, sorry: The latency tracers in both -rt and I-pipe use gcc's -pg to put a call to a function called mcount at the beginning of each compiled function. mcount is provided by the tracers and stores the caller address, its parent, the current time, and more in a log. An API is provided to start and stop the trace, e.g. after someone (kernel or user space) detected large wakeup latencies. > Seeing vmx_vcpu_run() in there confuses me, as it always runs with > interrupts disabled (it does dispatch NMIs, so we could be seeing an NMI). The point is that the cyclictest does not find large latencies when kvm is not happening to start or stop right now. And if you are thinking about NMIs triggered by the kvm code on vm-exit: I also instrumented that code path, and it is not taken in case of the long delay. > > Please post a disassembly of your vmx_vcpu_run so we can interpret the > offsets. Here it comes: 00002df0 <vmx_vcpu_run>: 2df0: 55 push %ebp 2df1: 89 e5 mov %esp,%ebp 2df3: 53 push %ebx 2df4: 83 ec 08 sub $0x8,%esp 2df7: e8 fc ff ff ff call 2df8 <vmx_vcpu_run+0x8> 2dfc: 8b 5d 08 mov 0x8(%ebp),%ebx 2dff: 0f 20 c0 mov %cr0,%eax 2e02: 89 44 24 04 mov %eax,0x4(%esp) 2e06: c7 04 24 00 6c 00 00 movl $0x6c00,(%esp) 2e0d: e8 be d8 ff ff call 6d0 <vmcs_writel> 2e12: 8b 83 80 0d 00 00 mov 0xd80(%ebx),%eax 2e18: ba 14 6c 00 00 mov $0x6c14,%edx 2e1d: 89 d9 mov %ebx,%ecx 2e1f: 60 pusha 2e20: 51 push %ecx 2e21: 0f 79 d4 vmwrite %esp,%edx 2e24: 83 f8 00 cmp $0x0,%eax 2e27: 8b 81 78 01 00 00 mov 0x178(%ecx),%eax 2e2d: 0f 22 d0 mov %eax,%cr2 2e30: 8b 81 50 01 00 00 mov 0x150(%ecx),%eax 2e36: 8b 99 5c 01 00 00 mov 0x15c(%ecx),%ebx 2e3c: 8b 91 58 01 00 00 mov 0x158(%ecx),%edx 2e42: 8b b1 68 01 00 00 mov 0x168(%ecx),%esi 2e48: 8b b9 6c 01 00 00 mov 0x16c(%ecx),%edi 2e4e: 8b a9 64 01 00 00 mov 0x164(%ecx),%ebp 2e54: 8b 89 54 01 00 00 mov 0x154(%ecx),%ecx 2e5a: 75 05 jne 2e61 <vmx_vcpu_run+0x71> 2e5c: 0f 01 c2 vmlaunch 2e5f: eb 03 jmp 2e64 <vmx_vcpu_run+0x74> 2e61: 0f 01 c3 vmresume 2e64: 87 0c 24 xchg %ecx,(%esp) 2e67: 89 81 50 01 00 00 mov %eax,0x150(%ecx) 2e6d: 89 99 5c 01 00 00 mov %ebx,0x15c(%ecx) 2e73: ff 34 24 pushl (%esp) 2e76: 8f 81 54 01 00 00 popl 0x154(%ecx) 2e7c: 89 91 58 01 00 00 mov %edx,0x158(%ecx) 2e82: 89 b1 68 01 00 00 mov %esi,0x168(%ecx) 2e88: 89 b9 6c 01 00 00 mov %edi,0x16c(%ecx) 2e8e: 89 a9 64 01 00 00 mov %ebp,0x164(%ecx) 2e94: 0f 20 d0 mov %cr2,%eax 2e97: 89 81 78 01 00 00 mov %eax,0x178(%ecx) 2e9d: 8b 0c 24 mov (%esp),%ecx 2ea0: 59 pop %ecx 2ea1: 61 popa 2ea2: 0f 96 c0 setbe %al 2ea5: 88 83 84 0d 00 00 mov %al,0xd84(%ebx) 2eab: c7 04 24 24 48 00 00 movl $0x4824,(%esp) 2eb2: e8 49 d2 ff ff call 100 <vmcs_read32> 2eb7: a8 03 test $0x3,%al 2eb9: 0f 94 c0 sete %al 2ebc: 0f b6 c0 movzbl %al,%eax 2ebf: 89 83 28 01 00 00 mov %eax,0x128(%ebx) 2ec5: b8 7b 00 00 00 mov $0x7b,%eax 2eca: 8e d8 mov %eax,%ds 2ecc: 8e c0 mov %eax,%es 2ece: c7 83 80 0d 00 00 01 movl $0x1,0xd80(%ebx) 2ed5: 00 00 00 2ed8: c7 04 24 04 44 00 00 movl $0x4404,(%esp) 2edf: e8 1c d2 ff ff call 100 <vmcs_read32> 2ee4: 25 00 07 00 00 and $0x700,%eax 2ee9: 3d 00 02 00 00 cmp $0x200,%eax 2eee: 75 02 jne 2ef2 <vmx_vcpu_run+0x102> 2ef0: cd 02 int $0x2 2ef2: 83 c4 08 add $0x8,%esp 2ef5: 5b pop %ebx 2ef6: 5d pop %ebp 2ef7: c3 ret 2ef8: 90 nop 2ef9: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi Note that the first, unresolved call here goes to mcount(). Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel