Avi Kivity wrote:
> Jan Kiszka wrote:
>> Avi,
>>
>> [somehow your mails do not get through to my private account, so I'm
>> switching]
>>
>> Avi Kivity wrote:
>>
>>> Jan Kiszka wrote:
>>>
>>>> Clarification: I can't precisely tell what code is executed in VM mode,
>>>> as I don't have qemu or that guest instrumented. I just see the kernel
>>>> entering VM mode and leaving it again more than 300 us later. So I
>>>> wonder why this is allowed while some external IRQ is pending.
>>>>
>>>>
>>>>
>>> How do you know an external interrupt is pending?
>>>
>> It's the host timer IRQ, programmed to fire in certain intervals (100 us
>> here). Test case is some latency measurement tool like tglx's cyclictest
>> or similar programs we use in Xenomai.
>>
>>
>>> kvm programs the hardware to exit when an external interrupt arrives.
>>>
>>>
>> Here is a latency trace I just managed to capture over 2.6.23.1-rt1 with
>> latest kvm from git hacked into (kvm generally seems to work fine this way):
>>
>> ...
>> qemu-sys-7543 0...1 13897us : vmcs_write16+0xb/0x20
>> (vmx_save_host_state+0x1a7/0x1c0)
>> qemu-sys-7543 0...1 13897us : vmcs_writel+0xb/0x30 (vmcs_write16+0x1e/0x20)
>> qemu-sys-7543 0...1 13898us : segment_base+0xc/0x70
>> (vmx_save_host_state+0xa0/0x1c0)
>> qemu-sys-7543 0...1 13898us : vmcs_writel+0xb/0x30
>> (vmx_save_host_state+0xb0/0x1c0)
>> qemu-sys-7543 0...1 13898us : segment_base+0xc/0x70
>> (vmx_save_host_state+0xbf/0x1c0)
>> qemu-sys-7543 0...1 13898us : vmcs_writel+0xb/0x30
>> (vmx_save_host_state+0xcf/0x1c0)
>> qemu-sys-7543 0...1 13898us : load_msrs+0xb/0x40
>> (vmx_save_host_state+0xe7/0x1c0)
>> qemu-sys-7543 0...1 13898us : kvm_load_guest_fpu+0x8/0x40
>> (kvm_vcpu_ioctl_run+0xbf/0x570)
>> qemu-sys-7543 0D..1 13899us : vmx_vcpu_run+0xc/0x110
>> (kvm_vcpu_ioctl_run+0x120/0x570)
>> qemu-sys-7543 0D..1 13899us!: vmcs_writel+0xb/0x30 (vmx_vcpu_run+0x22/0x110)
>> qemu-sys-7543 0D..1 14344us : vmcs_read32+0xb/0x20 (vmx_vcpu_run+0xc7/0x110)
>> qemu-sys-7543 0D..1 14345us : vmcs_readl+0x8/0x10 (vmcs_read32+0x16/0x20)
>> qemu-sys-7543 0D..1 14345us : vmcs_read32+0xb/0x20 (vmx_vcpu_run+0xf4/0x110)
>> qemu-sys-7543 0D..1 14345us+: vmcs_readl+0x8/0x10 (vmcs_read32+0x16/0x20)
>> qemu-sys-7543 0D..1 14349us : irq_enter+0xb/0x30 (do_IRQ+0x45/0xc0)
>> qemu-sys-7543 0D.h1 14350us : do_IRQ+0x73/0xc0 (f8caae24 0 0)
>> qemu-sys-7543 0D.h1 14351us : handle_level_irq+0xe/0x120 (do_IRQ+0x7d/0xc0)
>> qemu-sys-7543 0D.h1 14351us : __spin_lock+0xc/0x30
>> (handle_level_irq+0x24/0x120)
>> qemu-sys-7543 0D.h2 14352us : mask_and_ack_8259A+0x14/0x120
>> (handle_level_irq+0x37/0x120)
>> qemu-sys-7543 0D.h2 14352us+: __spin_lock_irqsave+0x11/0x60
>> (mask_and_ack_8259A+0x2a/0x120)
>> qemu-sys-7543 0D.h3 14357us : __spin_unlock_irqrestore+0xc/0x60
>> (mask_and_ack_8259A+0x7a/0x120)
>> qemu-sys-7543 0D.h2 14358us : redirect_hardirq+0x8/0x70
>> (handle_level_irq+0x72/0x120)
>> qemu-sys-7543 0D.h2 14358us : __spin_unlock+0xb/0x40
>> (handle_level_irq+0x8e/0x120)
>> qemu-sys-7543 0D.h1 14358us : handle_IRQ_event+0xe/0x110
>> (handle_level_irq+0x9a/0x120)
>> qemu-sys-7543 0D.h1 14359us : timer_interrupt+0xb/0x60
>> (handle_IRQ_event+0x67/0x110)
>> qemu-sys-7543 0D.h1 14359us : hrtimer_interrupt+0xe/0x1f0
>> (timer_interrupt+0x20/0x60)
>> ...
>>
>> One can see 345 us latency between vm-enter and vm-exit in vmx_vcpu_run -
>> and this while cyclictest runs at a period of 100 us!
>>
>> I got the same results over Adeos/I-pipe & Xenomai with the function
>> tracer there, also pointing to the period while the CPU is in VM mode.
>>
>> Anyone any ideas? Greg, I put you on CC as you said you once saw "decent
>> latencies" with your patches. Are there still magic bits missing in
>> official kvm?
>>
>
> No bits missing as far as I know. It should just work.
>
> Can you explain some more about the latenct tracer? How does it work?
Ah, sorry: The latency tracers in both -rt and I-pipe use gcc's -pg to
put a call to a function called mcount at the beginning of each compiled
function. mcount is provided by the tracers and stores the caller
address, its parent, the current time, and more in a log. An API is
provided to start and stop the trace, e.g. after someone (kernel or user
space) detected large wakeup latencies.
> Seeing vmx_vcpu_run() in there confuses me, as it always runs with
> interrupts disabled (it does dispatch NMIs, so we could be seeing an NMI).
The point is that the cyclictest does not find large latencies when kvm
is not happening to start or stop right now. And if you are thinking
about NMIs triggered by the kvm code on vm-exit: I also instrumented
that code path, and it is not taken in case of the long delay.
>
> Please post a disassembly of your vmx_vcpu_run so we can interpret the
> offsets.
Here it comes:
00002df0 <vmx_vcpu_run>:
2df0: 55 push %ebp
2df1: 89 e5 mov %esp,%ebp
2df3: 53 push %ebx
2df4: 83 ec 08 sub $0x8,%esp
2df7: e8 fc ff ff ff call 2df8 <vmx_vcpu_run+0x8>
2dfc: 8b 5d 08 mov 0x8(%ebp),%ebx
2dff: 0f 20 c0 mov %cr0,%eax
2e02: 89 44 24 04 mov %eax,0x4(%esp)
2e06: c7 04 24 00 6c 00 00 movl $0x6c00,(%esp)
2e0d: e8 be d8 ff ff call 6d0 <vmcs_writel>
2e12: 8b 83 80 0d 00 00 mov 0xd80(%ebx),%eax
2e18: ba 14 6c 00 00 mov $0x6c14,%edx
2e1d: 89 d9 mov %ebx,%ecx
2e1f: 60 pusha
2e20: 51 push %ecx
2e21: 0f 79 d4 vmwrite %esp,%edx
2e24: 83 f8 00 cmp $0x0,%eax
2e27: 8b 81 78 01 00 00 mov 0x178(%ecx),%eax
2e2d: 0f 22 d0 mov %eax,%cr2
2e30: 8b 81 50 01 00 00 mov 0x150(%ecx),%eax
2e36: 8b 99 5c 01 00 00 mov 0x15c(%ecx),%ebx
2e3c: 8b 91 58 01 00 00 mov 0x158(%ecx),%edx
2e42: 8b b1 68 01 00 00 mov 0x168(%ecx),%esi
2e48: 8b b9 6c 01 00 00 mov 0x16c(%ecx),%edi
2e4e: 8b a9 64 01 00 00 mov 0x164(%ecx),%ebp
2e54: 8b 89 54 01 00 00 mov 0x154(%ecx),%ecx
2e5a: 75 05 jne 2e61 <vmx_vcpu_run+0x71>
2e5c: 0f 01 c2 vmlaunch
2e5f: eb 03 jmp 2e64 <vmx_vcpu_run+0x74>
2e61: 0f 01 c3 vmresume
2e64: 87 0c 24 xchg %ecx,(%esp)
2e67: 89 81 50 01 00 00 mov %eax,0x150(%ecx)
2e6d: 89 99 5c 01 00 00 mov %ebx,0x15c(%ecx)
2e73: ff 34 24 pushl (%esp)
2e76: 8f 81 54 01 00 00 popl 0x154(%ecx)
2e7c: 89 91 58 01 00 00 mov %edx,0x158(%ecx)
2e82: 89 b1 68 01 00 00 mov %esi,0x168(%ecx)
2e88: 89 b9 6c 01 00 00 mov %edi,0x16c(%ecx)
2e8e: 89 a9 64 01 00 00 mov %ebp,0x164(%ecx)
2e94: 0f 20 d0 mov %cr2,%eax
2e97: 89 81 78 01 00 00 mov %eax,0x178(%ecx)
2e9d: 8b 0c 24 mov (%esp),%ecx
2ea0: 59 pop %ecx
2ea1: 61 popa
2ea2: 0f 96 c0 setbe %al
2ea5: 88 83 84 0d 00 00 mov %al,0xd84(%ebx)
2eab: c7 04 24 24 48 00 00 movl $0x4824,(%esp)
2eb2: e8 49 d2 ff ff call 100 <vmcs_read32>
2eb7: a8 03 test $0x3,%al
2eb9: 0f 94 c0 sete %al
2ebc: 0f b6 c0 movzbl %al,%eax
2ebf: 89 83 28 01 00 00 mov %eax,0x128(%ebx)
2ec5: b8 7b 00 00 00 mov $0x7b,%eax
2eca: 8e d8 mov %eax,%ds
2ecc: 8e c0 mov %eax,%es
2ece: c7 83 80 0d 00 00 01 movl $0x1,0xd80(%ebx)
2ed5: 00 00 00
2ed8: c7 04 24 04 44 00 00 movl $0x4404,(%esp)
2edf: e8 1c d2 ff ff call 100 <vmcs_read32>
2ee4: 25 00 07 00 00 and $0x700,%eax
2ee9: 3d 00 02 00 00 cmp $0x200,%eax
2eee: 75 02 jne 2ef2 <vmx_vcpu_run+0x102>
2ef0: cd 02 int $0x2
2ef2: 83 c4 08 add $0x8,%esp
2ef5: 5b pop %ebx
2ef6: 5d pop %ebp
2ef7: c3 ret
2ef8: 90 nop
2ef9: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
Note that the first, unresolved call here goes to mcount().
Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel