host:
2.6.25-rc8, x86_64, kvm-66
1 dual-core Xeon(R) CPU 3050 @ 2.13GHz
6 GB RAM
(This behavior also occurs on a larger server with 2 dual-core Xeon(R) CPU
5140 @ 2.33GHz, 4 GB RAM. Same kernel and kvm versions.)
guest:
RHEL3 U8 (2.4.21-47.ELsmp), 2 vcpus, 2 GB RAM
As usual, I waited for a guest "event" -- a period of high system time in the
guest that appears to lock it up. Following the event, kscand was the top CPU
user (by cumulative time) in the guest.
During the event, 2 qemu threads are pegging the host CPU at 100%. Top samples
from oprofile (oprofile was started after the freeze began and stopped when the
guest became responsive again):
samples  %        image name    app name   symbol name
171716   35.1350  kvm-intel.ko  kvm_intel  vmx_vcpu_run
 45836    9.3786  vmlinux       vmlinux    copy_user_generic_string
 39417    8.0652  kvm.ko        kvm        kvm_read_guest_atomic
 23604    4.8296  vmlinux       vmlinux    add_preempt_count
 22878    4.6811  vmlinux       vmlinux    __smp_call_function_mask
 16143    3.3030  kvm.ko        kvm        gfn_to_hva
 14648    2.9971  vmlinux       vmlinux    sub_preempt_count
 14589    2.9851  kvm.ko        kvm        __gfn_to_memslot
 11666    2.3870  kvm.ko        kvm        unalias_gfn
 10834    2.2168  kvm.ko        kvm        kvm_mmu_zap_page
 10532    2.1550  kvm.ko        kvm        paging64_prefetch_page
  6285    1.2860  kvm-intel.ko  kvm_intel  handle_exception
  6066    1.2412  kvm.ko        kvm        kvm_arch_vcpu_ioctl_run
  4741    0.9701  kvm.ko        kvm        kvm_add_trace
  3801    0.7777  vmlinux       vmlinux    __copy_from_user_inatomic
  3592    0.7350  vmlinux       vmlinux    follow_page
  3326    0.6805  kvm.ko        kvm        mmu_memory_cache_alloc
  3317    0.6787  kvm-intel.ko  kvm_intel  kvm_handle_exit
  2971    0.6079  kvm.ko        kvm        paging64_page_fault
  2777    0.5682  kvm.ko        kvm        paging64_walk_addr
  2294    0.4694  kvm.ko        kvm        kvm_mmu_pte_write
  2278    0.4661  kvm.ko        kvm        kvm_flush_remote_tlbs
  2266    0.4636  kvm-intel.ko  kvm_intel  vmcs_writel
  2086    0.4268  kvm.ko        kvm        mmu_set_spte
  2041    0.4176  kvm.ko        kvm        kvm_read_guest
  1615    0.3304  vmlinux       vmlinux    free_hot_cold_page
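
For reference, totalling the samples per image from a listing like the above is
a one-minute script; a rough sketch only (it just splits the whitespace-separated
opreport columns and assumes the listing is saved as profile.txt):

#!/usr/bin/env python
# Rough sketch: sum oprofile samples per image from an opreport listing
# in the whitespace-separated format shown above ("profile.txt" is an
# assumed file name).
from collections import defaultdict

totals = defaultdict(int)
grand = 0
with open("profile.txt") as f:
    for line in f:
        fields = line.split()
        # Data rows start with the sample count; skip the header line.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        samples, image = int(fields[0]), fields[2]
        totals[image] += samples
        grand += samples

for image, samples in sorted(totals.items(), key=lambda kv: -kv[1]):
    if grand:
        print("%-14s %8d  %6.2f%%" % (image, samples, 100.0 * samples / grand))
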
david
Avi Kivity wrote:
> David S. Ahern wrote:
>> I added the traces and captured data over another apparent lockup of
>> the guest.
>> This seems to be representative of the sequence (pid/vcpu removed).
>>
>> (+4776) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016127c ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+3632) VMENTRY
>> (+4552) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016104a ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61c8 ]
>> (+ 54928) VMENTRY
>>
>
> Can you oprofile the host to see where the 54K cycles are spent?
>
>> (+4568) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db4 gpte = 0x00000000 41c5d363 ]
>> (+8432) VMENTRY
>> (+3936) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
>> (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ]
>> (+ 13832) VMENTRY
>>
>>
>> (+5768) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016127c ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+3712) VMENTRY
>> (+4576) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016104a ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61d0 ]
>> (+ 0) PTE_WRITE [ gpa = 0x00000000 3d5981d0 gpte = 0x00000000 3d55d047 ]
>>
>
> This indeed has the accessed bit clear.
>
>> (+ 65216) VMENTRY
>> (+4232) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db4 gpte = 0x00000000 3d598363 ]
>>
>
> This has the accessed bit set and the user bit clear, and the pte
> pointing at the previous pte_write gpa. Looks like a kmap_atomic().
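
For anyone following along, those flag bits can be checked straight from the
gpte values in the trace. A rough Python sketch using only the standard x86 PTE
bit layout (nothing KVM-specific; the two gpte values are copied from above):

# Decode the low flag bits of an x86 page table entry (same layout for
# PAE and 64-bit entries): bit 0 present, bit 1 writable, bit 2 user,
# bit 5 accessed, bit 6 dirty.
FLAGS = [("present", 1 << 0), ("rw", 1 << 1), ("user", 1 << 2),
         ("accessed", 1 << 5), ("dirty", 1 << 6)]

def decode(gpte):
    return ", ".join(name for name, bit in FLAGS if gpte & bit)

# gpte values taken from the trace above
print(hex(0x3d55d047), "->", decode(0x3d55d047))  # accessed bit clear
print(hex(0x3d598363), "->", decode(0x3d598363))  # accessed set, user clear

The second gpte also points into the same page as the preceding PTE_WRITE gpa
(frame 0x3d598000), consistent with the kmap_atomic() reading.
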
>
>> (+8640) VMENTRY
>> (+3936) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ]
>> (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
>> (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ]
>> (+ 14160) VMENTRY
>>
>> I can forward a more complete time snippet if you'd like. The vcpu0 file
>> plus the corresponding vcpu1 file total 85000 lines and are about 500k
>> compressed.
>>
>> I did not see the FLOODED trace come out during this sample, though I did
>> bump the count from 3 to 4 as you suggested.
>>
>>
>>
>
> Bumping the count was supposed to remove the flooding...
>
>> Correlating rip addresses to the 2.4 kernel:
>>
>> c0160d00-c0161290 = page_referenced
>>
>> It looks like the event is kscand running through the pages. I suspected
>> this some time ago, and tried tweaking the kscand_work_percent sysctl
>> variable. It appeared to lower the peak of the spikes, but maybe I imagined
>> it. I believe lowering that value makes kscand wake up more often but do
>> less work (page scanning) each time it is awakened.
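
As an aside, the rip-to-symbol correlation above is easy to redo against the
guest kernel's System.map. A rough sketch, assuming the map file from the
2.4.21-47.ELsmp guest is available locally (file name is an assumption):

import bisect

# Map a guest rip to the containing kernel symbol using the guest's
# System.map ("address type name" per line, sorted by address).
addrs, names = [], []
with open("System.map-2.4.21-47.ELsmp") as f:   # assumed guest map file
    for line in f:
        addr, _type, name = line.split()[:3]
        addrs.append(int(addr, 16))
        names.append(name)

def symbol_for(rip):
    i = bisect.bisect_right(addrs, rip) - 1
    return names[i] if i >= 0 else "<unknown>"

# rip values from the trace; all fall in c0160d00-c0161290 (page_referenced)
for rip in (0xc016127c, 0xc016104a, 0xc01610e7, 0xc01610ee):
    print(hex(rip), "->", symbol_for(rip))
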
>>
>>
>
> What does 'top' in the guest show (perhaps sorted by total cpu time
> rather than instantaneous usage)?
>
> What host kernel are you running? How many host cpus?
>