host: 2.6.25-rc8, x86_64, kvm-66
      1 dual-core Xeon(R) CPU 3050 @ 2.13GHz, 6 GB RAM

(This behavior also occurs on a larger server with 2 dual-core Xeon(R)
CPU 5140 @ 2.33GHz, 4 GB RAM. Same kernel and kvm versions.)
guest: RHEL3 U8 (2.4.21-47.ELsmp), 2 vcpus, 2 GB RAM

As usual, I waited for a guest "event" -- high system time in the guest,
which appears to lock it up. Following the event, kscand was the top CPU
user (cumulative time) in the guest. During the event, 2 qemu threads peg
the host CPU at 100%.

Top samples from oprofile (oprofile was started after the freeze began and
stopped when guest response returned):

samples  %        image name    app name   symbol name
171716   35.1350  kvm-intel.ko  kvm_intel  vmx_vcpu_run
45836     9.3786  vmlinux       vmlinux    copy_user_generic_string
39417     8.0652  kvm.ko        kvm        kvm_read_guest_atomic
23604     4.8296  vmlinux       vmlinux    add_preempt_count
22878     4.6811  vmlinux       vmlinux    __smp_call_function_mask
16143     3.3030  kvm.ko        kvm        gfn_to_hva
14648     2.9971  vmlinux       vmlinux    sub_preempt_count
14589     2.9851  kvm.ko        kvm        __gfn_to_memslot
11666     2.3870  kvm.ko        kvm        unalias_gfn
10834     2.2168  kvm.ko        kvm        kvm_mmu_zap_page
10532     2.1550  kvm.ko        kvm        paging64_prefetch_page
6285      1.2860  kvm-intel.ko  kvm_intel  handle_exception
6066      1.2412  kvm.ko        kvm        kvm_arch_vcpu_ioctl_run
4741      0.9701  kvm.ko        kvm        kvm_add_trace
3801      0.7777  vmlinux       vmlinux    __copy_from_user_inatomic
3592      0.7350  vmlinux       vmlinux    follow_page
3326      0.6805  kvm.ko        kvm        mmu_memory_cache_alloc
3317      0.6787  kvm-intel.ko  kvm_intel  kvm_handle_exit
2971      0.6079  kvm.ko        kvm        paging64_page_fault
2777      0.5682  kvm.ko        kvm        paging64_walk_addr
2294      0.4694  kvm.ko        kvm        kvm_mmu_pte_write
2278      0.4661  kvm.ko        kvm        kvm_flush_remote_tlbs
2266      0.4636  kvm-intel.ko  kvm_intel  vmcs_writel
2086      0.4268  kvm.ko        kvm        mmu_set_spte
2041      0.4176  kvm.ko        kvm        kvm_read_guest
1615      0.3304  vmlinux       vmlinux    free_hot_cold_page

david

Avi Kivity wrote:
> David S. Ahern wrote:
>> I added the traces and captured data over another apparent lockup of
>> the guest.
>>
>> This seems to be representative of the sequence (pid/vcpu removed).
>>
>> (+4776)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016127c ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+3632)   VMENTRY
>> (+4552)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016104a ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61c8 ]
>> (+ 54928) VMENTRY
>
> Can you oprofile the host to see where the 54K cycles are spent?
>
>> (+4568)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+   0)   PTE_WRITE  [ gpa = 0x00000000 00009db4 gpte = 0x00000000 41c5d363 ]
>> (+8432)   VMENTRY
>> (+3936)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
>> (+   0)   PTE_WRITE  [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ]
>> (+ 13832) VMENTRY
>>
>> (+5768)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016127c ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+3712)   VMENTRY
>> (+4576)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016104a ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61d0 ]
>> (+   0)   PTE_WRITE  [ gpa = 0x00000000 3d5981d0 gpte = 0x00000000 3d55d047 ]
>
> This indeed has the accessed bit clear.
>
>> (+ 65216) VMENTRY
>> (+4232)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>> (+   0)   PTE_WRITE  [ gpa = 0x00000000 00009db4 gpte = 0x00000000 3d598363 ]
>
> This has the accessed bit set and the user bit clear, and the pte
> pointing at the previous pte_write gpa. Looks like a kmap_atomic().
>
>> (+8640)   VMENTRY
>> (+3936)   VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ]
>> (+   0)   PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
>> (+   0)   PTE_WRITE  [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ]
>> (+ 14160) VMENTRY
>>
>> I can forward a more complete time snippet if you'd like. The vcpu0 +
>> corresponding vcpu1 files have 85000 total lines; compressed, they
>> total ~500k.
>>
>> I did not see the FLOODED trace come out during this sample, though I
>> did bump the count from 3 to 4 as you suggested.
>>
>
> Bumping the count was supposed to remove the flooding...
>
>> Correlating rip addresses to the 2.4 kernel:
>>
>> c0160d00-c0161290 = page_referenced
>>
>> It looks like the event is kscand running through the pages. I
>> suspected this some time ago, and tried tweaking the
>> kscand_work_percent sysctl variable. It appeared to lower the peak of
>> the spikes, but maybe I imagined it. I believe lowering that value
>> makes kscand wake up more often but do less work (page scanning) each
>> time it is awakened.
>>
>
> What does 'top' in the guest show (perhaps sorted by total cpu time
> rather than instantaneous usage)?
>
> What host kernel are you running? How many host cpus?

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel