Re: [PATCH v4 0/6] PCI: support the ATS capability
On Thu, Mar 26, 2009 at 04:15:56PM -0700, Jesse Barnes wrote:
...
> This is a good sized chunk of new code, and you want it to come through
> the PCI tree, right? It looks like it's seen some review from Grant,
> David and Matthew but I don't see any Reviewed-by or Acked-by tags in
> there... Anyone willing to provide those?

Sorry, I'm not. I've read through the code but don't understand many of
the details about how this particular HW works. All I can do is pick nits.

cheers,
grant
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 4/6] VT-d: add device IOTLB invalidation support
On Mon, Mar 23, 2009 at 03:59:00PM +0800, Yu Zhao wrote:
> Support device IOTLB invalidation to flush the translation cached
> in the Endpoint.
>
> Signed-off-by: Yu Zhao
> ---
>  drivers/pci/dmar.c          |   77 ++
>  include/linux/intel-iommu.h |   14 +++-
>  2 files changed, 82 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
> index 106bc45..494b167 100644
> --- a/drivers/pci/dmar.c
> +++ b/drivers/pci/dmar.c
> @@ -674,7 +674,8 @@ void free_iommu(struct intel_iommu *iommu)
>   */
>  static inline void reclaim_free_desc(struct q_inval *qi)
>  {
> -	while (qi->desc_status[qi->free_tail] == QI_DONE) {
> +	while (qi->desc_status[qi->free_tail] == QI_DONE ||
> +	       qi->desc_status[qi->free_tail] == QI_ABORT) {
> 		qi->desc_status[qi->free_tail] = QI_FREE;
> 		qi->free_tail = (qi->free_tail + 1) % QI_LENGTH;
> 		qi->free_cnt++;
> @@ -684,10 +685,13 @@ static inline void reclaim_free_desc(struct q_inval *qi)
>  static int qi_check_fault(struct intel_iommu *iommu, int index)
>  {
> 	u32 fault;
> -	int head;
> +	int head, tail;
> 	struct q_inval *qi = iommu->qi;
> 	int wait_index = (index + 1) % QI_LENGTH;
>
> +	if (qi->desc_status[wait_index] == QI_ABORT)
> +		return -EAGAIN;
> +
> 	fault = readl(iommu->reg + DMAR_FSTS_REG);
>
> 	/*
> @@ -697,7 +701,11 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
> 	 */
> 	if (fault & DMA_FSTS_IQE) {
> 		head = readl(iommu->reg + DMAR_IQH_REG);
> -		if ((head >> 4) == index) {
> +		if ((head >> DMAR_IQ_OFFSET) == index) {

Yu,
DMAR_IQ_OFFSET should probably be called DMAR_IQ_SHIFT since it's used
the same way that "PAGE_SHIFT" is used.

I've looked through the rest of the code and don't see any problems.
But I also don't have a clue what "ITE" (in IOMMU context) is. I'm
assuming it has something to do with translation errors but have no
idea about where/when those are generated and what the outcome is.

thanks,
grant

> +			printk(KERN_ERR "VT-d detected invalid descriptor: "
> +				"low=%llx, high=%llx\n",
> +				(unsigned long long)qi->desc[index].low,
> +				(unsigned long long)qi->desc[index].high);
> 			memcpy(&qi->desc[index], &qi->desc[wait_index],
> 				sizeof(struct qi_desc));
> 			__iommu_flush_cache(iommu, &qi->desc[index],
> @@ -707,6 +715,32 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
> 		}
> 	}
>
> +	/*
> +	 * If ITE happens, all pending wait_desc commands are aborted.
> +	 * No new descriptors are fetched until the ITE is cleared.
> +	 */
> +	if (fault & DMA_FSTS_ITE) {
> +		head = readl(iommu->reg + DMAR_IQH_REG);
> +		head = ((head >> DMAR_IQ_OFFSET) - 1 + QI_LENGTH) % QI_LENGTH;
> +		head |= 1;
> +		tail = readl(iommu->reg + DMAR_IQT_REG);
> +		tail = ((tail >> DMAR_IQ_OFFSET) - 1 + QI_LENGTH) % QI_LENGTH;
> +
> +		writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
> +
> +		do {
> +			if (qi->desc_status[head] == QI_IN_USE)
> +				qi->desc_status[head] = QI_ABORT;
> +			head = (head - 2 + QI_LENGTH) % QI_LENGTH;
> +		} while (head != tail);
> +
> +		if (qi->desc_status[wait_index] == QI_ABORT)
> +			return -EAGAIN;
> +	}
> +
> +	if (fault & DMA_FSTS_ICE)
> +		writel(DMA_FSTS_ICE, iommu->reg + DMAR_FSTS_REG);
> +
> 	return 0;
>  }
>
> @@ -716,7 +750,7 @@ static int qi_check_fault(struct intel_iommu *iommu, int index)
>   */
>  int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
>  {
> -	int rc = 0;
> +	int rc;
> 	struct q_inval *qi = iommu->qi;
> 	struct qi_desc *hw, wait_desc;
> 	int wait_index, index;
> @@ -727,6 +761,9 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
>
> 	hw = qi->desc;
>
> +restart:
> +	rc = 0;
> +
> 	spin_lock_irqsave(&qi->q_lock, flags);
> 	while (qi->free_cnt < 3) {
> 		spin_unlock_irqrestore(&qi->q_lock, flags);
> @@ -757,7 +794,7 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
> 	 * update the HW tail register indicating the presence of
> 	 * new descriptors.
> 	 */
> -	writel(qi->free_head << 4, iommu->reg + DMAR_IQT_REG);
> +	writel(qi->free_head << DMAR_IQ_OFFSET, iommu->reg + DMAR_IQT_REG);
>
> 	while (qi->desc_status[wait_index] != QI_DONE) {
> 		/*
> @@ -769,18 +806,21 @@ int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
> 		 */
> 		rc =
BUG: soft lockup - CPU stuck for ...
Hi,

does anyone know how to solve the problem with "BUG: soft lockup - CPU#0
stuck for ..."? Today I got the messages below during compilation of
kernel modules in a guest. I am using kvm-84 with kernel 2.6.29 as the
host kernel and 2.6.28 as the guest kernel. During the hangup of the
guest neither ssh nor ping was possible. After about 2 minutes the guest
was reachable again and I saw the messages below with "dmesg". Maybe it
is related to my previously answered posting:
http://article.gmane.org/gmane.comp.emulators.kvm.devel/29677

Thanks!
Robert

BUG: soft lockup - CPU#0 stuck for 61s! [cc1:17803]
Modules linked in: ipv6 floppy ppdev virtio_net pcspkr parport_pc i2c_piix4 i2c_core parport thermal processor button e1000 nfs lockd sunrpc jfs raid10 raid456 async_memcpy async_xor xor async_tx raid1 raid0 dm_bbr dm_snapshot dm_mirror dm_region_hash dm_log dm_mod sbp2 ohci1394 ieee1394 sl811_hcd usbhid ohci_hcd ssb uhci_hcd usb_storage ehci_hcd usbcore lpfc qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_ mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg videobuf_core pdc_adma sata_inic162x sata_mv ata_piix ahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise scsi_wait_scan pata_sl82c105 pata_cs5530 pata_cs5520 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia firmware_class pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_platform pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix libata
CPU 0:
Modules linked in: ipv6 floppy ppdev virtio_net pcspkr parport_pc i2c_piix4 i2c_core parport
thermal processor button e1000 nfs lockd sunrpc jfs raid10 raid456 async_memcpy async_xor xor async_tx raid1 raid0 dm_bbr dm_snapshot dm_mirror dm_region_hash dm_log dm_mod sbp2 ohci1394 ieee1394 sl811_hcd usbhid ohci_hcd ssb uhci_hcd usb_storage ehci_hcd usbcore lpfc qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 cciss 3w_9xxx 3w_ mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg videobuf_core pdc_adma sata_inic162x sata_mv ata_piix ahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise scsi_wait_scan pata_sl82c105 pata_cs5530 pata_cs5520 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia firmware_class pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_platform pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix libata Pid: 17803, comm: cc1 Not tainted 2.6.28-gentoo-r2 #1 RIP: 0010:[] [] 0x80221b0b RSP: :8800b8cb1d58 EFLAGS: 0246 RAX: 0018 RBX: 8800b8cb1d78 RCX: b8cb1db8 RDX: RSI: 0018 RDI: 8800b8cb1db8 RBP: 8800b8cb1d78 R08: 807040e8 R09: c8e5 R10: 8800b8cb1e08 R11: 0002 R12: 001f R13: R14: 802792e1 R15: 8800b8cb1cf8 FS: 2aee99c006f0() GS:807d5000() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 2aee9b7e CR3: a851 CR4: 06e0 DR0: DR1:
Re: [PATCH] kvm mmu: add support for 1GB pages in shadow paging code
On Sat, Mar 28, 2009 at 06:28:35PM -0300, Marcelo Tosatti wrote:
> > I have searched this bug for quite some time with no real luck. Maybe
> > some other reviewers have more luck than I had by now.
>
> Sorry, I can't spot what is wrong here. Avi?
>
> Perhaps it helps if you provide some info of the hang when the guest
> allocates hugepages on boot (it's probably an endless fault that can't
> be corrected?).

I will try to find out why the guest gets stuck. I also created a full
mmu trace of a boot crash case. But its size was around 170MB and I
found no real problem there.

> Also another point is that the large huge page at 0-1GB will never
> be created, because it crosses a slot boundary.

The instabilities only occur if the guest has enough memory to use a
gbpage in its own direct mapping. They also go away when I boot it with
the nogbpages command line option. So it's likely that it has something
to do with the processing of guest gbpages in the softmmu code. But I
looked over this code again and again. There seems to be no bug.

> > Signed-off-by: Joerg Roedel
> > ---
> >  arch/x86/kvm/mmu.c         |   56 +++
> >  arch/x86/kvm/paging_tmpl.h |   35 +-
> >  arch/x86/kvm/svm.c         |    2 +-
> >  3 files changed, 68 insertions(+), 25 deletions(-)
> >
> > +	psize = backing_size(vcpu, vcpu->arch.update_pte.gfn);
>
> This can block, and this path holds mmu_lock. That's why it needs to be
> done in guess_page_from_pte_write.

Ah true. Thanks for pointing this out. The previous code in the
guess_page function makes sense now.
> > +	if ((sp->role.level == PT_DIRECTORY_LEVEL) &&
> > +	    (psize >= KVM_PAGE_SIZE_2M)) {
> > +		psize = KVM_PAGE_SIZE_2M;
> > +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> > +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> > +	} else if ((sp->role.level == PT_MIDDLE_LEVEL) &&
> > +		   (psize == KVM_PAGE_SIZE_1G)) {
> > +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> > +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> > +	} else
> > +		goto out_pde;
>
> Better just zap the entry in case it's a 1GB one and let the
> fault path handle it.

Yes, that's probably better.

	Joerg
Re: [PATCH 0/7] Support for GB pages in KVM
On Sat, Mar 28, 2009 at 06:40:08PM -0300, Marcelo Tosatti wrote:
> On Fri, Mar 27, 2009 at 03:31:52PM +0100, Joerg Roedel wrote:
> > Hi,
> >
> > this patchset extends the KVM MMU implementation to support 1GB pages
> > as supported by AMD family 16 processors. These patches enable support
> > for 1 GB pages with Nested Paging. Support for these pages in the
> > shadow paging code was also developed but does not run stable yet. The
> > patch for shadow-paging support is not included in this series and
> > will be sent out separately.
>
> Looks generally sane. I'm not sure it's even worthwhile to support
> GBpages with softmmu, because the chance of finding an area without
> shadowed (write protected) pages is much smaller than with 2MB pages.

Thanks for your review. The idea behind GB pages in the softmmu code was
to provide GB pages to the guest even if the hardware does not support
them. This would work better with live migration (the only case where we
wouldn't have gbpages then would be vmx with ept enabled).

> Have any numbers to share?

No numbers I fully trust by now. I measured a 32% improvement in
kernbench using nested pages backed with gb pages. I will do some more
measurements and share some more solid numbers.

	Joerg
Re: [PATCH 0/7] Support for GB pages in KVM
On Fri, Mar 27, 2009 at 03:31:52PM +0100, Joerg Roedel wrote:
> Hi,
>
> this patchset extends the KVM MMU implementation to support 1GB pages
> as supported by AMD family 16 processors. These patches enable support
> for 1 GB pages with Nested Paging. Support for these pages in the
> shadow paging code was also developed but does not run stable yet. The
> patch for shadow-paging support is not included in this series and will
> be sent out separately.

Looks generally sane. I'm not sure it's even worthwhile to support
GBpages with softmmu, because the chance of finding an area without
shadowed (write protected) pages is much smaller than with 2MB pages.

Have any numbers to share?
Re: [PATCH] kvm mmu: add support for 1GB pages in shadow paging code
On Fri, Mar 27, 2009 at 03:35:18PM +0100, Joerg Roedel wrote:
> This patch adds support for 1GB pages in the shadow paging code. The
> guest can map 1GB pages in its page tables and KVM will map the page
> frame with a 1GB, a 2MB or even a 4kb page size, according to the
> backing host page size and the write protections in place.
> This is the theory. In practice there are conditions which turn the
> guest unstable when running with this patch and GB pages enabled. The
> failing conditions are:
>
> 	* KVM is loaded using shadow paging
> 	* The Linux guest uses GB pages for the kernel direct mapping
> 	* The guest memory is backed with 4kb pages on the host side
>
> With the above configuration there are random application or kernel
> crashes when the guest runs under load. When GB pages for HugeTLBfs in
> the guest are allocated at boot time in the guest, the guest kernel
> crashes or gets stuck at boot depending on the amount of RAM in the
> guest. The following parameters have no impact:
>
> 	* The bug also occurs without guest SMP (so likely no race
> 	  condition)
> 	* Using PV-MMU makes no difference
>
> I have searched this bug for quite some time with no real luck. Maybe
> some other reviewers have more luck than I had by now.

Sorry, I can't spot what is wrong here. Avi?

Perhaps it helps if you provide some info of the hang when the guest
allocates hugepages on boot (it's probably an endless fault that can't
be corrected?).

Also another point is that the large huge page at 0-1GB will never
be created, because it crosses a slot boundary.

> Signed-off-by: Joerg Roedel
> ---
>  arch/x86/kvm/mmu.c         |   56 +++
>  arch/x86/kvm/paging_tmpl.h |   35 +-
>  arch/x86/kvm/svm.c         |    2 +-
>  3 files changed, 68 insertions(+), 25 deletions(-)
>
> +	psize = backing_size(vcpu, vcpu->arch.update_pte.gfn);

This can block, and this path holds mmu_lock. That's why it needs to be
done in guess_page_from_pte_write.
> +	if ((sp->role.level == PT_DIRECTORY_LEVEL) &&
> +	    (psize >= KVM_PAGE_SIZE_2M)) {
> +		psize = KVM_PAGE_SIZE_2M;
> +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_2M_PAGE-1);
> +	} else if ((sp->role.level == PT_MIDDLE_LEVEL) &&
> +		   (psize == KVM_PAGE_SIZE_1G)) {
> +		vcpu->arch.update_pte.gfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> +		vcpu->arch.update_pte.pfn &= ~(KVM_PAGES_PER_1G_PAGE-1);
> +	} else
> +		goto out_pde;

Better just zap the entry in case it's a 1GB one and let the
fault path handle it.
[ kvm-Bugs-2702210 ] -cpu core2duo still crashes linux guests under kvm-84
Bugs item #2702210, was opened at 2009-03-22 05:57
Message generated for change (Comment added) made by franxisco1988
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2702210&group_id=180599

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request, not
just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: klondike (franxisco1988)
Assigned to: Nobody/Anonymous (nobody)
Summary: -cpu core2duo still crashes linux guests under kvm-84

Initial Comment:
Using kvm-84 and -cpu core2duo still crashes the guest.
Related bug #2413430

cpuinfo output:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
stepping	: 6
cpu MHz		: 1862.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow
bogomips	: 3739.71
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU 6320 @ 1.86GHz
stepping	: 6
cpu MHz		: 1862.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow
bogomips	: 3739.68
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Distribution: gentoo hardened 64-bit.

----------------------------------------------------------------------

>Comment By: klondike (franxisco1988)
Date: 2009-03-28 21:16

Message:
After checking with git code it worked fine so I'll close the bug. (And
wait for the next release).

----------------------------------------------------------------------

Comment By: Amit Shah (amitshah)
Date: 2009-03-24 11:48

Message:
I can't reproduce this with the development git tree (both kernel and
userspace). The error message you mention in the host kernel log is also
quite old and has been fixed.

Can you try the development git snapshot or a nightly build to see if
this works for you too?

I tried on a 64-bit host with the 64-bit 2008.0 gentoo live cd.

----------------------------------------------------------------------

Comment By: klondike (franxisco1988)
Date: 2009-03-22 15:48

Message:
My fault, the only reference to the crash is a 100% CPU usage for more
than 15 minutes when trying to start a Gentoo 2008.0 Live CD

The last lines on the guest are:
Freeing unused kernel memory: 256k freed
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input2

On the host, dmesg says:
kvm: 5613: cpu0 unhandled wrmsr: 0xc0010117 data 0

And qemu outputs nothing.

----------------------------------------------------------------------

Comment By: Amit Shah (amitshah)
Date: 2009-03-22 09:42

Message:
Please give more information: the logs in the host kernel related to the
crash, guest kernel output before the crash, qemu output on the host.

I tried this and it worked for me with a 64-bit Debian Lenny guest.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2702210&group_id=180599
Re: problems with live migration using kvm-84
Tomasz Chmielewski wpkg.org> writes:

> Gerrit Slomma schrieb:
> > Hello and good day.
> >
> > I have filed a bug report via bugzilla.redhat.com with the id
>
> With what ID?
>
> Could you give the full URL?

Sorry, my fault. The full URL is:

https://bugzilla.redhat.com/show_bug.cgi?id=492688

The problem is described there, but for the lazy I copy-paste it here:

Virtual machine gets stuck after migration. One CPU is at 100% load, the
other one idles. The virtual machine can be pinged, but ssh login is not
always possible. If stopped on the migration target and continued on the
migration source, the virtual machine recovers after some time without
load.

Version-Release number of selected component (if applicable):
kvm-84-1.el5.x86_64.rpm
kvm-kmod-84-1.el5.x86_64.rpm
qemu-0.9.1-11.el5.x86_64.rpm
qemu-img-0.9.1-11.el5.x86_64.rpm

kvm and kvm-kmod were compiled from the sourceforge sources and
installed from the built rpm; qemu and qemu-img are from the EPEL
repository.

How reproducible:
Start a kvm virtual machine on host A, start a kvm virtual machine with
the same parameters on host B in incoming mode. Migrate the virtual
machine from host A to host B. Watch the kvm process go to 100% after
the migration finishes. Maybe you have to wait up to 5 seconds or run an
ls or such in the virtual machine. After stopping the virtual machine on
host B and continuing it on host A, the virtual machine recovers.
Re: Live memory allocation?
Alberto Treviño byu.edu> writes:

> The problem I've seen with this feature is that Windows guests end up
> taking all of their available memory once they are up and running. For
> example, booting Windows XP in KVM 82 shows a steady increase in
> memory. Then, about the time the login box is about to appear, memory
> usage jumps to the maximum allowed to the VM (512 MB in this case). I
> remember reading somewhere Windows would try to initialize all memory
> during boot, causing KVM to allocate all memory. VMware, however (and I
> don't know about VirtualBox) knows about this and works around it,
> making sure memory isn't all allocated during the Windows boot process.

Windows does zero all memory at boot, and also runs an idle-priority
thread in the background to zero memory as it is freed. This way it is
far less likely to need to zero a page to satisfy a memory allocation
request. Whether or not this is still a win now that people care about
power consumption is an open question.

I suspect the difference in behavior between KVM and VMware is related
to VMware's page sharing. All those zeroed pages can be collapsed into
one COW zero page. I wouldn't be surprised to learn that VMware has
heuristics in the page sharing code specifically for Windows guests.

Perhaps KSM would help you? Alternately, a heuristic that scanned for
(and collapsed) fully zeroed pages when a page is faulted in for the
first time could catch these.
Re: I/O errors after migration - why?
On Sat, 2009-03-28 at 11:21 +0100, Tomasz Chmielewski wrote:
> Nolan schrieb:
> > Tomasz Chmielewski wpkg.org> writes:
> >> I'm trying to perform live migration by following the instructions on
> >> http://www.linux-kvm.org/page/Migration.
> >> Unfortunately, it doesn't work very well - the guest is migrated, but
> >> loses access to its disk.
> >
> > The LSI logic scsi device model doesn't implement device state
> > save/restore. Any suspend/resume, snapshot or migration will fail.
>
> Oh, that sucks - as not everything supports virtio (which doesn't work
> for me as well for some reason) - like Windows (which should be
> addressed soon with block virtio drivers), but also older installations
> running older kernels.

It is indeed a shame. I wish I had the time to investigate and resolve
the problems with the patch that I linked to previously. LSI in
particular is important for interoperability, as that is what VMware
uses.

> Does IDE support migration?

It appears to, but I am not 100% sure that it will always survive
migration under heavy IO load. I've gotten mixed messages on whether or
not the qemu core waits for all in-flight IOs to complete or if the
device models need to checkpoint pending IOs themselves. Experimental
evidence suggests that it does not. Also, from ide.c's checkpoint save
code:

    /* XXX: if a transfer is pending, we do not save it yet */

I think the ideal here would be to stop the CPUs, but let the device
models continue to run. Once all pending IOs have completed (and DMAed
data and/or descriptors into guest memory, or raised interrupts, or
whatever), then checkpoint all device state. When the guest resumes, it
will see an unusual flurry of IO completions and/or interrupts, but it
should be able to handle that OK. It shouldn't look much different from
SMM taking over for a while during high IO load.

This would save a lot of (unwritten, complex, hard to test)
checkpointing code in the device models.
It might cause a missed timer interrupt or two if there is a lot of slow
IO, but that can be compensated for if needed.

> > I sent a patch that partially addresses this (but is buggy in the
> > presence of in-flight IO):
> > http://lists.gnu.org/archive/html/qemu-devel/2009-01/msg00744.html
Re: Live memory allocation?
On Saturday 28 March 2009 08:38:33 Alberto Treviño wrote:
> On Thursday 26 March 2009 08:11:02 am Tomasz Chmielewski wrote:
> > Like, two guests, each with 2 GB memory allocated, only use 1 GB of
> > the host's memory (as long as they don't have many
> > programs/buffers/cache)?
> >
> > So yes, it's also supported by KVM.
>
> The problem I've seen with this feature is that Windows guests end up
> taking all of their available memory once they are up and running. For
> example, booting Windows XP in KVM 82 shows a steady increase in
> memory. Then about the time the login box is about to appear, memory
> usage jumps to the maximum allowed to the VM (512 MB in this case). I
> remember reading somewhere Windows would try to initialize all memory
> during boot, causing KVM to allocate all memory. VMware, however (and I
> don't know about VirtualBox) knows about this and works around it,
> making sure memory isn't all allocated during the Windows boot process.
>
> Would there be a way to work around the Windows memory allocation issue
> in KVM as well?

The KVM devs have a patch called KSM (short for kernel shared memory, I
think) that helps Windows guests a good bit. See the original
announcement [1] for some numbers. I spoke to one of the devs recently
and they said they are going to resubmit it soon.

[1] http://marc.info/?l=kvm&m=122688851003046&w=2
Re: Live memory allocation?
On Thursday 26 March 2009 08:11:02 am Tomasz Chmielewski wrote:
> Like, two guests, each with 2 GB memory allocated, only use 1 GB of
> the host's memory (as long as they don't have many
> programs/buffers/cache)?
>
> So yes, it's also supported by KVM.

The problem I've seen with this feature is that Windows guests end up
taking all of their available memory once they are up and running. For
example, booting Windows XP in KVM 82 shows a steady increase in memory.
Then about the time the login box is about to appear, memory usage jumps
to the maximum allowed to the VM (512 MB in this case). I remember
reading somewhere Windows would try to initialize all memory during
boot, causing KVM to allocate all memory. VMware, however (and I don't
know about VirtualBox) knows about this and works around it, making sure
memory isn't all allocated during the Windows boot process.

Would there be a way to work around the Windows memory allocation issue
in KVM as well?
Re: problems with live migration using kvm-84
Gerrit Slomma schrieb:
> Hello and good day.
>
> I have filed a bug report via bugzilla.redhat.com with the id

With what ID?

Could you give the full URL?

--
Tomasz Chmielewski
http://wpkg.org
problems with live migration using kvm-84
Hello and good day.

I have filed a bug report via bugzilla.redhat.com with the id

Any ideas about this? It is urgent to resolve the problem, otherwise I
am urged by my company to use vmware for virtualization. But vmware is
not wanted by the admins (like me).

Kind regards
Gerrit Slomma
Re: I/O errors after migration - why?
Nolan schrieb:
> Tomasz Chmielewski wpkg.org> writes:
>> I'm trying to perform live migration by following the instructions on
>> http://www.linux-kvm.org/page/Migration.
>> Unfortunately, it doesn't work very well - the guest is migrated, but
>> loses access to its disk.
>
> The LSI logic scsi device model doesn't implement device state
> save/restore. Any suspend/resume, snapshot or migration will fail.

Oh, that sucks - as not everything supports virtio (which doesn't work
for me as well for some reason) - like Windows (which should be
addressed soon with block virtio drivers), but also older installations
running older kernels.

Does IDE support migration?

> I sent a patch that partially addresses this (but is buggy in the
> presence of in-flight IO):
> http://lists.gnu.org/archive/html/qemu-devel/2009-01/msg00744.html

--
Tomasz Chmielewski
http://wpkg.org