date:20080506

Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Amit Shah

On Tuesday 06 May 2008 03:06:23 Kay, Allen M wrote:
> Kvm kernel changes.
>
> Signed-off-by: Allen M Kay <[EMAIL PROTECTED]>

> --- /dev/null
> +++ b/arch/x86/kvm/vtd.c
> @@ -0,0 +1,183 @@


> +
> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
> +
> +struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev
> *dev);
> +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu);
> +void iommu_free_domain(struct dmar_domain *domain);
> +int domain_init(struct dmar_domain *domain, int guest_width);
> +int domain_context_mapping(struct dmar_domain *d,
> + struct pci_dev *pdev);
> +int domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
> + u64 hpa, size_t size, int prot);
> +void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8
> devfn);
> +struct dmar_domain * find_domain(struct pci_dev *pdev);

Please move these declarations to a .h file and give them an appropriate
prefix. domain_context_mapping, for example, is confusing; since it's an
Intel IOMMU-only thing, use something like

intel_iommu_domain_context_mapping 

> +int kvm_iommu_map_pages(struct kvm *kvm,
> + gfn_t base_gfn, unsigned long npages)
> +{
> + unsigned long gpa;
> + struct page *page;
> + hpa_t hpa;
> + int j, write;
> + struct vm_area_struct *vma;
> +
> + if (!kvm->arch.domain)
> + return 1;
> +
> + gpa = base_gfn << PAGE_SHIFT;
> + page = gfn_to_page(kvm, base_gfn);
> + hpa = page_to_phys(page);
> +
> + printk(KERN_DEBUG "kvm_iommu_map_page: gpa = %lx\n", gpa);
> + printk(KERN_DEBUG "kvm_iommu_map_page: hpa = %llx\n", hpa);
> + printk(KERN_DEBUG "kvm_iommu_map_page: size = %lx\n",
> + npages*PAGE_SIZE);
> +
> + for (j = 0; j < npages; j++) {
> + gpa +=  PAGE_SIZE;
> + page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
> + hpa = page_to_phys(page);
> + domain_page_mapping(kvm->arch.domain, gpa, hpa,
> PAGE_SIZE,
> + DMA_PTE_READ | DMA_PTE_WRITE);
> + vma = find_vma(current->mm, gpa);
> + if (!vma)
> + return 1;

*

> + write = (vma->vm_flags & VM_WRITE) != 0;
> + get_user_pages(current, current->mm, gpa,
> + PAGE_SIZE, write, 0, NULL, NULL);

You should put_page each of the user pages when freeing or exiting (in 
unmap_guest), else a ref is held on each page and that's a lot of memory 
leaked.

Also, this rules out any form of guest swapping. You should put_page in case a 
balloon driver in the guest tries to free some pages for the host.
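
To illustrate the reference-counting point, here is a minimal userspace
sketch, not kernel code. struct fake_page, pin_pages, unpin_pages and
pinned_count are hypothetical stand-ins (for struct page, get_user_pages and
a put_page loop); the only thing modeled is that every reference taken while
mapping has to be dropped again when unmapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the pin count matters here. */
struct fake_page {
    int pin_count;
};

/* Models get_user_pages(): take one extra reference on each page. */
void pin_pages(struct fake_page *pages, size_t npages)
{
    for (size_t i = 0; i < npages; i++)
        pages[i].pin_count++;
}

/* Models a put_page() loop: drop the reference taken by pin_pages(). */
void unpin_pages(struct fake_page *pages, size_t npages)
{
    for (size_t i = 0; i < npages; i++) {
        assert(pages[i].pin_count > 0);
        pages[i].pin_count--;
    }
}

/* Counts pages that still hold at least one reference. */
size_t pinned_count(const struct fake_page *pages, size_t npages)
{
    size_t n = 0;

    for (size_t i = 0; i < npages; i++)
        if (pages[i].pin_count > 0)
            n++;
    return n;
}
```

If the unmap path never calls the equivalent of unpin_pages, pinned_count
stays at npages for the lifetime of the host, which is the leak described
above.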

> + }
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_iommu_map_pages);
> +
> +static int kvm_iommu_map_memslots(struct kvm *kvm)
> +{
> + int i, status;
> + for (i = 0; i < kvm->nmemslots; i++) {
> + status = kvm_iommu_map_pages(kvm,
> kvm->memslots[i].base_gfn,
> + kvm->memslots[i].npages);
> + if (status)
> + return status;

*

> + }
> + return 0;
> +}
> +
> +int kvm_iommu_map_guest(struct kvm *kvm,
> + struct kvm_pci_passthrough_dev *pci_pt_dev)
> +{
> + struct dmar_drhd_unit *drhd;
> + struct dmar_domain *domain;
> + struct intel_iommu *iommu;
> + struct pci_dev *pdev = NULL;
> +
> + printk(KERN_DEBUG "kvm_iommu_map_guest: host bdf = %x:%x:%x\n",
> + pci_pt_dev->host.busnr,
> + PCI_SLOT(pci_pt_dev->host.devfn),
> + PCI_FUNC(pci_pt_dev->host.devfn));
> +
> + for_each_pci_dev(pdev) {
> + if ((pdev->bus->number == pci_pt_dev->host.busnr) &&
> + (pdev->devfn == pci_pt_dev->host.devfn))
> + goto found;
> + }

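You can use a lookup helper instead of going through the list yourself:
pci_get_bus_and_slot takes the bus number and devfn directly (pci_get_device
matches on vendor and device ID instead).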

> + goto not_found;
> +found:
> + pci_pt_dev->pdev = pdev;
> +
> + drhd = dmar_find_matched_drhd_unit(pdev);
> + if (!drhd) {
> + printk(KERN_ERR "kvm_iommu_map_guest: drhd == NULL\n");
> + goto not_found;
> + }
> +
> + printk(KERN_DEBUG "kvm_iommu_map_guest: reg_base_addr = %llx\n",
> + drhd->reg_base_addr);
> +
> + iommu = drhd->iommu;
> + if (!iommu) {
> + printk(KERN_ERR "kvm_iommu_map_guest: iommu == NULL\n");
> + goto not_found;
> + }
> + domain = iommu_alloc_domain(iommu);
> + if (!domain) {
> + printk(KERN_ERR "kvm_iommu_map_guest: domain ==
> NULL\n");
> + goto not_found;
> + }
> + if (domain_init(domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> + printk(KERN_ERR "kvm_iommu_map_guest: domain_init()
> failed\n");
> + goto not_found;

Memory allocated in iommu_alloc_domain is leaked in this case
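
A minimal userspace sketch of the usual goto-based unwinding that avoids this
kind of leak. Everything here is hypothetical: fake_alloc_domain,
fake_free_domain and fake_domain_init only stand in for iommu_alloc_domain,
iommu_free_domain and domain_init, and the failure condition in
fake_domain_init is invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dmar_domain. */
struct fake_domain {
    int width;
};

/* Tracks live allocations so that a leak is observable. */
int live_domains;

struct fake_domain *fake_alloc_domain(void)
{
    struct fake_domain *d = calloc(1, sizeof(*d));

    if (d)
        live_domains++;
    return d;
}

void fake_free_domain(struct fake_domain *d)
{
    if (d) {
        free(d);
        live_domains--;
    }
}

/* Models domain_init(): rejects widths this sketch treats as invalid. */
int fake_domain_init(struct fake_domain *d, int width)
{
    if (width <= 0 || width > 64)
        return -1;
    d->width = width;
    return 0;
}

/* Returns an initialized domain, or NULL with nothing left allocated. */
struct fake_domain *setup_domain(int width)
{
    struct fake_domain *d = fake_alloc_domain();

    if (!d)
        return NULL;
    if (fake_domain_init(d, width))
        goto out_free;
    return d;

out_free:
    fake_free_domain(d);
    return NULL;
}
```

The point is only the shape: each failure after the allocation jumps to a
label that frees what was allocated before returning.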

> + }
> + kvm->arch.domain = domain;
> + kvm_iommu_map_memslots(kvm);

*: The return value of kvm_iommu_map_memslots is not checked here, so a
mapping failure (the early returns marked * above) goes unnoticed.

> + domain_context_mapping(kvm->arch.d

Re: [kvm-devel] [RFC] [VTD][patch 0/3] vt-d support for pci passthrough

2008-05-06 Thread Amit Shah

On Tuesday 06 May 2008 03:05:30 Kay, Allen M wrote:
> Following three patches contain vt-d support for pci passthrough.  They
> contain diffs based on Amit's 4/22 passthrough tree.
>
> The hardware environment used for this work is an Intel Weybridge system
> (Q35).  The passthrough device is an E1000 NIC. I'm still using irqhook
> mechanism for interrupt injection as I had a problem with irqchip
> mechanism.  Following is the command line I used  to start the guest.

Can you tell me what the problem with the in-kernel irqchip is? Last time you 
mentioned there was a warning that came up when the guest exited. That 
shouldn't have stopped it from working, though.

> /usr/local/bin/qemu-system-x86_64 -boot c -hda /etc/xen/fc5_32.img -m
> 256 -net none -pcidevice e1000/01:00.0-16 -no-kvm-irqchip
>
> Remaining tasks include:
>
> 1) Generated vtd.o with kvm-intel.ko instead of kvm.ko.
> 2) Make iommu hooks in generic code to be non-Intel specific

This is a good idea but will need collaboration with a lot of vendors.

> Let me know of your feedbacks.  Thanks.
>
> Allen

Amit

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

[kvm-devel] [ kvm-Bugs-1958519 ] fails to build KVM modules against 2.6.26 kernel

2008-05-06 Thread SourceForge.net

Bugs item #1958519, was opened at 2008-05-06 16:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958519&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: yunfeng (yunfeng)
Assigned to: Nobody/Anonymous (nobody)
Summary: fails to build KVM modules against 2.6.26 kernel

Initial Comment:
Building KVM modules against 2.6.24 kernel is ok.
But building against 2.6.26 kernel will fail.

make -j20 -C /lib/modules/2.6.26-rc1-02049-g6307419/build M=`pwd` \
LINUXINCLUDE="-I`pwd`/include -Iinclude -I`pwd`/include-compat \
-include include/linux/autoconf.h" \
"$@"
make[1]: Entering directory `/root/kvm'
  Building modules, stage 2.
  MODPOST 3 modules
WARNING: "kvm_div64_u64" 
[/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm.ko] 
undefined!
  CC  
/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm-amd.mod.o
  CC  
/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm-intel.mod.o
  CC  
/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm.mod.o
In file included from :1:
./include/linux/autoconf.h:516:1: error: /external-module-compat.h: No such 
file or directory
In file included from :1:
./include/linux/autoconf.h:516:1: error: /external-module-compat.h: No such 
file or directory
In file included from :1:
./include/linux/autoconf.h:516:1: error: /external-module-compat.h: No such 
file or directory
make[2]: *** 
[/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm-intel.mod.o]
 Error 1
make[2]: *** Waiting for unfinished jobs
make[2]: *** 
[/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm-amd.mod.o]
 Error 1
make[2]: *** 
[/root/kvm-master-2.6.22-rc4-2008050601096/kvm-userspace/kernel/kvm.mod.o] 
Error 1
make[1]: *** [modules] Error 2
make[1]: Leaving directory `/root/kvm'
make: *** [all] Error 2

 

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958519&group_id=180599


[kvm-devel] KVM Test result, kernel 6307419.., userspace 77c9148.. -- 3 new issues

2008-05-06 Thread Yunfeng Zhao

Hi All,
 
These are today's KVM test results against kvm.git 
630741928b4a7eeff27e134d7ba7bc2fc2c764c5 and kvm-userspace.git 
77c9148ba4a89a8dc4ab2ecf525c2de8604ea8c3.
One new issue (issue #1958464) blocked the nightly test on the ia32-pae 
platform.

Three New Issues:

1. "Unknown symbol in module" while loading kvm.ko on PAE host
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958464&group_id=180599
2. Fail to save restore and live migrate on 32e platform
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958467&group_id=180599
3. fails to build KVM modules against 2.6.26 kernel
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958519&group_id=180599

One Old Issue:

4. Cannot boot guests with hugetlbfs
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599
 


Test environment
 
Platform: Woodcrest
CPU: 4
Memory size: 8G
 
Details

IA32e:
1.  boot four 32-bit guest in parallel                      PASS
2.  boot four 64-bit guest in parallel                      PASS
3.  boot 4G 64-bit guest                                    PASS
4.  boot 4G pae guest                                       PASS
5.  boot 32-bit linux and 32 bit windows guest in parallel  PASS
6.  boot 32-bit guest with 1500M memory                     PASS
7.  boot 64-bit guest with 1500M memory                     PASS
8.  boot 32-bit guest with 256M memory                      PASS
9.  boot 64-bit guest with 256M memory                      PASS
10. boot two 32-bit windows xp in parallel                  PASS
11. boot four 32-bit different guest in para                PASS
12. save/restore 64-bit linux guests                        FAIL
13. save/restore 32-bit linux guests                        FAIL
14. boot 32-bit SMP windows 2003 with ACPI enabled          PASS
15. boot 32-bit SMP windows 2008 with ACPI enabled          PASS
16. boot 32-bit SMP Windows 2000 with ACPI enabled          PASS
17. boot 32-bit SMP Windows xp with ACPI enabled            PASS
18. boot 32-bit Windows 2000 without ACPI                   PASS
19. boot 64-bit Windows xp with ACPI enabled                PASS
20. boot 32-bit Windows xp without ACPI                     PASS
21. boot 64-bit UP vista                                    PASS
22. boot 64-bit SMP vista                                   PASS
23. kernel build in 32-bit linux guest OS                   PASS
24. kernel build in 64-bit linux guest OS                   PASS
25. LTP on 32-bit linux guest OS                            PASS
26. LTP on 64-bit linux guest OS                            PASS
27. boot 64-bit guests with ACPI enabled                    PASS
28. boot 32-bit x-server                                    PASS
29. boot 64-bit SMP windows XP with ACPI enabled            PASS
30. boot 64-bit SMP windows 2003 with ACPI enabled          PASS
31. boot 64-bit SMP windows 2008 with ACPI enabled          PASS
32. live migration 64bit linux guests                       FAIL
33. live migration 32bit linux guests                       FAIL
34. reboot 32bit windows xp guest                           PASS
35. reboot 32bit windows xp guest                           PASS
 

Report Summary on IA32e
Summary Test Report of Last Session
=====================================================================
                            Total   Pass    Fail    NoResult   Crash
=====================================================================
control_panel               15      11      4       0          0
Restart                     3       3       0       0          0
gtest                       23      21      2       0          0
=====================================================================
control_panel               15      11      4       0          0
 :KVM_LM_64_g64             1       0       1       0          0
 :KVM_four_sguest_64_gPAE   1       1       0       0          0
 :KVM_4G_guest_64_g64       1       1       0       0          0
 :KVM_four_sguest_64_g64    1       1       0       0          0
 :KVM_linux_win_64_gPAE     1       1       0       0          0
 :KVM_1500M_guest_64_gPAE   1       1       0       0          0
 :KVM_SR_64_g64             1       0       1       0          0
 :KVM_LM_64_gPAE            1       0       1       0          0
 :KVM_256M_guest_64_g64     1       1       0       0          0
 :KVM_1500M_guest_64_g64    1       1       0

[kvm-devel] [PATCH] janitorial: remove leftovers from merge conflict

2008-05-06 Thread Carlo Marcelo Arenas Belon

apparently harmless and unique

Signed-off-by: Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]>
---
 qemu/Makefile.target |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/qemu/Makefile.target b/qemu/Makefile.target
index cc66651..bb4b9a3 100644
--- a/qemu/Makefile.target
+++ b/qemu/Makefile.target
@@ -190,7 +190,6 @@ all: $(PROGS)
 
 #
 # cpu emulator library
-<<< HEAD:qemu/Makefile.target
 LIBOBJS=exec.o kqemu.o cpu-exec.o host-utils.o
 
 ifeq ($(NO_CPU_EMULATION), 1)
-- 
1.5.3.7



Re: [kvm-devel] [ kvm-Bugs-1958519 ] fails to build KVM modules against 2.6.26 kernel

2008-05-06 Thread Ryota OZAKI

Hi all,

>  Initial Comment:
>  Building KVM modules against 2.6.24 kernel is ok.
>  But building against 2.6.26 kernel will fail.

I hit the same problem; the following patch from Andrea fixed it for me.

Hope this helps,
ozaki-r

-- Forwarded message --
From: Andrea Arcangeli <[EMAIL PROTECTED]>
Date: 2008/4/26
Subject: [kvm-devel] fix external module compile
To: kvm-devel@lists.sourceforge.net
Cc: Avi Kivity <[EMAIL PROTECTED]>


Hello,

 after updating kvm-userland.git, kvm.git and linux-2.6-hg, and after a
 make distclean and rebuild with a slightly reduced .config, I can't
 compile the external module anymore. Looking into it with V=1, $(src)
 expands to "", so including /external-module-compat.h clearly fails.

 I fixed it as shown below. It seems more consistent to enforce the
 ordering of the "special" includes in one place: $(src)/include is
 already included as $LINUX at point 1 of the comment, so this also
 cleans up a superfluous line in Kbuild. It fixes my compile by moving
 external-module-compat.h next to the other includes, where `pwd` works,
 instead of relying on $(src), which doesn't work anymore for whatever
 reason.

 Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>

 diff --git a/kernel/Kbuild b/kernel/Kbuild
 index cabfc75..d9245eb 100644
 --- a/kernel/Kbuild
 +++ b/kernel/Kbuild
 @@ -1,4 +1,3 @@
 -EXTRA_CFLAGS := -I$(src)/include -include $(src)/external-module-compat.h
  obj-m := kvm.o kvm-intel.o kvm-amd.o
  kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o anon_inodes.o irq.o i8259.o \
 lapic.o ioapic.o preempt.o i8254.o external-module-compat.o
 diff --git a/kernel/Makefile b/kernel/Makefile
 index 78ff923..e3fccbe 100644
 --- a/kernel/Makefile
 +++ b/kernel/Makefile
 @@ -27,7 +27,8 @@ all::
  #  include header priority 1) $LINUX 2) $KERNELDIR 3) include-compat
$(MAKE) -C $(KERNELDIR) M=`pwd` \
LINUXINCLUDE="-I`pwd`/include -Iinclude -I`pwd`/include-compat \
 -   -include include/linux/autoconf.h" \
 +   -include include/linux/autoconf.h \
 +   -include `pwd`/external-module-compat.h"
"$$@"

  sync: header-sync source-sync





Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)

2008-05-06 Thread Avi Kivity

Anthony Liguori wrote:
> Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>  
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is 
>>> used).
>>> - Verify that there's no special quirks for 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>>>   
>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts 
>> don't have them.  An unusual configuration will likely lead to 
>> problems as most guest OSes and workloads will not have been tested 
>> thoroughly with them.
>>
>> - it requires a large number of interrupts, which are difficult to 
>> provide, and which it is hard to ensure all OSes support.  MSI is 
>> relatively new.
>> - is only a few interrupts are available, then each interrupt 
>> requires scanning a large number of queues
>>
>> If we are to do this, then we need better tests than "80 disks show up".
>>
>> The alternative approach of having the virtio block device control up 
>> to 16 disks allows having those 80 disks with just 5 slots (and 5 
>> interrupts).  This is similar to the way traditional SCSI controllers 
>> behave, and so should not surprise the guest OS.
>>   
>
> If you have a single virtio-blk device that shows up as 8 functions, 
> we could achieve the same thing.  We can cheat with the interrupt 
> handlers to avoid cache line bouncing too.  

You can't cheat on all guests, and even on Linux, it's better to keep on 
doing what real hardware does than to go off on a tangent that no one else 
uses.

You'll have to cheat on ->kick(), too.  Virtio needs one exit per 
O(queue depth).  With one spindle per ring, it doesn't make sense to 
have a queue depth > 4 (or latency goes to hell), so you have many exits.
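
A rough back-of-the-envelope model of the exit count, under the simplifying
assumption (mine, not measured) that every kick submits a completely full
ring. exits_needed is a hypothetical helper written for this illustration.

```c
#include <assert.h>

/*
 * Simplified model: each kick (one exit) submits up to queue_depth
 * requests, so the number of exits is the request count divided by the
 * queue depth, rounded up.
 */
unsigned long exits_needed(unsigned long requests, unsigned long queue_depth)
{
    assert(queue_depth > 0);
    return (requests + queue_depth - 1) / queue_depth;
}
```

Under that assumption, 1024 requests cost 256 exits at a queue depth of 4,
but only 22 exits at a depth of 48, which is what a shared ring for 16
spindles at roughly 3 requests per spindle would allow. Real rings are
rarely full, so treat these as best-case figures.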

> Plus, we can use PCI hotplug so we don't have to reinvent a new 
> hotplug mechanism.

You can plug disks into a Fibre Channel mesh, so presumably that works 
on real hardware somehow.

>
> I'm inclined to think that ring sharing isn't as useful as it seems as 
> long as we don't have indirect scatter gather lists.

I agree, but I think that indirect sg is very important for storage:

- a long sg list is cheap from the disk's point of view (the seeks are 
what's expensive)
- it is important to keep the queue depth meaningful and small 
(O(spindles * 3)), as it drastically affects latency

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)

2008-05-06 Thread Avi Kivity

Alexander Graf wrote:
>> Marcelo Tosatti wrote:
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is 
>>> used).
>>> - Verify that there's no special quirks for 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>>>
>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts
>> don't have them. An unusual configuration will likely lead to problems
>> as most guest OSes and workloads will not have been tested thoroughly
>> with them.
>
> This is more of a "let's do this conditionally" than a "let's not do 
> it" reason imho.

Yes. More precisely, let's not do it until we're sure it works and performs.

I don't think a queue-per-disk approach will perform well, since the 
queue will always be very short and will not be able to amortize exit 
costs and ring management overhead very well.

>> - it requires a large number of interrupts, which are difficult to
>> provide, and which it is hard to ensure all OSes support. MSI is
>> relatively new.
>
> We could just as well extend the device layout to have every device be 
> attached to one virtual IOAPIC pin, so we'd have like 128 / 4 = 32 
> IOAPICs in the system and one interrupt for each device.

That's problematic for these reasons:

- how many OSes work well with 32 IOAPICs?
- at some point, you run out of interrupt vectors (~220 per cpu if the 
OS can allocate per-cpu vectors; otherwise just ~220 in total)
- you will have many interrupts fired, each for a single device with a 
few requests, reducing performance

>> - is only a few interrupts are available, then each interrupt requires
>> scanning a large number of queues
>
> This case should be rare, basically only existent with OSs that don't 
> support APIC properly.
>

Hopefully.

>> The alternative approach of having the virtio block device control up to
>> 16 disks allows having those 80 disks with just 5 slots (and 5
>> interrupts). This is similar to the way traditional SCSI controllers
>> behave, and so should not surprise the guest OS.
>
> The one thing I'm actually really missing here is use cases. What are 
> we doing this for? And further along the line, are there other 
> approaches to the problems for which this was supposed to be a 
> solution? Maybe someone can raise a case where it's not virtblk / 
> virtnet.

The requirement for lots of storage is a given. There are two ways of 
meeting it: paying a lot of money to EMC or NetApp for a storage 
controller, or connecting lots of disks directly and running the storage 
controller in the OS (which is what EMC and NetApp do anyway, inside their 
boxes). ZFS is a good example of a use case, and I'd guess databases 
could use this too if they were able to supply the redundancy.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Avi Kivity

Kay, Allen M wrote:
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +//#define DEBUG
> +
> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
>   

The name "domain" is too generic; please use dma_domain or io_domain or 
something similar.

> +static int kvm_iommu_map_memslots(struct kvm *kvm)
> +{
> + int i, status;
> + for (i = 0; i < kvm->nmemslots; i++) {
> + status = kvm_iommu_map_pages(kvm,
> kvm->memslots[i].base_gfn,
> + kvm->memslots[i].npages);
> + if (status)
> + return status;
>   

Need to undo in case of partial completion.
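
A minimal userspace sketch of what undoing a partial completion could look
like. struct fake_slot, map_slot, unmap_slot and map_all_slots are
hypothetical names made up for this example; the should_fail flag exists
only so the failure path can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory slot; 'mapped' records mapping state. */
struct fake_slot {
    int should_fail;   /* test hook: force mapping of this slot to fail */
    int mapped;
};

int map_slot(struct fake_slot *s)
{
    if (s->should_fail)
        return -1;
    s->mapped = 1;
    return 0;
}

void unmap_slot(struct fake_slot *s)
{
    s->mapped = 0;
}

/* Maps every slot; on failure, unmaps the slots that were already mapped. */
int map_all_slots(struct fake_slot *slots, size_t n)
{
    size_t i;
    int status = 0;

    for (i = 0; i < n; i++) {
        status = map_slot(&slots[i]);
        if (status)
            goto undo;
    }
    return 0;

undo:
    /* i is the index of the slot that failed; walk back over 0..i-1. */
    while (i-- > 0)
        unmap_slot(&slots[i]);
    return status;
}
```

The caller then sees either all slots mapped or none, and never has to
guess how far the loop got.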

> diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
> index 5f93b78..6202ed1 100644
> --- a/include/asm-x86/kvm_para.h
> +++ b/include/asm-x86/kvm_para.h
> @@ -170,5 +170,6 @@ struct kvm_pci_pt_info {
>  struct kvm_pci_passthrough_dev {
>   struct kvm_pci_pt_info guest;
>   struct kvm_pci_pt_info host;
> + struct pci_dev *pdev;/* kernel device pointer for host dev
> */
>   

This should be stored somewhere private (not sure, but I think 
kvm_pci_passthrough_dev is a public interface).

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [RFC] [VTD][patch 2/3] vt-d support for pci passthrough: kvm-vtd-user.patch

2008-05-06 Thread Avi Kivity

Kay, Allen M wrote:
> Still todo: move vt.d to kvm-intel.ko module.
>   

Not sure it's the right thing to do. If we get the iommus abstracted 
properly, we can rename vtd.c to dma.c and move it to virt/kvm/.

The code is certainly a lot more about managing memory than anything vmx 
specific. It's hardly x86 specific, even.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [ kvm-Bugs-1958519 ] fails to build KVM modules against 2.6.26 kernel

2008-05-06 Thread Avi Kivity

Ryota OZAKI wrote:
> Hi all,
>
>   
>>  Initial Comment:
>>  Building KVM modules against 2.6.24 kernel is ok.
>>  But building against 2.6.26 kernel will fail.
>> 
>
> I got the same problem, but the following Andrea's patch helped me.
>
> Hope this helps,
>   

Yes, while I think it's a Kbuild problem, too many people are hitting 
it, so I applied the patch.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [PATCH] janitorial: remove leftovers from merge conflict

2008-05-06 Thread Avi Kivity

Carlo Marcelo Arenas Belon wrote:
> apparently harmless and unique
>
>   

Sloppy me.  Applied, thanks.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] KVM Test result, kernel 6307419.., userspace 77c9148.. -- 3 new issues

2008-05-06 Thread Avi Kivity

Yunfeng Zhao wrote:
> Three New Issues:
> 
> 1. "Unknown symbol in module" while loading kvm.ko on PAE host
> https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958464&group_id=180599
> 2. Fail to save restore and live migrate on 32e platform
> https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958467&group_id=180599
> 3. fails to build KVM modules against 2.6.26 kernel
> https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958519&group_id=180599
>   

Fixed and pushed all three.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [RFC] [VTD][patch 0/3] vt-d support for pci passthrough

2008-05-06 Thread Avi Kivity

Kay, Allen M wrote:
> Following three patches contain vt-d support for pci passthrough.  They
> contain diffs based on Amit's 4/22 passthrough tree.
>
> The hardware environment used for this work is an Intel Weybridge system
> (Q35).  The passthrough device is an E1000 NIC. I'm still using irqhook
> mechanism for interrupt injection as I had a problem with irqchip
> mechanism.  Following is the command line I used  to start the guest.
>
> /usr/local/bin/qemu-system-x86_64 -boot c -hda /etc/xen/fc5_32.img -m
> 256 -net none -pcidevice e1000/01:00.0-16 -no-kvm-irqchip
>
> Remaining tasks include:
>
> 1) Generated vtd.o with kvm-intel.ko instead of kvm.ko.
> 2) Make iommu hooks in generic code to be non-Intel specific
>   

Eventually we will want to make it even non-x86 specific; ia64 will 
probably be able to share, and maybe ppc someday.

That needn't be done at once, though.

Your mail client mangles the patches, please attach or use git send-email.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [PATCH]: Fake MSR_K7 performance counters

2008-05-06 Thread Avi Kivity

Chris Lalancette wrote:
> Attached is a patch that fixes a guest crash when booting older Linux kernels.
> The problem stems from the fact that we are currently emulating
> MSR_K7_EVNTSEL[0-3], but not emulating MSR_K7_PERFCTR[0-3].  Because of this,
> setup_k7_watchdog() in the Linux kernel receives a GPF when it attempts to 
> write
> into MSR_K7_PERFCTR, which causes an OOPs.
>
> The patch fixes it by just "fake" emulating the appropriate MSRs, throwing 
> away
> the data in the process.  This causes the NMI watchdog to not actually work, 
> but
> it's not such a big deal in a virtualized environment.
>
> When we get a write to one of these counters, we printk_ratelimit() a warning.
> I decided to print it out for all writes, even if the data is 0; it doesn't 
> seem
> to make sense to me to special case when data == 0.
>
> Tested by myself on a RHEL-4 guest, and Joerg Roedel on a Windows XP 64-bit 
> guest.
>   

Applied, thanks.
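
For readers who have not seen the patch, the approach described in the quoted
text can be sketched in userspace roughly as follows. This is not the applied
patch: fake_set_msr and ignored_writes are hypothetical, the counter merely
stands in for the rate-limited warning, and the MSR indices are the values I
believe the Linux MSR header uses for the K7 performance counters, so please
check them before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* K7 performance counter MSR range (assumed values, see note above). */
#define MSR_K7_PERFCTR0 0xc0010004u
#define MSR_K7_PERFCTR3 0xc0010007u

/* Counts dropped writes; stands in for the printk_ratelimit() warning. */
unsigned int ignored_writes;

/* Returns 0 if the write was accepted, 1 if the guest should get a #GP. */
int fake_set_msr(uint32_t msr, uint64_t data)
{
    (void)data; /* the value is deliberately thrown away */

    if (msr >= MSR_K7_PERFCTR0 && msr <= MSR_K7_PERFCTR3) {
        ignored_writes++;
        return 0;
    }
    return 1;
}
```

Accepting and discarding the write is what keeps the guest from taking a
fault; the price, as noted in the description, is that the NMI watchdog in
the guest does not actually work.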

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



[kvm-devel] [ kvm-Bugs-1958715 ] kvm-userspace failed to start linux kernel (kernel panic)

2008-05-06 Thread SourceForge.net

Bugs item #1958715, was opened at 2008-05-06 15:13
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958715&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: libkvm
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: gth (gthouvenin)
Assigned to: Nobody/Anonymous (nobody)
Summary: kvm-userspace failed to start linux kernel (kernel panic)

Initial Comment:
CPU: Intel Xeon (eight cpu)
KVM: kvm-68-2049-g6307419
Host kernel arch: x86_64
Guest: Ubuntu-8.04-desktop-i386 livecd
QEMU Command: qemu-system-x86_64 -cdrom 
/images_iso/ubuntu-8.04-desktop-i386.iso -boot d -m 256
KVM-USERSPACE: kvm-66-147-gc33833a

The problem doesn't go away if I use the -no-kvm-irqchip or -no-kvm-pit
switch.

With commit bae043c (kvm-userspace) I can start the liveCD, but the next
commit, c33833a, produces a kernel panic. I see the screen with the different
installation choices, but when I choose to install Linux I get a kernel panic
(see attached file).

It also happens with an old Fedora that is installed on a qcow2 file.

Regards,
Guillaume



Re: [kvm-devel] [ kvm-Bugs-1958715 ] kvm-userspace failed to start linux kernel (kernel panic)

2008-05-06 Thread Guillaume Thouvenin

On Tue, 06 May 2008 06:13:18 -0700
"SourceForge.net" <[EMAIL PROTECTED]> wrote:

> When I use the commit bae043c (kvm-userspace) I can start the liveCD 
> but the next commit c33833a produces a kernel panic. I see the screen 
> with different choice of installation but when I choose to install 
> linux I get a kernel panic (see file attach).

Here is the kernel panic report:

---

[EMAIL PROTECTED]/local/kvm-userspace.git/bin]$ ./qemu-system-x86_64 -cdrom 
/images_iso/ubuntu-8.04-desktop-i386.iso -boot d -m 256 -serial stdio
kvm_set_lapic: Bad file descriptor
[0.00] Linux version 2.6.24-16-generic ([EMAIL PROTECTED]) (gcc version 
4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu Apr 10 13:23:42 UTC 2008 (Ubuntu 
2.6.24-16.30-generic)
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
[0.00]  BIOS-e820: 000e8000 - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 0fff (usable)
[0.00]  BIOS-e820: 0fff - 1000 (ACPI data)
[0.00]  BIOS-e820: fffbd000 - 0001 (reserved)
[0.00] 0MB HIGHMEM available.
[0.00] 255MB LOWMEM available.
[0.00] Zone PFN ranges:
[0.00]   DMA 0 -> 4096
[0.00]   Normal   4096 ->65520
[0.00]   HighMem 65520 ->65520
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[1] active PFN ranges
[0.00] 0:0 ->65520
[0.00] DMI 2.4 present.
[0.00] ACPI: RSDP signature @ 0xC00FB450 checksum 0
[0.00] ACPI: RSDP 000FB450, 0014 (r0 QEMU  )
[0.00] ACPI: RSDT 0FFF, 002C (r1 QEMU   QEMURSDT1 QEMU  
  1)
[0.00] ACPI: FACP 0FFF002C, 0074 (r1 QEMU   QEMUFACP1 QEMU  
  1)
[0.00] ACPI: DSDT 0FFF0100, 2464 (r1   BXPC   BXDSDT1 INTL 
20061109)
[0.00] ACPI: FACS 0FFF00C0, 0040
[0.00] ACPI: APIC 0FFF2568, 00E0 (r1 QEMU   QEMUAPIC1 QEMU  
  1)
[0.00] ACPI: PM-Timer IO Port: 0xb008
[0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[0.00] Processor #0 6:2 APIC version 20
[0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x0a] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x0b] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0c] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x0d] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x0e] disabled)
[0.00] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x0f] disabled)
[0.00] ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
[0.00] IOAPIC[0]: apic_id 1, version 17, address 0xfec0, GSI 0-23
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[0.00] Enabling APIC mode:  Flat.  Using 1 I/O APICs
[0.00] Using ACPI (MADT) for SMP configuration information
[0.00] Allocating PCI resources starting at 2000 (gap: 
1000:effbd000)
[0.00] swsusp: Registered nosave memory region: 0009f000 - 
000a
[0.00] swsusp: Registered nosave memory region: 000a - 
000e8000
[0.00] swsusp: Registered nosave memory region: 000e8000 - 
0010
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 65009
[0.00] Kernel command line: BOOT_IMAGE=/casper/vmlinuz 
file=/cdrom/preseed/ubuntu.seed boot=casper initrd=/casper/initrd.gz 
console=ttyS0
[0.00] Enabling fast FPU save and restore... done.
[0.00] Enabling unmasked SIMD FPU exception support... done.
[0.00] Initializing CPU#0
[0.00] PID hash table entries: 1024 (order: 10, 4096 bytes)
[0.00] Detected 3002.716 MHz processor.
[   18.835013] Console: colour VGA+ 80x25
[   18.835162] console [ttyS0] enabled
[   18.977947] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[   18.980655] Inode-cache hash table entri

[kvm-devel] [ kvm-Bugs-1958725 ] openSUSE 11.0 became broken with newer KVM

2008-05-06 Thread SourceForge.net

Bugs item #1958725, was opened at 2008-05-06 16:21
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1958725&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Technologov (technologov)
Assigned to: Nobody/Anonymous (nobody)
Summary: openSUSE 11.0 became broken with newer KVM

Initial Comment:

Host OS: Fedora7/x64, kernel 2.6.21
Guest OS: openSUSE 11.0 BETA2, 32-bit x86 DVD ISO, kernel 2.6.25
CPU: Intel Core 2
KVM: KVM-67 (bug also valid for KVM-68)

KVM-67 broke openSUSE 11.0; the guest fails on KVM-67 and KVM-68 on Intel.

command:
./qemu-kvm -cdrom openSUSE-11.0-BETA2-32-bit.iso -m 512 -hda myharddisk.qcow2 
-boot d

Symptoms:
A red error message is displayed in the guest monitor during the setup stage 2
(YaST) load.

I have bisected it.
The qemu merge for KVM-67 userspace is responsible for this bug; commit:
c33833a3f98b1bb9d8208b0ed115009bc20e6e87

Works fully on KVM-66. On KVM-67/68 it works only with "-no-kvm" parameter. 

FAILS with default parameters, fails with -no-kvm-acpi, -no-kvm-pit, 
-no-kvm-irqchip, and fails when guest is loaded with normal or FAILSAFE kernel 
boot parameters.
That is: fails in all cases.

-Alexey "Technologov", 6.May.2008.



Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-06 Thread Guillaume Thouvenin

On Mon, 5 May 2008 16:29:21 +0300
"Mohammed Gamal" <[EMAIL PROTECTED]> wrote:

> On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <[EMAIL PROTECTED]> wrote:
> 
> >  WinXP fails to boot with your patch applied too.  FWIW, Ubuntu 8.04 has
> >  a fixed version of gfxboot that doesn't do nasty things with SS on
> >  privileged mode transitions.
> >
> WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
> FreeDOS don't boot but complain about instruction mov 0x11,sreg not
> being emulated.

Can you try with this one please?
On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also
openSUSE-10.3-GM-x86_64-mini.iso

I will try FreeDOS and WinXP if I can find one ;)

Regards,
Guillaume

---

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 26c4f02..6e76c2e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1272,7 +1272,9 @@ static void enter_pmode(struct kvm_vcpu *vcpu)
fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs);
fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs);
 
+#if 0
vmcs_write16(GUEST_SS_SELECTOR, 0);
+#endif
vmcs_write32(GUEST_SS_AR_BYTES, 0x93);
 
vmcs_write16(GUEST_CS_SELECTOR,
@@ -2633,6 +2635,73 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, 
struct kvm_run *kvm_run)
return 1;
 }
 
+static int invalid_guest_state(struct kvm_vcpu *vcpu,
+   struct kvm_run *kvm_run, u32 failure_reason)
+{
+   u16 ss, cs;
+   u8 opcodes[4];
+   unsigned long rip = vcpu->arch.rip;
+   unsigned long rip_linear;
+
+   ss = vmcs_read16(GUEST_SS_SELECTOR);
+   cs = vmcs_read16(GUEST_CS_SELECTOR);
+
+   if ((ss & 0x03) != (cs & 0x03)) {
+   int err;
+   rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS);
+   emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu);
+#if 0
+   printk(KERN_INFO "emulation at (%lx) rip %lx: %02x %02x %02x 
%02x\n",
+   rip_linear,
+   rip, opcodes[0], opcodes[1], opcodes[2], 
opcodes[3]);
+#endif
+   err = emulate_instruction(vcpu, kvm_run, 0, 0, 0);
+   switch (err) {
+   case EMULATE_DONE:
+#if 0
+   printk(KERN_INFO "successfully emulated 
instruction\n");
+#endif
+   return 1;
+   case EMULATE_DO_MMIO:
+   printk(KERN_INFO "mmio?\n");
+   return 0;
+   default:
+   kvm_report_emulation_failure(vcpu, "vmentry 
failure");
+   break;
+   }
+   }
+
+   kvm_run->exit_reason = KVM_EXIT_UNKNOWN;
+   kvm_run->hw.hardware_exit_reason = failure_reason;
+   return 0;
+}
+
+static int handle_vmentry_failure(struct kvm_vcpu *vcpu,
+ struct kvm_run *kvm_run,
+ u32 failure_reason)
+{
+   unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
+#if 0
+   printk(KERN_INFO "Failed vm entry (exit reason 0x%x) ", failure_reason);
+#endif
+   switch (failure_reason) {
+   case EXIT_REASON_INVALID_GUEST_STATE:
+#if 0
+   printk("invalid guest state \n");
+#endif
+   return invalid_guest_state(vcpu, kvm_run, 
failure_reason);
+   case EXIT_REASON_MSR_LOADING:
+   printk("caused by MSR entry %ld loading.\n", 
exit_qualification);
+   break;
+   case EXIT_REASON_MACHINE_CHECK:
+   printk("caused by machine check.\n");
+   break;
+   default:
+   printk("reason not known yet!\n");
+   break;
+   }
+   return 0;
+}
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -2694,6 +2763,12 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
exit_reason != EXIT_REASON_EPT_VIOLATION))
printk(KERN_WARNING "%s: unexpected, valid vectoring info and "
   "exit reason is 0x%x\n", __func__, exit_reason);
+
+   if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
+   exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY;
+   return handle_vmentry_failure(vcpu, kvm_run, exit_reason);
+   }
+
if (exit_reason < kvm_vmx_max_exit_handlers
&& kvm_vmx_exit_handlers[exit_reason])
return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run);
diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 79d94c6..2cebf48 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -238,7 +238,10 @@ enum vmcs_field {
 #define EXIT_REASON_IO_INSTRUCTION  30
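
The decision made by kvm_handle_exit() and invalid_guest_state() in the patch above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the patch's code: the flag value (bit 31) and the basic exit reason 33 follow the Intel SDM numbering, since the patch's own vmx.h definitions are cut off here, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: bit 31 flags a failed VM entry, 33 = invalid guest state. */
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u
#define EXIT_REASON_INVALID_GUEST_STATE 33u

/* The requested privilege level is the low two bits of a selector. */
static unsigned int selector_rpl(uint16_t selector)
{
    return selector & 0x3u;
}

static bool vmentry_failed(uint32_t exit_reason)
{
    return (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) != 0;
}

/* Strip the failure flag to get the basic exit reason. */
static uint32_t failed_entry_basic_reason(uint32_t exit_reason)
{
    assert(vmentry_failed(exit_reason));
    return exit_reason & ~VMX_EXIT_REASONS_FAILED_VMENTRY;
}

/*
 * Hypothetical predicate: emulate only when the entry failed because of
 * invalid guest state and SS.RPL differs from CS.RPL, which is the
 * condition the patch tests before calling emulate_instruction().
 */
static bool should_try_emulation(uint32_t exit_reason, uint16_t ss, uint16_t cs)
{
    if (!vmentry_failed(exit_reason))
        return false;
    if (failed_entry_basic_reason(exit_reason) != EXIT_REASON_INVALID_GUEST_STATE)
        return false;
    return selector_rpl(ss) != selector_rpl(cs);
}
```

Any other failed-entry reason, or a failure with matching RPLs, falls through to the unhandled path and is reported to userspace.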

Re: [kvm-devel] [RFC] fix VMX TSC synchronicity

2008-05-06 Thread Avi Kivity

Avi Kivity wrote:
> [Resurrecting post from the dead]
>
>
> Marcelo Tosatti wrote:
>> Forcing clustered APIC mode works only on SMP, and there was high CPU
>> consumption on Windows SMP guests due to C3 state being reported (fixed
>> in kvm-30 something).
>>
>> So perhaps:
>> - Faking clustered APIC on SMP
>> - Faking C3 on UP
>>
>> And turning off the TSC bit (for 32-bit guests).
>>
>> Is the way to go?
>> Avi, do you understand why C3 was causing the Windows SMP problems ?
>>
>>   
>
> It's probably inb()ing on the port in a loop.  It's not SMP causing 
> the problems, but the ACPI HAL.  I'll check this.
>

Yes, it's reading 0xb010 and 0xb014, which ought to place the cpu in
sleep mode, but doesn't.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-06 Thread Anthony Liguori

Guillaume Thouvenin wrote:
> On Mon, 5 May 2008 16:29:21 +0300
> "Mohammed Gamal" <[EMAIL PROTECTED]> wrote:
>
>   
>> On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <[EMAIL PROTECTED]> wrote:
>>
>> 
>>>  WinXP fails to boot with your patch applied too.  FWIW, Ubuntu 8.04 has
>>>  a fixed version of gfxboot that doesn't do nasty things with SS on
>>>  privileged mode transitions.
>>>
>>>   
>> WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
>> FreeDOS don't boot but complain about instruction mov 0x11,sreg not
>> being emulated.
>> 
>
> Can you try with this one please?
> On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also
> openSUSE-10.3-GM-x86_64-mini.iso
>   

8.04 is not a good test-case.  7.10 is what you want to try.

The good news is, 7.10 appears to work!  The bad news is that about 20% 
of the time, it crashes and displays the following:

kvm_run: failed entry, reason 5
kvm_run returned -8

So something appears to be a bit buggy.  Still, very good work!

Regards,

Anthony Liguori


Re: [kvm-devel] kvm-67: kernel panic while booting debian-40r3-i386-businesscard.iso

2008-05-06 Thread Avi Kivity

Jan Luebbe wrote:
>> 0f 0d 0b    prefetchw (%ebx)
>>
>> This is an AMD 3Dnow! instruction, which is not supported on Intel 
>> processors.  I guess the 3Dnow! cpuid bit leaked in via the qemu merge.
>>
>> I guess two fixes are needed:
>> - remove the 3Dnow! bit
>> - add emulation for prefetchw (easy, as it doesn't need to do anything) 
>> to support live migration from AMD to Intel
>> 
>
> This problem still occurs with kvm-68. Which CPUs will be affected by
> this (is it only the Core Duo)?
>   

All Intels.

> I'm currently delaying the upload of a new kvm package to debian because
> of this.
>   

I've fixed it for kvm-69.
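
For reference, the byte sequence quoted above decodes as follows, and the "emulation that doesn't need to do anything" can be sketched as a recognizer. This is an assumption-laden illustration, not the kvm-69 fix: is_prefetchw() is a hypothetical name, and a real emulator would also have to compute the instruction length and advance the instruction pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 0F 0D /1 is PREFETCHW: two opcode bytes, then a ModRM byte whose reg
 * field (bits 5:3) selects the variant and whose mod field (bits 7:6)
 * must name a memory operand. For the bytes 0f 0d 0b, ModRM 0x0b is
 * mod=00, reg=001, rm=011, i.e. prefetchw (%ebx).
 */
static bool is_prefetchw(const uint8_t *insn, size_t len)
{
    assert(insn != NULL);
    if (len < 3 || insn[0] != 0x0f || insn[1] != 0x0d)
        return false;

    uint8_t modrm = insn[2];
    uint8_t mod = modrm >> 6;
    uint8_t reg = (modrm >> 3) & 0x7;

    /* mod == 3 would be a register operand, which is not a prefetch. */
    return mod != 3 && reg == 1;
}
```

Because a prefetch is only a hint, treating a recognized instruction as a no-op preserves guest-visible behavior.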

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] s390 kvm_virtio.c build error

2008-05-06 Thread Avi Kivity

Martin Schwidefsky wrote:
> I've added Heiko's patch to my patchqueue. But since this is
> drivers/s390/kvm, this should go in via kvm.git. See patch below.
>
>   

Thanks, I added this to my queue as well.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [PATCH] Build fix for kvm/ia64 userspace.

2008-05-06 Thread Avi Kivity

Zhang, Xiantao wrote:
> Hi, Avi
>   This patch should go into RC1, otherwise it will block kvm/ia64
> userspace build. 
>
> diff --git a/include/asm-ia64/kvm.h b/include/asm-ia64/kvm.h
> index eb2d355..62b5fad 100644
> --- a/include/asm-ia64/kvm.h
> +++ b/include/asm-ia64/kvm.h
> @@ -22,7 +22,12 @@
>   */
>  
>  #include 
> +
> +#ifdef __KERNEL__
>  #include 
> +#else
> +#include 
> +#endif
>  

Fishy.  A kernel header including a userspace header?

Maybe you need to include  unconditionally?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-06 Thread Andrea Arcangeli

On Mon, May 05, 2008 at 02:46:25PM -0500, Jack Steiner wrote:
> If a task fails to unmap a GRU segment, they still exist at the start of

Yes, this will also happen in case the well behaved task receives
SIGKILL, so you can test it that way too.

> exit. On the ->release callout, I set a flag in the container of my
> mmu_notifier that exit has started. As VMA are cleaned up, TLB flushes
> are skipped because the flag is set. When the GRU VMA is deleted, I free

GRU TLB flushes aren't skipped because your flag is set but because
__mmu_notifier_release already executed
list_del_init_rcu(&grunotifier->hlist) before proceeding with
unmap_vmas.

> my structure containing the notifier.

As long as nobody can write through the already established gru tlbs
and nobody can establish new tlbs after exit_mmap has run, you don't
strictly need ->release.

> I _think_ this works. Do you see any problems?

You can remove the flag and ->release and ->clear_flush_young (if you
keep clear_flush_young implemented it should return 0). The
synchronize_rcu after mmu_notifier_register can also be dropped thanks
to mm_lock(). gru_drop_mmu_notifier should be careful with current->mm
if you're using an fd and if the fd can be passed to a different task
through unix sockets (you should probably fail any operation if
current->mm != gru->mm).

The way I use ->release in KVM is to set the root hpa to -1UL
(invalid) as a debug trap. That's only for debugging because even if
tlb entries and sptes are still established on the secondary mmu they
are only relevant when the cpu jumps to guest mode and that can never
happen again after exit_mmap is started.

> I should also mention that I have an open-coded function that possibly
> belongs in mmu_notifier.c. A user is allowed to have multiple GRU segments.
> Each GRU has a couple of data structures linked to the VMA. All, however,
> need to share the same notifier. I currently open code a function that
> scans the notifier list to determine if a GRU notifier already exists.
> If it does, I update a refcnt & use it. Otherwise, I register a new
> one. All of this is protected by the mmap_sem.
> 
> Just in case I mangled the above description, I'll attach a copy of the GRU 
> mmuops
> code.

Well that function needs fixing w.r.t. srcu. Are you sure you want to
search for mn->ops == gru_mmuops and not for mn == gmn?  And if you
search for mn why can't you keep track of the mn being registered or
unregistered outside of the mmu_notifier layer? Set a bitflag in the
container after mmu_notifier_register returns and clear it after
_unregister returns. I doubt saving one bitflag is worth searching the
list, and your approach makes it obvious that you have to protect the
bitflag and the register/unregister under write-mmap_sem
yourself. Otherwise the find function will return an object that can
be freed at any time if somebody calls unregister and
kfree. (synchronize_srcu in mmu_notifier_unregister won't wait for
anything but some outstanding srcu_read_lock)
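
The bitflag suggestion above can be sketched in userspace terms as follows. Everything here is hypothetical: struct gru_mm_state, the get/put helpers and the fake register/unregister stubs are stand-ins invented for illustration, and the lock that must serialize get and put (mmap_sem held for write in the discussion) is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Stubs standing in for mmu_notifier_register()/_unregister(). */
static int registered_calls;
static int unregistered_calls;
static void fake_notifier_register(void)   { registered_calls++; }
static void fake_notifier_unregister(void) { unregistered_calls++; }

/* Hypothetical per-mm container that embeds the notifier. */
struct gru_mm_state {
    bool registered;        /* set after register, cleared after unregister */
    unsigned int refcnt;    /* number of GRU segments sharing the notifier */
};

/* Caller must hold the lock that serializes get and put. */
static void gru_notifier_get(struct gru_mm_state *s)
{
    if (!s->registered) {
        fake_notifier_register();
        s->registered = true;   /* flag set only once registration returned */
    }
    s->refcnt++;
}

/* Caller must hold the same lock as for gru_notifier_get(). */
static void gru_notifier_put(struct gru_mm_state *s)
{
    assert(s->registered && s->refcnt > 0);
    if (--s->refcnt == 0) {
        fake_notifier_unregister();
        s->registered = false;  /* flag cleared only after unregister */
    }
}
```

The flag replaces the list search: whether a notifier exists is answered from the container itself, so no object found on the list can be freed underneath the caller.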


Re: [kvm-devel] problems running many guests

2008-05-06 Thread Karl Rister

On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> Does -no-kvm-irqchip or -no-kvm-pit makes a difference? If not, please
> grab kvm_stat --once output when that happens.

Per some suggestions I have moved up to kvm-68, which is better, but I am still
having problems.  Replicating the problem with only one guest spinning has
proven quite difficult, but attempting to boot a large smp guest can reliably
recreate the problem.  Using -no-kvm-pit did not help the large guest,
and -no-kvm-irqchip made it seize up even earlier with only 1 cpu spinning
instead of all of them.

>
> Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with the
> host booted with "profile=kvm". Make sure all guests are running the same
> kernel image.

I got this from a spinning 16-way guest with only 8 of the host CPUs online 
and without either -no-kvm-irqchip or -no-kvm-pit:

[EMAIL PROTECTED] ~]# readprofile -r ; readprofile -m 
karl/System.map-2.6.25-03591-g873c05f
   101 native_read_tsc            3.4828
     1 read_persistent_clock      0.0192
    25 kvm_clock_read             0.2660
    95 getnstimeofday             0.7252
    13 update_wall_time           0.0138
     1 second_overflow            0.0020
readprofile: profile address out of range. Wrong map file?

The kvm_stat output during this is:

[EMAIL PROTECTED] ~]# kvm_stat --once
efer_reload              23354       0
exits                  3587109    2250
fpu_reload             1934298       0
halt_exits                4583       0
halt_wakeup                 42       0
host_state_reload      2165502     167
hypercalls                1482       0
insn_emulation          900199       0
insn_emulation_fail          0       0
invlpg                       0       0
io_exits               1983116       0
irq_exits               427728    2250
irq_window                   0       0
largepages                   0       0
mmio_exits              163522       0
mmu_cache_miss             176       0
mmu_flooded                 99       0
mmu_pde_zapped             191       0
mmu_pte_updated             10       0
mmu_pte_write            59030       0
mmu_recycled                 0       0
mmu_shadow_zapped           99       0
pf_fixed                 14890       0
pf_guest                     0       0
remote_tlb_flush            29       0
request_irq                  0       0
signal_exits                 1       0
tlb_flush               481952       0

The output with -no-kvm-pit looked almost identical, and with -no-kvm-pit there
were no samples registered for either tool.

-- 
Karl Rister
IBM Linux Performance Team
[EMAIL PROTECTED]
(512) 838-1553 (t/l 678)


Re: [kvm-devel] [PATCH 0 of 2] [RESEND] [PowerPC] Fix setting memory for bamboo board model

2008-05-06 Thread Avi Kivity

Jerone Young wrote:
> These patches fell through the cracks.
>   

Unfortunately, the cracks are getting wider.

Anyway, applied, thanks.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] [PATCH 0/4] paravirt clock patches

2008-05-06 Thread Gerd Hoffmann

Marcelo Tosatti wrote:
> F8 host, recent kvm-userspace.git (so with IO thread), recent kvm.git
> (plus your patches), haven't tried 2x but I think 4x is not necessary to
> reproduce the problem.

Ok, I see it too.  There actually seem to be two (maybe related) problems.

First the guest hangs hard after a while, burning 100% CPU time
(deadlocked I guess), doesn't respond to sysrq any more.  Is there some
easy way to get the guest vcpu state then?  EIP for starters, preferably
with stack trace?

The other one is that one ticks slower than the other.  I don't see it
from start, but after a while it starts happening (unless the guest
deadlocks before ...).

cheers,
  Gerd

-- 
http://kraxel.fedorapeople.org/xenner/


[kvm-devel] KVM: PIT: take inject_pending into account when emulating hlt

2008-05-06 Thread Marcelo Tosatti


Otherwise hlt emulation fails if the PIT is not injecting IRQs.

Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>


diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 1646102..07f9ff1 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -216,7 +216,7 @@ int pit_has_pending_timer(struct kvm_vcpu *vcpu)
 {
struct kvm_pit *pit = vcpu->kvm->arch.vpit;
 
-   if (pit && vcpu->vcpu_id == 0)
+   if (pit && vcpu->vcpu_id == 0 && pit->pit_state.inject_pending)
return atomic_read(&pit->pit_state.pit_timer.pending);
 
return 0;



Re: [kvm-devel] problems running many guests

2008-05-06 Thread Marcelo Tosatti

Hi Karl,

On Mon, May 05, 2008 at 08:40:22PM -0500, Karl Rister wrote:
> On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> > Does -no-kvm-irqchip or -no-kvm-pit makes a difference? If not, please
> > grab kvm_stat --once output when that happens.
> 
> Per some suggestions I have moved up to kvm-68 which is better, but still 
> having problems.  Replicating the problem with only one guest spinning has 
> proven quite difficult, but attempting to boot a large smp guest can reliably 
> recreate the problem.  Using -no-kvm-pit did not help the large guest 
> and -no-kvm-irqchip made it seize up even earlier with only 1 cpu spinning 
> instead of all of them.
> 
> >
> > Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with the
> > host booted with "profile=kvm". Make sure all guests are running the same
> > kernel image.
> 
> I got this from a spinning 16-way guest with only 8 of the host CPUs online 
> and without either -no-kvm-irqchip or -no-kvm-pit:
> 
> [EMAIL PROTECTED] ~]# readprofile -r ; readprofile -m 
> karl/System.map-2.6.25-03591-g873c05f
>    101 native_read_tsc            3.4828
>      1 read_persistent_clock      0.0192
>     25 kvm_clock_read             0.2660
>     95 getnstimeofday             0.7252
>     13 update_wall_time           0.0138
>      1 second_overflow            0.0020
> readprofile: profile address out of range. Wrong map file?

KVM clock has known problems with SMP guests, please disable it for now.

Also disable LOCKDEP on the guest if it has more VCPU's than CPU's
available in the host.



[kvm-devel] [PATCH] fixup 3dnow! support

2008-05-06 Thread Glauber Costa

qemu recently added support for 3dnow instructions. Because of
that, 3dnow will be featured among the cpuid bits. But this will
break kvm on cpus that don't have those instructions (which includes
my laptop). So we fix up our cpuid before exposing it to the guest.

Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
---
 arch/x86/kvm/x86.c   |   22 ++
 include/asm-x86/cpufeature.h |2 ++
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 979f983..e79fcd5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -919,7 +919,7 @@ static int is_efer_nx(void)
return efer & EFER_NX;
 }
 
-static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
+static void cpuid_fix_caps(struct kvm_vcpu *vcpu)
 {
int i;
struct kvm_cpuid_entry2 *e, *entry;
@@ -932,6 +932,20 @@ static void cpuid_fix_nx_cap(struct kvm_vcpu *vcpu)
break;
}
}
+
+   /* 3DNOWEXT */
+   if (entry && (entry->edx & (1 << 30)) && !cpu_has_3dnowext) {
+   entry->edx &= ~(1 << 30);
+   printk(KERN_INFO "kvm: guest 3DNOWEXT capability removed\n");
+   }
+
+   /* 3DNOW */
+   if (entry && (entry->edx & (1 << 31)) && !cpu_has_3dnow) {
+   entry->edx &= ~(1 << 31);
+   printk(KERN_INFO "kvm: guest 3DNOW capability removed\n");
+   }
+
+   /* NX */
if (entry && (entry->edx & (1 << 20)) && !is_efer_nx()) {
entry->edx &= ~(1 << 20);
printk(KERN_INFO "kvm: guest NX capability removed\n");
@@ -970,7 +984,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
vcpu->arch.cpuid_entries[i].padding[2] = 0;
}
vcpu->arch.cpuid_nent = cpuid->nent;
-   cpuid_fix_nx_cap(vcpu);
+   cpuid_fix_caps(vcpu);
r = 0;
 
 out_free:
@@ -1061,8 +1075,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, 
u32 function,
bit(X86_FEATURE_LM) |
 #endif
bit(X86_FEATURE_MMXEXT) |
-   bit(X86_FEATURE_3DNOWEXT) |
-   bit(X86_FEATURE_3DNOW);
+   (bit(X86_FEATURE_3DNOWEXT) && cpu_has_3dnowext) |
+   (bit(X86_FEATURE_3DNOW) && cpu_has_3dnow);
const u32 kvm_supported_word3_x86_features =
bit(X86_FEATURE_XMM3) | bit(X86_FEATURE_CX16);
const u32 kvm_supported_word6_x86_features =
diff --git a/include/asm-x86/cpufeature.h b/include/asm-x86/cpufeature.h
index 0d609c8..efbc5ce 100644
--- a/include/asm-x86/cpufeature.h
+++ b/include/asm-x86/cpufeature.h
@@ -187,6 +187,8 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_gbpagesboot_cpu_has(X86_FEATURE_GBPAGES)
 #define cpu_has_arch_perfmon   boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_patboot_cpu_has(X86_FEATURE_PAT)
+#define cpu_has_3dnow  boot_cpu_has(X86_FEATURE_3DNOW)
+#define cpu_has_3dnowext   boot_cpu_has(X86_FEATURE_3DNOWEXT)
 
 #if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
 # define cpu_has_invlpg        1
-- 
1.5.0.6
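
A note for readers of the do_cpuid_ent hunk in this patch: in C, "&&" always yields 0 or 1, so an expression of the form (bit(X) && cond) contributes bit 0 to the mask rather than bit X. The following is a minimal userspace sketch of that effect, not kernel code. feature_bit() is a hypothetical stand-in for the kernel's bit() helper, and both function names are made up for illustration; the bit positions 30 and 31 are the ones used in the cpuid_fix_caps hunk.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's bit() helper. */
static inline uint32_t feature_bit(int n)
{
    return UINT32_C(1) << (n & 31);
}

/* EDX bit positions as used in the cpuid_fix_caps hunk of the patch. */
enum { EDX_3DNOWEXT = 30, EDX_3DNOW = 31 };

/* Same shape as the patched expression: "&&" collapses each term to 0 or 1. */
uint32_t mask_with_logical_and(int has_3dnowext, int has_3dnow)
{
    return (uint32_t)((feature_bit(EDX_3DNOWEXT) && has_3dnowext) |
                      (feature_bit(EDX_3DNOW) && has_3dnow));
}

/* One possible alternative: select either the bit itself or zero. */
uint32_t mask_with_select(int has_3dnowext, int has_3dnow)
{
    return (has_3dnowext ? feature_bit(EDX_3DNOWEXT) : 0) |
           (has_3dnow ? feature_bit(EDX_3DNOW) : 0);
}
```

With both features present, the first helper returns 1 while the second returns 0xC0000000, which is the value a feature mask would normally be expected to carry.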


-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

[kvm-devel] QEMU/KVM: fix copy&paste bug in ACPI IRQ routing tables

2008-05-06 Thread Marcelo Tosatti


Slots 9 and 25 were using the identifier of the previous slot.

Signed-off-by: Marcelo Tosatti <[EMAIL PROTECTED]>

diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl
index d2e33f4..c145c4b 100755
--- a/bios/acpi-dsdt.dsl
+++ b/bios/acpi-dsdt.dsl
@@ -269,10 +269,10 @@ DefinitionBlock (
 Package() {0x0008, 3, LNKC, 0},
 
 // PCI Slot 9
-Package() {0x0008, 0, LNKA, 0},
-Package() {0x0008, 1, LNKB, 0},
-Package() {0x0008, 2, LNKC, 0},
-Package() {0x0008, 3, LNKD, 0},
+Package() {0x0009, 0, LNKA, 0},
+Package() {0x0009, 1, LNKB, 0},
+Package() {0x0009, 2, LNKC, 0},
+Package() {0x0009, 3, LNKD, 0},
 
 // PCI Slot 10
 Package() {0x000a, 0, LNKB, 0},
@@ -365,10 +365,10 @@ DefinitionBlock (
 Package() {0x0018, 3, LNKC, 0},
 
 // PCI Slot 25
-Package() {0x0018, 0, LNKA, 0},
-Package() {0x0018, 1, LNKB, 0},
-Package() {0x0018, 2, LNKC, 0},
-Package() {0x0018, 3, LNKD, 0},
+Package() {0x0019, 0, LNKA, 0},
+Package() {0x0019, 1, LNKB, 0},
+Package() {0x0019, 2, LNKC, 0},
+Package() {0x0019, 3, LNKD, 0},
 
 // PCI Slot 26
 Package() {0x001a, 0, LNKB, 0},
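
Copy&paste slips like this one are easier to avoid when the entries are generated. Below is a rough sketch, assuming the link rotation that can be read off the entries visible in this diff (slot 9 gets A, B, C, D; slot 10 starts at B; slot 8 ends at C). The function names are hypothetical, and the first field is printed exactly as it appears in the lines above; the real table may use a different address encoding.

```c
#include <stdio.h>

/* Link letter for a slot/pin pair. The rotation is inferred only from the
 * entries visible in the diff, so treat it as an assumption. */
char prt_link(unsigned slot, unsigned pin)
{
    return (char)('A' + (slot + pin + 3) % 4);
}

/* Format one routing entry in the same textual shape as the diff lines. */
int prt_format(char *buf, size_t len, unsigned slot, unsigned pin)
{
    return snprintf(buf, len, "Package() {0x%04x, %u, LNK%c, 0},",
                    slot, pin, prt_link(slot, pin));
}

/* Print the four entries of one slot, preceded by its comment line. */
void prt_print_slot(FILE *out, unsigned slot)
{
    char line[64];

    fprintf(out, "// PCI Slot %u\n", slot);
    for (unsigned pin = 0; pin < 4; pin++) {
        prt_format(line, sizeof(line), slot, pin);
        fprintf(out, "%s\n", line);
    }
}
```

Because the slot number is passed in once and used for both the comment and the entries, the identifier cannot drift from the slot it describes.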


Re: [kvm-devel] [PATCH 0/4] paravirt clock patches

2008-05-06 Thread Gerd Hoffmann

Gerd Hoffmann wrote:
> Marcelo Tosatti wrote:
>> F8 host, recent kvm-userspace.git (so with IO thread), recent kvm.git
>> (plus your patches), haven't tried 2x but I think 4x is not necessary to
>> reproduce the problem.
> 
> Ok, see it too.  Seem to be actually two (maybe related) problems.
> 
> First the guest hangs hard after a while, burning 100% CPU time
> (deadlocked I guess), doesn't respond to sysrq any more.  Is there some
> easy way to get the guest vcpu state then?

Hmm, "info registers" in qemu monitor hangs ...

cheers,
  Gerd

-- 
http://kraxel.fedorapeople.org/xenner/


Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-06 Thread Mohammed Gamal

On Tue, May 6, 2008 at 5:30 PM, Anthony Liguori <[EMAIL PROTECTED]> wrote:
> Guillaume Thouvenin wrote:
>
> > On Mon, 5 May 2008 16:29:21 +0300
> > "Mohammed Gamal" <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > > On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <[EMAIL PROTECTED]>
> wrote:
> > >
> > >
> > >
> > > >  WinXP fails to boot with your patch applied too.  FWIW, Ubuntu 8.04
> has
> > > >  a fixed version of gfxboot that doesn't do nasty things with SS on
> > > >  privileged mode transitions.
> > > >
> > > >
> > > >
> > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
> > > FreeDOS don't boot but complain about instruction mov 0x11,sreg not
> > > being emulated.
> > >
> > >
> >
> > Can you try with this one please?
> > On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also
> > openSUSE-10.3-GM-x86_64-mini.iso
> >
> >
>
>  8.04 is not a good test-case.  7.10 is what you want to try.
>
>  The good news is, 7.10 appears to work!  The bad news is that about 20% of
> the time, it crashes and displays the following:
>
>  kvm_run: failed entry, reason 5
>  kvm_run returned -8
>
>  So something appears to be a bit buggy.  Still, very good work!
>
>  Regards,
>
>  Anthony Liguori
>
>

The 7.10 liveCD doesn't work for me at all. It only works with -no-kvm.


[kvm-devel] mmu notifier v15 -> v16 diff

2008-05-06 Thread Andrea Arcangeli

Hello everyone,

This is to allow GRU code to call __mmu_notifier_register inside the
mmap_sem (write mode is required as documented in the patch).

It also removes the requirement to implement ->release as it's not
guaranteed all users will really need it.
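
For illustration, making a callback such as ->release optional usually comes down to a NULL check at the call site. The following is a userspace model of that idea only; none of the type or function names below are the kernel's, and the real code walks its notifier list under its own locking rules.

```c
#include <stddef.h>

struct mm_model;
struct notifier_model;

/* Model of an ops table whose release callback may be left NULL. */
struct notifier_ops_model {
    void (*release)(struct notifier_model *mn, struct mm_model *mm);
};

struct notifier_model {
    const struct notifier_ops_model *ops;
    struct notifier_model *next;
    int released;              /* set by the sample callback below */
};

struct mm_model {
    struct notifier_model *notifiers;
};

/* Sample callback: a real one would tear down secondary mmu mappings. */
void mark_released(struct notifier_model *mn, struct mm_model *mm)
{
    (void)mm;
    mn->released = 1;
}

/* Invoke release only where it is implemented; return how many ran. */
int release_all_model(struct mm_model *mm)
{
    int called = 0;

    for (struct notifier_model *mn = mm->notifiers; mn; mn = mn->next) {
        if (mn->ops->release) {
            mn->ops->release(mn, mm);
            called++;
        }
    }
    return called;
}
```

A user that leaves release unset is simply skipped, which is why the comment in the patch spells out what such a user has to guarantee on its own.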

I didn't integrate the search function as we can sort that out after
2.6.26 is out and it wasn't entirely obvious it's really needed, as
the driver should be able to track if a mmu notifier is registered in
the container.

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -29,10 +29,25 @@ struct mmu_notifier_ops {
/*
 * Called either by mmu_notifier_unregister or when the mm is
 * being destroyed by exit_mmap, always before all pages are
-* freed. It's mandatory to implement this method. This can
-* run concurrently with other mmu notifier methods and it
+* freed. This can run concurrently with other mmu notifier
+* methods (the ones invoked outside the mm context) and it
 * should tear down all secondary mmu mappings and freeze the
-* secondary mmu.
+* secondary mmu. If this method isn't implemented you've to
+* be sure that nothing could possibly write to the pages
+* through the secondary mmu by the time the last thread with
+* tsk->mm == mm exits.
+*
+* As side note: the pages freed after ->release returns could
+* be immediately reallocated by the gart at an alias physical
+* address with a different cache model, so if ->release isn't
+* implemented because all _software_ driven memory accesses
+* through the secondary mmu are terminated by the time the
+* last thread of this mm quits, you've also to be sure that
+* speculative _hardware_ operations can't allocate dirty
+* cachelines in the cpu that could not be snooped and made
+* coherent with the other read and write operations happening
+* through the gart alias address, so leading to memory
+* corruption.
 */
void (*release)(struct mmu_notifier *mn,
struct mm_struct *mm);
diff --git a/mm/mmap.c b/mm/mmap.c
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2340,13 +2340,20 @@ static inline void __mm_unlock(spinlock_
 /*
  * This operation locks against the VM for all pte/vma/mm related
  * operations that could ever happen on a certain mm. This includes
- * vmtruncate, try_to_unmap, and all page faults. The holder
- * must not hold any mm related lock. A single task can't take more
- * than one mm_lock in a row or it would deadlock.
+ * vmtruncate, try_to_unmap, and all page faults.
  *
- * The mmap_sem must be taken in write mode to block all operations
- * that could modify pagetables and free pages without altering the
- * vma layout (for example populate_range() with nonlinear vmas).
+ * The caller must take the mmap_sem in read or write mode before
+ * calling mm_lock(). The caller isn't allowed to release the mmap_sem
+ * until mm_unlock() returns.
+ *
+ * While mm_lock() itself won't strictly require the mmap_sem in write
+ * mode to be safe, in order to block all operations that could modify
+ * pagetables and free pages without need of altering the vma layout
+ * (for example populate_range() with nonlinear vmas) the mmap_sem
+ * must be taken in write mode by the caller.
+ *
+ * A single task can't take more than one mm_lock in a row or it would
+ * deadlock.
  *
  * The sorting is needed to avoid lock inversion deadlocks if two
  * tasks run mm_lock at the same time on different mm that happen to
@@ -2377,17 +2384,13 @@ int mm_lock(struct mm_struct *mm, struct
 {
spinlock_t **anon_vma_locks, **i_mmap_locks;
 
-   down_write(&mm->mmap_sem);
if (mm->map_count) {
anon_vma_locks = vmalloc(sizeof(spinlock_t *) * mm->map_count);
-   if (unlikely(!anon_vma_locks)) {
-   up_write(&mm->mmap_sem);
+   if (unlikely(!anon_vma_locks))
return -ENOMEM;
-   }
 
i_mmap_locks = vmalloc(sizeof(spinlock_t *) * mm->map_count);
if (unlikely(!i_mmap_locks)) {
-   up_write(&mm->mmap_sem);
vfree(anon_vma_locks);
return -ENOMEM;
}
@@ -2426,10 +2429,12 @@ static void mm_unlock_vfree(spinlock_t *
 /*
  * mm_unlock doesn't require any memory allocation and it won't fail.
  *
+ * The mmap_sem cannot be released until mm_unlock returns.
+ *
  * All memory has been previously allocated by mm_lock and it'll be
  * all freed before returning. Only after mm_unlock returns, the
  * caller is allowed to free and forget the mm_lock_data structure.
- * 
+ *
  * mm_unlock runs in O(N) where N is the max number of VMAs in the
  * mm. The max number of vmas is defined in
  * /proc

Re: [kvm-devel] [PATCH 1/4] Replace SIGUSR1 in io-thread with eventfd() (v2)

2008-05-06 Thread Marcelo Tosatti

Looks good (the whole series).

Needs some good testing of course... Have you tested migration/loadvm?

On Mon, May 05, 2008 at 08:47:12AM -0500, Anthony Liguori wrote:
> It's a little odd to use signals to raise a notification on a file descriptor
> when we can just work directly with a file descriptor instead.  This patch
> converts the SIGUSR1 based notification in the io-thread to instead use an
> eventfd file descriptor.  If eventfd isn't available, we use a pipe() instead.
> 
> The benefit of using eventfd is that multiple notifications will be batched
> into a single IO event.
> 
> Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>
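
As a rough illustration of the approach described in the quoted commit message (this is not the code from the series), a notifier can try eventfd() first and fall back to a pipe(). All names below are hypothetical, and the sketch assumes a Linux system with a libc that provides <sys/eventfd.h>.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

struct io_notifier {
    int rfd;   /* end the io-thread would poll and read */
    int wfd;   /* end other threads write to (same fd for eventfd) */
};

static int set_nonblock(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    return (fl < 0) ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Prefer eventfd; fall back to a pipe if eventfd is unavailable. */
int io_notifier_init(struct io_notifier *n)
{
    int fds[2];
    int fd = eventfd(0, 0);

    if (fd >= 0)
        fds[0] = fds[1] = fd;
    else if (pipe(fds) < 0)
        return -errno;

    if (set_nonblock(fds[0]) < 0 || set_nonblock(fds[1]) < 0) {
        int err = errno;
        close(fds[0]);
        if (fds[1] != fds[0])
            close(fds[1]);
        return -err;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* Raise one notification: an 8-byte write works for both fd types. */
int io_notifier_signal(struct io_notifier *n)
{
    uint64_t one = 1;
    return write(n->wfd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -errno;
}

/* Consume everything pending; return how many notifications were raised. */
uint64_t io_notifier_drain(struct io_notifier *n)
{
    uint64_t buf[32], total = 0;
    ssize_t len;

    while ((len = read(n->rfd, buf, sizeof(buf))) > 0) {
        if (n->rfd == n->wfd)
            total += buf[0];                          /* eventfd counter */
        else
            total += (uint64_t)len / sizeof(buf[0]);  /* pipe tokens */
    }
    return total;
}
```

With eventfd, three calls to io_notifier_signal() are seen by the reader as one readable event whose counter is 3, which is the batching the commit message refers to.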


Re: [kvm-devel] [PATCH 1/4] Replace SIGUSR1 in io-thread with eventfd() (v2)

2008-05-06 Thread Anthony Liguori

Marcelo Tosatti wrote:
> Looks good (the whole series).
>
> Needs some good testing of course... Have you tested migration/loadvm?
>   

No, but I will before resubmitting (which should be sometime tomorrow).

Regards,

Anthony Liguori

> On Mon, May 05, 2008 at 08:47:12AM -0500, Anthony Liguori wrote:
>   
>> It's a little odd to use signals to raise a notification on a file descriptor
>> when we can just work directly with a file descriptor instead.  This patch
>> converts the SIGUSR1 based notification in the io-thread to instead use an
>> eventfd file descriptor.  If eventfd isn't available, we use a pipe() 
>> instead.
>>
>> The benefit of using eventfd is that multiple notifications will be batched
>> into a single IO event.
>>
>> Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>
>> 


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Kay, Allen M

>> +
>> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
>> +
>> +struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev
>> *dev);
>> +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu);
>> +void iommu_free_domain(struct dmar_domain *domain);
>> +int domain_init(struct dmar_domain *domain, int guest_width);
>> +int domain_context_mapping(struct dmar_domain *d,
>> +struct pci_dev *pdev);
>> +int domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
>> +u64 hpa, size_t size, int prot);
>> +void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8
>> devfn);
>> +struct dmar_domain * find_domain(struct pci_dev *pdev);
>
>Please move these to a .h file and also prefix appropriate keywords:
>
>domain_context_mapping is confusing and since it's an intel 
>iommu-only thing, 
>use something like
>
>intel_iommu_domain_context_mapping 
>

These functions are currently just direct calls into existing functions
in drivers/pci/intel-iommu.c, hence the lack of more descriptive names
in the KVM environment.  To get more relevant names in the KVM
environment, we can either create wrappers for these functions or use an
iommu function table.
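
To make the function-table option concrete, here is a minimal userspace sketch. Every identifier in it is hypothetical (nothing here exists in KVM or in intel-iommu.c); it only shows the shape such a table could take, with a toy backend standing in for the VT-d driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table a backend (VT-d or another IOMMU) would fill in. */
struct kvm_iommu_ops_sketch {
    const char *name;
    int (*map_page)(void *domain, uint64_t iova, uint64_t hpa,
                    size_t size, int prot);
};

static const struct kvm_iommu_ops_sketch *active_iommu_ops;

/* Register a backend; only one may be active in this sketch. */
int kvm_iommu_register_sketch(const struct kvm_iommu_ops_sketch *ops)
{
    if (!ops || !ops->map_page)
        return -EINVAL;
    if (active_iommu_ops)
        return -EBUSY;
    active_iommu_ops = ops;
    return 0;
}

/* Generic entry point: callers never see the backend's own names. */
int kvm_iommu_map_page_sketch(void *domain, uint64_t iova, uint64_t hpa,
                              size_t size, int prot)
{
    if (!active_iommu_ops)
        return -ENODEV;
    return active_iommu_ops->map_page(domain, iova, hpa, size, prot);
}

/* Toy backend that just records what it was asked to map. */
struct toy_domain {
    uint64_t last_iova;
    unsigned long calls;
};

static int toy_map_page(void *domain, uint64_t iova, uint64_t hpa,
                        size_t size, int prot)
{
    struct toy_domain *d = domain;

    (void)hpa; (void)size; (void)prot;
    d->last_iova = iova;
    d->calls++;
    return 0;
}

const struct kvm_iommu_ops_sketch toy_vtd_ops = {
    .name = "toy-vtd",
    .map_page = toy_map_page,
};
```

Compared with plain wrappers, a table keeps the generic names in one place and lets a second IOMMU implementation plug in without touching the callers.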

Allen


Re: [kvm-devel] [PATCH] fixup 3dnow! support

2008-05-06 Thread Alexander Graf

On May 6, 2008, at 6:27 PM, Glauber Costa wrote:

> qemu recently added support for 3dnow instructions. Because of
> that, 3dnow will be featured among cpuid bits. But this will
> break kvm in cpus that don't have those instructions (which includes
> my laptop). So we fixup our cpuid before exposing it to the guest.

I actually don't see where the problem is here. As far as I read the  
code, the CPUID feature function gets received from the host CPU and  
bitwise ANDed with a bunch of features that are known to work. What's  
wrong with that approach?

But I'm pretty sure Dao can tell us a lot more about this. Has there  
been any progress in getting the new CPUID code in? I think I could  
review this sometime soon.

Alex


Re: [kvm-devel] [PATCH] fixup 3dnow! support

2008-05-06 Thread Glauber Costa

Alexander Graf wrote:
> 
> On May 6, 2008, at 6:27 PM, Glauber Costa wrote:
> 
>> qemu recently added support for 3dnow instructions. Because of
>> that, 3dnow will be featured among cpuid bits. But this will
>> break kvm in cpus that don't have those instructions (which includes
>> my laptop). So we fixup our cpuid before exposing it to the guest.
> 
> I actually don't see where the problem is here. As far as I read the 
> code, the CPUID feature function gets received from the host CPU and 
> bitwise ANDed with a bunch of features that are known to work. What's 
> wrong with that approach?
The problem is probably that, besides the known-to-work features, there
are also features that qemu puts in unconditionally. Among them, 3DNOW.

> But I'm pretty sure Dao can tell us a lot more about this.
Sure, it would be welcome.
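
To illustrate the fixup being discussed (bits that userspace sets unconditionally are cleared when the host lacks them), here is a small userspace model. The bit positions are the ones used in the patch; the function name and the idea of passing the host's EDX word as a parameter are assumptions made for the sketch. Note that the patch itself checks NX via is_efer_nx() rather than a host CPUID word, so treating NX like the other two bits is a simplification.

```c
#include <stdint.h>

/* EDX bits of the extended feature leaf, as used in the patch. */
#define EDX_NX        (UINT32_C(1) << 20)
#define EDX_3DNOWEXT  (UINT32_C(1) << 30)
#define EDX_3DNOW     (UINT32_C(1) << 31)

/* Bits this model verifies against the host; others pass through as-is. */
#define EDX_CHECKED   (EDX_NX | EDX_3DNOWEXT | EDX_3DNOW)

/* Clear every checked bit that the guest requests but the host lacks. */
uint32_t fixup_guest_edx(uint32_t guest_edx, uint32_t host_edx)
{
    return guest_edx & ~(EDX_CHECKED & ~host_edx);
}
```

For example, a guest word with NX, 3DNOWEXT and 3DNOW set, on a host that only reports NX, comes back with just NX (plus any unchecked bits untouched).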


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Anthony Liguori

Kay, Allen M wrote:
> Kvm kernel changes.
>
> Signed-off-by: Allen M Kay <[EMAIL PROTECTED]>
>
> --
>  arch/x86/kvm/Makefile  |2 
>  arch/x86/kvm/vtd.c |  183
> +
>  arch/x86/kvm/x86.c |7 +
>  include/asm-x86/kvm_host.h |3 
>  include/asm-x86/kvm_para.h |1 
>  include/linux/kvm_host.h   |6 +
>  virt/kvm/kvm_main.c|3 
>  7 files changed, 204 insertions(+), 1 deletion(-)
>
> --
>
> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index c97d35c..b1057fb 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -12,7 +12,7 @@ EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
>  kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o
> lapic.o \
>   i8254.o
>  obj-$(CONFIG_KVM) += kvm.o
> -kvm-intel-objs = vmx.o
> +kvm-intel-objs = vmx.o vtd.o
>  obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
>  kvm-amd-objs = svm.o
>  obj-$(CONFIG_KVM_AMD) += kvm-amd.o
> diff --git a/arch/x86/kvm/vtd.c b/arch/x86/kvm/vtd.c
> new file mode 100644
> index 000..9a080b5
> --- /dev/null
> +++ b/arch/x86/kvm/vtd.c
> @@ -0,0 +1,183 @@
> +/*
> + * Copyright (c) 2006, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> along with
> + * this program; if not, write to the Free Software Foundation, Inc.,
> 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + * Copyright (C) 2006-2008 Intel Corporation
> + * Author: Allen M. Kay <[EMAIL PROTECTED]>
> + * Author: Weidong Han <[EMAIL PROTECTED]>
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +//#define DEBUG
> +
> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
> +
> +struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev
> *dev);
> +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu);
> +void iommu_free_domain(struct dmar_domain *domain);
> +int domain_init(struct dmar_domain *domain, int guest_width);
> +int domain_context_mapping(struct dmar_domain *d,
> + struct pci_dev *pdev);
> +int domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
> + u64 hpa, size_t size, int prot);
> +void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8
> devfn);
> +struct dmar_domain * find_domain(struct pci_dev *pdev);
>   

These definitely need to be moved to a common header.

> +
> +int kvm_iommu_map_pages(struct kvm *kvm,
> + gfn_t base_gfn, unsigned long npages)
> +{
> + unsigned long gpa;
> + struct page *page;
> + hpa_t hpa;
> + int j, write;
> + struct vm_area_struct *vma;
> +
> + if (!kvm->arch.domain)
> + return 1;
>   

In the kernel, we should be using -errno to return error codes.

> + gpa = base_gfn << PAGE_SHIFT;
> + page = gfn_to_page(kvm, base_gfn);
> + hpa = page_to_phys(page);
>   

Please use gfn_to_pfn().  Keep in mind, by using gfn_to_page/gfn_to_pfn, 
you take a reference to a page.  You're leaking that reference here.

> + printk(KERN_DEBUG "kvm_iommu_map_page: gpa = %lx\n", gpa);
> + printk(KERN_DEBUG "kvm_iommu_map_page: hpa = %llx\n", hpa);
> + printk(KERN_DEBUG "kvm_iommu_map_page: size = %lx\n",
> + npages*PAGE_SIZE);
> +
> + for (j = 0; j < npages; j++) {
> + gpa +=  PAGE_SIZE;
> + page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
> + hpa = page_to_phys(page);
>   

Again, gfn_to_pfn() and you're taking a reference that I never see you 
releasing.
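
The review points so far (-errno return values, and pairing every page reference with a release) can be shown in a small userspace model. This is only a sketch: all names are hypothetical and the "IOMMU" is a counter. It also starts the loop at base_gfn itself, since the quoted loop advances gpa before the first mapping and so appears to skip the first page. The model drops each reference right after programming the mapping, which only makes sense if the memory is kept resident some other way; a design that pins pages for the lifetime of the mapping would release them at unmap time instead.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12

/* Toy VM state; every name here is hypothetical. */
struct model_vm {
    int has_domain;         /* stands in for kvm->arch.domain != NULL */
    long refs_held;         /* page references currently outstanding */
    unsigned long mapped;   /* pages programmed into the toy IOMMU */
    uint64_t first_gpa;     /* first guest-physical address mapped */
    unsigned long fail_at;  /* make the Nth map call fail; 0 = never */
};

/* Look up a frame and take a reference, like a gfn-to-pfn helper would. */
static uint64_t model_get_pfn(struct model_vm *vm, uint64_t gfn)
{
    vm->refs_held++;
    return gfn + 0x100;     /* arbitrary fake translation */
}

static void model_put_pfn(struct model_vm *vm, uint64_t pfn)
{
    (void)pfn;
    vm->refs_held--;
}

static int model_map(struct model_vm *vm, uint64_t gpa, uint64_t hpa)
{
    (void)hpa;
    if (vm->fail_at && vm->mapped + 1 == vm->fail_at)
        return -ENOMEM;
    if (vm->mapped == 0)
        vm->first_gpa = gpa;
    vm->mapped++;
    return 0;
}

/* Returns 0 on success or a negative errno value on failure. */
int model_iommu_map_pages(struct model_vm *vm, uint64_t base_gfn,
                          unsigned long npages)
{
    if (!vm->has_domain)
        return -ENODEV;

    for (unsigned long i = 0; i < npages; i++) {
        uint64_t gfn = base_gfn + i;          /* starts at base_gfn itself */
        uint64_t pfn = model_get_pfn(vm, gfn);
        int r = model_map(vm, gfn << MODEL_PAGE_SHIFT,
                          pfn << MODEL_PAGE_SHIFT);

        model_put_pfn(vm, pfn);               /* pair every get with a put */
        if (r)
            return r;
    }
    return 0;
}
```

After a successful or failed call, refs_held is back to zero, which is the property the review is asking for.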

> + domain_page_mapping(kvm->arch.domain, gpa, hpa,
> PAGE_SIZE,
> + DMA_PTE_READ | DMA_PTE_WRITE);
> + vma = find_vma(current->mm, gpa);
> + if (!vma)
> + return 1;
> + write = (vma->vm_flags & VM_WRITE) != 0;
> + get_user_pages(current, current->mm, gpa,
> + PAGE_SIZE, write, 0, NULL, NULL);
>   

I don't quite see what you're doing here.  It looks like you're trying 
to pre-fault the page in?  gfn_to_pfn will do that for you.  You're 
taking a bunch of references here that are never getting released.

I think the general approach here is a bit faulty.  I think what we want 
to do is mlock() from userspace to ensure all the memory is present for 
the guest.  We should combine this with MMU-notifiers such that whenever 
the userspace mapping changes, we can reprogram the IOMMU.  In the case 
where

Re: [kvm-devel] [RFC] [VTD][patch 3/3] vt-d support for pci passthrough: kvm-intel-iommu.patch

2008-05-06 Thread Anthony Liguori

Kay, Allen M wrote:
> Intel-iommu driver changes for kvm vt-d support.  Important changes are
> in intel-iommu.c.  The rest of the changes are for moving intel-iommu.h
> and iova.h from drivers/pci directory to include/linux directory.
>
> Signed-off-by: Allen M Kay <[EMAIL PROTECTED]>
>
> 
>
>  b/drivers/pci/dmar.c  |4 
>  b/drivers/pci/intel-iommu.c   |   26 ++-
>  b/drivers/pci/iova.c  |2 
>  b/include/linux/intel-iommu.h |  344
> ++
>  b/include/linux/iova.h|   52 ++
>  drivers/pci/intel-iommu.h |  344
> --
>  drivers/pci/iova.h|   52 --
>  7 files changed, 416 insertions(+), 408 deletions(-)
>
> 
>
> diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
> index f941f60..a58a5b0 100644
> --- a/drivers/pci/dmar.c
> +++ b/drivers/pci/dmar.c
> @@ -26,8 +26,8 @@
>  
>  #include 
>  #include 
> -#include "iova.h"
> -#include "intel-iommu.h"
> +#include 
> +#include 
>  
>  #undef PREFIX
>  #define PREFIX "DMAR:"
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 4cb949f..bfa888b 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -31,8 +31,8 @@
>  #include 
>  #include 
>  #include 
> -#include "iova.h"
> -#include "intel-iommu.h"
> +#include 
> +#include 
>  #include  /* force_iommu in this header in x86-64*/
>  #include 
>  #include 
> @@ -1056,7 +1056,7 @@ static void free_iommu(struct intel_iommu *iommu)
>   kfree(iommu);
>  }
>  
> -static struct dmar_domain * iommu_alloc_domain(struct intel_iommu
> *iommu)
> +struct dmar_domain * iommu_alloc_domain(struct intel_iommu *iommu)
>  {
>   unsigned long num;
>   unsigned long ndomains;
> @@ -1086,8 +1086,9 @@ static struct dmar_domain *
> iommu_alloc_domain(struct intel_iommu *iommu)
>  
>   return domain;
>  }
> +EXPORT_SYMBOL_GPL(iommu_alloc_domain);
>  
> -static void iommu_free_domain(struct dmar_domain *domain)
> +void iommu_free_domain(struct dmar_domain *domain)
>  {
>   unsigned long flags;
>  
> @@ -1095,6 +1096,7 @@ static void iommu_free_domain(struct dmar_domain
> *domain)
>   clear_bit(domain->id, domain->iommu->domain_ids);
>   spin_unlock_irqrestore(&domain->iommu->lock, flags);
>  }
> +EXPORT_SYMBOL_GPL(iommu_free_domain);
>  
>  static struct iova_domain reserved_iova_list;
>  static struct lock_class_key reserved_alloc_key;
> @@ -1160,7 +1162,7 @@ static inline int guestwidth_to_adjustwidth(int
> gaw)
>   return agaw;
>  }
>  
> -static int domain_init(struct dmar_domain *domain, int guest_width)
> +int domain_init(struct dmar_domain *domain, int guest_width)
>  {
>   

I think it's already been mentioned, but these are pretty terrible names 
if you're exporting these symbols.  Linux supports other IOMMUs so VT-d 
should not be hogging the iommu_* namespace.

Regards,

Anthony Liguori


Re: [kvm-devel] [RFC] [VTD][patch 2/3] vt-d support for pci passthrough: kvm-vtd-user.patch

2008-05-06 Thread Anthony Liguori

Avi Kivity wrote:
> Kay, Allen M wrote:
>   
>> Still todo: move vt.d to kvm-intel.ko module.
>>   
>> 
>
> Not sure it's the right thing to do. If we get the iommus abstracted 
> properly, we can rename vtd.c to dma.c and move it to virt/kvm/.
>
> The code is certainly a lot more about managing memory than anything vmx 
> specific. It's hardly x86 specific, even.
>   

Really, an external interface to KVM that allowed someone to query the 
GPA => PA mapping would suffice.  It should not fault in pages that 
aren't present and we should provide notifications for when the mapping 
changes for a given reason.  Userspace can enforce the requirement that 
memory remains present via mlock().  This allows us to implement a PV 
API for DMA registration without the IOMMU code having any particular 
knowledge of it.

Regards,

Anthony Liguori


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Kay, Allen M

>We have to ensure we don't swap KVM guest memory while using hardware 
>pass-through, but AFAICT, we do not need to make the memory 
>non-reclaimable.  As long as we reprogram the IOMMU with a new, valid, 
>mapping everything should be fine.  mlock() really gives us the right 
>semantics.
>
>Semantically, a PV API that supports DMA window registration simply 
>mlock()s the DMA regions on behalf of the guest.  No special logic 
>should be needed.
>

What should be done for an unmodified guest where there is no PV driver
in the guest?  Would a call to mlock() from
qemu/hw/pci-passthrough.c/add_pci_passthrough_device() be a reasonable
thing to do?

Allen


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Anthony Liguori

Kay, Allen M wrote:
>> We have to ensure we don't swap KVM guest memory while using hardware 
>> pass-through, but AFAICT, we do not need to make the memory 
>> non-reclaimable.  As long as we reprogram the IOMMU with a new, valid, 
>> mapping everything should be fine.  mlock() really gives us the right 
>> semantics.
>>
>> Semantically, a PV API that supports DMA window registration simply 
>> mlock()s the DMA regions on behalf of the guest.  No special logic 
>> should be needed.
>>
>> 
>
> What should be done for an unmodified guest where there is no PV driver
> in the guest?  Would a call to mlock() from
> qemu/hw/pci-passthrough.c/add_pci_passthrough_device() be a reasonable
> thing to do?
>   

Yup.  The idea is to ensure that the memory is always present, without 
necessarily taking a reference to it.  This allows for memory reclaiming, 
which should allow for things like NUMA page migration.  We can't swap, 
of course, but that doesn't mean reclamation isn't useful.

Regards,

Anthony Liguori

> Allen
>   


Re: [kvm-devel] [RFC] [VTD][patch 1/3] vt-d support for pci passthrough: kvm-vtd--kernel.patch

2008-05-06 Thread Avi Kivity

Anthony Liguori wrote:

>> What should be done for an unmodified guest where there is no PV driver
>> in the guest?  Would a call to mlock() from
>> qemu/hw/pci-passthrough.c/add_pci_passthrough_device() be a reasonable
>> thing to do?
>>   
>> 
>
> Yup.  The idea is to ensure that the memory is always present, without 
> necessarily taking a reference to it.  This allows for memory reclaiming, 
> which should allow for things like NUMA page migration.  We can't swap, 
> of course, but that doesn't mean reclamation isn't useful.
>   

I don't think we can do page migration with VT-d.  You need to be able 
to detect whether the page has been changed by dma after you've copied 
it but before you changed the pte, but VT-d doesn't allow that AFAICT.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-06 Thread Guillaume Thouvenin

On Tue, 06 May 2008 09:30:44 -0500
Anthony Liguori <[EMAIL PROTECTED]> wrote:

> 
> 8.04 is not a good test-case.  7.10 is what you want to try.

Oh yes, you're right. I tried 8.04 because Balaji had problems booting
it with the patch.

> The good news is, 7.10 appears to work!  The bad news is that about 20% 
> of the time, it crashes and displays the following:
> 
> kvm_run: failed entry, reason 5
> kvm_run returned -8
> 
> So something appears to be a bit buggy.  Still, very good work!

I can see the problem with openSUSE 10.3 too, but not so often. I'm
looking into this issue.

Thank you for the help,
Regards,
Guillaume


Re: [kvm-devel] [PATCH] Build fix for kvm/ia64 userspace.

2008-05-06 Thread Zhang, Xiantao

Avi Kivity wrote:
> Zhang, Xiantao wrote:
>> Hi, Avi
>>   This patch should go into RC1, otherwise it will block kvm/ia64
>> userspace build. 
>> 
>> diff --git a/include/asm-ia64/kvm.h b/include/asm-ia64/kvm.h index
>> eb2d355..62b5fad 100644 --- a/include/asm-ia64/kvm.h
>> +++ b/include/asm-ia64/kvm.h
>> @@ -22,7 +22,12 @@
>>   */
>> 
>>  #include 
>> +
>> +#ifdef __KERNEL__
>>  #include 
>> +#else
>> +#include 
>> +#endif
>> 
> 
> Fishy.  A kernel header including a userspace header?
> 
> Maybe you need to include  unconditionally?
Hi, Avi
As you know, kvm.h is shared by userspace and the kernel. Unfortunately,
the userspace header files define one structure (struct ia64_fpreg)
twice, once in asm/fpu.h and once in bits/sigcontext, which may be a bug.
Therefore, if userspace code includes both fpu.h and sigcontext.h in
one source file, the compiler complains about the redefinition.  Do you
have a good idea for coping with this issue?
Xiantao

