Re: [PATCH 3/4] add ksm kernel shared memory driver.

2008-12-04 Thread Alan Cox
>> Taken off list
>
> Hmmm, list would like to know :-).

That would be my choice too, but unfortunately I can't do that.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/12] Factor VT-d KVM functions into a generic API (with multiple device assignment support)

2008-12-04 Thread Muli Ben-Yehuda
On Wed, Dec 03, 2008 at 10:03:04AM +0100, Joerg Roedel wrote:

>>>> Have you tried porting any of the current iommu controllers to
>>>> this new framework to see if it works properly for them?
>>>
>>> It works currently for VT-d. I also port it to AMD IOMMU
>>> currently. With some extensions (offset for start address, flags
>>> and size limitation) it is also suitable for IOMMUs like GART or
>>> similar ones.
>>
>> What about the Calgary chipset?
>
> Calgary is quite similar to GART (there is something like the
> aperture and a linear single-level pagetable).

Actually, Calgary has multiple per-bus address spaces (each of which
is a single-level linear pagetable limited to 4GB of addressable
memory), so I think it should work with your current approach pretty
much as is, once we take into account these two (per-bus and 32-bit
addressability) limitations.
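To make the two limitations concrete, here is a minimal sketch in C of the checks a generic mapping layer might perform for a Calgary-like IOMMU. All names (calgary_like_domain, mapping_fits, IOVA_LIMIT) are hypothetical and are not taken from the proposed API or from the Calgary driver; the sketch only illustrates "one address space per bus" and "I/O addresses limited to 4GB".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-bus translation domain; not the real Calgary structures. */
struct calgary_like_domain {
    uint8_t bus;    /* PCI bus this address space belongs to */
};

/* Upper bound of the I/O address space: 4GB (32-bit addressability). */
#define IOVA_LIMIT (UINT64_C(1) << 32)

/*
 * Return true if a device on dev_bus may use this domain and the I/O
 * address range [iova, iova + size) lies entirely below 4GB.
 */
static bool mapping_fits(const struct calgary_like_domain *dom,
                         uint8_t dev_bus, uint64_t iova, uint64_t size)
{
    if (dom->bus != dev_bus)
        return false;                   /* address spaces are per bus */
    if (size == 0 || iova >= IOVA_LIMIT)
        return false;
    return size <= IOVA_LIMIT - iova;   /* overflow-safe range check */
}
```

A mapping that ends exactly at the 4GB boundary is accepted, one that crosses it is rejected, and so is any request from a device on a different bus.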

Cheers,
Muli
-- 
The First Workshop on I/O Virtualization (WIOV '08)
Dec 2008, San Diego, CA, http://www.usenix.org/wiov08/
   -
SYSTOR 2009---The Israeli Experimental Systems Conference
http://www.haifa.il.ibm.com/conferences/systor2009/


[PATCH] KVM: emulator: Fix handling of VMMCALL instruction

2008-12-04 Thread Amit Shah
The VMMCALL instruction doesn't get recognised and isn't processed
by the emulator.

This is seen on an Intel host that tries to execute the VMMCALL
instruction after a guest live migrates from an AMD host.

Signed-off-by: Amit Shah [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 69b330b..f29e8f0 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -299,7 +299,7 @@ static u16 group_table[] = {
 
 static u16 group2_table[] = {
[Group7*8] =
-   SrcNone | ModRM, 0, 0, 0,
+   SrcNone | ModRM, 0, 0, SrcNone | ModRM,
SrcNone | ModRM | DstMem | Mov, 0,
SrcMem16 | ModRM | Mov, 0,
 };
-- 
1.5.6.3



Re: [PATCH 6/7] Do not modify VirtQueueElement

2008-12-04 Thread Mark McLoughlin
On Wed, 2008-12-03 at 11:27 -0600, Anthony Liguori wrote:

> Right now, virtio-net modifies the iovec structure in VirtQueueElement.  This is
> evil.  This creates an impossible situation if we want to bounce iovec buffers
> in VirtQueueElement as we lose track of the original buffers and the resulting
> free results in very bad things.
>
> I tried to refactor receive_headers() and iov_fill() to be able to skip the
> header if present but failed miserably.  Instead of spending more time trying to
> get that to work, I simply decided to leave the code as-is and copy the iovec
> to a temporary buffer.

Sounds sensible to me.

We could perhaps make things a bit clearer by never using the
guest-supplied header and always translating between the two header formats.

But whatever way we do it, if we want to avoid modifying the original
iovec structure, we need the temporary buffer.
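As an illustration of the temporary-buffer approach, here is a minimal sketch in C. It is not the code from the patch: iov_copy_skip is a hypothetical helper that copies only the iovec array (not the data it points to) and advances the copy past a header of skip bytes, so the caller's original array can still be freed or bounced safely.

```c
#include <stdlib.h>
#include <sys/uio.h>

/*
 * Copy the iovec array and advance the copy past the first `skip` bytes.
 * The caller's array is left untouched. Returns a malloc'ed array that the
 * caller must free(), or NULL on allocation failure; *out_cnt receives the
 * number of entries that still hold data.
 */
static struct iovec *iov_copy_skip(const struct iovec *iov, int iovcnt,
                                   size_t skip, int *out_cnt)
{
    struct iovec *tmp = malloc(sizeof(*tmp) * (iovcnt > 0 ? iovcnt : 1));
    int i, n = 0;

    if (!tmp)
        return NULL;

    for (i = 0; i < iovcnt; i++) {
        if (skip >= iov[i].iov_len) {   /* entry fully consumed by the skip */
            skip -= iov[i].iov_len;
            continue;
        }
        tmp[n].iov_base = (char *)iov[i].iov_base + skip;
        tmp[n].iov_len = iov[i].iov_len - skip;
        skip = 0;
        n++;
    }
    *out_cnt = n;
    return tmp;
}
```

The cost is one small allocation per packet; in exchange the original iov_base and iov_len values stay exactly as they were handed over.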

Cheers,
Mark.



Re: [PATCH] Use guards in virtio-net for easier upstream merging

2008-12-04 Thread Mark McLoughlin
On Wed, 2008-12-03 at 14:32 -0600, Anthony Liguori wrote:

> @@ -189,6 +205,9 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
>      struct virtio_net_hdr_mrg_rxbuf *mhdr = NULL;
>      int hdr_len, offset, i;
>  
> +    if (!virtio_net_can_receive(opaque))
> +        return;

Should pass the buffer size to virtio_net_can_receive() to limit the
work virtqueue_avail_bytes() has to do.

Cheers,
Mark.



Re: [PATCH] Fix unitialized offset in virtio-net receive_header

2008-12-04 Thread Mark McLoughlin
On Wed, 2008-12-03 at 14:27 -0600, Anthony Liguori wrote:
> If vnet support is not available on the tap device, offset is uninitialized and
> badness ensues.

Uggh, nasty. And no compiler warning ...

Cheers,
Mark.



[PATCH] fix compile failure in acpi.c

2008-12-04 Thread Glauber Costa
The kvm header must always be included.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 qemu/hw/acpi.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/acpi.c b/qemu/hw/acpi.c
index 12f4fce..5458e54 100644
--- a/qemu/hw/acpi.c
+++ b/qemu/hw/acpi.c
@@ -24,9 +24,7 @@
 #include "i2c.h"
 #include "smbus.h"
 #include "kvm.h"
-#ifdef USE_KVM
 #include "qemu-kvm.h"
-#endif
 #include <string.h>
 
 //#define DEBUG
-- 
1.5.6.5



[PATCH] virtio: make PCI devices take a virtio_pci module ref

2008-12-04 Thread Mark McLoughlin
Nothing takes a ref on virtio_pci, so even if you have
devices in use, rmmod will attempt to unload the module.

Fix by simply making each device take a ref on the module.

Signed-off-by: Mark McLoughlin [EMAIL PROTECTED]
Reported-by: Michael Tokarev [EMAIL PROTECTED]
---
 drivers/virtio/virtio_pci.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index c7dc37c..147a17f 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -322,6 +322,9 @@ static int __devinit virtio_pci_probe(struct pci_dev 
*pci_dev,
return -ENODEV;
}
 
+   if (!try_module_get(THIS_MODULE))
+   return -ENODEV;
+
/* allocate our structure and fill it out */
vp_dev = kzalloc(sizeof(struct virtio_pci_device), GFP_KERNEL);
if (vp_dev == NULL)
@@ -393,6 +396,7 @@ static void __devexit virtio_pci_remove(struct pci_dev 
*pci_dev)
pci_release_regions(pci_dev);
pci_disable_device(pci_dev);
kfree(vp_dev);
+   module_put(THIS_MODULE);
 }
 
 #ifdef CONFIG_PM
-- 
1.6.0.3



[Patch 0/5] x86_emulator: emulate shld and shrd instructions

2008-12-04 Thread Guillaume Thouvenin
This series of patches emulates the shld and shrd instructions. As those
instructions have three operands, we introduce a decode set for the Src2
operand. To make room for it, the opcode descriptor needs to be extended
to 32 bits.

So this series of patches:
 [1/5] extend the opcode descriptor to 32 bits
 [2/5] add Src2 decode set
 [3/5] add a new implied 1 Src decode type
 [4/5] add the assembler code for three operands (one operand is stored
in ECX)
 [5/5] add the emulation of shld and shrd instructions
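
For readers unfamiliar with the two instructions, here is a minimal sketch in C of their data semantics for 32-bit operands. It is not part of the series: shld32 and shrd32 are hypothetical helper names, flags are not modelled, and the series itself executes the real instructions through inline assembly rather than computing the result in C.

```c
#include <stdint.h>

/*
 * shld: shift dst left by count, filling the vacated low bits with the
 * high bits of src. The count is masked to 5 bits, as the CPU does for
 * 32-bit operands.
 */
static uint32_t shld32(uint32_t dst, uint32_t src, unsigned count)
{
    count &= 31;
    if (count == 0)
        return dst;     /* also avoids an undefined 32-bit shift by 32 */
    return (dst << count) | (src >> (32 - count));
}

/*
 * shrd: shift dst right by count, filling the vacated high bits with the
 * low bits of src.
 */
static uint32_t shrd32(uint32_t dst, uint32_t src, unsigned count)
{
    count &= 31;
    if (count == 0)
        return dst;
    return (dst >> count) | (src << (32 - count));
}
```

With these illustrative inputs, shld32(0xbe, 0xef000000, 8) yields 0xbeef and shrd32(0xbe00, 0xef, 8) yields 0xef0000be. The shift count is the third operand, which is why the decoder needs a second source (an immediate byte or CL).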


[Patch 1/5] x86_emulator: Extend the opcode descriptor

2008-12-04 Thread Guillaume Thouvenin
Extend the opcode descriptor to 32 bits. This is needed by the
introduction of a new Src2 operand type.

Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 69b330b..7a07ca4 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -76,7 +76,7 @@ enum {
Group1A, Group3_Byte, Group3, Group4, Group5, Group7,
 };
 
-static u16 opcode_table[256] = {
+static u32 opcode_table[256] = {
/* 0x00 - 0x07 */
ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
@@ -195,7 +195,7 @@ static u16 opcode_table[256] = {
ImplicitOps, ImplicitOps, Group | Group4, Group | Group5,
 };
 
-static u16 twobyte_table[256] = {
+static u32 twobyte_table[256] = {
/* 0x00 - 0x0F */
0, Group | GroupDual | Group7, 0, 0, 0, 0, ImplicitOps, 0,
ImplicitOps, ImplicitOps, 0, 0, 0, ImplicitOps | ModRM, 0, 0,
@@ -253,7 +253,7 @@ static u16 twobyte_table[256] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
 };
 
-static u16 group_table[] = {
+static u32 group_table[] = {
[Group1_80*8] =
ByteOp | DstMem | SrcImm | ModRM, ByteOp | DstMem | SrcImm | ModRM,
ByteOp | DstMem | SrcImm | ModRM, ByteOp | DstMem | SrcImm | ModRM,
@@ -297,7 +297,7 @@ static u16 group_table[] = {
SrcMem16 | ModRM | Mov, SrcMem | ModRM | ByteOp,
 };
 
-static u16 group2_table[] = {
+static u32 group2_table[] = {
[Group7*8] =
SrcNone | ModRM, 0, 0, 0,
SrcNone | ModRM | DstMem | Mov, 0,
-- 
1.6.0.4.623.g171d7



[Patch 2/5] x86_emulator: add Src2 decode set

2008-12-04 Thread Guillaume Thouvenin
Instructions like shld have three operands, so we need to add a Src2
decode set. We start with Src2None, Src2CL, Src2ImmByte, and Src2One to
support shld/shrd, and we will expand it later.
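
The following standalone sketch in C shows why a field at this position needs a 32-bit descriptor. It assumes the Src2 type occupies bits 29-31 of the descriptor; the upper-case names and the truncate16 helper are hypothetical stand-ins rather than the identifiers used in x86_emulate.c.

```c
#include <stdint.h>

/* Illustrative layout: a 3-bit Src2 type field in bits 29-31 (assumption). */
#define SRC2_SHIFT   29
#define SRC2_NONE    (UINT32_C(0) << SRC2_SHIFT)
#define SRC2_CL      (UINT32_C(1) << SRC2_SHIFT)
#define SRC2_IMMBYTE (UINT32_C(2) << SRC2_SHIFT)
#define SRC2_ONE     (UINT32_C(3) << SRC2_SHIFT)
#define SRC2_MASK    (UINT32_C(7) << SRC2_SHIFT)

/* Stand-in for an unrelated flag stored in the low 16 bits. */
#define MODRM_FLAG   (UINT32_C(1) << 7)

/* Return the second source operand type selected by a descriptor. */
static uint32_t src2_type(uint32_t desc)
{
    return desc & SRC2_MASK;
}

/* What a 16-bit table entry would retain of the same descriptor. */
static uint16_t truncate16(uint32_t desc)
{
    return (uint16_t)desc;
}
```

For a descriptor built as MODRM_FLAG | SRC2_IMMBYTE, src2_type() returns SRC2_IMMBYTE, while the value that survives truncate16() decodes as SRC2_NONE. Unsigned constants are used here so that shifting 7 into bit 31 stays well defined in C.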

Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 arch/x86/include/asm/kvm_x86_emulate.h |1 +
 arch/x86/kvm/x86_emulate.c |   29 +
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_x86_emulate.h 
b/arch/x86/include/asm/kvm_x86_emulate.h
index 16a0026..6a15973 100644
--- a/arch/x86/include/asm/kvm_x86_emulate.h
+++ b/arch/x86/include/asm/kvm_x86_emulate.h
@@ -123,6 +123,7 @@ struct decode_cache {
u8 ad_bytes;
u8 rex_prefix;
struct operand src;
+   struct operand src2;
struct operand dst;
bool has_seg_override;
u8 seg_override;
diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 7a07ca4..7f5cd62 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -70,6 +70,12 @@
 #define Group       (1<<14)     /* Bits 3:5 of modrm byte extend opcode */
 #define GroupDual   (1<<15)     /* Alternate decoding of mod == 3 */
 #define GroupMask   0xff        /* Group number stored in bits 0:7 */
+/* Source 2 operand type */
+#define Src2None    (0<<29)
+#define Src2CL      (1<<29)
+#define Src2ImmByte (2<<29)
+#define Src2One     (3<<29)
+#define Src2Mask    (7<<29)
 
 enum {
Group1_80, Group1_81, Group1_82, Group1_83,
@@ -1000,6 +1006,29 @@ done_prefixes:
break;
}
 
+	/*
+	 * Decode and fetch the second source operand: register, memory
+	 * or immediate.
+	 */
+	switch (c->d & Src2Mask) {
+	case Src2None:
+		break;
+	case Src2CL:
+		c->src2.bytes = 1;
+		c->src2.val = c->regs[VCPU_REGS_RCX] & 0x8;
+		break;
+	case Src2ImmByte:
+		c->src2.type = OP_IMM;
+		c->src2.ptr = (unsigned long *)c->eip;
+		c->src2.bytes = 1;
+		c->src2.val = insn_fetch(u8, 1, c->eip);
+		break;
+	case Src2One:
+		c->src2.bytes = 1;
+		c->src2.val = 1;
+		break;
+	}
+
 	/* Decode and fetch the destination operand: register or memory. */
 	switch (c->d & DstMask) {
 	case ImplicitOps:
-- 
1.6.0.4.623.g171d7



[Patch 3/5] x86_emulator: add a new implied 1 Src decode type

2008-12-04 Thread Guillaume Thouvenin
Add a SrcOne operand type for decoding an implied '1', as used by the
regular shift instructions.

Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 7f5cd62..0c75306 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -58,6 +58,7 @@
 #define SrcMem32    (4<<4)	/* Memory operand (32-bit). */
 #define SrcImm      (5<<4)	/* Immediate operand. */
 #define SrcImmByte  (6<<4)	/* 8-bit sign-extended immediate operand. */
+#define SrcOne      (7<<4)	/* Implied '1' */
 #define SrcMask     (7<<4)
 /* Generic ModRM decode. */
 #define ModRM       (1<<7)
@@ -1004,6 +1005,10 @@ done_prefixes:
 		c->src.bytes = 1;
 		c->src.val = insn_fetch(s8, 1, c->eip);
 		break;
+	case SrcOne:
+		c->src.bytes = 1;
+		c->src.val = 1;
+		break;
}
 
/*
-- 
1.6.0.4.623.g171d7



[Patch 4/5] x86_emulator: add the assembler code for three operands

2008-12-04 Thread Guillaume Thouvenin
Add the assembler code for instructions with three operands, one of
which is stored in the ECX register.

Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |   39 +++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 0c75306..9ae6d5b 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -431,6 +431,45 @@ static u32 group2_table[] = {
 	__emulate_2op_nobyte(_op, _src, _dst, _eflags,			\
 			     "w", "r", _LO32, "r", "", "r")
 
+/* Instruction has three operands and one operand is stored in ECX register */
+#define __emulate_2op_cl(_op, _cl, _src, _dst, _eflags, _suffix, _type)	\
+	do {								\
+		unsigned long _tmp;					\
+		_type _clv  = (_cl).val;				\
+		_type _srcv = (_src).val;				\
+		_type _dstv = (_dst).val;				\
+									\
+		__asm__ __volatile__ (					\
+			_PRE_EFLAGS("0", "5", "2")			\
+			_op _suffix " %4,%1 \n"				\
+			_POST_EFLAGS("0", "5", "2")			\
+			: "=m" (_eflags), "+r" (_dstv), "=&r" (_tmp)	\
+			: "c" (_clv) , "r" (_srcv), "i" (EFLAGS_MASK)	\
+			);						\
+									\
+		(_cl).val  = (unsigned long) _clv;			\
+		(_src).val = (unsigned long) _srcv;			\
+		(_dst).val = (unsigned long) _dstv;			\
+	} while (0)
+
+#define emulate_2op_cl(_op, _cl, _src, _dst, _eflags)			\
+	do {								\
+		switch ((_dst).bytes) {					\
+		case 2:							\
+			__emulate_2op_cl(_op, _cl, _src, _dst, _eflags,	\
+						"w", unsigned short);	\
+			break;						\
+		case 4:							\
+			__emulate_2op_cl(_op, _cl, _src, _dst, _eflags,	\
+						"l", unsigned int);	\
+			break;						\
+		case 8:							\
+			ON64(__emulate_2op_cl(_op, _cl, _src, _dst, _eflags,	\
+						"q", unsigned long));	\
+			break;						\
+		}							\
+	} while (0)
+
 #define __emulate_1op(_op, _dst, _eflags, _suffix)			\
 	do {								\
 		unsigned long _tmp;					\
-- 
1.6.0.4.623.g171d7



[Patch 5/5] x86_emulator: add the emulation of shld and shrd instructions

2008-12-04 Thread Guillaume Thouvenin
Add emulation of shld and shrd instructions

Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 9ae6d5b..219dc31 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -237,9 +237,14 @@ static u32 twobyte_table[256] = {
/* 0x90 - 0x9F */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* 0xA0 - 0xA7 */
-   0, 0, 0, DstMem | SrcReg | ModRM | BitOp, 0, 0, 0, 0,
+   0, 0, 0, DstMem | SrcReg | ModRM | BitOp,
+   DstMem | SrcReg | Src2ImmByte | ModRM,
+   DstMem | SrcReg | Src2CL | ModRM, 0, 0,
/* 0xA8 - 0xAF */
-   0, 0, 0, DstMem | SrcReg | ModRM | BitOp, 0, 0, ModRM, 0,
+   0, 0, 0, DstMem | SrcReg | ModRM | BitOp,
+   DstMem | SrcReg | Src2ImmByte | ModRM,
+   DstMem | SrcReg | Src2CL | ModRM,
+   ModRM, 0,
/* 0xB0 - 0xB7 */
ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM, 0,
DstMem | SrcReg | ModRM | BitOp,
@@ -2037,12 +2042,20 @@ twobyte_insn:
 		c->src.val &= (c->dst.bytes << 3) - 1;
 		emulate_2op_SrcV_nobyte("bt", c->src, c->dst, ctxt->eflags);
 		break;
+	case 0xa4: /* shld imm8, r, r/m */
+	case 0xa5: /* shld cl, r, r/m */
+		emulate_2op_cl("shld", c->src2, c->src, c->dst, ctxt->eflags);
+		break;
 	case 0xab:
 	      bts:		/* bts */
 		/* only subword offset */
 		c->src.val &= (c->dst.bytes << 3) - 1;
 		emulate_2op_SrcV_nobyte("bts", c->src, c->dst, ctxt->eflags);
 		break;
+	case 0xac: /* shrd imm8, r, r/m */
+	case 0xad: /* shrd cl, r, r/m */
+		emulate_2op_cl("shrd", c->src2, c->src, c->dst, ctxt->eflags);
+		break;
+   break;
case 0xae:  /* clflush */
break;
case 0xb0 ... 0xb1: /* cmpxchg */
-- 
1.6.0.4.623.g171d7



[PATCH] kvm testsuite: Add test for 'shld' instruction

2008-12-04 Thread Guillaume Thouvenin
Add 'shld' instruction test in real mode test harness. 

Avi, on my computer this test is broken, but the problem seems to be
elsewhere because test_shld() works fine alone. I'm inspecting the other
tests. Used alone, shld gives the right values (the shift is correct and
the bits are filled in).

The error is the following:

kvm_run: failed entry, reason 7
rax 000e rbx 1fa4 rcx  rdx 
00f1
rsi 218d rdi 1ff4 rsp 1f84 rbp 

r8   r9   r10  r11 

r12  r13  r14  r15 

rip 004b rflags 00023646
cs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr  (fffbd000/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt  (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 2000/17
idt 0/
cr0 6010 cr2 0 cr3 0 cr4 0 cr8 0 efer 0


Signed-off-by: Guillaume Thouvenin [EMAIL PROTECTED]
---
 user/test/x86/realmode.c |   19 +++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/user/test/x86/realmode.c b/user/test/x86/realmode.c
index 2c3be1e..a65f9f2 100644
--- a/user/test/x86/realmode.c
+++ b/user/test/x86/realmode.c
@@ -141,6 +141,19 @@ int regs_equal(const struct regs *r1, const struct regs *r2, int ignore)
 		); \
 	extern u8 insn_##name[], insn_##name##_end[]
 
+void test_shld(const struct regs *inregs, struct regs *outregs)
+{
+	MK_INSN(shld_test, "shld $8,%edx,%eax\n\t");
+
+	exec_in_big_real_mode(inregs, outregs,
+			      insn_shld_test,
+			      insn_shld_test_end - insn_shld_test);
+	if (outregs->eax != 0xbeef)
+		print_serial("shld: failure\n");
+	else
+		print_serial("shld: success\n");
+}
+
 void test_mov_imm(const struct regs *inregs, struct regs *outregs)
 {
 	MK_INSN(mov_r32_imm_1, "mov $1234567890, %eax");
@@ -360,11 +373,17 @@ void start(void)
 	if (!regs_equal(&inregs, &outregs, 0))
 		print_serial("null test: FAIL\n");
 	test_call(&inregs, &outregs);
+
+	inregs.eax = 0xbe;
+	inregs.edx = 0xef00;
+	test_shld(&inregs, &outregs);
+
 	test_mov_imm(&inregs, &outregs);
 	test_cmp_imm(&inregs, &outregs);
 	test_add_imm(&inregs, &outregs);
 	test_io(&inregs, &outregs);
 	test_eflags_insn(&inregs, &outregs);
+	
 	exit(0);
 }
 
-- 
1.6.0.4.623.g171d7



lsi_scsi: error: Bad Status move errors with kvm-79

2008-12-04 Thread Anssi Kolehmainen
Hi,

I have kvm environment with linux-2.6.28-rc7 x86_64 (Xeon), kvm-79 host
and bunch of Win2K3 guests. Sometimes I get 'lsi_scsi: error: Bad Status
move' from kvm (qemu) and in Windows event log "The device,
\Device\Scsi\sym_hi1, did not respond within the timeout period." These
errors come somewhat at random, usually with 10-30 second intervals when
there is enough disk usage in the guest (seems that installing Bea
Weblogic or Oracle database is pretty nice for causing these errors).

Usually windows is able to recover from these but sometimes (=too often)
I get random delays and hangups. Also I have gotten BSOD 0x77 (0x02,
0x00, 0x00, 0x5f4000) about once a day.

Any ideas how to debug / fix this problem?


KVM startup command:
/usr/local/bin/qemu-system-x86_64 -name vm1
 -smp 1 -m 1024 -vnc :4 -k fi -serial mon:telnet::10004,server,nowait
 -daemonize -localtime -vga std -usb -usbdevice tablet 
 -net nic,macaddr=00:16:3e:00:00:4,model=e1000 -net tap,ifname=tap-vm1
 -pidfile /var/run/kvm/vm1.pid -boot c 
 -drive index=0,media=disk,if=scsi,boot=on,file=/dev/mapper/vg0-vm

-- 
Anssi Kolehmainen
[EMAIL PROTECTED]
040-5085390


Re: [PATCH] Use guards in virtio-net for easier upstream merging

2008-12-04 Thread Anthony Liguori

Mark McLoughlin wrote:
> On Wed, 2008-12-03 at 14:32 -0600, Anthony Liguori wrote:
>
>> @@ -189,6 +205,9 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
>>      struct virtio_net_hdr_mrg_rxbuf *mhdr = NULL;
>>      int hdr_len, offset, i;
>>  
>> +    if (!virtio_net_can_receive(opaque))
>> +        return;
>
> Should pass the buffer size to virtio_net_can_receive() to limit the
> work virtqueue_avail_bytes() has to do.

Sure, I have to resubmit anyway because I left tap_vnet disabled (didn't
refresh before sending the patch, doh!).

Regards,

Anthony Liguori

> Cheers,
> Mark.


Re: [PATCH] Fix unitialized offset in virtio-net receive_header

2008-12-04 Thread Anthony Liguori

Mark McLoughlin wrote:
> On Wed, 2008-12-03 at 14:27 -0600, Anthony Liguori wrote:
>
>> If vnet support is not available on the tap device, offset is uninitialized and
>> badness ensues.
>
> Uggh, nasty. And no compiler warning ...

My gcc happens to throw a warning.  GCC is lame like that.

Regards,

Anthony Liguori

> Cheers,
> Mark.


[PATCH] Use guards in virtio-net for easier upstream merging (v2)

2008-12-04 Thread Anthony Liguori
This gets virtio-net into an upstream acceptable state.  This includes
introducing USE_KVM guards for IO thread notification (where did
qemu_service_io() go?).  It also includes introducing TAP_VNET_HDR which is for
code that relies on the tap vnet support that is not currently in upstream QEMU.

Finally, it drops packets if not ready to receive.  This is unnecessary in
kvm-userspace but necessary in QEMU.

Since v1, I enable tap vnet by default, and pass the buffer size to
virtio_net_receive() as suggested by Mark.

Signed-off-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index 1283735..da2e835 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -15,7 +15,11 @@
 #include "net.h"
 #include "qemu-timer.h"
 #include "virtio-net.h"
+#ifdef USE_KVM
 #include "qemu-kvm.h"
+#endif
+
+#define TAP_VNET_HDR
 
 typedef struct VirtIONet
 {
@@ -49,9 +53,10 @@ static void virtio_net_update_config(VirtIODevice *vdev, uint8_t *config)
 
 static uint32_t virtio_net_get_features(VirtIODevice *vdev)
 {
+    uint32_t features = (1 << VIRTIO_NET_F_MAC);
+#ifdef TAP_VNET_HDR
     VirtIONet *n = to_virtio_net(vdev);
     VLANClientState *host = n->vc->vlan->first_client;
-    uint32_t features = (1 << VIRTIO_NET_F_MAC);
 
     if (tap_has_vnet_hdr(host)) {
         tap_using_vnet_hdr(host, 1);
@@ -66,6 +71,7 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev)
         features |= (1 << VIRTIO_NET_F_MRG_RXBUF);
         /* Kernel can't actually handle UFO in software currently. */
     }
+#endif
 
     return features;
 }
@@ -73,10 +79,13 @@ static uint32_t virtio_net_get_features(VirtIODevice *vdev)
 static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIONet *n = to_virtio_net(vdev);
+#ifdef TAP_VNET_HDR
     VLANClientState *host = n->vc->vlan->first_client;
+#endif
 
     n->mergeable_rx_bufs = !!(features & (1 << VIRTIO_NET_F_MRG_RXBUF));
 
+#ifdef TAP_VNET_HDR
     if (!tap_has_vnet_hdr(host) || !host->set_offload)
         return;
 
@@ -85,29 +94,30 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
                       (features >> VIRTIO_NET_F_GUEST_TSO4) & 1,
                       (features >> VIRTIO_NET_F_GUEST_TSO6) & 1,
                       (features >> VIRTIO_NET_F_GUEST_ECN)  & 1);
+#endif
 }
 
 /* RX */
 
 static void virtio_net_handle_rx(VirtIODevice *vdev, VirtQueue *vq)
 {
+#ifdef USE_KVM
     /* We now have RX buffers, signal to the IO thread to break out of the
        select to re-poll the tap file descriptor */
     if (kvm_enabled())
         qemu_kvm_notify_work();
+#endif
 }
 
-static int virtio_net_can_receive(void *opaque)
+static int do_virtio_net_can_receive(VirtIONet *n, int bufsize)
 {
-    VirtIONet *n = opaque;
-
     if (!virtio_queue_ready(n->rx_vq) ||
         !(n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK))
         return 0;
 
     if (virtio_queue_empty(n->rx_vq) ||
         (n->mergeable_rx_bufs &&
-         !virtqueue_avail_bytes(n->rx_vq, VIRTIO_NET_MAX_BUFSIZE, 0))) {
+         !virtqueue_avail_bytes(n->rx_vq, bufsize, 0))) {
         virtio_queue_set_notification(n->rx_vq, 1);
         return 0;
     }
@@ -116,6 +126,14 @@ static int virtio_net_can_receive(void *opaque)
     return 1;
 }
 
+static int virtio_net_can_receive(void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    return do_virtio_net_can_receive(n, VIRTIO_NET_MAX_BUFSIZE);
+}
+
+#ifdef TAP_VNET_HDR
 /* dhclient uses AF_PACKET but doesn't pass auxdata to the kernel so
  * it never finds out that the packets don't have valid checksums.  This
  * causes dhclient to get upset.  Fedora's carried a patch for ages to
@@ -143,6 +161,7 @@ static void work_around_broken_dhclient(struct virtio_net_hdr *hdr,
         hdr->flags &= ~VIRTIO_NET_HDR_F_NEEDS_CSUM;
     }
 }
+#endif
 
 static int iov_fill(struct iovec *iov, int iovcnt, const void *buf, int count)
 {
@@ -168,11 +187,13 @@ static int receive_header(VirtIONet *n, struct iovec *iov, int iovcnt,
     hdr->flags = 0;
     hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
 
+#ifdef TAP_VNET_HDR
     if (tap_has_vnet_hdr(n->vc->vlan->first_client)) {
         memcpy(hdr, buf, sizeof(*hdr));
         offset = sizeof(*hdr);
         work_around_broken_dhclient(hdr, buf + offset, size - offset);
     }
+#endif
 
     /* We only ever receive a struct virtio_net_hdr from the tapfd,
      * but we may be passing along a larger header to the guest.
@@ -189,6 +210,9 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
     struct virtio_net_hdr_mrg_rxbuf *mhdr = NULL;
     int hdr_len, offset, i;
 
+    if (!do_virtio_net_can_receive(n, size))
+        return;
+
     /* hdr_len refers to the header we supply to the guest */
     hdr_len = n->mergeable_rx_bufs ?
         sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
@@ -253,7 +277,11 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
 static void virtio_net_flush_tx(VirtIONet *n, 

how to compile kvm 64 bit

2008-12-04 Thread paolo pedaletti
Ciao,
sorry for this silly question, but I can't compile kvm-79 for AMD 64bit

Linux 2.6.27-9-server, ubuntu 8.10
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 5600+


[EMAIL PROTECTED]:/usr/src/kvm-79# ./configure
Install prefix    /usr/local
BIOS directory    /usr/local/share/qemu
binary directory  /usr/local/bin
Manual directory  /usr/local/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /usr/src/kvm-79/qemu
C compiler        gcc
Host C compiler   gcc
ARCH_CFLAGS       -m64
make              make
install           install
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu
gprof enabled     no
sparse enabled    no
profiler          no
static build      no
-Werror enabled   no
SDL support       yes
SDL static link   yes
curses support    yes
mingw32 support   no
Audio drivers     oss
Extra audio cards
Mixer emulation   no
VNC TLS support   no
kqemu support     no
kvm support       yes
CPU emulation     yes
brlapi support    no
Documentation     no
NPTL support      yes
vde support       yes
AIO support       yes
KVM support       yes

[EMAIL PROTECTED]:/usr/src/kvm-79# make
make -C libkvm
make[1]: Entering directory `/usr/src/kvm-79/libkvm'
gcc -march=i686 -mcpu=i686 -malign-functions=4 -O2 -m64
-D__x86_64__ -MMD -MF ./.libkvm-x86.d -g -fomit-frame-pointer -Wall
-fno-stack-protector   -I /usr/src/kvm-79/kernel/include   -c -o
libkvm-x86.o libkvm-x86.c
`-mcpu=' is deprecated. Use `-mtune=' or '-march=' instead.
libkvm-x86.c:1: error: CPU you selected does not support x86-64
instruction set
libkvm-x86.c:1: warning: -malign-functions is obsolete, use
-falign-functions
make[1]: *** [libkvm-x86.o] Error 1
make[1]: Leaving directory `/usr/src/kvm-79/libkvm'
make: *** [libkvm] Error 2

It uses i686.

Which command-line parameter should I use?

thank you.


-- 
Paolo Pedaletti



Re: lsi_scsi: error: Bad Status move errors with kvm-79

2008-12-04 Thread Ryan Harper
* Anssi Kolehmainen [EMAIL PROTECTED] [2008-12-04 08:51]:
> Hi,
>
> I have kvm environment with linux-2.6.28-rc7 x86_64 (Xeon), kvm-79 host
> and bunch of Win2K3 guests. Sometimes I get 'lsi_scsi: error: Bad Status
> move' from kvm (qemu) and in Windows event log "The device,
> \Device\Scsi\sym_hi1, did not respond within the timeout period." These
> errors come somewhat at random, usually with 10-30 second intervals when
> there is enough disk usage in the guest (seems that installing Bea
> Weblogic or Oracle database is pretty nice for causing these errors).
>
> Usually windows is able to recover from these but sometimes (=too often)
> I get random delays and hangups. Also I have gotten BSOD 0x77 (0x02,
> 0x00, 0x00, 0x5f4000) about once a day.
>
> Any ideas how to debug / fix this problem?

Current KVM userspace has a bogus line in the scsi code relating to the
DBC register, which looks like what is tripping up the Bad Status, or
could be anyhow.  Try it out with this patch applied to your qemu dir:

http://lists.gnu.org/archive/html/qemu-devel/2008-12/msg00043.html

You can also try older KVM releases, kvm 76 at least doesn't have that
line present.  That might be easier than applying the patch.

You can also enable debugging in qemu/hw/lsi53c895a.c and in
qemu/hw/scsi-disk.c   Sending that output here would be helpful if we're
still tracking it.

If you can recreate with the patch applied or on an older KVM that
doesn't have that line in there, I'll try to reproduce.  Are there
free/downloadable copies of Bea or Oracle that I can use to recreate?


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
[EMAIL PROTECTED]


Re: [BUG] kvm crashes in 2.6.28-rc6-00007-ged31348

2008-12-04 Thread Avi Kivity

Avi Kivity wrote:
> Steven Rostedt wrote:
>> The following must be available without recursion for the function
>> tracer to work:
>>
>>   local_irq_save/restore
>>   smp_processor_id
>>   preempt_enable/disable_notrace
>>   atomic_inc/dec
>
> In arch/x86/kvm/svm.c, function svm_vcpu_run(), everything between the
> vmrun instruction and the call to load_host_msrs() is executed without
> a live pda, so no smp_processor_id().  Could easily be fixed by
> rearranging things.

Luis, please try the attached patch.


--
error compiling committee.c: too many arguments to function

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1452851..c10857d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -920,13 +920,6 @@ static int svm_get_irq(struct kvm_vcpu *vcpu)
return -1;
 }
 
-static void load_host_msrs(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_X86_64
-   wrmsrl(MSR_GS_BASE, to_svm(vcpu)->host_gs_base);
-#endif
-}
-
 static void save_host_msrs(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
@@ -1798,10 +1791,26 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
"mov %%r14, %c[r14](%[svm]) \n\t"
"mov %%r15, %c[r15](%[svm]) \n\t"
 #endif
-   "pop %%"R"bp"
+   "pop %%"R"bp \n\t"
+   /* Reload PDA early so ftrace can work */
+   "mov %[fs], %%fs \n\t"
+   "mov %[gs], %%gs \n\t"
+#ifdef CONFIG_X86_64
+   "mov %c[gsbase](%[svm]), %%edi \n\t"
+   "mov %c[gsbase]+4(%[svm]), %%edx \n\t"
+   "mov %[msr_gs_base], %%ecx \n\t"
+   "xchg %%rax, %%rdi \n\t"
+   "wrmsr \n\t"
+   "xchg %%rax, %%rdi \n\t"
+#endif
:
: [svm]"a"(svm),
  [vmcb]"i"(offsetof(struct vcpu_svm, vmcb_pa)),
+ [fs]"g"(fs_selector), [gs]"g"(gs_selector),
+#ifdef CONFIG_X86_64
+ [gsbase]"i"(offsetof(struct vcpu_svm, host_gs_base)),
+ [msr_gs_base]"i"(MSR_GS_BASE),
+#endif
  [rbx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RBX])),
  [rcx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RCX])),
  [rdx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RDX])),
@@ -1837,10 +1846,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
write_dr7(svm->host_dr7);
kvm_write_cr2(svm->host_cr2);
 
-   kvm_load_fs(fs_selector);
-   kvm_load_gs(gs_selector);
kvm_load_ldt(ldt_selector);
-   load_host_msrs(vcpu);
 
reload_tss(vcpu);
 


RE: Virtio network performance problem

2008-12-04 Thread Adrian Schmitz
  On Wed, Dec 03, 2008 at 11:20:08AM -0800, Chris Wedgwood wrote:
 
   TSC instability?  Is this an SMP guest?

Ok, I tried pinning the kvm process to two cores (0,2) on a single
socket, but that didn't seem to make any difference for my virtio
network performance. I also tried pinning the process to a single core,
which also didn't seem to have any effect.

Someone on IRC suggested that it sounded like a clocking issue, since
some of my ping times are negative. He suggested trying a different
clock source. I tried it with dynticks, rtc, and unix. None of them seem
better, although all of them seem different in terms of patterns in
the ping times. Sorry if this makes it a long post, but I don't know how
to describe it other than to paste an example (below). Not sure if this
indicates that it is clock-related or if it is meaningless.

In any event, I'm not sure where to go from here. Another suggestion
from IRC was that it was due to the age of my host kernel (2.6.18) and
the fact that it doesn't support high-res timers. If I can avoid
replacing the distro kernel, I'd like to, but I'll do what I have to, I
suppose.

With dynticks (these are all with -net user, as I had some trouble with
my tap interface last night while testing this. The results are roughly
the same as when I was using tap before, though):

Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-139ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-141ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-133ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255

With rtc:

Reply from 10.0.2.2: bytes=32 time=-224ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-223ms TTL=255
Reply from 10.0.2.2: bytes=32 time=4ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-223ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-224ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
Reply from 10.0.2.2: bytes=32 time=225ms TTL=255

With unix:

Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-190ms TTL=255
Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
Reply from 10.0.2.2: bytes=32 time=192ms TTL=255


Re: how to compile kvm 64 bit

2008-12-04 Thread Charles Duffy

paolo pedaletti wrote:

Ciao,
sorry for this silly question, but I can't compile kvm-79 for AMD 64bit


Please pardon me for following up on one silly question with another -- 
but is your host running a 64-bit userland? (What is the output of 
uname -m?)




Fwd: how to compile kvm 64 bit

2008-12-04 Thread Malinka Rellikwodahs
Your -mcpu=i686 flag specifically tells gcc to compile for a 32-bit CPU. You
need to use something like

-march=k8 or -mtune=k8 instead

http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html

The difference between -march and -mtune is whether the result breaks
compatibility with other CPUs (-mtune doesn't).

On Thu, Dec 4, 2008 at 10:24, paolo pedaletti [EMAIL PROTECTED] wrote:

 Ciao,
 sorry for this silly question, but I can't compile kvm-79 for AMD 64bit

 Linux 2.6.27-9-server, ubuntu 8.10
 model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 5600+


 [EMAIL PROTECTED]:/usr/src/kvm-79# ./configure
 Install prefix/usr/local
 BIOS directory/usr/local/share/qemu
 binary directory  /usr/local/bin
 Manual directory  /usr/local/share/man
 ELF interp prefix /usr/gnemul/qemu-%M
 Source path   /usr/src/kvm-79/qemu
 C compilergcc
 Host C compiler   gcc
 ARCH_CFLAGS   -m64
 make  make
 install   install
 host CPU  x86_64
 host big endian   no
 target list   x86_64-softmmu
 gprof enabled no
 sparse enabledno
 profiler  no
 static build  no
 -Werror enabled   no
 SDL support   yes
 SDL static link   yes
 curses supportyes
 mingw32 support   no
 Audio drivers oss
 Extra audio cards
 Mixer emulation   no
 VNC TLS support   no
 kqemu support no
 kvm support   yes
 CPU emulation yes
 brlapi supportno
 Documentation no
 NPTL support  yes
 vde support   yes
 AIO support   yes
 KVM support   yes

 [EMAIL PROTECTED]:/usr/src/kvm-79# make
 make -C libkvm
 make[1]: Entering directory `/usr/src/kvm-79/libkvm'
 gcc -march=i686-mcpu=i686-malign-functions=4 -O2 -m64
 -D__x86_64__ -MMD -MF ./.libkvm-x86.d -g -fomit-frame-pointer -Wall
 -fno-stack-protector   -I /usr/src/kvm-79/kernel/include   -c -o
 libkvm-x86.o libkvm-x86.c
 `-mcpu=' is deprecated. Use `-mtune=' or '-march=' instead.
 libkvm-x86.c:1: error: CPU you selected does not support x86-64
 instruction set
 libkvm-x86.c:1: warning: -malign-functions is obsolete, use
 -falign-functions
 make[1]: *** [libkvm-x86.o] Error 1
 make[1]: Leaving directory `/usr/src/kvm-79/libkvm'
 make: *** [libkvm] Error 2

 it use i686

 which command line parameter should I use?

 thank you.


 --
 Paolo Pedaletti



Re: [PATCH 01/13] iommu bitmap insteads of iommu pointer in dmar_domain

2008-12-04 Thread Mark McLoughlin
Hi Weidong,

On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:

 Support dmar_domain own multiple devices from different iommus, which
 are set in iommu bitmap. add function domain_get_iommu() to get the
 only one iommu of domain in native VT-d usage.

A bitmap seems quite awkward. Why not a list?

Also, I wasn't sure at first what you meant by "native VT-d" ... you
mean DMA-API VT-d usage as opposed to KVM device assignment usage,
right? Perhaps we need a better term for that distinction.

 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/intel-iommu.c |  102 
  include/linux/dma_remapping.h |2 +-
  2 files changed, 72 insertions(+), 32 deletions(-)
 
 diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
 index 5c8baa4..39c5e9d 100644
 --- a/drivers/pci/intel-iommu.c
 +++ b/drivers/pci/intel-iommu.c

 @@ -184,6 +185,21 @@ void free_iova_mem(struct iova *iova)
   kmem_cache_free(iommu_iova_cache, iova);
  }
  
 +/* in native case, each domain is related to only one iommu */
 +static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
 +{
 + struct dmar_drhd_unit *drhd;
 +
 + for_each_drhd_unit(drhd) {
 + if (drhd->ignored)
 + continue;
 + if (test_bit(drhd->iommu->seq_id, &domain->iommu_bmp))
 + return drhd->iommu;
 + }
 +
 + return NULL;
 +}

So, basically, a lot of the code assumes that there is only one iommu
associated with a domain. That makes it seem like the abstractions here
could do with some re-working.

We should at least add:

  ASSERT(!(domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE));

in the patch which adds that flag.

 @@ -1925,16 +1952,19 @@ static void add_unmap(struct dmar_domain *dom, struct 
 iova *iova)
  {
   unsigned long flags;
   int next, iommu_id;
 + struct intel_iommu *iommu;
  
   spin_lock_irqsave(&async_umap_flush_lock, flags);
   if (list_size == HIGH_WATER_MARK)
   flush_unmaps();
  
 - iommu_id = dom->iommu->seq_id;
 + iommu = domain_get_iommu(dom);
 + iommu_id = iommu->seq_id;
  
   next = deferred_flush[iommu_id].next;
   deferred_flush[iommu_id].domain[next] = dom;
   deferred_flush[iommu_id].iova[next] = iova;
 + deferred_flush[iommu_id].iommu = iommu;
   deferred_flush[iommu_id].next++;

This deferred_flush->iommu change should be in its own patch, IMHO.

Also, it's not quite right - there is a fixed mapping between iommu_id
and the iommu, so it makes no sense to update that mapping each time we
add a new iova.

In fact, it makes me wonder why we don't have the flush list in the
struct intel_iommu and have a global list of iommus.
 
Cheers,
Mark.
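
For illustration, the bitmap lookup in domain_get_iommu() and the
assertion suggested above can be modelled in user space. This is only a
sketch: the struct layouts, the flag value and the array of units are
assumptions made for the example, and assert() stands in for the
proposed ASSERT().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value; the real one is defined by a later patch. */
#define DOMAIN_FLAG_VIRTUAL_MACHINE (1u << 1)

struct iommu_unit {
	int seq_id;	/* bit position of this iommu in iommu_bmp */
	int ignored;	/* non-zero: skip this unit, like drhd->ignored */
};

struct domain {
	unsigned int flags;
	unsigned long iommu_bmp;	/* one bit per attached iommu */
};

/* Return the single iommu of a native domain, or NULL if none is set. */
static struct iommu_unit *domain_get_iommu(const struct domain *d,
					   struct iommu_unit *units, size_t n)
{
	size_t i;

	/* A VM domain may span several iommus, so "the" iommu is undefined. */
	assert(!(d->flags & DOMAIN_FLAG_VIRTUAL_MACHINE));

	for (i = 0; i < n; i++) {
		if (units[i].ignored)
			continue;
		if (d->iommu_bmp & (1UL << units[i].seq_id))
			return &units[i];
	}
	return NULL;
}
```

With the assertion in place, a caller that passes a virtual machine
domain fails loudly instead of silently getting whichever iommu happens
to come first in the bitmap.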



Re: [PATCH 02/13] move page table handling utility functions

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:

 move page table handling utility functions from intel-iommu.c to
 dma_remapping.h, because some of them will be used in other .c files.

You need to rebase your patches against dwmw2's tree, where some cleanup
patches of mine moved a bunch of stuff from dma_remapping.h to
intel-iommu.c.

And preferably, this stuff could stay internal to intel-iommu.c.

Cheers,
Mark.



Re: [PATCH 03/13] set iommu agaw

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
 agaw may be different across iommus.
 
 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/dmar.c|   14 ++
  include/linux/dma_remapping.h |2 ++
  include/linux/intel-iommu.h   |1 +
  3 files changed, 17 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
 index 691b3ad..ebcc7c2 100644
 --- a/drivers/pci/dmar.c
 +++ b/drivers/pci/dmar.c
 @@ -491,6 +491,8 @@ int alloc_iommu(struct dmar_drhd_unit *drhd)
   int map_size;
   u32 ver;
   static int iommu_allocated = 0;
 + unsigned long sagaw;
 + int agaw;
  
   iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
   if (!iommu)
 @@ -506,6 +508,18 @@ int alloc_iommu(struct dmar_drhd_unit *drhd)
   iommu->cap = dmar_readq(iommu->reg + DMAR_CAP_REG);
   iommu->ecap = dmar_readq(iommu->reg + DMAR_ECAP_REG);
  
 + /* set agaw, SAGAW may be different across iommus */
 + sagaw = cap_sagaw(iommu->cap);
 + for (agaw = width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH);
 +  agaw >= 0; agaw--)
 + if (test_bit(agaw, &sagaw))
 + break;
 + if (agaw < 0) {
 + printk(KERN_ERR "IOMMU: unsupported sagaw %lx\n", sagaw);
 + goto error;
 + }
 + iommu->agaw = agaw;

Could we add something like intel_iommu_calculate_agaw() and keep the
agaw code internal to intel-iommu.c?

Also, "unsupported sagaw" expands to "unsupported supported adjusted
guest address width", which doesn't make much sense :-)

"unsupported address width" would be sufficient, I think.

Cheers,
Mark.
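
A helper along the lines of the suggested intel_iommu_calculate_agaw()
might look like the user-space sketch below. The function name
calculate_agaw() and the constants (a 9-bit stride per page-table level,
30-bit base width, 48-bit default) are assumptions for the example,
modelled on the loop in the quoted hunk rather than taken from the
driver.

```c
#include <assert.h>

#define LEVEL_STRIDE			9	/* address bits per page-table level (assumed) */
#define DEFAULT_DOMAIN_ADDRESS_WIDTH	48	/* assumed default guest address width */

static int width_to_agaw(int width)
{
	return (width - 30) / LEVEL_STRIDE;
}

static int agaw_to_width(int agaw)
{
	return 30 + agaw * LEVEL_STRIDE;
}

/*
 * Pick the largest supported agaw that does not exceed the default
 * width. 'sagaw' is the bitmask of supported values, as in the quoted
 * loop. Returns -1 when no usable width is advertised.
 */
static int calculate_agaw(unsigned long sagaw)
{
	int agaw;

	for (agaw = width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH);
	     agaw >= 0; agaw--)
		if (sagaw & (1UL << agaw))
			break;

	/* Whatever was found can never be wider than the default. */
	assert(agaw < 0 || agaw_to_width(agaw) <= DEFAULT_DOMAIN_ADDRESS_WIDTH);
	return agaw;
}
```

The caller would then only need to check for a negative return value and
print the "unsupported address width" message.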



Re: [PATCH 04/13] iommu coherency

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:

 in dmar_domain, more than one iommus may be included in iommu_bmp. Due
 to Coherency capability may be different across iommus, set this
 variable to indicate iommu access is coherent or not. Only when all
 related iommus in a dmar_domain are all coherent, iommu access of this
 domain is coherent.
 
 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/intel-iommu.c |6 ++
  include/linux/dma_remapping.h |2 ++
  2 files changed, 8 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
 index a18e0b4..fa1507b 100644
 --- a/drivers/pci/intel-iommu.c
 +++ b/drivers/pci/intel-iommu.c
 @@ -982,6 +982,12 @@ static struct dmar_domain * iommu_alloc_domain(struct 
 intel_iommu *iommu)
   domain->id = num;
   memset(&domain->iommu_bmp, 0, sizeof(unsigned long));
   set_bit(iommu->seq_id, &domain->iommu_bmp);
 +
 + if (ecap_coherent(iommu->ecap))
 + domain->iommu_coherency = 1;
 + else
 + domain->iommu_coherency = 0;

If you allocate a non-coherent iommu, followed by a coherent iommu, then
iommu_coherency ends up as 1.

In patch 6/13 you add domain_update_iommu_coherency(). It would make
more sense to add that function in this patch and use it here.

Cheers,
Mark.
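
The rule stated in the commit message (a domain is coherent only when
every iommu in its bitmap is coherent) can be sketched in user space as
follows. The struct layouts and the array of units are assumptions for
the example; only the function name is borrowed from patch 6/13.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

struct iommu_unit {
	int seq_id;	/* bit position of this iommu in iommu_bmp */
	bool coherent;	/* stand-in for ecap_coherent(iommu->ecap) */
};

struct domain {
	unsigned long iommu_bmp;
	bool iommu_coherency;
};

/* Recompute the flag from scratch over all iommus in the domain. */
static void domain_update_iommu_coherency(struct domain *d,
					  const struct iommu_unit *units,
					  size_t n)
{
	size_t i;

	d->iommu_coherency = true;
	for (i = 0; i < n; i++) {
		assert(units[i].seq_id >= 0 &&
		       (size_t)units[i].seq_id < sizeof(d->iommu_bmp) * CHAR_BIT);
		if (!(d->iommu_bmp & (1UL << units[i].seq_id)))
			continue;
		if (!units[i].coherent) {
			/* One non-coherent iommu makes the whole domain non-coherent. */
			d->iommu_coherency = false;
			break;
		}
	}
}
```

Because the flag is always recomputed over the full set, the result does
not depend on the order in which iommus were added to the domain.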



Re: [PATCH 05/13] add domain flag DOMAIN_FLAG_VIRTUAL_MACHINE

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
 By default, one domain owns one device, like native VT-d usage.
 
 For kvm VT-d usage, more than one devices across iommus may be
 assigned to one domain, flag DOMAIN_FLAG_VIRTUAL_MACHINE is for this
 usage.
 
 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/intel-iommu.c |3 ++-
  include/linux/dma_remapping.h |   11 ++-
  2 files changed, 12 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
 index fa1507b..09a5150 100644
 --- a/drivers/pci/intel-iommu.c
 +++ b/drivers/pci/intel-iommu.c
 @@ -989,6 +989,7 @@ static struct dmar_domain * iommu_alloc_domain(struct 
 intel_iommu *iommu)
   domain->iommu_coherency = 0;
  
   iommu->domains[num] = domain;
 + domain->flags = 0;
   spin_unlock_irqrestore(&iommu->lock, flags);

This looks like a bugfix. Does it need to be fixed in 2.6.28?

  
   return domain;
 @@ -1387,7 +1388,7 @@ static struct dmar_domain *get_domain_for_dev(struct 
 pci_dev *pdev, int gaw)
   info->dev = NULL;
   info->domain = domain;
   /* This domain is shared by devices under p2p bridge */
 - domain->flags |= DOMAIN_FLAG_MULTIPLE_DEVICES;
 + domain->flags |= DOMAIN_FLAG_P2P_MULTIPLE_DEVICES;

Renaming this flag should probably be a separate patch.

Cheers,
Mark.



Re: [PATCH 07/13] add domain_flush_cache

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:

 For some common low level functions which will be also used by virtual
 machine usage, use domain_flush_cache instead of __iommu_flush_cache.
 
 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/intel-iommu.c |   40 
  1 files changed, 24 insertions(+), 16 deletions(-)
 
 diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
 index 429aff4..b00a8f2 100644
 --- a/drivers/pci/intel-iommu.c
 +++ b/drivers/pci/intel-iommu.c
 @@ -200,6 +200,13 @@ static struct intel_iommu *domain_get_iommu(struct 
 dmar_domain *domain)
   return NULL;
  }
  
 +static void domain_flush_cache(struct dmar_domain *domain,
 +void *addr, int size)
 +{
 + if (!domain->iommu_coherency)
 + clflush_cache_range(addr, size);
 +}

This is quite unfortunate; __iommu_flush_cache() is essentially
identical:

static inline void __iommu_flush_cache(
struct intel_iommu *iommu, void *addr, int size)
{
if (!ecap_coherent(iommu->ecap))
clflush_cache_range(addr, size);
}

Is there no way we can use a single function for both purposes?

Cheers,
Mark.
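
One possible way to share the logic, sketched in user space: a single
helper takes the coherency as a plain boolean, and both entry points
become one-line wrappers. The helper name, the struct layouts and the
byte-counting stub that replaces clflush_cache_range() are assumptions
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for clflush_cache_range(): only counts the bytes "flushed". */
static size_t flushed_bytes;

static void flush_range_stub(void *addr, size_t size)
{
	(void)addr;
	flushed_bytes += size;
}

/* The single shared helper: flush only when the hardware is not coherent. */
static void flush_cache_unless_coherent(bool coherent, void *addr, size_t size)
{
	assert(addr != NULL);
	if (!coherent)
		flush_range_stub(addr, size);
}

struct iommu_unit {
	bool coherent;		/* stand-in for ecap_coherent(iommu->ecap) */
};

struct domain {
	bool iommu_coherency;
};

static void iommu_flush_cache(struct iommu_unit *iommu, void *addr, size_t size)
{
	flush_cache_unless_coherent(iommu->coherent, addr, size);
}

static void domain_flush_cache(struct domain *domain, void *addr, size_t size)
{
	flush_cache_unless_coherent(domain->iommu_coherency, addr, size);
}
```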



Re: [PATCH 08/13] allocation and free functions of virtual machine domain

2008-12-04 Thread Mark McLoughlin
On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
 Signed-off-by: Weidong Han [EMAIL PROTECTED]
 ---
  drivers/pci/intel-iommu.c |  104 
 -
  1 files changed, 103 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
 index b00a8f2..e96b3bc 100644
 --- a/drivers/pci/intel-iommu.c
 +++ b/drivers/pci/intel-iommu.c
 @@ -947,6 +947,7 @@ static int iommu_init_domains(struct intel_iommu *iommu)
  
 
  static void domain_exit(struct dmar_domain *domain);
 +static void vm_domain_exit(struct dmar_domain *domain);
  
  void free_dmar_iommu(struct intel_iommu *iommu)
  {
 @@ -957,8 +958,13 @@ void free_dmar_iommu(struct intel_iommu *iommu)
   for (; i < cap_ndoms(iommu->cap); ) {
   domain = iommu->domains[i];
   clear_bit(i, iommu->domain_ids);
 - if (--domain->iommu_count == 0)
 +
 + if (domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE) {
 + if (--domain->iommu_count == 0)
 + vm_domain_exit(domain);
 + } else
   domain_exit(domain);
 +

Again, these new functions are copies of existing code with minor
modifications. I'd much rather see the existing code refactored and then
modified to handle the DOMAIN_FLAG_VIRTUAL_MACHINE case.

Cheers,
Mark.



[PATCHSETS] KVM device passthrough support with AMD IOMMU

2008-12-04 Thread Joerg Roedel
Hi,

the two patchsets posted as replies to this email implement KVM device
passthrough support for AMD IOMMU hardware.

The first patchset is version 3 of the generic iommu api patchset, which
generalizes the VT-d functions exported to KVM into a common api that
the AMD IOMMU code can plug into.

The second patchset finally implements the KVM device passthrough
support in the AMD IOMMU code. Together with KVM-79 I successfully
passed a 10GBit network card into a KVM guest.

These two patchsets apply in order on top of the latest post of
Han Weidong's "Multiple device assignment support" patches. Anybody who
wants to try this out can pull the whole stuff from

git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu.git kvm-amd-iommu

Please give these patches a good review.

Thanks,

Joerg

-- 
   |   AMD Saxony Limited Liability Company  Co. KG
 Operating | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System|  Register Court Dresden: HRA 4896
 Research  |  General Partner authorized to represent:
 Center| AMD Saxony LLC (Wilmington, Delaware, US)
   | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



[PATCH 01/11] KVM: rename vtd.c to iommu.c

2008-12-04 Thread Joerg Roedel
Impact: file renamed

The code in the vtd.c file can be reused for other IOMMUs as well. So
rename it to make it clear that it handles more than VT-d.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/ia64/kvm/Makefile  |2 +-
 arch/x86/kvm/Makefile   |2 +-
 virt/kvm/{vtd.c => iommu.c} |0
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename virt/kvm/{vtd.c => iommu.c} (100%)

diff --git a/arch/ia64/kvm/Makefile b/arch/ia64/kvm/Makefile
index 76464dc..cb69dfc 100644
--- a/arch/ia64/kvm/Makefile
+++ b/arch/ia64/kvm/Makefile
@@ -52,7 +52,7 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o 
ioapic.o \
coalesced_mmio.o irq_comm.o)
 
 ifeq ($(CONFIG_DMAR),y)
-common-objs += $(addprefix ../../../virt/kvm/, vtd.o)
+common-objs += $(addprefix ../../../virt/kvm/, iommu.o)
 endif
 
 kvm-objs := $(common-objs) kvm-ia64.o kvm_fw.o
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index c023435..00f46c2 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -8,7 +8,7 @@ ifeq ($(CONFIG_KVM_TRACE),y)
 common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
 endif
 ifeq ($(CONFIG_DMAR),y)
-common-objs += $(addprefix ../../../virt/kvm/, vtd.o)
+common-objs += $(addprefix ../../../virt/kvm/, iommu.o)
 endif
 
 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
diff --git a/virt/kvm/vtd.c b/virt/kvm/iommu.c
similarity index 100%
rename from virt/kvm/vtd.c
rename to virt/kvm/iommu.c
-- 
1.5.6.4




[PATCH 02/11] introcude linux/iommu.h for an iommu api

2008-12-04 Thread Joerg Roedel
This patch introduces the API to abstract the exported VT-d functions
for KVM into a generic API. This way the AMD IOMMU implementation can
plug into this API later.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 include/linux/iommu.h |  112 +
 1 files changed, 112 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/iommu.h

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
new file mode 100644
index 000..8a7bfb1
--- /dev/null
+++ b/include/linux/iommu.h
@@ -0,0 +1,112 @@
+/*
+ * Copyright (C) 2007-2008 Advanced Micro Devices, Inc.
+ * Author: Joerg Roedel [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ */
+
+#ifndef __LINUX_IOMMU_H
+#define __LINUX_IOMMU_H
+
+#define IOMMU_READ (1)
+#define IOMMU_WRITE (2)
+
+struct device;
+
+struct iommu_domain {
+   void *priv;
+};
+
+struct iommu_ops {
+   int (*domain_init)(struct iommu_domain *domain);
+   void (*domain_destroy)(struct iommu_domain *domain);
+   int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
+   void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+   int (*map)(struct iommu_domain *domain, unsigned long iova,
+  phys_addr_t paddr, size_t size, int prot);
+   void (*unmap)(struct iommu_domain *domain, unsigned long iova,
+ size_t size);
+   phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
+   unsigned long iova);
+};
+
+#ifdef CONFIG_IOMMU_API
+
+extern void register_iommu(struct iommu_ops *ops);
+extern bool iommu_found(void);
+extern struct iommu_domain *iommu_domain_alloc(void);
+extern void iommu_domain_free(struct iommu_domain *domain);
+extern int iommu_attach_device(struct iommu_domain *domain,
+  struct device *dev);
+extern void iommu_detach_device(struct iommu_domain *domain,
+   struct device *dev);
+extern int iommu_map_range(struct iommu_domain *domain, unsigned long iova,
+  phys_addr_t paddr, size_t size, int prot);
+extern void iommu_unmap_range(struct iommu_domain *domain, unsigned long iova,
+ size_t size);
+extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain,
+ unsigned long iova);
+
+#else /* CONFIG_IOMMU_API */
+
+static inline void register_iommu(struct iommu_ops *ops)
+{
+}
+
+static inline bool iommu_found(void)
+{
+   return false;
+}
+
+static inline struct iommu_domain *iommu_domain_alloc(void)
+{
+   return NULL;
+}
+
+static inline void iommu_domain_free(struct iommu_domain *domain)
+{
+}
+
+static inline int iommu_attach_device(struct iommu_domain *domain,
+ struct device *dev)
+{
+   return -ENODEV;
+}
+
+static inline void iommu_detach_device(struct iommu_domain *domain,
+  struct device *dev)
+{
+}
+
+static inline int iommu_map_range(struct iommu_domain *domain,
+ unsigned long iova, phys_addr_t paddr,
+ size_t size, int prot)
+{
+   return -ENODEV;
+}
+
+static inline void iommu_unmap_range(struct iommu_domain *domain,
+unsigned long iova, size_t size)
+{
+}
+
+static inline phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain,
+unsigned long iova)
+{
+   return 0;
+}
+
+#endif /* CONFIG_IOMMU_API */
+
+#endif /* __LINUX_IOMMU_H */
-- 
1.5.6.4
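
To illustrate how a backend might plug into this API, here is a minimal
user-space sketch of the dispatch layer behind register_iommu(),
iommu_found() and iommu_domain_alloc()/iommu_domain_free(). It is not
the implementation from this series: the single global ops pointer, the
use of calloc()/free() and assert(), and the demo_* backend are
assumptions made for the example, and the ops table is trimmed to the
two callbacks used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct iommu_domain {
	void *priv;	/* backend-private data, e.g. a dmar_domain */
};

struct iommu_ops {
	int (*domain_init)(struct iommu_domain *domain);
	void (*domain_destroy)(struct iommu_domain *domain);
};

/* Assumed: exactly one backend registers itself. */
static struct iommu_ops *iommu_ops;

static void register_iommu(struct iommu_ops *ops)
{
	assert(iommu_ops == NULL);	/* a second registration is a bug */
	iommu_ops = ops;
}

static bool iommu_found(void)
{
	return iommu_ops != NULL;
}

static struct iommu_domain *iommu_domain_alloc(void)
{
	struct iommu_domain *domain;

	if (!iommu_found())
		return NULL;

	domain = calloc(1, sizeof(*domain));
	if (!domain)
		return NULL;

	/* Let the backend attach its private state. */
	if (iommu_ops->domain_init(domain) != 0) {
		free(domain);
		return NULL;
	}
	return domain;
}

static void iommu_domain_free(struct iommu_domain *domain)
{
	iommu_ops->domain_destroy(domain);
	free(domain);
}

/* Hypothetical backend, standing in for the VT-d or AMD IOMMU driver. */
static int demo_state;

static int demo_domain_init(struct iommu_domain *domain)
{
	domain->priv = &demo_state;
	return 0;
}

static void demo_domain_destroy(struct iommu_domain *domain)
{
	domain->priv = NULL;
}

static struct iommu_ops demo_ops = {
	.domain_init	= demo_domain_init,
	.domain_destroy	= demo_domain_destroy,
};
```

A caller such as KVM would only see struct iommu_domain and the generic
functions; the backend reaches its own data through domain->priv.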




[PATCH 07/11] VT-d: adapt device attach and detach functions for IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c   |   27 +++
 include/linux/intel-iommu.h |4 
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index c939039..27dc896 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2695,9 +2695,11 @@ static void intel_iommu_domain_destroy(struct 
iommu_domain *domain)
vm_domain_exit(dmar_domain);
 }
 
-int intel_iommu_attach_device(struct dmar_domain *domain,
- struct pci_dev *pdev)
+static int intel_iommu_attach_device(struct iommu_domain *domain,
+struct device *dev)
 {
+   struct dmar_domain *dmar_domain = domain->priv;
+   struct pci_dev *pdev = to_pci_dev(dev);
struct intel_iommu *iommu;
int addr_width;
u64 end;
@@ -2709,7 +2711,7 @@ int intel_iommu_attach_device(struct dmar_domain *domain,
 
old_domain = find_domain(pdev);
if (old_domain) {
-   if (domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE)
+   if (dmar_domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE)
vm_domain_remove_one_dev_info(old_domain, pdev);
else
domain_remove_dev_info(old_domain);
@@ -2724,28 +2726,29 @@ int intel_iommu_attach_device(struct dmar_domain 
*domain,
addr_width = agaw_to_width(iommu->agaw);
end = DOMAIN_MAX_ADDR(addr_width);
end = end & VTD_PAGE_MASK;
-   if (end < domain->max_addr) {
+   if (end < dmar_domain->max_addr) {
printk(KERN_ERR "%s: iommu agaw (%d) is not "
   "sufficient for the mapped address (%llx)\n",
-  __func__, iommu->agaw, domain->max_addr);
+  __func__, iommu->agaw, dmar_domain->max_addr);
return -EFAULT;
}
 
-   ret = domain_context_mapping(domain, pdev);
+   ret = domain_context_mapping(dmar_domain, pdev);
if (ret)
return ret;
 
-   ret = vm_domain_add_dev_info(domain, pdev);
+   ret = vm_domain_add_dev_info(dmar_domain, pdev);
return ret;
 }
-EXPORT_SYMBOL_GPL(intel_iommu_attach_device);
 
-void intel_iommu_detach_device(struct dmar_domain *domain,
-  struct pci_dev *pdev)
+static void intel_iommu_detach_device(struct iommu_domain *domain,
+ struct device *dev)
 {
-   vm_domain_remove_one_dev_info(domain, pdev);
+   struct dmar_domain *dmar_domain = domain->priv;
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   vm_domain_remove_one_dev_info(dmar_domain, pdev);
 }
-EXPORT_SYMBOL_GPL(intel_iommu_detach_device);
 
 int intel_iommu_map_address(struct dmar_domain *domain, dma_addr_t iova,
u64 hpa, size_t size, int prot)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index b4a2c2d..2b4961f 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -335,10 +335,6 @@ extern int qi_flush_iotlb(struct intel_iommu *iommu, u16 
did, u64 addr,
 
 extern void qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
-int intel_iommu_attach_device(struct dmar_domain *domain,
- struct pci_dev *pdev);
-void intel_iommu_detach_device(struct dmar_domain *domain,
-  struct pci_dev *pdev);
 int intel_iommu_map_address(struct dmar_domain *domain, dma_addr_t iova,
u64 hpa, size_t size, int prot);
 void intel_iommu_unmap_address(struct dmar_domain *domain,
-- 
1.5.6.4




[PATCH 08/11] VT-d: adapt domain map and unmap functions for IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c   |   33 -
 include/linux/intel-iommu.h |4 
 2 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 27dc896..34ea06f 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2750,20 +2750,28 @@ static void intel_iommu_detach_device(struct 
iommu_domain *domain,
vm_domain_remove_one_dev_info(dmar_domain, pdev);
 }
 
-int intel_iommu_map_address(struct dmar_domain *domain, dma_addr_t iova,
-   u64 hpa, size_t size, int prot)
+static int intel_iommu_map_range(struct iommu_domain *domain,
+unsigned long iova, phys_addr_t hpa,
+size_t size, int iommu_prot)
 {
+   struct dmar_domain *dmar_domain = domain->priv;
u64 max_addr;
int addr_width;
+   int prot = 0;
int ret;
 
+   if (iommu_prot & IOMMU_READ)
+   prot |= DMA_PTE_READ;
+   if (iommu_prot & IOMMU_WRITE)
+   prot |= DMA_PTE_WRITE;
+
max_addr = (iova & VTD_PAGE_MASK) + VTD_PAGE_ALIGN(size);
-   if (domain->max_addr < max_addr) {
+   if (dmar_domain->max_addr < max_addr) {
int min_agaw;
u64 end;
 
/* check if minimum agaw is sufficient for mapped address */
-   min_agaw = vm_domain_min_agaw(domain);
+   min_agaw = vm_domain_min_agaw(dmar_domain);
addr_width = agaw_to_width(min_agaw);
end = DOMAIN_MAX_ADDR(addr_width);
end = end & VTD_PAGE_MASK;
@@ -2773,28 +2781,27 @@ int intel_iommu_map_address(struct dmar_domain *domain, dma_addr_t iova,
   __func__, min_agaw, max_addr);
return -EFAULT;
}
-   domain->max_addr = max_addr;
+   dmar_domain->max_addr = max_addr;
}
 
-   ret = domain_page_mapping(domain, iova, hpa, size, prot);
+   ret = domain_page_mapping(dmar_domain, iova, hpa, size, prot);
return ret;
 }
-EXPORT_SYMBOL_GPL(intel_iommu_map_address);
 
-void intel_iommu_unmap_address(struct dmar_domain *domain,
-  dma_addr_t iova, size_t size)
+static void intel_iommu_unmap_range(struct iommu_domain *domain,
+   unsigned long iova, size_t size)
 {
+   struct dmar_domain *dmar_domain = domain->priv;
dma_addr_t base;
 
/* The address might not be aligned */
base = iova & VTD_PAGE_MASK;
size = VTD_PAGE_ALIGN(size);
-   dma_pte_clear_range(domain, base, base + size);
+   dma_pte_clear_range(dmar_domain, base, base + size);
 
-   if (domain->max_addr == base + size)
-   domain->max_addr = base;
+   if (dmar_domain->max_addr == base + size)
+   dmar_domain->max_addr = base;
 }
-EXPORT_SYMBOL_GPL(intel_iommu_unmap_address);
 
 int intel_iommu_found(void)
 {
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2b4961f..01ba896 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -335,10 +335,6 @@ extern int qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 
 extern void qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
-int intel_iommu_map_address(struct dmar_domain *domain, dma_addr_t iova,
-   u64 hpa, size_t size, int prot);
-void intel_iommu_unmap_address(struct dmar_domain *domain,
-  dma_addr_t iova, size_t size);
 u64 intel_iommu_iova_to_phys(struct dmar_domain *domain, u64 iova);
 
 #ifdef CONFIG_DMAR
-- 
1.5.6.4
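
The flag translation and range arithmetic in the patch above can be tried out in userspace. The following is a minimal sketch, not the driver code: the numeric values of IOMMU_READ, IOMMU_WRITE, DMA_PTE_READ and DMA_PTE_WRITE and the 4 KiB VT-d page size are assumptions made for illustration, and the helper names vtd_prot_from_iommu(), vtd_range_end() and vtd_sketch_check() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones live in kernel headers. */
#define IOMMU_READ      (1 << 0)
#define IOMMU_WRITE     (1 << 1)
#define DMA_PTE_READ    (1 << 0)
#define DMA_PTE_WRITE   (1 << 1)

/* Assumed 4 KiB VT-d page size. */
#define VTD_PAGE_SIZE     ((uint64_t)4096)
#define VTD_PAGE_MASK     (~(VTD_PAGE_SIZE - 1))
#define VTD_PAGE_ALIGN(x) (((uint64_t)(x) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)

/* Translate generic IOMMU_* protection flags into VT-d PTE bits. */
int vtd_prot_from_iommu(int iommu_prot)
{
    int prot = 0;

    if (iommu_prot & IOMMU_READ)
        prot |= DMA_PTE_READ;
    if (iommu_prot & IOMMU_WRITE)
        prot |= DMA_PTE_WRITE;
    return prot;
}

/* End address of a mapping, computed the same way as max_addr in the patch. */
uint64_t vtd_range_end(uint64_t iova, size_t size)
{
    return (iova & VTD_PAGE_MASK) + VTD_PAGE_ALIGN(size);
}

/* Small self-check that doubles as a usage example. */
void vtd_sketch_check(void)
{
    assert(vtd_prot_from_iommu(IOMMU_READ) == DMA_PTE_READ);
    /* iova is rounded down to 0x1000, size rounded up to one page */
    assert(vtd_range_end(0x1234, 0x100) == 0x2000);
}
```

Keeping the translation in the backend means callers such as KVM only ever see the generic IOMMU_* flags.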




[PATCH 09/11] VT-d: adapt domain iova_to_phys function for IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c   |7 ---
 include/linux/intel-iommu.h |2 --
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 34ea06f..c5f1405 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2809,15 +2809,16 @@ int intel_iommu_found(void)
 }
 EXPORT_SYMBOL_GPL(intel_iommu_found);
 
-u64 intel_iommu_iova_to_phys(struct dmar_domain *domain, u64 iova)
+static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
+   unsigned long iova)
 {
+   struct dmar_domain *dmar_domain = domain->priv;
struct dma_pte *pte;
u64 phys = 0;
 
-   pte = addr_to_dma_pte(domain, iova);
+   pte = addr_to_dma_pte(dmar_domain, iova);
if (pte)
phys = dma_pte_addr(*pte);
 
return phys;
 }
-EXPORT_SYMBOL_GPL(intel_iommu_iova_to_phys);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 01ba896..57c44c2 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -335,8 +335,6 @@ extern int qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 
 extern void qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
-u64 intel_iommu_iova_to_phys(struct dmar_domain *domain, u64 iova);
-
 #ifdef CONFIG_DMAR
 int intel_iommu_found(void);
 #else /* CONFIG_DMAR */
-- 
1.5.6.4




[PATCH 05/11] KVM: change KVM to use IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/ia64/include/asm/kvm_host.h |2 +-
 arch/ia64/kvm/Makefile   |2 +-
 arch/ia64/kvm/kvm-ia64.c |3 +-
 arch/x86/include/asm/kvm_host.h  |2 +-
 arch/x86/kvm/Makefile|2 +-
 arch/x86/kvm/x86.c   |3 +-
 include/linux/kvm_host.h |6 ++--
 virt/kvm/iommu.c |   45 +
 virt/kvm/kvm_main.c  |2 +-
 9 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h
index 0560f3f..3486636 100644
--- a/arch/ia64/include/asm/kvm_host.h
+++ b/arch/ia64/include/asm/kvm_host.h
@@ -467,7 +467,7 @@ struct kvm_arch {
struct kvm_sal_data rdv_sal_data;
 
struct list_head assigned_dev_head;
-   struct dmar_domain *intel_iommu_domain;
+   struct iommu_domain *iommu_domain;
struct hlist_head irq_ack_notifier_list;
 
unsigned long irq_sources_bitmap;
diff --git a/arch/ia64/kvm/Makefile b/arch/ia64/kvm/Makefile
index cb69dfc..0bb99b7 100644
--- a/arch/ia64/kvm/Makefile
+++ b/arch/ia64/kvm/Makefile
@@ -51,7 +51,7 @@ EXTRA_AFLAGS += -Ivirt/kvm -Iarch/ia64/kvm/
 common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
coalesced_mmio.o irq_comm.o)
 
-ifeq ($(CONFIG_DMAR),y)
+ifeq ($(CONFIG_IOMMU_API),y)
 common-objs += $(addprefix ../../../virt/kvm/, iommu.o)
 endif
 
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index b4d24e2..483a15b 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -31,6 +31,7 @@
 #include <linux/bitops.h>
 #include <linux/hrtimer.h>
 #include <linux/uaccess.h>
+#include <linux/iommu.h>
 #include <linux/intel-iommu.h>
 
 #include <asm/pgtable.h>
@@ -189,7 +190,7 @@ int kvm_dev_ioctl_check_extension(long ext)
r = KVM_COALESCED_MMIO_PAGE_OFFSET;
break;
case KVM_CAP_IOMMU:
-   r = intel_iommu_found();
+   r = iommu_found();
break;
default:
r = 0;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f58f7eb..6b24a10 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -356,7 +356,7 @@ struct kvm_arch{
 */
struct list_head active_mmu_pages;
struct list_head assigned_dev_head;
-   struct dmar_domain *intel_iommu_domain;
+   struct iommu_domain *iommu_domain;
struct kvm_pic *vpic;
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 00f46c2..d3ec292 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -7,7 +7,7 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
 ifeq ($(CONFIG_KVM_TRACE),y)
 common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
 endif
-ifeq ($(CONFIG_DMAR),y)
+ifeq ($(CONFIG_IOMMU_API),y)
 common-objs += $(addprefix ../../../virt/kvm/, iommu.o)
 endif
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7a2aeba..5d2787b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -34,6 +34,7 @@
 #include <linux/module.h>
 #include <linux/mman.h>
 #include <linux/highmem.h>
+#include <linux/iommu.h>
 #include <linux/intel-iommu.h>
 
 #include <asm/uaccess.h>
@@ -987,7 +988,7 @@ int kvm_dev_ioctl_check_extension(long ext)
r = !tdp_enabled;
break;
case KVM_CAP_IOMMU:
-   r = intel_iommu_found();
+   r = iommu_found();
break;
default:
r = 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index cb1d404..ca93f7f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -326,7 +326,7 @@ void kvm_unregister_irq_ack_notifier(struct kvm_irq_ack_notifier *kian);
 int kvm_request_irq_source_id(struct kvm *kvm);
 void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id);
 
-#ifdef CONFIG_DMAR
+#ifdef CONFIG_IOMMU_API
 int kvm_iommu_map_pages(struct kvm *kvm, gfn_t base_gfn,
unsigned long npages);
 int kvm_iommu_map_guest(struct kvm *kvm);
@@ -335,7 +335,7 @@ int kvm_assign_device(struct kvm *kvm,
  struct kvm_assigned_dev_kernel *assigned_dev);
 int kvm_deassign_device(struct kvm *kvm,
struct kvm_assigned_dev_kernel *assigned_dev);
-#else /* CONFIG_DMAR */
+#else /* CONFIG_IOMMU_API */
 static inline int kvm_iommu_map_pages(struct kvm *kvm,
  gfn_t base_gfn,
  unsigned long npages)
@@ -364,7 +364,7 @@ static inline int kvm_deassign_device(struct kvm *kvm,
 {
return 0;
 }
-#endif /* CONFIG_DMAR */
+#endif /* CONFIG_IOMMU_API */
 
 static inline void kvm_guest_enter(void)
 {
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 174ea1f..8a7f488 100644
--- 

[PATCH 04/11] select IOMMU_API when DMAR and/or AMD_IOMMU is selected

2008-12-04 Thread Joerg Roedel
These two IOMMUs can implement the current version of this API. So
select the API if one or both of these IOMMU drivers is selected.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/ia64/Kconfig |3 +++
 arch/x86/Kconfig  |3 +++
 drivers/base/Makefile |1 +
 3 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 6bd91ed..6a7b0c9 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -687,3 +687,6 @@ config IRQ_PER_CPU
 
 config IOMMU_HELPER
def_bool (IA64_HP_ZX1 || IA64_HP_ZX1_SWIOTLB || IA64_GENERIC || SWIOTLB)
+
+config IOMMU_API
+   def_bool (DMAR)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ac22bb7..b9f7187 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -580,6 +580,9 @@ config SWIOTLB
 config IOMMU_HELPER
def_bool (CALGARY_IOMMU || GART_IOMMU || SWIOTLB || AMD_IOMMU)
 
+config IOMMU_API
+   def_bool (AMD_IOMMU || DMAR)
+
 config MAXSMP
bool "Configure Maximum number of SMP Processors and NUMA Nodes"
depends on X86_64 && SMP && BROKEN
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index c666373..b5b8ba5 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_FW_LOADER)   += firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
 obj-$(CONFIG_MEMORY_HOTPLUG_SPARSE) += memory.o
 obj-$(CONFIG_SMP)  += topology.o
+obj-$(CONFIG_IOMMU_API) += iommu.o
 ifeq ($(CONFIG_SYSFS),y)
 obj-$(CONFIG_MODULES)  += module.o
 endif
-- 
1.5.6.4




[PATCH 11/11] VT-d: remove now unused intel_iommu_found function

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c   |6 --
 include/linux/intel-iommu.h |9 -
 2 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 20d2305..90767f2 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2808,12 +2808,6 @@ static void intel_iommu_unmap_range(struct iommu_domain *domain,
dmar_domain->max_addr = base;
 }
 
-int intel_iommu_found(void)
-{
-   return g_num_of_iommus;
-}
-EXPORT_SYMBOL_GPL(intel_iommu_found);
-
 static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
unsigned long iova)
 {
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 57c44c2..86a9653 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -335,15 +335,6 @@ extern int qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 
 extern void qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
-#ifdef CONFIG_DMAR
-int intel_iommu_found(void);
-#else /* CONFIG_DMAR */
-static inline int intel_iommu_found(void)
-{
-   return 0;
-}
-#endif /* CONFIG_DMAR */
-
 extern void *intel_alloc_coherent(struct device *, size_t, dma_addr_t *, gfp_t);
 extern void intel_free_coherent(struct device *, size_t, void *, dma_addr_t);
 extern dma_addr_t intel_map_single(struct device *, phys_addr_t, size_t, int);
-- 
1.5.6.4




[PATCH 06/11] VT-d: adapt domain init and destroy functions for IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c   |   33 ++---
 include/linux/intel-iommu.h |2 --
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index a8c2e58..c939039 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -35,6 +35,7 @@
 #include <linux/mempool.h>
 #include <linux/timer.h>
 #include <linux/iova.h>
+#include <linux/iommu.h>
 #include <linux/intel-iommu.h>
 #include <asm/cacheflush.h>
 #include <asm/iommu.h>
@@ -2665,32 +2666,34 @@ static void vm_domain_exit(struct dmar_domain *domain)
free_domain_mem(domain);
 }
 
-struct dmar_domain *intel_iommu_alloc_domain(void)
+static int intel_iommu_domain_init(struct iommu_domain *domain)
 {
-   struct dmar_domain *domain;
+   struct dmar_domain *dmar_domain;
 
-   domain = iommu_alloc_vm_domain();
-   if (!domain) {
+   dmar_domain = iommu_alloc_vm_domain();
+   if (!dmar_domain) {
printk(KERN_ERR
-   "intel_iommu_domain_alloc: domain == NULL\n");
-   return NULL;
+   "intel_iommu_domain_init: dmar_domain == NULL\n");
+   return -ENOMEM;
}
-   if (vm_domain_init(domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+   if (vm_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
printk(KERN_ERR
-   "intel_iommu_domain_alloc: domain_init() failed\n");
-   vm_domain_exit(domain);
-   return NULL;
+   "intel_iommu_domain_init() failed\n");
+   vm_domain_exit(dmar_domain);
+   return -ENOMEM;
}
+   domain->priv = dmar_domain;
 
-   return domain;
+   return 0;
 }
-EXPORT_SYMBOL_GPL(intel_iommu_alloc_domain);
 
-void intel_iommu_free_domain(struct dmar_domain *domain)
+static void intel_iommu_domain_destroy(struct iommu_domain *domain)
 {
-   vm_domain_exit(domain);
+   struct dmar_domain *dmar_domain = domain->priv;
+
+   domain->priv = NULL;
+   vm_domain_exit(dmar_domain);
 }
-EXPORT_SYMBOL_GPL(intel_iommu_free_domain);
 
 int intel_iommu_attach_device(struct dmar_domain *domain,
  struct pci_dev *pdev)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 39a68b3..b4a2c2d 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -335,8 +335,6 @@ extern int qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
 
 extern void qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu);
 
-struct dmar_domain *intel_iommu_alloc_domain(void);
-void intel_iommu_free_domain(struct dmar_domain *domain);
 int intel_iommu_attach_device(struct dmar_domain *domain,
  struct pci_dev *pdev);
 void intel_iommu_detach_device(struct dmar_domain *domain,
-- 
1.5.6.4




[PATCH 10/11] VT-d: register functions for the IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/pci/intel-iommu.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index c5f1405..20d2305 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -90,6 +90,8 @@ static int intel_iommu_strict;
 static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
+static struct iommu_ops intel_iommu_ops;
+
 static int __init intel_iommu_setup(char *str)
 {
if (!str)
@@ -2419,6 +2421,9 @@ int __init intel_iommu_init(void)
init_timer(&unmap_timer);
force_iommu = 1;
dma_ops = &intel_dma_ops;
+
+   register_iommu(&intel_iommu_ops);
+
return 0;
 }
 
@@ -2822,3 +2827,13 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
 
return phys;
 }
+
+static struct iommu_ops intel_iommu_ops = {
+   .domain_init= intel_iommu_domain_init,
+   .domain_destroy = intel_iommu_domain_destroy,
+   .attach_dev = intel_iommu_attach_device,
+   .detach_dev = intel_iommu_detach_device,
+   .map= intel_iommu_map_range,
+   .unmap  = intel_iommu_unmap_range,
+   .iova_to_phys   = intel_iommu_iova_to_phys,
+};
-- 
1.5.6.4




[PATCH 0/11] Factor VT-d KVM functions into a generic API

2008-12-04 Thread Joerg Roedel
This patch series makes the current KVM device passthrough code generic
enough that other IOMMU implementations can also plug into it. It works
by factoring the functions the VT-d code exports to KVM into a generic
interface which allows different backends.

This is the third version of the patchset. I rebased these patches onto
the 13-patch post of Han Weidong's multiple device assignment work and
included changes to the commit messages according to the comments I got
from the review.

This is a basic implementation of a generic interface. It can and should
be improved later to support more types of hardware IOMMUs than VT-d and
AMD IOMMU.

Since I have no VT-d hardware available these patches are only compile
tested for now.

Please review, comment and test these patches.

Thanks,

Joerg

diffstat:

 arch/ia64/Kconfig|3 +
 arch/ia64/include/asm/kvm_host.h |2 +-
 arch/ia64/kvm/Makefile   |4 +-
 arch/ia64/kvm/kvm-ia64.c |3 +-
 arch/x86/Kconfig |3 +
 arch/x86/include/asm/kvm_host.h  |2 +-
 arch/x86/kvm/Makefile|4 +-
 arch/x86/kvm/x86.c   |3 +-
 drivers/base/Makefile|1 +
 drivers/base/iommu.c |  100 +++
 drivers/pci/intel-iommu.c|  121 ++---
 include/linux/intel-iommu.h  |   21 ---
 include/linux/iommu.h|  112 +++
 include/linux/kvm_host.h |6 +-
 virt/kvm/{vtd.c => iommu.c}  |   45 +++
 virt/kvm/kvm_main.c  |2 +-
 16 files changed, 326 insertions(+), 106 deletions(-)





[PATCH 03/11] add frontend implementation for the IOMMU API

2008-12-04 Thread Joerg Roedel
This API can be used by KVM for accessing different types of IOMMUs to
do device passthrough to guests. Besides that, this API can also be used
by device drivers to map non-linear host memory into dma-linear
addresses to avoid scatter-gather DMA. UIO may be another user for
this API.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 drivers/base/iommu.c |  100 ++
 1 files changed, 100 insertions(+), 0 deletions(-)
 create mode 100644 drivers/base/iommu.c

diff --git a/drivers/base/iommu.c b/drivers/base/iommu.c
new file mode 100644
index 0000000..5e039d4
--- /dev/null
+++ b/drivers/base/iommu.c
@@ -0,0 +1,100 @@
+/*
+ * Copyright (C) 2007-2008 Advanced Micro Devices, Inc.
+ * Author: Joerg Roedel [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ */
+
+#include <linux/bug.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/iommu.h>
+
+static struct iommu_ops *iommu_ops;
+
+void register_iommu(struct iommu_ops *ops)
+{
+   if (iommu_ops)
+   BUG();
+
+   iommu_ops = ops;
+}
+
+bool iommu_found()
+{
+   return iommu_ops != NULL;
+}
+EXPORT_SYMBOL_GPL(iommu_found);
+
+struct iommu_domain *iommu_domain_alloc(void)
+{
+   struct iommu_domain *domain;
+   int ret;
+
+   domain = kmalloc(sizeof(*domain), GFP_KERNEL);
+   if (!domain)
+   return NULL;
+
+   ret = iommu_ops->domain_init(domain);
+   if (ret)
+   goto out_free;
+
+   return domain;
+
+out_free:
+   kfree(domain);
+
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(iommu_domain_alloc);
+
+void iommu_domain_free(struct iommu_domain *domain)
+{
+   iommu_ops->domain_destroy(domain);
+   kfree(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_domain_free);
+
+int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
+{
+   return iommu_ops->attach_dev(domain, dev);
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device);
+
+void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
+{
+   iommu_ops->detach_dev(domain, dev);
+}
+EXPORT_SYMBOL_GPL(iommu_detach_device);
+
+int iommu_map_range(struct iommu_domain *domain, unsigned long iova,
+   phys_addr_t paddr, size_t size, int prot)
+{
+   return iommu_ops->map(domain, iova, paddr, size, prot);
+}
+EXPORT_SYMBOL_GPL(iommu_map_range);
+
+void iommu_unmap_range(struct iommu_domain *domain, unsigned long iova,
+ size_t size)
+{
+   iommu_ops->unmap(domain, iova, size);
+}
+EXPORT_SYMBOL_GPL(iommu_unmap_range);
+
+phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain,
+  unsigned long iova)
+{
+   return iommu_ops->iova_to_phys(domain, iova);
+}
+EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
-- 
1.5.6.4
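
The frontend above is a thin dispatch layer over a single registered ops table. A minimal userspace sketch of that pattern follows, assuming malloc()/free() in place of kmalloc()/kfree() and assert() in place of BUG(). The demo_* backend is hypothetical and only counts live domains; it stands in for a real driver such as VT-d or AMD IOMMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; names mirror the patch. */
struct iommu_domain {
    void *priv;                         /* backend-private data */
};

struct iommu_ops {
    int  (*domain_init)(struct iommu_domain *domain);
    void (*domain_destroy)(struct iommu_domain *domain);
};

static struct iommu_ops *iommu_ops;     /* the single registered backend */

void register_iommu(struct iommu_ops *ops)
{
    assert(iommu_ops == NULL);          /* the patch calls BUG() here */
    iommu_ops = ops;
}

bool iommu_found(void)
{
    return iommu_ops != NULL;
}

struct iommu_domain *iommu_domain_alloc(void)
{
    struct iommu_domain *domain = malloc(sizeof(*domain));

    if (!domain)
        return NULL;
    if (iommu_ops->domain_init(domain)) {
        free(domain);                   /* backend refused: undo the allocation */
        return NULL;
    }
    return domain;
}

void iommu_domain_free(struct iommu_domain *domain)
{
    iommu_ops->domain_destroy(domain);
    free(domain);
}

/* Hypothetical backend: counts live domains instead of touching hardware. */
static int demo_live_domains;

static int demo_domain_init(struct iommu_domain *domain)
{
    domain->priv = &demo_live_domains;
    demo_live_domains++;
    return 0;
}

static void demo_domain_destroy(struct iommu_domain *domain)
{
    domain->priv = NULL;
    demo_live_domains--;
}

static struct iommu_ops demo_iommu_ops = {
    .domain_init    = demo_domain_init,
    .domain_destroy = demo_domain_destroy,
};
```

As in the patch, the sketch dereferences iommu_ops without a NULL check, so callers are expected to test iommu_found() before allocating a domain.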




[PATCH 05/19] AMD IOMMU: add domain id free function

2008-12-04 Thread Joerg Roedel
Impact: add code to release a domain id

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index bec34fd..b774638 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -568,6 +568,16 @@ static u16 domain_id_alloc(void)
return id;
 }
 
+static void domain_id_free(int id)
+{
+   unsigned long flags;
+
+   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   if (id > 0 && id < MAX_DOMAIN_ID)
+   __clear_bit(id, amd_iommu_pd_alloc_bitmap);
+   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+}
+
 /*
  * Used to reserve address ranges in the aperture (e.g. for exclusion
  * ranges.
-- 
1.5.6.4
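
The id bookkeeping behind domain_id_alloc() and domain_id_free() is a plain bitmap with id 0 kept reserved. Below is a single-threaded userspace sketch of that idea. It leaves out the devtable lock, assumes MAX_DOMAIN_ID is 65536, and uses a linear scan where the kernel would use its bitmap helpers; domain_id_check() is a hypothetical self-check.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MAX_DOMAIN_ID   65536           /* assumed: ids fit in a u16 */
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)

static unsigned long pd_alloc_bitmap[MAX_DOMAIN_ID / BITS_PER_WORD];

/* Return the lowest free id above 0, or 0 if none is left. */
uint16_t domain_id_alloc(void)
{
    for (unsigned int id = 1; id < MAX_DOMAIN_ID; id++) {
        unsigned long bit = 1UL << (id % BITS_PER_WORD);

        if (!(pd_alloc_bitmap[id / BITS_PER_WORD] & bit)) {
            pd_alloc_bitmap[id / BITS_PER_WORD] |= bit;
            return (uint16_t)id;
        }
    }
    return 0;
}

/* Release an id; out-of-range values and the reserved id 0 are ignored. */
void domain_id_free(int id)
{
    if (id > 0 && id < MAX_DOMAIN_ID)
        pd_alloc_bitmap[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}

/* Self-check: a freed id becomes the next one handed out. */
void domain_id_check(void)
{
    uint16_t a = domain_id_alloc();
    uint16_t b = domain_id_alloc();

    assert(a != 0 && b != 0 && a != b);
    domain_id_free(a);
    assert(domain_id_alloc() == a);
    domain_id_free(a);
    domain_id_free(b);
}
```

The range check in domain_id_free() mirrors the one in the patch, so a failed allocation (id 0) can be passed to it without harm.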




[PATCH 08/19] AMD IOMMU: add iommu_flush_domain function

2008-12-04 Thread Joerg Roedel
Impact: add a function to flush a domain id on every IOMMU

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   22 ++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index d4d0369..3fb5a04 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -350,6 +350,28 @@ static void iommu_flush_tlb(struct amd_iommu *iommu, u16 domid)
iommu_queue_inv_iommu_pages(iommu, address, domid, 0, 1);
 }
 
+/*
+ * This function is used to flush the IO/TLB for a given protection domain
+ * on every IOMMU in the system
+ */
+static void iommu_flush_domain(u16 domid)
+{
+   unsigned long flags;
+   struct amd_iommu *iommu;
+   struct iommu_cmd cmd;
+
+   __iommu_build_inv_iommu_pages(&cmd, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
+ domid, 1, 1);
+
+   list_for_each_entry(iommu, &amd_iommu_list, list) {
+   spin_lock_irqsave(&iommu->lock, flags);
+   __iommu_queue_command(iommu, &cmd);
+   __iommu_completion_wait(iommu);
+   __iommu_wait_for_completion(iommu);
+   spin_unlock_irqrestore(&iommu->lock, flags);
+   }
+}
+
 /****************************************************************************
  *
  * The functions below are used the create the page table mappings for
-- 
1.5.6.4




[PATCH 07/19] AMD IOMMU: move invalidation command building to a separate function

2008-12-04 Thread Joerg Roedel
Impact: refactoring of iommu_queue_inv_iommu_pages

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index ac801f1..d4d0369 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -282,6 +282,21 @@ static int iommu_queue_inv_dev_entry(struct amd_iommu *iommu, u16 devid)
return ret;
 }
 
+static void __iommu_build_inv_iommu_pages(struct iommu_cmd *cmd, u64 address,
+ u16 domid, int pde, int s)
+{
+   memset(cmd, 0, sizeof(*cmd));
+   address &= PAGE_MASK;
+   CMD_SET_TYPE(cmd, CMD_INV_IOMMU_PAGES);
+   cmd->data[1] |= domid;
+   cmd->data[2] = lower_32_bits(address);
+   cmd->data[3] = upper_32_bits(address);
+   if (s) /* size bit - we flush more than one 4kb page */
+   cmd->data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
+   if (pde) /* PDE bit - we wan't flush everything not only the PTEs */
+   cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
+}
+
 /*
  * Generic command send function for invalidaing TLB entries
  */
@@ -291,16 +306,7 @@ static int iommu_queue_inv_iommu_pages(struct amd_iommu *iommu,
struct iommu_cmd cmd;
int ret;
 
-   memset(&cmd, 0, sizeof(cmd));
-   address &= PAGE_MASK;
-   CMD_SET_TYPE(&cmd, CMD_INV_IOMMU_PAGES);
-   cmd.data[1] |= domid;
-   cmd.data[2] = lower_32_bits(address);
-   cmd.data[3] = upper_32_bits(address);
-   if (s) /* size bit - we flush more than one 4kb page */
-   cmd.data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
-   if (pde) /* PDE bit - we wan't flush everything not only the PTEs */
-   cmd.data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
+   __iommu_build_inv_iommu_pages(&cmd, address, domid, pde, s);
 
ret = iommu_queue_command(iommu, &cmd);
 
-- 
1.5.6.4
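
To see what the factored-out builder puts into the four command words, here is a userspace sketch. The constant values, the 4 KiB page mask and the position of the command type (top four bits of data[1]) are assumptions chosen to resemble the driver headers and should be checked against amd_iommu_types.h. The name build_inv_iommu_pages() is a hypothetical stand-in for __iommu_build_inv_iommu_pages().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed constants, chosen to resemble the AMD IOMMU driver's headers. */
#define PAGE_MASK_4K                    (~(uint64_t)0xfff)
#define CMD_INV_IOMMU_PAGES             0x03u
#define CMD_INV_IOMMU_PAGES_SIZE_MASK   0x01u
#define CMD_INV_IOMMU_PAGES_PDE_MASK    0x02u

struct iommu_cmd {
    uint32_t data[4];
};

/* Fill in an "invalidate IOMMU pages" command for one protection domain. */
void build_inv_iommu_pages(struct iommu_cmd *cmd, uint64_t address,
                           uint16_t domid, int pde, int s)
{
    assert(cmd != NULL);

    memset(cmd, 0, sizeof(*cmd));
    address &= PAGE_MASK_4K;                     /* low 12 bits carry flags */
    cmd->data[1] |= CMD_INV_IOMMU_PAGES << 28;   /* type field, assumed position */
    cmd->data[1] |= domid;
    cmd->data[2] = (uint32_t)address;            /* lower_32_bits() */
    cmd->data[3] = (uint32_t)(address >> 32);    /* upper_32_bits() */
    if (s)      /* size bit: flush more than one 4 KiB page */
        cmd->data[2] |= CMD_INV_IOMMU_PAGES_SIZE_MASK;
    if (pde)    /* PDE bit: flush page directory entries as well */
        cmd->data[2] |= CMD_INV_IOMMU_PAGES_PDE_MASK;
}
```

Taking a pointer to the command, rather than building it on the callee's stack, is what lets a caller build the command once and queue it on several IOMMUs.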




[PATCH 0/19] AMD IOMMU support for KVM device assignment

2008-12-04 Thread Joerg Roedel
This patchset implements KVM device assignment support in the AMD IOMMU
driver. It uses the generic interface from the iommu-api patchset and
was successfully tested using a 10GBit network card passed through to
a KVM guest.

diffstat:

 arch/x86/include/asm/amd_iommu_types.h |   15 +-
 arch/x86/kernel/amd_iommu.c|  426 
 2 files changed, 393 insertions(+), 48 deletions(-)





[PATCH 14/19] AMD IOMMU: add device detach function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function to detach devices from protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   51 +++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index fd42926..0ded6f4 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -1529,3 +1529,54 @@ static void amd_iommu_domain_destroy(struct iommu_domain *dom)
 
dom->priv = NULL;
 }
+
+static void __detach_device(struct protection_domain *domain, u16 devid)
+{
+   unsigned long flags;
+
+   /* lock domain and device table */
+   write_lock_irqsave(&amd_iommu_devtable_lock, flags);
+   spin_lock(&domain->lock);
+
+   /* remove domain from the lookup table */
+   amd_iommu_pd_table[devid] = NULL;
+
+   /* remove entry from the device table seen by the hardware */
+   amd_iommu_dev_table[devid].data[0] = IOMMU_PTE_P | IOMMU_PTE_TV;
+   amd_iommu_dev_table[devid].data[1] = 0;
+   amd_iommu_dev_table[devid].data[2] = 0;
+
+   /* decrease reference counter */
+   domain->dev_cnt -= 1;
+
+   /* ready */
+   spin_unlock(&domain->lock);
+   write_unlock_irqrestore(&amd_iommu_devtable_lock, flags);
+}
+
+static void amd_iommu_detach_device(struct iommu_domain *dom,
+   struct device *dev)
+{
+   struct protection_domain *domain = dom->priv;
+   struct amd_iommu *iommu;
+   struct pci_dev *pdev;
+   u16 devid;
+
+   if (dev->bus != &pci_bus_type)
+   return;
+
+   pdev = to_pci_dev(dev);
+
+   devid = calc_devid(pdev->bus->number, pdev->devfn);
+
+   if (devid > 0)
+   __detach_device(domain, devid);
+
+   iommu = amd_iommu_rlookup_table[devid];
+   if (!iommu)
+   return;
+
+   iommu_queue_inv_dev_entry(iommu, devid);
+   iommu_completion_wait(iommu);
+}
+
-- 
1.5.6.4




[PATCH 13/19] AMD IOMMU: add domain destroy function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function for releasing protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 5d56cba..fd42926 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -1511,3 +1511,21 @@ out_free:
 
return -ENOMEM;
 }
+
+static void amd_iommu_domain_destroy(struct iommu_domain *dom)
+{
+   struct protection_domain *domain = dom->priv;
+
+   if (!domain)
+   return;
+
+   BUG_ON(domain->dev_cnt);
+
+   free_pagetable(domain);
+
+   domain_id_free(domain->id);
+
+   kfree(domain);
+
+   dom->priv = NULL;
+}
-- 
1.5.6.4




[PATCH 09/19] AMD IOMMU: add protection domain flags

2008-12-04 Thread Joerg Roedel
Impact: add a new struct member to 'struct protection_domain'

When using protection domains for dma_ops and KVM it's better to know for
which subsystem a domain was allocated. Add a flags member to struct
protection_domain for that purpose.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/include/asm/amd_iommu_types.h |   14 +-
 arch/x86/kernel/amd_iommu.c|1 +
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/amd_iommu_types.h b/arch/x86/include/asm/amd_iommu_types.h
index 1a30c04..b74d35f 100644
--- a/arch/x86/include/asm/amd_iommu_types.h
+++ b/arch/x86/include/asm/amd_iommu_types.h
@@ -190,16 +190,20 @@
 /* FIXME: move this macro to linux/pci.h */
 #define PCI_BUS(x) (((x) >> 8) & 0xff)
 
+/* Protection domain flags */
+#define PD_DMA_OPS_MASK (1UL << 0) /* domain used for dma_ops */
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
  */
 struct protection_domain {
-   spinlock_t lock; /* mostly used to lock the page table*/
-   u16 id;  /* the domain id written to the device table */
-   int mode;/* paging mode (0-6 levels) */
-   u64 *pt_root;/* page table root pointer */
-   void *priv;  /* private data */
+   spinlock_t lock;/* mostly used to lock the page table*/
+   u16 id; /* the domain id written to the device table */
+   int mode;   /* paging mode (0-6 levels) */
+   u64 *pt_root;   /* page table root pointer */
+   unsigned long flags;/* flags to find out type of domain */
+   void *priv; /* private data */
 };
 
 /*
diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 3fb5a04..07ef5e9 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -723,6 +723,7 @@ static struct dma_ops_domain *dma_ops_domain_alloc(struct amd_iommu *iommu,
goto free_dma_dom;
dma_dom->domain.mode = PAGE_MODE_3_LEVEL;
dma_dom->domain.pt_root = (void *)get_zeroed_page(GFP_KERNEL);
+   dma_dom->domain.flags = PD_DMA_OPS_MASK;
dma_dom->domain.priv = dma_dom;
if (!dma_dom->domain.pt_root)
goto free_dma_dom;
-- 
1.5.6.4




[PATCH 04/19] AMD IOMMU: make dma_ops_free_pagetable generic

2008-12-04 Thread Joerg Roedel
Impact: change code to free pagetables from protection domains

The dma_ops_free_pagetable function can only free pagetables from
dma_ops domains. Change that to free pagetables of pure protection
domains.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index f3dd4cd..bec34fd 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -584,12 +584,12 @@ static void dma_ops_reserve_addresses(struct dma_ops_domain *dom,
iommu_area_reserve(dom->bitmap, start_page, pages);
 }
 
-static void dma_ops_free_pagetable(struct dma_ops_domain *dma_dom)
+static void free_pagetable(struct protection_domain *domain)
 {
int i, j;
u64 *p1, *p2, *p3;
 
-   p1 = dma_dom->domain.pt_root;
+   p1 = domain->pt_root;
 
if (!p1)
return;
@@ -610,6 +610,8 @@ static void dma_ops_free_pagetable(struct dma_ops_domain *dma_dom)
}
 
free_page((unsigned long)p1);
+
+   domain->pt_root = NULL;
 }
 
 /*
@@ -621,7 +623,7 @@ static void dma_ops_domain_free(struct dma_ops_domain *dom)
if (!dom)
return;
 
-   dma_ops_free_pagetable(dom);
+   free_pagetable(&dom->domain);
 
kfree(dom->pte_pages);
 
-- 
1.5.6.4
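
As an illustration of what a generic page-table free routine has to do, here is a minimal user-space sketch. It assumes a three-level table with 512 entries per level and uses bit 0 as a toy "present" flag; struct toy_domain, table_alloc, table_link and toy_free_pagetable are hypothetical names for this example, not the kernel code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES 512             /* entries per table level (assumed) */
#define PRESENT ((uintptr_t)1)  /* toy "present" bit, not the hardware layout */

/* Hypothetical stand-in for struct protection_domain. */
struct toy_domain {
    uintptr_t *pt_root;
};

static uintptr_t *table_alloc(void)
{
    return calloc(ENTRIES, sizeof(uintptr_t));
}

/* Link a child table into a parent slot and mark the slot present. */
static void table_link(uintptr_t *parent, int idx, uintptr_t *child)
{
    /* calloc() memory is aligned, so bit 0 is free for the flag */
    assert(((uintptr_t)child & PRESENT) == 0);
    parent[idx] = (uintptr_t)child | PRESENT;
}

static uintptr_t *entry_to_table(uintptr_t entry)
{
    return (uintptr_t *)(entry & ~PRESENT);
}

/* Frees all three levels; returns the number of tables freed. */
static int toy_free_pagetable(struct toy_domain *dom)
{
    uintptr_t *p1 = dom->pt_root, *p2;
    int i, j, freed = 0;

    if (!p1)
        return 0;

    for (i = 0; i < ENTRIES; ++i) {
        if (!(p1[i] & PRESENT))
            continue;
        p2 = entry_to_table(p1[i]);
        for (j = 0; j < ENTRIES; ++j) {  /* inner loop advances j, not i */
            if (!(p2[j] & PRESENT))
                continue;
            free(entry_to_table(p2[j]));
            freed++;
        }
        free(p2);
        freed++;
    }

    free(p1);
    freed++;

    dom->pt_root = NULL;  /* a second call becomes a harmless no-op */
    return freed;
}
```

Clearing the root pointer at the end is what makes a repeated call safe in this sketch: the early return on a NULL root is taken instead of walking freed memory.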




[PATCH 03/19] AMD IOMMU: fix loop counter in free_pagetable function

2008-12-04 Thread Joerg Roedel
Impact: bugfix

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index b174471..f3dd4cd 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -599,7 +599,7 @@ static void dma_ops_free_pagetable(struct dma_ops_domain *dma_dom)
continue;
 
p2 = IOMMU_PTE_PAGE(p1[i]);
-   for (j = 0; j < 512; ++i) {
+   for (j = 0; j < 512; ++j) {
if (!IOMMU_PTE_PRESENT(p2[j]))
continue;
p3 = IOMMU_PTE_PAGE(p2[j]);
-- 
1.5.6.4




[PATCH 01/19] AMD IOMMU: rename iommu_map to iommu_map_page

2008-12-04 Thread Joerg Roedel
Impact: function rename

The iommu_map function maps only one page. Make this clear in the
function name.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index e4899e0..9e1c0fb 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -335,10 +335,10 @@ static void iommu_flush_tlb(struct amd_iommu *iommu, u16 domid)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-static int iommu_map(struct protection_domain *dom,
-unsigned long bus_addr,
-unsigned long phys_addr,
-int prot)
+static int iommu_map_page(struct protection_domain *dom,
+ unsigned long bus_addr,
+ unsigned long phys_addr,
+ int prot)
 {
u64 __pte, *pte, *page;
 
@@ -437,7 +437,7 @@ static int dma_ops_unity_map(struct dma_ops_domain *dma_dom,
 
for (addr = e->address_start; addr < e->address_end;
 addr += PAGE_SIZE) {
-   ret = iommu_map(&dma_dom->domain, addr, addr, e->prot);
+   ret = iommu_map_page(&dma_dom->domain, addr, addr, e->prot);
if (ret)
return ret;
/*
-- 
1.5.6.4




[PATCH 11/19] AMD IOMMU: add device reference counting for protection domains

2008-12-04 Thread Joerg Roedel
Impact: know how many devices are assigned to a domain

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/include/asm/amd_iommu_types.h |1 +
 arch/x86/kernel/amd_iommu.c|3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/amd_iommu_types.h b/arch/x86/include/asm/amd_iommu_types.h
index b74d35f..90f77de 100644
--- a/arch/x86/include/asm/amd_iommu_types.h
+++ b/arch/x86/include/asm/amd_iommu_types.h
@@ -203,6 +203,7 @@ struct protection_domain {
int mode;   /* paging mode (0-6 levels) */
u64 *pt_root;   /* page table root pointer */
unsigned long flags;/* flags to find out type of domain */
+   unsigned dev_cnt;   /* devices assigned to this domain */
void *priv; /* private data */
 };
 
diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index be626ad..afd2128 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -819,9 +819,10 @@ static void set_device_domain(struct amd_iommu *iommu,
  u16 devid)
 {
unsigned long flags;
-
u64 pte_root = virt_to_phys(domain->pt_root);
 
+   domain->dev_cnt += 1;
+
pte_root |= (domain->mode & DEV_ENTRY_MODE_MASK)
 << DEV_ENTRY_MODE_SHIFT;
pte_root |= IOMMU_PTE_IR | IOMMU_PTE_IW | IOMMU_PTE_P | IOMMU_PTE_TV;
-- 
1.5.6.4




[PATCH 19/19] AMD IOMMU: register functions for the IOMMU API

2008-12-04 Thread Joerg Roedel
Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 57e636e..d859323 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -38,6 +38,8 @@ static DEFINE_RWLOCK(amd_iommu_devtable_lock);
 static LIST_HEAD(iommu_pd_list);
 static DEFINE_SPINLOCK(iommu_pd_list_lock);
 
+static struct iommu_ops amd_iommu_ops;
+
 /*
  * general struct to manage commands send to an IOMMU
  */
@@ -1485,6 +1487,8 @@ int __init amd_iommu_init_dma_ops(void)
/* Make the driver finally visible to the drivers */
dma_ops = &amd_iommu_dma_ops;
 
+   register_iommu(&amd_iommu_ops);
+
return 0;
 
 free_domains:
@@ -1713,3 +1717,13 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 
return paddr;
 }
+
+static struct iommu_ops amd_iommu_ops = {
+   .domain_init = amd_iommu_domain_init,
+   .domain_destroy = amd_iommu_domain_destroy,
+   .attach_dev = amd_iommu_attach_device,
+   .detach_dev = amd_iommu_detach_device,
+   .map = amd_iommu_map_range,
+   .unmap = amd_iommu_unmap_range,
+   .iova_to_phys = amd_iommu_iova_to_phys,
+};
-- 
1.5.6.4
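
The registration pattern used here, a backend filling in a table of function pointers and handing it to a generic frontend, can be sketched in user space as follows. This is a minimal illustration under stated assumptions: struct toy_iommu_ops, toy_register_iommu and the backend_* functions are hypothetical names, and the single-backend restriction is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_domain {
    void *priv;  /* backend-private data */
};

/* Hypothetical, reduced version of an ops table. */
struct toy_iommu_ops {
    int  (*domain_init)(struct toy_domain *dom);
    void (*domain_destroy)(struct toy_domain *dom);
};

static struct toy_iommu_ops *registered_ops;

/* Frontend: remember which backend is in charge. */
static void toy_register_iommu(struct toy_iommu_ops *ops)
{
    assert(registered_ops == NULL);  /* one backend only in this sketch */
    registered_ops = ops;
}

/* Frontend entry points simply dispatch through the table. */
static int toy_domain_init(struct toy_domain *dom)
{
    if (!registered_ops)
        return -1;
    return registered_ops->domain_init(dom);
}

static void toy_domain_destroy(struct toy_domain *dom)
{
    if (registered_ops)
        registered_ops->domain_destroy(dom);
}

/* Backend: the part a hardware driver would provide. */
static int backend_marker;

static int backend_domain_init(struct toy_domain *dom)
{
    dom->priv = &backend_marker;
    return 0;
}

static void backend_domain_destroy(struct toy_domain *dom)
{
    dom->priv = NULL;
}

static struct toy_iommu_ops backend_ops = {
    .domain_init    = backend_domain_init,
    .domain_destroy = backend_domain_destroy,
};
```

Callers only ever use the frontend functions, so they do not need to know which hardware driver registered itself.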




[PATCH 10/19] AMD IOMMU: add checks for dma_ops domain to dma_ops functions

2008-12-04 Thread Joerg Roedel
Impact: detect when a driver uses a device assigned otherwise

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   21 +
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 07ef5e9..be626ad 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -786,6 +786,15 @@ free_dma_dom:
 }
 
 /*
+ * little helper function to check whether a given protection domain is a
+ * dma_ops domain
+ */
+static bool dma_ops_domain(struct protection_domain *domain)
+{
+   return domain->flags & PD_DMA_OPS_MASK;
+}
+
+/*
  * Find out the protection domain structure for a given PCI device. This
  * will give us the pointer to the page table root for example.
  */
@@ -1089,6 +1098,8 @@ static dma_addr_t map_single(struct device *dev, phys_addr_t paddr,
/* device not handled by any AMD IOMMU */
return (dma_addr_t)paddr;
 
+   BUG_ON(!dma_ops_domain(domain));
+
spin_lock_irqsave(&domain->lock, flags);
addr = __map_single(dev, iommu, domain->priv, paddr, size, dir, false,
dma_mask);
@@ -1120,6 +1131,8 @@ static void unmap_single(struct device *dev, dma_addr_t dma_addr,
/* device not handled by any AMD IOMMU */
return;
 
+   BUG_ON(!dma_ops_domain(domain));
+
spin_lock_irqsave(&domain->lock, flags);
 
__unmap_single(iommu, domain->priv, dma_addr, size, dir);
@@ -1175,6 +1188,8 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
if (!iommu || !domain)
return map_sg_no_iommu(dev, sglist, nelems, dir);
 
+   BUG_ON(!dma_ops_domain(domain));
+
spin_lock_irqsave(&domain->lock, flags);
 
for_each_sg(sglist, s, nelems, i) {
@@ -1229,6 +1244,8 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
!get_device_resources(dev, &iommu, &domain, &devid))
return;
 
+   BUG_ON(!dma_ops_domain(domain));
+
spin_lock_irqsave(&domain->lock, flags);
 
for_each_sg(sglist, s, nelems, i) {
@@ -1275,6 +1292,8 @@ static void *alloc_coherent(struct device *dev, size_t size,
return virt_addr;
}
 
+   BUG_ON(!dma_ops_domain(domain));
+
if (!dma_mask)
dma_mask = *dev->dma_mask;
 
@@ -1317,6 +1336,8 @@ static void free_coherent(struct device *dev, size_t size,
if (!iommu || !domain)
goto free_mem;
 
+   BUG_ON(!dma_ops_domain(domain));
+
spin_lock_irqsave(&domain->lock, flags);
 
__unmap_single(iommu, domain->priv, dma_addr, size, DMA_BIDIRECTIONAL);
-- 
1.5.6.4
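
The check added to each dma_ops entry point boils down to testing one flag bit on the domain. A minimal user-space sketch of that idea, assuming a single flag in bit 0; the toy_* names are hypothetical, and the sketch returns an error code where the patch uses BUG_ON():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PD_DMA_OPS_MASK (1UL << 0)  /* domain is owned by the DMA layer */

/* Hypothetical, reduced protection domain: only the flags matter here. */
struct toy_protection_domain {
    unsigned long flags;
};

static bool toy_is_dma_ops_domain(const struct toy_protection_domain *d)
{
    return (d->flags & TOY_PD_DMA_OPS_MASK) != 0;
}

/*
 * Stand-in for a DMA mapping entry point. Returns 0 on success and -1
 * if the device sits in a domain that was assigned for another purpose.
 */
static int toy_map_single(const struct toy_protection_domain *d)
{
    assert(d != NULL);

    if (!toy_is_dma_ops_domain(d))
        return -1;

    /* ... the actual mapping work would happen here ... */
    return 0;
}
```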




[PATCH 02/19] AMD IOMMU: fix iommu_map_page function

2008-12-04 Thread Joerg Roedel
Impact: bugfix in iommu_map_page function

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 9e1c0fb..b174471 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -343,7 +343,7 @@ static int iommu_map_page(struct protection_domain *dom,
u64 __pte, *pte, *page;
 
bus_addr  = PAGE_ALIGN(bus_addr);
-   phys_addr = PAGE_ALIGN(bus_addr);
+   phys_addr = PAGE_ALIGN(phys_addr);
 
/* only support 512GB address spaces for now */
if (bus_addr > IOMMU_MAP_SIZE_L3 || !(prot & IOMMU_PROT_MASK))
-- 
1.5.6.4
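
To make the effect of the one-line fix concrete, here is a small user-space sketch that contrasts the two variants. It assumes 4 KiB pages and a round-up alignment macro; TOY_PAGE_ALIGN, struct toy_mapping and the prepare_* helpers are hypothetical names for this illustration.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_ALIGN(a) (((a) + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1))

struct toy_mapping {
    unsigned long bus_addr;
    unsigned long phys_addr;
};

/* Before the fix: the physical address is derived from the bus address. */
static struct toy_mapping prepare_buggy(unsigned long bus, unsigned long phys)
{
    struct toy_mapping m;

    (void)phys;                            /* silently ignored */
    m.bus_addr  = TOY_PAGE_ALIGN(bus);
    m.phys_addr = TOY_PAGE_ALIGN(bus);     /* copy-and-paste error */
    return m;
}

/* After the fix: each address is aligned from its own input. */
static struct toy_mapping prepare_fixed(unsigned long bus, unsigned long phys)
{
    struct toy_mapping m;

    m.bus_addr  = TOY_PAGE_ALIGN(bus);
    m.phys_addr = TOY_PAGE_ALIGN(phys);
    assert(m.phys_addr % TOY_PAGE_SIZE == 0);
    return m;
}
```

With the buggy variant every mapping degenerates into an identity mapping, which goes unnoticed as long as all callers pass the same value for both addresses.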




[PATCH 17/19] AMD IOMMU: add domain unmap function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function to unmap pages from protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   39 +++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 8391b09..6898d10 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -437,6 +437,28 @@ static int iommu_map_page(struct protection_domain *dom,
return 0;
 }
 
+static void iommu_unmap_page(struct protection_domain *dom,
+unsigned long bus_addr)
+{
+   u64 *pte;
+
+   pte = &dom->pt_root[IOMMU_PTE_L2_INDEX(bus_addr)];
+
+   if (!IOMMU_PTE_PRESENT(*pte))
+   return;
+
+   pte = IOMMU_PTE_PAGE(*pte);
+   pte = &pte[IOMMU_PTE_L1_INDEX(bus_addr)];
+
+   if (!IOMMU_PTE_PRESENT(*pte))
+   return;
+
+   pte = IOMMU_PTE_PAGE(*pte);
+   pte = &pte[IOMMU_PTE_L1_INDEX(bus_addr)];
+
+   *pte = 0;
+}
+
 /*
  * This function checks if a specific unity mapping entry is needed for
  * this specific IOMMU.
@@ -1643,3 +1665,20 @@ static int amd_iommu_map_range(struct iommu_domain *dom,
 
return 0;
 }
+
+static void amd_iommu_unmap_range(struct iommu_domain *dom,
+ unsigned long iova, size_t size)
+{
+
+   struct protection_domain *domain = dom->priv;
+   unsigned long i,  npages = iommu_num_pages(iova, size, PAGE_SIZE);
+
+   iova  &= PAGE_MASK;
+
+   for (i = 0; i < npages; ++i) {
+   iommu_unmap_page(domain, iova);
+   iova  += PAGE_SIZE;
+   }
+
+   iommu_flush_domain(domain->id);
+}
-- 
1.5.6.4
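
The range loop relies on knowing how many pages an (address, size) pair touches. A minimal user-space sketch of that arithmetic, assuming 4 KiB pages; toy_num_pages is a hypothetical stand-in written for this example rather than the kernel helper, and record_last is only a sample callback:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_MASK (~(TOY_PAGE_SIZE - 1))

/*
 * Pages touched by [addr, addr + size): the offset into the first page
 * plus the length, rounded up to whole pages.
 */
static unsigned long toy_num_pages(unsigned long addr, unsigned long size)
{
    unsigned long span = (addr & ~TOY_PAGE_MASK) + size;

    return (span + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/* Visit every page of the range once; returns the number of pages. */
static unsigned long toy_for_each_page(unsigned long iova, unsigned long size,
                                       void (*fn)(unsigned long page, void *ctx),
                                       void *ctx)
{
    unsigned long i, npages = toy_num_pages(iova, size);

    assert(fn != NULL);

    iova &= TOY_PAGE_MASK;          /* start at the page boundary */
    for (i = 0; i < npages; ++i) {
        fn(iova, ctx);
        iova += TOY_PAGE_SIZE;
    }
    return npages;
}

/* Sample callback: remember the last page address that was visited. */
static void record_last(unsigned long page, void *ctx)
{
    *(unsigned long *)ctx = page;
}
```

Note that a short range can still span two pages when it straddles a page boundary, which is why the offset into the first page is part of the calculation.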




[PATCH 06/19] AMD IOMMU: refactor completion wait handling into separate functions

2008-12-04 Thread Joerg Roedel
Impact: split one function into three

The separate functions are required to synchronize commands across all
hardware IOMMUs in the system.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   67 +--
 1 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index b774638..ac801f1 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -193,31 +193,14 @@ static int iommu_queue_command(struct amd_iommu *iommu, struct iommu_cmd *cmd)
 }
 
 /*
- * This function is called whenever we need to ensure that the IOMMU has
- * completed execution of all commands we sent. It sends a
- * COMPLETION_WAIT command and waits for it to finish. The IOMMU informs
- * us about that by writing a value to a physical address we pass with
- * the command.
+ * This function waits until an IOMMU has completed a completion
+ * wait command
  */
-static int iommu_completion_wait(struct amd_iommu *iommu)
+static void __iommu_wait_for_completion(struct amd_iommu *iommu)
 {
-   int ret = 0, ready = 0;
+   int ready = 0;
unsigned status = 0;
-   struct iommu_cmd cmd;
-   unsigned long flags, i = 0;
-
-   memset(&cmd, 0, sizeof(cmd));
-   cmd.data[0] = CMD_COMPL_WAIT_INT_MASK;
-   CMD_SET_TYPE(&cmd, CMD_COMPL_WAIT);
-
-   iommu->need_sync = 0;
-
-   spin_lock_irqsave(&iommu->lock, flags);
-
-   ret = __iommu_queue_command(iommu, &cmd);
-
-   if (ret)
-   goto out;
+   unsigned long i = 0;
 
while (!ready && (i < EXIT_LOOP_COUNT)) {
++i;
@@ -232,6 +215,46 @@ static int iommu_completion_wait(struct amd_iommu *iommu)
 
if (unlikely((i == EXIT_LOOP_COUNT) && printk_ratelimit()))
printk(KERN_WARNING "AMD IOMMU: Completion wait loop failed\n");
+}
+
+/*
+ * This function queues a completion wait command into the command
+ * buffer of an IOMMU
+ */
+static int __iommu_completion_wait(struct amd_iommu *iommu)
+{
+
+   struct iommu_cmd cmd;
+
+   memset(&cmd, 0, sizeof(cmd));
+   cmd.data[0] = CMD_COMPL_WAIT_INT_MASK;
+   CMD_SET_TYPE(&cmd, CMD_COMPL_WAIT);
+
+   return __iommu_queue_command(iommu, &cmd);
+}
+
+/*
+ * This function is called whenever we need to ensure that the IOMMU has
+ * completed execution of all commands we sent. It sends a
+ * COMPLETION_WAIT command and waits for it to finish. The IOMMU informs
+ * us about that by writing a value to a physical address we pass with
+ * the command.
+ */
+static int iommu_completion_wait(struct amd_iommu *iommu)
+{
+   int ret;
+   unsigned long flags;
+
+   iommu->need_sync = 0;
+
+   spin_lock_irqsave(&iommu->lock, flags);
+
+   ret = __iommu_completion_wait(iommu);
+
+   if (ret)
+   goto out;
+
+   __iommu_wait_for_completion(iommu);
 out:
spin_unlock_irqrestore(&iommu->lock, flags);
 
-- 
1.5.6.4
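
One way the split into "queue" and "wait" steps could be used for system-wide synchronization is to queue the wait command on every IOMMU first and only then wait on each of them. The following user-space model is a sketch of that idea only; struct toy_iommu and the toy_* functions are hypothetical, and the "hardware" is simulated by a counter.

```c
#include <assert.h>

#define TOY_MAX_IOMMUS 4

/* Hypothetical model of one IOMMU's command processing state. */
struct toy_iommu {
    int queued;     /* completion-wait commands queued so far */
    int completed;  /* completion-wait commands the device has finished */
};

/* Step 1: put a completion-wait command into the command buffer. */
static int toy_queue_completion_wait(struct toy_iommu *iommu)
{
    iommu->queued++;
    return 0;
}

/* Step 2: wait until the device reports completion (simulated here). */
static void toy_wait_for_completion(struct toy_iommu *iommu)
{
    while (iommu->completed < iommu->queued)
        iommu->completed++;  /* stand-in for hardware progress */
}

/* Queue on all IOMMUs first, then wait, so the devices work in parallel. */
static int toy_sync_all(struct toy_iommu *iommus, int n)
{
    int i, ret;

    assert(n <= TOY_MAX_IOMMUS);

    for (i = 0; i < n; ++i) {
        ret = toy_queue_completion_wait(&iommus[i]);
        if (ret)
            return ret;
    }

    for (i = 0; i < n; ++i)
        toy_wait_for_completion(&iommus[i]);

    return 0;
}
```

With a single combined function, the second IOMMU could not even start on its command before the first one had finished.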




[PATCH 12/19] AMD IOMMU: add domain init function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function for allocating protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   38 ++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index afd2128..5d56cba 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -22,6 +22,7 @@
 #include <linux/bitops.h>
 #include <linux/scatterlist.h>
 #include <linux/iommu-helper.h>
+#include <linux/iommu.h>
 #include <asm/proto.h>
 #include <asm/iommu.h>
 #include <asm/amd_iommu_types.h>
@@ -1473,3 +1474,40 @@ free_domains:
 
return ret;
 }
+
+/*
+ *
+ * The following functions belong to the exported interface of AMD IOMMU
+ *
+ * This interface allows access to lower level functions of the IOMMU
+ * like protection domain handling and assignment of devices to domains
+ * which is not possible with the dma_ops interface.
+ *
+ */
+
+static int amd_iommu_domain_init(struct iommu_domain *dom)
+{
+   struct protection_domain *domain;
+
+   domain = kzalloc(sizeof(*domain), GFP_KERNEL);
+   if (!domain)
+   return -ENOMEM;
+
+   spin_lock_init(&domain->lock);
+   domain->mode = PAGE_MODE_3_LEVEL;
+   domain->id = domain_id_alloc();
+   if (!domain->id)
+   goto out_free;
+   domain->pt_root = (void *)get_zeroed_page(GFP_KERNEL);
+   if (!domain->pt_root)
+   goto out_free;
+
+   dom->priv = domain;
+
+   return 0;
+
+out_free:
+   kfree(domain);
+
+   return -ENOMEM;
+}
-- 
1.5.6.4




[PATCH 16/19] AMD IOMMU: add domain map function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function to map pages into protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   28 
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index d3fee3e..8391b09 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -1615,3 +1615,31 @@ static int amd_iommu_attach_device(struct iommu_domain *dom,
return 0;
 }
 
+static int amd_iommu_map_range(struct iommu_domain *dom,
+  unsigned long iova, phys_addr_t paddr,
+  size_t size, int iommu_prot)
+{
+   struct protection_domain *domain = dom->priv;
+   unsigned long i,  npages = iommu_num_pages(paddr, size, PAGE_SIZE);
+   int prot = 0;
+   int ret;
+
+   if (iommu_prot & IOMMU_READ)
+   prot |= IOMMU_PROT_IR;
+   if (iommu_prot & IOMMU_WRITE)
+   prot |= IOMMU_PROT_IW;
+
+   iova  &= PAGE_MASK;
+   paddr &= PAGE_MASK;
+
+   for (i = 0; i < npages; ++i) {
+   ret = iommu_map_page(domain, iova, paddr, prot);
+   if (ret)
+   return ret;
+
+   iova  += PAGE_SIZE;
+   paddr += PAGE_SIZE;
+   }
+
+   return 0;
+}
-- 
1.5.6.4
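
The first step of the map function translates the generic permission flags of the API into the bits the hardware driver uses internally. A minimal sketch of that translation follows; all flag values and TOY_* names are hypothetical and were chosen to differ on purpose, so that the translation is visible.

```c
#include <assert.h>

/* Generic API flags (hypothetical values for this sketch). */
#define TOY_IOMMU_READ   (1 << 0)
#define TOY_IOMMU_WRITE  (1 << 1)

/* Driver-internal permission bits (hypothetical positions). */
#define TOY_HW_IR        (1 << 4)
#define TOY_HW_IW        (1 << 5)

/* Map generic read/write permissions onto driver-internal bits. */
static int toy_translate_prot(int iommu_prot)
{
    int prot = 0;

    /* reject flags this sketch does not know about */
    assert((iommu_prot & ~(TOY_IOMMU_READ | TOY_IOMMU_WRITE)) == 0);

    if (iommu_prot & TOY_IOMMU_READ)
        prot |= TOY_HW_IR;
    if (iommu_prot & TOY_IOMMU_WRITE)
        prot |= TOY_HW_IW;

    return prot;
}
```

Keeping the two flag namespaces separate means the generic API does not have to mirror the bit layout of any particular IOMMU.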




[PATCH 15/19] AMD IOMMU: add device attach function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function to attach devices to protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   35 +++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 0ded6f4..d3fee3e 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -1580,3 +1580,38 @@ static void amd_iommu_detach_device(struct iommu_domain *dom,
iommu_completion_wait(iommu);
 }
 
+static int amd_iommu_attach_device(struct iommu_domain *dom,
+  struct device *dev)
+{
+   struct protection_domain *domain = dom->priv;
+   struct protection_domain *old_domain;
+   struct amd_iommu *iommu;
+   struct pci_dev *pdev;
+   u16 devid;
+
+   if (dev->bus != &pci_bus_type)
+   return -EINVAL;
+
+   pdev = to_pci_dev(dev);
+
+   devid = calc_devid(pdev->bus->number, pdev->devfn);
+
+   if (devid >= amd_iommu_last_bdf ||
+   devid != amd_iommu_alias_table[devid])
+   return -EINVAL;
+
+   iommu = amd_iommu_rlookup_table[devid];
+   if (!iommu)
+   return -EINVAL;
+
+   old_domain = domain_for_device(devid);
+   if (old_domain)
+   __detach_device(old_domain, devid);
+
+   set_device_domain(iommu, domain, devid);
+
+   iommu_completion_wait(iommu);
+
+   return 0;
+}
+
-- 
1.5.6.4




[PATCH v2 0/3] KVM-userspace: Improved guest debugging / debug register emulation

2008-12-04 Thread Jan Kiszka
Changes since last round:
 - Rebased over last QEMU merge (eliminates 2 patches)
 - Fix for building against kernel headers with old guest debugging

As usual, find the patches also at git://git.kiszka.org/kvm-userspace.git gdb-queue

Jan

--
Siemens AG, Corporate Technology, CT SE 26
Corporate Competence Center Embedded Linux


[PATCH 18/19] AMD IOMMU: add domain address lookup function for IOMMU API

2008-12-04 Thread Joerg Roedel
Impact: add a generic function to look up addresses in protection domains

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kernel/amd_iommu.c |   31 +++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 6898d10..57e636e 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -1682,3 +1682,34 @@ static void amd_iommu_unmap_range(struct iommu_domain *dom,
 
iommu_flush_domain(domain->id);
 }
+
+static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
+ unsigned long iova)
+{
+   struct protection_domain *domain = dom->priv;
+   unsigned long offset = iova & ~PAGE_MASK;
+   phys_addr_t paddr;
+   u64 *pte;
+
+   pte = &domain->pt_root[IOMMU_PTE_L2_INDEX(iova)];
+
+   if (!IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   pte = IOMMU_PTE_PAGE(*pte);
+   pte = &pte[IOMMU_PTE_L1_INDEX(iova)];
+
+   if (!IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   pte = IOMMU_PTE_PAGE(*pte);
+   pte = &pte[IOMMU_PTE_L0_INDEX(iova)];
+
+   if (!IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   paddr  = *pte & IOMMU_PAGE_MASK;
+   paddr |= offset;
+
+   return paddr;
+}
-- 
1.5.6.4
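
The lookup is a plain three-level table walk. The following user-space sketch shows the same walk end to end, under these assumptions: 4 KiB pages, 9 index bits per level, and bit 0 as a toy "present" flag. All names (IDX, next_level, toy_map_page, toy_iova_to_phys) are hypothetical, and table pointers are stored as ordinary process pointers rather than physical addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LVL_ENTRIES 512
#define PG_SHIFT    12
#define PG_SIZE     (1ULL << PG_SHIFT)
#define PG_MASK     (~(PG_SIZE - 1))
#define P_BIT       1ULL  /* toy "present" bit */

/* Index into the table at the given level (2 = root, 0 = leaf). */
#define IDX(addr, lvl) ((unsigned)(((addr) >> (PG_SHIFT + 9 * (lvl))) & 0x1ff))

/* Follow (and optionally create) the next-level table behind table[idx]. */
static uint64_t *next_level(uint64_t *table, unsigned idx, int alloc)
{
    if (!(table[idx] & P_BIT)) {
        uint64_t *child;

        if (!alloc)
            return NULL;
        child = calloc(LVL_ENTRIES, sizeof(*child));
        if (!child)
            return NULL;
        table[idx] = (uint64_t)(uintptr_t)child | P_BIT;
    }
    return (uint64_t *)(uintptr_t)(table[idx] & ~P_BIT);
}

/* Install a single 4 KiB translation iova -> paddr. */
static int toy_map_page(uint64_t *root, uint64_t iova, uint64_t paddr)
{
    uint64_t *l1, *l0;

    assert(root != NULL);

    l1 = next_level(root, IDX(iova, 2), 1);
    if (!l1)
        return -1;
    l0 = next_level(l1, IDX(iova, 1), 1);
    if (!l0)
        return -1;

    l0[IDX(iova, 0)] = (paddr & PG_MASK) | P_BIT;
    return 0;
}

/* Walk the table; returns 0 if no translation exists. */
static uint64_t toy_iova_to_phys(uint64_t *root, uint64_t iova)
{
    uint64_t *l1, *l0, pte;

    l1 = next_level(root, IDX(iova, 2), 0);
    if (!l1)
        return 0;
    l0 = next_level(l1, IDX(iova, 1), 0);
    if (!l0)
        return 0;

    pte = l0[IDX(iova, 0)];
    if (!(pte & P_BIT))
        return 0;

    /* page frame from the PTE, byte offset from the input address */
    return (pte & PG_MASK) | (iova & ~PG_MASK);
}
```

As in the patch, a return value of 0 doubles as "not mapped", so a caller cannot distinguish that case from a mapping to physical address 0.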




[PATCH v2 1/3] kvm-userspace: Switch to new guest debug interface

2008-12-04 Thread Jan Kiszka
This patch switches both libkvm and the qemu pieces over to the
new guest debug interface. It comes with full support for software-based
breakpoints (via guest code modification), hardware-assisted breakpoints
and watchpoints (x86-only so far).

Breakpoint management is done inside qemu-kvm, transparently to gdbstub
and without the gdb frontend taking over. This allows for
running debuggers inside the guest while guest debugging is active,
because the host can cleanly tell apart host- and guest-originated
breakpoint events.

Still to be improved are x86 corner cases when using single-step (forgotten
debug flags on the guest's stack). And, of course, the still-empty non-x86
helper functions have to be populated.

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---

 libkvm/kvm-common.h |2 
 libkvm/libkvm.c |   18 +++-
 libkvm/libkvm.h |9 ++
 qemu/exec.c |   18 ++--
 qemu/gdbstub.c  |   23 +++--
 qemu/gdbstub.h  |7 ++
 qemu/qemu-kvm-ia64.c|   35 
 qemu/qemu-kvm-powerpc.c |   35 
 qemu/qemu-kvm-x86.c |  172 +
 qemu/qemu-kvm.c |  199 +++
 qemu/qemu-kvm.h |   33 
 user/main.c |7 +-
 12 files changed, 496 insertions(+), 62 deletions(-)

diff --git a/libkvm/kvm-common.h b/libkvm/kvm-common.h
index 9dae17b..c5beacc 100644
--- a/libkvm/kvm-common.h
+++ b/libkvm/kvm-common.h
@@ -88,7 +88,7 @@ int handle_shutdown(kvm_context_t kvm, void *env);
 void post_kvm_run(kvm_context_t kvm, void *env);
 int pre_kvm_run(kvm_context_t kvm, void *env);
 int handle_io_window(kvm_context_t kvm);
-int handle_debug(kvm_context_t kvm, void *env);
+int handle_debug(kvm_context_t kvm, int vcpu, void *env);
 int try_push_interrupts(kvm_context_t kvm);
 
 #endif
diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 40c95ce..01324bd 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -738,9 +738,15 @@ static int handle_io(kvm_context_t kvm, struct kvm_run *run, int vcpu)
return 0;
 }
 
-int handle_debug(kvm_context_t kvm, void *env)
+int handle_debug(kvm_context_t kvm, int vcpu, void *env)
 {
-   return kvm->callbacks->debug(kvm->opaque, env);
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+struct kvm_run *run = kvm->run[vcpu];
+
+return kvm->callbacks->debug(kvm->opaque, env, &run->debug.arch);
+#else
+return 0;
+#endif
 }
 
 int kvm_get_regs(kvm_context_t kvm, int vcpu, struct kvm_regs *regs)
@@ -937,7 +943,7 @@ again:
r = handle_io(kvm, run, vcpu);
break;
case KVM_EXIT_DEBUG:
-   r = handle_debug(kvm, env);
+   r = handle_debug(kvm, vcpu, env);
break;
case KVM_EXIT_MMIO:
r = handle_mmio(kvm, run);
@@ -982,10 +988,12 @@ int kvm_inject_irq(kvm_context_t kvm, int vcpu, unsigned irq)
return ioctl(kvm->vcpu_fd[vcpu], KVM_INTERRUPT, &intr);
 }
 
-int kvm_guest_debug(kvm_context_t kvm, int vcpu, struct kvm_debug_guest *dbg)
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+int kvm_set_guest_debug(kvm_context_t kvm, int vcpu, struct kvm_guest_debug *dbg)
 {
-   return ioctl(kvm->vcpu_fd[vcpu], KVM_DEBUG_GUEST, dbg);
+   return ioctl(kvm->vcpu_fd[vcpu], KVM_SET_GUEST_DEBUG, dbg);
 }
+#endif
 
 int kvm_set_signal_mask(kvm_context_t kvm, int vcpu, const sigset_t *sigset)
 {
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index aaad4fb..4304983 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -55,7 +55,10 @@ struct kvm_callbacks {
/// generic memory writes to unmapped memory (For MMIO devices)
 int (*mmio_write)(void *opaque, uint64_t addr, uint8_t *data,
int len);
-int (*debug)(void *opaque, void *env);
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+int (*debug)(void *opaque, void *env,
+struct kvm_debug_exit_arch *arch_info);
+#endif
/*!
 * \brief Called when the VCPU issues an 'hlt' instruction.
 *
@@ -348,7 +351,9 @@ static inline int kvm_reset_mpstate(kvm_context_t kvm, int vcpu)
  */
 int kvm_inject_irq(kvm_context_t kvm, int vcpu, unsigned irq);
 
-int kvm_guest_debug(kvm_context_t, int vcpu, struct kvm_debug_guest *dbg);
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+int kvm_set_guest_debug(kvm_context_t, int vcpu, struct kvm_guest_debug *dbg);
+#endif
 
 #if defined(__i386__) || defined(__x86_64__)
 /*!
diff --git a/qemu/exec.c b/qemu/exec.c
index c699043..0c71eb8 100644
--- a/qemu/exec.c
+++ b/qemu/exec.c
@@ -1407,9 +1407,6 @@ int cpu_breakpoint_insert(CPUState *env, target_ulong pc, int flags,
 else
 TAILQ_INSERT_TAIL(&env->breakpoints, bp, entry);
 
-if (kvm_enabled())
-   kvm_update_debugger(env);
-
 breakpoint_invalidate(env, pc);
 
 if (breakpoint)
@@ -1444,9 +1441,6 @@ void cpu_breakpoint_remove_by_ref(CPUState *env, CPUBreakpoint *breakpoint)
 #if 

[PATCH v2 3/3] kvm-userspace: Provide compat wrapper for set_debugreg

2008-12-04 Thread Jan Kiszka
Older set_debugreg macros did not allow passing the register number as a
constant (without additional typecasting). Catch this, as the latest kvm
debug changes make use of this property.

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---

 kernel/x86/external-module-compat.h |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/x86/external-module-compat.h b/kernel/x86/external-module-compat.h
index 1055050..5489f47 100644
--- a/kernel/x86/external-module-compat.h
+++ b/kernel/x86/external-module-compat.h
@@ -333,6 +333,16 @@ struct kvm_desc_ptr {
 #define FEATURE_CONTROL_VMXON_ENABLED  (1<<2)
 #endif
 
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,25) && defined(__x86_64__)
+
+#undef set_debugreg
+#define set_debugreg(value, register) \
+   __asm__("movq %0,%%db" #register \
+   : /* no output */ \
+   :"r" ((unsigned long)value))
+
+#endif
+
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,29)
 
 struct mtrr_var_range {
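
The compat macro builds the instruction text at preprocessing time: the register argument is stringified and glued onto the mnemonic by string-literal concatenation. A small user-space sketch of just that mechanism, without any inline assembly; TOY_DBREG_INSN and the helper functions are hypothetical names for this illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * "#reg" turns the macro argument into a string literal, and adjacent
 * string literals are concatenated by the compiler.
 */
#define TOY_DBREG_INSN(reg) "movq %0,%%db" #reg

/* The instruction text that would be emitted for debug register 7. */
static const char *toy_insn_for_dr7(void)
{
    return TOY_DBREG_INSN(7);
}

/* Sanity helper: the generated text must end in a digit between 0 and 7. */
static const char *toy_insn_checked(const char *insn)
{
    size_t n = strlen(insn);

    assert(n > 0 && insn[n - 1] >= '0' && insn[n - 1] <= '7');
    return insn;
}
```

Because the pasting happens in the preprocessor, the register argument has to be a literal token; a runtime variable would be stringified by name and produce an invalid instruction.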



[PATCH v2 2/3] kvm-userspace: Remove obsolete special_reload_dr7 hack

2008-12-04 Thread Jan Kiszka
Host debug registers are now properly saved and restored before/after
entering the guest.

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---

 kernel/x86/external-module-compat.h |2 --
 kernel/x86/hack-module.awk  |4 
 kernel/x86/preempt.c|6 --
 3 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/kernel/x86/external-module-compat.h b/kernel/x86/external-module-compat.h
index b5e11e2..1055050 100644
--- a/kernel/x86/external-module-compat.h
+++ b/kernel/x86/external-module-compat.h
@@ -171,7 +171,6 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier,
 void start_special_insn(void);
 void end_special_insn(void);
 void in_special_section(void);
-void special_reload_dr7(void);
 
 void preempt_notifier_sys_init(void);
 void preempt_notifier_sys_exit(void);
@@ -181,7 +180,6 @@ void preempt_notifier_sys_exit(void);
 static inline void start_special_insn(void) {}
 static inline void end_special_insn(void) {}
 static inline void in_special_section(void) {}
-static inline void special_reload_dr7(void) {}
 
 static inline void preempt_notifier_sys_init(void) {}
 static inline void preempt_notifier_sys_exit(void) {}
diff --git a/kernel/x86/hack-module.awk b/kernel/x86/hack-module.awk
index a112d0b..fe38059 100644
--- a/kernel/x86/hack-module.awk
+++ b/kernel/x86/hack-module.awk
@@ -75,10 +75,6 @@ BEGIN { split("INIT_WORK tsc_khz desc_struct ldttss_desc64 desc_ptr " \
 
 { print }
 
-/kvm_x86_ops->run/ {
-print "\tspecial_reload_dr7();"
-}
-
 /unsigned long flags;/ && vmx_load_host_state {
 print "\tunsigned long gsbase;"
 }
diff --git a/kernel/x86/preempt.c b/kernel/x86/preempt.c
index 9e4bd2c..3112879 100644
--- a/kernel/x86/preempt.c
+++ b/kernel/x86/preempt.c
@@ -40,12 +40,6 @@ static void preempt_enable_sched_in_notifiers(void * addr)
 #endif
 }
 
-void special_reload_dr7(void)
-{
-   asm volatile ("mov %0, %%db7" : : "r"(0x701ul));
-}
-EXPORT_SYMBOL_GPL(special_reload_dr7);
-
 static void __preempt_disable_notifiers(void)
 {
asm volatile ("mov %0, %%db7" : : "r"(0ul));



vlan option is misleading, what about vnet ?

2008-12-04 Thread Luca Bigliardi

Hi qemu/kvm devels!

I think the vlan network option is misleading, because it refers to a
virtual network and not to 802.1Q.

I'd like to rename it to vnet (or something similar; any ideas?) and then
mark the vlan option as deprecated somehow.

Is that OK with you?


Thank you,

luca

-- 
Beware of programmers who carry screwdrivers.
-- Leonard Brandwein

http://shammash.homelinux.org/ - http://www.artha.org/ - http://www.yue.it/


Re: [PATCH 03/11] add frontend implementation for the IOMMU API

2008-12-04 Thread Greg KH
On Thu, Dec 04, 2008 at 06:28:48PM +0100, Joerg Roedel wrote:
> This API can be used by KVM for accessing different types of IOMMUs to
> do device passthrough to guests. Beside that this API can also be used
> by device drivers to map non-linear host memory into dma-linear
> addresses to prevent scatter-gather DMA. UIO may be another user for
> this API.
>
> Signed-off-by: Joerg Roedel [EMAIL PROTECTED]

Acked-by: Greg Kroah-Hartman [EMAIL PROTECTED]



Re: [PATCH] [v2] Remove TARGET_PAGE_SIZE from virtio interface

2008-12-04 Thread Anthony Liguori

Hollis Blanchard wrote:

> TARGET_PAGE_SIZE should only be used internal to qemu, not in guest/host
> interfaces. The virtio frontend code in Linux uses two constants (PFN shift
> and vring alignment) for the interface, so update qemu to match.
>
> I've tested this with PowerPC KVM and confirmed that it fixes virtio problems
> when using non-TARGET_PAGE_SIZE pages in the guest.


Applied.  Thanks.

Regards,

Anthony Liguori



Re: [Qemu-devel] [5874] Add virtio-balloon support

2008-12-04 Thread Hollis Blanchard
On Thu, 2008-12-04 at 20:33 +, Anthony Liguori wrote:
 
> +static void balloon_page(void *addr, int deflate)
> +{
> +#if defined(__linux__)
> +    if (!kvm_enabled() || kvm_has_sync_mmu())
> +        madvise(addr, TARGET_PAGE_SIZE,
> +                deflate ? MADV_WILLNEED : MADV_DONTNEED);
> +#endif
> +}

Hmm, I just noticed this... we need to use VIRTIO_BALLOON_PFN_SHIFT like
Rusty did on the kernel side.

However, in general I'm not sure how this is supposed to work. Isn't it
true that madvise() is a no-op if 0 < length < getpagesize()? If so, how
should the guest know the chunk size needed on the host?

What happens when a guest tries to balloon 4K pages when it's backed on
the host by hugetlbfs? We can't even use getpagesize() there.

Maybe the virtio balloon interface needs to advertise a unit size from
the host, and use that size instead of alloc_page() in the guest?
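
One possible reading of the "unit size" idea, sketched in user space. This
is only an illustration under assumptions: the balloon is taken to hand
over 4K page frame numbers (a shift of 12), the host page size is passed in
by the caller, and both function names are hypothetical. The host would
release a host page only once every 4K piece of it has been ballooned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BALLOON_PFN_SHIFT 12   /* assumed fixed 4K balloon unit */

/* Number of 4K balloon pages that make up one host page. */
static size_t balloon_pages_per_host_page(size_t host_page_size)
{
    assert(host_page_size >= ((size_t)1 << BALLOON_PFN_SHIFT));
    return host_page_size >> BALLOON_PFN_SHIFT;
}

/* Returns 1 if every balloon PFN inside the host page that contains `pfn`
 * appears in pfns[0..n), i.e. the whole host page could be released. */
static int host_page_fully_ballooned(const uint32_t *pfns, size_t n,
                                     uint32_t pfn, size_t host_page_size)
{
    size_t per_host = balloon_pages_per_host_page(host_page_size);
    uint32_t base = pfn - (uint32_t)(pfn % per_host);

    for (size_t want = 0; want < per_host; want++) {
        int seen = 0;

        for (size_t i = 0; i < n; i++) {
            if (pfns[i] == base + want) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            return 0;
    }
    return 1;
}
```

With a 16K host page, ballooning PFNs 4, 5 and 7 is not enough; PFN 6 is
still in use, so the host page has to stay.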

-- 
Hollis Blanchard
IBM Linux Technology Center



[PATCH] do boundary check based on absolute value

2008-12-04 Thread Glauber Costa
For backward operations, dstpitch and srcpitch can
be negative. This leads BLTUNSAFE macro into an
overflow, and as a result, it avoids performing
operations that are perfectly valid.

The visible effect that led to that patch was the gnome-panel
bar in Fedora10. Before this patch, you could see garbage
clobbering a big portion of the bar.

After this patch, this garbage is gone.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 hw/cirrus_vga.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index e0cf458..5690719 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -221,15 +221,17 @@
 #define CIRRUS_HOOK_NOT_HANDLED 0
 #define CIRRUS_HOOK_HANDLED 1
 
+#define ABS(a) ((signed)(a) > 0 ? a : -a)
+
 #define BLTUNSAFE(s) \
     ( \
         ( /* check dst is within bounds */ \
-            (s)->cirrus_blt_height * (s)->cirrus_blt_dstpitch \
+            (s)->cirrus_blt_height * ABS((s)->cirrus_blt_dstpitch) \
                 + ((s)->cirrus_blt_dstaddr & (s)->cirrus_addr_mask) > \
                     (s)->vram_size \
         ) || \
         ( /* check src is within bounds */ \
-            (s)->cirrus_blt_height * (s)->cirrus_blt_srcpitch \
+            (s)->cirrus_blt_height * ABS((s)->cirrus_blt_srcpitch) \
                 + ((s)->cirrus_blt_srcaddr & (s)->cirrus_addr_mask) > \
                     (s)->vram_size \
         ) \
-- 
1.5.6.5
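
A small stand-alone sketch of the arithmetic behind this patch. The struct
and function names are hypothetical, and the field types (signed height and
pitch, unsigned address and VRAM size) are an assumption about the emulator
state, not a copy of it. It shows how a negative pitch can turn the
unsigned sum into a huge value, and how taking the magnitude avoids that:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the blitter state. */
struct toy_blt {
    int      height;
    int      pitch;      /* negative for backward blits */
    uint32_t addr;
    uint32_t vram_size;
};

/* Old-style check: a negative pitch makes the int product negative, and
 * adding the unsigned address converts it to a very large unsigned value
 * whenever the sum would be below zero. */
static int unsafe_old(const struct toy_blt *b)
{
    assert(b->height >= 0);
    return b->height * b->pitch + b->addr > b->vram_size;
}

/* Patched idea: bound the extent by the magnitude of the pitch. A local
 * variable is used instead of a macro to avoid evaluating it twice. */
static int unsafe_abs(const struct toy_blt *b)
{
    int pitch = b->pitch < 0 ? -b->pitch : b->pitch;

    assert(b->height >= 0);
    return (uint32_t)(b->height * pitch) + b->addr > b->vram_size;
}
```

For a 16-line blit with pitch -1024 at address 0x1000 in 4 MB of VRAM, the
old form reports the operation as unsafe while the magnitude-based form
accepts it.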



Re: [PATCH] do boundary check based on absolute value

2008-12-04 Thread Anthony Liguori

Glauber Costa wrote:
> For backward operations, dstpitch and srcpitch can
> be negative. This leads BLTUNSAFE macro into an
> overflow, and as a result, it avoids performing
> operations that are perfectly valid.
>
> The visible effect that led to that patch was the gnome-panel
> bar in Fedora10. Before this patch, you could see garbage
> clobbering a big portion of the bar.
>
> After this patch, this garbage is gone.

Applied.  Thanks.

Regards,

Anthony Liguori



Re: Virtio network performance problem

2008-12-04 Thread Dor Laor

Adrian Schmitz wrote:
> On Wed, Dec 03, 2008 at 11:20:08AM -0800, Chris Wedgwood wrote:
>> TSC instability?  Is this an SMP guest?
>
> Ok, I tried pinning the kvm process to two cores (0,2) on a single
> socket, but that didn't seem to make any difference for my virtio
> network performance. I also tried pinning the process to a single core,
> which also didn't seem to have any effect.

I think it is an unsynchronized TSC problem.
First, make sure you pin all of the process's threads. There is one thread
per vcpu, plus an I/O thread, plus a few others that are not relevant here.

You can do that by putting taskset before the command line.
Second, you said that you use an SMP guest, so Windows also sees an
unsynchronized TSC. So either test with a UP guest, or learn how to pin the
Windows receive ISR, the DPC and the user application.


Well, testing on Intel or a newer AMD is another option.
I tested it again just now on Intel with a UP guest and there is no such problem.
I hope to test it next week with an SMP guest on AMD.

Regards,
Dor
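
To make the unsynchronized-TSC explanation concrete, here is a toy model
(an assumption about the cause, not a measurement; all names are
hypothetical). Each CPU's clock is real time plus a fixed offset, and the
round-trip time is computed from a start stamp and an end stamp that may
come from different CPUs:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: each CPU's counter is real time plus a fixed offset. */
struct toy_cpu {
    int64_t offset_ms;
};

static int64_t read_clock_ms(const struct toy_cpu *cpu, int64_t real_ms)
{
    return real_ms + cpu->offset_ms;
}

/* Elapsed time as software computes it: end stamp minus start stamp,
 * where the two stamps may be taken on different CPUs. */
static int64_t measured_rtt_ms(const struct toy_cpu *start_cpu,
                               const struct toy_cpu *end_cpu,
                               int64_t sent_at_ms, int64_t real_rtt_ms)
{
    assert(real_rtt_ms >= 0);
    return read_clock_ms(end_cpu, sent_at_ms + real_rtt_ms)
         - read_clock_ms(start_cpu, sent_at_ms);
}
```

With a 142 ms offset between two CPUs and a real round trip of 1 ms, the
model yields 1 ms, 143 ms or -141 ms depending on where the two stamps are
taken, which is the same pattern as the dynticks ping times quoted in this
message.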

> Someone on IRC suggested that it sounded like a clocking issue, since
> some of my ping times are negative. He suggested trying a different
> clock source. I tried it with dynticks, rtc, and unix. None of them seem
> better, although all of them seem different in terms of patterns in
> the ping times. Sorry if this makes it a long post, but I don't know how
> to describe it other than to paste an example (below). Not sure if this
> indicates that it is clock-related or if it is meaningless.
>
> In any event, I'm not sure where to go from here. Another suggestion
> from IRC was that it was due to the age of my host kernel (2.6.18) and
> the fact that it doesn't support high-res timers. If I can avoid
> replacing the distro kernel, I'd like to, but I'll do what I have to, I
> suppose.
>
> With dynticks (these are all with -net user, as I had some trouble with
> my tap interface last night while testing this. The results are roughly
> the same as when I was using tap before, though):
>
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-139ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-141ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-133ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=143ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
>
> With rtc:
>
> Reply from 10.0.2.2: bytes=32 time=-224ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-223ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=4ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-223ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-224ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=225ms TTL=255
>
> With unix:
>
> Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time<1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-190ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=-191ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=1ms TTL=255
> Reply from 10.0.2.2: bytes=32 time=192ms TTL=255


Re: [PATCH] virtio: make PCI devices take a virtio_pci module ref

2008-12-04 Thread Jiri Slaby
On 12/04/2008 01:44 PM, Mark McLoughlin wrote:
> Nothing takes a ref on virtio_pci, so even if you have
> devices in use, rmmod will attempt to unload the module.

It unbinds the device properly, as any other driver does. So what's the problem here?

> Fix by simply making each device take a ref on the module.
>
> Signed-off-by: Mark McLoughlin [EMAIL PROTECTED]
> Reported-by: Michael Tokarev [EMAIL PROTECTED]
> ---
>  drivers/virtio/virtio_pci.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)


Re: [PATCH 08/13] allocation and free functions of virtual machine domain

2008-12-04 Thread Joerg Roedel
On Thu, Dec 04, 2008 at 05:13:00PM +, Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
> > Signed-off-by: Weidong Han [EMAIL PROTECTED]
> > ---
> >  drivers/pci/intel-iommu.c |  104
> > -
> >  1 files changed, 103 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> > index b00a8f2..e96b3bc 100644
> > --- a/drivers/pci/intel-iommu.c
> > +++ b/drivers/pci/intel-iommu.c
> > @@ -947,6 +947,7 @@ static int iommu_init_domains(struct intel_iommu *iommu)
> >
> >
> >  static void domain_exit(struct dmar_domain *domain);
> > +static void vm_domain_exit(struct dmar_domain *domain);
> >
> >  void free_dmar_iommu(struct intel_iommu *iommu)
> >  {
> > @@ -957,8 +958,13 @@ void free_dmar_iommu(struct intel_iommu *iommu)
> >  	for (; i < cap_ndoms(iommu->cap); ) {
> >  		domain = iommu->domains[i];
> >  		clear_bit(i, iommu->domain_ids);
> > -		if (--domain->iommu_count == 0)
> > +
> > +		if (domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE) {
> > +			if (--domain->iommu_count == 0)
> > +				vm_domain_exit(domain);
> > +		} else
> >  			domain_exit(domain);
> > +
>
> Again, these new functions are copies of existing code with minor
> modifications. I'd much rather see the existing code refactored and then
> modified to handle the DOMAIN_FLAG_VIRTUAL_MACHINE case.

Hey Mark,

can your objections be addressed by follow-up patches from Han, or is
anything critical in them? My AMD IOMMU patches for KVM support depend on
these patches, and every time they change I have to rebase my work. So I
would prefer that Han fix the issues found with follow-up patches :)

Joerg


RE: [PATCH 02/13] move page table handling utility functions

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
>
> move page table handling utility functions from intel-iommu.c to
> dma_remapping.h, because some of them will be used in other .c files.
>
> You need to rebase your patches against dwmw2's tree where some
> cleanup patches of mine moved a bunch of stuff from dma_remapping.h to
> intel-iommu.c
>

How to get dwmw2's tree?

Regards,
Weidong

> And preferably, this stuff could stay internal to intel-iommu.c.
>
> Cheers,
> Mark.



RE: [PATCH 01/13] iommu bitmap insteads of iommu pointer in dmar_domain

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> Hi Weidong,
>
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
>
> Support dmar_domain own multiple devices from different iommus, which
> are set in iommu bitmap. add function domain_get_iommu() to get the
> only one iommu of domain in native VT-d usage.
>
> A bitmap seems quite awkward. Why not a list?

Yes, a list may be more direct. I will replace it.

>
> Also, I wasn't sure at first what you meant by native VT-d ... you
> mean DMA-API VT-d usage as opposed to KVM device assignment usage,
> right? Perhaps we need a better term for that distinction.

Yes. Any proposal for the term?

>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/intel-iommu.c     |  102
>  include/linux/dma_remapping.h |    2 +-
>  2 files changed, 72 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 5c8baa4..39c5e9d 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
>
> @@ -184,6 +185,21 @@ void free_iova_mem(struct iova *iova)
>  	kmem_cache_free(iommu_iova_cache, iova);
>  }
>
> +/* in native case, each domain is related to only one iommu */
> +static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
> +{
> +	struct dmar_drhd_unit *drhd;
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +		if (test_bit(drhd->iommu->seq_id, &domain->iommu_bmp))
> +			return drhd->iommu;
> +	}
> +
> +	return NULL;
> +}
>
> So, basically, a lot of the code assumes that there is only one iommu
> associated with a domain. That makes it seem like the abstractions
> here could do with some re-working.
>
> We should at least add:
>
>   ASSERT(!(domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE));
>
> in the patch which adds that flag.

Okay, I will add ASSERT()s.

>
> @@ -1925,16 +1952,19 @@ static void add_unmap(struct dmar_domain *dom, struct iova *iova)
>  {
>  	unsigned long flags;
>  	int next, iommu_id;
> +	struct intel_iommu *iommu;
>
>  	spin_lock_irqsave(&async_umap_flush_lock, flags);
>  	if (list_size == HIGH_WATER_MARK)
>  		flush_unmaps();
>
> -	iommu_id = dom->iommu->seq_id;
> +	iommu = domain_get_iommu(dom);
> +	iommu_id = iommu->seq_id;
>
>  	next = deferred_flush[iommu_id].next;
>  	deferred_flush[iommu_id].domain[next] = dom;
>  	deferred_flush[iommu_id].iova[next] = iova;
> +	deferred_flush[iommu_id].iommu = iommu;
>  	deferred_flush[iommu_id].next++;
>
> This deferred_flush->iommu change should be in its own patch, IMHO.

I will make a separate patch for it.

>
> Also, it's not quite right - there is a fixed mapping between iommu_id
> and the iommu, so it makes no sense to update that mapping each time
> we add a new iova.
>
> In fact, it makes me wonder why we don't have the flush list in the
> struct intel_iommu and have a global list of iommus.

I think a global list of iommus is useful. Because there is a fixed mapping
between iommu_id and the iommu, the iommu can be looked up directly in the
global iommu list by iommu_id.

Regards,
Weidong

>
> Cheers,
> Mark.
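
As a rough illustration of the "global list indexed by iommu_id" idea
discussed above, here is a user-space sketch. The toy_ types and functions
are hypothetical stand-ins, not the kernel structures, and a fixed-size
table is an assumption made to keep the example short:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IOMMUS 8

/* Hypothetical stand-ins for struct intel_iommu and struct dmar_domain. */
struct toy_iommu {
    int seq_id;
};

struct toy_domain {
    unsigned long iommu_bmp;     /* bit n set: domain uses iommu n */
};

/* Global table indexed by seq_id, reflecting the fixed id-to-iommu mapping. */
static struct toy_iommu *toy_iommus[TOY_MAX_IOMMUS];

static void toy_register_iommu(struct toy_iommu *iommu)
{
    assert(iommu->seq_id >= 0 && iommu->seq_id < TOY_MAX_IOMMUS);
    toy_iommus[iommu->seq_id] = iommu;
}

/* Native case: one bit is expected, so return the iommu of the first set
 * bit, or NULL when the domain has no iommu at all. */
static struct toy_iommu *toy_domain_get_iommu(const struct toy_domain *d)
{
    for (int id = 0; id < TOY_MAX_IOMMUS; id++)
        if (d->iommu_bmp & (1UL << id))
            return toy_iommus[id];
    return NULL;
}
```

With such a table, nothing needs to store the iommu pointer next to each
deferred flush entry; the id alone is enough to find it again.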



RE: [PATCH 03/13] set iommu agaw

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
> agaw may be different across iommus.
>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/dmar.c            |   14 ++++++++++++++
>  include/linux/dma_remapping.h |    2 ++
>  include/linux/intel-iommu.h   |    1 +
>  3 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
> index 691b3ad..ebcc7c2 100644
> --- a/drivers/pci/dmar.c
> +++ b/drivers/pci/dmar.c
> @@ -491,6 +491,8 @@ int alloc_iommu(struct dmar_drhd_unit *drhd)
>  	int map_size;
>  	u32 ver;
>  	static int iommu_allocated = 0;
> +	unsigned long sagaw;
> +	int agaw;
>
>  	iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
>  	if (!iommu)
> @@ -506,6 +508,18 @@ int alloc_iommu(struct dmar_drhd_unit *drhd)
>  	iommu->cap = dmar_readq(iommu->reg + DMAR_CAP_REG);
>  	iommu->ecap = dmar_readq(iommu->reg + DMAR_ECAP_REG);
>
> +	/* set agaw, SAGAW may be different across iommus */
> +	sagaw = cap_sagaw(iommu->cap);
> +	for (agaw = width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH);
> +	     agaw >= 0; agaw--)
> +		if (test_bit(agaw, &sagaw))
> +			break;
> +	if (agaw < 0) {
> +		printk(KERN_ERR "IOMMU: unsupported sagaw %lx\n", sagaw);
> +		goto error;
> +	}
> +	iommu->agaw = agaw;
>
> Could we add something like intel_iommu_calculate_agaw() and keep the
> agaw code internal to intel-iommu.c?

Okay.

>
> Also, "unsupported sagaw" expands to "unsupported supported adjusted
> guest address width" which doesn't make much sense :-)
>
> "unsupported address width" would be sufficient, I think.

Agree.

Regards,
Weidong

>
> Cheers,
> Mark.
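
A stand-alone sketch of what a helper along the lines of the suggested
intel_iommu_calculate_agaw() might compute. The stride of 9 bits per level,
the base width of 30 and the default width of 48 are my assumptions about
the driver's constants, and calculate_agaw is a hypothetical name; the
search itself mirrors the loop in the quoted hunk:

```c
#include <assert.h>

#define LEVEL_STRIDE                 9   /* address bits per level (assumed) */
#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48  /* assumed default width */

/* agaw 0 -> 30-bit, 1 -> 39-bit, 2 -> 48-bit, and so on. */
static int width_to_agaw(int width)
{
    assert(width >= 30);
    return (width - 30) / LEVEL_STRIDE;
}

/* Pick the largest supported agaw that does not exceed the default width.
 * `sagaw` is the capability bit field: bit n set means agaw n is supported.
 * Returns -1 when no suitable width is supported. */
static int calculate_agaw(unsigned long sagaw)
{
    int agaw;

    for (agaw = width_to_agaw(DEFAULT_DOMAIN_ADDRESS_WIDTH); agaw >= 0; agaw--)
        if (sagaw & (1UL << agaw))
            break;
    return agaw;
}
```

A caller would treat a negative result as the "unsupported address width"
error case discussed above.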



RE: [PATCH 04/13] iommu coherency

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
>
> in dmar_domain, more than one iommus may be included in iommu_bmp.
> Due to Coherency capability may be different across iommus, set this
> variable to indicate iommu access is coherent or not. Only when all
> related iommus in a dmar_domain are all coherent, iommu access of
> this domain is coherent.
>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/intel-iommu.c     |    6 ++++++
>  include/linux/dma_remapping.h |    2 ++
>  2 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index a18e0b4..fa1507b 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -982,6 +982,12 @@ static struct dmar_domain *iommu_alloc_domain(struct intel_iommu *iommu)
>  	domain->id = num;
>  	memset(&domain->iommu_bmp, 0, sizeof(unsigned long));
>  	set_bit(iommu->seq_id, &domain->iommu_bmp);
> +
> +	if (ecap_coherent(iommu->ecap))
> +		domain->iommu_coherency = 1;
> +	else
> +		domain->iommu_coherency = 0;
>
> If you allocate a non-coherent iommu, followed by a coherent iommu,
> then iommu_coherency ends up as 1.
>
> In patch 6/13 you add domain_update_iommu_coherency(). It would make
> more sense to add that function in this patch and use it here.

There is also an assumption that iommu_alloc_domain() is only used by native
VT-d, so the domain is related to only one iommu. I will add an ASSERT() here.

Regards,
Weidong

>
> Cheers,
> Mark.
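
The point about ordering can be shown with a short sketch. The rule stated
in the patch description is that a domain is coherent only if all of its
iommus are; recomputing over the whole set gives that result regardless of
the order in which iommus were added. The types and the function name are
hypothetical stand-ins:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: `coherent` plays the role of ecap_coherent(). */
struct toy_iommu {
    int coherent;
};

/* A domain is coherent only if every iommu it spans is coherent. */
static int toy_domain_coherency(const struct toy_iommu *const *iommus,
                                size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(iommus[i] != NULL);
        if (!iommus[i]->coherent)
            return 0;
    }
    return 1;
}
```

Assigning the flag from the most recently added iommu would instead depend
on the order, which is the problem pointed out in the review.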



RE: [PATCH 05/13] add domain flag DOMAIN_FLAG_VIRTUAL_MACHINE

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
> By default, one domain owns one device, like native VT-d usage.
>
> For kvm VT-d usage, more than one devices across iommus may be
> assigned to one domain, flag DOMAIN_FLAG_VIRTUAL_MACHINE is for this
> usage.
>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/intel-iommu.c     |    3 ++-
>  include/linux/dma_remapping.h |   11 ++++++++++-
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index fa1507b..09a5150 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -989,6 +989,7 @@ static struct dmar_domain *iommu_alloc_domain(struct intel_iommu *iommu)
>  		domain->iommu_coherency = 0;
>
>  	iommu->domains[num] = domain;
> +	domain->flags = 0;
>  	spin_unlock_irqrestore(&iommu->lock, flags);
>
> This looks like a bugfix. Does it need to be fixed in 2.6.28?

Yes, it's a bugfix. I will send it out separately.

>
>
>  	return domain;
> @@ -1387,7 +1388,7 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>  		info->dev = NULL;
>  		info->domain = domain;
>  		/* This domain is shared by devices under p2p bridge */
> -		domain->flags |= DOMAIN_FLAG_MULTIPLE_DEVICES;
> +		domain->flags |= DOMAIN_FLAG_P2P_MULTIPLE_DEVICES;
>
> Renaming this flag should probably be a separate patch.

You mean one patch to rename this flag, and another patch to add
DOMAIN_FLAG_VIRTUAL_MACHINE, right?

Regards,
Weidong

> Cheers,
> Mark.



RE: [PATCH 06/13] add/remove domain device info for virtual machine VT-d

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
>
> Separate add/remove domain device info functions for virtual machine
> VT-d from native VT-d.
>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/intel-iommu.c     |  164 +++-
>  include/linux/dma_remapping.h |    1 +
>  2 files changed, 160 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 09a5150..429aff4 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -200,6 +200,27 @@ static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
>  	return NULL;
>  }
>
> +static struct intel_iommu *device_find_matched_iommu(u8 bus, u8 devfn)
>
> That's quite an unwieldy name, how about device_to_iommu() ?

will rename it.

>
> +{
> +	struct dmar_drhd_unit *drhd = NULL;
> +	int i;
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +
> +		for (i = 0; i < drhd->devices_cnt; i++)
> +			if (drhd->devices[i]->bus->number == bus &&
> +			    drhd->devices[i]->devfn == devfn)
> +				return drhd->iommu;
> +
> +		if (drhd->include_all)
> +			return drhd->iommu;
> +	}
> +
> +	return NULL;
> +}
> ...
> @@ -1269,9 +1292,12 @@ domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
>  	return 0;
>  }
>
> -static void detach_domain_for_dev(struct dmar_domain *domain, u8 bus, u8 devfn)
> +static void iommu_detach_dev(u8 bus, u8 devfn)
>
> Would be nicer if this function took a struct intel_iommu pointer
> rather than bus/devfn.

The intel_iommu can be obtained from bus and devfn, so I didn't add the
parameter. But your suggestion is reasonable.

>
>  {
> -	struct intel_iommu *iommu = domain_get_iommu(domain);
> +	struct intel_iommu *iommu = device_find_matched_iommu(bus, devfn);
> +
> +	if (!iommu)
> +		return;
>
>  	clear_context_table(iommu, bus, devfn);
>  	iommu->flush.flush_context(iommu, 0, 0, 0,
> ...
> +/* Coherency capability may be different across iommus */
> +static void domain_update_iommu_coherency(struct dmar_domain *domain)
> +{
> +	struct dmar_drhd_unit *drhd;
> +
> +	domain->iommu_coherency = 1;
> +
> +	for_each_drhd_unit(drhd) {
> +		if (drhd->ignored)
> +			continue;
> +		if (test_bit(drhd->iommu->seq_id, &domain->iommu_bmp)) {
> +			if (!ecap_coherent(drhd->iommu->ecap)) {
> +				domain->iommu_coherency = 0;
> +				break;
> +			}
> +		}
> +	}
> +}
>
> As I said, this belongs in the patch where you added the
> iommu_coherency flag.
>
> +
> +static int vm_domain_add_dev_info(struct dmar_domain *domain,
> +				  struct pci_dev *pdev)
> +{
> +	struct device_domain_info *info;
> +	unsigned long flags;
> +
> +	info = alloc_devinfo_mem();
> +	if (!info)
> +		return -ENOMEM;
> +
> +	info->bus = pdev->bus->number;
> +	info->devfn = pdev->devfn;
> +	info->dev = pdev;
> +	info->domain = domain;
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	list_add(&info->link, &domain->devices);
> +	list_add(&info->global, &device_domain_list);
> +	pdev->dev.archdata.iommu = info;
> +	spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> +	return 0;
> +}
> +
> +static void vm_domain_remove_one_dev_info(struct dmar_domain *domain,
> +					  struct pci_dev *pdev)
> +{
> +	struct device_domain_info *info;
> +	struct intel_iommu *iommu;
> +	unsigned long flags;
> +	int found = 0;
> +
> +	iommu = device_find_matched_iommu(pdev->bus->number, pdev->devfn);
> +
> +	spin_lock_irqsave(&device_domain_lock, flags);
> +	while (!list_empty(&domain->devices)) {
> +		info = list_entry(domain->devices.next,
> +			struct device_domain_info, link);
> +		if (info->bus == pdev->bus->number &&
> +		    info->devfn == pdev->devfn) {
> +			list_del(&info->link);
> +			list_del(&info->global);
> +			if (info->dev)
> +				info->dev->dev.archdata.iommu = NULL;
> +			spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> +			iommu_detach_dev(info->bus, info->devfn);
> +			free_devinfo_mem(info);
> +
> +			spin_lock_irqsave(&device_domain_lock, flags);
> +
> +			if (found)
> +				break;
> +			else
> +				continue;
> +		}
> +
> +		/* if there is no other devices under the same iommu
> +		 * owned by this domain, clear this iommu in iommu_bmp
> +		 */
> +		if (device_find_matched_iommu(info->bus, info->devfn) == iommu)
> +			found = 1;
> +	}
> +
> +	if (found == 0) {
> +		spin_lock_irqsave(&iommu->lock, flags);
> +
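
The lookup performed by device_find_matched_iommu() in the quoted hunk can
be modelled in user space as follows. This is a sketch under assumptions:
the toy_ structures are hypothetical simplifications of the DRHD unit data,
and an integer id stands in for the iommu pointer. Each unit either lists
its devices explicitly or is a catch-all (include_all) unit:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device and DRHD unit descriptions. */
struct toy_dev {
    unsigned char bus;
    unsigned char devfn;
};

struct toy_drhd {
    int ignored;                     /* unit is not in use */
    int include_all;                 /* unit covers all remaining devices */
    const struct toy_dev *devices;   /* explicitly listed devices */
    int devices_cnt;
    int iommu_id;                    /* stands in for the iommu pointer */
};

/* Returns the iommu id responsible for bus/devfn, or -1 if none matches. */
static int toy_device_to_iommu(const struct toy_drhd *units, int n,
                               unsigned char bus, unsigned char devfn)
{
    assert(units != NULL || n == 0);
    for (int u = 0; u < n; u++) {
        if (units[u].ignored)
            continue;
        for (int i = 0; i < units[u].devices_cnt; i++)
            if (units[u].devices[i].bus == bus &&
                units[u].devices[i].devfn == devfn)
                return units[u].iommu_id;
        if (units[u].include_all)
            return units[u].iommu_id;
    }
    return -1;
}
```

As in the quoted code, the result depends on unit order: a catch-all unit
claims every device that no earlier unit listed explicitly.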

RE: [PATCH 07/13] add domain_flush_cache

2008-12-04 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2008-12-02 at 22:22 +0800, Han, Weidong wrote:
>
> For some common low level functions which will be also used by
> virtual machine usage, use domain_flush_cache instead of
> __iommu_flush_cache.
>
> Signed-off-by: Weidong Han [EMAIL PROTECTED]
> ---
>  drivers/pci/intel-iommu.c |   40
>  1 files changed, 24 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
> index 429aff4..b00a8f2 100644
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -200,6 +200,13 @@ static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
>  	return NULL;
>  }
>
> +static void domain_flush_cache(struct dmar_domain *domain,
> +			       void *addr, int size)
> +{
> +	if (!domain->iommu_coherency)
> +		clflush_cache_range(addr, size);
> +}
>
> This is quite unfortunate; __iommu_flush_cache() is essentially
> identical:
>
> static inline void __iommu_flush_cache(
> 	struct intel_iommu *iommu, void *addr, int size)
> {
> 	if (!ecap_coherent(iommu->ecap))
> 		clflush_cache_range(addr, size);
> }
>
> Is there no way we can use a single function for both purposes?
>

It's not easy: some functions (e.g. qi_submit_sync() in dmar.c) don't have a
dmar_domain struct, whereas domain_flush_cache is necessary for the kvm VT-d case.

Regards,
Weidong

> Cheers,
> Mark.
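
One way the duplication could be reduced, sketched in user space. This is
only a possible shape, not something proposed in the thread: both callers
keep their own entry point but share a helper that takes the coherency
value directly. All toy_ names are hypothetical, and the cache flush is
replaced by a counter so the behaviour can be observed:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real cache flush: only counts how often it is called. */
static int toy_flush_count;

static void toy_clflush_range(void *addr, int size)
{
    (void)addr;
    (void)size;
    toy_flush_count++;
}

/* Shared helper: flush only when the hardware access is not coherent. */
static void toy_flush_unless_coherent(int coherent, void *addr, int size)
{
    assert(size >= 0);
    if (!coherent)
        toy_clflush_range(addr, size);
}

struct toy_iommu  { int coherent; };         /* per-iommu capability */
struct toy_domain { int iommu_coherency; };  /* combined over all iommus */

/* Caller that only has an iommu (no domain available). */
static void toy_iommu_flush_cache(struct toy_iommu *iommu, void *addr, int size)
{
    toy_flush_unless_coherent(iommu->coherent, addr, size);
}

/* Caller that works on a domain spanning several iommus. */
static void toy_domain_flush_cache(struct toy_domain *domain, void *addr, int size)
{
    toy_flush_unless_coherent(domain->iommu_coherency, addr, size);
}
```

The two wrappers stay, so code without a domain at hand is unaffected, but
the decision itself lives in one place.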



RE: [PATCH 02/13] move page table handling utility functions

2008-12-04 Thread Yu, Fenghua
 
>> You need to rebase your patches against dwmw2's tree where some
>> cleanup patches of mine moved a bunch of stuff from dma_remapping.h to
>> intel-iommu.c
> How to get dwmw2's tree?

git://git.infradead.org/iommu-2.6.git

Thanks.

-Fenghua