Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Gleb Natapov
On Wed, Jun 05, 2013 at 07:41:17PM -0500, Anthony Liguori wrote:
 H. Peter Anvin h...@zytor.com writes:
 
  On 06/05/2013 03:08 PM, Anthony Liguori wrote:
 
  Definitely an option.  However, we want to be able to boot from native
  devices, too, so having an I/O BAR (which would not be used by the OS
  driver) should still at the very least be an option.
  
  What makes it so difficult to work with an MMIO bar for PCI-e?
  
  With legacy PCI, tracking allocation of MMIO vs. PIO is pretty straight
  forward.  Is there something special about PCI-e here?
  
 
  It's not tracking allocation.  It is that accessing memory above 1 MiB
  is incredibly painful in the BIOS environment, which basically means
  MMIO is inaccessible.
 
 Oh, you mean in real mode.
 
 SeaBIOS runs the virtio code in 32-bit mode with a flat memory layout.
 There are loads of ASSERT32FLAT()s in the code to make sure of this.
 
Well, not exactly. Initialization is done in 32-bit mode, but disk
reads/writes are done in 16-bit mode, since they have to work from the
int13 interrupt handler. The only way I know to access MMIO BARs from
16-bit code is to use SMM, which we do not have in KVM.
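
A minimal sketch of why the PIO BAR is the easy case for the BIOS, assuming
the legacy virtio-pci register layout (the outw helper and the iobase/queue
values are illustrative, not SeaBIOS code): a 16-bit int13 handler can kick
a virtqueue with a single port write, whereas an MMIO doorbell would need a
flat 32-bit pointer that real-mode code cannot simply dereference.

#include <stdint.h>

/* Legacy virtio-pci: the queue notify register sits at offset 16 of the
 * I/O BAR; writing a queue index there kicks the device. */
#define VIRTIO_PCI_QUEUE_NOTIFY 16

static inline void outw(uint16_t value, uint16_t port)
{
	asm volatile("outw %0, %1" : : "a"(value), "Nd"(port));
}

/* Illustrative only: notify queue 0 of a virtio-blk device whose I/O BAR
 * was assigned the port range starting at iobase.  This works the same
 * way from 16-bit and 32-bit code, which is why boot firmware wants PIO. */
static void virtio_kick_legacy(uint16_t iobase, uint16_t queue_index)
{
	outw(queue_index, iobase + VIRTIO_PCI_QUEUE_NOTIFY);
}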

--
Gleb.


Re: Bug#707257: linux-image-3.8-1-686-pae: KVM crashes with entry failed, hardware error 0x80000021

2013-06-06 Thread Gleb Natapov
On Wed, Jun 05, 2013 at 02:51:19PM +0200, Stefan Pietsch wrote:
 On 05.06.2013 14:10, Gleb Natapov wrote:
  On Wed, Jun 05, 2013 at 01:57:25PM +0200, Stefan Pietsch wrote:
  On 19.05.2013 14:32, Gleb Natapov wrote:
  On Sun, May 19, 2013 at 02:00:31AM +0100, Ben Hutchings wrote:
  Dear KVM maintainers, it appears that there is a gap in x86 emulation,
  at least on a 32-bit host.  Stefan found this when running GRML, a live
  distribution which can be downloaded from:
  http://download.grml.org/grml32-full_2013.02.iso.  His original
  report is at http://bugs.debian.org/707257.
 
  Can you verify with the latest linux.git HEAD? It works for me there on
  64-bit. There were a lot of problems fixed in this area in the 3.9/3.10
  time frame, so it would be helpful if you could test 32-bit before I
  install one myself.
 
 
  Kernel version 3.9.4-1 (linux-image-3.9-1-686-pae) made things worse.
 
  The virtual machine tries to boot the kernel, but stops after a few
  seconds and the kern.log shows:
  At what point does it stop?
 
 
 The machine stops at:
 
 Performance Events: Broken PMU hardware detected, using software events
 only.
 Failed to access perfctr msr (MSR c1 is 0)
 Enabling APIC mode:  Flat.  Using 1 I/O APICs
Timer initialization is what comes next.

I tried a 32-bit kernel compiled from the kvm.git next (3.10.0-rc2+) branch
and upstream qemu, and I cannot reproduce the problem. The guest boots fine.

--
Gleb.


Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
 Test access to %bpl via modr/m addressing mode. This case can test another 
 bug in the boot of RHEL5.9 64-bit.

 We have a growing number of instruction tests using the same TLB trick. I
 think it is time to make the code more generic: create a function that
 receives the instruction to check, and all the TLB games will be done by
 that function.
Should I do some work to merge all these test cases?


 Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
 ---
  x86/emulator.c |   41 +
  1 file changed, 41 insertions(+)

 diff --git a/x86/emulator.c b/x86/emulator.c
 index 96576e5..3563971 100644
 --- a/x86/emulator.c
 +++ b/x86/emulator.c
 @@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
   report("test", *mem == 0x8400);
  }

 +static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
 +   uint8_t *alt_insn_page, void *insn_ram)
 +{
 + ulong *cr3 = (ulong *)read_cr3();
 + uint16_t cx = 0;
 +
 + // Pad with RET instructions
 + memset(insn_page, 0xc3, 4096);
 + memset(alt_insn_page, 0xc3, 4096);
 + // Place a trapping instruction in the page to trigger a VMEXIT
 + insn_page[0] = 0x66; // mov $0x4321, %cx
 + insn_page[1] = 0xb9;
 + insn_page[2] = 0x21;
 + insn_page[3] = 0x43;
 + insn_page[4] = 0x89; // mov %eax, (%rax)
 + insn_page[5] = 0x00;
 + insn_page[6] = 0x90; // nop
  + // Place mov %cl, %bpl in alt_insn_page for the emulator to execute.
  + // If the emulator gets the %bpl addressing wrong, %cl may be moved
  + // to %ch and %cx ends up as 0x2121 instead of 0x4321.
 + alt_insn_page[4] = 0x40;
 + alt_insn_page[5] = 0x88;
 + alt_insn_page[6] = 0xcd;
 +
 + // Load the code TLB with insn_page, but point the page tables at
 + // alt_insn_page (and keep the data TLB clear, for AMD decode assist).
 + // This will make the CPU trap on the insn_page instruction but the
 + // hypervisor will see alt_insn_page.
 + install_page(cr3, virt_to_phys(insn_page), insn_ram);
 + // Load code TLB
 + invlpg(insn_ram);
  + asm volatile("call *%0" : : "r"(insn_ram+3));
 + // Trap, let hypervisor emulate at alt_insn_page
 + install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
  + asm volatile("call *%0" : : "r"(insn_ram), "a"(mem));
  + asm volatile("" : "=c"(cx));
  Why not add the constraint to the previous asm?
I will merge them in the next version.


  + report("access bpl in modr/m", cx == 0x4321);
 +}
 +
  int main()
  {
   void *mem;
 @@ -964,6 +1003,8 @@ int main()

   test_string_io_mmio(mem);

 + test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
 +
   printf("\nSUMMARY: %d tests, %d failures\n", tests, fails);
   return fails ? 1 : 0;
  }
 --
 1.7.9.5

 --
 Gleb.


Re: [PATCH] Test case of emulating multibyte NOP

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 1:40 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 12:28:16AM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 12:13 AM, Gleb Natapov g...@redhat.com wrote:
  This time the email is perfect :)
 
  On Thu, Jun 06, 2013 at 12:02:52AM +0800, Arthur Chunqi Li wrote:
   Add a multibyte NOP test case to kvm-unit-tests. This version adds the
   test cases to x86/realmode.c. This can test one of the bugs seen when
   booting RHEL5.9 64-bit.
 
  Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
  ---
   x86/realmode.c |   24 
   1 file changed, 24 insertions(+)
 
  diff --git a/x86/realmode.c b/x86/realmode.c
  index 981be08..e103ca6 100644
  --- a/x86/realmode.c
  +++ b/x86/realmode.c
  @@ -1504,6 +1504,29 @@ static void test_fninit(void)
report(fninit, 0, fsw == 0  (fcw  0x103f) == 0x003f);
   }
 
  +static void test_nopl(void)
  +{
  + MK_INSN(nopl1, .byte 0x90\n\r); // 1 byte nop
  + MK_INSN(nopl2, .byte 0x66, 0x90\n\r); // 2 bytes nop
  + MK_INSN(nopl3, .byte 0x0f, 0x1f, 0x00\n\r); // 3 bytes nop
  + MK_INSN(nopl4, .byte 0x0f, 0x1f, 0x40, 0x00\n\r); // 4 bytes nop
   But all nops below that are not supported in 16-bit mode. You can
   disassemble realmode.elf in 16-bit mode (objdump -z -d -mi8086
   x86/realmode.elf) and check yourself. Let's not complicate things for now
   and test only those that are easy to test.
  Yes. But what if a 7-byte nop runs in 16-bit mode? Just the same as
  https://bugzilla.redhat.com/show_bug.cgi?id=967652

  It cannot. In 16-bit mode it is decoded as two instructions:
0f 1f 80 00 00  nopw   0x0(%bx,%si)
00 00   add%al,(%bx,%si)

OK, I will just test the first four nop instructions. Should I commit
another patch?

Arthur.

 DR6=0ff0 DR7=0400
 EFER=0500
 Code=00 00 e9 50 ff ff ff 00 00 00 00 85 d2 74 20 45 31 c0 31 c9 0f
 1f 80 00 00 00 00 0f b6 04 31 41 83 c0 01 88 04 39 48 83 c1 01 41 39
 d0 75 ec 48 89 f8

  The error code is 0f 1f 80 00 00 00 00, which is a 7-byte nop. Will
  the emulator run well in that case when booting RHEL5.9 64-bit?

 Arthur


 
  + MK_INSN(nopl5, .byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); // 5 
  bytes nop
  + MK_INSN(nopl6, .byte 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); // 
  6 bytes nop
  + MK_INSN(nopl7, .byte 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 
  0x00\n\r); // 7 bytes nop
  + MK_INSN(nopl8, .byte 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 
  0x00\n\r); // 8 bytes nop
  + MK_INSN(nopl9, .byte 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 
  0x00, 0x00\n\r); // 9 bytes nop
  + exec_in_big_real_mode(insn_nopl1);
  + exec_in_big_real_mode(insn_nopl2);
  + exec_in_big_real_mode(insn_nopl3);
  + exec_in_big_real_mode(insn_nopl4);
  + exec_in_big_real_mode(insn_nopl5);
  + exec_in_big_real_mode(insn_nopl6);
  + exec_in_big_real_mode(insn_nopl7);
  + exec_in_big_real_mode(insn_nopl8);
  + exec_in_big_real_mode(insn_nopl9);
  + report(nopl, 0, 1);
  +}
  +
   void realmode_start(void)
   {
test_null();
  @@ -1548,6 +1571,7 @@ void realmode_start(void)
test_xlat();
test_salc();
test_fninit();
  + test_nopl();
 
exit(0);
   }
  --
  1.7.9.5
 
  --
  Gleb.

 --
 Gleb.


Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 02:47:49PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
  Test access to %bpl via modr/m addressing mode. This case can test another 
  bug in the boot of RHEL5.9 64-bit.
 
   We have a growing number of instruction tests using the same TLB trick. I
   think it is time to make the code more generic: create a function that
   receives the instruction to check, and all the TLB games will be done by
   that function.
 Should I do some work to merge all these test cases?
 
It would be nice, yes. I also have an idea on how to improve test
reliability. Since the trick relies on the TLB being out of sync with the
actual page table, a vmexit at the wrong time can break this assumption and
the wrong instruction will be emulated (the one from insn_page instead of
alt_insn_page). If we make the instruction on insn_page place a special
value somewhere and check it after the test, we can see whether the wrong
instruction was executed and rerun the test.
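
A minimal sketch of such a generic runner combined with the rerun check
could look as follows (the helper name, the sentinel value and the
mov %esi,(%rax) encoding are assumptions for illustration, not the code
that was eventually merged; it also assumes, as in the existing emulator.c
tests, that mem points at the page whose accesses force emulation, and
that the tested instruction is at least two bytes long and never stores
the sentinel itself):

/* Sketch only: generic TLB-trick runner with a rerun check.  The trapping
 * instruction on insn_page stores a sentinel from %esi into *mem, so if a
 * stray vmexit makes the emulator see insn_page instead of alt_insn_page,
 * the sentinel shows up in *mem and the whole dance is run again. */
#define WRONG_PAGE_SENTINEL 0x12345678UL

static void run_emulated_insn(const uint8_t *alt_insn, int alt_insn_len,
			      uint64_t *mem, uint8_t *insn_page,
			      uint8_t *alt_insn_page, void *insn_ram)
{
	ulong *cr3 = (ulong *)read_cr3();

	do {
		*mem = 0;
		memset(insn_page, 0x90, 32);             /* NOP sled ...  */
		memset(insn_page + 32, 0xc3, 4096 - 32); /* ... then RETs */
		memset(alt_insn_page, 0x90, 32);
		memset(alt_insn_page + 32, 0xc3, 4096 - 32);

		/* Trapping instruction: mov %esi, (%rax) == 89 30.
		 * %rax holds mem, %esi holds the sentinel. */
		insn_page[0] = 0x89;
		insn_page[1] = 0x30;

		/* The bytes the emulator is supposed to see. */
		memcpy(alt_insn_page, alt_insn, alt_insn_len);

		/* Prime the code TLB: map insn_page, flush the old
		 * translation and execute a RET from its padding. */
		install_page(cr3, virt_to_phys(insn_page), insn_ram);
		invlpg(insn_ram);
		asm volatile("call *%0" : : "r"((uint8_t *)insn_ram + 64));

		/* Remap to alt_insn_page and run: the stale ITLB makes the
		 * CPU trap on insn_page's store while the hypervisor
		 * normally fetches and emulates alt_insn_page's bytes. */
		install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
		asm volatile("call *%0"
			     : : "r"(insn_ram), "a"(mem),
			         "S"(WRONG_PAGE_SENTINEL)
			     : "memory");
	} while (*mem == WRONG_PAGE_SENTINEL);  /* wrong page seen: rerun */
}

Per-test register setup and result checking are left out of this sketch;
that is where an inregs/outregs arrangement like realmode.c's (discussed
later in the thread) would come in.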

 
  Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
  ---
   x86/emulator.c |   41 +
   1 file changed, 41 insertions(+)
 
  diff --git a/x86/emulator.c b/x86/emulator.c
  index 96576e5..3563971 100644
  --- a/x86/emulator.c
  +++ b/x86/emulator.c
  @@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
   report(test, *mem == 0x8400);
   }
 
  +static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
  +   uint8_t *alt_insn_page, void *insn_ram)
  +{
  + ulong *cr3 = (ulong *)read_cr3();
  + uint16_t cx = 0;
  +
  + // Pad with RET instructions
  + memset(insn_page, 0xc3, 4096);
  + memset(alt_insn_page, 0xc3, 4096);
  + // Place a trapping instruction in the page to trigger a VMEXIT
  + insn_page[0] = 0x66; // mov $0x4321, %cx
  + insn_page[1] = 0xb9;
  + insn_page[2] = 0x21;
  + insn_page[3] = 0x43;
  + insn_page[4] = 0x89; // mov %eax, (%rax)
  + insn_page[5] = 0x00;
  + insn_page[6] = 0x90; // nop
  + // Place mov %cl, %bpl in alt_insn_page for emulator to execuate
  + // If emulator mistaken addressing %bpl, %cl may be moved to %ch
  + // %cx will be broken to 0x2121, not 0x4321
  + alt_insn_page[4] = 0x40;
  + alt_insn_page[5] = 0x88;
  + alt_insn_page[6] = 0xcd;
  +
  + // Load the code TLB with insn_page, but point the page tables at
  + // alt_insn_page (and keep the data TLB clear, for AMD decode 
  assist).
  + // This will make the CPU trap on the insn_page instruction but the
  + // hypervisor will see alt_insn_page.
  + install_page(cr3, virt_to_phys(insn_page), insn_ram);
  + // Load code TLB
  + invlpg(insn_ram);
  + asm volatile(call *%0 : : r(insn_ram+3));
  + // Trap, let hypervisor emulate at alt_insn_page
  + install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
  + asm volatile(call *%0 : : r(insn_ram), a(mem));
  + asm volatile(:=c(cx));
  Why not add the constrain to previous asm?
 I will merge them in next version.
 
 
  + report(access bpl in modr/m, cx == 0x4321);
  +}
  +
   int main()
   {
void *mem;
  @@ -964,6 +1003,8 @@ int main()
 
test_string_io_mmio(mem);
 
  + test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
  +
printf(\nSUMMARY: %d tests, %d failures\n, tests, fails);
return fails ? 1 : 0;
   }
  --
  1.7.9.5
 
  --
  Gleb.

--
Gleb.


Re: [PATCH] Test case of emulating multibyte NOP

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 02:49:14PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 1:40 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 12:28:16AM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 12:13 AM, Gleb Natapov g...@redhat.com wrote:
   This time the email is perfect :)
  
   On Thu, Jun 06, 2013 at 12:02:52AM +0800, Arthur Chunqi Li wrote:
   Add multibyte NOP test case to kvm-unit-tests. This version adds test 
   cases into x86/realmode.c. This can test one of bugs when booting 
   RHEL5.9 64-bit.
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
x86/realmode.c |   24 
1 file changed, 24 insertions(+)
  
   diff --git a/x86/realmode.c b/x86/realmode.c
   index 981be08..e103ca6 100644
   --- a/x86/realmode.c
   +++ b/x86/realmode.c
   @@ -1504,6 +1504,29 @@ static void test_fninit(void)
 report(fninit, 0, fsw == 0  (fcw  0x103f) == 0x003f);
}
  
   +static void test_nopl(void)
   +{
   + MK_INSN(nopl1, .byte 0x90\n\r); // 1 byte nop
   + MK_INSN(nopl2, .byte 0x66, 0x90\n\r); // 2 bytes nop
   + MK_INSN(nopl3, .byte 0x0f, 0x1f, 0x00\n\r); // 3 bytes nop
   + MK_INSN(nopl4, .byte 0x0f, 0x1f, 0x40, 0x00\n\r); // 4 bytes nop
   But all nops below that are not supported in 16 bit mode. You can
   disassemble realmode.elf in 16bit node (objdump -z -d -mi8086
   x86/realmode.elf) and check yourself. Lets not complicate things for now
   and test only those that are easy to test.
  Yes. But what if a 7-bytes nop runs in 16bit mode? Just the same as
  https://bugzilla.redhat.com/show_bug.cgi?id=967652
 
  It cannot. In 16 bit mode it is decoded as two instructions:
 0f 1f 80 00 00  nopw   0x0(%bx,%si)
 00 00   add%al,(%bx,%si)
 
 OK, I will just test the first four nop instructions. Should I commit
 another patch?
 
Yes, all others will have to go into emulator.c.

 Arthur.
 
  DR6=0ff0 DR7=0400
  EFER=0500
  Code=00 00 e9 50 ff ff ff 00 00 00 00 85 d2 74 20 45 31 c0 31 c9 0f
  1f 80 00 00 00 00 0f b6 04 31 41 83 c0 01 88 04 39 48 83 c1 01 41 39
  d0 75 ec 48 89 f8
 
  The error code is 0f 1f 80 00 00 00 00, which is a 7-bytes nop. Will
  the emulator runs well in that case when booting RHEL5.9 64-bit?
 
  Arthur
 
 
  
   + MK_INSN(nopl5, .byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); // 5 
   bytes nop
   + MK_INSN(nopl6, .byte 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); 
   // 6 bytes nop
   + MK_INSN(nopl7, .byte 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 
   0x00\n\r); // 7 bytes nop
   + MK_INSN(nopl8, .byte 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 
   0x00\n\r); // 8 bytes nop
   + MK_INSN(nopl9, .byte 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 
   0x00, 0x00\n\r); // 9 bytes nop
   + exec_in_big_real_mode(insn_nopl1);
   + exec_in_big_real_mode(insn_nopl2);
   + exec_in_big_real_mode(insn_nopl3);
   + exec_in_big_real_mode(insn_nopl4);
   + exec_in_big_real_mode(insn_nopl5);
   + exec_in_big_real_mode(insn_nopl6);
   + exec_in_big_real_mode(insn_nopl7);
   + exec_in_big_real_mode(insn_nopl8);
   + exec_in_big_real_mode(insn_nopl9);
   + report(nopl, 0, 1);
   +}
   +
void realmode_start(void)
{
 test_null();
   @@ -1548,6 +1571,7 @@ void realmode_start(void)
 test_xlat();
 test_salc();
 test_fninit();
   + test_nopl();
  
 exit(0);
}
   --
   1.7.9.5
  
   --
   Gleb.
 
  --
  Gleb.

--
Gleb.


Re: [PATCH] Test case of emulating multibyte NOP

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 3:02 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 02:49:14PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 1:40 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 12:28:16AM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 12:13 AM, Gleb Natapov g...@redhat.com wrote:
   This time the email is perfect :)
  
   On Thu, Jun 06, 2013 at 12:02:52AM +0800, Arthur Chunqi Li wrote:
   Add multibyte NOP test case to kvm-unit-tests. This version adds test 
   cases into x86/realmode.c. This can test one of bugs when booting 
   RHEL5.9 64-bit.
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
x86/realmode.c |   24 
1 file changed, 24 insertions(+)
  
   diff --git a/x86/realmode.c b/x86/realmode.c
   index 981be08..e103ca6 100644
   --- a/x86/realmode.c
   +++ b/x86/realmode.c
   @@ -1504,6 +1504,29 @@ static void test_fninit(void)
 report(fninit, 0, fsw == 0  (fcw  0x103f) == 0x003f);
}
  
   +static void test_nopl(void)
   +{
   + MK_INSN(nopl1, .byte 0x90\n\r); // 1 byte nop
   + MK_INSN(nopl2, .byte 0x66, 0x90\n\r); // 2 bytes nop
   + MK_INSN(nopl3, .byte 0x0f, 0x1f, 0x00\n\r); // 3 bytes nop
   + MK_INSN(nopl4, .byte 0x0f, 0x1f, 0x40, 0x00\n\r); // 4 bytes 
   nop
   But all nops below that are not supported in 16 bit mode. You can
   disassemble realmode.elf in 16bit node (objdump -z -d -mi8086
   x86/realmode.elf) and check yourself. Lets not complicate things for now
   and test only those that are easy to test.
  Yes. But what if a 7-bytes nop runs in 16bit mode? Just the same as
  https://bugzilla.redhat.com/show_bug.cgi?id=967652
 
  It cannot. In 16 bit mode it is decoded as two instructions:
 0f 1f 80 00 00  nopw   0x0(%bx,%si)
 00 00   add%al,(%bx,%si)
 
 OK, I will just test the first four nop instructions. Should I commit
 another patch?

 Yes, all others will have to go into emulator.c.
You mean I also need to add another test for nopl5~nopl9 in emulator.c
using the emulator trick?
I will commit a modified patch for realmode.c, since some other work
still needs to be done for emulator.c.


 Arthur.

  DR6=0ff0 DR7=0400
  EFER=0500
  Code=00 00 e9 50 ff ff ff 00 00 00 00 85 d2 74 20 45 31 c0 31 c9 0f
  1f 80 00 00 00 00 0f b6 04 31 41 83 c0 01 88 04 39 48 83 c1 01 41 39
  d0 75 ec 48 89 f8
 
  The error code is 0f 1f 80 00 00 00 00, which is a 7-bytes nop. Will
  the emulator runs well in that case when booting RHEL5.9 64-bit?
 
  Arthur
 
 
  
   + MK_INSN(nopl5, .byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); // 5 
   bytes nop
   + MK_INSN(nopl6, .byte 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); 
   // 6 bytes nop
   + MK_INSN(nopl7, .byte 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 
   0x00\n\r); // 7 bytes nop
   + MK_INSN(nopl8, .byte 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 
   0x00\n\r); // 8 bytes nop
   + MK_INSN(nopl9, .byte 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 
   0x00, 0x00\n\r); // 9 bytes nop
   + exec_in_big_real_mode(insn_nopl1);
   + exec_in_big_real_mode(insn_nopl2);
   + exec_in_big_real_mode(insn_nopl3);
   + exec_in_big_real_mode(insn_nopl4);
   + exec_in_big_real_mode(insn_nopl5);
   + exec_in_big_real_mode(insn_nopl6);
   + exec_in_big_real_mode(insn_nopl7);
   + exec_in_big_real_mode(insn_nopl8);
   + exec_in_big_real_mode(insn_nopl9);
   + report(nopl, 0, 1);
   +}
   +
void realmode_start(void)
{
 test_null();
   @@ -1548,6 +1571,7 @@ void realmode_start(void)
 test_xlat();
 test_salc();
 test_fninit();
   + test_nopl();
  
 exit(0);
}
   --
   1.7.9.5
  
   --
   Gleb.
 
  --
  Gleb.

 --
 Gleb.



--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


Re: Bug#707257: linux-image-3.8-1-686-pae: KVM crashes with entry failed, hardware error 0x80000021

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 09:42:40AM +0300, Gleb Natapov wrote:
 On Wed, Jun 05, 2013 at 02:51:19PM +0200, Stefan Pietsch wrote:
  On 05.06.2013 14:10, Gleb Natapov wrote:
   On Wed, Jun 05, 2013 at 01:57:25PM +0200, Stefan Pietsch wrote:
   On 19.05.2013 14:32, Gleb Natapov wrote:
   On Sun, May 19, 2013 at 02:00:31AM +0100, Ben Hutchings wrote:
   Dear KVM maintainers, it appears that there is a gap in x86 emulation,
   at least on a 32-bit host.  Stefan found this when running GRML, a live
   distribution which can be downloaded from:
   http://download.grml.org/grml32-full_2013.02.iso.  His original
   reported is at http://bugs.debian.org/707257.
  
   Can you verify with latest linux.git HEAD? It works for me there on
   64bit. There were a lot of problems fixed in this area in 3.9/3.10 time 
   frame,
   so it would be helpful if you'll test 32bit before I install one myself.
  
  
   Kernel version 3.9.4-1 (linux-image-3.9-1-686-pae) made things worse.
  
   The virtual machine tries to boot the kernel, but stops after a few
   seconds and the kern.log shows:
   At what point does it stop?
  
  
  The machine stops at:
  
  Performance Events: Broken PMU hardware detected, using software events
  only.
  Failed to access perfctr msr (MSR c1 is 0)
  Enabling APIC mode:  Flat.  Using 1 I/O APICs
 Timer initialization is what comes next.
 
 I tried 32bit kernel compiled from kvm.git next (3.10.0-rc2+) branch and 
 upstream
 qemu and I cannot reproduce the problem. The guest boots fine.
 
Actually the branch I tested is master not next, but this should not
make a difference.

--
Gleb.


Re: [PATCH] Test case of emulating multibyte NOP

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 3:17 PM, 李春奇 Arthur Chunqi Li yzt...@gmail.com wrote:
 On Thu, Jun 6, 2013 at 3:02 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 02:49:14PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 1:40 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 12:28:16AM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 12:13 AM, Gleb Natapov g...@redhat.com wrote:
   This time the email is perfect :)
  
   On Thu, Jun 06, 2013 at 12:02:52AM +0800, Arthur Chunqi Li wrote:
   Add multibyte NOP test case to kvm-unit-tests. This version adds test 
   cases into x86/realmode.c. This can test one of bugs when booting 
   RHEL5.9 64-bit.
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
x86/realmode.c |   24 
1 file changed, 24 insertions(+)
  
   diff --git a/x86/realmode.c b/x86/realmode.c
   index 981be08..e103ca6 100644
   --- a/x86/realmode.c
   +++ b/x86/realmode.c
   @@ -1504,6 +1504,29 @@ static void test_fninit(void)
 report(fninit, 0, fsw == 0  (fcw  0x103f) == 0x003f);
}
  
   +static void test_nopl(void)
   +{
   + MK_INSN(nopl1, .byte 0x90\n\r); // 1 byte nop
   + MK_INSN(nopl2, .byte 0x66, 0x90\n\r); // 2 bytes nop
   + MK_INSN(nopl3, .byte 0x0f, 0x1f, 0x00\n\r); // 3 bytes nop
   + MK_INSN(nopl4, .byte 0x0f, 0x1f, 0x40, 0x00\n\r); // 4 bytes 
   nop
   But all nops below that are not supported in 16 bit mode. You can
   disassemble realmode.elf in 16bit node (objdump -z -d -mi8086
   x86/realmode.elf) and check yourself. Lets not complicate things for 
   now
   and test only those that are easy to test.
  Yes. But what if a 7-bytes nop runs in 16bit mode? Just the same as
  https://bugzilla.redhat.com/show_bug.cgi?id=967652
 
  It cannot. In 16 bit mode it is decoded as two instructions:
 0f 1f 80 00 00  nopw   0x0(%bx,%si)
 00 00   add%al,(%bx,%si)
 
 OK, I will just test the first four nop instructions. Should I commit
 another patch?

 Yes, all others will have to go into emulator.c.
 You mean I need also add another test for nopl5~nop9 in emulator.c
 with the trick emulator mode?
 I will commit a modified one for realmode.c since some other works
 should be done in emulator.c.
Since we need to place some of the relevant code in emulator.c anyway, why
don't we place all the tests in emulator.c?

Arthur.



 Arthur.

  DR6=0ff0 DR7=0400
  EFER=0500
  Code=00 00 e9 50 ff ff ff 00 00 00 00 85 d2 74 20 45 31 c0 31 c9 0f
  1f 80 00 00 00 00 0f b6 04 31 41 83 c0 01 88 04 39 48 83 c1 01 41 39
  d0 75 ec 48 89 f8
 
  The error code is 0f 1f 80 00 00 00 00, which is a 7-bytes nop. Will
  the emulator runs well in that case when booting RHEL5.9 64-bit?
 
  Arthur
 
 
  
   + MK_INSN(nopl5, .byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); // 5 
   bytes nop
   + MK_INSN(nopl6, .byte 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00\n\r); 
   // 6 bytes nop
   + MK_INSN(nopl7, .byte 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 
   0x00\n\r); // 7 bytes nop
   + MK_INSN(nopl8, .byte 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 
   0x00\n\r); // 8 bytes nop
   + MK_INSN(nopl9, .byte 0x66, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 
   0x00, 0x00\n\r); // 9 bytes nop
   + exec_in_big_real_mode(insn_nopl1);
   + exec_in_big_real_mode(insn_nopl2);
   + exec_in_big_real_mode(insn_nopl3);
   + exec_in_big_real_mode(insn_nopl4);
   + exec_in_big_real_mode(insn_nopl5);
   + exec_in_big_real_mode(insn_nopl6);
   + exec_in_big_real_mode(insn_nopl7);
   + exec_in_big_real_mode(insn_nopl8);
   + exec_in_big_real_mode(insn_nopl9);
   + report(nopl, 0, 1);
   +}
   +
void realmode_start(void)
{
 test_null();
   @@ -1548,6 +1571,7 @@ void realmode_start(void)
 test_xlat();
 test_salc();
 test_fninit();
   + test_nopl();
  
 exit(0);
}
   --
   1.7.9.5
  
   --
   Gleb.
 
  --
  Gleb.

 --
 Gleb.



 --
 Arthur Chunqi Li
 Department of Computer Science
 School of EECS
 Peking University
 Beijing, China



--
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


Re: [PATCH] Test case of emulating multibyte NOP

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 03:22:59PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 3:17 PM, 李春奇 Arthur Chunqi Li yzt...@gmail.com 
 wrote:
  On Thu, Jun 6, 2013 at 3:02 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 02:49:14PM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 1:40 PM, Gleb Natapov g...@redhat.com wrote:
   On Thu, Jun 06, 2013 at 12:28:16AM +0800, 李春奇 Arthur Chunqi Li wrote:
   On Thu, Jun 6, 2013 at 12:13 AM, Gleb Natapov g...@redhat.com wrote:
This time the email is perfect :)
   
On Thu, Jun 06, 2013 at 12:02:52AM +0800, Arthur Chunqi Li wrote:
Add multibyte NOP test case to kvm-unit-tests. This version adds 
test cases into x86/realmode.c. This can test one of bugs when 
booting RHEL5.9 64-bit.
   
Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/realmode.c |   24 
 1 file changed, 24 insertions(+)
   
diff --git a/x86/realmode.c b/x86/realmode.c
index 981be08..e103ca6 100644
--- a/x86/realmode.c
+++ b/x86/realmode.c
@@ -1504,6 +1504,29 @@ static void test_fninit(void)
  report(fninit, 0, fsw == 0  (fcw  0x103f) == 0x003f);
 }
   
+static void test_nopl(void)
+{
+ MK_INSN(nopl1, .byte 0x90\n\r); // 1 byte nop
+ MK_INSN(nopl2, .byte 0x66, 0x90\n\r); // 2 bytes nop
+ MK_INSN(nopl3, .byte 0x0f, 0x1f, 0x00\n\r); // 3 bytes nop
+ MK_INSN(nopl4, .byte 0x0f, 0x1f, 0x40, 0x00\n\r); // 4 
bytes nop
But all nops below that are not supported in 16 bit mode. You can
disassemble realmode.elf in 16bit node (objdump -z -d -mi8086
x86/realmode.elf) and check yourself. Lets not complicate things for 
now
and test only those that are easy to test.
   Yes. But what if a 7-bytes nop runs in 16bit mode? Just the same as
   https://bugzilla.redhat.com/show_bug.cgi?id=967652
  
   It cannot. In 16 bit mode it is decoded as two instructions:
  0f 1f 80 00 00  nopw   0x0(%bx,%si)
  00 00   add%al,(%bx,%si)
  
  OK, I will just test the first four nop instructions. Should I commit
  another patch?
 
  Yes, all others will have to go into emulator.c.
  You mean I need also add another test for nopl5~nop9 in emulator.c
  with the trick emulator mode?
  I will commit a modified one for realmode.c since some other works
  should be done in emulator.c.
 Since we need to place some relevant codes in emulator.c, why don't we
 place all the tests in emulator.c?
 
We can place those 4 in both. I do not always run all the tests, so it is
nice to cover as much as possible in both.

--
Gleb.


[PATCH] kvm-unit-tests: Test case of emulating multibyte NOP

2013-06-06 Thread Arthur Chunqi Li
Add a multibyte (1 to 4-byte) NOPL test case to kvm-unit-tests
x86/realmode.c. This test only consists of NOP instructions that are
valid in 16-bit mode; the other test cases (5 to 9-byte NOPL) should be
placed in x86/emulator.c.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/realmode.c |   14 ++
 1 file changed, 14 insertions(+)

diff --git a/x86/realmode.c b/x86/realmode.c
index 981be08..3546771 100644
--- a/x86/realmode.c
+++ b/x86/realmode.c
@@ -1504,6 +1504,19 @@ static void test_fninit(void)
 report("fninit", 0, fsw == 0 && (fcw & 0x103f) == 0x003f);
 }
 
+static void test_nopl(void)
+{
+   MK_INSN(nopl1, ".byte 0x90\n\r"); // 1 byte nop
+   MK_INSN(nopl2, ".byte 0x66, 0x90\n\r"); // 2 bytes nop
+   MK_INSN(nopl3, ".byte 0x0f, 0x1f, 0x00\n\r"); // 3 bytes nop
+   MK_INSN(nopl4, ".byte 0x0f, 0x1f, 0x40, 0x00\n\r"); // 4 bytes nop
+   exec_in_big_real_mode(&insn_nopl1);
+   exec_in_big_real_mode(&insn_nopl2);
+   exec_in_big_real_mode(&insn_nopl3);
+   exec_in_big_real_mode(&insn_nopl4);
+   report("nopl", 0, 1);
+}
+
 void realmode_start(void)
 {
test_null();
@@ -1548,6 +1561,7 @@ void realmode_start(void)
test_xlat();
test_salc();
test_fninit();
+   test_nopl();
 
exit(0);
 }
-- 
1.7.9.5



Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 3:01 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 02:47:49PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
  Test access to %bpl via modr/m addressing mode. This case can test 
  another bug in the boot of RHEL5.9 64-bit.
 
  We have growing number of instructions tests using the same tlb trick. I
  think it is time to make the code more generic. Create a function that
  receives instruction to check and all the tlb games will be done by that
  function.
 Should I do some work to merge all these test cases?

 It would be nice, yes. I also have an idea on how to improve test
 reliability. Since it relies on tlb to be out of sync with actual page
 table a vmexit in a wrong time can break this assumption and wrong
 instruction will be emulated (the one from insn_page instead of
 alt_insn_page). If we make the instruction on insn_page place special
 value somewhere and check it after the test we can see if the wrong
 instruction was executed and rerun the test.
If I commit the patch to merge these test cases, should the patch be based
on what I have committed before (i.e. on top of the patch in this mail), or
based on the master branch?

Arthur


 
  Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
  ---
   x86/emulator.c |   41 +
   1 file changed, 41 insertions(+)
 
  diff --git a/x86/emulator.c b/x86/emulator.c
  index 96576e5..3563971 100644
  --- a/x86/emulator.c
  +++ b/x86/emulator.c
  @@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
   report(test, *mem == 0x8400);
   }
 
  +static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
  +   uint8_t *alt_insn_page, void *insn_ram)
  +{
  + ulong *cr3 = (ulong *)read_cr3();
  + uint16_t cx = 0;
  +
  + // Pad with RET instructions
  + memset(insn_page, 0xc3, 4096);
  + memset(alt_insn_page, 0xc3, 4096);
  + // Place a trapping instruction in the page to trigger a VMEXIT
  + insn_page[0] = 0x66; // mov $0x4321, %cx
  + insn_page[1] = 0xb9;
  + insn_page[2] = 0x21;
  + insn_page[3] = 0x43;
  + insn_page[4] = 0x89; // mov %eax, (%rax)
  + insn_page[5] = 0x00;
  + insn_page[6] = 0x90; // nop
  + // Place mov %cl, %bpl in alt_insn_page for emulator to execuate
  + // If emulator mistaken addressing %bpl, %cl may be moved to %ch
  + // %cx will be broken to 0x2121, not 0x4321
  + alt_insn_page[4] = 0x40;
  + alt_insn_page[5] = 0x88;
  + alt_insn_page[6] = 0xcd;
  +
  + // Load the code TLB with insn_page, but point the page tables at
  + // alt_insn_page (and keep the data TLB clear, for AMD decode 
  assist).
  + // This will make the CPU trap on the insn_page instruction but the
  + // hypervisor will see alt_insn_page.
  + install_page(cr3, virt_to_phys(insn_page), insn_ram);
  + // Load code TLB
  + invlpg(insn_ram);
  + asm volatile(call *%0 : : r(insn_ram+3));
  + // Trap, let hypervisor emulate at alt_insn_page
  + install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
  + asm volatile(call *%0 : : r(insn_ram), a(mem));
  + asm volatile(:=c(cx));
  Why not add the constrain to previous asm?
 I will merge them in next version.

 
  + report(access bpl in modr/m, cx == 0x4321);
  +}
  +
   int main()
   {
void *mem;
  @@ -964,6 +1003,8 @@ int main()
 
test_string_io_mmio(mem);
 
  + test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
  +
printf(\nSUMMARY: %d tests, %d failures\n, tests, fails);
return fails ? 1 : 0;
   }
  --
  1.7.9.5
 
  --
  Gleb.

 --
 Gleb.


Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 03:42:56PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 3:01 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 02:47:49PM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
   On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
   Test access to %bpl via modr/m addressing mode. This case can test 
   another bug in the boot of RHEL5.9 64-bit.
  
   We have growing number of instructions tests using the same tlb trick. I
   think it is time to make the code more generic. Create a function that
   receives instruction to check and all the tlb games will be done by that
   function.
  Should I do some work to merge all these test cases?
 
  It would be nice, yes. I also have an idea on how to improve test
  reliability. Since it relies on tlb to be out of sync with actual page
  table a vmexit in a wrong time can break this assumption and wrong
  instruction will be emulated (the one from insn_page instead of
  alt_insn_page). If we make the instruction on insn_page place special
  value somewhere and check it after the test we can see if the wrong
  instruction was executed and rerun the test.
 If I commit the patch to merge these test cases, should the patch base
 on what I have commit before (after patched of this mail), or base on
 the master thread?
 
Master. First patch provides the infrastructure. Second converts
existing users. Third adds new tests.

 Arthur
 
 
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
x86/emulator.c |   41 +
1 file changed, 41 insertions(+)
  
   diff --git a/x86/emulator.c b/x86/emulator.c
   index 96576e5..3563971 100644
   --- a/x86/emulator.c
   +++ b/x86/emulator.c
   @@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
report(test, *mem == 0x8400);
}
  
   +static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
   +   uint8_t *alt_insn_page, void *insn_ram)
   +{
   + ulong *cr3 = (ulong *)read_cr3();
   + uint16_t cx = 0;
   +
   + // Pad with RET instructions
   + memset(insn_page, 0xc3, 4096);
   + memset(alt_insn_page, 0xc3, 4096);
   + // Place a trapping instruction in the page to trigger a VMEXIT
   + insn_page[0] = 0x66; // mov $0x4321, %cx
   + insn_page[1] = 0xb9;
   + insn_page[2] = 0x21;
   + insn_page[3] = 0x43;
   + insn_page[4] = 0x89; // mov %eax, (%rax)
   + insn_page[5] = 0x00;
   + insn_page[6] = 0x90; // nop
   + // Place mov %cl, %bpl in alt_insn_page for emulator to execuate
   + // If emulator mistaken addressing %bpl, %cl may be moved to %ch
   + // %cx will be broken to 0x2121, not 0x4321
   + alt_insn_page[4] = 0x40;
   + alt_insn_page[5] = 0x88;
   + alt_insn_page[6] = 0xcd;
   +
   + // Load the code TLB with insn_page, but point the page tables at
   + // alt_insn_page (and keep the data TLB clear, for AMD decode 
   assist).
   + // This will make the CPU trap on the insn_page instruction but 
   the
   + // hypervisor will see alt_insn_page.
   + install_page(cr3, virt_to_phys(insn_page), insn_ram);
   + // Load code TLB
   + invlpg(insn_ram);
   + asm volatile(call *%0 : : r(insn_ram+3));
   + // Trap, let hypervisor emulate at alt_insn_page
   + install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
   + asm volatile(call *%0 : : r(insn_ram), a(mem));
   + asm volatile(:=c(cx));
   Why not add the constrain to previous asm?
  I will merge them in next version.
 
  
   + report(access bpl in modr/m, cx == 0x4321);
   +}
   +
int main()
{
 void *mem;
   @@ -964,6 +1003,8 @@ int main()
  
 test_string_io_mmio(mem);
  
   + test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
   +
 printf(\nSUMMARY: %d tests, %d failures\n, tests, fails);
 return fails ? 1 : 0;
}
   --
   1.7.9.5
  
   --
   Gleb.
 
  --
  Gleb.

--
Gleb.


Re: [PATCH] kvm-unit-tests: Test case of emulating multibyte NOP

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 03:38:29PM +0800, Arthur Chunqi Li wrote:
 Add a multibyte (1 to 4-byte) NOPL test case to kvm-unit-tests
 x86/realmode.c. This test only consists of NOP instructions that are
 valid in 16-bit mode; the other test cases (5 to 9-byte NOPL) should be
 placed in x86/emulator.c.
 
Applied, thanks!

 Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
 ---
  x86/realmode.c |   14 ++
  1 file changed, 14 insertions(+)
 
 diff --git a/x86/realmode.c b/x86/realmode.c
 index 981be08..3546771 100644
 --- a/x86/realmode.c
 +++ b/x86/realmode.c
 @@ -1504,6 +1504,19 @@ static void test_fninit(void)
   report("fninit", 0, fsw == 0 && (fcw & 0x103f) == 0x003f);
  }
  
 +static void test_nopl(void)
 +{
  + MK_INSN(nopl1, ".byte 0x90\n\r"); // 1 byte nop
  + MK_INSN(nopl2, ".byte 0x66, 0x90\n\r"); // 2 bytes nop
  + MK_INSN(nopl3, ".byte 0x0f, 0x1f, 0x00\n\r"); // 3 bytes nop
  + MK_INSN(nopl4, ".byte 0x0f, 0x1f, 0x40, 0x00\n\r"); // 4 bytes nop
  + exec_in_big_real_mode(&insn_nopl1);
  + exec_in_big_real_mode(&insn_nopl2);
  + exec_in_big_real_mode(&insn_nopl3);
  + exec_in_big_real_mode(&insn_nopl4);
  + report("nopl", 0, 1);
 +}
 +
  void realmode_start(void)
  {
   test_null();
 @@ -1548,6 +1561,7 @@ void realmode_start(void)
   test_xlat();
   test_salc();
   test_fninit();
 + test_nopl();
  
   exit(0);
  }
 -- 
 1.7.9.5

--
Gleb.


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Michael S. Tsirkin
On Tue, Jun 04, 2013 at 03:01:50PM +0930, Rusty Russell wrote:
 Michael S. Tsirkin m...@redhat.com writes:
  On Mon, Jun 03, 2013 at 09:56:15AM +0930, Rusty Russell wrote:
  Michael S. Tsirkin m...@redhat.com writes:
   On Thu, May 30, 2013 at 08:53:45AM -0500, Anthony Liguori wrote:
   Rusty Russell ru...@rustcorp.com.au writes:
   
Anthony Liguori aligu...@us.ibm.com writes:
Forcing a guest driver change is a really big
deal and I see no reason to do that unless there's a compelling 
reason
to.
   
So we're stuck with the 1.0 config layout for a very long time.
   
We definitely must not force a guest change.  The explicit aim of the
standard is that legacy and 1.0 be backward compatible.  One
deliverable is a document detailing how this is done (effectively a
summary of changes between what we have and 1.0).
   
   If 2.0 is fully backwards compatible, great.  It seems like such a
   difference that that would be impossible but I need to investigate
   further.
   
   Regards,
   
   Anthony Liguori
  
   If you look at my patches you'll see how it works.
   Basically old guests use BAR0 new ones don't, so
   it's easy: BAR0 access means legacy guest.
   Only started testing but things seem to work
   fine with old guests so far.
  
   I think we need a spec, not just driver code.
  
   Rusty what's the plan? Want me to write it?
  
  We need both, of course, but the spec work will happen in the OASIS WG.
  A draft is good, but let's not commit anything to upstream QEMU until we
  get the spec finalized.  And that is proposed to be late this year.
 
  Well that would be quite sad really.
  
  This means we can't make virtio a spec compliant pci express device,
  and we can't add any more feature bits, so no
  flexible buffer optimizations for virtio net.
 
  There are probably more projects that will be blocked.
 
  So how about we keep extending legacy layout for a bit longer:
  - add a way to access device with MMIO
  - use feature bit 31 to signal 64 bit features
(and shift device config accordingly)
 
 By my count, net still has 7 feature bits left, so I don't think the
 feature bits are likely to be a limitation in the next 6 months?

Actually I count 5 net-specific ones:

3, 4 and then 25, 26, 27

Unless you count 31, even though it's a generic transport bit?

That's still only 6 ...

 MMIO is a bigger problem.  Linux guests are happy with it: does it break
 the Windows drivers?
 
 Thanks,
 Rusty.


Regression after Remove support for reporting coalesced APIC IRQs

2013-06-06 Thread Gleb Natapov
Hi Jan,

I bisected [1] to f1ed0450a5fac7067590317cbf027f566b6ccbca. Fortunately,
further investigation showed that it is not really related to removing
APIC timer interrupt reinjection; the real problem is that we cannot
assume that __apic_accept_irq() always injects interrupts, as the patch
does, because the function skips interrupt injection if the APIC is
disabled. This misreporting breaks RTC interrupt tracking, so further RTC
interrupts stop being injected. The simplest solution that I see is to
revert most of the commit and keep only the APIC timer interrupt
reinjection part.

If you have more elegant solution let me know.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=58931
--
Gleb.


Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread 李春奇
On Thu, Jun 6, 2013 at 3:45 PM, Gleb Natapov g...@redhat.com wrote:
 On Thu, Jun 06, 2013 at 03:42:56PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 3:01 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 02:47:49PM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
   On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
   Test access to %bpl via modr/m addressing mode. This case can test 
   another bug in the boot of RHEL5.9 64-bit.
  
   We have growing number of instructions tests using the same tlb trick. I
   think it is time to make the code more generic. Create a function that
   receives instruction to check and all the tlb games will be done by that
   function.
  Should I do some work to merge all these test cases?
 
  It would be nice, yes. I also have an idea on how to improve test
  reliability. Since it relies on tlb to be out of sync with actual page
  table a vmexit in a wrong time can break this assumption and wrong
  instruction will be emulated (the one from insn_page instead of
  alt_insn_page). If we make the instruction on insn_page place special
  value somewhere and check it after the test we can see if the wrong
  instruction was executed and rerun the test.
 If I commit the patch to merge these test cases, should the patch base
 on what I have commit before (after patched of this mail), or base on
 the master thread?

 Master. First patch provides the infrastructure. Second converts
 existing users. Third adds new tests.
There are some problems packaging it. For some test cases, some
registers need to be set up before calling into alt_insn_page and some
registers need to be read back afterwards (such as in test_movabs). If
the emulator part is wrapped in a generic function, that per-test setup
and result collection is lost. Adding two caller-supplied callbacks to
initialize and fetch the results may make the parameter list too long.
Do you have any suggestions about this?


 Arthur

 
  
   Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
   ---
x86/emulator.c |   41 +
1 file changed, 41 insertions(+)
  
   diff --git a/x86/emulator.c b/x86/emulator.c
   index 96576e5..3563971 100644
   --- a/x86/emulator.c
   +++ b/x86/emulator.c
   @@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
report(test, *mem == 0x8400);
}
  
   +static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
   +   uint8_t *alt_insn_page, void *insn_ram)
   +{
   + ulong *cr3 = (ulong *)read_cr3();
   + uint16_t cx = 0;
   +
   + // Pad with RET instructions
   + memset(insn_page, 0xc3, 4096);
   + memset(alt_insn_page, 0xc3, 4096);
   + // Place a trapping instruction in the page to trigger a VMEXIT
   + insn_page[0] = 0x66; // mov $0x4321, %cx
   + insn_page[1] = 0xb9;
   + insn_page[2] = 0x21;
   + insn_page[3] = 0x43;
   + insn_page[4] = 0x89; // mov %eax, (%rax)
   + insn_page[5] = 0x00;
   + insn_page[6] = 0x90; // nop
   + // Place mov %cl, %bpl in alt_insn_page for emulator to execuate
   + // If emulator mistaken addressing %bpl, %cl may be moved to %ch
   + // %cx will be broken to 0x2121, not 0x4321
   + alt_insn_page[4] = 0x40;
   + alt_insn_page[5] = 0x88;
   + alt_insn_page[6] = 0xcd;
   +
   + // Load the code TLB with insn_page, but point the page tables at
   + // alt_insn_page (and keep the data TLB clear, for AMD decode 
   assist).
   + // This will make the CPU trap on the insn_page instruction but 
   the
   + // hypervisor will see alt_insn_page.
   + install_page(cr3, virt_to_phys(insn_page), insn_ram);
   + // Load code TLB
   + invlpg(insn_ram);
   + asm volatile(call *%0 : : r(insn_ram+3));
   + // Trap, let hypervisor emulate at alt_insn_page
   + install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
   + asm volatile(call *%0 : : r(insn_ram), a(mem));
   + asm volatile(:=c(cx));
   Why not add the constrain to previous asm?
  I will merge them in next version.
 
  
   + report(access bpl in modr/m, cx == 0x4321);
   +}
   +
int main()
{
 void *mem;
   @@ -964,6 +1003,8 @@ int main()
  
 test_string_io_mmio(mem);
  
   + test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
   +
 printf(\nSUMMARY: %d tests, %d failures\n, tests, fails);
 return fails ? 1 : 0;
}
   --
   1.7.9.5
  
   --
   Gleb.
 
  --
  Gleb.

 --
 Gleb.


Re: [PATCH] kvm-unit-tests: Add test case for accessing bpl via modr/m

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 05:33:52PM +0800, 李春奇 Arthur Chunqi Li wrote:
 On Thu, Jun 6, 2013 at 3:45 PM, Gleb Natapov g...@redhat.com wrote:
  On Thu, Jun 06, 2013 at 03:42:56PM +0800, 李春奇 Arthur Chunqi Li wrote:
  On Thu, Jun 6, 2013 at 3:01 PM, Gleb Natapov g...@redhat.com wrote:
   On Thu, Jun 06, 2013 at 02:47:49PM +0800, 李春奇 Arthur Chunqi Li wrote:
   On Thu, Jun 6, 2013 at 1:45 PM, Gleb Natapov g...@redhat.com wrote:
On Thu, Jun 06, 2013 at 01:03:44PM +0800, Arthur Chunqi Li wrote:
Test access to %bpl via modr/m addressing mode. This case can test 
another bug in the boot of RHEL5.9 64-bit.
   
We have growing number of instructions tests using the same tlb 
trick. I
think it is time to make the code more generic. Create a function that
receives instruction to check and all the tlb games will be done by 
that
function.
   Should I do some work to merge all these test cases?
  
   It would be nice, yes. I also have an idea on how to improve test
   reliability. Since it relies on tlb to be out of sync with actual page
   table a vmexit in a wrong time can break this assumption and wrong
   instruction will be emulated (the one from insn_page instead of
   alt_insn_page). If we make the instruction on insn_page place special
   value somewhere and check it after the test we can see if the wrong
   instruction was executed and rerun the test.
  If I commit the patch to merge these test cases, should the patch base
  on what I have commit before (after patched of this mail), or base on
  the master thread?
 
  Master. First patch provides the infrastructure. Second converts
  existing users. Third adds new tests.
 There are some problems packaging it. For some test cases some
 registers should be set before call alt_insn_page and some registers
 should be returned after it (such as test_movabs). If the emulator
 part is packed, the initialization before and result got after it may
 be invalid. Add two functions designed by caller to initialize and get
 result may cause the parameter table too long. Do you have any
 suggestions about this?
 
You can do what realmode.c does. It has inregs and outregs structures:
registers are set to the inregs values before a test and saved to outregs
after it. You can examine outregs to check correctness instead of checking
the HW registers directly.
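
A rough sketch of carrying that over to emulator.c (the struct layout,
helper name and register set are illustrative assumptions; %rbp is left
out here even though the bpl test would eventually need it handled):

/* A test fills inregs, the generic runner loads the GPRs from it, calls
 * into the remapped instruction page, and dumps the GPRs into outregs
 * for the test to examine afterwards. */
struct insn_regs {
	unsigned long rax, rbx, rcx, rdx, rsi, rdi;
};

static struct insn_regs inregs, outregs;

static void call_insn_with_regs(void *insn_ram)
{
	unsigned long rax = inregs.rax, rbx = inregs.rbx, rcx = inregs.rcx,
		      rdx = inregs.rdx, rsi = inregs.rsi, rdi = inregs.rdi;

	asm volatile("call *%[entry]"
		     : "+a"(rax), "+b"(rbx), "+c"(rcx),
		       "+d"(rdx), "+S"(rsi), "+D"(rdi)
		     : [entry] "r"(insn_ram)
		     : "memory");

	outregs = (struct insn_regs){ rax, rbx, rcx, rdx, rsi, rdi };
}

The bpl test would then zero inregs, run the TLB trick through such a
helper, and check (outregs.rcx & 0xffff) == 0x4321 instead of reading %cx
directly afterwards.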


 
  Arthur
 
  
   
Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/emulator.c |   41 +
 1 file changed, 41 insertions(+)
   
diff --git a/x86/emulator.c b/x86/emulator.c
index 96576e5..3563971 100644
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -901,6 +901,45 @@ static void test_simplealu(u32 *mem)
 report(test, *mem == 0x8400);
 }
   
+static void test_bpl_modrm(uint64_t *mem, uint8_t *insn_page,
+   uint8_t *alt_insn_page, void *insn_ram)
+{
+ ulong *cr3 = (ulong *)read_cr3();
+ uint16_t cx = 0;
+
+ // Pad with RET instructions
+ memset(insn_page, 0xc3, 4096);
+ memset(alt_insn_page, 0xc3, 4096);
+ // Place a trapping instruction in the page to trigger a VMEXIT
+ insn_page[0] = 0x66; // mov $0x4321, %cx
+ insn_page[1] = 0xb9;
+ insn_page[2] = 0x21;
+ insn_page[3] = 0x43;
+ insn_page[4] = 0x89; // mov %eax, (%rax)
+ insn_page[5] = 0x00;
+ insn_page[6] = 0x90; // nop
+ // Place mov %cl, %bpl in alt_insn_page for emulator to 
execuate
+ // If emulator mistaken addressing %bpl, %cl may be moved to 
%ch
+ // %cx will be broken to 0x2121, not 0x4321
+ alt_insn_page[4] = 0x40;
+ alt_insn_page[5] = 0x88;
+ alt_insn_page[6] = 0xcd;
+
+ // Load the code TLB with insn_page, but point the page tables 
at
+ // alt_insn_page (and keep the data TLB clear, for AMD decode 
assist).
+ // This will make the CPU trap on the insn_page instruction 
but the
+ // hypervisor will see alt_insn_page.
+ install_page(cr3, virt_to_phys(insn_page), insn_ram);
+ // Load code TLB
+ invlpg(insn_ram);
+ asm volatile(call *%0 : : r(insn_ram+3));
+ // Trap, let hypervisor emulate at alt_insn_page
+ install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
+ asm volatile(call *%0 : : r(insn_ram), a(mem));
+ asm volatile(:=c(cx));
Why not add the constrain to previous asm?
   I will merge them in next version.
  
   
+ report(access bpl in modr/m, cx == 0x4321);
+}
+
 int main()
 {
  void *mem;
@@ -964,6 +1003,8 @@ int main()
   
  test_string_io_mmio(mem);
   
+ test_bpl_modrm(mem, insn_page, alt_insn_page, insn_ram);
+
  printf(\nSUMMARY: %d tests, %d failures\n, tests, fails);
  return fails ? 1 : 0;
 }
--
1.7.9.5
   
--

RE: [RFC PATCH 0/6] KVM: PPC: Book3E: AltiVec support

2013-06-06 Thread Caraman Mihai Claudiu-B02008
   This looks like a bit much for 3.10 (certainly, subject lines like
   "refactor" and "enhance" and "add support" aren't going to make Linus
   happy given that we're past rc4) so I think we should apply
   http://patchwork.ozlabs.org/patch/242896/ for 3.10.  Then for 3.11,
   revert it after applying this patchset.
  
 
  Why not 1/6 plus e6500 removal?
 
 1/6 is not a bugfix.

Not sure I get it. Isn't this a better fix for AltiVec build breakage:

-#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 42
-#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 43
+#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 32
+#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 33

This removes the need for additional kvm_handlers. Obviously this doesn't
make AltiVec work, so we still need to disable e6500.

-Mike



Re: Bug#707257: linux-image-3.8-1-686-pae: KVM crashes with entry failed, hardware error 0x80000021

2013-06-06 Thread Stefan Pietsch
On 06.06.2013 08:42, Gleb Natapov wrote:
 On Wed, Jun 05, 2013 at 02:51:19PM +0200, Stefan Pietsch wrote:
 On 05.06.2013 14:10, Gleb Natapov wrote:
 On Wed, Jun 05, 2013 at 01:57:25PM +0200, Stefan Pietsch wrote:
 On 19.05.2013 14:32, Gleb Natapov wrote:
 On Sun, May 19, 2013 at 02:00:31AM +0100, Ben Hutchings wrote:
 Dear KVM maintainers, it appears that there is a gap in x86 emulation,
 at least on a 32-bit host.  Stefan found this when running GRML, a live
 distribution which can be downloaded from:
 http://download.grml.org/grml32-full_2013.02.iso.  His original
 reported is at http://bugs.debian.org/707257.

 Can you verify with latest linux.git HEAD? It works for me there on
 64bit. There were a lot of problems fixed in this area in 3.9/3.10 time 
 frame,
 so it would be helpful if you'll test 32bit before I install one myself.


 Kernel version 3.9.4-1 (linux-image-3.9-1-686-pae) made things worse.

 The virtual machine tries to boot the kernel, but stops after a few
 seconds and the kern.log shows:
 At what point does it stop?


 The machine stops at:

 Performance Events: Broken PMU hardware detected, using software events
 only.
 Failed to access perfctr msr (MSR c1 is 0)
 Enabling APIC mode:  Flat.  Using 1 I/O APICs
 Timer initialization is what comes next.
 
 I tried 32bit kernel compiled from kvm.git next (3.10.0-rc2+) branch and 
 upstream
 qemu and I cannot reproduce the problem. The guest boots fine.


I had no success with the Debian kernel 3.10~rc4-1~exp1 (3.10-rc4-686-pae).

The machine hangs after "Enabling APIC mode:  Flat.  Using 1 I/O APICs".



Re: Bug#707257: linux-image-3.8-1-686-pae: KVM crashes with entry failed, hardware error 0x80000021

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 01:35:13PM +0200, Stefan Pietsch wrote:
 On 06.06.2013 08:42, Gleb Natapov wrote:
  On Wed, Jun 05, 2013 at 02:51:19PM +0200, Stefan Pietsch wrote:
  On 05.06.2013 14:10, Gleb Natapov wrote:
  On Wed, Jun 05, 2013 at 01:57:25PM +0200, Stefan Pietsch wrote:
  On 19.05.2013 14:32, Gleb Natapov wrote:
  On Sun, May 19, 2013 at 02:00:31AM +0100, Ben Hutchings wrote:
  Dear KVM maintainers, it appears that there is a gap in x86 emulation,
  at least on a 32-bit host.  Stefan found this when running GRML, a live
  distribution which can be downloaded from:
  http://download.grml.org/grml32-full_2013.02.iso.  His original
  reported is at http://bugs.debian.org/707257.
 
  Can you verify with latest linux.git HEAD? It works for me there on
  64bit. There were a lot of problems fixed in this area in 3.9/3.10 time 
  frame,
  so it would be helpful if you'll test 32bit before I install one myself.
 
 
  Kernel version 3.9.4-1 (linux-image-3.9-1-686-pae) made things worse.
 
  The virtual machine tries to boot the kernel, but stops after a few
  seconds and the kern.log shows:
  At what point does it stop?
 
 
  The machine stops at:
 
  Performance Events: Broken PMU hardware detected, using software events
  only.
  Failed to access perfctr msr (MSR c1 is 0)
  Enabling APIC mode:  Flat.  Using 1 I/O APICs
  Timer initialization is what comes next.
  
  I tried 32bit kernel compiled from kvm.git next (3.10.0-rc2+) branch and 
  upstream
  qemu and I cannot reproduce the problem. The guest boots fine.
 
 
 I had no success with the Debian kernel 3.10~rc4-1~exp1 (3.10-rc4-686-pae).
 
 The machine hangs after Enabling APIC mode:  Flat.  Using 1 I/O APICs.
OK, since it looks like it hangs during timer initialization can you try
to disable kvmclock? Add -cpu qemu64,-kvmclock to your command line.
Also can you provide the output of cat /proc/cpuinfo on your host? And
complete serial output before hang.

--
Gleb.


Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-06 Thread Pekka Enberg
On Tue, May 28, 2013 at 2:49 PM, Cyrill Gorcunov gorcu...@openvz.org wrote:
 If the cpuvendor string is not filtered in the case of a host
 AMD machine, we get unhandled msr reads

 | [1709265.368464] kvm: 25706: cpu6 unhandled rdmsr: 0xc0010048
 | [1709265.397161] kvm: 25706: cpu7 unhandled rdmsr: 0xc0010048
 | [1709265.425774] kvm: 25706: cpu8 unhandled rdmsr: 0xc0010048

 thus we provide our own string and the kernel will use generic cpu init.

 Reported-by: Ingo Molnar mi...@kernel.org
 CC: Pekka Enberg penb...@kernel.org
 CC: Sasha Levin sasha.le...@oracle.com
 CC: Asias He as...@redhat.com
 Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
 ---
  tools/kvm/x86/cpuid.c |8 
  1 file changed, 8 insertions(+)

 Index: linux-2.6.git/tools/kvm/x86/cpuid.c
 ===
 --- linux-2.6.git.orig/tools/kvm/x86/cpuid.c
 +++ linux-2.6.git/tools/kvm/x86/cpuid.c
 @@ -12,6 +12,7 @@

  static void filter_cpuid(struct kvm_cpuid2 *kvm_cpuid)
  {
 +   unsigned int signature[3];
 unsigned int i;

 /*
 @@ -21,6 +22,13 @@ static void filter_cpuid(struct kvm_cpui
 struct kvm_cpuid_entry2 *entry = kvm_cpuid-entries[i];

 switch (entry-function) {
 +   case 0:
 +   /* Vendor name */
 +   memcpy(signature, LKVMLKVMLKVM, 12);
 +   entry-ebx = signature[0];
 +   entry-ecx = signature[1];
 +   entry-edx = signature[2];
 +   break;
 case 1:
 /* Set X86_FEATURE_HYPERVISOR */
 if (entry-index == 0)

Ping! Is there someone out there who has an AMD box they could test this on?


Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-06 Thread Cyrill Gorcunov
On Thu, Jun 06, 2013 at 03:03:03PM +0300, Pekka Enberg wrote:
  /* Set X86_FEATURE_HYPERVISOR */
  if (entry-index == 0)
 
 Ping! Is there someone out there who has an AMD box they could test this on?

I don't have it, sorry :-(


Re: Bug#707257: linux-image-3.8-1-686-pae: KVM crashes with entry failed, hardware error 0x80000021

2013-06-06 Thread Stefan Pietsch
On 06.06.2013 13:40, Gleb Natapov wrote:
 On Thu, Jun 06, 2013 at 01:35:13PM +0200, Stefan Pietsch wrote:

 I had no success with the Debian kernel 3.10~rc4-1~exp1 (3.10-rc4-686-pae).

 The machine hangs after Enabling APIC mode:  Flat.  Using 1 I/O APICs.
 OK, since it looks like it hangs during timer initialization can you try
 to disable kvmclock? Add -cpu qemu64,-kvmclock to your command line.
 Also can you provide the output of cat /proc/cpuinfo on your host? And
 complete serial output before hang.


command line:
qemu-system-i386 -machine accel=kvm -m 512 -cpu qemu64,-kvmclock -cdrom
grml32-full_2013.02.iso -serial file:ttyS0.log



/proc/cpuinfo:
##

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 14
model name  : Intel(R) Core(TM) Duo CPU  L2400  @ 1.66GHz
stepping: 12
microcode   : 0x54
cpu MHz : 1000.000
cache size  : 2048 KB
physical id : 0
siblings: 2
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fdiv_bug: no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm dtherm
bogomips: 3325.02
clflush size: 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 14
model name  : Intel(R) Core(TM) Duo CPU  L2400  @ 1.66GHz
stepping: 12
microcode   : 0x54
cpu MHz : 1000.000
cache size  : 2048 KB
physical id : 0
siblings: 2
core id : 1
cpu cores   : 2
apicid  : 1
initial apicid  : 1
fdiv_bug: no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm dtherm
bogomips: 3325.02
clflush size: 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:



ttyS0.log:
##

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.7-1-grml-486 (t...@grml.org) (gcc version
4.7.2 (Debian 4.7.2-5) ) #1 Debian 3.7.9-1+grml.1
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff]
usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009]
reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f]
reserved
[0.00] BIOS-e820: [mem 0x0010-0x1fffdfff]
usable
[0.00] BIOS-e820: [mem 0x1fffe000-0x1fff]
reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff]
reserved
[0.00] BIOS-e820: [mem 0xfffc-0x]
reserved
[0.00] Notice: NX (Execute Disable) protection cannot be
enabled: non-PAE kernel!
[0.00] SMBIOS 2.4 present.
[0.00] Hypervisor detected: KVM
[0.00] e820: last_pfn = 0x1fffe max_arch_pfn = 0x10
[0.00] PAT not supported by CPU.
[0.00] found SMP MP-table at [mem 0x000fdb00-0x000fdb0f] mapped
at [c00fdb00]
[0.00] init_memory_mapping: [mem 0x-0x1fffdfff]
[0.00] RAMDISK: [mem 0x1f33-0x1ffdbfff]
[0.00] ACPI: RSDP 000fd9a0 00014 (v00 BOCHS )
[0.00] ACPI: RSDT 1fffe4b0 00034 (v01 BOCHS  BXPCRSDT 0001
BXPC 0001)
[0.00] ACPI: FACP 1f80 00074 (v01 BOCHS  BXPCFACP 0001
BXPC 0001)
[0.00] ACPI: DSDT 1fffe4f0 011A9 (v01   BXPC   BXDSDT 0001
INTL 20100528)
[0.00] ACPI: FACS 1f40 00040
[0.00] ACPI: SSDT 1800 00735 (v01 BOCHS  BXPCSSDT 0001
BXPC 0001)
[0.00] ACPI: APIC 16e0 00078 (v01 BOCHS  BXPCAPIC 0001
BXPC 0001)
[0.00] ACPI: HPET 16a0 00038 (v01 BOCHS  BXPCHPET 0001
BXPC 0001)
[0.00] 0MB HIGHMEM available.
[0.00] 511MB LOWMEM available.
[0.00]   mapped low ram: 0 - 1fffe000
[0.00]   low ram: 0 - 1fffe000
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x0001-0x00ff]
[0.00]   Normal   [mem 0x0100-0x1fffdfff]
[0.00]   HighMem  empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x0001-0x0009efff]
[0.00]   node   0: [mem 0x0010-0x1fffdfff]
[0.00] Using APIC driver default
[0.00] ACPI: PM-Timer IO Port: 0xb008
[0.00] ACPI: 

[PATCH net 2/2] vhost: fix ubuf_info cleanup

2013-06-06 Thread Michael S. Tsirkin
vhost_net_clear_ubuf_info() didn't clear ubuf_info
after kfree, which could trigger a double free.
Fix this and simplify the code to make it more robust: make sure
ubuf_info is always freed through vhost_net_clear_ubuf_info().

Reported-by: Tommi Rantala tt.rant...@gmail.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 drivers/vhost/net.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 6b00f64..7fc47f7 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -155,14 +155,11 @@ static void vhost_net_ubuf_put_and_wait(struct 
vhost_net_ubuf_ref *ubufs)
 
 static void vhost_net_clear_ubuf_info(struct vhost_net *n)
 {
-
-   bool zcopy;
int i;
 
-   for (i = 0; i  n-dev.nvqs; ++i) {
-   zcopy = vhost_net_zcopy_mask  (0x1  i);
-   if (zcopy)
-   kfree(n-vqs[i].ubuf_info);
+   for (i = 0; i  VHOST_NET_VQ_MAX; ++i) {
+   kfree(n-vqs[i].ubuf_info);
+   n-vqs[i].ubuf_info = NULL;
}
 }
 
@@ -171,7 +168,7 @@ int vhost_net_set_ubuf_info(struct vhost_net *n)
bool zcopy;
int i;
 
-   for (i = 0; i  n-dev.nvqs; ++i) {
+   for (i = 0; i  VHOST_NET_VQ_MAX; ++i) {
zcopy = vhost_net_zcopy_mask  (0x1  i);
if (!zcopy)
continue;
@@ -183,12 +180,7 @@ int vhost_net_set_ubuf_info(struct vhost_net *n)
return 0;
 
 err:
-   while (i--) {
-   zcopy = vhost_net_zcopy_mask  (0x1  i);
-   if (!zcopy)
-   continue;
-   kfree(n-vqs[i].ubuf_info);
-   }
+   vhost_net_clear_ubuf_info(n);
return -ENOMEM;
 }
 
@@ -196,12 +188,12 @@ void vhost_net_vq_reset(struct vhost_net *n)
 {
int i;
 
+   vhost_net_clear_ubuf_info(n);
+
for (i = 0; i  VHOST_NET_VQ_MAX; i++) {
n-vqs[i].done_idx = 0;
n-vqs[i].upend_idx = 0;
n-vqs[i].ubufs = NULL;
-   kfree(n-vqs[i].ubuf_info);
-   n-vqs[i].ubuf_info = NULL;
n-vqs[i].vhost_hlen = 0;
n-vqs[i].sock_hlen = 0;
}
-- 
MST



[PATCH net 1/2] vhost: check owner before we overwrite ubuf_info

2013-06-06 Thread Michael S. Tsirkin
If the device has an owner, we shouldn't touch ubuf_info
since it might be in use.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 drivers/vhost/net.c   | 4 
 drivers/vhost/vhost.c | 8 +++-
 drivers/vhost/vhost.h | 1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2b51e23..6b00f64 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1053,6 +1053,10 @@ static long vhost_net_set_owner(struct vhost_net *n)
int r;
 
mutex_lock(n-dev.mutex);
+   if (vhost_dev_has_owner(n-dev)) {
+   r = -EBUSY;
+   goto out;
+   }
r = vhost_net_set_ubuf_info(n);
if (r)
goto out;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index beee7f5..60aa5ad 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -344,13 +344,19 @@ static int vhost_attach_cgroups(struct vhost_dev *dev)
 }
 
 /* Caller should have device mutex */
+bool vhost_dev_has_owner(struct vhost_dev *dev)
+{
+   return dev-mm;
+}
+
+/* Caller should have device mutex */
 long vhost_dev_set_owner(struct vhost_dev *dev)
 {
struct task_struct *worker;
int err;
 
/* Is there an owner already? */
-   if (dev-mm) {
+   if (vhost_dev_has_owner(dev)) {
err = -EBUSY;
goto err_mm;
}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a7ad635..64adcf9 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -133,6 +133,7 @@ struct vhost_dev {
 
 long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs, int 
nvqs);
 long vhost_dev_set_owner(struct vhost_dev *dev);
+bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
 struct vhost_memory *vhost_dev_reset_owner_prepare(void);
 void vhost_dev_reset_owner(struct vhost_dev *, struct vhost_memory *);
-- 
MST



[PATCH net 0/2] vhost fixes for 3.10

2013-06-06 Thread Michael S. Tsirkin
Two patches fixing the fallout from the vhost cleanup in 3.10.
Thanks to Tommi Rantala who reported the issue.

Tommi, could you please confirm this fixes the crashes for you?

Michael S. Tsirkin (2):
  vhost: check owner before we overwrite ubuf_info
  vhost: fix ubuf_info cleanup

 drivers/vhost/net.c   | 26 +++---
 drivers/vhost/vhost.c |  8 +++-
 drivers/vhost/vhost.h |  1 +
 3 files changed, 19 insertions(+), 16 deletions(-)

-- 
MST



Re: [patch 2/2] tools: lkvm - Filter out cpu vendor string

2013-06-06 Thread Asias He
On Thu, Jun 6, 2013 at 8:03 PM, Pekka Enberg penb...@kernel.org wrote:
 On Tue, May 28, 2013 at 2:49 PM, Cyrill Gorcunov gorcu...@openvz.org wrote:
 If the cpuvendor string is not filtered in the case of a host
 AMD machine, we get unhandled msr reads

 | [1709265.368464] kvm: 25706: cpu6 unhandled rdmsr: 0xc0010048
 | [1709265.397161] kvm: 25706: cpu7 unhandled rdmsr: 0xc0010048
 | [1709265.425774] kvm: 25706: cpu8 unhandled rdmsr: 0xc0010048

 thus we provide our own string and the kernel will use generic cpu init.

 Reported-by: Ingo Molnar mi...@kernel.org
 CC: Pekka Enberg penb...@kernel.org
 CC: Sasha Levin sasha.le...@oracle.com
 CC: Asias He as...@redhat.com
 Signed-off-by: Cyrill Gorcunov gorcu...@openvz.org
 ---
  tools/kvm/x86/cpuid.c |8 
  1 file changed, 8 insertions(+)

 Index: linux-2.6.git/tools/kvm/x86/cpuid.c
 ===
 --- linux-2.6.git.orig/tools/kvm/x86/cpuid.c
 +++ linux-2.6.git/tools/kvm/x86/cpuid.c
 @@ -12,6 +12,7 @@

  static void filter_cpuid(struct kvm_cpuid2 *kvm_cpuid)
  {
 +   unsigned int signature[3];
 unsigned int i;

 /*
 @@ -21,6 +22,13 @@ static void filter_cpuid(struct kvm_cpui
 struct kvm_cpuid_entry2 *entry = kvm_cpuid-entries[i];

 switch (entry-function) {
 +   case 0:
 +   /* Vendor name */
 +   memcpy(signature, LKVMLKVMLKVM, 12);
 +   entry-ebx = signature[0];
 +   entry-ecx = signature[1];
 +   entry-edx = signature[2];
 +   break;
 case 1:
 /* Set X86_FEATURE_HYPERVISOR */
 if (entry-index == 0)

 Ping! Is there someone out there who has an AMD box they could test this on?

I will try to find one.

--
Asias


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread H. Peter Anvin
On 06/05/2013 11:34 PM, Gleb Natapov wrote:

 SeaBIOS runs the virtio code in 32-bit mode with a flat memory layout.
 There are loads of ASSERT32FLAT()s in the code to make sure of this.

 Well, not exactly. Initialization is done in 32bit, but disk
 reads/writes are done in 16bit mode since it should work from int13
 interrupt handler. The only way I know to access MMIO bars from 16 bit
 is to use SMM which we do not have in KVM.
 

In some ways it is even worse for PXE, since PXE is defined to work from
16-bit protected mode... but the mechanism to request a segment mapping
for the high memory area doesn't actually work.

-hpa




Re: [PATCH v2] virtio_balloon: leak_balloon(): only tell host if we got pages deflated

2013-06-06 Thread Rafael Aquini
On Wed, Jun 05, 2013 at 09:18:37PM -0400, Luiz Capitulino wrote:
 The balloon_page_dequeue() function can return NULL. If it does for
 the first page being freed, then leak_balloon() will create a
 scatter list with len=0. Which in turn seems to generate an invalid
 virtio request.
 
 I didn't get this in practice, I found it by code review. On the other
 hand, such an invalid virtio request will cause errors in QEMU and
 fill_balloon() also performs the same check implemented by this commit.
 
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
 Acked-by: Rafael Aquini aqu...@redhat.com
 ---
 
 o v2
 
  - Improve changelog
 
  drivers/virtio/virtio_balloon.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
 index bd3ae32..71af7b5 100644
 --- a/drivers/virtio/virtio_balloon.c
 +++ b/drivers/virtio/virtio_balloon.c
 @@ -191,7 +191,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
 size_t num)
* virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
* is true, we *have* to do it in this order
*/
 - tell_host(vb, vb-deflate_vq);

Luiz, sorry for not being clearer before. I was referring to adding a comment in
the code, to explain briefly why we should not get rid of this check.

 + if (vb-num_pfns != 0)
 + tell_host(vb, vb-deflate_vq);
   mutex_unlock(vb-balloon_lock);
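
For instance, something along these lines (just a suggested wording, not part
of the posted patch):

	/*
	 * balloon_page_dequeue() may have returned NULL for every page,
	 * leaving num_pfns == 0; don't hand the host a zero-length sg.
	 */
	if (vb->num_pfns != 0)
		tell_host(vb, vb->deflate_vq);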

If the comment is regarded as unnecessary, then just ignore my suggestion. I'm
OK with your patch. :)

Cheers!
-- Rafael



Re: [PATCH v2] virtio_balloon: leak_balloon(): only tell host if we got pages deflated

2013-06-06 Thread Luiz Capitulino
On Thu, 6 Jun 2013 11:13:58 -0300
Rafael Aquini aqu...@redhat.com wrote:

 On Wed, Jun 05, 2013 at 09:18:37PM -0400, Luiz Capitulino wrote:
  The balloon_page_dequeue() function can return NULL. If it does for
  the first page being freed, then leak_balloon() will create a
  scatter list with len=0. Which in turn seems to generate an invalid
  virtio request.
  
  I didn't get this in practice, I found it by code review. On the other
  hand, such an invalid virtio request will cause errors in QEMU and
  fill_balloon() also performs the same check implemented by this commit.
  
  Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
  Acked-by: Rafael Aquini aqu...@redhat.com
  ---
  
  o v2
  
   - Improve changelog
  
   drivers/virtio/virtio_balloon.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)
  
  diff --git a/drivers/virtio/virtio_balloon.c 
  b/drivers/virtio/virtio_balloon.c
  index bd3ae32..71af7b5 100644
  --- a/drivers/virtio/virtio_balloon.c
  +++ b/drivers/virtio/virtio_balloon.c
  @@ -191,7 +191,8 @@ static void leak_balloon(struct virtio_balloon *vb, 
  size_t num)
   * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
   * is true, we *have* to do it in this order
   */
  -   tell_host(vb, vb-deflate_vq);
 
 Luiz, sorry for not being clearer before. I was referring to adding a comment
 in the code, to explain briefly why we should not get rid of this check.

Oh.

  +   if (vb-num_pfns != 0)
  +   tell_host(vb, vb-deflate_vq);
  mutex_unlock(vb-balloon_lock);
 
 If the comment is regarded as unnecessary, then just ignore my suggestion. I'm
 OK with your patch. :)

IMHO, the code is clear enough.


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Anthony Liguori

Hi Rusty,

Rusty Russell ru...@rustcorp.com.au writes:

 Anthony Liguori aligu...@us.ibm.com writes:
 4) Do virtio-pcie, make it PCI-e friendly (drop the IO BAR completely), give
it a new device/vendor ID.   Continue to use virtio-pci for existing
devices potentially adding virtio-{net,blk,...}-pcie variants for
people that care to use them.

 Now you have a different compatibility problem; how do you know the
 guest supports the new virtio-pcie net?

We don't care.

We would still use virtio-pci for existing devices.  Only new devices
would use virtio-pcie.

 If you put a virtio-pci card behind a PCI-e bridge today, it's not
 compliant, but AFAICT it will Just Work.  (Modulo the 16-dev limit).

I believe you can put it in legacy mode and then there isn't the 16-dev
limit.  I believe the only advantage of putting it in native mode is
that then you can do native hotplug (as opposed to ACPI hotplug).

So sticking with virtio-pci seems reasonable to me.

 I've been assuming we'd avoid a flag day change; that devices would
 look like existing virtio-pci with capabilities indicating the new
 config layout.

I don't think that's feasible.  Maybe 5 or 10 years from now, we switch
the default adapter to virtio-pcie.

 I think 4 is the best path forward.  It's better for users (guests
 continue to work as they always have).  There's less confusion about
 enabling PCI-e support--you must ask for the virtio-pcie variant and you
 must have a virtio-pcie driver.  It's easy to explain.

 Removing both forward and backward compatibility is easy to explain, but
 I think it'll be harder to deploy.  This is your area though, so perhaps
 I'm wrong.

My concern is that it's not real backwards compatibility.

 It also maps to what regular hardware does.  I highly doubt that there
 are any real PCI cards that made the shift from PCI to PCI-e without
 bumping at least a revision ID.

 Noone expected the new cards to Just Work with old OSes: a new machine
 meant a new OS and new drivers.  Hardware vendors like that.

Yup.

 Since virtualization often involves legacy, our priorities might be
 different.

So realistically, I think if we introduce virtio-pcie with a different
vendor ID, it will be adopted fairly quickly.  The drivers will show up
in distros quickly and get backported.

New devices can be limited to supporting virtio-pcie and we'll certainly
provide a way to use old devices with virtio-pcie too.  But for
practical reasons, I think we have to continue using virtio-pci by
default.

Regards,

Anthony Liguori


 Cheers,
 Rusty.



Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Anthony Liguori
Gleb Natapov g...@redhat.com writes:

 On Wed, Jun 05, 2013 at 07:41:17PM -0500, Anthony Liguori wrote:
 H. Peter Anvin h...@zytor.com writes:
 
  On 06/05/2013 03:08 PM, Anthony Liguori wrote:
 
  Definitely an option.  However, we want to be able to boot from native
  devices, too, so having an I/O BAR (which would not be used by the OS
  driver) should still at the very least be an option.
  
  What makes it so difficult to work with an MMIO bar for PCI-e?
  
  With legacy PCI, tracking allocation of MMIO vs. PIO is pretty straight
  forward.  Is there something special about PCI-e here?
  
 
  It's not tracking allocation.  It is that accessing memory above 1 MiB
  is incredibly painful in the BIOS environment, which basically means
  MMIO is inaccessible.
 
 Oh, you mean in real mode.
 
 SeaBIOS runs the virtio code in 32-bit mode with a flat memory layout.
 There are loads of ASSERT32FLAT()s in the code to make sure of this.
 
 Well, not exactly. Initialization is done in 32bit, but disk
 reads/writes are done in 16bit mode since it should work from int13
 interrupt handler. The only way I know to access MMIO bars from 16 bit
 is to use SMM which we do not have in KVM.

Ah, if it's just the dataplane operations then there's another solution.

We can introduce a virtqueue flag that asks the backend to poll for new
requests.  Then SeaBIOS can add the request to the queue and not worry
about kicking or reading the ISR.

SeaBIOS is polling for completion anyway.
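
Purely as a toy illustration of that scheme (the flag name and the ring layout
here are invented for the sketch and bear no relation to the real virtio ring
or to any QEMU/SeaBIOS code):

#include <stdint.h>
#include <stdio.h>

#define TOY_F_NO_KICK  (1u << 0)        /* hypothetical "backend polls" flag */
#define RING_SIZE      8

struct toy_ring {
	uint32_t flags;
	uint32_t avail_idx;             /* written by the driver */
	uint32_t used_idx;              /* written by the device */
	uint32_t desc[RING_SIZE];       /* stands in for real descriptors */
};

/* Driver side: queue a request; skip the kick when the device polls. */
static void driver_submit(struct toy_ring *r, uint32_t req)
{
	r->desc[r->avail_idx % RING_SIZE] = req;
	r->avail_idx++;
	if (!(r->flags & TOY_F_NO_KICK)) {
		/* here we would write the notify register and read the ISR */
	}
}

/* Device side: a polling backend consumes whatever has been made available. */
static void device_poll(struct toy_ring *r)
{
	while (r->used_idx != r->avail_idx)
		r->used_idx++;
}

int main(void)
{
	struct toy_ring ring = { .flags = TOY_F_NO_KICK };

	driver_submit(&ring, 42);
	/* SeaBIOS-style completion handling: just poll the used index. */
	while (ring.used_idx != ring.avail_idx)
		device_poll(&ring);     /* in reality the host advances used_idx */

	printf("completed %u request(s)\n", (unsigned)ring.used_idx);
	return 0;
}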

Regards,

Anthony Liguori


 --
   Gleb.



Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Gerd Hoffmann
On 06/06/13 08:34, Gleb Natapov wrote:
 On Wed, Jun 05, 2013 at 07:41:17PM -0500, Anthony Liguori wrote:

 Oh, you mean in real mode.

 SeaBIOS runs the virtio code in 32-bit mode with a flat memory layout.
 There are loads of ASSERT32FLAT()s in the code to make sure of this.

 Well, not exactly. Initialization is done in 32bit, but disk
 reads/writes are done in 16bit mode since it should work from int13
 interrupt handler.

Exactly.  It's only the initialization code which has ASSERT32FLAT()
all over the place.  Which actually is the majority of the code in most
cases as all the hardware detection and initialization code is there.
But kicking I/O requests must work from 16bit mode too.

 The only way I know to access MMIO bars from 16 bit
 is to use SMM which we do not have in KVM.

For seabios itself this isn't a big issue, see pci_{readl,writel} in
src/pci.c.  When called in 16bit mode it goes into 32bit mode
temporarily, just for accessing the mmio register.  ahci driver uses it,
xhci driver (wip atm) will use that too, and virtio-{blk,scsi} drivers
in seabios can do the same.

But as hpa mentioned it will be more tricky for option roms (aka
virtio-net).

cheers,
  Gerd




Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Gleb Natapov
On Thu, Jun 06, 2013 at 05:06:32PM +0200, Gerd Hoffmann wrote:
 On 06/06/13 08:34, Gleb Natapov wrote:
  On Wed, Jun 05, 2013 at 07:41:17PM -0500, Anthony Liguori wrote:
 
  Oh, you mean in real mode.
 
  SeaBIOS runs the virtio code in 32-bit mode with a flat memory layout.
  There are loads of ASSERT32FLAT()s in the code to make sure of this.
 
  Well, not exactly. Initialization is done in 32bit, but disk
  reads/writes are done in 16bit mode since it should work from int13
  interrupt handler.
 
 Exactly.  It's only the initialization code which has ASSERT32FLAT()
 all over the place.  Which actually is the majority of the code in most
 cases as all the hardware detection and initialization code is there.
 But kicking I/O requests must work from 16bit mode too.
 
  The only way I know to access MMIO bars from 16 bit
  is to use SMM which we do not have in KVM.
 
 For seabios itself this isn't a big issue, see pci_{readl,writel} in
 src/pci.c.  When called in 16bit mode it goes into 32bit mode
 temporarily, just for accessing the mmio register.  ahci driver uses it,
 xhci driver (wip atm) will use that too, and virtio-{blk,scsi} drivers
 in seabios can do the same.
 
Isn't this approach broken? How can SeaBIOS be sure it restores the real
mode registers to exactly the same state they were in before entering 32-bit
mode?

 But as hpa mentioned it will be more tricky for option roms (aka
 virtio-net).
 
 cheers,
   Gerd
 

--
Gleb.


Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread H. Peter Anvin
On 06/06/2013 08:10 AM, Gleb Natapov wrote:
 On Thu, Jun 06, 2013 at 05:06:32PM +0200, Gerd Hoffmann wrote:

 Isn't this approach broken? How can SeaBIOS be sure it restores the real
 mode registers to exactly the same state they were in before entering 32-bit
 mode?
 

It can't... so yes, it is broken.

-hpa




Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Gerd Hoffmann
  Hi,

 For seabios itself this isn't a big issue, see pci_{readl,writel} in
 src/pci.c.  When called in 16bit mode it goes into 32bit mode
 temporarily, just for accessing the mmio register.  ahci driver uses it,
 xhci driver (wip atm) will use that too, and virtio-{blk,scsi} drivers
 in seabios can do the same.

 Isn't this approach broken? How can SeaBIOS be sure it restores the real
 mode registers to exactly the same state they were in before entering 32-bit
 mode?

Don't know the details of that magic.  Kevin had some concerns on the
stability of this, so maybe there is a theoretical hole.  So far I haven't
seen any issues in practice, but also didn't stress it too much.
Basically I've only used that with all kinds of boot loaders; it could very
well break if you try to use it with more esoteric stuff such as DOS
extenders and then hit unhandled corner cases ...

cheers,
  Gerd




[PATCH 1/2] kvm-unit-tests: Add a func to run instruction in emulator

2013-06-06 Thread Arthur Chunqi Li
Add a function trap_emulator() to run an instruction in the emulator.
Set inregs first (%rax is invalid because it is used as the return
address), put the instruction code in alt_insn and call the function with
alt_insn_length. Get the results in outregs.
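
For reference, a hypothetical extra caller might look like the sketch below
(the instruction bytes, register value and test name are made up here and are
not part of this patch; patch 2/2 converts the two existing tests):

static void test_mov_reg_example(uint64_t *mem, uint8_t *insn_page,
                                 uint8_t *alt_insn_page, void *insn_ram)
{
    // mov %rbx, (%rax) -- %rax holds mem, so the write gets emulated
    uint8_t alt_insn[] = {0x48, 0x89, 0x18};

    inregs = (struct regs){ .rbx = 0x1234567890abcdefULL };
    trap_emulator(mem, insn_page, alt_insn_page, insn_ram, alt_insn, 3);
    report("64-bit mov reg to mem (example)", *mem == 0x1234567890abcdefULL);
}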

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/emulator.c |   81 
 1 file changed, 81 insertions(+)

diff --git a/x86/emulator.c b/x86/emulator.c
index 96576e5..8ab9904 100644
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -11,6 +11,14 @@ int fails, tests;
 
 static int exceptions;
 
+struct regs {
+   u64 rax, rbx, rcx, rdx;
+   u64 rsi, rdi, rsp, rbp;
+   u64 rip, rflags;
+};
+
+static struct regs inregs, outregs;
+
 void report(const char *name, int result)
 {
++tests;
@@ -685,6 +693,79 @@ static void test_shld_shrd(u32 *mem)
 report(shrd (cl), *mem == ((0x12345678  3) | (5u  29)));
 }
 
+static void trap_emulator(uint64_t *mem, uint8_t *insn_page,
+uint8_t *alt_insn_page, void *insn_ram,
+uint8_t *alt_insn, int alt_insn_length)
+{
+   ulong *cr3 = (ulong *)read_cr3();
+   int i;
+
+   // Pad with RET instructions
+   memset(insn_page, 0xc3, 4096);
+   memset(alt_insn_page, 0xc3, 4096);
+
+   // Place a trapping instruction in the page to trigger a VMEXIT
+   insn_page[0] = 0x89; // mov %eax, (%rax)
+   insn_page[1] = 0x00;
+   insn_page[2] = 0x90; // nop
+   insn_page[3] = 0xc3; // ret
+
+   // Place the instruction we want the hypervisor to see in the alternate 
page
+   for (i=0; ialt_insn_length; i++)
+   alt_insn_page[i] = alt_insn[i];
+
+   // Save general registers
+   asm volatile(
+   push %rax\n\r
+   push %rbx\n\r
+   push %rcx\n\r
+   push %rdx\n\r
+   push %rsi\n\r
+   push %rdi\n\r
+   );
+   // Load the code TLB with insn_page, but point the page tables at
+   // alt_insn_page (and keep the data TLB clear, for AMD decode assist).
+   // This will make the CPU trap on the insn_page instruction but the
+   // hypervisor will see alt_insn_page.
+   install_page(cr3, virt_to_phys(insn_page), insn_ram);
+   invlpg(insn_ram);
+   // Load code TLB
+   asm volatile(call *%0 : : r(insn_ram + 3));
+   install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
+   // Trap, let hypervisor emulate at alt_insn_page
+   asm volatile(
+   call *%1\n\r
+
+   mov %%rax, 0+%[outregs] \n\t
+   mov %%rbx, 8+%[outregs] \n\t
+   mov %%rcx, 16+%[outregs] \n\t
+   mov %%rdx, 24+%[outregs] \n\t
+   mov %%rsi, 32+%[outregs] \n\t
+   mov %%rdi, 40+%[outregs] \n\t
+   mov %%rsp,48+ %[outregs] \n\t
+   mov %%rbp, 56+%[outregs] \n\t
+
+   /* Save RFLAGS in outregs*/
+   pushf \n\t
+   popq 72+%[outregs] \n\t
+   : [outregs]+m(outregs)
+   : r(insn_ram),
+   a(mem), b(inregs.rbx),
+   c(inregs.rcx), d(inregs.rdx),
+   S(inregs.rsi), D(inregs.rdi)
+   : memory, cc
+   );
+   // Restore general registers
+   asm volatile(
+   pop %rax\n\r
+   pop %rbx\n\r
+   pop %rcx\n\r
+   pop %rdx\n\r
+   pop %rsi\n\r
+   pop %rdi\n\r
+   );
+}
+
 static void advance_rip_by_3_and_note_exception(struct ex_regs *regs)
 {
 ++exceptions;
-- 
1.7.9.5



[PATCH 2/2] kvm-unit-tests: Change two cases to use trap_emulator

2013-06-06 Thread Arthur Chunqi Li
Change two functions (test_mmx_movq_mf and test_movabs) to use the
unified trap_emulator().

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/emulator.c |   66 
 1 file changed, 14 insertions(+), 52 deletions(-)

diff --git a/x86/emulator.c b/x86/emulator.c
index 8ab9904..fa8993f 100644
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -776,72 +776,34 @@ static void test_mmx_movq_mf(uint64_t *mem, uint8_t 
*insn_page,
 uint8_t *alt_insn_page, void *insn_ram)
 {
 uint16_t fcw = 0;  // all exceptions unmasked
-ulong *cr3 = (ulong *)read_cr3();
+uint8_t alt_insn[] = {0x0f, 0x7f, 0x00}; // movq %mm0, (%rax)
 
 write_cr0(read_cr0()  ~6);  // TS, EM
-// Place a trapping instruction in the page to trigger a VMEXIT
-insn_page[0] = 0x89; // mov %eax, (%rax)
-insn_page[1] = 0x00;
-insn_page[2] = 0x90; // nop
-insn_page[3] = 0xc3; // ret
-// Place the instruction we want the hypervisor to see in the alternate 
page
-alt_insn_page[0] = 0x0f; // movq %mm0, (%rax)
-alt_insn_page[1] = 0x7f;
-alt_insn_page[2] = 0x00;
-alt_insn_page[3] = 0xc3; // ret
-
 exceptions = 0;
 handle_exception(MF_VECTOR, advance_rip_by_3_and_note_exception);
-
-// Load the code TLB with insn_page, but point the page tables at
-// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
-// This will make the CPU trap on the insn_page instruction but the
-// hypervisor will see alt_insn_page.
-install_page(cr3, virt_to_phys(insn_page), insn_ram);
 asm volatile(fninit; fldcw %0 : : m(fcw));
 asm volatile(fldz; fldz; fdivp); // generate exception
-invlpg(insn_ram);
-// Load code TLB
-asm volatile(call *%0 : : r(insn_ram + 3));
-install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
-// Trap, let hypervisor emulate at alt_insn_page
-asm volatile(call *%0 : : r(insn_ram), a(mem));
+
+inregs = (struct regs){ 0 };
+trap_emulator(mem, insn_page, alt_insn_page, insn_ram, 
+   alt_insn, 3);
 // exit MMX mode
 asm volatile(fnclex; emms);
-report(movq mmx generates #MF, exceptions == 1);
+report(movq mmx generates #MF2, exceptions == 1);
 handle_exception(MF_VECTOR, 0);
 }
 
 static void test_movabs(uint64_t *mem, uint8_t *insn_page,
   uint8_t *alt_insn_page, void *insn_ram)
 {
-uint64_t val = 0;
-ulong *cr3 = (ulong *)read_cr3();
-
-// Pad with RET instructions
-memset(insn_page, 0xc3, 4096);
-memset(alt_insn_page, 0xc3, 4096);
-// Place a trapping instruction in the page to trigger a VMEXIT
-insn_page[0] = 0x89; // mov %eax, (%rax)
-insn_page[1] = 0x00;
-// Place the instruction we want the hypervisor to see in the alternate
-// page. A buggy hypervisor will fetch a 32-bit immediate and return
-// 0xc3c3c3c3.
-alt_insn_page[0] = 0x48; // mov $0xc3c3c3c3c3c3c3c3, %rcx
-alt_insn_page[1] = 0xb9;
-
-// Load the code TLB with insn_page, but point the page tables at
-// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
-// This will make the CPU trap on the insn_page instruction but the
-// hypervisor will see alt_insn_page.
-install_page(cr3, virt_to_phys(insn_page), insn_ram);
-// Load code TLB
-invlpg(insn_ram);
-asm volatile(call *%0 : : r(insn_ram + 3));
-// Trap, let hypervisor emulate at alt_insn_page
-install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
-asm volatile(call *%1 : =c(val) : r(insn_ram), a(mem), c(0));
-report(64-bit mov imm, val == 0xc3c3c3c3c3c3c3c3);
+// mov $0xc3c3c3c3c3c3c3c3, %rcx
+uint8_t alt_insn[] = {0x48, 0xb9, 0xc3, 0xc3, 0xc3,
+   0xc3, 0xc3, 0xc3, 0xc3, 0xc3};
+inregs = (struct regs){ .rcx = 0 };
+
+trap_emulator(mem, insn_page, alt_insn_page, insn_ram,
+   alt_insn, 10);
+report(64-bit mov imm2, outregs.rcx == 0xc3c3c3c3c3c3c3c3);
 }
 
 static void test_crosspage_mmio(volatile uint8_t *mem)
-- 
1.7.9.5



RE: Regression after Remove support for reporting coalesced APIC IRQs

2013-06-06 Thread Ren, Yongjie
 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Thursday, June 06, 2013 4:54 PM
 To: Jan Kiszka
 Cc: kvm@vger.kernel.org; Ren, Yongjie
 Subject: Regression after Remove support for reporting coalesced APIC
 IRQs
 
 Hi Jan,
 
 I bisected [1] to f1ed0450a5fac7067590317cbf027f566b6ccbca.

Right. Before Gleb's mail, I also just did the bisection for this bug. 
The first bad commit is Jan's f1ed0450.

 Fortunately
 further investigation showed that it is not really related to removing
 APIC timer interrupt reinjection and the real problem is that we cannot
 assume that __apic_accept_irq() always injects interrupts like the patch
 does because the function skips interrupt injection if APIC is disabled.
 This misreporting screws up RTC interrupt tracking, so further RTC interrupts
 stop being injected. The simplest solution that I see is to revert
 most of the commit and only leave APIC timer interrupt reinjection.
 
 If you have more elegant solution let me know.
 
 [1] https://bugzilla.kernel.org/show_bug.cgi?id=58931
 --
   Gleb.


[Bug 58931] SMP x64 Windows 2003 guest can't boot up

2013-06-06 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=58931





--- Comment #1 from Jay Ren yongjie@intel.com  2013-06-06 15:40:58 ---
After bisection, we found the first bad commit is:

f1ed0450a5fac7067590317cbf027f566b6ccbca
commit f1ed0450a5fac7067590317cbf027f566b6ccbca
Author: Jan Kiszka jan.kis...@siemens.com
Date:   Sun Apr 28 14:00:41 2013 +0200

KVM: x86: Remove support for reporting coalesced APIC IRQs


Gleb and Jan are trying to fix it.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

2013-06-06 Thread Mihai Caraman
lwepx faults need to be handled by KVM, and this implies additional code
in the DO_KVM macro to identify the source of an exception that originated in
host context. This requires checking the Exception Syndrome Register
(ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]) for DTB_MISS,
DSI and LRAT exceptions, which is too intrusive for the host.

Get rid of lwepx and acquire the last instruction in kvmppc_handle_exit() by
searching for the physical address and kmapping it. This fixes an infinite loop
caused by lwepx's data TLB miss being handled in the host, and the TODO for TLB
eviction and execute-but-not-read entries.
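
A rough sketch of the kmap-based read described above (hypothetical code, not
the actual patch; the function name is made up, gpaddr is assumed to have been
found by walking the guest TLB for the faulting PC, and only generic KVM/mm
helpers are used):

static u32 read_last_inst_via_kmap(struct kvm_vcpu *vcpu, gpa_t gpaddr)
{
	pfn_t pfn = gfn_to_pfn(vcpu->kvm, gpaddr >> PAGE_SHIFT);
	u32 inst = KVM_INST_FETCH_FAILED;
	void *page;

	if (!is_error_pfn(pfn)) {
		page = kmap(pfn_to_page(pfn));
		inst = *(u32 *)(page + (gpaddr & ~PAGE_MASK));
		kunmap(pfn_to_page(pfn));
		kvm_release_pfn_clean(pfn);
	}
	return inst;
}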

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
---
 arch/powerpc/include/asm/mmu-book3e.h |6 ++-
 arch/powerpc/kvm/booke.c  |6 +++
 arch/powerpc/kvm/booke.h  |2 +
 arch/powerpc/kvm/bookehv_interrupts.S |   32 ++-
 arch/powerpc/kvm/e500.c   |4 ++
 arch/powerpc/kvm/e500mc.c |   69 +
 6 files changed, 91 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h 
b/arch/powerpc/include/asm/mmu-book3e.h
index 99d43e0..32e470e 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,7 +40,10 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL(x) (((x)  28)  0x3000)
+#define MAS0_TLBSEL_MASK   0x3000
+#define MAS0_TLBSEL_SHIFT  28
+#define MAS0_TLBSEL(x) (((x)  MAS0_TLBSEL_SHIFT)  MAS0_TLBSEL_MASK)
+#define MAS0_GET_TLBSEL(mas0)  (((mas0)  MAS0_TLBSEL_MASK)  
MAS0_TLBSEL_SHIFT)
 #define MAS0_ESEL_MASK 0x0FFF
 #define MAS0_ESEL_SHIFT16
 #define MAS0_ESEL(x)   (((x)  MAS0_ESEL_SHIFT)  MAS0_ESEL_MASK)
@@ -58,6 +61,7 @@
 #define MAS1_TSIZE_MASK0x0f80
 #define MAS1_TSIZE_SHIFT   7
 #define MAS1_TSIZE(x)  (((x)  MAS1_TSIZE_SHIFT)  MAS1_TSIZE_MASK)
+#define MAS1_GET_TSIZE(mas1)   (((mas1)  MAS1_TSIZE_MASK)  MAS1_TSIZE_SHIFT)
 
 #define MAS2_EPN   (~0xFFFUL)
 #define MAS2_X00x0040
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..6764a8e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -836,6 +836,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
 
+   /*
+* The exception type can change at this point, such as if the TLB entry
+* for the emulated instruction has been evicted.
+*/
+   kvmppc_prepare_for_emulation(vcpu, exit_nr);
+
/* restart interrupts if they were meant for the host */
kvmppc_restart_interrupt(vcpu, exit_nr);
 
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index 5fd1ba6..a0d0fea 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -90,6 +90,8 @@ void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu);
 void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu);
 
+void kvmppc_prepare_for_emulation(struct kvm_vcpu *vcpu, unsigned int 
*exit_nr);
+
 enum int_class {
INT_CLASS_NONCRIT,
INT_CLASS_CRIT,
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 20c7a54..0538ab9 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -120,37 +120,20 @@
 
.if \flags  NEED_EMU
/*
-* This assumes you have external PID support.
-* To support a bookehv CPU without external PID, you'll
-* need to look up the TLB entry and create a temporary mapping.
-*
-* FIXME: we don't currently handle if the lwepx faults.  PR-mode
-* booke doesn't handle it either.  Since Linux doesn't use
-* broadcast tlbivax anymore, the only way this should happen is
-* if the guest maps its memory execute-but-not-read, or if we
-* somehow take a TLB miss in the middle of this entry code and
-* evict the relevant entry.  On e500mc, all kernel lowmem is
-* bolted into TLB1 large page mappings, and we don't use
-* broadcast invalidates, so we should not take a TLB miss here.
-*
-* Later we'll need to deal with faults here.  Disallowing guest
-* mappings that are execute-but-not-read could be an option on
-* e500mc, but not on chips with an LRAT if it is used.
+* We don't use external PID support. lwepx faults would need to be
+* handled by KVM and this implies additional code in DO_KVM (for
+* DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
+* is too intrusive for the host. Get the last instruction in
+* kvmppc_handle_exit().
 */
-
-   mfspr   r3, SPRN_EPLC   /* will already have correct ELPID 

[PATCH 1/2] KVM: PPC: e500mc: Revert add load inst fixup

2013-06-06 Thread Mihai Caraman
lwepx faults need to be handled by KVM. With the current solution
the host kernel searches for the faulting address using its LPID context.
If a host translation is found we return to the lwepx instruction instead of
the fixup, ending up in an infinite loop.

Revert commit 1d628af7 ("add load inst fixup"). We will address the lwepx
issue in a subsequent patch without the need for fixups.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
---
 arch/powerpc/kvm/bookehv_interrupts.S |   26 +-
 1 files changed, 1 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index e8ed7d6..20c7a54 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -29,7 +29,6 @@
 #include asm/asm-compat.h
 #include asm/asm-offsets.h
 #include asm/bitsperlong.h
-#include asm/thread_info.h
 
 #ifdef CONFIG_64BIT
 #include asm/exception-64e.h
@@ -162,32 +161,9 @@
PPC_STL r30, VCPU_GPR(R30)(r4)
PPC_STL r31, VCPU_GPR(R31)(r4)
mtspr   SPRN_EPLC, r8
-
-   /* disable preemption, so we are sure we hit the fixup handler */
-   CURRENT_THREAD_INFO(r8, r1)
-   li  r7, 1
-   stw r7, TI_PREEMPT(r8)
-
isync
-
-   /*
-* In case the read goes wrong, we catch it and write an invalid value
-* in LAST_INST instead.
-*/
-1: lwepx   r9, 0, r5
-2:
-.section .fixup, ax
-3: li  r9, KVM_INST_FETCH_FAILED
-   b   2b
-.previous
-.section __ex_table,a
-   PPC_LONG_ALIGN
-   PPC_LONG 1b,3b
-.previous
-
+   lwepx   r9, 0, r5
mtspr   SPRN_EPLC, r3
-   li  r7, 0
-   stw r7, TI_PREEMPT(r8)
stw r9, VCPU_LAST_INST(r4)
.endif
 
-- 
1.7.4.1




Re: [PATCH] vhost_net: clear msg.control for non-zerocopy case during tx

2013-06-06 Thread Sergei Shtylyov

Hello.

On 06/06/2013 07:27 AM, Jason Wang wrote:


When we decide not to use zero-copy, msg.control should be set to NULL,
otherwise macvtap/tap may set zerocopy callbacks which may wrongly decrease
the kref of ubufs.
The bug was introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84
("vhost-net: skip head management if no outstanding").
This solves the following warnings:
WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]()
Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge
stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun]
CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566
Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011
a0198323 88007c9ebd08 81796b73 88007c9ebd48
8103d66b 7b773e20 8800779f 8800779f43f0
8800779f8418 015c 0062 88007c9ebd58
Call Trace:
[81796b73] dump_stack+0x19/0x1e
[8103d66b] warn_slowpath_common+0x6b/0xa0
[8103d6b5] warn_slowpath_null+0x15/0x20
[a0197627] handle_tx+0x477/0x4b0 [vhost_net]
[a0197690] handle_tx_kick+0x10/0x20 [vhost_net]
[a019541e] vhost_worker+0xfe/0x1a0 [vhost_net]
[a0195320] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[a0195320] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[81061f46] kthread+0xc6/0xd0
[81061e80] ? kthread_freezable_should_stop+0x70/0x70
[817a1aec] ret_from_fork+0x7c/0xb0
[81061e80] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Jason Wang jasow...@redhat.com
---
   drivers/vhost/net.c |3 ++-
   1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2b51e23..b07d96b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -436,7 +436,8 @@ static void handle_tx(struct vhost_net *net)
   kref_get(ubufs-kref);
   }
   nvq-upend_idx = (nvq-upend_idx + 1) % UIO_MAXIOV;
-}
+} else

You have to use {} on the *else* branch if you have it on the *if*
branch (and vice versa), according to Documentation/CodingStyle.

checkpatch.pl didn't complain about this, will send v2.


Yes, it seems checkpatch still lacks this check.


Thanks



WBR, Sergei



Re: [PATCH net 0/2] vhost fixes for 3.10

2013-06-06 Thread Tommi Rantala
2013/6/6 Michael S. Tsirkin m...@redhat.com:
 Two patches fixing the fallout from the vhost cleanup in 3.10.
 Thanks to Tommi Rantala who reported the issue.

 Tommi, could you please confirm this fixes the crashes for you?

Confirmed! With the two patches applied, I can no longer reproduce the
crash with trinity.

Thanks!

Tommi

 Michael S. Tsirkin (2):
   vhost: check owner before we overwrite ubuf_info
   vhost: fix ubuf_info cleanup

  drivers/vhost/net.c   | 26 +++---
  drivers/vhost/vhost.c |  8 +++-
  drivers/vhost/vhost.h |  1 +
  3 files changed, 19 insertions(+), 16 deletions(-)

 --
 MST



Re: [RFC PATCH 0/6] KVM: PPC: Book3E: AltiVec support

2013-06-06 Thread Scott Wood

On 06/06/2013 04:42:44 AM, Caraman Mihai Claudiu-B02008 wrote:
   This looks like a bit much for 3.10 (certainly, subject lines like
   "refactor" and "enhance" and "add support" aren't going to make Linus
   happy given that we're past rc4) so I think we should apply
   http://patchwork.ozlabs.org/patch/242896/ for 3.10.  Then for 3.11,
   revert it after applying this patchset.
  
 
  Why not 1/6 plus e6500 removal?

 1/6 is not a bugfix.

Not sure I get it. Isn't this a better fix for AltiVec build breakage:

-#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 42
-#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 43
+#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 32
+#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 33

This removes the need for additional kvm_handlers. Obviously this doesn't make
AltiVec work, so we still need to disable e6500.


OK, didn't realize you meant it as an alternative fix to what was in my  
patch.


-Scott


[PATCH 6/8] kvm/ppc: IRQ disabling cleanup

2013-06-06 Thread Scott Wood
Simplify the handling of lazy EE by going directly from fully-enabled
to hard-disabled.  This replaces the lazy_irq_pending() check
(including its misplaced kvm_guest_exit() call).

As suggested by Tiejun Chen, move the interrupt disabling into
kvmppc_prepare_to_enter() rather than have each caller do it.  Also
move the IRQ enabling on heavyweight exit into
kvmppc_prepare_to_enter().

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_ppc.h |6 ++
 arch/powerpc/kvm/book3s_pr.c   |   12 +++-
 arch/powerpc/kvm/booke.c   |   11 +++
 arch/powerpc/kvm/powerpc.c |   23 ++-
 4 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 6885846..e4474f8 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -404,6 +404,12 @@ static inline void kvmppc_fix_ee_before_entry(void)
trace_hardirqs_on();
 
 #ifdef CONFIG_PPC64
+   /*
+* To avoid races, the caller must have gone directly from having
+* interrupts fully-enabled to hard-disabled.
+*/
+   WARN_ON(local_paca-irq_happened != PACA_IRQ_HARD_DIS);
+
/* Only need to enable IRQs by hard enabling them after this */
local_paca-irq_happened = 0;
local_paca-soft_enabled = 1;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 0b97ce4..e61e39e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -884,14 +884,11 @@ program_interrupt:
 * and if we really did time things so badly, then we just exit
 * again due to a host external interrupt.
 */
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
-   if (s = 0) {
-   local_irq_enable();
+   if (s = 0)
r = s;
-   } else {
+   else
kvmppc_fix_ee_before_entry();
-   }
}
 
trace_kvm_book3s_reenter(r, vcpu);
@@ -1121,12 +1118,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 * really did time things so badly, then we just exit again due to
 * a host external interrupt.
 */
-   local_irq_disable();
ret = kvmppc_prepare_to_enter(vcpu);
-   if (ret = 0) {
-   local_irq_enable();
+   if (ret = 0)
goto out;
-   }
 
/* Save FPU state in stack */
if (current-thread.regs-msr  MSR_FP)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 08f4aa1..c5270a3 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -617,7 +617,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
local_irq_enable();
kvm_vcpu_block(vcpu);
clear_bit(KVM_REQ_UNHALT, vcpu-requests);
-   local_irq_disable();
+   hard_irq_disable();
 
kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
r = 1;
@@ -666,10 +666,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
return -EINVAL;
}
 
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
if (s = 0) {
-   local_irq_enable();
ret = s;
goto out;
}
@@ -1161,14 +1159,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 * aren't already exiting to userspace for some other reason.
 */
if (!(r  RESUME_HOST)) {
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
-   if (s = 0) {
-   local_irq_enable();
+   if (s = 0)
r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
-   } else {
+   else
kvmppc_fix_ee_before_entry();
-   }
}
 
return r;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4e05f8c..2f7a221 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
int r = 1;
 
-   WARN_ON_ONCE(!irqs_disabled());
+   WARN_ON(irqs_disabled());
+   hard_irq_disable();
+
while (true) {
if (need_resched()) {
local_irq_enable();
cond_resched();
-   local_irq_disable();
+   hard_irq_disable();
continue;
}
 
@@ -95,7 +97,7 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
local_irq_enable();
trace_kvm_check_requests(vcpu);
r = 

[PATCH 3/8] kvm/ppc/booke: Hold srcu lock when calling gfn functions

2013-06-06 Thread Scott Wood
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.
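
The idiom being introduced is the standard one (sketched generically here; the
hunks below show the actual call sites):

	int idx = srcu_read_lock(&vcpu->kvm->srcu);
	/* gfn_to_memslot()/gfn_to_pfn()/kvmppc_mmu_map() etc. are safe here */
	srcu_read_unlock(&vcpu->kvm->srcu, idx);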

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/44x_tlb.c  |5 +
 arch/powerpc/kvm/booke.c|7 +++
 arch/powerpc/kvm/e500_mmu.c |5 +
 3 files changed, 17 insertions(+)

diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 5dd3ab4..ed03854 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -441,6 +441,7 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
struct kvmppc_44x_tlbe *tlbe;
unsigned int gtlb_index;
+   int idx;
 
gtlb_index = kvmppc_get_gpr(vcpu, ra);
if (gtlb_index = KVM44x_GUEST_TLB_SIZE) {
@@ -473,6 +474,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
return EMULATE_FAIL;
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
if (tlbe_is_host_safe(vcpu, tlbe)) {
gva_t eaddr;
gpa_t gpaddr;
@@ -489,6 +492,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
+
trace_kvm_gtlb_write(gtlb_index, tlbe-tid, tlbe-word0, tlbe-word1,
 tlbe-word2);
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..ecbe908 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -832,6 +832,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
 {
int r = RESUME_HOST;
int s;
+   int idx;
 
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
@@ -1053,6 +1054,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
break;
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
gpaddr = kvmppc_mmu_xlate(vcpu, gtlb_index, eaddr);
gfn = gpaddr  PAGE_SHIFT;
 
@@ -1075,6 +1078,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
kvmppc_account_exit(vcpu, MMIO_EXITS);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
break;
}
 
@@ -1098,6 +1102,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
kvmppc_account_exit(vcpu, ITLB_VIRT_MISS_EXITS);
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
gpaddr = kvmppc_mmu_xlate(vcpu, gtlb_index, eaddr);
gfn = gpaddr  PAGE_SHIFT;
 
@@ -1114,6 +1120,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
kvmppc_booke_queue_irqprio(vcpu, 
BOOKE_IRQPRIO_MACHINE_CHECK);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
break;
}
 
diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index c41a5a9..6d6f153 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -396,6 +396,7 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
struct kvm_book3e_206_tlb_entry *gtlbe;
int tlbsel, esel;
int recal = 0;
+   int idx;
 
tlbsel = get_tlb_tlbsel(vcpu);
esel = get_tlb_esel(vcpu, tlbsel);
@@ -430,6 +431,8 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
kvmppc_set_tlb1map_range(vcpu, gtlbe);
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
/* Invalidate shadow mappings for the about-to-be-clobbered TLBE. */
if (tlbe_is_host_safe(vcpu, gtlbe)) {
u64 eaddr = get_tlb_eaddr(gtlbe);
@@ -444,6 +447,8 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
kvmppc_mmu_map(vcpu, eaddr, raddr, index_of(tlbsel, esel));
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
+
kvmppc_set_exit_type(vcpu, EMULATED_TLBWE_EXITS);
return EMULATE_DONE;
 }
-- 
1.7.10.4




[PATCH 0/8] kvm/ppc: fixes for 3.10

2013-06-06 Thread Scott Wood
Most of these have been posted before, but I grouped them together as
there are some contextual dependencies between them.

Gleb/Paolo: As Alex doesn't appear to be back yet, can you apply these
if there's no objection over the next few days?

Mihai Caraman (1):
  kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage

Scott Wood (7):
  kvm/ppc/booke64: Disable e6500 support
  kvm/ppc/booke: Hold srcu lock when calling gfn functions
  kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
  kvm/ppc: Call trace_hardirqs_on before entry
  kvm/ppc: IRQ disabling cleanup
  kvm/ppc/booke: Delay kvmppc_fix_ee_before_entry
  kvm/ppc/booke: Don't call kvm_guest_enter twice

 arch/powerpc/include/asm/kvm_asm.h |   16 ++--
 arch/powerpc/include/asm/kvm_ppc.h |   17 ++---
 arch/powerpc/kvm/44x_tlb.c |5 +
 arch/powerpc/kvm/book3s_pr.c   |   16 +---
 arch/powerpc/kvm/booke.c   |   36 
 arch/powerpc/kvm/e500_mmu.c|5 +
 arch/powerpc/kvm/e500mc.c  |2 --
 arch/powerpc/kvm/powerpc.c |   25 ++---
 8 files changed, 73 insertions(+), 49 deletions(-)

-- 
1.7.10.4




[PATCH 7/8] kvm/ppc/booke: Delay kvmppc_fix_ee_before_entry

2013-06-06 Thread Scott Wood
kvmppc_fix_ee_before_entry() should be called as late as possible,
or else we get things like WARN_ON(preemptible()) in enable_kernel_fp()
in configurations where preemptible() works.

Note that book3s_pr already waits until just before __kvmppc_vcpu_run
to call kvmppc_fix_ee_before_entry().
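
A toy model of why the ordering matters (all names here are illustrative
stand-ins, not the kernel implementation): once the lazy-EE state is put
back to "enabled", preemptible() can report true, so anything that asserts
!preemptible() has to run first.

#include <assert.h>
#include <stdio.h>

static int soft_enabled;                /* lazy-EE "interrupts enabled" flag */

static int preemptible(void)            { return soft_enabled; }
static void enable_kernel_fp(void)      { assert(!preemptible()); }
static void fix_ee_before_entry(void)   { soft_enabled = 1; }

int main(void)
{
        soft_enabled = 0;               /* we arrive hard-disabled */

        enable_kernel_fp();             /* must run before the EE fixup...   */
        fix_ee_before_entry();          /* ...which makes preemptible() true */
        printf("entering guest with soft_enabled=%d\n", soft_enabled);
        return 0;
}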

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index c5270a3..f953324 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -671,7 +671,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
ret = s;
goto out;
}
-   kvmppc_fix_ee_before_entry();
 
kvm_guest_enter();
 
@@ -697,6 +696,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
kvmppc_load_guest_fp(vcpu);
 #endif
 
+   kvmppc_fix_ee_before_entry();
+
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
-- 
1.7.10.4




[PATCH 4/8] kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()

2013-06-06 Thread Scott Wood
EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.

Without this, we get warnings such as the one at
arch/powerpc/kernel/time.c:300, and sometimes the host kernel hangs.
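
The problem is purely a bookkeeping mismatch: MSR[EE] is already off in
hardware when we reach the handler, but the lazy-EE software flags still say
"enabled".  A rough user-space model of the invariant (field and function
names are illustrative, not the real paca layout):

#include <assert.h>
#include <stdio.h>

struct paca_model {
        int soft_enabled;               /* what the kernel believes           */
        int irq_hard_dis;               /* records that EE was hard-disabled  */
};

static struct paca_model paca = { .soft_enabled = 1, .irq_hard_dis = 0 };

static void hard_irq_disable_model(void)
{
        paca.soft_enabled = 0;
        paca.irq_hard_dis = 1;
}

static void timekeeping_check(void)
{
        /* Roughly what the time.c warning asserts: timer code must never
         * run while the software state claims interrupts are enabled. */
        assert(!paca.soft_enabled);
}

int main(void)
{
        /* Guest exit: EE is off in hardware, but the model still says on. */
        hard_irq_disable_model();       /* the fix: resync the software state */
        timekeeping_check();
        printf("software state now matches the hard-disabled hardware\n");
        return 0;
}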

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ecbe908..5cd7ad0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -834,6 +834,17 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
int s;
int idx;
 
+#ifdef CONFIG_PPC64
+   WARN_ON(local_paca-irq_happened != 0);
+#endif
+
+   /*
+* We enter with interrupts disabled in hardware, but
+* we need to call hard_irq_disable anyway to ensure that
+* the software state is kept in sync.
+*/
+   hard_irq_disable();
+
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
 
-- 
1.7.10.4




[PATCH 5/8] kvm/ppc: Call trace_hardirqs_on before entry

2013-06-06 Thread Scott Wood
Currently this is only being done on 64-bit.  Rather than just move it
out of the 64-bit ifdef, move it to kvmppc_lazy_ee_enable() so that it is
consistent with the lazy EE state, and so that we don't track more host
code as interrupts-enabled than necessary.

Rename kvmppc_lazy_ee_enable() to kvmppc_fix_ee_before_entry() to reflect
that this function now has a role on 32-bit as well.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_ppc.h |   11 ---
 arch/powerpc/kvm/book3s_pr.c   |4 ++--
 arch/powerpc/kvm/booke.c   |4 ++--
 arch/powerpc/kvm/powerpc.c |2 --
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a5287fe..6885846 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -394,10 +394,15 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
}
 }
 
-/* Please call after prepare_to_enter. This function puts the lazy ee state
-   back to normal mode, without actually enabling interrupts. */
-static inline void kvmppc_lazy_ee_enable(void)
+/*
+ * Please call after prepare_to_enter. This function puts the lazy ee and irq
+ * disabled tracking state back to normal mode, without actually enabling
+ * interrupts.
+ */
+static inline void kvmppc_fix_ee_before_entry(void)
 {
+   trace_hardirqs_on();
+
 #ifdef CONFIG_PPC64
/* Only need to enable IRQs by hard enabling them after this */
local_paca-irq_happened = 0;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index bdc40b8..0b97ce4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -890,7 +890,7 @@ program_interrupt:
local_irq_enable();
r = s;
} else {
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
}
}
 
@@ -1161,7 +1161,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
if (vcpu-arch.shared-msr  MSR_FP)
kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
 
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 5cd7ad0..08f4aa1 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -673,7 +673,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
ret = s;
goto out;
}
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
 
kvm_guest_enter();
 
@@ -1167,7 +1167,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
local_irq_enable();
r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
} else {
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
}
}
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6316ee3..4e05f8c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -117,8 +117,6 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
kvm_guest_exit();
continue;
}
-
-   trace_hardirqs_on();
 #endif
 
kvm_guest_enter();
-- 
1.7.10.4




[PATCH 8/8] kvm/ppc/booke: Don't call kvm_guest_enter twice

2013-06-06 Thread Scott Wood
kvm_guest_enter() was already called by kvmppc_prepare_to_enter().
Don't call it again.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index f953324..0b4d792 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -672,8 +672,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
goto out;
}
 
-   kvm_guest_enter();
-
 #ifdef CONFIG_PPC_FPU
/* Save userspace FPU state in stack */
enable_kernel_fp();
-- 
1.7.10.4




[PATCH 1/8] kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage

2013-06-06 Thread Scott Wood
From: Mihai Caraman mihai.cara...@freescale.com

Interrupt numbers defined for Book3E follow the IVOR definitions. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule, which also fixes the build breakage.
IVORs 32 and 33 are shared, so reflect this in the interrupt naming.

This fixes a build break for 64-bit booke KVM.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_asm.h |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h 
b/arch/powerpc/include/asm/kvm_asm.h
index b9dd382..851bac7 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -54,8 +54,16 @@
 #define BOOKE_INTERRUPT_DEBUG 15
 
 /* E500 */
-#define BOOKE_INTERRUPT_SPE_UNAVAIL 32
-#define BOOKE_INTERRUPT_SPE_FP_DATA 33
+#define BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL 32
+#define BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST 33
+/*
+ * TODO: Unify 32-bit and 64-bit kernel exception handlers to use same defines
+ */
+#define BOOKE_INTERRUPT_SPE_UNAVAIL BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL
+#define BOOKE_INTERRUPT_SPE_FP_DATA BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST
+#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL
+#define BOOKE_INTERRUPT_ALTIVEC_ASSIST \
+   BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
 #define BOOKE_INTERRUPT_DOORBELL 36
@@ -67,10 +75,6 @@
 #define BOOKE_INTERRUPT_HV_SYSCALL 40
 #define BOOKE_INTERRUPT_HV_PRIV 41
 
-/* altivec */
-#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 42
-#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 43
-
 /* book3s */
 
 #define BOOK3S_INTERRUPT_SYSTEM_RESET  0x100
-- 
1.7.10.4




[PATCH 2/8] kvm/ppc/booke64: Disable e6500 support

2013-06-06 Thread Scott Wood
The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented).  Disable e6500 on KVM until this is
fixed.

Signed-off-by: Scott Wood scottw...@freescale.com
---
Mihai has posted RFC patches for proper Altivec support, so disabling
e6500 should only be needed for 3.10.
---
 arch/powerpc/kvm/e500mc.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 753cc99..19c8379 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -177,8 +177,6 @@ int kvmppc_core_check_processor_compat(void)
r = 0;
else if (strcmp(cur_cpu_spec-cpu_name, e5500) == 0)
r = 0;
-   else if (strcmp(cur_cpu_spec-cpu_name, e6500) == 0)
-   r = 0;
else
r = -ENOTSUPP;
 
-- 
1.7.10.4




Re: [PATCH 1/2] kvm-unit-tests: Add a func to run instruction in emulator

2013-06-06 Thread 李春奇
This version of saving/restoring the general registers seems a bit too
ugly; I will change it and commit another patch.

Some of the registers cannot be set the way realmode.c does it: for
example, %rax is used to hold the return value, a wrong %esp or %ebp may
cause a crash, and a changed %rflags may cause unpredictable errors. So
these registers should not be set by the caller.

Arthur

On Thu, Jun 6, 2013 at 11:24 PM, Arthur Chunqi Li yzt...@gmail.com wrote:
 Add a function trap_emulator to run an instruction in emulator.
 Set inregs first (%rax is invalid because it is used as return
 address), put instruction code in alt_insn and call func with
 alt_insn_length. Get results in outregs.

 Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
 ---
  x86/emulator.c |   81 
 
  1 file changed, 81 insertions(+)

 diff --git a/x86/emulator.c b/x86/emulator.c
 index 96576e5..8ab9904 100644
 --- a/x86/emulator.c
 +++ b/x86/emulator.c
 @@ -11,6 +11,14 @@ int fails, tests;

  static int exceptions;

 +struct regs {
 +   u64 rax, rbx, rcx, rdx;
 +   u64 rsi, rdi, rsp, rbp;
 +   u64 rip, rflags;
 +};
 +
 +static struct regs inregs, outregs;
 +
  void report(const char *name, int result)
  {
 ++tests;
 @@ -685,6 +693,79 @@ static void test_shld_shrd(u32 *mem)
  report(shrd (cl), *mem == ((0x12345678  3) | (5u  29)));
  }

 +static void trap_emulator(uint64_t *mem, uint8_t *insn_page,
 +uint8_t *alt_insn_page, void *insn_ram,
 +uint8_t *alt_insn, int alt_insn_length)
 +{
 +   ulong *cr3 = (ulong *)read_cr3();
 +   int i;
 +
 +   // Pad with RET instructions
 +   memset(insn_page, 0xc3, 4096);
 +   memset(alt_insn_page, 0xc3, 4096);
 +
 +   // Place a trapping instruction in the page to trigger a VMEXIT
 +   insn_page[0] = 0x89; // mov %eax, (%rax)
 +   insn_page[1] = 0x00;
 +   insn_page[2] = 0x90; // nop
 +   insn_page[3] = 0xc3; // ret
 +
 +   // Place the instruction we want the hypervisor to see in the 
 alternate page
 +   for (i=0; ialt_insn_length; i++)
 +   alt_insn_page[i] = alt_insn[i];
 +
 +   // Save general registers
 +   asm volatile(
 +   push %rax\n\r
 +   push %rbx\n\r
 +   push %rcx\n\r
 +   push %rdx\n\r
 +   push %rsi\n\r
 +   push %rdi\n\r
 +   );
 +   // Load the code TLB with insn_page, but point the page tables at
 +   // alt_insn_page (and keep the data TLB clear, for AMD decode assist).
 +   // This will make the CPU trap on the insn_page instruction but the
 +   // hypervisor will see alt_insn_page.
 +   install_page(cr3, virt_to_phys(insn_page), insn_ram);
 +   invlpg(insn_ram);
 +   // Load code TLB
 +   asm volatile(call *%0 : : r(insn_ram + 3));
 +   install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
 +   // Trap, let hypervisor emulate at alt_insn_page
 +   asm volatile(
 +   call *%1\n\r
 +
 +   mov %%rax, 0+%[outregs] \n\t
 +   mov %%rbx, 8+%[outregs] \n\t
 +   mov %%rcx, 16+%[outregs] \n\t
 +   mov %%rdx, 24+%[outregs] \n\t
 +   mov %%rsi, 32+%[outregs] \n\t
 +   mov %%rdi, 40+%[outregs] \n\t
 +   mov %%rsp,48+ %[outregs] \n\t
 +   mov %%rbp, 56+%[outregs] \n\t
 +
 +   /* Save RFLAGS in outregs*/
 +   pushf \n\t
 +   popq 72+%[outregs] \n\t
 +   : [outregs]+m(outregs)
 +   : r(insn_ram),
 +   a(mem), b(inregs.rbx),
 +   c(inregs.rcx), d(inregs.rdx),
 +   S(inregs.rsi), D(inregs.rdi)
 +   : memory, cc
 +   );
 +   // Restore general registers
 +   asm volatile(
 +   pop %rax\n\r
 +   pop %rbx\n\r
 +   pop %rcx\n\r
 +   pop %rdx\n\r
 +   pop %rsi\n\r
 +   pop %rdi\n\r
 +   );
 +}
 +
  static void advance_rip_by_3_and_note_exception(struct ex_regs *regs)
  {
  ++exceptions;
 --
 1.7.9.5




-- 
Arthur Chunqi Li
Department of Computer Science
School of EECS
Peking University
Beijing, China


[PATCH 2/2] kvm-unit-tests: Change two cases to use trap_emulator

2013-06-06 Thread Arthur Chunqi Li
Change two functions (test_mmx_movq_mf and test_movabs) to use the
unified trap_emulator.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/emulator.c |   66 
 1 file changed, 14 insertions(+), 52 deletions(-)

diff --git a/x86/emulator.c b/x86/emulator.c
index 770e8f7..f8a204e 100755
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -762,72 +762,34 @@ static void test_mmx_movq_mf(uint64_t *mem, uint8_t 
*insn_page,
 uint8_t *alt_insn_page, void *insn_ram)
 {
 uint16_t fcw = 0;  // all exceptions unmasked
-ulong *cr3 = (ulong *)read_cr3();
+uint8_t alt_insn[] = {0x0f, 0x7f, 0x00}; // movq %mm0, (%rax)
 
 write_cr0(read_cr0()  ~6);  // TS, EM
-// Place a trapping instruction in the page to trigger a VMEXIT
-insn_page[0] = 0x89; // mov %eax, (%rax)
-insn_page[1] = 0x00;
-insn_page[2] = 0x90; // nop
-insn_page[3] = 0xc3; // ret
-// Place the instruction we want the hypervisor to see in the alternate 
page
-alt_insn_page[0] = 0x0f; // movq %mm0, (%rax)
-alt_insn_page[1] = 0x7f;
-alt_insn_page[2] = 0x00;
-alt_insn_page[3] = 0xc3; // ret
-
 exceptions = 0;
 handle_exception(MF_VECTOR, advance_rip_by_3_and_note_exception);
-
-// Load the code TLB with insn_page, but point the page tables at
-// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
-// This will make the CPU trap on the insn_page instruction but the
-// hypervisor will see alt_insn_page.
-install_page(cr3, virt_to_phys(insn_page), insn_ram);
 asm volatile(fninit; fldcw %0 : : m(fcw));
 asm volatile(fldz; fldz; fdivp); // generate exception
-invlpg(insn_ram);
-// Load code TLB
-asm volatile(call *%0 : : r(insn_ram + 3));
-install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
-// Trap, let hypervisor emulate at alt_insn_page
-asm volatile(call *%0 : : r(insn_ram), a(mem));
+
+inregs = (struct regs){ 0 };
+trap_emulator(mem, insn_page, alt_insn_page, insn_ram, 
+   alt_insn, 3);
 // exit MMX mode
 asm volatile(fnclex; emms);
-report(movq mmx generates #MF, exceptions == 1);
+report(movq mmx generates #MF2, exceptions == 1);
 handle_exception(MF_VECTOR, 0);
 }
 
 static void test_movabs(uint64_t *mem, uint8_t *insn_page,
   uint8_t *alt_insn_page, void *insn_ram)
 {
-uint64_t val = 0;
-ulong *cr3 = (ulong *)read_cr3();
-
-// Pad with RET instructions
-memset(insn_page, 0xc3, 4096);
-memset(alt_insn_page, 0xc3, 4096);
-// Place a trapping instruction in the page to trigger a VMEXIT
-insn_page[0] = 0x89; // mov %eax, (%rax)
-insn_page[1] = 0x00;
-// Place the instruction we want the hypervisor to see in the alternate
-// page. A buggy hypervisor will fetch a 32-bit immediate and return
-// 0xc3c3c3c3.
-alt_insn_page[0] = 0x48; // mov $0xc3c3c3c3c3c3c3c3, %rcx
-alt_insn_page[1] = 0xb9;
-
-// Load the code TLB with insn_page, but point the page tables at
-// alt_insn_page (and keep the data TLB clear, for AMD decode assist).
-// This will make the CPU trap on the insn_page instruction but the
-// hypervisor will see alt_insn_page.
-install_page(cr3, virt_to_phys(insn_page), insn_ram);
-// Load code TLB
-invlpg(insn_ram);
-asm volatile(call *%0 : : r(insn_ram + 3));
-// Trap, let hypervisor emulate at alt_insn_page
-install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
-asm volatile(call *%1 : =c(val) : r(insn_ram), a(mem), c(0));
-report(64-bit mov imm, val == 0xc3c3c3c3c3c3c3c3);
+// mov $0xc3c3c3c3c3c3c3c3, %rcx
+uint8_t alt_insn[] = {0x48, 0xb9, 0xc3, 0xc3, 0xc3,
+   0xc3, 0xc3, 0xc3, 0xc3, 0xc3};
+inregs = (struct regs){ .rcx = 0 };
+
+trap_emulator(mem, insn_page, alt_insn_page, insn_ram,
+   alt_insn, 10);
+report(64-bit mov imm2, outregs.rcx == 0xc3c3c3c3c3c3c3c3);
 }
 
 static void test_crosspage_mmio(volatile uint8_t *mem)
-- 
1.7.9.5



[PATCH 1/2] kvm-unit-tests: Add a func to run instruction in emulator

2013-06-06 Thread Arthur Chunqi Li
Add a function trap_emulator to run an instruction in the emulator.
Set inregs first (%rax, %rsp, %rbp and %rflags have special uses and
cannot be set through inregs), put the instruction code in alt_insn and
call the function with alt_insn_length. Get the results in outregs.
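
A caller then looks like the converted tests in the next patch; for
illustration, a hypothetical extra test (not part of this series)
exercising a 64-bit immediate load through the emulator would be:

static void test_mov_imm_rbx(uint64_t *mem, uint8_t *insn_page,
                             uint8_t *alt_insn_page, void *insn_ram)
{
    /* mov $0x1234567890abcdef, %rbx */
    uint8_t alt_insn[] = { 0x48, 0xbb, 0xef, 0xcd, 0xab, 0x90,
                           0x78, 0x56, 0x34, 0x12 };

    inregs = (struct regs){ .rbx = 0 };
    trap_emulator(mem, insn_page, alt_insn_page, insn_ram,
                  alt_insn, sizeof(alt_insn));
    report("64-bit mov imm to rbx", outregs.rbx == 0x1234567890abcdefULL);
}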

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/emulator.c |   67 
 1 file changed, 67 insertions(+)
 mode change 100644 = 100755 x86/emulator.c

diff --git a/x86/emulator.c b/x86/emulator.c
old mode 100644
new mode 100755
index 96576e5..770e8f7
--- a/x86/emulator.c
+++ b/x86/emulator.c
@@ -11,6 +11,13 @@ int fails, tests;
 
 static int exceptions;
 
+struct regs {
+   u64 rax, rbx, rcx, rdx;
+   u64 rsi, rdi, rsp, rbp;
+   u64 rip, rflags;
+};
+static struct regs inregs, outregs;
+
 void report(const char *name, int result)
 {
++tests;
@@ -685,6 +692,66 @@ static void test_shld_shrd(u32 *mem)
 report(shrd (cl), *mem == ((0x12345678  3) | (5u  29)));
 }
 
+static void trap_emulator(uint64_t *mem, uint8_t *insn_page,
+uint8_t *alt_insn_page, void *insn_ram,
+uint8_t *alt_insn, int alt_insn_length)
+{
+   ulong *cr3 = (ulong *)read_cr3();
+   int i;
+   static struct regs save;
+
+   // Pad with RET instructions
+   memset(insn_page, 0xc3, 4096);
+   memset(alt_insn_page, 0xc3, 4096);
+
+   // Place a trapping instruction in the page to trigger a VMEXIT
+   insn_page[0] = 0x89; // mov %eax, (%rax)
+   insn_page[1] = 0x00;
+   insn_page[2] = 0x90; // nop
+   insn_page[3] = 0xc3; // ret
+
+   // Place the instruction we want the hypervisor to see in the alternate page
+   for (i = 0; i < alt_insn_length; i++)
+   alt_insn_page[i] = alt_insn[i];
+   save = inregs;
+
+   // Load the code TLB with insn_page, but point the page tables at
+   // alt_insn_page (and keep the data TLB clear, for AMD decode assist).
+   // This will make the CPU trap on the insn_page instruction but the
+   // hypervisor will see alt_insn_page.
+   install_page(cr3, virt_to_phys(insn_page), insn_ram);
+   invlpg(insn_ram);
+   // Load code TLB
+   asm volatile("call *%0" : : "r"(insn_ram + 3));
+   install_page(cr3, virt_to_phys(alt_insn_page), insn_ram);
+   // Trap, let hypervisor emulate at alt_insn_page
+   asm volatile(
+   "xchg %%rbx, 8+%[save] \n\t"
+   "xchg %%rcx, 16+%[save] \n\t"
+   "xchg %%rdx, 24+%[save] \n\t"
+   "xchg %%rsi, 32+%[save] \n\t"
+   "xchg %%rdi, 40+%[save] \n\t"
+
+   "call *%1\n\t"
+
+   "mov %%rax, 0+%[save] \n\t"
+   "xchg %%rbx, 8+%[save] \n\t"
+   "xchg %%rcx, 16+%[save] \n\t"
+   "xchg %%rdx, 24+%[save] \n\t"
+   "xchg %%rsi, 32+%[save] \n\t"
+   "xchg %%rdi, 40+%[save] \n\t"
+   "mov %%rsp, 48+%[save] \n\t"
+   "mov %%rbp, 56+%[save] \n\t"
+   /* Save RFLAGS in outregs */
+   "pushf \n\t"
+   "popq 72+%[save] \n\t"
+   : [save]"+m"(save)
+   : "r"(insn_ram), "a"(mem)
+   : "memory", "cc"
+   );
+   outregs = save;
+}
+
 static void advance_rip_by_3_and_note_exception(struct ex_regs *regs)
 {
 ++exceptions;
-- 
1.7.9.5



Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR

2013-06-06 Thread Rusty Russell
Anthony Liguori aligu...@us.ibm.com writes:
 Hi Rusty,

 Rusty Russell ru...@rustcorp.com.au writes:

 Anthony Liguori aligu...@us.ibm.com writes:
 4) Do virtio-pcie, make it PCI-e friendly (drop the IO BAR completely), give
it a new device/vendor ID.   Continue to use virtio-pci for existing
devices potentially adding virtio-{net,blk,...}-pcie variants for
people that care to use them.

 Now you have a different compatibility problem; how do you know the
 guest supports the new virtio-pcie net?

 We don't care.

 We would still use virtio-pci for existing devices.  Only new devices
 would use virtio-pcie.

My concern is that new features make the virtio-pcie driver so desirable
that libvirt is pressured to use it ASAP.  Have we just punted the
problem upstream?

(Of course, feature bits were supposed to avoid such a transition issue,
but mistakes accumulate).

Cheers,
Rusty.


[PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

2013-06-06 Thread Mihai Caraman
lwepx faults need to be handled by KVM, and this implies additional code
in the DO_KVM macro to identify exceptions that originated in host
context. This requires checking the Exception Syndrome Register
(ESR[EPID]) and External PID Load Context Register (EPLC[EGS]) for DTB_MISS,
DSI and LRAT exceptions, which is too intrusive for the host.

Get rid of lwepx and acquire the last instruction in kvmppc_handle_exit() by
looking up the physical address and kmap'ing it. This fixes an infinite loop
caused by lwepx's data TLB miss being handled in the host, and the TODO for
TLB eviction and execute-but-not-read entries.
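
The replacement flow is conceptually: translate the guest PC through the
guest TLB to a guest physical address, map the backing host page, and read
the 32-bit instruction word from there.  A stand-alone toy of that flow
(flat arrays instead of the real TLB and memslot structures; every name is
illustrative):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096
static uint8_t host_ram[4 * PAGE_SIZE];   /* toy "guest physical" memory */

/* Toy guest TLB: guest-virtual page -> guest-physical page. */
static int gtlb[4] = { 2, 3, 0, 1 };

static uint32_t fetch_last_instruction(uint32_t guest_pc)
{
        uint32_t gvpn   = guest_pc / PAGE_SIZE;
        uint32_t gpaddr = (uint32_t)gtlb[gvpn] * PAGE_SIZE + guest_pc % PAGE_SIZE;
        uint8_t *hva    = &host_ram[gpaddr];      /* stands in for kmap()   */

        /* Read the big-endian instruction word, as lwepx would have done. */
        return (uint32_t)hva[0] << 24 | (uint32_t)hva[1] << 16 |
               (uint32_t)hva[2] << 8  | (uint32_t)hva[3];
}

int main(void)
{
        /* Place "mr r3,r4" (0x7c832378) at guest virtual address 0x1008. */
        uint32_t gpa = (uint32_t)gtlb[1] * PAGE_SIZE + 8;

        host_ram[gpa] = 0x7c; host_ram[gpa + 1] = 0x83;
        host_ram[gpa + 2] = 0x23; host_ram[gpa + 3] = 0x78;

        printf("last_inst = 0x%08x\n", fetch_last_instruction(0x1008));
        return 0;
}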

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
---
 arch/powerpc/include/asm/mmu-book3e.h |6 ++-
 arch/powerpc/kvm/booke.c  |6 +++
 arch/powerpc/kvm/booke.h  |2 +
 arch/powerpc/kvm/bookehv_interrupts.S |   32 ++-
 arch/powerpc/kvm/e500.c   |4 ++
 arch/powerpc/kvm/e500mc.c |   69 +
 6 files changed, 91 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h 
b/arch/powerpc/include/asm/mmu-book3e.h
index 99d43e0..32e470e 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,7 +40,10 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL(x) (((x)  28)  0x3000)
+#define MAS0_TLBSEL_MASK   0x3000
+#define MAS0_TLBSEL_SHIFT  28
+#define MAS0_TLBSEL(x) (((x)  MAS0_TLBSEL_SHIFT)  MAS0_TLBSEL_MASK)
+#define MAS0_GET_TLBSEL(mas0)  (((mas0)  MAS0_TLBSEL_MASK)  
MAS0_TLBSEL_SHIFT)
 #define MAS0_ESEL_MASK 0x0FFF
 #define MAS0_ESEL_SHIFT16
 #define MAS0_ESEL(x)   (((x)  MAS0_ESEL_SHIFT)  MAS0_ESEL_MASK)
@@ -58,6 +61,7 @@
 #define MAS1_TSIZE_MASK0x0f80
 #define MAS1_TSIZE_SHIFT   7
 #define MAS1_TSIZE(x)  (((x)  MAS1_TSIZE_SHIFT)  MAS1_TSIZE_MASK)
+#define MAS1_GET_TSIZE(mas1)   (((mas1)  MAS1_TSIZE_MASK)  MAS1_TSIZE_SHIFT)
 
 #define MAS2_EPN   (~0xFFFUL)
 #define MAS2_X00x0040
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..6764a8e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -836,6 +836,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
 
+   /*
+* The exception type can change at this point, such as if the TLB entry
+* for the emulated instruction has been evicted.
+*/
+   kvmppc_prepare_for_emulation(vcpu, exit_nr);
+
/* restart interrupts if they were meant for the host */
kvmppc_restart_interrupt(vcpu, exit_nr);
 
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index 5fd1ba6..a0d0fea 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -90,6 +90,8 @@ void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu);
 void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu);
 
+void kvmppc_prepare_for_emulation(struct kvm_vcpu *vcpu, unsigned int 
*exit_nr);
+
 enum int_class {
INT_CLASS_NONCRIT,
INT_CLASS_CRIT,
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index 20c7a54..0538ab9 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -120,37 +120,20 @@
 
.if \flags  NEED_EMU
/*
-* This assumes you have external PID support.
-* To support a bookehv CPU without external PID, you'll
-* need to look up the TLB entry and create a temporary mapping.
-*
-* FIXME: we don't currently handle if the lwepx faults.  PR-mode
-* booke doesn't handle it either.  Since Linux doesn't use
-* broadcast tlbivax anymore, the only way this should happen is
-* if the guest maps its memory execute-but-not-read, or if we
-* somehow take a TLB miss in the middle of this entry code and
-* evict the relevant entry.  On e500mc, all kernel lowmem is
-* bolted into TLB1 large page mappings, and we don't use
-* broadcast invalidates, so we should not take a TLB miss here.
-*
-* Later we'll need to deal with faults here.  Disallowing guest
-* mappings that are execute-but-not-read could be an option on
-* e500mc, but not on chips with an LRAT if it is used.
+* We don't use external PID support. lwepx faults would need to be
+* handled by KVM and this implies additional code in DO_KVM (for
+* DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which
+* is too intrusive for the host. Get last instruction in
+* kvmppc_handle_exit().
 */
-
-   mfspr   r3, SPRN_EPLC   /* will already have correct ELPID 

Re: [RFC PATCH 0/6] KVM: PPC: Book3E: AltiVec support

2013-06-06 Thread Scott Wood

On 06/06/2013 04:42:44 AM, Caraman Mihai Claudiu-B02008 wrote:
    This looks like a bit much for 3.10 (certainly, subject lines like
    refactor and enhance and add support aren't going to make Linus
    happy given that we're past rc4) so I think we should apply
    http://patchwork.ozlabs.org/patch/242896/ for 3.10.  Then for 3.11,
    revert it after applying this patchset.
  
  Why not 1/6 plus e6500 removal?
 
 1/6 is not a bugfix.
 
Not sure I get it. Isn't this a better fix for AltiVec build breakage:

-#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 42
-#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 43
+#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 32
+#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 33

This removes the need for additional kvm_handlers. Obviously this doesn't
make AltiVec work, so we still need to disable e6500.


OK, didn't realize you meant it as an alternative fix to what was in my
patch.


-Scott


[PATCH 0/8] kvm/ppc: fixes for 3.10

2013-06-06 Thread Scott Wood
Most of these have been posted before, but I grouped them together as
there are some contextual dependencies between them.

Gleb/Paolo: As Alex doesn't appear to be back yet, can you apply these
if there's no objection over the next few days?

Mihai Caraman (1):
  kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage

Scott Wood (7):
  kvm/ppc/booke64: Disable e6500 support
  kvm/ppc/booke: Hold srcu lock when calling gfn functions
  kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
  kvm/ppc: Call trace_hardirqs_on before entry
  kvm/ppc: IRQ disabling cleanup
  kvm/ppc/booke: Delay kvmppc_fix_ee_before_entry
  kvm/ppc/booke: Don't call kvm_guest_enter twice

 arch/powerpc/include/asm/kvm_asm.h |   16 ++--
 arch/powerpc/include/asm/kvm_ppc.h |   17 ++---
 arch/powerpc/kvm/44x_tlb.c |5 +
 arch/powerpc/kvm/book3s_pr.c   |   16 +---
 arch/powerpc/kvm/booke.c   |   36 
 arch/powerpc/kvm/e500_mmu.c|5 +
 arch/powerpc/kvm/e500mc.c  |2 --
 arch/powerpc/kvm/powerpc.c |   25 ++---
 8 files changed, 73 insertions(+), 49 deletions(-)

-- 
1.7.10.4




[PATCH 3/8] kvm/ppc/booke: Hold srcu lock when calling gfn functions

2013-06-06 Thread Scott Wood
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/44x_tlb.c  |5 +
 arch/powerpc/kvm/booke.c|7 +++
 arch/powerpc/kvm/e500_mmu.c |5 +
 3 files changed, 17 insertions(+)

diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index 5dd3ab4..ed03854 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -441,6 +441,7 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
struct kvmppc_vcpu_44x *vcpu_44x = to_44x(vcpu);
struct kvmppc_44x_tlbe *tlbe;
unsigned int gtlb_index;
+   int idx;
 
gtlb_index = kvmppc_get_gpr(vcpu, ra);
if (gtlb_index = KVM44x_GUEST_TLB_SIZE) {
@@ -473,6 +474,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
return EMULATE_FAIL;
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
if (tlbe_is_host_safe(vcpu, tlbe)) {
gva_t eaddr;
gpa_t gpaddr;
@@ -489,6 +492,8 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 
rs, u8 ws)
kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
+
trace_kvm_gtlb_write(gtlb_index, tlbe-tid, tlbe-word0, tlbe-word1,
 tlbe-word2);
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 1020119..ecbe908 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -832,6 +832,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
 {
int r = RESUME_HOST;
int s;
+   int idx;
 
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
@@ -1053,6 +1054,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
break;
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
gpaddr = kvmppc_mmu_xlate(vcpu, gtlb_index, eaddr);
gfn = gpaddr  PAGE_SHIFT;
 
@@ -1075,6 +1078,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
kvmppc_account_exit(vcpu, MMIO_EXITS);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
break;
}
 
@@ -1098,6 +1102,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
kvmppc_account_exit(vcpu, ITLB_VIRT_MISS_EXITS);
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
gpaddr = kvmppc_mmu_xlate(vcpu, gtlb_index, eaddr);
gfn = gpaddr  PAGE_SHIFT;
 
@@ -1114,6 +1120,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
kvmppc_booke_queue_irqprio(vcpu, 
BOOKE_IRQPRIO_MACHINE_CHECK);
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
break;
}
 
diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index c41a5a9..6d6f153 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -396,6 +396,7 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
struct kvm_book3e_206_tlb_entry *gtlbe;
int tlbsel, esel;
int recal = 0;
+   int idx;
 
tlbsel = get_tlb_tlbsel(vcpu);
esel = get_tlb_esel(vcpu, tlbsel);
@@ -430,6 +431,8 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
kvmppc_set_tlb1map_range(vcpu, gtlbe);
}
 
+   idx = srcu_read_lock(vcpu-kvm-srcu);
+
/* Invalidate shadow mappings for the about-to-be-clobbered TLBE. */
if (tlbe_is_host_safe(vcpu, gtlbe)) {
u64 eaddr = get_tlb_eaddr(gtlbe);
@@ -444,6 +447,8 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu)
kvmppc_mmu_map(vcpu, eaddr, raddr, index_of(tlbsel, esel));
}
 
+   srcu_read_unlock(vcpu-kvm-srcu, idx);
+
kvmppc_set_exit_type(vcpu, EMULATED_TLBWE_EXITS);
return EMULATE_DONE;
 }
-- 
1.7.10.4




[PATCH 6/8] kvm/ppc: IRQ disabling cleanup

2013-06-06 Thread Scott Wood
Simplify the handling of lazy EE by going directly from fully-enabled
to hard-disabled.  This replaces the lazy_irq_pending() check
(including its misplaced kvm_guest_exit() call).

As suggested by Tiejun Chen, move the interrupt disabling into
kvmppc_prepare_to_enter() rather than have each caller do it.  Also
move the IRQ enabling on heavyweight exit into
kvmppc_prepare_to_enter().
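
Reduced to its call structure, the change looks like the sketch below
(illustrative only, not the actual kernel functions): the callee now owns
both IRQ transitions, and a non-positive return value implies interrupts
were re-enabled for the heavyweight exit path.

#include <stdio.h>

static int irqs_on = 1;                 /* toy IRQ state, not a kernel symbol */
static void hard_irq_disable(void) { irqs_on = 0; }
static void local_irq_enable(void) { irqs_on = 1; }

/* After the cleanup: the callee owns the IRQ transitions. */
static int prepare_to_enter(int need_exit_to_host)
{
        hard_irq_disable();             /* fully-enabled -> hard-disabled    */
        if (need_exit_to_host) {
                local_irq_enable();     /* heavyweight exit: re-enable here  */
                return -1;
        }
        return 1;                       /* stay disabled, caller enters guest */
}

int main(void)
{
        /* Callers now just check the return value. */
        if (prepare_to_enter(0) > 0)
                printf("entering guest, irqs_on=%d\n", irqs_on);
        if (prepare_to_enter(1) <= 0)
                printf("back to host, irqs_on=%d\n", irqs_on);
        return 0;
}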

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_ppc.h |6 ++
 arch/powerpc/kvm/book3s_pr.c   |   12 +++-
 arch/powerpc/kvm/booke.c   |   11 +++
 arch/powerpc/kvm/powerpc.c |   23 ++-
 4 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 6885846..e4474f8 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -404,6 +404,12 @@ static inline void kvmppc_fix_ee_before_entry(void)
trace_hardirqs_on();
 
 #ifdef CONFIG_PPC64
+   /*
+* To avoid races, the caller must have gone directly from having
+* interrupts fully-enabled to hard-disabled.
+*/
+   WARN_ON(local_paca-irq_happened != PACA_IRQ_HARD_DIS);
+
/* Only need to enable IRQs by hard enabling them after this */
local_paca-irq_happened = 0;
local_paca-soft_enabled = 1;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 0b97ce4..e61e39e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -884,14 +884,11 @@ program_interrupt:
 * and if we really did time things so badly, then we just exit
 * again due to a host external interrupt.
 */
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
-   if (s = 0) {
-   local_irq_enable();
+   if (s = 0)
r = s;
-   } else {
+   else
kvmppc_fix_ee_before_entry();
-   }
}
 
trace_kvm_book3s_reenter(r, vcpu);
@@ -1121,12 +1118,9 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 * really did time things so badly, then we just exit again due to
 * a host external interrupt.
 */
-   local_irq_disable();
ret = kvmppc_prepare_to_enter(vcpu);
-   if (ret = 0) {
-   local_irq_enable();
+   if (ret = 0)
goto out;
-   }
 
/* Save FPU state in stack */
if (current-thread.regs-msr  MSR_FP)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 08f4aa1..c5270a3 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -617,7 +617,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
local_irq_enable();
kvm_vcpu_block(vcpu);
clear_bit(KVM_REQ_UNHALT, vcpu-requests);
-   local_irq_disable();
+   hard_irq_disable();
 
kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS);
r = 1;
@@ -666,10 +666,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
return -EINVAL;
}
 
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
if (s = 0) {
-   local_irq_enable();
ret = s;
goto out;
}
@@ -1161,14 +1159,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 * aren't already exiting to userspace for some other reason.
 */
if (!(r  RESUME_HOST)) {
-   local_irq_disable();
s = kvmppc_prepare_to_enter(vcpu);
-   if (s = 0) {
-   local_irq_enable();
+   if (s = 0)
r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
-   } else {
+   else
kvmppc_fix_ee_before_entry();
-   }
}
 
return r;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4e05f8c..2f7a221 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
 {
int r = 1;
 
-   WARN_ON_ONCE(!irqs_disabled());
+   WARN_ON(irqs_disabled());
+   hard_irq_disable();
+
while (true) {
if (need_resched()) {
local_irq_enable();
cond_resched();
-   local_irq_disable();
+   hard_irq_disable();
continue;
}
 
@@ -95,7 +97,7 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
local_irq_enable();
trace_kvm_check_requests(vcpu);
r = 

[PATCH 8/8] kvm/ppc/booke: Don't call kvm_guest_enter twice

2013-06-06 Thread Scott Wood
kvm_guest_enter() was already called by kvmppc_prepare_to_enter().
Don't call it again.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index f953324..0b4d792 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -672,8 +672,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
goto out;
}
 
-   kvm_guest_enter();
-
 #ifdef CONFIG_PPC_FPU
/* Save userspace FPU state in stack */
enable_kernel_fp();
-- 
1.7.10.4




[PATCH 4/8] kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()

2013-06-06 Thread Scott Wood
EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.

Without this, we get warnings such as the one at
arch/powerpc/kernel/time.c:300, and sometimes the host kernel hangs.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index ecbe908..5cd7ad0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -834,6 +834,17 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
int s;
int idx;
 
+#ifdef CONFIG_PPC64
+   WARN_ON(local_paca-irq_happened != 0);
+#endif
+
+   /*
+* We enter with interrupts disabled in hardware, but
+* we need to call hard_irq_disable anyway to ensure that
+* the software state is kept in sync.
+*/
+   hard_irq_disable();
+
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
 
-- 
1.7.10.4




[PATCH 1/8] kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage

2013-06-06 Thread Scott Wood
From: Mihai Caraman mihai.cara...@freescale.com

Interrupt numbers defined for Book3E follow the IVOR definitions. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule, which also fixes the build breakage.
IVORs 32 and 33 are shared, so reflect this in the interrupt naming.

This fixes a build break for 64-bit booke KVM.

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_asm.h |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_asm.h 
b/arch/powerpc/include/asm/kvm_asm.h
index b9dd382..851bac7 100644
--- a/arch/powerpc/include/asm/kvm_asm.h
+++ b/arch/powerpc/include/asm/kvm_asm.h
@@ -54,8 +54,16 @@
 #define BOOKE_INTERRUPT_DEBUG 15
 
 /* E500 */
-#define BOOKE_INTERRUPT_SPE_UNAVAIL 32
-#define BOOKE_INTERRUPT_SPE_FP_DATA 33
+#define BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL 32
+#define BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST 33
+/*
+ * TODO: Unify 32-bit and 64-bit kernel exception handlers to use same defines
+ */
+#define BOOKE_INTERRUPT_SPE_UNAVAIL BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL
+#define BOOKE_INTERRUPT_SPE_FP_DATA BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST
+#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL BOOKE_INTERRUPT_SPE_ALTIVEC_UNAVAIL
+#define BOOKE_INTERRUPT_ALTIVEC_ASSIST \
+   BOOKE_INTERRUPT_SPE_FP_DATA_ALTIVEC_ASSIST
 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34
 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35
 #define BOOKE_INTERRUPT_DOORBELL 36
@@ -67,10 +75,6 @@
 #define BOOKE_INTERRUPT_HV_SYSCALL 40
 #define BOOKE_INTERRUPT_HV_PRIV 41
 
-/* altivec */
-#define BOOKE_INTERRUPT_ALTIVEC_UNAVAIL 42
-#define BOOKE_INTERRUPT_ALTIVEC_ASSIST 43
-
 /* book3s */
 
 #define BOOK3S_INTERRUPT_SYSTEM_RESET  0x100
-- 
1.7.10.4




[PATCH 5/8] kvm/ppc: Call trace_hardirqs_on before entry

2013-06-06 Thread Scott Wood
Currently this is only being done on 64-bit.  Rather than just move it
out of the 64-bit ifdef, move it to kvmppc_lazy_ee_enable() so that it is
consistent with the lazy EE state, and so that we don't track more host
code as interrupts-enabled than necessary.

Rename kvmppc_lazy_ee_enable() to kvmppc_fix_ee_before_entry() to reflect
that this function now has a role on 32-bit as well.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/include/asm/kvm_ppc.h |   11 ---
 arch/powerpc/kvm/book3s_pr.c   |4 ++--
 arch/powerpc/kvm/booke.c   |4 ++--
 arch/powerpc/kvm/powerpc.c |2 --
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a5287fe..6885846 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -394,10 +394,15 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
}
 }
 
-/* Please call after prepare_to_enter. This function puts the lazy ee state
-   back to normal mode, without actually enabling interrupts. */
-static inline void kvmppc_lazy_ee_enable(void)
+/*
+ * Please call after prepare_to_enter. This function puts the lazy ee and irq
+ * disabled tracking state back to normal mode, without actually enabling
+ * interrupts.
+ */
+static inline void kvmppc_fix_ee_before_entry(void)
 {
+   trace_hardirqs_on();
+
 #ifdef CONFIG_PPC64
/* Only need to enable IRQs by hard enabling them after this */
local_paca-irq_happened = 0;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index bdc40b8..0b97ce4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -890,7 +890,7 @@ program_interrupt:
local_irq_enable();
r = s;
} else {
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
}
}
 
@@ -1161,7 +1161,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
if (vcpu-arch.shared-msr  MSR_FP)
kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
 
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
 
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 5cd7ad0..08f4aa1 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -673,7 +673,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
ret = s;
goto out;
}
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
 
kvm_guest_enter();
 
@@ -1167,7 +1167,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
local_irq_enable();
r = (s  2) | RESUME_HOST | (r  RESUME_FLAG_NV);
} else {
-   kvmppc_lazy_ee_enable();
+   kvmppc_fix_ee_before_entry();
}
}
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6316ee3..4e05f8c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -117,8 +117,6 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
kvm_guest_exit();
continue;
}
-
-   trace_hardirqs_on();
 #endif
 
kvm_guest_enter();
-- 
1.7.10.4




[PATCH 2/8] kvm/ppc/booke64: Disable e6500 support

2013-06-06 Thread Scott Wood
The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented).  Disable e6500 on KVM until this is
fixed.

Signed-off-by: Scott Wood scottw...@freescale.com
---
Mihai has posted RFC patches for proper Altivec support, so disabling
e6500 should only be needed for 3.10.
---
 arch/powerpc/kvm/e500mc.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 753cc99..19c8379 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -177,8 +177,6 @@ int kvmppc_core_check_processor_compat(void)
r = 0;
else if (strcmp(cur_cpu_spec-cpu_name, e5500) == 0)
r = 0;
-   else if (strcmp(cur_cpu_spec-cpu_name, e6500) == 0)
-   r = 0;
else
r = -ENOTSUPP;
 
-- 
1.7.10.4




[PATCH 7/8] kvm/ppc/booke: Delay kvmppc_fix_ee_before_entry

2013-06-06 Thread Scott Wood
kvmppc_fix_ee_before_entry() should be called as late as possible,
or else we get things like WARN_ON(preemptible()) in enable_kernel_fp()
in configurations where preemptible() works.

Note that book3s_pr already waits until just before __kvmppc_vcpu_run
to call kvmppc_fix_ee_before_entry().

Signed-off-by: Scott Wood scottw...@freescale.com
---
 arch/powerpc/kvm/booke.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index c5270a3..f953324 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -671,7 +671,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
ret = s;
goto out;
}
-   kvmppc_fix_ee_before_entry();
 
kvm_guest_enter();
 
@@ -697,6 +696,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
kvmppc_load_guest_fp(vcpu);
 #endif
 
+   kvmppc_fix_ee_before_entry();
+
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
-- 
1.7.10.4

