Re: Fix for bug that causes KVM_GET_SUPPORTED_CPUID failed errors.

2012-01-24 Thread Gabe Black
Yes, I saw the problem with 3.0.13. Thanks!

Gabe

On Mon, Jan 23, 2012 at 11:40 PM, Sasha Levin levinsasha...@gmail.com wrote:
 The GET_SUPPORTED_CPUID bug has been fixed and shouldn't be happening
 from v3.2 onwards.

 Do you still see the issue in older versions?

 On Mon, 2012-01-23 at 21:20 -0800, Gabe Black wrote:
 Sorry, forgot to add a subject.

 Gabe

 On Mon, Jan 23, 2012 at 9:18 PM, Gabe Black gabebl...@google.com wrote:
  Hi, I think I've tracked down the bug that causes
  KVM_GET_SUPPORTED_CPUID failed: Argument list too long errors when
  using the kvm tool. Basically, this (possibly squished) code seems to
  be to blame:
 
  case 0xd: {
  int i;
 
  entry-flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
  for (i = 1; *nent  maxnent  i  64; ++i) {
  if (entry[i].eax == 0)
  continue;
  do_cpuid_1_ent(entry[i], function, i);
  entry[i].flags |=
       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
  ++*nent;
  }
  break;
  }
 
  You can see there's a check whether entry[i].eax is 0, but it isn't
  until the next line that entry[i] is actually filled in. That means
  that whether or not an entry is filled in for the 0xd function is
  essentially random, and that can lead to the loss of valid entries. It
  also means that nent may be incremented too often, and since all 64
  entries are iterated over, that can fill up the available storage and
  cause that error.
 
  I tested my theory by commenting out the if (100% failure rate) and
  moving it after do_cpuid_1_ent (100% success rate). Since this is a
  non-deterministic failure that isn't really conclusive, but I'm fairly
  confident my fix is correct. I don't know exactly what your procedure
  is for submitting patches, but one is attached.
 
  Gabe
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


 --

 Sasha.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[no subject]

2012-01-23 Thread Gabe Black
Hi, I think I've tracked down the bug that causes
KVM_GET_SUPPORTED_CPUID failed: Argument list too long errors when
using the kvm tool. Basically, this (possibly squished) code seems to
be to blame:

case 0xd: {
int i;

entry-flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
for (i = 1; *nent  maxnent  i  64; ++i) {
if (entry[i].eax == 0)
continue;
do_cpuid_1_ent(entry[i], function, i);
entry[i].flags |=
  KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
++*nent;
}
break;
}

You can see there's a check whether entry[i].eax is 0, but it isn't
until the next line that entry[i] is actually filled in. That means
that whether or not an entry is filled in for the 0xd function is
essentially random, and that can lead to the loss of valid entries. It
also means that nent may be incremented too often, and since all 64
entries are iterated over, that can fill up the available storage and
cause that error.

I tested my theory by commenting out the if (100% failure rate) and
moving it after do_cpuid_1_ent (100% success rate). Since this is a
non-deterministic failure that isn't really conclusive, but I'm fairly
confident my fix is correct. I don't know exactly what your procedure
is for submitting patches, but one is attached.

Gabe
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 77c9d86..35d7ae0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2414,9 +2414,9 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 
 		entry-flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		for (i = 1; *nent  maxnent  i  64; ++i) {
+			do_cpuid_1_ent(entry[i], function, i);
 			if (entry[i].eax == 0)
 continue;
-			do_cpuid_1_ent(entry[i], function, i);
 			entry[i].flags |=
 			   KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 			++*nent;


Fix for bug that causes KVM_GET_SUPPORTED_CPUID failed errors.

2012-01-23 Thread Gabe Black
Sorry, forgot to add a subject.

Gabe

On Mon, Jan 23, 2012 at 9:18 PM, Gabe Black gabebl...@google.com wrote:
 Hi, I think I've tracked down the bug that causes
 KVM_GET_SUPPORTED_CPUID failed: Argument list too long errors when
 using the kvm tool. Basically, this (possibly squished) code seems to
 be to blame:

 case 0xd: {
 int i;

 entry-flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 for (i = 1; *nent  maxnent  i  64; ++i) {
 if (entry[i].eax == 0)
 continue;
 do_cpuid_1_ent(entry[i], function, i);
 entry[i].flags |=
      KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 ++*nent;
 }
 break;
 }

 You can see there's a check whether entry[i].eax is 0, but it isn't
 until the next line that entry[i] is actually filled in. That means
 that whether or not an entry is filled in for the 0xd function is
 essentially random, and that can lead to the loss of valid entries. It
 also means that nent may be incremented too often, and since all 64
 entries are iterated over, that can fill up the available storage and
 cause that error.

 I tested my theory by commenting out the if (100% failure rate) and
 moving it after do_cpuid_1_ent (100% success rate). Since this is a
 non-deterministic failure that isn't really conclusive, but I'm fairly
 confident my fix is correct. I don't know exactly what your procedure
 is for submitting patches, but one is attached.

 Gabe
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


unhandled vm exit: 0x80000021 vcpu_id 0

2009-05-29 Thread Gabe Black
   Hello again. I'm making more progress getting KVM going in M5, and 
right now I'm trying to figure out why I'm getting an unhandled vm exit 
with exit code 0x8021. According to Intel's manual, something about 
the guest state isn't being set up correctly. I dumped the initial 
register state for the 0th virtual CPU and noticed that some things 
Intel claims are illegal show up there, for instance having paging and 
protected mode disabled. I'm assuming there's some cooking done to the 
state as presented to KVM to, for instance, substitute V8086 mode for 
real mode, etc. I've fixed a number of bugs in M5 that cleaned up some 
issues, but I'm hoping somebody with more knowledge can tell me what 
illegal state is still there that would make it through the kvms 
twiddling and cause VMX to abort. One thing that I know looks funny is 
that the limit on the IDT is zero, but I haven't been able to find any 
evidence in the manuals that that's considered wrong rather than just a 
bad idea. Any help here would be very appreciated!


Gabe

rax  rbx  rcx  rdx 

rsi 00090200 rdi  rsp  rbp 

r8   r9   r10  r11 

r12  r13  r14  r15 


rip 0020 rflags 0002
cs 0008 (/ p 1 dpl 0 db 0 s 1 type a l 1 g 1 avl 0)
ds 0010 (/ p 1 dpl 0 db 1 s 1 type 2 l 0 g 1 avl 0)
es 0010 (/ p 1 dpl 0 db 1 s 1 type 2 l 0 g 1 avl 0)
ss 0010 (/ p 1 dpl 0 db 1 s 1 type 2 l 0 g 1 avl 0)
fs 0010 (/ p 1 dpl 0 db 1 s 1 type 2 l 0 g 1 avl 0)
gs 0010 (/ p 1 dpl 0 db 1 s 1 type 2 l 0 g 1 avl 0)
tr 0018 (/ p 1 dpl 0 db 1 s 0 type b l 0 g 1 avl 0)
ldt  (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt 76000/17
idt 0/0
cr0 8011 cr2 0 cr3 7 cr4 20 cr8 0 efer 500

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NULL pointer dereference in kernel code, ignored parameters in libkvm

2009-05-24 Thread Gabe Black
Avi Kivity wrote:
 Gabe Black wrote:
 Hi. I'm a developer on the M5 simulator (m5sim.org) working on a CPU
 model which uses kvm as its execution engine. 

 Neat stuff.  You're using kvm to run non-x86 code on x86?

Right now we're trying to run x86 on x86 but using our device models
instead of QEMUs. It would be interesting to use it for other ISAs, but
as far as I know we hadn't considered that.


 I ran into a kernel BUG
 where a NULL pointer is being dereferenced in gfn_to_rmap.

 What's happening on the kernel side is that gfn_to_rmap is calling
 gfn_to_memslot. That function looks for the gfn in the memory slots,
 fails to find it, and returns a NULL pointer. gfn_to_rmap then tries to
 dereference it, and the kernel kills itself. I believe the original
 source of the call to gfn_to_memslot was mmu_alloc_roots (in 2.6.28.9,
 it may have moved) which tries to get the page pointed to by CR3 using
 kvm_mmu_get_page. That part may not be correct, so here's the log output
 from the kernel.
   

 This was fixed by 89da4ff17 (KVM: x86: check for cr3 validity in
 mmu_alloc_roots).  Did the code base you were testing contain that?7

It was the 2.6.28.9 kernel, and looking at the patch and lxr it appears not.


 The second problem was the fact that CR3 didn't point to any memory even
 though it had a valid value (0x7000). This was because our code relied
 on kvm_create to set up physical memory, and while it takes parameters
 for it and passes them around, it never actually seems to do anything
 with them. This also seems to be the case in your most recent code.

   

 You should set up the memory independently using the memory slot APIs,
 then load CR3.  kvm_create() has bitrotted a bit.

Will do.


 I am a full time employee of VMware, and while I work on M5 on my own
 time, that places certain limits on what I can do to help fix these
 bugs. While I probably can't implement anything, I should be able to
 provide more information about what we're doing with M5 or about the
 crash if that would help.
   

 I appreciate the reports.  Please test latest kvm.git and let us know
 if the problems persist.

Will do.


 It would also be interesting to hear how you use kvm.


Ultimately, we'd like to use KVM for at least two things. The first is
as a way to fast forward simulations to the portion of interest before
switching into something slower that can collect interesting statistics
and accurately simulate performance. Our immediate goal on our way to
that is to get a CPU based around KVM to boot Linux while hooked into
our device models. We're very early in the process, but one challenge
I'm anticipating is being able to pull the local APIC out of the virtual
CPU and into M5 so that we can coordinate IPIs, etc., ourselves.

The other thing we'd like to do is to use KVM as a golden model to
verify our correctness against. To do that, we'll probably need to make
significant progress on the above, and then also find a mechanism to
make each CPU advance in very incremental and deterministic ways. Our
thought on that so far as been to set the TF bit in the guest and use
the #DB to exit back to the host. We expect there will be some gotchas
with this like hold off on mov to %ss, but if there would be any show
stopper problems please let us know.

Gabe
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


potential tss trampling, assumptions about physical memory layout

2009-05-24 Thread Gabe Black
While continuing to try to get KVM and our M5 simulator to work
together, I ran into another issue.

During VCPU bring up in x86 under VMX, the function init_rmode_tss
is called which seems to be writing an initial version of a TSS into
guest memory. It's not immediately clear to me why that would be a
necessary part of CPU initialization since, according to my
understanding of the ISA, the TSS should only be written by CPU itself
(as apposed to a regular load or store) when doing a task switch. This
doesn't present an immediate problem for us, but if CPUs are not all
started at the same time, it would be possible for them to trample a
shared TSS or whatever would end up beneath it.

There has also been a strange problem that crops up when the tss is
put at an unusual address (more on that next) where copy_to_user within
kvm_clear_guest_page will work or not work depending on seemingly
arbitrary circumstances. Are there any restrictions on how to mmap the
backing memory or on how it's used at that point in CPU initialization?
I'd need to look into this more to get a good idea of what's going on,
but it seems to defy logic and any guidance would be useful.

init_rmode_tss uses a function called rmode_tss_base to figure out
where to put this initial tss, and there seem to be a few assumptions
about what the physical memory layout looks like that are not
necessarily true. The version of the code in the kernel I'm working with
(2.6.28.9) looks like the following:

static gva_t rmode_tss_base(struct kvm *kvm)
{
if (!kvm-arch.tss_addr) {
gfn_t base_gfn = kvm-memslots[0].base_gfn +
 kvm-memslots[0].npages - 3;
return base_gfn  PAGE_SHIFT;
}
return kvm-arch.tss_addr;
}

The first assumption is that there is a valid element in the
memslots array. If not the guest may be in trouble anyway, but the
kernel may behave unpredictably in that case. Second, this code seems to
assume what I believe is the standard QEMU style memory layout with a
chunk of physical memory from 0x0 - 0xA, the VGA framebuffer, and
then 0xC and upwards. In that case, memslots[0] would end at
0xA, and going three pages back from the end would still be a valid
realmode address. In our case, I was not leaving a gap for VGA and
instead allocating all ~500 MB of memory in one chunk, so three pages
from the end of memslot[0] was WAY outside the bounds of real mode
addressing. If this TSS is indeed for use by the guest in real mode,
that doesn't seem like it would work. Finally, it assumes that
memslots[0].npages is greater than 3 in the first place. If it isn't,
you'll end up with wrapping and try to install a real mode TSS at the
end of the address space, even farther from what real mode can handle
and potentially entirely invalid.

Gabe
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html