Re: [PATCH] kvm: add missing void __user * cast to access_ok() call

2011-05-23 Thread Takuya Yoshikawa
On Tue, 24 May 2011 07:51:27 +0200
Heiko Carstens  wrote:

> From: Heiko Carstens 
> 
> fa3d315a "KVM: Validate userspace_addr of memslot when registered" introduced
> this new warning on s390:
> 
> kvm_main.c: In function '__kvm_set_memory_region':
> kvm_main.c:654:7: warning: passing argument 1 of '__access_ok' makes pointer 
> from integer without a cast
> arch/s390/include/asm/uaccess.h:53:19: note: expected 'const void *' but 
> argument is of type '__u64'
> 
> Add the missing cast to get rid of it again...
> 

Looks good to me, thank you!

I should have checked s390's type checking...

  Takuya


> Cc: Takuya Yoshikawa 
> Signed-off-by: Heiko Carstens 
> ---
>  virt/kvm/kvm_main.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -651,7 +651,8 @@ int __kvm_set_memory_region(struct kvm *
>   /* We can read the guest memory with __xxx_user() later on. */
>   if (user_alloc &&
>   ((mem->userspace_addr & (PAGE_SIZE - 1)) ||
> -  !access_ok(VERIFY_WRITE, mem->userspace_addr, mem->memory_size)))
> +  !access_ok(VERIFY_WRITE, (void __user *)mem->userspace_addr,
> + mem->memory_size)))
>   goto out;
>   if (mem->slot >= KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
>   goto out;


-- 
Takuya Yoshikawa 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: add missing void __user * cast to access_ok() call

2011-05-23 Thread Heiko Carstens
From: Heiko Carstens 

fa3d315a "KVM: Validate userspace_addr of memslot when registered" introduced
this new warning on s390:

kvm_main.c: In function '__kvm_set_memory_region':
kvm_main.c:654:7: warning: passing argument 1 of '__access_ok' makes pointer 
from integer without a cast
arch/s390/include/asm/uaccess.h:53:19: note: expected 'const void *' but 
argument is of type '__u64'

Add the missing cast to get rid of it again...

Cc: Takuya Yoshikawa 
Signed-off-by: Heiko Carstens 
---
 virt/kvm/kvm_main.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -651,7 +651,8 @@ int __kvm_set_memory_region(struct kvm *
/* We can read the guest memory with __xxx_user() later on. */
if (user_alloc &&
((mem->userspace_addr & (PAGE_SIZE - 1)) ||
-!access_ok(VERIFY_WRITE, mem->userspace_addr, mem->memory_size)))
+!access_ok(VERIFY_WRITE, (void __user *)mem->userspace_addr,
+   mem->memory_size)))
goto out;
if (mem->slot >= KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
goto out;


RE: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Tian, Kevin
> From: Nadav Har'El
> Sent: Tuesday, May 24, 2011 2:51 AM
> 
> > >+  vmcs_init(vmx->loaded_vmcs->vmcs);
> > >+  vmx->loaded_vmcs->cpu = -1;
> > >+  vmx->loaded_vmcs->launched = 0;
> >
> > Perhaps a loaded_vmcs_init() to encapsulate initialization of these
> > three fields, you'll probably reuse it later.
> 
> It's good you pointed this out, because it made me suddenly realise that I
> forgot to VMCLEAR the new vmcs02's I allocate. In practice it never made a
> difference, but better safe than sorry.

yes, that's what the spec requires. You need a VMCLEAR on any new VMCS,
since VMCLEAR does implementation-specific initialization in that VMCS region.

> 
> I had to restructure some of the code a bit to be able to properly use this
> new function (in 3 places - __loaded_vmcs_clear, nested_get_current_vmcs02,
> vmx_create_cpu).
> 
> > Please repost separately after the fix, I'd like to apply it before the
> > rest of the series.
> 
> I am adding a new version of this patch at the end of this mail.
> 
> > (regarding interrupts, I think we can do that work post-merge.  But I'd
> > like to see Kevin's comments addressed)
> 
> I replied to his comments, did some of the things he asked, and asked for
> more info on why/where he believes the current code is incorrect where I
> didn't understand which problems he was pointing to, and am now waiting
> for him to reply.

As I replied in another thread, I believe this has been explained clearly by 
Nadav.

> 
> 
> --- 8< -- 8< -- 8< -- 8< --- 8< ---
> 
> Subject: [PATCH 01/31] nVMX: Keep list of loaded VMCSs, instead of vcpus.
> 
> In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
> because (at least in theory) the processor might not have written all of its
> content back to memory. Since a patch from June 26, 2008, this is done using
> a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.
> 
> The problem is that with nested VMX, we no longer have the concept of a
> vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
> L2s), and each of those may have been last loaded on a different cpu.
> 
> So instead of linking the vcpus, we link the VMCSs, using a new structure
> loaded_vmcs. This structure contains the VMCS, and the information pertaining
> to its loading on a specific cpu (namely, the cpu number, and whether it
> was already launched on this cpu once). In nested we will also use the same
> structure to hold L2 VMCSs, and vmx->loaded_vmcs is a pointer to the
> currently active VMCS.
> 
> Signed-off-by: Nadav Har'El 
> ---
>  arch/x86/kvm/vmx.c |  150 ---
>  1 file changed, 86 insertions(+), 64 deletions(-)
> 
> --- .before/arch/x86/kvm/vmx.c2011-05-23 21:46:14.0 +0300
> +++ .after/arch/x86/kvm/vmx.c 2011-05-23 21:46:14.0 +0300
> @@ -116,6 +116,18 @@ struct vmcs {
>   char data[0];
>  };
> 
> +/*
> + * Track a VMCS that may be loaded on a certain CPU. If it is (cpu!=-1), also
> + * remember whether it was VMLAUNCHed, and maintain a linked list of all
> + * VMCSs loaded on this CPU (so we can clear them if the CPU goes down).
> + */
> +struct loaded_vmcs {
> + struct vmcs *vmcs;
> + int cpu;
> + int launched;
> + struct list_head loaded_vmcss_on_cpu_link;
> +};
> +
>  struct shared_msr_entry {
>   unsigned index;
>   u64 data;
> @@ -124,9 +136,7 @@ struct shared_msr_entry {
> 
>  struct vcpu_vmx {
>   struct kvm_vcpu   vcpu;
> - struct list_head  local_vcpus_link;
>   unsigned long host_rsp;
> - int   launched;
>   u8fail;
>   u8cpl;
>   bool  nmi_known_unmasked;
> @@ -140,7 +150,14 @@ struct vcpu_vmx {
>   u64   msr_host_kernel_gs_base;
>   u64   msr_guest_kernel_gs_base;
>  #endif
> - struct vmcs  *vmcs;
> + /*
> +  * loaded_vmcs points to the VMCS currently used in this vcpu. For a
> +  * non-nested (L1) guest, it always points to vmcs01. For a nested
> +  * guest (L2), it points to a different VMCS.
> +  */
> + struct loaded_vmcsvmcs01;
> + struct loaded_vmcs   *loaded_vmcs;
> + bool  __launched; /* temporary, used in vmx_vcpu_run */
>   struct msr_autoload {
>   unsigned nr;
>   struct vmx_msr_entry guest[NR_AUTOLOAD_MSRS];
> @@ -200,7 +217,11 @@ static int vmx_set_tss_addr(struct kvm *
> 
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> -static DEFINE_PER_CPU(struct list_head, vcpus_on_cpu);
> +/*
> + * We maintain a per-CPU linked-list of VMCSs loaded on that CPU. This is
> + * needed when a CPU is brought down, and we need to VMCLEAR all VMCSs
> + * loaded on it.
> + */
> +static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
>  static DEFINE_PER_CPU(struct desc_ptr, host_gdt);

RE: [PATCH 07/31] nVMX: Introduce vmcs02: VMCS used to run L2

2011-05-23 Thread Tian, Kevin
> From: Nadav Har'El [mailto:n...@math.technion.ac.il]
> Sent: Sunday, May 22, 2011 4:30 PM
> 
> Hi,
> 
> On Fri, May 20, 2011, Tian, Kevin wrote about "RE: [PATCH 07/31] nVMX:
> Introduce vmcs02: VMCS used to run L2":
> > Possibly we can maintain the vmcs02 pool along with L1 VMCLEAR ops, which
> > is similar to the hardware behavior regarding to cleared and launched state.
> 
> If you set VMCS02_POOL_SIZE to a large size, and L1, like typical hypervisors,
> only keeps around a few VMCSs (and VMCLEARs the ones it will not use again),
> then we'll only have a few vmcs02: handle_vmclear() removes from the pool the
> vmcs02 that L1 explicitly told us it won't need again.

yes

> 
> > > +struct saved_vmcs {
> > > + struct vmcs *vmcs;
> > > + int cpu;
> > > + int launched;
> > > +};
> >
> > "saved" looks a bit misleading here. It's simply a list of all active
> > vmcs02 tracked by kvm, isn't it?
> 
> I have rewritten this part of the code, based on Avi's and Marcelo's requests,
> and the new name for this structure is "loaded_vmcs", i.e., a structure
> describing where a VMCS was loaded.

great, I'll take a look at your new code.

Thanks
Kevin


RE: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Tian, Kevin
> From: Avi Kivity
> Sent: Monday, May 23, 2011 11:49 PM
> (regarding interrupts, I think we can do that work post-merge.  But I'd
> like to see Kevin's comments addressed)

My earlier comment has been addressed by Nadav with his explanation.

Thanks
Kevin


RE: [PATCH 07/31] nVMX: Introduce vmcs02: VMCS used to run L2

2011-05-23 Thread Tian, Kevin
> From: Nadav Har'El [mailto:n...@math.technion.ac.il]
> Sent: Sunday, May 22, 2011 3:23 PM
> 
> Hi,
> 
> On Sun, May 22, 2011, Tian, Kevin wrote about "RE: [PATCH 07/31] nVMX:
> Introduce vmcs02: VMCS used to run L2":
> > Here the vmcs02 being overridden may have been run on another processor
> > before, but is not vmclear-ed yet. When you resume this vmcs02 with new
> > content on a separate processor, the risk of corruption exists.
> 
> I still believe that my current code is correct (in this area). I'll try to
> explain it here and would be grateful if you could point to me the error (if
> there is one) in my logic:
> 
> Nested_vmx_run() is our function which switches from running L1 to L2
> (patch 18).
> 
> This function starts by calling nested_get_current_vmcs02(), which gets us
> *some* vmcs to use for vmcs02. This may be a fresh new VMCS, or a
> "recycled"
> VMCS, some VMCS we've previously used to run some, potentially different L2
> guest on some, potentially different, CPU.
> nested_get_current_vmcs02() returns a "saved_vmcs" structure, which
> not only contains a VMCS, but also remembers on which (if any) cpu it is
> currently loaded (and whether it was VMLAUNCHed once on that cpu).
> 
> The next thing that Nested_vmx_run() now does is to set up in the vcpu object
> the vmcs, cpu and launched fields according to what was returned above.
> 
> Now it calls vmx_vcpu_load(). This standard KVM function checks if we're now
> running on a different CPU from vcpu->cpu, and if it is a different one, it
> uses vcpu_clear() to VMCLEAR the vmcs on the CPU where it was last loaded
> (using an IPI). Only after it vmclears the VMCS on the old CPU can it
> finally load the VMCS on the new CPU.
> 
> Only now can Nested_vmx_run() call prepare_vmcs02(), which starts
> VMWRITEing to this VMCS, and finally returns.
> 

Yes, you're correct. Previously I had only looked at patch 07/31 and raised
the above concern. With the nested_vmx_run() flow you explained above, this
part is clear to me now. :-)

> P.S. Seeing that you're from Intel, maybe you can help me with a pointer:
> I found what appears to be a small error in the SDM - who can I report it to?
> 

Let me ask for you.

Thanks
Kevin


memory zones and the KVM guest kernel

2011-05-23 Thread David Evensky

Hi,

When I boot my guest kernel with KVM, the dmesg output says that:

...
[0.00] Zone PFN ranges:
[0.00]   DMA  0x0010 -> 0x1000
[0.00]   DMA32    0x1000 -> 0x0010
[0.00]   Normal   empty
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[2] active PFN ranges
[0.00] 0: 0x0010 -> 0x009f
[0.00] 0: 0x0100 -> 0x0007fffd
...

Why is the Normal Zone empty? Is it possible to have some of the
guest's memory mapped in the Normal zone?

Is there a good reference that talks about the normal, movable,
etc. memory zones?

Thanks,
\dae



Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Gleb Natapov
On Mon, May 23, 2011 at 09:59:01PM +0300, Nadav Har'El wrote:
> On Mon, May 23, 2011, Gleb Natapov wrote about "Re: [PATCH 08/31] nVMX: Fix 
> local_vcpus_link handling":
> > On Mon, May 23, 2011 at 06:49:17PM +0300, Avi Kivity wrote:
> > > (regarding interrupts, I think we can do that work post-merge.  But
> > > I'd like to see Kevin's comments addressed)
> > > 
> > To be fair this wasn't addressed for almost two years now.
> 
> Gleb, I assume by "this" you meant the idt-vectoring information issue, not
> Kevin's comments (which I only saw a couple of days ago)?
> 
Yes, of course.

--
Gleb.


Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Nadav Har'El
On Mon, May 23, 2011, Gleb Natapov wrote about "Re: [PATCH 08/31] nVMX: Fix 
local_vcpus_link handling":
> On Mon, May 23, 2011 at 06:49:17PM +0300, Avi Kivity wrote:
> > (regarding interrupts, I think we can do that work post-merge.  But
> > I'd like to see Kevin's comments addressed)
> > 
> To be fair this wasn't addressed for almost two years now.

Gleb, I assume by "this" you meant the idt-vectoring information issue, not
Kevin's comments (which I only saw a couple of days ago)?

-- 
Nadav Har'El|   Monday, May 23 2011, 20 Iyyar 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Someone offered you a cute little quote
http://nadav.harel.org.il   |for your signature? JUST SAY NO!


Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Nadav Har'El
Hi, and thanks again for the reviews,

On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 08/31] nVMX: Fix 
local_vcpus_link handling":
> > if (need_emulate_wbinvd(vcpu)) {
> > if (kvm_x86_ops->has_wbinvd_exit())
> > cpumask_set_cpu(cpu, vcpu->arch.wbinvd_dirty_mask);
> >-else if (vcpu->cpu != -1 && vcpu->cpu != cpu)
> >+else if (vcpu->cpu != -1 && vcpu->cpu != cpu
> >+&& cpu_online(vcpu->cpu))
> > smp_call_function_single(vcpu->cpu,
> > wbinvd_ipi, NULL, 1);
> > }
> 
> Is this a necessary part of this patch?  Or an semi-related bugfix?
> 
> I think that it can't actually trigger before this patch due to luck.  
> svm doesn't clear vcpu->cpu on cpu offline, but on the other hand it 
> ->has_wbinvd_exit().

Well, this was Marcelo's patch:  When I suggested that we might have problems
because vcpu->cpu now isn't cleared to -1 when a cpu is offlined, he looked
at the code and said that he thinks this is the only place that will have
problems, and offered this patch, which I simply included in mine. I'm afraid
to admit I don't understand that part of the code, so I can't judge if this
is important or not. I'll drop it from my patch for now (and you can apply
Marcelo's patch separately).

> >+if (vmx->loaded_vmcs->cpu != cpu) {
> > struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
> > unsigned long sysenter_esp;
> >
> > kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
> > local_irq_disable();
> >-list_add(&vmx->local_vcpus_link,
> >-&per_cpu(vcpus_on_cpu, cpu));
> >+list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
> >+&per_cpu(loaded_vmcss_on_cpu, cpu));
> > local_irq_enable();
> >
> > /*
> >@@ -999,13 +1020,15 @@ static void vmx_vcpu_load(struct kvm_vcp
> > rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
> > vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 
> > */
> > }
> >+vmx->loaded_vmcs->cpu = cpu;
> This should be within the if () block.

Makes sense :-) Done.

> >+vmcs_init(vmx->loaded_vmcs->vmcs);
> >+vmx->loaded_vmcs->cpu = -1;
> >+vmx->loaded_vmcs->launched = 0;
> 
> Perhaps a loaded_vmcs_init() to encapsulate initialization of these 
> three fields, you'll probably reuse it later.

It's good you pointed this out, because it made me suddenly realise that I
forgot to VMCLEAR the new vmcs02's I allocate. In practice it never made a
difference, but better safe than sorry.

I had to restructure some of the code a bit to be able to properly use this
new function (in 3 places - __loaded_vmcs_clear, nested_get_current_vmcs02,
vmx_create_cpu).

> Please repost separately after the fix, I'd like to apply it before the 
> rest of the series.

I am adding a new version of this patch at the end of this mail.

> (regarding interrupts, I think we can do that work post-merge.  But I'd 
> like to see Kevin's comments addressed)

I replied to his comments, did some of the things he asked, and asked for
more info on why/where he believes the current code is incorrect where I
didn't understand which problems he was pointing to, and am now waiting for
him to reply.


--- 8< -- 8< -- 8< -- 8< --- 8< ---

Subject: [PATCH 01/31] nVMX: Keep list of loaded VMCSs, instead of vcpus.

In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
because (at least in theory) the processor might not have written all of its
content back to memory. Since a patch from June 26, 2008, this is done using
a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.

The problem is that with nested VMX, we no longer have the concept of a
vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
L2s), and each of those may have been last loaded on a different cpu.

So instead of linking the vcpus, we link the VMCSs, using a new structure
loaded_vmcs. This structure contains the VMCS, and the information pertaining
to its loading on a specific cpu (namely, the cpu number, and whether it
was already launched on this cpu once). In nested we will also use the same
structure to hold L2 VMCSs, and vmx->loaded_vmcs is a pointer to the
currently active VMCS.

Signed-off-by: Nadav Har'El 
---
 arch/x86/kvm/vmx.c |  150 ---
 1 file changed, 86 insertions(+), 64 deletions(-)

--- .before/arch/x86/kvm/vmx.c  2011-05-23 21:46:14.0 +0300
+++ .after/arch/x86/kvm/vmx.c   2011-05-23 21:46:14.0 +0300
@@ -116,6 +116,18 @@ struct vmcs {
char data[0];
 };
 
+/*
+ * Track a VMCS that may be loaded on a certain CPU. If it is (cpu!=-1), also
+ * remember whether it was VMLAUNCHed, and maintain a linked list of all VMCSs
+ * loaded on this CPU (so we can clear them if the CPU goes down).

Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Alexander Graf

On 23.05.2011, at 17:23, Avi Kivity wrote:

> On 05/23/2011 05:44 PM, Nadav Har'El wrote:
>> On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 0/30] nVMX: Nested 
>> VMX, v9":
>> >  vmcs01 and vmcs02 will both be generated from vmcs12.
>> 
>> If you don't do a clean nested exit (from L2 to L1), vmcs02 can't be 
>> generated
>> from vmcs12... while L2 runs, it is possible that it modifies vmcs02 (e.g.,
>> non-trapped bits of guest_cr0), and these modifications are not copied back
>> to vmcs12 until the nested exit (when prepare_vmcs12() is called to perform
>> this task).
>> 
>> If you do a nested exit (a "fake" one), vmcs12 is made up to date, and then
>> indeed vmcs02 can be thrown away and regenerated.
> 
> You would flush this state back to the vmcs.  But that just confirms Joerg's 
> statement that a fake vmexit/vmrun is more or less equivalent.
> 
> The question is whether %rip points to the VMRUN/VMLAUNCH instruction, 
> HOST_RIP (or the next instruction for svm), or to guest code.  But the actual 
> things we need to do are all very similar subsets of a vmexit.

%rip should certainly point to VMRUN. That way there is no need to save any
information whatsoever, as the VMCB is already in a sane state and nothing
needs to be special-cased: the next VCPU_RUN would simply go back into guest
mode, which is exactly what we want.

The only tricky part is how we distinguish between "I need to live migrate" and 
"info registers". In the former case, %rip should be on VMRUN. In the latter, 
on the guest rip.


Alex



Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Avi Kivity

On 05/23/2011 07:43 PM, Roedel, Joerg wrote:

On Mon, May 23, 2011 at 11:49:17AM -0400, Avi Kivity wrote:

>  Joerg, is
>
>   if (unlikely(cpu != vcpu->cpu)) {
>   svm->asid_generation = 0;
>   mark_all_dirty(svm->vmcb);
>   }
>
>  susceptible to cpu offline/online?

I don't think so. This should be safe for cpu offline/online as long as
the cpu-number value is not reused for another physical cpu. But that
should be the case afaik.



Why not? offline/online does reuse cpu numbers AFAIK (and it must, if 
you have a fully populated machine and offline/online just one cpu).


--
error compiling committee.c: too many arguments to function



Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Roedel, Joerg
On Mon, May 23, 2011 at 11:49:17AM -0400, Avi Kivity wrote:

> Joerg, is
> 
>  if (unlikely(cpu != vcpu->cpu)) {
>  svm->asid_generation = 0;
>  mark_all_dirty(svm->vmcb);
>  }
> 
> susceptible to cpu offline/online?

I don't think so. This should be safe for cpu offline/online as long as
the cpu-number value is not reused for another physical cpu. But that
should be the case afaik.

Joerg




Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Gleb Natapov
On Mon, May 23, 2011 at 06:49:17PM +0300, Avi Kivity wrote:
> (regarding interrupts, I think we can do that work post-merge.  But
> I'd like to see Kevin's comments addressed)
> 
To be fair this wasn't addressed for almost two years now.

--
Gleb.


Re: [PATCH 08/31] nVMX: Fix local_vcpus_link handling

2011-05-23 Thread Avi Kivity

On 05/22/2011 11:57 AM, Nadav Har'El wrote:

Hi Avi and Marcelo, here is the new first patch to the nvmx patch set,
which overhauls the handling of vmcss on cpus, as you asked.

As you guessed, the nested entry and exit code becomes much simpler and
cleaner, with the whole VMCS switching code on entry, for example, reduced
to:
cpu = get_cpu();
vmx->loaded_vmcs = vmcs02;
vmx_vcpu_put(vcpu);
vmx_vcpu_load(vcpu, cpu);
vcpu->cpu = cpu;
put_cpu();


That's wonderful, it indicates the code is much better integrated.  
Perhaps later we can refine it to have separate _load and _put for 
host-related and guest-related parts (I think they already exist in the 
code, except they are always called together), but that is an 
optimization, and not the most important one by far.



You can apply this patch separately from the rest of the patch set, if you
wish. I'm sending just this one, like you asked - and can send the rest of
the patches when you ask me to.


Subject: [PATCH 01/31] nVMX: Keep list of loaded VMCSs, instead of vcpus.

In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
because (at least in theory) the processor might not have written all of its
content back to memory. Since a patch from June 26, 2008, this is done using
a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.

The problem is that with nested VMX, we no longer have the concept of a
vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
L2s), and each of those may have been last loaded on a different cpu.

So instead of linking the vcpus, we link the VMCSs, using a new structure
loaded_vmcs. This structure contains the VMCS, and the information pertaining
to its loading on a specific cpu (namely, the cpu number, and whether it
was already launched on this cpu once). In nested we will also use the same
structure to hold L2 VMCSs, and vmx->loaded_vmcs is a pointer to the
currently active VMCS.

--- .before/arch/x86/kvm/x86.c  2011-05-22 11:41:57.0 +0300
+++ .after/arch/x86/kvm/x86.c   2011-05-22 11:41:57.0 +0300
@@ -2119,7 +2119,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu
if (need_emulate_wbinvd(vcpu)) {
if (kvm_x86_ops->has_wbinvd_exit())
cpumask_set_cpu(cpu, vcpu->arch.wbinvd_dirty_mask);
-   else if (vcpu->cpu != -1 && vcpu->cpu != cpu)
+   else if (vcpu->cpu != -1 && vcpu->cpu != cpu
+   && cpu_online(vcpu->cpu))
smp_call_function_single(vcpu->cpu,
wbinvd_ipi, NULL, 1);
}


Is this a necessary part of this patch?  Or an semi-related bugfix?

I think that it can't actually trigger before this patch due to luck.  
svm doesn't clear vcpu->cpu on cpu offline, but on the other hand it 
->has_wbinvd_exit().


Joerg, is

if (unlikely(cpu != vcpu->cpu)) {
svm->asid_generation = 0;
mark_all_dirty(svm->vmcb);
}

susceptible to cpu offline/online?


@@ -971,22 +992,22 @@ static void vmx_vcpu_load(struct kvm_vcp

if (!vmm_exclusive)
kvm_cpu_vmxon(phys_addr);
-   else if (vcpu->cpu != cpu)
-   vcpu_clear(vmx);
+   else if (vmx->loaded_vmcs->cpu != cpu)
+   loaded_vmcs_clear(vmx->loaded_vmcs);

-   if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
-   per_cpu(current_vmcs, cpu) = vmx->vmcs;
-   vmcs_load(vmx->vmcs);
+   if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
+   per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
+   vmcs_load(vmx->loaded_vmcs->vmcs);
}

-   if (vcpu->cpu != cpu) {
+   if (vmx->loaded_vmcs->cpu != cpu) {
struct desc_ptr *gdt = &__get_cpu_var(host_gdt);
unsigned long sysenter_esp;

kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
local_irq_disable();
-   list_add(&vmx->local_vcpus_link,
-   &per_cpu(vcpus_on_cpu, cpu));
+   list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
+   &per_cpu(loaded_vmcss_on_cpu, cpu));
local_irq_enable();

/*
@@ -999,13 +1020,15 @@ static void vmx_vcpu_load(struct kvm_vcp
rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
}
+   vmx->loaded_vmcs->cpu = cpu;


This should be within the if () block.


@@ -4344,11 +4369,13 @@ static struct kvm_vcpu *vmx_create_vcpu(
goto uninit_vcpu;
}

-   vmx->vmcs = alloc_vmcs();
-   if (!vmx->vmcs)
+   vmx->loaded_vmcs =&vmx->vmcs01;
+   vmx->loaded_vmcs->vmcs = alloc_vmcs();
+   if (!vmx->loaded_vmcs->vmcs)
goto free_msrs;
-
-   vmcs_init(vmx->vmcs);
+   vmcs_init(vmx->loaded_vmcs->vmcs);
+ 

Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 05:44 PM, Nadav Har'El wrote:

On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 0/30] nVMX: Nested VMX, 
v9":
>  vmcs01 and vmcs02 will both be generated from vmcs12.

If you don't do a clean nested exit (from L2 to L1), vmcs02 can't be generated
from vmcs12... while L2 runs, it is possible that it modifies vmcs02 (e.g.,
non-trapped bits of guest_cr0), and these modifications are not copied back
to vmcs12 until the nested exit (when prepare_vmcs12() is called to perform
this task).

If you do a nested exit (a "fake" one), vmcs12 is made up to date, and then
indeed vmcs02 can be thrown away and regenerated.


You would flush this state back to the vmcs.  But that just confirms 
Joerg's statement that a fake vmexit/vmrun is more or less equivalent.


The question is whether %rip points to the VMRUN/VMLAUNCH instruction, 
HOST_RIP (or the next instruction for svm), or to guest code.  But the 
actual things we need to do are all very similar subsets of a vmexit.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 05:58 PM, Joerg Roedel wrote:

On Mon, May 23, 2011 at 05:34:20PM +0300, Avi Kivity wrote:
>  On 05/23/2011 05:28 PM, Joerg Roedel wrote:

>>  To user-space we can provide a VCPU_FREEZE/VCPU_UNFREEZE ioctl which
>>  does all the necessary things.
>
>  Or we can automatically flush things on any exit to userspace.  They
>  should be very rare in guest mode.

This would make nesting mostly transparent to migration, so it sounds
good in this regard.

I do not completely agree that user-space exits in guest-mode are rare,
this depends on the hypervisor in the L1. In Hyper-V for example the
root-domain uses hardware virtualization too and has direct access to
devices (at least to some degree). IOIO is not intercepted in the
root-domain, for example. Not sure about the MMIO regions.


Good point.  We were also talking about passing through virtio (or even 
host) devices to the guest.


So an ioctl to flush volatile state to memory would be a good idea.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Joerg Roedel
On Mon, May 23, 2011 at 05:34:20PM +0300, Avi Kivity wrote:
> On 05/23/2011 05:28 PM, Joerg Roedel wrote:

>> To user-space we can provide a VCPU_FREEZE/VCPU_UNFREEZE ioctl which
>> does all the necessary things.
>
> Or we can automatically flush things on any exit to userspace.  They  
> should be very rare in guest mode.

This would make nesting mostly transparent to migration, so it sounds
good in this regard.

I do not completely agree that user-space exits in guest-mode are rare,
this depends on the hypervisor in the L1. In Hyper-V for example the
root-domain uses hardware virtualization too and has direct access to
devices (at least to some degree). IOIO is not intercepted in the
root-domain, for example. Not sure about the MMIO regions.


Joerg



Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Nadav Har'El
On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 0/30] nVMX: Nested 
VMX, v9":
> vmcs01 and vmcs02 will both be generated from vmcs12.

If you don't do a clean nested exit (from L2 to L1), vmcs02 can't be generated
from vmcs12... while L2 runs, it is possible that it modifies vmcs02 (e.g.,
non-trapped bits of guest_cr0), and these modifications are not copied back
to vmcs12 until the nested exit (when prepare_vmcs12() is called to perform
this task).

If you do a nested exit (a "fake" one), vmcs12 is made up to date, and then
indeed vmcs02 can be thrown away and regenerated.

Nadav.

-- 
Nadav Har'El|   Monday, May 23 2011, 19 Iyyar 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Jury: Twelve people who determine which
http://nadav.harel.org.il   |client has the better lawyer.


[PATCH] kvm tools: Drop unused vars from int10.c code

2011-05-23 Thread Cyrill Gorcunov
There are a couple of functions which define an 'ah' variable but
never use it, so the gcc 4.6.x series complains:

  CC   bios/bios-rom.bin
  bios/int10.c: In function ‘int10_putchar’:
  bios/int10.c:86:9: error: variable ‘ah’ set but not used 
[-Werror=unused-but-set-variable]
  bios/int10.c: In function ‘int10_vesa’:
  bios/int10.c:96:9: error: variable ‘ah’ set but not used 
[-Werror=unused-but-set-variable]
  cc1: all warnings being treated as errors

so get rid of them.

Signed-off-by: Cyrill Gorcunov 
CC: Sasha Levin 
---
 tools/kvm/bios/int10.c |8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

Index: linux-2.6.git/tools/kvm/bios/int10.c
===
--- linux-2.6.git.orig/tools/kvm/bios/int10.c
+++ linux-2.6.git/tools/kvm/bios/int10.c
@@ -83,22 +83,18 @@ static inline void outb(unsigned short p
  */
 static inline void int10_putchar(struct int10_args *args)
 {
-   u8 al, ah;
-
-   al = args->eax & 0xFF;
-   ah = (args->eax & 0xFF00) >> 8;
+   u8 al = args->eax & 0xFF;

outb(0x3f8, al);
 }

 static void int10_vesa(struct int10_args *args)
 {
-   u8 al, ah;
+   u8 al;
struct vesa_general_info *destination;
struct vminfo *vi;

al = args->eax;
-   ah = args->eax >> 8;

switch (al) {
case 0:


Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 05:28 PM, Joerg Roedel wrote:

On Mon, May 23, 2011 at 04:52:47PM +0300, Avi Kivity wrote:
>  On 05/23/2011 04:40 PM, Joerg Roedel wrote:

>>  The next benefit is that it works seamlessly even if the state that
>>  needs to be transferred is extended (e.g. by emulating a new
>>  virtualization hardware feature). This support can be implemented in the
>>  kernel module and no changes to qemu are required.
>
>  I agree it's a benefit.  But I don't like making the fake vmexit part of
>  live migration, if it turns out the wrong choice it's hard to undo it.

Well, saving the state to the host-save-area and doing a fake-vmexit is
logically the same, only the memory where the information is stored
differs.


Right.  I guess the main difference is "info registers" after a stop.


To user-space we can provide a VCPU_FREEZE/VCPU_UNFREEZE ioctl which
does all the necessary things.



Or we can automatically flush things on any exit to userspace.  They 
should be very rare in guest mode.





Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 05:10 PM, Nadav Har'El wrote:

On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 0/30] nVMX: Nested VMX, 
v9":
>  I think for Intel there is no hidden state apart from in-guest-mode
>  (there is the VMPTR, but it is an actual register accessible via
>  instructions).

is_guest_mode(vcpu), vmx->nested.vmxon, vmx->nested.current_vmptr are the
only three things I can think of. Vmxon is actually more than a boolean
(there's also a vmxon pointer).

What do you mean by the current_vmptr being available through an instruction?
It is (VMPTRST), but this would be an instruction run on L1 (emulated by L0).
How would L0's user space use that instruction?


I mean that it is an architectural register rather than "hidden state".  
It doesn't mean that L0 user space can use it.




>  I agree it's a benefit.  But I don't like making the fake vmexit part of
>  live migration, if it turns out the wrong choice it's hard to undo it.

If you don't do this "fake vmexit", you'll need to migrate both vmcs01 and
the current vmcs02 - the fact that vmcs12 is in guest memory will not be
enough, because vmcs02 isn't copied back to vmcs12 until the nested exit.



vmcs01 and vmcs02 will both be generated from vmcs12.




Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Joerg Roedel
On Mon, May 23, 2011 at 04:52:47PM +0300, Avi Kivity wrote:
> On 05/23/2011 04:40 PM, Joerg Roedel wrote:

>> The next benefit is that it works seamlessly even if the state that
>> needs to be transferred is extended (e.g. by emulating a new
>> virtualization hardware feature). This support can be implemented in the
>> kernel module and no changes to qemu are required.
>
> I agree it's a benefit.  But I don't like making the fake vmexit part of  
> live migration, if it turns out the wrong choice it's hard to undo it.

Well, saving the state to the host-save-area and doing a fake-vmexit is
logically the same, only the memory where the information is stored
differs.

To user-space we can provide a VCPU_FREEZE/VCPU_UNFREEZE ioctl which
does all the necessary things.

Joerg



Re: [PATCH 5/5 V2] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Pekka Enberg
On Mon, May 23, 2011 at 2:19 PM, Sasha Levin  wrote:
> Requirements - Kernel compiled with:
> CONFIG_FB_BOOT_VESA_SUPPORT=y
> CONFIG_FB_VESA=y
> CONFIG_FRAMEBUFFER_CONSOLE=y

Dunno if it's possible but it would be nice to have a more readable
error message if you don't have that compiled in:

penberg@tiger:~/linux/tools/kvm$ ./kvm run --vnc -d
~/images/debian_squeeze_amd64_standard.img
  # kvm run -k ../../arch/x86/boot/bzImage -m 320 -c 2
  Warning: Config tap device error. Are you root?
23/05/2011 17:08:19 Listening for VNC connections on TCP port 5900
Undefined video mode number: 312
Press  to see video modes available,  to continue, or wait 30 sec
Killed

This obviously isn't an issue for merging this patch.

   Pekka


Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Nadav Har'El
On Mon, May 23, 2011, Avi Kivity wrote about "Re: [PATCH 0/30] nVMX: Nested 
VMX, v9":
> I think for Intel there is no hidden state apart from in-guest-mode 
> (there is the VMPTR, but it is an actual register accessible via 
> instructions).

is_guest_mode(vcpu), vmx->nested.vmxon, vmx->nested.current_vmptr are the
only three things I can think of. Vmxon is actually more than a boolean
(there's also a vmxon pointer).

What do you mean by the current_vmptr being available through an instruction?
It is (VMPTRST), but this would be an instruction run on L1 (emulated by L0).
How would L0's user space use that instruction?

> I agree it's a benefit.  But I don't like making the fake vmexit part of 
> live migration, if it turns out the wrong choice it's hard to undo it.

If you don't do this "fake vmexit", you'll need to migrate both vmcs01 and
the current vmcs02 - the fact that vmcs12 is in guest memory will not be
enough, because vmcs02 isn't copied back to vmcs12 until the nested exit.


-- 
Nadav Har'El|   Monday, May 23 2011, 19 Iyyar 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |The world is coming to an end ... SAVE
http://nadav.harel.org.il   |YOUR BUFFERS!!!


Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 04:40 PM, Joerg Roedel wrote:

On Mon, May 23, 2011 at 04:08:00PM +0300, Avi Kivity wrote:
>  On 05/23/2011 04:02 PM, Joerg Roedel wrote:

>>  About live-migration with nesting, we had discussed the idea of just
>>  doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
>>  The problem was that the hypervisor may not expect an INTR intercept.
>>
>>  How about doing an implicit VMEXIT in this case and an implicit VMRUN
>>  after the vcpu is migrated?
>
>  What if there's something in EXIT_INT_INFO?

On real SVM hardware EXIT_INT_INFO should only contain something for
exception and npt intercepts. These are all handled in the kernel and do
not cause an exit to user-space so that no valid EXIT_INT_INFO should be
around when we actually go back to user-space (so that migration can
happen).

The exception might be the #PF/NPT intercept when the guest is doing
very obscure things like putting an exception/interrupt handler on mmio
memory, but that isn't really supported by KVM anyway so I doubt we
should care.

Unless I miss something here we should be safe by just not looking at
EXIT_INT_INFO while migrating.


Agree.


>>The nested hypervisor will not see the
>>  vmexit and the vcpu will be in a state where it is safe to migrate. This
>>  should work for nested-vmx too if the guest-state is written back to
>>  guest memory on VMEXIT. Is this the case?
>
>  It is the case with the current implementation, and we can/should make
>  it so in future implementations, just before exit to userspace.  Or at
>  least provide an ABI to sync memory.
>
>  But I don't see why we shouldn't just migrate all the hidden state (in
>  guest mode flag, svm host paging mode, svm host interrupt state, vmcb
>  address/vmptr, etc.).  It's more state, but no thinking is involved, so
>  it's clearly superior.

An issue is that there is different state to migrate for Intel and AMD
hosts. If we keep all that information in guest memory the kvm kernel
module can handle those details and all KVM needs to migrate is the
in-guest-mode flag and the gpa of the vmcb/vmcs which is currently
executed. This state should be enough for Intel and AMD nesting.


I think for Intel there is no hidden state apart from in-guest-mode 
(there is the VMPTR, but it is an actual register accessible via 
instructions).  For svm we can keep the hidden state in the host 
state-save area (including the vmcb pointer).  The only risk is that svm 
will gain hardware support for nesting, and will choose a different 
format than ours.


An alternative is a fake MSR for storing this data, or just another 
get/set ioctl pair.  We'll have a flags field that says which fields are 
filled in.



The next benefit is that it works seamlessly even if the state that
needs to be transferred is extended (e.g. by emulating a new
virtualization hardware feature). This support can be implemented in the
kernel module and no changes to qemu are required.


I agree it's a benefit.  But I don't like making the fake vmexit part of 
live migration, if it turns out the wrong choice it's hard to undo it.





Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Joerg Roedel
On Mon, May 23, 2011 at 04:08:00PM +0300, Avi Kivity wrote:
> On 05/23/2011 04:02 PM, Joerg Roedel wrote:

>> About live-migration with nesting, we had discussed the idea of just
>> doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
>> The problem was that the hypervisor may not expect an INTR intercept.
>>
>> How about doing an implicit VMEXIT in this case and an implicit VMRUN
>> after the vcpu is migrated?
>
> What if there's something in EXIT_INT_INFO?

On real SVM hardware EXIT_INT_INFO should only contain something for
exception and npt intercepts. These are all handled in the kernel and do
not cause an exit to user-space so that no valid EXIT_INT_INFO should be
around when we actually go back to user-space (so that migration can
happen).

The exception might be the #PF/NPT intercept when the guest is doing
very obscure things like putting an exception/interrupt handler on mmio
memory, but that isn't really supported by KVM anyway so I doubt we
should care.

Unless I miss something here we should be safe by just not looking at
EXIT_INT_INFO while migrating.

>>   The nested hypervisor will not see the
>> vmexit and the vcpu will be in a state where it is safe to migrate. This
>> should work for nested-vmx too if the guest-state is written back to
>> guest memory on VMEXIT. Is this the case?
>
> It is the case with the current implementation, and we can/should make  
> it so in future implementations, just before exit to userspace.  Or at  
> least provide an ABI to sync memory.
>
> But I don't see why we shouldn't just migrate all the hidden state (in  
> guest mode flag, svm host paging mode, svm host interrupt state, vmcb  
> address/vmptr, etc.).  It's more state, but no thinking is involved, so  
> it's clearly superior.

An issue is that there is different state to migrate for Intel and AMD
hosts. If we keep all that information in guest memory the kvm kernel
module can handle those details and all KVM needs to migrate is the
in-guest-mode flag and the gpa of the vmcb/vmcs which is currently
executed. This state should be enough for Intel and AMD nesting.

The next benefit is that it works seamlessly even if the state that
needs to be transferred is extended (e.g. by emulating a new
virtualization hardware feature). This support can be implemented in the
kernel module and no changes to qemu are required.


Joerg



Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Nadav Har'El
On Mon, May 23, 2011, Joerg Roedel wrote about "Re: [PATCH 0/30] nVMX: Nested 
VMX, v9":
> About live-migration with nesting, we had discussed the idea of just
> doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
> The problem was that the hypervisor may not expect an INTR intercept.
> 
> How about doing an implicit VMEXIT in this case and an implicit VMRUN
> after the vcpu is migrated? The nested hypervisor will not see the
> vmexit and the vcpu will be in a state where it is safe to migrate. This
> should work for nested-vmx too if the guest-state is written back to
> guest memory on VMEXIT. Is this the case?

Indeed, on nested exit (L2 to L1), the L2 guest state is written back to
vmcs12 (in guest memory). In theory, at that point, the vmcs02 (the vmcs
used by L0 to actually run L2) can be discarded, without risking losing
anything.

The receiving hypervisor will need to remember to do that implicit VMRUN
when it starts the guest; It also needs to know what is the current L2
guest - in VMX this would be vmx->nested.current_vmptr, which needs to be
migrated as well (on the other hand, other variables like
vmx->nested.current_vmcs12, will need to be recalculated by the receiver, and
not migrated as-is). I haven't started considering how to wrap up all these
pieces into a complete working solution - it is one of the things on my TODO
list after the basic nested VMX is merged.

-- 
Nadav Har'El|   Monday, May 23 2011, 19 Iyyar 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Live as if you were to die tomorrow,
http://nadav.harel.org.il   |learn as if you were to live forever.


Managedsave does not work with kernel >=2.6.38

2011-05-23 Thread Sebastian Nickel - Hetzner Online AG
Hello,
we recently noticed that the "managedsave" command from libvirt does not
work when using kernel >= 2.6.38. It saves the state to a file, but the
domain does not resume from the file. Instead a started domain always
gets rebooted. When using kernel 2.6.37 "managedsave" does work without
problems. 

We are currently using:

- libvirt 0.9.0
- kvm 0.14.0
- kernel 2.6.38.6

Is this a known bug?


Best regards

Sebastian Nickel



Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/23/2011 04:02 PM, Joerg Roedel wrote:

On Mon, May 23, 2011 at 12:52:50PM +0300, Avi Kivity wrote:
>  On 05/22/2011 10:32 PM, Nadav Har'El wrote:
>>  What do we need to do with this idt-vectoring-information? In regular (non-
>>  nested) guests, the answer is simple: On the next entry, we need to inject
>>  this event again into the guest, so it can resume the delivery of the
>>  same event it was trying to deliver. This is why the nested-unaware code
>>  has a vmx_complete_interrupts which basically adds this idt-vectoring-info
>>  into KVM's event queue, which on the next entry will be injected similarly
>>  to the way virtual interrupts from userspace are injected, and so on.
>
>  The other thing we may need to do, is to expose it to userspace in case
>  we're live migrating at exactly this point in time.

About live-migration with nesting, we had discussed the idea of just
doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
The problem was that the hypervisor may not expect an INTR intercept.

How about doing an implicit VMEXIT in this case and an implicit VMRUN
after the vcpu is migrated?


What if there's something in EXIT_INT_INFO?


  The nested hypervisor will not see the
vmexit and the vcpu will be in a state where it is safe to migrate. This
should work for nested-vmx too if the guest-state is written back to
guest memory on VMEXIT. Is this the case?


It is the case with the current implementation, and we can/should make 
it so in future implementations, just before exit to userspace.  Or at 
least provide an ABI to sync memory.


But I don't see why we shouldn't just migrate all the hidden state (in 
guest mode flag, svm host paging mode, svm host interrupt state, vmcb 
address/vmptr, etc.).  It's more state, but no thinking is involved, so 
it's clearly superior.





Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Joerg Roedel
On Mon, May 23, 2011 at 12:52:50PM +0300, Avi Kivity wrote:
> On 05/22/2011 10:32 PM, Nadav Har'El wrote:
>> What do we need to do with this idt-vectoring-information? In regular (non-
>> nested) guests, the answer is simple: On the next entry, we need to inject
>> this event again into the guest, so it can resume the delivery of the
>> same event it was trying to deliver. This is why the nested-unaware code
>> has a vmx_complete_interrupts which basically adds this idt-vectoring-info
>> into KVM's event queue, which on the next entry will be injected similarly
>> to the way virtual interrupts from userspace are injected, and so on.
>
> The other thing we may need to do, is to expose it to userspace in case  
> we're live migrating at exactly this point in time.

About live-migration with nesting, we had discussed the idea of just
doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
The problem was that the hypervisor may not expect an INTR intercept.

How about doing an implicit VMEXIT in this case and an implicit VMRUN
after the vcpu is migrated? The nested hypervisor will not see the
vmexit and the vcpu will be in a state where it is safe to migrate. This
should work for nested-vmx too if the guest-state is written back to
guest memory on VMEXIT. Is this the case?

Joerg


[PATCH V3 4/5] kvm tools: Update makefile and feature tests

2011-05-23 Thread Sasha Levin
From: John Floren 

Update feature tests to test for libvncserver.

VESA support doesn't get compiled in unless libvncserver
is installed.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/Makefile |   11 ++-
 tools/kvm/config/feature-tests.mak |   10 ++
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index e6e8d4e..2ebc86c 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -58,6 +58,14 @@ ifeq ($(has_bfd),y)
LIBS+= -lbfd
 endif
 
+FLAGS_VNCSERVER=$(CFLAGS) -lvncserver
+has_vncserver := $(call try-cc,$(SOURCE_VNCSERVER),$(FLAGS_VNCSERVER))
+ifeq ($(has_vncserver),y)
+   CFLAGS  += -DCONFIG_HAS_VNCSERVER
+   OBJS+= hw/vesa.o
+   LIBS+= -lvncserver
+endif
+
 DEPS   := $(patsubst %.o,%.d,$(OBJS))
 
 # Exclude BIOS object files from header dependencies.
@@ -153,9 +161,10 @@ bios/bios.o: bios/bios.S bios/bios-rom.bin
 bios/bios-rom.bin: bios/bios-rom.S bios/e820.c
$(E) "  CC  " $@
$(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s 
bios/e820.c -o bios/e820.o
+   $(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s 
bios/int10.c -o bios/int10.o
$(Q) $(CC) $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/bios-rom.S -o 
bios/bios-rom.o
$(E) "  LD  " $@
-   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o 
bios/e820.o
+   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o 
bios/e820.o bios/int10.o
$(E) "  OBJCOPY " $@
$(Q) objcopy -O binary -j .text bios/bios-rom.bin.elf bios/bios-rom.bin
$(E) "  NM  " $@
diff --git a/tools/kvm/config/feature-tests.mak 
b/tools/kvm/config/feature-tests.mak
index 6170fd2..0801b54 100644
--- a/tools/kvm/config/feature-tests.mak
+++ b/tools/kvm/config/feature-tests.mak
@@ -126,3 +126,13 @@ int main(void)
return 0;
 }
 endef
+
+define SOURCE_VNCSERVER
+#include 
+
+int main(void)
+{
+   rfbIsActive((void *)0);
+   return 0;
+}
+endef
-- 
1.7.5.rc3



[PATCH V3 5/5] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Sasha Levin
From: John Floren 

Requirements - Kernel compiled with:
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_VESA=y
CONFIG_FRAMEBUFFER_CONSOLE=y

Start VNC server by starting kvm tools with "--vnc".
Connect to the VNC server by running: "vncviewer :0".

Since there is no support for input devices at this time,
it may be useful starting kvm tools with an additional
' -p "console=ttyS0" ' parameter so that it would be possible
to use a serial console alongside with a graphic one.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/kvm-run.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 288e1fb..adbb25b 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* header files for gitish interface  */
 #include 
@@ -66,6 +67,7 @@ static const char *virtio_9p_dir;
 static bool single_step;
 static bool readonly_image[MAX_DISK_IMAGES];
 static bool virtio_rng;
+static bool vnc;
 extern bool ioport_debug;
 extern int  active_console;
 
@@ -110,6 +112,7 @@ static const struct option options[] = {
OPT_STRING('\0', "kvm-dev", &kvm_dev, "kvm-dev", "KVM device file"),
OPT_STRING('\0', "virtio-9p", &virtio_9p_dir, "root dir",
"Enable 9p over virtio"),
+   OPT_BOOLEAN('\0', "vnc", &vnc, "Enable VNC framebuffer"),
 
OPT_GROUP("Kernel options:"),
OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -413,6 +416,7 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
char *hi;
int i;
void *ret;
+   u16 vidmode = 0;
 
signal(SIGALRM, handle_sigalrm);
signal(SIGQUIT, handle_sigquit);
@@ -511,7 +515,13 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
kvm->nrcpus = nrcpus;
 
memset(real_cmdline, 0, sizeof(real_cmdline));
-   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1 console=ttyS0 
earlyprintk=serial");
+   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1");
+   if (vnc) {
+   strcat(real_cmdline, " video=vesafb console=tty0");
+   vidmode = 0x312;
+   } else {
+   strcat(real_cmdline, " console=ttyS0 earlyprintk=serial");
+   }
strcat(real_cmdline, " ");
if (kernel_cmdline)
strlcat(real_cmdline, kernel_cmdline, sizeof(real_cmdline));
@@ -543,7 +553,7 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
printf("  # kvm run -k %s -m %Lu -c %d\n", kernel_filename, ram_size / 
1024 / 1024, nrcpus);
 
if (!kvm__load_kernel(kvm, kernel_filename, initrd_filename,
-   real_cmdline))
+   real_cmdline, vidmode))
die("unable to load kernel %s", kernel_filename);
 
kvm->vmlinux= vmlinux_filename;
@@ -597,6 +607,9 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
 
kvm__init_ram(kvm);
 
+   if (vnc)
+   vesa__init(kvm);
+
thread_pool__init(nr_online_cpus);
 
for (i = 0; i < nrcpus; i++) {
-- 
1.7.5.rc3



[PATCH V3 3/5] kvm tools: Add VESA device

2011-05-23 Thread Sasha Levin
From: John Floren 

Add a simple VESA device which simply moves a framebuffer
from guest kernel to a VNC server.

VESA device PCI code is very similar to virtio-* PCI code.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/hw/vesa.c|  108 
 tools/kvm/include/kvm/ioport.h |2 +
 tools/kvm/include/kvm/vesa.h   |   27 
 tools/kvm/include/kvm/virtio-pci-dev.h |3 +
 4 files changed, 140 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/hw/vesa.c
 create mode 100644 tools/kvm/include/kvm/vesa.h

diff --git a/tools/kvm/hw/vesa.c b/tools/kvm/hw/vesa.c
new file mode 100644
index 000..3003aa5
--- /dev/null
+++ b/tools/kvm/hw/vesa.c
@@ -0,0 +1,108 @@
+#include "kvm/vesa.h"
+#include "kvm/ioport.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+#include "kvm/pci.h"
+#include "kvm/kvm-cpu.h"
+#include "kvm/irq.h"
+#include "kvm/virtio-pci-dev.h"
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define VESA_QUEUE_SIZE128
+#define VESA_IRQ   14
+
+/*
+ * This "6000" value is pretty much the result of experimentation
+ * It seems that around this value, things update pretty smoothly
+ */
+#define VESA_UPDATE_TIME   6000
+
+u8 videomem[VESA_MEM_SIZE];
+
+static bool vesa_pci_io_in(struct kvm *kvm, u16 port, void *data, int size, 
u32 count)
+{
+   printf("vesa in port=%u\n", port);
+   return true;
+}
+
+static bool vesa_pci_io_out(struct kvm *kvm, u16 port, void *data, int size, 
u32 count)
+{
+   printf("vesa out port=%u\n", port);
+   return true;
+}
+
+static struct ioport_operations vesa_io_ops = {
+   .io_in  = vesa_pci_io_in,
+   .io_out = vesa_pci_io_out,
+};
+
+static struct pci_device_header vesa_pci_device = {
+   .vendor_id  = PCI_VENDOR_ID_REDHAT_QUMRANET,
+   .device_id  = PCI_DEVICE_ID_VESA,
+   .header_type= PCI_HEADER_TYPE_NORMAL,
+   .revision_id= 0,
+   .class  = 0x03,
+   .subsys_vendor_id   = PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET,
+   .subsys_id  = PCI_SUBSYSTEM_ID_VESA,
+   .bar[0] = IOPORT_VESA | PCI_BASE_ADDRESS_SPACE_IO,
+   .bar[1] = VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY,
+};
+
+
+void vesa_mmio_callback(u64 addr, u8 *data, u32 len, u8 is_write)
+{
+   if (is_write)
+   memcpy(&videomem[addr - VESA_MEM_ADDR], data, len);
+
+   return;
+}
+
+void vesa__init(struct kvm *kvm)
+{
+   u8 dev, line, pin;
+   pthread_t thread;
+
+   if (irq__register_device(PCI_DEVICE_ID_VESA, &dev, &pin, &line) < 0)
+   return;
+
+   vesa_pci_device.irq_pin = pin;
+   vesa_pci_device.irq_line = line;
+   pci__register(&vesa_pci_device, dev);
+   ioport__register(IOPORT_VESA, &vesa_io_ops, IOPORT_VESA_SIZE);
+
+   kvm__register_mmio(VESA_MEM_ADDR, VESA_MEM_SIZE, &vesa_mmio_callback);
+   pthread_create(&thread, NULL, vesa__dovnc, kvm);
+}
+
+/*
+ * This starts a VNC server to display the framebuffer.
+ * It's not altogether clear this belongs here rather than in kvm-run.c
+ */
+void *vesa__dovnc(void *v)
+{
+   /*
+* Make a fake argc and argv because the getscreen function
+* seems to want it.
+*/
+   int ac = 1;
+   char av[1][1] = {{0} };
+   rfbScreenInfoPtr server;
+
+   server = rfbGetScreen(&ac, (char **)av, VESA_WIDTH, VESA_HEIGHT, 8, 3, 
4);
+   server->frameBuffer = (char *)videomem;
+   server->alwaysShared = TRUE;
+   rfbInitServer(server);
+
+   while (rfbIsActive(server)) {
+   rfbMarkRectAsModified(server, 0, 0, VESA_WIDTH, VESA_HEIGHT);
+   rfbProcessEvents(server, server->deferUpdateTime * 
VESA_UPDATE_TIME);
+   }
+   return NULL;
+}
+
diff --git a/tools/kvm/include/kvm/ioport.h b/tools/kvm/include/kvm/ioport.h
index 218530c..8253938 100644
--- a/tools/kvm/include/kvm/ioport.h
+++ b/tools/kvm/include/kvm/ioport.h
@@ -7,6 +7,8 @@
 
 /* some ports we reserve for own use */
 #define IOPORT_DBG 0xe0
+#define IOPORT_VESA0xa200
+#define IOPORT_VESA_SIZE   256
 #define IOPORT_VIRTIO_P9   0xb200  /* Virtio 9P device */
 #define IOPORT_VIRTIO_P9_SIZE  256
 #define IOPORT_VIRTIO_BLK  0xc200  /* Virtio block device */
diff --git a/tools/kvm/include/kvm/vesa.h b/tools/kvm/include/kvm/vesa.h
new file mode 100644
index 000..ff3ec75
--- /dev/null
+++ b/tools/kvm/include/kvm/vesa.h
@@ -0,0 +1,27 @@
+#ifndef KVM__VESA_H
+#define KVM__VESA_H
+
+#include 
+
+#define VESA_WIDTH 640
+#define VESA_HEIGHT480
+
+#define VESA_MEM_ADDR  0xd000
+#define VESA_MEM_SIZE  (4*VESA_WIDTH*VESA_HEIGHT)
+#define VESA_BPP   32
+
+struct kvm;
+struct int10_args;
+
+void vesa_mmio_callback(u64, u8*, 

[PATCH V3 2/5] kvm tools: Add video mode to kernel initialization

2011-05-23 Thread Sasha Levin
From: John Floren 

Allow setting video mode in guest kernel.

For possible values see Documentation/fb/vesafb.txt

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/include/kvm/kvm.h |2 +-
 tools/kvm/kvm.c |7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/include/kvm/kvm.h b/tools/kvm/include/kvm/kvm.h
index 08c6fda..f951f2d 100644
--- a/tools/kvm/include/kvm/kvm.h
+++ b/tools/kvm/include/kvm/kvm.h
@@ -41,7 +41,7 @@ int kvm__max_cpus(struct kvm *kvm);
 void kvm__init_ram(struct kvm *kvm);
 void kvm__delete(struct kvm *kvm);
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char 
*kernel_cmdline);
+   const char *initrd_filename, const char 
*kernel_cmdline, u16 vidmode);
 void kvm__setup_bios(struct kvm *kvm);
 void kvm__start_timer(struct kvm *kvm);
 void kvm__stop_timer(struct kvm *kvm);
diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 4393a41..7284211 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -320,7 +320,7 @@ static int load_flat_binary(struct kvm *kvm, int fd)
 static const char *BZIMAGE_MAGIC   = "HdrS";
 
 static bool load_bzimage(struct kvm *kvm, int fd_kernel,
-   int fd_initrd, const char *kernel_cmdline)
+   int fd_initrd, const char *kernel_cmdline, u16 vidmode)
 {
struct boot_params *kern_boot;
unsigned long setup_sects;
@@ -383,6 +383,7 @@ static bool load_bzimage(struct kvm *kvm, int fd_kernel,
kern_boot->hdr.type_of_loader   = 0xff;
kern_boot->hdr.heap_end_ptr = 0xfe00;
kern_boot->hdr.loadflags|= CAN_USE_HEAP;
+   kern_boot->hdr.vid_mode = vidmode;
 
/*
 * Read initrd image into guest memory
@@ -441,7 +442,7 @@ static bool initrd_check(int fd)
 }
 
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char *kernel_cmdline)
+   const char *initrd_filename, const char *kernel_cmdline, u16 vidmode)
 {
bool ret;
int fd_kernel = -1, fd_initrd = -1;
@@ -459,7 +460,7 @@ bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
die("%s is not an initrd", initrd_filename);
}
 
-   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline);
+   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline, vidmode);
 
if (initrd_filename)
close(fd_initrd);
-- 
1.7.5.rc3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3 1/5] kvm tools: Add BIOS INT10 handler

2011-05-23 Thread Sasha Levin
From: John Floren 

INT10 handler is a basic implementation of BIOS video services.

The handler implements a VESA interface which is initialized at
the very beginning of loading the kernel.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/bios/bios-rom.S |   56 
 tools/kvm/bios/int10.c|  161 +
 2 files changed, 189 insertions(+), 28 deletions(-)
 create mode 100644 tools/kvm/bios/int10.c

diff --git a/tools/kvm/bios/bios-rom.S b/tools/kvm/bios/bios-rom.S
index 8a53dcd..5645cd2 100644
--- a/tools/kvm/bios/bios-rom.S
+++ b/tools/kvm/bios/bios-rom.S
@@ -27,36 +27,36 @@ ENTRY_END(bios_intfake)
  * We ignore bx settings
  */
 ENTRY(bios_int10)
-   test $0x0e, %ah
-   jne 1f
+   pushw   %fs
+   pushl   %es
+   pushl   %edi
+   pushl   %esi
+   pushl   %ebp
+   pushl   %esp
+   pushl   %edx
+   pushl   %ecx
+   pushl   %ebx
+   pushl   %eax
+
+   movl%esp, %eax
+   /* this is way easier than doing it in assembly */
+   /* just push all the regs and jump to a C handler */
+   callint10_handler
+
+   popl%eax
+   popl%ebx
+   popl%ecx
+   popl%edx
+   popl%esp
+   popl%ebp
+   popl%esi
+   popl%edi
+   popl%es
+   popw%fs
 
-/*
- * put char in AL at current cursor and
- * increment cursor position
- */
-putchar:
-   stack_swap
-
-   push %fs
-   push %bx
-
-   mov $VGA_RAM_SEG, %bx
-   mov %bx, %fs
-   mov %cs:(cursor), %bx
-   mov %al, %fs:(%bx)
-   inc %bx
-   test $VGA_PAGE_SIZE, %bx
-   jb putchar_new
-   xor %bx, %bx
-putchar_new:
-   mov %bx, %fs:(cursor)
-
-   pop %bx
-   pop %fs
-
-   stack_restore
-1:
IRET
+
+
 /*
  * private IRQ data
  */
diff --git a/tools/kvm/bios/int10.c b/tools/kvm/bios/int10.c
new file mode 100644
index 000..1ab3a67
--- /dev/null
+++ b/tools/kvm/bios/int10.c
@@ -0,0 +1,161 @@
+#include "kvm/segment.h"
+#include "kvm/bios.h"
+#include "kvm/util.h"
+#include "kvm/vesa.h"
+#include 
+
+#define VESA_MAGIC ('V' + ('E' << 8) + ('S' << 16) + ('A' << 24))
+
+struct int10_args {
+   u32 eax;
+   u32 ebx;
+   u32 ecx;
+   u32 edx;
+   u32 esp;
+   u32 ebp;
+   u32 esi;
+   u32 edi;
+   u32 es;
+};
+
+/* VESA General Information table */
+struct vesa_general_info {
+   u32 signature;  /* 0 Magic number = "VESA" */
+   u16 version;/* 4 */
+   void *vendor_string;/* 6 */
+   u32 capabilities;   /* 10 */
+   void *video_mode_ptr;   /* 14 */
+   u16 total_memory;   /* 18 */
+
+   u8 reserved[236];   /* 20 */
+} __attribute__ ((packed));
+
+
+struct vminfo {
+   u16 mode_attr;  /* 0 */
+   u8  win_attr[2];/* 2 */
+   u16 win_grain;  /* 4 */
+   u16 win_size;   /* 6 */
+   u16 win_seg[2]; /* 8 */
+   u32 win_scheme; /* 12 */
+   u16 logical_scan;   /* 16 */
+
+   u16 h_res;  /* 18 */
+   u16 v_res;  /* 20 */
+   u8  char_width; /* 22 */
+   u8  char_height;/* 23 */
+   u8  memory_planes;  /* 24 */
+   u8  bpp;/* 25 */
+   u8  banks;  /* 26 */
+   u8  memory_layout;  /* 27 */
+   u8  bank_size;  /* 28 */
+   u8  image_planes;   /* 29 */
+   u8  page_function;  /* 30 */
+
+   u8  rmask;  /* 31 */
+   u8  rpos;   /* 32 */
+   u8  gmask;  /* 33 */
+   u8  gpos;   /* 34 */
+   u8  bmask;  /* 35 */
+   u8  bpos;   /* 36 */
+   u8  resv_mask;  /* 37 */
+   u8  resv_pos;   /* 38 */
+   u8  dcm_info;   /* 39 */
+
+   u32 lfb_ptr;/* 40 Linear frame buffer address */
+   u32 offscreen_ptr;  /* 44 Offscreen memory address */
+   u16 offscreen_size; /* 48 */
+
+   u8  reserved[206];  /* 50 */
+};
+
+char oemstring[11] = "KVM VESA";
+u16 modes[2] = { 0x0112, 0x };
+
+static inline void outb(unsigned short port, unsigned char val)
+{
+   asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
+}
+
+/*
+ * It's probably much more useful to make this print to the serial
+ * line rather than print to a non-displayed VGA memory
+ */
+static inline void int10_putchar(struct int10_args *args)
+{
+   u8 al, ah;
+
+   al = args->eax & 0xFF;
+

Re: [PATCH 5/5 V2] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Pekka Enberg

On 5/23/11 2:38 PM, Ingo Molnar wrote:

* Sasha Levin  wrote:


@@ -511,7 +515,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
kvm->nrcpus = nrcpus;

memset(real_cmdline, 0, sizeof(real_cmdline));
-   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serial");
+   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1");
+   if (vnc) {
+   strcat(real_cmdline, " video=vesafb console=tty0");
+   vidmode = 0x312;
+   } else {
+   strcat(real_cmdline, " console=ttyS0 earlyprintk=serial");
+   }

Hm, I think all the kernel parameter handling code wants to move into driver
specific routines as well. Something like:

serial_init(kvm, real_cmdline);

where serial_init() would append to real_cmdline if needed.

This removes a bit of serial-driver specific knowledge from kvm-run.c.

Same goes for the VESA driver and the above video mode flag logic.


@@ -597,6 +607,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)

kvm__init_ram(kvm);

+   if (vnc)
+   vesa__init(kvm);

Shouldn't vesa__init() itself know about whether it's active (i.e. the 'vnc'
flag is set) and return early if it's not set?

That way this could become more encapsulated and self-sufficient:

vesa__init(kvm);

With no VESA driver specific state exposed to the generic kvm_cmd_run()
function.

Ideally kvm_cmd_run() should just be a series of:

serial_init(kvm, real_cmdline);
vesa_init(kvm, real_cmdline);
...

initialization routines. Later on even this could be removed: using section
tricks we can put init functions into a section and drivers could register
their init function like initcall(func) functions are registered within the
kernel. kvm_cmd_run() could thus iterate over that (build time constructed)
section like this:

extern initcall_t __initcall_start[], __initcall_end[], __early_initcall_end[];

static void __init do_initcalls(void)
{
 initcall_t *fn;

 for (fn = __early_initcall_end; fn<  __initcall_end; fn++)
 do_one_initcall(*fn);
}

and would not actually have *any* knowledge about what drivers were built in.

Currently it's fine to initialize everything explicitly - but this would be the
long term model to work towards ...


Prasad, didn't you have patches to do exactly that?


Re: [PATCH 5/5 V2] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

> @@ -511,7 +515,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
>   kvm->nrcpus = nrcpus;
>  
>   memset(real_cmdline, 0, sizeof(real_cmdline));
> - strcpy(real_cmdline, "notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serial");
> + strcpy(real_cmdline, "notsc noapic noacpi pci=conf1");
> + if (vnc) {
> + strcat(real_cmdline, " video=vesafb console=tty0");
> + vidmode = 0x312;
> + } else {
> + strcat(real_cmdline, " console=ttyS0 earlyprintk=serial");
> + }

Hm, I think all the kernel parameter handling code wants to move into driver
specific routines as well. Something like:

serial_init(kvm, real_cmdline);

where serial_init() would append to real_cmdline if needed.

This removes a bit of serial-driver specific knowledge from kvm-run.c.

Same goes for the VESA driver and the above video mode flag logic.

> @@ -597,6 +607,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
>  
>   kvm__init_ram(kvm);
>  
> + if (vnc)
> + vesa__init(kvm);

Shouldn't vesa__init() itself know about whether it's active (i.e. the 'vnc'
flag is set) and return early if it's not set?

That way this could become more encapsulated and self-sufficient:

vesa__init(kvm);

With no VESA driver specific state exposed to the generic kvm_cmd_run() 
function.

Ideally kvm_cmd_run() should just be a series of:

serial_init(kvm, real_cmdline);
vesa_init(kvm, real_cmdline);
...

initialization routines. Later on even this could be removed: using section 
tricks we can put init functions into a section and drivers could register 
their init function like initcall(func) functions are registered within the 
kernel. kvm_cmd_run() could thus iterate over that (build time constructed) 
section like this:

extern initcall_t __initcall_start[], __initcall_end[], __early_initcall_end[];

static void __init do_initcalls(void)
{
initcall_t *fn;

for (fn = __early_initcall_end; fn < __initcall_end; fn++)
do_one_initcall(*fn);
}

and would not actually have *any* knowledge about what drivers were built in.

Currently it's fine to initialize everything explicitly - but this would be the 
long term model to work towards ...

Thanks,

Ingo


Re: [PATCH 3/5 V2] kvm tools: Add VESA device

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

> +struct int10args;

this should be int10_args.

Thanks,

Ingo


Re: [PATCH 2/5 V2] kvm tools: Add video mode to kernel initialization

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

>  bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
> - const char *initrd_filename, const char 
> *kernel_cmdline);
> + const char *initrd_filename, const char 
> *kernel_cmdline, u16 vidmode);

Suggestion for future cleanup: we really want to grow a 'struct kernel_params'
kind of thing which could be passed along here by address.

That would make it easier to extend it with whatever may come along in the
future, and would make the code look cleaner as well.

Thanks,

Ingo


Re: [PATCH 1/5 V2] kvm tools: Add BIOS INT10 handler

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

> INT10 handler is a basic implementation of BIOS video services.
> 
> The handler implements a VESA interface which is initialized at
> the very beginning of loading the kernel.
> 
> Signed-off-by: John Floren 
> [ turning code into patches and cleanup ]
> Signed-off-by: Sasha Levin 
> ---
>  tools/kvm/bios/bios-rom.S |   56 
>  tools/kvm/bios/int10.c|  161 
> +
>  2 files changed, 189 insertions(+), 28 deletions(-)
>  create mode 100644 tools/kvm/bios/int10.c
> 
> diff --git a/tools/kvm/bios/bios-rom.S b/tools/kvm/bios/bios-rom.S
> index 8a53dcd..b636cb8 100644
> --- a/tools/kvm/bios/bios-rom.S
> +++ b/tools/kvm/bios/bios-rom.S
> @@ -27,36 +27,36 @@ ENTRY_END(bios_intfake)
>   * We ignore bx settings
>   */
>  ENTRY(bios_int10)
> - test $0x0e, %ah
> - jne 1f
> + pushw   %fs
> + pushl   %es
> + pushl   %edi
> + pushl   %esi
> + pushl   %ebp
> + pushl   %esp
> + pushl   %edx
> + pushl   %ecx
> + pushl   %ebx
> + pushl   %eax
> +
> + movl%esp, %eax
> + /* this is way easier than doing it in assembly */
> + /* just push all the regs and jump to a C handler */
> + callint10handler
> +
> + popl%eax
> + popl%ebx
> + popl%ecx
> + popl%edx
> + popl%esp
> + popl%ebp
> + popl%esi
> + popl%edi
> + popl%es
> + popw%fs
>  
> -/*
> - * put char in AL at current cursor and
> - * increment cursor position
> - */
> -putchar:
> - stack_swap
> -
> - push %fs
> - push %bx
> -
> - mov $VGA_RAM_SEG, %bx
> - mov %bx, %fs
> - mov %cs:(cursor), %bx
> - mov %al, %fs:(%bx)
> - inc %bx
> - test $VGA_PAGE_SIZE, %bx
> - jb putchar_new
> - xor %bx, %bx
> -putchar_new:
> - mov %bx, %fs:(cursor)
> -
> - pop %bx
> - pop %fs
> -
> - stack_restore
> -1:
>   IRET
> +
> +
>  /*
>   * private IRQ data
>   */
> diff --git a/tools/kvm/bios/int10.c b/tools/kvm/bios/int10.c
> new file mode 100644
> index 000..98205c3
> --- /dev/null
> +++ b/tools/kvm/bios/int10.c
> @@ -0,0 +1,161 @@
> +#include "kvm/segment.h"
> +#include "kvm/bios.h"
> +#include "kvm/util.h"
> +#include "kvm/vesa.h"
> +#include 
> +
> +#define VESA_MAGIC ('V' + ('E' << 8) + ('S' << 16) + ('A' << 24))
> +
> +struct int10args {
> + u32 eax;
> + u32 ebx;
> + u32 ecx;
> + u32 edx;
> + u32 esp;
> + u32 ebp;
> + u32 esi;
> + u32 edi;
> + u32 es;
> +};
> +
> +/* VESA General Information table */
> +struct vesa_general_info {
> + u32 signature;  /* 0 Magic number = "VESA" */
> + u16 version;/* 4 */
> + void *vendor_string;/* 6 */
> + u32 capabilities;   /* 10 */
> + void *video_mode_ptr;   /* 14 */
> + u16 total_memory;   /* 18 */
> +
> + u8 reserved[236];   /* 20 */
> +} __attribute__ ((packed));
> +
> +
> +struct vminfo {
> + u16 mode_attr;  /* 0 */
> + u8  win_attr[2];/* 2 */
> + u16 win_grain;  /* 4 */
> + u16 win_size;   /* 6 */
> + u16 win_seg[2]; /* 8 */
> + u32 win_scheme; /* 12 */
> + u16 logical_scan;   /* 16 */
> +
> + u16 h_res;  /* 18 */
> + u16 v_res;  /* 20 */
> + u8  char_width; /* 22 */
> + u8  char_height;/* 23 */
> + u8  memory_planes;  /* 24 */
> + u8  bpp;/* 25 */
> + u8  banks;  /* 26 */
> + u8  memory_layout;  /* 27 */
> + u8  bank_size;  /* 28 */
> + u8  image_planes;   /* 29 */
> + u8  page_function;  /* 30 */
> +
> + u8  rmask;  /* 31 */
> + u8  rpos;   /* 32 */
> + u8  gmask;  /* 33 */
> + u8  gpos;   /* 34 */
> + u8  bmask;  /* 35 */
> + u8  bpos;   /* 36 */
> + u8  resv_mask;  /* 37 */
> + u8  resv_pos;   /* 38 */
> + u8  dcm_info;   /* 39 */
> +
> + u32 lfb_ptr;/* 40 Linear frame buffer address */
> + u32 offscreen_ptr;  /* 44 Offscreen memory address */
> + u16 offscreen_size; /* 48 */
> +
> + u8  reserved[206];  /* 50 */
> +};
> +
> +char oemstring[11] = "KVM VESA";
> +u16 modes[2] = { 0x0112, 0x };
> +
> +static inline void outb(unsigned short port, unsigned char val)
> +{
> + asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
> +}
> +
> +/*
> + * It's probably much more useful to make this print to the serial
> + * line r

Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-23 Thread Michael S. Tsirkin
On Mon, May 23, 2011 at 11:37:15AM +0930, Rusty Russell wrote:
> On Sun, 22 May 2011 15:10:08 +0300, "Michael S. Tsirkin"  wrote:
> > On Sat, May 21, 2011 at 11:49:59AM +0930, Rusty Russell wrote:
> > > On Fri, 20 May 2011 02:11:56 +0300, "Michael S. Tsirkin"  wrote:
> > > > Current code might introduce a lot of latency variation
> > > > if there are many pending bufs at the time we
> > > > attempt to transmit a new one. This is bad for
> > > > real-time applications and can't be good for TCP either.
> > > 
> > > Do we have more than speculation to back that up, BTW?
> > 
> > Need to dig this up: I thought we saw some reports of this on the list?
> 
> I think so too, but a reference needs to be here too.
> 
> It helps to have exact benchmarks on what's being tested, otherwise we
> risk unexpected interaction with the other optimization patches.
> 
> > > > struct sk_buff *skb;
> > > > unsigned int len;
> > > > -
> > > > -   while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > > > +   bool c;
> > > > +   int n;
> > > > +
> > > > +   /* We try to free up at least 2 skbs per one sent, so that we'll get
> > > > +* all of the memory back if they are used fast enough. */
> > > > +   for (n = 0;
> > > > +((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
> > > > +((skb = virtqueue_get_buf(vi->svq, &len)));
> > > > +++n) {
> > > > pr_debug("Sent skb %p\n", skb);
> > > > vi->dev->stats.tx_bytes += skb->len;
> > > > vi->dev->stats.tx_packets++;
> > > > dev_kfree_skb_any(skb);
> > > > }
> > > > +   return !c;
> > > 
> > > This is for() abuse :)
> > > 
> > > Why is the capacity check in there at all?  Surely it's simpler to try
> > > to free 2 skbs each time around?
> > 
> > This is in case we can't use indirect: we want to free up
> > enough buffers for the following add_buf to succeed.
> 
> Sure, or we could just count the frags of the skb we're taking out,
> which would be accurate for both cases and far more intuitive.
> 
> ie. always try to free up twice as much as we're about to put in.
> 
> Can we hit problems with OOM?  Sure, but no worse than now...
> The problem is that this "virtqueue_get_capacity()" returns the worst
> case, not the normal case.  So using it is deceptive.
> 

Maybe just document this?

I still believe capacity really needs to be decided
at the virtqueue level, not in the driver.
E.g. with indirect each skb uses a single entry: freeing
1 small skb is always enough to have space for a large one.

I do understand how it seems a waste to leave direct space
in the ring while we might in practice have space
due to indirect. Didn't come up with a nice way to
solve this yet - but 'no worse than now :)'

> > I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
> > sure we have enough space in the buffer. Another way to do
> > that is with a define :).
> 
> To do this properly, we should really be using the actual number of sg
> elements needed, but we'd have to do most of xmit_skb beforehand so we
> know how many.
> 
> Cheers,
> Rusty.

Maybe I'm confused here.  The problem isn't the failing
add_buf for the given skb IIUC.  What we are trying to do here is stop
the queue *before xmit_skb fails*. We can't look at the
number of fragments in the current skb - the next one can be
much larger.  That's why we check capacity after xmit_skb,
not before it, right?

-- 
MST


[PATCH 5/5 V2] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Sasha Levin
Requirements - Kernel compiled with:
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_VESA=y
CONFIG_FRAMEBUFFER_CONSOLE=y

Start VNC server by starting kvm tools with "--vnc".
Connect to the VNC server by running: "vncviewer :0".

Since there is no support for input devices at this time,
it may be useful starting kvm tools with an additional
' -p "console=ttyS0" ' parameter so that it would be possible
to use a serial console alongside with a graphic one.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/kvm-run.c |   17 +++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 288e1fb..adbb25b 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* header files for gitish interface  */
 #include 
@@ -66,6 +67,7 @@ static const char *virtio_9p_dir;
 static bool single_step;
 static bool readonly_image[MAX_DISK_IMAGES];
 static bool virtio_rng;
+static bool vnc;
 extern bool ioport_debug;
 extern int  active_console;
 
@@ -110,6 +112,7 @@ static const struct option options[] = {
OPT_STRING('\0', "kvm-dev", &kvm_dev, "kvm-dev", "KVM device file"),
OPT_STRING('\0', "virtio-9p", &virtio_9p_dir, "root dir",
"Enable 9p over virtio"),
+   OPT_BOOLEAN('\0', "vnc", &vnc, "Enable VNC framebuffer"),
 
OPT_GROUP("Kernel options:"),
OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -413,6 +416,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
char *hi;
int i;
void *ret;
+   u16 vidmode = 0;
 
signal(SIGALRM, handle_sigalrm);
signal(SIGQUIT, handle_sigquit);
@@ -511,7 +515,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
kvm->nrcpus = nrcpus;
 
memset(real_cmdline, 0, sizeof(real_cmdline));
-   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serial");
+   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1");
+   if (vnc) {
+   strcat(real_cmdline, " video=vesafb console=tty0");
+   vidmode = 0x312;
+   } else {
+   strcat(real_cmdline, " console=ttyS0 earlyprintk=serial");
+   }
strcat(real_cmdline, " ");
if (kernel_cmdline)
strlcat(real_cmdline, kernel_cmdline, sizeof(real_cmdline));
@@ -543,7 +553,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
	printf("  # kvm run -k %s -m %Lu -c %d\n", kernel_filename, ram_size / 1024 / 1024, nrcpus);
 
if (!kvm__load_kernel(kvm, kernel_filename, initrd_filename,
-   real_cmdline))
+   real_cmdline, vidmode))
die("unable to load kernel %s", kernel_filename);
 
kvm->vmlinux= vmlinux_filename;
@@ -597,6 +607,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 
kvm__init_ram(kvm);
 
+   if (vnc)
+   vesa__init(kvm);
+
thread_pool__init(nr_online_cpus);
 
for (i = 0; i < nrcpus; i++) {
-- 
1.7.5.rc3



[PATCH 4/5 V2] kvm tools: Update makefile and feature tests

2011-05-23 Thread Sasha Levin
Update feature tests to test for libvncserver.

VESA support doesn't get compiled in unless libvncserver
is installed.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/Makefile |   11 ++-
 tools/kvm/config/feature-tests.mak |   10 ++
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index e6e8d4e..2ebc86c 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -58,6 +58,14 @@ ifeq ($(has_bfd),y)
LIBS+= -lbfd
 endif
 
+FLAGS_VNCSERVER=$(CFLAGS) -lvncserver
+has_vncserver := $(call try-cc,$(SOURCE_VNCSERVER),$(FLAGS_VNCSERVER))
+ifeq ($(has_vncserver),y)
+   CFLAGS  += -DCONFIG_HAS_VNCSERVER
+   OBJS+= hw/vesa.o
+   LIBS+= -lvncserver
+endif
+
 DEPS   := $(patsubst %.o,%.d,$(OBJS))
 
 # Exclude BIOS object files from header dependencies.
@@ -153,9 +161,10 @@ bios/bios.o: bios/bios.S bios/bios-rom.bin
 bios/bios-rom.bin: bios/bios-rom.S bios/e820.c
$(E) "  CC  " $@
$(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/e820.c -o bios/e820.o
+   $(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/int10.c -o bios/int10.o
$(Q) $(CC) $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/bios-rom.S -o bios/bios-rom.o
$(E) "  LD  " $@
-   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o bios/e820.o
+   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o bios/e820.o bios/int10.o
$(E) "  OBJCOPY " $@
$(Q) objcopy -O binary -j .text bios/bios-rom.bin.elf bios/bios-rom.bin
$(E) "  NM  " $@
diff --git a/tools/kvm/config/feature-tests.mak b/tools/kvm/config/feature-tests.mak
index 6170fd2..0801b54 100644
--- a/tools/kvm/config/feature-tests.mak
+++ b/tools/kvm/config/feature-tests.mak
@@ -126,3 +126,13 @@ int main(void)
return 0;
 }
 endef
+
+define SOURCE_VNCSERVER
+#include 
+
+int main(void)
+{
+   rfbIsActive((void *)0);
+   return 0;
+}
+endef
-- 
1.7.5.rc3



[PATCH 3/5 V2] kvm tools: Add VESA device

2011-05-23 Thread Sasha Levin
Add a simple VESA device which simply moves a framebuffer
from guest kernel to a VNC server.

VESA device PCI code is very similar to virtio-* PCI code.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/hw/vesa.c|  108 
 tools/kvm/include/kvm/ioport.h |2 +
 tools/kvm/include/kvm/vesa.h   |   27 
 tools/kvm/include/kvm/virtio-pci-dev.h |3 +
 4 files changed, 140 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/hw/vesa.c
 create mode 100644 tools/kvm/include/kvm/vesa.h

diff --git a/tools/kvm/hw/vesa.c b/tools/kvm/hw/vesa.c
new file mode 100644
index 000..3003aa5
--- /dev/null
+++ b/tools/kvm/hw/vesa.c
@@ -0,0 +1,108 @@
+#include "kvm/vesa.h"
+#include "kvm/ioport.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+#include "kvm/pci.h"
+#include "kvm/kvm-cpu.h"
+#include "kvm/irq.h"
+#include "kvm/virtio-pci-dev.h"
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define VESA_QUEUE_SIZE128
+#define VESA_IRQ   14
+
+/*
+ * This "6000" value is pretty much the result of experimentation
+ * It seems that around this value, things update pretty smoothly
+ */
+#define VESA_UPDATE_TIME   6000
+
+u8 videomem[VESA_MEM_SIZE];
+
+static bool vesa_pci_io_in(struct kvm *kvm, u16 port, void *data, int size, u32 count)
+{
+   printf("vesa in port=%u\n", port);
+   return true;
+}
+
+static bool vesa_pci_io_out(struct kvm *kvm, u16 port, void *data, int size, u32 count)
+{
+   printf("vesa out port=%u\n", port);
+   return true;
+}
+
+static struct ioport_operations vesa_io_ops = {
+   .io_in  = vesa_pci_io_in,
+   .io_out = vesa_pci_io_out,
+};
+
+static struct pci_device_header vesa_pci_device = {
+   .vendor_id  = PCI_VENDOR_ID_REDHAT_QUMRANET,
+   .device_id  = PCI_DEVICE_ID_VESA,
+   .header_type= PCI_HEADER_TYPE_NORMAL,
+   .revision_id= 0,
+   .class  = 0x03,
+   .subsys_vendor_id   = PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET,
+   .subsys_id  = PCI_SUBSYSTEM_ID_VESA,
+   .bar[0] = IOPORT_VESA | PCI_BASE_ADDRESS_SPACE_IO,
+   .bar[1] = VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY,
+};
+
+
+void vesa_mmio_callback(u64 addr, u8 *data, u32 len, u8 is_write)
+{
+   if (is_write)
+   memcpy(&videomem[addr - VESA_MEM_ADDR], data, len);
+
+   return;
+}
+
+void vesa__init(struct kvm *kvm)
+{
+   u8 dev, line, pin;
+   pthread_t thread;
+
+   if (irq__register_device(PCI_DEVICE_ID_VESA, &dev, &pin, &line) < 0)
+   return;
+
+   vesa_pci_device.irq_pin = pin;
+   vesa_pci_device.irq_line = line;
+   pci__register(&vesa_pci_device, dev);
+   ioport__register(IOPORT_VESA, &vesa_io_ops, IOPORT_VESA_SIZE);
+
+   kvm__register_mmio(VESA_MEM_ADDR, VESA_MEM_SIZE, &vesa_mmio_callback);
+   pthread_create(&thread, NULL, vesa__dovnc, kvm);
+}
+
+/*
+ * This starts a VNC server to display the framebuffer.
+ * It's not altogether clear this belongs here rather than in kvm-run.c
+ */
+void *vesa__dovnc(void *v)
+{
+   /*
+* Make a fake argc and argv because the getscreen function
+* seems to want it.
+*/
+   int ac = 1;
+   char av[1][1] = {{0} };
+   rfbScreenInfoPtr server;
+
+   server = rfbGetScreen(&ac, (char **)av, VESA_WIDTH, VESA_HEIGHT, 8, 3, 4);
+   server->frameBuffer = (char *)videomem;
+   server->alwaysShared = TRUE;
+   rfbInitServer(server);
+
+   while (rfbIsActive(server)) {
+   rfbMarkRectAsModified(server, 0, 0, VESA_WIDTH, VESA_HEIGHT);
+   rfbProcessEvents(server, server->deferUpdateTime * VESA_UPDATE_TIME);
+   }
+   return NULL;
+}
+
diff --git a/tools/kvm/include/kvm/ioport.h b/tools/kvm/include/kvm/ioport.h
index 218530c..8253938 100644
--- a/tools/kvm/include/kvm/ioport.h
+++ b/tools/kvm/include/kvm/ioport.h
@@ -7,6 +7,8 @@
 
 /* some ports we reserve for own use */
 #define IOPORT_DBG 0xe0
+#define IOPORT_VESA0xa200
+#define IOPORT_VESA_SIZE   256
 #define IOPORT_VIRTIO_P9   0xb200  /* Virtio 9P device */
 #define IOPORT_VIRTIO_P9_SIZE  256
 #define IOPORT_VIRTIO_BLK  0xc200  /* Virtio block device */
diff --git a/tools/kvm/include/kvm/vesa.h b/tools/kvm/include/kvm/vesa.h
new file mode 100644
index 000..3e58587
--- /dev/null
+++ b/tools/kvm/include/kvm/vesa.h
@@ -0,0 +1,27 @@
+#ifndef KVM__VESA_H
+#define KVM__VESA_H
+
+#include 
+
+#define VESA_WIDTH 640
+#define VESA_HEIGHT480
+
+#define VESA_MEM_ADDR  0xd000
+#define VESA_MEM_SIZE  (4*VESA_WIDTH*VESA_HEIGHT)
+#define VESA_BPP   32
+
+struct kvm;
+struct int10args;
+
+void vesa_mmio_callback(u64, u8*, u32, u8);
+void vesa_

[PATCH 2/5 V2] kvm tools: Add video mode to kernel initialization

2011-05-23 Thread Sasha Levin
Allow setting video mode in guest kernel.

For possible values see Documentation/fb/vesafb.txt

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/include/kvm/kvm.h |2 +-
 tools/kvm/kvm.c |7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/include/kvm/kvm.h b/tools/kvm/include/kvm/kvm.h
index 08c6fda..f951f2d 100644
--- a/tools/kvm/include/kvm/kvm.h
+++ b/tools/kvm/include/kvm/kvm.h
@@ -41,7 +41,7 @@ int kvm__max_cpus(struct kvm *kvm);
 void kvm__init_ram(struct kvm *kvm);
 void kvm__delete(struct kvm *kvm);
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char *kernel_cmdline);
+   const char *initrd_filename, const char *kernel_cmdline, u16 vidmode);
 void kvm__setup_bios(struct kvm *kvm);
 void kvm__start_timer(struct kvm *kvm);
 void kvm__stop_timer(struct kvm *kvm);
diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 4393a41..7284211 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -320,7 +320,7 @@ static int load_flat_binary(struct kvm *kvm, int fd)
 static const char *BZIMAGE_MAGIC   = "HdrS";
 
 static bool load_bzimage(struct kvm *kvm, int fd_kernel,
-   int fd_initrd, const char *kernel_cmdline)
+   int fd_initrd, const char *kernel_cmdline, u16 vidmode)
 {
struct boot_params *kern_boot;
unsigned long setup_sects;
@@ -383,6 +383,7 @@ static bool load_bzimage(struct kvm *kvm, int fd_kernel,
kern_boot->hdr.type_of_loader   = 0xff;
kern_boot->hdr.heap_end_ptr = 0xfe00;
kern_boot->hdr.loadflags|= CAN_USE_HEAP;
+   kern_boot->hdr.vid_mode = vidmode;
 
/*
 * Read initrd image into guest memory
@@ -441,7 +442,7 @@ static bool initrd_check(int fd)
 }
 
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char *kernel_cmdline)
+   const char *initrd_filename, const char *kernel_cmdline, u16 vidmode)
 {
bool ret;
int fd_kernel = -1, fd_initrd = -1;
@@ -459,7 +460,7 @@ bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
die("%s is not an initrd", initrd_filename);
}
 
-   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline);
+   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline, vidmode);
 
if (initrd_filename)
close(fd_initrd);
-- 
1.7.5.rc3



[PATCH 1/5 V2] kvm tools: Add BIOS INT10 handler

2011-05-23 Thread Sasha Levin
INT10 handler is a basic implementation of BIOS video services.

The handler implements a VESA interface which is initialized at
the very beginning of loading the kernel.

Signed-off-by: John Floren 
[ turning code into patches and cleanup ]
Signed-off-by: Sasha Levin 
---
 tools/kvm/bios/bios-rom.S |   56 
 tools/kvm/bios/int10.c|  161 +
 2 files changed, 189 insertions(+), 28 deletions(-)
 create mode 100644 tools/kvm/bios/int10.c

diff --git a/tools/kvm/bios/bios-rom.S b/tools/kvm/bios/bios-rom.S
index 8a53dcd..b636cb8 100644
--- a/tools/kvm/bios/bios-rom.S
+++ b/tools/kvm/bios/bios-rom.S
@@ -27,36 +27,36 @@ ENTRY_END(bios_intfake)
  * We ignore bx settings
  */
 ENTRY(bios_int10)
-   test $0x0e, %ah
-   jne 1f
+   pushw   %fs
+   pushl   %es
+   pushl   %edi
+   pushl   %esi
+   pushl   %ebp
+   pushl   %esp
+   pushl   %edx
+   pushl   %ecx
+   pushl   %ebx
+   pushl   %eax
+
+   movl%esp, %eax
+   /* this is way easier than doing it in assembly */
+   /* just push all the regs and jump to a C handler */
+   callint10handler
+
+   popl%eax
+   popl%ebx
+   popl%ecx
+   popl%edx
+   popl%esp
+   popl%ebp
+   popl%esi
+   popl%edi
+   popl%es
+   popw%fs
 
-/*
- * put char in AL at current cursor and
- * increment cursor position
- */
-putchar:
-   stack_swap
-
-   push %fs
-   push %bx
-
-   mov $VGA_RAM_SEG, %bx
-   mov %bx, %fs
-   mov %cs:(cursor), %bx
-   mov %al, %fs:(%bx)
-   inc %bx
-   test $VGA_PAGE_SIZE, %bx
-   jb putchar_new
-   xor %bx, %bx
-putchar_new:
-   mov %bx, %fs:(cursor)
-
-   pop %bx
-   pop %fs
-
-   stack_restore
-1:
IRET
+
+
 /*
  * private IRQ data
  */
diff --git a/tools/kvm/bios/int10.c b/tools/kvm/bios/int10.c
new file mode 100644
index 000..98205c3
--- /dev/null
+++ b/tools/kvm/bios/int10.c
@@ -0,0 +1,161 @@
+#include "kvm/segment.h"
+#include "kvm/bios.h"
+#include "kvm/util.h"
+#include "kvm/vesa.h"
+#include 
+
+#define VESA_MAGIC ('V' + ('E' << 8) + ('S' << 16) + ('A' << 24))
+
+struct int10args {
+   u32 eax;
+   u32 ebx;
+   u32 ecx;
+   u32 edx;
+   u32 esp;
+   u32 ebp;
+   u32 esi;
+   u32 edi;
+   u32 es;
+};
+
+/* VESA General Information table */
+struct vesa_general_info {
+   u32 signature;  /* 0 Magic number = "VESA" */
+   u16 version;/* 4 */
+   void *vendor_string;/* 6 */
+   u32 capabilities;   /* 10 */
+   void *video_mode_ptr;   /* 14 */
+   u16 total_memory;   /* 18 */
+
+   u8 reserved[236];   /* 20 */
+} __attribute__ ((packed));
+
+
+struct vminfo {
+   u16 mode_attr;  /* 0 */
+   u8  win_attr[2];/* 2 */
+   u16 win_grain;  /* 4 */
+   u16 win_size;   /* 6 */
+   u16 win_seg[2]; /* 8 */
+   u32 win_scheme; /* 12 */
+   u16 logical_scan;   /* 16 */
+
+   u16 h_res;  /* 18 */
+   u16 v_res;  /* 20 */
+   u8  char_width; /* 22 */
+   u8  char_height;/* 23 */
+   u8  memory_planes;  /* 24 */
+   u8  bpp;/* 25 */
+   u8  banks;  /* 26 */
+   u8  memory_layout;  /* 27 */
+   u8  bank_size;  /* 28 */
+   u8  image_planes;   /* 29 */
+   u8  page_function;  /* 30 */
+
+   u8  rmask;  /* 31 */
+   u8  rpos;   /* 32 */
+   u8  gmask;  /* 33 */
+   u8  gpos;   /* 34 */
+   u8  bmask;  /* 35 */
+   u8  bpos;   /* 36 */
+   u8  resv_mask;  /* 37 */
+   u8  resv_pos;   /* 38 */
+   u8  dcm_info;   /* 39 */
+
+   u32 lfb_ptr;/* 40 Linear frame buffer address */
+   u32 offscreen_ptr;  /* 44 Offscreen memory address */
+   u16 offscreen_size; /* 48 */
+
+   u8  reserved[206];  /* 50 */
+};
+
+char oemstring[11] = "KVM VESA";
+u16 modes[2] = { 0x0112, 0x };
+
+static inline void outb(unsigned short port, unsigned char val)
+{
+   asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
+}
+
+/*
+ * It's probably much more useful to make this print to the serial
+ * line rather than print to a non-displayed VGA memory
+ */
+static inline void int10putchar(struct int10args *args)
+{
+   u8 al, ah;
+
+   al = args->eax & 0xFF;
+   ah = (args->eax &

Re: Some errors when running KVM-Autotest on kernel-2.6.39

2011-05-23 Thread Avi Kivity

On 05/23/2011 12:43 PM, Zhi Yong Wu wrote:

Hi, guys,

Some warnings and errors appear when running KVM-autotest on kernel 2.6.39

Can anyone give some comments? Is it a known issue, new, or a problem with my 
setup?
/home/zwu/work/virt/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20110523-101151-G8Zb',server,nowait -serial 
unix:'/tmp/serial-20110523-101151-G8Zb',server,nowait -m 512 -smp 2 -kernel 
'/home/zwu/work/virt/autotest/client/tests/kvm/unittests/emulator.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20110523-101151-G8Zb -device 
testdev,chardev=testlog  -S



10:13:16 INFO | (qemu) Code=44 24 08 03 00 00 00 c7 44 24 0c 04 00 00 00 66 0f 6f 04 
24  0f 7f 03 48 89 de 48 89 e7 e8 a8 ee ff ff 0f b6 f0 bf cb c3 40 00 e8 c4 
ee ff ff c7 03


That's a movdqu instruction.  2.6.40 gained support for emulating this 
instruction, and the emulator unit test has a new test for it.  
Obviously it will fail on earlier kernels.


We need some way to tell the test to expect failures on older kernels.

--
error compiling committee.c: too many arguments to function



KVM call agenda for May 24th

2011-05-23 Thread Juan Quintela

Please send in any agenda items you are interested in covering.

Thanks, Juan.


Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Avi Kivity

On 05/22/2011 10:32 PM, Nadav Har'El wrote:

On Thu, May 12, 2011, Gleb Natapov wrote about "Re: [PATCH 0/30] nVMX: Nested VMX, 
v9":
>  >  But if my interpretation of the code is correct, SVM isn't much closer
>  >  than VMX to the goal of moving this logic to x86.c. When some logic is
>  >  moved there, both SVM and VMX code will need to change - perhaps even
>  >  considerably. So how will it be helpful to make VMX behave exactly like
>  >  SVM does now, when the latter will also need to change considerably?
>  >
>  SVM design is much close to the goal of moving the logic into x86.c
>  because IIRC it does not bypass parsing of IDT vectoring info into arch
>  independent structure. VMX code uses vmx->idt_vectoring_info directly.

At the risk of sounding blasphemous, I'd like to make the case that perhaps
the current nested-VMX design - regarding the IDT-vectoring-info-field
handling - is actually closer than nested-SVM to the goal of moving clean
nested-supporting logic into x86.c, instead of having ad-hoc, unnatural,
workarounds.

Let me explain, and see if you agree with my logic:

We discover at exit time whether the virtualization hardware (VMX or SVM)
exited while *delivering* an interrupt or exception to the current guest.
This is known as "idt-vectoring-information" in VMX.

What do we need to do with this idt-vectoring-information? In regular (non-
nested) guests, the answer is simple: On the next entry, we need to inject
this event again into the guest, so it can resume the delivery of the
same event it was trying to deliver. This is why the nested-unaware code
has a vmx_complete_interrupts which basically adds this idt-vectoring-info
into KVM's event queue, which on the next entry will be injected similarly
to the way virtual interrupts from userspace are injected, and so on.


The other thing we may need to do is to expose it to userspace in case 
we're live migrating at exactly this point in time.



But with nested virtualization, this is *not* what is supposed to happen -
we do not *always* need to inject the event to the guest. We will only need
to inject the event if the next entry will be again to the same guest, i.e.,
L1 after L1, or L2 after L2. If the idt-vectoring-info came from L2, but
our next entry will be into L1 (i.e., a nested exit), we *shouldn't* inject
the event as usual, but should rather pass this idt-vectoring-info field
as the exit information that L1 gets (in nested vmx terminology, in vmcs12).

However, at the time of exit, we cannot know for sure whether L2 will actually
run next, because it is still possible that an injection from user space,
before the next entry, will cause us to decide to exit to L1.

Therefore, I believe that the clean solution isn't to leave the original
non-nested logic that always queues the idt-vectoring-info assuming it will
be injected, and then if it shouldn't (because we want to exit during entry)
we need to skip the entry once as a "trick" to avoid this wrong injection.

Rather, a clean solution is, I think, to recognize that in nested
virtualization, idt-vectoring-info is a different kind of beast than regular
injected events, and it needs to be saved at exit time in a different field
(which will of course be common to SVM and VMX). Only at entry time, after
the regular injection code (which may cause a nested exit), we can call a
x86_op to handle this special injection.

The benefit of this approach, which is closer to the current vmx code,
is, I think, that x86.c will contain clear, self-explanatory nested logic,
instead of relying on vmx.c or svm.c circumventing various x86.c functions
and mechanisms to do something different from what they were meant to do.



IMO this will cause confusion, especially with the user interfaces used 
to read/write pending events.


I think what we need to do is:

1. change ->interrupt_allowed() to return true if the interrupt flag is 
unmasked OR if in a nested guest, and we're intercepting interrupts
2. change ->set_irq() to cause a nested vmexit if in a nested guest and 
we're intercepting interrupts

3. change ->nmi_allowed() and ->set_nmi() in a similar way
4. add a .injected flag to the interrupt queue which overrides the 
nested vmexit for VM_ENTRY_INTR_INFO_FIELD and the svm equivalent; if 
present normal injection takes place (or an error vmexit if the 
interrupt flag is clear and we cannot inject)



--
error compiling committee.c: too many arguments to function



Some errors when running KVM-Autotest on kernel-2.6.39

2011-05-23 Thread Zhi Yong Wu
Hi, guys,

Some warnings and errors appear when running KVM-autotest on kernel 2.6.39

Can anyone give some comments? Is it a known issue, new, or a problem with my 
setup?

[root@f12 linux-2.6]# uname -a
Linux f12 2.6.39 #2 SMP Fri May 20 19:51:05 CST 2011 x86_64 x86_64 x86_64 
GNU/Linux

[root@f12 linux-2.6]# modinfo kvm
filename:   /lib/modules/2.6.39/kernel/arch/x86/kvm/kvm.ko
license:GPL
author: Qumranet
srcversion: ABB5612DB8B1955AA82288F
depends:
vermagic:   2.6.39 SMP mod_unload 
parm:   oos_shadow:bool
parm:   ignore_msrs:bool

[root@f12 linux-2.6]# cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Duo CPU E6750  @ 2.66GHz
stepping: 11
cpu MHz : 2667.000
cache size  : 4096 KB
physical id : 0
siblings: 2
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm 
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor 
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dts tpr_shadow vnmi 
flexpriority
bogomips: 5320.45
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Duo CPU E6750  @ 2.66GHz
stepping: 11
cpu MHz : 2000.000
cache size  : 4096 KB
physical id : 0
siblings: 2
core id : 1
cpu cores   : 2
apicid  : 1
initial apicid  : 1
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm 
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor 
ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dts tpr_shadow vnmi 
flexpriority
bogomips: 5319.99
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Below is the output.

10:13:04 INFO | Running apic
10:13:04 WARNI| Could not send monitor command 'screendump 
/home/zwu/work/virt/autotest/client/results/default/kvm.unittest/debug/pre_vm1.ppm'
([Errno 32] Broken pipe)
10:13:04 INFO | Running qemu command:
/home/zwu/work/virt/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20110523-101151-G8Zb',server,nowait -serial 
unix:'/tmp/serial-20110523-101151-G8Zb',server,nowait -m 512 -smp 2 -kernel 
'/home/zwu/work/virt/autotest/client/tests/kvm/unittests/apic.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20110523-101151-G8Zb -device 
testdev,chardev=testlog  -S -cpu qemu64,+x2apic
10:13:05 INFO | Waiting for unittest apic to complete, timeout 600, output in 
/tmp/testlog-20110523-101151-G8Zb
10:13:07 INFO | (qemu) /bin/sh: line 1: 18661 Segmentation fault  (core 
dumped) /home/zwu/work/virt/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20110523-101151-G8Zb',server,nowait -serial 
unix:'/tmp/serial-20110523-101151-G8Zb',server,nowait -m 512 -smp 2 -kernel 
'/home/zwu/work/virt/autotest/client/tests/kvm/unittests/apic.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20110523-101151-G8Zb -device 
testdev,chardev=testlog -S -cpu qemu64,+x2apic
10:13:07 INFO | (qemu) (Process terminated with status 139)
10:13:07 ERROR| Unit test apic failed
10:13:07 INFO | Unit test log collected and available under 
/home/zwu/work/virt/autotest/client/results/default/kvm.unittest/debug/apic.log
10:13:07 INFO | Running svm
10:13:07 WARNI| Could not send monitor command 'screendump 
/home/zwu/work/virt/autotest/client/results/default/kvm.unittest/debug/pre_vm1.ppm'
([Errno 32] Broken pipe)
10:13:07 INFO | Running qemu command:
/home/zwu/work/virt/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20110523-101151-G8Zb',server,nowait -serial 
unix:'/tmp/serial-20110523-101151-G8Zb',server,nowait -m 512 -smp 2 -kernel 
'/home/zwu/work/virt/autotest/client/tests/kvm/unittests/svm.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20110523-101151-G8Zb -device 
testdev,chardev=testlog  -S -enable-nesting -cpu qemu64,+svm
10:13:08 INFO | Waiting for unittest svm to complete, timeout 600, output in 
/tmp/testlog-20110523-101151-G8Zb
10:13:09 INFO | (qemu) (Process terminated with status 0)
10:13:10 INFO | Unit te

Re: [PATCH 1/5] kvm tools: Add BIOS INT10 handler

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

> INT10 handler is a basic implementation of BIOS video services.
> 
> The handler implements a VESA interface which is initialized at
> the very beginning of loading the kernel.
> 
> Signed-off-by: John Floren 
> Signed-off-by: Sasha Levin 

Btw., the signoff chain looks broken - this will look odd in Git.

If you took most of this from John then please put this in the first line of 
the patch:

  From: John Floren 

That way John will be marked by Git as the author and you are the patch 
maintainer who nursed along the patch.

If you made significant changes to the patch (such as splitting it off a larger 
patch, cleaning it up, etc.) you can mark this before your SOB entry:

 Signed-off-by: John Floren 
 [ split up the patch and cleaned it up ]
 Signed-off-by: Sasha Levin 

If you made so many changes to a patch that you can reasonably be called the 
main author, you can take the From line yourself and mark John's first version as:

 Originally-From: John Floren 
 Signed-off-by: Sasha Levin 

If John has put copyright notices into the file then those should be preserved, 
and you can add yours as well, if you so wish.

Thanks,

Ingo


Re: [PATCH 0/30] nVMX: Nested VMX, v9

2011-05-23 Thread Joerg Roedel
On Sun, May 22, 2011 at 10:32:39PM +0300, Nadav Har'El wrote:

> At the risk of sounding blasphemous, I'd like to make the case that perhaps
> the current nested-VMX design - regarding the IDT-vectoring-info-field
> handling - is actually closer than nested-SVM to the goal of moving clean
> nested-supporting logic into x86.c, instead of having ad-hoc, unnatural,
> workarounds.

Well, the nested SVM implementation is certainly not perfect in this
regard :)

> Therefore, I believe that the clean solution isn't to leave the original
> non-nested logic that always queues the idt-vectoring-info assuming it will
> be injected, and then if it shouldn't (because we want to exit during entry)
> we need to skip the entry once as a "trick" to avoid this wrong injection.
> 
> Rather, a clean solution is, I think, to recognize that in nested
> virtualization, idt-vectoring-info is a different kind of beast than regular
> injected events, and it needs to be saved at exit time in a different field
> (which will of course be common to SVM and VMX). Only at entry time, after
> the regular injection code (which may cause a nested exit), we can call a
> x86_op to handle this special injection.

Things are complicated either way. If you keep the vectoring-info
separate from the kvm exception queue you need special logic to combine
the vectoring-info and the queue. For example, imagine something is
pending in idt-vectoring info and the intercept causes another
exception for the guest. KVM needs to turn this into the #DF then. When
we just queue the vectoring-info into the exception queue we get this
implicitly without extra code. This is a cleaner way imho.

On the other side, when using the exception queue we need to keep
extra-information for nesting in the queue because an event which is
just re-injected into L2 must not cause a nested vmexit, even if the
exception vector is intercepted by L1. But this is the same for SVM and
VMX so we can do this in generic x86 code. This is not the case when
keeping track of idt-vectoring info separately in architecture code.

Regards,

Joerg



Re: [PATCH 5/5] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Ingo Molnar

* Sasha Levin  wrote:

> @@ -598,6 +608,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
>  
>   kvm__init_ram(kvm);
>  
> + if (vnc) {
> + pthread_t thread;
> +
> + vesa__init(kvm);
> + pthread_create(&thread, NULL, vesa__dovnc, kvm);
> + }
> +

This should be encapsulated better, it should probably be all be done within 
vesa__init() and the only kv_cmd_run() exposure should be:

vesa__init(kvm);

vesa__init() would wrap to an empty inline function if the library prereqs are 
not present.

Thanks,

Ingo


[PATCH 5/5] kvm tools: Initialize and use VESA and VNC

2011-05-23 Thread Sasha Levin
Requirements - Kernel compiled with:
CONFIG_FB_BOOT_VESA_SUPPORT=y
CONFIG_FB_VESA=y
CONFIG_FRAMEBUFFER_CONSOLE=y

Start VNC server by starting kvm tools with "--vnc".
Connect to the VNC server by running: "vncviewer :0".

Since there is no support for input devices at this time,
it may be useful starting kvm tools with an additional
' -p "console=ttyS0" ' parameter so that it would be possible
to use a serial console alongside with a graphic one.

Signed-off-by: John Floren 
Signed-off-by: Sasha Levin 
---
 tools/kvm/kvm-run.c |   21 +++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index f7de0fb..5acddb2 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* header files for gitish interface  */
 #include 
@@ -67,6 +68,7 @@ static const char *virtio_9p_dir;
 static bool single_step;
 static bool readonly_image[MAX_DISK_IMAGES];
 static bool virtio_rng;
+static bool vnc;
 extern bool ioport_debug;
 extern int  active_console;
 
@@ -111,6 +113,7 @@ static const struct option options[] = {
OPT_STRING('\0', "kvm-dev", &kvm_dev, "kvm-dev", "KVM device file"),
OPT_STRING('\0', "virtio-9p", &virtio_9p_dir, "root dir",
"Enable 9p over virtio"),
+   OPT_BOOLEAN('\0', "vnc", &vnc, "Enable VNC framebuffer"),
 
OPT_GROUP("Kernel options:"),
OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -414,6 +417,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
char *hi;
int i;
void *ret;
+   u16 vidmode = 0;
 
signal(SIGALRM, handle_sigalrm);
signal(SIGQUIT, handle_sigquit);
@@ -512,7 +516,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
kvm->nrcpus = nrcpus;
 
memset(real_cmdline, 0, sizeof(real_cmdline));
-   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1 console=ttyS0 earlyprintk=serial");
+   strcpy(real_cmdline, "notsc noapic noacpi pci=conf1");
+   if (vnc) {
+   strcat(real_cmdline, " video=vesafb:ypan console=tty0");
+   vidmode = 0x312;
+   } else {
+   strcat(real_cmdline, " console=ttyS0 earlyprintk=serial");
+   }
strcat(real_cmdline, " ");
if (kernel_cmdline)
strlcat(real_cmdline, kernel_cmdline, sizeof(real_cmdline));
@@ -544,7 +554,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
	printf("  # kvm run -k %s -m %Lu -c %d\n", kernel_filename, ram_size / 1024 / 1024, nrcpus);
 
if (!kvm__load_kernel(kvm, kernel_filename, initrd_filename,
-   real_cmdline))
+   real_cmdline, vidmode))
die("unable to load kernel %s", kernel_filename);
 
kvm->vmlinux= vmlinux_filename;
@@ -598,6 +608,13 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 
kvm__init_ram(kvm);
 
+   if (vnc) {
+   pthread_t thread;
+
+   vesa__init(kvm);
+   pthread_create(&thread, NULL, vesa__dovnc, kvm);
+   }
+
thread_pool__init(nr_online_cpus);
 
for (i = 0; i < nrcpus; i++) {
-- 
1.7.5.rc3



[PATCH 4/5] kvm tools: Update makefile and feature tests

2011-05-23 Thread Sasha Levin
Update feature tests to test for libvncserver.

VESA support doesn't get compiled in unless libvncserver
is installed.

Signed-off-by: John Floren 
Signed-off-by: Sasha Levin 
---
 tools/kvm/Makefile |   11 ++-
 tools/kvm/config/feature-tests.mak |   10 ++
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index e6e8d4e..2ebc86c 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -58,6 +58,14 @@ ifeq ($(has_bfd),y)
LIBS+= -lbfd
 endif
 
+FLAGS_VNCSERVER=$(CFLAGS) -lvncserver
+has_vncserver := $(call try-cc,$(SOURCE_VNCSERVER),$(FLAGS_VNCSERVER))
+ifeq ($(has_vncserver),y)
+   CFLAGS  += -DCONFIG_HAS_VNCSERVER
+   OBJS+= hw/vesa.o
+   LIBS+= -lvncserver
+endif
+
 DEPS   := $(patsubst %.o,%.d,$(OBJS))
 
 # Exclude BIOS object files from header dependencies.
@@ -153,9 +161,10 @@ bios/bios.o: bios/bios.S bios/bios-rom.bin
 bios/bios-rom.bin: bios/bios-rom.S bios/e820.c
$(E) "  CC  " $@
$(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/e820.c -o bios/e820.o
+   $(Q) $(CC) -include code16gcc.h $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/int10.c -o bios/int10.o
$(Q) $(CC) $(CFLAGS) $(BIOS_CFLAGS) -c -s bios/bios-rom.S -o bios/bios-rom.o
$(E) "  LD  " $@
-   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o bios/e820.o
+   $(Q) ld -T bios/rom.ld.S -o bios/bios-rom.bin.elf bios/bios-rom.o bios/e820.o bios/int10.o
$(E) "  OBJCOPY " $@
$(Q) objcopy -O binary -j .text bios/bios-rom.bin.elf bios/bios-rom.bin
$(E) "  NM  " $@
diff --git a/tools/kvm/config/feature-tests.mak b/tools/kvm/config/feature-tests.mak
index 6170fd2..0801b54 100644
--- a/tools/kvm/config/feature-tests.mak
+++ b/tools/kvm/config/feature-tests.mak
@@ -126,3 +126,13 @@ int main(void)
return 0;
 }
 endef
+
+define SOURCE_VNCSERVER
+#include 
+
+int main(void)
+{
+   rfbIsActive((void *)0);
+   return 0;
+}
+endef
-- 
1.7.5.rc3



[PATCH 3/5] kvm tools: Add VESA device

2011-05-23 Thread Sasha Levin
Add a simple VESA device which simply moves a framebuffer
from guest kernel to a VNC server.

VESA device PCI code is very similar to virtio-* PCI code.

Signed-off-by: John Floren 
Signed-off-by: Sasha Levin 
---
 tools/kvm/hw/vesa.c|  106 
 tools/kvm/include/kvm/ioport.h |2 +
 tools/kvm/include/kvm/vesa.h   |   31 +
 tools/kvm/include/kvm/virtio-pci-dev.h |3 +
 4 files changed, 142 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/hw/vesa.c
 create mode 100644 tools/kvm/include/kvm/vesa.h

diff --git a/tools/kvm/hw/vesa.c b/tools/kvm/hw/vesa.c
new file mode 100644
index 000..c1a4c64
--- /dev/null
+++ b/tools/kvm/hw/vesa.c
@@ -0,0 +1,106 @@
+#include "kvm/vesa.h"
+#include "kvm/ioport.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+#include "kvm/pci.h"
+#include "kvm/kvm-cpu.h"
+#include "kvm/irq.h"
+#include "kvm/virtio-pci-dev.h"
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define VESA_QUEUE_SIZE128
+#define VESA_IRQ   14
+
+/*
+ * This "6000" value is pretty much the result of experimentation
+ * It seems that around this value, things update pretty smoothly
+ */
+#define VESA_UPDATE_TIME   6000
+
+u8 videomem[VESA_MEM_SIZE];
+
+static bool vesa_pci_io_in(struct kvm *kvm, u16 port, void *data, int size, u32 count)
+{
+   printf("vesa in port=%u\n", port);
+   return true;
+}
+
+static bool vesa_pci_io_out(struct kvm *kvm, u16 port, void *data, int size, u32 count)
+{
+   printf("vesa out port=%u\n", port);
+   return true;
+}
+
+static struct ioport_operations vesa_io_ops = {
+   .io_in  = vesa_pci_io_in,
+   .io_out = vesa_pci_io_out,
+};
+
+static struct pci_device_header vesa_pci_device = {
+   .vendor_id  = PCI_VENDOR_ID_REDHAT_QUMRANET,
+   .device_id  = PCI_DEVICE_ID_VESA,
+   .header_type= PCI_HEADER_TYPE_NORMAL,
+   .revision_id= 0,
+   .class  = 0x03,
+   .subsys_vendor_id   = PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET,
+   .subsys_id  = PCI_SUBSYSTEM_ID_VESA,
+   .bar[0] = IOPORT_VESA | PCI_BASE_ADDRESS_SPACE_IO,
+   .bar[1] = VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY,
+};
+
+
+void vesa_mmio_callback(u64 addr, u8 *data, u32 len, u8 is_write)
+{
+   if (is_write)
+   memcpy(&videomem[addr - VESA_MEM_ADDR], data, len);
+
+   return;
+}
+
+void vesa__init(struct kvm *kvm)
+{
+   u8 dev, line, pin;
+
+   if (irq__register_device(PCI_DEVICE_ID_VESA, &dev, &pin, &line) < 0)
+   return;
+
+   vesa_pci_device.irq_pin = pin;
+   vesa_pci_device.irq_line = line;
+   pci__register(&vesa_pci_device, dev);
+   ioport__register(IOPORT_VESA, &vesa_io_ops, IOPORT_VESA_SIZE);
+
+   kvm__register_mmio(VESA_MEM_ADDR, VESA_MEM_SIZE, &vesa_mmio_callback);
+}
+
+/*
+ * This starts a VNC server to display the framebuffer.
+ * It's not altogether clear this belongs here rather than in kvm-run.c
+ */
+void *vesa__dovnc(void *v)
+{
+   /*
+* Make a fake argc and argv because the getscreen function
+* seems to want it.
+*/
+   int ac = 1;
+   char av[1][1] = {{0} };
+   rfbScreenInfoPtr server;
+
+   server = rfbGetScreen(&ac, (char **)av, VESA_WIDTH, VESA_HEIGHT, 8, 3, 4);
+   server->frameBuffer = (char *)videomem;
+   server->alwaysShared = TRUE;
+   rfbInitServer(server);
+
+   while (rfbIsActive(server)) {
+   rfbMarkRectAsModified(server, 0, 0, VESA_WIDTH, VESA_HEIGHT);
+   rfbProcessEvents(server, server->deferUpdateTime * VESA_UPDATE_TIME);
+   }
+   return NULL;
+}
+
diff --git a/tools/kvm/include/kvm/ioport.h b/tools/kvm/include/kvm/ioport.h
index 218530c..8253938 100644
--- a/tools/kvm/include/kvm/ioport.h
+++ b/tools/kvm/include/kvm/ioport.h
@@ -7,6 +7,8 @@
 
 /* some ports we reserve for own use */
 #define IOPORT_DBG 0xe0
+#define IOPORT_VESA0xa200
+#define IOPORT_VESA_SIZE   256
 #define IOPORT_VIRTIO_P9   0xb200  /* Virtio 9P device */
 #define IOPORT_VIRTIO_P9_SIZE  256
 #define IOPORT_VIRTIO_BLK  0xc200  /* Virtio block device */
diff --git a/tools/kvm/include/kvm/vesa.h b/tools/kvm/include/kvm/vesa.h
new file mode 100644
index 000..dfa3d941
--- /dev/null
+++ b/tools/kvm/include/kvm/vesa.h
@@ -0,0 +1,31 @@
+#ifndef KVM__VESA_H
+#define KVM__VESA_H
+
+#include 
+
+#define VESA_WIDTH 640
+#define VESA_HEIGHT480
+
+#define VESA_MEM_ADDR  0xd000
+#define VESA_MEM_SIZE  (4*VESA_WIDTH*VESA_HEIGHT)
+#define VESA_BPP   32
+
+struct kvm;
+struct int10args;
+
+#ifdef CONFIG_HAS_VNCSERVER
+void vesa_mmio_callback(u64, u8*, u32, u8);
+void vesa__init(struct kvm *self);
+void *vesa__dovnc(void *);
+#else
+void vesa__init(struct kvm *self)

[PATCH 2/5] kvm tools: Add video mode to kernel initialization

2011-05-23 Thread Sasha Levin
Allow setting video mode in guest kernel.

For possible values see Documentation/fb/vesafb.txt

Signed-off-by: John Floren 
Signed-off-by: Sasha Levin 
---
 tools/kvm/include/kvm/kvm.h |2 +-
 tools/kvm/kvm.c |7 ---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/include/kvm/kvm.h b/tools/kvm/include/kvm/kvm.h
index 3cf6e6c..49ebd95 100644
--- a/tools/kvm/include/kvm/kvm.h
+++ b/tools/kvm/include/kvm/kvm.h
@@ -39,7 +39,7 @@ int kvm__max_cpus(struct kvm *kvm);
 void kvm__init_ram(struct kvm *kvm);
 void kvm__delete(struct kvm *kvm);
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char *kernel_cmdline);
+   const char *initrd_filename, const char *kernel_cmdline, u16 vidmode);
 void kvm__setup_bios(struct kvm *kvm);
 void kvm__start_timer(struct kvm *kvm);
 void kvm__stop_timer(struct kvm *kvm);
diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 4393a41..7284211 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -320,7 +320,7 @@ static int load_flat_binary(struct kvm *kvm, int fd)
 static const char *BZIMAGE_MAGIC   = "HdrS";
 
 static bool load_bzimage(struct kvm *kvm, int fd_kernel,
-   int fd_initrd, const char *kernel_cmdline)
+   int fd_initrd, const char *kernel_cmdline, u16 vidmode)
 {
struct boot_params *kern_boot;
unsigned long setup_sects;
@@ -383,6 +383,7 @@ static bool load_bzimage(struct kvm *kvm, int fd_kernel,
kern_boot->hdr.type_of_loader   = 0xff;
kern_boot->hdr.heap_end_ptr = 0xfe00;
kern_boot->hdr.loadflags|= CAN_USE_HEAP;
+   kern_boot->hdr.vid_mode = vidmode;
 
/*
 * Read initrd image into guest memory
@@ -441,7 +442,7 @@ static bool initrd_check(int fd)
 }
 
 bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
-   const char *initrd_filename, const char *kernel_cmdline)
+   const char *initrd_filename, const char *kernel_cmdline, u16 vidmode)
 {
bool ret;
int fd_kernel = -1, fd_initrd = -1;
@@ -459,7 +460,7 @@ bool kvm__load_kernel(struct kvm *kvm, const char *kernel_filename,
die("%s is not an initrd", initrd_filename);
}
 
-   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline);
+   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline, vidmode);
 
if (initrd_filename)
close(fd_initrd);
-- 
1.7.5.rc3



[PATCH 1/5] kvm tools: Add BIOS INT10 handler

2011-05-23 Thread Sasha Levin
INT10 handler is a basic implementation of BIOS video services.

The handler implements a VESA interface which is initialized at
the very beginning of loading the kernel.

Signed-off-by: John Floren 
Signed-off-by: Sasha Levin 
---
 tools/kvm/bios/bios-rom.S |   56 
 tools/kvm/bios/int10.c|  161 +
 2 files changed, 189 insertions(+), 28 deletions(-)
 create mode 100644 tools/kvm/bios/int10.c

diff --git a/tools/kvm/bios/bios-rom.S b/tools/kvm/bios/bios-rom.S
index 8a53dcd..b636cb8 100644
--- a/tools/kvm/bios/bios-rom.S
+++ b/tools/kvm/bios/bios-rom.S
@@ -27,36 +27,36 @@ ENTRY_END(bios_intfake)
  * We ignore bx settings
  */
 ENTRY(bios_int10)
-   test $0x0e, %ah
-   jne 1f
+   pushw   %fs
+   pushl   %es
+   pushl   %edi
+   pushl   %esi
+   pushl   %ebp
+   pushl   %esp
+   pushl   %edx
+   pushl   %ecx
+   pushl   %ebx
+   pushl   %eax
+
+   movl%esp, %eax
+   /* this is way easier than doing it in assembly */
+   /* just push all the regs and jump to a C handler */
+   callint10handler
+
+   popl%eax
+   popl%ebx
+   popl%ecx
+   popl%edx
+   popl%esp
+   popl%ebp
+   popl%esi
+   popl%edi
+   popl%es
+   popw%fs
 
-/*
- * put char in AL at current cursor and
- * increment cursor position
- */
-putchar:
-   stack_swap
-
-   push %fs
-   push %bx
-
-   mov $VGA_RAM_SEG, %bx
-   mov %bx, %fs
-   mov %cs:(cursor), %bx
-   mov %al, %fs:(%bx)
-   inc %bx
-   test $VGA_PAGE_SIZE, %bx
-   jb putchar_new
-   xor %bx, %bx
-putchar_new:
-   mov %bx, %fs:(cursor)
-
-   pop %bx
-   pop %fs
-
-   stack_restore
-1:
IRET
+
+
 /*
  * private IRQ data
  */
diff --git a/tools/kvm/bios/int10.c b/tools/kvm/bios/int10.c
new file mode 100644
index 000..98205c3
--- /dev/null
+++ b/tools/kvm/bios/int10.c
@@ -0,0 +1,161 @@
+#include "kvm/segment.h"
+#include "kvm/bios.h"
+#include "kvm/util.h"
+#include "kvm/vesa.h"
+#include 
+
+#define VESA_MAGIC ('V' + ('E' << 8) + ('S' << 16) + ('A' << 24))
+
+struct int10args {
+   u32 eax;
+   u32 ebx;
+   u32 ecx;
+   u32 edx;
+   u32 esp;
+   u32 ebp;
+   u32 esi;
+   u32 edi;
+   u32 es;
+};
+
+/* VESA General Information table */
+struct vesa_general_info {
+   u32 signature;  /* 0 Magic number = "VESA" */
+   u16 version;/* 4 */
+   void *vendor_string;/* 6 */
+   u32 capabilities;   /* 10 */
+   void *video_mode_ptr;   /* 14 */
+   u16 total_memory;   /* 18 */
+
+   u8 reserved[236];   /* 20 */
+} __attribute__ ((packed));
+
+
+struct vminfo {
+   u16 mode_attr;  /* 0 */
+   u8  win_attr[2];/* 2 */
+   u16 win_grain;  /* 4 */
+   u16 win_size;   /* 6 */
+   u16 win_seg[2]; /* 8 */
+   u32 win_scheme; /* 12 */
+   u16 logical_scan;   /* 16 */
+
+   u16 h_res;  /* 18 */
+   u16 v_res;  /* 20 */
+   u8  char_width; /* 22 */
+   u8  char_height;/* 23 */
+   u8  memory_planes;  /* 24 */
+   u8  bpp;/* 25 */
+   u8  banks;  /* 26 */
+   u8  memory_layout;  /* 27 */
+   u8  bank_size;  /* 28 */
+   u8  image_planes;   /* 29 */
+   u8  page_function;  /* 30 */
+
+   u8  rmask;  /* 31 */
+   u8  rpos;   /* 32 */
+   u8  gmask;  /* 33 */
+   u8  gpos;   /* 34 */
+   u8  bmask;  /* 35 */
+   u8  bpos;   /* 36 */
+   u8  resv_mask;  /* 37 */
+   u8  resv_pos;   /* 38 */
+   u8  dcm_info;   /* 39 */
+
+   u32 lfb_ptr;/* 40 Linear frame buffer address */
+   u32 offscreen_ptr;  /* 44 Offscreen memory address */
+   u16 offscreen_size; /* 48 */
+
+   u8  reserved[206];  /* 50 */
+};
+
+char oemstring[11] = "KVM VESA";
+u16 modes[2] = { 0x0112, 0x };
+
+static inline void outb(unsigned short port, unsigned char val)
+{
+   asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
+}
+
+/*
+ * It's probably much more useful to make this print to the serial
+ * line rather than print to a non-displayed VGA memory
+ */
+static inline void int10putchar(struct int10args *args)
+{
+   u8 al, ah;
+
+   al = args->eax & 0xFF;
+   ah = (args->eax & 0xFF00) >> 8;
+
+   outb(0x3f8, al);

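An aside on the VESA_MAGIC constant defined in int10.c above: it packs the ASCII string "VESA" into a little-endian 32-bit value, so comparing the general-information `signature` field against it is the same as comparing the four bytes "VESA" in memory. A quick illustrative sketch (Python used here only to show the byte order):

```python
import struct

def vesa_magic():
    # Mirror the C expression: ('V' + ('E' << 8) + ('S' << 16) + ('A' << 24))
    return ord('V') + (ord('E') << 8) + (ord('S') << 16) + (ord('A') << 24)

# On little-endian x86 the 32-bit value lays out in memory as the
# bytes 'V','E','S','A', which is exactly what a guest checks for.
assert vesa_magic() == struct.unpack("<I", b"VESA")[0]
print(hex(vesa_magic()))  # → 0x41534556
```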
[PATCH][RESEND] KVM: Clean up error handling during VCPU creation

2011-05-23 Thread Jan Kiszka
So far kvm_arch_vcpu_setup is responsible for freeing the vcpu struct if
it fails. Move this confusing responsibility back into the hands of
kvm_vm_ioctl_create_vcpu. Only kvm_arch_vcpu_setup of x86 is affected;
all other archs cannot fail.

Signed-off-by: Jan Kiszka 
---
 arch/x86/kvm/x86.c  |5 -
 virt/kvm/kvm_main.c |   11 ++-
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index da48622..aaa3735 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6126,12 +6126,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
if (r == 0)
r = kvm_mmu_setup(vcpu);
vcpu_put(vcpu);
-   if (r < 0)
-   goto free_vcpu;
 
-   return 0;
-free_vcpu:
-   kvm_x86_ops->vcpu_free(vcpu);
return r;
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3962899..8de7208 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1612,18 +1612,18 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 
r = kvm_arch_vcpu_setup(vcpu);
if (r)
-   return r;
+   goto vcpu_destroy;
 
mutex_lock(&kvm->lock);
if (atomic_read(&kvm->online_vcpus) == KVM_MAX_VCPUS) {
r = -EINVAL;
-   goto vcpu_destroy;
+   goto unlock_vcpu_destroy;
}
 
kvm_for_each_vcpu(r, v, kvm)
if (v->vcpu_id == id) {
r = -EEXIST;
-   goto vcpu_destroy;
+   goto unlock_vcpu_destroy;
}
 
BUG_ON(kvm->vcpus[atomic_read(&kvm->online_vcpus)]);
@@ -1633,7 +1633,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
r = create_vcpu_fd(vcpu);
if (r < 0) {
kvm_put_kvm(kvm);
-   goto vcpu_destroy;
+   goto unlock_vcpu_destroy;
}
 
kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
@@ -1647,8 +1647,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
mutex_unlock(&kvm->lock);
return r;
 
-vcpu_destroy:
+unlock_vcpu_destroy:
mutex_unlock(&kvm->lock);
+vcpu_destroy:
kvm_arch_vcpu_destroy(vcpu);
return r;
 }
-- 
1.7.1


[PATCH] qemu-kvm: Fix non-ISA IRQ routing in kernel irqchip mode

2011-05-23 Thread Jan Kiszka
Merge regression of d1dcf63406: The KVM i8259 believes it is also an
IOAPIC and takes all GSIs. Until we refactor this, work around it by
avoiding the isa_irq_handler dispatcher in kernel irqchip mode.

Signed-off-by: Jan Kiszka 
---
 hw/pc_piix.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 66c5e04..7af03fa 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -131,7 +131,11 @@ static void pc_init1(ram_addr_t ram_size,
 if (pci_enabled) {
 ioapic_init(isa_irq_state);
 }
-isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
+if (!(kvm_enabled() && kvm_irqchip_in_kernel())) {
+isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
+} else {
+isa_irq = i8259;
+}
 
 if (pci_enabled) {
 if (!xen_enabled()) {
-- 
1.7.1


[GIT PULL] KVM updates for 2.6.40

2011-05-23 Thread Avi Kivity

Linus, please pull from:

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.40

to receive the KVM updates for the 2.6.40 cycle.  Changes this time
include emulator correctness (segment checks, nested SVM intercepts),
16-byte MMIO, VIA CPU feature support, virtual TSC rate for newer AMD
processors, better RCU integration, and performance improvements.


Changelog/diffstat (includes already-merged RCU commits):

Avi Kivity (56):
  KVM: Use kvm_get_rflags() and kvm_set_rflags() instead of the raw versions

  KVM: VMX: Optimize vmx_get_rflags()
  KVM: VMX: Optimize vmx_get_cpl()
  KVM: VMX: Cache cpl
  KVM: VMX: Avoid vmx_recover_nmi_blocking() when unneeded
  KVM: VMX: Qualify check for host NMI
  KVM: VMX: Refactor vmx_complete_atomic_exit()
  KVM: VMX: Don't VMREAD VM_EXIT_INTR_INFO unconditionally
  KVM: VMX: Use cached VM_EXIT_INTR_INFO in handle_exception
  KVM: VMX: simplify NMI mask management
  KVM: extend in-kernel mmio to handle >8 byte transactions
  KVM: Split mmio completion into a function
  KVM: 16-byte mmio support
  KVM: x86 emulator: do not munge rep prefix
  KVM: x86 emulator: define callbacks for using the guest fpu within the emulator
  KVM: x86 emulator: Specialize decoding for insns with 66/f2/f3 prefixes

  KVM: x86 emulator: SSE support
  KVM: x86 emulator: implement movdqu instruction (f3 0f 6f, f3 0f 7f)
  KVM: x86 emulator: add framework for instruction intercepts
  KVM: x86 emulator: add SVM intercepts
  KVM: x86 emulator: Re-add VendorSpecific tag to VMMCALL insn
  KVM: x86 emulator: Drop EFER.SVME requirement from VMMCALL
  KVM: x86 emulator: Add helpers for memory access using segmented addresses

  KVM: x86 emulator: move invlpg emulation into a function
  KVM: x86 emulator: change address linearization to return an error code
  KVM: x86 emulator: pass access size and read/write intent to linearize()

  KVM: x86 emulator: move linearize() downwards
  KVM: x86 emulator: move desc_limit_scaled()
  KVM: x86 emulator: implement segment permission checks
  KVM: x86 emulator: whitespace cleanups
  KVM: x86 emulator: drop vcpu argument from memory read/write callbacks

  KVM: x86 emulator: drop vcpu argument from pio callbacks
  KVM: x86 emulator: drop vcpu argument from segment/gdt/idt callbacks
  KVM: x86 emulator: drop vcpu argument from cr/dr/cpl/msr callbacks
  KVM: x86 emulator: drop vcpu argument from intercept callback
  KVM: x86 emulator: avoid using ctxt->vcpu in check_perm() callbacks
  KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()
  KVM: x86 emulator: drop use of is_long_mode()
  KVM: x86 emulator: Replace calls to is_pae() and is_paging with ->get_cr()

  KVM: x86 emulator: emulate CLTS internally
  KVM: x86 emulator: make emulate_invlpg() an emulator callback
  KVM: x86 emulator: add new ->halt() callback
  KVM: x86 emulator: add ->fix_hypercall() callback
  KVM: x86 emulator: add new ->wbinvd() callback
  KVM: Avoid using x86_emulate_ctxt.vcpu
  KVM: x86 emulator: drop x86_emulate_ctxt::vcpu
  KVM: x86 emulator: move 0F 01 sub-opcodes into their own functions
  KVM: x86 emulator: Don't force #UD for 0F 01 /5
  KVM: x86 emulator: Use opcode::execute for 0F 01 opcode
  KVM: SVM: Get rid of x86_intercept_map::valid
  KVM: MMU: Add unlikely() annotations to walk_addr_generic()
  KVM: x86 emulator: consolidate group handling
  KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
  KVM: x86 emulator: consolidate segment accessors
  KVM: VMX: Cache vmcs segment fields
  Merge commit '29ce83181dd757d3116bf774aafffc4b6b20' into next

Bharat Bhushan (1):
  KVM: PPC: Fix issue clearing exit timing counters

brill...@viatech.com.cn (1):
  KVM: Add CPUID support for VIA CPU

Clemens Noss (1):
  KVM: x86 emulator: avoid calling wbinvd() macro

Duan Jiong (2):
  KVM: remove useless function declarations from file arch/x86/kvm/irq.h

  KVM: remove useless function declaration kvm_inject_pit_timer_irqs()

Glauber Costa (1):
  KVM: expose async pf through our standard mechanism

Gleb Natapov (8):
  KVM: x86: better fix for race between nmi injection and enabling nmi window

  KVM: x86 emulator: do not open code return values from the emulator
  KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu

  KVM: mmio_fault_cr2 is not used
  KVM: emulator: Propagate fault in far jump emulation
  KVM: Fix compound mmio
  KVM: call cache_all_regs() only once during instruction emulation
  KVM: make guest mode entry to be rcu quiescent state

Jan Kiszka (2):
  KVM: SVM: Remove unused svm_features
  KVM: VMX: Ensure that vmx_create_vcpu always returns proper error

Jeff Mahoney (2):
  KVM: Fix off by one in kvm_for_each_vcpu

[PATCH 4/5] KVM test: setup tap fd and pass it to qemu-kvm v2

2011-05-23 Thread Lucas Meneghel Rodrigues
We used to use qemu-ifup to manage the tap device, which has several
limitations:

1) If we want to specify a bridge, we must create a customized
qemu-ifup file, as the default script always matches the first bridge.
2) It's hard to add support for macvtap devices.

So this patch lets the kvm subtest control tap creation and setup, then
passes the fd to qemu-kvm. Users can specify the bridge they want to use
in the configuration file.

The original autoconfiguration was replaced by the private bridge setup.

Changes from v1:
* Combined the private bridge config and TAP fd into one patchset,
dropped the "auto" mode
* Close TAP fds on VM.destroy() (thanks to Amos Kong for finding
the problem)

Signed-off-by: Jason Wang 
Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/tests/kvm/scripts/qemu-ifup |   11 --
 client/virt/kvm_vm.py  |   60 
 client/virt/virt_utils.py  |   11 --
 3 files changed, 47 insertions(+), 35 deletions(-)
 delete mode 100755 client/tests/kvm/scripts/qemu-ifup

diff --git a/client/tests/kvm/scripts/qemu-ifup b/client/tests/kvm/scripts/qemu-ifup
deleted file mode 100755
index c4debf5..000
--- a/client/tests/kvm/scripts/qemu-ifup
+++ /dev/null
@@ -1,11 +0,0 @@
-#!/bin/sh
-
-# The following expression selects the first bridge listed by 'brctl show'.
-# Modify it to suit your needs.
-switch=$(/usr/sbin/brctl show | awk 'NR==2 { print $1 }')
-
-/bin/echo 1 > /proc/sys/net/ipv6/conf/${switch}/disable_ipv6
-/sbin/ifconfig $1 0.0.0.0 up
-/usr/sbin/brctl addif ${switch} $1
-/usr/sbin/brctl setfd ${switch} 0
-/usr/sbin/brctl stp ${switch} off
diff --git a/client/virt/kvm_vm.py b/client/virt/kvm_vm.py
index 57fc61b..5b1a27b 100644
--- a/client/virt/kvm_vm.py
+++ b/client/virt/kvm_vm.py
@@ -7,7 +7,7 @@ Utility classes and functions to handle Virtual Machine creation using qemu.
 import time, os, logging, fcntl, re, commands, glob
 from autotest_lib.client.common_lib import error
 from autotest_lib.client.bin import utils
-import virt_utils, virt_vm, kvm_monitor, aexpect
+import virt_utils, virt_vm, virt_test_setup, kvm_monitor, aexpect
 
 
 class VM(virt_vm.BaseVM):
@@ -41,6 +41,7 @@ class VM(virt_vm.BaseVM):
 self.pci_assignable = None
 self.netdev_id = []
 self.device_id = []
+self.tapfds = []
 self.uuid = None
 
 
@@ -231,19 +232,17 @@ class VM(virt_vm.BaseVM):
 cmd += ",id='%s'" % device_id
 return cmd
 
-def add_net(help, vlan, mode, ifname=None, script=None,
-downscript=None, tftp=None, bootfile=None, hostfwd=[],
-netdev_id=None, netdev_extra_params=None):
+def add_net(help, vlan, mode, ifname=None, tftp=None, bootfile=None,
+hostfwd=[], netdev_id=None, netdev_extra_params=None,
+tapfd=None):
 if has_option(help, "netdev"):
 cmd = " -netdev %s,id=%s" % (mode, netdev_id)
 if netdev_extra_params:
 cmd += ",%s" % netdev_extra_params
 else:
 cmd = " -net %s,vlan=%d" % (mode, vlan)
-if mode == "tap":
-if ifname: cmd += ",ifname='%s'" % ifname
-if script: cmd += ",script='%s'" % script
-cmd += ",downscript='%s'" % (downscript or "no")
+if mode == "tap" and tapfd:
+cmd += ",fd=%d" % tapfd
 elif mode == "user":
 if tftp and "[,tftp=" in help:
 cmd += ",tftp='%s'" % tftp
@@ -413,20 +412,22 @@ class VM(virt_vm.BaseVM):
 qemu_cmd += add_nic(help, vlan, nic_params.get("nic_model"), mac,
 device_id, netdev_id, nic_params.get("nic_extra_params"))
 # Handle the '-net tap' or '-net user' or '-netdev' part
-script = nic_params.get("nic_script")
-downscript = nic_params.get("nic_downscript")
 tftp = nic_params.get("tftp")
-if script:
-script = virt_utils.get_path(root_dir, script)
-if downscript:
-downscript = virt_utils.get_path(root_dir, downscript)
 if tftp:
 tftp = virt_utils.get_path(root_dir, tftp)
-qemu_cmd += add_net(help, vlan, nic_params.get("nic_mode", "user"),
-vm.get_ifname(vlan),
-script, downscript, tftp,
+if nic_params.get("nic_mode") == "tap":
+try:
+tapfd = vm.tapfds[vlan]
+except IndexError:
+tapfd = None
+else:
+tapfd = None
+qemu_cmd += add_net(help, vlan,
+nic_params.get("nic_mode", "user"),
+vm.get_ifname(vlan), tftp,
 nic_params.get("bootp"), redirs, netdev_id,
- 

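For context on how a tap fd like the one passed via ",fd=%d" above gets created in the first place: it comes from the TUNSETIFF ioctl on /dev/net/tun, using the IFF_* constants this series adds to virt_utils.py. A rough sketch — `tun_ifreq` and `open_tapfd` are illustrative names, not the series' actual helpers:

```python
import fcntl, os, struct

# Constants from linux/if_tun.h (same values virt_utils.py defines)
TUNSETIFF = 0x400454ca
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000

def tun_ifreq(ifname, flags):
    # struct ifreq as TUNSETIFF expects it: 16-byte device name + short flags
    return struct.pack("16sH", ifname.encode(), flags)

def open_tapfd(ifname):
    """Illustrative: create/attach a tap device and return its fd (needs root)."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, tun_ifreq(ifname, IFF_TAP | IFF_NO_PI))
    return fd
```

The resulting fd would then be attached to the bridge and handed to qemu-kvm as `-net tap,fd=N` (or via `-netdev`), which is what the add_net() change above enables.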
[PATCH 2/2] KVM Test: Add a subtest lvm

2011-05-23 Thread Lucas Meneghel Rodrigues
From: Qingtang Zhou 

Changes from v1:
* Made the test use more current kvm autotest api, namely:
 - Error contexts, and session.cmd for shorter, cleaner code
 - Removed the pre command, as the functionality needed for image_create
   was implemented in the previous patch

Signed-off-by: Lucas Meneghel Rodrigues 

This test sets up an LVM volume over two images, then formats the volume
and finally checks the filesystem using fsck.

Signed-off-by: Yolkfull Chow 

Remove the disk fill-up step.
Add a 'clean' parameter that can skip the umount and volume-removal
commands, letting this case be reused by a following benchmark or stress
test. Add dbench to the lvm tests.

Signed-off-by: Jason Wang 

This test depends on fillup_disk test and ioquit test.
Signed-off-by: Qingtang Zhou 
---
 client/tests/kvm/tests_base.cfg.sample |   48 ++
 client/virt/tests/lvm.py   |   84 
 2 files changed, 132 insertions(+), 0 deletions(-)
 create mode 100644 client/virt/tests/lvm.py

diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index 5713513..d1a188d 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -879,6 +879,46 @@ variants:
fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1 oflag=direct"
 kill_vm = yes
 
+- lvm:
+only Linux
+images += ' stg1 stg2'
+image_name_stg1 = storage_4k
+image_cluster_size_stg1 = 4096
+image_size_stg1 = 1G
+image_format_stg1 = qcow2
+image_name_stg2 = storage_64k
+image_cluster_size_stg2 = 65536
+image_size_stg2 = 1G
+image_format_stg2 = qcow2
+guest_testdir = /mnt
+disks = "/dev/sdb /dev/sdc"
+kill_vm = no
+post_command_noncritical = no
+variants:
+lvm_create:
+type = lvm
+force_create_image_stg1 = yes
+force_create_image_stg2 = yes
+clean = no
+lvm_fill: lvm_create
+type = fillup_disk
+force_create_image_stg1 = no
+force_create_image_stg2 = no
+guest_testdir = /mnt/kvm_test_lvm
+fillup_timeout = 120
+fillup_size = 20
+fillup_cmd = "dd if=/dev/zero of=%s/fillup.%d bs=%dM count=1 oflag=direct"
+lvm_ioquit: lvm_create
+type = ioquit
+force_create_image_stg1 = no
+force_create_image_stg2 = no
+kill_vm = yes
+background_cmd = "for i in 1 2 3 4; do (dd if=/dev/urandom of=/mnt/kvm_test_lvm/file bs=102400 count=1000 &); done"
+check_cmd = pgrep dd
+clean = yes
+remove_image_stg1 = yes
+remove_image_stg2 = yes
+
 - ioquit:
 only Linux
 type = ioquit
@@ -1656,6 +1696,8 @@ variants:
 md5sum_1m_cd1 = 127081cbed825d7232331a2083975528
 fillup_disk:
fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
+lvm.lvm_fill:
+fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
 
 - 4.7.x86_64:
 no setup autotest
@@ -1677,6 +1719,8 @@ variants:
 md5sum_1m_cd1 = 58fa63eaee68e269f4cb1d2edf479792
 fillup_disk:
fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
+lvm.lvm_fill:
+fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
 
 - 4.8.i386:
 no setup autotest
@@ -1696,6 +1740,8 @@ variants:
 sys_path = "/sys/class/net/%s/driver"
 fillup_disk:
fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
+lvm.lvm_fill:
+fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
 
 
 - 4.8.x86_64:
@@ -1716,6 +1762,8 @@ variants:
 sys_path = "/sys/class/net/%s/driver"
 fillup_disk:
fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
+lvm.lvm_fill:
+fillup_cmd = "dd if=/dev/zero of=/%s/fillup.%d bs=%dM count=1"
 
 
 - 5.3.i386:
diff --git a/client/virt/tests/lvm.py b/client/virt/tests/lvm.py
new file mode 100644
index 000..d171747
--- /dev/null
+++ b/client/virt/tests/lvm.py
@@ -0,0 +1,84 @@
+import logging, os
+from autotest_lib.client.common_lib import error
+
+
+@error.context_aware
+def mount_lv(lv_path, session):
+error.context("mount

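The guest-side work the lvm test drives over the two extra disks (`/dev/sdb /dev/sdc` in the config above) boils down to pvcreate → vgcreate → lvcreate → mkfs, followed later by the fsck check. A hedged sketch of that command sequence — the helper name, volume names, and the size are illustrative, not the test's actual values:

```python
def lvm_setup_cmds(disks, vg="vg_kvm_test", lv="lv_test", fs="ext3"):
    """Build the shell commands a guest session would run to put an LVM
    volume over several disks (illustrative names, not the test's own)."""
    disk_list = " ".join(disks)
    lv_path = "/dev/%s/%s" % (vg, lv)
    return [
        "pvcreate %s" % disk_list,                # mark each disk as a PV
        "vgcreate %s %s" % (vg, disk_list),       # group the PVs into one VG
        "lvcreate -L 2000M -n %s %s" % (lv, vg),  # carve out a logical volume
        "mkfs.%s %s" % (fs, lv_path),             # format it
    ]

for cmd in lvm_setup_cmds(["/dev/sdb", "/dev/sdc"]):
    print(cmd)
```

Each command would run via session.cmd() inside the guest; the 'clean' knob from the commit message then decides whether the matching umount and volume-removal teardown runs afterwards.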
[PATCH 1/2] client.virt.virt_vm: Make it possible to specify cluster size for image

2011-05-23 Thread Lucas Meneghel Rodrigues
For some tests, we need to specify image cluster size for
a given image. Make it possible to specify it so qemu-img
is called with the right parameters. This way we can state
things like:

images += ' stg1 stg2'
image_name_stg1 = storage_4k
image_cluster_size_stg1 = 4096
image_format_stg1 = qcow2
image_name_stg2 = storage_64k
image_cluster_size_stg2 = 65536
image_format_stg2 = qcow2

in the configuration file for a test

Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/virt/virt_vm.py |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/client/virt/virt_vm.py b/client/virt/virt_vm.py
index 983ee02..7236218 100644
--- a/client/virt/virt_vm.py
+++ b/client/virt/virt_vm.py
@@ -218,6 +218,7 @@ def create_image(params, root_dir):
 @note: params should contain:
image_name -- the name of the image file, without extension
image_format -- the format of the image (qcow2, raw etc)
+   image_cluster_size (optional) -- the cluster size for the image
image_size -- the requested size of the image (a string
qemu-img can understand, such as '10G')
 """
@@ -228,6 +229,10 @@ def create_image(params, root_dir):
 format = params.get("image_format", "qcow2")
 qemu_img_cmd += " -f %s" % format
 
+image_cluster_size = params.get("image_cluster_size", None)
+if image_cluster_size is not None:
+qemu_img_cmd += " -o cluster_size=%s" % image_cluster_size
+
 image_filename = get_image_filename(params, root_dir)
 qemu_img_cmd += " %s" % image_filename
 
-- 
1.7.5.1


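Concretely, with this patch the qemu-img invocation for e.g. the stg1 image above picks up `-o cluster_size=4096`. A small sketch paralleling the create_image() builder — the function name and the trailing size handling are assumptions, since the quoted hunk ends before the rest of create_image():

```python
def qemu_img_create_cmd(params, image_filename, qemu_img_path="qemu-img"):
    """Sketch of the command string create_image() assembles (illustrative)."""
    cmd = qemu_img_path + " create"
    cmd += " -f %s" % params.get("image_format", "qcow2")
    image_cluster_size = params.get("image_cluster_size", None)
    if image_cluster_size is not None:
        cmd += " -o cluster_size=%s" % image_cluster_size  # the new optional knob
    cmd += " %s" % image_filename
    cmd += " %s" % params.get("image_size", "10G")  # assumed to follow the filename
    return cmd

params = {"image_format": "qcow2", "image_cluster_size": 4096, "image_size": "1G"}
print(qemu_img_create_cmd(params, "storage_4k.qcow2"))
# → qemu-img create -f qcow2 -o cluster_size=4096 storage_4k.qcow2 1G
```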

Re: [PATCH 2/5] KVM test: Add helpers to control the TAP/bridge

2011-05-23 Thread Lucas Meneghel Rodrigues
On Mon, 2011-05-23 at 14:16 +0800, Amos Kong wrote:
> On Sat, May 21, 2011 at 01:23:27AM -0300, Lucas Meneghel Rodrigues wrote:
> > This patch adds some helpers to assist virt test to setup the bridge or
> > macvtap based guest networking.
> > 
> > Changes from v1:
> >  * Fixed undefined variable errors on the exception class definitions
> > 
> > Signed-off-by: Jason Wang 
> > Signed-off-by: Lucas Meneghel Rodrigues 
> > ---
> >  client/virt/virt_utils.py |  218 
> > +
> >  1 files changed, 218 insertions(+), 0 deletions(-)
> > 
> > diff --git a/client/virt/virt_utils.py b/client/virt/virt_utils.py
> > index 5510c89..96b9c84 100644
> > --- a/client/virt/virt_utils.py
> > +++ b/client/virt/virt_utils.py
> > @@ -6,6 +6,7 @@ KVM test utility functions.
> >  
> >  import time, string, random, socket, os, signal, re, logging, commands, 
> > cPickle
> >  import fcntl, shelve, ConfigParser, threading, sys, UserDict, inspect
> > +import struct
> >  from autotest_lib.client.bin import utils, os_dep
> >  from autotest_lib.client.common_lib import error, logging_config
> >  import rss_client, aexpect
> > @@ -15,6 +16,20 @@ try:
> >  except ImportError:
> >  KOJI_INSTALLED = False
> >  
> > +# From include/linux/sockios.h
> > +SIOCSIFHWADDR = 0x8924
> > +SIOCGIFHWADDR = 0x8927
> > +SIOCSIFFLAGS = 0x8914
> > +SIOCGIFINDEX = 0x8933
> > +SIOCBRADDIF = 0x89a2
> > +# From linux/include/linux/if_tun.h
> > +TUNSETIFF = 0x400454ca
> > +TUNGETIFF = 0x800454d2
> > +TUNGETFEATURES = 0x800454cf
> > +IFF_UP = 0x1
> > +IFF_TAP = 0x0002
> > +IFF_NO_PI = 0x1000
> > +IFF_VNET_HDR = 0x4000
> >  
> >  def _lock_file(filename):
> >  f = open(filename, "w")
> > @@ -36,6 +51,76 @@ def is_vm(obj):
> >  return obj.__class__.__name__ == "VM"
> >  
> >  
> > +class NetError(Exception):
> > +pass
> > +
> > +
> > +class TAPModuleError(NetError):
> > +def __init__(self, devname):
> > +NetError.__init__(self, devname)
> > +self.devname = devname
> > +
> > +def __str__(self):
> > +return "Can't open %s" % self.devname
> > +
> > +class TAPNotExistError(NetError):
> > +def __init__(self, ifname):
> > +NetError.__init__(self, ifname)
> > +self.ifname = ifname
> > +
> > +def __str__(self):
> > +return "Interface %s does not exist" % self.ifname
> > +
> > +
> > +class TAPCreationError(NetError):
> > +def __init__(self, ifname):
> > +NetError.__init__(self, ifname)
> > +self.ifname = ifname
> > +
> > +def __str__(self):
> > +return "Cannot create TAP device %s" % self.ifname
> > +
> > +
> > +class TAPBringUpError(NetError):
> > +def __init__(self, ifname):
> > +NetError.__init__(self, ifname)
> > +self.ifname = ifname
> > +
> > +def __str__(self):
> > +return "Cannot bring up TAP %s" % self.ifname
> > +
> > +
> > +class BRAddIfError(NetError):
> > +def __init__(self, ifname, brname, details):
> > +NetError.__init__(self, ifname, brname, details)
> > +self.ifname = ifname
> > +self.brname = brname
> > +self.details = details
> > +
> > +def __str__(self):
> > +return ("Can not add if %s to bridge %s: %s" %
> > +(self.ifname, self.brname, self.details))
> > +
> > +
> > +class HwAddrSetError(NetError):
> > +def __init__(self, ifname, mac):
> > +NetError.__init__(self, ifname, mac)
> > +self.ifname = ifname
> > +self.mac = mac
> > +
> > +def __str__(self):
> > +return "Can not set mac %s to interface %s" % (self.mac, 
> > self.ifname)
> > +
> > +
> > +class HwAddrGetError(NetError):
> > +def __init__(self, ifname):
> > +NetError.__init__(self, ifname)
> > +self.ifname = ifname
> > +
> > +def __str__(self):
> > +return "Can not get mac of interface %s" % self.ifname
> > +
> > +
> >  class Env(UserDict.IterableUserDict):
> >  """
> >  A dict-like object containing global objects used by tests.
> > @@ -2307,3 +2392,136 @@ def install_host_kernel(job, params):
> >  else:
> >  logging.info('Chose %s, using the current kernel for the host',
> >   install_type)
> > +
> > +
> > +def bridge_auto_detect():
> > +"""
> > +Automatically detect a bridge for tap through brctl.
> > +"""
> > +try:
> > +brctl_output = utils.system_output("ip route list",
> > +   retain_output=True)
> > +brname = re.findall("default.*dev (.*) ", brctl_output)[0]
> > +except:
> > +raise BRAutoDetectError
> > +return brname
> > +
> > +
> > +def if_nametoindex(ifname):
> > +"""
> > +Map an interface name into its corresponding index.
> > +Returns 0 on error, as 0 is not a valid index
> > +
> > +@param ifname: interface name
> > +"""
> > +index = 0
> > +ctrl_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
> > +if
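One note on bridge_auto_detect() above: the regex `"default.*dev (.*) "` is greedy and requires a trailing space, so extra route fields (proto, metric, ...) can leak into or break the capture. A tightened sketch — `default_route_dev` is a hypothetical helper, not part of the patch:

```python
import re

def default_route_dev(ip_route_output):
    """Illustrative: pull the outgoing device of the default route from
    'ip route list' output; \\S+ keeps trailing fields out of the capture."""
    match = re.search(r"^default\s.*\bdev\s+(\S+)", ip_route_output, re.M)
    if match is None:
        raise ValueError("no default route found")
    return match.group(1)

print(default_route_dev("default via 192.168.0.1 dev br0 proto static"))  # → br0
```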