Yang, Sheng wrote:
> These patches enable memory mapped TPR shadow (FlexPriority).
>
> Since TPR is accessed very frequently by 32bit Windows, especially SMP
> guest, with FlexPriority enabled, we saw significant performance gain.
>
> The issue is: FlexPriority needs to add a memory slot to the vm to make
> shadow work with APIC access page.
>
> We don't like the idea to add a memory slot, but no better choice now.
> Our propose is to add p2m table to KVM, while seems this is still a long
> way to go.
>
> BTW: I didn't use the offset(or other info) provide by CPU when handling
> APIC access vmexit. Instead, I used a bit in cmd_type(including
> no_decode) to tell emulator decode memory operand by itself when
> necessary. That's because I only got the guest physical address when
> handling APIC access vmexit, but emulator need a guest virtual address
> to fit its flow. I have tried some ways, and current solution seems the
> most proper one.
>
>   

> These patches enable memory mapped TPR shadow (FlexPriority).
>
> Since TPR is accessed very frequently by 32bit Windows, especially SMP
> guest, with FlexPriority enabled, we saw significant performance gain.
>
> The issue is: FlexPriority needs to add a memory slot to the vm to make
> shadow work with APIC access page.
>
> We don't like the idea to add a memory slot, but no better choice now.
> Our propose is to add p2m table to KVM, while seems this is still a long
> way to go.
>
> BTW: I didn't use the offset(or other info) provide by CPU when handling
> APIC access vmexit. Instead, I used a bit in cmd_type(including
> no_decode) to tell emulator decode memory operand by itself when
> necessary. That's because I only got the guest physical address when
> handling APIC access vmexit, but emulator need a guest virtual address
> to fit its flow. I have tried some ways, and current solution seems the
> most proper one.
>
> +     struct kvm_memory_region apic_memory = {
> +             .slot = 5,
> +             .memory_size = PAGE_SIZE,
> +             .guest_phys_addr = apicmem,
> +     };
>  
>       if (memory >= pcimem)
>               extended_memory.memory_size = pcimem - exmem;
> @@ -302,9 +308,16 @@ int kvm_create(kvm_context_t kvm, unsigned long
> phys_mem_bytes, void **vm_mem)
>               }
>       }
>  
> +     r = ioctl(fd, KVM_SET_MEMORY_REGION, &apic_memory);
> +     if (r == -1) {
> +             fprintf(stderr, "kvm_create_memory_region: %m\n");
> +             return -1;
> +     }
>   

Older kernels support only 4 memory slots, so you need to tolerate
failures here (this isn't an issue for large memory, because there's no
way the older kernel can run with large memory, so you can't continue. 
but new userspace should be able to run with an older kernel if is not
using newer features.

I don't like userspace involvement in this.  Perhaps we can have a
memory slot controlled by the kernel for this?  It would be activated by
the feature, so it we won't have it on AMD or when the feature isn't
available.

It can also be just a special case in gfn_to_page.

> ns(-) diff --git a/drivers/kvm/irq.h b/drivers/kvm/irq.h index
> 11fc014..afbfa0c 100644 --- a/drivers/kvm/irq.h +++
> b/drivers/kvm/irq.h @@ -118,6 +118,8 @@ struct kvm_lapic { struct
> kvm_vcpu *vcpu; struct page *regs_page; void *regs; + struct page
> *apic_access_page; + hpa_t apic_access_hpa; };

The second variable is redundant; just use page_address().

> diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c index
> cecdb1b..0ebae4c 100644 --- a/drivers/kvm/kvm_main.c +++
> b/drivers/kvm/kvm_main.c @@ -1080,14 +1080,19 @@ static int
> emulator_read_emulated(unsigned long addr, memcpy(val,
> vcpu->mmio_data, bytes); vcpu->mmio_read_completed = 0; return
> X86EMUL_CONTINUE; - } else if (emulator_read_std(addr, val, bytes,
> vcpu) - == X86EMUL_CONTINUE) - return X86EMUL_CONTINUE; + } gpa =
> vcpu->mmu.gva_to_gpa(vcpu, addr); + if ((gpa & PAGE_MASK) ==
> 0xfee00000) + goto mmio; +

The guest can change the apic base address.  Different vcpus can have
different addresses.

> + if ((gpa & PAGE_MASK) == 0xfee00000) + goto mmio; +

Same here.

> +static inline int cpu_has_secondary_exec_ctrls(void) +{ + return
> (vmcs_config.cpu_based_exec_ctrl & \ +
> CPU_BASED_ACTIVATE_SECONDARY_CONTROLS); +}

We aren't in a macro there's no \
need for a backslash.

> + +static inline int cpu_has_vmx_virtualize_apic_accesses(void) +{ +
> return (vmcs_config.cpu_based_2nd_exec_ctrl & \ +
> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES); +}

ditto

> + +static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
> +{ + return ((cpu_has_vmx_virtualize_apic_accesses()) && \ +
> (irqchip_in_kernel(kvm))); +} +

ditto

> +static int handle_apic_access(struct kvm_vcpu *vcpu, struct kvm_run
> *kvm_run) +{ + u64 exit_qualification; + enum emulation_result er; +
> unsigned long offset; + + exit_qualification =
> vmcs_read64(EXIT_QUALIFICATION); + offset = exit_qualification &
> 0xffful; + + er = emulate_instruction(vcpu, kvm_run, 0, 0,
> EMULCMD_DECODE_ADDR); + + if (er != EMULATE_DONE) { + BUG();

BUG() is a big here.  You allow a guest to crash the host.

We can return -EOPNOTSUPP here (that goes for the other places we
emulate and can't recover; we can trap that in kvmctl.c and give better
error messages to users).

> @@ -820,13 +820,17 @@ done_prefixes: c->src.bytes = 4; goto
> srcmem_common; case SrcMem: - c->src.bytes = (c->d & ByteOp) ? 1 : -
> c->op_bytes; + c->src.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;

Don't add unrelated changes please.

> ->b == 0x01 && c->modrm_reg == 7) break; srcmem_common: + if
> (((ctxt->cmd_type & EMULCMD_DECODE_ADDR) != 0) && + (c->modrm_ea ==
> 0)) { + ctxt->cr2 = insn_fetch(u32, c->src.bytes, c->eip); + c->eip -=
> c->src.bytes; + }

Confused.  What is this?  why is eip going backwards?


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Reply via email to