On 02/18/13 13:53, David Woodhouse wrote:

> Nevertheless, on my workstation as on yours, we do seem to end up
> executing from the CSM in RAM when we reset. But on my laptop, it
> executes the *ROM* as it should.
>
> This patch 'fixes' it, and I think it might even be correct in itself,
> but I don't think it's a correct fix for the problem we're discussing.
> And I certainly want to know what's different on my laptop that makes it
> work *without* this patch.
>
> Either there's some weirdness with setting the high CS base address, on
> CPU reset. Or perhaps the contents of the memory region at 0xfffffff0
> have *really* been changed along with the sub-1MiB range. Or maybe the
> universe just hates us...

We're ending up in the wrong place, under 1MB (which is consistent with
your "reset the PAMs" patch -- state of PAMs should only matter below
1MB).

I single-stepped qemu-1.3.1 in x86_cpu_reset() /
cpu_x86_load_seg_cache(), and we seem to set the correct base. However,
when I pause the VM while it's spinning in the reset loop and issue
the following in virsh:

# qemu-monitor-command --domain \
  fw-mixed.g-f18xfce2012121716.e-upstream --hmp --cmd \
  cpu 0

# qemu-monitor-command --domain \
  fw-mixed.g-f18xfce2012121716.e-upstream --hmp --cmd \
  info registers

for EIP and CS I get (from cpu_x86_dump_seg_cache(), in the
"HF_CS64_MASK clear" branch):

EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000623
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 0000f300
CS =f000 000f0000 0000ffff 0000f300
    ^    ^        ^        ^
    |    base     limit    flags
    selector

SS =0000 00000000 0000ffff 0000f300
DS =0000 00000000 0000ffff 0000f300
FS =0000 00000000 0000ffff 0000f300
GS =0000 00000000 0000ffff 0000f300
LDT=0000 00000000 0000ffff 00008200
TR =0000 feffd000 00002088 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

(1) The three high nibbles of the CS base are lost: the base should be
ffff0000, but the dump shows 000f0000.
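
To spell out why the lost nibbles put us under 1MB, here is a minimal
sketch of the fetch address arithmetic. The function names are made up
for illustration and are not from qemu; the only inputs are the EIP and
the two CS base values discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Address of the next instruction fetch: cached CS base plus EIP.
 * Paging is off after reset (CR0.PG is clear in the dump), so the
 * linear address is also the physical address.
 */
uint32_t fetch_address(uint32_t cs_base, uint32_t eip)
{
    return cs_base + eip;
}

/* Hypothetical helper: prints both cases from the discussion. */
void show_reset_fetch(void)
{
    uint32_t expected = fetch_address(0xffff0000u, 0xfff0u); /* base we expect after reset */
    uint32_t observed = fetch_address(0x000f0000u, 0xfff0u); /* base from the dump */

    /* The observed address is under 1MB, where the PAM state matters. */
    assert(observed < 0x100000u);

    printf("expected fetch at 0x%08x\n", (unsigned)expected);
    printf("observed fetch at 0x%08x\n", (unsigned)observed);
}
```

Calling show_reset_fetch() from a main() prints the two addresses; the
first is the ROM location just below 4GB, the second is in the range
shadowed by RAM.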


Furthermore, the flags value is (Intel SDM Vol.3A, 3.4.5):

  1 11  1 0011 00000000
  P DPL S type base 23:16
  ^ ^   ^
  | |   descriptor type (1 == code or data segment, 0 == system segment),
  | |   DESC_S_MASK
  | descriptor privilege level (3 == least privileged)
  segment present, DESC_P_MASK

The "type" field depends on the S bit (here 1 == code/data). 0011b means
(see 3.4.5.1):

  0   0 1 1
  D/C E W A
      C R A
  ^   ^ ^ ^
  |   | | accessed, DESC_A_MASK
  |   | |
  |   | for data: 0==r/o, 1==r/w
  |   | for code: 0==exec-only, 1==exec/read, DESC_R_MASK
  |   |
  |   for data: 1==expand down
  |   for code: 1==conforming
  |
  0 == data, 1 == code, DESC_CS_MASK

The type dumped by "info registers" is "data segment, expand up,
read/write, accessed".

I believe the D/C bit (bit 11) should be set, and then 1011b would mean
"code segment, non-conforming, exec/read, accessed".

(2) x86_cpu_reset() does pass DESC_CS_MASK for R_CS, but it doesn't seem
to be present in the dumped value.
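
The decoding above can be sketched in C. This is only an illustration:
the F_* names and the helper functions are local to the sketch and
merely mirror the bit positions in the two diagrams; they are not
copied from the qemu headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions within the dumped flags word, per the diagrams above. */
#define F_P         (1u << 15)  /* segment present */
#define F_DPL_SHIFT 13          /* two-bit privilege level */
#define F_S         (1u << 12)  /* 1 == code or data, 0 == system */
#define F_CODE      (1u << 11)  /* D/C: 0 == data, 1 == code */
#define F_EC        (1u << 10)  /* data: expand down, code: conforming */
#define F_WR        (1u << 9)   /* data: writable, code: readable */
#define F_A         (1u << 8)   /* accessed */

struct seg_flags {
    int present, dpl, s, is_code, ec, wr, accessed;
};

struct seg_flags decode_flags(uint32_t flags)
{
    struct seg_flags f;

    f.present  = !!(flags & F_P);
    f.dpl      = (int)((flags >> F_DPL_SHIFT) & 3u);
    f.s        = !!(flags & F_S);
    f.is_code  = !!(flags & F_CODE);
    f.ec       = !!(flags & F_EC);
    f.wr       = !!(flags & F_WR);
    f.accessed = !!(flags & F_A);
    return f;
}

/* Prints the type in the same words used in the text. */
void print_type(uint32_t flags)
{
    struct seg_flags f = decode_flags(flags);

    /* The E/C, W/R, A reading only applies to code/data segments. */
    assert(f.s);

    if (f.is_code)
        printf("code segment, %s, %s, %s\n",
               f.ec ? "conforming" : "non-conforming",
               f.wr ? "exec/read" : "exec-only",
               f.accessed ? "accessed" : "not accessed");
    else
        printf("data segment, %s, %s, %s\n",
               f.ec ? "expand down" : "expand up",
               f.wr ? "read/write" : "read-only",
               f.accessed ? "accessed" : "not accessed");
}
```

With these definitions, print_type(0xf300) describes the dumped CS as a
data segment, while print_type(0xfb00) (the same value with bit 11 set)
describes a code segment.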


I have no idea what's going on, but vmx_set_segment() in the kernel has
a bunch of hacks for CS && selector == 0xf000 && base == 0xffff0000, and
it seems to be host-processor dependent. E.g. from commit b246dd5d:

        /*
         * Fix segments for real mode guest in hosts that don't have
         * "unrestricted_mode" or it was disabled.
         * This is done to allow migration of the guests from hosts with
         * unrestricted guest like Westmere to older host that don't have
         * unrestricted guest like Nehelem.
         */
        if (vmx->rmode.vm86_active) {
                switch (seg) {
                case VCPU_SREG_CS:
                        vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
                        vmcs_write32(GUEST_CS_LIMIT, 0xffff);
                        if (vmcs_readl(GUEST_CS_BASE) == 0xffff0000)
                                vmcs_writel(GUEST_CS_BASE, 0xf0000);
                        vmcs_write16(GUEST_CS_SELECTOR,
                                     vmcs_readl(GUEST_CS_BASE) >> 4);
                        break;

Also in init_vmcb() [arch/x86/kvm/svm.c] I can see (from commit
d92899a0):

        /*
         * cs.base should really be 0xffff0000, but vmx can't handle that, so
         * be consistent with it.
         *
         * Replace when we have real mode working for vmx.
         */
        save->cs.base = 0xf0000;

Going back to vmx, vmx_vcpu_reset() [arch/x86/kvm/vmx.c]:

        /*
         * GUEST_CS_BASE should really be 0xffff0000, but VT vm86 mode
         * insists on having GUEST_CS_BASE == GUEST_CS_SELECTOR << 4.  Sigh.
         */
        if (kvm_vcpu_is_bsp(&vmx->vcpu)) {
                vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
                vmcs_writel(GUEST_CS_BASE, 0x000f0000);
        } else {
                vmcs_write16(GUEST_CS_SELECTOR, vmx->vcpu.arch.sipi_vector << 8);
                vmcs_writel(GUEST_CS_BASE, vmx->vcpu.arch.sipi_vector << 12);
        }

The leading comment and the main logic date back to commit 6aa8b732
([PATCH] kvm: userspace interface).
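
The constraint named in that comment (base == selector << 4) can be
sketched as follows. This is not kernel code: the helper names are
hypothetical, and the sketch only restates the shifts from the quoted
vmx_vcpu_reset() to show why 0xffff0000 cannot be represented while
0xf0000 and the SIPI-derived values can.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* vm86 rule from the quoted comment: base is the selector shifted left by 4. */
uint32_t vm86_base(uint16_t selector)
{
    return (uint32_t)selector << 4;
}

/* Returns 1 if some 16-bit selector yields this base under the vm86 rule. */
int vm86_can_express(uint32_t base)
{
    return (base & 0xfu) == 0 && (base >> 4) <= 0xffffu;
}

/* Selector and base for a non-BSP vCPU, using the shifts from the excerpt. */
uint16_t sipi_selector(uint8_t vector)
{
    return (uint16_t)(vector << 8);
}

uint32_t sipi_base(uint8_t vector)
{
    return (uint32_t)vector << 12;
}

/* Hypothetical helper: prints the cases discussed in this thread. */
void show_vm86_constraint(void)
{
    /* The BSP values written by the excerpt satisfy the rule. */
    assert(vm86_base(0xf000) == 0x000f0000u);

    printf("0x000f0000 expressible: %d\n", vm86_can_express(0x000f0000u));
    printf("0xffff0000 expressible: %d\n", vm86_can_express(0xffff0000u));
}
```

The SIPI branch satisfies the same rule by construction, since
(vector << 8) << 4 equals vector << 12.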

(3) I wanted to ask you whether your laptop CPU is "more modern" than
your workstation CPU, but from your other email I guess they're indeed
different.

Laszlo
