I've recently run into a problem with some Linux KVM code that may be a bug in
gcc-4.3.0.  I'm building the KVM modules on Fedora 9 x86_64, and gcc --version
reports:

[kvm-userspace]# gcc --version
gcc (GCC) 4.3.0 20080416 (Red Hat 4.3.0-7)
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The code in question is as follows:

static int svm_set_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 data)
{
        struct vcpu_svm *svm = to_svm(vcpu);

        switch (ecx) {

...
        case MSR_K7_EVNTSEL0:
        case MSR_K7_EVNTSEL1:
        case MSR_K7_EVNTSEL2:
        case MSR_K7_EVNTSEL3:
                printk(KERN_ALERT "ecx is 0x%lx\n", ecx);
                /*
                 * only support writing 0 to the performance counters for now
                 * to make Windows happy. Should be replaced by a real
                 * performance counter emulation later.
                 */
                if (data != 0)
                        goto unhandled;
                break;
        default:
        unhandled:
                return kvm_set_msr_common(vcpu, ecx, data);
        }
        return 0;
}

The *intent* of the code is that, when one of the MSR_K7_EVNTSEL0 through
MSR_K7_EVNTSEL3 cases fires, we call kvm_set_msr_common() only if data is not
0; otherwise we should just break out of the switch and return 0.
Unfortunately, that's not what's actually happening: we end up calling
kvm_set_msr_common() regardless of the value of data.
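To make the expected control flow concrete, here is a minimal userland sketch of the same switch/goto pattern.  It is not the test case from the tarball: set_msr_sketch() and common_handler() are hypothetical stand-ins for svm_set_msr() and kvm_set_msr_common(), and the MSR numbers are the values I believe the kernel headers use for MSR_K7_EVNTSEL0..3.

```c
#include <stdint.h>

/* Assumed values of the K7 event-select MSRs (taken to match the kernel headers). */
#define MSR_K7_EVNTSEL0 0xc0010000u
#define MSR_K7_EVNTSEL1 0xc0010001u
#define MSR_K7_EVNTSEL2 0xc0010002u
#define MSR_K7_EVNTSEL3 0xc0010003u

/* Hypothetical stand-in for kvm_set_msr_common(); returns 1 so callers can
 * tell that the common path was taken. */
int common_handler(unsigned ecx, uint64_t data)
{
        (void)ecx;
        (void)data;
        return 1;
}

/* Same shape as the switch in the report: the EVNTSEL cases return 0 when
 * data is 0 and fall into the default path only for non-zero data. */
int set_msr_sketch(unsigned ecx, uint64_t data)
{
        switch (ecx) {
        case MSR_K7_EVNTSEL0:
        case MSR_K7_EVNTSEL1:
        case MSR_K7_EVNTSEL2:
        case MSR_K7_EVNTSEL3:
                if (data != 0)
                        goto unhandled;
                break;
        default:
        unhandled:
                return common_handler(ecx, data);
        }
        return 0;
}
```

With this sketch, a write of 0 to any of the four EVNTSEL MSRs should return 0 without reaching common_handler(), while a non-zero write, or any other MSR number, should take the common path.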

Disassembling the code around here with objdump -Sr shows this:

    1803:       81 fe 02 01 00 c0       cmp    $0xc0000102,%esi
    1809:       74 6c                   je     1877 <svm_set_msr+0xf4>
    180b:       0f 82 f1 01 00 00       jb     1a02 <svm_set_msr+0x27f>
    1811:       8d 86 00 00 ff 3f       lea    0x3fff0000(%rsi),%eax
    1817:       83 f8 03                cmp    $0x3,%eax
    181a:       0f 87 e2 01 00 00       ja     1a02 <svm_set_msr+0x27f>
    1820:       e9 d8 01 00 00          jmpq   19fd <svm_set_msr+0x27a>

As you can see, the first cmp properly uses the "unsigned ecx" argument, which
is a 32-bit quantity in %esi (per the x86_64 calling convention).  However, by
the time we reach the lea instruction we are using %rsi, which seems wrong:
there could be garbage in the upper 32 bits of the register, causing the
subsequent ja to fire erroneously.
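One way to check this hypothesis is to model the lea/cmp/ja sequence in plain C.  The sketch below assumes the architectural rule that an lea with a 32-bit destination (%eax here) keeps only the low 32 bits of the computed address; lea_model() and ja_taken() are hypothetical helper names, not anything from the KVM source or the tarball.

```c
#include <stdint.h>

/* Models `lea 0x3fff0000(%rsi),%eax`: the address is computed from the full
 * 64-bit %rsi, then truncated to 32 bits when written to %eax. */
uint32_t lea_model(uint64_t rsi)
{
        return (uint32_t)(rsi + 0x3fff0000u);
}

/* Models the following `cmp $0x3,%eax; ja ...`: returns non-zero when the
 * unsigned "above" branch would be taken. */
int ja_taken(uint64_t rsi)
{
        return lea_model(rsi) > 3u;
}
```

Adding 0x3fff0000 modulo 2^32 is the same as subtracting 0xc0010000, so the pair of instructions is a range check for ecx in 0xc0010000..0xc0010003.  Under this model the low 32 bits of the sum depend only on the low 32 bits of %rsi, so it is worth feeding in values with a deliberately dirty upper half and comparing against what the real module does before concluding that the lea itself is at fault.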

I have a test case at http://people.redhat.com/clalance/rsi-test-case.tar.bz2 ;
unfortunately I was never able to reproduce the unexpected behavior in this
userland testcase, but if you compile that code and look at the disassembly,
you can see the problem.

The flags used to compile the code are in the tarball above, but just for
completeness they are:

-Wp,-MD,/root/testcase.o.d -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os
-fno-stack-protector -m64 -mtune=generic -mno-red-zone -mcmodel=kernel
-funit-at-a-time -maccumulate-outgoing-args -pipe -Wno-sign-compare
-fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
-fno-omit-frame-pointer -fno-optimize-sibling-calls -g
-Wdeclaration-after-statement -Wno-pointer-sign


-- 
           Summary: Using %rsi instead of %esi for a u32 in generated code
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: clalance at redhat dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36040
