Alexander Graf wrote:
>
> On Feb 25, 2008, at 6:40 PM, Avi Kivity wrote:
>
>> Alexander Graf wrote:
>>>
>>> The ebx store was done because of PIC code, which does not allow ebx 
>>> to get clobbered. If we are not in PIC code, =r contains ebx as GPR 
>>> though, so the assumption that ebx needs to be restored was wrong 
>>> then. This new version only enables the store/restore code if i386 
>>> and PIC code are used. There is no need to distinguish between 
>>> x86_64 and i386 for the other cases.
>>>
>>> So does this version work?
>>>
>>
>> It probably will, but it seems fragile to depend on the details of 
>> PIC.  I committed something more generic:
>>
>> #ifdef __x86_64__
>>   asm volatile("cpuid"
>>        : "=a"(vec[0]), "=b"(vec[1]),
>>          "=c"(vec[2]), "=d"(vec[3])
>>        : "0"(function) : "cc");
>
> This code works fine for all targets, including i386. With PIC 
> enabled, gcc registers the ebx registers and complains about this, 
> thus errors out. This is the only special case I am aware of, so I 
> doubt we should treat any case different from the "normal" case but 
> the PIC one.
>
>>
>> #else
>>   asm volatile("pusha \n\t"
>>
>>        "cpuid \n\t"
>>        "mov %%eax, 0(%1) \n\t"
>>        "mov %%ebx, 4(%1) \n\t"
>>        "mov %%ecx, 8(%1) \n\t"
>>        "mov %%edx, 12(%1) \n\t"
>>
>>        "popa"
>>        : "a"(function), "S"(vec) : "memory", "cc");
>> #endif
>
> Basically #ifdef __x86_64__ is even wrong, as the problem is not that 
> too many registers are being used, but that ebx is reserved and can't 
> be saved/restored automatically.
>
> Furthermore I believe that the less assembler is used, the better the 
> code looks. So for cases the snippet above is not required, why use 
> it? Overusing assembler is imho exactly the reason the previous code 
> broke.
>

I agree with all of this, but I think this case is an exception.  gcc 
doesn't behave well with many register constraints on i386 and the PIC 
case shows things are not straightforward.  I want something I can 
forget about.

> There's one more thing I'd like to add here. Gcc optimizes really well 
> when one lets it to. So for this exact case with -O2 used, there are 
> no memory accesses. The vector is simply stored in 4 registers and 
> thus no more movs are required.
>

Again I agree, but host_cpuid() is hardly an optimization target.  you 
can add a usleep(10000) there with no noticable effect.

btw the cpuid instruction execution time itself will likely overwhelm 
any instructions around it (since it is microcoded).

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Reply via email to