Re: rfa (x86): 387<=>sse moves

Dale Johannesen Tue, 26 Jul 2005 15:34:01 -0700

On Jul 26, 2005, at 12:51 AM, Paolo Bonzini wrote:

Dale Johannesen wrote:
With -march=pentium4 -mfpmath=sse -O2, we get an extra move for code like
    double d = atof(foo);
    int i = d;
        call    atof
        fstpl   -8(%ebp)
        movsd   -8(%ebp), %xmm0
        cvttsd2si       %xmm0, %eax
(This is Linux; Darwin is similar.) I think the difficulty is that for
(set (reg/v:DF 58 [ d ]) (reg:DF 8 st)) 64 {*movdf_nointeger}
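For anyone who wants to reproduce the sequence, the two-line fragment above can be wrapped in a complete function. This is only a sketch: the function name `convert` and its parameter are assumptions made for illustration, not taken from the original test case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper around the fragment quoted above; the name
   "convert" is illustrative only. */
int convert(const char *foo)
{
    /* On 32-bit x86 the double return value of atof arrives in st(0). */
    double d = atof(foo);

    /* Truncating double-to-int conversion; with -mfpmath=sse this is
       the step that wants the value in an XMM register. */
    int i = d;

    return i;
}
```

Compiling this with the flags mentioned above plus `-S` (and `-m32` on a 64-bit host, which is an assumption about the build machine) should show whether the store/reload between `atof` and the conversion is still present.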

Try the attached patch. It gave a 3% speedup on -mfpmath=sse for tramp3d. Richard Henderson asked for SPEC testing, then it may go in.

Thanks. That's progress; the cost computation in regclass now figures out that memory is the fastest place to put R58:

  Register 58 costs: AD_REGS:87000 Q_REGS:87000 NON_Q_REGS:87000
  INDEX_REGS:87000 LEGACY_REGS:87000 GENERAL_REGS:87000 FP_TOP_REG:49000
  FP_SECOND_REG:50000 FLOAT_REGS:50000 SSE_REGS:50000 FP_TOP_SSE_REGS:75000
  FP_SECOND_SSE_REGS:75000 FLOAT_SSE_REGS:75000 FLOAT_INT_REGS:87000
  INT_SSE_REGS:91000 FLOAT_INT_SSE_REGS:91000 ALL_REGS:91000 MEM:40000

Unfortunately local-alloc insists on putting it in a register anyway (ST(0) instead of an XMM, but the end codegen is unchanged):

;; Register 58 in 8.
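
The gap between the cost dump and the allocation result can be illustrated with a small stand-alone sketch. It is not GCC code: the struct, the table, and the `cheapest` helper are hypothetical names, and only the cost figures are copied from the dump above. Picking the minimum over all entries gives MEM, while a choice restricted to register classes gives FP_TOP_REG, which is consistent with R58 landing in hard register 8 (st).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of the "Register 58 costs" dump (hypothetical layout). */
struct class_cost {
    const char *name;
    int cost;
    int is_memory;          /* 1 only for the MEM entry */
};

/* Figures copied from the dump; the table itself is illustrative. */
const struct class_cost r58_costs[] = {
    {"AD_REGS", 87000, 0},            {"Q_REGS", 87000, 0},
    {"NON_Q_REGS", 87000, 0},         {"INDEX_REGS", 87000, 0},
    {"LEGACY_REGS", 87000, 0},        {"GENERAL_REGS", 87000, 0},
    {"FP_TOP_REG", 49000, 0},         {"FP_SECOND_REG", 50000, 0},
    {"FLOAT_REGS", 50000, 0},         {"SSE_REGS", 50000, 0},
    {"FP_TOP_SSE_REGS", 75000, 0},    {"FP_SECOND_SSE_REGS", 75000, 0},
    {"FLOAT_SSE_REGS", 75000, 0},     {"FLOAT_INT_REGS", 87000, 0},
    {"INT_SSE_REGS", 91000, 0},       {"FLOAT_INT_SSE_REGS", 91000, 0},
    {"ALL_REGS", 91000, 0},           {"MEM", 40000, 1},
};
const size_t r58_count = sizeof r58_costs / sizeof r58_costs[0];

/* Return the lowest-cost entry; when allow_memory is 0 the MEM entry
   is skipped, mimicking a chooser that only considers registers.
   Ties go to the earlier entry. */
const struct class_cost *
cheapest(const struct class_cost *table, size_t n, int allow_memory)
{
    const struct class_cost *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!allow_memory && table[i].is_memory)
            continue;
        if (best == NULL || table[i].cost < best->cost)
            best = &table[i];
    }
    return best;
}
```

Calling `cheapest(r58_costs, r58_count, 1)` selects MEM at 40000, whereas passing 0 for the last argument selects FP_TOP_REG at 49000.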

I think the RA may be missing the concept that memory might be faster than any possible register... Will dig further.
