2015-05-22 15:01 GMT+03:00 Ilya Enkovich <[email protected]>:
> 2015-05-22 11:53 GMT+03:00 Ilya Enkovich <[email protected]>:
>> 2015-05-21 22:08 GMT+03:00 Vladimir Makarov <[email protected]>:
>>> So, Ilya, to solve the problem you need to avoid sharing subregs for the
>>> correct LRA/reload work.
>>>
>>>
>>
>> Thanks a lot for your help! I'll fix it.
>>
>> Ilya
>
> I've fixed SUBREG sharing and got a missing spill. I added
> --enable-checking=rtl to check other possible bugs. Spill/fill code
> still seems incorrect because different sizes are used. Shouldn't
> block me though.
>
> .L5:
>         movl    16(%esp), %eax
>         addl    $8, %esi
>         movl    20(%esp), %edx
>         movl    %eax, (%esp)
>         movl    %edx, 4(%esp)
>         call    counter@PLT
>         movq    -8(%esi), %xmm0
>         **movdqa  16(%esp), %xmm2**
>         pand    %xmm0, %xmm2
>         movdqa  %xmm2, %xmm0
>         movd    %xmm2, %edx
>         **movq    %xmm2, 16(%esp)**
>         psrlq   $32, %xmm0
>         movd    %xmm0, %eax
>         orl     %edx, %eax
>         jne     .L5
>
> Thanks,
> Ilya

I was wrong to assume that reloads with the wrong size wouldn't block
me. These reloads require the memory to be aligned, which is not always
the case. Here is what I have in RTL now:

(insn 2 7 3 2 (set (reg/v:DI 93 [ l ])
        (mem/c:DI (reg/f:SI 16 argp) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
     (nil))
...
(insn 27 26 52 6 (set (subreg:V2DI (reg:DI 87 [ D.1822 ]) 0)
        (ior:V2DI (subreg:V2DI (reg:DI 99 [ D.1822 ]) 0)
            (subreg:V2DI (reg/v:DI 93 [ l ]) 0))) test.c:11 3489 {*iorv2di3}
     (expr_list:REG_DEAD (reg:DI 99 [ D.1822 ])
        (expr_list:REG_DEAD (reg/v:DI 93 [ l ])
            (nil))))

After reload I get:

(insn 2 7 75 2 (set (reg/v:DI 0 ax [orig:93 l ] [93])
        (mem/c:DI (plus:SI (reg/f:SI 7 sp)
                (const_int 24 [0x18])) [1 l+0 S8 A32])) test.c:5 89 {*movdi_internal}
     (nil))
(insn 75 2 3 2 (set (mem/c:DI (reg/f:SI 7 sp) [3 %sfp+-16 S8 A64])
        (reg/v:DI 0 ax [orig:93 l ] [93])) test.c:5 89 {*movdi_internal}
     (nil))
...
(insn 27 26 52 6 (set (reg:V2DI 21 xmm0 [orig:87 D.1822 ] [87])
        (ior:V2DI (reg:V2DI 21 xmm0 [orig:99 D.1822 ] [99])
            (mem/c:V2DI (reg/f:SI 7 sp) [3 %sfp+-16 S16 A64]))) test.c:11 3489 {*iorv2di3}

The 'por' instruction requires its memory operand to be aligned, and it
fails in a bigger testcase. There is also a movdqa generated by reload
for an esp-based address. Does this mean I still have some
inconsistencies in the produced RTL? Should I perhaps transform the
loads and stores somehow?

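To make the alignment concern concrete, here is a minimal sketch in C.
It is not part of the patch; the helper names is_aligned and
access_fits_slot are hypothetical, and the slot is modelled only by the
size and alignment that the MEM attributes of insn 75 report (S8 A64,
i.e. 8 bytes with 8-byte alignment). Under that assumption, a 16-byte
access that needs 16-byte alignment (as a legacy SSE 'por' or 'movdqa'
memory operand does) is not guaranteed to be safe:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if ADDR is a multiple of ALIGN bytes.
   ALIGN must be a power of two.  */
static int
is_aligned (uintptr_t addr, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}

/* Hypothetical model of a stack slot check.  The slot guarantees
   SLOT_SIZE bytes at SLOT_ALIGN-byte alignment; the access touches
   ACCESS_SIZE bytes and requires ACCESS_ALIGN-byte alignment.
   Returns nonzero only if the slot's guarantees cover the access.  */
static int
access_fits_slot (size_t slot_size, size_t slot_align,
                  size_t access_size, size_t access_align)
{
  return access_size <= slot_size && access_align <= slot_align;
}
```

For example, access_fits_slot (8, 8, 8, 8) holds for the DImode store
in insn 75, while access_fits_slot (8, 8, 16, 16) does not hold for the
V2DImode read in insn 27. Likewise, an address such as 0x1008 satisfies
is_aligned (addr, 8) but not is_aligned (addr, 16).
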
Thanks,
Ilya
Attachments: ira.log, pr65105.patch

extern long long arr[];

long long
test (long long l, int i1, int i2)
{
  switch (i2)
    {
    case 1:
      return l | arr[i1];
    case 8:
      return l | arr[i1] & arr[i2];
    }
  return l;
}
