Hello,

I was running a couple of tests on various platforms in preparation
for getting the find_reload_subreg_address patch needed by aarch64 upstream:
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01421.html

This unfortunately uncovered a regression in vect-98-big-array.c on i386.
It seems to me that this is due to a pre-existing problem in the back-end.

The insn being reloaded is:

(insn 17 15 19 3 (set (subreg:V16QI (reg:V4SI 182) 0)
        (unspec:V16QI [
                (mem:V16QI (plus:SI (reg:SI 64 [ ivtmp.97 ])
                        (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32])
            ] UNSPEC_MOVU)) /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))

where reg:V4SI 182 was spilled to the stack.

With the current compiler, this gets reloaded via a straightforward
output reload:

(insn 17 15 195 3 (set (reg:V16QI 21 xmm0)
        (unspec:V16QI [
                (mem:V16QI (plus:SI (reg:SI 0 ax [orig:64 ivtmp.97 ] [64])
                        (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32])
            ] UNSPEC_MOVU)) /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))
(insn 195 17 19 3 (set (mem/c:V16QI (plus:SI (reg/f:SI 7 sp)
                (const_int 112 [0x70])) [4 %sfp+-1232 S16 A128])
        (reg:V16QI 21 xmm0)) /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 901 {*movv16qi_internal}
     (nil))


But with the above patch applied, we get an input reload instead:

(insn 196 15 17 3 (set (reg:V16QI 21 xmm0)
        (mem:V16QI (plus:SI (reg:SI 0 ax [orig:64 ivtmp.97 ] [64])
                (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32])) /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 901 {*movv16qi_internal}
     (nil))
(insn 17 196 19 3 (set (mem/c:V16QI (plus:SI (reg/f:SI 7 sp)
                (const_int 112 [0x70])) [4 %sfp+-1232 S16 A128])
        (unspec:V16QI [
                (reg:V16QI 21 xmm0)
            ] UNSPEC_MOVU)) /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))

Now this is clearly broken: the load from memory is no longer marked
as potentially unaligned.  And indeed, execution of this instruction
traps due to a misaligned access.


However, looking at the underlying pattern:

(define_insn "<sse2>_movdqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "nonimmediate_operand" "=x,m")
        (unspec:VI1 [(match_operand:VI1 1 "nonimmediate_operand" "xm,x")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2 && !(MEM_P (operands[0]) && MEM_P (operands[1]))"

the back-end actually *allows* this transformation just fine.  So reload
has always had both options.  Since both require just one reload, additional
heuristics determine which one gets chosen, and my patch, as a side effect,
slightly changes one of those heuristics ...

But in principle the problem is latent even without the patch.  The back-end
should not permit reload to make changes to the pattern that fundamentally
change the semantics of the resulting instruction ...


I was wondering if the i386 port maintainers could have a look at this
pattern.  Shouldn't we really have two patterns, one to *load* an unaligned
value and one to *store* an unaligned value, and not permit that memory
access to get reloaded?

[I guess a more fundamental change might also be to not have an unspec
in the first place, but only plain moves, and check MEM_ALIGN in the
move insn emitter to see which variant of the instruction is required ...]
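For concreteness, a rough sketch of the two-pattern variant (pattern
names are hypothetical and the output templates are elided; untested,
just to show the memory operand pinned to one side in each pattern):

(define_insn "<sse2>_loaddqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "register_operand" "=x")
        (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2"
  ...)

(define_insn "<sse2>_storedqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "memory_operand" "=m")
        (unspec:VI1 [(match_operand:VI1 1 "register_operand" "x")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2"
  ...)

With the memory reference fixed to one operand, reload could no longer
move the unaligned access to the other side of the insn; it could only
reload the register operand or the address itself.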

Bye,
Ulrich
-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
