Hello,
I was running a couple of tests on various platforms in preparation
for getting the find_reload_subreg_address patch needed by aarch64 upstream:
http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01421.html
This unfortunately uncovered a regression in vect-98-big-array.c on i386.
It seems to me that this is due to a pre-existing problem in the back-end.
The insn being reloaded is:
(insn 17 15 19 3 (set (subreg:V16QI (reg:V4SI 182) 0)
        (unspec:V16QI [
                (mem:V16QI (plus:SI (reg:SI 64 [ ivtmp.97 ])
                        (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32])
            ] UNSPEC_MOVU))
     /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))
where reg:V4SI 182 was spilled to the stack.
With the current compiler, this gets reloaded via a straightforward
output reload:
(insn 17 15 195 3 (set (reg:V16QI 21 xmm0)
        (unspec:V16QI [
                (mem:V16QI (plus:SI (reg:SI 0 ax [orig:64 ivtmp.97 ] [64])
                        (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32])
            ] UNSPEC_MOVU))
     /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))
(insn 195 17 19 3 (set (mem/c:V16QI (plus:SI (reg/f:SI 7 sp)
                (const_int 112 [0x70])) [4 %sfp+-1232 S16 A128])
        (reg:V16QI 21 xmm0))
     /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 901 {*movv16qi_internal}
     (nil))
But with the above patch applied, we get an input reload instead:
(insn 196 15 17 3 (set (reg:V16QI 21 xmm0)
        (mem:V16QI (plus:SI (reg:SI 0 ax [orig:64 ivtmp.97 ] [64])
                (const_int 16 [0x10])) [2 MEM[base: _164, offset: 16B]+0 S16 A32]))
     /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 901 {*movv16qi_internal}
     (nil))
(insn 17 196 19 3 (set (mem/c:V16QI (plus:SI (reg/f:SI 7 sp)
                (const_int 112 [0x70])) [4 %sfp+-1232 S16 A128])
        (unspec:V16QI [
                (reg:V16QI 21 xmm0)
            ] UNSPEC_MOVU))
     /home/uweigand/fsf/gcc-head/gcc/testsuite/gcc.dg/vect/vect-98-big-array.c:23 921 {sse2_movdqu}
     (nil))
This is clearly broken, since the load from memory is no longer
marked as potentially unaligned. And indeed, execution of this instruction
traps due to a misaligned access.
However, looking at the underlying pattern:
(define_insn "<sse2>_movdqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "nonimmediate_operand" "=x,m")
        (unspec:VI1 [(match_operand:VI1 1 "nonimmediate_operand" "xm,x")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2 && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
the back-end actually *allows* this transformation just fine. So reload has
always had both options. Since both require just one reload, which one gets
chosen is a matter of additional heuristics, and my patch as a side effect
slightly changes one of those heuristics ...
But in principle the problem is latent even without the patch. The back-end
should not permit reload to make changes to the pattern that fundamentally
change the semantics of the resulting instruction ...
I was wondering if the i386 port maintainers could have a look at this
pattern. Shouldn't we really have two patterns, one to *load* an unaligned
value and one to *store* an unaligned value, and not permit that memory
access to get reloaded?
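For concreteness, a split along those lines might look roughly like the
following (pattern names and bodies are just a sketch, not a tested patch):

(define_insn "<sse2>_loaddqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "register_operand" "=x")
        (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2"
  ...)

(define_insn "<sse2>_storedqu<avxsizesuffix>"
  [(set (match_operand:VI1 0 "memory_operand" "=m")
        (unspec:VI1 [(match_operand:VI1 1 "register_operand" "x")]
                    UNSPEC_MOVU))]
  "TARGET_SSE2"
  ...)

With the memory operand pinned to one side in each pattern, reload could
no longer turn the unaligned load into an unaligned store (or vice versa);
it would have to reload the register operand instead.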
[I guess a more fundamental change might also be to not have an unspec
in the first place, but only plain moves, and check MEM_ALIGN in the
move insn emitter to see which variant of the instruction is required ...]
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
[email protected]