[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-12-09 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #23 from Jakub Jelinek  ---
.


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-12-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #22 from Jakub Jelinek  ---
Fixed.


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-12-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #21 from Jakub Jelinek  ---
Author: jakub
Date: Wed Dec  4 15:50:02 2013
New Revision: 205671

URL: http://gcc.gnu.org/viewcvs?rev=205671&root=gcc&view=rev
Log:
PR target/59163
* config/i386/i386.c (ix86_legitimate_combined_insn): If for
!TARGET_AVX there is misaligned MEM operand with vector mode
and get_attr_ssememalign is 0, return false.
(ix86_expand_special_args_builtin): Add get_pointer_alignment
computed alignment and for non-temporal loads/stores also
at least GET_MODE_ALIGNMENT as MEM_ALIGN.
* config/i386/sse.md
(_loadu,
_storeu,
_loaddqu,
_storedqu, _lddqu,
sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps,
*vec_interleave_highv2df, *vec_interleave_lowv2df,
*vec_extractv2df_1_sse, sse2_loadhpd, sse2_loadlpd, sse2_movsd,
sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2,
sse4_1_v4hiv4si2, sse4_1_v2qiv2di2,
sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr,
*sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm,
sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned,
sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add
ssememalign attribute.
* config/i386/i386.md (ssememalign): New define_attr.

* g++.dg/torture/pr59163.C: New test.

Added:
branches/gcc-4_8-branch/gcc/testsuite/g++.dg/torture/pr59163.C
Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/config/i386/i386.c
branches/gcc-4_8-branch/gcc/config/i386/i386.md
branches/gcc-4_8-branch/gcc/config/i386/sse.md
branches/gcc-4_8-branch/gcc/testsuite/ChangeLog


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-12-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #19 from Jakub Jelinek  ---
Author: jakub
Date: Wed Dec  4 11:11:24 2013
New Revision: 205661

URL: http://gcc.gnu.org/viewcvs?rev=205661&root=gcc&view=rev
Log:
PR target/59163
* config/i386/i386.c (ix86_legitimate_combined_insn): If for
!TARGET_AVX there is misaligned MEM operand with vector mode
and get_attr_ssememalign is 0, return false.
(ix86_expand_special_args_builtin): Add get_pointer_alignment
computed alignment and for non-temporal loads/stores also
at least GET_MODE_ALIGNMENT as MEM_ALIGN.
* config/i386/sse.md
(_loadu,
_storeu,
_loaddqu,
_storedqu, _lddqu,
sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps,
*vec_interleave_highv2df, *vec_interleave_lowv2df,
*vec_extractv2df_1_sse, sse2_movsd, sse4_1_v8qiv8hi2,
sse4_1_v4qiv4si2, sse4_1_v4hiv4si2,
sse4_1_v2qiv2di2, sse4_1_v2hiv2di2,
sse4_1_v2siv2di2, sse4_2_pcmpestr, *sse4_2_pcmpestr_unaligned,
sse4_2_pcmpestri, sse4_2_pcmpestrm, sse4_2_pcmpestr_cconly,
sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned, sse4_2_pcmpistri,
sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add ssememalign attribute.
* config/i386/i386.md (ssememalign): New define_attr.

* g++.dg/torture/pr59163.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/torture/pr59163.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-12-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #20 from Jakub Jelinek  ---
Author: jakub
Date: Wed Dec  4 11:12:04 2013
New Revision: 205663

URL: http://gcc.gnu.org/viewcvs?rev=205663&root=gcc&view=rev
Log:
PR target/59163
* config/i386/i386.c (ix86_legitimate_combined_insn): If for
!TARGET_AVX there is misaligned MEM operand with vector mode
and get_attr_ssememalign is 0, return false.
(ix86_expand_special_args_builtin): Add get_pointer_alignment
computed alignment and for non-temporal loads/stores also
at least GET_MODE_ALIGNMENT as MEM_ALIGN.
* config/i386/sse.md
(_loadu,
_storeu,
_loaddqu,
_storedqu, _lddqu,
sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps,
*vec_interleave_highv2df, *vec_interleave_lowv2df,
*vec_extractv2df_1_sse, sse2_movsd, sse4_1_v8qiv8hi2,
sse4_1_v4qiv4si2, sse4_1_v4hiv4si2,
sse4_1_v2qiv2di2, sse4_1_v2hiv2di2,
sse4_1_v2siv2di2, sse4_2_pcmpestr, *sse4_2_pcmpestr_unaligned,
sse4_2_pcmpestri, sse4_2_pcmpestrm, sse4_2_pcmpestr_cconly,
sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned, sse4_2_pcmpistri,
sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add ssememalign attribute.
* config/i386/i386.md (ssememalign): New define_attr.

* g++.dg/torture/pr59163.C: New test.

Modified:
trunk/gcc/config/i386/i386.c


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-30 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #18 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #17)
> Perhaps add new attribute ssememalign, with default 0, which would be
> (maximum for all alternatives) required alignment for memory operands in the
> instruction pre-AVX, or 0 for GET_MODE_ALIGNMENT.  So, instructions that can
> handle completely unaligned loads/stores in all alternatives would have
> ssememalign 8,
> instructions that require everything properly aligned would have default
> ssememalign 0, and say movlps/movhps would have ssememalign 64.  The default
> would be conservatively correct, so whether instructions would have
> ssememalign attribute would be just an optimization issue (but, if it would
> be non-zero, it would have to be correct).

This sounds like a good approach to me.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #17 from Jakub Jelinek  ---
Perhaps add new attribute ssememalign, with default 0, which would be (maximum
for all alternatives) required alignment for memory operands in the instruction
pre-AVX, or 0 for GET_MODE_ALIGNMENT.  So, instructions that can handle
completely unaligned loads/stores in all alternatives would have ssememalign 8,
instructions that require everything properly aligned would have default
ssememalign 0, and say movlps/movhps would have ssememalign 64.  The default
would be conservatively correct, so whether instructions would have ssememalign
attribute would be just an optimization issue (but, if it would be non-zero, it
would have to be correct).


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #16 from Jakub Jelinek  ---
Note that (according to my reading of the docs) e.g. movlps/movhps don't allow
unaligned memory, so blindly allow any combine is wrong, but while the MEM
operand in those cases is say V4SFmode, the loads or stores can have V2SFmode's
alignment.  So, I don't see how ssemovu type would help us here, I think we
handle all the completely unaligned loads/stores already (after the
UNSPEC_LDDQU addition).  But we need some way how to determine for which
instructions we should require only smaller alignment, plus tweak the two
gen_rtx_MEM spots for the builtins and set mode there according to
get_pointer_alignment and/or to assumed alignment of the builtins.


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #15 from Jakub Jelinek  ---
(In reply to Uroš Bizjak from comment #14)
> (In reply to Uroš Bizjak from comment #13)
> > (In reply to Jakub Jelinek from comment #12)
> > > Created attachment 31332 [details]
> > > gcc49-pr59163.patch
> > > 
> > > So like this?
> > 
> > Yes, with adjusted comment in ix86_legitimate_combined_insn.
> > 
> > IIRC, unaligned moves won't be propagated during or after reload, so it
> > looks to me that the approach is correct.
> 
> Running the testsuite with your patch applied exposed a minor problem:
> 
> FAIL: gcc.target/i386/sse-1.c scan-assembler-not movaps
> 
> movlps/movhps and movlpd/movhpd can also handle unaligned operands (please
> see ix86_expand_vector_move_misalign). We should simply tag instructions
> that operate on unaligned operands (attribute type = ssemovu) and check type
> attribute instead.
> 
> The proposed approach would mean to change all scheduler and attribute
> calculation checks from "ssemov" to "ssemov,ssemovu", but this would be a
> simple mechanical change.

Yeah, I've noticed that too, plus the dumps I've used to note what instructions
have been rejected by the patch show that UNSPEC_LDDQU would need to be treated
like UNSPEC_LOADU.
The patch made difference for 32-bit:
gcc.target/i386/sse-1.c (as you write above)
gcc.dg/torture/pr18582-1.c (UNSPEC_LDDQU)
gcc.target/i386/sse4_1-movntdqa.c (UNSPEC_MOVNTDQA)
and 64-bit also
g++.dg/torture/pr59163.C (desirable)

Now, for movntdqa, I think it accepts only aligned memory, but the MEM in there
is supposed to be aligned and is created by
  if (i == memory)
{
  /* This must be the memory operand.  */
  op = ix86_zero_extend_to_Pmode (op);
  op = gen_rtx_MEM (mode, op); 
  gcc_assert (GET_MODE (op) == mode
  || GET_MODE (op) == VOIDmode);
}
and there is similar code for builtins that store.
Supposedly for this we should use get_pointer_alignment (arg) and at least
set_mem_align (op, get_pointer_alignment (arg)); if it is larger than MEM_ALIGN
(op).  The gcc_assert doesn't make any sense to me, result of gen_rtx_MEM
(mode, op) will always have GET_MODE (op) == mode, no need to assert that and
it will never have VOIDmode.
Now, if we could easily find out which of the builtins assume aligned memory
(and to what extent), we should also set it, because say using
_mm_stream_load_si128 with not 128-bit aligned memory is user error, so GCC
should be able to assume A128 there.  I'd say the sse-1.c case is similar,
isn't it?

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #14 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Jakub Jelinek from comment #12)
> > Created attachment 31332 [details]
> > gcc49-pr59163.patch
> > 
> > So like this?
> 
> Yes, with adjusted comment in ix86_legitimate_combined_insn.
> 
> IIRC, unaligned moves won't be propagated during or after reload, so it
> looks to me that the approach is correct.

Running the testsuite with your patch applied exposed a minor problem:

FAIL: gcc.target/i386/sse-1.c scan-assembler-not movaps

movlps/movhps and movlpd/movhpd can also handle unaligned operands (please see
ix86_expand_vector_move_misalign). We should simply tag instructions that
operate on unaligned operands (attribute type = ssemovu) and check type
attribute instead.

The proposed approach would mean to change all scheduler and attribute
calculation checks from "ssemov" to "ssemov,ssemovu", but this would be a
simple mechanical change.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #13 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #12)
> Created attachment 31332 [details]
> gcc49-pr59163.patch
> 
> So like this?

Yes, with adjusted comment in ix86_legitimate_combined_insn.

IIRC, unaligned moves won't be propagated during or after reload, so it looks
to me that the approach is correct.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #31331|0   |1
is obsolete||
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek  ---
Created attachment 31332
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31332&action=edit
gcc49-pr59163.patch

So like this?


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #11 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #10)
> For stores I think the patch already allows that, that is the
>   if (GET_CODE (*x) == SET && &SET_DEST (*x) == data)
> return 1;
> in there (the reason why I've added it was that for the misaligned store
> insns
> with UNSPEC_STOREU the misaligned MEM is in SET_DEST and SET_SRC just
> contains some rtl with UNSPEC_STOREU embedded somewhere in it.
> So, would you like:
>   if (GET_CODE (*x) == SET && (&SET_DEST (*x) == data || &SET_SRC (*x) ==
> data))
> return 1;
> ?  That would IMHO handle simple loads from misaligned MEM too.

Yes. Perhaps also special case sse4.2 string instructions (they can operate on
unaligned data, too), and the patch covers everything.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #10 from Jakub Jelinek  ---
For stores I think the patch already allows that, that is the
  if (GET_CODE (*x) == SET && &SET_DEST (*x) == data)
return 1;
in there (the reason why I've added it was that for the misaligned store insns
with UNSPEC_STOREU the misaligned MEM is in SET_DEST and SET_SRC just contains
some rtl with UNSPEC_STOREU embedded somewhere in it.
So, would you like:
  if (GET_CODE (*x) == SET && (&SET_DEST (*x) == data || &SET_SRC (*x) ==
data))
return 1;
?  That would IMHO handle simple loads from misaligned MEM too.


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #9 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 31331 [details]
> gcc49-pr59163.patch
> 
> So like this?  Untested...

Yes, but I think that we can also allow simple vector loads and stores - they
emit movu insn when unaligned operand is detected.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #8 from Jakub Jelinek  ---
Created attachment 31331
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31331&action=edit
gcc49-pr59163.patch

So like this?  Untested...


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #7 from Uroš Bizjak  ---
Maybe even better idea is to use ix86_legitimate_combined_insn and reject
combinations that would result in unaligned operands of all but vector move
instructions.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

--- Comment #6 from Uroš Bizjak  ---
I think that we should disallow tie of TImode with 128bit vector modes due to
different alignment requirements. Integer register pairs can load unaligned
TImode without problems, while unaligned TImode will crash SSE insns.

Following patch fixes the problem:

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 205509)
+++ config/i386/i386.c  (working copy)
@@ -35237,6 +35237,7 @@ ix86_modes_tieable_p (enum machine_mode mode1, enu
 return (GET_MODE_SIZE (mode1) == 32
&& ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode1));
   if (GET_MODE_SIZE (mode2) == 16
+  && !(mode1 == TImode || mode2 == TImode)
   && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode2))
 return (GET_MODE_SIZE (mode1) == 16
&& ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode1));

However, it needs some refinements for AVX.

[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

Jakub Jelinek  changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
Looks like it, yes.
In *.jump we still have (IMHO correct):
(insn 2 4 3 2 (set (reg/v/f:DI 89 [ a ])
(reg:DI 5 di [ a ])) pr59163-2.C:13 85 {*movdi_internal}
 (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:TI 90)
(mem:TI (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0 S16
A32])) pr59163-2.C:15 84 {*movti_internal}
 (nil))
(insn 7 6 8 2 (set (mem/c:TI (plus:DI (reg/f:DI 20 frame)
(const_int -16 [0xfff0])) [3 c+0 S16 A128])
(reg:TI 90)) pr59163-2.C:15 84 {*movti_internal}
 (nil))
...
(insn 9 8 10 2 (set (reg:V4SF 91 [ vect__7.10 ])
(mult:V4SF (reg:V4SF 92)
(mem/c:V4SF (plus:DI (reg/f:DI 20 frame)
(const_int -16 [0xfff0])) [2 MEM[(float *)&c]+0
S16 A128]))) pr59163-2.C:17 1269 {*mulv4sf3}
 (nil))

movti_internal handles unaligned loads properly.
Then *.dse1 transforms this into:
(insn 6 3 18 2 (set (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ])
(mem:TI (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0 S16
A32])) pr59163-2.C:15 84 {*movti_internal}
 (nil))
(insn 18 6 8 2 (set (reg:V4SF 94)
(subreg:V4SF (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ]) 0))
pr59163-2.C:15 -1
 (expr_list:REG_DEAD (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ])
(nil)))
...
(insn 9 8 19 2 (set (reg:V4SF 91 [ vect__7.10 ])
(mult:V4SF (reg:V4SF 92)
(reg:V4SF 94))) pr59163-2.C:17 1269 {*mulv4sf3}
 (expr_list:REG_DEAD (reg:V4SF 94)
(expr_list:REG_DEAD (reg:V4SF 92)
(nil
which also looks ok to me.  But then combine combines it into:
(insn 9 8 19 2 (set (reg:V4SF 91 [ vect__7.10 ])
(mult:V4SF (reg:V4SF 92)
(mem:V4SF (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0
S16 A32]))) pr59163-2.C:17 1269 {*mulv4sf3}
 (expr_list:REG_DEAD (reg:V4SF 92)
(nil)))
which is wrong (for pre-AVX), because mulv4sf3 can't accept unaligned memory.
Likely the SSEx pre-AVX predicates assume that an unaligned vector load will be
done using UNSPEC and thus not really mergeable here, and don't count with the
fact that the load could be done using integral mode and just subreged into
vector mode.  Perhaps we need new predicates for this that would fail for
exactly this situation (disallow unaligned scalar load subregged into vector
mode for pre-AVX) and use them everywhere where SSE? doesn't accept unaligned
loads?


[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults

2013-11-29 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||x86_64-*-*
  Component|tree-optimization   |target

--- Comment #4 from Richard Biener  ---
target issue then?