[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #23 from Jakub Jelinek --- .
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #22 from Jakub Jelinek --- Fixed.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #21 from Jakub Jelinek --- Author: jakub Date: Wed Dec 4 15:50:02 2013 New Revision: 205671 URL: http://gcc.gnu.org/viewcvs?rev=205671&root=gcc&view=rev Log: PR target/59163 * config/i386/i386.c (ix86_legitimate_combined_insn): If for !TARGET_AVX there is misaligned MEM operand with vector mode and get_attr_ssememalign is 0, return false. (ix86_expand_special_args_builtin): Add get_pointer_alignment computed alignment and for non-temporal loads/stores also at least GET_MODE_ALIGNMENT as MEM_ALIGN. * config/i386/sse.md (_loadu, _storeu, _loaddqu, _storedqu, _lddqu, sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps, sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps, *vec_interleave_highv2df, *vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_loadhpd, sse2_loadlpd, sse2_movsd, sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2, sse4_1_v4hiv4si2, sse4_1_v2qiv2di2, sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr, *sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm, sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned, sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add ssememalign attribute. * config/i386/i386.md (ssememalign): New define_attr. * g++.dg/torture/pr59163.C: New test. Added: branches/gcc-4_8-branch/gcc/testsuite/g++.dg/torture/pr59163.C Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/config/i386/i386.c branches/gcc-4_8-branch/gcc/config/i386/i386.md branches/gcc-4_8-branch/gcc/config/i386/sse.md branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #19 from Jakub Jelinek --- Author: jakub Date: Wed Dec 4 11:11:24 2013 New Revision: 205661 URL: http://gcc.gnu.org/viewcvs?rev=205661&root=gcc&view=rev Log: PR target/59163 * config/i386/i386.c (ix86_legitimate_combined_insn): If for !TARGET_AVX there is misaligned MEM operand with vector mode and get_attr_ssememalign is 0, return false. (ix86_expand_special_args_builtin): Add get_pointer_alignment computed alignment and for non-temporal loads/stores also at least GET_MODE_ALIGNMENT as MEM_ALIGN. * config/i386/sse.md (_loadu, _storeu, _loaddqu, _storedqu, _lddqu, sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps, sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps, *vec_interleave_highv2df, *vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_movsd, sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2, sse4_1_v4hiv4si2, sse4_1_v2qiv2di2, sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr, *sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm, sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned, sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add ssememalign attribute. * config/i386/i386.md (ssememalign): New define_attr. * g++.dg/torture/pr59163.C: New test. Added: trunk/gcc/testsuite/g++.dg/torture/pr59163.C Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #20 from Jakub Jelinek --- Author: jakub Date: Wed Dec 4 11:12:04 2013 New Revision: 205663 URL: http://gcc.gnu.org/viewcvs?rev=205663&root=gcc&view=rev Log: PR target/59163 * config/i386/i386.c (ix86_legitimate_combined_insn): If for !TARGET_AVX there is misaligned MEM operand with vector mode and get_attr_ssememalign is 0, return false. (ix86_expand_special_args_builtin): Add get_pointer_alignment computed alignment and for non-temporal loads/stores also at least GET_MODE_ALIGNMENT as MEM_ALIGN. * config/i386/sse.md (_loadu, _storeu, _loaddqu, _storedqu, _lddqu, sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps, sse_movlhps, sse_storehps, sse_loadhps, sse_loadlps, *vec_interleave_highv2df, *vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_movsd, sse4_1_v8qiv8hi2, sse4_1_v4qiv4si2, sse4_1_v4hiv4si2, sse4_1_v2qiv2di2, sse4_1_v2hiv2di2, sse4_1_v2siv2di2, sse4_2_pcmpestr, *sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm, sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned, sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add ssememalign attribute. * config/i386/i386.md (ssememalign): New define_attr. * g++.dg/torture/pr59163.C: New test. Modified: trunk/gcc/config/i386/i386.c
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #18 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #17) > Perhaps add new attribute ssememalign, with default 0, which would be > (maximum for all alternatives) required alignment for memory operands in the > instruction pre-AVX, or 0 for GET_MODE_ALIGNMENT. So, instructions that can > handle completely unaligned loads/stores in all alternatives would have > ssememalign 8, > instructions that require everything properly aligned would have default > ssememalign 0, and say movlps/movhps would have ssememalign 64. The default > would be conservatively correct, so whether instructions would have > ssememalign attribute would be just an optimization issue (but, if it would > be non-zero, it would have to be correct). This sounds like a good approach to me.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #17 from Jakub Jelinek --- Perhaps add new attribute ssememalign, with default 0, which would be (maximum for all alternatives) required alignment for memory operands in the instruction pre-AVX, or 0 for GET_MODE_ALIGNMENT. So, instructions that can handle completely unaligned loads/stores in all alternatives would have ssememalign 8, instructions that require everything properly aligned would have default ssememalign 0, and say movlps/movhps would have ssememalign 64. The default would be conservatively correct, so whether instructions would have ssememalign attribute would be just an optimization issue (but, if it would be non-zero, it would have to be correct).
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #16 from Jakub Jelinek --- Note that (according to my reading of the docs) e.g. movlps/movhps don't allow unaligned memory, so blindly allow any combine is wrong, but while the MEM operand in those cases is say V4SFmode, the loads or stores can have V2SFmode's alignment. So, I don't see how ssemovu type would help us here, I think we handle all the completely unaligned loads/stores already (after the UNSPEC_LDDQU addition). But we need some way how to determine for which instructions we should require only smaller alignment, plus tweak the two gen_rtx_MEM spots for the builtins and set mode there according to get_pointer_alignment and/or to assumed alignment of the builtins.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #15 from Jakub Jelinek --- (In reply to Uroš Bizjak from comment #14) > (In reply to Uroš Bizjak from comment #13) > > (In reply to Jakub Jelinek from comment #12) > > > Created attachment 31332 [details] > > > gcc49-pr59163.patch > > > > > > So like this? > > > > Yes, with adjusted comment in ix86_legitimate_combined_insn. > > > > IIRC, unaligned moves won't be propagated during or after reload, so it > > looks to me that the approach is correct. > > Running the testsuite with your patch applied exposed a minor problem: > > FAIL: gcc.target/i386/sse-1.c scan-assembler-not movaps > > movlps/movhps and movlpd/movhpd can also handle unaligned operands (please > see ix86_expand_vector_move_misalign). We should simply tag instructions > that operate on unaligned operands (attribute type = ssemovu) and check type > attribute instead. > > The proposed approach would mean to change all scheduler and attribute > calculation checks from "ssemov" to "ssemov,ssemovu", but this would be a > simple mechanical change. Yeah, I've noticed that too, plus the dumps I've used to note what instructions have been rejected by the patch show that UNSPEC_LDDQU would need to be treated like UNSPEC_LOADU. The patch made difference for 32-bit: gcc.target/i386/sse-1.c (as you write above) gcc.dg/torture/pr18582-1.c (UNSPEC_LDDQU) gcc.target/i386/sse4_1-movntdqa.c (UNSPEC_MOVNTDQA) and 64-bit also g++.dg/torture/pr59163.C (desirable) Now, for movntdqa, I think it accepts only aligned memory, but the MEM in there is supposed to be aligned and is created by if (i == memory) { /* This must be the memory operand. */ op = ix86_zero_extend_to_Pmode (op); op = gen_rtx_MEM (mode, op); gcc_assert (GET_MODE (op) == mode || GET_MODE (op) == VOIDmode); } and there is similar code for builtins that store. Supposedly for this we should use get_pointer_alignment (arg) and at least set_mem_align (op, get_pointer_alignment (arg)); if it is larger than MEM_ALIGN (op). The gcc_assert doesn't make any sense to me, result of gen_rtx_MEM (mode, op) will always have GET_MODE (op) == mode, no need to assert that and it will never have VOIDmode. Now, if we could easily find out which of the builtins assume aligned memory (and to what extent), we should also set it, because say using _mm_stream_load_si128 with not 128-bit aligned memory is user error, so GCC should be able to assume A128 there. I'd say the sse-1.c case is similar, isn't it?
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #14 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #13) > (In reply to Jakub Jelinek from comment #12) > > Created attachment 31332 [details] > > gcc49-pr59163.patch > > > > So like this? > > Yes, with adjusted comment in ix86_legitimate_combined_insn. > > IIRC, unaligned moves won't be propagated during or after reload, so it > looks to me that the approach is correct. Running the testsuite with your patch applied exposed a minor problem: FAIL: gcc.target/i386/sse-1.c scan-assembler-not movaps movlps/movhps and movlpd/movhpd can also handle unaligned operands (please see ix86_expand_vector_move_misalign). We should simply tag instructions that operate on unaligned operands (attribute type = ssemovu) and check type attribute instead. The proposed approach would mean to change all scheduler and attribute calculation checks from "ssemov" to "ssemov,ssemovu", but this would be a simple mechanical change.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #13 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #12) > Created attachment 31332 [details] > gcc49-pr59163.patch > > So like this? Yes, with adjusted comment in ix86_legitimate_combined_insn. IIRC, unaligned moves won't be propagated during or after reload, so it looks to me that the approach is correct.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 Jakub Jelinek changed: What|Removed |Added Attachment #31331|0 |1 is obsolete|| Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #12 from Jakub Jelinek --- Created attachment 31332 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31332&action=edit gcc49-pr59163.patch So like this?
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #11 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #10) > For stores I think the patch already allows that, that is the > if (GET_CODE (*x) == SET && &SET_DEST (*x) == data) > return 1; > in there (the reason why I've added it was that for the misaligned store > insns > with UNSPEC_STOREU the misaligned MEM is in SET_DEST and SET_SRC just > contains some rtl with UNSPEC_STOREU embedded somewhere in it. > So, would you like: > if (GET_CODE (*x) == SET && (&SET_DEST (*x) == data || &SET_SRC (*x) == > data)) > return 1; > ? That would IMHO handle simple loads from misaligned MEM too. Yes. Perhaps also special case sse4.2 string instructions (they can operate on unaligned data, too), and the patch covers everything.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #10 from Jakub Jelinek --- For stores I think the patch already allows that, that is the if (GET_CODE (*x) == SET && &SET_DEST (*x) == data) return 1; in there (the reason why I've added it was that for the misaligned store insns with UNSPEC_STOREU the misaligned MEM is in SET_DEST and SET_SRC just contains some rtl with UNSPEC_STOREU embedded somewhere in it. So, would you like: if (GET_CODE (*x) == SET && (&SET_DEST (*x) == data || &SET_SRC (*x) == data)) return 1; ? That would IMHO handle simple loads from misaligned MEM too.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #9 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #8) > Created attachment 31331 [details] > gcc49-pr59163.patch > > So like this? Untested... Yes, but I think that we can also allow simple vector loads and stores - they emit movu insn when unaligned operand is detected.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #8 from Jakub Jelinek --- Created attachment 31331 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31331&action=edit gcc49-pr59163.patch So like this? Untested...
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #7 from Uroš Bizjak --- Maybe even better idea is to use ix86_legitimate_combined_insn and reject combinations that would result in unaligned operands of all but vector move instructions.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 --- Comment #6 from Uroš Bizjak --- I think that we should disallow tie of TImode with 128bit vector modes due to different alignment requirements. Integer register pairs can load unaligned TImode without problems, while unaligned TImode will crash SSE insns. Following patch fixes the problem: Index: config/i386/i386.c === --- config/i386/i386.c (revision 205509) +++ config/i386/i386.c (working copy) @@ -35237,6 +35237,7 @@ ix86_modes_tieable_p (enum machine_mode mode1, enu return (GET_MODE_SIZE (mode1) == 32 && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode1)); if (GET_MODE_SIZE (mode2) == 16 + && !(mode1 == TImode || mode2 == TImode) && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode2)) return (GET_MODE_SIZE (mode1) == 16 && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode1)); However, it needs some refinements for AVX.
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 Jakub Jelinek changed: What|Removed |Added CC||uros at gcc dot gnu.org --- Comment #5 from Jakub Jelinek --- Looks like it, yes. In *.jump we still have (IMHO correct): (insn 2 4 3 2 (set (reg/v/f:DI 89 [ a ]) (reg:DI 5 di [ a ])) pr59163-2.C:13 85 {*movdi_internal} (nil)) (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG) (insn 6 3 7 2 (set (reg:TI 90) (mem:TI (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0 S16 A32])) pr59163-2.C:15 84 {*movti_internal} (nil)) (insn 7 6 8 2 (set (mem/c:TI (plus:DI (reg/f:DI 20 frame) (const_int -16 [0xfff0])) [3 c+0 S16 A128]) (reg:TI 90)) pr59163-2.C:15 84 {*movti_internal} (nil)) ... (insn 9 8 10 2 (set (reg:V4SF 91 [ vect__7.10 ]) (mult:V4SF (reg:V4SF 92) (mem/c:V4SF (plus:DI (reg/f:DI 20 frame) (const_int -16 [0xfff0])) [2 MEM[(float *)&c]+0 S16 A128]))) pr59163-2.C:17 1269 {*mulv4sf3} (nil)) movti_internal handles unaligned loads properly. Then *.dse1 transforms this into: (insn 6 3 18 2 (set (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ]) (mem:TI (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0 S16 A32])) pr59163-2.C:15 84 {*movti_internal} (nil)) (insn 18 6 8 2 (set (reg:V4SF 94) (subreg:V4SF (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ]) 0)) pr59163-2.C:15 -1 (expr_list:REG_DEAD (reg:TI 90 [ MEM[(const struct A &)a_4(D)] ]) (nil))) ... (insn 9 8 19 2 (set (reg:V4SF 91 [ vect__7.10 ]) (mult:V4SF (reg:V4SF 92) (reg:V4SF 94))) pr59163-2.C:17 1269 {*mulv4sf3} (expr_list:REG_DEAD (reg:V4SF 94) (expr_list:REG_DEAD (reg:V4SF 92) (nil which also looks ok to me. But then combine combines it into: (insn 9 8 19 2 (set (reg:V4SF 91 [ vect__7.10 ]) (mult:V4SF (reg:V4SF 92) (mem:V4SF (reg/v/f:DI 89 [ a ]) [3 MEM[(const struct A &)a_4(D)]+0 S16 A32]))) pr59163-2.C:17 1269 {*mulv4sf3} (expr_list:REG_DEAD (reg:V4SF 92) (nil))) which is wrong (for pre-AVX), because mulv4sf3 can't accept unaligned memory. Likely the SSEx pre-AVX predicates assume that an unaligned vector load will be done using UNSPEC and thus not really mergeable here, and don't count with the fact that the load could be done using integral mode and just subreged into vector mode. Perhaps we need new predicates for this that would fail for exactly this situation (disallow unaligned scalar load subregged into vector mode for pre-AVX) and use them everywhere where SSE? doesn't accept unaligned loads?
[Bug target/59163] [4.8/4.9 Regression] program compiled with g++ -O3 segfaults
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59163 Richard Biener changed: What|Removed |Added Keywords||wrong-code Target||x86_64-*-* Component|tree-optimization |target --- Comment #4 from Richard Biener --- target issue then?