[Bug tree-optimization/71361] New: [7 Regression] Changes in ivopts caused perf regression on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71361

            Bug ID: 71361
           Summary: [7 Regression] Changes in ivopts caused perf regression
                    on x86
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org
                CC: amker.cheng at gmail dot com, izamyatin at gmail dot com,
                    kyukhin at gcc dot gnu.org
  Target Milestone: ---

r235805 leads to performance regression on x86.

Reduced testcase:

int arr_1[512];
int arr_2[512];

int main ()
{
  int c1[512];
  int c2[512];
  int res[512];

  for (int i = 0; i < 512; i++)
    arr_1[i] = arr_2[i] = c1[i] = c2[i] = i;

  for (int l = 0; l < 100; l++)
    for (int k = 1; k <= 9; k++)
      {
        int n1 = 1 << k;
        int n2 = n1 >> 1;
        for (int j = 0; j < n2; j++)
          for (int i = j; i < 512; i += n1)
            {
              int idx = i + n2;
              int x1 = arr_1[idx] * c1[j] + arr_2[idx] * c2[j];
              int x2 = arr_2[idx] * c1[j] + arr_1[idx] * c2[j];
              arr_1[i] = x1;
              arr_2[i] = x2;
              arr_1[idx] = x1;
              arr_2[idx] = x2;
            }
      }

  return 0;
}

Compilation options: -Ofast -m32 -fPIE
GCC is configured --with-arch=corei7 --with-cpu=corei7 --with-fpmath=sse

Run time on Sandy Bridge increased by ~20%
Run time on Atom increased by ~60%

Below are the dumps of the innermost loop after ivopts pass.
Before regression there are 2 induction variables, which are used as bases for all 6 memory accesses:

  # i_66 = PHI
  # ivtmp.19_63 = PHI
  # ivtmp.20_17 = PHI
  _59 = (void *) ivtmp.19_63;
  _58 = (sizetype) n2_20;
  _22 = MEM[base: _59, index: _58, step: 4, offset: 0B];
  _24 = _22 * pretmp_105;
  _55 = (void *) ivtmp.20_17;
  _54 = (sizetype) n2_20;
  _25 = MEM[base: _55, index: _54, step: 4, offset: 0B];
  _27 = _25 * pretmp_107;
  x1_28 = _24 + _27;
  _30 = _25 * pretmp_105;
  _31 = _22 * pretmp_107;
  x2_32 = _30 + _31;
  _51 = (void *) ivtmp.19_63;
  MEM[base: _51, offset: 0B] = x1_28;
  _50 = (void *) ivtmp.20_17;
  MEM[base: _50, offset: 0B] = x2_32;
  _57 = (void *) ivtmp.19_63;
  _56 = (sizetype) n2_20;
  MEM[base: _57, index: _56, step: 4, offset: 0B] = x1_28;
  _53 = (void *) ivtmp.20_17;
  _52 = (sizetype) n2_20;
  MEM[base: _53, index: _52, step: 4, offset: 0B] = x2_32;
  i_37 = n1_19 + i_66;
  ivtmp.19_95 = ivtmp.19_63 + _77;
  ivtmp.20_15 = ivtmp.20_17 + _12;
  if (i_37 <= 511)
    goto ;
  else
    goto ;

After regression there is only one induction variable, which is used as index for 4 memory accesses.

  # i_66 = PHI
  # ivtmp.22_63 = PHI
  _22 = MEM[symbol: arr_1, index: ivtmp.22_63, offset: 0B];
  _24 = _22 * pretmp_105;
  _25 = MEM[symbol: arr_2, index: ivtmp.22_63, offset: 0B];
  _27 = _25 * pretmp_107;
  x1_28 = _24 + _27;
  _30 = _25 * pretmp_105;
  _31 = _22 * pretmp_107;
  x2_32 = _30 + _31;
  _17 = (sizetype) i_66;
  _15 = _17 * 4;
  MEM[symbol: arr_1, index: _15, offset: 0B] = x1_28;
  _14 = (sizetype) i_66;
  _12 = _14 * 4;
  MEM[symbol: arr_2, index: _12, offset: 0B] = x2_32;
  MEM[symbol: arr_1, index: ivtmp.22_63, offset: 0B] = x1_28;
  MEM[symbol: arr_2, index: ivtmp.22_63, offset: 0B] = x2_32;
  i_37 = n1_19 + i_66;
  ivtmp.22_95 = ivtmp.22_63 + _77;
  if (i_37 <= 511)
    goto ;
  else
    goto ;

As a result, the final assembly contains 13% more instructions.
Before regression:

.L5:
        movl    (%edi,%ebx,4), %eax
        movd    %xmm1, %edx
        movd    %xmm0, %ecx
        imull   (%esi,%ebx,4), %ecx
        imull   %eax, %edx
        addl    %ecx, %edx
        movd    %xmm0, %ecx
        imull   %ecx, %eax
        movd    %xmm1, %ecx
        imull   (%esi,%ebx,4), %ecx
        movl    %edx, (%esi)
        movl    %edx, (%esi,%ebx,4)
        movd    %xmm5, %edx
        addl    %edx, %esi
        addl    %ecx, %eax
        movl    %eax, (%edi)
        movl    %eax, (%edi,%ebx,4)
        movd    %xmm4, %eax
        addl    %edx, %edi
        addl    %eax, -4124(%ebp)
        movl    -4124(%ebp), %ecx
        cmpl    $511, %ecx
        jle     .L5

After regression:

.L5:
        movd    %xmm5, %edi
        movd    %xmm3, %edx
        movd    %xmm1, %ebx
        imull   (%eax,%edx), %ebx
        movd    %xmm4, %ecx
        movd    %xmm4, %edx
        imull   (%eax,%edi), %ecx
        addl    %ecx, %ebx
        movd    %xmm1, %ecx
        imull   (%eax,%edi), %ecx
        movd    %ecx, %xmm0
        movd    %xmm3, %ecx
        imull   (%eax,%ecx), %edx
        movd    %xmm0, %ecx
        addl    %edx, %ecx
        movd    %xmm3, %edx
        movl    %ebx, (%edx,%esi,4)
        movd    %xmm3, %edx
        movl    %ecx, (%edi,%esi,4)
        addl    -4124(%ebp),
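
As a rough source-level analogy for the two induction-variable choices described in this report, the inner loop of the testcase can be written in both shapes in C. This is only a sketch: the function names, the parameter list and the constant N are illustrative assumptions, and ivopts makes its choice on GIMPLE regardless of how the source is written. The first function keeps one base per array and addresses both `i` and `idx` relative to it; the second keeps a single running offset for `idx` and recomputes the offset for `i` on every iteration.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 512 };  /* array length taken from the testcase */

/* Shape of the pre-regression loop: one base per array (hypothetical
   names p1/p2), used for all six accesses as base[0] and base[n2].  */
void butterfly_two_bases(int *a1, int *a2, int j, int n1, int n2,
                         int c1j, int c2j)
{
  assert(n1 > 0);  /* otherwise the loop would not advance */
  for (int i = j; i < N; i += n1)
    {
      int *p1 = a1 + i;  /* advances by n1 elements per iteration */
      int *p2 = a2 + i;
      int x1 = p1[n2] * c1j + p2[n2] * c2j;
      int x2 = p2[n2] * c1j + p1[n2] * c2j;
      p1[0] = x1;
      p2[0] = x2;
      p1[n2] = x1;
      p2[n2] = x2;
    }
}

/* Shape of the post-regression loop: a single running offset for idx,
   while the offset for i is recomputed from i each time.  */
void butterfly_single_index(int *a1, int *a2, int j, int n1, int n2,
                            int c1j, int c2j)
{
  assert(n1 > 0);
  size_t idx_off = (size_t) (j + n2);  /* the only extra induction variable */
  for (int i = j; i < N; i += n1, idx_off += (size_t) n1)
    {
      int x1 = a1[idx_off] * c1j + a2[idx_off] * c2j;
      int x2 = a2[idx_off] * c1j + a1[idx_off] * c2j;
      size_t i_off = (size_t) i;  /* recomputed, like _17/_15 in the dump */
      a1[i_off] = x1;
      a2[i_off] = x2;
      a1[idx_off] = x1;
      a2[idx_off] = x2;
    }
}
```

Both functions compute the same values; only the address arithmetic differs, which is the difference between the two dumps.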
[Bug target/71088] New: [i386, AVX-512, Perf] vpermi2ps instead of vpermps emitted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71088

            Bug ID: 71088
           Summary: [i386, AVX-512, Perf] vpermi2ps instead of vpermps
                    emitted
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org
                CC: ienkovich at gcc dot gnu.org, izamyatin at gmail dot com,
                    kyukhin at gcc dot gnu.org, ubizjak at gmail dot com
  Target Milestone: ---

Testcase:

float foo (float *arr1, float *arr2, float *max_x, int M, float s)
{
  float *res = new float[M];

  for (int i = M - 1; i >= 0; i--)
    for (int j = 0; j <= i; j++)
      {
        float x = arr1[j] * arr2[i - j] + s;
        res[j] = x > max_x[j] ? x : max_x[j];
      }

  return res[0];
}

To reproduce:

$ g++ -S test.cpp -Ofast -funroll-loops -march=knl

GCC emits the vpermi2ps instruction to reverse the elements of arr2. However, this instruction writes its result into the index register, so there is an additional mov before each vpermi2ps to restore the indexes [1]. There are also some weird movs after each vpermi2ps [2]. It's not clear why the result of vpermi2ps isn't passed directly to vfmadd132ps.
.L1:
        vmovups         (%r11), %zmm9
        vmovdqa64       %zmm2, %zmm1                    # [1]
        vpermi2ps       %zmm9, %zmm9, %zmm1
        vmovdqa64       %zmm2, %zmm16                   # [1]
        vmovaps         %zmm1, %zmm10                   # [2]
        vmovdqa64       %zmm2, %zmm1                    # [1]
        vmovups         -64(%r11), %zmm12
        vfmadd132ps     (%rax,%r9), %zmm3, %zmm10
        vpermi2ps       %zmm12, %zmm12, %zmm1
        vmaxps          (%rcx,%r9), %zmm10, %zmm11
        vmovaps         %zmm1, %zmm13                   # [2]
        vmovdqa64       %zmm2, %zmm1                    # [1]
        vmovups         -128(%r11), %zmm15
        vfmadd132ps     64(%rax,%r9), %zmm3, %zmm13
        vmovups         -192(%r11), %zmm6
        vpermi2ps       %zmm15, %zmm15, %zmm1
        vpermi2ps       %zmm6, %zmm6, %zmm16
        vmovaps         %zmm1, %zmm4                    # [2]
        vmovaps         %zmm16, %zmm7                   # [2]
        vmaxps          64(%rcx,%r9), %zmm13, %zmm14
        vfmadd132ps     128(%rax,%r9), %zmm3, %zmm4
        vfmadd132ps     192(%rax,%r9), %zmm3, %zmm7
        vmaxps          128(%rcx,%r9), %zmm4, %zmm5
        leal            4(%r15), %r15d
        vmaxps          192(%rcx,%r9), %zmm7, %zmm8
        cmpl            %esi, %r15d
        vmovups         %zmm11, (%r8,%r9)
        leaq            -256(%r11), %r11
        vmovups         %zmm14, 64(%r8,%r9)
        vmovups         %zmm5, 128(%r8,%r9)
        vmovups         %zmm8, 192(%r8,%r9)
        leaq            256(%r9), %r9
        jb              .L1

Instead of this, vpermps can be used. It doesn't overwrite the index register, which makes it possible to get rid of 8 movs in this loop:

.L2:
        lea             (,%r12,4), %r10
        negq            %r10
        addq            %rbx, %r10
        vpermps         -64(%r10), %zmm3, %zmm4
        vpermps         -128(%r10), %zmm3, %zmm6
        vpermps         -192(%r10), %zmm3, %zmm8
        vpermps         -256(%r10), %zmm3, %zmm10
        vfmadd132ps     (%r11,%r12,4), %zmm2, %zmm4
        vfmadd132ps     64(%r11,%r12,4), %zmm2, %zmm6
        vfmadd132ps     128(%r11,%r12,4), %zmm2, %zmm8
        vfmadd132ps     192(%r11,%r12,4), %zmm2, %zmm10
        vmaxps          (%r13,%r12,4), %zmm4, %zmm5
        vmovups         %zmm5, (%rdi,%r12,4)
        vmaxps          64(%r13,%r12,4), %zmm6, %zmm7
        vmovups         %zmm7, 64(%rdi,%r12,4)
        vmaxps          128(%r13,%r12,4), %zmm8, %zmm9
        vmovups         %zmm9, 128(%rdi,%r12,4)
        vmaxps          192(%r13,%r12,4), %zmm10, %zmm11
        vmovups         %zmm11, 192(%rdi,%r12,4)
        addq            $64, %r12
        cmpq            %rax, %r12
        jb              .L2
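
The difference between the two instructions can be illustrated with a scalar model in plain C. This is a sketch, not intrinsics code: the names `model_vpermps`, `model_vpermi2ps` and the `vreg` union are hypothetical, and the model assumes 32-bit IEEE floats and ignores masking. It shows only the property that matters here: vpermps leaves the index register intact, while vpermi2ps writes its result over the indexes, so they must be reloaded before the next permute.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 16  /* 512-bit register holding 16 single-precision lanes */

/* Hypothetical register model: the same 512 bits viewed as indexes
   or as floats.  */
typedef union { uint32_t u[LANES]; float f[LANES]; } vreg;

/* Scalar model of vpermps: dst[i] = src[idx[i] mod 16].
   The index array is only read.  */
void model_vpermps(float dst[LANES], const uint32_t idx[LANES],
                   const float src[LANES])
{
  float tmp[LANES];  /* temporary, so dst may alias src */
  for (int i = 0; i < LANES; i++)
    tmp[i] = src[idx[i] & 15];
  memcpy(dst, tmp, sizeof tmp);
}

/* Scalar model of vpermi2ps: bit 4 of each index selects table a or b,
   bits 0-3 select the lane, and the result replaces the indexes.  */
void model_vpermi2ps(vreg *idx_dst, const float a[LANES],
                     const float b[LANES])
{
  assert(idx_dst != NULL);
  vreg out;
  for (int i = 0; i < LANES; i++)
    {
      uint32_t sel = idx_dst->u[i];
      out.f[i] = (sel & 16) ? b[sel & 15] : a[sel & 15];
    }
  *idx_dst = out;  /* the indexes are gone after this point */
}
```

With indexes 15, 14, ..., 0 both models reverse the input, but after `model_vpermi2ps` the register holds float data rather than indexes, which corresponds to the vmovdqa64 copies marked [1] in the listing.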
[Bug other/69582] [meta-bug] Cilk+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69582
Bug 69582 depends on bug 69363, which changed state.

Bug 69363 Summary: ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED
[Bug c++/69363] ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from iverbin at gcc dot gnu.org ---
Fixed in GCC 7.
[Bug c++/69363] ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363 --- Comment #6 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Wed Apr 20 15:48:53 2016 New Revision: 235290 URL: https://gcc.gnu.org/viewcvs?rev=235290&root=gcc&view=rev Log: Fix PR c++/69363 gcc/c-family/ PR c++/69363 * c-cilkplus.c (c_finish_cilk_clauses): Remove function. * c-common.h (c_finish_cilk_clauses): Remove declaration. gcc/c/ PR c++/69363 * c-parser.c (c_parser_cilk_all_clauses): Use c_finish_omp_clauses instead of c_finish_cilk_clauses. * c-tree.h (c_finish_omp_clauses): Add new default argument. * c-typeck.c (c_finish_omp_clauses): Add new argument. Allow floating-point variables in the linear clause for Cilk Plus. gcc/cp/ PR c++/69363 * cp-tree.h (finish_omp_clauses): Add new default argument. * parser.c (cp_parser_cilk_simd_all_clauses): Use finish_omp_clauses instead of c_finish_cilk_clauses. * semantics.c (finish_omp_clauses): Add new argument. Allow floating-point variables in the linear clause for Cilk Plus. gcc/testsuite/ PR c++/69363 * c-c++-common/cilk-plus/PS/clauses3.c: Adjust dg-error string. * c-c++-common/cilk-plus/PS/clauses4.c: New test. * c-c++-common/cilk-plus/PS/pr69363.c: New test. Added: trunk/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses4.c trunk/gcc/testsuite/c-c++-common/cilk-plus/PS/pr69363.c Modified: trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/c-cilkplus.c trunk/gcc/c-family/c-common.h trunk/gcc/c/ChangeLog trunk/gcc/c/c-parser.c trunk/gcc/c/c-tree.h trunk/gcc/c/c-typeck.c trunk/gcc/cp/ChangeLog trunk/gcc/cp/cp-tree.h trunk/gcc/cp/parser.c trunk/gcc/cp/semantics.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/c-c++-common/cilk-plus/PS/clauses3.c
[Bug middle-end/70506] New: [CilkPlus] error: location references block not in block tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70506

            Bug ID: 70506
           Summary: [CilkPlus] error: location references block not in
                    block tree
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: ice-checking
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org
  Target Milestone: ---

$ cat test.c
void foo ()
{
  int i, x = 0;

  _Cilk_for (i = 0; i < 100; i++)
    x++;

  _Cilk_for (i = 0; i < 100; i++)
    x++;
}

$ gcc -c -fcilkplus test.c
test.c: In function ‘foo._cilk_for_fn.0’:
test.c:10:1: error: location references block not in block tree
 }
 ^
D.1952 = .omp_data_i->x;
test.c:10:1: error: location references block not in block tree
D.1953 = D.1952 + 1;
test.c:10:1: error: location references block not in block tree
.omp_data_i->x = D.1953;
test.c:10:1: internal compiler error: verify_gimple failed
0xe31754 verify_gimple_in_cfg(function*, bool)
        gcc/tree-cfg.c:5125
0xcd17a8 execute_function_todo
        gcc/passes.c:1958
0xcd0941 do_per_function
        gcc/passes.c:1652
0xcd1984 execute_todo
        gcc/passes.c:2010
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

Here is DECL_INITIAL (fn->decl) block for foo._cilk_for_fn.0:

unit size align 32 symtab 0 alias set -1 canonical type 0x... precision 32 min max pointer_to_this > used SI file test.c line 3 col 10 size unit size align 32 context value-expr arg 0 nothrow arg 0 arg 1 > arg 1 >>>

However, similar OpenMP testcase works fine, because corresponding DECL_INITIAL (fn->decl) contains a subblock:

unit size align 32 symtab 0 alias set -1 canonical type 0x... precision 32 min max pointer_to_this > used SI file test.c line 3 col 10 size unit size align 32 context value-expr arg 0 nothrow arg 0 arg 1 > arg 1 >> subblocks used SI file test.c line 3 col 7 size unit size align 32 context > supercontext >>
[Bug testsuite/64177] Various cilk+ testsuite failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64177

--- Comment #2 from iverbin at gcc dot gnu.org ---
Actually, only 3 tests require 2+ workers (they fail with export CILK_NWORKERS=1):
FAIL: c-c++-common/cilk-plus/CK/spawning_arg.c
FAIL: c-c++-common/cilk-plus/CK/steal_check.c
FAIL: g++.dg/cilk-plus/CK/catch_exc.cc

It's unclear what happens with others. Maybe it's a bug in libcilkrts.
[Bug testsuite/64177] Various cilk+ testsuite failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64177

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iverbin at gcc dot gnu.org,
                   |                            |tschwinge at gcc dot gnu.org

--- Comment #1 from iverbin at gcc dot gnu.org ---
This issue was discussed here:
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01817.html
We need __cilkrts_set_param ("nworkers", "2"); in such tests.
[Bug driver/68463] Offloading fails when some objects are compiled with LTO and some without
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from iverbin at gcc dot gnu.org ---
Fixed in GCC 6.
[Bug driver/68463] Offloading fails when some objects are compiled with LTO and some without
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463 --- Comment #4 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Thu Feb 25 12:23:52 2016 New Revision: 233712 URL: https://gcc.gnu.org/viewcvs?rev=233712&root=gcc&view=rev Log: gcc/ PR driver/68463 * config/gnu-user.h (CRTOFFLOADBEGIN): Define. Add crtoffloadbegin.o if offloading is enabled and -fopenacc or -fopenmp is specified. (CRTOFFLOADEND): Likewise. (GNU_USER_TARGET_STARTFILE_SPEC): Add CRTOFFLOADBEGIN. (GNU_USER_TARGET_ENDFILE_SPEC): Add CRTOFFLOADEND. * lto-wrapper.c (offloadbegin, offloadend): Remove static vars. (offload_objects_file_name): New static var. (tool_cleanup): Remove offload_objects_file_name file. (find_offloadbeginend): Replace with ... (find_crtoffloadtable): ... this. (run_gcc): Remove offload_argc and offload_argv. Get offload_objects_file_name from -foffload-objects=... option. Read names of object files with offload from this file, pass them to compile_images_for_offload_targets. Don't call find_offloadbeginend and don't pass offloadbegin and offloadend to the linker. Don't pass offload non-LTO files to the linker, because now they're not claimed. libgcc/ PR driver/68463 * Makefile.in (crtoffloadtable$(objext)): New rule. * configure.ac (extra_parts): Add crtoffloadtable$(objext) if enable_offload_targets is not empty. * configure: Regenerate. * offloadstuff.c: Move __OFFLOAD_TABLE__ from crtoffloadend to crtoffloadtable. libgomp/ PR driver/68463 * testsuite/libgomp.oacc-c-c++-common/parallel-dims-2.c: Remove. lto-plugin/ PR driver/68463 * lto-plugin.c (struct plugin_offload_file): New. (offload_files): Change type. (offload_files_last, offload_files_last_obj): New. (offload_files_last_lto): New. (free_2): Adjust accordingly. (all_symbols_read_handler): Don't add offload files to lto_arg_ptr. Don't call free_1 for offload_files. Write names of object files with offloading to the temporary file. Add new option to lto_arg_ptr. 
(claim_file_handler): Don't claim file if it contains offload sections without LTO sections. If it contains offload sections, add to the list. Removed: trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/gnu-user.h trunk/gcc/lto-wrapper.c trunk/libgcc/ChangeLog trunk/libgcc/Makefile.in trunk/libgcc/configure trunk/libgcc/configure.ac trunk/libgcc/offloadstuff.c trunk/libgomp/ChangeLog trunk/lto-plugin/ChangeLog trunk/lto-plugin/lto-plugin.c
[Bug c++/69363] ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING

--- Comment #5 from iverbin at gcc dot gnu.org ---
Waiting for stage1: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01183.html
[Bug c++/69363] ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #3 from iverbin at gcc dot gnu.org ---
It turns out that there is at least one difference between Cilk Plus and OpenMP finalization - Cilk Plus allows float and double variables in linear clause, while OpenMP doesn't. I'm going to adjust finish_omp_clauses accordingly.
[Bug c++/69363] ICE when doing a pragma simd reduction with max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69363

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iverbin at gcc dot gnu.org

--- Comment #2 from iverbin at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #1)
> The bug is that cp_parser_cilk_simd_all_clauses and
> c_parser_cilk_simd_all_clauses calls c_finish_cilk_clauses rather than the
> OpenMP clauses finalization routines in each of the FEs, perhaps with some
> argument that would say it wants Cilk+ semantics instead of OpenMP.
> That way, it misses lots of important actions that need to be performed on
> the clauses.

Yep, this patch fixes original testcase and 23 fails in Cilk Plus Conformance Suite v1.2.1 without new regressions.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 6f47edf..9e12a96 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -37621,7 +37621,7 @@ cp_parser_cilk_simd_all_clauses (cp_parser *parser, cp_token *pragma_token)
   if (clauses == error_mark_node)
     return error_mark_node;
   else
-    return c_finish_cilk_clauses (clauses);
+    return finish_omp_clauses (clauses, false);
 }
 
 /* Main entry-point for parsing Cilk Plus <#pragma simd> for loops.  */
[Bug libgomp/69607] undefined reference to MAIN__._omp_fn.0 in atomic_capture-1.f with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69607

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-02-04
     Ever confirmed|0                           |1

--- Comment #9 from iverbin at gcc dot gnu.org ---
(In reply to vries from comment #8)
> (In reply to iverbin from comment #7)
> > I believe we should drop support of offloading without linker plugin.
> 
> Same failures occur with -fuse-linker-plugin though.

Ok, I see. Maybe we could promote all *omp_fn* to global? It should fix "undefined reference" from offload table in one partition to *omp_fn* in another.
[Bug libgomp/69607] undefined reference to MAIN__._omp_fn.0 in atomic_capture-1.f with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69607 --- Comment #7 from iverbin at gcc dot gnu.org --- I believe we should drop support of offloading without linker plugin.
[Bug fortran/69090] Allocatable arrays mishandled in 'omp declare target'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69090

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iverbin at gcc dot gnu.org

--- Comment #4 from iverbin at gcc dot gnu.org ---
(In reply to Alexander Monakov from comment #0)
> Compiling and running the following testcase with non-shared-memory
> accelerator segfaults in the target region, because only pointed-to data of
> the allocatable array is copied, but not the array structure (.data, .offset
> fields) itself. From my reading of the OpenMP spec, allocatable arrays are
> not explicitely allowed in the 'declare target' directive, so the code is
> ill-formed. However, no diagnostic is issued, and generally I don't know
> what GCC intends to do here.

Do you mean *global* allocatable arrays, or do locals fail too?
We discussed it a bit here: https://gcc.gnu.org/ml/gcc/2015-03/msg00238.html
There is also a discussion in the OpenMP ML about clarifying the spec. As per my understanding, global allocatable arrays are not allowed by OpenMP 4.5, so it would be nice to have a diagnostic instead of a segfault.
[Bug driver/68463] Offloading fails when some objects are compiled with LTO and some without
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-11-30
                 CC|                            |bernds at gcc dot gnu.org,
                   |                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from iverbin at gcc dot gnu.org ---
> I presume the same issue exists for GCC 5.

Yes.

It seems that we can fix this issue by passing a new option to lto-wrapper, which will contain a list of object files with offload (or a filename with the list). It will also allow removing some hacky code from lto-wrapper, like this comparison:
  if (strncmp (argv[i], "-fresolution=", sizeof ("-fresolution=") ...

E.g., if there are 4 objects:
 * obj1.o - non-LTO, offload;
 * obj2.o - LTO, non-offload;
 * obj3.o - non-LTO, non-offload;
 * obj4.o - LTO, offload;
then the linker plugin will claim only obj2.o and obj4.o, as was intended. So it will call lto-wrapper, passing obj2.o and obj4.o as argv. But additionally the linker plugin will pass something like: -foffload_objects="obj1.o,obj4.o". lto-wrapper will perform LTO on the objects from argv as usual, and additionally compile target images using offload IR from obj1.o and obj4.o.

The tables should also match, because the host table will consist of: pieces from all LTO objects with offload + pieces from non-LTO objects with offload. We just need to reorder offload_objects correspondingly before passing them to the target compiler (obj4.o,obj1.o).

However, in this case obj1.o and obj4.o cannot both be surrounded by crtoffload{begin,end}.o, because lto-wrapper cannot place crtoffload* before or after obj1.o, since it is unclaimed. But I guess this can be fixed by something like a linker script, which will place sections from crtoffload* at the beginning/end of the final joint section.
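
The claiming and ordering rule proposed in this comment can be sketched as a small C model. It is an illustration of the proposal only, not code from lto-plugin or lto-wrapper: `struct obj`, `plugin_claims` and `offload_list` are hypothetical names, and the model assumes the target compiler must see LTO objects with offload first, followed by non-LTO objects with offload, as the comment describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an input object file.  */
struct obj
{
  const char *name;
  bool has_lto;      /* contains LTO IR */
  bool has_offload;  /* contains offload IR */
};

/* Proposed rule: the linker plugin claims only objects with LTO IR.  */
bool plugin_claims(const struct obj *o)
{
  return o->has_lto;
}

/* Collect the names of objects with offload IR into out[], LTO objects
   first, then non-LTO ones, to match the layout of the host table.
   Returns the number of names written; out[] needs room for n entries.  */
size_t offload_list(const struct obj *objs, size_t n, const char **out)
{
  assert(objs != NULL && out != NULL);
  size_t k = 0;
  for (int pass = 0; pass < 2; pass++)
    for (size_t i = 0; i < n; i++)
      if (objs[i].has_offload && objs[i].has_lto == (pass == 0))
        out[k++] = objs[i].name;
  return k;
}
```

For the four objects from the example, only obj2.o and obj4.o are claimed, and the offload list comes out as obj4.o followed by obj1.o.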
[Bug other/68463] New: Offloading fails when some objects are compiled with LTO and some without
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68463

            Bug ID: 68463
           Summary: Offloading fails when some objects are compiled with
                    LTO and some without
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: openacc, openmp
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org
  Target Milestone: ---

The reproducer:

$ cat foo.c
void foo ()
{
  #pragma omp target
  ;
}
$ cat bar.c
void bar ()
{
  #pragma omp target
  ;
}
$ cat main.c
extern void foo ();
extern void bar ();

int main ()
{
  foo ();
  bar ();
  return 0;
}
$ gcc -c -fopenmp -flto foo.c
$ gcc -c -fopenmp bar.c main.c
$ gcc -fopenmp foo.o bar.o main.o
main.o: In function `main':
main.c:(.text+0x14): undefined reference to `bar'
collect2: error: ld returned 1 exit status

This happens because the linker plugin in claim_file_handler claims bar.o, and linker just drops it, because linker considers bar.o as LTO object. Without offload it claims only LTO objects, but now it claims objects with any IR. (Yes, offloading misuses lto-plugin and lto-wrapper a bit.)

And even worse, it fails with -foffload=disable, because we decided to stream out offload IR unconditionally: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00628.html
So -foffload=disable only disables compilation of target images in lto-wrapper, but objects are handled by linker plugin before that.

The first solution that comes to mind - do not claim objects, which contain offload IR without LTO IR. But this will cause run-time error: "libgomp: Cannot map target functions or variables (expected 1, have 2)", because lto-wrapper will surround only *.ltrans.o (derived from foo.o) with crtoffload{begin,end}.o; and bar.o will be added at the end of the list of objects from lto-wrapper. But we need this order to get correct host table: "crtoffloadbegin.o, *.ltrans.o, bar.o, crtoffloadend.o".
Here is a bit more about tables: https://gcc.gnu.org/wiki/Offloading#Address_mapping_tables

Or maybe we should implement a new linker offload-plugin with its own offload-wrapper, but I don't know how difficult it would be to support 2 plugins in the linkers, and it really doesn't solve the issue with crtoffload{begin,end}.o placement.

Or maybe just print an error during linking that offloading doesn't support mixing LTO and non-LTO objects (even if some of them don't have offload)?
[Bug other/67652] liboffloadmic/runtime/offload_engine.cpp:176: strange expression in sizeof ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67652

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from iverbin at gcc dot gnu.org ---
Fixed in trunk. GCC 5 doesn't have such an issue.
[Bug other/67652] liboffloadmic/runtime/offload_engine.cpp:176: strange expression in sizeof ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67652

--- Comment #3 from iverbin at gcc dot gnu.org ---
Author: iverbin
Date: Mon Sep 28 16:10:16 2015
New Revision: 228210

URL: https://gcc.gnu.org/viewcvs?rev=228210&root=gcc&view=rev
Log:
        PR other/67652
liboffloadmic/
        * runtime/offload_engine.cpp (Engine::init_process): Fix sizeof.

Modified:
    trunk/liboffloadmic/ChangeLog
    trunk/liboffloadmic/runtime/offload_engine.cpp
[Bug c/67652] liboffloadmic/runtime/offload_engine.cpp:176: strange expression in sizeof ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67652

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |iverbin at gcc dot gnu.org

--- Comment #2 from iverbin at gcc dot gnu.org ---
Thanks, I will fix it.
[Bug libgomp/66950] FAIL: libgomp.fortran/examples-4/simd-7.f90 -O0 execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66950

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |iverbin at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #2 from iverbin at gcc dot gnu.org ---
Fixed.
[Bug libgomp/66950] FAIL: libgomp.fortran/examples-4/simd-7.f90 -O0 execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66950 --- Comment #1 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Wed Jul 22 17:19:31 2015 New Revision: 226080 URL: https://gcc.gnu.org/viewcvs?rev=226080&root=gcc&view=rev Log: 2015-07-22 Maxim Blumenthal PR libgomp/66950 * testsuite/libgomp.c/examples-4/simd-7.c (N): Change to 30 from 45. (fib_ref): New function. (fib): Correct corner cases in the recursion. (main): Replace the non-simd loop with fib_ref call. * testsuite/libgomp.fortran/examples-4/simd-7.f90: (fib_ref): New subroutine. (fibonacci): Lower the parameter N to 30. Correct accordingly check for the last array element value. Replace the non-simd loop with fib_ref call. Remove redundant b_ref array. Remove the comparison of the last array element with according Fibonacci sequence element. (fib): Correct corner cases in the recursion. Modified: trunk/libgomp/ChangeLog trunk/libgomp/testsuite/libgomp.c/examples-4/simd-7.c trunk/libgomp/testsuite/libgomp.fortran/examples-4/simd-7.f90
[Bug libgomp/65338] Offloading from DSO is broken after OpenACC merge to trunk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65338

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from iverbin at gcc dot gnu.org ---
Fixed in trunk (r221878).
[Bug libgomp/65338] New: [5 Regression] Offloading from DSO is broken after OpenACC merge to trunk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65338

            Bug ID: 65338
           Summary: [5 Regression] Offloading from DSO is broken after
                    OpenACC merge to trunk
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org, kyukhin at gcc dot gnu.org,
                    tschwinge at gcc dot gnu.org

The testcase:

+ test.c: +
int f_aaa (void);
int main ()
{
  int x = f_aaa ();
  #pragma omp target
    x++;
  return x;
}
+ libaaa.c: +
int f_aaa (void)
{
  int x = 0;
  #pragma omp target
    x = 10;
  return x;
}
++

$ gcc -fopenmp -shared -fPIC libaaa.c -o libaaa.so
$ gcc -fopenmp -L. -laaa test.c
$ ./a.out
libgomp: Target function wasn't mapped

The problem is caused by this change:

-gomp_register_images_for_device (struct gomp_device_descr *device)
+gomp_register_image_for_device (struct gomp_device_descr *device,
+                                struct offload_image_descr *image)
 {
-  int i;
-  for (i = 0; i < num_offload_images; i++)
+  if (!device->offload_regions_registered
+      && (device->type == image->type
+          || device->type == OFFLOAD_TARGET_TYPE_HOST))
     {
-      struct offload_image_descr *image = &offload_images[i];
-      if (image->type == device->type)
-        device->register_image_func (image->host_table, image->target_data);
+      device->register_image_func (image->host_table, image->target_data);
+      device->offload_regions_registered = true;
     }
 }

We should at least remove device->offload_regions_registered, or rework loading/registration to support dlopen'ed libraries.

Related mail thread: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01455.html
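
The effect of the offload_regions_registered flag can be reproduced with a reduced model in C. This is a sketch under stated assumptions, not libgomp code: `struct model_device`, `struct model_image` and both functions are hypothetical, the HOST special case from the diff is left out, and "registering" an image is reduced to incrementing a counter. The point is that once the flag is set by the first image (for example the executable), a second image of the same type (for example the DSO) is silently skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the libgomp descriptors.  */
struct model_image
{
  int type;  /* offload target type of the image */
};

struct model_device
{
  int type;
  bool offload_regions_registered;
  int images_registered;  /* counts calls to the register hook */
};

/* Mirrors the logic of the changed function: the flag makes every
   image after the first one a no-op.  */
void register_image_flagged(struct model_device *dev,
                            const struct model_image *image)
{
  assert(dev != NULL && image != NULL);
  if (!dev->offload_regions_registered && dev->type == image->type)
    {
      dev->images_registered++;
      dev->offload_regions_registered = true;
    }
}

/* Variant with the flag removed, as suggested in the report: every
   matching image is registered.  */
void register_image_unflagged(struct model_device *dev,
                              const struct model_image *image)
{
  assert(dev != NULL && image != NULL);
  if (dev->type == image->type)
    dev->images_registered++;
}
```

Registering two images of the same type leaves the flagged device with one registered image and the unflagged device with two, which matches the "Target function wasn't mapped" symptom for the image that was skipped.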
[Bug lto/63923] FAIL: libgomp.c/examples-4/e.50.1.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63923

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from iverbin at gcc dot gnu.org ---
Fixed.
[Bug testsuite/64605] [5 Regression] ERROR: (DejaGnu) proc "libatomic_target_compile lto1738.c lto1738.o object additional_flags=-flto" does not exist.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64605

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |mikestump at comcast dot net
         Resolution|---                         |FIXED

--- Comment #3 from iverbin at gcc dot gnu.org ---
Fixed by r219722
[Bug testsuite/64605] [5 Regression] ERROR: (DejaGnu) proc "libatomic_target_compile lto1738.c lto1738.o object additional_flags=-flto" does not exist.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64605

--- Comment #2 from iverbin at gcc dot gnu.org ---
Author: iverbin
Date: Fri Jan 16 11:29:54 2015
New Revision: 219722

URL: https://gcc.gnu.org/viewcvs?rev=219722&root=gcc&view=rev
Log:
        PR testsuite/64605
libatomic/
        * testsuite/lib/libatomic.exp: Do not load gcc-dg.exp.
        * testsuite/libatomic.c/c.exp: Load gcc-dg.exp.

Modified:
    trunk/libatomic/ChangeLog
    trunk/libatomic/testsuite/lib/libatomic.exp
    trunk/libatomic/testsuite/libatomic.c/c.exp
[Bug testsuite/64605] New: [5 Regression] ERROR: (DejaGnu) proc "libatomic_target_compile lto1738.c lto1738.o object additional_flags=-flto" does not exist.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64605

            Bug ID: 64605
           Summary: [5 Regression] ERROR: (DejaGnu) proc
                    "libatomic_target_compile lto1738.c lto1738.o object
                    additional_flags=-flto" does not exist.
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iverbin at gcc dot gnu.org

A fix for check_effective_target_lto (r219606) caused an ERROR in libatomic testsuite.

To reproduce: make check-target-libatomic

ERROR: (DejaGnu) proc "libatomic_target_compile lto4486.c lto4486.o object additional_flags=-flto" does not exist.
The error code is NONE
The info on the error is:
invalid command name "libatomic_target_compile"
    while executing
"::tcl_unknown libatomic_target_compile lto4486.c lto4486.o object additional_flags=-flto"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"
[Bug middle-end/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412

--- Comment #19 from iverbin at gcc dot gnu.org ---
(In reply to iverbin from comment #18)
> It seems that the problem with offload is that -fPIC option is passed to the
> offload compiler, but not passed to the host compiler. If I add -fPIC to the
> host compiler as well, everything is ok.
> 
> I don't know how -fPIC option affects IR before streaming out,
> -fdump-tree-optimized are identical for pic/nonpic cases, but
> .gnu.offload_lto_.decls sections are different. However debug_tree
> (vnode->decl) for "G" in ipa_write_summaries are identical for pic/nonpic
> cases.
> 
> So, the question is, how to figure out what is different in G's declaration
> in IR, and how it can affect further expansion?

The regression is caused by LTO streaming of TARGET_OPTIMIZE_NODE:
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00376.html
[Bug middle-end/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412

--- Comment #18 from iverbin at gcc dot gnu.org ---
(In reply to Uroš Bizjak from comment #17)
> The x86 backend did survive many years just fine, so I think offload should
> be fixed to follow approach that generic middle-end takes. The testcase I
> posted expands without problems; in this case middle-end knows that symbol
> address has to be moved to a register. It looks like offload bypasses
> generic expansion functions (that would magically fix this issue).

It seems that the problem with offload is that -fPIC option is passed to the offload compiler, but not passed to the host compiler. If I add -fPIC to the host compiler as well, everything is ok.

Offload RTL for pic host, pic offload:

(insn 8 6 9 4 (set (reg:DI 88)
        (mem/u/c:DI (const:DI (unspec:DI [
                        (symbol_ref:DI ("G") )
                    ] UNSPEC_GOTPCREL)) [0 S8 A8])) addr.c:6 -1
     (nil))

Offload RTL for nonpic host, pic offload:

(insn 8 6 9 4 (set (reg:CC 17 flags)
        (compare:CC (mem/f/c:DI (plus:DI (reg/f:DI 82 virtual-stack-vars)
                    (const_int -8 [0xfff8])) [0 p+0 S8 A64])
            (symbol_ref:DI ("G") ))) addr.c:6 -1
     (nil))

I don't know how -fPIC option affects IR before streaming out, -fdump-tree-optimized are identical for pic/nonpic cases, but .gnu.offload_lto_.decls sections are different. However debug_tree (vnode->decl) for "G" in ipa_write_summaries are identical for pic/nonpic cases.

So, the question is, how to figure out what is different in G's declaration in IR, and how it can affect further expansion?
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #11 from iverbin at gcc dot gnu.org --- Created attachment 34360 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34360&action=edit pr64412_2.s
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #10 from iverbin at gcc dot gnu.org --- Created attachment 34359 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34359&action=edit pr64412_2.c
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #9 from iverbin at gcc dot gnu.org --- (In reply to H.J. Lu from comment #8) > Created attachment 34357 [details] > A patch > > Can you try this? Thank you, e.53.5.c now passes. However, for-3.c and for-11.C still fail with another unrecognizable insn. I have attached a reduced testcase (pr64412_2).
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #6 from iverbin at gcc dot gnu.org --- Created attachment 34351 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34351&action=edit pr64412.s
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #5 from iverbin at gcc dot gnu.org --- Created attachment 34350 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34350&action=edit Source code
[Bug target/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #4 from iverbin at gcc dot gnu.org --- (In reply to H.J. Lu from comment #3) > (In reply to iverbin from comment #2) > > (In reply to H.J. Lu from comment #1) > > > (In reply to iverbin from comment #0) > > > > To reproduce using Intel Xeon Phi emulation: > > > > 1. Build offload and host compilers as described in > > > > https://gcc.gnu.org/wiki/Offloading#How_to_try_offloading_enabled_GCC > > > > 2. Run make check-target-libgomp RUNTESTFLAGS="c.exp=e.53.5.c" > > > > > > Can you create a standalone testcase for the Intel Xeon Phi offload > > > cross compiler? It will be easier to debug. > > > > The offload model in GCC implies 2 compilers: one produces IR for OpenMP > > target regions, and another compiles this IR for Intel Xeon Phi. > > There is no single compiler, which could stream offload IR out, then stream > > it in, and then compile. > > I can reduce e.53.5.c testcase, not sure whether this is helpful. > > Can you use "gcc -v -save-temps" to see what is passed to the offload > compiler and feed them to the offload compiler directly? Yes, this is possible. However, the function preload_common_nodes, modified in r218767, is used for both IN/OUT streaming, therefore the IR should be produced and consumed by compilers built from the same sources. Here are the reduced testcase and corresponding IR for: gcc -fopenmp -O1 -S pr64412.c To reproduce the error: 1. Configure and make gcc with: --enable-as-accelerator-for=x86_64-unknown-linux --host=x86_64-intelmicemul-linux --build=x86_64-intelmicemul-linux --target=x86_64-intelmicemul-linux 2. Run: as pr64412.s -o pr64412.o && x86_64-unknown-linux-accel-x86_64-intelmicemul-linux-gnu-gcc -xlto -fopenmp -O1 -shared -fPIC pr64412.o
[Bug lto/64412] [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 --- Comment #2 from iverbin at gcc dot gnu.org --- (In reply to H.J. Lu from comment #1) > (In reply to iverbin from comment #0) > > To reproduce using Intel Xeon Phi emulation: > > 1. Build offload and host compilers as described in > > https://gcc.gnu.org/wiki/Offloading#How_to_try_offloading_enabled_GCC > > 2. Run make check-target-libgomp RUNTESTFLAGS="c.exp=e.53.5.c" > > Can you create a standalone testcase for the Intel Xeon Phi offload > cross compiler? It will be easier to debug. The offload model in GCC implies 2 compilers: one produces IR for OpenMP target regions, and another compiles this IR for Intel Xeon Phi. There is no single compiler, which could stream offload IR out, then stream it in, and then compile. I can reduce e.53.5.c testcase, not sure whether this is helpful.
[Bug lto/63923] FAIL: libgomp.c/examples-4/e.50.1.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63923

iverbin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iverbin at gcc dot gnu.org

--- Comment #2 from iverbin at gcc dot gnu.org ---
This issue is fixed by r217773
[Bug lto/64412] New: [regression] ICE in offload compiler: in extract_insn, at recog.c:2327
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64412 Bug ID: 64412 Summary: [regression] ICE in offload compiler: in extract_insn, at recog.c:2327 Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: iverbin at gcc dot gnu.org CC: bernds at gcc dot gnu.org, hubicka at gcc dot gnu.org, kyukhin at gcc dot gnu.org, tschwinge at gcc dot gnu.org After fixing PR lto/64043 (r218767) the offload target compiler began crashing while reading intermediate bytecode. FAIL: libgomp.c/examples-4/e.53.5.c (internal compiler error) FAIL: libgomp.c/for-3.c (internal compiler error) FAIL: libgomp.c++/for-11.C (internal compiler error) FAIL: libgomp.fortran/examples-4/e.53.3.f90 -O2 (internal compiler error) etc. To reproduce using Intel Xeon Phi emulation: 1. Build offload and host compilers as described in https://gcc.gnu.org/wiki/Offloading#How_to_try_offloading_enabled_GCC 2. Run make check-target-libgomp RUNTESTFLAGS="c.exp=e.53.5.c" libgomp/testsuite/libgomp.c/examples-4/e.53.5.c: In function 'accum._omp_fn.1': libgomp/testsuite/libgomp.c/examples-4/e.53.5.c:53:13: error: unrecognizable insn: #pragma omp parallel for reduction(+:tmp) ^ (insn 176 66 177 4 (set (reg:DI 0 ax) (symbol_ref:DI ("Q") )) -1 (nil)) libgomp/testsuite/libgomp.c/examples-4/e.53.5.c:53:13: internal compiler error: in extract_insn, at recog.c:2327 0xbadaf7 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) gcc/rtl-error.c:110 0xbadb38 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) gcc/rtl-error.c:118 0xb60522 extract_insn(rtx_insn*) gcc/recog.c:2327 0xb60221 extract_constrain_insn(rtx_insn*) gcc/recog.c:2228 0xb6e973 copyprop_hardreg_forward_1 gcc/regcprop.c:773 0xb701db execute gcc/regcprop.c:1279 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. 
mkoffload-intelmic: fatal error: x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc returned 1 exit status compilation terminated. lto-wrapper: fatal error: accel/x86_64-intelmicemul-linux-gnu/mkoffload returned 1 exit status compilation terminated. ld: lto-wrapper failed collect2: error: ld returned 1 exit status
[Bug regression/63868] [5 Regression] Multiple failures in the libgomp test suite between r217458 and r217501.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63868 --- Comment #6 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Wed Nov 19 13:52:12 2014 New Revision: 217773 URL: https://gcc.gnu.org/viewcvs?rev=217773&root=gcc&view=rev Log: PR regression/63868 * cgraph.c (cgraph_node::create): Guard g->have_offload with ifdef ENABLE_OFFLOADING. * omp-low.c (create_omp_child_function): Likewise. (expand_omp_target): Guard node->mark_force_output and offload_funcs with ifdef ENABLE_OFFLOADING. * varpool.c (varpool_node::get_create): Guard g->have_offload and offload_vars with ifdef ENABLE_OFFLOADING. Modified: trunk/gcc/ChangeLog trunk/gcc/cgraph.c trunk/gcc/omp-low.c trunk/gcc/varpool.c
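The log above guards offload bookkeeping with `ENABLE_OFFLOADING`, so that a compiler built without offload support never records offload state. A minimal sketch of that guarding pattern follows. Everything in it except the macro name is a hypothetical stand-in (`struct graph_state`, its fields, and `note_offload_function` are illustrations, not the real cgraph or varpool code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the global state that the real code keeps
   in the symbol table; only the guarding pattern is the point here.  */
struct graph_state
{
  bool have_offload;         /* set once any offloadable entity is seen */
  unsigned n_offload_funcs;  /* number of functions queued for offload */
};

/* Record that an offloadable function was created.  When the compiler
   is configured without offloading, this is deliberately a no-op, so
   later passes never see stale offload state.  */
static void
note_offload_function (struct graph_state *g)
{
  assert (g != NULL);
#ifdef ENABLE_OFFLOADING
  g->have_offload = true;
  g->n_offload_funcs++;
#else
  (void) g;  /* nothing to record in a non-offloading build */
#endif
}
```

Built without `-DENABLE_OFFLOADING`, a call to `note_offload_function` leaves the state untouched; built with it, the flag is set and the counter incremented.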
[Bug regression/63868] [5 Regression] Multiple failures in the libgomp test suite between r217458 and r217501.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63868

iverbin at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2014-11-17
          Component|libgomp                       |regression
           Assignee|unassigned at gcc dot gnu.org |iverbin at gcc dot gnu.org
   Target Milestone|---                           |5.0
     Ever confirmed|0                             |1
[Bug bootstrap/63853] [5.0 Regression] The use of strchrnul breaks bootstrap on x86_64-apple-darwin14.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63853 --- Comment #12 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Thu Nov 13 22:06:15 2014 New Revision: 217524 URL: https://gcc.gnu.org/viewcvs?rev=217524&root=gcc&view=rev Log: 2014-11-13 Dominique Dhumieres PR bootstrap/63853 gcc/ * gcc.c (handle_foffload_option): Replace strchrnul with strchr. * lto-wrapper.c (parse_env_var, append_offload_options): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/gcc.c trunk/gcc/lto-wrapper.c
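`strchrnul` is a GNU extension, which is why relying on it can break the build on hosts such as Darwin. The usual portable substitute combines `strchr` with `strlen`. The sketch below shows that idiom only; the helper names `find_or_end` and `first_token_length` are hypothetical and are not taken from gcc.c or lto-wrapper.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: behaves like the GNU extension strchrnul while
   using only ISO C functions.  Returns a pointer to the first
   occurrence of C in S, or to the terminating NUL if C is absent.  */
static const char *
find_or_end (const char *s, int c)
{
  assert (s != NULL);
  const char *p = strchr (s, c);
  return p ? p : s + strlen (s);
}

/* Length of the first comma-separated token in S.  Works whether or
   not S contains a comma, because find_or_end never returns NULL.  */
static size_t
first_token_length (const char *s)
{
  return (size_t) (find_or_end (s, ',') - s);
}
```

For example, `first_token_length ("nvptx,intelmic")` yields the length of `nvptx`, and a string without a comma yields its full length.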
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #37 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Thu Oct 23 16:52:11 2014 New Revision: 216596 URL: https://gcc.gnu.org/viewcvs?rev=216596&root=gcc&view=rev Log: PR target/63534 PR target/63618 gcc/ * cse.c (delete_trivially_dead_insns): Consider PIC register is used while it is pseudo. * dce.c (deletable_insn_p): Likewise. gcc/testsuite/ * gcc.target/i386/pr63618.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr63618.c Modified: trunk/gcc/ChangeLog trunk/gcc/cse.c trunk/gcc/dce.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/63618] CSE at IRA pass delete SET_GOT which is used later
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63618 --- Comment #5 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Thu Oct 23 16:52:11 2014 New Revision: 216596 URL: https://gcc.gnu.org/viewcvs?rev=216596&root=gcc&view=rev Log: PR target/63534 PR target/63618 gcc/ * cse.c (delete_trivially_dead_insns): Consider PIC register is used while it is pseudo. * dce.c (deletable_insn_p): Likewise. gcc/testsuite/ * gcc.target/i386/pr63618.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr63618.c Modified: trunk/gcc/ChangeLog trunk/gcc/cse.c trunk/gcc/dce.c trunk/gcc/testsuite/ChangeLog
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307 --- Comment #6 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Mon Oct 20 15:22:09 2014 New Revision: 216483 URL: https://gcc.gnu.org/viewcvs?rev=216483&root=gcc&view=rev Log: PR c/63307 gcc/c-family/ * cilk.c: Include vec.h. (struct cilk_decls): New structure. (wrapper_parm_cb): Split this function to... (fill_decls_vec): ...this... (create_parm_list): ...and this. (compare_decls): New function. (for_local_cb): Remove. (wrapper_local_cb): Ditto. (build_wrapper_type): For now first traverse and fill vector of declarations then sort it and then deal with sorted vector. (cilk_outline): Ditto. (declare_one_free_variable): Ditto. Modified: trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/cilk.c
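The log above describes a three-step pattern: first collect the declarations into a vector, then sort it, then process the sorted vector. Iterating in the order of a stable key makes the result independent of container traversal order, which is the kind of property a -fcompare-debug bootstrap checks. The sketch below illustrates only the pattern. `struct decl_entry`, its `uid` field, and the function names are hypothetical stand-ins, not the real `cilk_decls` layout or `compare_decls` implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a collected declaration.  The uid is
   assumed to be unique and stable from one compilation to the next.  */
struct decl_entry
{
  unsigned uid;
  const char *name;
};

/* qsort comparator: ascending uid, written to avoid the overflow that
   a plain subtraction could cause.  */
static int
compare_entries (const void *a, const void *b)
{
  const struct decl_entry *x = (const struct decl_entry *) a;
  const struct decl_entry *y = (const struct decl_entry *) b;
  return (x->uid > y->uid) - (x->uid < y->uid);
}

/* Sort ENTRIES in place so that later processing visits them in a
   deterministic order, regardless of how they were collected.  */
static void
sort_entries (struct decl_entry *entries, size_t n)
{
  assert (entries != NULL || n == 0);
  if (n > 1)
    qsort (entries, n, sizeof *entries, compare_entries);
}
```

A caller would fill the array during traversal, call `sort_entries` once, and only then walk the array to emit parameters or fields.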
[Bug bootstrap/63536] [5 Regression] bootstrap failed when configured with --with-cpu=slm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63536 --- Comment #5 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Wed Oct 15 17:49:27 2014 New Revision: 216280 URL: https://gcc.gnu.org/viewcvs?rev=216280&root=gcc&view=rev Log: PR target/63536 gcc/java/ * lang.c (java_print_error_function): Add check on NULL function context. Modified: trunk/gcc/java/ChangeLog trunk/gcc/java/lang.c
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534 --- Comment #6 from iverbin at gcc dot gnu.org --- Author: iverbin Date: Tue Oct 14 16:26:57 2014 New Revision: 216208 URL: https://gcc.gnu.org/viewcvs?rev=216208&root=gcc&view=rev Log: PR target/63534 gcc/ * config/i386/i386.c (ix86_expand_split_stack_prologue): Make __morestack local. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c