Re: An unexplained 10% speed-up with gcc-4.8
On 23/12/13 12:40, Michael Veksler wrote:
> Hello All,
>
> When I started using gcc-4.8.1 I was glad to observe a substantial speed-up of about 10% in my code, as compared with gcc-4.7.3. Usually, switching to newer compilers has a relatively minor effect and definitely not a 10% speed-up.
>
> Was there anything significant in gcc-4.8.1 which may explain this dramatic improvement?
>
> My code is C++98 which is compiled with profile-driven optimizations, with -O2. My target is generic 32 bit Intel architecture. The result is run on Intel Xeon. The application is CPU intensive.

After some more testing, I found out that there is about 12% improvement even when comparing two executables compiled without profile-driven optimization. Unfortunately, the vast improvement is observed only for x86, not for x86-64. The speed-up on x86-64 is only 2-3%.

Michael.
error in converting macro IS_EXPR_CODE_CLASS() to function
IS_EXPR_CODE_CLASS() is called at 18 places within the gcc subdirectory. Except for expr_check(), tree_block() and tree_set_block(), all the callers pass an argument of type enum tree_code_class to IS_EXPR_CODE_CLASS(). These four callers (expr_check is overloaded) assign the value of TREE_CODE_CLASS() to a variable of type char const, and then pass it as the argument to IS_EXPR_CODE_CLASS().

For example, tree_block():

tree
tree_block (tree t)
{
  char const c = TREE_CODE_CLASS (TREE_CODE (t));

  if (IS_EXPR_CODE_CLASS (c))
    return LOCATION_BLOCK (t->exp.locus);
  gcc_unreachable ();
  return NULL;
}

Should the type of c be changed to const enum tree_code_class instead (and similarly in the other callers)? Also, TREE_CODE_CLASS()'s value is of type enum tree_code_class.

This gave a compile error:
invalid conversion from ‘char’ to ‘tree_code_class’
when I changed the macro IS_EXPR_CODE_CLASS() to the following function:

static inline bool
IS_EXPR_CODE_CLASS (enum tree_code_class code_class)
{
  return (code_class >= tcc_reference) && (code_class <= tcc_expression);
}

Thanks and Regards,
Prathamesh
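The conversion problem described in this message can be reproduced outside GCC. What follows is only a minimal sketch under stated assumptions: the enum is a cut-down stand-in modelled on tree_code_class, and the names tree_code_class_sketch, is_expr_code_class, caller_with_enum and caller_with_char are hypothetical, not GCC identifiers. It suggests why a char temporary stops compiling once the macro becomes a typed function in C++, and that keeping the enum type in the caller avoids the need for a cast.

```cpp
// Hypothetical stand-in for GCC's enum tree_code_class.
// Only the relative order of the enumerators matters here.
enum tree_code_class_sketch {
  tcc_exceptional,
  tcc_constant,
  tcc_type,
  tcc_declaration,
  tcc_reference,
  tcc_comparison,
  tcc_unary,
  tcc_binary,
  tcc_statement,
  tcc_vl_exp,
  tcc_expression
};

// Typed replacement for a range-check macro: true when the class lies
// in the closed range [tcc_reference, tcc_expression].
static inline bool
is_expr_code_class (tree_code_class_sketch code_class)
{
  return code_class >= tcc_reference && code_class <= tcc_expression;
}

// Caller that keeps the enum type, so no conversion is involved.
static inline bool
caller_with_enum (tree_code_class_sketch c_in)
{
  const tree_code_class_sketch c = c_in;
  return is_expr_code_class (c);
}

// Caller that stores the class in a char, as the callers named above do.
// Passing `c` directly would be rejected in C++ because there is no
// implicit conversion from char to an enumeration type; an explicit
// cast is required.
static inline bool
caller_with_char (tree_code_class_sketch c_in)
{
  const char c = c_in;  // enum -> char converts implicitly
  return is_expr_code_class (static_cast<tree_code_class_sketch> (c));
}
```

Changing the temporary's type, as the message proposes, corresponds to caller_with_enum; the cast in caller_with_char is the alternative if the char variables were to be kept.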
Re: Remove spam in GCC mailing list
Someone on Launchpad has suspended ~seotaewong40 as a spammer. Please enable the account ~seotaewong40. Log off in Launchpad and email information will be removed. Log on to Launchpad and email information will be added. Launchpad has a facility that replaces all email addresses with email address hidden. -- Tae-Wong Seo Korea, Republic of
[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573

Yvan Roux <yvan.roux at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yvan.roux at linaro dot org

--- Comment #5 from Yvan Roux <yvan.roux at linaro dot org> ---
> Yes, I've tried with foundation_v8, and not only is it extremely slow, but it also fails here.
> Compiling gcc in qemu takes 5 hours, but takes one week (someone told me) in the foundation model.
> For another simulator, do you have any suggestion?

Is the foundation model failing for the same reason here (i.e. not recognizing the cmeq instruction)?
[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569 --- Comment #8 from Bingfeng Mei bmei at broadcom dot com --- Sorry for the regression. The assertion happens if storing a constant value with negative step. Doing permutation of constant is not the best optimization here. So the easy way to fix is to skip vectorizing this statement in the same way as before the patch. Or maybe better way is to form a constant vector to store.
[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #9 from Bingfeng Mei <bmei at broadcom dot com> ---
A simple patch seems to be to bypass the permutation for a constant operand, as vec_oprnd is then a constant vector with identical elements.

Index: tree-vect-stmts.c
===================================================================
--- tree-vect-stmts.c   (revision 206176)
+++ tree-vect-stmts.c   (working copy)
@@ -5353,7 +5353,8 @@ vectorizable_store (gimple stmt, gimple_
              set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
                                      misalign);

-         if (negative)
+         if (negative
+             && !CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt)))
            {
              tree perm_mask = perm_mask_for_reverse (vectype);
              tree perm_dest
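The reasoning in this comment can be illustrated outside the vectorizer. This is a minimal sketch, assuming a plain std::array stands in for a vector register; the names vec4, splat, reverse_lanes and prepare_negative_step_store are illustrative and are not GCC APIs. Reversing a vector whose lanes all hold the same constant gives back the same vector, so the reverse permutation is unnecessary for such an operand.

```cpp
#include <algorithm>
#include <array>

// Illustrative stand-in for a 4-lane vector register.
using vec4 = std::array<int, 4>;

// Broadcast one constant into every lane (a "splat").
inline vec4 splat (int value)
{
  vec4 v;
  v.fill (value);
  return v;
}

// Reverse the lane order, as a store with a negative step requires.
inline vec4 reverse_lanes (vec4 v)
{
  std::reverse (v.begin (), v.end ());
  return v;
}

// Return the vector to store for a negative-step store. The permutation
// is skipped when the caller knows the operand is a uniform constant,
// because reversing it would not change it.
inline vec4 prepare_negative_step_store (const vec4 &operand,
                                         bool operand_is_constant)
{
  return operand_is_constant ? operand : reverse_lanes (operand);
}
```

For example, prepare_negative_step_store (splat (3), true) returns the splat unchanged, while a non-constant operand such as {1, 2, 3, 4} still goes through reverse_lanes.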
[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Bingfeng Mei from comment #9)
> Seems simple patch is to just bypass permutation for constant operand as
> vec_oprnd is a constant vector with identical elements.
>
> Index: tree-vect-stmts.c
> ===================================================================
> --- tree-vect-stmts.c   (revision 206176)
> +++ tree-vect-stmts.c   (working copy)
> @@ -5353,7 +5353,8 @@ vectorizable_store (gimple stmt, gimple_
>               set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
>                                       misalign);
>
> -         if (negative)
> +         if (negative
> +             && !CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt)))
>             {
>               tree perm_mask = perm_mask_for_reverse (vectype);
>               tree perm_dest

I think checking dt == vect_constant_def || dt == vect_external_def would be more appropriate. But, IMNSHO you don't need to check at the analysis phase !perm_mask_for_reverse (vectype) either.
[Bug bootstrap/59583] New: --enable-targets=all --with-cpu=broadwell isn't allowed to configure i686-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59583 Bug ID: 59583 Summary: --enable-targets=all --with-cpu=broadwell isn't allowed to configure i686-linux Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com I got # /export/gnu/import/git/gcc/configure --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --enable-shared i686-linux --prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local --enable-targets=all --with-cpu=broadwell --with-fpmath=sse ... # make bootstrap ... Unsupported CPU used in --with-cpu=broadwell, supported values: generic intel atom slm core2 corei7 corei7-avx nocona x86-64 bdver4 bdver3 bdver2 bdver1 btver2 btver1 amdfam10 barcelona k8 opteron athlon64 athlon-fx athlon64-sse3 k8-sse3 opteron-sse3 make[3]: *** [configure-stage1-gcc] Error 1 make[3]: Leaving directory `/export/build/gnu/gcc-test-32bit/build-i686-linux' make[2]: *** [stage1-bubble] Error 2 make[2]: Leaving directory `/export/build/gnu/gcc-test-32bit/build-i686-linux' make[1]: *** [bootstrap] Error 2
[Bug c++/59111] [4.9 Regression] [c++11] ICE on invalid usage of auto in return type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59111

--- Comment #3 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Author: mpolacek
Date: Mon Dec 23 12:14:56 2013
New Revision: 206177

URL: http://gcc.gnu.org/viewcvs?rev=206177&root=gcc&view=rev
Log:
    PR c++/59111
cp/
    * search.c (lookup_conversions): Return NULL_TREE if !CLASS_TYPE_P.
testsuite/
    * g++.dg/cpp0x/pr59111.C: New test.
    * g++.dg/cpp1y/pr59110.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/cpp0x/pr59111.C
    trunk/gcc/testsuite/g++.dg/cpp1y/pr59110.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/search.c
    trunk/gcc/testsuite/ChangeLog
[Bug lto/59582] LTO discards symbol that defined as weak elsewhere
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59582

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-12-23
     Ever confirmed|0                           |1

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
Please try binutils 2.24.
[Bug rtl-optimization/57422] [4.9 Regression] ICE: SIGSEGV in dominated_by_p with custom flags
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57422

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
Please add the testcase.
[Bug rtl-optimization/57422] [4.9 Regression] ICE: SIGSEGV in dominated_by_p with custom flags
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57422

Andrey Belevantsev <abel at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Andrey Belevantsev <abel at gcc dot gnu.org> ---
See the thread in gcc-patches: the test does not make sense as it is very sensitive to the scheduler's decisions -- even now I had to use the exact reported revision to get the failure. I have added extra asserts in a separate commit instead.
[Bug c++/59111] [4.9 Regression] [c++11] ICE on invalid usage of auto in return type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59111

Marek Polacek <mpolacek at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
           Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org

--- Comment #4 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Fixed.
[Bug middle-end/59584] New: [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584 Bug ID: 59584 Summary: [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily Product: gcc Version: 4.9.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: hp at gcc dot gnu.org CC: jakub at gcc dot gnu.org Host: x86_64-unknown-linux-gnu Target: cris-axis-elf This test previously passed, now it fails. A patch in the revision range (last_known_working:first_known_failing) 206008:206011 exposed or caused this regression. Since then it fails as follows: Running /tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/dg.exp ... ... FAIL: gcc.dg/pr50251.c (internal compiler error) FAIL: gcc.dg/pr50251.c (test for excess errors) In gcc.log: Executing on host: /tmp/hpautotest-gcc1/cris-elf/gccobj/gcc/xgcc -B/tmp/hpautotest-gcc1/cris-elf/gccobj/gcc/ /tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -S -isystem /tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib/targ-include -isystem /tmp/hpautotest-gcc1/gcc/newlib/libc/include -o pr50251.s(timeout = 300) /tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c: In function 'main': /tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c:18:1: internal compiler error: in fixup_args_size_notes, at expr.c:3978 0x698221 fixup_args_size_notes(rtx_def*, rtx_def*, int) /tmp/hpautotest-gcc1/gcc/gcc/expr.c:3978 0x67aef9 try_split(rtx_def*, rtx_def*, int) /tmp/hpautotest-gcc1/gcc/gcc/emit-rtl.c:3602 0x886e61 split_insn /tmp/hpautotest-gcc1/gcc/gcc/recog.c:2850 0x887104 split_all_insns() /tmp/hpautotest-gcc1/gcc/gcc/recog.c:2940 0x8871d2 rest_of_handle_split_after_reload /tmp/hpautotest-gcc1/gcc/gcc/recog.c:3889 0x8871d2 execute /tmp/hpautotest-gcc1/gcc/gcc/recog.c:3918 Please submit a full bug report, with preprocessed source if appropriate. 
(as the test-case is without preprocessing directives no such action necessary) A few more hints from gdb shows that gcc ties itself in a knot when splitting: (set (reg/f:SI 14 sp) (mem/f/c:SI (symbol_ref:SI (p))) into: (gdb) call debug_rtx_range (seq, 0) (insn 33 0 34 (set (reg/f:SI 14 sp) (symbol_ref:SI (p) var_decl 0x77eb2000 p)) -1 (nil)) (insn 34 33 0 (set (reg/f:SI 14 sp) (mem/f/c:SI (reg/f:SI 14 sp) [2 p+0 S4 A8])) -1 (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (nil))) (nil) While this define_split has a bug (by matching sp, allowing to set the stack temporarily in an inconsistent state by using sp as a temporary for the symbol), I doubt that's the actual bug causing internal inconsistency within gcc. Anyway: (gdb) r -fpreprocessed pr50251.i -melf -quiet -dumpbase pr50251.c -auxbase-strip pr50251.s -O2 -version -fno-diagnostics-show-caret -fdiagnostics-color=never -o pr50251.s GNU C (GCC) version 4.9.0 20131223 (experimental) [trunk revision 206176] (cris-elf) compiled by GNU C version 4.4.4 20100630 (Red Hat 4.4.4-10), GMP version 4.3.0, MPFR version 2.4.1, MPC version 0.8 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C (GCC) version 4.9.0 20131223 (experimental) [trunk revision 206176] (cris-elf) compiled by GNU C version 4.4.4 20100630 (Red Hat 4.4.4-10), GMP version 4.3.0, MPFR version 2.4.1, MPC version 0.8 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: cc4b37aa04284e09676146c2c3d35a20 Breakpoint 1, fancy_abort (file=0xd4f878 /tmp/hpautotest-gcc1/gcc/gcc/expr.c, line=3978, function=0xd50d50 fixup_args_size_notes) at /tmp/hpautotest-gcc1/gcc/gcc/diagnostic.c:1182 1182{ Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.x86_64 libgcc-4.4.4-10.fc12.x86_64 libstdc++-4.4.4-10.fc12.x86_64 (gdb) up #1 0x00698222 in fixup_args_size_notes (prev=0x0, last=value optimized out, end_args_size=value optimized out) at /tmp/hpautotest-gcc1/gcc/gcc/expr.c:3978 3978 
gcc_assert (!saw_unknown); (gdb) p prev (gdb) p prev $1 = (rtx_def *) 0x0 (gdb) p last $2 = value optimized out (gdb) up #2 0x0067aefa in try_split (pat=value optimized out, trial=0x77ea47e0, last=1) at /tmp/hpautotest-gcc1/gcc/gcc/emit-rtl.c:3602 3602 fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP (note, 0))); (gdb) p insn_last $3 = (rtx_def *) 0x77ea4c60 (gdb) p note $4 = (rtx_def *) 0x77ea2df8 (gdb) pr (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (nil)) (gdb) call debug_rtx_range ($3, 0) (insn 34 33 0 (set (reg/f:SI 14 sp) (mem/f/c:SI (reg/f:SI 14 sp) [2 p+0 S4 A8])) -1 (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (nil))) (nil) (gdb) bt #0 fancy_abort
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584 --- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org --- Are you sure it didn't fail before r205026 as well, because what my patch did was essentially restore the old behavior unless strictly necessary (then it would keep the r205026+ behavior).
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584 --- Comment #2 from Hans-Peter Nilsson hp at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #1) Are you sure it didn't fail before r205026 as well, because what my patch did was essentially restore the old behavior unless strictly necessary (then it would keep the r205026+ behavior). Sounds like you have a good grip on the circumstances. :) There was no reason to check for earlier failure ranges, but it certainly failed before and with r205023, started passing with r205046 up until as noted. So, I guess this will be a low-priority PR, particularly as it uses an odd builtin-construct very unlikely to be seen in user code - not to mention it will also be hidden behind a target-specific fix.
[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573 --- Comment #6 from Dennis Lan (dlan) dennis.yxun at gmail dot com --- (In reply to Yvan Roux from comment #5) is the foundation model failing for the same reason here (i.e. not recognizing the cmeq instruction) ? Not exactly, the foundation_v8 got abort while compiling gcc.. and yes, it does recognize the cmeq instruction. to clarify, the former gcc build log[1] I provided was generated in qemu which have *no* cmeq support. I do have a patch[2] for qemu which implement cmeq support (which I tested passed), yes, could if anyone can review those patches[3] for the qemu which implement cmeq, it does pass the glibc compilation and install successfully, but with the new glibc, gcc fail to build executable image[4] [1] http://gcc.gnu.org/bugzilla/attachment.cgi?id=31498 [2] https://github.com/dlanx/qemu/commit/1a9b3a40917c416125f10accba9e531ed91677d4 [3] git://github.com/dlanx/qemu (branch aarch64-1.6, top four patches) [4] following output from qemu with cmeq implemented (202940) insn # gcc -v Using built-in specs. 
COLLECT_GCC=/usr/aarch64-unknown-linux-gnu/gcc-bin/4.9.0-pre/gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/lto -wrapper Target: aarch64-unknown-linux-gnu Configured with: /var/tmp/portage/sys-devel/gcc-4.9.0_pre/work/gcc-4.9.0-999 9/configure --prefix=/usr --bindir=/usr/aarch64-unknown-linux-gnu/gcc-bin/4.9.0- pre --includedir=/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/includ e --datadir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre --mandir =/usr/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/man --infodir=/usr/ share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/info --with-gxx-include-d ir=/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/include/g++-v4 --host=aa rch64-unknown-linux-gnu --build=aarch64-unknown-linux-gnu --disable-altivec --di sable-fixed-point --without-cloog --disable-lto --enable-nls --without-included- gettext --with-system-zlib --enable-obsolete --disable-werror --enable-secureplt --disable-multilib --disable-libmudflap --disable-libssp --enable-libgomp --wit h-python-dir=/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/python --en able-checking=release --disable-libgcj --enable-libstdcxx-time --enable-language s=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --$ nable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gento o 4.9.0_pre' Thread model: posix gcc version 4.9.0-pre 20130926 (experimental) commit 07ca5686e64d32f7df4ccf4 205d0b914f120da5e (Gentoo 4.9.0_pre) (202940) insn # cat cmeq_test.c #include stdio.h #include stdlib.h long long fn(long long val) { asm volatile( fmov d0, x0\n\t cmeq d0, d0, #0\n\t fmov x0, d0\n\t ); } int main(int argc, char *argv[]) { long long v = strtoul(argv[1], NULL, 0); printf(result: 0x%lx, 0x%lx\n, v, fn(v)); return 0; } (202940) insn # ./cmeq_test 1 result: 0x1, 0x0 (202940) insn # ./cmeq_test 0 result: 0x0, 0x (202940) insn # ./cmeq_test 0x00 result: 0x00, 0x0 (202940) insn # gcc -o mytest_v4 
mytest_v4.c /usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/../../../../aarch64-unknown-linux-gnu/bin/ld: error: Cannot change output format whilst linking AArch64 binaries. collect2: error: ld returned 1 exit status (the above cmeq_test was built with sane gcc - with 07ca5686e64 reverted)
[Bug sanitizer/59585] Tests failing due to trailing newline
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585

--- Comment #1 from Yury Gribov <y.gribov at samsung dot com> ---
Created attachment 31503
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31503&action=edit
Draft patch
[Bug sanitizer/59585] New: Tests failing due to trailing newline
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585

            Bug ID: 59585
           Summary: Tests failing due to trailing newline
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: y.gribov at samsung dot com
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org,
                    mpolacek at gcc dot gnu.org, tetra2005 at gmail dot com,
                    v.garbuzov at samsung dot com
              Host: x86_64-unknown-linux-gnu
            Target: arm-v7a15-linux-gnueabi

Created attachment 31502
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31502&action=edit
Log file

Hi folks,

I've tested ubsan in cross-gcc on an ARM platform and got a series of similar errors:

FAIL: c-c++-common/ubsan/div-by-zero-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none \
output pattern test, is
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:11:5: runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:12:5: runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:13:5: runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:14:5: runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:15:5: runtime error: division by zero, should match division by zero( |^M |^M)[^ ^M]*division by zero( |^M |^M)[^ ^M]*division by zero( |^M |^M)[^ ^M]*division by zero( |^M |^M)[^ ^M]*division by zero( |^M |^M)

An extract from the log file is attached.
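The kind of mismatch reported here can be sketched with std::regex. This rests on an assumption: that the captured output lacks the line terminator after the last message while the expected pattern requires one after every message. The helper names matches_strict and matches_relaxed are hypothetical, the pattern is shortened to two messages, and the relaxed variant is only one possible way out, not the content of the attached draft patch.

```cpp
#include <regex>
#include <string>

// Pattern in the style of the test: every message must be followed by
// a line terminator (\n, \r\n or \r).
inline bool matches_strict (const std::string &output)
{
  static const std::regex re (
    "division by zero(\\n|\\r\\n|\\r)"
    "[^\\n\\r]*division by zero(\\n|\\r\\n|\\r)");
  return std::regex_search (output, re);
}

// Relaxed variant (an assumption for illustration): the terminator after
// the last message may be missing if the output ends there.
inline bool matches_relaxed (const std::string &output)
{
  static const std::regex re (
    "division by zero(\\n|\\r\\n|\\r)"
    "[^\\n\\r]*division by zero(\\n|\\r\\n|\\r|$)");
  return std::regex_search (output, re);
}
```

With output that ends right after the final "division by zero", the strict pattern fails to match while the relaxed one still matches, which is consistent with the FAIL lines quoted above under the stated assumption.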
[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #11 from meibf at gcc dot gnu.org ---
Author: meibf
Date: Mon Dec 23 15:07:58 2013
New Revision: 206179

URL: http://gcc.gnu.org/viewcvs?rev=206179&root=gcc&view=rev
Log:
2013-12-23  Bingfeng Mei  <b...@broadcom.com>

    PR middle-end/59569
    * tree-vect-stmts.c (vectorizable_store): Skip permutation for
    constant operand, and add a few missing \n.

    * gcc.c-torture/compile/pr59569-1.c: New test.
    * gcc.c-torture/compile/pr59569-2.c: Ditto.

Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr59569-1.c
    trunk/gcc/testsuite/gcc.c-torture/compile/pr59569-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-stmts.c
[Bug fortran/59577] OpenMP: ICE with type(c_ptr) in private()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59577 --- Comment #1 from Thomas Needham 06needhamt at gmail dot com --- Also occurs in version 4.8.2
[Bug fortran/59577] OpenMP: ICE with type(c_ptr) in private()
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59577

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-23
     Ever confirmed|0                           |1

--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> Also occurs in version 4.8.2

And all versions I have tested down to 4.3.1.
[Bug c++/41090] [4.7/4.8/4.9 Regression] Using static label reference in c++ class constructor produces wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41090 --- Comment #19 from Jason Merrill jason at gcc dot gnu.org --- Author: jason Date: Mon Dec 23 17:49:47 2013 New Revision: 206182 URL: http://gcc.gnu.org/viewcvs?rev=206182root=gccview=rev Log: PR c++/41090 Add -fdeclone-ctor-dtor. gcc/cp/ * optimize.c (can_alias_cdtor, populate_clone_array): Split out from maybe_clone_body. (maybe_thunk_body): New function. (maybe_clone_body): Call it. * mangle.c (write_mangled_name): Remove code to suppress writing of mangled name for cloned constructor or destructor. (write_special_name_constructor): Handle decloned constructor. (write_special_name_destructor): Handle decloned destructor. * method.c (trivial_fn_p): Handle decloning. * semantics.c (expand_or_defer_fn_1): Clone after setting linkage. gcc/c-family/ * c.opt: Add -fdeclone-ctor-dtor. * c-opts.c (c_common_post_options): Default to on iff -Os. gcc/ * cgraph.h (struct cgraph_node): Add calls_comdat_local. (symtab_comdat_local_p, symtab_in_same_comdat_p): New. * cif-code.def: Add USES_COMDAT_LOCAL. * symtab.c (verify_symtab_base): Make sure we don't refer to a comdat-local symbol from outside its comdat. * cgraph.c (verify_cgraph_node): Likewise. * cgraphunit.c (mark_functions_to_output): Don't mark comdat-locals. * ipa.c (symtab_remove_unreachable_nodes): Likewise. (function_and_variable_visibility): Handle comdat-local fns. * ipa-cp.c (determine_versionability): Don't clone comdat-locals. * ipa-inline-analysis.c (compute_inline_parameters): Update calls_comdat_local. * ipa-inline-transform.c (inline_call): Likewise. (save_inline_function_body): Don't clear DECL_COMDAT_GROUP. * ipa-inline.c (can_inline_edge_p): Check calls_comdat_local. * lto-cgraph.c (input_overwrite_node): Read calls_comdat_local. (lto_output_node): Write it. * symtab.c (symtab_dissolve_same_comdat_group_list): Clear DECL_COMDAT_GROUP for comdat-locals. include/ * demangle.h (enum gnu_v3_ctor_kinds): Added literal gnu_v3_unified_ctor. 
(enum gnu_v3_ctor_kinds): Added literal gnu_v3_unified_dtor. libiberty/ * cp-demangle.c (cplus_demangle_fill_ctor,cplus_demangle_fill_dtor): Handle unified ctor/dtor. (d_ctor_dtor_name): Handle unified ctor/dtor. Added: trunk/gcc/testsuite/g++.dg/ext/label13a.C trunk/gcc/testsuite/g++.dg/opt/declone1.C Modified: trunk/gcc/ChangeLog trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/c-opts.c trunk/gcc/c-family/c.opt trunk/gcc/cgraph.c trunk/gcc/cgraph.h trunk/gcc/cgraphunit.c trunk/gcc/cif-code.def trunk/gcc/cp/ChangeLog trunk/gcc/cp/decl.c trunk/gcc/cp/mangle.c trunk/gcc/cp/method.c trunk/gcc/cp/optimize.c trunk/gcc/cp/semantics.c trunk/gcc/doc/invoke.texi trunk/gcc/ipa-cp.c trunk/gcc/ipa-inline-analysis.c trunk/gcc/ipa-inline-transform.c trunk/gcc/ipa-inline.c trunk/gcc/ipa.c trunk/gcc/lto-cgraph.c trunk/gcc/symtab.c trunk/include/ChangeLog trunk/include/demangle.h trunk/libiberty/ChangeLog trunk/libiberty/cp-demangle.c
[Bug c++/41090] [4.7/4.8/4.9 Regression] Using static label reference in c++ class constructor produces wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41090

Jason Merrill <jason at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|4.8.3                       |4.9.0

--- Comment #20 from Jason Merrill <jason at gcc dot gnu.org> ---
Fixed for 4.9.
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584 --- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org --- So is this actually a regression then (I mean, has it worked in 4.8 or 4.7 etc.)?
[Bug c++/59349] [4.9 Regression] ICE on invalid: Segmentation fault toplev.c:336
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59349

Jason Merrill <jason at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |jason at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org
[Bug tree-optimization/59586] New: Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586 Bug ID: 59586 Summary: Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops Product: gcc Version: 4.8.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: chaosgate at gmail dot com Created attachment 31504 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31504action=edit Test case This code results in the following segfault when compiled with gfortran -o /dev/null -c -Ofast -floop-parallelize-all -ftree-parallelize-loops=1 -fopenmp t3.f t3.f: In function ‘subsm’: t3.f:1:0: internal compiler error: Segmentation fault subroutine subsm ( n, x, xp, xx) ^ Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. gcc -v: Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.8.2/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /var/tmp/paludis/build/sys-devel-gcc-4.8.2/work/gcc-4.8.2/configure --prefix=/usr --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-silent-rules --enable-fast-install --libdir=/usr/lib64 --cache-file=config.cache --libdir=/usr/lib64 --with-pkgversion='exherbo gcc-4.8.2' --program-suffix=-4.8 --disable-bootstrap --enable-clocale=gnu --enable-languages=c,c++,fortran,java --enable-lto --disable-multilib --enable-nls --enable-serial-configure --enable-libquadmath --enable-libquadmath-support --with-cloog --enable-libgomp --disable-libobjc --disable-libssp --with-as=x86_64-pc-linux-gnu-as --with-ld=x86_64-pc-linux-gnu-ld --with-system-zlib Thread model: posix gcc version 4.8.2 (exherbo gcc-4.8.2)
[Bug tree-optimization/59586] Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-23
     Ever confirmed|0                           |1

--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
Confirmed for 4.8.2 and trunk with -O3 -ffast-math -floop-parallelize-all, but no ICE for 4.5.4, 4.6.4, and 4.7.3. Likely a 4.8/4.9 regression. I'll try to bisect when I find some time.
[Bug target/59587] New: cpu_names in i386.c is accessed with wrong index
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

            Bug ID: 59587
           Summary: cpu_names in i386.c is accessed with wrong index
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: ubizjak at gmail dot com

i386.c has

static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
{
  "generic",
...
  "btver2"
};
...
      if (!opts->x_ix86_tune_string)
        {
          opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
          ix86_tune_defaulted = 1;
        }
...
  fprintf (file, "%*sarch = %d (%s)\n",
           indent, "",
           ptr->arch,
           ((ptr->arch < TARGET_CPU_DEFAULT_max)
            ? cpu_names[ptr->arch]
            : "<unknown>"));

  fprintf (file, "%*stune = %d (%s)\n",
           indent, "",
           ptr->tune,
           ((ptr->tune < TARGET_CPU_DEFAULT_max)
            ? cpu_names[ptr->tune]
            : "<unknown>"));

But ptr->arch and ptr->tune are set by

  ptr->arch = ix86_arch;
  ptr->schedule = ix86_schedule;
  ptr->tune = ix86_tune;

ix86_arch is set by

/* Which instruction set architecture to use.  */
enum processor_type ix86_arch;

        ix86_arch = processor_alias_table[i].processor;

and ix86_tune is set by

/* Which cpu are we optimizing for.  */
enum processor_type ix86_tune;

        ix86_tune = processor_alias_table[i].processor;

We are using enum processor_type as the index to access an array ordered by enum target_cpu_default:

enum target_cpu_default
{
  TARGET_CPU_DEFAULT_generic = 0,
...
  TARGET_CPU_DEFAULT_max
};

The x86 backend only uses TARGET_CPU_DEFAULT_generic to set up the default tuning:

#ifndef TARGET_CPU_DEFAULT
#define TARGET_CPU_DEFAULT TARGET_CPU_DEFAULT_generic
#endif
...
      if (!opts->x_ix86_tune_string)
        {
          opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
          ix86_tune_defaulted = 1;
        }

We never define a different TARGET_CPU_DEFAULT. When GCC is configured with --with-arch=/--with-cpu=, we have

[hjl@gnu-6 build-x86_64-linux]$ cat gcc/configargs.h
/* Generated automatically. */
static const char configuration_arguments[] = "/export/gnu/import/git/gcc/configure --enable-languages=c,c++,fortran --disable-bootstrap --prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local --enable-gnu-indirect-function --with-fpmath=sse";
static const char thread_model[] = "posix";

static const struct {
  const char *name, *value;
} configure_default_options[] = { { "cpu", "generic" }, { "arch", "x86-64" } };
[hjl@gnu-6 build-x86_64-linux]$

which passes -march=/-mtune= to toplev.c.
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #4 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> So is this actually a regression then (I mean, has it worked in 4.8 or 4.7
> etc.)?

That's not the definition. At one point it worked on trunk (4.9), thus it's a regression.
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #5 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
The actual bug causing the ICE is that the combination of expr.c:find_args_size_adjust and expr.c:fixup_args_size_notes can't handle a define_split matching for the stack-adjustment assignment instruction emitted by __builtin_stack_restore.

I'm going to mark my commit for the CRIS port with this PR number (since it fixes the regression per se), but it will just remove the define_split part happening for the CRIS port; the bug is still there so the PR should not be closed. Though, I'll change the title.
[Bug target/59587] cpu_names in i386.c is accessed with wrong index
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

--- Comment #1 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 31505
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31505&action=edit
A patch

I am testing this patch.
[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #6 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
Author: hp
Date: Mon Dec 23 22:33:52 2013
New Revision: 206187

URL: http://gcc.gnu.org/viewcvs?rev=206187&root=gcc&view=rev
Log:
    PR middle-end/59584
    * config/cris/predicates.md (cris_nonsp_register_operand): New
    define_predicate.
    * config/cris/cris.md: Replace register_operand with
    cris_nonsp_register_operand for destinations in all
    define_splits where a register is set more than once.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/cris/cris.md
    trunk/gcc/config/cris/predicates.md
[Bug middle-end/59584] [4.9 Regression]: cannot handle define_split for insn emitted for __builtin_stack_restore
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

Hans-Peter Nilsson hp at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P5
            Summary|[4.9 Regression]:           |[4.9 Regression]: cannot
                   |gcc.dg/pr50251.c ICE        |handle define_split for
                   |exposed by Don't reject     |insn emitted for
                   |TER unnecessarily           |__builtin_stack_restore
[Bug target/59588] New: Odd codes in ix86_option_override_internal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59588

            Bug ID: 59588
           Summary: Odd codes in ix86_option_override_internal
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: ubizjak at gmail dot com

ix86_option_override_internal has

  if (opts->x_ix86_arch_string)
    opts->x_ix86_tune_string = opts->x_ix86_arch_string;
  if (!opts->x_ix86_tune_string)
    {
      opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
      ix86_tune_defaulted = 1;
    }

  /* opts->x_ix86_tune_string is set to opts->x_ix86_arch_string
     or defaulted.  We need to use a sensible tune option.  */
  if (!strcmp (opts->x_ix86_tune_string, "generic")
      || !strcmp (opts->x_ix86_tune_string, "x86-64")
      || !strcmp (opts->x_ix86_tune_string, "i686"))
    {
      opts->x_ix86_tune_string = "generic";
    }

Why is opts->x_ix86_tune_string changed to "generic"? If
opts->x_ix86_tune_string is already "generic", there is no need to change it
to "generic". If an option is valid for -march=, it should also be valid for
-mtune=.
[Bug target/59203] config/cris/cris.c:2491: possible typo ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59203

--- Comment #3 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
Author: hp
Date: Mon Dec 23 23:12:09 2013
New Revision: 206188

URL: http://gcc.gnu.org/viewcvs?rev=206188&root=gcc&view=rev
Log:
    PR target/59203
    * config/cris/cris.c (cris_pic_symbol_type_of): Fix typo,
    checking t1 twice instead of t1 and t2 respectively.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/cris/cris.c
[Bug tree-optimization/59586] [4.8/4.9 Regression] [graphite] Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

Dominique d'Humieres dominiq at lps dot ens.fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spop at gcc dot gnu.org
            Summary|Segmentation fault with     |[4.8/4.9 Regression]
                   |-Ofast                      |[graphite] Segmentation
                   |-floop-parallelize-all      |fault with -Ofast
                   |-ftree-parallelize-loops    |-floop-parallelize-all
                   |                            |-ftree-parallelize-loops

--- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
Revision r188914 (2012-06-24) is OK, r189336 (2012-07-06) gives the ICE.

(gdb) bt
#0  0x00010053910b in compute_deps (scop=0x1418bcc00, pbbs=...,
    must_raw=0xde650, may_raw=0xfc080, must_raw_no_source=0x141907f50,
    may_raw_no_source=0x141920330,
    must_war=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    may_war=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    must_war_no_source=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    may_war_no_source=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    must_waw=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    may_waw=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    must_waw_no_source=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>,
    may_waw_no_source=<error reading variable: Could not find the frame base for "compute_deps(scop*, vec<poly_bb*, va_heap, vl_ptr>, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**)".>)
    at ../../_clean/gcc/graphite-dependences.c:430
#1  0x00010053965e in loop_is_parallel_p (loop=<optimized out>, bb_pbb_mapping=..., depth=<optimized out>)
    at ../../_clean/gcc/graphite-dependences.c:566
#2  0x000100537597 in translate_clast (context_loop=0x1418bcc00, stmt=0x1419084c0, next_e=0xde650, bb_pbb_mapping=..., level=1099988816, ip=0x141920330)
    at ../../_clean/gcc/graphite-clast-to-gimple.c:1200
#3  0x00010053793c in gloog (scop=<optimized out>, bb_pbb_mapping=...)
    at ../../_clean/gcc/graphite-clast-to-gimple.c:1705
#4  0x00010053200f in graphite_transform_loops () at ../../_clean/gcc/graphite.c:304
#5  0x00010053251a in pass_graphite_transforms::execute (this=<optimized out>) at ../../_clean/gcc/graphite.c:332
#6  0x00010067c4b9 in execute_one_pass (pass=<optimized out>) at ../../_clean/gcc/passes.c:2213
#7  0x00010067c74e in execute_pass_list (pass=<optimized out>) at ../../_clean/gcc/passes.c:2266
#8  0x00010067c760 in execute_pass_list (pass=<optimized out>) at ../../_clean/gcc/passes.c:2267
#9  0x00010067c760 in execute_pass_list (pass=<optimized out>) at ../../_clean/gcc/passes.c:2267
#10 0x00010067c760 in execute_pass_list (pass=<optimized out>) at ../../_clean/gcc/passes.c:2267
#11 0x0001003c66cf in expand_function (node=<optimized out>) at ../../_clean/gcc/cgraphunit.c:1763
[Bug target/59203] config/cris/cris.c:2491: possible typo ?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59203

Hans-Peter Nilsson hp at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
done
[Bug target/59587] cpu_names in i386.c is accessed with wrong index
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

H.J. Lu hjl.tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #31505|0                           |1
        is obsolete|                            |

--- Comment #2 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 31506
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31506&action=edit
An updated patch

Test this updated patch.
[Bug fortran/59589] New: Memory leak when deallocating polymorphic
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

            Bug ID: 59589
           Summary: Memory leak when deallocating polymorphic
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: townsend at astro dot wisc.edu

Created attachment 31507
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31507&action=edit
Test code demonstrating leak

The attached code leaks memory, as indicated by the 'ps' call.
[Bug fortran/58007] [4.7/4.9 Regression] [OOP] ICE in free_pi_tree(): Unresolved fixup - resolve_fixups does not fixup component of __class_bsr_Bsr_matrix
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58007 --- Comment #11 from Rich Townsend townsend at astro dot wisc.edu --- #6 fails with 4.9.0 (svn rev. 206179), on both OS X and Linux.
[Bug fortran/59589] Memory leak when deallocating polymorphic
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589 --- Comment #1 from Rich Townsend townsend at astro dot wisc.edu --- Oops, missed out details. This is with rev. 206179, on both OS X and Linux.
[Bug fortran/59589] Memory leak when deallocating polymorphic
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

Dominique d'Humieres dominiq at lps dot ens.fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2013-12-24
     Ever confirmed|0                           |1

--- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
Works for me on OS X for 4.8.2 or trunk. What command are you using?
[Bug c/59590] New: gcc produces an infinite loop on O2 optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590

            Bug ID: 59590
           Summary: gcc produces an infinite loop on O2 optimization
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cottrell at wfu dot edu

Created attachment 31508
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31508&action=edit
minimal test case

I'm getting an infinite loop with -O2, though the code is compiled correctly
with just -O. I'm attaching a minimal test case -- but please see also the
real function that exposes the problem: the following is the real counterpart
to fake_gradient() in the minimal case:

static int richardson_gradient (double *b, double *g, int n,
                                BFGS_CRIT_FUNC func, void *data)
{
    double df[RSTEPS];
    double eps = 1.0e-4;
    double d = 0.0001;
    double v = 2.0;
    double h, p4m;
    double bi0, f1, f2;
    int r = RSTEPS;
    int i, k, m;
    int err = 0;

    for (i=0; i<n; i++) {
        bi0 = b[i];
        h = d * b[i] + eps * (b[i] == 0.0);
        for (k=0; k<r; k++) {
            b[i] = bi0 - h;
            f1 = func(b, data);
            b[i] = bi0 + h;
            f2 = func(b, data);
            if (na(f1) || na(f2)) {
                b[i] = bi0;
                return 1;
            }
            df[k] = (f2 - f1) / (2.0 * h);
            h /= v;
        }
        b[i] = bi0;
        p4m = 4.0;
        for (m=0; m<r-1; m++) {
            for (k=0; k<r-m; k++) {
                df[k] = (df[k+1] * p4m - df[k]) / (p4m - 1.0);
                // if (k == r-m-1) break;
            }
            p4m *= 4.0;
        }
        g[i] = df[0];
    }

    return err;
}
[Bug fortran/59589] Memory leak when deallocating polymorphic
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

--- Comment #3 from Rich Townsend townsend at astro dot wisc.edu ---
(In reply to Dominique d'Humieres from comment #2)
> Works for me on OS X for 4.8.2 or trunk. What command are you using?

townsend@talos ~ $ gfortran -v
Using built-in specs.
COLLECT_GCC=/Applications/madsdk/bin/gfortran.exec
COLLECT_LTO_WRAPPER=/Applications/madsdk/libexec/gcc/x86_64-apple-darwin11.4.2/4.9.0/lto-wrapper
Target: x86_64-apple-darwin11.4.2
Configured with: ./configure CC='gcc -D_FORTIFY_SOURCE=0' --build=x86_64-apple-darwin11.4.2 --prefix=/Applications/madsdk --with-gmp=/Applications/madsdk --with-mpfr=/Applications/madsdk --with-mpc=/Applications/madsdk --enable-languages=c,c++,fortran --disable-multilib --disable-nls --disable-libsanitizer
Thread model: posix
gcc version 4.9.0 20131223 (experimental) (GCC)
townsend@talos ~ $ gfortran -o test_leak test_leak.f90
townsend@talos ~ $ ./test_leak
./test_leak 39688
./test_leak 78764
./test_leak 117828
./test_leak 156908
./test_leak 195972
./test_leak 235036
./test_leak 274100
./test_leak 313164
./test_leak 352228
./test_leak 391292

...so, the memory usage grows on each iteration of the loop; this suggests a
leak.
[Bug c/59590] gcc produces an infinite loop on O2 optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590

Andrew Pinski pinskia at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
df[k+1] reads past the bounds of df: k is 0...RSTEPS-1, so k+1 is 1...RSTEPS,
while the bounds of df are 0...RSTEPS-1.
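For illustration, the extrapolation step can be written so that df[k+1] stays inside the array. This is only a sketch of one plausible correction, not the reporter's eventual fix: the helper name richardson_extrapolate and its signature are hypothetical, and it assumes the intent is the usual Richardson scheme in which level m combines r-m-1 adjacent pairs.

```c
#include <assert.h>

/* Richardson extrapolation over R successive estimates DF[0..R-1],
   each assumed to be computed with half the step of the previous
   one.  Overwrites DF in place and returns the refined estimate.
   Hypothetical helper, not the code from the report.  */
static double
richardson_extrapolate (double *df, int r)
{
  double p4m = 4.0;
  int k, m;

  assert (r >= 1);
  for (m = 0; m < r - 1; m++)
    {
      /* Level m combines the pairs (df[k], df[k+1]).  The last valid
         pair starts at k = r - m - 2, so the bound is r - m - 1 and
         df[k+1] never goes past df[r - 1].  */
      for (k = 0; k < r - m - 1; k++)
        df[k] = (df[k + 1] * p4m - df[k]) / (p4m - 1.0);
      p4m *= 4.0;
    }
  return df[0];
}
```

With the bound written as k < r - m, the first pass (m = 0) reads df[r], which is undefined behavior in C; the compiler may assume it never happens, which is how an out-of-bounds read can end up changing the loop's exit condition under -O2.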
[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573

--- Comment #7 from Dennis Lan (dlan) dennis.yxun at gmail dot com ---
Ok, it's a qemu problem, not gcc. I've built the rootfs in qemu (with the cmeq
insn implemented), then deployed the rootfs into the foundation_v8 emulator.
I tested compiling code with gcc there, and it works fine, without the linker
error.
[Bug c/59590] gcc produces an infinite loop on O2 optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590 --- Comment #2 from Allin Cottrell cottrell at wfu dot edu --- OK, you're right, there's an off-by-one bug in the second k-loop. But it's not very nice that gcc takes that as a license to produce an infinite loop. However, I guess that makes this report a duplicate of some others that have made the same observation.
[Bug c++/59349] [4.9 Regression] ICE on invalid: Segmentation fault toplev.c:336
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59349

--- Comment #3 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Dec 24 04:22:15 2013
New Revision: 206192

URL: http://gcc.gnu.org/viewcvs?rev=206192&root=gcc&view=rev
Log:
    PR c++/59349
    * parser.c (cp_parser_lambda_introducer): Handle empty init.

Added:
    trunk/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/parser.c
[Bug c++/59271] [4.9 Regression] a.C:16:21: internal compiler error: in strip_typedefs, at cp/tree.c:1315
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59271

--- Comment #4 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Dec 24 04:22:23 2013
New Revision: 206193

URL: http://gcc.gnu.org/viewcvs?rev=206193&root=gcc&view=rev
Log:
    PR c++/59271
    * lambda.c (build_capture_proxy): Use build_cplus_array_type.

Added:
    trunk/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/lambda.c
[Bug lto/59582] LTO discards symbol that defined as weak elsewhere
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59582

--- Comment #2 from Joey Ye joey.ye at arm dot com ---
Latest binutils trunk still has this issue. I'm assuming 2.24 is the same.
Re: [PATCH, committed] Fix PR 57422
On Mon, Dec 23, 2013 at 10:52:25AM +0400, Andrey Belevantsev wrote:
> As described in the PR, the ICE reason was the typo made when
> introducing calls to add_hard_reg_set. Fixed by the first attached
> patch, bootstrapped and tested on both ia64 and x86_64, committed as
> obvious.
>
> The test case is very sensitive to the scheduler decisions (e.g. it
> didn't fail on trunk but only on the revision reported for me), so
> instead of adding the test I have put in the code two asserts checking
> that we can always schedule the fence instruction as is. This hunk was
> tested together with the first but committed separately.
>
> The first patch can be safely committed to 4.8, the second can stay on
> trunk only. Jakub, will it be fine with you?

Yes.

Jakub
[C++ Patch ping] Re: [C++ Patch] PR 59165 (aka Core/1442)
Hi, assuming I didn't miss anything (I'm still catching up with my emails), I'd like to ping the below. Thanks! Paolo. /// On 12/10/2013 01:54 PM, Paolo Carlini wrote: Hi, as far as I can see, this bug asks for the implementation of Core/1442, thus don't do a special Koenig lookup including namespace std in cp_parser_perform_range_for_lookup. Tested x86_64-linux. Thanks, Paolo. /
Re: [PATCH, committed] Fix PR 57422
On Sun, Dec 22, 2013 at 10:52 PM, Andrey Belevantsev a...@ispras.ru wrote:
> Hello,
>
> As described in the PR, the ICE reason was the typo made when
> introducing calls to add_hard_reg_set. Fixed by the first attached
> patch, bootstrapped and tested on both ia64 and x86_64, committed as
> obvious.
>
> The test case is very sensitive to the scheduler decisions (e.g. it
> didn't fail on trunk but only on the revision reported for me), so
> instead of adding the test I have put in the code two asserts checking
> that we can always schedule the fence instruction as is. This hunk was
> tested together with the first but committed separately.

Testcase is very small. Why not add it?

--
H.J.
Re: [PATCH] Don't reject TER unnecessarily (PRs middle-end/58956, middle-end/59470)
On Sat, 14 Dec 2013, Jakub Jelinek wrote:
> 2013-12-14  Jakub Jelinek  ja...@redhat.com
>
>     PR middle-end/58956
>     PR middle-end/59470
>     * gimple-walk.h (walk_stmt_load_store_addr_fn): New typedef.
>     (walk_stmt_load_store_addr_ops, walk_stmt_load_store_ops): Use it
>     for callback params.
>     * gimple-walk.c (walk_stmt_load_store_ops): Likewise.
>     (walk_stmt_load_store_addr_ops): Likewise. Adjust all callback
>     calls to supply the gimple operand containing the base tree
>     as an extra argument.
>     * tree-ssa-ter.c: Include gimple-walk.h.
>     (find_ssaname, find_ssaname_in_store): New helper functions.
>     (find_replaceable_in_bb): For calls or GIMPLE_ASM, only set
>     same_root_var if USE is used somewhere in the stores of the stmt.
>     * ipa-prop.c (visit_ref_for_mod_analysis): Remove name of the stmt
>     argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
>     * ipa-pure-const.c (check_load, check_store, check_ipa_load,
>     check_ipa_store): Likewise.
>     * gimple.c (gimple_ior_addresses_taken_1, check_loadstore): Likewise.
>     * ipa-split.c (test_nonssa_use, mark_nonssa_use): Likewise.
>     (verify_non_ssa_vars, visit_bb): Adjust their callers.
>     * cfgexpand.c (add_scope_conflicts_1): Use
>     walk_stmt_load_store_addr_fn type for visit variable.
>     (visit_op, visit_conflict): Remove name of the stmt
>     argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
>     * tree-sra.c (asm_visit_addr): Likewise. Remove name of the data
>     argument and ATTRIBUTE_UNUSED.
>     * cgraphbuild.c (mark_address, mark_load, mark_store): Add another
>     unnamed tree argument.
>     * gimple-ssa-isolate-paths.c (check_loadstore): Likewise. Remove
>     ATTRIBUTE_UNUSED from stmt parameter.

Caused PR59584, an ICE. (I'm going to fix the define_split bug that this
exposed, but I don't think that bug - allowing SP as a temporary - is the
cause of the hiccup.)

brgds, H-P
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Thanks. H.J. --- From 2ef9b6959a4625d89cab6f06aec6bb2b37095264 Mon Sep 17 00:00:00 2001 From: H.J. Lu hjl.to...@gmail.com Date: Mon, 23 Dec 2013 05:26:01 -0800 Subject: [PATCH 1/2] Handle haswell and silvermont --- ChangeLog.arch | 18 ++ gcc/config/i386/i386.c | 14 ++ libgcc/config/i386/cpuinfo.c | 15 +++ 3 files changed, 47 insertions(+) create mode 100644 ChangeLog.arch diff --git a/ChangeLog.arch b/ChangeLog.arch new file mode 100644 index 000..2030a76 --- /dev/null +++ b/ChangeLog.arch @@ -0,0 +1,18 @@ +gcc/ + +2013-12-23 H.J. Lu hongjiu...@intel.com + + * config/i386/i386.c (get_builtin_code_for_version): Handle + PROCESSOR_HASWELL and PROCESSOR_SILVERMONT. + (processor_model): Add M_INTEL_COREI7_IVYBRIDGE and + M_INTEL_COREI7_HASWELL. + (arch_names_table): Add ivybridge, haswell, bonnell, + silvermont. + +libgcc/ + +2013-12-23 H.J. Lu hongjiu...@intel.com + + * config/i386/cpuinfo.c (processor_subtypes): Add + INTEL_COREI7_IVYBRIDGE and INTEL_COREI7_HASWELL. + (get_intel_cpu): Check Ivy Bridge and Haswell processors. 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 2d480b3..d854b5b 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -30058,10 +30058,18 @@ get_builtin_code_for_version (tree decl, tree *predicate_list) arg_str = sandybridge; priority = P_PROC_SSE4_2; break; + case PROCESSOR_HASWELL: + arg_str = haswell; + priority = P_PROC_SSE4_2; + break; case PROCESSOR_BONNELL: arg_str = bonnell; priority = P_PROC_SSSE3; break; + case PROCESSOR_SILVERMONT: + arg_str = silvermont; + priority = P_PROC_SSE4_2; + break; case PROCESSOR_AMDFAM10: arg_str = amdfam10h; priority = P_PROC_SSE4_a; @@ -30959,6 +30967,8 @@ fold_builtin_cpu (tree fndecl, tree *args) M_INTEL_COREI7_NEHALEM, M_INTEL_COREI7_WESTMERE, M_INTEL_COREI7_SANDYBRIDGE, +M_INTEL_COREI7_IVYBRIDGE, +M_INTEL_COREI7_HASWELL, M_AMDFAM10H_BARCELONA, M_AMDFAM10H_SHANGHAI, M_AMDFAM10H_ISTANBUL, @@ -30984,6 +30994,10 @@ fold_builtin_cpu (tree fndecl, tree *args) {nehalem, M_INTEL_COREI7_NEHALEM}, {westmere, M_INTEL_COREI7_WESTMERE}, {sandybridge, M_INTEL_COREI7_SANDYBRIDGE}, + {ivybridge, M_INTEL_COREI7_IVYBRIDGE}, + {haswell, M_INTEL_COREI7_HASWELL}, + {bonnell, M_INTEL_BONNELL}, + {silvermont, M_INTEL_SILVERMONT}, {amdfam10h, M_AMDFAM10H}, {barcelona, M_AMDFAM10H_BARCELONA}, {shanghai, M_AMDFAM10H_SHANGHAI}, diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c index 4b0c189..577881b 100644 --- a/libgcc/config/i386/cpuinfo.c +++ b/libgcc/config/i386/cpuinfo.c @@ -70,6 +70,8 @@ enum processor_subtypes INTEL_COREI7_NEHALEM = 1, INTEL_COREI7_WESTMERE, INTEL_COREI7_SANDYBRIDGE, + INTEL_COREI7_IVYBRIDGE, + INTEL_COREI7_HASWELL, AMDFAM10H_BARCELONA, AMDFAM10H_SHANGHAI, AMDFAM10H_ISTANBUL, @@ -196,6 +198,19 @@ get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id) __cpu_model.__cpu_type = INTEL_COREI7; __cpu_model.__cpu_subtype = INTEL_COREI7_SANDYBRIDGE; break; + case 0x3a: + case 0x3e: + /* Ivy Bridge. 
*/ + __cpu_model.__cpu_type = INTEL_COREI7; + __cpu_model.__cpu_subtype = INTEL_COREI7_IVYBRIDGE; + break; + case 0x3c: + case 0x45: + case 0x46: + /* Haswell. */ + __cpu_model.__cpu_type = INTEL_COREI7; + __cpu_model.__cpu_subtype = INTEL_COREI7_HASWELL; + break; case 0x17: case 0x1d: /* Penryn. */ -- 1.8.4.2
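The model-number dispatch added to get_intel_cpu in the patch above can be pictured as a plain switch. The following is only a sketch under stated assumptions: the enum intel_subtype, its values, and the functions classify_intel_model and check_models are hypothetical and do not match the real definitions in libgcc/config/i386/cpuinfo.c; only the model numbers are taken from the patch.

```c
#include <assert.h>

/* Hypothetical subtype codes for this sketch only.  */
enum intel_subtype
{
  SUBTYPE_UNKNOWN = 0,
  SUBTYPE_IVYBRIDGE,
  SUBTYPE_HASWELL
};

/* Map a CPUID model number to a subtype, using the model numbers
   listed in the patch.  Anything else is reported as unknown.  */
static enum intel_subtype
classify_intel_model (unsigned int model)
{
  switch (model)
    {
    case 0x3a:
    case 0x3e:
      /* Ivy Bridge.  */
      return SUBTYPE_IVYBRIDGE;
    case 0x3c:
    case 0x45:
    case 0x46:
      /* Haswell.  */
      return SUBTYPE_HASWELL;
    default:
      return SUBTYPE_UNKNOWN;
    }
}

/* Self-check: every model number from the patch maps as described.  */
static void
check_models (void)
{
  assert (classify_intel_model (0x3a) == SUBTYPE_IVYBRIDGE);
  assert (classify_intel_model (0x3e) == SUBTYPE_IVYBRIDGE);
  assert (classify_intel_model (0x3c) == SUBTYPE_HASWELL);
  assert (classify_intel_model (0x45) == SUBTYPE_HASWELL);
  assert (classify_intel_model (0x46) == SUBTYPE_HASWELL);
}
```

Grouping several case labels over one return mirrors the patch, where each microarchitecture covers more than one model number.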
PATCH: PRs bootstrap/59580/59583: Improve x86 --with-arch/--with-cpu= configure handling
On Sun, Dec 22, 2013 at 11:11:12PM +0100, Uros Bizjak wrote:
> Please get someone to review config.gcc changes. They are OK as far as
> x86 rename is concerned, but I can't review functional changes.

Hi Paolo,

Can you review this config.gcc change?

> > @@ -588,6 +588,22 @@ esac
> >  # Common C libraries.
> >  tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
> >
> > +# 32-bit x86 processors supported by --with-arch=. Each processor
> > +# MUST be separated by exactly one space.
> > +x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
> > +athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
> > +i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
> > +pentium4 pentium4m pentiumpro prescott"
>
> Missing native.

x86_archs contains 32-bit x86 processors. native is allowed for 64-bit
targets and is included in x86_64_archs. 64-bit processors can be used in
--with-arch/--with-cpu= for 32-bit targets.

Here is a patch to improve x86 --with-arch/--with-cpu= configure handling.
This patch defines 3 variables:

1. x86_archs: It contains 32-bit x86 processors supported by --with-arch=,
   which aren't allowed for 64-bit targets.
2. x86_64_archs: It contains 64-bit x86 processors supported by --with-arch=,
   which are allowed for both 32-bit and 64-bit targets.
3. x86_cpus: It contains x86 processors supported by --with-cpu=, which are
   allowed for both 32-bit and 64-bit targets.

The processors in each of those 3 variables are separated by exactly one
space. Instead of checking if a value of --with-arch/--with-cpu= is valid in
many different places with

  case ${val} in
  <valid pattern list>)
    OK
    ;;
  *)
    error
    exit 1
    ;;
  esac

and updating all pattern lists when adding a new processor, this patch uses

  case " <valid processor list separated by exactly one space> " in
  *" ${val} "*)
    OK
    ;;
  *)
    error
    exit 1
    ;;
  esac

"<valid processor list separated by exactly one space>" is the combination of
the 3 processor variables.
It only needs separate a check for empty value with if test x${val} != x; then $val isn't empty else $val is empty fi With this approach, we only need to add new 32-bit processors to x86_archs and new 64-bit processors to x86_64_archs. They will be supported by --with-arch/--with-cpu= automatically. OK to install? Thanks. H.J. --- 2013-12-23 H.J. Lu hongjiu...@intel.com PR bootstrap/59580 PR bootstrap/59583 * config.gcc (x86_archs): New variable. (x86_64_archs): Likewise. (x86_cpus): Likewise. Use $x86_archs, $x86_64_archs and $x86_cpus to check valid --with-arch/--with-cpu= options. Support --with-arch=/--with-cpu={nehalem,westmere, sandybridge,ivybridge,haswell,broadwell,bonnell,silvermont}. diff --git a/gcc/config.gcc b/gcc/config.gcc index 24dbaf9..51eb2b1 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -588,6 +588,22 @@ esac # Common C libraries. tm_defines=$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 +# 32-bit x86 processors supported by --with-arch=. Each processor +# MUST be separated by exactly one space. +x86_archs=athlon athlon-4 athlon-fx athlon-mp athlon-tbird \ +athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \ +i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \ +pentium4 pentium4m pentiumpro prescott +# 64-bit x86 processors supported by --with-arch=. Each processor +# MUST be separated by exactly one space. +x86_64_archs=amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \ +bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \ +core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \ +sandybridge ivybridge haswell broadwell bonnell silvermont x86-64 native +# Additional x86 processors supported by --with-cpu=. Each processor +# MUST be separated by exactly one space. +x86_cpus=generic intel + # Common parts for widely ported systems. 
case ${target} in *-*-darwin*) @@ -1392,20 +1408,21 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i done TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'` need_64bit_isa=yes - case X${with_cpu} in - Xgeneric|Xintel|Xatom|Xslm|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver4|Xbdver3|Xbdver2|Xbdver1|Xbtver2|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3) - ;; - X) + if test x$with_cpu = x; then if test x$with_cpu_64 = x; then with_cpu_64=generic fi - ;; - *) - echo Unsupported CPU used in --with-cpu=$with_cpu, supported values: 12 -
[PATCH] Fix PR59569
Hi, Jakub,

Thanks for the suggestion. Please find the attached patch. GCC is bootstrapped
and passes the testsuite on x86-64. Let me know if it is OK to commit.

(Sorry if you received this mail twice, as I forgot to set it to text format.)

Thanks,
Bingfeng Mei

Attachment: patch_pr59569
Re: [PATCH] Fix PR59569
On Mon, Dec 23, 2013 at 6:25 AM, Bingfeng Mei b...@broadcom.com wrote:
> Hi, Jakub,
> Thanks for suggestion. Please find attached patch. GCC is bootstrapped
> and passes testsuite on x86-64. Let me know if it is OK to commit.
>
> (Sorry if you received this mail twice as I forgot to set to text
> format).

Please test on 3 testcases in the PR and include some testcases in your
patch.

Thanks.

--
H.J.
RE: [PATCH] Fix PR59569
All 3 testcases were tested, and the first two are included in my patch. I
didn't include the third one as it is not reduced.

Bingfeng

-----Original Message-----
From: H.J. Lu [mailto:hjl.to...@gmail.com]
Sent: 23 December 2013 14:28
To: Bingfeng Mei
Cc: gcc-patches@gcc.gnu.org; Jakub Jelinek (ja...@redhat.com)
Subject: Re: [PATCH] Fix PR59569

On Mon, Dec 23, 2013 at 6:25 AM, Bingfeng Mei b...@broadcom.com wrote:
> Hi, Jakub,
> Thanks for suggestion. Please find attached patch. GCC is bootstrapped
> and passes testsuite on x86-64. Let me know if it is OK to commit.
>
> (Sorry if you received this mail twice as I forgot to set to text
> format).

Please test on 3 testcases in the PR and include some testcases in your
patch.

Thanks.

--
H.J.
Re: [PATCH] Fix PR59569
On Mon, Dec 23, 2013 at 02:23:49PM +, Bingfeng Mei wrote:
> Thanks for suggestion. Please find attached patch. GCC is bootstrapped
> and passes testsuite on x86-64. Let me know if it is OK to commit.

Ok, thanks.

Would be nice to add runtime testcases for both cases (test whether
vectorization with negative step storing a constant worked properly, and
similarly for external def (e.g. function parameter with
__attribute__((noinline, noclone)) on the function)), but that can be done as
a follow-up patch.

Jakub
[COMMITTED]RE: [PATCH] Fix PR59569
Committed. I will prepare some new tests as you suggested.

Thanks,
Bingfeng

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: 23 December 2013 14:53
To: Bingfeng Mei
Cc: gcc-patches@gcc.gnu.org; H.J. Lu (hjl.to...@gmail.com)
Subject: Re: [PATCH] Fix PR59569

On Mon, Dec 23, 2013 at 02:23:49PM +, Bingfeng Mei wrote:
> Thanks for suggestion. Please find attached patch. GCC is bootstrapped
> and passes testsuite on x86-64. Let me know if it is OK to commit.

Ok, thanks.

Would be nice to add runtime testcases for both cases (test whether
vectorization with negative step storing a constant worked properly, and
similarly for external def (e.g. function parameter with
__attribute__((noinline, noclone)) on the function)), but that can be done as
a follow-up patch.

Jakub
[PATCH] Fix for PR59585
Hi folks, This patch fixes problem with UBSan tests failing on remote target platforms (ARM via SSH). The error is caused by DejaGNU harness stripping trailing newline from test output (and thus causing pattern matching failures). Link to PR: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585 -Y diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c index 4e2a2b9..ec391e4 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c @@ -21,4 +21,4 @@ main (void) /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c index ee96738..c8820fa 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c @@ -20,4 +20,4 @@ main (void) /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division by zero } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c index f3ee23b..399071e 100644 --- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c +++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c @@ -18,4 +18,4 @@ main (void) /* { dg-output division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 
'int'(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 'int' } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c b/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c index db346cb..96f7984 100644 --- a/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c +++ b/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c @@ -10,8 +10,8 @@ bool b; __attribute__((noinline, noclone)) enum A foo (bool *p) { - *p = b; /* { dg-output load-bool-enum.c:13:\[^\n\r]*runtime error: load of value 4, which is not a valid value for type '(_B|b)ool'(\n|\r\n|\r) } */ - return a; /* { dg-output \[^\n\r]*load-bool-enum.c:14:\[^\n\r]*runtime error: load of value 9, which is not a valid value for type 'A'(\n|\r\n|\r) { target c++ } } */ + *p = b; /* { dg-output load-bool-enum.c:13:\[^\n\r]*runtime error: load of value 4, which is not a valid value for type '(_B|b)ool'(\n|\r\n|\r)* } */ + return a; /* { dg-output \[^\n\r]*load-bool-enum.c:14:\[^\n\r]*runtime error: load of value 9, which is not a valid value for type 'A'(\n|\r\n|\r)* { target c++ } } */ } int diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c b/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c index de2cd2d..f8af828 100644 --- a/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c +++ b/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c @@ -58,4 +58,4 @@ main (void) /* { dg-output \[^\n\r]*signed integer overflow: \[^\n\r]* \\+ 1024 cannot be represented in type 'long int'(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1 cannot be represented in type 'long int'(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*signed integer overflow: -1 \\+ -\[^\n\r]* cannot be represented in type 'long int'(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1024 cannot be represented in type 'long int'(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1024 cannot be represented in 
type 'long int' } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c b/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c index adcbfe1..ddfbb2e 100644 --- a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c +++ b/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c @@ -24,4 +24,4 @@ main (void) /* { dg-output signed integer overflow: 2147483647 \\* 2 cannot be represented in type 'int'(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*signed integer overflow: 2 \\* 2147483647 cannot be represented in type 'int'(\n|\r\n|\r) } */ /* { dg-output \[^\n\r]*signed integer overflow: \[^\n\r]* \\* 2 cannot be represented in type 'long int'(\n|\r\n|\r) } */ -/* { dg-output \[^\n\r]*signed integer overflow: 2 \\* \[^\n\r]* cannot be represented in type 'long int'(\n|\r\n|\r) } */ +/* { dg-output \[^\n\r]*signed integer overflow: 2 \\* \[^\n\r]* cannot be represented in type 'long int' } */ diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-4.c
Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.
On Wed, Dec 18, 2013 at 2:00 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, On 02 Dec 16:09, Kirill Yukhin wrote: Hello, On 19 Nov 12:08, Kirill Yukhin wrote: Hello, On 15 Nov 20:06, Kirill Yukhin wrote: Ping. Ping. Ping. Ping. Rebased patch in the bottom. At the end of the day, the patch looks fairly mechanical, adding extensions to insn templates in a consistent way. The approach with define_subst is already approved and used throughout the .md files. I have reviewed the patch, and didn't find any obvious mistakes - and there is a huge testsuite to find non-obvious ones, so I'm confident enough to approve the patch. So, OK for mainline, but I would kindly ask you to please wait a couple of days for possible Richard's comments Thanks, Uros.
Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.
On Mon, Dec 23, 2013 at 5:11 PM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Dec 18, 2013 at 2:00 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, On 02 Dec 16:09, Kirill Yukhin wrote: Hello, On 19 Nov 12:08, Kirill Yukhin wrote: Hello, On 15 Nov 20:06, Kirill Yukhin wrote: Ping. Ping. Ping. Ping. Rebased patch in the bottom. At the end of the day, the patch looks fairly mechanical, adding extensions to insn templates in a consistent way. The approach with define_subst is already approved and used throughout the .md files. I have reviewed the patch, and didn't find any obvious mistakes - and there is a huge testsuite to find non-obvious ones, so I'm confident enough to approve the patch. So, OK for mainline, but I would kindly ask you to please wait a couple of days for possible Richard's comments There is one issue: +(define_subst_attr round_constraint round vm v) +(define_subst_attr round_constraint2 round m v) +(define_subst_attr round_constraint3 round rm r) When substituting constraints, please also substitute corresponding operand predicate: nonimmediate_operand - register_operand in 1st and 3rd case memory_operand - register_operand in 2nd case. When you allow e.g. nonimmediate_operand in predicate, but only register in operand constraint, reload will resolve it, however - memory load will remain in the loop even if it is invariant. There is no pass to hoist invariant loads after reload. Uros.
Re: [PATCH] Fix for PR59585
On Mon, Dec 23, 2013 at 07:59:47PM +0400, Yury Gribov wrote: Hi folks, This patch fixes problem with UBSan tests failing on remote target platforms (ARM via SSH). The error is caused by DejaGNU harness stripping trailing newline from test output (and thus causing pattern matching failures). Sounds like a bug in whatever is stripping the newlines, how else can you test that the messages aren't on the same line? Or is it stripping just the final newline at the end of output? Still sounds like a bug elsewhere to me. Jakub
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Monday 23 December 2013, H.J. Lu wrote: On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Updated patch attached. Rebased, fixed coding style, moved new ISA enums to the end and applied H.J.Lu's patch. `Allan Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 206179) +++ gcc/config/i386/i386.c (working copy) @@ -29970,16 +29970,21 @@ P_SSE3, P_SSSE3, P_PROC_SSSE3, -P_SSE4_a, -P_PROC_SSE4_a, +P_SSE4_A, +P_PROC_SSE4_A, P_SSE4_1, P_SSE4_2, P_PROC_SSE4_2, P_POPCNT, P_AVX, +P_PROC_AVX, +P_FMA4, +P_XOP, +P_PROC_XOP, +P_FMA, +P_PROC_FMA, P_AVX2, -P_FMA, -P_PROC_FMA +P_PROC_AVX2 }; enum feature_priority priority = P_ZERO; @@ -29998,11 +30003,15 @@ {sse, P_SSE}, {sse2, P_SSE2}, {sse3, P_SSE3}, + {sse4a, P_SSE4_A}, {ssse3, P_SSSE3}, {sse4.1, P_SSE4_1}, {sse4.2, P_SSE4_2}, {popcnt, P_POPCNT}, {avx, P_AVX}, + {fma4, P_FMA4}, + {xop, P_XOP}, + {fma, P_FMA}, {avx2, P_AVX2} }; @@ -30054,26 +30063,50 @@ arg_str = nehalem; priority = P_PROC_SSE4_2; break; -case PROCESSOR_SANDYBRIDGE: - arg_str = sandybridge; - priority = P_PROC_SSE4_2; - break; + case PROCESSOR_SANDYBRIDGE: + arg_str = sandybridge; + priority = P_PROC_AVX; + break; + case PROCESSOR_HASWELL: + arg_str = haswell; + priority = P_PROC_SSE4_2; + break; case 
PROCESSOR_BONNELL: arg_str = bonnell; priority = P_PROC_SSSE3; break; + case PROCESSOR_SILVERMONT: + arg_str = silvermont; + priority = P_PROC_SSE4_2; + break; case PROCESSOR_AMDFAM10: arg_str = amdfam10h; - priority = P_PROC_SSE4_a; + priority = P_PROC_SSE4_A; break; + case PROCESSOR_BTVER1: + arg_str = bobcat; + priority = P_PROC_SSE4_A; + break; + case PROCESSOR_BTVER2: + arg_str = jaguar; + priority = P_PROC_AVX; + break; case PROCESSOR_BDVER1: arg_str = bdver1; - priority = P_PROC_FMA; + priority = P_PROC_XOP; break; case PROCESSOR_BDVER2: arg_str = bdver2; priority = P_PROC_FMA; break; + case PROCESSOR_BDVER3: + arg_str = bdver3; + priority = P_PROC_FMA; + break; + case PROCESSOR_BDVER4: + arg_str = bdver4; + priority = P_PROC_AVX2; + break; } } @@ -30938,6 +30971,10 @@ F_SSE4_2, F_AVX, F_AVX2, +F_SSE4_A, +F_FMA4, +F_XOP, +F_FMA, F_MAX }; @@ -30955,6 +30992,10 @@ M_AMDFAM10H, M_AMDFAM15H, M_INTEL_SILVERMONT, +M_INTEL_COREI7_AVX, +M_INTEL_CORE_AVX2, +M_AMD_BOBCAT, +M_AMD_JAGUAR, M_CPU_SUBTYPE_START, M_INTEL_COREI7_NEHALEM, M_INTEL_COREI7_WESTMERE, @@ -30965,7 +31006,9 @@ M_AMDFAM15H_BDVER1, M_AMDFAM15H_BDVER2, M_AMDFAM15H_BDVER3, -M_AMDFAM15H_BDVER4 +M_AMDFAM15H_BDVER4, +M_INTEL_COREI7_IVYBRIDGE, +M_INTEL_CORE_HASWELL }; static struct _arch_names_table @@ -30983,16 +31026,24 @@ {corei7, M_INTEL_COREI7}, {nehalem, M_INTEL_COREI7_NEHALEM}, {westmere, M_INTEL_COREI7_WESTMERE}, + {corei7-avx, M_INTEL_COREI7_AVX}, {sandybridge, M_INTEL_COREI7_SANDYBRIDGE}, + {ivybridge, M_INTEL_COREI7_IVYBRIDGE}, + {core-avx2, M_INTEL_CORE_AVX2}, + {haswell, M_INTEL_CORE_HASWELL}, + {bonnell, M_INTEL_BONNELL}, + {silvermont, M_INTEL_SILVERMONT}, {amdfam10h, M_AMDFAM10H}, {barcelona, M_AMDFAM10H_BARCELONA}, {shanghai, M_AMDFAM10H_SHANGHAI}, {istanbul, M_AMDFAM10H_ISTANBUL}, + {bobcat, M_AMD_BOBCAT}, {amdfam15h, M_AMDFAM15H}, {bdver1, M_AMDFAM15H_BDVER1}, {bdver2, M_AMDFAM15H_BDVER2}, {bdver3, M_AMDFAM15H_BDVER3}, {bdver4, M_AMDFAM15H_BDVER4}, + {jaguar, M_AMD_JAGUAR}, }; static 
struct _isa_names_table @@ -31009,9
Re: [PATCH i386 4/8] [AVX512] [6/8] Add substed patterns: `sae' subst.
On Wed, Dec 18, 2013 at 2:02 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, On 02 Dec 16:10, Kirill Yukhin wrote: Hello, On 19 Nov 12:11, Kirill Yukhin wrote: Hello, On 15 Nov 20:07, Kirill Yukhin wrote: Is it ok for trunk? Ping. Ping. Ping. Ping. Rebased patch in the bottom. +(define_subst_attr round_saeonly_constraint round_saeonly vm v) +(define_subst_attr round_saeonly_constraint2 round_saeonly m v) The same comment as in previous patch. Please introduce corresponding predicate substitution that will follow constraint changes. +(define_subst_attr round_saeonly_mode512bit_condition round_saeonly 1 (GET_MODE (operands[0]) == V16SFmode || GET_MODE (operands[0]) == V8DFmode)) +(define_subst_attr round_saeonly_mode512bit_condition_op1 round_saeonly 1 (GET_MODE (operands[1]) == V16SFmode || GET_MODE (operands[1]) == V8DFmode)) Use MODEmode == ... static checks in above conditions. The patch is OK for mainline with these changes. Thanks, Uros.
[PATCH] [followup to PR59569] new vect tests for store with negative step
Hi, Here are two vectorization tests for stores with a negative step. This is a follow-up to the PR59569 fix, which contained two tests for the ICE. The new tests are executable and check the vectorized code. OK to commit? Thanks, Bingfeng Attachment: patch_vect_tests
Re: [PATCH i386 4/8] [AVX512] [7/8] Add substed patterns: `round for expand' subst.
On Wed, Dec 18, 2013 at 2:04 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Hello, On 02 Dec 16:11, Kirill Yukhin wrote: Hello, On 19 Nov 12:12, Kirill Yukhin wrote: Hello, On 15 Nov 20:08, Kirill Yukhin wrote: Is it ok for trunk? Ping. Ping. Ping. Ping. Rebased patch in the bottom. This round_expand_predicate is the predicate substitution I was referring to in the review of 5/8. Please use it also in insn patterns, perhaps renamed as round_predicate, as it is not exclusive to expanders. As mentioned, predicates should mirror constraints as closely as possible. OK with these changes, Uros.
Re: [PATCH] [followup to PR59569] new vect tests for store with negative step
On Mon, Dec 23, 2013 at 04:43:17PM +, Bingfeng Mei wrote: Here are two vectorization tests for store with negative step. This is follow-up to PR59569 fix, which contains two tests for ICE. These tests are for vectorization tests and executable. OK to commit? --- testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0) +++ testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0) @@ -0,0 +1,27 @@ +/* { dg-require-effective-target vect_int } */ +#include stdlib.h + +__attribute__((noinline, noclone)) +void test1(short x[128]) +{ +int i; +for (i=127; i=0; i--) { + x[i] = 1234; +} +} + +int main (void) +{ + short x[128]; + int i; + test1 (x); + + for (i = 0; i 128; i++) + if (x[i] != 1234) + abort (); Can you please change both tests so that the x array is say 128+32 elements long instead of 128, you store some other pattern to the first 16 and last 16 elements in the array before calling test1 (do it say with asm (); inside of the loop to avoid vectorization), call test1 on x + 16 and afterwards verify that test1 didn't write anything before or after the buffer? Ok with that change. Jakub
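Editorial sketch of the guarded-buffer layout Jakub asks for above. This is not the committed test: the names N, GUARD, GUARD_VALUE and run_guarded_store_test are assumptions made for illustration, and the dg- directives are omitted. It assumes a GCC-compatible compiler (for the attributes and the empty asm statement).

```c
#include <assert.h>
#include <stddef.h>

#define N 128              /* elements written by test1 */
#define GUARD 16           /* guard elements on each side (hypothetical name) */
#define GUARD_VALUE 0x5a5a /* arbitrary pattern, different from 1234 */

/* Function under test: a store loop with a negative step. */
__attribute__((noinline, noclone))
void test1 (short x[N])
{
  int i;
  for (i = N - 1; i >= 0; i--)
    x[i] = 1234;
}

/* Returns 0 on success, or a nonzero code naming the check that failed. */
int run_guarded_store_test (void)
{
  short x[N + 2 * GUARD];
  int i;

  /* Fill both guard regions.  The empty asm is there to keep the
     compiler from vectorizing this setup loop.  */
  for (i = 0; i < GUARD; i++)
    {
      x[i] = GUARD_VALUE;
      x[GUARD + N + i] = GUARD_VALUE;
      __asm__ volatile ("" : : : "memory");
    }

  /* Run the store loop on the middle N elements only.  */
  test1 (x + GUARD);

  /* The payload must be fully written...  */
  for (i = 0; i < N; i++)
    if (x[GUARD + i] != 1234)
      return 1;

  /* ...and nothing before or after it may have been touched.  */
  for (i = 0; i < GUARD; i++)
    {
      if (x[i] != GUARD_VALUE)
        return 2;
      if (x[GUARD + N + i] != GUARD_VALUE)
        return 3;
    }
  return 0;
}
```

In a real testsuite file, main would call run_guarded_store_test and abort on a nonzero result; the point of the guards is that a vectorized store that over- or under-runs the 128-element window is detected rather than silently passing.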
Re: [PATCH i386 4/8] [AVX512] [8/8] Add substed patterns: `sae-only for expand' subst.
On Wed, Dec 18, 2013 at 2:16 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Rebased patch in the bottom. Adding the patch. The same comment as in 7/8 applies here. The predicate is not exclusive to expanders, should also be used in insn patterns. The name of the predicate is a bit weird, please name it simply round_saeonly_predicate. OK with these changes. Uros.
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Monday 23 December 2013, Allan Sandfeld Jensen wrote: On Monday 23 December 2013, H.J. Lu wrote: On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Updated patch attached. Rebased, fixed coding style, moved new ISA enums to the end and applied H.J.Lu's patch. Fixed merging mistake that left haswell with SSE4_2 priority. 
`Allan Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 206179) +++ gcc/config/i386/i386.c (working copy) @@ -29970,16 +29970,21 @@ P_SSE3, P_SSSE3, P_PROC_SSSE3, -P_SSE4_a, -P_PROC_SSE4_a, +P_SSE4_A, +P_PROC_SSE4_A, P_SSE4_1, P_SSE4_2, P_PROC_SSE4_2, P_POPCNT, P_AVX, +P_PROC_AVX, +P_FMA4, +P_XOP, +P_PROC_XOP, +P_FMA, +P_PROC_FMA, P_AVX2, -P_FMA, -P_PROC_FMA +P_PROC_AVX2 }; enum feature_priority priority = P_ZERO; @@ -29998,11 +30003,15 @@ {sse, P_SSE}, {sse2, P_SSE2}, {sse3, P_SSE3}, + {sse4a, P_SSE4_A}, {ssse3, P_SSSE3}, {sse4.1, P_SSE4_1}, {sse4.2, P_SSE4_2}, {popcnt, P_POPCNT}, {avx, P_AVX}, + {fma4, P_FMA4}, + {xop, P_XOP}, + {fma, P_FMA}, {avx2, P_AVX2} }; @@ -30054,26 +30063,50 @@ arg_str = nehalem; priority = P_PROC_SSE4_2; break; -case PROCESSOR_SANDYBRIDGE: - arg_str = sandybridge; - priority = P_PROC_SSE4_2; - break; + case PROCESSOR_SANDYBRIDGE: + arg_str = sandybridge; + priority = P_PROC_AVX; + break; + case PROCESSOR_HASWELL: + arg_str = haswell; + priority = P_PROC_AVX2; + break; case PROCESSOR_BONNELL: arg_str = bonnell; priority = P_PROC_SSSE3; break; + case PROCESSOR_SILVERMONT: + arg_str = silvermont; + priority = P_PROC_SSE4_2; + break; case PROCESSOR_AMDFAM10: arg_str = amdfam10h; - priority = P_PROC_SSE4_a; + priority = P_PROC_SSE4_A; break; + case PROCESSOR_BTVER1: + arg_str = bobcat; + priority = P_PROC_SSE4_A; + break; + case PROCESSOR_BTVER2: + arg_str = jaguar; + priority = P_PROC_AVX; + break; case PROCESSOR_BDVER1: arg_str = bdver1; - priority = P_PROC_FMA; + priority = P_PROC_XOP; break; case PROCESSOR_BDVER2: arg_str = bdver2; priority = P_PROC_FMA; break; + case PROCESSOR_BDVER3: + arg_str = bdver3; + priority = P_PROC_FMA; + break; + case PROCESSOR_BDVER4: + arg_str = bdver4; + priority = P_PROC_AVX2; + break; } } @@ -30938,6 +30971,10 @@ F_SSE4_2, F_AVX, F_AVX2, +F_SSE4_A, +F_FMA4, +F_XOP, +F_FMA, F_MAX }; @@ -30955,6 +30992,10 @@ M_AMDFAM10H, M_AMDFAM15H, M_INTEL_SILVERMONT, +M_INTEL_COREI7_AVX, 
+M_INTEL_CORE_AVX2, +M_AMD_BOBCAT, +M_AMD_JAGUAR, M_CPU_SUBTYPE_START, M_INTEL_COREI7_NEHALEM, M_INTEL_COREI7_WESTMERE, @@ -30965,7 +31006,9 @@ M_AMDFAM15H_BDVER1, M_AMDFAM15H_BDVER2, M_AMDFAM15H_BDVER3, -M_AMDFAM15H_BDVER4 +M_AMDFAM15H_BDVER4, +M_INTEL_COREI7_IVYBRIDGE, +M_INTEL_CORE_HASWELL }; static struct _arch_names_table @@ -30983,16 +31026,24 @@ {corei7, M_INTEL_COREI7}, {nehalem, M_INTEL_COREI7_NEHALEM}, {westmere, M_INTEL_COREI7_WESTMERE}, + {corei7-avx, M_INTEL_COREI7_AVX}, {sandybridge, M_INTEL_COREI7_SANDYBRIDGE}, + {ivybridge, M_INTEL_COREI7_IVYBRIDGE}, + {core-avx2, M_INTEL_CORE_AVX2}, + {haswell, M_INTEL_CORE_HASWELL}, + {bonnell, M_INTEL_BONNELL}, + {silvermont, M_INTEL_SILVERMONT}, {amdfam10h, M_AMDFAM10H}, {barcelona, M_AMDFAM10H_BARCELONA}, {shanghai, M_AMDFAM10H_SHANGHAI}, {istanbul, M_AMDFAM10H_ISTANBUL}, + {bobcat, M_AMD_BOBCAT}, {amdfam15h, M_AMDFAM15H}, {bdver1, M_AMDFAM15H_BDVER1}, {bdver2, M_AMDFAM15H_BDVER2}, {bdver3,
Re: [PATCH][x86] march aliases
On Mon, Dec 23, 2013 at 5:10 AM, H.J. Lu hjl.to...@gmail.com wrote: On Sun, Dec 22, 2013 at 11:11:12PM +0100, Uros Bizjak wrote: On Sun, Dec 22, 2013 at 8:28 PM, H.J. Lu hjl.to...@gmail.com wrote: Perhaps we should add sandybridge, ivybridge and haswell aliases for corei7-avx, core-avx-i, core-avx2? I mean, it is a nightmare to remember which one has the i7 in and which doesn't even for me. Yes please, I think this is a good idea. I've added aliases for haswell, sandybridge, ivybridge, bonnell, nehalem and silvermont. Old names, like corei7, core-avx-i, atom, .. don't have precise description for the processor. I think gcc driver should keep accepting them. But they should be marked as undocumented or deprecated. They should be removed from documentation. How about we leave these as -march=... to refer to the architecture, and reintroduce -mcpu= to refer to the exact cpu? Internally, the -mcpu would use some architecture specific base PTA_ attributes (as Jakub suggested) and would add some fine-tuning PTA_ attributes, based on -mcpu selection. This way, -march stays as is, and can still be used for some generally distributed binaries. -mcpu is problematic, because it means various things among different targets, and even on i?86/x86_64 it used to mean something already in the past. Sometimes -mcpu= is what -march= is now on i?86/x86_64, sometimes what -mtune= is. I'd say we don't need to deprecate anything, just add new aliases for the sometimes harder to remember names. But everything just IMHO. Jakub There are many problems with the current -march=xxx/-mtune=xxx for Intel processors, which aren't faults of GCC: 1. Atom processors can be Bonnell or Silvermont processors. -mtune=atom may not optimize for the Atom CPU being targeted. 2. Core I7 processors can be Nehalem, Westmere, Sandy Bridge, Ivy Bridge, Haswell or Broadwell. It is hard to tell which -mtune= to use for saying Core i7-3820QM. 3. 
There are Core i3/i5, Xeon, Celeron, Pentium processors which aren't called Core I7. They may be Nehalem, Westmere, Sandy Bridge, Ivy Bridge, Haswell or even Silvermont. We should move away from corei7, corei7-avx, core-avx-i, core-avx2, atom. Instead, we should use the actual processor names. We must accept those old names. But we should remove them from GCC manual to avoid any confusions. This patch adds -march=/mtune={nehalem,westmere,sandybridge, ivybridge,haswell,broadwell,bonnell,silvermont}. It also adds --with-arch=/--with-cpu= support as well as adds ivybridge, haswell, bonnell, silvermont to multi-arch function versioning. Any comments? This is the updated patch to add PTA_XXX as well as fix http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59580 to properly check --with-arch=/--with-cpu= options. Now we only need to add a new processor to x86_64_archs, which will enable its --with-arch=/--with-cpu= support. H.J. --- gcc/ 2013-12-22 H.J. Lu hongjiu...@intel.com Tocar Ilya ilya.to...@intel.com * config.gcc (x86_archs): New variable. (x86_64_archs): Likewise. (x86_cpus): Likewise. Use $x86_archs, $x86_64_archs and $x86_cpus to check valid --with-arch/--with-cpu= options. Support --with-arch=/--with-cpu={nehalem,westmere, sandybridge,ivybridge,haswell,broadwell,bonnell,silvermont}. * config/i386/core2.md: Replace corei7 with nehalem. * config/i386/driver-i386.c (host_detect_local_cpu): Use nehalem, westmere, sandybridge, ivybridge, haswell, bonnell, silvermont for cpu names. * config/i386/i386-c.c (ix86_target_macros_internal): Replace PROCESSOR_COREI7, PROCESSOR_COREI7_AVX, PROCESSOR_ATOM, PROCESSOR_SLM with PROCESSOR_NEHALEM, PROCESSOR_SANDYBRIDGE, PROCESSOR_BONNELL, PROCESSOR_SILVERMONT. Define __nehalem/__nehalem__, __sandybridge/__sandybridge__, __haswell/__haswell__, __tune_nehalem__, __tune_sandybridge__, __tune_haswell__, __bonnell/__bonnell__, __silvermont/__silvermont__, __tune_bonnell__, __tune_silvermont__. 
* config/i386/i386.c (m_COREI7): Renamed to ... (m_NEHALEM): This. (m_COREI7_AVX): Renamed to ... (m_SANDYBRIDGE): This. (m_ATOM): Renamed to ... (m_BONNELL): This. (m_SLM): Renamed to ... (m_SILVERMONT): This. (m_CORE_ALL): Updated. (cpu_names): Add nehalem, westmere, sandybridge, ivybridge, haswell, broadwell, bonnell, silvermont. (PTA_CORE2): New. (PTA_NEHALEM): Likewise. (PTA_WESTMERE): Likewise. (PTA_SANDYBRIDGE): Likewise. (PTA_IVYBRIDGE): Likewise. (PTA_HASWELL):
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen carew...@gmail.com wrote: On Monday 23 December 2013, Allan Sandfeld Jensen wrote: On Monday 23 December 2013, H.J. Lu wrote: On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Updated patch attached. Rebased, fixed coding style, moved new ISA enums to the end and applied H.J.Lu's patch. Fixed merging mistake that left haswell with SSE4_2 priority. `Allan +M_INTEL_COREI7_AVX, +M_INTEL_CORE_AVX2, Do we need them? M_INTEL_COREI7_AVX is the same M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2 is the same as M_INTEL_COREI7_HASWELL. +M_INTEL_CORE_HASWELL Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL. + {corei7-avx, M_INTEL_COREI7_AVX}, + {core-avx2, M_INTEL_CORE_AVX2}, Why do we need them? -- H.J.
[COMMITTED] RE: [PATCH] [followup to PR59569] new vect tests for store with negative step
Thanks. Committed with suggested change. Merry Christmas! Bingfeng -Original Message- From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: 23 December 2013 16:48 To: Bingfeng Mei Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [followup to PR59569] new vect tests for store with negative step On Mon, Dec 23, 2013 at 04:43:17PM +, Bingfeng Mei wrote: Here are two vectorization tests for store with negative step. This is follow-up to PR59569 fix, which contains two tests for ICE. These tests are for vectorization tests and executable. OK to commit? --- testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0) +++ testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0) @@ -0,0 +1,27 @@ +/* { dg-require-effective-target vect_int } */ +#include stdlib.h + +__attribute__((noinline, noclone)) +void test1(short x[128]) +{ +int i; +for (i=127; i=0; i--) { + x[i] = 1234; +} +} + +int main (void) +{ + short x[128]; + int i; + test1 (x); + + for (i = 0; i 128; i++) + if (x[i] != 1234) + abort (); Can you please change both tests so that the x array is say 128+32 elements long instead of 128, you store some other pattern to the first 16 and last 16 elements in the array before calling test1 (do it say with asm (); inside of the loop to avoid vectorization), call test1 on x + 16 and afterwards verify that test1 didn't write anything before or after the buffer? Ok with that change. Jakub patch_vect_tests Description: patch_vect_tests
Re: [PATCH 14/16] tree-ssa-loop-niter.c: use gimple_phi in a few places
On Fri, 2013-12-13 at 12:13 -0500, Andrew MacLeod wrote: On 12/13/2013 10:58 AM, David Malcolm wrote: { gimple stmt = SSA_NAME_DEF_STMT (x); @@ -2162,7 +2162,7 @@ chain_of_csts_start (struct loop *loop, tree x) if (gimple_code (stmt) == GIMPLE_PHI) { if (bb == loop->header) - return stmt; + return stmt->as_a_gimple_phi (); return NULL; } @@ -2195,10 +2195,10 @@ chain_of_csts_start (struct loop *loop, tree x) If such phi node exists, it is returned, otherwise NULL is returned. */ I dislike separating the checking of gimple_code () and the following as_a. I rather envisioned this sort of thing as being more of an abstraction improvement if we never have to check gimple_code()... Then you are also less locked into a specific implementation. So something more like: if (gimple_phi phi = stmt->dyncast_gimple_phi ()) { if (bb == loop->header) return phi; } IMO anyway... Thanks. My goal is to use these stronger types (a) to move type-checking to compile-time and (b) to (I hope) improve the readability of the code. I'm not trying to switch away from gimple_code for the home-grown RTTI per se. However, given that you prefer the above style, I'm now opting to use dyn_cast for the above kind of test in my ongoing work on this. The other consideration is that I'm trying to minimize the invasiveness of the patches, to reduce the number of conflicts that will occur when trying to merge this (for next stage1). So I'm sometimes tactically avoiding some constructs, e.g. to avoid needing to reindent large suites. FWIW I'm currently at 90 patches, and have reached some kind of halfway point, with 162 gimple_foo_ access functions now taking a more concrete type than gimple [1]; 159 to go.
That said, I think these accessors are something of a surface detail - I'm more interested in such concretizing of types *throughout* the middle-end, rather than just focusing on the gimple_foo_ access functions; for example, I now have the callgraph edge statements being gimple_call rather than just gimple. It's the latter kind of deeper change to typesafety that I'm most excited about. Andrew: hopefully this is all compatible with your proposed changes to types and expressions? I'm trying to just touch the statements themselves. Dave [1] including all of gimple_asm_*, gimple_bind_*, gimple_catch_*, gimple_eh_dispatch_*, gimple_eh_else_*, gimple_omp_atomic_load_*, gimple_omp_atomic_store_*, gimple_omp_continue_*, gimple_resx_*, gimple_switch_*, gimple_transaction_*.
Re: [PATCH] Fix for PR59585
Re-sending as plaintext. Jakub wrote: Or is it stripping just the final newline at the end of output? Exactly. Still sounds like a bug elsewhere to me. Let me investigate this deeper tomorrow (rebuilding fresh Dg, etc.). If it indeed turns out to be a feature of current DejaGNU, a workaround may be the easiest solution.
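Editorial sketch of the failure mode discussed in this thread: a pattern that requires a line terminator after the last message stops matching once the final newline is stripped, while a pattern that makes the terminator optional still matches. DejaGnu uses Tcl regular expressions; POSIX extended regexes from regex.h stand in for them here, and the sample output string and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Pattern in the style of the original test: a newline is mandatory.  */
const char *strict_pattern = "division by zero(\n|\r\n|\r)";
/* Pattern in the style of the load-bool-enum.c hunk: newline is optional.  */
const char *tolerant_pattern = "division by zero(\n|\r\n|\r)*";

/* Hypothetical program output, with and without the final newline.  */
const char *full_output = "t.c:5:3: runtime error: division by zero\n";
const char *stripped_output = "t.c:5:3: runtime error: division by zero";

/* Return 1 if TEXT contains a match for the extended regex PATTERN,
   0 otherwise (including when PATTERN fails to compile).  */
int matches (const char *pattern, const char *text)
{
  regex_t re;
  int ok;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  ok = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

With these inputs the strict pattern matches only the unstripped output, which is consistent with the symptom reported for remote targets. As Jakub notes, relaxing the terminator in the middle of the output would also stop the test from checking that messages are on separate lines, which is presumably why the first hunks of the patch only relax the last dg-output line.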
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Monday 23 December 2013, H.J. Lu wrote: On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen carew...@gmail.com wrote: On Monday 23 December 2013, Allan Sandfeld Jensen wrote: On Monday 23 December 2013, H.J. Lu wrote: On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Updated patch attached. Rebased, fixed coding style, moved new ISA enums to the end and applied H.J.Lu's patch. Fixed merging mistake that left haswell with SSE4_2 priority. `Allan +M_INTEL_COREI7_AVX, +M_INTEL_CORE_AVX2, Do we need them? M_INTEL_COREI7_AVX is the same M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2 is the same as M_INTEL_COREI7_HASWELL. M_INTEL_COREI7_AVX is the common model for both sandybridge and ivybridge. Matching PROCESSOR_SANDYBRIDGE, or march=corei7-avx. Similarly M_INTEL_CORE_AVX2 is the common model for haswell and broadwell, matching PROCESSOR_HASWELL or march=core-avx2. +M_INTEL_CORE_HASWELL Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL. I used the name core_haswell to make its prefix match that of its model core_avx2 (as opposed to corei7_avx for instance). + {corei7-avx, M_INTEL_COREI7_AVX}, + {core-avx2, M_INTEL_CORE_AVX2}, Why do we need them? 
Without the existence of these entries, __attribute__((target("corei7-avx"))) or __attribute__((target("core-avx2"))) failed to compile because of how parameters to attributes were verified. Regards `Allan
Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning
On Mon, Dec 23, 2013 at 10:33 AM, Allan Sandfeld Jensen carew...@gmail.com wrote: On Monday 23 December 2013, H.J. Lu wrote: On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen carew...@gmail.com wrote: On Monday 23 December 2013, Allan Sandfeld Jensen wrote: On Monday 23 December 2013, H.J. Lu wrote: On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote: On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote: Sorry, I must have been looking at an older version, but as I said I already did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would be like I will need to make an updated patch to move the new ISAs to the end of the list anyway. I will send it in a few days to give AMD or Intel developers time to comment on the current version. I renamed Intel processor names. Please update your patch. Here is my patch to add more Intel processor support. You can add it to your patch. Updated patch attached. Rebased, fixed coding style, moved new ISA enums to the end and applied H.J.Lu's patch. Fixed merging mistake that left haswell with SSE4_2 priority. `Allan +M_INTEL_COREI7_AVX, +M_INTEL_CORE_AVX2, Do we need them? M_INTEL_COREI7_AVX is the same M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2 is the same as M_INTEL_COREI7_HASWELL. M_INTEL_COREI7_AVX is the common model for both sandybridge and ivybridge. Matching PROCESSOR_SANDYBRIDGE, or march=corei7-avx. Similarly M_INTEL_CORE_AVX2 is the common model for haswell and broadwell, matching PROCESSOR_HASWELL or march=core-avx2. If you use {corei7-avx, M_INTEL_COREI7_SANDYBRIDGE}, {core-avx2, M_INTEL_COREI7_HASWELL}, will it cause any problems?
When there are both __attribute__((target(corei7-avx))) and __attribute__((target(sandybridge))) we should either issue an error or silently drop __attribute__((target(corei7-avx))) instead of generating to 2 identical copies of the same function. +M_INTEL_CORE_HASWELL Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL. I used the name core_haswell to make its prefix match that of its model core_avx2 (as opposed to corei7_avx for instance). We should remove all internal references to corei7-avx and core-avx2 if possible. + {corei7-avx, M_INTEL_COREI7_AVX}, + {core-avx2, M_INTEL_CORE_AVX2}, Why do we need them? Without the existence of these entries, __attribute__((target(corei7-avx))) or __attribute__((target(core-avx2)) failed to compile because of how parameters to attributes were verified. -- H.J.
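The concern about two attribute spellings resolving to the same processor model can be illustrated with a small sketch. This is not GCC's implementation: the `Model` enum, `model_for` and `duplicate_targets` are hypothetical names introduced only for illustration, and the alias table simply mirrors the name pairs discussed in this thread.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model identifiers; these are NOT GCC's actual enumerators.
enum class Model { SandyBridge, Haswell, Unknown };

// Hypothetical alias table: two spellings resolve to one model.
Model model_for(const std::string& name) {
    static const std::pair<const char*, Model> table[] = {
        {"sandybridge", Model::SandyBridge},
        {"corei7-avx", Model::SandyBridge},
        {"haswell", Model::Haswell},
        {"core-avx2", Model::Haswell},
    };
    for (const auto& entry : table) {
        if (name == entry.first) return entry.second;
    }
    return Model::Unknown;
}

// Returns every requested name whose model was already requested by an
// earlier name. Such a name would yield an identical function version,
// so a front end could either diagnose it or drop it.
std::vector<std::string> duplicate_targets(const std::vector<std::string>& names) {
    std::set<Model> seen;
    std::vector<std::string> duplicates;
    for (const auto& name : names) {
        const Model m = model_for(name);
        if (m == Model::Unknown) continue;  // unknown names are ignored here
        if (!seen.insert(m).second) duplicates.push_back(name);
    }
    return duplicates;
}
```

With this table, requesting both "sandybridge" and "corei7-avx" reports "corei7-avx" as redundant, which corresponds to the "error or silently drop" choice raised above.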
Fix use of stack-pointer-register as a temporary for CRIS
The circumstances are a bit odd; the stack-pointer (sp) is never the target for a direct assignment in ordinary generated code. Still, this happens for gcc.dg/pr50251.c, calling __builtin_stack_restore. There's a bug in several define_splits in the CRIS port, in that the destination of the split insn is used as a temporary, so sp is set to something unusable as a stack-pointer. You don't want that in a context where interrupts use the same stack as the running program; there's no red-zone or anything. Though, it *would* be valid for contexts where the user stack is not the system (interrupt) stack, but introducing that distinction is not worthwhile.

I'll mark this with middle-end/59584 only because it makes a nice test-case should anyone want to work on the general bug noticed there (revert the commit locally, observe ICE for gcc.dg/pr50251.c). The general bug for PR59584 is that GCC can't handle fixing up the REG_ARGS_SIZE note being on a direct assignment to the stack-pointer, therefore no define_split must match it. This patch just removes the define_split; the bug is likely to hit other targets when __builtin_stack_restore is called.

PS. I wish we had a name field for define_splits... I don't think a string would collide, syntactically. Maybe later.

Tested cris-elf, makes gcc.dg/pr50251.c pass again.

PR middle-end/59584
* config/cris/predicates.md (cris_nonsp_register_operand): New define_predicate.
* config/cris/cris.md: Replace register_operand with cris_nonsp_register_operand for destinations in all define_splits where a register is set more than once.
Index: gcc/config/cris/cris.md
===
--- gcc/config/cris/cris.md (revision 206176)
+++ gcc/config/cris/cris.md (working copy)
@@ -758,7 +758,7 @@ (define_split
 (match_operand:SI 1 "const_int_operand" "")) (match_operand:SI 2 "register_operand" ""))]) (match_operand 3 "register_operand" ""))
- (set (match_operand:SI 4 "register_operand" "")
+ (set (match_operand:SI 4 "cris_nonsp_register_operand" "")
 (plus:SI (mult:SI (match_dup 0) (match_dup 1)) (match_dup 2)))])]
@@ -859,7 +859,7 @@ (define_split
 (match_operand:SI 0 "cris_bdap_operand" "") (match_operand:SI 1 "cris_bdap_operand" ""))]) (match_operand 2 "register_operand" ""))
- (set (match_operand:SI 3 "register_operand" "")
+ (set (match_operand:SI 3 "cris_nonsp_register_operand" "")
 (plus:SI (match_dup 0) (match_dup 1)))])]
 "reload_completed && reg_overlap_mentioned_p (operands[3], operands[2])"
 [(set (match_dup 4) (match_dup 2))
@@ -3960,7 +3960,7 @@ (define_expand "casesi"
 ;; up.
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 4 "cris_operand_extend_operator" [(match_operand 1 "register_operand" "")
@@ -3990,7 +3990,7 @@ (define_split
 ;; Call this op-extend-split-rx=rz
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 4 "cris_plus_or_bound_operator" [(match_operand 1 "register_operand" "")
@@ -4018,7 +4018,7 @@ (define_split
 ;; Call this op-extend-split-swapped
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 4 "cris_plus_or_bound_operator" [(match_operator
@@ -4044,7 +4044,7 @@ (define_split
 ;; bound. Call this op-extend-split-swapped-rx=rz.
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 4 "cris_plus_or_bound_operator" [(match_operator
@@ -4075,7 +4075,7 @@ (define_split
 ;; Call this op-extend.
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 3 "cris_orthogonal_operator" [(match_operand 1 "register_operand" "")
@@ -4099,7 +4099,7 @@ (define_split
 ;; Call this op-split-rx=rz
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 3 "cris_commutative_orth_op" [(match_operand 2 "memory_operand" "")
@@ -4123,7 +4123,7 @@ (define_split
 ;; Call this op-split-swapped.
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 3 "cris_commutative_orth_op" [(match_operand 1 "register_operand" "")
@@ -4146,7 +4146,7 @@ (define_split
 ;; Call this op-split-swapped-rx=rz.
 (define_split
- [(set (match_operand 0 "register_operand" "")
+ [(set (match_operand 0 "cris_nonsp_register_operand" "")
 (match_operator 3
Committed: fix PR target/59203, typo in cris.c
Spotted by David Binderman and cppcheck, thanks. The interesting cases wouldn't be exposed by a cris-elf build, but I made a regtest-run nonetheless: the fix has actually been in our local tree for quite some time together with TLS for CRIS v32 so I'm not worried about fallout. (Upstreaming that? Hm... one excuse I use is that I've been waiting for TLS for CRIS v10 to materialize for the Linux kernel, along the v32 lines but using $IRP, but that never happened.)

PR target/59203
* config/cris/cris.c (cris_pic_symbol_type_of): Fix typo, checking t1 twice instead of t1 and t2 respectively.

Index: gcc/config/cris/cris.c
===
--- gcc/config/cris/cris.c (revision 206176)
+++ gcc/config/cris/cris.c (working copy)
@@ -2493,7 +2493,7 @@ cris_pic_symbol_type_of (const_rtx x)
 gcc_assert (t1 == cris_no_symbol || t2 == cris_no_symbol);
- if (t1 == cris_got_symbol || t1 == cris_got_symbol)
+ if (t1 == cris_got_symbol || t2 == cris_got_symbol)
 return cris_got_symbol_needing_fixup;
 return t1 != cris_no_symbol ? t1 : t2;

brgds, H-P
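To see why the duplicated operand matters, here is a minimal standalone model of the two lines touched by the patch. It is only a sketch: `SymbolKind`, `combine_buggy` and `combine_fixed` are hypothetical stand-ins and not the real types or functions in cris.c.

```cpp
// Hypothetical stand-ins for the symbol kinds named in the patch.
enum SymbolKind {
    no_symbol,
    got_symbol,
    got_symbol_needing_fixup,
    other_symbol
};

// Models the code before the fix: t1 is tested twice, so a GOT symbol
// that arrives in t2 is never detected.
SymbolKind combine_buggy(SymbolKind t1, SymbolKind t2) {
    if (t1 == got_symbol || t1 == got_symbol)
        return got_symbol_needing_fixup;
    return t1 != no_symbol ? t1 : t2;
}

// Models the code after the fix: both operands are tested.
SymbolKind combine_fixed(SymbolKind t1, SymbolKind t2) {
    if (t1 == got_symbol || t2 == got_symbol)
        return got_symbol_needing_fixup;
    return t1 != no_symbol ? t1 : t2;
}
```

In this model the two versions agree whenever the GOT symbol is the first operand and differ only when it is the second, which fits the remark that the interesting cases are not exposed by every build.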
C++ PATCH for c++/59349 (ICE with empty lambda init-capture initializer)
We need to handle getting NULL_TREE for the capture initializer, so that we don't crash when trying to do things like look at its type.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 135f0f322516ce986ed13a214ca9351bd1963749
Author: Jason Merrill ja...@redhat.com
Date: Mon Dec 23 15:05:00 2013 -0500

PR c++/59349
* parser.c (cp_parser_lambda_introducer): Handle empty init.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 2a2cbf0..4ef0f05 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -8898,6 +8898,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
 capture_init_expr = cp_parser_initializer (parser, &direct, &non_constant);
 explicit_init_p = true;
+ if (capture_init_expr == NULL_TREE)
+ {
+ error ("empty initializer for lambda init-capture");
+ capture_init_expr = error_mark_node;
+ }
 }
 else
 {
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C b/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
new file mode 100644
index 000..ad152cf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
@@ -0,0 +1,6 @@
+// PR c++/59349
+// { dg-options "-std=c++1y" }
+
+int foo () {
+ [bar()]{}; // { dg-error "empty initializer" }
+}
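For contrast with the rejected `[bar()]{}` in the new test, a well-formed C++14 init-capture always supplies a non-empty initializer. The sketch below is only an illustration of that language feature; `add_captured` and `moved_length` are made-up names and have nothing to do with the GCC sources.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// An init-capture introduces a new closure member and initializes it
// from an arbitrary expression.
int add_captured(int base) {
    auto f = [sum = base + 1](int x) { return sum + x; };
    return f(2);
}

// A common use: moving an object into the closure.
std::size_t moved_length(std::string s) {
    auto f = [owned = std::move(s)] { return owned.size(); };
    return f();
}
```

Writing `[sum()]` instead would leave the initializer empty, which is the case the patch turns from an internal compiler error into an ordinary diagnostic.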
C++ PATCH for c++/59271 (ICE with polymorphic lambda and VLA)
This testcase was crashing in strip_typedefs because it uses build_cplus_array_type, while the original type was built with the generic build_array_type, and the two functions work differently within a template such that we violated an assert in strip_typedefs. Fixed by using build_cplus_array_type consistently.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 9e0c771b79ce3c143fffae2fd09ecdc6f88041d9
Author: Jason Merrill ja...@redhat.com
Date: Mon Dec 23 15:20:38 2013 -0500

PR c++/59271
* lambda.c (build_capture_proxy): Use build_cplus_array_type.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 24aa2c5..bd8df1d 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -377,8 +377,8 @@ build_capture_proxy (tree member)
 tree ptr = build_simple_component_ref (object, field);
 field = next_initializable_field (DECL_CHAIN (field));
 tree max = build_simple_component_ref (object, field);
- type = build_array_type (TREE_TYPE (TREE_TYPE (ptr)),
- build_index_type (max));
+ type = build_cplus_array_type (TREE_TYPE (TREE_TYPE (ptr)),
+ build_index_type (max));
 type = build_reference_type (type);
 REFERENCE_VLA_OK (type) = true;
 object = convert (type, ptr);
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
new file mode 100644
index 000..556722c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
@@ -0,0 +1,24 @@
+// PR c++/59271
+// { dg-options "-std=c++1y" }
+
+extern "C" int printf (const char *, ...);
+
+void f(int n)
+{
+ int a[n];
+
+ for (auto& i : a)
+ {
+ i = &i - a;
+ }
+
+ [&a] (auto m)
+ {
+ for (auto i : a)
+ {
+ printf ("%d", i);
+ }
+
+ return m;
+ };
+}
[PING][GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++
Ping!

-Balaji V. Iyer.

-Original Message-
From: Iyer, Balaji V
Sent: Thursday, December 19, 2013 1:12 PM
To: Jakub Jelinek
Cc: 'Aldy Hernandez (al...@redhat.com)'; 'gcc-patches@gcc.gnu.org'
Subject: RE: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++

Hi Jakub,

Attached, please find a fixed patch. I have answered your questions below. Is this OK for trunk?

Here are the ChangeLog entries:

Gcc/cp/ChangeLog
2013-12-19 Balaji V. Iyer balaji.v.i...@intel.com

* parser.c (cp_parser_direct_declarator): When Cilk Plus is enabled see if there is an attribute after function decl. If so, then parse them now.
(cp_parser_late_return_type_opt): Handle parsing of Cilk Plus SIMD enabled function late parsing.
(cp_parser_gnu_attribute_list): Parse all the tokens for the vector attribute for a SIMD-enabled function.
(cp_parser_omp_all_clauses): Skip parsing to the end of pragma when the function is used by SIMD-enabled function (indicated by NULL pragma token). Added 3 new clauses: PRAGMA_CILK_CLAUSE_MASK, PRAGMA_CILK_CLAUSE_NOMASK and PRAGMA_CILK_CLAUSE_VECTORLENGTH
(cp_parser_cilk_simd_vectorlength): Modified this function to handle vectorlength clause in SIMD-enabled function and #pragma SIMD's vectorlength clause. Added a new bool parameter to differentiate between the two.
(cp_parser_cilk_simd_fn_vector_attrs): New function.
(is_cilkplus_vector_p): Likewise.
(cp_parser_late_parsing_elem_fn_info): Likewise.
(cp_parser_omp_clause_name): Added a check for mask, nomask and vectorlength clauses when Cilk Plus is enabled.
(cp_parser_omp_clause_linear): Added a new parameter of type bool and emit a sorry message when step size is a parameter.
* parser.h (cp_parser::cilk_simd_fn_info): New field.

Testsuite/ChangeLog
2013-12-19 Balaji V. Iyer balaji.v.i...@intel.com

* g++.dg/cilk-plus/cilk-plus.exp: Called the C/C++ common tests for SIMD enabled function.
* g++.dg/cilk-plus/ef_test.C: New test.
* c-c++-common/cilk-plus/vlength_errors.c: Added new dg-error tags to differentiate C error messages from C++ ones.

Thanks,
Balaji V. Iyer.

-Original Message-
From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: Thursday, December 19, 2013 2:23 AM
To: Iyer, Balaji V
Cc: 'Aldy Hernandez (al...@redhat.com)'; 'gcc-patches@gcc.gnu.org'
Subject: Re: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++

On Wed, Dec 18, 2013 at 11:36:04PM +, Iyer, Balaji V wrote:

--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1124,6 +1124,10 @@ is_late_template_attribute (tree attr, tree decl)
 && is_attribute_p ("omp declare simd", name))
 return true;
+ /* Ditto as above for Cilk Plus SIMD-enabled function attributes. */
+ if (flag_enable_cilkplus && is_attribute_p ("cilk simd function", name))
+ return true;

Why? It doesn't have any argument, so why should it be processed late?

Fixed.

@@ -17097,6 +17102,14 @@ cp_parser_direct_declarator (cp_parser* parser,
 attrs = cp_parser_std_attribute_spec_seq (parser);
+ /* In here, we handle cases where attribute is used after
+ the function declaration. For example:
+ void func (int x) __attribute__((vector(..))); */
+ if (flag_enable_cilkplus
+ && cp_lexer_next_token_is_keyword (parser->lexer,
+ RID_ATTRIBUTE))
+ attrs = chainon (cp_parser_gnu_attributes_opt (parser),
+ attrs);
 late_return = (cp_parser_late_return_type_opt (parser, declarator, memfn ? cv_quals : -1));

Doesn't this change the grammar (for all attributes, not just Cilk+ specific ones) just based on whether -fcilkplus has been specified or not?

OK. Fixed this by making it parse tentatively (sort of similar to how you parse attributes after labels (line #9584))

@@ -17820,10 +17833,14 @@ cp_parser_late_return_type_opt (cp_parser* parser, cp_declarator *declarator,
 && declarator
 && declarator->kind == cdk_id);
+ bool cilk_simd_fn_vector_p = (parser->cilk_simd_fn_info
+ && declarator
+ && declarator->kind == cdk_id);

Formatting looks wrong, put = on the next line and align && right below parser.

Fixed.

+
+cp_omp_declare_simd_data info;

Global var? Why? Isn't heap or GC allocation better?

Fixed. Replaced it with XNEW and XDELETE combinations instead of setting the address of a global value.

+ /* The vectorlength clause
Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.
Patch attached. Ok for trunk? Just noticed Uros's input about predicates. So, ok with fix of predicate?
Re: [PATCH] Fix for PR59585
Yury wrote: Still sounds like a bug elsewhere to me. Let me investigate this deeper tomorrow (rebuilding fresh Dg, etc.).

So I've double-checked that this is a problem with the trunk DejaGNU rsh.exp script removing the trailing newline from test output:

# Delete one trailing \n because that is what `exec' will do and we want
# to behave identical to it.
regsub "\n$" $output "" output

I can report this to the DejaGNU mailing list, but even if they agree to fix it we'll still have to do something about legacy Dg installations. I suggest working around it by removing the trailing newline, as suggested by the original patch (or maybe replacing it with $ ?). What's your opinion?

-Y
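The effect of the quoted regsub can be shown with a tiny sketch: at most one trailing newline is dropped, so a test whose expected output ends in exactly one "\n" no longer matches. The helper name `strip_one_trailing_newline` is hypothetical and is not part of DejaGNU; it only mirrors the behaviour described above.

```cpp
#include <string>

// Removes at most one trailing '\n' and leaves everything else intact,
// mirroring what the quoted regsub does to the captured test output.
std::string strip_one_trailing_newline(std::string output) {
    if (!output.empty() && output.back() == '\n') {
        output.pop_back();
    }
    return output;
}
```

Output that ends in two newlines keeps one of them, which is why a pattern that insists on a final newline can pass locally and fail when the test is run through rsh.exp.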
Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.
On Tue, Dec 24, 2013 at 5:57 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

Patch attached. Ok for trunk?

Just noticed Uros's input about predicates. So, ok with fix of predicate?

Please retest and repost the patch with the predicate fix. Looks good otherwise, with a couple of minor changes below:

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c
new file mode 100644
index 000..26d7c3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512f -DAVX512F" } */

Please move defines from options to source.

+(define_subst_attr "round_prefix" "round" "vex" "evex")
 (define_subst_attr "round_mode512bit_condition" "round" "1" "(GET_MODE (operands[0]) == V16SFmode || GET_MODE (operands[0]) == V8DFmode)")
 (define_subst_attr "round_modev4sf_condition" "round" "1" "(GET_MODE (operands[0]) == V4SFmode)")

While here, can you also change conditions to static checks (<MODE>mode == ) ?

Uros.