RE: [PATCH] RE: gcc parallel make check
Attached is an extended version of the patch. It brings a 100% improvement in make -j32 -k check-gcc (down from 20min to 10min) by modifying check_gcc_parallelize.

It includes one non-trivial part, namely a split of the target exps. They are now all split using a common choice (based on i386), which I believe is reasonable, as it is the target with the most tests, and the patterns will be somewhat similar for other targets (e.g. split of p(rxxx)). The implementation of this in the makefile uses an odd-looking technique to substitute spaces with commas in a variable; if this can be done more elegantly, I'm happy to make the change.

Bootstrap and testing revealed one issue: i386.exp hard-codes a loop for the testcase 'vect-args.c' in order to test 10 different combinations of options. With the current split (i.e. target x4) this test will thus be executed 4 times. There are two easy options:

1) keep the current setup, the overhead is small
2) keep the .exp file simple and just replicate this test 10x

I've selected 1), but I can update the patch with 2). Ideally dg-options in the testcase file itself could be repeated, but I haven't found an example of this.

The script now includes sorting and compression of the ranges, and an additional sanity check on the input, i.e. that file names start with [0-9A-Za-z]. A few files seem to start with _ or # (in ./gcc.dg/cpp/).

I'll follow up with a separate patch to improve check_g++_parallelize. A full 'make -j32 -k check' is now dominated by libstdc++ testing, which contains single goals that run ~1100s (e.g. regex related tests). These use a slightly different syntax (see gcc/libstdc++-v3/testsuite/Makefile.am) and I'm not yet sure how to deal with the .am files.

Is the current patch OK for trunk?
Joost

patch-speedup-checkfortran-v05.CL
Description: patch-speedup-checkfortran-v05.CL

Index: contrib/generate_tcl_patterns.sh
===================================================================
--- contrib/generate_tcl_patterns.sh (revision 0)
+++ contrib/generate_tcl_patterns.sh (revision 0)
@@ -0,0 +1,114 @@
+#! /bin/sh
+
+#
+# based on a list of filenames as input, starting with [0-9A-Za-z],
+# generate regexps that match subsets trying to not exceed a
+# 'maxcount' parameter. Most useful to generate the
+# check_LANG_parallelize assignments needed to split
+# testsuite directories, defining prefix appropriately.
+#
+# Example usage:
+# cd gcc/gcc/testsuite/gfortran.dg
+# ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 dg.exp=gfortran.dg/
+#
+# the first parameter is the maximum number of files.
+# the second parameter the prefix used for printing.
+#
+
+# Copyright (C) 2014 Free Software Foundation
+# Contributed by Joost VandeVondele <joost.vandevond...@mat.ethz.ch>
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING. If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+
+gawk -v maxcount=$1 -v prefix=$2 '
+BEGIN{
+  # list of allowed starting chars for a file name in a dir to split
+  achars="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+  ranget=112233
+}
+{
+  if (index(achars,substr($1,1,1))==0){
+    print "file : " $1 " does not start with an allowed character."
+    _assert_exit = 1
+    exit 1
+  }
+  nfiles++ ; files[nfiles]=$1
+}
+END{
+  if (_assert_exit) exit 1
+  for(i=1; i<=length(achars); i++) count[substr(achars,i,1)]=0
+  for(i=1; i<=nfiles; i++) {
+    if (length(files[i])>0) { count[substr(files[i],1,1)]++ }
+  };
+  asort(count,ordered)
+  countsingle=0
+  groups=0
+  label=""
+  for(i=length(achars);i>=1;i--) {
+    countsingle=countsingle+ordered[i]
+    for(j=1;j<=length(achars);j++) {
+      if(count[substr(achars,j,1)]==ordered[i]) found=substr(achars,j,1)
+    }
+    count[found]=-1
+    label=label found
+    if(i==1) { val=maxcount+1 } else { val=ordered[i-1] }
+    if(countsingle+val>maxcount) {
+      subset[label]=countsingle
+      print "Adding label: ", label, "matching files: " countsingle
+      groups++
+      countsingle=0
+      label=""
+    }
+  }
+  print "patterns:"
+  asort(subset,ordered)
+  for(i=groups;i>=1;i--) {
+    for(j in subset){
+      if(subset[j]==ordered[i]) found=j
+    }
+    subset[found]=-1
+    if (length(found)==1) {
+      printf("%s%s* \\\n",prefix,found)
+    } else {
+      sortandcompress()
+
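The grouping idea described in the script header can be illustrated with a small Python sketch. This is not the awk script's exact logic (it has no look-ahead on the next count and no range compression); it is a simplified greedy variant, and the names `group_by_first_char` and `to_pattern` as well as the sample file names are hypothetical.

```python
from collections import Counter


def group_by_first_char(names, maxcount):
    """Greedily merge first characters into groups of at most maxcount files.

    A single character that alone exceeds maxcount still forms its own group.
    """
    counts = Counter(name[0] for name in names)
    groups, label, total = [], "", 0
    # Visit starting characters from most to least frequent.
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if label and total + n > maxcount:
            groups.append((label, total))
            label, total = "", 0
        label += ch
        total += n
    if label:
        groups.append((label, total))
    return groups


def to_pattern(prefix, label):
    """Turn a group label into a glob-style pattern such as prefix[abc]*."""
    chars = "".join(sorted(label))
    return f"{prefix}{chars}*" if len(chars) == 1 else f"{prefix}[{chars}]*"


# Hypothetical file names, for illustration only.
names = ["pr1.f90", "pr2.f90", "pr3.f90",
         "alloc_1.f90", "bound_1.f90", "char_1.f90"]
groups = group_by_first_char(names, maxcount=3)
patterns = [to_pattern("dg.exp=gfortran.dg/", label) for label, _ in groups]
for p in patterns:
    print(p)
```

With this toy input the common prefix letter `p` gets a goal of its own, while the rarer letters share one.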
GCC 5 snapshots produce broken kernel for powerpc-e500v2-linux-gnuspe?
Hello,

I've recently faced an issue I'm afraid I'm currently unable to debug. When building an arbitrary version of the Linux kernel for the powerpc-e500v2-linux-gnuspe target, it seems gcc prior to 5 produces a good image which boots just fine, and current gcc 5 snapshots (4.10.0-alpha20140810 for example) produce an image which hangs just after U-Boot hands over to the kernel. This behavior is well reproducible on real hardware as well as under qemu.

I've prepared a minimal kernel config which is dysfunctional as is but still enough to demonstrate the problem in qemu. I believe the exact Linux version number doesn't actually matter here, but see the attachment for details.

Compare the output produced by u-boot and this minified kernel build using gcc 4.9.1 and the 4.10.0-alpha20140810 snapshot.

% qemu-system-ppc --version
QEMU emulator version 2.1.0, Copyright (c) 2003-2008 Fabrice Bellard

% qemu-system-ppc -cpu e500v2 -M mpc8544ds -bios /usr/share/qemu/u-boot.e500 \
  -kernel arch/powerpc/boot/uImage-gcc4.9.1 -nographic

U-Boot 2014.07-rc1-00079-g2072e72-dirty (May 16 2014 - 13:04:54)

CPU: Unknown, Version: 0.0, (0x)
Core: e500, Version: 2.2, (0x80210022)
Clock Configuration:
  CPU0:400 MHz,
  CCB:400 MHz,
  DDR:200 MHz (400 MT/s data rate), LBC: unknown (LCRR[CLKDIV] = 0x00)
L1: D-cache 32 KiB enabled
    I-cache 32 KiB enabled
DRAM: 128 MiB
L2: disabled
Using default environment

PCI: base address e0008000
  00:11.0 - 1af4:1000 - Network controller
PCI1: Bus 00 - 00
In: serial
Out: serial
Err: serial
Net: No ethernet found.
Hit any key to stop autoboot: 0
## Booting kernel from Legacy Image at 0200 ...
   Image Name: Linux-2.6.35+
   Image Type: PowerPC Linux Kernel Image (gzip compressed)
   Data Size: 507635 Bytes = 495.7 KiB
   Load Address:
   Entry Point:
   Verifying Checksum ... OK
## Flattened Device Tree blob at e800
   Booting using the fdt blob at 0xe800
   Uncompressing Kernel Image ... OK
   Loading Device Tree to 03fec000, end 03ffefff ... OK
setup_arch: bootmem
mpc85xx_ds_setup_arch()
arch: exit
<qemu terminated by user>

% qemu-system-ppc -cpu e500v2 -M mpc8544ds -bios /usr/share/qemu/u-boot.e500 \
  -kernel arch/powerpc/boot/uImage-gcc5 -nographic

U-Boot 2014.07-rc1-00079-g2072e72-dirty (May 16 2014 - 13:04:54)

<hardware enumeration output just as above>

Hit any key to stop autoboot: 0
## Booting kernel from Legacy Image at 0200 ...
   Image Name: Linux-2.6.35+
   Image Type: PowerPC Linux Kernel Image (gzip compressed)
   Data Size: 505303 Bytes = 493.5 KiB
   Load Address:
   Entry Point:
   Verifying Checksum ... OK
## Flattened Device Tree blob at e800
   Booting using the fdt blob at 0xe800
   Uncompressing Kernel Image ... OK
   Loading Device Tree to 03fec000, end 03ffefff ... OK
<hang>

I also have another seemingly related issue: a kernel module built w/ the 4.10.0-alpha20140810 snapshot against a kernel configured for powerpc-e500v2-linux-gnuspe lacks the .gnu.linkonce.this_module and .rela.gnu.linkonce.this_module sections. Current stable versions of gcc emit these sections but also drop them (and, consequently, the init_module() and cleanup_module() symbols) when CFLAGS_MODULE is modified in any way, even w/ CFLAGS_MODULE=-mno-isel for example.

I now have absolutely no idea what to do next to find the cause of (1) gcc 5 snapshots producing an unbootable kernel, and (2) different gcc versions producing garbled kernel modules when configured for the SPE target. As for modules, I've compared the assembler output of gcc configured for powerpc-e300c3-linux-gnu and powerpc-e500v2-linux-gnuspe and failed to spot real differences, but the installed version of binutils is exactly the same for both configurations, and I'm completely sure my setup is correct.

Regards.
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.35
# Tue Sep 9 16:29:19 2014
#
# CONFIG_PPC64 is not set

#
# Processor support
#
# CONFIG_PPC_BOOK3S_32 is not set
CONFIG_PPC_85xx=y
# CONFIG_PPC_8xx is not set
# CONFIG_40x is not set
# CONFIG_44x is not set
# CONFIG_E200 is not set
CONFIG_E500=y
# CONFIG_PPC_E500MC is not set
CONFIG_FSL_EMB_PERFMON=y
CONFIG_FSL_EMB_PERF_EVENT=y
CONFIG_FSL_EMB_PERF_EVENT_E500=y
CONFIG_BOOKE=y
CONFIG_FSL_BOOKE=y
# CONFIG_PHYS_64BIT is not set
# CONFIG_SPE is not set
CONFIG_PPC_MMU_NOHASH=y
CONFIG_PPC_MMU_NOHASH_32=y
CONFIG_PPC_BOOK3E_MMU=y
# CONFIG_PPC_MM_SLICES is not set
# CONFIG_SMP is not set
CONFIG_PPC32=y
CONFIG_WORD_SIZE=32
# CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set
CONFIG_MMU=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
# CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK is not set
CONFIG_IRQ_PER_CPU=y
CONFIG_NR_IRQS=512
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
Re: GCC 5 snapshots produce broken kernel for powerpc-e500v2-linux-gnuspe?
On 2014.09.09 at 17:35 +0800, Arseny Solokha wrote:
> Hello,
>
> I've recently faced an issue I'm afraid I currently unable to debug. When building an arbitrary version of Linux kernel for powerpc-e500v2-linux-gnuspe target, it seems gcc prior to 5 produces a good image which boots just fine, and current gcc 5 snapshots (4.10.0-alpha20140810 for example) produce an image which hangs just after U-Boot hands over to the kernel. This behavior is well reproducible on real hardware as well as under qemu.
>
> I've prepared a minimal kernel config which is dysfunctional as is but still enough to demonstrate the problem in qemu. I believe the exact Linux version number doesn't actually matter here, but see the attachment for details.
>
> Compare the output produced by u-boot and this minified kernel build using gcc 4.9.1 and 4.10.0-alpha20140810 snapshot.
>
> I now have completely no idea what to do next to find a cause of (1) gcc 5 snapshots producing unbootable kernel,

gcc trunk also miscompiles x86_64 kernels currently, but I haven't looked deeper yet.

The best way to narrow down the issue is to use git (or svn) bisect to find out which gcc revision causes the miscompile. Then you can md5sum the kernel object files for the bad revision and for the first good revision and compare the results. After that you can look at the disassembly of the object files for which the md5sum differs, and try to figure out the reason why.

--
Markus
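The md5sum comparison step suggested above could be scripted roughly as follows. This is only a sketch under assumptions: the two kernel build trees are assumed to sit in separate directories with the same layout, and the helper names `md5_of` and `differing_objects` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def differing_objects(good_dir, bad_dir, suffix=".o"):
    """List object files (relative paths) whose checksum differs between
    a tree built with the good compiler and one built with the bad one.

    Files missing from the bad tree are reported as differing too.
    """
    good_dir, bad_dir = Path(good_dir), Path(bad_dir)
    diffs = []
    for good in sorted(good_dir.rglob(f"*{suffix}")):
        rel = good.relative_to(good_dir)
        bad = bad_dir / rel
        if not bad.is_file() or md5_of(good) != md5_of(bad):
            diffs.append(rel.as_posix())
    return diffs
```

Calling `differing_objects("build-good", "build-bad")` (directory names are placeholders) would give the short list of objects worth disassembling. Note that embedded timestamps or build paths can make otherwise identical objects differ, so some noise is to be expected.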
Re: [PATCH] RE: gcc parallel make check
On 09/09/2014 10:51 AM, VandeVondele Joost wrote:
> Attached is an extended version of the patch, it brings a 100% improvement in make -j32 -k check-gcc

First of all, many thanks for working on this.

> +# ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 dg.exp=gfortran.dg/

How does this work with subdirectories? Can we replace ls with find?

> -check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> +check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 \
> + 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

$(shell seq 1 40) ?

> + if (_assert_exit) exit 1

Haven't you already exited above?

> A second part of the patch is a new file 'contrib/generate_tcl_patterns.sh' which generates the needed regexp

Can we provide a Makefile target to automatically update Makefile.in?

-Y
Re: GCC 5 snapshots produce broken kernel for powerpc-e500v2-linux-gnuspe?
On Sep 9, 2014, at 2:57 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:
> On 2014.09.09 at 17:35 +0800, Arseny Solokha wrote:
>> Hello,
>>
>> I've recently faced an issue I'm afraid I currently unable to debug. When building an arbitrary version of Linux kernel for powerpc-e500v2-linux-gnuspe target, it seems gcc prior to 5 produces a good image which boots just fine, and current gcc 5 snapshots (4.10.0-alpha20140810 for example) produce an image which hangs just after U-Boot hands over to the kernel. This behavior is well reproducible on real hardware as well as under qemu.
>>
>> I've prepared a minimal kernel config which is dysfunctional as is but still enough to demonstrate the problem in qemu. I believe the exact Linux version number doesn't actually matter here, but see the attachment for details.
>>
>> Compare the output produced by u-boot and this minified kernel build using gcc 4.9.1 and 4.10.0-alpha20140810 snapshot.
>>
>> I now have completely no idea what to do next to find a cause of (1) gcc 5 snapshots producing unbootable kernel,
>
> gcc trunk also miscompiles x86_64 kernels currently, but I haven't looked deeper yet.
>
> The best way to narrow down the issue is to use git (or svn) bisect to find out which gcc revision causes the miscompile. Then you can md5sum the kernel object files for the bad revision and for the first good revision and compare the results. After that you can look at the disassembly of the object files, for which md5sum differs, and try to figure out the reason why.

I have a patch which I need to submit. Maybe by Friday I will do that. It fixes the kernel on arm64, but it is a generic C front-end patch.

> --
> Markus
Re: [PATCH] RE: gcc parallel make check
On Tue, Sep 09, 2014 at 02:02:18PM +0400, Yury Gribov wrote:
> On 09/09/2014 10:51 AM, VandeVondele Joost wrote:
>> Attached is an extended version of the patch, it brings a 100% improvement in make -j32 -k check-gcc
>
> First of all, many thanks for working on this.
>
>> +# ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 dg.exp=gfortran.dg/
>
> How does this work with subdirectories? Can we replace ls with find?

Generally, if the argument to *.exp doesn't contain a particular subdirectory, then the wildcard is taken against basenames of the tests.

>> -check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
>> +check_p_numbers=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 \
>> + 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
>
> $(shell seq 1 40) ?

Would that be sufficiently portable to weirdo hosts (M$Win, Darwin, ...)? We require GNU make, so if it can be written in GNU make text functions, fine, otherwise it is better to keep it as is.

>> + if (_assert_exit) exit 1
>
> Haven't you already exited above?
>
>> A second part of the patch is a new file 'contrib/generate_tcl_patterns.sh' which generates the needed regexp
>
> Can we provide a Makefile target to automatically update Makefile.in?

No. As I wrote earlier, splitting on filenames and test counts only is a very rough split; all the splits really need to be backed up by real timing data from popular targets. Also, I'm afraid of some tests being left out unintentionally (e.g. the wildcards are created at some point, then a new test is added with a weird starting character that hasn't been used before, and suddenly it will not be tested with make -j?).

Jakub
Re: [gomp4] openacc kernels directive support
On 18-08-14 14:16, Tom de Vries wrote:
> On 06-08-14 17:10, Tom de Vries wrote:
>> We could insert a pass-group here that only deals with functions that have the kernels directive, and do the auto-par thing in a pass_oacc_kernels (which should share the majority of the infrastructure with the parloops pass):
>> ...
>> NEXT_PASS (pass_build_ealias);
>> INSERT_PASSES_AFTER/WITHIN (passes_oacc_kernels)
>> NEXT_PASS (pass_ch);
>> NEXT_PASS (pass_ccp);
>> NEXT_PASS (pass_lim_aux);
>> NEXT_PASS (pass_oacc_par);
>> POP_INSERT_PASSES ()
>> ...
>>
>> Any comments, ideas or suggestions ?
>
> I've experimented with implementing this on top of gomp-4_0-branch, and I ran into PR46032.
>
> PR46032 is about vectorization failure on a function split off by omp parallelization. The vectorization fails due to aliasing constraints in the split off function, which are not present in the original code.
>
> In the gomp-4_0-branch, the code marked by the openacc kernels directive is split off during omp_expand. The generated code has the same additional aliasing constraints, and in pass_oacc_par the parallelization fails.
>
> The PR46032 contains a tentative patch by Richard Biener, which applies cleanly on top of 4.6 (I haven't yet reached a level of understanding of tree-ssa-structalias.c to be able to resolve the conflict in intra_create_variable_infos when applying on 4.7). The tentative patch involves running ipa-pta, which is also a pass run after the point where we write out the lto stream. I'm not sure whether it makes sense to run the pta-ipa pass as part of the pass_oacc_kernels pass list.
>
> I see three ways of continuing from here:
> - take the tentative patch and make it work, including running pta-ipa during passes_oacc_kernels
> - same, but try somehow to manage without running pta-ipa.
> - try to postpone splitting of the function until the end of pass_oacc_par.
>
> Some advice on how to continue from here would be *highly* appreciated. My hunch atm is to investigate the last option.
Jakub, Richard,

I've investigated the last option, and published the current state in git-only branch vries/oacc-kernels ( https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/vries/oacc-kernels ).

The current state at commit 9255cadc5b6f8f7f4e4506e65a6be7fb3c00cd35 is that:
- a simple loop marked with the oacc kernels directive is analyzed for parallelization,
- the loop is then rewritten using oacc parallel and oacc loop directives
- these oacc directives are expanded using omp_expand_local
- this results in the loop being split off into a separate function, while the loop is replaced with a GOACC_parallel call
- all this is done before writing out the lto stream
- no support yet for reductions, nested loops, more than one loop nest in kernels region

At toplevel, the added pass list looks like this:
...
NEXT_PASS (pass_build_ealias);
/* Pass group that runs when there are oacc kernels in the function. */
NEXT_PASS (pass_oacc_kernels);
PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
NEXT_PASS (pass_ch_oacc_kernels);
NEXT_PASS (pass_tree_loop_init);
NEXT_PASS (pass_lim);
NEXT_PASS (pass_ccp);
NEXT_PASS (pass_parallelize_loops_oacc_kernels);
NEXT_PASS (pass_tree_loop_done);
POP_INSERT_PASSES ()
...

The main question I'm currently facing is the following: when to do lowering (in other words, rewriting of variable access in terms of .omp_data) of the kernels region. There are basically 2 passes that contain code to do this:
- pass_lower_omp (on pre-ssa code)
- pass_parallelize_loops (on ssa code)

Atm I'm using pass_lower_omp, and I've added a patch that handles omp-lowered code conservatively in ccp and forwprop in order for the lowering to remain until arriving at pass_parallelize_loops_oacc_kernels.

But it might turn out to be easier/necessary to handle this in pass_parallelize_loops_oacc_kernels instead.

Any advice on this issue, and on the current implementation is welcome.

Thanks,
- Tom
RE: [PATCH] RE: gcc parallel make check
>> +# ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 dg.exp=gfortran.dg/
>
> How does this work with subdirectories? Can we replace ls with find?

The input to the script is general, and you can use this to your advantage. For example, I've been using:

ls -1 g++.*/* | cut -c5- | ../../../contrib/generate_tcl_patterns.sh 700 old-deja.exp=g++.old-deja/g++.

to split at a deeper level, or

find . -name '[0-9A-Za-z]*' -type f -printf '%f\n' | ../../../../contrib/generate_tcl_patterns.sh 300 dg-torture.exp=torture/

to collect statistics also from subdirs.

>> + if (_assert_exit) exit 1
>
> Haven't you already exited above?

Yes, but the END{} block in awk is nevertheless executed, unless protected as above.
RE: [PATCH] RE: gcc parallel make check
> No. As I wrote earlier, splitting on filenames and test counts only is only very rough split, all the splits really need to be backed out by real timing data from popular targets.

I'm actually doing quite some testing trying to get a reasonable balance, checking 'completed in' in all *.log.sep files. However, it is important that the procedure is semi-automatic, otherwise few people will be interested in doing so.

Furthermore, for parallel performance, it is not so important that times are distributed evenly (it is anyway unlikely the number of goals is exactly divided by N of -jN), but rather that the goals are ordered (executed) from slow to fast (similar to omp schedule guided). Most of the real bottlenecks are single letter patterns (e.g. p* since pr is such a common filename), and this is ultimately limiting.

In the project (CP2K) I'm working on, we also parallelize testing over directories, but we keep a list of approximate runtimes per directory, and keep that (global) list sorted. Testing follows that list. As a result, we have near perfect parallel speedup, despite (or because of) timings per directory ranging from a few 100s to 1s.

> Also, I'm afraid of some tests being left out unintentionally (e.g. the wildcards created at some point, then a new test is added with a weird starting character that hasn't been used before and suddenly it will not be tested with make -j?).

I agree this is an issue, partially addressed by not having to write patterns by hand anymore (i.e. a script does this), and by having the script check its input. There are something like 10 testnames that do not fall in [0-9A-Za-z], as mentioned in a previous email.
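The point about ordering goals from slow to fast can be made concrete with a toy simulation. The sketch below is an assumption-laden model, not how make schedules jobs in detail: each free worker simply takes the next goal in list order, and the durations are invented numbers.

```python
import heapq


def makespan(durations, workers):
    """Wall-clock time when each free worker takes the next job in list order."""
    free_at = [0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + d)
    return max(free_at)


# Invented timings: six short goals and one long one (think dg.exp=p*).
jobs = [1, 1, 1, 1, 1, 1, 8]
fast_first = makespan(sorted(jobs), workers=2)
slow_first = makespan(sorted(jobs, reverse=True), workers=2)
print(fast_first, slow_first)
```

With two workers, starting the long goal last leaves one worker idle while the other finishes it; starting it first lets the short goals fill the gap. In both cases the longest single goal is a lower bound on the total time, which is the "ultimately limiting" effect mentioned above.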
Re: [gomp4] openacc kernels directive support
On Tue, 9 Sep 2014, Tom de Vries wrote:
> On 18-08-14 14:16, Tom de Vries wrote:
>> On 06-08-14 17:10, Tom de Vries wrote:
>>> We could insert a pass-group here that only deals with functions that have the kernels directive, and do the auto-par thing in a pass_oacc_kernels (which should share the majority of the infrastructure with the parloops pass):
>>> ...
>>> NEXT_PASS (pass_build_ealias);
>>> INSERT_PASSES_AFTER/WITHIN (passes_oacc_kernels)
>>> NEXT_PASS (pass_ch);
>>> NEXT_PASS (pass_ccp);
>>> NEXT_PASS (pass_lim_aux);
>>> NEXT_PASS (pass_oacc_par);
>>> POP_INSERT_PASSES ()
>>> ...
>>>
>>> Any comments, ideas or suggestions ?
>>
>> I've experimented with implementing this on top of gomp-4_0-branch, and I ran into PR46032.
>>
>> PR46032 is about vectorization failure on a function split off by omp parallelization. The vectorization fails due to aliasing constraints in the split off function, which are not present in the original code.

Heh. At least the omp-low.c parts from comment #1 should be pushed to trunk...

>> In the gomp-4_0-branch, the code marked by the openacc kernels directive is split off during omp_expand. The generated code has the same additional aliasing constraints, and in pass_oacc_par the parallelization fails.
>>
>> The PR46032 contains a tentative patch by Richard Biener, which applies cleanly on top of 4.6 (I haven't yet reached a level of understanding of tree-ssa-structalias.c to be able to resolve the conflict in intra_create_variable_infos when applying on 4.7). The tentative patch involves running ipa-pta, which is also a pass run after the point where we write out the lto stream. I'm not sure whether it makes sense to run the pta-ipa pass as part of the pass_oacc_kernels pass list.

No, that's not even possible I think.

>> I see three ways of continuing from here:
>> - take the tentative patch and make it work, including running pta-ipa during passes_oacc_kernels
>> - same, but try somehow to manage without running pta-ipa.
>> - try to postpone splitting of the function until the end of pass_oacc_par.
I don't understand the last option? What is the actual issue you run into? You split oacc kernels off and _then_ run autopar on the split-off function (and get additional kernels)?

>> Some advice on how to continue from here would be *highly* appreciated. My hunch atm is to investigate the last option.
>
> Jakub, Richard,
>
> I've investigated the last option, and published the current state in git-only branch vries/oacc-kernels ( https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/vries/oacc-kernels ).
>
> The current state at commit 9255cadc5b6f8f7f4e4506e65a6be7fb3c00cd35 is that:
> - a simple loop marked with the oacc kernels directive is analyzed for parallelization,
> - the loop is then rewritten using oacc parallel and oacc loop directives
> - these oacc directives are expanded using omp_expand_local
> - this results in the loop being split off into a separate function, while the loop is replaced with a GOACC_parallel call
> - all this is done before writing out the lto stream
> - no support yet for reductions, nested loops, more than one loop nest in kernels region
>
> At toplevel, the added pass list looks like this:
> ...
> NEXT_PASS (pass_build_ealias);
> /* Pass group that runs when there are oacc kernels in the function. */

Not sure why pass_oacc_kernels runs before all the other local cleanups? I would have put it after pass_cd_dce at least.

> NEXT_PASS (pass_oacc_kernels);
> PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> NEXT_PASS (pass_ch_oacc_kernels);
> NEXT_PASS (pass_tree_loop_init);
> NEXT_PASS (pass_lim);
> NEXT_PASS (pass_ccp);
> NEXT_PASS (pass_parallelize_loops_oacc_kernels);
> NEXT_PASS (pass_tree_loop_done);
> POP_INSERT_PASSES ()
> ...
>
> The main question I'm currently facing is the following: when to do lowering (in other words, rewriting of variable access in terms of .omp_data) of the kernels region. There are basically 2 passes that contain code to do this:
> - pass_lower_omp (on pre-ssa code)
> - pass_parallelize_loops (on ssa code)

Both use the same utilities.
> Atm I'm using pass_lower_omp, and I've added a patch that handles omp-lowered code conservatively in ccp and forwprop in order for the lowering to remain until arriving at pass_parallelize_loops_oacc_kernels.

You mean omp-_un_-lowered code?

> But it might turn out to be easier/necessary to handle this in pass_parallelize_loops_oacc_kernels instead.

I'd do it similar to how autopar does it (not that autopar is a great example for a GCC pass these days...).

Richard.

> Any advice on this issue, and on the current implementation is welcome.
>
> Thanks,
> - Tom
Re: ASAN test failures make compare_tests useless
Hi!

On Mon, 18 Aug 2014 17:17:53 +0200, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
> On 18 August 2014 16:34, Alexander Potapenko gli...@google.com wrote:
> On Mon, Aug 18, 2014 at 9:43 AM, Yury Gribov y.gri...@samsung.com wrote:
> On 08/18/2014 09:42 AM, Yury Gribov wrote:
> On 08/16/2014 04:37 AM, Manuel López-Ibáñez wrote:
> On the compile farm, ASAN tests seem to fail a lot like:
>
> FAIL: c-c++-common/asan/global-overflow-1.c -O0 output pattern test, is
> ==31166==ERROR: AddressSanitizer failed to allocate 0xdfff0001000 (15392894357504) bytes at address 2008fff7000 (errno: 12)
> ==31166==ReserveShadowMemoryRange failed while trying to map 0xdfff0001000 bytes. Perhaps you're using ulimit -v
> , should match READ of size 1 at 0x[0-9a-f]+ thread T0.*(

I'm also annoyed by this, due to »ulimit -v 4194304« being set.

> The problem is that those addresses and sizes are very random,
>
> The output pattern that must be printed has these addresses masked out (note 0x[0-9a-f]+ in your report). No other lines with varying addresses should be printed.
>
> For the record, I think the fault lies in the GCC testing infrastructure and not in ASAN. It is wrong to print as the test error message the output of ASAN. It should print
>
> FAIL: c-c++-common/asan/global-overflow-1.c -O0 output pattern test, is ERROR
>
> This is enough to see that something failed. For details one can go to the detailed logs. But I didn't add the asan testing infrastructure and I couldn't figure out how to fix this. Any suggestions?

Richard Sandiford has already addressed this in DejaGnu upstream, http://news.gmane.org/find-root.php?message_id=%3C87bo0samke.fsf%40talisman.default%3E, so you now just need to wait for the next DejaGnu release to be made and packaged for your distribution, or you manually patch /usr/share/dejagnu/dg.exp:dg-test, or add a patched dg-test to a suitable gcc/testsuite/lib/*.exp file to override the system one.

Grüße,
Thomas
Re: [PATCH] RE: gcc parallel make check
On Tue, Sep 09, 2014 at 10:57:09AM +, VandeVondele Joost wrote:
>> No. As I wrote earlier, splitting on filenames and test counts only is only very rough split, all the splits really need to be backed out by real timing data from popular targets.
>
> Furthermore, for parallel performance, it is not so important that times are distributed evenly (it is anyway unlikely the number of goals is exactly divided by N of -jN), but rather that the goals are ordered (executed) from slow to fast (similar to omp schedule guided). Most of the real bottlenecks are single letter patterns (e.g. p* since pr is such a common filename), and this is ultimately limiting.

I disagree. If e.g. in gcc.dg/ more than a third of the testcases are pr*.c, then running dg.exp=p* in one job and dg.exp=a* in another one etc. is simply a bad idea; the pr*.c tests should be split more and some other letters just be done together. Even that can be done semi-automatically.

If you get the whitespace right, one can provide multiple different wildcards to a single *.exp file, e.g.

make check-gcc RUNTESTFLAGS="dg.exp='p[0-9A-Za-qs-z]* pr[9A-Za-z]*'"

should cover all tests starting with p other than pr[0-8]*.c (where you could split, say, pr[0-2]* into another job, pr[3-5]* into another and pr[6-8]* into another).

The fact that some check-gcc or check-gfortran test job is early in the list doesn't mean it will be started early; you also need to consider all other potentially long jobs like check-g++, check-target-libgomp, check-target-libstdc++-v3 etc.

Jakub
RE: [PATCH] RE: gcc parallel make check
> If you get whitespace right, one can provide multiple different wildcards to a single *.exp file, e.g.
>
> make check-gcc RUNTESTFLAGS="dg.exp='p[0-9A-Za-qs-z]* pr[9A-Za-z]*'"
>
> should cover all tests starting with p other than pr[0-8]*.c (where you could split say pr[0-2]* into another job, pr[3-5]* into another and pr[6-8]* into another.

I think this confirms that it becomes very delicate to try and write these more complex patterns. The above would miss p_test.c, p-1.c, etc. ?

For other classes of files the difference is even further down the filename (e.g. using dates as in 20020508-3.c going from 2000 to 2014, or avx*), making the automatic generation of the patterns more complicated.

I certainly don't want to claim that the patch I have now is perfect, it is rather an incremental improvement on the current setup.
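The concern about tests silently falling outside hand-written wildcards can be checked mechanically. The sketch below uses Python's `fnmatch` as a stand-in for the glob matching done by the testsuite, which is an assumption (the two agree on `*` and simple bracket ranges, but this is not a claim about DejaGnu internals). The helper name `uncovered` and the file names other than p_test.c and p-1.c are made up for illustration.

```python
from fnmatch import fnmatchcase


def uncovered(names, patterns):
    """Return the names that no pattern matches (case-sensitive globbing)."""
    return [n for n in names if not any(fnmatchcase(n, p) for p in patterns)]


# The two wildcards from the RUNTESTFLAGS example quoted above.
patterns = ["p[0-9A-Za-qs-z]*", "pr[9A-Za-z]*"]
# p_test.c and p-1.c come from the thread; the rest are hypothetical.
names = ["p_test.c", "p-1.c", "pack-1.c", "pr9abc.c", "pragma-1.c"]
missed = uncovered(names, patterns)
print(missed)
```

Running such a check over the real directory listing, with the full set of patterns for one .exp file, would flag any file that no parallel goal picks up.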
Re: [PATCH] RE: gcc parallel make check
On 09/09/2014 06:14 PM, VandeVondele Joost wrote:
> I certainly don't want to claim that the patch I have now is perfect, it is rather an incremental improvement on the current setup.

I'd second this. Writing patterns manually seems rather inefficient and error-prone (not undoable of course but unnecessarily complicated). And with the current (crippled) version Joost already got a 100% test time improvement.

-Y
Re: [PATCH] RE: gcc parallel make check
On Tue, Sep 09, 2014 at 06:27:10PM +0400, Yury Gribov wrote:
> On 09/09/2014 06:14 PM, VandeVondele Joost wrote:
>> I certainly don't want to claim that the patch I have now is perfect, it is rather an incremental improvement on the current setup.
>
> I'd second this. Writing patterns manually seems rather inefficient and error-prone (not undoable of course but unnecessarily complicated). And with current (crippled) version Joost already got 100% test time improvement.

But if there are jobs that take just 1s to complete, then clearly it doesn't make sense to split them off as separate jobs. I think we don't need a 100% even split, but a roughly even one is highly desirable.

Jakub
Re: [PATCH] RE: gcc parallel make check
On 09/09/2014 06:33 PM, Jakub Jelinek wrote:
> On Tue, Sep 09, 2014 at 06:27:10PM +0400, Yury Gribov wrote:
>> On 09/09/2014 06:14 PM, VandeVondele Joost wrote:
>>> I certainly don't want to claim that the patch I have now is perfect, it is rather an incremental improvement on the current setup.
>>
>> I'd second this. Writing patterns manually seems rather inefficient and error-prone (not undoable of course but unnecessarily complicated). And with current (crippled) version Joost already got 100% test time improvement.
>
> But if there are jobs that just take 1s to complete, then clearly it doesn't make sense to split them off as separate job. I think we don't need 100% even split, but at least roughly is highly desirable.

You mean enhancing the script to split across arbitrarily long prefixes? That would be great.

-Y
RE: [PATCH] RE: gcc parallel make check
Now with gzipped figure.. why do these bounce ?

> But if there are jobs that just take 1s to complete, then clearly it doesn't make sense to split them off as separate job. I think we don't need 100% even split, but at least roughly is highly desirable.

Let me add some data, attached is a graph (logscale y) showing the runtime of tests before and after my changes (including a new patch for c++). There is virtually no change for tests running shorter than 50s, only slowly running tests have been split. Now, there are only very few slow tests remaining:

gcc_trunk/obj.new> find . -name "*.log" | xargs grep "completed in" | sort -n -k 5 | tail -n 10
./gcc/testsuite/gcc/gcc.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/gcc/testsuite/gcc.dg/torture/dg-torture.exp completed in 521 seconds
./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp completed in 530 seconds
./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp completed in 553 seconds
./x86_64-unknown-linux-gnu/libgomp/testsuite/libgomp.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libgomp/testsuite/libgomp.fortran/fortran.exp completed in 561 seconds
./gcc/testsuite/gcc/gcc.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/gcc/testsuite/gcc.c-torture/compile/compile.exp completed in 625 seconds
./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp completed in 683 seconds
./gcc/testsuite/g++/g++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/gcc/testsuite/g++.dg/dg.exp completed in 702 seconds
./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp completed in 726 seconds
./gcc/testsuite/gcc/gcc.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp completed in 752 seconds
./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.log:testcase /data/vjoost/gnu/gcc_trunk/gcc/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp completed in 904 seconds

They, of course, limit the ultimate speedup.

timings.png.gz
Description: timings.png.gz
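The remark that the remaining slow tests limit the ultimate speedup can be made concrete with a small sketch. It is not part of the patch: the function name and the array are hypothetical, and it assumes each listed .exp run is an indivisible job. Under that assumption no schedule can finish sooner than the longest single job, nor sooner than the total work shared evenly across the parallel slots.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): lower bound, in seconds, on the
   wall-clock time needed to run n indivisible jobs on `workers` parallel
   slots.  The bound is the larger of the longest single job and the total
   work divided evenly, rounded up.  */
static long makespan_lower_bound(const long *secs, size_t n, long workers)
{
    long longest = 0, total = 0;

    assert(workers > 0);
    for (size_t i = 0; i < n; i++) {
        total += secs[i];
        if (secs[i] > longest)
            longest = secs[i];
    }

    long even_share = (total + workers - 1) / workers;
    return longest > even_share ? longest : even_share;
}

/* The ten runtimes from the log excerpt above, in seconds.  These are only
   the slowest entries, not the whole test run.  */
static const long slowest_exp_secs[] = {
    521, 530, 553, 561, 625, 683, 702, 726, 752, 904
};
```

For these ten entries and 32 slots the bound is the 904 s conformance.exp run, so adding more slots cannot help until that job is split further.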
RE: [PATCH] RE: gcc parallel make check
Attached is a further revision of the patch, now dealing with check-c++. Roughly 50% speedup here at '-j32' (18m vs 12m). For my setup (--enable-languages=c,c++,fortran) I have now improved all targets called in 'make -j32 -k check'. The latter is now 30% faster (15m vs 20m). Note that there are +- 1m fluctuations in these numbers, easily. I currently have no plans to work on other check targets before this patch is committed.

OK for trunk ?

Joost

contrib/ChangeLog

2014-09-09  Joost VandeVondele  vond...@gcc.gnu.org

        * generate_tcl_patterns.sh: New file.

gcc/fortran/ChangeLog

2014-09-09  Joost VandeVondele  vond...@gcc.gnu.org

        * Make-lang.in (check_gfortran_parallelize): Improved parallelism.

gcc/Changelog

2014-09-09  Joost VandeVondele  vond...@gcc.gnu.org

        * Makefile.in (check_gcc_parallelize): Improved parallelism.
        (check_p_numbers): Increase maximum value.
        (dg_target_exps): Mention targets as separate words only.
        (null,space,comma,dg_target_exps_p1,dg_target_exps_p2,
        dg_target_exps_p3,dg_target_exps_p4): New variables.

gcc/cp/ChangeLog

2014-09-09  Joost VandeVondele  vond...@gcc.gnu.org

        * Make-lang.in (check_g++_parallelize): Improved parallelism.

libstdc++-v3/ChangeLog

2014-09-09  Joost VandeVondele  vond...@gcc.gnu.org

        * testsuite/Makefile.am (check_DEJAGNU_normal_targets): Add
        check-DEJAGNUnormal[11-15].
        (check-DEJAGNU): Split into 15 jobs for parallel testing.
        * testsuite/Makefile.in: Regenerated.

Index: libstdc++-v3/testsuite/Makefile.am
===================================================================
--- libstdc++-v3/testsuite/Makefile.am (revision 215017)
+++ libstdc++-v3/testsuite/Makefile.am (working copy)
@@ -101,7 +101,7 @@ new-abi-baseline:
         @test ! -f $*/site.exp || mv $*/site.exp $*/site.bak
         @mv $*/site.exp.tmp $*/site.exp
 
-check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10)
+check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
 $(check_DEJAGNU_normal_targets): check-DEJAGNUnormal%: normal%/site.exp
 
 # Run the testsuite in normal mode.
@@ -111,7 +111,7 @@ check-DEJAGNU $(check_DEJAGNU_normal_tar
         if [ -z "$*$(filter-out --target_board=%, $(RUNTESTFLAGS))" ] \
             && [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
           $(MAKE) $(AM_MAKEFLAGS) $(check_DEJAGNU_normal_targets); \
-          for idx in 0 1 2 3 4 5 6 7 8 9 10; do \
+          for idx in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do \
             mv -f normal$$idx/libstdc++.sum normal$$idx/libstdc++.sum.sep; \
             mv -f normal$$idx/libstdc++.log normal$$idx/libstdc++.log.sep; \
           done; \
@@ -138,25 +138,35 @@ check-DEJAGNU $(check_DEJAGNU_normal_tar
             fi; \
             dirs="`cd $$srcdir; echo [013-9][0-9]_*/*`";; \
           normal1) \
-            dirs="`cd $$srcdir; echo [ab]* de* [ep]*/*`";; \
+            dirs="`cd $$srcdir; echo e*/*`";; \
           normal2) \
-            dirs="`cd $$srcdir; echo 2[01]_*/*`";; \
+            dirs="`cd $$srcdir; echo 28_*/a*`";; \
           normal3) \
-            dirs="`cd $$srcdir; echo 22_*/*`";; \
+            dirs="`cd $$srcdir; echo 23_*/[lu]*`";; \
           normal4) \
-            dirs="`cd $$srcdir; echo 23_*/[a-km-tw-z]*`";; \
+            dirs="`cd $$srcdir; echo 2[459]_*/*`";; \
           normal5) \
-            dirs="`cd $$srcdir; echo 23_*/[luv]*`";; \
+            dirs="`cd $$srcdir; echo 2[01]_*/*`";; \
           normal6) \
-            dirs="`cd $$srcdir; echo 2[459]_*/*`";; \
+            dirs="`cd $$srcdir; echo 23_*/[m-tw-z]*`";; \
           normal7) \
-            dirs="`cd $$srcdir; echo 26_*/* 28_*/[c-z]*`";; \
+            dirs="`cd $$srcdir; echo 26_*/*`";; \
           normal8) \
             dirs="`cd $$srcdir; echo 27_*/*`";; \
           normal9) \
-            dirs="`cd $$srcdir; echo 28_*/[ab]*`";; \
+            dirs="`cd $$srcdir; echo 22_*/*`";; \
           normal10) \
             dirs="`cd $$srcdir; echo t*/*`";; \
+          normal11) \
+            dirs="`cd $$srcdir; echo 28_*/b*`";; \
+          normal12) \
+            dirs="`cd $$srcdir; echo 28_*/[c-z]*`";; \
+          normal13) \
+            dirs="`cd $$srcdir; echo de* p*/*`";; \
+          normal14) \
+            dirs="`cd $$srcdir; echo [ab]* 23_*/v*`";; \
+          normal15) \
+            dirs="`cd $$srcdir; echo 23_*/[a-k]*`";; \
         esac; \
         if [ -n "$*" ]; then cd "$*"; fi; \
         if $(SHELL) -c "$$runtest --version" > /dev/null 2>&1; then \
Index: libstdc++-v3/testsuite/Makefile.in
===================================================================
--- libstdc++-v3/testsuite/Makefile.in (revision 215017)
+++ libstdc++-v3/testsuite/Makefile.in (working copy)
@@ -301,7 +301,7 @@ lists_of_files = \
 extract_symvers = $(glibcxx_builddir)/scripts/extract_symvers
 baseline_subdir := $(shell $(CXX) $(baseline_subdir_switch))
-check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10)
+check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
 
 # Runs the testsuite, but in compile only mode.
 # Can be used to test sources with non-GNU FE's at various warning
@@ -562,7 +562,7 @@ check-DEJAGNU $(check_DEJAGNU_normal_tar
         if [ -z
Possible violation of the gcc GPL license
Dear GNU developers,

I don't know if this is the right place to signal this, but I believe this Android application
https://play.google.com/store/apps/details?id=com.maclab.codepad2
especially with this plugin
https://play.google.com/store/apps/details?id=com.maclab.codepadgcc
violates the GPL license: in fact the description states "This software uses code of GNU Compiler Collection." but there is no link to any place where you can get source code for the application, which seems to be distributed with a non-GPL license.

The application website is in Chinese and I cannot understand Chinese, so I cannot see if source code is distributed there, but anyway, section 6.d of the GPL states that you must

  d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.

And this is clearly not fulfilled, since there are no "clear directions next to the object code saying where to find the Corresponding Source".

You may want to contact the author and force him to comply with the license terms of the GPL.

Thank you for your work
Paolo
Re: GCC 5 snapshots produce broken kernel for powerpc-e500v2-linux-gnuspe?
Markus, Andrew, thanks for your suggestions.

> On Sep 9, 2014, at 2:57 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:
>> On 2014.09.09 at 17:35 +0800, Arseny Solokha wrote:
>>> Hello,
>>>
>>> I've recently faced an issue I'm afraid I currently unable to debug. When building an arbitrary version of Linux kernel for powerpc-e500v2-linux-gnuspe target, it seems gcc prior to 5 produces a good image which boots just fine, and current gcc 5 snapshots (4.10.0-alpha20140810 for example) produce an image which hangs just after U-Boot hands over to the kernel. This behavior is well reproducible on real hardware as well as under qemu.
>>>
>>> I've prepared a minimal kernel config which is dysfunctional as is but still enough to demonstrate the problem in qemu. I believe the exact Linux version number doesn't actually matter here, but see the attachment for details.
>>>
>>> Compare the output produced by u-boot and this minified kernel build using gcc 4.9.1 and 4.10.0-alpha20140810 snapshot.
>>>
>>> I now have completely no idea what to do next to find a cause of (1) gcc 5 snapshots producing unbootable kernel,
>>
>> gcc trunk also miscompiles x86_64 kernels currently, but I haven't looked deeper yet.
>>
>> The best way to narrow down the issue is to use git (or svn) bisect to find out which gcc revision causes the miscompile. Then you can md5sum the kernel object files for the bad revision and for the first good revision and compare the results. After that you can look at the disassembly of the object files, for which md5sum differs, and try to figure out the reason why.

OK, bisect is of course one way to find a culprit revision and then files it miscompiles. However, one difficulty w/ this approach is that SPE ABI support in GCC (and probably other places too) is in semi-bitrot state by now, so there is a preliminary consideration that not every revision could even be buildable. I don't have any better idea, though.

Regarding the case of missing sections in kernel module built w/ stable GCC versions after tinkering w/ CFLAGS, what can be the cause? Is it possible that ld, or as, or kernel linker script is real culprit? Because this started showing up since 4.8, and the only difference in assembler output I can see between 4.7 and 4.10 is use of isel instructions by the latter even though -misel isn't appended to kernel CFLAGS by default.

> I have a patch which I need to submit. Maybe by Friday I will do that. It fixes the kernel on arm64 but it is generic c front-end patch.

So I'll for sure try current trunk next week once again.

Regards.

>> --
>> Markus
Re: Possible violation of the gcc GPL license
On 09/09/14 09:46, Paolo Inaudi wrote:
> Dear GNU developers,
>
> I don't know if this is the right place to signal this, but I believe this Android application
[ ... ]

The right place is license-violat...@gnu.org. The FSF actually owns the copyright on GCC.

You can find further information on how to report a violation here:
http://www.gnu.org/licenses/gpl-violation.html

Thanks,
Jeff
Re: Possible violation of the gcc GPL license
Thank you very much for the pointer, and sorry for my mistake.

Paolo

On 09/09/2014 18:46, Jeff Law wrote:
> On 09/09/14 09:46, Paolo Inaudi wrote:
>> Dear GNU developers,
>>
>> I don't know if this is the right place to signal this, but I believe this Android application
> [ ... ]
>
> The right place is license-violat...@gnu.org. The FSF actually owns the copyright on GCC.
>
> You can find further information on how to report a violation here:
> http://www.gnu.org/licenses/gpl-violation.html
>
> Thanks,
> Jeff
Re: Trouble trying to test GCC on a simulator
On Mon, 8 Sep 2014, Pierre-Marie de Rodat wrote:
>     # Get newlib and the simulator
>     cvs -d :pserver:anon...@sourceware.org:/cvs/src co newlib sim
>     # Get binutils
>     git clone git://sourceware.org/git/binutils-gdb.git
>     # Create the combined tree
>     rm -rf combined
>     mkdir combined
>     cd src
>     find . -print | cpio -pdlm ../combined
>     cd ..
>     cd binutils-gdb
>     find . -print | cpio -pdlmu ../combined
>     cd ..
>     cd gcc
>     find . -print | cpio -pdlmu ../combined
>     cd ..
>     # Same build/test procedure...
>
> It seems to work fine! (I'm running tests, now...) So thank you very much, Tristan. I'm going to submit a website patch to update the documentation according to this.

On the one hand, I promised to do this some time ago and on the other you may be faster than me... I'm using your input above (corrected with Tristan's observation) and will double-check that it works to build at least a couple of ports. Thanks!

Also, I saw some other gotchas. (We require a recent g++, and more recent than 2.95 etc.)

brgds, H-P
Re: Some questions about pass web
> On 09/03/14 02:35, Steven Bosscher wrote:
>> On Wed, Sep 3, 2014 at 9:17 AM, Bin.Cheng wrote:
>>> Last time I tried, there are several passes after loop_done and before auto-inc-dec can't handle auto-increment addressing mode, including fweb.
>>
>> It surprises me that pass_web can't handle AUTOINC. Perhaps I'm off my rocker, but it's always been my understanding that almost all passes handle AUTOINC just fine (or at least conservatively: punt if you see an AUTOINC), and that only CSE really doesn't know about AUTOINC at all.
>
> In the past autoinc instructions didn't appear until flow (just prior to combine) and that was documented behaviour. So anything which was run strictly prior to flow/combine wasn't autoinc aware. That may have changed somewhat with the autoinc rewrite.

It is long time since I looked at web, but it should understand read/write refs and that those must remain unified. how does DF refs look like for autoinc?

Honza

> jeff
Re: Some questions about pass web
It is indeed caused by wrong DF information, which is caused by a wrong fix for bug PR32339. More discussion is at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63156.

thanks a lot
Guozhi Wei

On Tue, Sep 9, 2014 at 5:31 PM, Jan Hubicka hubi...@ucw.cz wrote:
>> On 09/03/14 02:35, Steven Bosscher wrote:
>>> On Wed, Sep 3, 2014 at 9:17 AM, Bin.Cheng wrote:
>>>> Last time I tried, there are several passes after loop_done and before auto-inc-dec can't handle auto-increment addressing mode, including fweb.
>>>
>>> It surprises me that pass_web can't handle AUTOINC. Perhaps I'm off my rocker, but it's always been my understanding that almost all passes handle AUTOINC just fine (or at least conservatively: punt if you see an AUTOINC), and that only CSE really doesn't know about AUTOINC at all.
>>
>> In the past autoinc instructions didn't appear until flow (just prior to combine) and that was documented behaviour. So anything which was run strictly prior to flow/combine wasn't autoinc aware. That may have changed somewhat with the autoinc rewrite.
>
> It is long time since I looked at web, but it should understand read/write refs and that those must remain unified. how does DF refs look like for autoinc?
>
> Honza
>
>> jeff
[Bug rtl-optimization/63210] ira does not select the best register compared with gcc 4.8 for ARM THUMB1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63210

--- Comment #1 from Zhenqiang Chen zhenqiang.chen at arm dot com ---
Here is a workaround patch to show the point.

diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index e2ea359..1573fb5 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -1709,6 +1709,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
             {
               ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
               enum reg_class conflict_aclass;
+              HARD_REG_SET prof_regs;
+              prof_regs = ALLOCNO_COLOR_DATA (conflict_a)->profitable_hard_regs;
 
               /* Reload can give another class so we need to check all
                  allocnos.  */
@@ -1780,7 +1782,7 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
                       hard_regno = ira_class_hard_regs[aclass][j];
                       ira_assert (hard_regno >= 0);
                       k = ira_class_hard_reg_index[conflict_aclass][hard_regno];
-                      if (k < 0)
+                      if (k < 0 || !TEST_HARD_REG_BIT (prof_regs, hard_regno))
                         continue;
                       full_costs[j] -= conflict_costs[k];
                     }

For this case, r0 is not available for r115. The conflict for r110 on r0 may be meaningless.
[Bug target/63209] [ARM] Wrong conditional move generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63209

Mikael Pettersson mikpelinux at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mikpelinux at gmail dot com

--- Comment #1 from Mikael Pettersson mikpelinux at gmail dot com ---
I can reproduce the wrong-code with gcc-4.9 on armv5tel-linux-gnueabi.
[Bug testsuite/63211] New: gcc.target/i386/avx2-*.c tests use broken type-punning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63211

            Bug ID: 63211
           Summary: gcc.target/i386/avx2-*.c tests use broken type-punning
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Target: x86_64-*-*, i?86-*-*

For example gcc.target/i386/avx2-vpblendd128-2.c does

static void
init_pblendd128 (int *src1, int *src2, int seed)
{
  int i, sign = 1;

  for (i = 0; i < 4; i++)
    {
      src1[i] = (i + seed) * (i + seed) * sign;
      src2[i] = (i + seed + 20) * sign;
      sign = -sign;
    }
}
...
static void
avx2_test (void)
{
  union128i_d src1, src2, dst;
  int dst_ref[4];
  int i;

  for (i = 0; i < NUM; i++)
    {
      init_pblendd128 (src1.a, src2.a, i);

      dst.x = _mm_blend_epi32 (src1.x, src2.x, MASK);

which stores to src1.a/src2.a via a pointer and not a direct union access. This makes us optimize away the init_pblendd128 function after it being inlined.

FAIL: gcc.target/i386/avx2-vpblendd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpblendd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpblendw-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastw128-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastw256-2.c execution test
FAIL: gcc.target/i386/avx2-vpermd-2.c execution test
FAIL: gcc.target/i386/avx2-vpermps-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxsd-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxud-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpminsd-2.c execution test
FAIL: gcc.target/i386/avx2-vpminsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpminud-2.c execution test
FAIL: gcc.target/i386/avx2-vpminuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmuldq-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhrsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulld-2.c execution test
FAIL: gcc.target/i386/avx2-vpmullw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmuludq-2.c execution test
FAIL: gcc.target/i386/avx2-vpshufd-2.c execution test
FAIL: gcc.target/i386/avx2-vpshufhw-2.c execution test
FAIL: gcc.target/i386/avx2-vpshuflw-2.c execution test
FAIL: gcc.target/i386/avx2-vpsignd-2.c execution test
FAIL: gcc.target/i386/avx2-vpsignw-2.c execution test
FAIL: gcc.target/i386/avx2-vpslld-2.c execution test
FAIL: gcc.target/i386/avx2-vpsllvd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpsllvd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpsllw-2.c execution test
FAIL: gcc.target/i386/avx2-vpsrad-2.c execution test
FAIL: gcc.target/i386/avx2-vpsravd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpsravd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpsraw-2.c execution test
FAIL: gcc.target/i386/avx2-vpsrld-2.c execution test
FAIL: gcc.target/i386/avx2-vpsrlvd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpsrlvd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpsrlw-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpckhdq-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpckhwd-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpckldq-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpcklwd-2.c execution test

the actual miscompile only happens if you apply the fix for PR40135, the testcases are broken nevertheless according to how type-punning through unions is supposed to work.
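For contrast with the pattern reported above, here is a minimal sketch of stores that go through the union type itself, which is the form GCC documents as permitted type-punning. It is not taken from the testsuite and is not a proposed fix: `union vec128` is a hypothetical stand-in for union128i_d that uses plain integer members instead of vector types, and all function names are made up for illustration. A memcpy-based read is shown as a portable alternative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the testsuite's union128i_d: an int view and a
   wider view of the same 16 bytes.  */
union vec128 {
    int32_t  a[4];
    uint64_t x[2];
};

/* The stores are written as u->a[i], i.e. as member accesses on the union
   object, rather than through a bare int * derived from it.  */
static void init_through_union(union vec128 *u, int seed)
{
    int sign = 1;

    assert(sizeof u->a == sizeof u->x);
    for (int i = 0; i < 4; i++) {
        u->a[i] = (i + seed) * sign;
        sign = -sign;
    }
}

/* Portable alternative to punning: copy the bytes of the first two ints
   into a 64-bit value.  */
static uint64_t low_half_via_memcpy(const int32_t src[4])
{
    uint64_t out;

    memcpy(&out, src, sizeof out);
    return out;
}

/* Returns 1 when reading the other union member sees the same bytes as the
   memcpy-based read.  */
static int views_agree(int seed)
{
    union vec128 u;

    init_through_union(&u, seed);
    return u.x[0] == low_half_via_memcpy(u.a);
}
```

Both reads look at the same eight bytes, so the comparison does not depend on endianness.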
[Bug middle-end/40135] using alias-set zero for union accesses necessary because of RTL alias oracle
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40135

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
For reference I see for example

FAIL: gcc.target/i386/avx2-vpblendd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpblendd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpblendw-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastd128-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastd256-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastw128-2.c execution test
FAIL: gcc.target/i386/avx2-vpbroadcastw256-2.c execution test
FAIL: gcc.target/i386/avx2-vpermd-2.c execution test
FAIL: gcc.target/i386/avx2-vpermps-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxsd-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxud-2.c execution test
FAIL: gcc.target/i386/avx2-vpmaxuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpminsd-2.c execution test
FAIL: gcc.target/i386/avx2-vpminsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpminud-2.c execution test
FAIL: gcc.target/i386/avx2-vpminuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmuldq-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhrsw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhuw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulhw-2.c execution test
FAIL: gcc.target/i386/avx2-vpmulld-2.c execution test
...
FAIL: gcc.target/i386/avx2-vpunpckhdq-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpckhwd-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpckldq-2.c execution test
FAIL: gcc.target/i386/avx2-vpunpcklwd-2.c execution test

but for example gcc.target/i386/avx2-vpblendd128-2.c contains invalid type-punning through unions:

static void
init_pblendd128 (int *src1, int *src2, int seed)
{
  int i, sign = 1;

  for (i = 0; i < 4; i++)
    {
      src1[i] = (i + seed) * (i + seed) * sign;
      src2[i] = (i + seed + 20) * sign;
      sign = -sign;
    }
}
...
static void
avx2_test (void)
{
  union128i_d src1, src2, dst;
  int dst_ref[4];
  int i;

  for (i = 0; i < NUM; i++)
    {
      init_pblendd128 (src1.a, src2.a, i);

      dst.x = _mm_blend_epi32 (src1.x, src2.x, MASK);

which stores into src1/src2 via a pointer access and only reads via a direct access to the union. That's not how the GCC extension specifies union type-punning. I've filed PR63211 for that.

Here the stores do not end up in alias-set zero and with removing the c-common.c hack the loads also end up not using alias-set zero. Without the fix the loads use alias-set zero and thus keep the int-stores live. There is also no easy must-alias to identify here as the stores happen in a loop.
[Bug bootstrap/63212] New: [5 Regression] r214957 breaks bootstrap configured with --enable-checking=release
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63212

            Bug ID: 63212
           Summary: [5 Regression] r214957 breaks bootstrap configured with --enable-checking=release
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: bootstrap
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dominiq at lps dot ens.fr
                CC: iains at gcc dot gnu.org, rguenth at gcc dot gnu.org

Revision r214957 breaks bootstrap configured with --enable-checking=release:

g++ -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -DHAVE_CONFIG_H -I. -I. -I../../p_work/gcc -I../../p_work/gcc/. -I../../p_work/gcc/../include -I./../intl -I../../p_work/gcc/../libcpp/include -I/opt/mp/include -I../../p_work/gcc/../libdecnumber -I../../p_work/gcc/../libdecnumber/dpd -I../libdecnumber -I../../p_work/gcc/../libbacktrace -DCLOOG_INT_GMP -DCLOOG_INT_GMP -I/opt/mp/include -o gtype-desc.o -MT gtype-desc.o -MMD -MP -MF ./.deps/gtype-desc.TPo gtype-desc.c
In file included from ../../p_work/gcc/ggc.h:34:0,
                 from ../../p_work/gcc/hash-table.h:199,
                 from ../../p_work/gcc/hash-set.h:24,
                 from ../../p_work/gcc/tree-core.h:24,
                 from ../../p_work/gcc/tree.h:23,
                 from gtype-desc.c:30:
gtype-desc.c: In function 'void gt_ggc_mx_loop(void*)':
gtype-desc.c:887:40: error: 'struct loop' has no member named 'former_header'
       gt_ggc_m_15basic_block_def ((*x).former_header);
                                        ^
./gtype-desc.h:853:7: note: in definition of macro 'gt_ggc_m_15basic_block_def'
   if (X != NULL) gt_ggc_mx_basic_block_def (X);\
       ^
gtype-desc.c:887:40: error: 'struct loop' has no member named 'former_header'
       gt_ggc_m_15basic_block_def ((*x).former_header);
                                        ^
./gtype-desc.h:853:45: note: in definition of macro 'gt_ggc_m_15basic_block_def'
   if (X != NULL) gt_ggc_mx_basic_block_def (X);\
                                             ^
gtype-desc.c: In function 'void gt_pch_nx_loop(void*)':
gtype-desc.c:4062:40: error: 'struct loop' has no member named 'former_header'
       gt_pch_n_15basic_block_def ((*x).former_header);
                                        ^
./gtype-desc.h:1783:7: note: in definition of macro 'gt_pch_n_15basic_block_def'
   if (X != NULL) gt_pch_nx_basic_block_def (X);\
       ^
gtype-desc.c:4062:40: error: 'struct loop' has no member named 'former_header'
       gt_pch_n_15basic_block_def ((*x).former_header);
                                        ^
./gtype-desc.h:1783:45: note: in definition of macro 'gt_pch_n_15basic_block_def'
   if (X != NULL) gt_pch_nx_basic_block_def (X);\
                                             ^
gtype-desc.c: In function 'void gt_pch_p_4loop(void*, void*, gt_pointer_operator, void*)':
gtype-desc.c:7296:16: error: 'struct loop' has no member named 'former_header'
     op (&((*x).former_header), cookie);
                ^
[Bug bootstrap/63204] gtype-desc.c:887:40: error: 'struct loop' has no member named 'former_header' breaks bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63204

Andreas Schwab sch...@linux-m68k.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dominiq at lps dot ens.fr

--- Comment #4 from Andreas Schwab sch...@linux-m68k.org ---
*** Bug 63212 has been marked as a duplicate of this bug. ***
[Bug bootstrap/63212] [5 Regression] r214957 breaks bootstrap configured with --enable-checking=release
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63212

Andreas Schwab sch...@linux-m68k.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Andreas Schwab sch...@linux-m68k.org ---
Dup.

*** This bug has been marked as a duplicate of bug 63204 ***
[Bug c/63213] -Warray-bounds false positive with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63213

--- Comment #1 from Oliver Stoeneberg oliverst at online dot de ---
Created attachment 33463
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33463&action=edit
preprocessed source
[Bug c/63213] New: -Warray-bounds false positive with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63213

            Bug ID: 63213
           Summary: -Warray-bounds false positive with -O3
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: oliverst at online dot de

Created attachment 33462
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33462&action=edit
source file

The attached code generates the following warnings:

gcc -Wall -O3 -Wno-uninitialized -c model1.c
model1.c: In function 'fill_quad':
model1.c:94:11: warning: array subscript is above array bounds [-Warray-bounds]
     while(p[ps2+1].y == cury)
           ^
model1.c:114:11: warning: array subscript is above array bounds [-Warray-bounds]
     while(p[ps2+1].y == cury)
           ^
model1.c:114:11: warning: array subscript is above array bounds [-Warray-bounds]
model1.c:92:11: warning: array subscript is below array bounds [-Warray-bounds]
     while(p[ps1-1].y == cury)
           ^
model1.c:105:11: warning: array subscript is below array bounds [-Warray-bounds]
     while(p[ps1-1].y == cury)
           ^
model1.c:105:11: warning: array subscript is below array bounds [-Warray-bounds]

The code is quite messy, and reducing it created some uninitialized warnings that didn't exist in the original code, but the array bounds ones stayed the same. It was about ten times the size of this before I reduced it, and I couldn't reduce it any further since removing any line reduces the number of warnings.

Apparently this false positive started with GCC 4.8.1 on some Linux distribution (I assume it was some Fedora version), but I didn't have access to a toolchain that was causing it until now with the MinGW 4.9.1 release.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/mingw/mingw64-w64/bin/../libexec/gcc/x86_64-w64-mingw32/4.9.1/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../../../src/gcc-4.9.1/configure --host=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --prefix=/mingw64 --with-sysroot=/c/mingw491/x86_64-491-win32-seh-rt_v3-rev0/mingw64 --with-gxx-include-dir=/mingw64/x86_64-w64-mingw32/include/c++ --enable-shared --enable-static --disable-multilib --enable-languages=ada,c,c++,fortran,objc,obj-c++,lto --enable-libstdcxx-time=yes --enable-threads=win32 --enable-libgomp --enable-libatomic --enable-lto --enable-graphite --enable-checking=release --enable-fully-dynamic-string --enable-version-specific-runtime-libs --disable-isl-version-check --disable-cloog-version-check --disable-libstdcxx-pch --disable-libstdcxx-debug --enable-bootstrap --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-gnu-as --with-gnu-ld --with-arch=nocona --with-tune=core2 --with-libiconv --with-system-zlib --with-gmp=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --with-mpfr=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --with-mpc=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --with-isl=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --with-cloog=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --enable-cloog-backend=isl --with-pkgversion='x86_64-win32-seh-rev0, Built by MinGW-W64 project' --with-bugurl=http://sourceforge.net/projects/mingw-w64 CFLAGS='-O2 -pipe -I/c/mingw491/x86_64-491-win32-seh-rt_v3-rev0/mingw64/opt/include -I/c/mingw491/prerequisites/x86_64-zlib-static/include -I/c/mingw491/prerequisites/x86_64-w64-mingw32-static/include' CXXFLAGS='-O2 -pipe -I/c/mingw491/x86_64-491-win32-seh-rt_v3-rev0/mingw64/opt/include -I/c/mingw491/prerequisites/x86_64-zlib-static/include -I/c/mingw491/prerequisites/x86_64-w64-mingw32-static/include' CPPFLAGS= LDFLAGS='-pipe -L/c/mingw491/x86_64-491-win32-seh-rt_v3-rev0/mingw64/opt/lib -L/c/mingw491/prerequisites/x86_64-zlib-static/lib -L/c/mingw491/prerequisites/x86_64-w64-mingw32-static/lib'
Thread model: win32
gcc version 4.9.1 (x86_64-win32-seh-rev0, Built by MinGW-W64 project)
[Bug c++/50921] GCC cannot find dependent conversion-function-id even if there's a using declaration for it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50921

Paolo Carlini paolo.carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paolo.carlini at oracle dot com

--- Comment #4 from Paolo Carlini paolo.carlini at oracle dot com ---
Any news?
[Bug middle-end/63213] -Warray-bounds false positive with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63213

Marek Polacek mpolacek at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mpolacek at gcc dot gnu.org
          Component|c                           |middle-end

--- Comment #2 from Marek Polacek mpolacek at gcc dot gnu.org ---
I'm pretty sure this is a dupe, we have many -Warray-bounds bugs.
[Bug target/61749] arm_neon.h _lane and _n intrinsics can cause ICEs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61749

--- Comment #5 from ktkachov at gcc dot gnu.org ---
Author: ktkachov
Date: Tue Sep 9 10:15:46 2014
New Revision: 215046

URL: https://gcc.gnu.org/viewcvs?rev=215046&root=gcc&view=rev
Log:
[AArch64] PR 61749: Do not ICE in lane intrinsics when passed non-constant lane number

    PR target/61749
    * config/aarch64/aarch64-builtins.c (aarch64_types_quadop_qualifiers): Use qualifier_immediate for last operand. Rename to...
    (aarch64_types_ternop_lane_qualifiers): ... This.
    (TYPES_QUADOP): Rename to...
    (TYPES_TERNOP_LANE): ... This.
    (aarch64_simd_expand_args): Return const0_rtx when encountering user error. Change return of 0 to return of NULL_RTX.
    (aarch64_crc32_expand_builtin): Likewise.
    (aarch64_expand_builtin): Return NULL_RTX instead of 0. ICE when expanding unknown builtin.
    * config/aarch64/aarch64-simd-builtins.def (sqdmlal_lane): Use TERNOP_LANE qualifiers.
    (sqdmlsl_lane): Likewise.
    (sqdmlal_laneq): Likewise.
    (sqdmlsl_laneq): Likewise.
    (sqdmlal2_lane): Likewise.
    (sqdmlsl2_lane): Likewise.
    (sqdmlal2_laneq): Likewise.
    (sqdmlsl2_laneq): Likewise.

    * gcc.target/aarch64/vqdml_lane_intrinsics-bad_1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/aarch64/vqdml_lane_intrinsics-bad_1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64-builtins.c
    trunk/gcc/config/aarch64/aarch64-simd-builtins.def
    trunk/gcc/testsuite/ChangeLog
[Bug c++/58678] [4.9 Regression] pykde4-4.11.2 link error (devirtualization too trigger happy)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58678

--- Comment #65 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
Honza, does Jason's patch from comment 62 look good to you?
[Bug c++/62175] [4.9 Regression] Internal compiler error in gimplify_modify_expr gimplify.c:4616
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62175

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug rtl-optimization/61672] [4.9 Regression] Less redundant instructions deleted by pre_delete after r208113.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61672

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 11:42:34 2014
New Revision: 215059

URL: https://gcc.gnu.org/viewcvs?rev=215059&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-08-05 Richard Biener rguent...@suse.de

    PR rtl-optimization/61672
    * emit-rtl.h (mem_attrs_eq_p): Declare.
    * emit-rtl.c (mem_attrs_eq_p): Export. Handle NULL mem-attrs.
    * cse.c (exp_equiv_p): Use mem_attrs_eq_p.
    * cfgcleanup.c (merge_memattrs): Likewise. Include emit-rtl.h.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

    2014-08-14 Richard Biener rguent...@suse.de

    PR rtl-optimization/62079
    * recog.c (peephole2_optimize): If peep2_do_cleanup_cfg run cleanup_cfg.

    * g++.dg/pr62079.C: New testcase.

    2014-08-26 Richard Biener rguent...@suse.de

    PR tree-optimization/62175
    * tree-ssa-loop-niter.c (expand_simple_operations): Do not expand possibly trapping operations.

    * g++.dg/torture/pr62175.C: New testcase.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr62079.C
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/torture/pr62175.C
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cfgcleanup.c
    branches/gcc-4_9-branch/gcc/cse.c
    branches/gcc-4_9-branch/gcc/emit-rtl.c
    branches/gcc-4_9-branch/gcc/recog.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/tree-ssa-loop-niter.c
    branches/gcc-4_9-branch/gcc/tree-vect-slp.c
[Bug tree-optimization/62079] [4.9 Regression] ICE: in calc_dfs_tree, at dominance.c:401 with -fnon-call-exceptions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62079

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 11:42:34 2014
New Revision: 215059

URL: https://gcc.gnu.org/viewcvs?rev=215059&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-08-05 Richard Biener rguent...@suse.de

    PR rtl-optimization/61672
    * emit-rtl.h (mem_attrs_eq_p): Declare.
    * emit-rtl.c (mem_attrs_eq_p): Export. Handle NULL mem-attrs.
    * cse.c (exp_equiv_p): Use mem_attrs_eq_p.
    * cfgcleanup.c (merge_memattrs): Likewise. Include emit-rtl.h.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

    2014-08-14 Richard Biener rguent...@suse.de

    PR rtl-optimization/62079
    * recog.c (peephole2_optimize): If peep2_do_cleanup_cfg run cleanup_cfg.

    * g++.dg/pr62079.C: New testcase.

    2014-08-26 Richard Biener rguent...@suse.de

    PR tree-optimization/62175
    * tree-ssa-loop-niter.c (expand_simple_operations): Do not expand possibly trapping operations.

    * g++.dg/torture/pr62175.C: New testcase.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr62079.C
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/torture/pr62175.C
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cfgcleanup.c
    branches/gcc-4_9-branch/gcc/cse.c
    branches/gcc-4_9-branch/gcc/emit-rtl.c
    branches/gcc-4_9-branch/gcc/recog.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/tree-ssa-loop-niter.c
    branches/gcc-4_9-branch/gcc/tree-vect-slp.c
[Bug tree-optimization/62075] [4.8/4.9 Regression] Vectorizer ICE on dolphin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62075

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 11:42:34 2014
New Revision: 215059

URL: https://gcc.gnu.org/viewcvs?rev=215059&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-08-05 Richard Biener rguent...@suse.de

    PR rtl-optimization/61672
    * emit-rtl.h (mem_attrs_eq_p): Declare.
    * emit-rtl.c (mem_attrs_eq_p): Export. Handle NULL mem-attrs.
    * cse.c (exp_equiv_p): Use mem_attrs_eq_p.
    * cfgcleanup.c (merge_memattrs): Likewise. Include emit-rtl.h.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

    2014-08-14 Richard Biener rguent...@suse.de

    PR rtl-optimization/62079
    * recog.c (peephole2_optimize): If peep2_do_cleanup_cfg run cleanup_cfg.

    * g++.dg/pr62079.C: New testcase.

    2014-08-26 Richard Biener rguent...@suse.de

    PR tree-optimization/62175
    * tree-ssa-loop-niter.c (expand_simple_operations): Do not expand possibly trapping operations.

    * g++.dg/torture/pr62175.C: New testcase.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr62079.C
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/torture/pr62175.C
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cfgcleanup.c
    branches/gcc-4_9-branch/gcc/cse.c
    branches/gcc-4_9-branch/gcc/emit-rtl.c
    branches/gcc-4_9-branch/gcc/recog.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/tree-ssa-loop-niter.c
    branches/gcc-4_9-branch/gcc/tree-vect-slp.c
[Bug tree-optimization/62079] [4.9 Regression] ICE: in calc_dfs_tree, at dominance.c:401 with -fnon-call-exceptions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62079

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
      Known to fail|4.9.2                       |4.9.1

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug c++/62175] [4.9 Regression] Internal compiler error in gimplify_modify_expr gimplify.c:4616
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62175

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 11:42:34 2014
New Revision: 215059

URL: https://gcc.gnu.org/viewcvs?rev=215059&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-08-05 Richard Biener rguent...@suse.de

    PR rtl-optimization/61672
    * emit-rtl.h (mem_attrs_eq_p): Declare.
    * emit-rtl.c (mem_attrs_eq_p): Export. Handle NULL mem-attrs.
    * cse.c (exp_equiv_p): Use mem_attrs_eq_p.
    * cfgcleanup.c (merge_memattrs): Likewise. Include emit-rtl.h.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

    2014-08-14 Richard Biener rguent...@suse.de

    PR rtl-optimization/62079
    * recog.c (peephole2_optimize): If peep2_do_cleanup_cfg run cleanup_cfg.

    * g++.dg/pr62079.C: New testcase.

    2014-08-26 Richard Biener rguent...@suse.de

    PR tree-optimization/62175
    * tree-ssa-loop-niter.c (expand_simple_operations): Do not expand possibly trapping operations.

    * g++.dg/torture/pr62175.C: New testcase.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr62079.C
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/torture/pr62175.C
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cfgcleanup.c
    branches/gcc-4_9-branch/gcc/cse.c
    branches/gcc-4_9-branch/gcc/emit-rtl.c
    branches/gcc-4_9-branch/gcc/recog.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/tree-ssa-loop-niter.c
    branches/gcc-4_9-branch/gcc/tree-vect-slp.c
[Bug rtl-optimization/61672] [4.9 Regression] Less redundant instructions deleted by pre_delete after r208113.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61672

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug c++/61214] [4.9/5 regression] Weird interaction between -fvisibility-inlines-hidden, inline virtuals and devirtualization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61214

--- Comment #9 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Sep 9 11:57:25 2014
New Revision: 215061

URL: https://gcc.gnu.org/viewcvs?rev=215061&root=gcc&view=rev
Log:
    PR c++/61214
    PR c++/62224
gcc/
    * gimple-fold.c (can_refer_decl_in_current_unit_p): Don't allow reference to a DECL_EXTERNAL COMDAT.
gcc/cp/
    * decl2.c (decl_needed_p): Revert virtual functions change.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/ipa/devirt-40.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cp/ChangeLog
    branches/gcc-4_9-branch/gcc/cp/decl2.c
    branches/gcc-4_9-branch/gcc/gimple-fold.c
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/ipa/devirt-39.C
[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

--- Comment #15 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Sep 9 11:57:25 2014
New Revision: 215061

URL: https://gcc.gnu.org/viewcvs?rev=215061&root=gcc&view=rev
Log:
    PR c++/61214
    PR c++/62224
gcc/
    * gimple-fold.c (can_refer_decl_in_current_unit_p): Don't allow reference to a DECL_EXTERNAL COMDAT.
gcc/cp/
    * decl2.c (decl_needed_p): Revert virtual functions change.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/ipa/devirt-40.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/cp/ChangeLog
    branches/gcc-4_9-branch/gcc/cp/decl2.c
    branches/gcc-4_9-branch/gcc/gimple-fold.c
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/ipa/devirt-39.C
[Bug c++/62255] [4.8/4.9 Regression] Introducing an unrelated template parameter causes compilation to fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62255

--- Comment #10 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Sep 9 11:59:45 2014
New Revision: 215062

URL: https://gcc.gnu.org/viewcvs?rev=215062&root=gcc&view=rev
Log:
    PR c++/62255
    * pt.c (instantiate_decl): Handle recursive instantiation of static data member.

Added:
    trunk/gcc/testsuite/g++.dg/template/recurse4.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/pt.c
[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224

Markus Trippelsdorf trippels at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED

--- Comment #16 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
Changing resolution. Fixed.
[Bug tree-optimization/62053] [5 Regression] ICE: in remap_type_1, at tree-inline.c:540
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62053

--- Comment #3 from Alexander Ivchenko aivchenk at gmail dot com ---
Jan, by any chance, have you made any progress on this? Maybe we should revert the patch until there is a proper fix?
[Bug c++/63194] [5 Regression] ICE in maybe_explain_implicit_delete, at cp/method.c:1552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63194

Markus Trippelsdorf trippels at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |5.0
[Bug target/63190] Assembler errors when building md5 code from fbb on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63190

--- Comment #3 from Yvan Roux yroux at gcc dot gnu.org ---
Author: yroux
Date: Tue Sep 9 12:40:52 2014
New Revision: 215069

URL: https://gcc.gnu.org/viewcvs?rev=215069&root=gcc&view=rev
Log:
2014-09-09 Venkataramanan Kumar venkataramanan.ku...@linaro.org

    Backport from trunk r215004.
    2014-09-07 Venkataramanan Kumar venkataramanan.ku...@linaro.org

    PR target/63190
    * config/aarch64/aarch64.md (stack_protect_test_<mode>) Add register constraint for operand0 and remove write only modifier from operand3.

Modified:
    branches/linaro/gcc-4_9-branch/gcc/ChangeLog.linaro
    branches/linaro/gcc-4_9-branch/gcc/config/aarch64/aarch64.md
[Bug c++/63214] New: [5.0 regression] ICE with static __thread value member in template class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63214

            Bug ID: 63214
           Summary: [5.0 regression] ICE with static __thread value member in template class
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kikairoya at gmail dot com

following code causes ICE with gcc HEAD 5.0.0 20140908 (experimental)
(ran in http://melpon.org/wandbox/permlink/vXZsIogHe6QO5pnI )

#include <memory>

template <typename T> struct X {
    static __thread T value_;
};

prog.cc:5:2: internal compiler error: Segmentation fault
 };
  ^
0x9a2d3f crash_signal
    /home/heads/gcc/gcc-source/gcc/toplev.c:339
0x6fdee0 symbol_table::decl_assembler_name_hash(tree_node const*)
    /home/heads/gcc/gcc-source/gcc/symtab.c:69
0x6fe0e8 symbol_table::insert_to_assembler_name_hash(symtab_node*, bool)
    /home/heads/gcc/gcc-source/gcc/symtab.c:181
0x6fe1fc symbol_table::symtab_initialize_asm_name_hash()
    /home/heads/gcc/gcc-source/gcc/symtab.c:263
0x6ff3f4 symbol_table::symtab_initialize_asm_name_hash()
    /home/heads/gcc/gcc-source/gcc/symtab.c:950
0x6ff3f4 symtab_node::get_for_asmname(tree_node const*)
    /home/heads/gcc/gcc-source/gcc/symtab.c:939
0x709088 handle_alias_pairs
    /home/heads/gcc/gcc-source/gcc/cgraphunit.c:
0x70c94c symbol_table::finalize_compilation_unit()
    /home/heads/gcc/gcc-source/gcc/cgraphunit.c:2264
0x5bba2b cp_write_global_declarations()
    /home/heads/gcc/gcc-source/gcc/cp/decl2.c:4666
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
[Bug c/38354] Spurious error: initializer element is not computable at load time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38354

--- Comment #9 from Jonathan Wakely redi at gcc dot gnu.org ---
(In reply to Adam Warner from comment #6)
> Just to make sure I understand this correctly:
>
> 1. You won't confirm this bug because it violates the C standard.

You asked why it works with C++ and I told you, I didn't say anything about confirming it or not, because I'm not a C front-end maintainer so it's not my place to decide what is a bug in the C front-end. (IMHO it's not a bug, but might be a valid enhancement request.)

I stand by my statement that the C standard is more restrictive about what is allowed in a static initializer. C++ has a dynamic initialization phase which does not exist for C.

> 2. GNU provides extensions to C when the C standard is too restrictive. In
> this case the initializer element is *clearly* computable at load time
> because all function pointers are *already* 32 bits under the non-large code
> model.
>
> 3. To demonstrate this, a GNU extension to C++ has no problem computing the
> address of the function pointer at load time and storing it in a 32-bit
> integer array.

It's not a GNU extension, it's required by the C++ standard, and it happens at runtime, so that fails to demonstrate your point.

> 4. Just because C is supposed to be a portable assembler and lower level
> than C++ doesn't mean you should be able to store a 32-bit address in a
> 32-bit integer at load time. Even though GNU C++ can.

Except it can't.
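
The point about C++ dynamic initialization can be illustrated with a minimal sketch. The names `demo_callback`, `demo_table` and `demo_truncated_address` are hypothetical and do not come from the bug report. The initializer below is not a constant expression (it contains a `reinterpret_cast`), so a C++ implementation may evaluate it during dynamic initialization at program start; the corresponding file-scope initializer in C would have to be a constant expression, which is the case the report is about.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical function whose address gets stored (illustration only).
static void demo_callback() {}

// Recompute the truncated address at run time for comparison.
static std::uint32_t demo_truncated_address() {
    return static_cast<std::uint32_t>(
        reinterpret_cast<std::uintptr_t>(&demo_callback));
}

// Not a constant expression: reinterpret_cast is not allowed in one, so the
// implementation is free to fill this array in before main() runs
// (dynamic initialization) instead of emitting a load-time constant.
static std::uint32_t demo_table[] = {
    static_cast<std::uint32_t>(
        reinterpret_cast<std::uintptr_t>(&demo_callback)),
};
```

By the time `main` runs, `demo_table[0]` holds the same truncated value that `demo_truncated_address()` returns; whether that value is still a usable address depends on the code model, which this sketch does not attempt to show.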
[Bug tree-optimization/62075] [4.8 Regression] Vectorizer ICE on dolphin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62075

--- Comment #8 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 13:17:51 2014
New Revision: 215073

URL: https://gcc.gnu.org/viewcvs?rev=215073&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-05-05 Richard Biener rguent...@suse.de

    PR middle-end/61010
    * fold-const.c (fold_binary_loc): Consistently avoid canonicalizing X & CST away from a CST that is the mask of a mode.

    * gcc.dg/torture/pr61010.c: New testcase.

    2014-05-28 Richard Biener rguent...@suse.de

    PR middle-end/61045
    * fold-const.c (fold_comparison): When folding X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 also ensure the sign of the remaining constant operand stays the same.

    * gcc.dg/pr61045.c: New testcase.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/pr61045.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr61010.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/fold-const.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/tree-vect-slp.c
[Bug middle-end/61045] [4.8 Regression] Wrong constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61045

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 13:17:51 2014
New Revision: 215073

URL: https://gcc.gnu.org/viewcvs?rev=215073&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-05-05 Richard Biener rguent...@suse.de

    PR middle-end/61010
    * fold-const.c (fold_binary_loc): Consistently avoid canonicalizing X & CST away from a CST that is the mask of a mode.

    * gcc.dg/torture/pr61010.c: New testcase.

    2014-05-28 Richard Biener rguent...@suse.de

    PR middle-end/61045
    * fold-const.c (fold_comparison): When folding X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 also ensure the sign of the remaining constant operand stays the same.

    * gcc.dg/pr61045.c: New testcase.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/pr61045.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr61010.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/fold-const.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/tree-vect-slp.c
[Bug middle-end/61010] [4.8 Regression] Infinite recursion in fold
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61010

--- Comment #13 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 13:17:51 2014
New Revision: 215073

URL: https://gcc.gnu.org/viewcvs?rev=215073&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-05-05 Richard Biener rguent...@suse.de

    PR middle-end/61010
    * fold-const.c (fold_binary_loc): Consistently avoid canonicalizing X & CST away from a CST that is the mask of a mode.

    * gcc.dg/torture/pr61010.c: New testcase.

    2014-05-28 Richard Biener rguent...@suse.de

    PR middle-end/61045
    * fold-const.c (fold_comparison): When folding X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 also ensure the sign of the remaining constant operand stays the same.

    * gcc.dg/pr61045.c: New testcase.

    2014-08-11 Richard Biener rguent...@suse.de

    PR tree-optimization/62075
    * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Properly handle uses in patterns.

    * gcc.dg/vect/pr62075.c: New testcase.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/pr61045.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr61010.c
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/vect/pr62075.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/fold-const.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/tree-vect-slp.c
[Bug middle-end/61045] [4.8 Regression] Wrong constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61045

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|4.10.0                      |4.8.4, 5.0
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug middle-end/61010] [4.8 Regression] Infinite recursion in fold
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61010

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|4.10.0                      |4.8.4, 5.0
         Resolution|---                         |FIXED
      Known to fail|4.8.2                       |4.8.3

--- Comment #14 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug tree-optimization/62075] [4.8 Regression] Vectorizer ICE on dolphin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62075

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |4.8.4
         Resolution|---                         |FIXED
      Known to fail|                            |4.8.3

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug c++/63214] [5.0 regression] ICE with static __thread value member in template class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63214

Tomohiro Kashiwada kikairoya at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Tomohiro Kashiwada kikairoya at gmail dot com ---
sorry. dupped.

*** This bug has been marked as a duplicate of bug 61558 ***
[Bug middle-end/61558] [5 Regression] ICE: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61558

Tomohiro Kashiwada kikairoya at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kikairoya at gmail dot com

--- Comment #8 from Tomohiro Kashiwada kikairoya at gmail dot com ---
*** Bug 63214 has been marked as a duplicate of this bug. ***
[Bug sanitizer/61897] sanitizer internal compiler error: in build2_stat, at tree.c:4160
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61897

Tom Truscott trt at alumni dot duke.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Tom Truscott trt at alumni dot duke.edu ---
This patch works fine for me, thanks!
[Bug tree-optimization/62012] Loop is not vectorized after function inlining (SCEV)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62012

--- Comment #4 from Yuri Rumyantsev ysrumyan at gmail dot com ---
I checked that our benchmark is now successfully vectorized with function inlining, so this bug can be closed as fixed/resolved.
[Bug tree-optimization/62012] Loop is not vectorized after function inlining (SCEV)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62012

--- Comment #5 from Yuri Rumyantsev ysrumyan at gmail dot com ---
You can close this bug as fixed/resolved (see my comment).

Thanks.
Yuri.

2014-09-08 15:29 GMT+04:00 rguenth at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62012
>
> Richard Biener rguenth at gcc dot gnu.org changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|UNCONFIRMED                 |WAITING
>    Last reconfirmed|                            |2014-09-08
>      Ever confirmed|0                           |1
>
> --- Comment #3 from Richard Biener rguenth at gcc dot gnu.org ---
> So it's independent of LTO. Confirmed. We have
>
>   _28 = MEM[(struct Array *)fa + 256B].a[0] + _3;
>   *_28 = u_23;
>
> which SCEV messes up because it ends up with
>
> (instantiate_scev
>   (instantiate_below = 4)
>   (evolution_loop = 1)
>   (chrec = MEM[(struct Array *)fa + 256B].a[0])
>   (res = MEM[(struct Array *)fa + 256B].a[0]))
> (instantiate_scev
>   (instantiate_below = 4)
>   (evolution_loop = 1)
>   (chrec = {(long unsigned int) first_6(D) * 4, +, 4}_1)
>   (res = {(long unsigned int) first_6(D) * 4, +, 4}_1))
> (set_scalar_evolution
>   instantiated_below = 4
>   (scalar = _13)
>   (scalar_evolution = {MEM[(struct Array *)fa + 256B].a[(sizetype) first_6(D)], +, 4}_1))
> )
> failed: evolution of base is not affine.
>
> Not sure why it thinks that. Btw, on trunk we now vectorize this just fine
> probably because of the fix for PR63148 which avoids moving first_6 * 4
> inside the array-ref and we get
>
>   (scalar_evolution = {MEM[(struct Array *)fa + 256B].a[0] + (sizetype) ((long unsigned int) first_6(D) * 4), +, 4}_1))
> )
> success.
>
> instead. So - can you re-check please?
>
> --
> You are receiving this mail because:
> You reported the bug.
[Bug tree-optimization/61680] [4.8 Regression] vectorization gives wrong answer for sandybridge target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61680

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|4.10.0                      |5.0

--- Comment #12 from Richard Biener rguenth at gcc dot gnu.org ---
For 4.8 we can't backport the patch as group analysis and dependence analysis are still the wrong way around. A simple

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c (revision 215073)
+++ gcc/tree-vect-data-refs.c (working copy)
@@ -2307,6 +2307,17 @@ vect_analyze_group_access (struct data_r
       while (next)
         {
+         /* Check that there is no load-store dependencies for this loads
+            to prevent a case of load-store-load to the same location.  */
+         if (GROUP_READ_WRITE_DEPENDENCE (vinfo_for_stmt (next))
+             || GROUP_READ_WRITE_DEPENDENCE (vinfo_for_stmt (prev)))
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                "READ_WRITE dependence in interleaving.");
+             return false;
+           }
+
           /* Skip same data-refs.  In case that two or more stmts share
              data-ref (supported only for loads), we vectorize only the first
              stmt, and the rest get their vectorized loads from the first
@@ -2323,17 +2334,6 @@ vect_analyze_group_access (struct data_r
               return false;
             }
 
-          /* Check that there is no load-store dependencies for this loads
-             to prevent a case of load-store-load to the same location.  */
-          if (GROUP_READ_WRITE_DEPENDENCE (vinfo_for_stmt (next))
-              || GROUP_READ_WRITE_DEPENDENCE (vinfo_for_stmt (prev)))
-            {
-              if (dump_enabled_p ())
-                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-                                 "READ_WRITE dependence in interleaving.");
-              return false;
-            }
-
           /* For load use the same data-ref load.  */
           GROUP_SAME_DR_STMT (vinfo_for_stmt (next)) = prev;

will regress testcases in vect.exp. Like with two other 4.8 vectorizer miscompiles it's hard to fix them without turning the vectorizer upside down (aka backport most of the re-org from GCC 4.9).
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 62012, which changed state.

Bug 62012 Summary: Loop is not vectorized after function inlining (SCEV)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62012

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED
[Bug tree-optimization/62012] Loop is not vectorized after function inlining (SCEV)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62012

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug tree-optimization/61452] [4.8 Regression] hang at -O1 and -Os on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61452

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|4.10.0                      |4.8.4, 5.0
         Resolution|---                         |FIXED

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.
[Bug tree-optimization/61452] [4.8 Regression] hang at -O1 and -Os on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61452

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Tue Sep 9 14:45:57 2014
New Revision: 215081

URL: https://gcc.gnu.org/viewcvs?rev=215081&root=gcc&view=rev
Log:
2014-09-09 Richard Biener rguent...@suse.de

    Backport from mainline
    2014-06-11 Richard Biener rguent...@suse.de

    PR tree-optimization/61452
    * tree-ssa-sccvn.c (visit_phi): Remove pointless setting of expr and has_constants in case we found a leader.
    (simplify_binary_expression): Always valueize operands first.
    (simplify_unary_expression): Likewise.

    * gcc.dg/torture/pr61452.c: New testcase.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/torture/pr61452.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/tree-ssa-sccvn.c
[Bug ipa/61659] [4.9 Regression] Extra undefined symbol because of devirtualization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61659

--- Comment #33 from John David Anglin danglin at gcc dot gnu.org ---
The issue in comment 32 was introduced in revision 214177.
[Bug c/38354] Spurious error: initializer element is not computable at load time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38354

--- Comment #10 from joseph at codesourcery dot com joseph at codesourcery dot com ---
This seems like an enhancement request to me. Initializers using a 64-bit function address cast to 32-bit are outside the scope of my model of constant expressions for GNU C (http://www.polyomino.org.uk/computer/c/const-exprs-gnu.txt - fairly old, may be out of date).

It so happens that the x86_64 ABI does include a relocation for 32-bit symbol references (in fact two of them, R_X86_64_32 and R_X86_64_32S), but I don't believe GCC has that information about what relocations the target supports at present (and then there would be the question of which of those two relocations to use, though maybe that's a question for the assembler).
[Bug target/63195] [5.0 regression] stage3 build/gengtype miscompiled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63195

--- Comment #3 from Segher Boessenkool segher at gcc dot gnu.org ---
Author: segher
Date: Tue Sep 9 18:49:08 2014
New Revision: 215091

URL: https://gcc.gnu.org/viewcvs?rev=215091&root=gcc&view=rev
Log:
2014-09-09 Segher Boessenkool seg...@kernel.crashing.org

    PR target/63195
    * config/rs6000/rs6000.md (*bool<mode>3): Allow only register operands. Split off the constant operand alternative to ...
    (*bool<mode>3_imm): New.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.md
[Bug web/62211] ./configure --with-float= and ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62211

--- Comment #1 from Richard Earnshaw rearnsha at gcc dot gnu.org ---
-m{soft,hard}-float for arm should be considered deprecated (we try to support them by mapping them onto the -mfloat-abi option), and we deliberately no longer document them.

Rather than clarifying what they do, the references elsewhere should be cleaned up to use -mfloat-abi. Specifically, in this case, the reference to -msoft-float should be changed to -mfloat-abi=soft.
[Bug target/63195] [5.0 regression] stage3 build/gengtype miscompiled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63195

Segher Boessenkool segher at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Segher Boessenkool segher at gcc dot gnu.org ---
Fixed. Thanks for the report!
[Bug target/61407] Build errors on latest OS X 10.10 Yosemite with Xcode 6 on GCC 4.8.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61407 --- Comment #38 from Lawrence Velázquez larryv at macports dot org ---
Created attachment 33464
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33464&action=edit
Corrected dirent patch

Couple of things:

1) The version of Sanitizer included with GCC 4.9.1 at least, and probably earlier, has the includes in the wrong order in sanitizer_platform_limits_posix.cc:

    #include "sanitizer_platform.h"
    #if SANITIZER_LINUX || SANITIZER_MAC

    #include "sanitizer_internal_defs.h"
    #include "sanitizer_platform_limits_posix.h"

    #include <arpa/inet.h>
    #include <dirent.h>

Since sanitizer_platform_limits_posix.h is included before the system headers, _DARWIN_FEATURE_64_BIT_INODE is *not* defined for the second hunk of your patch, and the 32-bit dirent is always used. However, it *is* defined for the first hunk, so CHECK_SIZE_AND_OFFSET does try to access d_seekoff (and predictably fails). The ordering is fixed in Sanitizer upstream, but it's more robust to directly include <sys/cdefs.h> in sanitizer_platform_limits_posix.h to avoid relying on include ordering.

2) You should report this bug upstream. http://compiler-rt.llvm.org
[Bug fortran/55534] -Wno-missing-include-dirs does not work with gfortran
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55534

Manuel López-Ibáñez manu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-09-09
                 CC|                            |manu at gcc dot gnu.org
         Depends on|                            |62226
     Ever confirmed|0                           |1

--- Comment #7 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
The ideal fix for this would be adding a function like:

+bool
+gfc_warning_cmdline (int opt, const char *gmsgid, ...)
+{
+  va_list argp;
+  diagnostic_info diagnostic;
+  bool ret;
+
+  va_start (argp, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+                       DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (argp);
+  return ret;
+}
+

in error.c. Then calling:

Index: gcc/fortran/scanner.c
===
--- gcc/fortran/scanner.c (revision 214251)
+++ gcc/fortran/scanner.c (working copy)
@@ -326,13 +326,13 @@ add_path_to_list (gfc_directorylist **li
       if (errno != ENOENT)
         gfc_warning_now ("Include directory \"%s\": %s", path,
                          xstrerror(errno));
       else
         {
-          /* FIXME: Also support -Wmissing-include-dirs. */
           if (warn)
-            gfc_warning_now ("Nonexistent include directory \"%s\"", path);
+            gfc_warning_cmdline (OPT_Wmissing_include_dirs,
+                                 "Nonexistent include directory \"%s\"", path);
         }
       return;
     }
   else if (!S_ISDIR (st.st_mode))
     {

Then, NOT adding gfc_option.warn_missing_include_dirs, but instead fixing 62226, and simply adding:

Index: gcc/fortran/lang.opt
===
--- gcc/fortran/lang.opt (revision 194167)
+++ gcc/fortran/lang.opt (working copy)
@@ -254,6 +254,10 @@
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wmissing-include-dirs
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter

This automatically will give you:

* Setting cpp_opts, even when using #pragma, -Werror= and complicated combinations.
* Colors!
* Printing [-Wmissing-include-dirs] in the warning message.
[Bug c++/62224] [4.9 Regression] Possible regression in gcc-4.9-20140820
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62224 --- Comment #17 from Chris Clayton chris2553 at googlemail dot com --- I can confirm that with Jason's code changes (referenced in comment 15) to gcc/cp/decl2.c and gcc/gimple-fold.c, the resultant compiler successfully builds qt-creator-3.2.0. Thanks Jason.
[Bug rtl-optimization/63156] web can't handle AUTOINC correctly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63156 --- Comment #9 from Carrot carrot at google dot com ---
The original flag setting code is not correct either. Consider the following pre_modify expression:

(pre_modify (r1)     // def1, use1
    (plus (r1)       // use2
          (r2)))     // use3

GCC will generate 4 df_ref records for this expression as noted, 1 def and 3 uses. The current code only sets DF_REF_READ_WRITE for def1, which causes web to do the wrong renaming. The original flag setting code sets DF_REF_READ_WRITE for every def/use in this expression, which is obviously wrong for r2. I don't know if this is related to bug 32339.
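To make the three flagging policies in this comment concrete, here is a small toy model in C. It is only a sketch: the `toy_*` names are hypothetical and are not GCC's df API, and the middle-ground policy it implements (flag every reference to the modified register, and nothing else) is one possible reading of the comment, not something the comment states as the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for df_ref records; NOT the real GCC data structures. */
enum toy_ref_kind { TOY_DEF, TOY_USE };

struct toy_ref {
  int regno;               /* register named by this reference */
  enum toy_ref_kind kind;  /* definition or use */
  bool read_write;         /* models DF_REF_READ_WRITE */
};

/* Assumed policy: mark a reference read/write only if it names the
   register that the auto-modify expression updates.  */
static void toy_mark_read_write(struct toy_ref *refs, size_t n,
                                int modified_regno)
{
  assert(refs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    refs[i].read_write = (refs[i].regno == modified_regno);
}
```

For the expression above, the four records would be def1(r1), use1(r1), use2(r1) and use3(r2); under this assumed policy the three r1 records get the flag and the r2 record does not.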
[Bug bootstrap/56750] [4.8/4.9/5 Regression] static -lstdc++ logic bleeds into all subdirs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56750 --- Comment #7 from Mike Frysinger vapier at gentoo dot org --- ping ... this is causing releases of binutils' gold to be statically linked, and causing headaches for the gold testsuites.
[Bug target/61407] Build errors on latest OS X 10.10 Yosemite with Xcode 6 on GCC 4.8.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61407 --- Comment #39 from Lawrence Velázquez larryv at macports dot org ---
(In reply to Lawrence Velázquez from comment #38)
> 2) You should report this bug upstream. http://compiler-rt.llvm.org

On second thought, they probably won't accept it because the patch is invalid. sanitizer_mac.cc explicitly defines _DARWIN_USE_64_BIT_INODE, which results in 64-bit dirents on all platforms that AddressSanitizer supports (Snow Leopard and newer). The correct fix is to disable the AddressSanitizer build entirely on platforms that don't support 64-bit inodes (Leopard and older).
[Bug c/38354] Spurious error: initializer element is not computable at load time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38354 --- Comment #11 from Adam Warner adam at consulting dot net.nz ---
Thank you Joseph for clarifying in Comment 10 why this should be considered an enhancement request. It would be non-trivial to change the model of what GNU C considers a constant expression and code relying on an enhanced model of constant expression is likely to be incompatible with other compilers.

I now have a 100% efficient fairly portable workaround in a few lines of bash scripting. Here is an example lookup_table.c:

    #include <stdint.h>
    #include <stdio.h>
    #include "lookup_table.h"

    void fn0(void) {printf("fn0\n");}
    void fn1(void) {printf("fn1\n");}
    void fn2(void) {printf("fn2\n");}
    void fn3(void) {printf("fn3\n");}

    int main(void) {
      for (int i=0; i<4; ++i) ((void (*)(void)) (uint64_t) lookup_table[i])();
      return 0;
    }

To compile/execute the C code run this bash script named lookup_table.sh:

    #!/bin/bash
    if [[ ! -f lookup_table.h ]]; then echo "uint32_t lookup_table[4];" > lookup_table.h; fi
    gcc -std=gnu11 -O3 lookup_table.c -o lookup_table
    mv -f lookup_table.h lookup_table.h.old
    echo "uint32_t lookup_table[] = {" > lookup_table.h
    objdump -d -m i386:x86-64 lookup_table | grep 'fn' | sed 's/^0/0x0/' | sed 's/ .*/,/' >> lookup_table.h
    echo "};" >> lookup_table.h
    diff -su lookup_table.h.old lookup_table.h && ./lookup_table || ./lookup_table.sh

Example output:

    $ ./lookup_table.sh
    --- lookup_table.h.old 2014-09-10 13:35:03.162644646 +1200
    +++ lookup_table.h 2014-09-10 13:35:03.222648312 +1200
    @@ -1 +1,6 @@
    -uint32_t lookup_table[4];
    +uint32_t lookup_table[] = {
    +0x00400530,
    +0x00400540,
    +0x00400550,
    +0x00400560,
    +};
    Files lookup_table.h.old and lookup_table.h are identical
    fn0
    fn1
    fn2
    fn3

The lookup table functions must share a unique prefix. If the new lookup_table.h matches the old lookup_table.h then the binary is internally consistent and ready for execution.

There is no additional overhead in the final binary:

    00400410 <main>:
      400410: 53                    push   rbx
      400411: bb e0 09 60 00        mov    ebx,0x6009e0
      400416: 8b 03                 mov    eax,DWORD PTR [rbx]
      400418: 48 83 c3 04           add    rbx,0x4
      40041c: ff d0                 call   rax
      40041e: 48 81 fb f0 09 60 00  cmp    rbx,0x6009f0
      400425: 75 ef                 jne    400416 <main+0x6>
      400427: 31 c0                 xor    eax,eax
      400429: 5b                    pop    rbx
      40042a: c3                    ret

In this example the lookup table has been mapped into memory at address 0x6009e0. There is a 32-bit load for each function call. No code is required to populate the lookup table at run time. No internal run time checks are required to ensure the lookup table and function addresses match. This is a perfectly efficient workaround that is superior to the C++ approach.
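For comparison only, here is a sketch of a different, run-time variant of the same idea in C: it stores 32-bit offsets relative to one base function instead of absolute 32-bit addresses, so it should also work when the code is loaded above 4 GiB (for example in a position-independent executable). This is not the approach described above; it pays a small start-up cost that the objdump-generated table avoids. All names (`handler_fn`, `build_offset_table`, `call_entry`) are made up for illustration, and converting function pointers to integers is implementation-defined, although it behaves as expected on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*handler_fn)(void);

static int fn0(void) { return 10; }
static int fn1(void) { return 11; }
static int fn2(void) { return 12; }
static int fn3(void) { return 13; }

/* Full-width pointers, used only once to build the compact table. */
static handler_fn const handlers[4] = { fn0, fn1, fn2, fn3 };

/* Compact table: signed 32-bit distance of each function from fn0. */
static int32_t offset_table[4];

static void build_offset_table(void)
{
  uintptr_t base = (uintptr_t)fn0;
  for (int i = 0; i < 4; i++) {
    intptr_t delta = (intptr_t)((uintptr_t)handlers[i] - base);
    /* The scheme only works if every function lies within +/-2 GiB of fn0. */
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    offset_table[i] = (int32_t)delta;
  }
}

static int call_entry(int i)
{
  assert(i >= 0 && i < 4);
  /* Rebuild the full address from the base plus the stored offset. */
  uintptr_t addr = (uintptr_t)fn0 + (uintptr_t)(intptr_t)offset_table[i];
  return ((handler_fn)addr)();
}
```

Usage would be to call build_offset_table() once at start-up and then dispatch through call_entry(i). The trade-off against the build-time table is exactly the run-time population step and the extra add per call.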
[Bug libgcc/56846] _Unwind_Backtrace on ARM and noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846 --- Comment #8 from thopre01 at gcc dot gnu.org --- Author: thopre01 Date: Wed Sep 10 04:45:32 2014 New Revision: 215101 URL: https://gcc.gnu.org/viewcvs?rev=215101&root=gcc&view=rev Log: 2014-09-10 Tony Wang tony.w...@arm.com libstdc++-v3/ PR target/56846 * libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): Return with CONTINUE_UNWINDING when the state pattern contains: _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/libsupc++/eh_personality.cc
[Bug debug/60655] [4.9 Regression] ICE: output_operand: invalid expression as operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60655 --- Comment #19 from Alan Modra amodra at gcc dot gnu.org --- Author: amodra Date: Wed Sep 10 05:02:02 2014 New Revision: 215102 URL: https://gcc.gnu.org/viewcvs?rev=215102&root=gcc&view=rev Log: PR debug/60655 * dwarf2out.c (mem_loc_descriptor <PLUS>): Return NULL if addend can't be output. Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/dwarf2out.c
[Bug debug/60655] [4.9 Regression] ICE: output_operand: invalid expression as operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60655 --- Comment #20 from Alan Modra amodra at gcc dot gnu.org --- Author: amodra Date: Wed Sep 10 05:02:28 2014 New Revision: 215103 URL: https://gcc.gnu.org/viewcvs?rev=215103&root=gcc&view=rev Log: PR debug/60655 * dwarf2out.c (mem_loc_descriptor <PLUS>): Return NULL if addend can't be output. Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/dwarf2out.c
[Bug lto/63215] New: LTO causes symbols for builtin functions to be omitted from object files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63215

Bug ID: 63215
Summary: LTO causes symbols for builtin functions to be omitted from object files
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: patrick at motec dot com.au

-flto causes symbols for builtin functions to be omitted from object files. Specifying -fno-builtin generates the symbols again. This causes problems when these files are combined into an archive and used for later linking as the linker can't resolve the symbols.

For example:

abs.c:
int abs(int j) { return j < 0 ? -j : j; }

Normal compile:
$ powerpc-eabispe-gcc -c abs.c -o abs.o
$ powerpc-eabispe-gcc-nm abs.o
T abs

Compile with lto:
$ powerpc-eabispe-gcc -flto -c abs.c -o abs_lto.o
$ powerpc-eabispe-gcc-nm abs_lto.o
/home/patrick/toolchain/lib/gcc/powerpc-eabispe/4.9.1/../../../../powerpc-eabispe/bin/nm: abs_lto.o: no symbols

Compile with lto and no-builtin:
$ powerpc-eabispe-gcc -fno-builtin -flto -c abs.c -o abs_lto_nobuiltin.o
$ powerpc-eabispe-gcc-nm abs_lto_nobuiltin.o
T abs

Here is the output of gcc -v:
$ powerpc-eabispe-gcc -v
Using built-in specs.
COLLECT_GCC=powerpc-eabispe-gcc
COLLECT_LTO_WRAPPER=/home/patrick/src/e7/toolchain/stage2/libexec/gcc/powerpc-eabispe/4.9.1/lto-wrapper
Target: powerpc-eabispe
Configured with: /home/patrick/src/e7/toolchain/src/gcc-4.9.1/configure --prefix=/home/patrick/src/e7/toolchain/stage2 --build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu --target=powerpc-eabispe --enable-languages=c,c++ --with-sysroot=/home/patrick/src/e7/toolchain/../prex_sysroot --disable-nls --disable-werror --with-newlib --with-gmp=/home/patrick/src/e7/toolchain/stage2 --with-mpfr=/home/patrick/src/e7/toolchain/stage2 --disable-shared --disable-debug --disable-libssp --with-cpu=8540
Thread model: single
gcc version 4.9.1 (GCC)
RE: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Friday, September 05, 2014 12:45 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap On 09/01/14 02:13, Zhenqiang Chen wrote: To split live-range of register, split_live_ranges_for_shrink_wrap will introduce additional register copies. If such copies can not be optimized by later optimizations, it will lead to code size and performance regression. My tests on ARM THUMB1 code size show lots of regressions due to additional register copies. Shrink-wrap is not enabled for ARM THUMB1, so I think split_live_ranges_for_shrink_wrap should not be called. So has anyone looked at why IRA ends up selecting different registers for the source/dest of these copies? Odds are it's just an artifact of the heuristics in use, but I'd like to make sure there isn't something inherently wrong happening in IRA that's causing it to not tie the source/dest of those copies. ChangeLog: 2014-09-01 Zhenqiang Chen zhenqiang.c...@arm.com * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED. * ira.c: #include shrink-wrap.h (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED. * ifcvt.c: #include shrink-wrap.h (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED. testsuite/ChangeLog: 2014-09-01 Zhenqiang Chen zhenqiang.c...@arm.com * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test. Thanks. OK for the trunk. Thanks. The patch is installed @r215041. As noted above, it'd may be worth spending a little time looking at the regressions without this patch installed to see why IRA isn't doing a good job of tying the source/dest of these copies together -- perhaps there's something that's been overlooked and fixing it may be beneficial. I had investigated it. Compared with 4.8, the allocation order and conflict cost might be the root cause. A bug is submitted: PR63210. Thanks! -Zhenqiang jeff
Re: [PATCH, Fortran] Wrong invocation of caf_atomic_op
Alessandro Fanfarillo wrote: This email follows the previous without subject (sorry about that). I think I'd prefer the following patch, which avoids a temporary if none is required. value is a pointer if the kind is the same (see kind check before) and if it is not a literal. Otherwise, it isn't a pointer and one needs to generate a temporary. I do not quite understand why the current check doesn't work as both are integer(kind=4) but for some reasons one has a variant. Additionally, I wonder whether one should add a test case – one probably should do – and of which kind (run test + fdump-tree-original?). Tobias --- a/gcc/fortran/trans-intrinsic.c +++ b/gcc/fortran/trans-intrinsic.c @@ -8398,3 +8398,3 @@ conv_intrinsic_atomic_op (gfc_code *code) - if (TREE_TYPE (TREE_TYPE (atom)) != TREE_TYPE (TREE_TYPE (value))) + if (!POINTER_TYPE_P (TREE_TYPE (value))) { The attached patch solves the problem raised by the following code: program atomic use iso_fortran_env implicit none integer :: me integer(atomic_int_kind) :: atom[*] me = this_image() call atomic_define(atom[1],0) sync all call ATOMIC_ADD (atom[1], me) if(me == 1) call atomic_ref(me,atom[1]) sync all write(*,*) me end program Ok for trunk?
RE: [PATCH] RE: gcc parallel make check
Attached is an extended version of the patch. It roughly halves the time for make -j32 -k check-gcc (down from 20min to 10min) by modifying check_gcc_parallelize. It includes one non-trivial part, namely a split of the target exps. They are now all split using a common choice (based on i386), which I believe is reasonable as it is the target with the most tests, and the patterns will be somewhat similar for other targets (e.g. split of p(rxxx)). The implementation of this in the makefile uses an odd-looking technique to substitute spaces with commas in a variable; if this can be done more elegantly, I'm happy to make the change.

Bootstrap and testing revealed one issue: i386.exp hard-codes a loop for the testcase 'vect-args.c' in order to test 10 different combinations of options. With the current split (i.e. target x4) this test will thus be executed 4 times. There are two easy options: 1) keep the current setup, overhead is small; 2) keep the .exp file simple and just replicate this test 10x. I've selected 1), but I can update the patch with 2). Ideally dg-options in the testcase file itself could be repeated, but I haven't found an example of this.

The script now includes sorting and compression of the ranges, and an additional sanity check on the input, i.e. that file names start with [0-9A-Za-z]. Some (few) files seem to start with _ or # (in ./gcc.dg/cpp/).

I'll follow up with a separate patch to improve check_g++_parallelize. Full 'make -j32 -k check' is now dominated by libstdc++ testing, which contains single goals that run ~1100s (e.g. regex related tests). These use a slightly different syntax (see gcc/libstdc++-v3/testsuite/Makefile.am) and I'm not yet sure how to deal with the .am files.

Is the current patch OK for trunk?
Joost patch-speedup-checkfortran-v05.CL Description: patch-speedup-checkfortran-v05.CL Index: contrib/generate_tcl_patterns.sh === --- contrib/generate_tcl_patterns.sh (revision 0) +++ contrib/generate_tcl_patterns.sh (revision 0) @@ -0,0 +1,114 @@ +#! /bin/sh + +# +# based on a list of filenames as input, starting with [0-9A-Za-z], +# generate regexps that match subsets trying to not exceed a +# 'maxcount' parameter. Most useful to generate the +# check_LANG_parallelize assignments needed to split +# testsuite directories, defining prefix appropriately. +# +# Example usage: +# cd gcc/gcc/testsuite/gfortran.dg +# ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 dg.exp=gfortran.dg/ +# +# the first parameter is the maximum number of files. +# the second parameter the prefix used for printing. +# + +# Copyright (C) 2014 Free Software Foundation +# Contributed by Joost VandeVondele joost.vandevond...@mat.ethz.ch +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING. If not, write to +# the Free Software Foundation, 51 Franklin Street, Fifth Floor, +# Boston, MA 02110-1301, USA. + +gawk -v maxcount=$1 -v prefix=$2 ' +BEGIN{ + # list of allowed starting chars for a file name in a dir to split + achars=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz + ranget=112233 +} +{ + if (index(achars,substr($1,1,1))==0){ + print file : $1 does not start with an allowed character. 
+ _assert_exit = 1 + exit 1 + } + nfiles++ ; files[nfiles]=$1 +} +END{ + if (_assert_exit) exit 1 + for(i=1; i=length(achars); i++) count[substr(achars,i,1)]=0 + for(i=1; i=nfiles; i++) { + if (length(files[i]0)) { count[substr(files[i],1,1)]++ } + }; + asort(count,ordered) + countsingle=0 + groups=0 + label= + for(i=length(achars);i=1;i--) { +countsingle=countsingle+ordered[i] +for(j=1;j=length(achars);j++) { + if(count[substr(achars,j,1)]==ordered[i]) found=substr(achars,j,1) +} +count[found]=-1 +label=label found +if(i==1) { val=maxcount+1 } else { val=ordered[i-1] } +if(countsingle+valmaxcount) { + subset[label]=countsingle + print Adding label: , label, matching files: countsingle + groups++ + countsingle=0 + label= +} + } + print patterns: + asort(subset,ordered) + for(i=groups;i=1;i--) { +for(j in subset){ + if(subset[j]==ordered[i]) found=j +} +subset[found]=-1 +if (length(found)==1) { + printf(%s%s* \\\n,prefix,found) +} else { + sortandcompress() +
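As far as the (partly garbled and truncated) awk script above can be read, the core of it is a greedy grouping: count files per leading character, walk those counts from largest to smallest, and close a group as soon as adding the next count would exceed maxcount. The C sketch below illustrates only that grouping step under this reading; it is not a translation of the script, it tracks group sizes rather than the character labels, and the names `cmp_desc` and `group_counts` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: larger counts first. */
static int cmp_desc(const void *a, const void *b)
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x < y) - (x > y);
}

/* Greedy grouping sketch.  COUNTS holds the number of files per leading
   character and is sorted in place, largest first.  GROUP_SIZES must
   have room for N entries.  Returns the number of groups formed.  */
static size_t group_counts(int *counts, size_t n, int maxcount,
                           int *group_sizes)
{
  assert(counts != NULL && group_sizes != NULL && maxcount > 0);
  qsort(counts, n, sizeof counts[0], cmp_desc);

  size_t groups = 0;
  int running = 0;
  for (size_t i = 0; i < n; i++) {
    running += counts[i];
    /* Look ahead: after the last bucket, force the group to close. */
    int next = (i + 1 < n) ? counts[i + 1] : maxcount + 1;
    if (running + next > maxcount) {
      group_sizes[groups++] = running;
      running = 0;
    }
  }
  return groups;
}
```

Note that a single leading character with more than maxcount files still ends up in a group of its own that exceeds the limit, which matches the "trying to not exceed" wording in the script header.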
[gomp4] Merge trunk r214918 (2014-09-04) into gomp-4_0-branch
Hi! In r215042, I have committed a merge from trunk r214918 (2014-09-04) into gomp-4_0-branch. Regards, Thomas
Commit:
Hi Guys,

I am applying the patch below as an obvious fix. It adds a missing @gol to the end of one of the option list lines and it removes a superfluous second "functions" from the description of the -mhotpatch option.

Cheers
Nick

gcc/ChangeLog
2014-09-09 Nick Clifton ni...@redhat.com

* doc/invoke.texi (Optimization Options): Add missing @gol to the end of a line.
(S/390 and zSeries Options): Remove superfluous word from the description of the -mhotpatch option.

Index: invoke.texi
===
--- invoke.texi (revision 215043)
+++ invoke.texi (working copy)
@@ -386,7 +386,7 @@
 -fira-region=@var{region} -fira-hoist-pressure @gol
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
--fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute
+-fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts -flive-range-shrinkage @gol
 -floop-block -floop-interchange -floop-strip-mine -floop-nest-optimize @gol
 -floop-parallelize-all -flto -flto-compression-level @gol
@@ -20634,7 +20635,7 @@
 Nop instructions (@var{halfwords}, maximum 100) or 12 Nop
 instructions if no argument is present. Functions with a
 hot-patching prologue are never inlined automatically, and a
-hot-patching prologue is never generated for functions functions
+hot-patching prologue is never generated for functions
 that are explicitly inline.
 
 This option can be overridden for individual functions with the
Re: [Patch ARM] Fix PR target/56846
On Mon, Aug 25, 2014 at 11:32 AM, Tony Wang tony.w...@arm.com wrote:
> Hi all,
>
> The bug is reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about the problem that when exception handler is involved in the function, then _Unwind_Backtrace function will run into deadloop on arm target.

You mean an infinite loop.

> Cmd line: arm-none-eabi-g++ -mthumb -mcpu=cortex-m3 -O0 -g -std=c++11 -specs=rdimon.specs main.c -o main.exe
>
> #include <unwind.h>
> #include <stdio.h>
>
> _Unwind_Reason_Code trace_func(struct _Unwind_Context * context, void* arg)
> {
>   void *ip = (void *)_Unwind_GetIP(context);
>   printf("Address: %p\n", ip);
>   return _URC_NO_REASON;
> }
>
> void bar()
> {
>   puts("This is in bar");
>   _Unwind_Backtrace((_Unwind_Trace_Fn)trace_func, 0);
> }
>
> void foo()
> {
>   try { bar(); }
>   catch (...) { puts("Exception"); }
> }
>
> The potential of such a bug is discussed long time ago in mail: https://gcc.gnu.org/ml/gcc/2007-08/msg00235.html. Basically, as the ARM EHABI does not define how to implement the Unwind_Backtrace, Andrew give control to the personality routine to unwind the stack, and use the unwind state combination of “_US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND” to represent that the caller is asking the personality routine to only unwind the stack for it.
>
> However, the pr in the libstdc++-v3 doesn’t handle such a unwind state pattern correctly. When the backtrace function passes such a pattern to it, it will still return _URC_HANDLER_FOUND to the caller in some cases. It’s because the pr will think that the _Unwind_Backtrace is raising a none type exception to it, so if the exception handler in current stack frame can catch anything(like catch(…)), the pr will return _URC_HANDLER_FOUND to the caller and ask for next step. But definitely, the unwind backtrace function don’t know what to do when pr return an exception handler to it.
>
> So this patch just evaluate such a unwind state pattern at the beginning of the personality routine in libstdc++-v3, if we meet with “_US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND”, then we directly call macro CONTINUE_UNWINDING to unwind the stack and return.
>
> Is this a reasonable fix?

I'd like another review here however it looks sane to me. You need to CC libstd...@gcc.gnu.org for libstdc++ patches. Your email doesn't say how you tested this patch. Can you make sure you've run this through a bootstrap and regression test on GNU/Linux and a cross regression test on arm-none-eabi with no regressions ?

regards
Ramana

> gcc/libstdc++-v3/ChangeLog:
> 2014-8-25 Tony Wang tony.w...@arm.com
>
> PR target/56846
> * libsupc++/eh_personality.cc: Return with CONTINUE_UNWINDING when meet with the unwind state pattern: _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND
>
> diff --git a/libstdc++-v3/libsupc++/eh_personality.cc b/libstdc++-v3/libsupc++/eh_personality.cc
> index f315a83..c2b30e9 100644
> --- a/libstdc++-v3/libsupc++/eh_personality.cc
> +++ b/libstdc++-v3/libsupc++/eh_personality.cc
> @@ -378,6 +378,11 @@ PERSONALITY_FUNCTION (int version,
>    switch (state & _US_ACTION_MASK)
>      {
>      case _US_VIRTUAL_UNWIND_FRAME:
> +      // If the unwind state pattern is _US_VIRTUAL_UNWIND_FRAME |
> +      // _US_FORCE_UNWIND, we don't need to search for any handler
> +      // as it is not a real exception. Just unwind the stack.
> +      if (state & _US_FORCE_UNWIND)
> +        CONTINUE_UNWINDING;
>        actions = _UA_SEARCH_PHASE;
>        break;
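The decision the patch adds can be modelled in isolation. The C sketch below is a toy model, not the libstdc++ personality routine: the `TOY_*` constants are local stand-ins whose values are assumed to mirror the ARM EHABI `_Unwind_State` encoding (action in the low two bits, force-unwind as a separate flag bit), and `toy_classify` is a hypothetical helper that only reports which branch would be taken.

```c
#include <assert.h>

/* Local stand-ins; the values are assumptions modelled on the ARM EHABI
   _Unwind_State encoding and are not taken from the patch itself.  */
enum {
  TOY_US_VIRTUAL_UNWIND_FRAME  = 0,
  TOY_US_UNWIND_FRAME_STARTING = 1,
  TOY_US_UNWIND_FRAME_RESUME   = 2,
  TOY_US_ACTION_MASK           = 3,
  TOY_US_FORCE_UNWIND          = 8
};

enum toy_decision {
  TOY_CONTINUE_UNWINDING,  /* backtrace request: just unwind one frame */
  TOY_SEARCH_FOR_HANDLER,  /* genuine phase 1 search for a handler */
  TOY_OTHER                /* any other action; not modelled here */
};

static enum toy_decision toy_classify(int state)
{
  assert(state >= 0);
  if ((state & TOY_US_ACTION_MASK) == TOY_US_VIRTUAL_UNWIND_FRAME) {
    /* VIRTUAL_UNWIND_FRAME combined with FORCE_UNWIND is the pattern
       _Unwind_Backtrace uses, so no handler search should happen.  */
    if (state & TOY_US_FORCE_UNWIND)
      return TOY_CONTINUE_UNWINDING;
    return TOY_SEARCH_FOR_HANDLER;
  }
  return TOY_OTHER;
}
```

The point of the model is the masking: the action has to be extracted with the mask before comparing, and the force-unwind bit is tested separately, which is what the two `&` operations in the patch do.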
RE: [Patch ARM] Fix PR target/56846
-----Original Message-----
From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
Sent: Tuesday, September 09, 2014 4:33 PM
To: Tony Wang
Cc: gcc-patches; d...@debian.org; aph-...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan; libstd...@gcc.gnu.org
Subject: Re: [Patch ARM] Fix PR target/56846

> On Mon, Aug 25, 2014 at 11:32 AM, Tony Wang tony.w...@arm.com wrote:
> > Hi all,
> >
> > The bug is reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about the problem that when exception handler is involved in the function, then _Unwind_Backtrace function will run into deadloop on arm target.
>
> You mean an infinite loop.
>
> > Cmd line: arm-none-eabi-g++ -mthumb -mcpu=cortex-m3 -O0 -g -std=c++11 -specs=rdimon.specs main.c -o main.exe
> >
> > #include <unwind.h>
> > #include <stdio.h>
> >
> > _Unwind_Reason_Code trace_func(struct _Unwind_Context * context, void* arg)
> > {
> >   void *ip = (void *)_Unwind_GetIP(context);
> >   printf("Address: %p\n", ip);
> >   return _URC_NO_REASON;
> > }
> >
> > void bar()
> > {
> >   puts("This is in bar");
> >   _Unwind_Backtrace((_Unwind_Trace_Fn)trace_func, 0);
> > }
> >
> > void foo()
> > {
> >   try { bar(); }
> >   catch (...) { puts("Exception"); }
> > }
> >
> > The potential of such a bug is discussed long time ago in mail: https://gcc.gnu.org/ml/gcc/2007-08/msg00235.html. Basically, as the ARM EHABI does not define how to implement the Unwind_Backtrace, Andrew give control to the personality routine to unwind the stack, and use the unwind state combination of “_US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND” to represent that the caller is asking the personality routine to only unwind the stack for it.
> >
> > However, the pr in the libstdc++-v3 doesn’t handle such a unwind state pattern correctly. When the backtrace function passes such a pattern to it, it will still return _URC_HANDLER_FOUND to the caller in some cases. It’s because the pr will think that the _Unwind_Backtrace is raising a none type exception to it, so if the exception handler in current stack frame can catch anything(like catch(…)), the pr will return _URC_HANDLER_FOUND to the caller and ask for next step. But definitely, the unwind backtrace function don’t know what to do when pr return an exception handler to it.
> >
> > So this patch just evaluate such a unwind state pattern at the beginning of the personality routine in libstdc++-v3, if we meet with “_US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND”, then we directly call macro CONTINUE_UNWINDING to unwind the stack and return.
> >
> > Is this a reasonable fix?
>
> I'd like another review here however it looks sane to me. You need to CC libstd...@gcc.gnu.org for libstdc++ patches. Your email doesn't say how you tested this patch. Can you make sure you've run this through a bootstrap and regression test on GNU/Linux and a cross regression test on arm-none-eabi with no regressions ?

Hi Ramana,

Thanks for your review. After this patch, the infinite loop is fixed for the above test case, and I can confirm that no regressions occur during bootstrap and regression testing on Linux or during a cross regression test on arm-none-eabi.

Regards,
Tony

> regards
> Ramana
>
> > gcc/libstdc++-v3/ChangeLog:
> > 2014-8-25 Tony Wang tony.w...@arm.com
> >
> > PR target/56846
> > * libsupc++/eh_personality.cc: Return with CONTINUE_UNWINDING when meet with the unwind state pattern: _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND
> >
> > diff --git a/libstdc++-v3/libsupc++/eh_personality.cc b/libstdc++-v3/libsupc++/eh_personality.cc
> > index f315a83..c2b30e9 100644
> > --- a/libstdc++-v3/libsupc++/eh_personality.cc
> > +++ b/libstdc++-v3/libsupc++/eh_personality.cc
> > @@ -378,6 +378,11 @@ PERSONALITY_FUNCTION (int version,
> >    switch (state & _US_ACTION_MASK)
> >      {
> >      case _US_VIRTUAL_UNWIND_FRAME:
> > +      // If the unwind state pattern is _US_VIRTUAL_UNWIND_FRAME |
> > +      // _US_FORCE_UNWIND, we don't need to search for any handler
> > +      // as it is not a real exception. Just unwind the stack.
> > +      if (state & _US_FORCE_UNWIND)
> > +        CONTINUE_UNWINDING;
> >        actions = _UA_SEARCH_PHASE;
> >        break;
Re: RFA: Add a target_globals destructor
On Mon, Sep 8, 2014 at 5:21 PM, Richard Sandiford richard.sandif...@arm.com wrote: This is a prerequisite for a cleaned-up version of the patch in: https://gcc.gnu.org/ml/gcc/2014-03/msg00163.html . Thanks to Trevor's recent(ish) changes, it's now possible for GC structures to have destructors. This means that we can go back to xmalloc()ing the parts of target_globals that don't point to GCed data. Also, some non-GC default_* variables had redundant GTY markers. Tested on x86_64-linux-gnu. OK to install? Ok. Thanks, Richard. Richard gcc/ * bb-reorder.h (default_target_bb_reorder): Remove redundant GTY. * builtins.h (default_target_builtins): Likewise. * gcse.h (default_target_gcse): Likewise. * target-globals.h (target_globals): Add a destructor. Convert void-pointer fields back to their real type and change from GTY((atomic)) to GTY((skip)). (restore_target_globals): Remove casts accordingly. * target-globals.c (save_target_globals): Use XCNEW rather than ggc_internal_cleared_alloc to allocate non-GC structures. Use ggc_cleared_alloc to allocate the target_globals structure itself. (target_globals::~target_globals): Define. 
Index: gcc/bb-reorder.h === --- gcc/bb-reorder.h2014-09-05 16:07:26.791345611 +0100 +++ gcc/bb-reorder.h2014-09-05 16:07:26.787345661 +0100 @@ -26,7 +26,7 @@ struct target_bb_reorder { int x_uncond_jump_length; }; -extern GTY(()) struct target_bb_reorder default_target_bb_reorder; +extern struct target_bb_reorder default_target_bb_reorder; #if SWITCHABLE_TARGET extern struct target_bb_reorder *this_target_bb_reorder; #else Index: gcc/builtins.h === --- gcc/builtins.h 2014-09-05 16:07:26.791345611 +0100 +++ gcc/builtins.h 2014-09-05 16:07:26.787345661 +0100 @@ -39,7 +39,7 @@ struct target_builtins { enum machine_mode x_apply_result_mode[FIRST_PSEUDO_REGISTER]; }; -extern GTY(()) struct target_builtins default_target_builtins; +extern struct target_builtins default_target_builtins; #if SWITCHABLE_TARGET extern struct target_builtins *this_target_builtins; #else Index: gcc/gcse.h === --- gcc/gcse.h 2014-09-05 16:07:26.791345611 +0100 +++ gcc/gcse.h 2014-09-05 16:07:26.787345661 +0100 @@ -32,7 +32,7 @@ struct target_gcse { bool x_can_copy_init_p; }; -extern GTY(()) struct target_gcse default_target_gcse; +extern struct target_gcse default_target_gcse; #if SWITCHABLE_TARGET extern struct target_gcse *this_target_gcse; #else Index: gcc/target-globals.h === --- gcc/target-globals.h2014-09-05 16:07:26.791345611 +0100 +++ gcc/target-globals.h2014-09-05 16:07:26.787345661 +0100 @@ -40,18 +40,20 @@ #define TARGET_GLOBALS_H 1 #endif struct GTY(()) target_globals { + ~target_globals (); + struct target_flag_state *GTY((skip)) flag_state; - void *GTY((atomic)) regs; + struct target_regs *GTY((skip)) regs; struct target_rtl *rtl; - void *GTY((atomic)) recog; - void *GTY((atomic)) hard_regs; - void *GTY((atomic)) reload; - void *GTY((atomic)) expmed; + struct target_recog *GTY((skip)) recog; + struct target_hard_regs *GTY((skip)) hard_regs; + struct target_reload *GTY((skip)) reload; + struct target_expmed *GTY((skip)) expmed; struct target_optabs *GTY((skip)) optabs; struct 
target_libfuncs *libfuncs; struct target_cfgloop *GTY((skip)) cfgloop; - void *GTY((atomic)) ira; - void *GTY((atomic)) ira_int; + struct target_ira *GTY((skip)) ira; + struct target_ira_int *GTY((skip)) ira_int; struct target_builtins *GTY((skip)) builtins; struct target_gcse *GTY((skip)) gcse; struct target_bb_reorder *GTY((skip)) bb_reorder; @@ -68,17 +70,17 @@ extern struct target_globals *save_targe restore_target_globals (struct target_globals *g) { this_target_flag_state = g-flag_state; - this_target_regs = (struct target_regs *) g-regs; + this_target_regs = g-regs; this_target_rtl = g-rtl; - this_target_recog = (struct target_recog *) g-recog; - this_target_hard_regs = (struct target_hard_regs *) g-hard_regs; - this_target_reload = (struct target_reload *) g-reload; - this_target_expmed = (struct target_expmed *) g-expmed; + this_target_recog = g-recog; + this_target_hard_regs = g-hard_regs; + this_target_reload = g-reload; + this_target_expmed = g-expmed; this_target_optabs = g-optabs; this_target_libfuncs = g-libfuncs; this_target_cfgloop = g-cfgloop; - this_target_ira = (struct target_ira *) g-ira; - this_target_ira_int = (struct target_ira_int *) g-ira_int; + this_target_ira = g-ira; + this_target_ira_int = g-ira_int;
Re: [debug-early] reuse variable DIEs and fix their context
On Tue, Sep 9, 2014 at 2:00 AM, Aldy Hernandez al...@redhat.com wrote:

On 09/05/14 02:00, Richard Biener wrote:

[jason: C++ questions throughout.]

On Fri, Sep 5, 2014 at 4:38 AM, Aldy Hernandez al...@redhat.com wrote:

On 09/04/14 03:42, Richard Biener wrote:

On Wed, Sep 3, 2014 at 7:54 PM, Aldy Hernandez al...@redhat.com wrote:

I meant that LATE_WRITE_GLOBALS shouldn't be a langhook at all but instead the middle-end should be in control of that and implement it in a language independent way. After all this will be called from LTO LTRANS phase.

This looks like a rat's nest :(.

Interestingly, most non-C/C++ languages are well behaved, and use the generic write_global_declarations() function. Ada even goes as far as calling debug_hooks->global_decl() before the compilation proper and then once again after the compilation has finished (like we're planning on doing). Java, which you thought was horrible, mostly just calls the generic write_global_declarations().

C++ is a different story... It seems to me that C++ is the most complicated of the FE's when it comes to LANG_HOOKS_WRITE_GLOBALS. Most annoyingly, it does many things *after* it has called finalize_compilation_unit (creating VTV constructors, calling check_global_declarations on pending_statics, building Java method aliases, etc etc). (See cp_write_global_declarations() for everything after finalize_compilation_unit).

Yeah, it was the Java method aliases building I remember (I tried before VTV materialized) ;) So I falsely blamed Java - it's only remotely Java's fault ... ;)

What I have in mind is:

1. Move the FE specific things that come before the call to finalize_compilation_unit currently in each LANG_HOOKS_WRITE_GLOBALS, into the FE proper (lang_hooks.parse_file). This may or may not mean calling {wrapup,check}_global_declarations directly from the FEs since some FE's call these in a sufficiently different order to merit everyone doing their own thing (not sure though).

2. Generate debug information by gathering the list of globals with lang_hooks.decls.getdecls (??) and then doing debug_hooks->early_global_decl() as discussed.

Or move that also to lang_hooks.parse_file? ISTR lang_hooks.decls.getdecls is sort of an alternative hook to write_global_declarations that is only used by the generic implementation of write_global_declarations. So if we move everything else but calling debug_hooks->early_global_decl () out of the write_global_declarations langhook then we could indeed remove that hook and implement getdecls everywhere. I suppose one of the hooks should go in the end.

2. Call finalize_compilation_unit() directly from compile_file().

Great!

3. Call some (new) hook for C++ stuff after finalize_compilation_unit (???).

Or fix the C++ stuff to work properly in a symtab way? I suppose as an intermediate step adding a new langhook for this on the branch is ok but I'd rather not get that merged into trunk. Maybe Jason can help cleaning this up.

4. FOR_EACH_DEFINED_SYMBOL (node) debug_hooks->late_global_decl (node->decl) as suggested.

The wildcard here is C++. Shall we create a hook for post finalize_compilation_unit() but pre late debug dumping (item 3 above)? Or can we move most of the post finalize_compilation_unit() stuff in C++ before it, thus into the FE proper? Also, disturbingly C++ calls check_global_declarations() after finalize_compilation_unit (a couple times actually). I think if we can get C++ to work, everything else basically falls into place... even Ada and Go ;-).

I hope that C++ can be fixed to do things in proper order and not behind symtab's back. For the branch to be able to move forward we can certainly add some hooks temporarily. Or disable the Java/VTV stuff for the time being. I don't remember running into the check_global_declarations () issue (or what that does). Jason?

Thanks,
Richard.

Comments highly welcome.
Aldy
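The ordering proposed in the numbered list above can be sketched as a small standalone driver. This is only a reading of the proposal, not GCC code: every function here (demo_parse_file, demo_early_global_decl, demo_post_finalize_hook, ...) is a hypothetical stub that merely records when it ran, and the temporary C++ hook from item 3 is modelled as a plain function call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the (hypothetical) phases run.
static std::vector<std::string> trace;

// Stand-ins only: none of these are GCC's real functions.
static void demo_parse_file () { trace.push_back ("parse_file"); }
static void demo_early_global_decl (const std::string &d) { trace.push_back ("early:" + d); }
static void demo_finalize_compilation_unit () { trace.push_back ("finalize"); }
static void demo_post_finalize_hook () { trace.push_back ("post_finalize"); }
static void demo_late_global_decl (const std::string &d) { trace.push_back ("late:" + d); }

// globals: what a getdecls-style hook would return after parsing.
// defined: the symbols that survive to the late debug pass.
static void demo_compile_file (const std::vector<std::string> &globals,
                               const std::vector<std::string> &defined)
{
  // Item 1: FE-specific work happens inside the parse step.
  demo_parse_file ();

  // First item 2: early debug info for every global the FE reports.
  for (const auto &d : globals)
    demo_early_global_decl (d);

  // Second item 2: the middle end calls this directly, not via a langhook.
  demo_finalize_compilation_unit ();

  // Item 3: temporary hook for work C++ still does after finalization.
  demo_post_finalize_hook ();

  // Item 4: late debug info for each defined symbol.
  for (const auto &d : defined)
    demo_late_global_decl (d);
}
```

The point of the sketch is that the front end only contributes the parse step and, temporarily, the post-finalize hook; everything else is driven from one language-independent place.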
Re: [Patch ARM] Fix PR target/56846
On 09/09/14 09:33 +0100, Ramana Radhakrishnan wrote:

I'd like another review here however it looks sane to me. You need to CC libstd...@gcc.gnu.org for libstdc++ patches. Your email doesn't say how you tested this patch. Can you make sure you've run this through a bootstrap and regression test on GNU/Linux and a cross regression test on arm-none-eabi with no regressions ?

Thanks for forwarding this, Ramana.

I don't know the EABI unwinder code so if Ramana is OK with it and no other ARM maintainers have any comments then the patch is OK with me too, with a couple of small tweaks ...

gcc/libstdc++-v3/ChangeLog:

2014-8-25 Tony Wang tony.w...@arm.com

    PR target/56846
    * libsupc++/eh_personality.cc: Return with CONTINUE_UNWINDING
    when meet with the unwind state pattern:
    _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND

The changelog should say which function is being changed:

    * libsupc++/eh_personality.cc (__gxx_personality_v0): ...

Or maybe:

    * libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): ...

Instead of "when meet with the unwind state pattern" please say "when the state pattern contains"

diff --git a/libstdc++-v3/libsupc++/eh_personality.cc b/libstdc++-v3/libsupc++/eh_personality.cc
index f315a83..c2b30e9 100644
--- a/libstdc++-v3/libsupc++/eh_personality.cc
+++ b/libstdc++-v3/libsupc++/eh_personality.cc
@@ -378,6 +378,11 @@ PERSONALITY_FUNCTION (int version,
   switch (state & _US_ACTION_MASK)
     {
     case _US_VIRTUAL_UNWIND_FRAME:
+      // If the unwind state pattern is _US_VIRTUAL_UNWIND_FRAME |
+      // _US_FORCE_UNWIND, we don't need to search for any handler
+      // as it is not a real exception. Just unwind the stack.

I think this comment would be easier to read if the expression with the two constants was all on one line:

    // If the unwind state pattern is
    // _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND
    // then we don't need to search for any handler as it is not a real
    // exception. Just unwind the stack.

+      if (state & _US_FORCE_UNWIND)
+        CONTINUE_UNWINDING;
       actions = _UA_SEARCH_PHASE;
       break;
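The check under discussion can be sketched in isolation as follows. This is a simplified model, not the libsupc++ personality routine: the DEMO_ constants are illustrative values modelled on the ARM EABI unwinder state encoding (treat the exact numbers as assumptions), and demo_personality only reports which path would be taken instead of actually unwinding.

```cpp
#include <cassert>

// Illustrative values modelled on the ARM EABI unwinder state encoding;
// the exact numbers are assumptions of this sketch.
enum demo_state : int
{
  DEMO_US_VIRTUAL_UNWIND_FRAME = 0,
  DEMO_US_UNWIND_FRAME_STARTING = 1,
  DEMO_US_UNWIND_FRAME_RESUME = 2,
  DEMO_US_ACTION_MASK = 3,
  DEMO_US_FORCE_UNWIND = 8
};

enum demo_result { DEMO_CONTINUE_UNWIND, DEMO_SEARCH_PHASE, DEMO_OTHER };

// The low bits select the action; FORCE_UNWIND is a separate flag bit
// that may be OR-ed onto any action.
static demo_result demo_personality (int state)
{
  switch (state & DEMO_US_ACTION_MASK)
    {
    case DEMO_US_VIRTUAL_UNWIND_FRAME:
      // A forced unwind is not a real exception, so there is no handler
      // to search for: just keep unwinding.
      if (state & DEMO_US_FORCE_UNWIND)
        return DEMO_CONTINUE_UNWIND;
      return DEMO_SEARCH_PHASE;
    default:
      // Other actions are outside the scope of this sketch.
      return DEMO_OTHER;
    }
}
```

For example, demo_personality (DEMO_US_VIRTUAL_UNWIND_FRAME | DEMO_US_FORCE_UNWIND) takes the new early-exit path, while a plain DEMO_US_VIRTUAL_UNWIND_FRAME still enters the search phase.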
RE: [Patch ARM] Fix PR target/56846
-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jonathan Wakely
Sent: Tuesday, September 09, 2014 5:16 PM
To: Ramana Radhakrishnan
Cc: Tony Wang; gcc-patches; d...@debian.org; aph-...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan; libstd...@gcc.gnu.org
Subject: Re: [Patch ARM] Fix PR target/56846

On 09/09/14 09:33 +0100, Ramana Radhakrishnan wrote:

I'd like another review here however it looks sane to me. You need to CC libstd...@gcc.gnu.org for libstdc++ patches. Your email doesn't say how you tested this patch. Can you make sure you've run this through a bootstrap and regression test on GNU/Linux and a cross regression test on arm-none-eabi with no regressions ?

Thanks for forwarding this, Ramana.

I don't know the EABI unwinder code so if Ramana is OK with it and no other ARM maintainers have any comments then the patch is OK with me too, with a couple of small tweaks ...

Thanks for your comment, Jonathan. I will send a new patch to cover your comment.

BR,
Tony

gcc/libstdc++-v3/ChangeLog:

2014-8-25 Tony Wang tony.w...@arm.com

    PR target/56846
    * libsupc++/eh_personality.cc: Return with CONTINUE_UNWINDING
    when meet with the unwind state pattern:
    _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND

The changelog should say which function is being changed:

    * libsupc++/eh_personality.cc (__gxx_personality_v0): ...

Or maybe:

    * libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): ...

Instead of "when meet with the unwind state pattern" please say "when the state pattern contains"

diff --git a/libstdc++-v3/libsupc++/eh_personality.cc b/libstdc++-v3/libsupc++/eh_personality.cc
index f315a83..c2b30e9 100644
--- a/libstdc++-v3/libsupc++/eh_personality.cc
+++ b/libstdc++-v3/libsupc++/eh_personality.cc
@@ -378,6 +378,11 @@ PERSONALITY_FUNCTION (int version,
   switch (state & _US_ACTION_MASK)
     {
     case _US_VIRTUAL_UNWIND_FRAME:
+      // If the unwind state pattern is _US_VIRTUAL_UNWIND_FRAME |
+      // _US_FORCE_UNWIND, we don't need to search for any handler
+      // as it is not a real exception. Just unwind the stack.

I think this comment would be easier to read if the expression with the two constants was all on one line:

    // If the unwind state pattern is
    // _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND
    // then we don't need to search for any handler as it is not a real
    // exception. Just unwind the stack.

+      if (state & _US_FORCE_UNWIND)
+        CONTINUE_UNWINDING;
       actions = _UA_SEARCH_PHASE;
       break;
Re: [Patch, gcc, testsuite]Disable xordi3-opt.c/iordi3-opt.c on thumb1 target
On Thu, Sep 4, 2014 at 3:21 AM, Tony Wang tony.w...@arm.com wrote:

Hi there,

This is a test case clean up patch, because the orr/eor instructions for thumb1 have only two variants:

    ORRS Rdn, Rm
    ORR<c> Rdn, Rm

No shift is available in the thumb1 encoding, so the test cases xordi3-opt.c/iordi3-opt.c are invalid for a thumb1 target. This patch just disables them for thumb1 targets.

Ok for the trunk?

Ok (assuming you've tested this on suitable multilib variants :))

Ramana

gcc/gcc/testsuite/ChangeLog:

2014-09-04 Tony Wang tony.w...@arm.com

    * gcc.target/arm/xordi3-opt.c: Disable this test case for thumb1 target.
    * gcc.target/arm/iordi3-opt.c: Ditto.

diff --git a/gcc/testsuite/gcc.target/arm/iordi3-opt.c b/gcc/testsuite/gcc.target/arm/iordi3-opt.c
index b3f465b..63fbe0b 100644
--- a/gcc/testsuite/gcc.target/arm/iordi3-opt.c
+++ b/gcc/testsuite/gcc.target/arm/iordi3-opt.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { arm_arm_ok || arm_thumb2_ok} } } */
 /* { dg-options "-O1" } */
 unsigned long long or64 (unsigned long long input)
diff --git a/gcc/testsuite/gcc.target/arm/xordi3-opt.c b/gcc/testsuite/gcc.target/arm/xordi3-opt.c
index 7e031c3..53b2bab 100644
--- a/gcc/testsuite/gcc.target/arm/xordi3-opt.c
+++ b/gcc/testsuite/gcc.target/arm/xordi3-opt.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { arm_arm_ok || arm_thumb2_ok} } } */
 /* { dg-options "-O1" } */
 unsigned long long xor64 (unsigned long long input)
Re: [Patch AArch64] Add support for crtfastmath.c
On 5 September 2014 09:04, Ramana Radhakrishnan ramana.radhakrish...@arm.com wrote:

On 09/04/2014 07:04 AM, Ramana Radhakrishnan wrote:

gcc/Changelog

2014-09-04 Marcus Shawcroft marcus.shawcr...@arm.com
           Ramana Radhakrishnan ramana.radhakrish...@arm.com

    * config/aarch64/aarch64-elf-raw.h (ENDFILE_SPEC): Add crtfastmath.o.
    * config/aarch64/aarch64-linux.h (GNU_USER_TARGET_MATH_ENDFILE_SPEC): Define.
    (ENDFILE_SPEC): Define and use GNU_USER_TARGET_MATH_ENDFILE_SPEC.

libgcc/Changelog

2014-09-04 Marcus Shawcroft marcus.shawcr...@arm.com
           Ramana Radhakrishnan ramana.radhakrish...@arm.com

    * config.host (aarch64*): Include crtfastmath and t-crtfm.
    * config/aarch64/crtfastmath.c: New file.

Bah - of course. Here it is attached.

Ramana

OK
/Marcus
Re: [PATCH][AArch64] PR 61749: Do not ICE in lane intrinsics when passed non-constant lane number
On 5 September 2014 10:07, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Ok for trunk?

2014-09-05 Kyrylo Tkachov kyrylo.tkac...@arm.com

    PR target/61749
    * config/aarch64/aarch64-builtins.c (aarch64_types_quadop_qualifiers):
    Use qualifier_immediate for last operand. Rename to...
    (aarch64_types_ternop_lane_qualifiers): ... This.
    (TYPES_QUADOP): Rename to...
    (TYPES_TERNOP_LANE): ... This.
    (aarch64_simd_expand_args): Return const0_rtx when encountering user
    error. Change return of 0 to return of NULL_RTX.
    (aarch64_crc32_expand_builtin): Likewise.
    (aarch64_expand_builtin): Return NULL_RTX instead of 0.
    ICE when expanding unknown builtin.
    * config/aarch64/aarch64-simd-builtins.def (sqdmlal_lane): Use
    TERNOP_LANE qualifiers.
    (sqdmlsl_lane): Likewise.
    (sqdmlal_laneq): Likewise.
    (sqdmlsl_laneq): Likewise.
    (sqdmlal2_lane): Likewise.
    (sqdmlsl2_lane): Likewise.
    (sqdmlal2_laneq): Likewise.
    (sqdmlsl2_laneq): Likewise.

2014-09-05 Kyrylo Tkachov kyrylo.tkac...@arm.com

    PR target/61749
    * gcc.target/aarch64/vqdml_lane_intrinsics-bad_1.c: New test.

OK
/Marcus