C++ PATCH for c++/46304 (ICE with _Complex)
In this bug, the underlying problem was that we had a COMPLEX_EXPR representing a complex constant rather than a COMPLEX_CST. There was also the issue that 4.5 didn't deal with this very well, but fixing the testcase to use COMPLEX_CST (by folding the COMPLEX_EXPR case like we do everything else in cp_build_binary_op) fixes the ICE.

Tested x86_64-pc-linux-gnu, applied to trunk, 4.5 and 4.6.

commit 58d863dc8ff1f357a7f8f4064c204779a2b2c2eb
Author: Jason Merrill ja...@redhat.com
Date:   Tue Apr 19 17:58:20 2011 -0700

        PR c++/46304
        * typeck.c (cp_build_binary_op): Fold COMPLEX_EXPR.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 89d3247..dcdc790 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -4357,7 +4357,11 @@ cp_build_binary_op (location_t location,
               gcc_unreachable();
             }
         }
-      return build2 (COMPLEX_EXPR, result_type, real, imag);
+      real = fold_if_not_in_template (real);
+      imag = fold_if_not_in_template (imag);
+      result = build2 (COMPLEX_EXPR, result_type, real, imag);
+      result = fold_if_not_in_template (result);
+      return result;
     }
 
   /* For certain operations (which identify themselves by shorten != 0)
diff --git a/gcc/testsuite/g++.dg/ext/complex7.C b/gcc/testsuite/g++.dg/ext/complex7.C
new file mode 100644
index 0000000..9d5463f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/complex7.C
@@ -0,0 +1,6 @@
+// { dg-options "" }
+
+class A
+{
+  static const _Complex double x = 1.0 + 2.0i;
+};
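The distinction between an unfolded COMPLEX_EXPR and a folded COMPLEX_CST can be sketched with a toy model in C. This is only an illustration, not GCC's tree code: `toy_expr`, `toy_build_complex`, `toy_fold` and `toy_is_constant` are hypothetical names, and the real fold routines handle far more cases.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a few GCC tree codes; these are NOT the real definitions.  */
enum toy_tree_code { TOY_REAL_CST, TOY_COMPLEX_CST, TOY_COMPLEX_EXPR, TOY_VAR };

struct toy_expr
{
  enum toy_tree_code code;
  double real;                  /* value of TOY_REAL_CST, or real part of TOY_COMPLEX_CST */
  double imag;                  /* imaginary part of TOY_COMPLEX_CST */
  const struct toy_expr *op0;   /* real operand of TOY_COMPLEX_EXPR */
  const struct toy_expr *op1;   /* imaginary operand of TOY_COMPLEX_EXPR */
};

/* Build an unfolded complex expression from two operands.  */
struct toy_expr
toy_build_complex (const struct toy_expr *re, const struct toy_expr *im)
{
  struct toy_expr e = { TOY_COMPLEX_EXPR, 0.0, 0.0, re, im };
  return e;
}

/* Fold: a complex expression whose operands are both real constants
   becomes a single complex constant; anything else is returned unchanged.  */
struct toy_expr
toy_fold (struct toy_expr e)
{
  if (e.code == TOY_COMPLEX_EXPR
      && e.op0->code == TOY_REAL_CST
      && e.op1->code == TOY_REAL_CST)
    {
      struct toy_expr c = { TOY_COMPLEX_CST, e.op0->real, e.op1->real, NULL, NULL };
      return c;
    }
  return e;
}

/* Later consumers in this toy only accept *_CST nodes as constants.  */
bool
toy_is_constant (struct toy_expr e)
{
  return e.code == TOY_REAL_CST || e.code == TOY_COMPLEX_CST;
}
```

In this model an initializer such as 1.0 + 2.0i is only recognised as a constant after `toy_fold` has run, which is the situation the patch addresses by folding the result in cp_build_binary_op.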
Re: [PATCH] make LABEL_DECL has its own rtx field for its associated CODE_LABEL
On Wed, Apr 20, 2011 at 12:00 AM, Nathan Froyd froy...@codesourcery.com wrote:
> On Tue, Apr 05, 2011 at 05:55:33PM +0200, Michael Matz wrote:
>> I have a preference in having just one DECL_RTL field for conceptual reasons:
>>
>> Most DECLs are actually objects (there are some prominent exceptions, but those always would be better described with something like NAMED_ENTITY, if we had something like that, namespaces and translation_unit would qualify). All these have a RTL representation, so one field for them seems appropriate. That some of those don't have a size (either because size makes no sense or is always available via type size) hints towards a problem in the inheritance. I would think it should look like so:
>>
>> decl_common {}  # no size, no rtl, no align, no pt_uid
>> decl_with_rtl : decl_common {
>>   # add rtl, align, pt_uid
>> }
>> decl_with_size : decl_with_rtl {
>>   # add size, size_unit
>> }
>>
>> Then decl_common can still be used for imported_decl/namespace/translation_unit; objects are at least decl_with_rtl, and some objects will be decl_with_size.
>
> I had occasion to try this today; this inheritance structure doesn't work. The truncated inheritance tree looks like:
>
> * decl_common * field_decl * const_decl * decl_with_rtl * label_decl * result_decl * parm_decl * decl_with_vis...
>
> In particular, FIELD_DECLs have a size, but they have no RTL associated with them. And LABEL_DECLs have RTL, but no size.
>
> So if you went with the above, FIELD_DECLs would grow by one (useless) word. And the reverse is the situation we have today, where CONST_DECLs and LABEL_DECLs (at least) have a pointless DECL_SIZE. Ideally, we could fix things like FUNCTION_DECLs having a size, too...
>
> And I didn't check the C++ FE to see if there are problematic cases there, either.
>
> What do you think is the next step? To address this issue, we could just give LABEL_DECL its own rtx field as in the original patch, and that would resolve that.
>
> But maybe we should go further, say by making DECL_SIZE{,_UNIT} and/or DECL_RTL into actual (out-of-line function) accessors; these accessors can then access structure-specific bits of data. Then we don't have to worry about the inheritance structure, and maybe could adopt alternate storage schemes for different DECLs, such as the off-to-the-side table that Steven suggested.
>
> Another option is to change nothing ;)

Conceptually I'd say not storing DECL_RTL in the decls themselves but on the side would make sense, at least from a stylistic view. I'm not sure it'll work out very well though in terms of cost benefit.

What we could do is, if we ever can dispose of DECL/TYPE_LANG_SPECIFIC after lowering to gimple, overload that field with a DECL/TYPE_RTL_SPECIFIC field ...

Richard.

> -Nathan
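The size trade-off discussed in this thread can be sketched with plain C structs. The following is a hypothetical, heavily simplified model and not the real tree_decl_* layout: every member is a single pointer so that padding plays no role, and all `toy_` names are made up for the illustration.

```c
#include <stddef.h>

/* Smallest common part: no size, no rtl.  */
struct toy_decl_common { void *name; };

/* Hierarchy proposed in the thread: size is layered on top of rtl.  */
struct toy_decl_with_rtl  { struct toy_decl_common common; void *rtl; };
struct toy_decl_with_size { struct toy_decl_with_rtl common; void *size; void *size_unit; };

/* What a FIELD_DECL actually needs: a size, but no rtl.  */
struct toy_field_needed   { struct toy_decl_common common; void *size; void *size_unit; };
/* What it would get if it had to derive from decl_with_size.  */
struct toy_field_proposed { struct toy_decl_with_size common; };

/* The reverse layering (size in the common part, rtl added on top):
   what a LABEL_DECL needs versus what it carries.  */
struct toy_label_needed     { struct toy_decl_common common; void *rtl; };
struct toy_label_size_first { struct toy_decl_common common; void *size; void *size_unit; void *rtl; };

/* Bytes wasted per FIELD_DECL under the proposed hierarchy.  */
size_t
toy_field_overhead (void)
{
  return sizeof (struct toy_field_proposed) - sizeof (struct toy_field_needed);
}

/* Bytes wasted per LABEL_DECL when size lives in the common part.  */
size_t
toy_label_overhead (void)
{
  return sizeof (struct toy_label_size_first) - sizeof (struct toy_label_needed);
}
```

In this toy model the field overhead is one pointer (the unused rtl slot) and the label overhead is two pointers (the unused size and size_unit slots), which matches the shape of the complaint on both sides: whichever of rtl and size is layered first, some DECL kind carries a field it does not use.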
Re: [google] remove redundant push {lr} for -mthumb (issue4441050)
On Wed, 2011-04-20 at 16:26 +0800, Carrot Wei wrote:
> On Tue, Apr 19, 2011 at 8:55 PM, Richard Earnshaw rearn...@arm.com wrote:
>> On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote:
>>> Reload pass tries to determine the stack frame, so it needs to check the push/pop lr optimization opportunity. One of the criteria is if there is any far jump inside the function. Unfortunately at this time gcc can't decide each instruction's length and basic block layout, so it can't know the offset of a jump. To be conservative it assumes every jump is a far jump. So any jump in a function will prevent this push/pop lr optimization.
>>>
>>> To enable the push/pop lr optimization in reload pass, I compute the possible maximum length of the function body. If the length is not large enough, far jump is not necessary, so we can safely do push/pop lr optimization.
>>>
>>> Tested on arm qemu with options -march=armv5te -mthumb, without regression.
>>>
>>> This patch is for google/main.
>>>
>>> 2011-04-19  Guozhi Wei  car...@google.com
>>>
>>>         Google ref 40255.
>>>         * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant.
>>>         (estimate_function_length): New function.
>>>         (thumb_far_jump_used_p): No far jump is needed in short function.
>>
>> Setting aside for the moment Richi's issue with hot/cold sections, this isn't safe. Firstly get_attr_length() doesn't return the worst case length; and secondly, it doesn't take into account the size of reload insns that are still on the reloads stack -- these are only emitted right at the end of the reload pass. Both of these would need to be addressed before this can be safely done.
>>
>> It's worth noting here that in the dim and distant past we used to try to estimate the size of the function and eliminate redundant saves of R14, but the code had to be removed because it was too fragile; but it looks like some vestiges of the code are still in the compiler.
>>
>> A slightly less optimistic approach, but one that is much safer is to scan the function after reload has completed and see if we can avoid having to push LR. We can do this if:
>
> I guess less optimistic is relative to the ideal optimization situation, I believe it is still much better than the current result. Do you think arm_reorg() is an appropriate place to do this?

Making the decision in a single pass would certainly be the best approach; and arm_reorg is certainly going to come after all other major code re-arrangements. Indeed, you should probably do this after the minipool placement so that you can be sure that these don't bulk up the body of the function too much.

As you are doing the elimination late on in the compilation you can do a better job of estimation by calling shorten_branches() to work out the precise length of each insn. Then you can simply scan over the insns to work out if there is a branch that still needs r14.

R.
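The late scan suggested above (compute exact insn lengths first, then look for any branch that is still out of range) can be sketched as follows. This is a toy model under stated assumptions, not arm.c code: `toy_insn`, `toy_address` and `toy_far_jump_needed` are hypothetical names, the branch range is passed in by the caller rather than taken from the Thumb encoding, and the displacement is measured between instruction start addresses, ignoring the PC offset a real encoding would apply.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-instruction record: exact length in bytes (as a pass
   like shorten_branches would compute it) and the index of the branch
   target, or -1 if the instruction is not a branch.  */
struct toy_insn
{
  int length;
  int target;
};

/* Start address of instruction INDEX, i.e. the sum of all earlier lengths.  */
static long
toy_address (const struct toy_insn *insns, size_t index)
{
  long addr = 0;
  for (size_t i = 0; i < index; i++)
    addr += insns[i].length;
  return addr;
}

/* Return true if some branch has a displacement outside
   [-max_back, max_fwd]; only then would a far jump, and with it the
   saved link register, still be required in this model.  */
bool
toy_far_jump_needed (const struct toy_insn *insns, size_t n,
                     long max_back, long max_fwd)
{
  for (size_t i = 0; i < n; i++)
    {
      if (insns[i].target < 0)
        continue;
      long disp = toy_address (insns, (size_t) insns[i].target)
                  - toy_address (insns, i);
      if (disp < -max_back || disp > max_fwd)
        return true;
    }
  return false;
}
```

The quadratic address computation keeps the sketch short; a real pass would record addresses once. The point is only that the decision is made from final lengths, not from an estimate taken before reload has emitted all of its insns.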
Re: Allow union variables to share stack slots with -fno-strict-aliasing (issue4444051)
On Wed, Apr 20, 2011 at 2:42 AM, Easwaran Raman era...@google.com wrote:
> Hi,
> This patch allows variables whose type transitively contains a union to share stack slots if -fno-strict-aliasing is used. Bootstraps on x86_64 with no test regressions. Also tested by changing flag_strict_aliasing to 0 by default. Bootstraps and no test regressions when compared to flag_strict_aliasing=0 without this patch.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> -Easwaran
>
> 2011-04-19  Easwaran Raman  era...@google.com
>
>         * gcc/testsuite/gcc.dg/stack-layout-1.c: New
>         * gcc/cfgexpand.c (add_alias_set_conflicts): Add conflicts with a variable containing
>         union type only with -fstrict-aliasing.
>
> Index: gcc/testsuite/gcc.dg/stack-layout-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/stack-layout-1.c (revision 0)
> +++ gcc/testsuite/gcc.dg/stack-layout-1.c (revision 0)
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-strict-aliasing -fdump-rtl-expand" } */
> +union U {
> +  int a;
> +  float b;
> +};
> +struct A {
> +  union U u1;
> +  char a[100];
> +};
> +void bar (struct A *);
> +void foo ()
> +  {
> +    {
> +      struct A a;
> +      bar (&a);
> +    }
> +    {
> +      struct A a;
> +      bar (&a);
> +    }
> +  }
> +
> +/* { dg-final { scan-rtl-dump-times "Partition" 1 "expand" } } */
> +/* { dg-final { cleanup-rtl-dump "expand" } } */
> Index: gcc/cfgexpand.c
> ===================================================================
> --- gcc/cfgexpand.c (revision 172657)
> +++ gcc/cfgexpand.c (working copy)
> @@ -373,8 +373,9 @@ add_alias_set_conflicts (void)
>                   to elements will conflict.  In case of unions we have
>                   to be careful as type based aliasing rules may say
>                   access to the same memory does not conflict.  So play
> -                 safe and add a conflict in this case.  */
> -              || contains_union)
> +                 safe and add a conflict in this case when
> +                 -fstrict-aliasing is used.  */
> +              || (contains_union && flag_strict_aliasing))
>              add_stack_var_conflict (i, j);
>          }
>      }
>
> --
> This patch is available for review at http://codereview.appspot.com/051
Re: [google] remove redundant push {lr} for -mthumb (issue4441050)
I will try this method for trunk later.

thanks
Carrot

On Wed, Apr 20, 2011 at 4:48 PM, Richard Earnshaw rearn...@arm.com wrote:
> [snip]
Re: Allow union variables to share stack slots with -fno-strict-aliasing (issue4444051)
> 2011-04-19  Easwaran Raman  era...@google.com
>
>         * gcc/testsuite/gcc.dg/stack-layout-1.c: New
>         * gcc/cfgexpand.c (add_alias_set_conflicts): Add conflicts with a variable containing
>         union type only with -fstrict-aliasing.

You need an entry for each relevant ChangeLog, without prefixes:

2011-04-20  Easwaran Raman  era...@google.com

        * cfgexpand.c (add_alias_set_conflicts): Add conflicts with a variable
        containing union type only with -fstrict-aliasing.

2011-04-20  Easwaran Raman  era...@google.com

        * gcc.dg/stack-layout-1.c: New test.

-- 
Eric Botcazou
Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.
Denis Chertykov wrote:
> 2011/4/17 Denis Chertykov cherty...@gmail.com:
>> 2011/4/15 Georg-Johann Lay a...@gjlay.de:
>>> Finally, I exposed alternative #3 of the insns to the register allocator, because it is not possible to distinguish between overlapping or non-overlapping regs, and #3 does not need a scratch.
>>>
>>> Ran C-testsuite with no regressions.
>>
>> Did you encounter any difference in code size?
>
> I'm asking about code size because the IRA pass doesn't work with `scratch:MODE' at all. This leads to bad/wrong register allocation in the IRA pass. The reload pass will correct such a wrong allocation, but reload can't generate optimal code (reload generates correct code). Because of that, maybe you are right and it may be better to have (clobber (match_operand)) instead of (clobber (match_scratch...)).

So the conclusion is not to commit this patch and that the one-liner already installed is sufficient to fix the ICE?

> Denis.
[libgo] Improve IRIX 6.5 support (PR go/47515)
There are still a couple of open issues with libgo on IRIX 6.5. This patch fixes some of them.

* As documented in the autoconf manual, some versions of sed and grep cannot handle \? and/or \(, \). The Solaris 8 tools are among them, so this patch avoids those constructs.

* test $mips_abi != produces an error message with some shells, so avoid the empty string.

* IRIX 6 has its timezone files in /usr/lib/locale/TZ.

* One issue I'd been fighting with, since I didn't understand the gccgo error message

  /vol/gcc/src/hg/trunk/irix/libgo/go/os/file.go:432:12: error: incompatible types in assignment (implicit assignment of 'syscall.Timeval' hidden field '_f0')

  was a compilation failure of the 64-bit libgo. sys/time.h has

  struct timeval {
  #if _MIPS_SZLONG == 64
          __int32_t :32;
  #endif
          time_t  tv_sec;         /* seconds */
          long    tv_usec;        /* and microseconds */
  };

  which is converted into

  type _timeval struct { _f0 int32; tv_sec int32; tv_usec int64; }

  Just changing the field name in mksysinfo.sh allows the compilation to succeed.

* IRIX needs its own socket_irix.go.

* IRIX syslog has the same issue as the Solaris one. syslogd(1M) states

  syslogd reads from the stream device /dev/log, from an Internet domain socket specified in /etc/services, and from the special device /dev/klog (to read kernel messages).

  so I'm reusing the libc syslog variant. Since it isn't Solaris specific, I've renamed the file to syslog_libc.go.

The remaining changes should be self-explanatory.

With this patch, a mainline IRIX 6.5 bootstrap with Go included succeeds, although there's still a considerable number of testsuite failures. Some of them seem to be common with Solaris/SPARC, so I suspect some endianness issues.

        Rainer

2011-01-30  Rainer Orth  r...@cebitec.uni-bielefeld.de

        libgo:
        PR go/47515
        * mksysinfo.sh: Avoid \(, \), \? with grep, sed REs.
        Handle _f0 in IRIX 6.5 N64 struct timeval.
        (sysinfo.c) [__sgi__]: Move ...
        * configure.ac: ... here.
        (OSCFLAGS) [mips-sgi-irix6.5]: Define.
        (mips_abi): Initialize to unknown.
        * configure: Regenerate.
        * config.h.in: Regenerate.
        * syscalls/socket_irix.go: New file.
        * go/syslog/syslog_solaris.go: Rename to ...
        * go/syslog/syslog_libc.go: ... this.
        * Makefile.am (go_syslog_file) [LIBGO_IS_SOLARIS]: Reflect this.
        [LIBGO_IS_IRIX]: Use it here, too.
        [LIBGO_IS_IRIX] (syscall_socket_os_file): Use it.
        * Makefile.in: Regenerate.
        * go/time/zoneinfo_unix.go (zoneDir, zoneDir2): Remove.
        (setupZone): Handle IRIX 6 zoneDir.

        gcc/testsuite:
        * go.test/go-test.exp (go-set-goarch): Accept mips*-*-*.

diff --git a/gcc/testsuite/go.test/go-test.exp b/gcc/testsuite/go.test/go-test.exp
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -140,7 +140,7 @@ proc go-set-goarch { } {
                 set goarch amd64
             }
         }
-        mips-*-* {
+        mips*-*-* {
             if [check_no_compiler_messages mipso32 assembly {
                 #if _MIPS_SIM != _ABIO32
                 #error FOO
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -822,10 +822,14 @@ go_sync_files = \
         go/sync/waitgroup.go
 
 if LIBGO_IS_SOLARIS
-go_syslog_file = go/syslog/syslog_solaris.go
+go_syslog_file = go/syslog/syslog_libc.go
+else
+if LIBGO_IS_IRIX
+go_syslog_file = go/syslog/syslog_libc.go
 else
 go_syslog_file = go/syslog/syslog_unix.go
 endif
+endif
 
 go_syslog_files = \
         go/syslog/syslog.go \
@@ -1303,9 +1307,13 @@ else
 if LIBGO_IS_SOLARIS
 syscall_socket_os_file = syscalls/socket_solaris.go
 else
+if LIBGO_IS_IRIX
+syscall_socket_os_file = syscalls/socket_irix.go
+else
 syscall_socket_os_file = syscalls/socket_bsd.go
 endif
 endif
+endif
 
 # Support for epoll.
 if LIBGO_IS_LINUX
diff --git a/libgo/configure.ac b/libgo/configure.ac
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -144,7 +144,7 @@ is_386=no
 is_alpha=no
 is_arm=no
 is_m68k=no
-mips_abi=
+mips_abi=unknown
 is_ppc=no
 is_ppc64=no
 is_sparc=no
@@ -237,7 +237,7 @@ AM_CONDITIONAL(LIBGO_IS_386, test $is_38
 AM_CONDITIONAL(LIBGO_IS_ALPHA, test $is_alpha = yes)
 AM_CONDITIONAL(LIBGO_IS_ARM, test $is_arm = yes)
 AM_CONDITIONAL(LIBGO_IS_M68K, test $is_m68k = yes)
-AM_CONDITIONAL(LIBGO_IS_MIPS, test $mips_abi != )
+AM_CONDITIONAL(LIBGO_IS_MIPS, test $mips_abi != unknown)
 AM_CONDITIONAL(LIBGO_IS_MIPSO32, test $mips_abi = o32)
 AM_CONDITIONAL(LIBGO_IS_MIPSN32, test $mips_abi = n32)
 AM_CONDITIONAL(LIBGO_IS_MIPSN64, test $mips_abi = n64)
@@ -264,6 +264,11 @@ AC_SUBST(GO_DEBUG_PROC_REGS_OS_ARCH_FILE
 dnl Some targets need special flags to build sysinfo.go.
 case $target in
+mips-sgi-irix6.5)
+  # IRIX 6 needs _XOPEN_SOURCE=500 for the XPG5 version of struct
+  # msghdr in sys/socket.h.
+
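The struct timeval problem described in this message comes from an unnamed bit-field that still occupies space, so a tool translating the C struct into another language has to invent a field (here `_f0`) for it. A small C sketch of that layout follows. It is an assumption-laden mirror, not the IRIX header: `toy_timeval_n64` and the helper functions are hypothetical names, the member types follow the Go struct quoted in the message (32-bit tv_sec, 64-bit tv_usec), and bit-field layout is implementation-defined, so the offsets depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 64-bit layout quoted in the message.  */
struct toy_timeval_n64
{
  int32_t : 32;      /* unnamed padding bit-field; has no name to refer to */
  int32_t tv_sec;    /* seconds */
  int64_t tv_usec;   /* and microseconds */
};

/* Byte offset of tv_sec; non-zero because the unnamed bit-field comes first.  */
size_t
toy_tv_sec_offset (void)
{
  return offsetof (struct toy_timeval_n64, tv_sec);
}

/* Byte offset of tv_usec.  */
size_t
toy_tv_usec_offset (void)
{
  return offsetof (struct toy_timeval_n64, tv_usec);
}

/* Total size of the struct, including the unnamed padding.  */
size_t
toy_timeval_size (void)
{
  return sizeof (struct toy_timeval_n64);
}
```

Because the padding has no name in C, any generated name for it is an artifact of the conversion, which is why renaming that field in mksysinfo.sh is enough to get past the hidden-field error.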
Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.
2011/4/20 Georg-Johann Lay a...@gjlay.de:
> Denis Chertykov wrote:
>> [snip]
>
> So the conclusion is not to commit this patch and that the one-liner already installed is sufficient to fix the ICE?

Yes.

Denis.
[RFT][patch] Fix PR testsuite/48498
Hi,

In gcc.dg/vect/slp-3.c and gcc.dg/vect/no-vfa-pr29145.c vectorization is expected to fail on targets vect_no_align. But no realignment is necessary here except for having the array bases aligned. This patch removes xfail for vect_no_align (and increases a loop bound in slp-3.c to prevent complete unrolling).

It is supposed to fix the XPASSes on SPARC, I also checked it with a cross-compiler on ia64-linux-gnu, and tested on x86_64-suse-linux. vect_no_align is true for

        if { [istarget mipsisa64*-*-*]
             || [istarget sparc*-*-*]
             || [istarget ia64-*-*]
             || [check_effective_target_arm_vect_no_misalign]
             || ([istarget mips*-*-*]
                 && [check_effective_target_mips_loongson]) } {
            set et_vect_no_align_saved 1
        }

so I'd appreciate testing on these targets.

Thanks,
Ira

testsuite/ChangeLog

        PR testsuite/48498
        * gcc.dg/vect/slp-3.c: Increase loop bound. Don't expect to fail
        on vect_no_align targets.
        * gcc.dg/vect/no-vfa-pr29145.c: Don't expect to fail on
        vect_no_align targets.

Index: testsuite/gcc.dg/vect/slp-3.c
===================================================================
--- testsuite/gcc.dg/vect/slp-3.c (revision 172765)
+++ testsuite/gcc.dg/vect/slp-3.c (working copy)
@@ -4,9 +4,9 @@
 #include <stdarg.h>
 #include "tree-vect.h"
 
-#define N 8
+#define N 12
 
-unsigned short in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
+unsigned short in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
 
 int
 main1 ()
@@ -142,7 +142,7 @@
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: testsuite/gcc.dg/vect/no-vfa-pr29145.c
===================================================================
--- testsuite/gcc.dg/vect/no-vfa-pr29145.c (revision 172765)
+++ testsuite/gcc.dg/vect/no-vfa-pr29145.c (working copy)
@@ -48,6 +48,6 @@
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Re: [patch testsuite committed] Skip gcc.dg/torture/pr37868.c on sh
Mike Stump mikest...@comcast.net wrote:
> I'd pre-approve hoisting these up into the lib/.exp files and checking a generic target requirement... :-)
>
> -/* { dg-skip-if unaligned access { sparc*-*-* } * } */
> +/* { dg-skip-if unaligned access { sparc*-*-* sh*-*-* } * } */

I thought the same thing when reading HP's recent comment about changes to the testcases for avr, but gave up after grepping for STRICT_ALIGNMENT in gcc/config/*/*.h. It doesn't seem that easy to write something like check_effective_target_unaligned_access for lib/target-supports.exp with my limited knowledge of the various targets.

Regards,
        kaz
[PATCH][ARM] Remove redundant code in arm.c
This patch removes some redundant code that caused me some confusion.

It's not possible to construct a constant from multiple ORN instructions, just as it's not possible to do it with multiple AND instructions.

OK?

Andrew

2011-04-20  Andrew Stubbs  a...@codesourcery.com

        gcc/
        * config/arm/arm.c (arm_gen_constant): Remove redundant can_invert.

---
 src/gcc-mainline/gcc/config/arm/arm.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/src/gcc-mainline/gcc/config/arm/arm.c b/src/gcc-mainline/gcc/config/arm/arm.c
index 533ad91..dee6f29 100644
--- a/src/gcc-mainline/gcc/config/arm/arm.c
+++ b/src/gcc-mainline/gcc/config/arm/arm.c
@@ -2745,9 +2745,6 @@ arm_gen_constant (enum rtx_code code, enum machine_mode mode, rtx cond,
                                    gen_rtx_SET (VOIDmode, target, source));
           return 1;
         }
-
-      if (TARGET_THUMB2)
-        can_invert = 1;
       break;
 
     case AND:
@@ -3238,8 +3235,7 @@ arm_gen_constant (enum rtx_code code, enum machine_mode mode, rtx cond,
     if (remainder & (1 << i))
       num_bits_set++;
 
-  if ((code == AND)
-      || (code != IOR && can_invert && num_bits_set > 16))
+  if ((code == AND) || (can_invert && num_bits_set > 16))
     remainder ^= 0xffffffff;
   else if (code == PLUS && num_bits_set > 16)
     remainder = (-remainder) & 0xffffffff;
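The decision touched by the second hunk (work with the inverted or negated value when more than half of the 32 bits are set) can be sketched in standalone C. This is a toy reading of the condition after the patch, not arm_gen_constant itself; `toy_rtx_code`, `toy_count_bits` and `toy_initial_remainder` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the rtx codes that matter here.  */
enum toy_rtx_code { TOY_SET, TOY_AND, TOY_IOR, TOY_PLUS };

/* Count the set bits of a 32-bit value, one bit at a time.  */
int
toy_count_bits (uint32_t value)
{
  int num_bits_set = 0;
  for (int i = 0; i < 32; i++)
    if (value & (UINT32_C (1) << i))
      num_bits_set++;
  return num_bits_set;
}

/* Choose the value the splitting loop would start from: the complement
   for AND, or for any code that allows inversion when more than 16 bits
   are set; the negation for PLUS with more than 16 bits set; otherwise
   the value itself.  */
uint32_t
toy_initial_remainder (enum toy_rtx_code code, bool can_invert, uint32_t value)
{
  int num_bits_set = toy_count_bits (value);

  if (code == TOY_AND || (can_invert && num_bits_set > 16))
    return value ^ UINT32_C (0xffffffff);
  else if (code == TOY_PLUS && num_bits_set > 16)
    return UINT32_C (0) - value;
  return value;
}
```

With can_invert no longer set for IOR in the first hunk, the separate `code != IOR` test has nothing left to exclude, which is why the simplified condition is equivalent in this reading.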
[PATCH] Fix PR48695
This fixes PR48695, not handling mem-refs properly within aliasing_component_refs_p. We disregarded the offset for the decl base.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-04-20  Richard Guenther  rguent...@suse.de

        PR middle-end/48695
        * tree-ssa-alias.c (aliasing_component_refs_p): Compute base
        objects and types here. Adjust for their offset before
        comparing.

        * g++.dg/torture/pr48695.C: New testcase.

Index: gcc/tree-ssa-alias.c
===================================================================
*** gcc/tree-ssa-alias.c (revision 172759)
--- gcc/tree-ssa-alias.c (working copy)
*************** same_type_for_tbaa (tree type1, tree typ
*** 593,603 ****
     are the respective alias sets.  */
  
  static bool
! aliasing_component_refs_p (tree ref1, tree type1,
                             alias_set_type ref1_alias_set,
                             alias_set_type base1_alias_set,
                             HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
!                            tree ref2, tree type2,
                             alias_set_type ref2_alias_set,
                             alias_set_type base2_alias_set,
                             HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
--- 593,603 ----
     are the respective alias sets.  */
  
  static bool
! aliasing_component_refs_p (tree ref1,
                             alias_set_type ref1_alias_set,
                             alias_set_type base1_alias_set,
                             HOST_WIDE_INT offset1, HOST_WIDE_INT max_size1,
!                            tree ref2,
                             alias_set_type ref2_alias_set,
                             alias_set_type base2_alias_set,
                             HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
*************** aliasing_component_refs_p (tree ref1, tr
*** 609,617 ****
--- 609,629 ----
         struct A { int i; int j; } *q;
         struct B { struct A a; int k; } *p;
       disambiguating q->i and p->a.j.  */
+   tree base1, base2;
+   tree type1, type2;
    tree *refp;
    int same_p;
  
+   /* Choose bases and base types to search for.  */
+   base1 = ref1;
+   while (handled_component_p (base1))
+     base1 = TREE_OPERAND (base1, 0);
+   type1 = TREE_TYPE (base1);
+   base2 = ref2;
+   while (handled_component_p (base2))
+     base2 = TREE_OPERAND (base2, 0);
+   type2 = TREE_TYPE (base2);
+ 
    /* Now search for the type1 in the access path of ref2.  This
       would be a common base for doing offset based disambiguation on.  */
    refp = &ref2;
*************** aliasing_component_refs_p (tree ref1, tr
*** 627,632 ****
--- 639,646 ----
        HOST_WIDE_INT offadj, sztmp, msztmp;
        get_ref_base_and_extent (*refp, &offadj, &sztmp, &msztmp);
        offset2 -= offadj;
+       get_ref_base_and_extent (base1, &offadj, &sztmp, &msztmp);
+       offset1 -= offadj;
        return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
      }
    /* If we didn't find a common base, try the other way around.  */
*************** aliasing_component_refs_p (tree ref1, tr
*** 643,648 ****
--- 657,664 ----
        HOST_WIDE_INT offadj, sztmp, msztmp;
        get_ref_base_and_extent (*refp, &offadj, &sztmp, &msztmp);
        offset1 -= offadj;
+       get_ref_base_and_extent (base2, &offadj, &sztmp, &msztmp);
+       offset2 -= offadj;
        return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
      }
  
*************** indirect_ref_may_alias_decl_p (tree ref1
*** 804,814 ****
        && TREE_CODE (base1) != TARGET_MEM_REF
        && (TREE_CODE (base1) != MEM_REF
            || same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1))
!     return aliasing_component_refs_p (ref1, TREE_TYPE (ptrtype1),
                                        ref1_alias_set, base1_alias_set,
                                        offset1, max_size1,
!                                       ref2, TREE_TYPE
!                                               (reference_alias_ptr_type (ref2)),
                                        ref2_alias_set, base2_alias_set,
                                        offset2, max_size2, true);
  
--- 820,829 ----
        && TREE_CODE (base1) != TARGET_MEM_REF
        && (TREE_CODE (base1) != MEM_REF
            || same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1))
!     return aliasing_component_refs_p (ref1,
                                        ref1_alias_set, base1_alias_set,
                                        offset1, max_size1,
!                                       ref2,
                                        ref2_alias_set, base2_alias_set,
                                        offset2, max_size2, true);
  
*************** indirect_refs_may_alias_p (tree ref1 ATT
*** 951,960 ****
            || same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1)
        && (TREE_CODE (base2) != MEM_REF
            || same_type_for_tbaa (TREE_TYPE (base2), TREE_TYPE
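The effect of the missing offset adjustment can be shown with a small numeric sketch. It is a toy model with hypothetical names (`toy_ranges_overlap`, `toy_may_overlap`), not the tree-ssa-alias.c implementation: each access is described by an offset and size, and each offset has to be made relative to the common base before the two ranges are compared.

```c
#include <stdbool.h>

/* Half-open range overlap test: [pos1, pos1 + size1) against
   [pos2, pos2 + size2).  Sizes are assumed to be known and positive.  */
bool
toy_ranges_overlap (long pos1, long size1, long pos2, long size2)
{
  return pos1 < pos2 + size2 && pos2 < pos1 + size1;
}

/* Compare two accesses after rebasing both offsets onto the common base.
   base1_adj and base2_adj are the offsets of the common base within the
   respective access paths; dropping either adjustment compares numbers
   that are measured from different starting points.  */
bool
toy_may_overlap (long offset1, long size1, long base1_adj,
                 long offset2, long size2, long base2_adj)
{
  offset1 -= base1_adj;
  offset2 -= base2_adj;
  return toy_ranges_overlap (offset1, size1, offset2, size2);
}
```

For example, an access at offset 64 whose common base itself starts at offset 64 touches the same bytes as an access at offset 0 from that base. With both adjustments the two ranges overlap; if the first adjustment is left out (passed as 0), the toy reports no overlap, which is the kind of wrong disambiguation the patch description refers to.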
Re: better wpa [1/n]: merge types during read-in
Hi,

On Wed, 20 Apr 2011, Richard Guenther wrote:

>> + /* A hashtable of trees that potentially refer to variables or functions
>> +    that must be replaced with their prevailing variant.  */
>> + static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t
>> +   tree_with_vars;
>> + /* A hashtable of top-level trees that can potentially be merged with trees
>> +    from other compilation units.  */
>
> It would have been nice to have the top-level tree merging as a separate patch, as I am not convinced it is correct, but see below ...

I'll split it out.

>> + {
>> +   tree *t2 = (tree *) htab_find_slot (top_decls, t, INSERT);
>> +   if (*t2)
>> +     t = *t2;
>> +   else
>> +     *t2 = t;
>
> I don't think we can share TREE_LISTs. Where do they hang off when you do this? Are not all but one of those dead? All TREE_LIST manipulation routines I know of do not even think about the possibility of shared lists.

Well, it clearly works for the use in global trees :-) The problem I solved with this is that all trees are stored in the reader cache. If (say) two function types, or enum types, are merged, one of their TYPE_ARG_TYPES or TYPE_VALUES lists should be dead. But as they're still referred to from the cache they aren't collected. I could remove them from the cache in a similar way to how I deal with the FIELD_DECLs. Or I could store tree_list nodes not in the cache at all. The latter seems more worthwhile to me, so I'm going to try this.

>> + /* Fix up fields of a field_decl T.  */
>> +
>> + static void
>> + lto_ft_field_decl (tree t)
>> + {
>> +   lto_ft_decl_common (t);
>> +   LTO_FIXUP_TYPE (DECL_FIELD_OFFSET (t));
>> +   LTO_FIXUP_TYPE (DECL_BIT_FIELD_TYPE (t));
>> +   LTO_FIXUP_TYPE (DECL_QUALIFIER (t));
>
> here (and earlier) we had some no_fixup_p asserts. What happened to them?

no_fixup_p checked only that the tree wasn't a type or a var/function_decl.

> In particular, don't we need to fixup the DECL_FIELD_BIT_OFFSET integer-cst?

But I was indeed too eager to remove those lines, I should still deal with DECL_SECTION_NAME, DECL_FIELD_BIT_OFFSET and BINFO_OFFSET. Thanks for catching this.

>> +
>> +   /* First fixup the fields of T.  */
>> +   lto_fixup_types (t);
>> +
>> +   /* Now try to find a canonical variant of T itself.  */
>> +   if (TYPE_P (t))
>> +     {
>
> If t is a type, why fix up its field if it may not be the canonical variant?

Because for type merging to work, fields sometimes have to be canonicalized already; at least that's what I found when investigating why some types weren't merged that should have been. Hence I'm first canonicalizing all fields of everything and then see if something merged.

>> +   if (t == oldt
>> +       && TYPE_MAIN_VARIANT (t) != t)
>> +     {
>> +       /* If this is its own type, link it into the variant
>> +          chain.  */
>> +       TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (TYPE_MAIN_VARIANT (t));
>> +       TYPE_NEXT_VARIANT (TYPE_MAIN_VARIANT (t)) = t;
>
> Hmm - isn't this taken care of in lto_fixup_types?

Nope. I've taken it out of the (old) lto_fixup_type, because (now that I've removed the seen pointer_set) it can be called for the same type multiple times, which would lead to it being included multiple times in the NEXT_VARIANT chain. I found it clearer to do this kind of list manipulation at the place where we're sure to see every type only once.

>> +   for (f1 = TYPE_FIELDS (t), f2 = TYPE_FIELDS (oldt);
>> +        f1 && f2; f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
>> +     if (TREE_CODE (f1) == FIELD_DECL
>> +         && f1 != f2
>
> This should always be true unless TYPE_FIELDS (t) == TYPE_FIELDS (oldt)

Think shared field_decl chains. I'll have fixed up the chain for one of the type pairs already and can later come to a type referring to exactly the same field_decls again.

>> +         && lto_streamer_cache_lookup (cache, f2, &ix))
>
> This lookup should always succeed, no?

Yes. I'll assert this.

>> +       {
>> +         gcc_assert (DECL_NAME (f1) == DECL_NAME (f2));
>> +         /* If we're going to replace an element which we'd
>> +            still visit in the next iterations, we wouldn't
>> +            handle it, so do it here.  */
>> +         if (ix > i)
>> +           lto_fixup_types (f2);
>
> ? But it's dead, no?

That is true, but it can refer to integer_cst which I can only fix up by walking the fields. If I don't do that, then even though the field_decl will not be in the cache anymore, its integer_cst (and their non-fixed up types) will stay uncollected.

>> +         lto_streamer_cache_insert_at (cache, f1, ix);
>> +       }
>
> Btw, nice. Does this get rid of the need for the field-decl fixup code in input_gimple_stmt?

Hmm, it's gross but seems to me still required for the
Re: Remove vtable_method field in cgraph_node
Hi, On Sun, Apr 17, 2011 at 06:34:59PM +0200, Jan Hubicka wrote: Hi, this patch drops the vtable_method field. I never understood what it is about, but reading PR20991 I am convinced that we hit the same problem with the work on the new devirtualization code. I implemented what Mark describes as the ideal solution there - i.e. teaching cgraph that virtual functions might become reachable until after inlining. Since we can still devirtualize in late compilation (that is pretty much pointless, but anyway), we now have can_refer_decl_in_current_unit_p that tests if the function has already been thrown away. Perhaps we might apply there the observation about the vtable being output somewhere, but I do not think it is safe: if the vtable is COMDAT, we can pretty much also optimize all references to it out in all compilation units and end up not outputting it. When the vtable is not COMDAT, the methods won't be either and this trick will not apply. Consequently I am dropping the flag. This is very trivial based on the observation that cp_fold_obj_type_ref, the only setter of the flag, is now dead. Plus the varasm code is no longer executed at the time of IPA optimizations that the original patch was fixing. Martin, can you please look into why cp_fold_obj_type_ref is no longer used and if possible get rid of it? It was needed because of LANG_HOOKS_FOLD_OBJ_TYPE_REF which I removed last May (revision 159393) and I somehow left the implementation in place. I'll take care of removing it. Thanks, Martin
[PATCH][RFC] Fix PR48694
This makes us treat a == a in operand_equal_p in case the addresses have TREE_CONSTANT set, even in the case if a has TREE_SIDE_EFFECTS set (such as for volatile vars). Simply special-casing DECLs in the ADDR_EXPR case wouldn't handle a.b == a.b. I think it's important that operand_equal_p (a, a, 0) holds if a holds is_gimple_min_invariant (), as we generally compare constants using it. We could devise some other solution, for example introducing a possibly cheaper gimple_val_equal_p () predicate that only handles operands that satisfy is_gimple_val (). Any comments? The testcase is too large, I didn't sofar try to construct one artificially. The issue is latent at least on the 4.6 branch. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2011-04-20 Richard Guenther rguent...@suse.de * tree.h (OEP_CONSTANT_ADDRESS_OF): New operand_equal_flag. * fold-const.c (operand_equal_p): For TREE_CONSTANT ADDR_EXPRs compare the operands with OEP_CONSTANT_ADDRESS_OF. Treat trees with TREE_SIDE_EFFECTS equal when OEP_CONSTANT_ADDRESS_OF is set. Index: gcc/tree.h === *** gcc/tree.h (revision 172759) --- gcc/tree.h (working copy) *** extern tree fold_fma (location_t, tree, *** 5118,5124 enum operand_equal_flag { OEP_ONLY_CONST = 1, ! OEP_PURE_SAME = 2 }; extern int operand_equal_p (const_tree, const_tree, unsigned int); --- 5118,5125 enum operand_equal_flag { OEP_ONLY_CONST = 1, ! OEP_PURE_SAME = 2, ! OEP_CONSTANT_ADDRESS_OF = 4 }; extern int operand_equal_p (const_tree, const_tree, unsigned int); Index: gcc/fold-const.c === *** gcc/fold-const.c(revision 172759) --- gcc/fold-const.c(working copy) *** operand_equal_p (const_tree arg0, const_ *** 2475,2483 equal if they have no side effects. If we have two identical expressions with side effects that should be treated the same due to the only side effects being identical SAVE_EXPR's, that will ! be detected in the recursive calls below. */ if (arg0 == arg1 ! 
(flags OEP_ONLY_CONST) (TREE_CODE (arg0) == SAVE_EXPR || (! TREE_SIDE_EFFECTS (arg0) ! TREE_SIDE_EFFECTS (arg1 return 1; --- 2475,2486 equal if they have no side effects. If we have two identical expressions with side effects that should be treated the same due to the only side effects being identical SAVE_EXPR's, that will ! be detected in the recursive calls below. ! If we are taking an invariant address of two identical objects ! they are necessarily equal as well. */ if (arg0 == arg1 ! (flags OEP_ONLY_CONST) (TREE_CODE (arg0) == SAVE_EXPR + || (flags OEP_CONSTANT_ADDRESS_OF) || (! TREE_SIDE_EFFECTS (arg0) ! TREE_SIDE_EFFECTS (arg1 return 1; *** operand_equal_p (const_tree arg0, const_ *** 2540,2546 case ADDR_EXPR: return operand_equal_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg1, 0), ! 0); default: break; } --- 2543,2550 case ADDR_EXPR: return operand_equal_p (TREE_OPERAND (arg0, 0), TREE_OPERAND (arg1, 0), ! TREE_CONSTANT (arg0) TREE_CONSTANT (arg1) ! ? OEP_CONSTANT_ADDRESS_OF : 0); default: break; }
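The effect of the new flag can be sketched with a toy model. Everything below is hypothetical illustration: `toy_node`, `toy_operand_equal` and the `TOY_*` names are invented stand-ins, not GCC trees or the real operand_equal_p, and the model assumes the operators stripped from the archived diff are the usual `&&` and `&`. It only shows why two distinct address nodes over the same object with side effects (for example a volatile variable) compare equal once the constant-address flag is passed down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented flag values mirroring the enum in the patch.  */
enum toy_equal_flag
{
  TOY_ONLY_CONST = 1,
  TOY_PURE_SAME = 2,
  TOY_CONSTANT_ADDRESS_OF = 4
};

enum toy_code { TOY_VAR, TOY_ADDR };

/* A drastically simplified stand-in for a tree node.  */
struct toy_node
{
  enum toy_code code;
  bool side_effects;               /* e.g. a volatile variable */
  bool constant;                   /* the address is invariant */
  const struct toy_node *operand;  /* only used by TOY_ADDR */
};

bool
toy_operand_equal (const struct toy_node *a, const struct toy_node *b,
                   unsigned flags)
{
  /* Identical nodes are equal if they have no side effects, or if we
     are only taking their (invariant) address.  */
  if (a == b
      && !(flags & TOY_ONLY_CONST)
      && ((flags & TOY_CONSTANT_ADDRESS_OF)
          || (!a->side_effects && !b->side_effects)))
    return true;

  if (a->code != b->code)
    return false;

  /* For address nodes, compare the operands; pass the new flag down
     only when both addresses are constant.  */
  if (a->code == TOY_ADDR)
    return toy_operand_equal (a->operand, b->operand,
                              (a->constant && b->constant)
                              ? TOY_CONSTANT_ADDRESS_OF : 0);

  return false;
}

/* Spot checks; call this from main () to run them.  */
void
check_toy_operand_equal (void)
{
  struct toy_node v = { TOY_VAR, true, false, NULL };
  struct toy_node addr1 = { TOY_ADDR, false, true, &v };
  struct toy_node addr2 = { TOY_ADDR, false, true, &v };

  assert (!toy_operand_equal (&v, &v, 0));        /* side effects block it */
  assert (toy_operand_equal (&addr1, &addr2, 0)); /* constant addresses match */
}
```

In this model the variable itself still compares unequal under flags 0; only the path through a constant address node sets the flag, which is the behaviour the ChangeLog entry describes.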
Re: [Patch,AVR]: PR18145: do_copy_data do_clear_bss only if needed
Denis Chertykov wrote: 2011/4/19 Anatoly Sokolov ae...@post.ru: Please use ASM_OUTPUT_ALIGNED_DECL_COMMON and ASM_OUTPUT_ALIGNED_DECL_LOCAL macros here. Confused. These macros are used. Johann Sorry... I was looking at the function name, not at the macro definition. I agree with the patch. Please wait a day or two; if Denis does not object, commit the patch. Please, commit. Denis. Committed with the following change in avr.c:avr_asm_named_section + || 0 == strncmp (name, ".gnu.linkonce.d", 15)); instead of + || 0 == strncmp (name, ".gnu.linkonce.", 14)); http://gcc.gnu.org/viewcvs?view=revision&revision=172769 Johann
Re: Improve stack layout heuristic.
Hi, On Tue, 19 Apr 2011, Easwaran Raman wrote: That is correct but is also what the use of stack_vars[u].representative achieves alone, ... I am adding a check to that effect. ... without any check. @@ -596,7 +581,8 @@ if (vb-conflicts) { EXECUTE_IF_SET_IN_BITMAP (vb-conflicts, 0, u, bi) - add_stack_var_conflict (a, stack_vars[u].representative); + if (stack_vars[u].next == EOC stack_vars[u].representative == u) + add_stack_var_conflict (a, u); BITMAP_FREE (vb-conflicts); } } What's your objective with this change? I find the original code clearer. Let us say we try to merge 'a' to 'b' and 'a' has conflicts with many members of an existing partition C. It is not necessary to add all those conflicts to 'b' since they will be never considered in the call to union_stack_vars. Right, that's why I was objecting to your initial change. The original code (adding stack_vars[u].representative to the conflicts of A) made sure the target conflict bitmap only got representatives added. That's why I was asking why you changed this area at all. I was motivated by your comment on bit-vector bloat to try this, but if this affects readability I'll happily revert back to what it was before. Ciao, Michael.
Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)
Hi, On Tue, Apr 19, 2011 at 09:07:35AM +0200, Jakub Jelinek wrote: On Mon, Apr 18, 2011 at 03:33:18PM -0700, Jason Merrill wrote: On 04/18/2011 02:40 PM, Jakub Jelinek wrote: If TREE_BINFO has BV_VCALL_INDEX set, this needs to be dynamically adjusted, but none of the callers are prepared to handle that. Well, it means that we do dynamic adjustment at runtime. If we're able to do devirtualization, we should be able to figure out the right offset as well, just not in 4.6. Sure, but how exactly? We could re-use code in assemble_thunk and thunk_adjust in cgraphunit.c which IIRC is able to deal with these kinds of thunks too. That is, if we do not opt for some more systematic approach, like making (most of) them real functions with bodies in GIMPLE instead. Martin
[RFA] MIPS 24K Errata Support
Hi Richard, This is the gcc patch for the 24K errata. Does this look okay to commit? Thanks, Catherine 2011-04-20 Catherine Moore c...@codesourcery.com * config/mips/mips.opt (mfix-24k): New. * config/mips/mips.h (ASM_SPEC): Handle -mfix-24k. * config/mips/mips.md (length): Increase by 4 for stores if fixing 24K errata. * config/mips/mips.c (mips_reorg_process_insns): Do not allow all noreorder if fixing 24K errata. * doc/invoke.texi: Document mfix-24k. Index: doc/invoke.texi === --- doc/invoke.texi (revision 172728) +++ doc/invoke.texi (working copy) @@ -714,6 +714,7 @@ Objective-C and Objective-C++ Dialects}. -mdivide-traps -mdivide-breaks @gol -mmemcpy -mno-memcpy -mlong-calls -mno-long-calls @gol -mmad -mno-mad -mfused-madd -mno-fused-madd -nocpp @gol +-mfix-24k -mno-fix-24k @gol -mfix-r4000 -mno-fix-r4000 -mfix-r4400 -mno-fix-r4400 @gol -mfix-r1 -mno-fix-r1 -mfix-vr4120 -mno-fix-vr4120 @gol -mfix-vr4130 -mno-fix-vr4130 -mfix-sb1 -mno-fix-sb1 @gol @@ -14602,6 +14603,13 @@ circumstances. Tell the MIPS assembler to not run its preprocessor over user assembler files (with a @samp{.s} suffix) when assembling them. +@item -mfix-24k +@item -mno-fix-24k +@opindex mfix-24k +@opindex mno-fix-24k +Work around the 24K E48 Lost Data on Stores during Refill errata. The +workarounds are implemented by the assembler rather than by GCC. 
+ @item -mfix-r4000 @itemx -mno-fix-r4000 @opindex mfix-r4000 Index: config/mips/mips.md === --- config/mips/mips.md (revision 172728) +++ config/mips/mips.md (working copy) @@ -479,7 +479,9 @@ (define_attr length (eq_attr move_type load,fpload) (symbol_ref mips_load_store_insns (operands[1], insn) * 4) (eq_attr move_type store,fpstore) - (symbol_ref mips_load_store_insns (operands[0], insn) * 4) + (cond [(eq (symbol_ref TARGET_FIX_24K) (const_int 0)) + (symbol_ref mips_load_store_insns (operands[0], insn) * 4)] + (symbol_ref mips_load_store_insns (operands[0], insn) * 4 + 4)) ;; In the worst case, a call macro will take 8 instructions: ;; Index: config/mips/mips.opt === --- config/mips/mips.opt (revision 172728) +++ config/mips/mips.opt (working copy) @@ -110,6 +110,10 @@ mextern-sdata Target Report Var(TARGET_EXTERN_SDATA) Init(1) Use -G for data that is not defined by the current object +mfix-24k +Target Report Var(TARGET_FIX_24K) +Work around certain 24K errata + mfix-r4000 Target Report Mask(FIX_R4000) Work around certain R4000 errata Index: config/mips/mips.c === --- config/mips/mips.c (revision 172728) +++ config/mips/mips.c (working copy) @@ -14886,9 +14886,9 @@ mips_reorg_process_insns (void) if (crtl-profile) cfun-machine-all_noreorder_p = false; - /* Code compiled with -mfix-vr4120 can't be all noreorder because - we rely on the assembler to work around some errata. */ - if (TARGET_FIX_VR4120) + /* Code compiled with -mfix-vr4120 or -mfix-24k can't be all noreorder + because we rely on the assembler to work around some errata. 
*/ + if (TARGET_FIX_VR4120 || TARGET_FIX_24K) cfun-machine-all_noreorder_p = false; /* The same is true for -mfix-vr4130 if we might generate MFLO or Index: config/mips/mips.h === --- config/mips/mips.h (revision 172728) +++ config/mips/mips.h (working copy) @@ -1134,6 +1134,7 @@ enum mips_code_readable_setting { %{msmartmips} %{mno-smartmips} \ %{mmt} %{mno-mt} \ %{mfix-vr4120} %{mfix-vr4130} \ +%{mfix-24k} \ %(subtarget_asm_optimizing_spec) \ %(subtarget_asm_debugging_spec) \ %{mabi=*} %{!mabi=*: %(asm_abi_default_spec)} \
Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)
Hi, On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote: Actually what happens here is that CCP devirtualizes by propagating the constructors and, due to Richard's new code to drop OBJ_TYPE_REF, we finally get a direct call. This is all good and desirable. I think a good solution would be to fold further and inline the thunk adjustment, just like the type-based devirtualization does. Even once I get far enough with my cgraph cleanup project to make cgraph represent thunks nicely, we would win if in these cases CCP and other passes simply inlined the this adjustment, like we do with type-based devirtualization already. Martin, I guess it is a matter of looking up the thunk info by the associated cgraph node alias and extending fold_stmts of passes that now drop the OBJ_TYPE_REF wrappers? Well, if you have a cgraph node then yes. But if the method is implemented in a different compilation unit you don't. And as I already said today on IRC, I don't think it is possible to tell whether a function is a thunk by looking at the decl alone (the front end has a flag for it as Jakub noted, though), let alone what kind of thunk it is. The more I think about this the more I would also like to make thunks as much like ordinary real functions as possible, with perhaps some kind of totally opaque decls/cgraph_nodes for the most obscure types which could be generated by assembly. Martin
[PATCH][ARM] Clean up movw support
This patch doesn't change the compiler behaviour; it merely moves the support for MOVW's 16-bit immediate constant to const_ok_for_op. This patch is broken out of my previous (rejected) Thumb2-constants patch. I'll be posting v2 of that patch soon, and this clean up will be required then. OK? Andrew 2011-04-20 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.c (arm_gen_constant): Move mowv support (const_ok_for_op): ... to here. --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -2461,6 +2461,13 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code) switch (code) { +case SET: + /* See if we can use movw. */ + if (arm_arch_thumb2 (i 0x) == 0) + return 1; + else + return 0; + case PLUS: case COMPARE: case EQ: @@ -2820,9 +2827,7 @@ arm_gen_constant (enum rtx_code code, enum machine_mode mode, rtx cond, } /* If we can do it in one insn get out quickly. */ - if (const_ok_for_arm (val) - || (can_negate_initial const_ok_for_arm (-val)) - || (can_invert const_ok_for_arm (~val))) + if (const_ok_for_op (val, code)) { if (generate) emit_constant_insn (cond, @@ -2875,15 +2880,6 @@ arm_gen_constant (enum rtx_code code, enum machine_mode mode, rtx cond, switch (code) { case SET: - /* See if we can use movw. */ - if (arm_arch_thumb2 (remainder 0x) == 0) - { - if (generate) - emit_constant_insn (cond, gen_rtx_SET (VOIDmode, target, - GEN_INT (val))); - return 1; - } - /* See if we can do this by sign_extending a constant that is known to be negative. This is a good, way of doing it, since the shift may well merge into a subsequent insn. */
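The check that moves into const_ok_for_op for the SET case can be illustrated in isolation. This is a minimal sketch, not the GCC code: `fits_movw_immediate` is a hypothetical helper, and because the mask is truncated in the archived diff, the value `0xffff0000` is an assumption based on MOVW taking a 16-bit immediate. The real function also tests `arm_arch_thumb2`, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function: nonzero when VALUE has no
   bits set above bit 15, i.e. it fits a 16-bit immediate field.  The
   mask is an assumption; the archived diff does not show it.  */
int
fits_movw_immediate (uint32_t value)
{
  return (value & 0xffff0000u) == 0;
}

/* A few spot checks; call this from main () to run them.  */
void
check_movw_examples (void)
{
  assert (fits_movw_immediate (0x0000u));
  assert (fits_movw_immediate (0xffffu));
  assert (!fits_movw_immediate (0x10000u));
}
```

Keeping the test in one predicate is what lets arm_gen_constant ask a single question ("can this be done in one insn?") instead of special-casing MOVW further down.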
[committed/vms]: do not use vms-dwarf2.o for gnu-ld
Hi, when gnu-ld is used, we do not need the extra vms-dwarf2.o file - which is needed only for the native vms linker. Committed on trunk. Tristan. 2011-04-20 Tristan Gingold ging...@adacore.com * config/alpha/vms.h (LINK_SPEC): Do not use vms-dwarf2.o for gnu-ld. === --- config/alpha/vms.h (revision 172769) +++ config/alpha/vms.h (working copy) @@ -329,11 +329,16 @@ } \ } while (0) +#undef LINK_SPEC +#if HAVE_GNU_LD +/* GNU-ld built-in linker script already handles the dwarf2 debug sections. */ +#define LINK_SPEC %{shared} %{v} +#else /* Link with vms-dwarf2.o if -g (except -g0). This causes the VMS link to pull all the dwarf2 debug sections together. */ -#undef LINK_SPEC #define LINK_SPEC %{g:-g vms-dwarf2.o%s} %{g0} %{g1:-g1 vms-dwarf2.o%s} \ %{g2:-g2 vms-dwarf2.o%s} %{g3:-g3 vms-dwarf2.o%s} %{shared} %{v} %{map} +#endif #undef STARTFILE_SPEC #define STARTFILE_SPEC \
Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)
Hi, On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote: Actually what happens here is that CCP devirtualizes by propagating the constructors and, due to Richard's new code to drop OBJ_TYPE_REF, we finally get a direct call. This is all good and desirable. I think a good solution would be to fold further and inline the thunk adjustment, just like the type-based devirtualization does. Even once I get far enough with my cgraph cleanup project to make cgraph represent thunks nicely, we would win if in these cases CCP and other passes simply inlined the this adjustment, like we do with type-based devirtualization already. Martin, I guess it is a matter of looking up the thunk info by the associated cgraph node alias and extending fold_stmts of passes that now drop the OBJ_TYPE_REF wrappers? Well, if you have a cgraph node then yes. But if the method is implemented in a different compilation unit you don't. And as I already said today on IRC, I don't think it is possible to tell whether a function is a thunk by looking at the decl alone (the front end has a flag for it as Jakub noted, though), let alone what kind of thunk it is. Well, you don't care about thunks residing in another unit/partition... Honza The more I think about this the more I would also like to make thunks as much like ordinary real functions as possible, with perhaps some kind of totally opaque decls/cgraph_nodes for the most obscure types which could be generated by assembly. Martin
Re: [RFA] MIPS 24K Errata Support
Catherine Moore c...@codesourcery.com writes: +Work around the 24K E48 Lost Data on Stores during Refill errata. I think this should either be: Work around the 24K E48 (@cite{Lost Data on Stores During Refill}) errata. or: Work around the 24K E48 (lost data on stores during refill) errata. Maybe the second is safer. @@ -479,7 +479,9 @@ (define_attr length (eq_attr move_type load,fpload) (symbol_ref mips_load_store_insns (operands[1], insn) * 4) (eq_attr move_type store,fpstore) - (symbol_ref mips_load_store_insns (operands[0], insn) * 4) + (cond [(eq (symbol_ref TARGET_FIX_24K) (const_int 0)) +(symbol_ref mips_load_store_insns (operands[0], insn) * 4)] +(symbol_ref mips_load_store_insns (operands[0], insn) * 4 + 4)) Keep the existing indentation (i.e. move the new block two spaces to the left). Sorry for being so picky... OK with those changes, thanks. Richard
Re: [google]Pass --save-temps to the assembler (issue4436049)
On Tue, Apr 19, 2011 at 20:32, Easwaran Raman era...@google.com wrote: The revised patch has a comment that this should be used with an assembler wrapper that can recognize --save-temps. Thanks. Will commit after testing finishes. Diego.
[vms/committed]: fix ICE on alpha-vms
Hi, This patch fixes a compiler crash for alpha-vms. Back-ends should not lie to the middle-end by defining macros to plain abort since the middle-end is entitled to infer properties from their existence. The correct thing to do is not to define the macros in the first place. Committed on trunk. Tristan. 2011-04-20 Eric Botcazou ebotca...@adacore.com * config/alpha/vms.h (ASM_OUTPUT_ADDR_DIFF_ELT): Do not redefine. *** gcc/config/alpha/vms.h.02011-02-19 16:51:47.0 +0100 --- gcc/config/alpha/vms.h 2011-02-19 16:52:07.0 +0100 *** typedef struct {int num_args; enum avms_ *** 202,208 asm (SECTION_OP \n\t.long #FUNC\n); #undef ASM_OUTPUT_ADDR_DIFF_ELT - #define ASM_OUTPUT_ADDR_DIFF_ELT(FILE, BODY, VALUE, REL) gcc_unreachable () #undef ASM_OUTPUT_ADDR_VEC_ELT #define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE) \ --- 202,207
Re: [pph] Namespaces, step 1. Trace formatting. (issue4433054)
http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c File gcc/cp/pph-streamer.c (right): http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode144 gcc/cp/pph-streamer.c:144: return; + if ((type == PPH_TRACE_TREE || type == PPH_TRACE_CHAIN) + !data flag_pph_tracer = 3) +return; Line up the predicates vertically. http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode172 gcc/cp/pph-streamer.c:172: fprintf (pph_logfile, , code=%s, tree_code_name[TREE_CODE (t)]); case PPH_TRACE_REF: + { + const_tree t = (const_tree) data; + if (t) + { + print_generic_expr (pph_logfile, CONST_CAST (union tree_node *, t), + 0); + fprintf (pph_logfile, , code=%s, tree_code_name[TREE_CODE (t)]); But how are we going to tell if this is a REF instead of a tree? The output seems identical to the PPH_TRACE_TREE case. http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h File gcc/cp/pph-streamer.h (right): http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode149 gcc/cp/pph-streamer.h:149: } pph_output_tree_lst (pph_stream *stream, tree t, bool ref_p) +{ + if (flag_pph_tracer = 2) +pph_stream_trace_tree (stream, t, ref_p); + lto_output_tree (stream-ob, t, ref_p); +} I don't really like all this code duplication. Wouldn't it be better if instead of having pph_output_tree_aux and pph_output_tree_lst, we added another argument to pph_output_tree? The argument would be an enum and we could have a default 'DONT_CARE' value. http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode298 gcc/cp/pph-streamer.h:298: pph_stream_trace_tree (stream, t, false); /* FIXME pph: always false? */ @@ -285,7 +295,7 @@ pph_input_tree (pph_stream *stream) { tree t = lto_input_tree (stream-ib, stream-data_in); if (flag_pph_tracer = 4) -pph_stream_trace_tree (stream, t); +pph_stream_trace_tree (stream, t, false); /* FIXME pph: always false? Yes, on input we can't tell if we read a reference or a real tree. 
We could, but not at this level. That's inside the actual LTO streaming code. http://codereview.appspot.com/4433054/
Re: [patch, ARM] PR48250, rehaul arm_legitimize_reload_address()
On Wed, 2011-04-20 at 23:06 +0800, Chung-Lin Tang wrote: On 2011/4/20 09:24 PM, Richard Sandiford wrote: Hi Chung-Lin, I'm seeing an ICE with this patch, specifically; Chung-Lin Tang clt...@codesourcery.com writes: + if (coproc_p) + low = SIGN_MAG_LOW_ADDR_BITS (val, 10); We generate: Reload 1: reload_out (V4SI) = (mem/c:V4SI (plus:SI (plus:SI (reg/f:SI 11 fp) (const_int -6144 [0xe800])) (const_int 1020 [0x3fc])) [43 %sfp+-5024 S16 A64]) but 1020 isn't a legitimate offset for V4SI: /* For quad modes, we restrict the constant offset to be slightly less than what the instruction format permits. We do this because for quad mode moves, we will actually decompose them into two separate double-mode reads or writes. INDEX must therefore be a valid (double-mode) offset and so should INDEX+8. */ if (TARGET_NEON VALID_NEON_QREG_MODE (mode)) return (code == CONST_INT INTVAL (index) 1016 INTVAL (index) -1024 (INTVAL (index) 3) == 0); A simple fix would be to use 9 instead of 10, but something a little more subtle might be preferred :-) Richard Oh dear, for some reason I mistakenly thought that NEON had a quad-word load/store, sorry :P Reducing from 10 to 9 may be a possible solution, if restricted to the necessary cases. For example: -low = SIGN_MAG_LOW_ADDR_BITS (val, 10); +{ + low = SIGN_MAG_LOW_ADDR_BITS (val, 10); + + /* NEON quad-word load/stores are made of two double-word accesses, + so the valid index range is reduced by 8. Treat as 9-bit range if + we go over it. */ + if (TARGET_NEON VALID_NEON_QREG_MODE (mode) low = 1016) +low = SIGN_MAG_LOW_ADDR_BITS (val, 9); +} To Richard Earnshaw, how do you think of a fix like this? Or should we just simply return false under this out-of-range case (it should be rare I hope). I don't think it matters a great deal. The above is fine. Note, that some targets don't have LDRD either. Do we do the right thing if we're going to fall back to two LDR instructions? R.
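The quad-mode restriction quoted above explains why 1020 is rejected. A minimal sketch of that range check follows; `neon_quad_offset_ok` is a hypothetical name, and the comparison operators are an assumption (the archive stripped them), chosen as `<`, `>` and `&` so that both INDEX and INDEX+8 remain valid double-mode offsets as the comment requires.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the quoted check: a quad-mode access is
   split into two double-mode accesses at INDEX and INDEX + 8, so the
   upper bound is 8 less than the double-mode limit, and the offset
   must be a multiple of 4.  */
bool
neon_quad_offset_ok (long index)
{
  return index < 1016 && index > -1024 && (index & 3) == 0;
}

/* Spot checks; call this from main () to run them.  */
void
check_neon_quad_offsets (void)
{
  assert (!neon_quad_offset_ok (1020)); /* the offset from the ICE */
  assert (neon_quad_offset_ok (1012));  /* largest accepted offset */
}
```

Under these assumptions the reload from the report, with its inner offset of 1020, fails the first comparison, which matches the observation that the 10-bit split produces an offset the address check refuses.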
Re: better wpa [1/n]: merge types during read-in
Hi, On Wed, 20 Apr 2011, Richard Guenther wrote: If t is a type, why fix up its field if it may not be the canonical variant? Because type merging to work sometimes requires already canonicalized fields, at least that's what I found in investigating why some types weren't merged that should have been. Hence I'm first canonicalizing all fields of everything and then see if something merged. That sounds like a bug in type-merging. You don't happen to have a small testcase? ;) cc1 was my testcase :-/ Think shared field_decl chains. I'll have fixed up the chain for one of the type pairs already and can later come to a type referring exactly the same field_decls again. But only in case the first one is already equal. What I wanted to say is that we shouldn't have partially shared chains, so if (TYPE_FIELDS (t) != TYPE_FIELDS (oldt)) for (...) if (TREE_CODE (f1) == FIELD_DECL) ... should be enough, no? Indeed. In fact, why restrict fixing up the cache to FIELD_DECLs and not also do it for TYPE_DECLs or FUNCTION_DECLs that may reside in this chain? non-FIELD_DECLs are removed in free_lang_data. But even more so I can remove the test for FIELD_DECL. Hmm, it's gross but seems to me still required for the diagnostic and to emit the VIEW_CONVERT_EXPR, at least for invalid input code. OTOH if the streamed out code ensures that a field_decl in a component_ref always is included in its DECL_CONTEXT, then the new merging should indeed make sure that this also holds after streaming in. Do we have testcases specifically trying this code? greping for mismatching in testsuite/ doesn't show anything relevant. The lto testsuite harness doesn't support dg-error/warning, so there are no testcases. There are testcases that ICEd (type verification) before introducing these fixups though. Okay, I'll leave investigating this to a follow up. Ciao, Michael.
Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)
On Wed, 20 Apr 2011, Jan Hubicka wrote: Hi, On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote: Actually what happens here is that CCP devirtualizes by propagating the constructors and, due to Richard's new code to drop OBJ_TYPE_REF, we finally get a direct call. This is all good and desirable. I think a good solution would be to fold further and inline the thunk adjustment, just like the type-based devirtualization does. Even once I get far enough with my cgraph cleanup project to make cgraph represent thunks nicely, we would win if in these cases CCP and other passes simply inlined the this adjustment, like we do with type-based devirtualization already. Martin, I guess it is a matter of looking up the thunk info by the associated cgraph node alias and extending fold_stmts of passes that now drop the OBJ_TYPE_REF wrappers? Well, if you have a cgraph node then yes. But if the method is implemented in a different compilation unit you don't. And as I already said today on IRC, I don't think it is possible to tell whether a function is a thunk by looking at the decl alone (the front end has a flag for it as Jakub noted, though), let alone what kind of thunk it is. Well, you don't care about thunks residing in another unit/partition... Sure you do - LTO might bring them into scope if you fold them to a direct call early. Richard.
[patch middle-end]: Missed optimization for (x ~y) | (~x y)
Hello, well the bonus points might gain somebody else ... But this adds a missing optimization for tree level implemented in fold-const. ChangeLog gcc/ 2011-04-20 Kai Tietz * fold-const.c (fold_binary_loc): Add handling for (X ~Y) | (~X Y) and (X !Y) | (!X Y) optimization to (X ^ Y). ChangeLog gcc/testsuite 2011-04-20 Kai Tietz * gcc.dg/binio-xor1.c: New test. * gcc.dg/binio-xor2.c: New test. * gcc.dg/binio-xor3.c: New test. * gcc.dg/binio-xor4.c: New test. * gcc.dg/binio-xor5.c: New test. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-pc-linux-gnu (multilib). Ok for apply? Regards, Kai Index: gcc/gcc/fold-const.c === --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200 +++ gcc/gcc/fold-const.c2011-04-20 17:11:22.901039400 +0200 @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc, reorder_operands_p (arg0, TREE_OPERAND (arg1, 0))) return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0)); + /* (X ~Y) | (~X Y) is X ^ Y */ + if (TREE_CODE (arg0) == BIT_AND_EXPR + TREE_CODE (arg1) == BIT_AND_EXPR) +{ + tree a0, a1, l0, l1, n0, n1; + + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); + + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); + + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0); + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1); + + if ((operand_equal_p (n0, a0, 0) + operand_equal_p (n1, a1, 0)) + || (operand_equal_p (n0, a1, 0) + operand_equal_p (n1, a0, 0))) + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1); + } + t1 = distribute_bit_expr (loc, code, type, arg0, arg1); if (t1 != NULL_TREE) return t1; @@ -12039,6 +12061,28 @@ fold_binary_loc (location_t loc, operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) return omit_one_operand_loc (loc, type, integer_one_node, arg0); + /* (X !Y) || (!X Y) is X ^ Y */ + if (TREE_CODE (arg0) == TREE_CODE (arg1) + (TREE_CODE 
(arg1) == TRUTH_AND_EXPR + || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR)) +{ + tree a0, a1, l0, l1, n0, n1; + + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0)); + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1)); + + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0)); + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1)); + + n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0); + n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1); + + if ((operand_equal_p (n0, a0, 0) + operand_equal_p (n1, a1, 0)) + || (operand_equal_p (n0, a1, 0) + operand_equal_p (n1, a0, 0))) + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1); + } goto truth_andor; case TRUTH_XOR_EXPR: Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c 2011-04-20 17:11:22.905039900 +0200 @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int +foo (int a, int b, int c) +{ + return ((a !b c) || (!a b c)); +} + +/* We expect to see bb N; confirm that, so that we know to count + it in the real test. */ +/* { dg-final { scan-tree-dump-times bb\[^\]* 5 optimized } } */ +/* { dg-final { scan-tree-dump-times \^ 1 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/gcc/testsuite/gcc.dg/binop-xor2.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ gcc/gcc/testsuite/gcc.dg/binop-xor2.c 2011-04-20 17:11:22.908540300 +0200 @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */ + +int +foo (int a, int b) +{ + return ((a ~b) | (~a b)); +} + +/* We expect to see bb N; confirm that, so that we know to count + it in the real test. 
*/ +/* { dg-final { scan-tree-dump-times bb\[^\]* 1 optimized } } */ +/* { dg-final { scan-tree-dump-times \^ 1 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ Index: gcc/gcc/testsuite/gcc.dg/binop-xor3.c === --- /dev/null 1970-01-01 00:00:00.0 + +++ gcc/gcc/testsuite/gcc.dg/binop-xor3.c 2011-04-20 17:11:22.911040600 +0200 @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized } */
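The two folds in this patch rest on the identities (X & ~Y) | (~X & Y) == X ^ Y and (X && !Y) || (!X && Y) == (X != 0) ^ (Y != 0); the operators are written out here on the assumption that the archive dropped the `&` characters from the subject and comments. The sketch below is a stand-alone check of those identities over a small input range, using hypothetical helper names; it does not exercise fold_binary_loc or say anything about the tree codes the patch builds.

```c
#include <assert.h>

/* Bitwise form matched by the first hunk.  */
unsigned
xor_via_and_or (unsigned x, unsigned y)
{
  return (x & ~y) | (~x & y);
}

/* Truth-value form matched by the second hunk.  */
int
logical_xor_via_and_or (int x, int y)
{
  return (x && !y) || (!x && y);
}

/* Count the inputs in [0, 255] x [0, 255] for which either rewritten
   form disagrees with a plain XOR.  */
int
count_xor_mismatches (void)
{
  int mismatches = 0;

  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        if (xor_via_and_or (x, y) != (x ^ y))
          mismatches++;
        if (logical_xor_via_and_or ((int) x, (int) y)
            != ((x != 0) ^ (y != 0)))
          mismatches++;
      }

  return mismatches;
}

/* Call this from main () to run the exhaustive check.  */
void
check_xor_identity (void)
{
  assert (count_xor_mismatches () == 0);
}
```

The check is exhaustive only for 8-bit operand values, but since the bitwise identity holds independently for every bit position, that range covers all per-bit cases.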
Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)
Hi, On Wed, Apr 20, 2011 at 04:38:25PM +0200, Jan Hubicka wrote: Hi, On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote: Actually what happens here is that CCP devirtualizes by propagating the constructors and, due to Richard's new code to drop OBJ_TYPE_REF, we finally get a direct call. This is all good and desirable. I think a good solution would be to fold further and inline the thunk adjustment, just like the type-based devirtualization does. Even once I get far enough with my cgraph cleanup project to make cgraph represent thunks nicely, we would win if in these cases CCP and other passes simply inlined the this adjustment, like we do with type-based devirtualization already. Martin, I guess it is a matter of looking up the thunk info by the associated cgraph node alias and extending fold_stmts of passes that now drop the OBJ_TYPE_REF wrappers? Well, if you have a cgraph node then yes. But if the method is implemented in a different compilation unit you don't. And as I already said today on IRC, I don't think it is possible to tell whether a function is a thunk by looking at the decl alone (the front end has a flag for it as Jakub noted, though), let alone what kind of thunk it is. Well, you don't care about thunks residing in another unit/partition... Unless you fold in early optimizations and LTO later, deciding to inline the function but forgetting about the thunk adjustment. Martin
libgo patch committed: Remove empty directory
I committed a patch to remove the now-empty directory libgo/go/crypto/block. Ian
[PATCH][ARM] Add support for ADDW and SUBW instructions
This patch adds basic support for the Thumb ADDW and SUBW instructions.

The patch permits the compiler to use the new instructions for constants that can be loaded with a single instruction (i.e. 16-bit unshifted), but does not support use of addw with split-constants; I have a patch for that coming soon.

This patch requires that my previously posted patch for MOVW is applied first.

OK?

Andrew

2011-04-20  Andrew Stubbs  a...@codesourcery.com

    gcc/
    * config/arm/arm-protos.h (const_ok_for_op): Add prototype.
    * config/arm/arm.c (const_ok_for_op): Add support for addw/subw.
    Remove prototype. Remove static function type.
    * config/arm/arm.md (*arm_addsi3): Add addw/subw support.
    Add arch attribute.
    (*arm_subsi3_insn): Add subw support.
    Add arch attribute.
    * config/arm/constraints.md (Pj, PJ): New constraints.

--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -46,6 +46,7 @@ extern bool arm_vector_mode_supported_p (enum machine_mode);
 extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
 extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
 extern int const_ok_for_arm (HOST_WIDE_INT);
+extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
                                HOST_WIDE_INT, rtx, rtx, int);
 extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *, rtx *);
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -82,7 +82,6 @@ inline static int thumb1_index_register_rtx_p (rtx, int);
 static bool arm_legitimate_address_p (enum machine_mode, rtx, bool);
 static int thumb_far_jump_used_p (void);
 static bool thumb_force_lr_save (void);
-static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 static rtx emit_sfm (int, int);
 static unsigned arm_size_return_regs (void);
 static bool arm_assemble_integer (rtx, unsigned int, int);
@@ -2453,7 +2452,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
 }

 /* Return true if I is a valid constant for the operation CODE.  */
-static int
+int
 const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
 {
   if (const_ok_for_arm (i))
@@ -2469,6 +2468,13 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
       return 0;

     case PLUS:
+      /* See if we can use addw or subw.  */
+      if (TARGET_THUMB2
+          && ((i & 0xfffff000) == 0
+              || ((-i) & 0xfffff000) == 0))
+        return 1;
+      /* else fall through.  */
+
     case COMPARE:
     case EQ:
     case NE:
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -707,21 +707,24 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split "*arm_addsi3"
-  [(set (match_operand:SI          0 "s_register_operand" "=r, k,r,r, k,r")
-        (plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k,r,rk,k,rk")
-                 (match_operand:SI 2 "reg_or_int_operand" "rI,rI,k,L, L,?n")))]
+  [(set (match_operand:SI          0 "s_register_operand" "=r, k,r,r, k, r, k,r, k, r")
+        (plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k,r,rk,k, rk,k,rk,k, rk")
+                 (match_operand:SI 2 "reg_or_int_operand" "rI,rI,k,Pj,Pj,L, L,PJ,PJ,?n")))]
   "TARGET_32BIT"
   "@
    add%?\\t%0, %1, %2
    add%?\\t%0, %1, %2
    add%?\\t%0, %2, %1
+   addw%?\\t%0, %1, %2
+   addw%?\\t%0, %1, %2
    sub%?\\t%0, %1, #%n2
    sub%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
    #"
   "TARGET_32BIT
    && GET_CODE (operands[2]) == CONST_INT
-   && !(const_ok_for_arm (INTVAL (operands[2]))
-        || const_ok_for_arm (-INTVAL (operands[2])))
+   && !const_ok_for_op (INTVAL (operands[2]), PLUS)
    && (reload_completed || !arm_eliminable_register (operands[1]))"
   [(clobber (const_int 0))]
@@ -730,8 +733,9 @@
                       operands[1], 0);
   DONE;
   "
-  [(set_attr "length" "4,4,4,4,4,16")
-   (set_attr "predicable" "yes")]
+  [(set_attr "length" "4,4,4,4,4,4,4,4,4,16")
+   (set_attr "predicable" "yes")
+   (set_attr "arch" "*,*,*,t2,t2,*,*,t2,t2,*")]
 )

 (define_insn_and_split "*thumb1_addsi3"
@@ -1184,28 +1188,33 @@

 ; ??? Check Thumb-2 split length
 (define_insn_and_split "*arm_subsi3_insn"
-  [(set (match_operand:SI           0 "s_register_operand" "=r,r,rk,r,r")
-        (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k,?n,r")
-                  (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, r,?n")))]
+  [(set (match_operand:SI           0 "s_register_operand" "=r,r,rk,r, k, r,r")
+        (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k, rk,k, ?n,r")
+                  (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, Pj,Pj,r,?n")))]
   "TARGET_32BIT"
   "@
    rsb%?\\t%0, %2, %1
    sub%?\\t%0, %1, %2
    sub%?\\t%0, %1, %2
+   subw%?\\t%0, %1, %2
+   subw%?\\t%0, %1, %2
    #
    #"
   "&& ((GET_CODE (operands[1]) == CONST_INT
-       && !const_ok_for_arm (INTVAL (operands[1])))
+       && !(const_ok_for_arm (INTVAL (operands[1]))
+            || satisfies_constraint_Pj (operands[2])))
        || (GET_CODE (operands[2]) == CONST_INT
-           && !const_ok_for_arm (INTVAL (operands[2]
+           && !(const_ok_for_arm (INTVAL
Re: [patch, ARM] PR48250, rehaul arm_legitimize_reload_address()
On 2011/4/20 11:12 PM, Richard Earnshaw wrote:
> On Wed, 2011-04-20 at 23:06 +0800, Chung-Lin Tang wrote:
>> On 2011/4/20 09:24 PM, Richard Sandiford wrote:
>>> Hi Chung-Lin,
>>>
>>> I'm seeing an ICE with this patch, specifically;
>>>
>>> Chung-Lin Tang clt...@codesourcery.com writes:
>>>> +      if (coproc_p)
>>>> +        low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>>>
>>> We generate:
>>>
>>> Reload 1: reload_out (V4SI) = (mem/c:V4SI (plus:SI (plus:SI (reg/f:SI 11 fp)
>>>             (const_int -6144 [0xffffe800]))
>>>         (const_int 1020 [0x3fc])) [43 %sfp+-5024 S16 A64])
>>>
>>> but 1020 isn't a legitimate offset for V4SI:
>>>
>>>   /* For quad modes, we restrict the constant offset to be slightly less
>>>      than what the instruction format permits.  We do this because for
>>>      quad mode moves, we will actually decompose them into two separate
>>>      double-mode reads or writes.  INDEX must therefore be a valid
>>>      (double-mode) offset and so should INDEX+8.  */
>>>   if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
>>>     return (code == CONST_INT
>>>             && INTVAL (index) < 1016
>>>             && INTVAL (index) > -1024
>>>             && (INTVAL (index) & 3) == 0);
>>>
>>> A simple fix would be to use 9 instead of 10, but something a little
>>> more subtle might be preferred :-)
>>>
>>> Richard
>>
>> Oh dear, for some reason I mistakenly thought that NEON had a quad-word
>> load/store, sorry :P
>>
>> Reducing from 10 to 9 may be a possible solution, if restricted to the
>> necessary cases. For example:
>>
>> -    low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>> +    {
>> +      low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>> +
>> +      /* NEON quad-word load/stores are made of two double-word accesses,
>> +         so the valid index range is reduced by 8.  Treat as 9-bit range if
>> +         we go over it.  */
>> +      if (TARGET_NEON && VALID_NEON_QREG_MODE (mode) && low >= 1016)
>> +        low = SIGN_MAG_LOW_ADDR_BITS (val, 9);
>> +    }
>>
>> To Richard Earnshaw, what do you think of a fix like this? Or should we
>> just simply return false under this out-of-range case (it should be rare
>> I hope).
>
> I don't think it matters a great deal. The above is fine.
>
> Note, that some targets don't have LDRD either. Do we do the right thing
> if we're going to fall back to two LDR instructions?
>
> R.

The current non-TARGET_LDRD case goes through this path:

  ...
  else
    /* For pre-ARMv5TE (without ldrd), we use ldm/stm(db/da/ib)
       to access doublewords.  The supported load/store offsets are
       -8, -4, and 4, which we try to produce here.  */
    low = ((val & 0xf) ^ 0x8) - 0x8;

which uses ldm/stm. This should be safe.

As for pre-ARMv4 ldrh, this is special cased as:

  if (arm_arch4)
    low = SIGN_MAG_LOW_ADDR_BITS (val, 8);
  else
    {
      /* The storehi/movhi_bytes fallbacks can use only
         [-4094,+4094] of the full ldrb/strb index range.  */
      low = SIGN_MAG_LOW_ADDR_BITS (val, 12);
      if (low == 4095 || low == -4095)
        return false;
    }

Although to be frank, I haven't really tested a pre-ARMv4 config; not very easy to do so in an EABI world :)

I'll take the above NEON QREG mode fix as approved.

Chung-Lin
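For readers unfamiliar with the `((val & 0xf) ^ 0x8) - 0x8` idiom in the non-LDRD path, here is a minimal standalone sketch. It only shows the arithmetic: the expression sign-extends the low four bits of `val` into the range [-8, 7], so a word-aligned `val` yields -8, -4, 0 or 4. The helper names `low4_signed` and `high_part` are hypothetical and are not taken from arm.c.

```c
#include <stdint.h>

/* Sign-extend the low 4 bits of VAL.  The XOR flips bit 3 and the
   subtraction undoes the flip while propagating the sign, so the
   result always lies in [-8, 7].  */
static int32_t low4_signed (int32_t val)
{
  return ((val & 0xf) ^ 0x8) - 0x8;
}

/* Hypothetical helper: the remainder that would be reloaded into a
   base register, chosen so that high_part (val) + low4_signed (val)
   == val.  */
static int32_t high_part (int32_t val)
{
  return val - low4_signed (val);
}
```

With this split, an offset such as 12 becomes a high part of 16 and a low part of -4, which is the kind of small signed displacement the quoted comment is aiming for.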
Re: [PATCH][ARM] Remove redundant code in arm.c
On Wed, 2011-04-20 at 13:55 +0100, Andrew Stubbs wrote:
> This patch removes some redundant code that caused me some confusion.
>
> It's not possible to construct a constant from multiple ORN instructions,
> just as it's not possible to do it with multiple AND instructions.
>
> OK?
>
> Andrew

OK.

R.
Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
On 04/20/2011 08:22 AM, Kai Tietz wrote:
> +      if (TREE_CODE (arg0) == BIT_AND_EXPR
> +          && TREE_CODE (arg1) == BIT_AND_EXPR)
> +        {
> +          tree a0, a1, l0, l1, n0, n1;
> +
> +          a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> +          a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> +
> +          l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> +          l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> +
> +          n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
> +          n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
> +
> +          if ((operand_equal_p (n0, a0, 0)
> +               && operand_equal_p (n1, a1, 0))
> +              || (operand_equal_p (n0, a1, 0)
> +                  && operand_equal_p (n1, a0, 0)))
> +            return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);

First, you typoed BIT_XOR_EXPR in this first block.

Second, I don't see how you're arbitrarily choosing L0 and N1 in the expansion. If you write the expression the other way around,

  (~x & y) | (x & ~y)

don't you wind up with

  (~x ^ ~y)

? Or do the extra NOT expressions get folded away anyway?

> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
> +          && (TREE_CODE (arg1) == TRUTH_AND_EXPR
> +              || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))

I don't believe you want to apply this transformation with ANDIF.

r~
Re: [PATCH][ARM] Add support for ADDW and SUBW instructions
On 20/04/11 16:27, Andrew Stubbs wrote:
> (*arm_subsi3_insn): Add subw support.

Oh, I should probably say that I've added subw support to arm_subsi3 even though it's not obvious that anything will ever use this.

The existing implementation of arm_subsi3 (sans 'w') supports immediates, so I added subw to match.

If there are any objections, I expect I can remove that hunk of the patch.

Andrew
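As an illustration of the range test that the PLUS case of const_ok_for_op adds, here is a small standalone sketch. It assumes that ADDW and SUBW each take an unsigned 12-bit immediate, so the mask is taken to be 0xfffff000; the function name `fits_addw_or_subw` is hypothetical, and the `TARGET_THUMB2` guard from the patch is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if I can be added with a single ADDW (I in 0..4095) or
   with a single SUBW of the negated value (-I in 0..4095).
   Assumption: both instructions encode an unsigned 12-bit immediate.  */
static bool fits_addw_or_subw (int32_t i)
{
  uint32_t u = (uint32_t) i;
  uint32_t neg = 0u - u;  /* unsigned negation, well defined for INT32_MIN */

  return (u & 0xfffff000u) == 0 || (neg & 0xfffff000u) == 0;
}
```

Under that assumption the accepted set is exactly -4095..4095, which is why a constant such as 0x101 (not encodable as an ARM modified immediate) no longer needs to be split on Thumb-2.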
Re: [PATCH][ARM] Clean up movw support
On Wed, 2011-04-20 at 15:20 +0100, Andrew Stubbs wrote:
> This patch doesn't change the compiler behaviour; it merely moves the
> support for MOVW's 16-bit immediate constant to const_ok_for_op.
>
> This patch is broken out of my previous (rejected) Thumb2-constants patch.
> I'll be posting v2 of that patch soon, and this clean up will be required
> then.
>
> OK?
>
> Andrew
>
> 2011-04-20  Andrew Stubbs  a...@codesourcery.com
>
>     gcc/
>     * config/arm/arm.c (arm_gen_constant): Move mowv support
>     (const_ok_for_op): ... to here.

it's movw (not mowv :)

Otherwise OK.

R.
[PATCH] Fix PR47892
This fixes PR47892, we are failing to if-convert function calls, even those we can vectorize. This includes pow() which we canonicalize x*x to with -ffast-math (yeah, I know ...). No reason to not if-convert at least const builtins.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-04-20  Richard Guenther  rguent...@suse.de

    PR tree-optimization/47892
    * tree-if-conv.c (if_convertible_stmt_p): Const builtins
    are if-convertible.

    * gcc.dg/vect/fast-math-ifcvt-1.c: New testcase.

Index: gcc/tree-if-conv.c
===================================================================
*** gcc/tree-if-conv.c  (revision 172759)
--- gcc/tree-if-conv.c  (working copy)
*************** if_convertible_stmt_p (gimple stmt, VEC
*** 719,724 ****
--- 719,740 ----
      case GIMPLE_ASSIGN:
        return if_convertible_gimple_assign_stmt_p (stmt, refs);

+     case GIMPLE_CALL:
+       {
+         tree fndecl = gimple_call_fndecl (stmt);
+         if (fndecl)
+           {
+             int flags = gimple_call_flags (stmt);
+             if ((flags & ECF_CONST)
+                 && !(flags & ECF_LOOPING_CONST_OR_PURE)
+                 /* We can only vectorize some builtins at the moment,
+                    so restrict if-conversion to those.  */
+                 && DECL_BUILT_IN (fndecl))
+               return true;
+           }
+         return false;
+       }
+
      default:
        /* Don't know what to do with 'em so don't do anything.  */
        if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c  (revision 0)
--- gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c  (revision 0)
***************
*** 0 ****
--- 1,18 ----
+ /* PR 47892 */
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_float } */
+ /* { dg-require-effective-target vect_condition } */
+
+ void
+ bestseries9 (float * __restrict__ arr, int len)
+ {
+   int i;
+   for (i = 0; i < len; ++i)
+     {
+       float or = arr[i];
+       arr[i] = (or > 0.0f) * (2 - or * or);
+     }
+ }
+
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
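To make the term "if-convert" concrete, the following standalone sketch shows, at source level, the rewrite that the pass is now allowed to perform around a side-effect-free call. It is only an illustration under stated assumptions: `square` is a hypothetical stand-in for a const builtin, and the two loop functions are invented names, not code from the patch or the testcase.

```c
/* Stand-in for a const builtin: no side effects, and the result
   depends only on the argument, so calling it speculatively is safe.  */
static float square (float x)
{
  return x * x;
}

/* Branchy form: the call sits under a condition, so the loop body has
   control flow that blocks straightforward vectorization.  */
static void clamp_branchy (float *arr, int len)
{
  for (int i = 0; i < len; ++i)
    {
      float v = arr[i];
      if (v > 0.0f)
        arr[i] = 2.0f - square (v);
      else
        arr[i] = 0.0f;
    }
}

/* If-converted form: the call is executed unconditionally and the
   condition only selects which value is stored.  */
static void clamp_ifcvt (float *arr, int len)
{
  for (int i = 0; i < len; ++i)
    {
      float v = arr[i];
      float t = 2.0f - square (v);
      arr[i] = v > 0.0f ? t : 0.0f;
    }
}
```

The two loops store the same values only because the call has no side effects and cannot loop forever, which is what the `ECF_CONST` and `ECF_LOOPING_CONST_OR_PURE` checks in the patch establish.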
Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011/4/20 Richard Henderson r...@redhat.com:
> On 04/20/2011 08:22 AM, Kai Tietz wrote:
>> +      if (TREE_CODE (arg0) == BIT_AND_EXPR
>> +          && TREE_CODE (arg1) == BIT_AND_EXPR)
>> +        {
>> +          tree a0, a1, l0, l1, n0, n1;
>> +
>> +          a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> +          a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> +          l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> +          l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> +          n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> +          n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> +          if ((operand_equal_p (n0, a0, 0)
>> +               && operand_equal_p (n1, a1, 0))
>> +              || (operand_equal_p (n0, a1, 0)
>> +                  && operand_equal_p (n1, a0, 0)))
>> +            return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>
> First, you typoed BIT_XOR_EXPR in this first block.

Duh, corrected.

> Second, I don't see how you're arbitrarily choosing L0 and N1 in the
> expansion. If you write the expression the other way around,
>
>   (~x & y) | (x & ~y)
>
> don't you wind up with
>
>   (~x ^ ~y)
>
> ? Or do the extra NOT expressions get folded away anyway?

No, I didn't wind up here. First, ~X ^ ~Y is in result the same as X ^ Y, and for this I used here the explicit folding. Well, it might be a bit slower, but it has the advantage to compare equal transformations in doubt.

>> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> +          && (TREE_CODE (arg1) == TRUTH_AND_EXPR
>> +              || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
>
> I don't believe you want to apply this transformation with ANDIF.

Yes, it is superfluous. I removed it.

> r~

Adjusted patch attached.

Kai

Index: gcc/gcc/fold-const.c
===================================================================
--- gcc.orig/gcc/fold-const.c  2011-04-20 17:10:39.478091900 +0200
+++ gcc/gcc/fold-const.c  2011-04-20 17:41:23.427677200 +0200
@@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
           && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
         return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));

+      /* (X & ~Y) | (~X & Y) is X ^ Y */
+      if (TREE_CODE (arg0) == BIT_AND_EXPR
+          && TREE_CODE (arg1) == BIT_AND_EXPR)
+        {
+          tree a0, a1, l0, l1, n0, n1;
+
+          a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+          a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+          l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+          l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+          n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
+          n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
+
+          if ((operand_equal_p (n0, a0, 0)
+               && operand_equal_p (n1, a1, 0))
+              || (operand_equal_p (n0, a1, 0)
+                  && operand_equal_p (n1, a0, 0)))
+            return fold_build2_loc (loc, BIT_XOR_EXPR, type, l0, n1);
+        }
+
       t1 = distribute_bit_expr (loc, code, type, arg0, arg1);
       if (t1 != NULL_TREE)
         return t1;
@@ -12039,6 +12061,27 @@ fold_binary_loc (location_t loc,
           && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
         return omit_one_operand_loc (loc, type, integer_one_node, arg0);

+      /* (X && !Y) || (!X && Y) is X ^ Y */
+      if (TREE_CODE (arg0) == TREE_CODE (arg1)
+          && TREE_CODE (arg1) == TRUTH_AND_EXPR)
+        {
+          tree a0, a1, l0, l1, n0, n1;
+
+          a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+          a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+          l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+          l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+          n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0);
+          n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1);
+
+          if ((operand_equal_p (n0, a0, 0)
+               && operand_equal_p (n1, a1, 0))
+              || (operand_equal_p (n0, a1, 0)
+                  && operand_equal_p (n1, a0, 0)))
+            return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+        }
       goto truth_andor;

     case TRUTH_XOR_EXPR:
Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c
===================================================================
--- /dev/null  1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c  2011-04-20 17:11:22.905039900 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return ((a && !b && c) || (!a && b && c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+   it in the real test.  */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index:
Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
On 04/20/2011 08:50 AM, Kai Tietz wrote:
> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
> +          && TREE_CODE (arg1) == TRUTH_AND_EXPR)

Ok with these both explicitly testing TRUTH_AND_EXPR now.

r~
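The bitwise identities this thread relies on can be checked outside the compiler. The sketch below is plain C, not fold-const.c code, and all function names in it are hypothetical; it verifies that `(x & ~y) | (~x & y)` equals `x ^ y` in either operand order, and that `~x ^ ~y` equals `x ^ y`, which is the point raised about choosing L0 and N1.

```c
#include <stdbool.h>

/* The pattern as written in the subject line.  */
static unsigned or_of_ands (unsigned x, unsigned y)
{
  return (x & ~y) | (~x & y);
}

/* The same pattern with the two AND operands swapped.  */
static unsigned or_of_ands_swapped (unsigned x, unsigned y)
{
  return (~x & y) | (x & ~y);
}

/* True if all three forms agree with a plain XOR for this input.  */
static bool identity_holds (unsigned x, unsigned y)
{
  unsigned want = x ^ y;

  return or_of_ands (x, y) == want
         && or_of_ands_swapped (x, y) == want
         && (~x ^ ~y) == want;
}

/* Exhaustive check over all pairs of 8-bit values.  */
static bool identity_holds_for_all_bytes (void)
{
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (!identity_holds (x, y))
        return false;
  return true;
}
```

Because each bit position is independent, the 8-bit exhaustive check already covers every per-bit combination; the wider-input case follows from that.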
Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011/4/20 Jakub Jelinek ja...@redhat.com:
> On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
>> --- gcc.orig/gcc/fold-const.c  2011-04-20 17:10:39.478091900 +0200
>> +++ gcc/gcc/fold-const.c  2011-04-20 17:11:22.901039400 +0200
>> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
>>           && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
>>         return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>>
>> +      /* (X & ~Y) | (~X & Y) is X ^ Y */
>> +      if (TREE_CODE (arg0) == BIT_AND_EXPR
>> +          && TREE_CODE (arg1) == BIT_AND_EXPR)
>> +        {
>> +          tree a0, a1, l0, l1, n0, n1;
>> +
>> +          a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> +          a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> +          l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> +          l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> +          n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> +          n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> +          if ((operand_equal_p (n0, a0, 0)
>> +               && operand_equal_p (n1, a1, 0))
>> +              || (operand_equal_p (n0, a1, 0)
>> +                  && operand_equal_p (n1, a0, 0)))
>> +            return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>> +        }
>> +
>
> I must say I don't like first folding/building new trees, then testing
> and then maybe optimizing, that is slow and creates unnecessary garbage
> in the likely case the optimization can't do anything.
>
> Wouldn't something like:
>
>   int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
>   int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
>   if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
>       && TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
>       && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
>                           TREE_OPERAND (arg1, 1 - arg1_not), 0)
>       && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
>                           TREE_OPERAND (arg0, 1 - arg0_not), 0))
>     return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
>                             fold_convert_loc (loc, type,
>                                               TREE_OPERAND (arg0, 1 - arg0_not)),
>                             fold_convert_loc (loc, type,
>                                               TREE_OPERAND (arg1, 1 - arg1_not)));
>
> work better?
>
> Jakub

Well, as a special case we could use that, but here we also have to handle integer values, so I used fold to make sure I get the inverse. Also there might be some transformations which otherwise might not be caught, like !(X || Y) == !X && !Y ...

Regards,
Kai

--
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination
Fix PR48703: segfault in mangler due to -g
Hi,

as noted in the bug trail the fix for PR48207 broke compilation of C++ programs with -g. This variant fixes the bug too without breaking -g. Basically we have to set assembler names early also for TYPE_DECLs, we can't rely on the frontend's langhook to do that after free_lang_data.

Okay for trunk assuming regstrapping on x86_64-linux works?

Ciao,
Michael.

    PR debug/48703
    * dwarf2out.c (retry_incomplete_types): Export. Clear
    incomplete_types.
    * dwarf2out.h (retry_incomplete_types): Declare.
    * tree.c (need_assembler_name_p): Also handle TYPE_DECLs.
    (free_lang_data_in_cgraph): Call retry_incomplete_types.
    (free_lang_data): Reset set_decl_assembler_name langhook.
    * Makefile.in (tree.o): Depend on dwarf2out.h.

Index: tree.c
===================================================================
--- tree.c  (revision 172769)
+++ tree.c  (working copy)
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.
 #include "except.h"
 #include "debug.h"
 #include "intl.h"
+#include "dwarf2out.h"

 /* Tree code classes.  */

@@ -4500,7 +4501,8 @@ need_assembler_name_p (tree decl)
 {
   /* Only FUNCTION_DECLs and VAR_DECLs are considered.  */
   if (TREE_CODE (decl) != FUNCTION_DECL
-      && TREE_CODE (decl) != VAR_DECL)
+      && TREE_CODE (decl) != VAR_DECL
+      && TREE_CODE (decl) != TYPE_DECL)
     return false;

   /* If DECL already has its assembler name set, it does not need a
@@ -4538,6 +4540,11 @@ need_assembler_name_p (tree decl)
         return false;
     }

+  if (TREE_CODE (decl) == TYPE_DECL)
+    {
+      if (TYPE_DECL_SUPPRESS_DEBUG (decl))
+        return false;
+    }
   return true;
 }

@@ -5111,6 +5118,8 @@ free_lang_data_in_cgraph (void)
   FOR_EACH_VEC_ELT (tree, fld.decls, i, t)
     assign_assembler_name_if_neeeded (t);

+  retry_incomplete_types ();
+
   /* Traverse every decl found freeing its language data.  */
   FOR_EACH_VEC_ELT (tree, fld.decls, i, t)
     free_lang_data_in_decl (t);
@@ -5182,6 +5191,7 @@ free_lang_data (void)
      name and only produce assembler names for local symbols.  Or rather
      make sure we never call decl_assembler_name on local symbols and
      devise a separate, middle-end private scheme for it.  */
+  lang_hooks.set_decl_assembler_name = lhd_set_decl_assembler_name;

   /* Reset diagnostic machinery.  */
   diagnostic_starter (global_dc) = default_tree_diagnostic_starter;
Index: dwarf2out.c
===================================================================
--- dwarf2out.c  (revision 172769)
+++ dwarf2out.c  (working copy)
@@ -6575,7 +6575,6 @@ static dw_die_ref force_type_die (tree);
 static dw_die_ref setup_namespace_context (tree, dw_die_ref);
 static dw_die_ref declare_in_namespace (tree, dw_die_ref);
 static struct dwarf_file_data * lookup_filename (const char *);
-static void retry_incomplete_types (void);
 static void gen_type_die_for_member (tree, tree, dw_die_ref);
 static void gen_generic_params_dies (tree);
 static void gen_tagged_type_die (tree, dw_die_ref, enum debug_info_usage);
@@ -18497,15 +18496,17 @@ gen_entry_point_die (tree decl, dw_die_r
 /* Walk through the list of incomplete types again, trying once more to
    emit full debugging info for them.  */

-static void
+void
 retry_incomplete_types (void)
 {
   int i;
-
-  for (i = VEC_length (tree, incomplete_types) - 1; i >= 0; i--)
-    if (should_emit_struct_debug (VEC_index (tree, incomplete_types, i),
+  VEC(tree,gc) *types = incomplete_types;
+  incomplete_types = NULL;
+  for (i = VEC_length (tree, types) - 1; i >= 0; i--)
+    if (should_emit_struct_debug (VEC_index (tree, types, i),
                                   DINFO_USAGE_DIR_USE))
-      gen_type_die (VEC_index (tree, incomplete_types, i), comp_unit_die ());
+      gen_type_die (VEC_index (tree, types, i), comp_unit_die ());
+  types = NULL;
 }

 /* Determine what tag to use for a record type.  */
Index: dwarf2out.h
===================================================================
--- dwarf2out.h  (revision 172769)
+++ dwarf2out.h  (working copy)
@@ -25,6 +25,8 @@ extern void dwarf2out_cfi_begin_epilogue
 extern void dwarf2out_frame_debug_restore_state (void);
 extern void dwarf2out_flush_queued_reg_saves (void);

+extern void retry_incomplete_types (void);
+
 extern void debug_dwarf (void);
 struct die_struct;
 extern void debug_dwarf_die (struct die_struct *);
Index: Makefile.in
===================================================================
--- Makefile.in  (revision 172769)
+++ Makefile.in  (working copy)
@@ -2354,7 +2354,7 @@ langhooks.o : langhooks.c $(CONFIG_H) $(
 tree.o: tree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
    all-tree.def $(FLAGS_H) $(FUNCTION_H) $(PARAMS_H) \
    toplev.h $(DIAGNOSTIC_CORE_H) $(GGC_H) $(HASHTAB_H) $(TARGET_H) output.h $(TM_P_H) \
-   langhooks.h gt-tree.h $(TREE_INLINE_H) tree-iterator.h \
+   langhooks.h gt-tree.h $(TREE_INLINE_H) tree-iterator.h dwarf2out.h \
[PATCH] Optimize (x * 8) | 5 and (x << 3) ^ 3 to use lea (PR target/48688)
Hi!

This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in that case, when the low bits are known to be all 0, is like plus).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-04-20  Jakub Jelinek  ja...@redhat.com

    PR target/48688
    * config/i386/i386.md (*lea_general_4): New define_insn_and_split.

    * gcc.target/i386/pr48688.c: New test.

--- gcc/config/i386/i386.md.jj  2011-04-19 14:08:55.000000000 +0200
+++ gcc/config/i386/i386.md  2011-04-20 14:34:50.000000000 +0200
@@ -6646,6 +6646,40 @@ (define_insn_and_split "*lea_general_3_z
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
+
+(define_insn_and_split "*lea_general_4"
+  [(set (match_operand:SWI 0 "register_operand" "=r")
+        (any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
+                                (match_operand:SWI 2 "const_int_operand" "n"))
+                    (match_operand 3 "const_int_operand" "n")))]
+  "(<MODE>mode == DImode
+    || <MODE>mode == SImode
+    || !TARGET_PARTIAL_REG_STALL
+    || optimize_function_for_size_p (cfun))
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
+       <= ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx pat;
+  if (<MODE>mode != DImode)
+    operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (Pmode, operands[1]);
+  operands[2] = GEN_INT (1 << INTVAL (operands[2]));
+  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+                       INTVAL (operands[3]));
+  if (Pmode != SImode && <MODE>mode != DImode)
+    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
+  DONE;
+}
+  [(set_attr "type" "lea")
+   (set (attr "mode")
+      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
+        (const_string "SI")
+        (const_string "DI")))])

 ;; Subtract instructions

--- gcc/testsuite/gcc.target/i386/pr48688.c.jj  2011-04-20 14:55:37.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/pr48688.c  2011-04-20 14:57:03.000000000 +0200
@@ -0,0 +1,24 @@
+/* PR target/48688 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int fn1 (int x) { return (x << 3) | 5; }
+int fn2 (int x) { return (x * 8) | 5; }
+int fn3 (int x) { return (x << 3) + 5; }
+int fn4 (int x) { return (x * 8) + 5; }
+int fn5 (int x) { return (x << 3) ^ 5; }
+int fn6 (int x) { return (x * 8) ^ 5; }
+long fn7 (long x) { return (x << 3) | 5; }
+long fn8 (long x) { return (x * 8) | 5; }
+long fn9 (long x) { return (x << 3) + 5; }
+long fn10 (long x) { return (x * 8) + 5; }
+long fn11 (long x) { return (x << 3) ^ 5; }
+long fn12 (long x) { return (x * 8) ^ 5; }
+long fn13 (unsigned x) { return (x << 3) | 5; }
+long fn14 (unsigned x) { return (x * 8) | 5; }
+long fn15 (unsigned x) { return (x << 3) + 5; }
+long fn16 (unsigned x) { return (x * 8) + 5; }
+long fn17 (unsigned x) { return (x << 3) ^ 5; }
+long fn18 (unsigned x) { return (x * 8) ^ 5; }
+
+/* { dg-final { scan-assembler-not "\[ \t\]x?or\[bwlq\]\[ \t\]" } } */

    Jakub
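The arithmetic fact behind this splitter, that OR and XOR behave like PLUS when the constant only touches bits the shift has cleared, can be sketched in standalone C. This is an illustration, not the i386.md logic; the function name `or_xor_match_plus` is hypothetical, and the sketch uses the strict bound `y < (1 << shift)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if (x << shift) | y and (x << shift) ^ y both equal
   (x << shift) + y.  This holds whenever y < (1u << shift), because
   the shifted value has zeros in every bit that y can set, so no
   carry can occur.  SHIFT is expected to be 1, 2 or 3, matching the
   scale factors 2, 4 and 8 that lea supports.  */
static bool or_xor_match_plus (uint32_t x, unsigned shift, uint32_t y)
{
  uint32_t scaled = x << shift;
  uint32_t sum = scaled + y;

  return (scaled | y) == sum && (scaled ^ y) == sum;
}
```

For example, with a shift of 3 and a constant of 5 the three operations agree for every `x`, while a constant of 8 overlaps bit 3 of the shifted value and the equality can fail.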
Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011/4/20 Richard Henderson r...@redhat.com:
> On 04/20/2011 08:50 AM, Kai Tietz wrote:
>> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> +          && TREE_CODE (arg1) == TRUTH_AND_EXPR)
>
> Ok with these both explicitly testing TRUTH_AND_EXPR now.
>
> r~

Committed at revision 172776 with explicit testing for TRUTH_AND_EXPR.

Kai
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
This would work if there is a way to set Werror=coverage-mismatch without having to explicitly set the option classification as DK_ERROR. Does this mechanism exist?

Thanks,

David

On Tue, Apr 19, 2011 at 12:52 AM, Richard Guenther richard.guent...@gmail.com wrote:
> On Tue, Apr 19, 2011 at 9:13 AM, Xinliang David Li davi...@google.com wrote:
>> -Wcoverage-mismatch is enabled by default, and the warning is promoted
>> to error by default. However in the current implementation -Wno-error
>> cannot demote the error back to warning. The patch was ported from one
>> contributed by Neil.
>>
>> OK for trunk after regression testing?
>
> I am sure there is a better way to achieve this, like making
> Werror=coverage-mismatch the default. Joseph?
>
> Richard.
>
>> 2011-04-18  Neil Vachharajani  nvach...@gmail.com
>>
>>     * flags.c: New flag variable.
>>     * opts.c (common_handle_options): Set flag_werror_set.
>>     * opts-global.c (decode_options): Delay Werror decision for
>>     Wcoverage-mismatch until after options are parsed.
>>
>> The following test case can be added, but the test harness does not
>> like the extra warnings -- how can they be pruned?
>>
>> Thanks,
>>
>> David
>>
>> /* { dg-options "-O2 -Wcoverage-mismatch -Wno-error" } */
>>
>> int __attribute__((noinline)) bar (void)
>> {
>> }
>>
>> #ifdef _PROFILE_USE
>> int foo (int i)
>> {
>>   if (i)
>>     bar ();
>>   else
>>     bar ();
>>
>>   return 0;
>> }
>> #else
>> int foo (int i)
>> {
>>   if (i)
>>     bar ();
>>   return 0;
>> }
>> #endif
>>
>> int main(int argc, char **argv)
>> {
>>   foo (argc);
>>   return 0;
>> }
[PATCH] make Ada runtime function building use build_function_type_list
This patch changes most of the uses of build_function_type in the Ada front end to use build_function_type_list. There are a handful of build_function_type calls left; replacing those will have to wait until we get a build_function_type_{n,vec} interface.

Tested on x86_64-unknown-linux-gnu. OK to commit?

-Nathan

    * gcc-interface/trans.c (gigi): Call build_function_type_list
    instead of build_function_type. Adjust calls to...
    (build_raise_check): ...this. Do not take a void_tree parameter.
    Call build_function_type_list instead of build_function_type.

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 378f88c..05e2842 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -214,7 +214,7 @@ static void set_expr_location_from_node (tree, Node_Id);
 static bool set_end_locus_from_node (tree, Node_Id);
 static void set_gnu_expr_location_from_node (tree, Node_Id);
 static int lvalue_required_p (Node_Id, tree, bool, bool, bool);
-static tree build_raise_check (int, tree, enum exception_info_kind);
+static tree build_raise_check (int, enum exception_info_kind);

 /* Hooks for debug info back-ends, only supported and used in a restricted set
    of configurations.  */
@@ -236,7 +236,7 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
       Entity_Id standard_exception_type, Int gigi_operating_mode)
 {
   Entity_Id gnat_literal;
-  tree long_long_float_type, exception_type, t;
+  tree long_long_float_type, exception_type, t, ftype;
   tree int64_type = gnat_type_for_size (64, 0);
   struct elab_info *info;
   int i;
@@ -344,47 +344,39 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
   DECL_IGNORED_P (t) = 1;
   save_gnu_tree (gnat_literal, t, false);

-  void_ftype = build_function_type (void_type_node, NULL_TREE);
+  void_ftype = build_function_type_list (void_type_node, NULL_TREE);
   ptr_void_ftype = build_pointer_type (void_ftype);

   /* Now declare run-time functions.  */
-  t = tree_cons (NULL_TREE, void_type_node, NULL_TREE);
+  ftype = build_function_type_list (ptr_void_type_node, sizetype, NULL_TREE);

   /* malloc is a function declaration tree for a function to allocate
      memory.  */
   malloc_decl
     = create_subprog_decl (get_identifier ("__gnat_malloc"), NULL_TREE,
-                           build_function_type (ptr_void_type_node,
-                                                tree_cons (NULL_TREE,
-                                                           sizetype, t)),
-                           NULL_TREE, false, true, true, NULL, Empty);
+                           ftype, NULL_TREE, false, true, true, NULL, Empty);
   DECL_IS_MALLOC (malloc_decl) = 1;

   /* malloc32 is a function declaration tree for a function to allocate
      32-bit memory on a 64-bit system.  Needed only on 64-bit VMS.  */
   malloc32_decl
     = create_subprog_decl (get_identifier ("__gnat_malloc32"), NULL_TREE,
-                           build_function_type (ptr_void_type_node,
-                                                tree_cons (NULL_TREE,
-                                                           sizetype, t)),
-                           NULL_TREE, false, true, true, NULL, Empty);
+                           ftype, NULL_TREE, false, true, true, NULL, Empty);
   DECL_IS_MALLOC (malloc32_decl) = 1;

   /* free is a function declaration tree for a function to free memory.  */
+  ftype = build_function_type_list (void_type_node,
+                                    ptr_void_type_node, NULL_TREE);
   free_decl
     = create_subprog_decl (get_identifier ("__gnat_free"), NULL_TREE,
-                           build_function_type (void_type_node,
-                                                tree_cons (NULL_TREE,
-                                                           ptr_void_type_node,
-                                                           t)),
-                           NULL_TREE, false, true, true, NULL, Empty);
+                           ftype, NULL_TREE, false, true, true, NULL, Empty);

   /* This is used for 64-bit multiplication with overflow checking.  */
+  ftype = build_function_type_list (int64_type,
+                                    int64_type, int64_type, NULL_TREE);
   mulv64_decl
     = create_subprog_decl (get_identifier ("__gnat_mulv64"), NULL_TREE,
-                           build_function_type_list (int64_type, int64_type,
-                                                     int64_type, NULL_TREE),
-                           NULL_TREE, false, true, true, NULL, Empty);
+                           ftype, NULL_TREE, false, true, true, NULL, Empty);

   /* Name of the _Parent field in tagged record types.  */
   parent_name_id = get_identifier (Get_Name_String (Name_uParent));
@@ -401,61 +393,54 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
   jmpbuf_ptr_type = build_pointer_type (jmpbuf_type);

   /*
[PATCH] use build_function_type_list a few places in the ObjC frontend
Just as $SUBJECT suggests. All the other uses of build_function_type_list are tied up with get_arg_type_list and will therefore have to wait for a better FUNCTION_TYPE builder.

Tested on x86_64-unknown-linux-gnu. IIUC the changes to objc-next-runtime-abi-02.c would not be tested on that platform, so it would be helpful to have a Darwin tester double-check my work.

OK to commit?

-Nathan

    * objc-act.c (synth_module_prologue): Call build_function_type_list
    instead of build_function_type.
    * objc-next-runtime-abi-02.c (next_runtime_02_initialize): Likewise.

diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index b48f179..0b6b793 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -2995,8 +2995,8 @@ synth_module_prologue (void)
   build_fast_enumeration_state_template ();

   /* void objc_enumeration_mutation (id) */
-  type = build_function_type (void_type_node,
-                              tree_cons (NULL_TREE, objc_object_type, NULL_TREE));
+  type = build_function_type_list (void_type_node,
+                                   objc_object_type, NULL_TREE);
   objc_enumeration_mutation_decl
     = add_builtin_function (TAG_ENUMERATION_MUTATION, type, 0, NOT_BUILT_IN,
                             NULL, NULL_TREE);
diff --git a/gcc/objc/objc-next-runtime-abi-02.c b/gcc/objc/objc-next-runtime-abi-02.c
index 4ce0159..f3cf359 100644
--- a/gcc/objc/objc-next-runtime-abi-02.c
+++ b/gcc/objc/objc-next-runtime-abi-02.c
@@ -492,9 +492,8 @@ static void next_runtime_02_initialize (void)
   build_v2_ehtype_template ();

   /* void * objc_begin_catch (void *) */
-  type = build_function_type (ptr_type_node,
-                              tree_cons (NULL_TREE, ptr_type_node,
-                                         OBJC_VOID_AT_END));
+  type = build_function_type_list (ptr_type_node,
+                                   ptr_type_node, NULL_TREE);

   objc2_begin_catch_decl = add_builtin_function ("objc_begin_catch",
                                                  type, 0, NOT_BUILT_IN,
@@ -502,14 +501,13 @@ static void next_runtime_02_initialize (void)
   TREE_NOTHROW (objc2_begin_catch_decl) = 0;

   /* void objc_end_catch () */
-  type = build_function_type (void_type_node, OBJC_VOID_AT_END);
+  type = build_function_type_list (void_type_node, NULL_TREE);
   objc2_end_catch_decl = add_builtin_function ("objc_end_catch",
                                                type, 0, NOT_BUILT_IN,
                                                NULL, NULL_TREE);
   TREE_NOTHROW (objc2_end_catch_decl) = 0;

   /* void objc_exception_rethrow (void) */
-  type = build_function_type (void_type_node, OBJC_VOID_AT_END);
   objc_rethrow_exception_decl = add_builtin_function ("objc_exception_rethrow",
                                                       type, 0, NOT_BUILT_IN,
Re: [pph] Namespaces, step 1. Trace formatting. (issue4433054)
On 4/20/11, dnovi...@google.com <dnovi...@google.com> wrote:
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c
> File gcc/cp/pph-streamer.c (right):
>
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode144
> gcc/cp/pph-streamer.c:144: return;
> +  if ((type == PPH_TRACE_TREE || type == PPH_TRACE_CHAIN)
> +      && !data && flag_pph_tracer <= 3)
> +    return;
>
> Line up the predicates vertically.

Can you be more specific?

> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode172
> gcc/cp/pph-streamer.c:172: fprintf (pph_logfile, ", code=%s",
> tree_code_name[TREE_CODE (t)]);
>      case PPH_TRACE_REF:
> +      {
> +        const_tree t = (const_tree) data;
> +        if (t)
> +          {
> +            print_generic_expr (pph_logfile, CONST_CAST (union tree_node *, t),
> +                                0);
> +            fprintf (pph_logfile, ", code=%s", tree_code_name[TREE_CODE (t)]);
>
> But how are we going to tell if this is a REF instead of a tree?

The type_s array is indexed by PPH_TRACE_REF.

> The output seems identical to the PPH_TRACE_TREE case.

Well, the case in those branches is identical.  The splitting was a bit
preemptive, as I was planning to see what changes I needed after seeing
what items were refs.  None actually were refs, so the distinction
isn't there.

> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h
> File gcc/cp/pph-streamer.h (right):
>
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode149
> gcc/cp/pph-streamer.h:149: }
>  pph_output_tree_lst (pph_stream *stream, tree t, bool ref_p)
> +{
> +  if (flag_pph_tracer >= 2)
> +    pph_stream_trace_tree (stream, t, ref_p);
> +  lto_output_tree (stream->ob, t, ref_p);
> +}
>
> I don't really like all this code duplication.  Wouldn't it be better if
> instead of having pph_output_tree_aux and pph_output_tree_lst, we added
> another argument to pph_output_tree?  The argument would be an enum and
> we could have a default 'DONT_CARE' value.

I'm not sure that would save much code.  It would induce some
runtime overhead (unless the compiler specialized routines).
It would also change the callbacks.

> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode298
> gcc/cp/pph-streamer.h:298: pph_stream_trace_tree (stream, t, false); /*
> FIXME pph: always false? */
> @@ -285,7 +295,7 @@ pph_input_tree (pph_stream *stream)
>  {
>    tree t = lto_input_tree (stream->ib, stream->data_in);
>    if (flag_pph_tracer >= 4)
> -    pph_stream_trace_tree (stream, t);
> +    pph_stream_trace_tree (stream, t, false); /* FIXME pph: always false?
>
> Yes, on input we can't tell if we read a reference or a real tree.  We
> could, but not at this level.  That's inside the actual LTO streaming
> code.

It would be nice to have an indication, but it is not something
I want to do now.

> http://codereview.appspot.com/4433054/

-- 
Lawrence Crowl
[Patch,AVR]: Solve PR42210
This solves some missed optimizations that can be seen when moving
around bits.

There are four combiner patterns that operate on registers and one that
uses them as intermediate patterns and works on I/O.  Even if just an
intermediate pattern matches, it is still an improvement because it
avoids a shift.

Tested on some home-brew examples.

Ok if I see no regressions?

Johann

2011-04-20  Georg-Johann Lay  <a...@gjlay.de>

	PR target/42210
	* config/avr/avr.md ("*movbitqi.1-6.a", "*movbitqi.1-6.b",
	"*movbitqi.0", "*movbitqi.7", "*movbitqi.io"): New insns.

Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 172770)
+++ config/avr/avr.md	(working copy)
@@ -3388,3 +3388,81 @@ (define_insn "fmulsu"
 	clr __zero_reg__"
   [(set_attr "length" "3")
    (set_attr "cc" "clobber")])
+
+
+;; Some combiner patterns dealing with bits.
+;; See PR42210
+
+;; Move bit $3.0 into bit $0.$4
+(define_insn "*movbitqi.1-6.a"
+  [(set (match_operand:QI 0 "register_operand"                             "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand"             "0")
+                        (match_operand:QI 2 "single_zero_operand"          "n"))
+                (and:QI (ashift:QI (match_operand:QI 3 "register_operand"  "r")
+                                   (match_operand:QI 4 "const_int_operand" "n"))
+                        (match_operand:QI 5 "single_one_operand"           "n"))))]
+  "optimize
+   && INTVAL(operands[4]) == exact_log2 (INTVAL(operands[5]) & GET_MODE_MASK (QImode))
+   && INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,0\;bld %0,%4"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $3.0 into bit $0.$4
+;; Variation of above. Unfortunately, there is no canonicalized representation
+;; of moving around bits.  So what we see here depends on how user writes down
+;; bit manipulations.
+(define_insn "*movbitqi.1-6.b"
+  [(set (match_operand:QI 0 "register_operand"                            "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand"            "0")
+                        (match_operand:QI 2 "single_zero_operand"         "n"))
+                (ashift:QI (and:QI (match_operand:QI 3 "register_operand" "r")
+                                   (const_int 1))
+                           (match_operand:QI 4 "const_int_operand"        "n"))))]
+  "optimize
+   && INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,0\;bld %0,%4"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $3.0 into bit $0.0.
+;; For bit 0, combiner generates slightly different pattern.
+(define_insn "*movbitqi.0"
+  [(set (match_operand:QI 0 "register_operand"                     "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand"     "0")
+                        (match_operand:QI 2 "single_zero_operand"  "n"))
+                (and:QI (match_operand:QI 3 "register_operand"     "r")
+                        (const_int 1))))]
+  "optimize
+   && 0 == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,0\;bld %0,0"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $2.0 into bit $0.7.
+;; For bit 7, combiner generates slightly different pattern
+(define_insn "*movbitqi.7"
+  [(set (match_operand:QI 0 "register_operand"                      "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand"      "0")
+                        (const_int 127))
+                (ashift:QI (match_operand:QI 2 "register_operand"   "r")
+                           (const_int 7))))]
+  "optimize"
+  "bst %2,0\;bld %0,7"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Combiner transforms above four pattern into ZERO_EXTRACT if it sees MEM
+;; and input/output match.  We provide a special pattern for this, because
+;; in contrast to a IN/BST/BLD/OUT sequence we need less registers and the
+;; operation on I/O is atomic.
+(define_insn "*movbitqi.io"
+  [(set (zero_extract:QI (mem:QI (match_operand 0 "low_io_address_operand" "n"))
+                         (const_int 1)                ;; width
+                         (match_operand 1 "const_int_operand"             "n")) ;; pos
+        (match_operand:QI 2 "register_operand"                            "r"))]
+  "optimize
+   && IN_RANGE (INTVAL(operands[1]), 0, 7)"
+  "sbrc %2,0\;sbi %m0-0x20,%1\;sbrs %2,0\;cbi %m0-0x20,%1"
+  [(set_attr "length" "4")
+   (set_attr "cc" "none")])
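
For readers who want to see the kind of source code these patterns are aimed at, here is a minimal C sketch. In the RTL above the source operand is either shifted left by the bit position and then masked, or masked with 1 and then shifted, so in both shapes the bit taken from the source is its bit 0. The helper names insert_bit_a and insert_bit_b are hypothetical and not part of the patch, and whether a given compiler version really combines either form into a bst/bld pair is not something this sketch can promise.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the patch.  Both insert bit 0 of SRC
   into bit POS (0..7) of DST and leave the other bits of DST unchanged.  */

/* Shift first, then mask: the shape matched by "*movbitqi.1-6.a".  */
static uint8_t
insert_bit_a (uint8_t dst, uint8_t src, unsigned pos)
{
  uint8_t one = (uint8_t) (1u << pos);   /* mask with a single one bit */
  return (uint8_t) ((dst & (uint8_t) ~one) | ((uint8_t) (src << pos) & one));
}

/* Mask first, then shift: the shape matched by "*movbitqi.1-6.b".  */
static uint8_t
insert_bit_b (uint8_t dst, uint8_t src, unsigned pos)
{
  uint8_t one = (uint8_t) (1u << pos);   /* mask with a single one bit */
  return (uint8_t) ((dst & (uint8_t) ~one) | (uint8_t) ((src & 1u) << pos));
}
```

The two functions compute the same value for every input, which illustrates the remark in the patch that there is no canonical way to write a bit move and that the combiner therefore sees different RTL depending on how the user spelled it.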
Re: [patch] Do not generate discriminator directive in strict mode
On 04/19/2011 06:40 AM, Eric Botcazou wrote:
> -  if (SUPPORTS_DISCRIMINATOR && discriminator != 0)
> +  if (SUPPORTS_DISCRIMINATOR
> +      && discriminator != 0
> +      && (dwarf_version >= 4 || !dwarf_strict))
>      fprintf (asm_out_file, " discriminator %d", discriminator);

How is this not redundant with the existing

  /* The discriminator column was added in dwarf4.  Simplify the below
     by simply removing it if we're not supposed to output it.  */
  if (dwarf_version < 4 && dwarf_strict)
    discriminator = 0;

check near the top of the function?


r~
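
The redundancy argument can be checked mechanically. The following is a minimal sketch, assuming only the two conditions quoted in this message; the function names are hypothetical stand-ins and not the real dwarf2out.c code, and SUPPORTS_DISCRIMINATOR is left out because it appears identically in both versions of the test.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the logic discussed above; these are not
   the real dwarf2out.c functions.  */

/* Early normalization: drop the discriminator in strict pre-DWARF4 mode.  */
static int
normalize_discriminator (int discriminator, int dwarf_version,
                         bool dwarf_strict)
{
  if (dwarf_version < 4 && dwarf_strict)
    return 0;
  return discriminator;
}

/* The emission test before the proposed patch.  */
static bool
emit_simple (int discriminator)
{
  return discriminator != 0;
}

/* The emission test with the extra version/strict predicate.  */
static bool
emit_guarded (int discriminator, int dwarf_version, bool dwarf_strict)
{
  return discriminator != 0 && (dwarf_version >= 4 || !dwarf_strict);
}

/* Return true if both tests agree for every combination tried, given
   that the discriminator was normalized first.  */
static bool
tests_agree (void)
{
  for (int version = 2; version <= 5; version++)
    for (int strict = 0; strict <= 1; strict++)
      for (int disc = 0; disc <= 3; disc++)
        {
          int d = normalize_discriminator (disc, version, strict != 0);
          if (emit_simple (d) != emit_guarded (d, version, strict != 0))
            return false;
        }
  return true;
}
```

Once the discriminator has been zeroed in the strict pre-DWARF4 case, the extra predicate can only be false when the discriminator is already 0, so the two tests cannot differ.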
Re: Improve stack layout heuristic.
On Wed, Apr 20, 2011 at 6:53 AM, Michael Matz m...@suse.de wrote: Hi, On Tue, 19 Apr 2011, Easwaran Raman wrote: That is correct but is also what the use of stack_vars[u].representative achieves alone, ... I am adding a check to that effect. ... without any check. @@ -596,7 +581,8 @@ if (vb-conflicts) { EXECUTE_IF_SET_IN_BITMAP (vb-conflicts, 0, u, bi) - add_stack_var_conflict (a, stack_vars[u].representative); + if (stack_vars[u].next == EOC stack_vars[u].representative == u) + add_stack_var_conflict (a, u); BITMAP_FREE (vb-conflicts); } } What's your objective with this change? I find the original code clearer. Let us say we try to merge 'a' to 'b' and 'a' has conflicts with many members of an existing partition C. It is not necessary to add all those conflicts to 'b' since they will be never considered in the call to union_stack_vars. Right, that's why I was objecting to your initial change. a I agree that my initial change - adding a conflict with u - was wrong. The original code (adding stack_vars[u].representative to the conflicts of A) made sure the target conflict bitmap only got representatives added. In my above example, it is not even necessary to add a conflict between 'b' and representative(C) since it is already in a partition. But you're right - not adding that conflict doesn't actually reduce the size of bit maps. Reverting back to what was there originally. Thanks, Easwaran That's why I was asking why you changed this area at all. I was motivated by your comment on bit-vector bloat to try this, but if this affects readability I'll happily revert back to what it was before. Ciao, Michael. 
Index: gcc/testsuite/gcc.dg/stack-layout-2.c === --- gcc/testsuite/gcc.dg/stack-layout-2.c (revision 0) +++ gcc/testsuite/gcc.dg/stack-layout-2.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-expand } */ +void bar( char *); +int foo() +{ + int i=0; + { +char a[8000]; +bar(a); +i += a[0]; + } + { +char a[8192]; +char b[32]; +bar(a); +i += a[0]; +bar(b); +i += a[0]; + } + return i; +} +/* { dg-final { scan-rtl-dump size 8192 expand } } */ +/* { dg-final { scan-rtl-dump size 32 expand } } */ Index: gcc/cfgexpand.c === --- gcc/cfgexpand.c (revision 171954) +++ gcc/cfgexpand.c (working copy) @@ -158,11 +158,6 @@ /* The Variable. */ tree decl; - /* The offset of the variable. During partitioning, this is the - offset relative to the partition. After partitioning, this - is relative to the stack frame. */ - HOST_WIDE_INT offset; - /* Initially, the size of the variable. Later, the size of the partition, if this variable becomes it's partition's representative. */ HOST_WIDE_INT size; @@ -267,7 +262,6 @@ v = stack_vars[stack_vars_num]; v-decl = decl; - v-offset = 0; v-size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1); /* Ensure that all variables have size, so that a != b for any two variables that are simultaneously live. */ @@ -403,9 +397,9 @@ return (int)largeb - (int)largea; /* Secondary compare on size, decreasing */ - if (sizea sizeb) -return -1; if (sizea sizeb) +return -1; + if (sizea sizeb) return 1; /* Tertiary compare on true alignment, decreasing. */ @@ -564,28 +558,19 @@ /* A subroutine of partition_stack_vars. The UNION portion of a UNION/FIND partitioning algorithm. Partitions A and B are known to be non-conflicting. - Merge them into a single partition A. + Merge them into a single partition A. */ - At the same time, add OFFSET to all variables in partition B. At the end - of the partitioning process we've have a nice block easy to lay out within - the stack frame. 
*/ - static void -union_stack_vars (size_t a, size_t b, HOST_WIDE_INT offset) +union_stack_vars (size_t a, size_t b) { - size_t i, last; struct stack_var *vb = stack_vars[b]; bitmap_iterator bi; unsigned u; - /* Update each element of partition B with the given offset, - and merge them into partition A. */ - for (last = i = b; i != EOC; last = i, i = stack_vars[i].next) -{ - stack_vars[i].offset += offset; - stack_vars[i].representative = a; -} - stack_vars[last].next = stack_vars[a].next; + gcc_assert (stack_vars[b].next == EOC); + /* Add B to A's partition. */ + stack_vars[b].next = stack_vars[a].next; + stack_vars[b].representative = a; stack_vars[a].next = b; /* Update the required alignment of partition A to account for B. */ @@ -605,16 +590,13 @@ partitions constrained by the interference graph. The overall algorithm used is as follows: - Sort the objects by size. + Sort the objects by size in descending order. For each
Re: [PATCH] Stop in note_eh_region_may_contain_throw after ERT_MUST_NOT_THROW (PR tree-optimization/48611)
On 04/18/2011 02:35 PM, Jakub Jelinek wrote:
> 	PR tree-optimization/48611
> 	* tree-eh.c (note_eh_region_may_contain_throw): Don't propagate
> 	beyond ERT_MUST_NOT_THROW region.

Ok.


r~
unnecessary test before free changes committed
FYI, I've just pushed the following two change sets. I verified that make check on x86_64 produced the same set of 92 failures without as with my changes. However, when I ran make check MALLOC_PERTURB_=0 MALLOC_CHECK_=0, I saw only 91 failures. (normally those MALLOC_*_ variables are set to nonzero values in my environment) This was the culprit: FAIL: gcc.dg/matrix/transpose-3.c execution,-fprofile-use -fipa-matrix-reorg -fdump-ipa-matrix-reorg -O3 -fwhole-program -fno-tree-fre From 7e50b781d25170cf5bbe5f6247607c5dca879009 Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Mon, 3 Jan 2011 16:52:37 +0100 Subject: [PATCH 1/2] discourage unnecessary use of if before free * README.Portability: Explain why if (P) free (P) is best avoided. --- gcc/README.Portability | 27 --- 1 files changed, 20 insertions(+), 7 deletions(-) diff --git a/gcc/README.Portability b/gcc/README.Portability index 32a33e2..4101a2f 100644 --- a/gcc/README.Portability +++ b/gcc/README.Portability @@ -51,14 +51,28 @@ foo (bar, ) needs to be coded in some other way. -free and realloc - +Avoid unnecessary test before free +-- -Some implementations crash upon attempts to free or realloc the null -pointer. Thus if mem might be null, you need to write +Since SunOS 4 stopped being a reasonable portability target, +(which happened around 2007) there has been no need to guard +against free (NULL). Thus, any guard like the following +constitutes a redundant test: + + if (P) +free (P); + +It is better to avoid the test.[*] +Instead, simply free P, regardless of whether it is NULL. + +[*] However, if your profiling exposes a test like this in a +performance-critical loop, say where P is nearly always NULL, and +the cost of calling free on a NULL pointer would be prohibitively +high, consider using __builtin_expect, e.g., like this: + + if (__builtin_expect (ptr != NULL, 0)) +free (ptr); - if (mem) -free (mem); Trigraphs @@ -194,4 +208,3 @@ o Passing incorrect types to fprintf and friends. 
o Adding a function declaration for a module declared in another file to a .c file instead of to a .h file. - -- 1.7.5.rc2.295.g19c42 From 08544935e6fcfd6a1d1cba6d302ccede02e13681 Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Fri, 15 Apr 2011 20:47:40 +0200 Subject: [PATCH 2/2] remove useless if-before-free tests Change if (E) free (E); to free (E); everywhere except in the libgo/, intl/, zlib/ and classpath/ directories. Also transform equivalent variants like if (E != NULL) free (E); and allow an extra cast on the argument to free. Otherwise, the tested and freed E expressions must be identical, modulo white space. --- gcc/ChangeLog | 39 + gcc/ada/ChangeLog |4 ++ gcc/ada/initialize.c|3 +- gcc/c-family/ChangeLog |7 +++- gcc/c-family/c-format.c |6 +-- gcc/calls.c | 15 ++ gcc/cfgcleanup.c|3 +- gcc/collect2.c |3 +- gcc/config/i386/i386.c |3 +- gcc/config/mcore/mcore.c|3 +- gcc/coverage.c |3 +- gcc/cp/ChangeLog|4 ++ gcc/cp/tree.c |3 +- gcc/cse.c |6 +-- gcc/cselib.c|3 +- gcc/df-core.c | 15 ++ gcc/fortran/ChangeLog |7 +++ gcc/fortran/expr.c |3 +- gcc/fortran/gfortranspec.c |5 +- gcc/fortran/interface.c |3 +- gcc/fortran/trans-openmp.c |3 +- gcc/function.c |3 +- gcc/gcc.c | 15 ++ gcc/gcov.c |6 +-- gcc/gensupport.c| 12 ++ gcc/graphite-clast-to-gimple.c |3 +- gcc/graphite-sese-to-poly.c |3 +- gcc/haifa-sched.c |3 +- gcc/ipa-prop.c |3 +- gcc/ipa-pure-const.c|3 +- gcc/ipa-reference.c |3 +- gcc/ira-costs.c | 12 ++ gcc/ira.c | 18 +++- gcc/java/ChangeLog |6 ++- gcc/java/jcf-parse.c|3 +- gcc/matrix-reorg.c |9 +--- gcc/prefix.c|3 +- gcc/profile.c |3 +- gcc/reload1.c |6 +-- gcc/sched-deps.c|3 +- gcc/sel-sched-ir.c |3 +- gcc/sese.c |6 +-- gcc/tree-data-ref.c |6 +-- gcc/tree-eh.c |3 +- gcc/tree-ssa-coalesce.c |3 +- gcc/tree-ssa-live.c |6 +--
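
As a small illustration of the README.Portability advice in the first patch above, here is a sketch of a cleanup routine written both ways. The struct and function names are hypothetical and not taken from GCC; __builtin_expect is a GCC extension (also accepted by Clang), so the second variant is not portable ISO C.

```c
#include <stdlib.h>

/* Hypothetical example type, not taken from GCC.  */
struct buffer
{
  char *data;
};

/* Preferred form: no "if (b->data)" guard is needed, because the C
   standard defines free (NULL) as a no-op.  */
static void
buffer_release (struct buffer *b)
{
  free (b->data);
  b->data = NULL;
}

/* Variant for a profiled hot path where DATA is nearly always NULL:
   keep the test, but tell the compiler that the call is unlikely.  */
static void
buffer_release_hot (struct buffer *b)
{
  if (__builtin_expect (b->data != NULL, 0))
    free (b->data);
  b->data = NULL;
}
```

Both functions leave the buffer in the same state; the second only makes sense when profiling has shown the unguarded call to be too expensive.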
Re: Allow union variables to share stack slots with -fno-strict-aliasing (issue4444051)
On Wed, Apr 20, 2011 at 2:12 AM, Eric Botcazou <ebotca...@adacore.com> wrote:
>> 2011-04-19  Easwaran Raman  <era...@google.com>
>>
>>	* gcc/testsuite/gcc.dg/stack-layout-1.c: New
>>	* gcc/cfgexpand.c (add_alias_set_conflicts): Add conflicts with a
>>	variable containing union type only with -fstrict-aliasing.
>
> You need an entry for each relevant ChangeLog, without prefixes:
>
> 2011-04-20  Easwaran Raman  <era...@google.com>
>
>	* cfgexpand.c (add_alias_set_conflicts): Add conflicts with a variable
>	containing union type only with -fstrict-aliasing.
>
>
> 2011-04-20  Easwaran Raman  <era...@google.com>
>
>	* gcc.dg/stack-layout-1.c: New test.
>
> --
> Eric Botcazou

Thanks. I've added them to the respective ChangeLog files and committed
the patch.

-Easwaran
Re: [patch] Do not generate discriminator directive in strict mode
> How is this not redundant with the existing
>
>   /* The discriminator column was added in dwarf4.  Simplify the below
>      by simply removing it if we're not supposed to output it.  */
>   if (dwarf_version < 4 && dwarf_strict)
>     discriminator = 0;
>
> check near the top of the function?

Obviously I missed this recent change, sorry.

So the question is: would the change be appropriate for the release
branches, where we emit the directive unconditionally, i.e. the 4.5 and
4.6 branches, or would mine be safer for them?  This directive apparently
confuses (some versions of) the Wind River debugger.

-- 
Eric Botcazou
Re: FDO usability: pid handling
Please review the latest patch. SPEC2k FDO testing pass. Thanks, David On Wed, Apr 20, 2011 at 11:22 AM, Xinliang David Li davi...@google.com wrote: Here is the revised patch. Basic FDO testing went fine. I still saw the ipa-inline assertion in SPEC FDO. Will run it when the regression is fixed. Thanks, David On Tue, Apr 19, 2011 at 5:33 PM, Jan Hubicka hubi...@ucw.cz wrote: On Tue, Apr 19, 2011 at 4:49 PM, Jan Hubicka hubi...@ucw.cz wrote: Actually, among all the choices, funcdef_no is probably the most dense one -- it is for function decl with definition only. In LIPO, the Yes, funddef_no is densiest, but we don't really need great density here (in many other places we index arrays by cgraph_uid - it is intended for that purpose and we pay some attention to recycle unused nodes). That does not mean it is right to use sparse ids:) DECL_UID will be the worst amongst them. Sure, that is why I suggested cgraph-uid. That one is kept dense and it also tracks cgraph node creation order. Unlike pid it counts also functions w/o bodies. We only care to avoid divergence in the indexes in between instrumentation and feedback compilation. With the IPA pass organizatoin the compiler doesn't really diverge until the profile pass, so it seems to me that all should be safe. When I said 'fragile' -- I meant it depends on the optimization pass phase ordering which can change in the future. Well, that is the case of all of them (passes can create function bodies that can make funcdef_no also diverge). This is the case of couple passes already, especially OMP lowering and friends. Ok, I will throw away pid and use funcdef_no for now. For future replacement for the function ids, please consider the following desired properties: 1) The id sequence does not depend on optimization passes -- only depend on source/parsing order; It depends on optimization, too. 
This is why we actually have cgraph-order that is used for -fno-toplevel-reorder and is similar to funcdef_no, but assigned at finalization time. 2) It is dense and sequential for defined functions a) it has proven to be very useful to use nice looking, sequential ids in triaging coverage mismatch related problems (the tree dump should also show the function id); You get the cgraph uids in the dumps already. b) it can be very useful in bug triaging via binary search by specifying ranges of function ids (to enable optimizations etc). But as you wish, we can process with fundef_no first and then discuss removal of that field later. Honza Thanks, David Honza David Honza Index: cgraph.c === --- cgraph.c (revision 172781) +++ cgraph.c (working copy) @@ -142,9 +142,6 @@ int cgraph_max_uid; /* Maximal uid used in cgraph edges. */ int cgraph_edge_max_uid; -/* Maximal pid used for profiling */ -int cgraph_max_pid; - /* Set when whole unit has been analyzed so we can access global info. */ bool cgraph_global_info_ready = false; @@ -472,7 +469,6 @@ cgraph_create_node_1 (void) struct cgraph_node *node = cgraph_allocate_node (); node-next = cgraph_nodes; - node-pid = -1; node-order = cgraph_order++; if (cgraph_nodes) cgraph_nodes-previous = node; @@ -1827,8 +1823,7 @@ dump_cgraph_node (FILE *f, struct cgraph struct cgraph_edge *edge; int indirect_calls_count = 0; - fprintf (f, %s/%i(%i), cgraph_node_name (node), node-uid, - node-pid); + fprintf (f, %s/%i, cgraph_node_name (node), node-uid); dump_addr (f, @, (void *)node); if (DECL_ASSEMBLER_NAME_SET_P (node-decl)) fprintf (f, (asm: %s), IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node-decl))); Index: cgraph.h === --- cgraph.h (revision 172781) +++ cgraph.h (working copy) @@ -200,9 +200,6 @@ struct GTY((chain_next (%h.next), chai /* Ordering of all cgraph nodes. */ int order; - /* unique id for profiling. 
pid is not suitable because of different - number of cfg nodes with -fprofile-generate and -fprofile-use */ - int pid; enum ld_plugin_symbol_resolution resolution; /* Set when function must be output for some reason. The primary @@ -472,7 +469,6 @@ extern GTY(()) struct cgraph_node *cgrap extern GTY(()) int cgraph_n_nodes; extern GTY(()) int cgraph_max_uid; extern GTY(()) int cgraph_edge_max_uid; -extern GTY(()) int cgraph_max_pid; extern bool cgraph_global_info_ready; enum cgraph_state { @@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct void compute_inline_parameters (struct cgraph_node *); cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *); +void cgraph_init_node_map (void); +void cgraph_del_node_map (void); /* Create a new static variable of type TYPE. */ tree add_new_static_var (tree type); Index: value-prof.c
[PATCH] use build_function_type_list in the alpha backend
As $SUBJECT suggests.  Tested with cross to alpha-elf.  OK to commit?

-Nathan

	* config/alpha/alpha.c (alpha_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 5e85e2b..237e9b3 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -6409,7 +6409,7 @@ alpha_init_builtins (void)
   implicit_built_in_decls[(int) BUILT_IN_FWRITE_UNLOCKED] = NULL_TREE;
 #endif
 
-  ftype = build_function_type (dimode_integer_type_node, void_list_node);
+  ftype = build_function_type_list (dimode_integer_type_node, NULL_TREE);
   alpha_add_builtins (zero_arg_builtins, ARRAY_SIZE (zero_arg_builtins),
                       ftype);
 
@@ -6424,7 +6424,7 @@ alpha_init_builtins (void)
   alpha_add_builtins (two_arg_builtins, ARRAY_SIZE (two_arg_builtins),
                       ftype);
 
-  ftype = build_function_type (ptr_type_node, void_list_node);
+  ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   alpha_builtin_function ("__builtin_thread_pointer", ftype,
                           ALPHA_BUILTIN_THREAD_POINTER, ECF_NOTHROW);
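
The general shape of this conversion, replacing a hand-built argument list with a call that takes the arguments directly and ends in a NULL sentinel, can be sketched with a toy model. Everything below is hypothetical: struct arg_node, cons, make_arg_list and arg_list_length are illustration-only names, not GCC's tree API, and the model ignores details of the real interface such as the void_list_node terminator that appears in the removed lines.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: these types and functions are hypothetical and are
   not GCC's tree API.  */
struct arg_node
{
  const char *type_name;
  struct arg_node *next;
};

/* Old style: the caller conses up the list one node at a time.  */
static struct arg_node *
cons (const char *type_name, struct arg_node *next)
{
  struct arg_node *n = malloc (sizeof *n);
  if (n == NULL)
    abort ();
  n->type_name = type_name;
  n->next = next;
  return n;
}

/* New style: a variadic wrapper builds the same list from arguments
   terminated by a NULL sentinel.  */
static struct arg_node *
make_arg_list (const char *first, ...)
{
  struct arg_node *head = NULL;
  struct arg_node **tail = &head;
  va_list ap;

  va_start (ap, first);
  for (const char *t = first; t != NULL; t = va_arg (ap, const char *))
    {
      *tail = cons (t, NULL);      /* append in argument order */
      tail = &(*tail)->next;
    }
  va_end (ap);
  return head;
}

/* Count the nodes in a list.  */
static size_t
arg_list_length (const struct arg_node *n)
{
  size_t len = 0;
  for (; n != NULL; n = n->next)
    len++;
  return len;
}
```

With these helpers, make_arg_list ("int", "int", (const char *) NULL) yields the same two-element list as cons ("int", cons ("int", NULL)), which is the readability gain the patches in this series are after. Callers should pass the sentinel with an explicit pointer cast, since a bare NULL in a variadic call is not guaranteed to have pointer type.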
[PATCH] use build_function_type_list in the bfin backend
As $SUBJECT suggests.  Tested with cross to bfin-elf.  OK to commit?

-Nathan

	* config/bfin/bfin.c (bfin_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 5d08437..03a833d 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -5967,7 +5967,7 @@ bfin_init_builtins (void)
 {
   tree V2HI_type_node = build_vector_type_for_mode (intHI_type_node, V2HImode);
   tree void_ftype_void
-    = build_function_type (void_type_node, void_list_node);
+    = build_function_type_list (void_type_node, NULL_TREE);
   tree short_ftype_short
     = build_function_type_list (short_integer_type_node, short_integer_type_node,
                                 NULL_TREE);
[PATCH] use build_function_type_list in the frv backend
As $SUBJECT suggests.  Tested with cross to frv-elf.  OK to commit?

-Nathan

	* config/frv/frv.c (frv_init_builtins): Delete `endlink' variable.
	Call build_function_type_list instead of build_function_type.
	(UNARY, BINARY, TRINARY, QUAD): Likewise.

diff --git a/gcc/config/frv/frv.c b/gcc/config/frv/frv.c
index 0913765..564baa0 100644
--- a/gcc/config/frv/frv.c
+++ b/gcc/config/frv/frv.c
@@ -8390,7 +8390,6 @@ static struct builtin_description bdesc_stores[] =
 static void
 frv_init_builtins (void)
 {
-  tree endlink = void_list_node;
   tree accumulator = integer_type_node;
   tree integer = integer_type_node;
   tree voidt = void_type_node;
@@ -8405,24 +8404,18 @@ frv_init_builtins (void)
   tree iacc = integer_type_node;
 
 #define UNARY(RET, T1) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, endlink))
+  build_function_type_list (RET, T1, NULL_TREE)
 
 #define BINARY(RET, T1, T2) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-                            tree_cons (NULL_TREE, T2, endlink)))
+  build_function_type_list (RET, T1, T2, NULL_TREE)
 
 #define TRINARY(RET, T1, T2, T3) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-                            tree_cons (NULL_TREE, T2, \
-                            tree_cons (NULL_TREE, T3, endlink))))
+  build_function_type_list (RET, T1, T2, T3, NULL_TREE)
 
 #define QUAD(RET, T1, T2, T3, T4) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-                            tree_cons (NULL_TREE, T2, \
-                            tree_cons (NULL_TREE, T3, \
-                            tree_cons (NULL_TREE, T4, endlink)))))
+  build_function_type_list (RET, T1, T2, T3, T4, NULL_TREE)
 
-  tree void_ftype_void = build_function_type (voidt, endlink);
+  tree void_ftype_void = build_function_type_list (voidt, NULL_TREE);
 
   tree void_ftype_acc = UNARY (voidt, accumulator);
   tree void_ftype_uw4_uw1 = BINARY (voidt, uword4, uword1);
[PATCH] use build_function_type_list in the i386 backend
As $SUBJECT suggests.  There's still one use of build_function_type;
replacing that type will have to wait for an improved
FUNCTION_TYPE-building interface.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

	* config/i386/i386.c (ix86_code_end): Call build_function_type_list
	instead of build_function_type.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b6d41f0..40151f4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -8833,7 +8833,7 @@ ix86_code_end (void)
 
       decl = build_decl (BUILTINS_LOCATION,
                          FUNCTION_DECL, get_identifier (name),
-                         build_function_type (void_type_node, void_list_node));
+                         build_function_type_list (void_type_node, NULL_TREE));
       DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
                                        NULL_TREE, void_type_node);
       TREE_PUBLIC (decl) = 1;
[PATCH] use build_function_type_list in the ia64 backend
As $SUBJECT suggests.  Tested with cross to ia64-linux-gnu.  OK to commit?

-Nathan

	* config/ia64/ia64.c (ia64_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 5f22b17..166ec43 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -10165,11 +10165,10 @@ ia64_init_builtins (void)
       (*lang_hooks.types.register_builtin_type) (float128_type, "__float128");
 
       /* TFmode support builtins.  */
-      ftype = build_function_type (float128_type, void_list_node);
-      decl = add_builtin_function ("__builtin_infq", ftype,
-                                   IA64_BUILTIN_INFQ, BUILT_IN_MD,
-                                   NULL, NULL_TREE);
-      ia64_builtins[IA64_BUILTIN_INFQ] = decl;
+      ftype = build_function_type_list (float128_type, NULL_TREE);
+      add_builtin_function ("__builtin_infq", ftype,
+                            IA64_BUILTIN_INFQ, BUILT_IN_MD,
+                            NULL, NULL_TREE);
 
       decl = add_builtin_function ("__builtin_huge_valq", ftype,
                                    IA64_BUILTIN_HUGE_VALQ, BUILT_IN_MD,
@@ -10211,15 +10210,13 @@ ia64_init_builtins (void)
   add_builtin_function ((name), (type), (code), BUILT_IN_MD, \
                        NULL, NULL_TREE)
 
-  decl = def_builtin ("__builtin_ia64_bsp",
-                      build_function_type (ptr_type_node, void_list_node),
+  def_builtin ("__builtin_ia64_bsp",
+               build_function_type_list (ptr_type_node, NULL_TREE),
                IA64_BUILTIN_BSP);
-  ia64_builtins[IA64_BUILTIN_BSP] = decl;
 
-  decl = def_builtin ("__builtin_ia64_flushrs",
-                      build_function_type (void_type_node, void_list_node),
+  def_builtin ("__builtin_ia64_flushrs",
+               build_function_type_list (void_type_node, NULL_TREE),
                IA64_BUILTIN_FLUSHRS);
-  ia64_builtins[IA64_BUILTIN_FLUSHRS] = decl;
 
 #undef def_builtin
[PATCH] use build_function_type_list in the iq2000 backend
As $SUBJECT suggests. Tested with cross to iq2000-elf. OK to commit? -Nathan * config/iq2000/i2000.c (iq2000_init_builtins): Call build_function_type_list instead of build_function_type. Delete `endlink' variable. diff --git a/gcc/config/iq2000/iq2000.c b/gcc/config/iq2000/iq2000.c index 2d69085..aa63674 100644 --- a/gcc/config/iq2000/iq2000.c +++ b/gcc/config/iq2000/iq2000.c @@ -2466,7 +2466,6 @@ iq2000_output_conditional_branch (rtx insn, rtx * operands, int two_operands_p, static void iq2000_init_builtins (void) { - tree endlink = void_list_node; tree void_ftype, void_ftype_int, void_ftype_int_int; tree void_ftype_int_int_int; tree int_ftype_int, int_ftype_int_int, int_ftype_int_int_int; @@ -2474,76 +2473,55 @@ iq2000_init_builtins (void) /* func () */ void_ftype -= build_function_type (void_type_node, - tree_cons (NULL_TREE, void_type_node, endlink)); += build_function_type_list (void_type_node, NULL_TREE); /* func (int) */ void_ftype_int -= build_function_type (void_type_node, - tree_cons (NULL_TREE, integer_type_node, endlink)); += build_function_type_list (void_type_node, integer_type_node, NULL_TREE); /* void func (int, int) */ void_ftype_int_int -= build_function_type (void_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - endlink))); += build_function_type_list (void_type_node, +integer_type_node, +integer_type_node, +NULL_TREE); /* int func (int) */ int_ftype_int -= build_function_type (integer_type_node, - tree_cons (NULL_TREE, integer_type_node, endlink)); += build_function_type_list (integer_type_node, +integer_type_node, NULL_TREE); /* int func (int, int) */ int_ftype_int_int -= build_function_type (integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - endlink))); += build_function_type_list (integer_type_node, +integer_type_node, +integer_type_node, +NULL_TREE); /* void func (int, int, int) */ -void_ftype_int_int_int -= build_function_type 
-(void_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, - integer_type_node, - endlink; - - /* int func (int, int, int, int) */ - int_ftype_int_int_int_int -= build_function_type -(integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, - integer_type_node, - tree_cons (NULL_TREE, -integer_type_node, -endlink); + void_ftype_int_int_int += build_function_type_list (void_type_node, +integer_type_node, +integer_type_node, +integer_type_node, +NULL_TREE); /* int func (int, int, int) */ int_ftype_int_int_int -= build_function_type -(integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, - integer_type_node, - endlink; += build_function_type_list (integer_type_node, +integer_type_node, +integer_type_node, +integer_type_node, +NULL_TREE); /* int func (int, int, int, int) */ int_ftype_int_int_int_int -= build_function_type -(integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, - integer_type_node, - tree_cons (NULL_TREE, -integer_type_node, -endlink); += build_function_type_list (integer_type_node, +integer_type_node, +
[PATCH] use build_function_type_list in the mips backend
As $SUBJECT suggests.  Tested with cross to mips-elf.  OK to commit?

-Nathan

	* config/mips/mips.c (mips16_build_function_stub): Call
	build_function_type_list instead of build_function_type.
	(mips16_build_call_stub): Likewise.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index e075c4f..4d4d639 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -6075,7 +6075,7 @@ mips16_build_function_stub (void)
   /* Build a decl for the stub.  */
   stubdecl = build_decl (BUILTINS_LOCATION,
                          FUNCTION_DECL, get_identifier (stubname),
-                         build_function_type (void_type_node, NULL_TREE));
+                         build_function_type_list (void_type_node, NULL_TREE));
   DECL_SECTION_NAME (stubdecl) = build_string (strlen (secname), secname);
   DECL_RESULT (stubdecl) = build_decl (BUILTINS_LOCATION,
                                        RESULT_DECL, NULL_TREE, void_type_node);
@@ -6321,7 +6321,7 @@ mips16_build_call_stub (rtx retval, rtx *fn_ptr, rtx args_size, int fp_code)
       stubid = get_identifier (stubname);
       stubdecl = build_decl (BUILTINS_LOCATION,
                              FUNCTION_DECL, stubid,
-                             build_function_type (void_type_node, NULL_TREE));
+                             build_function_type_list (void_type_node, NULL_TREE));
       DECL_SECTION_NAME (stubdecl) = build_string (strlen (secname), secname);
       DECL_RESULT (stubdecl) = build_decl (BUILTINS_LOCATION,
                                            RESULT_DECL, NULL_TREE,
Re: [PATCH] use build_function_type_list in the mep backend
* config/mep/mep.c (mep_init_builtins): Call build_function_type_list instead of build_function_type. Ok.
C++ PATCH for c++/48657 (rejects-valid with local variable used as non-type template argument)
The problem in this testcase was that we were recognizing a local const variable with a constant initializer as a constant expression, but we weren't doing the necessary adjustments to convert the initializer to the type of the variable. But some of the other bits of cp_finish_decl caused problems for variables with function scope. After some investigation, it seemed to me that the only part of cp_finish_decl that we really want for constants in templates is the initializer processing, so rather than mess with clearing processing_template_decl and going through all the other pieces, we can just call check_initializer directly and then be done. For 4.6 I made a smaller change that only affects local variables. Tested x86_64-pc-linux-gnu, applying to trunk and 4.6. commit 56ee3cf091b9b349ddbcdc8afb62e4ec4cf0eae0 Author: Jason Merrill ja...@redhat.com Date: Tue Apr 19 23:25:21 2011 -0700 PR c++/48657 * decl.c (cp_finish_decl): Simplify template handling. diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 6309648..cf4a40e 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -5750,7 +5750,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, const char *asmspec = NULL; int was_readonly = 0; bool var_definition_p = false; - int saved_processing_template_decl; tree auto_node; if (decl == error_mark_node) @@ -5772,7 +5771,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, /* Assume no cleanup is required. */ cleanup = NULL_TREE; - saved_processing_template_decl = processing_template_decl; /* If a name was specified, get the string. */ if (global_scope_p (current_binding_level)) @@ -5878,39 +5876,24 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, template is instantiated. But, if DECL is a variable constant then it can be used in future constant expressions, so its value must be available. */ - if (!(init -DECL_CLASS_SCOPE_P (decl) - /* We just set TREE_CONSTANT appropriately; see above. 
*/ -TREE_CONSTANT (decl) -!type_dependent_p - /* FIXME non-value-dependent constant expression */ -!value_dependent_init_p (init))) + if (init + init_const_expr_p + !type_dependent_p + decl_maybe_constant_var_p (decl) + !value_dependent_init_p (init)) { - if (init) - DECL_INITIAL (decl) = init; - if (TREE_CODE (decl) == VAR_DECL - !DECL_PRETTY_FUNCTION_P (decl) - !type_dependent_p) - maybe_deduce_size_from_array_init (decl, init); - goto finish_end; + tree init_code = check_initializer (decl, init, flags, cleanup); + if (init_code == NULL_TREE) + init = NULL_TREE; } + else if (TREE_CODE (decl) == VAR_DECL + !DECL_PRETTY_FUNCTION_P (decl) + !type_dependent_p) + maybe_deduce_size_from_array_init (decl, init); - if (TREE_CODE (init) == TREE_LIST) - { - /* If the parenthesized-initializer form was used (e.g., -int AN::i(X)), then INIT will be a TREE_LIST of initializer -arguments. (There is generally only one.) We convert them -individually. */ - tree list = init; - for (; list; list = TREE_CHAIN (list)) - { - tree elt = TREE_VALUE (list); - TREE_VALUE (list) = fold_non_dependent_expr (elt); - } - } - else - init = fold_non_dependent_expr (init); - processing_template_decl = 0; + if (init) + DECL_INITIAL (decl) = init; + return; } /* Take care of TYPE_DECLs up front. */ @@ -5933,7 +5916,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, rest_of_decl_compilation (decl, DECL_FILE_SCOPE_P (decl), at_eof); - goto finish_end; + return; } /* A reference will be modified here, as it is initialized. 
*/ @@ -6057,8 +6040,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, else if (TREE_CODE (type) == ARRAY_TYPE) layout_type (type); - if (!processing_template_decl - TREE_STATIC (decl) + if (TREE_STATIC (decl) !at_function_scope_p () current_function_decl == NULL) /* So decl is a global variable or a static member of a @@ -6078,9 +6060,8 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, /* Let the middle end know about variables and functions -- but not static data members in uninstantiated class templates. */ - if (!saved_processing_template_decl - (TREE_CODE (decl) == VAR_DECL - || TREE_CODE (decl) == FUNCTION_DECL)) + if (TREE_CODE (decl) == VAR_DECL + || TREE_CODE (decl) == FUNCTION_DECL) { if (TREE_CODE (decl) == VAR_DECL) { @@ -6167,9 +6148,6 @@
[PATCH] use build_function_type_list in the s390 backend
As $SUBJECT suggests.  Tested with cross to s390-linux-gnu.  OK to commit?

-Nathan

	* config/s390/s390.c (s390_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index caee077..adacfa3 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -9172,7 +9172,7 @@ s390_init_builtins (void)
 {
   tree ftype;

-  ftype = build_function_type (ptr_type_node, void_list_node);
+  ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   add_builtin_function ("__builtin_thread_pointer", ftype,
                         S390_BUILTIN_THREAD_POINTER, BUILT_IN_MD,
                         NULL, NULL_TREE);
[PATCH] use build_function_type_list in the xtensa backend
As $SUBJECT suggests.  Tested with cross to xtensa-elf.  OK to commit?

-Nathan

	* config/xtensa/xtensa.c (xtensa_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index fe70270..574e08e 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -3083,7 +3083,7 @@ xtensa_init_builtins (void)

   if (TARGET_THREADPTR)
     {
-      ftype = build_function_type (ptr_type_node, void_list_node);
+      ftype = build_function_type_list (ptr_type_node, NULL_TREE);
       decl = add_builtin_function ("__builtin_thread_pointer", ftype,
                                    XTENSA_BUILTIN_THREAD_POINTER, BUILT_IN_MD,
                                    NULL, NULL_TREE);
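
For readers unfamiliar with the two interfaces, the substitution made throughout this series can be sketched with a toy model. Everything below is hypothetical illustration code (the `toy_*` names and the `T_*` enumerators are not GCC's); it only assumes what the patches themselves show: build_function_type takes a hand-built chain ending in void_list_node, while build_function_type_list takes the parameter types as variadic arguments ending in NULL_TREE and, for a non-variadic type, yields the same result.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-ins for GCC type nodes; T_END plays the role of NULL_TREE.  */
typedef enum { T_END = 0, T_VOID, T_INT, T_LONG } toy_type;

/* One cell of a hand-built parameter chain (think tree_cons).  */
typedef struct toy_cons
{
  toy_type value;
  const struct toy_cons *chain;
} toy_cons;

/* Plays the role of void_list_node: marks the end of a fixed-arity list.  */
static const toy_cons toy_void_list = { T_VOID, NULL };

#define TOY_MAX_PARAMS 8

typedef struct
{
  toy_type ret;
  int n_params;
  toy_type params[TOY_MAX_PARAMS];
} toy_fntype;

/* Analogue of build_function_type: walk a chain built by the caller.  */
static toy_fntype
toy_build_function_type (toy_type ret, const toy_cons *params)
{
  toy_fntype fn;
  const toy_cons *p;
  int i;

  fn.ret = ret;
  fn.n_params = 0;
  for (i = 0; i < TOY_MAX_PARAMS; i++)
    fn.params[i] = T_END;
  for (p = params; p != NULL && p->value != T_VOID; p = p->chain)
    {
      assert (fn.n_params < TOY_MAX_PARAMS);
      fn.params[fn.n_params++] = p->value;
    }
  return fn;
}

/* Analogue of build_function_type_list: read types until T_END, link
   them into a chain ending in toy_void_list, then reuse the function
   above.  */
static toy_fntype
toy_build_function_type_list (toy_type ret, ...)
{
  toy_cons cells[TOY_MAX_PARAMS];
  int n = 0;
  va_list ap;

  va_start (ap, ret);
  for (;;)
    {
      toy_type t = (toy_type) va_arg (ap, int);
      if (t == T_END)
        break;
      assert (n < TOY_MAX_PARAMS);
      cells[n].value = t;
      cells[n].chain = &toy_void_list;
      if (n > 0)
        cells[n - 1].chain = &cells[n];
      n++;
    }
  va_end (ap);
  return toy_build_function_type (ret, n > 0 ? &cells[0] : &toy_void_list);
}

/* Return nonzero if A and B describe the same toy function type.  */
static int
toy_same_fntype (const toy_fntype *a, const toy_fntype *b)
{
  int i;

  if (a->ret != b->ret || a->n_params != b->n_params)
    return 0;
  for (i = 0; i < a->n_params; i++)
    if (a->params[i] != b->params[i])
      return 0;
  return 1;
}
```

In this model, a chain built by hand as `{ T_INT, { T_LONG, toy_void_list } }` and the call `toy_build_function_type_list (T_VOID, T_INT, T_LONG, T_END)` describe the same type, and `toy_build_function_type_list (T_INT, T_END)` corresponds to the zero-argument case used by the s390 and xtensa patches.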
Re: [PATCH] use build_function_type_list in the mips backend
Nathan Froyd froy...@codesourcery.com writes:
> 	* config/mips/mips.c (mips16_build_function_stub): Call
> 	build_function_type_list instead of build_function_type.
> 	(mips16_build_call_stub): Likewise.

OK, thanks, but:

> -                         build_function_type (void_type_node, NULL_TREE));
> +                         build_function_type_list (void_type_node, NULL_TREE));

please split the long line.

Richard
[PATCH] use build_function_type_list in the rs6000 backend
As $SUBJECT suggests. The only tricky part is in builtin_function_type, where we fill in unused args with NULL_TREE so that passing extra arguments to build_function_type_list doesn't matter. Tested with cross to powerpc-eabi. OK to commit? -Nathan * config/rs6000/rs6000.c (spe_init_builtins): Call build_function_type_list instead of build_function_type. (paired_init_builtins, altivec_init_builtins): Likewise. (builtin_function_type): Likewise. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 8182bf0..c08c16e 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -12824,107 +12824,97 @@ enable_mask_for_builtins (struct builtin_description *desc, int size, static void spe_init_builtins (void) { - tree endlink = void_list_node; tree puint_type_node = build_pointer_type (unsigned_type_node); tree pushort_type_node = build_pointer_type (short_unsigned_type_node); struct builtin_description *d; size_t i; tree v2si_ftype_4_v2si -= build_function_type -(opaque_V2SI_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, -endlink); += build_function_type_list (opaque_V2SI_type_node, +opaque_V2SI_type_node, +opaque_V2SI_type_node, +opaque_V2SI_type_node, +opaque_V2SI_type_node, +NULL_TREE); tree v2sf_ftype_4_v2sf -= build_function_type -(opaque_V2SF_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, -endlink); += build_function_type_list (opaque_V2SF_type_node, +opaque_V2SF_type_node, +opaque_V2SF_type_node, +opaque_V2SF_type_node, +opaque_V2SF_type_node, +NULL_TREE); tree int_ftype_int_v2si_v2si -= build_function_type -(integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons 
(NULL_TREE, opaque_V2SI_type_node, - endlink; += build_function_type_list (integer_type_node, +integer_type_node, +opaque_V2SI_type_node, +opaque_V2SI_type_node, +NULL_TREE); tree int_ftype_int_v2sf_v2sf -= build_function_type -(integer_type_node, - tree_cons (NULL_TREE, integer_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, - tree_cons (NULL_TREE, opaque_V2SF_type_node, - endlink; += build_function_type_list (integer_type_node, +integer_type_node, +opaque_V2SF_type_node, +opaque_V2SF_type_node, +NULL_TREE); tree void_ftype_v2si_puint_int -= build_function_type (void_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons (NULL_TREE, puint_type_node, -tree_cons (NULL_TREE, - integer_type_node, - endlink; += build_function_type_list (void_type_node, +opaque_V2SI_type_node, +puint_type_node, +integer_type_node, +NULL_TREE); tree void_ftype_v2si_puint_char -= build_function_type (void_type_node, - tree_cons (NULL_TREE, opaque_V2SI_type_node, - tree_cons (NULL_TREE, puint_type_node, -tree_cons (NULL_TREE, - char_type_node, - endlink; += build_function_type_list (void_type_node, +opaque_V2SI_type_node, +puint_type_node, +
[PATCH] use build_function_type_list in the picochip backend
As $SUBJECT suggests. Tested with cross to picochip-elf. OK to commit? -Nathan * config/picochip/picochip.c (picochip_init_builtins): Call build_function_type_list instead of build_function_type. Delete `endlink' variable. diff --git a/gcc/config/picochip/picochip.c b/gcc/config/picochip/picochip.c index 1ca95b4..4442d1e 100644 --- a/gcc/config/picochip/picochip.c +++ b/gcc/config/picochip/picochip.c @@ -4216,18 +4216,6 @@ void picochip_init_builtins (void) { tree noreturn; - tree endlink = void_list_node; - tree int_endlink = tree_cons (NULL_TREE, integer_type_node, endlink); - tree unsigned_endlink = tree_cons (NULL_TREE, unsigned_type_node, endlink); - tree long_endlink = tree_cons (NULL_TREE, long_integer_type_node, endlink); - tree int_int_endlink = -tree_cons (NULL_TREE, integer_type_node, int_endlink); - tree int_int_int_endlink = -tree_cons (NULL_TREE, integer_type_node, int_int_endlink); - tree int_long_endlink = -tree_cons (NULL_TREE, integer_type_node, long_endlink); - tree long_int_int_int_endlink = -tree_cons (NULL_TREE, long_integer_type_node, int_int_int_endlink); tree int_ftype_int, int_ftype_int_int; tree long_ftype_int, long_ftype_int_int_int; @@ -4236,36 +4224,51 @@ picochip_init_builtins (void) tree void_ftype_void, unsigned_ftype_unsigned; /* void func (void) */ - void_ftype_void = build_function_type (void_type_node, endlink); + void_ftype_void = build_function_type_list (void_type_node, NULL_TREE); /* int func (int) */ - int_ftype_int = build_function_type (integer_type_node, int_endlink); + int_ftype_int = build_function_type_list (integer_type_node, + integer_type_node, NULL_TREE); /* unsigned int func (unsigned int) */ - unsigned_ftype_unsigned = build_function_type (unsigned_type_node, unsigned_endlink); + unsigned_ftype_unsigned += build_function_type_list (unsigned_type_node, + unsigned_type_node, NULL_TREE); /* int func(int, int) */ int_ftype_int_int -= build_function_type (integer_type_node, int_int_endlink); += 
build_function_type_list (integer_type_node, + integer_type_node, integer_type_node, + NULL_TREE); /* long func(int) */ - long_ftype_int = build_function_type (long_integer_type_node, int_endlink); + long_ftype_int = build_function_type_list (long_integer_type_node, +integer_type_node, NULL_TREE); /* long func(int, int, int) */ long_ftype_int_int_int -= build_function_type (long_integer_type_node, int_int_int_endlink); += build_function_type_list (long_integer_type_node, + integer_type_node, integer_type_node, + integer_type_node, NULL_TREE); /* int func(int, int, int) */ int_ftype_int_int_int -= build_function_type (integer_type_node, int_int_int_endlink); += build_function_type_list (integer_type_node, + integer_type_node, integer_type_node, + integer_type_node, NULL_TREE); /* void func(int, long) */ void_ftype_int_long -= build_function_type (void_type_node, int_long_endlink); += build_function_type_list (void_type_node, + integer_type_node, long_integer_type_node, + NULL_TREE); /* void func(long, int, int, int) */ void_ftype_long_int_int_int -= build_function_type (void_type_node, long_int_int_int_endlink); += build_function_type_list (void_type_node, + long_integer_type_node, integer_type_node, + integer_type_node, integer_type_node, + NULL_TREE); /* Initialise the sign-bit-count function. */ add_builtin_function (__builtin_sbc, int_ftype_int,
[PATCH, i386]: Expand insv pattern to pinsr{b,w,d,q} insn
Hello!

Attached patch enhances the fix for PR target/48678 to generate a
pinsr{b,w,d,q} insn when a value is inserted into a vector register.

2011-04-20  Uros Bizjak  ubiz...@gmail.com

	PR target/48678
	* config/i386/i386.md (insv): Change operand 0 constraint to
	register_operand.  Change operand 1 and 2 constraint to
	const_int_operand.  Expand to pinsr{b,w,d,q} when appropriate.
	* config/i386/sse.md (sse4_1_pinsrb): Export.
	(sse2_pinsrw): Ditto.
	(sse4_1_pinsrd): Ditto.
	(sse4_1_pinsrq): Ditto.
	* config/i386/i386-protos.h (ix86_expand_pinsr): Add prototype.
	* config/i386/i386.c (ix86_expand_pinsr): New.

testsuite/ChangeLog:

2011-04-20  Uros Bizjak  ubiz...@gmail.com

	PR target/48678
	* gcc.target/i386/sse2-pinsrw.c: New test.
	* gcc.target/i386/avx-vpinsrw.c: Ditto.
	* gcc.target/i386/sse4_1-insvqi.c: Ditto.
	* gcc.target/i386/sse2-insvhi.c: Ditto.
	* gcc.target/i386/sse4_1-insvsi.c: Ditto.
	* gcc.target/i386/sse4_1-insvdi.c: Ditto.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{-m32}.  Committed to mainline SVN.

Uros.

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 172780)
+++ config/i386/i386.md (working copy)
@@ -10393,14 +10393,17 @@
 })

 (define_expand "insv"
-  [(set (zero_extract (match_operand 0 "ext_register_operand" "")
-                      (match_operand 1 "const8_operand" "")
-                      (match_operand 2 "const8_operand" ""))
+  [(set (zero_extract (match_operand 0 "register_operand" "")
+                      (match_operand 1 "const_int_operand" "")
+                      (match_operand 2 "const_int_operand" ""))
         (match_operand 3 "register_operand" ""))]
   ""
 {
   rtx (*gen_mov_insv_1) (rtx, rtx);

+  if (ix86_expand_pinsr (operands))
+    DONE;
+
   /* Handle insertions to %ah et al.
*/ if (INTVAL (operands[1]) != 8 || INTVAL (operands[2]) != 8) FAIL; Index: config/i386/sse.md === --- config/i386/sse.md (revision 172780) +++ config/i386/sse.md (working copy) @@ -6051,7 +6051,7 @@ (set_attr prefix orig,vex) (set_attr mode TI)]) -(define_insn *sse4_1_pinsrb +(define_insn sse4_1_pinsrb [(set (match_operand:V16QI 0 register_operand =x,x,x,x) (vec_merge:V16QI (vec_duplicate:V16QI @@ -6083,7 +6083,7 @@ (set_attr prefix orig,orig,vex,vex) (set_attr mode TI)]) -(define_insn *sse2_pinsrw +(define_insn sse2_pinsrw [(set (match_operand:V8HI 0 register_operand =x,x,x,x) (vec_merge:V8HI (vec_duplicate:V8HI @@ -6117,7 +6117,7 @@ (set_attr mode TI)]) ;; It must come before sse2_loadld since it is preferred. -(define_insn *sse4_1_pinsrd +(define_insn sse4_1_pinsrd [(set (match_operand:V4SI 0 register_operand =x,x) (vec_merge:V4SI (vec_duplicate:V4SI @@ -6145,7 +6145,7 @@ (set_attr prefix orig,vex) (set_attr mode TI)]) -(define_insn *sse4_1_pinsrq +(define_insn sse4_1_pinsrq [(set (match_operand:V2DI 0 register_operand =x,x) (vec_merge:V2DI (vec_duplicate:V2DI Index: config/i386/i386-protos.h === --- config/i386/i386-protos.h (revision 172780) +++ config/i386/i386-protos.h (working copy) @@ -203,6 +203,7 @@ extern void ix86_expand_vector_extract ( extern void ix86_expand_reduc_v4sf (rtx (*)(rtx, rtx, rtx), rtx, rtx); extern void ix86_expand_vec_extract_even_odd (rtx, rtx, rtx, unsigned); +extern bool ix86_expand_pinsr (rtx *); /* In i386-c.c */ extern void ix86_target_macros (void); Index: config/i386/i386.c === --- config/i386/i386.c (revision 172780) +++ config/i386/i386.c (working copy) @@ -34106,6 +34106,88 @@ ix86_expand_vec_extract_even_odd (rtx ta /* ... or we use the special-case patterns. */ expand_vec_perm_even_odd_1 (d, odd); } + +/* Expand an insert into a vector register through pinsr insn. + Return true if successful. 
*/ + +bool +ix86_expand_pinsr (rtx *operands) +{ + rtx dst = operands[0]; + rtx src = operands[3]; + + unsigned int size = INTVAL (operands[1]); + unsigned int pos = INTVAL (operands[2]); + + if (GET_CODE (dst) == SUBREG) +{ + pos += SUBREG_BYTE (dst) * BITS_PER_UNIT; + dst = SUBREG_REG (dst); +} + + if (GET_CODE (src) == SUBREG) +src = SUBREG_REG (src); + + switch (GET_MODE (dst)) +{ +case V16QImode: +case V8HImode: +case V4SImode: +case V2DImode: + { + enum machine_mode srcmode, dstmode; + rtx (*pinsr)(rtx, rtx, rtx, rtx); + + srcmode = mode_for_size (size, MODE_INT, 0); + + switch (srcmode) + { + case QImode: + if (!TARGET_SSE4_1) + return false; + dstmode = V16QImode; + pinsr =
[PATCH] use build_function_type_list in the avr backend
As $SUBJECT suggests.  Tested with cross to avr-elf.  OK to commit?

-Nathan

	* config/avr/avr.c (avr_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 500a5b2..6dbf8b4 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -6535,7 +6535,7 @@ static void
 avr_init_builtins (void)
 {
   tree void_ftype_void
-    = build_function_type (void_type_node, void_list_node);
+    = build_function_type_list (void_type_node, NULL_TREE);
   tree uchar_ftype_uchar
     = build_function_type_list (unsigned_char_type_node,
                                 unsigned_char_type_node,
[PATCH] use build_function_type_list in the pa backend
As $SUBJECT suggests.  Tested with cross to hppa-linux-gnu.  OK to commit?

-Nathan

	* config/pa/pa.c (pa_init_builtins): Call build_function_type_list
	instead of build_function_type.

diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index e05cf19..aeb8061 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -641,7 +641,7 @@ pa_init_builtins (void)
       TREE_READONLY (decl) = 1;
       pa_builtins[PA_BUILTIN_COPYSIGNQ] = decl;

-      ftype = build_function_type (long_double_type_node, void_list_node);
+      ftype = build_function_type_list (long_double_type_node, NULL_TREE);
       decl = add_builtin_function ("__builtin_infq", ftype,
                                    PA_BUILTIN_INFQ, BUILT_IN_MD,
                                    NULL, NULL_TREE);
[PATCH] use build_function_type_list in the arm backend
As $SUBJECT suggests.  There's one remaining use of build_function_type,
but replacing that will have to wait until we have a better
FUNCTION_TYPE-building interface.

Tested with cross to arm-eabi.  OK to commit?

-Nathan

	* config/arm/arm.c (arm_init_iwmmxt_builtins): Call
	build_function_type_list instead of build_function_type.
	Delete variable `endlink'.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f964d6..9f10ac4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -18915,196 +18915,137 @@ arm_init_iwmmxt_builtins (void)
 {
   const struct builtin_description * d;
   size_t i;
-  tree endlink = void_list_node;

   tree V2SI_type_node = build_vector_type_for_mode (intSI_type_node, V2SImode);
   tree V4HI_type_node = build_vector_type_for_mode (intHI_type_node, V4HImode);
   tree V8QI_type_node = build_vector_type_for_mode (intQI_type_node, V8QImode);

   tree int_ftype_int
-    = build_function_type (integer_type_node,
-                           tree_cons (NULL_TREE, integer_type_node, endlink));
+    = build_function_type_list (integer_type_node,
+                                integer_type_node, NULL_TREE);
   tree v8qi_ftype_v8qi_v8qi_int
-    = build_function_type (V8QI_type_node,
-                           tree_cons (NULL_TREE, V8QI_type_node,
-                                      tree_cons (NULL_TREE, V8QI_type_node,
-                                                 tree_cons (NULL_TREE,
-                                                            integer_type_node,
-                                                            endlink))));
+    = build_function_type_list (V8QI_type_node,
+                                V8QI_type_node, V8QI_type_node,
+                                integer_type_node, NULL_TREE);
   tree v4hi_ftype_v4hi_int
-    = build_function_type (V4HI_type_node,
-                           tree_cons (NULL_TREE, V4HI_type_node,
-                                      tree_cons (NULL_TREE, integer_type_node,
-                                                 endlink)));
+    = build_function_type_list (V4HI_type_node,
+                                V4HI_type_node, integer_type_node, NULL_TREE);
   tree v2si_ftype_v2si_int
-    = build_function_type (V2SI_type_node,
-                           tree_cons (NULL_TREE, V2SI_type_node,
-                                      tree_cons (NULL_TREE, integer_type_node,
-                                                 endlink)));
+    = build_function_type_list (V2SI_type_node,
+                                V2SI_type_node, integer_type_node, NULL_TREE);
   tree v2si_ftype_di_di
-    = build_function_type (V2SI_type_node,
-                           tree_cons (NULL_TREE,
long_long_integer_type_node, - tree_cons (NULL_TREE, -long_long_integer_type_node, -endlink))); += build_function_type_list (V2SI_type_node, + long_long_integer_type_node, + long_long_integer_type_node, + NULL_TREE); tree di_ftype_di_int -= build_function_type (long_long_integer_type_node, - tree_cons (NULL_TREE, long_long_integer_type_node, - tree_cons (NULL_TREE, integer_type_node, -endlink))); += build_function_type_list (long_long_integer_type_node, + long_long_integer_type_node, + integer_type_node, NULL_TREE); tree di_ftype_di_int_int -= build_function_type (long_long_integer_type_node, - tree_cons (NULL_TREE, long_long_integer_type_node, - tree_cons (NULL_TREE, integer_type_node, -tree_cons (NULL_TREE, - integer_type_node, - endlink; += build_function_type_list (long_long_integer_type_node, + long_long_integer_type_node, + integer_type_node, + integer_type_node, NULL_TREE); tree int_ftype_v8qi -= build_function_type (integer_type_node, - tree_cons (NULL_TREE, V8QI_type_node, - endlink)); += build_function_type_list (integer_type_node, + V8QI_type_node, NULL_TREE); tree int_ftype_v4hi -= build_function_type (integer_type_node, - tree_cons (NULL_TREE, V4HI_type_node, - endlink)); += build_function_type_list (integer_type_node, + V4HI_type_node, NULL_TREE); tree int_ftype_v2si -=
Re: [PATCH] use build_function_type_list in the ia64 backend
On Wed, Apr 20, 2011 at 03:29:19PM -0400, Nathan Froyd wrote:
> As $SUBJECT suggests.  Tested with cross to ia64-linux-gnu.  OK to commit?

> -      ftype = build_function_type (float128_type, void_list_node);
> -      decl = add_builtin_function ("__builtin_infq", ftype,
> -                                   IA64_BUILTIN_INFQ, BUILT_IN_MD,
> -                                   NULL, NULL_TREE);
> -      ia64_builtins[IA64_BUILTIN_INFQ] = decl;
> +      ftype = build_function_type_list (float128_type, NULL_TREE);
> +      add_builtin_function ("__builtin_infq", ftype,
> +                            IA64_BUILTIN_INFQ, BUILT_IN_MD,
> +                            NULL, NULL_TREE);

Of course, the patch I tested didn't delete the assignment to
ia64_builtins.  Please disregard that bit.

-Nathan
Re: [PATCH] use build_function_type_list in the pa backend
* config/pa/pa.c (pa_init_builtins): Call build_function_type_list instead of build_function_type. Ok. Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
[PATCH] use build_function_type_list in the spu backend
As $SUBJECT suggests. The only tricky bit is initializing all the args to NULL_TREE so that we can safely pass all the args to build_function_type_list. Tested with cross to spu-elf; I couldn't build all of libgcc, but that appears to be a pre-existing problem. OK to commit? -Nathan * config/spu/spu.c (spu_init_builtins): Call build_function_type_list instead of build_function_type. Rearrange gathering of args to do so. * config/spu/spu-builtins.def (SPU_MAX_ARGS_TO_BUILTIN): Define. diff --git a/gcc/config/spu/spu-builtins.def b/gcc/config/spu/spu-builtins.def index 4d01d94..6dfdf8c 100644 --- a/gcc/config/spu/spu-builtins.def +++ b/gcc/config/spu/spu-builtins.def @@ -23,6 +23,8 @@ #define _A3(a,b,c) {a, b, c, SPU_BTI_END_OF_PARAMS} #define _A4(a,b,c,d) {a, b, c, d, SPU_BTI_END_OF_PARAMS} +#define SPU_MAX_ARGS_TO_BUILTIN 3 + /* definitions to support si intrinsic functions: (These and other builtin * definitions must precede definitions of the overloaded generic intrinsics */ diff --git a/gcc/config/spu/spu.c b/gcc/config/spu/spu.c index 941194b..ea9d580 100644 --- a/gcc/config/spu/spu.c +++ b/gcc/config/spu/spu.c @@ -5777,9 +5777,10 @@ spu_init_builtins (void) sure nodes are shared. */ for (i = 0, d = spu_builtins; i NUM_SPU_BUILTINS; i++, d++) { - tree p; + tree ftype; char name[64]; /* build_function will make a copy. 
*/ - int parm; + int parm, i; + tree args[SPU_MAX_ARGS_TO_BUILTIN]; if (d-name == 0) continue; @@ -5788,15 +5789,23 @@ spu_init_builtins (void) for (parm = 1; d-parm[parm] != SPU_BTI_END_OF_PARAMS; parm++) ; - p = void_list_node; + gcc_assert (parm = (SPU_MAX_ARGS_TO_BUILTIN + 1)); + + for (i = 0; i ARRAY_SIZE (args); i++) + args[i] = NULL_TREE; + while (parm 1) - p = tree_cons (NULL_TREE, spu_builtin_types[d-parm[--parm]], p); + { + tree arg = spu_builtin_types[d-parm[--parm]]; + args[parm-1] = arg; + } - p = build_function_type (spu_builtin_types[d-parm[0]], p); + ftype = build_function_type_list (spu_builtin_types[d-parm[0]], + args[0], args[1], args[2], NULL_TREE); sprintf (name, __builtin_%s, d-name); spu_builtin_decls[i] = - add_builtin_function (name, p, i, BUILT_IN_MD, NULL, NULL_TREE); + add_builtin_function (name, ftype, i, BUILT_IN_MD, NULL, NULL_TREE); if (d-fcode == SPU_MASK_FOR_LOAD) TREE_READONLY (spu_builtin_decls[i]) = 1;
[PATCH] use build_function_type_list in the stormy16 backend
As $SUBJECT suggests.  For safety's sake, we initialize all the
arguments to NULL before passing them to build_function_type_list.  This
is not necessary currently, as we always completely fill in the args
array, but it might save some future coder from quite some grief...

Tested with cross to xstormy16-elf.  OK to commit?

-Nathan

	* config/stormy16/stormy16.c (xstormy16_init_builtins): Call
	build_function_type_list instead of build_function_type.
	Rearrange initialization of `args' to do so.

diff --git a/gcc/config/stormy16/stormy16.c b/gcc/config/stormy16/stormy16.c
index 052285c..1a90e16 100644
--- a/gcc/config/stormy16/stormy16.c
+++ b/gcc/config/stormy16/stormy16.c
@@ -2255,15 +2255,21 @@ static struct
 static void
 xstormy16_init_builtins (void)
 {
-  tree args, ret_type, arg;
-  int i, a;
+  tree args[2], ret_type, arg = NULL_TREE, ftype;
+  int i, a, n_args;

   ret_type = void_type_node;

   for (i = 0; s16builtins[i].name; i++)
     {
-      args = void_list_node;
-      for (a = strlen (s16builtins[i].arg_types) - 1; a >= 0; a--)
+      n_args = strlen (s16builtins[i].arg_types) - 1;
+
+      gcc_assert (n_args <= (int) ARRAY_SIZE (args));
+
+      for (a = n_args; a >= 0; a--)
+        args[a] = NULL_TREE;
+
+      for (a = n_args; a >= 0; a--)
         {
           switch (s16builtins[i].arg_types[a])
             {
@@ -2276,10 +2282,10 @@ xstormy16_init_builtins (void)
           if (a == 0)
             ret_type = arg;
           else
-            args = tree_cons (NULL_TREE, arg, args);
+            args[a-1] = arg;
         }
-      add_builtin_function (s16builtins[i].name,
-                            build_function_type (ret_type, args),
+      ftype = build_function_type_list (ret_type, args[0], args[1], NULL_TREE);
+      add_builtin_function (s16builtins[i].name, ftype,
                             i, BUILT_IN_MD, NULL, NULL);
     }
 }
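
The reason pre-filling the array with NULL makes it safe to always pass every slot can be shown with a small stand-alone sketch. This is not the stormy16 code: the `toy_*` functions, the `TOY_MAX_ARGS` limit and the mapping from signature letters to type names are all made up for illustration. The only assumption carried over from the patch is that a NULL-terminated variadic builder stops reading at the first NULL, so unused trailing slots are never seen.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ARGS 2

/* Hypothetical NULL-terminated builder: returns how many argument
   "types" it consumed before the first NULL.  */
static int
toy_build_function_type_list (const char *ret, ...)
{
  va_list ap;
  int n = 0;

  (void) ret;
  va_start (ap, ret);
  while (va_arg (ap, const char *) != NULL)
    n++;
  va_end (ap);
  return n;
}

/* Hypothetical mapping from a signature letter to a type name.  */
static const char *
toy_type_for_code (char code)
{
  switch (code)
    {
    case 's': return "short";
    case 'l': return "long";
    default:  return "void";
    }
}

/* ARG_TYPES[0] encodes the return type and the remaining characters
   encode the arguments.  Every slot of ARGS is passed to the builder;
   slots that were not filled stay NULL and end the list early.  */
static int
toy_arity_from_signature (const char *arg_types)
{
  const char *args[TOY_MAX_ARGS];
  size_t len = strlen (arg_types);
  size_t n_args = len > 0 ? len - 1 : 0;
  size_t a;

  assert (n_args <= TOY_MAX_ARGS);

  for (a = 0; a < TOY_MAX_ARGS; a++)
    args[a] = NULL;
  for (a = 1; a <= n_args; a++)
    args[a - 1] = toy_type_for_code (arg_types[a]);

  return toy_build_function_type_list (toy_type_for_code (arg_types[0]),
                                       args[0], args[1],
                                       (const char *) NULL);
}
```

With this sketch, the signature "v" yields zero arguments, "vs" one and "sll" two, even though the call site always passes both array slots plus the terminator.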
[PATCH] use build_function_type_list in the sh backend
As $SUBJECT suggests.  The only tricky bit is the initialization of
`args' to NULL_TREEs so that we can safely pass all of the relevant args
to build_function_type_list, regardless of whether the function type in
question has that many args.

Tested with cross to sh-elf.  OK to commit?

-Nathan

	* config/sh/sh.c (sh_media_init_builtins): Call
	build_function_type_list instead of build_function_type.

diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 78f6f0f..0f158d5 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -11222,6 +11222,7 @@ sh_media_init_builtins (void)
       else
         {
           int has_result = signature_args[signature][0] != 0;
+          tree args[3];

           if ((signature_args[signature][1] & 8)
               && (((signature_args[signature][1] & 1) && TARGET_SHMEDIA32)
@@ -11230,7 +11231,8 @@ sh_media_init_builtins (void)
           if (! TARGET_FPU_ANY
               && FLOAT_MODE_P (insn_data[d->icode].operand[0].mode))
             continue;
-          type = void_list_node;
+          for (i = 0; i < (int) ARRAY_SIZE (args); i++)
+            args[i] = NULL_TREE;
           for (i = 3; ; i--)
             {
               int arg = signature_args[signature][i];
@@ -11248,9 +11250,10 @@ sh_media_init_builtins (void)
                 arg_type = void_type_node;
               if (i == 0)
                 break;
-              type = tree_cons (NULL_TREE, arg_type, type);
+              args[i-1] = arg_type;
             }
-          type = build_function_type (arg_type, type);
+          type = build_function_type_list (arg_type, args[0], args[1],
+                                           args[2], NULL_TREE);
           if (signature < SH_BLTIN_NUM_SHARED_SIGNATURES)
             shared[signature] = type;
         }
Re: Second ping for cannot_force_const_mem LEGITIMATE_CONSTANT_P changes
On 04/18/2011 02:30 AM, Richard Sandiford wrote: Ping for these two changes: http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00194.html http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00195.html Both ok. r~
Re: [patch] Do not generate discriminator directive in strict mode
On 04/20/2011 12:09 PM, Eric Botcazou wrote:
>> How is this not redundant with the existing
>>
>>   /* The discriminator column was added in dwarf4.  Simplify the below
>>      by simply removing it if we're not supposed to output it.  */
>>   if (dwarf_version < 4 && dwarf_strict)
>>     discriminator = 0;
>>
>> check near the top of the function?
>
> Obviously I missed this recent change, sorry.  So the question is: would
> the change be appropriate for the release branches, where we emit the
> directive unconditionally, i.e. 4.5 and 4.6 branches, or would mine be
> safer for them?

Let's try to keep the branches more similar than not.  It's just as safe,
since prior to mainline we ignore the discriminator when not emitting via
gas directive.

r~
fix memory leak in gengtype
Hello list,

while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
killed.  That's when I noticed that its RAM usage peaks at 150MB, which is
a bit excessive for parsing a ~500K text file.

The attached patch fixes the leak and gengtype now uses a peak of 4MB heap.
Hopefully I'm not doing something wrong, since it took me a while to
understand those obstacks...

Thanks,
Dimitris

P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but
unfortunately the sparcstation crashed and burned after this, so I can't
continue the build and report back :-(

--- gcc/gengtype-state.c.orig 2011-04-20 23:06:29.0 +0300
+++ gcc/gengtype-state.c 2011-04-20 23:12:43.0 +0300
@@ -303,7 +303,7 @@
           obstack_1grow (&id_obstack, (char) 0);
           ids = XOBFINISH (&id_obstack, char *);
           sid = state_ident_by_name (ids, INSERT);
-          obstack_free (&id_obstack, ids);
+          obstack_free (&id_obstack, NULL);
           ids = NULL;
           tk = XCNEW (struct state_token_st);
           tk->stok_kind = STOK_NAME;
@@ -408,7 +408,7 @@
           tk->stok_file = state_path;
           tk->stok_next = NULL;
           strcpy (tk->stok_un.stok_string, cstr);
-          obstack_free (&bstring_obstack, cstr);
+          obstack_free (&bstring_obstack, NULL);

           return tk;
         }
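
Since the patch hinges on the second argument of obstack_free, here is a minimal sketch of the two forms, assuming the GNU obstack implementation shipped in glibc's <obstack.h> (gengtype itself uses the libiberty copy). The function name is made up for illustration, and the sketch does not try to reproduce or judge the gengtype leak. As documented for GNU obstacks, freeing an object also frees everything allocated after it, and passing NULL frees the whole obstack, which then has to be initialized again before reuse.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* The obstack macros allocate their chunks through these two names.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical demo: returns 1 if space released with
   obstack_free (ob, object) is handed out again by the next
   allocation.  */
static int
toy_obstack_reuses_freed_space (void)
{
  struct obstack ob;
  char *first, *second, *again;
  int reused;

  obstack_init (&ob);
  first = obstack_copy0 (&ob, "first", 5);
  second = obstack_copy0 (&ob, "second", 6);
  assert (strcmp (first, "first") == 0);
  assert (strcmp (second, "second") == 0);

  /* Frees FIRST and everything allocated after it (here, SECOND).  */
  obstack_free (&ob, first);

  /* The next object starts where FIRST used to live.  */
  again = obstack_copy0 (&ob, "again", 5);
  reused = (again == first) && strcmp (again, "again") == 0;

  /* Frees every chunk; OB must be re-initialized before further use.  */
  obstack_free (&ob, NULL);
  return reused;
}
```

The address comparison relies on both small strings fitting into the obstack's first chunk, which holds for the default chunk size.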
Re: [PATCH] use build_function_type_list a few places in the ObjC frontend
On Apr 20, 2011, at 10:27 AM, Nathan Froyd wrote: Tested on x86_64-unknown-linux-gnu. IIUC the changes to objc-next-runtime-abi-02.c would not be tested on that platform, so it would be helpful to have a Darwin tester double-check my work. Just check http://gcc.gnu.org/regtest/HEAD/ after about 10 hours. OK to commit? Ok.
Ping^2 Re: Target header etc. cleanup patch
Ping^2. This patch http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00130.html is still pending review. This version applies cleanly to current trunk.

2011-04-20  Joseph Myers  jos...@codesourcery.com

	* config/alpha/alpha.c (struct machine_function): Use rtx, not
	struct rtx_def *.
	* config/bfin/bfin-protos.h (Mmode): Don't define. Expand
	definition where used.
	* config/bfin/bfin.h (bfin_cc_rtx, bfin_rets_rtx): Use rtx, not
	struct rtx_def *.
	* config/cris/cris-protos.h (STDIO_INCLUDED): Don't define.
	* config/fr30/fr30-protos.h (Mmode): Don't define.
	* config/fr30/fr30.h (inhibit_libc): Don't define.
	* config/h8300/h8300.h (struct cum_arg): Use rtx, not struct
	rtx_def *.
	* config/i386/cygming.h (union tree_node, TREE): Don't define or
	undefine.
	(FILE): Don't undefine.
	* config/iq2000/iq2000.h (struct iq2000_args): Use rtx, not struct
	rtx_def *.
	* config/m32c/m32c-protos.h (MM, UINT): Don't define. Expand
	definitions where used.
	* config/m32r/m32r-protos.h (Mmode): Don't define. Expand
	definition where used.
	* config/microblaze/microblaze.h (struct microblaze_args): Use
	rtx, not struct rtx_def *.
	* config/mn10300/mn10300-protos.h (Mmode, Cstar, Rclas): Don't
	define. Expand definitions where used.
	* config/pa/pa-protos.h (return_addr_rtx): Use rtx, not struct
	rtx_def *.
	* config/pa/pa.h (hppa_pic_save_rtx): Use rtx, not struct
	rtx_def *.
	* config/pdp11/pdp11.h (cc0_reg_rtx): Use rtx, not struct
	rtx_def *.
	* config/rx/rx-protos.h (Mmode, Fargs, Rcode): Don't define.
	Expand definitions where used.
	* config/rx/rx.c (rx_is_legitimate_address, rx_function_arg_size,
	rx_function_arg, rx_function_arg_advance,
	rx_function_arg_boundary): Expand definitions of those macros.
	* config/sh/sh-protos.h (sfunc_uses_reg, get_fpscr_rtx): Use rtx,
	not struct rtx_def *.
	* config/sh/sh.h (sh_compare_op0, sh_compare_op1): Use rtx, not
	struct rtx_def *.
	* config/spu/spu-protos.h (spu_float_const): Use rtx, not struct
	rtx_def *.
	* config/spu/spu.c (spu_float_const): Use rtx, not struct
	rtx_def *.
	* config/v850/v850-protos.h (Mmode): Don't define. Expand
	definition where used.
	* config/v850/v850.h (GHS_default_section_names,
	GHS_current_section_names): Use tree, not union tree_node *.

Index: gcc/config/alpha/alpha.c
===================================================================
--- gcc/config/alpha/alpha.c	(revision 172767)
+++ gcc/config/alpha/alpha.c	(working copy)
@@ -4606,7 +4606,7 @@ struct GTY(()) machine_function
   const char *some_ld_name;
   /* For TARGET_LD_BUGGY_LDGP. */
-  struct rtx_def *gp_save_rtx;
+  rtx gp_save_rtx;
   /* For VMS condition handlers. */
   bool uses_condition_handler;
Index: gcc/config/m32c/m32c-protos.h
===================================================================
--- gcc/config/m32c/m32c-protos.h	(revision 172767)
+++ gcc/config/m32c/m32c-protos.h	(working copy)
@@ -1,5 +1,5 @@
 /* Target Prototypes for R8C/M16C/M32C
-   Copyright (C) 2005, 2007, 2008, 2010
+   Copyright (C) 2005, 2007, 2008, 2010, 2011
    Free Software Foundation, Inc.
    Contributed by Red Hat.
@@ -19,12 +19,9 @@
    along with GCC; see the file COPYING3. If not see
    http://www.gnu.org/licenses/. */
-#define MM enum machine_mode
-#define UINT unsigned int
-
 void m32c_conditional_register_usage (void);
 int m32c_const_ok_for_constraint_p (HOST_WIDE_INT, char, const char *);
-UINT m32c_dwarf_frame_regnum (int);
+unsigned int m32c_dwarf_frame_regnum (int);
 int m32c_eh_return_data_regno (int);
 void m32c_emit_epilogue (void);
 void m32c_emit_prologue (void);
@@ -47,8 +44,8 @@ int m32c_trampoline_size (void);
 #ifdef RTX_CODE
-int m32c_cannot_change_mode_class (MM, MM, int);
-int m32c_class_max_nregs (int, MM);
+int m32c_cannot_change_mode_class (enum machine_mode, enum machine_mode, int);
+int m32c_class_max_nregs (int, enum machine_mode);
 rtx m32c_eh_return_stackadj_rtx (void);
 void m32c_emit_eh_epilogue (rtx);
 int m32c_expand_cmpstr (rtx *);
@@ -60,20 +57,20 @@ void m32c_expand_neg_mulpsi3 (rtx *);
 int m32c_expand_setmemhi (rtx *);
 int m32c_extra_constraint_p (rtx, char, const char *);
 int m32c_extra_constraint_p2 (rtx, char, const char *);
-int m32c_hard_regno_nregs (int, MM);
-int m32c_hard_regno_ok (int, MM);
+int m32c_hard_regno_nregs (int, enum machine_mode);
+int m32c_hard_regno_ok (int, enum machine_mode);
 bool m32c_illegal_subreg_p (rtx);
-bool m32c_immd_dbl_mov (rtx *, MM);
+bool m32c_immd_dbl_mov (rtx *, enum machine_mode);
 rtx m32c_incoming_return_addr_rtx (void);
 int m32c_legitimate_constant_p (rtx);
-int m32c_legitimize_reload_address (rtx *, MM, int, int, int);
Re: [PATCH] use build_function_type_list in the ia64 backend
On Wed, 2011-04-20 at 13:03 -0700, Nathan Froyd wrote:
> On Wed, Apr 20, 2011 at 03:29:19PM -0400, Nathan Froyd wrote:
> > As $SUBJECT suggests. Tested with cross to ia64-linux-gnu. OK to commit?
> >
> > -  ftype = build_function_type (float128_type, void_list_node);
> > -  decl = add_builtin_function (__builtin_infq, ftype,
> > -                               IA64_BUILTIN_INFQ, BUILT_IN_MD,
> > -                               NULL, NULL_TREE);
> > -  ia64_builtins[IA64_BUILTIN_INFQ] = decl;
> > +  ftype = build_function_type_list (float128_type, NULL_TREE);
> > +  add_builtin_function (__builtin_infq, ftype,
> > +                        IA64_BUILTIN_INFQ, BUILT_IN_MD,
> > +                        NULL, NULL_TREE);
>
> Of course, the patch I tested didn't delete the assignment to ia64_builtins. Please disregard that bit.
>
> -Nathan

I am not sure what the patch would look like then. You removed the assignment to decl, so what are you putting in ia64_builtins? Can you send the full correct patch.

Steve Ellcey
s...@cup.hp.com
Ping Re: Don't use linux.h for non-Linux targets
Ping. This patch http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00919.html is pending review. -- Joseph S. Myers jos...@codesourcery.com
Re: Ping^2 Re: Target header etc. cleanup patch
The m32c one is OK
Re: [PATCH] use build_function_type_list in the ia64 backend
On Wed, Apr 20, 2011 at 02:09:49PM -0700, Steve Ellcey wrote:
> I am not sure what the patch would look like then. You removed the assignment to decl, so what are you putting in ia64_builtins? Can you send the full correct patch.

Sure. Updated patch below, which probably looks somewhat more sane.

-Nathan

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 5f22b17..880aa8d 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -10165,7 +10165,7 @@ ia64_init_builtins (void)
   (*lang_hooks.types.register_builtin_type) (float128_type, __float128);
   /* TFmode support builtins. */
-  ftype = build_function_type (float128_type, void_list_node);
+  ftype = build_function_type_list (float128_type, NULL_TREE);
   decl = add_builtin_function (__builtin_infq, ftype,
                                IA64_BUILTIN_INFQ, BUILT_IN_MD,
                                NULL, NULL_TREE);
@@ -10212,13 +10212,13 @@ ia64_init_builtins (void)
                        NULL, NULL_TREE)
   decl = def_builtin (__builtin_ia64_bsp,
-                      build_function_type (ptr_type_node, void_list_node),
-                      IA64_BUILTIN_BSP);
+                      build_function_type_list (ptr_type_node, NULL_TREE),
+                      IA64_BUILTIN_BSP);
   ia64_builtins[IA64_BUILTIN_BSP] = decl;
   decl = def_builtin (__builtin_ia64_flushrs,
-                      build_function_type (void_type_node, void_list_node),
-                      IA64_BUILTIN_FLUSHRS);
+                      build_function_type_list (void_type_node, NULL_TREE),
+                      IA64_BUILTIN_FLUSHRS);
   ia64_builtins[IA64_BUILTIN_FLUSHRS] = decl;
 #undef def_builtin
Re: [PATCH] use build_function_type_list in the ia64 backend
On Wed, 2011-04-20 at 17:25 -0400, Nathan Froyd wrote:
> On Wed, Apr 20, 2011 at 02:09:49PM -0700, Steve Ellcey wrote:
> > I am not sure what the patch would look like then. You removed the assignment to decl, so what are you putting in ia64_builtins? Can you send the full correct patch.
>
> Sure. Updated patch below, which probably looks somewhat more sane.
>
> -Nathan

OK, that looks good.

Steve Ellcey
s...@cup.hp.com
Re: [patch] Split Parse Timevar (issue4378056)
On 04/12/2011 11:49 AM, Lawrence Crowl wrote:
> This patch is available for review at http://codereview.appspot.com/4378056

I tried to comment there, but it didn't seem to be working; looking at the side-by-side diffs didn't show any changes, and double-clicking on a line in the patch form didn't let me add a comment.

> + timevar_start (TV_RESOLVE_OVERLOAD);

Putting this in perform_overload_resolution isn't enough; only a couple of cases of overload resolution actually use it. Any function that calls tourney will also need this.

> +lookup_template_class (tree d1, tree arglist, tree in_decl, tree context,
> + int entering_scope, tsubst_flags_t complain)
> +{
> + tree ret;
> + bool subtime = timevar_cond_start (TV_NAME_LOOKUP);

Let's count this as TV_INSTANTIATE_TEMPLATE instead.

> @@ -17194,7 +17225,7 @@ instantiate_decl (tree d, int defer_ok,
> - timevar_push (TV_PARSE);
> + timevar_push (TV_PARSE_GLOBAL);

This too.

> @@ -1911,7 +1911,7 @@ ggc_collect (void)
> - timevar_push (TV_GC);
> + timevar_start (TV_GC);

Why this change? GC time shouldn't be counted against whatever we happen to be parsing when it happens.

> +DEFTIMEVAR (TV_PHASE_C_WRAPUP_CHECK , phase C wrapup check)
> +DEFTIMEVAR (TV_PHASE_CP_DEFERRED , phase C++ deferred)

Why do these need to be different timevars?

> +DEFTIMEVAR (TV_PARSE_INMETH , parser inl. meth. body)

Is it really important to distinguish this from other functions?

> -DEFTIMEVAR (TV_NAME_LOOKUP , name lookup)
> -DEFTIMEVAR (TV_OVERLOAD , overload resolution)
> -DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, template instantiation)
> +DEFTIMEVAR (TV_INSTANTIATE_TEMPLATE , instantiate template)
> +DEFTIMEVAR (TV_NAME_LOOKUP , |name lookup)
> +DEFTIMEVAR (TV_RESOLVE_OVERLOAD , |overload resolution)

Why these changes?

> @@ -564,6 +564,8 @@ compile_file (void)
> + timevar_start (TV_PHASE_PARSING);

Why does this happen before...

> + timevar_push (TV_PARSE_GLOBAL);

...this? I would think the bits in there should be part of _SETUP.

> @@ -16760,6 +16770,7 @@ cp_parser_class_specifier (cp_parser* parser)
> + timevar_pop (TV_PARSE_STRUCT);
> + timevar_pop (TV_PARSE_STRUCT);
> + timevar_pop (TV_PARSE_STRUCT);
> + timevar_pop (TV_PARSE_STRUCT);

Why not factor this out like you did with so many functions outside the parser?

Jason
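The factoring asked about at the end of the review above can be sketched roughly as follows. This is a toy model, not GCC code: `toy_push`, `toy_pop`, `toy_parse` and `toy_parse_1` are made-up names standing in for `timevar_push`, `timevar_pop` and a parser routine, and the stand-ins only track nesting depth rather than elapsed time. The idea, presumably the same one behind the wrapper shown for `lookup_template_class`, is to move the body with its many exits into a `_1` worker so that a single push/pop pair in the wrapper covers every return path.

```c
#include <assert.h>

/* Toy stand-ins for timevar_push / timevar_pop (assumption: they only
   track nesting depth here, not time).  */
static int toy_depth;

static void
toy_push (void)
{
  toy_depth++;
}

static void
toy_pop (void)
{
  toy_depth--;
}

/* Worker with several exits and no timing calls of its own.  */
static int
toy_parse_1 (int input)
{
  if (input < 0)
    return -1;
  if (input == 0)
    return 0;
  return input * 2;
}

/* Wrapper: one push/pop pair covers every return path of the worker,
   so no exit can forget to pop.  */
static int
toy_parse (int input)
{
  int ret;

  toy_push ();
  ret = toy_parse_1 (input);
  toy_pop ();
  return ret;
}
```

Calling `toy_parse` with any input leaves `toy_depth` where it started, which is the property that four separate `timevar_pop` calls have to maintain by hand.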
Re: Fix PR48703: segfault in mangler due to -g
Hi,

I wrote:
> Basically we have to set assembler names early also for TYPE_DECLs, we can't rely on the frontends langhook to do that after free_lang_data.
>
> Okay for trunk assuming regstrapping on x86_64-linux works?

Patch retracted, doesn't even survive testsuite. The problem is that we can't simply accept TYPE_DECLs for generating assembler names, because the other frontends except C++ can't deal with that (they use the default set_decl_assembler_name hook).

Even conditionalizing on lang_hooks.set_decl_assembler_name == lhd_set_decl_assembler_name doesn't work, because mysteriously for C++ we'll get ICEs in the C++ frontend itself when presented to mangle some TYPE_DECLs (namely when flag_abi_version is set, mangle_decl unconditionally calls make_alias_for, which in turn doesn't work with type_decls).

It's all quite messy and a wonder why -g worked somewhat with -flto at all for so long :-(

Ciao, Michael.
Re: fix memory leak in gengtype
On 04/20/11 15:08, Dimitrios Apostolou wrote:
> Hello list,
>
> while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM killed. That's when I noticed that its RAM usage peaks at 150MB, which is a bit excessive for parsing a ~500K text file.
>
> The attached patch fixes the leak and gengtype now uses a peak of 4MB heap. Hopefully I don't do something wrong, since it took me a while to understand those obstacks...

The code in question creates an obstack, allocates (and grows) a single object on the obstack, then frees the object. This leaks the underlying obstack structure itself and potentially any chunks that were too small to hold the object.

It turns out there's a similar leak in gengtype.c which is fixed in the same way.

A quick valgrind test shows that prior to your change gengtype leaked roughly 200M, after your change it leaks about 1.3M and after fixing gengtype it leaks a little under 300k.

I'll run those changes through the usual tests and check in the changes assuming they pass those tests.

Thanks for the patch!

> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but unfortunately the sparcstation crashed and burned after this, so I can't continue the build and report back :-(

:( My old PA box has similar problems, though it merely overheats before a bootstrap can complete, so in theory I could coax it to finish a bootstrap. Luckily others (particularly John) have stepped in over the last decade and taken excellent care of the PA port.
Jeff
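The leak pattern described in this message can be illustrated with a small sketch. It assumes the GNU obstack interface from glibc's `<obstack.h>` (libiberty ships the same interface); `copy_word` is a hypothetical helper written for this note and is not the gengtype code, and whether the committed fix takes exactly this form is an assumption. The point it shows is that `obstack_free` with the object pointer releases only that object and keeps the chunk, while `obstack_free` with NULL releases every chunk the obstack owns.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* obstack.h expects the user to say how chunks are obtained and released.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: build a NUL-terminated copy of WORD on a
   temporary obstack and return a malloc'ed duplicate of it.  */
static char *
copy_word (const char *word)
{
  struct obstack ob;
  size_t len = strlen (word);
  char *on_obstack;
  char *result;

  obstack_init (&ob);              /* allocates the first chunk */
  obstack_grow (&ob, word, len);   /* append the characters */
  obstack_1grow (&ob, '\0');       /* append the terminator */
  on_obstack = (char *) obstack_finish (&ob);

  result = (char *) malloc (len + 1);
  if (result != NULL)
    memcpy (result, on_obstack, len + 1);

  /* obstack_free (&ob, on_obstack) would release only the object and
     keep the chunk, which is lost once OB goes out of scope.
     Passing NULL releases every chunk owned by the obstack.  */
  obstack_free (&ob, NULL);
  return result;
}
```

A caller that runs `copy_word` once per token then pays only for the returned strings, not for one retained chunk per call.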
Re: [patch testsuite committed] Skip gcc.dg/torture/pr37868.c on sh
On Apr 20, 2011, at 5:22 AM, Kaz Kojima wrote:
> Mike Stump mikest...@comcast.net wrote:
> > I'd pre-approve hoisting these up into the lib/.exp files and checking a generic target requirement... :-)
> >
> > > -/* { dg-skip-if unaligned access { sparc*-*-* } * } */
> > > +/* { dg-skip-if unaligned access { sparc*-*-* sh*-*-* } * } */
>
> I've thought the same thing when reading the recent HP's comment about changes of testcases for avr, but gave up after grepping STRICT_ALIGNMENT in gcc/config/*/*.h.

Oh, I had even less work in mind. If the test makes non-portable assumptions about alignment, just tag it as unportable due to alignment, and then for sparc and sh, just set that flag. As others find other testcases for other machines that suffer the same general problem, they eventually would find and switch to the more maintainable form. As someone did a new port, and saw the testcase fail, they would glance at it, see the non-portable due to alignment, say, yeah, that applies to me as well, and then just add their port to the list. Once they did this, then presto, all the testcases would suddenly turn off, which is what they want to happen.
Re: fix memory leak in gengtype
On Wed, 20 Apr 2011, Jeff Law wrote:
> On 04/20/11 15:08, Dimitrios Apostolou wrote:
> > Hello list,
> >
> > while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM killed. That's when I noticed that its RAM usage peaks at 150MB, which is a bit excessive for parsing a ~500K text file.
> >
> > The attached patch fixes the leak and gengtype now uses a peak of 4MB heap. Hopefully I don't do something wrong, since it took me a while to understand those obstacks...
>
> The code in question creates an obstack, allocates (and grows) a single object on the obstack, then frees the object. This leaks the underlying obstack structure itself and potentially any chunks that were too small to hold the object.

Plus a whole page which is preallocated by the obstack, if I understand correctly. As a result, for each word in the text file we consume 4KB, which are never freed.

> It turns out there's a similar leak in gengtype.c which is fixed in the same way.

Nice, thanks for looking deeper into this, I just stopped when memory utilisation seemed ok.

> A quick valgrind test shows that prior to your change gengtype leaked roughly 200M, after your change it leaks about 1.3M and after fixing gengtype it leaks a little under 300k.
>
> I'll run those changes through the usual tests and check in the changes assuming they pass those tests.
>
> Thanks for the patch!
>
> > P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but unfortunately the sparcstation crashed and burned after this, so I can't continue the build and report back :-(
>
> :( My old PA box has similar problems, though it merely overheats before a bootstrap can complete, so in theory I could coax it to finish a bootstrap. Luckily others (particularly John) have stepped in over the last decade and taken excellent care of the PA port.

If by PA you mean PA-RISC, I remember when I had access to a Visualize C200 with gentoo on. I loved the machine, but it had an important issue: it was absolutely random if it would power up, when pressing the power button. But as long as we never turned it off, it worked ok :-)

Dimitris
Re: [PATCH] use build_function_type_list in the rs6000 backend
On Wed, Apr 20, 2011 at 3:49 PM, Nathan Froyd froy...@codesourcery.com wrote:
> As $SUBJECT suggests. The only tricky part is in builtin_function_type, where we fill in unused args with NULL_TREE so that passing extra arguments to build_function_type_list doesn't matter.
>
> Tested with cross to powerpc-eabi. OK to commit?
>
> -Nathan
>
>	* config/rs6000/rs6000.c (spe_init_builtins): Call
>	build_function_type_list instead of build_function_type.
>	(paired_init_builtins, altivec_init_builtins): Likewise.
>	(builtin_function_type): Likewise.

Okay.

Thanks, David
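The "fill in unused args with NULL_TREE" trick mentioned in the quoted message relies on the argument list being read only up to the first sentinel. A minimal sketch of that behaviour, using a toy model rather than GCC's `tree` type or the real `build_function_type_list`: `toy_type`, `TOY_NULL`, `toy_count_args` and `toy_count_fixed_slots` are all invented for illustration, and the toy only counts argument types instead of building a function type.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-in for a type node; not GCC's tree.  */
typedef const char *toy_type;

/* Sentinel with an explicit pointer type so it is safe to pass
   through a variadic call.  */
#define TOY_NULL ((toy_type) 0)

/* Count the argument types that follow RETURN_TYPE, stopping at the
   first sentinel.  Anything passed after that sentinel is never read.  */
static int
toy_count_args (toy_type return_type, ...)
{
  va_list ap;
  int n = 0;

  (void) return_type;
  va_start (ap, return_type);
  while (va_arg (ap, toy_type) != TOY_NULL)
    n++;
  va_end (ap);
  return n;
}

/* Always pass three slots plus a final sentinel; callers set unused
   slots to TOY_NULL, so the first unused slot ends the list early.  */
static int
toy_count_fixed_slots (toy_type ret, toy_type a0, toy_type a1, toy_type a2)
{
  return toy_count_args (ret, a0, a1, a2, TOY_NULL);
}
```

With one real argument and two unused slots the toy sees a one-argument list, and with the sentinel as the only argument it sees an empty list, which in this model corresponds to the `(float128_type, NULL_TREE)` style of call used in the ia64 patch earlier in this digest.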
Re: Ping^2 Re: Target header etc. cleanup patch
On Apr 20, 2011, at 5:09 PM, Joseph S. Myers wrote:
> Ping^2. This patch http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00130.html is still pending review. This version applies cleanly to current trunk.
> ...

pdp11 is fine. Thanks!

paul
Re: [RFA] [PowerPC]
The test and-1.c has wrong logic. In the formula:

y & ~(y & -y)

the part (y & -y) is always a mask with one bit set, which corresponds to the least significant 1 bit in y. The final result is y with that bit set to zero (y & ~mask). There is no boolean simplification possible, and the compiler always produces a nand instruction.

The formula is equal to y & (y-1); maybe the testcase is testing that?

Segher
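The equivalence stated at the end of this message can be checked directly. A minimal sketch in C, assuming 32-bit unsigned arithmetic so that negation wraps instead of overflowing; the function names are made up for this note and are not taken from and-1.c.

```c
#include <assert.h>
#include <stdint.h>

/* Form discussed in the message: isolate the lowest set bit with
   (y & -y), then clear that bit from y.  */
static uint32_t
clear_lowest_via_mask (uint32_t y)
{
  uint32_t mask = y & -y;   /* unsigned negation wraps, so this is well defined */
  return y & ~mask;
}

/* Shorter form: subtracting 1 flips the lowest set bit and the zeros
   below it, so the AND clears exactly that bit.  */
static uint32_t
clear_lowest_via_decrement (uint32_t y)
{
  return y & (y - 1);
}

/* Return 1 if both forms agree on every 16-bit value and on a few
   values that use the high bits, 0 otherwise.  */
static int
forms_agree (void)
{
  static const uint32_t wide[] = { 0x80000000u, 0xffffffffu, 0xdeadbeefu };
  uint32_t y;
  unsigned int i;

  for (y = 0; y <= 0xffffu; y++)
    if (clear_lowest_via_mask (y) != clear_lowest_via_decrement (y))
      return 0;
  for (i = 0; i < sizeof wide / sizeof wide[0]; i++)
    if (clear_lowest_via_mask (wide[i]) != clear_lowest_via_decrement (wide[i]))
      return 0;
  return 1;
}
```

For y = 12 (binary 1100) the mask is 4, and both forms give 8; for y = 0 both give 0.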