ping^2: [patch] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407)
Hi, [patch update] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407) http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00775.html Message-ID: <20130514191244.ga12...@host2.jankratochvil.net> Thanks, Jan
Re: [c++-concepts] code review
On Wed, Jun 19, 2013 at 9:21 AM, Jason Merrill wrote:
> On 06/18/2013 12:27 PM, Andrew Sutton wrote:
>>
>> There was a bug in instantiation_dependent_expr_r that would cause
>> trait expressions like __is_class(int) to be marked as type dependent.
>> It was always testing the 2nd operand, even for unary traits
>> (NULL_TREE turns out to be type dependent).
>
> I fixed that last month:
>
> 2013-05-20 Jason Merrill
>
> PR c++/57016
> * pt.c (instantiation_dependent_r) [TRAIT_EXPR]: Only check
> type2 if there is one.

The last merge to c++-concepts was a little bit before that (2013-06-16), so the fix wasn't on the branch. As I discussed with Andrew a couple of weeks ago, I have been holding back the merge from trunk because he has this patch series in the queue. That also means we won't get this sort of fix for a while. Maybe I should just go ahead with the merge so that we have conflicts, and potentially less duplication of work in terms of fixing the compiler.

-- Gaby
Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=
On Wed, Jun 19, 2013 at 7:55 PM, Sandra Loosemore wrote: > On 06/19/2013 05:10 PM, Joseph S. Myers wrote: >> >> >> I don't think it's right to depend on the standard version like this. The >> existing semantics for GNU C and C++ follow the memory model for all >> standard versions, and that's the sort of thing that shouldn't depend on >> the target architecture. In the absence of explicit >> -fstrict-volatile-bitfields, semantics conflicting with the memory model >> should only be enabled by one of the --param options to allow data races, >> and not by some default option relating to something in a target ABI >> that's incompatible with the normal language semantics. > > > H. Well, I'm willing to hack up a patch to remove > -fstrict-volatile-bitfields from the defaults for all backends, if it is the > consensus of the GCC community to do that and it unblocks consideration of > the other wrong-code bug fix patches in the series. > > I'm concerned, though, that we should consider the perspective of GCC users > on the affected targets as well as that of a standards committee member. > E.g., suppose I am an ARM user with some code manipulating memory-mapped I/O > registers that was originally developed with an older version of GCC, or > with a different compiler. Maybe it is not even my own code, but something > I got from a chip vendor SDK. People working with such target-specific, > low-level code are far more likely to be familiar with and conforming to > ARM's published guidelines for volatile bit-field access than obscure > details of a language standard that is not even being selected as the > dialect for compiling the code. I think there's a good argument here that > by retroactively applying the C11/C++11 memory model to older standard > versions, GCC has simply broken access to memory-mapped registers on ARM. 
> After all, the AAPCS predates these newer standards and older versions of > GCC at least made an effort to implement the behavior AAPCS required, and if > the pr23623 testcase had been added at the time that issue was originally > resolved back in 2006, the regression would have been caught immediately > when the bitfield range patches were committed. > > I hope that by the time GCC switches to C11 and C++11 as its default > dialects, ARM will have revised its ABI or clarified how this conflict > should be resolved. :-) > > Anyway, what do other people think?

I would rather ARM think about the issues. This affects even AARCH64, where most programs are going to follow the C11/C++11 memory model because of the market being targeted. I think it is a good idea to have ARM resolve the issues before we even consider breaking the C11/C++11 memory model. I know that for Cavium's AARCH64 GCC I am going to turn off -fstrict-volatile-bitfields, even though that violates the ABI, since the option violates the C/C++ standard first. In my mind the C/C++ standard takes precedence over the ABI.

Thanks, Andrew

> > -Sandra >
Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath
On Thu, 2013-06-20 at 00:51 +0200, Torvald Riegel wrote: > On Wed, 2013-06-19 at 14:43 -0500, Peter Bergner wrote: > >> I'm having trouble seeing why/when _ITM_inTransaction() is > >> returning something other than inIrrevocableTransaction. I'll see if I can > >> determine why and will report back. > > > > Ok, we return outsideTransaction because the nesting level (tx->nesting) > > is zero. > > That's a second bug in libitm, sorry. Can you try with the attached > patch additionally to the previous one? Thanks! Ok, with this patch, plus adding a powerpc implementation of htm_transaction_active(), reentrant.c now executes correctly on both HTM and non-HTM hardware for me. Thanks for all of your help with this! I'd still like to hear from Andreas, whether the reentrant.c test case with both patches, now works on S390. I'll note unlike your x86 htm_transaction_active() implementation, my implementation has to check for htm_available() first before executing the htm instruction which tells me whether I'm in transaction state or not, otherwise I'll get a SIGILL on non-HTM hardware. Unfortunately, my htm_available() call uses getauxval() to query the AUXV for a hwcap bit. Is there a place I can stash the result of the first call, so that subsequent calls use the cached value? Normally, I could use a static var, but I doubt that is what we want to do in static inline functions. Peter
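One possible answer to the caching question above, as a minimal sketch only: keep a tri-state flag next to the check and probe lazily. The names probe_htm_support, htm_cached, htm_available_cached and htm_transaction_active_guarded are hypothetical and do not come from libitm; the probe is a stand-in that counts its calls instead of calling getauxval(), so the effect of the cache is visible. Note that a flag declared static in a header gives each translation unit its own copy, which costs at most one extra probe per unit.

```cpp
#include <atomic>

// Hypothetical stand-in for the expensive hwcap query (getauxval() on the
// real target).  It counts its calls so the caching is observable.
static int probe_calls = 0;

static bool probe_htm_support()
{
  ++probe_calls;
  return false;  // pretend this machine has no HTM
}

// -1 = not probed yet, 0 = HTM unavailable, 1 = HTM available.
static std::atomic<int> htm_cached(-1);

static inline bool htm_available_cached()
{
  int v = htm_cached.load(std::memory_order_relaxed);
  if (v < 0)
    {
      // Racing threads may both probe, but they store the same answer,
      // so relaxed ordering is enough for this flag.
      v = probe_htm_support() ? 1 : 0;
      htm_cached.store(v, std::memory_order_relaxed);
    }
  return v != 0;
}

static inline bool htm_transaction_active_guarded()
{
  // Check availability first so the HTM instruction is never executed on
  // hardware that would raise SIGILL.
  if (!htm_available_cached())
    return false;
  // A real port would execute the HTM status instruction here; this
  // sketch has none, so it simply reports "not in a transaction".
  return false;
}
```

With this shape only the first call pays for the query; later calls are a single relaxed load.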
Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=
On 06/19/2013 05:10 PM, Joseph S. Myers wrote:
> I don't think it's right to depend on the standard version like this. The
> existing semantics for GNU C and C++ follow the memory model for all
> standard versions, and that's the sort of thing that shouldn't depend on
> the target architecture. In the absence of explicit
> -fstrict-volatile-bitfields, semantics conflicting with the memory model
> should only be enabled by one of the --param options to allow data races,
> and not by some default option relating to something in a target ABI
> that's incompatible with the normal language semantics.

H. Well, I'm willing to hack up a patch to remove -fstrict-volatile-bitfields from the defaults for all backends, if it is the consensus of the GCC community to do that and it unblocks consideration of the other wrong-code bug fix patches in the series.

I'm concerned, though, that we should consider the perspective of GCC users on the affected targets as well as that of a standards committee member. E.g., suppose I am an ARM user with some code manipulating memory-mapped I/O registers that was originally developed with an older version of GCC, or with a different compiler. Maybe it is not even my own code, but something I got from a chip vendor SDK. People working with such target-specific, low-level code are far more likely to be familiar with and conforming to ARM's published guidelines for volatile bit-field access than obscure details of a language standard that is not even being selected as the dialect for compiling the code. I think there's a good argument here that by retroactively applying the C11/C++11 memory model to older standard versions, GCC has simply broken access to memory-mapped registers on ARM. After all, the AAPCS predates these newer standards and older versions of GCC at least made an effort to implement the behavior AAPCS required, and if the pr23623 testcase had been added at the time that issue was originally resolved back in 2006, the regression would have been caught immediately when the bitfield range patches were committed.

I hope that by the time GCC switches to C11 and C++11 as its default dialects, ARM will have revised its ABI or clarified how this conflict should be resolved. :-)

Anyway, what do other people think?

-Sandra
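To make the conflict discussed above concrete, here is a minimal sketch. The struct device_reg, its member names and set_status are hypothetical illustrations; they are not taken from any patch in this series or from a vendor SDK, and the exact layout depends on the target ABI.

```cpp
#include <cstdint>

// Hypothetical register layout: an 8-bit volatile bit-field declared with
// a 32-bit container type, followed by an ordinary byte that common ABIs
// place inside the same 32-bit word.
struct device_reg
{
  volatile std::uint32_t status : 8;
  unsigned char neighbour;
};

// Store to the bit-field.  As described in the thread, the AAPCS rule
// asks for an access as wide as the container type (32 bits here), which
// would also read and rewrite `neighbour`.  Under the C11/C++11 memory
// model `neighbour` is a separate memory location, so a store to `status`
// is not supposed to write it.  Which access width the compiler emits is
// what -fstrict-volatile-bitfields is about; single-threaded code like
// this observes the same values either way.
static void set_status(device_reg *reg, unsigned value)
{
  reg->status = value & 0xffu;
}
```

The difference only becomes visible when another thread, or the device itself, touches the neighbouring byte between the read and the write of a wide access.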
RE: [PATCH, ARM, iWMMXT] Check IWMMXT_GR_REGS in the SECONDARY_RELOAD MACRO
At 2013-05-24 15:19:36,"Chung-Ju Wu" wrote: > 2013/5/24 Xinyu Qi : > > Hi, > > > > For this simple case, compiled with option -march=iwmmxt -O, #define > > N 64 signed int b[N]; signed long long j[N], d[N]; void foo (void) { > > int i; > > for (i = 0; i < N; i++) > > j[i] = d[i] << b[i]; > > } > > An internal compiler error occurs, > > error: insn does not satisfy its constraints: > > (insn 112 74 75 3 (set (reg:SI 96 wcgr0) > > (mem/c:SI (plus:SI (reg:SI 3 r3 [orig:174 ivtmp.19 ] [174]) > > (reg/f:SI 0 r0 [183])) [0 MEM[symbol: b, index: > ivtmp.19_14, offset: 0B]+0 S4 A32])) {*iwmmxt_movsi_insn} > > (nil)) > > > > The load address of wmmx wcgr register cannot accept the register offset > mode and the reload pass fails to fix it, so that such error happens. > > This issue could be solved by adding check code for IWMMXT_GR_REGS class > in the SECONDARY_RELOAD MACRO if TARGET_IWMMXT. Current code only > check the IWMMXT_REGS group. > > Patch attached with a new test. > > Pass full dejagnu test. No regression. > > > > Is this fix proper? > > OK for trunk? > > > > I cannot approve it. But here are some comments and hope it helps. Hi Chung-Ju, Thanks for your comments:) I fixed the typo you mentioned and regenerated the patch attached. ChangeLog gcc/ 2013-05-24 Xinyu Qi * config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS): Check IWMMXT_GR_REGS register class. (SECONDARY_INPUT_RELOAD_CLASS): Likewise. testsuite/ 2013-05-24 Xinyu Qi * gcc.target/arm/mmx-3.c: New test. Thanks, Xinyu > > > > ChangeLog > > gcc/ > > 2013-05-24 Xinyu Qi > > > > * config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS): > Check IWMMXT_GR_REGS. > > This line just ends at 81 column. > How about this one? > > 2013-05-24 Xinyu Qi > > * config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS): Check > IWMMXT_GR_REGS register class. > (SECONDARY_INPUT_RELOAD_CLASS): Likewise. > > > > > testsuite/ > > 2013-05-24 Xinyu Qi > > > > * gcc.target/arm/mmx-3.c: New test. 
> > > > Index: gcc/config/arm/arm.h > > > > === > > --- gcc/config/arm/arm.h(revision 199090) > > +++ gcc/config/arm/arm.h(working copy) > > @@ -1280,7 +1280,8 @@ > >((TARGET_VFP && TARGET_HARD_FLOAT\ > > && IS_VFP_CLASS (CLASS))\ > > ? coproc_secondary_reload_class (MODE, X, FALSE)\ > > - : (TARGET_IWMMXT && (CLASS) == IWMMXT_REGS)\ > > + : (TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS)\ > > +|| (CLASS) == IWMMXT_GR_REGS)\ > > I think it should be > > + : (TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS\ > +|| (CLASS) == IWMMXT_GR_REGS))\ > > > > ? coproc_secondary_reload_class (MODE, X, TRUE)\ > > : TARGET_32BIT\ > > ? (((MODE) == HImode && ! arm_arch4 && true_regnum (X) == -1) \ > @@ > > -1293,7 +1294,8 @@ > >((TARGET_VFP && TARGET_HARD_FLOAT\ > > && IS_VFP_CLASS (CLASS))\ > > ? coproc_secondary_reload_class (MODE, X, FALSE) :\ > > -(TARGET_IWMMXT && (CLASS) == IWMMXT_REGS) ?\ > > +(TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS\ > > + || (CLASS) == IWMMXT_GR_REGS)) ?\ > > coproc_secondary_reload_class (MODE, X, TRUE) :\ > > (TARGET_32BIT ?\ > > (((CLASS) == IWMMXT_REGS || (CLASS) == IWMMXT_GR_REGS)\ > > It seems that you didn't CC arm maintainer. > Let me do this for you. :) > > > Best regards, > jasonwucj GR_secondary_reload.diff Description: GR_secondary_reload.diff
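The parenthesization comment in the review above can be checked in isolation. This is a minimal sketch, with a hypothetical enum and two hypothetical helper functions standing in for the real register classes and the macro condition; it only shows how the two groupings differ when the target flag is false.

```cpp
// Hypothetical stand-ins for the GCC register classes.
enum reg_class { GENERAL_REGS, IWMMXT_REGS, IWMMXT_GR_REGS };

// Condition as first posted.  "&&" binds tighter than "||", so the early
// closing parenthesis leaves the IWMMXT_GR_REGS test outside the
// TARGET_IWMMXT guard.  The extra parentheses here spell out the grouping
// the compiler applies.
static bool cond_as_posted(bool target_iwmmxt, reg_class c)
{
  return ((target_iwmmxt && (c == IWMMXT_REGS)) || c == IWMMXT_GR_REGS);
}

// Condition as suggested in the review: both class tests are guarded.
static bool cond_as_suggested(bool target_iwmmxt, reg_class c)
{
  return (target_iwmmxt && (c == IWMMXT_REGS || c == IWMMXT_GR_REGS));
}
```

With the target flag false and the class IWMMXT_GR_REGS, the first form is true and the second is false; for every other combination the two agree.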
Re: new port: msp430-elf, revision 3
> A random spotting; copyright header replacement miss, including > but maybe not limited to: Doh! I'll scan them all and fix them. Thanks!
Re: new port: msp430-elf, revision 3
On Wed, 19 Jun 2013, DJ Delorie wrote: > > Third revision, mostly the same as the last, haven't heard any additional > feedback in the last few weeks. Ok to commit yet? > [libgcc] > > * config.host (msp*-*-elf): New. > * config/msp430/: New port. A random spotting; copyright header replacement miss, including but maybe not limited to: > Index: libgcc/config/msp430/srai.S > === > --- libgcc/config/msp430/srai.S (revision 0) > +++ libgcc/config/msp430/srai.S (revision 0) > @@ -0,0 +1,114 @@ > +/* Copyright (c) 2012 Red Hat Incorporated. > + All rights reserved. > + > + Redistribution and use in source and binary forms, with or without > + modification, are permitted provided that the following conditions > + are met: > Index: libgcc/config/msp430/slli.S > === > --- libgcc/config/msp430/slli.S (revision 0) > +++ libgcc/config/msp430/slli.S (revision 0) > @@ -0,0 +1,116 @@ > +/* Copyright (c) 2012 Red Hat Incorporated. > + All rights reserved. brgds, H-P
Re: [C++ Patch] PR 57645
Hi again,

On 06/19/2013 03:37 PM, Paolo Carlini wrote:
> Hi,
>
> when I implemented Core/1123 "Destructors should be noexcept by default",
> unfortunately I caused this regression, present now in mainline and
> 4_8-branch. When the destructor is user provided, with no exception
> specifications, and the type has data members (not bases, those are
> already Ok) with the destructor which can throw, the destructor is
> wrongly deduced to be noexcept. The reason is that
> deduce_noexcept_on_destructors is called from check_bases_and_members
> after check_bases but *before* check_methods and therefore the latter
> does too late work relevant for the deduction, namely possibly setting
> TYPE_HAS_NONTRIVIAL_DESTRUCTOR.

I can confirm that the issue in very general terms is this one, but my patch isn't Ok. Sorry. For example, it doesn't handle correctly the defaulted destructor case. If, in check_bases_and_members, I simply move deduce_noexcept_on_destructors after check_methods and nothing else, all the new testcases are fine + the tests added for Core/1123, but there are regressions, for example for testcases involving virtual destructors, eg, debug/dwarf2/non-virtual-thunk.C.

All in all the issue seems rather nasty to me, I'm afraid I will need some help if we want to quickly make substantive progress on this issue.

Thanks,
Paolo.
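For readers who want to see the case from the report in code, here is a minimal sketch. The type names are hypothetical and this is not one of the testcases attached to the PR; it only encodes the expected C++11 deduction, so a compiler with the regression described above would be expected to reject the first static_assert.

```cpp
#include <utility>

// A type whose destructor is allowed to throw.
struct Thrower
{
  ~Thrower() noexcept(false) {}
};

// The problematic case from the report: user-provided destructor, no
// exception specification, and a data member whose destructor can throw.
struct Holder
{
  Thrower member;
  ~Holder() {}
};

// The base-class case, which the report says was already handled.
struct Derived : Thrower
{
  ~Derived() {}
};

// Nothing that can throw: the destructor is deduced noexcept.
struct Plain
{
  int value;
  ~Plain() {}
};

constexpr bool holder_dtor_noexcept
  = noexcept(std::declval<Holder&>().~Holder());
constexpr bool derived_dtor_noexcept
  = noexcept(std::declval<Derived&>().~Derived());
constexpr bool plain_dtor_noexcept
  = noexcept(std::declval<Plain&>().~Plain());

static_assert(!holder_dtor_noexcept, "member destructor may throw");
static_assert(!derived_dtor_noexcept, "base destructor may throw");
static_assert(plain_dtor_noexcept, "nothing here can throw");
```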
patch to fix PR57604
I hope the following patch fixes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57604 Although I have no specific hardware to check this. The patch also adds a comment about one recent change as it was done in the same function. The patch was successfully bootstrapped and tested on x86/x86-64 and s390x (including building java). Committed as rev. 200227. 2013-06-19 Vladimir Makarov PR bootstrap/57604 * lra.c (emit_add3_insn, emit_add2_insn): New functions. (lra_emit_add): Use the functions. Add comment about Y as an address segment. Index: lra.c === --- lra.c (revision 200174) +++ lra.c (working copy) @@ -242,6 +242,42 @@ lra_delete_dead_insn (rtx insn) lra_set_insn_deleted (insn); } +/* Emit insn x = y + z. Return NULL if we failed to do it. + Otherwise, return the insn. We don't use gen_add3_insn as it might + clobber CC. */ +static rtx +emit_add3_insn (rtx x, rtx y, rtx z) +{ + rtx insn, last; + + last = get_last_insn (); + insn = emit_insn (gen_rtx_SET (VOIDmode, x, +gen_rtx_PLUS (GET_MODE (y), y, z))); + if (recog_memoized (insn) < 0) +{ + delete_insns_since (last); + insn = NULL_RTX; +} + return insn; +} + +/* Emit insn x = x + y. Return the insn. We use gen_add2_insn as the + last resort. */ +static rtx +emit_add2_insn (rtx x, rtx y) +{ + rtx insn; + + insn = emit_add3_insn (x, x, y); + if (insn == NULL_RTX) +{ + insn = gen_add2_insn (x, y); + if (insn != NULL_RTX) + emit_insn (insn); +} + return insn; +} + /* Target checks operands through operand predicates to recognize an insn. We should have a special precaution to generate add insns which are frequent results of elimination. @@ -260,10 +296,10 @@ lra_emit_add (rtx x, rtx y, rtx z) rtx a1, a2, base, index, disp, scale, index_scale; bool ok_p; - insn = gen_add3_insn (x, y, z); + insn = emit_add3_insn (x, y, z); old = max_reg_num (); if (insn != NULL_RTX) -emit_insn (insn); +; else { disp = a2 = NULL_RTX; @@ -306,12 +342,14 @@ lra_emit_add (rtx x, rtx y, rtx z) || (disp != NULL_RTX && ! 
CONSTANT_P (disp)) || (scale != NULL_RTX && ! CONSTANT_P (scale))) { - /* It is not an address generation. Probably we have no 3 op -add. Last chance is to use 2-op add insn. */ + /* Probably we have no 3 op add. Last chance is to use 2-op +add insn. To succeed, don't move Z to X as an address +segment always comes in Y. Otherwise, we might fail when +adding the address segment to register. */ lra_assert (x != y && x != z); emit_move_insn (x, y); - insn = gen_add2_insn (x, z); - emit_insn (insn); + insn = emit_add2_insn (x, z); + lra_assert (insn != NULL_RTX); } else { @@ -322,8 +360,8 @@ lra_emit_add (rtx x, rtx y, rtx z) /* Generate x = index_scale; x = x + base. */ lra_assert (index_scale != NULL_RTX && base != NULL_RTX); emit_move_insn (x, index_scale); - insn = gen_add2_insn (x, base); - emit_insn (insn); + insn = emit_add2_insn (x, base); + lra_assert (insn != NULL_RTX); } else if (scale == NULL_RTX) { @@ -337,14 +375,14 @@ lra_emit_add (rtx x, rtx y, rtx z) delete_insns_since (last); /* Generate x = disp; x = x + base. */ emit_move_insn (x, disp); - insn = gen_add2_insn (x, base); - emit_insn (insn); + insn = emit_add2_insn (x, base); + lra_assert (insn != NULL_RTX); } /* Generate x = x + index. */ if (index != NULL_RTX) { - insn = gen_add2_insn (x, index); - emit_insn (insn); + insn = emit_add2_insn (x, index); + lra_assert (insn != NULL_RTX); } } else @@ -355,16 +393,12 @@ lra_emit_add (rtx x, rtx y, rtx z) ok_p = false; if (recog_memoized (insn) >= 0) { - insn = gen_add2_insn (x, disp); + insn = emit_add2_insn (x, disp); if (insn != NULL_RTX) { - emit_insn (insn); - insn = gen_add2_insn (x, disp); + insn = emit_add2_insn (x, disp); if (insn != NULL_RTX) - { - emit_insn (insn); - ok_p = true; - } + ok_p = true; } } if (! ok_p) @@ -372,10 +406,10 @@ lra_emit_add (rtx
Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=
On Wed, 19 Jun 2013, Sandra Loosemore wrote: > On 06/17/2013 06:02 PM, Sandra Loosemore wrote: > > > > I had another thought: perhaps -fstrict-volatile-bitfields could remain > > the default on targets where it currently is, but it can be overridden > > by an appropriate -std= option. Perhaps also GCC could give an error if > > -fstrict-volatile-bitfields is given explicitly with an incompatible > > -std= option. > > Like this. This patch is intended to be applied on top of the other 5 pieces > in this series, although in theory it's independent of them. OK to commit, > and does this resolve the objection to part 3? I don't think it's right to depend on the standard version like this. The existing semantics for GNU C and C++ follow the memory model for all standard versions, and that's the sort of thing that shouldn't depend on the target architecture. In the absence of explicit -fstrict-volatile-bitfields, semantics conflicting with the memory model should only be enabled by one of the --param options to allow data races, and not by some default option relating to something in a target ABI that's incompatible with the normal language semantics. -- Joseph S. Myers jos...@codesourcery.com
Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath
On Wed, 2013-06-19 at 14:43 -0500, Peter Bergner wrote: > On Wed, 2013-06-19 at 10:57 -0500, Peter Bergner wrote: > > On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote: > > > This is due to the following in _ITM_inTransaction(): > > > > > > 47 if (tx && (tx->nesting > 0)) > > > (gdb) p tx > > > $2 = (GTM::gtm_thread *) 0x10901bf0 > > > (gdb) p tx->nesting > > > $3 = 1 > > > (gdb) step > > > 49 if (tx->state & gtm_thread::STATE_IRREVOCABLE) > > > (gdb) p tx->state > > > $4 = 3 > > > (gdb) p gtm_thread::STATE_IRREVOCABLE > > > $5 = 2 > > > (gdb) step > > > 50return inIrrevocableTransaction; > > > > Bah, ignore this. It's a different call that is returning something other > > than inIrrevocableTransaction. Unfortunately, gdb is having problems inside > > hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is > > returning something other than inIrrevocableTransaction. I'll see if I can > > determine why and will report back. > > Ok, we return outsideTransaction because the nesting level (tx->nesting) > is zero. That's a second bug in libitm, sorry. Can you try with the attached patch additionally to the previous one? Thanks! Torvald commit 02dde6bb91107792fb0cb9f5c4785d25b6aa0e3c Author: Torvald Riegel Date: Thu Jun 20 00:46:59 2013 +0200 libitm: Handle HTM fastpath in status query functions. diff --git a/libitm/config/x86/target.h b/libitm/config/x86/target.h index 77b627f..063c09e 100644 --- a/libitm/config/x86/target.h +++ b/libitm/config/x86/target.h @@ -125,6 +125,13 @@ htm_abort_should_retry (uint32_t begin_ret) { return begin_ret & _XABORT_RETRY; } + +/* Returns true iff a hardware transaction is currently being executed. 
*/ +static inline bool +htm_transaction_active () +{ + return _xtest() != 0; +} #endif diff --git a/libitm/query.cc b/libitm/query.cc index 5707321..0ac3eda 100644 --- a/libitm/query.cc +++ b/libitm/query.cc @@ -43,6 +43,15 @@ _ITM_libraryVersion (void) _ITM_howExecuting ITM_REGPARM _ITM_inTransaction (void) { +#if defined(USE_HTM_FASTPATH) + // If we use the HTM fastpath, we cannot reliably detect whether we are + // in a transaction because this function can be called outside of + // a transaction and thus we can't deduce this by looking at just the serial + // lock. This function isn't used in practice currently, so the easiest + // way to handle it is to just abort. + if (htm_transaction_active()) +htm_abort(); +#endif struct gtm_thread *tx = gtm_thr(); if (tx && (tx->nesting > 0)) { @@ -58,6 +67,11 @@ _ITM_inTransaction (void) _ITM_transactionId_t ITM_REGPARM _ITM_getTransactionId (void) { +#if defined(USE_HTM_FASTPATH) + // See ITM_inTransaction. + if (htm_transaction_active()) +htm_abort(); +#endif struct gtm_thread *tx = gtm_thr(); return (tx && (tx->nesting > 0)) ? tx->id : _ITM_noTransactionId; }
Re: [google gcc-4_8] fix bad merge in r199218
lgtm.

On Wed, Jun 19, 2013 at 3:41 PM, Rong Xu wrote:
> Hi,
>
> This patch fixes a bad merge in r199218.
> Removing cgraph nodes in early-ipa should be allowed.
> Otherwise, we get an ICE in tree-eipa_sra with
> -freorder-functions=callgraph (without -fripa)
>
> Tested with regression tests and Google benchmarks.
>
> Thanks,
>
> -Rong
[google gcc-4_8] fix bad merge in r199218
Hi,

This patch fixes a bad merge in r199218. Removing cgraph nodes in early-ipa should be allowed. Otherwise, we get an ICE in tree-eipa_sra with -freorder-functions=callgraph (without -fripa)

Tested with regression tests and Google benchmarks.

Thanks,

-Rong

patch.diff Description: Binary data
Re: [PATCH] PR57518, RA generated redundant code
+jakub who manages GCC 4.8 releases. David On Wed, Jun 19, 2013 at 2:28 PM, Wei Mi wrote: > Yes, I think so. > > Regards, > Wei. > > On Wed, Jun 19, 2013 at 2:00 PM, Xinliang David Li wrote: >> Should the patch be ported to in 48 branch? >> >> thanks, >> >> David >> >> On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov >> wrote: >>> On 13-06-19 1:23 AM, Wei Mi wrote: Ping. On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi wrote: > > Hi, > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518 > > pr57518 happened because update_equiv_regs in IRA marked a reg > equivalent with a mem, lowered its mem_cost in scan_one_insn, set > NO_REGS to its rclass, but didn't consider the reg was used in > paradoxical subreg which prevented the reg from being replaced by mem > in LRA phase. > > This patch is to check whether a reg is used in a paradoxical subreg > in update_equiv_regs before reg is set as equivalent to a mem. > > bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for > trunk and gcc-4.8 branch? > > >>> Thanks for working on this PR, Wei, and sorry for the delay with the answer >>> (I was on vacation). >>> >>> In general, the PR analysis and the proposed solution looks ok. I only >>> worry that you are adding additional full scan of all RTL code. It might >>> add 0.5% to GCC compilation time if data cache is rewritten (which will >>> happen for moderate size or big functions). It would be nice to do it on >>> some other existing RTL traversing. Unfortunately, this info is calculated >>> later (reg_max_width in reload or biggest_mode in LRA). I am in doubt that >>> other solutions I see now are better: >>> >>> o calculate this info in regstat_... function and store it in reg_info_p >>> o calculate it with update_equiv_regs and use it for invalidation the >>> equiv info later >>> >>> The first one increases reg_info_p footprint and calculation is done many >>> times although it is used once. >>> The second one results in complicated code. 
>>> >>> So I think the current patch is ok to commit. >>> >>> Thanks, again. >>> >>> >>>
Re: [PATCH] PR57518, RA generated redundant code
Yes, I think so. Regards, Wei. On Wed, Jun 19, 2013 at 2:00 PM, Xinliang David Li wrote: > Should the patch be ported to in 48 branch? > > thanks, > > David > > On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov > wrote: >> On 13-06-19 1:23 AM, Wei Mi wrote: >>> >>> Ping. >>> >>> On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi wrote: Hi, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518 pr57518 happened because update_equiv_regs in IRA marked a reg equivalent with a mem, lowered its mem_cost in scan_one_insn, set NO_REGS to its rclass, but didn't consider the reg was used in paradoxical subreg which prevented the reg from being replaced by mem in LRA phase. This patch is to check whether a reg is used in a paradoxical subreg in update_equiv_regs before reg is set as equivalent to a mem. bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for trunk and gcc-4.8 branch? >> Thanks for working on this PR, Wei, and sorry for the delay with the answer >> (I was on vacation). >> >> In general, the PR analysis and the proposed solution looks ok. I only >> worry that you are adding additional full scan of all RTL code. It might >> add 0.5% to GCC compilation time if data cache is rewritten (which will >> happen for moderate size or big functions). It would be nice to do it on >> some other existing RTL traversing. Unfortunately, this info is calculated >> later (reg_max_width in reload or biggest_mode in LRA). I am in doubt that >> other solutions I see now are better: >> >> o calculate this info in regstat_... function and store it in reg_info_p >> o calculate it with update_equiv_regs and use it for invalidation the >> equiv info later >> >> The first one increases reg_info_p footprint and calculation is done many >> times although it is used once. >> The second one results in complicated code. >> >> So I think the current patch is ok to commit. >> >> Thanks, again. >> >> >>
Go patch committed: Check for invalid unsafe.Offsetof
In Go it is not valid to use unsafe.Offsetof with a reference to an embedded field that is accessed via an embedded pointer, because there is no reasonable answer to return in that case. Unfortunately gccgo was returning bogus results for that case. This patch from Rémy Oudompheng fixes it to return an error. There is a test case in the master testsuite that will be imported into gccgo in due course. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.8 branch. Ian diff -r 3d794090fa86 go/expressions.cc --- a/go/expressions.cc Tue Jun 18 16:01:29 2013 -0700 +++ b/go/expressions.cc Wed Jun 19 14:16:32 2013 -0700 @@ -6955,6 +6955,26 @@ return Expression::make_error(loc); } + if (this->code_ == BUILTIN_OFFSETOF) +{ + Expression* arg = this->one_arg(); + Field_reference_expression* farg = arg->field_reference_expression(); + while (farg != NULL) + { + if (!farg->implicit()) + break; + // When the selector refers to an embedded field, + // it must not be reached through pointer indirections. + if (farg->expr()->deref() != farg->expr()) + { + this->report_error(_("argument of Offsetof implies indirection of an embedded field")); + return this; + } + // Go up until we reach the original base. + farg = farg->expr()->field_reference_expression(); + } +} + if (this->is_constant()) { Numeric_constant nc; diff -r 3d794090fa86 go/expressions.h --- a/go/expressions.h Tue Jun 18 16:01:29 2013 -0700 +++ b/go/expressions.h Wed Jun 19 14:16:32 2013 -0700 @@ -1872,7 +1872,7 @@ Field_reference_expression(Expression* expr, unsigned int field_index, Location location) : Expression(EXPRESSION_FIELD_REFERENCE, location), - expr_(expr), field_index_(field_index), called_fieldtrack_(false) + expr_(expr), field_index_(field_index), implicit_(false), called_fieldtrack_(false) { } // Return the struct expression.
Re: [PATCH] PR57518, RA generated redundant code
Should the patch be ported to the 4.8 branch?

thanks,

David

On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov wrote:
> On 13-06-19 1:23 AM, Wei Mi wrote:
>>
>> Ping.
>>
>> On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi wrote:
>>>
>>> Hi,
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518
>>>
>>> pr57518 happened because update_equiv_regs in IRA marked a reg
>>> equivalent with a mem, lowered its mem_cost in scan_one_insn, set
>>> NO_REGS to its rclass, but didn't consider the reg was used in
>>> paradoxical subreg which prevented the reg from being replaced by mem
>>> in LRA phase.
>>>
>>> This patch is to check whether a reg is used in a paradoxical subreg
>>> in update_equiv_regs before reg is set as equivalent to a mem.
>>>
>>> bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for
>>> trunk and gcc-4.8 branch?
>>>
>>>
> Thanks for working on this PR, Wei, and sorry for the delay with the answer
> (I was on vacation).
>
> In general, the PR analysis and the proposed solution looks ok. I only
> worry that you are adding additional full scan of all RTL code. It might
> add 0.5% to GCC compilation time if data cache is rewritten (which will
> happen for moderate size or big functions). It would be nice to do it on
> some other existing RTL traversing. Unfortunately, this info is calculated
> later (reg_max_width in reload or biggest_mode in LRA). I am in doubt that
> other solutions I see now are better:
>
> o calculate this info in regstat_... function and store it in reg_info_p
> o calculate it with update_equiv_regs and use it for invalidation the
> equiv info later
>
> The first one increases reg_info_p footprint and calculation is done many
> times although it is used once.
> The second one results in complicated code.
>
> So I think the current patch is ok to commit.
>
> Thanks, again.
>
>
>
Re: Symtab cleanups 4/17 - ICE in GUPC due to use of init section
On 06/19/13 09:26:30, Gary Funck wrote: > The variable declaration tree node looks about right to me. > However, it never makes it into the output assembler file. > > What is the recommended method for making sure that the > static variable created above is associated with the current > translation unit and that its initialization makes it into > the assembler output file? Adding this call to the upc_create_static_var() routine implemented the necessary binding: pushdecl_top_level (decl); pushdecl_top_level() is defined in c/c-decl.c: /* Record X as belonging to file scope. This is used only internally by the Objective-C front end, and is limited to its needs. duplicate_decls is not called; if there is any preexisting decl for this identifier, it is an ICE. */ tree pushdecl_top_level (tree x) [...] This also required that the temporary variable being created needs to be named (have a non-null DECL_NAME()). This doesn't seem ideal, but does generate the desired code.
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On 19.06.2013 19:46, Jakub Jelinek wrote:
> On Wed, Jun 19, 2013 at 05:29:42PM +0200, Matthias Klose wrote:
>> well, I did fix this assumption last year in gcc.c, then lets fix it in other
>> places too, just adding a mode parameter to the public find_a_file function.
>> Testing the attached patch.
>
> Ok, provided you:
> 1) write proper ChangeLog
> 2) adjust the gcc-ar.c change (because it won't apply cleanly now that
>I've committed the other gcc-ar.c fix
> 3)
>> --- file-find.h (revision 200203)
>> +++ file-find.h (working copy)
>> @@ -38,7 +38,7 @@
>> };
>>
>> extern void find_file_set_debug (bool);
>> -extern char *find_a_file (struct path_prefix *, const char *);
>> +extern char *find_a_file (struct path_prefix *, const char *, int mode);
>
> Remove " mode" above, none of the arguments have names, so adding it
> is both inconsistent and useless.
>
> 4)
>>if (ld_file_name == 0)
>> -ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]);
>> +ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker],
>> X_OK);
>
> This line looks too long now.

Addressed 1-3; the semicolon is in column 79. Committed to the trunk and the 4.8 branch as attached.

Matthias

2013-06-19 Matthias Klose

PR driver/57651
* file-find.h (find_a_file): Add a mode parameter.
* file-find.c (find_a_file): Likewise.
* gcc-ar.c (main): Call find_a_file with R_OK for the plugin, with X_OK for the executables.
* collect2.c (main): Call find_a_file with X_OK.

Index: gcc-ar.c === --- gcc-ar.c(revision 200217) +++ gcc-ar.c(working copy) @@ -136,7 +136,7 @@ setup_prefixes (av[0]); /* Find the GCC LTO plugin */ - plugin = find_a_file (&target_path, LTOPLUGINSONAME); + plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK); if (!plugin) { fprintf (stderr, "%s: Cannot find plugin '%s'\n", av[0], LTOPLUGINSONAME); @@ -144,14 +144,14 @@ } /* Find the wrapped binutils program. 
*/ - exe_name = find_a_file (&target_path, PERSONALITY); + exe_name = find_a_file (&target_path, PERSONALITY, X_OK); if (!exe_name) { const char *real_exe_name = PERSONALITY; #ifdef CROSS_DIRECTORY_STRUCTURE real_exe_name = concat (target_machine, "-", PERSONALITY, NULL); #endif - exe_name = find_a_file (&path, real_exe_name); + exe_name = find_a_file (&path, real_exe_name, X_OK); if (!exe_name) { fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0], Index: file-find.c === --- file-find.c (revision 200217) +++ file-find.c (working copy) @@ -31,7 +31,7 @@ } char * -find_a_file (struct path_prefix *pprefix, const char *name) +find_a_file (struct path_prefix *pprefix, const char *name, int mode) { char *temp; struct prefix_list *pl; @@ -50,7 +50,7 @@ if (IS_ABSOLUTE_PATH (name)) { - if (access (name, X_OK) == 0) + if (access (name, mode) == 0) { strcpy (temp, name); @@ -66,7 +66,7 @@ strcpy (temp, name); strcat (temp, HOST_EXECUTABLE_SUFFIX); - if (access (temp, X_OK) == 0) + if (access (temp, mode) == 0) return temp; #endif @@ -83,7 +83,7 @@ if (stat (temp, &st) >= 0 && ! S_ISDIR (st.st_mode) - && access (temp, X_OK) == 0) + && access (temp, mode) == 0) return temp; #ifdef HOST_EXECUTABLE_SUFFIX @@ -93,7 +93,7 @@ if (stat (temp, &st) >= 0 && ! 
S_ISDIR (st.st_mode) - && access (temp, X_OK) == 0) + && access (temp, mode) == 0) return temp; #endif } Index: file-find.h === --- file-find.h (revision 200217) +++ file-find.h (working copy) @@ -38,7 +38,7 @@ }; extern void find_file_set_debug (bool); -extern char *find_a_file (struct path_prefix *, const char *); +extern char *find_a_file (struct path_prefix *, const char *, int); extern void add_prefix (struct path_prefix *, const char *); extern void prefix_from_env (const char *, struct path_prefix *); extern void prefix_from_string (const char *, struct path_prefix *); Index: collect2.c === --- collect2.c (revision 200217) +++ collect2.c (working copy) @@ -1110,55 +1110,55 @@ if (ld_file_name == 0) #endif #ifdef REAL_LD_FILE_NAME - ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME); + ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME, X_OK); if (ld_file_name == 0) #endif /* Search the (target-specific) compiler dirs for ld'. */ - ld_file_name = find_a_file (&cpath, real_ld_suffix); + ld_file_name = find_a_file (&cpath, real_ld_suffix, X_OK); /* Likewise for `collect-ld'. */ if (ld_file_name == 0) { -
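For readers who want to see the effect of the mode parameter in isolation, here is a minimal sketch. It is not GCC's find_a_file: the helper name find_with_mode, its parameters, and the plain array of directories are all made up for illustration. It only mirrors the stat / S_ISDIR / access test from the patch, with the access mode supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not GCC's find_a_file: look for NAME in each of the
   NDIRS directories in DIRS.  Return 1 and leave the full path in OUT when
   a non-directory entry passes access () with MODE, otherwise return 0.  */
int
find_with_mode (const char *const *dirs, int ndirs, const char *name,
                int mode, char *out, size_t outsize)
{
  struct stat st;
  int i;

  assert (out != NULL && outsize > 0);
  for (i = 0; i < ndirs; i++)
    {
      int len = snprintf (out, outsize, "%s/%s", dirs[i], name);

      if (len < 0 || (size_t) len >= outsize)
        continue;               /* Path did not fit; try the next prefix.  */
      if (stat (out, &st) == 0
          && !S_ISDIR (st.st_mode)
          && access (out, mode) == 0)
        return 1;
    }
  return 0;
}
```

A caller looking for a plugin that only needs to be loaded would pass R_OK, while a caller looking for a program to run would pass X_OK, which is the distinction the ChangeLog entry describes.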
patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=
On 06/17/2013 06:02 PM, Sandra Loosemore wrote:
> I had another thought: perhaps -fstrict-volatile-bitfields could remain
> the default on targets where it currently is, but it can be overridden
> by an appropriate -std= option.  Perhaps also GCC could give an error if
> -fstrict-volatile-bitfields is given explicitly with an incompatible
> -std= option.

Like this. This patch is intended to be applied on top of the other 5 pieces in this series, although in theory it's independent of them. OK to commit, and does this resolve the objection to part 3?

-Sandra

2013-06-19  Sandra Loosemore

        gcc/c-family/
        * c-opts.c (c_common_post_options): Check for conflict between
        -std= and -fstrict-volatile-bitfields.

        gcc/
        * doc/invoke.texi (Code Gen Options): Document what happens when
        -fstrict-volatile-bitfields conflicts with -std=.

Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c       (revision 199963)
+++ gcc/c-family/c-opts.c       (working copy)
@@ -813,6 +813,18 @@ c_common_post_options (const char **pfil
   C_COMMON_OVERRIDE_OPTIONS;
 #endif
 
+  /* C11 and C++11 specify a memory model that is incompatible with
+     -fstrict-volatile-bitfields.  Warn if that option is given explicitly
+     and prevent backends from defaulting to turning it on.  */
+  if (flag_isoc11 || cxx_dialect >= cxx11)
+    {
+      if (flag_strict_volatile_bitfields > 0)
+        warning (0, "-fstrict-volatile-bitfields conflicts with the "
+                 "C11 and C++11 memory model");
+      else
+        flag_strict_volatile_bitfields = 0;
+    }
+
   /* Excess precision other than "fast" requires front-end support.  */
   if (c_dialect_cxx ())
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 199963)
+++ gcc/doc/invoke.texi (working copy)
@@ -20899,4 +20899,12 @@
 AAPCS, @option{-fstrict-volatile-bitfields} is the default.
 
+Note that @option{-fstrict-volatile-bitfields} is incompatible with
+the bit-field access behavior required by the ISO C11 and C++11
+standards.  GCC warns if @option{-fstrict-volatile-bitfields} is given
+explicitly with an incompatible @option{-std=} option.  On targets
+that otherwise default to @option{-fstrict-volatile-bitfields},
+providing an incompatible @option{-std=} option implicitly disables
+@option{-fstrict-volatile-bitfields}.
+
 @item -fsync-libcalls
 @opindex fsync-libcalls
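The decision made by the c-opts.c hunk can be modelled outside the compiler. The sketch below is only an illustration: the struct and function names are hypothetical, and it assumes the option variable uses -1 for "not given on the command line", 0 for off and 1 for on, which is what the "> 0" test in the patch suggests but is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the option state after post-option processing.  */
struct svb_result
{
  int flag;     /* Final value: -1 unset, 0 off, 1 on (assumed encoding).  */
  bool warned;  /* True when the conflict warning would be emitted.  */
};

/* Model of the check: FLAG is the incoming option value and
   C11_MEMORY_MODEL says whether -std= selects C11/C++11 or later.  */
struct svb_result
resolve_strict_volatile_bitfields (int flag, bool c11_memory_model)
{
  struct svb_result r = { flag, false };

  assert (flag >= -1 && flag <= 1);
  if (c11_memory_model)
    {
      if (flag > 0)
        r.warned = true;  /* Explicit request: keep it, but warn.  */
      else
        r.flag = 0;       /* Unset or off: block a later target default.  */
    }
  return r;
}
```

Note that with an older -std= the value is passed through untouched, so a backend that defaults the option on would still do so in that case.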
Re: Document Intel Silvermont support in invoke.texi
> Patch preapproved. Checked into 4.8 branch: http://gcc.gnu.org/ml/gcc-cvs/2013-06/msg00648.html Thanks, K
Re: RFA: Fix rtl-optimization/57425
Quoting Michael Matz : That's not good. You now have different order of parameters between anti_dependence and canon_anti_dependence. That will be mightily confusing, please instead change the caller. Currently these predicates take their arguments in the order of the corresponding instructions, that should better be retained: true_dependence: write-then-(depending)read anti_dependence: read-then-(clobbering)write write_dependence: write-then-(clobbering)write All right, attached is the patch with the arguments in instruction-order. Again, bootstrapped/regtested on i686-pc-linux-gnu . 2013-06-19 Joern Rennecke PR rtl-optimization/57425 PR rtl-optimization/57569 * alias.c (write_dependence_p): Remove parameters mem_mode and canon_mem_addr. Add parameters x_mode, x_addr and x_canonicalized. Changed all callers. (canon_anti_dependence): Get comments and semantics in sync. Add parameter mem_canonicalized. Changed all callers. * rtl.h (canon_anti_dependence): Update prototype. Index: alias.c === --- alias.c (revision 200133) +++ alias.c (working copy) @@ -156,8 +156,9 @@ static int insert_subset_children (splay static alias_set_entry get_alias_set_entry (alias_set_type); static bool nonoverlapping_component_refs_p (const_rtx, const_rtx); static tree decl_for_component_ref (tree); -static int write_dependence_p (const_rtx, enum machine_mode, rtx, const_rtx, - bool, bool); +static int write_dependence_p (const_rtx, + const_rtx, enum machine_mode, rtx, + bool, bool, bool); static void memory_modified_1 (rtx, const_rtx, void *); @@ -2555,20 +2556,22 @@ canon_true_dependence (const_rtx mem, en /* Returns nonzero if a write to X might alias a previous read from (or, if WRITEP is true, a write to) MEM. - If MEM_CANONCALIZED is nonzero, CANON_MEM_ADDR is the canonicalized - address of MEM, and MEM_MODE the mode for that access. */ + If X_CANONCALIZED is true, then X_ADDR is the canonicalized address of X, + and X_MODE the mode for that access. 
+ If MEM_CANONICALIZED is true, MEM is canonicalized. */ static int -write_dependence_p (const_rtx mem, enum machine_mode mem_mode, - rtx canon_mem_addr, const_rtx x, - bool mem_canonicalized, bool writep) +write_dependence_p (const_rtx mem, + const_rtx x, enum machine_mode x_mode, rtx x_addr, + bool mem_canonicalized, bool x_canonicalized, bool writep) { - rtx x_addr, mem_addr; + rtx mem_addr; rtx base; int ret; - gcc_checking_assert (mem_canonicalized ? (canon_mem_addr != NULL_RTX) - : (canon_mem_addr == NULL_RTX && mem_mode == VOIDmode)); + gcc_checking_assert (x_canonicalized + ? (x_addr != NULL_RTX && x_mode != VOIDmode) + : (x_addr == NULL_RTX && x_mode == VOIDmode)); if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem)) return 1; @@ -2593,17 +2596,21 @@ write_dependence_p (const_rtx mem, enum if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x)) return 1; - x_addr = XEXP (x, 0); mem_addr = XEXP (mem, 0); - if (!((GET_CODE (x_addr) == VALUE -&& GET_CODE (mem_addr) != VALUE -&& reg_mentioned_p (x_addr, mem_addr)) - || (GET_CODE (x_addr) != VALUE - && GET_CODE (mem_addr) == VALUE - && reg_mentioned_p (mem_addr, x_addr + if (!x_addr) { - x_addr = get_addr (x_addr); - mem_addr = get_addr (mem_addr); + x_addr = XEXP (x, 0); + if (!((GET_CODE (x_addr) == VALUE +&& GET_CODE (mem_addr) != VALUE +&& reg_mentioned_p (x_addr, mem_addr)) + || (GET_CODE (x_addr) != VALUE + && GET_CODE (mem_addr) == VALUE + && reg_mentioned_p (mem_addr, x_addr + { + x_addr = get_addr (x_addr); + if (!mem_canonicalized) + mem_addr = get_addr (mem_addr); + } } base = find_base_term (mem_addr); @@ -2619,17 +2626,16 @@ write_dependence_p (const_rtx mem, enum GET_MODE (mem))) return 0; - x_addr = canon_rtx (x_addr); - if (mem_canonicalized) -mem_addr = canon_mem_addr; - else + if (!x_canonicalized) { - mem_addr = canon_rtx (mem_addr); - mem_mode = GET_MODE (mem); + x_addr = canon_rtx (x_addr); + x_mode = GET_MODE (x); } + if (!mem_canonicalized) +mem_addr = canon_rtx (mem_addr); - if ((ret = 
memrefs_conflict_p (GET_MODE_SIZE (mem_mode), mem_addr, -SIZE_FOR_MODE (x), x_addr, 0)) != -1) + if ((ret = memrefs_conflict_p (SIZE_FOR_MODE (mem), mem_addr, +GET_MODE_SIZE (x_mode), x_addr, 0)) != -1) return ret; if (nonoverlapping_memrefs_p (x, mem, false)) @@ -2
Re: Unordered container insertion hints
Still no chance to have a look? I think that this patch is a really safe one. Those who do not use a hint won't be impacted. Those who are already using one without any reason might experience a small performance penalty if they have found a way to always use the worst possible hint.

François

On 06/12/2013 10:12 PM, François Dumont wrote:

Hi

Any news regarding this patch?

Thanks

François

On 06/06/2013 10:33 PM, François Dumont wrote:

On 05/24/2013 01:00 AM, Paolo Carlini wrote:

On 05/23/2013 10:01 PM, François Dumont wrote:

Some feedback regarding this patch?

Two quick ones: what if the hint is wrong? I suppose the insertion succeeds anyway, it's only a little waste of time, right?

Right.

Is it possible that for instance something throws in that case and would not now (when the hint is simply ignored)? In case, check and re-check we are still conforming.

I consider the hint only if it is equivalent to the inserted element, so I invoke the equal_to functor for that. The equal_to functor is already invoked at the same location when no hint is given. So usage of the hint has no impact on exception safety.

In any case, I think it's quite easy to notice if an implementation is using the hint in this way or a similar one based on some simple benchmarks, without looking of course at the actual implementation code. Do we have any idea what other implementations are doing? Like, eg, they invented something for unordered_set and map too? Or a better way to exploit the hint for the multi variants?

I only benchmarked the llvm/clang implementation and noticed no difference with or without a hint; I guess it is simply ignored. I have no plans to check or benchmark other implementations. The usage of the hint I am introducing is quite natural considering the new unordered containers data model. And if anyone has a better idea to deal with it then he is welcome to contribute!

Eventually I suppose we want to add a performance testcase to our testsuite.

Good request, and the reason why it took me so long to answer. Writing such a benchmark has shown me that users should be very careful with hints, because they can do more harm than good.

unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint           120r  120u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint      130r  130u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      54r   54u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   36r   36u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            40r   40u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       38r   38u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint       49r   50u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint   34r   35u  0s  6416mem  0pf

The small number represents how many times the same element is inserted and the big one the number of different elements. 100 X 2 means that we loop 100 times inserting the 2 elements during each loop. 2 X 100 means that the main loop is on the elements and we insert each 100 times. Whether or not all the equivalent elements can be inserted at the same time has a major impact on the performance of getting the same result. This is because when a new element is inserted it will be first in its bucket, and the following 99 insertions will benefit from it even without any hint.

The benchmark also shows that a bad hint can be worse than no hint. A bad hint is one that, once used, requires checking that the next bucket is not impacted by the insertion. Doing so requires a hash code computation (if it is not cached, as in my use case) and a check. I have added a word about being able to check performance before using hints.

Here is the result using the default std::hash, where the hash code is cached.

unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint            76r   76u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint       83r   83u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      29r   29u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   24r   23u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            27r   26u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       24r   24u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint
Re: patch to fix PR57559 for s390
Vladimir Makarov writes: > On 13-06-19 2:31 PM, Richard Sandiford wrote: >> Richard Sandiford writes: >>> Vladimir Makarov writes: Index: lra.c === --- lra.c (revision 199753) +++ lra.c (working copy) @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z) || (disp != NULL_RTX && ! CONSTANT_P (disp)) || (scale != NULL_RTX && ! CONSTANT_P (scale))) { -/* Its is not an address generation. Probably we have no 3 op +/* It is not an address generation. Probably we have no 3 op add. Last chance is to use 2-op add insn. */ lra_assert (x != y && x != z); -emit_move_insn (x, z); -insn = gen_add2_insn (x, y); +emit_move_insn (x, y); +insn = gen_add2_insn (x, z); emit_insn (insn); } else >>> Could you add a comment to lra_emit_add saying why it has to be this >>> way round (move y, add z)? >> Ping. > I am going to add a comment when I submit my next patch (it will happen > today or tomorrow). Thanks. > The reason is simple as address segment is stored in y not in z and > generation of addition of address segment to pseudo can fail (that is > what happens for the PR). Do you mean address segment in the x86 sense of "segment"? I was just a bit confused because the current comment says "It is not an address generation", whereas it sounds like addresses are involved somewhere. I suppose the commutation rules are that Y should be "no less complicated" than Z, so maybe it wins from that point of view too. Richard
Re: [PATCH, libfortran]: Initialize result variable (+ other changes)
On Wed, Jun 19, 2013 at 8:27 AM, Tobias Burnus wrote:
>> Attached patch initializes return variable in get_fpu_except_flags.
>> Additionally, it uses __asm__ and __volatile__ consistently, as
>> recommended for header files and unifies a bunch of formatting issues
>> throughout the file.
>
> OK. Thanks for having a second look and improving the file.

Actually, on a third look, there are multiple other issues in this file:

1. "cw_sse &= 0x;" is wrong, since it also clears FTZ, RC and DAZ flags.
2. x87 also needs to clear stalled exception flags, otherwise a stalled flag triggers an exception when the corresponding exception bit is unmasked.
3. fstcw should be used instead of fnstcw, so all pending exceptions are handled (please note that when the sticky exception flag is not cleared in the exception handler, point 2 applies).
4. A lot of code could be simplified in the set_fpu function.

2013-06-19  Uros Bizjak

        * config/fpu-387.h (_FPU_MASK_ALL): New.
        (set_fpu): Use fstcw to store x87 FPU control word.  Use fnclex to
        clear stalled exception flags.  Correctly clear stalled SSE
        exception flags.  Simplify code.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu. I will wait for a day for possible comments. The patch should be committed to all release branches.

Uros.
Index: config/fpu-387.h === --- config/fpu-387.h(revision 200211) +++ config/fpu-387.h(working copy) @@ -96,23 +96,26 @@ has_sse (void) #define _FPU_MASK_UM 0x10 #define _FPU_MASK_PM 0x20 +#define _FPU_MASK_ALL 0x3f + void set_fpu (void) { + int excepts = 0; unsigned short cw; - __asm__ __volatile__ ("fnstcw\t%0" : "=m" (cw)); + __asm__ __volatile__ ("fstcw\t%0" : "=m" (cw)); - cw |= (_FPU_MASK_IM | _FPU_MASK_DM | _FPU_MASK_ZM | _FPU_MASK_OM -| _FPU_MASK_UM | _FPU_MASK_PM); + if (options.fpe & GFC_FPE_INVALID) excepts |= _FPU_MASK_IM; + if (options.fpe & GFC_FPE_DENORMAL) excepts |= _FPU_MASK_DM; + if (options.fpe & GFC_FPE_ZERO) excepts |= _FPU_MASK_ZM; + if (options.fpe & GFC_FPE_OVERFLOW) excepts |= _FPU_MASK_OM; + if (options.fpe & GFC_FPE_UNDERFLOW) excepts |= _FPU_MASK_UM; + if (options.fpe & GFC_FPE_INEXACT) excepts |= _FPU_MASK_PM; - if (options.fpe & GFC_FPE_INVALID) cw &= ~_FPU_MASK_IM; - if (options.fpe & GFC_FPE_DENORMAL) cw &= ~_FPU_MASK_DM; - if (options.fpe & GFC_FPE_ZERO) cw &= ~_FPU_MASK_ZM; - if (options.fpe & GFC_FPE_OVERFLOW) cw &= ~_FPU_MASK_OM; - if (options.fpe & GFC_FPE_UNDERFLOW) cw &= ~_FPU_MASK_UM; - if (options.fpe & GFC_FPE_INEXACT) cw &= ~_FPU_MASK_PM; + cw |= _FPU_MASK_ALL; + cw &= ~excepts; - __asm__ __volatile__ ("fldcw\t%0" : : "m" (cw)); + __asm__ __volatile__ ("fnclex\n\tfldcw\t%0" : : "m" (cw)); if (has_sse()) { @@ -120,22 +123,17 @@ void set_fpu (void) __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse)); - cw_sse &= 0x; - cw_sse |= (_FPU_MASK_IM | _FPU_MASK_DM | _FPU_MASK_ZM | _FPU_MASK_OM -| _FPU_MASK_UM | _FPU_MASK_PM ) << 7; + /* The SSE exception masks are shifted by 7 bits. 
*/ + cw_sse |= _FPU_MASK_ALL << 7; + cw_sse &= ~(excepts << 7); - if (options.fpe & GFC_FPE_INVALID) cw_sse &= ~(_FPU_MASK_IM << 7); - if (options.fpe & GFC_FPE_DENORMAL) cw_sse &= ~(_FPU_MASK_DM << 7); - if (options.fpe & GFC_FPE_ZERO) cw_sse &= ~(_FPU_MASK_ZM << 7); - if (options.fpe & GFC_FPE_OVERFLOW) cw_sse &= ~(_FPU_MASK_OM << 7); - if (options.fpe & GFC_FPE_UNDERFLOW) cw_sse &= ~(_FPU_MASK_UM << 7); - if (options.fpe & GFC_FPE_INEXACT) cw_sse &= ~(_FPU_MASK_PM << 7); + /* Clear stalled exception flags. */ + cw_sse &= ~0x3f; __asm__ __volatile__ ("%vldmxcsr\t%0" : : "m" (cw_sse)); } } - int get_fpu_except_flags (void) {
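The mask arithmetic in the new set_fpu can be checked without touching the hardware. The following sketch is an illustration only: the function names and the FPU_MASK_* spellings are made up, the values for UM, PM and ALL are taken from the patch, and the four lower mask bits are assumed to follow the usual x87 control word layout. It computes the values that would be handed to fldcw and ldmxcsr, and leaves out the inline assembly entirely.

```c
#include <assert.h>

/* Exception mask bits.  UM, PM and ALL match the patch; IM, DM, ZM and OM
   are assumed from the standard x87 control word layout.  */
#define FPU_MASK_IM  0x01
#define FPU_MASK_DM  0x02
#define FPU_MASK_ZM  0x04
#define FPU_MASK_OM  0x08
#define FPU_MASK_UM  0x10
#define FPU_MASK_PM  0x20
#define FPU_MASK_ALL 0x3f

/* Return the x87 control word CW with every exception masked except the
   ones listed in ENABLED.  */
unsigned short
x87_apply_traps (unsigned short cw, unsigned int enabled)
{
  assert ((enabled & ~(unsigned int) FPU_MASK_ALL) == 0);
  cw |= FPU_MASK_ALL;
  cw &= ~enabled;
  return cw;
}

/* Same for MXCSR, where the mask bits sit 7 bits higher.  The low six
   bits are the sticky exception flags and are cleared; all other bits
   (DAZ, rounding control, FTZ) are left alone.  */
unsigned int
sse_apply_traps (unsigned int mxcsr, unsigned int enabled)
{
  assert ((enabled & ~(unsigned int) FPU_MASK_ALL) == 0);
  mxcsr |= FPU_MASK_ALL << 7;
  mxcsr &= ~(enabled << 7);
  mxcsr &= ~0x3fu;
  return mxcsr;
}
```

Keeping the enabled set in one variable and shifting it for MXCSR is the simplification the patch makes; the sketch just isolates that step so it can be exercised with plain integers.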
Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath
On Wed, 2013-06-19 at 10:57 -0500, Peter Bergner wrote: > On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote: > > This is due to the following in _ITM_inTransaction(): > > > > 47if (tx && (tx->nesting > 0)) > > (gdb) p tx > > $2 = (GTM::gtm_thread *) 0x10901bf0 > > (gdb) p tx->nesting > > $3 = 1 > > (gdb) step > > 49if (tx->state & gtm_thread::STATE_IRREVOCABLE) > > (gdb) p tx->state > > $4 = 3 > > (gdb) p gtm_thread::STATE_IRREVOCABLE > > $5 = 2 > > (gdb) step > > 50 return inIrrevocableTransaction; > > Bah, ignore this. It's a different call that is returning something other > than inIrrevocableTransaction. Unfortunately, gdb is having problems inside > hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is > returning something other than inIrrevocableTransaction. I'll see if I can > determine why and will report back. Ok, we return outsideTransaction because the nesting level (tx->nesting) is zero. Peter
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
Patch preapproved. Jakub Hi, Checked into trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-06/msg00646.html Thanks, K
Re: patch to fix PR57559 for s390
On 13-06-19 2:31 PM, Richard Sandiford wrote: Richard Sandiford writes: Vladimir Makarov writes: Index: lra.c === --- lra.c (revision 199753) +++ lra.c (working copy) @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z) || (disp != NULL_RTX && ! CONSTANT_P (disp)) || (scale != NULL_RTX && ! CONSTANT_P (scale))) { - /* Its is not an address generation. Probably we have no 3 op + /* It is not an address generation. Probably we have no 3 op add. Last chance is to use 2-op add insn. */ lra_assert (x != y && x != z); - emit_move_insn (x, z); - insn = gen_add2_insn (x, y); + emit_move_insn (x, y); + insn = gen_add2_insn (x, z); emit_insn (insn); } else Could you add a comment to lra_emit_add saying why it has to be this way round (move y, add z)? Ping. I am going to add a comment when I submit my next patch (it will happen today or tomorrow). The reason is simple as address segment is stored in y not in z and generation of addition of address segment to pseudo can fail (that is what happens for the PR). Thanks, Richard.
Re: [PATCH] PR57518, RA generated redundent code
On 13-06-19 1:23 AM, Wei Mi wrote: Ping. On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi wrote: Hi, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518 pr57518 happened because update_equiv_regs in IRA marked a reg equivalent with a mem, lowered its mem_cost in scan_one_insn, set NO_REGS to its rclass, but didn't consider the reg was used in paradoxical subreg which prevented the reg from being replaced by mem in LRA phase. This patch is to check whether a reg is used in a paradoxical subreg in update_equiv_regs before reg is set as equivalent to a mem. bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for trunk and gcc-4.8 branch? Thanks for working on this PR, Wei, and sorry for the delay with the answer (I was on vacation). In general, the PR analysis and the proposed solution looks ok. I only worry that you are adding additional full scan of all RTL code. It might add 0.5% to GCC compilation time if data cache is rewritten (which will happen for moderate size or big functions). It would be nice to do it on some other existing RTL traversing. Unfortunately, this info is calculated later (reg_max_width in reload or biggest_mode in LRA). I am in doubt that other solutions I see now are better: o calculate this info in regstat_... function and store it in reg_info_p o calculate it with update_equiv_regs and use it for invalidation the equiv info later The first one increases reg_info_p footprint and calculation is done many times although it is used once. The second one results in complicated code. So I think the current patch is ok to commit. Thanks, again.
Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).
On Wed, 2013-06-19 at 14:19 +0100, James Greenhalgh wrote:
> Please let me know if this fixes the performance issues you
> were seeing and if you have any other comments.
>
> FWIW I've bootstrapped and regression tested this version of
> the patch on x86_64 and ARM with no regressions.
>
> Thanks,
> James Greenhalgh

James,

This patch does give me the same performance as my original patch, so that is excellent. While testing it I noticed that the final executable is larger with your patch than with mine. Here are the sizes of the bare-metal executables I created using the same flags I sent you earlier: the first has no switch optimization, the second uses my plugin optimization, and the third uses your latest patch.

I haven't looked into why the size difference between your patch and mine exists; do you see a size difference on your platforms? I am not sure if path threading in general is turned off for -Os, but it probably should be.

% ll -art coremark.fsf*elf
-rwxr-xr-x 1 sellcey src 413812 Jun 19 11:11 coremark.fsf.1.elf
-rwxr-xr-x 1 sellcey src 414676 Jun 19 11:11 coremark.fsf.2.elf
-rwxr-xr-x 1 sellcey src 416402 Jun 19 11:11 coremark.fsf.3.elf

Steve Ellcey
sell...@mips.com
Re: patch to fix PR57559 for s390
Richard Sandiford writes: > Vladimir Makarov writes: >> Index: lra.c >> === >> --- lra.c(revision 199753) >> +++ lra.c(working copy) >> @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z) >>|| (disp != NULL_RTX && ! CONSTANT_P (disp)) >>|| (scale != NULL_RTX && ! CONSTANT_P (scale))) >> { >> - /* Its is not an address generation. Probably we have no 3 op >> + /* It is not an address generation. Probably we have no 3 op >> add. Last chance is to use 2-op add insn. */ >>lra_assert (x != y && x != z); >> - emit_move_insn (x, z); >> - insn = gen_add2_insn (x, y); >> + emit_move_insn (x, y); >> + insn = gen_add2_insn (x, z); >>emit_insn (insn); >> } >>else > > Could you add a comment to lra_emit_add saying why it has to be this > way round (move y, add z)? Ping.
Re: [patch, mips] Fix switch statement for mips16 (PR target/56942)
"Steve Ellcey " writes: > 2013-06-19 Steve Ellcey > > PR target/56942 > * config/mips/mips.md (casesi_internal_mips16_): > Use NEXT_INSN instead of next_real_insn. OK, thanks. Richard
Re: FW: [PATCH] RTEMS: Use strict DWARF-2 on ARM, PowerPC, SPARC
Hello,

sorry, but this patch was intended for review on the rtems-devel list. This patch should not go into GCC at this point.

On 18/06/13 21:04, Rempel, Cynthia wrote:

Hi,

Forwarding this patch to gcc-patches...

Cheers!
Cindy

From: rtems-devel-boun...@rtems.org [rtems-devel-boun...@rtems.org] on behalf of Sebastian Huber [sebastian.hu...@embedded-brains.de]
Sent: Tuesday, June 18, 2013 4:58 AM
To: rtems-de...@rtems.org
Subject: [PATCH] RTEMS: Use strict DWARF-2 on ARM, PowerPC, SPARC

Some debuggers do not cope with the new DWARF3/4 debug format introduced with GCC 4.8. Default to strict DWARF-2 on ARM, PowerPC and SPARC for now.

This patch should be committed to GCC 4.8 and 4.9.

gcc/ChangeLog
2013-06-18  Sebastian Huber

        * config/rtems.c: New.
        * config.gcc (*-*-rtems*): Set extra_objs.
        * config/rtems.h (rtems_override_options): Declare.
        (RTEMS_OVERRIDE_OPTIONS): Define.
        * config/t-rtems (rtems.o): New.
        * config/arm/rtems-eabi.h (SUBTARGET_OVERRIDE_OPTIONS): Define.
        * config/rs6000/rtems.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Define.
        * config/sparc/rtemself.h (SUBTARGET_OVERRIDE_OPTIONS): Define.

[...]

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
Re: [patch, mips] Fix switch statement for mips16 (PR target/56942)
On Wed, Jun 19, 2013 at 6:36 PM, Steve Ellcey wrote: > Steven and Richard, > > I saw the email about the s390 switch statement > > http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01026.html > > and tested this patch on MIPS to see if using NEXT_INSN instead of > next_real_insn fixed PR 56942. It did, so is this the right long > term fix for MIPS? Yes it is. Also for other targets that look for JUMP_TABLE_DATA via next_*_insn. Sorry for not getting the necessary changes in any quicker. I'll try to get things cleaned up a bit next weekend. Ciao! Steven
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On Wed, Jun 19, 2013 at 05:29:42PM +0200, Matthias Klose wrote: > well, I did fix this assumption last year in gcc.c, then lets fix it in other > places too, just adding a mode parameter to the public find_a_file function. > Testing the attached patch. Ok, provided you: 1) write proper ChangeLog 2) adjust the gcc-ar.c change (because it won't apply cleanly now that I've committed the other gcc-ar.c fix 3) > --- file-find.h (revision 200203) > +++ file-find.h (working copy) > @@ -38,7 +38,7 @@ > }; > > extern void find_file_set_debug (bool); > -extern char *find_a_file (struct path_prefix *, const char *); > +extern char *find_a_file (struct path_prefix *, const char *, int mode); Remove " mode" above, none of the arguments have names, so adding it is both inconsistent and useless. 4) >if (ld_file_name == 0) > -ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]); > +ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker], > X_OK); This line looks too long now. Jakub
CALL_INSN_FUNCTION_USAGE fix, PR52773
This is a bug that triggers on m68k. The loop unroller creates a MULT expression and tries to force it into a register, which causes a libcall to be generated. Since args are pushed we create a (use (mem (plus virtual_outgoing_args scratch))) in CALL_INSN_FUNCTION_USAGE. Since we're past vregs, the virtual_outgoing_args rtx survives to reload, which blows up.

Fixed by just using stack_pointer_rtx, since we use a scratch anyway rather than a known offset. I also noticed that we actually add two of these USEs, so I've fixed that as well.

Bootstrapped and tested on x86_64-linux, ok?

Bernd

diff --git a/gcc/calls.c b/gcc/calls.c
index cdab8e0..db38b73 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3603,6 +3603,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
   int reg_parm_stack_space = 0;
   int needed;
   rtx before_call;
+  bool have_push_fusage;
   tree tfom;                   /* type_for_mode (outmode, 0) */
 
 #ifdef REG_PARM_STACK_SPACE
@@ -3956,6 +3957,8 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 
   /* Push the args that need to be pushed.  */
 
+  have_push_fusage = false;
+
   /* ARGNUM indexes the ARGVEC array in the order in which the arguments
      are to be pushed.  */
   for (count = 0; count < nargs; count++, argnum += inc)
@@ -4046,14 +4049,19 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
           if (argblock)
             use = plus_constant (Pmode, argblock,
                                  argvec[argnum].locate.offset.constant);
+          else if (have_push_fusage)
+            continue;
           else
-            /* When arguments are pushed, trying to tell alias.c where
-               exactly this argument is won't work, because the
-               auto-increment causes confusion.  So we merely indicate
-               that we access something with a known mode somewhere on
-               the stack.  */
-            use = gen_rtx_PLUS (Pmode, virtual_outgoing_args_rtx,
-                                gen_rtx_SCRATCH (Pmode));
+            {
+              /* When arguments are pushed, trying to tell alias.c where
+                 exactly this argument is won't work, because the
+                 auto-increment causes confusion.  So we merely indicate
+                 that we access something with a known mode somewhere on
+                 the stack.  */
+              use = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+                                  gen_rtx_SCRATCH (Pmode));
+              have_push_fusage = true;
+            }
           use = gen_rtx_MEM (argvec[argnum].mode, use);
           use = gen_rtx_USE (VOIDmode, use);
           call_fusage = gen_rtx_EXPR_LIST (VOIDmode, use, call_fusage);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr52773.c b/gcc/testsuite/gcc.c-torture/compile/pr52773.c
new file mode 100644
index 000..8daa5ee
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr52773.c
@@ -0,0 +1,16 @@
+/* pr52773.c */
+
+struct s {
+    short x;
+    short _pad[2];
+};
+
+static short mat_a_x;
+
+void transform(const struct s *src, struct s *dst, int n)
+{
+    int i;
+
+    for (i = 0; i < n; ++i)
+        dst[i].x = (src[i].x * mat_a_x) >> 6;
+}
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
On 06/19/2013 10:02 AM, Bernhard Reutner-Fischer wrote: On 19 June 2013 15:57, Jeff Law wrote: On 06/19/2013 01:02 AM, Chung-Ju Wu wrote: 2013/6/19 Jeff Law : * gcc.dg/tree-ssa/forwprop-28.c: New test. In the gnu coding standard we have a space before the open-parentheses. Would that be great to have testcase follow this convention as well? :) If so, then... No reason not to fix the test in this instance. I'll make these updates before committing. eh, nitpicking party ? + If a simplification is mode, return TRUE, else return FALSE. */ +static bool +simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi, s/mode/made/ Fixed via attached patch. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 0ecf5ba..bd60452 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,8 @@ 2013-06-19 Jeff Law + * tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): Fix typo + in comment. + * tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): New function. (simplify_bitwise_binary): Use it to simpify certain binary ops on booleans. diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c index 29a0bb7..df19295 100644 --- a/gcc/tree-ssa-forwprop.c +++ b/gcc/tree-ssa-forwprop.c @@ -1881,7 +1881,7 @@ hoist_conversion_for_bitop_p (tree to, tree from) then we can simplify the two statements into a single LT_EXPR or LE_EXPR when code is BIT_AND_EXPR and BIT_IOR_EXPR respectively. - If a simplification is mode, return TRUE, else return FALSE. */ + If a simplification is made, return TRUE, else return FALSE. */ static bool simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi, enum tree_code code,
Re: [PATCH] PR/57652 collect2 temp files
On Wed, 19 Jun 2013, David Edelsohn wrote: > A 2011 change to collect2 to use the standard diagnostics > infrastructure broke collect2's cleanup of temp files when an error > occurs. This prototype of a patch implements the minimal conversion > of collect2 to use atexit(). > > If this is the right direction, all calls to collect_exit() can be > converted to exit(). > > Thanks, David > > PR driver/57652 > * collect2.c (collect_atexit): New. > (collect_exit): Directly call exit. > (main): Register collect_atexit with atexit. This is OK. Using atexit seems to me to be the right approach for such cleanup. -- Joseph S. Myers jos...@codesourcery.com
[patch, mips] Fix switch statement for mips16 (PR target/56942)
Steven and Richard, I saw the email about the s390 switch statement http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01026.html and tested this patch on MIPS to see if using NEXT_INSN instead of next_real_insn fixed PR 56942. It did, so is this the right long term fix for MIPS? Is it OK to check it in? Since Steven added an assert in tablejump_p, I did not include any here, though I could if we thought it was needed. Steve Ellcey sell...@mips.com 2013-06-19 Steve Ellcey PR target/56942 * config/mips/mips.md (casesi_internal_mips16_): Use NEXT_INSN instead of next_real_insn. diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index ce322d8..b832dda 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -5948,7 +5948,7 @@ (clobber (reg:SI MIPS16_T_REGNUM))] "TARGET_MIPS16_SHORT_JUMP_TABLES" { - rtx diff_vec = PATTERN (next_real_insn (operands[2])); + rtx diff_vec = PATTERN (NEXT_INSN (operands[2])); gcc_assert (GET_CODE (diff_vec) == ADDR_DIFF_VEC);
[PATCH] Replaced the array sizes from hard-coded values to #define
Hello Everyone, This patch replaces the hard-coded array sizes in the array notation test suite functions with a #define. The main reason for doing this is that it will be easier in the future to experiment with different array sizes. In some cases this change was not possible since I am using triplets based on the hard-coded length. I have also increased the array sizes from 10 to around 100 so that we can test with larger array sizes (mainly to see whether any memory overflow occurs in the temporary storage arrays I have created in the compiler). I am checking this patch in as obvious. I am willing to revert this if anyone has any objections. Here are the ChangeLog entries: 2013-06-19 Balaji V. Iyer * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Replaced all the hard-coded values of array sizes with a #define. * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Likewise. * c-c++-common/cilk-plus/AN/builtin_func_double2.c: Likewise. * c-c++-common/cilk-plus/AN/gather_scatter.c: Likewise. * c-c++-common/cilk-plus/AN/pr57577.c: Likewise. * c-c++-common/cilk-plus/AN/sec_implicit_ex.c: Likewise. Thanks, Balaji V. Iyer. diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index be51cb3..723af40 100755 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,13 @@ +2013-06-19 Balaji V. Iyer + + * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Replaced all the + hard-coded values of array sizes with a #define. + * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Likewise. + * c-c++-common/cilk-plus/AN/builtin_func_double2.c: Likewise. + * c-c++-common/cilk-plus/AN/gather_scatter.c: Likewise. + * c-c++-common/cilk-plus/AN/pr57577.c: Likewise. + * c-c++-common/cilk-plus/AN/sec_implicit_ex.c: Likewise. + 2013-06-18 Sriraman Tallam * gcc.target/i386/inline_error.c: New test. 
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c index c5d3d7c..0f066d4 100644 --- a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c +++ b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c @@ -1,6 +1,7 @@ /* { dg-do run } */ /* { dg-options "-fcilkplus" } */ +#define NUMBER 100 #if HAVE_IO #include #endif @@ -18,17 +19,17 @@ double my_func (double x, double y) /* char __sec_reduce_add (int *); */ int main(void) { - int ii,array[10], y = 0, y_int = 0, array2[10]; - double x, yy, array3[10], array4[10]; + int ii,array[NUMBER], y = 0, y_int = 0, array2[NUMBER]; + double x, yy, array3[NUMBER], array4[NUMBER]; double max_value = 0.000, min_value = 0.000, add_value, mul_value = 1.00; int max_index = 0, min_index = 0; - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) { array[ii] = 1+ii; array2[ii]= 2; } - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) { if (ii%2 && ii) array3[ii] = (double)(1./(double)ii); @@ -43,7 +44,7 @@ int main(void) /* Initialize it to the first variable. 
*/ max_value = array3[0] * array4[0]; - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) if (array3[ii] * array4[ii] > max_value) { max_value = array3[ii] * array4[ii]; max_index = ii; @@ -52,7 +53,7 @@ int main(void) #if HAVE_IO - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) printf("%5.3f ", array3[ii] * array4[ii]); printf("\n"); printf("Max = %5.3f\t Max Index = %2d\n", x, y); diff --git a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c index 7c194c2..e01fbb1 100644 --- a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c +++ b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c @@ -1,6 +1,7 @@ /* { dg-do run } */ /* { dg-options "-fcilkplus" } */ +#define NUMBER 100 #if HAVE_IO #include #endif @@ -15,18 +16,18 @@ void my_func (double *x, double y) int main(void) { - int ii,array[10], y = 0, y_int = 0, array2[10]; - double x = 0.000, yy, array3[10], array4[10]; + int ii,array[NUMBER], y = 0, y_int = 0, array2[NUMBER]; + double x = 0.000, yy, array3[NUMBER], array4[NUMBER]; double max_value = 0.000, min_value = 0.000, add_value, mul_value = 1.00; int max_index = 0, min_index = 0; #if 1 - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) { array[ii] = 1+ii; array2[ii]= 2; } - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) { if (ii%2 && ii) array3[ii] = (double)(1./(double)ii); @@ -42,16 +43,16 @@ int main(void) /* Initialize it to the first variable. */ max_value = array3[0] * array4[0]; - for (ii = 0; ii < 10; ii++) + for (ii = 0; ii < NUMBER; ii++) if (array3[ii] * array4[ii] > max_value)
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Jun 19, 2013, at 1:38 AM, Richard Biener wrote: > On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek wrote: >> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote: >>> Right, as you did for other cases. It works here as well. >> >> Patch preapproved. > > I wonder how much code breaks these days when we enable -fno-common by > default? Not much. gcc as Apple shipped it, has always been no-common, and indeed the shared library scheme doesn't like common. There are a few test cases that would need -fcommon, but I don't think that is a big deal. Most oss I think is -fno-common friendly. I think gcc should default to c99, and I think c99 mode (and later) could use -fno-common by default. For pre c99 modes, I'd probably just leave it to the dust bin of history.
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Jun 19, 2013, at 1:44 AM, Jakub Jelinek wrote: > On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote: >> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek wrote: >>> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote: Right, as you did for other cases. It works here as well. >>> >>> Patch preapproved. >> >> I wonder how much code breaks these days when we enable -fno-common by >> default? ... > > Somebody would need to try it ;). Been there done that. That experiment has been running for at least 10 years now… :-)
Re: RFA: Fix rtl-optimization/57425
On Wed, 19 Jun 2013, Joern Rennecke wrote: > > I.e. the arguments after your patch are exactly swapped. This is usually > > harmless, but not always, so that should be corrected before check in. > > The change in cselib.c:cselib_invalidate_mem has the same problem. > > Well, I have already committed the patch, so attached is a patch to fix > things up. > int > > anti_dependence (const_rtx mem, const_rtx x) > > { > ... > int > > -canon_anti_dependence (const_rtx mem, enum machine_mode mem_mode, > > - rtx mem_addr, const_rtx x) > > +canon_anti_dependence (const_rtx x, enum machine_mode x_mode, rtx x_addr, > > + const_rtx mem, bool mem_canonicalized) > { That's not good. You now have different order of parameters between anti_dependence and canon_anti_dependence. That will be mightily confusing, please instead change the caller. Currently these predicates take their arguments in the order of the corresponding instructions, that should better be retained: true_dependence: write-then-(depending)read anti_dependence: read-then-(clobbering)write write_dependence: write-then-(clobbering)write We could change the order of arguments to something else, like first the clobber, then the clobbered, but then that should be done for all the predicates at the same time (and I would suggest to not do it). Ciao, Michael.
Re: Symtab cleanups 4/17 - ICE in GUPC due to use of init section
On 06/18/13 16:37:04, Gary Funck wrote:
> The initialization function is currently generated in tree form in the
> usual way (it will be gimplified when the gimple pass is run).
>
> The code that is being generated is roughly equivalent to:
>
> static void
> __upc_init_decls (void)
> {
>   /* Compiler generated:
>      Initialize data related to UPC shared variables.  */
> }
>
> static void (*const __upc_init_addr) (void)
>   __attribute__ ((section ("upc_init_array"))) = __upc_init_decls;
>

I tried building a variable declaration along the lines of
__upc_init_addr above, by defining this function:

/* Create a static variable of type 'type'.
   This routine mimics the behavior of 'objc_create_temporary_var'
   with the change that it creates a static (file scoped) variable.
   If we continue to need this function, the two implementations
   should be unified.  */
static tree
upc_create_static_var (tree type, const char *name)
{
  tree id = (name != NULL) ? get_identifier (name) : NULL;
  tree decl = build_decl (input_location, VAR_DECL, id, type);
  TREE_USED (decl) = 1;
  TREE_EXTERNAL (decl) = 0;
  TREE_STATIC (decl) = 1;
  TREE_READONLY (decl) = 1;
  TREE_THIS_VOLATILE (decl) = 0;
  TREE_ADDRESSABLE (decl) = 0;
  DECL_PRESERVE_P (decl) = 1;
  DECL_ARTIFICIAL (decl) = 1;
  DECL_IGNORED_P (decl) = 1;
  DECL_CONTEXT (decl) = NULL;
  return decl;
}

and then using it as follows:

  init_func_ptr_type = build_pointer_type (TREE_TYPE (init_func));
  init_func_addr = upc_create_static_var (init_func_ptr_type, NULL);
  DECL_INITIAL (init_func_addr) = build_unary_op (loc, ADDR_EXPR, init_func, 0);
  DECL_SECTION_NAME (init_func_addr) = build_string (
      strlen (UPC_INIT_ARRAY_SECTION_NAME), UPC_INIT_ARRAY_SECTION_NAME);

The variable declaration tree node looks about right to me.
However, it never makes it into the output assembler file.
What is the recommended method for making sure that the static variable created above is associated with the current translation unit and that its initialization makes it into the assembler output file? Thanks, - Gary
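For reference, here is a minimal sketch of the source-level behaviour the generated declaration is meant to have: a function pointer placed in a named section and later walked by a runtime. It does not answer the question about GCC internals. It assumes an ELF target linked with GNU ld or a compatible linker, which defines __start_SECNAME and __stop_SECNAME for sections whose names are valid C identifiers. The names demo_init_decls, demo_init_addr, demo_init_array and run_demo_inits are hypothetical and are not part of GUPC.

```c
#include <assert.h>

typedef void (*init_fn) (void);

/* Counts how often the hypothetical init function has run.  */
int init_calls;

/* Stand-in for the compiler-generated __upc_init_decls.  */
void
demo_init_decls (void)
{
  init_calls++;
}

/* Function pointer stored in a named section.  The 'used' attribute
   keeps the compiler from dropping the otherwise unreferenced variable.  */
static const init_fn demo_init_addr
  __attribute__ ((used, section ("demo_init_array"))) = demo_init_decls;

/* Assumption: the linker provides these bounds because the section
   name is a valid C identifier (GNU ld behaviour on ELF).  */
extern const init_fn __start_demo_init_array[];
extern const init_fn __stop_demo_init_array[];

/* Call every function recorded in the section; return how many ran.  */
int
run_demo_inits (void)
{
  const init_fn *p;
  int n = 0;

  for (p = __start_demo_init_array; p < __stop_demo_init_array; p++)
    {
      (*p) ();
      n++;
    }
  return n;
}
```

If the static variable were dropped before assembly output, as described above, the section would be empty and run_demo_inits would call nothing.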
Re: [c++-concepts] code review
>> +// If the types of the underlying templates match, compare >> +// their constraints. The declarations could differ there. >> +if (types_match) >> + types_match = equivalent_constraints (get_constraints >> (olddecl), >> +current_template_reqs); > > > We can't assume that current_template_reqs will always apply to newdecl > here, as decls_match is called in overload resolution as well. What's the > problem with attaching the requirements to the declaration before we get to > duplicate_decls? It's because newdecl doesn't have a template_info at the point at which this is called, and the constraints are associated through that information. This seems like another good reason for keeping constraints with template decls. Until I change that, I can just test to see if newdecl has template info. If so, I'll use its constraints. If not, I'll use the current requirements. Andrew
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
On 19 June 2013 15:57, Jeff Law wrote: > On 06/19/2013 01:02 AM, Chung-Ju Wu wrote: >> >> 2013/6/19 Jeff Law : >>> >>> >>> * gcc.dg/tree-ssa/forwprop-28.c: New test. >>> >> >> In the gnu coding standard we have a space before >> the open-parentheses. Would that be great to have >> testcase follow this convention as well? :) >> >> If so, then... > > No reason not to fix the test in this instance. I'll make these updates > before committing. eh, nitpicking party ? + If a simplification is mode, return TRUE, else return FALSE. */ +static bool +simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi, s/mode/made/ Sounds nice, otherwise! thanks,
Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath
On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote:
> This is due to the following in _ITM_inTransaction():
>
> 47        if (tx && (tx->nesting > 0))
> (gdb) p tx
> $2 = (GTM::gtm_thread *) 0x10901bf0
> (gdb) p tx->nesting
> $3 = 1
> (gdb) step
> 49            if (tx->state & gtm_thread::STATE_IRREVOCABLE)
> (gdb) p tx->state
> $4 = 3
> (gdb) p gtm_thread::STATE_IRREVOCABLE
> $5 = 2
> (gdb) step
> 50              return inIrrevocableTransaction;

Bah, ignore this. It's a different call that is returning something other
than inIrrevocableTransaction. Unfortunately, gdb is having problems inside
hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is
returning something other than inIrrevocableTransaction. I'll see if I can
determine why and will report back.

Peter
Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath
On Wed, 2013-06-19 at 16:57 +0200, Torvald Riegel wrote:
> (Re-sending to the proper list. Sorry for the noise at gcc@.)
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57643
>
> The HTM fastpath didn't handle a situation in which a relaxed
> transaction executed unsafe code that in turn starts a transaction; it
> simply tried to wait for the "other" transaction, not checking whether
> the current thread started the other transaction.
[snip]
> Peter and/or Andreas: Could you please check that this fixes the bug you
> see on Power/s390? Thanks.

This patch fixed the hang, but now I'm dying due to an abort in the
test case. Specifically, the first abort in unsafe()

int __attribute__((transaction_unsafe)) unsafe(int i)
{
  if (_ITM_inTransaction() != inIrrevocableTransaction)
dying here:    abort();
  __transaction_atomic {
    x++;
  }
  if (_ITM_inTransaction() != inIrrevocableTransaction)
    abort();
  return i+1;
}

This is due to the following in _ITM_inTransaction():

47        if (tx && (tx->nesting > 0))
(gdb) p tx
$2 = (GTM::gtm_thread *) 0x10901bf0
(gdb) p tx->nesting
$3 = 1
(gdb) step
49            if (tx->state & gtm_thread::STATE_IRREVOCABLE)
(gdb) p tx->state
$4 = 3
(gdb) p gtm_thread::STATE_IRREVOCABLE
$5 = 2
(gdb) step
50              return inIrrevocableTransaction;

Peter
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On 19.06.2013 14:10, Jakub Jelinek wrote: > On Wed, Jun 19, 2013 at 02:03:34PM +0200, Matthias Klose wrote: >> On 27.11.2012 19:14, Meador Inge wrote: >>> On 11/26/2012 09:05 AM, Richard Biener wrote: >>> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge wrote: > Ping ^ 4. Ok. >>> >>> Thanks for the review. Committed to trunk. >> >> This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch. >> >> $ gcc-ar-4.8 -h >> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so' >> >> the plugin is *not* installed with x permission flags (which seems to be the >> standard for shared libraries). You did change the code to use find_a_file >> which searches only for files with the x bit set. > > That actually is the standard for shared libraries, the linker creates > libraries with those permissions and libtool/automake installs them that way > too. So if you override this, you need to cope with that decision. Well, I did fix this assumption last year in gcc.c, so let's fix it in other places too, just adding a mode parameter to the public find_a_file function. Testing the attached patch. 
Matthias Index: gcc-ar.c === --- gcc-ar.c(revision 200203) +++ gcc-ar.c(working copy) @@ -136,7 +136,7 @@ setup_prefixes (av[0]); /* Find the GCC LTO plugin */ - plugin = find_a_file (&target_path, LTOPLUGINSONAME); + plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK); if (!plugin) { fprintf (stderr, "%s: Cannot find plugin '%s'\n", av[0], LTOPLUGINSONAME); @@ -151,7 +151,7 @@ const char *cross_exe_name; cross_exe_name = concat (target_machine, "-", PERSONALITY, NULL); - exe_name = find_a_file (&path, cross_exe_name); + exe_name = find_a_file (&path, cross_exe_name, X_OK); if (!exe_name) { fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0], Index: file-find.c === --- file-find.c (revision 200203) +++ file-find.c (working copy) @@ -31,7 +31,7 @@ } char * -find_a_file (struct path_prefix *pprefix, const char *name) +find_a_file (struct path_prefix *pprefix, const char *name, int mode) { char *temp; struct prefix_list *pl; @@ -50,7 +50,7 @@ if (IS_ABSOLUTE_PATH (name)) { - if (access (name, X_OK) == 0) + if (access (name, mode) == 0) { strcpy (temp, name); @@ -66,7 +66,7 @@ strcpy (temp, name); strcat (temp, HOST_EXECUTABLE_SUFFIX); - if (access (temp, X_OK) == 0) + if (access (temp, mode) == 0) return temp; #endif @@ -83,7 +83,7 @@ if (stat (temp, &st) >= 0 && ! S_ISDIR (st.st_mode) - && access (temp, X_OK) == 0) + && access (temp, mode) == 0) return temp; #ifdef HOST_EXECUTABLE_SUFFIX @@ -93,7 +93,7 @@ if (stat (temp, &st) >= 0 && ! 
S_ISDIR (st.st_mode) - && access (temp, X_OK) == 0) + && access (temp, mode) == 0) return temp; #endif } Index: file-find.h === --- file-find.h (revision 200203) +++ file-find.h (working copy) @@ -38,7 +38,7 @@ }; extern void find_file_set_debug (bool); -extern char *find_a_file (struct path_prefix *, const char *); +extern char *find_a_file (struct path_prefix *, const char *, int mode); extern void add_prefix (struct path_prefix *, const char *); extern void prefix_from_env (const char *, struct path_prefix *); extern void prefix_from_string (const char *, struct path_prefix *); Index: collect2.c === --- collect2.c (revision 200203) +++ collect2.c (working copy) @@ -1110,55 +1110,55 @@ if (ld_file_name == 0) #endif #ifdef REAL_LD_FILE_NAME - ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME); +ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME, X_OK); if (ld_file_name == 0) #endif /* Search the (target-specific) compiler dirs for ld'. */ - ld_file_name = find_a_file (&cpath, real_ld_suffix); +ld_file_name = find_a_file (&cpath, real_ld_suffix, X_OK); /* Likewise for `collect-ld'. */ if (ld_file_name == 0) { - ld_file_name = find_a_file (&cpath, collect_ld_suffix); + ld_file_name = find_a_file (&cpath, collect_ld_suffix, X_OK); use_collect_ld = ld_file_name != 0; } /* Search the compiler directories for `ld'. We have protection against recursive calls in find_a_file. */ if (ld_file_name == 0) -ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker]); +ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker], X_OK); /* Search the ordinary system bin directories for `ld' (if native linking) or `TARGET-ld' (if cross). */ if (ld_file_name == 0) -ld_file_name = find_a_file (&path, full_ld_suff
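The distinction the mode parameter relies on can be sketched in isolation: a file installed with permissions 0644, like the plugin in the report above, passes an access() check with R_OK but fails one with X_OK. The helper names probe_file and make_plugin_like_file are hypothetical; this is not the find_a_file implementation, only an illustration of the check it performs on each candidate path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if PATH exists, is not a directory,
   and passes access() with MODE (R_OK, X_OK, ...); otherwise 0.  */
int
probe_file (const char *path, int mode)
{
  struct stat st;

  return (stat (path, &st) == 0
          && !S_ISDIR (st.st_mode)
          && access (path, mode) == 0);
}

/* Create a temporary file with permissions 0644, i.e. readable but
   without any execute bit.  NAME_TEMPLATE must end in "XXXXXX" and is
   overwritten with the generated name.  Returns 0 on success.  */
int
make_plugin_like_file (char *name_template)
{
  int fd = mkstemp (name_template);

  if (fd < 0)
    return -1;
  if (fchmod (fd, 0644) != 0)
    {
      close (fd);
      return -1;
    }
  close (fd);
  return 0;
}
```

With such a file, probe_file (name, R_OK) succeeds while probe_file (name, X_OK) does not, which is why the plugin lookup in the patch passes R_OK and the lookups for executables keep X_OK.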
Re: [patch] Improve debug info for small structures passed by reference
> Especially if it is -O0 only, I don't see why you think so. Just dg-skip-if
> it for -O1+ if you believe it is unreliable for some reason, but if you
> look at the parameter value after the prologue, not showing the right value
> at -O0 would be a serious bug everywhere. Having some GDB testcase also
> doesn't hurt, but having it in GCC testsuite has significant advantages
> over that, most GCC developers don't run GDB testsuite after any changes
> they do.

All right, I've attached a couple of guality testcases (param-1.c for -O0 and
param-2.c for -O1 -fno-var-tracking-assignments, a param-3.c for bare -O1 will
require further adjustments in var-tracking.c). Unfortunately they don't fail
without the patch e.g. on PowerPC, they are reported as unsupported instead
because of "Cannot access memory at address" messages from GDB...

	* gcc.dg/guality/param-1.c: New test.
	* gcc.dg/guality/param-2.c: Likewise.

-- 
Eric Botcazou

/* { dg-do run } */
/* { dg-options "-g" } */
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */

typedef __UINTPTR_TYPE__ uintptr_t;

__attribute__((noinline, noclone)) int
sub (int a, int b)
{
  return a - b;
}

typedef struct { uintptr_t pa; uintptr_t pb; } fatp_t
  __attribute__ ((aligned (2 * __alignof__ (uintptr_t))));

__attribute__((noinline, noclone)) void
foo (fatp_t str, int a, int b)
{
  int i = sub (a, b);
  if (i == 0) /* BREAK */
    i = sub (b, a);
}

int
main (void)
{
  fatp_t ptr = { 31415927, 27182818 };
  foo (ptr, 1, 2);
  return 0;
}

/* { dg-final { gdb-test 20 "str.pa" "31415927" } } */
/* { dg-final { gdb-test 20 "str.pb" "27182818" } } */

/* { dg-do run } */
/* { dg-options "-g -fno-var-tracking-assignments" } */
/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" "-O1" } } */

typedef __UINTPTR_TYPE__ uintptr_t;

__attribute__((noinline, noclone)) int
sub (int a, int b)
{
  return a - b;
}

typedef struct { uintptr_t pa; uintptr_t pb; } fatp_t
  __attribute__ ((aligned (2 * __alignof__ (uintptr_t))));

__attribute__((noinline, noclone)) void
foo (fatp_t str, int a, int b)
{
  int i = sub (a, b);
  if (i == 0) /* BREAK */
    foo (str, a - 1, b);
}

int
main (void)
{
  fatp_t ptr = { 31415927, 27182818 };
  foo (ptr, 1, 2);
  return 0;
}

/* { dg-final { gdb-test 20 "str.pa" "31415927" } } */
/* { dg-final { gdb-test 20 "str.pb" "27182818" } } */
[patch] libitm: Fix handling of reentrancy in the HTM fastpath
(Re-sending to the proper list. Sorry for the noise at gcc@.) http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57643 The HTM fastpath didn't handle a situation in which a relaxed transaction executed unsafe code that in turn starts a transaction; it simply tried to wait for the "other" transaction, not checking whether the current thread started the other transaction. We fix this by doing this check, and if we have the lock, we just continue with the fallback serial-mode path instead of using a HW transaction. The current code won't do the check before starting a HW transaction, but this way we can keep the HTM fastpath unchanged; also, this particular "reentrancy" is probably infrequent in practice, so I suppose the small slowdown shouldn't matter much. Also, I first thought about trying to use the HTM in the reentrancy case, but this doesn't make any sense because other transactions can't run anyway, and we should really just finish the serial-mode transaction as fast as possible. Peter and/or Andreas: Could you please check that this fixes the bug you see on Power/s390? Thanks. Torvald commit 185af84e365e1bae31aea5afd6e67e81f3c32c72 Author: Torvald Riegel Date: Wed Jun 19 16:42:24 2013 +0200 libitm: Fix handling of reentrancy in the HTM fastpath. PR libitm/57643 diff --git a/libitm/beginend.cc b/libitm/beginend.cc index 93e702e..a3bf549 100644 --- a/libitm/beginend.cc +++ b/libitm/beginend.cc @@ -197,6 +197,8 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb) // We are executing a transaction now. // Monitor the writer flag in the serial-mode lock, and abort // if there is an active or waiting serial-mode transaction. + // Note that this can also happen due to an enclosing + // serial-mode transaction; we handle this case below. 
if (unlikely(serial_lock.is_write_locked())) htm_abort(); else @@ -219,6 +221,14 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb) tx = new gtm_thread(); set_gtm_thr(tx); } + // Check whether there is an enclosing serial-mode transaction; + // if so, we just continue as a nested transaction and don't + // try to use the HTM fastpath. This case can happen when an + // outermost relaxed transaction calls unsafe code that starts + // a transaction. + if (tx->nesting > 0) + break; + // Another thread is running a serial-mode transaction. Wait. serial_lock.read_lock(tx); serial_lock.read_unlock(tx); // TODO We should probably reset the retry count t here, unless
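The decision the patch adds can be sketched as a small toy model, which may help when reading the hunk above. This is not libitm code: toy_thread, toy_serial_lock, choose_begin_path and the BEGIN_* names are hypothetical, and the real implementation makes these checks inside and after a hardware transaction rather than in one function.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for gtm_thread and the serial-mode lock.  */
struct toy_thread { int nesting; };
struct toy_serial_lock { const struct toy_thread *writer; };

enum begin_path
{
  BEGIN_HTM,            /* no serial-mode writer: use the HW transaction */
  BEGIN_NESTED_SERIAL,  /* we are already inside a transaction: nest */
  BEGIN_WAIT_AND_RETRY  /* another thread is serial: wait, then retry */
};

/* Pick the path for starting a transaction on thread TX.  Like the
   patch, the reentrancy case is detected via TX's nesting count.  */
enum begin_path
choose_begin_path (const struct toy_thread *tx,
                   const struct toy_serial_lock *lock)
{
  if (lock->writer == NULL)
    return BEGIN_HTM;
  if (tx != NULL && tx->nesting > 0)
    return BEGIN_NESTED_SERIAL;
  return BEGIN_WAIT_AND_RETRY;
}
```

Before the patch, the middle case did not exist, so a thread that held the serial lock itself would wait for a writer that could never finish.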
[PATCH] PR/57652 collect2 temp files
A 2011 change to collect2 to use the standard diagnostics infrastructure broke collect2's cleanup of temp files when an error occurs. This prototype of a patch implements the minimal conversion of collect2 to use atexit(). If this is the right direction, all calls to collect_exit() can be converted to exit(). Thanks, David PR driver/57652 * collect2.c (collect_atexit): New. (collect_exit): Directly call exit. (main): Register collect_atexit with atexit. Index: collect2.c === --- collect2.c (revision 200180) +++ collect2.c (working copy) @@ -367,7 +367,7 @@ /* Delete tempfiles and exit function. */ void -collect_exit (int status) +collect_atexit (void) { if (c_file != 0 && c_file[0]) maybe_unlink (c_file); @@ -395,12 +395,16 @@ maybe_unlink (lderrout); } - if (status != 0 && output_file != 0 && output_file[0]) + if (output_file != 0 && output_file[0]) maybe_unlink (output_file); if (response_file) maybe_unlink (response_file); +} +void +collect_exit (int status) +{ exit (status); } @@ -970,6 +974,9 @@ signal (SIGCHLD, SIG_DFL); #endif + if (atexit (collect_atexit) != 0) +fatal_error ("atexit failed"); + /* Unlock the stdio streams. */ unlock_std_streams ();
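The general pattern behind the patch, registering the cleanup once with atexit() so that every exit() path removes the temp files, can be sketched as follows. This is a minimal illustration and not collect2 code: temp_file, cleanup_temp_file and create_temp_with_cleanup are hypothetical names, and the file name template is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for collect2's temp file name globals.  */
char temp_file[64];

/* Runs on every exit() path once registered, so error paths do not
   need to go through a special exit wrapper to remove temp files.  */
void
cleanup_temp_file (void)
{
  if (temp_file[0])
    unlink (temp_file);
}

/* Create a temp file and arrange for its removal at exit.
   Returns 0 on success, -1 on failure.  */
int
create_temp_with_cleanup (void)
{
  int fd;

  strcpy (temp_file, "/tmp/atexit-demo-XXXXXX");
  fd = mkstemp (temp_file);
  if (fd < 0)
    {
      temp_file[0] = '\0';
      return -1;
    }
  close (fd);
  return atexit (cleanup_temp_file) == 0 ? 0 : -1;
}
```

One consequence of this design, visible in the patch as well, is that the handler no longer sees the exit status, so it cannot condition the removal of the output file on status != 0 the way collect_exit did.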
Re: [PATCH GCC]Fix PR57540, try to choose scaled_offset address mode when expanding array reference
On Tue, Jun 18, 2013 at 10:02 PM, Oleg Endo wrote: > On Tue, 2013-06-18 at 18:09 +0800, Bin.Cheng wrote: >> On Tue, Jun 18, 2013 at 3:52 AM, Oleg Endo wrote: >> > >> > My observation is, that legitimizing addressing modes in the backend by >> > looking at one isolated address works, but doesn't give good results. >> > In the SH backend there is this a particular case with displacement >> > addressing (register + constant). On SH displacements for byte >> > addressing are 0..15, 0..31 for 16 bit words and 0..63 for 32 bit words. >> > sh_legitimize_address (or rather sh_find_mov_disp_adjust) uses a fixed >> > heuristic to satisfy the displacement constraint and splits out a plus >> > insn if needed to adjust the base address. Of course that fixed >> > heuristic doesn't work for some cases and thus sometimes results in >> > unnecessary base address adjustments. >> > I had the idea of replacing the current (partially defunct) auto-inc-dec >> > RTL pass with a more generic addressing mode selection pass: >> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56590 >> > >> > Any suggestions/comments/... are highly appreciated. >> > >> In PR56590, is PR50749 the only one that correlate with auto-inc-dec? >> Others seem just problems of wrong addressing mode. > > Yes, PR 50749 was the initial description of auto-inc-dec defects. PR > 52049 is also related to it, as the code examples there are candidates > for post-inc addressing mode. In that case, if 'int' is replaced with > 'float' on SH post-inc is the optimal mode, because it doesn't have > displacement addressing for FPU loads, except than SH2A. Even then, > using post-inc is better as it has a more compact instruction encoding. > The current auto-inc-dec is not able to discover such opportunities if, > for example, mem accesses are reordered by preceding optimization > passes. > >> And one point on PR50749, auto-inc-dec depends on ivopt to choose >> auto-increment candidate. 
Since you disabled ivopt, I bet GCC will >> miss lots of auto-increment opportunities. > > No, I haven't disabled ivopt. > But -fno-ivopts is specified in PR50749. With current implementation, auto-inc-dec iterates instructions backward, tries to find memory access and increment/decrement pairs. It will miss opportunities if instructions are interfered with each other. Thanks. bin -- Best Regards.
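The fixed heuristic discussed earlier in this thread, splitting an out-of-range offset into a base adjustment plus a legal displacement, can be sketched roughly as below. This is not sh_find_mov_disp_adjust; split_displacement and struct disp_split are hypothetical, and the sketch assumes a 4-bit displacement field scaled by the access size (largest offsets 15, 30 and 60 bytes for 1-, 2- and 4-byte accesses), with non-negative, size-aligned offsets.

```c
#include <assert.h>

/* Result of splitting an offset: add BASE_ADJUST to the base register,
   then use DISP as the displacement in the memory access.  */
struct disp_split
{
  int base_adjust;
  int disp;
};

/* Split byte OFFSET for an access of SIZE bytes (1, 2 or 4).
   Assumes OFFSET >= 0 and OFFSET is a multiple of SIZE, so the
   remainder is always an encodable displacement.  */
struct disp_split
split_displacement (int offset, int size)
{
  struct disp_split r;
  int window = 16 * size;   /* 16 encodable steps of SIZE bytes */

  r.disp = offset % window;
  r.base_adjust = offset - r.disp;
  return r;
}
```

Looking at one address at a time, this gives offset 60 no adjustment and offset 64 an adjustment of 64 for 4-byte accesses, although a single adjusted base (for example base + 4) could serve both. That is the kind of missed opportunity a pass with a wider view of the address sequence could catch.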
Re: RFA: Fix rtl-optimization/57425
> I.e. the arguments after your patch are exactly swapped. This is usually
> harmless, but not always, so that should be corrected before check in.
> The change in cselib.c:cselib_invalidate_mem has the same problem.

Well, I have already committed the patch, so attached is a patch to fix
things up.

Looking at the read MEM canonicalization further, these are obviously
canonicalized in cse.c:invalidate, but this is not so clear in cselib.
I can see some canonicalization going on where things are recorded, but
there is no comment in cselib.h:cselib_val nor
cselib.c:first_containing_mem to say that it's guaranteed that all the
expressions in the locs lists are canonicalized, so I'll assume that's not
a safe assumption, and I've added a mem_canonicalized parameter to
canon_anti_dependence to indicate if MEM (the previously read location)
is canonicalized.

Bootstrapped / regtested on i686-pc-linux-gnu.

2013-06-19  Joern Rennecke

	PR rtl-optimization/57425
	PR rtl-optimization/57569
	* alias.c (write_dependence_p): Remove parameters mem_mode and
	canon_mem_addr.  Add parameters x_mode, x_addr and x_canonicalized.
	Changed all callers.
	(canon_anti_dependence): Get comments and semantics in sync, to be a
	proper drop-in replacement for canon_true_dependence for use by
	cse(lib).  Add parameter mem_canonicalized.  Changed all callers.
	* rtl.h (canon_anti_dependence): Update prototype.
Index: alias.c === --- alias.c (revision 200133) +++ alias.c (working copy) @@ -156,8 +156,9 @@ static int insert_subset_children (splay static alias_set_entry get_alias_set_entry (alias_set_type); static bool nonoverlapping_component_refs_p (const_rtx, const_rtx); static tree decl_for_component_ref (tree); -static int write_dependence_p (const_rtx, enum machine_mode, rtx, const_rtx, - bool, bool); +static int write_dependence_p (const_rtx, + const_rtx, enum machine_mode, rtx, + bool, bool, bool); static void memory_modified_1 (rtx, const_rtx, void *); @@ -2555,20 +2556,22 @@ canon_true_dependence (const_rtx mem, en /* Returns nonzero if a write to X might alias a previous read from (or, if WRITEP is true, a write to) MEM. - If MEM_CANONCALIZED is nonzero, CANON_MEM_ADDR is the canonicalized - address of MEM, and MEM_MODE the mode for that access. */ + If X_CANONCALIZED is nonzero, then X_ADDR is the canonicalized address of X, + and X_MODE the mode for that access. + If MEM_CANONICALIZED is true, MEM is canonicalized. */ static int -write_dependence_p (const_rtx mem, enum machine_mode mem_mode, - rtx canon_mem_addr, const_rtx x, - bool mem_canonicalized, bool writep) +write_dependence_p (const_rtx mem, + const_rtx x, enum machine_mode x_mode, rtx x_addr, + bool mem_canonicalized, bool x_canonicalized, bool writep) { - rtx x_addr, mem_addr; + rtx mem_addr; rtx base; int ret; - gcc_checking_assert (mem_canonicalized ? (canon_mem_addr != NULL_RTX) - : (canon_mem_addr == NULL_RTX && mem_mode == VOIDmode)); + gcc_checking_assert (x_canonicalized + ? 
(x_addr != NULL_RTX && x_mode != VOIDmode) + : (x_addr == NULL_RTX && x_mode == VOIDmode)); if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem)) return 1; @@ -2593,17 +2596,21 @@ write_dependence_p (const_rtx mem, enum if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x)) return 1; - x_addr = XEXP (x, 0); mem_addr = XEXP (mem, 0); - if (!((GET_CODE (x_addr) == VALUE -&& GET_CODE (mem_addr) != VALUE -&& reg_mentioned_p (x_addr, mem_addr)) - || (GET_CODE (x_addr) != VALUE - && GET_CODE (mem_addr) == VALUE - && reg_mentioned_p (mem_addr, x_addr + if (!x_addr) { - x_addr = get_addr (x_addr); - mem_addr = get_addr (mem_addr); + x_addr = XEXP (x, 0); + if (!((GET_CODE (x_addr) == VALUE +&& GET_CODE (mem_addr) != VALUE +&& reg_mentioned_p (x_addr, mem_addr)) + || (GET_CODE (x_addr) != VALUE + && GET_CODE (mem_addr) == VALUE + && reg_mentioned_p (mem_addr, x_addr + { + x_addr = get_addr (x_addr); + if (!mem_canonicalized) + mem_addr = get_addr (mem_addr); + } } base = find_base_term (mem_addr); @@ -2619,17 +2626,16 @@ write_dependence_p (const_rtx mem, enum GET_MODE (mem))) return 0; - x_addr = canon_rtx (x_addr); - if (mem_canonicalized) -mem_addr = canon_mem_addr; - else + if (!x_canonicalized) { - mem_addr = canon_rtx (mem_addr); - mem_mode = GET_MODE (mem); + x_addr = canon_rtx (x_addr); + x_mode = GET_MODE (x); } + if (!mem_ca
Re: [PATCH] Fix up gcc-{ar,nm,ranlib}
On Wed, Jun 19, 2013 at 03:20:33PM +0200, Jakub Jelinek wrote:
> Hi!
>
> If say /usr/bin/gcc-ar doesn't find /usr//bin/ar (and a few others),
> it gives up unless CROSS_DIRECTORY_STRUCTURE, while e.g. collect2 looks for
> ld, nm etc. in $PATH. The collect2.c snippet is:

Looks good. Thanks for fixing.

-Andi
Re: [PATCH] Fix up gcc-{ar,nm,ranlib}
On 19.06.2013 15:20, Jakub Jelinek wrote:
> Here is so far untested attempt to do that in gcc-{ar,nm,ranlib} too, ok if
> bootstrap/regtest passes and testing shows it works (for 4.8 too, in 4.7 it
> worked)?

Works for me, checked with a 4.8 native build and install.

Matthias
Re: [c++-concepts] code review
On 06/18/2013 12:27 PM, Andrew Sutton wrote:
> There was a bug in instantiation_dependent_expr_r that would cause
> trait expressions like __is_class(int) to be marked as type dependent.
> It was always testing the 2nd operand, even for unary traits
> (NULL_TREE turns out to be type dependent).

I fixed that last month:

2013-05-20  Jason Merrill

        PR c++/57016
        * pt.c (instantiation_dependent_r) [TRAIT_EXPR]: Only check
        type2 if there is one.

If you want to keep the is_binary_trait stuff, that's fine, except that

> +extern bool is_binary_trait (cp_trait_kind);
> ...
> +inline bool
> +is_binary_trait (cp_trait_kind k)

violates the rules for inline functions: an inline function must be declared as inline before any uses and defined in all translation units that use it.

> +// reduced terms in the constraints language. Note that conjoining with a
> +// non-null expression with NULL_TREE is an identity operation. That is,

Drop the first "with".

> +// If the types of the underlying templates match, compare
> +// their constraints. The declarations could differ there.
> +if (types_match)
> +  types_match = equivalent_constraints (get_constraints (olddecl),
> +                                         current_template_reqs);

We can't assume that current_template_reqs will always apply to newdecl here, as decls_match is called in overload resolution as well. What's the problem with attaching the requirements to the declaration before we get to duplicate_decls?

Jason
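For anyone reading the archive later, here is a minimal sketch of the inline rule mentioned in this review. It is only an illustration: trait_kind and its enumerators are hypothetical stand-ins for cp_trait_kind, and needs_second_operand is an invented caller, none of which come from the patch. The point is just that the first declaration already carries "inline", so the use that precedes the definition is fine.

```cpp
#include <cassert>

// Hypothetical stand-in for cp_trait_kind; the enumerators are invented
// for this sketch and do not come from the patch under review.
enum trait_kind { TK_IS_CLASS, TK_IS_BASE_OF };

// The first declaration already says "inline", so every later use sees
// an inline function.  In a real project this declaration and the
// definition below would both sit in a header included by every
// translation unit that calls the function.
inline bool is_binary_trait (trait_kind k);

// A (hypothetical) use that appears before the definition.  This is
// fine because the function was declared inline before this point.
inline bool needs_second_operand (trait_kind k)
{
  return is_binary_trait (k);
}

// The definition, visible in the same translation unit as the use.
inline bool is_binary_trait (trait_kind k)
{
  return k == TK_IS_BASE_OF;
}
```

Declaring the function "extern bool ..." first and only later defining it "inline bool ...", as in the quoted hunk, is the ordering this sketch avoids.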
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
On 06/19/2013 01:02 AM, Chung-Ju Wu wrote:
> 2013/6/19 Jeff Law :
>>
>> * gcc.dg/tree-ssa/forwprop-28.c: New test.
>
> In the gnu coding standard we have a space before the open-parentheses.
> Would that be great to have testcase follow this convention as well? :)
>
> If so, then...

No reason not to fix the test in this instance. I'll make these updates before committing.

jeff
[C++ Patch] PR 57645
Hi, when I implemented Core/1123 "Destructors should be noexcept by default", unfortunately I caused this regression, present now in mainline and 4_8-branch. When the destructor is user provided, with no exception specifications, and the type has data members (not bases, those are already Ok) with the destructor which can throw, the destructor is wrongly deduced to be noexcept. The reason is that deduce_noexcept_on_destructors is called from check_bases_and_members after check_bases but *before* check_methods and therefore the latter does too late work relevant for the deduction, namely possibly setting TYPE_HAS_NONTRIVIAL_DESTRUCTOR. My proposal for a fix involves simply anticipating that work as part of deduce_noexcept_on_destructors, renamed now check_destructors, and called unconditionally. Things appear to work fine. Of course different refactorings and naming schemes could make perfect sense. Tested x86_64-linux. Thanks, Paolo. // /cp 2013-06-19 Paolo Carlini PR c++/57645 * class.c (check_methods): Don't set TYPE_HAS_NONTRIVIAL_DESTRUCTOR here... (deduce_noexcept_on_destructors): ... do it here. Rename the function to check_destructors. (check_bases_and_members): Adjust. /testsuite 2013-06-19 Paolo Carlini PR c++/57645 * testsuite/g++.dg/cpp0x/noexcept21.C: New. Index: cp/class.c === --- cp/class.c (revision 200197) +++ cp/class.c (working copy) @@ -4256,11 +4256,6 @@ check_methods (tree t) if (DECL_PURE_VIRTUAL_P (x)) vec_safe_push (CLASSTYPE_PURE_VIRTUALS (t), x); } - /* All user-provided destructors are non-trivial. - Constructors and assignment ops are handled in -grok_special_member_properties. */ - if (DECL_DESTRUCTOR_P (x) && user_provided_p (x)) - TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = 1; } } @@ -4568,8 +4563,12 @@ clone_constructors_and_destructors (tree t) clone_function_decl (OVL_CURRENT (fns), /*update_method_vec_p=*/1); } -/* Deduce noexcept for a destructor DTOR. */ +/* Deduce noexcept for a destructor DTOR. 
+ 12.4/3: A declaration of a destructor that does not have an + exception-specification is implicitly considered to have the + same exception-specification as an implicit declaration (15.4). */ + void deduce_noexcept_on_destructor (tree dtor) { @@ -4584,14 +4583,11 @@ deduce_noexcept_on_destructor (tree dtor) } } -/* For each destructor in T, deduce noexcept: +/* Possibly set TYPE_HAS_NONTRIVIAL_DESTRUCTOR and deduce noexcept for + each destructor. */ - 12.4/3: A declaration of a destructor that does not have an - exception-specification is implicitly considered to have the - same exception-specification as an implicit declaration (15.4). */ - static void -deduce_noexcept_on_destructors (tree t) +check_destructors (tree t) { tree fns; @@ -4601,7 +4597,12 @@ static void return; for (fns = CLASSTYPE_DESTRUCTORS (t); fns; fns = OVL_NEXT (fns)) -deduce_noexcept_on_destructor (OVL_CURRENT (fns)); +{ + if (user_provided_p (OVL_CURRENT (fns))) + TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = 1; + if (cxx_dialect >= cxx0x) + deduce_noexcept_on_destructor (OVL_CURRENT (fns)); +} } /* Subroutine of set_one_vmethod_tm_attributes. Search base classes @@ -5319,10 +5320,10 @@ check_bases_and_members (tree t) check_bases (t, &cant_have_const_ctor, &no_const_asn_ref); - /* Deduce noexcept on destructors. This needs to happen after we've set - triviality flags appropriately for our bases. */ - if (cxx_dialect >= cxx0x) -deduce_noexcept_on_destructors (t); + /* Possibly set TYPE_HAS_NONTRIVIAL_DESTRUCTOR and deduce noexcept on + destructors. This needs to happen after we've set triviality flags + appropriately for our bases. */ + check_destructors (t); /* Check all the method declarations. 
*/ check_methods (t); Index: testsuite/g++.dg/cpp0x/noexcept21.C === --- testsuite/g++.dg/cpp0x/noexcept21.C (revision 0) +++ testsuite/g++.dg/cpp0x/noexcept21.C (working copy) @@ -0,0 +1,28 @@ +// PR c++/57645 +// { dg-do compile { target c++11 } } + +struct Thrower +{ + ~Thrower() noexcept(false) { throw 1; } +}; + +struct ExplicitA +{ + ~ExplicitA() {} + + Thrower t; +}; + +struct ExplicitB +{ + ~ExplicitB(); + + Thrower t; +}; + +ExplicitB::~ExplicitB() { } + +#define SA(X) static_assert(X, #X) + +SA( !noexcept(ExplicitA()) ); +SA( !noexcept(ExplicitB()) );
[PATCH] Fix up gcc-{ar,nm,ranlib}
Hi! If say /usr/bin/gcc-ar doesn't find /usr//bin/ar (and a few others), it gives up unless CROSS_DIRECTORY_STRUCTURE, while e.g. collect2 looks for ld, nm etc. in $PATH. The collect2.c snippet is: /* Search the compiler directories for `ld'. We have protection against recursive calls in find_a_file. */ if (ld_file_name == 0) ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker]); /* Search the ordinary system bin directories for `ld' (if native linking) or `TARGET-ld' (if cross). */ if (ld_file_name == 0) ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]); where the difference between full_ld_suffixes and ld_suffixes is exactly a concat (target_machine, "-", ld_suffixes[xxx], NULL); Here is so far untested attempt to do that in gcc-{ar,nm,ranlib} too, ok if bootstrap/regtest passes and testing shows it works (for 4.8 too, in 4.7 it worked)? 2013-06-19 Jakub Jelinek * gcc-ar.c (main): If not CROSS_DIRECTORY_STRUCTURE, look for PERSONALITY in $PATH derived prefixes. --- gcc/gcc-ar.c.jj 2013-01-11 09:02:55.0 +0100 +++ gcc/gcc-ar.c2013-06-19 15:09:08.314935157 +0200 @@ -147,21 +147,17 @@ main(int ac, char **av) exe_name = find_a_file (&target_path, PERSONALITY); if (!exe_name) { + const char *real_exe_name = PERSONALITY; #ifdef CROSS_DIRECTORY_STRUCTURE - const char *cross_exe_name; - - cross_exe_name = concat (target_machine, "-", PERSONALITY, NULL); - exe_name = find_a_file (&path, cross_exe_name); + real_exe_name = concat (target_machine, "-", PERSONALITY, NULL); +#endif + exe_name = find_a_file (&path, real_exe_name); if (!exe_name) { fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0], - cross_exe_name); + real_exe_name); exit (1); } -#else - fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0], PERSONALITY); - exit (1); -#endif } /* Create new command line with plugin */ Jakub
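The lookup order described in this mail can be sketched roughly as follows. This is a stand-alone illustration and not the gcc-ar.c code: find_in_dirs and locate_tool are hypothetical names, the file-existence check is injected as a callback instead of calling find_a_file, and std::optional means it assumes C++17.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: return the first "dir/name" accepted by EXISTS.
// It stands in for find_a_file, whose real behaviour is not modelled.
inline std::optional<std::string>
find_in_dirs (const std::vector<std::string> &dirs, const std::string &name,
              const std::function<bool (const std::string &)> &exists)
{
  for (const std::string &dir : dirs)
    {
      std::string candidate = dir + "/" + name;
      if (exists (candidate))
        return candidate;
    }
  return std::nullopt;
}

// Sketch of the patched search: try the compiler's own directories
// first, then fall back to the $PATH-derived directories.  Only a cross
// configuration prefixes the tool name with the target triplet.
inline std::optional<std::string>
locate_tool (const std::vector<std::string> &target_dirs,
             const std::vector<std::string> &path_dirs,
             const std::string &personality,
             const std::string &target_machine, bool cross,
             const std::function<bool (const std::string &)> &exists)
{
  if (auto hit = find_in_dirs (target_dirs, personality, exists))
    return hit;
  const std::string real_name
    = cross ? target_machine + "-" + personality : personality;
  return find_in_dirs (path_dirs, real_name, exists);
}
```

With the native fallback in place, a wrapper that cannot find "ar" next to the compiler still picks up /usr/bin/ar, which is the case the patch is meant to cover.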
Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).
> -----Original Message-----
> From: Steve Ellcey [mailto:sell...@mips.com]
> Sent: 14 June 2013 19:07
>
> With my version the compiler calls gimple_duplicate_sese_region from
> duplicate_blocks 60 times. With your patch it calls
> gimple_duplicate_sese_region from duplicate_thread_path 16 times.
>

Hi Steve,

You are quite right. With -finline-limit=1000 I see a big difference in performance.

The cause of this is the code added to tree-ssa-threadedge.c (simplify_control_stmt_condition). If we have failed to simplify to a gimple_min_invariant, we want to look for thread paths to the value given by gimple_goto_dest, rather than the SSA_NAME_VALUE of that value.

This improves the performance on my x86_64 toolchain to the same level as your patch. I see "Registered 20 jump paths" printed 3 times in dom1, for a total of 60 thread paths.

I've also fixed another couple of bugs I spotted, improved logging of results and added the parameters that were in your patch.

I did investigate changing the search strategy back to yours, but I saw no impact on the thread paths found.

Please let me know if this fixes the performance issues you were seeing and if you have any other comments.

FWIW I've bootstrapped and regression tested this version of the patch on x86_64 and ARM with no regressions.

Thanks,
James Greenhalgh

---
Changes from v1:
---
gcc/

2013-06-19  James Greenhalgh

        * params.def (PARAM_MAX_THREAD_PATH_INSNS): New.
        (PARAM_THREAD_PATHS): Likewise.
        * tree-ssa-threadedge.c (simplify_control_stmt_condition): If we
        can't simplify cond, return it unmodified.
        (max_insn_count): Do not initialize.
        (max_path_count): Likewise.
        (find_control_statement_thread_paths): Catch case where path has
        already been computed (thus no further path exists), add sanity
        checking.
        (thread_across_edge): Initialize max_{insn, path}_count;
        * tree-ssa-threadupdate.c (duplicate_thread_path): Add sanity check,
        logging.
        (thread_through_all_blocks): Thread paths, even if no
        threaded_edges were found.
(register_thread_paths): Improve logging. --- Changelog: gcc/ 2013-06-19 James Greenhalgh PR tree-optimization/54742 * params.def (PARAM_MAX_THREAD_PATH_INSNS): New. (PARAM_THREAD_PATHS): Likewise. * tree.cfg (gimple_duplicate_sese_region): Memoize loop latch and loop header blocks if copying across a latch/header. * tree-flow.h (thread_paths): New struct. (register_thread_paths): Declare. * tree-ssa-threadedge.c (simplify_control_stmt_condition): Permit returning something not in gimple_min_invariant form. (max_insn_count): Declare. (max_path_count): Likewise. (find_thread_path_1): New function. (find_thread_path): Likewise. (save_new_thread_path): Likewise. (find_control_statement_thread_paths): Likewise. (thread_across_edge): Handle non gimple_min_invariant cases. * tree-ssa-threadupdate.c (thread_paths_vec): New. (remove_edge_from_thread_candidates): New function. (duplicate_thread_path): Likewise. (copy_control_statement_thread_paths): Likewise. (thread_through_all_blocks): Handle thread_paths. (register_thread_paths): New function. diff --git a/gcc/params.def b/gcc/params.def index 3c52651..25d36a6 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -123,6 +123,19 @@ DEFPARAM (PARAM_PARTIAL_INLINING_ENTRY_PROBABILITY, "Maximum probability of the entry BB of split region (in percent relative to entry BB of the function) to make partial inlining happen", 70, 0, 0) +/* Maximum number of instructions to copy when duplicating blocks + on a jump thread path. */ +DEFPARAM (PARAM_MAX_THREAD_PATH_INSNS, + "max-thread-path-insns", + "Maximum number of instructions to copy when duplicating blocks on a jump thread path", + 100, 1, 99) + +/* Maximum number of jump thread paths to duplicate. */ +DEFPARAM (PARAM_MAX_THREAD_PATHS, + "max-thread-paths", + "Maximum number of new jump thread paths to create", + 50, 1, 99) + /* Limit the number of expansions created by the variable expansion optimization to avoid register pressure. 
*/ DEFPARAM (PARAM_MAX_VARIABLE_EXPANSIONS, diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 4b91a35..6dcd2e4 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -5717,10 +5717,12 @@ gimple_duplicate_sese_region (edge entry, edge exit, { unsigned i; bool free_region_copy = false, copying_header = false; + bool save_loop_details = false; struct loop *loop = entry->dest->loop_father; edge exit_copy; vec doms; edge redirected; + int memo_loop_header_no = 0, memo_loop_latch_no = 0; int total_freq = 0, entry_freq = 0; gcov_type total_count = 0, entry_count = 0; @@ -5738,9 +5740,15 @@ gimple_duplicate_sese_region (edge entry, edge exit, if (re
Re: Go patch committed: Use function descriptors
On Wed, Jun 19, 2013 at 2:19 AM, Matthias Klose wrote:
>
> so this did change the soname for libgo to 5 on the trunk, and to 4 on the
> branch. We had this discussion before, and then decided to revert this kind of
> change on the 4.7 branch. This time the release notes had a hint that the Go
> support would be updated to v1.1 in a bug fix release, so maybe it is ok. Will
> this the only soname bump on the way to Go 1.1 support, or are there more
> changes/version bumps planned on this way?

Yes, exactly, I've been promising Go 1.1 support on the 4.8 branch, so I think this change is necessary. The change in type layout makes the library incompatible for functions that take function arguments. Sorry for not calling it out.

This is the only soname bump planned.

Ian
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
On Wed, Jun 19, 2013 at 6:08 AM, Jeff Law wrote: > > The notable changes since the last version: > > First, it should properly handle signed single bit types, though I haven't > tested it with real code. > > Second, the transformation is only applied when the result is used in a > conditional. Thus it's much less likely to pessimize targets with and-not > instructions as it's highly likely we'll eliminate two gimple statements > rather than just one. > > > Other comments (such as not needing to retrieve gsi_stmt) were also > addressed. Testcase was renamed, but is otherwise unchanged. > > Bootstrapped and regression tested on x86_64-unknown-linux-gnu. > > OK for the trunk? Ok. Thanks, Richard. > > * tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): New > function. > (simplify_bitwise_binary): Use it to simpify certain binary ops on > booleans. > > * gcc.dg/tree-ssa/forwprop-28.c: New test. > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c > new file mode 100644 > index 000..2c42065 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c > @@ -0,0 +1,76 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-forwprop1" } */ > + > +extern char * frob (void); > +extern _Bool testit(void); > + > +test (int code) > +{ > + char * temp = frob();; > + int rotate = (code == 22); > + if (temp == 0 && !rotate) > + oof(); > +} > + > +test_2 (int code) > +{ > + char * temp = frob(); > + int rotate = (code == 22); > + if (!rotate && temp == 0) > + oof(); > +} > + > + > +test_3 (int code) > +{ > + char * temp = frob(); > + int rotate = (code == 22); > + if (!rotate || temp == 0) > + oof(); > +} > + > + > +test_4 (int code) > +{ > + char * temp = frob(); > + int rotate = (code == 22); > + if (temp == 0 || !rotate) > + oof(); > +} > + > + > +test_5 (int code) > +{ > + _Bool temp = testit();; > + _Bool rotate = (code == 22); > + if (temp == 0 && !rotate) > + oof(); > +} > + > +test_6 (int code) > 
+{ > + _Bool temp = testit(); > + _Bool rotate = (code == 22); > + if (!rotate && temp == 0) > + oof(); > +} > + > + > +test_7 (int code) > +{ > + _Bool temp = testit(); > + _Bool rotate = (code == 22); > + if (!rotate || temp == 0) > + oof(); > +} > + > + > +test_8 (int code) > +{ > + _Bool temp = testit(); > + _Bool rotate = (code == 22); > + if (temp == 0 || !rotate) > + oof(); > +} > + > +/* { dg-final { scan-tree-dump-times "Replaced" 8 "forwprop1"} } */ > diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c > index c6a7eaf..29a0bb7 100644 > --- a/gcc/tree-ssa-forwprop.c > +++ b/gcc/tree-ssa-forwprop.c > @@ -1870,6 +1870,52 @@ hoist_conversion_for_bitop_p (tree to, tree from) >return false; > } > > +/* GSI points to a statement of the form > + > + result = OP0 CODE OP1 > + > + Where OP0 and OP1 are single bit SSA_NAMEs and CODE is either > + BIT_AND_EXPR or BIT_IOR_EXPR. > + > + If OP0 is fed by a bitwise negation of another single bit SSA_NAME, > + then we can simplify the two statements into a single LT_EXPR or LE_EXPR > + when code is BIT_AND_EXPR and BIT_IOR_EXPR respectively. > + > + If a simplification is mode, return TRUE, else return FALSE. */ > +static bool > +simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi, > +enum tree_code code, > +tree op0, tree op1) > +{ > + gimple op0_def_stmt = SSA_NAME_DEF_STMT (op0); > + > + if (!is_gimple_assign (op0_def_stmt) > + || (gimple_assign_rhs_code (op0_def_stmt) != BIT_NOT_EXPR)) > +return false; > + > + tree x = gimple_assign_rhs1 (op0_def_stmt); > + if (TREE_CODE (x) == SSA_NAME > + && INTEGRAL_TYPE_P (TREE_TYPE (x)) > + && TYPE_PRECISION (TREE_TYPE (x)) == 1 > + && TYPE_UNSIGNED (TREE_TYPE (x)) == TYPE_UNSIGNED (TREE_TYPE (op1))) > +{ > + enum tree_code newcode; > + > + gimple stmt = gsi_stmt (*gsi); > + gimple_assign_set_rhs1 (stmt, x); > + gimple_assign_set_rhs2 (stmt, op1); > + if (code == BIT_AND_EXPR) > + newcode = TYPE_UNSIGNED (TREE_TYPE (x)) ? 
LT_EXPR : GT_EXPR; > + else > + newcode = TYPE_UNSIGNED (TREE_TYPE (x)) ? LE_EXPR : GE_EXPR; > + gimple_assign_set_rhs_code (stmt, newcode); > + update_stmt (stmt); > + return true; > +} > + return false; > + > +} > + > /* Simplify bitwise binary operations. > Return true if a transformation applied, otherwise return false. */ > > @@ -2117,8 +2163,44 @@ simplify_bitwise_binary (gimple_stmt_iterator *gsi) > return true; > } > } > -} > > + /* If arg1 and arg2 are booleans (or any single bit type) > + then try to simplify: > + > + (~X & Y) -> X < Y > + (X & ~Y) -> Y < X > + (~X | Y) -> X <= Y > + (X | ~Y) -> Y <= X > + > + But only do this if our res
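As a sanity check on the identities the quoted patch relies on, the following sketch exhaustively tests them for single-bit operands. It is not part of the patch: single_bit_identities_hold is a hypothetical name, and the single-bit types are modelled with ordinary integers restricted to {0, 1} (unsigned) and {0, -1} (signed), which is an assumption of this sketch.

```cpp
#include <cassert>

// Return true if, for every pair of single-bit values,
//   unsigned:  (~X & Y) != 0  <=>  X <  Y    and  (~X | Y) != 0  <=>  X <= Y
//   signed:    (~X & Y) != 0  <=>  X >  Y    and  (~X | Y) != 0  <=>  X >= Y
// This mirrors the LT/LE versus GT/GE choice made on TYPE_UNSIGNED.
inline bool single_bit_identities_hold ()
{
  // Unsigned single-bit values are 0 and 1; mask the negation to one bit.
  for (unsigned x = 0; x <= 1; ++x)
    for (unsigned y = 0; y <= 1; ++y)
      {
        unsigned not_x = ~x & 1u;
        if (((not_x & y) != 0) != (x < y))
          return false;
        if (((not_x | y) != 0) != (x <= y))
          return false;
      }

  // Signed single-bit values are 0 and -1, so the comparison flips.
  const int values[] = { 0, -1 };
  for (int x : values)
    for (int y : values)
      {
        if (((~x & y) != 0) != (x > y))
          return false;
        if (((~x | y) != 0) != (x >= y))
          return false;
      }
  return true;
}
```

The signed half is why the new function picks GT_EXPR/GE_EXPR when the type is not unsigned: with values 0 and -1 the "true" value is the smaller one.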
Re: [PATCH, rs6000] power8 patches, patch #9, power8 scheduling
On Fri, Jun 7, 2013 at 3:22 PM, Pat Haugen wrote:
> This patch adds instruction scheduling support for the Power8 processor.
> Bootstrap/regression test with no new failures. Ok for trunk?
>
>
> 2013-06-07  Michael Meissner
>             Pat Haugen
>             Peter Bergner
>
> * config/rs6000/power8.md: New.
> * config/rs6000/rs6000-cpus.def (RS6000_CPU table): Adjust processor
> setting for power8 entry.
> * config/rs6000/t-rs6000 (MD_INCLUDES): Add power8.md.
> * config/rs6000/rs6000.c (is_microcoded_insn, is_cracked_insn): Adjust
> test for Power4/Power5 only.
> (insn_must_be_first_in_group, insn_must_be_last_in_group): Add Power8
> support.
> (force_new_group): Adjust comment.
> * config/rs6000/rs6000.md: Include power8.md.

This patch is okay.

Thanks, David
Re: [PATCH] Provide a pointer_map template
On Wed, 19 Jun 2013, Richard Biener wrote: > > This templates the pointer-map implementation (actually it copies > the implementation, leaving the old interface unchanged) providing > a templated value type. That's suitable to replace the various > users requesting a pointer-to-integer-type map, like I noticed > for the two LTO tree recording mechanisms. Which in turn saves > memory on 64bit hosts (and should be less heavy-weight on the cache). > Not very much, but a quarter of the old pointer-map memory usage. > > LTO bootstrap and regtest running on x86_64-unknown-linux-gnu. > > In theory we can typedef pointer_map pointer_map_t, but > that requires touching all pointer_map_t users to drop the > leading 'struct' and eventually include pointer-set.h. > > I changed the insert () interface to get another output as to > whether the slot was present to avoid the need to have a special > "not present" value. That also makes it unnecessary to zero > the values array. > > Any comments? > > If not then I'll comb over existing pointer -> integer type map > users and convert them. Added the dominance.c one and changed the implementation to "inherit" from pointer-set instead, sharing a bit more code. The pointer-map template is also type-safe for the value array so converting all pointer-map users will make the code a tiny bit prettier. The remaining integer type cases seem to store integer types as large as pointer types so they fall in the same category (but eventually they chose that large type for no good reason). Old patch LTO bootstrapped and tested on x86_64-unknown-linux-gnu. Any objections? Thanks, Richard. 2013-06-19 Richard Biener * pointer-set.h (struct pointer_set_t): Move here from pointer-set.c. (pointer_set_lookup): Declare. (class pointer_map): New template class implementing a generic pointer to T map. (pointer_map::pointer_map, pointer_map::~pointer_map, pointer_map::contains, pointer_map::insert, pointer_map::traverse): New functions. 
* pointer-set.c (struct pointer_set_t): Moved to pointer-set.h. (pointer_set_lookup): New function. (pointer_set_contains): Use pointer_set_lookup. (pointer_set_insert): Likewise. (insert_aux): Remove. (struct pointer_map_t): Embed a pointer_set_t. (pointer_map_create): Adjust. (pointer_map_destroy): Likewise. (pointer_map_contains): Likewise. (pointer_map_insert): Likewise. (pointer_map_traverse): Likewise. * tree-streamer.h (struct streamer_tree_cache_d): Use a pointer_map instead of a pointer_map_t. * tree-streamer.c (streamer_tree_cache_insert_1): Adjust. (streamer_tree_cache_lookup): Likewise. (streamer_tree_cache_create): Likewise. (streamer_tree_cache_delete): Likewise. * lto-streamer.h (struct lto_tree_ref_encoder): Use a pointer_map instead of a pointer_map_t. (lto_init_tree_ref_encoder): Adjust. (lto_destroy_tree_ref_encoder): Likewise. * lto-section-out.c (lto_output_decl_index): Likewise. (lto_record_function_out_decl_state): Likewise. * dominance.c (iterate_fix_dominators): Use pointer_map. Index: gcc/pointer-set.c === *** gcc/pointer-set.c.orig 2013-06-19 12:28:49.0 +0200 --- gcc/pointer-set.c 2013-06-19 13:52:49.172792131 +0200 *** along with GCC; see the file COPYING3. *** 21,41 #include "system.h" #include "pointer-set.h" - /* A pointer set is represented as a simple open-addressing hash -table. Simplifications: The hash code is based on the value of the -pointer, not what it points to. The number of buckets is always a -power of 2. Null pointers are a reserved value. Deletion is not -supported (yet). There is no mechanism for user control of hash -function, equality comparison, initial size, or resizing policy. */ - - struct pointer_set_t - { - size_t log_slots; - size_t n_slots; /* n_slots = 2^log_slots */ - size_t n_elements; - - const void **slots; - }; /* Use the multiplicative method, as described in Knuth 6.4, to obtain a hash code for P in the range [0, MAX). MAX == 2^LOGMAX. 
--- 21,26 *** hash1 (const void *p, unsigned long max, *** 67,72 --- 52,58 return ((A * (uintptr_t) p) >> shift) & (max - 1); } + /* Allocate an empty pointer set. */ struct pointer_set_t * pointer_set_create (void) *** pointer_set_destroy (struct pointer_set_ *** 89,108 XDELETE (pset); } - /* Returns nonzero if PSET contains P. P must be nonnull. !Collisions are resolved by linear probing. */ ! int ! pointer_set_contains (const struct pointer_set_t *pset, const void *p) { size_t n = hash1 (p, pset->n_slots, pset->log_slots); while (true) {
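To make the interface change easier to picture, here is a small stand-alone sketch of a templated pointer map whose insert reports whether the key already existed. It is not the pointer_map<T> from the patch: the class name, the use of std::vector, the growth policy and the hash multiplier are all assumptions of this sketch. Only the general shape (open addressing, power-of-two table, multiplicative hashing, null keys reserved, no deletion) follows the description in the mail and the quoted comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: a tiny open-addressing map from non-null pointers
// to values of type T.  Every name here is hypothetical.
template <typename T>
class simple_pointer_map
{
public:
  simple_pointer_map ()
    : log_slots_ (3), n_elements_ (0),
      keys_ (std::size_t (1) << 3, nullptr), values_ (std::size_t (1) << 3)
  {}

  // Return the value slot for P (which must be non-null), creating it if
  // needed.  *EXISTED reports whether P was already present, so no
  // special "not present" value is required.  The returned pointer is
  // only valid until the next insert, which may grow the table.
  T *insert (const void *p, bool *existed)
  {
    if (n_elements_ * 4 >= keys_.size ())
      grow ();
    std::size_t i = find_slot (keys_, log_slots_, p);
    *existed = keys_[i] != nullptr;
    if (!*existed)
      {
        keys_[i] = p;
        ++n_elements_;
      }
    return &values_[i];
  }

  // Return the value slot for P, or a null pointer if P is absent.
  T *contains (const void *p)
  {
    std::size_t i = find_slot (keys_, log_slots_, p);
    return keys_[i] != nullptr ? &values_[i] : nullptr;
  }

private:
  // Multiplicative hashing: multiply by an odd 64-bit constant and keep
  // the top LOG_SLOTS bits.  The constant is this sketch's choice.
  static std::size_t hash (const void *p, unsigned log_slots)
  {
    const std::uint64_t a = 0x9E3779B97F4A7C15ull;
    std::uint64_t v = reinterpret_cast<std::uintptr_t> (p);
    return static_cast<std::size_t> ((a * v) >> (64 - log_slots));
  }

  // Linear probing: stop at P itself or at the first empty slot.  The
  // table is never full, so the loop terminates.
  static std::size_t find_slot (const std::vector<const void *> &keys,
                                unsigned log_slots, const void *p)
  {
    std::size_t mask = keys.size () - 1;
    std::size_t i = hash (p, log_slots);
    while (keys[i] != nullptr && keys[i] != p)
      i = (i + 1) & mask;
    return i;
  }

  // Double the table and re-insert every key with its value.
  void grow ()
  {
    unsigned new_log = log_slots_ + 1;
    std::vector<const void *> new_keys (std::size_t (1) << new_log, nullptr);
    std::vector<T> new_values (new_keys.size ());
    for (std::size_t i = 0; i < keys_.size (); ++i)
      if (keys_[i] != nullptr)
        {
          std::size_t j = find_slot (new_keys, new_log, keys_[i]);
          new_keys[j] = keys_[i];
          new_values[j] = values_[i];
        }
    keys_.swap (new_keys);
    values_.swap (new_values);
    log_slots_ = new_log;
  }

  unsigned log_slots_;
  std::size_t n_elements_;
  std::vector<const void *> keys_;
  std::vector<T> values_;
};
```

Note that std::vector value-initializes its elements, so this sketch does not reproduce the "no need to zero the values array" saving mentioned above; it only shows how the extra output from insert removes the need for a sentinel value. It is meant for integer-like T (std::vector<bool> would not work with the pointer-returning interface).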
Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window
On 18/06/13 17:22, Meador Inge wrote: Ping. On 06/06/2013 01:23 PM, Meador Inge wrote: On 06/06/2013 08:11 AM, Richard Earnshaw wrote: I understand (and agree with) this bit... +(define_peephole2 + [(set (reg:CC CC_REGNUM) +(compare:CC (match_operand:SI 0 "register_operand" "") +(match_operand:SI 1 "arm_rhs_operand" ""))) + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) + (set (match_operand:SI 2 "register_operand" "") (const_int 0))) + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) + (set (match_dup 2) (const_int 1))) + (match_scratch:SI 3 "r")] + "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])" + [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1))) + (parallel +[(set (reg:CC CC_REGNUM) + (compare:CC (const_int 0) (match_dup 3))) + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))]) + (set (match_dup 2) +(plus:SI (plus:SI (match_dup 2) (match_dup 3)) + (geu:SI (reg:CC CC_REGNUM) (const_int 0]) + ... but what's this bit about? The original intent was to revert back to the original peephole pattern (pre-PR 46975) when the CC reg is still live, but that doesn't properly maintain the CC state either (it just happened to pass in the test case I was looking at because I only cared about the Z flag, which is maintained the same). OK with the above bit left out? Sorry for the delay, I've been sidetracked onto other things. Having looked at this patch I realized that we were missing a trick on ARMv5 and later, when a more efficient sequence exists, particularly for Cortex-A15. By using CLZ we can avoid the need to set the condition code register at all, which gives us far more scheduling freedom. It's also best not to unnecessarily clobber the condition code register even if there are other instructions in the sequence that do set/use the flags (the peepholer pass right at the end will do this optimization when it is useful), so I've tweaked some of the existing alternatives as well. 
Finally, we can use peep2_regno_dead_p (rather than peep2_reg_dead_p) to avoid having to create extra match_operand values. The result is that I've now committed the patch below. R. 2013-06-19 Richard Earnshaw arm.md (split for eq(reg, 0)): Add variants for ARMv5 and Thumb2. (peepholes for eq(reg, not-0)): Ensure condition register is dead after pattern. Use more efficient sequences on ARMv5 and Thumb2. --- gcc/config/arm/arm.md (revision 200187) +++ gcc/config/arm/arm.md (local) @@ -10021,6 +10021,16 @@ (define_split (eq:SI (match_operand:SI 1 "s_register_operand" "") (const_int 0))) (clobber (reg:CC CC_REGNUM))] + "arm_arch5 && TARGET_32BIT" + [(set (match_dup 0) (clz:SI (match_dup 1))) + (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))] +) + +(define_split + [(set (match_operand:SI 0 "s_register_operand" "") + (eq:SI (match_operand:SI 1 "s_register_operand" "") + (const_int 0))) + (clobber (reg:CC CC_REGNUM))] "TARGET_32BIT && reload_completed" [(parallel [(set (reg:CC CC_REGNUM) @@ -10090,29 +10100,87 @@ (define_insn_and_split "*compare_scc" ;; Attempt to improve the sequence generated by the compare_scc splitters ;; not to use conditional execution. 
+ +;; Rd = (eq (reg1) (const_int0)) // ARMv5 +;; clz Rd, reg1 +;; lsr Rd, Rd, #5 (define_peephole2 [(set (reg:CC CC_REGNUM) (compare:CC (match_operand:SI 1 "register_operand" "") - (match_operand:SI 2 "arm_rhs_operand" ""))) + (const_int 0))) + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) + (set (match_operand:SI 0 "register_operand" "") (const_int 0))) + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) + (set (match_dup 0) (const_int 1)))] + "arm_arch5 && TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)" + [(set (match_dup 0) (clz:SI (match_dup 1))) + (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))] +) + +;; Rd = (eq (reg1) (const_int0)) // !ARMv5 +;; negs Rd, reg1 +;; adc Rd, Rd, reg1 +(define_peephole2 + [(set (reg:CC CC_REGNUM) + (compare:CC (match_operand:SI 1 "register_operand" "") + (const_int 0))) (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) (set (match_operand:SI 0 "register_operand" "") (const_int 0))) (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) (set (match_dup 0) (const_int 1))) - (match_scratch:SI 3 "r")] - "TARGET_32BIT" + (match_scratch:SI 2 "r")] + "TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)" [(parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 1) (match_dup 2))) - (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) + (compare:CC (const_int 0) (match_dup 1))) + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 1)))]) + (set (match_dup 0) + (pl
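The two flag-free sequences named in the new comments (clz/lsr for ARMv5 and later, negs/adc otherwise) can be checked with a small model. This is only a sketch: clz32, eq_zero_clz and eq_zero_negs_adc are hypothetical names, clz32 is a portable loop assumed to match the ARM CLZ result (32 for a zero input), and the carry is modelled on the ARM convention that a subtraction sets C when there is no borrow.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of a 32-bit value, returning 32 for zero input.
// Written as a loop so that the zero case is well defined.
inline unsigned clz32 (std::uint32_t x)
{
  if (x == 0)
    return 32;
  unsigned n = 0;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      ++n;
    }
  return n;
}

// Model of:  clz Rd, reg1 ; lsr Rd, Rd, #5
// clz32 yields 32 only for zero, and 32 >> 5 == 1 while 0..31 >> 5 == 0.
inline std::uint32_t eq_zero_clz (std::uint32_t x)
{
  return clz32 (x) >> 5;
}

// Model of:  negs Rd, reg1 ; adc Rd, Rd, reg1
// negs computes 0 - x and sets the carry when the subtraction does not
// borrow, i.e. when 0 >= x as unsigned values.  adc then adds x and the
// carry back in, leaving just the carry.
inline std::uint32_t eq_zero_negs_adc (std::uint32_t x)
{
  std::uint32_t rd = 0u - x;
  std::uint32_t carry = (0u >= x) ? 1u : 0u;
  return rd + x + carry;
}
```

Both functions return 1 for x == 0 and 0 otherwise; the first never touches the modelled condition flags, which is where the extra scheduling freedom comes from.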
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On 19.06.2013 14:03, Matthias Klose wrote:
> $ gcc-ar-4.8 -h
> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'
>
> the plugin is *not* installed with x permission flags (which seems to be the
> standard for shared libraries). You did change the code to use find_a_file
> which searches only for files with the x bit set.
>
> Work around is to install the plugin with the x bits set, or use some helper
> function which doesn't look for the x bits. I assume that wasn't caught,
> because the plugin then was found in another location?

Opened 57651 for that.

Now, working around the permission bit, I get:

$ gcc-ar-4.8
gcc-ar-4.8: Cannot find binary 'ar'

So it only searches for ar in the given paths, not on the system search path (/usr/bin in this case).
Re: [Committed] S/390: PR57609 fix - use next_active_insn instead of next_real_insn
On 18/06/13 19:06, Steven Bosscher wrote: > BTW I don't understand how a label satisfying the following can be > true for a label before a jump table: > > if (LABEL_P (insn) > && (LABEL_PRESERVE_P (insn) || LABEL_NAME (insn))) > > LABEL_PRESERVE_P should never be set on a label before a > JUMP_TABLE_DATA, and LABEL_NAME should be NULL. Actually LABEL_PRESERVE_P appears to be set on quite many of the jump table data labels. Example from compiling fold-const.c: (code_label/s 1285 1284 1286 9315 "" [3 uses]) (jump_table_data 1286 1285 1287 (addr_vec:SI [ (label_ref:SI 54875) (label_ref:SI 63283) (label_ref:SI 63283) (label_ref:SI 63283) (label_ref:SI 63283) (label_ref:SI 63283) (label_ref:SI 63283) ... Hardware watchpoint 5: table_label->in_struct Old value = 0 New value = 1 force_const_mem (mode=SImode, x=0x7c0c27c0) at /build/gcc-head/gcc/varasm.c:3699 3699 return copy_rtx (def); (gdb) bt #0 force_const_mem (mode=SImode, x=0x7c0c27c0) at /build/gcc-head/gcc/varasm.c:3699 #1 0x009ada54 in emit_move_insn (x=0x7bdf44b0, y=0x7c0c27c0) at /build/gcc-head/gcc/expr.c:3499 #2 0x010b3e6a in gen_casesi (operand0=0x7bdf42d0, operand1=0x7d7e24e0, operand2=0x7c0c20e0, operand3=0x7b3c4488, operand4=0x7b3c43c0) at /build/gcc-head/gcc/config/s390/s390.md:8588 #3 0x00c0c70e in maybe_gen_insn (icode=CODE_FOR_casesi, nops=5, ops=0x7fffe3a8) at /build/gcc-head/gcc/optabs.c:8219 #4 0x00c0c92a in maybe_expand_jump_insn (icode=CODE_FOR_casesi, nops=5, ops=0x7fffe3a8) at /build/gcc-head/gcc/optabs.c:8257 #5 0x00c0c9ee in expand_jump_insn (icode=CODE_FOR_casesi, nops=5, ops=0x7fffe3a8) at /build/gcc-head/gcc/optabs.c:8283 #6 0x009ca5ec in try_casesi (index_type=0x7d7e6420, index_expr=0x7bca4168, minval=0x7d889300, range=0x7d156920, table_label=0x7b3c4488, default_label=0x7b3c43c0, fallback_label=0x7b3c43e8, default_probability=) at /build/gcc-head/gcc/expr.c:10967 #7 0x00d18016 in emit_case_dispatch_table (index_expr=0x7bca4168, index_type=0x7d7e6420, case_list=0x1a31d58, 
default_label=0x7b3c43c0, minval=0x7d889300, maxval=0x7d0ad180, range=0x7d156920, stmt_bb=0x7bf96000) at /build/gcc-head/gcc/stmt.c:1933 #8 0x00d18ef4 in expand_case (stmt=0x7d7dc800) at /build/gcc-head/gcc/stmt.c:2207 > Better yet would be to use tablejump_p instead of examining the > pattern by hand, e.g.: Ok. Better. I've applied your patch after testing it. Thanks! Bye, -Andreas- > > Index: s390.c > === > --- s390.c (revision 200173) > +++ s390.c (working copy) > @@ -7023,7 +7023,7 @@ s390_chunkify_start (void) >if (LABEL_P (insn) > && (LABEL_PRESERVE_P (insn) || LABEL_NAME (insn))) > { > - rtx vec_insn = next_active_insn (insn); > + rtx vec_insn = NEXT_INSN (insn); > if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn)) > bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn)); > } > @@ -7033,6 +7033,8 @@ s390_chunkify_start (void) >else if (JUMP_P (insn)) > { >rtx pat = PATTERN (insn); > + rtx table; > + > if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2) > pat = XVECEXP (pat, 0, 0); > > @@ -7046,28 +7048,18 @@ s390_chunkify_start (void) > bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label)); > } > } > - else if (GET_CODE (pat) == PARALLEL > - && XVECLEN (pat, 0) == 2 > - && GET_CODE (XVECEXP (pat, 0, 0)) == SET > - && GET_CODE (XVECEXP (pat, 0, 1)) == USE > - && GET_CODE (XEXP (XVECEXP (pat, 0, 1), 0)) == LABEL_REF) > - { > - /* Find the jump table used by this casesi jump. 
*/ > - rtx vec_label = XEXP (XEXP (XVECEXP (pat, 0, 1), 0), 0); > - rtx vec_insn = next_active_insn (vec_label); > - if (vec_insn && JUMP_TABLE_DATA_P (vec_insn)) > - { > - rtx vec_pat = PATTERN (vec_insn); > - int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC; > - > - for (i = 0; i < XVECLEN (vec_pat, diff_p); i++) > - { > - rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0); > - > - if (s390_find_pool (pool_list, label) > - != s390_find_pool (pool_list, insn)) > - bitmap_set_bit (far_labels, CODE_LABEL_NUMBER > (label)); > - } > + else if (tablejump_p (insn, NULL, &table)) > + { > + rtx vec_pat = PATTERN (table); > + int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC; > + > + for (i = 0; i < XVECLEN (vec_pat, diff_p); i++) > + { > + rtx label = XEXP (XVECEXP (ve
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On Wed, Jun 19, 2013 at 02:03:34PM +0200, Matthias Klose wrote:
> On 27.11.2012 19:14, Meador Inge wrote:
> > On 11/26/2012 09:05 AM, Richard Biener wrote:
> >
> >> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge wrote:
> >>> Ping ^ 4.
> >>
> >> Ok.
> >
> > Thanks for the review. Committed to trunk.
>
> This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch.
>
> $ gcc-ar-4.8 -h
> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'
>
> the plugin is *not* installed with x permission flags (which seems to be the
> standard for shared libraries). You did change the code to use find_a_file
> which searches only for files with the x bit set.

That actually is the standard for shared libraries: the linker creates libraries with those permissions, and libtool/automake installs them that way too. So if you override this, you need to cope with that decision.

Jakub
Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self
On 27.11.2012 19:14, Meador Inge wrote:
> On 11/26/2012 09:05 AM, Richard Biener wrote:
>
>> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge wrote:
>>> Ping ^ 4.
>>
>> Ok.
>
> Thanks for the review. Committed to trunk.

This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch.

$ gcc-ar-4.8 -h
gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'

the plugin is *not* installed with x permission flags (which seems to be the standard for shared libraries). You did change the code to use find_a_file which searches only for files with the x bit set.

A workaround is to install the plugin with the x bits set, or to use some helper function which doesn't look for the x bits. I assume that wasn't caught because the plugin was then found in another location?

Matthias
Re: [ARM][Insn classification refactoring 2/N] Update instruction classification documentation
On 18/06/13 15:47, Sofiane Naci wrote:

Hi,

This patch updates the documentation for the "type" attribute. It complements the changes proposed in the previous patch.

OK for trunk?

Thanks
Sofiane

OK.
Re: [PATCH] Enable non-complex math builtins from C99 for Bionic
> I don't see how a target hook is required for the command-line idea. > Targets already have a perfectly working way of changing the default > of a command-line option. That's true... sorry, my bad. Anyway, could somebody take a look at the patch itself? --Alexander >> 2013/4/23 Alexander Ivchenko : >>> *ping* >>> >>> thanks >>> Alexander >>> >>> 2013/3/28 Alexander Ivchenko : Hi, 4.8 is now branched, let's come back to the discussion that we had before. I updated the patch a little bit since we now have linux-protos.h and linux-android.c files. I tried to preserve the availability of c99 for all targets, but it's pretty difficult, because we are changing the defaults. Passing an empty string as second argument doesn't look very good, but on the other hand the user has one clear way for checking the presence of a certain function. But of course we can create another function, that will call targetm.libc_has_function (function_class, "") within itself. best regards, Alexander 2013/1/7 Joseph S. Myers : > On Fri, 21 Dec 2012, Alexander Ivchenko wrote: > >> Hi, >> >> Thank you very much for your input! Please, take a look at the updated >> version: >> I fixed coding style, moved documentation for TARGET_LIBC_HAS_FUNCTION >> to target.def. >> Removed TARGET_C99_FUNCTIONS and TARGET_HAS_SINCOS and all their >> influence and moved the implementation of linux_libc_has_function to >> host-linux.c. >> I changed the defaults: now it is assumed that we have C99 runtime, >> but no sincos. I updated all needed gcc/config/*.h. But I'm not sure in >> this part, >> cause I don't have the opportunity to test it properly... > > This patch seems mostly plausible, though there are various places that > call targetm.libc_has_function with an empty string as second argument, > that should be naming the specific function instead. I haven't reviewed > the details, and at this development stage I think it will need to wait > until after 4.8 branches. > > -- > Joseph S. Myers > jos...@codesourcery.com
[PATCH] Provide a pointer_map template
This templates the pointer-map implementation (actually it copies the implementation, leaving the old interface unchanged) providing a templated value type. That's suitable to replace the various users requesting a pointer-to-integer-type map, like I noticed for the two LTO tree recording mechanisms. Which in turn saves memory on 64bit hosts (and should be less heavy-weight on the cache). Not very much, but a quarter of the old pointer-map memory usage. LTO bootstrap and regtest running on x86_64-unknown-linux-gnu. In theory we can typedef pointer_map pointer_map_t, but that requires touching all pointer_map_t users to drop the leading 'struct' and eventually include pointer-set.h. I changed the insert () interface to get another output as to whether the slot was present to avoid the need to have a special "not present" value. That also makes it unnecessary to zero the values array. Any comments? If not then I'll comb over existing pointer -> integer type map users and convert them. Thanks, Richard. 2013-06-19 Richard Biener * pointer-set.h (struct pointer_map_base): New struct. (class pointer_map): New template class implementing a generic pointer to T map. (pointer_map::pointer_map, pointer_map::~pointer_map, pointer_map::contains, pointer_map::insert, pointer_map::traverse): New functions. * pointer-set.c (pointer_map_base::lookup): New function. * tree-streamer.h (struct streamer_tree_cache_d): Use a pointer_map instead of a pointer_map_t. * tree-streamer.c (streamer_tree_cache_insert_1): Adjust. (streamer_tree_cache_lookup): Likewise. (streamer_tree_cache_create): Likewise. (streamer_tree_cache_delete): Likewise. * lto-streamer.h (struct lto_tree_ref_encoder): Use a pointer_map instead of a pointer_map_t. (lto_init_tree_ref_encoder): Adjust. (lto_destroy_tree_ref_encoder): Likewise. * lto-section-out.c (lto_output_decl_index): Likewise. (lto_record_function_out_decl_state): Likewise. 
Index: gcc/pointer-set.c === *** gcc/pointer-set.c (revision 200189) --- gcc/pointer-set.c (working copy) *** void pointer_map_traverse (const struct *** 301,303 --- 301,335 if (pmap->keys[i] && !fn (pmap->keys[i], &pmap->values[i], data)) break; } + + + + /* Lookup the slot for the pointer P and return true if it exists, +otherwise return false in which case *IX points to the slot that +would be used on insertion. */ + + bool + pointer_map_base::lookup (const void *p, size_t *ix) + { + size_t n = hash1 (p, n_slots, log_slots); + + while (true) + { + if (keys[n] == p) + { + *ix = n; + return true; + } + else if (keys[n] == 0) + { + *ix = n; + return false; + } + else +{ + ++n; + if (n == n_slots) +n = 0; +} + } + } Index: gcc/pointer-set.h === *** gcc/pointer-set.h (revision 200189) --- gcc/pointer-set.h (working copy) *** void pointer_set_traverse (const struct *** 30,42 bool (*) (const void *, void *), void *); struct pointer_map_t; ! struct pointer_map_t *pointer_map_create (void); ! void pointer_map_destroy (struct pointer_map_t *pmap); ! void **pointer_map_contains (const struct pointer_map_t *pmap, const void *p); ! void **pointer_map_insert (struct pointer_map_t *pmap, const void *p); ! void pointer_map_traverse (const struct pointer_map_t *, bool (*) (const void *, void **, void *), void *); #endif /* POINTER_SET_H */ --- 30,166 bool (*) (const void *, void *), void *); + + struct pointer_map_base + { + size_t log_slots; + size_t n_slots; /* n_slots = 2^log_slots */ + size_t n_elements; + const void **keys; + + bool lookup (const void *p, size_t *n); + }; + + /* A pointer map is represented the same way as a pointer_set, so +the hash code is based on the address of the key, rather than +its contents. Null keys are a reserved value. Deletion is not +supported (yet). There is no mechanism for user control of hash +function, equality comparison, initial size, or resizing policy. 
*/ + + template + class pointer_map : protected pointer_map_base + { + T *values; + + public: + pointer_map (); + ~pointer_map (); + T *contains (const void *p); + T *insert (const void *p, bool *existed_p = NULL); + void traverse (bool (*fn) (const void *, T *, void *), void *data); + }; + + /* Allocate an empty pointer map. */ + template + pointer_map::pointer_map (void) + { + n_elements = 0; + log_slots = 8; + n_slots =
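For readers following the interface change described above, here is a minimal standalone sketch of a pointer-keyed map whose insert () reports whether the slot already existed, so no special "not present" value is needed. It is not the code from the patch: the class name ptr_map_sketch, the hash, the initial size and the growth policy are all assumptions made for illustration, and index_of is a hypothetical helper that mimics a pointer-to-index recording scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a pointer -> T map; not GCC's implementation.
template <typename T>
class ptr_map_sketch
{
  std::vector<const void *> keys;   // a null key marks an empty slot
  std::vector<T> values;            // only meaningful where keys[i] is set
  std::size_t n_elements = 0;

  // Address-based hash (assumed; GCC's hash1 is different).
  std::size_t hash (const void *p) const
  {
    return (reinterpret_cast<std::uintptr_t> (p) >> 3) & (keys.size () - 1);
  }

  // Linear probing.  Returns true if P is present; *IX is then its slot,
  // otherwise *IX is the free slot an insertion would use.
  bool lookup (const void *p, std::size_t *ix) const
  {
    std::size_t n = hash (p);
    while (true)
      {
        if (keys[n] == p) { *ix = n; return true; }
        if (keys[n] == nullptr) { *ix = n; return false; }
        if (++n == keys.size ())
          n = 0;
      }
  }

  // Double the table (size stays a power of two) and re-insert every key.
  void grow ()
  {
    std::vector<const void *> old_keys (keys.size () * 2, nullptr);
    std::vector<T> old_values (values.size () * 2);
    old_keys.swap (keys);
    old_values.swap (values);
    for (std::size_t i = 0; i < old_keys.size (); ++i)
      if (old_keys[i])
        {
          std::size_t ix;
          lookup (old_keys[i], &ix);
          keys[ix] = old_keys[i];
          values[ix] = old_values[i];
        }
  }

public:
  ptr_map_sketch () : keys (8, nullptr), values (8) {}

  // Slot for P, or null if P was never inserted.
  T *contains (const void *p)
  {
    std::size_t ix;
    return lookup (p, &ix) ? &values[ix] : nullptr;
  }

  // Slot for P, created if needed; *EXISTED_P says which case happened.
  T *insert (const void *p, bool *existed_p = nullptr)
  {
    assert (p != nullptr);               // null keys are reserved
    if (n_elements * 4 >= keys.size ())  // keep the table sparse
      grow ();
    std::size_t ix;
    bool found = lookup (p, &ix);
    if (!found)
      {
        keys[ix] = p;
        ++n_elements;
      }
    if (existed_p)
      *existed_p = found;
    return &values[ix];
  }
};

// Hypothetical usage: number each distinct pointer in first-seen order.
unsigned index_of (ptr_map_sketch<unsigned> &map, const void *p,
                   unsigned *next)
{
  bool existed;
  unsigned *slot = map.insert (p, &existed);
  if (!existed)
    *slot = (*next)++;
  return *slot;
}
```

The point of the existed_p output is visible in index_of: the caller never has to reserve a magic value such as 0 or -1 to mean "no entry yet".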
Re: Go patch committed: Use function descriptors
On 19.06.2013 01:50, Ian Lance Taylor wrote:
> This patch to gccgo changes the representation of values of function
> type. They used to be a pointer to function code, like a C function
> pointer. They are now a pointer to a struct. The first field of the
> struct points to the function code. The remaining fields, if any, are
> the addresses of variables referenced in enclosing functions. For each
> call to a function, the address of the function descriptor is passed as
> the last argument.
>
> This lets us avoid generating trampolines, and removes the use
> of writable/executable sections of the heap.
>
> This is also a prerequisite to a new Go 1.1 feature, method values.
>
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline and 4.8 branch.

So this did change the soname for libgo to 5 on the trunk, and to 4 on the branch. We had this discussion before, and then decided to revert this kind of change on the 4.7 branch. This time the release notes had a hint that the Go support would be updated to v1.1 in a bug fix release, so maybe it is ok. Will this be the only soname bump on the way to Go 1.1 support, or are there more changes/version bumps planned along the way?

Matthias
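To make the quoted description of function descriptors concrete, here is a small sketch of the idea: a function value is a pointer to a struct whose first field is the code pointer and whose remaining fields are addresses of captured variables, and every call passes the descriptor as the last argument. The names func_desc, add_to_total and call_func_value are hypothetical, and the layout is only an illustration of the scheme, not what gccgo actually emits.

```cpp
#include <cassert>

// Hypothetical descriptor: code pointer first, then the addresses of
// variables referenced from the enclosing function.
struct func_desc
{
  int (*code) (int arg, func_desc *self);  // descriptor is the last argument
  int *captured_total;
};

// Body of a nested function that adds ARG to a variable of its parent.
static int add_to_total (int arg, func_desc *self)
{
  *self->captured_total += arg;
  return *self->captured_total;
}

// Call through a function value: load the code pointer from the
// descriptor and pass the descriptor itself as the trailing argument.
static int call_func_value (func_desc *f, int arg)
{
  assert (f && f->code);
  return f->code (arg, f);
}

// Build a closure over a local variable and call it twice.
int run_closure_demo ()
{
  int total = 10;
  func_desc closure = { add_to_total, &total };
  call_func_value (&closure, 5);          // total becomes 15
  return call_func_value (&closure, 7);   // total becomes 22
}
```

Because the captured state travels in an ordinary data structure, nothing has to be written into executable memory at run time, which is the property the quoted message attributes to dropping trampolines.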
Re: [PATCH, ARM] Reintroduce minipool ranges for zero-extension insn patterns
> The following patch removed pool_range/neg_pool_range attributes from > several instructions as a cleanup, which I believe to have been > incorrect: > > http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01036.html > > On a Mentor-local branch, this caused problems with instructions like: > > (insn 77 53 87 (set (reg:SI 8 r8 [orig:197 s.4 ] [197]) > (zero_extend:SI (mem/u/c:HI (symbol_ref/u:SI ("*.LC0") [flags 0x2]) > [7 S2 A16]))) [...] 161 {*arm_zero_extendhisi2_v6} (nil)) > > The reasoning behind the cleanup was that the instructions in question > have no immediate constraints -- but the minipool code is used for more > than just immediates, e.g. in the above case where a symbol reference > ("m") is loaded. Probably the most reported ARM bug (PR target/49423) and a clear regression. > I don't have a test case for the problem on mainline at present, but I > believe it is still a latent bug. Tested with the default multilibs (ARM > & Thumb mode) on arm-none-eabi, with no regressions. (The patch has > also been tested with more multilibs on our local branches for a while, > and I did ensure previously that it did not adversely affect Bernd's > patch linked above.) Can you attach it to PR target/49423? Anybody doing serious testing with an ARM compiler will run into it in some configuration so it would be nice to have a single source for the fix (although that ought to be the tree...). -- Eric Botcazou
Re: PING [C++ docs patch] PR 56544
... I have now committed this simple doc update. Also, a 4_8-branch version, attached below.

Thanks,
Paolo.

2013-06-19 Paolo Carlini

PR c++/56544
* doc/cpp.texi [Standard Predefined Macros, __cplusplus]: Document
that now in C++ the value is correct per the C++ standards.

Index: doc/cpp.texi
===
--- doc/cpp.texi (revision 200192)
+++ doc/cpp.texi (working copy)
@@ -1926,11 +1926,9 @@ facilities of the standard C library available.
 This macro is defined when the C++ compiler is in use. You can use
 @code{__cplusplus} to test whether a header is compiled by a C compiler
 or a C++ compiler. This macro is similar to @code{__STDC_VERSION__}, in
-that it expands to a version number. A fully conforming implementation
-of the 1998 C++ standard will define this macro to @code{199711L}. The
-GNU C++ compiler is not yet fully conforming, so it uses @code{1}
-instead. It is hoped to complete the implementation of standard C++
-in the near future.
+that it expands to a version number. Depending on the language standard
+selected, the value of the macro is @code{199711L}, as mandated by the
+1998 C++ standard, or @code{201103L}, per the 2011 C++ standard.
 
 @item __OBJC__
 This macro is defined, with value 1, when the Objective-C compiler is in
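As an illustration of what the updated wording means for user code, the following sketch branches on the two values named in the documentation, 199711L and 201103L. The function names are hypothetical, and the fallback branch covers compilers that still report some other value, such as the 1 mentioned in the removed text.

```cpp
#include <cassert>

// Year of the C++ standard selected for this translation unit, or 0 when
// the macro has some other value (the old documentation mentions 1).
long cxx_standard_year ()
{
#if __cplusplus >= 201103L
  return 2011;   // the 2011 standard, or a later one
#elif __cplusplus == 199711L
  return 1998;
#else
  return 0;
#endif
}

// With a compiler that behaves as the new text documents, this holds.
void check_cplusplus_macro ()
{
  assert (cxx_standard_year () != 0);
}
```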
Re: [PATCH] Add command line parsing of -fsanitize
On Wed, Jun 19, 2013 at 10:45:28AM +0200, Richard Biener wrote: > Btw, how to handle the issue with LTO and different -fsanitize options > at compile vs. link-time? Can TUs without -fsanitize options be LTO > linked with -fsanitize? Then lto-wrapper should union -fsanitize > options from all TUs to the final link. I hope all -fsanitize options can > be mixed freely and will properly combine. You can mix the options at compile time, you can't mix -fsanitize=thread with -fsanitize=address linking, because libasan.* and libtsan.* are runtime incompatible, each one uses different incompatible virtual memory layout. But of course, if you compile something with -fsanitize=thread, something else -fsanitize=address, then link, either with -fsanitize=thread or -fsanitize=address, it will most likely not link (because the other library will not be there). The undefined stuff (lots of options) Marek is working on are orthogonal to this, that library just contains tons of fancy printfs and aborts to complain about various issues, but can coexist with libasan or libtsan. Jakub
Re: [PATCH] Add command line parsing of -fsanitize
On Tue, Jun 18, 2013 at 10:25 PM, Jakub Jelinek wrote: > On Tue, Jun 18, 2013 at 04:42:51PM +0200, Marek Polacek wrote: >> Ok, should be done now (together with other nit-fixes). >> Regtested/bootstrapped on x86_64-linux, ok for trunk? > > Looks good to me, the only thing I'm worried about are how this > interferes with the > %{fsanitize=address:...} > and > %{fsanitize=thread:...} > bits in gcc.c. Because we should link in -lasan even for > -fsanitize=shift,address,undefined > and should not link in -lasan for > -fsanitize=address -fno-sanitize=undefined,address,shift > (generally, what we have guarded right now with > %{fsanitize=address:...} > should be done if flag_sanitize & SANITIZE_ADDRESS is going to be > true in the end, etc., and we'll need to link in > -lubsan whenever at least one of the undefined options are set in the > bitmask. -lubsan isn't incompatible with -lasan nor -ltsan, but -lasan > and -ltsan are incompatible. > > Joseph, any thoughts how to deal with this? Btw, how to handle the issue with LTO and different -fsanitize options at compile vs. link-time? Can TUs without -fsanitize options be LTO linked with -fsanitize? Then lto-wrapper should union -fsanitize options from all TUs to the final link. I hope all -fsanitize options can be mixed freely and will properly combine. Richard. > Jakub
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote: > On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek wrote: > > On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote: > >> Right, as you did for other cases. It works here as well. > > > > Patch preapproved. > > I wonder how much code breaks these days when we enable -fno-common by > default? ... Somebody would need to try it ;). From vectorization POV, it surely would be better if -fno-common was the default. Jakub
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek wrote: > On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote: >> Right, as you did for other cases. It works here as well. > > Patch preapproved. I wonder how much code breaks these days when we enable -fno-common by default? ... Richard.
Re: [PATCH] PowerPC: Fix test case for PR55033
2013/6/18 Sebastian Huber :
> Hello Chung-Ju,
>
> On 06/18/2013 05:12 AM, Chung-Ju Wu wrote:
>>
>> 2013/6/18 David Edelsohn :
>>>
>>> gcc/testsuite/ChangeLog
>>> 2013-06-17 Sebastian Huber
>>>
>>> PR target/55033
>>> * gcc.target/powerpc/pr55033.c: Fix options.
>>>
>>> Okay.
>>>
>>> Thanks, David
>>>
>>> P.S. Please explicitly copy me on patches.
>>
>> Hi, Sebastian,
>>
>> Since David has approved your patch,
>> do you need me to help commit this fix again?
>> I'd be happy to do this for you. :)
>
> yes, please commit it for me. Thanks.
>
> --
> Sebastian Huber, embedded brains GmbH
>

Committed to trunk as revision 200191.
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=200191

Best regards,
jasonwucj
Re: [PATCH] Add command line parsing of -fsanitize
On Tue, Jun 18, 2013 at 10:45:53PM +, Joseph S. Myers wrote: > On Tue, 18 Jun 2013, Jakub Jelinek wrote: > > > On Tue, Jun 18, 2013 at 04:42:51PM +0200, Marek Polacek wrote: > > > Ok, should be done now (together with other nit-fixes). > > > Regtested/bootstrapped on x86_64-linux, ok for trunk? > > > > Looks good to me, the only thing I'm worried about are how this > > interferes with the > > %{fsanitize=address:...} > > and > > %{fsanitize=thread:...} > > bits in gcc.c. Because we should link in -lasan even for > > -fsanitize=shift,address,undefined > > and should not link in -lasan for > > -fsanitize=address -fno-sanitize=undefined,address,shift > > (generally, what we have guarded right now with > > %{fsanitize=address:...} > > should be done if flag_sanitize & SANITIZE_ADDRESS is going to be > > true in the end, etc., and we'll need to link in > > -lubsan whenever at least one of the undefined options are set in the > > bitmask. -lubsan isn't incompatible with -lasan nor -ltsan, but -lasan > > and -ltsan are incompatible. > > > > Joseph, any thoughts how to deal with this? > > Try defining a new spec function or functions that uses flag_sanitize to > determine what linker arguments to pass? Since the option handling is in > opts.c it should get run in the driver so flag_sanitize should be set > correctly there; as long as the specs in question run after the relevant > option processing, a spec function should work for this. While it would be possible to define say %:sanitize(thread LIBTSAN_EARLY) that would work roughly like %{fsanitize=thread:LIBTSAN_EARLY} worked until now (variable number of arguments that would be concatenated together if flag_sanitize & ..., otherwise return empty), we use e.g. %e inside of the %{fsanitize=thread:...} etc. 
So, I wonder if we couldn't extend the handle_braces, I think right now empty atoms are disallowed for the first choice, so perhaps %{%:function(args):...} where %:function(args) would be expanded to either non-empty or empty string and depending on that the condition would be then true resp. false. As % is not considered part of the atom name, and we require after atom name optional * and then only one of |, }, &, :, I think this wouldn't be ambiguous in the grammar. We could then have: %{!%:function1():-lfoo;%:function2(bar baz):-lbar -lbaz;-lxxx} and for the sanitizer purposes: %{%:sanitize(address):LIBTSAN_EARLY} %{!nostdlib:%{!nodefaultlibs:%{%:sanitize(address):" LIBASAN_SPEC "\ %{static:%ecannot specify -static with -fsanitize=address}\ %{%:sanitize(thread):%e-fsanitize=address is incompatible with -fsanitize=thread}}\ %{%:sanitize(thread):" LIBTSAN_SPEC "\ %{!pie:%{!shared:%e-fsanitize=thread linking must be done with -pie or %-shared}" Jakub
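The requirement discussed in this thread, that linking should depend on the final value of the sanitizer bitmask rather than on the literal spelling of one option, can be sketched as follows. This is not driver code and not the spec-function mechanism itself: the enum values, function names and the SAN_OTHER_UB placeholder are assumptions for illustration. Only the rules come from the thread: -lasan follows the final address bit, -lubsan follows any undefined-behaviour bit, and the address and thread runtimes cannot be combined.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical bit values; only the option names come from the thread.
enum : unsigned
{
  SAN_ADDRESS = 1u << 0,
  SAN_THREAD = 1u << 1,
  SAN_SHIFT = 1u << 2,
  SAN_OTHER_UB = 1u << 3,                    // placeholder for further checks
  SAN_UNDEFINED = SAN_SHIFT | SAN_OTHER_UB   // "undefined" acts as a group
};

static unsigned mask_for (const std::string &name)
{
  if (name == "address") return SAN_ADDRESS;
  if (name == "thread") return SAN_THREAD;
  if (name == "shift") return SAN_SHIFT;
  if (name == "undefined") return SAN_UNDEFINED;
  return 0;  // unknown names are ignored in this sketch
}

// Fold -fsanitize= and -fno-sanitize= options, in order, into one bitmask.
unsigned final_sanitize_mask (const std::vector<std::string> &args)
{
  const std::string on = "-fsanitize=", off = "-fno-sanitize=";
  unsigned flags = 0;
  for (const std::string &arg : args)
    {
      bool enable;
      std::string list;
      if (arg.compare (0, on.size (), on) == 0)
        { enable = true; list = arg.substr (on.size ()); }
      else if (arg.compare (0, off.size (), off) == 0)
        { enable = false; list = arg.substr (off.size ()); }
      else
        continue;
      std::istringstream names (list);
      std::string name;
      while (std::getline (names, name, ','))
        flags = enable ? (flags | mask_for (name)) : (flags & ~mask_for (name));
    }
  return flags;
}

// Libraries to link for FLAGS.  *OK is false (and nothing is returned)
// when the address and thread sanitizers are both requested.
std::vector<std::string> sanitizer_libs (unsigned flags, bool *ok)
{
  assert (ok != nullptr);
  std::vector<std::string> libs;
  *ok = !((flags & SAN_ADDRESS) && (flags & SAN_THREAD));
  if (!*ok)
    return libs;
  if (flags & SAN_ADDRESS) libs.push_back ("-lasan");
  if (flags & SAN_THREAD) libs.push_back ("-ltsan");
  if (flags & SAN_UNDEFINED) libs.push_back ("-lubsan");
  return libs;
}
```

With this model, "-fsanitize=shift,address,undefined" yields -lasan and -lubsan, while "-fsanitize=address -fno-sanitize=undefined,address,shift" yields nothing, which are the two cases called out earlier in the thread.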
Re: [PATCH] Re-write LTO type merging again, do tree merging
On Tue, 18 Jun 2013, Andi Kleen wrote: > On Tue, Jun 18, 2013 at 08:04:15PM +0200, Andi Kleen wrote: > > > Just confirmed with the small build. It does. Running the large build > > > now. > > > > Large build worked too. > > Also it seems to be drastically faster. I haven't done a proper > measurement run, but the initial run was 58% faster than 4.8, > using 42% less peak RSS. That was the intent. As a side-effect it should also behave correctly and not have weird effects on dwarf2out.c expectations. Richard.
[PATCH] Fix LTO kernel build ICE
As reported by Andi the following trivial fix fixes the ICE. Committed as obvious. Richard. 2013-06-19 Richard Biener * expr.c (expand_expr_real_1): Use SCOPE_FILE_SCOPE_P to check for global context. Index: gcc/expr.c === --- gcc/expr.c (revision 200189) +++ gcc/expr.c (working copy) @@ -9353,7 +9353,7 @@ expand_expr_real_1 (tree exp, rtx target /* Variables inherited from containing functions should have been lowered by this point. */ context = decl_function_context (exp); - gcc_assert (!context + gcc_assert (SCOPE_FILE_SCOPE_P (context) || context == current_function_decl || TREE_STATIC (exp) || DECL_EXTERNAL (exp)
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote: > Right, as you did for other cases. It works here as well. Patch preapproved. Jakub
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
2013/6/19 Jakub Jelinek : > On Wed, Jun 19, 2013 at 03:02:38PM +0800, Chung-Ju Wu wrote: >> In the gnu coding standard we have a space before >> the open-parentheses. Would that be great to have >> testcase follow this convention as well? :) >> >> If so, then... > > Testcases generally don't need to follow the coding conventions, > they can and often it is nicer if they follow it, but we certainly > also want testcases that don't follow it, otherwise we wouldn't have > testsuite coverage for other coding styles (what if say a parser or > preprocessor didn't work properly if there wasn't a space in between > function name and ( ?). > > Jakub That makes sense. Thanks for clarifying it. :) Best regards, jasonwucj
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
Right, as you did for other cases. It works here as well. Thanks, Igor On Wed, Jun 19, 2013 at 11:05 AM, Jakub Jelinek wrote: > On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote: >> The change also affects vectorizer in avx case which could be seen for >> gcc.dg/tree-ssa/loop-19.c test. >> >> After the change report says >> >> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment === >> loop-19_bad.c:16: note: vect_compute_data_ref_alignment: >> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9] >> loop-19_bad.c:16: note: vect_compute_data_ref_alignment: >> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9] >> >> AFAICS first condition in ix86_data_alignment was true before the >> change so 256 was a return value. >> >> Do we need to tweak this test also? > > I'd add -fno-common to the test. > > Jakub
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
On Wed, Jun 19, 2013 at 03:02:38PM +0800, Chung-Ju Wu wrote: > In the gnu coding standard we have a space before > the open-parentheses. Would that be great to have > testcase follow this convention as well? :) > > If so, then... Testcases generally don't need to follow the coding conventions, they can and often it is nicer if they follow it, but we certainly also want testcases that don't follow it, otherwise we wouldn't have testsuite coverage for other coding styles (what if say a parser or preprocessor didn't work properly if there wasn't a space in between function name and ( ?). Jakub
Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote: > The change also affects vectorizer in avx case which could be seen for > gcc.dg/tree-ssa/loop-19.c test. > > After the change report says > > loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment === > loop-19_bad.c:16: note: vect_compute_data_ref_alignment: > loop-19_bad.c:16: note: can't force alignment of ref: a[j_9] > loop-19_bad.c:16: note: vect_compute_data_ref_alignment: > loop-19_bad.c:16: note: can't force alignment of ref: c[j_9] > > AFAICS first condition in ix86_data_alignment was true before the > change so 256 was a return value. > > Do we need to tweak this test also? I'd add -fno-common to the test. Jakub
Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types
2013/6/19 Jeff Law :
>
> * gcc.dg/tree-ssa/forwprop-28.c: New test.
>

In the GNU coding standard we have a space before the opening parenthesis. Would it be good to have the testcase follow this convention as well? :)

If so, then...

>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> new file mode 100644
> index 000..2c42065
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> @@ -0,0 +1,76 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop1" } */
> +
> +extern char * frob (void);
> +extern _Bool testit(void);

Missing a space before '('.

> +
> +test (int code)
> +{
> + char * temp = frob();;

Likewise. And redundant ';'.

> + int rotate = (code == 22);
> + if (temp == 0 && !rotate)
> + oof();

Likewise.

> +}
> +
> +test_2 (int code)
> +{
> + char * temp = frob();

Likewise.

> + int rotate = (code == 22);
> + if (!rotate && temp == 0)
> + oof();

Likewise.

And there are similar cases that can be fixed in test_3, test_4, test_5, test_6, test_7, and test_8.

Best regards,
jasonwucj
Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
The change also affects vectorizer in avx case which could be seen for gcc.dg/tree-ssa/loop-19.c test. After the change report says loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment === loop-19_bad.c:16: note: vect_compute_data_ref_alignment: loop-19_bad.c:16: note: can't force alignment of ref: a[j_9] loop-19_bad.c:16: note: vect_compute_data_ref_alignment: loop-19_bad.c:16: note: can't force alignment of ref: c[j_9] AFAICS first condition in ix86_data_alignment was true before the change so 256 was a return value. Do we need to tweak this test also? Thanks, Igor > Hi! > > This PR is about DATA_ALIGNMENT macro increasing alignment of some decls for > optimization purposes beyond ABI mandated levels. It is fine to emit the > vars aligned as much as we want for optimization purposes, but if we can't be > sure that references to that decl bind to the definition we increased the > alignment on (e.g. common variables, or -fpic code without hidden visibility, > weak vars etc.), we can't assume that alignment. > As DECL_ALIGN is used for both the alignment emitted for the definitions and > alignment assumed on code referring to it, this patch increases DECL_ALIGN > only on decls where decl_binds_to_current_def_p is true, and otherwise the > optimization part on top of that emits only when aligning definition. > On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI > mandated alignment increase, so I've introduced a new macro, > DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64 I > think the only one is that arrays with size 16 bytes or more (and VLAs, but > that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned). > > Bootstrapped/regtested on x86_64-linux and i686-linux. 
No idea about other > targets, I've kept them all using DATA_ALIGNMENT, which is considered > optimization increase only now, if there is some ABI mandated alignment > increase on other targets, that should be done in DATA_ABI_ALIGNMENT as well > as DATA_ALIGNMENT. The patch causes some vectorization regressions (tweaked > in the testsuite), especially for common vars where we used to align say > common arrays to 256 bits rather than the ABI mandated 128 bits, or for -fpic > code, but I'm afraid we need to live with that, if you compile another file > with say icc or some other compiler which doesn't increase alignment beyond > ABI mandated level and that other file defines the var say as non-common, we > have wrong-code. > > 2013-06-07 Jakub Jelinek > > PR target/56564 > * varasm.c (align_variable): Don't use DATA_ALIGNMENT or > CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl). > Use DATA_ABI_ALIGNMENT for that case instead if defined. > (get_variable_align): New function. > (get_variable_section, emit_bss, emit_common, > assemble_variable_contents, place_block_symbol): Use > get_variable_align instead of DECL_ALIGN. > (assemble_noswitch_variable): Add align argument, use it > instead of DECL_ALIGN. > (assemble_variable): Adjust caller. Use get_variable_align > instead of DECL_ALIGN. > * config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment > caller. > (DATA_ABI_ALIGNMENT): Define. > * config/i386/i386-protos.h (x86_data_alignment): Adjust prototype. > * config/i386/i386.c (x86_data_alignment): Add opt argument. If > opt is false, only return the psABI mandated alignment increase. > * doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document. > * doc/tm.texi: Regenerated. > > * gcc.target/i386/pr56564-1.c: New test. > * gcc.target/i386/pr56564-2.c: New test. > * gcc.target/i386/pr56564-3.c: New test. > * gcc.target/i386/pr56564-4.c: New test. > * gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common. 
> * gcc.target/i386/avx256-unaligned-store-1.c: Likewise. > * gcc.target/i386/avx256-unaligned-store-3.c: Likewise. > * gcc.target/i386/avx256-unaligned-store-4.c: Likewise. > * gcc.target/i386/vect-sizes-1.c: Likewise. > * gcc.target/i386/memcpy-1.c: Likewise. > * gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize. > * gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise. >
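The rule described in the quoted patch, that code may only assume the optimization-driven alignment when the reference binds to the definition that received it, can be modelled with a small sketch. Everything here is hypothetical: the function names are invented, alignments are in bits, the 16-byte rule is the psABI rule as quoted in the message, and the 32-byte threshold for the 256-bit bump is an assumption chosen to match the "256 bits rather than 128 bits" example.

```cpp
#include <cassert>
#include <cstddef>

// ABI-mandated alignment: objects of 16 bytes or more get at least 128
// bits (simplified from the array rule quoted in the message).
unsigned abi_alignment (std::size_t size_bytes, unsigned natural_align)
{
  assert (natural_align > 0);
  return (size_bytes >= 16 && natural_align < 128) ? 128 : natural_align;
}

// Optimization alignment: additionally bump large objects to 256 bits
// (threshold assumed for illustration).
unsigned optimized_alignment (std::size_t size_bytes, unsigned natural_align)
{
  unsigned align = abi_alignment (size_bytes, natural_align);
  return (size_bytes >= 32 && align < 256) ? 256 : align;
}

// Alignment that code referring to the variable may rely on.  If the
// reference might bind to a definition from another compiler (common
// symbols, -fpic without hidden visibility, weak variables), only the
// ABI alignment is safe to assume.
unsigned assumed_alignment (std::size_t size_bytes, unsigned natural_align,
                            bool binds_to_current_def)
{
  return binds_to_current_def
         ? optimized_alignment (size_bytes, natural_align)
         : abi_alignment (size_bytes, natural_align);
}
```

In this model a 64-byte int array that binds locally may be treated as 256-bit aligned, while the same array as a common symbol may only be treated as 128-bit aligned, which is why several vectorizer tests in the ChangeLog gained -fno-common.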