ping^2: [patch] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407)

2013-06-19 Thread Jan Kratochvil
Hi,

[patch update] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, 
libc/15407)
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00775.html
Message-ID: <20130514191244.ga12...@host2.jankratochvil.net>


Thanks,
Jan


Re: [c++-concepts] code review

2013-06-19 Thread Gabriel Dos Reis
On Wed, Jun 19, 2013 at 9:21 AM, Jason Merrill  wrote:
> On 06/18/2013 12:27 PM, Andrew Sutton wrote:
>>
>> There was a bug in instantiation_dependent_expr_r that would cause
>> trait expressions like __is_class(int) to be marked as type dependent.
>> It was always testing the 2nd operand, even for unary traits
>> (NULL_TREE turns out to be type dependent).
>
>
> I fixed that last month:
>
> 2013-05-20  Jason Merrill  
>
> PR c++/57016
> * pt.c (instantiation_dependent_r) [TRAIT_EXPR]: Only check
> type2 if there is one.

The last merge to c++-concepts was a little bit before
that (2013-06-16), so the fix wasn't on the branch.  As I discussed
with Andrew a couple of weeks ago, I have been holding back the
merge from trunk because he has these patch series in the queue.
That also means we don't get these sorts of fixes for a while.

Maybe I should just go ahead with the merge so that we have
conflicts, and potentially less duplication of work in terms of fixing
the compiler.
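
A minimal sketch of the class of bug discussed above, in which a unary trait has no second operand and a missing operand is mistaken for a dependent one.  The types `Type` and `TraitExpr` and all function names here are hypothetical stand-ins, not GCC's tree API.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's tree nodes; not the real data structures.
struct Type {
  bool dependent;
};

struct TraitExpr {
  const Type *type1;  // always present
  const Type *type2;  // null for unary traits such as __is_class(int)
};

// Helper that, like the behaviour described in the thread, reports a
// missing operand as dependent.
inline bool is_dependent_or_null(const Type *t) {
  return t == nullptr || t->dependent;
}

// Buggy variant: always tests the second operand, even for unary traits.
inline bool trait_dependent_buggy(const TraitExpr &e) {
  return is_dependent_or_null(e.type1) || is_dependent_or_null(e.type2);
}

// Fixed variant: only check type2 if there is one.
inline bool trait_dependent_fixed(const TraitExpr &e) {
  assert(e.type1 != nullptr);
  return e.type1->dependent || (e.type2 != nullptr && e.type2->dependent);
}
```

With a non-dependent `type1` and a null `type2`, the buggy variant reports the expression as dependent while the fixed one does not.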

-- Gaby


Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=

2013-06-19 Thread Andrew Pinski
On Wed, Jun 19, 2013 at 7:55 PM, Sandra Loosemore
 wrote:
> On 06/19/2013 05:10 PM, Joseph S. Myers wrote:
>>
>>
>> I don't think it's right to depend on the standard version like this.  The
>> existing semantics for GNU C and C++ follow the memory model for all
>> standard versions, and that's the sort of thing that shouldn't depend on
>> the target architecture.  In the absence of explicit
>> -fstrict-volatile-bitfields, semantics conflicting with the memory model
>> should only be enabled by one of the --param options to allow data races,
>> and not by some default option relating to something in a target ABI
>> that's incompatible with the normal language semantics.
>
>
> Hmmmm.  Well, I'm willing to hack up a patch to remove
> -fstrict-volatile-bitfields from the defaults for all backends, if it is the
> consensus of the GCC community to do that and it unblocks consideration of
> the other wrong-code bug fix patches in the series.
>
> I'm concerned, though, that we should consider the perspective of GCC users
> on the affected targets as well as that of a standards committee member.
> E.g., suppose I am an ARM user with some code manipulating memory-mapped I/O
> registers that was originally developed with an older version of GCC, or
> with a different compiler.  Maybe it is not even my own code, but something
> I got from a chip vendor SDK.  People working with such target-specific,
> low-level code are far more likely to be familiar with and conforming to
> ARM's published guidelines for volatile bit-field access than obscure
> details of a language standard that is not even being selected as the
> dialect for compiling the code.  I think there's a good argument here that
> by retroactively applying the C11/C++11 memory model to older standard
> versions, GCC has simply broken access to memory-mapped registers on ARM.
> After all, the AAPCS predates these newer standards and older versions of
> GCC at least made an effort to implement the behavior AAPCS required, and if
> the pr23623 testcase had been added at the time that issue was originally
> resolved back in 2006, the regression would have been caught immediately
> when the bitfield range patches were committed.
>
> I hope that by the time GCC switches to C11 and C++11 as its default
> dialects, ARM will have revised its ABI or clarified how this conflict
> should be resolved.  :-)
>
> Anyway, what do other people think?

I would rather ARM think about the issues.  This affects even AARCH64, where
most programs are going to follow the C11/C++11 memory model because of
the market being targeted.  I think it is a good idea to have
ARM resolve the issues before even breaking the C11/C++11 memory model.

I know for Cavium's AARCH64 GCC I am going to turn off
-fstrict-volatile-bitfields for AARCH64 even though that violates the
ABI, since the option violates the C/C++ standard first.  The C/C++ standard,
in my mind, is what takes precedence over the ABI.
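
For readers unfamiliar with the kind of code at stake, here is a minimal sketch of a volatile bit-field view of a device register.  The register layout and the names `ControlReg` and `set_mode` are hypothetical.  The point of -fstrict-volatile-bitfields, as documented, is that an access to such a field should use a single access of the width of the field's declared type (32 bits here), whereas the C11/C++11 memory model constrains which neighbouring memory locations a store may touch.

```cpp
#include <cstdint>

// Hypothetical layout of a 32-bit device control register.
struct ControlReg {
  volatile std::uint32_t enable : 1;
  volatile std::uint32_t mode   : 3;
  volatile std::uint32_t count  : 12;
  volatile std::uint32_t        : 16;  // reserved bits
};

// Read-modify-write of one field.  Which load/store widths the compiler
// emits for this assignment is exactly what the option influences.
inline void set_mode(ControlReg &reg, unsigned mode) {
  reg.mode = mode & 0x7u;
}
```

In real driver code the object would live at a fixed device address rather than in ordinary memory; that part is omitted here.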

Thanks,
Andrew


>
> -Sandra
>


Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Peter Bergner
On Thu, 2013-06-20 at 00:51 +0200, Torvald Riegel wrote:
> On Wed, 2013-06-19 at 14:43 -0500, Peter Bergner wrote: 
> >> I'm having trouble seeing why/when _ITM_inTransaction() is
> >> returning something other than inIrrevocableTransaction.  I'll see if I can
> >> determine why and will report back.
> > 
> > Ok, we return outsideTransaction because the nesting level (tx->nesting)
> > is zero.
> 
> That's a second bug in libitm, sorry.  Can you try with the attached
> patch additionally to the previous one?  Thanks!

Ok, with this patch, plus adding a powerpc implementation of
htm_transaction_active(), reentrant.c now executes correctly
on both HTM and non-HTM hardware for me.  Thanks for all of
your help with this!

I'd still like to hear from Andreas whether the reentrant.c test case,
with both patches, now works on S390.

I'll note that, unlike your x86 htm_transaction_active() implementation,
my implementation has to check htm_available() first before
executing the HTM instruction that tells me whether I'm in transaction
state or not; otherwise I'll get a SIGILL on non-HTM hardware.
Unfortunately, my htm_available() call uses getauxval() to query
the AUXV for a hwcap bit.  Is there a place I can stash the result
of the first call, so that subsequent calls use the cached value?
Normally, I could use a static var, but I doubt that is what we want
to do in static inline functions.
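
One possible shape for caching such a probe, sketched under assumptions: `probe_htm_hardware` is a hypothetical stand-in for the real hwcap query, and `probe_calls` exists only to make the caching observable.  In C++ a function-local static in an `inline` function with external linkage is a single object shared across translation units, and its initialization is thread-safe since C++11; with `static inline` each translation unit would get its own copy.  Whether the guard this requires is acceptable inside libitm is a separate question.

```cpp
// Counter used only to observe how often the probe runs.
inline int &probe_calls() {
  static int n = 0;
  return n;
}

// Hypothetical stand-in for an expensive capability query.
inline bool probe_htm_hardware() {
  ++probe_calls();
  return false;  // pretend this machine has no HTM
}

// Cached wrapper: the probe runs on the first call only.
inline bool htm_available_cached() {
  static const bool available = probe_htm_hardware();
  return available;
}
```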


Peter





Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=

2013-06-19 Thread Sandra Loosemore

On 06/19/2013 05:10 PM, Joseph S. Myers wrote:

> I don't think it's right to depend on the standard version like this.  The
> existing semantics for GNU C and C++ follow the memory model for all
> standard versions, and that's the sort of thing that shouldn't depend on
> the target architecture.  In the absence of explicit
> -fstrict-volatile-bitfields, semantics conflicting with the memory model
> should only be enabled by one of the --param options to allow data races,
> and not by some default option relating to something in a target ABI
> that's incompatible with the normal language semantics.


Hmmmm.  Well, I'm willing to hack up a patch to remove
-fstrict-volatile-bitfields from the defaults for all backends, if it is 
the consensus of the GCC community to do that and it unblocks 
consideration of the other wrong-code bug fix patches in the series.


I'm concerned, though, that we should consider the perspective of GCC 
users on the affected targets as well as that of a standards committee 
member.  E.g., suppose I am an ARM user with some code manipulating 
memory-mapped I/O registers that was originally developed with an older 
version of GCC, or with a different compiler.  Maybe it is not even my 
own code, but something I got from a chip vendor SDK.  People working 
with such target-specific, low-level code are far more likely to be 
familiar with and conforming to ARM's published guidelines for volatile 
bit-field access than obscure details of a language standard that is not 
even being selected as the dialect for compiling the code.  I think 
there's a good argument here that by retroactively applying the 
C11/C++11 memory model to older standard versions, GCC has simply broken 
access to memory-mapped registers on ARM.  After all, the AAPCS predates 
these newer standards and older versions of GCC at least made an effort 
to implement the behavior AAPCS required, and if the pr23623 testcase 
had been added at the time that issue was originally resolved back in 
2006, the regression would have been caught immediately when the 
bitfield range patches were committed.


I hope that by the time GCC switches to C11 and C++11 as its default 
dialects, ARM will have revised its ABI or clarified how this conflict 
should be resolved.  :-)


Anyway, what do other people think?

-Sandra



RE: [PATCH, ARM, iWMMXT] Check IWMMXT_GR_REGS in the SECONDARY_RELOAD MACRO

2013-06-19 Thread Xinyu Qi
At 2013-05-24 15:19:36,"Chung-Ju Wu"  wrote: 
> 2013/5/24 Xinyu Qi :
> > Hi,
> >
> >   For this simple case, compiled with option -march=iwmmxt -O,
> >
> > #define N 64
> > signed int b[N];
> > signed long long j[N], d[N];
> > void foo (void)
> > {
> >   int i;
> >   for (i = 0; i < N; i++)
> >     j[i] = d[i] << b[i];
> > }
> >
> > an internal compiler error occurs:
> > error: insn does not satisfy its constraints:
> > (insn 112 74 75 3 (set (reg:SI 96 wcgr0)
> > (mem/c:SI (plus:SI (reg:SI 3 r3 [orig:174 ivtmp.19 ] [174])
> > (reg/f:SI 0 r0 [183])) [0 MEM[symbol: b, index: ivtmp.19_14, offset: 0B]+0 S4 A32])) {*iwmmxt_movsi_insn}
> >  (nil))
> >
> > A load into a WMMX wcgr register cannot use the register-offset
> > addressing mode, and the reload pass fails to fix it up, which causes this error.
> > This issue could be solved by adding a check for the IWMMXT_GR_REGS class
> > in the SECONDARY_RELOAD macro if TARGET_IWMMXT.  The current code only
> > checks the IWMMXT_REGS group.
> > Patch attached with a new test.
> > Passes the full DejaGnu test suite.  No regressions.
> >
> > Is this fix proper?
> > OK for trunk?
> >
> 
> I cannot approve it.  But here are some comments and hope it helps.

Hi Chung-Ju,

Thanks for your comments:)
I fixed the typo you mentioned and regenerated the patch attached.

ChangeLog
gcc/
2013-05-24  Xinyu Qi  

* config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS): Check
IWMMXT_GR_REGS register class.
(SECONDARY_INPUT_RELOAD_CLASS): Likewise.

testsuite/
2013-05-24  Xinyu Qi  

* gcc.target/arm/mmx-3.c: New test.

Thanks,
Xinyu

> 
> 
> > ChangeLog
> > gcc/
> > 2013-05-24  Xinyu Qi  
> >
> > * config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS):
> Check IWMMXT_GR_REGS.
> 
> This line just ends at 81 column.
> How about this one?
> 
> 2013-05-24  Xinyu Qi  
> 
> * config/arm/arm.h (SECONDARY_OUTPUT_RELOAD_CLASS): Check
> IWMMXT_GR_REGS register class.
> (SECONDARY_INPUT_RELOAD_CLASS): Likewise.
> 
> >
> > testsuite/
> > 2013-05-24  Xinyu Qi  
> >
> > * gcc.target/arm/mmx-3.c: New test.
> 
> 
> > Index: gcc/config/arm/arm.h
> > ===================================================================
> > --- gcc/config/arm/arm.h(revision 199090)
> > +++ gcc/config/arm/arm.h(working copy)
> > @@ -1280,7 +1280,8 @@
> >((TARGET_VFP && TARGET_HARD_FLOAT\
> >  && IS_VFP_CLASS (CLASS))\
> > ? coproc_secondary_reload_class (MODE, X, FALSE)\
> > -   : (TARGET_IWMMXT && (CLASS) == IWMMXT_REGS)\
> > +   : (TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS)\
> > +|| (CLASS) == IWMMXT_GR_REGS)\
> 
> I think it should be
> 
> +   : (TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS\
> +|| (CLASS) == IWMMXT_GR_REGS))\
> 
> 
> > ? coproc_secondary_reload_class (MODE, X, TRUE)\
> > : TARGET_32BIT\
> > ? (((MODE) == HImode && ! arm_arch4 && true_regnum (X) == -1) \
> > @@ -1293,7 +1294,8 @@
> >((TARGET_VFP && TARGET_HARD_FLOAT\
> >  && IS_VFP_CLASS (CLASS))\
> >  ? coproc_secondary_reload_class (MODE, X, FALSE) :\
> > -(TARGET_IWMMXT && (CLASS) == IWMMXT_REGS) ?\
> > +(TARGET_IWMMXT && ((CLASS) == IWMMXT_REGS\
> > +   || (CLASS) == IWMMXT_GR_REGS)) ?\
> >  coproc_secondary_reload_class (MODE, X, TRUE) :\
> > (TARGET_32BIT ?\
> >  (((CLASS) == IWMMXT_REGS || (CLASS) == IWMMXT_GR_REGS)\
> 
> It seems that you didn't CC arm maintainer.
> Let me do this for you. :)
> 
> 
> Best regards,
> jasonwucj
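
The parenthesization point raised in the review can be illustrated with a small sketch.  The enum and function names are hypothetical stand-ins for the macro operands.  Because && binds tighter than ||, the condition as first posted lets the IWMMXT_GR_REGS test escape the TARGET_IWMMXT guard.

```cpp
// Hypothetical stand-ins for the register classes named in the macro.
enum RegClass { GENERAL_REGS, IWMMXT_REGS, IWMMXT_GR_REGS };

// How the condition as first posted is parsed, written with explicit
// grouping: the second class test is not guarded by target_iwmmxt.
inline bool cond_as_posted(bool target_iwmmxt, RegClass cls) {
  return (target_iwmmxt && cls == IWMMXT_REGS) || cls == IWMMXT_GR_REGS;
}

// The grouping suggested in the review: both class tests are guarded.
inline bool cond_as_suggested(bool target_iwmmxt, RegClass cls) {
  return target_iwmmxt && (cls == IWMMXT_REGS || cls == IWMMXT_GR_REGS);
}
```

The two versions differ only when the target flag is false and the class is IWMMXT_GR_REGS.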


GR_secondary_reload.diff
Description: GR_secondary_reload.diff


Re: new port: msp430-elf, revision 3

2013-06-19 Thread DJ Delorie

> A random spotting; copyright header replacement miss, including
> but maybe not limited to:

Doh!  I'll scan them all and fix them.  Thanks!


Re: new port: msp430-elf, revision 3

2013-06-19 Thread Hans-Peter Nilsson
On Wed, 19 Jun 2013, DJ Delorie wrote:
>
> Third revision, mostly the same as the last, haven't heard any additional
> feedback in the last few weeks.  Ok to commit yet?

> [libgcc]
>
>   * config.host (msp*-*-elf): New.
>   * config/msp430/: New port.

A random spotting; copyright header replacement miss, including
but maybe not limited to:

> Index: libgcc/config/msp430/srai.S
> ===================================================================
> --- libgcc/config/msp430/srai.S   (revision 0)
> +++ libgcc/config/msp430/srai.S   (revision 0)
> @@ -0,0 +1,114 @@
> +/* Copyright (c) 2012 Red Hat Incorporated.
> +   All rights reserved.
> +
> +   Redistribution and use in source and binary forms, with or without
> +   modification, are permitted provided that the following conditions
> +   are met:

> Index: libgcc/config/msp430/slli.S
> ===================================================================
> --- libgcc/config/msp430/slli.S   (revision 0)
> +++ libgcc/config/msp430/slli.S   (revision 0)
> @@ -0,0 +1,116 @@
> +/* Copyright (c) 2012 Red Hat Incorporated.
> +   All rights reserved.

brgds, H-P


Re: [C++ Patch] PR 57645

2013-06-19 Thread Paolo Carlini

Hi again,

On 06/19/2013 03:37 PM, Paolo Carlini wrote:

> Hi,
>
> when I implemented Core/1123 "Destructors should be noexcept by
> default", unfortunately I caused this regression, present now in
> mainline and 4_8-branch.
>
> When the destructor is user provided, with no exception
> specification, and the type has data members (not bases, those are
> already Ok) whose destructors can throw, the destructor is
> wrongly deduced to be noexcept. The reason is that
> deduce_noexcept_on_destructors is called from check_bases_and_members
> after check_bases but *before* check_methods, and therefore the latter
> does work relevant to the deduction too late, namely possibly setting
> TYPE_HAS_NONTRIVIAL_DESTRUCTOR.

I can confirm that the issue in very general terms is this one, but my
patch isn't Ok. Sorry. For example, it doesn't handle the
defaulted destructor case correctly.
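
A small self-contained illustration of the expected behaviour, as a sketch rather than the PR's actual testcase; the type names are made up for the example.  Under the C++11 rule from Core issue 1123, a destructor declared without an exception specification gets the specification an implicitly declared one would have, so a member whose destructor may throw makes the enclosing destructor noexcept(false).

```cpp
#include <type_traits>

// A type whose destructor is explicitly allowed to throw.
struct MayThrow {
  ~MayThrow() noexcept(false) {}
};

// User-provided destructor, no exception specification, and a data
// member whose destructor may throw.
struct Holder {
  MayThrow member;
  ~Holder() {}
};

// Same shape, but every member has a non-throwing destructor.
struct Plain {
  int value;
  ~Plain() {}
};

static_assert(!std::is_nothrow_destructible<Holder>::value,
              "Holder's destructor should be deduced noexcept(false)");
static_assert(std::is_nothrow_destructible<Plain>::value,
              "Plain's destructor should be deduced noexcept(true)");
```

A compiler affected by the regression described above would be expected to reject the first static_assert.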


If, in check_bases_and_members, I simply move 
deduce_noexcept_on_destructors after check_methods and nothing else, all 
the new testcases are fine + the tests added for Core/1123, but there 
are regressions, for example for testcases involving virtual 
destructors, eg, debug/dwarf2/non-virtual-thunk.C.


All in all, the issue seems rather nasty to me; I'm afraid I will need
some help if we want to make substantive progress on it quickly.


Thanks,
Paolo.


patch to fix PR57604

2013-06-19 Thread Vladimir Makarov

I hope the following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57604

However, I have no suitable hardware to check this.

The patch also adds a comment about one recent change as it was done in 
the same function.


The patch was successfully bootstrapped and tested on x86/x86-64 and 
s390x (including building java).


Committed as rev. 200227.

2013-06-19  Vladimir Makarov  

PR bootstrap/57604
* lra.c (emit_add3_insn, emit_add2_insn): New functions.
(lra_emit_add): Use the functions.  Add comment about Y as an
address segment.
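
The pattern the new helper functions follow (emit an insn, ask whether the target recognizes it, and roll back if not) can be sketched in isolation.  Everything below is a hypothetical model: the instruction stream is a vector of strings and `recognized` is a toy predicate, standing in for GCC's insn chain, recog_memoized, get_last_insn and delete_insns_since.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical instruction stream; the real code works on the RTL insn chain.
using InsnStream = std::vector<std::string>;

// Toy recognizer: pretend the target only has two-operand adds.
inline bool recognized(const std::string &insn) {
  return insn.rfind("add2 ", 0) == 0;  // true if insn starts with "add2 "
}

// Emit INSN, keep it only if it is recognized, otherwise restore the
// stream to the state it had before the attempt.
inline bool emit_if_recognized(InsnStream &stream, const std::string &insn) {
  const std::size_t last = stream.size();  // remember the rollback point
  stream.push_back(insn);
  if (!recognized(stream.back())) {
    stream.resize(last);                   // delete everything emitted since
    return false;
  }
  return true;
}
```

A caller can then fall back to another form when the first attempt returns false, which is the role emit_add2_insn plays in the patch.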

Index: lra.c
===================================================================
--- lra.c   (revision 200174)
+++ lra.c   (working copy)
@@ -242,6 +242,42 @@ lra_delete_dead_insn (rtx insn)
   lra_set_insn_deleted (insn);
 }
 
+/* Emit insn x = y + z.  Return NULL if we failed to do it.
+   Otherwise, return the insn.  We don't use gen_add3_insn as it might
+   clobber CC.  */
+static rtx
+emit_add3_insn (rtx x, rtx y, rtx z)
+{
+  rtx insn, last;
+
+  last = get_last_insn ();
+  insn = emit_insn (gen_rtx_SET (VOIDmode, x,
+gen_rtx_PLUS (GET_MODE (y), y, z)));
+  if (recog_memoized (insn) < 0)
+{
+  delete_insns_since (last);
+  insn = NULL_RTX;
+}
+  return insn;
+}
+
+/* Emit insn x = x + y.  Return the insn.  We use gen_add2_insn as the
+   last resort.  */
+static rtx
+emit_add2_insn (rtx x, rtx y)
+{
+  rtx insn;
+
+  insn = emit_add3_insn (x, x, y);
+  if (insn == NULL_RTX)
+{
+  insn = gen_add2_insn (x, y);
+  if (insn != NULL_RTX)
+   emit_insn (insn);
+}
+  return insn;
+}
+
 /* Target checks operands through operand predicates to recognize an
insn.  We should have a special precaution to generate add insns
which are frequent results of elimination.
@@ -260,10 +296,10 @@ lra_emit_add (rtx x, rtx y, rtx z)
   rtx a1, a2, base, index, disp, scale, index_scale;
   bool ok_p;
 
-  insn = gen_add3_insn (x, y, z);
+  insn = emit_add3_insn (x, y, z);
   old = max_reg_num ();
   if (insn != NULL_RTX)
-emit_insn (insn);
+;
   else
 {
   disp = a2 = NULL_RTX;
@@ -306,12 +342,14 @@ lra_emit_add (rtx x, rtx y, rtx z)
  || (disp != NULL_RTX && ! CONSTANT_P (disp))
  || (scale != NULL_RTX && ! CONSTANT_P (scale)))
{
- /* It is not an address generation.   Probably we have no 3 op
-add.  Last chance is to use 2-op add insn.  */
+ /* Probably we have no 3 op add.  Last chance is to use 2-op
+add insn.  To succeed, don't move Z to X as an address
+segment always comes in Y.  Otherwise, we might fail when
+adding the address segment to register.  */
  lra_assert (x != y && x != z);
  emit_move_insn (x, y);
- insn = gen_add2_insn (x, z);
- emit_insn (insn);
+ insn = emit_add2_insn (x, z);
+ lra_assert (insn != NULL_RTX);
}
   else
{
@@ -322,8 +360,8 @@ lra_emit_add (rtx x, rtx y, rtx z)
  /* Generate x = index_scale; x = x + base.  */
  lra_assert (index_scale != NULL_RTX && base != NULL_RTX);
  emit_move_insn (x, index_scale);
- insn = gen_add2_insn (x, base);
- emit_insn (insn);
+ insn = emit_add2_insn (x, base);
+ lra_assert (insn != NULL_RTX);
}
  else if (scale == NULL_RTX)
{
@@ -337,14 +375,14 @@ lra_emit_add (rtx x, rtx y, rtx z)
  delete_insns_since (last);
  /* Generate x = disp; x = x + base.  */
  emit_move_insn (x, disp);
- insn = gen_add2_insn (x, base);
- emit_insn (insn);
+ insn = emit_add2_insn (x, base);
+ lra_assert (insn != NULL_RTX);
}
  /* Generate x = x + index.  */
  if (index != NULL_RTX)
{
- insn = gen_add2_insn (x, index);
- emit_insn (insn);
+ insn = emit_add2_insn (x, index);
+ lra_assert (insn != NULL_RTX);
}
}
  else
@@ -355,16 +393,12 @@ lra_emit_add (rtx x, rtx y, rtx z)
  ok_p = false;
  if (recog_memoized (insn) >= 0)
{
- insn = gen_add2_insn (x, disp);
+ insn = emit_add2_insn (x, disp);
  if (insn != NULL_RTX)
{
- emit_insn (insn);
- insn = gen_add2_insn (x, disp);
+ insn = emit_add2_insn (x, disp);
  if (insn != NULL_RTX)
-   {
- emit_insn (insn);
- ok_p = true;
-   }
+   ok_p = true;
}
}
  if (! ok_p)
@@ -372,10 +406,10 @@ lra_emit_add (rtx

Re: patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=

2013-06-19 Thread Joseph S. Myers
On Wed, 19 Jun 2013, Sandra Loosemore wrote:

> On 06/17/2013 06:02 PM, Sandra Loosemore wrote:
> > 
> > I had another thought:  perhaps -fstrict-volatile-bitfields could remain
> > the default on targets where it currently is, but it can be overridden
> > by an appropriate -std= option.  Perhaps also GCC could give an error if
> > -fstrict-volatile-bitfields is given explicitly with an incompatible
> > -std= option.
> 
> Like this.  This patch is intended to be applied on top of the other 5 pieces
> in this series, although in theory it's independent of them.  OK to commit,
> and does this resolve the objection to part 3?

I don't think it's right to depend on the standard version like this.  The 
existing semantics for GNU C and C++ follow the memory model for all 
standard versions, and that's the sort of thing that shouldn't depend on 
the target architecture.  In the absence of explicit 
-fstrict-volatile-bitfields, semantics conflicting with the memory model 
should only be enabled by one of the --param options to allow data races, 
and not by some default option relating to something in a target ABI 
that's incompatible with the normal language semantics.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Torvald Riegel
On Wed, 2013-06-19 at 14:43 -0500, Peter Bergner wrote: 
> On Wed, 2013-06-19 at 10:57 -0500, Peter Bergner wrote:
> > On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote:
> > > This is due to the following in _ITM_inTransaction():
> > > 
> > > 47  if (tx && (tx->nesting > 0))
> > > (gdb) p tx
> > > $2 = (GTM::gtm_thread *) 0x10901bf0
> > > (gdb) p tx->nesting
> > > $3 = 1
> > > (gdb) step
> > > 49  if (tx->state & gtm_thread::STATE_IRREVOCABLE)
> > > (gdb) p tx->state
> > > $4 = 3
> > > (gdb) p gtm_thread::STATE_IRREVOCABLE
> > > $5 = 2
> > > (gdb) step
> > > 50return inIrrevocableTransaction;
> > 
> > Bah, ignore this.  It's a different call that is returning something other
> > than inIrrevocableTransaction.  Unfortunately, gdb is having problems inside
> > hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is
> > returning something other than inIrrevocableTransaction.  I'll see if I can
> > determine why and will report back.
> 
> Ok, we return outsideTransaction because the nesting level (tx->nesting)
> is zero.

That's a second bug in libitm, sorry.  Can you try with the attached
patch additionally to the previous one?  Thanks!

Torvald


commit 02dde6bb91107792fb0cb9f5c4785d25b6aa0e3c
Author: Torvald Riegel 
Date:   Thu Jun 20 00:46:59 2013 +0200

libitm: Handle HTM fastpath in status query functions.

diff --git a/libitm/config/x86/target.h b/libitm/config/x86/target.h
index 77b627f..063c09e 100644
--- a/libitm/config/x86/target.h
+++ b/libitm/config/x86/target.h
@@ -125,6 +125,13 @@ htm_abort_should_retry (uint32_t begin_ret)
 {
   return begin_ret & _XABORT_RETRY;
 }
+
+/* Returns true iff a hardware transaction is currently being executed.  */
+static inline bool
+htm_transaction_active ()
+{
+  return _xtest() != 0;
+}
 #endif
 
 
diff --git a/libitm/query.cc b/libitm/query.cc
index 5707321..0ac3eda 100644
--- a/libitm/query.cc
+++ b/libitm/query.cc
@@ -43,6 +43,15 @@ _ITM_libraryVersion (void)
 _ITM_howExecuting ITM_REGPARM
 _ITM_inTransaction (void)
 {
+#if defined(USE_HTM_FASTPATH)
+  // If we use the HTM fastpath, we cannot reliably detect whether we are
+  // in a transaction because this function can be called outside of
+  // a transaction and thus we can't deduce this by looking at just the serial
+  // lock.  This function isn't used in practice currently, so the easiest
+  // way to handle it is to just abort.
+  if (htm_transaction_active())
+htm_abort();
+#endif
   struct gtm_thread *tx = gtm_thr();
   if (tx && (tx->nesting > 0))
 {
@@ -58,6 +67,11 @@ _ITM_inTransaction (void)
 _ITM_transactionId_t ITM_REGPARM
 _ITM_getTransactionId (void)
 {
+#if defined(USE_HTM_FASTPATH)
+  // See ITM_inTransaction.
+  if (htm_transaction_active())
+htm_abort();
+#endif
   struct gtm_thread *tx = gtm_thr();
   return (tx && (tx->nesting > 0)) ? tx->id : _ITM_noTransactionId;
 }


Re: [google gcc-4_8] fix bad merge in r199218

2013-06-19 Thread Xinliang David Li
lgtm.

On Wed, Jun 19, 2013 at 3:41 PM, Rong Xu  wrote:
> Hi,
>
> This patch fixes a bad merge in r199218.
> Removing cgraph nodes in early-ipa should be allowed.
> Otherwise, we get an ICE in tree-eipa_sra with
> -freorder-functions=callgraph (without -fripa).
>
> Tested with regression tests and Google benchmarks.
>
> Thanks,
>
> -Rong


[google gcc-4_8] fix bad merge in r199218

2013-06-19 Thread Rong Xu
Hi,

This patch fixes a bad merge in r199218.
Removing cgraph nodes in early-ipa should be allowed.
Otherwise, we get an ICE in tree-eipa_sra with
-freorder-functions=callgraph (without -fripa).

Tested with regression tests and Google benchmarks.

Thanks,

-Rong


patch.diff
Description: Binary data


Re: [PATCH] PR57518, RA generated redundent code

2013-06-19 Thread Xinliang David Li
+jakub who manages GCC 4.8 releases.

David

On Wed, Jun 19, 2013 at 2:28 PM, Wei Mi  wrote:
> Yes, I think so.
>
> Regards,
> Wei.
>
> On Wed, Jun 19, 2013 at 2:00 PM, Xinliang David Li  wrote:
>> Should the patch be ported to the 4.8 branch?
>>
>> thanks,
>>
>> David
>>
>> On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov  
>> wrote:
>>> On 13-06-19 1:23 AM, Wei Mi wrote:
>>>>
>>>> Ping.
>>>>
>>>> On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi  wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518
>>>>>
>>>>> pr57518 happened because update_equiv_regs in IRA marked a reg
>>>>> equivalent with a mem, lowered its mem_cost in scan_one_insn, set
>>>>> NO_REGS to its rclass, but didn't consider the reg was used in
>>>>> paradoxical subreg which prevented the reg from being replaced by mem
>>>>> in LRA phase.
>>>>>
>>>>> This patch is to check whether a reg is used in a paradoxical subreg
>>>>> in update_equiv_regs before reg is set as equivalent to a mem.
>>>>>
>>>>> bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for
>>>>> trunk and gcc-4.8 branch?
>>>>>
>>>>>
>>> Thanks for working on this PR, Wei, and sorry for the delay with the answer
>>> (I was on vacation).
>>>
>>> In general, the PR analysis and the proposed solution looks ok.  I only
>>> worry that you are adding additional full scan of all RTL code.  It might
>>> add 0.5% to GCC compilation time if data cache is rewritten (which will
>>> happen for moderate size or big functions). It would be nice to do it on
>>> some other existing RTL traversing. Unfortunately, this info is calculated
>>> later (reg_max_width in reload or biggest_mode in LRA).  I am in doubt that
>>> other solutions I see now are better:
>>>
>>>   o calculate this info in regstat_...  function and store it in reg_info_p
>>>   o calculate it with update_equiv_regs and use it to invalidate the
>>> equiv info later
>>>
>>> The first one increases reg_info_p footprint and calculation is done many
>>> times although it is used once.
>>> The second one results in complicated code.
>>>
>>> So I think the current patch is ok to commit.
>>>
>>> Thanks, again.
>>>
>>>
>>>


Re: [PATCH] PR57518, RA generated redundent code

2013-06-19 Thread Wei Mi
Yes, I think so.

Regards,
Wei.

On Wed, Jun 19, 2013 at 2:00 PM, Xinliang David Li  wrote:
> Should the patch be ported to the 4.8 branch?
>
> thanks,
>
> David
>
> On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov  
> wrote:
>> On 13-06-19 1:23 AM, Wei Mi wrote:
>>>
>>> Ping.
>>>
>>> On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi  wrote:
>>>>
>>>> Hi,
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518
>>>>
>>>> pr57518 happened because update_equiv_regs in IRA marked a reg
>>>> equivalent with a mem, lowered its mem_cost in scan_one_insn, set
>>>> NO_REGS to its rclass, but didn't consider the reg was used in
>>>> paradoxical subreg which prevented the reg from being replaced by mem
>>>> in LRA phase.
>>>>
>>>> This patch is to check whether a reg is used in a paradoxical subreg
>>>> in update_equiv_regs before reg is set as equivalent to a mem.
>>>>
>>>> bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for
>>>> trunk and gcc-4.8 branch?
>>>>
>>>>
>> Thanks for working on this PR, Wei, and sorry for the delay with the answer
>> (I was on vacation).
>>
>> In general, the PR analysis and the proposed solution looks ok.  I only
>> worry that you are adding additional full scan of all RTL code.  It might
>> add 0.5% to GCC compilation time if data cache is rewritten (which will
>> happen for moderate size or big functions). It would be nice to do it on
>> some other existing RTL traversing. Unfortunately, this info is calculated
>> later (reg_max_width in reload or biggest_mode in LRA).  I am in doubt that
>> other solutions I see now are better:
>>
>>   o calculate this info in regstat_...  function and store it in reg_info_p
>>   o calculate it with update_equiv_regs and use it to invalidate the
>> equiv info later
>>
>> The first one increases reg_info_p footprint and calculation is done many
>> times although it is used once.
>> The second one results in complicated code.
>>
>> So I think the current patch is ok to commit.
>>
>> Thanks, again.
>>
>>
>>


Go patch committed: Check for invalid unsafe.Offsetof

2013-06-19 Thread Ian Lance Taylor
In Go it is not valid to use unsafe.Offsetof with a reference to an
embedded field that is accessed via an embedded pointer, because there
is no reasonable answer to return in that case.  Unfortunately gccgo was
returning bogus results for that case.  This patch from Rémy Oudompheng
fixes it to return an error.  There is a test case in the master
testsuite that will be imported into gccgo in due course.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline
and 4.8 branch.
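
The check the patch adds can be modelled in a few lines.  This is a simplified sketch with a hypothetical `FieldRef` structure, not the gofrontend's Field_reference_expression class: it walks from the selector toward the original base, stops at the first explicit selector, and rejects the expression if an implicit (embedded-field) selector is reached through a pointer.

```cpp
// Hypothetical, simplified model of a chain of field selectors.
struct FieldRef {
  bool implicit;         // selector was inserted for an embedded field
  bool through_pointer;  // its base expression is a pointer dereference
  const FieldRef *base;  // next selector toward the original base, or null
};

// Returns true if Offsetof on this selector chain should be an error.
inline bool offsetof_implies_indirection(const FieldRef *farg) {
  while (farg != nullptr) {
    if (!farg->implicit)
      break;                   // explicit selector: nothing more to check
    if (farg->through_pointer)
      return true;             // embedded field reached via a pointer
    farg = farg->base;         // go up toward the original base
  }
  return false;
}
```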

Ian

diff -r 3d794090fa86 go/expressions.cc
--- a/go/expressions.cc	Tue Jun 18 16:01:29 2013 -0700
+++ b/go/expressions.cc	Wed Jun 19 14:16:32 2013 -0700
@@ -6955,6 +6955,26 @@
   return Expression::make_error(loc);
 }
 
+  if (this->code_ == BUILTIN_OFFSETOF)
+{
+  Expression* arg = this->one_arg();
+  Field_reference_expression* farg = arg->field_reference_expression();
+  while (farg != NULL)
+	{
+	  if (!farg->implicit())
+	break;
+	  // When the selector refers to an embedded field,
+	  // it must not be reached through pointer indirections.
+	  if (farg->expr()->deref() != farg->expr())
+	{
+	  this->report_error(_("argument of Offsetof implies indirection of an embedded field"));
+	  return this;
+	}
+	  // Go up until we reach the original base.
+	  farg = farg->expr()->field_reference_expression();
+	}
+}
+ 
   if (this->is_constant())
 {
   Numeric_constant nc;
diff -r 3d794090fa86 go/expressions.h
--- a/go/expressions.h	Tue Jun 18 16:01:29 2013 -0700
+++ b/go/expressions.h	Wed Jun 19 14:16:32 2013 -0700
@@ -1872,7 +1872,7 @@
   Field_reference_expression(Expression* expr, unsigned int field_index,
 			 Location location)
 : Expression(EXPRESSION_FIELD_REFERENCE, location),
-  expr_(expr), field_index_(field_index), called_fieldtrack_(false)
+  expr_(expr), field_index_(field_index), implicit_(false), called_fieldtrack_(false)
   { }
 
   // Return the struct expression.


Re: [PATCH] PR57518, RA generated redundent code

2013-06-19 Thread Xinliang David Li
Should the patch be ported to the 4.8 branch?

thanks,

David

On Wed, Jun 19, 2013 at 11:46 AM, Vladimir Makarov  wrote:
> On 13-06-19 1:23 AM, Wei Mi wrote:
>>
>> Ping.
>>
>> On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi  wrote:
>>>
>>> Hi,
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518
>>>
>>> pr57518 happened because update_equiv_regs in IRA marked a reg
>>> equivalent with a mem, lowered its mem_cost in scan_one_insn, set
>>> NO_REGS to its rclass, but didn't consider the reg was used in
>>> paradoxical subreg which prevented the reg from being replaced by mem
>>> in LRA phase.
>>>
>>> This patch is to check whether a reg is used in a paradoxical subreg
>>> in update_equiv_regs before reg is set as equivalent to a mem.
>>>
>>> bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for
>>> trunk and gcc-4.8 branch?
>>>
>>>
> Thanks for working on this PR, Wei, and sorry for the delay with the answer
> (I was on vacation).
>
> In general, the PR analysis and the proposed solution looks ok.  I only
> worry that you are adding additional full scan of all RTL code.  It might
> add 0.5% to GCC compilation time if data cache is rewritten (which will
> happen for moderate size or big functions). It would be nice to do it on
> some other existing RTL traversing. Unfortunately, this info is calculated
> later (reg_max_width in reload or biggest_mode in LRA).  I am in doubt that
> other solutions I see now are better:
>
>   o calculate this info in regstat_...  function and store it in reg_info_p
>   o calculate it with update_equiv_regs and use it to invalidate the
> equiv info later
>
> The first one increases reg_info_p footprint and calculation is done many
> times although it is used once.
> The second one results in complicated code.
>
> So I think the current patch is ok to commit.
>
> Thanks, again.
>
>
>


Re: Symtab cleanups 4/17 - ICE in GUPC due to use of init section

2013-06-19 Thread Gary Funck
On 06/19/13 09:26:30, Gary Funck wrote:
> The variable declaration tree node looks about right to me.
> However, it never makes it into the output assembler file.
> 
> What is the recommended method for making sure that the
> static variable created above is associated with the current
> translation unit and that its initialization makes it into
> the assembler output file?

Adding this call to the upc_create_static_var() routine
implemented the necessary binding:

  pushdecl_top_level (decl);

pushdecl_top_level() is defined in c/c-decl.c:

/* Record X as belonging to file scope.
   This is used only internally by the Objective-C front end,
   and is limited to its needs.  duplicate_decls is not called;
   if there is any preexisting decl for this identifier, it is an ICE.  */

tree
pushdecl_top_level (tree x)
[...]

This also required that the temporary variable being created
be named (have a non-null DECL_NAME()).

This doesn't seem ideal, but it does generate the desired code.



Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Matthias Klose
On 19.06.2013 19:46, Jakub Jelinek wrote:
> On Wed, Jun 19, 2013 at 05:29:42PM +0200, Matthias Klose wrote:
>> well, I did fix this assumption last year in gcc.c, then lets fix it in other
>> places too, just adding a mode parameter to the public find_a_file function.
>> Testing the attached patch.
> 
> Ok, provided you:
> 1) write proper ChangeLog
> 2) adjust the gcc-ar.c change (because it won't apply cleanly now that
>    I've committed the other gcc-ar.c fix)
> 3) 
>> --- file-find.h  (revision 200203)
>> +++ file-find.h  (working copy)
>> @@ -38,7 +38,7 @@
>>  };
>>  
>>  extern void find_file_set_debug (bool);
>> -extern char *find_a_file (struct path_prefix *, const char *);
>> +extern char *find_a_file (struct path_prefix *, const char *, int mode);
> 
> Remove " mode" above, none of the arguments have names, so adding it
> is both inconsistent and useless.
> 
> 4)
>>if (ld_file_name == 0)
>> -ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]);
>> +ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker], 
>> X_OK);
> 
> This line looks too long now.

Addressed 1-3.  As for 4, the semicolon is in column 79, so the line still
fits.  Committed to the trunk and the 4.8 branch as attached.

  Matthias



2013-06-19  Matthias Klose  

PR driver/57651
* file-find.h (find_a_file): Add a mode parameter.
* file-find.c (find_a_file): Likewise.
* gcc-ar.c (main): Call find_a_file with R_OK for the plugin,
with X_OK for the executables.
* collect2.c (main): Call find_a_file with X_OK.
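
To illustrate what the new mode argument buys, here is a minimal,
self-contained sketch.  It is not the GCC code: find_in_dirs, its buffer
parameters and the directory array are hypothetical stand-ins for
find_a_file and struct path_prefix.  Only the stat/S_ISDIR/access test is
modelled on the patch below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for find_a_file: look for NAME in each directory
   of the NULL-terminated array DIRS and write the first match into BUF.
   MODE goes straight to access(), so the caller picks R_OK for data files
   such as a plugin and X_OK for programs.  Returns BUF on success and
   NULL when nothing suitable is found.  */
char *
find_in_dirs (const char *const *dirs, const char *name, int mode,
              char *buf, size_t buflen)
{
  struct stat st;

  assert (dirs != NULL && name != NULL && buf != NULL);
  for (; *dirs != NULL; dirs++)
    {
      int n = snprintf (buf, buflen, "%s/%s", *dirs, name);
      if (n < 0 || (size_t) n >= buflen)
        continue;               /* Path does not fit in BUF; skip it.  */
      if (stat (buf, &st) == 0
          && !S_ISDIR (st.st_mode)
          && access (buf, mode) == 0)
        return buf;
    }
  return NULL;
}
```

With a hard-coded X_OK, a shared object that is readable but not
executable would never be found; passing R_OK for the plugin avoids that
while the wrapped programs are still required to be executable.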

Index: gcc-ar.c
===
--- gcc-ar.c(revision 200217)
+++ gcc-ar.c(working copy)
@@ -136,7 +136,7 @@
   setup_prefixes (av[0]);
 
   /* Find the GCC LTO plugin */
-  plugin = find_a_file (&target_path, LTOPLUGINSONAME);
+  plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK);
   if (!plugin)
 {
   fprintf (stderr, "%s: Cannot find plugin '%s'\n", av[0], LTOPLUGINSONAME);
@@ -144,14 +144,14 @@
 }
 
   /* Find the wrapped binutils program.  */
-  exe_name = find_a_file (&target_path, PERSONALITY);
+  exe_name = find_a_file (&target_path, PERSONALITY, X_OK);
   if (!exe_name)
 {
   const char *real_exe_name = PERSONALITY;
 #ifdef CROSS_DIRECTORY_STRUCTURE
   real_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
 #endif
-  exe_name = find_a_file (&path, real_exe_name);
+  exe_name = find_a_file (&path, real_exe_name, X_OK);
   if (!exe_name)
{
  fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0],
Index: file-find.c
===
--- file-find.c (revision 200217)
+++ file-find.c (working copy)
@@ -31,7 +31,7 @@
 }
 
 char *
-find_a_file (struct path_prefix *pprefix, const char *name)
+find_a_file (struct path_prefix *pprefix, const char *name, int mode)
 {
   char *temp;
   struct prefix_list *pl;
@@ -50,7 +50,7 @@
 
   if (IS_ABSOLUTE_PATH (name))
 {
-  if (access (name, X_OK) == 0)
+  if (access (name, mode) == 0)
{
  strcpy (temp, name);
 
@@ -66,7 +66,7 @@
   strcpy (temp, name);
strcat (temp, HOST_EXECUTABLE_SUFFIX);
 
-   if (access (temp, X_OK) == 0)
+   if (access (temp, mode) == 0)
  return temp;
 #endif
 
@@ -83,7 +83,7 @@
 
if (stat (temp, &st) >= 0
&& ! S_ISDIR (st.st_mode)
-   && access (temp, X_OK) == 0)
+   && access (temp, mode) == 0)
  return temp;
 
 #ifdef HOST_EXECUTABLE_SUFFIX
@@ -93,7 +93,7 @@
 
if (stat (temp, &st) >= 0
&& ! S_ISDIR (st.st_mode)
-   && access (temp, X_OK) == 0)
+   && access (temp, mode) == 0)
  return temp;
 #endif
   }
Index: file-find.h
===
--- file-find.h (revision 200217)
+++ file-find.h (working copy)
@@ -38,7 +38,7 @@
 };
 
 extern void find_file_set_debug (bool);
-extern char *find_a_file (struct path_prefix *, const char *);
+extern char *find_a_file (struct path_prefix *, const char *, int);
 extern void add_prefix (struct path_prefix *, const char *);
 extern void prefix_from_env (const char *, struct path_prefix *);
 extern void prefix_from_string (const char *, struct path_prefix *);
Index: collect2.c
===
--- collect2.c  (revision 200217)
+++ collect2.c  (working copy)
@@ -1110,55 +1110,55 @@
   if (ld_file_name == 0)
 #endif
 #ifdef REAL_LD_FILE_NAME
-  ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME);
+  ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME, X_OK);
   if (ld_file_name == 0)
 #endif
   /* Search the (target-specific) compiler dirs for ld'.  */
-  ld_file_name = find_a_file (&cpath, real_ld_suffix);
+  ld_file_name = find_a_file (&cpath, real_ld_suffix, X_OK);
   /* Likewise for `collect-ld'.  */
   if (ld_file_name == 0)
 {
-

patch [6/5] check for conflict with -fstrict-volatile-bitfields and -std=

2013-06-19 Thread Sandra Loosemore

On 06/17/2013 06:02 PM, Sandra Loosemore wrote:


I had another thought:  perhaps -fstrict-volatile-bitfields could remain
the default on targets where it currently is, but it can be overridden
by an appropriate -std= option.  Perhaps also GCC could give an error if
-fstrict-volatile-bitfields is given explicitly with an incompatible
-std= option.


Like this.  This patch is intended to be applied on top of the other 5 
pieces in this series, although in theory it's independent of them.  OK 
to commit, and does this resolve the objection to part 3?


-Sandra



2013-06-19  Sandra Loosemore  

	gcc/c-family/
	* c-opts.c (c_common_post_options): Check for conflict between
	-std= and -fstrict-volatile-bitfields.

	gcc/
	* doc/invoke.texi (Code Gen Options): Document what happens when
	-fstrict-volatile-bitfields conflicts with -std=.

Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c	(revision 199963)
+++ gcc/c-family/c-opts.c	(working copy)
@@ -813,6 +813,18 @@ c_common_post_options (const char **pfil
   C_COMMON_OVERRIDE_OPTIONS;
 #endif
 
+  /* C11 and C++11 specify a memory model that is incompatible with
+ -fstrict-volatile-bitfields.  Warn if that option is given explicitly
+ and prevent backends from defaulting to turning it on.  */
+  if (flag_isoc11 || cxx_dialect >= cxx11)
+{
+  if (flag_strict_volatile_bitfields > 0)
+	warning (0, "-fstrict-volatile-bitfields conflicts with the "
+		"C11 and C++11 memory model");
+  else
+	flag_strict_volatile_bitfields = 0;
+}
+
   /* Excess precision other than "fast" requires front-end
  support.  */
   if (c_dialect_cxx ())
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 199963)
+++ gcc/doc/invoke.texi	(working copy)
@@ -20899,4 +20899,12 @@
 AAPCS, @option{-fstrict-volatile-bitfields} is the default.
 
+Note that @option{-fstrict-volatile-bitfields} is incompatible with
+the bit-field access behavior required by the ISO C11 and C++11
+standards.  GCC warns if @option{-fstrict-volatile-bitfields} is given
+explicitly with an incompatible @option{-std=} option.  On targets
+that otherwise default to @option{-fstrict-volatile-bitfields},
+providing an incompatible @option{-std=} option implicitly disables
+@option{-fstrict-volatile-bitfields}.
+
 @item -fsync-libcalls
 @opindex fsync-libcalls


Re: Document Intel Silvermont support in invoke.texi

2013-06-19 Thread Kirill Yukhin
> Patch preapproved.

Checked into 4.8 branch: 
http://gcc.gnu.org/ml/gcc-cvs/2013-06/msg00648.html

Thanks, K


Re: RFA: Fix rtl-optimization/57425

2013-06-19 Thread Joern Rennecke

Quoting Michael Matz :


That's not good.  You now have different order of parameters between
anti_dependence and canon_anti_dependence.  That will be mightily
confusing, please instead change the caller.  Currently these predicates
take their arguments in the order of the corresponding instructions, that
should better be retained:

true_dependence:  write-then-(depending)read
anti_dependence:  read-then-(clobbering)write
write_dependence: write-then-(clobbering)write
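
The instruction-order convention in the list above can be sketched with a
toy model.  Everything here is hypothetical: struct access and the model_*
functions only mimic the argument order, using plain byte ranges instead
of RTL MEMs and none of the alias analysis that alias.c performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Much simplified memory access: a byte range plus a read/write marker.  */
struct access
{
  size_t addr;
  size_t size;
  bool is_write;
};

static bool
ranges_overlap (const struct access *a, const struct access *b)
{
  return a->addr < b->addr + b->size && b->addr < a->addr + a->size;
}

/* In all three predicates FIRST is the earlier instruction and SECOND
   the later one.  */

/* Write, then a read that depends on it.  */
bool
model_true_dependence (const struct access *first,
                       const struct access *second)
{
  assert (first->is_write && !second->is_write);
  return ranges_overlap (first, second);
}

/* Read, then a write that clobbers what was read.  */
bool
model_anti_dependence (const struct access *first,
                       const struct access *second)
{
  assert (!first->is_write && second->is_write);
  return ranges_overlap (first, second);
}

/* Write, then a write that clobbers it.  */
bool
model_write_dependence (const struct access *first,
                        const struct access *second)
{
  assert (first->is_write && second->is_write);
  return ranges_overlap (first, second);
}
```

Keeping the earlier access first in every signature means a call site can
be checked against the instruction stream without looking up which
predicate swaps its operands.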


All right, attached is the patch with the arguments in instruction-order.
Again, bootstrapped/regtested on i686-pc-linux-gnu.

2013-06-19  Joern Rennecke 

PR rtl-optimization/57425
PR rtl-optimization/57569
* alias.c (write_dependence_p): Remove parameters mem_mode and
canon_mem_addr.  Add parameters x_mode, x_addr and x_canonicalized.
Changed all callers.
(canon_anti_dependence): Get comments and semantics in sync.
Add parameter mem_canonicalized.  Changed all callers.
* rtl.h (canon_anti_dependence): Update prototype.

Index: alias.c
===
--- alias.c (revision 200133)
+++ alias.c (working copy)
@@ -156,8 +156,9 @@ static int insert_subset_children (splay
 static alias_set_entry get_alias_set_entry (alias_set_type);
 static bool nonoverlapping_component_refs_p (const_rtx, const_rtx);
 static tree decl_for_component_ref (tree);
-static int write_dependence_p (const_rtx, enum machine_mode, rtx, const_rtx,
-  bool, bool);
+static int write_dependence_p (const_rtx,
+  const_rtx, enum machine_mode, rtx,
+  bool, bool, bool);
 
 static void memory_modified_1 (rtx, const_rtx, void *);
 
@@ -2555,20 +2556,22 @@ canon_true_dependence (const_rtx mem, en
 
 /* Returns nonzero if a write to X might alias a previous read from
(or, if WRITEP is true, a write to) MEM.
-   If MEM_CANONCALIZED is nonzero, CANON_MEM_ADDR is the canonicalized
-   address of MEM, and MEM_MODE the mode for that access.  */
+   If X_CANONICALIZED is true, then X_ADDR is the canonicalized address of X,
+   and X_MODE the mode for that access.
+   If MEM_CANONICALIZED is true, MEM is canonicalized.  */
 
 static int
-write_dependence_p (const_rtx mem, enum machine_mode mem_mode,
-   rtx canon_mem_addr, const_rtx x,
-   bool mem_canonicalized, bool writep)
+write_dependence_p (const_rtx mem,
+   const_rtx x, enum machine_mode x_mode, rtx x_addr,
+   bool mem_canonicalized, bool x_canonicalized, bool writep)
 {
-  rtx x_addr, mem_addr;
+  rtx mem_addr;
   rtx base;
   int ret;
 
-  gcc_checking_assert (mem_canonicalized ? (canon_mem_addr != NULL_RTX)
-  : (canon_mem_addr == NULL_RTX && mem_mode == VOIDmode));
+  gcc_checking_assert (x_canonicalized
+  ? (x_addr != NULL_RTX && x_mode != VOIDmode)
+  : (x_addr == NULL_RTX && x_mode == VOIDmode));
 
   if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem))
 return 1;
@@ -2593,17 +2596,21 @@ write_dependence_p (const_rtx mem, enum
   if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x))
 return 1;
 
-  x_addr = XEXP (x, 0);
   mem_addr = XEXP (mem, 0);
-  if (!((GET_CODE (x_addr) == VALUE
-&& GET_CODE (mem_addr) != VALUE
-&& reg_mentioned_p (x_addr, mem_addr))
-   || (GET_CODE (x_addr) != VALUE
-   && GET_CODE (mem_addr) == VALUE
-   && reg_mentioned_p (mem_addr, x_addr))))
+  if (!x_addr)
 {
-  x_addr = get_addr (x_addr);
-  mem_addr = get_addr (mem_addr);
+  x_addr = XEXP (x, 0);
+  if (!((GET_CODE (x_addr) == VALUE
+&& GET_CODE (mem_addr) != VALUE
+&& reg_mentioned_p (x_addr, mem_addr))
+   || (GET_CODE (x_addr) != VALUE
+   && GET_CODE (mem_addr) == VALUE
+   && reg_mentioned_p (mem_addr, x_addr))))
+   {
+ x_addr = get_addr (x_addr);
+ if (!mem_canonicalized)
+   mem_addr = get_addr (mem_addr);
+   }
 }
 
   base = find_base_term (mem_addr);
@@ -2619,17 +2626,16 @@ write_dependence_p (const_rtx mem, enum
  GET_MODE (mem)))
 return 0;
 
-  x_addr = canon_rtx (x_addr);
-  if (mem_canonicalized)
-mem_addr = canon_mem_addr;
-  else
+  if (!x_canonicalized)
 {
-  mem_addr = canon_rtx (mem_addr);
-  mem_mode = GET_MODE (mem);
+  x_addr = canon_rtx (x_addr);
+  x_mode = GET_MODE (x);
 }
+  if (!mem_canonicalized)
+mem_addr = canon_rtx (mem_addr);
 
-  if ((ret = memrefs_conflict_p (GET_MODE_SIZE (mem_mode), mem_addr,
-SIZE_FOR_MODE (x), x_addr, 0)) != -1)
+  if ((ret = memrefs_conflict_p (SIZE_FOR_MODE (mem), mem_addr,
+GET_MODE_SIZE (x_mode), x_addr, 0)) != -1)
 return ret;
 
   if (nonoverlapping_memrefs_p (x, mem, false))
@@ -2

Re: Unordered container insertion hints

2013-06-19 Thread François Dumont

Still no chance to have a look?

I think that this patch is a really safe one. Those who do not use 
a hint won't be impacted. Those who are already using one without any 
reason might experience a small performance loss if they find a way 
to always use the worst possible hint.


François


On 06/12/2013 10:12 PM, François Dumont wrote:

Hi

Any news regarding this patch ?

Thanks

François


On 06/06/2013 10:33 PM, François Dumont wrote:

On 05/24/2013 01:00 AM, Paolo Carlini wrote:

On 05/23/2013 10:01 PM, François Dumont wrote:

Some feedback regarding this patch ?
Two quick ones: what if the hint is wrong? I suppose the insertion 
succeeds anyway, it's only a little waste of time, right?


Right.

Is it possible that for instance something throws in that case and 
would not now (when the hint is simply ignored)? In case, check and 
re-check we are still conforming.
I consider the hint only if it is equivalent to the inserted element 
so I invoke the equal_to functor for that. The invocation of the 
equal_to functor is already done if no hint is granted at the same 
location. So usage of the hint has no impact on exception safety.


In any case, I think it's quite easy to notice if an implementation 
is using the hint in this way or a similar one basing on some simple 
benchmarks, without looking of course at the actual implementation 
code. Do we have any idea what other implementations are doing? 
Like, eg, they invented something for unordered_set and map too? Or 
a better way to exploit the hint for the multi variants?


I only benchmarked the llvm/clang implementation and noticed no difference 
with or without a hint; I guess it is simply ignored. I do not plan to 
check or benchmark other implementations. The usage of the hint I am 
introducing is quite natural considering the new unordered containers 
data model. And if anyone has a better idea to deal with it then he 
is welcome to contribute!


Eventually I suppose we want to add a performance testcase to our 
testsuite.
Good request and the reason why it took me so long to answer. Writing 
such a benchmark has shown me that users should be very careful with 
hints because they can do more harm than good.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint           120r  120u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint      130r  130u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      54r   54u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   36r   36u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            40r   40u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       38r   38u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint       49r   50u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint   34r   35u  0s  6416mem  0pf


The small number represents how many times the same element is 
inserted and the big one the number of different elements. 100 X 
2 means that we loop 100 times, inserting the 2 elements 
during each loop. 2 X 100 means that the main loop is over the 
elements and we insert each one 100 times. Whether or not all the 
equivalent elements can be inserted at the same time has a major impact 
on the performance needed to get the same result. This is because when a new 
element is inserted it will be first in its bucket and the following 
99 insertions will benefit from it even without any hint.


The benchmark also shows that a bad hint can be worse than no hint. A 
bad hint is one that, once used, requires checking that the next bucket is 
not impacted by the insertion. Doing so requires a hash code 
computation (if it is not cached, as in my use case) and a check. I 
have added a word about checking performance before using 
hints. Here is the result using the default std::hash, 
where the hash code is cached.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint            76r   76u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint       83r   83u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      29r   29u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   24r   23u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            27r   26u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       24r   24u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint  

Re: patch to fix PR57559 for s390

2013-06-19 Thread Richard Sandiford
Vladimir Makarov  writes:
> On 13-06-19 2:31 PM, Richard Sandiford wrote:
>> Richard Sandiford  writes:
>>> Vladimir Makarov  writes:
>>>> Index: lra.c
>>>> ===
>>>> --- lra.c  (revision 199753)
>>>> +++ lra.c  (working copy)
>>>> @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z)
>>>>  || (disp != NULL_RTX && ! CONSTANT_P (disp))
>>>>  || (scale != NULL_RTX && ! CONSTANT_P (scale)))
>>>>{
>>>> -/* Its is not an address generation.  Probably we have no 3 op
>>>> +/* It is not an address generation.   Probably we have no 3 op
>>>> add.  Last chance is to use 2-op add insn.  */
>>>>  lra_assert (x != y && x != z);
>>>> -emit_move_insn (x, z);
>>>> -insn = gen_add2_insn (x, y);
>>>> +emit_move_insn (x, y);
>>>> +insn = gen_add2_insn (x, z);
>>>>  emit_insn (insn);
>>>>}
>>>> else
>>> Could you add a comment to lra_emit_add saying why it has to be this
>>> way round (move y, add z)?
>> Ping.
> I am going to add a comment when I submit my next patch (it will happen 
> today or tomorrow).

Thanks.

> The reason is simple as address segment is stored in y not in z and
> generation of addition of address segment to pseudo can fail (that is
> what happens for the PR).

Do you mean address segment in the x86 sense of "segment"?  I was just
a bit confused because the current comment says "It is not an address
generation", whereas it sounds like addresses are involved somewhere.

I suppose the commutation rules are that Y should be "no less complicated"
than Z, so maybe it wins from that point of view too.
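
For readers following along, the move-then-add sequence from the patch can
be sketched on a toy register file.  This is not LRA code: the array and
both function names are hypothetical, and the sketch does not model the
target-specific reason for the operand order discussed above.  It only
shows the sequence and why the x != y && x != z assertion guards it.

```c
#include <assert.h>

/* Toy register file; the indices stand in for the rtx operands.  */
enum { NUM_REGS = 4 };

/* Emulate x = y + z with a move followed by a two-operand add, in the
   order the patch uses (move y, then add z).  The caller must guarantee
   x != y and x != z, as lra_assert does in the patch.  */
void
emit_add_via_move (long regs[NUM_REGS], int x, int y, int z)
{
  assert (x != y && x != z);
  regs[x] = regs[y];            /* emit_move_insn (x, y) */
  regs[x] += regs[z];           /* gen_add2_insn (x, z) */
}

/* Same sequence without the guard, to show what goes wrong when x == z:
   the move overwrites z before the add reads it.  */
long
add_via_move_unchecked (long regs[NUM_REGS], int x, int y, int z)
{
  regs[x] = regs[y];
  regs[x] += regs[z];
  return regs[x];
}
```

When the guard holds, both operand orders compute the same value in this
model, so the choice between them has to come from what the target can
actually generate.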

Richard


Re: [PATCH, libfortran]: Initialize result variable (+ other changes)

2013-06-19 Thread Uros Bizjak
On Wed, Jun 19, 2013 at 8:27 AM, Tobias Burnus  wrote:

>> Attached patch initializes return variable in get_fpu_except_flags.
>> Additionally, it uses __asm__ and __volatile__ consistently, as
>> recommended for header files and unifies a bunch of formatting issues
>> throughout the file.
>
>
> OK. Thanks for having a second look and improving the file.

Actually, on a third look, there are multiple other issues in this file:

1. "cw_sse &= 0x;" is wrong, since it also clears FTZ, RC and DAZ flags.

2. x87 also needs to clear stalled exception flags, otherwise a stalled
flag triggers an exception when the corresponding exception bit is unmasked.

3. fstcw should be used instead of fnstcw, so all pending exceptions
are handled (please note that in case the sticky exception flag is
not cleared in the exception handler, point 2 applies).

4. A lot of code could be simplified in set_fpu function.
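
The bit manipulation behind points 1 and 4 can be checked in isolation.
The sketch below is a pure-C model, not the libgfortran code: the function
names are hypothetical and no FPU state is touched.  The mask values are
the usual x87 ones (the leading underscore is dropped here), and the
MXCSR layout assumed is flags in bits 0-5, DAZ in bit 6, masks in bits
7-12, RC in bits 13-14 and FTZ in bit 15.

```c
#include <assert.h>

#define FPU_MASK_IM  0x01
#define FPU_MASK_DM  0x02
#define FPU_MASK_ZM  0x04
#define FPU_MASK_OM  0x08
#define FPU_MASK_UM  0x10
#define FPU_MASK_PM  0x20
#define FPU_MASK_ALL 0x3f

/* New x87 control word: mask every exception, then unmask the ones in
   EXCEPTS.  The precision and rounding fields are left alone.  */
unsigned short
x87_control_word (unsigned short cw, int excepts)
{
  assert ((excepts & ~FPU_MASK_ALL) == 0);
  return (unsigned short) ((cw | FPU_MASK_ALL) & ~excepts);
}

/* New MXCSR value: same masking, shifted by 7 bits, plus clearing the
   sticky exception flags in bits 0-5.  DAZ, RC and FTZ are preserved.  */
unsigned int
sse_mxcsr (unsigned int mxcsr, int excepts)
{
  assert ((excepts & ~FPU_MASK_ALL) == 0);
  mxcsr |= FPU_MASK_ALL << 7;
  mxcsr &= ~((unsigned int) excepts << 7);
  mxcsr &= ~(unsigned int) FPU_MASK_ALL;
  return mxcsr;
}
```

Starting from an MXCSR with FTZ, RC, DAZ and all six flags set, only the
flag bits and the requested mask bits change, which is the behaviour
point 1 asks for.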

2013-06-19  Uros Bizjak  

* config/fpu-387.h (_FPU_MASK_ALL): New.
(set_fpu): Use fstcw to store x87 FPU control word. Use fnclex to
clear stalled exception flags.  Correctly clear stalled SSE
exception flags.  Simplify code.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu.

I will wait for a day for possible comments.

The patch should be committed to all release branches.

Uros.
Index: config/fpu-387.h
===
--- config/fpu-387.h(revision 200211)
+++ config/fpu-387.h(working copy)
@@ -96,23 +96,26 @@ has_sse (void)
 #define _FPU_MASK_UM  0x10
 #define _FPU_MASK_PM  0x20
 
+#define _FPU_MASK_ALL 0x3f
+
 void set_fpu (void)
 {
+  int excepts = 0;
   unsigned short cw;
 
-  __asm__ __volatile__ ("fnstcw\t%0" : "=m" (cw));
+  __asm__ __volatile__ ("fstcw\t%0" : "=m" (cw));
 
-  cw |= (_FPU_MASK_IM | _FPU_MASK_DM | _FPU_MASK_ZM | _FPU_MASK_OM
-| _FPU_MASK_UM | _FPU_MASK_PM);
+  if (options.fpe & GFC_FPE_INVALID) excepts |= _FPU_MASK_IM;
+  if (options.fpe & GFC_FPE_DENORMAL) excepts |= _FPU_MASK_DM;
+  if (options.fpe & GFC_FPE_ZERO) excepts |= _FPU_MASK_ZM;
+  if (options.fpe & GFC_FPE_OVERFLOW) excepts |= _FPU_MASK_OM;
+  if (options.fpe & GFC_FPE_UNDERFLOW) excepts |= _FPU_MASK_UM;
+  if (options.fpe & GFC_FPE_INEXACT) excepts |= _FPU_MASK_PM;
 
-  if (options.fpe & GFC_FPE_INVALID) cw &= ~_FPU_MASK_IM;
-  if (options.fpe & GFC_FPE_DENORMAL) cw &= ~_FPU_MASK_DM;
-  if (options.fpe & GFC_FPE_ZERO) cw &= ~_FPU_MASK_ZM;
-  if (options.fpe & GFC_FPE_OVERFLOW) cw &= ~_FPU_MASK_OM;
-  if (options.fpe & GFC_FPE_UNDERFLOW) cw &= ~_FPU_MASK_UM;
-  if (options.fpe & GFC_FPE_INEXACT) cw &= ~_FPU_MASK_PM;
+  cw |= _FPU_MASK_ALL;
+  cw &= ~excepts;
 
-  __asm__ __volatile__ ("fldcw\t%0" : : "m" (cw));
+  __asm__ __volatile__ ("fnclex\n\tfldcw\t%0" : : "m" (cw));
 
   if (has_sse())
 {
@@ -120,22 +123,17 @@ void set_fpu (void)
 
   __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));
 
-  cw_sse &= 0x;
-  cw_sse |= (_FPU_MASK_IM | _FPU_MASK_DM | _FPU_MASK_ZM | _FPU_MASK_OM
-| _FPU_MASK_UM | _FPU_MASK_PM ) << 7;
+  /* The SSE exception masks are shifted by 7 bits.  */
+  cw_sse |= _FPU_MASK_ALL << 7;
+  cw_sse &= ~(excepts << 7);
 
-  if (options.fpe & GFC_FPE_INVALID) cw_sse &= ~(_FPU_MASK_IM << 7);
-  if (options.fpe & GFC_FPE_DENORMAL) cw_sse &= ~(_FPU_MASK_DM << 7);
-  if (options.fpe & GFC_FPE_ZERO) cw_sse &= ~(_FPU_MASK_ZM << 7);
-  if (options.fpe & GFC_FPE_OVERFLOW) cw_sse &= ~(_FPU_MASK_OM << 7);
-  if (options.fpe & GFC_FPE_UNDERFLOW) cw_sse &= ~(_FPU_MASK_UM << 7);
-  if (options.fpe & GFC_FPE_INEXACT) cw_sse &= ~(_FPU_MASK_PM << 7);
+  /* Clear stalled exception flags.  */
+  cw_sse &= ~0x3f;
 
   __asm__ __volatile__ ("%vldmxcsr\t%0" : : "m" (cw_sse));
 }
 }
 
-
 int
 get_fpu_except_flags (void)
 {


Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Peter Bergner
On Wed, 2013-06-19 at 10:57 -0500, Peter Bergner wrote:
> On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote:
> > This is due to the following in _ITM_inTransaction():
> > 
> > 47if (tx && (tx->nesting > 0))
> > (gdb) p tx
> > $2 = (GTM::gtm_thread *) 0x10901bf0
> > (gdb) p tx->nesting
> > $3 = 1
> > (gdb) step
> > 49if (tx->state & gtm_thread::STATE_IRREVOCABLE)
> > (gdb) p tx->state
> > $4 = 3
> > (gdb) p gtm_thread::STATE_IRREVOCABLE
> > $5 = 2
> > (gdb) step
> > 50  return inIrrevocableTransaction;
> 
> Bah, ignore this.  It's a different call that is returning something other
> than inIrrevocableTransaction.  Unfortunately, gdb is having problems inside
> hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is
> returning something other than inIrrevocableTransaction.  I'll see if I can
> determine why and will report back.

Ok, we return outsideTransaction because the nesting level (tx->nesting)
is zero.
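
The control flow seen in the gdb session can be summarised in a small
model.  This is a sketch, not the libitm source: struct tx_model and
in_transaction are hypothetical, the value 2 for STATE_IRREVOCABLE is
taken from the session above, and inRetryableTransaction is assumed to be
the remaining return value.

```c
#include <assert.h>
#include <stddef.h>

/* Return values; names as in the session, numeric values arbitrary.  */
enum how_executing
{
  outsideTransaction,
  inRetryableTransaction,       /* Assumed third value.  */
  inIrrevocableTransaction
};

enum { STATE_IRREVOCABLE = 2 }; /* Matches "$5 = 2" in the gdb output.  */

/* Hypothetical reduction of the thread descriptor to the two fields
   that the check looks at.  */
struct tx_model
{
  unsigned nesting;
  unsigned state;
};

enum how_executing
in_transaction (const struct tx_model *tx)
{
  if (tx != NULL && tx->nesting > 0)
    {
      if (tx->state & STATE_IRREVOCABLE)
        return inIrrevocableTransaction;
      return inRetryableTransaction;
    }
  /* No descriptor, or nesting level zero.  */
  return outsideTransaction;
}
```

With state 3 the irrevocable bit is set, so the result depends entirely on
the nesting level: 1 gives inIrrevocableTransaction and 0 gives
outsideTransaction, as reported.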

Peter




Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Kirill Yukhin
Patch preapproved. Jakub
Hi,
Checked into trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-06/msg00646.html

Thanks, K


Re: patch to fix PR57559 for s390

2013-06-19 Thread Vladimir Makarov

On 13-06-19 2:31 PM, Richard Sandiford wrote:

Richard Sandiford  writes:

Vladimir Makarov  writes:

Index: lra.c
===
--- lra.c   (revision 199753)
+++ lra.c   (working copy)
@@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z)
  || (disp != NULL_RTX && ! CONSTANT_P (disp))
  || (scale != NULL_RTX && ! CONSTANT_P (scale)))
{
- /* Its is not an address generation.  Probably we have no 3 op
+ /* It is not an address generation.   Probably we have no 3 op
 add.  Last chance is to use 2-op add insn.  */
  lra_assert (x != y && x != z);
- emit_move_insn (x, z);
- insn = gen_add2_insn (x, y);
+ emit_move_insn (x, y);
+ insn = gen_add2_insn (x, z);
  emit_insn (insn);
}
else

Could you add a comment to lra_emit_add saying why it has to be this
way round (move y, add z)?

Ping.
I am going to add a comment when I submit my next patch (it will happen 
today or tomorrow).  The reason is simple: the address segment is stored 
in y, not in z, and generating the addition of an address segment to a 
pseudo can fail (that is what happens for the PR).



Thanks, Richard.



Re: [PATCH] PR57518, RA generated redundent code

2013-06-19 Thread Vladimir Makarov

On 13-06-19 1:23 AM, Wei Mi wrote:

Ping.

On Wed, Jun 12, 2013 at 2:44 PM, Wei Mi  wrote:

Hi,

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518

pr57518 happened because update_equiv_regs in IRA marked a reg
equivalent with a mem, lowered its mem_cost in scan_one_insn, set
NO_REGS to its rclass, but didn't consider the reg was used in
paradoxical subreg which prevented the reg from being replaced by mem
in LRA phase.

This patch is to check whether a reg is used in a paradoxical subreg
in update_equiv_regs before reg is set as equivalent to a mem.

bootstrap and regression test on x86_64-linux-gnu ok. Is it ok for
trunk and gcc-4.8 branch?


Thanks for working on this PR, Wei, and sorry for the delay with the 
answer (I was on vacation).


In general, the PR analysis and the proposed solution look OK.  I only 
worry that you are adding an additional full scan of all RTL code.  It 
might add 0.5% to GCC compilation time if the data cache is rewritten (which 
will happen for moderate-size or big functions).  It would be nice to do 
it during some other existing RTL traversal.  Unfortunately, this info is 
calculated later (reg_max_width in reload or biggest_mode in LRA).  I 
doubt that the other solutions I see now are better:


  o calculate this info in a regstat_... function and store it in reg_info_p
  o calculate it with update_equiv_regs and use it to invalidate the 
    equiv info later


The first one increases the reg_info_p footprint, and the calculation is 
done many times although it is used once.

The second one results in complicated code.

So I think the current patch is ok to commit.

Thanks, again.





Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).

2013-06-19 Thread Steve Ellcey
On Wed, 2013-06-19 at 14:19 +0100, James Greenhalgh wrote:

> Please let me know if this fixes the performance issues you
> were seeing and if you have any other comments.
> 
> FWIW I've bootstrapped and regression tested this version of
> the patch on x86_64 and ARM with no regressions.
> 
> Thanks,
> James Greenhalgh


James,

This patch does give me the same performance as my original patch, so
that is excellent.  While testing it I noticed that the final executable
is larger with your patch than with mine.  Here are the sizes of the
bare-metal executables I created using the same flags I sent you
earlier: the first has no switch optimization, the second uses my
plugin optimization, and the third uses your latest patch.  I haven't
looked into why the size difference between your patch and mine exists; do
you see a size difference on your platforms?  I am not sure if path
threading in general is turned off for -Os, but it probably should be.

% ll -art coremark.fsf*elf
-rwxr-xr-x 1 sellcey src 413812 Jun 19 11:11 coremark.fsf.1.elf
-rwxr-xr-x 1 sellcey src 414676 Jun 19 11:11 coremark.fsf.2.elf
-rwxr-xr-x 1 sellcey src 416402 Jun 19 11:11 coremark.fsf.3.elf


Steve Ellcey
sell...@mips.com




Re: patch to fix PR57559 for s390

2013-06-19 Thread Richard Sandiford
Richard Sandiford  writes:
> Vladimir Makarov  writes:
>> Index: lra.c
>> ===
>> --- lra.c(revision 199753)
>> +++ lra.c(working copy)
>> @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z)
>>|| (disp != NULL_RTX && ! CONSTANT_P (disp))
>>|| (scale != NULL_RTX && ! CONSTANT_P (scale)))
>>  {
>> -  /* Its is not an address generation.  Probably we have no 3 op
>> +  /* It is not an address generation.   Probably we have no 3 op
>>   add.  Last chance is to use 2-op add insn.  */
>>lra_assert (x != y && x != z);
>> -  emit_move_insn (x, z);
>> -  insn = gen_add2_insn (x, y);
>> +  emit_move_insn (x, y);
>> +  insn = gen_add2_insn (x, z);
>>emit_insn (insn);
>>  }
>>else
>
> Could you add a comment to lra_emit_add saying why it has to be this
> way round (move y, add z)?

Ping.


Re: [patch, mips] Fix switch statement for mips16 (PR target/56942)

2013-06-19 Thread Richard Sandiford
"Steve Ellcey "  writes:
> 2013-06-19  Steve Ellcey  
>
>   PR target/56942
>   * config/mips/mips.md (casesi_internal_mips16_):
>   Use NEXT_INSN instead of next_real_insn.

OK, thanks.

Richard


Re: FW: [PATCH] RTEMS: Use strict DWARF-2 on ARM, PowerPC, SPARC

2013-06-19 Thread Sebastian Huber

Hello,

sorry, but this patch was intended for review on the rtems-devel list.  
This patch should not go into GCC at this point.


On 18/06/13 21:04, Rempel, Cynthia wrote:

Hi,

Forwarding this patch to gcc-patches...

Cheers!
Cindy

From: rtems-devel-boun...@rtems.org [rtems-devel-boun...@rtems.org] on behalf 
of Sebastian Huber [sebastian.hu...@embedded-brains.de]
Sent: Tuesday, June 18, 2013 4:58 AM
To: rtems-de...@rtems.org
Subject: [PATCH] RTEMS: Use strict DWARF-2 on ARM, PowerPC, SPARC

Some debuggers do not cope with the new DWARF3/4 debug format introduced
with GCC 4.8.  Default to strict DWARF-2 on ARM, PowerPC and SPARC for
now.

This patch should be committed to GCC 4.8 and 4.9.

gcc/ChangeLog
2013-06-18  Sebastian Huber  

 * config/rtems.c: New.
 * config.gcc (*-*-rtems*): Set extra_objs.
 * config/rtems.h (rtems_override_options): Declare.
 (RTEMS_OVERRIDE_OPTIONS): Define.
 * config/t-rtems (rtems.o): New.
 * config/arm/rtems-eabi.h (SUBTARGET_OVERRIDE_OPTIONS): Define.
 * config/rs6000/rtems.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Define.
 * config/sparc/rtemself.h (SUBTARGET_OVERRIDE_OPTIONS): Define.

[...]

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.



Re: [patch, mips] Fix switch statement for mips16 (PR target/56942)

2013-06-19 Thread Steven Bosscher
On Wed, Jun 19, 2013 at 6:36 PM, Steve Ellcey wrote:
> Steven and Richard,
>
> I saw the email about the s390 switch statement
>
> http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01026.html
>
> and tested this patch on MIPS to see if using NEXT_INSN instead of
> next_real_insn fixed PR 56942.  It did, so is this the right long
> term fix for MIPS?

Yes it is. Also for other targets that look for JUMP_TABLE_DATA via next_*_insn.
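
The difference between the two lookups can be sketched with a toy insn
chain.  All names here are hypothetical, and the key point is an
assumption of this sketch: that jump-table data is no longer treated as a
"real" insn, so a next_real_insn-style walk steps over it.

```c
#include <assert.h>
#include <stddef.h>

enum insn_kind { KIND_INSN, KIND_NOTE, KIND_JUMP_TABLE_DATA };

struct insn_model
{
  enum insn_kind kind;
  struct insn_model *next;
};

/* Counterpart of NEXT_INSN: the immediately following element.  */
struct insn_model *
model_next_insn (struct insn_model *insn)
{
  return insn->next;
}

/* Counterpart of next_real_insn under the assumption stated above:
   skip everything that is not KIND_INSN, including jump-table data.  */
struct insn_model *
model_next_real_insn (struct insn_model *insn)
{
  do
    insn = insn->next;
  while (insn != NULL && insn->kind != KIND_INSN);
  return insn;
}
```

In a chain "jump, jump table, following insn", the first helper returns
the table and the second returns the following insn, which is why code
that wants the table itself has to use the plain successor.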

Sorry for not getting the necessary changes in any quicker. I'll try
to get things cleaned up a bit next weekend.

Ciao!
Steven


Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 05:29:42PM +0200, Matthias Klose wrote:
> well, I did fix this assumption last year in gcc.c, then lets fix it in other
> places too, just adding a mode parameter to the public find_a_file function.
> Testing the attached patch.

Ok, provided you:
1) write proper ChangeLog
2) adjust the gcc-ar.c change (because it won't apply cleanly now that
   I've committed the other gcc-ar.c fix)
3) 
> --- file-find.h   (revision 200203)
> +++ file-find.h   (working copy)
> @@ -38,7 +38,7 @@
>  };
>  
>  extern void find_file_set_debug (bool);
> -extern char *find_a_file (struct path_prefix *, const char *);
> +extern char *find_a_file (struct path_prefix *, const char *, int mode);

Remove " mode" above, none of the arguments have names, so adding it
is both inconsistent and useless.

4)
>if (ld_file_name == 0)
> -ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]);
> +ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker], 
> X_OK);

This line looks too long now.

Jakub


CALL_INSN_FUNCTION_USAGE fix, PR52773

2013-06-19 Thread Bernd Schmidt
This is a bug that triggers on m68k. The loop unroller creates a MULT
expression and tries to force it into a register, which causes a libcall
to be generated. Since args are pushed we create a
  (use (mem (plus virtual_outgoing_args scratch)))
in CALL_INSN_FUNCTION_USAGE. Since we're past vregs, the
virtual_outgoing_args rtx survives to reload, which blows up.

Fixed by just using stack_pointer_rtx, since we use a scratch anyway
rather than a known offset. I also noticed that we actually add two of
these USEs, so I've fixed that as well.

Bootstrapped and tested on x86_64-linux, ok?


Bernd
diff --git a/gcc/calls.c b/gcc/calls.c
index cdab8e0..db38b73 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3603,6 +3603,7 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
   int reg_parm_stack_space = 0;
   int needed;
   rtx before_call;
+  bool have_push_fusage;
   tree tfom;			/* type_for_mode (outmode, 0) */
 
 #ifdef REG_PARM_STACK_SPACE
@@ -3956,6 +3957,8 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 
   /* Push the args that need to be pushed.  */
 
+  have_push_fusage = false;
+
   /* ARGNUM indexes the ARGVEC array in the order in which the arguments
  are to be pushed.  */
   for (count = 0; count < nargs; count++, argnum += inc)
@@ -4046,14 +4049,19 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 	  if (argblock)
 	use = plus_constant (Pmode, argblock,
  argvec[argnum].locate.offset.constant);
+	  else if (have_push_fusage)
+	continue;
 	  else
-	/* When arguments are pushed, trying to tell alias.c where
-	   exactly this argument is won't work, because the
-	   auto-increment causes confusion.  So we merely indicate
-	   that we access something with a known mode somewhere on
-	   the stack.  */
-	use = gen_rtx_PLUS (Pmode, virtual_outgoing_args_rtx,
-gen_rtx_SCRATCH (Pmode));
+	{
+	  /* When arguments are pushed, trying to tell alias.c where
+		 exactly this argument is won't work, because the
+		 auto-increment causes confusion.  So we merely indicate
+		 that we access something with a known mode somewhere on
+		 the stack.  */
+	  use = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+  gen_rtx_SCRATCH (Pmode));
+	  have_push_fusage = true;
+	}
 	  use = gen_rtx_MEM (argvec[argnum].mode, use);
 	  use = gen_rtx_USE (VOIDmode, use);
 	  call_fusage = gen_rtx_EXPR_LIST (VOIDmode, use, call_fusage);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr52773.c b/gcc/testsuite/gcc.c-torture/compile/pr52773.c
new file mode 100644
index 000..8daa5ee
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr52773.c
@@ -0,0 +1,16 @@
+/* pr52773.c */
+
+struct s {
+short x;
+short _pad[2];
+};
+
+static short mat_a_x;
+
+void transform(const struct s *src, struct s *dst, int n)
+{
+int i;
+
+for (i = 0; i < n; ++i)
+	dst[i].x = (src[i].x * mat_a_x) >> 6;
+}


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Jeff Law

On 06/19/2013 10:02 AM, Bernhard Reutner-Fischer wrote:
> On 19 June 2013 15:57, Jeff Law  wrote:
>> On 06/19/2013 01:02 AM, Chung-Ju Wu wrote:
>>>
>>> 2013/6/19 Jeff Law :
>>>>
>>>>   * gcc.dg/tree-ssa/forwprop-28.c: New test.
>>>>
>>>
>>> In the gnu coding standard we have a space before
>>> the open-parentheses.  Would that be great to have
>>> testcase follow this convention as well? :)
>>>
>>> If so, then...
>>
>> No reason not to fix the test in this instance.  I'll make these updates
>> before committing.
>
> eh, nitpicking party ?
>
> +   If a simplification is mode, return TRUE, else return FALSE.  */
> +static bool
> +simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi,
>
> s/mode/made/

Fixed via attached patch.


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0ecf5ba..bd60452 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2013-06-19  Jeff Law  
 
+   * tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): Fix typo
+   in comment.
+
* tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): New function.
(simplify_bitwise_binary): Use it to simpify certain binary ops on
booleans.
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index 29a0bb7..df19295 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -1881,7 +1881,7 @@ hoist_conversion_for_bitop_p (tree to, tree from)
then we can simplify the two statements into a single LT_EXPR or LE_EXPR
when code is BIT_AND_EXPR and BIT_IOR_EXPR respectively.
 
-   If a simplification is mode, return TRUE, else return FALSE.  */
+   If a simplification is made, return TRUE, else return FALSE.  */
 static bool
 simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi,
 enum tree_code code,


Re: [PATCH] PR/57652 collect2 temp files

2013-06-19 Thread Joseph S. Myers
On Wed, 19 Jun 2013, David Edelsohn wrote:

> A 2011 change to collect2 to use the standard diagnostics
> infrastructure broke collect2's cleanup of temp files when an error
> occurs.  This prototype of a patch implements the minimal conversion
> of collect2 to use atexit().
> 
> If this is the right direction, all calls to collect_exit() can be
> converted to exit().
> 
> Thanks, David
> 
> PR driver/57652
> * collect2.c (collect_atexit): New.
> (collect_exit): Directly call exit.
> (main): Register collect_atexit with atexit.

This is OK.  Using atexit seems to me to be the right approach for such 
cleanup.

-- 
Joseph S. Myers
jos...@codesourcery.com


[patch, mips] Fix switch statement for mips16 (PR target/56942)

2013-06-19 Thread Steve Ellcey
Steven and Richard,

I saw the email about the s390 switch statement

http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01026.html

and tested this patch on MIPS to see if using NEXT_INSN instead of
next_real_insn fixed PR 56942.  It did, so is this the right long
term fix for MIPS?  Is it OK to check it in?  Since Steven added
an assert in tablejump_p, I did not include any here, though I could
if we thought it was needed.

Steve Ellcey
sell...@mips.com


2013-06-19  Steve Ellcey  

PR target/56942
* config/mips/mips.md (casesi_internal_mips16_):
Use NEXT_INSN instead of next_real_insn.


diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index ce322d8..b832dda 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -5948,7 +5948,7 @@
(clobber (reg:SI MIPS16_T_REGNUM))]
   "TARGET_MIPS16_SHORT_JUMP_TABLES"
 {
-  rtx diff_vec = PATTERN (next_real_insn (operands[2]));
+  rtx diff_vec = PATTERN (NEXT_INSN (operands[2]));
 
   gcc_assert (GET_CODE (diff_vec) == ADDR_DIFF_VEC);
   



[PATCH] Replaced the array sizes from hard-coded values to #define

2013-06-19 Thread Iyer, Balaji V
Hello Everyone,
This patch replaces the hard-coded array sizes in the array notation test
suite with a #define.  The main reason is to make it easier to experiment
with different array sizes in the future.  In some cases the change was not
possible because the tests use triplets based on the hard-coded length.  I
have also increased the array sizes from 10 to 100 so that we test with
larger arrays (mainly to check for memory overflow in the temporary storage
arrays I have created in the compiler).  I am checking this patch in as
obvious, and am willing to revert it if anyone has objections.

Here are the ChangeLog entries:

2013-06-19  Balaji V. Iyer  

* c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Replaced all the
hard-coded values of array sizes with a #define.
* c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Likewise.
* c-c++-common/cilk-plus/AN/builtin_func_double2.c: Likewise.
* c-c++-common/cilk-plus/AN/gather_scatter.c: Likewise.
* c-c++-common/cilk-plus/AN/pr57577.c: Likewise.
* c-c++-common/cilk-plus/AN/sec_implicit_ex.c: Likewise.

Thanks,

Balaji V. Iyer.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index be51cb3..723af40 100755
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,13 @@
+2013-06-19  Balaji V. Iyer  
+
+   * c-c++-common/cilk-plus/AN/builtin_fn_custom.c: Replaced all the
+   hard-coded values of array sizes with a #define.
+   * c-c++-common/cilk-plus/AN/builtin_fn_mutating.c: Likewise.
+   * c-c++-common/cilk-plus/AN/builtin_func_double2.c: Likewise.
+   * c-c++-common/cilk-plus/AN/gather_scatter.c: Likewise.
+   * c-c++-common/cilk-plus/AN/pr57577.c: Likewise.
+   * c-c++-common/cilk-plus/AN/sec_implicit_ex.c: Likewise.
+   
 2013-06-18  Sriraman Tallam  
 
* gcc.target/i386/inline_error.c: New test.
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c
index c5d3d7c..0f066d4 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_custom.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-fcilkplus" } */
 
+#define NUMBER 100
 #if HAVE_IO
 #include <stdio.h>
 #endif
@@ -18,17 +19,17 @@ double my_func (double x, double y)
 /* char __sec_reduce_add (int *); */
 int main(void)
 {
-  int ii,array[10], y = 0, y_int = 0, array2[10];
-  double x, yy, array3[10], array4[10];
+  int ii,array[NUMBER], y = 0, y_int = 0, array2[NUMBER];
+  double x, yy, array3[NUMBER], array4[NUMBER];
   double max_value = 0.000, min_value = 0.000, add_value, mul_value = 1.00;
   int max_index = 0, min_index = 0;
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 {
   array[ii] = 1+ii;
   array2[ii]= 2; 
 }
 
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 {
   if (ii%2 && ii)
array3[ii] = (double)(1./(double)ii);
@@ -43,7 +44,7 @@ int main(void)
 
   /* Initialize it to the first variable.  */
   max_value = array3[0] * array4[0];
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 if (array3[ii] * array4[ii] > max_value) {
   max_value = array3[ii] * array4[ii];
   max_index = ii;
@@ -52,7 +53,7 @@ int main(void)
   
   
 #if HAVE_IO
-  for (ii = 0; ii < 10; ii++) 
+  for (ii = 0; ii < NUMBER; ii++) 
 printf("%5.3f ", array3[ii] * array4[ii]);
   printf("\n");
   printf("Max = %5.3f\t Max Index = %2d\n", x, y);
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c
index 7c194c2..e01fbb1 100644
--- a/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/AN/builtin_fn_mutating.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-fcilkplus" } */
 
+#define NUMBER 100
 #if HAVE_IO
 #include <stdio.h>
 #endif
@@ -15,18 +16,18 @@ void my_func (double *x, double y)
 
 int main(void)
 {
-  int ii,array[10], y = 0, y_int = 0, array2[10];
-  double x = 0.000, yy, array3[10], array4[10];
+  int ii,array[NUMBER], y = 0, y_int = 0, array2[NUMBER];
+  double x = 0.000, yy, array3[NUMBER], array4[NUMBER];
   double max_value = 0.000, min_value = 0.000, add_value, mul_value = 1.00;
   int max_index = 0, min_index = 0;
 #if 1
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 {
   array[ii] = 1+ii;
   array2[ii]= 2; 
 }
 
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 {
   if (ii%2 && ii)
array3[ii] = (double)(1./(double)ii);
@@ -42,16 +43,16 @@ int main(void)
 
   /* Initialize it to the first variable.  */
   max_value = array3[0] * array4[0];
-  for (ii = 0; ii < 10; ii++)
+  for (ii = 0; ii < NUMBER; ii++)
 if (array3[ii] * array4[ii] > max_value)

Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Mike Stump
On Jun 19, 2013, at 1:38 AM, Richard Biener  wrote:
> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek  wrote:
>> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>> Right, as you did for other cases. It works here as well.
>> 
>> Patch preapproved.
> 
> I wonder how much code breaks these days when we enable -fno-common by
> default?

Not much.  gcc as Apple shipped it, has always been no-common, and indeed the 
shared library scheme doesn't like common.  There are a few test cases that 
would need -fcommon, but I don't think that is a big deal.  Most oss I think is 
-fno-common friendly.  I think gcc should default to c99, and I think c99 mode 
(and later) could use -fno-common by default.  For pre c99 modes, I'd probably 
just leave it to the dust bin of history.

Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Mike Stump
On Jun 19, 2013, at 1:44 AM, Jakub Jelinek  wrote:
> On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote:
>> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek  wrote:
>>> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>>> Right, as you did for other cases. It works here as well.
>>> 
>>> Patch preapproved.
>> 
>> I wonder how much code breaks these days when we enable -fno-common by
>> default? ...
> 
> Somebody would need to try it ;).

Been there done that.  That experiment has been running for at least 10 years 
now…  :-)

Re: RFA: Fix rtl-optimization/57425

2013-06-19 Thread Michael Matz
On Wed, 19 Jun 2013, Joern Rennecke wrote:

> > I.e. the arguments after your patch are exactly swapped.  This is usually
> > harmless, but not always, so that should be corrected before check in.
> > The change in cselib.c:cselib_invalidate_mem has the same problem.
> 
> Well, I have already committed the patch, so attached is a patch to fix
> things up.


>  int
>  anti_dependence (const_rtx mem, const_rtx x)
>  {

...

>  int
> -canon_anti_dependence (const_rtx mem, enum machine_mode mem_mode,
> -                       rtx mem_addr, const_rtx x)
> +canon_anti_dependence (const_rtx x, enum machine_mode x_mode, rtx x_addr,
> +                       const_rtx mem, bool mem_canonicalized)
>  {

That's not good.  You now have different order of parameters between 
anti_dependence and canon_anti_dependence.  That will be mightily 
confusing, please instead change the caller.  Currently these predicates 
take their arguments in the order of the corresponding instructions, that 
should better be retained:

true_dependence:  write-then-(depending)read
anti_dependence:  read-then-(clobbering)write
write_dependence: write-then-(clobbering)write

We could change the order of arguments to something else, like first the 
clobber, then the clobbered, but then that should be done for all the 
predicates at the same time (and I would suggest to not do it).
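
For reference, the naming convention above can be written down as a small
table in code.  This is only an illustration with made-up names; it is not
how alias.c implements the predicates:

```c
#include <assert.h>

enum access_kind { ACCESS_READ, ACCESS_WRITE };

enum dependence_kind
{
  DEP_NONE,    /* read then read: no ordering constraint */
  DEP_TRUE,    /* write then (depending) read */
  DEP_ANTI,    /* read then (clobbering) write */
  DEP_WRITE    /* write then (clobbering) write */
};

/* Classify two accesses to the same location.  FIRST executes before
   SECOND, mirroring the instruction order of the predicates' arguments.  */
static enum dependence_kind
classify_dependence (enum access_kind first, enum access_kind second)
{
  assert (first == ACCESS_READ || first == ACCESS_WRITE);
  if (first == ACCESS_WRITE)
    return second == ACCESS_READ ? DEP_TRUE : DEP_WRITE;
  return second == ACCESS_WRITE ? DEP_ANTI : DEP_NONE;
}
```

Keeping the earlier access as the first argument everywhere is what makes
the three predicates read the same way at each call site.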


Ciao,
Michael.


Re: Symtab cleanups 4/17 - ICE in GUPC due to use of init section

2013-06-19 Thread Gary Funck
On 06/18/13 16:37:04, Gary Funck wrote:
> The initialization function is currently generated in tree form in the
> usual way (it will be gimplified when the gimple pass is run).
> 
> The code that is being generated is roughly equivalent to:
> 
> static void
> __upc_init_decls (void)
> {
>   /* Compiler generated:
>  Initialize data related to UPC shared variables.  */
> }
> 
> static void (*const __upc_init_addr) (void)
>   __attribute__ ((section ("upc_init_array"))) = __upc_init_decls;
> 

I tried building a variable declaration along the
lines of __upc_init_addr above, by defining this function:

/* Create a static variable of type 'type'.
   This routine mimics the behavior of 'objc_create_temporary_var'
   with the change that it creates a static (file scoped) variable.
   If we continue to need this function, the two implementations
   should be unified.  */
static tree
upc_create_static_var (tree type, const char *name)
{
  tree id = (name != NULL) ? get_identifier (name) : NULL;
  tree decl = build_decl (input_location, VAR_DECL, id, type);
  TREE_USED (decl) = 1;
  TREE_EXTERNAL (decl) = 0;
  TREE_STATIC (decl) = 1;
  TREE_READONLY (decl) = 1;
  TREE_THIS_VOLATILE (decl) = 0;
  TREE_ADDRESSABLE (decl) = 0;
  DECL_PRESERVE_P (decl) = 1;
  DECL_ARTIFICIAL (decl) = 1;
  DECL_IGNORED_P (decl) = 1;
  DECL_CONTEXT (decl) = NULL;
  return decl;
}

and then using it as follows:

  init_func_ptr_type = build_pointer_type (TREE_TYPE (init_func));
  init_func_addr = upc_create_static_var (init_func_ptr_type, NULL);
  DECL_INITIAL (init_func_addr) = build_unary_op (loc, ADDR_EXPR,
  init_func, 0);
  DECL_SECTION_NAME (init_func_addr) = build_string (
strlen (UPC_INIT_ARRAY_SECTION_NAME),
UPC_INIT_ARRAY_SECTION_NAME);


The variable declaration tree node looks about right to me.
However, it never makes it into the output assembler file.

What is the recommended method for making sure that the
static variable created above is associated with the current
translation unit and that its initialization makes it into
the assembler output file?

Thanks,
- Gary


Re: [c++-concepts] code review

2013-06-19 Thread Andrew Sutton
>> +// If the types of the underlying templates match, compare
>> +// their constraints. The declarations could differ there.
>> +if (types_match)
>> +  types_match = equivalent_constraints (get_constraints (olddecl),
>> +current_template_reqs);
>
>
> We can't assume that current_template_reqs will always apply to newdecl
> here, as decls_match is called in overload resolution as well.  What's the
> problem with attaching the requirements to the declaration before we get to
> duplicate_decls?

It's because newdecl doesn't have a template_info at the point at
which this is called, and the constraints are associated through that
information. This seems like another good reason for keeping
constraints with template decls.

Until I change that, I can just test to see if newdecl has template
info. If so, I'll use its constraints. If not, I'll use the current
requirements.

Andrew


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Bernhard Reutner-Fischer
On 19 June 2013 15:57, Jeff Law  wrote:
> On 06/19/2013 01:02 AM, Chung-Ju Wu wrote:
>>
>> 2013/6/19 Jeff Law :
>>>
>>>
>>>  * gcc.dg/tree-ssa/forwprop-28.c: New test.
>>>
>>
>> In the gnu coding standard we have a space before
>> the open-parentheses.  Would that be great to have
>> testcase follow this convention as well? :)
>>
>> If so, then...
>
> No reason not to fix the test in this instance.  I'll make these updates
> before committing.

eh, nitpicking party ?

+   If a simplification is mode, return TRUE, else return FALSE.  */
+static bool
+simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi,

s/mode/made/

Sounds nice, otherwise!
thanks,


Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Peter Bergner
On Wed, 2013-06-19 at 10:49 -0500, Peter Bergner wrote:
> This is due to the following in _ITM_inTransaction():
> 
> 47  if (tx && (tx->nesting > 0))
> (gdb) p tx
> $2 = (GTM::gtm_thread *) 0x10901bf0
> (gdb) p tx->nesting
> $3 = 1
> (gdb) step
> 49  if (tx->state & gtm_thread::STATE_IRREVOCABLE)
> (gdb) p tx->state
> $4 = 3
> (gdb) p gtm_thread::STATE_IRREVOCABLE
> $5 = 2
> (gdb) step
> 50return inIrrevocableTransaction;

Bah, ignore this.  It's a different call that is returning something other
than inIrrevocableTransaction.  Unfortunately, gdb is having problems inside
hw txns and I'm having trouble seeing why/when _ITM_inTransaction() is
returning something other than inIrrevocableTransaction.  I'll see if I can
determine why and will report back.

Peter





Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Peter Bergner
On Wed, 2013-06-19 at 16:57 +0200, Torvald Riegel wrote:
> (Re-sending to the proper list. Sorry for the noise at gcc@.)
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57643
> 
> The HTM fastpath didn't handle a situation in which a relaxed
> transaction executed unsafe code that in turn starts a transaction; it
> simply tried to wait for the "other" transaction, not checking whether
> the current thread started the other transaction.
[snip]
> Peter and/or Andreas: Could you please check that this fixes the bug you
> see on Power/s390?  Thanks.

This patch fixed the hang, but now I'm dying due to an abort in the
test case.  Specifically, the first abort in unsafe()

int __attribute__((transaction_unsafe)) unsafe(int i)
 {
   if (_ITM_inTransaction() != inIrrevocableTransaction)
 dying here:  abort();
   __transaction_atomic {
 x++;
   }
   if (_ITM_inTransaction() != inIrrevocableTransaction)
 abort();
   return i+1;
 }

This is due to the following in _ITM_inTransaction():

47if (tx && (tx->nesting > 0))
(gdb) p tx
$2 = (GTM::gtm_thread *) 0x10901bf0
(gdb) p tx->nesting
$3 = 1
(gdb) step
49if (tx->state & gtm_thread::STATE_IRREVOCABLE)
(gdb) p tx->state
$4 = 3
(gdb) p gtm_thread::STATE_IRREVOCABLE
$5 = 2
(gdb) step
50  return inIrrevocableTransaction;
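
The arithmetic behind the test on line 49, as a standalone sketch.  Only the
value 2 for the irrevocable bit and the state value 3 come from the gdb
session; the macro names and the meaning of bit 0 are assumptions made for
the example:

```c
#include <assert.h>

/* Hypothetical flag names.  The value 2 matches what gdb printed for
   gtm_thread::STATE_IRREVOCABLE; the name given to bit 0 is a guess.  */
#define FLAG_SERIAL       1u
#define FLAG_IRREVOCABLE  2u

/* Nonzero when the irrevocable bit is set in STATE.  */
static int
is_irrevocable (unsigned int state)
{
  return (state & FLAG_IRREVOCABLE) != 0;
}
```

With state == 3 the mask leaves 3 & 2 == 2, so the branch is taken; with
state == 1 it would not be.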

Peter




Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Matthias Klose
Am 19.06.2013 14:10, schrieb Jakub Jelinek:
> On Wed, Jun 19, 2013 at 02:03:34PM +0200, Matthias Klose wrote:
>> Am 27.11.2012 19:14, schrieb Meador Inge:
>>> On 11/26/2012 09:05 AM, Richard Biener wrote:
>>>
>>>> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge wrote:
>>>>> Ping ^ 4.
>>>>
>>>> Ok.
>>>
>>> Thanks for the review.  Committed to trunk.
>>
>> This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch.
>>
>> $ gcc-ar-4.8 -h
>> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'
>>
>> the plugin is *not* installed with x permission flags (which seems to be the
>> standard for shared libraries).   You did change the code to use find_a_file
>> which searches only for files with the x bit set.
> 
> That actually is the standard for shared libraries, the linker creates
> libraries with those permissions and libtool/automake installs them that way
> too.  So if you override this, you need to cope with that decision.

Well, I did fix this assumption last year in gcc.c, so let's fix it in the other
places too, by adding a mode parameter to the public find_a_file function.
Testing the attached patch.
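
The idea, reduced to a standalone sketch.  The helper below is hypothetical
and much simpler than GCC's find_a_file (no prefix lists, no executable
suffix handling); it only shows the caller choosing between R_OK and X_OK:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not GCC's find_a_file: look for NAME in each of
   the NDIRS directories and leave in BUF the first match that is not a
   directory and passes access (path, MODE).  Returns BUF, or NULL.  */
static char *
find_with_mode (const char *const *dirs, size_t ndirs, const char *name,
                int mode, char *buf, size_t buflen)
{
  size_t i;

  assert (buf != NULL && buflen > 0);
  for (i = 0; i < ndirs; i++)
    {
      struct stat st;
      int len = snprintf (buf, buflen, "%s/%s", dirs[i], name);

      if (len < 0 || (size_t) len >= buflen)
        continue;               /* path did not fit in BUF */
      if (stat (buf, &st) == 0
          && !S_ISDIR (st.st_mode)
          && access (buf, mode) == 0)
        return buf;
    }
  return NULL;
}
```

A plugin installed without the x bit would be found with R_OK and missed
with X_OK, which is the gcc-ar failure described above.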

  Matthias


Index: gcc-ar.c
===
--- gcc-ar.c(revision 200203)
+++ gcc-ar.c(working copy)
@@ -136,7 +136,7 @@
   setup_prefixes (av[0]);
 
   /* Find the GCC LTO plugin */
-  plugin = find_a_file (&target_path, LTOPLUGINSONAME);
+  plugin = find_a_file (&target_path, LTOPLUGINSONAME, R_OK);
   if (!plugin)
 {
   fprintf (stderr, "%s: Cannot find plugin '%s'\n", av[0], 
LTOPLUGINSONAME);
@@ -151,7 +151,7 @@
   const char *cross_exe_name;
 
   cross_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
-  exe_name = find_a_file (&path, cross_exe_name);
+  exe_name = find_a_file (&path, cross_exe_name, X_OK);
   if (!exe_name)
{
  fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0],
Index: file-find.c
===
--- file-find.c (revision 200203)
+++ file-find.c (working copy)
@@ -31,7 +31,7 @@
 }
 
 char *
-find_a_file (struct path_prefix *pprefix, const char *name)
+find_a_file (struct path_prefix *pprefix, const char *name, int mode)
 {
   char *temp;
   struct prefix_list *pl;
@@ -50,7 +50,7 @@
 
   if (IS_ABSOLUTE_PATH (name))
 {
-  if (access (name, X_OK) == 0)
+  if (access (name, mode) == 0)
{
  strcpy (temp, name);
 
@@ -66,7 +66,7 @@
   strcpy (temp, name);
strcat (temp, HOST_EXECUTABLE_SUFFIX);
 
-   if (access (temp, X_OK) == 0)
+   if (access (temp, mode) == 0)
  return temp;
 #endif
 
@@ -83,7 +83,7 @@
 
if (stat (temp, &st) >= 0
&& ! S_ISDIR (st.st_mode)
-   && access (temp, X_OK) == 0)
+   && access (temp, mode) == 0)
  return temp;
 
 #ifdef HOST_EXECUTABLE_SUFFIX
@@ -93,7 +93,7 @@
 
if (stat (temp, &st) >= 0
&& ! S_ISDIR (st.st_mode)
-   && access (temp, X_OK) == 0)
+   && access (temp, mode) == 0)
  return temp;
 #endif
   }
Index: file-find.h
===
--- file-find.h (revision 200203)
+++ file-find.h (working copy)
@@ -38,7 +38,7 @@
 };
 
 extern void find_file_set_debug (bool);
-extern char *find_a_file (struct path_prefix *, const char *);
+extern char *find_a_file (struct path_prefix *, const char *, int mode);
 extern void add_prefix (struct path_prefix *, const char *);
 extern void prefix_from_env (const char *, struct path_prefix *);
 extern void prefix_from_string (const char *, struct path_prefix *);
Index: collect2.c
===
--- collect2.c  (revision 200203)
+++ collect2.c  (working copy)
@@ -1110,55 +1110,55 @@
   if (ld_file_name == 0)
 #endif
 #ifdef REAL_LD_FILE_NAME
-  ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME);
+ld_file_name = find_a_file (&path, REAL_LD_FILE_NAME, X_OK);
   if (ld_file_name == 0)
 #endif
   /* Search the (target-specific) compiler dirs for ld'.  */
-  ld_file_name = find_a_file (&cpath, real_ld_suffix);
+ld_file_name = find_a_file (&cpath, real_ld_suffix, X_OK);
   /* Likewise for `collect-ld'.  */
   if (ld_file_name == 0)
 {
-  ld_file_name = find_a_file (&cpath, collect_ld_suffix);
+  ld_file_name = find_a_file (&cpath, collect_ld_suffix, X_OK);
   use_collect_ld = ld_file_name != 0;
 }
   /* Search the compiler directories for `ld'.  We have protection against
  recursive calls in find_a_file.  */
   if (ld_file_name == 0)
-ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker]);
+ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker], X_OK);
   /* Search the ordinary system bin directories
  for `ld' (if native linking) or `TARGET-ld' (if cross).  */
   if (ld_file_name == 0)
-ld_file_name = find_a_file (&path, full_ld_suff

Re: [patch] Improve debug info for small structures passed by reference

2013-06-19 Thread Eric Botcazou
> Especially if it is -O0 only, I don't see why you think so.  Just dg-skip-if
> it for -O1+ if you believe it is unreliable for some reason, but if you
> look at the parameter value after the prologue, not showing the right value
> at -O0 would be a serious bug everywhere.  Having some GDB testcase also
> doesn't hurt, but having it in GCC testsuite has significant advantages
> over that, most GCC developers don't run GDB testsuite after any changes
> they do.

All right, I've attached a couple of guality testcases (param-1.c for -O0 and 
param-2.c for -O1 -fno-var-tracking-assignments, a param-3.c for bare -O1 will 
require further adjustments in var-tracking.c).  Unfortunately they don't fail 
without the patch e.g. on PowerPC, they are reported as unsupported instead 
because of "Cannot access memory at address" messages from GDB...


* gcc.dg/guality/param-1.c: New test.
* gcc.dg/guality/param-2.c: Likewise.


-- 
Eric Botcazou
/* { dg-do run } */
/* { dg-options "-g" } */
/* { dg-skip-if "" { *-*-* }  { "*" } { "-O0" } } */

typedef __UINTPTR_TYPE__ uintptr_t;

__attribute__((noinline, noclone)) int
sub (int a, int b)
{
  return a - b;
}

typedef struct { uintptr_t pa; uintptr_t pb; } fatp_t
  __attribute__ ((aligned (2 * __alignof__ (uintptr_t))));

__attribute__((noinline, noclone)) void
foo (fatp_t str, int a, int b)
{
  int i = sub (a, b);
  if (i == 0)   /* BREAK */
i = sub (b, a);
}

int
main (void)
{
  fatp_t ptr = { 31415927, 27182818 };
  foo (ptr, 1, 2);
  return 0;
}

/* { dg-final { gdb-test 20 "str.pa" "31415927" } } */
/* { dg-final { gdb-test 20 "str.pb" "27182818" } } */
/* { dg-do run } */
/* { dg-options "-g -fno-var-tracking-assignments" } */
/* { dg-skip-if "" { *-*-* }  { "*" } { "-O0" "-O1" } } */

typedef __UINTPTR_TYPE__ uintptr_t;

__attribute__((noinline, noclone)) int
sub (int a, int b)
{
  return a - b;
}

typedef struct { uintptr_t pa; uintptr_t pb; } fatp_t
  __attribute__ ((aligned (2 * __alignof__ (uintptr_t))));

__attribute__((noinline, noclone)) void
foo (fatp_t str, int a, int b)
{
  int i = sub (a, b);
  if (i == 0)   /* BREAK */
foo (str, a - 1, b);
}

int
main (void)
{
  fatp_t ptr = { 31415927, 27182818 };
  foo (ptr, 1, 2);
  return 0;
}

/* { dg-final { gdb-test 20 "str.pa" "31415927" } } */
/* { dg-final { gdb-test 20 "str.pb" "27182818" } } */


[patch] libitm: Fix handling of reentrancy in the HTM fastpath

2013-06-19 Thread Torvald Riegel
(Re-sending to the proper list. Sorry for the noise at gcc@.)

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57643

The HTM fastpath didn't handle a situation in which a relaxed
transaction executed unsafe code that in turn starts a transaction; it
simply tried to wait for the "other" transaction, not checking whether
the current thread started the other transaction.

We fix this by doing this check, and if we have the lock, we just
continue with the fallback serial-mode path instead of using a HW
transaction.  The current code won't do the check before starting a HW
transaction, but this way we can keep the HTM fastpath unchanged; also,
this particular "reentrancy" is probably infrequent in practice, so I
suppose the small slowdown shouldn't matter much.

Also, I first thought about trying to use the HTM in the reentrancy
case, but this doesn't make any sense because other transactions can't
run anyway, and we should really just finish the serial-mode transaction
as fast as possible.

Peter and/or Andreas: Could you please check that this fixes the bug you
see on Power/s390?  Thanks.

Torvald

commit 185af84e365e1bae31aea5afd6e67e81f3c32c72
Author: Torvald Riegel 
Date:   Wed Jun 19 16:42:24 2013 +0200

libitm: Fix handling of reentrancy in the HTM fastpath.

PR libitm/57643

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 93e702e..a3bf549 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -197,6 +197,8 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const 
gtm_jmpbuf *jb)
  // We are executing a transaction now.
  // Monitor the writer flag in the serial-mode lock, and abort
  // if there is an active or waiting serial-mode transaction.
+ // Note that this can also happen due to an enclosing
+ // serial-mode transaction; we handle this case below.
  if (unlikely(serial_lock.is_write_locked()))
htm_abort();
  else
@@ -219,6 +221,14 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const 
gtm_jmpbuf *jb)
  tx = new gtm_thread();
  set_gtm_thr(tx);
}
+ // Check whether there is an enclosing serial-mode transaction;
+ // if so, we just continue as a nested transaction and don't
+ // try to use the HTM fastpath.  This case can happen when an
+ // outermost relaxed transaction calls unsafe code that starts
+ // a transaction.
+ if (tx->nesting > 0)
+   break;
+ // Another thread is running a serial-mode transaction.  Wait.
  serial_lock.read_lock(tx);
  serial_lock.read_unlock(tx);
  // TODO We should probably reset the retry count t here, unless


[PATCH] PR/57652 collect2 temp files

2013-06-19 Thread David Edelsohn
A 2011 change to collect2 to use the standard diagnostics
infrastructure broke collect2's cleanup of temp files when an error
occurs.  This prototype of a patch implements the minimal conversion
of collect2 to use atexit().

If this is the right direction, all calls to collect_exit() can be
converted to exit().
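
For anyone unfamiliar with the pattern, a minimal standalone sketch follows.
The names are hypothetical and it tracks a single file, unlike collect2:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for collect2's temp-file bookkeeping.  */
static const char *temp_file;

/* Remove the temp file, if one is registered.  Safe to call twice.  */
static void
remove_temp_file (void)
{
  if (temp_file != NULL && temp_file[0] != '\0')
    unlink (temp_file);
  temp_file = NULL;
}

/* Remember NAME and arrange for it to be removed at normal exit.
   Returns 0 on success, -1 if atexit could not register the handler.  */
static int
register_temp_file (const char *name)
{
  static int registered;

  assert (name != NULL);
  temp_file = name;
  if (!registered)
    {
      if (atexit (remove_temp_file) != 0)
        return -1;
      registered = 1;
    }
  return 0;
}
```

Handlers registered with atexit run on exit() and on return from main, but
not on abort(), _exit() or death by signal, so those paths still need their
own handling.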

Thanks, David

PR driver/57652
* collect2.c (collect_atexit): New.
(collect_exit): Directly call exit.
(main): Register collect_atexit with atexit.

Index: collect2.c
===
--- collect2.c  (revision 200180)
+++ collect2.c  (working copy)
@@ -367,7 +367,7 @@
 /* Delete tempfiles and exit function.  */

 void
-collect_exit (int status)
+collect_atexit (void)
 {
   if (c_file != 0 && c_file[0])
 maybe_unlink (c_file);
@@ -395,12 +395,16 @@
   maybe_unlink (lderrout);
 }

-  if (status != 0 && output_file != 0 && output_file[0])
+  if (output_file != 0 && output_file[0])
 maybe_unlink (output_file);

   if (response_file)
 maybe_unlink (response_file);
+}

+void
+collect_exit (int status)
+{
   exit (status);
 }

@@ -970,6 +974,9 @@
   signal (SIGCHLD, SIG_DFL);
 #endif

+  if (atexit (collect_atexit) != 0)
+fatal_error ("atexit failed");
+
   /* Unlock the stdio streams.  */
   unlock_std_streams ();


Re: [PATCH GCC]Fix PR57540, try to choose scaled_offset address mode when expanding array reference

2013-06-19 Thread Bin.Cheng
On Tue, Jun 18, 2013 at 10:02 PM, Oleg Endo  wrote:
> On Tue, 2013-06-18 at 18:09 +0800, Bin.Cheng wrote:
>> On Tue, Jun 18, 2013 at 3:52 AM, Oleg Endo  wrote:
>> >
>> > My observation is, that legitimizing addressing modes in the backend by
>> > looking at one isolated address works, but doesn't give good results.
>> > In the SH backend there is a particular case with displacement
>> > addressing (register + constant).  On SH displacements for byte
>> > addressing are 0..15, 0..31 for 16 bit words and 0..63 for 32 bit words.
>> > sh_legitimize_address (or rather sh_find_mov_disp_adjust) uses a fixed
>> > heuristic to satisfy the displacement constraint and splits out a plus
>> > insn if needed to adjust the base address.  Of course that fixed
>> > heuristic doesn't work for some cases and thus sometimes results in
>> > unnecessary base address adjustments.
>> > I had the idea of replacing the current (partially defunct) auto-inc-dec
>> > RTL pass with a more generic addressing mode selection pass:
>> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56590
>> >
>> > Any suggestions/comments/... are highly appreciated.
>> >
>> In PR56590, is PR50749 the only one that correlates with auto-inc-dec?
>> The others seem to be just problems of wrong addressing mode.
>
> Yes, PR 50749 was the initial description of auto-inc-dec defects.  PR
> 52049 is also related to it, as the code examples there are candidates
> for post-inc addressing mode.  In that case, if 'int' is replaced with
> 'float' on SH post-inc is the optimal mode, because it doesn't have
> displacement addressing for FPU loads, except on SH2A.  Even then,
> using post-inc is better as it has a more compact instruction encoding.
> The current auto-inc-dec is not able to discover such opportunities if,
> for example, mem accesses are reordered by preceding optimization
> passes.
>
>> And one point on PR50749, auto-inc-dec depends on ivopt to choose
>> auto-increment candidate.  Since you disabled ivopt, I bet GCC will
>> miss lots of auto-increment opportunities.
>
> No, I haven't disabled ivopt.
>

But -fno-ivopts is specified in PR50749.
With the current implementation, auto-inc-dec iterates over instructions
backward and tries to find memory access and increment/decrement pairs.
It will miss opportunities when the instructions interfere with each
other.
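
To make the point concrete, here is a toy model of such a backward
scan.  It is only a sketch under simplifying assumptions: the Insn
struct, the Kind enum and count_post_inc_candidates are made up for
illustration and have nothing to do with GCC's RTL or with the real
auto-inc-dec pass.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction kinds; not GCC's RTL.
enum class Kind { Load, AddImm, Other };

struct Insn {
  Kind kind;
  int reg;  // Load: address register; AddImm: register being bumped;
            // Other: a register the instruction reads or writes.
  int imm;  // Load: access size in bytes; AddImm: the increment.
};

// Walk the sequence backward.  For each "reg += imm", look at the
// closest earlier instruction that touches the same register.  Only a
// load through that register with a matching access size counts as a
// post-increment candidate; anything else touching it blocks the pair.
std::size_t count_post_inc_candidates(const std::vector<Insn>& insns) {
  std::size_t found = 0;
  for (std::size_t i = insns.size(); i-- > 0;) {
    if (insns[i].kind != Kind::AddImm)
      continue;
    for (std::size_t j = i; j-- > 0;) {
      if (insns[j].reg != insns[i].reg)
        continue;  // unrelated instruction, keep scanning
      if (insns[j].kind == Kind::Load && insns[j].imm == insns[i].imm)
        ++found;
      break;  // the first instruction touching the register decides
    }
  }
  return found;
}
```

In this model a single intervening use of the address register is
enough to lose the candidate, which is the kind of interference meant
above.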

Thanks.
bin
--
Best Regards.


Re: RFA: Fix rtl-optimization/57425

2013-06-19 Thread Joern Rennecke

I.e. the arguments after your patch are exactly swapped.  This is usually
harmless, but not always, so that should be corrected before check in.
The change in cselib.c:cselib_invalidate_mem has the same problem.


Well, I have already committed the patch, so attached is a patch to fix
things up.
Looking at the read MEM canonicalization further, these are obviously
canonicalized in cse.c:invalidate, but this is not so clear in cselib.
I can see some canonicalization going on where things are recorded, but
there is no comment in cselib.h:cselib_val nor cselib.c:first_containing_mem
to say that it's guaranteed that all the expressions in the locs lists are
canonicalized, so I'll assume that's not a safe assumption, and I've
added a mem_canonicalized parameter to canon_anti_dependence to
indicate if MEM (the previously read location) is canonicalized.

Bootstrapped / regtested on i686-pc-linux-gnu.
2013-06-19  Joern Rennecke 

PR rtl-optimization/57425
PR rtl-optimization/57569
* alias.c (write_dependence_p): Remove parameters mem_mode and
canon_mem_addr.  Add parameters x_mode, x_addr and x_canonicalized.
Changed all callers.
(canon_anti_dependence): Get comments and semantics in sync,
to be a proper drop-in replacement for canon_true_dependence
for use by cse(lib).
Add parameter mem_canonicalized.  Changed all callers.
* rtl.h (canon_anti_dependence): Update prototype.

Index: alias.c
===
--- alias.c (revision 200133)
+++ alias.c (working copy)
@@ -156,8 +156,9 @@ static int insert_subset_children (splay
 static alias_set_entry get_alias_set_entry (alias_set_type);
 static bool nonoverlapping_component_refs_p (const_rtx, const_rtx);
 static tree decl_for_component_ref (tree);
-static int write_dependence_p (const_rtx, enum machine_mode, rtx, const_rtx,
-  bool, bool);
+static int write_dependence_p (const_rtx,
+  const_rtx, enum machine_mode, rtx,
+  bool, bool, bool);
 
 static void memory_modified_1 (rtx, const_rtx, void *);
 
@@ -2555,20 +2556,22 @@ canon_true_dependence (const_rtx mem, en
 
 /* Returns nonzero if a write to X might alias a previous read from
(or, if WRITEP is true, a write to) MEM.
-   If MEM_CANONCALIZED is nonzero, CANON_MEM_ADDR is the canonicalized
-   address of MEM, and MEM_MODE the mode for that access.  */
+   If X_CANONICALIZED is nonzero, then X_ADDR is the canonicalized address of X,
+   and X_MODE the mode for that access.
+   If MEM_CANONICALIZED is true, MEM is canonicalized.  */
 
 static int
-write_dependence_p (const_rtx mem, enum machine_mode mem_mode,
-   rtx canon_mem_addr, const_rtx x,
-   bool mem_canonicalized, bool writep)
+write_dependence_p (const_rtx mem,
+   const_rtx x, enum machine_mode x_mode, rtx x_addr,
+   bool mem_canonicalized, bool x_canonicalized, bool writep)
 {
-  rtx x_addr, mem_addr;
+  rtx mem_addr;
   rtx base;
   int ret;
 
-  gcc_checking_assert (mem_canonicalized ? (canon_mem_addr != NULL_RTX)
-  : (canon_mem_addr == NULL_RTX && mem_mode == VOIDmode));
+  gcc_checking_assert (x_canonicalized
+  ? (x_addr != NULL_RTX && x_mode != VOIDmode)
+  : (x_addr == NULL_RTX && x_mode == VOIDmode));
 
   if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem))
 return 1;
@@ -2593,17 +2596,21 @@ write_dependence_p (const_rtx mem, enum
   if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x))
 return 1;
 
-  x_addr = XEXP (x, 0);
   mem_addr = XEXP (mem, 0);
-  if (!((GET_CODE (x_addr) == VALUE
-&& GET_CODE (mem_addr) != VALUE
-&& reg_mentioned_p (x_addr, mem_addr))
-   || (GET_CODE (x_addr) != VALUE
-   && GET_CODE (mem_addr) == VALUE
-   && reg_mentioned_p (mem_addr, x_addr))))
+  if (!x_addr)
 {
-  x_addr = get_addr (x_addr);
-  mem_addr = get_addr (mem_addr);
+  x_addr = XEXP (x, 0);
+  if (!((GET_CODE (x_addr) == VALUE
+&& GET_CODE (mem_addr) != VALUE
+&& reg_mentioned_p (x_addr, mem_addr))
+   || (GET_CODE (x_addr) != VALUE
+   && GET_CODE (mem_addr) == VALUE
+   && reg_mentioned_p (mem_addr, x_addr))))
+   {
+ x_addr = get_addr (x_addr);
+ if (!mem_canonicalized)
+   mem_addr = get_addr (mem_addr);
+   }
 }
 
   base = find_base_term (mem_addr);
@@ -2619,17 +2626,16 @@ write_dependence_p (const_rtx mem, enum
  GET_MODE (mem)))
 return 0;
 
-  x_addr = canon_rtx (x_addr);
-  if (mem_canonicalized)
-mem_addr = canon_mem_addr;
-  else
+  if (!x_canonicalized)
 {
-  mem_addr = canon_rtx (mem_addr);
-  mem_mode = GET_MODE (mem);
+  x_addr = canon_rtx (x_addr);
+  x_mode = GET_MODE (x);
 }
+  if (!mem_ca

Re: [PATCH] Fix up gcc-{ar,nm,ranlib}

2013-06-19 Thread Andi Kleen
On Wed, Jun 19, 2013 at 03:20:33PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> If say /usr/bin/gcc-ar doesn't find /usr//bin/ar (and a few others),
> it gives up unless CROSS_DIRECTORY_STRUCTURE, while e.g. collect2 looks for
> ld, nm etc. in $PATH.  The collect2.c snippet is:

Looks good. Thanks for fixing.

-Andi


Re: [PATCH] Fix up gcc-{ar,nm,ranlib}

2013-06-19 Thread Matthias Klose
Am 19.06.2013 15:20, schrieb Jakub Jelinek:
> Here is a so far untested attempt to do that in gcc-{ar,nm,ranlib} too, ok if
> bootstrap/regtest passes and testing shows it works (for 4.8 too, in 4.7 it
> worked)?

works for me, checked with a 4.8 native build and install.

  Matthias



Re: [c++-concepts] code review

2013-06-19 Thread Jason Merrill

On 06/18/2013 12:27 PM, Andrew Sutton wrote:

There was a bug in instantiation_dependent_expr_r that would cause
trait expressions like __is_class(int) to be marked as type dependent.
It was always testing the 2nd operand, even for unary traits
(NULL_TREE turns out to be type dependent).


I fixed that last month:

2013-05-20  Jason Merrill  

PR c++/57016
* pt.c (instantiation_dependent_r) [TRAIT_EXPR]: Only check
type2 if there is one.

If you want to keep the is_binary_trait stuff, that's fine, except that


+extern bool is_binary_trait (cp_trait_kind);

...

+inline bool
+is_binary_trait (cp_trait_kind k)


violates the rules for inline functions: an inline function must be 
declared as inline before any uses and defined in all translation units 
that use it.
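
A minimal sketch of a layout that satisfies this rule, using a
hypothetical two-value enum in place of cp_trait_kind (the names
trait_kind_sketch and is_binary_trait_sketch are made up for the
example): the very first declaration already says inline, and the
definition sits in the header so every translation unit that uses the
function also sees it.

```cpp
// Hypothetical stand-in for cp_trait_kind; the real enum has many more
// enumerators.
enum trait_kind_sketch { TK_IS_CLASS, TK_IS_BASE_OF };

// First declaration: already marked inline, and it precedes any use.
inline bool is_binary_trait_sketch (trait_kind_sketch k);

// Definition in the same header, so it is available in every
// translation unit that includes it.
inline bool
is_binary_trait_sketch (trait_kind_sketch k)
{
  return k == TK_IS_BASE_OF;
}
```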



+// reduced terms in the constraints language. Note that conjoining with a
+// non-null expression with  NULL_TREE is an identity operation. That is,


Drop the first "with".


+// If the types of the underlying templates match, compare
+// their constraints. The declarations could differ there.
+if (types_match)
+  types_match = equivalent_constraints (get_constraints (olddecl),
+current_template_reqs);


We can't assume that current_template_reqs will always apply to newdecl 
here, as decls_match is called in overload resolution as well.  What's 
the problem with attaching the requirements to the declaration before we 
get to duplicate_decls?


Jason



Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Jeff Law

On 06/19/2013 01:02 AM, Chung-Ju Wu wrote:

2013/6/19 Jeff Law :


 * gcc.dg/tree-ssa/forwprop-28.c: New test.



In the GNU coding standard we have a space before
the open parenthesis.  Would it be good to have the
testcase follow this convention as well? :)

If so, then...
No reason not to fix the test in this instance.  I'll make these updates 
before committing.


jeff



[C++ Patch] PR 57645

2013-06-19 Thread Paolo Carlini

Hi,

when I implemented Core/1123 "Destructors should be noexcept by 
default", unfortunately I caused this regression, present now in 
mainline and 4_8-branch.


When the destructor is user provided, with no exception specification, 
and the type has data members (not bases, those are already OK) whose 
destructors can throw, the destructor is wrongly deduced to be 
noexcept. The reason is that deduce_noexcept_on_destructors is called 
from check_bases_and_members after check_bases but *before* 
check_methods, and therefore the latter does work relevant to the 
deduction too late, namely possibly setting TYPE_HAS_NONTRIVIAL_DESTRUCTOR.


My proposal for a fix involves simply doing that work earlier, as part of 
deduce_noexcept_on_destructors, which is now renamed check_destructors and 
called unconditionally. Things appear to work fine. Of course different 
refactorings and naming schemes could make perfect sense.


Tested x86_64-linux.

Thanks,
Paolo.

//
/cp
2013-06-19  Paolo Carlini  

PR c++/57645
* class.c (check_methods): Don't set TYPE_HAS_NONTRIVIAL_DESTRUCTOR
here...
(deduce_noexcept_on_destructors): ... do it here. Rename the
function to check_destructors. 
(check_bases_and_members): Adjust.

/testsuite
2013-06-19  Paolo Carlini  

PR c++/57645
* testsuite/g++.dg/cpp0x/noexcept21.C: New.
Index: cp/class.c
===
--- cp/class.c  (revision 200197)
+++ cp/class.c  (working copy)
@@ -4256,11 +4256,6 @@ check_methods (tree t)
  if (DECL_PURE_VIRTUAL_P (x))
vec_safe_push (CLASSTYPE_PURE_VIRTUALS (t), x);
}
-  /* All user-provided destructors are non-trivial.
- Constructors and assignment ops are handled in
-grok_special_member_properties.  */
-  if (DECL_DESTRUCTOR_P (x) && user_provided_p (x))
-   TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = 1;
 }
 }
 
@@ -4568,8 +4563,12 @@ clone_constructors_and_destructors (tree t)
 clone_function_decl (OVL_CURRENT (fns), /*update_method_vec_p=*/1);
 }
 
-/* Deduce noexcept for a destructor DTOR.  */
+/* Deduce noexcept for a destructor DTOR. 
 
+   12.4/3: A declaration of a destructor that does not have an
+   exception-specification is implicitly considered to have the
+   same exception-specification as an implicit declaration (15.4).  */
+
 void
 deduce_noexcept_on_destructor (tree dtor)
 {
@@ -4584,14 +4583,11 @@ deduce_noexcept_on_destructor (tree dtor)
 }
 }
 
-/* For each destructor in T, deduce noexcept:
+/* Possibly set TYPE_HAS_NONTRIVIAL_DESTRUCTOR and deduce noexcept for
+   each destructor.  */
 
-   12.4/3: A declaration of a destructor that does not have an
-   exception-specification is implicitly considered to have the
-   same exception-specification as an implicit declaration (15.4).  */
-
 static void
-deduce_noexcept_on_destructors (tree t)
+check_destructors (tree t)
 {
   tree fns;
 
@@ -4601,7 +4597,12 @@ static void
 return;
 
   for (fns = CLASSTYPE_DESTRUCTORS (t); fns; fns = OVL_NEXT (fns))
-deduce_noexcept_on_destructor (OVL_CURRENT (fns));
+{
+  if (user_provided_p (OVL_CURRENT (fns)))
+   TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = 1;
+  if (cxx_dialect >= cxx0x)
+   deduce_noexcept_on_destructor (OVL_CURRENT (fns));
+}
 }
 
 /* Subroutine of set_one_vmethod_tm_attributes.  Search base classes
@@ -5319,10 +5320,10 @@ check_bases_and_members (tree t)
   check_bases (t, &cant_have_const_ctor,
   &no_const_asn_ref);
 
-  /* Deduce noexcept on destructors.  This needs to happen after we've set
- triviality flags appropriately for our bases.  */
-  if (cxx_dialect >= cxx0x)
-deduce_noexcept_on_destructors (t);
+  /* Possibly set TYPE_HAS_NONTRIVIAL_DESTRUCTOR and deduce noexcept on
+ destructors.  This needs to happen after we've set triviality flags
+ appropriately for our bases.  */
+  check_destructors (t);
 
   /* Check all the method declarations.  */
   check_methods (t);
Index: testsuite/g++.dg/cpp0x/noexcept21.C
===
--- testsuite/g++.dg/cpp0x/noexcept21.C (revision 0)
+++ testsuite/g++.dg/cpp0x/noexcept21.C (working copy)
@@ -0,0 +1,28 @@
+// PR c++/57645
+// { dg-do compile { target c++11 } }
+
+struct Thrower
+{
+  ~Thrower() noexcept(false) { throw 1; }
+};
+
+struct ExplicitA
+{
+  ~ExplicitA() {}
+
+  Thrower t;
+};
+
+struct ExplicitB
+{
+  ~ExplicitB();
+
+  Thrower t;
+};
+
+ExplicitB::~ExplicitB() { }
+
+#define SA(X) static_assert(X, #X)
+
+SA( !noexcept(ExplicitA()) );
+SA( !noexcept(ExplicitB()) );


[PATCH] Fix up gcc-{ar,nm,ranlib}

2013-06-19 Thread Jakub Jelinek
Hi!

If say /usr/bin/gcc-ar doesn't find /usr//bin/ar (and a few others),
it gives up unless CROSS_DIRECTORY_STRUCTURE, while e.g. collect2 looks for
ld, nm etc. in $PATH.  The collect2.c snippet is:

  /* Search the compiler directories for `ld'.  We have protection against
 recursive calls in find_a_file.  */
  if (ld_file_name == 0)
ld_file_name = find_a_file (&cpath, ld_suffixes[selected_linker]);
  /* Search the ordinary system bin directories
 for `ld' (if native linking) or `TARGET-ld' (if cross).  */
  if (ld_file_name == 0)
ld_file_name = find_a_file (&path, full_ld_suffixes[selected_linker]);
where the difference between full_ld_suffixes and ld_suffixes is
exactly a concat (target_machine, "-", ld_suffixes[xxx], NULL);

Here is a so far untested attempt to do that in gcc-{ar,nm,ranlib} too, ok if
bootstrap/regtest passes and testing shows it works (for 4.8 too, in 4.7 it
worked)?
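
As a rough sketch of the intended lookup order (not the gcc-ar.c code
itself): try the compiler directories first, then fall back to the
$PATH-derived directories, using the TARGET- prefixed name only for
cross builds.  The names locate_tool, find_in_dirs and ExistsFn are
hypothetical, and the existence check is passed in as a callback so the
sketch does not touch the file system; the real find_a_file has a
different signature.

```cpp
#include <functional>
#include <string>
#include <vector>

// Callback that says whether a candidate path exists and is usable.
using ExistsFn = std::function<bool(const std::string&)>;

// Return the first "dir/name" that exists, or an empty string.
std::string find_in_dirs(const std::vector<std::string>& dirs,
                         const std::string& name,
                         const ExistsFn& exists) {
  for (const std::string& dir : dirs) {
    const std::string candidate = dir + "/" + name;
    if (exists(candidate))
      return candidate;
  }
  return "";
}

// Look in the compiler directories first.  If that fails, search the
// $PATH-derived directories, with a "TARGET-" prefix only when cross.
std::string locate_tool(const std::vector<std::string>& compiler_dirs,
                        const std::vector<std::string>& path_dirs,
                        const std::string& personality,
                        const std::string& target_machine,
                        bool cross,
                        const ExistsFn& exists) {
  const std::string found = find_in_dirs(compiler_dirs, personality, exists);
  if (!found.empty())
    return found;
  const std::string real_name =
      cross ? target_machine + "-" + personality : personality;
  return find_in_dirs(path_dirs, real_name, exists);
}
```

An empty result corresponds to the "Cannot find binary" error path in
the patch.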

2013-06-19  Jakub Jelinek  

* gcc-ar.c (main): If not CROSS_DIRECTORY_STRUCTURE, look for
PERSONALITY in $PATH derived prefixes.

--- gcc/gcc-ar.c.jj 2013-01-11 09:02:55.0 +0100
+++ gcc/gcc-ar.c 2013-06-19 15:09:08.314935157 +0200
@@ -147,21 +147,17 @@ main(int ac, char **av)
   exe_name = find_a_file (&target_path, PERSONALITY);
   if (!exe_name)
 {
+  const char *real_exe_name = PERSONALITY;
 #ifdef CROSS_DIRECTORY_STRUCTURE
-  const char *cross_exe_name;
-
-  cross_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
-  exe_name = find_a_file (&path, cross_exe_name);
+  real_exe_name = concat (target_machine, "-", PERSONALITY, NULL);
+#endif
+  exe_name = find_a_file (&path, real_exe_name);
   if (!exe_name)
{
  fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0],
-  cross_exe_name);
+  real_exe_name);
  exit (1);
}
-#else
-  fprintf (stderr, "%s: Cannot find binary '%s'\n", av[0], PERSONALITY);
-  exit (1);
-#endif
 }
 
   /* Create new command line with plugin */

Jakub


Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).

2013-06-19 Thread James Greenhalgh

> -----Original Message-----
> From: Steve Ellcey [mailto:sell...@mips.com]
> Sent: 14 June 2013 19:07
>
> With my version the compiler calls gimple_duplicate_sese_region from
> duplicate_blocks 60 times.  With your patch it calls
> gimple_duplicate_sese_region from duplicate_thread_path 16 times.
>

Hi Steve,

You are quite right. With -finline-limit=1000 I see
a big difference in performance. The cause of this is
the code added to
tree-ssa-threadedge.c (simplify_control_stmt_condition).

If we have failed to simplify to a gimple_min_invariant,
we want to look for thread paths to the value given by
gimple_goto_dest, rather than the SSA_NAME_VALUE of that
value.

This improves the performance on my x86_64 toolchain to the
same level as your patch. I see "Registered 20 jump paths"
printed 3 times in dom1, for a total of 60 thread paths.

I've also fixed another couple of bugs I spotted, improved
logging of results and added the parameters that were in
your patch.

I did investigate changing the search strategy back to yours,
but I saw no impact on the thread paths found.

Please let me know if this fixes the performance issues you
were seeing and if you have any other comments.

FWIW I've bootstrapped and regression tested this version of
the patch on x86_64 and ARM with no regressions.

Thanks,
James Greenhalgh

---

Changes from v1:

---
gcc/

2013-06-19  James Greenhalgh  

* params.def (PARAM_MAX_THREAD_PATH_INSNS): New.
(PARAM_MAX_THREAD_PATHS): Likewise.
* tree-ssa-threadedge.c
(simplify_control_stmt_condition): If we can't simplify
cond, return it unmodified.
(max_insn_count): Do not initialize.
(max_path_count): Likewise.
(find_control_statement_thread_paths): Catch case where path
has already been computed (thus no further path exists),
add sanity checking.
(thread_across_edge): Initialize max_{insn, path}_count.
* tree-ssa-threadupdate.c
(duplicate_thread_path): Add sanity check, logging.
(thread_through_all_blocks): Thread paths, even if no
threaded_edges were found.
(register_thread_paths): Improve logging.

---

Changelog:

gcc/

2013-06-19  James Greenhalgh  

PR tree-optimization/54742
* params.def (PARAM_MAX_THREAD_PATH_INSNS): New.
(PARAM_MAX_THREAD_PATHS): Likewise.
* tree-cfg.c (gimple_duplicate_sese_region): Memoize loop latch
and loop header blocks if copying across a latch/header.
* tree-flow.h (thread_paths): New struct.
(register_thread_paths): Declare.
* tree-ssa-threadedge.c
(simplify_control_stmt_condition): Permit returning something
not in gimple_min_invariant form.
(max_insn_count): Declare.
(max_path_count): Likewise.
(find_thread_path_1): New function.
(find_thread_path): Likewise.
(save_new_thread_path): Likewise.
(find_control_statement_thread_paths): Likewise.
(thread_across_edge): Handle non gimple_min_invariant cases.
* tree-ssa-threadupdate.c (thread_paths_vec): New.
(remove_edge_from_thread_candidates): New function.
(duplicate_thread_path): Likewise.
(copy_control_statement_thread_paths): Likewise.
(thread_through_all_blocks): Handle thread_paths.
(register_thread_paths): New function.
diff --git a/gcc/params.def b/gcc/params.def
index 3c52651..25d36a6 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -123,6 +123,19 @@ DEFPARAM (PARAM_PARTIAL_INLINING_ENTRY_PROBABILITY,
 	  "Maximum probability of the entry BB of split region (in percent relative to entry BB of the function) to make partial inlining happen",
 	  70, 0, 0)
 
+/* Maximum number of instructions to copy when duplicating blocks
+   on a jump thread path.  */
+DEFPARAM (PARAM_MAX_THREAD_PATH_INSNS,
+		"max-thread-path-insns",
+		"Maximum number of instructions to copy when duplicating blocks on a jump thread path",
+		100, 1, 99)
+
+/* Maximum number of jump thread paths to duplicate.  */
+DEFPARAM (PARAM_MAX_THREAD_PATHS,
+		"max-thread-paths",
+		"Maximum number of new jump thread paths to create",
+		50, 1, 99)
+
 /* Limit the number of expansions created by the variable expansion
optimization to avoid register pressure.  */
 DEFPARAM (PARAM_MAX_VARIABLE_EXPANSIONS,
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 4b91a35..6dcd2e4 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -5717,10 +5717,12 @@ gimple_duplicate_sese_region (edge entry, edge exit,
 {
   unsigned i;
   bool free_region_copy = false, copying_header = false;
+  bool save_loop_details = false;
   struct loop *loop = entry->dest->loop_father;
   edge exit_copy;
  vec<basic_block> doms;
   edge redirected;
+  int memo_loop_header_no = 0, memo_loop_latch_no = 0;
   int total_freq = 0, entry_freq = 0;
   gcov_type total_count = 0, entry_count = 0;
 
@@ -5738,9 +5740,15 @@ gimple_duplicate_sese_region (edge entry, edge exit,
   if (re

Re: Go patch committed: Use function descriptors

2013-06-19 Thread Ian Lance Taylor
On Wed, Jun 19, 2013 at 2:19 AM, Matthias Klose  wrote:
>
> so this did change the soname for libgo to 5 on the trunk, and to 4 on the
> branch.  We had this discussion before, and then decided to revert this kind of
> change on the 4.7 branch.  This time the release notes had a hint that the Go
> support would be updated to v1.1 in a bug fix release, so maybe it is ok.  Will
> this be the only soname bump on the way to Go 1.1 support, or are there more
> changes/version bumps planned on this way?

Yes, exactly, I've been promising Go 1.1 support on the 4.8 branch, so
I think this change is necessary.  The change in type layout makes the
library incompatible for functions that take function arguments.
Sorry for not calling it out.  This is the only soname bump planned.

Ian


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Richard Biener
On Wed, Jun 19, 2013 at 6:08 AM, Jeff Law  wrote:
>
> The notable changes since the last version:
>
> First, it should properly handle signed single bit types, though I haven't
> tested it with real code.
>
> Second, the transformation is only applied when the result is used in a
> conditional.  Thus it's much less likely to pessimize targets with and-not
> instructions as it's highly likely we'll eliminate two gimple statements
> rather than just one.
>
>
> Other comments (such as not needing to retrieve gsi_stmt) were also
> addressed.  Testcase was renamed, but is otherwise unchanged.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>
> OK for the trunk?

Ok.

Thanks,
Richard.
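
For reference, the single-bit identities that the patch quoted below
relies on can be checked exhaustively: (~X & Y) is X < Y and (~X | Y)
is X <= Y for unsigned single-bit values, with GT/GE taking their place
for signed ones.  This is only a standalone sanity check with made-up
function names, not part of the patch; it models a signed single-bit
type as the values 0 and -1.

```cpp
// Unsigned single-bit values are 0 and 1; bitwise NOT restricted to
// one bit is (x ^ 1).
bool unsigned_identities_hold() {
  for (unsigned x = 0; x <= 1; ++x)
    for (unsigned y = 0; y <= 1; ++y) {
      const unsigned not_x = x ^ 1u;
      if (((not_x & y) != 0) != (x < y))
        return false;
      if (((not_x | y) != 0) != (x <= y))
        return false;
    }
  return true;
}

// Signed single-bit values are 0 and -1.  For these two values plain
// int arithmetic already gives ~0 == -1 and ~(-1) == 0.
bool signed_identities_hold() {
  const int values[] = {0, -1};
  for (int x : values)
    for (int y : values) {
      if (((~x & y) != 0) != (x > y))
        return false;
      if (((~x | y) != 0) != (x >= y))
        return false;
    }
  return true;
}
```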

>
> * tree-ssa-forwprop.c (simplify_bitwise_binary_boolean): New
> function.
> (simplify_bitwise_binary): Use it to simplify certain binary ops on
> booleans.
>
> * gcc.dg/tree-ssa/forwprop-28.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> new file mode 100644
> index 0000000..2c42065
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> @@ -0,0 +1,76 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop1" } */
> +
> +extern char * frob (void);
> +extern _Bool testit(void);
> +
> +test (int code)
> +{
> +  char * temp = frob();;
> +  int rotate = (code == 22);
> +  if (temp == 0 && !rotate)
> +  oof();
> +}
> +
> +test_2 (int code)
> +{
> +  char * temp = frob();
> +  int rotate = (code == 22);
> +  if (!rotate && temp == 0)
> +  oof();
> +}
> +
> +
> +test_3 (int code)
> +{
> +  char * temp = frob();
> +  int rotate = (code == 22);
> +  if (!rotate || temp == 0)
> +  oof();
> +}
> +
> +
> +test_4 (int code)
> +{
> +  char * temp = frob();
> +  int rotate = (code == 22);
> +  if (temp == 0 || !rotate)
> +  oof();
> +}
> +
> +
> +test_5 (int code)
> +{
> +  _Bool temp = testit();;
> +  _Bool rotate = (code == 22);
> +  if (temp == 0 && !rotate)
> +  oof();
> +}
> +
> +test_6 (int code)
> +{
> +  _Bool temp = testit();
> +  _Bool rotate = (code == 22);
> +  if (!rotate && temp == 0)
> +  oof();
> +}
> +
> +
> +test_7 (int code)
> +{
> +  _Bool temp = testit();
> +  _Bool rotate = (code == 22);
> +  if (!rotate || temp == 0)
> +  oof();
> +}
> +
> +
> +test_8 (int code)
> +{
> +  _Bool temp = testit();
> +  _Bool rotate = (code == 22);
> +  if (temp == 0 || !rotate)
> +  oof();
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Replaced" 8 "forwprop1"} } */
> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
> index c6a7eaf..29a0bb7 100644
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -1870,6 +1870,52 @@ hoist_conversion_for_bitop_p (tree to, tree from)
>return false;
>  }
>
> +/* GSI points to a statement of the form
> +
> +   result = OP0 CODE OP1
> +
> +   Where OP0 and OP1 are single bit SSA_NAMEs and CODE is either
> +   BIT_AND_EXPR or BIT_IOR_EXPR.
> +
> +   If OP0 is fed by a bitwise negation of another single bit SSA_NAME,
> +   then we can simplify the two statements into a single LT_EXPR or LE_EXPR
> +   when code is BIT_AND_EXPR and BIT_IOR_EXPR respectively.
> +
> +   If a simplification is made, return TRUE, else return FALSE.  */
> +static bool
> +simplify_bitwise_binary_boolean (gimple_stmt_iterator *gsi,
> +enum tree_code code,
> +tree op0, tree op1)
> +{
> +  gimple op0_def_stmt = SSA_NAME_DEF_STMT (op0);
> +
> +  if (!is_gimple_assign (op0_def_stmt)
> +  || (gimple_assign_rhs_code (op0_def_stmt) != BIT_NOT_EXPR))
> +return false;
> +
> +  tree x = gimple_assign_rhs1 (op0_def_stmt);
> +  if (TREE_CODE (x) == SSA_NAME
> +  && INTEGRAL_TYPE_P (TREE_TYPE (x))
> +  && TYPE_PRECISION (TREE_TYPE (x)) == 1
> +  && TYPE_UNSIGNED (TREE_TYPE (x)) == TYPE_UNSIGNED (TREE_TYPE (op1)))
> +{
> +  enum tree_code newcode;
> +
> +  gimple stmt = gsi_stmt (*gsi);
> +  gimple_assign_set_rhs1 (stmt, x);
> +  gimple_assign_set_rhs2 (stmt, op1);
> +  if (code == BIT_AND_EXPR)
> +   newcode = TYPE_UNSIGNED (TREE_TYPE (x)) ? LT_EXPR : GT_EXPR;
> +  else
> +   newcode = TYPE_UNSIGNED (TREE_TYPE (x)) ? LE_EXPR : GE_EXPR;
> +  gimple_assign_set_rhs_code (stmt, newcode);
> +  update_stmt (stmt);
> +  return true;
> +}
> +  return false;
> +
> +}
> +
>  /* Simplify bitwise binary operations.
> Return true if a transformation applied, otherwise return false.  */
>
> @@ -2117,8 +2163,44 @@ simplify_bitwise_binary (gimple_stmt_iterator *gsi)
>   return true;
> }
> }
> -}
>
> +  /* If arg1 and arg2 are booleans (or any single bit type)
> + then try to simplify:
> +
> +  (~X & Y) -> X < Y
> +  (X & ~Y) -> Y < X
> +  (~X | Y) -> X <= Y
> +  (X | ~Y) -> Y <= X
> +
> + But only do this if our res

Re: [PATCH, rs6000] power8 patches, patch #9, power8 scheduling

2013-06-19 Thread David Edelsohn
On Fri, Jun 7, 2013 at 3:22 PM, Pat Haugen  wrote:
> This patch adds instruction scheduling support for the Power8 processor.
> Bootstrap/regression test with no new failures. Ok for trunk?
>
>
> 2013-06-07  Michael Meissner  
> Pat Haugen 
> Peter Bergner 
>
> * config/rs6000/power8.md: New.
> * config/rs6000/rs6000-cpus.def (RS6000_CPU table): Adjust processor
> setting for power8 entry.
> * config/rs6000/t-rs6000 (MD_INCLUDES): Add power8.md.
> * config/rs6000/rs6000.c (is_microcoded_insn, is_cracked_insn): Adjust
> test for Power4/Power5 only.
> (insn_must_be_first_in_group, insn_must_be_last_in_group): Add Power8
> support.
> (force_new_group): Adjust comment.
> * config/rs6000/rs6000.md: Include power8.md.

This patch is okay.

Thanks, David


Re: [PATCH] Provide a pointer_map template

2013-06-19 Thread Richard Biener
On Wed, 19 Jun 2013, Richard Biener wrote:

> 
> This templates the pointer-map implementation (actually it copies
> the implementation, leaving the old interface unchanged) providing
> a templated value type.  That's suitable to replace the various
> users requesting a pointer-to-integer-type map, like I noticed
> for the two LTO tree recording mechanisms.  Which in turn saves
> memory on 64bit hosts (and should be less heavy-weight on the cache).
> Not very much, but a quarter of the old pointer-map memory usage.
> 
> LTO bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> In theory we can typedef pointer_map pointer_map_t, but
> that requires touching all pointer_map_t users to drop the
> leading 'struct' and eventually include pointer-set.h.
> 
> I changed the insert () interface to get another output as to
> whether the slot was present to avoid the need to have a special
> "not present" value.  That also makes it unnecessary to zero
> the values array.
>
> Any comments?
> 
> If not then I'll comb over existing pointer -> integer type map
> users and convert them.

Added the dominance.c one and changed the implementation to
"inherit" from pointer-set instead, sharing a bit more code.

The pointer-map template is also type-safe for the value array
so converting all pointer-map users will make the code a tiny
bit prettier.
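
To illustrate the interface only, here is a small sketch of a
pointer-keyed map with a templated value type whose insert () also
reports whether the slot was already present.  It deliberately swaps in
std::unordered_map for the storage, so it is not the open-addressing
table from pointer-set.c, and the class name pointer_map_sketch is made
up.

```cpp
#include <unordered_map>

// Sketch of the interface: pointer keys, value type chosen by the user.
template <typename T>
class pointer_map_sketch {
 public:
  // Return the value slot for KEY, creating it if needed.  If EXISTED
  // is non-null, store whether the key was already in the map, so no
  // special "not present" value is required.
  T* insert(const void* key, bool* existed = nullptr) {
    auto result = slots_.emplace(key, T());
    if (existed)
      *existed = !result.second;
    return &result.first->second;
  }

  // Return the value slot for KEY, or a null pointer if it is absent.
  T* contains(const void* key) {
    auto it = slots_.find(key);
    return it == slots_.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<const void*, T> slots_;
};
```

Unlike the patch, this sketch value-initializes new slots, so it does
not show the saving from leaving the values array unzeroed.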

The remaining integer type cases seem to store integer types
as large as pointer types so they fall in the same category
(but eventually they chose that large type for no good reason).

Old patch LTO bootstrapped and tested on x86_64-unknown-linux-gnu.

Any objections?

Thanks,
Richard.

2013-06-19  Richard Biener  

* pointer-set.h (struct pointer_set_t): Move here from
pointer-set.c.
(pointer_set_lookup): Declare.
(class pointer_map): New template class implementing a
generic pointer to T map.
(pointer_map::pointer_map, pointer_map::~pointer_map,
pointer_map::contains, pointer_map::insert,
pointer_map::traverse): New functions.
* pointer-set.c (struct pointer_set_t): Moved to pointer-set.h.
(pointer_set_lookup): New function.
(pointer_set_contains): Use pointer_set_lookup.
(pointer_set_insert): Likewise.
(insert_aux): Remove.
(struct pointer_map_t): Embed a pointer_set_t.
(pointer_map_create): Adjust.
(pointer_map_destroy): Likewise.
(pointer_map_contains): Likewise.
(pointer_map_insert): Likewise.
(pointer_map_traverse): Likewise.
* tree-streamer.h (struct streamer_tree_cache_d): Use a
pointer_map instead of a pointer_map_t.
* tree-streamer.c (streamer_tree_cache_insert_1): Adjust.
(streamer_tree_cache_lookup): Likewise.
(streamer_tree_cache_create): Likewise.
(streamer_tree_cache_delete): Likewise.
* lto-streamer.h (struct lto_tree_ref_encoder): Use a
pointer_map instead of a pointer_map_t.
(lto_init_tree_ref_encoder): Adjust.
(lto_destroy_tree_ref_encoder): Likewise.
* lto-section-out.c (lto_output_decl_index): Likewise.
(lto_record_function_out_decl_state): Likewise.
* dominance.c (iterate_fix_dominators): Use pointer_map.

Index: gcc/pointer-set.c
===
*** gcc/pointer-set.c.orig  2013-06-19 12:28:49.0 +0200
--- gcc/pointer-set.c   2013-06-19 13:52:49.172792131 +0200
*** along with GCC; see the file COPYING3.
*** 21,41 
  #include "system.h"
  #include "pointer-set.h"
  
- /* A pointer set is represented as a simple open-addressing hash
-table.  Simplifications: The hash code is based on the value of the
-pointer, not what it points to.  The number of buckets is always a
-power of 2.  Null pointers are a reserved value.  Deletion is not
-supported (yet).  There is no mechanism for user control of hash
-function, equality comparison, initial size, or resizing policy.  */
- 
- struct pointer_set_t
- {
-   size_t log_slots;
-   size_t n_slots; /* n_slots = 2^log_slots */
-   size_t n_elements;
- 
-   const void **slots;
- };
  
  /* Use the multiplicative method, as described in Knuth 6.4, to obtain
 a hash code for P in the range [0, MAX).  MAX == 2^LOGMAX.
--- 21,26 
*** hash1 (const void *p, unsigned long max,
*** 67,72 
--- 52,58 
return ((A * (uintptr_t) p) >> shift) & (max - 1);
  }
  
+ 
  /* Allocate an empty pointer set.  */
  struct pointer_set_t *
  pointer_set_create (void)
*** pointer_set_destroy (struct pointer_set_
*** 89,108 
XDELETE (pset);
  }
  
- /* Returns nonzero if PSET contains P.  P must be nonnull.
  
!Collisions are resolved by linear probing.  */
! int
! pointer_set_contains (const struct pointer_set_t *pset, const void *p)
  {
size_t n = hash1 (p, pset->n_slots, pset->log_slots);
  
while (true)
  {
  

Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window

2013-06-19 Thread Richard Earnshaw

On 18/06/13 17:22, Meador Inge wrote:

Ping.

On 06/06/2013 01:23 PM, Meador Inge wrote:

On 06/06/2013 08:11 AM, Richard Earnshaw wrote:


I understand (and agree with) this bit...


+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+(compare:CC (match_operand:SI 0 "register_operand" "")
+(match_operand:SI 1 "arm_rhs_operand" "")))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+  (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+  (set (match_dup 2) (const_int 1)))
+   (match_scratch:SI 3 "r")]
+  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
+  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
+   (parallel
+[(set (reg:CC CC_REGNUM)
+  (compare:CC (const_int 0) (match_dup 3)))
+ (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
+   (set (match_dup 2)
+(plus:SI (plus:SI (match_dup 2) (match_dup 3))
+ (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
+


... but what's this bit about?


The original intent was to revert back to the original peephole pattern
(pre-PR 46975) when the CC reg is still live, but that doesn't properly
maintain the CC state either (it just happened to pass in the test
case I was looking at because I only cared about the Z flag, which is
maintained the same).

OK with the above bit left out?






Sorry for the delay, I've been sidetracked onto other things.

Having looked at this patch I realized that we were missing a trick on 
ARMv5 and later, when a more efficient sequence exists, particularly for 
Cortex-A15.  By using CLZ we can avoid the need to set the condition 
code register at all, which gives us far more scheduling freedom.  It's 
also best not to unnecessarily clobber the condition code register even 
if there are other instructions in the sequence that do set/use the 
flags (the peepholer pass right at the end will do this optimization 
when it is useful), so I've tweaked some of the existing alternatives as 
well.
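
The two flag-free and flag-light sequences named in the patch comments (clz/lsr for ARMv5, negs/adc otherwise) can be checked with a small C model. This is only a sketch: model_clz, eq_zero_clz and eq_zero_negs_adc are hypothetical helper names, and the CLZ instruction is modelled by hand because __builtin_clz is undefined for a zero argument.

```c
#include <stdint.h>

/* Model of the ARM CLZ instruction: returns 32 for a zero input.  */
static unsigned
model_clz (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Rd = (x == 0) as "clz Rd, x; lsr Rd, Rd, #5".  Only a count of 32
   has bit 5 set, so the shift yields 1 for zero and 0 otherwise.
   No condition flags are involved.  */
static unsigned
eq_zero_clz (uint32_t x)
{
  return model_clz (x) >> 5;
}

/* Rd = (x == 0) as "negs Rd, x; adc Rd, Rd, x".  NEGS computes 0 - x
   and sets the carry flag when there is no borrow, i.e. only when
   x == 0.  ADC then adds Rd + x + carry, and Rd + x wraps to 0.  */
static unsigned
eq_zero_negs_adc (uint32_t x)
{
  uint32_t rd = (uint32_t) (0u - x);     /* negs Rd, x */
  unsigned carry = (x == 0);             /* C flag after 0 - x */

  return (uint32_t) (rd + x) + carry;    /* adc Rd, Rd, x */
}
```

Both helpers return 1 for a zero input and 0 for anything else; the first one never touches the (modelled) condition flags, which is the scheduling freedom mentioned above.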


Finally, we can use peep2_regno_dead_p (rather than peep2_reg_dead_p) to 
avoid having to create extra match_operand values.


The result is that I've now committed the patch below.

R.

2013-06-19  Richard Earnshaw  

arm.md (split for eq(reg, 0)): Add variants for ARMv5 and
Thumb2.
(peepholes for eq(reg, not-0)): Ensure condition register is
dead after pattern.  Use more efficient sequences on ARMv5 and
Thumb2.
--- gcc/config/arm/arm.md   (revision 200187)
+++ gcc/config/arm/arm.md   (local)
@@ -10021,6 +10021,16 @@ (define_split
(eq:SI (match_operand:SI 1 "s_register_operand" "")
   (const_int 0)))
(clobber (reg:CC CC_REGNUM))]
+  "arm_arch5 && TARGET_32BIT"
+  [(set (match_dup 0) (clz:SI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))]
+)
+
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand" "")
+   (eq:SI (match_operand:SI 1 "s_register_operand" "")
+  (const_int 0)))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT && reload_completed"
   [(parallel
 [(set (reg:CC CC_REGNUM)
@@ -10090,29 +10100,87 @@ (define_insn_and_split "*compare_scc"
 
 ;; Attempt to improve the sequence generated by the compare_scc splitters
 ;; not to use conditional execution.
+
+;; Rd = (eq (reg1) (const_int0))  // ARMv5
+;; clz Rd, reg1
+;; lsr Rd, Rd, #5
 (define_peephole2
   [(set (reg:CC CC_REGNUM)
(compare:CC (match_operand:SI 1 "register_operand" "")
-   (match_operand:SI 2 "arm_rhs_operand" "")))
+   (const_int 0)))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+ (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+ (set (match_dup 0) (const_int 1)))]
+  "arm_arch5 && TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
+  [(set (match_dup 0) (clz:SI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))]
+)
+
+;; Rd = (eq (reg1) (const_int0))  // !ARMv5
+;; negs Rd, reg1
+;; adc  Rd, Rd, reg1
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+   (compare:CC (match_operand:SI 1 "register_operand" "")
+   (const_int 0)))
(cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
  (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
(cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
  (set (match_dup 0) (const_int 1)))
-   (match_scratch:SI 3 "r")]
-  "TARGET_32BIT"
+   (match_scratch:SI 2 "r")]
+  "TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
   [(parallel
 [(set (reg:CC CC_REGNUM)
- (compare:CC (match_dup 1) (match_dup 2)))
- (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
+ (compare:CC (const_int 0) (match_dup 1)))
+ (set (match_dup 2) (minus:SI (const_int 0) (match_dup 1)))])
+   (set (match_dup 0)
+   (pl

Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Matthias Klose
Am 19.06.2013 14:03, schrieb Matthias Klose:
> $ gcc-ar-4.8 -h
> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'
> 
> the plugin is *not* installed with x permission flags (which seems to be the
> standard for shared libraries).   You did change the code to use find_a_file
> which searches only for files with the x bit set.
> 
> Work around is to install the plugin with the x bits set, or use some helper
> function which doesn't look for the x bits.  I assume that wasn't caught,
> because the plugin then was found in another location?

opened 57651 for that.

now, working around the permission bit, I get:

$ gcc-ar-4.8
gcc-ar-4.8: Cannot find binary 'ar'

so it only searches for ar in the given paths, not on the path of the file system
(/usr/bin in this case)



Re: [Committed] S/390: PR57609 fix - use next_active_insn instead of next_real_insn

2013-06-19 Thread Andreas Krebbel
On 18/06/13 19:06, Steven Bosscher wrote:
> BTW I don't understand how a label satisfying the following can be
> true for a label before a jump table:
> 
>   if (LABEL_P (insn)
>   && (LABEL_PRESERVE_P (insn) || LABEL_NAME (insn)))
> 
> LABEL_PRESERVE_P should never be set on a label before a
> JUMP_TABLE_DATA, and LABEL_NAME should be NULL.

Actually LABEL_PRESERVE_P appears to be set on quite a few of the jump table 
data labels.  Example
from compiling fold-const.c:

(code_label/s 1285 1284 1286 9315 "" [3 uses])

(jump_table_data 1286 1285 1287 (addr_vec:SI [
(label_ref:SI 54875)
(label_ref:SI 63283)
(label_ref:SI 63283)
(label_ref:SI 63283)
(label_ref:SI 63283)
(label_ref:SI 63283)
(label_ref:SI 63283) ...

Hardware watchpoint 5: table_label->in_struct

Old value = 0
New value = 1
force_const_mem (mode=SImode, x=0x7c0c27c0) at /build/gcc-head/gcc/varasm.c:3699
3699  return copy_rtx (def);
(gdb) bt
#0  force_const_mem (mode=SImode, x=0x7c0c27c0) at 
/build/gcc-head/gcc/varasm.c:3699
#1  0x009ada54 in emit_move_insn (x=0x7bdf44b0, y=0x7c0c27c0) at 
/build/gcc-head/gcc/expr.c:3499
#2  0x010b3e6a in gen_casesi (operand0=0x7bdf42d0, operand1=0x7d7e24e0, 
operand2=0x7c0c20e0,
operand3=0x7b3c4488, operand4=0x7b3c43c0) at 
/build/gcc-head/gcc/config/s390/s390.md:8588
#3  0x00c0c70e in maybe_gen_insn (icode=CODE_FOR_casesi, nops=5, ops=0x7fffe3a8)
at /build/gcc-head/gcc/optabs.c:8219
#4  0x00c0c92a in maybe_expand_jump_insn (icode=CODE_FOR_casesi, nops=5, 
ops=0x7fffe3a8)
at /build/gcc-head/gcc/optabs.c:8257
#5  0x00c0c9ee in expand_jump_insn (icode=CODE_FOR_casesi, nops=5, 
ops=0x7fffe3a8)
at /build/gcc-head/gcc/optabs.c:8283
#6  0x009ca5ec in try_casesi (index_type=0x7d7e6420, index_expr=0x7bca4168, 
minval=0x7d889300,
range=0x7d156920, table_label=0x7b3c4488, default_label=0x7b3c43c0, 
fallback_label=0x7b3c43e8,
default_probability=) at /build/gcc-head/gcc/expr.c:10967
#7  0x00d18016 in emit_case_dispatch_table (index_expr=0x7bca4168, 
index_type=0x7d7e6420,
case_list=0x1a31d58, default_label=0x7b3c43c0, minval=0x7d889300, 
maxval=0x7d0ad180,
range=0x7d156920, stmt_bb=0x7bf96000) at /build/gcc-head/gcc/stmt.c:1933
#8  0x00d18ef4 in expand_case (stmt=0x7d7dc800) at 
/build/gcc-head/gcc/stmt.c:2207

> Better yet would be to use tablejump_p instead of examining the
> pattern by hand, e.g.:

Ok. Better. I've applied your patch after testing it. Thanks!

Bye,

-Andreas-

> 
> Index: s390.c
> ===
> --- s390.c  (revision 200173)
> +++ s390.c  (working copy)
> @@ -7023,7 +7023,7 @@ s390_chunkify_start (void)
>if (LABEL_P (insn)
>   && (LABEL_PRESERVE_P (insn) || LABEL_NAME (insn)))
> {
> - rtx vec_insn = next_active_insn (insn);
> + rtx vec_insn = NEXT_INSN (insn);
>   if (! vec_insn || ! JUMP_TABLE_DATA_P (vec_insn))
> bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (insn));
> }
> @@ -7033,6 +7033,8 @@ s390_chunkify_start (void)
>else if (JUMP_P (insn))
> {
>rtx pat = PATTERN (insn);
> +  rtx table;
> +
>   if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 2)
> pat = XVECEXP (pat, 0, 0);
> 
> @@ -7046,28 +7048,18 @@ s390_chunkify_start (void)
> bitmap_set_bit (far_labels, CODE_LABEL_NUMBER (label));
> }
>  }
> - else if (GET_CODE (pat) == PARALLEL
> -  && XVECLEN (pat, 0) == 2
> -  && GET_CODE (XVECEXP (pat, 0, 0)) == SET
> -  && GET_CODE (XVECEXP (pat, 0, 1)) == USE
> -  && GET_CODE (XEXP (XVECEXP (pat, 0, 1), 0)) == LABEL_REF)
> -   {
> - /* Find the jump table used by this casesi jump.  */
> - rtx vec_label = XEXP (XEXP (XVECEXP (pat, 0, 1), 0), 0);
> - rtx vec_insn = next_active_insn (vec_label);
> - if (vec_insn && JUMP_TABLE_DATA_P (vec_insn))
> -   {
> - rtx vec_pat = PATTERN (vec_insn);
> - int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
> -
> - for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
> -   {
> - rtx label = XEXP (XVECEXP (vec_pat, diff_p, i), 0);
> -
> - if (s390_find_pool (pool_list, label)
> - != s390_find_pool (pool_list, insn))
> -   bitmap_set_bit (far_labels, CODE_LABEL_NUMBER 
> (label));
> -   }
> + else if (tablejump_p (insn, NULL, &table))
> +   {
> + rtx vec_pat = PATTERN (table);
> + int i, diff_p = GET_CODE (vec_pat) == ADDR_DIFF_VEC;
> +
> + for (i = 0; i < XVECLEN (vec_pat, diff_p); i++)
> +   {
> + rtx label = XEXP (XVECEXP (ve

Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 02:03:34PM +0200, Matthias Klose wrote:
> Am 27.11.2012 19:14, schrieb Meador Inge:
> > On 11/26/2012 09:05 AM, Richard Biener wrote:
> > 
> >> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge  
> >> wrote:
> >>> Ping ^ 4.
> >>
> >> Ok.
> > 
> > Thanks for the review.  Committed to trunk.
> 
> This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch.
> 
> $ gcc-ar-4.8 -h
> gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'
> 
> the plugin is *not* installed with x permission flags (which seems to be the
> standard for shared libraries).   You did change the code to use find_a_file
> which searches only for files with the x bit set.

That actually is the standard for shared libraries, the linker creates
libraries with those permissions and libtool/automake installs them that way
too.  So if you override this, you need to cope with that decision.

Jakub


Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2013-06-19 Thread Matthias Klose
Am 27.11.2012 19:14, schrieb Meador Inge:
> On 11/26/2012 09:05 AM, Richard Biener wrote:
> 
>> On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge  
>> wrote:
>>> Ping ^ 4.
>>
>> Ok.
> 
> Thanks for the review.  Committed to trunk.

This did break gcc-ar and gcc-nm; now a regression on the 4.8 branch.

$ gcc-ar-4.8 -h
gcc-ar-4.8: Cannot find plugin 'liblto_plugin.so'

the plugin is *not* installed with x permission flags (which seems to be the
standard for shared libraries).   You did change the code to use find_a_file
which searches only for files with the x bit set.

Work around is to install the plugin with the x bits set, or use some helper
function which doesn't look for the x bits.  I assume that wasn't caught,
because the plugin then was found in another location?
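
The difference between the two lookup policies can be sketched in C as follows. This is not GCC's find_a_file; search_dirs is a hypothetical helper that only assumes the lookup boils down to an access () check with either X_OK or R_OK.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not GCC's find_a_file: look for NAME in each of
   the N directories in DIRS and return the index of the first directory
   where access () grants MODE (X_OK or R_OK), or -1 if none does.  */
static int
search_dirs (const char *const *dirs, int n, const char *name, int mode)
{
  char path[4096];
  int i;

  for (i = 0; i < n; i++)
    {
      int len = snprintf (path, sizeof path, "%s/%s", dirs[i], name);

      if (len < 0 || (size_t) len >= sizeof path)
        continue;               /* path too long, skip this directory */
      if (access (path, mode) == 0)
        return i;
    }
  return -1;
}
```

With a plugin installed as 0644, a search using X_OK skips it while a search using R_OK finds it; after chmod 0755 both succeed. That matches the two workarounds mentioned above.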

  Matthias



Re: [ARM][Insn classification refactoring 2/N] Update instruction classification documentation

2013-06-19 Thread Richard Earnshaw

On 18/06/13 15:47, Sofiane Naci wrote:

Hi,

This patch updates the documentation for "type" attribute. It complements
the changes proposed in the previous patch

OK for trunk?

-
Thanks
Sofiane=



OK.





Re: [PATCH] Enable non-complex math builtins from C99 for Bionic

2013-06-19 Thread Alexander Ivchenko
> I don't see how a target hook is required for the command-line idea.
> Targets already have a perfectly working way of changing the default
> of a command-line option.

That's true.. sorry, my bad.

Anyway, could somebody take a look at the patch itself?

--Alexander



>> 2013/4/23 Alexander Ivchenko :
>>> *ping*
>>>
>>> thanks
>>> Alexander
>>>
>>> 2013/3/28 Alexander Ivchenko :
 Hi,

 4.8 is now branched, let's come back to the discussion that we had
 before. I updated the patch a little
 bit since we now have linux-protos.h and linux-android.c files.

 I tried to preserve the availability of c99 for all targets, but it's
 pretty difficult, because we are changing
 the defaults. Passing an empty string as second argument doesn't look
 very good, but on the other hand
 the user has one clear way for checking the presence of a certain
 function. But of course we can create
 another function, that will call targetm.libc_has_function
 (function_class, "") within itself.

 best regards,
 Alexander

 2013/1/7 Joseph S. Myers :
> On Fri, 21 Dec 2012, Alexander Ivchenko wrote:
>
>> Hi,
>>
>> Thank you very much for your input! Please, take a look at the updated 
>> version:
>> I fixed coding style, moved documentation for TARGET_LIBC_HAS_FUNCTION
>> to target.def.
>> Removed TARGET_C99_FUNCTIONS and TARGET_HAS_SINCOS and all their
>> influence and moved the implementation of linux_libc_has_function to
>> host-linux.c.
>>   I changed the defaults: now it is assumed that we have C99 runtime,
>> but no sincos. I updated all needed gcc/config/*.h. But I'm not sure about
>> this part,
>> because I don't have the opportunity to test it properly...
>
> This patch seems mostly plausible, though there are various places that
> call targetm.libc_has_function with an empty string as second argument,
> that should be naming the specific function instead.  I haven't reviewed
> the details, and at this development stage I think it will need to wait
> until after 4.8 branches.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


[PATCH] Provide a pointer_map template

2013-06-19 Thread Richard Biener

This templates the pointer-map implementation (actually it copies
the implementation, leaving the old interface unchanged) providing
a templated value type.  That's suitable to replace the various
users requesting a pointer-to-integer-type map, like I noticed
for the two LTO tree recording mechanisms.  Which in turn saves
memory on 64bit hosts (and should be less heavy-weight on the cache).
Not very much, but a quarter of the old pointer-map memory usage.

LTO bootstrap and regtest running on x86_64-unknown-linux-gnu.

In theory we can typedef pointer_map pointer_map_t, but
that requires touching all pointer_map_t users to drop the
leading 'struct' and eventually include pointer-set.h.

I changed the insert () interface to get another output as to
whether the slot was present to avoid the need to have a special
"not present" value.  That also makes it unnecessary to zero
the values array.
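
The insert () idea above can be sketched in plain C. This is a simplified model under stated assumptions, not the patch itself: small_pmap, hash_ptr, pmap_lookup and pmap_insert are hypothetical names, the table has a fixed size with no resizing, and the hash constant is not the one pointer-set.c uses. With a 4-byte value type each slot costs 8 + 4 bytes on a 64bit host rather than 8 + 8, which is presumably where the quarter comes from.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_SLOTS 4
#define N_SLOTS ((size_t) 1 << LOG_SLOTS)

/* Fixed-size pointer -> unsigned map.  Null keys mark empty slots.
   There is no resizing in this sketch, so callers must keep it below
   N_SLOTS entries.  */
struct small_pmap
{
  const void *keys[N_SLOTS];
  unsigned values[N_SLOTS];     /* insert never needs these zeroed */
};

/* Multiplicative hash in the spirit of Knuth 6.4.  The constant is an
   assumption of this sketch.  */
static size_t
hash_ptr (const void *p)
{
  uint64_t h = (uint64_t) (uintptr_t) p * UINT64_C (0x9E3779B97F4A7C15);

  return (size_t) (h >> (64 - LOG_SLOTS));
}

/* Return true if nonnull P has a slot, false otherwise.  Either way
   *IX is the slot P occupies or would occupy on insertion.
   Collisions are resolved by linear probing.  */
static bool
pmap_lookup (const struct small_pmap *m, const void *p, size_t *ix)
{
  size_t n = hash_ptr (p);

  while (m->keys[n] != NULL && m->keys[n] != p)
    n = (n + 1) & (N_SLOTS - 1);
  *ix = n;
  return m->keys[n] == p;
}

/* Return the value slot for nonnull P, creating it if needed.
   *EXISTED tells the caller whether the slot already holds a value,
   so no special "not present" value is required.  */
static unsigned *
pmap_insert (struct small_pmap *m, const void *p, bool *existed)
{
  size_t ix;

  *existed = pmap_lookup (m, p, &ix);
  if (!*existed)
    m->keys[ix] = p;
  return &m->values[ix];
}
```

A caller checks *EXISTED after pmap_insert and only initializes the returned slot when it is false, so the values array can stay uninitialized.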

Any comments?

If not then I'll comb over existing pointer -> integer type map
users and convert them.

Thanks,
Richard.


2013-06-19  Richard Biener  

* pointer-set.h (struct pointer_map_base): New struct.
(class pointer_map): New template class implementing a
generic pointer to T map.
(pointer_map::pointer_map, pointer_map::~pointer_map,
pointer_map::contains, pointer_map::insert,
pointer_map::traverse): New functions.
* pointer-set.c (pointer_map_base::lookup): New function.
* tree-streamer.h (struct streamer_tree_cache_d): Use a
pointer_map instead of a pointer_map_t.
* tree-streamer.c (streamer_tree_cache_insert_1): Adjust.
(streamer_tree_cache_lookup): Likewise.
(streamer_tree_cache_create): Likewise.
(streamer_tree_cache_delete): Likewise.
* lto-streamer.h (struct lto_tree_ref_encoder): Use a
pointer_map instead of a pointer_map_t.
(lto_init_tree_ref_encoder): Adjust.
(lto_destroy_tree_ref_encoder): Likewise.
* lto-section-out.c (lto_output_decl_index): Likewise.
(lto_record_function_out_decl_state): Likewise.

Index: gcc/pointer-set.c
===
*** gcc/pointer-set.c   (revision 200189)
--- gcc/pointer-set.c   (working copy)
*** void pointer_map_traverse (const struct
*** 301,303 
--- 301,335 
  if (pmap->keys[i] && !fn (pmap->keys[i], &pmap->values[i], data))
break;
  }
+ 
+ 
+ 
+ /* Lookup the slot for the pointer P and return true if it exists,
+otherwise return false in which case *IX points to the slot that
+would be used on insertion.  */
+ 
+ bool
+ pointer_map_base::lookup (const void *p, size_t *ix)
+ {
+   size_t n = hash1 (p, n_slots, log_slots);
+ 
+   while (true)
+ {
+   if (keys[n] == p)
+   {
+ *ix = n;
+ return true;
+   }
+   else if (keys[n] == 0)
+   {
+ *ix = n;
+ return false;
+   }
+   else
+{
+  ++n;
+  if (n == n_slots)
+n = 0;
+}
+ }
+ }
Index: gcc/pointer-set.h
===
*** gcc/pointer-set.h   (revision 200189)
--- gcc/pointer-set.h   (working copy)
*** void pointer_set_traverse (const struct
*** 30,42 
   bool (*) (const void *, void *),
   void *);
  
  struct pointer_map_t;
! struct pointer_map_t *pointer_map_create (void);
! void pointer_map_destroy (struct pointer_map_t *pmap);
  
! void **pointer_map_contains (const struct pointer_map_t *pmap, const void *p);
! void **pointer_map_insert (struct pointer_map_t *pmap, const void *p);
! void pointer_map_traverse (const struct pointer_map_t *,
   bool (*) (const void *, void **, void *), void *);
  
  #endif  /* POINTER_SET_H  */
--- 30,166 
   bool (*) (const void *, void *),
   void *);
  
+ 
+ struct pointer_map_base
+ {
+   size_t log_slots;
+   size_t n_slots; /* n_slots = 2^log_slots */
+   size_t n_elements;
+   const void **keys;
+ 
+   bool lookup (const void *p, size_t *n);
+ };
+ 
+ /* A pointer map is represented the same way as a pointer_set, so
+the hash code is based on the address of the key, rather than
+its contents.  Null keys are a reserved value.  Deletion is not
+supported (yet).  There is no mechanism for user control of hash
+function, equality comparison, initial size, or resizing policy.  */
+ 
+ template <typename T>
+ class pointer_map : protected pointer_map_base
+ {
+   T *values;
+ 
+ public:
+   pointer_map ();
+   ~pointer_map ();
+   T *contains (const void *p);
+   T *insert (const void *p, bool *existed_p = NULL);
+   void traverse (bool (*fn) (const void *, T *, void *), void *data);
+ };
+ 
+ /* Allocate an empty pointer map.  */
+ template <typename T>
+ pointer_map<T>::pointer_map (void)
+ {
+   n_elements = 0;
+   log_slots = 8;
+   n_slots = 

Re: Go patch committed: Use function descriptors

2013-06-19 Thread Matthias Klose
Am 19.06.2013 01:50, schrieb Ian Lance Taylor:
> This patch to gccgo changes the representation of values of function
> type.  They used to be a pointer to function code, like a C function
> pointer.  They are now a pointer to a struct.  The first field of the
> struct points to the function code.  The remaining fields, if any, are
> the addresses of variables referenced in enclosing functions.  For each
> call to a function, the address of the function descriptor is passed as
> the last argument.
> 
> This lets us avoid generating trampolines, and removes the use
> of writable/executable sections of the heap.
> 
> This is also a prerequisite to a new Go 1.1 feature, method values.
> 
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline and 4.8 branch.
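
The representation described in the quoted message can be modelled in C roughly as below. This is only an illustration of the idea (descriptor struct, code pointer first, descriptor passed as last argument); the struct and function names are hypothetical and say nothing about the layout gccgo actually emits.

```c
/* A function value is a pointer to a descriptor.  The first field is
   the code address; any further fields are addresses of variables
   referenced in enclosing functions.  */
struct func_desc
{
  int (*code) (int arg, struct func_desc *self);
};

struct adder_desc
{
  struct func_desc base;   /* must come first */
  int *captured;           /* address of a variable in the enclosing scope */
};

/* Body of a closure that adds the captured variable to ARG.  It finds
   its captured state through the descriptor passed as last argument,
   so no trampoline is needed.  */
static int
adder_code (int arg, struct func_desc *self)
{
  struct adder_desc *d = (struct adder_desc *) self;

  return arg + *d->captured;
}

/* A plain function has a descriptor with no extra fields and simply
   ignores the descriptor argument.  */
static int
twice_code (int arg, struct func_desc *self)
{
  (void) self;
  return 2 * arg;
}

/* Every call goes through the descriptor, closure or not.  */
static int
call_func (struct func_desc *f, int arg)
{
  return f->code (arg, f);
}
```

Because the captured state travels with the descriptor, nothing has to be generated at run time in writable and executable memory.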

so this did change the soname for libgo to 5 on the trunk, and to 4 on the
branch.  We had this discussion before, and then decided to revert this kind of
change on the 4.7 branch.  This time the release notes had a hint that the Go
support would be updated to v1.1 in a bug fix release, so maybe it is ok.  Will
this the only soname bump on the way to Go 1.1 support, or are there more
changes/version bumps planned on this way?

  Matthias



Re: [PATCH, ARM] Reintroduce minipool ranges for zero-extension insn patterns

2013-06-19 Thread Eric Botcazou
> The following patch removed pool_range/neg_pool_range attributes from
> several instructions as a cleanup, which I believe to have been
> incorrect:
> 
> http://gcc.gnu.org/ml/gcc-patches/2010-07/msg01036.html
> 
> On a Mentor-local branch, this caused problems with instructions like:
> 
> (insn 77 53 87 (set (reg:SI 8 r8 [orig:197 s.4 ] [197])
> (zero_extend:SI (mem/u/c:HI (symbol_ref/u:SI ("*.LC0") [flags 0x2])
> [7 S2 A16]))) [...] 161 {*arm_zero_extendhisi2_v6} (nil))
> 
> The reasoning behind the cleanup was that the instructions in question
> have no immediate constraints -- but the minipool code is used for more
> than just immediates, e.g. in the above case where a symbol reference
> ("m") is loaded.

Probably the most reported ARM bug (PR target/49423) and a clear regression.

> I don't have a test case for the problem on mainline at present, but I
> believe it is still a latent bug. Tested with the default multilibs (ARM
> & Thumb mode) on arm-none-eabi, with no regressions. (The patch has
> also been tested with more multilibs on our local branches for a while,
> and I did ensure previously that it did not adversely affect Bernd's
> patch linked above.)

Can you attach it to PR target/49423?  Anybody doing serious testing with an 
ARM compiler will run into it in some configuration so it would be nice to 
have a single source for the fix (although that ought to be the tree...).

-- 
Eric Botcazou


Re: PING [C++ docs patch] PR 56544

2013-06-19 Thread Paolo Carlini
... I have now committed this simple doc update. Also, a 4_8-branch 
version, attached below.


Thanks,
Paolo.


2013-06-19  Paolo Carlini  

PR c++/56544
* doc/cpp.texi [Standard Predefined Macros, __cplusplus]: Document
that now in C++ the value is correct per the C++ standards.
Index: doc/cpp.texi
===
--- doc/cpp.texi(revision 200192)
+++ doc/cpp.texi(working copy)
@@ -1926,11 +1926,9 @@ facilities of the standard C library available.
 This macro is defined when the C++ compiler is in use.  You can use
 @code{__cplusplus} to test whether a header is compiled by a C compiler
 or a C++ compiler.  This macro is similar to @code{__STDC_VERSION__}, in
-that it expands to a version number.  A fully conforming implementation
-of the 1998 C++ standard will define this macro to @code{199711L}.  The
-GNU C++ compiler is not yet fully conforming, so it uses @code{1}
-instead.  It is hoped to complete the implementation of standard C++
-in the near future.
+that it expands to a version number.  Depending on the language standard
+selected, the value of the macro is @code{199711L}, as mandated by the
+1998 C++ standard, or @code{201103L}, per the 2011 C++ standard.
 
 @item __OBJC__
 This macro is defined, with value 1, when the Objective-C compiler is in


Re: [PATCH] Add command line parsing of -fsanitize

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 10:45:28AM +0200, Richard Biener wrote:
> Btw, how to handle the issue with LTO and different -fsanitize options
> at compile vs. link-time?  Can TUs without -fsanitize options be LTO
> linked with -fsanitize?  Then lto-wrapper should union -fsanitize
> options from all TUs to the final link.  I hope all -fsanitize options can
> be mixed freely and will properly combine.

You can mix the options at compile time, you can't mix -fsanitize=thread
with -fsanitize=address linking, because libasan.* and libtsan.* are
runtime incompatible, each one uses different incompatible virtual memory
layout.  But of course, if you compile something with -fsanitize=thread,
something else -fsanitize=address, then link, either with -fsanitize=thread
or -fsanitize=address, it will most likely not link (because the other
library will not be there).
The undefined stuff (lots of options) Marek is working on are orthogonal to
this, that library just contains tons of fancy printfs and aborts to
complain about various issues, but can coexist with libasan or libtsan.
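
The linking rules discussed in this thread can be sketched as a bitmask computation in C. Only flag_sanitize and SANITIZE_ADDRESS are names taken from the thread; the other bit names, their values and all helper functions below are assumptions of the sketch, not GCC's option handling.

```c
#include <string.h>

/* Bit values are assumptions; only SANITIZE_ADDRESS is named in the
   thread.  "undefined" stands for the whole group of undefined
   behaviour checks, of which "shift" is one.  */
enum
{
  SANITIZE_ADDRESS = 1 << 0,
  SANITIZE_THREAD = 1 << 1,
  SANITIZE_SHIFT = 1 << 2,
  SANITIZE_OTHER_UB = 1 << 3,
  SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_OTHER_UB
};

/* Map one comma-separated item of length LEN to its bits, 0 if unknown.  */
static unsigned
sanitize_bits (const char *item, size_t len)
{
  static const struct { const char *name; unsigned bits; } table[] = {
    { "address", SANITIZE_ADDRESS },
    { "thread", SANITIZE_THREAD },
    { "shift", SANITIZE_SHIFT },
    { "undefined", SANITIZE_UNDEFINED }
  };
  size_t i;

  for (i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strlen (table[i].name) == len
        && strncmp (table[i].name, item, len) == 0)
      return table[i].bits;
  return 0;
}

/* Apply one -fsanitize=... or -fno-sanitize=... option to MASK and
   return the new mask.  Other options leave MASK unchanged.  */
static unsigned
apply_sanitize_option (unsigned mask, const char *opt)
{
  static const char on[] = "-fsanitize=";
  static const char off[] = "-fno-sanitize=";
  const char *list;
  int enable;

  if (strncmp (opt, on, sizeof on - 1) == 0)
    {
      list = opt + sizeof on - 1;
      enable = 1;
    }
  else if (strncmp (opt, off, sizeof off - 1) == 0)
    {
      list = opt + sizeof off - 1;
      enable = 0;
    }
  else
    return mask;

  while (*list != '\0')
    {
      size_t len = strcspn (list, ",");
      unsigned bits = sanitize_bits (list, len);

      mask = enable ? (mask | bits) : (mask & ~bits);
      list += len;
      if (*list == ',')
        list++;
    }
  return mask;
}

/* Link decisions based on the final mask only.  */
static int
needs_libasan (unsigned mask)
{
  return (mask & SANITIZE_ADDRESS) != 0;
}

static int
needs_libubsan (unsigned mask)
{
  return (mask & SANITIZE_UNDEFINED) != 0;
}

static int
asan_tsan_conflict (unsigned mask)
{
  return (mask & SANITIZE_ADDRESS) != 0 && (mask & SANITIZE_THREAD) != 0;
}
```

Feeding the options in command-line order and deciding from the final mask gives -lasan for -fsanitize=shift,address,undefined, no -lasan for -fsanitize=address followed by -fno-sanitize=undefined,address,shift, and a conflict only when address and thread are both set at the end.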

Jakub


Re: [PATCH] Add command line parsing of -fsanitize

2013-06-19 Thread Richard Biener
On Tue, Jun 18, 2013 at 10:25 PM, Jakub Jelinek  wrote:
> On Tue, Jun 18, 2013 at 04:42:51PM +0200, Marek Polacek wrote:
>> Ok, should be done now (together with other nit-fixes).
>> Regtested/bootstrapped on x86_64-linux, ok for trunk?
>
> Looks good to me, the only thing I'm worried about are how this
> interferes with the
> %{fsanitize=address:...}
> and
> %{fsanitize=thread:...}
> bits in gcc.c.  Because we should link in -lasan even for
> -fsanitize=shift,address,undefined
> and should not link in -lasan for
> -fsanitize=address -fno-sanitize=undefined,address,shift
> (generally, what we have guarded right now with
> %{fsanitize=address:...}
> should be done if flag_sanitize & SANITIZE_ADDRESS is going to be
> true in the end, etc., and we'll need to link in
> -lubsan  whenever at least one of the undefined options are set in the
> bitmask.  -lubsan isn't incompatible with -lasan nor -ltsan, but -lasan
> and -ltsan are incompatible.
>
> Joseph, any thoughts how to deal with this?

Btw, how to handle the issue with LTO and different -fsanitize options
at compile vs. link-time?  Can TUs without -fsanitize options be LTO
linked with -fsanitize?  Then lto-wrapper should union -fsanitize
options from all TUs to the final link.  I hope all -fsanitize options can
be mixed freely and will properly combine.

Richard.

> Jakub


Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote:
> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek  wrote:
> > On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
> >>  Right, as you did for other cases. It works here as well.
> >
> > Patch preapproved.
> 
> I wonder how much code breaks these days when we enable -fno-common by
> default? ...

Somebody would need to try it ;).  From vectorization POV, it surely would
be better if -fno-common was the default.

Jakub


Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Richard Biener
On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek  wrote:
> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>  Right, as you did for other cases. It works here as well.
>
> Patch preapproved.

I wonder how much code breaks these days when we enable -fno-common by
default? ...

Richard.


Re: [PATCH] PowerPC: Fix test case for PR55033

2013-06-19 Thread Chung-Ju Wu
2013/6/18 Sebastian Huber :
> Hello Chung-Ju,
>
>
> On 06/18/2013 05:12 AM, Chung-Ju Wu wrote:
>>
>> 2013/6/18 David Edelsohn :
>>>
>>> gcc/testsuite/ChangeLog
>>> 2013-06-17  Sebastian Huber  
>>>
>>> PR target/55033
>>> * gcc.target/powerpc/pr55033.c: Fix options.
>>>
>>> Okay.
>>>
>>> Thanks, David
>>>
>>> P.S. Please explicitly copy me on patches.
>>
>>
>> Hi, Sebastian,
>>
>> Since David has approved your patch,
>> do you need me to help commit this fix again?
>> I'd be happy to do this for you. :)
>
>
> yes, please commit it for me.  Thanks.
>
> --
> Sebastian Huber, embedded brains GmbH
>

Committed to trunk as revision 200191.
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=200191


Best regards,
jasonwucj


Re: [PATCH] Add command line parsing of -fsanitize

2013-06-19 Thread Jakub Jelinek
On Tue, Jun 18, 2013 at 10:45:53PM +, Joseph S. Myers wrote:
> On Tue, 18 Jun 2013, Jakub Jelinek wrote:
> 
> > On Tue, Jun 18, 2013 at 04:42:51PM +0200, Marek Polacek wrote:
> > > Ok, should be done now (together with other nit-fixes).
> > > Regtested/bootstrapped on x86_64-linux, ok for trunk?
> > 
> > Looks good to me, the only thing I'm worried about are how this
> > interferes with the
> > %{fsanitize=address:...}
> > and
> > %{fsanitize=thread:...}
> > bits in gcc.c.  Because we should link in -lasan even for
> > -fsanitize=shift,address,undefined
> > and should not link in -lasan for
> > -fsanitize=address -fno-sanitize=undefined,address,shift
> > (generally, what we have guarded right now with
> > %{fsanitize=address:...}
> > should be done if flag_sanitize & SANITIZE_ADDRESS is going to be
> > true in the end, etc., and we'll need to link in
> > -lubsan  whenever at least one of the undefined options are set in the
> > bitmask.  -lubsan isn't incompatible with -lasan nor -ltsan, but -lasan
> > and -ltsan are incompatible.
> > 
> > Joseph, any thoughts how to deal with this?
> 
> Try defining a new spec function or functions that uses flag_sanitize to 
> determine what linker arguments to pass?  Since the option handling is in 
> opts.c it should get run in the driver so flag_sanitize should be set 
> correctly there; as long as the specs in question run after the relevant 
> option processing, a spec function should work for this.

While it would be possible to define say %:sanitize(thread LIBTSAN_EARLY)
that would work roughly like %{fsanitize=thread:LIBTSAN_EARLY} worked
until now (variable number of arguments that would be concatenated together
if flag_sanitize & ..., otherwise return empty), we use e.g. %e inside
of the %{fsanitize=thread:...} etc.
So, I wonder if we couldn't extend the handle_braces, I think right now
empty atoms are disallowed for the first choice, so perhaps
%{%:function(args):...}
where %:function(args) would be expanded to either non-empty or empty string
and depending on that the condition would be then true resp. false.
As % is not considered part of the atom name, and we require after atom name
optional * and then only one of |, }, &, :, I think this wouldn't be
ambiguous in the grammar.
We could then have:
%{!%:function1():-lfoo;%:function2(bar baz):-lbar -lbaz;-lxxx}
and for the sanitizer purposes:
%{%:sanitize(address):LIBTSAN_EARLY}
%{!nostdlib:%{!nodefaultlibs:%{%:sanitize(address):" LIBASAN_SPEC "\
%{static:%ecannot specify -static with -fsanitize=address}\
%{%:sanitize(thread):%e-fsanitize=address is incompatible with 
-fsanitize=thread}}\
%{%:sanitize(thread):" LIBTSAN_SPEC "\
%{!pie:%{!shared:%e-fsanitize=thread linking must be done with -pie or
%-shared}"

Jakub


Re: [PATCH] Re-write LTO type merging again, do tree merging

2013-06-19 Thread Richard Biener
On Tue, 18 Jun 2013, Andi Kleen wrote:

> On Tue, Jun 18, 2013 at 08:04:15PM +0200, Andi Kleen wrote:
> > > Just confirmed with the small build. It does. Running the large build 
> > > now.
> > 
> > Large build worked too.
> 
> Also it seems to be drastically faster. I haven't done a proper 
> measurement run, but the initial run was 58% faster than 4.8,
> using 42% less peak RSS.

That was the intent.  As a side-effect it should also behave
correctly and not have weird effects on dwarf2out.c expectations.

Richard.


[PATCH] Fix LTO kernel build ICE

2013-06-19 Thread Richard Biener

As reported by Andi the following trivial fix fixes the ICE.

Committed as obvious.

Richard.

2013-06-19  Richard Biener  

* expr.c (expand_expr_real_1): Use SCOPE_FILE_SCOPE_P to check
for global context.

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 200189)
+++ gcc/expr.c  (working copy)
@@ -9353,7 +9353,7 @@ expand_expr_real_1 (tree exp, rtx target
   /* Variables inherited from containing functions should have
 been lowered by this point.  */
   context = decl_function_context (exp);
-  gcc_assert (!context
+  gcc_assert (SCOPE_FILE_SCOPE_P (context)
  || context == current_function_decl
  || TREE_STATIC (exp)
  || DECL_EXTERNAL (exp)


Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>  Right, as you did for other cases. It works here as well.

Patch preapproved.

Jakub


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Chung-Ju Wu
2013/6/19 Jakub Jelinek :
> On Wed, Jun 19, 2013 at 03:02:38PM +0800, Chung-Ju Wu wrote:
>> In the GNU coding standards we put a space before
>> the opening parenthesis.  Would it be good to have the
>> testcase follow this convention as well? :)
>>
>> If so, then...
>
> Testcases generally don't need to follow the coding conventions.  They
> can, and it is often nicer if they do, but we certainly also want
> testcases that don't follow them; otherwise we wouldn't have testsuite
> coverage for other coding styles (what if, say, a parser or
> preprocessor didn't work properly when there is no space between the
> function name and ( ?).
>
> Jakub

That makes sense.  Thanks for clarifying it. :)


Best regards,
jasonwucj


Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Igor Zamyatin
 Right, as you did for other cases. It works here as well.

Thanks,
Igor

On Wed, Jun 19, 2013 at 11:05 AM, Jakub Jelinek  wrote:
> On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote:
>> The change also affects the vectorizer in the AVX case, which can be
>> seen with the gcc.dg/tree-ssa/loop-19.c test.
>>
>> After the change the report says
>>
>> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
>> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
>> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
>> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
>> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]
>>
>> AFAICS the first condition in ix86_data_alignment was true before the
>> change, so 256 was the return value.
>>
>> Do we need to tweak this test also?
>
> I'd add -fno-common to the test.
>
> Jakub


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 03:02:38PM +0800, Chung-Ju Wu wrote:
> In the GNU coding standards we put a space before
> the opening parenthesis.  Would it be good to have the
> testcase follow this convention as well? :)
> 
> If so, then...

Testcases generally don't need to follow the coding conventions.  They
can, and it is often nicer if they do, but we certainly also want
testcases that don't follow them; otherwise we wouldn't have testsuite
coverage for other coding styles (what if, say, a parser or
preprocessor didn't work properly when there is no space between the
function name and ( ?).

Jakub
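
As a small illustration of the point about coding styles, the sketch
below writes the same call in several layouts that a conforming
parser and preprocessor must treat identically.  The names twice,
TWICE and call_styles_agree are made up for this example and do not
come from the forwprop-28.c testcase.

```c
#include <assert.h>

/* Hypothetical helper, only here to have something to call.  */
int
twice (int x)
{
  return 2 * x;
}

/* Function-like macro; at the point of use a space before the
   parenthesis is allowed as well.  */
#define TWICE(x) (2 * (x))

/* Return nonzero if every call layout yields the same value.  */
int
call_styles_agree (int v)
{
  int a = twice (v);   /* GNU style: space before the parenthesis.  */
  int b = twice(v);    /* No space.  */
  int c = twice
    (v);               /* Newline between name and parenthesis.  */
  int d = TWICE (v);   /* Macro invocation with a space.  */
  int e = TWICE(v);    /* Macro invocation without a space.  */

  return a == b && b == c && c == d && d == e;
}
```

A testsuite that only ever used the first layout would never
exercise the other four.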


Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Jakub Jelinek
On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote:
> The change also affects the vectorizer in the AVX case, which can be
> seen with the gcc.dg/tree-ssa/loop-19.c test.
> 
> After the change the report says
> 
> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]
> 
> AFAICS the first condition in ix86_data_alignment was true before the
> change, so 256 was the return value.
> 
> Do we need to tweak this test also?

I'd add -fno-common to the test.

Jakub


Re: [PATCH] Improve folding of bitwise ops feeding conditionals for single bit types

2013-06-19 Thread Chung-Ju Wu
2013/6/19 Jeff Law :
>
> * gcc.dg/tree-ssa/forwprop-28.c: New test.
>

In the GNU coding standards we put a space before
the opening parenthesis.  Would it be good to have the
testcase follow this convention as well? :)

If so, then...

>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> new file mode 100644
> index 000..2c42065
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-28.c
> @@ -0,0 +1,76 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop1" } */
> +
> +extern char * frob (void);
> +extern _Bool testit(void);

Missing a space before '('.

> +
> +test (int code)
> +{
> +  char * temp = frob();;

Likewise.  And redundant ';'.

> +  int rotate = (code == 22);
> +  if (temp == 0 && !rotate)
> +  oof();

Likewise.

> +}
> +
> +test_2 (int code)
> +{
> +  char * temp = frob();

Likewise.

> +  int rotate = (code == 22);
> +  if (!rotate && temp == 0)
> +  oof();

Likewise.

And there are similar cases that can be fixed in test_3, test_4,
test_5, test_6, test_7, and test_8.


Best regards,
jasonwucj


Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-19 Thread Igor Zamyatin
The change also affects the vectorizer in the AVX case, which can be
seen with the gcc.dg/tree-ssa/loop-19.c test.

After the change the report says

loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]

AFAICS the first condition in ix86_data_alignment was true before the
change, so 256 was the return value.

Do we need to tweak this test also?

Thanks,
Igor
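
For reference, a rough model of the alignment split discussed in this
thread.  This is only a sketch under stated assumptions and not the
varasm.c or i386.c code: the toy_* names are hypothetical, the
32-byte threshold for the 256-bit optimization increase is an
assumption made for the example, and every object is treated as an
array.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a variable.  */
struct toy_decl
{
  unsigned size_bytes;        /* Size of the object.  */
  bool binds_to_current_def;  /* False for e.g. common or weak vars.  */
};

/* ABI-mandated part: arrays of 16 bytes or more get at least
   128 bits.  */
unsigned
toy_abi_align (const struct toy_decl *d, unsigned align_bits)
{
  if (d->size_bytes >= 16 && align_bits < 128)
    return 128;
  return align_bits;
}

/* Optimization part on top of the ABI part (assumed threshold).  */
unsigned
toy_opt_align (const struct toy_decl *d, unsigned align_bits)
{
  align_bits = toy_abi_align (d, align_bits);
  if (d->size_bytes >= 32 && align_bits < 256)
    return 256;
  return align_bits;
}

/* Alignment that code referring to the variable may assume.  */
unsigned
toy_assumed_align (const struct toy_decl *d, unsigned align_bits)
{
  return d->binds_to_current_def
	 ? toy_opt_align (d, align_bits)
	 : toy_abi_align (d, align_bits);
}

/* Alignment used when emitting the definition itself; the
   optimization increase is always safe here.  */
unsigned
toy_emitted_align (const struct toy_decl *d, unsigned align_bits)
{
  return toy_opt_align (d, align_bits);
}
```

In this model a 4000-byte common array is still emitted with 256-bit
alignment, but referring code may assume only 128 bits, which matches
the "can't force alignment" notes above.  With -fno-common the arrays
become ordinary definitions, so both values are 256 again.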

> Hi!
>
> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls for 
> optimization purposes beyond ABI mandated levels.  It is fine to emit the 
> vars aligned as much as we want for optimization purposes, but if we can't be 
> sure that references to that decl bind to the definition we increased the 
> alignment on (e.g. common variables, or -fpic code without hidden visibility, 
> weak vars etc.), we can't assume that alignment.
> As DECL_ALIGN is used both for the alignment emitted for definitions and for 
> the alignment assumed by code referring to them, this patch increases 
> DECL_ALIGN only on decls where decl_binds_to_current_def_p is true; otherwise 
> the optimization increase on top of that is applied only when emitting the 
> alignment of the definition.
> On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI 
> mandated alignment increase, so I've introduced a new macro, 
> DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64 I 
> think the only one is that arrays with size 16 bytes or more (and VLAs, but 
> that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other 
> targets, I've kept them all using DATA_ALIGNMENT, which is considered 
> optimization increase only now, if there is some ABI mandated alignment 
> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as well 
> as DATA_ALIGNMENT.  The patch causes some vectorization regressions (tweaked 
> in the testsuite), especially for common vars where we used to align say 
> common arrays to 256 bits rather than the ABI mandated 128 bits, or for -fpic 
> code, but I'm afraid we need to live with that, if you compile another file 
> with say icc or some other compiler which doesn't increase alignment beyond 
> ABI mandated level and that other file defines the var say as non-common, we 
> have wrong-code.
>
> 2013-06-07  Jakub Jelinek  
>
> PR target/56564
> * varasm.c (align_variable): Don't use DATA_ALIGNMENT or
> CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
> Use DATA_ABI_ALIGNMENT for that case instead if defined.
> (get_variable_align): New function.
> (get_variable_section, emit_bss, emit_common,
> assemble_variable_contents, place_block_symbol): Use
> get_variable_align instead of DECL_ALIGN.
> (assemble_noswitch_variable): Add align argument, use it
> instead of DECL_ALIGN.
> (assemble_variable): Adjust caller.  Use get_variable_align
> instead of DECL_ALIGN.
> * config/i386/i386.h (DATA_ALIGNMENT): Adjust ix86_data_alignment
> caller.
> (DATA_ABI_ALIGNMENT): Define.
> * config/i386/i386-protos.h (ix86_data_alignment): Adjust prototype.
> * config/i386/i386.c (ix86_data_alignment): Add opt argument.  If
> opt is false, only return the psABI mandated alignment increase.
> * doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
> * doc/tm.texi: Regenerated.
>
> * gcc.target/i386/pr56564-1.c: New test.
> * gcc.target/i386/pr56564-2.c: New test.
> * gcc.target/i386/pr56564-3.c: New test.
> * gcc.target/i386/pr56564-4.c: New test.
> * gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
> * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
> * gcc.target/i386/vect-sizes-1.c: Likewise.
> * gcc.target/i386/memcpy-1.c: Likewise.
> * gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
> * gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.
>