Re: [PATCH] Fix PR64822: incorrect folding of bitfield in union on big endian targets

2015-02-03 Thread Jakub Jelinek
On Wed, Feb 04, 2015 at 10:45:11AM +0800, Thomas Preud'homme wrote:
> 2015-01-30  Thomas Preud'homme  
> 
>   PR middle-end/62103
>   * tree-ssa-sccvn.c (fully_constant_vn_reference_p): Use TYPE_PRECISION
>   to compute size of referenced value in the constant case.
> 
> 2015-01-30  Thomas Preud'homme  
> 
>   PR middle-end/62103
>   * gcc.c-torture/execute/bitfld-7.c: New test adapted from bitfld-6.c
>   to use 24 bits for bitfield b.

Richard already acked it with the new testcase, so yes, this is ok for the
trunk (just use today's date).

Jakub


[PR64817-related 3/3] simplify xor of (and or ior) of xor

2015-02-03 Thread Alexandre Oliva
PR64817's testcase creates a long chain of XOR of AND of XOR of AND of...
that our rtl simplifiers can't simplify, although it is simplifiable.

combine manages to simplify the generated insns that perform the
computation represented by such chains, but simplify-rtx doesn't.
That's because combine notices that, after the first and with a
constant, some bits are necessarily zero, and so it can simplify
subsequent redundant insns.

simplify-rtx doesn't do that, though.  I suppose it could try to compute
the nonzero bits of its operands, but doing so would likely lead to
exponential behavior on the depth of the expression.

We can still simplify the expressions to a great extent, though: for
each sequence of xor of and of xor of whatever, with constant second
operands for each of the 3 operations, we can get rid of the innermost
xor by tweaking the operands of the outermost one.  We can also do that
if the intervening operation is an ior rather than an and, though with a
slightly different adjustment for the second operand of the outermost
xor.
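
To make the two rewrites concrete, here is a minimal sketch in plain C
rather than RTL.  The function names are made up for illustration and are
not part of the patch; 8-bit values stand in for the mode so that every
value of A can be checked for a given triple of constants.

```c
#include <stdint.h>

/* Hypothetical helpers, not GCC code.  B, C and D play the role of the
   three constant operands; A is the non-constant innermost operand.  */

/* (xor (and (xor A B) C) D) */
static uint8_t and_form_before (uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
  return (uint8_t) (((a ^ b) & c) ^ d);
}

/* (xor (and A C) (B&C)^D): the inner xor is folded into the outer constant.  */
static uint8_t and_form_after (uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
  return (uint8_t) ((a & c) ^ ((b & c) ^ d));
}

/* (xor (ior (xor A B) C) D) */
static uint8_t ior_form_before (uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
  return (uint8_t) (((a ^ b) | c) ^ d);
}

/* (xor (ior A C) (B&~C)^D): only bits not forced to 1 by C keep B's flip.  */
static uint8_t ior_form_after (uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
  return (uint8_t) ((a | c) ^ ((b & ~c) ^ d));
}

/* Return 1 if both rewrites agree with the original form for every
   8-bit value of A, given constants B, C and D; return 0 otherwise.  */
int rewrites_agree (uint8_t b, uint8_t c, uint8_t d)
{
  for (unsigned a = 0; a < 256; a++)
    {
      uint8_t av = (uint8_t) a;
      if (and_form_before (av, b, c, d) != and_form_after (av, b, c, d))
        return 0;
      if (ior_form_before (av, b, c, d) != ior_form_after (av, b, c, d))
        return 0;
    }
  return 1;
}
```

Calling rewrites_agree (14, 231, 14) exercises the constants used in the
pr64817 testcases.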

With this patch, the huge rtl debug location expressions in the
testcases for pr64817 simplify just as well as combine manages to
simplify them.

I'm a bit surprised the gimple layer does not even attempt to simplify
them, but I didn't try to tackle that, since I was not even sure this
was a useful optimization.  After all, how often do we see xor of and of
xor of and of xor of... in the wild, rather than in pathological
testcases? :-)  But hey, at least the rtl simplification is cheap, so
why not?

Regstrapped on x86_64-linux-gnu and i686-pc-linux-gnu.  Ok to install?

for  gcc/ChangeLog

* simplify-rtx.c (simplify_binary_operation_1): Simplify one
of two XORs that have an intervening AND or IOR.
---
 gcc/simplify-rtx.c |   33 +
 1 file changed, 33 insertions(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index bea9ec3..04452c6 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2708,6 +2708,39 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
XEXP (op0, 1), mode),
op1);
 
+  /* Given (xor (ior (xor A B) C) D), where B, C and D are
+constants, simplify to (xor (ior A C) (B&~C)^D), canceling
+out bits inverted twice and not set by C.  Similarly, given
+(xor (and (xor A B) C) D), simplify without inverting C in
+the xor operand: (xor (and A C) (B&C)^D).
+  */
+  else if ((GET_CODE (op0) == IOR || GET_CODE (op0) == AND)
+  && GET_CODE (XEXP (op0, 0)) == XOR
+  && CONST_INT_P (op1)
+  && CONST_INT_P (XEXP (op0, 1))
+  && CONST_INT_P (XEXP (XEXP (op0, 0), 1)))
+   {
+ enum rtx_code op = GET_CODE (op0);
+ rtx a = XEXP (XEXP (op0, 0), 0);
+ rtx b = XEXP (XEXP (op0, 0), 1);
+ rtx c = XEXP (op0, 1);
+ rtx d = op1;
+ HOST_WIDE_INT bval = INTVAL (b);
+ HOST_WIDE_INT cval = INTVAL (c);
+ HOST_WIDE_INT dval = INTVAL (d);
+ HOST_WIDE_INT xcval;
+
+ if (op == IOR)
+   xcval = ~cval;
+ else
+   xcval = cval;
+
+ return simplify_gen_binary (XOR, mode,
+ simplify_gen_binary (op, mode, a, c),
+ gen_int_mode ((bval & xcval) ^ dval,
+   mode));
+   }
+
   /* Given (xor (and A B) C), using P^Q == (~P&Q) | (~Q&P),
 we can transform like this:
 (A&B)^C == ~(A&B)&C | ~C&(A&B)


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PR64817-related 2/3] don't alloc rtl when failing to simplify XOR of AND

2015-02-03 Thread Alexandre Oliva
We have a problem in simplify_binary_operation_1 that causes memory
waste in general, and memory explosion in pathological testcases such as
that of PR64817, with large exprs amounting to XOR of AND of XOR of AND
of...

I believe rtl simplifiers are not supposed to allocate rtl before
committing to simplifying the expr at hand.  The code that simplifies XORs
of ANDs, whose second operands are both constants, used to do this.

In the pathological rtl resulting from the PR64817 testcases, the
recursive attempts to simplify the large exprs with lots of such
opportunities end up calling simplify_gen_unary O(2^n) times, where n is
the number of XORs in the large expr, each call allocating memory that
will ultimately not be used because simplification is not possible.

This patch rearranges the simplification so as to avoid the early
allocation.  First, we attempt to simplify a negated variant of the
temporary exprs we used to generate, that avoids the need for
simplify_gen_unary and covers the second case in the simplification just
as well as the original code.

If we take the first case, however, even if the negated version failed
to simplify, we know we can simplify the whole thing by combining the
constants, so we negate the sub-expression simplified before, or create
the non-negated sub-expression only if the negation of its operand
simplifies successfully.  This should cover all cases we covered before,
but without allocating rtl before committing to a simplification.
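
The trick of probing the negated form rests on De Morgan's law:
~A&C == ~(A|~C), so ~A&C is zero exactly when A|~C is all ones.  A minimal
sketch in plain C, with unsigned integers standing in for rtx operands (the
function name is hypothetical and not part of the patch):

```c
#include <stdint.h>

/* Illustrative only: check, for one pair of values, the relation the
   patch relies on between ~A&C and its negation A|~C.  */
int demorgan_holds (uint32_t a, uint32_t c)
{
  uint32_t na_c = ~a & c;    /* the sub-expression the old code built eagerly */
  uint32_t n_na_c = a | ~c;  /* its negation, which needs no NOT of A */

  /* The two are bitwise complements of each other...  */
  if (na_c != (uint32_t) ~n_na_c)
    return 0;

  /* ...so "~A&C is zero" and "A|~C is -1" are the same condition.  */
  return (na_c == 0) == (n_na_c == UINT32_MAX);
}
```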

I took a cursory look at the other simplification paths and couldn't
find any similar problem in them.

This fixes the memory explosion problem in var-tracking exposed by the
testcase in patch 1/3 of this series, as well as by the original PR64817
testcase with the fix in patch 1/3.

Regstrapped on x86_64-linux-gnu and i686-pc-linux-gnu.  Ok to install?


for  gcc/ChangeLog

PR debug/64817
* simplify-rtx.c (simplify_binary_operation_1): Rewrite
simplification of XOR of AND to not allocate new rtx before
committing to a simplification.
---
 gcc/simplify-rtx.c |   29 -
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 5c9e3bf..bea9ec3 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2724,12 +2724,31 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
  HOST_WIDE_INT bval = INTVAL (b);
  HOST_WIDE_INT cval = INTVAL (c);
 
- rtx na_c
-   = simplify_binary_operation (AND, mode,
-simplify_gen_unary (NOT, mode, a, mode),
-c);
+ /* Instead of computing ~A&C, we compute its negated value,
+~(A|~C).  If it yields -1, ~A&C is zero, so we can
+optimize for sure.  If it does not simplify, we still try
+to compute ~A&C below, but since that always allocates
+RTL, we don't try that before committing to returning a
+simplified expression.  */
+ rtx n_na_c = simplify_binary_operation (IOR, mode, a,
+ GEN_INT (~cval));
+
  if ((~cval & bval) == 0)
{
+ rtx na_c = NULL_RTX;
+ if (n_na_c)
+   na_c = simplify_gen_unary (NOT, mode, n_na_c, mode);
+ else
+   {
+ /* If ~A does not simplify, don't bother: we don't
+want to simplify 2 operations into 3, and if na_c
+were to simplify with na, n_na_c would have
+simplified as well.  */
+ rtx na = simplify_unary_operation (NOT, mode, a, mode);
+ if (na)
+   na_c = simplify_gen_binary (AND, mode, na, c);
+   }
+
  /* Try to simplify ~A&C | ~B&C.  */
  if (na_c != NULL_RTX)
return simplify_gen_binary (IOR, mode, na_c,
@@ -2738,7 +2757,7 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
  else
{
  /* If ~A&C is zero, simplify A&(~C&B) | ~B&C.  */
- if (na_c == const0_rtx)
+ if (n_na_c == CONSTM1_RTX (mode))
{
  rtx a_nc_b = simplify_gen_binary (AND, mode, a,
gen_int_mode (~cval & bval,


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PR64817-related 1/3] fix debug expr expand of compares

2015-02-03 Thread Alexandre Oliva
PR64817 is lucky that the compare in the testcase was <0 rather than
>0.  expand_debug_expr used to take the signedness of the expr from the
result type, rather than from the operand types, so the a < 0 test
was expanded as LTU rather than LT, and LTU compares with zero are
always false, so we short-circuited the most complex debug exprs out.
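
As a rough illustration of why the signedness matters, here is a sketch in
plain C of the two flavours of compare against zero.  The function names
are invented for this example; they model LT and LTU, not the rtl itself.

```c
/* Signed compare against zero, analogous to LT: true for negative values.  */
int lt_zero (int a)
{
  return a < 0;
}

/* Unsigned compare against zero, analogous to LTU: no unsigned value is
   below zero, so this is always 0 and a compiler may fold it away.  */
int ltu_zero (unsigned int a)
{
  return a < 0u;
}
```

With the same bit pattern, lt_zero (-1) yields 1 while
ltu_zero ((unsigned int) -1) yields 0.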

Reversing the sense of the test is enough to expose them.  Even though
Jakub's earlier patch for PR64817 avoided the simplify-rtx problems
during cfgexpand, var-track still runs afoul of it once the sense of the
compare is reversed, as in the testcase below.

This patch makes the situation with a<0 as bad as with a>0.  Fixing that
is left for patch 2 of this series; maybe the testcase should be
installed along with it.  In my series, this patch was actually #3,
that's why I had the testcase with it.  However, I found it easier to
explain the problem starting out with this one, and including the
testcase.  I can split out the testcase into a separate patch or install
it along with the second patch in the series, if that makes more sense.


While staring at the code, I found a couple of typos in comments in
cfgexpand, that I fixed as part of this patch.  I can split that out
too, if I must.


Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?


for  gcc/ChangeLog

PR debug/64817
* cfgexpand.c (expand_debug_expr): Compute unsignedp from
operands for tcc_comparison exprs.  Fix typos.

for  gcc/testsuite/ChangeLog

PR debug/64817
* gcc.dg/pr64817-3.c: New.
---
 gcc/cfgexpand.c  |9 ++---
 gcc/testsuite/gcc.dg/pr64817-3.c |   13 +
 2 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr64817-3.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 12021de0..7dfe1f6 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3911,7 +3911,6 @@ expand_debug_expr (tree exp)
 
 binary:
 case tcc_binary:
-case tcc_comparison:
   op1 = expand_debug_expr (TREE_OPERAND (exp, 1));
   if (!op1)
return NULL_RTX;
@@ -3925,6 +3924,10 @@ expand_debug_expr (tree exp)
return NULL_RTX;
   break;
 
+case tcc_comparison:
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0)));
+  goto binary;
+
 case tcc_type:
 case tcc_statement:
   gcc_unreachable ();
@@ -4006,7 +4009,7 @@ expand_debug_expr (tree exp)
op0 = copy_rtx (op0);
 
   if (GET_MODE (op0) == BLKmode
- /* If op0 is not BLKmode, but BLKmode is, adjust_mode
+ /* If op0 is not BLKmode, but mode is, adjust_mode
 below would ICE.  While it is likely a FE bug,
 try to be robust here.  See PR43166.  */
  || mode == BLKmode
@@ -5285,7 +5288,7 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
 
if (have_debug_uses)
  {
-   /* OP is a TERed SSA name, with DEF it's defining
+   /* OP is a TERed SSA name, with DEF its defining
   statement, and where OP is used in further debug
   instructions.  Generate a debug temporary, and
   replace all uses of OP in debug insns with that
diff --git a/gcc/testsuite/gcc.dg/pr64817-3.c b/gcc/testsuite/gcc.dg/pr64817-3.c
new file mode 100644
index 000..3fe0117
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64817-3.c
@@ -0,0 +1,13 @@
+/* PR debug/64817 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -g" } */
+
+int a;
+
+void
+foo (void)
+{
+  int e;
+  a = 
((a
 & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) 
& 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 231) ^ 14) & 
231) ^ 14) & 231) ^ 14) &

Re: [PATCH] Add top-level config support for gold mips target

2015-02-03 Thread Cary Coutant
Ping^3. Should I be addressing this to someone else?

-cary

On Mon, Dec 1, 2014 at 2:15 PM, Cary Coutant  wrote:
> Ping^2.
>
> -cary
>
> On Wed, Oct 29, 2014 at 12:04 PM, Cary Coutant  wrote:
>> Ping?
>>
>> On Mon, Oct 20, 2014 at 10:31 AM, Cary Coutant  wrote:
>>> This patch adds support for the mips target in gold.
>>>
>>> OK to commit?
>>>
>>> -cary
>>>
>>>
>>> 2014-10-20  Cary Coutant  
>>>
>>> * configure (--enable-gold): Add mips*-*-*.
>>> * configure.ac: Regenerate.
>>>
>>>
>>> Index: configure
>>> ===
>>> --- configure   (revision 216487)
>>> +++ configure   (working copy)
>>> @@ -2941,7 +2941,7 @@ case "${ENABLE_GOLD}" in
>>># Check for target supported by gold.
>>>case "${target}" in
>>>  i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
>>> -| aarch64*-*-* | tilegx*-*-*)
>>> +| aarch64*-*-* | tilegx*-*-* | mips*-*-*)
>>>   configdirs="$configdirs gold"
>>>   if test x${ENABLE_GOLD} = xdefault; then
>>> default_ld=gold
>>> Index: configure.ac
>>> ===
>>> --- configure.ac(revision 216487)
>>> +++ configure.ac(working copy)
>>> @@ -332,7 +332,7 @@ case "${ENABLE_GOLD}" in
>>># Check for target supported by gold.
>>>case "${target}" in
>>>  i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
>>> -| aarch64*-*-* | tilegx*-*-*)
>>> +| aarch64*-*-* | tilegx*-*-* | mips*-*-*)
>>>   configdirs="$configdirs gold"
>>>   if test x${ENABLE_GOLD} = xdefault; then
>>> default_ld=gold


Re: PR lto/64837: lto plugin doesn't call ld_plugin_release_input_file

2015-02-03 Thread Cary Coutant
The plugin is not supposed to call release_input_file from the
claim_file handler. That interface is only for releasing a file
descriptor obtained via get_input_file during the all_symbols_read
callback. When the linker calls the claim_file handler, the file
descriptor is open, and the plugin is required to leave it open; the
linker manages the file descriptor at that point. The
get_input_file/release_input_file pair of interfaces was added later,
for the benefit of another (non-LTO) plugin (although I think the LLVM
LTO plugin does use that pair during the all_symbols_read callback).

This is described here:

   https://gcc.gnu.org/wiki/whopr/driver

If you're going to insist on calling the release_input_file API from
the claim_file handler, I'm going to have to fix gold to ignore the
call to avoid a premature unlock of the object file.

-cary



On Wed, Jan 28, 2015 at 4:02 PM, H.J. Lu  wrote:
> On Wed, Jan 28, 2015 at 11:37 AM, H.J. Lu  wrote:
>> On Wed, Jan 28, 2015 at 11:19 AM, Richard Biener
>>  wrote:
>>> On January 28, 2015 7:12:43 PM CET, "H.J. Lu"  wrote:
>>>> Hi,
>>>>
>>>> This patch makes claim_file_handler call release_input_file after it
>>>> finishes processing the input file.  OK for trunk?
>>>
>>> OK.  How did you test this?
>>
>> I did normal bootstrap and "make check" on Linux/x86-64.
>> I also run ld.bfd and ld.gold by hand to verify that release_input_file
>> is called.
>>
>
> This is needed for LTO build.  ar/nm/ranlib don't provide
> release_input_file.  I checked it in as an obvious fix.
>
> --
> H.J.
> ---
> Index: ChangeLog
> ===
> --- ChangeLog (revision 220212)
> +++ ChangeLog (working copy)
> @@ -1,5 +1,10 @@
>  2015-01-28  H.J. Lu  
>
> + * lto-plugin.c (claim_file_handler): Call release_input_file only
> + if it is not NULL.
> +
> +2015-01-28  H.J. Lu  
> +
>   PR lto/64837
>   * lto-plugin.c (release_input_file): New.
>   (claim_file_handler): Call release_input_file.
> Index: lto-plugin.c
> ===
> --- lto-plugin.c (revision 220212)
> +++ lto-plugin.c (working copy)
> @@ -1007,7 +1007,8 @@ claim_file_handler (const struct ld_plug
>if (obj.objfile)
>  simple_object_release_read (obj.objfile);
>
> -  release_input_file (file);
> +  if (release_input_file)
> +release_input_file (file);
>
>return LDPS_OK;
>  }


Re: r219977 - in /trunk/gcc: ChangeLog config/rs600...

2015-02-03 Thread David Edelsohn
On Tue, Feb 3, 2015 at 5:55 PM, Andreas Schwab  wrote:
> FAIL: gcc.dg/builtins-58.c scan-assembler-not pow
>
> $ grep pow builtins-58.s
> .machine power4

Any suggestions?

- David


Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2015-02-03 Thread Bin.Cheng
On Wed, Feb 4, 2015 at 12:28 AM, Jeff Law  wrote:
> On 02/03/15 01:29, Bin.Cheng wrote:
>>
>>
>> Hmm, if I understand correctly, it's a code size regression, so I
>> don't think it's appropriate to adapt the test case.  Either the patch
>> or something else in GCC is doing wrong, right?
>>
>> Hi Alex, could you please file a PR with full dump information for
>> tracking?
>
> But if the code size regression is due to the older compiler incorrectly
> handling the promotion of REG_EQUAL to REG_EQUIV notes, then the test
> absolutely does need updating as the codesize was dependent on incorrect
> behaviour in the compiler.

Hi Jeff,

I looked into the test and can confirm the previous compilation is correct.
The cover letter of this patch said IRA mis-handled REG_EQUIV before,
but in this case it is REG_EQUAL that is lost.  The full dump (without
this patch) after IRA is like:

   10: NOTE_INSN_BASIC_BLOCK 2
2: r116:SI=r0:SI
3: r117:SI=r1:SI
  REG_DEAD r1:SI
4: r118:SI=r2:SI
  REG_DEAD r2:SI
5: NOTE_INSN_FUNCTION_BEG
   12: r2:SI=0x1
   13: r1:SI=0
   15: r0:SI=call [`lseek'] argc:0
  REG_DEAD r2:SI
  REG_DEAD r1:SI
  REG_CALL_DECL `lseek'
   16: r111:SI=r0:SI
  REG_DEAD r0:SI
   17: r2:SI=0x2
   18: r1:SI=0
   19: r0:SI=r116:SI
  REG_DEAD r116:SI
   20: r0:SI=call [`lseek'] argc:0
  REG_DEAD r2:SI
  REG_DEAD r1:SI
  REG_CALL_DECL `lseek'
   21: r112:SI=r0:SI
  REG_DEAD r0:SI
   22: cc:CC=cmp(r111:SI,0x)
   23: pc={(cc:CC==0)?L46:pc}
  REG_DEAD cc:CC
  REG_BR_PROB 159
   24: NOTE_INSN_BASIC_BLOCK 3
   25: cc:CC=cmp(r112:SI,0x)
   26: pc={(cc:CC==0)?L50:pc}
  REG_DEAD cc:CC
  REG_BR_PROB 159
   27: NOTE_INSN_BASIC_BLOCK 4
   28: NOTE_INSN_DELETED
   29: {cc:CC_NOOV=cmp(r112:SI-r111:SI,0);r114:SI=r112:SI-r111:SI;}
  REG_DEAD r112:SI
   30: pc={(cc:CC_NOOV==0)?L54:pc}
  REG_DEAD cc:CC_NOOV
  REG_BR_PROB 400
   31: NOTE_INSN_BASIC_BLOCK 5
   32: [r117:SI]=r111:SI
  REG_DEAD r117:SI
  REG_DEAD r111:SI
   33: [r118:SI]=r114:SI
  REG_DEAD r118:SI
  REG_DEAD r114:SI
7: r110:SI=0
  REG_EQUAL 0
   76: pc=L34
   77: barrier
   46: L46:
   45: NOTE_INSN_BASIC_BLOCK 6
8: r110:SI=r111:SI
  REG_DEAD r111:SI
  REG_EQUAL 0x
   78: pc=L34
   79: barrier
   50: L50:
   49: NOTE_INSN_BASIC_BLOCK 7
6: r110:SI=r112:SI
  REG_DEAD r112:SI
  REG_EQUAL 0x
   80: pc=L34
   81: barrier
   54: L54:
   53: NOTE_INSN_BASIC_BLOCK 8
9: r110:SI=0x
  REG_EQUAL 0x
   34: L34:
   35: NOTE_INSN_BASIC_BLOCK 9
   40: r0:SI=r110:SI
  REG_DEAD r110:SI
   41: use r0:SI

Before r216169 (with REG_EQUAL in insn9), jumps from basic block 6/7/8
-> 9 can be merged because r110 equals -1 afterwards.  But with the
patch, the equal information of r110==-1 in basic block 8 is lost.  As
a result, jump from 8->9 can't be merged and two additional
instructions are generated.

I suppose the REG_EQUAL note is correct in insn9?  According to
GCCint, it only means r110 set by insn9 will be equal to the value at
run time at the end of this insn but not necessarily elsewhere in the
function.

I also found another problem (or something misleading?) in the documentation:
"Thus, compiler passes prior to register allocation need only check
for REG_EQUAL notes and passes subsequent to register allocation need
only check for REG_EQUIV notes".  This seems not true now as in this
example, passes after register allocation do take advantage of
REG_EQUAL in optimization and we can't achieve that by using
REG_EQUIV.

Thanks,
bin

>
> jeff


RE: [PATCH] Fix PR64822: incorrect folding of bitfield in union on big endian targets

2015-02-03 Thread Thomas Preud'homme
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Thursday, January 29, 2015 6:39 PM
> 

> You should mention
>   PR middle-end/62103

Right, please find the new ChangeLog entries below:

2015-01-30  Thomas Preud'homme  

PR middle-end/62103
* tree-ssa-sccvn.c (fully_constant_vn_reference_p): Use TYPE_PRECISION
to compute size of referenced value in the constant case.

2015-01-30  Thomas Preud'homme  

PR middle-end/62103
* gcc.c-torture/execute/bitfld-7.c: New test adapted from bitfld-6.c
to use 24 bits for bitfield b.

> > 2015-01-28  Thomas Preud'homme  
> >
> > * gcc.c-torture/execute/bitfld-6.c: Use 24 bits for bitfield b.
> > Adapt expected values accordingly.
> 
> IMHO if the old testcase wasn't incorrect, you'd better add a new testcase
> instead of modifying existing one.

Alright, here you are.  I changed the existing testcase because the new setup
felt like a superset.  Both setups go into the code I added back when I closed
PR62103, but the new setup reaches it from the SCC value numbering code.

You're right though: the two testcases exercise different code paths in the
rest of the compiler, and there is value in having both.
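
For reference, the expected values in the new test can be reproduced with a
small sketch in plain C.  It assumes the usual GCC bit-field layout, where a
24-bit field at the start of a 32-bit unit covers the first three bytes in
memory on both byte orders; that layout is ABI-defined, and the helper names
are hypothetical.

```c
#include <stdint.h>

/* Lay out a 32-bit value in memory as a little-endian target would.  */
static void store_le (uint32_t v, unsigned char out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (unsigned char) (v >> (8 * i));
}

/* Lay out a 32-bit value in memory as a big-endian target would.  */
static void store_be (uint32_t v, unsigned char out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (unsigned char) (v >> (8 * (3 - i)));
}

/* Value seen through a 24-bit field overlaying V on little endian:
   the first three bytes, least significant byte first.  */
uint32_t bitfield_b_little (uint32_t v)
{
  unsigned char m[4];
  store_le (v, m);
  return (uint32_t) m[0] | (uint32_t) m[1] << 8 | (uint32_t) m[2] << 16;
}

/* Value seen through the same field on big endian:
   the first three bytes, most significant byte first.  */
uint32_t bitfield_b_big (uint32_t v)
{
  unsigned char m[4];
  store_be (v, m);
  return (uint32_t) m[0] << 16 | (uint32_t) m[1] << 8 | (uint32_t) m[2];
}
```

For 0x12345678 this gives 0x345678 on little endian and 0x123456 on big
endian, matching the two branches of the test.  In both cases only 24 bits
are read, which is why the size has to come from the precision rather than
from the 32-bit type size.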

diff --git a/gcc/testsuite/gcc.c-torture/execute/bitfld-7.c b/gcc/testsuite/gcc.c-torture/execute/bitfld-7.c
new file mode 100644
index 000..e9a61df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/bitfld-7.c
@@ -0,0 +1,23 @@
+union U
+{
+  const int a;
+  unsigned b : 24;
+};
+
+static union U u = { 0x12345678 };
+
+/* Constant folding used to fail to account for endianness when folding a
+   union.  */
+
+int
+main (void)
+{
+#ifdef __BYTE_ORDER__
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  return u.b - 0x345678;
+#else
+  return u.b - 0x123456;
+#endif
+#endif
+  return 0;
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 25c67d0..0f1299a 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -1352,7 +1352,7 @@ fully_constant_vn_reference_p (vn_reference_t ref)
   || TYPE_PRECISION (ref->type) % BITS_PER_UNIT == 0))
 {
   HOST_WIDE_INT off = 0;
-  HOST_WIDE_INT size = tree_to_shwi (TYPE_SIZE (ref->type));
+  HOST_WIDE_INT size = TYPE_PRECISION (ref->type);
   if (size % BITS_PER_UNIT != 0
  || size > MAX_BITSIZE_MODE_ANY_MODE)
return NULL_TREE;

Is this ok for trunk?

Best regards,

Thomas






[c++-concepts] Bring constraints in line with spec

2015-02-03 Thread Braden Obrzut
This is a large patch mostly written by Andrew to change how constraints 
are stored.  It brings the implementation more in line with the 
specification and simplifies some parts.


I'm not entirely sure about the change to 
cp_parser_template_argument_list.  What happens here is when a template 
template parameter is created with terse notation on a function concept, 
a TEMPLATE_DECL appears instead.  This would manifest itself in 
producing a template that can't be forward declared (or mixed with a 
template that is defined using a trailing requires clause).  When I 
looked, it didn't seem like any earlier point was suitable for 
determining that the TEMPLATE_TEMPLATE_PARM should be unwrapped.


- Braden Obrzut

2015-02-03  Braden Obrzut  

* gcc/cp/class.c (build_clone): Clone constraints.
* gcc/cp/constraint.cc (normalize_atom): Update diagnostic.
(normalize_constraints): Return error_mark_node if normalization fails.
(get_constraints): Access constraints through hash map.
(set_constraints): Set constraints through hash map.
(remove_constraints): Access constraints through hash map.
(associate_classtype_constraints): New.
(init_leading_requirements): Removed.
(init_trailing_requirements): Removed.
(update_leadng_requirements): Removed.
(update_trailing_requirements): Removed.
(save_leading_constraints): Removed.
(save_trailing_constraints): Removed.
(finish_template_constraints): Removed.
(build_constraints): New. Builds CONSTRAINT_INFO from requirements.
(finish_concept_introduction): Check generated parameters for errors.
(tsubst_constraint_info): Update implementation.
(equivalent_constraints): Check input types.
(subsumes_constraints): Update implementation.
(at_least_as_constrained): New. Check if a decl's constraints subsumes
another.
(diagnose_constraints): Temporarily simplify diagnostics.
* gcc/cp/cp-tree.h (tree_constraint_info): Refactor the way constraints
are stored.
(CI_TEMPLATE_REQS): Renamed from CI_LEADING_REQS.
(CI_DECLARATOR_REQS): Renamed from CI_TRAILING_REQS.
(CI_ASSOCIATED_CONSTRAINTS): New.
(CI_NORMALIZED_CONSTRAINTS): New.
(CI_ASSOCIATED_REQS): Removed.
(saved_scope): Save template requirements.
(current_template_reqs): Removed.
(lang_decl_min): Replace requires_clause (trailing requirements) with
more generic constraint_info.
* gcc/cp/cxx-pretty-print.c (cxx_pretty_printer::declarator): Print
requires clause.
(pp_cxx_function_definition): Moved requires clause printing to above.
(pp_cxx_init_declarator): Likewise.
(pp_cxx_template_declaration): Update implementation to get
requirements from CONSTRAINT_INFO.
* gcc/cp/decl.c (duplicate_decls): Remove constraints before reclaiming
memory.
(is_class_template_or_specialization): New.
(get_leading_constraints): Removed.
(adjust_fn_constraints): Removed.
(grokfndecl): Update implementation to other changes.
(get_trailing_requires_clause): New.
(grokdeclarator): Pass trailing requires clause to grokfndecl.
(xref_tag_1): Check overload constraints.
* gcc/cp/error.c (dump_template_decl): Print requires clause.
(dump_function_decl): Update implementation for accessing requirements.
* gcc/cp/logic.cc (subsumes_constraints_nonnull): Update
CI_ASSOCIATED_REQS usage.
* gcc/cp/method.c (implicitly_declare_fn): Copy constraints of
inherited constructors.
* gcc/cp/parser.c (cp_parser_lambda_expression): Remove now unneeded
template requirements saving.
(cp_parser_type_parameter): Likewise.
(cp_parser_template_argument_list): Unwrap template_template_parms when
produced by short hand notation with function concepts.
(cp_parser_alias_declaration): Attach constraints to aliases.
(cp_manage_requirements): Removed.
(cp_parser_trailing_requirements_clause): Renamed from
cp_parser_trailing_requirements.
(cp_parser_init_declarator): Removed now unneeded requirements saving.
(cp_parser_basic_declarator): Separated from cp_parser_declarator.
(cp_parser_declarator): Parses trailing requires clause if
cp_parser_basic_declarator succeeds.
(cp_parser_class_specifier_1): Associate constraints with type.
(cp_parser_member_declaration): Remove unneeded template requirement
saving.
(cp_parser_template_declaration_after_export): Likewise.
(cp_parser_single_declaration): Associate constraints.
(cp_parser_late_parsing_for_member): Remove unneeded template
requirement saving.
(synthesize_implicit_template_parm): Likewise.
* gcc/cp/pt.c (maybe_new_partial_specialization): Update
implementation.
(process_template_parm): Removed unneeded template requirement saving.
(build_template_decl): Handle constraints.
(process_partial_specialization): Update constraint access and check
that specialization is more specialized.
(push_template_decl_real): Updat

RE: [PATCH, FT32] initial support

2015-02-03 Thread James Bowman
> optabs.c's expand_abs_nojump already knows this trick:
>   
> /* If this machine has expensive jumps, we can do integer absolute
>value of X as (((signed) x >> (W-1)) ^ x) - ((signed) x >> (W-1)),
>where W is the width of MODE. */
> 
> So if you define BRANCH_COST to be 2 or more there should be no need for
> this pattern at all.

Yes, I just confirmed this. With no abssi2 pattern, this:

  int side;

  int foo(int x)
  {
    side = x >> 31;
    return abs(x);
  }

does indeed produce:

  foo: 
  ashr.l $r1,$r0,31
  sta.l  side,$r1
  xor.l  $r0,$r1,$r0
  sub.l  $r0,$r0,$r1
  return
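
For comparison, the formula from the optabs.c comment can be written
directly in C.  This is only a sketch: it assumes that >> on a negative int
is an arithmetic shift, which holds for GCC but is implementation-defined in
ISO C, and the function name is made up.

```c
#include <limits.h>

/* Branch-free absolute value: (((signed) x >> (W-1)) ^ x) - ((signed) x >> (W-1)),
   where W is the width of int.  Like abs (), it overflows for INT_MIN.  */
int abs_nojump (int x)
{
  int mask = x >> (sizeof (int) * CHAR_BIT - 1);  /* 0 if x >= 0, -1 if x < 0 */
  return (mask ^ x) - mask;
}
```

The shift, xor and subtract correspond to the ashr.l, xor.l and sub.l in
the output above.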
  

Thanks.

--
James Bowman
FTDI Open Source Liaison


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread H.J. Lu
On Tue, Feb 3, 2015 at 2:19 PM, Jakub Jelinek  wrote:
> On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
>> So we aren't SYMBOL_REF_EXTERNAL_P nor
>> SYMBOL_REF_LOCAL_P.  What do we reference?
>
> That is reasonable.  There is no guarantee the extern weak symbol is local,
> it could very well be non-local.  All that you know about the symbols is
> that its address is non-NULL in that case.
>

This may be true for shared library.  But it isn't true for PIE:

[hjl@gnu-6 copyreloc-3]$ cat x.c
__attribute__((weak))
int a;

extern void bar (void);

int main()
{
  if (a != 0)
    __builtin_abort();
  bar ();
  if (a != 30)
    __builtin_abort();
  return 0;
}
[hjl@gnu-6 copyreloc-3]$ cat bar.c
int a = -1;

void
bar ()
{
  a = 30;
}
[hjl@gnu-6 copyreloc-3]$ make
gcc -pie -O3 -g -fuse-ld=gold -fpie  -c x.i
gcc -pie -O3 -g -fuse-ld=gold -fpic -c -o bar.o bar.c
gcc -pie  -shared -o libbar.so bar.o
gcc -pie -O3 -g -fuse-ld=gold -o x x.o libbar.so -Wl,-R,.
./x
[hjl@gnu-6 copyreloc-3]$

Even if a common symbol, a, is weak, all references to
a within PIE are local.

-- 
H.J.


Re: Fix PR64876, regressions in powerpc64 Go testsuite

2015-02-03 Thread Alan Modra
On Tue, Feb 03, 2015 at 11:14:49AM -0500, David Edelsohn wrote:
> On Tue, Feb 3, 2015 at 8:57 AM, Alan Modra  wrote:
> > PR target/64876
> > * config/rs6000/rs6000.c (chain_already_loaded): New function.
> > (rs6000_call_aix): Use it.
> 
> Okay with Jakub's suggested change.

No, Jakub's change doesn't work, even if I add the looping in
chain_already_loaded that it would need.  We really do want to look at
just (the last insn in) the previous sequence.

The trouble is that the current sequence, ie. the one emitted for
gen_call or gen_call_value, might be empty, *and* the previous
sequence, the one emitted by calls.c:emit_call_1, might be empty at
this point.  (I found that fact out when my first implementation of
chain_already_loaded lacked the "last != NULL" test.)  In that case
get_last_insn_anywhere() will give you rtl insns that aren't part of a
call sequence, and r11 is a general register that might be used for
anything.  So a test for setting r11 is no longer a test for setting
the static chain.

-- 
Alan Modra
Australia Development Lab, IBM


[Ping] Re: [PATCH 1/3] Replace MD_REDIRECT_BRANCH with TARGET_CAN_FOLLOW_JUMP

2015-02-03 Thread Kaz Kojima
Kaz Kojima  wrote:
> 2015-01-27  Joern Rennecke  
>   Kaz Kojima  
> 
>   PR target/64761
>   * config/sh/sh-protos.h (sh_can_redirect_branch): Don't declare.
>   * config/sh/sh.c (TARGET_CAN_FOLLOW_JUMP): Redefine.
>   (sh_can_redirect_branch): Rename to ...
>   (sh_can_follow_jump): ... this.  Constify argument types.
>   * config/sh/sh.h (MD_CAN_REDIRECT_BRANCH): Don't define.
>   * doc/tm.texi.in (MD_CAN_REDIRECT_BRANCH): Remove documentation.
>   * reorg.c (steal_delay_list_from_target): Use targetm.can_follow_jump.
>   * doc/tm.texi: Regenerate.

Ping?

https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02345.html

Regards,
kaz


[RFC testsuite] Fix PR64850, tweak acc_on_device* tests

2015-02-03 Thread Kaz Kojima
Hi,

Several goacc/acc_on_device tests fail for a few targets:

hppa2.0w-hp-hpux11.11 (PR testsuite/64850)
https://gcc.gnu.org/ml/gcc-testresults/2015-01/msg02659.html

m68k-unknown-linux-gnu
https://gcc.gnu.org/ml/gcc-testresults/2015-01/msg02960.html

sh4-unknown-linux-gnu
https://gcc.gnu.org/ml/gcc-testresults/2015-01/msg02930.html

Also they fail with special options
x86_64-unknown-linux-gnu -fpic -mcmodel=large
https://gcc.gnu.org/ml/gcc-testresults/2015-02/msg00198.html

Those tests scan .expand rtl dumps to get the number of calls for
the acc_on_device function.  For most targets, the call rtx looks
something like

  (call (mem:QI (symbol_ref:SI ("acc_on_device") [flags 0x41]  ) [0 acc_on_device S1 A8])

and tests use the regular expression "\\\(call \[^\\n\]*\\\"acc_on_device"
to detect it.
This expression doesn't match the corresponding call rtx

  (call (mem:SI (symbol_ref/v:SI ("@acc_on_device") [flags 0x41]  
) [0 acc_on_device S4 A32])

for hppa and something like

  (call (mem:QI (reg/f:SI 33) [0 acc_on_device S1 A8])

for m68k and sh.  All call rtxes have the function name in
the alias set of its mem rtx and it seems that the regular
expression "\\\(call \[^\\n\]* acc_on_device" works for all
cases.  The attached patch is tested on i686-pc-linux-gnu and
sh4-unknown-linux-gnu.
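
The difference between the two expressions can be checked outside the
testsuite with a small sketch like the one below.  It uses POSIX
regcomp/regexec as a stand-in for the Tcl regexp engine that DejaGnu
uses, which is an assumption that holds only for simple patterns like
these; the helper name and the sample strings (the rtxes above, each
joined onto one line) are for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATTERN (a POSIX extended regex) matches somewhere in
   LINE, 0 if it does not, and -1 if the pattern fails to compile.  */
int
rtl_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int found;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  found = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}

/* The two expressions as the regex engine sees them after unquoting.
   The bracket expression contains a real newline character.  */
const char old_call_re[] = "\\(call [^\n]*\"acc_on_device";
const char new_call_re[] = "\\(call [^\n]* acc_on_device";

/* Sample call rtxes, one per line.  */
const char generic_call[] =
  "(call (mem:QI (symbol_ref:SI (\"acc_on_device\") [flags 0x41]  ) "
  "[0 acc_on_device S1 A8])";
const char hppa_call[] =
  "(call (mem:SI (symbol_ref/v:SI (\"@acc_on_device\") [flags 0x41]  ) "
  "[0 acc_on_device S4 A32])";
const char m68k_call[] =
  "(call (mem:QI (reg/f:SI 33) [0 acc_on_device S1 A8])";
```

Calling rtl_line_matches (old_call_re, ...) on the three samples should
accept only the first one, while new_call_re should accept all three.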

Regards,
kaz
--
PR testsuite/64850
* gcc.dg/goacc/acc_on_device-1.c: Use a space instead of \\\" in
the expression to find calls.
* c-c++-common/goacc/acc_on_device-2.c: Likewise.
* c-c++-common/goacc/acc_on_device-2-off.c: Likewise.
* gfortran.dg/goacc/acc_on_device-1.f95: Likewise.
* gfortran.dg/goacc/acc_on_device-2.f95: Likewise.
* gfortran.dg/goacc/acc_on_device-2-off.f95: Likewise.

diff --git a/c-c++-common/goacc/acc_on_device-2-off.c 
b/c-c++-common/goacc/acc_on_device-2-off.c
index 25d21ad..ea31047 100644
--- a/c-c++-common/goacc/acc_on_device-2-off.c
+++ b/c-c++-common/goacc/acc_on_device-2-off.c
@@ -20,6 +20,6 @@ f (void)
 }
 
 /* Without -fopenacc, we're expecting one call.
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]*\\\"acc_on_device" 1 
"expand" } } */
+   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 1 
"expand" } } */
 
 /* { dg-final { cleanup-rtl-dump "expand" } } */
diff --git a/c-c++-common/goacc/acc_on_device-2.c 
b/c-c++-common/goacc/acc_on_device-2.c
index d5389a9..2f4ee2b 100644
--- a/c-c++-common/goacc/acc_on_device-2.c
+++ b/c-c++-common/goacc/acc_on_device-2.c
@@ -24,6 +24,6 @@ f (void)
perturbs expansion as a builtin, which expects an int parameter.  It's fine
when changing acc_device_t to plain int, but that's not what we're doing in
.
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]*\\\"acc_on_device" 0 
"expand" { xfail c++ } } } */
+   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 
"expand" { xfail c++ } } } */
 
 /* { dg-final { cleanup-rtl-dump "expand" } } */
diff --git a/gcc.dg/goacc/acc_on_device-1.c b/gcc.dg/goacc/acc_on_device-1.c
index 1a0276e..d0dbc82 100644
--- a/gcc.dg/goacc/acc_on_device-1.c
+++ b/gcc.dg/goacc/acc_on_device-1.c
@@ -15,6 +15,6 @@ f (void)
 }
 
 /* Unsuitable to be handled as a builtin, so we're expecting four calls.
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]*\\\"acc_on_device" 4 
"expand" } } */
+   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 4 
"expand" } } */
 
 /* { dg-final { cleanup-rtl-dump "expand" } } */
diff --git a/gfortran.dg/goacc/acc_on_device-1.f95 
b/gfortran.dg/goacc/acc_on_device-1.f95
index 9dfde26..0126d9c 100644
--- a/gfortran.dg/goacc/acc_on_device-1.f95
+++ b/gfortran.dg/goacc/acc_on_device-1.f95
@@ -17,6 +17,6 @@ logical function f ()
 end function f
 
 ! Unsuitable to be handled as a builtin, so we're expecting four calls.
-! { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]*\\\"acc_on_device" 4 
"expand" } }
+! { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 4 
"expand" } }
 
 ! { dg-final { cleanup-rtl-dump "expand" } }
diff --git a/gfortran.dg/goacc/acc_on_device-2-off.f95 
b/gfortran.dg/goacc/acc_on_device-2-off.f95
index cf28264..0a4978e 100644
--- a/gfortran.dg/goacc/acc_on_device-2-off.f95
+++ b/gfortran.dg/goacc/acc_on_device-2-off.f95
@@ -34,6 +34,6 @@ logical (4) function f ()
 end function f
 
 ! Without -fopenacc, we're expecting one call.
-! { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]*\\\"acc_on_device" 1 
"expand" } }
+! { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 1 
"expand" } }
 
 ! { dg-final { cleanup-rtl-dump "expand" } }
diff --git a/gfortran.dg/goacc/acc_on_device-2.f95 
b/gfortran.dg/goacc/acc_on_device-2.f95
index 7730a60..43ad022 100644
--- a/gfortran.dg/goacc/acc_on_device-2.f95
+++ b/gfortran.dg/goacc/acc_on_device-2.f95
@@ -35,6 +35,6 @@ end function f
 
 ! With -fopenacc, we're expecting the builtin to be expanded, so no calls.
 ! TODO: not working.
-! { dg-final { scan-

Re: Fix PR64876, regressions in powerpc64 Go testsuite

2015-02-03 Thread Alan Modra
On Tue, Feb 03, 2015 at 03:08:01PM +0100, Jakub Jelinek wrote:
> On Wed, Feb 04, 2015 at 12:27:35AM +1030, Alan Modra wrote:
> > @@ -33002,7 +33092,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx fla
> >  originally direct, the 3rd word has not been written since no
> >  trampoline has been built, so we ought not to load it, lest we
> >  override a static chain value.  */
> > - if (!direct_call_p && TARGET_POINTERS_TO_NESTED_FUNCTIONS)
> > + if (!direct_call_p
> > + && TARGET_POINTERS_TO_NESTED_FUNCTIONS
> > + && !chain_already_loaded (crtl->emit.sequence_stack->last))
> 
> Shouldn't that be !chain_already_loaded (get_last_insn_anywhere ()) ?

I considered that interface but it's not the best option.  We may have
already emitted insns for the topmost sequence, ie. the GEN_CALL.  So
you'd need to loop over insns in chain_already_loaded.  But not too
far, as that might run into a previous call.

-- 
Alan Modra
Australia Development Lab, IBM


Re: r219977 - in /trunk/gcc: ChangeLog config/rs600...

2015-02-03 Thread Andreas Schwab
FAIL: gcc.dg/builtins-58.c scan-assembler-not pow

$ grep pow builtins-58.s
.machine power4

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH, FT32] initial support

2015-02-03 Thread Joseph Myers
This patch is missing pieces such as Texinfo documentation (in
invoke.texi for target-specific options, at least) and config-list.mk
update so automatic builders verify that this target builds OK.  See
"Back End" in sourcebuild.texi and make sure that you have everything
relevant.

It's a good idea to make sure any new port builds cleanly on both 32-bit 
and 64-bit hosts when configured --enable-werror-always and the compiler 
used to build it is the same version of GCC (this provides equivalent 
coverage for being free of compilation warnings as bootstrap does for 
native tools).

> Index: gcc/config/ft32/ft32.md
> ===
> --- gcc/config/ft32/ft32.md   (revision 0)
> +++ gcc/config/ft32/ft32.md   (revision 0)
> @@ -0,0 +1,965 @@
> +;; Machine description for FT32
> +;; Copyright (C) 2009 Free Software Foundation, Inc.

Copyright years should be -2015.

> +internal_error ("internal error: 'h' applied to non-register 
> operand");

internal_error already prints "internal compiler error: "; don't
include "internal error: " in the error string.  Use %< and %> as
quotes rather than ''.

> +internal_error ("internal error: bad alignment: %d", i);

Likewise.

> +int bf = ft32_as_bitfield(INTVAL(x));

Missing spaces before '(' in function and macro calls; check for any
other instances.

> +  /* if (REGNO (operand) > FT32_R13) internal_error ("internal error: 
> bad register: %d", REGNO (operand));  wtf */

Even commented-out calls (if really needed) should be cleaned up.

> +  if (65536 <= cfun->machine->size_for_adjusting_sp) {
> +error("Stack frame must be smaller than 64K");

Start diagnostics with a lowercase letter.  Note a missing space again.

> +#if 0
> +/* fixed-bit.h does not define functions for TA and UTA because
> +   that part is wrapped in #if MIN_UNITS_PER_WORD > 4.
> +   This would lead to empty functions for TA and UTA.
> +   Thus, supply appropriate defines as if HAVE_[U]TA == 1.
> +   #define HAVE_[U]TA 1 won't work because avr-modes.def
> +   uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough
> +   to arrange for such changes of the mode size.  */

Don't include large amounts of #if 0 or commented-out code in a new
port (*any* such code needs a good justification).

> +ft32-*-elf)
> + # tmake_file="ft32/t-ft32 t-softfp-sfdf t-softfp-excl t-softfp"

Why commented-out?  soft-fp is to be preferred to fp-bit (except for
8-bit and 16-bit ports where space is very critical).  Also,
t-softfp-excl shouldn't be needed for new ports; it's only really
relevant when soft-fp is being built for hard-float multilibs for some
reason (t-hardfp is preferred then).

If the issue is that you want to exclude some soft-fp functions from
being built because you have architecture-specific versions of them,
it should be straightforward to add a new softfp_exclusions variable
that t-softfp respects (in a separate patch, please).

> + # tm_file="$tm_file ft32/ft32-lib.h"

Unless you're using this file, don't include it in the patch at all.


-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Jakub Jelinek
On Tue, Feb 03, 2015 at 02:03:14PM -0800, H.J. Lu wrote:
> So we aren't SYMBOL_REF_EXTERNAL_P nor
> SYMBOL_REF_LOCAL_P.  What do we reference?

That is reasonable.  There is no guarantee the extern weak symbol is local,
it could very well be non-local.  All that you know about the symbol is
that its address is non-NULL in that case.
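
In C terms, a small sketch of the two situations discussed in this
thread, assuming an ELF target and GCC-style attributes.  The symbol and
function names are made up, and the second symbol is deliberately left
undefined everywhere.

```c
/* A weak definition in this translation unit.  Another definition may
   preempt it at link time, but some definition always exists, so its
   address is never null.  */
__attribute__ ((weak)) int weak_defined_here;

/* A weak reference with no definition anywhere.  The link does not
   fail; the address is expected to resolve to null.  */
extern int weak_defined_nowhere __attribute__ ((weak));

int
weak_defined_here_present (void)
{
  return &weak_defined_here != 0;
}

int
weak_defined_nowhere_present (void)
{
  return &weak_defined_nowhere != 0;
}
```

The second function is the case that has to keep working in PIE
binaries: if the address were computed relative to the load address
rather than read from the GOT, it would no longer compare equal to NULL.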

Jakub


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread H.J. Lu
On Tue, Feb 3, 2015 at 1:35 PM, Sriraman Tallam  wrote:
> On Tue, Feb 3, 2015 at 1:29 PM, H.J. Lu  wrote:
>> On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam  wrote:
>>> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek  wrote:
>>>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>>>> This was the original patch to i386.c to let global accesses take
>>>>> advantage of copy relocations and avoid the GOT.
>>>>>
>>>>>
>>>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>>   return true;
>>>>>  }
>>>>>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>>>> +   || (HAVE_LD_PIE_COPYRELOC
>>>>> +   && flag_pie
>>>>> +   && !SYMBOL_REF_WEAK (op0)
>>>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>>>> && ix86_cmodel != CM_LARGE_PIC)
>>>>>
>>>>> I do not understand here why weak global data access must go through
>>>>> the GOT and not use copy relocations. Ultimately, there is only going
>>>>> to be one copy of the global either defined in the executable or the
>>>>> shared object right?
>>>>>
>>>>> Can we remove the check for SYMBOL_REF_WEAK?
>>>>
>>>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>>>> In non-PIE binaries that is fine, the linker will store 0.
>>>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>>>> wouldn't be NULL.
>>>
>>> Thanks for clarifying.
>>>
>>>> You can only optimize weak vars if there is some weak definition in the
>>>> current TU.
>>>
>>> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>>>
>>> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>>>
>>
>> The full condition is:
>>
>>   && (SYMBOL_REF_LOCAL_P (op0)
>>|| (HAVE_LD_PIE_COPYRELOC
>>&& flag_pie
>>&& !SYMBOL_REF_WEAK (op0)
>>&& !SYMBOL_REF_FUNCTION_P (op0)))
>>
>> If the weak op0 is defined in the current TU, shouldn't
>> SYMBOL_REF_LOCAL_P (op0)  be true for PIE?
>
> Thats not what I see for this:
>
> zap.cc
> -
> __attribute__((weak))
> int glob;
>
> int main()
> {
>printf("%d\n", glob);
> }
>
> (gdb) p debug_rtx(op0)
> (symbol_ref/i:DI ("glob") )
>
> (gdb) p SYMBOL_REF_LOCAL_P(op0)
> $4 = false
>
> (gdb) p SYMBOL_REF_WEAK (op0)
> $5 = 1
>
> (gdb) p SYMBOL_REF_EXTERNAL_P (op0)
> $6 = false
>
> Thanks

So we aren't SYMBOL_REF_EXTERNAL_P nor
SYMBOL_REF_LOCAL_P.  What do we reference?



-- 
H.J.


[PATCH, committed] fix Fortran docs

2015-02-03 Thread Steve Kargl
I've committed the following typo fix to both 4.9 and 5.0.

Index: ChangeLog
===
--- ChangeLog   (revision 220347)
+++ ChangeLog   (working copy)
@@ -1,3 +1,6 @@
+2015-02-03  Steven G. Kargl  
+
+   * intrinsic.texi (CO_ASSOCIATED): c_prt_1 should be c_ptr_1.
 
 2015-01-30  Andre Vehreschild  
 
Index: intrinsic.texi
===
--- intrinsic.texi  (revision 220347)
+++ intrinsic.texi  (working copy)
@@ -2713,7 +2713,7 @@ end program test_btest
 
 @table @asis
 @item @emph{Description}:
-@code{C_ASSOCIATED(c_prt_1[, c_ptr_2])} determines the status of the C pointer
+@code{C_ASSOCIATED(c_ptr_1[, c_ptr_2])} determines the status of the C pointer
 @var{c_ptr_1} or if @var{c_ptr_1} is associated with the target @var{c_ptr_2}.
 
 @item @emph{Standard}:
@@ -2723,7 +2723,7 @@ Fortran 2003 and later
 Inquiry function
 
 @item @emph{Syntax}:
-@code{RESULT = C_ASSOCIATED(c_prt_1[, c_ptr_2])}
+@code{RESULT = C_ASSOCIATED(c_ptr_1[, c_ptr_2])}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
-- 
Steve


Remove libgo/go/go/types/testdata

2015-02-03 Thread Ian Lance Taylor
The libgo/go/go/types directory was removed in July 2013.
Unfortunately I accidentally left the testdata directory behind.  This
patch removes it.  Committed to mainline.

Ian
diff -r 85d0b689bd17 libgo/go/go/types/testdata/builtins.src
--- a/libgo/go/go/types/testdata/builtins.src   Mon Feb 02 19:27:56 2015 -0800
+++ /dev/null   Thu Jan 01 00:00:00 1970 +
@@ -1,302 +0,0 @@
-// Copyright 2012 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// builtin calls
-
-package builtins
-
-import "unsafe"
-
-func _append() {
-   var x int
-   var s []byte
-   _0 := append /* ERROR "argument" */ ()
-   _1 := append("foo" /* ERROR "not a typed slice" */)
-   _2 := append(nil /* ERROR "not a typed slice" */, s)
-   _3 := append(x /* ERROR "not a typed slice" */, s)
-   _4 := append(s)
-   append /* ERROR "not used" */ (s)
-}
-
-func _cap() {
-   var a [10]bool
-   var p *[20]int
-   var s []int
-   var c chan string
-   _0 := cap /* ERROR "argument" */ ()
-   _1 := cap /* ERROR "argument" */ (1, 2)
-   _2 := cap(42 /* ERROR "invalid" */)
-   const _3 = cap(a)
-   assert(_3 == 10)
-   const _4 = cap(p)
-   assert(_4 == 20)
-   _5 := cap(c)
-   cap /* ERROR "not used" */ (c)
-}
-
-func _close() {
-   var c chan int
-   var r <-chan int
-   close /* ERROR "argument" */ ()
-   close /* ERROR "argument" */ (1, 2)
-   close(42 /* ERROR "not a channel" */)
-   close(r /* ERROR "receive-only channel" */)
-   close(c)
-}
-
-func _complex() {
-   var i32 int32
-   var f32 float32
-   var f64 float64
-   var c64 complex64
-   _ = complex /* ERROR "argument" */ ()
-   _ = complex /* ERROR "argument" */ (1)
-   _ = complex(true /* ERROR "invalid argument" */ , 0)
-   _ = complex(i32 /* ERROR "invalid argument" */ , 0)
-   _ = complex("foo" /* ERROR "invalid argument" */ , 0)
-   _ = complex(c64 /* ERROR "invalid argument" */ , 0)
-   _ = complex(0, true /* ERROR "invalid argument" */ )
-   _ = complex(0, i32 /* ERROR "invalid argument" */ )
-   _ = complex(0, "foo" /* ERROR "invalid argument" */ )
-   _ = complex(0, c64 /* ERROR "invalid argument" */ )
-   _ = complex(f32, f32)
-   _ = complex(f32, 1)
-   _ = complex(f32, 1.0)
-   _ = complex(f32, 'a')
-   _ = complex(f64, f64)
-   _ = complex(f64, 1)
-   _ = complex(f64, 1.0)
-   _ = complex(f64, 'a')
-   _ = complex(f32 /* ERROR "mismatched types" */, f64)
-   _ = complex(f64 /* ERROR "mismatched types" */, f32)
-   _ = complex(1, 1)
-   _ = complex(1, 1.1)
-   _ = complex(1, 'a')
-   complex /* ERROR "not used" */ (1, 2)
-}
-
-func _copy() {
-   copy /* ERROR "not enough arguments" */ ()
-   copy /* ERROR "not enough arguments" */ ("foo")
-   copy([ /* ERROR "copy expects slice arguments" */ ...]int{}, []int{})
-   copy([ /* ERROR "copy expects slice arguments" */ ]int{}, [...]int{})
-   copy([ /* ERROR "different element types" */ ]int8{}, "foo")
-
-   // spec examples
-   var a = [...]int{0, 1, 2, 3, 4, 5, 6, 7}
-   var s = make([]int, 6)
-   var b = make([]byte, 5)
-   n1 := copy(s, a[0:])// n1 == 6, s == []int{0, 1, 2, 3, 4, 5}
-   n2 := copy(s, s[2:])// n2 == 4, s == []int{2, 3, 4, 5, 4, 5}
-   n3 := copy(b, "Hello, World!")  // n3 == 5, b == []byte("Hello")
-}
-
-func _delete() {
-   var m map[string]int
-   var s string
-   delete /* ERROR "argument" */ ()
-   delete /* ERROR "argument" */ (1)
-   delete /* ERROR "argument" */ (1, 2, 3)
-   delete(m, 0 /* ERROR "not assignable" */)
-   delete(m, s)
-}
-
-func _imag() {
-   var f32 float32
-   var f64 float64
-   var c64 complex64
-   var c128 complex128
-   _ = imag /* ERROR "argument" */ ()
-   _ = imag /* ERROR "argument" */ (1, 2)
-   _ = imag(10 /* ERROR "must be a complex number" */)
-   _ = imag(2.7182818 /* ERROR "must be a complex number" */)
-   _ = imag("foo" /* ERROR "must be a complex number" */)
-   const _5 = imag(1 + 2i)
-   assert(_5 == 2)
-   f32 = _5
-   f64 = _5
-   const _6 = imag(0i)
-   assert(_6 == 0)
-   f32 = imag(c64)
-   f64 = imag(c128)
-   f32 = imag /* ERROR "cannot assign" */ (c128)
-   f64 = imag /* ERROR "cannot assign" */ (c64)
-   imag /* ERROR "not used" */ (c64)
-}
-
-func _len() {
-   const c = "foobar"
-   var a [10]bool
-   var p *[20]int
-   var s []int
-   var m map[string]complex128
-   _ = len /* ERROR "argument" */ ()
-   _ = len /* ERROR "argument" */ (1, 2)
-   _ = len(42 /* ERROR "invalid" */)
-   const _3 = len(c)
-   assert(_3 == 6)
-   const _4 = len(a)
-   assert(_4 == 10)
-   const _5 = len(p)
-   assert(_5 == 20)
-

Re: [Patch, fortran] PR 64757 - [5 Regression] ICE in fold_convert_loc, at fold-const.c:2353

2015-02-03 Thread Paul Richard Thomas
Dear Dominique,

I have fixed all the problems except the last one. For that case, the
other brand gives
type_to_class_30.f90(19): error #7822: Variables containing ultimate
allocatable array components are forbidden from appearing directly in
input/output lists.
print *, TestReference([Test(99), Test(199)])
-^
compilation aborted for type_to_class_30.f90 (code 1)

which seems to me to be correct. I'll see what I can do to fix it.

Thanks for the help

Paul

On 2 February 2015 at 17:53, Dominique Dhumieres  wrote:
> Dear Paul,
>
> I have tested your patch at 
> https://gcc.gnu.org/ml/fortran/2015-01/txtwnaoa1115V.txt
> (the latest version) and I found that the test type_to_class_3.f03 is 
> miscompiled
> (FAIL) with -flto -O0 -m64 (this does not happen with -flto -O0 -m32 or with 
> -Ox and
> x!=0).
>
> In addition, while the reduced test
>
>   type :: Test
> integer :: i
>   end type
>
>   type :: TestReference
>  class(Test), allocatable :: test(:)
>   end type
>
>   type(TestReference) :: testList
>   type(test), allocatable :: x(:)
>
>  allocate (testList%test(2), source = [Test(99), Test(199)]) ! Works, of course
>  print *, size(testList%test)
>  x = testList%test
>  print *, x
> end
>
> gives what I expect, i.e.,
>
>2
>   99 199
>
>   type :: Test
> integer :: i
>   end type
>
>   type :: TestReference
>  class(Test), allocatable :: test(:)
>   end type
>
>   type(TestReference) :: testList
>   type(test), allocatable :: x(:)
>
>   testList = TestReference([Test(99), Test(199)])  ! Gave: The rank of the element in the
>                                                    ! structure constructor at (1) does not
>                                                    ! match that of the component (1/0)
>   print *, size(testList%test)
>   x = testList%test
>   print *, x
> end
>
> gives
>
>1
>   99
>
> Last problem I see,
>
> print *, TestReference([Test(99), Test(199)])
>
> gives the following ICE
>
> f951: internal compiler error: Bad IO basetype (7)
>
> type_to_class_3_red_2.f03:12:0:
>
>print *, TestReference([Test(99), Test(199)])
>
>
> Cheers,
>
> Dominique



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Sriraman Tallam
On Tue, Feb 3, 2015 at 1:29 PM, H.J. Lu  wrote:
> On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam  wrote:
>> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek  wrote:
>>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>>> This was the original patch to i386.c to let global accesses take
>>>> advantage of copy relocations and avoid the GOT.
>>>>
>>>>
>>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>>   return true;
>>>>  }
>>>>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>>> +   || (HAVE_LD_PIE_COPYRELOC
>>>> +   && flag_pie
>>>> +   && !SYMBOL_REF_WEAK (op0)
>>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>>> && ix86_cmodel != CM_LARGE_PIC)
>>>>
>>>> I do not understand here why weak global data access must go through
>>>> the GOT and not use copy relocations. Ultimately, there is only going
>>>> to be one copy of the global either defined in the executable or the
>>>> shared object right?
>>>>
>>>> Can we remove the check for SYMBOL_REF_WEAK?
>>>
>>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>>> In non-PIE binaries that is fine, the linker will store 0.
>>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>>> wouldn't be NULL.
>>
>> Thanks for clarifying.
>>
>>> You can only optimize weak vars if there is some weak definition in the
>>> current TU.
>>
>> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>>
>> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>>
>
> The full condition is:
>
>   && (SYMBOL_REF_LOCAL_P (op0)
>|| (HAVE_LD_PIE_COPYRELOC
>&& flag_pie
>&& !SYMBOL_REF_WEAK (op0)
>&& !SYMBOL_REF_FUNCTION_P (op0)))
>
> If the weak op0 is defined in the current TU, shouldn't
> SYMBOL_REF_LOCAL_P (op0)  be true for PIE?

That's not what I see for this:

zap.cc
-
__attribute__((weak))
int glob;

int main()
{
   printf("%d\n", glob);
}

(gdb) p debug_rtx(op0)
(symbol_ref/i:DI ("glob") )

(gdb) p SYMBOL_REF_LOCAL_P(op0)
$4 = false

(gdb) p SYMBOL_REF_WEAK (op0)
$5 = 1

(gdb) p SYMBOL_REF_EXTERNAL_P (op0)
$6 = false

Thanks
Sri




>
> --
> H.J.


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread H.J. Lu
On Tue, Feb 3, 2015 at 1:20 PM, Sriraman Tallam  wrote:
> On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek  wrote:
>> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>>> This was the original patch to i386.c to let global accesses take
>>> advantage of copy relocations and avoid the GOT.
>>>
>>>
>>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>>   return true;
>>>  }
>>>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>>> -   && SYMBOL_REF_LOCAL_P (op0)
>>> +   && (SYMBOL_REF_LOCAL_P (op0)
>>> +   || (HAVE_LD_PIE_COPYRELOC
>>> +   && flag_pie
>>> +   && !SYMBOL_REF_WEAK (op0)
>>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>>> && ix86_cmodel != CM_LARGE_PIC)
>>>
>>> I do not understand here why weak global data access must go through
>>> the GOT and not use copy relocations. Ultimately, there is only going
>>> to be one copy of the global either defined in the executable or the
>>> shared object right?
>>>
>>> Can we remove the check for SYMBOL_REF_WEAK?
>>
>> So, what will then happen if the weak undef symbol isn't defined anywhere?
>> In non-PIE binaries that is fine, the linker will store 0.
>> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
>> wouldn't be NULL.
>
> Thanks for clarifying.
>
>> You can only optimize weak vars if there is some weak definition in the
>> current TU.
>
> Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with
>
> !(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
>

The full condition is:

  && (SYMBOL_REF_LOCAL_P (op0)
   || (HAVE_LD_PIE_COPYRELOC
   && flag_pie
   && !SYMBOL_REF_WEAK (op0)
   && !SYMBOL_REF_FUNCTION_P (op0)))

If the weak op0 is defined in the current TU, shouldn't
SYMBOL_REF_LOCAL_P (op0)  be true for PIE?

-- 
H.J.


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Sriraman Tallam
On Tue, Feb 3, 2015 at 11:36 AM, Jakub Jelinek  wrote:
> On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
>> This was the original patch to i386.c to let global accesses take
>> advantage of copy relocations and avoid the GOT.
>>
>>
>> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>>   return true;
>>  }
>>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
>> -   && SYMBOL_REF_LOCAL_P (op0)
>> +   && (SYMBOL_REF_LOCAL_P (op0)
>> +   || (HAVE_LD_PIE_COPYRELOC
>> +   && flag_pie
>> +   && !SYMBOL_REF_WEAK (op0)
>> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>> && ix86_cmodel != CM_LARGE_PIC)
>>
>> I do not understand here why weak global data access must go through
>> the GOT and not use copy relocations. Ultimately, there is only going
>> to be one copy of the global either defined in the executable or the
>> shared object right?
>>
>> Can we remove the check for SYMBOL_REF_WEAK?
>
> So, what will then happen if the weak undef symbol isn't defined anywhere?
> In non-PIE binaries that is fine, the linker will store 0.
> But in PIE binaries, the 0 would be biased by the PIE load bias and thus
> wouldn't be NULL.

Thanks for clarifying.

> You can only optimize weak vars if there is some weak definition in the
> current TU.

Would this be fine then?  Replace !SYMBOL_REF_WEAK (op0) with

!(SYMBOL_REF_WEAK (op0) && SYMBOL_REF_EXTERNAL_P (op0))
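
To spell out what the replacement would accept, here is a stand-alone
sketch of the inner condition.  It is not i386.c code: the struct is a
made-up flattened view of the SYMBOL_REF_* flags, and the surrounding
SYMBOL_REF_FAR_ADDR_P and ix86_cmodel tests are left out.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the flags for one symbol.  */
struct toy_symbol
{
  bool local_p;      /* SYMBOL_REF_LOCAL_P */
  bool weak_p;       /* SYMBOL_REF_WEAK */
  bool external_p;   /* SYMBOL_REF_EXTERNAL_P */
  bool function_p;   /* SYMBOL_REF_FUNCTION_P */
};

/* The inner part of the condition with the proposed replacement:
   only symbols that are both weak and external are rejected.  */
bool
toy_direct_access_ok (const struct toy_symbol *sym,
                      bool have_ld_pie_copyreloc, bool pie)
{
  return sym->local_p
         || (have_ld_pie_copyreloc
             && pie
             && !(sym->weak_p && sym->external_p)
             && !sym->function_p);
}
```

With this form a weak variable defined in the current TU (weak, not
external) is accepted, while a weak undefined reference (weak and
external) still goes through the GOT.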

Thanks
Sri

>
> Jakub


[SH][committed] Use shorter atomic sequences if result values are unused

2015-02-03 Thread Oleg Endo
Hi,

When the result values of atomic ops, such as the previous value of an
atomic_fetch_add, are unused, it's possible to use shorter asm sequences
on SH.  The attached patch does that by checking the reg unused notes of
the insns in split1 and replacing them with the simpler variants, if the
result value operand is unused.
This patch introduces a problem where memory aliasing info is lost,
because the mem operands are passed to the emitted insns not as such,
but only as the register holding the address.  This will be fixed with
PR 64661.
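
As a source-level illustration of the two cases (a sketch using C11
<stdatomic.h>; the variable and function names are made up, and which
asm sequence is emitted depends on the selected atomic model):

```c
#include <stdatomic.h>

atomic_int event_count;

/* Result unused: only the memory update matters, so the compiler is
   free to pick a sequence that never produces the old value.  */
void
count_event (void)
{
  atomic_fetch_add (&event_count, 1);
}

/* Result used: the previous value has to be returned to the caller.  */
int
take_ticket (void)
{
  return atomic_fetch_add (&event_count, 1);
}
```

The first function is the kind of code that should now get the shorter
sequence; the second one still needs the fetch variant.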

Committed as r220376.
Tested with make -k check-gcc RUNTESTFLAGS="sh.exp --target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2e/-ml,-m2e/-mb,-m3/-ml,-m3/-mb,-m3e/-ml,-m3e/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
to verify that the insns/splits actually work.  Will observe daily
sh4-linux test results for fallouts.

Cheers,
Oleg

gcc/ChangeLog:
PR target/64660
* config/sh/sync.md (atomic__hard,
atomic_not_hard, atomic__soft_tcb,
atomic_not_soft_tcb, atomic_nand_hard,
atomic_nand_soft_tcb): New insns.
(atomic_fetch_si_hard): Convert to insn_and_split.
Split into atomic__fetchsi_hard if operands[0] is unused.
(atomic_fetch_notsi_hard): Convert to insn_and_split.
Split into atomic_not_fetchsi_hard if operands[0] is unused.
(atomic_fetch__hard): Convert to insn_and_split.
Split into atomic__hard if operands[0] is unused.
(atomic_fetch_not_hard): Convert to insn_and_split.  Split into
atomic_not_hard if operands[0] is unused.
(atomic_fetch__soft_gusa): Convert to
insn_and_split.  Split into atomic__fetch_soft_gusa
if operands[0] is unused.
(atomic_fetch_not_soft_gusa): Convert to insn_and_split.  Split
into atomic_not_fetch_soft_gusa if operands[0] is unused.
(atomic_fetch__soft_tcb): Convert to insn_and_split.
Split into atomic__soft_tcb if operands[0] is
unused.
(atomic_fetch_not_soft_tcb): Convert to insn_and_split.  Split
into atomic_not_soft_tcb if operands[0] is unused.
(atomic_fetch__soft_imask): Convert to
insn_and_split.  Split into atomic__fetch_soft_imask
if operands[0] is unused.
(atomic_fetch_not_soft_imask): Convert to insn_and_split.  Split
into atomic_not_fetch_soft_imask if operands[0] is unused.
(atomic_fetch_nandsi_hard): Convert to insn_and_split.  Split into
atomic_nand_fetchsi_hard if operands[0] is unused.
(atomic_fetch_nand_hard): Convert to insn_and_split.  Split into
atomic_nand_hard if operands[0] is unused.
(atomic_fetch_nand_soft_gusa): Convert to insn_and_split.  Split
into atomic_nand_fetch_soft_gusa if operands[0] is unused.
(atomic_fetch_nand_soft_tcb): Convert to insn_and_split.  Split
into atomic_nand_soft_tcb if operands[0] is unused.
(atomic_fetch_nand_soft_imask): Convert to insn_and_split.  Split
into atomic_nand_fetch_soft_imask if operands[0] is unused.
(atomic__fetch_hard): Convert to insn_and_split.
Split into atomic__hard if operands[0] is unused.
(atomic_not_fetch_hard): Convert to insn_and_split.  Split into
atomic_not_hard if operands[0] is unused.
(atomic__fetch_soft_tcb): Convert to insn_and_split.
Split into atomic__soft_tcb if operands[0] is
unused.
(atomic_not_fetch_soft_tcb): Convert to insn_and_split.  Split
into atomic_not_soft_tcb if operands[0] is unused.
(atomic_nand_fetch_hard): Convert to insn_and_split.  Split into
atomic_nand_hard if operands[0] is unused.
(atomic_nand_fetch_soft_tcb): Convert to insn_and_split.  Split
into atomic_nand_soft_tcb if operands[0] is unused.

gcc/testsuite/ChangeLog:
PR target/64660
* gcc.target/sh/pr64660-0.h: New.
* gcc.target/sh/pr64660-1.c: New.
* gcc.target/sh/pr64660-2.c: New.
* gcc.target/sh/pr64660-3.c: New.
* gcc.target/sh/pr64660-4.c: New.

Index: gcc/testsuite/gcc.target/sh/pr64660-2.c
===
--- gcc/testsuite/gcc.target/sh/pr64660-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/sh/pr64660-2.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Check that the appropriate atomic insns are used if the result values
+   are unused.  */
+/* { dg-do compile { target { atomic_model_soft_tcb_available } } }  */
+/* { dg-options "-dp -O2 -matomic-model=soft-tcb,gbr-offset=0,strict" }  */
+/* { dg-final { scan-assembler-times "atomic_add" 12 } }  */
+/* { dg-final { scan-assembler-times "atomic_and" 6 } }  */
+/* { dg-final { scan-assembler-times "atomic_or" 6 } }  */
+/* { dg-final { scan-assembler-times "atomic_xor" 6 } }  */
+/* { dg-final { scan-assembler-times "atomic_nand" 6 } }  */
+/* { dg-final { scan-assembler-times "atomic_not" 12 } }  */
+/* { dg-final { scan-assembler-not "

Re: [gofrontend-dev] GO tools for gccgo cross

2015-02-03 Thread Ian Lance Taylor
On Tue, Feb 3, 2015 at 11:52 AM, Lynn A. Boger
 wrote:
>
> I've been experimenting with the go tools and how to make them work for
> cross gccgo builds.
>
> In golang I think there is just one 'go' tool and the cross build targets
> are specified by the setting of GOARCH and GOOS.  So why couldn't the same
> be done with gccgo?
>
> That means, on any given system, I think we just need to build the native
> gccgo and the go tools that are built with it.  No need to build different
> go tools for each cross target.  If a cross build is needed, then set the
> GOARCH and GOOS values appropriately and invoke the native go (for gccgo)
> tool.  Source changes are needed for the go tool source to determine the
> correct cross compiler name based on the GOARCH and GOOS settings, and then
> use that cross compiler where needed for building or linking instead of
> always using "gccgo".  I was able to make these changes and get this to work
> -- I built gccgo programs targeted for ppc64le on an x86_64.

I agree that setting GOARCH and GOOS should change the default name of
the gccgo compiler that the go tool uses.
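
Roughly the kind of lookup I have in mind, sketched in C for brevity
rather than as go tool source.  The table entries and the
"<triplet>-gccgo" naming are assumptions for illustration; the real
mapping would have to be complete and probably configurable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical GOARCH/GOOS -> GNU triplet table.  */
struct toy_target
{
  const char *goarch;
  const char *goos;
  const char *triplet;
};

static const struct toy_target toy_targets[] = {
  { "amd64",   "linux", "x86_64-linux-gnu" },
  { "ppc64le", "linux", "powerpc64le-linux-gnu" },
};

/* Write the default compiler name for GOARCH/GOOS into BUF.  A native
   build keeps plain "gccgo".  Returns 0 on success, -1 if the target
   is unknown or BUF is too small.  */
int
toy_gccgo_name (const char *host_arch, const char *host_os,
                const char *goarch, const char *goos,
                char *buf, size_t len)
{
  size_t i;
  int n;

  if (strcmp (goarch, host_arch) == 0 && strcmp (goos, host_os) == 0)
    {
      n = snprintf (buf, len, "gccgo");
      return (n < 0 || (size_t) n >= len) ? -1 : 0;
    }

  for (i = 0; i < sizeof toy_targets / sizeof toy_targets[0]; i++)
    if (strcmp (goarch, toy_targets[i].goarch) == 0
        && strcmp (goos, toy_targets[i].goos) == 0)
      {
        n = snprintf (buf, len, "%s-gccgo", toy_targets[i].triplet);
        return (n < 0 || (size_t) n >= len) ? -1 : 0;
      }

  return -1;
}
```

An explicit setting for the compiler name would presumably still take
priority over a default chosen this way.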

Ian


Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-02-03 Thread Ilya Verbin
On 03 Feb 13:00, Julian Brown wrote:
> On Tue, 3 Feb 2015 14:28:44 +0300
> Ilya Verbin  wrote:
> > On 27 Jan 14:07, Julian Brown wrote:
> > > On Mon, 26 Jan 2015 17:34:26 +0300
> > > Ilya Verbin  wrote:
> > > > Here is my current patch, it works for OpenMP->MIC, but obviously
> > > > will not work for PTX, since it requires symmetrical changes in
> > > > the plugin.  Could you please take a look, whether it is possible
> > > > to support this new interface in PTX plugin?
> > > 
> > > I think it can probably be made to work. I'll have a look in more
> > > detail.
> > 
> > Do you have any progress on this?
> 
> I'm still working on a patch to update OpenACC support and the PTX
> backend to use load/unload_image and to unify initialisation/"opening".
> So far I think the answer is basically "yes, the new interface can be
> supported", though I might request a minor tweak -- e.g. that
> load_image takes an extra "void **" argument so that a libgomp backend
> can allocate a block of generic metadata relating to the image, then
> that same block would be passed (void *) to the unload hook so the
> backend can use it there and deallocate it when it's finished with.
> 
> Would that be possible? (It'd mostly be for a "CUmodule" handle: this
> could be stashed away somewhere within the nvptx backend, but it might
> be neater to put it in generic code since it'll probably be useful for
> other backends anyway.)

An extra argument is not a problem, however I don't quite get the idea.
The PTX plugin allocates some data while loading, and needs this data while
unloading?  Then why not create a hash table with an image_ptr -> metadata
mapping inside the plugin?  In that case, the unload hook can deallocate the
metadata using the image_ptr key.  Since this metadata is target-specific,
I believe it would be better to keep it inside the plugin.

  -- Ilya
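
A minimal sketch of the plugin-side table suggested above, assuming a small chained hash table keyed by the image pointer. The names image_map_insert and image_map_remove are hypothetical and are not part of the libgomp plugin interface; the sketch is also not thread-safe.

```c
#include <stdint.h>
#include <stdlib.h>

#define IMAGE_MAP_BUCKETS 64

struct image_entry
{
  const void *image_ptr;
  void *metadata;
  struct image_entry *next;
};

static struct image_entry *image_map[IMAGE_MAP_BUCKETS];

static size_t
image_hash (const void *image_ptr)
{
  /* Drop the low bits, which tend to be zero for aligned images.  */
  return ((uintptr_t) image_ptr >> 4) % IMAGE_MAP_BUCKETS;
}

/* Would be called from the load hook.  Returns 0 on success and -1 if
   memory allocation fails.  */
static int
image_map_insert (const void *image_ptr, void *metadata)
{
  struct image_entry *e = malloc (sizeof *e);
  if (!e)
    return -1;
  size_t h = image_hash (image_ptr);
  e->image_ptr = image_ptr;
  e->metadata = metadata;
  e->next = image_map[h];
  image_map[h] = e;
  return 0;
}

/* Would be called from the unload hook.  Removes the entry for IMAGE_PTR
   and returns its metadata, or NULL if the image was never registered.
   The caller is responsible for releasing the metadata itself.  */
static void *
image_map_remove (const void *image_ptr)
{
  struct image_entry **link = &image_map[image_hash (image_ptr)];
  for (; *link; link = &(*link)->next)
    if ((*link)->image_ptr == image_ptr)
      {
        struct image_entry *e = *link;
        void *metadata = e->metadata;
        *link = e->next;
        free (e);
        return metadata;
      }
  return NULL;
}
```

With this shape the generic load/unload interface needs no extra argument: the unload hook looks up whatever the load hook stored (for example a module handle) by the same image pointer.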


GO tools for gccgo cross

2015-02-03 Thread Lynn A. Boger

Hi,

I've been experimenting with the go tools and how to make them work for 
cross gccgo builds.


In golang I think there is just one 'go' tool and the cross build 
targets are specified by the setting of GOARCH and GOOS.  So why 
couldn't the same be done with gccgo?


That means, on any given system, I think we just need to build the 
native gccgo and the go tools that are built with it.  No need to build 
different go tools for each cross target.  If a cross build is needed, 
then set the GOARCH and GOOS values appropriately and invoke the native 
go (for gccgo) tool.  Source changes are needed for the go tool source 
to determine the correct cross compiler name based on the GOARCH and 
GOOS settings, and then use that cross compiler where needed for 
building or linking instead of always using "gccgo".  I was able to make 
these changes and get this to work -- I built gccgo programs targeted 
for ppc64le on an x86_64.





Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Jakub Jelinek
On Tue, Feb 03, 2015 at 11:25:38AM -0800, Sriraman Tallam wrote:
> This was the original patch to i386.c to let global accesses take
> advantage of copy relocations and avoid the GOT.
> 
> 
> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>   return true;
>  }
>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -   && SYMBOL_REF_LOCAL_P (op0)
> +   && (SYMBOL_REF_LOCAL_P (op0)
> +   || (HAVE_LD_PIE_COPYRELOC
> +   && flag_pie
> +   && !SYMBOL_REF_WEAK (op0)
> +   && !SYMBOL_REF_FUNCTION_P (op0)))
> && ix86_cmodel != CM_LARGE_PIC)
> 
> I do not understand here why weak global data access must go through
> the GOT and not use copy relocations. Ultimately, there is only going
> to be one copy of the global either defined in the executable or the
> shared object right?
> 
> Can we remove the check for SYMBOL_REF_WEAK?

So, what will then happen if the weak undef symbol isn't defined anywhere?
In non-PIE binaries that is fine, the linker will store 0.
But in PIE binaries, the 0 would be biased by the PIE load bias and thus
wouldn't be NULL.
You can only optimize weak vars if there is some weak definition in the
current TU.

Jakub
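
The point about the load bias can be modelled with a little arithmetic. This is a deliberately simplified model written for illustration, not linker or GCC code; the function names and the example bias are assumptions.

```c
#include <stdint.h>

/* Address obtained when the code computes &sym directly (PC-relative,
   no GOT): the link-time value plus the load bias of the executable.
   An undefined weak symbol has a link-time value of 0.  */
static uintptr_t
direct_address (uintptr_t load_bias, uintptr_t link_time_value)
{
  return load_bias + link_time_value;
}

/* Address read from a GOT slot: the slot holds the absolute address of a
   defined symbol, or 0 when a weak symbol stays unresolved.  */
static uintptr_t
got_address (uintptr_t load_bias, uintptr_t link_time_value, int is_defined)
{
  return is_defined ? load_bias + link_time_value : 0;
}
```

In a non-PIE executable the bias is 0, so direct_address (0, 0) is NULL as expected. In a PIE the bias is nonzero, so a direct access to an undefined weak symbol yields the bias itself and a test such as "if (&sym)" wrongly succeeds, whereas the GOT-based access still yields 0.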


[PATCH, committed] jit: option-logging

2015-02-03 Thread David Malcolm
It seems prudent to have the jit's log files contain information on
the values of the various options in use.

This patch adds logging when options are changed, and logs the value
of all options when a compile is started (and no-ops for the common
case when no logging is enabled).

Tested via "make check-jit".

Committed to trunk as r220375.

gcc/jit/ChangeLog:
* jit-logging.h (gcc::jit::log_user::log): Make const.
* jit-recording.c (gcc::jit::recording::context::set_str_option):
Log the new value of the option.
(gcc::jit::recording::context::set_int_option): Likewise.
(gcc::jit::recording::context::set_bool_option): Likewise.
(gcc::jit::recording::context::compile): Log the value of all
options.
(gcc::jit::recording::context::compile_to_file): Likewise.
(gcc::jit::recording::context::log_all_options): New function.
(gcc::jit::recording::context::log_str_option): New function.
(gcc::jit::recording::context::log_int_option): New function.
(gcc::jit::recording::context::log_bool_option): New function.
* jit-recording.h (gcc::jit::recording::context::log_all_options):
New function.
(gcc::jit::recording::context::log_str_option): New function.
(gcc::jit::recording::context::log_int_option): New function.
(gcc::jit::recording::context::log_bool_option): New function.
* docs/internals/test-hello-world.exe.log.txt: Update for above
changes.
---
 .../docs/internals/test-hello-world.exe.log.txt| 17 +
 gcc/jit/jit-logging.h  |  4 +-
 gcc/jit/jit-recording.c| 73 ++
 gcc/jit/jit-recording.h|  5 ++
 4 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/gcc/jit/docs/internals/test-hello-world.exe.log.txt 
b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
index a9abc10..5cb3aef 100644
--- a/gcc/jit/docs/internals/test-hello-world.exe.log.txt
+++ b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
@@ -1,18 +1,25 @@
 JIT: libgccjit (GCC) version 5.0.0 20150123 (experimental) 
(x86_64-unknown-linux-gnu)
 JIT:   compiled by GNU C version 4.8.3 20140911 (Red Hat 4.8.3-7), GMP version 
5.1.2, MPFR version 3.1.2, MPC version 1.0.1
 JIT: entering: gcc_jit_context_set_str_option
+JIT:  GCC_JIT_STR_OPTION_PROGNAME: "./test-hello-world.c.exe"
 JIT: exiting: gcc_jit_context_set_str_option
 JIT: entering: gcc_jit_context_set_int_option
+JIT:  GCC_JIT_INT_OPTION_OPTIMIZATION_LEVEL: 3
 JIT: exiting: gcc_jit_context_set_int_option
 JIT: entering: gcc_jit_context_set_bool_option
+JIT:  GCC_JIT_BOOL_OPTION_DEBUGINFO: true
 JIT: exiting: gcc_jit_context_set_bool_option
 JIT: entering: gcc_jit_context_set_bool_option
+JIT:  GCC_JIT_BOOL_OPTION_DUMP_INITIAL_TREE: false
 JIT: exiting: gcc_jit_context_set_bool_option
 JIT: entering: gcc_jit_context_set_bool_option
+JIT:  GCC_JIT_BOOL_OPTION_DUMP_INITIAL_GIMPLE: false
 JIT: exiting: gcc_jit_context_set_bool_option
 JIT: entering: gcc_jit_context_set_bool_option
+JIT:  GCC_JIT_BOOL_OPTION_SELFCHECK_GC: true
 JIT: exiting: gcc_jit_context_set_bool_option
 JIT: entering: gcc_jit_context_set_bool_option
+JIT:  GCC_JIT_BOOL_OPTION_DUMP_SUMMARY: false
 JIT: exiting: gcc_jit_context_set_bool_option
 JIT: entering: gcc_jit_context_get_type
 JIT: exiting: gcc_jit_context_get_type
@@ -47,6 +54,16 @@ JIT: exiting: gcc_jit_context_dump_reproducer_to_file
 JIT: entering: gcc_jit_context_compile
 JIT:  in-memory compile of ctxt: 0x1283e20
 JIT:  entering: gcc::jit::result* gcc::jit::recording::context::compile()
+JIT:   GCC_JIT_STR_OPTION_PROGNAME: "./test-hello-world.c.exe"
+JIT:   GCC_JIT_INT_OPTION_OPTIMIZATION_LEVEL: 3
+JIT:   GCC_JIT_BOOL_OPTION_DEBUGINFO: true
+JIT:   GCC_JIT_BOOL_OPTION_DUMP_INITIAL_TREE: false
+JIT:   GCC_JIT_BOOL_OPTION_DUMP_INITIAL_GIMPLE: false
+JIT:   GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE: false
+JIT:   GCC_JIT_BOOL_OPTION_DUMP_SUMMARY: false
+JIT:   GCC_JIT_BOOL_OPTION_DUMP_EVERYTHING: false
+JIT:   GCC_JIT_BOOL_OPTION_SELFCHECK_GC: true
+JIT:   GCC_JIT_BOOL_OPTION_KEEP_INTERMEDIATES: false
 JIT:   entering: void gcc::jit::recording::context::validate()
 JIT:   exiting: void gcc::jit::recording::context::validate()
 JIT:   entering: 
gcc::jit::playback::context::context(gcc::jit::recording::context*)
diff --git a/gcc/jit/jit-logging.h b/gcc/jit/jit-logging.h
index 48f223d..9ece0df 100644
--- a/gcc/jit/jit-logging.h
+++ b/gcc/jit/jit-logging.h
@@ -127,7 +127,7 @@ class log_user
   logger * get_logger () const { return m_logger; }
   void set_logger (logger * logger);
 
-  void log (const char *fmt, ...)
+  void log (const char *fmt, ...) const
 GNU_PRINTF(2, 3);
 
   void enter_scope (const char *scope_name);
@@ -141,7 +141,7 @@ class log_user
case where the underlying logger is NULL via a no-op.  */
 
 inline void
-log_user::log (const char *fmt, ...)
+log_user::log (const char *fmt, ...) const

Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Sriraman Tallam
+davidxl +ccoutant

On Tue, Feb 3, 2015 at 11:25 AM, Sriraman Tallam  wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu  wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak  wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu  wrote:
>>>
> It would probably help reviewers if you pointed to actual patch
> submission [1], which unfortunately contains the explanation in the
> patch itself [2], which further explains that this functionality is
> currently only supported with gold, patched with [3].
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>
> After a bit of the above detective work, I think that new gcc option
> is not necessary. The configure should detect if new functionality is
> supported in the linker, and auto-configure gcc to use it when
> appropriate.

 I think GCC option is needed since one can use -fuse-ld= to
 change linker.
>>>
>>> IMO, nobody will use this highly special x86_64-only option. It would
>>> be best for gnu-ld to reach feature parity with gold as far as this
>>> functionality is concerned. In this case, the optimization would be
>>> auto-configured, and would fire automatically, without any user
>>> intervention.
>>>
>>
>> Let's do it.  I implemented the same feature in bfd linker on both
>> master and 2.25 branch.
>>
>
> +bool
> +i386_binds_local_p (const_tree exp)
> +{
> +  /* Globals marked extern are treated as local when linker copy 
> relocations
> + support is available with -f{pie|PIE}.  */
> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
> +  && TREE_CODE (exp) == VAR_DECL
> +  && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
> +return true;
> +  return default_binds_local_p (exp);
> +}
> +
>
> It returns true with -fPIE and false without -fPIE.  It is lying to 
> compiler.
> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
 Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch.   OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> --
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   :
>>    mov    0x165a(%rip),%eax        # 1c40
>>
>> foo.cc
>> --
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern, defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using
>> two memory loads:
>>
>> 6f0  :
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. Result is that no global access
>> needs to go through the GOT and hence improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>
> Hi H.J.,
>
> This was the original patch to i386.c to let global accesses take
> advantage of copy relocations and avoid the GOT.
>
>
> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>   return true;
>  }
>else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -   && SYMBOL_REF_LOCAL_P (op0)
> +   && (SYMBOL_REF_LOCAL_P (op0)
> +   || (HAVE_LD_PIE_COPYRELOC
> +   && flag_pie
> +   && !SYMBOL_REF_WEAK (op0)
> +   && !SYMBOL_REF_FUNCTION_P (op0)))
> && ix86_cmodel != CM_LARGE_PIC)
>
> I do not understand here why weak global data access must go through
> the GOT and not use copy relocations. Ultimately, there is only going
> to be one copy of the global either defined in the executable or the
> shared object right?
>
> Can we remove t

Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-03 Thread Sriraman Tallam
On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu  wrote:
> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak  wrote:
>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu  wrote:
>>
 It would probably help reviewers if you pointed to actual patch
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.
>>>
>>> I think GCC option is needed since one can use -fuse-ld= to
>>> change linker.
>>
>> IMO, nobody will use this highly special x86_64-only option. It would
>> be best for gnu-ld to reach feature parity with gold as far as this
>> functionality is concerned. In this case, the optimization would be
>> auto-configured, and would fire automatically, without any user
>> intervention.
>>
>
> Let's do it.  I implemented the same feature in bfd linker on both
> master and 2.25 branch.
>

 +bool
 +i386_binds_local_p (const_tree exp)
 +{
 +  /* Globals marked extern are treated as local when linker copy 
 relocations
 + support is available with -f{pie|PIE}.  */
 +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
 +  && TREE_CODE (exp) == VAR_DECL
 +  && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
 +return true;
 +  return default_binds_local_p (exp);
 +}
 +

 It returns true with -fPIE and false without -fPIE.  It is lying to 
 compiler.
 Maybe legitimate_pic_address_disp_p is a better place.
>>
>> Agreed.
>>
>>> Something like this?
>>
>> Yes.
>>
>> OK, if Jakub doesn't have any objections here. Please also add
>> Sriraman as author to ChangeLog entry.
>>
>> Thanks,
>> Uros.
>
> Here is the patch.   OK to install?
>
> Thanks.
>
> --
> H.J.
> ---
> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> --
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   :
>    mov    0x165a(%rip),%eax        # 1c40
>
> foo.cc
> --
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern, defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  :
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable. Result is that no global access
> needs to go through the GOT and hence improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.

Hi H.J.,

This was the original patch to i386.c to let global accesses take
advantage of copy relocations and avoid the GOT.


@@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
  return true;
 }
   else if (!SYMBOL_REF_FAR_ADDR_P (op0)
-   && SYMBOL_REF_LOCAL_P (op0)
+   && (SYMBOL_REF_LOCAL_P (op0)
+   || (HAVE_LD_PIE_COPYRELOC
+   && flag_pie
+   && !SYMBOL_REF_WEAK (op0)
+   && !SYMBOL_REF_FUNCTION_P (op0)))
&& ix86_cmodel != CM_LARGE_PIC)

I do not understand here why weak global data access must go through
the GOT and not use copy relocations. Ultimately, there is only going
to be one copy of the global either defined in the executable or the
shared object right?

Can we remove the check for SYMBOL_REF_WEAK?

Thanks
Sri



>
> This patch checks if linker supports PIE with copy reloc, which is
> enabled in gold and bfd linker in binutils 2.25, at configure time
> and enables this optimization if the linker support is 

Re: [patch+7.9] compile: Filter out -fpreprocessed

2015-02-03 Thread Mark Wielaard
On Tue, 2015-02-03 at 19:59 +0100, Jan Kratochvil wrote:
> On Tue, 03 Feb 2015 19:50:40 +0100, Doug Evans wrote:
> > On Fri, Jan 16, 2015 at 2:42 PM, Jan Kratochvil
> >  wrote:
> > > [...]
> > > It is wrong that gcc puts -fpreprocessed into DW_AT_producer - I may post 
> > > a gcc
> > > patch for it.
> > 
> > I wasn't aware there are now rules for what can and cannot go in 
> > DW_AT_producer.
> > DW_AT_producer has gone from being informational to having a formal
> > spec (in the sense that something will break if, for example, a
> > particular option is mentioned).
> > Is this spec written down somewhere? [At least guidelines for what
> > things may lead to breakage?]
> 
> No. Do you have a suggestion where to put it? Should it be only a GNU
> extension or should it be even DWARF-standardized?

The gcc documentation describes it:
https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html

-grecord-gcc-switches
This switch causes the command-line options used to invoke the
compiler that may affect code generation to be appended to the
DW_AT_producer attribute in DWARF debugging information. The
options are concatenated with spaces separating them from each
other and from the compiler version. See also
-frecord-gcc-switches for another way of storing compiler
options into the object file. This is the default.

-gno-record-gcc-switches
Disallow appending command-line options to the DW_AT_producer
attribute in DWARF debugging information.

So Jan is right that gcc adding -fpreprocessed, which doesn't affect
code generation, but is a preprocessor option, shouldn't be there.

Cheers,

Mark
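
For context, filtering an option such as -fpreprocessed out of a DW_AT_producer string amounts to dropping one space-separated token. The sketch below shows one way to do that; it is not the code from the patch under discussion, the helper name filter_producer_option is hypothetical, and the producer string in the usage note is only an example.

```c
#include <string.h>

/* Copy PRODUCER into OUT (capacity SIZE), skipping every space-separated
   token that is exactly equal to OPTION.  Runs of spaces are collapsed to
   a single space.  OUT is always NUL-terminated when SIZE is nonzero; if
   OUT is too small, copying stops at the last token that fits.  */
static char *
filter_producer_option (const char *producer, const char *option,
                        char *out, size_t size)
{
  size_t optlen = strlen (option);
  size_t used = 0;

  if (size == 0)
    return out;
  out[0] = '\0';

  while (*producer)
    {
      /* Skip separators, then measure the next token.  */
      while (*producer == ' ')
        producer++;
      size_t len = strcspn (producer, " ");
      if (len == 0)
        break;

      if (!(len == optlen && strncmp (producer, option, len) == 0))
        {
          size_t need = len + (used ? 1 : 0);
          if (used + need >= size)
            break;              /* No room for token plus terminator.  */
          if (used)
            out[used++] = ' ';
          memcpy (out + used, producer, len);
          used += len;
          out[used] = '\0';
        }
      producer += len;
    }
  return out;
}
```

For example, given "GNU C 4.9.2 -mtune=generic -fpreprocessed -g -O2" and the option "-fpreprocessed", the result is "GNU C 4.9.2 -mtune=generic -g -O2". Matching whole tokens avoids removing options that merely start with the same text.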


Re: [patch+7.9] compile: Filter out -fpreprocessed

2015-02-03 Thread Jan Kratochvil
On Tue, 03 Feb 2015 19:50:40 +0100, Doug Evans wrote:
> On Fri, Jan 16, 2015 at 2:42 PM, Jan Kratochvil
>  wrote:
> > [...]
> > It is wrong that gcc puts -fpreprocessed into DW_AT_producer - I may post a 
> > gcc
> > patch for it.
> 
> Hi.
> I wasn't aware there are now rules for what can and cannot go in 
> DW_AT_producer.
> DW_AT_producer has gone from being informational to having a formal
> spec (in the sense that something will break if, for example, a
> particular option is mentioned).
> Is this spec written down somewhere? [At least guidelines for what
> things may lead to breakage?]

No. Do you have a suggestion where to put it? Should it be only a GNU
extension or should it be even DWARF-standardized?


Jan


Re: [PATCH, testsuite] Fix PR64796: bswap64 effective target should not cache its result

2015-02-03 Thread Jeff Law

On 01/27/15 02:36, Thomas Preud'homme wrote:

As explained in PR64796, code for the bswap64 effective target computes the
answer once and then caches it.  However, the result depends on the flags
passed to the compiler, and with --target_board it's possible to test several
sets of flags.  Besides, this code assumes only lp64 targets can do 64-bit
bswap, when all 32-bit targets also can by virtue of expand_doubleword_bswap ()
called in expand_unop ().  This patch solves both problems by removing the
caching of the result and changing the condition to include all targets with a
wordsize of 32 bits or more.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2015-01-27  Thomas Preud'homme  

 PR testsuite/64796
 * lib/target-supports.exp (check_effective_target_bswap64): Do not
 cache result in a global variable.  Include all 32-bit targets for
 bswap64 tests.

OK.
jeff
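
The reason a 32-bit target can still do a 64-bit bswap is that the doubleword swap decomposes into two word swaps plus an exchange of the halves. The following C sketch shows that decomposition at the source level; it illustrates the idea only and is not the RTL that expand_doubleword_bswap generates. The function names are made up for this example.

```c
#include <stdint.h>

/* Byte-swap one 32-bit word.  */
static uint32_t
bswap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Byte-swap a 64-bit value using only 32-bit swaps: swap each half,
   then exchange the two halves.  */
static uint64_t
bswap64_by_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) bswap32 (lo) << 32) | bswap32 (hi);
}
```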



Re: [patch+7.9] compile: Filter out -fpreprocessed

2015-02-03 Thread Doug Evans
On Fri, Jan 16, 2015 at 2:42 PM, Jan Kratochvil
 wrote:
> [...]
> It is wrong that gcc puts -fpreprocessed into DW_AT_producer - I may post a 
> gcc
> patch for it.

Hi.
I wasn't aware there are now rules for what can and cannot go in DW_AT_producer.
DW_AT_producer has gone from being informational to having a formal
spec (in the sense that something will break if, for example, a
particular option is mentioned).
Is this spec written down somewhere? [At least guidelines for what
things may lead to breakage?]


Re: [PATCH, CHKP] Follow alias chain for decl visibility and aliases

2015-02-03 Thread Jeff Law

On 01/30/15 15:18, Ilya Enkovich wrote:

Hi,

This patch fixes two more cases where alias chain should be followed to emit 
correct assembler name for instrumented functions.  Bootstrapped and tested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-01-30  Ilya Enkovich  

* varasm.c (do_assemble_alias): Follow transparent alias
chain for target.
(default_assemble_visibility): Follow transparent alias
chain for decl name.

gcc/testsuite/

2015-01-30  Ilya Enkovich  

* gcc.target/i386/chkp-hidden-def.c: New.

OK.
jeff



Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-02-03 Thread Jeff Law

On 01/27/15 07:27, Jakub Jelinek wrote:

Hi!

I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
far fewer enum values than the 1204 that are being defined.

This patch requires builtins.def to say explicitly (by using
DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
ones need that, for all the others only space in the enum is reserved and
nothing else.

I'd hope this could work around the buggy AIX stabs handling, but even
on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
without dwz between TUs)).  The cost without dwz is obviously mainly
from repeating that in most of the translation units.  But why declare
BUILT_IN_*_CHKP enums that are never used by anything...

2015-01-27  Jakub Jelinek  

* builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
(DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
(DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
of DEF_BUILTIN.
(BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
(BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
* tree-core.h (enum built_in_function): In between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
for builtins that use DEF_BUILTIN_CHKP macro.

Pretty sneaky how you arrange to get the right holes so that the code in 
ipa-chkp still works (oldcode + BUILTIN_CHKP_BUILTINS + 1) references.


OK for the trunk.

jeff
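
The "holes" trick can be sketched with a small X-macro list. This is a simplified illustration of the approach described in the ChangeLog, not the actual builtins.def or tree-core.h code; the list, the B_ names and the helper macros are all made up for this example.

```c
/* Hypothetical builtin list: PLAIN entries have no checker variant,
   CHKP entries do.  */
#define BUILTIN_LIST(PLAIN, CHKP) \
  PLAIN (B_ABS)                   \
  CHKP (B_MEMCPY)                 \
  PLAIN (B_SIN)                   \
  CHKP (B_STRLEN)

#define EMIT_NAME(name) name,
#define EMIT_NOTHING(name)
#define EMIT_CHKP(name) name##_CHKP = name + B_BEGIN_CHKP + 1,

enum builtin
{
  /* First pass: every builtin gets an ordinary enum value.  */
  BUILTIN_LIST (EMIT_NAME, EMIT_NAME)

  B_BEGIN_CHKP,

  /* Second pass: only builtins marked CHKP get a named value.  Its
     number is pinned explicitly, so the slots of the unnamed builtins
     stay reserved as holes.  */
  BUILTIN_LIST (EMIT_NOTHING, EMIT_CHKP)

  B_END_CHKP = B_BEGIN_CHKP * 2 + 1
};
```

Because each named value is computed as "code + B_BEGIN_CHKP + 1", code that maps an ordinary builtin to its checker counterpart by that same arithmetic keeps working, even though most of the counterparts no longer have an enumerator of their own.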



RE: [PATCH MIPS RFA] Regression cleanup for nan2008 toolchain

2015-02-03 Thread Moore, Catherine


> -Original Message-
> From: Robert Suchanek [mailto:robert.sucha...@imgtec.com]
> Sent: Monday, February 02, 2015 11:18 AM
> To: Richard Sandiford
> Cc: gcc-patches@gcc.gnu.org; Matthew Fortune; Moore, Catherine
> Subject: RE: [PATCH MIPS RFA] Regression cleanup for nan2008 toolchain
> 
> 
> > Please could you add a comment explaining that the mips_nanlegacy is
> > there because of the #include of system headers that might not compile
> > with -mnan=legacy?  I agree that that's a good reason, but it's not
> > obvious without a comment.  (And without a comment this could start a
> > precedent of things being skipped in cases where the mips.exp options
> > machinery could be updated instead.)
> >
> 
> True.  Clarification added.
> 
> Ok for trunk?
> 
> Regards,
> Robert
> 
> 2015-02-02  Robert Suchanek  
> 
>* gcc.target/mips/loongson-simd.c: Update comment to clarify the need
>for mips_nanlegacy target.
> 
> diff --git a/gcc/testsuite/gcc.target/mips/loongson-simd.c
> b/gcc/testsuite/gcc.target/mips/loongson-simd.c
> index 949632e..9c3ebce 100644
> --- a/gcc/testsuite/gcc.target/mips/loongson-simd.c
> +++ b/gcc/testsuite/gcc.target/mips/loongson-simd.c
> @@ -21,7 +21,10 @@ along with GCC; see the file COPYING3.  If not see
>  /* { dg-do run } */
>  /* loongson.h does not handle or check for MIPS16ness or
> microMIPSness.  There doesn't seem any good reason for it to, given
> -   that the Loongson processors do not support either.  */
> +   that the Loongson processors do not support either.  The effective target
> +   mips_nanlegacy is required for a toolchain without the legacy NaN support
> +   because inclusion of some system headers e.g. stdint.h will fail due to 
> not
> +   finding stubs-o32_hard.h.  */
>  /* { dg-require-effective-target mips_nanlegacy } */
>  /* { dg-options "isa=loongson -mhard-float -mno-micromips -mno-mips16 -
> flax-vector-conversions" } */
> 

Thanks for the update.  This is OK.
Catherine


Re: Bug 62044 - [4.8/4.9 Regression] ICE in USE statement with RENAME for extended derived type

2015-02-03 Thread H.J. Lu
On Tue, Jan 27, 2015 at 1:09 PM, Paul Richard Thomas
 wrote:
> Dear All,
>
> The highly embarrassing bug in mold = allocations to class entities
> has been fixed in revisions 220140 and 220191 for trunk and 4.9
> respectively. The PR has been set as RESOLVED.
>

This bug fix may have caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64921


-- 
H.J.


Re: [PATCH][ARM][PING] __ARM_FP & __ARM_NEON_FP defined when -march=armv7-m

2015-02-03 Thread Richard Earnshaw
On 06/01/15 09:40, Mantas Mikaitis wrote:
> 
> Ping and changelog spaces removed.
> 
> Thank you,
> Mantas M.
> 
> On 18/11/14 11:58, Richard Earnshaw wrote:
>> On 18/11/14 11:30, Mantas Mikaitis wrote:
>>> Incorrect predefinitions for certain target architectures. E.g. armv7-m
>>> does not contain NEON but the definition __ARM_NEON_FP was switched on.
>>> Similarly with armv6 and even armv2.
>>>
>>> This patch fixes the predefines for each of the different chips
>>> containing certain types of the FPU implementations.
>>>
>>> Tests:
>>>
>>> Tested on arm-none-linux-gnueabi and arm-none-linux-gnueabihf without
>>> any new regression.
>>>
>>> Manually compiled for various targets and all correct definitions were
>>> present.
>>>
>>> Is this patch ok for trunk?
>>>
>>> Mantas
>>>
>>> gcc/Changelog:
>>>
>>>* config/arm/arm.h (TARGET_NEON_FP): Removed conditional definition, 
>>> define to zero if !TARGET_NEON.
>>>  (TARGET_CPU_CPP_BUILTINS): Added second condition before defining 
>>> __ARM_FP macro.
>>>
>>>
>>> ARM_DEFS.patch
>>>
>>>
>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>> index ff4ddac..325fea9 100644
>>> --- a/gcc/config/arm/arm.h
>>> +++ b/gcc/config/arm/arm.h
>>> @@ -118,7 +118,7 @@ extern char arm_arch_name[];
>>> if (TARGET_VFP) \
>>>   builtin_define ("__VFP_FP__");\
>>> \
>>> -   if (TARGET_ARM_FP)  \
>>> +   if (TARGET_ARM_FP && !TARGET_SOFT_FLOAT)\
>> Wouldn't it be better to factor this into TARGET_ARM_FP?  It seems odd
>> that that macro returns a set of values based on something completely
>> unavailable for the current compilation.  That would also then mirror
>> the behaviour of TARGET_NEON_FP (see below) and make the internal macros
>> more consistent.
>>
>> R.
> 
> Thank you. Patch updated.
> 
> Ok for trunk?
> 
> Mantas M.
> 
> gcc/Changelog
> 
> 2014-12-03  Mantas Mikaitis  
> 
>* config/arm/arm.h (TARGET_NEON_FP): Removed conditional 
> definition, define to zero if !TARGET_NEON.
>  (TARGET_ARM_FP): Added !TARGET_SOFT_FLOAT into the conditional 
> definition.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.target/arm/macro_defs0.c: New test.
>* gcc.target/arm/macro_defs1.c: New test.
>* gcc.target/arm/macro_defs2.c: New test.
> 
> 

OK.  However, watch your ChangeLog line length (80 char limit).  Also,
entries (even continuation lines) should be indented with exactly one
(hard) tab.

R.
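
For reference, the bit-mask logic of the approved patch (quoted below) can be restated as plain C functions. This sketch only mirrors the macro expressions from the quoted diff; the function names and the use of int flags in place of the TARGET_* macros are assumptions made for illustration.

```c
/* Mask of supported hardware floating-point widths, as described in the
   arm.h comment: value 2 means 16-bit, 4 means 32-bit, 8 means 64-bit.
   Mirrors the patched TARGET_ARM_FP: always 0 for soft-float.  */
static int
target_arm_fp (int soft_float, int vfp_single, int vfp_double, int fp16)
{
  if (soft_float)
    return 0;
  if (vfp_single)
    return 4;
  if (vfp_double)
    return fp16 ? 14 : 12;
  return 0;
}

/* Mirrors the patched TARGET_NEON_FP: the TARGET_ARM_FP mask without the
   64-bit bit, or 0 when NEON is not available.  */
static int
target_neon_fp (int neon, int arm_fp)
{
  return neon ? (arm_fp & (0xff ^ 0x08)) : 0;
}
```

With soft-float selected, as in the -march=armv7-m -mfloat-abi=soft test, both values are 0, which is the condition under which the new tests expect __ARM_FP and __ARM_NEON_FP to be undefined.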

> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> mypatch.patch
> 
> 
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index ff4ddac..7d4cc39 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -2343,17 +2343,17 @@ extern int making_const_table;
> point types.  Where bit 1 indicates 16-bit support, bit 2 indicates
> 32-bit support, bit 3 indicates 64-bit support.  */
>  #define TARGET_ARM_FP\
> -  (TARGET_VFP_SINGLE ? 4 \
> -  : (TARGET_VFP_DOUBLE ? (TARGET_FP16 ? 14 : 12) : 0))
> +  (!TARGET_SOFT_FLOAT ? (TARGET_VFP_SINGLE ? 4   \
> + : (TARGET_VFP_DOUBLE ? (TARGET_FP16 ? 14 : 12) : 0)) \
> +   : 0)
>  
>  
>  /* Set as a bit mask indicating the available widths of floating point
> types for hardware NEON floating point.  This is the same as
> TARGET_ARM_FP without the 64-bit bit set.  */
> -#ifdef TARGET_NEON
> -#define TARGET_NEON_FP   \
> -  (TARGET_ARM_FP & (0xff ^ 0x08))
> -#endif
> +#define TARGET_NEON_FP\
> +  (TARGET_NEON ? (TARGET_ARM_FP & (0xff ^ 0x08)) \
> +: 0)
>  
>  /* The maximum number of parallel loads or stores we support in an ldm/stm
> instruction.  */
> diff --git a/gcc/testsuite/gcc.target/arm/macro_defs0.c 
> b/gcc/testsuite/gcc.target/arm/macro_defs0.c
> new file mode 100644
> index 000..198243e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/macro_defs0.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "avoid conflicting multilib options"
> +   { *-*-* } { "-march=*" } {"-march=armv7-m"} } */
> +/* { dg-skip-if "avoid conflicting multilib options"
> +   { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
> +/* { dg-options "-march=armv7-m -mcpu=cortex-m3 -mfloat-abi=soft -mthumb" } 
> */
> +
> +#ifdef __ARM_FP
> +#error __ARM_FP should not be defined
> +#endif
> +
> +#ifdef __ARM_NEON_FP
> +#error __ARM_NEON_FP should not be defined
> +#endif
> diff --git a/gcc/testsuite/gcc.target/arm/macro_defs1.c 
> b/gcc/testsuite/gcc.target/arm/macro_defs1.c
> new file mode 100644
> index 000..075b71b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/macro_defs1.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "avoid conflicting multilib options"
> +   { *-*-* } { "-march=*" } { "-march=armv6-m" } } */
> +/* { dg-opt

Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2015-02-03 Thread Jeff Law

On 02/03/15 01:29, Bin.Cheng wrote:


> Hmm, if I understand correctly, it's a code size regression, so I
> don't think it's appropriate to adapt the test case.  Either the patch
> or something else in GCC is doing something wrong, right?
>
> Hi Alex, could you please file a PR with full dump information for tracking?

But if the code size regression is due to the older compiler incorrectly 
handling the promotion of REG_EQUAL to REG_EQUIV notes, then the test 
absolutely does need updating as the code size was dependent on incorrect 
behaviour in the compiler.


jeff


Re: [C++ Patch/RFC] PR 64877

2015-02-03 Thread Jason Merrill

On 02/03/2015 11:14 AM, Paolo Carlini wrote:

+ /* Avoid -Waddress warnings (c++/64877).  */
+ TREE_NO_WARNING (pfn0) = 1;


I'd check for ADDR_EXPR before doing this; OK with that change.

Jason



Re: Fix PR64876, regressions in powerpc64 Go testsuite

2015-02-03 Thread David Edelsohn
On Tue, Feb 3, 2015 at 8:57 AM, Alan Modra  wrote:
> This fixes a large number of Go testsuite failures on powerpc64 ELFv1,
> caused by loading r11 from a function descriptor and thus trashing the
> value set up from CALL_EXPR_STATIC_CHAIN.  So don't load r11 if it
> already contains a useful value.  Whether r11 has been set is found
> directly by examining rtl.  Conveniently, looking at the previous
> sequence on the rtl sequence stack lets us skip over anything already
> emitted for GEN_CALL, and the static chain assignment, if present,
> happens to be the last insn of that sequence (calls.c emit_call_1
> stuff).
>
> Alternative approaches considered:
> 1) Turn off TARGET_POINTERS_TO_NESTED_FUNCTIONS for Go in
>rs6000_option_override_internal, similar to the hack posted in the
>PR.  That fixes Go, but leaves __builtin_call_with_static_chain
>broken.
> 2) Turn off TARGET_POINTERS_TO_NESTED_FUNCTIONS everywhere.  This
>means rewriting rs6000_trampoline_init to not put the static chain
>value into the trampoline function descriptor, and possibly other
>code.  Might also affect user code.
> 3) Arrange to have a new flag set in the third arg of rs6000_call_aix.
>This isn't simple due to none of INIT_CUMULATIVE_ARGS or various
>targetm.calls hooks having access to the call expression.  We don't
>have a function decl either, since this is an indirect call.
>
> Bootstrapped and regression tested powerpc64-linux.  OK to apply?
>
> PR target/64876
> * config/rs6000/rs6000.c (chain_already_loaded): New function.
> (rs6000_call_aix): Use it.

Okay with Jakub's suggested change.

Thanks, David


Re: [C++ Patch/RFC] PR 64877

2015-02-03 Thread Paolo Carlini

Hi,

On 02/03/2015 03:19 PM, Jason Merrill wrote:
> On 02/03/2015 05:45 AM, Paolo Carlini wrote:
>> +  if (TREE_CODE (pfn0) != ADDR_EXPR
>> +  || !decl_with_nonnull_addr_p (TREE_OPERAND (pfn0, 0)))
>
> I don't like duplicating the logic for when we might know the pfn is
> non-null; that seems fragile.

I agree it seems fragile; on the other hand, it would be rather precise, IMHO.
We could at least have a function wrapping the above and a comment in
both places. Anyway...

> I'd rather go with the tf_none approach, or otherwise suppress the
> warning.

There are so many ways to fix this, and I'm a bit confused by the range
of warning suppression mechanisms we have got at this point (e.g.,
TREE_NO_WARNING, c_inhibit_evaluation_warnings, the new warning_sentinel
in pt.c, tf_none, etc. ;) But this case should be at least relatively
safe because the expression is internally generated, right? Fiddling
with tf_none, as mentioned by Manuel, should indeed work, but then, how
far would we go with that? Only the 'e2 = cp_build_binary_op (location,
EQ_EXPR, pfn0, ...)' lines or more of the calls for synthesized
expressions? Would it be OK for 5.0 to just use TREE_NO_WARNING as in
the patch below?


Thanks,
Paolo.

/

Index: cp/typeck.c
===
--- cp/typeck.c (revision 220371)
+++ cp/typeck.c (working copy)
@@ -4415,7 +4415,8 @@ cp_build_binary_op (location_t location,
  && decl_with_nonnull_addr_p (TREE_OPERAND (op0, 0)))
{
  if ((complain & tf_warning)
- && c_inhibit_evaluation_warnings == 0)
+ && c_inhibit_evaluation_warnings == 0
+ && !TREE_NO_WARNING (op0))
warning (OPT_Waddress, "the address of %qD will never be NULL",
 TREE_OPERAND (op0, 0));
}
@@ -4436,7 +4437,8 @@ cp_build_binary_op (location_t location,
  && decl_with_nonnull_addr_p (TREE_OPERAND (op1, 0)))
{
  if ((complain & tf_warning)
- && c_inhibit_evaluation_warnings == 0)
+ && c_inhibit_evaluation_warnings == 0
+ && !TREE_NO_WARNING (op1))
warning (OPT_Waddress, "the address of %qD will never be NULL",
 TREE_OPERAND (op1, 0));
}
@@ -4537,6 +4539,8 @@ cp_build_binary_op (location_t location,
op1 = save_expr (op1);
 
  pfn0 = pfn_from_ptrmemfunc (op0);
+ /* Avoid -Waddress warnings (c++/64877).  */
+ TREE_NO_WARNING (pfn0) = 1;
  pfn1 = pfn_from_ptrmemfunc (op1);
  delta0 = delta_from_ptrmemfunc (op0);
  delta1 = delta_from_ptrmemfunc (op1);
Index: testsuite/g++.dg/warn/Waddress-2.C
===
--- testsuite/g++.dg/warn/Waddress-2.C  (revision 0)
+++ testsuite/g++.dg/warn/Waddress-2.C  (working copy)
@@ -0,0 +1,24 @@
+// PR c++/64877
+// { dg-options "-Waddress" }
+
+template<class Derived>
+struct S
+{
+  void m() {
+  }
+
+  S()
+  {
+    if (&S<Derived>::Unwrap != &Derived::Unwrap)
+      m();
+  }
+
+  void Unwrap() {
+  }
+};
+
+struct T : public S<T>
+{
+};
+
+T t;


Re: [PATCH] Fix combiner from accessing or writing out of bounds SET_N_REGS (PR other/63504)

2015-02-03 Thread H.J. Lu
On Mon, Feb 2, 2015 at 11:20 PM, Jakub Jelinek  wrote:
> On Mon, Feb 02, 2015 at 05:26:23PM -0600, Segher Boessenkool wrote:
>> On Mon, Feb 02, 2015 at 07:54:46PM +0100, Jakub Jelinek wrote:
>> > +/* Highest pseudo for which we track REG_N_SETS.  */
>> > +static unsigned int reg_n_sets_max;
>>
>> One more than the highest reg num, actually.
>
> Changed in my copy to
> /* One plus the highest pseudo for which we track REG_N_SETS.  */
>
> Ok with that change?
>
> Jakub

This may have caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64921


-- 
H.J.


[PATCH][ARM] PR target/64600 Fix another ICE with -mtune=xscale: properly sign-extend mask during constant splitting

2015-02-03 Thread Kyrill Tkachov

Hi all,

The ICE in this PR occurs when -mtune=xscale triggers a particular path 
through arm_gen_constant during expand that creates a 0xfffff00f mask but, 
for a 64-bit HOST_WIDE_INT, doesn't sign-extend it into 
0xfffffffffffff00f, which signifies the required -4081. It leaves it as 
0xfffff00f (4294963215), which breaks when combine later tries to perform 
an SImode bitwise AND using the wide-int machinery.


I think the correct approach here is to use trunc_int_for_mode that 
correctly sign-extends the constant so

that it is properly represented by a HOST_WIDE_INT for the required mode.
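
As a rough illustration of the sign-extension involved (a standalone sketch, not GCC's trunc_int_for_mode), the helper below extends the low 32 bits of a value into a 64-bit integer. The names hwi and sign_extend_32 are hypothetical, and int64_t merely stands in for a 64-bit HOST_WIDE_INT.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit HOST_WIDE_INT.  */
typedef int64_t hwi;

/* Sign-extend the low 32 bits of VAL into a 64-bit value, which is
   roughly what canonicalising a constant for a 32-bit mode involves.
   Illustration only; this is not GCC's trunc_int_for_mode.
   Assumption: converting an out-of-range uint64_t to int64_t wraps
   (two's complement), as GCC does.  */
hwi
sign_extend_32 (hwi val)
{
  uint64_t low = (uint64_t) val & 0xffffffffu;  /* keep bits 0..31 */
  uint64_t sign = (uint64_t) 1 << 31;           /* 32-bit sign bit */

  /* (low ^ sign) - sign replicates bit 31 into bits 32..63.  */
  return (hwi) ((low ^ sign) - sign);
}
```

Under these assumptions, passing 4294963215 (0xfffff00f) gives -4081, while a value that already fits in 31 bits comes back unchanged.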

Bootstrapped and tested arm-none-linux-gnueabihf with -mtune=xscale in 
BOOT_CFLAGS.


The testcase triggers for -mcpu=xscale and all slowmul targets because 
they are the only ones that have the constant_limit tune parameter set to 
anything >1, which is required to follow this particular path through 
arm_split_constant. Also, the rtx costs can hide this ICE sometimes.

Ok for trunk?

Thanks,
Kyrill

2015-02-03  Kyrylo Tkachov  

PR target/64600
* config/arm/arm.c (arm_gen_constant, AND case): Call
trunc_int_for_mode when constructing AND mask.

2015-02-03  Kyrylo Tkachov  

PR target/64600
* gcc.target/arm/pr64600_1.c: New test.

commit 52388a359dd65276bccfac499a2fd9e406fbe1a8
Author: Kyrylo Tkachov 
Date:   Tue Jan 20 11:21:34 2015 +

[ARM] Fix ICE due to arm_gen_constant not sign_extending

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db4834b..d0f3a52 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4709,19 +4709,20 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
 
 	  if ((remainder | shift_mask) != 0xffffffff)
 	{
+	  HOST_WIDE_INT new_val
+	= trunc_int_for_mode (remainder | shift_mask, mode);
+
 	  if (generate)
 		{
 		  rtx new_src = subtargets ? gen_reg_rtx (mode) : target;
-		  insns = arm_gen_constant (AND, mode, cond,
-	remainder | shift_mask,
+		  insns = arm_gen_constant (AND, SImode, cond, new_val,
 	new_src, source, subtargets, 1);
 		  source = new_src;
 		}
 	  else
 		{
 		  rtx targ = subtargets ? NULL_RTX : target;
-		  insns = arm_gen_constant (AND, mode, cond,
-	remainder | shift_mask,
+		  insns = arm_gen_constant (AND, mode, cond, new_val,
 	targ, source, subtargets, 0);
 		}
 	}
@@ -4744,12 +4745,13 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
 
 	  if ((remainder | shift_mask) != 0xffffffff)
 	{
+	  HOST_WIDE_INT new_val
+	= trunc_int_for_mode (remainder | shift_mask, mode);
 	  if (generate)
 		{
 		  rtx new_src = subtargets ? gen_reg_rtx (mode) : target;
 
-		  insns = arm_gen_constant (AND, mode, cond,
-	remainder | shift_mask,
+		  insns = arm_gen_constant (AND, mode, cond, new_val,
 	new_src, source, subtargets, 1);
 		  source = new_src;
 		}
@@ -4757,8 +4759,7 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
 		{
 		  rtx targ = subtargets ? NULL_RTX : target;
 
-		  insns = arm_gen_constant (AND, mode, cond,
-	remainder | shift_mask,
+		  insns = arm_gen_constant (AND, mode, cond, new_val,
 	targ, source, subtargets, 0);
 		}
 	}
diff --git a/gcc/testsuite/gcc.target/arm/pr64600_1.c b/gcc/testsuite/gcc.target/arm/pr64600_1.c
new file mode 100644
index 000..6ba3fa2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64600_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=xscale" } */
+
+typedef unsigned int speed_t;
+typedef unsigned int tcflag_t;
+
+struct termios {
+ tcflag_t c_cflag;
+};
+
+speed_t
+cfgetospeed (const struct termios *tp)
+{
+  return tp->c_cflag & 010017;
+}

Re: [PATCH, FT32] initial support

2015-02-03 Thread Paolo Bonzini


On 03/02/2015 07:05, Andrew Pinski wrote:
> Likewise of:
> +(define_insn "abssi2"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (abs:SI (match_operand:SI 1 "register_operand" "r")))
> +   (clobber (match_scratch:SI 2 "=&r"))]
> +  ""
> +  "ashr.l\t%2,%1,31\;xor.l\t%0,%1,%2\;sub.l\t%0,%0,%2")
> 

optabs.c's expand_abs_nojump already knows this trick:

/* If this machine has expensive jumps, we can do integer absolute
   value of X as (((signed) x >> (W-1)) ^ x) - ((signed) x >> (W-1)),
   where W is the width of MODE. */

So if you define BRANCH_COST to be 2 or more there should be no need for
this pattern at all.
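
For reference, the identity quoted above can be written out in plain C. This is a minimal sketch with W = 32; the function name abs_nojump is made up for illustration, and it assumes that right-shifting a negative int32_t is an arithmetic shift (true for GCC, implementation-defined in ISO C) and that the input is not INT32_MIN.

```c
#include <stdint.h>

/* Branch-free absolute value of a 32-bit integer, following
   (((signed) x >> (W-1)) ^ x) - ((signed) x >> (W-1)) with W = 32.
   Assumptions: '>>' on a negative value is an arithmetic shift, and
   X is not INT32_MIN (whose absolute value is not representable).  */
int32_t
abs_nojump (int32_t x)
{
  int32_t mask = x >> 31;       /* 0 if x >= 0, -1 if x < 0 */

  /* Do the xor and the subtraction in unsigned arithmetic so that no
     signed overflow can occur on the way.  */
  return (int32_t) (((uint32_t) x ^ (uint32_t) mask) - (uint32_t) mask);
}
```

The three operations correspond one-to-one to the shift, xor and subtract in the quoted insn pattern.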

Paolo


RFA: Tweak documentation of fma

2015-02-03 Thread Richard Sandiford
The original reason for this was to fix the missing space before "@var{z}".
"Do a combined multiply ... and then" didn't sound quite right though:
"and then" implies a separate step and so feels like a contradiction
of "combined".

Tested with "make doc html pdf".  OK to install?

Thanks,
Richard


gcc/
* doc/rtl.texi (fma): Clarify documentation.

Index: gcc/doc/rtl.texi
===
--- gcc/doc/rtl.texi	2015-01-12 15:28:02.413075415 +
+++ gcc/doc/rtl.texi	2015-02-03 14:21:02.485178134 +
@@ -2306,8 +2306,8 @@ For unsigned widening multiplication, us
 @findex fma
 @item (fma:@var{m} @var{x} @var{y} @var{z})
 Represents the @code{fma}, @code{fmaf}, and @code{fmal} builtin
-functions that do a combined multiply of @var{x} and @var{y} and then
-adding to@var{z} without doing an intermediate rounding step.
+functions, which compute @samp{@var{x} * @var{y} + @var{z}}
+without doing an intermediate rounding step.
 
 @findex div
 @findex ss_div



[ARM, committed] Fix rtl checking failure in thumb2_reorg

2015-02-03 Thread Richard Sandiford
The thumb2_reorg code:

  if (!OBJECT_P (src))
op0 = XEXP (src, 0);

causes an rtl checking failure if SRC is an UNSPEC.  This doesn't matter
in practice without rtl checking since OP0 is only used if SRC is a unary
or binary operator.  This patch tightens the condition to reflect that.
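
The difference between the two conditions can be shown with a toy model. Everything below is hypothetical and is not GCC's rtx representation; it only assumes, as described above, that an UNSPEC is neither an object nor a unary or binary operator, so reading its operand 0 as an expression is not valid.

```c
/* Toy model only: these are not GCC's rtx codes or predicates.  */
enum toy_class
{
  TOY_OBJECT,   /* e.g. a register or a constant */
  TOY_UNARY,    /* one expression operand */
  TOY_BINARY,   /* two expression operands */
  TOY_UNSPEC    /* operand 0 is not a plain expression */
};

/* Old condition: anything that is not an object is assumed to have
   an expression in operand 0.  */
int
old_reads_op0 (enum toy_class cls)
{
  return cls != TOY_OBJECT;
}

/* Tightened condition: only unary and binary operators qualify.  */
int
new_reads_op0 (enum toy_class cls)
{
  return cls == TOY_UNARY || cls == TOY_BINARY;
}
```

In this model the old test accepts TOY_UNSPEC and the new one rejects it, while both agree on objects and on unary and binary operators.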

The code was added in the 4.7-4.8 timeframe, although I don't know whether
it had to handle UNSPECs at that stage.  Either way, this is a regression
and so seems like stage 4 material.

Tested on arm-eabi, where it fixes many testsuite failures with checking
enabled.  Approved by Ramana off-list and committed.

Thanks,
Richard


gcc/
* config/arm/arm.c (thumb2_reorg): Test UNARY_P and BINARY_P
instead of OBJECT_P.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	2015-02-02 17:08:56.007379607 +
+++ gcc/config/arm/arm.c	2015-02-03 14:05:28.000601072 +
@@ -17297,7 +17297,7 @@ thumb2_reorg (void)
  rtx src = XEXP (pat, 1);
  rtx op0 = NULL_RTX, op1 = NULL_RTX;
 
- if (!OBJECT_P (src))
+ if (UNARY_P (src) || BINARY_P (src))
  op0 = XEXP (src, 0);
 
  if (BINARY_P (src))



Re: [C++ Patch/RFC] PR 64877

2015-02-03 Thread Jason Merrill

On 02/03/2015 05:45 AM, Paolo Carlini wrote:

+ if (TREE_CODE (pfn0) != ADDR_EXPR
+ || !decl_with_nonnull_addr_p (TREE_OPERAND (pfn0, 0)))


I don't like duplicating the logic for when we might know the pfn is 
non-null; that seems fragile.  I'd rather go with the tf_none approach, 
or otherwise suppress the warning.


Jason



Re: Fix PR64876, regressions in powerpc64 Go testsuite

2015-02-03 Thread Jakub Jelinek
On Wed, Feb 04, 2015 at 12:27:35AM +1030, Alan Modra wrote:
> +static bool
> +chain_already_loaded (rtx_insn *last)
> +{
> +  if (last != NULL)
> +{
> +  rtx patt = PATTERN (last);
> +
> +  if (GET_CODE (patt) == SET)
> + {
> +   rtx lhs = XEXP (patt, 0);
> +
> +   if (REG_P (lhs) && REGNO (lhs) == STATIC_CHAIN_REGNUM)
> + return true;
> + }
> +}
> +  /* This function is only called when we are about to emit a call,
> + and we know that the static chain is set just before a call, so
> + there is no need to look at previous insns.  */
> +  return false;
> +}
> +
>  /* Expand code to perform a call under the AIX or ELFv2 ABI.  */
>  
>  void
> @@ -33002,7 +33092,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx fla
>originally direct, the 3rd word has not been written since no
>trampoline has been built, so we ought not to load it, lest we
>override a static chain value.  */
> -   if (!direct_call_p && TARGET_POINTERS_TO_NESTED_FUNCTIONS)
> +   if (!direct_call_p
> +   && TARGET_POINTERS_TO_NESTED_FUNCTIONS
> +   && !chain_already_loaded (crtl->emit.sequence_stack->last))

Shouldn't that be !chain_already_loaded (get_last_insn_anywhere ()) ?

Jakub


Fix PR64876, regressions in powerpc64 Go testsuite

2015-02-03 Thread Alan Modra
This fixes a large number of Go testsuite failures on powerpc64 ELFv1,
caused by loading r11 from a function descriptor and thus trashing the
value set up from CALL_EXPR_STATIC_CHAIN.  So don't load r11 if it
already contains a useful value.  Whether r11 has been set is found
directly by examining rtl.  Conveniently, looking at the previous
sequence on the rtl sequence stack lets us skip over anything already
emitted for GEN_CALL, and the static chain assignment, if present,
happens to be the last insn of that sequence (calls.c emit_call_1
stuff).

Alternative approaches considered:
1) Turn off TARGET_POINTERS_TO_NESTED_FUNCTIONS for Go in
   rs6000_option_override_internal, similar to the hack posted in the
   PR.  That fixes Go, but leaves __builtin_call_with_static_chain
   broken.
2) Turn off TARGET_POINTERS_TO_NESTED_FUNCTIONS everywhere.  This
   means rewriting rs6000_trampoline_init to not put the static chain
   value into the trampoline function descriptor, and possibly other
   code.  Might also affect user code.
3) Arrange to have a new flag set in the third arg of rs6000_call_aix.
   This isn't simple due to none of INIT_CUMULATIVE_ARGS or various
   targetm.calls hooks having access to the call expression.  We don't
   have a function decl either, since this is an indirect call.

Bootstrapped and regression tested powerpc64-linux.  OK to apply?

PR target/64876
* config/rs6000/rs6000.c (chain_already_loaded): New function.
(rs6000_call_aix): Use it.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 220358)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -32919,7 +32987,29 @@ rs6000_legitimate_constant_p (machine_mode mode, r
 }
 
 
+/* Return TRUE iff the sequence ending in LAST sets the static chain.  */
 
+static bool
+chain_already_loaded (rtx_insn *last)
+{
+  if (last != NULL)
+{
+  rtx patt = PATTERN (last);
+
+  if (GET_CODE (patt) == SET)
+   {
+ rtx lhs = XEXP (patt, 0);
+
+ if (REG_P (lhs) && REGNO (lhs) == STATIC_CHAIN_REGNUM)
+   return true;
+   }
+}
+  /* This function is only called when we are about to emit a call,
+ and we know that the static chain is set just before a call, so
+ there is no need to look at previous insns.  */
+  return false;
+}
+
 /* Expand code to perform a call under the AIX or ELFv2 ABI.  */
 
 void
@@ -33002,7 +33092,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx fla
 originally direct, the 3rd word has not been written since no
 trampoline has been built, so we ought not to load it, lest we
 override a static chain value.  */
- if (!direct_call_p && TARGET_POINTERS_TO_NESTED_FUNCTIONS)
+ if (!direct_call_p
+ && TARGET_POINTERS_TO_NESTED_FUNCTIONS
+ && !chain_already_loaded (crtl->emit.sequence_stack->last))
{
  rtx sc_reg = gen_rtx_REG (Pmode, STATIC_CHAIN_REGNUM);
  rtx func_sc_offset = GEN_INT (2 * GET_MODE_SIZE (Pmode));

-- 
Alan Modra
Australia Development Lab, IBM


Re: [patch] Fix invalid attributes in libstdc++

2015-02-03 Thread Renlin Li

On 01/02/15 15:08, Jonathan Wakely wrote:

I failed to CC gcc-patches on this patch ...

On 29/01/15 13:02 +, Jonathan Wakely wrote:

diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc
new file mode 100644
index 000..c7ec27a
--- /dev/null
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// Ensure the library only uses the __name__ form for attributes.
+// Don't test 'const' and 'noreturn' because they are reserved anyway.
+#define abi_tag 1
+#define always_inline 1
+#define deprecated 1
+#define packed 1
+#define pure 1
+#define unused 1
+#define visibility 1
+
+#include  // TODO: this is missing from 
+#include// TODO: this is missing from 
+#include 
+
+

The test case fails to build on all ARM targets, giving the following 
message:



src/gcc/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc:28:16: 
error: expected ')' before numeric constant


#define unused 1
  ^

GCC has code that uses the unused attribute without the __name__ form. For 
example, we have the following code in gthr-default.h:

#define GLIBCXX_UNUSED __attribute__((unused))

Regards,
Renlin Li



Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0

2015-02-03 Thread pinskia




> On Feb 3, 2015, at 3:57 AM, Alan Lawrence  wrote:
> 
> 
> Andrew Pinski wrote:
>> While trying to build GCC 5 with GCC 5, I ran into an ICE when
>> building libcpp at -O0.  The problem is the C++ front-end was not
>> folding sizeof(a)/sizeof(a[0]) when passed to a function at -O0. The
>> C++ front-end keeps around sizeof until the gimplifier and there is no
>> way to fold the expressions that involve them.  So to work around the
>> issue we need to change __builtin_aarch64_im_lane_boundsi to accept an
>> extra argument and change the first two arguments to size_t type so we
>> don't get an extra cast there and do the division inside the compiler
>> itself.
>> Also we don't want to cause an ICE on any source code so I changed the
>> assert to be a sorry if either of the two arguments are not integer
>> constants.
> 
> TBH I think it _is_ appropriate to ICE rather than sorry, or even error, if 
> the size of the vector or vector elements are non-constant or zero. All the 
> calls to this __builtin are in gcc-specific headers, and if those are wrong, 
> then the error is in(ternal to) the compiler. (Of course if the lane index is 
> non-constant that is a programmer error.) We don't wish to support the 
> programmer writing direct calls to the builtin him/herself!


Even if we don't support direct calls to it, it still should not ICE. This has 
been GCC policy for a long time now.  ICEing on any input is bad form and not 
helpful to users. We can say in the error message that calling it directly is 
not supported and that they should not do it. 


Thanks,
Andrew
> 
> --Alan
> 
> 
>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions
>> and I was able to bootstrap without a modified libcpp.
>> Thanks,
>> Andrew Pinski
>> ChangeLog:
>>* config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins):
>>Change the first argument type to size_type_node and add another
>>size_type_node.
>>(aarch64_simd_expand_builtin): Handle the new argument to
>>AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather
>>print sorry out when the first two arguments are not
>>integer constants.
>>* config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK):
>>Pass the sizeof's directly to __builtin_aarch64_im_lane_boundsi.
>> testsuite/ChangeLog:
>>* c-c++-common/torture/aarch64-vect-lane-1.c: New testcase.
> 
> 


Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-02-03 Thread Julian Brown
On Tue, 3 Feb 2015 14:28:44 +0300
Ilya Verbin  wrote:

> Hi Julian!
> 
> On 27 Jan 14:07, Julian Brown wrote:
> > On Mon, 26 Jan 2015 17:34:26 +0300
> > Ilya Verbin  wrote:
> > > Here is my current patch, it works for OpenMP->MIC, but obviously
> > > will not work for PTX, since it requires symmetrical changes in
> > > the plugin.  Could you please take a look, whether it is possible
> > > to support this new interface in PTX plugin?
> > 
> > I think it can probably be made to work. I'll have a look in more
> > detail.
> 
> Do you have any progress on this?

I'm still working on a patch to update OpenACC support and the PTX
backend to use load/unload_image and to unify initialisation/"opening".
So far I think the answer is basically "yes, the new interface can be
supported", though I might request a minor tweak -- e.g. that
load_image takes an extra "void **" argument so that a libgomp backend
can allocate a block of generic metadata relating to the image, then
that same block would be passed (void *) to the unload hook so the
backend can use it there and deallocate it when it's finished with.

Would that be possible? (It'd mostly be for a "CUmodule" handle: this
could be stashed away somewhere within the nvptx backend, but it might
be neater to put it in generic code since it'll probably be useful for
other backends anyway.)
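
To make the suggested tweak concrete, here is a minimal sketch of what such a pair of hooks could look like. All names and signatures below (image_metadata, example_load_image, example_unload_image) are hypothetical; this is not libgomp's actual plugin interface, only an illustration of passing a "void **" out of the load hook and the same block back into the unload hook.

```c
#include <stdlib.h>

/* Hypothetical per-image metadata a backend might keep; for a PTX
   backend this could hold something like a module handle.  */
struct image_metadata
{
  int module_handle;
};

/* Hypothetical load hook: on success, store a backend-allocated
   metadata block through METADATA and return 0.  */
int
example_load_image (const void *image, void **metadata)
{
  struct image_metadata *md = malloc (sizeof *md);

  if (md == NULL)
    return -1;
  md->module_handle = (image != NULL);  /* placeholder for real work */
  *metadata = md;
  return 0;
}

/* Hypothetical unload hook: receives the same block back, uses it,
   and deallocates it.  Returns the stored handle for demonstration.  */
int
example_unload_image (void *metadata)
{
  struct image_metadata *md = metadata;
  int handle = md->module_handle;

  free (md);
  return handle;
}
```

The generic code would only ever store and forward the opaque pointer, so each backend remains free to choose what the block contains.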

Thanks,

Julian


Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2015-02-03 Thread Alex Velenko

On 03/02/15 08:29, Bin.Cheng wrote:

On Tue, Feb 3, 2015 at 3:24 PM, Jeff Law  wrote:

On 02/02/15 08:59, Alex Velenko wrote:


On 11/10/14 13:44, Felix Yang wrote:


Hello Jeff,

  I see that you have improved the RTL typesafety issue for ira.c,
so I rebased this patch
  on the latest trunk and change to use the new list walking
interface.
  Bootstrapped on x86_64-SUSE-Linux and make check regression tested.
  OK for trunk?


Hi Felix,
I believe your patch causes a regression for arm-none-eabi.
FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54
FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2

This happens because your patch stops reuse of code for
" return -1;" statements in pr43920-2.c.

As far as I investigated, your patch prevents adding "(expr_list (-1)
(nil)" in ira pass, which prevents jump2 optimization from happening.

So before, in ira pass I could see:
"(insn 9 53 34 8 (set (reg:SI 110 [ D.4934 ])
  (const_int -1 [0xffffffffffffffff]))
/work/fsf-trunk-ref-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20
613
{*thumb2_movsi_vfp}
   (expr_list:REG_EQUAL (const_int -1 [0xffffffffffffffff])
  (nil)))"
But with your patch I get
"(insn 9 53 34 8 (set (reg:SI 110 [ D.5322 ])
  (const_int -1 [0xffffffffffffffff]))
/work/fsf-trunk-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20
615 {*thumb2_movsi_vfp}
   (nil))"

This causes a code generation regression and needs to be fixed.
Kind regards,


We'd need to see the full dumps.  In particular is reg110 set anywhere else?
If so then the change is doing precisely what it should be doing and the
test needs to be updated to handle the different code we generate.


Hmm, if I understand correctly, it's a code size regression, so I
don't think it's appropriate to adapt the test case.  Either the patch
or something else in GCC is doing something wrong, right?

Hi Alex, could you please file a PR with full dump information for tracking?

Thanks,
bin

Hi Bin,
Created bugzilla ticket, as requested:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64916
This test already existed in the testsuite; it is not new.
Kind regards,
Alex



Jeff






Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing

2015-02-03 Thread Joseph Myers
On Tue, 3 Feb 2015, Jeff Law wrote:

> +/* Given a bit-wise operation performed in mode P1 on operands
> +   in some narrower type P2 that feeds an outer masking operation.
> +   See if the mask turns off all the bits outside P2, and if so
> +   perform all the operations in P2 and just convert the final
> +   result from P1 to P2.  */
> +(for inner_op (bit_and bit_ior bit_xor)
> +  (simplify
> +(bit_and (inner_op (convert @0) (convert @1)) INTEGER_CST@3)
> +(if ((TREE_INT_CST_LOW (@3) & ~GET_MODE_MASK (TYPE_MODE (TREE_TYPE (@0)))) == 0

Are you sure about checking TREE_INT_CST_LOW here?  What if the inner type 
is wider than HOST_WIDE_INT?  (That could occur with bit-fields of type 
__int128, for example.)  I think the check for what bits are set needs to 
be written in a wide-int-safe way - maybe something like tree_int_cst_sgn 
(@3) > 0 && tree_int_cst_min_precision (@3, UNSIGNED) <= TYPE_PRECISION 
(TREE_TYPE (@1)).

> +  && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1))
> +  && TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (TREE_TYPE (@1))
> +  && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)))
> +  (convert (bit_and (inner_op @0 @1) (convert @3))

I still don't think this is safe.  Suppose @0 and @1 are -128 in type 
int8_t and @3 is 128 in a wider type and the operation is AND.  Then the 
original expression has result 128.  But if you convert @3 to int8_t you 
get -128 and this would result in -128 from the simplified expression.

If the inner values are signed and the mask includes the sign bit of the 
inner type, you have to zero-extend to the wider type (e.g. convert the 
inner values to unsigned), not sign-extend.

(If the inner values are signed, it's *also* valid to optimize with a mask 
where both the sign bit of the inner type and all higher bits are set, 
such as a mask of -128 above; in that case, you do need to sign-extend.  
If the inner values are unsigned, no check on the mask value is needed at 
all as all higher bits in the mask can just be discarded.  Both of these 
statements only apply for bitwise operations, not arithmetic.)
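
The counterexample above can be checked with a small standalone program fragment. The function names are invented for illustration, and the sketch assumes that converting an out-of-range value to int8_t wraps modulo 256 (as GCC does; the conversion is implementation-defined in ISO C).

```c
#include <stdint.h>

/* Original form: operands widened to int, combined, then masked in
   the wide type.  */
int
wide_and_mask (int8_t a, int8_t b, int mask)
{
  return ((int) a & (int) b) & mask;
}

/* Naive narrowed form: the mask is converted to int8_t as well and
   the narrow result is sign-extended back to int.  */
int
narrow_and_mask_signed (int8_t a, int8_t b, int mask)
{
  int8_t narrow = (int8_t) ((a & b) & (int8_t) mask);

  return (int) narrow;
}

/* Narrowed form that zero-extends: work on unsigned 8-bit values and
   widen the result without sign extension.  */
int
narrow_and_mask_unsigned (int8_t a, int8_t b, int mask)
{
  uint8_t narrow = (uint8_t) ((uint8_t) a & (uint8_t) b & (uint8_t) mask);

  return (int) narrow;
}
```

With both operands equal to -128 and a mask of 128, the wide form and the zero-extending form give 128, while the sign-extending form gives -128. With a mask of -128 (sign bit and all higher bits set), the wide form and the sign-extending form agree on -128 instead.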

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0

2015-02-03 Thread Alan Lawrence


Andrew Pinski wrote:

While trying to build GCC 5 with GCC 5, I ran into an ICE when
building libcpp at -O0.  The problem is the C++ front-end was not
folding sizeof(a)/sizeof(a[0]) when passed to a function at -O0. The
C++ front-end keeps around sizeof until the gimplifier and there is no
way to fold the expressions that involve them.  So to work around the
issue we need to change __builtin_aarch64_im_lane_boundsi to accept an
extra argument and change the first two arguments to size_t type so we
don't get an extra cast there and do the division inside the compiler
itself.
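
The idea of passing both sizes and dividing on the receiving side can be modelled in ordinary C. The sketch below is hypothetical: lane_in_bounds and LANE_CHECK are illustrative names and do not reproduce the actual builtin or the __AARCH64_LANE_CHECK macro.

```c
#include <stddef.h>

/* Hypothetical model of a lane bounds check that receives the two
   sizeof values separately and performs the division itself, instead
   of expecting a pre-folded sizeof (a) / sizeof (a[0]).
   Returns 1 if LANE is a valid index and 0 otherwise, including when
   ELEM_SIZE is zero.  */
int
lane_in_bounds (size_t vec_size, size_t elem_size, int lane)
{
  size_t lanes;

  if (elem_size == 0)
    return 0;
  lanes = vec_size / elem_size;
  return lane >= 0 && (size_t) lane < lanes;
}

/* Hypothetical wrapper macro: pass both sizeof expressions through
   unchanged, with no cast and no division at the call site.  */
#define LANE_CHECK(vec, lane) \
  lane_in_bounds (sizeof (vec), sizeof ((vec)[0]), (lane))
```

For a two-element array, lanes 0 and 1 are accepted and lanes 2 and -1 are rejected.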

Also we don't want to cause an ICE on any source code so I changed the
assert to be a sorry if either of the two arguments are not integer
constants.


TBH I think it _is_ appropriate to ICE rather than sorry, or even error, if the 
size of the vector or vector elements are non-constant or zero. All the calls to 
this __builtin are in gcc-specific headers, and if those are wrong, then the 
error is in(ternal to) the compiler. (Of course if the lane index is 
non-constant that is a programmer error.) We don't wish to support the 
programmer writing direct calls to the builtin him/herself!


--Alan




OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions
and I was able to bootstrap without a modified libcpp.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins):
Change the first argument type to size_type_node and add another
size_type_node.
(aarch64_simd_expand_builtin): Handle the new argument to
AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather
print sorry out when the first two arguments are not
integer constants.
* config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK):
Pass the sizeof's directly to __builtin_aarch64_im_lane_boundsi.

testsuite/ChangeLog:
* c-c++-common/torture/aarch64-vect-lane-1.c: New testcase.





Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing

2015-02-03 Thread Richard Biener
On February 2, 2015 7:32:15 PM CET, Jeff Law  wrote:
>On 02/02/15 01:57, Richard Biener wrote:
>>>
>>> The nice thing about wrapping the result inside a convert is the types for
>>> the inner operations will propagate from the type of the inner operands,
>>> which is exactly what we want.  We then remove the hack assigning type and
>>> instead the original type will be used for the outermost convert.
>>
>> It's not even a hack but wrong ;)  Correct supported syntax is
>>
>> + (with { tree type0 = TREE_TYPE (@0); }
>> +  (convert:type0 (bit_and (inner_op @0 @1) (convert @3)))
>>
>> Thus whenever the generator cannot auto-guess a type (or would guess
>> the wrong one) you can explicitly specify a type to convert to.
>I found that explicit types were ignored in some cases.  It was 
>frustrating to say the least.

Huh, that would be a bug.  Do you have a pattern where that happens?

Richard.

>But I think I've got this part doing what
>I want without the hack.
>
>>
>> Why do you restrict this to GENERIC?  On GIMPLE you'd eventually
>> want to impose some single-use constraints as the result with all
>> the conversions won't really be unconditionally "better"?
>That was strictly because of the mismatch between the resulting type and
>how it was later used.  That restriction shouldn't be needed anymore.
>
>Jeff




Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-02-03 Thread Ilya Verbin
Hi Julian!

On 27 Jan 14:07, Julian Brown wrote:
> On Mon, 26 Jan 2015 17:34:26 +0300
> Ilya Verbin  wrote:
> > Here is my current patch, it works for OpenMP->MIC, but obviously
> > will not work for PTX, since it requires symmetrical changes in the
> > plugin.  Could you please take a look, whether it is possible to
> > support this new interface in PTX plugin?
> 
> I think it can probably be made to work. I'll have a look in more
> detail.

Do you have any progress on this?

Thanks,
  -- Ilya


Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.

2015-02-03 Thread Paolo Carlini

Hi,

On 02/03/2015 11:40 AM, Matthew Wahab wrote:

Ok to commit?

Ok thanks.

Paolo.


[C++ Patch/RFC] PR 64877

2015-02-03 Thread Paolo Carlini

Hi,

Manuel did most of the work on this [5 Regression], caused by my fix for
c++/43906, which considerably extended the functionality of -Waddress: a
spurious warning is emitted with -Waddress for an expression internally
generated in cp_build_binary_op. Manuel suggested that, when safe, we
could completely avoid generating the relevant recursive calls to
cp_build_binary_op. Alternatively, for these internally generated
expressions we could probably use tf_none instead of passing down
complain. Note: I'm not 100% sure the ptrmemfunc_vbit_in_delta bits are
correct, but they certainly avoid the spurious warning for those targets too.
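
For readers who do not have the ptrmemfunc_vbit_in_delta case in their head, here is a minimal sketch of the DELTA part of the comparison that the first hunk of the patch restructures. It is only a model: PtrMemFunc, delta_part_full and delta_part_known_nonnull are hypothetical names, not GCC code, and the field layout is assumed from the comment in cp_build_binary_op. The point it illustrates is that once op0.pfn is known to be non-null, the `|| (!op0.pfn && ...)` arm can never be true, so only the plain DELTA equality is left.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a pointer-to-member-function value.  The field
// names follow the comment in cp_build_binary_op; the layout is assumed.
struct PtrMemFunc {
  std::uintptr_t pfn;   // function address (or vtable offset)
  std::intptr_t delta;  // this-adjustment; the LSB is the virtual bit here
};

// Full form, mirroring the chain of calls kept in the first branch:
//   delta0 == delta1 || (!pfn0 && !(delta1 & 1) && !(delta0 & 1))
inline bool delta_part_full(const PtrMemFunc &op0, const PtrMemFunc &op1) {
  bool lsbs_clear = (op1.delta & 1) == 0 && (op0.delta & 1) == 0;
  bool null_case = op0.pfn == 0 && lsbs_clear;
  return op0.delta == op1.delta || null_case;
}

// Reduced form, mirroring the new else branch: valid only when op0.pfn is
// known to be non-null, so the null case above cannot apply.
inline bool delta_part_known_nonnull(const PtrMemFunc &op0,
                                     const PtrMemFunc &op1) {
  return op0.delta == op1.delta;
}

// Small self-check: the two forms agree whenever op0.pfn is non-null.
inline void self_check() {
  PtrMemFunc a{0x1000, 4}, b{0x1000, 8};
  assert(delta_part_full(a, b) == delta_part_known_nonnull(a, b));
}
```

The two forms differ only when op0.pfn is zero, which is exactly the case the patch rules out before taking the shorter path.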


Thanks,
Paolo.

///
Index: cp/typeck.c
===
--- cp/typeck.c (revision 220366)
+++ cp/typeck.c (working copy)
@@ -4552,35 +4552,44 @@ cp_build_binary_op (location_t location,
 
 The reason for the `!op0.pfn' bit is that a NULL
 pointer-to-member is any member with a zero PFN and
-LSB of the DELTA field is 0.  */
+LSB of the DELTA field is 0.  Note: avoid generating the
+'|| (!op0.pfn && ...)' if !op0.pfn is known to be false.  */
 
- e1 = cp_build_binary_op (location, BIT_AND_EXPR,
-  delta0, 
-  integer_one_node,
-  complain);
- e1 = cp_build_binary_op (location,
-  EQ_EXPR, e1, integer_zero_node,
-  complain);
- e2 = cp_build_binary_op (location, BIT_AND_EXPR,
-  delta1,
-  integer_one_node,
-  complain);
- e2 = cp_build_binary_op (location,
-  EQ_EXPR, e2, integer_zero_node,
-  complain);
- e1 = cp_build_binary_op (location,
-  TRUTH_ANDIF_EXPR, e2, e1,
-  complain);
- e2 = cp_build_binary_op (location, EQ_EXPR,
-  pfn0,
-  build_zero_cst (TREE_TYPE (pfn0)),
-  complain);
- e2 = cp_build_binary_op (location,
-  TRUTH_ANDIF_EXPR, e2, e1, complain);
- e1 = cp_build_binary_op (location,
-  EQ_EXPR, delta0, delta1, complain);
- e1 = cp_build_binary_op (location,
-  TRUTH_ORIF_EXPR, e1, e2, complain);
+ if (TREE_CODE (pfn0) != ADDR_EXPR
+ || !decl_with_nonnull_addr_p (TREE_OPERAND (pfn0, 0)))
+   {
+ e1 = cp_build_binary_op (location, BIT_AND_EXPR,
+  delta0, 
+  integer_one_node,
+  complain);
+ e1 = cp_build_binary_op (location,
+  EQ_EXPR, e1, integer_zero_node,
+  complain);
+ e2 = cp_build_binary_op (location, BIT_AND_EXPR,
+  delta1,
+  integer_one_node,
+  complain);
+ e2 = cp_build_binary_op (location,
+  EQ_EXPR, e2, integer_zero_node,
+  complain);
+ e1 = cp_build_binary_op (location,
+  TRUTH_ANDIF_EXPR, e2, e1,
+  complain);
+ e2 = cp_build_binary_op (location, EQ_EXPR,
+  pfn0,
+  build_zero_cst (TREE_TYPE (pfn0)),
+  complain);
+ e2 = cp_build_binary_op (location,
+  TRUTH_ANDIF_EXPR, e2, e1, complain);
+ e1 = cp_build_binary_op (location,
+  EQ_EXPR, delta0, delta1, complain);
+ e1 = cp_build_binary_op (location,
+  TRUTH_ORIF_EXPR, e1, e2, complain);
+
+   }
+ else
+   e1 = cp_build_binary_op (location,
+EQ_EXPR, delta0, delta1, complain);
}
  else
{
@@ -4591,17 +4600,22 @@ cp_build_binary_op (location_t location,
 
 The reason for the `!op0.pfn' bit is that a NULL
 pointer-to-member is any member with a zero PFN; the
-DELTA field

Re: [PATCH] Fix CSE volatile MEM handling (PR rtl-optimization/64756)

2015-02-03 Thread Eric Botcazou
> 2015-02-02  Jakub Jelinek  
> 
>   PR rtl-optimization/64756
>   * cse.c (cse_insn): If dest != SET_DEST (sets[i].rtl) and
>   HASH (SET_DEST (sets[i].rtl), mode) computation sets do_not_record,
>   invalidate and do not record it.
> 
>   * gcc.c-torture/execute/pr64756.c: New test.

OK if you factor out the common code with the similar block for 'dest' above 
into an invalidate_dest function (note that the MEM_P case is identical to the 
REG_P and SUBREG cases) and invoke it on 'dest' and SET_DEST (sets[i].rtl).

-- 
Eric Botcazou


Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.

2015-02-03 Thread Matthew Wahab

On 03/02/15 10:27, Paolo Carlini wrote:

Nit: the path should be * testsuite/28_regex/... and likewise for the
other testcase, because it starts where the corresponding ChangeLog is.


Fixed changelog:

libstdc++-v3/
2015-02-02  Matthew Wahab  

PR libstdc++/64467
* testsuite/28_regex/traits/char/isctype.cc (test01): Add newlib
special case for '\n'.
* testsuite/28_regex/traits/wchar_t/isctype.cc (test01): Likewise.


Ok to commit?
Matthew
diff --git a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
index a7b1396..7c47045 100644
--- a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
+++ b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
@@ -53,7 +53,13 @@ test01()
   VERIFY(!t.isctype('_', t.lookup_classname(range(digit;
   VERIFY( t.isctype(' ', t.lookup_classname(range(blank;
   VERIFY( t.isctype('\t', t.lookup_classname(range(blank;
+#if defined (__NEWLIB__)
+  /* newlib includes '\n' in class 'blank'.
+ See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html.  */
+  VERIFY( t.isctype('\n', t.lookup_classname(range(blank;
+#else
   VERIFY(!t.isctype('\n', t.lookup_classname(range(blank;
+#endif
   VERIFY( t.isctype('t', t.lookup_classname(range(upper), true)));
   VERIFY( t.isctype('T', t.lookup_classname(range(lower), true)));
 #undef range
diff --git a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
index e450f6d..1b3d69a 100644
--- a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
+++ b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
@@ -50,7 +50,13 @@ test01()
   VERIFY(!t.isctype(L'_', t.lookup_classname(range(digit;
   VERIFY( t.isctype(L' ', t.lookup_classname(range(blank;
   VERIFY( t.isctype(L'\t', t.lookup_classname(range(blank;
+#if defined (__NEWLIB__)
+  /* newlib includes '\n' in class 'blank'.
+ See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html.  */
+  VERIFY( t.isctype(L'\n', t.lookup_classname(range(blank;
+#else
   VERIFY(!t.isctype(L'\n', t.lookup_classname(range(blank;
+#endif
   VERIFY( t.isctype(L't', t.lookup_classname(range(upper), true)));
   VERIFY( t.isctype(L'T', t.lookup_classname(range(lower), true)));
 #undef range


Re: [patch] Fix invalid attributes in libstdc++

2015-02-03 Thread Iain Sandoe
Hi Jonathan,

On 1 Feb 2015, at 15:10, Jonathan Wakely wrote:

> On 01/02/15 15:08 +, Jonathan Wakely wrote:
>> I failed to CC gcc-patches on this patch ...
>> 
>> On 29/01/15 13:02 +, Jonathan Wakely wrote:
>>> Jakub pointed out that we have some attributes that don't use the
>>> reserved namespace, e.g. __attribute__ ((always_inline)).
>>> 
>>> This is a 4.9/5 regression and the fix was pre-approved by Jakub so
>>> I've committed it to trunk.
>>> 
>>> When we're back in stage1 I'll fix the TODO comments in the new tests
>>> (see PR64857) and will also rename testsuite/17_intro/headers/c++200x
>>> to .../c++2011.
>>> 
> 
> The new test fails on darwin (PR64883) and --enable-threads=single
> targets (PR64885).
> 
> This is a workaround for 64883. Tested x86_64-linux, committed to
> trunk.
> 

the following additional tweaks provide further work-arounds.
... checked on darwin12 and darwin14.
I have a fixincludes patch for next stage #1.
Iain

diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc
index 76a935e..6fc362a 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/all_attributes.cc
@@ -26,11 +26,11 @@
 // darwin headers use these, see PR 64883
 # define deprecated 1
 # define noreturn 1
+# define visibility 1
 #endif
 #define packed 1
 #define pure 1
 #define unused 1
-#define visibility 1
 
 #include  // TODO: this is missing from 
 #include 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc
index c7ec27a..0726e3f 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++200x/all_attributes.cc
@@ -22,11 +22,14 @@
 // Don't test 'const' and 'noreturn' because they are reserved anyway.
 #define abi_tag 1
 #define always_inline 1
-#define deprecated 1
+#ifndef __APPLE__
+// darwin headers use these, see PR 64883
+# define visibility 1
+# define deprecated 1
+#endif
 #define packed 1
 #define pure 1
 #define unused 1
-#define visibility 1
 
 #include  // TODO: this is missing from 
 #include// TODO: this is missing from 
diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc 
b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc
index 533a6f1..06bcb8e 100644
--- a/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2014/all_attributes.cc
@@ -22,11 +22,14 @@
 // Don't test 'const' and 'noreturn' because they are reserved anyway.
 #define abi_tag 1
 #define always_inline 1
-#define deprecated 1
+#ifndef __APPLE__
+// darwin headers use these, see PR 64883
+# define deprecated 1
+# define visibility 1
+#endif
 #define packed 1
 #define pure 1
 #define unused 1
-#define visibility 1
 
 #include  // TODO: this is missing from 
 #include   // TODO: this is missing from 




Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.

2015-02-03 Thread Paolo Carlini

Hi,

On 02/03/2015 11:17 AM, Matthew Wahab wrote:

libstdc++-v3/testsuite/
2015-02-02  Matthew Wahab  

PR libstdc++/64467
* 28_regex/traits/char/isctype.cc (test01): Add newlib special
case for '\n'.
* 28_regex/traits/wchar_t/isctype.cc (test01): Likewise.
Nit: the path should be * testsuite/28_regex/... and likewise for the
other testcase, because it starts where the corresponding ChangeLog is.


Paolo


Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.

2015-02-03 Thread Matthew Wahab

[Email problems so resending to the list, sorry for multiple copies.]

On 02/02/15 16:33, Jonathan Wakely wrote:

On 2 February 2015 at 16:17, Paolo Carlini  wrote:

This is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64467 so please
note that in the ChangeLog.


I guess the patch is Ok for trunk, but please also add in the comment a link
to this message of yours, that is
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html.


Updated patch attached and changelog below.


Yes, not everyone subscribes to gcc-patches so please always send
libstdc++ patches to the libstdc++ list, as documented at
https://gcc.gnu.org/lists.html and in the libstdc++ manual.


Noted, sorry about that.
Matthew

libstdc++-v3/testsuite/
2015-02-02  Matthew Wahab  

PR libstdc++/64467
* 28_regex/traits/char/isctype.cc (test01): Add newlib special
case for '\n'.
* 28_regex/traits/wchar_t/isctype.cc (test01): Likewise.
diff --git a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
index a7b1396..7c47045 100644
--- a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
+++ b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc
@@ -53,7 +53,13 @@ test01()
   VERIFY(!t.isctype('_', t.lookup_classname(range(digit;
   VERIFY( t.isctype(' ', t.lookup_classname(range(blank;
   VERIFY( t.isctype('\t', t.lookup_classname(range(blank;
+#if defined (__NEWLIB__)
+  /* newlib includes '\n' in class 'blank'.
+ See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html.  */
+  VERIFY( t.isctype('\n', t.lookup_classname(range(blank;
+#else
   VERIFY(!t.isctype('\n', t.lookup_classname(range(blank;
+#endif
   VERIFY( t.isctype('t', t.lookup_classname(range(upper), true)));
   VERIFY( t.isctype('T', t.lookup_classname(range(lower), true)));
 #undef range
diff --git a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
index e450f6d..1b3d69a 100644
--- a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
+++ b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc
@@ -50,7 +50,13 @@ test01()
   VERIFY(!t.isctype(L'_', t.lookup_classname(range(digit;
   VERIFY( t.isctype(L' ', t.lookup_classname(range(blank;
   VERIFY( t.isctype(L'\t', t.lookup_classname(range(blank;
+#if defined (__NEWLIB__)
+  /* newlib includes '\n' in class 'blank'.
+ See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html.  */
+  VERIFY( t.isctype(L'\n', t.lookup_classname(range(blank;
+#else
   VERIFY(!t.isctype(L'\n', t.lookup_classname(range(blank;
+#endif
   VERIFY( t.isctype(L't', t.lookup_classname(range(upper), true)));
   VERIFY( t.isctype(L'T', t.lookup_classname(range(lower), true)));
 #undef range








[SPARC] Adjust multiply costs in 64-bit mode

2015-02-03 Thread Eric Botcazou
As discovered by Rainer, an oversight in sparc_rtx_costs causes the function
to return very high costs for multiply operations with -m64 for default V9 or
newer processors (when TARGET_DEPRECATED_V8_INSNS is not set, to be precise).

Fixed by the attached patch to config/sparc/sparc.c, the config/sparc/sparc.h 
hunk being a no-op (because TARGET_V8PLUS => TARGET_DEPRECATED_V8_INSNS).
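
The effect of the sparc.c change can be sketched with a small stand-alone model of the MULT case. This is not the GCC code: ProcessorCosts, TargetFlags, kNoHardMulCost and the two functions are hypothetical, the bit_cost term is left out, and the value used in place of COSTS_N_INSNS (25) is an assumed placeholder. It only shows which branch is taken before and after the patch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the structures involved.
struct ProcessorCosts {
  int float_mul;
  int int_mul;
  int int_mulX;
};

struct TargetFlags {
  bool arch32;    // models TARGET_ARCH32
  bool hard_mul;  // models TARGET_HARD_MUL
};

// Placeholder for COSTS_N_INSNS (25); the scale factor is an assumption.
constexpr int kNoHardMulCost = 25 * 4;

// Before the patch: any target without TARGET_HARD_MUL gets the high cost,
// including 64-bit targets that do have a hardware multiply.
inline int mult_cost_before(const ProcessorCosts &c, const TargetFlags &t,
                            bool float_mode, bool di_mode) {
  if (float_mode) return c.float_mul;
  if (!t.hard_mul) return kNoHardMulCost;
  return di_mode ? c.int_mulX : c.int_mul;
}

// After the patch: the high cost is limited to 32-bit targets, and 64-bit
// targets without TARGET_HARD_MUL are costed with int_mulX.
inline int mult_cost_after(const ProcessorCosts &c, const TargetFlags &t,
                           bool float_mode, bool di_mode) {
  if (float_mode) return c.float_mul;
  if (t.arch32 && !t.hard_mul) return kNoHardMulCost;
  return (di_mode || !t.hard_mul) ? c.int_mulX : c.int_mul;
}
```

For a 64-bit target with hard_mul false, the first function returns the placeholder high cost while the second returns int_mulX; for every other flag combination the two agree.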

Tested on SPARC/Solaris and SPARC64/Solaris, applied on the mainline.


2015-02-03  Eric Botcazou  

PR target/62631
* config/sparc/sparc.h (TARGET_HARD_MUL): Remove TARGET_V8PLUS.
(TARGET_HARD_MUL32): Rewrite based on TARGET_HARD_MUL.
* config/sparc/sparc.c (sparc_rtx_costs) : Return costs based on
int_mulX for integers in 64-bit mode if TARGET_HARD_MUL is not set.


-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 220343)
+++ config/sparc/sparc.c	(working copy)
@@ -11075,7 +11075,7 @@ sparc_rtx_costs (rtx x, int code, int ou
 case MULT:
   if (float_mode_p)
 	*total = sparc_costs->float_mul;
-  else if (! TARGET_HARD_MUL)
+  else if (TARGET_ARCH32 && !TARGET_HARD_MUL)
 	*total = COSTS_N_INSNS (25);
   else
 	{
@@ -3,7 +3,7 @@ sparc_rtx_costs (rtx x, int code, int ou
 	  bit_cost = COSTS_N_INSNS (bit_cost);
 	}
 
-	  if (mode == DImode)
+	  if (mode == DImode || !TARGET_HARD_MUL)
 	*total = sparc_costs->int_mulX + bit_cost;
 	  else
 	*total = sparc_costs->int_mul + bit_cost;
Index: config/sparc/sparc.h
===
--- config/sparc/sparc.h	(revision 220343)
+++ config/sparc/sparc.h	(working copy)
@@ -426,22 +426,20 @@ extern enum cmodel sparc_cmodel;
 #define WCHAR_TYPE_SIZE 16
 
 /* Mask of all CPU selection flags.  */
-#define MASK_ISA \
-(MASK_V8 + MASK_SPARCLITE + MASK_SPARCLET + MASK_V9 + MASK_DEPRECATED_V8_INSNS)
+#define MASK_ISA	\
+  (MASK_SPARCLITE + MASK_SPARCLET			\
+   + MASK_V8 + MASK_V9 + MASK_DEPRECATED_V8_INSNS)
 
-/* TARGET_HARD_MUL: Use hardware multiply instructions but not %y.
-   TARGET_HARD_MUL32: Use hardware multiply instructions with rd %y
-   to get high 32 bits.  False in V8+ or V9 because multiply stores
-   a 64-bit result in a register.  */
-
-#define TARGET_HARD_MUL32\
-  ((TARGET_V8 || TARGET_SPARCLITE			\
-|| TARGET_SPARCLET || TARGET_DEPRECATED_V8_INSNS)	\
-   && ! TARGET_V8PLUS && TARGET_ARCH32)
+/* TARGET_HARD_MUL: Use 32-bit hardware multiply instructions but not %y.  */
+#define TARGET_HARD_MUL\
+  (TARGET_SPARCLITE || TARGET_SPARCLET		\
+   || TARGET_V8 || TARGET_DEPRECATED_V8_INSNS)
 
-#define TARGET_HARD_MUL	\
-  (TARGET_V8 || TARGET_SPARCLITE || TARGET_SPARCLET	\
-   || TARGET_DEPRECATED_V8_INSNS || TARGET_V8PLUS)
+/* TARGET_HARD_MUL32: Use 32-bit hardware multiply instructions with %y
+   to get high 32 bits.  False in 64-bit or V8+ because multiply stores
+   a 64-bit result in a register.  */
+#define TARGET_HARD_MUL32 \
+  (TARGET_HARD_MUL && TARGET_ARCH32 && !TARGET_V8PLUS)
 
 /* MASK_APP_REGS must always be the default because that's what
FIXED_REGISTERS is set to and -ffixed- is processed before

Re: [PATCH] Fix combiner from accessing or writing out of bounds SET_N_REGS (PR other/63504)

2015-02-03 Thread Eric Botcazou
> Changed in my copy to
> /* One plus the highest pseudo for which we track REG_N_SETS.  */
> 
> Ok with that change?

OK if you also add (part of) the explanation you wrote down in the opening 
message to the above comment.

-- 
Eric Botcazou


[PATCH] Remove xfail for wrapped target from libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc

2015-02-03 Thread Renlin Li

Hi all,

This patch simply removes the target selector. The test should pass on all
targets to which it applies.

The comment in the code is not correct: stderr is redirected, not stdout.
Therefore, the return status, which is streamed to stdout, should be properly
captured even on a wrapped target.



Okay for trunk?


libstdc++-v3/ChangeLog:

2015-02-03  Renlin Li

* testsuite/27_io/ios_base/sync_with_stdio/1.cc: Remove xfail for
wrapped target.
diff --git a/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc b/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
index 6edaef3..1c9fa60 100644
--- a/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
+++ b/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
@@ -23,12 +23,6 @@
 // @require@ %-*.tst
 // @diff@ %-*.tst %-*.txt
 
-// This test fails on platforms using a wrapper, because this test
-// redirects stdout to a file and so the exit status printed by the
-// wrapper is not visibile to DejaGNU.  DejaGNU then assumes that the
-// test exited with a non-zero exit status.
-// { dg-do run { xfail { ! unwrapped } } }
-
 #include 
 #include 
 #include 

Re: [PATCH] i386: XFAIL the scan-assembler in pr49095.c (PR61225)

2015-02-03 Thread Jakub Jelinek
On Tue, Feb 03, 2015 at 12:49:19AM -0800, Segher Boessenkool wrote:
> As discussed in PR61225, we won't be able to fix the minor regression here
> for GCC 5, so let's XFAIL this test for ia32.
> 
> Tested on x86_64-linux -m32 and -m64.  Okay for mainline?
> 
> 
> Segher
> 
> 
> 2015-02-03  Segher Boessenkool  
> 
> gcc/testsuite/
>   PR middle-end/61225
>   gcc.target/i386/pr49095.c: XFAIL for ia32.

Ok, thanks.

> --- a/gcc/testsuite/gcc.target/i386/pr49095.c
> +++ b/gcc/testsuite/gcc.target/i386/pr49095.c
> @@ -70,4 +70,5 @@ G (short)
>  G (int)
>  G (long)
>  
> -/* { dg-final { scan-assembler-not "test\[lq\]" } } */
> +/* See PR61225 for the XFAIL.  */
> +/* { dg-final { scan-assembler-not "test\[lq\]" { xfail { ia32 } } } } */
> -- 
> 1.8.1.4

Jakub


[PATCH] i386: XFAIL the scan-assembler in pr49095.c (PR61225)

2015-02-03 Thread Segher Boessenkool
As discussed in PR61225, we won't be able to fix the minor regression here
for GCC 5, so let's XFAIL this test for ia32.

Tested on x86_64-linux -m32 and -m64.  Okay for mainline?


Segher


2015-02-03  Segher Boessenkool  

gcc/testsuite/
PR middle-end/61225
gcc.target/i386/pr49095.c: XFAIL for ia32.

---
 gcc/testsuite/gcc.target/i386/pr49095.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr49095.c 
b/gcc/testsuite/gcc.target/i386/pr49095.c
index b7d1fb2..cc79da2 100644
--- a/gcc/testsuite/gcc.target/i386/pr49095.c
+++ b/gcc/testsuite/gcc.target/i386/pr49095.c
@@ -70,4 +70,5 @@ G (short)
 G (int)
 G (long)
 
-/* { dg-final { scan-assembler-not "test\[lq\]" } } */
+/* See PR61225 for the XFAIL.  */
+/* { dg-final { scan-assembler-not "test\[lq\]" { xfail { ia32 } } } } */
-- 
1.8.1.4



Re: [PATCH, v0] fortran: !GCC$ unroll for DO

2015-02-03 Thread Tobias Burnus
Bernhard Reutner-Fischer wrote:
> Some compilers IIRC use !DIR$ unroll, if memory serves me right then
> the DEC compiler had !DEC$ unroll.
> We could support one or the other three-letter keyword or maybe not.

Intel's compiler supports quite a lot of loop directives. (Its Fortran
front end is based on DEC's and part of the developer team also moved
from Compaq to Intel. Not that Intel didn't add a bunch of additional
directives later on.)


Supported are: simd, ivdep, loop count, (no)vector, inline, force, (no)inline,
(no)unroll, unroll_and_jam, nofusion and distribution point.

GCC supports "ivdep" (C/C++), simd (I think only via -fcilkplus; but
OpenMP's 'omp simd' is a replacement [-fopenmp/-fopenmp-simd] in C/C++ and
Fortran) - and "unroll" as proposed for C/C++ and with this patch for
Fortran.

By the way: gfortran automatically annotates 'do concurrent' with 'ivdep'.
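
As a small illustration of the existing C/C++ side, here is a sketch using GCC's ivdep pragma. The function scale_add and its parameters are made up for this example. The pragma asserts that the loop has no loop-carried dependencies that would prevent vectorization; whether the loop is actually vectorized still depends on the optimization options and the target.

```cpp
#include <cstddef>

// Hypothetical example function: dst[i] += k * src[i] for i in [0, n).
// The caller must guarantee that dst and src do not overlap, since that is
// what the pragma promises to the compiler.
inline void scale_add(float *dst, const float *src, std::size_t n, float k) {
#pragma GCC ivdep
  for (std::size_t i = 0; i < n; ++i)
    dst[i] += k * src[i];
}
```

The pragma only changes what the compiler may assume, not the result of a correct program, so the function behaves the same with or without it as long as the no-overlap promise holds.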

For Intel's loop directives, see:
https://software.intel.com/en-us/articles/getting-started-with-intel-composer-xe-2013-compiler-pragmas-and-directives
C++: https://software.intel.com/en-us/node/524494
Fortran: https://software.intel.com/en-us/node/525781


 * * *

Regarding the patch: In general, I prefer to stick to standard methods
(which are portable) and think that those user knobs often make things
slower rather than faster (as they tend to stay for years, even after the
hardware has moved on - or they are even inserted blindly).
Still, I think it would be fine to add it.

Tobias

PS: For a non-RFC patch, you also need to update gfortran.texi.


Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition

2015-02-03 Thread Bin.Cheng
On Tue, Feb 3, 2015 at 3:24 PM, Jeff Law  wrote:
> On 02/02/15 08:59, Alex Velenko wrote:
>>
>> On 11/10/14 13:44, Felix Yang wrote:
>>>
>>> Hello Jeff,
>>>
>>>  I see that you have improved the RTL typesafety issue for ira.c,
>>> so I rebased this patch
>>>  on the latest trunk and change to use the new list walking
>>> interface.
>>>  Bootstrapped on x86_64-SUSE-Linux and make check regression tested.
>>>  OK for trunk?
>>
>> Hi Felix,
>> I believe your patch causes a regression for arm-none-eabi.
>> FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54
>> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2
>>
>> This happens because your patch stops reuse of code for
>> " return -1;" statements in pr43920-2.c.
>>
>> As far as I investigated, your patch prevents adding "(expr_list (-1)
>> (nil)" in ira pass, which prevents jump2 optimization from happening.
>>
>> So before, in ira pass I could see:
>> "(insn 9 53 34 8 (set (reg:SI 110 [ D.4934 ])
>>  (const_int -1 [0x]))
>> /work/fsf-trunk-ref-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20
>> 613
>> {*thumb2_movsi_vfp}
>>   (expr_list:REG_EQUAL (const_int -1 [0x])
>>  (nil)))"
>> But with your patch I get
>> "(insn 9 53 34 8 (set (reg:SI 110 [ D.5322 ])
>>  (const_int -1 [0x]))
>> /work/fsf-trunk-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20
>> 615 {*thumb2_movsi_vfp}
>>   (nil))"
>>
>> This causes a code generation regression and needs to be fixed.
>> Kind regards,
>
> We'd need to see the full dumps.  In particular is reg110 set anywhere else?
> If so then the change is doing precisely what it should be doing and the
> test needs to be updated to handle the different code we generate.

Hmm, if I understand correctly, it's a code size regression, so I
don't think it's appropriate to adapt the test case.  Either the patch
or something else in GCC is doing something wrong, right?

Hi Alex, could you please file a PR with full dump information for tracking?

Thanks,
bin
>
> Jeff