Re: [PATCH][4/4] Always 64bit-HWI cleanups
On May 27, 2014 3:58:06 AM CEST, David Edelsohn <dje@gmail.com> wrote:

> This patch (further) broke bootstrap on AIX.  AIX defaults to 32 bit,
> although it supports 64 bit HWI.
>
> /nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'int print_statistics(bitmap_descriptor_d**, output_info*)':
> /nasfarm/edelsohn/src/src/gcc/bitmap.c:2168:24: error: expected ')' before 'PRId64'
> /nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'void dump_bitmap_statistics()':
> /nasfarm/edelsohn/src/src/gcc/bitmap.c:2202:15: error: expected ')' before 'PRId64'
> /nasfarm/edelsohn/src/src/gcc/bt-load.c: In function 'void migrate_btr_defs(reg_class, int)':
> /nasfarm/edelsohn/src/src/gcc/bt-load.c:1414:34: error: expected ')' before 'PRId64'
> /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_edge_info(FILE*, edge, int, int)':
> /nasfarm/edelsohn/src/src/gcc/cfg.c:489:25: error: expected ')' before 'PRId64'
> /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)':
> /nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64'

This means AIX has inttypes.h but not the standard-mandated PRI?64
macros.  Does it have stdint.h and therein int64_t?

A possibility is to define the GCC fallbacks when the defines are not
defined, as opposed to only when inttypes.h is not available.

Can you investigate the situation on AIX some more?

Thanks,
Richard.

> bitmap.i looks like:
>
> int
> print_statistics (bitmap_descriptor_d **slot, output_info *i)
> {
>   bitmap_descriptor d = *slot;
>   char s[4096];
>
>   if (d->allocated)
>     {
>       const char *s1 = d->file;
>       const char *s2;
>       while ((s2 = strstr (s1, "gcc/")))
>         s1 = s2 + 4;
>       sprintf (s, "%s:%i (%s)", s1, d->line, d->function);
>       s[41] = 0;
>       fprintf ((&_iob[2]),
>                "%-41s %9u %15" PRId64 " %15" PRId64 " %15" PRId64 " %10" PRId64 " %10" PRId64 "\n",
>                s, d->created, d->allocated, d->peak, d->current,
>                d->nsearches, d->search_iter);
>       i->size += d->allocated;
>       i->count += d->created;
>     }
>   return 1;
> }
>
> - David
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On Tue, May 27, 2014 at 08:26:35AM +0200, Richard Biener wrote:
> > /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)':
> > /nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64'
>
> This means aix has inttypes.h but not the standard mandated PRI?64
> macros.  Does it have stdint.h and therein int64_t?
>
> A possibility is to define the GCC fallbacks when the defines are not
> defined as opposed to only when inttypes.h is not available.

BTW, we should be prepared for C++11 udlits and use spaces around the
PRI* macros, in case some compiler treats those as udlits
unconditionally.

	Jakub
[PATCH, libgomp]: Require vect_simd_clones effective target for libgomp.fortran/declare-simd-[12].f90
Hello!

These tests require vect_simd_clones effective target, as the target
should be able to compile AVX clones.

2014-05-27  Uros Bizjak  <ubiz...@gmail.com>

	* testsuite/libgomp.fortran/declare-simd-1.f90: Require
	vect_simd_clones effective target.  Remove dg-additional-options
	directives.
	* testsuite/libgomp.fortran/declare-simd-2.f90: Ditto.

Tested on x86_64-linux-gnu CentOS 5.

Uros.

Index: testsuite/libgomp.fortran/declare-simd-1.f90
===
--- testsuite/libgomp.fortran/declare-simd-1.f90	(revision 210956)
+++ testsuite/libgomp.fortran/declare-simd-1.f90	(working copy)
@@ -1,6 +1,5 @@
 ! { dg-options "-fno-inline" }
-! { dg-additional-options "-msse2" { target sse2_runtime } }
-! { dg-additional-options "-mavx" { target avx_runtime } }
+! { dg-require-effective-target vect_simd_clones }
 module declare_simd_1_mod
   contains
Index: testsuite/libgomp.fortran/declare-simd-2.f90
===
--- testsuite/libgomp.fortran/declare-simd-2.f90	(revision 210956)
+++ testsuite/libgomp.fortran/declare-simd-2.f90	(working copy)
@@ -1,8 +1,7 @@
 ! { dg-do run }
 ! { dg-options "-fno-inline" }
- ! { dg-additional-sources declare-simd-3.f90 }
-! { dg-additional-options "-msse2" { target sse2_runtime } }
-! { dg-additional-options "-mavx" { target avx_runtime } }
+! { dg-additional-sources declare-simd-3.f90 }
+! { dg-require-effective-target vect_simd_clones }
 module declare_simd_2_mod
   contains
Re: [PATCH, libgomp]: Require vect_simd_clones effective target for libgomp.fortran/declare-simd-[12].f90
On Tue, May 27, 2014 at 08:58:18AM +0200, Uros Bizjak wrote:
> These tests require vect_simd_clones effective target, as the target
> should be able to compile AVX clones.
>
> 2014-05-27  Uros Bizjak  <ubiz...@gmail.com>
>
> 	* testsuite/libgomp.fortran/declare-simd-1.f90: Require
> 	vect_simd_clones effective target.  Remove dg-additional-options
> 	directives.
> 	* testsuite/libgomp.fortran/declare-simd-2.f90: Ditto.
>
> Tested on x86_64-linux-gnu CentOS 5.

Please don't remove the dg-additional-options there, that is completely
intentional there, only the simd clones are built for SSE2/AVX/AVX2, the
simd loops are built with whatever options the loop is compiled with,
and for the common case (AVX or later HW, but compiler not configured to
support only AVX or later) I want to test as much vectorization as
possible.

Requiring vect_simd_clones or the whitespace change is fine, though I'd
just use
! { dg-do run { target vect_simd_clones } }
instead of dg-require-effective-target.

> Index: testsuite/libgomp.fortran/declare-simd-1.f90
> ===
> --- testsuite/libgomp.fortran/declare-simd-1.f90	(revision 210956)
> +++ testsuite/libgomp.fortran/declare-simd-1.f90	(working copy)
> @@ -1,6 +1,5 @@
>  ! { dg-options "-fno-inline" }
> -! { dg-additional-options "-msse2" { target sse2_runtime } }
> -! { dg-additional-options "-mavx" { target avx_runtime } }
> +! { dg-require-effective-target vect_simd_clones }
>  module declare_simd_1_mod
>    contains
> Index: testsuite/libgomp.fortran/declare-simd-2.f90
> ===
> --- testsuite/libgomp.fortran/declare-simd-2.f90	(revision 210956)
> +++ testsuite/libgomp.fortran/declare-simd-2.f90	(working copy)
> @@ -1,8 +1,7 @@
>  ! { dg-do run }
>  ! { dg-options "-fno-inline" }
> - ! { dg-additional-sources declare-simd-3.f90 }
> -! { dg-additional-options "-msse2" { target sse2_runtime } }
> -! { dg-additional-options "-mavx" { target avx_runtime } }
> +! { dg-additional-sources declare-simd-3.f90 }
> +! { dg-require-effective-target vect_simd_clones }
>  module declare_simd_2_mod
>    contains

	Jakub
RE: [PATCH] Fix PR54733 Optimize endian independent load/store
Hi Christophe and Andreas,

> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]
> On Behalf Of Thomas Preud'homme
>
> I suspect it's the same kind of problem m68k ran into. I already wrote
> a patch to reduce the impact of different bitfield layouts and it's in
> review right now. I'll send it to you tomorrow for testing.

Can you both try the attached patch to see if it solves the bug you
experienced? Andreas, I wrote it based on the correction you did, so it
should at least work for m68k.

ChangeLog is:

2014-05-26  Thomas Preud'homme  <thomas.preudho...@arm.com>

	* gcc.c-torture/execute/bswap-2.c: Add alignment constraints to
	bitfield to make it more portable.

Best regards,

Thomas

fix_bswap-2.diff
Description: Binary data
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Mon, May 26, 2014 at 9:01 PM, Patrick Palka <patr...@parcs.ath.cx> wrote:
> Hi,
>
> This patch teaches the C++ frontend to warn on NULL checks against
> inline functions and it teaches the middle-end to fold NULL checks
> against inline functions. These two things are currently done for
> non-inline C++ functions, but inline functions are exceptional in that
> the C++ frontend marks them as weak, and NULL checks against weak
> symbols cannot be folded in general because the symbol may be mapped
> to NULL at link-time.
>
> But in the case of an inline function, the fact that it's a weak
> symbol is an implementation detail. And because it is not permitted to
> explicitly give an inline function the weak attribute (see
> handle_weak_attribute), in order to achieve $SUBJECT it suffices to
> assume that all inline functions are non-null, which is what this
> patch does.
>
> Bootstrap and regtest against x86_64-unknown-linux-gnu in progress.
> Is this patch OK assuming no regressions?

What about always_inline function wrappers as used in FORTIFY_SOURCE?
IIRC the actual referenced function may be declared weak and thus the
address can compare to NULL?  Sth like

extern void __foo (void) __attribute__((weak,asm("foo")));
extern inline __attribute__((always_inline,gnu_inline)) void foo (void)
{
  __foo ();
}
int main()
{
  if (foo == 0)
    return 0;
  abort ();
}

The issue is that the inline and alias may be hidden to the user and
thus he'll get unwanted and confusing warnings.

Richard.

> 2014-05-26  Patrick Palka  <patr...@parcs.ath.cx>
>
> 	* c-family/c-common.c (decl_with_nonnull_addr_p): Assume inline
> 	functions are non-null.
> 	* fold-const.c (tree_single_nonzero_warnv_p): Likewise.
---
 gcc/c-family/c-common.c         |  4 +++-
 gcc/fold-const.c                |  5 ++++-
 gcc/testsuite/g++.dg/inline-1.C | 14 ++++++++++++++
 3 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/inline-1.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 6ec14fc..d4747a0 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -4508,7 +4508,9 @@ decl_with_nonnull_addr_p (const_tree expr)
   return (DECL_P (expr)
 	  && (TREE_CODE (expr) == PARM_DECL
 	      || TREE_CODE (expr) == LABEL_DECL
-	      || !DECL_WEAK (expr)));
+	      || !DECL_WEAK (expr)
+	      || (TREE_CODE (expr) == FUNCTION_DECL
+		  && DECL_DECLARED_INLINE_P (expr))));
 }
 
 /* Prepare expr to be an argument of a TRUTH_NOT_EXPR,
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 188b179..2796079 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -16052,7 +16052,10 @@ tree_single_nonzero_warnv_p (tree t, bool *strict_overflow_p)
 	  || (DECL_CONTEXT (base)
 	      && TREE_CODE (DECL_CONTEXT (base)) == FUNCTION_DECL
 	      && auto_var_in_fn_p (base, DECL_CONTEXT (base))))
-	return !VAR_OR_FUNCTION_DECL_P (base) || !DECL_WEAK (base);
+	return !VAR_OR_FUNCTION_DECL_P (base)
+	       || !DECL_WEAK (base)
+	       || (TREE_CODE (base) == FUNCTION_DECL
+		   && DECL_DECLARED_INLINE_P (base));
 
       /* Constants are never weak.  */
       if (CONSTANT_CLASS_P (base))
diff --git a/gcc/testsuite/g++.dg/inline-1.C b/gcc/testsuite/g++.dg/inline-1.C
new file mode 100644
index 000..65beff1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/inline-1.C
@@ -0,0 +1,14 @@
+// { dg-options "-Waddress" }
+
+inline void
+foo (void)
+{
+}
+
+int
+bar (void)
+{
+  return foo == 0; // { dg-warning "never be NULL" }
+}
+
+// { dg-final { scan-assembler-not "foo" } }
-- 
2.0.0.rc2
Re: [PATCH, libgomp]: Require vect_simd_clones effective target for libgomp.fortran/declare-simd-[12].f90
On Tue, May 27, 2014 at 9:18 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> Please don't remove the dg-additional-options there, that is
> completely intentional there, only the simd clones are built for
> SSE2/AVX/AVX2, the simd loops are built with whatever options the loop
> is compiled with, and for the common case (AVX or later HW, but
> compiler not configured to support only AVX or later) I want to test
> as much vectorization as possible.
>
> Requiring vect_simd_clones or the whitespace change is fine, though
> I'd just use
> ! { dg-do run { target vect_simd_clones } }
> instead of dg-require-effective-target.

Thanks for the explanation. Following is the v2 patch that I plan to
commit after testing:

2014-05-27  Uros Bizjak  <ubiz...@gmail.com>

	* testsuite/libgomp.fortran/declare-simd-1.f90: Require
	vect_simd_clones effective target.
	* testsuite/libgomp.fortran/declare-simd-2.f90: Ditto.

Uros.

Index: testsuite/libgomp.fortran/declare-simd-1.f90
===
--- testsuite/libgomp.fortran/declare-simd-1.f90	(revision 210956)
+++ testsuite/libgomp.fortran/declare-simd-1.f90	(working copy)
@@ -1,3 +1,4 @@
+! { dg-do run { target vect_simd_clones } }
 ! { dg-options "-fno-inline" }
 ! { dg-additional-options "-msse2" { target sse2_runtime } }
 ! { dg-additional-options "-mavx" { target avx_runtime } }
Index: testsuite/libgomp.fortran/declare-simd-2.f90
===
--- testsuite/libgomp.fortran/declare-simd-2.f90	(revision 210956)
+++ testsuite/libgomp.fortran/declare-simd-2.f90	(working copy)
@@ -1,6 +1,6 @@
-! { dg-do run }
+! { dg-do run { target vect_simd_clones } }
 ! { dg-options "-fno-inline" }
- ! { dg-additional-sources declare-simd-3.f90 }
+! { dg-additional-sources declare-simd-3.f90 }
 ! { dg-additional-options "-msse2" { target sse2_runtime } }
 ! { dg-additional-options "-mavx" { target avx_runtime } }
Re: [PATCH, libgomp]: Require vect_simd_clones effective target for libgomp.fortran/declare-simd-[12].f90
On Tue, May 27, 2014 at 09:40:23AM +0200, Uros Bizjak wrote:
> Following is the v2 patch that I plan to commit after testing:
>
> 2014-05-27  Uros Bizjak  <ubiz...@gmail.com>
>
> 	* testsuite/libgomp.fortran/declare-simd-1.f90: Require
> 	vect_simd_clones effective target.
> 	* testsuite/libgomp.fortran/declare-simd-2.f90: Ditto.

Thanks.

	Jakub
[PATCH, fortran]: Include stdlib.h in intrinsics/getcwd.c
... to avoid an implicit declaration of function ‘free’ warning.

2014-05-27  Uros Bizjak  <ubiz...@gmail.com>

	* intrinsics/getcwd.c: Include <stdlib.h>.

Tested on x86_64-pc-linux-gnu. OK for mainline?

Uros.

Index: intrinsics/getcwd.c
===
--- intrinsics/getcwd.c	(revision 210956)
+++ intrinsics/getcwd.c	(working copy)
@@ -25,6 +25,7 @@
 
 #include "libgfortran.h"
 
+#include <stdlib.h>
 #include <string.h>
 #include <errno.h>
Re: [build, doc, testsuite] Centralise clearing hardware capabilities with Sun ld
Rainer Orth <r...@cebitec.uni-bielefeld.de> writes:
> Rainer Orth <r...@cebitec.uni-bielefeld.de> writes:
>> Prompted by the recent failures of c-c++-common/gomp/pr60823-2.c on
>> Solaris/x86 with Sun as
>>
>> 	http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00943.html
>>
>> I've reworked the clearing of hardware capabilities via linker maps
>> for Sun ld, which is currently replicated in several places all over
>> the testsuite.  I've chosen to use TEST_ALWAYS_FLAGS or
>> ALWAYS_CXXFLAGS to pass on the necessary linker flags.
>
> It turned out that more is needed to deal with the new gcc.dg/gomp and
> g++.dg/gomp failures:
>
> 	https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01185.html
>
> The following patch implements what I've outlined there: it introduces
> a Solaris-specific new -mclear-hwcap option, checking which version of
> the mapfile syntax works, if any.  This way, all the duplication in
> the testsuite and libitm can go.  To deal with the OpenMP declare simd
> failures, this option/mapfile is automatically activated with
> -fopenmp*.
>
> Initial testing has concluded successfully, also on a Solaris 10/x86
> system where ld only understands the old mapfile syntax.
>
> Mainline bootstraps on i386-pc-solaris2.11 (as/ld, gas/ld, gas/gld),
> i386-pc-solaris2.10 (as/ld, gas/ld), and sparc-sun-solaris2.11 (as/ld)
> are still running.  x86_64-unknown-linux-gnu didn't finish due to
> unrelated libgo breakage.  Also, an i386-pc-solaris2.9 bootstrap is
> running on the 4.9 branch, which is equally affected by the testsuite
> failures.
>
> Ok for mainline and 4.9 branch if those pass?

It's been a week since I submitted the patch, so far having received
approval for the testsuite parts only.  While I don't need approval for
the Solaris-only parts, it would be good if the doc and build changes
could be checked by the respective maintainers.  One might argue that
they are Solaris-only, thus falling under my maintainership, but better
safe than sorry.

Thanks.
	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH, ARM] interrupt testcases for shink-wrapping
Hi,

Following up Richard's comments:
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00967.html

I investigated interrupt routines:

* gcc can shrink-wrap an interrupt routine (ARM mode).
  shrink-wrap-interrupt_1.c in the patch shows it.
* There are restrictions on shrink-wrapping interrupt routines,
  e.g. shrink-wrap-interrupt_2.c can not be shrink-wrapped.
* When an interrupt routine is shrink-wrapped, it uses
  *cond_simple_return to return from the function.  There is no dwarf
  info issue, so no need to change the code.

The patch only includes two test cases to track it.

OK for trunk?

Thanks!
-Zhenqiang

testsuite/ChangeLog:
2014-05-27  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

	* gcc.target/arm/shrink-wrap-interrupt_1.c: New test.
	* gcc.target/arm/shrink-wrap-interrupt_2.c: New test.

diff --git a/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_1.c b/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_1.c
new file mode 100644
index 000..3432115
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_1.c
@@ -0,0 +1,24 @@
+/* Verify that shrink-wrapping works for functions
+   with __attribute__ ((interrupt)).  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_nothumb } */
+/* { dg-options "-O2 -marm -mapcs -fdump-rtl-pro_and_epilogue" } */
+
+/* This test is not valid when -mthumb.  */
+
+extern void test (int, int) __attribute__ ((interrupt ("IRQ")));
+int p;
+
+void
+test (int a, int b)
+{
+  if (a > b)
+    {
+      p = a + 0x123456;
+      p = p * (b - 0x9879876);
+    }
+}
+
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
diff --git a/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_2.c b/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_2.c
new file mode 100644
index 000..ecb793e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/shrink-wrap-interrupt_2.c
@@ -0,0 +1,24 @@
+/* Verify that shrink-wrapping works for functions
+   with __attribute__ ((interrupt)).  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_nothumb } */
+/* { dg-options "-O2 -marm -mapcs -fdump-rtl-pro_and_epilogue" } */
+
+/* This test is not valid when -mthumb.  */
+
+extern void test (int, int) __attribute__ ((interrupt ("IRQ")));
+int p;
+
+void
+test (int a, int b)
+{
+  if (a)
+    {
+      p = a + 0x123456;
+      p = p * (b - 0x9879876);
+    }
+}
+
+/* { dg-final { scan-rtl-dump-not "Performing shrink-wrapping" "pro_and_epilogue" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */
[PATCH] Fix an AARCH64/ARM performance regression
Hi,

I have been informed that the following check-in may cause a slight
performance regression on aarch64 and arm on trunk and gcc-4_9-branch:
see https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205899

This can be seen with the following test example:

test.c:

typedef struct
{
  int *x;
  int *y;
  int *z;
  int *m;
  int *n;
  int *o;
} A;

typedef struct
{
  int x;
  int y;
  int z;
  int m;
  int n;
  int o;
} B;

A a[6];
B *b;

void test (int i)
{
  A *a1 = &a[i];
  B *b1 = &b[i];

  a1->x = &b1->x;
  a1->y = &b1->y;
  a1->z = &b1->z;
  a1->m = &b1->m;
  a1->n = &b1->n;
  a1->o = &b1->o;
}

When compiled with aarch64-elf-gcc -O3 -S we get the following
assembler code:

test:
	adrp	x1, b
	sxtw	x3, w0
	mov	w5, 24
	adrp	x2, a
	add	x2, x2, :lo12:a
	ldr	x4, [x1, #:lo12:b]
	lsl	x1, x3, 2
	sub	x1, x1, x3
	lsl	x1, x1, 4
	smaddl	x0, w0, w5, x4
	add	x3, x2, x1
	add	x7, x0, 8
	add	x6, x0, 16
	add	x8, x0, 4
	add	x5, x0, 12
	add	x4, x0, 20
	str	x0, [x2, x1]
	mov	x0, x3
	mov	x1, x3
	str	x8, [x3, 8]
	str	x7, [x0, 16]!
	str	x6, [x1, 32]!
	str	x5, [x0, 8]
	str	x4, [x1, 8]
	ret

Note the two mov instructions, and that two extra registers are used to
store the values. The mov instructions have not been there before the
patch.
After some investigation I found out how this happened. The difference
is first visible with -fdump-rtl-expand-slim:

    1: NOTE_INSN_DELETED
    4: NOTE_INSN_BASIC_BLOCK 2
    2: r83:SI=x0:SI
    3: NOTE_INSN_FUNCTION_BEG
    6: r74:DI=sign_extend(r83:SI)
    7: r85:DI=high(`b')
    8: r84:DI=r85:DI+low(`b')
          REG_EQUAL `b'
    9: r87:SI=0x18
   10: r86:DI=sign_extend(r83:SI)*sign_extend(r87:SI)
   11: r88:DI=[r84:DI]
   12: r76:DI=r88:DI+r86:DI
   13: r90:DI=high(`a')
   14: r89:DI=r90:DI+low(`a')
          REG_EQUAL `a'
   15: r91:DI=sign_extend(r83:SI)
   16: r92:DI=r91:DI
   17: r93:DI=r92:DI<<0x2
   18: r94:DI=r93:DI-r91:DI
          REG_EQUAL r91:DI*0x3
   19: r95:DI=r94:DI<<0x4
   20: r94:DI=r95:DI
          REG_EQUAL r91:DI*0x30
   21: r96:DI=r89:DI+r94:DI
   22: [r96:DI]=r76:DI
   23: r98:DI=high(`a')
   24: r97:DI=r98:DI+low(`a')
          REG_EQUAL `a'
   25: r99:DI=sign_extend(r83:SI)
   26: r100:DI=r99:DI
   27: r101:DI=r100:DI<<0x2
   28: r102:DI=r101:DI-r99:DI
          REG_EQUAL r99:DI*0x3
   29: r103:DI=r102:DI<<0x4
   30: r102:DI=r103:DI
          REG_EQUAL r99:DI*0x30
   31: r104:DI=r97:DI+r102:DI
   32: r105:DI=r76:DI+0x4
   33: [r104:DI+0x8]=r105:DI
   34: r107:DI=high(`a')
   35: r106:DI=r107:DI+low(`a')
          REG_EQUAL `a'
   36: r108:DI=sign_extend(r83:SI)
   37: r109:DI=r108:DI
   38: r110:DI=r109:DI<<0x2
   39: r111:DI=r110:DI-r108:DI
          REG_EQUAL r108:DI*0x3
   40: r112:DI=r111:DI<<0x4
   41: r111:DI=r112:DI
          REG_EQUAL r108:DI*0x30
   42: r113:DI=r106:DI+r111:DI
   43: r114:DI=r113:DI+0x10
   44: r115:DI=r76:DI+0x8
   45: [r114:DI]=r115:DI
   46: r117:DI=high(`a')
   47: r116:DI=r117:DI+low(`a')
          REG_EQUAL `a'
   48: r118:DI=sign_extend(r83:SI)
   49: r119:DI=r118:DI
   50: r120:DI=r119:DI<<0x2
   51: r121:DI=r120:DI-r118:DI
          REG_EQUAL r118:DI*0x3
   52: r122:DI=r121:DI<<0x4
   53: r121:DI=r122:DI
          REG_EQUAL r118:DI*0x30
   54: r123:DI=r116:DI+r121:DI
   55: r124:DI=r123:DI+0x10
   56: r125:DI=r76:DI+0xc
   57: [r124:DI+0x8]=r125:DI
   58: r127:DI=high(`a')
   59: r126:DI=r127:DI+low(`a')
          REG_EQUAL `a'
   60: r128:DI=sign_extend(r83:SI)
   61: r129:DI=r128:DI
   62: r130:DI=r129:DI<<0x2
   63: r131:DI=r130:DI-r128:DI
          REG_EQUAL r128:DI*0x3
   64: r132:DI=r131:DI<<0x4
   65: r131:DI=r132:DI
          REG_EQUAL r128:DI*0x30
   66: r133:DI=r126:DI+r131:DI
   67: r134:DI=r133:DI+0x20
   68: r135:DI=r76:DI+0x10
   69: [r134:DI]=r135:DI
   70: r137:DI=high(`a')
   71: r136:DI=r137:DI+low(`a')
          REG_EQUAL `a'
   72: r138:DI=sign_extend(r83:SI)
   73: r139:DI=r138:DI
   74: r140:DI=r139:DI<<0x2
   75: r141:DI=r140:DI-r138:DI
          REG_EQUAL r138:DI*0x3
   76: r142:DI=r141:DI<<0x4
   77: r141:DI=r142:DI
          REG_EQUAL r138:DI*0x30
   78: r143:DI=r136:DI+r141:DI
   79: r144:DI=r143:DI+0x20
   80: r145:DI=r76:DI+0x14
   81: [r144:DI+0x8]=r145:DI

Note the offset +0x8 on instructions 33, 57, 81 but not on 22, 45, 69.
This artefact was not there before the patch.

Well, this causes little ripples in the later rtl-passes. There is one
pass that does a pretty good job at compensating this effect:
forward-propagate. In simple cases the resulting code does not look any
different, because fwprop folds all offsets together, and the result
looks as before. However, in this example fwprop could not remove some
dead statements, and that made the
Re: [build, driver] RFC: Support compressed debug sections
Joseph S. Myers <jos...@codesourcery.com> writes:
> On Thu, 22 May 2014, Rainer Orth wrote:
>> Joseph S. Myers <jos...@codesourcery.com> writes:
>>> On Thu, 22 May 2014, Rainer Orth wrote:
>>>> 	* common.opt (compressed_debug_sections): New enum.
>>>> 	(gz, gz=): New options.
>>>> 	* opts.c (common_handle_option): Handle OPT_gz, OPT_gz_.
>>>
>>> Given that the options are completely handled via specs, why can't
>>> they just be Driver options (without Common) and so not mentioned in
>>> common_handle_option?
>>
>> If I do this, -gz still gets passed to e.g. cc1, which errors out
>> like this:
>>
>> cc1: error: unrecognised debug output level 'z'
>>
>> It seems my way of handling this is clearer than doing this in opts.c
>> (set_debug_level).
>
> Thanks for the explanation.  The driver changes are OK.

Thanks.  I still need approval for the doc and build parts, as well as
the Darwin and DJGPP changes.

For the latter two, I've included the patch in an
x86_64-apple-darwin11.4.2 build, verifying that -gz is rejected as
expected (no idea if there are any working gas and gold ports that
would support the option).

For DJGPP, I've tried an i386-pc-solaris2.11 x i586-pc-msdosdjgpp
cross.  While the specs additions look correct, trying to compile even
the most trivial program SEGVs:

ro@snoopy 319 > ./xgcc -B./ -o hello hello.c
<built-in>: internal compiler error: Segmentation Fault
0x87549df crash_signal
	/vol/gcc/src/hg/trunk/local/gcc/toplev.c:337
0x82b8270 contains_struct_check
	/vol/gcc/src/hg/trunk/local/gcc/tree.h:2841
0x82b8270 c_common_nodes_and_builtins()
	/vol/gcc/src/hg/trunk/local/gcc/c-family/c-common.c:5619
0x82437be c_init_decl_processing()
	/vol/gcc/src/hg/trunk/local/gcc/c/c-decl.c:3567
0x827fe1a c_objc_common_init()
	/vol/gcc/src/hg/trunk/local/gcc/c/c-objc-common.c:63
0x8756333 lang_dependent_init
	/vol/gcc/src/hg/trunk/local/gcc/toplev.c:1712
0x8756333 do_compile
	/vol/gcc/src/hg/trunk/local/gcc/toplev.c:1901

	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH, committed] PR61319 The tests c-c++-common/ubsan/float-cast-overflow-(1|2|4).c fail on x86_64-apple-darwin*
Patch approved by Jakub in the PR thread.

Dominique

2014-05-27  Dominique d'Humieres  <domi...@lps.ens.fr>

	PR testsuite/61319
	* c-c++-common/ubsan/float-cast-overflow-1.c: Make the sign of
	-nan optional.
	* c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.

Index: gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
===
--- gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c	(revision 210957)
+++ gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c	(working copy)
@@ -97,7 +97,7 @@
 /* { dg-output "\[^\n\r]*value 128.5 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value 132 is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'signed char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value 256 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
@@ -107,7 +107,7 @@
 /* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'unsigned char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -32773 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
@@ -117,7 +117,7 @@
 /* { dg-output "\[^\n\r]*value 32768.5 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value 32772 is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'short int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value 65536 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
@@ -127,7 +127,7 @@
 /* { dg-output "\[^\n\r]*value -1.5 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -1 is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*value -nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -inf is outside the range of representable values of type 'short unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -2.14748e\\\+09 is outside the range of representable values of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
@@ -137,7 +137,7 @@
 /* { dg-output \[^\n\r]*value
Re: Make sure that all insn codes are in (0, LAST_INSN_CODE)
Segher Boessenkool <seg...@kernel.crashing.org> writes:
>> -  puts ("  LAST_INSN_CODE\n\
>> +  printf ("  LAST_INSN_CODE = %d\n\
>>  };\n\
>>  \n\
>> -#endif /* GCC_INSN_CODES_H */");
>> +#endif /* GCC_INSN_CODES_H */", last);
>
> You probably didn't intend to delete the newline at the end of the
> generated file?

Oops.  Updated patch below.

Thanks,
Richard

gcc/
	* gencodes.c (main): Make LAST_INSN_CODE higher than any insn
	code, rather than any named insn's code.

Index: gcc/gencodes.c
===
--- gcc/gencodes.c	2014-05-27 09:38:02.195506710 +0100
+++ gcc/gencodes.c	2014-05-27 09:39:16.002188824 +0100
@@ -50,6 +50,7 @@ gen_insn (rtx insn, int code)
 main (int argc, char **argv)
 {
   rtx desc;
+  int last = 1;
 
   progname = "gencodes";
@@ -82,13 +83,16 @@ enum insn_code {\n\
 	break;
 
       if (GET_CODE (desc) == DEFINE_INSN || GET_CODE (desc) == DEFINE_EXPAND)
-	gen_insn (desc, insn_code_number);
+	{
+	  gen_insn (desc, insn_code_number);
+	  last = insn_code_number + 1;
+	}
     }
 
-  puts ("  LAST_INSN_CODE\n\
+  printf ("  LAST_INSN_CODE = %d\n\
 };\n\
 \n\
-#endif /* GCC_INSN_CODES_H */");
+#endif /* GCC_INSN_CODES_H */\n", last);
 
   if (ferror (stdout) || fflush (stdout) || fclose (stdout))
     return FATAL_EXIT_CODE;
Re: Make sure that all insn codes are in (0, LAST_INSN_CODE)
On Tue, May 27, 2014 at 09:41:08AM +0100, Richard Sandiford wrote:
> 	* gencodes.c (main): Make LAST_INSN_CODE higher than any insn
> 	code, rather than any named insn's code.

Ok.

> --- gcc/gencodes.c	2014-05-27 09:38:02.195506710 +0100
> +++ gcc/gencodes.c	2014-05-27 09:39:16.002188824 +0100
> @@ -50,6 +50,7 @@ gen_insn (rtx insn, int code)
>  main (int argc, char **argv)
>  {
>    rtx desc;
> +  int last = 1;
> 
>    progname = "gencodes";
> @@ -82,13 +83,16 @@ enum insn_code {\n\
>  	break;
> 
>        if (GET_CODE (desc) == DEFINE_INSN || GET_CODE (desc) == DEFINE_EXPAND)
> -	gen_insn (desc, insn_code_number);
> +	{
> +	  gen_insn (desc, insn_code_number);
> +	  last = insn_code_number + 1;
> +	}
>      }
> 
> -  puts ("  LAST_INSN_CODE\n\
> +  printf ("  LAST_INSN_CODE = %d\n\
>  };\n\
>  \n\
> -#endif /* GCC_INSN_CODES_H */");
> +#endif /* GCC_INSN_CODES_H */\n", last);
> 
>    if (ferror (stdout) || fflush (stdout) || fclose (stdout))
>      return FATAL_EXIT_CODE;

	Jakub
Re: [PATCH, fortran]: Include stdlib.h in intrinsics/getcwd.c
Uros Bizjak wrote:
> ... to avoid an implicit declaration of function ‘free’ warning.
>
> 2014-05-27  Uros Bizjak  <ubiz...@gmail.com>
>
> 	* intrinsics/getcwd.c: Include <stdlib.h>.
>
> Tested on x86_64-pc-linux-gnu. OK for mainline?

OK - thanks for the patch.  (I think it also counts as obvious.)

Tobias
Re: [PATCH] Fix an AARCH64/ARM performance regression
On Tue, May 27, 2014 at 10:10 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, I have been informed, that the following check-in may cause a slight performance regression on aarch64 and arm on trunk and gcc-4_9-branch: See https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=205899 This can be seen with the following test example: test.c: typedef struct { int *x; int *y; int *z; int *m; int *n; int *o; }A; typedef struct { int x; int y; int z; int m; int n; int o; }B; A a[6]; B *b; void test(int i) { A *a1 = a[i]; B *b1 = b[i]; a1-x = b1-x; a1-y = b1-y; a1-z = b1-z; a1-m = b1-m; a1-n = b1-n; a1-o = b1-o; } when compiled with aarch64-elf-gcc -O3 -S we get the following assembler code: test: adrp x1, b sxtw x3, w0 mov w5, 24 adrp x2, a addx2, x2, :lo12:a ldr x4, [x1, #:lo12:b] lsl x1, x3, 2 subx1, x1, x3 lsl x1, x1, 4 smaddl x0, w0, w5, x4 addx3, x2, x1 addx7, x0, 8 addx6, x0, 16 addx8, x0, 4 addx5, x0, 12 addx4, x0, 20 str x0, [x2, x1] mov x0, x3 mov x1, x3 str x8, [x3, 8] str x7, [x0, 16]! str x6, [x1, 32]! str x5, [x0, 8] str x4, [x1, 8] ret Note the two mov instructions, and that two extra registers are used to store the values. The mov instructions have not been there before the patch. 
After some investigation I found out how this happened: The difference is first visible with -fdump-rtl-expand-slim:

1: NOTE_INSN_DELETED
4: NOTE_INSN_BASIC_BLOCK 2
2: r83:SI=x0:SI
3: NOTE_INSN_FUNCTION_BEG
6: r74:DI=sign_extend(r83:SI)
7: r85:DI=high(`b')
8: r84:DI=r85:DI+low(`b') REG_EQUAL `b'
9: r87:SI=0x18
10: r86:DI=sign_extend(r83:SI)*sign_extend(r87:SI)
11: r88:DI=[r84:DI]
12: r76:DI=r88:DI+r86:DI
13: r90:DI=high(`a')
14: r89:DI=r90:DI+low(`a') REG_EQUAL `a'
15: r91:DI=sign_extend(r83:SI)
16: r92:DI=r91:DI
17: r93:DI=r92:DI<<0x2
18: r94:DI=r93:DI-r91:DI REG_EQUAL r91:DI*0x3
19: r95:DI=r94:DI<<0x4
20: r94:DI=r95:DI REG_EQUAL r91:DI*0x30
21: r96:DI=r89:DI+r94:DI
22: [r96:DI]=r76:DI
23: r98:DI=high(`a')
24: r97:DI=r98:DI+low(`a') REG_EQUAL `a'
25: r99:DI=sign_extend(r83:SI)
26: r100:DI=r99:DI
27: r101:DI=r100:DI<<0x2
28: r102:DI=r101:DI-r99:DI REG_EQUAL r99:DI*0x3
29: r103:DI=r102:DI<<0x4
30: r102:DI=r103:DI REG_EQUAL r99:DI*0x30
31: r104:DI=r97:DI+r102:DI
32: r105:DI=r76:DI+0x4
33: [r104:DI+0x8]=r105:DI
34: r107:DI=high(`a')
35: r106:DI=r107:DI+low(`a') REG_EQUAL `a'
36: r108:DI=sign_extend(r83:SI)
37: r109:DI=r108:DI
38: r110:DI=r109:DI<<0x2
39: r111:DI=r110:DI-r108:DI REG_EQUAL r108:DI*0x3
40: r112:DI=r111:DI<<0x4
41: r111:DI=r112:DI REG_EQUAL r108:DI*0x30
42: r113:DI=r106:DI+r111:DI
43: r114:DI=r113:DI+0x10
44: r115:DI=r76:DI+0x8
45: [r114:DI]=r115:DI
46: r117:DI=high(`a')
47: r116:DI=r117:DI+low(`a') REG_EQUAL `a'
48: r118:DI=sign_extend(r83:SI)
49: r119:DI=r118:DI
50: r120:DI=r119:DI<<0x2
51: r121:DI=r120:DI-r118:DI REG_EQUAL r118:DI*0x3
52: r122:DI=r121:DI<<0x4
53: r121:DI=r122:DI REG_EQUAL r118:DI*0x30
54: r123:DI=r116:DI+r121:DI
55: r124:DI=r123:DI+0x10
56: r125:DI=r76:DI+0xc
57: [r124:DI+0x8]=r125:DI
58: r127:DI=high(`a')
59: r126:DI=r127:DI+low(`a') REG_EQUAL `a'
60: r128:DI=sign_extend(r83:SI)
61: r129:DI=r128:DI
62: r130:DI=r129:DI<<0x2
63: r131:DI=r130:DI-r128:DI REG_EQUAL r128:DI*0x3
64: r132:DI=r131:DI<<0x4
65: r131:DI=r132:DI REG_EQUAL r128:DI*0x30
66: r133:DI=r126:DI+r131:DI
67: r134:DI=r133:DI+0x20
68: r135:DI=r76:DI+0x10
69: [r134:DI]=r135:DI
70: r137:DI=high(`a')
71: r136:DI=r137:DI+low(`a') REG_EQUAL `a'
72: r138:DI=sign_extend(r83:SI)
73: r139:DI=r138:DI
74: r140:DI=r139:DI<<0x2
75: r141:DI=r140:DI-r138:DI REG_EQUAL r138:DI*0x3
76: r142:DI=r141:DI<<0x4
77: r141:DI=r142:DI REG_EQUAL r138:DI*0x30
78: r143:DI=r136:DI+r141:DI
79: r144:DI=r143:DI+0x20
80: r145:DI=r76:DI+0x14
81: [r144:DI+0x8]=r145:DI

Note the offset +0x8 on the instructions 33, 57, 81 but not on 22, 45, 69. This artefact was not there before the patch. Well, this causes little ripples in the later rtl-passes. There is one pass that does a pretty good job at compensating
Re: [RFC] optimize x - y cmp 0 with undefined overflow
+ tree new_const + = fold_build2_loc (loc, reverse_op, TREE_TYPE (arg1), const2, const1); /* If the constant operation overflowed this can be simplified as a comparison against INT_MAX/INT_MIN. */ - if (TREE_CODE (lhs) == INTEGER_CST - && TREE_OVERFLOW (lhs)) + if (TREE_OVERFLOW (new_const)) well, either use int_const_binop above or retain the check (or use TREE_OVERFLOW_P). Bonus points for using wide-ints here and not relying on TREE_OVERFLOW. The check is useless (you get either NULL_TREE or INTEGER_CST here) but I'll use int_const_binop. + /* Transform comparisons of the form X - Y CMP 0 to X CMP Y. */ + if (TREE_CODE (arg0) == MINUS_EXPR + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1)) any good reason for using TYPE_OVERFLOW_UNDEFINED on the type of arg1 instead of on the type of the MINUS (yes, they should match, but it really looks odd ... the overflow of the minus has to be undefined). For the sake of consistency with the X +- C1 CMP C2 case just above, but I can change both. Also for EQ_EXPR and NE_EXPR the transform is fine even when !TYPE_OVERFLOW_UNDEFINED (and we seem to perform it already somewhere). Please look where and try to add the undefined overflow case to it. Yes, but it's the same for the X +- C1 CMP C2 case, i.e. there are specific cases for X +- C1 EQ/NE C2 and X - Y EQ/NE 0 in fold_binary, so I'm not sure what you're asking. As for the VRP pieces I don't understand what is missing here (well, compare_range_with_value and/or compare_values might not handle this case? then better fix that to improve symbolic range handling in general?). Also I have a longstanding patch in my tree that does Yes, there is an explicit non-handling of symbolic ranges for PLUS_EXPR and MINUS_EXPR in VRP (extract_range_from_binary_expr_1) and the patch works around it by propagating the code instead of the ranges, which is far easier and sufficient here.
If you think that the way to go is to handle symbolic ranges for PLUS_EXPR and MINUS_EXPR instead, fine with me, I can try. -- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Tue, May 27, 2014 at 11:25 AM, Eric Botcazou ebotca...@adacore.com wrote: + tree new_const + = fold_build2_loc (loc, reverse_op, TREE_TYPE (arg1), const2, const1); /* If the constant operation overflowed this can be simplified as a comparison against INT_MAX/INT_MIN. */ - if (TREE_CODE (lhs) == INTEGER_CST - && TREE_OVERFLOW (lhs)) + if (TREE_OVERFLOW (new_const)) well, either use int_const_binop above or retain the check (or use TREE_OVERFLOW_P). Bonus points for using wide-ints here and not relying on TREE_OVERFLOW. The check is useless (you get either NULL_TREE or INTEGER_CST here) but I'll use int_const_binop. + /* Transform comparisons of the form X - Y CMP 0 to X CMP Y. */ + if (TREE_CODE (arg0) == MINUS_EXPR + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1)) any good reason for using TYPE_OVERFLOW_UNDEFINED on the type of arg1 instead of on the type of the MINUS (yes, they should match, but it really looks odd ... the overflow of the minus has to be undefined). For the sake of consistency with the X +- C1 CMP C2 case just above, but I can change both. Also for EQ_EXPR and NE_EXPR the transform is fine even when !TYPE_OVERFLOW_UNDEFINED (and we seem to perform it already somewhere). Please look where and try to add the undefined overflow case to it. Yes, but it's the same for the X +- C1 CMP C2 case, i.e. there are specific cases for X +- C1 EQ/NE C2 and X - Y EQ/NE 0 in fold_binary, so I'm not sure what you're asking. I'm asking to merge them (move them to fold_comparison). As for the VRP pieces I don't understand what is missing here (well, compare_range_with_value and/or compare_values might not handle this case? then better fix that to improve symbolic range handling in general?). Also I have a longstanding patch in my tree that does Yes, there is an explicit non-handling of symbolic ranges for PLUS_EXPR and MINUS_EXPR in VRP (extract_range_from_binary_expr_1) and the patch works around it by propagating the code instead of the ranges, which is far easier and sufficient here. If you think that the way to go is to handle symbolic ranges for PLUS_EXPR and MINUS_EXPR instead, fine with me, I can try. Yeah, it would be nice to see some support. The most interesting cases will be symbolic-singleton +- CST where the offset shrinks a constant offset in a symbolic A +- CST (thus we don't get into any overflow issues). Thus handling [a + 1, a + 1] - [1, 1] -> [a, a] for example. We get the offsetted singleton symbolic ranges from conditional asserts a lot. Thanks, Richard. -- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
I'm asking to merge them (move them to fold_comparison). OK, I'll repost a first patch doing only the fold-const.c massaging. Yeah, it would be nice to see some support. The most interesting cases will be symbolic-singleton +- CST where the offset shrinks a constant offset in a symbolic A +- CST (thus we don't get into any overflow issues). Thus handling [a + 1, a + 1] - [1, 1] -> [a, a] for example. We get the offsetted singleton symbolic ranges from conditional asserts a lot. For the case at hand, you have [x + 1, INF] for y and you want to evaluate the range of y - x, so you really don't want to use the range of x... -- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Tue, May 27, 2014 at 11:59 AM, Eric Botcazou ebotca...@adacore.com wrote: I'm asking to merge them (move them to fold_comparison). OK, I'll repost a first patch doing only the fold-const.c massaging. Yeah, it would be nice to see some support. The most interesting cases will be symbolic-singleton +- CST where the offset shrinks a constant offset in a symbolic A +- CST (thus we don't get into any overflow issues). Thus handling [a + 1, a + 1] - [1, 1] -> [a, a] for example. We get the offsetted singleton symbolic ranges from conditional asserts a lot. For the case at hand, you have [x + 1, INF] for y and you want to evaluate the range of y - x, so you really don't want to use the range of x... Ok. That's similar to the issue I try to address with the VRP snippet I posted yesterday. I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0's value-range refers to op1 or vice versa, and eventually only use op1 symbolically then. Richard. -- Eric Botcazou
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
2014-03-01 1:40 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: For your use case, I'd imagine the offload compiler would be built relatively normally as a full build with --enable-as-accelerator-for=x86_64-linux, which would install it into locations where the host will eventually be able to find it. Then the host compiler would be built with another new configure option (as yet unimplemented in my patch set) --enable-offload-targets=mic,... which would tell the host compiler about the pre-built offload target compilers. Hi Bernd, Could you please advise me how to build the offload gcc and target libraries during the build of host gcc? I see a possible way: to duplicate all modules in Makefile.def needed for regular build of gcc and libs. But it looks ugly. Is there a better solution? Thanks, -- Ilya
[PATCH] Try harder for conditional evaluation in VRP
Especially for ops with symbolic ranges it may be preferable to compare one range with an op such as in [x + 1, x + 1] > x instead of expanding the range of x on the rhs to sth unrelated. So, try harder. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard.

2014-05-27 Richard Biener rguent...@suse.de

	* tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed.

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 210931)
+++ gcc/tree-vrp.c (working copy)
@@ -6919,14 +6919,15 @@ vrp_evaluate_conditional_warnv_with_ops_
   vr0 = (TREE_CODE (op0) == SSA_NAME) ? get_value_range (op0) : NULL;
   vr1 = (TREE_CODE (op1) == SSA_NAME) ? get_value_range (op1) : NULL;

+  tree res = NULL_TREE;
   if (vr0 && vr1)
-    return compare_ranges (code, vr0, vr1, strict_overflow_p);
-  else if (vr0 && vr1 == NULL)
-    return compare_range_with_value (code, vr0, op1, strict_overflow_p);
-  else if (vr0 == NULL && vr1)
-    return (compare_range_with_value
+    res = compare_ranges (code, vr0, vr1, strict_overflow_p);
+  if (!res && vr0)
+    res = compare_range_with_value (code, vr0, op1, strict_overflow_p);
+  if (!res && vr1)
+    res = (compare_range_with_value
 	    (swap_tree_comparison (code), vr1, op0, strict_overflow_p));
-  return NULL;
+  return res;
 }

 /* Helper function for vrp_evaluate_conditional_warnv. */
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On Tue, 27 May 2014, Richard Biener wrote: On May 27, 2014 3:58:06 AM CEST, David Edelsohn dje@gmail.com wrote: This patch (further) broke bootstrap on AIX. AIX defaults to 32 bit, although it supports 64 bit HWI. /nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'int print_statistics(bitmap_descriptor_d**, output_info*)': /nasfarm/edelsohn/src/src/gcc/bitmap.c:2168:24: error: expected ')' before 'PRId64' /nasfarm/edelsohn/src/src/gcc/bitmap.c: In function 'void dump_bitmap_statistics()': /nasfarm/edelsohn/src/src/gcc/bitmap.c:2202:15: error: expected ')' before 'PRId64' /nasfarm/edelsohn/src/src/gcc/bt-load.c: In function 'void migrate_btr_defs(reg_class, int)': /nasfarm/edelsohn/src/src/gcc/bt-load.c:1414:34: error: expected ')' before 'PRId64' /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_edge_info(FILE*, edge, int, int)': /nasfarm/edelsohn/src/src/gcc/cfg.c:489:25: error: expected ')' before 'PRId64' /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)': /nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64' This means aix has inttypes.h but not the standard mandated PRI?64 macros. Does it have stdint.h and therein int64_t? A possibility is to define the GCC fallbacks when the defines are not defined as opposed to only when inttypes.h is not available. Can you investigate some more the situation on aix?

Proposed patch below.

Richard.

Index: gcc/hwint.h
===================================================================
--- gcc/hwint.h (revision 210932)
+++ gcc/hwint.h (working copy)
@@ -135,21 +135,30 @@ typedef HOST_WIDE_INT __gcc_host_wide_in
 #define HOST_WIDE_INT_PRINT_HEX "%#" HOST_WIDE_INT_PRINT "x"
 #define HOST_WIDE_INT_PRINT_HEX_PURE "%" HOST_WIDE_INT_PRINT "x"

-/* Provide C99 inttypes.h style format definitions for 64bits.  */
-#ifndef HAVE_INTTYPES_H
-#undef PRId64
+/* Provide C99 inttypes.h style format definitions for 64bits.  As
+   64bit types are only conditionally supported also provide fallback
+   definitions for the 64bit variants even if inttypes.h is available.  */
+#ifndef PRId64
 #define PRId64 HOST_WIDE_INT_PRINT "d"
-#undef PRIi64
+#endif
+#ifndef PRIi64
 #define PRIi64 HOST_WIDE_INT_PRINT "i"
-#undef PRIo64
+#endif
+#ifndef PRIo64
 #define PRIo64 HOST_WIDE_INT_PRINT "o"
-#undef PRIu64
+#endif
+#ifndef PRIu64
 #define PRIu64 HOST_WIDE_INT_PRINT "u"
-#undef PRIx64
+#endif
+#ifndef PRIx64
 #define PRIx64 HOST_WIDE_INT_PRINT "x"
-#undef PRIX64
+#endif
+#ifndef PRIX64
 #define PRIX64 HOST_WIDE_INT_PRINT "X"
 #endif
+#ifndef HAVE_INTTYPES_H
+/* 32bit variants go here once we start using them.  */
+#endif

 /* Define HOST_WIDEST_FAST_INT to the widest integer type supported efficiently in hardware. (That is, the widest integer type that fits
Re: [PATCH][4/4] Always 64bit-HWI cleanups
On Tue, 27 May 2014, Jakub Jelinek wrote: On Tue, May 27, 2014 at 08:26:35AM +0200, Richard Biener wrote: /nasfarm/edelsohn/src/src/gcc/cfg.c: In function 'void dump_bb_info(FILE*, basic_block, int, int, bool, bool)': /nasfarm/edelsohn/src/src/gcc/cfg.c:737:33: error: expected ')' before 'PRId64' This means aix has inttypes.h but not the standard mandated PRI?64 macros. Does it have stdint.h and therein int64_t? A possibility is to define the GCC fallbacks when the defines are not defined as opposed to only when inttypes.h is not available. BTW, we should be prepared for C++11 udlit and use spaces around the PRI* macros, in case some compiler treats those as udlits unconditionally. Though that's not likely as it would break existing and valid code. Richard.
[PATCH] Delete temporary string within demangler even in failure cases.
Spotted that a call to demangle_template might allocate storage within a temporary string even if the call to demangle_template eventually returns failure. This will never cause the demangler to crash, but does leak memory; as a result I've not added any tests for this. Calling string_delete is safe even if nothing is allocated into the string: the string is initialised with string_init, so we know the internal pointers are NULL. I don't have gcc write privileges, so if this is approved could someone please commit for me. Thanks, Andrew

libiberty/ChangeLog

	* cplus-dem.c (do_type): Call string_delete even if the call to demangle_template fails.

---
 libiberty/cplus-dem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
index e948487..1c713aa 100644
--- a/libiberty/cplus-dem.c
+++ b/libiberty/cplus-dem.c
@@ -3656,7 +3656,10 @@ do_type (struct work_stuff *work, const char **mangled, string *result)
 	      string_delete (&temp);
 	    }
 	  else
-	    break;
+	    {
+	      string_delete (&temp);
+	      break;
+	    }
 	}
       else if (**mangled == 'Q')
 	{
-- 1.8.1.3
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
On 05/27/2014 12:17 PM, Ilya Verbin wrote: 2014-03-01 1:40 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: For your use case, I'd imagine the offload compiler would be built relatively normally as a full build with --enable-as-accelerator-for=x86_64-linux, which would install it into locations where the host will eventually be able to find it. Then the host compiler would be built with another new configure option (as yet unimplemented in my patch set) --enable-offload-targets=mic,... which would tell the host compiler about the pre-built offload target compilers. Hi Bernd, Could you please advise me how to build the offload gcc and target libraries during the build of host gcc? There isn't a way to do this. For ptx, target libraries can't be built anyway. For your use case, I'd recommend building the offload gcc first, installing it, and then building the host gcc with --enable-offload-targets as described above. Bernd
Re: [patch,doc,avr] PR61044: Note that label differences are not supported
2014-05-26 20:47 GMT+04:00 Georg-Johann Lay a...@gjlay.de: This adds a note to the user manual that code with label differences is not supported. This is because adding an offset to a stub address as generated with gs() will in general not yield the address of the label+offset. This actually occurs only if gs() has something to do, e.g. on devices with more than 128KiB of flash; but I don't see any benefit from supporting the feature on small devices -- left out all the problems caused by relaxing and when gas resolves the label difference. Thus, add a note that it's not available on any AVR and mark the PR as invalid. Ok for trunk? Johann PR target/61044 * doc/extend.texi (Local Labels): Note that label differences are not supported for AVR. Index: doc/extend.texi Approved. Please, apply.
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
2014-05-27 14:59 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: There isn't a way to do this. For ptx, target libraries can't be built anyway. For your use case, I'd recommend building the offload gcc first, installing it, and then building the host gcc with --enable-offload-targets as described above. Do I understand correctly that you recommend to build our offload gcc manually, rather than during configure/make? -- Ilya
[patch] libstdc++/61329 Add missing 'inline' in regex debug functions
Tested x86_64-linux, committed to trunk, will commit to 4.9 later today.

commit 9ea4330e4e3a7fe431206048f2e1efe5e9c28eaa
Author: Jonathan Wakely jwak...@redhat.com
Date: Tue May 27 11:48:50 2014 +0100

	PR libstdc++/61329
	* include/bits/regex_automaton.tcc (_State_base::_M_print): Add inline specifier.
	(_State_base::_M_dot): Likewise.

diff --git a/libstdc++-v3/include/bits/regex_automaton.tcc b/libstdc++-v3/include/bits/regex_automaton.tcc
index e0ac3f9..1c6cfdd 100644
--- a/libstdc++-v3/include/bits/regex_automaton.tcc
+++ b/libstdc++-v3/include/bits/regex_automaton.tcc
@@ -35,7 +35,7 @@ namespace __detail
 _GLIBCXX_BEGIN_NAMESPACE_VERSION

 #ifdef _GLIBCXX_DEBUG
-  std::ostream&
+  inline std::ostream&
   _State_base::_M_print(std::ostream& __ostr) const
   {
     switch (_M_opcode)
@@ -67,7 +67,7 @@
   }

   // Prints graphviz dot commands for state.
-  std::ostream&
+  inline std::ostream&
   _State_base::_M_dot(std::ostream& __ostr, _StateIdT __id) const
   {
     switch (_M_opcode)
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
On 05/27/2014 01:11 PM, Ilya Verbin wrote: 2014-05-27 14:59 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: There isn't a way to do this. For ptx, target libraries can't be built anyway. For your use case, I'd recommend building the offload gcc first, installing it, and then building the host gcc with --enable-offload-targets as described above. Do I understand correctly that you recommend to build our offload gcc manually, rather than during configure/make? Well, configure/make it separately - build and install it first. If there was a way to generalize the Makefiles to allow target libraries to be built for multiple targets, that would be pretty good too, but it might be too much work for too little gain. Bernd
[patch,libgcc]: More fixes for PR61152
This adds runtime library exceptions for some more files that go into libgcc.c via tm.h, mostly due to v850. Ok for trunk? Johann gcc/ PR libgcc/61152 * config/dbx.h (License): Add Runtime Library Exception. * config/newlib-stdint.h (License): Same. * config/rtems.h (License): Same * config/initfini-array.h (License): Same * config/v850/v850.h (License): Same. * config/v850/v850-opts.h (License): Same * config/v850/rtems.h (License): Same. Index: config/dbx.h === --- config/dbx.h (revision 210958) +++ config/dbx.h (working copy) @@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -You should have received a copy of the GNU General Public License -along with GCC; see the file COPYING3. If not see +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see http://www.gnu.org/licenses/. */ /* This file causes gcc to prefer using DBX (stabs) debugging Index: config/newlib-stdint.h === --- config/newlib-stdint.h (revision 210958) +++ config/newlib-stdint.h (working copy) @@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -You should have received a copy of the GNU General Public License -along with GCC; see the file COPYING3. If not see +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. 
+ +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see http://www.gnu.org/licenses/. */ /* newlib uses 32-bit long in certain cases for all non-SPU Index: config/initfini-array.h === --- config/initfini-array.h (revision 210958) +++ config/initfini-array.h (working copy) @@ -14,8 +14,13 @@ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - You should have received a copy of the GNU General Public License - along with GCC; see the file COPYING3. If not see + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see http://www.gnu.org/licenses/. */ #ifdef HAVE_INITFINI_ARRAY_SUPPORT Index: config/rtems.h === --- config/rtems.h (revision 210958) +++ config/rtems.h (working copy) @@ -13,8 +13,13 @@ but WITHOUT ANY WARRANTY; without even t MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. -You should have received a copy of the GNU General Public License -along with GCC; see the file COPYING3. If not see +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see http://www.gnu.org/licenses/. */ /* The system headers under RTEMS are C++-aware. 
*/ Index: config/v850/v850-opts.h === --- config/v850/v850-opts.h (revision 210958) +++ config/v850/v850-opts.h (working copy) @@ -13,8 +13,13 @@ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - You should have received a copy of the GNU General Public License - along with GCC; see the file COPYING3. If not see + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public
[PATCH] Fix float-cast test (PR testsuite/61319)
E.g. CentOS 5 doesn't define several macros in limits.h, so define them conditionally to make the test pass. Ok for trunk?

2014-05-27 Marek Polacek pola...@redhat.com

	PR testsuite/61319
	* c-c++-common/ubsan/float-cast.h: Conditionally define LLONG_MAX, LLONG_MIN, and ULLONG_MAX.

diff --git gcc/testsuite/c-c++-common/ubsan/float-cast.h gcc/testsuite/c-c++-common/ubsan/float-cast.h
index 166da8f..e76171a 100644
--- gcc/testsuite/c-c++-common/ubsan/float-cast.h
+++ gcc/testsuite/c-c++-common/ubsan/float-cast.h
@@ -1,5 +1,16 @@
 /* Various macros for -fsanitize=float-cast-overflow testing.  */

+/* E.g. on CentOS 5 these aren't defined in limits.h.  */
+#ifndef LLONG_MAX
+# define LLONG_MAX __LONG_LONG_MAX__
+#endif
+#ifndef LLONG_MIN
+# define LLONG_MIN (-__LONG_LONG_MAX__ - 1LL)
+#endif
+#ifndef ULLONG_MAX
+# define ULLONG_MAX (__LONG_LONG_MAX__ * 2ULL + 1ULL)
+#endif
+
 #define INT128_MAX (__int128) (((unsigned __int128) 1 << ((__SIZEOF_INT128__ * __CHAR_BIT__) - 1)) - 1)
 #define INT128_MIN (-INT128_MAX - 1)
 #define UINT128_MAX ((2 * (unsigned __int128) INT128_MAX) + 1)

	Marek
Re: [PATCH, fortran]: Include stdlib.h in intrinsics/getcwd.c
On Tue, May 27, 2014 at 09:44:22AM +0200, Uros Bizjak wrote: ... to avoid implicit declaration of function 'free' warning. 2014-05-27 Uros Bizjak ubiz...@gmail.com * intrinsics/getcwd.c: Include stdlib.h. Tested on x86_64-pc-linux-gnu. OK for mainline? Yes. It can also be committed to the 4.9 branch if you have the time. -- Steve
Re: [PATCH] Fix float-cast test (PR testsuite/61319)
On Tue, May 27, 2014 at 01:32:03PM +0200, Marek Polacek wrote: E.g. CentOS 5 doesn't define several macros in limits.h, so define them conditionally to make the test pass. Ok for trunk? 2014-05-27 Marek Polacek pola...@redhat.com PR testsuite/61319 * c-c++-common/ubsan/float-cast.h: Conditionally define LLONG_MAX, LLONG_MIN, and ULLONG_MAX. Ok, thanks. Jakub
[PATCH 1/3] Fix demangler testsuite crashes with CP_DEMANGLE_DEBUG defined
I tried running the demangler's testsuite with CP_DEMANGLE_DEBUG defined, but it crashed, with:

Program received signal SIGSEGV, Segmentation fault.
0x0040a8c3 in d_dump (dc=0x1, indent=12) at ../../src/libiberty/cp-demangle.c:567
567	  switch (dc->type)
(gdb) bt 3
#0 0x0040a8c3 in d_dump (dc=0x1, indent=12) at ../../src/libiberty/cp-demangle.c:567
#1 0x0040ae47 in d_dump (dc=0x7fffd098, indent=10) at ../../src/libiberty/cp-demangle.c:787
#2 0x0040ae47 in d_dump (dc=0x7fffd0c8, indent=8) at ../../src/libiberty/cp-demangle.c:787

Note dc=0x1, which is obviously a bogus pointer. This is the end of d_dump recursing for a component type that doesn't actually have subtrees:

787	  d_dump (d_left (dc), indent + 2);
788	  d_dump (d_right (dc), indent + 2);

This fixes the two cases the testsuite trips on.

libiberty/ 2014-05-26 Pedro Alves pal...@redhat.com

	* cp-demangle.c (d_dump): Handle DEMANGLE_COMPONENT_FUNCTION_PARAM and DEMANGLE_COMPONENT_NUMBER.

---
 libiberty/cp-demangle.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 68d8ee1..c0d2ffe 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -575,6 +575,9 @@ d_dump (struct demangle_component *dc, int indent)
     case DEMANGLE_COMPONENT_TEMPLATE_PARAM:
       printf ("template parameter %ld\n", dc->u.s_number.number);
       return;
+    case DEMANGLE_COMPONENT_FUNCTION_PARAM:
+      printf ("function parameter %ld\n", dc->u.s_number.number);
+      return;
     case DEMANGLE_COMPONENT_CTOR:
       printf ("constructor %d\n", (int) dc->u.s_ctor.kind);
       d_dump (dc->u.s_ctor.name, indent + 2);
@@ -760,6 +763,9 @@ d_dump (struct demangle_component *dc, int indent)
     case DEMANGLE_COMPONENT_CHARACTER:
       printf ("character '%c'\n", dc->u.s_character.character);
       return;
+    case DEMANGLE_COMPONENT_NUMBER:
+      printf ("number %ld\n", dc->u.s_number.number);
+      return;
     case DEMANGLE_COMPONENT_DECLTYPE:
       printf ("decltype\n");
       break;
-- 1.9.0
[PATCH 3/3] mangler/demangler dogfooding
This patch teaches g++ to try to demangle any symbol it generates/mangles, when checking is enabled in the build, as soon as the symbol is mangled. The idea here is to validate the demangling as close to the generator as possible. This allows catching demangler bugs/crashes much earlier in the cycle, when building libstdc++ and friends, when running g++'s testsuite, and even potentially earlier than that, when developers work on new C++11/14-and-beyond features, which influence mangling, validating against some random test that's not in the testsuite yet (and sometimes doesn't make it there either), rather than much later in production when the user is trying to debug the code, or the program tries to generate a backtrace. Both libstdc++ and the testsuite include a good set of tricky symbols to demangle, and the latter naturally includes all sorts of random, weird code full of corner cases. And, I assume g++ maintainers once in a while run WIP g++ through some piles of very complicated C++ code. It seems clear to me that ideally we should be able to do assert (demangle (mangle (symbol))); But, unfortunately, we can't yet. I built g++ and ran the testsuite with a variant of this patch that would print the mangled symbol if the demangling fails, but would otherwise continue without aborting, and I found out that: - we can't demangle ~40 symbols generated while building libstdc++ and friends. E.g.: _ZN9__gnu_cxx13new_allocatorINSt13__future_base13_State_baseV2EE9constructIS2_IEEEvPT_DpOT0_ _ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE1EEC1ERKS3_ - we can't demangle around ~2600 (!) symbols generated while running the g++ testsuite. E.g., _Z1f1SIXadL3FooEEE _ZZ4mainENKUlRT_E_clI1SEEDaS0_ Of course, these all may well be mangler bugs rather than demangler bugs. It's possible that these are already known mangler bugs even, that have been fixed, but require a higher -fabi-version=X. I didn't investigate that. Bootstrapped and regtested on x86_64 Fedora 20.

gcc/ 2014-05-27 Pedro Alves pal...@redhat.com

	* cp/mangle.c: Include demangle.h.
	(finish_mangling_internal) [ENABLE_CHECKING]: Demangle the just mangled symbol.

---
 gcc/cp/mangle.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 4205fec..c4eb5dc 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see
 #include "target.h"
 #include "cgraph.h"
 #include "wide-int.h"
+#include "demangle.h"

 /* Debugging support.  */
@@ -3367,6 +3368,35 @@ finish_mangling_internal (const bool warn)
   /* Null-terminate the string.  */
   write_char ('\0');
+
+#if ENABLE_CHECKING
+  /* Make sure we can demangle what we just generated.  */
+  {
+    const char *str ATTRIBUTE_UNUSED;
+    const char *mangled_str;
+    int dmgl_opts;
+
+    dmgl_opts = (DMGL_VERBOSE
+		 | DMGL_ANSI
+		 | DMGL_GNU_V3
+		 | DMGL_RET_POSTFIX
+		 | DMGL_PARAMS);
+
+    mangled_str = (const char *) obstack_base (&mangle_obstack);
+    str = cplus_demangle_v3 (mangled_str, dmgl_opts);
+#if 0
+    /* XXX The above catches potential demangler crashes.  And,
+       ideally, we'd also abort if demangling fails.  However, we
+       can't do that because the demangler isn't able to demangle all
+       symbols we generate by default.  Some failures might be
+       demangler bugs, others unknown mangler bugs, and others known
+       mangler bugs fixed with a higher -fabi-version, which the
+       demangler doesn't have a workaround for.  */
+    if ((str != NULL) != (mangled_str[0] == '_' && mangled_str[1] == 'Z'))
+      internal_error ("demangling failed for: %s", mangled_str);
+#endif
+  }
+#endif
 }
-- 1.9.0
[PATCH 0/3] mangler/demangler dogfooding
There's been discussion on the GDB lists about demangler crashes bringing down GDB. Such demangler bugs should be fixed, of course. Patch 2 in this series fixes the one that is currently known. But while GDB demangles all symbols in every file it loads, always, in libstdc++ the demangler is only called in error situations, and then only to demangle the few symbols that appear in the backtrace. This means that mangler/demangler bugs can easily escape undetected until much later, when the user is trying to debug the program. So we had a discussion on the GDB lists about whether there is something that could be done to demangler testing to make it more effective in catching bugs that GDB would catch. Ideas in progress so far:

- Just throw more symbols at it once in a while. We'd like to try demangling the world once in a while. That is, demangle all symbols in a distro. Florian Weimer did this with the ELF symbols in Fedora 20 and rawhide, and didn't get any crashes, which is reassuring. We haven't yet tried extracting and demangling all DWARF symbols, but we'd like to get there too.

- Throw pseudo-random symbols at the demangler. That is, make it more robust wrt invalid input. Gary wrote a fuzz tester, and it already caught a few crashes.

- Teach g++ itself to try to demangle any symbol it generates/mangles, when checking is enabled in the build. The idea here is to validate the mangling/demangling as close to the generator as possible. Patch 3 in this series implements this. This is my favorite approach, as it catches demangler crashes much earlier in the cycle, when building libstdc++ and friends and when running g++'s testsuite, rather than much later in production when the user is trying to debug the code, or the program tries to generate a backtrace. Both libstdc++ and the testsuite include a good set of tricky symbols to demangle, and the latter naturally includes all sorts of random, weird code full of corner cases.
And, I assume g++ maintainers once in a while run WIP g++ through piles of very complicated C++ code.

Patch 2 fixes the demangler crashes I got when running the testsuite with patch 3 installed. And then patch 1 fixes crashes in the demangler debug code that I got while I was writing patch 2.

Pedro Alves (3):
  Fix demangler testsuite crashes with CP_DEMANGLE_DEBUG defined
  PR other/61321 - demangler crash on casts in template parameters
  mangler/demangler dogfooding

 gcc/cp/mangle.c                       | 30
 include/demangle.h                    |  4
 libiberty/cp-demangle.c               | 43 ---
 libiberty/cp-demint.c                 |  1 +
 libiberty/testsuite/demangle-expected | 23 +++
 5 files changed, 93 insertions(+), 8 deletions(-)
-- 
1.9.0
[PATCH 2/3] PR other/61321 - demangler crash on casts in template parameters
The fix for bug 59195 [C++ demangler handles conversion operator incorrectly], https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59195, unfortunately makes the demangler crash due to infinite recursion, in case of casts in template parameters. For example, with:

  template <int> struct A {};
  template <typename Y> void function_temp (A<sizeof ((Y) (999))>) {}
  template void function_temp<int> (A<sizeof (int)>);

The 'function_temp<int>' instantiation above mangles to:

  _Z13function_tempIiEv1AIXszcvT_Li999EEE

The demangler parses this as:

  typed name
    template
      name 'function_temp'
      template argument list
        builtin type int
    function type
      builtin type void
      argument list
        template (*)
          name 'A'
          template argument list
            unary operator
              operator sizeof
              unary operator
                cast
                  template parameter 0 (**)
                literal
                  builtin type int
                  name '999'

And after the fix for 59195, due to:

  static void
  d_print_cast (struct d_print_info *dpi, int options,
                const struct demangle_component *dc)
  {
  ...
    /* For a cast operator, we need the template parameters from
       the enclosing template in scope for processing the type.  */
    if (dpi->current_template != NULL)
      {
        dpt.next = dpi->templates;
        dpi->templates = &dpt;
        dpt.template_decl = dpi->current_template;
      }

when printing the template argument list of A (what should be sizeof (int)), the template parameter 0 (that is, T_, the '**' above) now refers to the first parameter of the template argument list of the 'A' template (the '*' above), exactly what we were already trying to print. This leads to infinite recursion, and stack exhaustion. The template parameter 0 should actually refer to the first parameter of the 'function_temp' template.

Where the comment in d_print_cast (above) says "cast operator", it's really talking about a conversion operator, like:

  struct A
  {
    template <typename U> explicit operator U ();
  };

We don't want to inject the template parameters from the enclosing template in scope when processing a cast expression, only when handling a conversion operator.
The problem is that DEMANGLE_COMPONENT_CAST is currently ambiguous, and means _both_ 'conversion operator' and 'cast expression'. Fix this by adding a new DEMANGLE_COMPONENT_CONVERSION component type, which does what DEMANGLE_COMPONENT_CAST does today, and making DEMANGLE_COMPONENT_CAST simply print its component subtree.

I think we could instead reuse DEMANGLE_COMPONENT_CAST and in d_print_comp_inner still do:

@@ -5001,9 +5013,9 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
       d_print_comp (dpi, options, dc->u.s_extended_operator.name);
       return;
     case DEMANGLE_COMPONENT_CAST:
       d_append_string (dpi, "operator ");
-      d_print_cast (dpi, options, dc);
+      d_print_conversion (dpi, options, dc);
       return;

leaving the unary cast case below calling d_print_cast, but it seems to me that splitting the component types makes it easier to reason about the code.

g++'s testsuite actually generates three symbols that crash the demangler in the same way. I've added those as tests in the demangler testsuite as well. And then this fixes PR other/61233 too, which happens to be a demangler crash originally reported to GDB, at: https://sourceware.org/bugzilla/show_bug.cgi?id=16957

Bootstrapped and regtested on x86_64 Fedora 20. Also ran this through GDB's testsuite. GDB will require a small update to use DEMANGLE_COMPONENT_CONVERSION in one place it's using DEMANGLE_COMPONENT_CAST in its sources.

libiberty/
2014-05-27  Pedro Alves  pal...@redhat.com

	PR other/61321
	PR other/61233
	* demangle.h (enum demangle_component_type)
	<DEMANGLE_COMPONENT_CONVERSION>: New value.
	* cp-demangle.c (d_demangle_callback, d_make_comp): Handle
	DEMANGLE_COMPONENT_CONVERSION.
	(is_ctor_dtor_or_conversion): Handle DEMANGLE_COMPONENT_CONVERSION
	instead of DEMANGLE_COMPONENT_CAST.
	(d_operator_name): Return a DEMANGLE_COMPONENT_CONVERSION
	component if handling a conversion.
	(d_count_templates_scopes, d_print_comp_inner): Handle
	DEMANGLE_COMPONENT_CONVERSION.
	(d_print_comp_inner): Handle DEMANGLE_COMPONENT_CONVERSION
	instead of DEMANGLE_COMPONENT_CAST.
	(d_print_cast): Rename as ...
	(d_print_conversion): ... this.  Adjust comments.
	(d_print_cast): Rewrite - simply print the left subcomponent.
	* cp-demint.c (cplus_demangle_fill_component): Handle
	DEMANGLE_COMPONENT_CONVERSION.
	* testsuite/demangle-expected: Add tests.
---
 include/demangle.h      |  4
 libiberty/cp-demangle.c | 37 +++
 libiberty/cp-demint.c   |  1 +
RE: [PATCH] Fix an AARCH64/ARM performance regression
Hi,

On Tue, 27 May 2014 11:06:21, Richard Biener wrote:

  But the comment previously read

    /* A constant address in TO_RTX can have VOIDmode, we must not try
       to call force_reg for that case.  Avoid that case.  */

  and you are removing that check.  So I guess you want to retain

    GET_MODE (XEXP (to_rtx, 0)) != VOIDmode

well Richard, this code is definitely out-dated, with no test cases and no good explanation at all. Obviously force_reg has not been used in that code path for years. Actually I'd really like to clean up this code...

Furthermore nobody knows how to satisfy the condition MEM_P (to_rtx) && GET_MODE (XEXP (to_rtx, 0)) == VOIDmode. A MEM_REF referring to a SYMBOL_REF always seems to have a REG in XEXP (to_rtx, 0), which always is of type Pmode. (Which may be specific to the ARM target.) And even if the MEM_REF would directly have a pointer to a SYMBOL_REF as argument 0, it would also have pointer mode. I looked for gen_rtx_SYMBOL_REF, and they all seem to use Pmode. My feeling is that it is impossible to write an example where this condition is false, and it could even be something like a

  gcc_assert (GET_MODE (XEXP (to_rtx, 0)) != VOIDmode);

At least this condition seems not to trigger on normal SYMBOL_REFs. If you really don't like to touch that part, please let me know. I could leave it as it is, however that would not be my preference.

  or investigate why we don't need to avoid this case anymore.  In
  fact, that's probably the only change necessary, just drop the
  == BLKmode check?  Your lengthy description doesn't reveal if you
  investigated that.  It seems to me it tries to avoid adjusting
  MEM (symbol-ref) for example?  Maybe also for optimization reasons?

The example code can be modified to test the conditions.
For instance, to understand why the condition MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1) is too strict, while MEM_ALIGN (to_rtx) >= GET_MODE_ALIGNMENT (mode1) generates better code, simply add an alignment attribute to the variable:

  A a[6] __attribute__((aligned(32)));

This did not reach the if block at all, and the generated code was using mov instructions even before my patch. But with this example we can see there is no reason to restrict this only to exactly aligned values. The proposed condition on this if statement has some similarity with the logic in store_field(). If the field is aligned in a certain way, the access will likely be a single move instruction, and it is superior code to have only one constant offset part folded together. This would naturally be different for bit-field accesses; that's why the rest of the conditions should not change.

  Adding a new comment before the block reflecting your analysis above
  would be nice as well.

Yes, I will add a comment, and post an updated version of the patch tomorrow.

Thanks
Bernd.
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Tue, May 27, 2014 at 3:33 AM, Richard Biener richard.guent...@gmail.com wrote:

  On Mon, May 26, 2014 at 9:01 PM, Patrick Palka patr...@parcs.ath.cx wrote:

    Hi,

    This patch teaches the C++ frontend to warn on NULL checks against
    inline functions and it teaches the middle-end to fold NULL checks
    against inline functions.  These two things are currently done for
    non-inline C++ functions, but inline functions are exceptional in
    that the C++ frontend marks them as weak, and NULL checks against
    weak symbols cannot be folded in general because the symbol may be
    mapped to NULL at link-time.  But in the case of an inline
    function, the fact that it's a weak symbol is an implementation
    detail.  And because it is not permitted to explicitly give an
    inline function the weak attribute (see handle_weak_attribute), in
    order to achieve $SUBJECT it suffices to assume that all inline
    functions are non-null, which is what this patch does.  Bootstrap
    and regtest against x86_64-unknown-linux-gnu in progress.  Is this
    patch OK assuming no regressions?

  What about always_inline function wrappers as used in FORTIFY_SOURCE?
  IIRC the actual referenced function may be declared weak and thus the
  address can compare to NULL?  Sth like

    extern void __foo (void) __attribute__((weak,asm("foo")));
    extern inline __attribute__((always_inline,gnu_inline))
    void foo (void) { __foo (); }

    int main()
    {
      if (foo == 0)
        return 0;
      abort ();
    }

  The issue is that the inline and alias may be hidden to the user and
  thus he'll get unwanted and confusing warnings.

  Richard.

So as it stands the foo == 0 check in the above example gets folded with or without my patch. The issue here seems to be related to the use of gnu_inline. The address of an inline function marked gnu_inline shouldn't be considered non-NULL because a standalone function does not actually get emitted. Of course, one could inspect aliases, but in general it is not correct to assume such functions are non-NULL. Does this sound right?
This issue has slight overlap with the issue that my patch tries to fix. I could try to handle it as well.
Re: [patch,libgcc]: More fixes for PR61152
Georg-Johann Lay a...@gjlay.de writes:

  gcc/
  PR libgcc/61152
  * config/dbx.h (License): Add Runtime Library Exception.
  * config/newlib-stdint.h (License): Same.
  * config/rtems.h (License): Same.
  * config/initfini-array.h (License): Same.
  * config/v850/v850.h (License): Same.
  * config/v850/v850-opts.h (License): Same.
  * config/v850/rtems.h (License): Same.

This is OK.

Thanks.

Ian
Re: Create a library for tools like collect2 and lto-wrapper (2/2)
On 05/22/2014 05:56 PM, Joseph S. Myers wrote:

  On Thu, 22 May 2014, Bernd Schmidt wrote:

    The implementations of some functions like fork_execute are changed
    to those from collect2 and the calls in lto-wrapper adapted
    accordingly.  There are some minor changes in these functions: for
    example I avoid calling fatal_error, instead using the
    fatal/fatal_perror functions so that the cleanup routine is called
    properly (lto-wrapper does not use atexit as collect2 does).

  Avoiding fatal_error seems like the wrong direction.  Using the
  common diagnostics code from diagnostic.c is preferred to using
  custom code.

Hmm, honestly I would have gone the other way, as diagnostic.c seems rather unnecessarily heavyweight for the purposes of these little utilities.

  Thus, as an initial step I'd suggest instead converting lto-wrapper
  to use the common functionality where possible (fatal_error with %m
  replacing fatal_perror), making it use atexit in the process.  Don't
  move code using legacy fatal / fatal_perror functions to a common
  library.

But whatever. Here is such a patch. Bootstrapped and tested on x86_64-linux. Ok?

Bernd

	* lto-wrapper.c (fatal, fatal_perror): Remove functions.  All
	callers changed to use fatal_error.
	(main): Ensure lto_wrapper_cleanup is run atexit.

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index f59d74e..f42b14e 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -111,43 +111,6 @@ fatal_signal (int signum)
   kill (getpid (), signum);
 }
 
-/* Just die.  CMSGID is the error message.  */
-
-static void __attribute__ ((format (printf, 1, 2)))
-fatal (const char * cmsgid, ...)
-{
-  va_list ap;
-
-  va_start (ap, cmsgid);
-  fprintf (stderr, "lto-wrapper: ");
-  vfprintf (stderr, _(cmsgid), ap);
-  fprintf (stderr, "\n");
-  va_end (ap);
-
-  lto_wrapper_cleanup ();
-  exit (FATAL_EXIT_CODE);
-}
-
-
-/* Die when sys call fails.  CMSGID is the error message.  */
-
-static void __attribute__ ((format (printf, 1, 2)))
-fatal_perror (const char *cmsgid, ...)
-{
-  int e = errno;
-  va_list ap;
-
-  va_start (ap, cmsgid);
-  fprintf (stderr, "lto-wrapper: ");
-  vfprintf (stderr, _(cmsgid), ap);
-  fprintf (stderr, ": %s\n", xstrerror (e));
-  va_end (ap);
-
-  lto_wrapper_cleanup ();
-  exit (FATAL_EXIT_CODE);
-}
-
-
 /* Execute a program, and wait for the reply.  ARGV are the arguments.
    The last one must be NULL.  */
 
@@ -174,7 +137,7 @@ collect_execute (char **argv)
 
   pex = pex_init (0, "lto-wrapper", NULL);
   if (pex == NULL)
-    fatal_perror ("pex_init failed");
+    fatal_error ("pex_init failed: %m");
 
   /* Do not use PEX_LAST here, we use our stdout for communicating with
      collect2 or the linker-plugin.  Any output from the sub-process
@@ -186,10 +149,10 @@ collect_execute (char **argv)
       if (err != 0)
 	{
 	  errno = err;
-	  fatal_perror (errmsg);
+	  fatal_error ("%s: %m", _(errmsg));
 	}
       else
-	fatal (errmsg);
+	fatal_error (errmsg);
     }
 
   return pex;
@@ -205,7 +168,7 @@ collect_wait (const char *prog, struct pex_obj *pex)
   int status;
 
   if (!pex_get_status (pex, 1, &status))
-    fatal_perror ("can't get program status");
+    fatal_error ("can't get program status: %m");
   pex_free (pex);
 
   if (status)
@@ -214,15 +177,15 @@ collect_wait (const char *prog, struct pex_obj *pex)
 	{
 	  int sig = WTERMSIG (status);
 	  if (WCOREDUMP (status))
-	    fatal ("%s terminated with signal %d [%s], core dumped",
+	    fatal_error ("%s terminated with signal %d [%s], core dumped",
 		   prog, sig, strsignal (sig));
 	  else
-	    fatal ("%s terminated with signal %d [%s]",
+	    fatal_error ("%s terminated with signal %d [%s]",
 		   prog, sig, strsignal (sig));
 	}
 
       if (WIFEXITED (status))
-	fatal ("%s returned %d exit status", prog, WEXITSTATUS (status));
+	fatal_error ("%s returned %d exit status", prog, WEXITSTATUS (status));
     }
 
   return 0;
@@ -238,7 +201,7 @@ maybe_unlink_file (const char *file)
     {
       if (unlink_if_ordinary (file)
	  && errno != ENOENT)
-	fatal_perror ("deleting LTRANS file %s", file);
+	fatal_error ("deleting LTRANS file %s: %m", file);
     }
   else if (verbose)
     fprintf (stderr, "[Leaving LTRANS %s]\n", file);
@@ -260,12 +223,12 @@ fork_execute (char **argv)
   at_args = concat ("@", args_name, NULL);
   args = fopen (args_name, "w");
   if (args == NULL)
-    fatal ("failed to open %s", args_name);
+    fatal_error ("failed to open %s", args_name);
 
   status = writeargv (&argv[1], args);
 
   if (status)
-    fatal ("could not write to temporary file %s", args_name);
+    fatal_error ("could not write to temporary file %s", args_name);
 
   fclose (args);
 
@@ -312,7 +275,7 @@ get_options_from_collect_gcc_options (const char *collect_gcc,
       do
 	{
 	  if (argv_storage[j] == '\0')
-	    fatal ("malformed COLLECT_GCC_OPTIONS");
+	    fatal_error ("malformed COLLECT_GCC_OPTIONS");
 	  else if (strncmp (&argv_storage[j], "'\\''", 4) == 0)
 	    {
 	      argv_storage[k++] = '\'';
@@ -452,8 +415,8 @@ merge_and_complain (struct
Re: RFA: cache enabled attribute by insn code
Hi, Commits 210964 and 210965 for this patch have broken GCC build on arm* targets. For instance on target arm-none-linux-gnueabi, when creating unwind-arm.o, I can see: /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:1362 0x8e3bcd process_insn_for_elimination /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1344 0x8e3bcd lra_eliminate(bool, bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1408 0x8dd753 lra_constraints(bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-constraints.c:4049 0x8cdc1a lra(_IO_FILE*) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:2332 0x8911f8 do_reload /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5444 0x891508 execute /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5605 Please submit a full bug report, On 23 May 2014 20:19, Jeff Law l...@redhat.com wrote: On 05/20/14 15:36, Richard Sandiford wrote: This is OK for the trunk (referring to the follow-up message which fixed EWRONGPATCH. Sorry, while working on the follow-up LRA patch, I realised I hadn't accounted for target changes that happen directly via target_reinit (rather than SWITCHABLE_TARGETS) and cases where reinit_regs is called to change just the register information. Both could potentially affect the enabled attribute. This version adds a recog_init function that clears the data if necessary. There are no other changes from first time. Is this still OK? Thanks for letting me know, that's a minor twiddle -- the patch is still OK. Jeff
Mark more constants readonly
I noticed that string constants built by the Fortran frontend don't set TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems to set it for COMPLEX_CST). That was confusing the ptx backend I'm working on. The -fwritable-strings option for C was removed a while ago, so I tested the following patch to just set the flag unconditionally, and it passed bootstrap and test on x86_64-linux. Ok?

Bernd

	* tree.c (build_string, build_complex): Set TREE_READONLY.

diff --git a/gcc/tree.c b/gcc/tree.c
index 22b92f3..067f15a 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1755,6 +1755,7 @@ build_string (int len, const char *str)
   memset (s, 0, sizeof (struct tree_typed));
   TREE_SET_CODE (s, STRING_CST);
   TREE_CONSTANT (s) = 1;
+  TREE_READONLY (s) = 1;
   TREE_STRING_LENGTH (s) = len;
   memcpy (s->string.str, str, len);
   s->string.str[len] = '\0';
@@ -1776,6 +1777,7 @@ build_complex (tree type, tree real, tree imag)
   TREE_IMAGPART (t) = imag;
   TREE_TYPE (t) = type ? type : build_complex_type (TREE_TYPE (real));
   TREE_OVERFLOW (t) = TREE_OVERFLOW (real) | TREE_OVERFLOW (imag);
+  TREE_READONLY (t) = 1;
   return t;
 }
Fix a function decl in gfortran
Compiling Fortran code with the ptx backend I'm working on results in assembler warnings about a mismatch between function calls and function decls. This seems to be a bug in the Fortran frontend, where a decl for _gfortran_caf_init is created with four arguments, while the library function is declared as

  _gfortran_caf_init (int *argc, char ***argv)

and calls to it are constructed (in create_main_function) with only two arguments. Bootstrapped and tested on x86_64-linux. Ok?

Bernd

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 4a5ba1d..db46858 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -3260,7 +3260,7 @@ gfc_build_builtin_function_decls (void)
       gfor_fndecl_caf_init = gfc_build_library_function_decl (
 	get_identifier (PREFIX(caf_init)), void_type_node,
-	4, pint_type, pppchar_type, pint_type, pint_type);
+	2, pint_type, pppchar_type);
 
       gfor_fndecl_caf_finalize = gfc_build_library_function_decl (
 	get_identifier (PREFIX(caf_finalize)), void_type_node, 0);
Don't copy constants in build_constant_desc
When putting a constant into the constant pool, we make a copy of the tree in build_constant_desc. This caused me some problems with the ptx backend, which has a pass to modify the address spaces of all variables and constants to match the requirements of the backend. It would be nice for it to be able to find them all by walking the varpool and the gimple of all functions, and I ran into some testcases where these copies can slip through. After a while it occurred to me that the copy is probably a relic of old obstack times and not necessary anymore. And indeed, in 2.7.2.3 the only call to it looked like this:

  push_obstacks_nochange ();
  suspend_momentary ();
  p->exp = copy_constant (exp);
  pop_obstacks ();

It seems unlikely that anything in the compiler would want to modify a constant after it has been put into the constant pool, and even if so, it should have the responsibility of making the copy itself. So, as an experiment I've bootstrapped and tested the following patch (x86_64-linux), and that seemed to work. Ok to apply along with removal of the now unused copy_constant?

Bernd

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 6781096..3469384 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -3205,7 +3205,7 @@ build_constant_desc (tree exp)
   tree decl;
 
   desc = ggc_alloc<constant_descriptor_tree> ();
-  desc->value = copy_constant (exp);
+  desc->value = exp;
 
   /* Create a string containing the label name, in LABEL.  */
   labelno = const_labelno++;
[C++ Patch] PR 57543
Hi,

here, in order to accept the code without an explicit this-> qualification, I propose to simply notice that the type of instance is current_class_type. Tested x86_64-linux. Thanks!

Paolo.

/cp
2014-05-27  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/57543
	* call.c (build_new_method_call_1): In an unevaluated context
	allow calling a member function of the same class.

/testsuite
2014-05-27  Paolo Carlini  paolo.carl...@oracle.com

	PR c++/57543
	* g++.dg/cpp0x/decltype59.C: New.

Index: cp/call.c
===
--- cp/call.c	(revision 210956)
+++ cp/call.c	(working copy)
@@ -8000,6 +8000,20 @@ build_new_method_call_1 (tree instance, tree fns,
 	     we know we really need it.  */
 	  cand->first_arg = instance;
 	}
+      else if (cp_unevaluated_operand != 0
+	       && current_class_type
+	       && same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (instance)),
+			       current_class_type))
+	{
+	  /* For example (c++/57543):
+
+	     template <typename> struct X
+	     {
+	       void foo();
+	       auto bar() -> decltype( X::foo() );
+	     };  */
+	  gcc_assert (cand->first_arg == instance);
+	}
       else
 	{
 	  if (complain & tf_error)
Index: testsuite/g++.dg/cpp0x/decltype59.C
===
--- testsuite/g++.dg/cpp0x/decltype59.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/decltype59.C	(working copy)
@@ -0,0 +1,13 @@
+// PR c++/57543
+// { dg-do compile { target c++11 } }
+
+template <typename T> struct X
+{
+  void foo();
+  auto bar() -> decltype( X::foo() );
+};
+
+int main()
+{
+  X<int>().bar();
+}
[PATCH] Fix HWI AIX breakage
This fixes the AIX breakage by defining __STDC_FORMAT_MACROS earlier, before stdio.h on AIX gets to include inttypes.h. Committed as obvious.

Richard.

2014-05-27  Richard Biener  rguent...@suse.de

	* system.h (__STDC_FORMAT_MACROS): Define as very first thing.

Index: gcc/system.h
===
--- gcc/system.h	(revision 210972)
+++ gcc/system.h	(working copy)
@@ -22,6 +22,12 @@ along with GCC; see the file COPYING3.
 #ifndef GCC_SYSTEM_H
 #define GCC_SYSTEM_H
 
+/* Define this so that inttypes.h defines the PRI?64 macros even
+   when compiling with a C++ compiler.  Define it here so in the
+   event inttypes.h gets pulled in by another header it is already
+   defined.  */
+#define __STDC_FORMAT_MACROS
+
 /* We must include stdarg.h before stdio.h.  */
 #include <stdarg.h>
 
@@ -491,7 +497,6 @@ extern void *realloc (void *, size_t);
 #endif
 
 #ifdef HAVE_INTTYPES_H
-#define __STDC_FORMAT_MACROS
 #include <inttypes.h>
 #endif
Re: RFA: cache enabled attribute by insn code
On Tue, May 27, 2014 at 2:12 PM, Christophe Lyon christophe.l...@linaro.org wrote: Hi, Commits 210964 and 210965 for this patch have broken GCC build on arm* targets. For instance on target arm-none-linux-gnueabi, when creating unwind-arm.o, I can see: /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:1362 0x8e3bcd process_insn_for_elimination /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1344 0x8e3bcd lra_eliminate(bool, bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1408 0x8dd753 lra_constraints(bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-constraints.c:4049 0x8cdc1a lra(_IO_FILE*) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:2332 0x8911f8 do_reload /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5444 0x891508 execute /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5605 Please submit a full bug report, Can you file a proper bug report with a pre-processed file - information about configury options etc. to show what the issue is. Ramana On 23 May 2014 20:19, Jeff Law l...@redhat.com wrote: On 05/20/14 15:36, Richard Sandiford wrote: This is OK for the trunk (referring to the follow-up message which fixed EWRONGPATCH. Sorry, while working on the follow-up LRA patch, I realised I hadn't accounted for target changes that happen directly via target_reinit (rather than SWITCHABLE_TARGETS) and cases where reinit_regs is called to change just the register information. Both could potentially affect the enabled attribute. This version adds a recog_init function that clears the data if necessary. There are no other changes from first time. Is this still OK? Thanks for letting me know, that's a minor twiddle -- the patch is still OK. Jeff
Re: [PATCH] Try harder for conditional evaluation in VRP
On Tue, May 27, 2014 at 12:27 PM, Richard Biener wrote: * tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed. No test case? Ciao! Steven
Re: [PATCH] Delete temporary string within demangler even in failure cases.
On Tue, May 27, 2014 at 3:57 AM, Andrew Burgess aburg...@broadcom.com wrote: libiberty/ChangeLog * cplus-dem.c (do_type): Call string_delete even if the call to demangle_template fails. This is OK. Thanks. I have to ask: you know this code is not used, right? You're looking at the old demangler, for symbols generated by versions of g++ before GCC 3.4 (released 2004). The demangler for current versions of g++ is in cp-demangle.c. Ian
Re: [PATCH] Try harder for conditional evaluation in VRP
On Tue, 27 May 2014, Steven Bosscher wrote:

  On Tue, May 27, 2014 at 12:27 PM, Richard Biener wrote:

    * tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges):
    Try using literal operands when comparing value-ranges failed.

  No test case?

Sorry ;)  Happens to patches I uncover in my dev tree. I'll try to come up with sth (I remember coding it when working on some PR ... but I've lost track of which one).

Richard.
Re: [PATCH 1/3] Fix demangler testsuite crashes with CP_DEMANGLE_DEBUG defined
On Tue, May 27, 2014 at 4:57 AM, Pedro Alves pal...@redhat.com wrote: libiberty/ 2014-05-26 Pedro Alves pal...@redhat.com * cp-demangle.c (d_dump): Handle DEMANGLE_COMPONENT_FUNCTION_PARAM and DEMANGLE_COMPONENT_NUMBER. This is OK. Thanks. Ian
[patch] VxWorks configuration adjustments for powerpc 8548 (e500v2)
Hello,

The VxWorks compilation of C sources using VxWorks header files needs the CPU macro to be defined. Configuration files in the compiler are set up to pick a default value depending on the cpu variant for which we are compiling. On powerpc, this is achieved with:

  #define CPP_SPEC		   \
  "%{!DCPU=*:			   \
     %{mcpu=403 : -DCPU=PPC403	;  \
       mcpu=405 : -DCPU=PPC405	;  \
       ...

in rs6000/vxworks.h. The current definition misses an entry for the e500v2 family of powerpcs, supported by VxWorks and by GCC with -mcpu=8548. The attached patch is a proposal to add the missing entry, accounting for a minor complication: the CPU value to set for e500v2 configurations differs between the regular and the AE versions of VxWorks. This is addressed by the use of an internal VXCPU_FOR_8548 configuration macro, with a default definition in rs6000/vxworks.h and an override in a vxworksae specific configuration file (rs6000/vxworksae.h).

vxworksae.h used to exist for this port. It was removed by rev 171779 for PR target/47109, as the file happened to be there only for TARGET_VERSION. This patch re-instates the file, which we'll need for other purposes in patches to be submitted later.

Tested by checking that

- the compiler builds after the patch, for both powerpc-wrs-vxworks and powerpc-wrs-vxworksae,
- the compiler produces the expected CPU defs for -mcpu=8548 in both cases,
- the behavior for other -mcpu selections is not altered.

OK for mainline?

Thanks much in advance for your feedback,

With Kind Regards,

Olivier

2014-05-27  Olivier Hainque  hain...@adacore.com

	* config/rs6000/vxworks.h (VXCPU_FOR_8548): New.  Default to
	PPC85XX.
	(CPP_SPEC): Add entry for -mcpu=8548.
	* config/rs6000/vxworksae.h: Reinstate.  Override VXCPU_FOR_8548.
	* config.gcc (powerpc-wrs-vxworksae, tm_file): Add back vxworksae.h.

vx8548.diff
Description: Binary data
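The shape of the change described above would presumably follow the existing entries in rs6000/vxworks.h; the fragment below is only a sketch of that mechanism (the exact spelling is in the attached vx8548.diff, which is not reproduced here):

```c
/* Sketch, not the attached patch: default CPU value for regular
   VxWorks, overridden in rs6000/vxworksae.h for the AE variant.  */
#ifndef VXCPU_FOR_8548
#define VXCPU_FOR_8548 "PPC85XX"
#endif

/* New arm in the CPP_SPEC cpu selection, alongside the mcpu=403/405
   entries shown above, expanding to -DCPU=<VXCPU_FOR_8548>.  */
```

Keeping the variant-specific value behind a macro means the AE header only has to redefine VXCPU_FOR_8548, not the whole CPP_SPEC.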
Re: Fix a function decl in gfortran
Bernd Schmidt wrote: Compiling Fortran code with the ptx backend I'm working on results in assembler warnings about mismatch between function calls and function decls. Bootstrapped and tested on x86_64-linux. Ok? OK. The change/bug is due to my fortran-caf - trunk patch at https://gcc.gnu.org/ml/fortran/2014-04/msg00091.html, which seemingly missed the function declaration when updating the calls and the library. Thanks! Tobias
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
- Original Message -

  On 05/23/14 02:58, Kai Tietz wrote:

    Hello, yes the underlying issue is the same as for PR/46219.
    Nevertheless the patch doesn't solve this mentioned PR, as for now
    I used a pretty conservative checking of allowed memories.  By
    extending the x86_sibcall_memory_p_1 function to also allow
    register-arguments for memory, this problem can be solved.

  BTW, do you want to add 46219 to your list?

Yes, it makes sense; I put it on my list. The underlying issue is related to this issue here. I have an additional patch for fixing 46219. Sadly it causes troubles with stack-based memories in some rare cases. Still working on finding a sample to reproduce the issue outside of bootstrap. The point is that in memories, on checking for sibling-tail-calls, we still see pseudo-register variables. So we can't be sure if it requires the frame/stack-pointer or not.

  At the least, I think we should add the test from 46219 to the suite,
  xfailed if you don't tackle it as a part of this work.

Sure, I added this testcase to the testsuite-patch. I'm still waiting for comments on the accumulator-part, as there I added the stdarg-check.

  jeff
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Tue, May 27, 2014 at 8:32 AM, Patrick Palka patr...@parcs.ath.cx wrote: On Tue, May 27, 2014 at 3:33 AM, Richard Biener richard.guent...@gmail.com wrote: On Mon, May 26, 2014 at 9:01 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi, This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions. These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time. But in the case of an inline function, the fact that it's a weak symbol is an implementation detail. And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to achieve $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does. Bootstrap and regtest against x86_64-unknown-linux-gnu in progress. Is this patch OK assuming no regressions? What about always_inline function wrappers as used in FORTIFY_SOURCE? IIRC the actual referenced function may be declared weak and thus the address can compare to NULL? Sth like extern void __foo (void) __attribute__((weak,asm("foo"))); extern inline __attribute__((always_inline,gnu_inline)) void foo (void) { __foo (); } int main() { if (foo == 0) return 0; abort (); } The issue is that the inline and alias may be hidden to the user and thus he'll get unwanted and confusing warnings. Richard. So as it stands the foo == 0 check in the above example gets folded with or without my patch. The issue here seems to be related to the use of gnu_inline. The address of an inline function marked gnu_inline shouldn't be considered non-NULL because a standalone function does not actually get emitted. 
Of course, one could inspect aliases but in general it is not correct to assume such functions are non-NULL. Does this sound right? This issue has slight overlap with the issue that my patch tries to fix. I could try to handle it as well. Oh, disregard the above. I think I see what you mean now, Richard. Because __foo is an alias for foo, and __foo is weak, is it safe to assume that foo is non-NULL? That is a tricky question...
Re: RFA: cache enabled attribute by insn code
Christophe Lyon christophe.l...@linaro.org writes: Hi, Commits 210964 and 210965 for this patch have broken GCC build on arm* targets. Could you send me the .i? For instance on target arm-none-linux-gnueabi, when creating unwind-arm.o, I can see: /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:1362 0x8e3bcd process_insn_for_elimination /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1344 0x8e3bcd lra_eliminate(bool, bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1408 0x8dd753 lra_constraints(bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-constraints.c:4049 0x8cdc1a lra(_IO_FILE*) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:2332 0x8911f8 do_reload /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5444 0x891508 execute /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5605 Please submit a full bug report, Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Does the following patch help? Thanks, Richard gcc/ * config/arm/iterators.md (shiftable_ops): New code iterator. (t2_binop0): New code attribute. * config/arm/arm.md (insn_enabled): Delete. (enabled): Remove insn_enabled test. (*arith_shiftsi): Split out... (*arith_multsi): ...this pattern and remove insn_enabled attribute. Index: gcc/config/arm/arm.md === --- gcc/config/arm/arm.md (revision 210972) +++ gcc/config/arm/arm.md (working copy) @@ -200,17 +200,9 @@ (const_string yes)] (const_string no))) -; Allows an insn to disable certain alternatives for reasons other than -; arch support. -(define_attr insn_enabled no,yes - (const_string yes)) - ; Enable all alternatives that are both arch_enabled and insn_enabled. 
(define_attr enabled no,yes - (cond [(eq_attr insn_enabled no) - (const_string no) - - (and (eq_attr predicable_short_it no) + (cond [(and (eq_attr predicable_short_it no) (and (eq_attr predicated yes) (match_test arm_restrict_it))) (const_string no) @@ -9876,39 +9868,38 @@ ;; Patterns to allow combination of arithmetic, cond code and shifts -(define_insn *arith_shiftsi - [(set (match_operand:SI 0 s_register_operand =r,r,r,r) -(match_operator:SI 1 shiftable_operator - [(match_operator:SI 3 shift_operator - [(match_operand:SI 4 s_register_operand r,r,r,r) - (match_operand:SI 5 shift_amount_operand M,M,M,r)]) - (match_operand:SI 2 s_register_operand rk,rk,r,rk)]))] +;; Separate out the (mult ...) case, since it doesn't allow register operands. +(define_insn *arith_multsi + [(set (match_operand:SI 0 s_register_operand =r,r) +(shiftable_ops:SI + (match_operator:SI 2 mult_operator + [(match_operand:SI 3 s_register_operand r,r) +(match_operand:SI 4 power_of_two_operand)]) + (match_operand:SI 1 s_register_operand rk,t2_binop0)))] TARGET_32BIT - %i1%?\\t%0, %2, %4%S3 + %i1%?\\t%0, %1, %3%S2 [(set_attr predicable yes) (set_attr predicable_short_it no) (set_attr shift 4) - (set_attr arch a,t2,t2,a) - ;; Thumb2 doesn't allow the stack pointer to be used for - ;; operand1 for all operations other than add and sub. In this case - ;; the minus operation is a candidate for an rsub and hence needs - ;; to be disabled. - ;; We have to make sure to disable the fourth alternative if - ;; the shift_operator is MULT, since otherwise the insn will - ;; also match a multiply_accumulate pattern and validate_change - ;; will allow a replacement of the constant with a register - ;; despite the checks done in shift_operator. 
- (set_attr_alternative insn_enabled -[(const_string yes) - (if_then_else - (match_operand:SI 1 add_operator ) - (const_string yes) (const_string no)) - (const_string yes) - (if_then_else - (match_operand:SI 3 mult_operator ) - (const_string no) (const_string yes))]) - (set_attr type alu_shift_imm,alu_shift_imm,alu_shift_imm,alu_shift_reg)]) + (set_attr arch a,t2) + (set_attr type alu_shift_imm,alu_shift_imm)]) +;; Handles cases other than the above. +(define_insn *arith_shiftsi + [(set (match_operand:SI 0 s_register_operand =r,r,r) +(shiftable_ops:SI + (match_operator:SI 2 shift_operator +[(match_operand:SI 3 s_register_operand r,r,r) + (match_operand:SI 4 shift_amount_operand M,M,r)]) +
Re: [PATCH] Redesign jump threading profile updates
On Thu, Apr 17, 2014 at 6:23 AM, Teresa Johnson tejohn...@google.com wrote: On Wed, Apr 16, 2014 at 10:39 PM, Jeff Law l...@redhat.com wrote: On 03/26/14 17:44, Teresa Johnson wrote: Recently I discovered that the profile updates being performed by jump threading were incorrect in many cases, particularly in the case where the threading path contains a joiner. Some of the duplicated blocks/edges were not getting any counts, leading to incorrect function splitting and other downstream optimizations, and there were other insanities as well. After making a few attempts to fix the handling I ended up completely redesigning the profile update code, removing a few places throughout the code where it was attempting to do some updates. The profile updates in that code is a mess. It's never really been looked at in any systematic way, what's there is ad-hoc and usually in response to someone mentioning the profile data was incorrectly updated. As we rely more and more on that data the ad-hoc updating is going to cause us more and more pain. So any work in this space is going to be greatly appreciated. I'll have to look at this in some detail. But I wanted you to know I was aware of the work and it's in my queue. Great, thanks for the update! I realize that it is not a trivial change so it would take some time to get through. Hopefully it should address the ongoing profile fixup issues. Teresa Ping. Thanks, Teresa Thanks! jeff -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: RFA: cache enabled attribute by insn code
Richard Sandiford rsand...@linux.vnet.ibm.com writes: Does the following patch help? Bah, it won't of course: %i1 needs to be the operator. Richard
RE: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb
On Tue, 13 Aug 2013, Kyrylo Tkachov wrote: On 08/09/13 11:01, Julian Brown wrote: On Thu, 8 Aug 2013 15:44:17 +0100 Kyrylo Tkachov kyrylo.tkac...@arm.com wrote: Hi all, The recently added gcc.target/arm/pr58041.c test exposed a bug in the backend. When compiling for NEON and with -mno-unaligned-access we end up generating the vld1.64 and vst1.64 instructions instead of doing the accesses one byte at a time like -mno-unaligned-access expects. This patch fixes that by enabling the NEON expander and insns that produce these instructions only when unaligned accesses are allowed. Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu. Ok for trunk and 4.8? I'm not sure if this is right, FWIW -- do the instructions in question trap if the CPU is set to disallow unaligned accesses? I thought that control bit only affected ARM core loads stores, not NEON ones. Thinking again - the ARM-ARM says - the alignment check is for element size, so an alternative might be to use vld1.8 instead to allow for this at which point we might as well do something else with the test. I note that these patterns are not allowed for BYTES_BIG_ENDIAN so that might be a better alternative than completely disabling it. Looking at the section on unaligned accesses, it seems that the ldrb/strb-class instructions are the only ones that are unaffected by the SCTLR.A bit and do not produce alignment faults in any case. The NEON load/store instructions, including vld1.8 can still cause an alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise core memory instructions for unaligned data. This change however has regressed gcc.dg/vect/vect-72.c on the arm-linux-gnueabi target, -march=armv5te, in particular in 4.8. Beforehand the code fragment in question produced was: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! 
bne .L14 Afterwards it is: .L14: vldr d16, [r3, #-16] vldr d17, [r3, #-8] add r3, r3, #16 cmp r3, r1 vst1.64 {d16-d17}, [r2:64]! bne .L14 and the second VLDR instruction traps with SIGILL (the value in R3 is 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why, and especially why only the second of the two (regrettably I've been unable to track down an instruction reference detailed enough to specify what exceptions VLDR can produce and under what conditions). Interestingly enough the trap does not happen when the program is single-stepped under GDB (via gdbserver), however it then aborts once this copy loop has completed, as `ia' contains rubbish and fails the test. Is there a fix that needs backporting to 4.8, or is this an issue that was unknown so far? Hardware and Linux used: $ cat /proc/cpuinfo Processor : ARMv7 Processor rev 2 (v7l) processor : 0 BogoMIPS: 2013.49 processor : 1 BogoMIPS: 1963.08 Features: swp half thumb fastmult vfp edsp thumbee neon vfpv3 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x1 CPU part: 0xc09 CPU revision: 2 Hardware: OMAP4430 Panda Board Revision: 0020 Serial : $ uname -a Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux $ Maciej
Re: [PATCH] Fix PR54733 Optimize endian independent load/store
On 27 May 2014 09:23, Thomas Preud'homme thomas.preudho...@arm.com wrote: Hi Christophe and Andreas, From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme I suspect it's the same kind of problem m68k ran into. I already wrote a patch to reduce the impact of different bitfield layouts and it's in review right now. I'll send it to you tomorrow for testing. Can you both try the attached patch to see if it solves the bug you experienced? Andreas, I wrote it based on the correction you did, so it should at least work for m68k. It still doesn't work on armeb. Christophe.
Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb
On Tue, May 27, 2014 at 3:31 PM, Maciej W. Rozycki ma...@codesourcery.com wrote: On Tue, 13 Aug 2013, Kyrylo Tkachov wrote: On 08/09/13 11:01, Julian Brown wrote: On Thu, 8 Aug 2013 15:44:17 +0100 Kyrylo Tkachov kyrylo.tkac...@arm.com wrote: Hi all, The recently added gcc.target/arm/pr58041.c test exposed a bug in the backend. When compiling for NEON and with -mno-unaligned-access we end up generating the vld1.64 and vst1.64 instructions instead of doing the accesses one byte at a time like -mno-unaligned-access expects. This patch fixes that by enabling the NEON expander and insns that produce these instructions only when unaligned accesses are allowed. Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu. Ok for trunk and 4.8? I'm not sure if this is right, FWIW -- do the instructions in question trap if the CPU is set to disallow unaligned accesses? I thought that control bit only affected ARM core loads stores, not NEON ones. Thinking again - the ARM-ARM says - the alignment check is for element size, so an alternative might be to use vld1.8 instead to allow for this at which point we might as well do something else with the test. I note that these patterns are not allowed for BYTES_BIG_ENDIAN so that might be a better alternative than completely disabling it. Looking at the section on unaligned accesses, it seems that the ldrb/strb-class instructions are the only ones that are unaffected by the SCTLR.A bit and do not produce alignment faults in any case. The NEON load/store instructions, including vld1.8 can still cause an alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise core memory instructions for unaligned data. This change however has regressed gcc.dg/vect/vect-72.c on the arm-linux-gnueabi target, -march=armv5te, in particular in 4.8. And what are all the configure flags you are using in case some one has to reproduce this issue ? 
Beforehand the code fragment in question produced was: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] vld1 allows a misaligned load. cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! bne .L14 Afterwards it is: .L14: vldr d16, [r3, #-16] vldr d17, [r3, #-8] add r3, r3, #16 cmp r3, r1 vst1.64 {d16-d17}, [r2:64]! bne .L14 and the second VLDR instruction traps with SIGILL (the value in R3 is 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why, and especially why only the second of the two (regrettably I've been unable to track down an instruction reference detailed enough to specify what exceptions VLDR can produce and under what conditions). vldr will cause an unaligned access fault if the address is misaligned. The question is why the address is misaligned in this case. Is there a fix that needs backporting to 4.8, or is this an issue that was unknown so far? I haven't seen an issue with this so far. Ramana Hardware and Linux used: $ cat /proc/cpuinfo Processor : ARMv7 Processor rev 2 (v7l) processor : 0 BogoMIPS: 2013.49 processor : 1 BogoMIPS: 1963.08 Features: swp half thumb fastmult vfp edsp thumbee neon vfpv3 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x1 CPU part: 0xc09 CPU revision: 2 Hardware: OMAP4430 Panda Board Revision: 0020 Serial : $ uname -a Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux $ Maciej
Re: [PATCH] Try harder for conditional evaluation in VRP
On Tue, 27 May 2014, Richard Biener wrote: On Tue, 27 May 2014, Steven Bosscher wrote: On Tue, May 27, 2014 at 12:27 PM, Richard Biener wrote: * tree-vrp.c (vrp_evaluate_conditional_warnv_with_ops_using_ranges): Try using literal operands when comparing value-ranges failed. No test case? Sorry ;) Happens to patches I uncover in my dev tree. I'll try to come up with sth (I remember coding it when working on some PR ... but I've lost track of which one). Here is one (yeah, a bit artificial ...). Committed. Richard. 2014-05-27 Richard Biener rguent...@suse.de * gcc.dg/tree-ssa/vrp92.c: New testcase. Index: gcc/testsuite/gcc.dg/tree-ssa/vrp92.c === --- gcc/testsuite/gcc.dg/tree-ssa/vrp92.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/vrp92.c (working copy) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp1-details" } */ + +void bar (void); +int foo (int i, int j) +{ + int res = 1; + if (i < j) +{ + /* We should be able to simplify the following conditional + during propagation. */ + if (i > j) + res = 0; +} + /* And compute res as having a value-range of [1,1]. */ + if (res) +return i; + return j; +} + +/* { dg-final { scan-tree-dump "res_.: \\[1, 1\\]" "vrp1" } } */ +/* { dg-final { scan-tree-dump-not "Threaded" "vrp1" } } */ +/* { dg-final { cleanup-tree-dump "vrp1" } } */
[PATCH] PR 41936 Memory leakage with allocatables and user-defined operators
This patch fixes a memory leakage with allocatables and user-defined operators. It is basically Mikael's patch, adjusted to take into account Tobias' comments. The patch fixes the memory leak in the test gfortran.dg/class_array_15.f03 and I have added a check for it. In the PR I asked whether this is sufficient or if I should also include the test in comment 0. Any opinion? Cheers, Dominique 2014-05-27 Dominique d'Humieres domi...@lps.ens.fr Mikael Morin mik...@gcc.gnu.org PR fortran/41936 * trans-expr.c (gfc_conv_expr_reference): Deallocate array components. 2014-05-27 Dominique d'Humieres domi...@lps.ens.fr Mikael Morin mik...@gcc.gnu.org PR fortran/41936 * gfortran.dg/class_array_15.f03: Check memory leaks. diff -up ../_clean/gcc/fortran/trans-expr.c gcc/fortran/trans-expr.c --- ../_clean/gcc/fortran/trans-expr.c 2014-05-22 10:05:53.0 +0200 +++ gcc/fortran/trans-expr.c 2014-05-25 00:50:21.0 +0200 @@ -6506,6 +6506,20 @@ gfc_conv_expr_reference (gfc_se * se, gf /* Take the address of that value. */ se->expr = gfc_build_addr_expr (NULL_TREE, var); + if (expr->ts.type == BT_DERIVED && expr->rank + && !gfc_is_finalizable (expr->ts.u.derived, NULL) + && expr->ts.u.derived->attr.alloc_comp + && expr->expr_type != EXPR_VARIABLE) +{ + tree tmp; + + tmp = build_fold_indirect_ref_loc (input_location, se->expr); + tmp = gfc_deallocate_alloc_comp (expr->ts.u.derived, tmp, expr->rank); + + /* The components shall be deallocated before + their containing entity. */ + gfc_prepend_expr_to_block (&se->post, tmp); +} } diff -up ../_clean/gcc/testsuite/gfortran.dg/class_array_15.f03 gcc/testsuite/gfortran.dg/class_array_15.f03 --- ../_clean/gcc/testsuite/gfortran.dg/class_array_15.f03 2013-01-06 22:34:50.0 +0100 +++ gcc/testsuite/gfortran.dg/class_array_15.f03 2014-05-04 10:24:06.0 +0200 @@ -1,4 +1,5 @@ ! { dg-do run } +! { dg-options "-fdump-tree-original" } ! ! Tests the fixes for three bugs with the same underlying cause. All are regressions ! 
that come about because class array elements end up with a different tree type @@ -114,3 +115,5 @@ subroutine pr54992 ! This test remains bh = bhGet(b,instance=2) if (loc (b) .ne. loc(bh%hostNode)) call abort end +! { dg-final { scan-tree-dump-times "builtin_free" 12 "original" } } +! { dg-final { cleanup-tree-dump "original" } }
Re: [PATCH] Warn on and fold NULL checks against inline functions
On Tue, May 27, 2014 at 4:06 PM, Patrick Palka patr...@parcs.ath.cx wrote: On Tue, May 27, 2014 at 8:32 AM, Patrick Palka patr...@parcs.ath.cx wrote: On Tue, May 27, 2014 at 3:33 AM, Richard Biener richard.guent...@gmail.com wrote: On Mon, May 26, 2014 at 9:01 PM, Patrick Palka patr...@parcs.ath.cx wrote: Hi, This patch teaches the C++ frontend to warn on NULL checks against inline functions and it teaches the middle-end to fold NULL checks against inline functions. These two things are currently done for non-inline C++ functions, but inline functions are exceptional in that the C++ frontend marks them as weak, and NULL checks against weak symbols cannot be folded in general because the symbol may be mapped to NULL at link-time. But in the case of an inline function, the fact that it's a weak symbol is an implementation detail. And because it is not permitted to explicitly give an inline function the weak attribute (see handle_weak_attribute), in order to achieve $SUBJECT it suffices to assume that all inline functions are non-null, which is what this patch does. Bootstrap and regtest against x86_64-unknown-linux-gnu in progress. Is this patch OK assuming no regressions? What about always_inline function wrappers as used in FORTIFY_SOURCE? IIRC the actual referenced function may be declared weak and thus the address can compare to NULL? Sth like extern void __foo (void) __attribute__((weak,asm("foo"))); extern inline __attribute__((always_inline,gnu_inline)) void foo (void) { __foo (); } int main() { if (foo == 0) return 0; abort (); } The issue is that the inline and alias may be hidden to the user and thus he'll get unwanted and confusing warnings. Richard. So as it stands the foo == 0 check in the above example gets folded with or without my patch. The issue here seems to be related to the use of gnu_inline. The address of an inline function marked gnu_inline shouldn't be considered non-NULL because a standalone function does not actually get emitted. 
Of course, one could inspect aliases, but in general it is not correct to assume such functions are non-NULL. Does this sound right? This issue has slight overlap with the issue that my patch tries to fix. I could try to handle it as well. Oh, disregard the above. I think I see what you mean now, Richard. Because __foo is an alias for foo, and __foo is weak, is it safe to assume that foo is non-NULL? That is a tricky question... No (I wasn't making a folding-validity remark). I want to make sure we won't see diagnostics on code like #include <string.h> int main() { if (malloc != 0) ... } just because it is built with -D_FORTIFY_SOURCE. Whether we may fold that check at all is another question, of course (and again that should _not_ differ between -D_FORTIFY_SOURCE or -U_FORTIFY_SOURCE, but it may well also be a glibc issue). And yes, aliases are tricky beasts ... Richard.
Re: Mark more constants readonly
On Tue, May 27, 2014 at 3:13 PM, Bernd Schmidt ber...@codesourcery.com wrote: I noticed that string constants built by the Fortran frontend don't set TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems to set it for COMPLEX_CST). That was confusing the ptx backend I'm working on. The -fwritable-strings option for C was removed a while ago, so I tested the following patch to just set the flag unconditionally, and passed bootstrap and test on x86_64-linux. Ok? Hmm? Not so obvious. TREE_READONLY has no purpose on tcc_constant nodes if I read documentation correctly (tree.h). In fact base.readonly_flag is unused for tcc_constant (and would be redundant with TREE_CONSTANT). Richard. Bernd
Re: RFA: cache enabled attribute by insn code
On 27 May 2014 15:37, Ramana Radhakrishnan ramana@googlemail.com wrote: On Tue, May 27, 2014 at 2:12 PM, Christophe Lyon christophe.l...@linaro.org wrote: Hi, Commits 210964 and 210965 for this patch have broken GCC build on arm* targets. For instance on target arm-none-linux-gnueabi, when creating unwind-arm.o, I can see: /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:1362 0x8e3bcd process_insn_for_elimination /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1344 0x8e3bcd lra_eliminate(bool, bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-eliminations.c:1408 0x8dd753 lra_constraints(bool) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra-constraints.c:4049 0x8cdc1a lra(_IO_FILE*) /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/lra.c:2332 0x8911f8 do_reload /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5444 0x891508 execute /tmp/5673443_3.tmpdir/aci-gcc-fsf/sources/gcc-fsf/trunk/gcc/ira.c:5605 Please submit a full bug report, Can you file a proper bug report with a pre-processed file - information about configury options etc. to show what the issue is. Filed PR 61331. Thanks.
Re: Don't copy constants in build_constant_desc
On Tue, May 27, 2014 at 3:26 PM, Bernd Schmidt ber...@codesourcery.com wrote: When putting a constant into the constant pool, we make a copy of the tree in build_constant_desc. This caused me some problems with the ptx backend, which has a pass to modify the address spaces of all variables and constants to match the requirements of the backend. It would be nice for it to be able to find them all by walking the varpool and the gimple of all functions, and I ran into some testcases where these copies can slip through. After a while it occurred to me that the copy is probably a relic of old obstack times and not necessary anymore. And indeed, in 2.7.2.3 the only call to it looked like this: push_obstacks_nochange (); suspend_momentary (); p-exp = copy_constant (exp); pop_obstacks (); It seems unlikely that anything in the compiler would want to modify a constant after it has been put into the constant pool, and even if so, it should have the responsibility of making the copy itself. So, as an experiment I've bootstrapped and tested the following patch (x86_64-linux), and that seemed to work. Ok to apply along with removal of the now unused copy_constant? Ok if nobody raises issues within 24h. Thanks, Richard. Bernd
Re: [PATCH, fortran]: Include stdlib.h in intrinsics/getcwd.c
On Tue, May 27, 2014 at 1:37 PM, Steve Kargl s...@troutmask.apl.washington.edu wrote: ... to avoid an implicit declaration of function 'free' warning. 2014-05-27 Uros Bizjak ubiz...@gmail.com * intrinsics/getcwd.c: Include stdlib.h. It can also be committed to the 4.9 branch if you have the time. There is no need for the stdlib.h include in the 4.9 branch; the call to free was introduced in 4.10. Uros.
Re: RFA: cache enabled attribute by insn code
Richard Sandiford rdsandif...@googlemail.com writes: Richard Sandiford rsand...@linux.vnet.ibm.com writes: Does the following patch help? Bah, it won't of course: %i1 needs to be the operator. Here's v2. I tested that it worked for simple tests like: int f1 (int x, int y) { return x + (y << 4); } int f2 (int x, int y) { return x - (y << 4); } int f3 (int x, int y) { return x & (y << 4); } int f4 (int x, int y) { return x | (y << 4); } int f5 (int x, int y) { return x ^ (y << 4); } int f6 (int x, int y) { return (y << 4) - x; } int g1 (int x, int y, int z) { return x + (y << z); } int g2 (int x, int y, int z) { return x - (y << z); } int g3 (int x, int y, int z) { return x & (y << z); } int g4 (int x, int y, int z) { return x | (y << z); } int g5 (int x, int y, int z) { return x ^ (y << z); } int g6 (int x, int y, int z) { return (y << z) - x; } as well as the testcase. Thanks, Richard gcc/ * config/arm/iterators.md (shiftable_ops): New code iterator. (t2_binop0, arith_shift_insn): New code attributes. * config/arm/arm.md (insn_enabled): Delete. (enabled): Remove insn_enabled test. (*arith_shiftsi): Split out... (*arith_multsi): ...this pattern and remove insn_enabled attribute. Index: gcc/config/arm/arm.md === --- gcc/config/arm/arm.md (revision 210972) +++ gcc/config/arm/arm.md (working copy) @@ -200,17 +200,9 @@ (const_string yes)] (const_string no))) -; Allows an insn to disable certain alternatives for reasons other than -; arch support. -(define_attr insn_enabled no,yes - (const_string yes)) - ; Enable all alternatives that are both arch_enabled and insn_enabled. 
(define_attr enabled no,yes - (cond [(eq_attr insn_enabled no) - (const_string no) - - (and (eq_attr predicable_short_it no) + (cond [(and (eq_attr predicable_short_it no) (and (eq_attr predicated yes) (match_test arm_restrict_it))) (const_string no) @@ -9876,39 +9868,38 @@ ;; Patterns to allow combination of arithmetic, cond code and shifts -(define_insn *arith_shiftsi - [(set (match_operand:SI 0 s_register_operand =r,r,r,r) -(match_operator:SI 1 shiftable_operator - [(match_operator:SI 3 shift_operator - [(match_operand:SI 4 s_register_operand r,r,r,r) - (match_operand:SI 5 shift_amount_operand M,M,M,r)]) - (match_operand:SI 2 s_register_operand rk,rk,r,rk)]))] +;; Separate out the (mult ...) case, since it doesn't allow register operands. +(define_insn *arith_multsi + [(set (match_operand:SI 0 s_register_operand =r,r) +(shiftable_ops:SI + (match_operator:SI 2 mult_operator + [(match_operand:SI 3 s_register_operand r,r) +(match_operand:SI 4 power_of_two_operand)]) + (match_operand:SI 1 s_register_operand rk,t2_binop0)))] TARGET_32BIT - %i1%?\\t%0, %2, %4%S3 + arith_shift_insn%?\\t%0, %1, %3%S2 [(set_attr predicable yes) (set_attr predicable_short_it no) (set_attr shift 4) - (set_attr arch a,t2,t2,a) - ;; Thumb2 doesn't allow the stack pointer to be used for - ;; operand1 for all operations other than add and sub. In this case - ;; the minus operation is a candidate for an rsub and hence needs - ;; to be disabled. - ;; We have to make sure to disable the fourth alternative if - ;; the shift_operator is MULT, since otherwise the insn will - ;; also match a multiply_accumulate pattern and validate_change - ;; will allow a replacement of the constant with a register - ;; despite the checks done in shift_operator. 
- (set_attr_alternative insn_enabled -[(const_string yes) - (if_then_else - (match_operand:SI 1 add_operator ) - (const_string yes) (const_string no)) - (const_string yes) - (if_then_else - (match_operand:SI 3 mult_operator ) - (const_string no) (const_string yes))]) - (set_attr type alu_shift_imm,alu_shift_imm,alu_shift_imm,alu_shift_reg)]) + (set_attr arch a,t2) + (set_attr type alu_shift_imm,alu_shift_imm)]) +;; Handles cases other than the above. +(define_insn *arith_shiftsi + [(set (match_operand:SI 0 s_register_operand =r,r,r) +(shiftable_ops:SI + (match_operator:SI 2 shift_operator +[(match_operand:SI 3 s_register_operand r,r,r) + (match_operand:SI 4 shift_amount_operand M,M,r)]) + (match_operand:SI 1 s_register_operand rk,t2_binop0,rk)))] + TARGET_32BIT GET_CODE (operands[3]) != MULT + arith_shift_insn%?\\t%0, %1, %3%S2 + [(set_attr predicable yes) + (set_attr predicable_short_it no) + (set_attr shift 4) + (set_attr arch a,t2,a) + (set_attr type alu_shift_imm,alu_shift_imm,alu_shift_reg)]) + (define_split [(set
Re: RFA: cache enabled attribute by insn code
On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. R.
Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb
On 27/05/14 15:31, Maciej W. Rozycki wrote: On Tue, 13 Aug 2013, Kyrylo Tkachov wrote: On 08/09/13 11:01, Julian Brown wrote: On Thu, 8 Aug 2013 15:44:17 +0100 Kyrylo Tkachov kyrylo.tkac...@arm.com wrote: Hi all, The recently added gcc.target/arm/pr58041.c test exposed a bug in the backend. When compiling for NEON and with -mno-unaligned-access we end up generating the vld1.64 and vst1.64 instructions instead of doing the accesses one byte at a time like -mno-unaligned-access expects. This patch fixes that by enabling the NEON expander and insns that produce these instructions only when unaligned accesses are allowed. Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu. Ok for trunk and 4.8? I'm not sure if this is right, FWIW -- do the instructions in question trap if the CPU is set to disallow unaligned accesses? I thought that control bit only affected ARM core loads stores, not NEON ones. Thinking again - the ARM-ARM says - the alignment check is for element size, so an alternative might be to use vld1.8 instead to allow for this at which point we might as well do something else with the test. I note that these patterns are not allowed for BYTES_BIG_ENDIAN so that might be a better alternative than completely disabling it. Looking at the section on unaligned accesses, it seems that the ldrb/strb-class instructions are the only ones that are unaffected by the SCTLR.A bit and do not produce alignment faults in any case. The NEON load/store instructions, including vld1.8 can still cause an alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise core memory instructions for unaligned data. This change however has regressed gcc.dg/vect/vect-72.c on the arm-linux-gnueabi target, -march=armv5te, in particular in 4.8. Beforehand the code fragment in question produced was: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! 
bne .L14 Afterwards it is: .L14: vldr d16, [r3, #-16] vldr d17, [r3, #-8] add r3, r3, #16 cmp r3, r1 vst1.64 {d16-d17}, [r2:64]! bne .L14 and the second VLDR instruction traps with SIGILL (the value in R3 is 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why and especially why only the second of the two (regrettably I've been unable to track down an instruction reference that'd be detailed enough to specify what exceptions VLDR can produce and under what conditions). SIGILL means undefined instruction. It does not mean misaligned address (unless the kernel is messed up) -- that would be SIGBUS. Interestingly enough the trap does not happen when the program is single-stepped under GDB (via gdbserver), however it then aborts once this copy loop has completed as `ia' contains rubbish and fails the test. Is there a fix that needs backporting to 4.8 or is this an issue that was unknown so far? Hardware and Linux used: $ cat /proc/cpuinfo Processor : ARMv7 Processor rev 2 (v7l) processor : 0 BogoMIPS : 2013.49 processor : 1 BogoMIPS : 1963.08 Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x1 CPU part : 0xc09 CPU revision : 2 Hardware : OMAP4430 Panda Board Revision : 0020 Serial: $ uname -a Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux $ Maciej
[PATCH AArch64] Remove from arm_neon.h functions not in the spec
arm_neon.h contains a bunch of functions (for example, the wonderful vcgez_u* intrinsics - that's an unsigned comparison of greater-than-or-equal-to zero) that are not present in the current ARM Neon Intrinsics spec: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/index.html This patch just deletes those intrinsics. OK for trunk? Cheers, Alan gcc/ChangeLog: 2014-05-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vfmaq_n_f64, vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64, vtst_p16, vtstq_p16, vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8, vcgez_u16, vcgez_u32, vcgez_u64, vcgezq_p8, vcgezq_u8, vcgezq_u16, vcgezq_u32, vcgezq_u64, vcgezd_u64, vcgt_p8, vcgtq_p8, vcgtz_p8, vcgtz_u8, vcgtz_u16, vcgtz_u32, vcgtz_u64, vcgtzq_p8, vcgtzq_u8, vcgtzq_u16, vcgtzq_u32, vcgtzq_u64, vcgtzd_u64, vcle_p8, vcleq_p8, vclez_p8, vclez_u64, vclezq_p8, vclezd_u64, vclt_p8, vcltq_p8, vcltz_p8, vcltzq_p8, vcltzd_u64): Remove functions as they are not in the spec. diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 747a292..6f20337 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -5781,17 +5781,6 @@ vfmaq_n_f32 (float32x4_t a, float32x4_t b, float32_t c) return result; } -__extension__ static __inline float64x2_t __attribute__ ((__always_inline__)) -vfmaq_n_f64 (float64x2_t a, float64x2_t b, float64_t c) -{ - float64x2_t result; - __asm__ ("fmla %0.2d, %2.2d, %3.d[0]" - : "=w"(result) - : "0"(a), "w"(b), "w"(c) - : /* No clobbers */); - return result; -} - __extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) vfms_f32 (float32x2_t a, float32x2_t b, float32x2_t c) { @@ -7243,18 +7232,6 @@ vmlaq_n_f32 (float32x4_t a, float32x4_t b, float32_t c) return result; } -__extension__ static __inline float64x2_t __attribute__ ((__always_inline__)) -vmlaq_n_f64 (float64x2_t a, float64x2_t b, float64_t c) -{ - float64x2_t result; - float64x2_t t1; - __asm__ ("fmul %1.2d, %3.2d, %4.d[0]; fadd %0.2d, %0.2d, %1.2d" : 
"=w"(result), "=w"(t1) - : "0"(a), "w"(b), "w"(c) - : /* No clobbers */); - return result; -} - __extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) vmlaq_n_s16 (int16x8_t a, int16x8_t b, int16_t c) { @@ -7943,18 +7920,6 @@ vmlsq_n_f32 (float32x4_t a, float32x4_t b, float32_t c) return result; } -__extension__ static __inline float64x2_t __attribute__ ((__always_inline__)) -vmlsq_n_f64 (float64x2_t a, float64x2_t b, float64_t c) -{ - float64x2_t result; - float64x2_t t1; - __asm__ ("fmul %1.2d, %3.2d, %4.d[0]; fsub %0.2d, %0.2d, %1.2d" - : "=w"(result), "=w"(t1) - : "0"(a), "w"(b), "x"(c) - : /* No clobbers */); - return result; -} - __extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) vmlsq_n_s16 (int16x8_t a, int16x8_t b, int16_t c) { @@ -11329,17 +11294,6 @@ vrsqrtss_f32 (float32_t a, float32_t b) return result; } -__extension__ static __inline float64x2_t __attribute__ ((__always_inline__)) -vrsrtsq_f64 (float64x2_t a, float64x2_t b) -{ - float64x2_t result; - __asm__ ("frsqrts %0.2d,%1.2d,%2.2d" - : "=w"(result) - : "w"(a), "w"(b) - : /* No clobbers */); - return result; -} - __extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) vrsubhn_high_s16 (int8x8_t a, int16x8_t b, int16x8_t c) { @@ -16082,13 +16036,6 @@ vcge_f64 (float64x1_t __a, float64x1_t __b) } __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) -vcge_p8 (poly8x8_t __a, poly8x8_t __b) -{ - return (uint8x8_t) __builtin_aarch64_cmgev8qi ((int8x8_t) __a, - (int8x8_t) __b); -} - -__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vcge_s8 (int8x8_t __a, int8x8_t __b) { return (uint8x8_t) __builtin_aarch64_cmgev8qi (__a, __b); @@ -16152,13 +16099,6 @@ vcgeq_f64 (float64x2_t __a, float64x2_t __b) } __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__)) -vcgeq_p8 (poly8x16_t __a, poly8x16_t __b) -{ - return (uint8x16_t) __builtin_aarch64_cmgev16qi ((int8x16_t) __a, - (int8x16_t) __b); -} - 
-__extension__ static __inline uint8x16_t __attribute__ ((__always_inline__)) vcgeq_s8 (int8x16_t __a, int8x16_t __b) { return (uint8x16_t) __builtin_aarch64_cmgev16qi (__a, __b); @@ -16252,14 +16192,6 @@ vcgez_f64 (float64x1_t __a) } __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) -vcgez_p8 (poly8x8_t __a) -{ - poly8x8_t __b = {0, 0, 0, 0, 0, 0, 0, 0}; - return (uint8x8_t) __builtin_aarch64_cmgev8qi ((int8x8_t) __a, - (int8x8_t) __b); -} - -__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vcgez_s8 (int8x8_t __a) { int8x8_t __b = {0, 0, 0, 0, 0, 0, 0, 0}; @@ -16286,36 +16218,6 @@ vcgez_s64 (int64x1_t __a) return __a >= 0ll ? -1ll : 0ll; }
Re: [PATCH, ARM] interrupt testcases for shink-wrapping
On 05/27/14 02:07, Zhenqiang Chen wrote: Hi, Following up Richard's comments: https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00967.html I investigated interrupt routines: * gcc can shrink-wrap an interrupt routine (ARM mode). shrink-wrap-interrupt_1.c in the patch shows it. * There are restrictions on when an interrupt routine can be shrink-wrapped; e.g. shrink-wrap-interrupt_2.c cannot be shrink-wrapped. * When an interrupt routine is shrink-wrapped, it uses *cond_simple_return to return from the function. There is no dwarf info issue, so no need to change the code. The patch only includes two test cases to track this. OK for trunk? Thanks! -Zhenqiang testsuite/ChangeLog: 2014-05-27 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.target/arm/shrink-wrap-interrupt_1.c: New test. * gcc.target/arm/shrink-wrap-interrupt_2.c: New test. OK. Jeff
Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb
On 27/05/14 15:47, Ramana Radhakrishnan wrote: On Tue, May 27, 2014 at 3:31 PM, Maciej W. Rozycki ma...@codesourcery.com wrote: On Tue, 13 Aug 2013, Kyrylo Tkachov wrote: On 08/09/13 11:01, Julian Brown wrote: On Thu, 8 Aug 2013 15:44:17 +0100 Kyrylo Tkachov kyrylo.tkac...@arm.com wrote: Hi all, The recently added gcc.target/arm/pr58041.c test exposed a bug in the backend. When compiling for NEON and with -mno-unaligned-access we end up generating the vld1.64 and vst1.64 instructions instead of doing the accesses one byte at a time like -mno-unaligned-access expects. This patch fixes that by enabling the NEON expander and insns that produce these instructions only when unaligned accesses are allowed. Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu. Ok for trunk and 4.8? I'm not sure if this is right, FWIW -- do the instructions in question trap if the CPU is set to disallow unaligned accesses? I thought that control bit only affected ARM core loads stores, not NEON ones. Thinking again - the ARM-ARM says - the alignment check is for element size, so an alternative might be to use vld1.8 instead to allow for this at which point we might as well do something else with the test. I note that these patterns are not allowed for BYTES_BIG_ENDIAN so that might be a better alternative than completely disabling it. Looking at the section on unaligned accesses, it seems that the ldrb/strb-class instructions are the only ones that are unaffected by the SCTLR.A bit and do not produce alignment faults in any case. The NEON load/store instructions, including vld1.8 can still cause an alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise core memory instructions for unaligned data. This change however has regressed gcc.dg/vect/vect-72.c on the arm-linux-gnueabi target, -march=armv5te, in particular in 4.8. And what are all the configure flags you are using in case some one has to reproduce this issue ? Second that. 
My recently built 4.8 (gcc version 4.8.2 20130531) for vect-72 with options: -O2 -ftree-vectorize -march=armv5te -mfpu=neon -mfloat-abi=hard -fno-vect-cost-model -fno-common gives code the same as your original one: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! bne .L14 ldr r3, .L22+12 add ip, r3, #128 add r2, r3, #129 Kyrill Beforehand the code fragment in question produced was: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] vld1 allows a misaligned load. cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! bne .L14 Afterwards it is: .L14: vldr d16, [r3, #-16] vldr d17, [r3, #-8] add r3, r3, #16 cmp r3, r1 vst1.64 {d16-d17}, [r2:64]! bne .L14 and the second VLDR instruction traps with SIGILL (the value in R3 is 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why and especially why only the second of the two (regrettably I've been unable to track down an instruction reference that'd be detailed enough to specify what exceptions VLDR can produce and under what conditions). vldr will cause an unaligned access fault if the address is misaligned. The question is why is the address misaligned in this case. Is there a fix that needs backporting to 4.8 or is this an issue that was unknown so far? I haven't seen an issue with this so far. Ramana Hardware and Linux used: $ cat /proc/cpuinfo Processor : ARMv7 Processor rev 2 (v7l) processor : 0 BogoMIPS: 2013.49 processor : 1 BogoMIPS: 1963.08 Features: swp half thumb fastmult vfp edsp thumbee neon vfpv3 CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x1 CPU part: 0xc09 CPU revision: 2 Hardware: OMAP4430 Panda Board Revision: 0020 Serial : $ uname -a Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux $ Maciej
Re: [patch] VxWorks configuration adjustments for powerpc 8548 (e500v2)
On 05/27/14 15:57, Olivier Hainque wrote: Hello, The VxWorks compilation of C sources using vxworks header files needs the CPU macro to be defined. Configuration files in the compiler are set up to pick a default value depending on the cpu variant for which we are compiling. OK for mainline ? ok -- Nathan Sidwell
Re: RFA: cache enabled attribute by insn code
On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. This is not LRA's fault, but the backend misusing the enabled attribute for something it wasn't designed for, and IMHO against the documentation of the attribute too. Just look at the original submission for why it was added. Jakub
Re: [PATCH AArch64] Remove from arm_neon.h functions not in the spec
No, hold that, vfmaq_n_f64 has been added back in the latest version (to which I linked). Hang on... --Alan Alan Lawrence wrote: arm_neon.h contains a bunch of functions (for example, the wonderful vcgez_u* intrinsics - that's an unsigned comparison of greater-than-or-equal-to zero) that are not present in the current ARM Neon Intrinsics spec: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/index.html This patch just deletes those intrinsics. OK for trunk? Cheers, Alan gcc/ChangeLog: 2014-05-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vfmaq_n_f64, vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64, vtst_p16, vtstq_p16, vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8, vcgez_u16, vcgez_u32, vcgez_u64, vcgezq_p8, vcgezq_u8, vcgezq_u16, vcgezq_u32, vcgezq_u64, vcgezd_u64, vcgt_p8, vcgtq_p8, vcgtz_p8, vcgtz_u8, vcgtz_u16, vcgtz_u32, vcgtz_u64, vcgtzq_p8, vcgtzq_u8, vcgtzq_u16, vcgtzq_u32, vcgtzq_u64, vcgtzd_u64, vcle_p8, vcleq_p8, vclez_p8, vclez_u64, vclezq_p8, vclezd_u64, vclt_p8, vcltq_p8, vcltz_p8, vcltzq_p8, vcltzd_u64): Remove functions as they are not in the spec.
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
2014-05-27 15:15 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: On 05/27/2014 01:11 PM, Ilya Verbin wrote: Do I understand correctly that you recommend to build our offload gcc manually, rather than during configure/make? Well, configure/make it separately - build and install it first. If there was a way to generalize the Makefiles to allow target libraries to be built for multiple targets, that would be pretty good too, but it might be too much work for too little gain. Ok. And what about the mkoffload tool? Should it be built during the build of the host or offload compiler? Do you have any patches for Makefile that builds it? Thanks, -- Ilya
Re: RFA: PATCH to allow DECL_COMDAT_GROUP to be a DECL
On 05/19/2014 06:10 PM, Jan Hubicka wrote: Instead of adding an extra loop over the whole symtab, what about converting them at the time the symbols are inserted into the symbol table, or in function_and_variable_visibility, which already makes quite a few changes to various flags? Converting them during the initial analyze pass seems to do the trick: commit 062e714358ad384f1a69051f3b7f3902b9714904 Author: Jason Merrill ja...@redhat.com Date: Mon May 19 15:41:24 2014 -0400 PR c++/47202 gcc/cp/ * decl.c (cxx_comdat_group): Return a decl. * optimize.c (cdtor_comdat_group): Get its DECL_ASSEMBLER_NAME. gcc/ * cgraph.h (symtab_node::get_comdat_group_id): New. * cgraphunit.c (analyze_functions): Call it. * symtab.c (dump_symtab_node): Likewise. * tree.c (decl_comdat_group_id): New. * tree.h: Declare it. * lto-streamer-out.c (write_symbol): Use it. * trans-mem.c (ipa_tm_create_version_alias): Likewise. diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 8556e2d..44c8211 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -147,6 +147,13 @@ public: return comdat_group_; } + tree get_comdat_group_id () +{ + if (comdat_group_ && TREE_CODE (comdat_group_) != IDENTIFIER_NODE) + comdat_group_ = DECL_ASSEMBLER_NAME (comdat_group_); + return comdat_group_; +} + /* Set comdat group. */ void set_comdat_group (tree group) { diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 3bd364b..e19b0a2 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -974,6 +974,8 @@ analyze_functions (void) node != first_analyzed && node != first_analyzed_var; node = node->next) { + /* Convert COMDAT group designators to IDENTIFIER_NODEs. */ + node->get_comdat_group_id (); if (decide_is_symbol_needed (node)) { enqueue_node (node); diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 9e97ff8..3502447 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -14484,18 +14484,17 @@ cp_missing_noreturn_ok_p (tree decl) return DECL_MAIN_P (decl); } -/* Return the COMDAT group into which DECL should be placed. 
*/ +/* Return the decl used to identify the COMDAT group into which DECL should + be placed. */ tree cxx_comdat_group (tree decl) { - tree name; - /* Virtual tables, construction virtual tables, and virtual table tables all go in a single COMDAT group, named after the primary virtual table. */ if (VAR_P (decl) && DECL_VTABLE_OR_VTT_P (decl)) -name = DECL_ASSEMBLER_NAME (CLASSTYPE_VTABLES (DECL_CONTEXT (decl))); +decl = CLASSTYPE_VTABLES (DECL_CONTEXT (decl)); /* For all other DECLs, the COMDAT group is the mangled name of the declaration itself. */ else @@ -14513,10 +14512,9 @@ cxx_comdat_group (tree decl) else break; } - name = DECL_ASSEMBLER_NAME (decl); } - return name; + return decl; } /* Returns the return type for FN as written by the user, which may include diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c index 4205fec..6b0c555 100644 --- a/gcc/cp/mangle.c +++ b/gcc/cp/mangle.c @@ -3517,10 +3517,15 @@ mangle_decl (const tree decl) DECL_IGNORED_P (alias) = 1; TREE_PUBLIC (alias) = TREE_PUBLIC (decl); DECL_VISIBILITY (alias) = DECL_VISIBILITY (decl); + DECL_EXTERNAL (alias) = DECL_EXTERNAL (decl); if (vague_linkage_p (decl)) DECL_WEAK (alias) = 1; if (TREE_CODE (decl) == FUNCTION_DECL) - cgraph_same_body_alias (cgraph_get_create_node (decl), alias, decl); + { + /* Don't create an alias to an unreferenced function. 
*/ + if (struct cgraph_node *n = cgraph_get_node (decl)) + cgraph_same_body_alias (n, alias, decl); + } else varpool_extra_name_alias (alias, decl); #endif diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c index a58565c..3f89b81 100644 --- a/gcc/cp/optimize.c +++ b/gcc/cp/optimize.c @@ -170,6 +170,8 @@ cdtor_comdat_group (tree complete, tree base) complete_name = cxx_comdat_group (complete); if (base_name == NULL) base_name = cxx_comdat_group (base); + complete_name = DECL_ASSEMBLER_NAME (complete_name); + base_name = DECL_ASSEMBLER_NAME (base_name); gcc_assert (IDENTIFIER_LENGTH (complete_name) == IDENTIFIER_LENGTH (base_name)); grp_name = XALLOCAVEC (char, IDENTIFIER_LENGTH (complete_name) + 1); diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index d1c8d9f..034d9d9 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -2329,7 +2329,7 @@ write_symbol (struct streamer_tree_cache_d *cache, size = 0; if (DECL_ONE_ONLY (t)) -comdat = IDENTIFIER_POINTER (DECL_COMDAT_GROUP (t)); +comdat = IDENTIFIER_POINTER (decl_comdat_group_id (t)); else comdat = ""; diff --git a/gcc/symtab.c b/gcc/symtab.c index 8abb7a1..79fdbc2 100644 --- a/gcc/symtab.c +++ b/gcc/symtab.c @@ -560,7 +560,7 @@ dump_symtab_base (FILE *f, symtab_node *node) fprintf (f, " comdat"); if (node->get_comdat_group ()) fprintf
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/26/14 14:32, Kai Tietz wrote: - Original Message - On Mon, May 26, 2014 at 03:22:50PM -0400, Kai Tietz wrote: In any case, I still can't understand how limiting the choices of the register allocator can improve code rather than making it worse. If the accumulator is available there, why doesn't the RA choose it if it is beneficial? And why aren't other registers similarly suitable for that? Say r10, r11... I don't see it as limiting. The intent of this is more to have fixed patterns in the epilogue. And in fact the accumulator is the register that can be used as a scratch register for all i386 targets. Besides variadic functions, which aren't good candidates for sibling-call optimization anyway (on x86_64, due to the ABI). Well, for the x86_64 ABI we could consider using R11_REG instead of AX_REG. Is there any advantage in special-casing the x86_64 ABI? The R10 register isn't a good choice because it might be used as the DRAP register and therefore can't be loaded before the epilogue gets destroyed. It is limiting. If r11/rax and often also r10 can be chosen, telling the RA it can only choose rax is a limitation. No, it isn't. For sure. The code branch choosing the accumulator to call after the epilogue isn't used, as memories for sibling calls aren't allowed. This code branch will become active with my other sibling-tail-call patch. But the value we want may be sitting around in a convenient register (such as r11). So if we force the sibcall to use %rax, then we have to generate a copy. Yet if we have a constraint for the set of registers allowed here, then we give the register allocator the opportunity to coalesce away the copies. Jeff
Re: RFA: cache enabled attribute by insn code
On 27/05/14 16:27, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. This is not LRA's fault, but the backend misusing the enabled attribute for something it wasn't designed for, and IMHO against the documentation of the attribute too. Just look at the original submission why it has been added. Jakub quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't at any point, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. That being said, I agree that this particular use case is pushing the boundaries -- but that's always a risk when the boundaries aren't clearly defined. A better solution here would be to get rid of all those match_operator patterns and replace them with iterators; but that's a lot of work, particularly for all the conditional operation patterns we have. It would probably also bloat the number of patterns quite alarmingly.
Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb
On Tue, 27 May 2014, Ramana Radhakrishnan wrote: Looking at the section on unaligned accesses, it seems that the ldrb/strb-class instructions are the only ones that are unaffected by the SCTLR.A bit and do not produce alignment faults in any case. The NEON load/store instructions, including vld1.8 can still cause an alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise core memory instructions for unaligned data. This change however has regressed gcc.dg/vect/vect-72.c on the arm-linux-gnueabi target, -march=armv5te, in particular in 4.8. And what are all the configure flags you are using in case some one has to reproduce this issue ? There was nothing relevant to code generation among these flags except from --with-arch=armv5te, but this architecture selection I already included above. Beforehand the code fragment in question produced was: .L14: sub r1, r3, #16 add r3, r3, #16 vld1.8 {q8}, [r1] vld1 allows a misaligned load. cmp r3, r0 vst1.64 {d16-d17}, [r2:64]! bne .L14 Afterwards it is: .L14: vldr d16, [r3, #-16] vldr d17, [r3, #-8] add r3, r3, #16 cmp r3, r1 vst1.64 {d16-d17}, [r2:64]! bne .L14 and the second VLDR instruction traps with SIGILL (the value in R3 is 0x10b29, odd as you'd expect, pointing into `ib'). I don't know why and especially why only the second of the two (regrettably I've been unable to track down an instruction reference that'd be detailed enough to specify what exceptions VLDR can produce and under what conditions). vldr will cause an unaligned access fault if the address is misaligned. The question is why is the address misaligned in this case. The address is unaligned because odd indexes into `ib' are accessed, which is defined as: char ib[N+1]; I'd say the question is rather why VLDR is used to access that array where clearly it can't be. The loop copies a chunk of data in 64-bit quantities between two memory areas of which one is 64-bit aligned and the other is not. 
So while 32-bit or 64-bit aligned accesses can still be used to do it, they can't be for both areas at a time, unless some further data manipulation is done such as shifting and merging. Maciej
Re: RFA: cache enabled attribute by insn code
On Tue, May 27, 2014 at 04:40:13PM +0100, Richard Earnshaw wrote: quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. It says: This definition should be based on other insn attributes and/or target flags. where from what I understand Andreas originally meant was that the other insn attributes would be again based on target flags or other such insn attributes. Yeah, it explicitly rules out just basing the enabled attribute on the operands, arm uses just another attribute based on the operands there, but it still goes against the intent of the attribute, and is definitely a nightmare for the register allocator. Decisions based on the operands should be through constraints. From what I can see, arm uses it just on a single insn, so I'd say it shouldn't be a big deal to rewrite it. Jakub
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
Ok, so just the part of the patch that prevents variadic functions from being sibling-tail-called remains. Kai
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/22/2014 02:33 PM, Kai Tietz wrote: * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls on memory the use of accumulator-register. I don't like this at all. I'm fine with allowing memories that are fully symbolic, e.g. extern void (*foo)(void); void f(void) { foo(); } but any time you've got to use one insn to form the address in %eax, you might as well have also issued the memory load into %eax. And if the memory load is moved earlier, you no longer need to constrain to %eax, but let the register allocator choose the call-clobbered register to use. r~
[patch, mips, commited] Check in obvious patch to fix mips build.
The recent comdat group changes broke the mips build because mips.c was not including cgraph.h. I am checking in the following patch as obvious to fix the mips build. 2014-05-27 Steve Ellcey sell...@mips.com * config/mips/mips.c: Add include of cgraph.h. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index eefcfd2..46e6b81 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -71,6 +71,7 @@ along with GCC; see the file COPYING3. If not see #include "opts.h" #include "tree-pass.h" #include "context.h" +#include "cgraph.h" /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF. */ #define UNSPEC_ADDRESS_P(X)\
Re: RFA: cache enabled attribute by insn code
On 27/05/14 16:50, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:40:13PM +0100, Richard Earnshaw wrote: quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. It says: This definition should be based on other insn attributes and/or target flags. pedantry - It only says *should*, not *must*. Should can imply a recommendation, not a requirement. - It only says *based on*, not *solely based on*. Based on implies that other information might be involved as well, without limitation. /pedantry where from what I understand Andreas originally meant was that the other insn attributes would be again based on target flags or other such insn attributes. Yeah, it explicitly rules out just basing the enabled attribute on the operands, arm uses just another attribute based on the operands there, but it still goes against the intent of the attribute, and is definitely a nightmare for the register allocator. Decisions based on the operands should be through constraints. From what I can see, arm uses it just on a single insn, so I'd say it shouldn't be a big deal to rewrite it. Pedantry aside, I'm not against making such a restriction; but it is a new restriction that isn't explicitly stated in the documentation today. R.
Re: RFA: cache enabled attribute by insn code
Richard Earnshaw rearn...@arm.com writes: On 27/05/14 16:27, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. This is not LRA fault, but the backend misusing the enabled attribute for something it wasn't designed for, and IMHO against the documentation of the attribute too. Just look at the original submission why it has been added. Jakub quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. OK, how about the doc patch below? That being said, I agree that this particular use case is pushing the boundaries -- but that's always a risk when the boundaries aren't clearly defined. A better solution here would be to get rid of all those match_operator patterns and replace them with iterators; but that's a lot of work, particularly for all the conditinal operation patterns we have. It would probably also bloat the number of patterns quite alarmingly. Yeah, which is why I was just going for the one place where it mattered. I think match_operator still has a place in cases where the insn pattern is entirely regular, which seems to be the case for most other uses of shiftable_operator. 
It's just that in this case there really were two separate cases per operator (plus vs. non-plus and mult vs. true shift). Thanks, Richard gcc/ * doc/md.texi: Document the restrictions on the enabled attribute. Index: gcc/doc/md.texi === --- gcc/doc/md.texi (revision 210972) +++ gcc/doc/md.texi (working copy) @@ -4094,11 +4094,11 @@ @subsection Disable insn alternatives using the @code{enabled} attribute @cindex enabled -The @code{enabled} insn attribute may be used to disable certain insn -alternatives for machine-specific reasons. This is useful when adding -new instructions to an existing pattern which are only available for -certain cpu architecture levels as specified with the @code{-march=} -option. +The @code{enabled} insn attribute may be used to disable insn +alternatives that are not available for the current subtarget. +This is useful when adding new instructions to an existing pattern +which are only available for certain cpu architecture levels as +specified with the @code{-march=} option. If an insn alternative is disabled, then it will never be used. The compiler treats the constraints for the disabled alternative as @@ -4112,6 +4112,9 @@ A definition of the @code{enabled} insn attribute. The attribute is defined as usual using the @code{define_attr} command. This definition should be based on other insn attributes and/or target flags. +It must not depend directly or indirectly on the current operands, +since the attribute is expected to be a static property of the subtarget. + The @code{enabled} attribute is a numeric attribute and should evaluate to @code{(const_int 1)} for an enabled alternative and to @code{(const_int 0)} otherwise.
Re: [patch, mips, commited] Check in obvious patch to fix mips build.
Steve Ellcey sell...@mips.com writes: The recent comdat group changes broke the mips build because mips.c was not including cgraph.h. I am checking in the following patch as obvious to fix the mips build. Wasn't that fixed by: 2014-05-26 Jan Hubicka hubi...@ucw.cz * tree.h (decl_comdat_group): Declare. * cgraph.h (symtab_in_same_comdat_p): Move offline to ... * tree.c (decl_comdat_group): Here. ? Thanks, Richard
Re: [RFC] optimize x - y cmp 0 with undefined overflow
I'm asking to merge them (move them to fold_comparison). Done (in the end the patch removes more lines than it adds :-). Tested on x86_64-suse-linux with no regressions.

2014-05-27  Eric Botcazou  ebotca...@adacore.com

	* fold-const.c (fold_comparison): Clean up and extend X +- C1 CMP C2
	to X CMP C2 -+ C1 transformation to EQ_EXPR/NE_EXPR.
	Add X - Y CMP 0 to X CMP Y transformation.
	(fold_binary_loc) EQ_EXPR/NE_EXPR: Remove same transformations.

2014-05-27  Eric Botcazou  ebotca...@adacore.com

	* gcc.dg/fold-compare-8.c: New test.
	* gcc.dg/Wstrict-overflow-25.c: Likewise.

-- Eric Botcazou

Index: fold-const.c
===
--- fold-const.c	(revision 210911)
+++ fold-const.c	(working copy)
@@ -8904,6 +8904,7 @@ static tree
 fold_comparison (location_t loc, enum tree_code code, tree type,
		 tree op0, tree op1)
 {
+  const bool equality_code = (code == EQ_EXPR || code == NE_EXPR);
   tree arg0, arg1, tem;
 
   arg0 = op0;
@@ -8920,28 +8921,24 @@ fold_comparison (location_t loc, enum tr
   if (tree_swap_operands_p (arg0, arg1, true))
     return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, op0);
 
-  /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 +- C1.  */
+  /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
   if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
-      && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
-	  && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
-	  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg1)))
-      && (TREE_CODE (arg1) == INTEGER_CST
-	  && !TREE_OVERFLOW (arg1)))
+      && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
+      && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
+      && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
+      && TREE_CODE (arg1) == INTEGER_CST
+      && !TREE_OVERFLOW (arg1))
     {
+      const enum tree_code
+	reverse_op = TREE_CODE (arg0) == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR;
       tree const1 = TREE_OPERAND (arg0, 1);
-      tree const2 = arg1;
+      tree const2 = fold_convert_loc (loc, TREE_TYPE (const1), arg1);
       tree variable = TREE_OPERAND (arg0, 0);
-      tree lhs;
-      int lhs_add;
-      lhs_add = TREE_CODE (arg0) != PLUS_EXPR;
-
-      lhs = fold_build2_loc (loc, lhs_add ? PLUS_EXPR : MINUS_EXPR,
-			     TREE_TYPE (arg1), const2, const1);
+      tree new_const = int_const_binop (reverse_op, const2, const1);
 
       /* If the constant operation overflowed this can be simplified as
	 a comparison against INT_MAX/INT_MIN.  */
-      if (TREE_CODE (lhs) == INTEGER_CST
-	  && TREE_OVERFLOW (lhs))
+      if (TREE_OVERFLOW (new_const))
	{
	  int const1_sgn = tree_int_cst_sgn (const1);
	  enum tree_code code2 = code;
@@ -8961,29 +8958,48 @@ fold_comparison (location_t loc, enum tr
 
	  /* We now can look at the canonicalized case
	       VARIABLE + 1 CODE2 INT_MIN
	     and decide on the result.  */
-	  if (code2 == LT_EXPR
-	      || code2 == LE_EXPR
-	      || code2 == EQ_EXPR)
-	    return omit_one_operand_loc (loc, type, boolean_false_node, variable);
-	  else if (code2 == NE_EXPR
-		   || code2 == GE_EXPR
-		   || code2 == GT_EXPR)
-	    return omit_one_operand_loc (loc, type, boolean_true_node, variable);
-	}
+	  switch (code2)
+	    {
+	    case EQ_EXPR:
+	    case LT_EXPR:
+	    case LE_EXPR:
+	      return
+		omit_one_operand_loc (loc, type, boolean_false_node, variable);
+
+	    case NE_EXPR:
+	    case GE_EXPR:
+	    case GT_EXPR:
+	      return
+		omit_one_operand_loc (loc, type, boolean_true_node, variable);
 
-      if (TREE_CODE (lhs) == TREE_CODE (arg1)
-	  && (TREE_CODE (lhs) != INTEGER_CST
-	      || !TREE_OVERFLOW (lhs)))
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
	{
-	  if (code != EQ_EXPR && code != NE_EXPR)
+	  if (!equality_code)
	    fold_overflow_warning ("assuming signed overflow does not occur "
				   "when changing X +- C1 cmp C2 to "
-				   "X cmp C1 +- C2",
+				   "X cmp C2 -+ C1",
				   WARN_STRICT_OVERFLOW_COMPARISON);
-	  return fold_build2_loc (loc, code, type, variable, lhs);
+	  return fold_build2_loc (loc, code, type, variable, new_const);
	}
     }
 
+  /* Transform comparisons of the form X - Y CMP 0 to X CMP Y.  */
+  if (TREE_CODE (arg0) == MINUS_EXPR
+      && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
+      && integer_zerop (arg1))
+    {
+      if (!equality_code)
+	fold_overflow_warning ("assuming signed overflow does not occur "
			       "when changing X - Y cmp 0 to X cmp Y",
			       WARN_STRICT_OVERFLOW_COMPARISON);
+      return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0),
+			      TREE_OPERAND (arg0, 1));
+    }
+
   /* For comparisons of pointers we can decompose it to a compile time
      comparison of the base objects and the offsets into the object.
      This requires at least one operand being an ADDR_EXPR or a
@@ -9111,8 +9127,7 @@ fold_comparison
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On May 27, 2014 6:12:58 PM CEST, Eric Botcazou ebotca...@adacore.com wrote: I'm asking to merge them (move them to fold_comparison). Done (in the end the patch removes more lines than it adds :-). That's what I like! Tested on x86_64-suse-linux with no regressions. OK. Thanks, Richard. 2014-05-27 Eric Botcazou ebotca...@adacore.com * fold-const.c (fold_comparison): Clean up and extend X +- C1 CMP C2 to X CMP C2 -+ C1 transformation to EQ_EXPR/NE_EXPR. Add X - Y CMP 0 to X CMP Y transformation. (fold_binary_loc) EQ_EXPR/NE_EXPR: Remove same transformations. 2014-05-27 Eric Botcazou ebotca...@adacore.com * gcc.dg/fold-compare-8.c: New test. * gcc.dg/Wstrict-overflow-25.c: Likewise.
Re: RFA: cache enabled attribute by insn code
On 27/05/14 17:09, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 27/05/14 16:27, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. This is not LRA's fault, but the backend misusing the enabled attribute for something it wasn't designed for, and IMHO against the documentation of the attribute too. Just look at the original submission for why it has been added. Jakub quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't at any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. OK, how about the doc patch below? That being said, I agree that this particular use case is pushing the boundaries -- but that's always a risk when the boundaries aren't clearly defined. A better solution here would be to get rid of all those match_operator patterns and replace them with iterators; but that's a lot of work, particularly for all the conditional operation patterns we have. It would probably also bloat the number of patterns quite alarmingly. Yeah, which is why I was just going for the one place where it mattered. I think match_operator still has a place in cases where the insn pattern is entirely regular, which seems to be the case for most other uses of shiftable_operator.
It's just that in this case there really were two separate cases per operator (plus vs. non-plus and mult vs. true shift). Thanks, Richard

gcc/
	* doc/md.texi: Document the restrictions on the enabled attribute.

Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi	(revision 210972)
+++ gcc/doc/md.texi	(working copy)
@@ -4094,11 +4094,11 @@
 @subsection Disable insn alternatives using the @code{enabled} attribute
 @cindex enabled
 
-The @code{enabled} insn attribute may be used to disable certain insn
-alternatives for machine-specific reasons.  This is useful when adding
-new instructions to an existing pattern which are only available for
-certain cpu architecture levels as specified with the @code{-march=}
-option.
+The @code{enabled} insn attribute may be used to disable insn
+alternatives that are not available for the current subtarget.
+This is useful when adding new instructions to an existing pattern
+which are only available for certain cpu architecture levels as
+specified with the @code{-march=} option.
 
 If an insn alternative is disabled, then it will never be used.  The
 compiler treats the constraints for the disabled alternative as
@@ -4112,6 +4112,9 @@
 A definition of the @code{enabled} insn attribute.  The attribute is
 defined as usual using the @code{define_attr} command.  This definition
 should be based on other insn attributes and/or target flags.
+It must not depend directly or indirectly on the current operands,
+since the attribute is expected to be a static property of the subtarget.
+

I'd reverse the two components of that sentence. Something like: The attribute must be a static property of the subtarget; that is, it must not depend on the current operands or any other dynamic context (for example, location of the insn within the body of a loop). It would be useful if we could precisely define 'static property' somewhere, so that it encompasses per-function changing of the ISA or optimization variant via __attribute__ annotations.
That does need to work, since it could be used for switching between ARM and Thumb. R.
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/27/2014 08:39 AM, Jeff Law wrote: But the value we want may be sitting around in a convenient register (such as r11). So if we force the sibcall to use %rax, then we have to generate a copy. Yet if we have a constraint for the set of registers allowed here, then we give the register allocator the opportunity to coalesce away the copies. We do have an existing call-clobbered register class that we currently do use for indirect sibcalls. The problem that Kai is trying to work around is that he doesn't want to load the value into a register, but its address. But the register allocator only has BASE_REG_CLASS and INDEX_REG_CLASS, and has no possibility to force the address components into registers that are call clobbered. My thinking is that, with few exceptions, this doesn't help anything. If we have to load the full address into a register: lea ofs(base, index, scale), %eax ... call *0(%eax) we might as well include the memory load: mov ofs(base, index, scale), %eax ... call *%eax As far as I can see, the only time we can actually save an instruction is when the address is fully symbolic: mov foo, %eax ... call *%eax becomes call *foo r~
Re: RFA: cache enabled attribute by insn code
Richard Earnshaw rearn...@arm.com writes: On 27/05/14 17:09, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 27/05/14 16:27, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: On 27/05/14 15:08, Richard Sandiford wrote: Hmm, is this because of insn_enabled? If so, how did that work before the patch? LRA already assumed that the enabled attribute didn't depend on the operands. Huh! enabled can be applied to each alternative. Alternatives are selected based on the operands. If LRA can't cope with that we have a serious problem. In fact, a pattern that matches but has no enabled alternatives is meaningless and guaranteed to cause problems during register allocation. This is not LRA's fault, but the backend misusing the enabled attribute for something it wasn't designed for, and IMHO against the documentation of the attribute too. Just look at the original submission for why it has been added. Jakub quote The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. quote The rest of the text just says what happens when this is done and then gives an example usage. It doesn't at any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. OK, how about the doc patch below? That being said, I agree that this particular use case is pushing the boundaries -- but that's always a risk when the boundaries aren't clearly defined. A better solution here would be to get rid of all those match_operator patterns and replace them with iterators; but that's a lot of work, particularly for all the conditional operation patterns we have. It would probably also bloat the number of patterns quite alarmingly. Yeah, which is why I was just going for the one place where it mattered.
I think match_operator still has a place in cases where the insn pattern is entirely regular, which seems to be the case for most other uses of shiftable_operator. It's just that in this case there really were two separate cases per operator (plus vs. non-plus and mult vs. true shift). Thanks, Richard

gcc/
	* doc/md.texi: Document the restrictions on the enabled attribute.

Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi	(revision 210972)
+++ gcc/doc/md.texi	(working copy)
@@ -4094,11 +4094,11 @@
 @subsection Disable insn alternatives using the @code{enabled} attribute
 @cindex enabled
 
-The @code{enabled} insn attribute may be used to disable certain insn
-alternatives for machine-specific reasons.  This is useful when adding
-new instructions to an existing pattern which are only available for
-certain cpu architecture levels as specified with the @code{-march=}
-option.
+The @code{enabled} insn attribute may be used to disable insn
+alternatives that are not available for the current subtarget.
+This is useful when adding new instructions to an existing pattern
+which are only available for certain cpu architecture levels as
+specified with the @code{-march=} option.
 
 If an insn alternative is disabled, then it will never be used.  The
 compiler treats the constraints for the disabled alternative as
@@ -4112,6 +4112,9 @@
 A definition of the @code{enabled} insn attribute.  The attribute is
 defined as usual using the @code{define_attr} command.  This definition
 should be based on other insn attributes and/or target flags.
+It must not depend directly or indirectly on the current operands,
+since the attribute is expected to be a static property of the subtarget.
+

I'd reverse the two components of that sentence. Something like: The attribute must be a static property of the subtarget; that is, it must not depend on the current operands or any other dynamic context (for example, location of the insn within the body of a loop). OK, how about the attached?
It would be useful if we could precisely define 'static property' somewhere, so that it encompasses per-function changing of the ISA or optimization variant via __attribute__ annotations. That does need to work, since it could be used for switching between ARM and Thumb. Yeah, the cache depends on the current target for SWITCHABLE_TARGETs, is invalidated by target_reinit for other targets, and is also invalidated by changes to the register classes. Thanks, Richard

gcc/
	* doc/md.texi: Document the restrictions on the enabled attribute.

Index: gcc/doc/md.texi
===
--- gcc/doc/md.texi	(revision 210972)
+++ gcc/doc/md.texi	(working copy)
@@ -4094,11 +4094,11 @@
 @subsection Disable insn alternatives using the @code{enabled} attribute
 @cindex enabled
 
-The @code{enabled} insn attribute may be used to disable certain insn
-alternatives for machine-specific reasons.  This is useful
Re: [patch, mips, commited] Check in obvious patch to fix mips build.
On Tue, 2014-05-27 at 17:12 +0100, Richard Sandiford wrote: Steve Ellcey sell...@mips.com writes: The recent comdat group changes broke the mips build because mips.c was not including cgraph.h. I am checking in the following patch as obvious to fix the mips build. Wasn't that fixed by: 2014-05-26 Jan Hubicka hubi...@ucw.cz * tree.h (decl_comdat_group): Declare. * cgraph.h (symtab_in_same_comdat_p): Move offline to ... * tree.c (decl_comdat_group): Here. ? Thanks, Richard It didn't fix the build for me. I still got this failure after that patch: /local/home/sellcey/nightly/src/gcc/gcc/config/mips/mips.c: In function 'void mips_start_unique_function(const char*)': /local/home/sellcey/nightly/src/gcc/gcc/config/mips/mips.c:6278:27: error: 'cgraph_create_node' was not declared in this scope make[2]: *** [mips.o] Error 1 When I grepped for cgraph_create_node I found it in cgraph.h. Steve Ellcey sell...@mips.com
Re: RFA: cache enabled attribute by insn code
On 27 May 2014 17:07, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Sandiford rdsandif...@googlemail.com writes: Richard Sandiford rsand...@linux.vnet.ibm.com writes: Does the following patch help? Bah, it won't of course: %i1 needs to be the operator. Here's v2. I tested that it worked for simple tests like: I confirm that the compiler now builds fine (target arm-none-linux-gnueabihf) with this patch applied. Thanks, Christophe.

int f1 (int x, int y) { return x + (y << 4); }
int f2 (int x, int y) { return x - (y << 4); }
int f3 (int x, int y) { return x & (y << 4); }
int f4 (int x, int y) { return x | (y << 4); }
int f5 (int x, int y) { return x ^ (y << 4); }
int f6 (int x, int y) { return (y << 4) - x; }
int g1 (int x, int y, int z) { return x + (y << z); }
int g2 (int x, int y, int z) { return x - (y << z); }
int g3 (int x, int y, int z) { return x & (y << z); }
int g4 (int x, int y, int z) { return x | (y << z); }
int g5 (int x, int y, int z) { return x ^ (y << z); }
int g6 (int x, int y, int z) { return (y << z) - x; }

as well as the testcase. Thanks, Richard

gcc/
	* config/arm/iterators.md (shiftable_ops): New code iterator.
	(t2_binop0, arith_shift_insn): New code attributes.
	* config/arm/arm.md (insn_enabled): Delete.
	(enabled): Remove insn_enabled test.
	(*arith_shiftsi): Split out...
	(*arith_multsi): ...this pattern and remove insn_enabled attribute.

Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md	(revision 210972)
+++ gcc/config/arm/arm.md	(working copy)
@@ -200,17 +200,9 @@
 		 (const_string "yes")]
 		(const_string "no")))
 
-; Allows an insn to disable certain alternatives for reasons other than
-; arch support.
-(define_attr "insn_enabled" "no,yes"
-  (const_string "yes"))
-
 ; Enable all alternatives that are both arch_enabled and insn_enabled.
 (define_attr "enabled" "no,yes"
-  (cond [(eq_attr "insn_enabled" "no")
-	 (const_string "no")
-
-	 (and (eq_attr "predicable_short_it" "no")
+  (cond [(and (eq_attr "predicable_short_it" "no")
 	      (and (eq_attr "predicated" "yes")
 		   (match_test "arm_restrict_it")))
 	 (const_string "no")
@@ -9876,39 +9868,38 @@
 
 ;; Patterns to allow combination of arithmetic, cond code and shifts
 
-(define_insn "*arith_shiftsi"
-  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,r")
-	(match_operator:SI 1 "shiftable_operator"
-	  [(match_operator:SI 3 "shift_operator"
-	     [(match_operand:SI 4 "s_register_operand" "r,r,r,r")
-	      (match_operand:SI 5 "shift_amount_operand" "M,M,M,r")])
-	   (match_operand:SI 2 "s_register_operand" "rk,rk,r,rk")]))]
+;; Separate out the (mult ...) case, since it doesn't allow register operands.
+(define_insn "*arith_multsi"
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
+	(shiftable_ops:SI
+	  (match_operator:SI 2 "mult_operator"
+	    [(match_operand:SI 3 "s_register_operand" "r,r")
+	     (match_operand:SI 4 "power_of_two_operand")])
+	  (match_operand:SI 1 "s_register_operand" "rk,<t2_binop0>")))]
   "TARGET_32BIT"
-  "%i1%?\\t%0, %2, %4%S3"
+  "<arith_shift_insn>%?\\t%0, %1, %3%S2"
   [(set_attr "predicable" "yes")
    (set_attr "predicable_short_it" "no")
    (set_attr "shift" "4")
-   (set_attr "arch" "a,t2,t2,a")
-   ;; Thumb2 doesn't allow the stack pointer to be used for
-   ;; operand1 for all operations other than add and sub. In this case
-   ;; the minus operation is a candidate for an rsub and hence needs
-   ;; to be disabled.
-   ;; We have to make sure to disable the fourth alternative if
-   ;; the shift_operator is MULT, since otherwise the insn will
-   ;; also match a multiply_accumulate pattern and validate_change
-   ;; will allow a replacement of the constant with a register
-   ;; despite the checks done in shift_operator.
-   (set_attr_alternative "insn_enabled"
-	[(const_string "yes")
-	 (if_then_else
-	   (match_operand:SI 1 "add_operator" "")
-	   (const_string "yes") (const_string "no"))
-	 (const_string "yes")
-	 (if_then_else
-	   (match_operand:SI 3 "mult_operator" "")
-	   (const_string "no") (const_string "yes"))])
-   (set_attr "type" "alu_shift_imm,alu_shift_imm,alu_shift_imm,alu_shift_reg")])
+   (set_attr "arch" "a,t2")
+   (set_attr "type" "alu_shift_imm,alu_shift_imm")])
 
+;; Handles cases other than the above.
+(define_insn "*arith_shiftsi"
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
+	(shiftable_ops:SI
+	  (match_operator:SI 2 "shift_operator"
+	    [(match_operand:SI 3 "s_register_operand" "r,r,r")
+	     (match_operand:SI 4 "shift_amount_operand" "M,M,r")])
+	  (match_operand:SI 1 "s_register_operand" "rk,<t2_binop0>,rk")))]
+