[PATCH, AARCH64] movcc for fcsel

2014-03-18 Thread Zhenqiang Chen
Hi,

For floating-point values, emit_conditional_move (called from the ifcvt
pass) requires movsfcc/movdfcc in order to expand if-then-else into an
"fcsel" insn.

Bootstrapped, with no make check regressions on qemu-aarch64.

Is it OK for next stage1?

Thanks!
-Zhenqiang

ChangeLog:
2014-03-18  Zhenqiang Chen  

* config/aarch64/aarch64.md (movcc): New for GPF.

testsuite/ChangeLog:
2014-03-18  Zhenqiang Chen  

* gcc.target/aarch64/fcsel.c: New test case.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..0f4b8ebf 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2344,6 +2344,25 @@
   }
 )

+(define_expand "movcc"
+  [(set (match_operand:GPF 0 "register_operand" "")
+(if_then_else:GPF (match_operand 1 "aarch64_comparison_operator" "")
+  (match_operand:GPF 2 "register_operand" "")
+  (match_operand:GPF 3 "register_operand" "")))]
+  ""
+  {
+rtx ccreg;
+enum rtx_code code = GET_CODE (operands[1]);
+
+if (code == UNEQ || code == LTGT)
+  FAIL;
+
+ccreg = aarch64_gen_compare_reg (code, XEXP (operands[1], 0),
+  XEXP (operands[1], 1));
+operands[1] = gen_rtx_fmt_ee (code, VOIDmode, ccreg, const0_rtx);
+  }
+)
+
 (define_insn "*csinc2_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 (plus:GPI (match_operator:GPI 2 "aarch64_comparison_operator"
diff --git a/gcc/testsuite/gcc.target/aarch64/fcsel.c b/gcc/testsuite/gcc.target/aarch64/fcsel.c
new file mode 100644
index 000..9c5431a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fcsel.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 " } */
+
+float f1 (float a, float b, float c, float d)
+{
+  if (a > 0.0)
+return c;
+  else
+return 2.0;
+}
+
+double f2 (double a, double b, double c, double d)
+{
+  if (a > b)
+return c;
+  else
+return d;
+}
+
+/* { dg-final { scan-assembler-times "\tfcsel" 2 } } */


[PATCH] Fix transparent alias chain construction in change_decl_assembler_name

2014-03-18 Thread Ilya Enkovich
Hi,

Here is a small patch to fix an ICE in change_decl_assembler_name.  I found
this problem while working on my experimental branch, which uses transparent
alias chains heavily, and I don't have a reproducer for trunk.

Bootstrapped and checked on linux-x86_64.

Thanks,
Ilya
--

2014-03-17  Ilya Enkovich  

   * symtab.c (change_decl_assembler_name): Fix transparent alias
   chain construction.


diff --git a/gcc/symtab.c b/gcc/symtab.c
index 5d69803..2d6f665 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -492,7 +492,7 @@ change_decl_assembler_name (tree decl, tree name)
   if (alias)
{
  IDENTIFIER_TRANSPARENT_ALIAS (name) = 1;
- TREE_CHAIN (DECL_ASSEMBLER_NAME (name)) = alias;
+ TREE_CHAIN (name) = alias;
}
   if (node)
insert_to_assembler_name_hash (node, true);
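
As an illustration of the one-line change above, here is a minimal sketch
of a transparent alias chain.  It is not GCC code: struct ident is a
hypothetical stand-in for an identifier node, its two fields only model
IDENTIFIER_TRANSPARENT_ALIAS and TREE_CHAIN, and set_transparent_alias and
ultimate_target are made-up helper names.  The point it tries to show is
that the chain link is stored on the identifier itself, which is what the
fixed statement does with NAME.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an identifier node; not GCC's tree type.  */
struct ident {
  const char *str;
  int transparent_alias;   /* models IDENTIFIER_TRANSPARENT_ALIAS */
  struct ident *chain;     /* models TREE_CHAIN */
};

/* Models the fixed statement pair: flag NAME as a transparent alias and
   chain it directly to ALIAS.  */
static void set_transparent_alias(struct ident *name, struct ident *alias)
{
  assert(name != NULL && alias != NULL);
  name->transparent_alias = 1;
  name->chain = alias;
}

/* Follow the chain until an identifier that is not a transparent alias.  */
static const struct ident *ultimate_target(const struct ident *id)
{
  while (id->transparent_alias && id->chain != NULL)
    id = id->chain;
  return id;
}
```

With three identifiers chained a -> b -> c, ultimate_target on a would
return c in this model.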


Re: [PATCH] Fix -fsanitize=undefined -flto (PR sanitizer/60535)

2014-03-18 Thread Richard Biener
On Mon, 17 Mar 2014, Jakub Jelinek wrote:

> Hi!
> 
> Apparently rest_of_decl_compilation only calls varpool_finalize_decl
> if not in_lto_p, so this patch calls it explicitly after that call to
> make sure with -flto we register the newly created vars with varpool as
> well.
> 
> Additionally, the patch gives name to a few further builtin types, so that
> the null-4.c and overflow-int128.c tests don't fail with -flto (without the
> lto-lang.c change they printed  as type name).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2014-03-17  Jakub Jelinek  
> 
>   PR sanitizer/60535
>   * ubsan.c (ubsan_type_descriptor, ubsan_create_data): Call
>   varpool_finalize_decl after rest_of_decl_compilation.
> lto/
>   * lto-lang.c (lto_init): Add NAME_TYPE for int128_integer_type_node
>   and complex_{float,{,long_}double}_type_node.
> testsuite/
>   * c-c++-common/ubsan/null-1.c: Don't skip if -flto.
>   * c-c++-common/ubsan/null-2.c: Likewise.
>   * c-c++-common/ubsan/null-3.c: Likewise.
>   * c-c++-common/ubsan/null-4.c: Likewise.
>   * c-c++-common/ubsan/null-5.c: Likewise.
>   * c-c++-common/ubsan/null-6.c: Likewise.
>   * c-c++-common/ubsan/null-7.c: Likewise.
>   * c-c++-common/ubsan/null-8.c: Likewise.
>   * c-c++-common/ubsan/null-9.c: Likewise.
>   * c-c++-common/ubsan/null-10.c: Likewise.
>   * c-c++-common/ubsan/null-11.c: Likewise.
>   * c-c++-common/ubsan/overflow-1.c: Likewise.
>   * c-c++-common/ubsan/overflow-2.c: Likewise.
>   * c-c++-common/ubsan/overflow-add-1.c: Likewise.
>   * c-c++-common/ubsan/overflow-add-2.c: Likewise.
>   * c-c++-common/ubsan/overflow-int128.c: Likewise.
>   * c-c++-common/ubsan/overflow-mul-1.c: Likewise.
>   * c-c++-common/ubsan/overflow-mul-2.c: Likewise.
>   * c-c++-common/ubsan/overflow-mul-3.c: Likewise.
>   * c-c++-common/ubsan/overflow-mul-4.c: Likewise.
>   * c-c++-common/ubsan/overflow-negate-1.c: Likewise.
>   * c-c++-common/ubsan/overflow-negate-2.c: Likewise.
>   * c-c++-common/ubsan/overflow-sub-1.c: Likewise.
>   * c-c++-common/ubsan/overflow-sub-2.c: Likewise.
>   * c-c++-common/ubsan/pr59333.c: Likewise.
>   * c-c++-common/ubsan/pr59503.c: Likewise.
>   * c-c++-common/ubsan/pr59667.c: Likewise.
>   * c-c++-common/ubsan/undefined-1.c: Likewise.
>   * g++.dg/ubsan/pr59250.C: Likewise.
>   * g++.dg/ubsan/pr59306.C: Likewise.
> 
> --- gcc/ubsan.c.jj2014-01-08 17:45:06.0 +0100
> +++ gcc/ubsan.c   2014-03-17 14:09:40.280376415 +0100
> @@ -391,6 +391,7 @@ ubsan_type_descriptor (tree type, bool w
>TREE_STATIC (ctor) = 1;
>DECL_INITIAL (decl) = ctor;
>rest_of_decl_compilation (decl, 1, 0);
> +  varpool_finalize_decl (decl);
>  
>/* Save the VAR_DECL into the hash table.  */
>decl_for_type_insert (type, decl);
> @@ -502,6 +503,7 @@ ubsan_create_data (const char *name, loc
>TREE_STATIC (ctor) = 1;
>DECL_INITIAL (var) = ctor;
>rest_of_decl_compilation (var, 1, 0);
> +  varpool_finalize_decl (var);

Isn't varpool_finalize_decl the canonical interface anyway?
Thus, you don't need to call rest_of_decl_compilation AFAIK - the
varpool machinery will do that.

>return var;
>  }
> --- gcc/lto/lto-lang.c.jj 2014-03-10 10:50:15.0 +0100
> +++ gcc/lto/lto-lang.c2014-03-17 15:49:10.592371589 +0100
> @@ -1222,6 +1222,13 @@ lto_init (void)
>NAME_TYPE (long_double_type_node, "long double");
>NAME_TYPE (void_type_node, "void");
>NAME_TYPE (boolean_type_node, "bool");
> +  NAME_TYPE (complex_float_type_node, "complex float");
> +  NAME_TYPE (complex_double_type_node, "complex double");
> +  NAME_TYPE (complex_long_double_type_node, "complex long double");
> +#if HOST_BITS_PER_WIDE_INT >= 64
> +  if (targetm.scalar_mode_supported_p (TImode))
> +NAME_TYPE (int128_integer_type_node, "__int128");
> +#endif

Should be enough to check if int128_integer_type_node is not NULL.

Otherwise looks ok.

Thanks,
Richard.

>  #undef NAME_TYPE
>  
>/* Initialize LTO-specific data structures.  */
> --- gcc/testsuite/c-c++-common/ubsan/null-1.c.jj  2013-11-19 21:56:24.566416519 +0100
> +++ gcc/testsuite/c-c++-common/ubsan/null-1.c 2014-03-17 13:23:46.057000209 +0100
> @@ -1,7 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-fsanitize=null -w" } */
>  /* { dg-shouldfail "ubsan" } */
> -/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
>  
>  int
>  main (void)
> --- gcc/testsuite/c-c++-common/ubsan/null-2.c.jj  2013-11-19 21:56:24.566416519 +0100
> +++ gcc/testsuite/c-c++-common/ubsan/null-2.c 2014-03-17 13:23:46.06592 +0100
> @@ -1,7 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-fsanitize=null -w" } */
>  /* { dg-shouldfail "ubsan" } */
> -/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
>  
>  int
>  main (void)
> --- gcc/testsuite/c-c++-common/ubsan/null-3.c.jj  2013-11-19 
> 21:56:24.567416516 +010

Re: extending constants in rtl

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 4:36 AM, Mike Stump  wrote:
> So, to support things like this:
>
> (define_constants
>(C1_TEMP_REGNUM  PROLOGUE_SCRATCH_1)
>(C1_TEMP2_REGNUM PROLOGUE_SCRATCH_2)
>
> I need the rtl reader to do less checking.  When we turn off int validation, 
> this then works, and we get:
>
>   #define C1_TEMP_REGNUM PROLOGUE_SCRATCH_1
>
> in insn-constants.h, which is what I wanted.  The problem is that I choose 
> different scratch register based upon the cpu and this is then used in a 
> clobber in the rtl of a define_insn.
>
> I'd be happy to do this some other way, but, I didn't see a way to do this, 
> otherwise.
>
> Absent a better solution, I'd like to pursue this.  The only question I have
> is whether to remove the checking, or to allow the target to say that it
> doesn't want the checking.

Make it

(define_symbolic_constant
   (C1_TEMP_REGNUM  PROLOGUE_SCRATCH_1)
?

>
> diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
> index c198b5b..ceef96c 100644
> --- a/gcc/read-rtl.c
> +++ b/gcc/read-rtl.c
> @@ -807,8 +807,12 @@ validate_const_int (const char *string)
>  valid = 0;
> break;
>}
> +#if 0
> +  /* In order to support defining the md constants in terms of CPP constants from tm.h, we
> + can't check this.  */
>if (!valid)
>  fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string);
> +#endif
>  }
>
>  static void
>


Re: [PATCH] Fix transparent alias chain construction in change_decl_assembler_name

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 9:20 AM, Ilya Enkovich  wrote:
> Hi,
>
> Here is a small patch to fix an ICE in change_decl_assembler_name.  I found
> this problem while working on my experimental branch, which uses transparent
> alias chains heavily, and I don't have a reproducer for trunk.
>
> Bootstrapped and checked on linux-x86_64.

Ok.

Thanks,
Richard.

> Thanks,
> Ilya
> --
>
> 2014-03-17  Ilya Enkovich  
>
>* symtab.c (change_decl_assembler_name): Fix transparent alias
>chain construction.
>
>
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index 5d69803..2d6f665 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -492,7 +492,7 @@ change_decl_assembler_name (tree decl, tree name)
>if (alias)
> {
>   IDENTIFIER_TRANSPARENT_ALIAS (name) = 1;
> - TREE_CHAIN (DECL_ASSEMBLER_NAME (name)) = alias;
> + TREE_CHAIN (name) = alias;
> }
>if (node)
> insert_to_assembler_name_hash (node, true);


[patch, libgfortran] Fix SPU link failures (Re: Set close-on-exec flag when opening files)

2014-03-18 Thread Ulrich Weigand
Janne Blomqvist wrote:

> * io/unix.c (set_close_on_exec): New function.

Since this patch, most Fortran tests fail on spu-elf since the system
libraries do not support fcntl on the SPU.

The patch below fixes this by using an autoconf check to verify
fcntl is present before using it, as is already done for many
other routines in io/unix.c.

Tested on spu-elf, fixes the Fortran problems there.
Also verified on an older powerpc64-linux build that we still
use fcntl there.

OK for mainline?

Bye,
Ulrich


ChangeLog:

* configure.ac: Check for presence of fcntl.
* configure: Regenerate.
* config.h.in: Regenerate.
* io/unix.c (set_close_on_exec): Check for HAVE_FCNTL.

Index: libgfortran/io/unix.c
===
*** libgfortran/io/unix.c   (revision 208613)
--- libgfortran/io/unix.c   (working copy)
*** static void __attribute__ ((unused))
*** 1077,1083 
  set_close_on_exec (int fd __attribute__ ((unused)))
  {
/* Mingw does not define F_SETFD.  */
! #if defined(F_SETFD) && defined(FD_CLOEXEC)
if (fd >= 0)
  fcntl(fd, F_SETFD, FD_CLOEXEC);
  #endif
--- 1077,1083 
  set_close_on_exec (int fd __attribute__ ((unused)))
  {
/* Mingw does not define F_SETFD.  */
! #if defined(HAVE_FCNTL) && defined(F_SETFD) && defined(FD_CLOEXEC)
if (fd >= 0)
  fcntl(fd, F_SETFD, FD_CLOEXEC);
  #endif
Index: libgfortran/configure.ac
===
*** libgfortran/configure.ac(revision 208613)
--- libgfortran/configure.ac(working copy)
*** if test "x${with_newlib}" = "xyes"; then
*** 282,288 
  else
 AC_CHECK_FUNCS_ONCE(getrusage times mkstemp strtof strtold snprintf \
 ftruncate chsize chdir getlogin gethostname kill link symlink sleep ttyname \
!alarm access fork execl wait setmode execve pipe dup2 close \
 strcasestr getrlimit gettimeofday stat fstat lstat getpwuid vsnprintf dup \
 getcwd localtime_r gmtime_r getpwuid_r ttyname_r clock_gettime \
 readlink getgid getpid getppid getuid geteuid umask getegid \
--- 282,288 
  else
 AC_CHECK_FUNCS_ONCE(getrusage times mkstemp strtof strtold snprintf \
 ftruncate chsize chdir getlogin gethostname kill link symlink sleep ttyname \
!alarm access fork execl wait setmode execve pipe dup2 close fcntl \
 strcasestr getrlimit gettimeofday stat fstat lstat getpwuid vsnprintf dup \
 getcwd localtime_r gmtime_r getpwuid_r ttyname_r clock_gettime \
 readlink getgid getpid getppid getuid geteuid umask getegid \
Index: libgfortran/config.h.in
===
*** libgfortran/config.h.in (revision 208613)
--- libgfortran/config.h.in (working copy)
***
*** 360,365 
--- 360,368 
  /* Define to 1 if you have the `fabsl' function. */
  #undef HAVE_FABSL
  
+ /* Define to 1 if you have the `fcntl' function. */
+ #undef HAVE_FCNTL
+ 
  /* libm includes feenableexcept */
  #undef HAVE_FEENABLEEXCEPT
  
Index: libgfortran/configure
===
*** libgfortran/configure   (revision 208613)
--- libgfortran/configure   (working copy)
*** as_fn_append ac_func_list " execve"
*** 2572,2577 
--- 2572,2578 
  as_fn_append ac_func_list " pipe"
  as_fn_append ac_func_list " dup2"
  as_fn_append ac_func_list " close"
+ as_fn_append ac_func_list " fcntl"
  as_fn_append ac_func_list " strcasestr"
  as_fn_append ac_func_list " getrlimit"
  as_fn_append ac_func_list " gettimeofday"
*** else
*** 12342,12348 
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
! #line 12345 "configure"
  #include "confdefs.h"
  
  #if HAVE_DLFCN_H
--- 12343,12349 
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
! #line 12346 "configure"
  #include "confdefs.h"
  
  #if HAVE_DLFCN_H
*** else
*** 12448,12454 
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
! #line 12451 "configure"
  #include "confdefs.h"
  
  #if HAVE_DLFCN_H
--- 12449,12455 
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
! #line 12452 "configure"
  #include "confdefs.h"
  
  #if HAVE_DLFCN_H
*** done
*** 16602,16607 
--- 16603,16610 
  
  
  
+ 
+ 
  fi
  
  # Check strerror_r, cannot be above as versions with two and three arguments exist
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com
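
The guard added by the patch above can be sketched as a standalone
function.  This is an illustration only, under a few assumptions:
HAVE_FCNTL normally comes from the generated config.h and is defined by
hand here, the function name set_close_on_exec_sketch is made up, and
unlike the libgfortran routine (which returns void) it returns the fcntl
result so that the effect can be checked.

```c
#include <fcntl.h>

/* Assumption: in libgfortran this macro is provided by config.h after the
   autoconf check; it is defined manually here for illustration.  */
#ifndef HAVE_FCNTL
#define HAVE_FCNTL 1
#endif

/* Set the close-on-exec flag on FD when the platform supports it.
   Returns the fcntl result, or 0 when nothing was done.  */
static int set_close_on_exec_sketch(int fd)
{
#if defined(HAVE_FCNTL) && defined(F_SETFD) && defined(FD_CLOEXEC)
  if (fd >= 0)
    return fcntl(fd, F_SETFD, FD_CLOEXEC);
#else
  (void) fd;                    /* no fcntl: silently do nothing */
#endif
  return 0;
}
```

On a target without fcntl, leaving HAVE_FCNTL undefined compiles the body
down to a no-op, so no reference to the missing symbol is emitted.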



Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Ilya Tocar
On 17 Mar 22:18, Ulrich Drepper wrote:
> On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar  wrote:
> 
> > undefined is similar in behavior to setzero, but it also clobbers
> > flags. Maybe just define it to setzero for now?
> >
> >
> What do you mean by "clobbers flags"?  Do you have an example?

I've used the following example:

#include 

extern __inline __m512
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_undefined_ps (void)
{
  __m512 __Y;
  __asm__ ("" : "=x" (__Y));
  return __Y;
}


__m512 foo1(__m512 __A)
{
return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
  (__v16sf)
 _mm512_undefined_ps (),
  (__mmask16) -1);
}

__m512 foo2(__m512 __A)
{
return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
  (__v16sf)
 _mm512_setzero_ps (),
  (__mmask16) -1);
}


In foo1 the asm statement is expanded into the following RTL:

(insn 6 3 7 2 (parallel [
(set (reg:V16SF 87 [ __Y ])
(asm_operands:V16SF ("") ("=x") 0 []
 []
 [] foo.c:8))
(clobber (reg:QI 18 fpsr))
(clobber (reg:QI 17 flags))
]) foo.c:8 -1

As you can see, the flags are clobbered by the asm statement, while in the
setzero case (foo2) I have just:
(insn 7 6 8 2 (set (reg:V16SF 88)
(const_vector:V16SF [
(const_double:SF 0.0 [0x0.0p+0])
(const_double:SF 0.0 [0x0.0p+0])
//rest of zeroes skipped.


Re: [patch, libgfortran] Fix SPU link failures (Re: Set close-on-exec flag when opening files)

2014-03-18 Thread Tobias Burnus
Ulrich Weigand wrote:
> Janne Blomqvist wrote:
> > * io/unix.c (set_close_on_exec): New function.
>
> Since this patch, most Fortran tests fail on spu-elf since the system
> libraries do not support fcntl on the SPU.
>
> The patch below fixes this by using an autoconf check to verify
> fcntl is present before using it, as is already done for many
> other routines in io/unix.c.
>
> Tested on spu-elf, fixes the Fortran problems there.
> Also verified on an older powerpc64-linux build that we still
> use fcntl there.
>
> OK for mainline?

Looks good to me, and it is rather obvious.

Thanks for the patch!

Tobias


[Patch AArch64] Remove unnecessary definition of MEMORY_MOVE_COST

2014-03-18 Thread Ramana Radhakrishnan

Hi,

While looking at something else I realized that we still had
MEMORY_MOVE_COST defined in the backend.  However, we also define the more
recent target hook TARGET_MEMORY_MOVE_COST, and the only use of the
MEMORY_MOVE_COST macro is in the default implementation of that hook, so
the macro definition can simply be removed :)


OK for stage4?  I just rebuilt the compiler (cc1 and cc1plus) and compiled a
few large enough .i files that I had lying around; as expected, there was no
difference in the generated code.


regards,
Ramana

  Ramana Radhakrishnan  

* config/aarch64/aarch64.h (MEMORY_MOVE_COST): Delete.

--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 1f71ee5..7962aa4 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -763,10 +763,6 @@ do {   
 \
 /* Put trampolines in the text section so that mapping symbols work
correctly.  */
 #define TRAMPOLINE_SECTION text_section
-
-/* Costs, etc.  */
-#define MEMORY_MOVE_COST(M, CLASS, IN) \
-  (GET_MODE_SIZE (M) < 8 ? 8 : GET_MODE_SIZE (M))
 
 /* To start with.  */
 #define BRANCH_COST(SPEED_P, PREDICTABLE_P) 2

[PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path

2014-03-18 Thread bin.cheng
Hi,
After the control flow graph change made by
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, the test case
gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
targets, including cortex-m3/cortex-m0.
The regression reveals a missed opportunity in jump threading, which causes
a forwarder basic block not to be removed in cfgcleanup after jump threading
in VRP1.  The root cause is described in the corresponding PR,
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363; please refer to it for
the detailed report.

This patch fixes the issue by adding the constant value, instead of the
ssa_name, as the new phi argument.  Bootstrapped and tested on x86_64; also
tested on cortex-m3, where the regression is gone.
I think this should wait for stage1, but I would like to hear some comments
now.  So does it look reasonable?
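
The backtracking idea can be sketched outside of GCC as follows.  This is
a simplified model, not the patch itself: toy_edge and toy_value are
hypothetical stand-ins for GCC's edge and SSA/PHI data, value_in_path is a
made-up name for what the patch calls get_value_locus_in_path, and the
location (locus) handling is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for an edge on the jump threading path.  */
struct toy_edge {
  int dest_bb;      /* block the edge enters */
  int dest_idx;     /* index of this edge's PHI argument in dest_bb */
};

/* Hypothetical stand-in for a value: a constant or an SSA name.  */
struct toy_value {
  int is_const;                             /* nonzero for a constant */
  long cst;                                 /* the constant, if is_const */
  int phi_bb;                               /* block of defining PHI, or -1 */
  const struct toy_value *const *phi_args;  /* PHI args, by dest_idx */
};

/* Walk PATH backwards from IDX - 1.  If an edge enters the block whose
   PHI defines DEF, return that edge's PHI argument when it is a
   constant; otherwise return DEF unchanged.  */
static const struct toy_value *
value_in_path(const struct toy_value *def, const struct toy_edge *path,
              int idx)
{
  if (path == NULL || idx == 0 || def->phi_bb < 0)
    return def;

  for (int j = idx - 1; j >= 0; j--)
    if (path[j].dest_bb == def->phi_bb)
      {
        const struct toy_value *arg = def->phi_args[path[j].dest_idx];
        return arg->is_const ? arg : def;
      }

  return def;
}
```

In this model, a name defined by a PHI in block 3 resolves to the constant
argument when the path enters block 3 through the edge that carries that
constant, and stays as the name otherwise.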


2014-03-18  Bin Cheng  

PR regression/60363
* gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
(copy_phi_args): New parameters.  Call get_value_locus_in_path.
(update_destination_phis): New parameter.
(create_edge_and_update_destination_phis): Ditto.
(ssa_fix_duplicate_block_edges): Pass new arguments.
(thread_single_edge): Ditto.
Index: gcc/tree-ssa-threadupdate.c
===
--- gcc/tree-ssa-threadupdate.c (revision 208609)
+++ gcc/tree-ssa-threadupdate.c (working copy)
@@ -403,10 +403,51 @@ copy_phi_arg_into_existing_phi (edge src_e, edge t
 }
 }
 
-/* For each PHI in BB, copy the argument associated with SRC_E to TGT_E.  */
+/* Given ssa_name DEF, backtrack jump threading PATH from node IDX
+   to see if it has constant value in a flow sensitive manner.  Set
+   LOCUS to location of the constant phi arg and return the value.
+   Return DEF directly if either PATH or idx is ZERO.  */
 
+static tree
+get_value_locus_in_path (tree def, vec *path,
+int idx, source_location *locus)
+{
+  tree arg;
+  gimple def_phi;
+  basic_block def_bb;
+
+  if (path == NULL || idx == 0)
+return def;
+
+  def_phi = SSA_NAME_DEF_STMT (def);
+  if (gimple_code (def_phi) != GIMPLE_PHI)
+return def;
+
+  def_bb = gimple_bb (def_phi);
+  /* Backtrack jump threading path from IDX to see if def has constant
+ value.  */
+  for (int j = idx - 1; j >= 0; j--)
+{
+  edge e = (*path)[j]->e;
+  if (e->dest == def_bb)
+   {
+ arg = gimple_phi_arg_def (def_phi, e->dest_idx);
+ *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
+ return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
+   }
+}
+
+  return def;
+}
+
+/* For each PHI in BB, copy the argument associated with SRC_E to TGT_E.
+   Try to backtrack jump threading PATH from node IDX to see if the arg
+   has constant value, copy constant value instead of argument itself
+   if yes.  */
+
 static void
-copy_phi_args (basic_block bb, edge src_e, edge tgt_e)
+copy_phi_args (basic_block bb, edge src_e, edge tgt_e,
+  vec *path, int idx)
 {
   gimple_stmt_iterator gsi;
   int src_indx = src_e->dest_idx;
@@ -414,8 +455,14 @@ static void
   for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 {
   gimple phi = gsi_stmt (gsi);
+  tree def = gimple_phi_arg_def (phi, src_indx);
   source_location locus = gimple_phi_arg_location (phi, src_indx);
-  add_phi_arg (phi, gimple_phi_arg_def (phi, src_indx), tgt_e, locus);
+
+  if (TREE_CODE (def) == SSA_NAME
+ && !virtual_operand_p (gimple_phi_result (phi)))
+   def = get_value_locus_in_path (def, path, idx, &locus);
+
+  add_phi_arg (phi, def, tgt_e, locus);
 }
 }
 
@@ -423,10 +470,13 @@ static void
edges.  The copy is NEW_BB.  Every PHI node in every direct successor of
ORIG_BB has a new argument associated with edge from NEW_BB to the
successor.  Initialize the PHI argument so that it is equal to the PHI
-   argument associated with the edge from ORIG_BB to the successor.  */
+   argument associated with the edge from ORIG_BB to the successor.
+   PATH and IDX are used to check if the new PHI argument has constant
+   value in a flow sensitive manner.  */
 
 static void
-update_destination_phis (basic_block orig_bb, basic_block new_bb)
+update_destination_phis (basic_block orig_bb, basic_block new_bb,
+vec *path, int idx)
 {
   edge_iterator ei;
   edge e;
@@ -434,7 +484,7 @@ static void
   FOR_EACH_EDGE (e, ei, orig_bb->succs)
 {
   edge e2 = find_edge (new_bb, e->dest);
-  copy_phi_args (e->dest, e, e2);
+  copy_phi_args (e->dest, e, e2, path, idx);
 }
 }
 
@@ -443,11 +493,13 @@ static void
destination.
 
Add an additional argument to any PHI nodes at the single
-   destination.  */
+   destination.  IDX is the start node in jump threading path
+   we start to check to see if the new PHI argument has constant
+   value along the jump threading path.  */
 
 static void
 create_edge_and_update_destination_

[testsuite] Fix gcc.dg/tls/pr58595.c on Solaris 9

2014-03-18 Thread Rainer Orth
The new gcc.dg/tls/pr58595.c testcase FAILs on Solaris 9:

FAIL: gcc.dg/tls/pr58595.c (test for excess errors)
Excess errors:
Undefined   first referenced
 symbol in file
___tls_get_addr /var/tmp//ccuBbAna.o
ld: fatal: Symbol referencing errors. No output written to ./pr58595.exe
WARNING: gcc.dg/tls/pr58595.c compilation failed to produce executable

Fixed as follows, tested with the appropriate runtest invocation on
i386-pc-solaris2.9, i386-pc-solaris2.11, and x86_64-unknown-linux-gnu,
installed on mainline.

Rainer


2014-03-18  Rainer Orth  

* gcc.dg/tls/pr58595.c: Add tls options.

# HG changeset patch
# Parent cb2102d4cf2a47e2919e0b0d88292d05d340f914
Fix gcc.dg/tls/pr58595.c on Solaris 9

diff --git a/gcc/testsuite/gcc.dg/tls/pr58595.c b/gcc/testsuite/gcc.dg/tls/pr58595.c
--- a/gcc/testsuite/gcc.dg/tls/pr58595.c
+++ b/gcc/testsuite/gcc.dg/tls/pr58595.c
@@ -2,6 +2,7 @@
 /* { dg-do run } */
 /* { dg-options "-O2" } */
 /* { dg-additional-options "-fpic" { target fpic } } */
+/* { dg-add-options tls } */
 /* { dg-require-effective-target tls } */
 /* { dg-require-effective-target sync_int_long } */
 

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [testsuite] Fix gcc.dg/tls/pr58595.c on Solaris 9

2014-03-18 Thread Jakub Jelinek
On Tue, Mar 18, 2014 at 11:19:52AM +0100, Rainer Orth wrote:
> The new gcc.dg/tls/pr58595.c testcase FAILs on Solaris 9:
> 
> FAIL: gcc.dg/tls/pr58595.c (test for excess errors)
> Excess errors:
> Undefined   first referenced
>  symbol in file
> ___tls_get_addr /var/tmp//ccuBbAna.o
> ld: fatal: Symbol referencing errors. No output written to ./pr58595.exe
> WARNING: gcc.dg/tls/pr58595.c compilation failed to produce executable
> 
> Fixed as follows, tested with the appropriate runtest invocation on
> i386-pc-solaris2.9, i386-pc-solaris2.11, and x86_64-unknown-linux-gnu,
> installed on mainline.

Can you please also change
/* { dg-require-effective-target tls } */
to
/* { dg-require-effective-target tls_runtime } */
?

BTW, don't know if dg-add-options tls can come before that or not.

> 2014-03-18  Rainer Orth  
> 
>   * gcc.dg/tls/pr58595.c: Add tls options.
> 

> # HG changeset patch
> # Parent cb2102d4cf2a47e2919e0b0d88292d05d340f914
> Fix gcc.dg/tls/pr58595.c on Solaris 9
> 
> diff --git a/gcc/testsuite/gcc.dg/tls/pr58595.c b/gcc/testsuite/gcc.dg/tls/pr58595.c
> --- a/gcc/testsuite/gcc.dg/tls/pr58595.c
> +++ b/gcc/testsuite/gcc.dg/tls/pr58595.c
> @@ -2,6 +2,7 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2" } */
>  /* { dg-additional-options "-fpic" { target fpic } } */
> +/* { dg-add-options tls } */
>  /* { dg-require-effective-target tls } */
>  /* { dg-require-effective-target sync_int_long } */
>  

Jakub


Re: [SPARC] Follow-up to latest LEON3 workaround

2014-03-18 Thread Eric Botcazou
> This is a follow-up to
>   http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00959.html
> which implemented the workaround for the data cache nullify issues on LEON3.

Another fixlet.

Tested on SPARC/Solaris, applied on the mainline and 4.8 branch.


2014-03-18  Eric Botcazou  

* config/sparc/sparc.c (sparc_do_work_around_errata): Speed up and use
proper constant for the store mode.


-- 
Eric Botcazou
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 208631)
+++ config/sparc/sparc.c	(working copy)
@@ -907,15 +907,15 @@ sparc_do_work_around_errata (void)
 	  && REGNO (SET_DEST (set)) % 2 != 0)
 	{
 	  /* The wrong dependency is on the enclosing double register.  */
-	  unsigned int x = REGNO (SET_DEST (set)) - 1;
+	  const unsigned int x = REGNO (SET_DEST (set)) - 1;
 	  unsigned int src1, src2, dest;
 	  int code;
 
-	  /* If the insn has a delay slot, then it cannot be problematic.  */
 	  next = next_active_insn (insn);
 	  if (!next)
 	break;
-	  if (NONJUMP_INSN_P (next) && GET_CODE (PATTERN (next)) == SEQUENCE)
+	  /* If the insn is a branch, then it cannot be problematic.  */
+	  if (!NONJUMP_INSN_P (next) || GET_CODE (PATTERN (next)) == SEQUENCE)
 	continue;
 
 	  extract_insn (next);
@@ -979,11 +979,11 @@ sparc_do_work_around_errata (void)
 	 dependency on the first single-cycle load.  */
 	  rtx x = SET_DEST (set);
 
-	  /* If the insn has a delay slot, then it cannot be problematic.  */
 	  next = next_active_insn (insn);
 	  if (!next)
 	break;
-	  if (NONJUMP_INSN_P (next) && GET_CODE (PATTERN (next)) == SEQUENCE)
+	  /* If the insn is a branch, then it cannot be problematic.  */
+	  if (!NONJUMP_INSN_P (next) || GET_CODE (PATTERN (next)) == SEQUENCE)
 	continue;
 
 	  /* Look for a second memory access to/from an integer register.  */
@@ -1001,13 +1001,13 @@ sparc_do_work_around_errata (void)
 		insert_nop = true;
 
 	  /* STD is *not* affected.  */
-	  else if ((mem = mem_ref (dest)) != NULL_RTX
-		   && GET_MODE_SIZE (GET_MODE (mem)) <= 4
-		   && (src == const0_rtx
+	  else if (MEM_P (dest)
+		   && GET_MODE_SIZE (GET_MODE (dest)) <= 4
+		   && (src == CONST0_RTX (GET_MODE (dest))
 			   || (REG_P (src)
 			   && REGNO (src) < 32
 			   && REGNO (src) != REGNO (x)))
-		   && !reg_mentioned_p (x, XEXP (mem, 0)))
+		   && !reg_mentioned_p (x, XEXP (dest, 0)))
 		insert_nop = true;
 	}
 	}

Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 10:34 AM, Ilya Tocar  wrote:
> On 17 Mar 22:18, Ulrich Drepper wrote:
>> On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar  wrote:
>>
>> > undefined is similar in behavior to setzero, but it also clobbers
>> > flags. Maybe just define it to setzero for now?
>> >
>> >
>> What do you mean by "clobbers flags"?  Do you have an example?
>
> I've used the following example:
>
> #include 
>
> extern __inline __m512
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> _mm512_undefined_ps (void)
> {
>   __m512 __Y;
>   __asm__ ("" : "=x" (__Y));
>   return __Y;
> }

Try the following instead:

extern __inline __m512
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_undefined_ps (void)
{
  __m512 __Y = __Y;
  return __Y;
}

>
> __m512 foo1(__m512 __A)
> {
> return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
>   (__v16sf)
>  _mm512_undefined_ps (),
>   (__mmask16) -1);
> }
>
> __m512 foo2(__m512 __A)
> {
> return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
>   (__v16sf)
>  _mm512_setzero_ps (),
>   (__mmask16) -1);
> }
>
>
> In foo1 asm statement is expanded into following rtl:
>
> (insn 6 3 7 2 (parallel [
> (set (reg:V16SF 87 [ __Y ])
> (asm_operands:V16SF ("") ("=x") 0 []
>  []
>  [] foo.c:8))
> (clobber (reg:QI 18 fpsr))
> (clobber (reg:QI 17 flags))
> ]) foo.c:8 -1
>
> As you can see flags are clobbered by asm statement, while in setzero
> case (foo2) i have just:
> (insn 7 6 8 2 (set (reg:V16SF 88)
> (const_vector:V16SF [
> (const_double:SF 0.0 [0x0.0p+0])
> (const_double:SF 0.0 [0x0.0p+0])
> //rest of zeroes skipped.


Re: [PATCH] __builtin_expect with alternate predictors for Fortran (PR ipa/58721)

2014-03-18 Thread Richard Biener
On Sat, Mar 15, 2014 at 2:13 PM, Jakub Jelinek  wrote:
> Hi!
>
> Here is an updated patch for what Tobias has posted earlier:
> http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00043.html
> While that version bootstrapped/regtested fine, most of the Fortran
> tests ICEd, primarily because the 3 operand __builtin_expect wasn't being
> removed from the IL and for expansion we only allow it for !optimize or
> couple of similar cases.
>
> This (combined) patch fixes that, fixes a couple of if (*predictor) to
> if (predictor).  The biggest change is to introduce IFN_BUILTIN_EXPECT,
> because as we don't want to allow user code to specify 3+ argument
> __builtin_expect form, we probably don't want to make the builtin prototype
> a varargs function, but in that case it means e.g. gimple_builtin_p (stmt,
> BUILT_IN_EXPECT) will never match the 3 operand __builtin_expect.
> Also, predict.c would happily predict that &__gthrw___pthread_key_create != 0
> is PRED_UNCONDITIONALly true (and also that &__gthrw___pthread_key_create == 0
> is PRED_UNCONDITIONALly true), that is just wrong.
>
> I wanted to minimize the amount of changes for 4.9, so this patch only uses
> the internal fn for the 3 operand __builtin_expect, after branching I'd like
> to use it always and remove handling of non-internal __builtin_expect after
> gimplification.  The advantage could be e.g. that the argument/return value
> doesn't have to be necessarily long, we could just fold it at gimplification
> time.
>
> The predict.c changes affect inlining in libstdc++-v3/src/c++11/thread.cc
> somewhat, so it is not inlining one ctor any longer, Jonathan has kindly
> committed a gnu.ver fix for that yesterday.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-03-15  Jakub Jelinek  
>
> PR ipa/58721
> gcc/
> * internal-fn.c: Include diagnostic-core.h.
> (expand_BUILTIN_EXPECT): New function.
> * gimplify.c (gimplify_call_expr): Use false instead of FALSE.
> (gimplify_modify_expr): Gimplify 3 argument __builtin_expect into
> IFN_BUILTIN_EXPECT call instead of __builtin_expect builtin call.
> * ipa-inline-analysis.c (find_foldable_builtin_expect): Handle
> IFN_BUILTIN_EXPECT.
> * predict.c (expr_expected_value_1): Handle IFN_BUILTIN_EXPECT.
> Revert 3 argument __builtin_expect code.
> (strip_predict_hints): Handle IFN_BUILTIN_EXPECT.
> * gimple-fold.c (gimple_fold_call): Likewise.
> * tree.h (fold_builtin_expect): New prototype.
> * builtins.c (build_builtin_expect_predicate): Add predictor
> argument, if non-NULL, create 3 argument __builtin_expect.
> (fold_builtin_expect): No longer static.  Add ARG2 argument,
> pass it through to build_builtin_expect_predicate.
> (fold_builtin_2): Adjust caller.
> (fold_builtin_3): Handle BUILT_IN_EXPECT.
> * internal-fn.def (BUILTIN_EXPECT): New.
> gcc/fortran/
> * trans.c (gfc_unlikely, gfc_likely): Don't add __builtin_expect
> if !optimize.
>
> 2014-03-15  Tobias Burnus  
>
> PR ipa/58721
> gcc/
> * predict.def (PRED_FORTRAN_OVERFLOW, PRED_FORTRAN_FAIL_ALLOC,
> PRED_FORTRAN_FAIL_IO, PRED_FORTRAN_WARN_ONCE, PRED_FORTRAN_SIZE_ZERO,
> PRED_FORTRAN_INVALID_BOUND, PRED_FORTRAN_ABSENT_DUMMY): Add.
> gcc/fortran/
> * trans.h (gfc_unlikely, gfc_likely): Add predictor as argument.
> (gfc_trans_io_runtime_check): Remove.
> * trans-io.c (gfc_trans_io_runtime_check): Make static; add has_iostat
> as argument, add predictor to block.
> (set_parameter_value, gfc_trans_open, gfc_trans_close, build_filepos,
> gfc_trans_inquire, gfc_trans_wait, build_dt): Update calls.
> * trans.c (gfc_unlikely, gfc_likely): Add predictor as argument.
> (gfc_trans_runtime_check, gfc_allocate_using_malloc,
> gfc_allocate_allocatable, gfc_deallocate_with_status): Set explicitly
> branch predictor.
> * trans-expr.c (gfc_conv_procedure_call): Ditto.
> * trans-stmt.c (gfc_trans_allocate): Ditto.
> * trans-array.c (gfc_array_init_size, gfc_array_allocate): Ditto.
>
> 2014-03-15  Jan Hubicka  
>
> PR ipa/58721
> gcc/
> * predict.c (combine_predictions_for_bb): Fix up formatting.
> (expr_expected_value_1, expr_expected_value): Add predictor argument,
> fill what it points to if non-NULL.
> (tree_predict_by_opcode): Adjust caller, use the predictor.
> * predict.def (PRED_COMPARE_AND_SWAP): Add.
>
> --- gcc/predict.c.jj2014-01-03 11:40:46.957378605 +0100
> +++ gcc/predict.c   2014-03-14 13:16:15.246017052 +0100
> @@ -956,7 +956,8 @@ combine_predictions_for_bb (basic_block
>struct edge_prediction *pred2;
>   int prob = probability;
>
> -  for (pred2 = (struct edge_prediction *) *preds; pred2; pred2 = 
> pred2->e

Re: Ping^3 GCC trunk 4.9: documentation patch on plugins

2014-03-18 Thread Diego Novillo
On Tue, Mar 18, 2014 at 2:12 AM, Basile Starynkevitch
 wrote:
> On Sat, 2014-03-08 at 11:15 +0100, Basile Starynkevitch wrote:
>> I am pinging again this documentation patch
>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00074.html
>> (pinged at http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01002.html on
>> Feb. 17th 2014)
> and also pinged at
> http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00387.html on march 8th
> 2014

Apologies for the delay. Please feel free to include me for patches I
may be able to help with.

>  gcc/ChangeLog entry
>
> 2014-03-18  Basile Starynkevitch  
>
> * doc/plugins.texi (Plugin callbacks): Mention
> PLUGIN_INCLUDE_FILE.
> Italicize plugin event names in description.  Explain that
> PLUGIN_PRAGMAS has no sense for lto1. Explain
> PLUGIN_INCLUDE_FILE.
> Remind that no GCC functions should be called after
> PLUGIN_FINISH.
> Explain what pragmas with expansion are.
>
>  the patch:
> Index: gcc/doc/plugins.texi
> ===
> --- gcc/doc/plugins.texi(revision 207422)
> +++ gcc/doc/plugins.texi(working copy)
> @@ -209,6 +209,10 @@
>PLUGIN_EARLY_GIMPLE_PASSES_END,
>/* Called when a pass is first instantiated.  */
>PLUGIN_NEW_PASS,
> +/* Called when a file is #include-d or given thru #line directive.

s/given thru/given via the/

> +   Could happen many times.  The event data is the included file path,

s/Could/This could/

> +Pragmas registered with @code{c_register_pragma_with_expansion} or
> +@code{c_register_pragma_with_expansion_and_data} are allowing
> +preprocessor expansions, like e.g.

I can't parse the last bit: "... are allowing preprocessor expansions,
like e.g.".  Did you mean something like "support preprocessor
expansions.  For example,"?


Diego.


Re: [PATCH] Fix PR60505

2014-03-18 Thread Richard Biener
On Mon, 17 Mar 2014, Cong Hou wrote:

> On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener  wrote:
> > On Fri, 14 Mar 2014, Cong Hou wrote:
> >
> >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener  wrote:
> >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote:
> >> >
> >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
> >> >> > > Consider this fact and if there are alias checks, we can safely 
> >> >> > > remove
> >> >> > > the epilogue if the maximum trip count of the loop is less than or
> >> >> > > equal to the calculated threshold.
> >> >> >
> >> >> > You have to consider n % vf != 0, so an argument on only maximum
> >> >> > trip count or threshold cannot work.
> >> >>
> >> >> Well, if you only check if maximum trip count is <= vf and you know
> >> >> that for n < vf the vectorized loop + its epilogue path will not be
> >> >> taken, then perhaps you could, but it is a very special case.
> >> >> Now, the question is when we are guaranteed we enter the scalar 
> >> >> versioned
> >> >> loop instead for n < vf, is that in case of versioning for alias or
> >> >> versioning for alignment?
> >> >
> >> > I think neither - I have plans to do the cost model check together
> >> > with the versioning condition but didn't get around to implement that.
> >> > That would allow stronger max bounds for the epilogue loop.
> >>
> >> In vect_transform_loop(), check_profitability will be set to true if
> >> th >= VF-1 and the number of iteration is unknown (we only consider
> >> unknown trip count here), where th is calculated based on the
> >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
> >> value VF-1. If the loop needs to be versioned, then
> >> check_profitability with true value will be passed to
> >> vect_loop_versioning(), in which an enhanced loop bound check
> >> (considering cost) will be built. So I think if the loop is versioned
> >> and n < VF, then we must enter the scalar version, and in this case
> >> removing epilogue should be safe when the maximum trip count <= th+1.
> >
> > You mean exactly in the case where the profitability check ensures
> > that n % vf == 0?  Thus effectively if n == maximum trip count?
> > That's quite a special case, no?
> 
> 
> Yes, it is a special case. But it is in this special case that those
> warnings are thrown out. Also, I think declaring an array with VF*N as
> length is not unusual.

Ok, but then for the patch compute the cost model threshold once
in vect_analyze_loop_2 and store it in a new
LOOP_VINFO_COST_MODEL_THRESHOLD.  Also you have to check
the return value from max_stmt_executions_int as that may return
-1 if the number cannot be computed (or isn't representable in
a HOST_WIDE_INT).  You also should check for
LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the
same effect on the cost model check.

The existing condition is already complicated enough - adding new
stuff warrants comments before the (sub-)checks.

Richard.
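
The conditions discussed in this thread can be put together in a small
sketch.  This is plain C, not GCC internals: the function name, its
parameters and the trip_count_t type are made up for illustration.  Only
the conditions themselves come from the thread: the loop must be versioned
(for alias or for alignment), the maximum trip count must be known (the
thread notes that max_stmt_executions_int may return -1), and it must not
exceed th+1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the value returned by
   max_stmt_executions_int; -1 means the bound is unknown.  */
typedef long long trip_count_t;

/* Return true if, under the assumptions discussed in the thread, the
   epilogue loop could be omitted.  All names here are illustrative.  */
bool
epilogue_can_be_dropped (bool versioned_for_alias,
                         bool versioned_for_alignment,
                         trip_count_t max_trip_count,
                         trip_count_t cost_model_threshold)
{
  /* Without versioning there is no run-time check that sends short
     loops to the scalar version.  */
  if (!versioned_for_alias && !versioned_for_alignment)
    return false;

  /* The bound could not be computed, so nothing can be concluded.  */
  if (max_trip_count < 0)
    return false;

  /* The condition proposed in the thread: maximum trip count <= th+1.  */
  return max_trip_count <= cost_model_threshold + 1;
}
```

With a threshold of 3, a versioned loop with a maximum trip count of 4
qualifies, while an unknown bound (-1) or an unversioned loop does not.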


Re: [PATCH] BZ60501: Add addptr optab

2014-03-18 Thread Jakub Jelinek
On Mon, Mar 17, 2014 at 03:24:14PM -0400, Vladimir Makarov wrote:
> It is complicated.  There is no guarantee that it is used only for
> addresses.  I need some time to think how to fix it.
> 
> Meanwhile, you *should* commit the patch into the trunk because it
> solves the real problem.  And I can work from this to make changes
> that the new pattern is only used for addresses.
> 
> The patch is absolutely safe for all targets but s390.  There is
> still a tiny possibility that it might result in some problems for
> s390  (now I see only one situation when a pseudo in a subreg
> changed by equiv plus expr needs a reload).  In any case your patch
> solves real numerous failures and can be used as a base for further
> work.
> 
> Thanks for working on this problem, Andreas.  Sorry that I missed
> the PR60501.

BTW, does LRA require that CC isn't clobbered when it uses
emit_add2_insn? I don't see how it can be guaranteed
(except perhaps on i?86/x86_64 and maybe a few other targets).
emit_add3_insn should be ok (even on s390*?) because recog_memoized
should (I think) never add clobbers (it calls recog with 0
as last argument), but gen_add2_insn is a normal add3 insn that
on many targets clobbers CC.

Jakub


[patch sdbout]: Fix PR rtl-optimization/56356

2014-03-18 Thread Kai Tietz
Hi,

this patch fixes an ICE regarding COFF debugging information.  The problem
is that a parm isn't necessarily an incoming argument, and its RTL isn't
necessarily set yet.  By checking for this, as dbxout already does, we
avoid processing invalid parm declarations.
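
The guard can be pictured with a simplified model.  This is only a sketch:
struct mock_parm and both functions are hypothetical stand-ins, since the
real code works on tree nodes through the macros named in the comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PARM_DECL.  */
struct mock_parm
{
  const char *name;           /* models DECL_NAME (parm) */
  bool type_is_error;         /* models TREE_TYPE (parm) == error_mark_node */
  bool rtl_set;               /* models DECL_RTL_SET_P (parm) */
  const void *incoming_rtl;   /* models DECL_INCOMING_RTL (parm) */
  struct mock_parm *chain;    /* models TREE_CHAIN (parm) */
};

/* The combined condition: only such parms are safe to describe.  */
static bool
parm_is_describable (const struct mock_parm *p)
{
  return p->name != NULL
         && !p->type_is_error
         && p->rtl_set
         && p->incoming_rtl != NULL;
}

/* Walk the chain the way the loops in the patch do and count the parms
   that pass the guard.  */
size_t
count_describable_parms (const struct mock_parm *parms)
{
  size_t n = 0;
  for (; parms; parms = parms->chain)
    if (parm_is_describable (parms))
      n++;
  return n;
}
```

A parm without a name, or one whose RTL has not been set, is simply
skipped instead of being dereferenced later.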

ChangeLog

2014-03-18  Kai Tietz  

PR rtl-optimization/56356
* sdbout.c (sdbout_parms): Verify that parms'
incoming argument is valid.
(sdbout_reg_parms): Likewise.

Regression tested on x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and
i686-w64-mingw32.  OK to apply?

Regards,
Kai

Index: sdbout.c
===
--- sdbout.c(revision 208594)
+++ sdbout.c(working copy)
@@ -1229,7 +1229,10 @@ static void
 sdbout_parms (tree parms)
 {
   for (; parms; parms = TREE_CHAIN (parms))
-if (DECL_NAME (parms))
+if (DECL_NAME (parms)
+&& TREE_TYPE (parms) != error_mark_node
+&& DECL_RTL_SET_P (parms)
+&& DECL_INCOMING_RTL (parms))
   {
 int current_sym_value = 0;
 const char *name = IDENTIFIER_POINTER (DECL_NAME (parms));
@@ -1361,7 +1364,10 @@ static void
 sdbout_reg_parms (tree parms)
 {
   for (; parms; parms = TREE_CHAIN (parms))
-if (DECL_NAME (parms))
+if (DECL_NAME (parms)
+&& TREE_TYPE (parms) != error_mark_node
+&& DECL_RTL_SET_P (parms)
+&& DECL_INCOMING_RTL (parms))
   {
 const char *name = IDENTIFIER_POINTER (DECL_NAME (parms));


[PATCH] Fix my name in contrib.texi

2014-03-18 Thread Richard Biener

Committed as obvious.

Richard.

2014-03-18  Richard Biener  

* doc/contrib.texi: Adjust my name.

Index: gcc/doc/contrib.texi
===
--- gcc/doc/contrib.texi(revision 208642)
+++ gcc/doc/contrib.texi(working copy)
@@ -335,7 +335,7 @@ Stu Grossman for gdb hacking, allowing G
 Michael K. Gschwind contributed the port to the PDP-11.
 
 @item
-Richard Guenther for his ongoing middle-end contributions and bug fixes
+Richard Biener for his ongoing middle-end contributions and bug fixes
 and for release management.
 
 @item


Re: [Patch AArch64] Remove unnecesssary definition of MEMORY_MOVE_COST

2014-03-18 Thread Marcus Shawcroft

On 18/03/14 09:43, Ramana Radhakrishnan wrote:

Hi,

While looking at something else I realized that we had MEMORY_MOVE_COST
defined in the backend. However we also have the more recent target hook
defined for this through TARGET_MEMORY_MOVE_COST making it obvious to
remove this definition, given that the only use of the macro
MEMORY_MOVE_COST is in the default target hook implementation for
TARGET_MEMORY_MOVE_COST :)

OK for stage4?  I just rebuilt the compiler (cc1 and cc1plus), built a few
large enough .i files that I had lying around, and saw no difference in the
generated code, as expected.

regards,
Ramana

  Ramana Radhakrishnan  

* config/aarch64/aarch64.h (MEMORY_MOVE_COST): Delete.



OK, but leave 24 hours for the RMs to object...

/M



[PATCH] Document -fresulution

2014-03-18 Thread Richard Biener

Just found another patch in my local tree.

Committed as obvious.

Richard.

2014-03-18  Richard Biener  

* doc/lto.texi (-fresolution): Document.

Index: gcc/doc/lto.texi
===
--- gcc/doc/lto.texi(revision 208642)
+++ gcc/doc/lto.texi(working copy)
@@ -586,4 +586,10 @@ optimizes an object and produces the fin
 This option specifies a file to which the names of LTRANS output
 files are written.  This option is only meaningful in conjunction
 with @option{-fwpa}.
+
+@item -fresolution=@var{file}
+@opindex fresolution
+This option specifies the linker resolution file.  This option is
+only meaningful in conjunction with @option{-fwpa} and as option
+to pass through to the LTO linker pluign.
 @end itemize


[PATCH] Loop docs

2014-03-18 Thread Richard Biener

Another one.

Committed.

Richard.

2014-03-18  Richard Biener  

* doc/loop.texi: Remove section on the removed lambda framework.
Update loop docs with recent changes in preserving loop structure.

Index: gcc/doc/loop.texi
===
--- gcc/doc/loop.texi   (revision 208642)
+++ gcc/doc/loop.texi   (working copy)
@@ -25,7 +25,6 @@ variable analysis and number of iteratio
 * loop-iv:: Induction variables on RTL.
 * Number of iterations::Number of iterations analysis.
 * Dependency analysis:: Data dependency analysis.
-* Lambda::  Linear loop transformations framework.
 * Omega::   A solver for linear programming problems.
 @end menu
 
@@ -37,10 +36,13 @@ variable analysis and number of iteratio
 This chapter describes the representation of loops in GCC, and functions
 that can be used to build, modify and analyze this representation.  Most
 of the interfaces and data structures are declared in @file{cfgloop.h}.
-At the moment, loop structures are analyzed and this information is
-updated only by the optimization passes that deal with loops, but some
-efforts are being made to make it available throughout most of the
-optimization passes.
+Loop structures are analyzed and this information disposed or updated
+at the discretion of individual passes.  Still most of the generic
+CFG manipulation routines are aware of loop structures and try to
+keep them up-to-date.  By this means an increasing part of the
+compilation pipeline is setup to maintain loop structure across
+passes to allow attaching meta information to individual loops
+for consumption by later passes.
 
 In general, a natural loop has one entry block (header) and possibly
 several back edges (latches) leading to the header from the inside of
@@ -139,9 +141,13 @@ recorded.
 These properties may also be computed/enforced later, using functions
 @code{create_preheaders}, @code{force_single_succ_latches},
 @code{mark_irreducible_loops} and @code{record_loop_exits}.
+The properties can be queried using @code{loops_state_satisfies_p}.
 
 The memory occupied by the loops structures should be freed with
-@code{loop_optimizer_finalize} function.
+@code{loop_optimizer_finalize} function.  When loop structures are
+setup to be preserved across passes this function reduces the
+information to be kept up-to-date to a minimum (only
+@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} set).
 
 The CFG manipulation functions in general do not update loop structures.
 Specialized versions that additionally do so are provided for the most
@@ -149,6 +155,10 @@ common tasks.  On GIMPLE, @code{cleanup_
 used to cleanup CFG while updating the loops structures if
 @code{current_loops} is set.
 
+At the moment loop structure is preserved from the start of GIMPLE
+loop optimizations until the end of RTL loop optimizations.  During
+this time a loop can be tracked by its @code{struct loop} and number.
+
 @node Loop querying
 @section Loop querying
 @cindex Loop querying
@@ -593,37 +603,6 @@ direction vectors for a data dependence
 @code{dump_data_references} prints the details of the data references
 contained in a data reference array.
 
-@node Lambda
-@section Linear loop transformations framework
-@cindex Linear loop transformations framework
-
-Lambda is a framework that allows transformations of loops using
-non-singular matrix based transformations of the iteration space and
-loop bounds. This allows compositions of skewing, scaling, interchange,
-and reversal transformations.  These transformations are often used to
-improve cache behavior or remove inner loop dependencies to allow
-parallelization and vectorization to take place.
-
-To perform these transformations, Lambda requires that the loopnest be
-converted into an internal form that can be matrix transformed easily.
-To do this conversion, the function
-@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
-be transformed using lambda, this function will return NULL.
-
-Once a @code{lambda_loopnest} is obtained from the conversion function,
-it can be transformed by using @code{lambda_loopnest_transform}, which
-takes a transformation matrix to apply.  Note that it is up to the
-caller to verify that the transformation matrix is legal to apply to the
-loop (dependence respecting, etc).  Lambda simply applies whatever
-matrix it is told to provide.  It can be extended to make legal matrices
-out of any non-singular matrix, but this is not currently implemented.
-Legality of a matrix for a given loopnest can be verified using
-@code{lambda_transform_legal_p}.
-
-Given a transformed loopnest, conversion back into gcc IR is done by
-@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
-loops so that they match the transformed loopnest.
-
 
 @node Omega
 @section Omega a solver for linear programming problems


Re: [PATCH] Update -flto docs wrt option handling

2014-03-18 Thread Richard Biener
On Tue, 11 Mar 2014, Richard Biener wrote:

> On Sat, 8 Mar 2014, Gerald Pfeifer wrote:
> 
> > Thanks for the time and diligence writing this up, Richi!
> > 
> > On Thu, 6 Mar 2014, Richard Biener wrote:
> > > -files; if @option{-flto} is not passed to the linker, no
> > > -interprocedural optimizations are applied.
> > > +files; if @option{-fno-lto} is not passed to the linker, no
> > > +interprocedural optimizations are applied.
> > 
> > That looks like one "no" too much?  
> 
> Fixed.
> 
> > >  Note that when
> > > +@option{-fno-fat-lto-objects} is enabled the compile-stage is faster
> > > +but you cannot perform a regular, non-LTO link, on them.
> > 
> > The comma past "link" appears too much.
> 
> Fixed.
> 
> > >  Additionally, the optimization flags used to compile individual files
> > >  are not necessarily related to those used at link time.  For instance,
> > 
> > That requires -ffat-lto-objects, though?  The text above talks more
> > about -fno-fat-lto-objects, not the positive form.
> 
> Doesn't require, no.  Unfortunately the default depends on some
> configure checks ... so the positive form below is required on
> some systems to make the -fno-lto link work.
> 
> > >  @smallexample
> > > -gcc -c -O0 -flto foo.c
> > > -gcc -c -O0 -flto bar.c
> > > -gcc -o myprog -flto -O3 foo.o bar.o
> > > +gcc -c -O0 -ffat-lto-objects -flto foo.c
> > > +gcc -c -O0 -ffat-lto-objects -flto bar.c
> > > +gcc -o myprog -O3 foo.o bar.o
> > >  @end smallexample
> > >  
> > >  This produces individual object files with unoptimized assembler
> > >  code, but the resulting binary @file{myprog} is optimized at
> > > -@option{-O3}.  If, instead, the final binary is generated without
> > > -@option{-flto}, then @file{myprog} is not optimized.
> > > +@option{-O3}.  If, instead, the final binary is generated with
> > > +@option{-fno-lto}, then @file{myprog} is not optimized.
> > 
> > Would it make sense to use -Os in the example?  I assume in the
> > last case myprog would then by optimized with -Os?  
> 
> You mean -Os instead of -O0?
> 
> > I am suggesting this since I believe it's not optimization vs
> > no optimization but "optimization level provided during compilation"?
> 
> Yes.  But we were motivating the -O0 vs. -On case with fat objects
> because you can get a debug build quickly with -fno-lto and
> an optimized build otherwise (without the need to re-compile).
> Not sure if that matters in practice ... but that's what the example
> tries to show you how to do.
> 
> [I've merely edited existing parts to reflect reality in 4.9
> due to changed defaults - the whole section should be rewritten
> to be more FAQ-like.  That is, "You want to do X?  Here is
> how to do it!"]
> 
> > > +Currently, the following options and their setting are take from
> > > +the first object file that explicitely specified it: 
> > > +@option{-fPIC}, @option{-fpic}, @option{-fpie}, @option{-fcommon},
> > > +@option{-fexceptions}, @option{-fnon-call-exceptions}, @option{-fgnu-tm}
> > > +and all the @option{-m} target flags.
> > 
> > No -O options in case none are provided during link time?
> 
> See below, "If you do not specify an optimization level option ...".
> I've moved this to the very top.
> 
> > > +Certain ABI changing flags are required to match in all compilation-units
> > > +and trying to override this at link-time with a conflicting value
> > > +is ignored.  This includes options such as @option{-freg-struct-return}
> > > +and @option{-fpcc-struct-return}. 
> > 
> > > If they are required to match, shouldn't a conflicting value during
> > > link time trigger a diagnostic -- error or at least warning?
> > 
> Yes, but unfortunately all diagnostics from link time are buffered
> by collect2 and thus emitted very late.  So we don't emit any
> but fatal diagnostics from lto-wrapper.
> 
> > > +Other options such as @option{-ffp-contract}, 
> > > @option{-fno-strict-overflow},
> > > +@option{-fwrapv}, @option{-fno-trapv} or @option{-fno-strict-aliasing}
> > > +are passed through to the link stage and merged conservatively for
> > > +conflicting translation units.  You can override them at linke-time.
> > 
> > What does conservative merging imply?  How does that work?
> 
> I've added
> 
> "Specifically
> @option{-fno-strict-overflow}, @option{-fwrapv} and @option{-fno-trapv} 
> take
> precedence and for example @option{-ffp-contract=off} takes precedence
> over @option{-ffp-contract=fast}.  You can override them at linke-time."
> 
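
As a rough illustration of the precedence rule quoted above (not taken
from the patch): the enum and both function names below are hypothetical,
and only the two -ffp-contract values mentioned in the text are modelled.

```c
#include <assert.h>

/* Hypothetical model of the two -ffp-contract settings named above.  */
enum fp_contract
{
  FP_CONTRACT_OFF,   /* -ffp-contract=off */
  FP_CONTRACT_FAST   /* -ffp-contract=fast */
};

/* Merge the settings of two translation units: "off" takes precedence
   over "fast", so the result is "fast" only if both units asked for it.  */
enum fp_contract
merge_fp_contract (enum fp_contract a, enum fp_contract b)
{
  return (a == FP_CONTRACT_OFF || b == FP_CONTRACT_OFF)
         ? FP_CONTRACT_OFF : FP_CONTRACT_FAST;
}

/* Merge an on/off option such as -fwrapv: it takes precedence, so it
   stays enabled if either translation unit enabled it.  */
int
merge_wrapv (int a_enabled, int b_enabled)
{
  return a_enabled || b_enabled;
}
```

In both cases the merged value is the one that is safe for every unit
involved, which is what "conservatively" refers to in the quoted text.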
> 
> > > +same link with the same options and also specify those options at
> > > +link-time.
> > 
> > "link time" (noun)
> 
> Fixed.
> 
> > > -GCC will not work with an older/newer version of GCC@.
> > > +GCC will not work with an older/newer version of GCC.
> > 
> > What is a version here?  Release series?
> > 
> > Will GCC 4.9.0 and 4.9.1 work, or not?
> 
> We make no guarantees ;)  Specifically the implemented
> bytecode version check is not strong enough :/
> 
> Updated patch below.
> 
> Ok?

I have com

Re: Ping^3 GCC trunk 4.9: documentation patch on plugins

2014-03-18 Thread Basile Starynkevitch
Hello Diego & all,

Here is a slightly improved patch to follow Diego's comments on 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00863.html

(since Diego improved the wording extracted from comments 
in gcc/plugin.def I am also patching the comments in that file).

### gcc/Changelog entry
2014-03-18  Basile Starynkevitch  
* plugin.def: Improve comment for PLUGIN_INCLUDE_FILE.

* doc/plugins.texi (Plugin callbacks): Mention
PLUGIN_INCLUDE_FILE.
Italicize plugin event names in description.  Explain that
PLUGIN_PRAGMAS has no sense for lto1. Explain
PLUGIN_INCLUDE_FILE.
Remind that no GCC functions should be called after
PLUGIN_FINISH.
Explain what pragmas with expansion are.


the [improved] patch against trunk 208643 is attached. 

Ok for GCC trunk 4.9?

Cheers
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
Index: gcc/doc/plugins.texi
===
--- gcc/doc/plugins.texi	(revision 208643)
+++ gcc/doc/plugins.texi	(working copy)
@@ -209,6 +209,10 @@
   PLUGIN_EARLY_GIMPLE_PASSES_END,
   /* Called when a pass is first instantiated.  */
   PLUGIN_NEW_PASS,
+/* Called when a file is #include-d or given via the #line directive.
+   This could happen many times.  The event data is the included file path,
+   as a const char* pointer.  */
+  PLUGIN_INCLUDE_FILE,
 
   PLUGIN_EVENT_FIRST_DYNAMIC/* Dummy event used for indexing callback
array.  */
@@ -229,15 +233,27 @@
 @item @code{void *user_data}: Pointer to plugin-specific data.
 @end itemize
 
-For the PLUGIN_PASS_MANAGER_SETUP, PLUGIN_INFO, PLUGIN_REGISTER_GGC_ROOTS
-and PLUGIN_REGISTER_GGC_CACHES pseudo-events the @code{callback} should be
-null, and the @code{user_data} is specific.
+For the @i{PLUGIN_PASS_MANAGER_SETUP}, @i{PLUGIN_INFO},
+@i{PLUGIN_REGISTER_GGC_ROOTS} and @i{PLUGIN_REGISTER_GGC_CACHES}
+pseudo-events the @code{callback} should be null, and the
+@code{user_data} is specific.
 
-When the PLUGIN_PRAGMAS event is triggered (with a null
-pointer as data from GCC), plugins may register their own pragmas
-using functions like @code{c_register_pragma} or
-@code{c_register_pragma_with_expansion}.
+When the @i{PLUGIN_PRAGMAS} event is triggered (with a null pointer as
+data from GCC), plugins may register their own pragmas.  Notice that
+pragmas are not available from @file{lto1}, so plugins used with
+@code{-flto} option to GCC during link-time optimization cannot use
+pragmas and do not even see functions like @code{c_register_pragma} or
+@code{pragma_lex}.
 
+The @i{PLUGIN_INCLUDE_FILE} event, with a @code{const char*} file path as
+GCC data, is triggered for processing of @code{#include} or
+@code{#line} directives.
+
+The @i{PLUGIN_FINISH} event is the last time that plugins can call GCC
+functions, notably emit diagnostics with @code{warning}, @code{error}
+etc.
+
+
 @node Plugins pass
 @section Interacting with the pass manager
 
@@ -376,10 +392,13 @@
 @end smallexample
 
 
-The @code{PLUGIN_PRAGMAS} callback is called during pragmas
-registration. Use the @code{c_register_pragma} or
-@code{c_register_pragma_with_expansion} functions to register custom
-pragmas.
+The @i{PLUGIN_PRAGMAS} callback is called once during pragmas
+registration. Use the @code{c_register_pragma},
+@code{c_register_pragma_with_data},
+@code{c_register_pragma_with_expansion},
+@code{c_register_pragma_with_expansion_and_data} functions to register
+custom pragmas and their handlers (which often want to call
+@code{pragma_lex}) from @file{c-family/c-pragma.h}.
 
 @smallexample
 /* Plugin callback called during pragmas registration. Registered with
@@ -397,7 +416,15 @@
 It is suggested to pass @code{"GCCPLUGIN"} (or a short name identifying
 your plugin) as the ``space'' argument of your pragma.
 
+Pragmas registered with @code{c_register_pragma_with_expansion} or
+@code{c_register_pragma_with_expansion_and_data} are supporting
+preprocessor expansions. For an example of using such a pragma:
 
+@smallexample
+#define NUMBER 10
+#pragma GCCPLUGIN foothreshold (NUMBER)
+@end smallexample
+
 @node Plugins recording
 @section Recording information about pass execution
 
Index: gcc/plugin.def
===
--- gcc/plugin.def	(revision 208643)
+++ gcc/plugin.def	(working copy)
@@ -92,8 +92,8 @@
 /* Called when a pass is first instantiated.  */
 DEFEVENT (PLUGIN_NEW_PASS)
 
-/* Called when a file is #include-d or given thru #line directive.
-   Could happen many times.  The event data is the included file path,
+/* Called when a file is #include-d or given via the #line directive.
+   this could happen many times.  The event data is the included file path,
as a

Re: [PATCH] Document -fresulution

2014-03-18 Thread Rainer Orth
Richard Biener  writes:

> +@item -fresolution=@var{file}
> +@opindex fresolution
> +This option specifies the linker resolution file.  This option is
> +only meaningful in conjunction with @option{-fwpa} and as option
> +to pass through to the LTO linker pluign.
 ^ typo: plugin

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Avoid two bitmap_copies in df

2014-03-18 Thread Richard Biener

I found this patch in my local tree which avoids copying a bitmap
by doing

bitmap_and_compl (&tmp, op2, dense_invalidated);

instead of

bitmap_copy (&tmp, op2);
bitmap_and_compl_into (&tmp, dense_invalidated);

which, besides being faster, should also reduce peak memory
usage.
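
The equivalence can be checked on a toy model.  The sketch below uses a
single 64-bit word instead of GCC's bitmap type; bits_t and all three
functions are illustrative names, with the GCC calls they stand for given
in the comments.  It also shows why the and-compl may be done before
clearing ranges: both steps only clear bits, so their order does not
matter.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 64-bit bitmap, not GCC's bitmap implementation.  */
typedef uint64_t bits_t;

/* Old form: copy first, then clear the masked bits in place.  */
bits_t
and_compl_two_step (bits_t src, bits_t mask)
{
  bits_t tmp = src;   /* models bitmap_copy (&tmp, src) */
  tmp &= ~mask;       /* models bitmap_and_compl_into (&tmp, mask) */
  return tmp;
}

/* New form: compute the result directly into the destination.  */
bits_t
and_compl_one_step (bits_t src, bits_t mask)
{
  return src & ~mask; /* models bitmap_and_compl (&tmp, src, mask) */
}

/* Clear COUNT bits starting at bit START (START < 64), in the spirit
   of bitmap_clear_range.  */
bits_t
clear_range (bits_t b, unsigned start, unsigned count)
{
  bits_t ones = count >= 64 ? ~(bits_t) 0 : (((bits_t) 1 << count) - 1);
  return b & ~(ones << start);
}
```

For any inputs the two and-compl variants agree, and applying
clear_range before or after the and-compl gives the same word.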

Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok for trunk?

Thanks,
Richard.

2014-03-18  Richard Biener  

* df-problems.c (df_rd_confluence_n): Avoid bitmap_copy
by using bitmap_and_compl instead of bitmap_and_compl_into.
(df_rd_transfer_function): Likewise.

Index: gcc/df-problems.c
===
--- gcc/df-problems.c   (revision 208642)
+++ gcc/df-problems.c   (working copy)
@@ -479,8 +479,7 @@ df_rd_confluence_n (edge e)
   bitmap_head tmp;
 
   bitmap_initialize (&tmp, &df_bitmap_obstack);
-  bitmap_copy (&tmp, op2);
-  bitmap_and_compl_into (&tmp, dense_invalidated);
+  bitmap_and_compl (&tmp, op2, dense_invalidated);
 
   EXECUTE_IF_SET_IN_BITMAP (sparse_invalidated, 0, regno, bi)
{
@@ -524,14 +523,13 @@ df_rd_transfer_function (int bb_index)
   problem_data = (struct df_rd_problem_data *) df_rd->problem_data;
   bitmap_initialize (&tmp, &problem_data->rd_bitmaps);
 
-  bitmap_copy (&tmp, in);
+  bitmap_and_compl (&tmp, in, kill);
   EXECUTE_IF_SET_IN_BITMAP (sparse_kill, 0, regno, bi)
{
  bitmap_clear_range (&tmp,
  DF_DEFS_BEGIN (regno),
  DF_DEFS_COUNT (regno));
}
-  bitmap_and_compl_into (&tmp, kill);
   bitmap_ior_into (&tmp, gen);
   changed = !bitmap_equal_p (&tmp, out);
   if (changed)


Re: [PATCH] Document -fresulution

2014-03-18 Thread Richard Biener
On Tue, 18 Mar 2014, Rainer Orth wrote:

> Richard Biener  writes:
> 
> > +@item -fresolution=@var{file}
> > +@opindex fresolution
> > +This option specifies the linker resolution file.  This option is
> > +only meaningful in conjunction with @option{-fwpa} and as option
> > +to pass through to the LTO linker pluign.
>  ^ typo: plugin

Thanks - fixed.
Richard.


[PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.

2014-03-18 Thread Ilmir Usmanov

Hi Thomas!

This patch introduces support for the OpenACC loop directive (and the
combined directives) in the C front end, up to GENERIC.  Currently no
clauses are allowed.
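
For readers unfamiliar with the directives, here is a small sketch of
user code that uses them without any clauses.  It is not taken from the
new test case; the function names and N are made up.  A compiler without
OpenACC support ignores the pragmas, so the loops simply run serially.

```c
#include <assert.h>

#define N 16

/* Combined directive: "parallel" and "loop" on one line.  */
void
fill_squares (int out[N])
{
  int i;
#pragma acc parallel loop
  for (i = 0; i < N; i++)
    out[i] = i * i;
}

/* Separate directives: a "loop" nested inside a "kernels" region.  */
void
double_all (int v[N])
{
  int i;
#pragma acc kernels
  {
#pragma acc loop
    for (i = 0; i < N; i++)
      v[i] *= 2;
  }
}
```

Both loops write each element independently, so no clauses are needed to
describe them.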


This patch is necessary to finish the implementation of OpenACC 1.0 in the
Fortran front end.  As you know, the OpenACC Fortran implementation parses
and resolves the loop directive but does not yet transform it to
GENERIC.


Bootstrapped and tested with no new regressions on x86_64-unknown-linux-gnu.

OK for gomp4 branch?

--
Ilmir.
From 1adbb9d2e504c7acac8be89b053fa677cf285b42 Mon Sep 17 00:00:00 2001
From: Ilmir Usmanov 
Date: Tue, 18 Mar 2014 16:13:14 +0400
Subject: [PATCH] Initial support of OpenACC loop

---
	Initial support of OpenACC loop directive in C front-end.

	gcc/
	* tree.def (OACC_LOOP): New tree code.
	* tree-pretty-print.c (dump_generic_node): Show it.
	* tree.h (OACC_KERNELS_COMBINED, OACC_PARALLEL_COMBINED): New macros.
	* c-family/c-pragma.c (oacc_pragmas): Add loop directive.
	* c-family/c-pragma.h (enum pragma_kind): Likewise.
	* c/c-parser.c (c_parser_oacc_loop): New function.
	(c_parser_oacc_kernels): Parse combined directive.
	(c_parser_oacc_parallel): Likewise.
	(c_parser_omp_construct): Parse loop directive.
	* doc/generic.texi: Document loop directive.
	* gimplify.c (is_gimple_stmt, gimplify_expr): Stub gimplification of 
	loop directive and combined directives.
	* testsuite/c-c++-common/goacc/loop-1.c: New test.

diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index c8baba4..f99b087 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1172,6 +1172,7 @@ static const struct omp_pragma_def oacc_pragmas[] = {
   { "data", PRAGMA_OACC_DATA },
   { "kernels", PRAGMA_OACC_KERNELS },
   { "parallel", PRAGMA_OACC_PARALLEL },
+  { "loop", PRAGMA_OACC_LOOP },
 };
 static const struct omp_pragma_def omp_pragmas[] = {
   { "atomic", PRAGMA_OMP_ATOMIC },
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index d55a511..16c74a9 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -30,6 +30,7 @@ typedef enum pragma_kind {
   PRAGMA_OACC_DATA,
   PRAGMA_OACC_KERNELS,
   PRAGMA_OACC_PARALLEL,
+  PRAGMA_OACC_LOOP,
   PRAGMA_OMP_ATOMIC,
   PRAGMA_OMP_BARRIER,
   PRAGMA_OMP_CANCEL,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 3d8c0de..2300c6c 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11404,6 +11404,8 @@ c_parser_oacc_data (location_t loc, c_parser *parser)
   return stmt;
 }
 
+static tree c_parser_oacc_loop (location_t, c_parser *, char *);
+
 /* OpenACC 2.0:
# pragma acc kernels oacc-kernels-clause[optseq] new-line
  structured-block
@@ -11424,12 +11426,28 @@ c_parser_oacc_data (location_t loc, c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE) )
 
 static tree
-c_parser_oacc_kernels (location_t loc, c_parser *parser)
+c_parser_oacc_kernels (location_t loc, c_parser *parser, char *p_name)
 {
-  tree stmt, clauses, block;
+  tree stmt, clauses = NULL_TREE, block;
+
+  strcat (p_name, " kernels");
+
+  if (c_parser_next_token_is (parser, CPP_NAME))
+{
+  const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
+  if (strcmp (p, "loop") == 0)
+	{
+	  c_parser_consume_token (parser);
+	  block = c_begin_omp_parallel ();
+	  c_parser_oacc_loop (loc, parser, p_name);
+	  stmt = c_finish_oacc_kernels (loc, clauses, block);
+	  OACC_KERNELS_COMBINED (stmt) = 1;
+	  return stmt;
+	}
+}
 
   clauses =  c_parser_oacc_all_clauses (parser, OACC_KERNELS_CLAUSE_MASK,
-	"#pragma acc kernels");
+	p_name);
 
   block = c_begin_omp_parallel ();
   add_stmt (c_parser_omp_structured_block (parser));
@@ -11459,12 +11477,28 @@ c_parser_oacc_kernels (location_t loc, c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE) )
 
 static tree
-c_parser_oacc_parallel (location_t loc, c_parser *parser)
+c_parser_oacc_parallel (location_t loc, c_parser *parser, char *p_name)
 {
-  tree stmt, clauses, block;
+  tree stmt, clauses = NULL_TREE, block;
+
+  strcat (p_name, " parallel");
+
+  if (c_parser_next_token_is (parser, CPP_NAME))
+{
+  const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
+  if (strcmp (p, "loop") == 0)
+	{
+	  c_parser_consume_token (parser);
+	  block = c_begin_omp_parallel ();
+	  c_parser_oacc_loop (loc, parser, p_name);
+	  stmt = c_finish_oacc_parallel (loc, clauses, block);
+	  OACC_PARALLEL_COMBINED (stmt) = 1;
+	  return stmt;
+	}
+}
 
   clauses =  c_parser_oacc_all_clauses (parser, OACC_PARALLEL_CLAUSE_MASK,
-	"#pragma acc parallel");
+	p_name);
 
   block = c_begin_omp_parallel ();
   add_stmt (c_parser_omp_structured_block (parser));
@@ -12243,6 +12277,32 @@ omp_split_clauses (location_t loc, enum tree_code code,
   cclauses[i] = c_finish_omp_clauses (cclauses[i]);
 }
 
+/* OpenACC 2.0:
+   # pragma acc loop oacc-loop-clause[optseq] new-line
+ structured-block
+
+   LOC is the location of the #pragma toke

[PATCH] Fix gimple-fold

2014-03-18 Thread Martin Liška

Hello,
I found an ICE in Chromium compiled with LTO. There's a call that
ipa-devirt proves to be __builtin_unreachable; gimple-fold makes the
same decision and replaces the call with a GIMPLE_CALL and a
GIMPLE_ASSIGN (in this order). After that, the condition in
cgraph_update_edges_for_call_stmt_node is not satisfied and the
corresponding cgraph_edge is not updated, so the verifier reports a
wrong edge.


Bootstrapped and tested on an x86_64 machine.


Changelog:
2014-03-18  Martin Liska  

* cgraph.c (cgraph_update_edges_for_call_stmt_node): added case when
gimple call statement is updated.
* gimple-fold.c (gimple_fold_call): changed order for GIMPLE_ASSIGN and
GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.

OK for trunk?

Thank you,
Martin


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index a15b6bc..cd68894 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1519,7 +1519,11 @@ cgraph_update_edges_for_call_stmt_node (struct cgraph_node *node,
 		{
 		  if (callee->decl == new_call
 		  || callee->former_clone_of == new_call)
-		return;
+{
+  cgraph_set_call_stmt (cgraph_edge (node, old_stmt),
+new_stmt);
+		  return;
+}
 		  callee = callee->clone_of;
 		}
 	}
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index eafdb2d..a033fbc 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1153,8 +1153,14 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
 		{
 		  tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
 		  tree def = get_or_create_ssa_default_def (cfun, var);
-		  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
-		  update_call_from_tree (gsi, def);
+
+  /* To satisfy condition for
+ cgraph_update_edges_for_call_stmt_node,
+ we need to preserve GIMPLE_CALL statement
+ at position of GSI iterator.  */
+  gimple_stmt_iterator oldgsi = *gsi;
+		  gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT);
+		  update_call_from_tree (&oldgsi, def);
 		}
 		  else
 		gsi_replace (gsi, new_stmt, true);


Re: Ping^3 GCC trunk 4.9: documentation patch on plugins

2014-03-18 Thread Diego Novillo
OK with:

+Pragmas registered with @code{c_register_pragma_with_expansion} or
+@code{c_register_pragma_with_expansion_and_data} are supporting
+preprocessor expansions. For an example of using such a pragma:

s/are supporting/support/
s/For an example of using such a pragma/For example/


Diego.


Re: [PATCH] Avoid two bitmap_copies in df

2014-03-18 Thread Jakub Jelinek
On Tue, Mar 18, 2014 at 01:32:57PM +0100, Richard Biener wrote:
> 2014-03-18  Richard Biener  
> 
>   * df-problems.c (df_rd_confluence_n): Avoid bitmap_copy
>   by using bitmap_and_compl instead of bitmap_and_compl_into.
>   (df_rd_transfer_function): Likewise.

Ok, thanks.

Jakub


Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 1:38 PM, Martin Liška  wrote:
> Hello,
> I found ICE in Chromium compiled with LTO. There's a call that is proved
> by ipa-devirt as __builtin_unreachable; same decision is done by gimple-fold
> and this call is replaced by GIMPLE_CALL and GIMPLE_ASSIGN (in this order).
> After that condition for cgraph_update_edges_for_call_stmt_node is not
> satisfied and corresponding cgraph_edge is not updated. Thus a verifier
> reports a wrong edge.

You should be able to simply do

  update_call_from_tree (gsi, def);
  gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT);

also cgraph_edge (node, old_stmt) is already computed in 'e' AFAICS.
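
To make the ordering concrete, here is a toy model of the two calls
above. It is only a sketch: a std::list of strings stands in for the
statement sequence and a list iterator for the gsi. The names stmt_seq
and fold_to_unreachable are hypothetical and none of this is GCC API; it
just illustrates that rewriting the statement in place first, and then
inserting with "move to the new statement" semantics, leaves the
iterator on the call.

```cpp
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for a GIMPLE statement sequence (hypothetical, not GCC API).
typedef std::list<std::string> stmt_seq;

// Model of the suggested order.  GSI points at the statement being folded.
// Returns the iterator position after both steps.
stmt_seq::iterator
fold_to_unreachable (stmt_seq &seq, stmt_seq::iterator gsi)
{
  // Analogue of update_call_from_tree (gsi, def): the statement under the
  // iterator is rewritten in place into an assignment.
  *gsi = "lhs = tmp(D)";

  // Analogue of gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT): the new
  // call goes in front, and the iterator moves to the inserted statement.
  return seq.insert (gsi, "__builtin_unreachable ()");
}
```

In this model the returned iterator designates the call and the
assignment follows it, which is the state the cgraph update is said to
need.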

Richard.

> Bootstrapped and tested on a x86_64 machine.
>
>
> Changelog:
> 2014-03-18  Martin Liska  
>
> * cgraph.c (cgraph_update_edges_for_call_stmt_node): added case when
> gimple call statement is updated.
> * gimple-fold.c (gimple_fold_call): changed order for GIMPLE_ASSIGN
> and
> GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.
>
> OK for trunk?
>
> Thank you,
> Martin
>
>


Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Jakub Jelinek
Hi!

> > 2014-03-18  Martin Liska  
> >
> > * cgraph.c (cgraph_update_edges_for_call_stmt_node): added case when
> > gimple call statement is updated.

Capital letter after :

> > * gimple-fold.c (gimple_fold_call): changed order for GIMPLE_ASSIGN 
> > and

Likewise here.

Jakub


[gomp4] OpenACC num_gangs, num_workers, vector_length clauses

2014-03-18 Thread Thomas Schwinge
Hi!

In gomp-4_0-branch's r208648, I have just committed some ;-) support for
OpenACC num_gangs, num_workers, vector_length clauses (OpenACC parallel
directive), that is, just passing them through the compiler -- now,
libgomp just needs to do something useful with that information.
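
For illustration, user code exercising these clauses might look like the
sketch below. The function name scale and the clause values are made up
for the example, and the copy clause and the inner loop directive are
only there to keep the region well-formed. Whether a given compiler
revision accepts all of it is an assumption; a compiler without OpenACC
enabled ignores the pragmas and runs the loop serially.

```cpp
// Scale the first N elements of A by FACTOR.
// The three clauses on the parallel directive are the ones this commit
// passes through the compiler; the values 4, 2 and 32 are arbitrary.
void
scale (float *a, int n, float factor)
{
#pragma acc parallel num_gangs(4) num_workers(2) vector_length(32) copy(a[0:n])
  {
#pragma acc loop
    for (int i = 0; i < n; i++)
      a[i] *= factor;
  }
}
```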

commit 0104aa41048603ecc0bcfe747b735068557e4431
Author: tschwinge 
Date:   Tue Mar 18 13:09:24 2014 +

OpenACC num_gangs, num_workers, vector_length clauses.

gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_NUM_GANGS, PRAGMA_OMP_CLAUSE_NUM_WORKERS,
PRAGMA_OMP_CLAUSE_VECTOR_LENGTH.
gcc/c/
* c-parser.c (c_parser_omp_clause_num_gangs)
(c_parser_omp_clause_num_workers)
(c_parser_omp_clause_vector_length): New functions.
(c_parser_omp_clause_name, c_parser_oacc_all_clauses): Handle
PRAGMA_OMP_CLAUSE_NUM_GANGS, PRAGMA_OMP_CLAUSE_NUM_WORKERS,
PRAGMA_OMP_CLAUSE_VECTOR_LENGTH.
(OACC_PARALLEL_CLAUSE_MARK): Add these.
* c-typeck.c (c_finish_omp_clauses): Handle these.

gcc/
* builtin-types.def
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT): New type.
* oacc-builtins.def (BUILT_IN_GOACC_KERNELS)
(BUILT_IN_GOACC_PARALLEL): Switch to that one.
* gimplify.c (gimplify_scan_omp_clauses)
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_NUM_GANGS,
OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH.
* omp-low.c (scan_sharing_clauses, expand_oacc_offload): Likewise.
gcc/ada/
* gcc-interface/utils.c (DEF_FUNCTION_TYPE_10): Define.
gcc/c-family/
* c-common.c (DEF_FUNCTION_TYPE_10): Define.
gcc/fortran/
* f95-lang.c (DEF_FUNCTION_TYPE_10): Define.
* types.def
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT): New type.
gcc/lto/
* c-common.c (DEF_FUNCTION_TYPE_10): Define.
libgomp/
* libgomp_g.h (GOACC_kernels, GOACC_parallel): Add three
additional int arguments.
* oacc-parallel.c (GOACC_kernels, GOACC_parallel): Handle these.
* testsuite/libgomp.oacc-c/goacc_kernels.c: Adjust.
* testsuite/libgomp.oacc-c/goacc_parallel.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@208648 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp|  11 ++
 gcc/ada/ChangeLog.gomp|   6 +-
 gcc/ada/gcc-interface/utils.c |   7 +
 gcc/builtin-types.def |   4 +
 gcc/c-family/ChangeLog.gomp   |   8 ++
 gcc/c-family/c-common.c   |   7 +
 gcc/c-family/c-pragma.h   |   3 +
 gcc/c/ChangeLog.gomp  |  11 ++
 gcc/c/c-parser.c  | 160 +-
 gcc/c/c-typeck.c  |   3 +
 gcc/fortran/ChangeLog.gomp|   8 +-
 gcc/fortran/f95-lang.c|  18 +++
 gcc/fortran/types.def |   4 +
 gcc/gimplify.c|  12 +-
 gcc/lto/ChangeLog.gomp|   6 +-
 gcc/lto/lto-lang.c|   7 +
 gcc/oacc-builtins.def |   6 +-
 gcc/omp-low.c |  49 +--
 libgomp/ChangeLog.gomp|   8 ++
 libgomp/libgomp_g.h   |   6 +-
 libgomp/oacc-parallel.c   |  19 ++-
 libgomp/testsuite/libgomp.oacc-c/goacc_kernels.c  |   4 +-
 libgomp/testsuite/libgomp.oacc-c/goacc_parallel.c |   4 +-
 23 files changed, 343 insertions(+), 28 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 9b8aec0..d35cbee 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,14 @@
+2014-03-18  Thomas Schwinge  
+
+   * builtin-types.def
+   (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT): New type.
+   * oacc-builtins.def (BUILT_IN_GOACC_KERNELS)
+   (BUILT_IN_GOACC_PARALLEL): Switch to that one.
+   * gimplify.c (gimplify_scan_omp_clauses)
+   (gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_NUM_GANGS,
+   OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH.
+   * omp-low.c (scan_sharing_clauses, expand_oacc_offload): Likewise.
+
 2014-03-13  Thomas Schwinge  
 
* tree.h (OMP_CLAUSE_VECTOR_EXPR): Check for OMP_CLAUSE_VECTOR
diff --git gcc/ada/ChangeLog.gomp gcc/ada/ChangeLog.gomp
index 0bb4618..eff84ec 100644
--- gcc/ada/ChangeLog.gomp
+++ gcc/ada/ChangeLog.gomp
@@ -1,8 +1,12 @@
+2014-03-18  Thomas Schwinge  
+
+   * gcc-interface/utils.c (DEF_FUNCTION_TYPE_10): Define.
+
 2013-04-10  Jakub Jelinek  
 
* gcc-interface/utils.c (DEF_FUNCTION_TYPE_8): Define.
 
-Copyright (C) 2013 Free Software Foundation

[patch testsuite]: g++.dg/abi

2014-03-18 Thread Kai Tietz
Hi,

This patch skips the anon2.C and anon3.C tests for mingw targets.  The
issue is that weak symbols under pe-coff behave differently than on ELF
targets, so the tests do not apply there.  Without the skip they fail:

FAIL: g++.dg/abi/anon2.C -std=c++11  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N11D1C3fn1ENS0_1BE
FAIL: g++.dg/abi/anon2.C -std=c++11  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N11D1C3fn2ES1_
FAIL: g++.dg/abi/anon2.C -std=c++11  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N31D1CIiE3fn1ENS0_1BE
FAIL: g++.dg/abi/anon2.C -std=c++11  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N31D1CIiE3fn2ES2_
FAIL: g++.dg/abi/anon2.C -std=c++1y  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N11D1C3fn1ENS0_1BE
FAIL: g++.dg/abi/anon2.C -std=c++1y  scan-assembler
.weak(_definition)?[ \t]_?_ZN2N11D1C3fn2ES1_
...

ChangeLog

2014-03-18  Kai Tietz  

* g++.dg/abi/anon2.C: Skip for mingw targets.
* g++.dg/abi/anon3.C: Likewise.

Tested on x86_64-unknown-linux-gnu and i686-w64-mingw32.  OK to apply?

Regards,
Kai

Index: anon2.C
===
--- anon2.C (Revision 208594)
+++ anon2.C (Arbeitskopie)
@@ -1,5 +1,6 @@
 // PR c++/55877
 // { dg-require-weak "" }
+// { dg-skip-if "requires unsupported weak in pe-coff" { *-*-mingw* } }

 namespace N1 {
   typedef struct {
Index: anon3.C
===
--- anon3.C (Revision 208594)
+++ anon3.C (Arbeitskopie)
@@ -1,4 +1,5 @@
 // { dg-require-weak "" }
+// { dg-skip-if "requires unsupported weak in pe-coff" { *-*-mingw* } }

 typedef struct {
   // { dg-final { scan-assembler ".weak\(_definition\)?\[ \t\]_?_ZN4Heya4blahEv" } }


Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Martin Liška

Thank you for feedback,

new changelog:
2014-03-18  Martin Liska  

* cgraph.c (cgraph_update_edges_for_call_stmt_node): Added case when
gimple call statement is updated.
* gimple-fold.c (gimple_fold_call): Changed order for GIMPLE_ASSIGN and
GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.

OK for trunk?

Martin


On 03/18/2014 02:13 PM, Jakub Jelinek wrote:
> Hi!
>
>> 2014-03-18  Martin Liska
>>
>>  * cgraph.c (cgraph_update_edges_for_call_stmt_node): added case when
>>  gimple call statement is updated.
>
> Capital letter after :
>
>>  * gimple-fold.c (gimple_fold_call): changed order for GIMPLE_ASSIGN and
>
> Likewise here.
>
> Jakub


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index a15b6bc..269146a 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1519,7 +1519,10 @@ cgraph_update_edges_for_call_stmt_node (struct cgraph_node *node,
 		{
 		  if (callee->decl == new_call
 		  || callee->former_clone_of == new_call)
-		return;
+{
+  cgraph_set_call_stmt (e, new_stmt);
+		  return;
+}
 		  callee = callee->clone_of;
 		}
 	}
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index eafdb2d..177abc1 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1153,8 +1153,13 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
 		{
 		  tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
 		  tree def = get_or_create_ssa_default_def (cfun, var);
-		  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+  /* To satisfy condition for
+ cgraph_update_edges_for_call_stmt_node,
+ we need to preserve GIMPLE_CALL statement
+ at position of GSI iterator.  */
 		  update_call_from_tree (gsi, def);
+		  gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT);
 		}
 		  else
 		gsi_replace (gsi, new_stmt, true);


Re: [PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.

2014-03-18 Thread Thomas Schwinge
Hi!

On Tue, 18 Mar 2014 16:37:24 +0400, Ilmir Usmanov  wrote:
> This patch introduces support of OpenACC loop directive (and combined 
> directives) in C front-end up to GENERIC. Currently no clause is allowed.

Thanks!  I had worked on a simpler patch, not yet dealing with combined
clauses.  Also, I have some work for the GIMPLE level, namely building on
GIMPLE_OMP_FOR, adding a new GF_OMP_FOR_KIND_OACC_LOOP.  I'll post this
soon.
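
For readers following along, the directive forms under discussion look
roughly like this in user code. This is a sketch, not part of either
patch: the array a, the bound N and the function names are invented, and
no clauses are used since the patch does not accept any on the loop
directive yet. Without OpenACC support enabled the pragmas are ignored
and the loops run serially.

```cpp
enum { N = 30 };
int a[N];

// Loop directive nested inside a parallel region.
void
fill_nested (void)
{
#pragma acc parallel
  {
#pragma acc loop
    for (int i = 0; i < N; i++)
      a[i] = 1;
  }
}

// Combined "parallel loop" directive.
void
fill_parallel_loop (void)
{
#pragma acc parallel loop
  for (int i = 0; i < N; i++)
    a[i] = 2;
}

// Combined "kernels loop" directive.
void
fill_kernels_loop (void)
{
#pragma acc kernels loop
  for (int i = 0; i < N; i++)
    a[i] = 3;
}
```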

> This patch is necessary to finish the implementation of OpenACC 1.0 in
> the Fortran front-end. As you know, the OpenACC Fortran implementation
> parses and resolves the loop directive but doesn't transform it to
> GENERIC.
> 
> Bootstrapped and tested with no new regressions on x86_64-unknown-linux-gnu.
> 
> OK for gomp4 branch?

Yes, with minor changes:

> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c

> +#define OACC_LOOP_CLAUSE_MASK	PRAGMA_OMP_CLAUSE_NONE

Change to:

#define OACC_LOOP_CLAUSE_MASK \
	(OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NONE)

;-) I had made the same error before:
.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/loop-1.c

> +  #pragma acc loop
> +  for (d = 1; d < 30; d+= 6)  /* { dg-error "invalid type for iteration variable" } */
> +{
> +  i = d;
> +  a[i] = 1;
> +}
> +  #pragma acc loop
> +  for (d = 1; d < 30; d+= 5)  /* { dg-error "invalid type for iteration variable" } */
> +{
> +  i = d;
> +  a[i] = 2;
> +}

These two look very similar -- was one of them meant to check for
something else?

> +  #pragma acc loop
> +  for(i = 1; i < 30; i++)
> +{
> +  for(j = 5; j < 10; j++)
> +{
> +  /* TODO: there must be error.  */
> +  if (i == 6 && j == 7) goto outer; 
> +}
> +}
> +outer:

Most likely will be detected by gcc/omp-low.c's diagnose_omp_blocks pass
-- which you're not yet reaching because of not yet gimplifying the loop.
I'll deal with this in context of my loop gimplification patch.


Grüße,
 Thomas




[PATCH] Add --enable-valgrind-annotations

2014-03-18 Thread Richard Biener

This is another patch (well, I've polished it a bit) that was sitting
in my local tree for some time.  I've not enabled the ggc-common.c
code (I merely want to get rid of the false positives).
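
As a rough picture of what the new guard controls, the sketch below
shows an allocator annotating its memory for valgrind only when
ENABLE_VALGRIND_ANNOTATIONS is defined. The tiny_pool type and its
functions are hypothetical and far simpler than alloc-pool.c; the
VALGRIND_* macros come from valgrind's memcheck.h, and the empty
fallback keeps the call sites free of #ifdefs when annotations are
disabled.

```cpp
#ifdef ENABLE_VALGRIND_ANNOTATIONS
# include <valgrind/memcheck.h>
#else
/* Annotations disabled: the macro and its argument expand to nothing.  */
# define VALGRIND_DISCARD(x)
#endif

// Hypothetical one-slot pool, just enough to show where annotations go.
struct tiny_pool
{
  unsigned char storage[64];
  bool in_use;
};

// Hand out the slot and tell valgrind its contents are uninitialized.
void *
tiny_pool_alloc (tiny_pool *pool)
{
  if (pool->in_use)
    return nullptr;
  pool->in_use = true;
  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (pool->storage,
						  sizeof pool->storage));
  return pool->storage;
}

// Take the slot back and tell valgrind any further access is an error.
void
tiny_pool_free (tiny_pool *pool, void *p)
{
  if (p != pool->storage)
    return;
  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_NOACCESS (pool->storage,
						 sizeof pool->storage));
  pool->in_use = false;
}
```

Built without the define, this behaves as an ordinary allocator; built
with it and run under valgrind, use-after-free of the slot would be
reported without needing the much more expensive valgrind checking mode.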

Queued for 4.10 unless I'm told otherwise.

Richard.

2014-03-18  Richard Biener  

* configure.ac: Do valgrind header checks unconditionally.
Add --enable-valgrind-annotations.
* system.h: Guard valgrind header inclusion with
ENABLE_VALGRIND_ANNOTATIONS instead of ENABLE_VALGRIND_CHECKING.
* alloc-pool.c (pool_alloc, pool_free): Use
ENABLE_VALGRIND_ANNOTATIONS instead of ENABLE_VALGRIND_CHECKING
to guard possibly dead code.
* config.in: Regenerated.
* configure: Likewise.

Index: gcc/configure.ac
===
*** gcc/configure.ac(revision 208642)
--- gcc/configure.ac(working copy)
*** dnl # an if statement.  This was the sou
*** 512,538 
  dnl # in converting to autoconf 2.5x!
  AC_CHECK_HEADER(valgrind.h, have_valgrind_h=yes, have_valgrind_h=no)
  
! if test x$ac_valgrind_checking != x ; then
!   # It is certainly possible that there's valgrind but no valgrind.h.
!   # GCC relies on making annotations so we must have both.
!   AC_MSG_CHECKING(for VALGRIND_DISCARD in <valgrind/memcheck.h>)
!   AC_PREPROC_IFELSE([AC_LANG_SOURCE(
! [[#include <valgrind/memcheck.h>
  #ifndef VALGRIND_DISCARD
  #error VALGRIND_DISCARD not defined
  #endif]])],
[gcc_cv_header_valgrind_memcheck_h=yes],
[gcc_cv_header_valgrind_memcheck_h=no])
!   AC_MSG_RESULT($gcc_cv_header_valgrind_memcheck_h)
!   AC_MSG_CHECKING(for VALGRIND_DISCARD in <memcheck.h>)
!   AC_PREPROC_IFELSE([AC_LANG_SOURCE(
! [[#include <memcheck.h>
  #ifndef VALGRIND_DISCARD
  #error VALGRIND_DISCARD not defined
  #endif]])],
[gcc_cv_header_memcheck_h=yes],
[gcc_cv_header_memcheck_h=no])
!   AC_MSG_RESULT($gcc_cv_header_memcheck_h)
AM_PATH_PROG_WITH_TEST(valgrind_path, valgrind,
[$ac_dir/$ac_word --version | grep valgrind- >/dev/null 2>&1])
if test "x$valgrind_path" = "x" \
--- 512,547 
  dnl # in converting to autoconf 2.5x!
  AC_CHECK_HEADER(valgrind.h, have_valgrind_h=yes, have_valgrind_h=no)
  
! # It is certainly possible that there's valgrind but no valgrind.h.
! # GCC relies on making annotations so we must have both.
! AC_MSG_CHECKING(for VALGRIND_DISCARD in <valgrind/memcheck.h>)
! AC_PREPROC_IFELSE([AC_LANG_SOURCE(
!   [[#include <valgrind/memcheck.h>
  #ifndef VALGRIND_DISCARD
  #error VALGRIND_DISCARD not defined
  #endif]])],
[gcc_cv_header_valgrind_memcheck_h=yes],
[gcc_cv_header_valgrind_memcheck_h=no])
! AC_MSG_RESULT($gcc_cv_header_valgrind_memcheck_h)
! AC_MSG_CHECKING(for VALGRIND_DISCARD in <memcheck.h>)
! AC_PREPROC_IFELSE([AC_LANG_SOURCE(
!   [[#include <memcheck.h>
  #ifndef VALGRIND_DISCARD
  #error VALGRIND_DISCARD not defined
  #endif]])],
[gcc_cv_header_memcheck_h=yes],
[gcc_cv_header_memcheck_h=no])
! AC_MSG_RESULT($gcc_cv_header_memcheck_h)
! if test $gcc_cv_header_valgrind_memcheck_h = yes; then
!   AC_DEFINE(HAVE_VALGRIND_MEMCHECK_H, 1,
!   [Define if valgrind's valgrind/memcheck.h header is installed.])
! fi
! if test $gcc_cv_header_memcheck_h = yes; then
!   AC_DEFINE(HAVE_MEMCHECK_H, 1,
!   [Define if valgrind's memcheck.h header is installed.])
! fi
! 
! if test x$ac_valgrind_checking != x ; then
AM_PATH_PROG_WITH_TEST(valgrind_path, valgrind,
[$ac_dir/$ac_word --version | grep valgrind- >/dev/null 2>&1])
if test "x$valgrind_path" = "x" \
*** if test x$ac_valgrind_checking != x ; th
*** 546,559 
AC_DEFINE(ENABLE_VALGRIND_CHECKING, 1,
  [Define if you want to run subprograms and generated programs
 through valgrind (a memory checker).  This is extremely expensive.])
-   if test $gcc_cv_header_valgrind_memcheck_h = yes; then
- AC_DEFINE(HAVE_VALGRIND_MEMCHECK_H, 1,
-   [Define if valgrind's valgrind/memcheck.h header is installed.])
-   fi
-   if test $gcc_cv_header_memcheck_h = yes; then
- AC_DEFINE(HAVE_MEMCHECK_H, 1,
-   [Define if valgrind's memcheck.h header is installed.])
-   fi
  fi
  AC_SUBST(valgrind_path_defines)
  AC_SUBST(valgrind_command)
--- 555,560 
*** gather_stats=`if test $enable_gather_det
*** 592,597 
--- 593,613 
  AC_DEFINE_UNQUOTED(GATHER_STATISTICS, $gather_stats,
  [Define to enable detailed memory allocation stats gathering.])
  
+ AC_ARG_ENABLE(valgrind-annotations,
+ [AS_HELP_STRING([--enable-valgrind-annotations],
+   [enable valgrind runtime interaction])], [],
+ [enable_valgrind_annotations=no])
+ if test x$enable_valgrind_annotations != xno \
+ || test x$ac_valgrind_checking != x; then
+   if (test $have_valgrind_h = no \
+   && test $gcc_cv_header_memcheck_h = no \
+   && test $gcc_cv_header_valgrind_memcheck_h = no); then
+ AC_MSG_ERROR([*** Can't find valgrind/memcheck.h, memcheck.h or valgrind.h])
+   fi
+   AC_DEFINE(ENABLE_VALGRIND_ANNOTATIONS, 1,
+ [Define to get calls to the valgrind runtime enabled.])
+

Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 2:29 PM, Martin Liška  wrote:
> Thank you for feedback,

Ok if it passes bootstrap / regtest.

Thanks,
Richard.

> new changelog:
>
> 2014-03-18  Martin Liska  
>
> * cgraph.c (cgraph_update_edges_for_call_stmt_node): Added case when
> gimple call statement is update.
>
> * gimple-fold.c (gimple_fold_call): Changed order for GIMPLE_ASSIGN
> and
> GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.
>
> OK for trunk?
>
> Martin
>
>
>
> On 03/18/2014 02:13 PM, Jakub Jelinek wrote:
>>
>> Hi!
>>
>>>> 2014-03-18  Martin Liska
>>>>
>>>>  * cgraph.c (cgraph_update_edges_for_call_stmt_node): added case when
>>>>  gimple call statement is updated.
>>
>> Capital letter after :
>>
>>>>  * gimple-fold.c (gimple_fold_call): changed order for GIMPLE_ASSIGN and
>>
>> Likewise here.
>>
>> Jakub
>
>


[PATCH][AARCH64] Support tail indirect function call

2014-03-18 Thread Jiong Wang

Currently, an indirect function call prevents tail-call optimization on AArch64.

This patch adapts the fix for PR arm/19599 to AArch64.
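
The kind of source this is about can be sketched as follows. The
function names are invented for the example, and this is not the new
test case, whose contents are not shown here. The call through fn is in
tail position, so with the patch the expectation is that it can become a
plain register branch (the br alternative in *sibcall_insn) instead of a
call followed by a return.

```cpp
// Pointer-to-function type used for the indirect call.
typedef int (*callback_t) (int);

int add_one (int x) { return x + 1; }
int twice (int x) { return 2 * x; }

// The indirect call is the last thing dispatch does and its result is
// returned unchanged, which makes it a candidate for a sibling call.
int
dispatch (callback_t fn, int x)
{
  return fn (x);
}
```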

Is it ok for next stage 1?

Thanks.

-- Jiong

gcc/

* config/aarch64/predicates.md (aarch64_call_insn_operand): New predicate.
* config/aarch64/constraints.md ("Ucs", "Usf"): New constraints.
* config/aarch64/aarch64.md (*sibcall_insn, *sibcall_value_insn): Adjust
for tailcalling through registers.
* config/aarch64/aarch64.h (enum reg_class): New caller save register
class.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
* config/aarch64/aarch64.c (aarch64_function_ok_for_sibcall): Allow
tailcalling without decls.

gcc/testsuite

* gcc.target/aarch64/tail-indirect-call.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 901ad3d..fd93554 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1168,16 +1168,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
 static bool
 aarch64_function_ok_for_sibcall (tree decl, tree exp ATTRIBUTE_UNUSED)
 {
-  /* Indirect calls are not currently supported.  */
-  if (decl == NULL)
-return false;
-
-  /* Cannot tail-call to long-calls, since these are outside of the
- range of a branch instruction (we could handle this if we added
- support for indirect tail-calls.  */
-  if (aarch64_decl_is_long_call_p (decl))
-return false;
-
+  /* Currently, always true.  */
   return true;
 }
 
@@ -4255,6 +4246,7 @@ aarch64_class_max_nregs (reg_class_t regclass, enum machine_mode mode)
   switch (regclass)
 {
 case CORE_REGS:
+case CALLER_SAVE_REGS:
 case POINTER_REGS:
 case GENERAL_REGS:
 case ALL_REGS:
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 13c424c..6911206 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -407,6 +407,7 @@ enum reg_class
 {
   NO_REGS,
   CORE_REGS,
+  CALLER_SAVE_REGS,
   GENERAL_REGS,
   STACK_REG,
   POINTER_REGS,
@@ -422,6 +423,7 @@ enum reg_class
 {		\
   "NO_REGS",	\
   "CORE_REGS",	\
+  "CALLER_SAVE_REGS",\
   "GENERAL_REGS",\
   "STACK_REG",	\
   "POINTER_REGS",\
@@ -434,6 +436,7 @@ enum reg_class
 {	\
   { 0x, 0x, 0x },	/* NO_REGS */		\
   { 0x7fff, 0x, 0x0003 },	/* CORE_REGS */		\
+  { 0x0007, 0x, 0x },	/* CALLER_SAVE_REGS */	\
   { 0x7fff, 0x, 0x0003 },	/* GENERAL_REGS */	\
   { 0x8000, 0x, 0x },	/* STACK_REG */		\
   { 0x, 0x, 0x0003 },	/* POINTER_REGS */	\
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..f30b444 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -513,6 +513,10 @@
 	  (use (match_operand 2 "" ""))])]
   ""
   {
+if (!REG_P (XEXP (operands[0], 0))
+   && (GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF))
+ XEXP (operands[0], 0) = force_reg (Pmode, XEXP (operands[0], 0));
+
 if (operands[2] == NULL_RTX)
   operands[2] = const0_rtx;
   }
@@ -526,30 +530,37 @@
 	  (use (match_operand 3 "" ""))])]
   ""
   {
+if (!REG_P (XEXP (operands[1], 0))
+   && (GET_CODE (XEXP (operands[1], 0)) != SYMBOL_REF))
+ XEXP (operands[1], 0) = force_reg (Pmode, XEXP (operands[1], 0));
+
 if (operands[3] == NULL_RTX)
   operands[3] = const0_rtx;
   }
 )
 
 (define_insn "*sibcall_insn"
-  [(call (mem:DI (match_operand:DI 0 "" "X"))
+  [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucs,Usf"))
 	 (match_operand 1 "" ""))
(return)
(use (match_operand 2 "" ""))]
-  "GET_CODE (operands[0]) == SYMBOL_REF"
-  "b\\t%a0"
+  "SIBLING_CALL_P (insn)"
+  "@
+   br\\t%0
+   b\\t%a0"
   [(set_attr "type" "branch")]
-
 )
 
 (define_insn "*sibcall_value_insn"
   [(set (match_operand 0 "" "")
-	(call (mem:DI (match_operand 1 "" "X"))
+	(call (mem:DI (match_operand 1 "aarch64_call_insn_operand" "Ucs,Usf"))
 	  (match_operand 2 "" "")))
(return)
(use (match_operand 3 "" ""))]
-  "GET_CODE (operands[1]) == SYMBOL_REF"
-  "b\\t%a1"
+  "SIBLING_CALL_P (insn)"
+  "@
+   br\\t%1
+   b\\t%a1"
   [(set_attr "type" "branch")]
 )
 
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 12ab570..244f97d 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -92,6 +92,14 @@
   (and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) ival < 64")))
 
+(define_register_constraint "Ucs" "CALLER_SAVE_REGS"
+ "@internal The caller save registers.  Useful for sibcalls.")
+
+(define_constraint "Usf"
+ "@internal Usf is a symbol reference."
+ (match_code "symbol_ref")
+)
+
 (define_constraint "UsM"
   "@internal
   A constraint that matches the immediate constant -1."
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md

Contents of PO file 'cpplib-4.9-b20140202.de.po'

2014-03-18 Thread Translation Project Robot


The Translation Project robot, in the
name of your translation coordinator.



New German PO file for 'cpplib' (version 4.9-b20140202)

2014-03-18 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the German team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/de.po

(This file, 'cpplib-4.9-b20140202.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.

2014-03-18 Thread Ilmir Usmanov

Committed as r208649.

--
Ilmir.


Re: [PATCH][AArch64] Vreinterpret re-implemention for stage-1

2014-03-18 Thread Marcus Shawcroft
2014-02-13 9:46 GMT+00:00 Alex Velenko :
> Hi,
> This patch re-implements vreinterpret intrinsics to directly call a cast.
> The aim is to forward as much information to front-end as possible.
> This patch had a full LE and BE regression run with no regressions.
>
> Is patch good to commit to stage-1?

OK for stage-1

/Marcus


[RFA jit 1/2] introduce class toplev

2014-03-18 Thread Tom Tromey
This patch introduces a new "class toplev" and changes toplev_main and
toplev_finalize to be methods of this class.  Additionally, now the
timevars are automatically stopped when the object is destroyed.  This
cleans up "compile" a bit and makes it simpler to reuse the toplev
logic in other code.
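
The idea can be sketched in isolation as below. This is an assumption
about the general shape, not the code from the patch (the toplev.c and
toplev.h hunks are only partly visible here): toplev_sketch, the events
log and run_driver are all hypothetical, and the log stands in for the
real timevar calls.

```cpp
#include <string>
#include <vector>

// Hypothetical event log standing in for GCC's timevar machinery.
std::vector<std::string> events;

// Sketch of a driver object that owns the TV_TOTAL timer.
class toplev_sketch
{
public:
  // Start total timing in the constructor, if requested.
  explicit toplev_sketch (bool use_tv_total)
    : m_use_tv_total (use_tv_total)
  {
    if (m_use_tv_total)
      events.push_back ("start TV_TOTAL");
  }

  // Stop timing in the destructor, so every exit path is covered.
  ~toplev_sketch ()
  {
    if (m_use_tv_total)
      events.push_back ("stop TV_TOTAL");
  }

  // Stand-in for the real compilation entry point.
  int main (int argc, char **argv)
  {
    (void) argc;
    (void) argv;
    events.push_back ("compile");
    return 0;
  }

private:
  bool m_use_tv_total;
};

// Caller side: no explicit stop/print calls on any return path.
int
run_driver ()
{
  toplev_sketch driver (true);
  return driver.main (0, nullptr);
}
```

The point of tying the stop to the destructor is that early returns, such
as the error paths in compile (), no longer need their own
timevar_stop/timevar_print pair.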
---
 gcc/ChangeLog.jit  | 14 +
 gcc/diagnostic.c   |  2 +-
 gcc/jit/ChangeLog.jit  |  5 +
 gcc/jit/internal-api.c | 25 +-
 gcc/main.c |  9 
 gcc/toplev.c   | 56 +-
 gcc/toplev.h   | 20 --
 7 files changed, 76 insertions(+), 55 deletions(-)

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..56dc3ac 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -333,7 +333,7 @@ diagnostic_show_locus (diagnostic_context * context,
 static const char * const bt_stop[] =
 {
   "main",
-  "toplev_main",
+  "toplev::main",
   "execute_one_pass",
   "compile_file",
 };
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 8e0395d..6a4d2ae 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3650,7 +3650,7 @@ compile ()
 
   /* Call into the rest of gcc.
  For now, we have to assemble command-line options to pass into
- toplev_main, so that they can be parsed. */
+ toplev::main, so that they can be parsed. */
 
   /* Pass in user-provided "progname", if any, so that it makes it
  into GCC's "progname" global, used in various diagnostics. */
@@ -3724,25 +3724,15 @@ compile ()
   ADD_ARG ("-fdump-ipa-all");
 }
 
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = false;
+  toplev toplev (false);
 
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-timevar_init ();
-
-  timevar_start (TV_TOTAL);
-
-  toplev_main (num_args, const_cast <char **> (fake_args), &toplev_opts);
-  toplev_finalize ();
+  toplev.main (num_args, const_cast <char **> (fake_args));
+  toplev.finalize ();
 
   active_playback_ctxt = NULL;
 
   if (errors_occurred ())
-{
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-  return NULL;
-}
+return NULL;
 
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
@@ -3765,8 +3755,6 @@ compile ()
 if (ret)
   {
timevar_pop (TV_ASSEMBLE);
-   timevar_stop (TV_TOTAL);
-   timevar_print (stderr);
return NULL;
   }
   }
@@ -3795,9 +3783,6 @@ compile ()
 timevar_pop (TV_LOAD);
   }
 
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-
   return result_obj;
 }
 
diff --git a/gcc/main.c b/gcc/main.c
index b893308..4bba041 100644
--- a/gcc/main.c
+++ b/gcc/main.c
@@ -1,5 +1,5 @@
 /* main.c: defines main() for cc1, cc1plus, etc.
-   Copyright (C) 2007-2013 Free Software Foundation, Inc.
+   Copyright (C) 2007-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -26,15 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 
 int main (int argc, char **argv);
 
-/* We define main() to call toplev_main(), which is defined in toplev.c.
+/* We define main() to call toplev::main(), which is defined in toplev.c.
We do this in a separate file in order to allow the language front-end
to define a different main(), if it so desires.  */
 
 int
 main (int argc, char **argv)
 {
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = true;
+  toplev toplev (true);
 
-  return toplev_main (argc, argv, &toplev_opts);
+  return toplev.main (argc, argv);
 }
diff --git a/gcc/toplev.c b/gcc/toplev.c
index f1ac560..5284621 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1,5 +1,5 @@
 /* Top level of GCC compilers (cc1, cc1plus, etc.)
-   Copyright (C) 1987-2013 Free Software Foundation, Inc.
+   Copyright (C) 1987-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -95,7 +95,7 @@ along with GCC; see the file COPYING3.  If not see
 #endif
 
 static void general_init (const char *);
-static void do_compile (const toplev_options *toplev_opts);
+static void do_compile ();
 static void process_options (void);
 static void backend_init (void);
 static int lang_dependent_init (const char *);
@@ -1854,18 +1854,8 @@ finalize (bool no_backend)
 
 /* Initialize the compiler, and compile the input file.  */
 static void
-do_compile (const toplev_options *toplev_opts)
+do_compile ()
 {
-  /* Initialize timing first.  The C front ends read the main file in
- the post_options hook, and C++ does file timings.  */
-  if (toplev_opts->use_TV_TOTAL)
-{
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-timevar_init ();
-
-  timevar_start (TV_TOTAL);
-}
-
   process_options ();
 
   /* Don't do any more if an error has already occurred.  */
@@ -1910,13 +1900,28 @@ do_compile (const toplev_options *toplev_opts)
 
   timevar_stop (TV_PHASE_FINALIZE);
 }
+}
 
-  if (toplev_opts->use_TV_TOTAL)
-{
-  /* Stop timing and print the times.

[RFA jit 2/2] introduce scoped_timevar

2014-03-18 Thread Tom Tromey
This introduces a new scoped_timevar class.  It pushes a given timevar
in its constructor, and pops it in the destructor, giving a much
simpler way to use timevars in the typical case where they can be
scoped.
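
To illustrate why the scoped form is less error-prone, here is a minimal
standalone sketch.  It does not use the real timevar API: demo_push,
demo_pop, demo_scoped and demo_step are hypothetical stand-ins, with a
plain stack playing the role of the timevar stack.  The point is only
that an early return no longer needs its own pop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for timevar_push/timevar_pop: a plain stack of ids.
static std::vector<int> demo_stack;

static void demo_push (int id) { demo_stack.push_back (id); }

static void demo_pop (int id)
{
  // As with timevars, a pop must match the most recent push.
  assert (!demo_stack.empty () && demo_stack.back () == id);
  demo_stack.pop_back ();
}

// Same shape as scoped_timevar: push in the constructor, pop in the
// destructor.
class demo_scoped
{
 public:
  explicit demo_scoped (int id) : m_id (id) { demo_push (m_id); }
  ~demo_scoped () { demo_pop (m_id); }

 private:
  // Declared but not defined, to disallow copies.
  demo_scoped (const demo_scoped &);
  demo_scoped &operator= (const demo_scoped &);

  int m_id;
};

// Returns false early on failure; the guard still pops on that path.
static bool
demo_step (bool fail)
{
  demo_scoped guard (1);
  if (fail)
    return false;
  return true;
}
```

After either call, demo_step (true) or demo_step (false), the stack is
empty again without any explicit pop at the return sites.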
---
 gcc/ChangeLog.jit  |  4 
 gcc/jit/ChangeLog.jit  |  4 
 gcc/jit/internal-api.c | 16 +---
 gcc/timevar.h  | 24 +++-
 4 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 6a4d2ae..8285c64 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3737,8 +3737,6 @@ compile ()
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
 
-  timevar_push (TV_ASSEMBLE);
-
   /* Gross hacks follow:
  We have a .s file; we want a .so file.
  We could reuse parts of gcc/gcc.c to do this.
@@ -3746,6 +3744,8 @@ compile ()
*/
   /* FIXME: totally faking it for now, not even using pex */
   {
+scoped_timevar assemble_timevar (TV_ASSEMBLE);
+
 char cmd[1024];
 snprintf (cmd, 1024, "gcc -shared %s -o %s",
   m_path_s_file, m_path_so_file);
@@ -3753,20 +3753,16 @@ compile ()
   printf ("cmd: %s\n", cmd);
 int ret = system (cmd);
 if (ret)
-  {
-   timevar_pop (TV_ASSEMBLE);
-   return NULL;
-  }
+  return NULL;
   }
-  timevar_pop (TV_ASSEMBLE);
 
   // TODO: split out assembles vs linker
 
   /* dlopen the .so file. */
   {
-const char *error;
+scoped_timevar load_timevar (TV_LOAD);
 
-timevar_push (TV_LOAD);
+const char *error;
 
 /* Clear any existing error.  */
 dlerror ();
@@ -3779,8 +3775,6 @@ compile ()
   result_obj = new result (handle);
 else
   result_obj = NULL;
-
-timevar_pop (TV_LOAD);
   }
 
   return result_obj;
diff --git a/gcc/timevar.h b/gcc/timevar.h
index dc2a8bc..eb8bf0d 100644
--- a/gcc/timevar.h
+++ b/gcc/timevar.h
@@ -1,5 +1,5 @@
 /* Timing variables for measuring compiler performance.
-   Copyright (C) 2000-2013 Free Software Foundation, Inc.
+   Copyright (C) 2000-2014 Free Software Foundation, Inc.
Contributed by Alex Samuel 
 
This file is part of GCC.
@@ -110,6 +110,28 @@ timevar_pop (timevar_id_t tv)
 timevar_pop_1 (tv);
 }
 
+class scoped_timevar
+{
+ public:
+  scoped_timevar (timevar_id_t tv)
+: m_tv (tv)
+  {
+timevar_push (m_tv);
+  }
+
+  ~scoped_timevar ()
+  {
+timevar_pop (m_tv);
+  }
+
+ private:
+
+  // Private to disallow copies.
+  scoped_timevar (const scoped_timevar &);
+
+  timevar_id_t m_tv;
+};
+
 extern void print_time (const char *, long);
 
 #endif /* ! GCC_TIMEVAR_H */
-- 
1.8.5.3



[RFA jit 0/2] minor refactorings for reuse

2014-03-18 Thread Tom Tromey
I wanted to do something like playback::context::compile, but in my
project I can't really reuse all the JIT code -- really I just wanted
to be able to use toplev_main and toplev_finalize.

Looking into the code, though, I saw a few spots that could be cleaned
up a little, so I wouldn't have to worry as much about keeping my
hacks in sync with the JIT branch.

The first patch here changes the toplev code into a class and arranges
for the timevars to be managed there.

The second patch just introduces scoped timevars to make them less
error-prone to use.

I built and tested this using the JIT test suite.

Let me know what you think,
Tom



Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Martin Liška

Patch passes bootstrap and regtest.

I fixed indentation according to discussion with Jakub.

OK for trunk?

Thanks,
Martin

On 03/18/2014 02:55 PM, Richard Biener wrote:

On Tue, Mar 18, 2014 at 2:29 PM, Martin Liška  wrote:

Thank you for feedback,

Ok if it passes bootstrap / regtest.

Thanks,
Richard.


new changelog:

2014-03-18  Martin Liska  

 * cgraph.c (cgraph_update_edges_for_call_stmt_node): Added case when
 gimple call statement is updated.

 * gimple-fold.c (gimple_fold_call): Changed order for GIMPLE_ASSIGN
and
 GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.

OK for trunk?

Martin



On 03/18/2014 02:13 PM, Jakub Jelinek wrote:

Hi!


2014-03-18  Martin Liska  

  * cgraph.c (cgraph_update_edges_for_call_stmt_node): added case
when
  gimple call statement is updated.

Capital letter after :


  * gimple-fold.c (gimple_fold_call): changed order for
GIMPLE_ASSIGN and

Likewise here.

 Jakub




diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index a15b6bc..577352f 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1519,7 +1519,10 @@ cgraph_update_edges_for_call_stmt_node (struct cgraph_node *node,
 		{
 		  if (callee->decl == new_call
 		  || callee->former_clone_of == new_call)
-		return;
+		{
+		  cgraph_set_call_stmt (e, new_stmt);
+		  return;
+		}
 		  callee = callee->clone_of;
 		}
 	}
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index eafdb2d..adc9d49 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1153,8 +1153,13 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
 		{
 		  tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
 		  tree def = get_or_create_ssa_default_def (cfun, var);
-		  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+		  /* To satisfy condition for
+		 cgraph_update_edges_for_call_stmt_node,
+		 we need to preserve GIMPLE_CALL statement
+		 at position of GSI iterator.  */
 		  update_call_from_tree (gsi, def);
+		  gsi_insert_before (gsi, new_stmt, GSI_NEW_STMT);
 		}
 		  else
 		gsi_replace (gsi, new_stmt, true);


Re: [PATCH] Fix gimple-fold

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 3:59 PM, Martin Liška  wrote:
> Patch passes bootstrap and regtest.
>
> I fixed indentation according to discussion with Jakub.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Martin
>
>
> On 03/18/2014 02:55 PM, Richard Biener wrote:
>>
>> On Tue, Mar 18, 2014 at 2:29 PM, Martin Liška  wrote:
>>>
>>> Thank you for feedback,
>>
>> Ok if it passes bootstrap / regtest.
>>
>> Thanks,
>> Richard.
>>
>>> new changelog:
>>>
>>> 2014-03-18  Martin Liska  
>>>
>>>  * cgraph.c (cgraph_update_edges_for_call_stmt_node): Added case
>>> when
>>>  gimple call statement is updated.
>>>
>>>  * gimple-fold.c (gimple_fold_call): Changed order for
>>> GIMPLE_ASSIGN
>>> and
>>>  GIMPLE_CALL, where gsi iterator still points to GIMPLE CALL.
>>>
>>> OK for trunk?
>>>
>>> Martin
>>>
>>>
>>>
>>> On 03/18/2014 02:13 PM, Jakub Jelinek wrote:

 Hi!

>> 2014-03-18  Martin Liska  
>>
>>   * cgraph.c (cgraph_update_edges_for_call_stmt_node): added
>> case
>> when
>>   gimple call statement is updated.

 Capital letter after :

>>   * gimple-fold.c (gimple_fold_call): changed order for
>> GIMPLE_ASSIGN and

 Likewise here.

  Jakub
>>>
>>>
>


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Ulrich Drepper
On Tue, Mar 18, 2014 at 7:13 AM, Richard Biener
 wrote:
> extern __inline __m512
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> _mm512_undefined_ps (void)
> {
>   __m512 __Y = __Y;
>   return __Y;
> }


This provokes no warnings (as you wrote) and it doesn't clobber flags,
but it doesn't avoid loading.  The code below creates a pxor for the
parameter.  That's what I think compiler support should help to get
rid of.  If the compiler has some magic to recognize -1 masks then
this will help in some situations but it seems to be a specific
implementation for the intrinsics while I've been looking at a generic
solution.


typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));

void g(__m128d);

extern __inline __m128d
__attribute__((__gnu_inline__, __always_inline__, __artificial__, const))
_mm_undefined_pd(void) {
  __m128d v = v;
  return v;
}

void
f()
{
  g(_mm_undefined_pd());
}


Re: [PATCH] Fix -fsanitize=undefined -flto (PR sanitizer/60535)

2014-03-18 Thread Jakub Jelinek
On Tue, Mar 18, 2014 at 10:00:09AM +0100, Richard Biener wrote:
> Isn't varpool_finalize_decl the canonical interface anyway?
> Thus, you don't need to call rest_of_decl_compilation AFAIK - the
> varpool machinery will do that.
...
> Should be enough to check if int128_integer_type_node is not NULL.

Ok, here is what I've checked in after another bootstrap/regtest:

2014-03-18  Jakub Jelinek  

PR sanitizer/60535
* ubsan.c (ubsan_type_descriptor, ubsan_create_data): Call
varpool_finalize_decl instead of rest_of_decl_compilation.
lto/
* lto-lang.c (lto_init): Add NAME_TYPE for int128_integer_type_node
and complex_{float,{,long_}double}_type_node.
testsuite/
* c-c++-common/ubsan/null-1.c: Don't skip if -flto.
* c-c++-common/ubsan/null-2.c: Likewise.
* c-c++-common/ubsan/null-3.c: Likewise.
* c-c++-common/ubsan/null-4.c: Likewise.
* c-c++-common/ubsan/null-5.c: Likewise.
* c-c++-common/ubsan/null-6.c: Likewise.
* c-c++-common/ubsan/null-7.c: Likewise.
* c-c++-common/ubsan/null-8.c: Likewise.
* c-c++-common/ubsan/null-9.c: Likewise.
* c-c++-common/ubsan/null-10.c: Likewise.
* c-c++-common/ubsan/null-11.c: Likewise.
* c-c++-common/ubsan/overflow-1.c: Likewise.
* c-c++-common/ubsan/overflow-2.c: Likewise.
* c-c++-common/ubsan/overflow-add-1.c: Likewise.
* c-c++-common/ubsan/overflow-add-2.c: Likewise.
* c-c++-common/ubsan/overflow-int128.c: Likewise.
* c-c++-common/ubsan/overflow-mul-1.c: Likewise.
* c-c++-common/ubsan/overflow-mul-2.c: Likewise.
* c-c++-common/ubsan/overflow-mul-3.c: Likewise.
* c-c++-common/ubsan/overflow-mul-4.c: Likewise.
* c-c++-common/ubsan/overflow-negate-1.c: Likewise.
* c-c++-common/ubsan/overflow-negate-2.c: Likewise.
* c-c++-common/ubsan/overflow-sub-1.c: Likewise.
* c-c++-common/ubsan/overflow-sub-2.c: Likewise.
* c-c++-common/ubsan/pr59333.c: Likewise.
* c-c++-common/ubsan/pr59503.c: Likewise.
* c-c++-common/ubsan/pr59667.c: Likewise.
* c-c++-common/ubsan/undefined-1.c: Likewise.
* g++.dg/ubsan/pr59250.C: Likewise.
* g++.dg/ubsan/pr59306.C: Likewise.

--- gcc/ubsan.c.jj  2014-03-18 10:04:06.441343729 +0100
+++ gcc/ubsan.c 2014-03-18 10:24:42.824451915 +0100
@@ -390,7 +390,7 @@ ubsan_type_descriptor (tree type, bool w
   TREE_CONSTANT (ctor) = 1;
   TREE_STATIC (ctor) = 1;
   DECL_INITIAL (decl) = ctor;
-  rest_of_decl_compilation (decl, 1, 0);
+  varpool_finalize_decl (decl);
 
   /* Save the VAR_DECL into the hash table.  */
   decl_for_type_insert (type, decl);
@@ -501,7 +501,7 @@ ubsan_create_data (const char *name, loc
   TREE_CONSTANT (ctor) = 1;
   TREE_STATIC (ctor) = 1;
   DECL_INITIAL (var) = ctor;
-  rest_of_decl_compilation (var, 1, 0);
+  varpool_finalize_decl (var);
 
   return var;
 }
--- gcc/lto/lto-lang.c.jj   2014-03-17 20:01:33.927565658 +0100
+++ gcc/lto/lto-lang.c  2014-03-18 10:25:14.141276663 +0100
@@ -1222,6 +1222,11 @@ lto_init (void)
   NAME_TYPE (long_double_type_node, "long double");
   NAME_TYPE (void_type_node, "void");
   NAME_TYPE (boolean_type_node, "bool");
+  NAME_TYPE (complex_float_type_node, "complex float");
+  NAME_TYPE (complex_double_type_node, "complex double");
+  NAME_TYPE (complex_long_double_type_node, "complex long double");
+  if (int128_integer_type_node)
+NAME_TYPE (int128_integer_type_node, "__int128");
 #undef NAME_TYPE
 
   /* Initialize LTO-specific data structures.  */
--- gcc/testsuite/c-c++-common/ubsan/null-1.c.jj2013-11-19 
21:56:24.566416519 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-1.c   2014-03-17 13:23:46.057000209 
+0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 main (void)
--- gcc/testsuite/c-c++-common/ubsan/null-2.c.jj2013-11-19 
21:56:24.566416519 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-2.c   2014-03-17 13:23:46.06592 
+0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 main (void)
--- gcc/testsuite/c-c++-common/ubsan/null-3.c.jj2013-11-19 
21:56:24.567416516 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-3.c   2014-03-17 13:23:46.063000958 
+0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
 
 int
 foo (int *p)
--- gcc/testsuite/c-c++-common/ubsan/null-4.c.jj2013-11-19 
21:56:24.567416516 +0100
+++ gcc/testsuite/c-c++-common/ubsan/null-4.c   2014-03-17 15:37:15.977422737 
+0100
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fsanitize=null -w" } */
 /* { dg-shouldfail "ubsan" } */
-/* { dg-skip

[C++ PATCH] Fix -std=c++11 OpenMP UDR handling (PR c++/60331)

2014-03-18 Thread Jakub Jelinek
Hi!

Apparently with -std=c++11 and higher, if in a template we call
finish_expr_stmt on a DECL_EXPR (as cp_parser_omp_declare_reduction_exprs
does) and the omp_priv var in there is not type dependent, it ICEs during:
  else if (!type_dependent_expression_p (expr))
convert_to_void (build_non_dependent_expr (expr), ICV_STATEMENT,
 tf_warning_or_error);
So, the patch in that case just does the rest of what finish_expr_stmt
performs.

Jason, do you have better ideas how to fix this?

The patch has been bootstrapped/regtested on x86_64-linux and i686-linux.

2014-03-18  Jakub Jelinek  

PR c++/60331
* parser.c (cp_parser_omp_declare_reduction_exprs): Avoid calling
finish_expr_stmt on DECL_EXPRs in templates.

* testsuite/libgomp.c++/udr-11.C: New test.
* testsuite/libgomp.c++/udr-12.C: New test.
* testsuite/libgomp.c++/udr-13.C: New test.
* testsuite/libgomp.c++/udr-14.C: New test.
* testsuite/libgomp.c++/udr-15.C: New test.
* testsuite/libgomp.c++/udr-16.C: New test.
* testsuite/libgomp.c++/udr-17.C: New test.
* testsuite/libgomp.c++/udr-18.C: New test.
* testsuite/libgomp.c++/udr-19.C: New test.

--- gcc/cp/parser.c.jj  2014-03-18 10:04:14.0 +0100
+++ gcc/cp/parser.c 2014-03-18 11:18:44.511571459 +0100
@@ -30698,7 +30698,22 @@ cp_parser_omp_declare_reduction_exprs (t
 
   block = finish_omp_structured_block (block);
   cp_walk_tree (&block, cp_remove_omp_priv_cleanup_stmt, omp_priv, NULL);
-  finish_expr_stmt (block);
+  if (block
+ && TREE_CODE (block) == DECL_EXPR
+ && processing_template_decl)
+   {
+ if (check_for_bare_parameter_packs (block))
+   block = error_mark_node;
+ if (TREE_CODE (block) != CLEANUP_POINT_EXPR)
+   {
+ if (TREE_CODE (block) != EXPR_STMT)
+   block = build_stmt (input_location, EXPR_STMT, block);
+ block = maybe_cleanup_point_expr_void (block);
+   }
+ add_stmt (block);
+   }
+  else
+   finish_expr_stmt (block);
 
   if (ctor)
add_decl_expr (omp_orig);
--- libgomp/testsuite/libgomp.c++/udr-11.C.jj   2014-03-18 11:47:43.326846415 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-11.C  2014-03-18 11:47:43.329846576 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-1.C"
--- libgomp/testsuite/libgomp.c++/udr-12.C.jj   2014-03-18 11:47:43.330846623 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-12.C  2014-03-18 11:47:43.33184 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-2.C"
--- libgomp/testsuite/libgomp.c++/udr-13.C.jj   2014-03-18 11:47:43.332846707 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-13.C  2014-03-18 11:47:43.332846707 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-3.C"
--- libgomp/testsuite/libgomp.c++/udr-14.C.jj   2014-03-18 11:47:43.333846744 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-14.C  2014-03-18 11:47:43.334846777 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-4.C"
--- libgomp/testsuite/libgomp.c++/udr-15.C.jj   2014-03-18 11:47:43.334846777 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-15.C  2014-03-18 11:47:43.335846809 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-5.C"
--- libgomp/testsuite/libgomp.c++/udr-16.C.jj   2014-03-18 11:47:43.336846840 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-16.C  2014-03-18 11:47:43.336846840 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-6.C"
--- libgomp/testsuite/libgomp.c++/udr-17.C.jj   2014-03-18 11:47:43.337846867 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-17.C  2014-03-18 11:47:43.337846867 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-7.C"
--- libgomp/testsuite/libgomp.c++/udr-18.C.jj   2014-03-18 11:47:43.338846892 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-18.C  2014-03-18 11:47:43.338846892 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-8.C"
--- libgomp/testsuite/libgomp.c++/udr-19.C.jj   2014-03-18 11:47:43.339846916 
+0100
+++ libgomp/testsuite/libgomp.c++/udr-19.C  2014-03-18 11:47:43.339846916 
+0100
@@ -0,0 +1,4 @@
+// { dg-do run }
+// { dg-options "-fopenmp -std=c++11" }
+
+#include "udr-9.C"

Jakub


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 4:03 PM, Ulrich Drepper  wrote:
> On Tue, Mar 18, 2014 at 7:13 AM, Richard Biener
>  wrote:
>> extern __inline __m512
>> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>> _mm512_undefined_ps (void)
>> {
>>   __m512 __Y = __Y;
>>   return __Y;
>> }
>
>
> This provokes no warnings (as you wrote) and it doesn't clobber flags,
> but it doesn't avoid loading.  The code below creates a pxor for the
> parameter.  That's what I think compiler support should help to get
> rid of.  If the compiler has some magic to recognize -1 masks then
> this will help in some situations but it seems to be a specific
> implementation for the intrinsics while I've been looking at generic
> solution.
>
>
> typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
>
> void g(__m128d);
>
> extern __inline __m128d
> __attribute__((__gnu_inline__, __always_inline__, __artificial__, const))
> _mm_undefined_pd(void) {
>   __m128d v = v;
>   return v;
> }
>
> void
> f()
> {
>   g(_mm_undefined_pd());
> }

The load from zero is caused by the init-regs pass.  To quote:

/* Check all of the uses of pseudo variables.  If any use that is MUST
   uninitialized, add a store of 0 immediately before it.  For
   subregs, this makes combine happy.  For full word regs, this makes
   other optimizations, like the register allocator and the reg-stack
   happy as well as papers over some problems on the arm and other
   processors where certain isa constraints cannot be handled by gcc.
   These are of the form where two operands to an insn may not be the
   same.  The ra will only make them the same if they do not
   interfere, and this can only happen if one is not initialized.

   There is also the unfortunate consequence that this may mask some
   buggy programs where people forget to initialize stack variable.
   Any programmer with half a brain would look at the uninitialized
   variable warnings.  */

You can disable it with -fdisable-rtl-init-regs.  Not sure if the above
comment today is just overly cautious ... maybe we should have a
target hook that controls its execution.

Richard.


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Richard Biener
On Tue, Mar 18, 2014 at 4:09 PM, Richard Biener
 wrote:
> On Tue, Mar 18, 2014 at 4:03 PM, Ulrich Drepper  wrote:
>> On Tue, Mar 18, 2014 at 7:13 AM, Richard Biener
>>  wrote:
>>> extern __inline __m512
>>> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>>> _mm512_undefined_ps (void)
>>> {
>>>   __m512 __Y = __Y;
>>>   return __Y;
>>> }
>>
>>
>> This provokes no warnings (as you wrote) and it doesn't clobber flags,
>> but it doesn't avoid loading.  The code below creates a pxor for the
>> parameter.  That's what I think compiler support should help to get
>> rid of.  If the compiler has some magic to recognize -1 masks then
>> this will help in some situations but it seems to be a specific
>> implementation for the intrinsics while I've been looking at generic
>> solution.
>>
>>
>> typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
>>
>> void g(__m128d);
>>
>> extern __inline __m128d
>> __attribute__((__gnu_inline__, __always_inline__, __artificial__, const))
>> _mm_undefined_pd(void) {
>>   __m128d v = v;
>>   return v;
>> }
>>
>> void
>> f()
>> {
>>   g(_mm_undefined_pd());
>> }
>
> The load from zero is caused by the init-regs pass.  To quote:
>
> /* Check all of the uses of pseudo variables.  If any use that is MUST
>uninitialized, add a store of 0 immediately before it.  For
>subregs, this makes combine happy.  For full word regs, this makes
>other optimizations, like the register allocator and the reg-stack
>happy as well as papers over some problems on the arm and other
>processors where certain isa constraints cannot be handled by gcc.
>These are of the form where two operands to an insn may not be the
>same.  The ra will only make them the same if they do not
>interfere, and this can only happen if one is not initialized.
>
>There is also the unfortunate consequence that this may mask some
>buggy programs where people forget to initialize stack variable.
>Any programmer with half a brain would look at the uninitialized
>variable warnings.  */
>
> You can disable it with -fdisable-rtl-init-regs.  Not sure if the above
> comment today is just overly cautious ... maybe we should have a
> target hook that controls its execution.

Btw, without this zeroing (where zero is also "undefined") this may
be an information leak and thus possibly a security issue?  That is,
how is _mm_undefined_pd () specified?

Richard.

> Richard.


Re: [PATCH] x86: Define _mm*_undefined_*

2014-03-18 Thread Ulrich Drepper
On Tue, Mar 18, 2014 at 11:11 AM, Richard Biener
 wrote:
> Btw, without this zeroing (where zero is also "undefined") this may
> be an information leak and thus possibly a security issue?  That is,
> how is _mm_undefined_pd () specified?

People aren't accidentally using the _mm*_undefined_*() functions.  I
don't think that information leak should be a concern here.  Changing
the default behavior of

   TYPE VAR = VAR;

might be a problem.  There might be code which depends on the current behavior.


Re: [PR58479] introduce a param to limit debug stmts count

2014-03-18 Thread Jakub Jelinek
On Fri, Mar 14, 2014 at 11:45:48PM -0300, Alexandre Oliva wrote:
> This bug report had various testcases that had to do with full loop
> unrolling with non-automatic iterators and fixed boundaries, which
> resulted in duplicating debug stmts in the loop for each iteration.  In
> some cases, the resulting executable code is none, but the debug stmts
> add up to millions.  Just dropping them on the floor is somewhat
> undesirable, even though they're not usable for much with today's
> compiler and debugger infrastructure.  I decided to introduce a param to
> limit debug stmts, so that this sort of testcase doesn't run nearly
> forever, eating up all memory while at that.  This is what this patchset
> does.

To some extent this is really a big-hammer approach; on the other hand, we
already have a precedent here, the max-vartrack-size limit, and if we have
too many debug stmts in a single function, we also most likely hit the
max-vartrack-size limit anyway.

It would be nice if for the loop unrolling we could try to do something
smarter (as I wrote in the PR, I think it would be e.g. nice to preserve
debug stmts on the first few unrolled iterations and the last one and
just say the vars are all unavailable for the middle iterations or
something, instead of dropping everything).

> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -4498,6 +4498,8 @@ allocate_struct_function (tree fndecl, bool abstract_p)
>  
>cfun = ggc_alloc_cleared_function ();
>  
> +  SET_BUILD_DEBUG_STMTS (cfun, flag_var_tracking_assignments);

Dunno how this plays together with __attribute__((optimize(...))),
I'm afraid not very well.  E.g. if in an -O0 -g compilation some function is
__attribute__((optimize(2))), then we want to have debug stmts in there,
but the above would preclude it; on the other hand, in an -O2 -g compilation
with an __attribute__((optimize(0))) function in it, we don't want debug
stmts in there.  So perhaps it needs to be updated when handling the optimize
attribute if the function doesn't have a body yet, or something similar?

> @@ -121,6 +121,12 @@ gimple_alloc_stat (enum gimple_code code, unsigned 
> num_ops MEM_STAT_DECL)
>size_t size;
>gimple stmt;
>  
> +  if (code == GIMPLE_DEBUG)
> +{  
> +  gcc_checking_assert (MAY_HAVE_DEBUG_STMTS);
> +  cfun->debug_stmts++;
> +}
> +
>size = gimple_size (code);
>if (num_ops > 0)
>  size += sizeof (tree) * (num_ops - 1);

I'd strongly prefer it if you could move this hunk to
gimple_build_debug_bind_stat and gimple_build_debug_source_bind_stat.
Yes, that duplicates it, but adding a branch to every GIMPLE allocation
doesn't look like a good idea to me.

Jakub


[patch] fix libstdc++/60564

2014-03-18 Thread Jonathan Wakely

This fixes a 4.8/4.9 regression where we move from an lvalue when
constructing a packaged_task.
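
For readers unfamiliar with the underlying issue, a minimal sketch follows.
It is not the libstdc++ code: tracked, store_with_move and
store_with_forward are hypothetical names chosen for illustration.  It
shows that with a forwarding reference, std::move steals from lvalue
arguments, while std::forward copies lvalues and moves only rvalues.

```cpp
#include <cassert>
#include <utility>

// Hypothetical callable that records whether it was moved from.
struct tracked
{
  tracked () : moved (false) { }
  tracked (const tracked &) : moved (false) { }
  tracked (tracked &&other) : moved (false) { other.moved = true; }
  void operator() () const { }
  bool moved;
};

// Buggy pattern: F&& is a forwarding reference here, so std::move also
// moves from lvalue arguments.
template<typename F>
tracked store_with_move (F &&f)
{
  return tracked (std::move (f));
}

// Fixed pattern: std::forward preserves the value category of the
// argument, so lvalues are copied and only rvalues are moved.
template<typename F>
tracked store_with_forward (F &&f)
{
  return tracked (std::forward<F> (f));
}
```

Passing an lvalue to store_with_move leaves its moved flag set, while
passing the same kind of lvalue to store_with_forward leaves it untouched,
which mirrors what the new test checks for packaged_task.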

Tested x86_64-linux, committed to trunk and 4.8 branch.
commit b07f31e30bc3ca1b91e199c37466824bb7c3bafa
Author: Jonathan Wakely 
Date:   Tue Mar 18 13:23:52 2014 +

PR libstdc++/60564
* include/std/future (__future_base::_Task_state<>): Change
constructors to template functions using perfect forwarding.
(__create_task_state): Use decayed type as stored task.
(packaged_task::packaged_task(_Fn&&)): Forward instead of moving.
* testsuite/30_threads/packaged_task/60564.cc: New.

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index ca3dacd..717ce71 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -1285,9 +1285,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __future_base::_Task_state<_Fn, _Alloc, _Res(_Args...)> final
 : __future_base::_Task_state_base<_Res(_Args...)>
 {
-  _Task_state(_Fn&& __fn, const _Alloc& __a)
-  : _Task_state_base<_Res(_Args...)>(__a), _M_impl(std::move(__fn), __a)
-  { }
+  template<typename _Fn2>
+   _Task_state(_Fn2&& __fn, const _Alloc& __a)
+   : _Task_state_base<_Res(_Args...)>(__a),
+ _M_impl(std::forward<_Fn2>(__fn), __a)
+   { }
 
 private:
   virtual void
@@ -1316,19 +1318,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   struct _Impl : _Alloc
   {
-   _Impl(_Fn&& __fn, const _Alloc& __a)
- : _Alloc(__a), _M_fn(std::move(__fn)) { }
+   template<typename _Fn2>
+ _Impl(_Fn2&& __fn, const _Alloc& __a)
+ : _Alloc(__a), _M_fn(std::forward<_Fn2>(__fn)) { }
_Fn _M_fn;
   } _M_impl;
 };
 
-template<typename _Signature, typename _Fn, typename _Alloc>
-  static shared_ptr<__future_base::_Task_state_base<_Signature>>
-  __create_task_state(_Fn&& __fn, const _Alloc& __a)
-  {
-   typedef __future_base::_Task_state<_Fn, _Alloc, _Signature> _State;
-   return std::allocate_shared<_State>(__a, std::move(__fn), __a);
-  }
+  template<typename _Signature, typename _Fn, typename _Alloc>
+static shared_ptr<__future_base::_Task_state_base<_Signature>>
+__create_task_state(_Fn&& __fn, const _Alloc& __a)
+{
+  typedef typename decay<_Fn>::type _Fn2;
+  typedef __future_base::_Task_state<_Fn2, _Alloc, _Signature> _State;
+  return std::allocate_shared<_State>(__a, std::forward<_Fn>(__fn), __a);
+}
 
   template<typename _Fn, typename _Alloc, typename _Res, typename... _Args>
 shared_ptr<__future_base::_Task_state_base<_Res(_Args...)>>
@@ -1368,7 +1372,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __constrain_pkgdtask<packaged_task, _Fn>::__type>
 explicit
 packaged_task(_Fn&& __fn)
-   : packaged_task(allocator_arg, std::allocator<int>(), std::move(__fn))
+   : packaged_task(allocator_arg, std::allocator<int>(),
+   std::forward<_Fn>(__fn))
{ }
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
diff --git a/libstdc++-v3/testsuite/30_threads/packaged_task/60564.cc 
b/libstdc++-v3/testsuite/30_threads/packaged_task/60564.cc
new file mode 100644
index 000..956d506
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/packaged_task/60564.cc
@@ -0,0 +1,51 @@
+// { dg-do run { target *-*-freebsd* *-*-netbsd* *-*-linux* *-*-gnu* 
*-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } }
+// { dg-options " -std=gnu++11 -pthread" { target *-*-freebsd* *-*-netbsd* 
*-*-linux* *-*-gnu* powerpc-ibm-aix* } }
+// { dg-options " -std=gnu++11 -pthreads" { target *-*-solaris* } }
+// { dg-options " -std=gnu++11 " { target *-*-cygwin *-*-darwin* } }
+// { dg-require-cstdint "" }
+// { dg-require-gthreads "" }
+// { dg-require-atomic-builtins "" }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+
+#include <future>
+#include <testsuite_hooks.h>
+
+struct X
+{
+  X() = default;
+  X(const X&) = default;
+  X(X&& x) { x.moved = true; }
+
+  void operator()() const { }
+
+  bool moved = false;
+};
+
+void test01()
+{
+  X x;
+  std::packaged_task<void()> p(x);
+  VERIFY( !x.moved );
+}
+
+int main()
+{
+  test01();
+}


[C++ testcases] PR 54250, PR 60305

2014-03-18 Thread Paolo Carlini

Hi,

I'm adding the below testcases for bugs already fixed in mainline and 4.8.x.

Thanks,
Paolo.

//
2014-03-18  Paolo Carlini  

PR c++/54250
* g++.dg/cpp0x/lambda/lambda-ice12.C: New.
Index: g++.dg/cpp0x/constexpr-ice14.C
===
--- g++.dg/cpp0x/constexpr-ice14.C  (revision 0)
+++ g++.dg/cpp0x/constexpr-ice14.C  (working copy)
@@ -0,0 +1,11 @@
+// PR c++/60305
+// { dg-do compile { target c++11 } }
+
+template<int I> int foo() { return I; }
+
+template<int... I> void bar()
+{
+  constexpr int (*X[])() = { foo<I>... };
+}
+
+template void bar<1,3,5>();
Index: g++.dg/cpp0x/lambda/lambda-ice12.C
===
--- g++.dg/cpp0x/lambda/lambda-ice12.C  (revision 0)
+++ g++.dg/cpp0x/lambda/lambda-ice12.C  (working copy)
@@ -0,0 +1,15 @@
+// PR c++/54250
+// { dg-do compile { target c++11 } }
+
+struct T
+{
+int a;
+int foo()
+{
+return [&]()->int {
+return [&](decltype(/*this->*/a) _)->int {
+return 1;
+}(a);
+}();
+}
+};
2014-03-18  Paolo Carlini  

PR c++/60305
* g++.dg/cpp0x/constexpr-ice14.C: New.
Index: g++.dg/cpp0x/constexpr-ice14.C
===
--- g++.dg/cpp0x/constexpr-ice14.C  (revision 0)
+++ g++.dg/cpp0x/constexpr-ice14.C  (working copy)
@@ -0,0 +1,11 @@
+// PR c++/60305
+// { dg-do compile { target c++11 } }
+
+template<int I> int foo() { return I; }
+
+template<int... I> void bar()
+{
+  constexpr int (*X[])() = { foo<I>... };
+}
+
+template void bar<1,3,5>();


Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation

2014-03-18 Thread Ilya Verbin
On 17 Mar 16:00, Thomas Schwinge wrote:
> >  GOMP_4.0 {
> >global:
> > +   GOMP_offload_register;
> > GOMP_barrier_cancel;
> > GOMP_cancel;
> > GOMP_cancellation_point;
> 
> Now that the GOMP_4.0 symbol version is being used in GCC trunk, and will
> be in the GCC 4.9 release, can we still add new symbols to it here?
> (Jakub?)

I moved it to GOMP_4.0.1.

> > +  /* This is the TYPE of device.  */
> > +  int type;
> 
> Use enum target_type instead of int?

Done.

> > +  offload_images = realloc (offload_images,
> > +   (num_offload_images + 1)
> > +   * sizeof (struct offload_image_descr));
> > +
> > +  if (offload_images == NULL)
> > +return;
> 
> Fail silently, or use gomp_realloc to fail loudly?

Replaced with gomp_realloc.
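
For readers unfamiliar with the "fail loudly" idiom being discussed, here is a rough sketch of the idea in C++. It is not libgomp's gomp_realloc; `realloc_or_die`, `image_descr` and `append_image` are hypothetical names used only for illustration. The point is that the caller never has to check for a null result.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a "fail loudly" reallocation helper.  This is
// NOT libgomp's gomp_realloc, only the same general idea.
void *realloc_or_die(void *old, std::size_t size) {
  void *p = std::realloc(old, size);
  if (p == nullptr && size != 0) {
    std::fprintf(stderr, "out of memory allocating %zu bytes\n", size);
    std::abort();
  }
  return p;
}

// Illustrative record only; the real descriptor layout is not shown here.
struct image_descr {
  int type;
  void *data;
};

// Grow the array by one element.  Allocation failure aborts inside
// realloc_or_die, so no silent early return is needed here.
image_descr *append_image(image_descr *images, std::size_t &count,
                          image_descr d) {
  images = static_cast<image_descr *>(
      realloc_or_die(images, (count + 1) * sizeof(image_descr)));
  images[count++] = d;
  return images;
}
```

Compared with returning silently when realloc fails, this keeps the table and its element count consistent at every point where the program is still running.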

> >if (dir)
> >  closedir (dir);
> > +  free (offload_images);
> 
> I suggest to set offload_images = NULL, for clarity.

Done.

> OK to commit, thanks!

Committed as r208657.


> Would it make sense to have device_run return a value to make it able to
> indicate to libgomp that the function cannot be run on the device (for
> whatever reason), and libgomp should use host-fallback execution?
> (Probably that needs more thought and discussion, OK to defer.)

Consider the following example (using OpenMP, I don't know OpenACC :)

int foo ()
{
  int x = 0;

  /* offload_fn1  */
  #pragma omp target map(to: x)
{
  x += 5;
}

  /* Some code on host without updating 'x' from target.  */

  /* offload_fn2  */
  #pragma omp target map(from: x)
{
  x += 10;
}

  return x;
}

If both offload_fn1 and offload_fn2 are executed on the host, everything is
fine and x = 15.  The same holds when both offload_fn1 and offload_fn2 are
executed on the target.  But if offload_fn1 is executed on the target and
offload_fn2 is executed on the host, then 'x' will have an incorrect value (10).

Therefore, I proposed checking for target device availability only during
initialization of the plugin, and deciding at that point whether libgomp will
run all functions on the host or on the target.  Probably libgomp should return
an error if something was executed on the device and the device then becomes
unavailable.
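
The policy described above could be modelled roughly as follows.  This is only a sketch in C++ and not libgomp code: `Runtime`, `Placement` and `run_region` are hypothetical names, and the `device_still_up` flag stands in for whatever check a real plugin would perform.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical model, not libgomp's API: where all target regions run.
enum class Placement { Host, Target };

class Runtime {
public:
  // Device availability is probed exactly once, at "plugin" initialization.
  explicit Runtime(bool device_available)
    : placement_(device_available ? Placement::Target : Placement::Host) {}

  Placement placement() const { return placement_; }

  // Run one region.  If we committed to the device and it has since gone
  // away, report an error rather than silently falling back to the host,
  // which could mix host and target copies of mapped variables.
  void run_region(const std::function<void()> &region, bool device_still_up) {
    if (placement_ == Placement::Target && !device_still_up)
      throw std::runtime_error("device became unavailable after init");
    region();
  }

private:
  const Placement placement_;
};
```

Because the placement is fixed at construction, two consecutive regions can never end up on different sides, which is the case that produced the wrong value in the example.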

  -- Ilya


[jit] Add missing tests to test-combination.c

2014-03-18 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/testsuite/
* jit.dg/test-combination.c: Add test-arrays.c and test-volatile.c.
Add comment about test-error-*.c.  Remove comment about
test-failure.c, which was removed in
96b218c9a1d5f39fb649e02c0e77586b180e8516.
(create_code): Call into test-arrays.c and test-volatile.c.
(verify_code): Likewise.
---
 gcc/testsuite/ChangeLog.jit |  9 +
 gcc/testsuite/jit.dg/test-combination.c | 24 +---
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/ChangeLog.jit b/gcc/testsuite/ChangeLog.jit
index 1d8c11c..e07ba87 100644
--- a/gcc/testsuite/ChangeLog.jit
+++ b/gcc/testsuite/ChangeLog.jit
@@ -1,3 +1,12 @@
+2014-03-18  David Malcolm  
+
+   * jit.dg/test-combination.c: Add test-arrays.c and test-volatile.c.
+   Add comment about test-error-*.c.  Remove comment about
+   test-failure.c, which was removed in
+   96b218c9a1d5f39fb649e02c0e77586b180e8516.
+   (create_code): Call into test-arrays.c and test-volatile.c.
+   (verify_code): Likewise.
+
 2014-03-14  David Malcolm  
 
* jit.dg/test-expressions.c (called_pointer_checking_function): New.
diff --git a/gcc/testsuite/jit.dg/test-combination.c 
b/gcc/testsuite/jit.dg/test-combination.c
index cd8a0f3..72b8602 100644
--- a/gcc/testsuite/jit.dg/test-combination.c
+++ b/gcc/testsuite/jit.dg/test-combination.c
@@ -14,6 +14,13 @@
 #undef create_code
 #undef verify_code
 
+/* test-arrays.c */
+#define create_code create_code_arrays
+#define verify_code verify_code_arrays
+#include "test-arrays.c"
+#undef create_code
+#undef verify_code
+
 /* test-calling-external-function.c */
 #define create_code create_code_calling_external_function
 #define verify_code verify_code_calling_external_function
@@ -28,6 +35,9 @@
 #undef create_code
 #undef verify_code
 
+/* test-error-*.c: We don't use these test cases, since they deliberately
+   introduce errors, which we don't want here.  */
+
 /* test-expressions.c */
 #define create_code create_code_expressions
 #define verify_code verify_code_expressions
@@ -42,9 +52,6 @@
 #undef create_code
 #undef verify_code
 
-/* We don't use test-failure.c; we don't want its failure to affect our
-   combined case.  */
-
 /* test-fibonacci.c */
 #define create_code create_code_fibonacci
 #define verify_code verify_code_fibonacci
@@ -115,6 +122,13 @@
 #undef create_code
 #undef verify_code
 
+/* test-volatile.c */
+#define create_code create_code_volatile
+#define verify_code verify_code_volatile
+#include "test-volatile.c"
+#undef create_code
+#undef verify_code
+
 /* Now construct a test case from all the other test cases.
 
We undefine TEST_COMBINATION so that we can now include harness.h
@@ -127,6 +141,7 @@ void
 create_code (gcc_jit_context *ctxt, void * user_data)
 {
   create_code_accessing_struct (ctxt, user_data);
+  create_code_arrays (ctxt, user_data);
   create_code_calling_external_function (ctxt, user_data);
   create_code_dot_product (ctxt, user_data);
   create_code_expressions (ctxt, user_data);
@@ -139,12 +154,14 @@ create_code (gcc_jit_context *ctxt, void * user_data)
   create_code_sum_of_squares (ctxt, user_data);
   create_code_types (ctxt, user_data);
   create_code_using_global (ctxt, user_data);
+  create_code_volatile (ctxt, user_data);
 }
 
 void
 verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
 {
   verify_code_accessing_struct (ctxt, result);
+  verify_code_arrays (ctxt, result);
   verify_code_calling_external_function (ctxt, result);
   verify_code_dot_product (ctxt, result);
   verify_code_expressions (ctxt, result);
@@ -157,4 +174,5 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
   verify_code_sum_of_squares (ctxt, result);
   verify_code_types (ctxt, result);
   verify_code_using_global (ctxt, result);
+  verify_code_volatile (ctxt, result);
 }
-- 
1.8.5.3



[committed] Fix -fsanitize=undefined with non-C family FEs (PR sanitizer/60557)

2014-03-18 Thread Jakub Jelinek
Hi!

While most of the sanitization is done in the FEs (c-family/),
some is done in the middle-end (ubsan pass) or when folding
__builtin_unreachable ().

This patch fixes ICE with that, bootstrapped/regtested on x86_64-linux and
i686-linux, committed to trunk as obvious.

2014-03-18  Jakub Jelinek  

PR sanitizer/60557
* ubsan.c (ubsan_instrument_unreachable): Call
initialize_sanitizer_builtins.
(ubsan_pass): Likewise.

--- gcc/ubsan.c.jj  2014-03-17 20:01:34.0 +0100
+++ gcc/ubsan.c 2014-03-18 07:14:50.889409415 +0100
@@ -512,6 +512,7 @@ ubsan_create_data (const char *name, loc
 tree
 ubsan_instrument_unreachable (location_t loc)
 {
+  initialize_sanitizer_builtins ();
   tree data = ubsan_create_data ("__ubsan_unreachable_data", loc, NULL,
 NULL_TREE);
   tree t = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_BUILTIN_UNREACHABLE);
@@ -847,6 +848,8 @@ ubsan_pass (void)
   basic_block bb;
   gimple_stmt_iterator gsi;
 
+  initialize_sanitizer_builtins ();
+
   FOR_EACH_BB_FN (bb, cfun)
 {
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)

Jakub


Re: Ping^3 GCC trunk 4.9: documentation patch on plugins

2014-03-18 Thread Basile Starynkevitch
On Tue, 2014-03-18 at 08:53 -0400, Diego Novillo wrote:
> OK with:
> 
> +Pragmas registered with @code{c_register_pragma_with_expansion} or
> +@code{c_register_pragma_with_expansion_and_data} are supporting
> +preprocessor expansions. For an example of using such a pragma:
> 
> s/are supporting/support/
> s/For an example of using such a pragma/For example/
> 

Thanks for the review.
Committed revision 208660.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***




[C++ Patch / RFC] PR 51474

2014-03-18 Thread Paolo Carlini

Hi,

the attached is slightly larger than my last ones but, assuming the
analysis is correct, should be pretty safe.  It avoids a segfault: for a
pure virtual called from an NSDMI, current_function_decl is null and the
existing warning about *structors breaks.  I'm reading 12.7/4 and I think
we can solve the issue by just adding an appropriately worded warning.
Tested x86_64-linux.
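
As background for why a warning is appropriate, here is a small sketch of my own (not part of the patch) using a non-pure virtual, so that the behaviour is well defined.  An NSDMI is evaluated while the object is under construction, so a virtual call made there dispatches to the class being constructed, not to a derived override.  The names `Base`, `Derived` and the two helper functions are illustrative only.

```cpp
struct Base {
  virtual ~Base() = default;
  virtual int foo() { return 1; }
  int i = foo();  // NSDMI: runs while Base is still under construction
};

struct Derived : Base {
  int foo() override { return 2; }
};

// The initializer saw Base::foo, even though the complete object is Derived.
int value_seen_by_nsdmi() {
  Derived d;
  return d.i;
}

// Once construction has finished, the call dispatches to Derived::foo.
int value_after_construction() {
  Derived d;
  return d.foo();
}
```

If `Base::foo` were pure, the call in the initializer would have nothing to dispatch to, which is the undefined behaviour the new warning is meant to flag.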


Thanks,
Paolo.

///
Index: cp/call.c
===
--- cp/call.c   (revision 208658)
+++ cp/call.c   (working copy)
@@ -7828,15 +7828,20 @@ build_new_method_call_1 (tree instance, tree fns,
  if (!(flags & LOOKUP_NONVIRTUAL)
  && DECL_PURE_VIRTUAL_P (fn)
  && instance == current_class_ref
- && (DECL_CONSTRUCTOR_P (current_function_decl)
- || DECL_DESTRUCTOR_P (current_function_decl))
  && (complain & tf_warning))
-   /* This is not an error, it is runtime undefined
-  behavior.  */
-   warning (0, (DECL_CONSTRUCTOR_P (current_function_decl) ?
- "pure virtual %q#D called from constructor"
- : "pure virtual %q#D called from destructor"),
-fn);
+   {
+ /* This is not an error, it is runtime undefined
+behavior.  */
+ if (!current_function_decl)
+   warning (0, "pure virtual %q#D called from "
+"non-static data member initializer", fn);
+ else if (DECL_CONSTRUCTOR_P (current_function_decl)
+  || DECL_DESTRUCTOR_P (current_function_decl))
+   warning (0, (DECL_CONSTRUCTOR_P (current_function_decl)
+? "pure virtual %q#D called from constructor"
+: "pure virtual %q#D called from destructor"),
+fn);
+   }
 
  if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
  && is_dummy_object (instance))
Index: testsuite/g++.dg/cpp0x/nsdmi-virtual2.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-virtual2.C (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-virtual2.C (working copy)
@@ -0,0 +1,8 @@
+// PR c++/51474
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  virtual int foo() = 0;
+  int i = foo();  // { dg-warning "pure virtual" }
+};


Re: [patch sdbout]: Fix PR rtl-optimization/56356

2014-03-18 Thread Kai Tietz
Patch got approved in bug-report for trunk.

Applied at revision 208663 to trunk.

Kai


Fix target/60562 -- more i387 int->float fallout

2014-03-18 Thread Richard Henderson
Brown bag time on this one.  While I did build and test both x86_64 and i686
separately, I apparently missed the regression in the test results.

Anyway, a simple fix for the problem: don't enable the pattern when it's not
supposed to be enabled due to excess precision.  Moving the _387 pattern down
means that we don't have to repeat the condition on the _sse pattern just above.

Two separate patches, because I forgot about the mixed-sse-387 case until after
I'd already committed the first patch.  Ho hum.


r~
PR target/60562
* config/i386/i386.md (*float2_i387): Move down to
be shadowed by *float2_sse.  Test X87_ENABLE_FLOAT.


diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a824e78..abc22f2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4701,15 +4701,6 @@
 }
 })
 
-(define_insn "*float2_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f")
-   (float:MODEF (match_operand:SWI48x 1 "nonimmediate_operand" "m")))]
-  "TARGET_80387 && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)"
-  "fild%Z1\t%1"
-  [(set_attr "type" "fmov")
-   (set_attr "mode" "")
-   (set_attr "fp_int_src" "true")])
-
 (define_insn "*float2_sse"
   [(set (match_operand:MODEF 0 "register_operand" "=f,x,x")
(float:MODEF
@@ -4743,6 +4734,15 @@
(symbol_ref "true")))
])
 
+(define_insn "*float2_i387"
+  [(set (match_operand:MODEF 0 "register_operand" "=f")
+   (float:MODEF (match_operand:SWI48x 1 "nonimmediate_operand" "m")))]
+  "TARGET_80387 && X87_ENABLE_FLOAT (mode, mode)"
+  "fild%Z1\t%1"
+  [(set_attr "type" "fmov")
+   (set_attr "mode" "")
+   (set_attr "fp_int_src" "true")])
+
 ;; Try TARGET_USE_VECTOR_CONVERTS, but not so hard as to require extra memory
 ;; slots when !TARGET_INTER_UNIT_MOVES_TO_VEC disables the general_regs
 ;; alternative in sse2_loadld.
PR target/60562
* config/i386/i386.md (*float2_sse): Check
X87_ENABLE_FLOAT for alternative 0.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index abc22f2..d96f997 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4726,7 +4726,9 @@
(set_attr "fp_int_src" "true")
(set (attr "enabled")
  (cond [(eq_attr "alternative" "0")
-  (symbol_ref "TARGET_MIX_SSE_I387")
+  (symbol_ref "TARGET_MIX_SSE_I387
+   && X87_ENABLE_FLOAT (mode,
+mode)"
 (eq_attr "alternative" "1")
   (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS
|| optimize_function_for_size_p (cfun)")


Re: [RFA jit 1/2] introduce class toplev

2014-03-18 Thread David Malcolm
On Tue, 2014-03-18 at 08:57 -0600, Tom Tromey wrote:
> This patch introduces a new "class toplev" and changes toplev_main and
> toplev_finalize to be methods of this class.  Additionally, now the
> timevars are automatically stopped when the object is destroyed.  This
> cleans up "compile" a bit and makes it simpler to reuse the toplev
> logic in other code.

Thanks.

> ---
>  gcc/ChangeLog.jit  | 14 +

I see the Changelog.jit file listed here...

>  gcc/diagnostic.c   |  2 +-
>  gcc/jit/ChangeLog.jit  |  5 +

...and here, but I don't see the content below, or within the header of
the email.  Is this available somewhere?

[...snip...]



Re: [RFA jit 2/2] introduce scoped_timevar

2014-03-18 Thread David Malcolm
On Tue, 2014-03-18 at 08:57 -0600, Tom Tromey wrote:
> This introduces a new scoped_timevar class.  It pushes a given timevar
> in its constructor, and pops it in the destructor, giving a much
> simpler way to use timevars in the typical case where they can be
> scoped.
> ---
>  gcc/ChangeLog.jit  |  4 
>  gcc/jit/ChangeLog.jit  |  4 
>  gcc/jit/internal-api.c | 16 +---
>  gcc/timevar.h  | 24 +++-
>  4 files changed, 36 insertions(+), 12 deletions(-)

Thanks.  Looks good to me, but, like the other patch, are the
ChangeLog.jit entries available somewhere?
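
For anyone following along without the patch at hand, the scoped idea being described looks roughly like this.  It is a self-contained sketch: `timevar_id`, `timevar_stack` and the two stub functions are stand-ins I made up so the example compiles on its own, and they are not GCC's real timevar implementation.

```cpp
#include <vector>

// Stand-ins for GCC's timevar machinery so the sketch is self-contained.
enum timevar_id { TV_ASSEMBLE, TV_LOAD };
static std::vector<timevar_id> timevar_stack;

static void timevar_push(timevar_id tv) { timevar_stack.push_back(tv); }

static void timevar_pop(timevar_id tv) {
  // Only pop when the innermost running timer is the one being stopped.
  if (!timevar_stack.empty() && timevar_stack.back() == tv)
    timevar_stack.pop_back();
}

// Push in the constructor, pop in the destructor.
class scoped_timevar {
public:
  explicit scoped_timevar(timevar_id tv) : m_tv(tv) { timevar_push(m_tv); }
  ~scoped_timevar() { timevar_pop(m_tv); }

  // Copying would pop the same timer twice, so disallow it.
  scoped_timevar(const scoped_timevar &) = delete;
  scoped_timevar &operator=(const scoped_timevar &) = delete;

private:
  timevar_id m_tv;
};

// An early return no longer needs a matching explicit pop.
static bool assemble(bool fail) {
  scoped_timevar tv(TV_ASSEMBLE);
  if (fail)
    return false;
  return true;
}
```

The benefit shows up on the early-return path: the timer is stopped on every exit from the scope without repeating the pop call.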




[PATCH] Fix ICE with MASK_LOAD and -fno-tree-dce (PR tree-optimization/60559)

2014-03-18 Thread Jakub Jelinek
Hi!

With -fno-tree-dce the scalar MASK_LOAD isn't removed from the IL and we ICE
on it during expansion (we support only the vector loads; if those aren't
supported, MASK_LOAD is either not created by if-conversion at all, or
vectorization refuses to vectorize the loop and thus it is removed by
cfg cleanup).

The following patch replaces the scalar MASK_LOAD manually with a load of
zero, similarly to how we do it for vectorizable calls.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-03-18  Jakub Jelinek  

PR tree-optimization/60559
* tree-vect-stmts.c (vectorizable_mask_load_store): Replace scalar
MASK_LOAD with build_zero_cst assignment.

* g++.dg/vect/pr60559.cc: New test.

--- gcc/tree-vect-stmts.c.jj	2014-03-03 08:24:33.0 +0100
+++ gcc/tree-vect-stmts.c   2014-03-18 14:01:40.969657763 +0100
@@ -2038,6 +2038,15 @@ vectorizable_mask_load_store (gimple stm
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
  prev_stmt_info = vinfo_for_stmt (new_stmt);
}
+
+  /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
+from the IL.  */
+  tree lhs = gimple_call_lhs (stmt);
+  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
+  set_vinfo_for_stmt (new_stmt, stmt_info);
+  set_vinfo_for_stmt (stmt, NULL);
+  STMT_VINFO_STMT (stmt_info) = new_stmt;
+  gsi_replace (gsi, new_stmt, true);
   return true;
 }
   else if (is_store)
@@ -2149,6 +2158,18 @@ vectorizable_mask_load_store (gimple stm
}
 }
 
+  if (!is_store)
+{
+  /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
+from the IL.  */
+  tree lhs = gimple_call_lhs (stmt);
+  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
+  set_vinfo_for_stmt (new_stmt, stmt_info);
+  set_vinfo_for_stmt (stmt, NULL);
+  STMT_VINFO_STMT (stmt_info) = new_stmt;
+  gsi_replace (gsi, new_stmt, true);
+}
+
   return true;
 }
 
--- gcc/testsuite/g++.dg/vect/pr60559.cc.jj	2014-03-18 14:04:55.173449250 +0100
+++ gcc/testsuite/g++.dg/vect/pr60559.cc	2014-03-18 14:05:26.610273088 +0100
@@ -0,0 +1,8 @@
+// PR tree-optimization/60559
+// { dg-do compile }
+// { dg-additional-options "-O3 -std=c++11 -fnon-call-exceptions -fno-tree-dce" }
+// { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } }
+
+#include "pr60023.cc"
+
+// { dg-final { cleanup-tree-dump "vect" } }

Jakub


[Fortran-caf, committed] Merge trunk into branch

2014-03-18 Thread Tobias Burnus


* Merge from the trunk: r208408 through r208666.

Committed to the Fortran-caf branch, Rev. 208667.

Tobias


Re: [PATCH] BZ60501: Add addptr optab

2014-03-18 Thread Richard Henderson
On 03/18/2014 04:59 AM, Jakub Jelinek wrote:
> On Mon, Mar 17, 2014 at 03:24:14PM -0400, Vladimir Makarov wrote:
>> It is complicated.  There is no guarantee that it is used only for
>> addresses.  I need some time to think how to fix it.
>>
>> Meanwhile, you *should* commit the patch into the trunk because it
>> solves the real problem.  And I can work from this to make changes
>> that the new pattern is only used for addresses.
>>
>> The patch is absolutely safe for all targets but s390.  There is
>> still a tiny possibility that it might result in some problems for
>> s390  (now I see only one situation when a pseudo in a subreg
>> changed by equiv plus expr needs a reload).  In any case your patch
>> solves real numerous failures and can be used as a base for further
>> work.
>>
>> Thanks for working on this problem, Andreas.  Sorry that I missed
>> the PR60501.
> 
> BTW, does LRA require that CC isn't clobbered when it uses
> emit_add2_insn? I don't see how it can be guaranteed
> (except perhaps on i?86/x86_64 and maybe a few other targets).

At least in pre-LRA days one *had* to have such an addition, or alternately you
had to leave compare+use as a single insn pattern until post-reload splitting.
 I assume that's really unchanged with LRA...

> emit_add3_insn should be ok (even on s390*?) because recog_memoized
> should (I think) never add clobbers (it calls recog with 0
> as last argument), but gen_add2_insn is a normal add3 insn that
> on many targets clobbers CC.

Sure, but s390 has no choice but to implement the clobber-less add with the
LOAD ADDRESS instruction.

Which is perfectly safe for 64-bit, but for 32-bit will in fact crop that 31st
bit.  Which is perfectly safe so long as we never want to do addition except on
pointers.  Which brings us back to Vlad's point about REG_EQUIV being a
potential hazard.
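
To make the hazard concrete, here is a tiny arithmetic model in C++.  It is only an illustration of "an add that keeps 31 bits" and makes no claim about the exact semantics of the s390 instruction; `add_as_address` and `add_as_integer` are hypothetical names.

```cpp
#include <cstdint>

// Model of 31-bit address generation: only the low 31 bits of the sum
// survive.  Illustration only, not the precise instruction semantics.
constexpr std::uint32_t add_as_address(std::uint32_t a, std::uint32_t b) {
  return (a + b) & 0x7fffffffu;
}

// Ordinary 32-bit integer addition for comparison.
constexpr std::uint32_t add_as_integer(std::uint32_t a, std::uint32_t b) {
  return a + b;
}
```

For sums that stay below 2^31 the two agree, which is why the substitution is harmless for pointers; for a general integer such as 0x7fffffff + 1 the address-style add yields 0 where the integer add yields 0x80000000.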

I agree with Vlad that we're better off with Andreas' patch than without, since
computing addresses is going to be 99% of what reload/LRA needs to do.

I also agree with Eric that some better commentary would be nice.


r~


Re: [PATCH] Fix error message from -Wcast-qual when casting away volatile

2014-03-18 Thread Manuel López-Ibáñez
It seems that the patch was tested only in the C testsuite, but the
changes to the testcase fail for C++. So I committed the following,
which is what my patch should have done in the first place, as
obvious.

I hope this is OK with you.

Cheers,

Manuel.

Index: gcc/testsuite/gcc.dg/cast-qual-3.c
===
--- gcc/testsuite/gcc.dg/cast-qual-3.c  (revision 0)
+++ gcc/testsuite/gcc.dg/cast-qual-3.c  (revision 208669)
@@ -0,0 +1,11 @@
+/* PR 55383 */
+/* { dg-do compile } */
+/* { dg-options "-Wcast-qual" } */
+
+void set(void*);
+
+int foo(int argc)
+{
+  volatile double val;
+  set((void*)&val); /* { dg-warning "cast discards .volatile. qualifier" } */
+}
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 208668)
+++ gcc/testsuite/ChangeLog (revision 208669)
@@ -1,3 +1,11 @@
+2014-03-19  Manuel López-Ibáñez  
+
+   PR c/55383
+   * gcc.dg/cast-qual-3.c: New.
+   Revert:
+   2014-03-18  Manuel López-Ibáñez  
+   * c-c++-common/Wcast-qual-1.c: More precise match text.
+
 2014-03-18  Janus Weil  

PR fortran/55207
Index: gcc/testsuite/c-c++-common/Wcast-qual-1.c
===
--- gcc/testsuite/c-c++-common/Wcast-qual-1.c   (revision 208668)
+++ gcc/testsuite/c-c++-common/Wcast-qual-1.c   (revision 208669)
@@ -85,11 +85,11 @@
 void
 f4 (void * const **bar)
 {
-  const void ***p9 = (const void ***) bar; /* { dg-warning "cast discards .const. qualifier " } */
+  const void ***p9 = (const void ***) bar; /* { dg-warning "cast" } */
   void * const **p11 = (void * const **) bar;
   void ** const *p13 = (void ** const *) bar; /* { dg-warning "cast" } */
   const void * const **p15 = (const void * const **) bar; /* { dg-warning "cast" } */
-  const void ** const *p17 = (const void ** const *) bar; /* { dg-warning "cast discards .const. qualifier" } */
+  const void ** const *p17 = (const void ** const *) bar; /* { dg-warning "cast" } */
   void * const * const * p19 = (void * const * const *) bar;
   const void * const * const *p21 = (const void * const * const *) bar;
 }


Re: [PR58479] introduce a param to limit debug stmts count

2014-03-18 Thread Alexandre Oliva
On Mar 18, 2014, Jakub Jelinek  wrote:

>> --- a/gcc/function.c
>> +++ b/gcc/function.c
>> @@ -4498,6 +4498,8 @@ allocate_struct_function (tree fndecl, bool abstract_p)
>> 
>> cfun = ggc_alloc_cleared_function ();
>> 
>> +  SET_BUILD_DEBUG_STMTS (cfun, flag_var_tracking_assignments);

> Dunno how this plays together with __attribute__((optimize(...))),
> I'm afraid not very well.

MAY_HAVE_DEBUG_STMTS (and MAY_HAVE_DEBUG_INSNS) used to be defined like
that, this just saves that status in a per-function data structure.  I
suppose this makes it easier to take other per-function directives into
account, but I haven't done that myself.

>> +  if (code == GIMPLE_DEBUG)
>> +{  
>> +  gcc_checking_assert (MAY_HAVE_DEBUG_STMTS);
>> +  cfun->debug_stmts++;
>> +}

> I'd strongly prefer if you could move this hunk to
> gimple_build_debug_bind_stat and gimple_build_debug_source_bind_stat,

That's where I put it at first, but it didn't fix the loop unrolling
testcase, because duplicate_bb, used for loop unrolling, and other paths
use gimple_alloc instead of specific functions to allocate copies of
existing stmts.


After posting the patchset, I got a failure to compile reflect-go with
the limit set to 20 on i686.  The problem was in
maybe_move_debug_stmts_to_successors, that removed debug stmts that were
in the debug_stmts vec, to be remapped later, and when they got
remapped, they were updated (even though they were removed) and SSA DEFs
ended up linking to the removed stmt, that later got GCed.  Oops.
So I went back to an earlier version of that hunk, that moved the stmts
instead of removing them.

This is the fixed patch, that I'm finishing testing along with the other
two unmodified patches, with various limits.  Is this ok to install?

I'm afraid I won't be able to work on this any further, because tomorrow
I'm flying to the US for LibrePlanet, and when I return
I'm supposed to get back to glibc.  If this doesn't go in as is and
nobody else picks it up, it will have to wait until I get another time
slot for GCC.


Optimize debug stmt generation after limit is exceeded

From: Alexandre Oliva 

for  gcc/ChangeLog

	PR debug/58479
	* ipa-prop.c (ipa_modify_call_arguments): Don't build new
	debug stmts if BUILD_DEBUG_STMTS_P doesn't hold any more.
	* ipa-split.c (split_function): Likewise.
	* tree-cfg.c (gimple_merge_blocks): Likewise.
	(gimple_duplicate_bb): Likewise.
	* tree-inline.c (remap_ssa_name): Likewise.
	(remap_gimple_seq, copy_bb): Tolerate that...
	(remap_gimple_stmt): ... return NULL after the debug stmt
	limit is reached.  Check that debug stmts don't get to where
	we don't expect them.
	(maybe_move_debug_stmts_to_successors): Move trailing debug
	stmts to any successor if we're past the limit.  Avoid
	confusing reuse of variable si.
	(insert_init_debug_bind): Don't build debug stmts past the
	limit.
	(insert_init_stmt): Likewise.
	* tree-into-ssa.c (insert_phi_nodes_for): Likewise.
	(rewrite_debug_stmt_uses): Likewise.
	(rewrite_stmt): Likewise.
	(maybe_register_def): Likewise.
	* tree-sra.c (generate_subtree_copies): Likewise.
	(init_subtree_with_zero): Likewise.
	(sra_modify_expr): Likewise.
	(load_assign_lhs_subreplacements): Likewise.
	(sra_modify_assign): Likewise.
	(sra_ipa_reset_debug_stmts): Likewise.
	* tree-ssa-dce.c (remove_dead_stmt): Likewise.
	* tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise.
	* tree-ssa.c (insert_debug_temp_for_var_def): Likewise.
---
 gcc/ipa-prop.c |2 +-
 gcc/ipa-split.c|   10 +-
 gcc/tree-cfg.c |7 ++-
 gcc/tree-inline.c  |   30 +-
 gcc/tree-into-ssa.c|   17 ++---
 gcc/tree-sra.c |   15 +--
 gcc/tree-ssa-dce.c |3 ++-
 gcc/tree-ssa-loop-ivopts.c |2 +-
 gcc/tree-ssa.c |3 ++-
 9 files changed, 53 insertions(+), 36 deletions(-)

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 9f144fa..0d196b4 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3823,7 +3823,7 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gimple stmt,
 	}
 	  vargs.quick_push (expr);
 	}
-  if (adj->op != IPA_PARM_OP_COPY && MAY_HAVE_DEBUG_STMTS)
+  if (adj->op != IPA_PARM_OP_COPY && BUILD_DEBUG_STMTS_P)
 	{
 	  unsigned int ix;
 	  tree ddecl = NULL_TREE, origin = DECL_ORIGIN (adj->base), arg;
diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index 38bd8836..6d9fc44 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -1300,11 +1300,11 @@ split_function (struct split_point *split_point)
 	  tree ddecl;
 	  gimple def_temp;
 
-	  /* This needs to be done even without MAY_HAVE_DEBUG_STMTS,
-	 otherwise if it didn't exist before, we'd end up with
-	 different SSA_NAME_VERSIONs between -g and -g0.  */
+	  /* This needs to be done even without debug stmts, otherwise
+	 if it didn't exist before, we'd end up with different
+	 SSA_NAME_VERSIONs between -g and -g0.  */
 	  arg

Re: [RFA jit 1/2] introduce class toplev

2014-03-18 Thread Tom Tromey
>> gcc/ChangeLog.jit  | 14 +

David> I see the Changelog.jit file listed here...

>> gcc/diagnostic.c   |  2 +-
>> gcc/jit/ChangeLog.jit  |  5 +

David> ...and here, but I don't see the content below, or within the header of
David> the email.  Is this available somewhere?

Sorry about that.  My mail-sending script went awry somehow -- I will
resend with the plain old git send-email tomorrow.

Tom


Re: [RFA jit 2/2] introduce scoped_timevar

2014-03-18 Thread Trevor Saunders
On Tue, Mar 18, 2014 at 08:57:44AM -0600, Tom Tromey wrote:
> This introduces a new scoped_timevar class.  It pushes a given timevar
> in its constructor, and pops it in the destructor, giving a much
> simpler way to use timevars in the typical case where they can be
> scoped.

thanks for doing this.  I wonder about naming: we already have auto_vec,
and while I don't really care whether we use auto_ or scoped_, it seems like
being consistent would be nice.

Trev

> ---
>  gcc/ChangeLog.jit  |  4 
>  gcc/jit/ChangeLog.jit  |  4 
>  gcc/jit/internal-api.c | 16 +---
>  gcc/timevar.h  | 24 +++-
>  4 files changed, 36 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
> index 6a4d2ae..8285c64 100644
> --- a/gcc/jit/internal-api.c
> +++ b/gcc/jit/internal-api.c
> @@ -3737,8 +3737,6 @@ compile ()
>if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
> dump_generated_code ();
>  
> -  timevar_push (TV_ASSEMBLE);
> -
>/* Gross hacks follow:
>   We have a .s file; we want a .so file.
>   We could reuse parts of gcc/gcc.c to do this.
> @@ -3746,6 +3744,8 @@ compile ()
> */
>/* FIXME: totally faking it for now, not even using pex */
>{
> +scoped_timevar assemble_timevar (TV_ASSEMBLE);
> +
>  char cmd[1024];
>  snprintf (cmd, 1024, "gcc -shared %s -o %s",
>m_path_s_file, m_path_so_file);
> @@ -3753,20 +3753,16 @@ compile ()
>printf ("cmd: %s\n", cmd);
>  int ret = system (cmd);
>  if (ret)
> -  {
> - timevar_pop (TV_ASSEMBLE);
> - return NULL;
> -  }
> +  return NULL;
>}
> -  timevar_pop (TV_ASSEMBLE);
>  
>// TODO: split out assembles vs linker
>  
>/* dlopen the .so file. */
>{
> -const char *error;
> +scoped_timevar load_timevar (TV_LOAD);
>  
> -timevar_push (TV_LOAD);
> +const char *error;
>  
>  /* Clear any existing error.  */
>  dlerror ();
> @@ -3779,8 +3775,6 @@ compile ()
>result_obj = new result (handle);
>  else
>result_obj = NULL;
> -
> -timevar_pop (TV_LOAD);
>}
>  
>return result_obj;
> diff --git a/gcc/timevar.h b/gcc/timevar.h
> index dc2a8bc..eb8bf0d 100644
> --- a/gcc/timevar.h
> +++ b/gcc/timevar.h
> @@ -1,5 +1,5 @@
>  /* Timing variables for measuring compiler performance.
> -   Copyright (C) 2000-2013 Free Software Foundation, Inc.
> +   Copyright (C) 2000-2014 Free Software Foundation, Inc.
> Contributed by Alex Samuel 
>  
> This file is part of GCC.
> @@ -110,6 +110,28 @@ timevar_pop (timevar_id_t tv)
>  timevar_pop_1 (tv);
>  }
>  
> +class scoped_timevar
> +{
> + public:
> +  scoped_timevar (timevar_id_t tv)
> +: m_tv (tv)
> +  {
> +timevar_push (m_tv);
> +  }
> +
> +  ~scoped_timevar ()
> +  {
> +timevar_pop (m_tv);
> +  }
> +
> + private:
> +
> +  // Private to disallow copies.
> +  scoped_timevar (const scoped_timevar &);
> +
> +  timevar_id_t m_tv;
> +};
> +
>  extern void print_time (const char *, long);
>  
>  #endif /* ! GCC_TIMEVAR_H */
> -- 
> 1.8.5.3
> 


[PATCH] Fix PR c++/60573

2014-03-18 Thread Adam Butcher
PR c++/60573
* parser.c (synthesize_implicit_template_parm): Handle the fact that
nested class member declarations erroneously appearing in an enclosing
class contain an additional scope level for the class being defined.

PR c++/60573
* g++.dg/cpp1y/pr60573.C: New testcase.
---
 gcc/cp/parser.c  | 18 --
 gcc/testsuite/g++.dg/cpp1y/pr60573.C | 13 +
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/pr60573.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 46e2453..36872c9 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -32006,9 +32006,23 @@ synthesize_implicit_template_parm  (cp_parser *parser)
must be injected in the scope of 'B', just beyond the scope of 'A'
introduced by 'A::'.  */
 
- while (scope->kind == sk_class
-&& !TYPE_BEING_DEFINED (scope->this_entity))
+ while (scope->kind == sk_class)
{
+ /* In valid cases where the class being defined is reached, we're
+at the point where the template argument list should be
+injected for a generic member function.  In the erroneous case
+of generic member function of a nested class being declared in
+the enclosing class, an additional class scope for the
+enclosing class has been pushed by push_nested_class via
+push_scope in cp_parser_direct_declarator.  This additional
+scope needs to be skipped to reach the class definition scope
+where the template argument list should be injected.  */
+
+ if (TYPE_BEING_DEFINED (scope->this_entity))
+   if (scope->level_chain == 0
+   || scope->this_entity != scope->level_chain->this_entity)
+ break;
+
  parent_scope = scope;
  scope = scope->level_chain;
}
diff --git a/gcc/testsuite/g++.dg/cpp1y/pr60573.C 
b/gcc/testsuite/g++.dg/cpp1y/pr60573.C
new file mode 100644
index 000..7f56ff4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/pr60573.C
@@ -0,0 +1,13 @@
+// PR c++/60573
+// { dg-do compile { target c++1y } }
+// { dg-options "" }
+
+struct A
+{
+  struct B
+  {
+void foo(auto);
+  };
+
+  void B::foo(auto) {}  // { dg-error "cannot define" }
+};
-- 
1.9.0



PATCH COMMITTED: Don't use int64_t in test

2014-03-18 Thread Ian Lance Taylor
Among other issues, PR 60563 complains that g++.dg/ext/sync-4.c fails on
Darwin because both the system header files and the test define the
typedef int64_t.  This patch should avoid that problem by renaming the
typedef.  Tested on x86_64-unknown-linux-gnu.  Committed to mainline.
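
A minimal sketch of the renaming approach, for illustration only: it assumes a GCC-compatible compiler (the `mode` attribute and the `__sync` builtins are extensions), and `fetch_and_add_one` is a hypothetical helper that does not appear in the test.

```cpp
#include <cstdint>  // may declare ::int64_t, just as the system headers do

// A test-private name cannot clash with the int64_t the headers provide.
// DI is GCC's 64-bit integer machine mode.
typedef int ditype __attribute__ ((mode (DI)));

static_assert(sizeof(ditype) == 8, "DI mode is expected to be 64 bits wide");

// Exercise one of the builtins from the test through the renamed type.
inline ditype fetch_and_add_one(ditype *p) {
  return __sync_fetch_and_add(p, 1);
}
```

The builtin returns the value held before the addition, so the only thing that changes for the test is the spelling of the type.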

Ian


2014-03-18  Ian Lance Taylor  

PR target/60563
* g++.dg/ext/sync-4.C (int32_t): Remove typedef.
(ditype): Rename typedef from int64_t.


Index: g++.dg/ext/sync-4.C
===
--- g++.dg/ext/sync-4.C	(revision 208672)
+++ g++.dg/ext/sync-4.C	(working copy)
@@ -8,8 +8,7 @@
 #include 
 #include 
 
-typedef int int32_t __attribute__ ((mode (SI)));
-typedef int int64_t __attribute__ ((mode (DI)));
+typedef int ditype __attribute__ ((mode (DI)));
 
 #define FN(IDX, RET, CALL)	\
 static RET f ## IDX (void *p) __attribute__ ((noinline));	\
@@ -32,54 +31,54 @@ t ## IDX ()			\
   abort();			\
 }
 
-FN(1, int64_t, (__sync_fetch_and_add((int64_t*)p, 1)))
-FN(2, int64_t, (__sync_fetch_and_sub((int64_t*)p, 1)))
-FN(3, int64_t, (__sync_fetch_and_or((int64_t*)p, 1)))
-FN(4, int64_t, (__sync_fetch_and_and((int64_t*)p, 1)))
-FN(5, int64_t, (__sync_fetch_and_xor((int64_t*)p, 1)))
-FN(6, int64_t, (__sync_fetch_and_nand((int64_t*)p, 1)))
-
-FN( 7, int64_t, (__sync_add_and_fetch((int64_t*)p, 1)))
-FN( 8, int64_t, (__sync_sub_and_fetch((int64_t*)p, 1)))
-FN( 9, int64_t, (__sync_or_and_fetch((int64_t*)p, 1)))
-FN(10, int64_t, (__sync_and_and_fetch((int64_t*)p, 1)))
-FN(11, int64_t, (__sync_xor_and_fetch((int64_t*)p, 1)))
-FN(12, int64_t, (__sync_nand_and_fetch((int64_t*)p, 1)))
-
-FN(13, bool, (__sync_bool_compare_and_swap((int64_t*)p, 1, 2)))
-FN(14, int64_t, (__sync_val_compare_and_swap((int64_t*)p, 1, 2)))
-
-FN(15, int64_t, (__sync_lock_test_and_set((int64_t*)p, 1)))
-FN(16, void, (__sync_lock_release((int64_t*)p)))
-
-FN(17, bool, (__atomic_test_and_set((int64_t*)p, __ATOMIC_SEQ_CST)))
-FN(18, void, (__atomic_clear((int64_t*)p, __ATOMIC_SEQ_CST)))
-
-FN(19, void, (__atomic_exchange((int64_t*)p, (int64_t*)0, (int64_t*)0, __ATOMIC_SEQ_CST)))
-FN(20, int64_t, (__atomic_exchange_n((int64_t*)p, 1, 2)))
-
-FN(21, void, (__atomic_load((int64_t*)p, (int64_t*)0, __ATOMIC_SEQ_CST)))
-FN(22, int64_t, (__atomic_load_n((int64_t*)p, __ATOMIC_SEQ_CST)))
-
-FN(23, bool, (__atomic_compare_exchange((int64_t*)p, (int64_t*)0, (int64_t*)0, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)))
-FN(24, bool, (__atomic_compare_exchange_n((int64_t*)p, (int64_t*)0, 1, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)))
-
-FN(25, void, (__atomic_store((int64_t*)p, (int64_t*)0, __ATOMIC_SEQ_CST)))
-FN(26, void, (__atomic_store_n((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-
-FN(27, int64_t, (__atomic_add_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(28, int64_t, (__atomic_sub_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(29, int64_t, (__atomic_and_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(30, int64_t, (__atomic_nand_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(31, int64_t, (__atomic_xor_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(32, int64_t, (__atomic_or_fetch((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-
-FN(33, int64_t, (__atomic_fetch_add((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(34, int64_t, (__atomic_fetch_sub((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(35, int64_t, (__atomic_fetch_and((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(36, int64_t, (__atomic_fetch_nand((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(37, int64_t, (__atomic_fetch_xor((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
-FN(38, int64_t, (__atomic_fetch_or((int64_t*)p, 1, __ATOMIC_SEQ_CST)))
+FN(1, ditype, (__sync_fetch_and_add((ditype*)p, 1)))
+FN(2, ditype, (__sync_fetch_and_sub((ditype*)p, 1)))
+FN(3, ditype, (__sync_fetch_and_or((ditype*)p, 1)))
+FN(4, ditype, (__sync_fetch_and_and((ditype*)p, 1)))
+FN(5, ditype, (__sync_fetch_and_xor((ditype*)p, 1)))
+FN(6, ditype, (__sync_fetch_and_nand((ditype*)p, 1)))
+
+FN( 7, ditype, (__sync_add_and_fetch((ditype*)p, 1)))
+FN( 8, ditype, (__sync_sub_and_fetch((ditype*)p, 1)))
+FN( 9, ditype, (__sync_or_and_fetch((ditype*)p, 1)))
+FN(10, ditype, (__sync_and_and_fetch((ditype*)p, 1)))
+FN(11, ditype, (__sync_xor_and_fetch((ditype*)p, 1)))
+FN(12, ditype, (__sync_nand_and_fetch((ditype*)p, 1)))
+
+FN(13, bool, (__sync_bool_compare_and_swap((ditype*)p, 1, 2)))
+FN(14, ditype, (__sync_val_compare_and_swap((ditype*)p, 1, 2)))
+
+FN(15, ditype, (__sync_lock_test_and_set((ditype*)p, 1)))
+FN(16, void, (__sync_lock_release((ditype*)p)))
+
+FN(17, bool, (__atomic_test_and_set((ditype*)p, __ATOMIC_SEQ_CST)))
+FN(18, void, (__atomic_clear((ditype*)p, __ATOMIC_SEQ_CST)))
+
+FN(19, void, (__atomic_exchange((ditype*)p, (ditype*)0, (ditype*)0, __ATOMIC_SEQ_CST)))
+FN(20, ditype, (__atomic_exchange_n((ditype*)p, 1, 2)))
+
+FN(21, void, (__atomic_load((ditype*)p, (ditype*)0, __ATOMIC_SEQ_CST)))
+FN(22, ditype, (__atomic_load_n((ditype*)p, __ATOMIC_SEQ_CST)))
+
+FN(23, bool, (__atomic_compare_exchange((ditype*)p, (ditype*)0, (ditype

[PATCH] Fix PR54733 Optimize endian independent load/store

2014-03-18 Thread Thomas Preud'homme
Hi everybody,

*** Motivation ***

Currently gcc is capable of replacing hand-crafted implementations of byteswap
by a suitable instruction thanks to the bswap optimization pass. The patch
proposed here aims at extending this pass to also optimize loads in a specific
endianness, independent of the host endianness.

*** Methodology ***

The patch adds support for dealing with a memory source (array or structure)
and detects whether the result of a bitwise operation happens to be equivalent
to a big endian or little endian load, replacing it by a load or a load and a
byteswap according to the host endianness. The original code used the concept
of a symbolic number: a number where the value of each byte indicates its
position (in terms of weight) before the bitwise manipulation. After performing
the bit manipulation on that symbolic number, the result tells how the bytes
were shuffled (see variable cmp in function find_bswap). Detecting an operation
resulting in a number in the host endianness is thus pretty straightforward:
check whether the symbolic number has *not* changed.
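
To make the symbolic number idea concrete, here is a minimal sketch in plain C.
It is not the code from tree-ssa-math-opts.c: the pass tracks the symbolic
number while walking the statements, whereas this sketch simply runs the
expression on it. All names (SYM32, SYM32_REV, classify, swap32, nop32) are
made up for illustration.

```c
#include <stdint.h>

/* Symbolic 32-bit number: byte i (0 = least significant) holds marker i + 1. */
#define SYM32     UINT32_C(0x04030201)
/* What the markers look like after a full byte swap. */
#define SYM32_REV UINT32_C(0x01020304)

enum shuffle { SHUFFLE_NOP, SHUFFLE_BSWAP, SHUFFLE_OTHER };

/* A hand-written byte swap, the kind of expression the pass looks for. */
static uint32_t swap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* A bitwise expression that leaves every byte where it was. */
static uint32_t nop32 (uint32_t x)
{
  return (x & 0x0000ffffu) | (x & 0xffff0000u);
}

/* Run OP on the symbolic number and see where each marker ended up. */
static enum shuffle classify (uint32_t (*op) (uint32_t))
{
  uint32_t r = op (SYM32);
  if (r == SYM32)
    return SHUFFLE_NOP;      /* markers unchanged: host endianness */
  if (r == SYM32_REV)
    return SHUFFLE_BSWAP;    /* markers fully reversed: byte swap */
  return SHUFFLE_OTHER;
}
```

With these definitions, classify (swap32) gives SHUFFLE_BSWAP and
classify (nop32) gives SHUFFLE_NOP.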

As to supporting reads from arrays and structures, there is some logic to
recognize the base of the array/structure and the offset of the entries/fields
accessed, to check whether the range of memory accessed would fit in an
integer. Each entry is initially treated independently and when they are ORed
together the values in the symbolic number are updated according to the host
endianness: the entry at the higher address would see its values incremented on
a little endian machine.
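
For illustration, these are the kind of source-level reads from memory that the
paragraph above is about. This is only a sketch of typical input; the function
names load_le32 and load_be32 are hypothetical and do not come from the patch
or its testcases.

```c
#include <stdint.h>

/* Little endian read: the byte at the lowest address is the least
   significant one.  */
static uint32_t load_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Big endian read: the byte at the lowest address is the most
   significant one.  */
static uint32_t load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}
```

On a little endian host, load_le32 is a candidate for a plain 32-bit load and
load_be32 for a load followed by a byteswap; on a big endian host the roles are
reversed.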

Note that, as it stands, the patch does not work for arrays indexed with a
variable (such as tab[a] | (tab[a+1] << 8)) because fold_const does not fold
(a + 1) - a. If such cases were folded, the number of cases detected would
automatically increase, since fold_build2 is used to compare two offsets.
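
As a small sketch of the variable-index form described in the note, written as
a complete function (the name load_le16_at is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit little endian read at a variable index.  The two accesses are at
   offsets a and a + 1; showing that they are adjacent requires the
   difference (a + 1) - a to be folded to 1.  */
static uint16_t load_le16_at (const unsigned char *tab, size_t a)
{
  return (uint16_t) (tab[a] | (tab[a + 1] << 8));
}
```

The function is still correct C and returns the expected value; the point of
the note is only that the pass does not yet recognize it as a single load.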

This patch also adds a few testcases to check both (i) that the optimization
works as expected and (ii) that the results are correct. It also defines new
effective targets (bswap16, bswap32 and bswap64) to centralize the information
about which targets support byte swap instructions for the testsuite, and
modifies existing tests to use these new effective targets.

The patch is quite big but could be split if necessary. A big part of the code
added is for handling memory sources and would be difficult to split, but the
variable renaming and the introduction of the bswapXX effective targets could
be done separately to reduce the noise. The patch is too big to include inline,
so it is only attached to this email.

The ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2014-03-19  Thomas Preud'homme  

PR tree-optimization/54733
* tree-ssa-math-opts.c (find_bswap_1): Renamed to ...
(find_bswap_or_nop_1): This. Also add support for memory source.
(find_bswap): Renamed to ...
(find_bswap_or_nop): This. Also add support for memory source and
detection of noop bitwise operations.
(execute_optimize_bswap): Likewise.

*** gcc/testsuite/ChangeLog ***

2014-03-19  Thomas Preud'homme  

PR tree-optimization/54733
* lib/target-supports.exp: New effective targets for architectures
capable of performing byte swap.
* gcc.dg/optimize-bswapdi-1.c: Convert to new bswap target.
* gcc.dg/optimize-bswapdi-2.c: Likewise.
* gcc.dg/optimize-bswapsi-1.c: Likewise.
* gcc.dg/optimize-bswapdi-3.c: New test to check extension of bswap
optimization to support memory sources.
* gcc.dg/optimize-bswaphi-1.c: Likewise.
* gcc.dg/optimize-bswapsi-2.c: Likewise.
* gcc.c-torture/execute/bswap-2.c: Likewise.

Is this ok for stage 1?

Best regards,

Thomas

gcc32rm-84.3.diff
Description: Binary data