date:20141007

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Daniel Hellstrom


On 10/06/2014 06:57 PM, Eric Botcazou wrote:

I posted it with v2 in the subject. I have now attached it.

There is a pasto in the v2:

@@ -1764,6 +1772,12 @@ extern int sparc_indent_opcode;
  #define AS_LEON_FLAG "-Av8"
  #endif
  
+#ifdef HAVE_AS_LEONV7
+#define AS_LEONV7_FLAG "-Aleon"
+#else
+#define AS_LEONV7_FLAG "-Av7"
+#endif
+
  /* We use gcc _mcount for profiling.  */
  #define NO_PROFILE_COUNTERS 0
  


I think you would be better off adding a line to the arms of the existing code

#ifdef HAVE_AS_LEON
#define AS_LEON_FLAG "-Aleon"
#else
#define AS_LEON_FLAG "-Av8"
#endif

rather than duplicating it.


You're right. I have attached an updated patch. The new code becomes:

 #ifdef HAVE_AS_LEON
 #define AS_LEON_FLAG "-Aleon"
+#define AS_LEONV7_FLAG "-Aleon"
 #else
 #define AS_LEON_FLAG "-Av8"
+#define AS_LEONV7_FLAG "-Av7"
 #endif

Thanks!
From bfecb09e9402c9bd55373a7eb08ce6e2b244729e Mon Sep 17 00:00:00 2001
From: Daniel Hellstrom dan...@gaisler.com
Date: Wed, 20 Aug 2014 10:53:22 +0100
Subject: [PATCH v3] SPARC: add mcpu=leon3v7 target

The LEON3/4 soft-core CPU supports both SPARCv7 and SPARCv8; the choice
is made at design time. The majority of LEON3 ASICs are v8 compatible;
however, when designing a LEON3 that is as small as possible, v7
without an FPU is frequently used.

The current GCC leon3 support implies the SPARCv8 instruction set,
which is not compatible with SPARCv7. Relying on the standard SPARCv7
(-mcpu=v7) target for a LEON3-V7 is not feasible, since the atomic
instruction (CAS) cannot be generated by standard v7.

 * config.gcc (sparc*-*-*): Accept mcpu=leon3v7 processor
 * doc/invoke.texi (SPARC options): add mcpu=leon3v7 comment
 * config/sparc/leon.md (leon3_load, leon_store, leon_fp_*):
   handle leon3v7 as leon3
 * config/sparc/sparc-opts.h (enum processor_type): Add PROCESSOR_LEON3V7
 * config/sparc/sparc.c (sparc_option_override): add leon3v7 support
 * config/sparc/sparc.h (TARGET_CPU_leon3v7): new define
 * config/sparc/sparc.md (cpu): add leon3v7
 * config/sparc/sparc.opt (enum processor_type): Add leon3v7
---
 gcc/config.gcc|5 -
 gcc/config/sparc/leon.md  |   14 +++---
 gcc/config/sparc/sparc-opts.h |1 +
 gcc/config/sparc/sparc.c  |3 +++
 gcc/config/sparc/sparc.h  |   40 +---
 gcc/config/sparc/sparc.md |1 +
 gcc/config/sparc/sparc.opt|3 +++
 gcc/doc/invoke.texi   |   16 
 8 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 605efc0..199e387 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3065,6 +3065,9 @@ if test x$with_cpu = x ; then
 	*-leon[3-9]*)
 	  with_cpu=leon3
 	  ;;
+	*-leon[3-9]v7*)
+	  with_cpu=leon3v7
+	  ;;
 	*)
 	  with_cpu=`echo ${target} | sed 's/-.*$//'`
 	  ;;
@@ -3749,7 +3752,7 @@ case ${target} in
 			case ${val} in
 			"" | sparc | sparcv9 | sparc64 \
 			| v7 | cypress \
-			| v8 | supersparc | hypersparc | leon | leon3 \
+			| v8 | supersparc | hypersparc | leon | leon3 | leon3v7 \
 			| sparclite | f930 | f934 | sparclite86x \
 			| sparclet | tsc701 \
 			| v9 | ultrasparc | ultrasparc3 | niagara | niagara2 \
diff --git a/gcc/config/sparc/leon.md b/gcc/config/sparc/leon.md
index b511397..e8050fa 100644
--- a/gcc/config/sparc/leon.md
+++ b/gcc/config/sparc/leon.md
@@ -29,11 +29,11 @@
 
 ;; Use a double reservation to work around the load pipeline hazard on UT699.
 (define_insn_reservation "leon3_load" 1
-  (and (eq_attr "cpu" "leon3") (eq_attr "type" "load,sload"))
+  (and (eq_attr "cpu" "leon3,leon3v7") (eq_attr "type" "load,sload"))
   "leon_memory*2")
 
 (define_insn_reservation "leon_store" 2
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "store"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "store"))
   "leon_memory*2")
 
 ;; This describes Gaisler Research's FPU
@@ -44,21 +44,21 @@
 (define_cpu_unit "grfpu_ds" "grfpu")
 
 (define_insn_reservation "leon_fp_alu" 4
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "fp,fpcmp,fpmul"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fp,fpcmp,fpmul"))
   "grfpu_alu, nothing*3")
 
 (define_insn_reservation "leon_fp_divs" 16
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "fpdivs"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpdivs"))
   "grfpu_ds*14, nothing*2")
 
 (define_insn_reservation "leon_fp_divd" 17
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "fpdivd"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpdivd"))
   "grfpu_ds*15, nothing*2")
 
 (define_insn_reservation "leon_fp_sqrts" 24
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "fpsqrts"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpsqrts"))
   "grfpu_ds*22, nothing*2")
 
 (define_insn_reservation "leon_fp_sqrtd" 25
-  (and (eq_attr "cpu" "leon,leon3") (eq_attr "type" "fpsqrtd"))
+  (and (eq_attr "cpu" "leon,leon3,leon3v7") (eq_attr "type" "fpsqrtd"))
   "grfpu_ds*23, nothing*2")
diff --git a/gcc/config/sparc/sparc-opts.h b/gcc/config/sparc/sparc-opts.h
index b5e9761..c35bee4 100644
---

Re: Fix libgomp crash without TLS (PR42616)

2014-10-07 Thread Jakub Jelinek

On Wed, Oct 01, 2014 at 08:44:59PM +0400, Varvara Rainchik wrote:
 Ok, then here it is a new patch (tested and bootstrapped on linux).
 
 On linux with --disable-tls now all libgomp make check tests pass; for
 Android I've patched toolchain and tried test from one of the
 mentioned bugs, test passes too.

 Is there some benchmark to check performance?

There is SPEC OMP,
http://www.spec.org/hpg/omp2001/
EPCC,
http://www2.epcc.ed.ac.uk/computing/research_activities/openmpbench/openmp_index.html
NAS,
http://www.nas.nasa.gov/publications/npb.html
http://phase.hpcc.jp/Omni/benchmarks/NPB/
Rodinia,
https://www.cs.virginia.edu/~skadron/wiki/rodinia/index.php/Main_Page

Now, I wonder on which OS and why does config/tls.m4 CHECK_GCC_TLS
actually fail?  Can you figure that out?

If we get rid of HAVE_TLS code altogether, we might lose support of
some very old OSes, e.g. some Linux distros with a recent gcc and binutils
(so that emutls isn't used), but very old glibc (that doesn't support
TLS or supports it incorrectly, think of pre-2002 glibc).  So, if we get
rid of !HAVE_TLS code in libgomp, it would be nice if config/tls.m4 detected
it properly and we'd just fail at configure time.
And if we don't, just make sure that on Android, Darwin and/or M$Win (or
whatever other OS you had in mind which does support pthreads, but doesn't
support native TLS) find out why HAVE_AS_TLS is not defined (guess
config.log should explain that).

 2014-10-01  Varvara Rainchik  varvara.rainc...@intel.com
 
 * libgomp.h (HAVE_TLS): Set to 1.

Jakub

Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Dodji Seketeli

Jason Merrill ja...@redhat.com writes:

 On 10/06/2014 08:50 PM, Siva Chandra wrote:
 On Sat, Oct 4, 2014 at 11:14 AM, Jason Merrill ja...@redhat.com wrote:
 On 10/03/2014 05:41 PM, Siva Chandra wrote:

 I understand that knowing whether a copy-ctor or a d-tor has been
 explicitly defaulted is not sufficient to determine the parameter
 passing ABI. However, why is it not necessary? I could be wrong, but
 doesn't the example I have given show that it is necessary?

 An implicitly declared 'tor can also be trivial.

 But, the question is whether it is required to determine the parameter
 passing ABI. If there is no special marker to indicate that the user
 declared 'tor is explicitly defaulted, then GDB could (in the absence
 of other properties which make the 'tor non-trivial) incorrectly
 conclude that the 'tor is user defined, and hence not-trivial.

 I've been thinking that we should just mark the 'tor as trivial or not
 directly rather than hint at it.

FWIW, this would be my inclination too.  I think it would make the job
of the debug info consumer a lot easier.
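
As an illustration of the distinction discussed above (a sketch only, not
part of the patch): a copy constructor or destructor can be trivial whether
it is implicitly declared or explicitly defaulted on its first declaration,
and it stops being trivial once the user provides a body. The three type
names are made up for this example; the standard type traits report what
the compiler considers trivial.

```cpp
#include <type_traits>

// Implicitly declared copy constructor and destructor: both trivial.
struct Implicit { int x; };

// Explicitly defaulted on first declaration: user-declared, yet trivial.
struct Defaulted {
  int x;
  Defaulted () = default;
  Defaulted (const Defaulted &) = default;
  ~Defaulted () = default;
};

// User-provided bodies: non-trivial, even though they do nothing special.
struct UserProvided {
  int x;
  UserProvided (const UserProvided &o) : x (o.x) {}
  ~UserProvided () {}
};

static_assert (std::is_trivially_copy_constructible<Implicit>::value,
               "implicit copy constructor is trivial");
static_assert (std::is_trivially_copy_constructible<Defaulted>::value,
               "defaulted copy constructor is trivial");
static_assert (!std::is_trivially_copy_constructible<UserProvided>::value,
               "user-provided copy constructor is not trivial");
static_assert (!std::is_trivially_destructible<UserProvided>::value,
               "user-provided destructor is not trivial");
```

A consumer that only sees "user-declared" cannot tell Defaulted from
UserProvided, which is the case for recording triviality directly.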

Thanks.

-- 
Dodji

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Eric Botcazou

 You're right. I have attached an updated patch. The new code becomes:
 
   #ifdef HAVE_AS_LEON
   #define AS_LEON_FLAG "-Aleon"
 +#define AS_LEONV7_FLAG "-Aleon"
   #else
   #define AS_LEON_FLAG "-Av8"
 +#define AS_LEONV7_FLAG "-Av7"
   #endif

The patch is OK for all active branches (trunk, 4.9 and 4.8), modulo nits in 
the ChangeLog entry: capital letter at the beginning and period at the end of 
every sentence.

* config.gcc (sparc*-*-*): Accept mcpu=leon3v7 processor.
* doc/invoke.texi (SPARC options): Add mcpu=leon3v7 comment.
* config/sparc/leon.md (leon3_load, leon_store, leon_fp_*): Handle
leon3v7 as leon3.
* config/sparc/sparc-opts.h (enum processor_type): Add LEON3V7.
* config/sparc/sparc.c (sparc_option_override): Add leon3v7 support.
* config/sparc/sparc.h (TARGET_CPU_leon3v7): New define.
* config/sparc/sparc.md (cpu): Add leon3v7.
* config/sparc/sparc.opt (enum processor_type): Add leon3v7.

I assume that you have applied for write access so you'll be able to install 
it yourself.  Otherwise let me know if I can lend a hand.

-- 
Eric Botcazou

[PATCH][match-and-simplify] Change (match ...) syntax

2014-10-07 Thread Richard Biener


After internal discussion this changes

(match logical_inverted_value
 (ne truth_valued_p@0 integer_onep)
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
  (logical_inverted_value @0)))

to

(match (logical_inverted_value @0)
 (ne truth_valued_p@0 integer_onep)
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))))

thus avoids repeating 'logical_inverted_value' and puts whether
this is an expression or predicate matcher and its operands first.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-10-07  Richard Biener  rguent...@suse.de

* genmatch.c (parser::parse_pattern): Change match parsing
to expect the matching template first, not as result.
(parser::parse_simplify): Likewise.
* match-bitwise.pd: Adjust.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215917)
+++ gcc/genmatch.c  (working copy)
@@ -2163,7 +2163,8 @@ private:
   operand *parse_op ();
 
   void parse_pattern ();
-  void parse_simplify (source_location, vec<simplify *>&, predicate_id *);
+  void parse_simplify (source_location, vec<simplify *>&, predicate_id *,
+  expr *);
   void parse_for (source_location);
   void parse_if (source_location);
   void parse_predicates (source_location);
@@ -2528,7 +2529,8 @@ parser::parse_op ()
 
 void
 parser::parse_simplify (source_location match_location,
-   vec<simplify *>& simplifiers, predicate_id *matcher)
+   vec<simplify *>& simplifiers, predicate_id *matcher,
+   expr *result)
 {
   /* Reset the capture map.  */
   capture_ids = new std::map<std::string, unsigned>;
@@ -2549,12 +2551,8 @@ parser::parse_simplify (source_location
 {
   if (!matcher)
     fatal_at (token, "expected transform expression");
-  else if (matcher->nargs > 0)
-   fatal_at (token, "expected match operand expression");
-  if (matcher->nargs == -1)
-   matcher->nargs = 0;
   simplifiers.safe_push
-   (new simplify (match, match_location, NULL, token->src_loc,
+   (new simplify (match, match_location, result, token->src_loc,
   active_ifs.copy (), active_fors.copy (),
   capture_ids));
   return;
@@ -2579,12 +2577,8 @@ parser::parse_simplify (source_location
 {
   if (!matcher)
     fatal_at (token, "manual transform not implemented");
- else if (matcher->nargs > 0)
-   fatal_at (token, "expected match operand expression");
- if (matcher->nargs == -1)
-   matcher->nargs = 0;
   simplifiers.safe_push
- (new simplify (match, match_location, NULL,
+ (new simplify (match, match_location, result,
  paren_loc, active_ifs.copy (),
  active_fors.copy (), capture_ids));
 }
@@ -2599,19 +2593,9 @@ parser::parse_simplify (source_location
 }
   else
 {
- operand *op = parse_expr ();
- if (matcher)
-   {
- expr *e = dyn_cast <expr *> (op);
- if (!e)
-   fatal_at (token, "match operand expression cannot "
- "be captured");
- if (matcher->nargs == -1)
-   matcher->nargs = e->ops.length ();
- if (matcher->nargs == 0
- || (unsigned) matcher->nargs != e->ops.length ())
-   fatal_at (token, "match arity doesn't match");
-   }
+ operand *op = result;
+ if (!matcher)
+   op = parse_expr ();
   simplifiers.safe_push
   (new simplify (match, match_location, op,
  token->src_loc, active_ifs.copy (),
@@ -2644,7 +2628,8 @@ parser::parse_simplify (source_location
   if (matcher)
     fatal_at (token, "expected match operand expression");
   simplifiers.safe_push
- (new simplify (match, match_location, parse_op (),
+ (new simplify (match, match_location,
+matcher ? result : parse_op (),
  token->src_loc, active_ifs.copy (),
  active_fors.copy (), capture_ids));
   /* A default result closes the enclosing scope.  */
@@ -2811,9 +2796,15 @@ parser::parse_pattern ()
   const cpp_token *token = peek ();
   const char *id = get_ident ();
   if (strcmp (id, "simplify") == 0)
-parse_simplify (token->src_loc, simplifiers, NULL);
+parse_simplify (token->src_loc, simplifiers, NULL, NULL);
   else if (strcmp (id, "match") == 0)
 {
+  bool with_args = false;
+  if (peek ()->type == CPP_OPEN_PAREN)
+   {
+ eat_token (CPP_OPEN_PAREN);
+ with_args = true;
+   }
   const char *name = get_ident ();

Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Eric Botcazou

 I wonder if this is worth abstracting into a callee_fn () cgraph edge
 method?

That would rather be a cgraph node method without callee in the name since 
we also apply it to callers, something like:

struct function *cgraph_node::cfun (void)

and the code in can_inline_edge_p would just be:

  struct function *caller_cfun = e->caller->cfun ();
  struct function *callee_cfun = callee ? callee->cfun () : NULL;

-- 
Eric Botcazou

Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114

2014-10-07 Thread Richard Biener

On Mon, Oct 6, 2014 at 7:30 PM, Alan Lawrence alan.lawre...@arm.com wrote:
 Ok, so unless there are objections, I plan to commit patches 1, 2, 4, 5, and
 6,
 which have been previously approved, in that sequence. (Of those, all bar
 patch
 2 are AArch64 only.) I think this is better than maintaining an
 ever-expanding
 patch series.

Agreed.

 Then I'll get to work on migrating all backends to the new _scal_ optab (and
 removing the vector optab). Certainly I'd like to replace vec_shr/l with
 vec_perm_expr too, but I'm conscious that the end of stage 1 is approaching!

I suppose we all are.  It will last until end of October at least
(stage1 of gcc 4.9
ended Nov 22nd, certainly a bit late).

I do expect we will continue merging already developed / posted stuff through
stage3 (as usual).

That said, it would be really nice to get rid of VEC_RSHIFT_EXPR.

Thanks,
Richard.

 --Alan




 Richard Biener wrote:

 On Thu, Sep 18, 2014 at 1:41 PM, Alan Lawrence alan.lawre...@arm.com
 wrote:

 The end goal here is to remove this code from tree-vect-loop.c
 (vect_create_epilog_for_reduction):

   if (BYTES_BIG_ENDIAN)
 bitpos = size_binop (MULT_EXPR,
  bitsize_int (TYPE_VECTOR_SUBPARTS (vectype)
 -
 1),
  TYPE_SIZE (scalar_type));
   else

 as this is the root cause of PR/61114 (see testcase there, failing on all
 bigendian targets supporting reduc_[us]plus_optab). Quoting Richard
 Biener,
 all code conditional on BYTES/WORDS_BIG_ENDIAN in tree-vect* is
 suspicious. The code snippet above is used on two paths:

 (Path 1) (patches 1-6) Reductions using REDUC_(PLUS|MIN|MAX)_EXPR =
 reduc_[us](plus|min|max)_optab.
 The optab is documented as the scalar result is stored in the least
 significant bits of operand 0, but the tree code as the first element
 in
 the vector holding the result of the reduction of all elements of the
 operand. This mismatch means that when the tree code is folded, the code
 snippet above reads the result from the wrong end of the vector.
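
 To make the arithmetic in the quoted snippet concrete, here is a small
 sketch. The function name and plain-integer interface are hypothetical,
 and the value 0 for the else arm is an assumption, since the quote is
 cut off at that point.

```cpp
// Bit offset at which the folded code extracts the scalar result from
// the vector.  The big-endian arm mirrors the quoted snippet:
//   (TYPE_VECTOR_SUBPARTS (vectype) - 1) * TYPE_SIZE (scalar_type).
// Returning 0 otherwise is an assumption; the quote stops at "else".
unsigned
reduction_bitpos (bool bytes_big_endian, unsigned nunits, unsigned elem_bits)
{
  if (bytes_big_endian)
    return (nunits - 1) * elem_bits;
  return 0;
}
```

 For a vector of four 32-bit elements this gives bit 96 on a big-endian
 target, i.e. the last element rather than the first one named by the
 tree code documentation.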

 The strategy (as per
 https://gcc.gnu.org/ml/gcc-patches/2014-08/msg00041.html) is to define
 new
 tree codes and optabs that produce scalar results directly; this seems
 better than tying (the element of the vector into which the result is
 placed) to (the endianness of the target), and avoids generating extra
 moves
 on current bigendian targets. However, the previous optabs are retained
 for
 now as a migration strategy so as not to break existing backends; moving
 individual platforms over will follow.

 A complication here is on AArch64, where we directly generate
 REDUC_PLUS_EXPRs from intrinsics in gimple_fold_builtin; I temporarily
 remove this folding in order to decouple the midend and AArch64 backend.


 Sounds fine.  I hope we can transition all backends for 5.0 and remove
 the vector variant optabs (maybe renaming the scalar ones).

 (Path 2) (patches 7-13) Reductions using whole-vector-shifts, i.e.
 VEC_RSHIFT_EXPR and vec_shr_optab. Here the tree code as well as the
 optab
 is defined in an endianness-dependent way, leading to significant
 complication in fold-const.c. (Moreover, the equivalent vec_shl_optab
 is
 never used!). Few platforms appear to handle vec_shr_optab (and fewer
 bigendian - I see only PowerPC and MIPS), so it seems pertinent to change
 the existing optab to be endianness-neutral.

 Patch 10 defines vec_shr for AArch64, for the old specification; patch 13
 updates that implementation to fit the new endianness-neutral
 specification,
 serving as a guide for other existing backends. Patches/RFCs 15 and 16
 are
 equivalents for MIPS and PowerPC; I haven't tested these but hope they
 act
 as useful pointers for the port maintainers.

 Finally patch 14 cleans up the affected part of tree-vect-loop.c
 (vect_create_epilog_for_reduction).


 As said during the individual patches review I'd like the vectorizer to
 use a VEC_PERM_EXPR instead of VEC_RSHIFT_EXPR (with
 only whole-element amounts).  This means we can remove
 VEC_RSHIFT_EXPR.  It also means that if the backend defines
 vec_perm_const (which it really should) it can handle the special
 permutes that boil down to a possibly more efficient vector shift
 there (a good optimization anyway).  Until it does that all backends
 would at least create correct code (with the endian dependent
 vec_shr removed).
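
 The idea above can be sketched outside of GCC as follows. This is a toy
 model, not GCC code: vec_perm and shift_elements_down are hypothetical
 helpers, and the selector convention (indices 0..n-1 pick from the first
 input, n..2n-1 from the second) is assumed to match VEC_PERM_EXPR.

```cpp
#include <cstddef>
#include <vector>

// Two-input permute: selector values 0..n-1 pick from a,
// n..2n-1 pick from b.
std::vector<int>
vec_perm (const std::vector<int> &a, const std::vector<int> &b,
          const std::vector<std::size_t> &sel)
{
  const std::size_t n = a.size ();
  std::vector<int> out (n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n];
  return out;
}

// Whole-element shift by k lanes (k <= n) toward index 0, filling with
// zeros, expressed purely as a permute of (v, zero).
std::vector<int>
shift_elements_down (const std::vector<int> &v, std::size_t k)
{
  const std::size_t n = v.size ();
  std::vector<int> zero (n, 0);
  std::vector<std::size_t> sel (n);
  for (std::size_t i = 0; i < n; ++i)
    sel[i] = i + k;   // indices >= n select from the zero vector
  return vec_perm (v, zero, sel);
}
```

 Because the selector is written in element indices, the description of
 the shift does not mention byte order at all; a backend is free to
 recognise such selectors and emit a native vector shift.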

 Richard.

 --Alan





Re: [PATCH 0/14+2][Vectorizer] Made reductions endianness-neutral, fixes PR/61114

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 9:45 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Mon, Oct 6, 2014 at 7:30 PM, Alan Lawrence alan.lawre...@arm.com wrote:
 Ok, so unless there are objections, I plan to commit patches 1, 2, 4, 5, and
 6,
 which have been previously approved, in that sequence. (Of those, all bar
 patch
 2 are AArch64 only.) I think this is better than maintaining an
 ever-expanding
 patch series.

 Agreed.

 Then I'll get to work on migrating all backends to the new _scal_ optab (and
 removing the vector optab). Certainly I'd like to replace vec_shr/l with
 vec_perm_expr too, but I'm conscious that the end of stage 1 is approaching!

 I suppose we all are.  It will last until end of October at least
 (stage1 of gcc 4.9
 ended Nov 22nd, certainly a bit late).

 I do expect we will continue merging already developed / posted stuff through
 stage3 (as usual).

 That said, it would be really nice to get rid of VEC_RSHIFT_EXPR.

And you can fix performance regressions you introduce (badly handled
VEC_PERM) until the GCC 5 release happens (and even after that).
Heh.  Easy way out ;)

Richard.


Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 9:43 AM, Eric Botcazou ebotca...@adacore.com wrote:
 I wonder if this is worth abstracting into a callee_fn () cgraph edge
 method?

 That would rather be a cgraph node method without callee in the name since
 we also apply it to callers, something like:

 struct function *cgraph_node::cfun (void)

 and the code in can_inline_edge_p would just be:

   struct function *caller_cfun = e->caller->cfun ();
   struct function *callee_cfun = callee ? callee->cfun () : NULL;

Ah, ok.  Yes agreed - but without the 'c' (nothing is current here IMHO).
Maybe ->get_fun () to be consistent with other method names.

I'll pre-approve a patch to do that.
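
A minimal sketch of what such a node method could look like, using
stand-in types rather than GCC's real ones. Everything here is
hypothetical: the struct layouts are invented for the example, and
walking the clone_of chain is only an assumption about what "working
harder" to find DECL_STRUCT_FUNCTION might involve.

```cpp
// Hypothetical stand-ins for GCC's internal types, for illustration only.
struct function { int id; };
struct decl { function *struct_function; };   // models DECL_STRUCT_FUNCTION

struct cgraph_node
{
  decl *fndecl;
  cgraph_node *clone_of;   // node this one was cloned from, if any

  // Return the struct function for this node, walking up the clone
  // chain when the node's own decl has none.
  function *get_fun () const
  {
    const cgraph_node *node = this;
    function *fun = node->fndecl->struct_function;
    while (!fun && node->clone_of)
      {
        node = node->clone_of;
        fun = node->fndecl->struct_function;
      }
    return fun;
  }
};
```

With a method like this, the caller and callee lookups in
can_inline_edge_p reduce to one call each, as in the two lines quoted
above.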

Thanks,
Richard.

 --
 Eric Botcazou

Re: [patch] Turn 1 lra_assert into 1 gcc_assert

2014-10-07 Thread Eric Botcazou

 The docs on the asm_p flags say there is sth wrong with the asm constraints
 so maybe better do
 
  if (!asm_p)
error_at (loc, );
 
 with an appropriate message and location?

OK, I guess I can copy-and-paste reload1.c:spill_failure there.

-- 
Eric Botcazou

Re: [patch] Fix miscompilation of gnat1 in LTO bootstrap

2014-10-07 Thread Eric Botcazou

 Testcase?  I think it would be better to handle this in the canonical type
 merging code in lto.c - or how does it end up working without LTO?  That is,
 what does the Ada frontend do to make sure get_alias_set handles this
 correctly?

It manages the alias sets, see gcc-interface/utils.c:relate_alias_sets.

-- 
Eric Botcazou

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Andrew Haley

On 06/10/14 22:00, Mark Wielaard wrote:
 If no java maintainer responds, try CCing java-patc...@gcc.gnu.org
 to draw their attention.

Please.  I can't see the patch here.

Andrew.

Re: [PATCH] Enhance array types debug info. for Ada

2014-10-07 Thread Pierre-Marie de Rodat


Jakub,

First, thank you very much for reviewing this set of patches.

I think it's better to start with an answer to your last mail:

On 10/03/2014 11:20 AM, Jakub Jelinek wrote:

What kind of more complex expressions do you need and why?


GNAT can produce array types that make sense only as part of a record 
type and whose bounds are equal to members of this record type. Such 
ARRAY_TYPE nodes get generated from the kind of example you could see on 
the Dwarf-Discuss mailing list:


type Array_Type is array (Integer range <>) of Integer;
type Record_Type (N : Integer) is record
   A : Array_Type (1 .. N);
end record;

In this case, the A field's type is an ARRAY_TYPE node whose upper 
bound is:


COMPONENT_REF (PLACEHOLDER_EXPR (Record_Type),
   FIELD_DECL(N))

Upcoming patches will actually extend the need to handle more complex 
expressions: Ada arrays can contain dynamically-sized objects (their 
size is bounded, though). As a consequence, debuggers need these arrays 
to have a DW_AT_byte_stride attribute in order to decode them. The size 
expressions that describe the array stride in GCC can contain fairly 
complex operations such as unsigned divisions, unsigned comparisons, 
bitwise operations, calls to size functions (see 
stor-layout.c:self_referential_size).


On 10/03/2014 11:18 AM, Jakub Jelinek wrote:

+ /* Instead of producing a dedicated DW_TAG_array_type DIE for  this type, let
+the circuitry wrap the main variant with DIEs for qualifiers  (for
+instance: DW_TAG_const_type, ...). */
+ if (type != TYPE_MAIN_VARIANT (type))
+ {
+   gen_type_die (TYPE_MAIN_VARIANT (type), context_die);
+   return;
+ }


I don't like this, can you explain why? I'd say that if you only
want to see TYPE_MAIN_VARIANT here, it should be responsibility of
the callers to ensure that.


Agreed. I have updated the patch to:

 1. remove this hunk;
 2. in gen_type_die_with_usage, which is the only caller, move the 
type_main_variant call on type right before the array descriptors 
handling.



@@ -19941,7 +19991,8 @@ gen_type_die_with_usage (tree type, dw_die_ref context_die,
    /* If this is an array type with hidden descriptor, handle it first.  */
    if (!TREE_ASM_WRITTEN (type)
        && lang_hooks.types.get_array_descr_info
-       && lang_hooks.types.get_array_descr_info (type, &info)
+       && lang_hooks.types.get_array_descr_info (type,
+           init_array_descr_info (&info))


Just memset it to 0 instead?


Sure. I was not sure whether it was considered good style, but 
it's done now.



+  enum array_descr_ordering ordering;
tree element_type;
tree base_decl;
tree data_location;
tree allocated;
tree associated;
+


Why the extra vertical space?

struct array_descr_dimen
  {


It made the separation between global and dimension-local 
information clearer to me. As I suppose you don't like it and as there 
is already one indentation level, I removed it.



* dwarf2out.c (gen_type_die_with_usage): Enable the array lang-hook
even when (dwarf_version < 3 && dwarf_strict).
(gen_descr_array_die): Do not output DW_AT_data_location,
DW_AT_associated, DW_AT_allocated and DW_AT_byte_stride DWARF
attributes when (dwarf_version < 3 && dwarf_strict).


This patch sounds very wrong.  DW_OP_push_object_address is not in DWARF2
either, and that is the basis of all the fields, so there is really nothing
you can output correctly for DWARF2.  It isn't the default on sane
targets, where GCC defaults to DWARF4 these days, so why bother?


Generating DW_OP_push_object_address in strict DWARF2 mode is indeed a 
bug (the patch is adjusted). However, if I understand correctly, not all 
fields/attributes have to rely on it.


In the case of the first Ada example I quoted above, such an operation 
would not be emitted: instead, add_bound_info/add_scalar_info are going 
to output a DW_AT_upper_bound attribute that is a reference to another 
DIE. This is valid DWARF2 and, I think, justifies enabling the language 
hook in this case.


We have several platforms that default to strict DWARF2. These are 
widely used platforms on which some DWARF consumers crash when given 
DIEs and tags they do not handle.



gcc/fortran/
* trans-types.c (gfc_get_array_descr_info): Use PLACEHOLDER_EXPR nodes
instead of VAR_DECL ones in type-related expressions.  Remove base_decl
initialization.


Ugh, I must say I don't like PLACEHOLDER_EXPRs at all.


Why so? I know that as far as supported front-ends are concerned, 
PLACEHOLDER_EXPR nodes are used only in GNAT, but it seems to me they 
best describe which object the bound/stride/allocated/associated 
expressions (self-)reference.


I have attached to this mail the 3 patches that are updated thanks to 
your (Jakub's and Jason's) comments and that successfully pass the GCC 
testsuite on x86_64-pc-linux-gnu.


Thanks again for

Re: [PING] Enhance array types debug info. for Ada

2014-10-07 Thread Pierre-Marie de Rodat


On 10/03/2014 06:41 PM, Jason Merrill wrote:

Patches 1-4 are OK.


+  bool pell_conversions = true;


I don't understand pell.  Do you mean strip?


Absolutely: I thought it was correct English. I replaced all occurrences 
of pell with strip. Updated patches will follow...


Thank you very much for your review! :-)

--
Pierre-Marie de Rodat

Re: [PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-07 Thread Bernhard Reutner-Fischer

On 6 October 2014 22:31:18 CEST, Jan Hubicka hubi...@ucw.cz wrote:
 

 Is it ok to commit these two patches now?

Yes, it is OK, thanks!

I do not see documentation of the new parameter added to doc in the ChangeLog?
Also, I would not abbreviate indir in the param name.

Thanks,

Re: [PATCH] Enhance array types debug info. for Ada

2014-10-07 Thread Jakub Jelinek

On Tue, Oct 07, 2014 at 10:08:23AM +0200, Pierre-Marie de Rodat wrote:
 gcc/fortran/
 * trans-types.c (gfc_get_array_descr_info): Use PLACEHOLDER_EXPR nodes
 instead of VAR_DECL ones in type-related expressions.  Remove base_decl
 initialization.
 
 Ugh, I must say I don't like PLACEHOLDER_EXPRs at all.
 
 Why so? I know that as far as supported front-ends are concerned,
 PLACEHOLDER_EXPR nodes are used only in GNAT, but it seems to me they
 best describe which object the bound/stride/allocated/associated
 expressions (self-)reference.

But isn't there a risk that you will have PLACEHOLDER_EXPRs (likely for Ada
only) in some trees not constructed by the langhook?
I mean, DW_OP_push_object_address isn't meaningful in all DWARF contexts,
in some it is forbidden, in others there is really no object to push, and as
implemented, you emit DW_OP_push_object_address (which emits the address of
a context related particular object) for any kind of PLACEHOLDER_EXPR with
RECORD_TYPE.

Thus, I'd feel safer, even if you decide to use a PLACEHOLDER_EXPR, that
the translation of that to DW_OP_push_object_address would be done only
if the PLACEHOLDER_EXPR is equal to some global variable, normally NULL,
and only changed temporarily while emitting loc for the array descriptor.
But then IMHO a DEBUG_EXPR_DECL is better.

That said, if Jason is fine with the patchset as is, I can live with it,
as other FEs don't use PLACEHOLDER_EXPRs, worst case it will affect Ada
only.
Also, please verify that with your patch the generated debug info for some
Fortran arrays is the same.

Jakub

[Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek

[CCing java-patches now]

Java testsuite breaks with -std=gnu11 as a default and/or with 
-Wimplicit-function-declaration on, since the jvgenmain.c program
that generates a C file containing 'main' function which calls either
'JvRunMainName' or 'JvRunMain' does not generate forward declarations
for these functions.  The following patch generates such a declaration
depending on whether -findirect-dispatch is given.
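As a rough illustration of the idea (not the code in jvgenmain.c), the choice of forward declaration can be isolated in a small helper. The function name emit_entry_decl and the buffer-based interface are assumptions made for this sketch; the actual patch writes straight to the output stream with fprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of jvgenmain.c: write into BUF the
   forward declaration for the entry point that the generated 'main'
   will call.  INDIRECT is nonzero when -findirect-dispatch was given.
   Returns the number of characters written, excluding the NUL.  */
int
emit_entry_decl (char *buf, size_t size, int indirect)
{
  const char *name = indirect ? "JvRunMainName" : "JvRunMain";

  /* "extern void " (12) + NAME + " ();\n" (5) + NUL must fit.  */
  assert (buf != NULL && size > strlen (name) + 17);
  return snprintf (buf, size, "extern void %s ();\n", name);
}
```

Emitting only the declaration that is actually used keeps the generated file free of unused externs while still satisfying -Wimplicit-function-declaration.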

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-07  Marek Polacek  pola...@redhat.com

* jvgenmain.c (main): Provide declaration for JvRunMain{,Name}.

diff --git gcc/java/jvgenmain.c gcc/java/jvgenmain.c
index 5b14258..82e468d 100644
--- gcc/java/jvgenmain.c
+++ gcc/java/jvgenmain.c
@@ -127,6 +127,10 @@ main (int argc, char **argv)
   /* At this point every element of ARGV from 1 to LAST_ARG is a `-D'
  option.  Process them appropriately.  */
   fprintf (stream, "extern const char **_Jv_Compiler_Properties;\n");
+  if (indirect)
+    fprintf (stream, "extern void JvRunMainName ();\n");
+  else
+    fprintf (stream, "extern void JvRunMain ();\n");
   fprintf (stream, "static const char *props[] =\n{\n");
   for (i = 1; i < last_arg; ++i)
 {

Marek

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek

On Mon, Oct 06, 2014 at 11:00:48PM +0200, Mark Wielaard wrote:
 On Mon, Oct 06, 2014 at 11:54:00AM +0200, Marek Polacek wrote:
  Java testsuite breaks with -std=gnu11 as a default and/or with 
  -Wimplicit-function-declaration on, since the jvgenmain.c program
  that generates a C file containing 'main' function which calls either
  'JvRunMainName' or 'JvRunMain' does not generate forward declarations
  for these functions.  The fix is obvious IMHO.
  
  Bootstrapped/regtested on x86_64-linux, ok for trunk?
 
 I cannot approve (java) patches, but it does look ok to me.
 With one nitpick. JvRunMain is only used when -findirect-dispatch
 is given, and otherwise JvRunMainName is used. So you could output
 only the actually used forward declaration by checking if (indirect).

Yeah, that will be better.
 
 If no java maintainer responds, try CCing java-patc...@gcc.gnu.org
 to draw their attention.

Done (separate mail).  Thanks.

Marek

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Daniel Hellstrom


On 10/07/2014 09:26 AM, Eric Botcazou wrote:

You're right. I have attached an updated patch. The new code becomes:

   #ifdef HAVE_AS_LEON
   #define AS_LEON_FLAG -Aleon
+#define AS_LEONV7_FLAG -Aleon
   #else
   #define AS_LEON_FLAG -Av8
+#define AS_LEONV7_FLAG -Av7
   #endif

The patch is OK for all active branches (trunk, 4.9 and 4.8), modulo nits in
the ChangeLog entry: capital letter at the beginning and period at the end of
every sentence.

* config.gcc (sparc*-*-*): Accept mcpu=leon3v7 processor.
* doc/invoke.texi (SPARC options): Add mcpu=leon3v7 comment.
* config/sparc/leon.md (leon3_load, leon_store, leon_fp_*): Handle
leon3v7 as leon3.
* config/sparc/sparc-opts.h (enum processor_type): Add LEON3V7.
* config/sparc/sparc.c (sparc_option_override): Add leon3v7 support.
* config/sparc/sparc.h (TARGET_CPU_leon3v7): New define.
* config/sparc/sparc.md (cpu): Add leon3v7.
* config/sparc/sparc.opt (enum processor_type): Add leon3v7.


Ok, I will update that. Is there a way of generating the comments automatically?


I assume that you have applied for write access so you'll be able to install
it yourself.  Otherwise let me know if I can lend a hand.


Thanks, I'll let you know. I just applied.

Daniel

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Eric Botcazou

 Ok, I will update that. Is there a way of generating the comments
 automatically?

Do you mean the ChangeLog?  If so, contrib/mklog will generate a skeleton but 
you'll still need to write the description sentences.

-- 
Eric Botcazou

Re: [PATCH] SPARC: add mcpu=leon3v7 target

2014-10-07 Thread Daniel Hellstrom


On 10/07/2014 11:07 AM, Eric Botcazou wrote:

Ok, I will update that. Is there a way of generating the comments
automatically?

Do you mean the ChangeLog?  If so, contrib/mklog will generate a skeleton but
you'll still need to write the description sentences.


Perfect, thanks!

[PATCH] Fix PR ipa/61190, 2nd edition

2014-10-07 Thread Bernd Edlinger

Hi Honza,


As you know, we have a wrong-code bug when a pure or const method is
called via a virtual thunk.  I had some more ideas on how to fix that,
but all of them had serious drawbacks, so I will leave the details out.


But now I have a new insight into why the obvious fix for this serious
code generation bug did not work in the first place.


The reason is that if ipa-pure-const.c calls set_const_flag or
set_pure_flag for a thunk, it later calls the same function for the
called method, and this overwrites the flags of _all_ associated thunks
and aliases.  However, that should at least not be done for virtual
thunks, as these need to be IPA_NEITHER even if the method itself has
different attributes: the assembler thunk accesses the vtable, while
other thunks do not.


So I refactored set_const_flag and set_pure_flag to exclude virtual
thunks, taking care that other users of
call_for_symbol_thunks_and_aliases do not get different behavior
than before this patch.
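The propagation rule described above can be sketched with a toy data structure. Everything here is an assumption for illustration: struct node, its fields and this set_const_flag signature are stand-ins and do not mirror GCC's cgraph_node API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a call-graph node; the field names are illustrative
   and do not mirror GCC's cgraph_node.  */
struct node
{
  bool is_const;          /* Flag being propagated.  */
  bool virtual_thunk;     /* True for thunks that read the vtable.  */
  struct node **thunks;   /* Thunks and aliases referring to this node.  */
  size_t n_thunks;
};

/* Set the const flag on NODE and, recursively, on its thunks and
   aliases.  When EXCLUDE_VIRTUAL_THUNKS is true, virtual thunks (and
   whatever hangs off them) keep their current flag.  */
void
set_const_flag (struct node *node, bool value, bool exclude_virtual_thunks)
{
  assert (node != NULL);
  node->is_const = value;
  for (size_t i = 0; i < node->n_thunks; i++)
    {
      struct node *t = node->thunks[i];
      if (exclude_virtual_thunks && t->virtual_thunk)
        continue;
      set_const_flag (t, value, exclude_virtual_thunks);
    }
}
```

With the exclusion enabled, marking the method const no longer clobbers the state of a virtual thunk that was deliberately left as "neither".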


The attached patch was boot-strapped and
regression-tested on x86_64-linux-gnu.
Ok for trunk?


PS: As a side note, there are two identical functions named
call_for_symbol_and_aliases, one in class symtab_node and one in class
cgraph_node, which inherits from symtab_node.  Neither function is
declared virtual.  Is that what's intended?  Usually this could lead to
errors, or at least some serious compiler warnings.


Thanks
Bernd.
  2014-10-07  Bernd Edlinger  bernd.edlin...@hotmail.de

PR ipa/61190
* cgraph.h (symtab_node::call_for_symbol_and_aliases): Fix comment.
(cgraph_node::call_for_symbol_and_aliases): Likewise.
(cgraph_node::call_for_symbol_thunks_and_aliases_1): New function.
(cgraph_node::call_for_symbol_thunks_and_aliases): Adjust comment.
Call call_for_symbol_thunks_and_aliases_1.
* cgraph.c (cgraph_node::call_for_symbol_thunks_and_aliases): Renamed
to cgraph_node::call_for_symbol_thunks_and_aliases_1.  Added new
parameter exclude_virtual_thunks.
(cgraph_node::set_const_flag): Don't propagate to virtual thunks.
(cgraph_node::set_pure_flag): Likewise.
* ipa-pure-const.c (analyze_function): For virtual thunks
set pure_const_state to IPA_NEITHER.

testsuite/ChangeLog:
2014-10-07  Bernd Edlinger  bernd.edlin...@hotmail.de

PR ipa/61190
* g++.old-deja/g++.mike/p4736b.C: Use -O2.



patch-pr61190.diff
Description: Binary data

Re: [gofrontend-dev] [PATCH 5/9] Gccgo port to s390[x] -- part I

2014-10-07 Thread Dominik Vogt

On Mon, Oct 06, 2014 at 07:29:33AM -0700, Ian Lance Taylor wrote:
 On Mon, Oct 6, 2014 at 12:42 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote:
  On s390[x] the symbol value of a section symbol is definitely not
  zero.
 
 Is that true even in an object file?

No.

 I agree that in an executable a
 section symbol will have a non-zero value, but that case doesn't arise
 since an executable won't have (non-dynamic) relocations.  But I'm
 quite surprised to hear that the section symbol would be non-zero in
 an object file.

I spent a day looking at that issue again, and while it's true
that section symbols don't necessarily have a zero value, that is
not the problem here.  The problem is about how cgo determines the
names of functions(?) from an object file.  On s390 we need to do
an indirect lookup of (non-section-)symbols to find the names, and
the symbol value is not zero.

The only points in that patch are that on one hand - as far as I
know - the ABI does not guarantee that section symbols are either
zero or not relocated, even if that may be the case in reality.
And on the other hand, if that code is ever modified to handle
non-section symbols, it's not obvious that you not only need to
remove the test for the symbol type but also modify the
calculations below.  So, apply the patch or drop it as you like,
but in any case, at least a comment in the code would improve
maintainability.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Mark Wielaard

On Mon, 2014-10-06 at 20:55 -0400, Jason Merrill wrote:
 On 10/06/2014 08:50 PM, Siva Chandra wrote:
  On Sat, Oct 4, 2014 at 11:14 AM, Jason Merrill ja...@redhat.com wrote:
  On 10/03/2014 05:41 PM, Siva Chandra wrote:
 
  I understand that knowing whether a copy-ctor or a d-tor has been
  explicitly defaulted is not sufficient to determine the parameter
  passing ABI. However, why is it not necessary? I could be wrong, but
  doesn't the example I have given show that it is necessary?
 
  An implicitly declared 'tor can also be trivial.
 
  But, the question is whether it is required to determine the parameter
  passing ABI. If there is no special marker to indicate that the user
  declared 'tor is explicitly defaulted, then GDB could (in the absence
  of other properties which make the 'tor non-trivial) incorrectly
  conclude that the 'tor is user defined, and hence not-trivial.
 
 I've been thinking that we should just mark the 'tor as trivial or not 
 directly rather than hint at it.  Does GDB have enough information to 
 determine triviality if we just add defaulted info?

To be honest my original patches for a deleted/defaulted markers on
special member functions was really just meant to give the consumer a
way to know why GCC produced a declaration in the first place. Which I
still think is useful information for the consumer to have, but
certainly not enough to solve the abi problem with inferior function
calls Siva was seeing. Maybe GDB has enough information/smarts, but I
don't think other consumers have. So an explicit trivial/non-trivial
marker on special member functions seems like a good idea.

But looking at the definition of trivial copy constructor and trivial
destructor they do look more like class concepts instead of individual
constructor/destructor concepts (since they rely on properties of other
members and the base class). Currently GCC doesn't output declarations
unless the user declares them. So an implicit copy constructor or
destructor doesn't get a DWARF class member declaration. But I don't
think a consumer can conclude just from that fact that the copy
constructor or destructor is trivial. Nor can it assume they are
non-trivial just because they are represented in DWARF. So should
we always output them and add a flag value to indicate
(non-)trivialness? Or should we add attributes on the class itself?

Taking a step back and looking at the actual function that is causing
the trouble because abi/calling convention seems unclear. Which makes me
wonder if the issue isn't actually with the DWARF declaration of the
function that has special calling conventions. I am slightly surprised
the special return value passed in rule isn't expressed in the mangling
of the function name (or is it?). So the calling convention needs to be
interpreted from the DWARF representation. We already add a synthetic
formal parameter for this if necessary to be passed in. Why don't we
just add a similar synthetic return formal parameter if that is how
the function is really being invoked? That seems like a more direct way
to solve the inferior function call issue.

Cheers,

Mark

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Andrew Haley

On 10/07/2014 09:31 AM, Marek Polacek wrote:
 Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK, thanks.

Andrew.

[wwwdocs] Update libstdc++ section of gcc-5/changes.html

2014-10-07 Thread Jonathan Wakely


Document the latest additions.

? gcc-5/.changes.html.swp
Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.14
diff -u -u -r1.14 changes.html
--- gcc-5/changes.html  2 Oct 2014 10:13:36 -   1.14
+++ gcc-5/changes.html  7 Oct 2014 12:34:27 -
@@ -96,15 +96,33 @@
         <li> <code>std::deque</code> meets the allocator-aware container requirements;</li>
         <li> movable and swappable iostream classes;</li>
         <li> support for <code>std::aligned_union</code>;</li>
+        <li> I/O manipulators <code>std::hexfloat</code> and
+             <code>std::defaultfloat</code>;
+        </li>
       </ul>
     </li>
+    <li> Support for the C++11 <code>hexfloat</code> manipulator changes how
+      the <code>num_put</code> facet formats floating point types when
+      <code>ios_base::fixed|ios_base::scientific</code> is set in a stream's
+      <code>fmtflags</code>. This change affects all language modes, even
+      though the C++98 standard gave no special meaning to that combination
+      of flags. To prevent the use of hexadecimal notation for floating point
+      types use <code>str.unsetf(std::ios_base::floatfield)</code> to clear
+      the relevant bits in <code>str.flags()</code>.
+    </li>
     <li><a href="https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014">
       Improved experimental support for C++14</a>, including:
       <ul>
         <li> <code>std::is_final</code> type trait; </li>
       </ul>
     </li>
-    <li>An implementation of <code>std::experimental::any</code>.</li>
+    <li><a href="https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014">
+      Improved experimental support for the Library Fundamentals TS</a>, including:
+      <ul>
+        <li> Class <code>std::experimental::any</code>; </li>
+        <li> Function template <code>std::experimental::apply</code>; </li>
+      </ul>
+    </li>
     <li>New random number distributions <code>logistic_distribution</code> and
       <code>uniform_on_sphere_distribution</code> as extensions.</li>
     <li><a href="https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html">GDB

Re: [wwwdocs] Add feature-testing macros and std::is_final to gcc-5/changes.html

2014-10-07 Thread Ed Smith-Rowland


On 10/02/2014 10:24 AM, Jonathan Wakely wrote:

On 02/10/14 10:09 -0400, Ed Smith-Rowland wrote:

On 10/02/2014 06:14 AM, Jonathan Wakely wrote:

On 02/10/14 11:12 +0100, Jonathan Wakely wrote:

Note Ed's recent changes. Committed to CVS.


And fix a markup error that I expected xmllint to catch :-(
Thank you! I tried to do this and couldn't for permissions.  I'm 
probably not doing it right.


If I remember my cvs-fu you need CVS_RSH=ssh and use
CVSROOT=:ext:$u...@gcc.gnu.org:/cvs/gcc (with your sourceware.org
username as $USER) and then it should work over SSH just like svn and
git.

Anyway, the real thing I wanted to suggest is we put a line for 
C-family about the availability of __has_include and 
__has_include_next.  We could mention clang has it.


Good idea, I'm happy to commit a patch if you can prepare something.


OK, here is a patch for both using typename as a class key for template 
template parms and for __has_include, etc.

Are these too wordy?

Ed

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.14
diff -r1.14 changes.html
56a57,81
>     <li>New preprocessor constructs, <code>__has_include_next</code>
>         and <code>__has_include_next</code>, to test the availability of headers
>         have been added.<br/>
>         This demonstrates a way to include the header <code>&lt;optional&gt;</code>
>         only if it is available:<br/>
> <blockquote><pre>
> #ifdef __has_include
> #  if __has_include(&lt;optional&gt;)
> #include &lt;optional&gt;
> #define have_optional 1
> #  elif __has_include(&lt;experimental/optional&gt;)
> #include &lt;experimental/optional&gt;
> #define have_optional 1
> #define experimental_optional
> #  else
> #define have_optional 0
> #  endif
> #endif
> </pre></blockquote>
>         The header search paths for <code>__has_include_next</code>
>         and <code>__has_include_next</code> are equivalent to those
>         of the standard directive <code>#include</code>
>         and the extension <code>#include_next</code> respectively.
>     </li>
> 
88a114,117
>   <li>G++ now allows <code>typename</code> in a template template parameter.
>     <blockquote><pre>
>       template&lt;template&lt;typename&gt; <b>typename</b> X&gt; struct D; // OK
>     </pre></blockquote></li>

Re: [wwwdocs] Add feature-testing macros and std::is_final to gcc-5/changes.html

2014-10-07 Thread Jonathan Wakely


On 07/10/14 08:39 -0400, Ed Smith-Rowland wrote:
OK, here is a patch for both using typename as a class key for 
template template parms and for __has_include, etc.

Are these too wordy?


They look OK to me, although you say __has_include_next and
__has_include_next in both places, the first should be just
__has_include.

I can make that change and commit it, thanks.
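For readers who want to try the construct discussed in this thread, here is a minimal C sketch of the guarded-detection pattern. The macro names HAS_INCLUDE_SUPPORTED, HAVE_STDIO_H and HAVE_DEMO_H, the function count_usable_headers, and the header name no_such_header_for_demo.h are all made up for the illustration.

```c
#include <assert.h>

/* __has_include must be tested with #ifdef first: on a compiler that
   lacks it, "#if __has_include(...)" would not parse.  */
#ifdef __has_include
#  define HAS_INCLUDE_SUPPORTED 1
#  if __has_include(<stdio.h>)
#    define HAVE_STDIO_H 1
#  else
#    define HAVE_STDIO_H 0
#  endif
#  if __has_include(<no_such_header_for_demo.h>)   /* hypothetical name */
#    define HAVE_DEMO_H 1
#  else
#    define HAVE_DEMO_H 0
#  endif
#else
#  define HAS_INCLUDE_SUPPORTED 0
#  define HAVE_STDIO_H 0
#  define HAVE_DEMO_H 0
#endif

/* Number of probed headers that were found.  */
int
count_usable_headers (void)
{
  /* Without __has_include nothing can be detected, so both stay 0.  */
  assert (HAS_INCLUDE_SUPPORTED || (HAVE_STDIO_H == 0 && HAVE_DEMO_H == 0));
  return HAVE_STDIO_H + HAVE_DEMO_H;
}
```

The nested form matters: combining the two tests as "#if defined(__has_include) && __has_include(...)" still fails on compilers without the feature, because the whole expression has to parse.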

Re: RFA: Merge definitions of get_some_local_dynamic_name

2014-10-07 Thread Richard Sandiford

Richard Sandiford rdsandif...@googlemail.com writes:
 Rainer Orth r...@cebitec.uni-bielefeld.de writes:
 Hi Richard,
 Rainer Orth r...@cebitec.uni-bielefeld.de writes:
 Hi Richard,
 It seems the new get_some_local_dynamic_name implementation in
 function.c lost the non-NULL check the sparc.c version had.  I'm
 currently testing the following patch:

 Could you do a:

   call debug_rtx (...)

 on the insn that contains a null pointer?  Normally insn patterns
 shouldn't contain nulls, so I was wondering whether this was some
 SPARC-specific construct.

 proved a bit difficult to do: at the default -O2, insn was optimized
 away, at -g3 -O0, I only got

 can't compute CFA for this frame

 with gdb 7.8 even after recompiling all of the gcc dir with -g3 -O0.

 Here's what I find after inserting the call in the source:

 (insn 30 6 28 (sequence [
 (call_insn:TI 8 6 7 (parallel [
 (set (reg:SI 8 %o0)
 (call (mem:SI (unspec:SI [
 (symbol_ref:SI ("__tls_get_addr"))
 ] UNSPEC_TLSLDM) [0  S4 A32])
 (const_int 1 [0x1])))
 (clobber (reg:SI 15 %o7))
 ]) /vol/gcc/src/hg/trunk/local/libgo/runtime/proc.c:936 390 {tldm_call32}
  (expr_list:REG_EH_REGION (const_int -2147483648 [0x8000])
 (nil))
 (expr_list (use (reg:SI 8 %o0))
 (nil)))
 (insn 7 8 28 (set (reg:SI 8 %o0)
 (plus:SI (reg:SI 23 %l7)
 (unspec:SI [
 (reg:SI 8 %o0 [112])
 ] UNSPEC_TLSLDM))) 388 {tldm_add32}
  (nil))
 ]) /vol/gcc/src/hg/trunk/local/libgo/runtime/proc.c:936 -1
  (nil))

 Bah, a sequence.  Hadn't thought of that.

 IMO it's a bug for a walk on a PATTERN to pull in non-PATTERN parts
 of an insn.  We should really be looking at the patterns of the two
 subinsns instead and ignore the other stuff.  Let me have a think
 about it.

 did you come to a conclusion here?

 Sorry, forgot to come back to this.  I have a patch that iterates over
 PATTERNs of a SEQUENCE if the SEQUENCE (rather than its containing insn)
 is the topmost iterated rtx.  So if PATTERN (insn) is a SEQUENCE:

FOR_EACH_SUBRTX (, insn, x)
  ...

 will iterate over the insns in the SEQUENCE (including pattern, notes,
 jump label, etc.), whereas:

FOR_EACH_SUBRTX (, PATTERN (insn), x)
  ...

 would only iterate over the patterns of the insns in the SEQUENCE.

Does this work for you?  I tested it on x86_64-linux-gnu but obviously
that's not particularly useful for SEQUENCEs.

Thanks,
Richard

gcc/
* rtlanal.c (generic_subrtx_iterator <T>::add_subrtxes_to_queue):
Add the parts of an insn in reverse order, with the pattern at
the top of the queue.  Detect when we're iterating over a SEQUENCE
pattern and in that case just consider patterns of subinstructions.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   2014-09-25 16:40:44.944406590 +0100
+++ gcc/rtlanal.c   2014-10-07 13:13:57.698132753 +0100
@@ -128,29 +128,58 @@ generic_subrtx_iterator <T>::add_subrtxe
                                                     value_type *base,
                                                     size_t end, rtx_type x)
 {
-  const char *format = GET_RTX_FORMAT (GET_CODE (x));
+  enum rtx_code code = GET_CODE (x);
+  const char *format = GET_RTX_FORMAT (code);
   size_t orig_end = end;
-  for (int i = 0; format[i]; ++i)
-    if (format[i] == 'e')
-      {
-        value_type subx = T::get_value (x->u.fld[i].rt_rtx);
-        if (__builtin_expect (end < LOCAL_ELEMS, true))
-          base[end++] = subx;
-        else
-          base = add_single_to_queue (array, base, end++, subx);
-      }
-    else if (format[i] == 'E')
-      {
-        int length = GET_NUM_ELEM (x->u.fld[i].rt_rtvec);
-        rtx *vec = x->u.fld[i].rt_rtvec->elem;
-        if (__builtin_expect (end + length <= LOCAL_ELEMS, true))
-          for (int j = 0; j < length; j++)
-            base[end++] = T::get_value (vec[j]);
-        else
-          for (int j = 0; j < length; j++)
-            base = add_single_to_queue (array, base, end++,
-                                        T::get_value (vec[j]));
-      }
+  if (__builtin_expect (INSN_P (x), false))
+    {
+      /* Put the pattern at the top of the queue, since that's what
+         we're likely to want most.  It also allows for the SEQUENCE
+         code below.  */
+      for (int i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; --i)
+        if (format[i] == 'e')
+          {
+            value_type subx = T::get_value (x->u.fld[i].rt_rtx);
+            if (__builtin_expect (end < LOCAL_ELEMS, true))
+              base[end++] = subx;
+            else

[C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek

PR59717 is a request for hints which header to include if the compiler warns
about incompatible implicit declarations.  E.g., if one uses abort
without declaring it first, we now say
  note: include ‘<stdlib.h>’ or provide a declaration of ‘abort’
I've added hints only for standard functions which means we won't display
the hint for functions such as mempcpy.

The implementation is based on a function that just maps built_in_function
codes to header names.
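A much-reduced sketch of such a mapping is shown below. The enum toy_builtin and both function names are invented for illustration; GCC's real enum built_in_function and the patch's header_for_builtin_fn cover far more cases, and the exact wording of the note is only approximated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative subset; GCC's real enum built_in_function is much larger.  */
enum toy_builtin
{
  TOY_BUILT_IN_SQRT,
  TOY_BUILT_IN_CABS,
  TOY_BUILT_IN_MEMCPY,
  TOY_BUILT_IN_ABORT,
  TOY_BUILT_IN_MEMPCPY   /* Non-standard: deliberately has no hint.  */
};

/* Map FCODE to the standard header that declares it, or NULL.  */
const char *
toy_header_for_builtin (enum toy_builtin fcode)
{
  switch (fcode)
    {
    case TOY_BUILT_IN_SQRT:
      return "<math.h>";
    case TOY_BUILT_IN_CABS:
      return "<complex.h>";
    case TOY_BUILT_IN_MEMCPY:
      return "<string.h>";
    case TOY_BUILT_IN_ABORT:
      return "<stdlib.h>";
    default:
      return NULL;
    }
}

/* Format the hint for NAME into BUF.  Returns 1 if a hint was written,
   0 if FCODE has no associated standard header.  */
int
toy_format_hint (char *buf, size_t size, enum toy_builtin fcode,
                 const char *name)
{
  assert (buf != NULL && name != NULL && strlen (name) > 0);
  const char *header = toy_header_for_builtin (fcode);
  if (header == NULL)
    return 0;
  snprintf (buf, size, "include '%s' or provide a declaration of '%s'",
            header, name);
  return 1;
}
```

Returning NULL for anything outside the standard set is what keeps functions such as mempcpy from getting a hint.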

Two remarks:
* header_for_builtin_fn is long and I don't want to unnecessarily
  inflate already big c-decl.c file, so it might make sense to move
  the function into c-errors.c;
* we don't issue the "incompatible implicit declaration of built-in function"
  warning for functions that return int and whose parameter types don't need
  default promotions - for instance putc, fputs, ilogb, strcmp, vprintf, isnan,
  isalpha, ...  Therefore for such functions we don't print the hint either.
  header_for_builtin_fn is ready for them, though.  (The cases for ctype.h
  and wctype.h could be removed.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-10-07  Marek Polacek  pola...@redhat.com

PR c/59717
* c-decl.c (header_for_builtin_fn): New function.
(implicitly_declare): Suggest which header to include.

* gcc.dg/pr59717.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index ce5a8de..e23284a 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -2968,6 +2968,189 @@ implicit_decl_warning (location_t loc, tree id, tree olddecl)
 }
 }
 
+/* This function represents mapping of a function code FCODE
+   to its respective header.  */
+
+static const char *
+header_for_builtin_fn (enum built_in_function fcode)
+{
+  switch (fcode)
+{
+CASE_FLT_FN (BUILT_IN_ACOS):
+CASE_FLT_FN (BUILT_IN_ACOSH):
+CASE_FLT_FN (BUILT_IN_ASIN):
+CASE_FLT_FN (BUILT_IN_ASINH):
+CASE_FLT_FN (BUILT_IN_ATAN):
+CASE_FLT_FN (BUILT_IN_ATANH):
+CASE_FLT_FN (BUILT_IN_ATAN2):
+CASE_FLT_FN (BUILT_IN_CBRT):
+CASE_FLT_FN (BUILT_IN_CEIL):
+CASE_FLT_FN (BUILT_IN_COPYSIGN):
+CASE_FLT_FN (BUILT_IN_COS):
+CASE_FLT_FN (BUILT_IN_COSH):
+CASE_FLT_FN (BUILT_IN_ERF):
+CASE_FLT_FN (BUILT_IN_ERFC):
+CASE_FLT_FN (BUILT_IN_EXP):
+CASE_FLT_FN (BUILT_IN_EXP2):
+CASE_FLT_FN (BUILT_IN_EXPM1):
+CASE_FLT_FN (BUILT_IN_FABS):
+CASE_FLT_FN (BUILT_IN_FDIM):
+CASE_FLT_FN (BUILT_IN_FLOOR):
+CASE_FLT_FN (BUILT_IN_FMA):
+CASE_FLT_FN (BUILT_IN_FMAX):
+CASE_FLT_FN (BUILT_IN_FMIN):
+CASE_FLT_FN (BUILT_IN_FMOD):
+CASE_FLT_FN (BUILT_IN_FREXP):
+CASE_FLT_FN (BUILT_IN_HYPOT):
+CASE_FLT_FN (BUILT_IN_ILOGB):
+CASE_FLT_FN (BUILT_IN_LDEXP):
+CASE_FLT_FN (BUILT_IN_LGAMMA):
+CASE_FLT_FN (BUILT_IN_LLRINT):
+CASE_FLT_FN (BUILT_IN_LLROUND):
+CASE_FLT_FN (BUILT_IN_LOG):
+CASE_FLT_FN (BUILT_IN_LOG10):
+CASE_FLT_FN (BUILT_IN_LOG1P):
+CASE_FLT_FN (BUILT_IN_LOG2):
+CASE_FLT_FN (BUILT_IN_LOGB):
+CASE_FLT_FN (BUILT_IN_LRINT):
+CASE_FLT_FN (BUILT_IN_LROUND):
+CASE_FLT_FN (BUILT_IN_MODF):
+CASE_FLT_FN (BUILT_IN_NAN):
+CASE_FLT_FN (BUILT_IN_NEARBYINT):
+CASE_FLT_FN (BUILT_IN_NEXTAFTER):
+CASE_FLT_FN (BUILT_IN_NEXTTOWARD):
+CASE_FLT_FN (BUILT_IN_POW):
+CASE_FLT_FN (BUILT_IN_REMAINDER):
+CASE_FLT_FN (BUILT_IN_REMQUO):
+CASE_FLT_FN (BUILT_IN_RINT):
+CASE_FLT_FN (BUILT_IN_ROUND):
+CASE_FLT_FN (BUILT_IN_SCALBLN):
+CASE_FLT_FN (BUILT_IN_SCALBN):
+CASE_FLT_FN (BUILT_IN_SIN):
+CASE_FLT_FN (BUILT_IN_SINH):
+CASE_FLT_FN (BUILT_IN_SINCOS):
+CASE_FLT_FN (BUILT_IN_SQRT):
+CASE_FLT_FN (BUILT_IN_TAN):
+CASE_FLT_FN (BUILT_IN_TANH):
+CASE_FLT_FN (BUILT_IN_TGAMMA):
+CASE_FLT_FN (BUILT_IN_TRUNC):
+case BUILT_IN_ISINF:
+case BUILT_IN_ISNAN:
+      return "<math.h>";
+CASE_FLT_FN (BUILT_IN_CABS):
+CASE_FLT_FN (BUILT_IN_CACOS):
+CASE_FLT_FN (BUILT_IN_CACOSH):
+CASE_FLT_FN (BUILT_IN_CARG):
+CASE_FLT_FN (BUILT_IN_CASIN):
+CASE_FLT_FN (BUILT_IN_CASINH):
+CASE_FLT_FN (BUILT_IN_CATAN):
+CASE_FLT_FN (BUILT_IN_CATANH):
+CASE_FLT_FN (BUILT_IN_CCOS):
+CASE_FLT_FN (BUILT_IN_CCOSH):
+CASE_FLT_FN (BUILT_IN_CEXP):
+CASE_FLT_FN (BUILT_IN_CIMAG):
+CASE_FLT_FN (BUILT_IN_CLOG):
+CASE_FLT_FN (BUILT_IN_CONJ):
+CASE_FLT_FN (BUILT_IN_CPOW):
+CASE_FLT_FN (BUILT_IN_CPROJ):
+CASE_FLT_FN (BUILT_IN_CREAL):
+CASE_FLT_FN (BUILT_IN_CSIN):
+CASE_FLT_FN (BUILT_IN_CSINH):
+CASE_FLT_FN (BUILT_IN_CSQRT):
+CASE_FLT_FN (BUILT_IN_CTAN):
+CASE_FLT_FN (BUILT_IN_CTANH):
+      return "<complex.h>";
+case BUILT_IN_MEMCHR:
+case BUILT_IN_MEMCMP:
+case BUILT_IN_MEMCPY:
+case BUILT_IN_MEMMOVE:
+case BUILT_IN_MEMSET:
+case BUILT_IN_STRCAT:
+case BUILT_IN_STRCHR:
+case BUILT_IN_STRCMP:
+case BUILT_IN_STRCPY:
+case BUILT_IN_STRCSPN:
+case BUILT_IN_STRLEN:
+case BUILT_IN_STRNCAT:
+case BUILT_IN_STRNCMP:
+case

[PATCH] PR lto/59441 Add initialization and release of bitmap obstack

2014-10-07 Thread Ilya Palachev


Hi all,

Attached patch fixes PR lto/59441.
The reason of failure was that the default bitmap obstack was released 
just before the execution of early local passes.
The error was found using valgrind. It reported that there were 153 
invalid reads and 173 invalid writes into the field of the default 
bitmap obstack structure,
and all of them were trying to access data that was free'd previously 
(at the same point of the program).


The solution is to add initialization and release of the bitmap obstack 
before and after the execution of early local passes.
After applying this patch valgrind does not report any errors for the 
same testcase.
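The shape of the fix, pairing an initialize call with a release call around the passes, can be sketched with a toy resource. All names below are invented, and the depth counter is an assumption about how a default obstack might track nested users; this is not GCC's bitmap_obstack code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a default obstack: usable only while depth > 0.  */
static int toy_obstack_depth;

void
toy_obstack_initialize (void)
{
  toy_obstack_depth++;
}

void
toy_obstack_release (void)
{
  assert (toy_obstack_depth > 0);
  toy_obstack_depth--;
}

bool
toy_obstack_live_p (void)
{
  return toy_obstack_depth > 0;
}

/* Run PASS with the toy obstack guaranteed live, whatever state the
   caller left it in.  Returns what PASS returns.  */
bool
run_with_obstack (bool (*pass) (void))
{
  toy_obstack_initialize ();
  bool ok = pass ();
  toy_obstack_release ();
  return ok;
}

/* Example pass: succeeds only if the obstack is live while it runs.  */
bool
toy_pass (void)
{
  return toy_obstack_live_p ();
}
```

Because the pair is balanced, the wrapper is harmless when the caller already holds the resource and restores the previous state when it does not.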


The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.

Ok for trunk?

Best regards,
Ilya Palachev
From 9bf2878c0a74475283b5424f24e46b31feb13cf7 Mon Sep 17 00:00:00 2001
From: Ilya Palachev i.palac...@samsung.com
Date: Tue, 7 Oct 2014 16:09:25 +0400
Subject: [PATCH] Add initialization and release of bitmap obstack

gcc/

2014-10-07  Ilya Palachev  i.palac...@samsung.com

	* cgraphunit.c (process_new_functions): Add initialization and
	release of bitmap obstack before and after running of passes.

gcc/testsuite/

2014-10-07  Ilya Palachev  i.palac...@samsung.com

	* g++.dg/lto/pr59441_0.C: New test from bugzilla.
---
 gcc/cgraphunit.c |  6 +-
 gcc/testsuite/g++.dg/lto/pr59441_0.C | 26 ++
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr59441_0.C

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index d463505..ee42ad1 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -323,7 +323,11 @@ symbol_table::process_new_functions (void)
 	  push_cfun (DECL_STRUCT_FUNCTION (fndecl));
 	  if (state == IPA_SSA
 	      && !gimple_in_ssa_p (DECL_STRUCT_FUNCTION (fndecl)))
-	    g->get_passes ()->execute_early_local_passes ();
+	    {
+	      bitmap_obstack_initialize (NULL);
+	      g->get_passes ()->execute_early_local_passes ();
+	      bitmap_obstack_release (NULL);
+	    }
 	  else if (inline_summary_vec != NULL)
 	compute_inline_parameters (node, true);
 	  free_dominance_info (CDI_POST_DOMINATORS);
diff --git a/gcc/testsuite/g++.dg/lto/pr59441_0.C b/gcc/testsuite/g++.dg/lto/pr59441_0.C
new file mode 100644
index 000..3c766e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr59441_0.C
@@ -0,0 +1,26 @@
+// { dg-lto-do assemble }
+// { dg-lto-options { { -shared -fPIC -flto -O -fvtable-verify=std } } }
+
+template < typename T > struct A
+{
+  T foo ();
+};
+
+template < typename T > struct C: virtual public A < T >
+{
+  C & operator<< (C & (C &));
+};
+
+template < typename T >
+C < T > &endl (C < int > &c)
+{
+  c.foo ();
+return c;
+}
+
+C < int > cout;
+void
+fn ()
+{
+  cout << endl;
+}
-- 
2.1.1

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek

On Mon, Oct 06, 2014 at 07:53:17PM +0400, Ilya Verbin wrote:
 This patch adds plugin support to libgomp, as well as memory mapping and
 interaction with target devices through plugin's interface.

Still have issues with the non-installed testing.

( mkdir objmic && cd objmic && ../configure \
--build=x86_64-intelmicemul-linux-gnu \
--host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu \
--enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap \
&& make && make install DESTDIR=`cd ..; pwd`/objinst )
( mkdir objhost && cd objhost && ../configure --build=x86_64-pc-linux-gnu \
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu \
--enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gcc-git/objmic \
--disable-bootstrap && make )
( mkdir objhost2 && cd objhost2 && ../configure --build=x86_64-pc-linux-gnu \
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu \
--enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gcc-git/objinst/usr/local \
--disable-bootstrap && make )

All 3 succeeded for me.

Now, in objhost make check-target-libgomp doesn't really work, in objhost2
it does.

E.g. trying to link target-1.exe, I get:

lto-wrapper: fatal error: Problem with building target image for 
x86_64-intelmicemul-linux-gnu.

compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

If I add
-B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
-B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
to the command line so it at least finds mkoffload, it then can't find for
some reason the offload compiler:

(null): fatal error: offload compiler 
x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc not found.
compilation terminated.
lto-wrapper: fatal error: 
/usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload
 returned 1 exit status
compilation terminated.
/usr/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status

So, what exactly should be added (by libgomp.exp) so that the testing succeeds
in the case of non-installed offload and non-installed host compilers?

Jakub

Re: [Patch ARM-AArch64/testsuite v2 01/21] Neon intrinsics execution tests initial framework.

2014-10-07 Thread Christophe Lyon

On 1 October 2014 17:11, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 On 30 September 2014 15:27, Christophe Lyon christophe.l...@linaro.org 
 wrote:
 On 10 July 2014 12:12, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 On 1 July 2014 11:05, Christophe Lyon christophe.l...@linaro.org wrote:
 * documentation (README)
 * dejagnu driver (neon-intrinsics.exp)
 * support macros (arm-neon-ref.h, compute-ref-data.h)
 * Tests for 3 intrinsics: vaba, vld1, vshl

 Hi, The terminology in armv8 is advsimd rather than neon.  Can we
 rename neon-intrinsics to advsimd-intrinsics or simd-intrinsics
 throughout please.  The existing gcc.target/aarch64/simd directory of
 tests will presumably be superseded by this more comprehensive set of
 tests so I suggest these tests go in gcc.target/aarch64/advsimd and we
 eventually remove gcc.target/aarch64/simd/ directory.

 GNU style should apply throughout this patch series, notably double
 space after period in comments and README text.  Space before left
 parenthesis in function/macro call and function declaration.  The
 function name in a declaration goes on a new line.  The GCC wiki notes
 on test cases state that individual tests should have file names ending in
 _number; see https://gcc.gnu.org/wiki/TestCaseWriting


 Hi,

 For the record, these tests are based on a testsuite I wrote quite
 some time ago:
 https://gitorious.org/arm-neon-tests/

 where obviously I had no such requirement (and v8 wasn't public yet)

 So I prefer to apply the changes you request in my main version before
 re-submitting it here.
 (libsanitizer-style, sort-of).

 This will take me some time, so the next version of my patch series
 should not be expected really soon :-(


Ramana, Marcus,

 Hi Christophe,   Given that this test suite code is an existing body
 of work I see no reason to impose the GNU style change I originally
 asked for. I withdraw my original comment that these patches should
 conform to GNU style.  My comment on file names is also withdrawn.  I
 would like to see the terminology corrected.


Thanks, I have updated my patch according to this.

But meanwhile I have also updated my testsuite, and fixed the #define
flag I used to toggle float16 tests: I now use __ARM_FP16_FORMAT_IEEE,
such as:
#if defined(__ARM_FP16_FORMAT_IEEE)
  TEST_VLD1(vector, buffer, , float, f, 16, 4);
  TEST_VLD1(vector, buffer, q, float, f, 16, 8);
#endif

Which reminded me that:
- on ARM (AArch32), float16x4_t is supported, but float16x8_t isn't yet
- on AArch64, -mfp16-format=ieee is rejected, and I didn't see a
similar option in the doc

What do you prefer me to do for these tests? I can think of:
- do not include them at all until fp16 is fully supported on both
AArch32 and AArch64
- include only those with float16x4_t
- include both float16x4_t and float16x8_t tests, leaving float16x8_t commented out
- include both, uncommented, but do not test with -mfp16-format=ieee

Thanks,

Christophe.


 Thanks
 /Marcus

Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Siva Chandra

On Mon, Oct 6, 2014 at 5:55 PM, Jason Merrill ja...@redhat.com wrote:
 On 10/06/2014 08:50 PM, Siva Chandra wrote:
 But, the question is whether it is required to determine the parameter
 passing ABI. If there is no special marker to indicate that the user
 declared 'tor is explicitly defaulted, then GDB could (in the absence
 of other properties which make the 'tor non-trivial) incorrectly
 conclude that the 'tor is user defined, and hence non-trivial.

 I've been thinking that we should just mark the 'tor as trivial or not
 directly rather than hint at it.  Does GDB have enough information to
 determine triviality if we just add defaulted info?

Barring some incompleteness, for which patches are very close to
getting committed, I believe GDB has the rest of the information.
After those patches are committed, the algorithm used by GDB to
determine whether a value is returned in a hidden param or not is as
follows:

1. If the value is of a dynamic class (as in, has virtual bases or
virtual functions), return in hidden param.
2. Else, go over all methods that are found in the DWARF:
2a. If a method is marked artificial, ignore it.
2b. If the method is a copy-constructor or a destructor, conclude
that a pointer to the value is to be returned in the hidden first
param.
 This is because the presence of a copy-ctor or dtor which is
not artificial indicates that it was user declared and not
implicit.
3. If a decision was not made in 2, do 1 and 2 for base class
subobjects and non-static members.
4. If a decision was not made in 3, then conclude that it should not
be passed in a hidden param.

If an explicitly defaulted copy-ctor or dtor is not marked as such,
step 2 is broken.
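
The steps above can be sketched roughly as follows. This is only an illustration in Python, not GDB's actual code: the ClassInfo and Method records and all of their field names are hypothetical stand-ins for what a debugger would read from the DWARF.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Method:
    # Hypothetical summary of one member function entry in the DWARF.
    name: str
    artificial: bool = False    # entry carries DW_AT_artificial
    is_copy_ctor: bool = False
    is_dtor: bool = False


@dataclass
class ClassInfo:
    # Hypothetical summary of one class type as described by the DWARF.
    name: str
    dynamic: bool = False       # has virtual bases or virtual functions
    methods: List[Method] = field(default_factory=list)
    bases: List["ClassInfo"] = field(default_factory=list)
    members: List["ClassInfo"] = field(default_factory=list)  # class-typed non-static members


def returned_in_hidden_param(cls: ClassInfo) -> bool:
    # Step 1: dynamic classes are always returned via the hidden param.
    if cls.dynamic:
        return True
    # Step 2: a non-artificial copy-ctor or dtor is taken as user declared.
    for method in cls.methods:
        if method.artificial:
            continue  # step 2a
        if method.is_copy_ctor or method.is_dtor:
            return True  # step 2b
    # Step 3: repeat steps 1 and 2 for base subobjects and non-static members.
    for sub in cls.bases + cls.members:
        if returned_in_hidden_param(sub):
            return True
    # Step 4: nothing forced the hidden param.
    return False
```

In this model an explicitly defaulted copy-ctor shows up as a non-artificial method, so step 2b fires for it exactly as it would for a user-provided one; that is the case a defaulted marker would let the debugger tell apart.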

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin

On 07 Oct 15:06, Jakub Jelinek wrote:
 Still have issues with the non-installed testing.

The idea was that the offload compiler should be installed.

 If I add
 -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
 -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/

Yes, since lto-wrapper uses COMPILER_PATH + /accel/target/ to find
mkoffload, it requires that the offload compiler and mkoffload be installed.
Probably, it can be extended to search in the build paths, specified by
--enable-offload-targets option.
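
To illustrate the lookup described above, here is a minimal sketch in Python (lto-wrapper itself is C, and this is not its code). It assumes COMPILER_PATH is a list of directories joined by the platform path separator; the function name find_mkoffload and the is_executable parameter are made up for the example.

```python
import os
from typing import Callable, Optional


def _is_executable(path: str) -> bool:
    # Default check: a regular file that the current user may execute.
    return os.path.isfile(path) and os.access(path, os.X_OK)


def find_mkoffload(compiler_path: str,
                   offload_target: str,
                   is_executable: Callable[[str], bool] = _is_executable) -> Optional[str]:
    """Return the first <dir>/accel/<offload_target>/mkoffload that is executable.

    compiler_path is searched left to right; None means nothing was found.
    """
    for directory in compiler_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        candidate = os.path.join(directory, "accel", offload_target, "mkoffload")
        if is_executable(candidate):
            return candidate
    return None
```

With only installed directories on COMPILER_PATH, a build-tree mkoffload is never a candidate, which is why the extra -B options were needed in the report above.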

 to the command line so it at least finds mkoffload, it then can't find for
 some reason the offload compiler:

mkoffload itself also wants the offload compiler with the correct name
(host-accel-target-gcc).  It can be extended to use xgcc, but I don't know
how to construct all the paths for it (-B, -I, -L).

 So, what exactly should be added (by libgomp.exp) so that the testing succeeds
 in the case of non-installed offload and non-installed host compilers?

It looks like supporting a non-installed offload compiler requires some complications.
Is this really necessary?

Thanks,
  -- Ilya

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Thomas Schwinge

Hi!

On Tue, 7 Oct 2014 17:51:53 +0400, Ilya Verbin iver...@gmail.com wrote:
 On 07 Oct 15:06, Jakub Jelinek wrote:
  Still have issues with the non-installed testing.
 
 The idea was that the offload compiler should be installed.
 
  If I add
  -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
  -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
 
 Yes, since lto-wrapper uses COMPILER_PATH + /accel/target/ to find
 mkoffload, it requires that the offload compiler with mkoffload are installed.
 Probably, it can be extended to search in the build paths, specified by
 --enable-offload-targets option.
 
  to the command line so it at least finds mkoffload, it then can't find for
  some reason the offload compiler:
 
 mkoffload itself also wants the offload compiler with correct name
 (host-accel-target-gcc).  It can be extended to use xgcc.  But I don't know,
 how to construct all paths for it (-B, -I, -L)?

For what it's worth, I first build accel-nvptx GCC (in
$T/build-gcc-accel-nvptx/), then normal GCC ($PWD, that is, in
$T/build-gcc/), and use the following steps to make offloading work for
build-tree testing of both GCC builds:

[...]
mkdir -p gcc/accel/nvptx-none
ln -vsf \
  $T/build-gcc-accel-nvptx/gcc/lto1 \
  $T/build-gcc-accel-nvptx/gcc/mkoffload \
  $T/build-gcc-accel-nvptx/gcc/xgcc \
  gcc/accel/nvptx-none/
cat > gcc/x86_64-unknown-linux-gnu-accel-nvptx-none-gcc <<\EOF
#! /bin/sh
set -e
d=$(dirname "$0")
"$d"/accel/nvptx-none/xgcc -B"$d"/accel/nvptx-none/ "$@"
EOF
chmod +x gcc/x86_64-unknown-linux-gnu-accel-nvptx-none-gcc
[...]

  So, what exactly should be added (by libgomp.exp) so that the testing succeeds
  in the case of non-installed offload and non-installed host compilers?
 
 Looks like, that non-installed offload compiler requires some complications.
 Is this really necessary?


Grüße,
 Thomas



Re: [PATCH 2/2] PR debug/63240 Add DWARF representation for C++11 defaulted member function.

2014-10-07 Thread Siva Chandra

On Tue, Oct 7, 2014 at 4:05 AM, Mark Wielaard m...@redhat.com wrote:
 To be honest my original patches for deleted/defaulted markers on
 special member functions were really just meant to give the consumer a
 way to know why GCC produced a declaration in the first place. Which I
 still think is useful information for the consumer to have, but
 certainly not enough to solve the abi problem with inferior function
 calls Siva was seeing. Maybe GDB has enough information/smarts, but I
 don't think other consumers have. So an explicit trivial/non-trivial
 marker on special member functions seems like a good idea.

 But looking at the definition of trivial copy constructor and trivial
 destructor they do look more like class concepts instead of individual
 constructor/destructor concepts (since they rely on properties of other
 members and the base class). Currently GCC doesn't output declarations
 unless the user declares them. So an implicit copy constructor or
 destructor doesn't get a DWARF class member declaration. But I don't
 think a consumer can conclude just from that fact that the copy
 constructor or destructor is trivial. Nor can it assume they are
 non-trivial just because they are represented in DWARF. So should
 we always output them and add a flag value to indicate
 (non-)triviality? Or should we add attributes on the class itself?

I also feel that triviality of special methods is more like a class
concept. Also, this concept is specified by the language.

 Taking a step back and looking at the actual function that is causing
 the trouble because abi/calling convention seems unclear. Which makes me
 wonder if the issue isn't actually with the DWARF declaration of the
 function that has special calling conventions. I am slightly surprised
 the special return value passed in rule isn't expressed in the mangling
 of the function name (or is it?). So the calling convention needs to be
 interpreted from the DWARF representation. We already add a synthetic
 formal parameter for this if necessary to be passed in. Why don't we
 just add a similar synthetic return formal parameter if that is how
 the function is really being invoked? That seems like a more direct way
 to solve the inferior function call issue.

Triviality (or not) is specified by the language. Similarly, the
'this' pointer is specified by the language. However, function calling
convention is specified by the ABI. ISTR that DWARF cannot/should not
describe the ABI. Maybe I am wrong, but if it is indeed possible to
specify the ABI in DWARF, then I agree that it probably is the best
solution for the function call issue.

Thanks,
Siva Chandra

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek

On Tue, Oct 07, 2014 at 05:51:53PM +0400, Ilya Verbin wrote:
 On 07 Oct 15:06, Jakub Jelinek wrote:
  Still have issues with the non-installed testing.
 
 The idea was that the offload compiler should be installed.
 
  If I add
  -B /usr/src/gcc-git/objinst/usr/local/lib/gcc/x86_64-pc-linux-gnu/5.0.0/ \
  -B /usr/src/gcc-git/objinst/usr/local/libexec/gcc/x86_64-pc-linux-gnu/5.0.0/
 
 Yes, since lto-wrapper uses COMPILER_PATH + /accel/target/ to find
 mkoffload, it requires that the offload compiler with mkoffload are installed.
 Probably, it can be extended to search in the build paths, specified by
 --enable-offload-targets option.
 
  to the command line so it at least finds mkoffload, it then can't find for
  some reason the offload compiler:
 
 mkoffload itself also wants the offload compiler with correct name
 (host-accel-target-gcc).  It can be extended to use xgcc.  But I don't know,
 how to construct all paths for it (-B, -I, -L)?
 
  So, what exactly should be added (by libgomp.exp) so that the testing succeeds
  in the case of non-installed offload and non-installed host compilers?
 
 Looks like, that non-installed offload compiler requires some complications.
 Is this really necessary?

I think it is useful.  It doesn't have to be in the initial checkin, but I'd
certainly prefer it if everything needed for testing were figured out from
the (optional) --enable-offload-targets argument.
And if mkoffload isn't flexible enough to be convinced to find it in that
scenario, it had better be made more flexible.

Another thing I've noticed, when target-1.exe is built, there are tons of
sections that IMHO should have been stripped away:

  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        0000000000400238 000238 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            0000000000400254 000254 000020 00   A  0   0  4
  [ 3] .hash             HASH            0000000000400278 000278 000094 04   A  4   0  8
  [ 4] .dynsym           DYNSYM          0000000000400310 000310 0001b0 18   A  5   1  8
  [ 5] .dynstr           STRTAB          00000000004004c0 0004c0 000189 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          000000000040064a 00064a 000024 02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         0000000000400670 000670 000070 00   A  5   2  8
  [ 8] .rela.dyn         RELA            00000000004006e0 0006e0 000018 18   A  4   0  8
  [ 9] .rela.plt         RELA            00000000004006f8 0006f8 000150 18   A  4  11  8
  [10] .init             PROGBITS        0000000000400848 000848 00001a 00  AX  0   0  4
  [11] .plt              PROGBITS        0000000000400870 000870 0000f0 10  AX  0   0 16
  [12] .text             PROGBITS        0000000000400960 000960 000b44 00  AX  0   0 16
  [13] .fini             PROGBITS        00000000004014a4 0014a4 000009 00  AX  0   0  4
  [14] .rodata           PROGBITS        00000000004014b0 0014b0 000020 00   A  0   0  8
  [15] .eh_frame_hdr     PROGBITS        00000000004014d0 0014d0 000094 00   A  0   0  4
  [16] .eh_frame         PROGBITS        0000000000401568 001568 00032c 00   A  0   0  8
  [17] .init_array       INIT_ARRAY      0000000000601dd8 001dd8 000010 00  WA  0   0  8
  [18] .fini_array       FINI_ARRAY      0000000000601de8 001de8 000008 00  WA  0   0  8
  [19] .jcr              PROGBITS        0000000000601df0 001df0 000008 00  WA  0   0  8
  [20] .dynamic          DYNAMIC         0000000000601df8 001df8 000200 10  WA  5   0  8
  [21] .got              PROGBITS        0000000000601ff8 001ff8 000008 08  WA  0   0  8
  [22] .got.plt          PROGBITS        0000000000602000 002000 000088 08  WA  0   0  8
  [23] .data             PROGBITS        00000000006020a0 0020a0 000120 00  WA  0   0 32
  [24] .offload_image_section PROGBITS        00000000006021c0 0021c0 003439 00  WA  0   0 16
  [25] __gnu_offload_funcs PROGBITS        0000000000605600 005600 000018 00  WA  0   0  8
  [26] __gnu_offload_vars PROGBITS        0000000000605618 005618 000010 00  WA  0   0  8
  [27] .bss              NOBITS          0000000000605628 005628 000008 00  WA  0   0  4
  [28] .comment          PROGBITS        0000000000000000 005628 000055 01  MS  0   0  1
  [29] .gnu.target_lto_.profile.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 00567d 000014 00      0   0  1
  [30] .gnu.target_lto_.jmpfuncs.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 005691 000028 00      0   0  1
  [31] .gnu.target_lto_.inline.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 0056b9 000130 00      0   0  1
  [32] .gnu.target_lto_.pureconst.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 0057e9 00001d 00      0   0  1
  [33] .gnu.target_lto_fn2._omp_fn.1.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 005806 0005fc 00      0   0  1
  [34] .gnu.target_lto_fn2._omp_fn.0.3e3ce5aae4e95dd4 PROGBITS        0000000000000000 005e02 000765 00      0   0  1

[jit] Use the full name of the installed driver binary

2014-10-07 Thread David Malcolm

On Fri, 2014-09-26 at 21:55 +0000, Joseph S. Myers wrote:
On Thu, 25 Sep 2014, David Malcolm wrote:
 
  Should this have the $(exeext) suffix seen in Makefile.in?
$(target_noncanonical)-gcc-$(version)$(exeext)
 
 Depends on whether that's needed for the pex code to find it.
  As for (B), would it make sense to bake in the path to the binary into
  the pex invocation, and hence to turn off PEX_SEARCH?  If so, presumably
  I need to somehow expand the Makefile's value of $(bindir) into
  internal-api.c, right?  (I tried this in configure.ac, but merely got
  $(exec_prefix)/bin iirc).
 
 An installation must be relocatable.  Thus, you can't just hardcode 
 looking in the configured prefix; you'd need to locate it relative to 
 libgccjit.so in some way (i.e. using make_relative_prefix, but I don't 
 know offhand how libgccjit.so would locate itself).
 
  A better long-term approach to this would be to extract the spec
  machinery from gcc.c (perhaps into a libdriver.a?) and run it directly
  from the jit library - but that's a rather involved patch, I suspect.
 
 And you'd still need libgccjit.so to locate itself for proper 
 relocatability in finding other pieces such as assembler and linker.
 
  I wonder if the appropriate approach here is to have a single library
  with multiple plugin backends e.g. one for the CPU, one for each GPU
  family, with the ability to load multiple backends at once.
 
 If you can get that working, sure.
 
  Unfortunately, backend is horribly overloaded here - I mean basically
  all of gcc here, everything other than the libgccjit.h API seen by
  client code.
 
 (Though preferably as much as possible could be shared, i.e. properly 
 define the parts of GCC that need building separately for each target and 
 limit them as much as possible.  Joern's multi-target patches from 2010 
 that selectively built parts of GCC using namespaces while sharing others 
 without an obvious clear separation seemed very fragile.  For something 
 robust you either build everything separately for each target, or have a 
 well-defined separation between bits needing building separately and bits 
 that can be built once and ways to avoid non-obvious target dependencies 
 in bits built once.)

I've been experimenting with directly embedding the gcc.c driver code
in-process, but that patch was getting unwieldy, so for now, I'm going
with the simpler approach: just call the driver out-of-process,
specifying the full installed name:
  $(target_noncanonical)-gcc-$(version)$(exeext)
as expanded at configuration time, requiring it to be on the PATH.
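
As a rough illustration of that approach (a Python sketch only; the real code is C++ in jit/internal-api.c and goes through pex), the name is assembled from the three configure-time pieces and then resolved through PATH. The helper names driver_name and find_driver are made up for this example, and the target and version values in the usage note are just sample inputs.

```python
import shutil
from typing import Optional


def driver_name(target_noncanonical: str, version: str, exeext: str = "") -> str:
    # Mirrors $(target_noncanonical)-gcc-$(version)$(exeext).
    return f"{target_noncanonical}-gcc-{version}{exeext}"


def find_driver(name: str, path: Optional[str] = None) -> Optional[str]:
    # shutil.which searches the given path string, or the PATH environment
    # variable when path is None; it returns None if nothing matches.
    return shutil.which(name, path=path)
```

For example, driver_name("x86_64-unknown-linux-gnu", "5.0.0") gives "x86_64-unknown-linux-gnu-gcc-5.0.0", and find_driver on that name only succeeds if the installed bindir is on PATH, which is the requirement the testsuite change below works around.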

Hopefully this addresses the last of the concerns raised in your initial
review; I'll do some more testing and then try to resubmit to the list
(I'm also thinking about breaking up internal-api.c/h, as they've become
rather large, into jit-recording/jit-playback.c/h)

Committed to branch dmalcolm/jit:

gcc/ChangeLog.jit:
* Makefile.in (site.exp): When constructing site.exp, add a line
to set bindir.
* configure.ac: Generate a gcc-driver-name.h file containing
GCC_DRIVER_NAME for the benefit of jit/internal-api.c.
* configure: Regenerate.

gcc/jit/ChangeLog.jit:
* docs/internals/index.rst
(Using a working copy without installing): Rename to...
(Using a working copy without installing every time): ...this, and
update to reflect the need to have installed the driver binary
when running directly from a build directory.
(Running the test suite): Add PATH setting to the example.
* docs/intro/install.rst (Hello world): Likewise.
* internal-api.c: Include new autogenerated header
gcc-driver-name.h.
(gcc::jit::playback::context::compile): Rather than looking for a
gcc on the path, look for GCC_DRIVER_NAME from gcc-driver-name.h,
as created by the configure script, so that we are using one for
the correct target.

gcc/testsuite/ChangeLog.jit:
* jit.dg/jit.exp (jit-dg-test): Prepend the installed bindir to
the PATH before invoking built binaries using the library, so that
the library can find the driver.  Restore the PATH immediately
afterwards.
---
 gcc/ChangeLog.jit                |  8
 gcc/Makefile.in                  |  1
 gcc/configure                    |  6
 gcc/configure.ac                 |  6
 gcc/jit/ChangeLog.jit            | 16
 gcc/jit/docs/internals/index.rst | 64
 gcc/jit/docs/intro/install.rst   | 36
 gcc/jit/internal-api.c           | 10
 gcc/testsuite/ChangeLog.jit      |  7
 gcc/testsuite/jit.dg/jit.exp     | 14
 10 files changed, 147 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index e71f7c4..ca73c04 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,11 @@
+2014-10-07  David Malcolm  dmalc...@redhat.com
+
+

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 2:53 PM, Marek Polacek pola...@redhat.com wrote:
 PR59717 is a request for hints which header to include if the compiler warns
 about incompatible implicit declarations.  E.g., if one uses abort
 without declaring it first, we now say
 note: include ‘stdlib.h’ or provide a declaration of ‘abort’
 I've added hints only for standard functions which means we won't display
 the hint for functions such as mempcpy.

 The implementation is based on a function that just maps built_in_function
 codes to header names.
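
The shape of that mapping can be sketched as below. This is a Python illustration only, keyed on function names rather than built_in_function codes for brevity; the table is a small subset, the helper names are invented for the example, and the note text follows the wording quoted above.

```python
from typing import Optional

# Small illustrative subset; the patch covers many more standard functions.
_HEADER_FOR_FUNCTION = {
    "abort": "stdlib.h",
    "memchr": "string.h",
    "memcmp": "string.h",
    "memcpy": "string.h",
    "memmove": "string.h",
    "cos": "math.h",
    "sqrt": "math.h",
    "cabs": "complex.h",
}


def header_for_function(name: str) -> Optional[str]:
    # Non-standard functions such as mempcpy are absent, so no hint is given.
    return _HEADER_FOR_FUNCTION.get(name)


def header_hint(name: str) -> Optional[str]:
    header = header_for_function(name)
    if header is None:
        return None
    return f"include ‘{header}’ or provide a declaration of ‘{name}’"
```

A table like this keeps the data in one place; the switch in the patch expresses the same mapping, with CASE_FLT_FN covering the float, double and long double variants of each math function.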

 Two remarks:
 * header_for_builtin_fn is long and I don't want to unnecessarily
   inflate already big c-decl.c file, so it might make sense to move
   the function into c-errors.c;
 * we don't issue incompatible implicit declaration of built-in function
   warning for functions that return int and whose parameter types don't need
   default promotions - for instance putc, fputs, ilogb, strcmp, vprintf, 
   isnan, isalpha, ...  Therefore for such functions we don't print the hint
   either.
   header_for_builtin_fn is ready for them, though.  (The cases for ctype.h
   and wctype.h could be removed.)

 Bootstrapped/regtested on x86_64-linux, ok for trunk?

Why not annotate builtins.def with the info?

Richard.

 2014-10-07  Marek Polacek  pola...@redhat.com

 PR c/59717
 * c-decl.c (header_for_builtin_fn): New function.
 (implicitly_declare): Suggest which header to include.

 * gcc.dg/pr59717.c: New test.

 diff --git gcc/c/c-decl.c gcc/c/c-decl.c
 index ce5a8de..e23284a 100644
 --- gcc/c/c-decl.c
 +++ gcc/c/c-decl.c
 @@ -2968,6 +2968,189 @@ implicit_decl_warning (location_t loc, tree id, tree olddecl)
  }
  }

 +/* This function represents mapping of a function code FCODE
 +   to its respective header.  */
 +
 +static const char *
 +header_for_builtin_fn (enum built_in_function fcode)
 +{
 +  switch (fcode)
 +{
 +CASE_FLT_FN (BUILT_IN_ACOS):
 +CASE_FLT_FN (BUILT_IN_ACOSH):
 +CASE_FLT_FN (BUILT_IN_ASIN):
 +CASE_FLT_FN (BUILT_IN_ASINH):
 +CASE_FLT_FN (BUILT_IN_ATAN):
 +CASE_FLT_FN (BUILT_IN_ATANH):
 +CASE_FLT_FN (BUILT_IN_ATAN2):
 +CASE_FLT_FN (BUILT_IN_CBRT):
 +CASE_FLT_FN (BUILT_IN_CEIL):
 +CASE_FLT_FN (BUILT_IN_COPYSIGN):
 +CASE_FLT_FN (BUILT_IN_COS):
 +CASE_FLT_FN (BUILT_IN_COSH):
 +CASE_FLT_FN (BUILT_IN_ERF):
 +CASE_FLT_FN (BUILT_IN_ERFC):
 +CASE_FLT_FN (BUILT_IN_EXP):
 +CASE_FLT_FN (BUILT_IN_EXP2):
 +CASE_FLT_FN (BUILT_IN_EXPM1):
 +CASE_FLT_FN (BUILT_IN_FABS):
 +CASE_FLT_FN (BUILT_IN_FDIM):
 +CASE_FLT_FN (BUILT_IN_FLOOR):
 +CASE_FLT_FN (BUILT_IN_FMA):
 +CASE_FLT_FN (BUILT_IN_FMAX):
 +CASE_FLT_FN (BUILT_IN_FMIN):
 +CASE_FLT_FN (BUILT_IN_FMOD):
 +CASE_FLT_FN (BUILT_IN_FREXP):
 +CASE_FLT_FN (BUILT_IN_HYPOT):
 +CASE_FLT_FN (BUILT_IN_ILOGB):
 +CASE_FLT_FN (BUILT_IN_LDEXP):
 +CASE_FLT_FN (BUILT_IN_LGAMMA):
 +CASE_FLT_FN (BUILT_IN_LLRINT):
 +CASE_FLT_FN (BUILT_IN_LLROUND):
 +CASE_FLT_FN (BUILT_IN_LOG):
 +CASE_FLT_FN (BUILT_IN_LOG10):
 +CASE_FLT_FN (BUILT_IN_LOG1P):
 +CASE_FLT_FN (BUILT_IN_LOG2):
 +CASE_FLT_FN (BUILT_IN_LOGB):
 +CASE_FLT_FN (BUILT_IN_LRINT):
 +CASE_FLT_FN (BUILT_IN_LROUND):
 +CASE_FLT_FN (BUILT_IN_MODF):
 +CASE_FLT_FN (BUILT_IN_NAN):
 +CASE_FLT_FN (BUILT_IN_NEARBYINT):
 +CASE_FLT_FN (BUILT_IN_NEXTAFTER):
 +CASE_FLT_FN (BUILT_IN_NEXTTOWARD):
 +CASE_FLT_FN (BUILT_IN_POW):
 +CASE_FLT_FN (BUILT_IN_REMAINDER):
 +CASE_FLT_FN (BUILT_IN_REMQUO):
 +CASE_FLT_FN (BUILT_IN_RINT):
 +CASE_FLT_FN (BUILT_IN_ROUND):
 +CASE_FLT_FN (BUILT_IN_SCALBLN):
 +CASE_FLT_FN (BUILT_IN_SCALBN):
 +CASE_FLT_FN (BUILT_IN_SIN):
 +CASE_FLT_FN (BUILT_IN_SINH):
 +CASE_FLT_FN (BUILT_IN_SINCOS):
 +CASE_FLT_FN (BUILT_IN_SQRT):
 +CASE_FLT_FN (BUILT_IN_TAN):
 +CASE_FLT_FN (BUILT_IN_TANH):
 +CASE_FLT_FN (BUILT_IN_TGAMMA):
 +CASE_FLT_FN (BUILT_IN_TRUNC):
 +case BUILT_IN_ISINF:
 +case BUILT_IN_ISNAN:
 +  return "math.h";
 +CASE_FLT_FN (BUILT_IN_CABS):
 +CASE_FLT_FN (BUILT_IN_CACOS):
 +CASE_FLT_FN (BUILT_IN_CACOSH):
 +CASE_FLT_FN (BUILT_IN_CARG):
 +CASE_FLT_FN (BUILT_IN_CASIN):
 +CASE_FLT_FN (BUILT_IN_CASINH):
 +CASE_FLT_FN (BUILT_IN_CATAN):
 +CASE_FLT_FN (BUILT_IN_CATANH):
 +CASE_FLT_FN (BUILT_IN_CCOS):
 +CASE_FLT_FN (BUILT_IN_CCOSH):
 +CASE_FLT_FN (BUILT_IN_CEXP):
 +CASE_FLT_FN (BUILT_IN_CIMAG):
 +CASE_FLT_FN (BUILT_IN_CLOG):
 +CASE_FLT_FN (BUILT_IN_CONJ):
 +CASE_FLT_FN (BUILT_IN_CPOW):
 +CASE_FLT_FN (BUILT_IN_CPROJ):
 +CASE_FLT_FN (BUILT_IN_CREAL):
 +CASE_FLT_FN (BUILT_IN_CSIN):
 +CASE_FLT_FN (BUILT_IN_CSINH):
 +CASE_FLT_FN (BUILT_IN_CSQRT):
 +CASE_FLT_FN (BUILT_IN_CTAN):
 +CASE_FLT_FN (BUILT_IN_CTANH):
 +  return "complex.h";
 +case BUILT_IN_MEMCHR:
 +case BUILT_IN_MEMCMP:
 +case BUILT_IN_MEMCPY:
 +case BUILT_IN_MEMMOVE:

Re: [PATCH] PR lto/59441 Add initialization and release of bitmap obstack

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 2:55 PM, Ilya Palachev i.palac...@samsung.com wrote:
 Hi all,

 Attached patch fixes PR lto/59441.
 The reason for the failure was that the default bitmap obstack was released just
 before the execution of early local passes.
 The error was found using valgrind. It reported that there were 153 invalid
 reads and 173 invalid writes into the field of the default bitmap obstack
 structure,
 and all of them were trying to access data that was free'd previously (at
 the same point of the program).

 The solution is to add initialization and release of the bitmap obstack
 before and after the execution of early local passes.
 After applying this patch valgrind does not report any errors for the same
 testcase.

 The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.

 Ok for trunk?

Ok.

Thanks,
Richard.

 Best regards,
 Ilya Palachev

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek

Hi!

Also, something that I believe has been discussed in the past, but can't
find it on your wiki page nor in *.opt, are option overrides for the
offloading target, i.e. some option you can pass to the host compiler driver
during linking that will tell the driver for which offloading targets (if
any at all) to produce the offloading support (defaulting to all configured
offloading targets is fine) and optionally what extra options beyond what has
been passed on the command line should be passed to the offloading compiler.

Say, if I want to link target-1.exe such that it will only support host
fallback and not x86_64-intelmicemul-linux-gnu, how do I achieve that now?

Jakub

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin

On 07 Oct 16:30, Jakub Jelinek wrote:
 Another thing I've noticed, when target-1.exe is built, there are tons of
 sections that IMHO should have been stripped away:

Could you please re-checkout the branch?  I fixed this issue a week ago.

Thanks,
  -- Ilya

Re: [patch] Fix miscompilation of gnat1 in LTO bootstrap

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 10:04 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Testcase?  I think it would be better to handle this in the canonical type
 merging code in lto.c - or how does it end up working without LTO?  That is,
 what does the Ada frontend do to make sure get_alias_set handles this
 correctly?

 It manages the alias sets, see gcc-interface/utils.c:relate_alias_sets.

Ugh :/

I can't see how this can work with LTO.  We need a middle-end way
to represent the alias relation of those types.  At least I can't see how
your simple patch covers all cases here?

With LTO we preserve TYPE_ALIAS_SET == 0, so another way to
fix this (and which I'd like more) is to do your patch in the Ada frontend,
that is, use alias-set zero for all types you relate if flag_lto.

Another way is to make LTO canonical type merging handle the
case of type_contains_placeholder_p better, that is by treating
two types with those equivalent more easily.  For arrays this simply
means hashing and comparing non-constant TYPE_DOMAIN the
same / as equal.  There is already some code handling PLACEHOLDER_EXPR
specially, but it doesn't seem to be enough (why in this case)?

Thanks,
Richard.

 --
 Eric Botcazou

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek

On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
 Why not annotate builtins.def with the info?

Because I think that would be more hairy, I'd have to change DEF_BUILTIN
and all the builtins.  That seemed superfluous given that this hint is
only for a C FE...

Marek

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jakub Jelinek

On Tue, Oct 07, 2014 at 04:51:31PM +0200, Marek Polacek wrote:
 On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
  Why not annotate builtins.def with the info?
 
 Because I think that would be more hairy, I'd have to change DEF_BUILTIN
 and all the builtins.  That seemed superfluous given that this hint is
 only for a C FE...

Guess it depends on how many DEF_*_BUILTIN classes this would affect,
if just a couple, you could add DEF_*_BUILTIN_WITH_C_HINT, with an extra
arg.  But as the builtins.def info already has quite long lines, making them
even longer might not be best.  So perhaps the switch is good enough.

Jakub

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Richard Biener

On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek pola...@redhat.com wrote:
 On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
 Why not annotate builtins.def with the info?

 Because I think that would be more hairy, I'd have to change DEF_BUILTIN
 and all the builtins.  That seemed superfluous given that this hint is
 only for a C FE...

All C family frontends, no?  And builtins.def is used by (and only by)
all C family frontends...

Well - just a suggestion ;)

I'd like to see some easier to grok specification of the number of arguments
expected to the builtins for example (for the match-and-simplify work).

Richard.

 Marek

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jakub Jelinek

On Tue, Oct 07, 2014 at 05:00:26PM +0200, Richard Biener wrote:
 On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek pola...@redhat.com wrote:
  On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
  Why not annotate builtins.def with the info?
 
  Because I think that would be more hairy, I'd have to change DEF_BUILTIN
  and all the builtins.  That seemed superfluous given that this hint is
  only for a C FE...
 
 All C family frontends, no?  And builtins.def is used by (and only by)
 all C family frontends...

Well, the C++ FE on say:
void
bar (void)
{
  abort ();
}

just errors out:
/tmp/a.c: In function ‘void bar()’:
/tmp/a.c:4:10: error: ‘abort’ was not declared in this scope
   abort ();
  ^
adding a hint in this case is less obvious than in the C case, because,
what if this wasn't supposed to be ::abort (), but std::abort (), or
some other namespace abort, or some class abort () method etc.?

Jakub

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek

On Tue, Oct 07, 2014 at 05:00:05PM +0200, Jakub Jelinek wrote:
 On Tue, Oct 07, 2014 at 04:51:31PM +0200, Marek Polacek wrote:
  On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
   Why not annotate builtins.def with the info?
  
  Because I think that would be more hairy, I'd have to change DEF_BUILTIN
  and all the builtins.  That seemed superfluous given that this hint is
  only for a C FE...
 
 Guess it depends on how many DEF_*_BUILTIN classes would this affect,

At least the following:
DEF_LIB_BUILTIN
DEF_C94_BUILTIN
DEF_C99_BUILTIN
DEF_C11_BUILTIN
DEF_C99_COMPL_BUILTIN
DEF_C99_C90RES_BUILTIN
I think that is quite a lot.

 if just a couple, you could add DEF_*_BUILTIN_WITH_C_HINT, with an extra
 arg.  But as the builtins.def info already has quite long lines, making them
 even longer might not be best.  So perhaps the switch is good enough.

Yeah, that the lines are long enough already was one of the things
that discouraged me from tweaking builtins.def.

Marek

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Marek Polacek

On Tue, Oct 07, 2014 at 05:00:26PM +0200, Richard Biener wrote:
 On Tue, Oct 7, 2014 at 4:51 PM, Marek Polacek pola...@redhat.com wrote:
  On Tue, Oct 07, 2014 at 04:39:55PM +0200, Richard Biener wrote:
  Why not annotate builtins.def with the info?
 
  Because I think that would be more hairy, I'd have to change DEF_BUILTIN
  and all the builtins.  That seemed superfluous given that this hint is
  only for a C FE...
 
 All C family frontends, no?  And builtins.def is used by (and only by)
 all C family frontends...

As Jakub pointed out, only C and ObjC for now.
 
 Well - just a suggestion ;)

Thanks - builtins.def was where I originally started.

Marek

Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes

2014-10-07 Thread Ilya Enkovich

2014-10-03 23:59 GMT+04:00 Jeff Law l...@redhat.com:
 On 10/03/14 02:50, Ilya Enkovich wrote:

 Attached is an updated version of the patch.  It has disabled
 instrumentation for builtin calls.

 Thanks,
 Ilya
 --
 gcc/

 2014-10-02  Ilya Enkovich  ilya.enkov...@intel.com

 * tree-chkp.c: New.
 * tree-chkp.h: New.
 * rtl-chkp.c: New.
 * rtl-chkp.h: New.
 * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
 (GTFILES): Add tree-chkp.c.
 * c-family/c.opt (fchkp-check-incomplete-type): New.
 (fchkp-zero-input-bounds-for-main): New.
 (fchkp-first-field-has-own-bounds): New.
 (fchkp-narrow-bounds): New.
 (fchkp-narrow-to-innermost-array): New.
 (fchkp-optimize): New.
 (fchkp-use-fast-string-functions): New.
 (fchkp-use-nochk-string-functions): New.
 (fchkp-use-static-bounds): New.
 (fchkp-use-static-const-bounds): New.
 (fchkp-treat-zero-dynamic-size-as-infinite): New.
 (fchkp-check-read): New.
 (fchkp-check-write): New.
 (fchkp-store-bounds): New.
 (fchkp-instrument-calls): New.
 (fchkp-instrument-marked-only): New.
 * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
 __CHKP__ macro when Pointer Bounds Checker is on.
 * passes.def (pass_ipa_chkp_versioning): New.
 (pass_early_local_passes): Removed.
 (pass_build_ssa_passes): New.
 (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
 (pass_chkp_instrumentation_passes): New.
 (pass_ipa_chkp_produce_thunks): New.
 (pass_local_optimization_passes): New.
 (pass_chkp_opt): New.
 * toplev.c: include tree-chkp.h.
 (compile_file): Add chkp_finish_file call.
 * tree-pass.h (make_pass_ipa_chkp_versioning): New.
 (make_pass_ipa_chkp_produce_thunks): New.
 (make_pass_chkp): New.
 (make_pass_chkp_opt): New.
 (make_pass_early_local_passes): Removed.
 (make_pass_build_ssa_passes): New.
 (make_pass_chkp_instrumentation_passes): New.
 (make_pass_local_optimization_passes): New.
 * tree.h (called_as_built_in): New.
 * builtins.c (called_as_built_in): Not static anymore.
 * passes.c (pass_manager::execute_early_local_passes): Execute
 early passes in three steps.
 (execute_all_early_local_passes): Removed.
 (pass_data_early_local_passes): Removed.
 (pass_early_local_passes): Removed.
 (execute_build_ssa_passes): New.
 (pass_data_build_ssa_passes): New.
 (pass_build_ssa_passes): New.
 (pass_data_chkp_instrumentation_passes): New.
 (pass_chkp_instrumentation_passes): New.
 (pass_data_local_optimization_passes): New.
 (pass_local_optimization_passes): New.
 (make_pass_early_local_passes): Removed.
 (make_pass_build_ssa_passes): New.
 (make_pass_chkp_instrumentation_passes): New.
 (make_pass_local_optimization_passes): New.

 gcc/testsuite

 2014-10-02  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc.dg/pr37858.c: Replace early_local_cleanups pass name
 with build_ssa_passes.

 General question.  At the RTL level you represent the bounds with an RTX
 which is perfectly reasonable.  What are the structure sharing assumptions
 of those values?  Do they follow the existing RTL structure sharing
 assumptions?

 Minor nit 2014 in the copyright year for all these files ;-)

 So, for example if there are two references to the same bounds in RTL, are
 they distinct RTXs with the same underlying values?  Or is it a single rtx
 object that is shared?  It looks like you generally create new RTXs, but I'm
 a bit concerned that you might shove those things into a hash table and
 return them and embed a single reference into multiple hunks of parent RTL.

For the expander, bounds are ordinary variables and SSA names that are
expanded like all other values, so I believe the regular sharing
assumptions are followed.

Hash tables are used only to link pointer values returned by a call with
the returned bounds.  This is required to expand retbnd calls.  Similarly,
returned bounds are associated with DECL_RESULT using
SET_DECL_BOUNDS_RTL.

 mpx-9-pass.patch


 diff --git a/gcc/builtins.c b/gcc/builtins.c
 index 17754e5..78ac91f 100644
 --- a/gcc/builtins.c
 +++ b/gcc/builtins.c
 @@ -255,7 +255,7 @@ is_builtin_fn (tree decl)
  of the optimization level.  This means whenever a function is invoked
 with
  its internal name, which normally contains the prefix __builtin.
 */

 -static bool
 +bool
   called_as_built_in (tree node)
   {
 /* Note that we must use DECL_NAME, not DECL_ASSEMBLER_NAME_SET_P
 since

 Is there some reason you put the new prototype in tree.h rather than
 builtins.h?

It was added at a time when everything was in tree.h :)
Since current version doesn't work with builtins,

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Tom Tromey

 Marek == Marek Polacek pola...@redhat.com writes:

Marek [CCing java-patches now]
Marek Java testsuite breaks with -std=gnu11 as a default and/or with 
Marek -Wimplicit-function-declaration on

I don't recall how one gets warnings when compiling this generated code,
but if it is generally possible then I think this:

Marek +  if (indirect)
Marek +fprintf (stream, "extern void JvRunMainName ();\n");
Marek +  else
Marek +fprintf (stream, "extern void JvRunMain ();\n");

... will fail with -Wstrict-prototypes, since in C those should
read (void) rather than ().

If it's not possible then no big deal.
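
A minimal sketch of the distinction, using hypothetical function names
rather than the real JvRunMain declarations.  In C before C23, an empty
parameter list declares a function without a prototype, which is what
-Wstrict-prototypes warns about; "(void)" is the prototyped form.

```c
/* Unprototyped declaration (pre-C23 meaning): nothing is said about the
   parameters, so calls are not checked.  -Wstrict-prototypes warns here.  */
extern int demo_run_unprototyped ();

/* Prototype: the function explicitly takes no arguments.  */
extern int demo_run_prototyped (void);

/* Definitions, so that the example links on its own.  */
int
demo_run_unprototyped ()
{
  return 1;
}

int
demo_run_prototyped (void)
{
  return 2;
}
```

An unprototyped declaration stays compatible with definitions that take
different parameter lists, which is why it can be convenient in generated
code at the cost of losing argument checking.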

Tom

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Marek Polacek

On Tue, Oct 07, 2014 at 10:03:26AM -0600, Tom Tromey wrote:
  Marek == Marek Polacek pola...@redhat.com writes:
 
 Marek [CCing java-patches now]
 Marek Java testsuite breaks with -std=gnu11 as a default and/or with 
 Marek -Wimplicit-function-declaration on
 
 I don't recall how one gets warnings when compiling this generated code,
 but if it is generally possible then I think this:

I'm not sure I understand, but this piece of code gets compiled when
running the libjava testsuite.  And when the warning triggers, we get
many fails.
 
 Marek +  if (indirect)
 Marek +fprintf (stream, "extern void JvRunMainName ();\n");
 Marek +  else
 Marek +fprintf (stream, "extern void JvRunMain ();\n");
 
 ... will fail with -Wstrict-prototypes, since in C those should
 read (void) rather than ().
 
 If it's not possible then no big deal.

I saw declarations of JvRunMain{,Name} with no parameters and with
some parameters.  So I decided to make it prototype-less function
declaration for now.  I think we don't have to worry about
-Wstrict-prototypes for now.

Marek

[patch] tag ../include/*

2014-10-07 Thread Aldy Hernandez

Is there a reason we don't create etags for toplevel include files?  If 
not, could I please apply this patch?


Thanks.
Aldy
commit a679529d14f005d8c88517f72d2b5295d8c82f0f
Author: Aldy Hernandez al...@redhat.com
Date:   Tue Oct 7 09:32:21 2014 -0700

* Makefile.in (TAGS): Tag ../include files.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 97b439a..df43b9c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3772,6 +3772,7 @@ TAGS: lang.tags
  fi;   \
done;   \
etags -o TAGS.sub c-family/*.h c-family/*.c *.h *.c *.cc \
+ ../include/*.h \
  --language=none --regex=/\(char\|unsigned 
int\|int\|bool\|void\|HOST_WIDE_INT\|enum [A-Za-z_0-9]+\) 
[*]?\([A-Za-z_0-9]+\)/\2/ common.opt\
  --language=none 
--regex=/\(DEF_RTL_EXPR\|DEFTREECODE\|DEFGSCODE\).*(\([A-Za-z_0-9]+\)/\2/ 
rtl.def tree.def gimple.def \
  --language=none --regex=/DEFTIMEVAR (\([A-Za-z_0-9]+\)/\1/ 
timevar.def \

Re: SD-6 C++ feature-testing macros for 4.9

2014-10-07 Thread Jason Merrill


On 10/04/2014 07:28 PM, Ed Smith-Rowland wrote:

This really does build clean and test clean on x86_64-linux.
It's basically the same as for 5.0 except experimental/any isn't in and
variable templates aren't in.


OK.

Jason

Re: [Java PATCH] Generate declarations in jvgenmain.c

2014-10-07 Thread Tom Tromey

Marek I saw declarations of JvRunMain{,Name} with no parameters and with
Marek some parameters.

Oh yeah, duh.

Marek  So I decided to make it prototype-less function
Marek declaration for now.  I think we don't have to worry about
Marek -Wstrict-prototypes for now.

Thanks for looking.

Tom

Re: [Patch, MIPS] Add Octeon3 support

2014-10-07 Thread Joseph S. Myers

Patches adding new -march= values need to update invoke.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com

[patch] remove dwarf2out's current_function_has_inlines

2014-10-07 Thread Aldy Hernandez


Errr... a static that only gets written to?

OK to commit?
commit 7b1c19385fd06d6a2d0844d453bf1c7683071440
Author: Aldy Hernandez al...@redhat.com
Date:   Tue Oct 7 10:14:02 2014 -0700

* dwarf2out.c: Remove current_function_has_inlines.
(gen_subprogram_die): Same.
(gen_inlined_subroutine_die): Same.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index b5fcfa4..1b30ea9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -2954,9 +2954,6 @@ static GTY(()) unsigned int loclabel_num;
 /* Unique label counter for point-of-call tables.  */
 static GTY(()) unsigned int poc_label_num;
 
-/* Record whether the function being analyzed contains inlined functions.  */
-static int current_function_has_inlines;
-
 /* The last file entry emitted by maybe_emit_file().  */
 static GTY(()) struct dwarf_file_data * last_emitted_file;
 
@@ -18613,7 +18610,6 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
   if (DECL_NAME (DECL_RESULT (decl)))
gen_decl_die (DECL_RESULT (decl), NULL, subr_die);
 
-  current_function_has_inlines = 0;
   decls_for_scope (outer_scope, subr_die, 0);
 
   if (call_arg_locations && !dwarf_strict)
@@ -19270,7 +19266,6 @@ gen_inlined_subroutine_die (tree stmt, dw_die_ref 
context_die, int depth)
   add_call_src_coords_attributes (stmt, subr_die);
 
   decls_for_scope (stmt, subr_die, depth);
-  current_function_has_inlines = 1;
 }
 }

[PATCH] More testsuite cleanups

2014-10-07 Thread Marek Polacek

Some more cleanups revealed by testing on ppc64.

Applying to trunk.

2014-10-07  Marek Polacek  pola...@redhat.com

* gcc.dg/guality/pr41616-1.c: Use -fgnu89-inline.
* gcc.dg/iftrap-1.c: Fix implicit declarations.
* gcc.target/powerpc/pr26350.c: Likewise.
* gcc.target/powerpc/altivec-consts.c: Likewise.
* gcc.target/powerpc/altivec-varargs-1.c: Likewise.
* gcc.target/powerpc/le-altivec-consts.c: Likewise.
* gcc.target/powerpc/ppc-vector-memcpy.c: Likewise.
* gcc.target/powerpc/ppc-vector-memset.c: Likewise.
* gcc.target/powerpc/pr47862.c: Likewise.
* gcc.target/powerpc/pr48053-1.c: Likewise.
* gcc.target/powerpc/pr53487.c: Likewise.
* gcc.dg/vect/pr48765.c: Fix implicit declarations and defaulting
to int.
* gcc.target/powerpc/20050603-1.c: Fix defaulting to int.
* gcc.target/powerpc/altivec-2.c: Likewise.
* gcc.target/powerpc/pr47755-2.c: Likewise.

diff --git gcc/testsuite/gcc.dg/guality/pr41616-1.c 
gcc/testsuite/gcc.dg/guality/pr41616-1.c
index 24f64ab..fcd1ad5 100644
--- gcc/testsuite/gcc.dg/guality/pr41616-1.c
+++ gcc/testsuite/gcc.dg/guality/pr41616-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run { xfail *-*-* } } */
-/* { dg-options "-g" } */
+/* { dg-options "-g -fgnu89-inline" } */
 
 #include "guality.h"
 
diff --git gcc/testsuite/gcc.dg/iftrap-1.c gcc/testsuite/gcc.dg/iftrap-1.c
index 1427820..c6d5584 100644
--- gcc/testsuite/gcc.dg/iftrap-1.c
+++ gcc/testsuite/gcc.dg/iftrap-1.c
@@ -3,6 +3,8 @@
 /* { dg-do compile { target rs6000-*-* powerpc*-*-* sparc*-*-* ia64-*-* } } */
 /* { dg-final { scan-assembler-not ^\t(trap|ta|break)\[ \t\] } } */
 
+void bar (void);
+
 void f1(int p)
 {
   if (p)
diff --git gcc/testsuite/gcc.dg/vect/pr48765.c 
gcc/testsuite/gcc.dg/vect/pr48765.c
index 50839e3..2b2907b 100644
--- gcc/testsuite/gcc.dg/vect/pr48765.c
+++ gcc/testsuite/gcc.dg/vect/pr48765.c
@@ -33,8 +33,10 @@ static char *regs_change_size;
 static HARD_REG_SET *after_insn_hard_regs;
 static int stupid_find_reg (int, enum reg_class, enum machine_mode, int, int,
int);
+enum reg_class reg_preferred_class (int);
 void
 stupid_life_analysis (f, nregs, file)
+ int nregs, file;
  rtx f;
 {
   register int i;
@@ -52,7 +54,7 @@ stupid_life_analysis (f, nregs, file)
 static int
 stupid_find_reg (call_preserved, class, mode, born_insn, dead_insn,
 changes_size)
- int call_preserved;
+ int call_preserved, born_insn, dead_insn, changes_size;
  enum reg_class class;
  enum machine_mode mode;
 {
diff --git gcc/testsuite/gcc.target/powerpc/20050603-1.c 
gcc/testsuite/gcc.target/powerpc/20050603-1.c
index 041551b..f801c43 100644
--- gcc/testsuite/gcc.target/powerpc/20050603-1.c
+++ gcc/testsuite/gcc.target/powerpc/20050603-1.c
@@ -15,6 +15,7 @@ test_reg_save_restore (int *p)
 setlocale (LC_ALL, C);
 testreg = ext_func(p);
 }
+int
 main() {
   testreg = x;
   test_reg_save_restore (y);
diff --git gcc/testsuite/gcc.target/powerpc/altivec-2.c 
gcc/testsuite/gcc.target/powerpc/altivec-2.c
index 4f341dd..a91ac0c 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-2.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-2.c
@@ -23,6 +23,7 @@ int xxx[sizeof(foobar) == 16 ? 69 : -1];
 
 int nc17[sizeof(shoe) == sizeof (char *) ? 69 : -1];
 
+void
 code ()
 {
   *shoe = polish;
diff --git gcc/testsuite/gcc.target/powerpc/altivec-consts.c 
gcc/testsuite/gcc.target/powerpc/altivec-consts.c
index 2afd13f..36cb60c 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-consts.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-consts.c
@@ -6,6 +6,7 @@
 /* Check that easy AltiVec constants are correctly synthesized.  */
 
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 typedef __attribute__ ((vector_size (16))) unsigned char v16qi;
 typedef __attribute__ ((vector_size (16))) unsigned short v8hi;
diff --git gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c 
gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
index 1349ae5..d62f5bb 100644
--- gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
+++ gcc/testsuite/gcc.target/powerpc/altivec-varargs-1.c
@@ -7,6 +7,7 @@
 
 extern void exit (int);
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 #define vector __attribute__((vector_size (16)))
 
diff --git gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c 
gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
index 75733d6..15ec650 100644
--- gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
+++ gcc/testsuite/gcc.target/powerpc/le-altivec-consts.c
@@ -6,6 +6,7 @@
 /* Check that easy AltiVec constants are correctly synthesized.  */
 
 extern void abort (void);
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
 
 typedef __attribute__ ((vector_size (16))) unsigned char v16qi;
 typedef __attribute__ ((vector_size (16))) unsigned short v8hi;
diff --git

Re: RFA: fix mode confusion in caller-save.c:replace_reg_with_saved_mem

2014-10-07 Thread Jeff Law


On 10/06/14 20:57, Joern Rennecke wrote:

On 6 October 2014 19:58, Jeff Law l...@redhat.com wrote:

What makes word_mode special here?  ie, why is special casing for word_mode
the right thing to do?


The patch does not special-case word mode.  The if condition tests if
smode would
cover multiple hard registers.
If that would be the case, smode is replaced with word_mode.

SO I'll ask another way.  Why do you want to change smode to word_mode?

Jeff

[jit] Eliminate internal-api.c/h in favor of jit-common.h, jit-playback.c/h, jit-recording.c/h

2014-10-07 Thread David Malcolm

jit/internal-api.c and .h were getting large, so I broke them out into:

  * jit-common.h (forward decls of types)
  * jit-recording.h/c (the gcc::jit::recording classes)
  * jit-playback.h/c (the gcc::jit::playback classes)

Committed to branch dmalcolm/jit as 3071567787aef4a8ada8b38c890d01c19b4b998f

Not posting the full patch here as it's 400KB, but it can be seen at:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=3071567787aef4a8ada8b38c890d01c19b4b998f

gcc/jit/ChangeLog.jit:
* Make-lang.in (jit_OBJS): Drop jit/internal-api.o.
Add jit/jit-recording.o and jit/jit-playback.o.

* internal-api.c: Delete, moving content to new files jit-recording.c
and jit-playback.c.
* internal-api.h: Delete, moving content to new files
jit-common.h, jit-playback.h, jit-recording.h.
* jit-common.h: New file, containing the forward decls of classes
formerly in internal-api.h.
* jit-recording.c: New file, containing the gcc::jit::recording
code formerly in internal-api.c, and gcc::jit::dump.
* jit-recording.h: New file, containing the gcc::jit::recording
prototypes formerly in internal-api.h.
* jit-playback.c: New file, containing the gcc::jit::playback
code formerly in internal-api.c.
* jit-playback.h: New file, containing the gcc::jit::playback
prototypes formerly in internal-api.h.

* dummy-frontend.c: Don't include internal-api.h.  Add includes
of jit-common.h and jit-playback.h.
* jit-builtins.h: Replace include of internal-api.h with
jit-common.h.
* jit-builtins.c: Replace include of internal-api.h with
jit-common.h.  Add include of jit-recording.h.
* libgccjit.c: Likewise.

* docs/internals/index.rst (Overview of code structure): Update
to reflect the above changes.
---
 gcc/jit/ChangeLog.jit|   31 +
 gcc/jit/Make-lang.in |6 +-
 gcc/jit/docs/internals/index.rst |   18 +-
 gcc/jit/dummy-frontend.c |3 +-
 gcc/jit/internal-api.c   | 5473 --
 gcc/jit/internal-api.h   | 2264 
 gcc/jit/jit-builtins.c   |3 +-
 gcc/jit/jit-builtins.h   |2 +-
 gcc/jit/jit-common.h |  180 ++
 gcc/jit/jit-playback.c   | 2098 +++
 gcc/jit/jit-playback.h   |  564 
 gcc/jit/jit-recording.c  | 3415 
 gcc/jit/jit-recording.h  | 1593 +++
 gcc/jit/libgccjit.c  |3 +-
 14 files changed, 7902 insertions(+), 7751 deletions(-)
 delete mode 100644 gcc/jit/internal-api.c
 delete mode 100644 gcc/jit/internal-api.h
 create mode 100644 gcc/jit/jit-common.h
 create mode 100644 gcc/jit/jit-playback.c
 create mode 100644 gcc/jit/jit-playback.h
 create mode 100644 gcc/jit/jit-recording.c
 create mode 100644 gcc/jit/jit-recording.h

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Joseph S. Myers

On Tue, 7 Oct 2014, Marek Polacek wrote:

 2014-10-07  Marek Polacek  pola...@redhat.com
 
   PR c/59717
   * c-decl.c (header_for_builtin_fn): New function.
   (implicitly_declare): Suggest which header to include.
 
   * gcc.dg/pr59717.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [patch] Work harder to find DECL_STRUCT_FUNCTION

2014-10-07 Thread Jan Hubicka

 On Mon, Oct 6, 2014 at 11:52 AM, Eric Botcazou ebotca...@adacore.com wrote:
  Hi,
 
  you can have chains of clone functions in the callgraph but can_inline_edge_p
  stops at the first clone when it is looking for DECL_STRUCT_FUNCTION, which
  can fool the following conditions in the predicate.
 
  Tested on x86_64-suse-linux, OK for the mainline?
 
 I wonder if this is worth abstracting into a callee_fn () cgraph edge method?
 
 Honza's call.

I would rather fix can_inline_edge_p to not use DECL_STRUCT_FUNCTION - it is not
available during WPA and thus all the code using it is wrong.  The
non_call_exceptions code has a FIXME explaining that, and I see that someone
added cilk.
It should be easy to move these flags to the cgraph node itself - originally I
did not want to duplicate them and worried about performance implications.

Honza
 
 Thanks,
 Richard.
 
 
  2014-10-06  Eric Botcazou  ebotca...@adacore.com
 
  * ipa-inline.c (can_inline_edge_p): Recurse on clones to find the
  DECL_STRUCT_FUNCTION of the original node.
 
 
  --
  Eric Botcazou

[jit] Documentation tweaks

2014-10-07 Thread David Malcolm

Committed to branch dmalcolm/jit:

gcc/jit/ChangeLog.jit:
* docs/internals/index.rst (Overview of code structure): Directly
include the comment from jit-common.h as rst, rather than as a
quoted C++ comment.
* jit-common.h: Convert the summary format to valid reStructured
text for inclusion by docs/internals/index.rst.
* notes.txt: Clarify where libgccjit.c, jit-recording.c and
jit-playback.c fit into the high-level diagram.
---
 gcc/jit/ChangeLog.jit| 10 ++
 gcc/jit/docs/internals/index.rst |  7 +++
 gcc/jit/jit-common.h | 12 +++-
 gcc/jit/notes.txt| 13 +
 4 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 4592002..1a76543 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,15 @@
 2014-10-07  David Malcolm  dmalc...@redhat.com
 
+   * docs/internals/index.rst (Overview of code structure): Directly
+   include the comment from jit-common.h as rst, rather than as a
+   quoted C++ comment.
+   * jit-common.h: Convert the summary format to valid reStructured
+   text for inclusion by docs/internals/index.rst.
+   * notes.txt: Clarify where libgccjit.c, jit-recording.c and
+   jit-playback.c fit into the high-level diagram.
+
+2014-10-07  David Malcolm  dmalc...@redhat.com
+
* Make-lang.in (jit_OBJS): Drop jit/internal-api.o.
Add jit/jit-recording.o and jit/jit-playback.o.
 
diff --git a/gcc/jit/docs/internals/index.rst b/gcc/jit/docs/internals/index.rst
index 3065c60..1e3952c 100644
--- a/gcc/jit/docs/internals/index.rst
+++ b/gcc/jit/docs/internals/index.rst
@@ -152,7 +152,6 @@ Overview of code structure
 
 Here is a high-level summary from ``jit-common.h``:
 
-   .. literalinclude:: ../../jit-common.h
-:start-after: /* Summary.  */
-:end-before: namespace gcc {
-:language: c++
+.. include:: ../../jit-common.h
+  :start-after: This comment is included by the docs.
+  :end-before: End of comment for inclusion in the docs.  */
diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 5c41ddd..58e4a8c 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -36,9 +36,9 @@ along with GCC; see the file COPYING3.  If not see
 
 const int NUM_GCC_JIT_TYPES = GCC_JIT_TYPE_FILE_PTR + 1;
 
-/* Summary.  */
+/* This comment is included by the docs.
 
-/* In order to allow jit objects to be usable outside of a compile
+   In order to allow jit objects to be usable outside of a compile
whilst working with the existing structure of GCC's code the
C API is implemented in terms of a gcc::jit::recording::context,
which records the calls made to it.
@@ -79,15 +79,17 @@ const int NUM_GCC_JIT_TYPES = GCC_JIT_TYPE_FILE_PTR + 1;
 
During a playback, we associate objects from the recording with
their counterparts during this playback.  For simplicity, we store this
-   within the recording objects, as void *m_playback_obj, casting it to
+   within the recording objects, as ``void *m_playback_obj``, casting it to
the appropriate playback object subclass.  For these casts to make
sense, the two class hierarchies need to have the same structure.
 
-   Note that the playback objects that m_playback_obj points to are
+   Note that the playback objects that ``m_playback_obj`` points to are
GC-allocated, but the recording objects don't own references:
these associations only exist within a part of the code where
the GC doesn't collect, and are set back to NULL before the GC can
-   run.  */
+   run.
+
+   End of comment for inclusion in the docs.  */
 
 namespace gcc {
 
diff --git a/gcc/jit/notes.txt b/gcc/jit/notes.txt
index 54dca8f..d337cb4 100644
--- a/gcc/jit/notes.txt
+++ b/gcc/jit/notes.txt
@@ -5,8 +5,13 @@ Client Code   . Generated .libgccjit.so
│  .   .  .   .
 ──  .   .
   .   .│ .   .
-
-  .   (record API calls) .
+  .   .V .   .
+  .   .── libgccjit.c   .
+  .   .│ (error-checking).
+  .   .│ .
+  .   .── jit-recording.c
+  .   .  (record API calls)
+  .   .───  .
   .   .│ .   .
───  .   .
│  .   .  .   .
@@ -27,8 +32,8 @@ Client Code   . Generated .libgccjit.so
   .   .  .│  .
 ..│..V...
   .   .  .│  .   No GC in here
-

Re: [patch] remove dwarf2out's current_function_has_inlines

2014-10-07 Thread Jason Merrill


On 10/07/2014 01:16 PM, Aldy Hernandez wrote:

Errr... a static that only gets written to?

OK to commit?


Yes.  This should have been removed with

2010-09-03  Marcin Baczynski  marb...@gmail.com

* dwarf2out.c (file scope): Remove #if0'd code.
(gen_subprogram_die): Same.

Jason

Re: [C PATCH] Print header hints (PR c/59717)

2014-10-07 Thread Jason Merrill


On 10/07/2014 11:04 AM, Jakub Jelinek wrote:

adding a hint in this case is less obvious than in the C case, because,
what if this wasn't supposed to be ::abort (), but std::abort (), or
some other namespace abort, or some class abort () method etc.?


It still seems reasonable to offer a hint if no declaration was found.

Jason

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Ilya Verbin

On 07 Oct 16:30, Jakub Jelinek wrote:
 I think it is useful, doesn't have to be in the initial checkin, but I'd
 certainly prefer if from the (optional) --enable-offload-target argument
 it would figure out everything it needs to add for testing.
 And, if mkoffload isn't flexible enough to be convinced to find it in that
 scenario, it better should be made more flexible.

Ok, then we will implement this in a separate patch.

 I thought .gnu.target_lto* sections hold LTO bytecode and are desirable only in the
 ET_REL objects for ld(1)/lto-wrapper purposes.  For large programs containing large
 target regions the LTO bytecode could be very big, so leaving it in the binary is
 undesirable.

Already fixed in kyukhin/gomp4-offload branch.
 
 For .offload_image_section name, wouldn't it be better to prefix that with .gnu?

Renamed to .gnu.offload_images, I'll update the branch tomorrow after testing.

 And, is __gnu_offload_{funcs,vars} named that way just because the plugin isn't able to add
 symbols around the sections for you?  As it doesn't contain a dot, it would collide
 with user declarations put into __attribute__((section ("__gnu_offload_funcs"))).

Renamed to .gnu.offload_{funcs,vars}.
Automatically provided symbols __start__*, __stop__* don't work with shared
libraries, since the symbols from exec override the respective symbols in dso.
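
As background, here is a small sketch of the automatically provided
symbols mentioned above.  It assumes an ELF target linked with GNU ld (or
a compatible linker), which synthesizes __start_NAME and __stop_NAME only
for sections whose names are valid C identifiers.  The section name
"demo_table" and all function names are hypothetical.

```c
#include <stddef.h>

/* Two entries placed in a section whose name is a valid C identifier.
   "used" keeps them even though nothing refers to them by name.  */
int demo_entry_a __attribute__ ((used, section ("demo_table"))) = 10;
int demo_entry_b __attribute__ ((used, section ("demo_table"))) = 20;

/* Provided by the linker, not defined anywhere in the source.  */
extern int __start_demo_table[];
extern int __stop_demo_table[];

/* Number of int-sized entries between the two boundary symbols.  */
size_t
demo_table_count (void)
{
  return (size_t) (__stop_demo_table - __start_demo_table);
}

/* Walk the section from start to stop and add up the entries.  */
int
demo_table_sum (void)
{
  int sum = 0;
  for (int *p = __start_demo_table; p < __stop_demo_table; p++)
    sum += *p;
  return sum;
}
```

A section name containing a dot, such as .gnu.offload_funcs, gets no such
symbols, so the boundaries have to be provided some other way; in exchange
the name cannot be confused with an identifier-style user section.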
 
 Looking at the symbols:
 perhaps it would be better to have . somewhere in the names too, though if you are
 accessing that from C or declaring them in C, it might be too hard to bother.
 It is all in reserved namespace anyway, but use two underscores prefix instead of one
 for those IMHO.

All these symbols are declared/accessed in C, so I renamed them to __offload_*.

On 07 Oct 16:45, Jakub Jelinek wrote:
 Also, something that I believe has been discussed in the past, but can't
 find it on your wiki page nor in *.opt, are option overrides for the
 offloading target, i.e. some option you can pass to the host compiler driver
 during linking that will tell the driver for which offloading targets (if
 any at all) to produce the offloading support (defaulting to all configured
 offloading target is fine) and optionally what extra options beyond what has
 been passed on the command line should be passed to the offloading compiler.
 
 Say, if I want to link target-1.exe such that it will only support host
 fallback and not x86_64-intelmicemul-linux-gnu , how do I achieve that now?

Unfortunately, this is still under development.  I hope to have a working patch
in a week.  For now, without it, lto-wrapper builds offload images for all
offload targets specified at configure time.

  -- Ilya

[committed] Fix missing include in check_effective_target_fd_truncate

2014-10-07 Thread Marek Polacek

In -std=gnu11 as a default mode many Fortran tests ended up as
UNSUPPORTED, because check_effective_target_fd_truncate routine
was missing the string.h header (it uses strncmp) hence it failed.
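
A minimal sketch of the failure mode, with a hypothetical helper name.
With C99 and later as the default, calling strncmp without a declaration
in scope is an implicit function declaration, which the stricter defaults
diagnose; including string.h provides the declaration.

```c
#include <string.h>   /* declares strncmp and strlen */

/* Return nonzero if S starts with PREFIX.  Without the include above,
   the strncmp call would be an implicit function declaration.  */
int
demo_has_prefix (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}
```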

Applying to trunk.

2014-10-07  Marek Polacek  pola...@redhat.com

* lib/target-supports.exp (check_effective_target_fd_truncate):
Include string.h.

diff --git gcc/testsuite/lib/target-supports.exp 
gcc/testsuite/lib/target-supports.exp
index 77e45cb..2144683 100644
--- gcc/testsuite/lib/target-supports.exp
+++ gcc/testsuite/lib/target-supports.exp
@@ -5284,6 +5284,7 @@ proc check_effective_target_fd_truncate { } {
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
+   #include <string.h>
int main ()
{
  FILE *f = fopen ("tst.tmp", "wb");

Marek

Re: [Fortran, Patch] Implement IMPLICIT NONE

2014-10-07 Thread Andreas Schwab

Tobias Burnus bur...@net-b.de writes:

 diff --git a/gcc/testsuite/gfortran.dg/implicit_4.f90 
 b/gcc/testsuite/gfortran.dg/implicit_4.f90
 index 2e871b0..9bf8d86 100644
 --- a/gcc/testsuite/gfortran.dg/implicit_4.f90
 +++ b/gcc/testsuite/gfortran.dg/implicit_4.f90
 @@ -5,13 +5,13 @@ IMPLICIT NONE ! { dg-error "Duplicate" }
  END
  
  SUBROUTINE a
 -IMPLICIT REAL(b-j) ! { dg-error "cannot follow" }
 -implicit none  ! { dg-error "cannot follow" }
 +IMPLICIT REAL(b-j)
 +implicit none  ! { dg-error "Type IMPLICIT NONE statement at .1. following an IMPLICIT statement" }

That doesn't match.

/usr/local/gcc/gcc-20141007/gcc/testsuite/gfortran.dg/implicit_4.f90:9:103:
Error: IMPLICIT NONE (type) statement at (1) following an IMPLICIT statement

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

[GOOGLE] Handle missing BINFO for LIPO

2014-10-07 Thread Teresa Johnson

We may have missing BINFO on a type if that type is a builtin, since
in LIPO mode we will reset builtin types to their original tree nodes
before parsing subsequent modules. Handle incomplete information by
returning false so we won't put an entry in the type inheritance graph
for optimization.
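
The guard can be pictured with the following self-contained sketch.  The
structs and the function name are hypothetical simplifications, not GCC's
tree accessors; only the shape of the check is the point.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-ins for a type and its BINFO.  */
struct demo_binfo
{
  const void *vtable;           /* NULL when there is no virtual table */
};

struct demo_type
{
  struct demo_binfo *binfo;     /* may be NULL, e.g. for a reset builtin */
};

/* Return true only if TYPE has a BINFO and that BINFO has a vtable.
   A missing BINFO is treated as "not polymorphic" instead of being
   dereferenced.  */
bool
demo_polymorphic_p (const struct demo_type *type)
{
  const struct demo_binfo *type_binfo = type->binfo;
  if (!type_binfo)
    return false;
  return type_binfo->vtable != NULL;
}
```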

Passes regression tests. Ok for google branches?

Teresa

2014-10-07  Teresa Johnson  tejohn...@google.com

Google ref b/16511102.
* ipa-devirt.c (polymorphic_type_binfo_p): Handle missing BINFO.

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 215830)
+++ ipa-devirt.c(working copy)
@@ -177,7 +177,10 @@ static inline bool
 polymorphic_type_binfo_p (tree binfo)
 {
   /* See if BINFO's type has an virtual table associtated with it.  */
-  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
+  tree type_binfo = TYPE_BINFO (BINFO_TYPE (binfo));
+  if (!type_binfo)
+return false;
+  return BINFO_VTABLE (type_binfo);
 }

 /* One Definition Rule hashtable helpers.  */


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [GOOGLE] Handle missing BINFO for LIPO

2014-10-07 Thread Xinliang David Li

Ok (please also guard it with L_IPO_COMP_MODE).

David

On Tue, Oct 7, 2014 at 11:27 AM, Teresa Johnson tejohn...@google.com wrote:
 We may have missing BINFO on a type if that type is a builtin, since
 in LIPO mode we will reset builtin types to their original tree nodes
 before parsing subsequent modules. Handle incomplete information by
 returning false so we won't put an entry in the type inheritance graph
 for optimization.

 Passes regression tests. Ok for google branches?

 Teresa

 2014-10-07  Teresa Johnson  tejohn...@google.com

 Google ref b/16511102.
 * ipa-devirt.c (polymorphic_type_binfo_p): Handle missing BINFO.

 Index: ipa-devirt.c
 ===
 --- ipa-devirt.c(revision 215830)
 +++ ipa-devirt.c(working copy)
 @@ -177,7 +177,10 @@ static inline bool
  polymorphic_type_binfo_p (tree binfo)
  {
/* See if BINFO's type has an virtual table associtated with it.  */
 -  return BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (binfo)));
 +  tree type_binfo = TYPE_BINFO (BINFO_TYPE (binfo));
 +  if (!type_binfo)
 +return false;
 +  return BINFO_VTABLE (type_binfo);
  }

  /* One Definition Rule hashtable helpers.  */


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [patch] tag ../include/*

2014-10-07 Thread Mike Stump

On Oct 7, 2014, at 9:37 AM, Aldy Hernandez al...@redhat.com wrote:
 Is there a reason we don't create etags for toplevel include files?

I don’t think there is.

  If not, could I please apply this patch?

I’m in favor.

Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.

2014-10-07 Thread Andreas Schwab

Jonathan Wakely jwak...@redhat.com writes:

 diff --git a/libstdc++-v3/src/c++98/locale_facets.cc 
 b/libstdc++-v3/src/c++98/locale_facets.cc
 index 3669acb..7ed04e6 100644
 --- a/libstdc++-v3/src/c++98/locale_facets.cc
 +++ b/libstdc++-v3/src/c++98/locale_facets.cc
 @@ -69,19 +69,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (__flags & ios_base::showpoint)
*__fptr++ = '#';
  
 -// As per DR 231: _always_, not only when 
 -// __flags & ios_base::fixed || __prec > 0
 -*__fptr++ = '.';
 -*__fptr++ = '*';
 +ios_base::fmtflags __fltfield = __flags & ios_base::floatfield;
 +
 +if (__fltfield != (ios_base::fixed | ios_base::scientific))
 +  {
 +// As per DR 231: not only when __flags & ios_base::fixed || __prec > 0
 +*__fptr++ = '.';
 +*__fptr++ = '*';
 +  }
  
  if (__mod)
*__fptr++ = __mod;
 -ios_base::fmtflags __fltfield = __flags & ios_base::floatfield;
  // [22.2.2.2.2] Table 58
  if (__fltfield == ios_base::fixed)
*__fptr++ = 'f';
  else if (__fltfield == ios_base::scientific)
*__fptr++ = (__flags & ios_base::uppercase) ? 'E' : 'e';
 +#ifdef _GLIBCXX_USE_C99
 +else if (__fltfield == (ios_base::fixed | ios_base::scientific))
 +  *__fptr++ = (__flags & ios_base::uppercase) ? 'A' : 'a';
 +#endif
 +#endif

That cannot work.  std::__convert_from_v always passes __prec before
__v, but the format is %a.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Re: [PATCH 5/n] OpenMP 4.0 offloading infrastructure: libgomp

2014-10-07 Thread Jakub Jelinek

On Tue, Oct 07, 2014 at 10:12:22PM +0400, Ilya Verbin wrote:
  And, is __gnu_offload_{funcs,vars} named that way just because the plugin
  isn't able to add symbols around the sections for you?  As it doesn't
  contain a dot, it would collide with user declarations put into
  __attribute__((section ("__gnu_offload_funcs"))).
 
 Renamed to .gnu.offload_{funcs,vars}.
 Automatically provided symbols __start__*, __stop__* don't work with shared
 libraries, since the symbols from exec override the respective symbols in dso.

...

Thanks.

One more thing, I've noticed that running target-1.exe testcase also leaves
/tmp/offload_XX directories around (one for each invocation).
That can be useful for debugging, but generally should be cleaned up in
__cxa_atexit callback or similar.

OT: from the various discussions with Kirill on IRC, it seems you or
your colleagues typed in pretty much all target-related tests from the
OpenMP 4.0.1 examples; can those also be submitted for inclusion in the
testsuite?
AFAIK we already have the appendix-a/ testcases and had permissions from
OpenMP committee to use them, so if we put these into the same directory
(sure, it is not appendix-a anymore, but no tests are in that appendix
anymore), it would be appreciated.

Jakub

[PATCH] add overlap function to gcov-tool

2014-10-07 Thread Rong Xu

Hi,

This patch adds overlap functionality to gcov-tool. The overlap score
estimates the similarity of two profiles. Currently it only computes
overlap for arc counters.

The overlap score is defined as
\sum minimum (p1->counter[i] / p1->sum-all, p2->counter[i] / p2->sum-all)
where p1->counter[i] and p2->counter[i] are two matched counters from
profile1 and profile2, and p1->sum-all and p2->sum-all are the sum-all
counters in profile1 and profile2, respectively.

The resulting score is a value ranging from 0.0 to 1.0, where 0.0 means
no match and 1.0 means a perfect match.
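
As a rough illustration of the definition above, the score can be computed as follows. This is a minimal sketch and not the code from the patch: overlap_score is a hypothetical name, the two counter arrays are assumed to be already matched index by index, and sum-all is assumed to be the plain sum of each array.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: overlap of two matched counter arrays of length N.
   Returns a value in [0.0, 1.0], or 0.0 if either profile is empty.  */
double
overlap_score (const unsigned long long *c1, const unsigned long long *c2,
               size_t n)
{
  unsigned long long sum1 = 0, sum2 = 0;
  double score = 0.0;
  size_t i;

  assert (c1 != NULL && c2 != NULL);

  /* Assumption: sum-all is the sum of all counters in the profile.  */
  for (i = 0; i < n; i++)
    {
      sum1 += c1[i];
      sum2 += c2[i];
    }
  if (sum1 == 0 || sum2 == 0)
    return 0.0;

  /* Accumulate the smaller of the two normalized counters.  */
  for (i = 0; i < n; i++)
    {
      double f1 = (double) c1[i] / (double) sum1;
      double f2 = (double) c2[i] / (double) sum2;
      score += f1 < f2 ? f1 : f2;
    }
  return score;
}
```

Two profiles with the same relative distribution score 1.0 even if their absolute counts differ, which is what makes the score usable for pruning similar training inputs.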

This tool can be used in performance triaging and reducing the fdo
training set size (where similar inputs can be pruned).

Tested with spec2006 profiles.

Thanks,

-Rong
2014-10-07  Rong Xu  x...@google.com

* gcc/gcov-tool.c (profile_overlap): New driver function
to compute profile overlap. 
(print_overlap_usage_message): New.
(overlap_usage): New.
(do_overlap): New.
(print_usage): Add calls to overlap function.
(main): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Fix format.
(find_match_gcov_info): Ditto.
(calculate_2_entries): New.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.
(gcov_info_count_all_zero): Ditto.
(extract_file_basename): Ditto.
(get_file_basename): Ditto.
(set_flag): Ditto.
(matched_gcov_info): Ditto.
(calculate_overlap): Ditto.
(gcov_profile_overlap): Ditto.
* libgcc/libgcov-driver.c (compute_summary): Make
it available for external calls.
* gcc/doc/gcov-tool.texi: Add documentation.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 215981)
+++ gcc/gcov-tool.c (working copy)
@@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "getopt.h"
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
+extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
 extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
@@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
   return ret;
 }
 
+/* Driver function to computer the overlap score b/w profile D1 and D2.
+   Return 1 on error and 0 if OK.  */
+
+static int
+profile_overlap (const char *d1, const char *d2)
+{
+  struct gcov_info *d1_profile;
+  struct gcov_info *d2_profile;
+
+  d1_profile = gcov_read_profile_dir (d1, 0);
+  if (!d1_profile)
+return 1;
+
+  if (d2)
+{
+  d2_profile = gcov_read_profile_dir (d2, 0);
+  if (!d2_profile)
+return 1;
+
+  return gcov_profile_overlap (d1_profile, d2_profile);
+}
+
+  return 1;
+}
+
+/* Usage message for profile overlap.  */
+
+static void
+print_overlap_usage_message (int error_p)
+{
+  FILE *file = error_p ? stderr : stdout;
+
+  fnotice (file, "  overlap [options] dir1 dir2   Compute the overlap of two profiles\n");
+  fnotice (file, "-v, --verbose   Verbose mode\n");
+  fnotice (file, "-h, --hotonly   Only print info for hot objects/functions\n");
+  fnotice (file, "-f, --function  Print function level info\n");
+  fnotice (file, "-F, --fullname  Print full filename\n");
+  fnotice (file, "-o, --object    Print object level info\n");
+  fnotice (file, "-t float, --hot_threshold float Set the threshold for hotness\n");
+
+}
+
+static const struct option overlap_options[] =
+{
+  { "verbose",       no_argument,       NULL, 'v' },
+  { "function",      no_argument,       NULL, 'f' },
+  { "fullname",      no_argument,       NULL, 'F' },
+  { "object",        no_argument,       NULL, 'o' },
+  { "hotonly",       no_argument,       NULL, 'h' },
+  { "hot_threshold", required_argument, NULL, 't' },
+  { 0, 0, 0, 0 }
+};
+
+/* Print overlap usage and exit.  */
+
+static void
+overlap_usage (void)
+{
+  fnotice (stderr, "Overlap subcomand usage:");
+  print_overlap_usage_message (true);
+  exit (FATAL_EXIT_CODE);
+}
+
+int overlap_func_level;
+int overlap_obj_level;
+int overlap_hot_only;
+int overlap_use_fullname;
+double overlap_hot_threshold = 0.005;
+
+/* Driver for profile overlap sub-command.  */
+
+static int
+do_overlap (int argc, char **argv)
+{
+  int opt;
+  int ret;
+
+  optind = 0;
+  while ((opt = getopt_long (argc, argv, "vfFoht:", overlap_options, NULL)) != -1)
+{
+  switch (opt)
+{
+case 'v':
+  verbose = true;
+  gcov_set_verbose ();
+  break;
+case 'f':
+  overlap_func_level = 1;
+  break;
+case 'F':
+  overlap_use_fullname = 1;
+  break;
+case 'o':
+

[Google 4.9] Backport of r210828

2014-10-07 Thread Sterling Augustine

The enclosed patch for google 4.9 is a backport of r210828 from
trunk.

googleref:b/14623977

The given tests now pass when run by hand, but time out under DejaGnu.
I will be sending a different change to fix that.

OK for google 4.9?

Index: gcc/config/aarch64/aarch64-builtins.c
===
--- gcc/config/aarch64/aarch64-builtins.c   (revision 215958)
+++ gcc/config/aarch64/aarch64-builtins.c   (working copy)
@@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_bui
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_GET_FPCR,
+  AARCH64_BUILTIN_SET_FPCR,
+  AARCH64_BUILTIN_GET_FPSR,
+  AARCH64_BUILTIN_SET_FPSR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_set_fpr
+= build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_get_fpr
+= build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_get_fpcr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_set_fpcr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_get_fpsr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_set_fpsr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+
   if (TARGET_SIMD)
 aarch64_init_simd_builtins ();
 }
@@ -964,7 +988,37 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
 
+  switch (fcode)
+{
+case AARCH64_BUILTIN_GET_FPCR:
+case AARCH64_BUILTIN_SET_FPCR:
+case AARCH64_BUILTIN_GET_FPSR:
+case AARCH64_BUILTIN_SET_FPSR:
+  if ((fcode == AARCH64_BUILTIN_GET_FPCR)
+ || (fcode == AARCH64_BUILTIN_GET_FPSR))
+   {
+ icode = (fcode == AARCH64_BUILTIN_GET_FPSR) ?
+   CODE_FOR_get_fpsr : CODE_FOR_get_fpcr;
+ target = gen_reg_rtx (SImode);
+ pat = GEN_FCN (icode) (target);
+   }
+  else
+   {
+ target = NULL_RTX;
+ icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ?
+   CODE_FOR_set_fpsr : CODE_FOR_set_fpcr;
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ pat = GEN_FCN (icode) (op0);
+   }
+  emit_insn (pat);
+  return target;
+}
+
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
 return aarch64_simd_expand_builtin (fcode, exp, target);
 
@@ -1196,6 +1250,106 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned AARCH64_FE_INVALID = 1;
+  const unsigned AARCH64_FE_DIVBYZERO = 2;
+  const unsigned AARCH64_FE_OVERFLOW = 4;
+  const unsigned AARCH64_FE_UNDERFLOW = 8;
+  const unsigned AARCH64_FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT AARCH64_FE_ALL_EXCEPT = (AARCH64_FE_INVALID
+   | AARCH64_FE_DIVBYZERO
+   | AARCH64_FE_OVERFLOW
+   | AARCH64_FE_UNDERFLOW
+   | AARCH64_FE_INEXACT);
+  const unsigned HOST_WIDE_INT AARCH64_FE_EXCEPT_SHIFT = 8;
+  tree fenv_cr, fenv_sr, get_fpcr, set_fpcr, mask_cr, mask_sr;
+  tree ld_fenv_cr, ld_fenv_sr, masked_fenv_cr, masked_fenv_sr, hold_fnclex_cr;
+  tree hold_fnclex_sr, new_fenv_var, reload_fenv, restore_fnenv, get_fpsr, set_fpsr;
+  tree update_call, atomic_feraiseexcept, hold_fnclex, masked_fenv, ld_fenv;
+
+  /* Generate the equivalence of :
+   unsigned int fenv_cr;
+   fenv_cr = __builtin_aarch64_get_fpcr ();
+
+   unsigned int fenv_sr;
+   fenv_sr = __builtin_aarch64_get_fpsr ();
+
+   Now set all exceptions to non-stop
+   unsigned int mask_cr
+   = ~(AARCH64_FE_ALL_EXCEPT << AARCH64_FE_EXCEPT_SHIFT);
+   unsigned int masked_cr;
+   masked_cr = fenv_cr & mask_cr;
+
+   And clear all exception flags
+
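
The diff is cut off at this point in the archive. As a stand-alone illustration of the masking described in the comment above, the following sketch applies the same bit arithmetic to plain integers. The flag values and the shift of 8 are copied from the patch; the function names are hypothetical, and the status-register mask is an assumption, since that part of the comment is truncated here.

```c
#include <stdint.h>

/* Exception flag bits, with the values listed in the patch.  */
#define EX_INVALID    1u
#define EX_DIVBYZERO  2u
#define EX_OVERFLOW   4u
#define EX_UNDERFLOW  8u
#define EX_INEXACT   16u
#define EX_ALL (EX_INVALID | EX_DIVBYZERO | EX_OVERFLOW \
                | EX_UNDERFLOW | EX_INEXACT)
#define EX_TRAP_SHIFT 8

/* Clear the trap-enable bits in a control-register image, i.e. set all
   exceptions to non-stop.  */
uint32_t
mask_control (uint32_t fenv_cr)
{
  uint32_t mask_cr = (uint32_t) ~(EX_ALL << EX_TRAP_SHIFT);
  return fenv_cr & mask_cr;
}

/* Assumption: clearing the exception flags means masking off the low
   five bits of a status-register image.  */
uint32_t
mask_status (uint32_t fenv_sr)
{
  return fenv_sr & (uint32_t) ~EX_ALL;
}
```

With these values the control mask works out to 0xffffe0ff: bits 8 through 12 are cleared and everything else is preserved.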

Towards GNU11

2014-10-07 Thread Marek Polacek

Hi!

I'd like to kick off a discussion about moving the default standard
for C from gnu89 to gnu11.

This really shouldn't be much of a surprise: the docs have mentioned for
a year now that gnu11 is the intended future default.  I would presume now
is a good time to make this move: together with the new naming scheme
this should make GCC more modern (C89 really is as old as the hills).
And we're still in stage1.

Prerequisites should be largely complete at this point:
- we have -Wc90-c99-compat option that warns about features not present
  in ISO C90, but present in ISO C99,
- we have -Wc99-c11-compat option that warns about features not present
  in ISO C99, but present in ISO C11,
- the testsuite has been adjusted so that all the tests that pass with the
  gnu89 default should pass with the gnu11 default as well (see my recent
  batch of cleanup patches).  Unfortunately this doesn't hold for all
  architectures; I just don't have enough resources to test everything.
  But generally the fallout from moving to gnu11 is easy to fix: just add
  proper decls and return types (to fix defaulting to int), or for inline
  stuff use -fgnu89-inline or the gnu_inline attribute.  I'd appreciate
  testing on architectures other than x86_64/ppc64.

The things I had to fix in the testsuite nicely reflect what we can expect
in real life: mostly a bunch of new warnings about missing declarations
and defaulting to int (this is probably going to be a pain with -Werror,
but I feel that people really should write proper declarations), different
inline semantics (with C99 semantics, the TU has to have the body of the
inline function etc.), and new "return with no value, in function
returning non-void" warnings.  Different rules for constant expressions,
the fact that in C90 non-lvalue arrays do not decay to pointers, and
slightly different rules for compatible types (?) might come into play as
well.

In turn, you can use all C99 and C11 features even with -pedantic.
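
To make the typical fixes concrete, here is a small sketch of code written the way gnu11 expects. The function names are made up for illustration; the snippet only shows explicit prototypes and return types, the C99 way of providing an external definition for an inline function, and one C11 feature.

```c
/* Explicit prototype, rather than relying on an implicit declaration.  */
int area (int width, int height);

/* C99 inline definition.  On its own it does not provide an external
   definition of twice...  */
inline int twice (int x) { return 2 * x; }

/* ...so this declaration requests one in this translation unit.  */
extern inline int twice (int x);

/* Explicit return type, rather than defaulting to int.  */
int
area (int width, int height)
{
  return twice (width) * height / 2;
}

/* A C11 feature, usable in the new default mode.  */
_Static_assert (sizeof (int) >= 2, "int is at least 16 bits");
```

Code that depends on the old GNU inline semantics can instead keep its definitions unchanged and be built with -fgnu89-inline, as mentioned in the prerequisites.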

Comments?

Regtested/bootstrapped on powerpc64-linux and x86_64-linux.

2014-10-07  Marek Polacek  pola...@redhat.com

* doc/invoke.texi: Update to reflect that GNU11 is the default
mode for C.
* c-common.h (c_language_kind): Update comment.
c-family/
* c-opts.c (c_common_init_options): Make -std=gnu11 the default for C.

diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 1e3477f..a895084 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -445,7 +445,7 @@ struct GTY(()) sorted_fields_type {
 
 typedef enum c_language_kind
 {
-  clk_c= 0,/* C90, C94 or C99 */
+  clk_c= 0,/* C90, C94, C99 or C11 */
   clk_objc = 1,/* clk_c with ObjC features.  */
   clk_cxx  = 2,/* ANSI/ISO C++ */
   clk_objcxx   = 3 /* clk_cxx with ObjC features.  */
diff --git gcc/c-family/c-opts.c gcc/c-family/c-opts.c
index 3f295d8..eb078e3 100644
--- gcc/c-family/c-opts.c
+++ gcc/c-family/c-opts.c
@@ -250,6 +250,9 @@ c_common_init_options (unsigned int decoded_options_count,
 
   if (c_language == clk_c)
 {
+  /* The default for C is gnu11.  */
+  set_std_c11 (false /* ISO */);
+
   /* If preprocessing assembly language, accept any of the C-family
 front end options since the driver may pass them through.  */
   for (i = 1; i < decoded_options_count; i++)
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 5fe7e15..fa84ed4 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -1692,8 +1692,7 @@ interfaces) and L (Analyzability).  The name @samp{c1x} is deprecated.
 
 @item gnu90
 @itemx gnu89
-GNU dialect of ISO C90 (including some C99 features). This
-is the default for C code.
+GNU dialect of ISO C90 (including some C99 features).
 
 @item gnu99
 @itemx gnu9x
@@ -1701,8 +1700,8 @@ GNU dialect of ISO C99.  The name @samp{gnu9x} is deprecated.
 
 @item gnu11
 @itemx gnu1x
-GNU dialect of ISO C11.  This is intended to become the default in a
-future release of GCC.  The name @samp{gnu1x} is deprecated.
+GNU dialect of ISO C11.  This is the default for C code.
+The name @samp{gnu1x} is deprecated.
 
 @item c++98
 @itemx c++03

Marek

Re: sort_heap complexity guarantee

2014-10-07 Thread François Dumont


On 06/10/2014 23:05, Daniel Krügler wrote:

2014-10-06 23:00 GMT+02:00 François Dumont frs.dum...@gmail.com:

On 05/10/2014 22:54, Marc Glisse wrote:

On Sun, 5 Oct 2014, François Dumont wrote:


I took a look at PR 61217 regarding the pop_heap complexity guarantee.
It looks like we have no tests checking the complexity of our algos, so I
started writing some, beginning with the heap operations. I found no issue
with make_heap, push_heap and pop_heap, despite what the bug report says;
however, the attached testcase for sort_heap is failing.

The Standard says std::sort_heap shall use less than N * log(N)
comparisons, but with my test using 1000 random values the test shows:

8687 comparisons on 6907.76 max allowed

Is this a known issue with sort_heap? Do you confirm that the test is
valid?

I would first look for confirmation that the standard didn't just forget a
big-O or something. I would expect an implementation as n calls to pop_heap
to be legal, and if pop_heap makes 2*log(n) comparisons, that naively sums
to too much. And I don't expect the standard to contain an advanced
amortized analysis or anything like that...


Good point: with n calls to pop_heap the limit must be 2*log(1) +
2*log(2) + ... + 2*log(n), which is 2*log(n!) and which is also necessarily
bounded by 2*n*log(n). I guess the Standard committee has forgotten the
factor 2 in the limit, so this is what I am using as the limit in the final
test, unless someone prefers the stricter 2*log(n!)?
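
Written out, the bound discussed above looks as follows. This is a sketch under the assumptions made in the thread: sort_heap is implemented as n calls to pop_heap, a pop_heap on k elements costs at most 2 log k comparisons, and log is the natural logarithm (which matches the 6907.76 figure reported for N = 1000).

```latex
\[
  \sum_{k=1}^{n} 2\log k \;=\; 2\log(n!) \;\le\; 2\,n\log n
\]
% For n = 1000 (natural logarithm, values rounded):
%   n \log n     \approx  6907.76
%   2 \log(n!)   \approx 11824.3
%   2 n \log n   \approx 13815.5
```

The 8687 comparisons observed in the test exceed n log n but stay below both of the larger limits.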

François, could you please submit a corresponding LWG issue by sending
an email using the recipe described here:

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#submit_issue

?


I just did requesting to use 2N log(N).

And is it ok to commit those ?

François

Re: sort_heap complexity guarantee

2014-10-07 Thread Daniel Krügler

2014-10-07 23:11 GMT+02:00 François Dumont frs.dum...@gmail.com:
 On 06/10/2014 23:05, Daniel Krügler wrote:
 François, could you please submit a corresponding LWG issue by sending
 an email using the recipe described here:

 http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#submit_issue

 ?

 I just did requesting to use 2N log(N).

 And is it ok to commit those ?

Looks fine to me - Thanks!

- Daniel

Re: C++ Patch for c++/60894

2014-10-07 Thread Jason Merrill


On 09/24/2014 05:15 PM, Jason Merrill wrote:

On 09/24/2014 05:06 PM, Fabien Chêne wrote:

Unfortunately, just stripping the USING_DECL in lookup_and_check_tag
does not really work because some diagnostic code expects the
USING_DECL not to be stripped.


It seems to me that the problem is that lookup_and_check_tag is 
rejecting a USING_DECL rather than returning it.  What if we return the 
USING_DECL?


Jason

[Patch, testsuite] check if -shared is supported

2014-10-07 Thread Christophe Lyon

Hi,

When Jason added the new g++.dg/ipa/devirt-28a.C test  along with his
fix for PR c++/58678
(https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00838.html), this new
test was failing in the ARM and AArch64 configuration I am testing.

For the arm*-none-eabi and aarch64*-none-elf configurations, this was
simply because -shared is not supported by these targets. The attached
patch adds support to test availability of this option, similarly to
what is done for -fpic.

For the record, for the arm*linux configurations, the test was also
failing because testglue.o contained relocations incompatible with
-shared. I managed to make it work by adding
set_board_info wrap_compile_flags -mword-relocations
to my .exp dejagnu configuration.

In summary, with this patch devirt-28a.C is:
- PASS on arm*linux*
- UNSUPPORTED on arm*-none-eabi and aarch64*-none-elf
instead of FAIL.

Is it OK for trunk, and 4.9 (since Jason's patch was also committed to 4.9) ?

2014-10-08  Christophe Lyon  christophe.l...@linaro.org

* lib/target-supports.exp (check_effective_target_shared): New
function.
* g++.dg/ipa/devirt-28a.C: Check if -shared is supported.

Thanks,

Christophe.
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-28a.C b/gcc/testsuite/g++.dg/ipa/devirt-28a.C
index bdd1682..65d5fcd 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-28a.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-28a.C
@@ -1,6 +1,6 @@
 // PR c++/58678
 // { dg-options "-O3 -flto -shared -fPIC -Wl,--no-undefined" }
-// { dg-do link { target { gld && fpic } } }
+// { dg-do link { target { { gld && fpic } && shared } } }
 
 struct A {
   virtual ~A();
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 77e45cb..7ae6161 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -840,6 +840,19 @@ proc check_effective_target_fpic { } {
 return 0
 }
 
+# Return 1 if -shared is supported, as in no warnings or errors
+# emitted, 0 otherwise.
+
+proc check_effective_target_shared { } {
+# Note that M68K has a multilib that supports -fpic but not
+# -fPIC, so we need to check both.  We test with a program that
+# requires GOT references.
+return [check_no_compiler_messages shared executable {
+	extern int foo (void); extern int bar;
+	int baz (void) { return foo () + bar; }
+} "-shared -fpic"]
+}
+
 # Return 1 if -pie, -fpie and -fPIE are supported, 0 otherwise.
 
 proc check_effective_target_pie { } {

Re: RFA: fix mode confusion in caller-save.c:replace_reg_with_saved_mem

2014-10-07 Thread Joern Rennecke

On 7 October 2014 18:38, Jeff Law l...@redhat.com wrote:
 On 10/06/14 20:57, Joern Rennecke wrote:

 On 6 October 2014 19:58, Jeff Law l...@redhat.com wrote:

 What makes word_mode special here?  ie, why is special casing for
 word_mode
 the right thing to do?


 The patch does not special-case word mode.  The if condition tests if
 smode would
 cover multiple hard registers.
 If that would be the case, smode is replaced with word_mode.

 SO I'll ask another way.  Why do you want to change smode to word_mode?

Because SImode covers four hard registers, whereas the intention is to
have a single one.

(concatn:SI [
(reg:SI 18 r18)
(reg:SI 19 r19)
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 43 [0x2b])) [6  S1 A8])
(mem/c:QI (plus:HI (reg/f:HI 28 r28)
(const_int 44 [0x2c])) [6  S1 A8])
])

(see original post) is invalid RTL, and thus the cause of the later ICE.

Re: [Fortran, Patch] Implement IMPLICIT NONE

2014-10-07 Thread Dominique Dhumieres

Patch:

--- ../_clean/gcc/testsuite/gfortran.dg/implicit_4.f90  2014-10-07 00:21:56.0 +0200
+++ gcc/testsuite/gfortran.dg/implicit_4.f90  2014-10-07 19:09:45.0 +0200
@@ -6,7 +6,7 @@ END
 
 SUBROUTINE a
 IMPLICIT REAL(b-j)
-implicit none  ! { dg-error "Type IMPLICIT NONE statement at .1. following an IMPLICIT statement" }
+implicit none  ! { dg-error "IMPLICIT NONE .type. statement at .1. following an IMPLICIT statement" }
 END SUBROUTINE a
 
 subroutine b

Note that the loci are badly placed:

/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:4.40:

IMPLICIT NONE ! { dg-error Duplicate }
1
Error: Duplicate IMPLICIT NONE statement at (1)
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:9.105:

r IMPLICIT NONE .type. statement at .1. following an IMPLICIT statement }
   1
Error: IMPLICIT NONE (type) statement at (1) following an IMPLICIT statement
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:14.8:

implicit real(g-k) ! { dg-error IMPLICIT statement at .1. following an IMPLICI
1
Error: IMPLICIT statement at (1) following an IMPLICIT NONE (type) statement
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:19.47:

implicit integer (b-c) ! { dg-error already }
   1
Error: Letter B already has an IMPLICIT type at (1)
/opt/gcc/work/gcc/testsuite/gfortran.dg/implicit_4.f90:20.57:

implicit real(d-f), complex(f-g) ! { dg-error already }
 1
Error: Letter F already has an IMPLICIT type at (1)

i.e., at the end of the comment and not where the error occurs.

Dominique

[committed] MAINTAINERS (Write After Approval): Add myself.

2014-10-07 Thread Felix Yang

Index: MAINTAINERS
===
--- MAINTAINERS  (revision 215985)
+++ MAINTAINERS  (working copy)
@@ -583,6 +583,7 @@ Chung-Ju Wu      jasonw...@gmail.com
 Le-Chun Wu       l...@google.com
 Mingjie Xing     mingjie.x...@gmail.com
 Canqun Yang      can...@nudt.edu.cn
+Fei Yang         felix.y...@huawei.com
 Jeffrey Yasskin  jyass...@google.com
 Joey Ye          joey...@arm.com
 Greta Yorsh      greta.yo...@arm.com


Cheers,
Felix

[PATCH] Fix PR63259: bswap not recognized when finishing with rotation

2014-10-07 Thread Thomas Preud'homme

Currently the bswap pass only looks for bswap patterns by examining bitwise
OR statements and following their def-use chains. However, a rotation
(left or right) can finish a manual byteswap, as shown in the following example:

unsigned
byteswap_ending_with_rotation (unsigned in)
{
in = ((in & 0xff00ff00) >>  8) | ((in & 0x00ff00ff) <<  8);
in = ((in & 0xffff0000) >> 16) | ((in & 0x0000ffff) << 16);
return in;
}

which is compiled into:

byteswap_ending_with_rotation (unsigned int in)
{
  unsigned int _2;
  unsigned int _3;
  unsigned int _4;
  unsigned int _5;

  <bb 2>:
  _2 = in_1(D) & 4278255360;
  _3 = _2 >> 8;
  _4 = in_1(D) & 16711935;
  _5 = _4 << 8;
  in_6 = _5 | _3;
  in_7 = in_6 r>> 16;
  return in_7;

}

This patch adds rotation (left and right) to the list of statements to consider
for byte swap.
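
To see why the rotation-terminated form is a genuine byte swap, the following self-contained sketch compares it with a straightforward byte-by-byte reversal. The masks in the second step are the half-word masks implied by the 16-bit rotation in the dump; byteswap_reference is a name introduced here purely for the comparison.

```c
#include <stdint.h>

/* Swap adjacent bytes, then swap the two half-words.  The second step
   is a 16-bit rotation, which is what the compiler recognizes.  */
uint32_t
byteswap_ending_with_rotation (uint32_t in)
{
  in = ((in & 0xff00ff00u) >> 8) | ((in & 0x00ff00ffu) << 8);
  in = ((in & 0xffff0000u) >> 16) | ((in & 0x0000ffffu) << 16);
  return in;
}

/* Reference implementation: move each byte to its mirrored position.  */
uint32_t
byteswap_reference (uint32_t in)
{
  return ((in >> 24) & 0x000000ffu)
         | ((in >> 8) & 0x0000ff00u)
         | ((in << 8) & 0x00ff0000u)
         | (in << 24);
}
```

For example, 0x11223344 becomes 0x22114433 after the first step and 0x44332211 after the second, which is also what the reference function returns.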

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2014-09-30  Thomas Preud'homme  thomas.preudho...@arm.com

PR tree-optimization/63259
* tree-ssa-math-opts.c (pass_optimize_bswap::execute): Also consider
bswap in LROTATE_EXPR and RROTATE_EXPR statements.

*** gcc/testsuite/ChangeLog ***

2014-09-30  Thomas Preud'homme  thomas.preudho...@arm.com

PR tree-optimization/63259
* optimize-bswapsi-1.c (swap32_e): New bswap pass test.


diff --git a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c 
b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
index 580e6e0..d4b5740 100644
--- a/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
+++ b/gcc/testsuite/gcc.dg/optimize-bswapsi-1.c
@@ -64,5 +64,16 @@ swap32_d (SItype in)
 | (((in >> 24) & 0xFF) << 0);
 }
 
-/* { dg-final { scan-tree-dump-times "32 bit bswap implementation found at" 4 "bswap" } } */
+/* This variant comes from PR63259.  It compiles to a gimple sequence that ends
+   with a rotation instead of a bitwise OR.  */
+
+unsigned
+swap32_e (unsigned in)
+{
+  in = ((in & 0xff00ff00) >>  8) | ((in & 0x00ff00ff) <<  8);
+  in = ((in & 0xffff0000) >> 16) | ((in & 0x0000ffff) << 16);
+  return in;
+}
+
+/* { dg-final { scan-tree-dump-times "32 bit bswap implementation found at" 5 "bswap" } } */
 /* { dg-final { cleanup-tree-dump "bswap" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 3c6e935..2023f2e 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2377,11 +2377,16 @@ pass_optimize_bswap::execute (function *fun)
 {
  gimple src_stmt, cur_stmt = gsi_stmt (gsi);
  tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
+ enum tree_code code;
  struct symbolic_number n;
  bool bswap;
 
- if (!is_gimple_assign (cur_stmt)
- || gimple_assign_rhs_code (cur_stmt) != BIT_IOR_EXPR)
+ if (!is_gimple_assign (cur_stmt))
+   continue;
+
+ code = gimple_assign_rhs_code (cur_stmt);
+ if (code != BIT_IOR_EXPR && code != LROTATE_EXPR
+     && code != RROTATE_EXPR)
    continue;
 
  src_stmt = find_bswap_or_nop (cur_stmt, &n, &bswap);

Testing was done by running the testsuite on an arm-none-eabi target with QEMU
emulating Cortex-M3: no regressions were found. Due to the potential increase
in compilation time, a bootstrap with a sequential build (no -j option when
calling make) and with default options was made with and without the patch.
The results show no significant increase in compilation time:

r215662 with patch:
make  6167.48s user 401.03s system 99% cpu 1:49:52.07 total

r215662 without patch
make  6136.63s user 400.32s system 99% cpu 1:49:27.28 total

Is it ok for trunk?

Best regards,

Thomas Preud'homme

[PATCH PING]Improve induction variable elimination

2014-10-07 Thread Bin Cheng

Hi,
This patch was posted a while ago as part of a series of patches at
https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01392.html .  Since the
preceding patch has changed according to review comments, and because this
one has gone unreviewed for a long time, I rebased and updated this patch
as attached.

With this patch, spec2k/fp can be improved a little on aarch64.
Bootstrapped and tested on x86_64 and x86; I am also prepared to fix any
regression in the future.  Is it OK?

2014-09-30  Bin Cheng  bin.ch...@arm.com

* tree-ssa-loop-ivopts.c (iv_nowrap_period)
(nowrap_cand_for_loop_niter_p): New functions.
(period_greater_niter_exit): New function refactored from
may_eliminate_iv.
(difference_cannot_overflow_p): Handle zero offset.
(iv_elimination_compare_lt): New parameter.  Check wrapping
behavior for candidate of wrapping type.  Handle folded forms
of may_be_zero expression.
(may_eliminate_iv): Call period_greater_niter_exit.  Pass new
argument for iv_elimination_compare_lt.

gcc/testsuite/ChangeLog
2014-09-30  Bin Cheng  bin.ch...@arm.com

* gcc.dg/tree-ssa/ivopts-lt-3.c: New test.
* gcc.dg/tree-ssa/ivopts-lt-4.c: New test.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 215108)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -4451,6 +4451,44 @@ iv_period (struct iv *iv)
   return period;
 }
 
+/* Returns no wrapping period of induction variable IV.  For now
+   only unsigned type IV is handled, we could extend it in case
+   of non-overflow for signed ones.  Return zero if it can't be
+   decided.  */
+
+static tree
+iv_nowrap_period (struct iv *iv)
+{
+  bool overflow;
+  tree type;
+  tree base = iv->base, step = iv->step;
+  widest_int base_val, step_val, max_val, span, period;
+
+  gcc_assert (step && TREE_CODE (step) == INTEGER_CST);
+
+  type = TREE_TYPE (base);
+  if (!TYPE_UNSIGNED (type) || TREE_CODE (base) != INTEGER_CST)
+return integer_zero_node;
+
+  base_val = wi::to_widest (base);
+  step_val = wi::to_widest (step);
+  if (!POINTER_TYPE_P (type) && TYPE_MAX_VALUE (type)
+      && TREE_CODE (TYPE_MAX_VALUE (type)) == INTEGER_CST)
+max_val = wi::to_widest (TYPE_MAX_VALUE (type));
+  else
+{
+  wide_int max_wi = wi::max_value (TYPE_PRECISION (type), UNSIGNED);
+  max_val = wi::to_widest (wide_int_to_tree (type, max_wi));
+}
+
+  span = max_val - base_val + step_val - 1;
+  period = wi::div_trunc (span, step_val, UNSIGNED, &overflow);
+  if (overflow)
+return integer_zero_node;
+
+  return wide_int_to_tree (type, period);
+}
+
 /* Returns the comparison operator used when eliminating the iv USE.  */
 
 static enum tree_code
@@ -4483,6 +4521,10 @@ difference_cannot_overflow_p (struct ivopts_data *
   tree e1, e2;
   aff_tree aff_e1, aff_e2, aff_offset;
 
+  /* No overflow if offset is zero.  */
+  if (integer_zerop (offset))
+return true;
+
   if (!nowrap_type_p (TREE_TYPE (base)))
 return false;
 
@@ -4538,7 +4580,84 @@ difference_cannot_overflow_p (struct ivopts_data *
 }
 }
 
-/* Tries to replace loop exit by one formulated in terms of a LT_EXPR
+/* Check whether PERIOD of CAND is greater than the number of iterations
+   described by DESC for which the exit condition is true.  The exit
+   condition is comparison against USE.  */
+
+static bool
+period_greater_niter_exit (struct ivopts_data *data,
+  struct iv_use *use, struct iv_cand *cand,
+  tree period, struct tree_niter_desc *desc)
+{
+  struct loop *loop = data->current_loop;
+
+  /* If the number of iterations is constant, compare against it directly.  */
+  if (TREE_CODE (desc->niter) == INTEGER_CST)
+    {
+      /* See cand_value_at.  */
+      if (stmt_after_increment (loop, cand, use->stmt))
+        {
+          if (!tree_int_cst_lt (desc->niter, period))
+            return false;
+        }
+      else
+        {
+          if (tree_int_cst_lt (period, desc->niter))
+            return false;
+}
+}
+
+  /* If not, and if this is the only possible exit of the loop, see whether
+ we can get a conservative estimate on the number of iterations of the
+ entire loop and compare against that instead.  */
+  else
+{
+  widest_int period_value, max_niter;
+
+      max_niter = desc->max;
+      if (stmt_after_increment (loop, cand, use->stmt))
+        max_niter += 1;
+      period_value = wi::to_widest (period);
+      if (wi::gtu_p (max_niter, period_value))
+        {
+          /* See if we can take advantage of inferred loop bound information.  */
+          if (data->loop_single_exit_p)
+            {
+              if (!max_loop_iterations (loop, &max_niter))
+return false;
+  /* The loop bound is already adjusted by adding 1.  */
+  if (wi::gtu_p (max_niter, period_value))
+return false;
+}
+
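
The diff is truncated here in the archive. As a stand-alone illustration of the arithmetic in iv_nowrap_period earlier in the patch, the sketch below applies the same formula, span = max - base + step - 1 and period = span / step, to plain 64-bit unsigned integers. nowrap_period is a hypothetical helper: the patch itself works on trees and widest_int and returns integer_zero_node where this sketch returns 0.

```c
#include <stdint.h>

/* Illustrative only: no-wrap period of an unsigned induction variable
   starting at BASE and advancing by STEP in a type whose largest value
   is MAX_VAL.  Returns 0 when the result cannot be decided.  */
uint64_t
nowrap_period (uint64_t base, uint64_t step, uint64_t max_val)
{
  uint64_t room;

  if (step == 0 || base > max_val)
    return 0;

  room = max_val - base;

  /* Guard the addition below against overflow of the host type.  */
  if (room > UINT64_MAX - (step - 1))
    return 0;

  /* span = max - base + step - 1; period = span / step.  */
  return (room + (step - 1)) / step;
}
```

For an 8-bit variable starting at 0 with step 1 this gives 255; starting at 250 with step 4 it gives 2.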

Re: [PATCH GCC]Improve candidate selecting in IVOPT

2014-10-07 Thread Bin.Cheng

Ping.  Any review comments?

Thanks,
bin

On Wed, Oct 1, 2014 at 6:31 AM, Sebastian Pop seb...@gmail.com wrote:
 Bin Cheng wrote:
 Hi,
 As analyzed in PR62178, IVOPT can't find the optimal iv set for that case.
 The problem with the current heuristic algorithm is that it only replaces
 candidates with ones not in the current solution, one by one, starting from
 a small solution.  This patch adds another heuristic which starts by
 assigning the best candidate to each iv use, then replaces candidates with
 ones already in the current solution.
 Before this patch, there are two runs of find_optimal_iv_set_1 to find the
 optimal iv sets; we name them set_a and set_b.  After this patch we will
 also have set_c.  Finally, IVOPT chooses the best one from set_a/set_b/set_c.
 To show that this patch is necessary, I collected instrumentation data for
 gcc bootstrap, spec2k and eembc, and can confirm that for some cases only the
 newly added heuristic finds the optimal iv set.  The number of cases in which
 set_c is the optimal one is comparable to that of set_b.
 As for compilation time, the newly added function is effectively one
 iteration of the previous selection algorithm, so it should be much faster
 than the previous process.
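
 To make the new heuristic concrete, here is a minimal sketch on a toy cost
 model.  It is not the IVOPT code: the names total_cost, initial_solution and
 prune_solution, the fixed table sizes, and the "per-use cost plus a fixed
 cost per selected candidate" model are all assumptions made for
 illustration.  It starts from the individually best candidate for each use,
 then tries to drop candidates by moving their uses to candidates already in
 the set.

```c
#include <assert.h>

#define N_USES 3
#define N_CANDS 3

/* Cost of an assignment: the per-use costs plus a fixed cost for every
   candidate that serves at least one use (toy model, not IVOPT's).  */
int
total_cost (int use_cost[N_USES][N_CANDS], int cand_cost[N_CANDS],
            int assign[N_USES])
{
  int used[N_CANDS] = { 0 };
  int cost = 0;

  for (int u = 0; u < N_USES; u++)
    {
      assert (assign[u] >= 0 && assign[u] < N_CANDS);
      cost += use_cost[u][assign[u]];
      used[assign[u]] = 1;
    }
  for (int c = 0; c < N_CANDS; c++)
    if (used[c])
      cost += cand_cost[c];
  return cost;
}

/* Step 1: give every use its individually cheapest candidate.  */
void
initial_solution (int use_cost[N_USES][N_CANDS], int assign[N_USES])
{
  for (int u = 0; u < N_USES; u++)
    {
      assign[u] = 0;
      for (int c = 1; c < N_CANDS; c++)
        if (use_cost[u][c] < use_cost[u][assign[u]])
          assign[u] = c;
    }
}

/* Step 2: try to drop one candidate at a time, moving its uses to the
   cheapest candidate still in the set.  Keep a change only if the total
   cost falls.  Returns the final cost.  */
int
prune_solution (int use_cost[N_USES][N_CANDS], int cand_cost[N_CANDS],
                int assign[N_USES])
{
  int best = total_cost (use_cost, cand_cost, assign);
  int improved = 1;

  while (improved)
    {
      improved = 0;
      for (int drop = 0; drop < N_CANDS; drop++)
        {
          int used[N_CANDS] = { 0 };
          int trial[N_USES];
          int ok = 1;

          for (int u = 0; u < N_USES; u++)
            used[assign[u]] = 1;
          if (!used[drop])
            continue;

          for (int u = 0; u < N_USES && ok; u++)
            {
              int repl = -1;

              trial[u] = assign[u];
              if (assign[u] != drop)
                continue;
              for (int c = 0; c < N_CANDS; c++)
                if (c != drop && used[c]
                    && (repl < 0 || use_cost[u][c] < use_cost[u][repl]))
                  repl = c;
              if (repl < 0)
                ok = 0;   /* No other candidate left in the set.  */
              else
                trial[u] = repl;
            }
          if (!ok)
            continue;

          int cost = total_cost (use_cost, cand_cost, trial);
          if (cost < best)
            {
              for (int u = 0; u < N_USES; u++)
                assign[u] = trial[u];
              best = cost;
              improved = 1;
            }
        }
    }
  return best;
}
```

 Each accepted step strictly lowers the total cost, so the loop terminates.
 The real pass works on iv_ca sets and cost pairs rather than flat tables.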

 I also added one target dependent test case.
 Bootstrap and test on x86_64, test on aarch64.  Any comments?

 I verified that the patch fixes the performance regression on intmm.  I have
 seen improvements to other benchmarks, and very small degradations that could
 very well be noise.

 Thanks for fixing this perf issue!
 Sebastian


 2014-09-30  Bin Cheng  bin.ch...@arm.com

   PR tree-optimization/62178
   * tree-ssa-loop-ivopts.c (enum sel_type): New.
   (iv_ca_add_use): Add parameter RELATED_P and find the best cand
   for iv use if it's true.
   (try_add_cand_for, get_initial_solution): Change parameter ORIGINALP
   to SELECT_TYPE and handle it.
   (find_optimal_iv_set_1): Ditto.
   (try_prune_iv_set, find_optimal_iv_set_2): New functions.
   (find_optimal_iv_set): Call find_optimal_iv_set_2 and choose the
   best candidate set.

 gcc/testsuite/ChangeLog
 2014-09-30  Bin Cheng  bin.ch...@arm.com

   PR tree-optimization/62178
   * gcc.target/aarch64/pr62178.c: New test.

 Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
 ===
 --- gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
 +++ gcc/testsuite/gcc.target/aarch64/pr62178.c(revision 0)
 @@ -0,0 +1,17 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O3" } */
 +
 +int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
 +
 +void Intmm (int run) {
 +  int i, j, k;
 +
 +  for ( i = 1; i <= 30; i++ )
 +    for ( j = 1; j <= 30; j++ ) {
 +      r[i][j] = 0;
 +      for(k = 1; k <= 30; k++ )
 +        r[i][j] += a[i][k]*b[k][j];
 +    }
 +}
 +
 +/* { dg-final { scan-assembler "ld1r\\t\{v\[0-9\]+\."} } */
 Index: gcc/tree-ssa-loop-ivopts.c
 ===
 --- gcc/tree-ssa-loop-ivopts.c(revision 215113)
 +++ gcc/tree-ssa-loop-ivopts.c(working copy)
 @@ -254,6 +254,14 @@ struct iv_inv_expr_ent
hashval_t hash;
  };

 +/* Types used to start selecting the candidate for each IV use.  */
 +enum sel_type
 +{
 +  SEL_ORIGINAL,  /* Start selecting from original cands.  */
 +  SEL_IMPORTANT, /* Start selecting from important cands.  */
 +  SEL_RELATED/* Start selecting from related cands.  */
 +};
 +
  /* The data used by the induction variable optimizations.  */

  typedef struct iv_use *iv_use_p;
 @@ -5417,22 +5425,51 @@ iv_ca_set_cp (struct ivopts_data *data, struct iv_
  }

  /* Extend set IVS by expressing USE by some of the candidates in it
 -   if possible.  Consider all important candidates if candidates in
 -   set IVS don't give any result.  */
 +   if possible.  If RELATED_P is FALSE, consider all important
 +   candidates if candidates in set IVS don't give any result;
 +   otherwise, try to find the best one from related or all candidates,
 +   depending on consider_all_candidates.  */

  static void
  iv_ca_add_use (struct ivopts_data *data, struct iv_ca *ivs,
 -struct iv_use *use)
 +struct iv_use *use, bool related_p)
  {
struct cost_pair *best_cp = NULL, *cp;
bitmap_iterator bi;
unsigned i;
struct iv_cand *cand;

 -  gcc_assert (ivs->upto >= use->id);
 +  gcc_assert (ivs->upto == use->id);
    ivs->upto++;
    ivs->bad_uses++;

 +  if (related_p)
 +    {
 +      if (data->consider_all_candidates)
 +        {
 +          for (i = 0; i < n_iv_cands (data); i++)
 +            {
 +              cand = iv_cand (data, i);
 +              cp = get_use_iv_cost (data, use, cand);
 +              if (cheaper_cost_pair (cp, best_cp))
 +                best_cp = cp;
 +            }
 +        }
 +      else
 +        {
 +          EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, i, bi)
 +            {
 +              cand = iv_cand (data, i);
 +              cp =

Re: [PATCH] add overlap function to gcov-tool

2014-10-07 Thread Jan Hubicka

 Hi,
 
 This patch adds overlap functionality to gcov-tool. The overlap score
 estimates the similarity of two profiles. Currently it only computes
 overlap for arc counters.
 
 The overlap score is defined as
 \sum minimum (p1->counter[i] / p1->sum-all, p2->counter[i] / p2->sum-all)
 where p1->counter[i] and p2->counter[i] are two matched counters from
 profile1 and profile2.
 p1->sum-all and p2->sum-all are the sum-all counters in profile1 and
 profile2, respectively.

The patch looks fine in general.  My statistics are all rusty, but can't we use
one of the established techniques like Kullback-Leibler to compare the
probability distributions? It would also be nice to have the ability to compare
branch probabilities between train runs.

Honza
 
 The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
 no match and 1.0 means a perfect match.
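
 As a worked illustration of the score defined above, here is a minimal
 sketch in C.  It is not the gcov-tool implementation: the function names
 sum_all and overlap_score and the flat counter arrays are assumptions for
 this example, and the counters are taken to be already matched by index.
 Returning 0.0 for an all-zero profile is also a choice made only for this
 sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all counter values in one profile (the "sum-all" of the text).  */
double
sum_all (const long long *counters, size_t n)
{
  double sum = 0.0;

  for (size_t i = 0; i < n; i++)
    sum += (double) counters[i];
  return sum;
}

/* Overlap of two profiles whose counters are matched by index:
   the sum over i of min (p1[i] / sum1, p2[i] / sum2).  */
double
overlap_score (const long long *p1, const long long *p2, size_t n)
{
  double sum1 = sum_all (p1, n);
  double sum2 = sum_all (p2, n);
  double score = 0.0;

  /* Sketch-only convention: an empty profile overlaps with nothing.  */
  if (sum1 == 0.0 || sum2 == 0.0)
    return 0.0;

  for (size_t i = 0; i < n; i++)
    {
      assert (p1[i] >= 0 && p2[i] >= 0);
      double f1 = (double) p1[i] / sum1;
      double f2 = (double) p2[i] / sum2;
      score += f1 < f2 ? f1 : f2;
    }
  return score;
}
```

 Two profiles with the same relative distribution score close to 1.0 even
 when their absolute counts differ, because each counter is normalized by
 its profile's sum-all before the minimum is taken.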
 
 This tool can be used in performance triaging and reducing the fdo
 training set size (where similar inputs can be pruned).
 
 Tested with spec2006 profiles.
 
 Thanks,
 
 -Rong

 2014-10-07  Rong Xu  x...@google.com
 
   * gcc/gcov-tool.c (profile_overlap): New driver function
 to compute profile overlap. 
   (print_overlap_usage_message): New.
   (overlap_usage): New.
   (do_overlap): New.
   (print_usage): Add calls to overlap function.
   (main): Ditto.
   * libgcc/libgcov-util.c (read_gcda_file): Fix format.
   (find_match_gcov_info): Ditto.
   (calculate_2_entries): New.
   (compute_one_gcov): Ditto.
   (gcov_info_count_all_cold): Ditto.
   (gcov_info_count_all_zero): Ditto.
   (extract_file_basename): Ditto.
   (get_file_basename): Ditto.
   (set_flag): Ditto.
   (matched_gcov_info): Ditto.
   (calculate_overlap): Ditto.
   (gcov_profile_overlap): Ditto.
   * libgcc/libgcov-driver.c (compute_summary): Make
 it available for external calls.
   * gcc/doc/gcov-tool.texi: Add documentation.
 
 Index: gcc/gcov-tool.c
 ===
 --- gcc/gcov-tool.c   (revision 215981)
 +++ gcc/gcov-tool.c   (working copy)
 @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
  #include <getopt.h>
  
  extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
  extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
  extern int gcov_profile_scale (struct gcov_info*, float, int, int);
  extern struct gcov_info* gcov_read_profile_dir (const char*, int);
 @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
return ret;
  }
  
 +/* Driver function to compute the overlap score between profiles D1 and D2.
 +   Return 1 on error and 0 if OK.  */
 +
 +static int
 +profile_overlap (const char *d1, const char *d2)
 +{
 +  struct gcov_info *d1_profile;
 +  struct gcov_info *d2_profile;
 +
 +  d1_profile = gcov_read_profile_dir (d1, 0);
 +  if (!d1_profile)
 +return 1;
 +
 +  if (d2)
 +{
 +  d2_profile = gcov_read_profile_dir (d2, 0);
 +  if (!d2_profile)
 +return 1;
 +
 +  return gcov_profile_overlap (d1_profile, d2_profile);
 +}
 +
 +  return 1;
 +}
 +
 +/* Usage message for profile overlap.  */
 +
 +static void
 +print_overlap_usage_message (int error_p)
 +{
 +  FILE *file = error_p ? stderr : stdout;
 +
 +  fnotice (file, "  overlap [options] <dir1> <dir2>       Compute the overlap of two profiles\n");
 +  fnotice (file, "    -v, --verbose                       Verbose mode\n");
 +  fnotice (file, "    -h, --hotonly                       Only print info for hot objects/functions\n");
 +  fnotice (file, "    -f, --function                      Print function level info\n");
 +  fnotice (file, "    -F, --fullname                      Print full filename\n");
 +  fnotice (file, "    -o, --object                        Print object level info\n");
 +  fnotice (file, "    -t <float>, --hot_threshold <float> Set the threshold for hotness\n");
 +
 +}
 +
 +static const struct option overlap_options[] =
 +{
 +  { "verbose",       no_argument,       NULL, 'v' },
 +  { "function",      no_argument,       NULL, 'f' },
 +  { "fullname",      no_argument,       NULL, 'F' },
 +  { "object",        no_argument,       NULL, 'o' },
 +  { "hotonly",       no_argument,       NULL, 'h' },
 +  { "hot_threshold", required_argument, NULL, 't' },
 +  { 0, 0, 0, 0 }
 +};
 +
 +/* Print overlap usage and exit.  */
 +
 +static void
 +overlap_usage (void)
 +{
 +  fnotice (stderr, "Overlap subcommand usage:");
 +  print_overlap_usage_message (true);
 +  exit (FATAL_EXIT_CODE);
 +}
 +
 +int overlap_func_level;
 +int overlap_obj_level;
 +int overlap_hot_only;
 +int overlap_use_fullname;
 +double overlap_hot_threshold = 0.005;
 +
 +/* Driver for profile overlap sub-command.  */
 +
 +static int
 +do_overlap (int argc, char **argv)
 +{
 +  int opt;
 +  int ret;
 +
 +  optind = 0;

[PATCH] Fix PR bootstrap/63432 in jump threading

2014-10-07 Thread Teresa Johnson

This patch addresses PR bootstrap/63432 which was an insanity in the
probabilities created during jump threading. This was caused by threading
multiple times along the same path leading to the second jump thread path being
corrupted, which in turn caused the profile update code to fail. There
was code in mark_threaded_blocks that intended to catch and suppress
these cases of threading multiple times along the same path, but it
was sensitive to the order in which the paths were discovered and recorded.
This patch changes the detection to do two passes and removes the ordering
sensitivity.
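
For illustration, here is a minimal sketch of the two-pass idea on a toy
model.  It is not the code in mark_threaded_blocks: the toy_path type, the
owner table and mark_paths are hypothetical, edges are plain integer ids,
and a cancelled path is only flagged rather than deleted.  The first pass
records every path on its starting edge; the second pass cancels any path
whose later edges start another recorded path, so the outcome does not
depend on the order in which paths were discovered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES 16
#define MAX_PATH_LEN 4

/* A jump thread path as a list of edge ids (toy model).  */
struct toy_path
{
  int n_edges;
  int edges[MAX_PATH_LEN];
  bool cancelled;
};

void
mark_paths (struct toy_path *paths, size_t n_paths)
{
  int owner[MAX_EDGES];   /* Index of the path starting at each edge.  */

  for (int e = 0; e < MAX_EDGES; e++)
    owner[e] = -1;

  /* Pass 1: record each path on its starting edge.  */
  for (size_t i = 0; i < n_paths; i++)
    {
      assert (paths[i].n_edges >= 1 && paths[i].n_edges <= MAX_PATH_LEN);
      int start = paths[i].edges[0];
      assert (start >= 0 && start < MAX_EDGES);

      paths[i].cancelled = false;
      if (owner[start] < 0)
        owner[start] = (int) i;
      else
        paths[i].cancelled = true;   /* Edge already starts a path.  */
    }

  /* Pass 2: cancel a path if a later edge on it starts another path.
     All starting edges are known by now, whatever the discovery order.  */
  for (size_t i = 0; i < n_paths; i++)
    {
      if (paths[i].cancelled)
        continue;
      for (int j = 1; j < paths[i].n_edges; j++)
        {
          int e = paths[i].edges[j];
          assert (e >= 0 && e < MAX_EDGES);
          if (owner[e] >= 0)
            {
              paths[i].cancelled = true;
              break;
            }
        }
    }
}
```

With a single pass, a path recorded before the overlapping downstream path
would slip through, which is the ordering sensitivity described above.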

Also, while fixing this I realized that the previous method of checking
the entry BB's profile count was not an accurate way to determine whether
the function has any non-zero profile counts. Created a new routine
to walk the path and see if it has all zero profile counts and estimated
frequencies.
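
The check can be pictured with a small standalone sketch.  This is a
simplification, not the routine in the patch: toy_block and
path_uses_estimated_freqs are hypothetical names, and the path is a flat
array of blocks rather than edges with predecessor and successor lists.
A path qualifies only if every count on it is zero and at least one
estimated frequency is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a basic block: a profile count and an estimated
   frequency.  */
struct toy_block
{
  long count;
  int frequency;
};

/* Return true if all blocks on PATH have zero profile counts and at
   least one has a nonzero estimated frequency.  */
bool
path_uses_estimated_freqs (const struct toy_block *path, size_t n)
{
  bool non_zero_freq = false;

  for (size_t i = 0; i < n; i++)
    {
      assert (path[i].frequency >= 0);
      if (path[i].count != 0)
        return false;   /* Real profile data is present.  */
      non_zero_freq |= path[i].frequency != 0;
    }
  return non_zero_freq;
}
```

Looking only at the entry block would miss a function whose entry count is
zero while blocks along the path still carry nonzero counts.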

Bootstrapped and tested on x86_64-unknown-linux-gnu. Also did an LTO
profiledbootstrap.

Ok for trunk?

Thanks,
Teresa

2014-10-07  Teresa Johnson  tejohn...@google.com

PR bootstrap/63432.
* tree-ssa-threadupdate.c (estimated_freqs_path): New function.
(ssa_fix_duplicate_block_edges): Invoke it.
(mark_threaded_blocks): Make two passes to avoid ordering dependences.

Index: tree-ssa-threadupdate.c
===
--- tree-ssa-threadupdate.c (revision 215830)
+++ tree-ssa-threadupdate.c (working copy)
@@ -959,6 +959,43 @@ update_joiner_offpath_counts (edge epath, basic_bl
 }


+/* Check if the paths through RD all have estimated frequencies but zero
+   profile counts.  This is more accurate than checking the entry block
+   for a zero profile count, since profile insanities sometimes creep in.  */
+
+static bool
+estimated_freqs_path (struct redirection_data *rd)
+{
+  edge e = rd->incoming_edges->e;
+  vec<jump_thread_edge *> *path = THREAD_PATH (e);
+  edge ein;
+  edge_iterator ei;
+  bool non_zero_freq = false;
+  FOR_EACH_EDGE (ein, ei, e->dest->preds)
+    {
+      if (ein->count)
+        return false;
+      non_zero_freq |= ein->src->frequency != 0;
+    }
+
+  for (unsigned int i = 1; i < path->length (); i++)
+    {
+      edge epath = (*path)[i]->e;
+      if (epath->src->count)
+        return false;
+      non_zero_freq |= epath->src->frequency != 0;
+      edge esucc;
+      FOR_EACH_EDGE (esucc, ei, epath->src->succs)
+        {
+          if (esucc->count)
+            return false;
+          non_zero_freq |= esucc->src->frequency != 0;
+        }
+    }
+  return non_zero_freq;
+}
+
+
 /* Invoked for routines that have guessed frequencies and no profile
counts to record the block and edge frequencies for paths through RD
in the profile count fields of those blocks and edges.  This is because
@@ -1058,9 +1095,11 @@ ssa_fix_duplicate_block_edges (struct redirection_
  data we first take a snapshot of the existing block and edge frequencies
  by copying them into the empty profile count fields.  These counts are
  then used to do the incremental updates, and cleared at the end of this
- routine.  */
+ routine.  If the function is marked as having a profile, we still check
+ to see if the paths through RD are using estimated frequencies because
+ the routine had zero profile counts.  */
   bool do_freqs_to_counts = (profile_status_for_fn (cfun) != PROFILE_READ
- || !ENTRY_BLOCK_PTR_FOR_FN (cfun)->count);
+ || estimated_freqs_path (rd));
   if (do_freqs_to_counts)
 freqs_to_counts_path (rd);

@@ -2077,35 +2116,52 @@ mark_threaded_blocks (bitmap threaded_blocks)

   /* Now iterate again, converting cases where we want to thread
  through a joiner block, but only if no other edge on the path
- already has a jump thread attached to it.  */
+ already has a jump thread attached to it.  We do this in two passes,
+ to avoid situations where the order in the paths vec can hide overlapping
+ threads (the path is recorded on the incoming edge, so we would miss
+ cases where the second path starts at a downstream edge on the same
+ path).  First record all joiner paths, deleting any in the unexpected
+ case where there is already a path for that incoming edge.  */
   for (i = 0; i < paths.length (); i++)
     {
       vec<jump_thread_edge *> *path = paths[i];

       if ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK)
+        {
+          /* Attach the path to the starting edge if none is yet recorded.  */
+          if ((*path)[0]->e->aux == NULL)
+            (*path)[0]->e->aux = path;
+          else if (dump_file && (dump_flags & TDF_DETAILS))
+            dump_jump_thread_path (dump_file, *path, false);
+        }
+    }
+  /* Second, look for paths that have any other jump thread attached to
+     them, and either finish converting them or cancel them.  */
+  for (i = 0; i < paths.length (); i++)
+    {
+      vec<jump_thread_edge *

Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-10-07 Thread Jeff Law


On 10/06/14 19:31, Bin.Cheng wrote:

On Tue, Oct 7, 2014 at 1:20 AM, Mike Stump mikest...@comcast.net wrote:

On Oct 6, 2014, at 4:32 AM, Richard Biener richard.guent...@gmail.com wrote:

On Mon, Oct 6, 2014 at 11:57 AM, Bin.Cheng amker.ch...@gmail.com wrote:

How many merging opportunities does sched2 undo again?  ISTR it
has the tendency of pushing stores down and loads up.


So, the pass works by merging 2 or more loads into 1 load (at least on my 
port).  sched2 would need to rip apart 1 load into 2 loads to be able to undo 
the real work.  The non-real work, doesn't matter any.  Can sched2 rip apart a 
single load?

On ARM and AARCH64, the two merged load/store are transformed into
single parallel insn by the following peephole2 pass, so that sched2
would not undo the fusion work.  I thought sched2 works on the basis of
instructions, and it isn't good practice to have sched2 do split work.

It's certainly advantageous for sched2 to split insns that generate 
multiple instructions.  Running after register allocation, sched2 is 
ideal for splitting because we know the alternative for each insn 
and thus we can (possibly for the first time) accurately know if a 
particular insn will generate multiple assembly instructions.


If the port has a splitter to rip apart a double-word load into 
single-word loads, then we'd obviously only want to do that in cases 
where the double-word load actually generates > 1 assembly instruction.


Addressing issues in that space seems out of scope for Bin's work to me, 
except perhaps for such issues on aarch64/arm which are Bin's primary 
concerns.


jeff

Re: [PATCH] Fix PR bootstrap/63432 in jump threading

2014-10-07 Thread Jeff Law


On 10/07/14 22:39, Teresa Johnson wrote:

This patch addresses PR bootstrap/63432 which was an insanity in the
probabilities created during jump threading. This was caused by threading
multiple times along the same path leading to the second jump thread path being
corrupted, which in turn caused the profile update code to fail. There
was code in mark_threaded_blocks that intended to catch and suppress
these cases of threading multiple times along the same path, but it
was sensitive to the order in which the paths were discovered and recorded.
This patch changes the detection to do two passes and removes the ordering
sensitivity.

Also, while fixing this I realized that the previous method of checking
the entry BB's profile count was not an accurate way to determine whether
the function has any non-zero profile counts. Created a new routine
to walk the path and see if it has all zero profile counts and estimated
frequencies.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Also did an LTO
profiledbootstrap.

Ok for trunk?

Thanks,
Teresa

2014-10-07  Teresa Johnson  tejohn...@google.com

 PR bootstrap/63432.
 * tree-ssa-threadupdate.c (estimated_freqs_path): New function.
 (ssa_fix_duplicate_block_edges): Invoke it.
 (mark_threaded_blocks): Make two passes to avoid ordering dependences.

OK.

Thanks,
jeff

Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

2014-10-07 Thread Bin.Cheng

On Wed, Oct 8, 2014 at 1:28 PM, Jeff Law l...@redhat.com wrote:
 On 10/06/14 19:31, Bin.Cheng wrote:

 On Tue, Oct 7, 2014 at 1:20 AM, Mike Stump mikest...@comcast.net wrote:

 On Oct 6, 2014, at 4:32 AM, Richard Biener richard.guent...@gmail.com
 wrote:

 On Mon, Oct 6, 2014 at 11:57 AM, Bin.Cheng amker.ch...@gmail.com
 wrote:

 How many merging opportunities does sched2 undo again?  ISTR it
 has the tendency of pushing stores down and loads up.


 So, the pass works by merging 2 or more loads into 1 load (at least on my
 port).  sched2 would need to rip apart 1 load into 2 loads to be able to
 undo the real work.  The non-real work, doesn't matter any.  Can sched2 rip
 apart a single load?

 On ARM and AARCH64, the two merged load/store are transformed into
 single parallel insn by the following peephole2 pass, so that sched2
 would not undo the fusion work.  I thought sched2 works on the basis of
 instructions, and it isn't good practice to have sched2 do split work.

 It's certainly advantageous for sched2 to split insns that generate multiple
 instructions.  Running after register allocation, sched2 is ideal for
 splitting because we know the alternative for each insn and thus we can
 (possibly for the first time) accurately know if a particular insn will
 generate multiple assembly instructions.

 If the port has a splitter to rip apart a double-word load into single-word
 loads, then we'd obviously only want to do that in cases where the
 double-word load actually generates > 1 assembly instruction.

 Addressing issues in that space seems out of scope for Bin's work to me,
 except perhaps for such issues on aarch64/arm which are Bin's primary
 concerns.


Hi Jeff,

Thanks very much for the explanation.  Very likely I am wrong here,
but it seems what you mentioned fits pass_split_before_sched2 very
well.  Then I guess it would be nice if we could differentiate the cases
in the first place by generating different patterns, rather than
splitting some of the instructions later.  Though I have no idea whether
we can do that or not.

For arm/aarch64, I guess it's not an issue, otherwise the peephole2
wouldn't work at all.  ARM maintainers should have an answer to this.



 jeff
