[c++-delayed-folding] cp_fold_r

2015-09-17 Thread Jason Merrill
I think we want to clear *walk_subtrees a lot more often in cp_fold_r; 
as it is, for most expressions we end up calling cp_fold on the 
full-expression, then uselessly on the subexpressions after we already 
folded the containing expression.


Jason


[c++-delayed-folding] code review

2015-09-17 Thread Jason Merrill

@@ -216,6 +216,7 @@ libgcov-driver-tool.o-warn = -Wno-error
 libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
+insn-modes.o-warn = -Wno-error


This doesn't seem to be needed anymore.


@@ -397,11 +397,13 @@ convert_to_real (tree type, tree expr)
EXPR must be pointer, integer, discrete (enum, char, or bool), float,
fixed-point or vector; in other cases error is called.

+   If DOFOLD is TRZE, we try to simplify newly-created patterns by folding.


"TRUE"


+  if (!dofold)
+{
+ expr = build1 (CONVERT_EXPR,
+lang_hooks.types.type_for_size
+  (TYPE_PRECISION (intype), 0),
+expr);
+ return build1 (CONVERT_EXPR, type, expr);
+   }


When we're not folding, I don't think we want to do the two-step 
conversion, just the second one.  And we might want to use NOP_EXPR 
instead of CONVERT_EXPR, but I'm not sure about that.



@@ -818,10 +828,15 @@ convert_to_integer (tree type, tree expr)
if (TYPE_UNSIGNED (typex))
  typex = signed_type_for (typex);
  }
-   return convert (type,
-   fold_build2 (ex_form, typex,
-convert (typex, arg0),
-convert (typex, arg1)));
+   if (dofold)
+ return convert (type,
+ fold_build2 (ex_form, typex,
+  convert (typex, arg0),
+  convert (typex, arg1)));
+   arg0 = build1 (CONVERT_EXPR, typex, arg0);
+   arg1 = build1 (CONVERT_EXPR, typex, arg1);
+   expr = build2 (ex_form, typex, arg0, arg1);
+   return build1 (CONVERT_EXPR, type, expr);


This code path seems to be for pushing a conversion down into a binary 
expression.  We shouldn't do this at all when we aren't folding.



@@ -845,9 +860,14 @@ convert_to_integer (tree type, tree expr)

if (!TYPE_UNSIGNED (typex))
  typex = unsigned_type_for (typex);
+   if (!dofold)
+ return build1 (CONVERT_EXPR, type,
+build1 (ex_form, typex,
+build1 (CONVERT_EXPR, typex,
+TREE_OPERAND (expr, 0;


Likewise.


@@ -867,6 +887,14 @@ convert_to_integer (tree type, tree expr)
 the conditional and never loses.  A COND_EXPR may have a throw
 as one operand, which then has void type.  Just leave void
 operands as they are.  */
+ if (!dofold)
+   return build3 (COND_EXPR, type, TREE_OPERAND (expr, 0),
+  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 1)))
+  ? TREE_OPERAND (expr, 1)
+  : build1 (CONVERT_EXPR, type, TREE_OPERAND (expr, 
1)),
+  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 2)))
+  ? TREE_OPERAND (expr, 2)
+  : build1 (CONVERT_EXPR, type, TREE_OPERAND (expr, 
2)));


Likewise.


@@ -903,6 +933,10 @@ convert_to_integer (tree type, tree expr)
   return build1 (FIXED_CONVERT_EXPR, type, expr);

 case COMPLEX_TYPE:
+  if (!dofold)
+   return build1 (CONVERT_EXPR, type,
+  build1 (REALPART_EXPR,
+  TREE_TYPE (TREE_TYPE (expr)), expr));


Why can't we call convert here rather than build1 a CONVERT_EXPR?

It would be good to ask a fold/convert maintainer to review the changes 
to this file, too.



@@ -5671,8 +5668,8 @@ build_new_op_1 (location_t loc, enum tree_code code, int 
flags, tree arg1,
 decaying an enumerator to its value.  */
  if (complain & tf_warning)
warn_logical_operator (loc, code, boolean_type_node,
-  code_orig_arg1, arg1,
-  code_orig_arg2, arg2);
+  code_orig_arg1, fold (arg1),
+  code_orig_arg2, fold (arg2));

  arg2 = convert_like (conv, arg2, complain);
}
@@ -5710,7 +5707,7 @@ build_new_op_1 (location_t loc, enum tree_code code, int 
flags, tree arg1,
 case TRUTH_OR_EXPR:
   if (complain & tf_warning)
warn_logical_operator (loc, code, boolean_type_node,
-  code_orig_arg1, arg1, code_orig_arg2, arg2);
+  code_orig_arg1, fold (arg1), code_orig_arg2, 
fold (arg2));
   /* Fall through.  */
 case GT_EXPR:
 case LT_EXPR:
@@ -5721,9 +5718,10 @@ build_new_op_1 (location_t loc, enum tree_code code, int 
flags, tree arg1,
   if 

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-17 Thread Richard Biener
On Wed, 16 Sep 2015, Alan Lawrence wrote:

> On 16/09/15 17:10, Bill Schmidt wrote:
> > 
> > On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
> > > On 16/09/15 15:28, Bill Schmidt wrote:
> > > > 2015-09-16  Bill Schmidt  
> > > > 
> > > >   * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX,
> > > > UNSPEC_REDUC_SMIN,
> > > >   UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL,
> > > >   UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
> > > >   UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
> > > >   (reduc_smax_v2di): New define_expand.
> > > >   (reduc_smax_scal_v2di): Likewise.
> > > >   (reduc_smin_v2di): Likewise.
> > > >   (reduc_smin_scal_v2di): Likewise.
> > > >   (reduc_umax_v2di): Likewise.
> > > >   (reduc_umax_scal_v2di): Likewise.
> > > >   (reduc_umin_v2di): Likewise.
> > > >   (reduc_umin_scal_v2di): Likewise.
> > > >   (reduc_smax_v4si): Likewise.
> > > >   (reduc_smin_v4si): Likewise.
> > > >   (reduc_umax_v4si): Likewise.
> > > >   (reduc_umin_v4si): Likewise.
> > > >   (reduc_smax_v8hi): Likewise.
> > > >   (reduc_smin_v8hi): Likewise.
> > > >   (reduc_umax_v8hi): Likewise.
> > > >   (reduc_umin_v8hi): Likewise.
> > > >   (reduc_smax_v16qi): Likewise.
> > > >   (reduc_smin_v16qi): Likewise.
> > > >   (reduc_umax_v16qi): Likewise.
> > > >   (reduc_umin_v16qi): Likewise.
> > > >   (reduc_smax_scal_): Likewise.
> > > >   (reduc_smin_scal_): Likewise.
> > > >   (reduc_umax_scal_): Likewise.
> > > >   (reduc_umin_scal_): Likewise.
> > > 
> > > You shouldn't need the non-_scal reductions. Indeed, they shouldn't be
> > > used if
> > > the _scal are present. The non-_scal's were previously defined as
> > > producing a
> > > vector with one element holding the result and the other elements all
> > > zero, and
> > > this was only ever used with a vec_extract immediately after; the _scal
> > > pattern
> > > now includes the vec_extract as well. Hence the non-_scal patterns are
> > > deprecated / considered legacy, as per md.texi.
> > 
> > Thanks -- I had misread the description of the non-scalar versions,
> > missing the part where the other elements are zero.  What I really
> > want/need is an optab defined as computing the maximum value in all
> > elements of the vector.
> 
> Yes, indeed. It seems reasonable to me that this would coexist with an optab
> which computes only a single value (i.e. a scalar).

So just to clarify - you need to reduce the vector with max to a scalar
but want the (same) result in all vector elements?

> At that point it might be appropriate to change the cond-reduction code to
> generate the reduce-to-vector in all cases, and optabs.c expand it to
> reduce-to-scalar + broadcast if reduce-to-vector was not available. Along with
> the (parallel) changes to cost model already proposed, does that cover all the
> cases? It does add a new tree code, yes, but I'm feeling that could be
> justified if we go down this route.

I'd rather have combine recognize an insn that does both (if it
exists).  As I understand powerpc doesn't have reduction to scalar
(I think no ISA actually has this, but all ISAs have reduce to
one vector element plus a cheap way of extraction (effectively a subreg))
but it's reduction already has all result vector elements set to the
same value (as opposed to having some undefined or zero or whatever).

> However, another point that springs to mind: if you reduce a loop containing
> OR or MUL expressions, the vect_create_epilog_for_reduction reduces these
> using shifts, and I think will also use shifts for platforms not possessing a
> reduc_plus/min/max. If shifts could be changed for rotates, the code there
> would do your reduce-to-a-vector-of-identical-elements in the midend...can we
> (sensibly!) bring all of these together?
>
> > Perhaps the practical thing is to have the vectorizer also do an
> > add_stmt_cost with some new token that indicates the cost model should
> > make an adjustment if the back end doesn't need the extract/broadcast.
> > Targets like PowerPC and AArch32 could then subtract the unnecessary
> > cost, and remove the unnecessary code in simplify-rtx.
> 
> I think it'd be good if we could do it before simplify-rtx, really, although
> I'm not sure I have a strong argument as to why, as long as we can cost it
> appropriately.
> 
> > In any case, I will remove implementing the deprecated optabs, and I'll
> > also try to look at Alan L's patch shortly.
> 
> That'd be great, thanks :)

As for the cost modeling the simple solution is of course to have
these as "high-level" costs (a new enum entry), but the sustainable
solution is to have the target see the vectorized code and decide
for itself.  Iff we'd go down the simple route then the target
may as well create the 

[patch committed SH] Fix build failure

2015-09-17 Thread Kaz Kojima
I've committed the attached obvious fix for build failure for SH.
object_allocator is changed so to remove the 2nd argument of its
constructor.

Regards,
kaz
--
2015-09-17  Kaz Kojima  

* config/sh/sh.c (label_ref_list_d_pool): Adjust to
object_allocator change.

diff --git a/config/sh/sh.c b/config/sh/sh.c
index 25149a6..ec0abc5 100644
--- a/config/sh/sh.c
+++ b/config/sh/sh.c
@@ -4659,7 +4659,7 @@ typedef struct label_ref_list_d
 } *label_ref_list_t;
 
 static object_allocator label_ref_list_d_pool
-  ("label references list", 30);
+  ("label references list");
 
 /* The SH cannot load a large constant into a register, constants have to
come from a pc relative load.  The reference of a pc relative load


Re: [PATCH] Convert SPARC to LRA

2015-09-17 Thread David Miller
From: David Miller 
Date: Mon, 14 Sep 2015 11:30:29 -0700 (PDT)

> There are some other issues I'm having troubles resolving for 64-bit
> native bootstraps as well, and I am probably going to revert the LRA
> sparc changes unless I can resolve them by the end of today.

So I was actually able to resolve these problems, and committed the
following patch.

The big take-aways are:

1) Since we can only access SFmode/SImode/etc. in FP_REGS but not
   necessarily in EXTRA_FP_REGS, we have to tell LRA that secondary
   memory is needed when moving such a mode between those two classes.

2) HARD_REGNO_CALLER_SAVE_MODE's default should really be reconsidered.
   If the caller has an explicit mode in mind we should just use it
   instead just going to choose_hard_reg_mode().  That causes stupid
   things like using a DFmode register to spill an SFmode value and
   all the subregging that results from that.

3) It's amazing how much we get away with in RELOAD, particular wrt.
   to addressing.  Here all of the "(set x (HIGH (SYMBOLIC...)))"
   were always available even when flag_pic and this was never
   noticed before.


[PATCH] Fix LRA regressions on 64-bit SPARC.

gcc/
* config/sparc/sparc-protos.h (sparc_secondary_memory_needed):
Declare.
* config/sparc/sparc.c (sparc_secondary_memory_needed): New
function.
* config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): Use it.
(HARD_REGNO_CALLER_SAVE_MODE): Define.
* config/sparc/sparc.md (sethi_di_medlow, losum_di_medlow, seth44)
(setm44, setl44, sethh, setlm, sethm, setlo, embmedany_sethi)
(embmedany_losum, embmedany_brsum, embmedany_textuhi)
(embmedany_texthi, embmedany_textulo, embmedany_textlo): Do not
provide when flag_pic.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@227847 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog   | 14 ++
 gcc/config/sparc/sparc-protos.h |  2 ++
 gcc/config/sparc/sparc.c| 20 
 gcc/config/sparc/sparc.h| 14 +-
 gcc/config/sparc/sparc.md   | 32 
 5 files changed, 61 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 76566ca..42faf2e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2015-09-17  David S. Miller  
+
+   * config/sparc/sparc-protos.h (sparc_secondary_memory_needed):
+   Declare.
+   * config/sparc/sparc.c (sparc_secondary_memory_needed): New
+   function.
+   * config/sparc/sparc.h (SECONDARY_MEMORY_NEEDED): Use it.
+   (HARD_REGNO_CALLER_SAVE_MODE): Define.
+   * config/sparc/sparc.md (sethi_di_medlow, losum_di_medlow, seth44)
+   (setm44, setl44, sethh, setlm, sethm, setlo, embmedany_sethi)
+   (embmedany_losum, embmedany_brsum, embmedany_textuhi)
+   (embmedany_texthi, embmedany_textulo, embmedany_textlo): Do not
+   provide when flag_pic.
+
 2015-09-17  Kaz Kojima  
 
* config/sh/sh.c (label_ref_list_d_pool): Adjust to
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index 1431437..18192fd 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -62,6 +62,8 @@ extern bool constant_address_p (rtx);
 extern bool legitimate_pic_operand_p (rtx);
 extern rtx sparc_legitimize_reload_address (rtx, machine_mode, int, int,
int, int *win);
+extern bool sparc_secondary_memory_needed (enum reg_class, enum reg_class,
+  machine_mode);
 extern void load_got_register (void);
 extern void sparc_emit_call_insn (rtx, rtx);
 extern void sparc_defer_case_vector (rtx, rtx, int);
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index b41800c..f4ad68d 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -12283,6 +12283,26 @@ sparc_expand_vector_init (rtx target, rtx vals)
   emit_move_insn (target, mem);
 }
 
+bool sparc_secondary_memory_needed (enum reg_class class1, enum reg_class 
class2,
+   machine_mode mode)
+{
+  if (FP_REG_CLASS_P (class1) != FP_REG_CLASS_P (class2))
+{
+  if (! TARGET_VIS3
+ || GET_MODE_SIZE (mode) > 8
+ || GET_MODE_SIZE (mode) < 4)
+   return true;
+  return false;
+}
+
+  if (GET_MODE_SIZE (mode) == 4
+  && ((class1 == FP_REGS && class2 == EXTRA_FP_REGS)
+ || (class1 == EXTRA_FP_REGS && class2 == FP_REGS)))
+return true;
+
+  return false;
+}
+
 /* Implement TARGET_SECONDARY_RELOAD.  */
 
 static reg_class_t
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index 8343671..1f26232 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -747,6 +747,12 @@ extern int sparc_mode_class[];
register window instruction in the prologue.  */
 #define 

Re: [PATCH, ARM]: Fix static interworking call

2015-09-17 Thread Kyrill Tkachov


On 17/09/15 09:46, Christian Bruel wrote:

As obvious, bad operand number.

OK for trunk ?

Christian


p1.patch


2015-09-18  Christian Bruel

* config/arm/arm.md (*call_value_symbol): Fix operand for interworking.

2015-09-18  Christian Bruel

* gcc.target/arm/attr_thumb-static2.c: New test.


Ok, assuming testing is clean.
Thanks,
Kyrill



Re: Openacc launch API

2015-09-17 Thread Bernd Schmidt

Since Jakub appears to be busy, I'll give my 2 cents.

On 08/25/2015 03:29 PM, Nathan Sidwell wrote:

I  did rename the GOACC_parallel entry point to GOACC_parallel_keyed and
provide a forwarding function. However, as the mkoffload data is
incompatible, this is probably overkill.  I've had to increment the
(just committed) version number to detect the change in data
representation.  So any attempt to run an old binary with a new libgomp
will fail at the loading point.


Fail how? Jakub has requested that it works but falls back to 
unaccelerated execution, can you confirm this is what you expect to 
happen with this patch?



+/* Varadic launch arguments.  */
+#define GOMP_LAUNCH_END0  /* End of args, no dev or op */
+#define GOMP_LAUNCH_DIM1  /* Launch dimensions, op = mask */
+#define GOMP_LAUNCH_ASYNC  2  /* Async, op = cst val if not MAX  */
+#define GOMP_LAUNCH_WAIT   3  /* Waits, op = num waits.  */
+#define GOMP_LAUNCH_CODE_SHIFT 28
+#define GOMP_LAUNCH_DEVICE_SHIFT 16
+#define GOMP_LAUNCH_OP_SHIFT 0
+#define GOMP_LAUNCH_PACK(CODE,DEVICE,OP)   \
+  (((CODE) << GOMP_LAUNCH_CODE_SHIFT)\
+   | ((DEVICE) << GOMP_LAUNCH_DEVICE_SHIFT)  \
+   | ((OP) << GOMP_LAUNCH_OP_SHIFT))
+#define GOMP_LAUNCH_CODE(X) (((X) >> GOMP_LAUNCH_CODE_SHIFT) & 0xf)
+#define GOMP_LAUNCH_DEVICE(X) (((X) >> GOMP_LAUNCH_DEVICE_SHIFT) & 0xfff)
+#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0x)
+#define GOMP_LAUNCH_OP_MAX 0x


I probably would have used something simpler, like a code/device/op 
argument triplet, but I guess this ok.



-  if (num_waits)
+  va_start (ap, kinds);
+  /* TODO: This will need amending when device_type is implemented.  */


I'd expect that this will check whether the device type in the argument 
is either zero (or whatever indicates all devices) or matches the 
current device. Is that what you intend?



+  while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0)
+!= (tag = va_arg (ap, unsigned)))


That's a somewhat non-idiomatic way to write this, with the constant 
first and not obviously a constant. I'd initialize a variable with the 
constant before the loop.



+  assert (!GOMP_LAUNCH_DEVICE (tag));


Uh, that seems unfriendly, and not exactly forwards compatible. Can that 
fail a bit more gracefully? (Alternatively, implement the device_type 
stuff now so that we don't have TODOs in the code and don't have to 
worry about compatibility issues.)



+  if (num_waits > 8)
+gomp_fatal ("Too many waits for legacy interface");


How did you arrive at this number?



+GOACC_2.0,1 {
+  global:
+   GOACC_parallel_keyed;
+} GOACC_2.0;


Did you mean to use a comma?


+  if (dims[GOMP_DIM_GANG] != 1)
+GOMP_PLUGIN_fatal ("non-unity num_gangs (%d) not supported",
+  dims[GOMP_DIM_GANG]);
+  if (dims[GOMP_DIM_WORKER] != 1)
+GOMP_PLUGIN_fatal ("non-unity num_workers (%d) not supported",
+  dims[GOMP_DIM_WORKER]);


I see that this is just moved here (which is good), but is this still a 
limitation? Or is that on trunk only?



+  for (comma = "", id = var_ids; id; comma = ",", id = id->next)
+fprintf (out, "%s\n\t%s", comma, id->ptx_name);


The comma trick is new to me, I'll have to remember this one.


+static void
+set_oacc_fn_attrib (tree fn, tree clauses, vec *args)
+tree
+get_oacc_fn_attrib (tree fn)


These need function comments.


+{
+  /* Must match GOMP_DIM ordering.  */
+  static const omp_clause_code ids[] =
+{OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH};


Formatting. No = at the end of a line, and whitespace around braces.


@@ -9150,6 +9245,7 @@ expand_omp_target (struct omp_region *re
  }

gimple g;
+  bool tagging = false;
/* The maximum number used by any start_ix, without varargs.  */


That looks misindented, but may be an email client thing.


+   else if (!tagging)


Oh... so tagging controls two different methods for constructing 
argument lists, one for GOACC_parallel and the other for whatever OMP 
uses? That's a bit unfortunate, I'll need to think about it for a bit or 
defer to Jakub.


Looks reasonable otherwise.


Bernd



Re: [PATCH v2][GCC] Algorithmic optimization in match and simplify

2015-09-17 Thread Richard Biener
On Thu, Sep 3, 2015 at 1:11 PM, Andre Vieira
 wrote:
> On 01/09/15 15:01, Richard Biener wrote:
>>
>> On Tue, Sep 1, 2015 at 3:40 PM, Andre Vieira
>>  wrote:
>>>
>>> Hi Marc,
>>>
>>> On 28/08/15 19:07, Marc Glisse wrote:


 (not a review, I haven't even read the whole patch)

 On Fri, 28 Aug 2015, Andre Vieira wrote:

> 2015-08-03  Andre Vieira  
>
>* match.pd: Added new patterns:
>  ((X {&,<<,>>} C0) {|,^} C1) {^,|} C2)
>  (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>}
> C1)



 +(for op0 (rshift rshift lshift lshift bit_and bit_and)
 + op1 (bit_ior bit_xor bit_ior bit_xor bit_ior bit_xor)
 + op2 (bit_xor bit_ior bit_xor bit_ior bit_xor bit_ior)

 You can nest for-loops, it seems clearer as:
 (for op0 (rshift lshift bit_and)
 (for op1 (bit_ior bit_xor)
  op2 (bit_xor bit_ior)
>>>
>>>
>>>
>>> Will do, thank you for pointing it out.



 +(simplify
 + (op2:c
 +  (op1:c
 +   (op0 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)

 I suspect you will want more :s (single_use) and less :c
 (canonicalization
 should put constants in second position).

>>> I can't find the definition of :s (single_use).
>>
>>
>> Sorry for that - didn't get along updating it yet :/  It restricts
>> matching to
>> sub-expressions that have a single-use.  So
>>
>> +  a &= 0xd123;
>> +  unsigned short tem = a ^ 0x6040;
>> +  a = tem | 0xc031; /* Simplify _not_ to ((a & 0xd123) | 0xe071).  */
>> ... use of tem ...
>>
>> we shouldn't do the simplifcation here because the expression
>> (a & 0x123) ^ 0x6040 is kept live by 'tem'.
>>
>>> GCC internals do point out
>>> that canonicalization does put constants in the second position, didnt
>>> see
>>> that first. Thank you for pointing it out.
>>>
 +   C1 = wi::bit_and_not (C1,C2);

 Space after ','.

>>> Will do.
>>>
 Having wide_int_storage in many places is surprising, I can't find
 similar
 code anywhere else in gcc.


>>>
>>> I tried looking for examples of something similar, I think I ended up
>>> using
>>> wide_int because I was able to convert easily to and from it and it has
>>> the
>>> "mask" and "wide_int_to_tree" functions. I welcome any suggestions on
>>> what I
>>> should be using here for integer constant transformations and
>>> comparisons.
>>
>>
>> Using wide-ints is fine, but you shouldn't need 'wide_int_storage'
>> constructors - those
>> are indeed odd.  Is it just for the initializers of wide-ints?
>>
>> +wide_int zero_mask = wi::zero (prec);
>> +wide_int C0 = wide_int_storage (@1);
>> +wide_int C1 = wide_int_storage (@2);
>> +wide_int C2 = wide_int_storage (@3);
>> ...
>> +   zero_mask = wide_int_storage (wi::mask (C0.to_uhwi (), false,
>> prec));
>>
>> tree_to_uhwi (@1) should do the trick as well
>>
>> +   C1 = wi::bit_and_not (C1,C2);
>> +   cst_emit = wi::bit_or (C1, C2);
>>
>> the ops should be replacable with @2 and @3, the store to C1 obviously not
>> (but you can use a tree temporary and use wide_int_to_tree here to avoid
>> the back-and-forth for the case where C1 is not assigned to).
>>
>> Note that transforms only doing association are prone to endless recursion
>> in case some other pattern does the reverse op...
>>
>> Richard.
>>
>>
>>> BR,
>>> Andre
>>>
>>
> Thank you for all the comments, see reworked version:
>
> Two new algorithmic optimisations:
>   1.((X op0 C0) op1 C1) op2 C2)
> with op0 = {&, >>, <<}, op1 = {|,^}, op2 = {|,^} and op1 != op2
> zero_mask has 1's for all bits that are sure to be 0 in (X op0 C0)
> and 0's otherwise.
> if (op1 == '^') C1 &= ~C2 (Only changed if actually emitted)
> if ((C1 & ~zero_mask) == 0) then emit (X op0 C0) op2 (C1 op2 C2)
> if ((C2 & ~zero_mask) == 0) then emit (X op0 C0) op1 (C1 op2 C2)
>   2. (X {|,^,&} C0) {<<,>>} C1 -> (X {<<,>>} C1) {|,^,&} (C0 {<<,>>} C1)
>
>
> This patch does two algorithmic optimisations that target patterns like:
> (((x << 24) | 0x00FF) ^ 0xFF00) and ((x ^ 0x4002) >> 1) |
> 0x8000.
>
> The transformation uses the knowledge of which bits are zero after
> operations like (X {&,<<,(unsigned)>>}) to combine constants, reducing
> run-time operations.
> The two examples above would be transformed into (X << 24) ^ 0x and
> (X >> 1) ^ 0xa001 respectively.
>
> The second transformation enables more applications of the first. Also some
> targets may benefit from delaying shift operations. I am aware that such an
> optimization, in combination with one or more optimizations that cause the
> reverse transformation, may lead to an infinite loop. Though such behavior
> has not been detected during regression testing and bootstrapping on
> aarch64.

+/* (X bit_op C0) rshift C1 -> (X rshift C0) bit_op (C0 rshift C1) 

[PATCH] Improve genmatch errors

2015-09-17 Thread Richard Biener

For a misplaced :s like

(simplify
 (plus (minus@1:s @1 @2) (minus @2 @0))
 (plus @1 @0))

we were giving

test.pd:2:16 error: not implemented: predicate on leaf operand
 (plus (minus@1:s @1 @2) (minus @2 @0))
   ^

which is at least confusing.  The following patch improves this to

test.pd:2:16 error: expected expression operand
 (plus (minus@1:s @1 @2) (minus @2 @0))
   ^

also handling any other garbage after operands and rejecting

test.pd:2:21 error: expected expression operand
 (plus (minus:s@1 @1@2) (minus @2 @0))
^
which was previously accepted.

Bootstrap on x86_64-unknown-linux-gnu in progress, no changes in
generated files.

Richard.

2015-09-17  Richard Biener  

* genmatch.c (parser::parse_expr): Improve error message
for mis-placed flags.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 227779)
+++ gcc/genmatch.c  (working copy)
@@ -3857,6 +3858,9 @@ parser::parse_expr ()
  e->expr_type = expr_type;
  return op;
}
+  else if (!(token->flags & PREV_WHITE))
+   fatal_at (token, "expected expression operand");
+
   e->append_op (parse_op ());
 }
   while (1);


Re: [PATCH] Target hook for disabling the delay slot filler.

2015-09-17 Thread Bernd Schmidt

On 09/17/2015 11:52 AM, Simon Dardis wrote:

The profitability of using an ordinary branch over a delay slot branch
depends on how the delay slot is filled. If a delay slot can be filled from
an instruction preceding the branch or instructions proceeding that must be
executed on both sides then it is profitable to use a delay slot branch.

For cases when instructions are chosen from one side of the branch,
the proposed optimization strategy is to not speculatively execute
instructions when ordinary branches could be used. Performance-wise
this avoids executing instructions which the eager delay filler picked
wrongly.

Since most branches have a compact form disabling the eager delay filler
should be no worse than altering it not to fill delay slots in this case.


Ok, so in that case I think the patch would be reasonable if the target 
hook was named appropriately to say something like don't speculate when 
filling delay slots. It looks like fill_eager_delay_slots always 
speculates, so you needn't change your approach in reorg.c.
Possibly place the hook after rtx_cost/address_cost in target.def since 
it's cost related.



Bernd


Re: [PATCH][RTL-ifcvt] PR rtl-optimization/67465: Handle pairs of complex+simple blocks and empty blocks more gracefully

2015-09-17 Thread Rainer Orth
Hi Kyrill,

> On 11/09/15 09:51, Rainer Orth wrote:
>> Kyrill Tkachov  writes:
>>
>>> On 10/09/15 12:43, Rainer Orth wrote:
 Hi Kyrill,

> Rainer, could you please check that this patch still fixes the SPARC
> regressions?
 unfortunately, it breaks sparc-sun-solaris2.10 bootstrap: compiling
 stage2 libiberty/regex.c FAILs:


>>> Thanks for providing the preprocessed file.
>>> I've reproduced and fixed the ICE in this version of the patch.
>>> The problem was that I was taking the mode of x before the check
>>> of whether a and b are MEMs, after which we would change x to an
>>> address_mode reg,
>>> thus confusing emit_move_insn.
>>>
>>> The fix is to take the mode of x and perform the
>>> can_conditionally_move_p check
>>> after that transformation.
>>>
>>> Bootstrapped and tested on aarch64 and x86_64.
>>> The preprocessed regex.i that Rainer provided now compiles successfully
>>> for me
>>> on a sparc-sun-solaris2.10 stage-1 cross-compiler.
>>>
>>> Rainer, thanks for your help so far, could you please try out this patch?
>> While bootstrap succeeds again, the testsuite regression in
>> gcc.c-torture/execute/20071216-1.c reoccured.
> Right, so I dug into the RTL dumps and I think this is a separate issue
> that's being exacerbated by my patch.
> The code tries to if-convert a block which contains a compare instruction
> i.e. sets the CC register.
> Now, bb_valid_for_noce_process_p should have caught this, and in particular
> insn_valid_noce_process_p
> which should check that the instruction doesn't set the CC
> register. However, on SPARC the
> cc_in_cond returns NULL! This is due to the canonicalize_comparison
> implementation that seems to
> remove the CC register from the condition expression and returns something 
> like:
> (leu (reg/v:SI 109 [ b ])
> (const_int -4096 [0xf000])
>
> Therefore the set_of (cc_in_cond (cond), insn) check doesn't get triggered
> because cc_in_cond returns NULL.
> Regardless of how the branch condition got canonicalized, I think we still
> want to reject any insn in the block
> that sets a condition code register, so this patch checks the destination
> of every set in the block for a MODE_CC
> expression and cancels if-conversion if that's the case.
>
> Oleg pointed me to the older PR 58517 affecting SH which seems similar and
> I think my previous ifcvt patch would expose
> this problem more.
>
> Anyway, with this patch the failing SPARC testcase
> gcc.c-torture/execute/20071216-1.c generates the same assembly
> as before r227368 and bootstrap and test on aarch64 and x86_64 passes ok for 
> me.
>
> Rainer, could you try this patch on top of the previous patch?
> (https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00689.html)
> The two together should fix all of PR 67456, 67464, 67465 and 67481.

sorry this took me so long: we've had a major switch failure and my
sparc machines are way slow.

Anyway, here's what I found: the two patches on top of each other do
bootstrap just fine and the gcc.c-torture/execute/20071216-1.c
regressions are gone.  However, it introduces a new one:

FAIL: gcc.dg/torture/stackalign/sibcall-1.c   -O1 -fpic execution test

It fails for both 32 and 64-bit.  The testcase SEGVs:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x00010bb0 in ix86_split_ashr (mode=1)
at 
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/torture/stackalign/sibcall-1.c:20
20: gen_x86_64_shrd) (0);
(gdb) where
#0  0x00010bb0 in ix86_split_ashr (mode=1)
at 
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/torture/stackalign/sibcall-1.c:20
#1  0x00010be4 in main ()
at 
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/torture/stackalign/sibcall-1.c:27
1: x/i $pc
=> 0x10bb0 :ld  [ %g1 ], %g1
(gdb) p/x $g1
$1 = 0x317f8

truss reveals:

Incurred fault #6, FLTBOUNDS  %pc = 0x00010BB0
  siginfo: SIGSEGV SEGV_MAPERR addr=0x000317F8
Received signal #11, SIGSEGV [default]
  siginfo: SIGSEGV SEGV_MAPERR addr=0x000317F8

with

#define FLTBOUNDS   6   /* Memory bounds (invalid address) */

and indeed that address isn't mapped according to

ro@apoc 58 > pmap 26536
26536:  /var/gcc/regression/trunk/11-gcc/build/gcc/testsuite/gcc/sibcall-1.exe
0001   8K r-x--  
/var/gcc/regression/trunk/11-gcc/build/gcc/testsuite/gcc/sibcall-1.exe
0002   8K rwx--  
/var/gcc/regression/trunk/11-gcc/build/gcc/testsuite/gcc/sibcall-1.exe
FEE6 696K r-x--  /lib/libm.so.2
FEF1C000  16K rwx--  /lib/libm.so.2
FF181464K r-x--  /lib/libc.so.1
FF2FE000  48K rwx--  /lib/libc.so.1
FF35  24K rwx--[ anon ]
FF36   8K rw---[ anon ]
FF37   8K rw---[ anon ]
FF38   8K rw---[ anon ]
FF39   8K rw---[ anon ]
FF3A 248K r-x--  /lib/ld.so.1
FF3EE000  16K rwx--  /lib/ld.so.1
FFBFC000  16K rwx--[ stack ]
 total  

Re: [PATCH] PR28901 -Wunused-variable ignores unused const initialised variables

2015-09-17 Thread Gerald Pfeifer
On Sun, 13 Sep 2015, Mark Wielaard wrote:
> Slightly adjusted patch attached. Now it is explicit that the warning is
> enabled by -Wunused-variable for C, but not C++. There are testcases for
> both C and C++ to check the defaults. And the hardcoded override is
> removed for C++, so the user could enable it if they want.

I believe making -Wunused-const-variable part of -Wall is not
a good idea.

For example, I have a nightly build of Wine with a nightly build 
of GCC.  And my notifaction mail went from twently lines of warning 
to 6500 -- all coming from header files.

Gerald


Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Jonathan Wakely

On 16/09/15 17:42 -0600, Martin Sebor wrote:

I see now the first exists test will detect symlink loops in
the original path. But I'm not convinced there isn't a corner
case that's subject to a TOCTOU race condition between the first
exists test and the while loop during which a symlink loop can
be introduced.

Suppose we call the function with /foo/bar as an argument and
the path exists and contains no symlinks. result is / and cmpts
is set to { foo, bar }. Just as the loop is entered, /foo/bar
is replaced with a symlink containing /foo/bar. The loop then
proceeds like so:

1. The first iteration removes foo from cmpts and sets result
to /foo. cmpts is { bar }.

2. The second iteration removes bar from cmpts, sets result to
/foo/bar, determines it's a symlink, reads its contents, sees
it's an absolute pathname and replaces result with /. It then
inserts the symlink's components { foo, bar } into cmpts. cmpts
becomes { foo, bar }. exists(result) succeeds.

3. The next iteration of the loop has the same initial state
as the first.

But I could have very easily missed something that takes care
of this corner case. If I did, sorry for the false alarm!


No, you're right. The TS says such filesystem races are undefined:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4099.html#fs.race.behavior
but it would be nice to fail gracefully rather than DOS the
application.

The simplest approach would be to increment a counter every time we
follow a symlink, and if it reaches some limit decide something is
wrong and fail with ELOOP.

I don't see how anything else can be 100% bulletproof, because a truly
evil attacker could just keep altering the contents of symlinks so we
keep ping-ponging between two or more paths. If we keep track of paths
we've seen before the attacker could just keep changing the contents
to a unique path each time, that initially exists as a file, but by
the time we get to is_symlink() its become a symlink to a new path.

So if we use a counter, what's a sane maximum? Is MAXSYMLINKS in
 the value the kernel uses? 20 seems quite low, I was
thinking of a much higher number.



Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Jonathan Wakely

On 17/09/15 12:16 +0100, Jonathan Wakely wrote:

So if we use a counter, what's a sane maximum? Is MAXSYMLINKS in
 the value the kernel uses? 20 seems quite low, I was
thinking of a much higher number.


Until very recently Linux seemed to hardcode it to 40:
https://github.com/torvalds/linux/blob/v4.1/fs/namei.c#L865

That changed in v4.2 and I'm not sure what it does not, if this is the
relevant bit of code it uses MAXSYMLINKS:
https://github.com/torvalds/linux/blob/v4.2/fs/namei.c#L1647




Re: [PATCH] Fix default_binds_local_p_2 for extern protected data

2015-09-17 Thread Szabolcs Nagy

ping 2.

this patch is needed for working visibility ("protected")
attribute for extern data on targets using default_binds_local_p_2.
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01871.html

On 10/08/15 12:04, Szabolcs Nagy wrote:


ping.

On 22/07/15 18:01, Szabolcs Nagy wrote:

The commit
https://gcc.gnu.org/viewcvs/gcc?view=revision=222184
changed a true to false in varasm.c:

   bool
   default_binds_local_p_2 (const_tree exp)
   {
-  return default_binds_local_p_3 (exp, flag_shlib != 0, true, true);
+  return default_binds_local_p_3 (exp, flag_shlib != 0, true, false,
+ !flag_pic);
   }

where

   default_binds_local_p_3 (const_tree exp, bool shlib, bool weak_dominate,
-bool extern_protected_data)
+bool extern_protected_data, bool common_local_p)
   {

false means that extern protected data binds locally,
which is wrong if the target can have copy relocations
against it (then the address must be loaded from GOT
otherwise the main executable will see different address).

Currently S/390, ARM and AArch64 targets use this predicate
and the current default is wrong for all of them (they can
have copy relocs) so I changed the default instead of doing
it in a target specific way.

The equivalent x86_64 bug was
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248
the default was changed for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65780
now i opened
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66912
for arm and aarch64.

Needs a further binutils patch too to emit R_*_GLOB_DAT
instead of R_*_RELATIVE relocs for protected data.
The glibc elf/tst-protected1a and elf/tst-protected1b
tests depend on this.

Tested ARM and AArch64 targets.

gcc/ChangeLog:

2015-07-22  Szabolcs Nagy  

PR target/66912
* varasm.c (default_binds_local_p_2): Turn on extern_protected_data.

gcc/testsuite/ChangeLog:

2015-07-22  Szabolcs Nagy  

PR target/66912
* gcc.target/aarch64/pr66912.c: New.
* gcc.target/arm/pr66912.c: New.







Re: [patch match.pd c c++]: Ignore results of 'shorten_compare' and move missing patterns in match.pd

2015-09-17 Thread Kai Tietz
2015-09-17 11:12 GMT+02:00 Richard Biener :
> On Wed, Sep 16, 2015 at 9:51 PM, Jeff Law  wrote:
>> On 09/15/2015 03:42 AM, Kai Tietz wrote:
>
> 2015-09-08  Kai Tietz  
>
>   * gcc.dg/tree-ssa/vrp23.c: Adjust testcase to reflect that
>   pattern is matching now already within forward-propagation pass.
>   * gcc.dg/tree-ssa/vrp24.c: Likewise.


 So for the new patterns, I would have expected testcases to ensure
 they're
 getting used.
>>>
>>>
>>> We were talking about that.  My approach was to disable shortening
>>> code and see what regressions we get.  For C++ our test-coverage isn't
>>> that good, as we didn't had here any regressions, but for C testcases
>>> we got some.  Eg the testcase vrp25.c is one of them
>>
>> But as I noted, I think that simply looking at testsuite regressions after
>> disabling shorten_compare is not sufficient as I don't think we have
>> significant coverage of this code.
>>
>>>
>>> By disabling "shorten_compare" one of most simple testcases, which is
>>> failing, is:
>>>
>>> int foo (short s, short t)
>>> {
>>>int a = (int) s;
>>>int b = (int) t;
>>>
>>>return a >= b;
>>> }
>>
>> And I would argue it's precisely this kind of stuff we should be building
>> tests around and resolving prior to actually disabling shorten_compare.
>>
>>
>>>
>>> Where we miss to do narrow down SSA-tree comparison to simply s >= t;
>>> If we disable fre-pass ( -fno-tree-fre) for current trunk, we can see
>>> that this pattern gets resolved in forward-propagation pass due
>>> invocation of shorten_compare.
>>>
>>> Another simple one (with disabled "shorten_compare") is:
>>>
>>> The following testcase:
>>>
>>> int foo (unsigned short a)
>>> {
>>>unsigned int b = 0;
>>>return (unsigned int) a) < b;
>>> }
>>>
>>> Here we lacked the detection of ((unsigned int) a) < b being a < 0
>>> (which is always false).
>>
>> And again, this deserves a test and resolving prior to disabling
>> shorten_compare.
>>
>>
>>
>>>
 In *theory* one ought to be able to look at the dumps or .s files before
 after this patch for a blob of tests and see that nothing significant has
 changed.  Unfortunately, so much changes that it's hard to evaluate if
 this
 patch is a step forward or a step back.
>>>
>>>
>>> Hmm, actually we deal here with obvious patterns from
>>> "shorten_compare".  Of interest are here the narrowing-lines on top of
>>> this function, which we need to reflect in match.pd too, before we can
>>> deprecate it. I don't get that there are so much changes?  This are in
>>> fact just 3 basic patterns '(convert) X  (convert) Y',
>>> '((convert) X)  CST', and a more specialized variant for the last
>>> pattern for '==/!=' case.
>>>
 I wonder if we'd do better to first add new match.pd patterns, one at a
 time, with tests, and evaluating them along the way by looking at the
 dumps
 or .s files across many systems.  Then when we think we've got the right
 set, then look at what happens to those dumps/.s files if we make the
 changes so that shorten_compare really just emits warnings.
>>>
>>>
>>> As the shorten_compare function covers those transformations, I don't
>>> see that this is something leading to something useful.  As long as we
>>> call shorten_compare on folding in FEs, we won't see those patterns in
>>> ME that easy.  And even more important is, that patterns getting
>>> resolved if function gets invoked.
>>
>> It will tell you what will be missed from a code generation standpoint if
>> you disable shorten_compare.  And that is the best proxy we have for
>> performance regressions related to disabling shorten_compare.
>>
>> ie, in a local tree, you disable shorten_compare.  Then compare the code we
>> generate with and without shorten compare.  Reduce to a simple testcase.
>> Resolve the issue with a match.pd pattern (most likely) and repeat the
>> process.  In theory at each step the  number of things to deeply investigate
>> should be diminishing while at the same time you're building match.pd
>> patterns and tests we can include in the testsuite. And each step is likely
>> a new patch in the patch series -- the final patch of which disables
>> shorten_compare.
>>
>> It's a lot of work, but IMHO, it's necessary to help ensure we don't have
>> code generation regressions.
>
> As said in the other reply I successfully used gcc_unreachable ()s in
> code in intend to remove and then inspect/fix all fallout from that during
> bootstrap & test ...

Yes, that would be fine, if shorten_compare would be part of classical
fold-machinery.  But it use isn't there.  It gets just used in
frontend's binary-operation generation in C/C++'s build_binary_op
function.
As we don't want (and can) remove shorten_compare completely, due we
still need atleast its diagnostics,  using of gcc_unreachable ()
doesn't look suitable.
Also the request to 

Re: [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

2015-09-17 Thread Richard Biener
On Wed, Sep 16, 2015 at 5:45 PM, Manuel López-Ibáñez
 wrote:
> On 16 September 2015 at 15:33, Richard Biener
>  wrote:
>> On Wed, Sep 16, 2015 at 3:22 PM, Michael Matz  wrote:
 if we suggest 'foo' instead of foz then we'll get a more confusing followup
 error if we actually use it.
>>>
>>> This particular case could be solved by ruling out candidaten of the wrong
>>> kind (here, something that can be assigned to, vs. a function).  But it
>>> might actually be too early in parsing to say that there will be an
>>> assignment.  I don't think _this_ problem should block the patch.
>
> Indeed. The patch by David does not try to fix-up the code, it merely
> suggests a possible candidate. The follow-up errors should be the same
> before and after. Such suggestions will never be 100% right, even if
> the suggestion makes the code compile and run, it may still be the
> wrong one. A wrong suggestion is far less serious than a wrong
> uninitialized or Warray-bounds warning and we can live with those. Why
> this needs to be perfect from the very beginning?
>
> BTW, there is a PR for this: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52277
>
>> I wonder if we can tentatively parse with the choice at hand, only allowing
>> (and even suggesting?) it if that works out.
>
> This would require to queue the error, fix-up the wrong name and
> continue parsing. If there is another error, ignore that one and emit
> the original error without suggestion. The problem here is that we do
> not know if the additional error is actually caused by the fix-up we
> did or it is an already existing error. It would be equally terrible
> to emit errors caused by the fix-up or emit just a single error for
> the typo. We would need to roll-back the tentative parse and do a
> definitive parse anyway. This does not seem possible at the moment
> because the parsers maintain a lot of global state that is not easy to
> roll-back. We cannot simply create a copy of the parser state and
> throw it away later to continue as if the tentative parse has not
> happened.
>
> I'm not even sure if, in general, one can stop at the statement level
> or we would need to parse the whole function (or translation unit) to
> be able to tell if the suggestion is a valid candidate.

I was suggesting to only tentatively finish parsing the "current construct".
No idea how to best figure that out to the extend to make the tentative
parse useful.  Say, if we have "a + s.foz" and the field foz is not there
but foo is, so if we continue parsing with 'foo' instead but 'foo' will have
a type that makes "a + s.foo" invalid then we probably shouldn't suggest
it.  It _might_ be reasonably "easy" to implement that, but I'm not sure.
There might be a field named fz (with same or bigger levenstein distance)
with the correct type.  Of course it might have been I misspelled
's' and meant 'r' instead which has a field foz of corect type... (and 's'
is available as well).

I agree that we don't have to solve all this in the first iteration.

Richard.

> Cheers,
>
> Manuel.


****ping****: [Patch, fortran] PR40054 and PR63921 - Implement pointer function assignment - redux

2015-09-17 Thread Paul Richard Thomas
Could somebody review this please?

Thanks

Paul


-- Forwarded message --
From: Paul Richard Thomas 
Date: 6 September 2015 at 18:40
Subject: Re: [Patch, fortran] PR40054 and PR63921 - Implement pointer
function assignment - redux
To: Dominique Dhumieres , "fort...@gcc.gnu.org"
, gcc-patches 


It helps to attach the patch :-)

Paul

On 6 September 2015 at 13:42, Paul Richard Thomas
 wrote:
> Dear All,
>
> The attached patch more or less implements the assignment of
> expressions to the result of a pointer function. To wit:
>
> my_ptr_fcn (arg1, arg2...) = expr
>
> arg1 would usually be the target, pointed to by the function. The
> patch parses these statements and resolves them into:
>
> temp_ptr => my_ptr_fcn (arg1, arg2...)
> temp_ptr = expr
>
> I say more or less implemented because I have ducked one of the
> headaches here. At the end of the specification block, there is an
> ambiguity between statement functions and pointer function
> assignments. I do not even try to resolve this ambiguity and require
> that there be at least one other type of executable statement before
> these beasts. This can undoubtedly be fixed but the effort seems to me
> to be unwarranted at the present time.
>
> This version of the patch extends the coverage of allowed rvalues to
> any legal expression. Also, all the problems with error.c have been
> dealt with by Manuel's patch.
>
> I am grateful to Dominique for reminding me of PR40054 and pointing
> out PR63921. After a remark of his on #gfortran, I fixed the checking
> of the standard to pick up all the offending lines with F2003 and
> earlier.
>
>
> Bootstraps and regtests on FC21/x86_64 - OK for trunk?
>
> Cheers
>
> Paul
>
> 2015-09-06  Paul Thomas  
>
> PR fortran/40054
> PR fortran/63921
> * decl.c (get_proc_name): Return if statement function is
> found.
> * match.c (gfc_match_ptr_fcn_assign): New function.
> * match.h : Add prototype for gfc_match_ptr_fcn_assign.
> * parse.c : Add static flag 'in_specification_block'.
> (decode_statement): If in specification block match a statement
> function, otherwise if standard embraces F2008 try to match a
> pointer function assignment.
> (parse_interface): Set 'in_specification_block' on exiting from
> parse_spec.
> (parse_spec): Set and then reset 'in_specification_block'.
> (gfc_parse_file): Set 'in_specification_block'.
> * resolve.c (get_temp_from_expr): Extend to include other
> expressions than variables and constants as rvalues.
> (resolve_ptr_fcn_assign): New function.
> (gfc_resolve_code): Call resolve_ptr_fcn_assign.
> * symbol.c (gfc_add_procedure): Add a sentence to the error to
> flag up the ambiguity between a statement function and pointer
> function assignment at the end of the specification block.
>
> 2015-09-06  Paul Thomas  
>
> PR fortran/40054
> PR fortran/63921
> * gfortran.dg/fmt_tab_1.f90: Change from run to compile and set
> standard as legacy.
> * gfortran.dg/ptr_func_assign_1.f08: New test.
> * gfortran.dg/ptr_func_assign_2.f08: New test.



--
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 227508)
--- gcc/fortran/decl.c  (working copy)
*** get_proc_name (const char *name, gfc_sym
*** 901,906 
--- 901,908 
  return rc;
  
sym = *result;
+   if (sym->attr.proc == PROC_ST_FUNCTION)
+ return rc;
  
if (sym->attr.module_procedure
&& sym->attr.if_source == IFSRC_IFBODY)
Index: gcc/fortran/match.c
===
*** gcc/fortran/match.c (revision 227508)
--- gcc/fortran/match.c (working copy)
*** match
*** 4886,4892 
  gfc_match_st_function (void)
  {
gfc_error_buffer old_error;
- 
gfc_symbol *sym;
gfc_expr *expr;
match m;
--- 4886,4891 
*** gfc_match_st_function (void)
*** 4926,4931 
--- 4925,4990 
return MATCH_YES;
  
  undo_error:
+   gfc_pop_error (_error);
+   return MATCH_NO;
+ }
+ 
+ 
+ /* Match an assignment to a pointer function (F2008). This could, in
+general be ambiguous with a statement function. In this implementation
+it remains so if it is the first statement after the specification
+block.  */
+ 
+ match
+ gfc_match_ptr_fcn_assign (void)
+ {
+   gfc_error_buffer old_error;
+   locus old_loc;
+   gfc_symbol *sym;
+   gfc_expr *expr;
+   match m;
+   char name[GFC_MAX_SYMBOL_LEN + 1];
+ 
+   old_loc = gfc_current_locus;
+   m = gfc_match_name (name);
+ 

Re: [patch match.pd c c++]: Ignore results of 'shorten_compare' and move missing patterns in match.pd

2015-09-17 Thread Richard Biener
On Wed, Sep 16, 2015 at 9:51 PM, Jeff Law  wrote:
> On 09/15/2015 03:42 AM, Kai Tietz wrote:

 2015-09-08  Kai Tietz  

   * gcc.dg/tree-ssa/vrp23.c: Adjust testcase to reflect that
   pattern is matching now already within forward-propagation pass.
   * gcc.dg/tree-ssa/vrp24.c: Likewise.
>>>
>>>
>>> So for the new patterns, I would have expected testcases to ensure
>>> they're
>>> getting used.
>>
>>
>> We were talking about that.  My approach was to disable shortening
>> code and see what regressions we get.  For C++ our test-coverage isn't
>> that good, as we didn't had here any regressions, but for C testcases
>> we got some.  Eg the testcase vrp25.c is one of them
>
> But as I noted, I think that simply looking at testsuite regressions after
> disabling shorten_compare is not sufficient as I don't think we have
> significant coverage of this code.
>
>>
>> By disabling "shorten_compare" one of most simple testcases, which is
>> failing, is:
>>
>> int foo (short s, short t)
>> {
>>int a = (int) s;
>>int b = (int) t;
>>
>>return a >= b;
>> }
>
> And I would argue it's precisely this kind of stuff we should be building
> tests around and resolving prior to actually disabling shorten_compare.
>
>
>>
>> Where we miss to do narrow down SSA-tree comparison to simply s >= t;
>> If we disable fre-pass ( -fno-tree-fre) for current trunk, we can see
>> that this pattern gets resolved in forward-propagation pass due
>> invocation of shorten_compare.
>>
>> Another simple one (with disabled "shorten_compare") is:
>>
>> The following testcase:
>>
>> int foo (unsigned short a)
>> {
>>unsigned int b = 0;
>>return (unsigned int) a) < b;
>> }
>>
>> Here we lacked the detection of ((unsigned int) a) < b being a < 0
>> (which is always false).
>
> And again, this deserves a test and resolving prior to disabling
> shorten_compare.
>
>
>
>>
>>> In *theory* one ought to be able to look at the dumps or .s files before
>>> after this patch for a blob of tests and see that nothing significant has
>>> changed.  Unfortunately, so much changes that it's hard to evaluate if
>>> this
>>> patch is a step forward or a step back.
>>
>>
>> Hmm, actually we deal here with obvious patterns from
>> "shorten_compare".  Of interest are here the narrowing-lines on top of
>> this function, which we need to reflect in match.pd too, before we can
>> deprecate it. I don't get that there are so much changes?  This are in
>> fact just 3 basic patterns '(convert) X  (convert) Y',
>> '((convert) X)  CST', and a more specialized variant for the last
>> pattern for '==/!=' case.
>>
>>> I wonder if we'd do better to first add new match.pd patterns, one at a
>>> time, with tests, and evaluating them along the way by looking at the
>>> dumps
>>> or .s files across many systems.  Then when we think we've got the right
>>> set, then look at what happens to those dumps/.s files if we make the
>>> changes so that shorten_compare really just emits warnings.
>>
>>
>> As the shorten_compare function covers those transformations, I don't
>> see that this is something leading to something useful.  As long as we
>> call shorten_compare on folding in FEs, we won't see those patterns in
>> ME that easy.  And even more important is, that patterns getting
>> resolved if function gets invoked.
>
> It will tell you what will be missed from a code generation standpoint if
> you disable shorten_compare.  And that is the best proxy we have for
> performance regressions related to disabling shorten_compare.
>
> ie, in a local tree, you disable shorten_compare.  Then compare the code we
> generate with and without shorten compare.  Reduce to a simple testcase.
> Resolve the issue with a match.pd pattern (most likely) and repeat the
> process.  In theory at each step the  number of things to deeply investigate
> should be diminishing while at the same time you're building match.pd
> patterns and tests we can include in the testsuite. And each step is likely
> a new patch in the patch series -- the final patch of which disables
> shorten_compare.
>
> It's a lot of work, but IMHO, it's necessary to help ensure we don't have
> code generation regressions.

As said in the other reply I successfully used gcc_unreachable ()s in
code in intend to remove and then inspect/fix all fallout from that during
bootstrap & test ...

Yeah, it's a lot of work (sometimes).  But it's way easier than to investigate
code changes (which may also miss cases as followup transforms may end
up causing the same code to be generated).

Richard.

>
>
>
>>
>> So I would suggest here to disable shorten_compare invocation and
>> cleanup with fallout detectable in C-FE's testsuite.
>
> But without deeper analysis at the start & patches+testcases for those
> issues, we have a real risk of regressing the code we generate.
>
>>
>>> My worry is that we get part way through the conversion and end up with
>>> the
>>> match.pd 

RE: [PATCH] Target hook for disabling the delay slot filler.

2015-09-17 Thread Simon Dardis
The profitability of using an ordinary branch over a delay slot branch
depends on how the delay slot is filled. If a delay slot can be filled from
an instruction preceding the branch or instructions proceeding that must be 
executed on both sides then it is profitable to use a delay slot branch.

For cases when instructions are chosen from one side of the branch, 
the proposed optimization strategy is to not speculatively execute 
instructions when ordinary branches could be used. Performance-wise
this avoids executing instructions which the eager delay filler picked
wrongly.

Since most branches have a compact form disabling the eager delay filler
should be no worse than altering it not to fill delay slots in this case.

Thanks,
Simon

-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: 15 September 2015 16:02
To: Bernd Schmidt; Simon Dardis; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Target hook for disabling the delay slot filler.

On 09/15/2015 08:27 AM, Bernd Schmidt wrote:
> On 09/15/2015 04:19 PM, Simon Dardis wrote:
>> This patch adds a target hook for disabling the eager delay slot 
>> filler which when disabled can give better code. No new regressions.
>> Ok to commit?
>
> Hmm. Whether a branch was filled by the simple or eager filler is an 
> implementation detail - is there some better way to describe which 
> kind of branch is profitable?
And more importantly, it's far better to be able to describe when it is not 
profitable to use eager filling rather than just disabling it completely.

Jeff


Re: Openacc launch API

2015-09-17 Thread Nathan Sidwell

On 09/17/15 05:36, Bernd Schmidt wrote:


Fail how? Jakub has requested that it works but falls back to unaccelerated
execution, can you confirm this is what you expect to happen with this patch?


Yes, that is the failure mode.


-  if (num_waits)
+  va_start (ap, kinds);
+  /* TODO: This will need amending when device_type is implemented.  */


I'd expect that this will check whether the device type in the argument is
either zero (or whatever indicates all devices) or matches the current device.
Is that what you intend?


correct.


+  while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0)
+ != (tag = va_arg (ap, unsigned)))


That's a somewhat non-idiomatic way to write this, with the constant first and
not obviously a constant. I'd initialize a variable with the constant before the
loop.


Hm, yeah, that is a little unpleasant.  The alternative is IIRC a mid-loop 
break.

(If only this was C++  then we could write:

  while (int tag = va_arg (ap, unsigned)) { ... }

relying on GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0) being zero.  Maybe the 
C-ified version of that would be clearer?)



+  assert (!GOMP_LAUNCH_DEVICE (tag));

Uh, that seems unfriendly, and not exactly forwards compatible. Can that fail a


I suppose.  I was thinking of this as an internal interface from the compiler, 
but I guess down the road some one  could try  running a device_type implemented 
binary from a future compiler with an old libgomp.



+  if (num_waits > 8)
+gomp_fatal ("Too many waits for legacy interface");


How did you arrive at this number?


The voice in my head.  I've only seen code that had up to 2 waits, so I figured 
8 was way plenty.



+GOACC_2.0,1 {
+  global:
+GOACC_parallel_keyed;
+} GOACC_2.0;


Did you mean to use a comma?


Probably.




+  if (dims[GOMP_DIM_GANG] != 1)
+GOMP_PLUGIN_fatal ("non-unity num_gangs (%d) not supported",
+   dims[GOMP_DIM_GANG]);
+  if (dims[GOMP_DIM_WORKER] != 1)
+GOMP_PLUGIN_fatal ("non-unity num_workers (%d) not supported",
+   dims[GOMP_DIM_WORKER]);


I see that this is just moved here (which is good), but is this still a
limitation? Or is that on trunk only?


Trunk only.  It'll go away shortly when more patches get merged.



+static void
+set_oacc_fn_attrib (tree fn, tree clauses, vec *args)
+tree
+get_oacc_fn_attrib (tree fn)


These need function comments.


oops.


+  /* Must match GOMP_DIM ordering.  */
+  static const omp_clause_code ids[] =
+{OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH};


Formatting. No = at the end of a line, and whitespace around braces.


oh, I thought intialization placed the = where I had -- didn't  know that nor 
the space-brace rule.





@@ -9150,6 +9245,7 @@ expand_omp_target (struct omp_region *re
  }

gimple g;
+  bool tagging = false;
/* The maximum number used by any start_ix, without varargs.  */


That looks misindented, but may be an email client thing.


It does, doesn't it.  Appears to be email artifact or something.




+else if (!tagging)


Oh... so tagging controls two different methods for constructing argument lists,
one for GOACC_parallel and the other for whatever OMP uses? That's a bit
unfortunate, I'll need to think about it for a bit or defer to Jakub.


It's the (new)  difference between how the following 3 OpenACC builtins handle 
asyn & wait args.

   case BUILT_IN_GOACC_PARALLEL:
  {
set_oacc_fn_attrib (child_fn, clauses, );
tagging = true;
  }
  /* FALLTHRU */
case BUILT_IN_GOACC_ENTER_EXIT_DATA:
case BUILT_IN_GOACC_UPDATE:

All 3 pass info about memory copies etc, async and optional waits.  For  E_E_D 
and UPDATE the async is always passed (with a special value for 'synchronous'), 
followed by a count and then variadic wait ints.


An alternarive I suppse would be to break out the meminfo arg pushing to a 
helper function and have 2 separate code paths, or something like that.



Looks reasonable otherwise.


thanks for your review.  I'll repost shortly

nathan


Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Martin Sebor

On 09/17/2015 05:16 AM, Jonathan Wakely wrote:

On 16/09/15 17:42 -0600, Martin Sebor wrote:

I see now the first exists test will detect symlink loops in
the original path. But I'm not convinced there isn't a corner
case that's subject to a TOCTOU race condition between the first
exists test and the while loop during which a symlink loop can
be introduced.

Suppose we call the function with /foo/bar as an argument and
the path exists and contains no symlinks. result is / and cmpts
is set to { foo, bar }. Just as the loop is entered, /foo/bar
is replaced with a symlink containing /foo/bar. The loop then
proceeds like so:

1. The first iteration removes foo from cmpts and sets result
to /foo. cmpts is { bar }.

2. The second iteration removes bar from cmpts, sets result to
/foo/bar, determines it's a symlink, reads its contents, sees
it's an absolute pathname and replaces result with /. It then
inserts the symlink's components { foo, bar } into cmpts. cmpts
becomes { foo, bar }. exists(result) succeeds.

3. The next iteration of the loop has the same initial state
as the first.

But I could have very easily missed something that takes care
of this corner case. If I did, sorry for the false alarm!


No, you're right. The TS says such filesystem races are undefined:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4099.html#fs.race.behavior

but it would be nice to fail gracefully rather than DOS the
application.

The simplest approach would be to increment a counter every time we
follow a symlink, and if it reaches some limit decide something is
wrong and fail with ELOOP.

I don't see how anything else can be 100% bulletproof, because a truly
evil attacker could just keep altering the contents of symlinks so we
keep ping-ponging between two or more paths. If we keep track of paths
we've seen before the attacker could just keep changing the contents
to a unique path each time, that initially exists as a file, but by
the time we get to is_symlink() its become a symlink to a new path.

So if we use a counter, what's a sane maximum? Is MAXSYMLINKS in
 the value the kernel uses? 20 seems quite low, I was
thinking of a much higher number.


Yes, it is a corner case, and it's not really avoidable in the case
of hard links. For symlinks, POSIX defines the SYMLOOP_MAX constant
as the maximum, with the _SC_SYMLOOP_MAX and _PC_SYMLOOP_MAX
sysconf and pathconf variables. Otherwise 40 seems reasonable.

With this, I'll let you get back to work -- I think we've beat this
function to death ;)

Martin






Re: [C/C++ PATCH] RFC: Implement -Wduplicated-cond (PR c/64249)

2015-09-17 Thread Jeff Law

On 09/17/2015 10:05 AM, Marek Polacek wrote:



The patch doesn't diagnose some more involved cases like the one
below:

   if (i < 0) return 1; else if (!(i > 0 || i == 0)) return 2;

even though it does diagnose some other such cases, including:

   if (i < 0) return 1; else if (!(i >= 0)) return 2;

and even:

   int foo (int a, int b, int c) {
 if (a + c == b) return 1; else if (a + c - b == 0) return 2;
 return 0;
   }

I'm guessing this is due to operand_equal_p returning true for
some pairs of equivalent expressions and false for others. Would
making this work be feasible?


You're right that this is limited by what operand_equal_p considers equal and
what not.  I'm not sure if there is something more convoluted else I could use
in the FE, but I think not.  It certainly doesn't look terrible important to
me.
And you'll run into a point of diminishing returns quickly.  There's 
many ways to write this stuff in ways that are equivalent, but a pain to 
uncover.  I think relying on operand_equal_p is probably sufficient at 
this stage.





You probably didn't intend to diagnose the final else clause in
the following case but I wonder if it should be (as an extension
of the original idea):

 if (i) return 1; else if (!i) return 2; else return 0;


Diagnosing this wasn't my intention.  I'm afraid it would be kinda hard
to do that.
This is the "unreachable code" problem.  We used to have a warning for 
that, but in practice it was so unstable it was removed.





(In fact, I wonder if it might be worth diagnosing even the
'if (!i)' part since the condition is the inverse of the one
in the if and thus redundant).


This, on the other hand, seems doable provided we have some predicate that
would tell us whether an expression is a logical inverse of another expression.
E.g. "a > 3" and "a <= 3".  Something akin to pred_equal_p -- invert one expr
and check whether they're equal.
I think you just want to select a canonical form.  So you canonicalize, 
then compare.


jeff


[AArch64][PATCH 3/5] Add atomic load-operate instructions.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch adds the ARMv8.1 atomic
load-operate instructions.

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

2015-09-17  Matthew Wahab  

* config/aarch64/aarch64/atomics.md (UNSPECV_ATOMIC_LDOP): New.
(UNSPECV_ATOMIC_LDOP_OR): New.
(UNSPECV_ATOMIC_LDOP_BIC): New.
(UNSPECV_ATOMIC_LDOP_XOR): New.
(UNSPECV_ATOMIC_LDOP_PLUS): New.
(ATOMIC_LDOP): New.
(atomic_ldop): New.
(aarch64_atomic_load): New.

>From 6a8a83c4efbd607924f0630779d4915c9dad079c Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Mon, 10 Aug 2015 17:02:08 +0100
Subject: [PATCH 3/5] Add atomic load-operate instructions.

Change-Id: I3746875bad7469403bee7df952f0ba565e4abc71
---
 gcc/config/aarch64/atomics.md | 41 +
 1 file changed, 41 insertions(+)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 0e71002..b7b6fb5 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -29,8 +29,25 @@
 UNSPECV_ATOMIC_CAS			; Represent an atomic CAS.
 UNSPECV_ATOMIC_SWP			; Represent an atomic SWP.
 UNSPECV_ATOMIC_OP			; Represent an atomic operation.
+UNSPECV_ATOMIC_LDOP			; Represent an atomic load-operation
+UNSPECV_ATOMIC_LDOP_OR		; Represent an atomic load-or
+UNSPECV_ATOMIC_LDOP_BIC		; Represent an atomic load-bic
+UNSPECV_ATOMIC_LDOP_XOR		; Represent an atomic load-xor
+UNSPECV_ATOMIC_LDOP_PLUS		; Represent an atomic load-add
 ])
 
+;; Iterators for load-operate instructions.
+
+(define_int_iterator ATOMIC_LDOP
+ [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC
+  UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS])
+
+(define_int_attr atomic_ldop
+ [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr")
+  (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")])
+
+;; Instruction patterns.
+
 (define_expand "atomic_compare_and_swap"
   [(match_operand:SI 0 "register_operand" "")			;; bool out
(match_operand:ALLI 1 "register_operand" "")			;; val out
@@ -541,3 +558,27 @@
 else
   return "casal\t%0, %2, %1";
 })
+
+;; Atomic load-op: Load data, operate, store result, keep data.
+
+(define_insn "aarch64_atomic_load"
+ [(set (match_operand:ALLI 0 "register_operand" "=r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q"))
+  (set (match_dup 1)
+   (unspec_volatile:ALLI
+[(match_dup 1)
+ (match_operand:ALLI 2 "register_operand")
+ (match_operand:SI 3 "const_int_operand")]
+ATOMIC_LDOP))]
+ "TARGET_LSE && reload_completed"
+ {
+   enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+   if (is_mm_relaxed (model))
+ return "ld\t%2, %0, %1";
+   else if (is_mm_acquire (model) || is_mm_consume (model))
+ return "lda\t%2, %0, %1";
+   else if (is_mm_release (model))
+ return "ldl\t%2, %0, %1";
+   else
+ return "ldal\t%2, %0, %1";
+ })
-- 
2.1.4



[C++ Patch/RFC] PR 67576 ("[4.9/5/6 Regression] expression of typeid( expression ) is evaluated twice")

2015-09-17 Thread Paolo Carlini

Hi,

submitter noticed that the fix for c++/25466:

https://gcc.gnu.org/ml/gcc-patches/2013-04/msg00553.html

caused a double evaluation of the typeid, at least in some cases. I had 
a quick look and wondered if it could make sense to just use the new 
code when we are outside the straightforward INDIRECT_REF cases which we 
already handled correctly... The below passes testing, anyway.


Thanks,
Paolo.

//
Index: cp/rtti.c
===
--- cp/rtti.c   (revision 227849)
+++ cp/rtti.c   (working copy)
@@ -336,10 +336,25 @@ build_typeid (tree exp, tsubst_flags_t complain)
 {
   /* So we need to look into the vtable of the type of exp.
  Make sure it isn't a null lvalue.  */
-  exp = cp_build_addr_expr (exp, complain);
-  exp = stabilize_reference (exp);
-  cond = cp_convert (boolean_type_node, exp, complain);
-  exp = cp_build_indirect_ref (exp, RO_NULL, complain);
+  if (INDIRECT_REF_P (exp)
+ && TYPE_PTR_P (TREE_TYPE (TREE_OPERAND (exp, 0
+   {
+ exp = mark_lvalue_use (exp);
+ exp = stabilize_reference (exp);
+ cond = cp_convert (boolean_type_node, TREE_OPERAND (exp, 0),
+complain);
+   }
+  else
+   {
+ /* The standard is somewhat unclear here, but makes sense
+to always check whether the address is null rather than
+confine the check to when the immediate operand of typeid
+is an INDIRECT_REF.  */
+ exp = cp_build_addr_expr (exp, complain);
+ exp = stabilize_reference (exp);
+ cond = cp_convert (boolean_type_node, exp, complain);
+ exp = cp_build_indirect_ref (exp, RO_NULL, complain);
+   }
 }
 
   exp = get_tinfo_decl_dynamic (exp, complain);
Index: testsuite/g++.dg/rtti/typeid11.C
===
--- testsuite/g++.dg/rtti/typeid11.C(revision 0)
+++ testsuite/g++.dg/rtti/typeid11.C(working copy)
@@ -0,0 +1,17 @@
+// PR c++/67576
+// { dg-do run }
+
+#include 
+
+struct Base { virtual void foo() {} };
+
+int main()
+{
+  Base b;
+  Base *ary[] = { , ,  };
+
+  int iter = 0;
+  typeid(*ary[iter++]);
+  if (iter != 1)
+__builtin_abort();
+}


[patch] libstdc++/65913 Handle alignment in __atomic_is_lock_free

2015-09-17 Thread Jonathan Wakely

This fixes a 5/6 regression that causes atomic::is_lock_free() to
require libatomic even on targets where the compiler knows the answer.

The new test only runs on x86_64-linux and powerpc*-linux. It isn't
actually OS-dependent but as long as it runs somewhere we should pick
up regressions so those commonly-tested targets are sufficient.

Tested x86_64-linux and powerpc64le-linux.

Richard, this is your middle-end change plus your suggestion for the
std::atomic members, which works just as you said it would. OK for
trunk?


commit db33491a59064c80104d8482f92cbf80d0d9775c
Author: Jonathan Wakely 
Date:   Thu Sep 17 00:21:34 2015 +0100

Handle alignment in __atomic_is_lock_free

gcc:

2015-09-17  Richard Henderson  

	PR libstdc++/65913
	* builtins.c (fold_builtin_atomic_always_lock_free): Handle fake
	pointers that encode the alignment of the object.

libstdc++-v3:

2015-09-17  Jonathan Wakely  

	PR libstdc++/65913
	* include/bits/atomic_base.h (__atomic_base<_TTp>::is_lock_free(),
	__atomic_base<_PTp*>::is_lock_free()): Call the built-in with the
	immediate pointer value, not a variable.
	* include/std/atomic (atomic::is_lock_free()): Likewise.
	* testsuite/29_atomics/atomic/65913.cc: New.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index d79372c..aeec170 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5635,8 +5635,20 @@ fold_builtin_atomic_always_lock_free (tree arg0, tree arg1)
   mode = mode_for_size (size, MODE_INT, 0);
   mode_align = GET_MODE_ALIGNMENT (mode);
 
-  if (TREE_CODE (arg1) == INTEGER_CST && INTVAL (expand_normal (arg1)) == 0)
-type_align = mode_align;
+  if (TREE_CODE (arg1) == INTEGER_CST)
+{
+  unsigned HOST_WIDE_INT val = UINTVAL (expand_normal (arg1));
+
+  /* Either this argument is null, or it's a fake pointer encoding
+ the alignment of the object.  */
+  val = val & -val;
+  val *= BITS_PER_UNIT;
+
+  if (val == 0 || mode_align < val)
+type_align = mode_align;
+  else
+type_align = val;
+}
   else
 {
   tree ttype = TREE_TYPE (arg1);
diff --git a/libstdc++-v3/include/bits/atomic_base.h b/libstdc++-v3/include/bits/atomic_base.h
index 79769cf..75a7ca7 100644
--- a/libstdc++-v3/include/bits/atomic_base.h
+++ b/libstdc++-v3/include/bits/atomic_base.h
@@ -350,17 +350,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   bool
   is_lock_free() const noexcept
   {
-	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_i));
-	return __atomic_is_lock_free(sizeof(_M_i), __a);
+	// Use a fake, minimally aligned pointer.
+	return __atomic_is_lock_free(sizeof(_M_i),
+	reinterpret_cast(-__alignof(_M_i)));
   }
 
   bool
   is_lock_free() const volatile noexcept
   {
-	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_i));
-	return __atomic_is_lock_free(sizeof(_M_i), __a);
+	// Use a fake, minimally aligned pointer.
+	return __atomic_is_lock_free(sizeof(_M_i),
+	reinterpret_cast(-__alignof(_M_i)));
   }
 
   _GLIBCXX_ALWAYS_INLINE void
@@ -666,16 +666,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   is_lock_free() const noexcept
   {
 	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_p));
-	return __atomic_is_lock_free(sizeof(_M_p), __a);
+	return __atomic_is_lock_free(sizeof(_M_p),
+	reinterpret_cast(-__alignof(_M_p)));
   }
 
   bool
   is_lock_free() const volatile noexcept
   {
 	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_p));
-	return __atomic_is_lock_free(sizeof(_M_p), __a);
+	return __atomic_is_lock_free(sizeof(_M_p),
+	reinterpret_cast(-__alignof(_M_p)));
   }
 
   _GLIBCXX_ALWAYS_INLINE void
diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index 125e37a..cdd1f0b 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -208,16 +208,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   is_lock_free() const noexcept
   {
 	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_i));
-	return __atomic_is_lock_free(sizeof(_M_i), __a);
+	return __atomic_is_lock_free(sizeof(_M_i),
+	reinterpret_cast(-__alignof(_M_i)));
   }
 
   bool
   is_lock_free() const volatile noexcept
   {
 	// Produce a fake, minimally aligned pointer.
-	void *__a = reinterpret_cast(-__alignof(_M_i));
-	return __atomic_is_lock_free(sizeof(_M_i), __a);
+	return __atomic_is_lock_free(sizeof(_M_i),
+	reinterpret_cast(-__alignof(_M_i)));
   }
 
   void
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/65913.cc b/libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
new file mode 100644
index 000..dbdd9cf
--- /dev/null
+++ 

Re: [PATCH] Fix PR64078

2015-09-17 Thread Jeff Law

On 09/17/2015 09:00 AM, Marek Polacek wrote:

On Wed, Sep 09, 2015 at 07:48:15PM +0200, Bernd Edlinger wrote:

Hi,

On Wed, 9 Sep 2015 09:31:33, Jeff Law wrote:

You could probably make the function static or change its visibility via
a function attribute (there's a visibility attribute which can take the
values default, hidden protected or internal). Default visibility
essentially means the function can be overridden. I think changing it
to "protected" might work. Note if we do that, we may need some kind of
target selector on the test since not all targets support the various
visibility attributes.



Yes, it works both ways: static works, and __attribute__ ((visibility 
("protected"))) works too:

make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c 
--target_board='unix{-fpic,-mcmodel=medium,-fpic\ -mcmodel=medium,-mcmodel=large,-fpic\ 
-mcmodel=large}'"

has all tests passed, but..

make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c
--target_board='unix{-fno-inline}'"

still fails in the same way for all workarounds: inline, static, and __attribute__ 
((visibility ("protected"))).

Maybe "static" would be preferable?


Yeah, I'd go with static if that helps.  I'd rather avoid playing games
with visibility.
Static is certainly easier and doesn't rely on targets implementing all 
the visibility capabilities.  So static is probably the best approach.


jeff


[AArch64][PATCH 2/5] Add BIC instruction.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch adds an expander to
generate a BIC instruction that can be explicitly called when
implementing the atomic__fetch pattern to calculate the value to
be returned by the operation.

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

2015-09-17  Matthew Wahab  

* config/aarch64/aarch64.md (bic_3): New.


>From 14e122ee98aa20826ee070d20c58c94206cdd91b Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Mon, 17 Aug 2015 17:48:27 +0100
Subject: [PATCH 2/5] Add BIC instruction

Change-Id: Ibef049bfa1bfe5e168feada3dc358f28383e6410
---
 gcc/config/aarch64/aarch64.md | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 88ba72e..bae4af4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3351,6 +3351,19 @@
(set_attr "simd" "*,yes")]
 )
 
+(define_expand "bic_3"
+ [(set (match_operand:GPI 0 "register_operand" "=r")
+   (and:GPI
+(not:GPI
+ (SHIFT:GPI
+  (match_operand:GPI 1 "register_operand" "r")
+  (match_operand:QI 2 "aarch64_shift_imm_si" "n")))
+(match_operand:GPI 3 "register_operand" "r")))]
+ ""
+ ""
+ [(set_attr "type" "logics_shift_imm")]
+)
+
 (define_insn "*and_one_cmpl3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
 	(compare:CC_NZ
-- 
2.1.4



[Ada] Language-defined checks and side effects in gigi

2015-09-17 Thread Eric Botcazou
As diagnosed by Richard B., emit_check wrongly resets the TREE_SIDE_EFFECTS 
flag on the COND_EXPR built for language-defined checks in Ada, leading to the 
omission of required checks in peculiar cases.

Tested on x86_64-suse-linux, applied on the mainline.


2015-09-17  Eric Botcazou  

* gcc-interface/trans.c (emit_check): Do not touch TREE_SIDE_EFFECTS.


2015-09-17  Eric Botcazou  

* gnat.dg/overflow_sum3.adb: New test.

-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 227819)
+++ gcc-interface/trans.c	(working copy)
@@ -8798,29 +8798,32 @@ emit_index_check (tree gnu_array_object,
  gnu_expr, CE_Index_Check_Failed, gnat_node);
 }
 
-/* GNU_COND contains the condition corresponding to an access, discriminant or
-   range check of value GNU_EXPR.  Build a COND_EXPR that returns GNU_EXPR if
-   GNU_COND is false and raises a CONSTRAINT_ERROR if GNU_COND is true.
-   REASON is the code that says why the exception was raised.  GNAT_NODE is
-   the GNAT node conveying the source location for which the error should be
-   signaled.  */
+/* GNU_COND contains the condition corresponding to an index, overflow or
+   range check of value GNU_EXPR.  Build a COND_EXPR that returns GNU_EXPR
+   if GNU_COND is false and raises a CONSTRAINT_ERROR if GNU_COND is true.
+   REASON is the code that says why the exception is raised.  GNAT_NODE is
+   the node conveying the source location for which the error should be
+   signaled.
+
+   We used to propagate TREE_SIDE_EFFECTS from GNU_EXPR to the COND_EXPR,
+   overwriting the setting inherited from the call statement, on the ground
+   that the expression need not be evaluated just for the check.  However
+   that's incorrect because, in the GCC type system, its value is presumed
+   to be valid so its comparison against the type bounds always yields true
+   and, therefore, could be done without evaluating it; given that it can
+   be a computation that overflows the bounds, the language may require the
+   check to fail and thus the expression to be evaluated in this case.  */
 
 static tree
 emit_check (tree gnu_cond, tree gnu_expr, int reason, Node_Id gnat_node)
 {
   tree gnu_call
 = build_call_raise (reason, gnat_node, N_Raise_Constraint_Error);
-  tree gnu_result
-= fold_build3 (COND_EXPR, TREE_TYPE (gnu_expr), gnu_cond,
-		   build2 (COMPOUND_EXPR, TREE_TYPE (gnu_expr), gnu_call,
-			   convert (TREE_TYPE (gnu_expr), integer_zero_node)),
-		   gnu_expr);
-
-  /* GNU_RESULT has side effects if and only if GNU_EXPR has:
- we don't need to evaluate it just for the check.  */
-  TREE_SIDE_EFFECTS (gnu_result) = TREE_SIDE_EFFECTS (gnu_expr);
-
-  return gnu_result;
+  return
+fold_build3 (COND_EXPR, TREE_TYPE (gnu_expr), gnu_cond,
+		 build2 (COMPOUND_EXPR, TREE_TYPE (gnu_expr), gnu_call,
+			 convert (TREE_TYPE (gnu_expr), integer_zero_node)),
+		 gnu_expr);
 }
 
 /* Return an expression that converts GNU_EXPR to GNAT_TYPE, doing overflow
--  { dg-do run }
--  { dg-options "-gnato" }

procedure Overflow_Sum3 is

   function Ident (I : Integer) return Integer is
   begin
  return I;
   end;

   X : Short_Short_Integer := Short_Short_Integer (Ident (127));

begin
   if X+1 <= 127 then
  raise Program_Error;
   end if;
exception
   when Constraint_Error => null;
end;


Re: [patch] libstdc++/65913 Handle alignment in __atomic_is_lock_free

2015-09-17 Thread Richard Henderson
On 09/17/2015 07:04 AM, Jonathan Wakely wrote:
> Handle alignment in __atomic_is_lock_free
> 
> gcc:
> 
> 2015-09-17  Richard Henderson  
> 
>   PR libstdc++/65913
>   * builtins.c (fold_builtin_atomic_always_lock_free): Handle fake
>   pointers that encode the alignment of the object.
> 
> libstdc++-v3:
> 
> 2015-09-17  Jonathan Wakely  
> 
>   PR libstdc++/65913
>   * include/bits/atomic_base.h (__atomic_base<_TTp>::is_lock_free(),
>   __atomic_base<_PTp*>::is_lock_free()): Call the built-in with the
>   immediate pointer value, not a variable.
>   * include/std/atomic (atomic::is_lock_free()): Likewise.
>   * testsuite/29_atomics/atomic/65913.cc: New.

Yes, this is ok.


r~


Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Jonathan Wakely

On 16/09/15 23:50 +0100, Jonathan Wakely wrote:

On 16/09/15 19:58 +0100, Jonathan Wakely wrote:

commit ef25038796485298ff8f040bc79e0d9a371171fa
Author: Jonathan Wakely 
Date:   Wed Sep 16 18:07:32 2015 +0100

  Implement filesystem::canonical() without realpath
PR libstdc++/67173
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check _XOPEN_VERSION
and PATH_MAX for _GLIBCXX_USE_REALPATH.
* config.h.in: Regenerate.
* configure: Regenerate.
* src/filesystem/ops.cc: (canonical) [!_GLIBCXX_USE_REALPATH]: Add
alternative implementation.
* testsuite/experimental/filesystem/operations/canonical.cc: New.
* testsuite/experimental/filesystem/operations/exists.cc: Add more
tests.
* testsuite/experimental/filesystem/operations/absolute.cc: Add test
variables.
* testsuite/experimental/filesystem/operations/copy.cc: Likewise.
* testsuite/experimental/filesystem/operations/current_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/file_size.cc: Likewise.
* testsuite/experimental/filesystem/operations/status.cc: Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.


Committed to trunk.



I'm removing part of the new canonical.cc test as it fails
occasionally with:

terminate called after throwing an instance of 
'std::experimental::filesystem::v1::__cxx11::filesystem_error'
 what():  filesystem error: cannot canonicalize: No such file or directory 
[/dev/stdin]
FAIL: experimental/filesystem/operations/canonical.cc execution test

This is odd, as I check with exists() before calling canonical(), but
rather than try to understand what is happening I'm just going to
remove that part.

Committed to trunk.


commit a250423d1964952312bf97e6be3de987308a5164
Author: Jonathan Wakely 
Date:   Thu Sep 17 16:17:11 2015 +0100

Remove non-deterministic part of canonical() test

	* testsuite/experimental/filesystem/operations/canonical.cc: Remove
	non-deterministic part of the test.

diff --git a/libstdc++-v3/testsuite/experimental/filesystem/operations/canonical.cc b/libstdc++-v3/testsuite/experimental/filesystem/operations/canonical.cc
index d752feb..5091a70 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/operations/canonical.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/operations/canonical.cc
@@ -57,17 +57,6 @@ test01()
   p = canonical( p, ec );
   VERIFY( p == "/" );
   VERIFY( !ec );
-
-  p = "/dev/stdin";
-  if (exists(p))
-{
-  auto p2 = canonical(p);
-  if (is_symlink(p))
-VERIFY( p != p2 );
-  else
-VERIFY( p == p2 );
-  VERIFY( canonical(p2) == p2 );
-}
 }
 
 int


Re: [C/C++ PATCH] RFC: Implement -Wduplicated-cond (PR c/64249)

2015-09-17 Thread Marek Polacek
On Wed, Sep 16, 2015 at 02:59:12PM -0600, Martin Sebor wrote:
> On 09/16/2015 09:59 AM, Marek Polacek wrote:
> >This patch implements a new warning, -Wduplicated-cond.  It warns for
> >code such as
> >
> >   if (x)
> > // ...
> >   else if (x)
> > // ...
> 
> As usual, I like this improvement. I think it's worth extending
> to conditional expressions (e.g., x ? f() : x ? g() : h() should
> be diagnosed as well).
 
Maybe, probably.  I admit I wasn't thinking of conditional expressions
at all.

> The patch currently issues a false positive for the test case
> below. I suspect the chain might need to be cleared after each
> condition that involves a side-effect.
> 
>   int foo (int a)
>   {
> if (a) return 1; else if (++a) return 2; else if (a) return 3;
> return 0;
>   }

But the last branch here can never be reached, right?  If a == 0, foo
returns 2, otherwise it just returns 1.  So I think we should diagnose
this.

> The patch doesn't diagnose some more involved cases like the one
> below:
> 
>   if (i < 0) return 1; else if (!(i > 0 || i == 0)) return 2;
> 
> even though it does diagnose some other such cases, including:
> 
>   if (i < 0) return 1; else if (!(i >= 0)) return 2;
> 
> and even:
> 
>   int foo (int a, int b, int c) {
> if (a + c == b) return 1; else if (a + c - b == 0) return 2;
> return 0;
>   }
> 
> I'm guessing this is due to operand_equal_p returning true for
> some pairs of equivalent expressions and false for others. Would
> making this work be feasible?

You're right that this is limited by what operand_equal_p considers equal and
what not.  I'm not sure if there is something more convoluted else I could use 
in the FE, but I think not.  It certainly doesn't look terrible important to
me.

> You probably didn't intend to diagnose the final else clause in
> the following case but I wonder if it should be (as an extension
> of the original idea):
> 
> if (i) return 1; else if (!i) return 2; else return 0;

Diagnosing this wasn't my intention.  I'm afraid it would be kinda hard
to do that.

> (In fact, I wonder if it might be worth diagnosing even the
> 'if (!i)' part since the condition is the inverse of the one
> in the if and thus redundant).

This, on the other hand, seems doable provided we have some predicate that
would tell us whether an expression is a logical inverse of another expression.
E.g. "a > 3" and "a <= 3".  Something akin to pred_equal_p -- invert one expr
and check whether they're equal.

But that sounds like another warning ;).  And it smells of doing some kind of
VRP in the FE - ick.

> Is diagnosing things like 'if (a) if (a);' or 'if (a); else {
> if (a); }' not feasible or too involved, or did you choose not
> to because you expect it might generate too many warnings? (If
> the latter, perhaps it could be disabled by default and enabled
> by -Wduplicated-cond=2).

I intentionally avoided that.  It would certainly be harder and I'm
not sure it's worth it.  We'd need to be careful to not warn on e.g.

  if (a > 5)
{
  --a;
  if (a < 5) // ...
}
etc.

Thanks a lot for looking into this.

Marek


Re: [C/C++ PATCH] RFC: Implement -Wduplicated-cond (PR c/64249)

2015-09-17 Thread Martin Sebor

The patch currently issues a false positive for the test case
below. I suspect the chain might need to be cleared after each
condition that involves a side-effect.

   int foo (int a)
   {
 if (a) return 1; else if (++a) return 2; else if (a) return 3;
 return 0;
   }


But the last branch here can never be reached, right?  If a == 0, foo
returns 2, otherwise it just returns 1.  So I think we should diagnose
this.


It probably wasn't the best example. The general issue here is
that the second condition has a side-effect that can change (in
this case clearly does) the value of the expression.

Here's a better example:

int a;

int bar (void) { a = 1; return 0; }

int foo (void) {
if (a) return 1;
else if (foo ()) return 2;
else if (a) return 3;
return 0;
}

Since we don't know bar's side-effects we must assume they change
the value of a and so we must avoid diagnosing the third if.

Martin


[AArch64][PATCH 1/5] Use atomic instructions for swap and fetch-update operations.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch series adds the
instructions to GCC, making them available with -march=armv8.1-a or
-march=armv8+lse, and uses them to implement the __sync and __atomic
builtins.

The ARMv8.1 swap instruction swaps the value in a register with a value
in memory. The load-operate instructions load a value from memory,
update it with the result of an operation and store the result in
memory.

This series uses the swap instruction to implement the atomic_exchange
patterns and the load-operate instructions to implement the
atomic_fetch_ and atomic__fetch patterns. For the
atomic__fetch patterns, the value returned as the result of the
operation has to be recalculated from the loaded data. The ARMv8 BIC
instruction is added so that it can be used for this recalculation.

The patches in this series
- add and use the atomic swap instruction.
- add the Aarch64 BIC instruction,
- add the ARMv8.1 load-operate instructions,
- use the load-operate instructions to implement the atomic_fetch_
  patterns,
- use the load-operate instructions to implement the patterns
  atomic__fetch patterns,

The code-generation changes in this series are based around a new
function, aarch64_gen_atomic_ldop, which takes the operation to be
implemented and emits the appropriate code, making use of the atomic
instruction. This follows the existing uses aarch64_split_atomic_op for
the same purpose when atomic instructions aren't available.

This patch adds the ARMv8.1 SWAP instruction and function
aarch64_gen_atomic_ldop and changes the implementation of the
atomic_exchange pattern to the atomic instruction when it is available.

The general form of the code generated for an atomic_exchange, with
destination D, source S, memory address A and memory order MO is:

   swp S, D, [A]

where
is one of {'', 'a', 'l', 'al'} depending on memory order MO.
is one of {'', 'b', 'h'} depending on the data size.

This patch also adds tests for the changes. These reuse the support code
introduced for the atomic CAS tests, adding macros to test functions
taking one memory ordering argument. These are used to iteratively
define functions using the __atomic_exchange builtins, which should be
implemented using the atomic swap.

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

gcc/
2015-09-17  Matthew Wahab  

* config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
Declare.
* config/aarch64/aarch64.c (aarch64_emit_atomic_swp): New.
(aarch64_gen_atomic_ldop): New.
(aarch64_split_atomic_op): Fix whitespace and add a comment.
* config/aarch64/atomics.md (UNSPECV_ATOMIC_SWP): New.
(atomic_compare_and_swap_lse): Remove comments and fix
whitespace.
(atomic_exchange): Replace with an expander.
(aarch64_atomic_exchange): New.
(aarch64_atomic_exchange_lse): New.
(aarch64_atomic_): Fix some whitespace.
(aarch64_atomic_swp): New.


gcc/testsuite/
2015-09-17  Matthew Wahab  

* gcc.target/aarch64/atomic-inst-ops.inc: (TEST_MODEL): New.
(TEST_ONE): New.
 * gcc.target/aarch64/atomic-inst-swap.c: New.

>From 425fa05a5e3656c8d6d0d085628424b4c846cd49 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Fri, 7 Aug 2015 17:18:37 +0100
Subject: [PATCH 1/5] Add atomic SWP instruction

Change-Id: I87bf48526cb11e65edd15691f5eab20446e418c9
---
 gcc/config/aarch64/aarch64-protos.h|  1 +
 gcc/config/aarch64/aarch64.c   | 46 ++-
 gcc/config/aarch64/atomics.md  | 92 +++---
 .../gcc.target/aarch64/atomic-inst-ops.inc | 13 +++
 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c | 44 +++
 5 files changed, 183 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-swp.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index ff19851..eba4c76 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -378,6 +378,7 @@ rtx aarch64_load_tp (rtx);
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
+void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4d2126b..dc05c6e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11185,11 

Re: [Aarch64] Use vector wide add for mixed-mode adds

2015-09-17 Thread James Greenhalgh
On Mon, Sep 07, 2015 at 06:54:30AM +0100, Michael Collison wrote:
> This patch is designed to address code that was not being vectorized due
> to missing widening patterns in the aarch64 backend. Code such as:
> 
> int t6(int len, void * dummy, short * __restrict x)
> {
>len = len & ~31;
>int result = 0;
>__asm volatile ("");
>for (int i = 0; i < len; i++)
>  result += x[i];
>return result;
> }
> 
> Validated on aarch64-none-elf, aarch64_be-none-elf, and
> aarch64-none-linus-gnu.
> 
> Note that there are three non-execution tree dump vectorization
> regressions where previously code was being vectorized.  They are:

I'd like to understand these better before taking the patch...

> 
> Passed now fails  [PASS => FAIL]:
>gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorized 1 loops" 1
>gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorizing stmts using SLP" 1
>gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorized 1 
> loops" 1
>gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorizing 
> stmts using SLP" 1
>gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorized 1 loops" 1
>gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorizing stmts using SLP" 1
>gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorized 1 
> loops" 1
>gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorizing 
> stmts using SLP" 1

These look like weaknesses in SLP trying to build a widening add by a
constant in the wider mode (i.e. (short) w+ (int)). I'd like to understand
what the issue is here.

>gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 1
>gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 1

Is this one as simple as setting
check_effective_target_vect_widen_sum_hi_to_si_pattern in
testsuite/lib/target-supports.exp ?

>gcc.dg/vect/vect-125.c -flto -ffat-lto-objects  scan-tree-dump vect 
> "vectorized 1 loops"
>gcc.dg/vect/vect-125.c scan-tree-dump vect "vectorized 1 loops"

These look like a failure to match the widening optab for operand types
(int) w+ (short) -> (int), so I'm also concerned here.

> I would like to treat these as saperate bugs and resolve them separately.

I think we shouldn't write these off without at least partially understanding
the issues.

> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index 9777418..d6c5d61 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2636,6 +2636,60 @@
> 
>   ;; w.
> 
> +(define_expand "widen_ssum3"
> +  [(set (match_operand: 0 "register_operand" "")
> +(plus: (sign_extend: (match_operand:VQW 1
> "register_operand" ""))

Please check your mail settings, as the patch currently does not apply.
The last time I saw this issue it was a mail client turning on
format=flowed which caused carnage for patch files

( https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00179.html )

> +  (match_operand: 2 "register_operand" "")))]
> +  "TARGET_SIMD"
> +  {
> +rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
> +rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
> +
> +emit_insn (gen_aarch64_saddw_internal (temp, operands[2],
> +operands[1], p));

Replace 8 spaces with a tab (mail client?).

Thanks,
James

> +emit_insn (gen_aarch64_saddw2 (operands[0], temp, operands[1]));
> +DONE;
> +  }
> +)
> +
> +(define_expand "widen_ssum3"
> +  [(set (match_operand: 0 "register_operand" "")
> +(plus: (sign_extend:
> +   (match_operand:VD_BHSI 1 "register_operand" ""))
> +  (match_operand: 2 "register_operand" "")))]
> +  "TARGET_SIMD"
> +{
> +  emit_insn (gen_aarch64_saddw (operands[0], operands[2],
> operands[1]));
> +  DONE;
> +})
> +
> +(define_expand "widen_usum3"
> +  [(set (match_operand: 0 "register_operand" "=")
> +(plus: (zero_extend: (match_operand:VQW 1
> "register_operand" ""))
> +  (match_operand: 2 "register_operand" "")))]
> +  "TARGET_SIMD"
> +  {
> +rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
> +rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
> +
> +emit_insn (gen_aarch64_uaddw_internal (temp, operands[2],
> + operands[1], p));
> +emit_insn (gen_aarch64_uaddw2 (operands[0], temp, operands[1]));
> +DONE;
> +  }
> +)
> +
> +(define_expand "widen_usum3"
> +  [(set (match_operand: 0 "register_operand" "")
> +(plus: (zero_extend:
> +   (match_operand:VD_BHSI 1 "register_operand" ""))
> +  (match_operand: 2 "register_operand" "")))]
> +  "TARGET_SIMD"
> +{
> +  emit_insn (gen_aarch64_uaddw (operands[0], operands[2],
> operands[1]));
> + 

Re: [PATCH][RTL-ifcvt] PR rtl-optimization/67465: Handle pairs of complex+simple blocks and empty blocks more gracefully

2015-09-17 Thread Kyrill Tkachov

Hi Rainer,

On 17/09/15 12:33, Rainer Orth wrote:

Hi Kyrill,


On 11/09/15 09:51, Rainer Orth wrote:

Kyrill Tkachov  writes:


On 10/09/15 12:43, Rainer Orth wrote:

Hi Kyrill,


Rainer, could you please check that this patch still fixes the SPARC
regressions?

unfortunately, it breaks sparc-sun-solaris2.10 bootstrap: compiling
stage2 libiberty/regex.c FAILs:



Thanks for providing the preprocessed file.
I've reproduced and fixed the ICE in this version of the patch.
The problem was that I was taking the mode of x before the check
of whether a and b are MEMs, after which we would change x to an
address_mode reg,
thus confusing emit_move_insn.

The fix is to take the mode of x and perform the
can_conditionally_move_p check
after that transformation.

Bootstrapped and tested on aarch64 and x86_64.
The preprocessed regex.i that Rainer provided now compiles successfully
for me
on a sparc-sun-solaris2.10 stage-1 cross-compiler.

Rainer, thanks for your help so far, could you please try out this patch?

While bootstrap succeeds again, the testsuite regression in
gcc.c-torture/execute/20071216-1.c reoccured.

Right, so I dug into the RTL dumps and I think this is a separate issue
that's being exacerbated by my patch.
The code tries to if-convert a block which contains a compare instruction
i.e. sets the CC register.
Now, bb_valid_for_noce_process_p should have caught this, and in particular
insn_valid_noce_process_p
which should check that the instruction doesn't set the CC
register. However, on SPARC the
cc_in_cond returns NULL! This is due to the canonicalize_comparison
implementation that seems to
remove the CC register from the condition expression and returns something like:
(leu (reg/v:SI 109 [ b ])
 (const_int -4096 [0xf000])

Therefore the set_of (cc_in_cond (cond), insn) check doesn't get triggered
because cc_in_cond returns NULL.
Regardless of how the branch condition got canonicalized, I think we still
want to reject any insn in the block
that sets a condition code register, so this patch checks the destination
of every set in the block for a MODE_CC
expression and cancels if-conversion if that's the case.

Oleg pointed me to the older PR 58517 affecting SH which seems similar and
I think my previous ifcvt patch would expose
this problem more.

Anyway, with this patch the failing SPARC testcase
gcc.c-torture/execute/20071216-1.c generates the same assembly
as before r227368 and bootstrap and test on aarch64 and x86_64 passes ok for me.

Rainer, could you try this patch on top of the previous patch?
(https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00689.html)
The two together should fix all of PR 67456, 67464, 67465 and 67481.

sorry this took me so long: we've had a major switch failure and my
sparc machines are way slow.


No problem. You're doing me a huge favour by testing the iterations
of the patches.  Sorry for causing the regression in the first place.
The issues I'm finding are exposed due to the way the sparc backend
does some things, so my aarch64 and x86_64 testing is unlikely to catch them.



Anyway, here's what I found: the two patches on top of each other do
bootstrap just fine and the gcc.c-torture/execute/20071216-1.c
regressions are gone.  However, it introduces a new one:

FAIL: gcc.dg/torture/stackalign/sibcall-1.c   -O1 -fpic execution test

It fails for both 32 and 64-bit.  The testcase SEGVs:


Indeed, I can see if-conversion triggering here and doing something funky with 
the
first patch that I posted 
(https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00689.html)

In this testcase we trigger the is_mem path through noce_try_cmove_arith
and we have the 'a' and 'b' having the form reg+symbol. When we try to move
them into a register with noce_emit_move_insn the resulting move is not 
recognised
(presumably sparc doesn't have any instructions/patterns to do this operation)
and the alternative tricks that noce_emit_move_insn tries to create the move
end up generating a bizzare sequence that involves loading from the memory at 
reg+symbol
expression and adding the base reg to it!

In any case, this patch doesn't try calling noce_emit_move_insn and instead 
generates a simple
SET expression, emits that and relies on end_ifcvt_sequence to call recog on it 
and cancel the
transformation if it's not a valid instruction. IMO this is the desired 
behaviour since the
move in question is supposed to be a simple move that would ideally be 
eliminated by the register
allocator later on if the dependencies work out. If it actually expands to more 
complex sequences
it's not going to be a win to if-convert anyway.

TLDR: This updated patch generates the same code for the sibcall-1.c testcase 
on sparc as before
the bad transformation and all the previous regressions are still fixed.

Bootstrapped and tested on aarch64 and x86_64.
Rainer, could you please try this patch in combination with the one I sent 
earlier at:

Re: [PATCH] PR28901 -Wunused-variable ignores unused const initialised variables

2015-09-17 Thread Mark Wielaard
On Thu, 2015-09-17 at 18:12 +0200, Bernd Schmidt wrote:
> On 09/17/2015 02:01 PM, Gerald Pfeifer wrote:
> > On Sun, 13 Sep 2015, Mark Wielaard wrote:
> >> Slightly adjusted patch attached. Now it is explicit that the warning is
> >> enabled by -Wunused-variable for C, but not C++. There are testcases for
> >> both C and C++ to check the defaults. And the hardcoded override is
> >> removed for C++, so the user could enable it if they want.
> >
> > I believe making -Wunused-const-variable part of -Wall is not
> > a good idea.
> >
> > For example, I have a nightly build of Wine with a nightly build
> > of GCC.  And my notifaction mail went from twently lines of warning
> > to 6500 -- all coming from header files.
> 
> What does it warn about, do the warnings look valid?

I was just taking a look. They come from constructs like:

static const WCHAR INSTALLPROPERTY_VERSIONSTRINGW[] =
{'V','e','r','s','i','o','n','S','t','r','i','n','g',0};

Using the more idiomatic and a bit more readable:

#define INSTALLPROPERTY_VERSIONSTRINGW u"VersionString"

gets rid of most of the issues.

Cheers,

Mark


Amend documentation of DF LIVE problem

2015-09-17 Thread Eric Botcazou
As discussed in the audit trail of PR rtl-optimization/66790, the doc of the 
DF_LIVE problem is confusing/wrong so the attached patch amends it.

Approved by Kenneth and applied on all active branches.


2015-09-17  Eric Botcazou  

PR rtl-optimization/66790
* df-problems.c (LIVE): Amend documentation.

-- 
Eric BotcazouIndex: df-problems.c
===
--- df-problems.c	(revision 227819)
+++ df-problems.c	(working copy)
@@ -1309,22 +1309,23 @@ df_lr_verify_transfer_functions (void)
 
 
 /*
-   LIVE AND MUST-INITIALIZED REGISTERS.
+   LIVE AND MAY-INITIALIZED REGISTERS.
 
This problem first computes the IN and OUT bitvectors for the
-   must-initialized registers problems, which is a forward problem.
-   It gives the set of registers for which we MUST have an available
-   definition on any path from the entry block to the entry/exit of
-   a basic block.  Sets generate a definition, while clobbers kill
+   may-initialized registers problems, which is a forward problem.
+   It gives the set of registers for which we MAY have an available
+   definition, i.e. for which there is an available definition on
+   at least one path from the entry block to the entry/exit of a
+   basic block.  Sets generate a definition, while clobbers kill
a definition.
 
In and out bitvectors are built for each basic block and are indexed by
regnum (see df.h for details).  In and out bitvectors in struct
-   df_live_bb_info actually refers to the must-initialized problem;
+   df_live_bb_info actually refers to the may-initialized problem;
 
Then, the in and out sets for the LIVE problem itself are computed.
These are the logical AND of the IN and OUT sets from the LR problem
-   and the must-initialized problem.
+   and the may-initialized problem.
 */
 
 /* Private data used to verify the solution for this problem.  */
@@ -1531,7 +1532,7 @@ df_live_confluence_n (edge e)
 }
 
 
-/* Transfer function for the forwards must-initialized problem.  */
+/* Transfer function for the forwards may-initialized problem.  */
 
 static bool
 df_live_transfer_function (int bb_index)
@@ -1555,7 +1556,7 @@ df_live_transfer_function (int bb_index)
 }
 
 
-/* And the LR info with the must-initialized registers, to produce the LIVE info.  */
+/* And the LR info with the may-initialized registers to produce the LIVE info.  */
 
 static void
 df_live_finalize (bitmap all_blocks)

Re: [PATCH] PR28901 -Wunused-variable ignores unused const initialised variables

2015-09-17 Thread Bernd Schmidt

On 09/17/2015 02:01 PM, Gerald Pfeifer wrote:

On Sun, 13 Sep 2015, Mark Wielaard wrote:

Slightly adjusted patch attached. Now it is explicit that the warning is
enabled by -Wunused-variable for C, but not C++. There are testcases for
both C and C++ to check the defaults. And the hardcoded override is
removed for C++, so the user could enable it if they want.


I believe making -Wunused-const-variable part of -Wall is not
a good idea.

For example, I have a nightly build of Wine with a nightly build
of GCC.  And my notifaction mail went from twently lines of warning
to 6500 -- all coming from header files.


What does it warn about, do the warnings look valid?


Bernd



Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-17 Thread Bill Schmidt
On Thu, 2015-09-17 at 09:18 -0500, Bill Schmidt wrote:
> On Thu, 2015-09-17 at 09:39 +0200, Richard Biener wrote:
> > On Wed, 16 Sep 2015, Alan Lawrence wrote:
> > 
> > > On 16/09/15 17:10, Bill Schmidt wrote:
> > > > 
> > > > On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
> > > > > On 16/09/15 15:28, Bill Schmidt wrote:
> > > > > > 2015-09-16  Bill Schmidt  
> > > > > > 
> > > > > >   * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX,
> > > > > > UNSPEC_REDUC_SMIN,
> > > > > >   UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, 
> > > > > > UNSPEC_REDUC_SMAX_SCAL,
> > > > > >   UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
> > > > > >   UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
> > > > > >   (reduc_smax_v2di): New define_expand.
> > > > > >   (reduc_smax_scal_v2di): Likewise.
> > > > > >   (reduc_smin_v2di): Likewise.
> > > > > >   (reduc_smin_scal_v2di): Likewise.
> > > > > >   (reduc_umax_v2di): Likewise.
> > > > > >   (reduc_umax_scal_v2di): Likewise.
> > > > > >   (reduc_umin_v2di): Likewise.
> > > > > >   (reduc_umin_scal_v2di): Likewise.
> > > > > >   (reduc_smax_v4si): Likewise.
> > > > > >   (reduc_smin_v4si): Likewise.
> > > > > >   (reduc_umax_v4si): Likewise.
> > > > > >   (reduc_umin_v4si): Likewise.
> > > > > >   (reduc_smax_v8hi): Likewise.
> > > > > >   (reduc_smin_v8hi): Likewise.
> > > > > >   (reduc_umax_v8hi): Likewise.
> > > > > >   (reduc_umin_v8hi): Likewise.
> > > > > >   (reduc_smax_v16qi): Likewise.
> > > > > >   (reduc_smin_v16qi): Likewise.
> > > > > >   (reduc_umax_v16qi): Likewise.
> > > > > >   (reduc_umin_v16qi): Likewise.
> > > > > >   (reduc_smax_scal_): Likewise.
> > > > > >   (reduc_smin_scal_): Likewise.
> > > > > >   (reduc_umax_scal_): Likewise.
> > > > > >   (reduc_umin_scal_): Likewise.
> > > > > 
> > > > > You shouldn't need the non-_scal reductions. Indeed, they shouldn't be
> > > > > used if
> > > > > the _scal are present. The non-_scal's were previously defined as
> > > > > producing a
> > > > > vector with one element holding the result and the other elements all
> > > > > zero, and
> > > > > this was only ever used with a vec_extract immediately after; the 
> > > > > _scal
> > > > > pattern
> > > > > now includes the vec_extract as well. Hence the non-_scal patterns are
> > > > > deprecated / considered legacy, as per md.texi.
> > > > 
> > > > Thanks -- I had misread the description of the non-scalar versions,
> > > > missing the part where the other elements are zero.  What I really
> > > > want/need is an optab defined as computing the maximum value in all
> > > > elements of the vector.
> > > 
> > > Yes, indeed. It seems reasonable to me that this would coexist with an 
> > > optab
> > > which computes only a single value (i.e. a scalar).
> > 
> > So just to clarify - you need to reduce the vector with max to a scalar
> > but want the (same) result in all vector elements?
> 
> Yes.  Alan Hayward's cond-reduction patch is set up to perform a
> reduction to scalar, followed by a scalar broadcast to get the value
> into all positions.  It happens that our most efficient expansion to
> reduce to scalar will naturally produce the value in all positions.
> Using the reduc_smax_scal_ expander, we are required to extract
> this into a scalar and then perform the broadcast.  Those two operations
> are unnecessary.  We can clean up after the fact, but the cost model
> will still reflect the unnecessary activity.
> 
> > 
> > > At that point it might be appropriate to change the cond-reduction code to
> > > generate the reduce-to-vector in all cases, and optabs.c expand it to
> > > reduce-to-scalar + broadcast if reduce-to-vector was not available. Along 
> > > with
> > > the (parallel) changes to cost model already proposed, does that cover 
> > > all the
> > > cases? It does add a new tree code, yes, but I'm feeling that could be
> > > justified if we go down this route.
> > 
> > I'd rather have combine recognize an insn that does both (if it
> > exists).  As I understand powerpc doesn't have reduction to scalar
> > (I think no ISA actually has this, but all ISAs have reduce to
> > one vector element plus a cheap way of extraction (effectively a subreg))
> > but it's reduction already has all result vector elements set to the
> > same value (as opposed to having some undefined or zero or whatever).
> 
> The extraction is not necessarily so cheap.  For floating-point values
> in a register, we can indeed use a subreg.  But in this case we are
> talking about integers, and an extract must move them from the vector
> registers to the GPRs.  With POWER8, we can do this with a 5-cycle move
> instruction.  Prior to that, we can only do this with a very expensive
> path through memory, as previous processors had no 

Re: [PATCH, fortran] PR 53379 Backtrace on error termination

2015-09-17 Thread David Edelsohn
On Wed, Sep 9, 2015 at 12:12 AM, Ian Lance Taylor  wrote:
> Mike Stump  writes:
>
>> Not a big issue, but slightly better if (O_CLOEXEC>>32) != 0 is also
>> true.  See, if AIX should ever define this to a sensible value, the
>> above would disappear the feature.  However, if they did, then this
>> expression should then be false.
>
> Yes, I think this might be even better in code.  How about something
> like
>
>   /* On some versions of AIX O_CLOEXEC does not fit in int, so use a
>  cast to force it.  */
>   descriptor = open (filename, (int) (O_RDONLY | O_BINARY | O_CLOEXEC));
>
> Does that work on AIX?

Any decision about this patch?

Thanks, David


Re: Split up optabs.[hc]

2015-09-17 Thread Richard Sandiford
Bernd Schmidt  writes:
> On 09/14/2015 07:54 PM, Richard Sandiford wrote:
>> This patch splits optabs up as follows:
>>
>>- optabs-query.[hc]: IL-independent functions for querying what a target
>>can do natively.
>>- optabs-tree.[hc]: tree and gimple query functions (an extension of
>>optabs-query.[hc]).
>>- optabs-libfuncs.[hc]: optabs-specific libfuncs (an extension of
>>libfuncs.h)
>>- optabs.h: For now includes optabs-query.h and optabs-libfuncs.h.
>
> This seems like a good change.
>
>> I changed can_conditionally_move_p from returning an int to returning
>> a bool and fixed a few formatting glitches.  There should be no other
>> changes to the functions themselves.
>
> I'm taking your word for it. The patch is slightly confusing in one area 
> of optabs.c (it looks like debug_optab_libfuncs got moved around, it 
> might be better for patch readability not to do that).

Hmm, yeah.  I think that was just diff getting confused between very
similar blocks of code: the retained functions are in the same order.
The diff seems to be clean if I use --diff-algorithm=minimal instead.

> The only thing I really wondered about...
>
>> --- /dev/null
>> +++ b/gcc/optabs-tree.h
>> @@ -0,0 +1,45 @@
>> +
>> +#include "optabs-query.h"
>
> I haven't quite followed amacleod's work on the #includes, so I wasn't 
> quite sure whether headers are supposed to include other headers these 
> days. But as far as I can tell that's fine, So, patch ok.

Thanks, applied (slightly later than planned).

Richard



Re: Openacc launch API

2015-09-17 Thread Nathan Sidwell

Updated patch addressing your points.  Some  further comments though ...


+  while (GOMP_LAUNCH_PACK (GOMP_LAUNCH_END, 0, 0)
+ != (tag = va_arg (ap, unsigned)))


That's a somewhat non-idiomatic way to write this, with the constant first and
not obviously a constant. I'd initialize a variable with the constant before the
loop.


I went with
  while ((tag = va_arg (...)) != 0) ...

and killed GOMP_LAUNCH_END throughout, using explicit '0'.


+  assert (!GOMP_LAUNCH_DEVICE (tag));


Uh, that seems unfriendly, and not exactly forwards compatible. Can that fail a
bit more gracefully? (Alternatively, implement the device_type stuff now so that
we don't have TODOs in the code and don't have to worry about compatibility
issues.)


Added call to gomp_fatal, indicating libgomp is out of date. Also added a 
default to the switch following with the same effect.  The trouble  with 
implementing handling of device_type here now, is difficulty in testing its 
correctness.  If it were  buggy we'd be in a worse position than not having it.



+GOACC_2.0,1 {
+  global:
+GOACC_parallel_keyed;
+} GOACC_2.0;


Did you mean to use a comma?


I misunderstood your comment as 'did you mean to use a comma where you used 
something else', not 'is that comma a typo?'  well spotted!



+else if (!tagging)


Oh... so tagging controls two different methods for constructing argument lists,
one for GOACC_parallel and the other for whatever OMP uses? That's a bit
unfortunate, I'll need to think about it for a bit or defer to Jakub.


My earlier description was lacking.  The memory arguments have already been 
pushed before that switch.  This is just dealing with async & wait args.  I 
found it easier to modify the existing code path and have a tagging flag, rather 
than duplicate it.


nathan
o2015-09-17  Nathan Sidwell  

	inlude/
	* gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment.
	(GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX,
	GOMP_DIM_MASK): New.
	(GOMP_LAUNCH_DIM, GOMP_LAUNCH_ASYNC, GOMP_LAUNCH_WAIT): New.
	(GOMP_LAUNCH_CODE_SHIFT, GOMP_LAUNCH_DEVICE_SHIFT,
	GOMP_LAUNCH_OP_SHIFT): New.
	(GOMP_LAUNCH_PACK, GOMP_LAUNCH_CODE, GOMP_LAUNCH_DEVICE,
	GOMP_LAUNCH_OP): New.
	(GOMP_LAUNCH_OP_MAX): New.

	libgomp/
	* libgomp.h (acc_dispatch_t): Replace separate geometry args with
	array.
	* libgomp.map (GOACC_parallel_keyed): New.
	* oacc-parallel.c (goacc_wait): Take pointer to va_list.  Adjust
	all callers.
	(GOACC_parallel_keyed): New interface.  Lose geometry arguments
	and take keyed varargs list.  Adjust call to exec_func.
	(GOACC_parallel): Forward to GACC_parallel_keyed.
	* libgomp_g.h (GOACC_parallel): Remove.
	(GOACC_parallel_keyed): Declare.
	* plugin/plugin-nvptx.c (struct targ_fn_launch): New struct.
	(stuct targ_gn_descriptor): Replace name field with launch field.
	(nvptx_exec): Lose separate geometry args, take array.  Process
	dynamic dimensions and adjust.
	(struct nvptx_tdata): Replace fn_names field with fn_descs.
	(GOMP_OFFLOAD_load_image): Adjust for change in function table
	data.
	(GOMP_OFFLOAD_openacc_parallel): Adjust for change in dimension
	passing.
	* oacc-host.c (host_openacc_exec): Adjust for change in dimension
	passing.

	gcc/
	* config/nvptx/nvptx.c: Include omp-low.h and gomp-constants.h.
	(nvptx_record_offload_symbol): Record function execution geometry.
	* config/nvptx/mkoffload.c (process): Include launch geometry in
	function data.
	* omp-low.c (oacc_launch_pack): New.
	(replace_oacc_fn_attrib): New.
	(set_oacc_fn_attrib): New.
	(get_oacc_fn_attrib): New.
	(expand_omp_target): Create keyed varargs for GOACC_parallel call
	generation.
	* omp-low.h (get_oacc_fn_attrib): Declare.
	* builtin-types.def (DEF_FUNCTION_TyPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.
	* tree.h (OMP_CLAUSE_EXPR): New.
	* omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Change target fn name.

	gcc/lto/
	* lto-lang.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

	gcc/c-family/
	* c-common.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

	gcc/fortran/
	* f95-lang.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.
	* types.def (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

Index: include/gomp-constants.h
===
--- include/gomp-constants.h	(revision 227862)
+++ include/gomp-constants.h	(working copy)
@@ -115,11 +115,33 @@ enum gomp_map_kind
 
 /* Versions of libgomp and device-specific plugins.  */
 #define GOMP_VERSION	0
-#define GOMP_VERSION_NVIDIA_PTX 0
+#define GOMP_VERSION_NVIDIA_PTX 1
 #define GOMP_VERSION_INTEL_MIC 0
 
 #define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
 #define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0x)
 #define GOMP_VERSION_DEV(PACK) ((PACK) & 0x)
 
+#define GOMP_DIM_GANG	0
+#define GOMP_DIM_WORKER	1
+#define GOMP_DIM_VECTOR	2
+#define GOMP_DIM_MAX	3
+#define 

[patch] Only do shrink_to_fit() when exceptions enabled

2015-09-17 Thread Jonathan Wakely

When exceptions are disabled a failed allocation while trying to
shrink_to_fit() will abort the program. Since shrink_to_fit() is a
non-binding request we should just ignore it rather than risk taking
down the whole process.

Tested powerpc64le-linux, committed to trunk.


commit 13cf19282acf42a52d5eacd2c293a944bd3e5ebe
Author: Jonathan Wakely 
Date:   Thu Sep 17 00:07:33 2015 +0100

Only do shrink_to_fit() when exceptions enabled

	* include/bits/allocator.h (__shrink_to_fit_aux::_S_do_it):
	Do nothing if exceptions are disabled.
	* include/bits/basic_string.h (basic_string::shrink_to_fit): Likewise.

diff --git a/libstdc++-v3/include/bits/allocator.h b/libstdc++-v3/include/bits/allocator.h
index 6fd3214..0131521 100644
--- a/libstdc++-v3/include/bits/allocator.h
+++ b/libstdc++-v3/include/bits/allocator.h
@@ -209,15 +209,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static bool
   _S_do_it(_Tp& __c) noexcept
   {
-	__try
+#if __cpp_exceptions
+	try
 	  {
 	_Tp(__make_move_if_noexcept_iterator(__c.begin()),
 		__make_move_if_noexcept_iterator(__c.end()),
 		__c.get_allocator()).swap(__c);
 	return true;
 	  }
-	__catch(...)
+	catch(...)
 	  { return false; }
+#else
+	return false;
+#endif
   }
 };
 #endif
diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index e6e7bb5..b5e7e36 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -833,13 +833,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   void
   shrink_to_fit() noexcept
   {
+#if __cpp_exceptions
 	if (capacity() > size())
 	  {
-	__try
+	try
 	  { reserve(0); }
-	__catch(...)
+	catch(...)
 	  { }
 	  }
+#endif
   }
 #endif
 
@@ -3282,12 +3284,14 @@ _GLIBCXX_END_NAMESPACE_CXX11
   void
   shrink_to_fit() _GLIBCXX_NOEXCEPT
   {
+#if __cpp_exceptions
 	if (capacity() > size())
 	  {
-	__try
+	try
 	  { reserve(0); }
-	__catch(...)
+	catch(...)
 	  { }
+#endif
 	  }
   }
 #endif


Re: [C++] Coding rule enforcement

2015-09-17 Thread Ville Voutilainen
>>>+  else if (warn_multiple_inheritance)
>>>+warning (OPT_Wmultiple_inheritance,
>>>+ "%qT defined with multiple direct bases", ref);
>>You don't need to guard the warning with a check of the warning flag; warning
>>will only give the warning if the option is enabled.
>the spelling mistake on the option I used to experiment on didn't help, but I 
>>discovered one must have the Var clause in the options file too, which I 
>didn't expect.
>Anyway, fixed now.  ok?

Note that there are a couple of places where 'namespace' is misspelled
as 'namepace' in the patch.


Re: [patch] libstdc++/65142 Check read() result in std::random_device.

2015-09-17 Thread Jonathan Wakely

On 15/09/15 12:47 +0100, Jonathan Wakely wrote:

On 11/09/15 14:44 +0100, Jonathan Wakely wrote:

We should not silently ignore a failure to read from the random
device.

Tested powerpc64le-linux, committed to trunk. I'm going to commit this
to the gcc-5 branch too.





commit 2d2f7012dc3744dafef0de94dd845bd190253dbd
Author: Jonathan Wakely 
Date:   Fri Feb 20 17:29:50 2015 +

  Check read() result in std::random_device.
PR libstdc++/65142
* src/c++11/random.cc (random_device::_M_getval()): Check read result.

diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index edf900f..1d102c7 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -130,13 +130,17 @@ namespace std _GLIBCXX_VISIBILITY(default)
#endif

   result_type __ret;
+
#ifdef _GLIBCXX_HAVE_UNISTD_H
-read(fileno(static_cast(_M_file)),
-static_cast(&__ret), sizeof(result_type));
+auto e = read(fileno(static_cast(_M_file)),
+ static_cast(&__ret), sizeof(result_type));
#else
-std::fread(static_cast(&__ret), sizeof(result_type),
-  1, static_cast(_M_file));
+auto e = std::fread(static_cast(&__ret), sizeof(result_type),
+   1, static_cast(_M_file));
#endif
+if (e != sizeof(result_type))
+  __throw_runtime_error(__N("random_device could not read enough bytes"));
+
   return __ret;
 }



Florian pointed out that this code should handle short reads (or
EINTR) by retrying in a loop, so here's another attempt to fix it.

This also fixes the bug I introduced in the std::fread() case where it
expects e == sizeof(result_type) but fread will only return 0 or 1.

We could loop in the fread case too, but I'm not doing that. If
someone on non-POSIX targets cares enough they can make that change
later.

Any comments on this version?


Committed to trunk.



commit 6700c8c652da94332562fff465a1569d8fa9c94d
Author: Jonathan Wakely 
Date:   Tue Sep 15 11:02:42 2015 +0100

   Fix handling of short reads in std::random_device
   
   	PR libstdc++/65142

* src/c++11/random.cc (random_device::_M_getval()): Retry after short
reads.

diff --git a/libstdc++-v3/src/c++11/random.cc b/libstdc++-v3/src/c++11/random.cc
index 1d102c7..f1d6125 100644
--- a/libstdc++-v3/src/c++11/random.cc
+++ b/libstdc++-v3/src/c++11/random.cc
@@ -130,16 +130,26 @@ namespace std _GLIBCXX_VISIBILITY(default)
#endif

result_type __ret;
-
+void* p = &__ret;
+size_t n = sizeof(result_type);
#ifdef _GLIBCXX_HAVE_UNISTD_H
-auto e = read(fileno(static_cast(_M_file)),
- static_cast(&__ret), sizeof(result_type));
+do
+  {
+   const int e = read(fileno(static_cast(_M_file)), p, n);
+   if (e > 0)
+ {
+   n -= e;
+   p = static_cast(p) + e;
+ }
+   else if (e != -1 || errno != EINTR)
+ __throw_runtime_error(__N("random_device could not be read"));
+  }
+while (n > 0);
#else
-auto e = std::fread(static_cast(&__ret), sizeof(result_type),
-   1, static_cast(_M_file));
+const size_t e = std::fread(p, n, 1, static_cast(_M_file));
+if (e != 1)
+  __throw_runtime_error(__N("random_device could not be read"));
#endif
-if (e != sizeof(result_type))
-  __throw_runtime_error(__N("random_device could not read enough bytes"));

return __ret;
  }




Re: dejagnu version update?

2015-09-17 Thread Richard Earnshaw
On 16/09/15 17:36, Jeff Law wrote:
> On 09/16/2015 10:25 AM, Ramana Radhakrishnan wrote:
>>
>>
>> On 16/09/15 17:14, Mike Stump wrote:
>>> On Sep 16, 2015, at 12:29 AM, Andreas Schwab 
>>> wrote:
 Mike Stump  writes:

> The software presently works with 1.4.4 and there aren’t any
> changes that require anything newer.

 SLES 12 has 1.4.4.
>>>
>>> Would be nice to cover them as well, but their update schedule, 3-4
>>> years, means that their next update is 2018.  They didn’t update to
>>> a 3 year old stable release of dejagnu for their last OS, meaning
>>> they are on a > 7 year update cycle.  I love embedded and really
>>> long term support cycles (20 years), but, don’t think we should
>>> cater to the 20 year cycle just yet.  :-)  Since 7 is substantially
>>> longer than 2, I don’t think we should worry about it.  If they had
>>> updated at the time, they would have had 3 years of engineering and
>>> testing before the release and _had_ 1.5.
>>>
>>
>> Sorry about the obvious (possibly dumb) question.
>>
>> Can't we just import a copy of dejagnu each year and install it as
>> part of the source tree ? I can't imagine installing dejagnu is
>> adding a huge amount of time to build and regression test time ?
>> Advantage is that everyone is guaranteed to be on the same version. I
>> fully expect resistance due to specific issues with specific versions
>> of tcl and expect, but if folks aren't aware of this .
> That should work -- certainly that's the way we used to do things at
> Cygnus.  Some of that code may have bitrotted as single tree builds have
> fallen out-of-favor through the years.
> 
> As to whether or  not its a good idea.  I'm torn -- I don't like copying
> code from other repos because of the long term maintenance concerns.
> 
> I'd rather just move to 1.5 and get on with things.  If some systems
> don't have a new enough version, I'm comfortable telling developers on
> those platforms that they need to update.  It's not like every *user*
> needs dejagnu, it's just for the testing side of things.
> 
> 
> jeff

I don't see it as a major issue to have your own private build of
dejagnu rather than the system supplied one.  The only local change you
need is to add it to the front of your path before testing.

Dejagnu does not heavily depend on system libraries, it is not built
directly into GCC is pretty independent on the version of expect that
you have on your machine (likely the system version will serve fine).
So why don't we just migrate to the latest and greatest version as our
standard and be done with these old versions that are lying around?

R.



Re: [PATCH] start of rewrite of unordered_{set,map}

2015-09-17 Thread Jonathan Wakely

Oops, I meant to reply to the lists, not just Geoff ...


On 16/09/15 18:35 +0100, Jonathan Wakely wrote:

On 16/09/15 10:08 -0700, Geoff Pike wrote:

Also, to clarify, I am primarily seeking high-level comments; I am new
here and don't want to waste anybody's time.


Hi Geoff,

This looks very interesting indeed, but we've just gone through one
big ABI break earlier this year and my nerves and our users probably
can't take another one in the near future!

That means *replacing* the current containers is unlikely in the short
term, but we could certainly offer alternatives (as e.g.
__gnu_cxx::unordered_map instead of std::unordered_map), and could
maybe have a configure option that would make them the default.

So I don't want to discourage you from working on this, but would have
been a lot more positive about the proposed changes 18 months ago :-)


P.S.  While I have your attention, could I ask you to comment on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55815 ? I think I tried
to reach you about it a while back, via someone else at Google, but
failed. That's another case where we have probably missed our chance
to change the default (i.e. std::hash) but we could still offer a
__gnu_cxx::__sip_hash functor as an extension.




Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-17 Thread Bill Schmidt
On Thu, 2015-09-17 at 09:39 +0200, Richard Biener wrote:
> On Wed, 16 Sep 2015, Alan Lawrence wrote:
> 
> > On 16/09/15 17:10, Bill Schmidt wrote:
> > > 
> > > On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote:
> > > > On 16/09/15 15:28, Bill Schmidt wrote:
> > > > > 2015-09-16  Bill Schmidt  
> > > > > 
> > > > >   * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX,
> > > > > UNSPEC_REDUC_SMIN,
> > > > >   UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, 
> > > > > UNSPEC_REDUC_SMAX_SCAL,
> > > > >   UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL,
> > > > >   UNSPEC_REDUC_UMIN_SCAL): New enumerated constants.
> > > > >   (reduc_smax_v2di): New define_expand.
> > > > >   (reduc_smax_scal_v2di): Likewise.
> > > > >   (reduc_smin_v2di): Likewise.
> > > > >   (reduc_smin_scal_v2di): Likewise.
> > > > >   (reduc_umax_v2di): Likewise.
> > > > >   (reduc_umax_scal_v2di): Likewise.
> > > > >   (reduc_umin_v2di): Likewise.
> > > > >   (reduc_umin_scal_v2di): Likewise.
> > > > >   (reduc_smax_v4si): Likewise.
> > > > >   (reduc_smin_v4si): Likewise.
> > > > >   (reduc_umax_v4si): Likewise.
> > > > >   (reduc_umin_v4si): Likewise.
> > > > >   (reduc_smax_v8hi): Likewise.
> > > > >   (reduc_smin_v8hi): Likewise.
> > > > >   (reduc_umax_v8hi): Likewise.
> > > > >   (reduc_umin_v8hi): Likewise.
> > > > >   (reduc_smax_v16qi): Likewise.
> > > > >   (reduc_smin_v16qi): Likewise.
> > > > >   (reduc_umax_v16qi): Likewise.
> > > > >   (reduc_umin_v16qi): Likewise.
> > > > >   (reduc_smax_scal_): Likewise.
> > > > >   (reduc_smin_scal_): Likewise.
> > > > >   (reduc_umax_scal_): Likewise.
> > > > >   (reduc_umin_scal_): Likewise.
> > > > 
> > > > You shouldn't need the non-_scal reductions. Indeed, they shouldn't be
> > > > used if
> > > > the _scal are present. The non-_scal's were previously defined as
> > > > producing a
> > > > vector with one element holding the result and the other elements all
> > > > zero, and
> > > > this was only ever used with a vec_extract immediately after; the _scal
> > > > pattern
> > > > now includes the vec_extract as well. Hence the non-_scal patterns are
> > > > deprecated / considered legacy, as per md.texi.
> > > 
> > > Thanks -- I had misread the description of the non-scalar versions,
> > > missing the part where the other elements are zero.  What I really
> > > want/need is an optab defined as computing the maximum value in all
> > > elements of the vector.
> > 
> > Yes, indeed. It seems reasonable to me that this would coexist with an optab
> > which computes only a single value (i.e. a scalar).
> 
> So just to clarify - you need to reduce the vector with max to a scalar
> but want the (same) result in all vector elements?

Yes.  Alan Hayward's cond-reduction patch is set up to perform a
reduction to scalar, followed by a scalar broadcast to get the value
into all positions.  It happens that our most efficient expansion to
reduce to scalar will naturally produce the value in all positions.
Using the reduc_smax_scal_ expander, we are required to extract
this into a scalar and then perform the broadcast.  Those two operations
are unnecessary.  We can clean up after the fact, but the cost model
will still reflect the unnecessary activity.

> 
> > At that point it might be appropriate to change the cond-reduction code to
> > generate the reduce-to-vector in all cases, and optabs.c expand it to
> > reduce-to-scalar + broadcast if reduce-to-vector was not available. Along 
> > with
> > the (parallel) changes to cost model already proposed, does that cover all 
> > the
> > cases? It does add a new tree code, yes, but I'm feeling that could be
> > justified if we go down this route.
> 
> I'd rather have combine recognize an insn that does both (if it
> exists).  As I understand powerpc doesn't have reduction to scalar
> (I think no ISA actually has this, but all ISAs have reduce to
> one vector element plus a cheap way of extraction (effectively a subreg))
> but it's reduction already has all result vector elements set to the
> same value (as opposed to having some undefined or zero or whatever).

The extraction is not necessarily so cheap.  For floating-point values
in a register, we can indeed use a subreg.  But in this case we are
talking about integers, and an extract must move them from the vector
registers to the GPRs.  With POWER8, we can do this with a 5-cycle move
instruction.  Prior to that, we can only do this with a very expensive
path through memory, as previous processors had no direct path between
the vector and integer registers.

> 
> > However, another point that springs to mind: if you reduce a loop containing
> > OR or MUL expressions, the vect_create_epilog_for_reduction reduces these
> > using shifts, and I think will also use 

Re: [PATCH, fortran] PR 53379 Backtrace on error termination

2015-09-17 Thread Ian Lance Taylor
On Thu, Sep 17, 2015 at 7:08 AM, David Edelsohn  wrote:
> On Wed, Sep 9, 2015 at 12:12 AM, Ian Lance Taylor  wrote:
>> Mike Stump  writes:
>>
>>> Not a big issue, but slightly better if (O_CLOEXEC>>32) != 0 is also
>>> true.  See, if AIX should ever define this to a sensible value, the
>>> above would disappear the feature.  However, if they did, then this
>>> expression should then be false.
>>
>> Yes, I think this might be even better in code.  How about something
>> like
>>
>>   /* On some versions of AIX O_CLOEXEC does not fit in int, so use a
>>  cast to force it.  */
>>   descriptor = open (filename, (int) (O_RDONLY | O_BINARY | O_CLOEXEC));
>>
>> Does that work on AIX?
>
> Any decision about this patch?

This patch is OK if it works.

Ian


Re: [RFC] Masking vectorized loops with bound not aligned to VF.

2015-09-17 Thread Ilya Enkovich
2015-09-16 15:30 GMT+03:00 Richard Biener :
> On Mon, 14 Sep 2015, Kirill Yukhin wrote:
>
>> Hello,
>> I'd like to initiate discussion on vectorization of loops which
>> boundaries are not aligned to VF. Main target for this optimization
>> right now is x86's AVX-512, which features per-element embedded masking
>> for all instructions. The main goal for this mail is to agree on overall
>> design of the feature.
>>
>> This approach was presented @ GNU Cauldron 2015 by Ilya Enkovich [1].
>>
>> Here's a sketch of the algorithm:
>>   1. Add check on basic stmts for masking: possibility to introduce index 
>> vector and
>>  corresponding mask
>>   2. At the check if statements are vectorizable we additionally check if 
>> stmts
>>  need and can be masked and compute masking cost. Result is stored in 
>> `stmt_vinfo`.
>>  We are going  to mask only mem. accesses, reductions and modify mask 
>> for already
>>  masked stmts (mask load, mask store and vect. condition)
>
> I think you also need to mask divisions (for integer divide by zero) and
> want to mask FP ops which may result in NaNs or denormals (because that's
> generally to slow down execution a lot in my experience).
>
> Why not simply mask all stmts?

Hi,

Statement masking may be not free. Especially if we need to transform
mask somehow to do it. It also may be unsupported on a platform (e.g.
for AVX-512 not all instructions support masking) but still not be a
problem to mask a loop. BTW for AVX-512 masking doesn't boost
performance even if we have some special cases like NaNs. We don't
consider exceptions in vector code (and it seems to be a case now?)
otherwise we would need to mask them also.

>
>>   3. Make a decision about masking: take computed costs and est. iterations 
>> count
>>  into consideration
>>   4. Modify prologue/epilogue generation according decision made at 
>> analysis. Three
>>  options available:
>> a. Use scalar remainder
>> b. Use masked remainder. Won't be supported in first version
>> c. Mask main loop
>>   5.Support vectorized loop masking:
>> - Create stmts for mask generation
>> - Support generation of masked vector code (create generic vector code 
>> then
>>   patch it w/ masks)
>>   -  Mask loads/stores/vconds/reductions only
>>
>>  In first version (targeted v6) we're not going to support 4.b and loop
>> mask pack/unpack. No `pack/unpack` means that masking will be supported
>> only for types w/ the same size as index variable
>
> This means that if ncopies for any stmt is > 1 masking won't be supported,
> right?  (you'd need two or more different masks)

We don't think it is a very important feature to have in initial
version. It can be added later and shouldn't affect overall
implementation design much. BTW currently masked loads and stores
don't support masks of other sizes and don't do masks pack/unpack.

>
>>
>> [1] - 
>> https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile=view=Vectorization+for+Intel+AVX-512.pdf
>>
>> What do you think?
>
> There was the idea some time ago to use single-iteration vector
> variants for prologues/epilogues by simply overlapping them with
> the vector loop (and either making sure to mask out the overlap
> area or make sure the result stays the same).  This kind-of is
> similar to 4b and thus IMHO it's better to get 4b implemented
> rather than trying 4c.  So for example
>
>  int a[];
>  for (i=0; i < 13; ++i)
>a[i] = i;
>
> would be vectorized (with v4si) as
>
>  for (i=0; i < 13 / 4; ++i)
>((v4si *)a)[i] = { ... };
>  *(v4si *)([9]) = { ... };
>
> where the epilogue store of course would be unaligned.  The masked
> variant can avoid the data pointer adjustment and instead use a masked
> store.
>
> OTOH it might be that the unaligned scheme is as efficient as the
> masked version.  Only the masked version is more trivially correct,
> data dependences can make the above idea not work without masking
> out stores like for
>
>  for (i=0; i < 13; ++i)
>a[i] = a[i+1];
>
> obviously the overlapping iterations in the epilogue would
> compute bogus values.  To avoid this we can merge the result
> with the previously stored values (using properly computed masks)
> before storing it.
>
> Basically both 4b and the above idea need to peel a vector
> iteration and "modify" it.  The same trick can be applied to
> prologue loops of course.
>
> Any chance you can try working on 4b instead?  It also feels
> like it would need less hacks throughout the vectorizer
> (basically post-processing the generated vector loop).
>
> If 4b is implemented I don't think 4c is worth doing.

I agree 4b is a more universal approach and should cover more cases.
But we consider 4c is the first step to have 4b. I think significant
difference of 4b from a replay you described is that in 4b masked
remainder may be used to execute short trip count loops which mean we
can execute masked remainder without actually executing main

Re: [Patch, fortran] PR40054 and PR63921 - Implement pointer function assignment - redux

2015-09-17 Thread Mikael Morin

Le 06/09/2015 18:40, Paul Richard Thomas a écrit :

It helps to attach the patch :-)

Paul

On 6 September 2015 at 13:42, Paul Richard Thomas
 wrote:

Dear All,

The attached patch more or less implements the assignment of
expressions to the result of a pointer function. To wit:

my_ptr_fcn (arg1, arg2...) = expr

arg1 would usually be the target, pointed to by the function. The
patch parses these statements and resolves them into:

temp_ptr => my_ptr_fcn (arg1, arg2...)
temp_ptr = expr

I say more or less implemented because I have ducked one of the
headaches here. At the end of the specification block, there is an
ambiguity between statement functions and pointer function
assignments. I do not even try to resolve this ambiguity and require
that there be at least one other type of executable statement before
these beasts. This can undoubtedly be fixed but the effort seems to me
to be unwarranted at the present time.

This version of the patch extends the coverage of allowed rvalues to
any legal expression. Also, all the problems with error.c have been
dealt with by Manuel's patch.

I am grateful to Dominique for reminding me of PR40054 and pointing
out PR63921. After a remark of his on #gfortran, I fixed the checking
of the standard to pick up all the offending lines with F2003 and
earlier.


Bootstraps and regtests on FC21/x86_64 - OK for trunk?


Hello Paul,

I'm mostly concerned about the position where the code rewriting happens.
Details below.

Mikael




submit_2.diff




Index: gcc/fortran/parse.c
===
*** gcc/fortran/parse.c (revision 227508)
--- gcc/fortran/parse.c (working copy)
*** decode_statement (void)
*** 356,362 
--- 357,371 

match (NULL, gfc_match_assignment, ST_ASSIGNMENT);
match (NULL, gfc_match_pointer_assignment, ST_POINTER_ASSIGNMENT);
+
+   if (in_specification_block)
+ {
match (NULL, gfc_match_st_function, ST_STATEMENT_FUNCTION);
+ }
+   else if (!gfc_notification_std (GFC_STD_F2008))
+ {
+   match (NULL, gfc_match_ptr_fcn_assign, ST_ASSIGNMENT);
+ }

As match exits the function upon success, I think it makes sense to move 
match (... gfc_match_ptr_fcn_assign ...) out of the else, namely:


  if (in_specification_block)
{
  /* match statement function */
}

  /* match pointer fonction assignment */

so that non-ambiguous cases are recognized with gfc_match_ptr_fcn_assign.
Non-ambiguous cases are for example the ones where one of the function 
arguments is a non-variable, or a variable with a subreference, or when 
there is one keyword argument. Example (rejected with unclassifiable 
statement):


program p
  integer, parameter :: b = 3
  integer, target:: a = 2

  func(arg=b) = 1
  if (a /= 1) call abort

  func(b + b - 3) = -1
  if (a /= -1) call abort

contains
  function func(arg) result(r)
integer, pointer :: r
integer :: arg

if (arg == 3) then
  r => a
else
  r => null()
end if
  end function func
end program p



Index: gcc/fortran/resolve.c
===
*** gcc/fortran/resolve.c   (revision 227508)
--- gcc/fortran/resolve.c   (working copy)
*** generate_component_assignments (gfc_code
*** 10133,10138 
--- 10141,10205 
  }


+ /* F2008: Pointer function assignments are of the form:
+   ptr_fcn (args) = expr
+This function breaks these assignments into two statements:
+   temporary_pointer => ptr_fcn(args)
+   temporary_pointer = expr  */
+
+ static bool
+ resolve_ptr_fcn_assign (gfc_code **code, gfc_namespace *ns)
+ {
+   gfc_expr *tmp_ptr_expr;
+   gfc_code *this_code;
+   gfc_component *comp;
+   gfc_symbol *s;
+
+   if ((*code)->expr1->expr_type != EXPR_FUNCTION)
+ return false;
+
+   /* Even if standard does not support this feature, continue to build
+  the two statements to avoid upsetting frontend_passes.c.  */
I don't mind this, but maybe we should return false at the end, when an 
error has been emitted?



+   gfc_notify_std (GFC_STD_F2008, "Pointer procedure assignment at "
+ "%L", &(*code)->loc);
+
+   comp = gfc_get_proc_ptr_comp ((*code)->expr1);
+
+   if (comp)
+ s = comp->ts.interface;
+   else
+ s = (*code)->expr1->symtree->n.sym;
+
+   if (s == NULL || !s->result->attr.pointer)
+ {
+   gfc_error ("F2008: The function result at %L must have "
+"the pointer attribute.", &(*code)->expr1->where);
+   /* Return true because we want a break after the call.  */

Hum, I would rather not do this if possible.  Do we really need the break?


+   return true;
+ }
+
+   tmp_ptr_expr = get_temp_from_expr ((*code)->expr2, ns);
+
+   /* get_temp_from_expression is set up for ordinary assignments. To that
+  end, where array bounds are not known, arrays are made allocatable.
+  Change the temporary to a pointer 

[patch] Remove redundant conditional expressions in

2015-09-17 Thread Jonathan Wakely

This is silly and makes it harder to read:

 return _M_value != 0 ? true : false; 


Tested powerpc64le-linux, committed to trunk.


commit ca8d1e0e92810ac72b337247c10af1de1de465d4
Author: Jonathan Wakely 
Date:   Thu Sep 17 15:07:24 2015 +0100

Remove redundant conditional expressions in 

	* include/std/system_error (error_code::operator bool(),
	error_condition::operator bool()): Remove redundant conditional
	expression.

diff --git a/libstdc++-v3/include/std/system_error b/libstdc++-v3/include/std/system_error
index 92f8af9..cc82bdf 100644
--- a/libstdc++-v3/include/std/system_error
+++ b/libstdc++-v3/include/std/system_error
@@ -181,7 +181,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return category().message(value()); }
 
 explicit operator bool() const noexcept
-{ return _M_value != 0 ? true : false; }
+{ return _M_value != 0; }
 
 // DR 804.
   private:
@@ -257,7 +257,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return category().message(value()); }
 
 explicit operator bool() const noexcept
-{ return _M_value != 0 ? true : false; }
+{ return _M_value != 0; }
 
 // DR 804.
   private:


Re: [PATCH] Fix PR64078

2015-09-17 Thread Marek Polacek
On Wed, Sep 09, 2015 at 07:48:15PM +0200, Bernd Edlinger wrote:
> Hi,
> 
> On Wed, 9 Sep 2015 09:31:33, Jeff Law wrote:
> > You could probably make the function static or change its visibility via
> > a function attribute (there's a visibility attribute which can take the
> > values default, hidden protected or internal). Default visibility
> > essentially means the function can be overridden. I think changing it
> > to "protected" might work. Note if we do that, we may need some kind of
> > target selector on the test since not all targets support the various
> > visibility attributes.
> >
> 
> Yes, it works both ways: static works, and __attribute__ ((visibility 
> ("protected"))) works too:
> 
> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c 
> --target_board='unix{-fpic,-mcmodel=medium,-fpic\ 
> -mcmodel=medium,-mcmodel=large,-fpic\ -mcmodel=large}'"
> 
> has all tests passed, but..
> 
> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c 
> --target_board='unix{-fno-inline}'"
> 
> still fails in the same way for all workarounds: inline, static, and 
> __attribute__ ((visibility ("protected"))).
> 
> Maybe "static" would be preferable?

Yeah, I'd go with static if that helps.  I'd rather avoid playing games
with visibility.

Marek


Go compiler patch: issue receive type errors earlier

2015-09-17 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to issue type
errors earlier for a receive operation.  This fixes
https://golang.org/issue/12323 .  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 227834)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1cb26dc898bda1e85f4dd2ee204adbce792e4813
+e069d4417a692c1261df99fe3323277e1a0193d2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 227834)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -13502,9 +13502,14 @@ Expression::make_heap_expression(Express
 Type*
 Receive_expression::do_type()
 {
+  if (this->is_error_expression())
+return Type::make_error_type();
   Channel_type* channel_type = this->channel_->type()->channel_type();
   if (channel_type == NULL)
-return Type::make_error_type();
+{
+  this->report_error(_("expected channel"));
+  return Type::make_error_type();
+}
   return channel_type->element_type();
 }
 
@@ -13516,6 +13521,7 @@ Receive_expression::do_check_types(Gogo*
   Type* type = this->channel_->type();
   if (type->is_error())
 {
+  go_assert(saw_errors());
   this->set_is_error();
   return;
 }
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc (revision 227696)
+++ gcc/go/gofrontend/statements.cc (working copy)
@@ -3856,7 +3856,10 @@ Switch_statement::do_lower(Gogo*, Named_
   if (this->val_ != NULL
   && (this->val_->is_error_expression()
  || this->val_->type()->is_error()))
-return Statement::make_error_statement(loc);
+{
+  go_assert(saw_errors());
+  return Statement::make_error_statement(loc);
+}
 
   if (this->val_ != NULL
   && this->val_->type()->integer_type() != NULL


[RFC][PATCH] Preferred rename register in regrename pass

2015-09-17 Thread Robert Suchanek
Hi,

We came across a situation for MIPS64 where moves for sign-extension were
not converted into a nop because of IRA spilled some of the allocnos and
assigned different hard register for the output operand in the move.
LRA is not fixing this up as most likely the move was not introduced by
the LRA itself.  I found it hard to fix this in LRA and looked at
an alternative solution where regrename pass appeared to be the best candidate.

The patch below introduces a notion of a preferred rename register and attempts
to use the output register for an input register iff the input register dies
in an instruction.  The preferred register is validated and in the case it fails
to be validated, it falls back to the old technique of finding the best rename 
register.
Of course, it has slightly limited scope of use as it's not enabled be default,
however, when targeting performance one is likely to enable it via
-funroll-loops or -fpeel-loops.

I did some experiments with -funroll-loops on x86_64-unknown-linux-gnu and
the code size improved almost 0.4% on average case.  I haven't done an extensive
performance testing but it appeared SPEC2006 had some minor improvement on 
average,
which could be real improvement or just noise.
On MIPS64 with -funroll-loops, there were a number of cases where the 
unnecessary
moves turned into a nop in CSiBE.  MIPS32 also marginally improved but to
a lower degree.

The patch successfully passed x86_64-unknown-linux-gnu, mips-mti-linux-gnu and
mips-img-linux-gnu.

I'm not sure if this is something that should be enabled by default for everyone
or a target hook should be added.  Any other comments/suggestions?

Regards,
Robert

gcc/
* regrename.c (find_preferred_rename_reg): New function.
(record_operand_use): Remove assertion.  Allocate or resize heads and
chains vectors, if necessary.
(find_best_rename_reg): Use the new function and validate chosen
register.
(build_def_use): Don't allocate and initialize space of size 0.
(regrename_finish): Free heads and chains vectors.
(regrename_optimize): Pass true to initializing function.
* regrename.h (struct operand_rr_info): Replace arrays of heads and
chains with vectors.
---
 gcc/regrename.c | 86 +
 gcc/regrename.h |  4 +--
 2 files changed, 82 insertions(+), 8 deletions(-)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index c328c1b..90fee98 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -174,6 +174,51 @@ dump_def_use_chain (int from)
 }
 }
 
+/* Return a preferred rename register for HEAD.  */
+
+static int
+find_preferred_rename_reg (du_head_p head)
+{
+  struct du_chain *this_du;
+  int preferred_reg = -1;
+
+  for (this_du = head->first; this_du; this_du = this_du->next_use)
+{
+  rtx note;
+  insn_rr_info *p;
+
+  /* The preferred rename register is an output register iff an input
+register dies in an instruction but the candidate must be validated by
+check_new_reg_p.  */
+  for (note = REG_NOTES (this_du->insn); note; note = XEXP (note, 1))
+   if (insn_rr.exists()
+   && REG_NOTE_KIND (note) == REG_DEAD
+   && REGNO (XEXP (note, 0)) == head->regno
+   && (p = _rr[INSN_UID (this_du->insn)])
+   && p->op_info)
+ {
+   int i;
+   for (i = 0; i < p->op_info->n_chains; i++)
+ {
+   struct du_head *next_head = p->op_info->heads[i];
+   if (head != next_head)
+ {
+   preferred_reg = next_head->regno;
+   if (dump_file)
+ fprintf (dump_file,
+  "Chain %s (%d) has preferred rename register"
+  " %s for insn %d [%s]\n",
+  reg_names[head->regno], head->id,
+  reg_names[preferred_reg],
+  INSN_UID (this_du->insn),
+  reg_class_names[this_du->cl]);
+ }
+ }
+ }
+}
+  return preferred_reg;
+}
+
 static void
 free_chain_data (void)
 {
@@ -206,7 +251,16 @@ record_operand_use (struct du_head *head, struct du_chain 
*this_du)
 {
   if (cur_operand == NULL)
 return;
-  gcc_assert (cur_operand->n_chains < MAX_REGS_PER_ADDRESS);
+
+  if (!cur_operand->heads.exists ())
+cur_operand->heads.create (0);
+  if (!cur_operand->chains.exists ())
+cur_operand->chains.create (0);
+  if (cur_operand->heads.length () <= (unsigned) cur_operand->n_chains)
+cur_operand->heads.safe_grow_cleared (cur_operand->n_chains + 1);
+  if (cur_operand->chains.length () <= (unsigned) cur_operand->n_chains)
+cur_operand->chains.safe_grow_cleared (cur_operand->n_chains + 1);
+
   cur_operand->heads[cur_operand->n_chains] = head;
   cur_operand->chains[cur_operand->n_chains++] = this_du;
 }
@@ -355,6 +409,7 @@ 

Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Jonathan Wakely

On 17/09/15 12:16 +0100, Jonathan Wakely wrote:

On 16/09/15 17:42 -0600, Martin Sebor wrote:

I see now the first exists test will detect symlink loops in
the original path. But I'm not convinced there isn't a corner
case that's subject to a TOCTOU race condition between the first
exists test and the while loop during which a symlink loop can
be introduced.

Suppose we call the function with /foo/bar as an argument and
the path exists and contains no symlinks. result is / and cmpts
is set to { foo, bar }. Just as the loop is entered, /foo/bar
is replaced with a symlink containing /foo/bar. The loop then
proceeds like so:

1. The first iteration removes foo from cmpts and sets result
to /foo. cmpts is { bar }.

2. The second iteration removes bar from cmpts, sets result to
/foo/bar, determines it's a symlink, reads its contents, sees
it's an absolute pathname and replaces result with /. It then
inserts the symlink's components { foo, bar } into cmpts. cmpts
becomes { foo, bar }. exists(result) succeeds.

3. The next iteration of the loop has the same initial state
as the first.

But I could have very easily missed something that takes care
of this corner case. If I did, sorry for the false alarm!


No, you're right. The TS says such filesystem races are undefined:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4099.html#fs.race.behavior
but it would be nice to fail gracefully rather than DOS the
application.

The simplest approach would be to increment a counter every time we
follow a symlink, and if it reaches some limit decide something is
wrong and fail with ELOOP.

I don't see how anything else can be 100% bulletproof, because a truly
evil attacker could just keep altering the contents of symlinks so we
keep ping-ponging between two or more paths. If we keep track of paths
we've seen before the attacker could just keep changing the contents
to a unique path each time, that initially exists as a file, but by
the time we get to is_symlink() its become a symlink to a new path.

So if we use a counter, what's a sane maximum? Is MAXSYMLINKS in
 the value the kernel uses? 20 seems quite low, I was
thinking of a much higher number.


This patch sets ELOOP after following 40 symlinks.

I can also move the exists(result, ec) check to the end, because the
is_symlink(result, ec) call will already check for existence on every
iteration that adds a component to the result.

I've also simplified the error handling (when exists(p, ec) fails it
sets ENOENT anyway) and moved !ec into the loop condition, rather than
using 'fail(e); break;' on errors.

I'm quite happy with this version now.


diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index b5c8eb9..95146bf 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -98,6 +98,7 @@ fs::canonical(const path& p, const path& base, error_code& ec)
 {
   const path pa = absolute(p, base);
   path result;
+
 #ifdef _GLIBCXX_USE_REALPATH
   char_ptr buf{ nullptr };
 # if _XOPEN_VERSION < 700
@@ -119,18 +120,9 @@ fs::canonical(const path& p, const path& base, error_code& 
ec)
 }
 #endif
 
-  auto fail = [, ](int e) mutable {
-  if (!ec.value())
-   ec.assign(e, std::generic_category());
-  result.clear();
-  };
-
   if (!exists(pa, ec))
-{
-  fail(ENOENT);
-  return result;
-}
-  // else we can assume no unresolvable symlink loops
+return result;
+  // else: we know there are (currently) no unresolvable symlink loops
 
   result = pa.root_path();
 
@@ -138,18 +130,17 @@ fs::canonical(const path& p, const path& base, 
error_code& ec)
   for (auto& f : pa.relative_path())
 cmpts.push_back(f);
 
-  while (!cmpts.empty())
+  int max_allowed_symlinks = 40;
+
+  while (!cmpts.empty() && !ec)
 {
   path f = std::move(cmpts.front());
   cmpts.pop_front();
 
   if (f.compare(".") == 0)
{
- if (!is_directory(result, ec))
-   {
- fail(ENOTDIR);
- break;
-   }
+ if (!is_directory(result, ec) && !ec)
+   ec.assign(ENOTDIR, std::generic_category());
}
   else if (f.compare("..") == 0)
{
@@ -166,27 +157,30 @@ fs::canonical(const path& p, const path& base, 
error_code& ec)
  if (is_symlink(result, ec))
{
  path link = read_symlink(result, ec);
- if (!ec.value())
+ if (!ec)
{
- if (link.is_absolute())
+ if (--max_allowed_symlinks == 0)
+   ec.assign(ELOOP, std::generic_category());
+ else
{
- result = link.root_path();
- link = link.relative_path();
+ if (link.is_absolute())
+   {
+ result = link.root_path();
+ link = link.relative_path();
+   }
+ 

Re: [C++] Coding rule enforcement

2015-09-17 Thread Nathan Sidwell

On 09/16/15 10:23, Jason Merrill wrote:

On 09/16/2015 08:02 AM, Nathan Sidwell wrote:

+  else if (warn_multiple_inheritance)
+warning (OPT_Wmultiple_inheritance,
+ "%qT defined with multiple direct bases", ref);


You don't need to guard the warning with a check of the warning flag; warning
will only give the warning if the option is enabled.


the spelling mistake on the option I used to experiment on didn't help, but I 
discovered one must have the Var clause in the options file too, which I didn't 
expect.


Anyway, fixed now.  ok?

nathan
2015-09-17  Nathan Sidwell  

	c-family/
	* c.opt (Wmultiple-inheritance, Wvirtual-inheritance, Wtemplates,
	Wnamespaces): New C++ warnings.

	cp/
	* decl.c (xref_basetypes): Check virtual and/or multiple
	inheritance warning..
	* parser.c (cp_parser_namespace_definition): Check namespaces
	warning.
	* pt.c (push_template_decl_real): Check templates warning.

	* doc/invoke.texi  (-Wmultiple-inheritance, -Wvirtual-inheritance,
	-Wtemplates, -Wnamespaces): Document.

	testsuite/
	* g++.dg/diagostic/disable.C: New.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 227761)
+++ gcc/c-family/c.opt	(working copy)
@@ -573,6 +573,14 @@ Wmissing-field-initializers
 C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning EnabledBy(Wextra)
 Warn about missing fields in struct initializers
 
+Wmultiple-inheritance
+C++ ObjC++ Var(warn_multiple_inheritance) Warning
+Warn on direct multiple inheritance
+
+Wnamespaces
+C++ ObjC++ Var(warn_namespaces) Warning
+Warn on namespace definition
+
 Wsized-deallocation
 C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra)
 Warn about missing sized deallocation functions
@@ -610,6 +618,10 @@ Wswitch-bool
 C ObjC C++ ObjC++ Var(warn_switch_bool) Warning Init(1)
 Warn about switches with boolean controlling expression
 
+Wtemplates
+C++ ObjC++ Var(warn_templates) Warning
+Warn on primary template declaration
+
 Wmissing-format-attribute
 C ObjC C++ ObjC++ Alias(Wsuggest-attribute=format)
 ;
@@ -936,6 +948,10 @@ Wvolatile-register-var
 C ObjC C++ ObjC++ Var(warn_volatile_register_var) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn when a register variable is declared volatile
 
+Wvirtual-inheritance
+C++ ObjC++ Var(warn_virtual_inheritance) Warning
+Warn on direct virtual inheritance
+
 Wvirtual-move-assign
 C++ ObjC++ Var(warn_virtual_move_assign) Warning Init(1)
 Warn if a virtual base has a non-trivial move assignment operator
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 227761)
+++ gcc/cp/decl.c	(working copy)
@@ -12729,6 +12729,7 @@ xref_basetypes (tree ref, tree base_list
   tree binfo, base_binfo;
   unsigned max_vbases = 0; /* Maximum direct & indirect virtual bases.  */
   unsigned max_bases = 0;  /* Maximum direct bases.  */
+  unsigned max_dvbases = 0; /* Maximum direct virtual bases.  */
   int i;
   tree default_access;
   tree igo_prev; /* Track Inheritance Graph Order.  */
@@ -12766,12 +12767,13 @@ xref_basetypes (tree ref, tree base_list
 	{
 	  max_bases++;
 	  if (TREE_TYPE (*basep))
-	max_vbases++;
+	max_dvbases++;
 	  if (CLASS_TYPE_P (basetype))
 	max_vbases += vec_safe_length (CLASSTYPE_VBASECLASSES (basetype));
 	  basep = _CHAIN (*basep);
 	}
 }
+  max_vbases += max_dvbases;
 
   TYPE_MARKED_P (ref) = 1;
 
@@ -12814,6 +12816,9 @@ xref_basetypes (tree ref, tree base_list
 	  error ("Java class %qT cannot have multiple bases", ref);
   return false;
 }
+  else
+	warning (OPT_Wmultiple_inheritance,
+		 "%qT defined with multiple direct bases", ref);
 }
 
   if (max_vbases)
@@ -12825,6 +12830,9 @@ xref_basetypes (tree ref, tree base_list
 	  error ("Java class %qT cannot have virtual bases", ref);
   return false;
 }
+  else if (max_dvbases)
+	warning (OPT_Wvirtual_inheritance,
+		 "%qT defined with direct virtual base", ref);
 }
 
   for (igo_prev = binfo; base_list; base_list = TREE_CHAIN (base_list))
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 227761)
+++ gcc/cp/parser.c	(working copy)
@@ -16798,6 +16798,8 @@ cp_parser_namespace_definition (cp_parse
 
   has_visibility = handle_namespace_attrs (current_namespace, attribs);
 
+  warning  (OPT_Wnamespaces, "namepace %qD entered", current_namespace);
+
   /* Parse the body of the namespace.  */
   cp_parser_namespace_body (parser);
 
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 227761)
+++ gcc/cp/pt.c	(working copy)
@@ -5088,6 +5088,8 @@ push_template_decl_real (tree decl, bool
 
   if (is_primary)
 {
+  warning (OPT_Wtemplates, "template %qD declared", decl);
+  
   if (DECL_CLASS_SCOPE_P (decl))
 	member_template_p = true;
   if 

Re: [RFC][PATCH] Preferred rename register in regrename pass

2015-09-17 Thread Jeff Law

On 09/17/2015 08:38 AM, Robert Suchanek wrote:

Hi,

We came across a situation for MIPS64 where moves for sign-extension were
not converted into a nop because of IRA spilled some of the allocnos and
assigned different hard register for the output operand in the move.
LRA is not fixing this up as most likely the move was not introduced by
the LRA itself.  I found it hard to fix this in LRA and looked at
an alternative solution where regrename pass appeared to be the best candidate.
Yea, we've never been great at tying the source & destination of 
sign/zero extensions.  The inherently different modes often caused the 
old allocator (and pre-allocator bits like regmove, and post-allocator 
bits like reload) to 'give up'.





I'm not sure if this is something that should be enabled by default for everyone
or a target hook should be added.  Any other comments/suggestions?
I'll let Bernd comment on the patch itself.  But I would say that if 
you're setting up cases where we can tie the source/dest of an extension 
together, then it's a good thing.  It'll cause more of them to turn into 
NOPs and it'll make the redundant extension elimination pass more 
effective as well.  This would be something that I'd expect to benefit 
most architectures (obviously to varying degrees).



Jeff



Re: [patch match.pd c c++]: Ignore results of 'shorten_compare' and move missing patterns in match.pd

2015-09-17 Thread Jeff Law

On 09/17/2015 03:12 AM, Richard Biener wrote:

I wonder if we'd do better to first add new match.pd patterns, one at a
time, with tests, and evaluating them along the way by looking at the
dumps
or .s files across many systems.  Then when we think we've got the right
set, then look at what happens to those dumps/.s files if we make the
changes so that shorten_compare really just emits warnings.



As the shorten_compare function covers those transformations, I don't
see that this is something leading to something useful.  As long as we
call shorten_compare on folding in FEs, we won't see those patterns in
ME that easy.  And even more important is, that patterns getting
resolved if function gets invoked.


It will tell you what will be missed from a code generation standpoint if
you disable shorten_compare.  And that is the best proxy we have for
performance regressions related to disabling shorten_compare.

ie, in a local tree, you disable shorten_compare.  Then compare the code we
generate with and without shorten compare.  Reduce to a simple testcase.
Resolve the issue with a match.pd pattern (most likely) and repeat the
process.  In theory at each step the  number of things to deeply investigate
should be diminishing while at the same time you're building match.pd
patterns and tests we can include in the testsuite. And each step is likely
a new patch in the patch series -- the final patch of which disables
shorten_compare.

It's a lot of work, but IMHO, it's necessary to help ensure we don't have
code generation regressions.


As said in the other reply I successfully used gcc_unreachable ()s in
code in intend to remove and then inspect/fix all fallout from that during
bootstrap & test ...

Yeah, it's a lot of work (sometimes).  But it's way easier than to investigate
code changes (which may also miss cases as followup transforms may end
up causing the same code to be generated).
That'd be a fine approach as well, though it may not work in this case 
as shorten_compare would get called prior to the transformations in 
match.pd.  Though I would certainly agree in general that your approach 
is a good one.


jeff



RE: [PATCH] Fix PR64078

2015-09-17 Thread Bernd Edlinger
Hi,

On Thu, 17 Sep 2015 10:39:04, Jeff Law wrote:
>
> On 09/17/2015 09:00 AM, Marek Polacek wrote:
>> On Wed, Sep 09, 2015 at 07:48:15PM +0200, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> On Wed, 9 Sep 2015 09:31:33, Jeff Law wrote:
 You could probably make the function static or change its visibility via
 a function attribute (there's a visibility attribute which can take the
 values default, hidden protected or internal). Default visibility
 essentially means the function can be overridden. I think changing it
 to "protected" might work. Note if we do that, we may need some kind of
 target selector on the test since not all targets support the various
 visibility attributes.

>>>
>>> Yes, it works both ways: static works, and __attribute__ ((visibility 
>>> ("protected"))) works too:
>>>
>>> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c 
>>> --target_board='unix{-fpic,-mcmodel=medium,-fpic\ 
>>> -mcmodel=medium,-mcmodel=large,-fpic\ -mcmodel=large}'"
>>>
>>> has all tests passed, but..
>>>
>>> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c
>>> --target_board='unix{-fno-inline}'"
>>>
>>> still fails in the same way for all workarounds: inline, static, and 
>>> __attribute__ ((visibility ("protected"))).
>>>
>>> Maybe "static" would be preferable?
>>
>> Yeah, I'd go with static if that helps. I'd rather avoid playing games
>> with visibility.
> Static is certainly easier and doesn't rely on targets implementing all
> the visibility capabilities. So static is probably the best approach.
>

That's fine for me too, so is the original patch OK for trunk with 
s/inline/static/ ?

Thanks
Bernd.
  

[AArch64][PATCH 4/5] Use atomic load-operate instructions for fetch-update patterns.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch uses the ARMv8.1 atomic
load-operate instructions to implement the atomic_fetch_
patterns. This patch also updates the implementation of the atomic_
patterns, which are treated as versions of the atomic_fetch_ which
discard the loaded data.

The general form of the code generated for an atomic_fetch_, with
destination D, source S, memory address A and memory order MO, depends
on whether the operation is directly supported by the instruction. If
 is one of PLUS, IOR or XOR, the code generated is:

ld S, D, [A]

where
   is one {add, set, eor}
   is one of {'', 'a', 'l', 'al'} depending on memory order MO.
   is one of {'', 'b', 'h'} depending on the data size.

If  is SUB, the code generated, with scratch register r, is:

neg r, S
ldadd r, D, [A]

If  is AND, the code generated is:
not r, S
ldclr r, D, [A]

Any operation not in {PLUS, IOR, XOR, SUB, AND} is passed to the
existing aarch64_split_atomic_op function, to implement the operation
using sequences built with the ARMv8 load-exclusive/store-exclusive
instructions

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

gcc/
2015-09-17  Matthew Wahab  

* config/aarch64/aarch64-protos.h
(aarch64_atomic_ldop_supported_p): Declare.
* config/aarch64/aarch64.c (aarch64_atomic_ldop_supported_p): New.
(enum aarch64_atomic_load_op_code): New.
(aarch64_emit_atomic_load_op): New.
(aarch64_gen_atomic_load_op): Update to support load-operate
patterns.
* config/aarch64/atomics.md (atomic_): Change
to an expander.
(aarch64_atomic_): New.
(aarch64_atomic__lse): New.
(atomic_fetch_): Change to an expander.
(aarch64_atomic_fetch_): New.
(aarch64_atomic_fetch__lse): New.

gcc/testsuite/
2015-09-17  Matthew Wahab  

* gcc.target/aarch64/atomic-inst-ldadd.c: New.
* gcc.target/aarch64/atomic-inst-ldlogic.c: New.

>From c4b8eb6d2ca62c57f4a011e06335b918f603ad2a Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Fri, 7 Aug 2015 17:10:42 +0100
Subject: [PATCH 4/5] Use atomic instructions for fetch-update patterns.

Change-Id: I39759f02e61039067ccaabfd52039e4804eddf2f
---
 gcc/config/aarch64/aarch64-protos.h|   2 +
 gcc/config/aarch64/aarch64.c   | 176 -
 gcc/config/aarch64/atomics.md  | 109 -
 .../gcc.target/aarch64/atomic-inst-ldadd.c |  58 +++
 .../gcc.target/aarch64/atomic-inst-ldlogic.c   | 109 +
 5 files changed, 444 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index eba4c76..76ebd6f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -378,6 +378,8 @@ rtx aarch64_load_tp (rtx);
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
+
+bool aarch64_atomic_ldop_supported_p (enum rtx_code);
 void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index dc05c6e..33f9ef3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11064,6 +11064,33 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
+/* Test whether the target supports using a atomic load-operate instruction.
+   CODE is the operation and AFTER is TRUE if the data in memory after the
+   operation should be returned and FALSE if the data before the operation
+   should be returned.  Returns FALSE if the operation isn't supported by the
+   architecture.
+  */
+
+bool
+aarch64_atomic_ldop_supported_p (enum rtx_code code)
+{
+  if (!TARGET_LSE)
+return false;
+
+  switch (code)
+{
+case SET:
+case AND:
+case IOR:
+case XOR:
+case MINUS:
+case PLUS:
+  return true;
+default:
+  return false;
+}
+}
+
 /* Emit a barrier, that is appropriate for memory model MODEL, at the end of a
sequence implementing an atomic operation.  */
 
@@ -11206,26 +11233,169 @@ aarch64_emit_atomic_swap (machine_mode mode, rtx dst, rtx value,
   emit_insn (gen (dst, mem, value, model));
 }
 
-/* Emit an atomic operation where the architecture supports it.  */
+/* Operations supported by 

Re: [PATCH] Target hook for disabling the delay slot filler.

2015-09-17 Thread Jeff Law

On 09/17/2015 03:52 AM, Simon Dardis wrote:

The profitability of using an ordinary branch over a delay slot branch
depends on how the delay slot is filled. If a delay slot can be filled from
an instruction preceding the branch or instructions proceeding that must be
executed on both sides then it is profitable to use a delay slot branch.
Agreed.  It's an over-simplification, but for the purposes of this 
discussion it's close enough.





For cases when instructions are chosen from one side of the branch,
the proposed optimization strategy is to not speculatively execute
instructions when ordinary branches could be used. Performance-wise
this avoids executing instructions which the eager delay filler picked
wrongly.
Are you trying to say that you have the option as to what kind of branch 
to use?  ie, "ordinary", presumably without a delay slot or one with a 
delay slot?


Is the "ordinary" actually just a nullified delay slot or some form of 
likely/not likely static hint?






Since most branches have a compact form disabling the eager delay filler
should be no worse than altering it not to fill delay slots in this case.
But what is the compact form at the micro-architectural level?  My 
mips-fu has diminished greatly, but my recollection is the bubble is 
always there.   Is that not the case?


fill_eager_delay_slots is most definitely speculative and its 
profitability is largely dependent on the cost of what insns it finds to 
fill those delay slots and whether they're from the common or uncommon path.


If it is able to find insns from the commonly executed path that don't 
have a long latency, then the fill is usually profitable (since the 
pipeline bubble always exists).  However, pulling a long latency 
instruction (say anything that might cache miss or an fdiv/fsqrt) off 
the slow path and conditionally nullifying it can be *awful*. 
Everything else is in-between.




Jeff



Re: [PATCH] Fix PR64078

2015-09-17 Thread Marek Polacek
On Thu, Sep 17, 2015 at 08:06:48PM +0200, Bernd Edlinger wrote:
> Hi,
> 
> On Thu, 17 Sep 2015 10:39:04, Jeff Law wrote:
> >
> > On 09/17/2015 09:00 AM, Marek Polacek wrote:
> >> On Wed, Sep 09, 2015 at 07:48:15PM +0200, Bernd Edlinger wrote:
> >>> Hi,
> >>>
> >>> On Wed, 9 Sep 2015 09:31:33, Jeff Law wrote:
>  You could probably make the function static or change its visibility via
>  a function attribute (there's a visibility attribute which can take the
>  values default, hidden protected or internal). Default visibility
>  essentially means the function can be overridden. I think changing it
>  to "protected" might work. Note if we do that, we may need some kind of
>  target selector on the test since not all targets support the various
>  visibility attributes.
> 
> >>>
> >>> Yes, it works both ways: static works, and __attribute__ ((visibility 
> >>> ("protected"))) works too:
> >>>
> >>> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c 
> >>> --target_board='unix{-fpic,-mcmodel=medium,-fpic\ 
> >>> -mcmodel=medium,-mcmodel=large,-fpic\ -mcmodel=large}'"
> >>>
> >>> has all tests passed, but..
> >>>
> >>> make check-gcc-c++ RUNTESTFLAGS="ubsan.exp=object-size-9.c
> >>> --target_board='unix{-fno-inline}'"
> >>>
> >>> still fails in the same way for all workarounds: inline, static, and 
> >>> __attribute__ ((visibility ("protected"))).
> >>>
> >>> Maybe "static" would be preferable?
> >>
> >> Yeah, I'd go with static if that helps. I'd rather avoid playing games
> >> with visibility.
> > Static is certainly easier and doesn't rely on targets implementing all
> > the visibility capabilities. So static is probably the best approach.
> >
> 
> That's fine for me too, so is the original patch OK for trunk with 
> s/inline/static/ ?

Yes.

Marek


Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-17 Thread Segher Boessenkool
On Thu, Sep 17, 2015 at 09:18:42AM -0500, Bill Schmidt wrote:
> On Thu, 2015-09-17 at 09:39 +0200, Richard Biener wrote:
> > So just to clarify - you need to reduce the vector with max to a scalar
> > but want the (same) result in all vector elements?
> 
> Yes.  Alan Hayward's cond-reduction patch is set up to perform a
> reduction to scalar, followed by a scalar broadcast to get the value
> into all positions.  It happens that our most efficient expansion to
> reduce to scalar will naturally produce the value in all positions.

It also is many insns after expand, so relying on combine to combine
all that plus the following splat (as Richard suggests below) is not
really going to work.

If there also are targets where the _scal version is cheaper, maybe
we should keep both, and have expand expand to whatever the target
supports?


Segher


Re: [AArch64][PATCH 5/5] Use atomic load-operate instructions for update-fetch patterns.

2015-09-17 Thread Andrew Pinski
On Thu, Sep 17, 2015 at 9:54 AM, Matthew Wahab
 wrote:
> Hello,
>
> ARMv8.1 adds atomic swap and atomic load-operate instructions with
> optional memory ordering specifiers. This patch uses the ARMv8.1
> load-operate instructions to implement the atomic__fetch patterns.
>
> The approach is to use the atomic load-operate instruction to atomically
> load the data and update memory and then to use the loaded data to
> calculate the value that the instruction would have stored. The
> calculation attempts to mirror the operation of the atomic instruction.
> For example, atomic_and_fetch is implemented with an atomic
> load-bic so the result is also calculated using a BIC instruction.
>
> The general form of the code generated for an atomic__fetch, with
> destination D, source S, memory address A and memory order MO, depends
> on whether or not the operation is directly supported by the
> instruction. If  is one of PLUS, IOR or XOR, the code generated is:
>
> ld S, D, [A]
>  D, D, S
> where
>  is one {add, set, eor}
>  is one of {add, orr, xor}
>  is one of {'', 'a', 'l', 'al'} depending on memory order MO.
>  is one of {'', 'b', 'h'} depending on the data size.
>
> If  is SUB, the code generated is:
>
> neg S, S
> ldadd S, D, [A]
> add D, D, S
>
> If  is AND, the code generated is:
>
> not S, S
> ldclr S, D, [A]
> bic D, S, S
>
> Any operation not in {PLUS, IOR, XOR, SUB, AND} is passed to the
> existing aarch64_split_atomic_op function, to implement the operation
> using sequences built with the ARMv8 load-exclusive/store-exclusive
> instructions
>
> Tested the series for aarch64-none-linux-gnu with native bootstrap and
> make check. Also tested for aarch64-none-elf with cross-compiled
> check-gcc on an ARMv8.1 emulator with +lse enabled by default.


Are you going to add some builtins for MIN/MAX support too?

Thanks,
Andrew Pinski

>
> Ok for trunk?
> Matthew
>
> 2015-09-17  Matthew Wahab  
>
> * config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
> Adjust declaration.
> * config/aarch64/aarch64.c (aarch64_emit_bic): New.
> (aarch64_gen_atomic_load_op): Adjust comment.  Add parameter
> out_result.  Update to support update-fetch operations.
> * config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
> Adjust for change to aarch64_gen_atomic_ldop.
> (aarch64_atomic__lse): Likewise.
> (aarch64_atomic_fetch__lse): Likewise.
> (atomic__fetch): Change to an expander.
> (aarch64_atomic__fetch): New.
> (aarch64_atomic__fetch_lse): New.
>
> gcc/testsuite
> 2015-09-17  Matthew Wahab  
>
> * gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
> update-fetch operations.
> * gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.
>


[AArch64][PATCH 5/5] Use atomic load-operate instructions for update-fetch patterns.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 adds atomic swap and atomic load-operate instructions with
optional memory ordering specifiers. This patch uses the ARMv8.1
load-operate instructions to implement the atomic__fetch patterns.

The approach is to use the atomic load-operate instruction to atomically
load the data and update memory and then to use the loaded data to
calculate the value that the instruction would have stored. The
calculation attempts to mirror the operation of the atomic instruction.
For example, atomic_and_fetch is implemented with an atomic
load-bic so the result is also calculated using a BIC instruction.

The general form of the code generated for an atomic__fetch, with
destination D, source S, memory address A and memory order MO, depends
on whether or not the operation is directly supported by the
instruction. If  is one of PLUS, IOR or XOR, the code generated is:

ld S, D, [A]
 D, D, S
where
 is one {add, set, eor}
 is one of {add, orr, xor}
 is one of {'', 'a', 'l', 'al'} depending on memory order MO.
 is one of {'', 'b', 'h'} depending on the data size.

If  is SUB, the code generated is:

neg S, S
ldadd S, D, [A]
add D, D, S

If  is AND, the code generated is:

not S, S
ldclr S, D, [A]
bic D, S, S

Any operation not in {PLUS, IOR, XOR, SUB, AND} is passed to the
existing aarch64_split_atomic_op function, to implement the operation
using sequences built with the ARMv8 load-exclusive/store-exclusive
instructions

Tested the series for aarch64-none-linux-gnu with native bootstrap and
make check. Also tested for aarch64-none-elf with cross-compiled
check-gcc on an ARMv8.1 emulator with +lse enabled by default.

Ok for trunk?
Matthew

2015-09-17  Matthew Wahab  

* config/aarch64/aarch64-protos.h (aarch64_gen_atomic_ldop):
Adjust declaration.
* config/aarch64/aarch64.c (aarch64_emit_bic): New.
(aarch64_gen_atomic_load_op): Adjust comment.  Add parameter
out_result.  Update to support update-fetch operations.
* config/aarch64/atomics.md (aarch64_atomic_exchange_lse):
Adjust for change to aarch64_gen_atomic_ldop.
(aarch64_atomic__lse): Likewise.
(aarch64_atomic_fetch__lse): Likewise.
(atomic__fetch): Change to an expander.
(aarch64_atomic__fetch): New.
(aarch64_atomic__fetch_lse): New.

gcc/testsuite
2015-09-17  Matthew Wahab  

* gcc.target/aarch64/atomic-inst-ldadd.c: Add tests for
update-fetch operations.
* gcc.target/aarch64/atomic-inst-ldlogic.c: Likewise.

>From 577bdb656e451df5ce1c8c651a642c3ac4d7c73b Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Mon, 17 Aug 2015 11:27:18 +0100
Subject: [PATCH 5/5] Use atomic instructions for update-fetch patterns.

Change-Id: I5eef48586fe904f0d2df8c581fb3c12a4a2d9c78
---
 gcc/config/aarch64/aarch64-protos.h|   2 +-
 gcc/config/aarch64/aarch64.c   |  72 +++--
 gcc/config/aarch64/atomics.md  |  61 ++-
 .../gcc.target/aarch64/atomic-inst-ldadd.c |  53 ++---
 .../gcc.target/aarch64/atomic-inst-ldlogic.c   | 118 ++---
 5 files changed, 247 insertions(+), 59 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 76ebd6f..dd8ebcc 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -380,7 +380,7 @@ void aarch64_split_compare_and_swap (rtx op[]);
 void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_atomic_ldop_supported_p (enum rtx_code);
-void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx);
+void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, enum machine_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 33f9ef3..d95b81f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11212,6 +11212,25 @@ aarch64_split_compare_and_swap (rtx operands[])
 aarch64_emit_post_barrier (model);
 }
 
+/* Emit a BIC instruction.  */
+
+static void
+aarch64_emit_bic (machine_mode mode, rtx dst, rtx s1, rtx s2, int shift)
+{
+  rtx shift_rtx = GEN_INT (shift);
+  rtx (*gen) (rtx, rtx, rtx, rtx);
+
+  switch (mode)
+{
+case SImode: gen = gen_bic_lshrsi3; break;
+case DImode: gen = gen_bic_lshrdi3; break;
+default:
+  gcc_unreachable ();
+}
+
+  emit_insn (gen (dst, s2, shift_rtx, s1));
+}
+
 /* Emit an atomic swap.  */
 
 static void
@@ -11306,13 +11325,14 @@ aarch64_emit_atomic_load_op (enum aarch64_atomic_load_op_code code,
 }
 
 /* Emit an atomic load+operate.  CODE is the operation.  OUT_DATA is the
-   location to store the data read from memory.  MEM is the memory location to
-   

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-17 Thread Alan Lawrence
On 15/09/15 08:43, Richard Biener wrote:
>
> Sorry for chiming in so late...

Not at all, TYVM for your help!

> TREE_CONSTANT isn't the correct thing to test.  You should use
> TREE_CODE () == INTEGER_CST instead.

Done (in some cases, via tree_fits_shwi_p).

> Also you need to handle
> NULL_TREE TYPE_MIN/MAX_VALUE as that can happen as well.

I've not found any documentation as to what these mean, but from experiment,
it seems that a zero-length array has MIN_VALUE 0 and MAX_VALUE null (as well
as zero TYPE_SIZE) - so I allow that. In contrast a variable-length array also
has zero TYPE_SIZE, but a large range of MIN-MAX, and I think I want to rule
those out.

Another awkward case is Ada arrays with large offset (e.g. array(2^32...2^32+1)
which has only two elements); I don't see either of tree_to_shwi or tree_to_uhwi
as necessarily being "better" here, each will handle (some) (rare) cases the
other will not, so I've tried to use tree_to_shwi throughout for consistency.

Summary: taken advantage of tree_fits_shwi_p, as this includes a check against
NULL_TREE and that TREE_CODE () == INTEGER_CST.

> +  if (DECL_P (elem) && DECL_BIT_FIELD (elem))
> +   return false;
>
> that can't happen (TREE_TYPE (array-type) is never a DECL).

Removed.

> +   int el_size = tree_to_uhwi (elem_size);
> +   gcc_assert (el_size);
>
> so you assert on el_size being > 0 later but above you test
> only array size ...

Good point, thanks.

> +   tree t_idx = build_int_cst (TYPE_DOMAIN (decl_type), idx);
>
> use t_idx = size_int (idx);

Done.

I've also added another test, of scalarizing a structure containing a
zero-length array, as earlier attempts accidentally prevented this.

Bootstrapped + check-gcc,g++,ada,fortran on ARM and x86_64;
Bootstrapped + check-gcc,g++,fortran on AArch64.

OK for trunk?

Thanks,
Alan

gcc/ChangeLog:

PR tree-optimization/67283
* tree-sra.c (type_consists_of_records_p): Rename to...
(scalarizable_type_p): ...this, add case for ARRAY_TYPE.
(completely_scalarize_record): Rename to...
(completely_scalarize): ...this, add ARRAY_TYPE case, move some code to:
(scalarize_elem): New.
(analyze_all_variable_accesses): Follow renamings.

gcc/testsuite/ChangeLog:

PR tree-optimization/67283
* gcc.dg/tree-ssa/sra-15.c: New.
* gcc.dg/tree-ssa/sra-16.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/sra-15.c |  37 
 gcc/testsuite/gcc.dg/tree-ssa/sra-16.c |  37 
 gcc/tree-sra.c | 165 +++--
 3 files changed, 191 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-15.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-16.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c 
b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c
new file mode 100644
index 000..a22062e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c
@@ -0,0 +1,37 @@
+/* Verify that SRA total scalarization works on records containing arrays.  */
+/* { dg-do run } */
+/* { dg-options "-O1 -fdump-tree-release_ssa --param 
sra-max-scalarization-size-Ospeed=32" } */
+
+extern void abort (void);
+
+struct S
+{
+  char c;
+  unsigned short f[2][2];
+  int i;
+  unsigned short f3, f4;
+};
+
+
+int __attribute__ ((noinline))
+foo (struct S *p)
+{
+  struct S l;
+
+  l = *p;
+  l.i++;
+  l.f[1][0] += 3;
+  *p = l;
+}
+
+int
+main (int argc, char **argv)
+{
+  struct S a = {0, { {5, 7}, {9, 11} }, 4, 0, 0};
+  foo ();
+  if (a.i != 5 || a.f[1][0] != 12)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-16.c 
b/gcc/testsuite/gcc.dg/tree-ssa/sra-16.c
new file mode 100644
index 000..fef34c0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-16.c
@@ -0,0 +1,37 @@
+/* Verify that SRA total scalarization works on records containing arrays.  */
+/* { dg-do run } */
+/* { dg-options "-O1 -fdump-tree-release_ssa --param 
sra-max-scalarization-size-Ospeed=16" } */
+
+extern void abort (void);
+
+struct S
+{
+  long zilch[0];
+  char c;
+  int i;
+  unsigned short f3, f4;
+};
+
+
+int __attribute__ ((noinline))
+foo (struct S *p)
+{
+  struct S l;
+
+  l = *p;
+  l.i++;
+  l.f3++;
+  *p = l;
+}
+
+int
+main (int argc, char **argv)
+{
+  struct S a = { { }, 0, 4, 0, 0};
+  foo ();
+  if (a.i != 5 || a.f3 != 1)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8b3a0ad..8247554 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -915,73 +915,142 @@ create_access (tree expr, gimple stmt, bool write)
 }
 
 
-/* Return true iff TYPE is a RECORD_TYPE with fields that are either of gimple
-   register types or (recursively) records with only these two kinds of fields.
-   It also returns false if any of these records contains a bit-field.  */
+/* Return true iff TYPE is 

Re: [RFC] Try vector as a new representation for vector masks

2015-09-17 Thread Richard Henderson
On 09/15/2015 06:52 AM, Ilya Enkovich wrote:
> I made a step forward forcing vector comparisons have a mask (vec) 
> result and disabling bool patterns in case vector comparison is supported by 
> target.  Several issues were met.
> 
>  - c/c++ front-ends generate vector comparison with integer vector result.  I 
> had to make some modifications to use vec_cond instead.  Don't know if there 
> are other front-ends producing vector comparisons.
>  - vector lowering fails to expand vector masks due to mismatch of type and 
> mode sizes.  I fixed vector type size computation to match mode size and 
> added a special handling of mask expand.
>  - I disabled canonical type creation for vector mask because we can't layout 
> it with VOID mode. I don't know why we may need a canonical type here.  But 
> get_mask_mode call may be moved into type layout to get it.
>  - Expand of vec constants/contstructors requires special handling.  
> Common case should require target hooks/optabs to expand vector into required 
> mode.  But I suppose we want to have a generic code to handle vector of int 
> mode case to avoid modification of multiple targets which use default 
> vec modes.
> 
> Currently 'make check' shows two types of regression.
>   - missed vector expression pattern recongnition (MIN, MAX, ABX, VEC_COND).  
> This must be due to my front-end changes.  Hope it will be easy to fix.
>   - missed vectorization. All of them appear due to bool patterns disabling.  
> I didn't look into all of them but it seems the main problem is in mixed type 
> sizes.  With bool patterns and integer vector masks we just put int->(other 
> sized int) conversion for masks and it gives us required mask transformation. 
>  With boolean mask we don't have a proper scalar statements to do that.  I 
> think mask widening/narrowing may be directly supported in masked statements 
> vectorization.  Going to look into it.
> 
> I attach what I currently have for a prototype.  It grows bigger so I split 
> into several parts.

The general approach looks good.


> +/* By defaults a vector of integers is used as a mask.  */
> +
> +machine_mode
> +default_get_mask_mode (unsigned nunits, unsigned vector_size)
> +{
> +  unsigned elem_size = vector_size / nunits;
> +  machine_mode elem_mode
> += smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);

Why these arguments as opposed to passing elem_size?  It seems that every hook
is going to have to do this division...

> +#define VECTOR_MASK_TYPE_P(TYPE) \
> +  (TREE_CODE (TYPE) == VECTOR_TYPE   \
> +   && TREE_CODE (TREE_TYPE (TYPE)) == BOOLEAN_TYPE)

Perhaps better as VECTOR_BOOLEAN_TYPE_P, since that's exactly what's being 
tested?

> @@ -3464,10 +3464,10 @@ verify_gimple_comparison (tree type, tree op0, tree 
> op1)
>return true;
>  }
>  }
> -  /* Or an integer vector type with the same size and element count
> +  /* Or a boolean vector type with the same element count
>   as the comparison operand types.  */
>else if (TREE_CODE (type) == VECTOR_TYPE
> -&& TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
> +&& TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)

VECTOR_BOOLEAN_TYPE_P.

> @@ -122,7 +122,19 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
> tree t, tree bitsize, tree bitpos)
>  {
>if (bitpos)
> -return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
> +{
> +  if (TREE_CODE (type) == BOOLEAN_TYPE)
> + {
> +   tree itype
> + = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0);
> +   tree field = gimplify_build3 (gsi, BIT_FIELD_REF, itype, t,
> + bitsize, bitpos);
> +   return gimplify_build2 (gsi, NE_EXPR, type, field,
> +   build_zero_cst (itype));
> + }
> +  else
> + return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
> +}
>else
>  return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t);
>  }

So... this is us lowering vector operations on a target that doesn't support
them.  Which means that we get to decide what value is produced for a
comparison?  Which means that we can settle on the "normal" -1, correct?

Which means that we ought not need to extract the entire element and then
compare for non-zero, but rather simply extract a single bit from the element,
and directly call that a boolean result, correct?

I assume you tested all this code with -mno-sse or equivalent arch default?

> @@ -1885,7 +1885,9 @@ expand_MASK_LOAD (gcall *stmt)
>create_output_operand ([0], target, TYPE_MODE (type));
>create_fixed_operand ([1], mem);
>create_input_operand ([2], mask, TYPE_MODE (TREE_TYPE (maskt)));
> -  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
> +  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
> +   

[ARM] Add ARMv8.1 command line options.

2015-09-17 Thread Matthew Wahab

Hello,

ARMv8.1 is a set of architectural extensions to ARMv8. Support has been
enabled in binutils for ARMv8.1 for the architechure, using the name
"armv8.1-a".

This patch adds support to gcc for specifying an ARMv8.1 architecture
using options "-march=armv8.1-a" and "-march=armv8.1-a+crc". It also
adds the FPU options "-mfpu=neon-fp-armv8.1" and
"-mpu=crypto-neon-fp-armv8.1", to specify the ARMv8.1 Adv.SIMD
instruction set.  The changes set the apropriate architecture and fpu
options for binutils but don't otherwise change the code generated by
gcc.

Tested for arm-none-linux-gnueabihf with native bootstrap and make
check.

Ok for trunk?
Matthew

2015-09-17  Matthew Wahab  

* config/arm/arm-arches.def: Add "armv8.1-a" and "armv8.1-a+crc".
* config/arm/arm-fpus.def: Add "neon-fp-armv8.1" and
"crypto-neon-fp-armv8.1".
* config/arm/arm-protos.h (FL2_ARCH8_1): New.
(FL2_FOR_ARCH8_1A): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (FPU_FL_RDMA): New.
* doc/invoke.texi (ARM -march): Add "armv8.1-a" and
"armv8.1-a+crc".
(ARM -mfpu): Add "neon-fp-armv8.1" and "crypto-neon-fp-armv8.1".
diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index ddf6c3c..4cf71fd 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -57,6 +57,8 @@ ARM_ARCH("armv7-m", cortexm3,	7M,	ARM_FSET_MAKE_CPU1 (FL_CO_PROC |	  FL_FOR_
 ARM_ARCH("armv7e-m", cortexm4,  7EM,	ARM_FSET_MAKE_CPU1 (FL_CO_PROC |	  FL_FOR_ARCH7EM))
 ARM_ARCH("armv8-a", cortexa53,  8A,	ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ARCH8A))
 ARM_ARCH("armv8-a+crc",cortexa53, 8A,   ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_CRC32  | FL_FOR_ARCH8A))
+ARM_ARCH("armv8.1-a", cortexa53,  8A,	ARM_FSET_MAKE (FL_CO_PROC | FL_FOR_ARCH8A,  FL2_FOR_ARCH8_1A))
+ARM_ARCH("armv8.1-a+crc",cortexa53, 8A,	ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A))
 ARM_ARCH("iwmmxt",  iwmmxt, 5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT))
 ARM_ARCH("iwmmxt2", iwmmxt2,5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2))
 
diff --git a/gcc/config/arm/arm-fpus.def b/gcc/config/arm/arm-fpus.def
index efd5896..065fb3d9 100644
--- a/gcc/config/arm/arm-fpus.def
+++ b/gcc/config/arm/arm-fpus.def
@@ -44,5 +44,9 @@ ARM_FPU("fp-armv8",	ARM_FP_MODEL_VFP, 8, VFP_REG_D32, FPU_FL_FP16)
 ARM_FPU("neon-fp-armv8",ARM_FP_MODEL_VFP, 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16)
 ARM_FPU("crypto-neon-fp-armv8",
 			ARM_FP_MODEL_VFP, 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16 | FPU_FL_CRYPTO)
+ARM_FPU("neon-fp-armv8.1",
+			ARM_FP_MODEL_VFP, 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16 | FPU_FL_RDMA)
+ARM_FPU("crypto-neon-fp-armv8.1",
+			ARM_FP_MODEL_VFP, 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16 | FPU_FL_RDMA | FPU_FL_CRYPTO)
 /* Compatibility aliases.  */
 ARM_FPU("vfp3",		ARM_FP_MODEL_VFP, 3, VFP_REG_D32, FPU_FL_NONE)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 8df312f..e60ad4c 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -387,6 +387,8 @@ extern bool arm_is_constant_pool_ref (rtx);
 #define FL_IWMMXT2(1 << 30)   /* "Intel Wireless MMX2 technology".  */
 #define FL_ARCH6KZ(1 << 31)   /* ARMv6KZ architecture.  */
 
+#define FL2_ARCH8_1   (1 << 0)	  /* Architecture 8.1.  */
+
 /* Flags that only effect tuning, not available instructions.  */
 #define FL_TUNE		(FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
 			 | FL_CO_PROC)
@@ -415,6 +417,7 @@ extern bool arm_is_constant_pool_ref (rtx);
 #define FL_FOR_ARCH7M	(FL_FOR_ARCH7 | FL_THUMB_DIV)
 #define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
 #define FL_FOR_ARCH8A	(FL_FOR_ARCH7VE | FL_ARCH8)
+#define FL2_FOR_ARCH8_1A	FL2_ARCH8_1
 
 /* There are too many feature bits to fit in a single word so the set of cpu and
fpu capabilities is a structure.  A feature set is created and manipulated
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index f7a9d63..274bc46 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -336,6 +336,7 @@ typedef unsigned long arm_fpu_feature_set;
 #define FPU_FL_NEON	(1 << 0)	/* NEON instructions.  */
 #define FPU_FL_FP16	(1 << 1)	/* Half-precision.  */
 #define FPU_FL_CRYPTO	(1 << 2)	/* Crypto extensions.  */
+#define FPU_FL_RDMA	(1 << 3)	/* ARMv8.1 extensions.  */
 
 /* Which floating point model to use.  */
 enum arm_fp_model
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 99c9685..9f49189 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13267,8 +13267,8 @@ of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
 @samp{armv6}, @samp{armv6j},
 @samp{armv6t2}, @samp{armv6z}, @samp{armv6kz}, @samp{armv6-m},
 @samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7e-m},
-@samp{armv7ve}, 

Re: [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

2015-09-17 Thread David Malcolm
On Thu, 2015-09-17 at 13:31 -0600, Jeff Law wrote:
> On 09/16/2015 02:34 AM, Richard Biener wrote:
> >
> > Btw, this looks quite expensive - I'm sure we want to limit the effort
> > here a bit.
> A limiter is reasonable, though as it's been pointed out this only fires 
> during error processing, so we probably have more leeway to take time 
> and see if we can do better error recovery.
> 
> FWIW, I've used this algorithm in totally unrelated projects and while 
> it seems expensive, it's worked out quite nicely.
> 
> >
> > So while the idea might be an improvement to selected cases it can cause
> > confusion as well.  And if using the suggestion for further parsing it can
> > cause worse followup errors (unless we can limit such "fixup" use to the
> > cases where we can parse the result without errors).  Consider
> >
> > foo()
> > {
> >foz = 1;
> > }
> >
> > if we suggest 'foo' instead of foz then we'll get a more confusing followup
> > error if we actually use it.
> True.  This kind of problem is probably inherent in this kind of "I'm 
> going assume you meant..." error recovery mechanisms.
> 
> And just to be clear, even in a successful recovery scenario, we still 
> issue an error.  The error recovery is just meant to try and give the 
> user a hint what might have gone wrong and gracefully handle the case 
> where they just made a minor goof.  

(nods)

> Obviously the idea here is to cut 
> down on the number of iterations of edit-compile cycle one has to do :-)

In my mind it's more about saving the user from having to locate the
field they really meant within the corresponding structure declaration
(either by grep, or by some cross-referencing tool).

A lot of the time I find myself wishing that the compiler had issued a
note saying "here's the declaration of the struct in question", which
would make it easy for me to go straight there in Emacs.

I wonder what proportion of our users use a cross-referencing tool or
have an IDE that can find this stuff for them, vs those that rely on
grep, and if that should mean something for our diagnostics (I tend to
just rely on grep).

This is rather tangential to this RFE, of course.

Dave



Re: [patch] libstdc++/65142 Check read() result in std::random_device.

2015-09-17 Thread Gerald Pfeifer
On Thu, 17 Sep 2015, Jonathan Wakely wrote:
>> Any comments on this version?
> Committed to trunk.

Unfortunately this broke bootstrap on FreeBSD 10.1.

/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc: In member 
function 'std::random_device::result_type std::random_device::_M_getval()':
/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:22: error: 
'errno' was not declared in this scope
  else if (e != -1 || errno != EINTR)
  ^
/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:31: error: 
'EINTR' was not declared in this scope
  else if (e != -1 || errno != EINTR)
   ^
Makefile:545: recipe for target 'random.lo' failed

I probably won't be able to dig in deeper today, but figured this
might already send you on the right path?

Actually...

...how about he patch below?  Bootstraps on i386-unknown-freebsd10.1,
no regressions.

Gerald


2015-09-17  Gerald Pfeifer  

* src/c++11/random.cc: Include .

Index: libstdc++-v3/src/c++11/random.cc
===
--- libstdc++-v3/src/c++11/random.cc(revision 227883)
+++ libstdc++-v3/src/c++11/random.cc(working copy)
@@ -31,6 +31,7 @@
 # include 
 #endif
 
+#include 
 #include 
 
 #ifdef _GLIBCXX_HAVE_UNISTD_H


Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread Jeff Law

On 09/17/2015 10:49 AM, David Malcolm wrote:


FWIW, I have a (very messy) implementation of this working for the C
frontend, which gives us source ranges on expressions without needing to
add any new tree nodes, or add any fields to existing tree structs.

The approach I'm using:

* ranges are stored directly as fields within cpp_token and c_token
(maybe we can ignore cp_token for now)

* ranges are stashed in the C FE, both (a) within the "struct c_expr"
and (b) within the location_t of each tree expression node as a new
field in the adhoc map.

Doing it twice may seem slightly redundant, but I think both are needed:
   (a) doing it within c_expr allows us to support ranges for constants
and VAR_DECL etc during parsing, without needing any kind of new tree
wrapper node
   (b) doing it in the ad-hoc map allows the ranges for expressions to
survive the parse and be usable in diagnostics later.

So this gives us (in the C FE): ranges for everything during parsing,
and ranges for expressions afterwards, with no new tree nodes or new
fields within tree nodes.

I'm working on cleaning it up into a much more minimal set of patches
that I hope are reviewable.

Hopefully this sounds like a good approach
So is the assumption here that the location information is or is not 
supposed to survive through the gimple optimizers?   If I understand 
what you're doing correctly, I think the location information gets 
invalidated by const/copy propagations.


Though perhaps that's not a major problem because we're typically 
propagating an SSA_NAME, not a _DECL node.  Hmm.


jeff


Re: [c++-delayed-folding] code review

2015-09-17 Thread Kai Tietz
Hello Jason,

thanks for your review.  I addressed most of your mentioned issues
already today.  To some of them I have further comments ...

2015-09-17 8:18 GMT+02:00 Jason Merrill :
>> @@ -216,6 +216,7 @@ libgcov-driver-tool.o-warn = -Wno-error
>>  libgcov-merge-tool.o-warn = -Wno-error
>>  gimple-match.o-warn = -Wno-unused
>>  generic-match.o-warn = -Wno-unused
>> +insn-modes.o-warn = -Wno-error
>
>
> This doesn't seem to be needed anymore.

Yes, removed.

>> @@ -397,11 +397,13 @@ convert_to_real (tree type, tree expr)
>> EXPR must be pointer, integer, discrete (enum, char, or bool), float,
>> fixed-point or vector; in other cases error is called.
>>
>> +   If DOFOLD is TRZE, we try to simplify newly-created patterns by
>> folding.
>
>
> "TRUE"

Fixed.  Thanks.


>> +  if (!dofold)
>> +{
>> + expr = build1 (CONVERT_EXPR,
>> +lang_hooks.types.type_for_size
>> +  (TYPE_PRECISION (intype), 0),
>> +expr);
>> + return build1 (CONVERT_EXPR, type, expr);
>> +   }
>
>
> When we're not folding, I don't think we want to do the two-step conversion,
> just the second one.  And we might want to use NOP_EXPR instead of
> CONVERT_EXPR, but I'm not sure about that.

This is indeed a point I was thinking about while writing this code.
The comments below regarding other introduced patterns are related to
the same thinking.
Sure is that we want to remove such code-modification from FEs itself,
but that means that those conversions need to be handled elsewhere
later in ME.  We have a different functional change about
shorten_compare already, which makes me headaches, as we don't deal
with all cases of it in ME right (which is a different task not
directly related to delayed-folding, but linked).  Therefore I kept
those code-modification-code also for df, which we should move out
into ME (with other patterns here in convert.c) in different task.

For above code we need to do 2 step folding, as we need to make sure
that we convert expr first to language-specific size-type, which isn't
necessarily the same as TYPE here.  Casting directly to TYPE might
cause side-effects (especially sign/unsigned-conversion issues I could
think about here)


>> @@ -818,10 +828,15 @@ convert_to_integer (tree type, tree expr)
>> if (TYPE_UNSIGNED (typex))
>>   typex = signed_type_for (typex);
>>   }
>> -   return convert (type,
>> -   fold_build2 (ex_form, typex,
>> -convert (typex, arg0),
>> -convert (typex, arg1)));
>> +   if (dofold)
>> + return convert (type,
>> + fold_build2 (ex_form, typex,
>> +  convert (typex, arg0),
>> +  convert (typex,
>> arg1)));
>> +   arg0 = build1 (CONVERT_EXPR, typex, arg0);
>> +   arg1 = build1 (CONVERT_EXPR, typex, arg1);
>> +   expr = build2 (ex_form, typex, arg0, arg1);
>> +   return build1 (CONVERT_EXPR, type, expr);
>
>
> This code path seems to be for pushing a conversion down into a binary
> expression.  We shouldn't do this at all when we aren't folding.

Yes, but see comment above.  I agree that such code-modifications are
not the thing we intend to do, but this should be part of a different
task.

>> @@ -845,9 +860,14 @@ convert_to_integer (tree type, tree expr)
>>
>> if (!TYPE_UNSIGNED (typex))
>>   typex = unsigned_type_for (typex);
>> +   if (!dofold)
>> + return build1 (CONVERT_EXPR, type,
>> +build1 (ex_form, typex,
>> +build1 (CONVERT_EXPR, typex,
>> +TREE_OPERAND (expr, 0;
>
>
> Likewise.

^^

>> @@ -867,6 +887,14 @@ convert_to_integer (tree type, tree expr)
>>  the conditional and never loses.  A COND_EXPR may have a
>> throw
>>  as one operand, which then has void type.  Just leave void
>>  operands as they are.  */
>> + if (!dofold)
>> +   return build3 (COND_EXPR, type, TREE_OPERAND (expr, 0),
>> +  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr,
>> 1)))
>> +  ? TREE_OPERAND (expr, 1)
>> +  : build1 (CONVERT_EXPR, type, TREE_OPERAND
>> (expr, 1)),
>> +  VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (expr,
>> 2)))
>> +  ? TREE_OPERAND (expr, 2)
>> +  : build1 (CONVERT_EXPR, type, TREE_OPERAND
>> (expr, 2)));
>
>
> Likewise.

^^

>> @@ -903,6 +933,10 @@ convert_to_integer (tree type, tree expr)
>> 

Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Andreas Schwab
Jonathan Wakely  writes:

> +  p = "/dev/stdin";
> +  if (exists(p))
> +{
> +  auto p2 = canonical(p);
> +  if (is_symlink(p))
> +VERIFY( p != p2 );
> +  else
> +VERIFY( p == p2 );
> +  VERIFY( canonical(p2) == p2 );

This fails if stdin is a pipe, which doesn't have a (real) name, so
realpath fails.

$ echo | ./canonical.exe
terminate called after throwing an instance of 
'std::experimental::filesystem::v1::__cxx11::filesystem_error'
  what():  filesystem error: cannot canonicalize: No such file or directory 
[/dev/stdin]

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

2015-09-17 Thread Jeff Law

On 09/16/2015 02:34 AM, Richard Biener wrote:


Btw, this looks quite expensive - I'm sure we want to limit the effort
here a bit.
A limiter is reasonable, though as it's been pointed out this only fires 
during error processing, so we probably have more leeway to take time 
and see if we can do better error recovery.


FWIW, I've used this algorithm in totally unrelated projects and while 
it seems expensive, it's worked out quite nicely.




So while the idea might be an improvement to selected cases it can cause
confusion as well.  And if using the suggestion for further parsing it can
cause worse followup errors (unless we can limit such "fixup" use to the
cases where we can parse the result without errors).  Consider

foo()
{
   foz = 1;
}

if we suggest 'foo' instead of foz then we'll get a more confusing followup
error if we actually use it.
True.  This kind of problem is probably inherent in this kind of "I'm 
going assume you meant..." error recovery mechanisms.


And just to be clear, even in a successful recovery scenario, we still 
issue an error.  The error recovery is just meant to try and give the 
user a hint what might have gone wrong and gracefully handle the case 
where they just made a minor goof.  Obviously the idea here is to cut 
down on the number of iterations of edit-compile cycle one has to do :-)



Jeff


Re: C++ PATCH to implement fold-expressions

2015-09-17 Thread Andrew Sutton
Fantastic. I've wanted to get back to this, but since school started
back up, I haven't had any time to look at it.

Thanks!

Andrew


On Thu, Sep 17, 2015 at 2:04 PM, Jason Merrill  wrote:
> Back in January Andrew mostly implemented C++1z fold-expressions
> (https://isocpp.org/files/papers/n4295.html), but the patch broke bootstrap.
> I've fixed it up now and am committing it to the trunk.
>
> Andrew: The main change I made was to drop tentative parsing in favor of
> scanning the tokens ahead looking for an ellipsis next to a fold-operator.
> I also tweaked some diagnostics, fixed handling of dependent expressions
> with no TREE_TYPE, and corrected empty fold of modops.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.


Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread David Malcolm
On Thu, 2015-09-17 at 13:14 -0600, Jeff Law wrote:
> On 09/17/2015 10:49 AM, David Malcolm wrote:
> 
> > FWIW, I have a (very messy) implementation of this working for the C
> > frontend, which gives us source ranges on expressions without needing to
> > add any new tree nodes, or add any fields to existing tree structs.
> >
> > The approach I'm using:
> >
> > * ranges are stored directly as fields within cpp_token and c_token
> > (maybe we can ignore cp_token for now)
> >
> > * ranges are stashed in the C FE, both (a) within the "struct c_expr"
> > and (b) within the location_t of each tree expression node as a new
> > field in the adhoc map.
> >
> > Doing it twice may seem slightly redundant, but I think both are needed:
> >(a) doing it within c_expr allows us to support ranges for constants
> > and VAR_DECL etc during parsing, without needing any kind of new tree
> > wrapper node
> >(b) doing it in the ad-hoc map allows the ranges for expressions to
> > survive the parse and be usable in diagnostics later.
> >
> > So this gives us (in the C FE): ranges for everything during parsing,
> > and ranges for expressions afterwards, with no new tree nodes or new
> > fields within tree nodes.
> >
> > I'm working on cleaning it up into a much more minimal set of patches
> > that I hope are reviewable.
> >
> > Hopefully this sounds like a good approach
> So is the assumption here that the location information is or is not 
> supposed to survive through the gimple optimizers?

To be honest I hadn't given much thought to that stage; my hope is that
most of the diagnostics have been issued by the time we're optimizing.

In the proposed implementation the range information is "baked in" to
the location_t (via the ad-hoc lookaside), so it's carried along
wherever the location_t goes, and ought to have the same chances of
survival within the gimple optimizers as the existing location
information does.

>If I understand 
> what you're doing correctly, I think the location information gets 
> invalidated by const/copy propagations.
> 
> Though perhaps that's not a major problem because we're typically 
> propagating an SSA_NAME, not a _DECL node.  Hmm.

Well, if the location_t is being invalidated by an optimization, we're
already losing source *point* information: file/line/column.  Given
that, losing range information as well seems like no great loss.

Or am I missing something?

(I am attempting to preserve the source_range data when block ptrs are
baked in to the ad-hoc locations, if that's what you're referring to)

Dave



Re: C++ PATCH to implement fold-expressions

2015-09-17 Thread Jason Merrill

On 09/17/2015 02:04 PM, Jason Merrill wrote:

Back in January Andrew mostly implemented C++1z fold-expressions
(https://isocpp.org/files/papers/n4295.html), but the patch broke
bootstrap.  I've fixed it up now and am committing it to the trunk.

Andrew: The main change I made was to drop tentative parsing in favor of
scanning the tokens ahead looking for an ellipsis next to a
fold-operator.  I also tweaked some diagnostics, fixed handling of
dependent expressions with no TREE_TYPE, and corrected empty fold of
modops.

Tested x86_64-pc-linux-gnu, applying to trunk.


Now with the patch...

Jason

commit 5946e0026988259a71992e3a653556b76460919c
Author: Jason Merrill 
Date:   Wed Sep 16 15:27:10 2015 -0400

	Implement N4295 fold-expressions.

	* cp-tree.def: Add UNARY_LEFT_FOLD_EXPR, UNARY_RIGHT_FOLD_EXPR,
	BINARY_LEFT_FOLD_EXPR, BINARY_RIGHT_FOLD_EXPR.
	* cp-objcp-common.c (cp_common_init_ts): Handle them.
	* cp-tree.h (FOLD_EXPR_CHECK, BINARY_FOLD_EXPR_CHECK, FOLD_EXPR_P)
	(FOLD_EXPR_MODIFY_P, FOLD_EXPR_OP, FOLD_EXPR_PACK, FOLD_EXPR_INIT): New.
	* parser.c (cp_parser_skip_to_closing_parenthesis): Split out...
	(cp_parser_skip_to_closing_parenthesis_1): This function.  Change
	or_comma parameter to or_ttype.
	(cp_parser_fold_operator, cp_parser_fold_expr_p)
	(cp_parser_fold_expression): New.
	(cp_parser_primary_expression): Use them.
	* pt.c (expand_empty_fold, fold_expression, tsubst_fold_expr_pack)
	(tsubst_fold_expr_init, expand_left_fold, tsubst_unary_left_fold)
	(tsubst_binary_left_fold, expand_right_fold)
	(tsubst_unary_right_fold, tsubst_binary_right_fold): New.
	(tsubst_copy): Use them.
	(type_dependent_expression_p): Handle fold-expressions.
	* semantics.c (finish_unary_fold_expr)
	(finish_left_unary_fold_expr, finish_right_unary_fold_expr)
	(finish_binary_fold_expr): New.

diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index 808defd..22f063b 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b/gcc/cp/cp-objcp-common.c
@@ -311,6 +311,10 @@ cp_common_init_ts (void)
   MARK_TS_TYPED (CTOR_INITIALIZER);
   MARK_TS_TYPED (ARRAY_NOTATION_REF);
   MARK_TS_TYPED (REQUIRES_EXPR);
+  MARK_TS_TYPED (UNARY_LEFT_FOLD_EXPR);
+  MARK_TS_TYPED (UNARY_RIGHT_FOLD_EXPR);
+  MARK_TS_TYPED (BINARY_LEFT_FOLD_EXPR);
+  MARK_TS_TYPED (BINARY_RIGHT_FOLD_EXPR);
 }
 
 #include "gt-cp-cp-objcp-common.h"
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index 61acf27..7df72c5 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -432,6 +432,26 @@ DEFTREECODE (EXPR_PACK_EXPANSION, "expr_pack_expansion", tcc_expression, 3)
index is a machine integer.  */
 DEFTREECODE (ARGUMENT_PACK_SELECT, "argument_pack_select", tcc_exceptional, 0)
 
+/* Fold expressions allow the expansion of a template argument pack
+   over a binary operator.
+
+   FOLD_EXPR_MOD_P is true when the fold operation is a compound assignment
+   operator.
+
+   FOLD_EXPR_OP is an INTEGER_CST storing the tree code for the folded
+   expression. Note that when FOLDEXPR_MOD_P is true, the operator is
+   a compound assignment operator for that kind of expression.
+
+   FOLD_EXPR_PACK is an expression containing an unexpanded parameter pack;
+   when expanded, each term becomes an argument of the folded expression.
+
+   In a BINARY_FOLD_EXPRESSION, FOLD_EXPR_INIT is the non-pack argument. */
+DEFTREECODE (UNARY_LEFT_FOLD_EXPR, "unary_left_fold_expr", tcc_expression, 2)
+DEFTREECODE (UNARY_RIGHT_FOLD_EXPR, "unary_right_fold_expr", tcc_expression, 2)
+DEFTREECODE (BINARY_LEFT_FOLD_EXPR, "binary_left_fold_expr", tcc_expression, 3)
+DEFTREECODE (BINARY_RIGHT_FOLD_EXPR, "binary_right_fold_expr", tcc_expression, 3)
+
+
 /** C++ extensions. */
 
 /* Represents a trait expression during template expansion.  */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8643e08..80d6c4e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -80,6 +80,7 @@ c-common.h, not after.
   COMPOUND_REQ_NOEXCEPT_P (in COMPOUND_REQ)
   WILDCARD_PACK_P (in WILDCARD_DECL)
   BLOCK_OUTER_CURLY_BRACE_P (in BLOCK)
+  FOLD_EXPR_MODOP_P (*_FOLD_EXPR)
1: IDENTIFIER_VIRTUAL_P (in IDENTIFIER_NODE)
   TI_PENDING_TEMPLATE_FLAG.
   TEMPLATE_PARMS_FOR_INLINE.
@@ -3249,6 +3250,37 @@ extern void decl_shadowed_for_var_insert (tree, tree);
   TREE_VEC_ELT (ARGUMENT_PACK_ARGS (ARGUMENT_PACK_SELECT_FROM_PACK (NODE)), \
 	ARGUMENT_PACK_SELECT_INDEX (NODE))
 
+#define FOLD_EXPR_CHECK(NODE)		\
+  TREE_CHECK4 (NODE, UNARY_LEFT_FOLD_EXPR, UNARY_RIGHT_FOLD_EXPR,	\
+	   BINARY_LEFT_FOLD_EXPR, BINARY_RIGHT_FOLD_EXPR)
+
+#define BINARY_FOLD_EXPR_CHECK(NODE) \
+  TREE_CHECK2 (NODE, BINARY_LEFT_FOLD_EXPR, BINARY_RIGHT_FOLD_EXPR)
+
+/* True if NODE is UNARY_FOLD_EXPR or a BINARY_FOLD_EXPR */
+#define FOLD_EXPR_P(NODE) \
+  TREE_CODE (NODE) == UNARY_LEFT_FOLD_EXPR \
+|| TREE_CODE (NODE) == UNARY_RIGHT_FOLD_EXPR \
+|| TREE_CODE (NODE) == 

Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread Jeff Law

On 09/15/2015 06:18 AM, Richard Biener wrote:



Of course this boils down to "uses" of a VAR_DECL using the shared tree
node.  On GIMPLE some stmt kinds have separate locations for each operand
(PHI nodes), on GENERIC we'd have to invent a no-op expr tree code to
wrap such uses to be able to give them distinct locations (can't use sth
existing as frontends would need to ignore them in a different way than say
NOP_EXPRs or NON_LVALUE_EXPRs).

Exactly.

Jeff



Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread Jeff Law

On 09/15/2015 06:54 AM, Manuel López-Ibáñez wrote:

On 15 September 2015 at 14:18, Richard Biener
 wrote:

Of course this boils down to "uses" of a VAR_DECL using the shared tree
node.  On GIMPLE some stmt kinds have separate locations for each operand
(PHI nodes), on GENERIC we'd have to invent a no-op expr tree code to
wrap such uses to be able to give them distinct locations (can't use sth
existing as frontends would need to ignore them in a different way than say
NOP_EXPRs or NON_LVALUE_EXPRs).



The problem with that approach (besides the waste of memory implied by
a whole tree node just to store one location_t) is keeping those
wrappers in place while making them transparent for most of the
compiler. According to Arnaud, folding made this approach infeasible:
https://gcc.gnu.org/ml/gcc-patches/2012-09/msg01222.html

The other two alternatives are to store the location of the operands
on the expressions themselves or to store them as on-the-side
data-structure, but they come with their own drawbacks. I was
initially more in favour of the wrapper solution, but after dealing
with NOP_EXPRs, having to deal also with LOC_EXPR would be a nightmare
(as you say, they will have to be ignored in a different way). The
other alternatives seem less invasive and the problems mentioned here
https://gcc.gnu.org/ml/gcc-patches/2012-11/msg00164.html do not seem
as serious as I thought (passing down the location of the operand is
becoming  the norm anyway).
I suspect any on-the-side data structure to handle this is ultimately 
doomed to failure.  Storing the location info for the operands in the 
expression means that anything that modifies an operand would have to 
have access to the expression so that location information could be 
updated.  Ugh.


As painful as it will be, the right way is to stop using DECL nodes like 
this and instead be using another node that isn't shared.  That allows 
atttaching location information.  David and I kicked this around before 
he posted his patch and didn't come up with anything better IIRC.


These wrapper nodes are definitely going to get in the way of folders 
and all kinds of things.  So it's not something that's going to be easy 
to add without digging into and modifying a lot of code.


I've always considered this a wart, but fixing that wart hasn't seemed 
worth the effort until recently.


Jeff


Re: [PATCH] PR66870 PowerPC64 Enable gold linker with split stack

2015-09-17 Thread Lynn A. Boger

Here is my updated patch, with the changes suggested by
Ian for gcc/gospec.c and David for gcc/configure.ac.

Bootstrap built and tested on ppc64le, ppc64 multilib.

2015-09-17Lynn Boger 
gcc/
PR target/66870
config/rs6000/sysv4.h:  Define TARGET_CAN_SPLIT_STACK_64BIT
config.in:  Set up HAVE_GOLD_ALTERNATE_SPLIT_STACK
configure.ac:  Define HAVE_GOLD_ALTERNATE_SPLIT_STACK
on Power based on gold linker version
configure:  Regenerate
gcc.c:  Add -fuse-ld=gold to STACK_SPLIT_SPEC if
HAVE_GOLD_ALTERNATE_SPLIT_STACK defined
go/gospec.c:  (lang_specific_driver):  Set appropriate 
split stack
options for 64 bit compiles based on 
TARGET_CAN_SPLIT_STACK_64BIT


On 09/15/2015 02:58 PM, Ian Lance Taylor wrote:

On Tue, Sep 15, 2015 at 11:24 AM, Lynn A. Boger
  wrote:

I need some feedback on whether to enable the gold linker at
all for split stack on platforms other than Power in gcc/configure.ac.
I don't know if there are gold linker versions that should be verified for
non-Power platforms.  My first patch only enabled it on Power and that
  is what I think should be done.  Those who would like to use gold
with split stack for other platforms can enable it under the appropriate
conditions.

I think that is fine for now.  If you want to enable gold for
i386/x86_64 with -fsplit-stack, that is also fine.

Ian




Index: gcc/config/rs6000/sysv4.h
===
--- gcc/config/rs6000/sysv4.h	(revision 227812)
+++ gcc/config/rs6000/sysv4.h	(working copy)
@@ -940,6 +940,14 @@ ncrtn.o%s"
 #undef TARGET_ASAN_SHADOW_OFFSET
 #define TARGET_ASAN_SHADOW_OFFSET rs6000_asan_shadow_offset
 
+/* On ppc64 and ppc64le, split stack is only support for
+   64 bit. */
+#undef TARGET_CAN_SPLIT_STACK_64BIT
+#if TARGET_GLIBC_MAJOR > 2 \
+  || (TARGET_GLIBC_MAJOR == 2 && TARGET_GLIBC_MINOR >= 18)
+#define TARGET_CAN_SPLIT_STACK_64BIT
+#endif
+
 /* This target uses the sysv4.opt file.  */
 #define TARGET_USES_SYSV4_OPT 1
 
Index: gcc/config.in
===
--- gcc/config.in	(revision 227812)
+++ gcc/config.in	(working copy)
@@ -1310,6 +1310,12 @@
 #endif
 
 
+/* Define if the gold linker with split stack is available as a non-default */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GOLD_NON_DEFAULT_SPLIT_STACK
+#endif
+
+
 /* Define if you have the iconv() function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_ICONV
Index: gcc/configure.ac
===
--- gcc/configure.ac	(revision 227812)
+++ gcc/configure.ac	(working copy)
@@ -2247,6 +2247,42 @@ if test x$gcc_cv_ld != x; then
 fi
 AC_MSG_RESULT($ld_is_gold)
 
+AC_MSG_CHECKING(gold linker with split stack support as non default)
+# Check to see if default ld is not gold, but gold is
+# available and has support for split stack.  If gcc was configured
+# with gold then no checking is done.
+# 
+if test x$ld_is_gold = xno && which ${gcc_cv_ld}.gold >/dev/null 2>&1; then
+
+# For platforms other than powerpc64*, enable as appropriate.
+
+  gold_non_default=no
+  ld_gold=`which ${gcc_cv_ld}.gold`
+# Make sure this gold has minimal split stack support
+  if $ld_gold --help 2>/dev/null | grep split-stack-adjust-size >/dev/null 2>&1; then
+ld_vers=`$ld_gold --version | sed 1q`
+gold_vers=`echo $ld_vers | sed -n \
+  -e 's,^[[^)]]*[[  ]]\([[0-9]][[0-9]]*\.[[0-9]][[0-9]]*[[^)]]*\)) .*$,\1,p'`
+case $target in
+# check that the gold version contains the complete split stack support
+# on powerpc64 big and little endian
+  powerpc64*-*-*)
+case "$gold_vers" in
+  2.25.[[1-9]]*|2.2[[6-9]][[.0-9]]*|2.[[3-9]][[.0-9]]*|[[3-9]].[[.0-9]]*) gold_non_default=yes
+  ;;
+  *) gold_non_default=no
+  ;;
+esac
+;;
+esac
+  fi
+  if test $gold_non_default = yes; then
+AC_DEFINE(HAVE_GOLD_NON_DEFAULT_SPLIT_STACK, 1,
+	[Define if the gold linker supports split stack and is available as a non-default])
+  fi
+fi
+AC_MSG_RESULT($gold_non_default)
+
 ORIGINAL_LD_FOR_TARGET=$gcc_cv_ld
 AC_SUBST(ORIGINAL_LD_FOR_TARGET)
 case "$ORIGINAL_LD_FOR_TARGET" in
Index: gcc/gcc.c
===
--- gcc/gcc.c	(revision 227812)
+++ gcc/gcc.c	(working copy)
@@ -666,7 +666,11 @@ proper position among the other output files.  */
libgcc.  This is not yet a real spec, though it could become one;
it is currently just stuffed into LINK_SPEC.  FIXME: This wrapping
only works with GNU ld and gold.  */
+#ifdef HAVE_GOLD_NON_DEFAULT_SPLIT_STACK
+#define STACK_SPLIT_SPEC " %{fsplit-stack: -fuse-ld=gold --wrap=pthread_create}"
+#else
 #define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}"
+#endif
 
 #ifndef LIBASAN_SPEC
 #define STATIC_LIBASAN_LIBS \

Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread Jeff Law

On 09/15/2015 04:33 AM, Richard Biener wrote:

On Tue, Sep 15, 2015 at 12:20 PM, Jakub Jelinek  wrote:

On Tue, Sep 15, 2015 at 12:14:22PM +0200, Richard Biener wrote:

diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 760467c..c7558a0 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -61,6 +61,8 @@ struct GTY (()) cp_token {
BOOL_BITFIELD purged_p : 1;
/* The location at which this token was found.  */
location_t location;
+  /* The source range at which this token was found.  */
+  source_range range;


Is it just me or does location now feel somewhat redundant with range?  Can't we
compress that somehow?


For a token I'd expect it is redundant, I don't see how it would be useful
for a single preprocessing token to have more than start and end locations.
But generally, for expressions, 3 locations make sense.
If you have
abc + def
^
then having a range is useful.  In any case, I'm surprised that the ranges 
aren't encoded in
location_t (the data structures behind it, where we already stick also
BLOCK pointer).


Probably lack of encoding space ... I suppose upping location_t to
64bits coud solve
some of that (with its own drawback on increasing size of core structures).
If we're going to 64 bits, then we might consider making it a pointer so 
that we don't have to spend so much time building up an encoding scheme.


Jeff


Re: [PATCH] Convert SPARC to LRA

2015-09-17 Thread David Miller
From: Eric Botcazou 
Date: Sat, 12 Sep 2015 16:04:09 +0200

> You need to update https://gcc.gnu.org/backends.html

Done.


[wwwdocs] PATCH for Re: Advertisement in the GCC mirrors list, again

2015-09-17 Thread Gerald Pfeifer
On Tue, 15 Sep 2015, niXman wrote:
> mirrors.webhostinggeeks.com/gcc/

This is a little disappointing, though I am inclinded to consider
it a genuine mistake/migration.  Addressed like this.

Gerald

Index: mirrors.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
retrieving revision 1.231
diff -u -r1.231 mirrors.html
--- mirrors.html9 Sep 2015 16:57:26 -   1.231
+++ mirrors.html17 Sep 2015 19:06:29 -
@@ -38,7 +38,6 @@
 Hungary, Budapest: http://robotlab.itk.ppke.hu/gcc/;>robotlab.itk.ppke.hu, thanks to 
Adam Rak (neurhlp at gmail.com)
 Japan: ftp://ftp.dti.ad.jp/pub/lang/gcc/;>ftp.dti.ad.jp, 
thanks to IWAIZAKO Takahiro (ftp-admin at dti.ad.jp)
 Japan: http://ftp.tsukuba.wide.ad.jp/software/gcc/;>ftp.tsukuba.wide.ad.jp, 
thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
-Latvia, Riga: http://mirrors.webhostinggeeks.com/gcc/;>mirrors.webhostinggeeks.com/gcc/,
 thanks to Igor (whg.igp at gmail.com)
 The Netherlands, Amsterdam:
   http://mirror2.babylon.network/gcc/;>http://mirror2.babylon.network/gcc/
 |
   ftp://mirror2.babylon.network/gcc/;>ftp://mirror2.babylon.network/gcc/
 |


Re: debug mode symbols cleanup

2015-09-17 Thread François Dumont
On 15/09/2015 23:57, Jonathan Wakely wrote:
> On 14/09/15 20:26 +0200, François Dumont wrote:
>> On 08/09/2015 22:47, François Dumont wrote:
>>> On 07/09/2015 13:03, Jonathan Wakely wrote:
 On 05/09/15 22:53 +0200, François Dumont wrote:
>I remember Paolo saying once that we were not guarantiing any abi
> compatibility for debug mode. I haven't found any section for
> unversioned symbols in gnu.ver so I simply uncomment the global
> export.
 There is no section, because all exported symbols are versioned.

 It's OK if objects compiled with Debug Mode using one version of GCC
 don't link to objects compiled with Debug Mode using a different
 version of GCC, but you can't change the exported symbols in the DSO.


 Your changelog doesn't include the changes to config/abi/pre/gnu.ver,
 but those changes are not OK anyway, they fail the abi-check:

 FAIL: libstdc++-abi/abi_check

=== libstdc++ Summary ===

 # of unexpected failures1


>>> Sorry, I though policy regarding debug mode symbols was even more
>>> relax.
>>> It is not so here is another patch that doesn"t break abi checks.
>>>
>>> I eventually made all methods that should not be used deprecated, they
>>> were normally not used explicitely anyway. Their implementation is now
>>> empty. I just needed to add a symbol for the not const _M_message
>>> method
>>> which is the correct signature.
>>>
>>> François
>>>
>> I eventually considered doing it without impacting exported symbols. I
>> just kept the const qualifier on _M_messages and introduced a const_cast
>> in the implementation.
>>
>> Is it ok to commit with this version ?
>
> The changes look OK now (assuming it passes 'make check-abi') but what
> does it actually do?

Mostly clean code, I have been upset in the past to be forced to
introduce new print methods to _Error_formatter or _Parameter type while
those methods were only used through a call to _M_error. Now we can
change anything to the code used to output the diagnostic without having
to also impact _Error_formatter.

I also disliked the mutable fields of _Error_formatter, it often
indicates a design flow, which was the case here.

>
> Does it significantly improve the output?

Not significantly. I initially start working on that because debug
messages were wrapped is a bad way. The message used to be cut on any
character that was not alnum, now we cut only on space. I would like to
improve rendering of types, they are not respecting the max line length
that can be specified by users, but that's not top priority of course.

Committed.

François



Re: [PATCH, PR 57195] Allow mode iterators inside angle brackets

2015-09-17 Thread Richard Sandiford
Michael Collison  writes:
> On 09/14/2015 02:34 AM, Richard Sandiford wrote:
>> Michael Collison  writes:
>>> Here is a modified patch that takes your comments into account. Breaking
>>> on depth == 0 with '>' does not work due to the code looking for whitespace.
>> What goes wrong?  Just to make sure we're talking about the same thing,
>> I meant that in:
>>
>> (match_operand:FOO> ...
>>
>> the name should be "FOO" and you should get an error on ">" when parsing
>> the text after the name, just like you would for:
>>
>> (match_operand:FOO] ...
>
> When I try breaking on '>' with a nesting depth of 0 all examples of 
>  fail.

I meant break when depth == 0 before the decrement that's associated
with '>'.  I think that problem would only occur if you broke _after_
decrementing the depth.

>> It's not a big deal though, so...
>>
>>> 2015-08-25  Michael Collison  
>>>
>>>   PR other/57195
>>>   * read-md.c (read_name): Allow mode iterators inside angle
>>>   brackets in rtl expressions.
>> OK, thanks.
> Meaning okay to check the patch in?

Yeah, it's OK to check in.

Richard



Re: [PATCH, RFC] Implement N4230, Nested namespace definition

2015-09-17 Thread Jason Merrill

On 09/16/2015 07:55 AM, Ville Voutilainen wrote:

This is the first stab, I haven't written the tests yet. Feedback would be
most welcome; should I put this code into a separate function? Is the minor
code duplication with the regular namespace definition ok?


I think I'd prefer to keep it in the same function, but avoid the code 
duplication.


Jason




Re: [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2

2015-09-17 Thread Manuel López-Ibáñez
On 17 September 2015 at 21:57, David Malcolm  wrote:
> In my mind it's more about saving the user from having to locate the
> field they really meant within the corresponding structure declaration
> (either by grep, or by some cross-referencing tool).

I think it is more than that. After a long coding session, one can
start to wonder why the compiler cannot find type_of_unknwon_predicate
or firstColourInColumn (ah! it was type_of_unknown_predicate and
firstColorInColumn!).

Or when we extend this to options (PR67613), why I get

error: unrecognized command line option '-Weffic++'

when I just read it in the manual!

Cheers,

Manuel.


Re: [PATCH] Convert SPARC to LRA

2015-09-17 Thread Eric Botcazou
> > You need to update https://gcc.gnu.org/backends.html
> 
> Done.

Nice work!  Did you keep the 64-bit promotion in PROMOTE_MODE or...?

-- 
Eric Botcazou


RE: [patch] libstdc++/65142 Check read() result in std::random_device.

2015-09-17 Thread Moore, Catherine


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jonathan Wakely
> Sent: Thursday, September 17, 2015 5:28 PM
> To: Gerald Pfeifer
> Cc: libstd...@gcc.gnu.org; gcc-patches@gcc.gnu.org
> Subject: Re: [patch] libstdc++/65142 Check read() result in
> std::random_device.
> 
> On 17/09/15 22:21 +0200, Gerald Pfeifer wrote:
> >On Thu, 17 Sep 2015, Jonathan Wakely wrote:
> >>> Any comments on this version?
> >> Committed to trunk.
> >
> >Unfortunately this broke bootstrap on FreeBSD 10.1.
> >
> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc: In
> member function 'std::random_device::result_type
> std::random_device::_M_getval()':
> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:22:
> >error: 'errno' was not declared in this scope
> >  else if (e != -1 || errno != EINTR)
> >  ^
> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:31:
> >error: 'EINTR' was not declared in this scope
> >  else if (e != -1 || errno != EINTR)
> >   ^
> >Makefile:545: recipe for target 'random.lo' failed
> >
> >I probably won't be able to dig in deeper today, but figured this might
> >already send you on the right path?
> >
> >Actually...
> >
> >...how about he patch below?  Bootstraps on i386-unknown-freebsd10.1,
> >no regressions.
> 
> Sorry about that, I've committed your patch.
> 
> >Gerald
> >
> >
I'm still seeing errors for a build of the mips-sde-elf target with these 
patches.

Errors are:
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:
 In function 'void {anonymous}::print_word({anonymous
}::PrintContext&, const char*, std::ptrdiff_t)':
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:573:10:
 error: 'stderr' was not declared in this scop
e
  fprintf(stderr, "\n");
  ^
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:573:22:
 error: 'fprintf' was not declared in this sco
pe
  fprintf(stderr, "\n");
  ^
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:596:14:
 error: 'stderr' was not declared in this scop
e
  fprintf(stderr, "%s", spacing);
  ^
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:596:35:
 error: 'fprintf' was not declared in this sco
pe
  fprintf(stderr, "%s", spacing);
   ^
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:600:24:
 error: 'stderr' was not declared in this scope
  int written = fprintf(stderr, "%s", word);
^
/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/src/c++11/debug.cc:600:42:
 error: 'fprintf' was not declared in this scop
e
  int written = fprintf(stderr, "%s", word);
  ^

Thanks,
Catherine



Re: [patch] libstdc++/65142 Check read() result in std::random_device.

2015-09-17 Thread Jonathan Wakely

On 17/09/15 22:21 +0200, Gerald Pfeifer wrote:

On Thu, 17 Sep 2015, Jonathan Wakely wrote:

Any comments on this version?

Committed to trunk.


Unfortunately this broke bootstrap on FreeBSD 10.1.

/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc: In member 
function 'std::random_device::result_type std::random_device::_M_getval()':
/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:22: error: 
'errno' was not declared in this scope
 else if (e != -1 || errno != EINTR)
 ^
/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc:144:31: error: 
'EINTR' was not declared in this scope
 else if (e != -1 || errno != EINTR)
  ^
Makefile:545: recipe for target 'random.lo' failed

I probably won't be able to dig in deeper today, but figured this
might already send you on the right path?

Actually...

...how about he patch below?  Bootstraps on i386-unknown-freebsd10.1,
no regressions.


Sorry about that, I've committed your patch.


Gerald


2015-09-17  Gerald Pfeifer  

* src/c++11/random.cc: Include .

Index: libstdc++-v3/src/c++11/random.cc
===
--- libstdc++-v3/src/c++11/random.cc(revision 227883)
+++ libstdc++-v3/src/c++11/random.cc(working copy)
@@ -31,6 +31,7 @@
# include 
#endif

+#include 
#include 

#ifdef _GLIBCXX_HAVE_UNISTD_H


Re: [patch] libstdc++/67173 Fix filesystem::canonical for Solaris 10.

2015-09-17 Thread Jonathan Wakely

On 17/09/15 21:25 +0200, Andreas Schwab wrote:

Jonathan Wakely  writes:


+  p = "/dev/stdin";
+  if (exists(p))
+{
+  auto p2 = canonical(p);
+  if (is_symlink(p))
+VERIFY( p != p2 );
+  else
+VERIFY( p == p2 );
+  VERIFY( canonical(p2) == p2 );


This fails if stdin is a pipe, which doesn't have a (real) name, so
realpath fails.

$ echo | ./canonical.exe
terminate called after throwing an instance of 
'std::experimental::filesystem::v1::__cxx11::filesystem_error'
 what():  filesystem error: cannot canonicalize: No such file or directory 
[/dev/stdin]


Ah, of course, the symlink exists but doesn't point to a real file.
Thanks for the explanation.

I'll re-add tests for symlinks when I come up with a proper method for
testing the Filesystem code.




Re: [PATCH, RFC] Implement N4230, Nested namespace definition

2015-09-17 Thread Ville Voutilainen
On 17 September 2015 at 23:11, Jason Merrill  wrote:
> On 09/16/2015 07:55 AM, Ville Voutilainen wrote:
>>
>> This is the first stab, I haven't written the tests yet. Feedback would be
>> most welcome; should I put this code into a separate function? Is the
>> minor
>> code duplication with the regular namespace definition ok?
>
>
> I think I'd prefer to keep it in the same function, but avoid the code
> duplication.

Ok. Tested on Linux-PPC64. This patch doesn't handle attributes yet, it looks to
me as if gcc doesn't support namespace attributes in the location that
the standard
grammar puts them into. I had to adjust a couple of
point-of-declaration error/warning
locations in the testsuite, but those seem reasonable to me, biased as
I may be towards keeping
this patch simpler than keeping those locations where they were would
likely require.

Nathan, this patch touches areas close to ones that your "Coding rule
enforcement" patch
does, please be aware of potential merge conflicts.

/cp
2015-09-18  Ville Voutilainen  

Implement nested namespace definitions.
* parser.c (cp_parser_namespace_definition): Grok nested namespace
definitions.

/testsuite
2015-09-18  Ville Voutilainen  

Implement nested namespace definitions.
* g++.dg/cpp1z/nested-namespace-def.C: New.
* g++.dg/lookup/name-clash5.C: Adjust.
* g++.dg/lookup/name-clash6.C: Likewise.
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 4f424b6..9fee310 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -16953,6 +16953,8 @@ cp_parser_namespace_definition (cp_parser* parser)
   tree identifier, attribs;
   bool has_visibility;
   bool is_inline;
+  cp_token* token;
+  int nested_definition_count = 0;
 
   cp_ensure_no_omp_declare_simd (parser);
   if (cp_lexer_next_token_is_keyword (parser->lexer, RID_INLINE))
@@ -16965,7 +16967,7 @@ cp_parser_namespace_definition (cp_parser* parser)
 is_inline = false;
 
   /* Look for the `namespace' keyword.  */
-  cp_parser_require_keyword (parser, RID_NAMESPACE, RT_NAMESPACE);
+  token = cp_parser_require_keyword (parser, RID_NAMESPACE, RT_NAMESPACE);
 
   /* Get the name of the namespace.  We do not attempt to distinguish
  between an original-namespace-definition and an
@@ -16979,11 +16981,32 @@ cp_parser_namespace_definition (cp_parser* parser)
   /* Parse any specified attributes.  */
   attribs = cp_parser_attributes_opt (parser);
 
-  /* Look for the `{' to start the namespace.  */
-  cp_parser_require (parser, CPP_OPEN_BRACE, RT_OPEN_BRACE);
   /* Start the namespace.  */
   push_namespace (identifier);
 
+  /* Parse any nested namespace definition. */
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SCOPE))
+{
+  if (is_inline)
+error_at (token->location, "a nested % definition cannot 
be inline");
+  while (cp_lexer_next_token_is (parser->lexer, CPP_SCOPE))
+{
+  cp_lexer_consume_token (parser->lexer);
+  if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
+identifier = cp_parser_identifier (parser);
+  else
+{
+  cp_parser_error (parser, "nested identifier required");
+  break;
+}
+  ++nested_definition_count;
+  push_namespace (identifier);
+}
+}
+
+  /* Look for the `{' to validate starting the namespace.  */
+  cp_parser_require (parser, CPP_OPEN_BRACE, RT_OPEN_BRACE);
+
   /* "inline namespace" is equivalent to a stub namespace definition
  followed by a strong using directive.  */
   if (is_inline)
@@ -17007,6 +17030,10 @@ cp_parser_namespace_definition (cp_parser* parser)
   if (has_visibility)
 pop_visibility (1);
 
+  /* Finish the nested namespace definitions.  */
+  while (nested_definition_count--)
+pop_namespace ();
+
   /* Finish the namespace.  */
   pop_namespace ();
   /* Look for the final `}'.  */
diff --git a/gcc/testsuite/g++.dg/cpp1z/nested-namespace-def.C 
b/gcc/testsuite/g++.dg/cpp1z/nested-namespace-def.C
new file mode 100644
index 000..da35835
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/nested-namespace-def.C
@@ -0,0 +1,14 @@
+// { dg-options "-std=c++1z" }
+
+namespace A::B::C
+{
+   struct X {};
+   namespace T::U::V { struct Y {}; };
+};
+
+A::B::C::X x;
+A::B::C::T::U::V::Y y;
+
+inline namespace D::E {}; // { dg-error "cannot be inline" }
+
+namespace F::G:: {}; // { dg-error "nested identifier required" }
diff --git a/gcc/testsuite/g++.dg/lookup/name-clash5.C 
b/gcc/testsuite/g++.dg/lookup/name-clash5.C
index 74595c2..9673bb9 100644
--- a/gcc/testsuite/g++.dg/lookup/name-clash5.C
+++ b/gcc/testsuite/g++.dg/lookup/name-clash5.C
@@ -6,8 +6,8 @@
 // "[Note: a namespace name or a class template name must be unique in its
 // declarative region (7.3.2, clause 14). ]"
 
-namespace N
-{ // { dg-message "previous declaration" }
+namespace N // { dg-message "previous declaration" }
+{
 }

Fwd: vector lightweight debug mode

2015-09-17 Thread Christopher Jefferson
-- Forwarded message --
From: Christopher Jefferson 
Date: 17 September 2015 at 18:59
Subject: Re: vector lightweight debug mode
To: Jonathan Wakely 


On 16 September 2015 at 21:29, Jonathan Wakely  wrote:
> On 16/09/15 21:37 +0200, François Dumont wrote:
>
 @@ -1051,6 +1071,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   iterator
   insert(const_iterator __position, size_type __n, const
 value_type& __x)
   {
 +__glibcxx_assert(__position >= cbegin() && __position <= cend());
 difference_type __offset = __position - cbegin();
 _M_fill_insert(begin() + __offset, __n, __x);
 return begin() + __offset;
>>>
>>>
>>> This is undefined behaviour, so I'd rather not add this check (I know
>>> it's on the google branch, but it's still undefined behaviour).
>>
>>
>> Why ? Because of the >= operator usage ? Is the attached patch better ?
>> < and == operators are well defined for a random access iterator, no ?
>
>
> No, because it is undefined to compare iterators that belong to
> different containers, or to compare pointers that point to different
> arrays.

While that's true, on the other hand it's defined behaviour when the
assert passes, and in the case where the thing it's trying to check
fails, we are off into undefined-land anyway.

A defined check would be to check if __offset is < 0 or > size(). Once
again if it's false we are undefined, but the assert line itself is
then defined behaviour.


Re: [RFC][PATCH] Preferred rename register in regrename pass

2015-09-17 Thread Eric Botcazou
> I'll let Bernd comment on the patch itself.  But I would say that if
> you're setting up cases where we can tie the source/dest of an extension
> together, then it's a good thing.  It'll cause more of them to turn into
> NOPs and it'll make the redundant extension elimination pass more
> effective as well.

Not if you do it in regrename though, as it's run very late in the pipeline.
If you want to make REE more effective, this would need to be done during RA.

-- 
Eric Botcazou


Re: [PATCH] Convert SPARC to LRA

2015-09-17 Thread David Miller
From: Eric Botcazou 
Date: Thu, 17 Sep 2015 23:13:21 +0200

> Did you keep the 64-bit promotion in PROMOTE_MODE or...?

Yes I had to, the compiler aborts() if I make it use SImode on 64-bit.

I'll take a closer look soon.


Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-17 Thread David Malcolm
On Wed, 2015-09-16 at 16:21 -0400, David Malcolm wrote:
> On Tue, 2015-09-15 at 12:48 +0200, Jakub Jelinek wrote:
> > On Tue, Sep 15, 2015 at 12:33:58PM +0200, Richard Biener wrote:
> > > On Tue, Sep 15, 2015 at 12:20 PM, Jakub Jelinek  wrote:
> > > > On Tue, Sep 15, 2015 at 12:14:22PM +0200, Richard Biener wrote:
> > > >> > diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
> > > >> > index 760467c..c7558a0 100644
> > > >> > --- a/gcc/cp/parser.h
> > > >> > +++ b/gcc/cp/parser.h
> > > >> > @@ -61,6 +61,8 @@ struct GTY (()) cp_token {
> > > >> >BOOL_BITFIELD purged_p : 1;
> > > >> >/* The location at which this token was found.  */
> > > >> >location_t location;
> > > >> > +  /* The source range at which this token was found.  */
> > > >> > +  source_range range;
> > > >>
> > > >> Is it just me or does location now feel somewhat redundant with range? 
> > > >>  Can't we
> > > >> compress that somehow?
> > > >
> > > > For a token I'd expect it is redundant, I don't see how it would be 
> > > > useful
> > > > for a single preprocessing token to have more than start and end 
> > > > locations.
> > > > But generally, for expressions, 3 locations make sense.
> > > > If you have
> > > > abc + def
> > > > ^
> > > > then having a range is useful.  In any case, I'm surprised that the 
> > > > ranges aren't encoded in
> > > > location_t (the data structures behind it, where we already stick also
> > > > BLOCK pointer).
> > > 
> > > Probably lack of encoding space ... I suppose upping location_t to
> > > 64bits coud solve
> > > some of that (with its own drawback on increasing size of core 
> > > structures).
> > 
> > What I had in mind was just add
> >   source_location start, end;
> > to location_adhoc_data struct and use !IS_ADHOC_LOC locations to represent
> > just plain locations without block and without range (including the cases
> > where the range has both start and end equal to the locus) and IS_ADHOC_LOC
> > locations for the cases where either we have non-NULL block, or we have
> > some other range, or both.  But I haven't spent any time on that, so just
> > wondering if such an encoding has been considered.
> 
> I've been attempting to implement that.
> 
> Am I right in thinking that the ad-hoc locations never get purged? i.e.
> that once we've registered an ad-hoc location, then is that slot within
> location_adhoc_data_map is forever associated with that (locus, block)
> pair?  [or in the proposed model, the (locus, src_range, block)
> triple?].
> 
> If so, it may make more sense to put the ranges into ad-hoc locations,
> but only *after tokenization*: in this approach, the src_range would be
> a field within the tokens (like in patch 07/22), in the hope that the
> tokens are short-lived  (which AIUI is the case for libcpp and C, though
> not for C++), presumably also killing the "location" field within
> tokens.  We then stuff the range into the location_t when building trees
> (maybe putting a src_range into c_expr to further delay populating
> location_adhoc_data_map).
> 
> That way we avoid bloating the location_adhoc_data_map during lexing,
> whilst preserving the range information, and we can stuff the ranges
> into the 32-bit location_t within tree/gimple etc (albeit paying a cost
> within the location_adhoc_data_map).
> 
> Thoughts?  Hope this sounds sane.
> Dave

FWIW, I have a (very messy) implementation of this working for the C
frontend, which gives us source ranges on expressions without needing to
add any new tree nodes, or add any fields to existing tree structs.

The approach I'm using:

* ranges are stored directly as fields within cpp_token and c_token
(maybe we can ignore cp_token for now)

* ranges are stashed in the C FE, both (a) within the "struct c_expr"
and (b) within the location_t of each tree expression node as a new
field in the adhoc map.

Doing it twice may seem slightly redundant, but I think both are needed:
  (a) doing it within c_expr allows us to support ranges for constants
and VAR_DECL etc during parsing, without needing any kind of new tree
wrapper node
  (b) doing it in the ad-hoc map allows the ranges for expressions to
survive the parse and be usable in diagnostics later.

So this gives us (in the C FE): ranges for everything during parsing,
and ranges for expressions afterwards, with no new tree nodes or new
fields within tree nodes.

I'm working on cleaning it up into a much more minimal set of patches
that I hope are reviewable.

Hopefully this sounds like a good approach

I've also been looking at ways to mitigate bloat of the ad-hoc map, by
using some extra bits of location_t for representing short ranges
directly.

Dave



C++ PATCH to implement fold-expressions

2015-09-17 Thread Jason Merrill
Back in January Andrew mostly implemented C++1z fold-expressions 
(https://isocpp.org/files/papers/n4295.html), but the patch broke 
bootstrap.  I've fixed it up now and am committing it to the trunk.


Andrew: The main change I made was to drop tentative parsing in favor of 
scanning the tokens ahead looking for an ellipsis next to a 
fold-operator.  I also tweaked some diagnostics, fixed handling of 
dependent expressions with no TREE_TYPE, and corrected empty fold of modops.


Tested x86_64-pc-linux-gnu, applying to trunk.


RE: [patch] libstdc++/65142 Check read() result in std::random_device.

2015-09-17 Thread Moore, Catherine


> -Original Message-
> From: Jonathan Wakely [mailto:jwak...@redhat.com]
> Sent: Thursday, September 17, 2015 6:54 PM
> To: Moore, Catherine; fdum...@gcc.gnu.org
> Cc: Gerald Pfeifer; libstd...@gcc.gnu.org; gcc-patches@gcc.gnu.org
> Subject: Re: [patch] libstdc++/65142 Check read() result in
> std::random_device.
> 
> On 17/09/15 22:32 +, Moore, Catherine wrote:
> >
> >
> >> -Original Message-
> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> >> ow...@gcc.gnu.org] On Behalf Of Jonathan Wakely
> >> Sent: Thursday, September 17, 2015 5:28 PM
> >> To: Gerald Pfeifer
> >> Cc: libstd...@gcc.gnu.org; gcc-patches@gcc.gnu.org
> >> Subject: Re: [patch] libstdc++/65142 Check read() result in
> >> std::random_device.
> >>
> >> On 17/09/15 22:21 +0200, Gerald Pfeifer wrote:
> >> >On Thu, 17 Sep 2015, Jonathan Wakely wrote:
> >> >>> Any comments on this version?
> >> >> Committed to trunk.
> >> >
> >> >Unfortunately this broke bootstrap on FreeBSD 10.1.
> >> >
> >> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-v3/src/c++11/random.cc: In
> >> member function 'std::random_device::result_type
> >> std::random_device::_M_getval()':
> >> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-
> v3/src/c++11/random.cc:144:22:
> >> >error: 'errno' was not declared in this scope
> >> >  else if (e != -1 || errno != EINTR)
> >> >  ^
> >> >/scratch/tmp/gerald/gcc-HEAD/libstdc++-
> v3/src/c++11/random.cc:144:31:
> >> >error: 'EINTR' was not declared in this scope
> >> >  else if (e != -1 || errno != EINTR)
> >> >   ^
> >> >Makefile:545: recipe for target 'random.lo' failed
> >> >
> >> >I probably won't be able to dig in deeper today, but figured this
> >> >might already send you on the right path?
> >> >
> >> >Actually...
> >> >
> >> >...how about he patch below?  Bootstraps on
> >> >i386-unknown-freebsd10.1, no regressions.
> >>
> >> Sorry about that, I've committed your patch.
> >>
> >> >Gerald
> >> >
> >> >
> >I'm still seeing errors for a build of the mips-sde-elf target with these
> patches.
> >
> >Errors are:
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc: In function 'void
> {anonymous}::print_word({anonymous
> >}::PrintContext&, const char*, std::ptrdiff_t)':
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:573:10: error: 'stderr' was not declared in this scop
> >e
> >  fprintf(stderr, "\n");
> >  ^
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:573:22: error: 'fprintf' was not declared in this sco
> >pe
> >  fprintf(stderr, "\n");
> >  ^
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:596:14: error: 'stderr' was not declared in this scop e
> >  fprintf(stderr, "%s", spacing);
> >  ^
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:596:35: error: 'fprintf' was not declared in this sco pe
> >  fprintf(stderr, "%s", spacing);
> >   ^
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:600:24: error: 'stderr' was not declared in this
> >scope
> >  int written = fprintf(stderr, "%s", word);
> >^
> >/scratch/cmoore/virgin-upstream-elf-lite/src/gcc-trunk-5/libstdc++-v3/s
> >rc/c++11/debug.cc:600:42: error: 'fprintf' was not declared in this
> >scop e
> >  int written = fprintf(stderr, "%s", word);
> 
> That's a different problem, due to https://gcc.gnu.org/r227885
> 
> François, could you take a look please?
> 

I've now committed this patch to solve this problem (pre-approved by Jonathan).

2015-09-17  Catherine Moore  

* src/c++11/debug.cc: Include .


Index: src/c++11/debug.cc
===
--- src/c++11/debug.cc  (revision 227887)
+++ src/c++11/debug.cc  (working copy)
@@ -32,6 +32,7 @@
 #include 

 #include 
+#include 

 #include  // for std::min
 #include  // for _Hash_impl


  1   2   >