Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-14 Thread Florian Weimer
On 09/10/2015 06:57 PM, Martin Sebor wrote:

>>> There is quite a bit of documentation of _FORTIFY_SOURCE that explains
>>> its effect on user code.
>>
>> I think there are only random blog articles discussing aspects of it,
>> most of them slightly incorrect or outdated.
> 
> _FORTIFY_SOURCE is a GLIBC feature test macro. It's documented
> in  and mentioned in some of its online manuals.
> For example:
> 
> http://man7.org/linux/man-pages/man7/feature_test_macros.7.html
> 
> or here:
> 
> http://manpages.ubuntu.com/manpages/hardy/man7/feature_test_macros.7.html

Oh, so there is an out-dated man-page as well. :-/

The fd_set checks added in glibc 2.15 are missing.  That caused some
backlash because some folks were actually abusing FD_SET and related
macros.  Nothing too severe, and in the end, we stood our ground.  I
expect the libstdc++ changes to be similar.

Again, my main argument is that the main users of _FORTIFY_SOURCE are
distributions, and they would inject whatever preprocessor macro enables
the new libstdc++ checks anyway, so saving them that work would be
preferable IMHO.

-- 
Florian Weimer / Red Hat Product Security


Re: [PATCH] Update ENABLE_CHECKING to make it usable in "if" conditions

2015-09-14 Thread Richard Biener
On Wed, Sep 9, 2015 at 11:07 PM, Jeff Law  wrote:
> On 08/31/2015 05:30 AM, Richard Biener wrote:
>>
>> On Mon, Aug 31, 2015 at 7:49 AM, Mikhail Maltsev 
>> wrote:
>>>
>>> Hi, all!
>>>
>>> This patch removes some conditional compilation from GCC. In this patch I
>>> define
>>> a macro CHECKING_P, which is equal to 1 when ENABLE_CHECKING is defined
>>> and 0
>>> otherwise. The reason for using a new name is the following: currently in
>>> GCC
>>> there are many places where ENABLE_CHECKING is checked using #ifdef, and
>>> it is a
>>> common way to do such checks (though #if would also work and is used in
>>> several
>>> places). If we change it in such way that it is always defined,
>>> accidentally
>>> using "#ifdef" instead of "#if" will lead to subtle errors: some
>>> expensive
>>> checks intended only for development stage will be enabled in release
>>> build and
>>> cause performance degradation.
>>>
>>> This patch removes all uses of ENABLE_CHECKING (I also tried poisoning
>>> this
>>> identifier in system.h, and the build succeeded, but I don't know how
>>> will this
>>> affect, e.g. merging feature branches, so I think such decisions should
>>> be made
>>> by maintainers).
>>
>>
>> I think we want to keep ENABLE_CHECKING for macro use and for some
>> selected
>> cases.
>
> Can you outline which cases you want to keep?  My general feeling is to
> avoid conditionally compiled code as much as we can.

I guess I was merely looking for the patch to be split up to see the motivation
of an always-defined CHECKING_P macro.  With the suggestion to have
a runtime flag_checking variable I wonder if there is any real code that
is guarded by ENABLE_CHECKING now that can't use flag_checking
(yes, tree checking macros and gcc_checking_assert, but those already
work with ENABLE_CHECKING just fine and are in macro context anyway).

Richard.
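
As a rough illustration of the "#ifdef" versus "if" distinction discussed in
this thread, here is a minimal sketch. It is not code from the patch:
verify_invariants, process and checks_run are hypothetical names, and the
CHECKING_P definition only mirrors the description in the quoted proposal
(1 when ENABLE_CHECKING is defined, 0 otherwise).

```cpp
#include <cassert>

// Assumption: the build may or may not pass -DENABLE_CHECKING.
// CHECKING_P is always defined, as 1 or 0, so it can be used in
// ordinary "if" conditions as well as in "#if".
#ifdef ENABLE_CHECKING
#define CHECKING_P 1
#else
#define CHECKING_P 0
#endif

// Hypothetical counter standing in for an expensive consistency check.
int checks_run = 0;

void verify_invariants ()
{
  ++checks_run;
}

// Hypothetical worker.  Both branches are always parsed and type-checked;
// when CHECKING_P is 0 the compiler can drop the call as dead code.
int process (int x)
{
  if (CHECKING_P)
    verify_invariants ();
  return x + 1;
}
```

Note that "#ifdef CHECKING_P" would be true in both configurations here, which
is the pitfall the proposal describes for an always-defined macro.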

>>
>>> As for conditional compilation, I tried to remove it and replace #ifdef-s
>>> with
>>> if-s where possible, but, of course, avoided changing data structures
>>> layout. I
>>> also avoided reindenting large blocks of code. I changed some functions
>>> (and a
>>> couple of global variables) that were only used in "checking" build so
>>> that now
>>> they are always defined/compiled and have a DEBUG_FUNCTION (i.e. "used")
>>> attribute. I'll try to handle them in a more clean way: move CHECKING_P
>>> check
>>> inside their definitions - that will slightly reduce "visual noise" from
>>> conditions like
>
> We'll eventually want to do the reindentations as well, but for an RFC,
> that's fine.
>
>
>>>
>>> While working on this patch I noticed some issues worth mentioning. In
>>> sese.h:bb_in_region we have a check (enabled only in "checking" build):
>>>
>>>/* Check that there are no edges coming in the region: all the
>>>   predecessors of EXIT are dominated by ENTRY.  */
>>>FOR_EACH_EDGE (e, ei, exit->preds)
>>>  dominated_by_p (CDI_DOMINATORS, e->src, entry);
>>>
>>> IIUC, dominated_by_p has no side effects, so it's useless. Changing it to
>>> "gcc_assert (dominated_by_p (...));" causes regressions. I disabled it in
>>> the
>>> patch and added a comment.
>>
>>
>> You should open a bugreport.
>
> Agreed.  Looks like a simple bug.
>
>>
>>> In lra.c we have:
>>>
>>> #ifdef ENABLE_CHECKING
>>>
>>>   /* Function checks RTL for correctness. If FINAL_P is true, it is
>>>  done at the end of LRA and the check is more rigorous.  */
>>>   static void
>>>   check_rtl (bool final_p)
>>> ...
>>> #ifdef ENABLED_CHECKING
>>>  extract_constrain_insn (insn);
>>> #endif
>>> ...
>>> #endif /* #ifdef ENABLE_CHECKING */
>>>
>>> The comment for extract_constrain_insn says:
>>> /* Do uncached extract_insn, constrain_operands and complain about
>>> failures.
>>> This should be used when extracting a pre-existing constrained
>>> instruction
>>> if the caller wants to know which alternative was chosen.  */
>>>
>>> So, as I understand, this additional check can be useful. This patch
>>> removes
>>> "#ifdef ENABLED_CHECKING" (without regressions).
>
> No strong opinions here.  There's other things that would catch this later.
> The check merely catches it closer to the point where things go wrong.  So
> I'd tend to want it to be conditional.
>
>>>
>>> The third issue is that some gcc_checking_assert-s were guarded by #ifdef
>>> like:
>>> #ifdef ENABLE_CHECKING
>>> gcc_checking_assert (!is_deleted (*slot));
>>> #endif
>>> (is_deleted is defined unconditionally).
>>>
>>> Probably that is because it is not obvious from the "release" definition
>>> #define gcc_checking_assert(EXPR) ((void)(0 && (EXPR)))
>>> that EXPR is not evaluated.
>>>
>>> At least, I first decided to change it to something like
>>> "do { if (0) (void)(EXPR); } while(0)"
>>> and only then realized that the effect of "0 &&" is exactly the same. I
>>> added a
>>> comment to make it more obvious.
>
> Sounds reasonable.
>
>>>
>>> I tested the patch on x86_64-pc-linux-gnu with --enabl
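
The point about "0 &&" made in the quoted text can be checked with a small
sketch. The macro below is the release definition quoted above; evaluations,
counted_true and run_release_assert are hypothetical names used only to
observe whether the operand is evaluated.

```cpp
#include <cassert>

// Release-style definition quoted in the message above: EXPR is still
// parsed and type-checked, but "0 &&" short-circuits, so it never runs.
#define gcc_checking_assert(EXPR) ((void)(0 && (EXPR)))

// Hypothetical counter used only to observe whether EXPR was evaluated.
int evaluations = 0;

bool counted_true ()
{
  ++evaluations;
  return true;
}

// Returns how many times the operand was evaluated after one use of the
// macro; with the definition above the counter is left unchanged.
int run_release_assert ()
{
  gcc_checking_assert (counted_true ());
  return evaluations;
}
```

Because the operand is never evaluated, wrapping such an assert in an extra
"#ifdef ENABLE_CHECKING" only matters when the operand would not compile in a
release build.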

RE: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*

2015-09-14 Thread David Sherwood
Hi All,

For what it's worth I have uploaded a new patch that changes the name
from STRICT_FMIN/MAX to just FMIN/FMAX, although I realise that this
discussion has not yet been resolved. I have also added scheduling
attributes to the aarch64 instructions.

Regards,
David Sherwood.

ChangeLog:

2015-08-28  David Sherwood  

gcc/
* builtins.c (integer_valued_real_p): Add FMIN_EXPR and FMAX_EXPR.
(fold_builtin_fmin_fmax): For strict math, convert builtins fmin and
fmax to FMIN_EXPR and FMAX_EXPR, respectively.
* expr.c (expand_expr_real_2): Add FMIN_EXPR and FMAX_EXPR.
* fold-const.c (const_binop): Likewise.
(fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
(tree_binary_nonzero_warnv_p): Likewise.
* optabs.h (fminmax_support): Declare.
* optabs.def: Add new optabs fmax_optab/fmin_optab.
* optabs.c (optab_for_tree_code): Return new optabs for FMIN_EXPR and
FMAX_EXPR.
(fminmax_support): New function.
* real.c (real_arithmetic): Add FMIN_EXPR and FMAX_EXPR.
* tree.def: Likewise.
* tree.c (associative_tree_code, commutative_tree_code): Likewise.
* tree-cfg.c (verify_expr): Likewise.
(verify_gimple_assign_binary): Likewise.
* tree-inline.c (estimate_operator_cost): Likewise.
* tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
(op_symbol_code): Likewise.
* config/aarch64/aarch64.md: New pattern.
* config/aarch64/aarch64-simd.md: Likewise.
* config/aarch64/iterators.md: New unspecs, iterators.
* config/arm/iterators.md: New iterators.
* config/arm/unspecs.md: New unspecs.
* config/arm/neon.md: New pattern.
* config/arm/vfp.md: Likewise.
* doc/generic.texi: Add FMAX_EXPR and FMIN_EXPR.
* doc/md.texi: Add fmin and fmax patterns.
gcc/testsuite
* gcc.target/aarch64/fmaxmin.c: New test.
* gcc.target/arm/fmaxmin.c: New test.


> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 19 August 2015 14:41
> To: Richard Biener; David Sherwood; GCC Patches; Richard Sandiford
> Subject: Re: [PING][Patch] Add support for IEEE-conformant versions of scalar 
> fmin* and fmax*
> 
> On Wed, Aug 19, 2015 at 3:06 PM, Richard Sandiford
>  wrote:
> > Richard Biener  writes:
> >> As an additional point for many math functions we have to support errno
> >> which means, like, BUILT_IN_SQRT can be rewritten to SQRT_EXPR
> >> only if -fno-math-errno is in effect.  But then code has to handle
> >> both variants for things like constant folding and expression combining.
> >> That's very unfortunate and something we want to avoid (one reason
> >> the POW_EXPR thing didn't fly when I tried).  STRICT_FMIN/MAX_EXPR
> >> is an example where this doesn't apply, of course (but I detest the name,
> >> just use FMIN/FMAX_EXPR?).  Still you'd need to handle both,
> >> FMIN_EXPR and BUILT_IN_FMIN, in code doing analysis/transform.
> >
> > Yeah, but match.pd makes that easy, right? ;-)
> 
> Sure, but that only addresses stmt combining, not other passes.  And of course
> it causes {gimple,generic}-match.c to become even bigger ;)
> 
> Richard.



fmaxmin.patch
Description: Binary data


[patch] gfortran testsuite/read_dir.f90 - set xfail on dragonfly as well

2015-09-14 Thread John Marino
Steve Kargl suspected that read_dir.f90 would fail on all BSDs with the
recent change.  At least for DragonFly BSD, he is correct:

https://gcc.gnu.org/ml/gcc-testresults/2015-09/msg01222.html

Please consider incorporating the attached patch which sets the test for
xfail on dragonfly as well as freebsd.

Suggested gcc/testsuite/ChangeLog entry:

2015-09-XX  John Marino  

* gfortran.dg/read_dir.f90: XFAIL this testcase on DragonFly.


Thanks,
John
Index: gcc/testsuite/gfortran.dg/read_dir.f90
===
--- gcc/testsuite/gfortran.dg/read_dir.f90  (revision 227744)
+++ gcc/testsuite/gfortran.dg/read_dir.f90  (working copy)
@@ -1,4 +1,4 @@
-! { dg-do run { xfail *-*-freebsd* } }
+! { dg-do run { xfail *-*-freebsd* *-*-dragonfly* } }
 ! PR67367
 program bug
implicit none


RE: [PATCH 2/4] [MIPS] Add pipeline description for MSA

2015-09-14 Thread Matthew Fortune
> gcc/ChangeLog:
> 
>   * config/mips/i6400.md (i6400_fpu_intadd, i6400_fpu_logic)
>   (i6400_fpu_div, i6400_fpu_cmp, i6400_fpu_float,
>   i6400_fpu_store)
>   (i6400_fpu_long_pipe, i6400_fpu_logic_l, i6400_fpu_float_l)
>   (i6400_fpu_mult): New cpu units.
>   (i6400_msa_add_d, i6400_msa_int_add, i6400_msa_short_logic3)
>   (i6400_msa_short_logic2, i6400_msa_short_logic, i6400_msa_move)
>   (i6400_msa_cmp, i6400_msa_short_float2, i6400_msa_div_d)
>   (i6400_msa_div_w, i6400_msa_div_h, i6400_msa_div_b,
> i6400_msa_copy)
>   (i6400_msa_branch, i6400_fpu_msa_store, i6400_fpu_msa_load)
>   (i6400_fpu_msa_move, i6400_msa_long_logic1,
> i6400_msa_long_logic2)
>   (i6400_msa_mult, i6400_msa_long_float2, i6400_msa_long_float4)
>   (i6400_msa_long_float5, i6400_msa_long_float8, i6400_msa_fdiv_df)
>   (i6400_msa_fdiv_sf): New reservations.
>   * config/mips/p5600.md (p5600_fpu_intadd, p5600_fpu_cmp)
>   (p5600_fpu_float, p5600_fpu_logic_a, p5600_fpu_logic_b,
> p5600_fpu_div)
>   (p5600_fpu_logic, p5600_fpu_float_a, p5600_fpu_float_b,)
>   (p5600_fpu_float_c, p5600_fpu_float_d, p5600_fpu_mult,
> p5600_fpu_fdiv)
>   (p5600_fpu_load): New cpu units.
>   (msa_short_int_add, msa_short_logic, msa_short_logic_move_v)
>   (msa_short_cmp, msa_short_float2, msa_short_logic3,
> msa_short_store4)
>   (msa_long_load, msa_short_store, msa_long_logic,
> msa_long_float2)
>   (msa_long_float4, msa_long_float5, msa_long_float8,
> msa_long_mult)
>   (msa_long_fdiv, msa_long_div): New reservations.

This all looks fine to me.

Thanks,
Matthew


Re: [C++ Patch] PR 51911 V2 ("G++ accepts new auto { list }")

2015-09-14 Thread Paolo Carlini

... concretely, I tested successfully the below.

Thanks,
Paolo.


Index: cp/parser.c
===
--- cp/parser.c (revision 227737)
+++ cp/parser.c (working copy)
@@ -7591,8 +7591,9 @@ cp_parser_new_expression (cp_parser* parser)
 type = cp_parser_new_type_id (parser, &nelts);
 
   /* If the next token is a `(' or '{', then we have a new-initializer.  */
-  if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN)
-  || cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+  cp_token *token = cp_lexer_peek_token (parser->lexer);
+  if (token->type == CPP_OPEN_PAREN
+  || token->type == CPP_OPEN_BRACE)
 initializer = cp_parser_new_initializer (parser);
   else
 initializer = NULL;
@@ -7601,6 +7602,21 @@ cp_parser_new_expression (cp_parser* parser)
  expression.  */
   if (cp_parser_non_integral_constant_expression (parser, NIC_NEW))
 ret = error_mark_node;
+  /* 5.3.4/2: "If the auto type-specifier appears in the type-specifier-seq
+ of a new-type-id or type-id of a new-expression, the new-expression shall
+ contain a new-initializer of the form ( assignment-expression )".
+ Additionally, consistently with the spirit of DR 1467, we want to accept
+ 'new auto { 2 }' too.  */
+  else if (type_uses_auto (type)
+  && (vec_safe_length (initializer) != 1
+  || (BRACE_ENCLOSED_INITIALIZER_P ((*initializer)[0])
+  && CONSTRUCTOR_NELTS ((*initializer)[0]) != 1)))
+{
+  error_at (token->location,
+   "initialization of new-expression for type %<auto%> "
+   "requires exactly one element");
+  ret = error_mark_node;
+}
   else
 {
   /* Create a representation of the new-expression.  */
Index: testsuite/g++.dg/cpp0x/new-auto1.C
===
--- testsuite/g++.dg/cpp0x/new-auto1.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/new-auto1.C  (working copy)
@@ -0,0 +1,10 @@
+// PR c++/51911
+// { dg-do compile { target c++11 } }
+
+#include 
+
+auto foo1 = new auto { 3, 4, 5 };  // { dg-error "22:initialization of new-expression for type 'auto'" }
+auto bar1 = new auto { 2 };
+
+auto foo2 = new auto ( 3, 4, 5 );  // { dg-error "22:initialization of new-expression for type 'auto'" }
+auto bar2 = new auto ( 2 );
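
For readers who want to try the accepted forms outside the testsuite, here is
a minimal sketch. It assumes a compiler that applies the DR 1467 deduction
rule mentioned in the patch comment (C++17 is the safest setting for the
braced form); make_with_parens and make_with_braces are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Each new-expression supplies exactly one initializer element, so "auto"
// is deduced from that element.
int *make_with_parens ()
{
  auto p = new auto (2);
  static_assert (std::is_same<decltype (p), int *>::value, "deduced int");
  return p;
}

// Braced form with a single element; assumed to deduce int as well.
int *make_with_braces ()
{
  auto p = new auto { 2 };
  static_assert (std::is_same<decltype (p), int *>::value, "deduced int");
  return p;
}

// Forms the patch diagnoses, left commented out so the sketch compiles:
//   auto bad1 = new auto { 3, 4, 5 };
//   auto bad2 = new auto ( 3, 4, 5 );
```

The caller owns the returned pointers and should delete them.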


Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target

2015-09-14 Thread Ilya Enkovich
On 09 Sep 10:20, Uros Bizjak wrote:
> On Wed, Sep 9, 2015 at 10:12 AM, Uros Bizjak  wrote:
> > On Tue, Sep 8, 2015 at 5:49 PM, Ilya Enkovich  
> > wrote:
> >
> > > Please make the new changes to insn patterns depend on TARGET_STV. This
> > > way, non-STV compiles will behave exactly as now.
> >
> > +;; Math-dependant integer modes with DImode.
> > +(define_mode_iterator SWIM1248x [(QI "TARGET_QIMODE_MATH")
> > +   (HI "TARGET_HIMODE_MATH")
> > +   SI DI])
> > +
> >
> > DI should depend on TARGET_STV && TARGET_SSE2
> >
> > @@ -2093,9 +2098,9 @@
> >
> >  (define_insn "*movdi_internal"
> >[(set (match_operand:DI 0 "nonimmediate_operand"
> > -"=r  ,o  ,r,r  ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m ,?r
> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m")
> > +"=r  ,o  ,r,r  ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m,?r
> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m")
> >   (match_operand:DI 1 "general_operand"
> > -"riFo,riF,Z,rem,i,re,C ,*y,m  ,*y,*Yn,r   ,C ,*v,m ,*v,*Yj,*v,r
> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))]
> > +"riFo,riF,Z,rem,i,re,C ,*y,m  ,*y,*Yn,r   ,C ,*v,m ,v,*Yj,*v,r
> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))]
> >"!(MEM_P (operands[0]) && MEM_P (operands[1]))"
> >  {
> >
> > > Please add new alternative and use enabled attribute to conditionally
> > > select correct alternative. Preferably, the new alternative should be
> > just after the one it changes, so you will have to change many of the
> > alternative's numbers in attribute calculations.
> >
> > +(define_insn_and_split "*anddi3_doubleword"
> > +  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r")
> > + (and:DI
> > + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
> > + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
> > +   (clobber (reg:CC FLAGS_REG))]
> > +  "!TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)"
> > +  "#"
> > +  "!TARGET_64BIT && reload_completed"
> > +  [(parallel [(set (match_dup 0)
> >
> > You should add TARGET_STV && TARGET_SSE2 in the above and other added 
> > patterns.
> 
> (I pushed "send" too fast here ;) )
> 
> Please note that you can use "&& ..." in the split condition to avoid
> duplication with insn condition.
> 
> + if (TARGET_SSE4_1)
> +  {
> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
> + CONST0_RTX (V4SImode),
> + gen_rtx_SUBREG (SImode, reg, 0)));
> +emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
> +  gen_rtx_SUBREG (V4SImode, vreg, 0),
> +  gen_rtx_SUBREG (SImode, reg, 4),
> +  GEN_INT (2)));
> +  }
> + else if (TARGET_INTER_UNIT_MOVES_TO_VEC)
> +  {
> +rtx tmp = gen_reg_rtx (DImode);
> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
> + CONST0_RTX (V4SImode),
> + gen_rtx_SUBREG (SImode, reg, 0)));
> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
> + CONST0_RTX (V4SImode),
> + gen_rtx_SUBREG (SImode, reg, 4)));
> +emit_insn (gen_vec_interleave_lowv4si
> +   (gen_rtx_SUBREG (V4SImode, vreg, 0),
> + gen_rtx_SUBREG (V4SImode, vreg, 0),
> + gen_rtx_SUBREG (V4SImode, tmp, 0)));
> +  }
> + else
> +  {
> +rtx tmp = assign_386_stack_local (DImode, SLOT_TEMP);
> +emit_move_insn (adjust_address (tmp, SImode, 0),
> +gen_rtx_SUBREG (SImode, reg, 0));
> +emit_move_insn (adjust_address (tmp, SImode, 4),
> +gen_rtx_SUBREG (SImode, reg, 4));
> +emit_move_insn (vreg, tmp);
> +  }
> 
> As a future cleanup idea, maybe we should reimplement the above code
> as an expander and use it in several places. IIRC, there are already
> several places in the code that would benefit from it.
> 
> Uros.

Hi Uros!

Thanks a lot for your review!  I fixed my patch according to your comments.  
Everything is under stv target now and should be transparent for non-stv 
targets.  Bootstrapped and regtested for x86_64-unknown-linux-gnu.  Does it 
look OK?

Thanks,
Ilya
--
gcc/

2015-09-14  Ilya Enkovich  

* config/i386/i386.c: Include dbgcnt.h.
(has_non_address_hard_reg): New.
(convertible_comparison_p): New.
(scalar_to_vector_candidate_p): New.
(remove_non_convertible_regs): New.
(scalar_chain): New.
(scalar_chain::scalar_chain): New.
(scalar_chain::~scalar_chain): New.
(scalar_chain::add_to_queue): New.
(scalar_chain::mark_dual_mode_def): New.
(scalar_chain::analyze_register_chain): New.
(scalar_chain::add_insn): New.
(scalar_chain::build): New.
(scalar_chain::compute_convert_gain): New.
(scalar_chain::replace_with_subreg): New.
(scalar_chain::replace_with_subreg_in_insn): New.
(scalar_chain::emit_conversion_insns): New.
(scalar_chain::make_vector_copies): New.
(scalar_chain::convert_reg): New.
(scalar_chain::convert_op): New.
(scalar_chain::convert_insn): New.
(scalar_chain::convert): New.
(convert_scalars_to_vector): New.
(pass_data_stv): New.
(pass_stv): New.
(make_pass_stv): New.
(ix86_

Re: [gomp4] SESE region neutering

2015-09-14 Thread Nathan Sidwell

On 09/14/15 04:11, Bernd Schmidt wrote:


This looks like it could potentially go into sese.c instead.


yeah, I looked  at that to see if it had any goodies -- AFAICT it does 
processing of already-identified SESE regions.  It could certainly be abstracted 
to there, if there are more general needs.  Heck, if this turns out to be like 
the mach-dep loop stuff I did way back, that you cloned into AVR(?), I guess you 
might get the honours!


nathan


Re: [RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-09-14 Thread Ramana Radhakrishnan


On 27/08/15 15:07, Marcus Shawcroft wrote:
> On 27 July 2015 at 15:33, Ramana Radhakrishnan
>  wrote:
> 
>>   Ramana Radhakrishnan  
>>
>> PR target/63304
>> * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
>> nopcrelative_literal_loads.
>> (aarch64_classify_address): Likewise.
>> (aarch64_constant_pool_reload_icode): Define.
>> (aarch64_secondary_reload): Handle secondary reloads for
>> literal pools.
>> (aarch64_override_options): Handle nopcrelative_literal_loads.
>> (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
>> * config/aarch64/aarch64.md 
>> (aarch64_reload_movcp):
>> Define.
>> (aarch64_reload_movcp): Likewise.
>> * config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
>> * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
>> predicate.
>> * doc/invoke.texi (mnopc-relative-literal-loads): Document.
> 
> This looks OK to me. It needs rebasing, but OK if the rebase is
> trivial.  Default on is fine.  Hold off on the back ports for a couple
> of weeks.
> Cheers
> /Marcus
> 

This is what I applied. I'll give it a week or so on trunk before backporting
to the release branches.
Since handling of literal pools more than 1 MiB away is now on by default, this
final rebased version switches the option name to the positive form
(mpc-relative-literal-loads) and handles it accordingly.

Tested on aarch64-none-elf , no regressions. Applied to trunk.

Thanks,
Ramana 


2015-09-14  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_constant_pool_reload_icode): Define.
(aarch64_secondary_reload): Handle secondary reloads for
literal pools.
(aarch64_override_options): Handle nopcrelative_literal_loads.
(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
* config/aarch64/aarch64.md (aarch64_reload_movcp):
Define.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
predicate.
* doc/invoke.texi (mpc-relative-literal-loads): Document.
Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 227737)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,22 @@
+2015-09-14  Ramana Radhakrishnan  
+
+   PR target/63304
+   * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
+   nopcrelative_literal_loads.
+   (aarch64_classify_address): Likewise.
+   (aarch64_constant_pool_reload_icode): Define.
+   (aarch64_secondary_reload): Handle secondary reloads for
+   literal pools.
+   (aarch64_override_options): Handle nopcrelative_literal_loads.
+   (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
+   * config/aarch64/aarch64.md (aarch64_reload_movcp):
+   Define.
+   (aarch64_reload_movcp): Likewise.
+   * config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
+   * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
+   predicate.
+   * doc/invoke.texi (mpc-relative-literal-loads): Document.
+
 2015-09-13  Olivier Hainque  
Eric Botcazou  
 
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c(revision 227737)
+++ gcc/config/aarch64/aarch64.c(working copy)
@@ -1734,11 +1734,27 @@
  aarch64_emit_move (dest, base);
  return;
}
+
  mem = force_const_mem (ptr_mode, imm);
  gcc_assert (mem);
+
+ /* If we aren't generating PC relative literals, then
+we need to expand the literal pool access carefully.
+This is something that needs to be done in a number
+of places, so could well live as a separate function.  */
+ if (nopcrelative_literal_loads)
+   {
+ gcc_assert (can_create_pseudo_p ());
+ base = gen_reg_rtx (ptr_mode);
+ aarch64_expand_mov_immediate (base, XEXP (mem, 0));
+ mem = gen_rtx_MEM (ptr_mode, base);
+   }
+
  if (mode != ptr_mode)
mem = gen_rtx_ZERO_EXTEND (mode, mem);
+
  emit_insn (gen_rtx_SET (dest, mem));
+
  return;
 
 case SYMBOL_SMALL_TLSGD:
@@ -3854,9 +3870,10 @@
  rtx sym, addend;
 
  split_const (x, &sym, &addend);
- return (GET_CODE (sym) == LABEL_REF
- || (GET_CODE (sym) == SYMBOL_REF
- && CONSTANT_POOL_ADDRESS_P (sym)));
+ return ((GET_CODE (sym) == LABEL_REF
+  || (GET_CODE (s

Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*

2015-09-14 Thread Richard Biener
To make progress here I think adding new optabs is fine.  So can you
split out that part and implement builtin expanders
for fmin/max instead?

Btw, FMIN/MAX_EXPR are not commutative AFAIK because of behavior for
fmax (-NaN, NaN)
vs. fmax (NaN, -NaN)?

Richard.
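
The NaN question above can be probed with a small sketch. It is only an
experiment harness, not a statement about what any target returns: the two
function names are hypothetical, and which NaN fmax returns when both operands
are NaN is left to the implementation, so the first function may give
different answers on different platforms.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Returns true if fmax gives the same sign bit for both operand orders
// when the operands are NaNs of opposite sign.  The result is
// platform-dependent, which is the commutativity concern raised above.
bool fmax_nan_sign_is_order_independent ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  const double pos_nan = std::copysign (nan, 1.0);
  const double neg_nan = std::copysign (nan, -1.0);
  const double a = std::fmax (neg_nan, pos_nan);
  const double b = std::fmax (pos_nan, neg_nan);
  return std::signbit (a) == std::signbit (b);
}

// With a single NaN operand fmax returns the other operand.
double fmax_ignores_single_nan (double x)
{
  return std::fmax (std::numeric_limits<double>::quiet_NaN (), x);
}
```

If the first function returns false on a given target, the two operand orders
are distinguishable there and treating the operation as commutative would
change the sign of the resulting NaN.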

On Mon, Sep 14, 2015 at 12:36 PM, David Sherwood  wrote:
> Hi All,
>
> For what it's worth I have uploaded a new patch that changes the name
> from STRICT_FMIN/MAX to just FMIN/FMAX, although I realise that this
> discussion has not yet been resolved. I have also added scheduling
> attributes to the aarch64 instructions.
>
> Regards,
> David Sherwood.
>
> ChangeLog:
>
> 2015-08-28  David Sherwood  
>
> gcc/
> * builtins.c (integer_valued_real_p): Add FMIN_EXPR and FMAX_EXPR.
> (fold_builtin_fmin_fmax): For strict math, convert builtins fmin and
> fmax to FMIN_EXPR and FMAX_EXPR, respectively.
> * expr.c (expand_expr_real_2): Add FMIN_EXPR and FMAX_EXPR.
> * fold-const.c (const_binop): Likewise.
> (fold_binary_loc, tree_binary_nonnegative_warnv_p): Likewise.
> (tree_binary_nonzero_warnv_p): Likewise.
> * optabs.h (fminmax_support): Declare.
> * optabs.def: Add new optabs fmax_optab/fmin_optab.
> * optabs.c (optab_for_tree_code): Return new optabs for FMIN_EXPR and
> FMAX_EXPR.
> (fminmax_support): New function.
> * real.c (real_arithmetic): Add FMIN_EXPR and FMAX_EXPR.
> * tree.def: Likewise.
> * tree.c (associative_tree_code, commutative_tree_code): Likewise.
> * tree-cfg.c (verify_expr): Likewise.
> (verify_gimple_assign_binary): Likewise.
> * tree-inline.c (estimate_operator_cost): Likewise.
> * tree-pretty-print.c (dump_generic_node, op_code_prio): Likewise.
> (op_symbol_code): Likewise.
> * config/aarch64/aarch64.md: New pattern.
> * config/aarch64/aarch64-simd.md: Likewise.
> * config/aarch64/iterators.md: New unspecs, iterators.
> * config/arm/iterators.md: New iterators.
> * config/arm/unspecs.md: New unspecs.
> * config/arm/neon.md: New pattern.
> * config/arm/vfp.md: Likewise.
> * doc/generic.texi: Add FMAX_EXPR and FMIN_EXPR.
> * doc/md.texi: Add fmin and fmax patterns.
> gcc/testsuite
> * gcc.target/aarch64/fmaxmin.c: New test.
> * gcc.target/arm/fmaxmin.c: New test.
>
>
>> -Original Message-
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: 19 August 2015 14:41
>> To: Richard Biener; David Sherwood; GCC Patches; Richard Sandiford
>> Subject: Re: [PING][Patch] Add support for IEEE-conformant versions of 
>> scalar fmin* and fmax*
>>
>> On Wed, Aug 19, 2015 at 3:06 PM, Richard Sandiford
>>  wrote:
>> > Richard Biener  writes:
>> >> As an additional point for many math functions we have to support errno
>> >> which means, like, BUILT_IN_SQRT can be rewritten to SQRT_EXPR
>> >> only if -fno-math-errno is in effect.  But then code has to handle
>> >> both variants for things like constant folding and expression combining.
>> >> That's very unfortunate and something we want to avoid (one reason
>> >> the POW_EXPR thing didn't fly when I tried).  STRICT_FMIN/MAX_EXPR
>> >> is an example where this doesn't apply, of course (but I detest the name,
>> >> just use FMIN/FMAX_EXPR?).  Still you'd need to handle both,
>> >> FMIN_EXPR and BUILT_IN_FMIN, in code doing analysis/transform.
>> >
>> > Yeah, but match.pd makes that easy, right? ;-)
>>
>> Sure, but that only addresses stmt combining, not other passes.  And of 
>> course
>> it causes {gimple,generic}-match.c to become even bigger ;)
>>
>> Richard.
>


Re: Fix 61441

2015-09-14 Thread Richard Biener
On Thu, Sep 10, 2015 at 9:30 AM, Sujoy Saraswati  wrote:
> Hi,
>  Here is a modified patch for this. The change this time is in
> fold-const.c and real.c.
>
> Bootstrap and regression tests on x86_64-linux-gnu and
> aarch64-unknown-linux-gnu passed with changes done on trunk.
>
> Is this fine ?
>
> Regards,
> Sujoy
>
> 2015-09-10  Sujoy Saraswati 
>
> PR tree-optimization/61441
> * fold-const.c (const_binop, fold_abs_const): Convert
> sNaN to qNaN when flag_signaling_nans is off.
> * real.c (do_add, do_multiply, do_divide, do_fix_trunc): Same.
> (real_arithmetic, real_ldexp, real_convert): Same
>
> PR tree-optimization/61441
> * gcc.dg/pr61441.c: New testcase.
>
> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c(revision 227584)
> +++ gcc/fold-const.c(working copy)
> @@ -1183,9 +1183,19 @@ const_binop (enum tree_code code, tree arg1, tree
>/* If either operand is a NaN, just return it.  Otherwise, set up
>  for floating-point trap; we return an overflow.  */
>if (REAL_VALUE_ISNAN (d1))
> +  {
> +/* Convert sNaN to qNaN when flag_signaling_nans is off */
> +if (!flag_signaling_nans)
> +  (TREE_REAL_CST_PTR (arg1))->signalling = 0;

As REAL_CSTs can be shared you need to build a new one, you can't
modify it in place.

I'll leave the correctness part of the patch to Joseph who knows FP
arithmetic better than me,
implementation-wise this is ok if you fix the REAL_CST sharing issue.

Thanks,
Richard.
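
The sharing concern can be sketched outside GCC as follows. This is an analogy
under stated assumptions, not GCC code: real_value, const_node and quieten are
hypothetical stand-ins for a real-valued constant, a shared constant node and
the requested "build a new one" step.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a real-valued constant; not REAL_VALUE_TYPE.
struct real_value
{
  double payload;
  bool signalling;
};

// Hypothetical stand-in for a constant node that several users may share.
using const_node = std::shared_ptr<const real_value>;

// Returns a node whose value is quiet.  If the input is already quiet the
// shared node is reused; otherwise a fresh node is built and the original
// is left untouched for its other users.
const_node quieten (const const_node &n)
{
  if (!n->signalling)
    return n;
  real_value copy = *n;
  copy.signalling = false;
  return std::make_shared<const real_value> (copy);
}
```

Modifying the shared node in place would silently change the value seen by
every other expression that refers to the same constant.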

> return arg1;
> +  }
>else if (REAL_VALUE_ISNAN (d2))
> +  {
> +/* Convert sNaN to qNaN when flag_signaling_nans is off */
> +if (!flag_signaling_nans)
> +  (TREE_REAL_CST_PTR (arg1))->signalling = 0;
> return arg2;
> +  }
>
>inexact = real_arithmetic (&value, code, &d1, &d2);
>real_convert (&result, mode, &value);
> @@ -13644,6 +13654,9 @@ fold_abs_const (tree arg0, tree type)
> t = build_real (type, real_value_negate (&TREE_REAL_CST (arg0)));
>else
> t =  arg0;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +(TREE_REAL_CST_PTR (t))->signalling = 0;
>break;
>
>  default:
> Index: gcc/real.c
> ===
> --- gcc/real.c  (revision 227584)
> +++ gcc/real.c  (working copy)
> @@ -545,6 +545,9 @@ do_add (REAL_VALUE_TYPE *r, const REAL_VALUE_TYPE
>  case CLASS2 (rvc_normal, rvc_inf):
>/* R + Inf = Inf.  */
>*r = *b;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>r->sign = sign ^ subtract_p;
>return false;
>
> @@ -558,6 +561,9 @@ do_add (REAL_VALUE_TYPE *r, const REAL_VALUE_TYPE
>  case CLASS2 (rvc_inf, rvc_normal):
>/* Inf + R = Inf.  */
>*r = *a;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>return false;
>
>  case CLASS2 (rvc_inf, rvc_inf):
> @@ -680,6 +686,9 @@ do_multiply (REAL_VALUE_TYPE *r, const REAL_VALUE_
>  case CLASS2 (rvc_nan, rvc_nan):
>/* ANY * NaN = NaN.  */
>*r = *b;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>r->sign = sign;
>return false;
>
> @@ -688,6 +697,9 @@ do_multiply (REAL_VALUE_TYPE *r, const REAL_VALUE_
>  case CLASS2 (rvc_nan, rvc_inf):
>/* NaN * ANY = NaN.  */
>*r = *a;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>r->sign = sign;
>return false;
>
> @@ -830,6 +842,9 @@ do_divide (REAL_VALUE_TYPE *r, const REAL_VALUE_TY
>  case CLASS2 (rvc_nan, rvc_nan):
>/* ANY / NaN = NaN.  */
>*r = *b;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>r->sign = sign;
>return false;
>
> @@ -838,6 +853,9 @@ do_divide (REAL_VALUE_TYPE *r, const REAL_VALUE_TY
>  case CLASS2 (rvc_nan, rvc_inf):
>/* NaN / ANY = NaN.  */
>*r = *a;
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>r->sign = sign;
>return false;
>
> @@ -968,6 +986,9 @@ do_fix_trunc (REAL_VALUE_TYPE *r, const REAL_VALUE
>  case rvc_zero:
>  case rvc_inf:
>  case rvc_nan:
> +  /* Convert sNaN to qNaN when flag_signaling_nans is off */
> +  if (!flag_signaling_nans)
> +r->signalling = 0;
>break;
>
>  case rvc_normal:
> @@ -1059,6 +1080,11 @@ real_arithmetic (REAL_VALUE_TYPE *r, i

[SH][committed] Fix PR 67061

2015-09-14 Thread Oleg Endo
Hi,

The attached patch fixes PR 67061.
Tested on sh-elf trunk r227682 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

Committed to trunk as r227750.
Will backport to GCC 5 branch later.

Cheers,
Oleg

gcc/ChangeLog:
PR target/67061
* config/sh/sh-protos.h (sh_find_set_of_reg): Simplify for-loop.
Handle call insns.


Index: gcc/config/sh/sh-protos.h
===
--- gcc/config/sh/sh-protos.h	(revision 227749)
+++ gcc/config/sh/sh-protos.h	(working copy)
@@ -192,19 +192,20 @@
   if (!REG_P (reg) || insn == NULL_RTX)
 return result;
 
-  rtx_insn* previnsn = insn;
-
-  for (result.insn = stepfunc (insn); result.insn != NULL_RTX;
-   previnsn = result.insn, result.insn = stepfunc (result.insn))
+  for (rtx_insn* i = stepfunc (insn); i != NULL_RTX; i = stepfunc (i))
 {
-  if (BARRIER_P (result.insn))
+  if (BARRIER_P (i))
 	break;
-  if (!NONJUMP_INSN_P (result.insn))
-	continue;
-  if (reg_set_p (reg, result.insn))
+  if (!INSN_P (i) || DEBUG_INSN_P (i))
+	  continue;
+  if (reg_set_p (reg, i))
 	{
-	  result.set_rtx = set_of (reg, result.insn);
+	  if (CALL_P (i))
+	break;
 
+	  result.insn = i;
+	  result.set_rtx = set_of (reg, i);
+
 	  if (result.set_rtx == NULL_RTX || GET_CODE (result.set_rtx) != SET)
 	break;
 
@@ -226,12 +227,6 @@
 	}
 }
 
-  /* If the loop above stopped at the first insn in the list,
- result.insn will be null.  Use the insn from the previous iteration
- in this case.  */
-  if (result.insn == NULL)
-result.insn = previnsn;
-
   if (result.set_src != NULL)
 gcc_assert (result.insn != NULL && result.set_rtx != NULL);
 

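The control flow of the new loop can be sketched on a toy model. This is only an illustration under stated assumptions, not the code in sh-protos.h: `toy_insn`, `toy_insn_kind` and `toy_find_set` are hypothetical names, the `sets_reg` flag stands in for reg_set_p, and the walk is fixed to go backwards, whereas the real function takes a step function.

```cpp
#include <vector>

// Hypothetical insn kinds for the toy model.
enum toy_insn_kind { TOY_INSN, TOY_CALL, TOY_DEBUG_INSN, TOY_NOTE, TOY_BARRIER };

struct toy_insn
{
  toy_insn_kind kind;
  bool sets_reg;  // stands in for reg_set_p (reg, insn)
};

// Walk backwards from the insn before START and return the index of the
// insn that sets the register, or -1 if the search has to give up.
// Assumes 0 <= START <= insns.size ().
inline int toy_find_set (const std::vector<toy_insn> &insns, int start)
{
  for (int i = start - 1; i >= 0; i--)
    {
      const toy_insn &insn = insns[i];
      if (insn.kind == TOY_BARRIER)
        break;
      // Notes and debug insns are skipped, like !INSN_P || DEBUG_INSN_P.
      if (insn.kind == TOY_NOTE || insn.kind == TOY_DEBUG_INSN)
        continue;
      if (insn.sets_reg)
        {
          // As in the patch, a call that sets the register ends the search
          // without recording a result.
          if (insn.kind == TOY_CALL)
            break;
          return i;
        }
    }
  return -1;
}
```

In this model a failed search reports -1, which loosely corresponds to result.insn staying null now that the previnsn fallback is gone.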

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-14 Thread Bill Schmidt
On Mon, 2015-09-14 at 10:47 +0100, Alan Lawrence wrote:
> On 11/09/15 14:19, Bill Schmidt wrote:
> >
> > A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar
> > that has to be broadcast back to a vector, and the best way to implement
> > it for us already has the max value in all positions of a vector.  But
> > that is something we should be able to fix with simplify-rtx in the back
> > end.
> 
> Reading this thread again, this bit stands out as unaddressed. Yes PowerPC
> can "fix" this with simplify-rtx, but the vector cost model will not take
> this into account - it will think that the broadcast-back-to-a-vector
> requires an extra operation after the reduction, whereas in fact it will
> not.
> 
> Does that suggest we should have a new entry in vect_cost_for_stmt for
> vec_to_scalar-and-back-to-vector (that defaults to
> vec_to_scalar+scalar_to_vec, but on some architectures e.g. PowerPC would
> be the same as vec_to_scalar)?

Ideally I think we need to do something for that, yeah.  The back ends
could try to patch up the cost when finishing costs for the loop body,
epilogue, etc., but that would be somewhat of a guess; it would be
better to just be up-front that we're doing a reduction to a vector.

As part of this, I dislike the term "vec_to_scalar", which is somewhat
vague about what's going on (it sounds like it could mean a vector
extract operation, which is more of an inverse of "scalar_to_vec" than a
reduction is).  GIMPLE calls it a reduction, and the optabs call it a
reduction, so we ought to call it a reduction in the vectorizer cost
model, too.

To cover our bases for PowerPC and AArch32, we probably need:

  plus_reduc_to_scalar
  plus_reduc_to_vector
  minmax_reduc_to_scalar
  minmax_reduc_to_vector

although I think plus_reduc_to_vector wouldn't be used yet, so could be
omitted.  If we go this route, then at that time we would change your
code to use minmax_reduc_to_vector and let the back ends determine
whether that requires a scalar reduction followed by a broadcast, or
whether it would be performed directly.

Using direct reduction to vector for MIN and MAX on PowerPC would be a
big cost savings over scalar reduction/broadcast.
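To make the proposal concrete, here is a rough sketch of how such cost kinds could default and be overridden. It is not GCC code: only the four *_reduc_* names come from the list above, while `toy_cost_kind`, `default_cost`, `powerpc_like_cost` and every numeric cost are made up for illustration.

```cpp
// Hypothetical cost kinds; scalar_to_vec mirrors the existing name, the
// four reduction kinds are the ones proposed in this mail.
enum toy_cost_kind
{
  scalar_to_vec,
  plus_reduc_to_scalar,
  plus_reduc_to_vector,
  minmax_reduc_to_scalar,
  minmax_reduc_to_vector
};

// Generic default with invented unit costs: a reduction to a vector is a
// reduction to a scalar followed by a broadcast.
inline int default_cost (toy_cost_kind kind)
{
  switch (kind)
    {
    case scalar_to_vec:
      return 1;
    case plus_reduc_to_scalar:
    case minmax_reduc_to_scalar:
      return 3;
    case plus_reduc_to_vector:
      return default_cost (plus_reduc_to_scalar) + default_cost (scalar_to_vec);
    case minmax_reduc_to_vector:
      return default_cost (minmax_reduc_to_scalar) + default_cost (scalar_to_vec);
    }
  return 0;
}

// A target whose min/max reduction already leaves the result in every
// vector lane pays no extra broadcast.
inline int powerpc_like_cost (toy_cost_kind kind)
{
  if (kind == minmax_reduc_to_vector)
    return default_cost (minmax_reduc_to_scalar);
  return default_cost (kind);
}
```

With these invented numbers the generic target charges 4 for minmax_reduc_to_vector and the PowerPC-like target charges 3, which is the kind of difference the vectorizer cost model currently cannot see.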

Thanks,
Bill

> 
> (I agree that if that's the limit of how "different" conditional reductions
> may be between architectures, then we should not have a vec_cost_for_stmt
> for a whole conditional reduction.)
> 
> Cheers, Alan
> 




[gomp4.1] Small doacross fixes and improvements

2015-09-14 Thread Jakub Jelinek
Hi!

This patch makes sure that when we remove all depend clauses from a
construct (e.g. because all of them refer to invalid iterations), we don't
leave an ordered construct without clauses behind (because that means omp
ordered threads).  Furthermore, it attempts to optimize adjacent
#pragma omp ordered depend(sink:...) directives, and it makes sure we don't
lower them to any code; the goal is to do something about them only during
expansion.
The patch also treats them as stand-alone directives for the construction
of omp regions and edges in the cfg (so we don't emit GIMPLE_OMP_RETURN
for those).
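The merging of adjacent directives can be pictured with a small stand-alone model. This is a sketch only, assuming a flat statement list; `toy_stmt`, `toy_kind` and `toy_merge_adjacent_sinks` are hypothetical and use none of the GIMPLE iterator API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical statement kinds for the toy model.
enum toy_kind { TOY_ORDERED_SINK, TOY_DEBUG, TOY_NOP, TOY_OTHER };

struct toy_stmt
{
  toy_kind kind;
  std::vector<std::string> clauses;  // only meaningful for TOY_ORDERED_SINK
};

// Fold every ordered depend(sink) statement that follows STMTS[FIRST],
// separated from it only by debug statements or nops, into STMTS[FIRST].
inline void toy_merge_adjacent_sinks (std::vector<toy_stmt> &stmts,
                                      std::size_t first)
{
  if (first >= stmts.size () || stmts[first].kind != TOY_ORDERED_SINK)
    return;
  std::size_t i = first + 1;
  while (i < stmts.size ())
    {
      if (stmts[i].kind == TOY_DEBUG || stmts[i].kind == TOY_NOP)
        {
          i++;
          continue;
        }
      if (stmts[i].kind != TOY_ORDERED_SINK)
        break;
      // Chain the clauses onto the first directive, then remove the
      // statement they came from.
      stmts[first].clauses.insert (stmts[first].clauses.end (),
                                   stmts[i].clauses.begin (),
                                   stmts[i].clauses.end ());
      stmts.erase (stmts.begin () + i);
    }
}
```

Any other statement between two directives stops the merge, so a later depend(sink:...) after unrelated code stays a separate directive.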

2015-09-14  Jakub Jelinek  

* omp-low.c (lower_omp_ordered_clauses): Add GSI_P argument.
Use XALLOCAVEC, don't check for gimple_omp_for_collapse 0.
Merge adjacent ordered depend(sink:) statements.  If removing
all depend clauses, replace also the ordered stmt with GIMPLE_NOP.
(lower_omp_ordered): Call lower_omp_ordered_clauses only if depend
clause is present.  Don't do anything else after that call.
(build_omp_regions_1): Treat GIMPLE_OMP_ORDERED with depend clause
as stand-alone directive.
(make_gimple_omp_edges): Likewise.

--- gcc/omp-low.c.jj2015-09-10 14:46:48.0 +0200
+++ gcc/omp-low.c   2015-09-14 15:00:15.713086562 +0200
@@ -11185,6 +11185,13 @@ build_omp_regions_1 (basic_block bb, str
  gcc_unreachable ();
}
}
+ else if (code == GIMPLE_OMP_ORDERED
+  && find_omp_clause (gimple_omp_ordered_clauses
+(as_a <gomp_ordered *> (stmt)),
+  OMP_CLAUSE_DEPEND))
+   /* #pragma omp ordered depend is also just a stand-alone
+  directive.  */
+   region = NULL;
  /* ..., this directive becomes the parent for a new region.  */
  if (region)
parent = region;
@@ -12110,22 +12117,54 @@ lower_omp_taskgroup (gimple_stmt_iterato
 /* Fold the OMP_ORDERED_CLAUSES for the OMP_ORDERED in STMT if possible.  */
 
 static void
-lower_omp_ordered_clauses (gomp_ordered *ord_stmt, omp_context *ctx)
+lower_omp_ordered_clauses (gimple_stmt_iterator *gsi_p, gomp_ordered *ord_stmt,
+  omp_context *ctx)
 {
   struct omp_for_data fd;
   if (!ctx->outer || gimple_code (ctx->outer->stmt) != GIMPLE_OMP_FOR)
 return;
 
   unsigned int len = gimple_omp_for_collapse (ctx->outer->stmt);
-  if (!len)
-return;
-  struct omp_for_data_loop *loops
-= (struct omp_for_data_loop *)
-alloca (len * sizeof (struct omp_for_data_loop));
+  struct omp_for_data_loop *loops = XALLOCAVEC (struct omp_for_data_loop, len);
   extract_omp_for_data (as_a <gomp_for *> (ctx->outer->stmt), &fd, loops);
   if (!fd.ordered)
 return;
 
+  tree *list_p = gimple_omp_ordered_clauses_ptr (ord_stmt);
+  tree c = gimple_omp_ordered_clauses (ord_stmt);
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
+  && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
+{
+  /* Merge depend clauses from multiple adjacent
+#pragma omp ordered depend(sink:...) constructs
+into one #pragma omp ordered depend(sink:...), so that
+we can optimize them together.  */
+  gimple_stmt_iterator gsi = *gsi_p;
+  gsi_next (&gsi);
+  while (!gsi_end_p (gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ if (is_gimple_debug (stmt)
+ || gimple_code (stmt) == GIMPLE_NOP)
+   {
+ gsi_next (&gsi);
+ continue;
+   }
+ if (gimple_code (stmt) != GIMPLE_OMP_ORDERED)
+   break;
+ gomp_ordered *ord_stmt2 = as_a <gomp_ordered *> (stmt);
+ c = gimple_omp_ordered_clauses (ord_stmt2);
+ if (c == NULL_TREE
+ || OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
+ || OMP_CLAUSE_DEPEND_KIND (c) != OMP_CLAUSE_DEPEND_SINK)
+   break;
+ while (*list_p)
+   list_p = &OMP_CLAUSE_CHAIN (*list_p);
+ *list_p = c;
+ gsi_remove (&gsi, true);
+   }
+}
+
   /* Canonicalize sink dependence clauses into one folded clause if
  possible.
 
@@ -12183,8 +1,7 @@ lower_omp_ordered_clauses (gomp_ordered
   tree *iter_vars = (tree *) alloca (sizeof (tree) * len);
   memset (iter_vars, 0, sizeof (tree) * len);
 
-  tree *list_p = gimple_omp_ordered_clauses_ptr (ord_stmt);
-  tree c;
+  list_p = gimple_omp_ordered_clauses_ptr (ord_stmt);
   unsigned int i;
   while ((c = *list_p) != NULL)
 {
@@ -12317,6 +12355,11 @@ lower_omp_ordered_clauses (gomp_ordered
  lower_omp_ordered_ret:
   sbitmap_free (folded_deps_used);
   folded_deps.release ();
+
+  /* Ordered without clauses is #pragma omp threads, while we want
+ a nop instead if we remove all clauses.  */
+  if (gimple_omp_ordered_clauses (ord_stmt) == NULL_TREE)
+gsi_replace (gsi_p, gimple_build_nop (), true);
 }
 
 
@@ -12333,7 +12376,12 @@ lower_omp_ordered (gimple_stmt_iterator
   bool

Re: [PATCH][PR67476] Add param parloops-schedule

2015-09-14 Thread Tom de Vries

On 14/09/15 11:09, Jakub Jelinek wrote:

On Fri, Sep 11, 2015 at 03:28:01PM +0200, Tom de Vries wrote:

On 11/09/15 12:57, Jakub Jelinek wrote:

On Fri, Sep 11, 2015 at 12:55:00PM +0200, Tom de Vries wrote:

Hi,

this patch adds a param parloops-schedule=<0-4>, which sets the omp schedule
for loops parallelized by parloops.

The <0-4> maps onto .

Bootstrapped and reg-tested on x86_64.

OK for trunk?

I don't really like it, the mapping of the integers to the enum values
is non-obvious and hard to remember.
Perhaps add support for enumeration params if you want this instead?



This patch adds handling of a DEFPARAMENUM macro, which is similar to the
DEFPARAM macro, but allows the values to be named.

So the definition of param parloops-schedule becomes:
...
DEFPARAMENUM (PARAM_PARLOOPS_SCHEDULE,
  "parloops-schedule",
  "Schedule type of omp schedule for loops parallelized by "
  "parloops (static, dynamic, guided, auto, runtime)",
  0, 0, 4, "static", "dynamic", "guided", "auto", "runtime")
...
[ I'll repost the original patch containing this update. ]

OK for trunk if x86_64 bootstrap and reg-test succeed?


That still allows numeric arguments for the param, which is IMHO
undesirable.  If it is enum kind, only the enum values should be accepted.


Fixed.


Also, it would be nice if params.h in that case would define an enum with
the values like
PARAM_PARLOOPS_SCHEDULE_KIND_{static,dynamic,guided,auto,runtime}, so use
values not wrapped in ""s, and have a macro or generator produce both the
enum and the string array from them.


Done, there's now a file params-enum.h containing these enums.


There is also the question of whether we can use __VA_ARGS__; isn't that a
C99 or C++11 and later feature?  I see gengtype.h and ipa-icf-gimple.h use
that too, so maybe yes, but I am not sure.



I've removed the use of variadic macros, meaning we now use 
DEFPARAMENUM5 instead of DEFPARAMENUM.


Also, I've removed the min/max arguments in DEFPARAMENUM5.

And I've ensured that the default is now specified as a string rather 
than an integer.


So the new definition of PARAM_PARLOOPS_SCHEDULE looks like:

DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
  "parloops-schedule",
  "Schedule type of omp schedule for loops parallelized by "
  "parloops (static, dynamic, guided, auto, runtime)",
  static,
  static, dynamic, guided, auto, runtime)

[ Again, I'll repost the original patch containing this update. ]
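The idea of deriving both the enum and the string array from one list of unquoted names, without variadic macros, can be sketched as follows. This is an illustration under assumptions, not the contents of params-enum.h or params-list.h: `TOY_SCHEDULE_VALUES`, `toy_schedule_kind`, `toy_schedule_names` and `toy_lookup_schedule` are hypothetical names.

```cpp
#include <cstring>

// One list of unquoted value names (hypothetical macro, not from the patch).
#define TOY_SCHEDULE_VALUES(X) \
  X (static)                   \
  X (dynamic)                  \
  X (guided)                   \
  X (auto)                     \
  X (runtime)

// Expand the list once into enumerators ...
#define TOY_ENUM(name) TOY_SCHEDULE_KIND_##name,
enum toy_schedule_kind
{
  TOY_SCHEDULE_VALUES (TOY_ENUM)
  TOY_SCHEDULE_KIND_last
};
#undef TOY_ENUM

// ... and once into the matching string table.
#define TOY_STRING(name) #name,
static const char *const toy_schedule_names[] =
{
  TOY_SCHEDULE_VALUES (TOY_STRING)
};
#undef TOY_STRING

// Return the enumerator for NAME, or -1 if NAME is not one of the listed
// values, so a numeric string such as "2" is rejected.
inline int toy_lookup_schedule (const char *name)
{
  for (int i = 0; i < TOY_SCHEDULE_KIND_last; i++)
    if (std::strcmp (name, toy_schedule_names[i]) == 0)
      return i;
  return -1;
}
```

Because `static` and `auto` are only pasted and stringized, never used as bare tokens, the keyword names cause no trouble in the preprocessor.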

This patch adds support for DEFPARAMENUM5.

OK for trunk, if bootstrap and reg-test on x86_64 succeed?

Thanks,
- Tom

Support DEFPARAMENUM in params.def

2015-09-11  Tom de Vries  

	* Makefile.in (PARAMS_H, PLUGIN_HEADERS): Add params-enum.h.
	* params-enum.h: New file.
	* opts.c (handle_param): Handle case that param arg is a string.
	* params-list.h: Handle DEFPARAMENUM5 in params.def.
	* params.c (find_param): New function, factored out of ...
	(set_param_value): ... here.
	(param_string_value_p): New function.
	* params.h (struct param_info): Add value_names field.
	(find_param, param_string_value_p): Declare.
---
 gcc/Makefile.in   |  6 ++--
 gcc/opts.c| 19 +++
 gcc/params-enum.h | 39 ++
 gcc/params-list.h |  3 ++
 gcc/params.c  | 97 ++-
 gcc/params.h  |  6 
 6 files changed, 138 insertions(+), 32 deletions(-)
 create mode 100644 gcc/params-enum.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b495bd2..b825785 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -890,7 +890,7 @@ RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
 FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h
 RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h
 READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h
-PARAMS_H = params.h params.def
+PARAMS_H = params.h params-enum.h params.def
 BUILTINS_DEF = builtins.def sync-builtins.def omp-builtins.def \
 	gtm-builtins.def sanitizer.def cilkplus.def cilk-builtins.def
 INTERNAL_FN_DEF = internal-fn.def
@@ -3254,8 +3254,8 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   tree-iterator.h $(PLUGIN_H) $(TREE_SSA_H) langhooks.h incpath.h debug.h \
   $(EXCEPT_H) tree-ssa-sccvn.h real.h output.h $(IPA_UTILS_H) \
   $(C_PRAGMA_H)  $(CPPLIB_H)  $(FUNCTION_H) \
-  cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
-  $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
+  cppdefault.h flags.h $(MD5_H) params.def params.h params-enum.h \
+  prefix.h tree-inline.h $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
   version.h stringpool.h gimplify.h gimple-iterator.h gimple-ssa.h \
   fold-const.h tree-cfg.h tree-into-ssa.h tree-ssanames.h print-tree.h \
diff --git a/gcc/opts.c b/gcc/opts.c
index f1a9acd..3349aaf 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2116,14 +2116,21 @@ handle_param (struct gcc_

Re: [PATCH][PR67476] Add param parloops-schedule

2015-09-14 Thread Tom de Vries

On 14/09/15 16:28, Tom de Vries wrote:

On 14/09/15 11:09, Jakub Jelinek wrote:

On Fri, Sep 11, 2015 at 03:28:01PM +0200, Tom de Vries wrote:

On 11/09/15 12:57, Jakub Jelinek wrote:

On Fri, Sep 11, 2015 at 12:55:00PM +0200, Tom de Vries wrote:

Hi,

this patch adds a param parloops-schedule=<0-4>, which sets the
omp schedule
for loops parallelized by parloops.

The <0-4> maps onto .

Bootstrapped and reg-tested on x86_64.

OK for trunk?

I don't really like it, the mapping of the integers to the enum values
is non-obvious and hard to remember.
Perhaps add support for enumeration params if you want this instead?



This patch adds handling of a DEFPARAMENUM macro, which is similar to
the
DEFPARAM macro, but allows the values to be named.

So the definition of param parloops-schedule becomes:
...
DEFPARAMENUM (PARAM_PARLOOPS_SCHEDULE,
  "parloops-schedule",
  "Schedule type of omp schedule for loops parallelized by "
  "parloops (static, dynamic, guided, auto, runtime)",
  0, 0, 4, "static", "dynamic", "guided", "auto", "runtime")
...
[ I'll repost the original patch containing this update. ]

OK for trunk if x86_64 bootstrap and reg-test succeed?


That still allows numeric arguments for the param, which is IMHO
undesirable.  If it is enum kind, only the enum values should be
accepted.


Fixed.


Also, it would be nice if params.h in that case would define an enum with
the values like
PARAM_PARLOOPS_SCHEDULE_KIND_{static,dynamic,guided,auto,runtime}, so use
values not wrapped in ""s, and have a macro or generator produce both the
enum and the string array from them.


Done, there's now a file params-enum.h containing these enums.


There is also the question of whether we can use __VA_ARGS__; isn't that a
C99 or C++11 and later feature?  I see gengtype.h and ipa-icf-gimple.h use
that too, so maybe yes, but I am not sure.



I've removed the use of variadic macros, meaning we now use
DEFPARAMENUM5 instead of DEFPARAMENUM.

Also, I've removed the min/max arguments in DEFPARAMENUM5.

And I've ensured that the default is now specified as a string rather
than an integer.

So the new definition of PARAM_PARLOOPS_SCHEDULE looks like:

DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
   "parloops-schedule",
   "Schedule type of omp schedule for loops parallelized by "
   "parloops (static, dynamic, guided, auto, runtime)",
   static,
   static, dynamic, guided, auto, runtime)

[ Again, I'll repost the original patch containing this update. ]


This patch adds param parloops-schedule (and now uses the enum 
associated with the param).


OK for trunk, if bootstrap and reg-test on x86_64 succeed?

Thanks,
- Tom

Add param parloops-schedule

2015-09-10  Tom de Vries  

	PR tree-optimization/67476
	* doc/invoke.texi (@item parloops-schedule): New item.
	* omp-low.c (expand_omp_for_generic): Handle simple latch.  Add missing
	phis.  Handle original loop.
	* params.def (PARAM_PARLOOPS_SCHEDULE): New DEFPARAMENUM5.
	* tree-parloops.c: Include params-enum.h.
	(create_parallel_loop): Handle PARAM_PARLOOPS_SCHEDULE.

	* testsuite/libgomp.c/autopar-3.c: New test.
	* testsuite/libgomp.c/autopar-4.c: New test.
	* testsuite/libgomp.c/autopar-5.c: New test.
	* testsuite/libgomp.c/autopar-6.c: New test.
	* testsuite/libgomp.c/autopar-7.c: New test.
	* testsuite/libgomp.c/autopar-8.c: New test.
---
 gcc/doc/invoke.texi |  4 +++
 gcc/omp-low.c   | 57 +++--
 gcc/params.def  | 12 +++
 gcc/tree-parloops.c | 26 ++-
 libgomp/testsuite/libgomp.c/autopar-3.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-4.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-5.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-6.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-7.c |  4 +++
 libgomp/testsuite/libgomp.c/autopar-8.c |  4 +++
 10 files changed, 119 insertions(+), 4 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-7.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-8.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 76e5e29..2221795 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11005,6 +11005,10 @@ automaton.  The default is 50.
 Chunk size of omp schedule for loops parallelized by parloops.  The default
 is 0.
 
+@item parloops-schedule
+Schedule type of omp schedule for loops parallelized by parloops (static,
+dynamic, guided, auto, runtime).  The default is static.
+
 @end table
 @end table
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 88a5149..4f0498b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -239,6 +239,7 @@ static vec taskreg_conte

Re: [C++ Patch] PR 51911 V2 ("G++ accepts new auto { list }")

2015-09-14 Thread Jason Merrill

OK.

Jason


[gomp4] Fix handling of declare'd variable.

2015-09-14 Thread James Norris

Hi,

The attached patch fixes an issue where a declare'd variable,
with the link clause, wasn't marked as offloadable.

Committed after regtesting on x86_64.

Thanks!
Jim
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/declare-4.c
===
--- libgomp/testsuite/libgomp.oacc-c-c++-common/declare-4.c	(revision 227748)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/declare-4.c	(working copy)
@@ -6,7 +6,16 @@
 float b;
 #pragma acc declare link (b)
 
+#pragma acc routine
 int
+func (int a)
+{
+  b = a + 1;
+
+  return b;
+}
+
+int
 main (int argc, char **argv)
 {
   float a;
@@ -23,5 +32,10 @@ main (int argc, char **argv)
   if (a != 3.0)
 abort ();
 
+  a = func (a);
+
+  if (a != 4.0)
+abort ();
+
   return 0;
 }
Index: gcc/varpool.c
===
--- gcc/varpool.c	(revision 227748)
+++ gcc/varpool.c	(working copy)
@@ -173,24 +173,8 @@ make_offloadable (varpool_node *node, tree decl)
   attrs = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (decl));
   if (attrs)
 {
-  tree *t;
-  int total = 0, skip = 0;
+  make_offloadable_1 (node, decl);
 
-  gcc_assert (&TREE_VALUE (attrs));
-
-  for (t = &TREE_VALUE (attrs); *t; t = &TREE_CHAIN (*t))
-	{
-	  HOST_WIDE_INT kind = OMP_CLAUSE_MAP_KIND (TREE_VALUE (*t));
-
-	  total++;
-
-	  if (kind == GOMP_MAP_LINK)
-	skip++;
-	}
-
-  if (total - skip > 0)
-	make_offloadable_1 (node, decl);
-
   DECL_ATTRIBUTES (decl)
 	  = remove_attribute ("oacc declare", DECL_ATTRIBUTES (decl));
 }


Re: [PATCH] Teach genmatch.c to generate single-use restrictions from flags

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 11:23:28 AM GMT+02:00, Richard Biener  
wrote:
>On Fri, 11 Sep 2015, Bernd Schmidt wrote:
>
>> On 07/08/2015 04:39 PM, Richard Biener wrote:
>> > 
>> > This introduces a :s flag to match expressions which enforces
>> > the expression to have a single-use if(!) the simplified
>> > expression is larger than one statement.
>> 
>> This seems to be missing documentation in match-and-simplify.texi.
>
>Fixed as follows, built and inspected .info and .pdf on x86_64-linux,
>applied.
>
>Richard.
>
>2015-09-14  Richard Biener  
>
>   * doc/match-and-simplify.texi: Fixup some formatting issues
>   and document the 's' flag.
>
>Index: gcc/doc/match-and-simplify.texi
>===
>--- gcc/doc/match-and-simplify.texi(revision 227737)
>+++ gcc/doc/match-and-simplify.texi(working copy)
>@@ -186,20 +186,36 @@ preprocessor directives.
>   (bit_and @@1 @@0))
> @end smallexample
> 
>-Here we introduce flags on match expressions.  There is currently
>-a single flag, @code{c}, which denotes that the expression should
>+Here we introduce flags on match expressions.  There used flag

s/There used flag/The flag used/

Thanks,

>+above, @code{c}, denotes that the expression should
> be also matched commutated.  Thus the above match expression
> is really the following four match expressions:
> 
>+@smallexample
>   (bit_and integral_op_p@@0 (bit_ior (bit_not @@0) @@1))
>   (bit_and (bit_ior (bit_not @@0) @@1) integral_op_p@@0)
>   (bit_and integral_op_p@@0 (bit_ior @@1 (bit_not @@0)))
>   (bit_and (bit_ior @@1 (bit_not @@0)) integral_op_p@@0)
>+@end smallexample
> 
> Usual canonicalizations you know from GENERIC expressions are
> applied before matching, so for example constant operands always
> come second in commutative expressions.
> 
>+The second supported flag is @code{s} which tells the code
>+generator to fail the pattern if the expression marked with
>+@code{s} does have more than one use.  For example in
>+
>+@smallexample
>+(simplify
>+  (pointer_plus (pointer_plus:s @@0 @@1) @@3)
>+  (pointer_plus @@0 (plus @@1 @@3)))
>+@end smallexample
>+
>+this avoids the association if @code{(pointer_plus @@0 @@1)} is
>+used outside of the matched expression and thus it would stay
>+live and not trivially removed by dead code elimination.
>+
> More features exist to avoid too much repetition.
> 
> @smallexample
>@@ -291,17 +307,17 @@ with a @code{?}:
> 
> @smallexample
> (simplify
>- (eq (convert@@0 @@1) (convert? @@2))
>+ (eq (convert@@0 @@1) (convert@? @@2))
>  (eq @@1 (convert @@2)))
> @end smallexample
> 
> which will match both @code{(eq (convert @@1) (convert @@2))} and
> @code{(eq (convert @@1) @@2)}.  The optional converts are supposed
> to be all either present or not, thus
>-@code{(eq (convert? @@1) (convert? @@2))} will result in two
>+@code{(eq (convert@? @@1) (convert@? @@2))} will result in two
> patterns only.  If you want to match all four combinations you
> have access to two additional conditional converts as in
>-@code{(eq (convert1? @@1) (convert2? @@2))}.
>+@code{(eq (convert1@? @@1) (convert2@? @@2))}.
> 
> Predicates available from the GCC middle-end need to be made
> available explicitely via @code{define_predicates}:




Re: [PATCH 1/4] [ARM] Add attribute/pragma target fpu=

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 11:36:13 AM GMT+02:00, Christian Bruel 
 wrote:
>Hi,
>
>This patch moves the FPU flags settings and checks to the attribute
>hooks, e.g. neon must be checked in arm_option_check_internal. FPU name
>is emitted before each function and arm_fpu_index is Saved to be handled
>by the cl_target_option_save/restore functions.

diff -ruN gnu_trunk.p0/gcc/gcc/config/arm/arm.c gnu_trunk.p1/gcc/gcc/config/arm/arm.c
--- gnu_trunk.p0/gcc/gcc/config/arm/arm.c   2015-09-10 14:27:28.931847219 +0200
+++ gnu_trunk.p1/gcc/gcc/config/arm/arm.c   2015-09-11 14:53:45.388771537 +0200
@@ -2713,6 +2713,12 @@
 arm_option_check_internal (struct gcc_options *opts)
 {
   int flags = opts->x_target_flags;
+  const struct arm_fpu_desc *fpu_desc = &all_fpus[opts->x_arm_fpu_index];
+
+  /* iWMMXt and NEON are incompatible.  */
+if (TARGET_IWMMXT && TARGET_VFP  
+  && ARM_FPU_FSET_HAS (fpu_desc->features, FPU_FL_NEON))
+error ("iWMMXt `and NEON are incompatible");

Maybe there is just dirt on my screen or my graphic-memory is broken, but I see 
an odd character between iWMMXt and and?

I keep forgetting if it's a capital W or a lowercase one but you'll know.
Thanks,




Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target

2015-09-14 Thread Uros Bizjak
On Mon, Sep 14, 2015 at 2:03 PM, Ilya Enkovich  wrote:
> On 09 Sep 10:20, Uros Bizjak wrote:
>> On Wed, Sep 9, 2015 at 10:12 AM, Uros Bizjak  wrote:
>> > On Tue, Sep 8, 2015 at 5:49 PM, Ilya Enkovich  
>> > wrote:
>> >
>> > Please make new changes to insn patterns depend on TARGET_STV. This way,
>> > non-STV compiles will behave exactly as now.
>> >
>> > +;; Math-dependant integer modes with DImode.
>> > +(define_mode_iterator SWIM1248x [(QI "TARGET_QIMODE_MATH")
>> > +   (HI "TARGET_HIMODE_MATH")
>> > +   SI DI])
>> > +
>> >
>> > DI should depend on TARGET_STV && TARGET_SSE2
>> >
>> > @@ -2093,9 +2098,9 @@
>> >
>> >  (define_insn "*movdi_internal"
>> >[(set (match_operand:DI 0 "nonimmediate_operand"
>> > -"=r  ,o  ,r,r  ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m ,?r
>> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m")
>> > +"=r  ,o  ,r,r  ,r,m ,*y,*y,?*y,?m,?r ,?*Ym,*v,*v,*v,m,?r
>> > ,?r,?*Yi,?*Ym,?*Yi,*k,*k ,*r ,*m")
>> >   (match_operand:DI 1 "general_operand"
>> > -"riFo,riF,Z,rem,i,re,C ,*y,m  ,*y,*Yn,r   ,C ,*v,m ,*v,*Yj,*v,r
>> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))]
>> > +"riFo,riF,Z,rem,i,re,C ,*y,m  ,*y,*Yn,r   ,C ,*v,m ,v,*Yj,*v,r
>> > ,*Yj ,*Yn ,*r ,*km,*k,*k"))]
>> >"!(MEM_P (operands[0]) && MEM_P (operands[1]))"
>> >  {
>> >
>> > Please add new alternative and use enabled attribute to conditionally
>> > select correct alternative. Preferably, the new alternative should be
>> > just after the one it changes, so you will have to change many of the
>> > alternative's numbers in attribute calculations.
>> >
>> > +(define_insn_and_split "*anddi3_doubleword"
>> > +  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r")
>> > + (and:DI
>> > + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
>> > + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
>> > +   (clobber (reg:CC FLAGS_REG))]
>> > +  "!TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)"
>> > +  "#"
>> > +  "!TARGET_64BIT && reload_completed"
>> > +  [(parallel [(set (match_dup 0)
>> >
>> > You should add TARGET_STV && TARGET_SSE2 in the above and other added 
>> > patterns.
>>
>> (I pushed "send" too fast here ;) )
>>
>> Please note that you can use "&& ..." in the split condition to avoid
>> duplication with insn condition.
>>
>> + if (TARGET_SSE4_1)
>> +  {
>> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
>> + CONST0_RTX (V4SImode),
>> + gen_rtx_SUBREG (SImode, reg, 0)));
>> +emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
>> +  gen_rtx_SUBREG (V4SImode, vreg, 0),
>> +  gen_rtx_SUBREG (SImode, reg, 4),
>> +  GEN_INT (2)));
>> +  }
>> + else if (TARGET_INTER_UNIT_MOVES_TO_VEC)
>> +  {
>> +rtx tmp = gen_reg_rtx (DImode);
>> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
>> + CONST0_RTX (V4SImode),
>> + gen_rtx_SUBREG (SImode, reg, 0)));
>> +emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
>> + CONST0_RTX (V4SImode),
>> + gen_rtx_SUBREG (SImode, reg, 4)));
>> +emit_insn (gen_vec_interleave_lowv4si
>> +   (gen_rtx_SUBREG (V4SImode, vreg, 0),
>> + gen_rtx_SUBREG (V4SImode, vreg, 0),
>> + gen_rtx_SUBREG (V4SImode, tmp, 0)));
>> +  }
>> + else
>> +  {
>> +rtx tmp = assign_386_stack_local (DImode, SLOT_TEMP);
>> +emit_move_insn (adjust_address (tmp, SImode, 0),
>> +gen_rtx_SUBREG (SImode, reg, 0));
>> +emit_move_insn (adjust_address (tmp, SImode, 4),
>> +gen_rtx_SUBREG (SImode, reg, 4));
>> +emit_move_insn (vreg, tmp);
>> +  }
>>
>> As a future cleanup idea, maybe we should reimplement the above code
>> as an expander and use it in several places. IIRC, there are already
>> several places in the code that would benefit from it.
>>
>> Uros.
>
> Hi Uros!
>
> Thanks a lot for your review!  I fixed my patch according to your comments.  
> Everything is under stv target now and should be transparent for non-stv 
> targets.  Bootstrapped and regtested for x86_64-unknown-linux-gnu.  Does it 
> look OK?

The patch addresses all my concerns, I have only one additional small
change request below:

+(define_insn_and_split "*zext_doubleword"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI (match_operand:SWI24 1 "nonimmediate_operand" "rm")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
+  "#"
+  "&& reload_completed && GENERAL_REG_P (operands[0])"
+  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
+   (set (match_dup 2) (const_int 0))]
+  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
+
+(define_insn_and_split "*zextqi_doubleword"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "qm")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
+  "#"
+  "&& reload_completed && GENERAL_REG_P (operands[0])"
+  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
+   (set (match_dup 2) (const_int 0))]
+  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
+

Please p

Re: [PATCH][PR67476] Add param parloops-schedule

2015-09-14 Thread Tom de Vries

On 14/09/15 10:52, Bernd Schmidt wrote:

I'm curious why
this would be a param rather than a -f option.


Hi Bernd,

parloops-chunk-size is also a param, so I think it would make sense to 
have parloops-schedule as a param as well.

[ So, in order for parloops to generate:
#pragma omp for schedule(dynamic,100)
  we specify on the command line:
--param parloops-schedule=dynamic --param parloops-chunk-size=100 ]
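For illustration, the NAME=VALUE form of such a param argument can be taken apart as below. This is a minimal sketch and not what opts.c's handle_param does; `toy_split_param` is a hypothetical helper.

```cpp
#include <string>

// Split a param argument of the form NAME=VALUE.  Returns false if there
// is no '=' or if either side of it is empty.
inline bool toy_split_param (const std::string &arg, std::string &name,
                             std::string &value)
{
  std::string::size_type eq = arg.find ('=');
  if (eq == std::string::npos || eq == 0 || eq + 1 == arg.size ())
    return false;
  name = arg.substr (0, eq);
  value = arg.substr (eq + 1);
  return true;
}
```

For the command line above this yields the pairs (parloops-schedule, dynamic) and (parloops-chunk-size, 100); the value of the first pair would then be checked against the param's named values.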

But in general, I don't really know how I should choose between:
* --param parm=, and
* -fparm=.

Thanks,
- Tom


Re: [PATCH] fix TLS support detection for sh targets

2015-09-14 Thread Rich Felker
On Mon, Sep 14, 2015 at 07:08:33AM +0900, Kaz Kojima wrote:
> Rich Felker  wrote:
> > I'm pretty sure this will still apply to trunk, but I can check that
> > and add the changelog entry. Is there something I should read on the
> > form or just follow the example from my last patch where you added it?
> 
> The latter would be enough for this, though
> 
> https://gcc.gnu.org/wiki/ChangeLog
> https://sourceware.org/gdb/wiki/ContributionChecklist#Properly_Formatted_GNU_ChangeLog
> 
> will be handy instructions.

Thanks. I confirmed that the patch as submitted applies cleanly to
trunk. For the ChangeLog message, do I need to list both configure and
configure.ac or just the latter? And should configure be included in
the patch like I did, or regenerated when the patch is applied?

Rich


Re: [PATCH] fix TLS support detection for sh targets

2015-09-14 Thread Szabolcs Nagy

On 14/09/15 17:58, Rich Felker wrote:

trunk. For the ChangeLog message, do I need to list both configure and
configure.ac or just the latter? And should configure be included in
the patch like I did, or regenerated when the patch is applied?


list both

i think it's ok to fix configure manually



Re: [PATCH] Convert SPARC to LRA

2015-09-14 Thread Richard Henderson
On 09/12/2015 10:44 PM, David Miller wrote:
> From: Eric Botcazou 
> Date: Sat, 12 Sep 2015 16:04:09 +0200
> 
>>> Richard, Eric, any objections?
>>
>> Do we really need to promote to 64-bit if TARGET_ARCH64?  Most 32-bit 
>> instructions are still available.  Otherwise this looks good to me.
> 
> No, we don't, we can just promote to 32-bit.  I'll make that adjustment
> and update the backends page as well.

There's a possibility of benefit though -- br and movr only work with DImode.
You may want to examine the generated code to decide one way or another.

It's possible that the extra comparison instructions don't really matter
compared with the larger spill slot, but you never know...


r~


[PATCH, testsuite]: Also scan for $loopfn

2015-09-14 Thread Uros Bizjak
Hello!

tree-parloops pass names loop functions as $loopfn. These names are
later renamed for NO_DOLLAR_IN_LABEL targets to _loopfn.
alpha-linux-gnu is not in this group, so it fails

FAIL: gcc.dg/gomp/notify-new-function-3.c scan-tree-dump-times
ompexpssa "Added new ssa gimple function foo._loopfn.0 to
callgraph" 1

Attached patch changes a couple of scan patterns to also handle
dollars in loopfn names.
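
As a rough illustration of what the adjusted scan pattern accepts, here is a
minimal sketch. It uses std::regex as a stand-in for the Tcl regular
expressions that DejaGnu actually evaluates, and the helper name
mentions_loopfn is hypothetical; it is not part of the patch.

```cpp
#include <regex>
#include <string>

// Returns true if `dump_line` mentions the outlined loop function under
// either spelling: "foo.$loopfn.0" or "foo._loopfn.0".
bool mentions_loopfn(const std::string& dump_line)
{
  // Hypothetical C++ rendering of the scan pattern.  Inside [...] the '$'
  // is an ordinary character, so it needs no escaping here.
  static const std::regex pattern(R"(foo\.[$_]loopfn\.0)");
  return std::regex_search(dump_line, pattern);
}
```

The character class is what lets one pattern serve both kinds of target, so
the test files do not need per-target variants.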

2015-09-14  Uros Bizjak  

* gcc.dg/gomp/dump-new-function-3.c (dg-final): Also scan for $loopfn.
* gcc.dg/gomp/notify-new-function-3.c (dg-final): Ditto.

Tested on alphaev68-linux-gnu and x86_64-linux-gnu.

Committed to mainline SVN.

Uros.
Index: gcc.dg/gomp/dump-new-function-3.c
===
--- gcc.dg/gomp/dump-new-function-3.c   (revision 227715)
+++ gcc.dg/gomp/dump-new-function-3.c   (working copy)
@@ -10,4 +10,4 @@
 }
 
 /* Check that new function does not end up in gimple dump.  */
-/* { dg-final { scan-tree-dump-not "foo\\._loopfn\\.0" "gimple" } } */
+/* { dg-final { scan-tree-dump-not "foo\\.\[\\\$_\]loopfn\\.0" "gimple" } } */
Index: gcc.dg/gomp/notify-new-function-3.c
===
--- gcc.dg/gomp/notify-new-function-3.c (revision 227715)
+++ gcc.dg/gomp/notify-new-function-3.c (working copy)
@@ -11,4 +11,4 @@
 
 
 /* Check for new function notification in ompexpssa dump.  */
-/* { dg-final { scan-tree-dump-times "Added new ssa gimple function foo\\._loopfn\\.0 to callgraph" 1 "ompexpssa" } } */
+/* { dg-final { scan-tree-dump-times "Added new ssa gimple function foo\\.\[\\\$_\]loopfn\\.0 to callgraph" 1 "ompexpssa" } } */


Re: [PATCH] fix TLS support detection for sh targets

2015-09-14 Thread Rich Felker
On Mon, Sep 14, 2015 at 06:06:02PM +0100, Szabolcs Nagy wrote:
> On 14/09/15 17:58, Rich Felker wrote:
> >trunk. For the ChangeLog message, do I need to list both configure and
> >configure.ac or just the latter? And should configure be included in
> >the patch like I did, or regenerated when the patch is applied?
> 
> list both
> 
> i think it's ok to fix configure manually

OK thanks! Sorry I missed the thing about configure: Regenerate in the
second link. I'll format the patch properly and send it soon.

Rich


Re: C++ PATCH for c++/44282 (ia32 calling convention attributes and mangling)

2015-09-14 Thread Jason Merrill
It occurred to me that we weren't warning about this with -Wabi and a 
lower -fabi-version.  This patch fixes that.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e924f74640ecd18240fb514f99b0d5b5ceeeffb7
Author: Jason Merrill 
Date:   Wed Sep 2 16:32:15 2015 -0400

	PR c++/44282

	* mangle.c (write_CV_qualifiers_for_type): Also warn about regparm
	mangling with lower -fabi-version.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 342cb93..2640d52 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2196,7 +2196,7 @@ write_CV_qualifiers_for_type (const tree type)
  We don't do this with classes and enums because their attributes
  are part of their definitions, not something added on.  */
 
-  if (abi_version_at_least (10) && !OVERLOAD_TYPE_P (type))
+  if (!OVERLOAD_TYPE_P (type))
 {
   auto_vec vec;
   for (tree a = TYPE_ATTRIBUTES (type); a; a = TREE_CHAIN (a))
@@ -2207,31 +2207,34 @@ write_CV_qualifiers_for_type (const tree type)
 	  && !is_attribute_p ("abi_tag", name))
 	vec.safe_push (a);
 	}
-  vec.qsort (attr_strcmp);
-  while (!vec.is_empty())
+  if (abi_version_crosses (10) && !vec.is_empty ())
+	G.need_abi_warning = true;
+  if (abi_version_at_least (10))
 	{
-	  tree a = vec.pop();
-	  const attribute_spec *as
-	= lookup_attribute_spec (get_attribute_name (a));
-
-	  write_char ('U');
-	  write_unsigned_number (strlen (as->name));
-	  write_string (as->name);
-	  if (TREE_VALUE (a))
+	  vec.qsort (attr_strcmp);
+	  while (!vec.is_empty())
 	{
-	  write_char ('I');
-	  for (tree args = TREE_VALUE (a); args;
-		   args = TREE_CHAIN (args))
+	  tree a = vec.pop();
+	  const attribute_spec *as
+		= lookup_attribute_spec (get_attribute_name (a));
+
+	  write_char ('U');
+	  write_unsigned_number (strlen (as->name));
+	  write_string (as->name);
+	  if (TREE_VALUE (a))
 		{
-		  tree arg = TREE_VALUE (args);
-		  write_template_arg (arg);
+		  write_char ('I');
+		  for (tree args = TREE_VALUE (a); args;
+		   args = TREE_CHAIN (args))
+		{
+		  tree arg = TREE_VALUE (args);
+		  write_template_arg (arg);
+		}
+		  write_char ('E');
 		}
-	  write_char ('E');
+
+	  ++num_qualifiers;
 	}
-
-	  ++num_qualifiers;
-	  if (abi_version_crosses (10))
-	G.need_abi_warning = true;
 	}
 }
 
diff --git a/gcc/testsuite/g++.dg/abi/mangle-regparm1a.C b/gcc/testsuite/g++.dg/abi/mangle-regparm1a.C
new file mode 100644
index 000..bfa6c9b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle-regparm1a.C
@@ -0,0 +1,21 @@
+// { dg-do run { target { { i?86-*-* x86_64-*-* } && ia32 } } }
+// { dg-options "-fabi-version=8 -Wabi -save-temps" }
+// { dg-final { scan-assembler "_Z18IndirectExternCallIPFviiEiEvT_T0_S3_" } }
+
+template 
+void IndirectExternCall(F f, T t1, T t2) { // { dg-warning "mangled name" }
+  typedef F (*WrapF)(F);
+  f (t1, t2);
+}
+
+__attribute__((regparm(3), stdcall))
+void regparm_func (int i, int j)
+{
+  if (i != 24 || j != 42)
+__builtin_abort();
+}
+
+int main()
+{
+  IndirectExternCall (regparm_func, 24, 42);
+}


Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-14 Thread Alan Lawrence

Ping. (Rerevert with 5 lines extra paranoia in scalarizable_type_p).

Thanks, Alan

On 08/09/15 13:43, Martin Jambor wrote:

Hi,

On Mon, Sep 07, 2015 at 02:15:45PM +0100, Alan Lawrence wrote:

In-Reply-To: <55e0697d.2010...@arm.com>

On 28/08/15 16:08, Alan Lawrence wrote:

Alan Lawrence wrote:


Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix
(that declares arrays with any of these properties as unscalarizable).

...
In the meantime I've reverted the patch pending further testing on x86, aarch64
and arm.


I've now tested g++ and fortran (+ bootstrap + check-gcc) on x86, AArch64 and
ARM, and Ada on x86 and ARM.

So far the list of failures from the original patch seems to be:

* g++.dg/torture/pr64312.C on ARM and m68k-linux
* Building Ada on x86
* Ada ACATS c87b31a on ARM (where the Ada frontend builds fine)

Here's a new version, that fixes all the above, by adding a dose of
paranoia in scalarizable_type_p...


I have only had a brief look at scalarizable_type_p then, considering
all of the rest unchanged, and the tests there seem natural to me.
(Note that I do not have the authority to approve the patch.)


(I wonder about adding a comment
in completely_scalarize that such cases have already been ruled
out?)


The comment already references scalarizable_type_p which is enough at
least for me.

Thanks,

Martin





Re: [PATCH 3/4] [ARM] Add attribute/pragma target fpu=

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 1:39:28 PM GMT+02:00, Christian Bruel wrote:
>This patch splits the neon_builtins initialization into 2 internal
>functions. One for NEON and one for CRYPTO, each one guarded by its own
>predicate. arm_init_neon_builtins is now global to be called from
>arm_valid_target_attribute_tree if needed.

arm_init_crypto_builtins_internal should thus be static, shouldn't it?

Thanks,




Re: [PATCH 00/22] RFC: Overhaul of diagnostics

2015-09-14 Thread Bernd Schmidt

On 09/10/2015 10:28 PM, David Malcolm wrote:

> Attached is a work-in-progress patch kit implementing these ideas.
> I am posting it now to get feedback: some parts of it may be ready to
> commit, but other parts are definitely *not* ready yet.


It's hard to provide meaningful review under these conditions. My advice 
would be to resubmit the things that are ready now and can stand on 
their own so that we can get them out of the way first. Also, gather 
memory/time information before posting the patches if that seems likely 
to be important. For example, patch 21 looks quite cool but also 
potentially expensive, I'd probably want that to be restricted by param 
to identifiers of a maximum length (for both identifiers being compared).


For the most part I declare myself agnostic as to whether this is an 
improvement or not, and leave that for others to comment on. I 
personally prefer single-line errors without much noise.


I see lots of unit tests implemented as plugins - have we decided that 
this is the mechanism we want to use for this kind of thing?


Patch 3 is ok as a purely mechanical move.



Bernd


[AArch64] Force __builtin_aarch64_fp[sc]r argument into a REG

2015-09-14 Thread Richard Sandiford
The attached testcase triggered an ICE because the builtin expansion
code passed the output of expand_normal directly to the SET_FP[SC]R
generator, without forcing it into a register first.

Tested on aarch64-linux-gnu.  OK to install?

Thanks,
Richard

gcc/
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Force
__builtin_aarch64_fp[sc]r arguments into a register.

gcc/testsuite/
* gcc.target/aarch64/fpcr_fpsr_1.c: New file.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index e3a90b5..62af878 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1164,7 +1164,7 @@ aarch64_expand_builtin (tree exp,
  icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ?
CODE_FOR_set_fpsr : CODE_FOR_set_fpcr;
  arg0 = CALL_EXPR_ARG (exp, 0);
- op0 = expand_normal (arg0);
+ op0 = force_reg (SImode, expand_normal (arg0));
  pat = GEN_FCN (icode) (op0);
}
   emit_insn (pat);
diff --git a/gcc/testsuite/gcc.target/aarch64/fpcr_fpsr_1.c b/gcc/testsuite/gcc.target/aarch64/fpcr_fpsr_1.c
new file mode 100644
index 000..29aa1f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fpcr_fpsr_1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+f1 (int *x)
+{
+  __builtin_aarch64_set_fpsr (*x);
+}
+
+void
+f2 (int *x)
+{
+  __builtin_aarch64_set_fpcr (*x);
+}
+
+void
+f3 (int *x)
+{
+  *x = __builtin_aarch64_get_fpsr ();
+}
+
+void
+f4 (int *x)
+{
+  *x = __builtin_aarch64_get_fpcr ();
+}



Go patch committed: Don't use context for constant expressions

2015-09-14 Thread Ian Lance Taylor
This patch by Chris Manghane changes the Go frontend to ignore the
result context when determining the type of constant expressions.
This fixes https://golang.org/issue/11566 .  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 227699)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-aea4360ca9c37f8e929f177ae7e42593ee62aa79
+1d9d92ab09996d2f7795481d2876a21194502b89
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 227696)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -5307,6 +5307,14 @@ Binary_expression::do_determine_type(con
|| this->op_ == OPERATOR_GT
|| this->op_ == OPERATOR_GE);
 
+  // For constant expressions, the context of the result is not useful in
+  // determining the types of the operands.  It is only legal to use abstract
+  // boolean, numeric, and string constants as operands where it is legal to
+  // use non-abstract boolean, numeric, and string constants, respectively.
+  // Any issues with the operation will be resolved in the check_types pass.
+  bool is_constant_expr = (this->left_->is_constant()
+   && this->right_->is_constant());
+
   Type_context subcontext(*context);
 
   if (is_comparison)
@@ -5351,7 +5359,8 @@ Binary_expression::do_determine_type(con
subcontext.type = subcontext.type->make_non_abstract_type();
 }
 
-  this->left_->determine_type(&subcontext);
+  if (!is_constant_expr)
+this->left_->determine_type(&subcontext);
 
   if (is_shift_op)
 {
@@ -5371,7 +5380,8 @@ Binary_expression::do_determine_type(con
   subcontext.may_be_abstract = false;
 }
 
-  this->right_->determine_type(&subcontext);
+  if (!is_constant_expr)
+this->right_->determine_type(&subcontext);
 
   if (is_comparison)
 {
@@ -5396,7 +5406,8 @@ Binary_expression::check_operator_type(O
 {
 case OPERATOR_OROR:
 case OPERATOR_ANDAND:
-  if (!type->is_boolean_type())
+  if (!type->is_boolean_type()
+  || !otype->is_boolean_type())
{
  error_at(location, "expected boolean type");
  return false;
@@ -5431,10 +5442,8 @@ Binary_expression::check_operator_type(O
 
 case OPERATOR_PLUS:
 case OPERATOR_PLUSEQ:
-  if (type->integer_type() == NULL
- && type->float_type() == NULL
- && type->complex_type() == NULL
- && !type->is_string_type())
+  if ((!type->is_numeric_type() && !type->is_string_type())
+  || (!otype->is_numeric_type() && !otype->is_string_type()))
{
  error_at(location,
   "expected integer, floating, complex, or string type");
@@ -5448,9 +5457,7 @@ Binary_expression::check_operator_type(O
 case OPERATOR_MULTEQ:
 case OPERATOR_DIV:
 case OPERATOR_DIVEQ:
-  if (type->integer_type() == NULL
- && type->float_type() == NULL
- && type->complex_type() == NULL)
+  if (!type->is_numeric_type() || !otype->is_numeric_type())
{
  error_at(location, "expected integer, floating, or complex type");
  return false;
@@ -5467,7 +5474,7 @@ Binary_expression::check_operator_type(O
 case OPERATOR_XOREQ:
 case OPERATOR_BITCLEAR:
 case OPERATOR_BITCLEAREQ:
-  if (type->integer_type() == NULL)
+  if (type->integer_type() == NULL || otype->integer_type() == NULL)
{
  error_at(location, "expected integer type");
  return false;


Re: [PATCH diagnostics/67460] Replace some_warnings_are_errors with diagnostic_kind_count (context, DK_WERROR)

2015-09-14 Thread Bernd Schmidt

On 09/14/2015 01:45 AM, Manuel López-Ibáñez wrote:

> The flag diagnostic_context::some_warnings_are_errors controls whether
> to give the message "all warnings being treated as errors". However, when
> warnings are buffered and then discarded, this flag is not reset. It turns
> out we do not need this flag at all, since we already count explicitly how
> many warnings were converted into errors, and this number is kept up to
> date for the buffered diagnostics used by Fortran.


Ok.


Bernd


Re: Split up optabs.[hc]

2015-09-14 Thread Bernd Schmidt

On 09/14/2015 07:54 PM, Richard Sandiford wrote:

This patch splits optabs up as follows:

   - optabs-query.[hc]: IL-independent functions for querying what a target
   can do natively.
   - optabs-tree.[hc]: tree and gimple query functions (an extension of
   optabs-query.[hc]).
   - optabs-libfuncs.[hc]: optabs-specific libfuncs (an extension of
   libfuncs.h)
   - optabs.h: For now includes optabs-query.h and optabs-libfuncs.h.


This seems like a good change.


I changed can_conditionally_move_p from returning an int to returning
a bool and fixed a few formatting glitches.  There should be no other
changes to the functions themselves.


I'm taking your word for it. The patch is slightly confusing in one area 
of optabs.c (it looks like debug_optab_libfuncs got moved around, it 
might be better for patch readability not to do that).

The only thing I really wondered about...


--- /dev/null
+++ b/gcc/optabs-tree.h
@@ -0,0 +1,45 @@
+
+#include "optabs-query.h"


I haven't quite followed amacleod's work on the #includes, so I wasn't 
quite sure whether headers are supposed to include other headers these 
days. But as far as I can tell that's fine. So, patch OK.



Bernd


Re: debug mode symbols cleanup

2015-09-14 Thread François Dumont
On 08/09/2015 22:47, François Dumont wrote:
> On 07/09/2015 13:03, Jonathan Wakely wrote:
>> On 05/09/15 22:53 +0200, François Dumont wrote:
>>>I remember Paolo saying once that we were not guaranteeing any ABI
>>> compatibility for debug mode. I haven't found any section for
>>> unversioned symbols in gnu.ver so I simply uncommented the global export.
>> There is no section, because all exported symbols are versioned.
>>
>> It's OK if objects compiled with Debug Mode using one version of GCC
>> don't link to objects compiled with Debug Mode using a different
>> version of GCC, but you can't change the exported symbols in the DSO.
>>
>>
>> Your changelog doesn't include the changes to config/abi/pre/gnu.ver,
>> but those changes are not OK anyway, they fail the abi-check:
>>
>> FAIL: libstdc++-abi/abi_check
>>
>>=== libstdc++ Summary ===
>>
>> # of unexpected failures        1
>>
>>
> Sorry, I thought the policy regarding debug mode symbols was even more relaxed.
> It is not, so here is another patch that doesn't break the ABI checks.
>
> I eventually made all methods that should not be used deprecated; they
> were normally not used explicitly anyway. Their implementation is now
> empty. I just needed to add a symbol for the non-const _M_message method,
> which is the correct signature.
>
> François
>
In the end I did it without impacting the exported symbols. I
just kept the const qualifier on _M_message and introduced a const_cast
in the implementation.

Is it OK to commit this version?

François

diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index f0ac694..6e56c8f 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -133,6 +133,13 @@ namespace __gnu_debug
 
   class _Error_formatter
   {
+// Tags denoting the type of parameter for construction
+struct _Is_iterator { };
+struct _Is_iterator_value_type { };
+struct _Is_sequence { };
+struct _Is_instance { };
+
+  public:
 /// Whether an iterator is constant, mutable, or unknown
 enum _Constness
 {
@@ -154,13 +161,6 @@ namespace __gnu_debug
   __last_state
 };
 
-// Tags denoting the type of parameter for construction
-struct _Is_iterator { };
-struct _Is_iterator_value_type { };
-struct _Is_sequence { };
-struct _Is_instance { };
-
-  public:
 // A parameter that may be referenced by an error message
 struct _Parameter
 {
@@ -376,15 +376,16 @@ namespace __gnu_debug
 
   void
   _M_print_field(const _Error_formatter* __formatter,
-		 const char* __name) const;
+		 const char* __name) const _GLIBCXX_DEPRECATED;
 
   void
-  _M_print_description(const _Error_formatter* __formatter) const;
+  _M_print_description(const _Error_formatter* __formatter)
+	const _GLIBCXX_DEPRECATED;
 };
 
 template
-  const _Error_formatter&
-  _M_iterator(const _Iterator& __it, const char* __name = 0)  const
+  _Error_formatter&
+  _M_iterator(const _Iterator& __it, const char* __name = 0)
   {
 	if (_M_num_parameters < std::size_t(__max_parameters))
 	  _M_parameters[_M_num_parameters++] = _Parameter(__it, __name,
@@ -393,57 +394,59 @@ namespace __gnu_debug
   }
 
 template
-  const _Error_formatter&
+  _Error_formatter&
   _M_iterator_value_type(const _Iterator& __it,
-			 const char* __name = 0)  const
+			 const char* __name = 0)
   {
-	if (_M_num_parameters < std::size_t(__max_parameters))
+	if (_M_num_parameters < __max_parameters)
 	  _M_parameters[_M_num_parameters++] =
 	_Parameter(__it, __name, _Is_iterator_value_type());
 	return *this;
   }
 
-const _Error_formatter&
-_M_integer(long __value, const char* __name = 0) const
+_Error_formatter&
+_M_integer(long __value, const char* __name = 0)
 {
-  if (_M_num_parameters < std::size_t(__max_parameters))
+  if (_M_num_parameters < __max_parameters)
 	_M_parameters[_M_num_parameters++] = _Parameter(__value, __name);
   return *this;
 }
 
-const _Error_formatter&
-_M_string(const char* __value, const char* __name = 0) const
+_Error_formatter&
+_M_string(const char* __value, const char* __name = 0)
 {
-  if (_M_num_parameters < std::size_t(__max_parameters))
+  if (_M_num_parameters < __max_parameters)
 	_M_parameters[_M_num_parameters++] = _Parameter(__value, __name);
   return *this;
 }
 
 template
-  const _Error_formatter&
-  _M_sequence(const _Sequence& __seq, const char* __name = 0) const
+  _Error_formatter&
+  _M_sequence(const _Sequence& __seq, const char* __name = 0)
   {
-	if (_M_num_parameters < std::size_t(__max_parameters))
+	if (_M_num_parameters < __max_parameters)
 	  _M_parameters[_M_num_parameters++] = _Parameter(__seq, __name,
 			  _Is_sequence());
 	return *this;
   }
 
 template
-  const _Error_formatter&
-  _M_

vector lightweight debug mode

2015-09-14 Thread François Dumont
Hi

Here is what I had in mind when talking about moving debug checks to
the lightweight debug checks.

Sometimes the checks have simply been moved, resulting in a simpler
debug vector implementation (front, back...). Sometimes I copied the
checks in a simpler form and kept the debug ones too, to make sure
execution of the debug code is fine.

I plan to do the same for other containers.
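
To make the idea concrete, here is a minimal sketch of lightweight
precondition checks on a toy container. Everything in it is hypothetical:
SKETCH_ASSERTIONS and SKETCH_ASSERT stand in for _GLIBCXX_ASSERTIONS and
__glibcxx_assert, and small_vec is not std::vector. It only shows the shape
of the checks, not the libstdc++ implementation.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for __glibcxx_assert: a cheap check that is compiled
// in only when SKETCH_ASSERTIONS is defined, and costs nothing otherwise.
#ifdef SKETCH_ASSERTIONS
# define SKETCH_ASSERT(cond)                                   \
  do {                                                         \
    if (!(cond)) {                                             \
      std::fprintf(stderr, "assertion failed: %s\n", #cond);   \
      std::abort();                                            \
    }                                                          \
  } while (false)
#else
# define SKETCH_ASSERT(cond) do { } while (false)
#endif

// Toy fixed-capacity container, just enough to host the checks.
template <typename T, std::size_t N>
class small_vec
{
  T data_[N];
  std::size_t size_ = 0;

public:
  bool empty() const { return size_ == 0; }
  std::size_t size() const { return size_; }

  void push_back(const T& v) { SKETCH_ASSERT(size_ < N); data_[size_++] = v; }
  void pop_back() { SKETCH_ASSERT(!empty()); --size_; }

  // O(1) preconditions only; nothing here tracks iterators or ownership.
  T& operator[](std::size_t n) { SKETCH_ASSERT(n < size()); return data_[n]; }
  T& front() { SKETCH_ASSERT(!empty()); return data_[0]; }
  T& back() { SKETCH_ASSERT(!empty()); return data_[size_ - 1]; }
};
```

The checks are limited to conditions that can be tested in constant time
from the container's own state, which is what keeps them lightweight
compared with the full debug mode.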

I still need to run the tests. OK to commit if they pass?

François

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 305d446..89a9aec 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -449,6 +449,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   vector&
   operator=(vector&& __x) noexcept(_Alloc_traits::_S_nothrow_move())
   {
+	__glibcxx_assert(this != &__x);
 constexpr bool __move_storage =
   _Alloc_traits::_S_propagate_on_move_assign()
   || _Alloc_traits::_S_always_equal();
@@ -778,7 +779,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   reference
   operator[](size_type __n) _GLIBCXX_NOEXCEPT
-  { return *(this->_M_impl._M_start + __n); }
+  {
+	__glibcxx_assert(__n < size());
+	return *(this->_M_impl._M_start + __n);
+  }
 
   /**
*  @brief  Subscript access to the data contained in the %vector.
@@ -793,7 +797,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   const_reference
   operator[](size_type __n) const _GLIBCXX_NOEXCEPT
-  { return *(this->_M_impl._M_start + __n); }
+  {
+	__glibcxx_assert(__n < size());
+	return *(this->_M_impl._M_start + __n);
+  }
 
 protected:
   /// Safety check used only from at().
@@ -850,7 +857,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   reference
   front() _GLIBCXX_NOEXCEPT
-  { return *begin(); }
+  {
+	__glibcxx_assert(!empty());
+	return *begin();
+  }
 
   /**
*  Returns a read-only (constant) reference to the data at the first
@@ -858,7 +868,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   const_reference
   front() const _GLIBCXX_NOEXCEPT
-  { return *begin(); }
+  {
+	__glibcxx_assert(!empty());
+	return *begin();
+  }
 
   /**
*  Returns a read/write reference to the data at the last
@@ -866,7 +879,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   reference
   back() _GLIBCXX_NOEXCEPT
-  { return *(end() - 1); }
+  {
+	__glibcxx_assert(!empty());
+	return *(end() - 1);
+  }
   
   /**
*  Returns a read-only (constant) reference to the data at the
@@ -874,7 +890,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   const_reference
   back() const _GLIBCXX_NOEXCEPT
-  { return *(end() - 1); }
+  {
+	__glibcxx_assert(!empty());
+	return *(end() - 1);
+  }
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // DR 464. Suggestion for new member functions in standard containers.
@@ -949,6 +968,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   void
   pop_back() _GLIBCXX_NOEXCEPT
   {
+	__glibcxx_assert(!empty());
 	--this->_M_impl._M_finish;
 	_Alloc_traits::destroy(this->_M_impl, this->_M_impl._M_finish);
   }
@@ -1051,6 +1071,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   iterator
   insert(const_iterator __position, size_type __n, const value_type& __x)
   {
+	__glibcxx_assert(__position >= cbegin() && __position <= cend());
 	difference_type __offset = __position - cbegin();
 	_M_fill_insert(begin() + __offset, __n, __x);
 	return begin() + __offset;
@@ -1071,7 +1092,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   void
   insert(iterator __position, size_type __n, const value_type& __x)
-  { _M_fill_insert(__position, __n, __x); }
+  {
+	__glibcxx_assert(__position >= begin() && __position <= end());
+	_M_fill_insert(__position, __n, __x);
+  }
 #endif
 
 #if __cplusplus >= 201103L
@@ -1096,6 +1120,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 insert(const_iterator __position, _InputIterator __first,
 	   _InputIterator __last)
 {
+	  __glibcxx_assert(__position >= cbegin() && __position <= cend());
 	  difference_type __offset = __position - cbegin();
 	  _M_insert_dispatch(begin() + __offset,
 			 __first, __last, __false_type());
@@ -1121,6 +1146,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 insert(iterator __position, _InputIterator __first,
 	   _InputIterator __last)
 {
+	  __glibcxx_assert(__position >= begin() && __position <= end());
 	  // Check whether it's an integral type.  If so, it's not an iterator.
 	  typedef typename std::__is_integer<_InputIterator>::__type _Integral;
 	  _M_insert_dispatch(__position, __first, __last, _Integral());
@@ -1145,10 +1171,16 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   iterator
 #if __cplusplus >= 201103L
   erase(const_iterator __position)
-  { return _M_erase(begin() + (__position - cbegin())); }
+  {
+

Re: [PATCH] Convert SPARC to LRA

2015-09-14 Thread David Miller
From: Richard Henderson 
Date: Mon, 14 Sep 2015 10:20:00 -0700

> There's a possibility of benefit though -- br and movr only work with DImode.
> You may want to examine the generated code to decide one way or another.
> 
> It's possible that the extra comparison instructions don't really matter
> compared with the larger spill slot, but you never know...

And another issue is that I get expr.c:expand_expr_real_1() assertion
failures when I try to use SImode for 64-bit, specifically the one in
this code sequence:

  /* Get the signedness to be used for this variable.  Ensure we get
 the same mode we got when the variable was declared.  */
  if (code != SSA_NAME)
pmode = promote_decl_mode (exp, &unsignedp);
  else if ((g = SSA_NAME_DEF_STMT (ssa_name))
   && gimple_code (g) == GIMPLE_CALL
   && !gimple_call_internal_p (g))
pmode = promote_function_mode (type, mode, &unsignedp,
   gimple_call_fntype (g),
   2);
  else
pmode = promote_ssa_mode (ssa_name, &unsignedp);
  gcc_assert (GET_MODE (decl_rtl) == pmode);

There are some other issues I'm having trouble resolving for 64-bit
native bootstraps as well, and I am probably going to revert the LRA
sparc changes unless I can resolve them by the end of today.


[gomp4] reduction simplification

2015-09-14 Thread Nathan Sidwell
While working on another task, I found the idiom used in lower_oacc_loop_enter_exit 
rather confusing.  I've applied this patch, which uses two different loops, clearly going 
in different directions, rather than a single loop and an internal mechanism to 
make one of the instances go backwards.
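
A minimal sketch of the two-loop shape, under stated assumptions: kDimFirst
and kDimMax are hypothetical stand-ins for GOMP_DIM_GANG and GOMP_DIM_MAX,
and the functions only record the visiting order instead of calling
lower_oacc_loop_helper or updating the mask.

```cpp
#include <vector>

// Hypothetical dimension range, standing in for GOMP_DIM_GANG..GOMP_DIM_MAX.
constexpr int kDimFirst = 0;
constexpr int kDimMax = 3;

// Dimensions selected by `mask`, lowest first (the order used on entry).
std::vector<int> dims_on_entry(unsigned mask)
{
  std::vector<int> order;
  for (int i = kDimFirst; i < kDimMax; i++)
    if (mask & (1u << i))
      order.push_back(i);
  return order;
}

// The same dimensions, highest first (the order used on exit).
std::vector<int> dims_on_exit(unsigned mask)
{
  std::vector<int> order;
  // `i-- != kDimFirst` compares the old value and then decrements, so the
  // body sees kDimMax - 1 down to kDimFirst.
  for (int i = kDimMax; i-- != kDimFirst;)
    if (mask & (1u << i))
      order.push_back(i);
  return order;
}
```

Each loop states its own direction, so there is no direction flag or index
remapping to decode when reading either one.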


nathan
2015-09-14  Nathan Sidwell  

	* omp-low.c (lower_oacc_loop_enter_exit): Use separate loops for
	entry and for exit.

Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 227683)
+++ gcc/omp-low.c	(working copy)
@@ -11182,41 +11182,34 @@ lower_omp_for_lastprivate (struct omp_fo
 
 static void
 lower_oacc_loop_enter_exit (bool enter_loop, tree clauses, gimple_seq *ilist,
-			 omp_context *ctx)
+			omp_context *ctx)
 {
   unsigned loop_dim_mask = extract_oacc_loop_mask (ctx);
-  gimple_seq *seq;
-  enum internal_fn fork_join, f1, f2;
-  int dir;
 
   if (loop_dim_mask == 0)
 return;
 
   if (enter_loop)
 {
-  fork_join = IFN_GOACC_FORK;
-  f1 = IFN_GOACC_REDUCTION_SETUP;
-  f2 = IFN_GOACC_REDUCTION_INIT;
-  seq = &oacc_gang_reduction_init;
-  dir = 1;
+  for (int i = GOMP_DIM_GANG; i < GOMP_DIM_MAX; i++)
+	if (loop_dim_mask & GOMP_DIM_MASK (i))
+	  loop_dim_mask =
+	lower_oacc_loop_helper (clauses, ilist, &oacc_gang_reduction_init,
+ctx, IFN_GOACC_REDUCTION_SETUP,
+IFN_GOACC_REDUCTION_INIT,
+IFN_GOACC_FORK, i, loop_dim_mask,
+enter_loop);
 }
   else
 {
-  fork_join = IFN_GOACC_JOIN;
-  f1 = IFN_GOACC_REDUCTION_FINI;
-  f2 = IFN_GOACC_REDUCTION_TEARDOWN;
-  seq = &oacc_gang_reduction_fini;
-  dir = -1;
-}
-
-  for (int i = GOMP_DIM_GANG; i < GOMP_DIM_MAX; i++)
-{
-  int dim = dir > 0 ? i : GOMP_DIM_MAX - (i + 1);
-  if (loop_dim_mask & GOMP_DIM_MASK (dim))
-	loop_dim_mask =
-	  lower_oacc_loop_helper (clauses, ilist, seq, ctx, f1, f2,
-  fork_join, dim, loop_dim_mask,
-  enter_loop);
+  for (int i = GOMP_DIM_MAX; i-- != GOMP_DIM_GANG;)
+	if (loop_dim_mask & GOMP_DIM_MASK (i))
+	  loop_dim_mask =
+	lower_oacc_loop_helper (clauses, ilist, &oacc_gang_reduction_fini,
+ctx, IFN_GOACC_REDUCTION_FINI,
+IFN_GOACC_REDUCTION_TEARDOWN,
+IFN_GOACC_JOIN, i, loop_dim_mask,
+enter_loop);
 }
 }
 


Re: [wwwdocs] GCC 6 Release Notes for RTEMS

2015-09-14 Thread Jeff Law

On 09/14/2015 12:36 AM, Sebastian Huber wrote:

Ping.

On 04/09/15 08:26, Sebastian Huber wrote:

Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.25
diff -u -r1.25 changes.html
--- htdocs/gcc-6/changes.html   25 Aug 2015 22:27:46 - 1.25
+++ htdocs/gcc-6/changes.html   4 Sep 2015 06:21:14 -
@@ -203,6 +203,23 @@

 

+
+  
+The RTEMS thread model implementation changed.  For the mutexes
+self-contained objects defined in Newlib  are used
+instead of Classic API semaphores.  The keys for thread specific
data and
+the once function are directly defined via .
+Self-contained condition variables are provided via Newlib
+.  The RTEMS thread model supports now the C++11
+threads.
+
+The OpenMP support uses now self-contained objects provided
by Newlib
+ and offers a significantly better performance
compared
+to the POSIX configuration of libgomp.  It is
possible to
+configure thread pools for each scheduler instance via the
environment
+variable GOMP_RTEMS_THREAD_POOLS.
+  
+
 

OK.
jeff


Re: [PATCH] Update ENABLE_CHECKING to make it usable in "if" conditions

2015-09-14 Thread Jeff Law

On 09/14/2015 04:11 AM, Richard Biener wrote:

On Wed, Sep 9, 2015 at 11:07 PM, Jeff Law  wrote:

On 08/31/2015 05:30 AM, Richard Biener wrote:


On Mon, Aug 31, 2015 at 7:49 AM, Mikhail Maltsev 
wrote:


Hi, all!

This patch removes some conditional compilation from GCC. In this patch I
define
a macro CHECKING_P, which is equal to 1 when ENABLE_CHECKING is defined
and 0
otherwise. The reason for using a new name is the following: currently in
GCC
there are many places where ENABLE_CHECKING is checked using #ifdef, and
it is a
common way to do such checks (though #if would also work and is used in
several
places). If we change it in such way that it is always defined,
accidentally
using "#ifdef" instead of "#if" will lead to subtle errors: some
expensive
checks intended only for development stage will be enabled in release
build and
cause performance degradation.
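
The difference can be sketched as follows. SKETCH_ENABLE_CHECKING and
SKETCH_CHECKING_P are hypothetical stand-ins for ENABLE_CHECKING and
CHECKING_P, and the functions are invented for illustration; this is not
code from the patch.

```cpp
// The always-defined macro: 1 when checking is configured in, 0 otherwise.
#ifdef SKETCH_ENABLE_CHECKING
# define SKETCH_CHECKING_P 1
#else
# define SKETCH_CHECKING_P 0
#endif

// Counts how many expensive checks ran, so the effect is observable.
int checks_run = 0;

void expensive_verify() { ++checks_run; }

int process(int x)
{
  // An ordinary `if`: the guarded code is always parsed and type-checked,
  // and the compiler can drop it as dead code when the macro is 0.
  if (SKETCH_CHECKING_P)
    expensive_verify();
  return x * 2;
}

// The pitfall: the macro is always defined, so #ifdef is true even in a
// build where its value is 0.
#ifdef SKETCH_CHECKING_P
constexpr bool ifdef_sees_macro = true;
#else
constexpr bool ifdef_sees_macro = false;
#endif
```

Here ifdef_sees_macro is true whether or not checking is enabled, which is
exactly why reusing the old name with a new always-defined meaning would be
risky.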

This patch removes all uses of ENABLE_CHECKING (I also tried poisoning
this
identifier in system.h, and the build succeeded, but I don't know how
will this
affect, e.g. merging feature branches, so I think such decisions should
be made
by maintainers).



I think we want to keep ENABLE_CHECKING for macro use and for some
selected
cases.


Can you outline which cases you want to keep?  My general feeling is to
avoid conditionally compiled code as much as we can.


I guess I was merely looking for the patch to be split up to see the motivation
of an always-defined CHECKING_P macro.

Ah.  I suspect the motivation is that it was the easiest way forward and 
roughly what was initially suggested.



With the suggestion to have
a runtime flag_checking variable I wonder if there is any real code that
is guarded by ENABLE_CHECKING now that can't use flag_checking

I wouldn't expect to see any in the .c files.   There'll likely be 
stragglers (like tree/rtl checking).  But even flushing all the stuff 
out of the .c files is a major win.


Do you have a preference on whether or not to make flag_checking a pre, 
post or integrated patch?  I can see arguments for any of those three 
choices.


jeff



Re: [PATCH diagnostics/67460] Replace some_warnings_are_errors with diagnostic_kind_count (context, DK_WERROR)

2015-09-14 Thread Jeff Law

On 09/13/2015 05:45 PM, Manuel López-Ibáñez wrote:

The flag diagnostic_context::some_warnings_are_errors controls whether
to give the message "all warnings being treated as errors". However, when
warnings are buffered and then discarded, this flag is not reset. It turns
out we do not need this flag at all, since we already count explicitly how
many warnings were converted into errors, and this number is kept up to
date for the buffered diagnostics used by Fortran.
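
To illustrate why a count is more robust than a sticky flag, here is a much simplified model. None of these types are the real diagnostic_context; the buffer in particular is a hypothetical stand-in for the Fortran buffering.

```cpp
#include <cassert>

enum diag_kind { DK_WARNING, DK_WERROR, DK_ERROR, DK_LAST };

struct diag_context
{
  int counts[DK_LAST] = {};
  bool some_warnings_are_errors = false;  // the kind of flag being removed
};

// Hypothetical buffer: counts are held here until committed or discarded.
struct diag_buffer
{
  int counts[DK_LAST] = {};
};

// Buffer a warning that was turned into an error.
void buffer_werror (diag_context &ctx, diag_buffer &buf)
{
  ++buf.counts[DK_WERROR];
  ctx.some_warnings_are_errors = true;  // set eagerly, never reset
}

// Drop everything that was buffered.
void discard (diag_buffer &buf)
{
  for (int &c : buf.counts)
    c = 0;
}

// Move the buffered counts into the context.
void commit (diag_context &ctx, diag_buffer &buf)
{
  for (int i = 0; i < DK_LAST; ++i)
    {
      ctx.counts[i] += buf.counts[i];
      buf.counts[i] = 0;
    }
}

// Count-based decision: only diagnostics that were really emitted matter.
bool should_print_werror_note (const diag_context &ctx)
{
  return ctx.counts[DK_WERROR] > 0;
}
```

After a discard the flag is stale but the count is still zero, so the note is not printed.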

Bootstrapped & tested on x86_64-linux-gnu with
--enable-languages=c,c++,objc,fortran,ada,obj-c++

OK?

gcc/ChangeLog:

2015-09-14  Manuel López-Ibáñez  

 PR fortran/67460
 * diagnostic.c (diagnostic_initialize): Do not set
 some_warnings_are_errors.
 (diagnostic_finish): Use DK_WERROR count instead.
 (diagnostic_report_diagnostic): Do not set
 some_warnings_are_errors.
 * diagnostic.h (struct diagnostic_context): Remove
 some_warnings_are_errors.

gcc/testsuite/ChangeLog:

2015-09-14  Manuel López-Ibáñez  

 PR fortran/67460
 * gfortran.dg/pr67460.f90: New test.

OK.
jeff



Re: [PATCH] Warn when comparing nonnull arguments to NULL in a function.

2015-09-14 Thread Jeff Law

On 09/09/2015 04:33 PM, Mark Wielaard wrote:

On Thu, 2015-09-10 at 00:03 +0200, Jakub Jelinek wrote:

On Wed, Sep 09, 2015 at 04:01:07PM -0600, Jeff Law wrote:

* gcc.dg/nonnull-4.c: New test.
* g++.dg/warn/nonnull3.C: Likewise.


If the tests are the same, perhaps stick just one test into
c-c++-common/nonnull-1.c instead?


Yes, that would be better. The warnings should be exactly the same.


   Also, all the "cp1 compared to NULL"
strings mention cp1, did you mean the second one to mention cp2 and so on?


Oops. copy/paste error indeed.


Can you also update the -Wnonnull documentation in invoke.texi to indicate it
also will warn if it discovers a non-null argument that is compared against
null?

With the doc fix and a bootstrap/regression test, this patch ought to be
fine.
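
For reference, the kind of code under discussion looks roughly like this. It is a made-up function, not one of the new tests, and the exact warning text is not shown since it is not quoted in this thread.

```cpp
#include <cassert>
#include <cstddef>

// The attribute promises that callers never pass a null pointer for
// parameter 1 (GCC/Clang extension).
__attribute__ ((nonnull (1)))
std::size_t length_or_zero (const char *s)
{
  // This comparison contradicts the attribute: the compiler may assume
  // it is always false, which is what the new warning points out.
  if (s == NULL)
    return 0;

  std::size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}
```

The usual fix is to drop either the attribute or the check, depending on which one reflects the real contract.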


Documentation added. bootstrap/regression test still running.

Updated patch attached.
Assuming the bootstrap & regression test completed without errors, this 
patch is fine for the trunk.  Please install if you haven't done so already.


jeff



Re: [PATCH 01/22] Change of location_get_source_line signature

2015-09-14 Thread Jeff Law

On 09/10/2015 02:28 PM, David Malcolm wrote:

location_get_source_line takes an expanded_location, but the column
is irrelevant; it just needs a filename and line number.

This change is used by, but independent of, the new implementation of
diagnostic_show_locus later in the kit, so am breaking this out early.

gcc/ChangeLog:
* input.h (location_get_source_line): Drop "expanded_location"
param in favor of a file and line number.
* input.c (location_get_source_line): Likewise.
(dump_location_info): Update for change in signature of
location_get_source_line.
* diagnostic.c (diagnostic_print_caret_line): Likewise.

gcc/c-family/ChangeLog:
* c-format.c (location_from_offset): Update for change in
signature of location_get_source_line.
* c-indentation.c (get_visual_column): Likewise.
(line_contains_hash_if): Likewise.
This looks like a reasonable cleanup in and of itself.  It's OK for the 
trunk once you've done the usual bootstrap & regression test.


jeff



RE: [RFA] Compact EH Patch

2015-09-14 Thread Moore, Catherine


> -Original Message-
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Wednesday, September 09, 2015 7:46 PM
> To: Jason Merrill; Moore, Catherine; gcc-patches@gcc.gnu.org
> Cc: Matthew Fortune; Ian Lance Taylor
> Subject: Re: [RFA] Compact EH Patch
> 
> On 09/09/2015 01:35 PM, Jason Merrill wrote:
> > On 07/30/2015 04:14 PM, Moore, Catherine wrote:
> >> This patch implements a more compact format for exception handling
> data.
> >> Although I don't have recent numbers for the amount of compression
> >> achieved, an earlier measurement showed a 30% reduction in the size
> >> of EH data for
> >> libstdc++.
> >>
> >> A design document detailing the new format is available
> >> (https://github.com/MentorEmbedded/cxx-
> abi/blob/master/MIPSCompactEH.pdf).
> >>
> >> This implementation enables the new format for MIPS targets only, but
> >> the generic pieces to enable the new format for other architectures are in
> >> place.
> >
> > Hi, sorry for the slow response.
> >
> > I'm surprised that there was no mention of this design on the ABI
> > list, especially since you've decided to post the design document to its git
> repository.
> >
> > I'm skeptical about the explicit rejection of asynchronous
> > backtracing; this is an important capability for debug traces on
> > hosted systems, which is why the compiler flag is on by default in
> > many linux distributions.  The document mentions using libunwind
> > instead, but that wouldn't help, as libunwind relies on the same
> > unwind information.  So it seems to me that the objective in 1.2 of
> supporting both unhosted and Linux-hosted programs isn't sufficiently met.

The support of asynchronous unwinding was not a requirement for our customer 
and wasn't addressed in the design.
As Richard points out, it is certainly possible to extend the scheme to support 
it, although I don't think such support should be a requirement for acceptance 
of the patch.

> 
> Indeed.  Though that is certainly fixable.
> 
> Let us suppose for the moment that the note on page 17 becomes true --
> use some of the currently unused encoding space for push/pop of state, and
> advancing the pc, so that one can represent asynchronous data.
> 
> At that point what's present there in section 10.1 looks plausible for use on
> MIPS.  With appropriate scheduling barriers in the mips prologue, it would in
> all likelihood only add a single byte to the unwind info.
> 
> For instance, suppose
> 
>0101 1011  akin to DW_CFA_remember_state
>0101   akin to DW_CFA_restore_state
>0111  uleb128  akin to DW_CFA_advance_loc, of uleb128 * CALIGN
>0111   akin to DW_CFA_advance_loc, of  * CALIGN
> 
> where CALIGN is 4 for mips32 and 2 for mips16/micromips.  This allows one to
> advance 15 insns with 1 byte, and 127 insns with 2 bytes.
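
The "127 insns with 2 bytes" figure follows from ULEB128 itself: values up to 127 fit in a single ULEB128 byte, so one opcode byte plus one operand byte suffices. A minimal sketch of the standard encoding (the function names are only illustrative, and nothing here is specific to the compact EH opcodes):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode VALUE as unsigned LEB128: 7 payload bits per byte, low bits
// first, high bit set on every byte except the last.
std::vector<std::uint8_t> encode_uleb128 (std::uint64_t value)
{
  std::vector<std::uint8_t> out;
  do
    {
      std::uint8_t byte = value & 0x7f;
      value >>= 7;
      if (value != 0)
        byte |= 0x80;
      out.push_back (byte);
    }
  while (value != 0);
  return out;
}

// Decode one ULEB128 value from the start of IN.
std::uint64_t decode_uleb128 (const std::vector<std::uint8_t> &in)
{
  std::uint64_t value = 0;
  unsigned shift = 0;
  for (std::uint8_t byte : in)
    {
      if (shift >= 64)
        break;  // malformed input: ignore bits that cannot fit
      value |= std::uint64_t (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        break;
    }
  return value;
}
```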
> 
> The first example in 10.2.1 is instructive, in that it would take 5 bytes to
> encode: pc += 1*4, sp += 56, pc += 8*4, push {16-22,31}, finish.  Given that
> most functions that allocate a stack frame will do so in the first insn, and
> indeed cannot do anything useful in zero instructions, one could make that
> first pc adjustment implicit, reducing the size of the unwind to 4 bytes,
> which does fit into your inline unwind info.
> 
> Anyway, the exact encodings of this are something for the mips maintainers,
> since it isn't applicable generically.
> 
> Of more interest to me is the rest of the proposal, particularly section 10.4.
>   I like that there's more locality to the unwind data than the current
> .gcc_except_table contents.  I like that there's less pointer chasing.
> 
> Looking at the contents of my desktop, the vast majority of binaries have no
> .gcc_except_table, or a trivially small amount.  But I do have 102 binaries
> with a table larger than 64k, with a maximum size of 705k.  So I also like the
> potential size savings of 25-40%.
> 
> The spec in section 10.4 looks good.  I can't see any issues with it off-hand.
> 
> The spec in section 8.2 ought to be extended to handle 64-bit offsets instead of
> 32-bit offsets, even if only by reserving version 3 for the purpose.  While
> MIPS may want to restrict the size of the elf object to 2GB, and that's the
> common case for most files on all systems, we cannot restrict the size on all
> systems.
> 
> Anyway, that's having read the referenced document only, and nothing yet
> of the code.  I'll try to get to that tomorrow.
> 



Re: [PATCH 02/22] Testsuite: add dg-{begin|end}-multiline-output commands

2015-09-14 Thread Jeff Law

On 09/10/2015 02:28 PM, David Malcolm wrote:

This patch adds an easy way to write tests for expected multiline
output.  For example we can test carets and underlines for
a particular diagnostic with:

/* { dg-begin-multiline-output "" }
  typedef struct _GMutex GMutex;
 ^~~
{ dg-end-multiline-output "" } */

It is used extensively by the rest of the patch kit.

And could be used to simplify/test the basic caret diagnostics as well.



multiline.exp is used by prune.exp; hence we need to load it before
prune.exp via *load_gcc_lib* for the testsuites of the various
non-"gcc" support libraries (e.g. boehm-gc).
?!? Then why does prune.exp also load multiline.exp?  I  must be missing 
something here.





Question: which ChangeLog file should the change to
   libgo/testsuite/lib/libgo.exp
go into?
gcc/testsuite/ChangeLog is the nearest enclosing ChangeLog.  So that 
seems to be the right place.  That's also where Ian put changes to go-test.exp.




Jeff


Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs

2015-09-14 Thread Jeff Law

On 09/11/2015 07:55 AM, Michael Matz wrote:

Hi,

On Thu, 10 Sep 2015, David Malcolm wrote:


Does anyone know why this was "carefully packed" and to what extent
this matters?  I'm adding an extra 8 bytes to it (or 4 if we eliminate
the existing location_t).  As far as I can see, these are
short-lived, and there are only relatively few alive at any time.


The c++ frontend stores _all_ tokens before starting to parse, so the size
of cp_token is not totally irrelevant.  It still might not matter much,
though.
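
A back-of-the-envelope way to see the cost, using a deliberately simplified token layout. This is not the real cp_token; the field names and sizes are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical token: small fields packed into bit-fields, plus one
// location and a payload pointer.
struct token_narrow
{
  unsigned type : 8;
  unsigned keyword : 8;
  unsigned flags : 8;
  unsigned location;
  void *value;
};

// Same token with two extra 4-byte locations describing a range.
struct token_wide
{
  unsigned type : 8;
  unsigned keyword : 8;
  unsigned flags : 8;
  unsigned location;
  unsigned range_start;
  unsigned range_finish;
  void *value;
};

// Extra memory needed if N_TOKENS tokens are all alive at once, as when
// a front end lexes the whole translation unit before parsing.
constexpr std::size_t extra_bytes_for (std::size_t n_tokens)
{
  return n_tokens * (sizeof (token_wide) - sizeof (token_narrow));
}
```

The per-token growth depends on the target's alignment rules, so the real numbers would have to be measured on the actual structure.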
FWIW, Zack hasn't gone away totally (he was just posting about the 
explicit_bzero stuff) -- it couldn't hurt to ping him on the 
implications of changing the size of that structure.


jeff


Re: [C++ Patch] PR 53184 ("Unnecessary anonymous namespace warnings")

2015-09-14 Thread Florian Weimer
On 09/09/2015 05:09 PM, Paolo Carlini wrote:
> +@item -Wsubobject-linkage @r{(C++ and Objective-C++ only)}
> +@opindex Wsubobject-linkage
> +@opindex Wno-subobject-linkage
> +Warn if a class type has a base or a field whose type uses the anonymous
> +namespace or depends on a type with no linkage.

“and the class type does not itself use the anonymous namespace or has
no linkage”?

>  This warning is
> +enabled by default.

Maybe add a sentence why this is bad?  I can only guess, but I suspect
the reason is this: Such types are necessarily specific to a single
translation unit because any definition in another translation unit
would be an ODR violation, so they can be put into the anonymous
namespace themselves.
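
A small example of the situation being described, with made-up type names. As far as I understand the warning is aimed at headers, so a single source file like this is only meant to show the shape of the code, not to trigger the diagnostic.

```cpp
#include <cassert>

namespace
{
  // Internal linkage: every translation unit that sees this definition
  // gets its own, distinct Impl type.
  struct Impl
  {
    int value;
  };
}

// Widget has external linkage but contains a field whose type is local
// to this translation unit, so no other unit can define the same Widget
// without an ODR violation.
struct Widget
{
  Impl impl;
};

namespace
{
  // The suggested alternative: make the containing class local as well.
  struct LocalWidget
  {
    Impl impl;
  };
}

inline int widget_value (const Widget &w) { return w.impl.value; }
```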

-- 
Florian Weimer / Red Hat Product Security


Re: [PATCH 03/22] Move diagnostic_show_locus and friends out into a new source file

2015-09-14 Thread Jeff Law

On 09/10/2015 02:28 PM, David Malcolm wrote:

The function "diagnostic_show_locus" gains new functionality in the
next patch, so this preliminary patch breaks it out into a new source
file, diagnostic-show-locus.c, along with a couple of related functions.

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add diagnostic-show-locus.o.
* diagnostic.c (adjust_line): Move to diagnostic-show-locus.c.
(diagnostic_show_locus): Likewise.
(diagnostic_print_caret_line): Likewise.
* diagnostic-show-locus.c: New file.

This is fine for the trunk.

So much for the easy stuff :-)

jeff



Re: [PATCH 00/22] RFC: Overhaul of diagnostics

2015-09-14 Thread Jeff Law

On 09/14/2015 11:43 AM, Bernd Schmidt wrote:

It's hard to provide meaningful review under these conditions. My advice
would be to resubmit the things that are ready now and can stand on
their own so that we can get them out of the way first. Also, gather
memory/time information before posting the patches if that seems likely
to be important. For example, patch 21 looks quite cool but also
potentially expensive, I'd probably want that to be restricted by param
to identifiers of a maximum length (for both identifiers being compared).
I think David is looking for some feedback on some of this stuff. 
There's clearly some design/implementation issues in those middling 
patches.  The thought behind showing the later patches is so that folks 
can generally see where this work is trying to go.


One of my big worries is the memory consumption.



For the most part I declare myself agnostic as to whether this is an
improvement or not, and leave that for others to comment on. I
personally prefer single-line errors without much noise.
I wasn't a fan of rich location diagnostics, carets, etc.  However, now 
that I'm doing more C++ bits, I'm seeing the utility of this kind of stuff.




I see lots of unit tests implemented as plugins - have we decided that
this is the mechanism we want to use for this kind of thing?
A lot of the plugin-based testing is stuff that's painful to test 
end-to-end.  Probably the best way to think of those tests is they're 
trying to directly test internal state.


Jeff



Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu=

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 4:30:23 PM GMT+02:00, Christian Bruel 
 wrote:
>Finally, the final part of the patch set does the attribute target 
>parsing and checking, redefines the preprocessor macros and implements 
>the inlining rules.
>
>testcases and documentation included.

@@ -29501,6 +29532,8 @@
 static bool
 arm_valid_target_attribute_rec (tree args, struct gcc_options *opts)
 {
+  int ret=true;
+
   if (TREE_CODE (args) == TREE_LIST)
 {
   bool ret = true;


Doesn't the hunk above trigger a shadow warning? Furthermore there are missing 
spaces before and after the '='. And finally (no diff -p so I can only guess) 
why the int if the function returns a bool?
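
The danger with such shadowing, in a stand-alone sketch (hypothetical function, unrelated to the ARM code): updates to the inner variable never reach the value that is returned.

```cpp
#include <cassert>

// Buggy: the inner 'ret' shadows the outer one, so the function always
// returns true.  -Wshadow would flag the inner declaration.
bool all_nonnegative_buggy (const int *values, int n)
{
  bool ret = true;          // outer variable, the one actually returned
  if (n > 0)
    {
      bool ret = true;      // shadows the outer 'ret'
      for (int i = 0; i < n; ++i)
        if (values[i] < 0)
          ret = false;      // updates only the inner variable
    }
  return ret;
}

// Fixed: a single variable, of the same type as the return value.
bool all_nonnegative (const int *values, int n)
{
  bool ret = true;
  for (int i = 0; i < n; ++i)
    if (values[i] < 0)
      ret = false;
  return ret;
}
```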

Thanks,

@@ -29518,30 +29551,35 @@
 }
 
   char *argstr = ASTRDUP (TREE_STRING_POINTER (args));
-  while (argstr && *argstr != '\0')
+  char *q;
+
+  while ((q = strtok (argstr, ",")) != NULL)
 {
-  while (ISSPACE (*argstr))
-   argstr++;
+  while (ISSPACE (*q)) ++q;
 
-  if (!strcmp (argstr, "thumb"))
-   {
+  argstr = NULL;
+  if (!strncmp (q, "thumb", 5))
  opts->x_target_flags |= MASK_THUMB;
- arm_option_check_internal (opts);
- return true;
-   }
 
-  if (!strcmp (argstr, "arm"))
-   {
+  else if (!strncmp (q, "arm", 3))
  opts->x_target_flags &= ~MASK_THUMB;
- arm_option_check_internal (opts);
- return true;
+
+  else if (!strncmp (q, "fpu=", 4))
+   {
+ if (! opt_enum_arg_to_value (OPT_mfpu_, q+4,
+  &opts->x_arm_fpu_index, CL_TARGET))
+   {
+ error ("invalid fpu for attribute(target(\"%s\"))", q);
+ return false;
+   }
}
+  else
+   warning (0, "attribute(target(\"%s\")) is unknown", argstr);
 
-  warning (0, "attribute(target(\"%s\")) is unknown", argstr);
-  return false;
+  arm_option_check_internal (opts);
 }
 
-  return false;
+  return ret;
 }
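
For readers unfamiliar with the strtok idiom used in the new loop, here is a stand-alone sketch. The function name is hypothetical and it only splits the string; it does not model the option handling.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>
#include <vector>

// Split a comma-separated attribute string, skipping leading blanks in
// each item.  strtok modifies its input, so work on a private copy.
std::vector<std::string> split_target_attr (const std::string &attr)
{
  std::vector<char> buf (attr.begin (), attr.end ());
  buf.push_back ('\0');

  std::vector<std::string> items;
  char *argstr = buf.data ();
  char *q;
  while ((q = std::strtok (argstr, ",")) != NULL)
    {
      // After the first call strtok must be given NULL so that it
      // continues in the same buffer.
      argstr = NULL;
      while (std::isspace (static_cast<unsigned char> (*q)))
        ++q;
      items.push_back (q);
    }
  return items;
}
```

Note that once the buffer pointer has been set to NULL, the current item is only reachable through the token pointer (q here), so that is the one to use in any message. strtok also keeps hidden static state, so it is not reentrant.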


>
>thanks
>
>Christian




Re: vector lightweight debug mode

2015-09-14 Thread Jonathan Wakely

On 14/09/15 20:27 +0200, François Dumont wrote:

Hi

   Here is what I had in mind when talking about moving debug checks to
the lightweight debug checks.


Ah yes, I hadn't thought about removing redundant checks from the
__gnu_debug containers, but that makes sense.


   Sometimes the checks have been simply moved resulting in a simpler
debug vector implementation (front, back...). Sometimes I copy the
checks in a simpler form and kept the debug one too to make sure
execution of the debug code is fine.

   I plan to do the same for other containers.

   I still need to run tests, ok if tests are fine ?

François




diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 305d446..89a9aec 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -449,6 +449,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  vector&
  operator=(vector&& __x) noexcept(_Alloc_traits::_S_nothrow_move())
  {
+   __glibcxx_assert(this != &__x);


Please don't do this, it fails in valid programs. The standard needs
to be fixed in this regard.


constexpr bool __move_storage =
  _Alloc_traits::_S_propagate_on_move_assign()
  || _Alloc_traits::_S_always_equal();
@@ -778,7 +779,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   */
  reference
  operator[](size_type __n) _GLIBCXX_NOEXCEPT
-  { return *(this->_M_impl._M_start + __n); }
+  {
+   __glibcxx_assert(__n < size());
+   return *(this->_M_impl._M_start + __n);
+  }


This could use __glibcxx_requires_subscript(__n), see the attached
patch.
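
The general shape of such a cheap precondition check, sketched with a made-up macro and container. SKETCH_ASSERT and fixed_vector are not libstdc++ names; the real macros are the __glibcxx_requires_* ones mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical opt-in assertion: aborts when enabled, expands to
// nothing otherwise, so the default build pays no cost.
#ifdef SKETCH_ASSERTIONS
# define SKETCH_ASSERT(cond) do { if (!(cond)) std::abort (); } while (false)
#else
# define SKETCH_ASSERT(cond) do { } while (false)
#endif

template <typename T, std::size_t N>
struct fixed_vector
{
  T data[N];
  std::size_t count;

  std::size_t size () const { return count; }
  bool empty () const { return count == 0; }

  // Precondition: n < size ().
  T &operator[] (std::size_t n)
  {
    SKETCH_ASSERT (n < size ());
    return data[n];
  }

  // Precondition: !empty ().
  T &front ()
  {
    SKETCH_ASSERT (!empty ());
    return data[0];
  }
};
```

Each check is a single comparison against state the container already has, which is what keeps it "lightweight" compared with the full debug mode.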



  /**
   *  @brief  Subscript access to the data contained in the %vector.
@@ -793,7 +797,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   */
  const_reference
  operator[](size_type __n) const _GLIBCXX_NOEXCEPT
-  { return *(this->_M_impl._M_start + __n); }
+  {
+   __glibcxx_assert(__n < size());
+   return *(this->_M_impl._M_start + __n);
+  }

protected:
  /// Safety check used only from at().
@@ -850,7 +857,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   */
  reference
  front() _GLIBCXX_NOEXCEPT
-  { return *begin(); }
+  {
+   __glibcxx_assert(!empty());


This is __glibcxx_requires_nonempty(), already defined for
_GLIBCXX_ASSERTIONS.


@@ -1051,6 +1071,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  iterator
  insert(const_iterator __position, size_type __n, const value_type& __x)
  {
+   __glibcxx_assert(__position >= cbegin() && __position <= cend());
difference_type __offset = __position - cbegin();
_M_fill_insert(begin() + __offset, __n, __x);
return begin() + __offset;


This is undefined behaviour, so I'd rather not add this check (I know
it's on the google branch, but it's still undefined behaviour).

We could do it with std::less, I suppose.
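
A sketch of what the std::less variant could look like. The helper is hypothetical; it relies only on the guarantee that std::less yields a total order on pointers, where the built-in relational operators give no reliable result for unrelated objects.

```cpp
#include <cassert>
#include <functional>

// True if P lies in [FIRST, LAST] according to std::less's total order
// on pointers.
template <typename T>
bool in_closed_range (const T *p, const T *first, const T *last)
{
  std::less<const T *> lt;
  return !lt (p, first) && !lt (last, p);
}
```

This only makes the comparison itself well-defined; whether the total order places every unrelated pointer outside the range is up to the implementation.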

I've attached the simplest checks I thought we should add for vector
and deque.


commit c2b5d263b7553074c82a721dc59b71a2a4a84436
Author: Jonathan Wakely 
Date:   Thu Sep 10 14:23:43 2015 +0100

Add cheap assertions to std::vector and std::deque.

PR libstdc++/56109
* include/bits/stl_deque.h (deque::operator[], deque::front,
deque::back, deque::pop_front, deque::pop_back, deque::swap): Assert
preconditions.
* include/bits/stl_vector.h (vector::operator[], vector::front,
vector::back, vector::pop_back, vector::swap): Likewise.
* include/debug/debug.h [_GLIBCXX_ASSERTIONS]: Define
__glibcxx_requires_subscript.

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index e4fa6e3..fd38af9 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,14 @@
+2015-09-10  Jonathan Wakely  
+
+   PR libstdc++/56109
+   * include/bits/stl_deque.h (deque::operator[], deque::front,
+   deque::back, deque::pop_front, deque::pop_back, deque::swap): Assert
+   preconditions.
+   * include/bits/stl_vector.h (vector::operator[], vector::front,
+   vector::back, vector::pop_back, vector::swap): Likewise.
+   * include/debug/debug.h [_GLIBCXX_ASSERTIONS]: Define
+   __glibcxx_requires_subscript.
+
 2015-09-09  Jonathan Wakely  
 
* doc/xml/manual/using.xml (_GLIBCXX_ASSERTIONS): Document.
diff --git a/libstdc++-v3/include/bits/stl_deque.h 
b/libstdc++-v3/include/bits/stl_deque.h
index f674245..3c8bb2e 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -1362,7 +1362,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   reference
   operator[](size_type __n) _GLIBCXX_NOEXCEPT
-  { return this->_M_impl._M_start[difference_type(__n)]; }
+  {
+   __glibcxx_requires_subscript(__n);
+   return this->_M_impl._M_start[difference_type(__n)];
+  }
 
   /**
*  @brief Subscript access to the data contained in the %deque.
@@ -1377,7 +1380,10 @@ _GLIBCXX_BEGIN_NAM

Re: [PATCH 2/4] [ARM] Add attribute/pragma target fpu=

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 12:47:23 PM GMT+02:00, Christian Bruel 
 wrote:
>This patch defines and uses accessors for the current fpu type fields, 
>based on switchable arm_fpu_index rather than defunct arm_fpu_desc.


* config/arm/arm.c (arm_fpu_desc): Remove.
(all_fpus): Make global.

I suppose there will be no "mixed" targets nor multiple targets in one binary 
anytime soon, still I'd have stuck some arm indicator into the "all_fpus" 
global while exporting it..




[PATCH] Minor DOM cleanup

2015-09-14 Thread Jeff Law


Passing void * to avail_expr_hash was an artifact of the old htab 
interface.  Since we're no longer using that interface and call the 
hashing routine directly, we can just pass in the right type and avoid 
the annoying casting.
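
The before/after shape of that change, in a reduced form. The struct layouts and the hash function below are simplified stand-ins, not the real tree-ssa-dom.c definitions.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for the real structures.
struct hashable_expr { int kind; int operand; };
struct expr_hash_elt { hashable_expr expr; std::size_t cached_hash; };

// Toy hash, just so the two entry points have something to call.
static std::size_t hash_expr (const hashable_expr *e)
{
  return std::size_t (e->kind) * 31u + std::size_t (e->operand);
}

// Old style: a callback-friendly signature that needs a cast inside.
static std::size_t avail_expr_hash_untyped (const void *p)
{
  const hashable_expr *expr = &static_cast<const expr_hash_elt *> (p)->expr;
  return hash_expr (expr);
}

// New style: the parameter has its real type and the cast disappears.
static std::size_t avail_expr_hash_typed (const expr_hash_elt *p)
{
  return hash_expr (&p->expr);
}
```

With the typed signature, passing the wrong kind of pointer becomes a compile-time error instead of a silent misinterpretation.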


This simplifies class-ifying the available expression stack and related 
bits on the way to fixing 47679.


Bootstrapped and regression tested on x86_64-linux-gnu.  Installed on 
the trunk.



Jeff
PR tree-optimization/47679
* tree-ssa-dom.c (avail_expr_hash): Pass a pointer to a real
type rather than void *.

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index e3eb0db..248d24f 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -231,7 +231,7 @@ static struct opt_stats_d opt_stats;
 /* Local functions.  */
 static void optimize_stmt (basic_block, gimple_stmt_iterator);
 static tree lookup_avail_expr (gimple, bool);
-static hashval_t avail_expr_hash (const void *);
+static hashval_t avail_expr_hash (struct expr_hash_elt *);
 static void htab_statistics (FILE *,
 const hash_table &);
 static void record_cond (cond_equivalence *);
@@ -2661,9 +2661,9 @@ lookup_avail_expr (gimple stmt, bool insert)
its operands.  */
 
 static hashval_t
-avail_expr_hash (const void *p)
+avail_expr_hash (struct expr_hash_elt *p)
 {
-  const struct hashable_expr *expr = &((const struct expr_hash_elt *)p)->expr;
+  const struct hashable_expr *expr = &p->expr;
   inchash::hash hstate;
 
   inchash::add_hashable_expr (expr, hstate);


[RFC] Masking vectorized loops with bound not aligned to VF.

2015-09-14 Thread Kirill Yukhin
Hello,
I'd like to initiate discussion on vectorization of loops whose bounds are not
aligned to VF. The main target for this optimization right now is x86's AVX-512,
which features per-element embedded masking for all instructions.
The main goal of this mail is to agree on the overall design of the feature.

This approach was presented @ GNU Cauldron 2015 by Ilya Enkovich [1].
 
Here's a sketch of the algorithm:
  1. Add a check on basic stmts for masking: the possibility to introduce an
     index vector and a corresponding mask.
  2. When checking whether statements are vectorizable, additionally check
     whether stmts need to be and can be masked, and compute the masking cost.
     The result is stored in `stmt_vinfo`.  We are going to mask only mem.
     accesses and reductions, and modify the mask for already masked stmts
     (mask load, mask store and vect. condition).
  3. Make a decision about masking: take the computed costs and the estimated
     iteration count into consideration.
  4. Modify prologue/epilogue generation according to the decision made at
     analysis.  Three options are available:
       a. Use scalar remainder
       b. Use masked remainder.  Won't be supported in the first version
       c. Mask main loop
  5. Support vectorized loop masking:
       - Create stmts for mask generation
       - Support generation of masked vector code (create generic vector code,
         then patch it w/ masks)
       - Mask loads/stores/vconds/reductions only
 
In the first version (targeted v6) we're not going to support 4.b and loop mask
pack/unpack.
No `pack/unpack` means that masking will be supported only for types w/ the same
size as index variable
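
To make option 4.c ("mask main loop") concrete, here is a scalar emulation of what the masked vector loop computes. The VF value and the function name are assumptions for illustration; real code would use vector instructions with a hardware mask rather than an inner loop.

```cpp
#include <cassert>
#include <cstddef>

// Assumed vectorization factor (e.g. 8 lanes).
constexpr std::size_t VF = 8;

// Adds a[i] + b[i] for i in [0, n), processing VF lanes per iteration.
// The last iteration may be partial; lanes past n are masked off, so no
// scalar remainder loop is needed.
void add_arrays_masked (const int *a, const int *b, int *out, std::size_t n)
{
  for (std::size_t base = 0; base < n; base += VF)
    {
      // Mask generation: compare the index vector against the bound.
      bool mask[VF];
      for (std::size_t lane = 0; lane < VF; ++lane)
        mask[lane] = base + lane < n;

      // Masked load, add and store: inactive lanes touch no memory.
      for (std::size_t lane = 0; lane < VF; ++lane)
        if (mask[lane])
          out[base + lane] = a[base + lane] + b[base + lane];
    }
}
```

The cost question in step 3 is whether paying for mask generation in every iteration beats running a scalar remainder once.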
 
[1] - 
https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Vectorization+for+Intel+AVX-512.pdf

What do you think?

--
Thanks, K


Re: [PING][Patch] Add support for IEEE-conformant versions of scalar fmin* and fmax*

2015-09-14 Thread Joseph Myers
On Mon, 14 Sep 2015, Richard Biener wrote:

> To make progress here I think adding new optabs is fine.  So can you
> split out that part and implement builtin expanders
> for fmin/max instead?
> 
> Btw, FMIN/MAX_EXPR are not commutative AFAIK because of behavior for
> fmax (-NaN, NaN)
> vs. fmax (NaN, -NaN)?

For those cases, it's unspecified which NaN is returned, so they can be 
considered commutative (and likewise for the case when the arguments are 0 
and -0).
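
A quick way to see this at the library level. The helper names are made up, and the comparison deliberately treats any two NaNs, and +0 and -0, as "the same result", which is the sense of commutativity meant above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Equal as numbers (so 0.0 and -0.0 match), or both NaN.
inline bool same_number_or_both_nan (double x, double y)
{
  return (std::isnan (x) && std::isnan (y)) || x == y;
}

// True if swapping the operands of fmax gives an equivalent result.
inline bool fmax_commutes (double a, double b)
{
  return same_number_or_both_nan (std::fmax (a, b), std::fmax (b, a));
}
```

Which NaN payload or which zero sign comes back is exactly the part left unspecified, so a bitwise comparison of the two results could differ.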

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Fix 61441

2015-09-14 Thread Joseph Myers
On Mon, 14 Sep 2015, Richard Biener wrote:

> I'll leave the correctness part of the patch to Joseph who knows FP
> arithmetic better than me,
> implementation-wise this is ok if you fix the REAL_CST sharing issue.

Changing fold_abs_const is wrong - abs of sNaN is sNaN, no exceptions 
raised.  Changing real_arithmetic is wrong for the NEGATE_EXPR and 
ABS_EXPR cases, both of which should just affect the sign bit without 
quieting sNaNs.
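
The "only the sign bit" behaviour can be observed from user code. This sketch uses a quiet NaN only, since observing signaling NaNs portably depends on compiler flags; the helper names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Build a NaN with the sign bit set, using copysign (a pure sign-bit
// operation, like abs and negate).
inline double make_negative_nan ()
{
  return std::copysign (std::numeric_limits<double>::quiet_NaN (), -1.0);
}

// fabs must leave the value a NaN and merely clear the sign bit.
inline bool abs_keeps_nan_and_clears_sign (double x)
{
  const double r = std::fabs (x);
  return std::isnan (r) && !std::signbit (r);
}
```

Constant folding in the compiler has to give the same answers as these run-time operations, which is why the fold must not turn the operand into a different NaN.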

All the comments in the patch should end with ".  " (full stop, two 
spaces).

If -fsignaling-nans, then folding of expressions involving sNaNs should be 
disabled, outside of static initializers - such expressions should not get 
folded to return an sNaN (it's incorrect to fold sNaN + 1 to sNaN, for 
example).  I think existing code may ensure that (the HONOR_SNANS check in 
const_binop, for example).

Inside static initializers, expressions involving sNaNs still need to be 
folded (so sNaN + 1 becomes qNaN inside such an initializer, for example, 
with the translation-time exception being discarded).  Again, existing 
code should handle this: START_FOLD_INIT / END_FOLD_INIT already handle 
clearing and restoring flag_signaling_nans.

My understanding of the design of the existing code is that real.c will do 
the arithmetic regardless of whether it might raise an exception or have 
rounding-mode-dependent results, with fold-const.c being responsible for 
deciding whether the result can be used to fold the expression in 
question.  That is, you shouldn't need any flag_signaling_nans conditions 
in real.c; rather, if IEEE semantics mean an sNaN is quieted, real.c 
should do so unconditionally.  It should be the callers in fold-const.c 
that check HONOR_SNANS and disallow folding when it would lose exceptions.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 02/22] Testsuite: add dg-{begin|end}-multiline-output commands

2015-09-14 Thread Bernhard Reutner-Fischer
On September 14, 2015 9:32:54 PM GMT+02:00, Jeff Law  wrote:
>On 09/10/2015 02:28 PM, David Malcolm wrote:

>>
>> multiline.exp is used by prune.exp; hence we need to load it before
>> prune.exp via *load_gcc_lib* for the testsuites of the various
>> non-"gcc" support libraries (e.g. boehm-gc).
>?!? Then why does prune.exp also load multiline.exp?  I  must be
>missing 
>something here.

https://gcc.gnu.org/ml/fortran/2012-03/msg00094.html

dejagnu has been able to handle libdirs fine for a couple of years, but this was deemed 
too early for GCC-5. Maybe GCC-6 can bump the required dejagnu version to allow 
for getting rid of all these superfluous load_gcc_lib? *blink* :)

Thanks and cheers,



Re: [C++ Patch] PR 53184 ("Unnecessary anonymous namespace warnings")

2015-09-14 Thread Paolo Carlini

Hi Florian,

On 09/14/2015 09:41 PM, Florian Weimer wrote:

  This warning is
+enabled by default.
Maybe add a sentence why this is bad?  I can only guess, but I suspect
the reason is this: Such types are necessarily specific to a single
translation unit because any definition in another translation unit
would be an ODR violation, so they can be put into the anonymous
namespace themselves.
As I probably mentioned somewhere, GCC is the only compiler I have at 
hand implementing something similar: frankly, I'm not sure how exactly 
we want to put it, concisely and neatly at the same time. If you are 
willing to prepare something more concrete, I'm sure Jason would be 
happy to review it!


Thanks,
Paolo.


Re: [PATCH 02/22] Testsuite: add dg-{begin|end}-multiline-output commands

2015-09-14 Thread Jeff Law

On 09/14/2015 02:38 PM, Bernhard Reutner-Fischer wrote:

On September 14, 2015 9:32:54 PM GMT+02:00, Jeff Law 
wrote:

On 09/10/2015 02:28 PM, David Malcolm wrote:




multiline.exp is used by prune.exp; hence we need to load it
before prune.exp via *load_gcc_lib* for the testsuites of the
various non-"gcc" support libraries (e.g. boehm-gc).

?!? Then why does prune.exp also load multiline.exp?  I  must be
missing something here.


https://gcc.gnu.org/ml/fortran/2012-03/msg00094.html

dejagnu has been able to handle libdirs fine for a couple of years, but
this was deemed too early for GCC-5. Maybe GCC-6 can bump the required
dejagnu version to allow for getting rid of all these superfluous
load_gcc_lib? *blink* :)

I'd support that as a direction.

Certainly dropping the 2001 version from our website in favor of 1.5 
(which is what I'm using anyway) would be a step forward.


jeff


Re: [patch match.pd c c++]: Ignore results of 'shorten_compare' and move missing patterns in match.pd

2015-09-14 Thread Jeff Law

On 09/08/2015 05:17 AM, Kai Tietz wrote:

Hi,

This patch is the first part of obsoleting the 'shorten_compare' function
for folding.
It adjusts the uses of 'shorten_compare' to ignore the folding returned by
it, and adds
missing patterns to match.pd to allow a full bootstrap of C/C++ without
regressions.
Because we are using 'shorten_compare' for some diagnostics, we can't simply
remove it.  So if this patch gets approved, the next step will be to
rename the function to something like 'check_compare', and adjust its
arguments and inner logic to reflect that we don't modify
arguments/expression anymore within that function.

Bootstrap shows just 2 regressions within the gcc.dg testsuite, because the
matched patterns are now folded earlier, by forward-propagation.  I adjusted
them, and added them to the patch, too.

I did regression-testing for x86_64-unknown-linux-gnu.

ChangeLog

2015-09-08  Kai Tietz  

 * match.pd: Add missing patterns from shorten_compare.
 * c/c-typeck.c (build_binary_op): Discard foldings of shorten_compare.
 * cp/typeck.c (cp_build_binary_op): Likewise.

2015-09-08  Kai Tietz  

 * gcc.dg/tree-ssa/vrp23.c: Adjust testcase to reflect that
 pattern is matching now already within forward-propagation pass.
 * gcc.dg/tree-ssa/vrp24.c: Likewise.
So for the new patterns, I would have expected testcases to ensure 
they're getting used.


The fact that we're not regressing with the front-end specific 
shortening disabled like this is probably more of an artifact of lack of 
testing of this feature than anything.


In *theory* one ought to be able to look at the dumps or .s files before 
and after this patch for a blob of tests and see that nothing significant 
has changed.  Unfortunately, so much changes that it's hard to evaluate 
if this patch is a step forward or a step back.


I wonder if we'd do better to first add new match.pd patterns, one at a 
time, with tests, and evaluating them along the way by looking at the 
dumps or .s files across many systems.  Then when we think we've got the 
right set, then look at what happens to those dumps/.s files if we make 
the changes so that shorten_compare really just emits warnings.


My worry is that we get part way through the conversion and end up with 
the match.pd patterns without actually getting shorten_compare cleaned 
up and turned into just a warning generator.


jeff



[gomp4] IFN_UNIQUE

2015-09-14 Thread Nathan Sidwell
I've committed this patch, which replaces IFN_GOACC_FORK and IFN_GOACC_JOIN with 
a single IFN_UNIQUE function.  This makes the fork & join handling slightly more 
tricky, but reduces the invasion into the rest of the compiler -- the unique 
predicate function only need check a single value and is worth making inline.
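
The idea in miniature: one "unique" function code plus a discriminating first argument. The types below are simplified stand-ins, not the real gimple API, and the numeric value chosen for the join kind is an assumption.

```cpp
#include <cassert>

// Reduced set of internal function codes.
enum internal_fn { IFN_VA_ARG, IFN_UNIQUE, IFN_LAST };

// Discriminator carried as the first argument of an IFN_UNIQUE call.
// The JOIN value is assumed for this sketch.
enum unique_kind
{
  UNIQUE_UNSPEC = 0,
  UNIQUE_OACC_FORK = 1,
  UNIQUE_OACC_JOIN = 2
};

// Stand-in for an internal call statement.
struct call_stmt
{
  internal_fn fn;
  int first_arg;
};

// The generic predicate checks a single value, so it is cheap to inline.
inline bool call_internal_unique_p (const call_stmt &c)
{
  return c.fn == IFN_UNIQUE;
}

// Users that care about a specific kind must look at the argument too;
// this is the "slightly more tricky" part of fork/join handling.
inline bool oacc_fork_p (const call_stmt &c)
{
  return call_internal_unique_p (c) && c.first_arg == UNIQUE_OACC_FORK;
}

inline bool oacc_join_p (const call_stmt &c)
{
  return call_internal_unique_p (c) && c.first_arg == UNIQUE_OACC_JOIN;
}
```

Generic passes only ever need the first predicate, which is what keeps the footprint outside the OpenACC code small.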


nathan
2015-09-14  Nathan Sidwell  

	* gimple.h (gimple_call_internal_unique_p): Make inline.
	* internal-fn.def (UNIQUE): New.
	(GOACC_FORK, GOACC_JOIN): Delete.
	(IFN_UNIQUE_UNSPEC, IFN_UNIQUE_OACC_FORK,
	IFN_UNIQUE_OACC_JOIN): Define.
	* internal-fn.c (gimple_call_internal_unique_p): Delete.
	(expand_UNIQUE): New.  Absorb ...
	(expand_GOACC_FORK, expand_GOACC_JOIN): ... these.  Delete.
	* omp-low.c (lower_oacc_loop_helper): Emit IFN_UNIQUE call.
	(lower_pacc_loop_enter_exit): Adjust.
	(execute_oacc_transform): Check IFN_UNIQUE for fork & join.
	* config/nvptx/nvptx.c (nvptx_xform_fork_join): Adjust arg
	position.

Index: gcc/gimple.h
===
--- gcc/gimple.h	(revision 227759)
+++ gcc/gimple.h	(working copy)
@@ -2691,8 +2691,11 @@ gimple_call_internal_fn (const_gimple gs
 
 /* Return true, if this internal gimple call is unique.  */
 
-extern bool
-gimple_call_internal_unique_p (const_gimple);
+inline bool
+gimple_call_internal_unique_p (const_gimple gs)
+{
+  return gimple_call_internal_fn (gs) == IFN_UNIQUE;
+}
 
 /* If CTRL_ALTERING_P is true, mark GIMPLE_CALL S to be a stmt
that could alter control flow.  */
Index: gcc/internal-fn.def
===
--- gcc/internal-fn.def	(revision 227759)
+++ gcc/internal-fn.def	(working copy)
@@ -65,15 +65,12 @@ DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOV
 DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
 DEF_INTERNAL_FN (GOACC_DATA_END_WITH_ARG, ECF_NOTHROW, ".r")
 
-/* FORK and JOIN mark the points at which partitioned execution is
-   entered or exited.  We arrange for these two function to be
-   unduplicable and uncombinable in order to preserve the SESE CFG
-   property of partitioned loops.  These are non-const functions to prevent
-   optimizations migrating memory accesses across a partition change
-   boundary.  They take a single INTEGER_CST
-   argument and return nothing.  */
-DEF_INTERNAL_FN (GOACC_FORK, ECF_NOTHROW | ECF_LEAF, ".")
-DEF_INTERNAL_FN (GOACC_JOIN, ECF_NOTHROW | ECF_LEAF, ".")
+/* An unduplicable, uncombinable function.  Generally used to preserve
+   a CFG property in the face of jump threading, tail merging or
+   other such optimizations. The optional first argument distinguishes
+   between uses.  Other arguments are as needed for use.  The return
+   type depends on use too. */
+DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW | ECF_LEAF, NULL)
 
 /* DIM_SIZE and DIM_POS return the size of a particular compute
dimension and the executing thread's position within that
@@ -109,3 +106,11 @@ DEF_INTERNAL_FN (GOACC_REDUCTION_INIT, E
 DEF_INTERNAL_FN (GOACC_REDUCTION_FINI, ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOACC_REDUCTION_TEARDOWN, ECF_NOTHROW, NULL)
 
+/* IFN_UNIQUE uses an INTEGER_CST first argument to discriminate use.  */
+#define IFN_UNIQUE_UNSPEC 0
+
+/* FORK and JOIN mark the points at which partitioned execution is
+   entered or exited.  They take an INTEGER_CST argument, indicating
+   the axis of forking or joining and return nothing.  */
+#define IFN_UNIQUE_OACC_FORK 1
+#define IFN_UNIQUE_OACC_JOIN 2
Index: gcc/omp-low.c
===
--- gcc/omp-low.c	(revision 227759)
+++ gcc/omp-low.c	(working copy)
@@ -4862,7 +4862,7 @@ oacc_fake_gang_reduction (omp_context *c
 static unsigned
 lower_oacc_loop_helper (tree clauses, gimple_seq *ilist, gimple_seq *olist,
 			 omp_context *ctx, enum internal_fn f1,
-			 enum internal_fn f2, enum internal_fn fork_join,
+			 enum internal_fn f2, unsigned fork_join,
 			 unsigned loop_dim, unsigned loop_mask,
 			 bool emit_f1)
 {
@@ -4872,7 +4872,8 @@ lower_oacc_loop_helper (tree clauses, gi
 
   lower_oacc_reductions (f1, loop_dim, clauses, ilist, ctx, emit_f1);
   gwv = build_int_cst (unsigned_type_node, loop_dim);
-  call = gimple_build_call_internal (fork_join, 1, gwv);
+  call = gimple_build_call_internal
+(IFN_UNIQUE, 2, build_int_cst (unsigned_type_node, fork_join), gwv);
   gimple_seq_add_stmt (ilist, call);
   lower_oacc_reductions (f2, loop_dim, clauses, ilist, ctx, true);
   loop_mask = loop_mask & ~GOMP_DIM_MASK (loop_dim);
@@ -11199,7 +11200,7 @@ lower_oacc_loop_enter_exit (bool enter_l
 	lower_oacc_loop_helper (clauses, ilist, &oacc_gang_reduction_init,
 ctx, IFN_GOACC_REDUCTION_SETUP,
 IFN_GOACC_REDUCTION_INIT,
-IFN_GOACC_FORK, i, loop_dim_mask,
+IFN_UNIQUE_OACC_FORK, i, loop_dim_mask,
 enter_loop);
 }
   else
@@ -11210,7 +11211,7 @@ lower_oacc_loop_enter_exit (bool enter_l
 	lower_oacc_loop_helper (clauses, ilist, &oacc_gang_reduction_fini,

[PATCH v2] fix TLS support detection for sh targets

2015-09-14 Thread Rich Felker
2015-09-14  Rich Felker  

* gcc/configure.ac: Change target pattern for sh TLS support
test from "sh[34]-*-*" to "sh[123456789lbe]*-*-*".
* gcc/configure: Regenerate.
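
To see what the pattern change means in practice, the following sketch
compares the old and new patterns against a few target triplets.  It assumes
that POSIX fnmatch() with no flags is a close enough model of shell "case"
pattern matching; the helper names and the sample triplets are made up for
illustration.

```cpp
#include <cassert>
#include <fnmatch.h>
#include <string>

// True if TRIPLET matches the shell-style PATTERN (POSIX fnmatch, no flags).
inline bool triplet_matches(const std::string &pattern,
                            const std::string &triplet) {
  return fnmatch(pattern.c_str(), triplet.c_str(), 0) == 0;
}

// Old test: "sh-*-* | sh[34]-*-*".
inline bool old_sh_tls_pattern(const std::string &t) {
  return triplet_matches("sh-*-*", t) || triplet_matches("sh[34]-*-*", t);
}

// New test: "sh-*-* | sh[123456789lbe]*-*-*".
inline bool new_sh_tls_pattern(const std::string &t) {
  return triplet_matches("sh-*-*", t)
         || triplet_matches("sh[123456789lbe]*-*-*", t);
}
```

With these helpers, a triplet such as sh4-unknown-linux-gnu is accepted by
both versions, while one with a suffix after the digit (for example
sh4a-unknown-linux-gnu) is only accepted by the new pattern.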

diff --git a/gcc/configure b/gcc/configure
index 846c996..6fb11a7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -23977,7 +23977,7 @@ foo:.long   25
tls_first_minor=14
tls_as_opt="-m64 -Aesame --fatal-warnings"
;;
-  sh-*-* | sh[34]-*-*)
+  sh-*-* | sh[123456789lbe]*-*-*)
 conftest_s='
.section ".tdata","awT",@progbits
 foo:   .long   25
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 34c43d5..a6e078a 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3325,7 +3325,7 @@ foo:  .long   25
tls_first_minor=14
tls_as_opt="-m64 -Aesame --fatal-warnings"
;;
-  sh-*-* | sh[34]-*-*)
+  sh-*-* | sh[123456789lbe]*-*-*)
 conftest_s='
.section ".tdata","awT",@progbits
 foo:   .long   25


Re: [PATCH v2] fix TLS support detection for sh targets

2015-09-14 Thread Kaz Kojima
Rich Felker  wrote:
> 2015-09-14  Rich Felker  
> 
>   * gcc/configure.ac: Change target pattern for sh TLS support
>   test from "sh[34]-*-*" to "sh[123456789lbe]*-*-*".
>   * gcc/configure: Regenerate.

The patch is OK.  I've committed it after removing gcc/ from
the file names in the gcc/ChangeLog entry.  Double-checked with
sh4-unknown-linux-gnu and i686-pc-linux-gnu builds.
Thanks for the patch!

Regards,
kaz


Re: [PATCH 2/2] shrink-wrap: Rewrite try_shrink_wrapping

2015-09-14 Thread Segher Boessenkool
On Fri, Sep 11, 2015 at 03:40:53PM +0100, Jiong Wang wrote:
> >> A quick check shows > 30% more functions shrink-wrapped during
> >> bootstrapping by the following command:
> >> 
> >> cd $TOP_BUILD ; find . -name "*.pro_and_epilogue" | xargs grep 
> >> "Perform.*shrink" | wc -l
> >
> > Wow, that is a lot!  But this is mostly the testsuite?  Shorter functions
> > can be wrapped a whole lot more often.
> 
> They all come from the gcc source code, not from the testsuite, as my
> bootstrap command is "make BOOT_CFLAGS=-O2 -fdump-rtl-pro_and_epilogue".
> The testsuite itself is not involved in bootstrap.
> 
> And I can confirm I get >30% more functions shrink-wrapped by
> 
> cd $TOP_BUILD/gcc ; grep "Perform.*shrink" *.pro_and_epilogue | wc -l
> 
> This only counts shrink-wrapping performed on the gcc core source code
> during the final stage of bootstrapping. I also did a quick check: the new
> shrink-wrap opportunities come from files like dwarf2out.c, emit-rtl.c,
> tree.c, tree-into-ssa.c etc., so they are valid.

It turns out for powerpc64-linux this particular "benchmark" shows a 7.4%
increase as well, much more than for average code.  Something in GCC code
likes shrink-wrapping a lot, maybe all the checking code.

> I know shrink-wrap is very sensitive to the RTL instruction sequences;
> it looks like your rewrite makes it much more friendly to AArch64 :)

Yes indeed.  I wonder what it is about AArch64 code gen :-)

The old algorithm gave up if any block that would need duplicating could
not in fact be duplicated; the new one does not (it places the prologue
earlier and tries again).  Maybe that explains (some of) it.


Segher


Re: [PATCH 00/22] RFC: Overhaul of diagnostics

2015-09-14 Thread David Malcolm
On Mon, 2015-09-14 at 13:42 -0600, Jeff Law wrote:
> On 09/14/2015 11:43 AM, Bernd Schmidt wrote:
> > It's hard to provide meaningful review under these conditions. My advice
> > would be to resubmit the things that are ready now and can stand on
> > their own so that we can get them out of the way first. Also, gather
> > memory/time information before posting the patches if that seems likely
> > to be important. For example, patch 21 looks quite cool but also
> > potentially expensive, I'd probably want that to be restricted by param
> > to identifiers of a maximum length (for both identifiers being compared).
> I think David is looking for some feedback on some of this stuff. 
> There's clearly some design/implementation issues in those middling 
> patches.  The thought behind showing the later patches is so that folks 
> can generally see where this work is trying to go.

Indeed: my hope was that it would be helpful to see the kinds of
diagnostics I was hoping to be able to print, as that motivates both the
changes to diagnostics_show_locus, and efforts to try to capture and
store range information somehow within our IR.
I can post more screenshots if it will be helpful.

> One of my big worries is the memory consumption.

Yes.  Clearly the implementation I have in patch 12 isn't going to fly;
ideas welcome.   One thing I may try next is to only try to track the
ranges as the trees are constructed, immediately discarding them once
we've done that first level of error-checking... basically to not store
it beyond the frontends (for example to stuff it into c_expr in the C
FE).   That might be a useful compromise: hopefully letting us make a
lot of diagnostics more readable, without bloating the memory
requirements.  That's my hope, anyway :)


> > For the most part I declare myself agnostic as to whether this is an
> > improvement or not, and leave that for others to comment on. I
> > personally prefer single-line errors without much noise.
> I wasn't a fan of rich location diagnostics, carets, etc.  However, now 
> that I'm doing more C++ bits, I'm seeing the utility of this kind of stuff.

FWIW, I've mostly been holding off on adding ranges to the C++ FE in the
hope that the delayed folding branch will get merged soon (since
otherwise it's unclear what to base the changes on); hence I only touched
a few places where token ranges were in use; I didn't attempt tree
ranges.

> > I see lots of unit tests implemented as plugins - have we decided that
> > this is the mechanism we want to use for this kind of thing?
> A lot of the plugin-based testing is stuff that's painful to test 
> end-to-end.  Probably the best way to think of those tests is they're 
> trying to directly test internal state.

Right.  The new plugins allow us to exercise the underlying machinery
unit by unit, and this is good for sanity (in particular, mine).

The unit tests in this patch kit use source code, which is going to be
the case for some tests, and fits neatly into the
gcc.dg/plugin/plugin.exp pattern, but not every test fits this pattern.

By contrast, if we want to e.g. verify that gengtype generates sane
mark&sweep routines, that doesn't necessarily need specific source code.
This latter style of test is what I was thinking of in the other patch
kit I posted here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html
("[PATCH 00/17] RFC: Adding a unit testing framework to gcc").
It's kind of a pain to write a plugin each time we poke at a data
structure,  and run it with an empty source file, so my thinking was to
consolidate those tests that simply exercise internal data structures
into a single unit-test plugin, and run all the tests within it.

In particular, my hope is that this style of test could (a) help us
track down bugs earlier [1] and (b) be dramatically faster: I want us to
be measuring e.g. how many 100s or 1000s of unit tests per second we can
run, rather than having to fork/exec subprocesses for just a few tests
each time.
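
As a rough illustration of the "many tests in one process" idea, here is a
small standalone sketch of a test runner that executes every registered test
in the same process and records how long the whole run took.  It is not the
framework from the patch kit mentioned above; toy_test, toy_results and
run_all are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One in-process test: a name and a body returning true on success.
struct toy_test {
  std::string name;
  std::function<bool()> body;
};

// Summary of a run.
struct toy_results {
  std::size_t passed = 0;
  std::size_t failed = 0;
  double seconds = 0.0;
};

// Run every test in the current process; no fork/exec per test.
inline toy_results run_all(const std::vector<toy_test> &tests) {
  toy_results r;
  const auto start = std::chrono::steady_clock::now();
  for (const auto &t : tests) {
    if (t.body())
      ++r.passed;
    else
      ++r.failed;
  }
  const auto stop = std::chrono::steady_clock::now();
  r.seconds = std::chrono::duration<double>(stop - start).count();
  return r;
}
```

Dividing passed + failed by seconds gives the tests-per-second figure; since
no subprocess is started per test, the per-test overhead is a function call.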

(Though that's probably a different discussion).

Thanks for the comments.  Hope the above sounds sane.
Dave

[1] I *hate* tracking down gengtype bugs; I'm keen to give us direct
test coverage for the code it generates, so we can track down bugs
immediately, rather than with multi-hour gdb sessions...



Re: [C++ Patch] PR 53184 ("Unnecessary anonymous namespace warnings")

2015-09-14 Thread Jason Merrill

On 09/14/2015 06:17 PM, Paolo Carlini wrote:

Hi Florian,

On 09/14/2015 09:41 PM, Florian Weimer wrote:

  This warning is
+enabled by default.
Maybe add a sentence why this is bad?  I can only guess, but I suspect
the reason is this: Such types are necessarily specific to a single
translation unit because any definition in another translation unit
would be an ODR violation, so they can be put into the anonymous
namespace themselves.

As I probably mentioned somewhere, GCC is the only compiler I have at
hand implementing something similar: frankly, I'm not sure how exactly
we want to put it, concisely and neatly at the same time. If you are
willing to prepare something more concrete, I'm sure Jason would be
happy to review it!


Florian's summary is correct.

If a type A depends on a type B with no or internal linkage, defining it 
in multiple translation units would be an ODR violation because the 
meaning of B is different in each translation unit.  If A only appears 
in a single translation unit, the best way to silence the warning is to 
give it internal linkage by putting it in an anonymous namespace as 
well.  The compiler doesn't give this warning for types defined in the 
main .C file, as those are unlikely to have multiple definitions.
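
A minimal sketch of the situation described above, with made-up type names.
When compiled as the main source file, as here, the compiler would not be
expected to warn; the comments describe what would happen if the same text
lived in a header.

```cpp
#include <cassert>

namespace {
// B has internal linkage: every translation unit that includes this text
// gets its own, distinct B.
struct B { int value; };
}

// If this definition were in a header included from several translation
// units, each copy of A would contain a different B, which is the ODR
// problem the warning points at.
struct A { B member; };

namespace {
// The suggested fix when the type is only needed in one translation unit:
// give the enclosing type internal linkage as well.
struct LocalA { B member; };
}

// Helper local to this translation unit, like the type it takes.
static int read_value(const LocalA &a) { return a.member.value; }
```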


Jason



Re: [PATCH] Warn when comparing nonnull arguments to NULL in a function.

2015-09-14 Thread Martin Sebor

+void foo(void *bar) __attribute__((nonnull(1)));
+
+void foo(void *bar) { if (!bar) abort(); } /* { dg-warning "null" "argument ‘bar’ compared to NULL" } */


This looks like a very useful enhancement. Since the change is limited
to build_binary_op in the two front ends I wonder if the warning is also
issued for other expressions? For example, suppose I were to add to
function foo above the following:

 bool is_null = bar;

would GCC issue a warning? The same question goes for other
non-binary expressions, including:

 bar ? f () : g ();

or in C++:

 bool x = static_cast<bool>(bar);

If not, I would think issuing it would make the feature even more
useful (and the diagnostics more consistent).
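
For reference, the forms being discussed can be put side by side in one
small file.  This is only a sketch with invented function names; whether a
given GCC version warns on each form is exactly the open question above, so
no particular diagnostics are claimed here.

```cpp
#include <cassert>
#include <cstdlib>

// The attribute promises that callers never pass a null first argument.
int check_binary(void *bar) __attribute__((nonnull(1)));
int check_conversion(void *bar) __attribute__((nonnull(1)));
int check_conditional(void *bar) __attribute__((nonnull(1)));
int check_cast(void *bar) __attribute__((nonnull(1)));

// Explicit comparison, the form handled via build_binary_op.
int check_binary(void *bar) {
  if (bar == nullptr)
    std::abort();
  return 1;
}

// Implicit conversion to bool in an initializer.
int check_conversion(void *bar) {
  bool is_set = bar;
  return is_set ? 1 : 0;
}

// Pointer used directly as the condition of ?:.
int check_conditional(void *bar) { return bar ? 1 : 0; }

// Explicit cast to bool (C++ only).
int check_cast(void *bar) {
  bool x = static_cast<bool>(bar);
  return x ? 1 : 0;
}
```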

Martin



libgo patch committed: Don't provide ustat on arm64 GNU/Linux

2015-09-14 Thread Ian Lance Taylor
Apparently arm64 GNU/Linux does not provide ustat, and the linker
warns about it when using glibc.  This patch to libgo avoids providing
syscall.Ustat on arm64 GNU/Linux, since it will apparently never work.

Since I was touching Makefile.am I rebuilt with automake 1.11.6, the
new GCC standard.

Bootstrapped and tested on x86_64-unknown-linux-gnu, which admittedly
proves little, but at least syscall.Ustat still works there.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 227758)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1d9d92ab09996d2f7795481d2876a21194502b89
+ae60deadd72b3b29df98cee61deed68f251f0122
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 227696)
+++ libgo/Makefile.am   (working copy)
@@ -1742,6 +1742,17 @@ else
 syscall_lsf_file =
 endif
 
+# GNU/Linux specific ustat support.
+if LIBGO_IS_LINUX
+if LIBGO_IS_ARM64
+syscall_ustat_file =
+else
+syscall_ustat_file = go/syscall/libcall_linux_ustat.go
+endif
+else
+syscall_ustat_file =
+endif
+
 # GNU/Linux specific utimesnano support.
 if LIBGO_IS_LINUX
 syscall_utimesnano_file = go/syscall/libcall_linux_utimesnano.go
@@ -1780,6 +1791,7 @@ go_base_syscall_files = \
$(syscall_uname_file) \
$(syscall_netlink_file) \
$(syscall_lsf_file) \
+   $(syscall_ustat_file) \
$(syscall_utimesnano_file) \
$(GO_LIBCALL_OS_FILE) \
$(GO_LIBCALL_OS_ARCH_FILE) \
Index: libgo/go/syscall/libcall_linux.go
===
--- libgo/go/syscall/libcall_linux.go   (revision 227696)
+++ libgo/go/syscall/libcall_linux.go   (working copy)
@@ -408,6 +408,3 @@ func Unlinkat(dirfd int, path string) (e
 
 //sys  Unshare(flags int) (err error)
 //unshare(flags _C_int) _C_int
-
-//sys  Ustat(dev int, ubuf *Ustat_t) (err error)
-//ustat(dev _dev_t, ubuf *Ustat_t) _C_int
Index: libgo/go/syscall/libcall_linux_ustat.go
===
--- libgo/go/syscall/libcall_linux_ustat.go (revision 0)
+++ libgo/go/syscall/libcall_linux_ustat.go (working copy)
@@ -0,0 +1,11 @@
+// Copyright 2015 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// GNU/Linux library ustat call.
+// This is not supported on some kernels, such as arm64.
+
+package syscall
+
+//sys  Ustat(dev int, ubuf *Ustat_t) (err error)
+//ustat(dev _dev_t, ubuf *Ustat_t) _C_int


Re: [patch] Avoid #ifdef _GLIBCXX_DEBUG in regex_compiler.h

2015-09-14 Thread Tim Shen
On Mon, Sep 7, 2015 at 8:22 AM, Jonathan Wakely  wrote:
> And we could get rid of the _Empty type, because std::bitset<0> is an
> empty type anyway, so if we made _S_cache_size()==0 when _UseCache is
> false then in the current code we could just unconditionally use:
>
>  using _CacheT = std::bitset<_S_cache_size()>;
>
> and for the suggested change to use a padding byte for the _M_is_ready
> flag we could use:
>
>  struct _ExtraMembers
>  : _Flags, bitset<_S_cache_size()>
>  {
>explicit
>_ExtraMembers(bool __is_non_matching)
>: _Flags{__is_non_matching, false}
>{ }
>  };

I'm not fully aware of the context, but I planned to eliminate the cache
size dispatch based on char types by always using size 256 (to be
exact, the number of possible unsigned char values) to cache the
smallest 256 results, regardless of the actual char type. That also
covers the other char types and reduces the code complexity.
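
A rough standalone sketch of that idea, not the libstdc++ implementation:
cached_matcher and toy_class_w are invented names, and the "slow" predicate
stands in for whatever the bracket matcher would otherwise compute.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Caches the answers for the 256 smallest character codes, whatever the
// character type is; larger codes fall through to the slow predicate.
template <typename CharT>
class cached_matcher {
public:
  explicit cached_matcher(std::function<bool(CharT)> slow)
      : slow_(std::move(slow)) {
    for (std::size_t i = 0; i < kCacheSize; ++i)
      cache_[i] = slow_(static_cast<CharT>(i));
  }

  bool operator()(CharT c) const {
    // Index by the unsigned value so that negative char values map into
    // the 0..255 range as well.
    using U = typename std::make_unsigned<CharT>::type;
    const U code = static_cast<U>(c);
    if (code < kCacheSize)
      return cache_[code];
    return slow_(c);
  }

private:
  static constexpr std::size_t kCacheSize = 256;
  std::function<bool(CharT)> slow_;
  std::bitset<kCacheSize> cache_;
};

// Example predicate: ASCII digits plus one character outside the cache.
inline bool toy_class_w(wchar_t c) {
  return (c >= L'0' && c <= L'9') || c == static_cast<wchar_t>(0x03bb);
}
```

With a fixed 256-entry cache the class no longer needs a different cache
size per character type, at the cost of 32 bytes even for wide characters.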

As for #ifdef _GLIBCXX_DEBUG, I think it's fine to delete them, since
they don't seem to catch any useful bugs.


-- 
Regards,
Tim Shen


Re: [PATCH GCC]Look into unnecessary conversion when checking mult_op in get_shiftadd_cost

2015-09-14 Thread Bin.Cheng
On Wed, Sep 2, 2015 at 8:32 PM, Richard Biener
 wrote:
> On Wed, Sep 2, 2015 at 5:50 AM, Bin Cheng  wrote:
>> Hi,
>> When calling get_shiftadd_cost, the mult_op is stripped at caller places.
>> We should look into unnecessary conversion in op1 before checking equality,
>> otherwise it computes wrong shiftadd cost.  This patch picks this small
>> issue up.
>>
>> Bootstrap and test on x86_64 and aarch64 along with other patches.  Is it
>> OK?
>
> Just do STRIP_NOPS (op1) unconditionally?  Thus
>
>   STRIP_NOPS (op1);
>   mult_in_op1 = operand_equal_p (op1, mult, 0);
>
> ok with that change.
Patch committed as suggested.
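
For readers unfamiliar with the idiom, the effect of stripping no-op
conversions before the equality check can be modelled with a toy expression
type.  Everything below is invented for illustration; in particular the
comparison is plain pointer identity, whereas operand_equal_p does a
structural comparison.

```cpp
#include <cassert>

// Toy expression node: a code plus at most one operand.
enum toy_code { TOY_VAR, TOY_MULT, TOY_NOP_CONVERT };

struct toy_expr {
  toy_code code;
  const toy_expr *op0;
};

// Skip any chain of no-op conversions wrapped around an expression.
inline const toy_expr *strip_nops(const toy_expr *e) {
  while (e && e->code == TOY_NOP_CONVERT)
    e = e->op0;
  return e;
}

// Without the strip, an operand of the form (T) mult would not be
// recognised as the multiplication we are looking for.
inline bool mult_in_operand(const toy_expr *op, const toy_expr *mult) {
  return strip_nops(op) == mult;
}
```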

Thanks,
bin


Re: [PATCH PR66388]Add sizetype cand for BIV of smaller type if it's used as index of memory ref

2015-09-14 Thread Bin.Cheng
Just realized that I missed the updated patch before.  Here it is...

Thanks,
bin

On Tue, Sep 8, 2015 at 6:07 PM, Bin.Cheng  wrote:
> On Tue, Sep 8, 2015 at 6:06 PM, Bin.Cheng  wrote:
>> On Wed, Sep 2, 2015 at 10:12 PM, Richard Biener
>>  wrote:
>>> On Wed, Sep 2, 2015 at 5:26 AM, Bin Cheng  wrote:
 Hi,
 This patch is a new approach to fix PR66388.  IVO today computes iv_use
 with an iv_cand which has at least the same type precision as the use.  On
 64bit platforms like AArch64, this results in a different iv_cand being
 created for each address type iv_use, and register pressure increases.  As
 a matter of fact, the BIV should be used for all iv_uses in some of these
 cases.  It is a latent bug, but it has recently got worse because of the
 overflow changes.

 The original approach at
 https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01484.html can fix the issue
 except that it conflicts with IV elimination.  It seems to me impossible
 to reconcile the two.

 This new approach fixes the issue by adding sizetype iv_cand for BIVs
 directly.  In cases where the original BIV is preferred, the sizetype
 iv_cand will be chosen.  As for code generation, the sizetype iv_cand has
 the same effect as the original BIV.  Actually, it's better because the BIV
 needs to be explicitly extended to sizetype to be used in an address
 expression on most targets.

 One shortcoming of this approach is that it may introduce more iv
 candidates.  To minimize the impact, this patch does sophisticated code
 analysis and adds a sizetype candidate for a BIV only if it is used as an
 index.  Moreover, it avoids adding a candidate of the original type if the
 BIV is only used as an index.  Statistics for compiling spec2k6 show that
 the increase in the number of candidates is modest and can be ignored.

 There are two more patches following to fix corner cases revealed by this
 one.  Together they bring an obvious perf improvement for spec2k6/int on
 aarch64.
 Spec2k6/int
 400.perlbench    3.44%
 445.gobmk       -0.86%
 456.hmmer       14.83%
 458.sjeng        2.49%
 462.libquantum  -0.79%
 GEOMEAN          1.68%

 There is also about 0.36% improvement for spec2k6/fp, mostly because of
 case 436.cactusADM.  I believe it can be further improved, but that should
 be another patch.

 I also collected benchmark data for x86_64.  Spec2k6/fp is not affected.
 As for spec2k6/int, though the geomean is improved slightly, 400.perlbench
 is regressed by ~3%.  I can see BIVs are chosen for some loops instead of
 address candidates.  Generally, the loop header will be simplified because
 iv elimination with BIV is simpler; the number of instructions in loop body
 isn't changed.  I suspect the regression comes from different addressing
 modes.  With BIV, complex addressing mode like [base + index << scale +
 disp] is used, rather than [base + disp].  I guess the former has more
 micro-ops, thus more expensive.  This guess can be confirmed by manually
 suppressing the complex addressing mode with higher address cost.
 Now the problem becomes why overall cost of BIV is computed lower while the
 actual cost is higher.  I noticed for most affected loops, loop header is
 bloated because of iv elimination using the old address candidate.  The
 bloated loop header results in much higher cost than BIV.  As a result, BIV
 is preferred.  I also noticed the bloated loop header generally can be
 simplified (I have a following patch for this).  After applying the local
 patch, the old address candidate is chosen, and most of regression is
 recovered.
 In conclusion, I think the bloated loop header should be blamed for the
 regression, and it can be resolved.

 Bootstrap and test on x86_64 and aarch64.  It fixes failure of
 gcc.target/i386/pr49781-1.c, without new breakage.

 So what do you think?
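
As a source-level illustration of the kind of loop being discussed (not
taken from the PR or from the patch), the two functions below compute the
same sum.  In the first the induction variable is a 32-bit int that is only
used as an array index; the second spells out the sizetype-like candidate by
hand.  The function names are invented.

```cpp
#include <cassert>
#include <cstddef>

// 32-bit induction variable used only as an index.  On an LP64 target the
// address computation for a[i] conceptually needs i widened to 64 bits.
inline long sum_with_int_index(const int *a, int n) {
  long s = 0;
  for (int i = 0; i < n; ++i)
    s += a[i];
  return s;
}

// The same loop with an index that already has the width of an address,
// roughly what a sizetype candidate for the BIV amounts to.
inline long sum_with_size_index(const int *a, int n) {
  long s = 0;
  const std::size_t count = n > 0 ? static_cast<std::size_t>(n) : 0;
  for (std::size_t i = 0; i < count; ++i)
    s += a[i];
  return s;
}
```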
>>>
>>> The data above looks ok to me.
>>>
>>> +static struct iv *
>>> +find_deriving_biv_for_iv (struct ivopts_data *data, struct iv *iv)
>>> +{
>>> +  aff_tree aff;
>>> +  struct expand_data exp_data;
>>> +
>>> +  if (!iv->ssa_name || TREE_CODE (iv->ssa_name) != SSA_NAME)
>>> +return iv;
>>> +
>>> +  /* Expand IV's ssa_name till the deriving biv is found.  */
>>> +  exp_data.data = data;
>>> +  exp_data.biv = NULL;
>>> +  tree_to_aff_combination_expand (iv->ssa_name, TREE_TYPE (iv->ssa_name),
>>> + &aff, &data->name_expansion_cache,
>>> + stop_expand, &exp_data);
>>> +  return exp_data.biv;
>>>
>>> that's actually "abusing" tree_to_aff_combination_expand for simply walking
>>> SSA uses and their defs uses recursively until you hit "stop".  ISTR past
>>> discussion to add a generic walk_ssa_use interface for that.  Not sure if it
>>> materialized with a name I can't remember or 

Re: [PATCH GCC][rework]Improve loop bound info by simplifying conversions in iv base

2015-09-14 Thread Bin.Cheng
Ping.

On Thu, Aug 27, 2015 at 5:41 PM, Bin Cheng  wrote:
> Hi,
> This is a rework for
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02335.html, with review
> comments addressed.  For now, SCEV may compute iv base in the form of
> "(signed T)((unsigned T)base + step)".  This complicates other
> optimizations/analysis depending on SCEV because it's hard to dive into type
> conversions.  This kind of type conversions can be simplified with
> additional range information implied by loop initial conditions.  This patch
> does such simplification.
> With simplified iv base, loop niter analysis can compute more accurate bound
> information since sensible value range can be derived for "base+step".  For
> example, accurate loop bound&may_be_zero information is computed for cases
> added by this patch.
>
> The code is actually moved from loop_exits_before_overflow.  After this
> patch, the corresponding code in loop_exits_before_overflow will never be
> executed, so I removed that part of the code.  The patch also includes some
> code format changes.
>
> Bootstrap and test on x86_64.  Is it OK?
>
> Thanks,
> bin
>
> 2015-08-27  Bin Cheng  
>
> * tree-ssa-loop-niter.c (tree_simplify_using_condition_1): Support
> new parameter.
> (tree_simplify_using_condition): Ditto.
> (simplify_using_initial_conditions): Ditto.
> (loop_exits_before_overflow): Pass new argument to function
> simplify_using_initial_conditions.  Remove case for type conversions
> simplification.
> * tree-ssa-loop-niter.h (simplify_using_initial_conditions): New
> parameter.
> * tree-scalar-evolution.c (simple_iv): Simplify type conversions
> in iv base using loop initial conditions.
>
> gcc/testsuite/ChangeLog
> 2015-08-27  Bin Cheng  
>
> * gcc.dg/tree-ssa/loop-bound-2.c: New test.
> * gcc.dg/tree-ssa/loop-bound-4.c: New test.
> * gcc.dg/tree-ssa/loop-bound-6.c: New test.
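
The simplification described in the quoted message can be checked on a small
example.  The sketch below is plain C++ rather than compiler code, with int
and unsigned standing in for "signed T" and "unsigned T" and with invented
function names.  It shows that when the loop is guarded by base < bound, so
that base + 1 cannot overflow, the form with conversions and the simplified
form give the same value.

```cpp
#include <cassert>
#include <climits>

// The form SCEV may produce: the addition is done in the unsigned type,
// where it cannot overflow, and the result is converted back.
inline int base_with_conversions(int base, int step) {
  return static_cast<int>(static_cast<unsigned>(base)
                          + static_cast<unsigned>(step));
}

// The simplified form.  Only valid when base + step is known not to
// overflow, which is what the loop's initial condition provides.
inline int base_simplified(int base, int step) { return base + step; }

// A loop entered only when base < bound implies base <= INT_MAX - 1, so
// base + 1 is representable and both forms agree.
inline bool forms_agree_under_guard(int base, int bound) {
  if (!(base < bound))
    return true;  // Loop body never runs; nothing to compare.
  return base_with_conversions(base, 1) == base_simplified(base, 1);
}
```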


Possible patch for pr62242 -- follow-up

2015-09-14 Thread Louis Krupp
Would anyone like me to spend some more time on this and perhaps clear up some 
of the TODO items?

(Unlike some of you, I'm retired.  I have time for this.)

Louis

 == == == == == == Forwarded message == == == == == == 
From : Louis Krupp
To : "gcc-patches","fortran"
Date : Wed, 09 Sep 2015 00:25:45 -0700
Subject : Possible patch for pr62242
 == == == == == == Forwarded message == == == == == == 
This was ... interesting. There were a couple of problems that triggered ICEs.

This patch fixes the reported file (I made sure this time) and causes no 
regressions as far as I can tell.

Dominique ... merci de votre patience.

Louis

Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog(revision 227571)
+++ gcc/fortran/ChangeLog(working copy)
@@ -1,3 +1,12 @@
+2015-09-08 Louis Krupp 
+
+PR fortran/62242
+* trans-array.c (get_array_ctor_all_strlen): Don't store length
+tree pointer unless we know it's necessary
+(trans_array_constructor): Create new gfc_charlen instance so
+context-specific length expression isn't shared
+(gfc_add_loop_ss_code): Don't try to convert non-constant length
+
 2015-09-04 Francois-Xavier Coudert 
 
 * intrinsic.h (gfc_simplify_mvbits): Remove.
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c(revision 227571)
+++ gcc/fortran/trans-array.c(working copy)
@@ -1836,7 +1836,9 @@ get_array_ctor_all_strlen (stmtblock_t *block, gfc
 gfc_add_block_to_block (block, &se.pre);
 gfc_add_block_to_block (block, &se.post);
 
- e->ts.u.cl->backend_decl = *len;
+ /* TODO: No test cases failed when the "if (0)" was added.
+ Is there a reason to put this back the way it was? */
+ if (0) e->ts.u.cl->backend_decl = *len;
 }
 }
 
@@ -2226,6 +2228,7 @@ trans_array_constructor (gfc_ss * ss, locus * wher
 if (expr->ts.type == BT_CHARACTER)
 {
 bool const_string;
+ gfc_charlen *new_cl;
 
 /* get_array_ctor_strlen walks the elements of the constructor, if a
  typespec was given, we already know the string length and want the one
@@ -2251,8 +2254,36 @@ trans_array_constructor (gfc_ss * ss, locus * wher
  and not end up here. */
 gcc_assert (ss_info->string_length);
 
- expr->ts.u.cl->backend_decl = ss_info->string_length;
+ /* get_array_ctor_strlen can create a temporary variable in the
+ current context which will be part of string_length. If we share
+ the resulting gfc_charlen structure with a variable in a different
+ declaration context, we could trip the assertion in
+ expand_expr_real_1 when it sees that the temporary has been
+ created in one context and referenced in another:
 
+ if (exp)
+ context = decl_function_context (exp);
+ gcc_assert (!exp
+ || SCOPE_FILE_SCOPE_P (context)
+ || context == current_function_decl
+ || TREE_STATIC (exp)
+ || DECL_EXTERNAL (exp)
+ // ??? C++ creates functions that are not TREE_STATIC.
+ || TREE_CODE (exp) == FUNCTION_DECL);
+
+ So we create a new gfc_charlen structure and link it into what
+ looks like the current namespace.
+
+ TODO: Can we do this only when get_array_ctor_strlen has been
+ called? Does it matter? Are we using the right namespace (and
+ does it matter, as long as the gfc_charlen structure is cleaned
+ up)?
+ */
+
+ new_cl = gfc_new_charlen (gfc_current_ns, expr->ts.u.cl);
+ new_cl->backend_decl = ss_info->string_length;
+ expr->ts.u.cl = new_cl;
+
 type = gfc_get_character_type_len (expr->ts.kind, ss_info->string_length);
 if (const_string)
 type = build_pointer_type (type);
@@ -2589,7 +2620,8 @@ gfc_add_loop_ss_code (gfc_loopinfo * loop, gfc_ss
  if (expr->ts.type == BT_CHARACTER
  && ss_info->string_length == NULL
  && expr->ts.u.cl
- && expr->ts.u.cl->length)
+ && expr->ts.u.cl->length
+ && expr->ts.u.cl->length->expr_type == EXPR_CONSTANT)
  {
  gfc_init_se (&se, NULL);
  gfc_conv_expr_type (&se, expr->ts.u.cl->length,



string_array_constructor_1.f90
Description: Binary data


string_array_constructor_2.f90
Description: Binary data


string_array_constructor_3.f90
Description: Binary data


[PATCH] Fix endianness assumption in LRA

2015-09-14 Thread David Miller

This was the most glaring case, and would result in LRA crashing
if this code snippet was actually hit on big-endian, since
simplify_gen_subreg() will return NULL in such a case and then
we try to blindly emit a move to 'subreg'.

There is code in match_reload which seems to have a similar problem,
specifically these sequences:

  if (SCALAR_INT_MODE_P (inmode))
new_out_reg = gen_lowpart_SUBREG (outmode, reg);
  else
new_out_reg = gen_rtx_SUBREG (outmode, reg, 0);
 ...
  if (SCALAR_INT_MODE_P (outmode))
new_in_reg = gen_lowpart_SUBREG (inmode, reg);
  else
new_in_reg = gen_rtx_SUBREG (inmode, reg, 0);

But I have not tried to address those cases in this patch.
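
To make the endianness point concrete, here is a small host-side sketch.  It
is a simplification: the real lowpart offset computation in GCC also has to
consider word order, which is ignored here, and all function names below are
invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of the low-order OUTER_SIZE bytes inside a value of
// INNER_SIZE bytes.  Zero only on little-endian layouts.
inline std::size_t lowpart_byte_offset(std::size_t inner_size,
                                       std::size_t outer_size,
                                       bool big_endian) {
  assert(outer_size <= inner_size);
  return big_endian ? inner_size - outer_size : 0;
}

// Detect the byte order of the machine running this sketch.
inline bool host_is_big_endian() {
  const std::uint16_t probe = 0x0102;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 0x01;
}

// Read the low 32 bits of a 64-bit value through its in-memory bytes,
// using the offset appropriate for the host.
inline std::uint32_t read_lowpart(std::uint64_t value) {
  unsigned char bytes[sizeof value];
  std::memcpy(bytes, &value, sizeof value);
  std::uint32_t low;
  const std::size_t off =
      lowpart_byte_offset(sizeof value, sizeof low, host_is_big_endian());
  std::memcpy(&low, bytes + off, sizeof low);
  return low;
}
```

Hard-coding offset 0, as the replaced call did, corresponds to always taking
the little-endian branch of lowpart_byte_offset.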

Vlad, is this OK to commit?

2015-09-14  David S. Miller  

* lra-constraints.c (simplify_operand_subreg): Do not assume that
lowpart of a SUBREG has offset zero.

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index cdb2695..fc8e43d 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -1545,7 +1545,7 @@ simplify_operand_subreg (int nop, machine_mode reg_mode)
  bool insert_before, insert_after;
 
  PUT_MODE (new_reg, mode);
-  subreg = simplify_gen_subreg (innermode, new_reg, mode, 0);
+  subreg = gen_lowpart_SUBREG (innermode, new_reg);
  bitmap_set_bit (&lra_subreg_reload_pseudos, REGNO (new_reg));
 
  insert_before = (type != OP_OUT);


[Ada] Relax assertion in gigi

2015-09-14 Thread Eric Botcazou
The assertion should accept all constructs built by Call_to_gnu.

Tested on x86_64-suse-linux, applied on the mainline.


2015-09-14  Eric Botcazou  

* gcc-interface/utils2.c (gnat_rewrite_reference) : Add
another acceptable pattern for the RHS.

-- 
Eric Botcazou
Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 227729)
+++ gcc-interface/utils2.c	(working copy)
@@ -2807,7 +2807,9 @@ gnat_rewrite_reference (tree ref, rewrit
   gcc_assert (*init == NULL_TREE);
   *init = TREE_OPERAND (ref, 0);
   /* We expect only the pattern built in Call_to_gnu.  */
-  gcc_assert (DECL_P (TREE_OPERAND (ref, 1)));
+  gcc_assert (DECL_P (TREE_OPERAND (ref, 1))
+		  || (TREE_CODE (TREE_OPERAND (ref, 1)) == COMPONENT_REF
+		  && DECL_P (TREE_OPERAND (TREE_OPERAND (ref, 1), 0))));
   return TREE_OPERAND (ref, 1);
 
 case CALL_EXPR:


[Ada] Housekeeping work in gigi

2015-09-14 Thread Eric Botcazou
No functional changes, tested on x86_64-suse-linux, applied on the mainline.


2015-09-14  Eric Botcazou  

* gcc-interface/gigi.h (ref_filename): Delete.
(Sloc_to_locus): Add clean_column parameter defaulting to false.
(build_call_raise): Adjust comment.
(build_call_raise_range): Move around.
* gcc-interface/trans.c (ref_filename): Delete.
(gigi): Fix formatting.
(block_end_locus_sink): Delete.
(Sloc_to_locus1): Tidy up and reformat.  Rename into...
(Sloc_to_locus): ...this.  Add default for clean_column parameter.
(set_expr_location_from_node1): Rename into...
(set_expr_location_from_node): ...this.
(set_end_locus_from_node): Move around.  Adjust for renaming.
(Handled_Sequence_Of_Statements_to_gnu): Likewise.
(add_cleanup): Likewise.
* gcc-interface/utils2.c (expand_sloc): New static function.
(build_call_raise): Call it.
(build_call_raise_column): Likewise.
(build_call_raise_range): Likewise.  Move around.

-- 
Eric Botcazou
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 227729)
+++ gcc-interface/utils.c	(working copy)
@@ -5278,7 +5278,7 @@ builtin_decl_for (tree name)
heavily inspired from the "C" family implementation, with chunks copied
verbatim from there.
 
-   Two obvious TODO candidates are
+   Two obvious improvement candidates are:
o Use a more efficient name/decl mapping scheme
o Devise a middle-end infrastructure to avoid having to copy
  pieces between front-ends.  */
@@ -5627,7 +5627,7 @@ handle_pure_attribute (tree *node, tree
 {
   if (TREE_CODE (*node) == FUNCTION_DECL)
 DECL_PURE_P (*node) = 1;
-  /* ??? TODO: Support types.  */
+  /* TODO: support types.  */
   else
 {
   warning (OPT_Wattributes, "%qs attribute ignored",
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 227729)
+++ gcc-interface/decl.c	(working copy)
@@ -6241,7 +6241,7 @@ elaborate_expression_1 (tree gnu_expr, E
 	 Returning the variable ensures the caller will use it in generated
 	 code.  Note that there is no need for a location if the debug info
 	 contains an integer constant.
-	 FIXME: when the encoding-based debug scheme is dropped, move this
+	 TODO: when the encoding-based debug scheme is dropped, move this
 	 condition to the top-level IF block: we will not need to create a
 	 variable anymore in such cases, then.  */
   if (use_variable || (need_debug && !TREE_CONSTANT (gnu_expr)))
Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 227735)
+++ gcc-interface/utils2.c	(working copy)
@@ -1754,25 +1754,58 @@ build_call_n_expr (tree fndecl, int n, .
   return fn;
 }
 
-/* Call a function that raises an exception and pass the line number and file
-   name, if requested.  MSG says which exception function to call.
+/* Expand the SLOC of GNAT_NODE, if present, into tree location information
+   pointed to by FILENAME, LINE and COL.  Fall back to the current location
+   if GNAT_NODE is absent or has no SLOC.  */
 
-   GNAT_NODE is the gnat node conveying the source location for which the
-   error should be signaled, or Empty in which case the error is signaled on
-   the current ref_file_name/input_line.
+static void
+expand_sloc (Node_Id gnat_node, tree *filename, tree *line, tree *col)
+{
+  const char *str;
+  int line_number, column_number;
+
+  if (Debug_Flag_NN || Exception_Locations_Suppressed)
+{
+  str = "";
+  line_number = 0;
+  column_number = 0;
+}
+  else if (Present (gnat_node) && Sloc (gnat_node) != No_Location)
+{
+  str = Get_Name_String
+	(Debug_Source_Name (Get_Source_File_Index (Sloc (gnat_node))));
+  line_number = Get_Logical_Line_Number (Sloc (gnat_node));
+  column_number = Get_Column_Number (Sloc (gnat_node));
+}
+  else
+{
+  str = lbasename (LOCATION_FILE (input_location));
+  line_number = LOCATION_LINE (input_location);
+  column_number = LOCATION_COLUMN (input_location);
+}
+
+  const int len = strlen (str);
+  *filename = build_string (len, str);
+  TREE_TYPE (*filename) = build_array_type (unsigned_char_type_node,
+	build_index_type (size_int (len)));
+  *line = build_int_cst (NULL_TREE, line_number);
+  if (col)
+*col = build_int_cst (NULL_TREE, column_number);
+}
 
-   KIND says which kind of exception this is for
-   (N_Raise_{Constraint,Storage,Program}_Error).  */
+/* Build a call to a function that raises an exception and passes file name
+   and line number, if requested.  MSG says which exception function to call.
+   GNAT_NODE is the node conveying the source location for which the error
+   should be signaled, or Empty in which case the error is signaled for the
+   current

Re: [PATCH] PR28901 -Wunused-variable ignores unused const initialised variables

2015-09-14 Thread Bernd Schmidt

On 09/13/2015 08:24 PM, Mark Wielaard wrote:

commit 97505bd0e4ac15d86c2a302cfebc5f1a4fc2c2e8
Author: Mark Wielaard
Date:   Fri Sep 11 23:54:15 2015 +0200

 PR28901 -Wunused-variable ignores unused const initialised variables in C


This is ok.


Bernd


[Ada] Issue a warning for -gstabs

2015-09-14 Thread Eric Botcazou
We are phasing out the GNAT encoding scheme in the debugging information so 
STABS is considered obsolete for Ada.

Tested on x86_64-suse-linux, applied on the mainline.


2015-09-14  Pierre-Marie de Rodat  

* gcc-interface/misc.c (gnat_post_options): Issue a warning if
generating STABS debugging information when not the default.

-- 
Eric Botcazou
Index: gcc-interface/misc.c
===
--- gcc-interface/misc.c	(revision 227736)
+++ gcc-interface/misc.c	(working copy)
@@ -268,6 +268,13 @@ gnat_post_options (const char **pfilenam
   if (!global_options_set.x_flag_diagnostics_show_caret)
 global_dc->show_caret = false;
 
+  /* Warn only if STABS is not the default: we don't want to emit a warning if
+ the user did not use a -gstabs option.  */
+  if (PREFERRED_DEBUGGING_TYPE != DBX_DEBUG && write_symbols == DBX_DEBUG)
+warning (0, "STABS debugging information for Ada is obsolete and not "
+		"supported anymore");
+
+  /* Copy global settings to local versions.  */
   optimize = global_options.x_optimize;
   optimize_size = global_options.x_optimize_size;
   flag_compare_debug = global_options.x_flag_compare_debug;


Re: [PATCH] v2 shrink-wrap: Rewrite

2015-09-14 Thread Bernd Schmidt

* shrink-wrap.c (requires_stack_frame_p): Fix formatting.
(dup_block_and_redirect): Delete function.
(can_dup_for_shrink_wrapping): New function.
(fix_fake_fallthrough_edge): New function.
(try_shrink_wrapping): Rewrite function.
(convert_to_simple_return): Call fix_fake_fallthrough_edge.


Ok. Thanks!


Bernd


Re: [gomp4] SESE region neutering

2015-09-14 Thread Bernd Schmidt

On 09/11/2015 11:49 PM, Nathan Sidwell wrote:

This patch implements that optimization.  As the comment at the head of
the code says, we first find 'cycle-equivalent' BBs.  These are ones
that we determine are in the same (set of) loops, in the closed graph.
Such equivalent BBs form the entry and exit BBs of an SESE region.  Once
we've found these, we need to find the ones that cover the most of the
graph -- and delete the ones that are consumed by the larger areas.
This is done by a coloring algorithm executed as a DFS walk.  One of the
properties of SESE regions is that they are always strictly nested --
they never partially overlap.  That property is used by the coloring
algorithm.
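
The nesting property can be illustrated in isolation.  The sketch below is
not the coloring algorithm from the patch: it assumes each candidate region
is already known as a pair of DFS numbers (the sketch_region type and the
function name are hypothetical) and only drops every region that lies
inside a larger one.  That is well defined precisely because SESE regions
never partially overlap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region description: DFS numbers of the entry and exit
   blocks.  This is an assumption made for the sketch, not the data
   structure used by the patch.  */
struct sketch_region
{
  int entry;
  int exit;
};

/* Set KEEP[i] to false for every region that is strictly contained in
   another one, and return the number of regions kept.  Quadratic on
   purpose: the point is the containment test, not the efficiency.  */
static size_t
sketch_keep_outermost (const struct sketch_region *r, size_t n, bool *keep)
{
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    {
      keep[i] = true;
      for (size_t j = 0; j < n; j++)
	if (j != i
	    && r[j].entry <= r[i].entry
	    && r[i].exit <= r[j].exit
	    && (r[j].entry < r[i].entry || r[i].exit < r[j].exit))
	  {
	    /* Region i is consumed by the larger region j.  */
	    keep[i] = false;
	    break;
	  }
      if (keep[i])
	kept++;
    }

  return kept;
}
```

With regions (1,10), (2,5), (6,9) and (11,12), only the first and the last
survive.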


This looks like it could potentially go into sese.c instead.


Bernd


Re: [PATCH][wwwdocs][AArch64] Add entry for target attributes and pragmas

2015-09-14 Thread Kyrill Tkachov


On 07/09/15 13:16, Kyrill Tkachov wrote:

Hi Gerald,

On 07/09/15 12:31, Gerald Pfeifer wrote:

On Wed, 2 Sep 2015, Kyrill Tkachov wrote:

My thinking was that when we introduce some new command-line option we
list it here and give a short description of it (new -mcpu values, for
example). However, here we introduce about 10 new target attributes and
pragmas and listing them all would make this entry too long for my
liking so as a shorthand for listing them all I chose to point to the
documentation.

Unless you feel strongly against this reasoning I'd like to commit the
patch as is within 48 hours.

I can follow your reasoning, and anyway the 48 hours are way over ;-),
just have you considered adding a reference to the documentation (as a
hyperlink to the respective section, if there is a good one, such as
https://gcc.gnu.org/onlinedocs/gcc/ARM-Pragmas.html#ARM-Pragmas )?

Good idea, I'll send a patch to mention the link.
The relevant one is:
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes


And here is the patch.
I'll apply it in 48 hours if there's no objection.

Thanks,
Kyrill

Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.26
diff -U 3 -r1.26 changes.html
--- htdocs/gcc-6/changes.html	4 Sep 2015 09:33:28 -	1.26
+++ htdocs/gcc-6/changes.html	7 Sep 2015 17:01:08 -
@@ -140,7 +140,8 @@
  
  
The AArch64 port now supports target attributes and pragmas.  Please
-   refer to the documentation for details of available attributes and
+   refer to the <a href="https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes">
+   documentation</a> for details of available attributes and
pragmas as well as usage instructions.
  
  


[patch] Bump size of stack checking protection area

2015-09-14 Thread Eric Botcazou
Hi,

as documented, STACK_CHECK_PROTECT is supposed to be an "estimate of the 
amount of stack required to propagate an exception".  It's (mainly) for Ada 
and it needs to distinguish the various EH schemes, which might have different 
needs.  While the current setting is OK for the front-end SJLJ scheme used up 
to now in Ada, it's not sufficient for the middle-end SJLJ scheme that we are 
experimenting with; you need 8K on some platforms to pass the ACATS testsuite.
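
The selection made by the new defaults in the patch below can be read as a
three-way choice.  The following is only a sketch of that choice as plain
functions: the function names and the boolean parameters are hypothetical
stand-ins for global_options.x_flag_exceptions and for the comparison of
except_unwind_info against UI_SJLJ, and units_per_word stands in for
UNITS_PER_WORD.

```c
#include <stdbool.h>

/* Sketch of the new STACK_CHECK_PROTECT default, in bytes.  */
static int
sketch_stack_check_protect (bool exceptions_enabled, bool sjlj_unwinding)
{
  if (!exceptions_enabled)
    return 4 * 1024;
  return sjlj_unwinding ? 8 * 1024 : 12 * 1024;
}

/* Sketch of the new STACK_OLD_CHECK_PROTECT default, in bytes.  */
static int
sketch_stack_old_check_protect (bool exceptions_enabled, bool sjlj_unwinding,
				int units_per_word)
{
  if (!exceptions_enabled)
    return 75 * units_per_word;
  return sjlj_unwinding ? 4 * 1024 : 8 * 1024;
}
```

The small values are now tied to -fno-exceptions rather than to SJLJ
unwinding, which is what lets the SJLJ case be bumped independently.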

Tested on x86_64-suse-linux, OK for the mainline?


2015-09-14  Eric Botcazou  

* defaults.h (STACK_OLD_CHECK_PROTECT): Adjust for -fno-exceptions.
Bump to 4K for SJLJ exceptions.
(STACK_CHECK_PROTECT): Likewise.  Bump to 8K for SJLJ exceptions.

-- 
Eric Botcazou
Index: defaults.h
===
--- defaults.h	(revision 227729)
+++ defaults.h	(working copy)
@@ -1406,9 +1406,11 @@ see the files COPYING3 and COPYING.RUNTI
 #define STACK_OLD_CHECK_PROTECT STACK_CHECK_PROTECT
 #else
 #define STACK_OLD_CHECK_PROTECT		\
- (targetm_common.except_unwind_info (&global_options) == UI_SJLJ	\
+ (!global_options.x_flag_exceptions	\
   ? 75 * UNITS_PER_WORD			\
-  : 8 * 1024)
+  : targetm_common.except_unwind_info (&global_options) == UI_SJLJ	\
+? 4 * 1024\
+: 8 * 1024)
 #endif
 
 /* Minimum amount of stack required to recover from an anticipated stack
@@ -1416,9 +1418,11 @@ see the files COPYING3 and COPYING.RUNTI
of stack required to propagate an exception.  */
 #ifndef STACK_CHECK_PROTECT
 #define STACK_CHECK_PROTECT		\
- (targetm_common.except_unwind_info (&global_options) == UI_SJLJ	\
-  ? 75 * UNITS_PER_WORD			\
-  : 12 * 1024)
+ (!global_options.x_flag_exceptions	\
+  ? 4 * 1024\
+  : targetm_common.except_unwind_info (&global_options) == UI_SJLJ	\
+? 8 * 1024\
+: 12 * 1024)
 #endif
 
 /* Make the maximum frame size be the largest we can and still only need

Re: [PATCH][wwwdocs][AArch64] Add entry for target attributes and pragmas

2015-09-14 Thread Gerald Pfeifer
On Mon, 14 Sep 2015, Kyrill Tkachov wrote:
> And here is the patch.

Thanks.

> I'll apply it in 48 hours if there's no objection.

No need to wait. ;-)

Gerald


Re: [PATCH] PR67401: Fix wrong code generated by expand_atomic_compare_and_swap

2015-09-14 Thread Bernd Schmidt

On 09/11/2015 05:15 PM, John David Anglin wrote:

On 2015-09-11 4:15 AM, Bernd Schmidt wrote:

On 09/11/2015 01:21 AM, John David Anglin wrote:

As noted in the PR, expand_atomic_compare_and_swap can generate wrong
code when libcalls are emitted
for the sync_compare_and_swap and the result comparison test. This is
fixed by emitting a move insn to copy
the result rtx of the sync_compare_and_swap libcall to target_oval
instead of directly assigning it.

Could you provide relevant parts of the rtl dumps or (preferably) the
patch you are using to enable the libcall?


This can be duplicated with a cross to hppa-unknown-linux-gnu with the
following change to enable the libcall:


Ok, thanks. This patch is ok.


Bernd


Re: [PATCH][20/n] Remove GENERIC stmt combining from SCCVN

2015-09-14 Thread Richard Biener
On Sat, 12 Sep 2015, Eric Botcazou wrote:

> > * fold-const.c (fold_binary_loc): Move simplifying of comparisons
> > against the highest or lowest possible integer ...
> > * match.pd: ... as patterns here.
> 
> This incorrectly dropped the calls to omit_one_operand_loc, resulting in the 
> failure of the attached Ada test: if the operand has side effects, you cannot 
> replace the entire comparison with just 'true' or 'false'.

Still trying to reproduce, but I suppose you hit

 /* Comparisons with the highest or lowest possible integer of
the specified precision will have known values.  */
 (simplify
  (cmp (convert?@2 @0) INTEGER_CST@1)
  (if ((INTEGRAL_TYPE_P (TREE_TYPE (@1)) || POINTER_TYPE_P (TREE_TYPE 
(@1)))
   && tree_nop_conversion_p (TREE_TYPE (@2), TREE_TYPE (@0)))
   (with
{
  tree arg1_type = TREE_TYPE (@1);
  unsigned int prec = TYPE_PRECISION (arg1_type);
  wide_int max = wi::max_value (arg1_type);
  wide_int signed_max = wi::max_value (prec, SIGNED);
  wide_int min = wi::min_value (arg1_type);
}
(switch
 (if (wi::eq_p (@1, max))
  (switch
   (if (cmp == GT_EXPR)
{ constant_boolean_node (false, type); })
   (if (cmp == GE_EXPR)
(eq @2 @1))
   (if (cmp == LE_EXPR)
{ constant_boolean_node (true, type); })

this which should handle side-effects in @0 just fine:

/* #line 2019 "/space/rguenther/src/svn/trunk/gcc/match.pd" */
  if (cmp == LE_EXPR)
{
  if (dump_file && (dump_flags & TDF_DETAILS)) 
fprintf (dump_file, "Applying pattern match.pd:2020, %s:%d\n", __FILE__, 
__LINE__);
  tree res;
  res =  constant_boolean_node (true, type);
  if (TREE_SIDE_EFFECTS (captures[0]))
res = build2_loc (loc, COMPOUND_EXPR, type, 
fold_ignored_result (captures[0]), res);
  return res;

note that genmatch "inlines" omit_one_operand, so you only see
fold_ignored_result here.

So maybe the issue is with some other pattern or was latent
elsewhere.  I'll have a closer look once I manage to reproduce
the issue.
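
For readers not familiar with omit_one_operand, here is a toy model of what
the generated code above does.  It is a sketch under stated assumptions:
the sketch_expr type and the function name are hypothetical and have
nothing to do with GCC's tree representation; only the idea (wrap the
constant in a compound expression when the dropped operand is flagged as
having side effects) is taken from the snippet.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_code { SKETCH_CONSTANT, SKETCH_CALL, SKETCH_COMPOUND };

/* Hypothetical expression node, for illustration only.  */
struct sketch_expr
{
  enum sketch_code code;
  bool side_effects;
  struct sketch_expr *op0;
  struct sketch_expr *op1;
};

/* Replace a comparison on OPERAND by the constant CST.  If OPERAND is
   flagged as having side effects, keep it alive by building
   "OPERAND, CST" in the caller-provided COMPOUND node; otherwise return
   CST alone and drop OPERAND.  */
static struct sketch_expr *
sketch_fold_to_constant (struct sketch_expr *operand, struct sketch_expr *cst,
			 struct sketch_expr *compound)
{
  if (!operand->side_effects)
    return cst;

  compound->code = SKETCH_COMPOUND;
  compound->side_effects = true;
  compound->op0 = operand;
  compound->op1 = cst;
  return compound;
}
```

Note that the decision relies entirely on the flag of the operand itself:
an operand whose flag is clear is dropped without looking at its own
sub-expressions.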

Richard.

> 
>   * gnat.dg/overflow_sum3.adb: New test.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH][PR67476] Add param parloops-schedule

2015-09-14 Thread Bernd Schmidt

On 09/11/2015 03:28 PM, Tom de Vries wrote:


This patch adds handling of a DEFPARAMENUM macro, which is similar to
the DEFPARAM macro, but allows the values to be named.

So the definition of param parloops-schedule becomes:
...
DEFPARAMENUM (PARAM_PARLOOPS_SCHEDULE,
  "parloops-schedule",
  "Schedule type of omp schedule for loops parallelized by "
  "parloops (static, dynamic, guided, auto, runtime)",
  0, 0, 4, "static", "dynamic", "guided", "auto", "runtime")


So in principle I like this, but there's one oddity:

+  switch (schedule_type)
+{
+case 0:
+  OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+  break;

The code using the param is using integers rather than enum values. Can 
that be fixed?



...
[ I'll repost the original patch containing this update. ]


I'll let Jakub and/or Richard handle the rest of that. I'm curious why 
this would be a param rather than a -f option.



Bernd


Re: [PATCH][20/n] Remove GENERIC stmt combining from SCCVN

2015-09-14 Thread Richard Biener
On Mon, 14 Sep 2015, Richard Biener wrote:

> On Sat, 12 Sep 2015, Eric Botcazou wrote:
> 
> > >   * fold-const.c (fold_binary_loc): Move simplifying of comparisons
> > >   against the highest or lowest possible integer ...
> > >   * match.pd: ... as patterns here.
> > 
> > This incorrectly dropped the calls to omit_one_operand_loc, resulting in 
> > the 
> > failure of the attached Ada test: if the operand has side effects, you 
> > cannot 
> > replace the entire comparison with just 'true' or 'false'.
> 
> Still trying to reproduce, but I suppose you hit
> 
>  /* Comparisons with the highest or lowest possible integer of
> the specified precision will have known values.  */
>  (simplify
>   (cmp (convert?@2 @0) INTEGER_CST@1)
>   (if ((INTEGRAL_TYPE_P (TREE_TYPE (@1)) || POINTER_TYPE_P (TREE_TYPE 
> (@1)))
>&& tree_nop_conversion_p (TREE_TYPE (@2), TREE_TYPE (@0)))
>(with
> {
>   tree arg1_type = TREE_TYPE (@1);
>   unsigned int prec = TYPE_PRECISION (arg1_type);
>   wide_int max = wi::max_value (arg1_type);
>   wide_int signed_max = wi::max_value (prec, SIGNED);
>   wide_int min = wi::min_value (arg1_type);
> }
> (switch
>  (if (wi::eq_p (@1, max))
>   (switch
>(if (cmp == GT_EXPR)
> { constant_boolean_node (false, type); })
>(if (cmp == GE_EXPR)
> (eq @2 @1))
>(if (cmp == LE_EXPR)
> { constant_boolean_node (true, type); })
> 
> this which should handle side-effects in @0 just fine:
> 
> /* #line 2019 "/space/rguenther/src/svn/trunk/gcc/match.pd" */
>   if (cmp == LE_EXPR)
> {
>   if (dump_file && (dump_flags & TDF_DETAILS)) 
> fprintf (dump_file, "Applying pattern match.pd:2020, %s:%d\n", __FILE__, 
> __LINE__);
>   tree res;
>   res =  constant_boolean_node (true, type);
>   if (TREE_SIDE_EFFECTS (captures[0]))
> res = build2_loc (loc, COMPOUND_EXPR, type, 
> fold_ignored_result (captures[0]), res);
>   return res;
> 
> note that genmatch "inlines" omit_one_operand, so you only see
> fold_ignored_result here.
> 
> So maybe the issue is with some other pattern or was latent
> elsewhere.  I'll have a closer look once I manage to reproduce
> the issue.

Ok, so it's folding

x == 127 ? .gnat_rcheck_CE_Overflow_Check ("overflow_sum3.adb", 14);, 0 : 
(short_short_integer) x + 1

<= 127

where op0 (the COND_EXPR) does not have TREE_SIDE_EFFECTS set but
its operand 1 has:

(gdb) p debug_tree (op0)
 
unit size 
align 8 symtab 0 alias set -1 canonical type 0x76572dc8 
precision 8 min  max  context  RM 
size 
chain >
   
arg 0 
side-effects
...
arg 2 
...

that's unexpected to the code generated by genmatch and I don't see
how omit_one_operand would handle that either.  The COND_EXPR is
originally built with TREE_SIDE_EFFECTS set but:

Hardware watchpoint 7: *$43

Old value = 65595
New value = 59
emit_check (gnu_cond=, 
gnu_expr=, reason=10, gnat_node=2320)
at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
8823  return gnu_result;
$45 = 0

so the Ada frontend resets the flag (improperly?):

emit_check (gnu_cond=, 
gnu_expr=, reason=10, gnat_node=2320)
at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
8823  return gnu_result;
$45 = 0
(gdb) l
8818
8819  /* GNU_RESULT has side effects if and only if GNU_EXPR has:
8820 we don't need to evaluate it just for the check.  */
8821  TREE_SIDE_EFFECTS (gnu_result) = TREE_SIDE_EFFECTS (gnu_expr);
8822
8823  return gnu_result;
8824}


Richard.


Re: [PATCH][20/n] Remove GENERIC stmt combining from SCCVN

2015-09-14 Thread Eric Botcazou
> Still trying to reproduce, but I suppose you hit

The testcase fails as of r227729 on x86-64/Linux.

>  /* Comparisons with the highest or lowest possible integer of
> the specified precision will have known values.  */
>  (simplify
>   (cmp (convert?@2 @0) INTEGER_CST@1)
>   (if ((INTEGRAL_TYPE_P (TREE_TYPE (@1)) || POINTER_TYPE_P (TREE_TYPE
> (@1)))
>&& tree_nop_conversion_p (TREE_TYPE (@2), TREE_TYPE (@0)))
>(with
> {
>   tree arg1_type = TREE_TYPE (@1);
>   unsigned int prec = TYPE_PRECISION (arg1_type);
>   wide_int max = wi::max_value (arg1_type);
>   wide_int signed_max = wi::max_value (prec, SIGNED);
>   wide_int min = wi::min_value (arg1_type);
> }
> (switch
>  (if (wi::eq_p (@1, max))
>   (switch
>(if (cmp == GT_EXPR)
> { constant_boolean_node (false, type); })
>(if (cmp == GE_EXPR)
> (eq @2 @1))
>(if (cmp == LE_EXPR)
> { constant_boolean_node (true, type); })
> 
> this which should handle side-effects in @0 just fine:
> 
> /* #line 2019 "/space/rguenther/src/svn/trunk/gcc/match.pd" */
>   if (cmp == LE_EXPR)
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, "Applying pattern match.pd:2020, %s:%d\n", __FILE__,
> __LINE__);
>   tree res;
>   res =  constant_boolean_node (true, type);
>   if (TREE_SIDE_EFFECTS (captures[0]))
> res = build2_loc (loc, COMPOUND_EXPR, type,
> fold_ignored_result (captures[0]), res);
>   return res;
> 
> note that genmatch "inlines" omit_one_operand, so you only see
> fold_ignored_result here.

I see, then for some reason TREE_SIDE_EFFECTS is not set here.

-- 
Eric Botcazou


Update my email address.

2015-09-14 Thread Bernd Schmidt

Committed.


Bernd
Index: ChangeLog
===
--- ChangeLog	(revision 227737)
+++ ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2015-09-14  Bernd Schmidt  
+
+	* MAINTAINERS: Update my email address.
+
 2015-09-01  James Bowman  
 
 	* MAINTAINERS (ft32 port): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 227737)
+++ MAINTAINERS	(working copy)
@@ -31,7 +31,7 @@ Michael Meissner
 David S. Miller	
 Joseph Myers	
-Bernd Schmidt	
+Bernd Schmidt	
 Ian Lance Taylor
 Jim Wilson	
 
@@ -49,9 +49,9 @@ arm port		Nick Clifton		
 arm port		Ramana Radhakrishnan	
 avr port		Denis Chertykov		
-bfin port		Bernd Schmidt		
+bfin port		Bernd Schmidt		
 bfin port		Jie Zhang		
-c6x port		Bernd Schmidt		
+c6x port		Bernd Schmidt		
 cris port		Hans-Peter Nilsson	
 epiphany port		Joern Rennecke		
 fr30 port		Nick Clifton		
@@ -90,7 +90,7 @@ nds32 port		Chung-Ju Wu		
 nios2 port		Chung-Lin Tang		
 nios2 port		Sandra Loosemore	
-nvptx port		Bernd Schmidt		
+nvptx port		Bernd Schmidt		
 nvptx port		Nathan Sidwell		
 pdp11 port		Paul Koning		
 picochip port		Daniel Towner		
@@ -248,7 +248,7 @@ profile feedback	Jan Hubicka		
 alias analysis		Daniel Berlin		
 reload			Ulrich Weigand		
-reload			Bernd Schmidt		
+reload			Bernd Schmidt		
 dfp.c, related		Ben Elliston		
 RTL optimizers		Eric Botcazou		
 instruction combiner	Segher Boessenkool	


Re: [PATCH][PR67476] Add param parloops-schedule

2015-09-14 Thread Jakub Jelinek
On Fri, Sep 11, 2015 at 03:28:01PM +0200, Tom de Vries wrote:
> On 11/09/15 12:57, Jakub Jelinek wrote:
> >On Fri, Sep 11, 2015 at 12:55:00PM +0200, Tom de Vries wrote:
> >>>Hi,
> >>>
> >>>this patch adds a param parloops-schedule=<0-4>, which sets the omp 
> >>>schedule
> >>>for loops parallelized by parloops.
> >>>
> >>>The <0-4> maps onto .
> >>>
> >>>Bootstrapped and reg-tested on x86_64.
> >>>
> >>>OK for trunk?
> >I don't really like it, the mapping of the integers to the enum values
> >is non-obvious and hard to remember.
> >Perhaps add support for enumeration params if you want this instead?
> >
> 
> This patch adds handling of a DEFPARAMENUM macro, which is similar to the
> DEFPARAM macro, but allows the values to be named.
> 
> So the definition of param parloops-schedule becomes:
> ...
> DEFPARAMENUM (PARAM_PARLOOPS_SCHEDULE,
>  "parloops-schedule",
>  "Schedule type of omp schedule for loops parallelized by "
>  "parloops (static, dynamic, guided, auto, runtime)",
>  0, 0, 4, "static", "dynamic", "guided", "auto", "runtime")
> ...
> [ I'll repost the original patch containing this update. ]
> 
> OK for trunk if x86_64 bootstrap and reg-test succeeds?

That still allows numeric arguments for the param, which is IMHO
undesirable.  If it is enum kind, only the enum values should be accepted.
Also, it would be nice if params.h in that case would define an enum with
the values like
PARAM_PARLOOPS_SCHEDULE_KIND_{static,dynamic,guided,auto,runtime}, so use
values not wrapped in ""s and only in a macro or generator make both
enums and string array out of that.

There is also the question if we can use __VA_ARGS__, isn't that C99 or
C++11 and later feature?  I see gengtype.h and ipa-icf-gimple.h use
that too, so maybe yes, but am not sure.
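
One way to get both the enum and the string array out of a single list
without __VA_ARGS__ is an X-macro.  This is only a sketch of the idea, not
a proposal for params.def: every SKETCH_/sketch_ name is hypothetical, and
only the PARAM_PARLOOPS_SCHEDULE_KIND_ prefix and the five value names come
from the discussion above.

```c
#include <string.h>

/* Single list of the values, without quotes.  Keywords such as static
   and auto are fine here because they are only pasted or stringified.  */
#define SKETCH_SCHEDULE_KINDS(X) \
  X (static) X (dynamic) X (guided) X (auto) X (runtime)

/* Enum generated from the list.  */
enum sketch_schedule_kind
{
#define SKETCH_ENUM(NAME) PARAM_PARLOOPS_SCHEDULE_KIND_##NAME,
  SKETCH_SCHEDULE_KINDS (SKETCH_ENUM)
#undef SKETCH_ENUM
  SKETCH_SCHEDULE_KIND_COUNT
};

/* String array generated from the same list, in the same order.  */
static const char *const sketch_schedule_names[] =
{
#define SKETCH_STRING(NAME) #NAME,
  SKETCH_SCHEDULE_KINDS (SKETCH_STRING)
#undef SKETCH_STRING
};

/* Return the enum value named by ARG, or -1 if ARG is not one of the
   names.  Numeric arguments are rejected because they match no name.  */
static int
sketch_parse_schedule (const char *arg)
{
  for (int i = 0; i < SKETCH_SCHEDULE_KIND_COUNT; i++)
    if (strcmp (arg, sketch_schedule_names[i]) == 0)
      return i;
  return -1;
}
```

The switch in the user of the param could then use
PARAM_PARLOOPS_SCHEDULE_KIND_static and friends instead of bare integers.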

Jakub


Re: [PATCH] Teach genmatch.c to generate single-use restrictions from flags

2015-09-14 Thread Richard Biener
On Fri, 11 Sep 2015, Bernd Schmidt wrote:

> On 07/08/2015 04:39 PM, Richard Biener wrote:
> > 
> > This introduces a :s flag to match expressions which enforces
> > the expression to have a single-use if(!) the simplified
> > expression is larger than one statement.
> 
> This seems to be missing documentation in match-and-simplify.texi.

Fixed as follows, built and inspected .info and .pdf on x86_64-linux,
applied.

Richard.

2015-09-14  Richard Biener  

* doc/match-and-simplify.texi: Fixup some formatting issues
and document the 's' flag.

Index: gcc/doc/match-and-simplify.texi
===
--- gcc/doc/match-and-simplify.texi (revision 227737)
+++ gcc/doc/match-and-simplify.texi (working copy)
@@ -186,20 +186,36 @@ preprocessor directives.
   (bit_and @@1 @@0))
 @end smallexample
 
-Here we introduce flags on match expressions.  There is currently
-a single flag, @code{c}, which denotes that the expression should
+Here we introduce flags on match expressions.  The flag used
+above, @code{c}, denotes that the expression should
 be also matched commutated.  Thus the above match expression
 is really the following four match expressions:
 
+@smallexample
   (bit_and integral_op_p@@0 (bit_ior (bit_not @@0) @@1))
   (bit_and (bit_ior (bit_not @@0) @@1) integral_op_p@@0)
   (bit_and integral_op_p@@0 (bit_ior @@1 (bit_not @@0)))
   (bit_and (bit_ior @@1 (bit_not @@0)) integral_op_p@@0)
+@end smallexample
 
 Usual canonicalizations you know from GENERIC expressions are
 applied before matching, so for example constant operands always
 come second in commutative expressions.
 
+The second supported flag is @code{s} which tells the code
+generator to fail the pattern if the expression marked with
+@code{s} has more than one use.  For example in
+
+@smallexample
+(simplify
+  (pointer_plus (pointer_plus:s @@0 @@1) @@3)
+  (pointer_plus @@0 (plus @@1 @@3)))
+@end smallexample
+
+this avoids the association if @code{(pointer_plus @@0 @@1)} is
+used outside of the matched expression and thus would stay
+live and not be trivially removed by dead code elimination.
+
 More features exist to avoid too much repetition.
 
 @smallexample
@@ -291,17 +307,17 @@ with a @code{?}:
 
 @smallexample
 (simplify
- (eq (convert@@0 @@1) (convert? @@2))
+ (eq (convert@@0 @@1) (convert@? @@2))
  (eq @@1 (convert @@2)))
 @end smallexample
 
 which will match both @code{(eq (convert @@1) (convert @@2))} and
 @code{(eq (convert @@1) @@2)}.  The optional converts are supposed
 to be all either present or not, thus
-@code{(eq (convert? @@1) (convert? @@2))} will result in two
+@code{(eq (convert@? @@1) (convert@? @@2))} will result in two
 patterns only.  If you want to match all four combinations you
 have access to two additional conditional converts as in
-@code{(eq (convert1? @@1) (convert2? @@2))}.
+@code{(eq (convert1@? @@1) (convert2@? @@2))}.
 
 Predicates available from the GCC middle-end need to be made
 available explicitely via @code{define_predicates}:


Re: [PATCH, PR 57195] Allow mode iterators inside angle brackets

2015-09-14 Thread Richard Sandiford
Michael Collison  writes:
> Here is a modified patch that takes your comments into account. Breaking 
> on depth == 0 with '>' does not work due to the code looking for whitespace.

What goes wrong?  Just to make sure we're talking about the same thing,
I meant that in:

   (match_operand:FOO> ...

the name should be "FOO" and you should get an error on ">" when parsing
the text after the name, just like you would for:

   (match_operand:FOO] ...

It's not a big deal though, so...

> 2015-08-25  Michael Collison  
>
>  PR other/57195
>  * read-md.c (read_name): Allow mode iterators inside angle
>  brackets in rtl expressions.

OK, thanks.
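
For reference, the behaviour suggested above (stop the name at a '>' seen
at depth 0, so that the error is reported when parsing what follows) can be
sketched as below.  This is not the code from read-md.c: the function name,
the set of terminating characters and the example name <VQ:V_HALF> are
assumptions made for the illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the leading name of SRC into BUF (capacity CAP, always
   NUL-terminated when CAP > 0) and return the number of characters of
   SRC consumed.  Inside <...> every character belongs to the name; at
   depth 0 the name ends at whitespace, ':', ')', ']' or an unbalanced
   '>'.  */
static size_t
sketch_read_name (const char *src, char *buf, size_t cap)
{
  size_t i = 0, n = 0;
  int depth = 0;

  if (cap == 0)
    return 0;

  for (; src[i] != '\0'; i++)
    {
      char c = src[i];

      if (c == '<')
	depth++;
      else if (c == '>')
	{
	  if (depth == 0)
	    break;		/* Unbalanced '>': not part of the name.  */
	  depth--;
	}
      else if (depth == 0
	       && (isspace ((unsigned char) c)
		   || c == ':' || c == ')' || c == ']'))
	break;

      if (n + 1 < cap)
	buf[n++] = c;
    }

  buf[n] = '\0';
  return i;
}
```

With this, "FOO> ..." yields the name "FOO" and leaves the '>' for the
caller to complain about, while "<VQ:V_HALF> 0" yields "<VQ:V_HALF>".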

Richard



[PATCH] Document match.pd changed if syntax and switch

2015-09-14 Thread Richard Biener

Built .info and .pdf on x86_64-linux, applied.

Richard.

2015-09-14  Richard Biener  

* doc/match-and-simplify.texi: Update for changed syntax
of inner ifs and the new switch expression.

Index: gcc/doc/match-and-simplify.texi
===
--- gcc/doc/match-and-simplify.texi (revision 227739)
+++ gcc/doc/match-and-simplify.texi (working copy)
@@ -118,8 +118,8 @@ be a valid GIMPLE operand (so you cannot
 @smallexample
 (simplify
   (trunc_mod integer_zerop@@0 @@1)
-  (if (!integer_zerop (@@1)))
-  @@0)
+  (if (!integer_zerop (@@1))
+   @@0))
 @end smallexample
 
 Here @code{@@0} captures the first operand of the trunc_mod expression
@@ -130,9 +130,11 @@ can be unconstrained or capture expresio
 This example introduces an optional operand of simplify,
 the if-expression.  This condition is evaluated after the
 expression matched in the IL and is required to evaluate to true
-to enable the replacement expression.  The expression operand
-of the @code{if} is a standard C expression which may contain references
-to captures.
+to enable the replacement expression in the second operand
+position.  The expression operand of the @code{if} is a standard C
+expression which may contain references to captures.  The @code{if}
+has an optional third operand which may contain the replacement
+expression that is enabled when the condition evaluates to false.
 
 A @code{if} expression can be used to specify a common condition
 for multiple simplify patterns, avoiding the need
@@ -149,8 +151,48 @@ to repeat that multiple times:
 (negate @@1)))
 @end smallexample
 
+Note that @code{if}s in outer position do not have the optional
+else clause but instead have multiple then clauses.
+
 Ifs can be nested.
 
+There exists a @code{switch} expression which can be used to
+chain conditions avoiding nesting @code{if}s too much:
+
+@smallexample
+(simplify
+ (simple_comparison @@0 REAL_CST@@1)
+ (switch
+  /* a CMP (-0) -> a CMP 0  */
+  (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@@1)))
+   (cmp @@0 @{ build_real (TREE_TYPE (@@1), dconst0); @}))
+  /* x != NaN is always true, other ops are always false.  */
+  (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@@1))
+   && ! HONOR_SNANS (@@1))
+   @{ constant_boolean_node (cmp == NE_EXPR, type); @})))
+@end smallexample
+
+is equal to
+
+@smallexample
+(simplify
+ (simple_comparison @@0 REAL_CST@@1)
+ (switch
+  /* a CMP (-0) -> a CMP 0  */
+  (if (REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@@1)))
+   (cmp @@0 @{ build_real (TREE_TYPE (@@1), dconst0); @})
+   /* x != NaN is always true, other ops are always false.  */
+   (if (REAL_VALUE_ISNAN (TREE_REAL_CST (@@1))
+&& ! HONOR_SNANS (@@1))
+@{ constant_boolean_node (cmp == NE_EXPR, type); @}))))
+@end smallexample
+
+which has the second @code{if} in the else operand of the first.
+The @code{switch} expression takes @code{if} expressions as
+operands (which may not have else clauses) and as a last operand
+a replacement expression which should be enabled by default if
+no other condition evaluated to true.
+
 Captures can also be used for capturing results of sub-expressions.
 
 @smallexample


Re: [patch] Bump size of stack checking protection area

2015-09-14 Thread Bernd Schmidt

On 09/14/2015 10:23 AM, Eric Botcazou wrote:


as documented, STACK_CHECK_PROTECT is supposed to be an "estimate of the
amount of stack required to propagate an exception".  It's (mainly) for Ada
and it needs to distinguish the various EH schemes, which might have different
needs.  While the current setting is OK for the front-end SJLJ scheme used up
to now in Ada, it's not sufficient for the middle-end SJLJ scheme that we are
experimenting with; you need 8K on some platforms to pass the ACATS testsuite.


So it looks like some targets are at least optionally still using sjlj 
exceptions and would be affected by this change. AFAICT it only makes a 
difference with -fstack-check and would be a bugfix even for those 
targets - correct? Or is there something Ada-specific that makes it 
require more stack?
If it's not Ada-specific, the patch is ok if you also update the tm.texi 
documentation.



Bernd


Re: [PATCH][20/n] Remove GENERIC stmt combining from SCCVN

2015-09-14 Thread Eric Botcazou
> Ok, so it's folding
> 
> x == 127 ? .gnat_rcheck_CE_Overflow_Check ("overflow_sum3.adb", 14);, 0 :
> (short_short_integer) x + 1
> 
> <= 127
> 
> where op0 (the COND_EXPR) does not have TREE_SIDE_EFFECTS set but
> its operand 1 has:
> 
> (gdb) p debug_tree (op0)
>   type  public visited QI
> size 
> unit size 
> align 8 symtab 0 alias set -1 canonical type 0x76572dc8
> precision 8 min  max  0x7656a6c0 127> context  RM
> size 
> chain >
> 
> arg 0  ...
> arg 1  short_short_integer>
> side-effects
> ...
> arg 2  short_short_integer>
> ...
> 
> that's unexpected to the code generated by genmatch and I don't see
> how omit_one_operand would handle that either.

The old code was propagating the comparison inside the arms of COND_EXPR
(fold_binary_op_with_conditional_arg) before applying the transformation:

  if ((short_short_integer) x == 127 ? .gnat_rcheck_CE_Overflow_Check 
("overflow_sum3.adb", 14);, 1 : 1)

The new code does the reverse, but the old behavior can be easily restored:

Index: fold-const.c
===
--- fold-const.c(revision 227729)
+++ fold-const.c(working copy)
@@ -9025,10 +9025,6 @@ fold_binary_loc (location_t loc,
   && tree_swap_operands_p (arg0, arg1, true))
 return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, 
op0);
 
-  tem = generic_simplify (loc, code, type, op0, op1);
-  if (tem)
-return tem;
-
   /* ARG0 is the first operand of EXPR, and ARG1 is the second operand.
 
  First check for cases where an arithmetic operation is applied to a
@@ -9114,6 +9110,10 @@ fold_binary_loc (location_t loc,
}
 }
 
+  tem = generic_simplify (loc, code, type, op0, op1);
+  if (tem)
+return tem;
+
   switch (code)
 {
 case MEM_REF:

is sufficient to fix the regression.
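
To illustrate why the order of the two transformations matters, here is a minimal sketch in plain C rather than GCC trees. The names raise_overflow, compare_outside and compare_inside are hypothetical, and the check returns here (the Ada check would raise instead) only so that the example stays runnable. With the comparison pushed into the arms, both arms fold to a constant but the call in the first arm is still evaluated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the overflow check; it counts its calls so
   that the side effect is observable.  */
static int check_calls;

static int8_t
raise_overflow (void)
{
  check_calls++;
  return 0;
}

/* Comparison applied to the whole conditional, as in
   (x == 127 ? (check, 0) : x + 1) <= 127.  */
static bool
compare_outside (int8_t x)
{
  int8_t v = (x == 127) ? raise_overflow () : (int8_t) (x + 1);
  return v <= 127;
}

/* Comparison propagated into both arms.  Each arm is trivially true for
   an 8-bit value, yet the call in the first arm must be kept.  */
static bool
compare_inside (int8_t x)
{
  return (x == 127) ? (raise_overflow (), true) : true;
}
```

Both functions return true for every x; they only stay equivalent if the call survives the folding of the comparison.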

> The COND_EXPR is originally built with TREE_SIDE_EFFECTS set but:
> 
> Hardware watchpoint 7: *$43
> 
> Old value = 65595
> New value = 59
> emit_check (gnu_cond=,
> gnu_expr=, reason=10, gnat_node=2320)
> at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
> 8823  return gnu_result;
> $45 = 0
> 
> so the Ada frontend resets the flag (improperly?):
> 
> emit_check (gnu_cond=,
> gnu_expr=, reason=10, gnat_node=2320)
> at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
> 8823  return gnu_result;
> $45 = 0
> (gdb) l
> 8818
> 8819  /* GNU_RESULT has side effects if and only if GNU_EXPR has:
> 8820 we don't need to evaluate it just for the check.  */
> 8821  TREE_SIDE_EFFECTS (gnu_result) = TREE_SIDE_EFFECTS (gnu_expr);
> 8822
> 8823  return gnu_result;
> 8824}

That's old code and the comment makes it quite clear why this is done though.

-- 
Eric Botcazou


Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-14 Thread Alan Lawrence

On 11/09/15 14:19, Bill Schmidt wrote:


A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar
that has to be broadcast back to a vector, and the best way to implement
it for us already has the max value in all positions of a vector.  But
that is something we should be able to fix with simplify-rtx in the back
end.


Reading this thread again, this bit stands out as unaddressed. Yes, PowerPC can 
"fix" this with simplify-rtx, but the vector cost model will not take this into 
account - it will think that the broadcast-back-to-a-vector requires an extra 
operation after the reduction, whereas in fact it will not.


Does that suggest we should have a new entry in vect_cost_for_stmt for 
vec_to_scalar-and-back-to-vector (that defaults to vec_to_scalar+scalar_to_vec, 
but on some architectures e.g. PowerPC would be the same as vec_to_scalar)?
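
A rough sketch of what such an entry could look like, as a self-contained toy model and not the actual GCC hook. The names cost_kind, target_costs, round_trip and stmt_cost are assumptions made for illustration; only vec_to_scalar and scalar_to_vec correspond to existing cost kinds.

```c
/* Hypothetical, simplified cost kinds.  VEC_TO_SCALAR_AND_BACK is the
   proposed new entry and does not exist in GCC.  */
enum cost_kind
{
  VEC_TO_SCALAR,
  SCALAR_TO_VEC,
  VEC_TO_SCALAR_AND_BACK
};

/* Per-target cost table (hypothetical).  A negative round_trip means the
   target does not override the combined cost.  */
struct target_costs
{
  int vec_to_scalar;
  int scalar_to_vec;
  int round_trip;
};

static int
stmt_cost (const struct target_costs *t, enum cost_kind kind)
{
  switch (kind)
    {
    case VEC_TO_SCALAR:
      return t->vec_to_scalar;
    case SCALAR_TO_VEC:
      return t->scalar_to_vec;
    case VEC_TO_SCALAR_AND_BACK:
      /* Default: the sum of the two separate operations, unless the
	 target says the reduction already leaves a full vector.  */
      if (t->round_trip >= 0)
	return t->round_trip;
      return t->vec_to_scalar + t->scalar_to_vec;
    }
  return 0;
}
```

A target whose reduction already leaves the result in every lane would set round_trip equal to vec_to_scalar; all other targets would keep the default sum.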


(I agree that if that's the limit of how "different" conditional reductions may 
be between architectures, then we should not have a vect_cost_for_stmt entry for 
a whole conditional reduction.)


Cheers, Alan



Re: [PATCH][20/n] Remove GENERIC stmt combining from SCCVN

2015-09-14 Thread Richard Biener
On Mon, 14 Sep 2015, Eric Botcazou wrote:

> > Ok, so it's folding
> > 
> > x == 127 ? .gnat_rcheck_CE_Overflow_Check ("overflow_sum3.adb", 14);, 0 :
> > (short_short_integer) x + 1
> > 
> > <= 127
> > 
> > where op0 (the COND_EXPR) does not have TREE_SIDE_EFFECTS set but
> > its operand 1 has:
> > 
> > (gdb) p debug_tree (op0)
> >   > type  > public visited QI
> > size 
> > unit size 
> > align 8 symtab 0 alias set -1 canonical type 0x76572dc8
> > precision 8 min  max  > 0x7656a6c0 127> context  RM
> > size 
> > chain >
> > 
> > arg 0  > ...
> > arg 1  > short_short_integer>
> > side-effects
> > ...
> > arg 2  > short_short_integer>
> > ...
> > 
> > that's unexpected to the code generated by genmatch and I don't see
> > how omit_one_operand would handle that either.
> 
> The old code was propagating the comparison inside the arms of COND_EXPR
> (fold_binary_op_with_conditional_arg) before applying the transformation:
> 
>   if ((short_short_integer) x == 127 ? .gnat_rcheck_CE_Overflow_Check 
> ("overflow_sum3.adb", 14);, 1 : 1)
> 
> The new code does the reverse, but the old behavior can be easily restored:
>
> Index: fold-const.c
> ===
> --- fold-const.c(revision 227729)
> +++ fold-const.c(working copy)
> @@ -9025,10 +9025,6 @@ fold_binary_loc (location_t loc,
>&& tree_swap_operands_p (arg0, arg1, true))
>  return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, 
> op0);
>  
> -  tem = generic_simplify (loc, code, type, op0, op1);
> -  if (tem)
> -return tem;
> -
>/* ARG0 is the first operand of EXPR, and ARG1 is the second operand.
>  
>   First check for cases where an arithmetic operation is applied to a
> @@ -9114,6 +9110,10 @@ fold_binary_loc (location_t loc,
> }
>  }
>  
> +  tem = generic_simplify (loc, code, type, op0, op1);
> +  if (tem)
> +return tem;
> +
>switch (code)
>  {
>  case MEM_REF:
> 
> is sufficient to fix the regression.

The newly generated code is better, though, and I can't see why
fold_binary_op_with_conditional_arg should be required for
correctness.  If anything, the "fix" would not be the above but
to move fold_binary_op_with_conditional_arg to match.pd itself.

> > The COND_EXPR is originally built with TREE_SIDE_EFFECTS set but:
> > 
> > Hardware watchpoint 7: *$43
> > 
> > Old value = 65595
> > New value = 59
> > emit_check (gnu_cond=,
> > gnu_expr=, reason=10, gnat_node=2320)
> > at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
> > 8823  return gnu_result;
> > $45 = 0
> > 
> > so the Ada frontend resets the flag (improperly?):
> > 
> > emit_check (gnu_cond=,
> > gnu_expr=, reason=10, gnat_node=2320)
> > at /space/rguenther/src/svn/trunk/gcc/ada/gcc-interface/trans.c:8823
> > 8823  return gnu_result;
> > $45 = 0
> > (gdb) l
> > 8818
> > 8819  /* GNU_RESULT has side effects if and only if GNU_EXPR has:
> > 8820 we don't need to evaluate it just for the check.  */
> > 8821  TREE_SIDE_EFFECTS (gnu_result) = TREE_SIDE_EFFECTS (gnu_expr);
> > 8822
> > 8823  return gnu_result;
> > 8824}
> 
> That's old code and the comment makes it quite clear why this is done though.

Yeah, but then here "we don't need to evaluate it just for the check"
applies - the check is dead code as the outer comparison is always
false.  I think what the code in the Ada frontend tries to achieve
is not actually what it does.  Or the testcase is invalid (or rather
dependent on optimization performed).

Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[patch] Get rid of useless EH cleanups at -O0

2015-09-14 Thread Eric Botcazou
Hi,

this patchlet makes it possible to get rid of useless EH cleanups generated at 
-O0 in Ada for simple constructs involving VLAs:

  declare
S : String (1 .. N);
  begin
...
  end;

by duplicating finally blocks that contain only a stack restore.  Then the EH 
optimization machinery (which is also run at -O0) can remove the cleanups.
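
As a toy model of the decision involved (not the tree-eh.c code itself), the idea is that a finally block is cheap enough to duplicate when it contains nothing that comes from the sources. The enum stmt_kind and the function finally_is_trivial are hypothetical names used only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement classification for the sketch.  */
enum stmt_kind
{
  STMT_DEBUG,
  STMT_CLOBBER,
  STMT_STACK_RESTORE,
  STMT_OTHER
};

/* Return true if every statement of the finally block is a debug
   statement, a clobber or a stack restore, i.e. if duplicating the
   block is considered harmless.  */
static bool
finally_is_trivial (const enum stmt_kind *stmts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (stmts[i] == STMT_OTHER)
      return false;
  return true;
}
```

Once such a block is duplicated, the copy on the EH path contains only the stack restore, which is what allows the cleanup to be removed.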

Tested on x86_64-suse-linux, OK for the mainline?


2015-09-14  Eric Botcazou  

* tree-eh.c (lower_try_finally_dup_block): Clear location information
on stack restore statements.
(decide_copy_try_finally): Do not consider a stack restore statement as
coming from sources.


2015-09-14  Eric Botcazou  

* gnat.dg/array24.adb: New test.

-- 
Eric Botcazou

Index: tree-eh.c
===
--- tree-eh.c	(revision 227729)
+++ tree-eh.c	(working copy)
@@ -915,7 +915,12 @@ lower_try_finally_dup_block (gimple_seq
   for (gsi = gsi_start (new_seq); !gsi_end_p (gsi); gsi_next (&gsi))
 {
   gimple stmt = gsi_stmt (gsi);
-  if (LOCATION_LOCUS (gimple_location (stmt)) == UNKNOWN_LOCATION)
+  /* We duplicate __builtin_stack_restore at -O0 in the hope of eliminating
+	 it on the EH paths.  When it is not eliminated, make it transparent in
+	 the debug info.  */
+  if (gimple_call_builtin_p (stmt, BUILT_IN_STACK_RESTORE))
+	gimple_set_location (stmt, UNKNOWN_LOCATION);
+  else if (LOCATION_LOCUS (gimple_location (stmt)) == UNKNOWN_LOCATION)
 	{
 	  tree block = gimple_block (stmt);
 	  gimple_set_location (stmt, loc);
@@ -1604,8 +1609,12 @@ decide_copy_try_finally (int ndests, boo
 
   for (gsi = gsi_start (finally); !gsi_end_p (gsi); gsi_next (&gsi))
 	{
+	  /* Duplicate __builtin_stack_restore in the hope of eliminating it
+	 on the EH paths and, consequently, useless cleanups.  */
 	  gimple stmt = gsi_stmt (gsi);
-	  if (!is_gimple_debug (stmt) && !gimple_clobber_p (stmt))
+	  if (!is_gimple_debug (stmt)
+	  && !gimple_clobber_p (stmt)
+	  && !gimple_call_builtin_p (stmt, BUILT_IN_STACK_RESTORE))
 	return false;
 	}
   return true;

-- { dg-do compile }
-- { dg-options "-fdump-tree-optimized" }

procedure Array24 (N : Natural) is
  S : String (1 .. N);
  pragma Volatile (S);
begin
  S := (others => '0');
end;

-- { dg-final { scan-tree-dump-not "builtin_unwind_resume" "optimized"  } }