Re: Review Hashtable extract node API

2019-06-17 Thread François Dumont

A small regression noticed while merging.

We shouldn't keep on using a moved-from key_type instance.

Ok to commit ? Feel free to do it if you prefer, I'll do so at end of 
Europe day otherwise.


François


On 6/17/19 12:28 PM, Jonathan Wakely wrote:

On 07/06/19 18:39 +0200, François Dumont wrote:

On 6/5/19 6:22 PM, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

Hi

    Here is a patch to enhance the _Hashtable extract node API and 
fix a FIXME request.


    The enhancement to the extract node Api is that extract(const 
key_type&) do not call extract(const_iterator) anymore. Doing so we 
had to loop again through bucket nodes to find the previous node to 
the one to extract. Even if a bucket shall not contain many nodes 
(in unique key mode) it's easy to avoid it.


Nice.

    To fix the FIXME I introduced a node smart pointer type 
managing the node lifetime. The node is extracted from this smart 
pointer only when there can't be any exception raised. In the 
context of the node extract api the node handle is considered as a 
smart pointer. So the node handle will remain owner of the node in 
case of exception when reinserting it, I hope it is the expected 
behavior.


Yes, that's right.

I was going to suggest just using the node handle type instead of
inventing a new smart pointer, but the handle type uses std::optional
so isn't available for C++11/14.


I considered it too, or even use shared_ptr but thought it was overkill.





    * include/bits/hashtable_policy.h
    (struct _NodeSmartPointer<_NodeAlloc>): New.
    (_Map_base<>::operator[](const key_type&)): Use latter, adapt.
    (_Map_base<>::operator[](key_type&&)): Likewise.
    * include/bits/hashtable.h
    (_Hashtable<>::__node_sp_t): New.
    (_Hashtable<>::_M_insert_unique_node(size_type, __hash_code,
    __node_type*, size_type)): Replace by...
(_Hashtable<>::_M_insert_unique_node<_NodeAccessor>(const key_type&,
    size_type, __hash_code, const _NodeAccessor&, size_type)): 
...that.

    (_Hashtable<>::_M_insert_multi_node(__node_type*, __hash_code,
    __node_type*)): Replace by...
(_Hashtable<>::_M_insert_multi_node<_NodeAccessor>(__node_type*,
    __hash_code, const _NodeAccessor&)): ...that.
    (_Hashtable<>::_M_reinsert_node): Adapt.
    (_Hashtable<>::_M_reinsert_node_multi): Adapt.
    (_Hashtable<>::_M_extract_node(size_t, __node_base*)): New.
    (_Hashtable<>::extract(const_iterator)): Use latter.
    (_Hashtable<>::extract(const _Key&)): Likewise.
    (_Hashtable<>::_M_merge_unique): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Adapt.
(_Hashtable<>::_M_emplace<_Args>(const_iterator, false_type,
    _Args&&...)): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h

index e2e3f016a35..307865b96bf 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -197,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  using __hash_cached = typename __traits_type::__hash_cached;
  using __node_type = __detail::_Hash_node<_Value, 
__hash_cached::value>;

  using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
+  using __node_sp_t = 
__detail::_NodeSmartPointer<__node_alloc_type>;


  using __hashtable_alloc = 
__detail::_Hashtable_alloc<__node_alloc_type>;


@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __node_base*
  _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no 
rehash (assumes
-  // no element with its key already present). Take ownership 
of the node,

-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket 
__bkt if no

+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-    __node_type* __n, size_type __n_elt = 1);
+    _M_insert_unique_node(const key_type& __k, size_type __bkt,
+  __hash_code __code, const _NodeAccessor&,
+  size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the 
node,

-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-   __hash_code __code, __node_type* __n);
+    _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+ const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object t

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-17 Thread Martin Liška
On 6/18/19 3:45 AM, Xiong Hu Luo wrote:

Hello.

Thank you for the interest in the area.

> This patch aims to fix PR69678 caused by PGO indirect call profiling bugs.
> Currently the default instrument function can only find the indirect function
> that called more than 50% with an incorrect count number returned.

Can you please explain what you mean by 'an incorrect count number returned'?

>  This patch
> leverages the "--param indir-call-topn-profile=1" and enables multiple 
> indirect

Note that I've remove indir-call-topn-profile last week, the patch will not 
apply
on current trunk. However, I can help you how to adapt single-value counters
to support tracking of multiple values.

> targets profiling and use in LTO-WPA and LTO-LTRANS stage, as a result, 
> function
> specialization, profiling, partial devirtualization, inlining and cloning 
> could
> be done successfully based on it.

This decision is definitely big question for Honza?

> Performance can get improved 3x (1.7 sec -> 0.4 sec) on simple tests.
> Details are:
>   1.  When do PGO with indir-call-topn-profile, the gcda data format is not
>   supported in ipa-profile pass,

If you take a look at gcc/ipa-profile.c:195 you can see how the probability
is propagated to IPA passes. Why is that not sufficient?

Martin

> so add variables to pass the information
>   through passes, and postpone gimple_ic to ipa-profile like default as inline
>   pass will decide whether it is benefit to transform indirect call.
>   2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
>   profile full support in ipa passes and cgraph_edge functions.
>   3.  Fix various hidden speculative call ICEs exposed after enabling this
>   feature when running SPEC2017.
>   4.  Add 1 in module testcase and 2 cross module testcases.
>   5.  TODOs:
> 5.1.  Some reference info will be dropped from WPA to LTRANS, so
> reference check will be difficult in LTRANS, need replace the strstr
> with reference compare.
> 5.2.  Some duplicate code need be removed as top1 and topn share same 
> logic.
> Actually top1 related logic could be eliminated totally as topn includes 
> it.
> 5.3.  Split patch maybe needed as too big but not sure how many would be
> reasonable.
>   6.  Performance result for ppc64le:
> 6.1.  Representative test: indir-call-prof-topn.c runtime improved from
> 1.7s to 0.4s.
> 6.2.  SPEC2017 peakrate:
> 523.xalancbmk_r (+4.87%); 538.imagick_r (+4.59%); 511.povray_r 
> (+13.33%);
> 525.x264_r (-5.29%).
> No big changes of other benchmarks.
> Option: -Ofast -mcpu=power8
> PASS1_OPTIMIZE: -fprofile-generate --param indir-call-topn-profile=1 
> -flto
> PASS2_OPTIMIZE: -fprofile-use --param indir-call-topn-profile=1 -flto
> -fprofile-correction
> 6.3.  No performance change on PHP benchmark.
>   7.  Bootstrap and regression test passed on Power8-LE.
> 
> gcc/ChangeLog
> 
>   2019-06-17  Xiong Hu Luo  
> 
>   PR ipa/69678
>   * cgraph.c (cgraph_node::get_create): Copy profile_id.
>   (cgraph_edge::speculative_call_info): Find real
>   reference for indirect targets.
>   (cgraph_edge::resolve_speculation): Add speculative code process
>   for indirect targets.
>   (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
>   (cgraph_node::verify_node): Likewise.
>   * cgraph.h (common_target_ids): New variable.
>   (common_target_probabilities): Likewise.
>   (num_of_ics): Likewise.
>   * cgraphclones.c (cgraph_node::create_clone): Copy profile_id.
>   * ipa-inline.c (inline_small_functions): Add iterator update.
>   * ipa-profile.c (ipa_profile_generate_summary): Add indirect
>   multiple targets logic.
>   (ipa_profile): Likewise.
>   * ipa-utils.c (ipa_merge_profiles): Clone speculative src's
>   referrings to dst.
>   * ipa.c (process_references): Fix typo.
>   * lto-cgraph.c (lto_output_edge): Add indirect multiple targets
>   logic.
>   (input_edge): Likewise.
>   * predict.c (dump_prediction): Revome edges count assert to be
>   precise.
>   * tree-profile.c (gimple_gen_ic_profiler): Use the new variable
>   __gcov_indirect_call.counters and __gcov_indirect_call.callee.
>   (gimple_gen_ic_func_profiler): Likewise.
>   (pass_ipa_tree_profile::gate): Fix comment typos.
>   * tree-inline.c (copy_bb): Duplicate all the speculative edges
>   if indirect call contains multiple speculative targets.
>   * value-prof.c (check_counter): Proportion the counter for
>   multiple targets.
>   (ic_transform_topn): New function.
>   (gimple_ic_transform): Handle topn case, fix comment typos.
> 
> gcc/testsuite/ChangeLog
> 
>   2019-06-17  Xiong Hu Luo  
> 
>   PR ipa/69678
>   * gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
>   * gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
>   * 

Re: [PR 90889] Fix IPA-CP topological sorting

2019-06-17 Thread Martin Liška
On 6/17/19 10:36 PM, Martin Jambor wrote:
> Hi,
> 
> r272115 introduced IPA-CP hook to ignore some edges in order to break
> useless SCCs, but the condition in the new hook was a bit too lenient.

Oh, sorry for the breakage.
Thank you Martin for the patch.

Martin

> 
> If we want to break SCCs in which contains some nodes has IPA-CP
> disabled, we must do it at the incoming edges, not at the outgoing, at
> least without making sure that we propagate that these calls have to
> mark callee's lattices as containing variable stuff.
> 
> But breaking SCCs only at the edges which are easier to handle
> is... well.. easier, and so the following patch does that.  The patch
> has restored LTO bootstrap with Ada, C and C++, I am now running LTO
> bootstrap and testing with all languages, OK if it passes?
> 
> Thanks,
> 
> Martin
> 
> 
> 2019-06-17  Martin Jambor  
> 
>   PR ipa/90889
>   * ipa-cp.c (ignore_edge_p): Do not ignore edges when only the
>   caller does not have flag_ipa_cp set.
> ---
>  gcc/ipa-cp.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index a0f6f23829b..d3a88756a91 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -817,7 +817,6 @@ ignore_edge_p (cgraph_edge *e)
>  = e->callee->function_or_virtual_thunk_symbol (&avail, e->caller);
>  
>return (avail <= AVAIL_INTERPOSABLE
> -   || !opt_for_fn (e->caller->decl, flag_ipa_cp)
> || !opt_for_fn (ultimate_target->decl, flag_ipa_cp));
>  }
>  
> 



[PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-17 Thread Xiong Hu Luo
This patch aims to fix PR69678 caused by PGO indirect call profiling bugs.
Currently the default instrument function can only find the indirect function
that called more than 50% with an incorrect count number returned.  This patch
leverages the "--param indir-call-topn-profile=1" and enables multiple indirect
targets profiling and use in LTO-WPA and LTO-LTRANS stage, as a result, function
specialization, profiling, partial devirtualization, inlining and cloning could
be done successfully based on it.
Performance can get improved 3x (1.7 sec -> 0.4 sec) on simple tests.
Details are:
  1.  When do PGO with indir-call-topn-profile, the gcda data format is not
  supported in ipa-profile pass, so add variables to pass the information
  through passes, and postpone gimple_ic to ipa-profile like default as inline
  pass will decide whether it is benefit to transform indirect call.
  2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
  profile full support in ipa passes and cgraph_edge functions.
  3.  Fix various hidden speculative call ICEs exposed after enabling this
  feature when running SPEC2017.
  4.  Add 1 in module testcase and 2 cross module testcases.
  5.  TODOs:
5.1.  Some reference info will be dropped from WPA to LTRANS, so
reference check will be difficult in LTRANS, need replace the strstr
with reference compare.
5.2.  Some duplicate code need be removed as top1 and topn share same logic.
Actually top1 related logic could be eliminated totally as topn includes it.
5.3.  Split patch maybe needed as too big but not sure how many would be
reasonable.
  6.  Performance result for ppc64le:
6.1.  Representative test: indir-call-prof-topn.c runtime improved from
1.7s to 0.4s.
6.2.  SPEC2017 peakrate:
523.xalancbmk_r (+4.87%); 538.imagick_r (+4.59%); 511.povray_r 
(+13.33%);
525.x264_r (-5.29%).
No big changes of other benchmarks.
Option: -Ofast -mcpu=power8
PASS1_OPTIMIZE: -fprofile-generate --param indir-call-topn-profile=1 
-flto
PASS2_OPTIMIZE: -fprofile-use --param indir-call-topn-profile=1 -flto
-fprofile-correction
6.3.  No performance change on PHP benchmark.
  7.  Bootstrap and regression test passed on Power8-LE.

gcc/ChangeLog

2019-06-17  Xiong Hu Luo  

PR ipa/69678
* cgraph.c (cgraph_node::get_create): Copy profile_id.
(cgraph_edge::speculative_call_info): Find real
reference for indirect targets.
(cgraph_edge::resolve_speculation): Add speculative code process
for indirect targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
(cgraph_node::verify_node): Likewise.
* cgraph.h (common_target_ids): New variable.
(common_target_probabilities): Likewise.
(num_of_ics): Likewise.
* cgraphclones.c (cgraph_node::create_clone): Copy profile_id.
* ipa-inline.c (inline_small_functions): Add iterator update.
* ipa-profile.c (ipa_profile_generate_summary): Add indirect
multiple targets logic.
(ipa_profile): Likewise.
* ipa-utils.c (ipa_merge_profiles): Clone speculative src's
referrings to dst.
* ipa.c (process_references): Fix typo.
* lto-cgraph.c (lto_output_edge): Add indirect multiple targets
logic.
(input_edge): Likewise.
* predict.c (dump_prediction): Revome edges count assert to be
precise.
* tree-profile.c (gimple_gen_ic_profiler): Use the new variable
__gcov_indirect_call.counters and __gcov_indirect_call.callee.
(gimple_gen_ic_func_profiler): Likewise.
(pass_ipa_tree_profile::gate): Fix comment typos.
* tree-inline.c (copy_bb): Duplicate all the speculative edges
if indirect call contains multiple speculative targets.
* value-prof.c (check_counter): Proportion the counter for
multiple targets.
(ic_transform_topn): New function.
(gimple_ic_transform): Handle topn case, fix comment typos.

gcc/testsuite/ChangeLog

2019-06-17  Xiong Hu Luo  

PR ipa/69678
* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
---
 gcc/cgraph.c  |  38 +++-
 gcc/cgraph.h  |   9 +-
 gcc/cgraphclones.c|   1 +
 gcc/ipa-inline.c  |   3 +
 gcc/ipa-profile.c | 185 +-
 gcc/ipa-utils.c   |   5 +
 gcc/ipa.c |   2 +-
 gcc/lto-cgraph.c  |  38 
 gcc/predict.c |   1 -
 .../tree-prof/crossmod

Re: [PATCH] Fix PR84521

2019-06-17 Thread Jeff Law
On 6/17/19 6:58 PM, Wilco Dijkstra wrote:
> Hi Jeff,
> 
>> So I like the significant simplification here.  My worry is whether or
>> not this is, in effect, an ABI change.  ie, would we be able to mix and
>> match .o files from before/after this change which used the builtin
>> setjmp/longjmp bits?
> 
> No it's not an ABI change. It does affect the value stored in the setjmp
> record, but that is entirely local to the function containing setjmp.
> Note that the function containing the longjmp is not affected by this
> change.
OK.  Thanks for the clarification.

> 
>> You mention that arm, mips and xtensa are still broken.  Are they worse
>> after this patch?  Presumably for arm/mips the problem is the frame
>> pointer varies based on the thumb/mips and mips/mips16 state?  Is it
>> even valid to longjmp from one mode to the other?
> 
> Yes the only remaining issue after this is the fact that these targets can use
> different frame pointers in functions. The generic code doesn't consider this,
> but any function could be Arm or Thumb, mips or mips16. One solution
> would be to pass the frame pointer in an argument register (that would be
> an ABI change).
ACK.  I think if we want to make this work (arm <-> thumb or mips <->
mips16) then we should take that up separately.

> 
>> I think xtensa has two abis and the frame pointer is different across
>> them.  Presumably a longjmp from one abi to the other isn't valid.
>  
> If it isn't possible to call functions with the other frame pointer then it
> won't be affected.
I'd be surprised given it's two distinct abis as opposed to a processor
mode change like we see in the arm/mips cases.

OK for the trunk.  Thanks for your patience.

jeff


Re: [PATCH] Fix PR84521

2019-06-17 Thread Wilco Dijkstra
Hi Jeff,

> So I like the significant simplification here.  My worry is whether or
> not this is, in effect, an ABI change.  ie, would we be able to mix and
> match .o files from before/after this change which used the builtin
> setjmp/longjmp bits?

No it's not an ABI change. It does affect the value stored in the setjmp
record, but that is entirely local to the function containing setjmp.
Note that the function containing the longjmp is not affected by this
change.

> You mention that arm, mips and xtensa are still broken.  Are they worse
> after this patch?  Presumably for arm/mips the problem is the frame
> pointer varies based on the thumb/mips and mips/mips16 state?  Is it
> even valid to longjmp from one mode to the other?

Yes the only remaining issue after this is the fact that these targets can use
different frame pointers in functions. The generic code doesn't consider this,
but any function could be Arm or Thumb, mips or mips16. One solution
would be to pass the frame pointer in an argument register (that would be
an ABI change).

> I think xtensa has two abis and the frame pointer is different across
> them.  Presumably a longjmp from one abi to the other isn't valid.
 
If it isn't possible to call functions with the other frame pointer then it
won't be affected.

Wilco

[PATCH] PowerPC: Add 'prefix' to the 'isa' attribute

2019-06-17 Thread Michael Meissner
Some of my future patches for prefixed memory instructions in a future PowerPC
processor need the following tweak to the ISA attribute, so that the
alternative that uses a prefixed instruction to load up large integer constants
can be eliminated if the user does not compile for the 'future' target.

I have bootstrapped the compiler and run make check with no regressions.  Can I
check this into the trunk?

2019-06-17  Michael Meissner  

* config/rs6000/rs6000.md (isa attribute): Add support for
prefixed instructions.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 272270)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -267,7 +267,9 @@ (define_attr "cpu"
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9v,p9kf,p9tf" (const_string "any"))
+(define_attr "isa"
+  "any,p5,p6,p7,p7v,p8v,p9v,p9kf,p9tf,prefix"
+  (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
 (define_attr "enabled" ""
@@ -306,6 +308,10 @@ (define_attr "enabled" ""
  (and (eq_attr "isa" "p9tf")
  (match_test "FLOAT128_VECTOR_P (TFmode)"))
  (const_int 1)
+
+ (and (eq_attr "isa" "prefix")
+ (match_test "TARGET_PREFIXED_ADDR"))
+ (const_int 1)
 ] (const_int 0)))
 
 ;; If this instruction is microcoded on the CELL processor

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] Fix PR84521

2019-06-17 Thread Jeff Law
On 5/28/19 11:37 AM, Wilco Dijkstra wrote:
> This fixes and simplifies the setjmp and non-local goto implementation.
> Currently the virtual frame pointer is saved when using __builtin_setjmp or
> a non-local goto.  Depending on whether a frame pointer is used, this may
> either save SP or FP with an immediate offset.  However the goto or longjmp
> always updates the hard frame pointer.
> 
> A receiver veneer in the original function then assigns the hard frame pointer
> to the virtual frame pointer, which should, if it works correctly, again 
> assign
> SP or FP.  However the special elimination code in eliminate_regs_in_insn
> doesn't do this correctly unless the frame pointer is used, and even if it
> worked by writing SP, the frame pointer would still be corrupted.
> 
> A much simpler implementation is to always save and restore the hard frame
> pointer.  This avoids 2 redundant instructions which add/subtract the virtual
> frame offset.  A large amount of code can be removed as a result, including 
> all
> implementations of TARGET_BUILTIN_SETJMP_FRAME_VALUE (all of which already use
> the hard frame pointer).  The expansion of nonlocal_goto on PA can be simplied
> to just restore the hard frame pointer. 
> 
> This fixes the most obvious issues, however there are still issues on targets
> which define HARD_FRAME_POINTER_IS_FRAME_POINTER (arm, mips, xtensa).
> Each function could have a different hard frame pointer, so a non-local goto
> may restore the wrong frame pointer (TARGET_BUILTIN_SETJMP_FRAME_VALUE could
> be useful for this).
> 
> The i386 TARGET_BUILTIN_SETJMP_FRAME_VALUE was incorrect: if stack_realign_fp
> is true, it would save the hard frame pointer value but restore the virtual
> frame pointer which according to ix86_initial_elimination_offset can have a
> non-zero offset from the hard frame pointer.
> 
> The ia64 implementation of nonlocal_goto seems incorrect since the helper
> function moves the the frame pointer value into the static chain register
> (so this patch does nothing to make it better or worse).
> 
> AArch64 bootstrap OK, new test passes on AArch64, x86-64 and Arm.
> 
> ChangeLog:
> 2018-12-13  Wilco Dijkstra  
> 
> gcc/
>   PR middle-end/84521
>   * builtins.c (expand_builtin_setjmp_setup): Save hard_frame_pointer_rtx.
>   (expand_builtin_setjmp_receiver): Do not emit sfp = fp move since we 
> restore fp.
>   * function.c (expand_function_start): Save hard_frame_pointer_rtx for 
> non-local goto.
>   * lra-eliminations.c (eliminate_regs_in_insn): Remove sfp = fp 
> elimination code.
>   (remove_reg_equal_offset_note): Remove unused function.
>   * reload1.c (eliminate_regs_in_insn): Remove sfp = hfp elimination code.
>   * config/arc/arc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
>   (arc_builtin_setjmp_frame_value): Remove function.
>   * config/avr/avr.c  (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
>   (avr_builtin_setjmp_frame_value): Remove function.
>   * config/i386/i386.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
>   (ix86_builtin_setjmp_frame_value): Remove function.
>   * config/pa/pa.md (nonlocal_goto): Remove FP adjustment.
>   * config/sparc/sparc.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
>   (sparc_builtin_setjmp_frame_value): Remove function.
>   * config/vax/vax.c (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Remove.
>   (vax_builtin_setjmp_frame_value): Remove function.
> 
> testsuite/
>   PR middle-end/84521
>   * gcc.c-torture/execute/pr84521.c: New test.
> 
So I like the significant simplification here.  My worry is whether or
not this is, in effect, an ABI change.  ie, would we be able to mix and
match .o files from before/after this change which used the builtin
setjmp/longjmp bits?

You mention that arm, mips and xtensa are still broken.  Are they worse
after this patch?  Presumably for arm/mips the problem is the frame
pointer varies based on the thumb/mips and mips/mips16 state?  Is it
even valid to longjmp from one mode to the other?

I think xtensa has two abis and the frame pointer is different across
them.  Presumably a longjmp from one abi to the other isn't valid.

Or am I missing something?

jeff


Re: [PATCH] PR c/17896 Check for missplaced bitwise op

2019-06-17 Thread Jeff Law
On 5/24/19 6:53 AM, Rafael Tsuha wrote:
> This patch adds a function to warn when there's a bitwise operation
> between a boolean and any other type. This kind of operation is
> probably a programmer mistake that may lead to unexpected behavior
> because possibily the logical operation was intended.
> The test was adapted from PR c/17896.
> 
> gcc/c-family/ChangeLog
> 2019-05-24  Rafael Tsuha 
> 
> * c-warn.c (warn_logical_operator): Check for missplaced bitwise op.
> 
> gcc/testsuite/ChangeLog
> 2019-05-24  Rafael Tsuha 
> 
> PR c/17896
> * gcc.dg/boolean-bitwise.c: New test.
> 
> 
> bitwiseWarning.patch
> 
> Index: gcc/c-family/c-warn.c
> ===
> --- gcc/c-family/c-warn.c (revision 268782)
> +++ gcc/c-family/c-warn.c (working copy)
> @@ -167,10 +167,10 @@
>  }
>  
>  /* Warn about uses of logical || / && operator in a context where it
> -   is likely that the bitwise equivalent was intended by the
> -   programmer.  We have seen an expression in which CODE is a binary
> -   operator used to combine expressions OP_LEFT and OP_RIGHT, which before 
> folding
> -   had CODE_LEFT and CODE_RIGHT, into an expression of type TYPE.  */
> +   is likely that the bitwise equivalent was intended by the programmer or 
> vice
> +   versa.  We have seen an expression in which CODE is a binary operator 
> used to
> +   combine expressions OP_LEFT and OP_RIGHT, which before folding had 
> CODE_LEFT
> +   and CODE_RIGHT, into an expression of type TYPE.  */
>  
>  void
>  warn_logical_operator (location_t location, enum tree_code code, tree type,
> @@ -178,6 +178,7 @@
>  enum tree_code ARG_UNUSED (code_right), tree op_right)
>  {
>int or_op = (code == TRUTH_ORIF_EXPR || code == TRUTH_OR_EXPR);
> +  int and_op = (code == TRUTH_ANDIF_EXPR || code == TRUTH_AND_EXPR);
>int in0_p, in1_p, in_p;
>tree low0, low1, low, high0, high1, high, lhs, rhs, tem;
>bool strict_overflow_p = false;
> @@ -188,7 +189,9 @@
>if (code != TRUTH_ANDIF_EXPR
>&& code != TRUTH_AND_EXPR
>&& code != TRUTH_ORIF_EXPR
> -  && code != TRUTH_OR_EXPR)
> +  && code != TRUTH_OR_EXPR
> +  && code != BIT_AND_EXPR
> +  && code != BIT_IOR_EXPR)
>  return;
>  
>/* We don't want to warn if either operand comes from a macro
> @@ -219,7 +222,7 @@
> if (or_op)
>   warning_at (location, OPT_Wlogical_op, "logical %"
>   " applied to non-boolean constant");
> -   else
> +   else if (and_op)
>   warning_at (location, OPT_Wlogical_op, "logical %"
>   " applied to non-boolean constant");
> TREE_NO_WARNING (op_left) = true;
> @@ -227,6 +230,26 @@
>   }
>  }
>  
> +  /* Warn if &/| are being used in a context where it is
> + likely that the logical equivalent was intended by the
> + programmer. That is, an expression such as op_1 & op_2
> + where op_n should not be any boolean expression. */
> +  if ( TREE_CODE (TREE_TYPE (op_right)) == BOOLEAN_TYPE
> +  || TREE_CODE (TREE_TYPE (op_left)) == BOOLEAN_TYPE 
> +   || COMPARISON_CLASS_P ((op_left))
> +   || COMPARISON_CLASS_P ((op_right)))
> +{
> +  tree folded_op_right = fold_for_warn (op_right);
So you've got an unused variable here.   If you just want the side
effect of calling fold_for_warn, then just call it, but don't assign its
result to a variable.

The existence of an unused variable would have triggered a bootstrap
failure.  This tells me you didn't bootstrap the change.  It's also
likely you didn't regression test the change.

If the bootstrap fails because we have instance of bitwise operation
between booleans, then standard practice would be to fix those too.  You
may want to consider submitting those fixes as a separate patch if
there's many of them (hopefully there's none :-)


jeff


Re: [PATCH][Tree-optimization/PR89772]fold memchr builtins for character not in constant nul-padded string

2019-06-17 Thread Jeff Law
On 5/9/19 2:01 AM, JunMa wrote:
> 在 2019/5/9 上午10:22, JunMa 写道:
>> 在 2019/5/9 上午3:02, Bernd Edlinger 写道:
>>> On 5/8/19 4:31 PM, Richard Biener wrote:
 On Tue, May 7, 2019 at 4:34 AM JunMa  wrote:
> 在 2019/5/6 下午7:58, JunMa 写道:
>> 在 2019/5/6 下午6:02, Richard Biener 写道:
>>> On Thu, Mar 21, 2019 at 5:57 AM JunMa 
>>> wrote:
 Hi
 For now, gcc can not fold code like:

 const char a[5] = "123"
 __builtin_memchr (a, '7', sizeof a)

 It tries to avoid folding out of string length although length of a
 is 5.
 This is a bit conservative, it's safe to folding memchr/bcmp/memcmp
 builtins when constant string stores in array with some trailing
 nuls.

 This patch folds these cases by exposing additional length of
 trailing nuls in c_getstr().
 Bootstrapped/regtested on x86_64-linux, ok for trunk?
>>> It's probably better if gimple_fold_builtin_memchr uses
>>> string_constant
>>> directly instead?
>> We can use string_constant in gimple_fold_builtin_memchr, however
>> it is a
>> bit complex to use it in memchr/memcmp constant folding.
>>> You are changing memcmp constant folding but only have a testcase
>>> for memchr.
>> I'll add later.
>>> If we decide to amend c_getstr I would rather make it return the
>>> total length instead of the number of trailing zeros.
>> I think return the total length is safe in other place as well.
>> I used the argument in patch since I do not want any impact on
>> other part at all.
>>
> Although it is safe to use total length, I found that function
> inline_expand_builtin_string_cmp() which used c_getstr() may emit
> redundant rtls for trailing null chars when total length is returned.
>
> Also it is trivial to handle constant string with trailing null chars.
>
> So this updated patch follow richard's suggestion: using
> string_constant
> directly.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
 Doesn't this fold to NULL for the case where seaching for '0' and it
 doesn't occur in the string constant but only the zero-padding?
So I'm not sure if JunMa addressed this question specifically.  What
happens is we'll get back a NULL terminated string from c_getstr, but
the returned length will include the NUL terminator.  Then we call
memchr on the result with a length that would include that NUL
terminator.  So I'm pretty sure the current patch will DTRT in that case.

It'd be better to have a stronger test which verified not only that the
call was folded away, but what the resultant value was and whether or
not it was the right value.

That would include testing for NUL in the string as well as testing for
NUL in the tail padding.

I'll ack the change to gimple-fold, but please improve the testcase a
bit and commit the change to gimple-fold and the improved testcase together.

Thanks, and sorry for the delays.

jeff


Re: [PATCH] improve ifcvt optimization (PR rtl-optimization/89430)

2019-06-17 Thread Jeff Law
On 3/14/19 10:46 PM, JiangNing OS wrote:
> This patch is to fix a missing ifcvt opportunity in back-end. For the simple 
> case below,
> 
> if (...)
> x = a;  /* x is memory */
> /* no else */
> 
> We can generate conditional move and remove the branch as below if the target 
> cost is acceptable. 
> 
> r1 = x
> r2 = a
> cmp ...
> csel r3, r1, r2, cond
> x = r3
> 
> This could be safe if x is a stack variable, and there isn't any address 
> taken in current function, so the store speculation can be avoided. 
So at a high level should we be doing this in gimple rather than RTL?
We're going to have a lot more information about types, better
infrastructure for looking at uses/defs, access to the alias oracle, we
should be able to accurately distinguish between potentially shared
objects vs those which are local to the thread, etc.  We lose the low
level costing information though.

I'm still going to go through the patch and do some level of review, but
I do think we need to answer the higher level question though.


>  
> Index: ifcvt.c
> ===
> --- ifcvt.c   (revision 269698)
> +++ ifcvt.c   (working copy)
> @@ -2029,6 +2045,93 @@
>return true;
>  }
>  
> +/* Return true if MEM x is on stack. a_insn contains x, if it exists. */
Nits: We typically refer to parameters, variables, etc in comments using
upper case.  You'll need to review the entire patch for these its.

So perhaps the comment should be something like:

/* Return true of X, a MEM expression, is on the stack.  A_INSN contains
   X if A_INSN exists.  */


Just from a design standpoint, what are the consequences if this
function returns true for something that isn't actually in the stack or
false for something that is on the stack?



> +
> +static bool
> +noce_mem_is_on_stack (rtx_insn *a_insn, const_rtx x)
> +{
> +  df_ref use;
> +
> +  gcc_assert (x);
> +  gcc_assert (MEM_P (x));
> +
> +  /* Early exits if find base is a stack register. */
> +  rtx a = XEXP (x, 0);
> +  if (fixed_base_plus_p(a))
> +return true;
Nit: Space between the function name and its open paren for arguments.  ie

if (fixed_base_plus_p (a)
 ^
I see other instances of this nit, please review the patch and correct them.


> +
> +  if (!a_insn)
> +return false;
I'm not sure what the calling profile is for this function, but this is
a cheaper test, so you might consider moving it before the test of
fixed_base_plus_p.


> +
> +  if (!reg_mentioned_p (x, a_insn))
> +return false;
> +
> +  /* Check if x is on stack. Assume a mem expression using registers
> + related to stack register is always on stack. */
> +  FOR_EACH_INSN_USE (use, a_insn)
> +if (reg_mentioned_p (DF_REF_REG (use), x)
> +&& bitmap_bit_p (bba_sets_must_be_sfp, DF_REF_REGNO (use)))
> +  return true;
> +
> +  return false;
> +}
So is X always a MEM?  Just wanted to make sure I understand.
reg_mentioned_p will do the right thing (using rtx_equal_p) for the
comparison.


> +
> +/* Always return true, if there is a dominating write.
> +   
> +   When there is a dominating read from memory on stack,
> +   1) if x = a is a memory read, return true.
> +   2) if x = a is a memory write, return true if the memory is on stack.
> +  This is the guarantee the memory is *not* readonly. */
> +
> +static bool
> +noce_valid_for_dominating (basic_block bb, rtx_insn *a_insn,
> +   const_rtx x, bool is_store)
> +{
> +  rtx_insn *insn;
> +  rtx set;
> +
> +  gcc_assert (MEM_P (x));
> +
> +  FOR_BB_INSNS (bb, insn)
> +{
> +  set = single_set (insn);
> +  if (!set)
> +continue;
> +
> +  /* Dominating store */
> +  if (rtx_equal_p (x, SET_DEST (set)))
> +return true;
> +
> +  /* Dominating load */
> +  if (rtx_equal_p (x, SET_SRC (set)))
> +if (is_store && noce_mem_is_on_stack (a_insn, x))
> +  return true;
> +}
> +
> +  return false;
> +}
So what would be the consequences here of returning false when in fact
there was a dominating read or write?  That could easily happen if the
dominating read or write was not a single_set.

I'm guessing that from a design standpoint you're trying to find cases
where you know an object was written to within the block and does not
escape.  So returning false when there was a dominating write is safe.
Returning true when there was not would be disastrous.  Right?





> @@ -5347,6 +5462,234 @@
>  
>return FALSE;
>  }
> +
> +
> +/* Find all insns that must be stack address and store REGNO into
> +   bitmap bba_sets_must_be_sfp. */
> +
> +static void
> +find_all_must_be_sfp_insns (void)
> +{
> +  rtx_insn *a_insn;
> +  basic_block bb;
> +  bool sfp_found;
> +  auto_vec def_count;
> +  df_ref def, use;
> +
> +  /* Calculate def counts for each insn. */
> +  def_count.safe_grow_cleared (max_reg_num () + 1);
> +  FOR_ALL_BB_FN (bb, cfun)
> +FOR_BB_INSNS (bb, a_insn)
> +  {
> + 

C++ PATCH for c++/61490 - qualified-id in friend function definition

2019-06-17 Thread Marek Polacek
[class.friend]/6 says that when we define a function in a friend declaration,
the function name must be unqualified.  But we never made sure that's so.

For good measure, I'm also improving the location of the related diagnostic.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-06-17  Marek Polacek  

PR c++/61490 - qualified-id in friend function definition.
* decl.c (grokdeclarator): Diagnose qualified-id in friend function
definition.  Improve location for diagnostics of friend functions.

* g++.dg/diagnostic/friend2.C: New test.
* g++.dg/diagnostic/friend3.C: New test.

diff --git gcc/cp/decl.c gcc/cp/decl.c
index 0a3ef452536..efc49137cdc 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -11605,13 +11605,29 @@ grokdeclarator (const cp_declarator *declarator,
friendp = 0;
  }
if (decl_context == NORMAL)
- error ("friend declaration not in class definition");
+ error_at (declarator->id_loc,
+   "friend declaration not in class definition");
if (current_function_decl && funcdef_flag)
  {
-   error ("cannot define friend function %qs in a local "
-  "class definition", name);
+   error_at (declarator->id_loc,
+ "cannot define friend function %qs in a local "
+ "class definition", name);
friendp = 0;
  }
+   /* [class.friend]/6: A function can be defined in a friend
+  declaration if the function name is unqualified.  */
+   if (funcdef_flag && in_namespace)
+ {
+   if (in_namespace == global_namespace)
+ error_at (declarator->id_loc,
+   "friend function definition %qs cannot have "
+   "a name qualified with %<::%>", name);
+   else
+ error_at (declarator->id_loc,
+   "friend function definition %qs cannot have "
+   "a name qualified with %<%D::%>", name,
+   in_namespace);
+ }
  }
else if (ctype && sfk == sfk_conversion)
  {
diff --git gcc/testsuite/g++.dg/diagnostic/friend2.C 
gcc/testsuite/g++.dg/diagnostic/friend2.C
new file mode 100644
index 000..4f4ada8bc16
--- /dev/null
+++ gcc/testsuite/g++.dg/diagnostic/friend2.C
@@ -0,0 +1,10 @@
+// PR c++/61490
+// { dg-do compile }
+
+namespace N { void f (); }
+void f2 ();
+
+struct A {
+  friend void N::f () { } // { dg-error "15:friend function definition 'f' 
cannot have a name qualified with 'N::'" }
+  friend void ::f2 () { } // { dg-error "15:friend function definition 'f2' 
cannot have a name qualified with '::'" }
+};
diff --git gcc/testsuite/g++.dg/diagnostic/friend3.C 
gcc/testsuite/g++.dg/diagnostic/friend3.C
new file mode 100644
index 000..574d7caa5fb
--- /dev/null
+++ gcc/testsuite/g++.dg/diagnostic/friend3.C
@@ -0,0 +1,9 @@
+// { dg-do compile }
+
+void
+fn ()
+{
+  struct S {
+friend void bar () { } // { dg-error "17:cannot define friend function 
'bar' in a local class definition" }
+  };
+}


[PATCH] #pragma omp scan inclusive vectorization

2019-06-17 Thread Jakub Jelinek
Hi!

On Mon, Jun 17, 2019 at 08:35:23AM +0200, Richard Biener wrote:
> Ugh, not pretty but probably best we can do.  Btw, can you please
> add support for the SLP case and group_size == 1?  I know I'm slow
> with the branch ripping out the non-SLP path but it would save me
> some extra work (possibly).

Here is what I've committed so far.  The SLP & group_size == 1 stuff
I'll try to help you with on your branch when you are close to merging in,
though there is a small complication, because this scan stuff isn't 100%
identical to just unrolling it several times, the "orig" vector is broadcast
from the last value to all the lanes.

I'll work incrementally on further improvements (e.g. the C++ testcase is
still not vectorized, we end up with
  _18 = .GOMP_SIMD_LANE (simduid.0_14(D), 0);
...
  _11 = (sizetype) _18;
  _9 = _11 * 4;
  _28 = &D.2456 + _9;
  _32 = MEM[(int *)_28];
instead of the usual _32 = D.2456[_18], nothing for whatever reason folds
that back together and the current simd lane handling probably doesn't
recognize that), handling references, exclusive scan etc.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2019-06-17  Jakub Jelinek  

* omp-low.c (struct omp_context): Add scan_inclusive field.
(scan_omp_1_stmt) : Set ctx->scan_inclusive
if inclusive scan.
(struct omplow_simd_context): Add lastlane member.
(lower_rec_simd_input_clauses): Add rvar argument, handle inscan
reductions.  Build 2 or 3 argument .GOMP_SIMD_LANE calls rather than
1 or 2 argument.
(lower_rec_input_clauses): Handle inscan reductions in simd contexts.
(lower_lastprivate_clauses): Set TREE_THIS_NOTRAP on the ARRAY_REF.
(lower_omp_scan): New function.
(lower_omp_1) : Use lower_omp_scan.
* tree-ssa-dce.c (eliminate_unnecessary_stmts): For IFN_GOMP_SIMD_LANE
check 3rd argument if present rather than 2nd.
* tree-vectorizer.h (struct _loop_vec_info): Add scan_map member.
(struct _stmt_vec_info): Change simd_lane_access_p from bool into
2-bit bitfield.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
scan_map.  For IFN_GOMP_SIMD_LANE check 3rd argument if present rather
than 2nd.
(_loop_vec_info::~_loop_vec_info): Delete scan_map.
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Allow two
different STMT_VINFO_SIMD_LANE_ACCESS_P refs if they have the same
init.
(vect_find_stmt_data_reference): Encode in ->aux the 2nd
IFN_GOMP_SIMD_LANE argument.
(vect_analyze_data_refs): Set STMT_VINFO_SIMD_LANE_ACCESS_P from the
encoded ->aux value.
* tree-vect-stmts.c: Include attribs.h.
(vectorizable_call): Adjust comment about IFN_GOMP_SIMD_LANE.
(scan_operand_equal_p, check_scan_store, vectorizable_scan_store): New
functions.
(vectorizable_load): For STMT_VINFO_SIMD_LANE_ACCESS_P tests use != 0.
(vectorizable_store): Handle STMT_VINFO_SIMD_LANE_ACCESS_P > 1.
cp/
* semantics.c (finish_omp_clauses): For OMP_CLAUSE_REDUCTION_INSCAN
set need_copy_assignment.
testsuite/
* gcc.dg/vect/vect-simd-8.c: New test.
* gcc.dg/vect/vect-simd-9.c: New test.
* g++.dg/vect/simd-2.cc: New test.
* g++.dg/gomp/scan-1.C: New test.

--- gcc/omp-low.c.jj2019-06-15 09:06:53.794030048 +0200
+++ gcc/omp-low.c   2019-06-17 18:13:09.426055084 +0200
@@ -141,6 +141,9 @@ struct omp_context
   /* True if lower_omp_1 should look up lastprivate conditional in parent
  context.  */
   bool combined_into_simd_safelen0;
+
+  /* True if there is nested scan context with inclusive clause.  */
+  bool scan_inclusive;
 };
 
 static splay_tree all_contexts;
@@ -3329,11 +3332,15 @@ scan_omp_1_stmt (gimple_stmt_iterator *g
   scan_omp_single (as_a  (stmt), ctx);
   break;
 
+case GIMPLE_OMP_SCAN:
+  if (tree clauses = gimple_omp_scan_clauses (as_a  (stmt)))
+   if (OMP_CLAUSE_CODE (clauses) == OMP_CLAUSE_INCLUSIVE)
+ ctx->scan_inclusive = true;
+  /* FALLTHRU */
 case GIMPLE_OMP_SECTION:
 case GIMPLE_OMP_MASTER:
 case GIMPLE_OMP_ORDERED:
 case GIMPLE_OMP_CRITICAL:
-case GIMPLE_OMP_SCAN:
 case GIMPLE_OMP_GRID_BODY:
   ctx = new_omp_context (stmt, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
@@ -3671,6 +3678,7 @@ struct omplow_simd_context {
   omplow_simd_context () { memset (this, 0, sizeof (*this)); }
   tree idx;
   tree lane;
+  tree lastlane;
   vec simt_eargs;
   gimple_seq simt_dlist;
   poly_uint64_pod max_vf;
@@ -3682,7 +3690,8 @@ struct omplow_simd_context {
 
 static bool
 lower_rec_simd_input_clauses (tree new_var, omp_context *ctx,
- omplow_simd_context *sctx, tree &ivar, tree &lvar)
+ omplow_simd_context *sctx, tree &ivar,
+ tree &lvar, tree *rvar = NULL)
 {
   if (known

[PATCH], PowerPC PR90822 (cleanup lfiwax, lfiwzx generation)

2019-06-17 Thread Michael Meissner
I wrote the code to generate LFIWAX and LFIWZX originally for the power7 in the
2010 time frame.  At the time, we did not allow SImode to go into floating
point and vector registers.  As part of the power9 work, we now allow SImode to
go into FP/vector registers with for 64-bit code targetting -mcpu=power8 or
higher.  But we never went back and tweaked the LFIWAX/LFIWZX support.

I was writing code for a possible future PowerPC machine, and the new code
added an attribute that caused some of the -mno-vsx tests to fail.  This was
due to the floatsi2_lfiwax and floatunssi2_lfiwzx patterns did not
have a non-VSX alternative, and the attribute processing needed to process the
alternatives before the first split pass.

In looking at the code, I decided to also clean up the underlying lfiwax and
lfiwzx patterns.  In this code, on machines that support SImode in floating
point and vector registers, after register allocation we split the conversion
to SFmode and DFmode into a sign/zero extend operation.  On machines that do
not support SImode in floating point and vector registers, we continue to use
the lfiwax and lfiwzx unspec patterns.

I have tested this code by doing bootstraps and make checks on both little
endian and big endian systems, and there are no regressions.

I did build (but not run) the following versions of Spec 2006 and every version
built.

little endian power8 64-bit
little endian power9 64-bit
big endian power7 64-bit
big endian power8 64-bit
big endian power9 64-bit
bit endian power7 32-bit
bit endian power8 32-bit
bit endian power9 32-bit

In general, the 32-bit code seems to generate a lot less instructions,
including fewer lfiwax/lfiwzx instructions.  On power8/power9 32-bit code,
there was more mtvsrwz mtvsrwa instructions.

The 64-bit code is more similar, but I notice that we aren't generating as many
mtvsrd instructions, but instead generating mtvsrwa and mtvsrwd instructions.

Can I check this into the trunk?

[gcc]
2019-06-14  Michael Meissner  

PR target/90822
* config/rs6000/rs6000.md (lfiwax): Update comment.  Add
vupkhsw/xxspltd or vupklsw/xxspltd split support to sign extend
SImode on ISA 2.07 systems.
(floatsi2_lfiwax): Rewrite.  Do not split the insn until
after register allocation so that we should always get lfiwax
generated instead of doing a gpr load and direct move.  On 64-bit
systems that allow SImode in vector registers, split to do a
sign_extend instead of lfiwax.
(floatsi2_lfiwax_mem): Delete, no longer used.
(lfiwzx): Move so this is next to lfiwax.
(floatunssi2_lfiwax): Rewrite.  Do not split the insn until
after register allocation so that we should always get lfiwzx
generated instead of doing a gpr load and direct move.  On 64-bit
systems that allow SImode in vector registers, split to do a
zero_extend instead of lfiwzx.
(floatunssi2_lfiwzx_mem): Delete, no longer used.

[gcc/testsuite]
2019-06-14  Michael Meissner  

PR target/90822
* gcc.target/powerpc/pr81348.c: Use -O2 instead of -Og.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 272166)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -5231,88 +5231,46 @@ (define_insn "*xxsel"
 
 ;; Conversions to and from floating-point.
 
-; We don't define lfiwax/lfiwzx with the normal definition, because we
-; don't want to support putting SImode in FPR registers.
-(define_insn "lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wa,wa,v")
-   (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,v")]
+; On 32-bit systems, we need to have special versions of LFIWAX and LFIWZX 
because
+; the sign/zero extend insns are not defined.
+(define_insn_and_split "lfiwax"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wa,wa,v,v")
+   (unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,v,v")]
   UNSPEC_LFIWAX))]
   "TARGET_HARD_FLOAT && TARGET_LFIWAX"
   "@
lfiwax %0,%y1
lxsiwax %x0,%y1
mtvsrwa %x0,%1
-   vextsw2d %0,%1"
-  [(set_attr "type" "fpload,fpload,mffgpr,vecexts")
-   (set_attr "isa" "*,p8v,p8v,p9v")])
-
-; This split must be run before register allocation because it allocates the
-; memory slot that is needed to move values to/from the FPR.  We don't allocate
-; it earlier to allow for the combiner to merge insns together where it might
-; not be needed and also in case the insns are deleted as dead code.
-
-(define_insn_and_split "floatsi2_lfiwax"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=")
-   (float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wa"))]
-  "TARGET_HARD_FLOAT && TARGET_LFIWAX
-   &&  && can_create_pseudo_p ()"
-  "#"
-  ""
-  [(pc)]
+   vextsw2d %0,%1
+   #"
+  "&& r

[PR 90889] Fix IPA-CP topological sorting

2019-06-17 Thread Martin Jambor
Hi,

r272115 introduced IPA-CP hook to ignore some edges in order to break
useless SCCs, but the condition in the new hook was a bit too lenient.

If we want to break SCCs in which contains some nodes has IPA-CP
disabled, we must do it at the incoming edges, not at the outgoing, at
least without making sure that we propagate that these calls have to
mark callee's lattices as containing variable stuff.

But breaking SCCs only at the edges which are easier to handle
is... well.. easier, and so the following patch does that.  The patch
has restored LTO bootstrap with Ada, C and C++, I am now running LTO
bootstrap and testing with all languages, OK if it passes?

Thanks,

Martin


2019-06-17  Martin Jambor  

PR ipa/90889
* ipa-cp.c (ignore_edge_p): Do not ignore edges when only the
caller does not have flag_ipa_cp set.
---
 gcc/ipa-cp.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index a0f6f23829b..d3a88756a91 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -817,7 +817,6 @@ ignore_edge_p (cgraph_edge *e)
 = e->callee->function_or_virtual_thunk_symbol (&avail, e->caller);
 
   return (avail <= AVAIL_INTERPOSABLE
- || !opt_for_fn (e->caller->decl, flag_ipa_cp)
  || !opt_for_fn (ultimate_target->decl, flag_ipa_cp));
 }
 
-- 
2.21.0



Re: Fix ICE due to commit for PR88834

2019-06-17 Thread Jeff Law
On 6/17/19 3:08 AM, Richard Sandiford wrote:
> Kugan Vivekanandarajah  writes:
>> Hi All,
>>
>> As pointed to me by Jeff, after committing patch to fix PR88834, some
>> tests are failing for target rx-elf. This is because in
>> preferred_mem_scale_factor we end up with mem_mode which is BLKmode
>> and hence GET_MODE_UNIT_SIZE returns zero.
>>
>> I have fixed this by checking for BLKmode. I believe this is the only
>> way we can have GET_MODE_UNIT_SIZE of 0. Otherwise, we can check for
>> GET_MODE_UNIT_SIZE of zero.
>>
>> Bootstrapped and regression tested attached patch on x86_64-linux-gnu
>> with no new regressions. Is this OK for trunk?
>>
>> Thanks,
>> Kugan
>>
>> gcc/ChangeLog:
>>
>> 2019-06-17  Kugan Vivekanandarajah  
>>
>> * tree-ssa-address.c (preferred_mem_scale_factor): Handle when
>> mem_mode is BLKmode.
>>
>> From 5cd4ac35ce8006a6c407a2386175382f053dcdd3 Mon Sep 17 00:00:00 2001
>> From: Kugan Vivekanandarajah 
>> Date: Sun, 16 Jun 2019 21:02:59 +1000
>> Subject: [PATCH] Fix ICE for rx-elf
>>
>> Change-Id: I503b6b8316e7d11d63ec7749ff44dbc641078539
>> ---
>>  gcc/tree-ssa-address.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
>> index cdd432a..1dca779 100644
>> --- a/gcc/tree-ssa-address.c
>> +++ b/gcc/tree-ssa-address.c
>> @@ -1138,6 +1138,10 @@ preferred_mem_scale_factor (tree base, machine_mode 
>> mem_mode,
>>addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
>>unsigned int fact = GET_MODE_UNIT_SIZE (mem_mode);
>>  
>> +  /* for BLKmode, we cant do anything so return 1.  */
> 
> s/for/For/;s/cant/can't/
> 
>> +  if (mem_mode == BLKmode)
>> +return 1;
>> +
> 
> Think it makes more sense to do this at the start of the function,
> before:
> 
>   struct mem_address parts = {};
>   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
>   unsigned int fact = GET_MODE_UNIT_SIZE (mem_mode);
> 
> (Hopefully one day GET_MODE_SIZE & co. will assert on BLKmode and VOIDmode.)
Yea.  Not sure why.  There's probably a compile-time hit, but I suspect
it'd turn up some interesting bugs.

jeff


[PATCH, i386]: Fix PR 62055, recognize fnabs (FP negative absolute value)

2019-06-17 Thread Uros Bizjak
2019-06-17  Uroš Bizjak  

PR target/62055
* config/i386/i386.md (*nabstf2_1): New insn pattern.
(*nabs2_1): Ditto.
(nabs sse-reg splitter): New splitter.
* config/i386/sse.md (*nabs2): New insn_and_split pattern.

testsuite/ChangeLog:

2019-06-17  Uroš Bizjak  

PR target/62055
* gcc.target/i386/fnabs.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 272386)
+++ config/i386/i386.md (working copy)
@@ -9452,6 +9452,16 @@
   "#"
   [(set_attr "isa" "noavx,noavx,avx,avx")])
 
+(define_insn "*nabstf2_1"
+  [(set (match_operand:TF 0 "register_operand" "=x,x,Yv,Yv")
+   (neg:TF
+ (abs:TF
+   (match_operand:TF 1 "vector_operand" "0,xBm,Yv,m"
+   (use (match_operand:TF 2 "vector_operand" "xBm,0,Yvm,Yv"))]
+  "TARGET_SSE"
+  "#"
+  [(set_attr "isa" "noavx,noavx,avx,avx")])
+
 (define_expand "2"
   [(set (match_operand:X87MODEF 0 "register_operand")
(absneg:X87MODEF (match_operand:X87MODEF 1 "register_operand")))]
@@ -9553,6 +9563,48 @@
   [(const_int 0)]
   "ix86_split_fp_absneg_operator (, mode, operands); DONE;")
 
+(define_insn "*nabs2_1"
+  [(set (match_operand:MODEF 0 "register_operand" "=x,x,Yv")
+   (neg:MODEF
+ (abs:MODEF
+   (match_operand:MODEF 1 "register_operand" "0,x,Yv"
+   (use (match_operand: 2 "vector_operand" "xBm,0,Yvm"))]
+  "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"
+  "#"
+  [(set_attr "isa" "noavx,noavx,avx")])
+
+(define_split
+  [(set (match_operand:SSEMODEF 0 "sse_reg_operand")
+   (neg:SSEMODEF
+ (abs:SSEMODEF
+   (match_operand:SSEMODEF 1 "vector_operand"
+   (use (match_operand: 2 "vector_operand"))]
+  "((SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+|| (TARGET_SSE && (mode == TFmode)))
+   && reload_completed"
+  [(set (match_dup 0) (match_dup 3))]
+{
+  machine_mode mode = mode;
+  machine_mode vmode = mode;
+
+  operands[0] = lowpart_subreg (vmode, operands[0], mode);
+  operands[1] = lowpart_subreg (vmode, operands[1], mode);
+
+  if (TARGET_AVX)
+{
+  if (MEM_P (operands[1]))
+std::swap (operands[1], operands[2]);
+}
+  else
+   {
+ if (operands_match_p (operands[0], operands[2]))
+   std::swap (operands[1], operands[2]);
+   }
+
+  operands[3]
+= gen_rtx_fmt_ee (IOR, vmode, operands[1], operands[2]);
+})
+
 ;; Conditionalize these after reload. If they match before reload, we
 ;; lose the clobber and ability to use integer instructions.
 
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 272386)
+++ config/i386/sse.md  (working copy)
@@ -1748,6 +1748,33 @@
 }
   [(set_attr "isa" "noavx,noavx,avx,avx")])
 
+(define_insn_and_split "*nabs2"
+  [(set (match_operand:VF 0 "register_operand" "=x,x,v,v")
+   (neg:VF
+ (abs:VF
+   (match_operand:VF 1 "vector_operand" "0,xBm,v,m"
+   (use (match_operand:VF 2 "vector_operand""xBm,0,vm,v"))]
+  "TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 3))]
+{
+  if (TARGET_AVX)
+{
+  if (MEM_P (operands[1]))
+std::swap (operands[1], operands[2]);
+}
+  else
+   {
+ if (operands_match_p (operands[0], operands[2]))
+   std::swap (operands[1], operands[2]);
+   }
+
+  operands[3]
+= gen_rtx_fmt_ee (IOR, mode, operands[1], operands[2]);
+}
+  [(set_attr "isa" "noavx,noavx,avx,avx")])
+
 (define_expand "3"
   [(set (match_operand:VF 0 "register_operand")
(plusminus:VF
Index: testsuite/gcc.target/i386/fnabs.c
===
--- testsuite/gcc.target/i386/fnabs.c   (nonexistent)
+++ testsuite/gcc.target/i386/fnabs.c   (working copy)
@@ -0,0 +1,21 @@
+/* PR target/62055 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+
+float testf (float a)
+{
+  return -__builtin_fabsf (a);
+}
+
+double test (double a)
+{
+   return -__builtin_fabs (a);
+}
+
+__float128 testq (__float128 a)
+{
+   return -__builtin_fabsq (a);
+}
+
+/* { dg-final { scan-assembler-times "\tv?orp\[sd\]\[ \t\]" 2 } } */
+/* { dg-final { scan-assembler-times "\tv?por\[ \t\]" 1 } } */


Re: C++ PATCH for c++/83820 - excessive attribute arguments not detected

2019-06-17 Thread Marek Polacek
On Mon, Jun 17, 2019 at 09:50:51AM -0600, Martin Sebor wrote:
> On 6/16/19 1:10 PM, Marek Polacek wrote:
> > While messing with [[noreturn]] I also found out that we don't detect
> > the case when an attribute specifier that takes no arguments contains
> > an attribute-argument-clause.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > 2019-06-16  Marek Polacek  
> > 
> > PR c++/83820 - excessive attribute arguments not detected.
> > * parser.c (cp_parser_std_attribute): Detect excessive arguments.
> > 
> > * g++.dg/cpp0x/gen-attrs-67.C: New test.
> > 
> > diff --git gcc/cp/parser.c gcc/cp/parser.c
> > index 8f5ae84670a..871bc45da63 100644
> > --- gcc/cp/parser.c
> > +++ gcc/cp/parser.c
> > @@ -26149,6 +26149,20 @@ cp_parser_std_attribute (cp_parser *parser, tree 
> > attr_ns)
> >   vec *vec;
> >   int attr_flag = normal_attr;
> > +/* Maybe we don't expect to see any arguments for this attribute.  */
> > +const attribute_spec *as
> > +  = lookup_attribute_spec (TREE_PURPOSE (attribute));
> > +if (as && as->max_length == 0)
> > +  {
> > +   error_at (token->location, "attribute %qE does not take any arguments",
> > + attr_id);
> 
> Not to be too anal about this but most messages have the word order
> reversed and would be phrased as
> 
>   %qE attribute does not take any arguments
> 
> I've been adjusting the order to match this form as I notice it so
> it would be great if we could use this form in new diagnostics as
> well.

*shrug* I have no problem changing that.  Will commit with your proposed
change.  Thanks,

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: C++ PATCH for c++/60364 - noreturn after first decl not diagnosed (v3)

2019-06-17 Thread Marek Polacek
On Mon, Jun 17, 2019 at 05:48:23PM +0200, Jakub Jelinek wrote:
> On Sun, Jun 16, 2019 at 12:36:58PM -0400, Marek Polacek wrote:
> > On Sun, Jun 16, 2019 at 06:18:56PM +0200, Jakub Jelinek wrote:
> > > On Sun, Jun 16, 2019 at 12:10:37PM -0400, Marek Polacek wrote:
> > > > > Guess I will really have to make the changes to treat [[noreturn]] 
> > > > > similarly
> > > > > to e.g. [[nodiscard]], so that cxx11_attribute_p works.
> > > > 
> > > > Thus.  Changes I've made:
> > > > * don't treat [[noreturn]] as an equivalent to 
> > > > __attribute__((noreturn));
> > > > * for that I had to adjust decl_attributes, it wasn't preserving the
> > > >   C++11 form (a list in another list); fix shadowing while at it;
> > > > * the above turned up two spots that were wrongly accessing TREE_PURPOSE
> > > >   directly instead of using get_attribute_name;
> > > > * give error only for [[noreturn]] but not for __attribute__((noreturn))
> > > >   or [[gnu::noreturn]].
> > > > 
> > > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > > 
> > > I'd prefer to defer review of this to Jason, just want to note that I 
> > > don't
> > > see any testsuite coverage on mixing declarations with different forms of
> > > attributes ([[noreturn]] on one decl and __attribute__((noreturn)) or
> > > [[gnu::noreturn]] on another one or vice versa.
> > 
> > Added now.  I suppose it should compile fine, which it does.
> 
> I meant also the tests of the new diagnostics, say if you have
> a decl without any of those attributes, then gnu:: one (or __attribute__
> one; that merge decls should be ok) and on third decl [[noreturn]] (shall
> that diagnose anything or not?  As there is no way to differentiate it from
> the gnu:: attribute on the very first one, I'd say it shouldn't, with the
> use of the gnu:: or __attribute__ we are already outside of the standard.

I've added noreturn-11.C for that.  With my patch we don't diagnose

void f1 ();
void f1 [[gnu::noreturn]] ();
void f1 [[noreturn]] ();

but that seems fine to me, too.  Of course there's the problem that we only
check the previous decl, not the first one, but I guess we'll have to live
with it; it will detect the bogus cases anyway.

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: C++ PATCH for c++/60364 - noreturn after first decl not diagnosed (v2)

2019-06-17 Thread Marek Polacek
On Mon, Jun 17, 2019 at 09:02:17AM -0600, Martin Sebor wrote:
> > diff --git gcc/cp/tree.c gcc/cp/tree.c
> > index cd021b7f594..bb695e14e73 100644
> > --- gcc/cp/tree.c
> > +++ gcc/cp/tree.c
> > @@ -4453,6 +4453,8 @@ const struct attribute_spec std_attribute_table[] =
> >   handle_likeliness_attribute, attr_cold_hot_exclusions },
> > { "unlikely", 0, 0, false, false, false, false,
> >   handle_likeliness_attribute, attr_cold_hot_exclusions },
> > +  { "noreturn", 0, 0, true, false, false, false,
> > +handle_noreturn_attribute, NULL },
>   
> 
> The GNU attribute is made mutually exclusive with a bunch of other
> attributes (e.g., malloc or warn_unused_result) by setting the last
> member to the array of exclusive attribute.  Does the change preserve
> this relationship some other way?

Oop, no, that is a bug.  I meant to go back to that, but I'd forgotten to add
an XXX comment as I'm wont to, and the testsuite doesn't test that, so it
slipped.  Fixed & new test added.  Thanks for catching it.

Also added a test for the scenario Jakub pointed out in the other mail.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-06-17  Marek Polacek  

PR c++/60364 - noreturn after first decl not diagnosed.
* attribs.c (get_attribute_namespace): No longer static.
(decl_attributes): Avoid shadowing.  Preserve the C++11 form for C++11
attributes.
(attr_noreturn_exclusions): Make it extern.
* attribs.h (get_attribute_namespace): Declare.
* tree-inline.c (function_attribute_inlinable_p): Use
get_attribute_name.

* c-attribs.c (handle_noreturn_attribute): No longer static.
* c-common.h (handle_noreturn_attribute, attr_noreturn_exclusions):
Declare.
* c-format.c (check_function_format): Use get_attribute_name.

* decl.c (duplicate_decls): Give an error when a function is
declared [[noreturn]] after its first declaration.
* parser.c (cp_parser_std_attribute): Don't treat C++11 noreturn
attribute as equivalent to GNU's.
* tree.c (std_attribute_table): Add noreturn.

* g++.dg/warn/noreturn-8.C: New test.
* g++.dg/warn/noreturn-9.C: New test.
* g++.dg/warn/noreturn-10.C: New test.
* g++.dg/warn/noreturn-11.C: New test.

diff --git gcc/attribs.c gcc/attribs.c
index 4441922543f..8e540165597 100644
--- gcc/attribs.c
+++ gcc/attribs.c
@@ -340,7 +340,7 @@ lookup_attribute_spec (const_tree name)
Please read the comments of cxx11_attribute_p to understand the
format of attributes.  */
 
-static tree
+tree
 get_attribute_namespace (const_tree attr)
 {
   if (cxx11_attribute_p (attr))
@@ -469,7 +469,6 @@ tree
 decl_attributes (tree *node, tree attributes, int flags,
 tree last_decl /* = NULL_TREE */)
 {
-  tree a;
   tree returned_attrs = NULL_TREE;
 
   if (TREE_TYPE (*node) == error_mark_node || attributes == error_mark_node)
@@ -548,22 +547,23 @@ decl_attributes (tree *node, tree attributes, int flags,
 
   /* Note that attributes on the same declaration are not necessarily
  in the same order as in the source.  */
-  for (a = attributes; a; a = TREE_CHAIN (a))
+  for (tree attr = attributes; attr; attr = TREE_CHAIN (attr))
 {
-  tree ns = get_attribute_namespace (a);
-  tree name = get_attribute_name (a);
-  tree args = TREE_VALUE (a);
+  tree ns = get_attribute_namespace (attr);
+  tree name = get_attribute_name (attr);
+  tree args = TREE_VALUE (attr);
   tree *anode = node;
   const struct attribute_spec *spec
= lookup_scoped_attribute_spec (ns, name);
   int fn_ptr_quals = 0;
   tree fn_ptr_tmp = NULL_TREE;
+  const bool cxx11_attr_p = cxx11_attribute_p (attr);
 
   if (spec == NULL)
{
  if (!(flags & (int) ATTR_FLAG_BUILT_IN))
{
- if (ns == NULL_TREE || !cxx11_attribute_p (a))
+ if (ns == NULL_TREE || !cxx11_attr_p)
warning (OPT_Wattributes, "%qE attribute directive ignored",
 name);
  else
@@ -584,7 +584,7 @@ decl_attributes (tree *node, tree attributes, int flags,
   gcc_assert (is_attribute_p (spec->name, name));
 
   if (TYPE_P (*node)
- && cxx11_attribute_p (a)
+ && cxx11_attr_p
  && !(flags & ATTR_FLAG_TYPE_IN_PLACE))
{
  /* This is a c++11 attribute that appertains to a
@@ -707,8 +707,7 @@ decl_attributes (tree *node, tree attributes, int flags,
 
   if (spec->handler != NULL)
{
- int cxx11_flag =
-   cxx11_attribute_p (a) ? ATTR_FLAG_CXX11 : 0;
+ int cxx11_flag = (cxx11_attr_p ? ATTR_FLAG_CXX11 : 0);
 
  /* Pass in an array of the current declaration followed
 by the last pushed/merged declaration if  one exists.
@@ -756,17 +755,23 @@ decl_attributes (tree *node, tree attributes, int flags,
  if (a 

Re: [C++ PATCH] PR c++/90875 - added -Wswitch-outside-range option.

2019-06-17 Thread Marek Polacek
Thanks for the patch.

On Mon, Jun 17, 2019 at 01:26:37PM -0400, Matthew Beliveau wrote:
> --- gcc/c-family/c.opt
> +++ gcc/c-family/c.opt
> @@ -819,6 +819,10 @@ Wswitch-bool
>  C ObjC C++ ObjC++ Var(warn_switch_bool) Warning Init(1)
>  Warn about switches with boolean controlling expression.
>  
> +Wswitch-outside-range
> +C++ ObjC++ Var(warn_switch_outside_range) Warning Init(1)

This warning is not C++- specific; it also applies to C and Obj-C.

> +Warn about switch values that are outside of their type's range.

This is slightly imprecise -- the values are outside of the type
of the controlling expression of the switch.  So I'd say
"that are outside of the switch's type range" or so.

> diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
> index bf9da0f0a6e..c23496b2668 100644
> --- gcc/doc/invoke.texi
> +++ gcc/doc/invoke.texi
> @@ -5390,6 +5390,12 @@ switch ((int) (a == 4))
>  @end smallexample
>  This warning is enabled by default for C and C++ programs.
>  
> +@item -Wswitch-outside-range
> +@opindex Wswitch-outside-range
> +@opindex Wno-switch-outside-range
> +Warn whenever a @code{switch} state has a value that is outside of it's

"its"

> +respective type range.
> +

This should also say
"This warning is enabled by default for C and C++ programs."

> diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C 
> gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C
> new file mode 100644
> index 000..29e56f3ba2d
> --- /dev/null
> +++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C
> @@ -0,0 +1,8 @@
> +// PR c++/90875
> +
> +void f(char c)
> +{
> +  switch (c)
> +case 300: // { dg-warning "case label value exceeds maximum value for 
> type" }
> +case -300:; // { dg-warning "case label value is less than minimum value 
> for type" }
> +}
> diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C 
> gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C
> new file mode 100644
> index 000..20cc019b209
> --- /dev/null
> +++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C
> @@ -0,0 +1,9 @@
> +// PR c++/90875
> +// { dg-options -Wno-switch-outside-range }
> +
> +void f(char c)
> +{
> +  switch (c)
> +case 300: //{ dg-bogus "case label value is less than minimum value for 
> type" }
> +case -300:; // { dg-bogus "case label value is less than minimum value 
> for type" }
> +}
> diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-3.C 
> gcc/testsuite/g++.dg/warn/Wswitch-outside-range-3.C
> new file mode 100644
> index 000..baf15561af0
> --- /dev/null
> +++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-3.C
> @@ -0,0 +1,9 @@
> +// PR c++/90875
> +// { dg-options -Wno-pedantic }
> +
> +void f(char c)
> +{
> +  switch (c)
> +  
> +case -300 ... 300:; // { dg-warning "lower value in case label range 
> less than minimum value for type|upper value in case label range exceeds 
> maximum value for type" }
> +}
> diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-4.C 
> gcc/testsuite/g++.dg/warn/Wswitch-outside-range-4.C
> new file mode 100644
> index 000..d9bd756dc50
> --- /dev/null
> +++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-4.C
> @@ -0,0 +1,9 @@
> +// PR c++/90875
> +// { dg-options "-Wno-pedantic -Wno-switch-outside-range" }

(You can also use __extension__ so that you don't need -Wno-pedantic.)

> +void f(char c)
> +{
> +  switch (c)
> +  
> +case -300 ... 300:; // { dg-bogus "lower value in case label range less 
> than minimum value for type|upper value in case label range exceeds maximum 
> value for type" }
> +}

The tests belong to c-c++-common/ because we want to test both the C and
C++ compilers.

Please also fix the formatting as Jakub suggested.

Thanks,
--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: [MIPS] Fix for the wrong argument sequence in MSA builtin for FMADD/MADDV family.

2019-06-17 Thread Jeff Law
On 6/17/19 1:44 AM, Dragan Mladjenovic wrote:
> Sorry for the late response.
> 
> 
> I've checked the patch on gcc-9 and gcc-8 branches with mips{el}-linux-gnu.
> 
> There are no new regressions, as expected.
> 
> 
> Is this change still eligible for back-port ?
Yes, and I went ahead and backported the patch to the gcc-8 and gcc-9
branches.

Thanks,
jeff


Re: [C++ PATCH] PR c++/90875 - added -Wswitch-outside-range option.

2019-06-17 Thread Jakub Jelinek
On Mon, Jun 17, 2019 at 01:26:37PM -0400, Matthew Beliveau wrote:
> 2019-06-14  Matthew Beliveau  
>   
>   PR c++/90875 - added -Wswitch-outside-range option
>   * doc/invoke.texi (Wswitch-outside-range): Document.
> 
>   * c-warn.c (c_do_switch_warnings): Implemented new Wswitch-outside-range
>   warning option.
>   
>   * c.opt (Wswitch-outside-range): Added new option.
> 
>   * g++.dg/warn/Wswitch-outside-range-1.C: New test.
>   * g++.dg/warn/Wswitch-outside-range-2.C: New test.
>   * g++.dg/warn/Wswitch-outside-range-3.C: New test.
>   * g++.dg/warn/Wswitch-outside-range-4.C: New test.
> 
> diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
> index 5941c10cddb..b61694af638 100644
> --- gcc/c-family/c-warn.c
> +++ gcc/c-family/c-warn.c
> @@ -1460,8 +1460,9 @@ c_do_switch_warnings (splay_tree cases, location_t 
> switch_location,
>  min_value) >= 0)
>   {
> location_t loc = EXPR_LOCATION ((tree) node->value);
> -   warning_at (loc, 0, "lower value in case label range"
> -   " less than minimum value for type");
> +   warning_at (loc, OPT_Wswitch_outside_range, "lower value in case"
> +   " label range less than minimum value"
> +   " for type");

Just a formatting nit, not a review.  Previously the indentation of the
second part of the string literal made sense, now it doesn't, it shouldn't
start at a location that is not related in any way to the previous line.
Best move the whole string literal to the second line, like:
  warning_at (loc, OPT_Wswitch_outside_range,
  "lower value in case label range less than minimum"
  " value for type");
Happens multiple times in the patch.

Jakub


[C++ PATCH] PR c++/90875 - added -Wswitch-outside-range option.

2019-06-17 Thread Matthew Beliveau
This patch adds a new warning option: Wswitch-outside-range, so that
users are able to selectively control the warning. The warning is
enabled by default.

Best,
Matthew Beliveau
Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-06-14  Matthew Beliveau  
	
	PR c++/90875 - added -Wswitch-outside-range option
	* doc/invoke.texi (Wswitch-outside-range): Document.

	* c-warn.c (c_do_switch_warnings): Implemented new Wswitch-outside-range
	warning option.
	
	* c.opt (Wswitch-outside-range): Added new option.

	* g++.dg/warn/Wswitch-outside-range-1.C: New test.
	* g++.dg/warn/Wswitch-outside-range-2.C: New test.
	* g++.dg/warn/Wswitch-outside-range-3.C: New test.
	* g++.dg/warn/Wswitch-outside-range-4.C: New test.

diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index 5941c10cddb..b61694af638 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -1460,8 +1460,9 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
    min_value) >= 0)
 	{
 	  location_t loc = EXPR_LOCATION ((tree) node->value);
-	  warning_at (loc, 0, "lower value in case label range"
-  " less than minimum value for type");
+	  warning_at (loc, OPT_Wswitch_outside_range, "lower value in case"
+  " label range less than minimum value"
+  " for type");
 	  CASE_LOW ((tree) node->value) = convert (TREE_TYPE (cond),
 		   min_value);
 	  node->key = (splay_tree_key) CASE_LOW ((tree) node->value);
@@ -1474,8 +1475,8 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
 	  if (node == NULL || !node->key)
 		break;
 	  location_t loc = EXPR_LOCATION ((tree) node->value);
-	  warning_at (loc, 0, "case label value is less than minimum "
-  "value for type");
+	  warning_at (loc, OPT_Wswitch_outside_range, "case label value is"
+  " less than minimum value for type");
 	  splay_tree_remove (cases, node->key);
 	}
 	  while (1);
@@ -1491,8 +1492,8 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
    max_value) > 0)
 	{
 	  location_t loc = EXPR_LOCATION ((tree) node->value);
-	  warning_at (loc, 0, "upper value in case label range"
-			  " exceeds maximum value for type");
+	  warning_at (loc, OPT_Wswitch_outside_range, "upper value in case"
+			  " label range exceeds maximum value for type");
 	  CASE_HIGH ((tree) node->value)
 	= convert (TREE_TYPE (cond), max_value);
 	  outside_range_p = true;
@@ -1503,7 +1504,7 @@ c_do_switch_warnings (splay_tree cases, location_t switch_location,
 	 != NULL)
 	{
 	  location_t loc = EXPR_LOCATION ((tree) node->value);
-	  warning_at (loc, 0,
+	  warning_at (loc, OPT_Wswitch_outside_range,
 		  "case label value exceeds maximum value for type");
 	  splay_tree_remove (cases, node->key);
 	  outside_range_p = true;
diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 572cf186262..4ced767ca99 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -819,6 +819,10 @@ Wswitch-bool
 C ObjC C++ ObjC++ Var(warn_switch_bool) Warning Init(1)
 Warn about switches with boolean controlling expression.
 
+Wswitch-outside-range
+C++ ObjC++ Var(warn_switch_outside_range) Warning Init(1)
+Warn about switch values that are outside of their type's range.
+
 Wtemplates
 C++ ObjC++ Var(warn_templates) Warning
 Warn on primary template declaration.
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index bf9da0f0a6e..c23496b2668 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -5390,6 +5390,12 @@ switch ((int) (a == 4))
 @end smallexample
 This warning is enabled by default for C and C++ programs.
 
+@item -Wswitch-outside-range
+@opindex Wswitch-outside-range
+@opindex Wno-switch-outside-range
+Warn whenever a @code{switch} state has a value that is outside of it's
+respective type range.
+
 @item -Wswitch-unreachable
 @opindex Wswitch-unreachable
 @opindex Wno-switch-unreachable
diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C
new file mode 100644
index 000..29e56f3ba2d
--- /dev/null
+++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-1.C
@@ -0,0 +1,8 @@
+// PR c++/90875
+
+void f(char c)
+{
+  switch (c)
+case 300: // { dg-warning "case label value exceeds maximum value for type" }
+case -300:; // { dg-warning "case label value is less than minimum value for type" }
+}
diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C
new file mode 100644
index 000..20cc019b209
--- /dev/null
+++ gcc/testsuite/g++.dg/warn/Wswitch-outside-range-2.C
@@ -0,0 +1,9 @@
+// PR c++/90875
+// { dg-options -Wno-switch-outside-range }
+
+void f(char c)
+{
+  switch (c)
+case 300: //{ dg-bogus "case label value is less than minimum value for type" }
+case -300:; // { dg-bogus "case label value is less than minimum value for type" }
+}
diff --git gcc/testsuite/g++.dg/warn/Wswitch-outside-range-3.C gcc/testsu

Re: [PATCH, og9] Port OpenACC profiling interface to OG9

2019-06-17 Thread Thomas Schwinge
Hi Kwok!

On Mon, 17 Jun 2019 14:27:46 +0100, Kwok Cheung Yeung  
wrote:
> This is a straightforward port of the OpenACC profiling interface from 
> OG8 to OG9, with a few tweaks

Thanks for looking into this.

> to compensate for patches that have not 
> been carried over to OG9.

Conceptually ACK.  (I have not reviewed what's missing/got dropped.)

> I have also bundled in the profiling hooks and 
> documentation updates that have been added since the original patch in OG8.

ACK.

> Okay to push to openacc-gcc-9-branch?

What you probably didn't know (sorry!) is that trunk r271346,
,
contains some changes that are not in the og8 code: code and
documentation improvements, TODOs resolved, etc.  It also doesn't contain
the actual 'acc_register_library' implementation, because that one needs
further work, as discussed before.

I think what would be best, is the following approach:

  - First, backport trunk r271346 to og9.

That might not be completely trivial, because as you know, og9
contains a number of "feature" commits that need to include changes
related to the profiling code.  It's the question in which order to
sequence patches -- whether these changes related to the profiling
code are part of the individual "feature" commits, or they're part of
a later "profiling" commit.  As og9 has been published with "feature"
commits not containing any profiling bits, all these things will have
to be in a later "profiling" commit.

However, for upstreaming this into trunk later on, it will be
beneficial to have the backported "profiling" commit as similar as
possible to trunk r271346, so we shall live with some short-lived
inconsistencies, maybe even a few testuite regressions, which then
get resolved when you...

  - Second, on top of that, add the pieces of functionality (from the
og8) version that are missing from the trunk r271346 backport.  Of
course, remove all changes that would worsen the state, compared to
what trunk r271346 already contains.

So, conceptually: a first commit to backport trunk r271346, then a second
commit containing a merger of a (temporary) revert of the first commit
followed by applying the patch you just posted.  Then, remove from the
second commit all changes that worsen the state, compared to what trunk
r271346 already contains.

Or, in other words: split the patch you just posted into two, where the
first one is as close as possible to a backport of trunk r271346, then
all other changes in a second commit.

Will that work?

I'll be happy to help review these changes, especially whether something
should be part of the second commit, or get dropped.


Grüße
 Thomas


RE: [patch][aarch64]: add usra and ssra combine patterns

2019-06-17 Thread Sylvia Taylor
Updating patch with missing scan-assembler checks.

Cheers,
Syl

-Original Message-
From: Sylvia Taylor 
Sent: 04 June 2019 12:24
To: James Greenhalgh 
Cc: Richard Earnshaw ; Marcus Shawcroft 
; gcc-patches@gcc.gnu.org; nd 
Subject: RE: [patch][aarch64]: add usra and ssra combine patterns

Hi James,

I've managed to remove the odd redundant git diff change.

Regarding aarch64_sra_n, this patch shouldn't affect it.

I am also not aware of any way of enabling this combine inside the pattern used 
for those intrinsics, so I kept them separate.

Cheers,
Syl

-Original Message-
From: James Greenhalgh 
Sent: 03 June 2019 11:20
To: Sylvia Taylor 
Cc: Richard Earnshaw ; Marcus Shawcroft 
; gcc-patches@gcc.gnu.org; nd 
Subject: Re: [patch][aarch64]: add usra and ssra combine patterns

On Thu, May 30, 2019 at 03:25:19PM +0100, Sylvia Taylor wrote:
> Greetings,
> 
> This patch adds support to combine:
> 
> 1) ushr and add into usra, example:
> 
> ushr  v0.16b, v0.16b, 2
> add   v0.16b, v0.16b, v2.16b
> ---
> usra  v2.16b, v0.16b, 2
> 
> 2) sshr and add into ssra, example:
> 
> sshr  v1.16b, v1.16b, 2
> add   v1.16b, v1.16b, v3.16b
> ---
> ssra  v3.16b, v1.16b, 2
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk? If yes, I don't have any commit rights, so can someone 
> please commit it on my behalf.

This patch has an unrelated change to
aarch64_get_lane_zero_extend Please revert that and 
resend.

What changes (if any) should we make to aarch64_sra_n based on this 
patch, and to the vsra_n intrinsics in arm_neon.h ?

Thanks,
James

> 
> Cheers,
> Syl
> 
> gcc/ChangeLog:
> 
> 2019-05-30  Sylvia Taylor  
> 
>   * config/aarch64/aarch64-simd.md
>   (*aarch64_simd_sra): New.
>   * config/aarch64/iterators.md
>   (SHIFTRT): New iterator.
>   (sra_op): New attribute.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-05-30  Sylvia Taylor  
> 
>   * gcc.target/aarch64/simd/ssra.c: New test.
>   * gcc.target/aarch64/simd/usra.c: New test.

> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index
> e3852c5d182b70978d7603225fce55c0b8ee2894..502ac5f3b45a1da059bb07701150
> a531091378ed 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3110,22 +3122,22 @@
>  operands[2] = aarch64_endian_lane_rtx (mode, INTVAL (operands[2]));
>  return "smov\\t%0, %1.[%2]";
>}
> -  [(set_attr "type" "neon_to_gp")]
> -)
> -
> -(define_insn "*aarch64_get_lane_zero_extend"
> -  [(set (match_operand:GPI 0 "register_operand" "=r")
> - (zero_extend:GPI
> -   (vec_select:
> - (match_operand:VDQQH 1 "register_operand" "w")
> - (parallel [(match_operand:SI 2 "immediate_operand" "i")]]
> -  "TARGET_SIMD"
> -  {
> -operands[2] = aarch64_endian_lane_rtx (mode,
> -INTVAL (operands[2]));
> -return "umov\\t%w0, %1.[%2]";
> -  }
> -  [(set_attr "type" "neon_to_gp")]
> +  [(set_attr "type" "neon_to_gp")]
> +)
> +
> +(define_insn "*aarch64_get_lane_zero_extend"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> + (zero_extend:GPI
> +   (vec_select:
> + (match_operand:VDQQH 1 "register_operand" "w")
> + (parallel [(match_operand:SI 2 "immediate_operand" "i")]]
> +  "TARGET_SIMD"
> +  {
> +operands[2] = aarch64_endian_lane_rtx (mode,
> +INTVAL (operands[2]));
> +return "umov\\t%w0, %1.[%2]";
> +  }
> +  [(set_attr "type" "neon_to_gp")]
>  )
>  
>  ;; Lane extraction of a value, neither sign nor zero extension

These changes should be dropped.


diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 
eeed08e71ca0b96726cb28743ef38487a8287600..aba6af24eee1c29fe4524eb352747c94617b30c7
 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -986,6 +986,18 @@
   [(set_attr "type" "neon_shift_imm")]
 )
 
+(define_insn "*aarch64_simd_sra"
+ [(set (match_operand:VDQ_I 0 "register_operand" "=w")
+   (plus:VDQ_I
+  (SHIFTRT:VDQ_I
+   (match_operand:VDQ_I 1 "register_operand" "w")
+   (match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" "Dr"))
+  (match_operand:VDQ_I 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "sra\t%0., %1., %2"
+  [(set_attr "type" "neon_shift_acc")]
+)
+
 (define_insn "aarch64_simd_imm_shl"
  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
(ashift:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 
d0070b1a73218822976acb846638ee385d8a4f2c..9bc84c28bba1a6591fab2314753d5d43845b6e54
 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1168,6 +1168,8 @@
 ;; This code iterator allows the shifts supported in arithmetic instructions
 (define_code_iterator ASHIFT [ashift ashiftrt lshiftrt])
 
+(define_co

Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Biener
On June 17, 2019 6:23:03 PM GMT+02:00, Richard Sandiford 
 wrote:
>Jan Hubicka  writes:
>>> >> > 
>>> >> > A more stricter test would be
>>> >> > 
>>> >> >if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2,
>size2))
>>> >> >   return true;
>>> >> > /* If there's a variable access in one of the refs fall
>through
>>> >> >to access-path based disambiguation.  */
>>> >> > 
>>> >> > where you'd need to pass down ao_ref_size in addition to
>max_size as well.
>>> >> 
>>> >> Proably || here?
>>> >
>>> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1
>thus
>>> > I think && is correct if you want to disambiguate a[1].v2 and
>a[i].v1
>>> >
>>> > But yes, if you don't want that then || is cheaper.  Probably add
>>> > another testcase with one of the accesses with a constant index?
>>> 
>>> Might be misunderstanding, but isn't the check for a variable access
>>> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)? 
>"maybe_eq"
>>> means "could be equal in some circumstances, even if they're not
>always
>>> equal".
>>
>> Seems I am getting progressively more confused.
>
>Well, me too :-)  I didn't really understand the choice of the original
>condition above.  It seemed to be "return true if both access sizes are
>variable", but the comment implied something else.

Sorry,! Must_eq is obviously fine. 

>But I was just picking up on the choice of maybe_eq vs. known_eq in the
>condition, which as things stand would only matter for SVE.  And...
>
>> I think I want to pass the size down from ao_ref to the functions and
>> then see if they are always equal.
>
>...right, that's what I mean.  known_eq gives you "always equal",
>i.e. "not variable".  The maybe_eq in the original condition seemed odd
>because it includes the constant case and (potentially) variable cases
>that are size1 or bigger.
>
>In inverted conditions, !known_eq can be written maybe_ne (but doesn't
>need to be, if !known_eq seems clearer).
>
>Thanks,
>Richard
>
>>
>> Why would I want to use maybe variant?
>>
>> Honza
>>
>>> 
>>> Richard



[PATCH] i386: Separate costs of RTL expressions from costs of moves

2019-06-17 Thread H.J. Lu
processor_costs has costs of RTL expressions and costs of moves:

1. Costs of RTL expressions is computed as COSTS_N_INSNS which are used
to generate RTL expressions with the lowest costs.  Costs of RTL memory
operation can be very close to costs of fast instructions to indicate
fast memory operations.

2. After RTL expressions have been generated, costs of moves are used by
TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST to compute move
costs for register allocator.  Costs of load and store are higher than
costs of register moves to reduce stack usages by register allocator.

We should separate costs of RTL expressions from costs of moves so that
they can be adjusted independently.  This patch moves costs of moves to
the new used_by_ra field and duplicates costs of moves which are also
used for costs of RTL expressions.

All cost models have been checked with

static void
check_one (const struct processor_costs *p)
{
  if (p->used_by_ra.int_load[2] != p->int_load)
abort ();
  if (p->used_by_ra.int_store[2] != p->int_store)
abort ();
  if (p->used_by_ra.xmm_move != p->xmm_move)
abort ();
  if (p->used_by_ra.sse_to_integer != p->sse_to_integer)
abort ();
  if (p->used_by_ra.integer_to_sse != p->integer_to_sse)
abort ();
  if (memcmp (p->used_by_ra.sse_load, p->sse_load, sizeof (p->sse_load)))
abort ();
  if (memcmp (p->used_by_ra.sse_store, p->sse_store, sizeof (p->sse_store)))
abort ();
}

static void
check_cost ()
{
 check_one (&ix86_size_cost);
  for (unsigned int i = 0; i < ARRAY_SIZE (processor_cost_table); i++)
check_one (processor_cost_table[i]);
}

by calling check_cost from ix86_option_override_internal.

PR target/90878
* config/i386/i386-features.c
(dimode_scalar_chain::compute_convert_gain): Replace int_store[2]
and int_load[2] with int_store and int_load.
* config/i386/i386.c (inline_memory_move_cost): Use used_by_ra
for costs of moves.
(ix86_register_move_cost): Likewise.
(ix86_builtin_vectorization_cost): Replace int_store[2] and
int_load[2] with int_store and int_load.
* config/i386/i386.h (processor_costs): Move costs of moves to
used_by_ra.  Add int_load, int_store, xmm_move, sse_to_integer,
integer_to_sse, sse_load, sse_store, sse_unaligned_load and
sse_unaligned_store for costs of RTL expressions.
* config/i386/x86-tune-costs.h: Duplicate int_load, int_store,
xmm_move, sse_to_integer, integer_to_sse, sse_load, sse_store
for costs of RTL expressions.  Use sse_unaligned_load and
sse_unaligned_store only for costs of RTL expressions.

-- 
H.J.
From 1c04d184860d613ba0d789b9bc4e8754ca283e1e Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 14 Jun 2019 13:30:16 -0700
Subject: [PATCH] i386: Separate costs of RTL expressions from costs of moves

processor_costs has costs of RTL expressions and costs of moves:

1. Costs of RTL expressions is computed as COSTS_N_INSNS which are used
to generate RTL expressions with the lowest costs.  Costs of RTL memory
operation can be very close to costs of fast instructions to indicate
fast memory operations.

2. After RTL expressions have been generated, costs of moves are used by
TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST to compute move
costs for register allocator.  Costs of load and store are higher than
costs of register moves to reduce stack usages by register allocator.

We should separate costs of RTL expressions from costs of moves so that
they can be adjusted independently.  This patch moves costs of moves to
the new used_by_ra field and duplicates costs of moves which are also
used for costs of RTL expressions.

All cost models have been checked with

static void
check_one (const struct processor_costs *p)
{
  if (p->used_by_ra.int_load[2] != p->int_load)
abort ();
  if (p->used_by_ra.int_store[2] != p->int_store)
abort ();
  if (p->used_by_ra.xmm_move != p->xmm_move)
abort ();
  if (p->used_by_ra.sse_to_integer != p->sse_to_integer)
abort ();
  if (p->used_by_ra.integer_to_sse != p->integer_to_sse)
abort ();
  if (memcmp (p->used_by_ra.sse_load, p->sse_load, sizeof (p->sse_load)))
abort ();
  if (memcmp (p->used_by_ra.sse_store, p->sse_store, sizeof (p->sse_store)))
abort ();
}

static void
check_cost ()
{
 check_one (&ix86_size_cost);
  for (unsigned int i = 0; i < ARRAY_SIZE (processor_cost_table); i++)
check_one (processor_cost_table[i]);
}

by calling check_cost from ix86_option_override_internal.

	PR target/90878
	* config/i386/i386-features.c
	(dimode_scalar_chain::compute_convert_gain): Replace int_store[2]
	and int_load[2] with int_store and int_load.
	* config/i386/i386.c (inline_memory_move_cost): Use used_by_ra
	for costs of moves.
	(ix86_register_move_cost): Likewise.
	(ix86_builtin_vectorization_cost): Replace int_store[2] and
	int_load[2] with int_store and int_load.
	* config/i386/i386.h (processor_costs): Move costs of moves to
	used_by_ra.  Add int_load, int_store, xmm_move, sse_to_integer,
	integer_to_sse, sse_load, sse_store, sse_unaligned_load a

Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Sandiford
Jan Hubicka  writes:
>> >> > 
>> >> > A more stricter test would be
>> >> > 
>> >> > if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, 
>> >> > size2))
>> >> >   return true;
>> >> > /* If there's a variable access in one of the refs fall through
>> >> >to access-path based disambiguation.  */
>> >> > 
>> >> > where you'd need to pass down ao_ref_size in addition to max_size as 
>> >> > well.
>> >> 
>> >> Proably || here?
>> >
>> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
>> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
>> >
>> > But yes, if you don't want that then || is cheaper.  Probably add
>> > another testcase with one of the accesses with a constant index?
>> 
>> Might be misunderstanding, but isn't the check for a variable access
>> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
>> means "could be equal in some circumstances, even if they're not always
>> equal".
>
> Seems I am getting progressively more confused.

Well, me too :-)  I didn't really understand the choice of the original
condition above.  It seemed to be "return true if both access sizes are
variable", but the comment implied something else.

But I was just picking up on the choice of maybe_eq vs. known_eq in the
condition, which as things stand would only matter for SVE.  And...

> I think I want to pass the size down from ao_ref to the functions and
> then see if they are always equal.

...right, that's what I mean.  known_eq gives you "always equal",
i.e. "not variable".  The maybe_eq in the original condition seemed odd
because it includes the constant case and (potentially) variable cases
that are size1 or bigger.

In inverted conditions, !known_eq can be written maybe_ne (but doesn't
need to be, if !known_eq seems clearer).

Thanks,
Richard

>
> Why would I want to use maybe variant?
>
> Honza
>
>> 
>> Richard


Re: C++ PATCH for c++/83820 - excessive attribute arguments not detected

2019-06-17 Thread Jason Merrill
Ok.

On Sun, Jun 16, 2019, 3:10 PM Marek Polacek  wrote:

> While messing with [[noreturn]] I also found out that we don't detect
> the case when an attribute specifier that takes no arguments contains
> an attribute-argument-clause.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2019-06-16  Marek Polacek  
>
> PR c++/83820 - excessive attribute arguments not detected.
> * parser.c (cp_parser_std_attribute): Detect excessive arguments.
>
> * g++.dg/cpp0x/gen-attrs-67.C: New test.
>
> diff --git gcc/cp/parser.c gcc/cp/parser.c
> index 8f5ae84670a..871bc45da63 100644
> --- gcc/cp/parser.c
> +++ gcc/cp/parser.c
> @@ -26149,6 +26149,20 @@ cp_parser_std_attribute (cp_parser *parser, tree
> attr_ns)
>  vec *vec;
>  int attr_flag = normal_attr;
>
> +/* Maybe we don't expect to see any arguments for this attribute.  */
> +const attribute_spec *as
> +  = lookup_attribute_spec (TREE_PURPOSE (attribute));
> +if (as && as->max_length == 0)
> +  {
> +   error_at (token->location, "attribute %qE does not take any
> arguments",
> + attr_id);
> +   cp_parser_skip_to_closing_parenthesis (parser,
> +  /*recovering=*/true,
> +  /*or_comma=*/false,
> +  /*consume_paren=*/true);
> +   return error_mark_node;
> +  }
> +
>  if (attr_ns == gnu_identifier
> && attribute_takes_identifier_p (attr_id))
>/* A GNU attribute that takes an identifier in parameter.  */
> diff --git gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C
> gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C
> new file mode 100644
> index 000..bbbedd0240a
> --- /dev/null
> +++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C
> @@ -0,0 +1,11 @@
> +// PR c++/83820 - excessive attribute arguments not detected.
> +// { dg-do compile { target c++11 } }
> +
> +[[noreturn()]] void f0 (); // { dg-error "attribute .noreturn. does not
> take any arguments" }
> +[[noreturn(1)]] void f1 (); // { dg-error "attribute .noreturn. does not
> take any arguments" }
> +[[noreturn(1, 2)]] void f2 (); // { dg-error "attribute .noreturn. does
> not take any arguments" }
> +[[maybe_unused()]] int f3(); // { dg-error "attribute .maybe_unused. does
> not take any arguments" }
> +[[nodiscard()]] int f4(); // { dg-error "attribute .nodiscard. does not
> take any arguments" }
> +[[gnu::noinline()]] int f5(); // { dg-error "attribute .noinline. does
> not take any arguments" }
> +[[gnu::constructor]] int f6();
> +[[gnu::constructor(101)]] int f7();
>


Re: [PATCH] aarch64: fix gcc.target/aarch64/pcs_attribute-2.c on non-gnu targets

2019-06-17 Thread Szabolcs Nagy
On 07/06/2019 17:03, Szabolcs Nagy wrote:
> Move the ifunc symbol tests into a separate file with dg-require-ifunc.
> And added a base pcs ifunc symbol to the test for completeness.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-07  Szabolcs Nagy  
> 
>   * gcc.target/aarch64/pcs_attribute-2.c: Remove ifunc usage.
>   * gcc.target/aarch64/pcs_attribute-3.c: New test.
> 

ping.

this fixes a test issue.


[PATCH] Fix AIX test failure due to replacement operator delete

2019-06-17 Thread Jonathan Wakely

On AIX the sized delete defined in the library will call the non-sized
delete defined in the library, not the replacement version defined in
the test file. By also replacing sized delete we make the test pass
everywhere.

* testsuite/20_util/allocator/1.cc: Add sized delete, which fixes a
failure on AIX.

This was broken by the recent change to make std::allocator use sized
delete.

Tested x86_64-linux and powerpc-aix, committed to trunk.


commit 8a80c5c7319cd3c2d43457d7a3fbdd9fc2712e03
Author: redi 
Date:   Mon Jun 17 15:51:31 2019 +

Fix AIX test failure due to replacement operator delete

On AIX the sized delete defined in the library will call the non-sized
delete defined in the library, not the replacement version defined in
the test file. By also replacing sized delete we make the test pass
everywhere.

* testsuite/20_util/allocator/1.cc: Add sized delete, which fixes a
failure on AIX.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272391 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/20_util/allocator/1.cc 
b/libstdc++-v3/testsuite/20_util/allocator/1.cc
index 8ea08958f8d..c8a74dacf63 100644
--- a/libstdc++-v3/testsuite/20_util/allocator/1.cc
+++ b/libstdc++-v3/testsuite/20_util/allocator/1.cc
@@ -24,6 +24,12 @@
 #include 
 #include 
 
+#if __cplusplus >= 201103L
+# define NOTHROW noexcept
+#else
+# define NOTHROW throw()
+#endif
+
 struct gnu { };
 
 bool check_new = false;
@@ -36,12 +42,19 @@ operator new(std::size_t n) THROW(std::bad_alloc)
   return std::malloc(n);
 }
 
-void operator delete(void *v) throw()
+void operator delete(void *v) NOTHROW
 {
   check_delete = true;
   return std::free(v);
 }
 
+#if __cpp_sized_deallocation
+void operator delete(void *v, std::size_t) NOTHROW
+{
+  ::operator delete(v);
+}
+#endif
+
 void test01()
 {
   std::allocator obj;


Re: C++ PATCH for c++/83820 - excessive attribute arguments not detected

2019-06-17 Thread Martin Sebor

On 6/16/19 1:10 PM, Marek Polacek wrote:

While messing with [[noreturn]] I also found out that we don't detect
the case when an attribute specifier that takes no arguments contains
an attribute-argument-clause.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-06-16  Marek Polacek  

PR c++/83820 - excessive attribute arguments not detected.
* parser.c (cp_parser_std_attribute): Detect excessive arguments.

* g++.dg/cpp0x/gen-attrs-67.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 8f5ae84670a..871bc45da63 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -26149,6 +26149,20 @@ cp_parser_std_attribute (cp_parser *parser, tree 
attr_ns)
  vec *vec;
  int attr_flag = normal_attr;
  
+/* Maybe we don't expect to see any arguments for this attribute.  */

+const attribute_spec *as
+  = lookup_attribute_spec (TREE_PURPOSE (attribute));
+if (as && as->max_length == 0)
+  {
+   error_at (token->location, "attribute %qE does not take any arguments",
+ attr_id);


Not to be too anal about this but most messages have the word order
reversed and would be phrased as

  %qE attribute does not take any arguments

I've been adjusting the order to match this form as I notice it so
it would be great if we could use this form in new diagnostics as
well.

Thanks!
Martin


+   cp_parser_skip_to_closing_parenthesis (parser,
+  /*recovering=*/true,
+  /*or_comma=*/false,
+  /*consume_paren=*/true);
+   return error_mark_node;
+  }
+
  if (attr_ns == gnu_identifier
&& attribute_takes_identifier_p (attr_id))
/* A GNU attribute that takes an identifier in parameter.  */
diff --git gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C 
gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C
new file mode 100644
index 000..bbbedd0240a
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C
@@ -0,0 +1,11 @@
+// PR c++/83820 - excessive attribute arguments not detected.
+// { dg-do compile { target c++11 } }
+
+[[noreturn()]] void f0 (); // { dg-error "attribute .noreturn. does not take any 
arguments" }
+[[noreturn(1)]] void f1 (); // { dg-error "attribute .noreturn. does not take any 
arguments" }
+[[noreturn(1, 2)]] void f2 (); // { dg-error "attribute .noreturn. does not take 
any arguments" }
+[[maybe_unused()]] int f3(); // { dg-error "attribute .maybe_unused. does not take 
any arguments" }
+[[nodiscard()]] int f4(); // { dg-error "attribute .nodiscard. does not take any 
arguments" }
+[[gnu::noinline()]] int f5(); // { dg-error "attribute .noinline. does not take any 
arguments" }
+[[gnu::constructor]] int f6();
+[[gnu::constructor(101)]] int f7();





Re: C++ PATCH for c++/60364 - noreturn after first decl not diagnosed (v3)

2019-06-17 Thread Jakub Jelinek
On Sun, Jun 16, 2019 at 12:36:58PM -0400, Marek Polacek wrote:
> On Sun, Jun 16, 2019 at 06:18:56PM +0200, Jakub Jelinek wrote:
> > On Sun, Jun 16, 2019 at 12:10:37PM -0400, Marek Polacek wrote:
> > > > Guess I will really have to make the changes to treat [[noreturn]] 
> > > > similarly
> > > > to e.g. [[nodiscard]], so that cxx11_attribute_p works.
> > > 
> > > Thus.  Changes I've made:
> > > * don't treat [[noreturn]] as an equivalent to __attribute__((noreturn));
> > > * for that I had to adjust decl_attributes, it wasn't preserving the
> > >   C++11 form (a list in another list); fix shadowing while at it;
> > > * the above turned up two spots that were wrongly accessing TREE_PURPOSE
> > >   directly instead of using get_attribute_name;
> > > * give error only for [[noreturn]] but not for __attribute__((noreturn))
> > >   or [[gnu::noreturn]].
> > > 
> > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > I'd prefer to defer review of this to Jason, just want to note that I don't
> > see any testsuite coverage on mixing declarations with different forms of
> > attributes ([[noreturn]] on one decl and __attribute__((noreturn)) or
> > [[gnu::noreturn]] on another one or vice versa.
> 
> Added now.  I suppose it should compile fine, which it does.

I meant also the tests of the new diagnostics, say if you have
a decl without any of those attributes, then gnu:: one (or __attribute__
one; that merge decls should be ok) and on third decl [[noreturn]] (shall
that diagnose anything or not?  As there is no way to differentiate it from
the gnu:: attribute on the very first one, I'd say it shouldn't, with the
use of the gnu:: or __attribute__ we are already outside of the standard.

Jakub


Re: [nvptx] Fix missing mode warnings in nvptx.md, omp part

2019-06-17 Thread Jakub Jelinek
On Mon, Jun 17, 2019 at 04:53:24PM +0200, Tom de Vries wrote:
> Updated accordingly, and committed as attached.

Note, current trunk allows one to define expanders that take mode as the
first argument, so you could
(define_insn "@set_softstack_"
  [(unspec [(match_operand:P 0 "nvptx_register_operand" "R")]
UNSPEC_SET_SOFTSTACK)]
  "TARGET_SOFT_STACK"
...
and then just use gen_set_softstack (Pmode, arg).

Jakub


Re: C++ PATCH for c++/60364 - noreturn after first decl not diagnosed (v2)

2019-06-17 Thread Martin Sebor

On 6/16/19 10:10 AM, Marek Polacek wrote:

On Sat, Jun 15, 2019 at 10:39:13AM -0400, Marek Polacek wrote:

On Sat, Jun 15, 2019 at 04:33:26PM +0200, Jakub Jelinek wrote:

On Sat, Jun 15, 2019 at 10:29:17AM -0400, Marek Polacek wrote:

[dcl.attr.noreturn] says "The first declaration of a function shall specify the
noreturn attribute if any declaration of that function specifies the noreturn
attribute" meaning that we should diagnose

   void func ();
   void func [[noreturn]] ();

but we do not.  I'd been meaning to issue a hard error for [[noreturn]] and
only a warning for __attribute__((noreturn)) but then I found out that we
treat [[noreturn]] exactly as the GNU attribute, and so cxx11_attribute_p
returns false for it, so I decided to make it a pedwarn for all the cases.
A pedwarn counts as a diagnostic, so we'd be conforming still.

Bootstrapped/regtested on x86_64-linux, ok for trunk?


IMHO we should treat __attribute__((noreturn)) as before without any
warnings, just [[noreturn]] that way.  There is nothing wrong on declaring
it just on second or following declaration, it is an optimization attribute.


That's a complication then; currently [[noreturn]], [[gnu::noreturn]], and
__attribute__((noreturn)) are not distinguishable.  :(

Guess I will really have to make the changes to treat [[noreturn]] similarly
to e.g. [[nodiscard]], so that cxx11_attribute_p works.


Thus.  Changes I've made:
* don't treat [[noreturn]] as an equivalent to __attribute__((noreturn));
* for that I had to adjust decl_attributes, it wasn't preserving the
   C++11 form (a list in another list); fix shadowing while at it;
* the above turned up two spots that were wrongly accessing TREE_PURPOSE
   directly instead of using get_attribute_name;
* give error only for [[noreturn]] but not for __attribute__((noreturn))
   or [[gnu::noreturn]].

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-06-16  Marek Polacek  

PR c++/60364 - noreturn after first decl not diagnosed.
* attribs.c (get_attribute_namespace): No longer static.
(decl_attributes): Avoid shadowing.  Preserve the C++11 form for C++11
attributes.
* attribs.h (get_attribute_namespace): Declare.
* tree-inline.c (function_attribute_inlinable_p): Use
get_attribute_name.

* c-attribs.c (handle_noreturn_attribute): No longer static.
* c-common.h (handle_noreturn_attribute): Declare.
* c-format.c (check_function_format): Use get_attribute_name.

* decl.c (duplicate_decls): Give an error when a function is
declared [[noreturn]] after its first declaration.
* parser.c (cp_parser_std_attribute): Don't treat C++11 noreturn
attribute as equivalent to GNU's.
* tree.c (std_attribute_table): Add noreturn.

* g++.dg/warn/noreturn-8.C: New test.

...

diff --git gcc/cp/tree.c gcc/cp/tree.c
index cd021b7f594..bb695e14e73 100644
--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -4453,6 +4453,8 @@ const struct attribute_spec std_attribute_table[] =
  handle_likeliness_attribute, attr_cold_hot_exclusions },
{ "unlikely", 0, 0, false, false, false, false,
  handle_likeliness_attribute, attr_cold_hot_exclusions },
+  { "noreturn", 0, 0, true, false, false, false,
+handle_noreturn_attribute, NULL },

  

The GNU attribute is made mutually exclusive with a bunch of other
attributes (e.g., malloc or warn_unused_result) by setting the last
member to the array of exclusive attribute.  Does the change preserve
this relationship some other way?

Martin


Re: [nvptx] Fix missing mode warnings in nvptx.md, omp part

2019-06-17 Thread Tom de Vries
On 17-06-19 14:35, Alexander Monakov wrote:
> On Mon, 17 Jun 2019, Tom de Vries wrote:
> 
>> Hi Alexander,
>>
>> any comments?
> 
> A couple suggestions (see below), but no serious concerns from my side.
> 
>> --- a/gcc/config/nvptx/nvptx.c
>> +++ b/gcc/config/nvptx/nvptx.c
>> @@ -112,6 +112,17 @@ enum nvptx_data_area
>>DATA_AREA_MAX
>>  };
>>  
>> +rtx
>> +gen_set_softstack_insn (rtx op)
>> +{
> 
> Here it might be appropriate to have something like
> 
>   gcc_assert (GET_MODE (op) == Pmode);
> 
> because failure to supply a Pmode operand indicates a bug in the caller and
> it may be desirable to catch it here; doesn't make sense to allow restoring
> stack pointer from a 32-bit register with a 64-bit address space.
> 
> Likewise for other instances of this 'if (... DImode) ... else if (... 
> SImode) ...'
> pattern in the patch.
> 

Good point, implemented.

>> +  if (GET_MODE (op) == DImode)
>> +return gen_set_softstack_insn_di (op);
>> +  else if (GET_MODE (op) == SImode)
>> +return gen_set_softstack_insn_si (op);
>> +  else
>> +gcc_unreachable ();
>> +}
>> +
>>  /*  We record the data area in the target symbol flags.  */
>>  #define SYMBOL_DATA_AREA(SYM) \
>>(nvptx_data_area)((SYMBOL_REF_FLAGS (SYM) >> SYMBOL_FLAG_MACH_DEP_SHIFT) \
>> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
>> index 3ed5296db96..1a676b942b2 100644
>> --- a/gcc/config/nvptx/nvptx.md
>> +++ b/gcc/config/nvptx/nvptx.md
>> @@ -1071,8 +1071,8 @@
>>DONE;
>>  })
>>  
>> -(define_insn "set_softstack_insn"
>> -  [(unspec [(match_operand 0 "nvptx_register_operand" "R")]
>> +(define_insn "set_softstack_insn_"
>> +  [(unspec [(match_operand:P 0 "nvptx_register_operand" "R")]
> 
> A purely cosmetic issue, but I'll mention it for completeness: the '_insn'
> suffix is unnecessary with the '_' suffix added. This is how e.g. the
> i386 backend deals with a similar issue, they have 'stack_protect_test' expand
> and 'stack_protect_test_' insns.

Updated accordingly, and committed as attached.

Thanks,
- Tom
[nvptx] Fix missing mode warnings in nvptx.md, omp part

Fix these warnings:
...
gcc/config/nvptx/nvptx.md:1074:1: warning: operand 0 missing mode?
gcc/config/nvptx/nvptx.md:1240:1: warning: operand 0 missing mode?
gcc/config/nvptx/nvptx.md:1240:1: warning: operand 1 missing mode?
gcc/config/nvptx/nvptx.md:1240:1: warning: operand 2 missing mode?
gcc/config/nvptx/nvptx.md:1268:1: warning: operand 0 missing mode?
...

Build and reg-tested on x86_64 with nvptx accelerator.

2019-06-17  Tom de Vries  

	* config/nvptx/nvptx-protos.h (gen_set_softstack_insn): Declare.
	* config/nvptx/nvptx.c (gen_set_softstack_insn): New function.
	* config/nvptx/nvptx.md (define_insn "set_softstack_insn"): Rename to
	...
	(define_insn "set_softstack_"): ... this.  Use P iterator on
	match_operand 0.
	(define_insn "omp_simt_enter_insn"): Rename to ...
	(define_insn "omp_simt_enter_"): ... this.  Use P iterator on
	match_operand 0, 1 and 2, as well as the unspec_volatile result.
	(define_expand "omp_simt_enter): Use gen_omp_simt_enter_di and
	gen_omp_simt_enter_si.
	(define_expand "omp_simt_exit"): New.
	(define_insn "omp_simt_exit"): Rename to ...
	(define_insn "omp_simt_exit_"): ... this.  Use P iterator on
	match_operand 0.

---
 gcc/config/nvptx/nvptx-protos.h |  1 +
 gcc/config/nvptx/nvptx.c| 12 
 gcc/config/nvptx/nvptx.md   | 38 +-
 3 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index be09a15e49c..061897a3921 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -57,5 +57,6 @@ extern const char *nvptx_output_set_softstack (unsigned);
 extern const char *nvptx_output_simt_enter (rtx, rtx, rtx);
 extern const char *nvptx_output_simt_exit (rtx);
 extern const char *nvptx_output_red_partition (rtx, rtx);
+extern rtx gen_set_softstack_insn (rtx);
 #endif
 #endif
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c53a1ae9f26..aa4a67fbead 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -112,6 +112,18 @@ enum nvptx_data_area
   DATA_AREA_MAX
 };
 
+rtx
+gen_set_softstack_insn (rtx op)
+{
+  gcc_assert (GET_MODE (op) == Pmode);
+  if (GET_MODE (op) == DImode)
+return gen_set_softstack_di (op);
+  else if (GET_MODE (op) == SImode)
+return gen_set_softstack_si (op);
+  else
+gcc_unreachable ();
+}
+
 /*  We record the data area in the target symbol flags.  */
 #define SYMBOL_DATA_AREA(SYM) \
   (nvptx_data_area)((SYMBOL_REF_FLAGS (SYM) >> SYMBOL_FLAG_MACH_DEP_SHIFT) \
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 3ed5296db96..84c0ea45431 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -1071,8 +1071,8 @@
   DONE;
 })
 
-(define_insn "set_softstack_insn"
-  [(unspec [(match_operand 0 "nvptx_register_operand" "R")]
+(define_insn "set_softstack_"
+  [(u

Re: [patch][aarch64] add netbsd/aarch64 target

2019-06-17 Thread Richard Earnshaw (lists)
On 14/06/2019 16:38, co...@sdf.org wrote:
> Hi folks,
> 
> This patch adds support for netbsd/aarch64.
> It would be nice to have it committed, please tell me if anything is
> wrong.
> 
> Thanks.
> 
> Matthew Green 
> Maya Rashish 
> 
> gcc:
>   * config.gcc (aarch64*-*-netbsd*): New target.
>   * config/aarch64/aarch64-netbsd.h: New file.
>   * config/aarch64/t-aarch64-netbsd: Likewise.
> 
> libgcc:
>   * config.host (aarch64*-*-netbsd*): New case.
> 
> 

Thanks, I've put this in.

R.

> 
> ---
>  gcc/config.gcc  |  6 +++
>  gcc/config/aarch64/aarch64-netbsd.h | 80 +
>  gcc/config/aarch64/t-aarch64-netbsd | 21 
>  libgcc/config.host  |  6 +++
>  4 files changed, 113 insertions(+)
>  create mode 100644 gcc/config/aarch64/aarch64-netbsd.h
>  create mode 100644 gcc/config/aarch64/t-aarch64-netbsd
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 0f80e836f4e..678c4ec51a3 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1021,6 +1021,12 @@ aarch64*-*-freebsd*)
>   tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-freebsd"
>   tm_defines="${tm_defines}  TARGET_DEFAULT_ASYNC_UNWIND_TABLES=1"
>   ;;
> +aarch64*-*-netbsd*)
> + tm_file="${tm_file} dbxelf.h elfos.h ${nbsd_tm_file}"
> + tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-netbsd.h"
> + tmake_file="${tmake_file} aarch64/t-aarch64 aarch64/t-aarch64-netbsd"
> + extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
> + ;;
>  aarch64*-*-linux*)
>   tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h"
>   tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-linux.h"
> diff --git a/gcc/config/aarch64/aarch64-netbsd.h 
> b/gcc/config/aarch64/aarch64-netbsd.h
> new file mode 100644
> index 000..72fe6a2bdb5
> --- /dev/null
> +++ b/gcc/config/aarch64/aarch64-netbsd.h
> @@ -0,0 +1,80 @@
> +/* Definitions for AArch64 running NetBSD
> +   Copyright (C) 2016-2019 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   .  */
> +
> +#ifndef GCC_AARCH64_NETBSD_H
> +#define GCC_AARCH64_NETBSD_H
> +
> +#define TARGET_LINKER_BIG_EMULATION "aarch64nbsdb"
> +#define TARGET_LINKER_LITTLE_EMULATION "aarch64nbsd"
> +
> +#if TARGET_BIG_ENDIAN_DEFAULT
> +#define TARGET_LINKER_EMULATION  TARGET_LINKER_BIG_EMULATION
> +#else
> +#define TARGET_LINKER_EMULATION  TARGET_LINKER_LITTLE_EMULATION
> +#endif
> +
> +#undef  SUBTARGET_EXTRA_LINK_SPEC
> +#define SUBTARGET_EXTRA_LINK_SPEC " -m" TARGET_LINKER_EMULATION
> +
> +#define NETBSD_ENTRY_POINT "__start"
> +
> +#define NETBSD_TARGET_LINK_SPEC  "%{h*}  \
> +   -X %{mbig-endian:-EB -m " TARGET_LINKER_BIG_EMULATION "} \
> +   %{mlittle-endian:-EL -m " TARGET_LINKER_LITTLE_EMULATION "} \
> +   %(netbsd_link_spec)"
> +
> +#if TARGET_FIX_ERR_A53_835769_DEFAULT
> +#define CA53_ERR_835769_SPEC \
> +  " %{!mno-fix-cortex-a53-835769:--fix-cortex-a53-835769}"
> +#else
> +#define CA53_ERR_835769_SPEC \
> +  " %{mfix-cortex-a53-835769:--fix-cortex-a53-835769}"
> +#endif
> +
> +#ifdef TARGET_FIX_ERR_A53_843419_DEFAULT
> +#define CA53_ERR_843419_SPEC \
> +  " %{!mno-fix-cortex-a53-843419:--fix-cortex-a53-843419}"
> +#else
> +#define CA53_ERR_843419_SPEC \
> +  " %{mfix-cortex-a53-843419:--fix-cortex-a53-843419}"
> +#endif
> +
> +#undef  LINK_SPEC
> +#define LINK_SPEC NETBSD_LINK_SPEC_ELF   \
> +   NETBSD_TARGET_LINK_SPEC   \
> +  CA53_ERR_835769_SPEC   \
> +  CA53_ERR_843419_SPEC
> +
> +#undef TARGET_OS_CPP_BUILTINS
> +#define TARGET_OS_CPP_BUILTINS() \
> +  do \
> +{\
> +  NETBSD_OS_CPP_BUILTINS_ELF();  \
> +}\
> +  while (0)
> +
> +#undef SUBTARGET_CPP_SPEC
> +#define SUBTARGET_CPP_SPEC NETBSD_CPP_SPEC
> +
> +#undef EXTRA_SPECS
> +#define EXTRA_SPECS \
> +  { "asm_cpu_spec", ASM_CPU_SPEC }, \
> +  NETBSD_SUBTARGET_EXTRA_SPECS
> +
> +#endif  /* GCC_AARCH64_NETBSD_H */
> diff --git a/gcc/config/aarch64/t-aarch64-netbsd 
> b/gcc/config/aarch64/t-aarch64-netbsd
> new file mode 100644
> index 000..aa447d0f6d4
> --- /de

[PATCH] Add 'noexcept' to std::lerp

2019-06-17 Thread Jonathan Wakely

* include/c_global/cmath (__lerp, lerp): Add noexcept (LWG 3201).

Tested x86_64-linux, committed to trunk.


commit 4b48ab1965d6d690f6b6b0bb55f48149d1cbbb25
Author: redi 
Date:   Mon Jun 17 14:32:44 2019 +

Add 'noexcept' to std::lerp

* include/c_global/cmath (__lerp, lerp): Add noexcept (LWG 3201).

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272386 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/c_global/cmath 
b/libstdc++-v3/include/c_global/cmath
index b843c18f1da..01e56a559fe 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -1891,7 +1891,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 constexpr _Fp
-__lerp(_Fp __a, _Fp __b, _Fp __t)
+__lerp(_Fp __a, _Fp __b, _Fp __t) noexcept
 {
   if (__a <= 0 && __b >= 0 || __a >= 0 && __b <= 0)
return __t * __b + (1 - __t) * __a;
@@ -1908,15 +1908,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   constexpr float
-  lerp(float __a, float __b, float __t)
+  lerp(float __a, float __b, float __t) noexcept
   { return std::__lerp(__a, __b, __t); }
 
   constexpr double
-  lerp(double __a, double __b, double __t)
+  lerp(double __a, double __b, double __t) noexcept
   { return std::__lerp(__a, __b, __t); }
 
   constexpr long double
-  lerp(long double __a, long double __b, long double __t)
+  lerp(long double __a, long double __b, long double __t) noexcept
   { return std::__lerp(__a, __b, __t); }
 #endif // C++20
 


Re: Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
> >> > 
> >> > A more stricter test would be
> >> > 
> >> >  if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
> >> >   return true;
> >> > /* If there's a variable access in one of the refs fall through
> >> >to access-path based disambiguation.  */
> >> > 
> >> > where you'd need to pass down ao_ref_size in addition to max_size as 
> >> > well.
> >> 
> >> Proably || here?
> >
> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> >
> > But yes, if you don't want that then || is cheaper.  Probably add
> > another testcase with one of the accesses with a constant index?
> 
> Might be misunderstanding, but isn't the check for a variable access
> !known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
> means "could be equal in some circumstances, even if they're not always
> equal".

Seems I am getting progressively more confused.  I think I want to pass
the size down from ao_ref to the functions and then see if they are
always equal.

Why would I want to use maybe variant?

Honza

> 
> Richard


Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
>
>> > 
>> > But part of the expensiveness we want to avoid is this
>> > (repeated) walking of the ref tree...
>> 
>> I was considering to pass contains_union_p down from one of
>> earlier walks, but did not find suitable one for that...
>> > 
>> > So...
>> > 
>> > > +  return !handled_component_p (ref2);
>> > > +}
>> > > +
>> > >  /* Return true if an indirect reference based on *PTR1 constrained
>> > > to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
>> > > constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
>> > > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
>> > >&& (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
>> > >|| (TYPE_SIZE (TREE_TYPE (base1))
>> > >&& TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == 
>> > > INTEGER_CST)))
>> > > -return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
>> > > max_size2);
>> > > +{
>> > > +  if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
>> > > max_size2))
>> > > +return false;
>> > > +  if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
>> > > +return true;
>> > 
>> > how about a simpler test like
>> > 
>> >  if (known_size_p (max_size1) && known_size_p (max_size2))
>> >return true;
>> >  /* If there's an unconstrained variable access in the ref fall
>> >through to access-path based disambiguation.  */
>> 
>> If I have something like
>>  struct a {int a[10];int b;}
>> and then
>>  aptr->a[i]
>> in the access path, won't be max_size known (40) where type size is 4?
>
> Yes.
>
>> In this case I want to contiue to access path.
>> > 
>> > ?
>> > 
>> > I'd certainly like to see testcases btw...
>> 
>> There is a testcase for variable array access included in the patch,
>> would you like to have one with union in it?
>
> Oh, I missed the testcase ;)
>
>> > 
>> > A more stricter test would be
>> > 
>> >if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
>> >   return true;
>> > /* If there's a variable access in one of the refs fall through
>> >to access-path based disambiguation.  */
>> > 
>> > where you'd need to pass down ao_ref_size in addition to max_size as well.
>> 
>> Proably || here?
>
> Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
>
> But yes, if you don't want that then || is cheaper.  Probably add
> another testcase with one of the accesses with a constant index?

Might be misunderstanding, but isn't the check for a variable access
!known_eq (max_size1, size1) == maybe_ne (max_size1, size1)?  "maybe_eq"
means "could be equal in some circumstances, even if they're not always
equal".

Richard


Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Kewen.Lin
on 2019/6/17 下午9:44, Richard Biener wrote:
> On Mon, 17 Jun 2019, Kewen.Lin wrote:
> 
>> on 2019/6/17 下午8:08, Richard Biener wrote:
>>> On June 17, 2019 11:59:30 AM GMT+02:00, Segher Boessenkool 
>>>  wrote:
 On Mon, Jun 17, 2019 at 10:51:49AM +0200, Richard Biener wrote:
> On Mon, 17 Jun 2019, Kewen.Lin wrote:
>
>> Hi Segher and Bill,
>>
>> Thanks a lot for your review comments! I've updated the patch
 accordingly.
>>
>> The updated one attached.
>
> OK.  I suppose all low-overhead loop instructions use a decrement to
 zero
> style iterator?

 The documentation says decrement to 0, decrement to -1, and decrement
 to
 any negative are all supported.  But all are decrement, yes.
>>>
>>> Hmm, so I wonder if we should tell IVOPTS the kind because the IV generated 
>>> has to match RTL doloops expectations? Thus return an enum value from the 
>>> hook? 
>>>
>>
>> I guess we don't need to make it in IVOPTs, since doloop_optimize can 
>> transform the loop closing to
>> its expected pattern once it gets the iteration count, with target hook 
>> doloop_end.
>> It's to modify the original closing into:
>>   niter   -> 0
>>   niter-1 -> -1
>>   niter-n -> -n
> 
> True, but this IV may affect other IVs choice (well, at least if it is
> "cheap" to use the doloop IV in derived IVs which it is not for power).
> 

Good point!  For those targets like power having counter register, it would be 
fine.
For the other, I'm sorry that I'm not sure how the sequence looks like.
All decrement/cmp kept?  Just like normal but it has hardware fusion support?
Maybe we can add one more iv cand specific for this kind of use, assign 
negative cost
for this specific pair (group/cand), it can give chances to be selected for 
other uses
if it's better?


Thanks,
Kewen

> Richard.
> 



Re: [PATCH] PR libstdc++/90281 Fix string conversions for filesystem::path

2019-06-17 Thread Jonathan Wakely

On 30/04/19 16:06 +0100, Jonathan Wakely wrote:

Fix several bugs in the encoding conversions for filesystem::path that
prevent conversion of Unicode characters outside the Basic Multilingual
Plane, and prevent returning basic_string specializations with
alternative allocator types.

The std::codecvt_utf8 class template is not suitable for UTF-16
conversions because it uses UCS-2 instead. For conversions between UTF-8
and UTF-16 either std::codecvt or
codecvt_utf8_utf16 must be used.

The __str_codecvt_in and __str_codecvt_out utilities do not
return false on a partial conversion (e.g. for invalid or incomplete
Unicode input). Add new helpers that treat partial conversions as
errors, and use them for all filesystem::path conversions.

PR libstdc++/90281 Fix string conversions for filesystem::path
* include/bits/fs_path.h (u8path) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]:
Use codecvt_utf8_utf16 instead of codecvt_utf8. Use
__str_codecvt_in_all to fail for partial conversions and throw on
error.
[!_GLIBCXX_FILESYSTEM_IS_WINDOWS && _GLIBCXX_USE_CHAR8_T]
(path::_Cvt): Add explicit specialization.
[_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::_Cvt::_S_wconvert): Remove
overloads.
[_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::_Cvt::_S_convert): Use
if-constexpr instead of dispatching to _S_wconvert. Use codecvt
instead of codecvt_utf8. Use __str_codecvt_in_all and
__str_codecvt_out_all.
[!_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::_Cvt::_S_convert): Use
codecvt instead of codecvt_utf8. Use __str_codecvt_out_all.
(path::_S_str_convert) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Use
codecvt_utf8_utf16 instead of codecvt_utf8. Construct return values
with allocator. Use __str_codecvt_out_all. Fallthrough to POSIX code
after converting to UTF-8.
(path::_S_str_convert): Use codecvt instead of codecvt_utf8. Use
__str_codecvt_in_all.
(path::string): Fix initialization of string types with different
allocators.
(path::u8string) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Use
codecvt_utf8_utf16 instead of codecvt_utf8. Use __str_codecvt_out_all.
* include/bits/locale_conv.h (__do_str_codecvt): Reorder static and
runtime conditions.
(__str_codecvt_out_all, __str_codecvt_in_all): New functions that
return false for partial conversions.
* include/experimental/bits/fs_path.h (u8path):
[_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Implement correctly for mingw.
[_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::_Cvt::_S_wconvert): Add
missing handling for char8_t. Use codecvt and codecvt_utf8_utf16
instead of codecvt_utf8. Use __str_codecvt_in_all and
__str_codecvt_out_all.
[!_GLIBCXX_FILESYSTEM_IS_WINDOWS] (path::_Cvt::_S_convert): Use
codecvt instead of codecvt_utf8. Use __str_codecvt_out_all.
(path::string) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Use
codecvt_utf8_utf16 instead of codecvt_utf8. Construct return values
with allocator. Use __str_codecvt_out_all and __str_codecvt_in_all.
(path::string) [!_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Use
__str_codecvt_in_all.
(path::u8string) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Use
codecvt_utf8_utf16 instead of codecvt_utf8. Use __str_codecvt_out_all.
* src/c++17/fs_path.cc (path::_S_convert_loc): Use
__str_codecvt_in_all.
* src/filesystem/path.cc (path::_S_convert_loc): Likewise.
* testsuite/27_io/filesystem/path/construct/90281.cc: New test.
* testsuite/27_io/filesystem/path/factory/u8path.cc: New test.
* testsuite/27_io/filesystem/path/native/string.cc: Test with empty
strings and with Unicode characters outside the basic multilingual
plane.
* testsuite/27_io/filesystem/path/native/alloc.cc: New test.
* testsuite/experimental/filesystem/path/construct/90281.cc: New test.
* testsuite/experimental/filesystem/path/factory/u8path.cc: New test.
* testsuite/experimental/filesystem/path/native/alloc.cc: New test.
* testsuite/experimental/filesystem/path/native/string.cc: Test with
empty strings and with Unicode characters outside the basic
multilingual plane.

Tested powerpc64le-linux and x86_64-w64-mingw32.

As this ended up being a large patch I'll wait until after 9.1.0 is
released to commit this (and then will backport a simpler version).


I forgot to commit the patch after the release. It's now on trunk, as
r272385, and I'll backport it soon.





Re: Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
> 
> > > 
> > > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> > > 
> > > But yes, if you don't want that then || is cheaper.  Probably add
> > > another testcase with one of the accesses with a constant index?
> > 
> > Hmm, OK, without unions I do not see how I could disambiguate something
> > with || and not &&.
> > Do we care about the testcase with union and constant accesses I sent?
> 
> Wouldn't it be as simple as
> 
> struct S { int i; int j; };
> 
> struct S a[10];
> 
> and a[i].i vs. a[1].j?  The first has offset = 0, max_size = sizeof(a)-4
> while the second has offset = 12, max_size = 4 and thus they overlap.

Ha, you are right of course (and in fact similar testcases was why
I started to look into this :)

Honza


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Jan Hubicka
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
> 
> > Hi,
> > this is patch for aliasing_component_refs_p and VCE.
> > I also turned the refp to ref, but there are no changes relative
> > to that.
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> OK.
> 
> Bonus points for wrong-code testcases involving these...  like
> the following which doesn't fail for some reason:

We punt on same_type_for_tbaa on arrays so the aliasing_component_refs_p
thinks that int and int[2] possibly match and gives up :)

Honza


> 
> struct S { int i; int j[2]; };
> 
> __GIMPLE(startwith("fre1")) int foo(struct S *x, struct S *y)
> {
>   int _1;
>   x->j[0] = 1;
>   __VIEW_CONVERT  (__MEM  ((int(*)[2])y + 4)).i = 0;
>   _1 = x->j[0];
>   return _1;
> }
> 
> int main()
> {
>   struct S s;
>   if (foo (&s, &s) != 0)
> __builtin_abort ();
>   return 0;
> }
> 
> Thanks,
> Richard.
> 
> > Honza
> > 
> > * tree-ssa-alias.c (aliasing_component_refs_p): Consider only
> > the access path from base to first VIEW_CONVERT_EXPR or
> > BIT_FIELD_REF.
> > Index: tree-ssa-alias.c
> > ===
> > --- tree-ssa-alias.c(revision 272381)
> > +++ tree-ssa-alias.c(working copy)
> > @@ -874,7 +874,6 @@ aliasing_component_refs_p (tree ref1,
> >   disambiguating q->i and p->a.j.  */
> >tree base1, base2;
> >tree type1, type2;
> > -  tree *refp;
> >int same_p1 = 0, same_p2 = 0;
> >bool maybe_match = false;
> >tree end_struct_ref1 = NULL, end_struct_ref2 = NULL;
> > @@ -903,6 +902,9 @@ aliasing_component_refs_p (tree ref1,
> >   gcc_checking_assert (!end_struct_ref1);
> >end_struct_ref1 = base1;
> > }
> > +  if (TREE_CODE (base1) == VIEW_CONVERT_EXPR
> > + || TREE_CODE (base1) == BIT_FIELD_REF)
> > +   ref1 = TREE_OPERAND (base1, 0);
> >base1 = TREE_OPERAND (base1, 0);
> >  }
> >type1 = TREE_TYPE (base1);
> > @@ -918,6 +920,9 @@ aliasing_component_refs_p (tree ref1,
> >   gcc_checking_assert (!end_struct_ref2);
> >   end_struct_ref2 = base2;
> > }
> > +  if (TREE_CODE (base2) == VIEW_CONVERT_EXPR
> > + || TREE_CODE (base2) == BIT_FIELD_REF)
> > +   ref2 = TREE_OPERAND (base2, 0);
> >base2 = TREE_OPERAND (base2, 0);
> >  }
> >type2 = TREE_TYPE (base2);
> > @@ -934,23 +939,23 @@ aliasing_component_refs_p (tree ref1,
> >|| (end_struct_ref2
> >   && compare_type_sizes (TREE_TYPE (end_struct_ref2), type1) >= 0))
> >  {
> > -  refp = &ref2;
> > +  tree ref = ref2;
> >while (true)
> > {
> >   /* We walk from inner type to the outer types. If type we see is
> >  already too large to be part of type1, terminate the search.  */
> > - int cmp = compare_type_sizes (type1, TREE_TYPE (*refp));
> > + int cmp = compare_type_sizes (type1, TREE_TYPE (ref));
> >  
> >   if (cmp < 0
> >   && (!end_struct_ref1
> >   || compare_type_sizes (TREE_TYPE (end_struct_ref1),
> > -TREE_TYPE (*refp)) < 0))
> > +TREE_TYPE (ref)) < 0))
> > break;
> >   /* If types may be of same size, see if we can decide about their
> >  equality.  */
> >   if (cmp == 0)
> > {
> > - same_p2 = same_type_for_tbaa (TREE_TYPE (*refp), type1);
> > + same_p2 = same_type_for_tbaa (TREE_TYPE (ref), type1);
> >   if (same_p2 == 1)
> > break;
> >   /* In case we can't decide whether types are same try to
> > @@ -960,9 +965,9 @@ aliasing_component_refs_p (tree ref1,
> >   if (same_p2 == -1)
> > maybe_match = true;
> > }
> > - if (!handled_component_p (*refp))
> > + if (!handled_component_p (ref))
> > break;
> > - refp = &TREE_OPERAND (*refp, 0);
> > + ref = TREE_OPERAND (ref, 0);
> > }
> >if (same_p2 == 1)
> > {
> > @@ -977,13 +982,13 @@ aliasing_component_refs_p (tree ref1,
> >   if (TREE_CODE (TREE_TYPE (base1)) == ARRAY_TYPE
> >   && (!TYPE_SIZE (TREE_TYPE (base1))
> >   || TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) != INTEGER_CST
> > - || (*refp == base2 && !ref2_is_decl)))
> > + || (ref == base2 && !ref2_is_decl)))
> > {
> >   ++alias_stats.aliasing_component_refs_p_may_alias;
> >   return true;
> > }
> >  
> > - get_ref_base_and_extent (*refp, &offadj, &sztmp, &msztmp, &reverse);
> > + get_ref_base_and_extent (ref, &offadj, &sztmp, &msztmp, &reverse);
> >   offset2 -= offadj;
> >   get_ref_base_and_extent (base1, &offadj, &sztmp, &msztmp, &reverse);
> >   offset1 -= offadj;
> > @@ -1005,28 +1010,28 @@ aliasing_component_refs_p (tree ref1,
> >|| (end_struct_ref1
> >   && compare_type_sizes (TREE_TYPE (end_struct_ref1), type1) <= 0))
> >  {
> > -  refp = &ref1;
> > +  tree ref = ref1;
> >while (true)
> > 

Re: [Vectorizer] Support masking fold left reductions

2019-06-17 Thread Richard Sandiford
Sorry for the slow review.

Alejandro Martinez Vicente  writes:
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 4942c69..7840ed8 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -5881,6 +5881,31 @@ vect_expand_fold_left (gimple_stmt_iterator *gsi, tree 
> scalar_dest,
>return lhs;
>  }
>  
> +/* Get a masked internal function equivalent to REDUC_FN.  VECTYPE_IN is the
> +   type of the vector input.  */
> +
> +static internal_fn
> +get_masked_reduction_fn (internal_fn reduc_fn, tree vectype_in)
> +{
> +  internal_fn mask_reduc_fn;
> +
> +  switch (reduc_fn)
> +{
> +case IFN_FOLD_LEFT_PLUS:
> +  mask_reduc_fn = IFN_MASK_FOLD_LEFT_PLUS;
> +  break;
> +
> +default:
> +  mask_reduc_fn = IFN_LAST;

Very minor, but it'd be simpler to return IFN_LAST directly here and...

> +}
> +
> +  if (mask_reduc_fn != IFN_LAST

...remove this check.

> +  && direct_internal_fn_supported_p (mask_reduc_fn, vectype_in,
> +  OPTIMIZE_FOR_SPEED))
> +return mask_reduc_fn;
> +  return IFN_LAST;
> +}
> +
>  /* Perform an in-order reduction (FOLD_LEFT_REDUCTION).  STMT_INFO is the
> statement that sets the live-out value.  REDUC_DEF_STMT is the phi
> statement.  CODE is the operation performed by STMT_INFO and OPS are

OK with that change, thanks.

Richard


Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Kewen.Lin wrote:

> on 2019/6/17 下午8:08, Richard Biener wrote:
> > On June 17, 2019 11:59:30 AM GMT+02:00, Segher Boessenkool 
> >  wrote:
> >> On Mon, Jun 17, 2019 at 10:51:49AM +0200, Richard Biener wrote:
> >>> On Mon, 17 Jun 2019, Kewen.Lin wrote:
> >>>
>  Hi Segher and Bill,
> 
>  Thanks a lot for your review comments! I've updated the patch
> >> accordingly.
> 
>  The updated one attached.
> >>>
> >>> OK.  I suppose all low-overhead loop instructions use a decrement to
> >> zero
> >>> style iterator?
> >>
> >> The documentation says decrement to 0, decrement to -1, and decrement
> >> to
> >> any negative are all supported.  But all are decrement, yes.
> > 
> > Hmm, so I wonder if we should tell IVOPTS the kind because the IV generated 
> > has to match RTL doloops expectations? Thus return an enum value from the 
> > hook? 
> > 
> 
> I guess we don't need to make it in IVOPTs, since doloop_optimize can 
> transform the loop closing to
> its expected pattern once it gets the iteration count, with target hook 
> doloop_end.
> It's to modify the original closing into:
>   niter   -> 0
>   niter-1 -> -1
>   niter-n -> -n

True, but this IV may affect other IVs choice (well, at least if it is
"cheap" to use the doloop IV in derived IVs which it is not for power).

Richard.

Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> > 
> > Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> > I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> > 
> > But yes, if you don't want that then || is cheaper.  Probably add
> > another testcase with one of the accesses with a constant index?
> 
> Hmm, OK, without unions I do not see how I could disambiguate something
> with || and not &&.
> Do we care about the testcase with union and constant accesses I sent?

Wouldn't it be as simple as

struct S { int i; int j; };

struct S a[10];

and a[i].i vs. a[1].j?  The first has offset = 0, max_size = sizeof(a)-4
while the second has offset = 12, max_size = 4 and thus they overlap.

Richard.


[PR c++/90754] name lookup ICE

2019-06-17 Thread Nathan Sidwell
It turned out qualify_lookup was rejecting non-decls, and hence the 
ordering of that call relative to other tests was significant.  I've 
restored the relative ordering to that prior to my reimplementation of 
lookup_type_scope_1.


applying to trunk.

nathan
--
Nathan Sidwell
2019-06-17  Nathan Sidwell  

	PR c++/90754
	* name-lookup.c (lookup_type_scope_1): Calll qualify_lookup before
	checking context.

	PR c++/90754
	* g++.dg/lookup/pr90754.C: New.

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c	(revision 272381)
+++ gcc/cp/name-lookup.c	(working copy)
@@ -6488,13 +6488,13 @@ lookup_type_scope_1 (tree name, tag_scop
 	   correctly.  */
 	if (tree type = iter->type)
-	  if ((scope != ts_current
-	   || LOCAL_BINDING_P (iter)
-	   || DECL_CONTEXT (type) == iter->scope->this_entity)
-	  && qualify_lookup (iter->type, LOOKUP_PREFER_TYPES))
-	return iter->type;
+	  if (qualify_lookup (type, LOOKUP_PREFER_TYPES)
+	  && (scope != ts_current
+		  || LOCAL_BINDING_P (iter)
+		  || DECL_CONTEXT (type) == iter->scope->this_entity))
+	return type;
 
-	if ((scope != ts_current
-	 || !INHERITED_VALUE_BINDING_P (iter))
-	&& qualify_lookup (iter->value, LOOKUP_PREFER_TYPES))
+	if (qualify_lookup (iter->value, LOOKUP_PREFER_TYPES)
+	&& (scope != ts_current
+		|| !INHERITED_VALUE_BINDING_P (iter)))
 	  return iter->value;
   }
Index: gcc/testsuite/g++.dg/lookup/pr90754.C
===
--- gcc/testsuite/g++.dg/lookup/pr90754.C	(revision 0)
+++ gcc/testsuite/g++.dg/lookup/pr90754.C	(working copy)
@@ -0,0 +1,11 @@
+// PR c++/90754 ICE in type lookup.
+
+class A {
+  struct COMTypeInfo;
+};
+class B {
+  struct COMTypeInfo;
+};
+class C : A, B {
+  struct COMTypeInfo;
+};


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> Hi,
> this is patch for aliasing_component_refs_p and VCE.
> I also turned the refp to ref, but there are no changes relative
> to that.
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Bonus points for wrong-code testcases involving these...  like
the following which doesn't fail for some reason:

struct S { int i; int j[2]; };

__GIMPLE(startwith("fre1")) int foo(struct S *x, struct S *y)
{
  int _1;
  x->j[0] = 1;
  __VIEW_CONVERT  (__MEM  ((int(*)[2])y + 4)).i = 0;
  _1 = x->j[0];
  return _1;
}

int main()
{
  struct S s;
  if (foo (&s, &s) != 0)
__builtin_abort ();
  return 0;
}

Thanks,
Richard.

> Honza
> 
>   * tree-ssa-alias.c (aliasing_component_refs_p): Consider only
>   the access path from base to first VIEW_CONVERT_EXPR or
>   BIT_FIELD_REF.
> Index: tree-ssa-alias.c
> ===
> --- tree-ssa-alias.c  (revision 272381)
> +++ tree-ssa-alias.c  (working copy)
> @@ -874,7 +874,6 @@ aliasing_component_refs_p (tree ref1,
>   disambiguating q->i and p->a.j.  */
>tree base1, base2;
>tree type1, type2;
> -  tree *refp;
>int same_p1 = 0, same_p2 = 0;
>bool maybe_match = false;
>tree end_struct_ref1 = NULL, end_struct_ref2 = NULL;
> @@ -903,6 +902,9 @@ aliasing_component_refs_p (tree ref1,
> gcc_checking_assert (!end_struct_ref1);
>end_struct_ref1 = base1;
>   }
> +  if (TREE_CODE (base1) == VIEW_CONVERT_EXPR
> +   || TREE_CODE (base1) == BIT_FIELD_REF)
> + ref1 = TREE_OPERAND (base1, 0);
>base1 = TREE_OPERAND (base1, 0);
>  }
>type1 = TREE_TYPE (base1);
> @@ -918,6 +920,9 @@ aliasing_component_refs_p (tree ref1,
> gcc_checking_assert (!end_struct_ref2);
> end_struct_ref2 = base2;
>   }
> +  if (TREE_CODE (base2) == VIEW_CONVERT_EXPR
> +   || TREE_CODE (base2) == BIT_FIELD_REF)
> + ref2 = TREE_OPERAND (base2, 0);
>base2 = TREE_OPERAND (base2, 0);
>  }
>type2 = TREE_TYPE (base2);
> @@ -934,23 +939,23 @@ aliasing_component_refs_p (tree ref1,
>|| (end_struct_ref2
> && compare_type_sizes (TREE_TYPE (end_struct_ref2), type1) >= 0))
>  {
> -  refp = &ref2;
> +  tree ref = ref2;
>while (true)
>   {
> /* We walk from inner type to the outer types. If type we see is
>already too large to be part of type1, terminate the search.  */
> -   int cmp = compare_type_sizes (type1, TREE_TYPE (*refp));
> +   int cmp = compare_type_sizes (type1, TREE_TYPE (ref));
>  
> if (cmp < 0
> && (!end_struct_ref1
> || compare_type_sizes (TREE_TYPE (end_struct_ref1),
> -  TREE_TYPE (*refp)) < 0))
> +  TREE_TYPE (ref)) < 0))
>   break;
> /* If types may be of same size, see if we can decide about their
>equality.  */
> if (cmp == 0)
>   {
> -   same_p2 = same_type_for_tbaa (TREE_TYPE (*refp), type1);
> +   same_p2 = same_type_for_tbaa (TREE_TYPE (ref), type1);
> if (same_p2 == 1)
>   break;
> /* In case we can't decide whether types are same try to
> @@ -960,9 +965,9 @@ aliasing_component_refs_p (tree ref1,
> if (same_p2 == -1)
>   maybe_match = true;
>   }
> -   if (!handled_component_p (*refp))
> +   if (!handled_component_p (ref))
>   break;
> -   refp = &TREE_OPERAND (*refp, 0);
> +   ref = TREE_OPERAND (ref, 0);
>   }
>if (same_p2 == 1)
>   {
> @@ -977,13 +982,13 @@ aliasing_component_refs_p (tree ref1,
> if (TREE_CODE (TREE_TYPE (base1)) == ARRAY_TYPE
> && (!TYPE_SIZE (TREE_TYPE (base1))
> || TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) != INTEGER_CST
> -   || (*refp == base2 && !ref2_is_decl)))
> +   || (ref == base2 && !ref2_is_decl)))
>   {
> ++alias_stats.aliasing_component_refs_p_may_alias;
> return true;
>   }
>  
> -   get_ref_base_and_extent (*refp, &offadj, &sztmp, &msztmp, &reverse);
> +   get_ref_base_and_extent (ref, &offadj, &sztmp, &msztmp, &reverse);
> offset2 -= offadj;
> get_ref_base_and_extent (base1, &offadj, &sztmp, &msztmp, &reverse);
> offset1 -= offadj;
> @@ -1005,28 +1010,28 @@ aliasing_component_refs_p (tree ref1,
>|| (end_struct_ref1
> && compare_type_sizes (TREE_TYPE (end_struct_ref1), type1) <= 0))
>  {
> -  refp = &ref1;
> +  tree ref = ref1;
>while (true)
>   {
> -   int cmp = compare_type_sizes (type2, TREE_TYPE (*refp));
> +   int cmp = compare_type_sizes (type2, TREE_TYPE (ref));
> if (cmp < 0
> && (!end_struct_ref2
> || compare_type_sizes (TREE_TYPE (end_struct_ref2),
> -  T

Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Kewen.Lin
on 2019/6/17 下午8:08, Richard Biener wrote:
> On June 17, 2019 11:59:30 AM GMT+02:00, Segher Boessenkool 
>  wrote:
>> On Mon, Jun 17, 2019 at 10:51:49AM +0200, Richard Biener wrote:
>>> On Mon, 17 Jun 2019, Kewen.Lin wrote:
>>>
 Hi Segher and Bill,

 Thanks a lot for your review comments! I've updated the patch
>> accordingly.

 The updated one attached.
>>>
>>> OK.  I suppose all low-overhead loop instructions use a decrement to
>> zero
>>> style iterator?
>>
>> The documentation says decrement to 0, decrement to -1, and decrement
>> to
>> any negative are all supported.  But all are decrement, yes.
> 
> Hmm, so I wonder if we should tell IVOPTS the kind because the IV generated 
> has to match RTL doloops expectations? Thus return an enum value from the 
> hook? 
> 

I guess we don't need to make it in IVOPTs, since doloop_optimize can transform 
the loop closing to
its expected pattern once it gets the iteration count, with target hook 
doloop_end.
It's to modify the original closing into:
  niter   -> 0
  niter-1 -> -1
  niter-n -> -n


Thanks,
Kewen

> Richard. 
> 
>>
>> Segher
> 



Re: *ping* Re: [PATCH] PR fortran/89103 - Allow blank format items in format strings

2019-06-17 Thread Mark Eggleston


On 12/06/2019 19:11, Steve Kargl wrote:

On Tue, Jun 11, 2019 at 11:50:40AM +0200, Jakub Jelinek wrote:

On Tue, Jun 11, 2019 at 10:30:59AM +0100, Mark Eggleston wrote:

     Jim MacArthur 
     Mark Eggleston 

Two spaces before < instead of one.

This is not a patch review, just comments:

Mark, do you plan to address any of Jakub's comments.
Do note, I just 'OK' Jakub's patch that uses G_()
forms for the strings.


Now that Jakubs's patch has been committed, please find attached an 
updated patch and updated change logs:


gcc/fortran

    Jim MacArthur  
    Mark Eggleston  

    PR fortran/89103
    * gfortran.texi: Add -fdec-blank-format-item
    * invoke.texi: Add option to list of options.
    * invoke.texi: Add to section on Commas in FORMAT specifications.
    * io.c (check_format): At FMT_RPAREN goto finished if
    -fdec-blank-format-item otherwise set error string.
    * lang.opt: Add new option.
    * options.c (set_dec_flags): Add SET_BITFLAG for
    flag_dec_format_defaults.

gcc/testsuite

    Jim MacArthur  
    Mark Eggleston  

    PR fortran/89103
    * gfortran.dg/dec_format_empty_item_1.f: New test.
    * gfortran.dg/dec_format_empty_item_2.f: New test.
    * gfortran.dg/dec_format_empty_item_3.f: New test.

as before... Please can someone commit this as do not have commit rights.



Also, do you have plans to contribute additional
patches (either for -fdec* extensions or preferrably
to help with bug fixes and new features)?  It may be
advantageous for you to get a commit bit.
Yes, I do intend to contribute additional patches, mostly -fdec- 
patches, there are also some patches unrelated to -fdec* extensions.



--
https://www.codethink.co.uk/privacy.html

>From 48c734966d0f5d9f618b532d12b24fe784679dea Mon Sep 17 00:00:00 2001
From: Jim MacArthur 
Date: Thu, 4 Feb 2016 16:59:41 +
Subject: [PATCH 01/10] Allow blank format items in format strings

Use -fdec-blank-format-item to enable. Also enabled by -fdec.
---
 gcc/fortran/gfortran.texi   |  7 ++-
 gcc/fortran/invoke.texi | 13 +
 gcc/fortran/io.c|  9 +
 gcc/fortran/lang.opt|  4 
 gcc/fortran/options.c   |  1 +
 gcc/testsuite/gfortran.dg/dec_format_empty_item_1.f | 19 +++
 gcc/testsuite/gfortran.dg/dec_format_empty_item_2.f | 19 +++
 gcc/testsuite/gfortran.dg/dec_format_empty_item_3.f | 19 +++
 8 files changed, 86 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/dec_format_empty_item_1.f
 create mode 100644 gcc/testsuite/gfortran.dg/dec_format_empty_item_2.f
 create mode 100644 gcc/testsuite/gfortran.dg/dec_format_empty_item_3.f

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 57461e0e42f..c887e7d1a42 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1761,11 +1761,16 @@ When omitted, the count is implicitly assumed to be one.
 
 To support legacy codes, GNU Fortran allows the comma separator
 to be omitted immediately before and after character string edit
-descriptors in @code{FORMAT} statements.
+descriptors in @code{FORMAT} statements.  A comma with no following format
+decriptor is permited if the @option{-fdec-blank-format-item} is given on
+the command line. This is considered non-conforming code and is
+discouraged.
 
 @smallexample
PRINT 10, 2, 3
 10 FORMAT ('FOO='I1' BAR='I2)
+   print 20, 5, 6
+20 FORMAT (I3, I3,)
 @end smallexample
 
 
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 2e2cb5b2728..2b08ac4de22 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -119,10 +119,10 @@ by type.  Explanations are in the following sections.
 @gccoptlist{-fall-intrinsics -fbackslash -fcray-pointer -fd-lines-as-code @gol
 -fd-lines-as-comments -fdec -fdec-structure -fdec-intrinsic-ints @gol
 -fdec-static -fdec-math -fdec-include -fdec-format-defaults @gol
--fdefault-double-8 -fdefault-integer-8 -fdefault-real-8 -fdefault-real-10 @gol
--fdefault-real-16 -fdollar-ok -ffixed-line-length-@var{n} @gol
--ffixed-line-length-none -fpad-source -ffree-form @gol
--ffree-line-length-@var{n} -ffree-line-length-none @gol
+-fdec-blank-format-item -fdefault-double-8 -fdefault-integer-8 @gol
+-fdefault-real-8 -fdefault-real-10 -fdefault-real-16 -fdollar-ok @gol
+-ffixed-line-length-@var{n} -ffixed-line-length-none -fpad-source @gol
+-ffree-form -ffree-line-length-@var{n} -ffree-line-length-none @gol
 -fimplicit-none -finteger-4-integer-8 -fmax-identifier-length @gol
 -fmodule-private -ffixed-form -fno-range-check -fopenacc -fopenmp @gol
 -freal-4-real-10 -freal-4-real-16 -freal-4-real-8 -freal-8-real-10 @gol
@@ -289,6 +289,11 @@ be on a single line and can use line continuations.
 Enable format specifiers F, G and I to be used without width specifiers,
 default widths will be used instead.
 
+@

Re: Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
> 
> Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
> I think && is correct if you want to disambiguate a[1].v2 and a[i].v1
> 
> But yes, if you don't want that then || is cheaper.  Probably add
> another testcase with one of the accesses with a constant index?

Hmm, OK, without unions I do not see how I could disambiguate something
with || and not &&.
Do we care about the testcase with union and constant accesses I sent?

Honza
> 
> Richard.


Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> > 
> > But part of the expensiveness we want to avoid is this
> > (repeated) walking of the ref tree...
> 
> I was considering to pass contains_union_p down from one of
> earlier walks, but did not find suitable one for that...
> > 
> > So...
> > 
> > > +  return !handled_component_p (ref2);
> > > +}
> > > +
> > >  /* Return true if an indirect reference based on *PTR1 constrained
> > > to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
> > > constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
> > > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
> > >&& (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
> > > || (TYPE_SIZE (TREE_TYPE (base1))
> > > && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
> > > -return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
> > > max_size2);
> > > +{
> > > +  if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
> > > max_size2))
> > > + return false;
> > > +  if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> > > + return true;
> > 
> > how about a simpler test like
> > 
> >  if (known_size_p (max_size1) && known_size_p (max_size2))
> >return true;
> >  /* If there's an unconstrained variable access in the ref fall
> > through to access-path based disambiguation.  */
> 
> If I have something like
>  struct a {int a[10];int b;}
> and then
>  aptr->a[i]
> in the access path, won't be max_size known (40) where type size is 4?

Yes.

> In this case I want to contiue to access path.
> > 
> > ?
> > 
> > I'd certainly like to see testcases btw...
> 
> There is a testcase for variable array access included in the patch,
> would you like to have one with union in it?

Oh, I missed the testcase ;)

> > 
> > A more stricter test would be
> > 
> > if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
> >   return true;
> > /* If there's a variable access in one of the refs fall through
> >to access-path based disambiguation.  */
> > 
> > where you'd need to pass down ao_ref_size in addition to max_size as well.
> 
> Proably || here?

Hmm, !maybe_eq () -> ! max_size1 == size1 -> max_size != size1 thus
I think && is correct if you want to disambiguate a[1].v2 and a[i].v1

But yes, if you don't want that then || is cheaper.  Probably add
another testcase with one of the accesses with a constant index?

Richard.


Re: [PATCH] Replace std::to_string for integers with optimized version

2019-06-17 Thread Jonathan Wakely

On 13/06/19 22:41 +0200, Christophe Lyon wrote:

Hi,


On Wed, 12 Jun 2019 at 16:54, Jonathan Wakely  wrote:


The std::to_chars functions from C++17 can be used to implement
std::to_string with much better performance than calling snprintf. Only
the __detail::__to_chars_len and __detail::__to_chars_10 functions are
needed for to_string, because it always outputs base 10 representations.

The return type of __detail::__to_chars_10 should not be declared before
C++17, so the function body is extracted into a new function that can be
reused by to_string and __detail::__to_chars_10.

The existing tests for to_chars rely on to_string to check for correct
answers. Now that they use the same code that doesn't actually ensure
correctness, so add new tests for std::to_string that compare against
printf output.

* include/Makefile.am: Add new  header.
* include/Makefile.in: Regenerate.
* include/bits/basic_string.h (to_string(int), to_string(unsigned))
(to_string(long), to_string(unsigned long), to_string(long long))
(to_string(unsigned long long)): Rewrite to use __to_chars_10_impl.
* include/bits/charconv.h: New header.
(__detail::__to_chars_len): Move here from .
(__detail::__to_chars_10_impl): New function extracted from
__detail::__to_chars_10.
* include/std/charconv (__cpp_lib_to_chars): Add, but comment out.
(__to_chars_unsigned_type): New class template that reuses
__make_unsigned_selector_base::__select to pick a type.
(__unsigned_least_t): Redefine as __to_chars_unsigned_type::type.
(__detail::__to_chars_len): Move to new header.
(__detail::__to_chars_10): Add inline specifier. Move code doing the
output to __detail::__to_chars_10_impl and call that.
* include/std/version (__cpp_lib_to_chars): Add, but comment out.
* testsuite/21_strings/basic_string/numeric_conversions/char/
to_string.cc: Fix reference in comment. Remove unused variable.
* testsuite/21_strings/basic_string/numeric_conversions/char/
to_string_int.cc: New test.

Tested x86_64-linux, committed to trunk.



The new test to_string_int.cc fails on arm-none-eabi:
PASS: 21_strings/basic_string/numeric_conversions/char/to_string_int.cc
(test for excess errors)
spawn 
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-eabi/gcc3/utils/bin/qemu-wrapper.sh
./to_string_int.exe
/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc:105:
void check_value(T) [with T = long long int]: Assertion 's ==
expected' failed.
FAIL: 21_strings/basic_string/numeric_conversions/char/to_string_int.cc
execution test


Does the target support "%lld" in printf?

Does the .sum file show UNSUPPORTED for the existing
21_strings/basic_string/numeric_conversions/char/to_string.cc test?

Does the attached patch make the test show what fails?

I suspect the problem is that the test relies on snprintf to check the
answers are correct, even though the actual library code doesn't need
snprintf any longer. Previously the std::to_string functions were all
guarded by _GLIBCXX_USE_C99_STDIO and so I'm guessing were not
supported for arm-none-eabi. Now the overloads for integral types
should work without any C library support, and so the new test is
enabled unconditionally. But the test itself still uses the C library.

Maybe I should have a simple test that doesn't use snprintf and just
checks some hardcoded values produce the expected results, and a more
involved test that compares the results to snprintf but uses 
{ dg-require-string-conversions "" } like the existing test.




diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc b/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc
index 8eb2a48bd30..911e6812b39 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/to_string_int.cc
@@ -95,13 +95,15 @@ const std::uint_least32_t values[] = {
 
 const std::size_t empty_string_capacity = std::string().capacity();
 
-#include 
+#include 
 
 template
   void check_value(T val)
   {
 const std::string s = std::to_string(val);
 const std::string expected = test::to_string(val);
+if (s != expected)
+  std::cerr << "GOT: " << s << " BUT EXPECTED: " << expected << '\n';
 VERIFY( s == expected );
 VERIFY( s[s.size()] == '\0' ); // null-terminator not overwritten!
 if (s.size() > empty_string_capacity)


Re: Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
Hi,
this is testcase for unions.  It is somewhat artificial because
nonoverlapping_component_refs is conservative about matching union
fields and aliasing_component_refs_p resorts to offset/max_size
test which of course suceeds.
So I had to add the wrapping struct a (which is matched by
the earlier) and disambiguated.

ICC optimizes it as expected.

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-fre1" } */
struct a {int v1;
  int v2;};
struct b {int x; struct a a;};
struct c {struct a a;};
union d {struct b b; struct c c;};

int
test (union d *dptr1, union d *dptr2)
{
  dptr1->b.a.v1=123;
  dptr2->c.a.v2=1;
  return dptr1->b.a.v1;
}
/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Jan Hubicka
Hi,
this is patch for aliasing_component_refs_p and VCE.
I also turned the refp to ref, but there are no changes relative
to that.
Bootstrapped/regtested x86_64-linux, OK?
Honza

* tree-ssa-alias.c (aliasing_component_refs_p): Consider only
the access path from base to first VIEW_CONVERT_EXPR or
BIT_FIELD_REF.
Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 272381)
+++ tree-ssa-alias.c(working copy)
@@ -874,7 +874,6 @@ aliasing_component_refs_p (tree ref1,
  disambiguating q->i and p->a.j.  */
   tree base1, base2;
   tree type1, type2;
-  tree *refp;
   int same_p1 = 0, same_p2 = 0;
   bool maybe_match = false;
   tree end_struct_ref1 = NULL, end_struct_ref2 = NULL;
@@ -903,6 +902,9 @@ aliasing_component_refs_p (tree ref1,
  gcc_checking_assert (!end_struct_ref1);
   end_struct_ref1 = base1;
}
+  if (TREE_CODE (base1) == VIEW_CONVERT_EXPR
+ || TREE_CODE (base1) == BIT_FIELD_REF)
+   ref1 = TREE_OPERAND (base1, 0);
   base1 = TREE_OPERAND (base1, 0);
 }
   type1 = TREE_TYPE (base1);
@@ -918,6 +920,9 @@ aliasing_component_refs_p (tree ref1,
  gcc_checking_assert (!end_struct_ref2);
  end_struct_ref2 = base2;
}
+  if (TREE_CODE (base2) == VIEW_CONVERT_EXPR
+ || TREE_CODE (base2) == BIT_FIELD_REF)
+   ref2 = TREE_OPERAND (base2, 0);
   base2 = TREE_OPERAND (base2, 0);
 }
   type2 = TREE_TYPE (base2);
@@ -934,23 +939,23 @@ aliasing_component_refs_p (tree ref1,
   || (end_struct_ref2
  && compare_type_sizes (TREE_TYPE (end_struct_ref2), type1) >= 0))
 {
-  refp = &ref2;
+  tree ref = ref2;
   while (true)
{
  /* We walk from inner type to the outer types. If type we see is
 already too large to be part of type1, terminate the search.  */
- int cmp = compare_type_sizes (type1, TREE_TYPE (*refp));
+ int cmp = compare_type_sizes (type1, TREE_TYPE (ref));
 
  if (cmp < 0
  && (!end_struct_ref1
  || compare_type_sizes (TREE_TYPE (end_struct_ref1),
-TREE_TYPE (*refp)) < 0))
+TREE_TYPE (ref)) < 0))
break;
  /* If types may be of same size, see if we can decide about their
 equality.  */
  if (cmp == 0)
{
- same_p2 = same_type_for_tbaa (TREE_TYPE (*refp), type1);
+ same_p2 = same_type_for_tbaa (TREE_TYPE (ref), type1);
  if (same_p2 == 1)
break;
  /* In case we can't decide whether types are same try to
@@ -960,9 +965,9 @@ aliasing_component_refs_p (tree ref1,
  if (same_p2 == -1)
maybe_match = true;
}
- if (!handled_component_p (*refp))
+ if (!handled_component_p (ref))
break;
- refp = &TREE_OPERAND (*refp, 0);
+ ref = TREE_OPERAND (ref, 0);
}
   if (same_p2 == 1)
{
@@ -977,13 +982,13 @@ aliasing_component_refs_p (tree ref1,
  if (TREE_CODE (TREE_TYPE (base1)) == ARRAY_TYPE
  && (!TYPE_SIZE (TREE_TYPE (base1))
  || TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) != INTEGER_CST
- || (*refp == base2 && !ref2_is_decl)))
+ || (ref == base2 && !ref2_is_decl)))
{
  ++alias_stats.aliasing_component_refs_p_may_alias;
  return true;
}
 
- get_ref_base_and_extent (*refp, &offadj, &sztmp, &msztmp, &reverse);
+ get_ref_base_and_extent (ref, &offadj, &sztmp, &msztmp, &reverse);
  offset2 -= offadj;
  get_ref_base_and_extent (base1, &offadj, &sztmp, &msztmp, &reverse);
  offset1 -= offadj;
@@ -1005,28 +1010,28 @@ aliasing_component_refs_p (tree ref1,
   || (end_struct_ref1
  && compare_type_sizes (TREE_TYPE (end_struct_ref1), type1) <= 0))
 {
-  refp = &ref1;
+  tree ref = ref1;
   while (true)
{
- int cmp = compare_type_sizes (type2, TREE_TYPE (*refp));
+ int cmp = compare_type_sizes (type2, TREE_TYPE (ref));
  if (cmp < 0
  && (!end_struct_ref2
  || compare_type_sizes (TREE_TYPE (end_struct_ref2),
-TREE_TYPE (*refp)) < 0))
+TREE_TYPE (ref)) < 0))
break;
  /* If types may be of same size, see if we can decide about their
 equality.  */
  if (cmp == 0)
{
- same_p1 = same_type_for_tbaa (TREE_TYPE (*refp), type2);
+ same_p1 = same_type_for_tbaa (TREE_TYPE (ref), type2);
  if (same_p1 == 1)
break;
  if (same_p1 == -1)
maybe_match = true;
}
- if (!handled_component_p (*ref

Re: Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
> 
> But part of the expensiveness we want to avoid is this
> (repeated) walking of the ref tree...

I was considering to pass contains_union_p down from one of
earlier walks, but did not find suitable one for that...
> 
> So...
> 
> > +  return !handled_component_p (ref2);
> > +}
> > +
> >  /* Return true if an indirect reference based on *PTR1 constrained
> > to [OFFSET1, OFFSET1 + MAX_SIZE1) may alias a variable based on BASE2
> > constrained to [OFFSET2, OFFSET2 + MAX_SIZE2).  *PTR1 and BASE2 have
> > @@ -1533,7 +1563,12 @@ indirect_ref_may_alias_decl_p (tree ref1
> >&& (TREE_CODE (TREE_TYPE (base1)) != ARRAY_TYPE
> >   || (TYPE_SIZE (TREE_TYPE (base1))
> >   && TREE_CODE (TYPE_SIZE (TREE_TYPE (base1))) == INTEGER_CST)))
> > -return ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
> > max_size2);
> > +{
> > +  if (!ranges_maybe_overlap_p (doffset1, max_size1, doffset2, 
> > max_size2))
> > +   return false;
> > +  if (same_access_paths_p (ref1, max_size1, ref2, max_size2))
> > +   return true;
> 
> how about a simpler test like
> 
>  if (known_size_p (max_size1) && known_size_p (max_size2))
>return true;
>  /* If there's an unconstrained variable access in the ref fall
>   through to access-path based disambiguation.  */

If I have something like
 struct a {int a[10];int b;}
and then
 aptr->a[i]
in the access path, won't be max_size known (40) where type size is 4?
In this case I want to contiue to access path.
> 
> ?
> 
> I'd certainly like to see testcases btw...

There is a testcase for variable array access included in the patch,
would you like to have one with union in it?
> 
> A more stricter test would be
> 
>   if (!maybe_eq (max_size1, size1) && !maybe_eq (max_size2, size2))
>   return true;
> /* If there's a variable access in one of the refs fall through
>to access-path based disambiguation.  */
> 
> where you'd need to pass down ao_ref_size in addition to max_size as well.

Proably || here?

Honza


Re: [nvptx] Fix missing mode warnings in nvptx.md, omp part

2019-06-17 Thread Alexander Monakov
On Mon, 17 Jun 2019, Tom de Vries wrote:

> Hi Alexander,
> 
> any comments?

A couple suggestions (see below), but no serious concerns from my side.

> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -112,6 +112,17 @@ enum nvptx_data_area
>DATA_AREA_MAX
>  };
>  
> +rtx
> +gen_set_softstack_insn (rtx op)
> +{

Here it might be appropriate to have something like

  gcc_assert (GET_MODE (op) == Pmode);

because failure to supply a Pmode operand indicates a bug in the caller and
it may be desirable to catch it here; doesn't make sense to allow restoring
stack pointer from a 32-bit register with a 64-bit address space.

Likewise for other instances of this 'if (... DImode) ... else if (... SImode) 
...'
pattern in the patch.

> +  if (GET_MODE (op) == DImode)
> +return gen_set_softstack_insn_di (op);
> +  else if (GET_MODE (op) == SImode)
> +return gen_set_softstack_insn_si (op);
> +  else
> +gcc_unreachable ();
> +}
> +
>  /*  We record the data area in the target symbol flags.  */
>  #define SYMBOL_DATA_AREA(SYM) \
>(nvptx_data_area)((SYMBOL_REF_FLAGS (SYM) >> SYMBOL_FLAG_MACH_DEP_SHIFT) \
> diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
> index 3ed5296db96..1a676b942b2 100644
> --- a/gcc/config/nvptx/nvptx.md
> +++ b/gcc/config/nvptx/nvptx.md
> @@ -1071,8 +1071,8 @@
>DONE;
>  })
>  
> -(define_insn "set_softstack_insn"
> -  [(unspec [(match_operand 0 "nvptx_register_operand" "R")]
> +(define_insn "set_softstack_insn_"
> +  [(unspec [(match_operand:P 0 "nvptx_register_operand" "R")]

A purely cosmetic issue, but I'll mention it for completeness: the '_insn'
suffix is unnecessary with the '_' suffix added. This is how e.g. the
i386 backend deals with a similar issue, they have 'stack_protect_test' expand
and 'stack_protect_test_' insns.

Thanks!
Alexander


Re: Do not give up early on access path oracle

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> Hi,
> while working on testcases for nonoverlapping_component_refs_p I run into 
> issue
> that we often bypass it because the indirect-decl and indirect-indirect 
> oracles
> give up if they manage to match bases and ranges_maybe_overlap_p return true.
> (one testcase is included in patch).
> 
> The issue is that decl-decl oracle first test base+offsets and if that
> fails it proceeds to nonoverlapping_component_refs_of_decl_p which is
> still able to do some useful disambiguations when the access paths
> contains non-constat ARRAY_REFs where max_size is not very informative.
> 
> I modified the other two oracles to avoid the early return which increased
> effectivity of nonoverlapping_component_refs_p and aliasing_component_refs_p
> somewhat but also about doubled number of calls of them.
> 
> I can't say if the early return is just omision or intended to save compile 
> time
> but it appeared to me that most of those nondisambiguated comparsions acutally
> covers the case we know that access paths are the same and in such case
> we could still return early.
> 
> This patch adds same_access_paths_p which implements simple logic matching
> max_size with type size (thus ruling out variable array accesses) and then
> verifying that there are no unions on the way (in that case we still may have
> different access paths to same memory location.
> 
> With this patch I get relatively reasonable increase in number of querries
> and disambiguations.
> 
> For tramp3d:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3021539 disambiguations, 3321136 queries
>   ref_maybe_used_by_call_p: 7118 disambiguations, 3047133 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   nonoverlapping_component_refs_p: 22 disambiguations, 61735 queries
>   nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
>   aliasing_component_refs_p: 2050 disambiguations, 19908 queries
>   TBAA oracle: 1419961 disambiguations 2916692 queries
>555158 are in alias set 0
>575103 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>252874 are dependent in the DAG
>113596 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 671982 disambiguations, 952513 queries
>   pt_solutions_intersect: 97060 disambiguations, 437912 queries
> 
> to:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3021842 disambiguations, 3321308 queries
>   ref_maybe_used_by_call_p: 7118 disambiguations, 3047440 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   nonoverlapping_component_refs_p: 25 disambiguations, 63108 queries
>   nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
>   aliasing_component_refs_p: 2236 disambiguations, 20594 queries
>   TBAA oracle: 1420030 disambiguations 2918103 queries
>555158 are in alias set 0
>575109 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>253260 are dependent in the DAG
>114546 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 671990 disambiguations, 952532 queries
>   pt_solutions_intersect: 97060 disambiguations, 438091 queries
> 
> So 3% new querries on wich we seems to have have about 30% disambiguation rate
> (190 new disambiguations)
> 
> For cc1:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 38345354 disambiguations, 46034874 queries
>   ref_maybe_used_by_call_p: 57452 disambiguations, 38905700 queries
>   call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
>   nonoverlapping_component_refs_p: 9674 disambiguations, 743745 queries
>   nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 
> queries
>   aliasing_component_refs_p: 103234 disambiguations, 291017 queries
>   TBAA oracle: 10719978 disambiguations 32885735 queries
>13521045 are in alias set 0
>5233132 queries asked about the same object
>200 queries asked about the same alias set
>0 access volatile
>1459189 are dependent in the DAG
>1952191 are aritificially in conflict with void *
> 
> PTA query stats:
>   pt_solution_includes: 465494 disambiguations, 6918046 queries
>   pt_solutions_intersect: 342384 disambiguations, 6435014 queries
> 
> to:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 38345581 disambiguations, 46035002 queries
>   ref_maybe_used_by_call_p: 57452 disambiguations, 38905936 queries
>   call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
>   nonoverlapping_component_refs_p: 9754 disambiguations, 768725 queries
>   nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 
> queries
>   aliasing_component_refs_

Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Richard Biener
On June 17, 2019 11:59:30 AM GMT+02:00, Segher Boessenkool 
 wrote:
>On Mon, Jun 17, 2019 at 10:51:49AM +0200, Richard Biener wrote:
>> On Mon, 17 Jun 2019, Kewen.Lin wrote:
>> 
>> > Hi Segher and Bill,
>> > 
>> > Thanks a lot for your review comments! I've updated the patch
>accordingly.
>> > 
>> > The updated one attached.
>> 
>> OK.  I suppose all low-overhead loop instructions use a decrement to
>zero
>> style iterator?
>
>The documentation says decrement to 0, decrement to -1, and decrement
>to
>any negative are all supported.  But all are decrement, yes.

Hmm, so I wonder if we should tell IVOPTS the kind because the IV generated has 
to match RTL doloops expectations? Thus return an enum value from the hook? 

Richard. 

>
>Segher



Re: [ARM/FDPIC v5 00/21] FDPIC ABI for ARM

2019-06-17 Thread Christophe Lyon
ping^3 ?

On Thu, 6 Jun 2019 at 14:36, Christophe Lyon  wrote:
>
> Hi,
>
> If this makes review easier, here are the areas covered by the patches:
>
> - patches 1,3,4,7,8,9,10,12,21: target-specific
> - patch 2: configure
> - patch 5,6,11,13: generic parts, undef #if defined(__FDPIC__)
> - patches 14-20: testsuite
>
> Christophe
>
> On Tue, 4 Jun 2019 at 14:57, Christophe Lyon  
> wrote:
> >
> > Ping?
> >
> >
> > On Thu, 23 May 2019 at 14:46, Christophe Lyon  
> > wrote:
> > >
> > > Ping?
> > >
> > > Any feedback other than what I got on patch 03/21 ?
> > >
> > > Thanks,
> > >
> > > Christophe
> > >
> > >
> > > On 15/05/2019 14:39, Christophe Lyon wrote:
> > > > Hello,
> > > >
> > > > This patch series implements the GCC contribution of the FDPIC ABI for
> > > > ARM targets.
> > > >
> > > > This ABI enables to run Linux on ARM MMU-less cores and supports
> > > > shared libraries to reduce the memory footprint.
> > > >
> > > > Without MMU, text and data segments relative distances are different
> > > > from one process to another, hence the need for a dedicated FDPIC
> > > > register holding the start address of the data segment. One of the
> > > > side effects is that function pointers require two words to be
> > > > represented: the address of the code, and the data segment start
> > > > address. These two words are designated as "Function Descriptor",
> > > > hence the "FD PIC" name.
> > > >
> > > > On ARM, the FDPIC register is r9 [1], and the target name is
> > > > arm-uclinuxfdpiceabi. Note that arm-uclinux exists, but uses another
> > > > ABI and the BFLAT file format; it does not support code sharing.
> > > > The -mfdpic option is enabled by default, and -mno-fdpic should be
> > > > used to build the Linux kernel.
> > > >
> > > > This work was developed some time ago by STMicroelectronics, and was
> > > > presented during Linaro Connect SFO15 (September 2015). You can watch
> > > > the discussion and read the slides [2].
> > > > This presentation was related to the toolchain published on github [3],
> > > > which is based on binutils-2.22, gcc-4.7, uclibc-0.9.33.2, gdb-7.5.1
> > > > and qemu-2.3.0, and for which pre-built binaries are available [3].
> > > >
> > > > The ABI itself is described in details in [1].
> > > >
> > > > Our Linux kernel patches have been updated and committed by Nicolas
> > > > Pitre (Linaro) in July 2017. They are required so that the loader is
> > > > able to handle this new file type. Indeed, the ELF files are tagged
> > > > with ELFOSABI_ARM_FDPIC. This new tag has been allocated by ARM, as
> > > > well as the new relocations involved.
> > > >
> > > > The binutils, QEMU and uclibc-ng patch series have been merged a few
> > > > months ago. [4][5][6]
> > > >
> > > > This series provides support for architectures that support ARM and/or
> > > > Thumb-2 and has been tested on arm-linux-gnueabi without regression,
> > > > as well as arm-uclinuxfdpiceabi, using QEMU. arm-uclinuxfdpiceabi has
> > > > a few more failures than arm-linux-gnueabi, but is quite functional.
> > > >
> > > > I have also booted an STM32 board (stm32f469) which uses a cortex-m4
> > > > with linux-4.20.17 and ran successfully several tools.
> > > >
> > > > Are the GCC patches OK for inclusion in master?
> > > >
> > > > Changes between v4 and v5:
> > > > - rebased on top of recent gcc-10 master (April 26th, 2019)
> > > > - fixed handling of stack-protector combined patterns in FDPIC mode
> > > >
> > > > Changes between v3 and v4:
> > > >
> > > > - improved documentation (patch 1)
> > > > - emit an error message (sorry) if the target architecture does not
> > > >support arm nor thumb-2 modes (patch 4)
> > > > - handle Richard's comments on patch 4 (comments, unspec)
> > > > - added .align directive (patch 5)
> > > > - fixed use of kernel helpers (__kernel_cmpxchg, __kernel_dmb) (patch 6)
> > > > - code factorization in patch 7
> > > > - typos/internal function name in patch 8
> > > > - improved patch 12
> > > > - dropped patch 16
> > > > - patch 20 introduces arm_arch*_thumb_ok effective targets to help
> > > >skip some tests
> > > > - I tested patch 2 on xtensa-buildroot-uclinux-uclibc, it adds many
> > > >new tests, but a few regressions
> > > >(https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00713.html)
> > > > - I compiled and executed several LTP tests to exercise pthreads and 
> > > > signals
> > > > - I wrote and executed a simple testcase to change the interaction
> > > >with __kernel_cmpxchg (ie. call the kernel helper rather than use an
> > > >implementation in libgcc as requested by Richard)
> > > >
> > > > Changes between v2 and v3:
> > > > - added doc entry for -mfdpic new option
> > > > - took Kyrill's comments into account (use "Armv7" instead of "7",
> > > >code factorization, use preprocessor instead of hard-coding "r9",
> > > >remove leftover code for thumb1 support, fixed comments)
> > > > - rebase over recent trunk
> > > > - patches with changes: 1, 2 (commit

[COMMITTED] Improve PR64242 testcase

2019-06-17 Thread Wilco Dijkstra
Clear the input array to avoid the testcase accidentally
passing with an incorrect frame pointer.

Committed as obvious.

ChangeLog:
2019-06-17  Wilco Dijkstra  

testsuite/
PR middle-end/64242
* gcc.c-torture/execute/pr64242.c: Improve test.

--
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr64242.c 
b/gcc/testsuite/gcc.c-torture/execute/pr64242.c
index 
e6139ede3f34d587ac53d04e286e5d75fd2ca76c..5151f6eef839c0a98c76aeba93fdb800129523a3
 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr64242.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr64242.c
@@ -7,6 +7,7 @@ broken_longjmp (void *p)
 {
   void *buf[32];
   __builtin_memcpy (buf, p, 5 * sizeof (void*));
+  __builtin_memset (p, 0, 5 * sizeof (void*));
   /* Corrupts stack pointer...  */
   __builtin_longjmp (buf, 1);
 }



Re: [PATCH, d]: Fix PR90261, FAIL: libphobos.phobos/std/file.d on CentOS 5.11, Linux 2.6.18

2019-06-17 Thread Uros Bizjak
Ping.

On Wed, May 8, 2019 at 1:52 PM Uros Bizjak  wrote:
>
> Hello!
>
> CentOS 5.11 (glibc 2.5) does not have utimensat function, so there is
> no nanosecond precision of file times available. Currently, the test
> fails with:
>
> /tmp/cc36u3o7.o: In function
> `_D3std4file17__T8setTimesTAyaZ8setTimesFAyaS3std8datetime7systime7SysTimeS3std8datetime7systime7SysTimeZ16trustedUtimensatFNbNiNeiPxaKxG2S4core3sys5posix6signal8timespeciZi':
> /home/uros/git/gcc/libphobos/testsuite/../src/std/file.d:1272:
> undefined reference to `utimensat'
> collect2: error: ld returned 1 exit status
> compiler exited with status 1
>
> Attached patch detects utimensat function during configure time and
> falls back to utimes in case utimensat is not available.
>
> 2019-05-08  Uroš Bizjak  
>
> PR d/90261
> * m4/druntime/libraries.m4 (DRUNTIME_LIBRARIES_CLIB):
> Check for utimensat function.
> * configure: Regenerate
> * Makefile.in: Regenerate
> * libdruntime/gcc/config.d.in: Add Have_Utimensat.
> * libdruntime/Makefile.in: Regenerate.
> * libdruntime/core/sys/posix/sys/stat.d [version (CRuntime_Glibc)]:
> Declare utimensat and futimens only with Have_Utimensat.
> * src/Makefile.in: Regenerate.
> * src/std/file.d: Call testTimes with non-zero argument only
> when utimensat is defined.
> * testsuite/Makefile.in: Regenerate.
>
> BTW: The same fix as applied to CRuntime_Glibc can also be applied to
> FreeBSD version, which currently reads:
>
> // Since FreeBSD 11:
> version (none)
> {
> int utimensat(int dirfd, const char *pathname,
> ref const(timespec)[2] times, int flags);
> int futimens(int fd, ref const(timespec)[2] times);
> }
>
> BTW2: The testcase now fails in another place in src/std/file.d on
> CentOS 5.11 (and probably other non-modern systems):
>
> // Tests sub-second precision of querying file times.
> // Should pass on most modern systems running on modern filesystems.
> // Exceptions:
> // - FreeBSD, where one would need to first set the
> //   vfs.timestamp_precision sysctl to a value greater than zero.
> // - OS X, where the native filesystem (HFS+) stores filesystem
> //   timestamps with 1-second precision.
>
> This test should check the availability of utimensat on linux,
> otherwise the resolution is only in seconds range.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu
> {,-m32} with CentOS 5.11 and Fedora 30.
>
> Uros.


Re: Review Hashtable extract node API

2019-06-17 Thread Jonathan Wakely

On 07/06/19 18:39 +0200, François Dumont wrote:

On 6/5/19 6:22 PM, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

Hi

    Here is a patch to enhance the _Hashtable extract node API and 
fix a FIXME request.


    The enhancement to the extract node Api is that extract(const 
key_type&) do not call extract(const_iterator) anymore. Doing so 
we had to loop again through bucket nodes to find the previous 
node to the one to extract. Even if a bucket shall not contain 
many nodes (in unique key mode) it's easy to avoid it.


Nice.

    To fix the FIXME I introduced a node smart pointer type 
managing the node lifetime. The node is extracted from this smart 
pointer only when there can't be any exception raised. In the 
context of the node extract api the node handle is considered as a 
smart pointer. So the node handle will remain owner of the node in 
case of exception when reinserting it, I hope it is the expected 
behavior.


Yes, that's right.

I was going to suggest just using the node handle type instead of
inventing a new smart pointer, but the handle type uses std::optional
so isn't available for C++11/14.


I considered it too, or even use shared_ptr but thought it was overkill.





    * include/bits/hashtable_policy.h
    (struct _NodeSmartPointer<_NodeAlloc>): New.
    (_Map_base<>::operator[](const key_type&)): Use latter, adapt.
    (_Map_base<>::operator[](key_type&&)): Likewise.
    * include/bits/hashtable.h
    (_Hashtable<>::__node_sp_t): New.
    (_Hashtable<>::_M_insert_unique_node(size_type, __hash_code,
    __node_type*, size_type)): Replace by...
(_Hashtable<>::_M_insert_unique_node<_NodeAccessor>(const key_type&,
    size_type, __hash_code, const _NodeAccessor&, size_type)): ...that.
    (_Hashtable<>::_M_insert_multi_node(__node_type*, __hash_code,
    __node_type*)): Replace by...
(_Hashtable<>::_M_insert_multi_node<_NodeAccessor>(__node_type*,
    __hash_code, const _NodeAccessor&)): ...that.
    (_Hashtable<>::_M_reinsert_node): Adapt.
    (_Hashtable<>::_M_reinsert_node_multi): Adapt.
    (_Hashtable<>::_M_extract_node(size_t, __node_base*)): New.
    (_Hashtable<>::extract(const_iterator)): Use latter.
    (_Hashtable<>::extract(const _Key&)): Likewise.
    (_Hashtable<>::_M_merge_unique): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(const_iterator, false_type,
    _Args&&...)): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/hashtable.h 
b/libstdc++-v3/include/bits/hashtable.h

index e2e3f016a35..307865b96bf 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -197,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  using __hash_cached = typename __traits_type::__hash_cached;
  using __node_type = __detail::_Hash_node<_Value, 
__hash_cached::value>;

  using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
+  using __node_sp_t = 
__detail::_NodeSmartPointer<__node_alloc_type>;


  using __hashtable_alloc = 
__detail::_Hashtable_alloc<__node_alloc_type>;


@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __node_base*
  _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no 
rehash (assumes
-  // no element with its key already present). Take ownership 
of the node,

-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket 
__bkt if no

+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-    __node_type* __n, size_type __n_elt = 1);
+    _M_insert_unique_node(const key_type& __k, size_type __bkt,
+  __hash_code __code, const _NodeAccessor&,
+  size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-   __hash_code __code, __node_type* __n);
+    _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+ const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.



Maybe one the reasons we have a difference when running performance 
tests in -03. I don't know why I abused so much of 

Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Jan Hubicka
Hi
this is patch for BIT_FIELD_REFs on nonoverlapping_component_refs_p
I comitted after testing on x86_64-linux

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 272379)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2019-06-17  Jan Hubicka  
+
+   * tree-ssa-alias.c (nonoverlapping_component_refs_p): Also truncate
+   access path on BIT_FIELD_REFs.
+
 2019-06-17  Martin Liska  
 
PR ipa/90874
Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 272379)
+++ tree-ssa-alias.c(working copy)
@@ -1268,7 +1268,8 @@ nonoverlapping_component_refs_p (const_t
  if (TREE_CODE (type) == RECORD_TYPE)
fieldsx.safe_push (field);
}
-  else if (TREE_CODE (x) == VIEW_CONVERT_EXPR)
+  else if (TREE_CODE (x) == VIEW_CONVERT_EXPR
+  || TREE_CODE (x) == BIT_FIELD_REF)
fieldsx.truncate (0);
   x = TREE_OPERAND (x, 0);
 }
@@ -1284,7 +1285,8 @@ nonoverlapping_component_refs_p (const_t
  if (TREE_CODE (type) == RECORD_TYPE)
fieldsy.safe_push (TREE_OPERAND (y, 1));
}
-  else if (TREE_CODE (y) == VIEW_CONVERT_EXPR)
+  else if (TREE_CODE (y) == VIEW_CONVERT_EXPR
+  || TREE_CODE (y) == BIT_FIELD_REF)
fieldsy.truncate (0);
   y = TREE_OPERAND (y, 0);
 }



Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Segher Boessenkool
On Mon, Jun 17, 2019 at 10:51:49AM +0200, Richard Biener wrote:
> On Mon, 17 Jun 2019, Kewen.Lin wrote:
> 
> > Hi Segher and Bill,
> > 
> > Thanks a lot for your review comments! I've updated the patch accordingly.
> > 
> > The updated one attached.
> 
> OK.  I suppose all low-overhead loop instructions use a decrement to zero
> style iterator?

The documentation says decrement to 0, decrement to -1, and decrement to
any negative are all supported.  But all are decrement, yes.


Segher


Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Kewen.Lin
on 2019/6/17 下午4:51, Richard Biener wrote:
> On Mon, 17 Jun 2019, Kewen.Lin wrote:
> 
>> Hi Segher and Bill,
>>
>> Thanks a lot for your review comments! I've updated the patch accordingly.
>>
>> The updated one attached.
> 
> OK.  I suppose all low-overhead loop instructions use a decrement to zero
> style iterator?
> 

Thanks Richard!

Yes, I think so.  As the function doloop_condition_get, it checks a decrement
and branch jump patterns.


Thanks, 
Kewen



Re: [PATCH] Handle '\0' in strcmp in RTL expansion (PR tree-optimization/90892).

2019-06-17 Thread Jakub Jelinek
On Mon, Jun 17, 2019 at 11:26:03AM +0200, Martin Liška wrote:
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 3463ffb1539..917852071b9 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -7142,6 +7142,20 @@ inline_expand_builtin_string_cmp (tree exp, rtx target)
>const char *src_str1 = c_getstr (arg1, &len1);
>const char *src_str2 = c_getstr (arg2, &len2);
>  
> +  if (src_str1 != NULL)
> +{
> +  unsigned HOST_WIDE_INT str_str1_strlen = strlen (src_str1);
> +  if (str_str1_strlen + 1 < len1)
> + len1 = str_str1_strlen + 1;

So use strnlen instead of strlen?
  if (src_str1 != NULL)
len1 = strnlen (src_str1, len1);
etc.?

Jakub


[PATCH] Handle '\0' in strcmp in RTL expansion (PR tree-optimization/90892).

2019-06-17 Thread Martin Liška
Hi.

The function c_getstr returns string and length of the string.
In inline_expand_builtin_string_cmp, we should consider situations
where a string constant contains a zero character. In that case
we have to shorten len[12].

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-17  Martin Liska  

PR tree-optimization/90892
* builtins.c (inline_expand_builtin_string_cmp): Handle '\0'
in string constants.

gcc/testsuite/ChangeLog:

2019-06-17  Martin Liska  

PR tree-optimization/90892
* gcc.dg/pr90892.c: New test.
---
 gcc/builtins.c | 14 ++
 gcc/testsuite/gcc.dg/pr90892.c | 14 ++
 2 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr90892.c


diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3463ffb1539..917852071b9 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7142,6 +7142,20 @@ inline_expand_builtin_string_cmp (tree exp, rtx target)
   const char *src_str1 = c_getstr (arg1, &len1);
   const char *src_str2 = c_getstr (arg2, &len2);
 
+  if (src_str1 != NULL)
+{
+  unsigned HOST_WIDE_INT str_str1_strlen = strlen (src_str1);
+  if (str_str1_strlen + 1 < len1)
+	len1 = str_str1_strlen + 1;
+}
+
+  if (src_str2 != NULL)
+{
+  unsigned HOST_WIDE_INT str_str2_strlen = strlen (src_str2);
+  if (str_str2_strlen + 1 < len2)
+	len2 = str_str2_strlen + 1;
+}
+
   /* If neither strings is constant string, the call is not qualify.  */
   if (!src_str1 && !src_str2)
 return NULL_RTX;
diff --git a/gcc/testsuite/gcc.dg/pr90892.c b/gcc/testsuite/gcc.dg/pr90892.c
new file mode 100644
index 000..e4b5310807a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr90892.c
@@ -0,0 +1,14 @@
+/* PR tree-optimization/90892 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+const char *a = "A\0b";
+
+int
+main()
+{
+  if (__builtin_strncmp(a, "A\0", 2) != 0)
+__builtin_abort ();
+
+  return 0;
+}



[PATCH] Add new micro-benchmark for string operations.

2019-06-17 Thread Martin Liška
Hi.

I'm adding a micro-benchmark that Honza has been using for quite some time.

Sample output:

./contrib/bench-stringop 64 1024 gcc
memcpy
  block size  libcall rep1noalg   rep4noalg   rep8noalg   loop
noalg   unrlnoalg   sse noalg   bytePGO dynamicBEST
 8192000  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.01 0:00.00 0:00.000:00.00 
libcall
  819200  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   81920  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   20480  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
8192  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
4096  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
2048  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
1024  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
 512  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
 256  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
 128  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  64  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  48  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  32  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  24  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  16  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  14  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  12  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
  10  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   8  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   6  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   4  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   1  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
Aligned
  block size  libcall rep1noalg   rep4noalg   rep8noalg   loop
noalg   unrlnoalg   sse noalg   bytePGO dynamicBEST
 8192000  0:00.00 0:00.00 0:00.00 0:00.01 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.01 0:00.00 0:00.000:00.00 
libcall
  819200  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   81920  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
   20480  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
8192  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.000:00.00 byte
4096  0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 0:00.00 
0:00.00 

Do not give up early on access path oracle

2019-06-17 Thread Jan Hubicka
Hi,
while working on testcases for nonoverlapping_component_refs_p I run into issue
that we often bypass it because the indirect-decl and indirect-indirect oracles
give up if they manage to match bases and ranges_maybe_overlap_p return true.
(one testcase is included in patch).

The issue is that decl-decl oracle first test base+offsets and if that
fails it proceeds to nonoverlapping_component_refs_of_decl_p which is
still able to do some useful disambiguations when the access paths
contains non-constat ARRAY_REFs where max_size is not very informative.

I modified the other two oracles to avoid the early return which increased
effectivity of nonoverlapping_component_refs_p and aliasing_component_refs_p
somewhat but also about doubled number of calls of them.

I can't say if the early return is just omision or intended to save compile time
but it appeared to me that most of those nondisambiguated comparsions acutally
covers the case we know that access paths are the same and in such case
we could still return early.

This patch adds same_access_paths_p which implements simple logic matching
max_size with type size (thus ruling out variable array accesses) and then
verifying that there are no unions on the way (in that case we still may have
different access paths to same memory location.

With this patch I get relatively reasonable increase in number of querries
and disambiguations.

For tramp3d:

Alias oracle query stats:
  refs_may_alias_p: 3021539 disambiguations, 3321136 queries
  ref_maybe_used_by_call_p: 7118 disambiguations, 3047133 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  nonoverlapping_component_refs_p: 22 disambiguations, 61735 queries
  nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
  aliasing_component_refs_p: 2050 disambiguations, 19908 queries
  TBAA oracle: 1419961 disambiguations 2916692 queries
   555158 are in alias set 0
   575103 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   252874 are dependent in the DAG
   113596 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 671982 disambiguations, 952513 queries
  pt_solutions_intersect: 97060 disambiguations, 437912 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 3021842 disambiguations, 3321308 queries
  ref_maybe_used_by_call_p: 7118 disambiguations, 3047440 queries
  call_may_clobber_ref_p: 817 disambiguations, 817 queries
  nonoverlapping_component_refs_p: 25 disambiguations, 63108 queries
  nonoverlapping_component_refs_of_decl_p: 19 disambiguations, 2192 queries
  aliasing_component_refs_p: 2236 disambiguations, 20594 queries
  TBAA oracle: 1420030 disambiguations 2918103 queries
   555158 are in alias set 0
   575109 queries asked about the same object
   0 queries asked about the same alias set
   0 access volatile
   253260 are dependent in the DAG
   114546 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 671990 disambiguations, 952532 queries
  pt_solutions_intersect: 97060 disambiguations, 438091 queries

So 3% new querries on wich we seems to have have about 30% disambiguation rate
(190 new disambiguations)

For cc1:

Alias oracle query stats:
  refs_may_alias_p: 38345354 disambiguations, 46034874 queries
  ref_maybe_used_by_call_p: 57452 disambiguations, 38905700 queries
  call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
  nonoverlapping_component_refs_p: 9674 disambiguations, 743745 queries
  nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
  aliasing_component_refs_p: 103234 disambiguations, 291017 queries
  TBAA oracle: 10719978 disambiguations 32885735 queries
   13521045 are in alias set 0
   5233132 queries asked about the same object
   200 queries asked about the same alias set
   0 access volatile
   1459189 are dependent in the DAG
   1952191 are aritificially in conflict with void *

PTA query stats:
  pt_solution_includes: 465494 disambiguations, 6918046 queries
  pt_solutions_intersect: 342384 disambiguations, 6435014 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 38345581 disambiguations, 46035002 queries
  ref_maybe_used_by_call_p: 57452 disambiguations, 38905936 queries
  call_may_clobber_ref_p: 5856 disambiguations, 8685 queries
  nonoverlapping_component_refs_p: 9754 disambiguations, 768725 queries
  nonoverlapping_component_refs_of_decl_p: 6962 disambiguations, 212637 queries
  aliasing_component_refs_p: 103316 disambiguations, 314129 queries
  TBAA oracle: 10720037 disambiguations 32893789 queries
   13521037 are in alias set 0
   5233163 queries asked about the same object
   200 queries asked about the same alias set
   

Re: Fix ICE due to commit for PR88834

2019-06-17 Thread Richard Sandiford
Kugan Vivekanandarajah  writes:
> Hi All,
>
> As pointed to me by Jeff, after committing patch to fix PR88834, some
> tests are failing for target rx-elf. This is because in
> preferred_mem_scale_factor we end up with mem_mode which is BLKmode
> and hence GET_MODE_UNIT_SIZE returns zero.
>
> I have fixed this by checking for BLKmode. I believe this is the only
> way we can have GET_MODE_UNIT_SIZE of 0. Otherwise, we can check for
> GET_MODE_UNIT_SIZE of zero.
>
> Bootstrapped and regression tested attached patch on x86_64-linux-gnu
> with no new regressions. Is this OK for trunk?
>
> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2019-06-17  Kugan Vivekanandarajah  
>
> * tree-ssa-address.c (preferred_mem_scale_factor): Handle when
> mem_mode is BLKmode.
>
> From 5cd4ac35ce8006a6c407a2386175382f053dcdd3 Mon Sep 17 00:00:00 2001
> From: Kugan Vivekanandarajah 
> Date: Sun, 16 Jun 2019 21:02:59 +1000
> Subject: [PATCH] Fix ICE for rx-elf
>
> Change-Id: I503b6b8316e7d11d63ec7749ff44dbc641078539
> ---
>  gcc/tree-ssa-address.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
> index cdd432a..1dca779 100644
> --- a/gcc/tree-ssa-address.c
> +++ b/gcc/tree-ssa-address.c
> @@ -1138,6 +1138,10 @@ preferred_mem_scale_factor (tree base, machine_mode 
> mem_mode,
>addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
>unsigned int fact = GET_MODE_UNIT_SIZE (mem_mode);
>  
> +  /* for BLKmode, we cant do anything so return 1.  */

s/for/For/;s/cant/can't/

> +  if (mem_mode == BLKmode)
> +return 1;
> +

Think it makes more sense to do this at the start of the function,
before:

  struct mem_address parts = {};
  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (base));
  unsigned int fact = GET_MODE_UNIT_SIZE (mem_mode);

(Hopefully one day GET_MODE_SIZE & co. will assert on BLKmode and VOIDmode.)

OK with those changes, thanks.

Richard


Re: [PATCH] Remove dead code (PR ipa/90874).

2019-06-17 Thread Jan Hubicka
> Hi.
> 
> After r272037 change we have a leftover dead code in odr_type_p.
> 
> Ready to be installed?
OK (it is obvious anyway:),
thanks!
Honza
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-17  Martin Liska  
> 
>   PR ipa/90874
>   * ipa-utils.h (odr_type_p): Remove dead code.
> ---
>  gcc/ipa-utils.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> 

> diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
> index 5903da5840d..3e582f45283 100644
> --- a/gcc/ipa-utils.h
> +++ b/gcc/ipa-utils.h
> @@ -240,7 +240,6 @@ odr_type_p (const_tree t)
>gcc_checking_assert (in_lto_p || flag_lto);
>return TYPE_NAME (t) && TREE_CODE (TYPE_NAME (t)) == TYPE_DECL
>   && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t));
> -  return false;
>  }
>  
>  #endif  /* GCC_IPA_UTILS_H  */
> 



Re: [PATCH v4 2/3] Add predict_doloop_p target hook

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Kewen.Lin wrote:

> Hi Segher and Bill,
> 
> Thanks a lot for your review comments! I've updated the patch accordingly.
> 
> The updated one attached.

OK.  I suppose all low-overhead loop instructions use a decrement to zero
style iterator?

Thanks,
Richard.


[PATCH] Remove dead code (PR ipa/90874).

2019-06-17 Thread Martin Liška
Hi.

After r272037 change we have a leftover dead code in odr_type_p.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-17  Martin Liska  

PR ipa/90874
* ipa-utils.h (odr_type_p): Remove dead code.
---
 gcc/ipa-utils.h | 1 -
 1 file changed, 1 deletion(-)


diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 5903da5840d..3e582f45283 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -240,7 +240,6 @@ odr_type_p (const_tree t)
   gcc_checking_assert (in_lto_p || flag_lto);
   return TYPE_NAME (t) && TREE_CODE (TYPE_NAME (t)) == TYPE_DECL
  && DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t));
-  return false;
 }
 
 #endif  /* GCC_IPA_UTILS_H  */



[build] Remove support for alternative Solaris 11.4 ld -V output

2019-06-17 Thread Rainer Orth
I've been informed that parsing of Solaris ld -V output doesn't work
with Solaris sed.  The problem is that the alternation used since

[build] Cleanup Solaris linker version checks
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00068.html

isn't recognized by /usr/bin/sed and /usr/xpg4/bin/sed which can only
handle BREs.  This means that all features dependent on ld version
checking aren't supported anymore.

I hadn't recognized this before since I've been using GNU sed
exclusively.  While Illumos sed could work around this by using sed -E
instead, which enables ERE support, this option isn't recognized by
Solaris sed.

Fortunately, the fix is simple: the alternative ld -V output format
handled since the patch above never made it into a non-beta version of
Solaris, and never will, so I'm just removing this again.

Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.   Installed for mainline now; will backport to
the gcc-9 and gcc-8 branches after some soak time.

For good measure, I've also run Solaris bootstraps with /usr/bin in
front of /usr/gnu/bin, but this issue was the only difference.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2019-06-14  Rainer Orth  

* configure.ac (ld_vers) <*-*-solaris2*>: Remove support for
alternative Solaris 11.4 format.
* configure: Regenerate.

# HG changeset patch
# Parent  5e2355b076a745aef941612cd3fa7827279e9325
Remove support for alternative Solaris 11.4 ld -V output

diff --git a/gcc/configure.ac b/gcc/configure.ac
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2684,17 +2684,13 @@ if test $in_tree_ld != yes ; then
 	#
 	# ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1701:onnv-ab196087-6931056-03/25/10
 	#
-	# In Solaris 11.4, this was changed to
-	#
-	# ld: Solaris ELF Utilities: 11.4-1.3123
-	#
 	# ld and ld.so.1 are guaranteed to be updated in lockstep, so ld version
 	# numbers can be used in ld.so.1 feature checks even if a different
 	# linker is configured.
 	ld_ver=`$gcc_cv_ld -V 2>&1`
-	if echo "$ld_ver" | $EGREP 'Solaris Link Editors|Solaris ELF Utilities' > /dev/null; then
+	if echo "$ld_ver" | grep 'Solaris Link Editors' > /dev/null; then
 	  ld_vers=`echo $ld_ver | sed -n \
-	-e 's,^.*: \(5\|1[0-9]\)\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\2,p'`
+	-e 's,^.*: 5\.[0-9][0-9]*-\([0-9]\.[0-9][0-9]*\).*$,\1,p'`
 	  ld_vers_major=`expr "$ld_vers" : '\([0-9]*\)'`
 	  ld_vers_minor=`expr "$ld_vers" : '[0-9]*\.\([0-9]*\)'`
 	fi


Re: [PATCH] Avoid unnecessary inclusion of header

2019-06-17 Thread Jonathan Wakely

On 06/06/19 16:38 +0100, Jonathan Wakely wrote:

This can greatly reduce the amount of preprocessed code that is included
by other headers, because  depends on  which is huge.

* include/std/array: Do not include .
* include/std/optional: Include  and
 instead of .

Preprocessed line counts for C++17 mode:

   
Before   2577432453 31616
After 992523194 19062

Tested x86_64-linux, committed to trunk.

Once we have a gcc-10/porting_to.html page I'll note this change
there, because code relying on std::string and std::allocator being
defined by transitive includes will need to include the right headers.


After this change some tests fail without PCH, because they weren't
including what they use.

Tested x86_64-linux, committed to trunk.

commit 9462a83e712b1e8621ffee998ccd3291df889c4b
Author: redi 
Date:   Mon Jun 17 08:18:17 2019 +

Fix tests that fail without PCH

The recent change to stop transitively including  broke some
tests, but only when the library is configured without PCH, because
otherwise the  header still gets included via the precompiled
 header.

* testsuite/20_util/bad_function_call/what.cc: Include  header
for std::string.
* testsuite/20_util/shared_ptr/cons/weak_ptr_expired.cc: Likewise.
* testsuite/20_util/tuple/cons/allocator_with_any.cc: Include 
header for std::allocator.
* testsuite/23_containers/array/tuple_interface/tuple_element.cc: Add
using-declaration for std::size_t.
* testsuite/23_containers/array/tuple_interface/tuple_size.cc:
Likewise.
* testsuite/23_containers/deque/cons/55977.cc: Include  for
std::istream.
* testsuite/23_containers/vector/cons/55977.cc: Likewise.
* testsuite/experimental/map/erasure.cc: Include  for
std::string.
* testsuite/experimental/unordered_map/erasure.cc: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272376 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/20_util/bad_function_call/what.cc b/libstdc++-v3/testsuite/20_util/bad_function_call/what.cc
index e17b42feedd..229c7ef1cab 100644
--- a/libstdc++-v3/testsuite/20_util/bad_function_call/what.cc
+++ b/libstdc++-v3/testsuite/20_util/bad_function_call/what.cc
@@ -18,6 +18,7 @@
 // .
 
 #include 
+#include 
 #include 
 
 int main()
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/weak_ptr_expired.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/weak_ptr_expired.cc
index f9c44cd60f3..a32a5c91943 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/weak_ptr_expired.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/weak_ptr_expired.cc
@@ -20,6 +20,7 @@
 // 20.6.6.2 Template class shared_ptr [util.smartptr.shared]
 
 #include 
+#include 
 #include 
 
 struct A { };
diff --git a/libstdc++-v3/testsuite/20_util/tuple/cons/allocator_with_any.cc b/libstdc++-v3/testsuite/20_util/tuple/cons/allocator_with_any.cc
index 655b5328bda..154ec9161bd 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/cons/allocator_with_any.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/cons/allocator_with_any.cc
@@ -23,6 +23,7 @@
 // this test may begin to fail.
 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element.cc b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element.cc
index b6fda44a3c5..e8b6bc12e77 100644
--- a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element.cc
+++ b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_element.cc
@@ -29,7 +29,7 @@ test01()
   // This relies on the fact that  includes :
   using std::is_same;
 
-  const size_t len = 3;
+  const std::size_t len = 3;
   typedef array array_type;
 
   static_assert(is_same::type, int>::value, "" );
diff --git a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_size.cc b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_size.cc
index aedd5fc2a2d..740c42a8914 100644
--- a/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_size.cc
+++ b/libstdc++-v3/testsuite/23_containers/array/tuple_interface/tuple_size.cc
@@ -26,6 +26,7 @@ test01()
 {
   using std::array;
   using std::tuple_size;
+  using std::size_t;
   // This relies on the fact that  includes :
   using std::is_same;
 
diff --git a/libstdc++-v3/testsuite/23_containers/deque/cons/55977.cc b/libstdc++-v3/testsuite/23_containers/deque/cons/55977.cc
index 492aedf97b4..5ab516a2950 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/cons/55977.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/cons/55977.cc
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 template 
 struct MyAllocator
diff --git a/libstdc++-v3/testsuite/23_containers/vector/cons/55977.cc b/libstdc++-v3/tes

Re: [PATCH, sanitizer] Wrap rethrow_primary_exception (PR 87880).

2019-06-17 Thread Jonathan Wakely

On 16/06/19 20:58 +0200, Jakub Jelinek wrote:

On Sun, Jun 16, 2019 at 07:54:42PM +0100, Iain Sandoe wrote:

So, I guess, unless Jonathan has plans to add __cxa_rethrow_primary_exception
during the 10 time-frame, it’s correct to exclude the symbol anyway and we 
should
bump the so version and apply trunk.


I don't understand why they've added it, it should be called
std::rethrow_exception and that is how it is called in libstdc++.


std::rethrow_exception was new in C++11, so maybe they wanted to be
able to use it in C++03 code as well? (Just a guess).


Actually, because the way in which interposing works for Darwin is different, 
the only
symbol change in the library on Darwin is removing an "undefined dynamic 
lookup".
So, for back-ports, I can could up with some Darwin-specific Makefike change 
that
only adds the ASAN_HAS_CXA_RETHROW_PRIMARY_EXCEPTION=0 for Darwin.

So - OK for trunk with a bumped soname?


Yes.


(and a TODO to figure a Darwin-only backport)


Yeah.

Jakub


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Jan Hubicka
> On Mon, 17 Jun 2019, Jan Hubicka wrote:
> 
> > > 
> > > get_alias_set already handles VCEs properly.  Btw I've said
> > 
> > I see - i keep thinking of get_alias_set as a simple accessor to type's
> > alias set which it is not.  It may make sense to separate these two
> > later.
> > 
> > > BIT_FIELD_REF has the same issue but you didn't include that
> > > case below.
> > 
> > Sorry, missed that in your reply.  I am testing the fix.
> > 
> > What about aliasing_component_refs.  When I have VCE in the the path,
> > I suppose I do not want to consider the stuff after VCE as part of path
> > and also while testing whether one path can be continuation from other
> > I want to start from outermost VCE rather than ref_type, right?
> 
> Yeah, when searching for base1/2 you want to initialize ref1/2p
> like
> 
>  while (handled_component_p (base1))
>   {
> if (TREE_CODE (base1) == VIEW_CONVERT_EXPR
> || TREE_CODE (base1) == BIT_FIELD_REF)
>  ref1p = &TREE_OPERAND (base1, 0);
> base1 = TREE_OPERAND (base1, 0);
>   }
> 
> and then instead of refp = &ref1/2 do refp = ref1/2p for the searches.
> 
> So not use the type of the innermost(!) VIEW_CONVERT_EXPR but its
> base (same for BIT_FIELD_REF).

Yep, that is what I had in mind.  I will test the patch.
Also will remove the ref pointers - we always just read them.

Thanks,
Honza


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Richard Biener
On Mon, 17 Jun 2019, Jan Hubicka wrote:

> > 
> > get_alias_set already handles VCEs properly.  Btw I've said
> 
> I see - i keep thinking of get_alias_set as a simple accessor to type's
> alias set which it is not.  It may make sense to separate these two
> later.
> 
> > BIT_FIELD_REF has the same issue but you didn't include that
> > case below.
> 
> Sorry, missed that in your reply.  I am testing the fix.
> 
> What about aliasing_component_refs.  When I have VCE in the the path,
> I suppose I do not want to consider the stuff after VCE as part of path
> and also while testing whether one path can be continuation from other
> I want to start from outermost VCE rather than ref_type, right?

Yeah, when searching for base1/2 you want to initialize ref1/2p
like

 while (handled_component_p (base1))
  {
if (TREE_CODE (base1) == VIEW_CONVERT_EXPR
|| TREE_CODE (base1) == BIT_FIELD_REF)
 ref1p = &TREE_OPERAND (base1, 0);
base1 = TREE_OPERAND (base1, 0);
  }

and then instead of refp = &ref1/2 do refp = ref1/2p for the searches.

So not use the type of the innermost(!) VIEW_CONVERT_EXPR but its
base (same for BIT_FIELD_REF).

> 
> So I would go for adjusting the ref1 and ref2 accordingly in the first
> walk of handled components. alias set needs no updating since that is
> aready done earlier.

Yes, adjusting ref1/2 is also possible.

Richard.

> Honza
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [MIPS] Fix for the wrong argument sequence in MSA builtin for FMADD/MADDV family.

2019-06-17 Thread Dragan Mladjenovic
Sorry for the late response.


I've checked the patch on gcc-9 and gcc-8 branches with mips{el}-linux-gnu.

There are no new regressions, as expected.


Is this change still eligible for back-port ?



From: Jeff Law 
Sent: Friday, May 31, 2019 11:41:35 PM
To: Dragan Mladjenovic; gcc-patches@gcc.gnu.org
Cc: Matthew Fortune
Subject: Re: [MIPS] Fix for the wrong argument sequence in MSA builtin for 
FMADD/MADDV family.

On 5/22/19 12:13 AM, Dragan Mladjenovic wrote:
> Hi all,
>
> This patch makes the behavior of __builtin_msa_ fmadd/fmsub/maddv match the
> expected one. While not a recent regression, it would be nice if this can
> be back-ported in order to minimize the fragmentation.
>
> Best regards,
>
> Dragan
>
> gcc/ChangeLog:
>
> 2019-05-20  Prachi Godbole  
> Robert Suchanek  
>
>* gcc/config/mips/mips.c (mips_expand_builtin_insn): Swap the 1st
>and 3rd operands of the fmadd/fmsub/maddv builtin.
>
> gcc/testsuite/ChangeLog:
>
> 2019-05-20  Dragan Mladjenovic  
>
>* gcc.target/mips/msa-fmadd.c: New.
THanks.  I've installed this one the trunk.

Out of an abundance of caution I have not installed it on any of the
release branches.  Feel free to ping me in a week or so and if there
haven't been any issues I'll back port it.

jeff


[Ada] Workaround for PR ada/80590

2019-06-17 Thread Arnaud Charlet
This patch avoids an unnecessary exception propagation in the compiler when
handling the "delay" construct. This should avoid the crashes/hangs observed
in PR 80590 on g-exptty.adb among others.

Note that the proper fix for this PR is to do a full bootstrap of GCC, since
the issue is that GNAT switched from SJLJ to ZCX exception handling from
GCC 7 to GCC 8, causing these inconsistencies that can only be resolved via
a full bootstrap, or by using the newly built compiler to build new compilers.

Tested on x86_64-pc-linux-gnu

Verified manually that no exception propagation occurs in gnat1 with this
patch when compiling g-exptty.adb, g-locfil.adb, s-gloloc.adb and g-socthi.adb.

Tom, if you could confirm that you are unblocked and can move forward on
reenabling Ada on your builds, that'd be great, let me know.

Committed on branches 8, 9 and trunk.

2019-06-17  Arnaud Charlet  

PR ada/80590

* exp_ch9.adb (Expand_N_Delay_Relative_Statement): Swap the two
conditions to avoid a unnecessary exception propagation in the default
case.

--
Index: exp_ch9.adb
===
--- exp_ch9.adb (revision 272370)
+++ exp_ch9.adb (working copy)
@@ -8258,18 +8258,17 @@
   Proc : Entity_Id;
 
begin
-  --  Try to use System.Relative_Delays.Delay_For only if available. This
-  --  is the implementation used on restricted platforms when Ada.Calendar
-  --  is not available.
+  --  Try to use Ada.Calendar.Delays.Delay_For if available.
 
-  if RTE_Available (RO_RD_Delay_For) then
- Proc := RTE (RO_RD_Delay_For);
+  if RTE_Available (RO_CA_Delay_For) then
+ Proc := RTE (RO_CA_Delay_For);
 
-  --  Otherwise, use Ada.Calendar.Delays.Delay_For and emit an error
-  --  message if not available.
+  --  Otherwise, use System.Relative_Delays.Delay_For and emit an error
+  --  message if not available. This is the implementation used on
+  --  restricted platforms when Ada.Calendar is not available.
 
   else
- Proc := RTE (RO_CA_Delay_For);
+ Proc := RTE (RO_RD_Delay_For);
   end if;
 
   Rewrite (N,


Re: Remove nonoverlapping_component_refs_of_decl_p

2019-06-17 Thread Jan Hubicka
> 
> get_alias_set already handles VCEs properly.  Btw I've said

I see - i keep thinking of get_alias_set as a simple accessor to type's
alias set which it is not.  It may make sense to separate these two
later.

> BIT_FIELD_REF has the same issue but you didn't include that
> case below.

Sorry, missed that in your reply.  I am testing the fix.

What about aliasing_component_refs.  When I have VCE in the the path,
I suppose I do not want to consider the stuff after VCE as part of path
and also while testing whether one path can be continuation from other
I want to start from outermost VCE rather than ref_type, right?

So I would go for adjusting the ref1 and ref2 accordingly in the first
walk of handled components. alias set needs no updating since that is
aready done earlier.

Honza