Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Oleg Endo
On Sat, 2019-10-26 at 14:04 -0600, Jeff Law wrote:
> On 10/26/19 1:33 PM, Andrew Waterman wrote:
> > I don't know enough to say whether the legitimize_address hook is
> > sufficient for the intended purpose, but I am sure that RISC-V's
> > concerns are not unique: other GCC targets have to cope with
> > offset-size constraints that are coupled to register-allocation
> > constraints.
> 
> Yup.  I think every risc port in the 90s faces this problem.  I
> always
> wished for a generic mechanism for ports to handle this problem.
> 
> Regardless, it's probably worth investigating.
> 

What we tried to do with the address mode selection (AMS) optimization
some time ago was the following:

  - Extract memory accesses from the insns stream and put them in 
"access sequences".  Also analyze the address expression and try   
to find effective base addresses by tracing back address 
calculations.

  - For each memory access, get a set of address mode alternatives and 
the corresponding costs from the backend.  The full context of each
access is provided, so the backend can detect things like 
"in this access sequence, FP loads dominate" and use this 
information to tune the alternative costs.

  - Given the alternatives and costs for each memory access, the pass 
would then try to minimize the costs of the whole memory access
sequence, taking costs of address modification isnns into account. 

I think this is quite generic, but of course also complex.  The
optimization problem itself is hard.  There was some research done by
others using  CPLEX or PBQP solvers.  To keep things simple we used a 
backtracking algorithm and handled only a limited set of scenarios. 
For example, I think we could not get loop constructs work nicely to
improve post-inc address mode utilization.

The SH ISA has very similar properties to ARM thumb and RVC, and
perhaps others.  Advantages would not be limited to RISC only, even
CISC ISAs like M68K, RX, ... can benefit from it, as the "proper
optimization" can reduce the instruction sizes by shortening the
addresses in the instruction stream.

If anyone is interested, here is the AMS optimization pass class:

https://github.com/erikvarga/gcc/blob/master/gcc/ams.h
https://github.com/erikvarga/gcc/blob/master/gcc/ams.cc

It's using a different style to callback into the backend code.  Not
GCC's "hooks" but a delegate pattern.  SH backend's delegate
implementation is here

https://github.com/erikvarga/gcc/blob/master/gcc/config/sh/sh.c#L11897

We were getting some improvements in the generated code, but it started
putting pressure on register allocation related issues in the SH
backend (the R0 problem), so we could not do more proper testing.

Maybe somebody can get some ideas out of it.

Cheers,
Oleg



Re: [PATCH] PR85678: Change default to -fno-common

2019-10-26 Thread Harald van Dijk

On 25/10/2019 16:47, Wilco Dijkstra wrote:

GCC currently defaults to -fcommon.  As discussed in the PR, this is an ancient
C feature which is not conforming with the latest C standards.


The PR references C99/C11 6.9p5, but that is not a constraint. Any 
violation merely renders the behaviour of a program undefined, so no 
diagnostic is required. To the best of my knowledge, both -fcommon and 
-fno-common are compatible with the C standard.


This is no reason not to change the default, but...


[...]
+definition.  This behavior does not conform to ISO C, is inconsistent with C++,
+and on many targets implies a speed and code size penalty on global variable
+references.  It is mainly useful to enable legacy code to link without errors.


...can this inaccurate characterization be left out of the documentation?

Cheers,
Harald van Dijk


Fix summaries after ipa-icf alias/wrapper creation

2019-10-26 Thread Jan Hubicka
Hi,
when ipa-icf produces wrapper/alias it needs to also update summaries to
represent the new function body. W/o this propagation across new
wrappers does not work.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

* ipa-icf.c (sem_function::merge): Update function summaries.
* ipa-prop.h (ipa_get_param): Do not sanity check for WPA.
Index: ipa-icf.c
===
--- ipa-icf.c   (revision 277474)
+++ ipa-icf.c   (working copy)
@@ -1266,6 +1266,7 @@ sem_function::merge (sem_item *alias_ite
 
   /* Remove the function's body.  */
   ipa_merge_profiles (original, alias);
+  symtab->call_cgraph_removal_hooks (alias);
   alias->release_body (true);
   alias->reset ();
   /* Notice global symbol possibly produced RTL.  */
@@ -1287,11 +1288,13 @@ sem_function::merge (sem_item *alias_ite
 {
   gcc_assert (!create_alias);
   alias->icf_merged = true;
+  symtab->call_cgraph_removal_hooks (alias);
   local_original->icf_merged = true;
 
   /* FIXME update local_original counts.  */
   ipa_merge_profiles (original, alias, true);
   alias->create_wrapper (local_original);
+  symtab->call_cgraph_insertion_hooks (alias);
 
   if (dump_enabled_p ())
dump_printf (MSG_OPTIMIZED_LOCATIONS,
Index: ipa-prop.h
===
--- ipa-prop.h  (revision 277474)
+++ ipa-prop.h  (working copy)
@@ -457,7 +457,6 @@ static inline tree
 ipa_get_param (class ipa_node_params *info, int i)
 {
   gcc_checking_assert (info->descriptors);
-  gcc_checking_assert (!flag_wpa);
   tree t = (*info->descriptors)[i].decl_or_type;
   gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
   return t;


[libstdc++,doc] Switch pubs.opengroup.org to https

2019-10-26 Thread Gerald Pfeifer
Turns out we have three instances of the same link in our libstdc++
documentation -- do we really need all of those?

In any case, I went ahead and applied this straightforward update.

Gerald


2019-10-26  Gerald Pfeifer  

* doc/xml/manual/codecvt.xml: Switch pubs.opengroup.org to https.
* doc/xml/manual/locale.xml (LC_ALL): Ditto.
* doc/xml/manual/messages.xml: Ditto.

Index: libstdc++-v3/doc/xml/manual/codecvt.xml
===
--- libstdc++-v3/doc/xml/manual/codecvt.xml (revision 277475)
+++ libstdc++-v3/doc/xml/manual/codecvt.xml (working copy)
@@ -595,7 +595,7 @@ codecvt usage.
   
   
http://www.w3.org/1999/xlink;
- xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;>
+ xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;>
   System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008)

   
Index: libstdc++-v3/doc/xml/manual/locale.xml
===
--- libstdc++-v3/doc/xml/manual/locale.xml  (revision 277475)
+++ libstdc++-v3/doc/xml/manual/locale.xml  (working copy)
@@ -560,7 +560,7 @@ global locale" (emphasis Paolo), that is:
   
   
http://www.w3.org/1999/xlink;
- xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;>
+ xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;>
   System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008)

   
Index: libstdc++-v3/doc/xml/manual/messages.xml
===
--- libstdc++-v3/doc/xml/manual/messages.xml(revision 277475)
+++ libstdc++-v3/doc/xml/manual/messages.xml(working copy)
@@ -488,7 +488,7 @@ void test01()
   
   
http://www.w3.org/1999/xlink;
- xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;>
+ xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;>
   System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008)

   


Re: [PATCH] C testsuite, silence a FreeBSD libc warning

2019-10-26 Thread Andreas Tobler

On 04.10.19 19:04, Jeff Law wrote:

On 9/30/19 12:47 PM, Andreas Tobler wrote:

On 30.09.19 20:37, Kamil Rytarowski wrote:

On 30.09.2019 19:47, Jakub Jelinek wrote:

On Mon, Sep 30, 2019 at 07:41:00PM +0200, Andreas Tobler wrote:

--- fprintf-2.c    (revision 276292)
+++ fprintf-2.c    (working copy)
@@ -1,7 +1,8 @@
   /* Verify that calls to fprintf don't get eliminated even if their
  result on success can be computed at compile time (they can fail).
  The calls can still be transformed into those of other functions.
-   { dg-skip-if "requires io" { freestanding } } */
+   { dg-skip-if "requires io" { freestanding } }
+   { dg-prune-output "(^|\n)(\[^\n\])*warning: warning: \[^\n\]*
possibly used unsafely; consider using \[^\n\]*\n" } */


I'm worried about that (^|\n) at the start + \n at the end, doesn't
it prune
too much then?
Looking at other tests, they dg-prune-output just a few words from a
message, or .*few words.*
So, can you try just
     { dg-prune-output "warning: warning: \[^\n\r\]* possibly used
unsafely; consider using" } */
or if that doesn't work, with .* at start end end?

 Jakub



Please handle the NetBSD specific string too: "warning: tmpnam()
possibly used unsafely, use mkstemp() or mkdtemp()".

https://nxr.netbsd.org/xref/src/lib/libc/stdio/tmpnam.c#52



Ok, I think the attached version should also match these cases. Although
untested on NetBSD.
Kamil, if you have cycles, would you mind giving it a run? Thanks!
Andreas


OK assuming Kamil's testing shows that it works.


Kamil, do you have a feedback? If not I'm going to commit by tomorrow.

Andreas


[committed] pa: Update libstdc++-v3 baseline symbols for hppa-linux

2019-10-26 Thread John David Anglin
Committed to trunk.

Dave

-- 

2019-10-26  John David Anglin  

* config/abi/post/hppa-linux-gnu/baseline_symbols.txt: Update.

Index: config/abi/post/hppa-linux-gnu/baseline_symbols.txt
===
--- config/abi/post/hppa-linux-gnu/baseline_symbols.txt (revision 277363)
+++ config/abi/post/hppa-linux-gnu/baseline_symbols.txt (working copy)
@@ -112,6 +112,7 @@
 FUNC:_ZN11__gnu_debug19_Safe_sequence_base18_M_detach_singularEv@@GLIBCXX_3.4
 
FUNC:_ZN11__gnu_debug19_Safe_sequence_base22_M_revalidate_singularEv@@GLIBCXX_3.4
 FUNC:_ZN11__gnu_debug19_Safe_sequence_base7_M_swapERS0_@@GLIBCXX_3.4
+FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base16_M_attach_singleEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.26
 
FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_attachEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.17
 FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_detachEv@@GLIBCXX_3.4.17
 
FUNC:_ZN11__gnu_debug30_Safe_unordered_container_base13_M_detach_allEv@@GLIBCXX_3.4.17
@@ -261,6 +262,7 @@
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8capacityEv@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8max_sizeEv@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE9_M_ibeginEv@@GLIBCXX_3.4
+FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEcvSt17basic_string_viewIwS0_EEv@@GLIBCXX_3.4.26
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEixEj@@GLIBCXX_3.4
 FUNC:_ZNKSi6gcountEv@@GLIBCXX_3.4
 FUNC:_ZNKSi6sentrycvbEv@@GLIBCXX_3.4
@@ -328,9 +330,66 @@
 FUNC:_ZNKSs8capacityEv@@GLIBCXX_3.4
 FUNC:_ZNKSs8max_sizeEv@@GLIBCXX_3.4
 FUNC:_ZNKSs9_M_ibeginEv@@GLIBCXX_3.4
+FUNC:_ZNKSscvSt17basic_string_viewIcSt11char_traitsIcEEEv@@GLIBCXX_3.4.26
 FUNC:_ZNKSsixEj@@GLIBCXX_3.4
 FUNC:_ZNKSt10bad_typeid4whatEv@@GLIBCXX_3.4.9
 FUNC:_ZNKSt10error_code23default_error_conditionEv@@GLIBCXX_3.4.11
+FUNC:_ZNKSt10filesystem16filesystem_error4whatEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem16filesystem_error5path1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem16filesystem_error5path2Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem18directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem28recursive_directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path11parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path12has_filenameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13has_root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13has_root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path13relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path14root_directoryEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path15has_parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path16lexically_normalEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path17_M_find_extensionEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path17has_relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path18has_root_directoryEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path18lexically_relativeERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path19lexically_proximateERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List13_Impl_deleterclEPNS1_5_ImplE@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List3endEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path5_List5beginEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path7compareERKS0_@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path7compareESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path9root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem4path9root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error4whatEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path2Ev@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1118directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iteratordeEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path11parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path12has_filenameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13has_root_nameEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13has_root_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path13relative_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path14root_directoryEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path15has_parent_pathEv@@GLIBCXX_3.4.26
+FUNC:_ZNKSt10filesystem7__cxx114path16lexically_normalEv@@GLIBCXX_3.4.26

Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Jeff Law
On 10/26/19 1:33 PM, Andrew Waterman wrote:
> I don't know enough to say whether the legitimize_address hook is
> sufficient for the intended purpose, but I am sure that RISC-V's
> concerns are not unique: other GCC targets have to cope with
> offset-size constraints that are coupled to register-allocation
> constraints.
Yup.  I think every risc port in the 90s faces this problem.  I always
wished for a generic mechanism for ports to handle this problem.

Regardless, it's probably worth investigating.

jeff



Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Jeff Law
On 10/26/19 1:10 PM, Oleg Endo wrote:
> On Sat, 2019-10-26 at 12:21 -0600, Jeff Law wrote:
>> On 10/25/19 11:39 AM, Craig Blackmore wrote:
>>> This patch aims to allow more load/store instructions to be
>>> compressed by
>>> replacing a load/store of 'base register + large offset' with a new
>>> load/store
>>> of 'new base + small offset'. If the new base gets stored in a
>>> compressed
>>> register, then the new load/store can be compressed. Since there is
>>> an overhead
>>> in creating the new base, this change is only attempted when 'base
>>> register' is
>>> referenced in at least 4 load/stores in a basic block.
>>>
>>> The optimization is implemented in a new RISC-V specific pass
>>> called
>>> shorten_memrefs which is enabled for RVC targets. It has been
>>> developed for the
>>> 32-bit lw/sw instructions but could also be extended to 64-bit
>>> ld/sd in future.
>>>
>>> Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc,
>>> rv64imac,
>>> rv64imafdc via QEMU. No regressions.
>>>
>>> gcc/ChangeLog:
>>>
>>> * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for
>>> riscv.
>>> * config/riscv/riscv-passes.def: New file.
>>> * config/riscv/riscv-protos.h (make_pass_shorten_memrefs):
>>> Declare.
>>> * config/riscv/riscv-shorten-memrefs.c: New file.
>>> * config/riscv/riscv.c (tree-pass.h): New include.
>>> (riscv_compressed_reg_p): New Function
>>> (riscv_compressed_lw_offset_p): Likewise.
>>> (riscv_compressed_lw_address_p): Likewise.
>>> (riscv_shorten_lw_offset): Likewise.
>>> (riscv_legitimize_address): Attempt to convert base +
>>> large_offset
>>> to compressible new_base + small_offset.
>>> (riscv_address_cost): Make anticipated compressed load/stores
>>> cheaper for code size than uncompressed load/stores.
>>> (riscv_register_priority): Move compressed register check to
>>> riscv_compressed_reg_p.
>>> * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET):
>>> Define.
>>> * config/riscv/riscv.opt (mshorten-memefs): New option.
>>> * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
>>> (PASSES_EXTRA): Add riscv-passes.def.
>>> * doc/invoke.texi: Document -mshorten-memrefs.
>>
>> This has traditionally been done via the the legitimize_address hook.
>> Is there some reason that hook is insufficient for this case?
>>
>> The hook, IIRC, is called out explow.c.
>>
> 
> This sounds like some of my addressing mode selection (AMS) attempts on
> SH.  Haven't looked at the patch (sorry), but I'm sure the problem is
> pretty much the same.
> 
> On SH legitimize_address is used to do ... "something" ... to the
> address in order to make the displacement fit.  The issue is,
> legitimize_address doesn't get any context so it can't even try to find
> a local optimal base address or something like that.
True it's not given any context.  I think most ports try to form the
address in such a way that the base+offset' is likely to be a common
subexpression.

THere was a similar hook for reload which I didn't mention as I wasn't
sure it was used i the LRA world.

SH is likely  than most more complex.  Joern tried really hard to get
good code on the SH and as a result the implementation is sometimes
tough to follow.

Jeff



Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Andrew Waterman
I don't know enough to say whether the legitimize_address hook is
sufficient for the intended purpose, but I am sure that RISC-V's
concerns are not unique: other GCC targets have to cope with
offset-size constraints that are coupled to register-allocation
constraints.


On Sat, Oct 26, 2019 at 11:21 AM Jeff Law  wrote:
>
> On 10/25/19 11:39 AM, Craig Blackmore wrote:
> > This patch aims to allow more load/store instructions to be compressed by
> > replacing a load/store of 'base register + large offset' with a new 
> > load/store
> > of 'new base + small offset'. If the new base gets stored in a compressed
> > register, then the new load/store can be compressed. Since there is an 
> > overhead
> > in creating the new base, this change is only attempted when 'base 
> > register' is
> > referenced in at least 4 load/stores in a basic block.
> >
> > The optimization is implemented in a new RISC-V specific pass called
> > shorten_memrefs which is enabled for RVC targets. It has been developed for 
> > the
> > 32-bit lw/sw instructions but could also be extended to 64-bit ld/sd in 
> > future.
> >
> > Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, rv64imac,
> > rv64imafdc via QEMU. No regressions.
> >
> > gcc/ChangeLog:
> >
> >   * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv.
> >   * config/riscv/riscv-passes.def: New file.
> >   * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare.
> >   * config/riscv/riscv-shorten-memrefs.c: New file.
> >   * config/riscv/riscv.c (tree-pass.h): New include.
> >   (riscv_compressed_reg_p): New Function
> >   (riscv_compressed_lw_offset_p): Likewise.
> >   (riscv_compressed_lw_address_p): Likewise.
> >   (riscv_shorten_lw_offset): Likewise.
> >   (riscv_legitimize_address): Attempt to convert base + large_offset
> > to compressible new_base + small_offset.
> >   (riscv_address_cost): Make anticipated compressed load/stores
> > cheaper for code size than uncompressed load/stores.
> >   (riscv_register_priority): Move compressed register check to
> > riscv_compressed_reg_p.
> >   * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): Define.
> >   * config/riscv/riscv.opt (mshorten-memefs): New option.
> >   * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
> >   (PASSES_EXTRA): Add riscv-passes.def.
> >   * doc/invoke.texi: Document -mshorten-memrefs.
> This has traditionally been done via the the legitimize_address hook.
> Is there some reason that hook is insufficient for this case?
>
> The hook, IIRC, is called out explow.c.
>
> Jeff
>


Re: [Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure

2019-10-26 Thread Paul Richard Thomas
Hi Steve,

Thanks for the heads up. I'll get on to it right away.

Regards

Paul

On Sat, 26 Oct 2019 at 20:04, Steve Kargl
 wrote:
>
> On Sat, Oct 26, 2019 at 07:39:55PM +0100, Paul Richard Thomas wrote:
> > As far as I can tell, this is a 6-regression as well - ***sigh***
> >
> > The patch is fundamentally very simple. Symbols that were marked with
> > the fn_result_spec flag that really were module parameters were having
> > the wrong name mangling applied to them. The rest of the patch is a
> > tidy up.
> >
> > Regtested on FC30/x86_64 - OK for all the branches after a bedding in
> > period on trunk?
> >
>
> OK.  Note you cannot wait too long for 7 or 9.  The 7 branch
> is being closed, and the 9-branch is close to its next release.
>
> --
> Steve



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


*Early ping* Re: [Patch, Fortran] PR91863 - fix call to bind(C) with array descriptor

2019-10-26 Thread Tobias Burnus

On 10/23/19 3:07 PM, Tobias Burnus wrote:


With the trunk, there are three issues:

(a) With bind(C), the callee side handles deallocation with intent(out).

This should produce the code:
    if (cfi.0 != 0B)
  {
    __builtin_free (cfi.0);
    cfi.0 = 0B;
  }
This fails as cfi.0 (of type 'void*') is dereferenced and
*cfi.0 = 0B' (i.e. assignment of type 'void') causes the ICE.


(b) With that fixed, one gets:
    sub (cfi.4);
    _gfortran_cfi_desc_to_gfc_desc (, );
    if (cfi.4 != 0B)
  __builtin_free (cfi.4);
    ... code using "a" ...
That also won't shine as 'a.data' == 'cfi.4'; hence, one
accesses already freed memory.

I don't see whether freeing the cfi memory makes sense at all;
as I didn't come up with a reason, I removed it for good.


Those issues, I have solved. The third issue is now PR fortran/92189:
(c) When allocating memory in a Fortran-written Bind(C) function, the
shape/bounds changes are not propagated back to Fortran.
Namely, "sub" lacks some _gfortran_gfc_desc_to_cfi_desc call at the end!

The issue pops up, if you change 'dg-do compile' into 'dg-do run'. For
using a C-written function, that's a non-issue. Hence, it makes sense
to fix (a)+(b) of the bug separately.


OK for the trunk and GCC 9? (At least the ICE is a regression.)

Tobias



Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Oleg Endo
On Sat, 2019-10-26 at 12:21 -0600, Jeff Law wrote:
> On 10/25/19 11:39 AM, Craig Blackmore wrote:
> > This patch aims to allow more load/store instructions to be
> > compressed by
> > replacing a load/store of 'base register + large offset' with a new
> > load/store
> > of 'new base + small offset'. If the new base gets stored in a
> > compressed
> > register, then the new load/store can be compressed. Since there is
> > an overhead
> > in creating the new base, this change is only attempted when 'base
> > register' is
> > referenced in at least 4 load/stores in a basic block.
> > 
> > The optimization is implemented in a new RISC-V specific pass
> > called
> > shorten_memrefs which is enabled for RVC targets. It has been
> > developed for the
> > 32-bit lw/sw instructions but could also be extended to 64-bit
> > ld/sd in future.
> > 
> > Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc,
> > rv64imac,
> > rv64imafdc via QEMU. No regressions.
> > 
> > gcc/ChangeLog:
> > 
> > * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for
> > riscv.
> > * config/riscv/riscv-passes.def: New file.
> > * config/riscv/riscv-protos.h (make_pass_shorten_memrefs):
> > Declare.
> > * config/riscv/riscv-shorten-memrefs.c: New file.
> > * config/riscv/riscv.c (tree-pass.h): New include.
> > (riscv_compressed_reg_p): New Function
> > (riscv_compressed_lw_offset_p): Likewise.
> > (riscv_compressed_lw_address_p): Likewise.
> > (riscv_shorten_lw_offset): Likewise.
> > (riscv_legitimize_address): Attempt to convert base +
> > large_offset
> > to compressible new_base + small_offset.
> > (riscv_address_cost): Make anticipated compressed load/stores
> > cheaper for code size than uncompressed load/stores.
> > (riscv_register_priority): Move compressed register check to
> > riscv_compressed_reg_p.
> > * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET):
> > Define.
> > * config/riscv/riscv.opt (mshorten-memefs): New option.
> > * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
> > (PASSES_EXTRA): Add riscv-passes.def.
> > * doc/invoke.texi: Document -mshorten-memrefs.
> 
> This has traditionally been done via the the legitimize_address hook.
> Is there some reason that hook is insufficient for this case?
> 
> The hook, IIRC, is called out explow.c.
> 

This sounds like some of my addressing mode selection (AMS) attempts on
SH.  Haven't looked at the patch (sorry), but I'm sure the problem is
pretty much the same.

On SH legitimize_address is used to do ... "something" ... to the
address in order to make the displacement fit.  The issue is,
legitimize_address doesn't get any context so it can't even try to find
a local optimal base address or something like that.

Cheers,
Oleg



Re: [Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure

2019-10-26 Thread Steve Kargl
On Sat, Oct 26, 2019 at 07:39:55PM +0100, Paul Richard Thomas wrote:
> As far as I can tell, this is a 6-regression as well - ***sigh***
> 
> The patch is fundamentally very simple. Symbols that were marked with
> the fn_result_spec flag that really were module parameters were having
> the wrong name mangling applied to them. The rest of the patch is a
> tidy up.
> 
> Regtested on FC30/x86_64 - OK for all the branches after a bedding in
> period on trunk?
> 

OK.  Note you cannot wait too long for 7 or 9.  The 7 branch
is being closed, and the 9-branch is close to its next release.

-- 
Steve


[Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure

2019-10-26 Thread Paul Richard Thomas
As far as I can tell, this is a 6-regression as well - ***sigh***

The patch is fundamentally very simple. Symbols that were marked with
the fn_result_spec flag that really were module parameters were having
the wrong name mangling applied to them. The rest of the patch is a
tidy up.

Regtested on FC30/x86_64 - OK for all the branches after a bedding in
period on trunk?

Cheers

Paul

2019-10-26  Paul Thomas  

PR fortran/86248
* resolve.c (flag_fn_result_spec): Correct a typo before the
function declaration.
* trans-decl.c (gfc_sym_identifier): Boost the length of 'name'
to allow for all variants. Simplify the code by using a pointer
to the symbol's proc_name and taking the return out of each of
the conditional branches. Allow symbols with fn_result_spec set
that do not come from a procedure namespace and have a module
name to go through the non-fn_result_spec branch.

2019-10-26  Paul Thomas  

PR fortran/86248
* gfortran.dg/char_result_19.f90 : New test.
* gfortran.dg/char_result_mod_19.f90 : Module for the new test.
Index: gcc/fortran/resolve.c
===
*** gcc/fortran/resolve.c	(revision 277203)
--- gcc/fortran/resolve.c	(working copy)
*** resolve_equivalence (gfc_equiv *eq)
*** 16774,16781 
  }
  
  
! /* Function called by resolve_fntype to flag other symbol used in the
!length type parameter specification of function resuls.  */
  
  static bool
  flag_fn_result_spec (gfc_expr *expr,
--- 16774,16781 
  }
  
  
! /* Function called by resolve_fntype to flag other symbols used in the
!length type parameter specification of function results.  */
  
  static bool
  flag_fn_result_spec (gfc_expr *expr,
Index: gcc/fortran/trans-decl.c
===
*** gcc/fortran/trans-decl.c	(revision 277203)
--- gcc/fortran/trans-decl.c	(working copy)
*** gfc_sym_identifier (gfc_symbol * sym)
*** 369,412 
  static const char *
  mangled_identifier (gfc_symbol *sym)
  {
!   static char name[GFC_MAX_MANGLED_SYMBOL_LEN + 1];
/* Prevent the mangling of identifiers that have an assigned
   binding label (mainly those that are bind(c)).  */
  
if (sym->attr.is_bind_c == 1 && sym->binding_label)
  return sym->binding_label;
  
!   if (!sym->fn_result_spec)
  {
if (sym->module == NULL)
  	return sym_identifier (sym);
else
! 	{
! 	  snprintf (name, sizeof name, "__%s_MOD_%s", sym->module, sym->name);
! 	  return name;
! 	}
  }
else
  {
/* This is an entity that is actually local to a module procedure
  	 that appears in the result specification expression.  Since
  	 sym->module will be a zero length string, we use ns->proc_name
! 	 instead. */
!   if (sym->ns->proc_name && sym->ns->proc_name->module)
! 	{
! 	  snprintf (name, sizeof name, "__%s_MOD__%s_PROC_%s",
! 		sym->ns->proc_name->module,
! 		sym->ns->proc_name->name,
! 		sym->name);
! 	  return name;
! 	}
else
! 	{
! 	  snprintf (name, sizeof name, "__%s_PROC_%s",
! 		sym->ns->proc_name->name, sym->name);
! 	  return name;
! 	}
  }
  }
  
  /* Get mangled identifier, adding the symbol to the global table if
--- 369,405 
  static const char *
  mangled_identifier (gfc_symbol *sym)
  {
!   gfc_symbol *proc = sym->ns->proc_name;
!   static char name[3*GFC_MAX_MANGLED_SYMBOL_LEN + 14];
/* Prevent the mangling of identifiers that have an assigned
   binding label (mainly those that are bind(c)).  */
  
if (sym->attr.is_bind_c == 1 && sym->binding_label)
  return sym->binding_label;
  
!   if (!sym->fn_result_spec
!   || (sym->module && !(proc && proc->attr.flavor == FL_PROCEDURE)))
  {
if (sym->module == NULL)
  	return sym_identifier (sym);
else
! 	snprintf (name, sizeof name, "__%s_MOD_%s", sym->module, sym->name);
  }
else
  {
/* This is an entity that is actually local to a module procedure
  	 that appears in the result specification expression.  Since
  	 sym->module will be a zero length string, we use ns->proc_name
! 	 to provide the module name instead. */
!   if (proc && proc->module)
! 	snprintf (name, sizeof name, "__%s_MOD__%s_PROC_%s",
! 		  proc->module, proc->name, sym->name);
else
! 	snprintf (name, sizeof name, "__%s_PROC_%s",
! 		  proc->name, sym->name);
  }
+ 
+   return name;
  }
  
  /* Get mangled identifier, adding the symbol to the global table if
Index: gcc/testsuite/gfortran.dg/char_result_19.f90
===
*** gcc/testsuite/gfortran.dg/char_result_19.f90	(nonexistent)
--- gcc/testsuite/gfortran.dg/char_result_19.f90	(working copy)
***
*** 0 
--- 1,24 
+ ! { dg-do preprocess }
+ ! { dg-additional-options "-cpp" }
+ !
+ ! Test the fix for PR86248
+ !
+ ! Contributed by Bill Long  
+ !
+ program test
+   

Re: [PATCH][MSP430] Use 430 insns in the large memory model for more patterns

2019-10-26 Thread Jeff Law
On 10/25/19 6:24 AM, Jozef Lawrynowicz wrote:
> Where possible, it is always desirable to use 430 format instructions when
> compiling for the 430X ISA and the large memory model. 430 instructions have
> reduced code size and faster execution time.
> 
> This patch recognizes a couple of new patterns in which we can use 430 insns 
> in
> the large memory model.
> 
> Successfully regtested on trunk.
> 
> Ok to apply?
> 
> 
> 0001-MSP430-Use-430-insns-in-the-large-memory-model-for-m.patch
> 
> From ba3a8eafeba08dc034e219f892f2784c16f94c40 Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Thu, 24 Oct 2019 15:17:29 +0100
> Subject: [PATCH] MSP430: Use 430 insns in the large memory model for more
>  patterns
> 
> gcc/ChangeLog:
> 
> 2019-10-25  Jozef Lawrynowicz  
> 
>   * config/msp430/msp430.c (msp430_check_index_not_high_mem): New.
>   (msp430_check_plus_not_high_mem): New.
>   (msp430_op_not_in_high_mem): Use new functions to check if the operand
>   might be in low memory.
>   Indicate that a 16-bit absolute address is in lower memory.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-10-25  Jozef Lawrynowicz  
> 
>   * gcc.target/msp430/mlarge-use-430-insn.c: New test.
> 
OK
jeff



Re: [PATCH 4/N] Fix unsigned type overflow in memory report.

2019-10-26 Thread Jeff Law
On 10/25/19 7:21 AM, Martin Liška wrote:
> Hi.
> 
> And there's one more integer overflow fix.
> 
> Martin
> 
> 
> 0001-Fix-unsigned-type-overflow-in-memory-report.patch
> 
> From 35c22704dc705508672f19b09e6d1b94bd956535 Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Fri, 25 Oct 2019 15:09:32 +0200
> Subject: [PATCH] Fix unsigned type overflow in memory report.
> 
> gcc/ChangeLog:
> 
> 2019-10-25  Martin Liska  
> 
>   * ggc-common.c: One can't subtract unsigned types
>   in compare function.
OK
jeff



Re: [PATCH 3/3] Print header in dump_memory_report.

2019-10-26 Thread Jeff Law
On 10/25/19 3:32 AM, Martin Liska wrote:
> 
> gcc/ChangeLog:
> 
> 2019-10-25  Martin Liska  
> 
>   * cgraphunit.c (symbol_table::compile): Pass
>   title as dump_memory_report argument.
>   * toplev.c (dump_memory_report):  New argument.
>   (finalize): Pass new argument.
>   * toplev.h (dump_memory_report): Add argument.
> 
> gcc/lto/ChangeLog:
> 
> 2019-10-25  Martin Liska  
> 
>   * lto.c (do_whole_program_analysis): Pass
>   title as dump_memory_report argument.
> ---
>  gcc/cgraphunit.c | 10 ++
>  gcc/lto/lto.c| 12 +++-
>  gcc/toplev.c | 13 +++--
>  gcc/toplev.h |  3 ++-
>  4 files changed, 18 insertions(+), 20 deletions(-)
> 
OK
jeff



Re: [PATCH 2/3] Move Leak in GCC memory report to the first column.

2019-10-26 Thread Jeff Law
On 10/25/19 3:10 AM, Martin Liska wrote:
> 
> gcc/ChangeLog:
> 
> 2019-10-25  Martin Liska  
> 
>   * ggc-common.c: Move Leak to the first column.
OK
jeff



Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass

2019-10-26 Thread Jeff Law
On 10/25/19 11:39 AM, Craig Blackmore wrote:
> This patch aims to allow more load/store instructions to be compressed by
> replacing a load/store of 'base register + large offset' with a new load/store
> of 'new base + small offset'. If the new base gets stored in a compressed
> register, then the new load/store can be compressed. Since there is an 
> overhead
> in creating the new base, this change is only attempted when 'base register' 
> is
> referenced in at least 4 load/stores in a basic block.
> 
> The optimization is implemented in a new RISC-V specific pass called
> shorten_memrefs which is enabled for RVC targets. It has been developed for 
> the
> 32-bit lw/sw instructions but could also be extended to 64-bit ld/sd in 
> future.
> 
> Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, rv64imac,
> rv64imafdc via QEMU. No regressions.
> 
> gcc/ChangeLog:
> 
>   * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv.
>   * config/riscv/riscv-passes.def: New file.
>   * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare.
>   * config/riscv/riscv-shorten-memrefs.c: New file.
>   * config/riscv/riscv.c (tree-pass.h): New include.
>   (riscv_compressed_reg_p): New Function
>   (riscv_compressed_lw_offset_p): Likewise.
>   (riscv_compressed_lw_address_p): Likewise.
>   (riscv_shorten_lw_offset): Likewise.
>   (riscv_legitimize_address): Attempt to convert base + large_offset
> to compressible new_base + small_offset.
>   (riscv_address_cost): Make anticipated compressed load/stores
> cheaper for code size than uncompressed load/stores.
>   (riscv_register_priority): Move compressed register check to
> riscv_compressed_reg_p.
>   * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): Define.
>   * config/riscv/riscv.opt (mshorten-memefs): New option.
>   * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
>   (PASSES_EXTRA): Add riscv-passes.def.
>   * doc/invoke.texi: Document -mshorten-memrefs.
This has traditionally been done via the the legitimize_address hook.
Is there some reason that hook is insufficient for this case?

The hook, IIRC, is called out explow.c.

Jeff



Re: [PATCH] PR85678: Change default to -fno-common

2019-10-26 Thread Jeff Law
On 10/25/19 9:47 AM, Wilco Dijkstra wrote:
> GCC currently defaults to -fcommon.  As discussed in the PR, this is an 
> ancient
> C feature which is not conforming with the latest C standards.  On many 
> targets
> this means global variable accesses have a codesize and performance penalty.
> This applies to C code only, C++ code is not affected by -fcommon.  It is 
> about
> time to change the default.
> 
> OK for commit?
> 
> ChangeLog
> 2019-10-25  Wilco Dijkstra  
> 
>   PR85678
>   * common.opt (fcommon): Change init to 1.
> 
> doc/
>   * invoke.texi (-fcommon): Update documentation.
Has this been bootstrapped and regression tested?

jeff



Re: [PATCH] Remove redudant iptr when operand already has a scalar mode.

2019-10-26 Thread Uros Bizjak
On Sat, Oct 26, 2019 at 3:27 PM Hongtao Liu  wrote:
>
> > BTW: Please also note that there is no need to use  or operand
> > mode override in scalar insn templates for intel asm dialect when
> > operand already has a scalar mode.
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html
>
> This patch is to remove redundant  when operand already has a scalar 
> mode.
>
> bootstrap and regression test for i386/x86-64 is ok.
>
> Changelog
> gcc/
> * config/i386/sse.md (*_vm3,
> _vm3): Remove  since
> operand is already scalar mode.
> (iptr): Remove SF/DF.

OK.

Thanks,
Uros.


[PATCH] rs6000: Fix allocate_stack in a corner case (PR91289)

2019-10-26 Thread Segher Boessenkool
When we have -fstack-limit-symbol with sysv we can end up with a non-
existing instruction (you cannot add an immediate to register 0).  Fix
this by using register 11 instead.  It might be used for something else
already though, so save and restore its value around this.  In
optimizing compiles these extra moves are usually removed again: the
restore by cprop_hardreg, and then the save by rtl_dce.

Committing to trunk.


Segher


2019-10-26  Segher Boessenkool  

PR target/91289
* config/rs6000/rs6000-logue.c (rs6000_emit_allocate_stack): Don't add
an immediate to r0; use r11 instead.  Save and restore r11 to r0 around
this.

---
 gcc/config/rs6000/rs6000-logue.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index e98893a..04aae80 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -1700,10 +1700,14 @@ rs6000_emit_allocate_stack (HOST_WIDE_INT size, rtx 
copy_reg, int copy_off)
stack_limit_rtx,
GEN_INT (size)));
 
- emit_insn (gen_elf_high (tmp_reg, toload));
- emit_insn (gen_elf_low (tmp_reg, tmp_reg, toload));
- emit_insn (gen_cond_trap (LTU, stack_reg, tmp_reg,
-   const0_rtx));
+ /* We cannot use r0 with elf_low.  Lamely solve this problem by
+moving registers around.  */
+ rtx r11_reg = gen_rtx_REG (Pmode, 11);
+ emit_move_insn (tmp_reg, r11_reg);
+ emit_insn (gen_elf_high (r11_reg, toload));
+ emit_insn (gen_elf_low (r11_reg, r11_reg, toload));
+ emit_insn (gen_cond_trap (LTU, stack_reg, r11_reg, const0_rtx));
+ emit_move_insn (r11_reg, tmp_reg);
}
   else
warning (0, "stack limit expression is not supported");
-- 
1.8.3.1



Re: [PATCH] Fix PR92222

2019-10-26 Thread Richard Sandiford
Richard Biener  writes:
> We have to check each operand for being in a pattern, not just the
> first when avoiding build from scalars (we could possibly handle
> the special case of some of them being the pattern stmt root, but
> that would be a followup improvement).
>
> Bootstrap & regtest running on x86-64-unknown-linux-gnu.
>
> Richard.
>
> 2019-10-25  Richard Biener  
>
>   PR tree-optimization/9
>   * tree-vect-slp.c (_slp_oprnd_info::first_pattern): Remove.
>   (_slp_oprnd_info::second_pattern): Likewise.
>   (_slp_oprnd_info::any_pattern): New.
>   (vect_create_oprnd_info): Adjust.
>   (vect_get_and_check_slp_defs): Compute whether any stmt is
>   in a pattern.
>   (vect_build_slp_tree_2): Avoid building up a node from scalars
>   if any of the operand defs, not just the first, is in a pattern.

Is recording this in vect_get_and_check_slp_defs enough?  AIUI we don't
get that far if vect_build_slp_tree_1 fails, but that doesn't prevent us
from building the vector from scalars on return from vect_build_slp_tree.

Was wondering whether we should use a helper function that explicitly
checks whether any stmt_info in the vec is a pattern statement, rather
than computing it on the fly.

Thanks,
Richard

>
>   * gcc.dg/torture/pr9.c: New testcase.
>
> Index: gcc/tree-vect-slp.c
> ===
> --- gcc/tree-vect-slp.c   (revision 277441)
> +++ gcc/tree-vect-slp.c   (working copy)
> @@ -177,8 +177,7 @@ typedef struct _slp_oprnd_info
>   stmt.  */
>tree first_op_type;
>enum vect_def_type first_dt;
> -  bool first_pattern;
> -  bool second_pattern;
> +  bool any_pattern;
>  } *slp_oprnd_info;
>  
>  
> @@ -199,8 +198,7 @@ vect_create_oprnd_info (int nops, int gr
>oprnd_info->ops.create (group_size);
>oprnd_info->first_dt = vect_uninitialized_def;
>oprnd_info->first_op_type = NULL_TREE;
> -  oprnd_info->first_pattern = false;
> -  oprnd_info->second_pattern = false;
> +  oprnd_info->any_pattern = false;
>oprnds_info.quick_push (oprnd_info);
>  }
>  
> @@ -339,13 +337,11 @@ vect_get_and_check_slp_defs (vec_info *v
>tree oprnd;
>unsigned int i, number_of_oprnds;
>enum vect_def_type dt = vect_uninitialized_def;
> -  bool pattern = false;
>slp_oprnd_info oprnd_info;
>int first_op_idx = 1;
>unsigned int commutative_op = -1U;
>bool first_op_cond = false;
>bool first = stmt_num == 0;
> -  bool second = stmt_num == 1;
>  
>if (gcall *stmt = dyn_cast  (stmt_info->stmt))
>  {
> @@ -418,13 +414,12 @@ again:
> return -1;
>   }
>  
> -  if (second)
> - oprnd_info->second_pattern = pattern;
> +  if (def_stmt_info && is_pattern_stmt_p (def_stmt_info))
> + oprnd_info->any_pattern = true;
>  
>if (first)
>   {
> oprnd_info->first_dt = dt;
> -   oprnd_info->first_pattern = pattern;
> oprnd_info->first_op_type = TREE_TYPE (oprnd);
>   }
>else
> @@ -1311,7 +1306,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,
> /* ???  Rejecting patterns this way doesn't work.  We'd have to
>do extra work to cancel the pattern so the uses see the
>scalar version.  */
> -   && !is_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (child)[0]))
> +   && !oprnd_info->any_pattern)
>   {
> slp_tree grandchild;
>  
> @@ -1358,18 +1353,16 @@ vect_build_slp_tree_2 (vec_info *vinfo,
> /* ???  Rejecting patterns this way doesn't work.  We'd have to
>do extra work to cancel the pattern so the uses see the
>scalar version.  */
> -   && !is_pattern_stmt_p (stmt_info))
> +   && !is_pattern_stmt_p (stmt_info)
> +   && !oprnd_info->any_pattern)
>   {
> if (dump_enabled_p ())
>   dump_printf_loc (MSG_NOTE, vect_location,
>"Building vector operands from scalars\n");
> this_tree_size++;
> -   child = vect_create_new_slp_node (oprnd_info->def_stmts);
> -   SLP_TREE_DEF_TYPE (child) = vect_external_def;
> -   SLP_TREE_SCALAR_OPS (child) = oprnd_info->ops;
> +   child = vect_create_new_slp_node (oprnd_info->ops);
> children.safe_push (child);
> oprnd_info->ops = vNULL;
> -   oprnd_info->def_stmts = vNULL;
> continue;
>   }
>  
> @@ -1469,7 +1440,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,
> /* ???  Rejecting patterns this way doesn't work.  We'd have
>to do extra work to cancel the pattern so the uses see the
>scalar version.  */
> -   && !is_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (child)[0]))
> +   && !oprnd_info->any_pattern)
>   {
> unsigned int j;
> slp_tree grandchild;
> Index: gcc/testsuite/gcc.dg/torture/pr9.c
> 

[PATCH] Remove redudant iptr when operand already has a scalar mode.

2019-10-26 Thread Hongtao Liu
> BTW: Please also note that there is no need to use  or operand
> mode override in scalar insn templates for intel asm dialect when
> operand already has a scalar mode.
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html

This patch is to remove redundant  when operand already has a scalar mode.

bootstrap and regression test for i386/x86-64 is ok.

Changelog
gcc/
* config/i386/sse.md (*_vm3,
_vm3): Remove  since
operand is already scalar mode.
(iptr): Remove SF/DF.
-- 
BR,
Hongtao


0001-Remove-redudant-iptr-when-operand-is-already-scalar-.patch
Description: Binary data


Re: [PATCH] PR85678: Change default to -fno-common

2019-10-26 Thread Segher Boessenkool
On Sat, Oct 26, 2019 at 12:21:15PM +0100, Iain Sandoe wrote:
> Georg-Johann Lay  wrote:
> > Wilco Dijkstra schrieb:
> >> GCC currently defaults to -fcommon.  As discussed in the PR, this is an 
> >> ancient
> >> C feature which is not conforming with the latest C standards.  On many 
> >> targets
> >> this means global variable accesses have a codesize and performance 
> >> penalty.
> >> This applies to C code only, C++ code is not affected by -fcommon.  It is 
> >> about
> >> time to change the default.
> >> OK for commit?
> > 
> > IIRC using -fno-common might lead to some testsuit fallout because
> > some optimizations / test cases are sensitive to -f[no-]common.
> > So I wonder that no adjustments to test cases are needed?
> 
> Two points here:
> 
> (a) I actually agree with the idea to change the default

I think everyone does, this has been long overdue.

> (b) there will surely be testsuite fallout on Darwin, where common accesses 
> are
>  mandated to be indirect in the ABI, where non-weak .data accesses are not
> Thus code generated for Darwin will change for any test using variables 
> that
> are “common” and where the test does not already force -fno-common
> 
> I’d expect fallout to be quite large - since there are plenty of 
> testcases with uninit
>gobals.  We might want to make a test-run and see the size of the problem, 
> but
>preferably once the stage#1 submisison and gcc-7 release cycles are done.

Yeah -- and if fallout is more than just testsuite, should it be postponed
to GCC 11?  And announced now, etc.

But to find out we probably should flip the switch soon, see what happens,
only flip it back if necessary.


Segher


Re: [PATCH] PR85678: Change default to -fno-common

2019-10-26 Thread Iain Sandoe
Georg-Johann Lay  wrote:

> Wilco Dijkstra schrieb:
>> GCC currently defaults to -fcommon.  As discussed in the PR, this is an 
>> ancient
>> C feature which is not conforming with the latest C standards.  On many 
>> targets
>> this means global variable accesses have a codesize and performance penalty.
>> This applies to C code only, C++ code is not affected by -fcommon.  It is 
>> about
>> time to change the default.
>> OK for commit?
> 
> IIRC using -fno-common might lead to some testsuit fallout because
> some optimizations / test cases are sensitive to -f[no-]common.
> So I wonder that no adjustments to test cases are needed?

Two points here:

(a) I actually agree with the idea to change the default

(b) there will surely be testsuite fallout on Darwin, where common accesses are
 mandated to be indirect in the ABI, where non-weak .data accesses are not
Thus code generated for Darwin will change for any test using variables that
are “common” and where the test does not already force -fno-common

I’d expect fallout to be quite large - since there are plenty of testcases 
with uninit
   gobals.  We might want to make a test-run and see the size of the problem, 
but
   preferably once the stage#1 submisison and gcc-7 release cycles are done.

thanks
Iain

>> ChangeLog
>> 2019-10-25  Wilco Dijkstra  
>>  PR85678
>>  * common.opt (fcommon): Change init to 1.
>> doc/
>>  * invoke.texi (-fcommon): Update documentation.
>> ---
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 
>> 0195b0cb85a06dd043fd0412b42dfffddfa2495b..b0840f41a5e480f4428bd62724b0dc3d54c68c0b
>>  100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -1131,7 +1131,7 @@ Common Report Var(flag_combine_stack_adjustments) 
>> Optimization
>> Looks for opportunities to reduce stack adjustments and stack references.
>>  fcommon
>> -Common Report Var(flag_no_common,0)
>> +Common Report Var(flag_no_common,0) Init(1)
>> Put uninitialized globals in the common section.
>>  fcompare-debug
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 
>> 857d9692729e503657d0d0f44f1f6252ec90d49a..5b4ff66015f5f94a5bd89e4dc3d2d53553cc091e
>>  100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -568,7 +568,7 @@ Objective-C and Objective-C++ Dialects}.
>> -fnon-call-exceptions  -fdelete-dead-exceptions  -funwind-tables @gol
>> -fasynchronous-unwind-tables @gol
>> -fno-gnu-unique @gol
>> --finhibit-size-directive  -fno-common  -fno-ident @gol
>> +-finhibit-size-directive  -fcommon  -fno-ident @gol
>> -fpcc-struct-return  -fpic  -fPIC  -fpie  -fPIE  -fno-plt @gol
>> -fno-jump-tables @gol
>> -frecord-gcc-switches @gol
>> @@ -14050,35 +14050,27 @@ useful for building programs to run under WINE@.
>> code that is not binary compatible with code generated without that switch.
>> Use it to conform to a non-default application binary interface.
>> -@item -fno-common
>> -@opindex fno-common
>> +@item -fcommon
>> @opindex fcommon
>> +@opindex fno-common
>> @cindex tentative definitions
>> -In C code, this option controls the placement of global variables -defined 
>> without an initializer, known as @dfn{tentative definitions} -in the C 
>> standard.  Tentative definitions are distinct from declarations +In C code, 
>> this option controls the placement of global variables
>> +defined without an initializer, known as @dfn{tentative definitions}
>> +in the C standard.  Tentative definitions are distinct from declarations
>> of a variable with the @code{extern} keyword, which do not allocate storage.
>> -Unix C compilers have traditionally allocated storage for
>> -uninitialized global variables in a common block.  This allows the
>> -linker to resolve all tentative definitions of the same variable
>> +The default is @option{-fno-common}, which specifies that the compiler 
>> places
>> +uninitialized global variables in the BSS section of the object file.
> 
> IMO "uninitialized" is confusing because the variables actually
> *are* initialized: with zero.  It's just that the variables don't have
> explicit initializers.  Dito for "uninitialized" in the --help message.
> 
> Johann
> 
> 
>> +This inhibits the merging of tentative definitions by the linker so you get 
>> a
>> +multiple-definition error if the same variable is accidentally defined in 
>> more
>> +than one compilation unit.
>> +
>> +The @option{-fcommon} places uninitialized global variables in a common 
>> block.
>> +This allows the linker to resolve all tentative definitions of the same 
>> variable
>> in different compilation units to the same object, or to a non-tentative
>> -definition.  -This is the behavior specified by @option{-fcommon}, and is 
>> the default for -GCC on most targets.  -On the other hand, this behavior is 
>> not required by ISO
>> -C, and on some targets may carry a speed or code size penalty on
>> -variable references.
>> -
>> -The @option{-fno-common} option specifies that the compiler should instead
>> -place 

Re: introduce -fcallgraph-info option

2019-10-26 Thread Alexandre Oliva
Hi, Richi,

On Oct 26, 2019, Richard Biener  wrote:

> How does it relate to the LTO-dump utility we have now which can in
> theory provide a more complete view?

Erhm...  Not at all, AFAICT.  The only vague resemblance I see is in the
presence of the word "callgraph" in the description of what both can do,
but even that's not quite the same callgraph, besides the different
format.

E.g., the reason we gather expanded calls rather than just use
cgraph_edges is that the latter would dump several "calls" that are
builtins expanded internally by the compiler, and would NOT dump other
ops that are expanded as (lib)calls.  In order to get an accurate
assessment of stack size requirements, the presence of the builtins
could be confusing but not so much trouble, but the absence of libcalls
would underestimate the potential max total stack use by a symbol.

> Maybe some infrastructure can be
> shared here (the actual dumping of the cgraph?)

You mean the one-liner loop in cgraph_node::dump_graphviz, called by
lto-dump to dump the cgraph?  That doesn't really seem worth sharing;
more so considering we dump edges in a quite different format, and not
just the edges.  In this different format expected by gnatstack, we also
dump the nodes, and include information in the labels of the nodes, such
as their original spelling and location, as well as stack use:

node: { title: "_ada_p" label: "P\np.adb:1:1\n48 bytes (static)" }

and dynamic stack allocations (alloca and vlas):

node: { title: "p.adb:p__u" label: "u\np.adb:21:3\n2 dynamic objects\n rt 
p.adb:23:5\n ri p.adb:24:5" }

and though edges to libcalls may carry just as little info as a graphviz
"from" -> "to" edge:

edge: { sourcename: "add" targetname: "__addvsi3" }

those between user symbols carry location info as well:

edge: { sourcename: "p__s" targetname: "p.adb:p__v" label: "p.adb:46:5" }

So I'm afraid I don't see anything that could be usefully factored out.
Do you?

Thanks,

-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free!FSF VP & FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara


[PING**2] [PATCH] Fix PR c++/92024

2019-10-26 Thread Bernd Edlinger
Ping...

On 10/18/19 3:19 PM, Bernd Edlinger wrote:
> Ping...
> 
> for the c++ FE and testsuite changes in the updated patch
> here: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00916.html
> 
> 
> Thanks
> Bernd.
> 
> 
> 
> 
> On 10/12/19 8:10 PM, Bernd Edlinger wrote:
>> On 10/11/19 6:31 PM, Jason Merrill wrote:
>>> On 10/10/19 2:06 PM, Bernd Edlinger wrote:
 On 10/10/19 7:49 PM, Jason Merrill wrote:

 if -Wshadow=compatible-local is used, the can_convert function crashes
 in instantiate_class_template_1.
>>>
>>> Right, checking can_convert is problematic here, as it can cause template 
>>> instantiations that change the semantics of the program.  Or, in this case, 
>>> crash.
>>>
>>
>> So I try to make C++ behave more consistently with the code in c-decl.c,
>> thus dependent on warn_shadow but not on warn_shadow_local and/or
>> warn_shadow_compatible_local:
>>
>>if (warn_shadow)
>>   warning_code = OPT_Wshadow;
>> else if (comptypes (TREE_TYPE (old_decl), TREE_TYPE (new_decl)))
>>   warning_code = OPT_Wshadow_compatible_local;
>> else
>>   warning_code = OPT_Wshadow_local;
>> warned = warning_at (DECL_SOURCE_LOCATION (new_decl), 
>> warning_code,
>>  "declaration of %qD shadows a parameter",
>>  new_decl);
>>
>> I cannot remove the if (warn_shadow) since this breaks gcc.dg/pr48062.c
>> which uses:
>>
>> #pragma GCC diagnostic ignored "-Wshadow"
>>
>> to disable a -Wshadow=compatible-local warning, but while -Wno-shadow on the
>> command line disables also dependent warnings the pragma does not (always) 
>> do that.
>>
>> So instead I'd like to adjust the doc of -Wshadow to reflect the 
>> implementation
>> and remove the if(warn_shadow_local) to have C and C++ behave identical and
>> hopefully now in sync with the doc.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>


Re: introduce -fcallgraph-info option

2019-10-26 Thread Richard Biener
On October 26, 2019 6:35:43 AM GMT+02:00, Alexandre Oliva  
wrote:
>This was first submitted many years ago
>https://gcc.gnu.org/ml/gcc-patches/2010-10/msg02468.html
>
>The command line option -fcallgraph-info is added and makes the
>compiler generate another output file (xxx.ci) for each compilation
>unit, which is a valid VCG file (you can launch your favorite VCG
>viewer on it unmodified) and contains the "final" callgraph of the
>unit.  "final" is a bit of a misnomer as this is actually the
>callgraph at RTL expansion time, but since most high-level
>optimizations are done at the Tree level and RTL doesn't usually
>fiddle with calls, it's final in almost all cases.  Moreover, the
>nodes can be decorated with additional info: -fcallgraph-info=su adds
>stack usage info and -fcallgraph-info=da dynamic allocation info.
>
>Compared with the earlier version, this patch does not modify cgraph,
>and instead adds the required information next to the stage usage
>function data structure, so we only hold one of those at at time.  I've
>switched to vecs from linked lists, for more compact edges and dynamic
>allocation annotations, and arranged for them to be released as soon as
>we've printed out the information.  I have NOT changed the file format,
>because existing tools such as gnatstack consume the current format.
>
>Regstrapped on x86_64-linux-gnu.  Ok to install?

How does it relate to the LTO-dump utility we have now which can in theory 
provide a more complete view? Maybe some infrastructure can be shared here (the 
actual dumping of the cgraph?) 

Thanks, 
Richard. 

>
>for  gcc/ChangeLog
>From  Eric Botcazou  , Alexandre Oliva 
>
>
>   * common.opt (-fcallgraph-info[=]): New option.
>   * doc/invoke.texi (Debugging options): Document it.
>   * opts.c (common_handle_option): Handle it.
>   * builtins.c (expand_builtin_alloca): Record allocation if
>   -fcallgraph-info=da.
>   * calls.c (expand_call): If -fcallgraph-info, record the call.
>   (emit_library_call_value_1): Likewise.
>   * flag-types.h (enum callgraph_info_type): New type.
>   * explow.c: Include stringpool.h.
>   (set_stack_check_libfunc): Set SET_SYMBOL_REF_DECL on the symbol.
>   * function.c (allocate_stack_usage_info): New.
>   (allocate_struct_function): Call it for -fcallgraph-info.
>   (prepare_function_start): Call it otherwise.
>   (rest_of_handle_thread_prologue_and_epilogue): Release callees
>   and dallocs after output_stack_usage.
>   (record_final_call, record_dynamic_alloc): New.
>   * function.h (struct callee, struct dalloc): New.
>   (struct stack_usage): Add callees and dallocs.
>   (record_final_call, record_dynamic_alloc): Declare.
>   * gimplify.c (gimplify_decl_expr): Record dynamically-allocated
>   object if -fcallgraph-info=da.
>   * optabs-libfuncs.c (build_libfunc_function): Keep SYMBOL_REF_DECL.
>   * print-tree.h (print_decl_identifier): Declare.
>   (PRINT_DECL_ORIGIN, PRINT_DECL_NAME, PRINT_DECL_UNIQUE_NAME): New.
>   * print-tree.c: Include print-tree.h.
>   (print_decl_identifier): New function.
>   * toplev.c: Include print-tree.h.
>   (callgraph_info_file): New global variable.
>   (callgraph_info_indirect_emitted): Likewise.
>   (output_stack_usage): Rename to...
>   (output_stack_usage_1): ... this.  Make it static, add cf
>   parameter.  If -fcallgraph-info=su, print stack usage to cf.
>   If -fstack-usage, use print_decl_identifier for
>   pretty-printing.
>   (INDIRECT_CALL_NAME): New.
>   (dump_final_indirect_call_node_vcg): New.
>   (dump_final_callee_vcg, dump_final_node_vcg): New.
>   (output_stack_usage): New.
>   (lang_dependent_init): Open and start file if
>   -fcallgraph-info.
>   (finalize): If callgraph_info_file is not null, finish it,
>   close it, and reset callgraph info state.
>
>for  gcc/ada/ChangeLog
>
>   * gcc-interface/misc.c (callgraph_info_file): Delete.
>---
> gcc/ada/gcc-interface/misc.c |3 -
> gcc/builtins.c   |4 +
> gcc/calls.c  |6 +
> gcc/common.opt   |8 ++
> gcc/doc/invoke.texi  |   17 
> gcc/explow.c |5 +
> gcc/flag-types.h |   16 
> gcc/function.c   |   63 ++--
> gcc/function.h   |   25 ++
> gcc/gimplify.c   |4 +
> gcc/optabs-libfuncs.c|4 -
> gcc/opts.c   |   26 ++
> gcc/output.h |2 
> gcc/print-tree.c |   76 +++
> gcc/print-tree.h |4 +
>gcc/toplev.c |  169
>++
> 16 files changed, 381 insertions(+), 51 deletions(-)
>
>diff --git a/gcc/ada/gcc-interface/misc.c
>b/gcc/ada/gcc-interface/misc.c
>index 4abd4d5708a54..d68b37384ff7f 100644
>--- a/gcc/ada/gcc-interface/misc.c
>+++