Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
On Sat, 2019-10-26 at 14:04 -0600, Jeff Law wrote: > On 10/26/19 1:33 PM, Andrew Waterman wrote: > > I don't know enough to say whether the legitimize_address hook is > > sufficient for the intended purpose, but I am sure that RISC-V's > > concerns are not unique: other GCC targets have to cope with > > offset-size constraints that are coupled to register-allocation > > constraints. > > Yup. I think every risc port in the 90s faces this problem. I > always > wished for a generic mechanism for ports to handle this problem. > > Regardless, it's probably worth investigating. > What we tried to do with the address mode selection (AMS) optimization some time ago was the following: - Extract memory accesses from the insns stream and put them in "access sequences". Also analyze the address expression and try to find effective base addresses by tracing back address calculations. - For each memory access, get a set of address mode alternatives and the corresponding costs from the backend. The full context of each access is provided, so the backend can detect things like "in this access sequence, FP loads dominate" and use this information to tune the alternative costs. - Given the alternatives and costs for each memory access, the pass would then try to minimize the costs of the whole memory access sequence, taking costs of address modification isnns into account. I think this is quite generic, but of course also complex. The optimization problem itself is hard. There was some research done by others using CPLEX or PBQP solvers. To keep things simple we used a backtracking algorithm and handled only a limited set of scenarios. For example, I think we could not get loop constructs work nicely to improve post-inc address mode utilization. The SH ISA has very similar properties to ARM thumb and RVC, and perhaps others. Advantages would not be limited to RISC only, even CISC ISAs like M68K, RX, ... can benefit from it, as the "proper optimization" can reduce the instruction sizes by shortening the addresses in the instruction stream. If anyone is interested, here is the AMS optimization pass class: https://github.com/erikvarga/gcc/blob/master/gcc/ams.h https://github.com/erikvarga/gcc/blob/master/gcc/ams.cc It's using a different style to callback into the backend code. Not GCC's "hooks" but a delegate pattern. SH backend's delegate implementation is here https://github.com/erikvarga/gcc/blob/master/gcc/config/sh/sh.c#L11897 We were getting some improvements in the generated code, but it started putting pressure on register allocation related issues in the SH backend (the R0 problem), so we could not do more proper testing. Maybe somebody can get some ideas out of it. Cheers, Oleg
Re: [PATCH] PR85678: Change default to -fno-common
On 25/10/2019 16:47, Wilco Dijkstra wrote: GCC currently defaults to -fcommon. As discussed in the PR, this is an ancient C feature which is not conforming with the latest C standards. The PR references C99/C11 6.9p5, but that is not a constraint. Any violation merely renders the behaviour of a program undefined, so no diagnostic is required. To the best of my knowledge, both -fcommon and -fno-common are compatible with the C standard. This is no reason not to change the default, but... [...] +definition. This behavior does not conform to ISO C, is inconsistent with C++, +and on many targets implies a speed and code size penalty on global variable +references. It is mainly useful to enable legacy code to link without errors. ...can this inaccurate characterization be left out of the documentation? Cheers, Harald van Dijk
Fix summaries after ipa-icf alias/wrapper creation
Hi, when ipa-icf produces wrapper/alias it needs to also update summaries to represent the new function body. W/o this propagation across new wrappers does not work. Bootstrapped/regtested x86_64-linux, will commit it shortly. * ipa-icf.c (sem_function::merge): Update function summaries. * ipa-prop.h (ipa_get_param): Do not sanity check for WPA. Index: ipa-icf.c === --- ipa-icf.c (revision 277474) +++ ipa-icf.c (working copy) @@ -1266,6 +1266,7 @@ sem_function::merge (sem_item *alias_ite /* Remove the function's body. */ ipa_merge_profiles (original, alias); + symtab->call_cgraph_removal_hooks (alias); alias->release_body (true); alias->reset (); /* Notice global symbol possibly produced RTL. */ @@ -1287,11 +1288,13 @@ sem_function::merge (sem_item *alias_ite { gcc_assert (!create_alias); alias->icf_merged = true; + symtab->call_cgraph_removal_hooks (alias); local_original->icf_merged = true; /* FIXME update local_original counts. */ ipa_merge_profiles (original, alias, true); alias->create_wrapper (local_original); + symtab->call_cgraph_insertion_hooks (alias); if (dump_enabled_p ()) dump_printf (MSG_OPTIMIZED_LOCATIONS, Index: ipa-prop.h === --- ipa-prop.h (revision 277474) +++ ipa-prop.h (working copy) @@ -457,7 +457,6 @@ static inline tree ipa_get_param (class ipa_node_params *info, int i) { gcc_checking_assert (info->descriptors); - gcc_checking_assert (!flag_wpa); tree t = (*info->descriptors)[i].decl_or_type; gcc_checking_assert (TREE_CODE (t) == PARM_DECL); return t;
[libstdc++,doc] Switch pubs.opengroup.org to https
Turns out we have three instances of the same link in our libstdc++ documentation -- do we really need all of those? In any case, I went ahead and applied this straightforward update. Gerald 2019-10-26 Gerald Pfeifer * doc/xml/manual/codecvt.xml: Switch pubs.opengroup.org to https. * doc/xml/manual/locale.xml (LC_ALL): Ditto. * doc/xml/manual/messages.xml: Ditto. Index: libstdc++-v3/doc/xml/manual/codecvt.xml === --- libstdc++-v3/doc/xml/manual/codecvt.xml (revision 277475) +++ libstdc++-v3/doc/xml/manual/codecvt.xml (working copy) @@ -595,7 +595,7 @@ codecvt usage. http://www.w3.org/1999/xlink; - xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;> + xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;> System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008) Index: libstdc++-v3/doc/xml/manual/locale.xml === --- libstdc++-v3/doc/xml/manual/locale.xml (revision 277475) +++ libstdc++-v3/doc/xml/manual/locale.xml (working copy) @@ -560,7 +560,7 @@ global locale" (emphasis Paolo), that is: http://www.w3.org/1999/xlink; - xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;> + xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;> System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008) Index: libstdc++-v3/doc/xml/manual/messages.xml === --- libstdc++-v3/doc/xml/manual/messages.xml(revision 277475) +++ libstdc++-v3/doc/xml/manual/messages.xml(working copy) @@ -488,7 +488,7 @@ void test01() http://www.w3.org/1999/xlink; - xlink:href="http://pubs.opengroup.org/onlinepubs/9699919799/;> + xlink:href="https://pubs.opengroup.org/onlinepubs/9699919799/;> System Interface Definitions, Issue 7 (IEEE Std. 1003.1-2008)
Re: [PATCH] C testsuite, silence a FreeBSD libc warning
On 04.10.19 19:04, Jeff Law wrote: On 9/30/19 12:47 PM, Andreas Tobler wrote: On 30.09.19 20:37, Kamil Rytarowski wrote: On 30.09.2019 19:47, Jakub Jelinek wrote: On Mon, Sep 30, 2019 at 07:41:00PM +0200, Andreas Tobler wrote: --- fprintf-2.c (revision 276292) +++ fprintf-2.c (working copy) @@ -1,7 +1,8 @@ /* Verify that calls to fprintf don't get eliminated even if their result on success can be computed at compile time (they can fail). The calls can still be transformed into those of other functions. - { dg-skip-if "requires io" { freestanding } } */ + { dg-skip-if "requires io" { freestanding } } + { dg-prune-output "(^|\n)(\[^\n\])*warning: warning: \[^\n\]* possibly used unsafely; consider using \[^\n\]*\n" } */ I'm worried about that (^|\n) at the start + \n at the end, doesn't it prune too much then? Looking at other tests, they dg-prune-output just a few words from a message, or .*few words.* So, can you try just { dg-prune-output "warning: warning: \[^\n\r\]* possibly used unsafely; consider using" } */ or if that doesn't work, with .* at start end end? Jakub Please handle the NetBSD specific string too: "warning: tmpnam() possibly used unsafely, use mkstemp() or mkdtemp()". https://nxr.netbsd.org/xref/src/lib/libc/stdio/tmpnam.c#52 Ok, I think the attached version should also match these cases. Although untested on NetBSD. Kamil, if you have cycles, would you mind giving it a run? Thanks! Andreas OK assuming Kamil's testing shows that it works. Kamil, do you have a feedback? If not I'm going to commit by tomorrow. Andreas
[committed] pa: Update libstdc++-v3 baseline symbols for hppa-linux
Committed to trunk. Dave -- 2019-10-26 John David Anglin * config/abi/post/hppa-linux-gnu/baseline_symbols.txt: Update. Index: config/abi/post/hppa-linux-gnu/baseline_symbols.txt === --- config/abi/post/hppa-linux-gnu/baseline_symbols.txt (revision 277363) +++ config/abi/post/hppa-linux-gnu/baseline_symbols.txt (working copy) @@ -112,6 +112,7 @@ FUNC:_ZN11__gnu_debug19_Safe_sequence_base18_M_detach_singularEv@@GLIBCXX_3.4 FUNC:_ZN11__gnu_debug19_Safe_sequence_base22_M_revalidate_singularEv@@GLIBCXX_3.4 FUNC:_ZN11__gnu_debug19_Safe_sequence_base7_M_swapERS0_@@GLIBCXX_3.4 +FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base16_M_attach_singleEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.26 FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_attachEPNS_19_Safe_sequence_baseEb@@GLIBCXX_3.4.17 FUNC:_ZN11__gnu_debug25_Safe_local_iterator_base9_M_detachEv@@GLIBCXX_3.4.17 FUNC:_ZN11__gnu_debug30_Safe_unordered_container_base13_M_detach_allEv@@GLIBCXX_3.4.17 @@ -261,6 +262,7 @@ FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8capacityEv@@GLIBCXX_3.4 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE8max_sizeEv@@GLIBCXX_3.4 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE9_M_ibeginEv@@GLIBCXX_3.4 +FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEcvSt17basic_string_viewIwS0_EEv@@GLIBCXX_3.4.26 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEEixEj@@GLIBCXX_3.4 FUNC:_ZNKSi6gcountEv@@GLIBCXX_3.4 FUNC:_ZNKSi6sentrycvbEv@@GLIBCXX_3.4 @@ -328,9 +330,66 @@ FUNC:_ZNKSs8capacityEv@@GLIBCXX_3.4 FUNC:_ZNKSs8max_sizeEv@@GLIBCXX_3.4 FUNC:_ZNKSs9_M_ibeginEv@@GLIBCXX_3.4 +FUNC:_ZNKSscvSt17basic_string_viewIcSt11char_traitsIcEEEv@@GLIBCXX_3.4.26 FUNC:_ZNKSsixEj@@GLIBCXX_3.4 FUNC:_ZNKSt10bad_typeid4whatEv@@GLIBCXX_3.4.9 FUNC:_ZNKSt10error_code23default_error_conditionEv@@GLIBCXX_3.4.11 +FUNC:_ZNKSt10filesystem16filesystem_error4whatEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem16filesystem_error5path1Ev@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem16filesystem_error5path2Ev@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem18directory_iteratordeEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem28recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem28recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem28recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem28recursive_directory_iteratordeEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path11parent_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path12has_filenameEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path13has_root_nameEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path13has_root_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path13relative_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path14root_directoryEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path15has_parent_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path16lexically_normalEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path17_M_find_extensionEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path17has_relative_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path18has_root_directoryEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path18lexically_relativeERKS0_@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path19lexically_proximateERKS0_@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path5_List13_Impl_deleterclEPNS1_5_ImplE@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path5_List3endEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path5_List5beginEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path7compareERKS0_@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path7compareESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path9root_nameEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem4path9root_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error4whatEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path1Ev@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1116filesystem_error5path2Ev@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1118directory_iteratordeEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator17recursion_pendingEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator5depthEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iterator7optionsEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx1128recursive_directory_iteratordeEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path11parent_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path12has_filenameEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path13has_root_nameEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path13has_root_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path13relative_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path14root_directoryEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path15has_parent_pathEv@@GLIBCXX_3.4.26 +FUNC:_ZNKSt10filesystem7__cxx114path16lexically_normalEv@@GLIBCXX_3.4.26
Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
On 10/26/19 1:33 PM, Andrew Waterman wrote: > I don't know enough to say whether the legitimize_address hook is > sufficient for the intended purpose, but I am sure that RISC-V's > concerns are not unique: other GCC targets have to cope with > offset-size constraints that are coupled to register-allocation > constraints. Yup. I think every risc port in the 90s faces this problem. I always wished for a generic mechanism for ports to handle this problem. Regardless, it's probably worth investigating. jeff
Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
On 10/26/19 1:10 PM, Oleg Endo wrote: > On Sat, 2019-10-26 at 12:21 -0600, Jeff Law wrote: >> On 10/25/19 11:39 AM, Craig Blackmore wrote: >>> This patch aims to allow more load/store instructions to be >>> compressed by >>> replacing a load/store of 'base register + large offset' with a new >>> load/store >>> of 'new base + small offset'. If the new base gets stored in a >>> compressed >>> register, then the new load/store can be compressed. Since there is >>> an overhead >>> in creating the new base, this change is only attempted when 'base >>> register' is >>> referenced in at least 4 load/stores in a basic block. >>> >>> The optimization is implemented in a new RISC-V specific pass >>> called >>> shorten_memrefs which is enabled for RVC targets. It has been >>> developed for the >>> 32-bit lw/sw instructions but could also be extended to 64-bit >>> ld/sd in future. >>> >>> Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, >>> rv64imac, >>> rv64imafdc via QEMU. No regressions. >>> >>> gcc/ChangeLog: >>> >>> * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for >>> riscv. >>> * config/riscv/riscv-passes.def: New file. >>> * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): >>> Declare. >>> * config/riscv/riscv-shorten-memrefs.c: New file. >>> * config/riscv/riscv.c (tree-pass.h): New include. >>> (riscv_compressed_reg_p): New Function >>> (riscv_compressed_lw_offset_p): Likewise. >>> (riscv_compressed_lw_address_p): Likewise. >>> (riscv_shorten_lw_offset): Likewise. >>> (riscv_legitimize_address): Attempt to convert base + >>> large_offset >>> to compressible new_base + small_offset. >>> (riscv_address_cost): Make anticipated compressed load/stores >>> cheaper for code size than uncompressed load/stores. >>> (riscv_register_priority): Move compressed register check to >>> riscv_compressed_reg_p. >>> * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): >>> Define. >>> * config/riscv/riscv.opt (mshorten-memefs): New option. >>> * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule. >>> (PASSES_EXTRA): Add riscv-passes.def. >>> * doc/invoke.texi: Document -mshorten-memrefs. >> >> This has traditionally been done via the the legitimize_address hook. >> Is there some reason that hook is insufficient for this case? >> >> The hook, IIRC, is called out explow.c. >> > > This sounds like some of my addressing mode selection (AMS) attempts on > SH. Haven't looked at the patch (sorry), but I'm sure the problem is > pretty much the same. > > On SH legitimize_address is used to do ... "something" ... to the > address in order to make the displacement fit. The issue is, > legitimize_address doesn't get any context so it can't even try to find > a local optimal base address or something like that. True it's not given any context. I think most ports try to form the address in such a way that the base+offset' is likely to be a common subexpression. THere was a similar hook for reload which I didn't mention as I wasn't sure it was used i the LRA world. SH is likely than most more complex. Joern tried really hard to get good code on the SH and as a result the implementation is sometimes tough to follow. Jeff
Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
I don't know enough to say whether the legitimize_address hook is sufficient for the intended purpose, but I am sure that RISC-V's concerns are not unique: other GCC targets have to cope with offset-size constraints that are coupled to register-allocation constraints. On Sat, Oct 26, 2019 at 11:21 AM Jeff Law wrote: > > On 10/25/19 11:39 AM, Craig Blackmore wrote: > > This patch aims to allow more load/store instructions to be compressed by > > replacing a load/store of 'base register + large offset' with a new > > load/store > > of 'new base + small offset'. If the new base gets stored in a compressed > > register, then the new load/store can be compressed. Since there is an > > overhead > > in creating the new base, this change is only attempted when 'base > > register' is > > referenced in at least 4 load/stores in a basic block. > > > > The optimization is implemented in a new RISC-V specific pass called > > shorten_memrefs which is enabled for RVC targets. It has been developed for > > the > > 32-bit lw/sw instructions but could also be extended to 64-bit ld/sd in > > future. > > > > Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, rv64imac, > > rv64imafdc via QEMU. No regressions. > > > > gcc/ChangeLog: > > > > * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv. > > * config/riscv/riscv-passes.def: New file. > > * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare. > > * config/riscv/riscv-shorten-memrefs.c: New file. > > * config/riscv/riscv.c (tree-pass.h): New include. > > (riscv_compressed_reg_p): New Function > > (riscv_compressed_lw_offset_p): Likewise. > > (riscv_compressed_lw_address_p): Likewise. > > (riscv_shorten_lw_offset): Likewise. > > (riscv_legitimize_address): Attempt to convert base + large_offset > > to compressible new_base + small_offset. > > (riscv_address_cost): Make anticipated compressed load/stores > > cheaper for code size than uncompressed load/stores. > > (riscv_register_priority): Move compressed register check to > > riscv_compressed_reg_p. > > * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): Define. > > * config/riscv/riscv.opt (mshorten-memefs): New option. > > * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule. > > (PASSES_EXTRA): Add riscv-passes.def. > > * doc/invoke.texi: Document -mshorten-memrefs. > This has traditionally been done via the the legitimize_address hook. > Is there some reason that hook is insufficient for this case? > > The hook, IIRC, is called out explow.c. > > Jeff >
Re: [Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure
Hi Steve, Thanks for the heads up. I'll get on to it right away. Regards Paul On Sat, 26 Oct 2019 at 20:04, Steve Kargl wrote: > > On Sat, Oct 26, 2019 at 07:39:55PM +0100, Paul Richard Thomas wrote: > > As far as I can tell, this is a 6-regression as well - ***sigh*** > > > > The patch is fundamentally very simple. Symbols that were marked with > > the fn_result_spec flag that really were module parameters were having > > the wrong name mangling applied to them. The rest of the patch is a > > tidy up. > > > > Regtested on FC30/x86_64 - OK for all the branches after a bedding in > > period on trunk? > > > > OK. Note you cannot wait too long for 7 or 9. The 7 branch > is being closed, and the 9-branch is close to its next release. > > -- > Steve -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
*Early ping* Re: [Patch, Fortran] PR91863 - fix call to bind(C) with array descriptor
On 10/23/19 3:07 PM, Tobias Burnus wrote: With the trunk, there are three issues: (a) With bind(C), the callee side handles deallocation with intent(out). This should produce the code: if (cfi.0 != 0B) { __builtin_free (cfi.0); cfi.0 = 0B; } This fails as cfi.0 (of type 'void*') is dereferenced and *cfi.0 = 0B' (i.e. assignment of type 'void') causes the ICE. (b) With that fixed, one gets: sub (cfi.4); _gfortran_cfi_desc_to_gfc_desc (, ); if (cfi.4 != 0B) __builtin_free (cfi.4); ... code using "a" ... That also won't shine as 'a.data' == 'cfi.4'; hence, one accesses already freed memory. I don't see whether freeing the cfi memory makes sense at all; as I didn't come up with a reason, I removed it for good. Those issues, I have solved. The third issue is now PR fortran/92189: (c) When allocating memory in a Fortran-written Bind(C) function, the shape/bounds changes are not propagated back to Fortran. Namely, "sub" lacks some _gfortran_gfc_desc_to_cfi_desc call at the end! The issue pops up, if you change 'dg-do compile' into 'dg-do run'. For using a C-written function, that's a non-issue. Hence, it makes sense to fix (a)+(b) of the bug separately. OK for the trunk and GCC 9? (At least the ICE is a regression.) Tobias
Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
On Sat, 2019-10-26 at 12:21 -0600, Jeff Law wrote: > On 10/25/19 11:39 AM, Craig Blackmore wrote: > > This patch aims to allow more load/store instructions to be > > compressed by > > replacing a load/store of 'base register + large offset' with a new > > load/store > > of 'new base + small offset'. If the new base gets stored in a > > compressed > > register, then the new load/store can be compressed. Since there is > > an overhead > > in creating the new base, this change is only attempted when 'base > > register' is > > referenced in at least 4 load/stores in a basic block. > > > > The optimization is implemented in a new RISC-V specific pass > > called > > shorten_memrefs which is enabled for RVC targets. It has been > > developed for the > > 32-bit lw/sw instructions but could also be extended to 64-bit > > ld/sd in future. > > > > Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, > > rv64imac, > > rv64imafdc via QEMU. No regressions. > > > > gcc/ChangeLog: > > > > * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for > > riscv. > > * config/riscv/riscv-passes.def: New file. > > * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): > > Declare. > > * config/riscv/riscv-shorten-memrefs.c: New file. > > * config/riscv/riscv.c (tree-pass.h): New include. > > (riscv_compressed_reg_p): New Function > > (riscv_compressed_lw_offset_p): Likewise. > > (riscv_compressed_lw_address_p): Likewise. > > (riscv_shorten_lw_offset): Likewise. > > (riscv_legitimize_address): Attempt to convert base + > > large_offset > > to compressible new_base + small_offset. > > (riscv_address_cost): Make anticipated compressed load/stores > > cheaper for code size than uncompressed load/stores. > > (riscv_register_priority): Move compressed register check to > > riscv_compressed_reg_p. > > * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): > > Define. > > * config/riscv/riscv.opt (mshorten-memefs): New option. > > * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule. > > (PASSES_EXTRA): Add riscv-passes.def. > > * doc/invoke.texi: Document -mshorten-memrefs. > > This has traditionally been done via the the legitimize_address hook. > Is there some reason that hook is insufficient for this case? > > The hook, IIRC, is called out explow.c. > This sounds like some of my addressing mode selection (AMS) attempts on SH. Haven't looked at the patch (sorry), but I'm sure the problem is pretty much the same. On SH legitimize_address is used to do ... "something" ... to the address in order to make the displacement fit. The issue is, legitimize_address doesn't get any context so it can't even try to find a local optimal base address or something like that. Cheers, Oleg
Re: [Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure
On Sat, Oct 26, 2019 at 07:39:55PM +0100, Paul Richard Thomas wrote: > As far as I can tell, this is a 6-regression as well - ***sigh*** > > The patch is fundamentally very simple. Symbols that were marked with > the fn_result_spec flag that really were module parameters were having > the wrong name mangling applied to them. The rest of the patch is a > tidy up. > > Regtested on FC30/x86_64 - OK for all the branches after a bedding in > period on trunk? > OK. Note you cannot wait too long for 7 or 9. The 7 branch is being closed, and the 9-branch is close to its next release. -- Steve
[Patch, fortran] PR86248 - [7/8/9/10 Regression] LEN_TRIM in specification expression causes link failure
As far as I can tell, this is a 6-regression as well - ***sigh*** The patch is fundamentally very simple. Symbols that were marked with the fn_result_spec flag that really were module parameters were having the wrong name mangling applied to them. The rest of the patch is a tidy up. Regtested on FC30/x86_64 - OK for all the branches after a bedding in period on trunk? Cheers Paul 2019-10-26 Paul Thomas PR fortran/86248 * resolve.c (flag_fn_result_spec): Correct a typo before the function declaration. * trans-decl.c (gfc_sym_identifier): Boost the length of 'name' to allow for all variants. Simplify the code by using a pointer to the symbol's proc_name and taking the return out of each of the conditional branches. Allow symbols with fn_result_spec set that do not come from a procedure namespace and have a module name to go through the non-fn_result_spec branch. 2019-10-26 Paul Thomas PR fortran/86248 * gfortran.dg/char_result_19.f90 : New test. * gfortran.dg/char_result_mod_19.f90 : Module for the new test. Index: gcc/fortran/resolve.c === *** gcc/fortran/resolve.c (revision 277203) --- gcc/fortran/resolve.c (working copy) *** resolve_equivalence (gfc_equiv *eq) *** 16774,16781 } ! /* Function called by resolve_fntype to flag other symbol used in the !length type parameter specification of function resuls. */ static bool flag_fn_result_spec (gfc_expr *expr, --- 16774,16781 } ! /* Function called by resolve_fntype to flag other symbols used in the !length type parameter specification of function results. */ static bool flag_fn_result_spec (gfc_expr *expr, Index: gcc/fortran/trans-decl.c === *** gcc/fortran/trans-decl.c (revision 277203) --- gcc/fortran/trans-decl.c (working copy) *** gfc_sym_identifier (gfc_symbol * sym) *** 369,412 static const char * mangled_identifier (gfc_symbol *sym) { ! static char name[GFC_MAX_MANGLED_SYMBOL_LEN + 1]; /* Prevent the mangling of identifiers that have an assigned binding label (mainly those that are bind(c)). */ if (sym->attr.is_bind_c == 1 && sym->binding_label) return sym->binding_label; ! if (!sym->fn_result_spec) { if (sym->module == NULL) return sym_identifier (sym); else ! { ! snprintf (name, sizeof name, "__%s_MOD_%s", sym->module, sym->name); ! return name; ! } } else { /* This is an entity that is actually local to a module procedure that appears in the result specification expression. Since sym->module will be a zero length string, we use ns->proc_name ! instead. */ ! if (sym->ns->proc_name && sym->ns->proc_name->module) ! { ! snprintf (name, sizeof name, "__%s_MOD__%s_PROC_%s", ! sym->ns->proc_name->module, ! sym->ns->proc_name->name, ! sym->name); ! return name; ! } else ! { ! snprintf (name, sizeof name, "__%s_PROC_%s", ! sym->ns->proc_name->name, sym->name); ! return name; ! } } } /* Get mangled identifier, adding the symbol to the global table if --- 369,405 static const char * mangled_identifier (gfc_symbol *sym) { ! gfc_symbol *proc = sym->ns->proc_name; ! static char name[3*GFC_MAX_MANGLED_SYMBOL_LEN + 14]; /* Prevent the mangling of identifiers that have an assigned binding label (mainly those that are bind(c)). */ if (sym->attr.is_bind_c == 1 && sym->binding_label) return sym->binding_label; ! if (!sym->fn_result_spec ! || (sym->module && !(proc && proc->attr.flavor == FL_PROCEDURE))) { if (sym->module == NULL) return sym_identifier (sym); else ! snprintf (name, sizeof name, "__%s_MOD_%s", sym->module, sym->name); } else { /* This is an entity that is actually local to a module procedure that appears in the result specification expression. Since sym->module will be a zero length string, we use ns->proc_name ! to provide the module name instead. */ ! if (proc && proc->module) ! snprintf (name, sizeof name, "__%s_MOD__%s_PROC_%s", ! proc->module, proc->name, sym->name); else ! snprintf (name, sizeof name, "__%s_PROC_%s", ! proc->name, sym->name); } + + return name; } /* Get mangled identifier, adding the symbol to the global table if Index: gcc/testsuite/gfortran.dg/char_result_19.f90 === *** gcc/testsuite/gfortran.dg/char_result_19.f90 (nonexistent) --- gcc/testsuite/gfortran.dg/char_result_19.f90 (working copy) *** *** 0 --- 1,24 + ! { dg-do preprocess } + ! { dg-additional-options "-cpp" } + ! + ! Test the fix for PR86248 + ! + ! Contributed by Bill Long + ! + program test +
Re: [PATCH][MSP430] Use 430 insns in the large memory model for more patterns
On 10/25/19 6:24 AM, Jozef Lawrynowicz wrote: > Where possible, it is always desirable to use 430 format instructions when > compiling for the 430X ISA and the large memory model. 430 instructions have > reduced code size and faster execution time. > > This patch recognizes a couple of new patterns in which we can use 430 insns > in > the large memory model. > > Successfully regtested on trunk. > > Ok to apply? > > > 0001-MSP430-Use-430-insns-in-the-large-memory-model-for-m.patch > > From ba3a8eafeba08dc034e219f892f2784c16f94c40 Mon Sep 17 00:00:00 2001 > From: Jozef Lawrynowicz > Date: Thu, 24 Oct 2019 15:17:29 +0100 > Subject: [PATCH] MSP430: Use 430 insns in the large memory model for more > patterns > > gcc/ChangeLog: > > 2019-10-25 Jozef Lawrynowicz > > * config/msp430/msp430.c (msp430_check_index_not_high_mem): New. > (msp430_check_plus_not_high_mem): New. > (msp430_op_not_in_high_mem): Use new functions to check if the operand > might be in low memory. > Indicate that a 16-bit absolute address is in lower memory. > > gcc/testsuite/ChangeLog: > > 2019-10-25 Jozef Lawrynowicz > > * gcc.target/msp430/mlarge-use-430-insn.c: New test. > OK jeff
Re: [PATCH 4/N] Fix unsigned type overflow in memory report.
On 10/25/19 7:21 AM, Martin Liška wrote: > Hi. > > And there's one more integer overflow fix. > > Martin > > > 0001-Fix-unsigned-type-overflow-in-memory-report.patch > > From 35c22704dc705508672f19b09e6d1b94bd956535 Mon Sep 17 00:00:00 2001 > From: Martin Liska > Date: Fri, 25 Oct 2019 15:09:32 +0200 > Subject: [PATCH] Fix unsigned type overflow in memory report. > > gcc/ChangeLog: > > 2019-10-25 Martin Liska > > * ggc-common.c: One can't subtract unsigned types > in compare function. OK jeff
Re: [PATCH 3/3] Print header in dump_memory_report.
On 10/25/19 3:32 AM, Martin Liska wrote: > > gcc/ChangeLog: > > 2019-10-25 Martin Liska > > * cgraphunit.c (symbol_table::compile): Pass > title as dump_memory_report argument. > * toplev.c (dump_memory_report): New argument. > (finalize): Pass new argument. > * toplev.h (dump_memory_report): Add argument. > > gcc/lto/ChangeLog: > > 2019-10-25 Martin Liska > > * lto.c (do_whole_program_analysis): Pass > title as dump_memory_report argument. > --- > gcc/cgraphunit.c | 10 ++ > gcc/lto/lto.c| 12 +++- > gcc/toplev.c | 13 +++-- > gcc/toplev.h | 3 ++- > 4 files changed, 18 insertions(+), 20 deletions(-) > OK jeff
Re: [PATCH 2/3] Move Leak in GCC memory report to the first column.
On 10/25/19 3:10 AM, Martin Liska wrote: > > gcc/ChangeLog: > > 2019-10-25 Martin Liska > > * ggc-common.c: Move Leak to the first column. OK jeff
Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
On 10/25/19 11:39 AM, Craig Blackmore wrote: > This patch aims to allow more load/store instructions to be compressed by > replacing a load/store of 'base register + large offset' with a new load/store > of 'new base + small offset'. If the new base gets stored in a compressed > register, then the new load/store can be compressed. Since there is an > overhead > in creating the new base, this change is only attempted when 'base register' > is > referenced in at least 4 load/stores in a basic block. > > The optimization is implemented in a new RISC-V specific pass called > shorten_memrefs which is enabled for RVC targets. It has been developed for > the > 32-bit lw/sw instructions but could also be extended to 64-bit ld/sd in > future. > > Tested on bare metal rv32i, rv32iac, rv32im, rv32imac, rv32imafc, rv64imac, > rv64imafdc via QEMU. No regressions. > > gcc/ChangeLog: > > * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv. > * config/riscv/riscv-passes.def: New file. > * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare. > * config/riscv/riscv-shorten-memrefs.c: New file. > * config/riscv/riscv.c (tree-pass.h): New include. > (riscv_compressed_reg_p): New Function > (riscv_compressed_lw_offset_p): Likewise. > (riscv_compressed_lw_address_p): Likewise. > (riscv_shorten_lw_offset): Likewise. > (riscv_legitimize_address): Attempt to convert base + large_offset > to compressible new_base + small_offset. > (riscv_address_cost): Make anticipated compressed load/stores > cheaper for code size than uncompressed load/stores. > (riscv_register_priority): Move compressed register check to > riscv_compressed_reg_p. > * config/riscv/riscv.h (RISCV_MAX_COMPRESSED_LW_OFFSET): Define. > * config/riscv/riscv.opt (mshorten-memefs): New option. > * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule. > (PASSES_EXTRA): Add riscv-passes.def. > * doc/invoke.texi: Document -mshorten-memrefs. This has traditionally been done via the the legitimize_address hook. Is there some reason that hook is insufficient for this case? The hook, IIRC, is called out explow.c. Jeff
Re: [PATCH] PR85678: Change default to -fno-common
On 10/25/19 9:47 AM, Wilco Dijkstra wrote: > GCC currently defaults to -fcommon. As discussed in the PR, this is an > ancient > C feature which is not conforming with the latest C standards. On many > targets > this means global variable accesses have a codesize and performance penalty. > This applies to C code only, C++ code is not affected by -fcommon. It is > about > time to change the default. > > OK for commit? > > ChangeLog > 2019-10-25 Wilco Dijkstra > > PR85678 > * common.opt (fcommon): Change init to 1. > > doc/ > * invoke.texi (-fcommon): Update documentation. Has this been bootstrapped and regression tested? jeff
Re: [PATCH] Remove redudant iptr when operand already has a scalar mode.
On Sat, Oct 26, 2019 at 3:27 PM Hongtao Liu wrote: > > > BTW: Please also note that there is no need to use or operand > > mode override in scalar insn templates for intel asm dialect when > > operand already has a scalar mode. > https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html > > This patch is to remove redundant when operand already has a scalar > mode. > > bootstrap and regression test for i386/x86-64 is ok. > > Changelog > gcc/ > * config/i386/sse.md (*_vm3, > _vm3): Remove since > operand is already scalar mode. > (iptr): Remove SF/DF. OK. Thanks, Uros.
[PATCH] rs6000: Fix allocate_stack in a corner case (PR91289)
When we have -fstack-limit-symbol with sysv we can end up with a non- existing instruction (you cannot add an immediate to register 0). Fix this by using register 11 instead. It might be used for something else already though, so save and restore its value around this. In optimizing compiles these extra moves are usually removed again: the restore by cprop_hardreg, and then the save by rtl_dce. Committing to trunk. Segher 2019-10-26 Segher Boessenkool PR target/91289 * config/rs6000/rs6000-logue.c (rs6000_emit_allocate_stack): Don't add an immediate to r0; use r11 instead. Save and restore r11 to r0 around this. --- gcc/config/rs6000/rs6000-logue.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c index e98893a..04aae80 100644 --- a/gcc/config/rs6000/rs6000-logue.c +++ b/gcc/config/rs6000/rs6000-logue.c @@ -1700,10 +1700,14 @@ rs6000_emit_allocate_stack (HOST_WIDE_INT size, rtx copy_reg, int copy_off) stack_limit_rtx, GEN_INT (size))); - emit_insn (gen_elf_high (tmp_reg, toload)); - emit_insn (gen_elf_low (tmp_reg, tmp_reg, toload)); - emit_insn (gen_cond_trap (LTU, stack_reg, tmp_reg, - const0_rtx)); + /* We cannot use r0 with elf_low. Lamely solve this problem by +moving registers around. */ + rtx r11_reg = gen_rtx_REG (Pmode, 11); + emit_move_insn (tmp_reg, r11_reg); + emit_insn (gen_elf_high (r11_reg, toload)); + emit_insn (gen_elf_low (r11_reg, r11_reg, toload)); + emit_insn (gen_cond_trap (LTU, stack_reg, r11_reg, const0_rtx)); + emit_move_insn (r11_reg, tmp_reg); } else warning (0, "stack limit expression is not supported"); -- 1.8.3.1
Re: [PATCH] Fix PR92222
Richard Biener writes: > We have to check each operand for being in a pattern, not just the > first when avoiding build from scalars (we could possibly handle > the special case of some of them being the pattern stmt root, but > that would be a followup improvement). > > Bootstrap & regtest running on x86-64-unknown-linux-gnu. > > Richard. > > 2019-10-25 Richard Biener > > PR tree-optimization/9 > * tree-vect-slp.c (_slp_oprnd_info::first_pattern): Remove. > (_slp_oprnd_info::second_pattern): Likewise. > (_slp_oprnd_info::any_pattern): New. > (vect_create_oprnd_info): Adjust. > (vect_get_and_check_slp_defs): Compute whether any stmt is > in a pattern. > (vect_build_slp_tree_2): Avoid building up a node from scalars > if any of the operand defs, not just the first, is in a pattern. Is recording this in vect_get_and_check_slp_defs enough? AIUI we don't get that far if vect_build_slp_tree_1 fails, but that doesn't prevent us from building the vector from scalars on return from vect_build_slp_tree. Was wondering whether we should use a helper function that explicitly checks whether any stmt_info in the vec is a pattern statement, rather than computing it on the fly. Thanks, Richard > > * gcc.dg/torture/pr9.c: New testcase. > > Index: gcc/tree-vect-slp.c > === > --- gcc/tree-vect-slp.c (revision 277441) > +++ gcc/tree-vect-slp.c (working copy) > @@ -177,8 +177,7 @@ typedef struct _slp_oprnd_info > stmt. */ >tree first_op_type; >enum vect_def_type first_dt; > - bool first_pattern; > - bool second_pattern; > + bool any_pattern; > } *slp_oprnd_info; > > > @@ -199,8 +198,7 @@ vect_create_oprnd_info (int nops, int gr >oprnd_info->ops.create (group_size); >oprnd_info->first_dt = vect_uninitialized_def; >oprnd_info->first_op_type = NULL_TREE; > - oprnd_info->first_pattern = false; > - oprnd_info->second_pattern = false; > + oprnd_info->any_pattern = false; >oprnds_info.quick_push (oprnd_info); > } > > @@ -339,13 +337,11 @@ vect_get_and_check_slp_defs (vec_info *v >tree oprnd; >unsigned int i, number_of_oprnds; >enum vect_def_type dt = vect_uninitialized_def; > - bool pattern = false; >slp_oprnd_info oprnd_info; >int first_op_idx = 1; >unsigned int commutative_op = -1U; >bool first_op_cond = false; >bool first = stmt_num == 0; > - bool second = stmt_num == 1; > >if (gcall *stmt = dyn_cast (stmt_info->stmt)) > { > @@ -418,13 +414,12 @@ again: > return -1; > } > > - if (second) > - oprnd_info->second_pattern = pattern; > + if (def_stmt_info && is_pattern_stmt_p (def_stmt_info)) > + oprnd_info->any_pattern = true; > >if (first) > { > oprnd_info->first_dt = dt; > - oprnd_info->first_pattern = pattern; > oprnd_info->first_op_type = TREE_TYPE (oprnd); > } >else > @@ -1311,7 +1306,7 @@ vect_build_slp_tree_2 (vec_info *vinfo, > /* ??? Rejecting patterns this way doesn't work. We'd have to >do extra work to cancel the pattern so the uses see the >scalar version. */ > - && !is_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (child)[0])) > + && !oprnd_info->any_pattern) > { > slp_tree grandchild; > > @@ -1358,18 +1353,16 @@ vect_build_slp_tree_2 (vec_info *vinfo, > /* ??? Rejecting patterns this way doesn't work. We'd have to >do extra work to cancel the pattern so the uses see the >scalar version. */ > - && !is_pattern_stmt_p (stmt_info)) > + && !is_pattern_stmt_p (stmt_info) > + && !oprnd_info->any_pattern) > { > if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, vect_location, >"Building vector operands from scalars\n"); > this_tree_size++; > - child = vect_create_new_slp_node (oprnd_info->def_stmts); > - SLP_TREE_DEF_TYPE (child) = vect_external_def; > - SLP_TREE_SCALAR_OPS (child) = oprnd_info->ops; > + child = vect_create_new_slp_node (oprnd_info->ops); > children.safe_push (child); > oprnd_info->ops = vNULL; > - oprnd_info->def_stmts = vNULL; > continue; > } > > @@ -1469,7 +1440,7 @@ vect_build_slp_tree_2 (vec_info *vinfo, > /* ??? Rejecting patterns this way doesn't work. We'd have >to do extra work to cancel the pattern so the uses see the >scalar version. */ > - && !is_pattern_stmt_p (SLP_TREE_SCALAR_STMTS (child)[0])) > + && !oprnd_info->any_pattern) > { > unsigned int j; > slp_tree grandchild; > Index: gcc/testsuite/gcc.dg/torture/pr9.c >
[PATCH] Remove redudant iptr when operand already has a scalar mode.
> BTW: Please also note that there is no need to use or operand > mode override in scalar insn templates for intel asm dialect when > operand already has a scalar mode. https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01868.html This patch is to remove redundant when operand already has a scalar mode. bootstrap and regression test for i386/x86-64 is ok. Changelog gcc/ * config/i386/sse.md (*_vm3, _vm3): Remove since operand is already scalar mode. (iptr): Remove SF/DF. -- BR, Hongtao 0001-Remove-redudant-iptr-when-operand-is-already-scalar-.patch Description: Binary data
Re: [PATCH] PR85678: Change default to -fno-common
On Sat, Oct 26, 2019 at 12:21:15PM +0100, Iain Sandoe wrote: > Georg-Johann Lay wrote: > > Wilco Dijkstra schrieb: > >> GCC currently defaults to -fcommon. As discussed in the PR, this is an > >> ancient > >> C feature which is not conforming with the latest C standards. On many > >> targets > >> this means global variable accesses have a codesize and performance > >> penalty. > >> This applies to C code only, C++ code is not affected by -fcommon. It is > >> about > >> time to change the default. > >> OK for commit? > > > > IIRC using -fno-common might lead to some testsuit fallout because > > some optimizations / test cases are sensitive to -f[no-]common. > > So I wonder that no adjustments to test cases are needed? > > Two points here: > > (a) I actually agree with the idea to change the default I think everyone does, this has been long overdue. > (b) there will surely be testsuite fallout on Darwin, where common accesses > are > mandated to be indirect in the ABI, where non-weak .data accesses are not > Thus code generated for Darwin will change for any test using variables > that > are “common” and where the test does not already force -fno-common > > I’d expect fallout to be quite large - since there are plenty of > testcases with uninit >gobals. We might want to make a test-run and see the size of the problem, > but >preferably once the stage#1 submisison and gcc-7 release cycles are done. Yeah -- and if fallout is more than just testsuite, should it be postponed to GCC 11? And announced now, etc. But to find out we probably should flip the switch soon, see what happens, only flip it back if necessary. Segher
Re: [PATCH] PR85678: Change default to -fno-common
Georg-Johann Lay wrote: > Wilco Dijkstra schrieb: >> GCC currently defaults to -fcommon. As discussed in the PR, this is an >> ancient >> C feature which is not conforming with the latest C standards. On many >> targets >> this means global variable accesses have a codesize and performance penalty. >> This applies to C code only, C++ code is not affected by -fcommon. It is >> about >> time to change the default. >> OK for commit? > > IIRC using -fno-common might lead to some testsuit fallout because > some optimizations / test cases are sensitive to -f[no-]common. > So I wonder that no adjustments to test cases are needed? Two points here: (a) I actually agree with the idea to change the default (b) there will surely be testsuite fallout on Darwin, where common accesses are mandated to be indirect in the ABI, where non-weak .data accesses are not Thus code generated for Darwin will change for any test using variables that are “common” and where the test does not already force -fno-common I’d expect fallout to be quite large - since there are plenty of testcases with uninit gobals. We might want to make a test-run and see the size of the problem, but preferably once the stage#1 submisison and gcc-7 release cycles are done. thanks Iain >> ChangeLog >> 2019-10-25 Wilco Dijkstra >> PR85678 >> * common.opt (fcommon): Change init to 1. >> doc/ >> * invoke.texi (-fcommon): Update documentation. >> --- >> diff --git a/gcc/common.opt b/gcc/common.opt >> index >> 0195b0cb85a06dd043fd0412b42dfffddfa2495b..b0840f41a5e480f4428bd62724b0dc3d54c68c0b >> 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -1131,7 +1131,7 @@ Common Report Var(flag_combine_stack_adjustments) >> Optimization >> Looks for opportunities to reduce stack adjustments and stack references. >> fcommon >> -Common Report Var(flag_no_common,0) >> +Common Report Var(flag_no_common,0) Init(1) >> Put uninitialized globals in the common section. >> fcompare-debug >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index >> 857d9692729e503657d0d0f44f1f6252ec90d49a..5b4ff66015f5f94a5bd89e4dc3d2d53553cc091e >> 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -568,7 +568,7 @@ Objective-C and Objective-C++ Dialects}. >> -fnon-call-exceptions -fdelete-dead-exceptions -funwind-tables @gol >> -fasynchronous-unwind-tables @gol >> -fno-gnu-unique @gol >> --finhibit-size-directive -fno-common -fno-ident @gol >> +-finhibit-size-directive -fcommon -fno-ident @gol >> -fpcc-struct-return -fpic -fPIC -fpie -fPIE -fno-plt @gol >> -fno-jump-tables @gol >> -frecord-gcc-switches @gol >> @@ -14050,35 +14050,27 @@ useful for building programs to run under WINE@. >> code that is not binary compatible with code generated without that switch. >> Use it to conform to a non-default application binary interface. >> -@item -fno-common >> -@opindex fno-common >> +@item -fcommon >> @opindex fcommon >> +@opindex fno-common >> @cindex tentative definitions >> -In C code, this option controls the placement of global variables -defined >> without an initializer, known as @dfn{tentative definitions} -in the C >> standard. Tentative definitions are distinct from declarations +In C code, >> this option controls the placement of global variables >> +defined without an initializer, known as @dfn{tentative definitions} >> +in the C standard. Tentative definitions are distinct from declarations >> of a variable with the @code{extern} keyword, which do not allocate storage. >> -Unix C compilers have traditionally allocated storage for >> -uninitialized global variables in a common block. This allows the >> -linker to resolve all tentative definitions of the same variable >> +The default is @option{-fno-common}, which specifies that the compiler >> places >> +uninitialized global variables in the BSS section of the object file. > > IMO "uninitialized" is confusing because the variables actually > *are* initialized: with zero. It's just that the variables don't have > explicit initializers. Dito for "uninitialized" in the --help message. > > Johann > > >> +This inhibits the merging of tentative definitions by the linker so you get >> a >> +multiple-definition error if the same variable is accidentally defined in >> more >> +than one compilation unit. >> + >> +The @option{-fcommon} places uninitialized global variables in a common >> block. >> +This allows the linker to resolve all tentative definitions of the same >> variable >> in different compilation units to the same object, or to a non-tentative >> -definition. -This is the behavior specified by @option{-fcommon}, and is >> the default for -GCC on most targets. -On the other hand, this behavior is >> not required by ISO >> -C, and on some targets may carry a speed or code size penalty on >> -variable references. >> - >> -The @option{-fno-common} option specifies that the compiler should instead >> -place
Re: introduce -fcallgraph-info option
Hi, Richi, On Oct 26, 2019, Richard Biener wrote: > How does it relate to the LTO-dump utility we have now which can in > theory provide a more complete view? Erhm... Not at all, AFAICT. The only vague resemblance I see is in the presence of the word "callgraph" in the description of what both can do, but even that's not quite the same callgraph, besides the different format. E.g., the reason we gather expanded calls rather than just use cgraph_edges is that the latter would dump several "calls" that are builtins expanded internally by the compiler, and would NOT dump other ops that are expanded as (lib)calls. In order to get an accurate assessment of stack size requirements, the presence of the builtins could be confusing but not so much trouble, but the absence of libcalls would underestimate the potential max total stack use by a symbol. > Maybe some infrastructure can be > shared here (the actual dumping of the cgraph?) You mean the one-liner loop in cgraph_node::dump_graphviz, called by lto-dump to dump the cgraph? That doesn't really seem worth sharing; more so considering we dump edges in a quite different format, and not just the edges. In this different format expected by gnatstack, we also dump the nodes, and include information in the labels of the nodes, such as their original spelling and location, as well as stack use: node: { title: "_ada_p" label: "P\np.adb:1:1\n48 bytes (static)" } and dynamic stack allocations (alloca and vlas): node: { title: "p.adb:p__u" label: "u\np.adb:21:3\n2 dynamic objects\n rt p.adb:23:5\n ri p.adb:24:5" } and though edges to libcalls may carry just as little info as a graphviz "from" -> "to" edge: edge: { sourcename: "add" targetname: "__addvsi3" } those between user symbols carry location info as well: edge: { sourcename: "p__s" targetname: "p.adb:p__v" label: "p.adb:46:5" } So I'm afraid I don't see anything that could be usefully factored out. Do you? Thanks, -- Alexandre Oliva, freedom fighter he/him https://FSFLA.org/blogs/lxo Be the change, be Free!FSF VP & FSF Latin America board member GNU Toolchain EngineerFree Software Evangelist Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara
[PING**2] [PATCH] Fix PR c++/92024
Ping... On 10/18/19 3:19 PM, Bernd Edlinger wrote: > Ping... > > for the c++ FE and testsuite changes in the updated patch > here: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00916.html > > > Thanks > Bernd. > > > > > On 10/12/19 8:10 PM, Bernd Edlinger wrote: >> On 10/11/19 6:31 PM, Jason Merrill wrote: >>> On 10/10/19 2:06 PM, Bernd Edlinger wrote: On 10/10/19 7:49 PM, Jason Merrill wrote: if -Wshadow=compatible-local is used, the can_convert function crashes in instantiate_class_template_1. >>> >>> Right, checking can_convert is problematic here, as it can cause template >>> instantiations that change the semantics of the program. Or, in this case, >>> crash. >>> >> >> So I try to make C++ behave more consistently with the code in c-decl.c, >> thus dependent on warn_shadow but not on warn_shadow_local and/or >> warn_shadow_compatible_local: >> >>if (warn_shadow) >> warning_code = OPT_Wshadow; >> else if (comptypes (TREE_TYPE (old_decl), TREE_TYPE (new_decl))) >> warning_code = OPT_Wshadow_compatible_local; >> else >> warning_code = OPT_Wshadow_local; >> warned = warning_at (DECL_SOURCE_LOCATION (new_decl), >> warning_code, >> "declaration of %qD shadows a parameter", >> new_decl); >> >> I cannot remove the if (warn_shadow) since this breaks gcc.dg/pr48062.c >> which uses: >> >> #pragma GCC diagnostic ignored "-Wshadow" >> >> to disable a -Wshadow=compatible-local warning, but while -Wno-shadow on the >> command line disables also dependent warnings the pragma does not (always) >> do that. >> >> So instead I'd like to adjust the doc of -Wshadow to reflect the >> implementation >> and remove the if(warn_shadow_local) to have C and C++ behave identical and >> hopefully now in sync with the doc. >> >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >>
Re: introduce -fcallgraph-info option
On October 26, 2019 6:35:43 AM GMT+02:00, Alexandre Oliva wrote: >This was first submitted many years ago >https://gcc.gnu.org/ml/gcc-patches/2010-10/msg02468.html > >The command line option -fcallgraph-info is added and makes the >compiler generate another output file (xxx.ci) for each compilation >unit, which is a valid VCG file (you can launch your favorite VCG >viewer on it unmodified) and contains the "final" callgraph of the >unit. "final" is a bit of a misnomer as this is actually the >callgraph at RTL expansion time, but since most high-level >optimizations are done at the Tree level and RTL doesn't usually >fiddle with calls, it's final in almost all cases. Moreover, the >nodes can be decorated with additional info: -fcallgraph-info=su adds >stack usage info and -fcallgraph-info=da dynamic allocation info. > >Compared with the earlier version, this patch does not modify cgraph, >and instead adds the required information next to the stage usage >function data structure, so we only hold one of those at at time. I've >switched to vecs from linked lists, for more compact edges and dynamic >allocation annotations, and arranged for them to be released as soon as >we've printed out the information. I have NOT changed the file format, >because existing tools such as gnatstack consume the current format. > >Regstrapped on x86_64-linux-gnu. Ok to install? How does it relate to the LTO-dump utility we have now which can in theory provide a more complete view? Maybe some infrastructure can be shared here (the actual dumping of the cgraph?) Thanks, Richard. > >for gcc/ChangeLog >From Eric Botcazou , Alexandre Oliva > > > * common.opt (-fcallgraph-info[=]): New option. > * doc/invoke.texi (Debugging options): Document it. > * opts.c (common_handle_option): Handle it. > * builtins.c (expand_builtin_alloca): Record allocation if > -fcallgraph-info=da. > * calls.c (expand_call): If -fcallgraph-info, record the call. > (emit_library_call_value_1): Likewise. > * flag-types.h (enum callgraph_info_type): New type. > * explow.c: Include stringpool.h. > (set_stack_check_libfunc): Set SET_SYMBOL_REF_DECL on the symbol. > * function.c (allocate_stack_usage_info): New. > (allocate_struct_function): Call it for -fcallgraph-info. > (prepare_function_start): Call it otherwise. > (rest_of_handle_thread_prologue_and_epilogue): Release callees > and dallocs after output_stack_usage. > (record_final_call, record_dynamic_alloc): New. > * function.h (struct callee, struct dalloc): New. > (struct stack_usage): Add callees and dallocs. > (record_final_call, record_dynamic_alloc): Declare. > * gimplify.c (gimplify_decl_expr): Record dynamically-allocated > object if -fcallgraph-info=da. > * optabs-libfuncs.c (build_libfunc_function): Keep SYMBOL_REF_DECL. > * print-tree.h (print_decl_identifier): Declare. > (PRINT_DECL_ORIGIN, PRINT_DECL_NAME, PRINT_DECL_UNIQUE_NAME): New. > * print-tree.c: Include print-tree.h. > (print_decl_identifier): New function. > * toplev.c: Include print-tree.h. > (callgraph_info_file): New global variable. > (callgraph_info_indirect_emitted): Likewise. > (output_stack_usage): Rename to... > (output_stack_usage_1): ... this. Make it static, add cf > parameter. If -fcallgraph-info=su, print stack usage to cf. > If -fstack-usage, use print_decl_identifier for > pretty-printing. > (INDIRECT_CALL_NAME): New. > (dump_final_indirect_call_node_vcg): New. > (dump_final_callee_vcg, dump_final_node_vcg): New. > (output_stack_usage): New. > (lang_dependent_init): Open and start file if > -fcallgraph-info. > (finalize): If callgraph_info_file is not null, finish it, > close it, and reset callgraph info state. > >for gcc/ada/ChangeLog > > * gcc-interface/misc.c (callgraph_info_file): Delete. >--- > gcc/ada/gcc-interface/misc.c |3 - > gcc/builtins.c |4 + > gcc/calls.c |6 + > gcc/common.opt |8 ++ > gcc/doc/invoke.texi | 17 > gcc/explow.c |5 + > gcc/flag-types.h | 16 > gcc/function.c | 63 ++-- > gcc/function.h | 25 ++ > gcc/gimplify.c |4 + > gcc/optabs-libfuncs.c|4 - > gcc/opts.c | 26 ++ > gcc/output.h |2 > gcc/print-tree.c | 76 +++ > gcc/print-tree.h |4 + >gcc/toplev.c | 169 >++ > 16 files changed, 381 insertions(+), 51 deletions(-) > >diff --git a/gcc/ada/gcc-interface/misc.c >b/gcc/ada/gcc-interface/misc.c >index 4abd4d5708a54..d68b37384ff7f 100644 >--- a/gcc/ada/gcc-interface/misc.c >+++