[PATCH, committed] PR 91048 Open ~/.mklog in string mode
Hi, another leftover from the py3 conversion. Committed r272921 as obvious. 2019-07-02 Janne Blomqvist PR other/91048 * mklog (read_user_info): Open ~/.mklog in string mode. Index: mklog === --- mklog(revision 272920) +++ mklog(working copy) @@ -100,7 +100,7 @@ EMAIL = ... mklog_conf = os.path.expanduser('~/.mklog') if os.path.exists(mklog_conf): attrs = {} -f = open(mklog_conf, 'rb') +f = open(mklog_conf) for s in f: if cache.match(r'^\s*([a-zA-Z0-9_]+)\s*=\s*(.*?)\s*$', s): attrs[cache.group(1)] = cache.group(2) -- Janne Blomqvist
RE: Use ODR for canonical types construction in LTO
Hi, FYI. This patch works for my application LTO build on aarch64. Thanks, -Jiangning > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org > On Behalf Of Jan Hubicka > Sent: Monday, July 1, 2019 8:22 PM > To: Christophe Lyon > Cc: Eric Botcazou ; gcc Patches patc...@gcc.gnu.org>; Richard Biener ; d...@dcepelik.cz; > Martin Liška > Subject: Re: Use ODR for canonical types construction in LTO > > Hi, > this patch fixes the ICE. Problem is that va_arg type is pre-streamed and thus > at stream-in time we have main variant constructed by LTO FE which > is !CXX_ODR_P while vairants are ones comming from C++ FE which are ODR. > It is safe to drop the flags here since we only care about main variants > anyway. > > I am testing the patch on aarch64 now. > > Honza > > Index: lto/lto-common.c > > === > --- lto/lto-common.c (revision 272846) > +++ lto/lto-common.c (working copy) > @@ -568,8 +568,17 @@ lto_register_canonical_types_for_odr_typ > >/* Register all remaining types. */ >FOR_EACH_VEC_ELT (*types_to_register, i, t) > -if (!TYPE_CANONICAL (t)) > - gimple_register_canonical_type (t); > +{ > + /* For pre-streamed types like va-arg it is possible that main variant > + is !CXX_ODR_P while the variant (which is streamed) is. > + Copy CXX_ODR_P to make type verifier happy. This is safe because > + in canonical type calculation we only consider main variants. > + However we can not change this flag before streaming is finished > + to not affect tree merging. */ > + TYPE_CXX_ODR_P (t) = TYPE_CXX_ODR_P (TYPE_MAIN_VARIANT (t)); > + if (!TYPE_CANONICAL (t)) > +gimple_register_canonical_type (t); > +} > } > >
Re: [PATCH] integrate sprintf pass into strlen (PR 83431)
Attached is a more complete solution that fully resolves the bug report by avoiding a warning in cases like: char a[32], b[8]; void f (void) { if (strlen (a) < sizeof b - 2) snprintf (b, sizeof b, "b=%s", a); // no -Wformat-truncation } It does that by having get_range_strlen_dynamic use the EVRP information for non-constant strlen calls: EVRP has recorded that the result is less sizeof b - 2 and so the function returns this limited range of lengths to snprintf which can then avoid warning. It also improves the checking and can find latent bugs it missed before (like the off-by-one in print-rtl.c). Besides improving the accuracy of the -Wformat-overflow and truncation warnings this can also result in better code. So far this only benefits snprintf but there may be other opportunities to string functions as well (e.g., strcmp or memcmp). Jeff, I looked into your question/suggestion for me last week when we spoke, to introduce some sort of a recursion limit for get_range_strlen_dynamic. It's easily doable but before we go down that path I did some testing to see how bad it can get and to compare it with get_range_strlen. Here are the results for a few packages. The dept is the maximum recursion depth, success and fail are the numbers of successful and failed calls to the function: binutils-gdb: depth success fail get_range_strlen: 319 830221621 get_range_strlen_dynamic:41 1503 161 gcc: get_range_strlen:46 721111365 get_range_strlen_dynamic:23 10272 12 glibc: get_range_strlen:76 284011422 get_range_strlen_dynamic:51 1186 46 elfutils: get_range_strlen:33 1198 2516 get_range_strlen_dynamic:31 685 36 kernel: get_range_strlen:25 529911698 get_range_strlen_dynamic:31 9911 122 Except the first case of get_range_strlen (I haven't looked into it yet), it doesn't seem too bad, and with just one exception it's better than get_range_strlen. Let me know if you think it's worth adding a parameter (I assume we'd use it for both functions) and what to set it to. On 6/11/19 5:26 PM, Martin Sebor wrote: The sprintf and strlen passes both work with strings but run independently of one another and don't share state. As a result, lengths of strings dynamically created by functions that are available to the strlen pass are not available to sprintf. Conversely, lengths of strings formatted by the sprintf functions are not made available to the strlen pass. The result is less efficient code, poor diagnostics, and ultimately less than optimal user experience. The attached patch is the first step toward rectifying this design problem. It integrates the two passes into one and exposes the string length data managed by the strlen pass to the sprintf "module." (It does not expose any sprintf data to the strlen pass yet.) The sprintf pass invocations in passes.def have been replaced with those of strlen. The first "early" invocation is only effective for the sprintf module to enable warnings without optimization. The second invocation is "late" and enables both warnings and the sprintf and strlen optimizations unless explicitly disabled via -fno-optimize-strlen. Since the strlen optimization is the second invocation of the pass tests that scan the strlen dump have been adjusted to look for the "strlen2" dump file. The changes in the patch are mostly mechanical. The one new "feature" worth calling out is the get_range_strlen_dynamic function. It's analogous to get_range_strlen in gimple-fold.c except that it makes use of the strlen "dynamically" obtained string length info. In cases when the info is not available the former calls the latter. The other new functions in tree-ssa-strlen.c are check_and_optimize_call and handle_integral_assign: I added them only to keep the check_and_optimize_stmt function from getting excessively long and hard to follow. Otherwise, the code in the functions is unchanged. There are a number of further enhancements to consider as the next steps: * enhance the strlen module to determine string length range information from integer variable to which it was previously assigned (necessary to fully resolve pr83431 and pr90625), * make the sprintf/snprintf string length results directly available to the strlen module, * enhance the sprintf module to optimize snprintf(0, 0, fmt) calls with fmt strings consisting solely of plain characters and %s directives to series of strlen calls on the arguments, * and more. Martin PR tree-optimization/83431 - -Wformat-truncation may incorrectly report truncation gcc/ChangeLog: PR c++/83431 * gimple-ssa-sprintf.c (pass_data_sprintf_length): Remove object. (sprintf_dom_walker): Remove class. (get_int_range): Make argument const.
Go patch committed: refactor Exports to encapsulate type refs map
This Go frontend patch by Than McIntosh refactors the Export class to encapsulate the type refs map. This convert the Export::type_refs map from a static object to a field contained (indirectly, via an impl class) in Export itself, for better encapsulation and to be able to reclaim its memory when exporting is done. No change in compiler functionality. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 272666) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -d3d0f3c5bbe9d272178d55bdb907b07c188800e1 +1e042a49d6f2e95d371301aa7b911522dc5877f4 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/export.cc === --- gcc/go/gofrontend/export.cc (revision 272608) +++ gcc/go/gofrontend/export.cc (working copy) @@ -41,14 +41,6 @@ const char Export::v2_magic[Export::magi const int Export::checksum_len; -// Constructor. - -Export::Export(Stream* stream) - : stream_(stream), type_index_(1), packages_() -{ - go_assert(Export::checksum_len == Go_sha1_helper::checksum_len); -} - // Type hash table operations, treating aliases as distinct. class Type_hash_alias_identical @@ -80,14 +72,31 @@ class Type_alias_identical } }; -// Mapping from Type objects to a constant index. This would be nicer -// as a field in Export, but then export.h would have to #include -// types.h. - +// Mapping from Type objects to a constant index. typedef Unordered_map_hash(const Type*, int, Type_hash_alias_identical, - Type_alias_identical) Type_refs; + Type_alias_identical) Type_refs; + +// Implementation object for class Export. Hidden implementation avoids +// having to #include types.h in export.h, or use a static map. + +struct Export_impl { + Type_refs type_refs; +}; -static Type_refs type_refs; +// Constructor. + +Export::Export(Stream* stream) +: stream_(stream), type_index_(1), packages_(), impl_(new Export_impl) +{ + go_assert(Export::checksum_len == Go_sha1_helper::checksum_len); +} + +// Destructor. + +Export::~Export() +{ + delete this->impl_; +} // A traversal class to collect functions and global variables // referenced by inlined functions. @@ -635,7 +644,7 @@ Export::set_type_index(Type* type) type = type->forwarded(); std::pair ins = -type_refs.insert(std::make_pair(type, 0)); +this->impl_->type_refs.insert(std::make_pair(type, 0)); if (!ins.second) { // We've already seen this type. @@ -1011,8 +1020,8 @@ Export::write_types(int unexported_type_ { // Map from type index to type. std::vector types(static_cast(this->type_index_)); - for (Type_refs::const_iterator p = type_refs.begin(); - p != type_refs.end(); + for (Type_refs::const_iterator p = this->impl_->type_refs.begin(); + p != this->impl_->type_refs.end(); ++p) { if (p->second >= 0) @@ -1152,8 +1161,8 @@ int Export::type_index(const Type* type) { type = type->forwarded(); - Type_refs::const_iterator p = type_refs.find(type); - go_assert(p != type_refs.end()); + Type_refs::const_iterator p = this->impl_->type_refs.find(type); + go_assert(p != this->impl_->type_refs.end()); int index = p->second; go_assert(index != 0); return index; @@ -1231,7 +1240,7 @@ Export::register_builtin_type(Gogo* gogo Named_object* named_object = gogo->lookup_global(name); go_assert(named_object != NULL && named_object->is_type()); std::pair ins = -type_refs.insert(std::make_pair(named_object->type_value(), code)); +this->impl_->type_refs.insert(std::make_pair(named_object->type_value(), code)); go_assert(ins.second); // We also insert the underlying type. We can see the underlying @@ -1239,7 +1248,7 @@ Export::register_builtin_type(Gogo* gogo // fails--we expect duplications here, and it doesn't matter when // they occur. Type* real_type = named_object->type_value()->real_type(); - type_refs.insert(std::make_pair(real_type, code)); + this->impl_->type_refs.insert(std::make_pair(real_type, code)); } // Class Export::Stream. Index: gcc/go/gofrontend/export.h === --- gcc/go/gofrontend/export.h (revision 272608) +++ gcc/go/gofrontend/export.h (working copy) @@ -22,6 +22,7 @@ class Import_init_set; class Backend; class Temporary_statement; class Unnamed_label; +struct Export_impl; // Codes used for the builtin types. These are all negative to make // them easily distinct from the codes assigned by Export::write_type. @@ -121,6 +122,7 @@ class Export : public String_dump }; Export(Stream*); + ~Export(); // Size of export data magic string (which includes version numb
Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)
Segher Boessenkool writes: > Hi Gaius, > > On Fri, Jun 14, 2019 at 02:09:48PM +0100, Gaius Mulley wrote: >> here is version two of the patches which introduce Modula-2 into the >> GCC trunk. The patches include: >> >> (*) a patch to allow all front ends to register a lang spec function. >>(included are patches for all front ends to provide an empty >> callback function). >> (*) patch diffs to allow the Modula-2 front end driver to be >>built using GCC Makefile and friends. >> >> The compressed tarball includes: >> >> (*) gcc/m2 (compiler driver and lang-spec stuff for Modula-2). >>Including the need for registering lang spec functions. >> (*) gcc/testsuite/gm2 (a Modula-2 dejagnu test to ensure that >>the gm2 driver is built and can understands --version). > > I built on pwoerpc64-linux, with the patch and the tarball. > > I first need this patch, because srcdir is an absolute path for me: > > === > diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in > index e2d5098..a423a9e 100644 > --- a/gcc/m2/Make-lang.in > +++ b/gcc/m2/Make-lang.in > @@ -71,13 +71,13 @@ m2/gm2config.h: > export AR ; \ > RANLIB=`echo $(RANLIB_FOR_TARGET) | sed -e "s/^ //"` ; \ > export RANLIB ; \ > -$(SHELL) -c '../$(srcdir)/m2/configure --srcdir=../$(srcdir)/m2 > --t > +$(SHELL) -c '$(srcdir)/m2/configure --srcdir=$(srcdir)/m2 > --target= > else \ > -$(SHELL) -c '../$(srcdir)/m2/configure --srcdir=../$(srcdir)/m2 > --t > +$(SHELL) -c '$(srcdir)/m2/configure --srcdir=$(srcdir)/m2 > --target= > fi > > m2/gm2version.c: m2/gm2version.h > - cd m2 ; bash ../$(srcdir)/m2/tools-src/makeversion -p ../$(srcdir) > + cd m2 ; bash $(srcdir)/m2/tools-src/makeversion -p $(srcdir) > > # Build hooks. > > === > > (This patch might not be correct, but it works for me to get things to > build, at least). > > But then I still get build failures: it tries to run xgcc when it hasn't > been built yet. ("it" == "something", I didn't keep logs, sorry). > > I let it run overnight with -j1, and it finished. The testsuite is > running now :-) > > > Segher Hi Segher, many thanks for the patch and highlighting the relative vs absolute srcdir build. I've fixed the makeversion with a change to the script (directory option added). I'll work on the configure line and explore a solution. Yes -j1 is definitely an overnight task! regards, Gaius
Re: [PATCH, Ada, Darwin, PPC] PPC Darwin has stack check probes.
> 2019-07-01 Iain Sandoe > > * libgnat/system-darwin-ppc.ads: Set Stack_Check_Probes True for > PPC Darwin. OK, thanks. -- Eric Botcazou
[PATCH] @signbit2_dm
One more I forgot to send before. Segher 2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (signbit2_dm): Make this a parameterized name. (signbit2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 9ab9ceb..9e81df9 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -4857,13 +4857,7 @@ (define_expand "signbit2" rtx tmp = gen_reg_rtx (DImode); rtx dest_di = gen_lowpart (DImode, dest); - if (mode == KFmode) - emit_insn (gen_signbitkf2_dm (tmp, src)); - else if (mode == TFmode) - emit_insn (gen_signbittf2_dm (tmp, src)); - else - gcc_unreachable (); - + emit_insn (gen_signbit2_dm (mode, tmp, src)); emit_insn (gen_lshrdi3 (dest_di, tmp, GEN_INT (63))); DONE; } @@ -4893,7 +4887,7 @@ (define_expand "signbit2" ;; After register allocation, if the _Float128 had originally been in GPRs, the ;; split allows the post reload phases to eliminate the move, and do the shift ;; directly with the register that contains the signbit. -(define_insn_and_split "signbit2_dm" +(define_insn_and_split "@signbit2_dm" [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r") (unspec:DI [(match_operand:SIGNBIT 1 "gpc_reg_operand" "wa,r")] UNSPEC_SIGNBIT))] -- 1.8.3.1
Re: [PATCH][ARM] Add support for "noinit" attribute
On 6/13/19 9:13 AM, Christophe Lyon wrote: @@ -7131,6 +7132,18 @@ given via attribute argument. @end table +@node ARM Variable Attributes +@subsection ARM Variable Attributes + +@table @code +@item noinit +@cindex @code{noinit} variable attribute, ARM +Any data with the @code{noinit} attribute will not be initialised by +the C runtime startup code, or the program loader. Not initialising +data in this way can reduce program startup times. + +@end table + @node AVR Variable Attributes @subsection AVR Variable Attributes Minor nit: please use American spelling: initialized, initializing. -Sandra
Re: [PATCH] Prepare for prefixed instructions on PowerPC
Hi Mike, Sorry I missed this patch :-( On Thu, Jun 27, 2019 at 04:18:00PM -0400, Michael Meissner wrote: > As we discussed off-line earlier, I changed all of the "4" lengths to be "*", > even for instruction alternatives that would not be subject to being changed > to > be a prefixed instruction (i.e. the sign extend instruction "extsw"'s length > is > set to "*", even though it would not be a prefixed instruction). "*" means "use the default for this attribute", which often is nicer to see than "4". For example, "8" stands out more in a sea of "*"s. Usually "*" is the insns defined as "normal" alternatives, and "8" or "12" etc. are split. > @@ -7231,7 +7231,7 @@ (define_insn "*movcc_internal1" >(const_string "mtjmpr") >(const_string "load") >(const_string "store")]) > - (set_attr "length" "4,4,12,4,4,8,4,4,4,4,4,4")]) > + (set_attr "length" "*,*,12,*,*,8,*,*,*,*,*,*")]) In this case, the "12" and "8" are actually defined as one insn in the template, with some "\;". Luckily there aren't many of those. > @@ -7385,8 +7385,8 @@ (define_insn "*mov_softfloat" > *, *, *, *") > > (set_attr "length" > - "4, 4, 4, 4, 4, 4, > - 4, 4, 8, 4")]) > + "*, *, *, *, *, *, > + *, *, 8, *")]) [ That last line should start with a tab as well. ] The entry before the 8 is split as well. Maybe that should be "4", to stand out? I don't know what works better; your choice. > @@ -7696,8 +7696,8 @@ (define_insn "*mov_softfloat64" > *, *, *") > > (set_attr "length" > -"4, 4, 4, 4, 4, 8, > - 12, 16, 4")]) > +"*, *, *, *, *, 8, > + 12, 16, *")]) Same for the last entry here. > @@ -8760,10 +8760,10 @@ (define_insn "*movdi_internal32" >vecsimple") > (set_attr "size" "64") > (set_attr "length" > - "8, 8, 8, 4, 4, 4, > - 16,4, 4, 4, 4, 4, > - 4, 4, 4, 4, 4, 8, > - 4") > + "8, 8, 8, *, *, *, > + 16,*, *, *, *, *, > + *, *, *, *, *, 8, > + *") And the last here. > @@ -8853,11 +8853,11 @@ (define_insn "*movdi_internal64" > mftgpr,mffgpr") > (set_attr "size" "64") > (set_attr "length" > - "4, 4, 4, 4, 4, 20, > -4, 4, 4, 4, 4, 4, > -4, 4, 4, 4, 4, 4, > -4, 8, 4, 4, 4, 4, > -4, 4") > + "*, *, *, *, *, 20, > +*, *, *, *, *, *, > +*, *, *, *, *, *, > +*, 8, *, *, *, *, > +*, *") And two of the entries here. > @@ -1150,9 +1150,9 @@ (define_insn "vsx_mov_64bit" > store, load, store, *, vecsimple, > vecsimple, > vecsimple, *, *, vecstore, vecload") > (set_attr "length" > - "4, 4, 4, 8, 4, 8, > -8, 8, 8, 8, 4, 4, > -4, 20,8, 4, 4") > + "*, *, *, 8, *, 8, > +8, 8, 8, 8, *, *, > +*, 20,8, *, *") No idea which ones are split here :-) None of the * and all other would be nice, but who knows :-) Okay for trunk, maybe with some "4" if you agree that has value. Thanks! And again, sorry I missed this patch. Segher
[PATCH, Ada, Darwin, PPC] PPC Darwin has stack check probes.
On PPC, Darwin uses the same code as other parts of the port, I suspect that the False was a historical relic. (perhaps I could have applied this with my Darwin maintainer’s hat, or as obvious, but wasn’t sure for the Ada sub-tree) OK for trunk? Thanks Iain 2019-07-01 Iain Sandoe * libgnat/system-darwin-ppc.ads: Set Stack_Check_Probes True for PPC Darwin. diff --git a/gcc/ada/libgnat/system-darwin-ppc.ads b/gcc/ada/libgnat/system-darwin-ppc.ads index d314b66..9adc2de 100644 --- a/gcc/ada/libgnat/system-darwin-ppc.ads +++ b/gcc/ada/libgnat/system-darwin-ppc.ads @@ -158,7 +158,7 @@ private Preallocated_Stacks : constant Boolean := False; Signed_Zeros : constant Boolean := True; Stack_Check_Default : constant Boolean := False; - Stack_Check_Probes: constant Boolean := False; + Stack_Check_Probes: constant Boolean := True; Stack_Check_Limits: constant Boolean := False; Support_Aggregates: constant Boolean := True; Support_Atomic_Primitives : constant Boolean := Word_Size = 64;
[PATCH, i386]: Also use and split insns with SSE operands for 32bit MMX targets
Attached patch uses and splits a couple of instructions with SSE operands for 32bit MMX targets. 2019-07-01 Uroš Bizjak * config/i386/i386.md ("isa" attribute): Add sse_noavx. ("enabled" attribute): Handle sse_noavx isa attribute. * config/i386/mmx.md (*vec_dupv2sf): Add "isa" attribute. Use TARGET_SSE && SSE_REGNO_P in split condition. (*vec_dupv2sf): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline SVN. Uros. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c362a72411a2..c401deb5eb00 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -794,7 +794,7 @@ ;; Used to control the "enabled" attribute on a per-instruction basis. (define_attr "isa" "base,x64,x64_sse2,x64_sse4,x64_sse4_noavx,x64_avx,nox64, - sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx, + sse_noavx,sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx, avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f, avx512bw,noavx512bw,avx512dq,noavx512dq, avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw" @@ -820,6 +820,8 @@ (symbol_ref "TARGET_64BIT && TARGET_AVX512BW") (eq_attr "isa" "nox64") (symbol_ref "!TARGET_64BIT") (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2") +(eq_attr "isa" "sse_noavx") + (symbol_ref "TARGET_SSE && !TARGET_AVX") (eq_attr "isa" "sse2_noavx") (symbol_ref "TARGET_SSE2 && !TARGET_AVX") (eq_attr "isa" "sse3") (symbol_ref "TARGET_SSE3") diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 70413349390c..9ff86ba0eecc 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -590,12 +590,16 @@ punpckldq\t%0, %0 # #" - "TARGET_MMX_WITH_SSE && reload_completed" + "TARGET_SSE && reload_completed + && SSE_REGNO_P (REGNO (operands[0]))" [(set (match_dup 0) (vec_duplicate:V4SF (match_dup 1)))] - "operands[0] = lowpart_subreg (V4SFmode, operands[0], -GET_MODE (operands[0]));" - [(set_attr "mmx_isa" "native,sse_noavx,avx") +{ + operands[0] = lowpart_subreg (V4SFmode, operands[0], + GET_MODE (operands[0])); +} + [(set_attr "isa" "*,sse_noavx,avx") + (set_attr "mmx_isa" "native,*,*") (set_attr "type" "mmxcvt,ssemov,ssemov") (set_attr "mode" "DI,TI,TI")]) @@ -1560,12 +1564,16 @@ # # #" - "TARGET_MMX_WITH_SSE && reload_completed" + "TARGET_SSE && reload_completed + && SSE_REGNO_P (REGNO (operands[0]))" [(set (match_dup 0) (vec_duplicate:V4SI (match_dup 1)))] - "operands[0] = lowpart_subreg (V4SImode, operands[0], -GET_MODE (operands[0]));" - [(set_attr "mmx_isa" "native,sse_noavx,avx,avx") +{ + operands[0] = lowpart_subreg (V4SImode, operands[0], + GET_MODE (operands[0])); +} + [(set_attr "isa" "*,sse_noavx,avx,avx") + (set_attr "mmx_isa" "native,*,*,*") (set_attr "type" "mmxcvt,ssemov,ssemov,ssemov") (set_attr "mode" "DI,TI,TI,TI")])
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On 7/1/19 10:33 AM, Richard Biener wrote: On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor wrote:> [Adding gcc-patches] Richard, do you have any further comments or is the revised patch good to commit? No further comments from my side - it's good to commit. After running a full bootstrap with the patch with the static_assert I found that the I didn't fully understand the KeyId type/compare_type requirements. Like value_type, this type too can be a non-POD type. It just needs a suitable Traits (AKA Descriptor) class. I've updated the comments to reflect that and removed the static_assert and checked in the original version of the change with better comments in r272893. Sorry about that hiccup. Martin Richard. Martin On 6/25/19 2:30 PM, Martin Sebor wrote: On 6/25/19 3:53 AM, Jonathan Wakely wrote: On 24/06/19 19:42 +0200, Richard Biener wrote: On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: On 6/24/19 6:11 AM, Richard Biener wrote: On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: On 6/21/19 6:06 AM, Richard Biener wrote: On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. Hum. I am worried about the difference of assignment vs. construction in ::put() + bool ins = hash_entry::is_empty (*e); + if (ins) + { + e->m_key = k; + new ((void *) &e->m_value) Value (v); + } + else + e->m_value = v; why not invoke the dtor on the old value and then the ctor again? It wouldn't work for self-assignment: Value &y = m.get_or_insert (key); m.put (key, y); The usual rule of thumb for writes into containers is to use construction when creating a new element and assignment when replacing the value of an existing element. Which reminds me that the hash containers, despite being copy- constructible (at least for POD types, they aren't for user- defined types), also aren't safe for assignment even for PODs. I opened bug 90959 for this. Until the assignment is fixed I made it inaccessibe in the patch (I have fixed the copy ctor to DTRT for non-PODs). How is an empty hash_entry constructed? In hash_table::find_slot_with_hash simply by finding an empty slot and returning a pointer to it. The memory for the slot is marked "empty" by calling the Traits::mark_empty() function. The full type of hash_map is actually hash_mapsimple_hashmap_traits, Value> and simple_hashmap_traits delegates it to default_hash_traits whose mark_empty() just clears the void*, leaving the Value part uninitialized. That makes sense because we don't want to call ctors for empty entries. I think the questions one might ask if one were to extend the design are: a) what class should invoke the ctor/assignment and b) should it do it directly or via the traits? ::remove() doesn't seem to invoke the dtor either, instead it relies on the traits::remove function? Yes. There is no Traits::construct or assign or copy. We could add them but I'm not sure I see to what end (there could be use cases, I just don't know enough about these classes to think of any). Attached is an updated patch with the additional minor fixes mentioned above. Martin PS I think we would get much better results by adopting the properly designed and tested standard library containers than by spending time trying to improve the design of these legacy classes. For simple uses that don't need to integrate with the GC machinery the standard containers should be fine (plus, it'd provide us with greater motivation to improve them and the code GCC emits for their uses). Unfortunately, to be able to use the hash-based containers we would need to upgrade to C++ 11. Isn't it time yet? I don't think so. I'm also not sure if C++11 on its own is desirable or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. SLES 11 already struggles currently (GCC 4.3) but I'd no longer consider that important enough. Note any such change to libstdc++ containers should be complete and come with both code-size and compile-time and memory-usage measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + {
[PATCH 10/12] @abs2_hw
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (abs2_hw): Make this a parameterized name. (abs2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 27fdc4f..974f0b1 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8117,12 +8117,7 @@ (define_expand "abs2" { if (TARGET_FLOAT128_HW) { - if (mode == TFmode) - emit_insn (gen_abstf2_hw (operands[0], operands[1])); - else if (mode == KFmode) - emit_insn (gen_abskf2_hw (operands[0], operands[1])); - else - FAIL; + emit_insn (gen_abs2_hw (mode, operands[0], operands[1])); DONE; } else if (TARGET_FLOAT128_TYPE) @@ -13908,7 +13903,7 @@ (define_insn "@neg2_hw" (set_attr "size" "128")]) -(define_insn "abs2_hw" +(define_insn "@abs2_hw" [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") (abs:IEEE128 (match_operand:IEEE128 1 "altivec_register_operand" "v")))] -- 1.8.3.1
[PATCH 11/12] @ieee_128bit_vsx_neg2
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (ieee_128bit_vsx_neg2): Make this a parameterized name. (neg2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 974f0b1..86acaae 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8070,14 +8070,8 @@ (define_expand "neg2" if (TARGET_FLOAT128_HW) emit_insn (gen_neg2_hw (mode, operands[0], operands[1])); else if (TARGET_FLOAT128_TYPE) - { - if (mode == TFmode) - emit_insn (gen_ieee_128bit_vsx_negtf2 (operands[0], operands[1])); - else if (mode == KFmode) - emit_insn (gen_ieee_128bit_vsx_negkf2 (operands[0], operands[1])); - else - gcc_unreachable (); - } + emit_insn (gen_ieee_128bit_vsx_neg2 (mode, +operands[0], operands[1])); else { rtx libfunc = optab_libfunc (neg_optab, mode); @@ -8189,7 +8183,7 @@ (define_expand "ieee_128bit_negative_zero" ;; twiddle the sign bit. Later GCSE passes can then combine multiple uses of ;; neg/abs to create the constant just once. -(define_insn_and_split "ieee_128bit_vsx_neg2" +(define_insn_and_split "@ieee_128bit_vsx_neg2" [(set (match_operand:IEEE128 0 "register_operand" "=wa") (neg:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa"))) (clobber (match_scratch:V16QI 2 "=v"))] -- 1.8.3.1
[PATCH 12/12] @ieee_128bit_vsx_abs2
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (ieee_128bit_vsx_abs2): Make this a parameterized name. (abs2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 86acaae..9e81df9 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8116,12 +8116,8 @@ (define_expand "abs2" } else if (TARGET_FLOAT128_TYPE) { - if (mode == TFmode) - emit_insn (gen_ieee_128bit_vsx_abstf2 (operands[0], operands[1])); - else if (mode == KFmode) - emit_insn (gen_ieee_128bit_vsx_abskf2 (operands[0], operands[1])); - else - FAIL; + emit_insn (gen_ieee_128bit_vsx_abs2 (mode, + operands[0], operands[1])); DONE; } else @@ -8212,7 +8208,7 @@ (define_insn "*ieee_128bit_vsx_neg2_internal" [(set_attr "type" "veclogical")]) ;; IEEE 128-bit absolute value -(define_insn_and_split "ieee_128bit_vsx_abs2" +(define_insn_and_split "@ieee_128bit_vsx_abs2" [(set (match_operand:IEEE128 0 "register_operand" "=wa") (abs:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa"))) (clobber (match_scratch:V16QI 2 "=v"))] -- 1.8.3.1
[PATCH 09/12] @neg2_hw
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (neg2_hw): Make this a parameterized name. (neg2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 5b3e458..27fdc4f 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8068,14 +8068,7 @@ (define_expand "neg2" if (FLOAT128_IEEE_P (mode)) { if (TARGET_FLOAT128_HW) - { - if (mode == TFmode) - emit_insn (gen_negtf2_hw (operands[0], operands[1])); - else if (mode == KFmode) - emit_insn (gen_negkf2_hw (operands[0], operands[1])); - else - gcc_unreachable (); - } + emit_insn (gen_neg2_hw (mode, operands[0], operands[1])); else if (TARGET_FLOAT128_TYPE) { if (mode == TFmode) @@ -13905,7 +13898,7 @@ (define_insn "copysign3_soft" [(set_attr "type" "veccomplex") (set_attr "length" "8")]) -(define_insn "neg2_hw" +(define_insn "@neg2_hw" [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") (neg:IEEE128 (match_operand:IEEE128 1 "altivec_register_operand" "v")))] -- 1.8.3.1
[PATCH 05/12] @ctr
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (ctr): Make this a parameterized name. (doloop_end): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 18 +- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index d665316..381f140 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -12566,22 +12566,14 @@ (define_expand "doloop_end" (use (match_operand 1))]; label "" { - if (TARGET_64BIT) -{ - if (GET_MODE (operands[0]) != DImode) - FAIL; - emit_jump_insn (gen_ctrdi (operands[0], operands[1])); -} - else -{ - if (GET_MODE (operands[0]) != SImode) - FAIL; - emit_jump_insn (gen_ctrsi (operands[0], operands[1])); -} + if (GET_MODE (operands[0]) != Pmode) +FAIL; + + emit_jump_insn (gen_ctr (Pmode, operands[0], operands[1])); DONE; }) -(define_expand "ctr" +(define_expand "@ctr" [(parallel [(set (pc) (if_then_else (ne (match_operand:P 0 "register_operand") (const_int 1)) -- 1.8.3.1
[PATCH 08/12] @extenddf2
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (extenddf2): Make this a parameterized name. (floatsi2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 3235eb2..5b3e458 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7775,7 +7775,7 @@ (define_insn_and_split "*mov_softfloat" (const_string "8") (const_string "16"))])]) -(define_expand "extenddf2" +(define_expand "@extenddf2" [(set (match_operand:FLOAT128 0 "gpc_reg_operand") (float_extend:FLOAT128 (match_operand:DF 1 "gpc_reg_operand")))] "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128" @@ -7922,12 +7922,7 @@ (define_expand "floatsi2" { rtx tmp = gen_reg_rtx (DFmode); expand_float (tmp, op1, false); - if (mode == TFmode) - emit_insn (gen_extenddftf2 (op0, tmp)); - else if (mode == IFmode) - emit_insn (gen_extenddfif2 (op0, tmp)); - else - gcc_unreachable (); + emit_insn (gen_extenddf2 (mode, op0, tmp)); DONE; } }) -- 1.8.3.1
[PATCH 06/12] @eh_set_lr_
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (eh_set_lr_): Make this a parameterized name. (eh_return): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 381f140..881efe1 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -13184,15 +13184,12 @@ (define_expand "eh_return" [(use (match_operand 0 "general_operand"))] "" { - if (TARGET_32BIT) -emit_insn (gen_eh_set_lr_si (operands[0])); - else -emit_insn (gen_eh_set_lr_di (operands[0])); + emit_insn (gen_eh_set_lr (Pmode, operands[0])); DONE; }) ; We can't expand this before we know where the link register is stored. -(define_insn_and_split "eh_set_lr_" +(define_insn_and_split "@eh_set_lr_" [(unspec_volatile [(match_operand:P 0 "register_operand" "r")] UNSPECV_EH_RR) (clobber (match_scratch:P 1 "=&b"))] "" -- 1.8.3.1
[PATCH 07/12] @extenddf2_{fprs,vsx}
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (extenddf2_fprs): Make this a parameterized name. (extenddf2_vsx): Make this a parameterized name. (extenddf2): Use those names. Simplify. --- gcc/config/rs6000/rs6000.md | 23 ++- 1 file changed, 6 insertions(+), 17 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 881efe1..3235eb2 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7783,31 +7783,20 @@ (define_expand "extenddf2" if (FLOAT128_IEEE_P (mode)) rs6000_expand_float128_convert (operands[0], operands[1], false); else if (TARGET_VSX) -{ - if (mode == TFmode) - emit_insn (gen_extenddftf2_vsx (operands[0], operands[1])); - else if (mode == IFmode) - emit_insn (gen_extenddfif2_vsx (operands[0], operands[1])); - else - gcc_unreachable (); -} - else +emit_insn (gen_extenddf2_vsx (mode, operands[0], operands[1])); + else { rtx zero = gen_reg_rtx (DFmode); rs6000_emit_move (zero, CONST0_RTX (DFmode), DFmode); - if (mode == TFmode) - emit_insn (gen_extenddftf2_fprs (operands[0], operands[1], zero)); - else if (mode == IFmode) - emit_insn (gen_extenddfif2_fprs (operands[0], operands[1], zero)); - else - gcc_unreachable (); + emit_insn (gen_extenddf2_fprs (mode, +operands[0], operands[1], zero)); } DONE; }) ;; Allow memory operands for the source to be created by the combiner. -(define_insn_and_split "extenddf2_fprs" +(define_insn_and_split "@extenddf2_fprs" [(set (match_operand:IBM128 0 "gpc_reg_operand" "=d,d,&d") (float_extend:IBM128 (match_operand:DF 1 "nonimmediate_operand" "d,m,d"))) @@ -7826,7 +7815,7 @@ (define_insn_and_split "extenddf2_fprs" operands[4] = simplify_gen_subreg (DFmode, operands[0], mode, lo_word); }) -(define_insn_and_split "extenddf2_vsx" +(define_insn_and_split "@extenddf2_vsx" [(set (match_operand:IBM128 0 "gpc_reg_operand" "=d,d") (float_extend:IBM128 (match_operand:DF 1 "nonimmediate_operand" "wa,m")))] -- 1.8.3.1
[PATCH 04/12] @indirect_jump_nospec
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (indirect_jump_nospec): Make this a parameterized name. (indirect_jump): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index ca8b0c0..d665316 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -12410,10 +12410,7 @@ (define_expand "indirect_jump" { if (!rs6000_speculate_indirect_jumps) { rtx ccreg = gen_reg_rtx (CCmode); -if (Pmode == DImode) - emit_jump_insn (gen_indirect_jumpdi_nospec (operands[0], ccreg)); -else - emit_jump_insn (gen_indirect_jumpsi_nospec (operands[0], ccreg)); +emit_jump_insn (gen_indirect_jump_nospec (Pmode, operands[0], ccreg)); DONE; } }) @@ -12425,7 +12422,7 @@ (define_insn "*indirect_jump" "b%T0" [(set_attr "type" "jmpreg")]) -(define_insn "indirect_jump_nospec" +(define_insn "@indirect_jump_nospec" [(set (pc) (match_operand:P 0 "register_operand" "c,*l")) (clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))] "!rs6000_speculate_indirect_jumps" -- 1.8.3.1
[PATCH 03/12] @abs2_internal
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (abs2_internal): Make this a parameterized name. (abs2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 48ead5e..ca8b0c0 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8163,17 +8163,12 @@ (define_expand "abs2" } label = gen_label_rtx (); - if (mode == TFmode) -emit_insn (gen_abstf2_internal (operands[0], operands[1], label)); - else if (mode == IFmode) -emit_insn (gen_absif2_internal (operands[0], operands[1], label)); - else -FAIL; + emit_insn (gen_abs2_internal (mode, operands[0], operands[1], label)); emit_label (label); DONE; }) -(define_expand "abs2_internal" +(define_expand "@abs2_internal" [(set (match_operand:IBM128 0 "gpc_reg_operand") (match_operand:IBM128 1 "gpc_reg_operand")) (set (match_dup 3) (match_dup 5)) -- 1.8.3.1
[PATCH 02/12] @fix_truncsi2_fprs
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (fix_truncsi2_fprs): Make this a parameterized name. (fix_truncsi2): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 63823c4..48ead5e 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -7969,17 +7969,13 @@ (define_expand "fix_truncsi2" { if (FLOAT128_IEEE_P (mode)) rs6000_expand_float128_convert (op0, op1, false); - else if (mode == TFmode) - emit_insn (gen_fix_trunctfsi2_fprs (op0, op1)); - else if (mode == IFmode) - emit_insn (gen_fix_truncifsi2_fprs (op0, op1)); else - gcc_unreachable (); + emit_insn (gen_fix_truncsi2_fprs (mode, op0, op1)); DONE; } }) -(define_expand "fix_truncsi2_fprs" +(define_expand "@fix_truncsi2_fprs" [(parallel [(set (match_operand:SI 0 "gpc_reg_operand") (fix:SI (match_operand:IBM128 1 "gpc_reg_operand"))) (clobber (match_dup 2)) -- 1.8.3.1
[PATCH 01/12] @neg2
2019-07-01 Segher Boessenkool * config/rs6000/rs6000.md (neg2): Make this a parameterized name. (allocate_stack): Use that name. Simplify. --- gcc/config/rs6000/rs6000.md | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index d0d272a..63823c4 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -2249,7 +2249,7 @@ (define_insn "subf3_carry_in_xx" [(set_attr "type" "add")]) -(define_insn "neg2" +(define_insn "@neg2" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") (neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))] "" @@ -9810,10 +9810,7 @@ (define_expand "allocate_stack" { operands[1] = force_reg (Pmode, operands[1]); neg_op0 = gen_reg_rtx (Pmode); - if (TARGET_32BIT) - emit_insn (gen_negsi2 (neg_op0, operands[1])); - else - emit_insn (gen_negdi2 (neg_op0, operands[1])); + emit_insn (gen_neg2 (Pmode, neg_op0, operands[1])); } else neg_op0 = GEN_INT (-INTVAL (operands[1])); -- 1.8.3.1
[PATCH 00/12] rs6000: Use parameterised names
This series makes the rs6000 backend use parameterised names. This means adding an "@" to the start of pattern names, removing the mode from the gen_* names where you call them, and adding an extra mode parameter to those calls (as the first argument). The abs and neg patterns used to call FAIL for some unexpected modes, they no longer do. It should have maybe used gcc_unreachable there instead. These patches also remove such gcc_unreachable. Tested all on powerpc64-linux {-m32,-m64} and on p9 powerpc64le-linux. Committing to trunk. Segher Segher Boessenkool (12): @neg2 @fix_truncsi2_fprs @abs2_internal @indirect_jump_nospec @ctr @eh_set_lr_ @extenddf2_{fprs,vsx} @extenddf2 @neg2_hw @abs2_hw @ieee_128bit_vsx_neg2 @ieee_128bit_vsx_abs2 gcc/config/rs6000/rs6000.md | 130 +++- 1 file changed, 33 insertions(+), 97 deletions(-) -- 1.8.3.1
Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format
Hi, Liška I incorporated all your suggestions to the patch, but before sending the v2 I want to discuss some errors reported by the style script. >=== ERROR type #1: blocks of 8 spaces should be replaced with tabs (16 >error(s)) === >gcc/lto/lto-dump.c:263:22:" -list [options] Dump the >symbol list.\n" >gcc/lto/lto-dump.c:264:18:"-demangle Dump the >demangled output.\n" >gcc/lto/lto-dump.c:265:22:"-defined-only Dump only the >defined symbols.\n" >gcc/lto/lto-dump.c:266:21:"-print-valueDump initial >values of the variables.\n" >gcc/lto/lto-dump.c:267:19:"-name-sort Sort the >symbols alphabetically.\n" >gcc/lto/lto-dump.c:268:19:"-size-sort Sort the >symbols according to size.\n" >gcc/lto/lto-dump.c:269:22:"-reverse-sort Dump the >symbols in reverse order.\n" >gcc/lto/lto-dump.c:270:15:" -symbol= Dump the >details of specific symbol.\n" >gcc/lto/lto-dump.c:271:15:" -objects Dump the >details of LTO objects.\n" >gcc/lto/lto-dump.c:272:17:" -callgraphDump the >callgraph in graphviz format.\n" >gcc/lto/lto-dump.c:273:18:" -type-stats Dump >statistics of tree types.\n" >gcc/lto/lto-dump.c:274:18:" -tree-stats Dump >statistics of trees.\n" >gcc/lto/lto-dump.c:275:20:" -gimple-stats Dump >statistics of gimple statements.\n" >gcc/lto/lto-dump.c:276:18:" -dump-body= Dump the >specific gimple body.\n" >gcc/lto/lto-dump.c:277:19:" -dump-level= Deciding the >optimization level of body.\n" >gcc/lto/lto-dump.c:278:12:" -help Display the >dump tool help.\n"; > Is this correct? I think that replacing these spaces with tabs will do bad things in some terminals. >=== ERROR type #2: there should be exactly one space between function >name and parenthesis (1 error(s)) === >gcc/lto/lang.opt:131:11:LTODump Var(flag_dump_callgraph) I just followed the pattern in that file. Unless everything there is wrong, the style in the patch is correct. > >=== ERROR type #3: there should be no space before a left square bracket >(2 error(s)) === >gcc/lto/lto-dump.c:261:21:"Usage: lto-dump [OPTION]... SUB_COMMAND >[OPTION]...\n\n" >gcc/lto/lto-dump.c:263:13:" -list [options] Dump the >symbol list.\n" > >=== ERROR type #4: trailing operator (1 error(s)) === >gcc/lto/lto-dump.c:260:18: const char *msg = I think doing it in this way is cleaner than the previous version, where there was a printf for each line. Therefore this change is less noisy because I just write the string with keeping in mind with what will be displayed in the terminal, and then print it. Of course this is my point of view and it is completely subjective. Any other style error suggested by the script were fixed. On 07/01, Martin Liška wrote: > On 7/1/19 2:56 PM, Giuliano Belinassi wrote: > > Hi, > > > > On 07/01, Martin Jambor wrote: > >> On Sat, Jun 29 2019, Giuliano Belinassi wrote: > >>> This patch makes lto-dump dump the symbol table callgraph in graphviz > >>> format. > >> -ENOPATCH > > Sorry, > > I forgot the most important. Here it is. > > Hello. > > I like the intention! Please try to use: > $ git diff HEAD > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py > /tmp/patch > > which will show you some coding style issues. > > > > >> Martin > >> > >>> I've not added any test to this because I couldn't find a way to call > >>> lto-dump from the testsuite. Also, any feedback with regard to how can > >>> I improve this is welcome. > >>> > >>> gcc/ChangeLog > >>> 2019-06-29 Giuliano Belinassi > >>> > >>> * cgraph.c (dump_graphviz): New function > >>> * cgraph.h (dump_graphviz): New function > >>> * symtab.c (dump_graphviz): New function > >>> * varpool.c (dump_graphviz): New function > >>> > >>> gcc/lto/ChangeLog > >>> 2019-06-29 Giuliano Belinassi > >>> > >>> * lang.opt (flag_dump_callgraph): New flag > >>> * lto-dump.c (dump_symtab_graphviz): New function > >>> * lto-dump.c (dump_tool_help): New option > >>> * lto-dump.c (lto_main): New option > > Giuliano > > > > === > > > > diff --git gcc/cgraph.c gcc/cgraph.c > > index 28019aba434..55f4ee0bdaa 100644 > > > > --- gcc/cgraph.c > > > > +++ gcc/cgraph.c > > > > @@ -2204,6 +2204,22 @@ > > > > cgraph_node::dump (FILE *f) > > > > } > > } > > +/* Dump call graph node to file F in graphviz format. */ > > + > > +void > > +cgraph_node::dump_graphviz (FILE *f) > > +{ > > + cgraph_edge *edge; > > + > > + for (edge = callees; edge; edge = edge->next_callee) > > + { > > + cgraph_node *callee = edge->callee; > > + > > + fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ()); > > + } > > +} > > + > > + > > /* Dump ca
[committed] Fix longjmp expander on hppa
With the recent update to setjmp, we now need to restore the hard frame pointer directly from the saved frame pointer. We don't need to adjust for the offset between the virtual_stack_vars_rtx and the hard frame pointer. Tested on hppa-unknwon-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11. Committed to trunk. Dave -- John David Anglin dave.ang...@bell.net 2019-07-01 Wilco Dijkstra John David Anglin setjmp PR target/90963 * config/pa/pa.md (builtin_longjmp): Restore hard_frame_pointer_rtx using saved frame pointer. Index: config/pa/pa.md === --- config/pa/pa.md (revision 272852) +++ config/pa/pa.md (working copy) @@ -8700,9 +8700,7 @@ restoring the gp. */ emit_move_insn (pv, lab); - /* Restore the stack and frame pointers. The virtual_stack_vars_rtx - is saved instead of the hard_frame_pointer_rtx in the save area. - We need to adjust for the offset between these two values. */ + /* Restore the stack and frame pointers. */ fp = copy_to_reg (fp); emit_stack_restore (SAVE_NONLOCAL, stack); @@ -8710,7 +8708,7 @@ emit_insn (gen_blockage ()); emit_clobber (hard_frame_pointer_rtx); emit_clobber (frame_pointer_rtx); - emit_move_insn (hard_frame_pointer_rtx, plus_constant (Pmode, fp, -8)); + emit_move_insn (hard_frame_pointer_rtx, fp); emit_use (hard_frame_pointer_rtx); emit_use (stack_pointer_rtx);
Re: [PING][AArch64] Use scvtf fbits option where appropriate
On Wed, Jun 26, 2019 at 10:35:00AM +0100, Joel Hutton wrote: > Ping, plus minor rework (mostly non-functional changes) > > gcc/ChangeLog: > > 2019-06-12 Joel Hutton > > * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New > prototype > * config/aarch64/aarch64.c (aarch64_fpconst_pow2_recip): New function > * config/aarch64/aarch64.md > (*aarch64_cvtf2_mult): New pattern Cool; I learned a new instruction! > (*aarch64_cvtf2_mult): New pattern > * config/aarch64/constraints.md (Dt): New constraint > * config/aarch64/predicates.md (aarch64_fpconst_pow2_recip): New > predicate > > gcc/testsuite/ChangeLog: > > 2019-06-12 Joel Hutton > > * gcc.target/aarch64/fmul_scvtf_1.c: New test. This testcase will fail on ILP32 targets where unsigned long will still live in a 'w' register. Thanks, James
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor wrote:> > [Adding gcc-patches] > > Richard, do you have any further comments or is the revised patch > good to commit? No further comments from my side - it's good to commit. Richard. > Martin > > On 6/25/19 2:30 PM, Martin Sebor wrote: > > On 6/25/19 3:53 AM, Jonathan Wakely wrote: > >> On 24/06/19 19:42 +0200, Richard Biener wrote: > >>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: > > On 6/24/19 6:11 AM, Richard Biener wrote: > > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor > wrote: > >> > >> On 6/21/19 6:06 AM, Richard Biener wrote: > >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor > wrote: > > Bug 90923 shows that even though GCC hash-table based containers > like hash_map can be instantiated on types with user-defined ctors > and dtors they invoke the dtors of such types without invoking > the corresponding ctors. > > It was thanks to this bug that I spent a day debugging > "interesting" > miscompilations during GCC bootstrap (in fairness, it was that and > bug 90904 about auto_vec copy assignment/construction also being > hosed even for POD types). > > The attached patch corrects the hash_map and hash_set templates > to invoke the ctors of the elements they insert and makes them > (hopefully) safe to use with non-trivial user-defined types. > >>> > >>> Hum. I am worried about the difference of assignment vs. > construction > >>> in ::put() > >>> > >>> + bool ins = hash_entry::is_empty (*e); > >>> + if (ins) > >>> + { > >>> + e->m_key = k; > >>> + new ((void *) &e->m_value) Value (v); > >>> + } > >>> + else > >>> + e->m_value = v; > >>> > >>> why not invoke the dtor on the old value and then the ctor again? > >> > >> It wouldn't work for self-assignment: > >> > >> Value &y = m.get_or_insert (key); > >> m.put (key, y); > >> > >> The usual rule of thumb for writes into containers is to use > >> construction when creating a new element and assignment when > >> replacing the value of an existing element. > >> > >> Which reminds me that the hash containers, despite being copy- > >> constructible (at least for POD types, they aren't for user- > >> defined types), also aren't safe for assignment even for PODs. > >> I opened bug 90959 for this. Until the assignment is fixed > >> I made it inaccessibe in the patch (I have fixed the copy > >> ctor to DTRT for non-PODs). > >> > >>> How is an empty hash_entry constructed? > >> > >> In hash_table::find_slot_with_hash simply by finding an empty > >> slot and returning a pointer to it. The memory for the slot > >> is marked "empty" by calling the Traits::mark_empty() function. > >> > >> The full type of hash_map is actually > >> > >> hash_map >> Value, > >> simple_hashmap_traits, > >>Value> > >> > >> and simple_hashmap_traits delegates it to default_hash_traits > >> whose mark_empty() just clears the void*, leaving the Value > >> part uninitialized. That makes sense because we don't want > >> to call ctors for empty entries. I think the questions one > >> might ask if one were to extend the design are: a) what class > >> should invoke the ctor/assignment and b) should it do it > >> directly or via the traits? > >> > >>> ::remove() doesn't > >>> seem to invoke the dtor either, instead it relies on the > >>> traits::remove function? > >> > >> Yes. There is no Traits::construct or assign or copy. We > >> could add them but I'm not sure I see to what end (there could > >> be use cases, I just don't know enough about these classes to > >> think of any). > >> > >> Attached is an updated patch with the additional minor fixes > >> mentioned above. > >> > >> Martin > >> > >> PS I think we would get much better results by adopting > >> the properly designed and tested standard library containers > >> than by spending time trying to improve the design of these > >> legacy classes. For simple uses that don't need to integrate > >> with the GC machinery the standard containers should be fine > >> (plus, it'd provide us with greater motivation to improve > >> them and the code GCC emits for their uses). Unfortunately, > >> to be able to use the hash-based containers we would need to > >> upgrade to C++ 11. Isn't it time yet? > > > > I don't think so. I'm also not sure if C++11 on its own is desirable > > or if it should
Re: [PATCH][AArch64] Remove constraint strings from define_expand constructs in the back end
On Mon, Jun 24, 2019 at 04:33:40PM +0100, Dennis Zhang wrote: > Hi, > > A number of AArch64 define_expand patterns have specified constraints > for their operands. But the constraint strings are ignored at expand > time and are therefore redundant/useless. We now avoid specifying > constraints in new define_expands, but we should clean up the existing > define_expand definitions. > > For example, the constraint "=w" is removed in the following case: > (define_expand "sqrt2" >[(set (match_operand:GPF_F16 0 "register_operand" "=w") > The "" marks with an empty constraint in define_expand are removed as well. > > The patch is tested with the build configuration of > --target=aarch64-none-linux-gnu, and it passes gcc/testsuite. This is OK for trunk. Thanks, James > gcc/ChangeLog: > > 2019-06-21 Dennis Zhang > > * config/aarch64/aarch64-simd.md: Remove redundant constraints > from define_expand. > * config/aarch64/aarch64-sve.md: Likewise. > * config/aarch64/aarch64.md: Likewise. > * config/aarch64/atomics.md: Likewise.
Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch
On Mon, Jul 1, 2019 at 4:05 PM Joern Wolfgang Rennecke wrote: > > [Apologies if this is a duplicate, I'm unsure if my previous mail was > delivered] > On 01/07/19 12:38, Richard Biener wrote: > > On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke > > wrote: > >> The heuristic introduced for PR71016 prevents recognizing a max / min > >> combo like it is used for > >> saturation when followed by a conversion. > >> The attached patch refines the heuristic to allow this case. Regression > >> tested on x86_64-pc-linux-gnu . > > Few style nits: > > > >... > > > > also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt) > > otherwise you'd also allow MIN/MAX unrelated to the conversion detected. > > Like the attached patch? Yes - minor nit: + if (!operand_equal_p +(lhs, gimple_assign_rhs1 (arg0_def_stmt), 0)) + return NULL; you can use if (lhs != gimple_assign_rhs1 (arg0_def_stmt)) return NULL; since SSA names are shared tree nodes. OK with that change. > > > > On x86 I see the MIN_EXPR is already detected by GENERIC folding, > > I wonder if that is required or if we can handle the case without in one > > phiopt pass invocation as well. > > > tree_ssa_phiopt_worker is supposed to handle this by handling nested > COND_EXPRs > from the inside out: > > /* Search every basic block for COND_EXPR we may be able to optimize. > > We walk the blocks in order that guarantees that a block with > a single predecessor is processed before the predecessor. > This ensures that we collapse inner ifs before visiting the > outer ones, and also that we do not try to visit a removed > block. */ >bb_order = single_pred_before_succ_order (); > > However, I have no idea how to construct a testcase for this with the > gimple folding in place. > > #define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x)) > > void > foo (unsigned char *p, int i, int m) > { >*p = (m > SAT (i) ? m : SAT (i)); > } > > or the equivalent for MIN_EXPR, *.c.004.original already has only one > conditional left. Hmm. The testcase in your patch should be equal to void foo (unsigned char *p, int i) { // tem1 = MAX ((unsinged char)MIN (i, 255), 0) unsigned char tem1; if (i >= 0) { // tem2 = MIN (i, 255) int tem2; if (i < 255) tem2 = i; else tem2 = 255; // tem1 = (unsigned char) tem2; tem1 = tem2; } else tem1 = 0; *p = tem1; } but for this I don't see any MIN/MAX recognition :/ Anyhow - probably a stupid mistake on my side ;) Richard. >
[SPARC] Fix PR middle-end/64242
SPARC defines a nonlocal_goto pattern to which the same adjustment needs to be applied as in the middle-end. Tested on SPARC/Solaris, applied on mainline and 9 branch. 2019-07-01 Eric Botcazou PR middle-end/64242 * config/sparc/sparc.md (nonlocal_goto): Restore frame pointer last. Add frame clobber and schedule blockage. -- Eric BotcazouIndex: config/sparc/sparc.md === --- config/sparc/sparc.md (revision 272633) +++ config/sparc/sparc.md (working copy) @@ -7381,7 +7381,7 @@ (define_expand "nonlocal_goto" "" { rtx i7 = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM); - rtx r_label = copy_to_reg (operands[1]); + rtx r_label = operands[1]; rtx r_sp = adjust_address (operands[2], Pmode, 0); rtx r_fp = operands[3]; rtx r_i7 = adjust_address (operands[2], Pmode, GET_MODE_SIZE (Pmode)); @@ -7394,9 +7394,18 @@ (define_expand "nonlocal_goto" emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode))); emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx)); - /* Restore frame pointer for containing function. */ - emit_move_insn (hard_frame_pointer_rtx, r_fp); + r_label = copy_to_reg (r_label); + + /* Restore the frame pointer and stack pointer. We must use a + temporary since the setjmp buffer may be a local. */ + r_fp = copy_to_reg (r_fp); emit_stack_restore (SAVE_NONLOCAL, r_sp); + r_i7 = copy_to_reg (r_i7); + + /* Ensure the frame pointer move is not optimized. */ + emit_insn (gen_blockage ()); + emit_clobber (hard_frame_pointer_rtx); + emit_move_insn (hard_frame_pointer_rtx, r_fp); emit_move_insn (i7, r_i7); /* USE of hard_frame_pointer_rtx added for consistency; @@ -7405,8 +7414,7 @@ (define_expand "nonlocal_goto" emit_use (stack_pointer_rtx); emit_use (i7); - emit_jump_insn (gen_indirect_jump (r_label)); - emit_barrier (); + emit_indirect_jump (r_label); DONE; })
Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5
On Mon, Jul 01, 2019 at 04:13:40PM +0100, Kyrill Tkachov wrote: > Hi all, > > Some scheduling descriptions, like the Cortex-A57 one, are reused for > multiple -mcpu options. > Sometimes those other -mcpu cores support more architecture features > than the Armv8-A Cortex-A57. > For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as > the Dot Product instructions. > These Dot Product instructions have the neon_dot and neon_dot_q > scheduling type, but that type is not > handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to > care about these instructions. > > But if we just ignore the neon_dot(_q) type at scheduling we get really > terrible codegen when compiling > for -mcpu=cortex-a76, for example, because the scheduler just pools all > the UDOT instructions at the end > of the basic block, since it doesn't assume anything about their behaviour. > > This patch ameliorates the situation somewhat by telling the Cortex-A57 > scheduling model to treat any > insn that doesn't get assigned a cortex_a57_neon_type but is actually a > is_neon_type instruction as > a simple neon_arith_basic instruction. This allows us to treat > post-Armv8-A SIMD instructions more sanely > without having to model each of them explicitly in cortex-a57.md. > > Bootstrapped and tested on arm-none-linux-gnueabihf and > aarch64-none-linux-gnu. > > Ok for trunk from an aarch64 perspective? OK. Thansk, James > > Thanks, > Kyrill >
Re: [PATCH][ARM] Add support for "noinit" attribute
Hi Christophe, On 6/13/19 4:13 PM, Christophe Lyon wrote: Hi, Similar to what already exists for TI msp430 or in TI compilers for arm, this patch adds support for "noinit" attribute for arm. It's very similar to the corresponding code in GCC for msp430. It is useful for embedded targets where the user wants to keep the value of some data when the program is restarted: such variables are not zero-initialized.It is mostly a helper/shortcut to placing variables in a dedicated section. It's probably desirable to add the following chunk to the GNU linker: diff --git a/ld/emulparams/armelf.sh b/ld/emulparams/armelf.sh index 272a8bc..9555cec 100644 --- a/ld/emulparams/armelf.sh +++ b/ld/emulparams/armelf.sh @@ -10,7 +10,19 @@ OTHER_TEXT_SECTIONS='*(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)' OTHER_BSS_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__bss_start__ = .${CREATE_SHLIB+)};" OTHER_BSS_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}_bss_end__ = .${CREATE_SHLIB+)}; ${CREATE_SHLIB+PROVIDE (}__bss_end__ = .${CREATE_SHLIB+)};" OTHER_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__end__ = .${CREATE_SHLIB+)};" -OTHER_SECTIONS='.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }' +OTHER_SECTIONS=' +.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) } + /* This section contains data that is not initialised during load + *or* application reset. */ + .noinit (NOLOAD) : + { + . = ALIGN(2); + PROVIDE (__noinit_start = .); + *(.noinit) + . = ALIGN(2); + PROVIDE (__noinit_end = .); + } +' so that the noinit section has the "NOLOAD" flag. I'll submit that part separately to the binutils project if OK. OK? This is mostly ok by me, with a few code comments inline. I wonder whether this is something we could implement for all targets in the midend, but this would require linker script support for the target to be effective... Thanks, Kyrill Thanks, Christophe diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e3e71ea..332c41b 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -151,6 +151,7 @@ static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *); #endif static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *); static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *); +static tree arm_data_attr (tree *, tree, tree, int, bool *); static void arm_output_function_epilogue (FILE *); static void arm_output_function_prologue (FILE *); static int arm_comp_type_attributes (const_tree, const_tree); @@ -375,7 +376,8 @@ static const struct attribute_spec arm_attribute_table[] = arm_handle_cmse_nonsecure_entry, NULL }, { "cmse_nonsecure_call", 0, 0, true, false, false, true, arm_handle_cmse_nonsecure_call, NULL }, - { NULL, 0, 0, false, false, false, false, NULL, NULL } + { "noinit", 0, 0, true, false, false, false, arm_data_attr, NULL }, + { NULL, 0, 0, false, false, false, false, NULL, NULL }, }; /* Initialize the GCC target structure. */ @@ -808,6 +810,10 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_CONSTANT_ALIGNMENT #define TARGET_CONSTANT_ALIGNMENT arm_constant_alignment + +#undef TARGET_ASM_SELECT_SECTION +#define TARGET_ASM_SELECT_SECTION arm_select_section + /* Obstack for minipool constant handling. */ static struct obstack minipool_obstack; @@ -7150,6 +7156,47 @@ arm_handle_cmse_nonsecure_call (tree *node, tree name, return NULL_TREE; } +/* Called when the noinit attribute is used. Check whether the + attribute is allowed here and add the attribute to the variable + decl tree or otherwise issue a diagnostic. This function checks + NODE is of the expected type and issues diagnostics otherwise using + NAME. If it is not of the expected type *NO_ADD_ATTRS will be set + to true. */ + +static tree +arm_data_attr (tree * node, + tree name, + tree args ATTRIBUTE_UNUSED, + intflags ATTRIBUTE_UNUSED, + bool * no_add_attrs ATTRIBUTE_UNUSED) +{ no_add_attrs is set in this function, so should not be ATTRIBUTE_UNUSED? Arguably args is also checked against NULL, so it's technically not unused either. + const char * message = NULL; + + gcc_assert (DECL_P (* node)); The house style doesn't have a space after '*'. Same elsewhere in the patch. + gcc_assert (args == NULL); + + if (TREE_CODE (* node) != VAR_DECL) +message = G_("%qE attribute only applies to variables"); + + /* Check that it's possible for the variable to have a section. */ + if ((TREE_STATIC (* node) || DECL_EXTERNAL (* node) || in_lto_p) + && DECL_SECTION_NAME (* node)) +message = G_("%qE attribute cannot be applied to variables with specific sections"); + + /* If this var is thought to be common, then change this. Common variables + are assigned to sections before the backend has a chance to process them. */ + if (DECL_COMMON (* node)) +DECL_CO
Re: [PATCH 2/2] gdbhooks.py: extend vec support in pretty printers
On Mon, 2019-07-01 at 13:07 +0300, Vladislav Ivanishin wrote: > This change is threefold: > - enable pretty printing of vec<>, not just vec<>* > - generalize 'vec<(\S+), (\S+), (\S+)>' regex, which is limiting > - extend to work for vl_ptr layout (only vl_embed was supported) > > The motivating example for all three is a vector of vectors in > tree-ssa-uninit.c: > > typedef vec pred_chain; > typedef vec pred_chain_union; > > (This predictably expands to > vec, va_heap, vl_ptr> > in gdb, if we use strip_typedefs() on it -- otherwise it's just > pred_chain_union. The first patch of the series takes care of that.) > > This example features "direct" vecs themselves rather than pointers > to > vecs, which is not a rare case; in fact, a quick grep showed that the > number of declarations of direct vecs is about the same as that of > pointers to vec. > > The GDB's Python API seems to do dereferencing automatically (i.e. it > detects whether to use '->', or '.' for field access), allowing to > use > the same code to access fields from a struct directly as well as > through > a pointer. > > This makes it possible to reuse VecPrinter (meant for vec<> *) for > plain > vec<>, but perhaps it would be beneficial to handle all pointers in a > generic way e.g. print the address and then print the dereferenced > object, letting the appropriate pretty printer to pick it up and do > its > thing, if one is provided for the type. > > The output I have for direct vecs now: > > (gdb) p *bb->succs > $94 = @0x776569b0 = { 5)>, 0x776548a0 (2 -> 3)>} > > Compare that to > > (gdb) p bb->succs > $70 = 0x776569b0 = { 5)>, 0x776548a0 (2 -> 3)>} > > Hope the choice of the '@' sign to represent the address of an object > taken by the pretty printer is not confusing? (GDB itself uses this > notation for references, like so: (edge_def *&) @0x77656c38.) Thanks for the patch. I like the notation idea. [...] > diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py > index b036704b86a..a7e665cadb9 100644 > --- a/gcc/gdbhooks.py > +++ b/gcc/gdbhooks.py > @@ -128,12 +128,18 @@ and here's a length 1 vec: >(gdb) p bb->succs >$19 = 0x70428bb8 = { EXIT)>} > > -You cannot yet use array notation [] to access the elements within the > -vector: attempting to do so instead gives you the vec itself (for vec[0]), > -or a (probably) invalid cast to vec<> for the memory after the vec (for > -vec[1] onwards). > +Direct instances of vec<> are also supported (their addresses are > +prefixed with the '@' sign). > > -Instead (for now) you must access m_vecdata: > +In case you have a vec<> pointer, to use array notation [] to access the > +elements within the vector you must first dereference the pointer, just > +as you do in C: > + (gdb) p (*bb->succs)[0] > + $50 = (edge_def *&) @0x77656c38: 10)> > + (gdb) p bb->succs[0][0] > + $51 = (edge_def *&) @0x77656c38: 10)> That's a nice improvement. If the user erroneously attempts, say: (gdb) p bb->succs[2] are they still accessing whatever's in memory after the vector pointer? If so, I think the comment needs to retain some kind of warning about this, as it's a nasty "gotcha". > +Another option is to access m_vecdata: >(gdb) p bb->preds->m_vecdata[0] >$20 = 5)> >(gdb) p bb->preds->m_vecdata[1] [...] > @@ -441,6 +447,10 @@ class VecPrinter: > #-ex "up" -ex "p bb->preds" > def __init__(self, gdbval): > self.gdbval = gdbval > +if self.gdbval.type.code == gdb.TYPE_CODE_PTR: > +self.ptr = intptr(self.gdbval) > +else: > +self.ptr = None I think this could use a comment. If we have a NULL "vec *", then presumably self.ptr will be 0... > def display_hint (self): > return 'array' > @@ -448,14 +458,25 @@ class VecPrinter: > def to_string (self): > # A trivial implementation; prettyprinting the contents is done > # by gdb calling the "children" method below. > -return '0x%x' % intptr(self.gdbval) > +if self.ptr: > +return '0x%x' % self.ptr > +else: > +return '@0x%x' % intptr(self.gdbval.address) ...and if we have a NULL vec *, with self.ptr as 0, it will take the "else" path here, which I think is meant for the non-pointer case. Should the condition be "if self.ptr is not None"? > def children (self): > -if intptr(self.gdbval) == 0: > +if self.ptr == 0: > return > -m_vecpfx = self.gdbval['m_vecpfx'] > +m_vec = self.gdbval > +# There are 2 kinds of vec layouts: vl_embed, and vl_ptr. The latter > +# is implemented with the former stored in the m_vec field. Sadly, > +# template_argument(2) won't work: `gdb.error: No type named > vl_embed`. > +try: > +m_vec = m_vec['m_vec'] > +except: > +pass I don't like the use of naked "except"; I always worry it might s
Re: Incremental LTO linking part 7: documentation
On 5/26/19 11:35 AM, Gerald Pfeifer wrote: On Tue, 8 May 2018, Jan Hubicka wrote: this patch adds documentation of -flinker-output. * doc/invoke.texi (-flinker-output): Document I found a follow-up patch to this in a local tree that had been sitting there for a while, had a another look, and now committed it. Sandra, if you could have another pass that would be really good; I'm sure you'll be able to further improve. I've checked in the attached patch, which I've also had sitting around for a few weeks now. I kept getting sidetracked when working on this... first I noticed that we were not consistently spelling "link-time optimization" like that, then I noticed that terms like "WPA" were not defined in the user documentation, there are some LTO-related options that are listed in the option summary in the user manual but only documented in the internals manual, etc. I've finally decided I should just commit this piece as-is to get it out of my tree; it's an incremental improvement, anyway. -Sandra 2019-07-01 Sandra Loosemore gcc/ * doc/invoke.texi (Link Options): Further editorial changes to -flinker-output docs. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 272886) +++ gcc/doc/invoke.texi (working copy) @@ -6007,8 +6007,9 @@ Warn about types with virtual methods wh if the type were declared with the C++11 @code{final} specifier, or, if possible, declared in an anonymous namespace. This allows GCC to more aggressively -devirtualize the polymorphic calls. This warning is more effective with link -time optimization, where the information about the class hierarchy graph is +devirtualize the polymorphic calls. This warning is more effective with +link-time optimization, +where the information about the class hierarchy graph is more complete. @item -Wsuggest-final-methods @@ -13190,49 +13191,49 @@ Options}. @item -flinker-output=@var{type} @opindex flinker-output -This option controls code generation of the link time optimizer. By +This option controls code generation of the link-time optimizer. By default the linker output is automatically determined by the linker plugin. For debugging the compiler and if incremental linking with a non-LTO object file is desired, it may be useful to control the type manually. -If @var{type} is @samp{exec} code generation produces a static +If @var{type} is @samp{exec}, code generation produces a static binary. In this case @option{-fpic} and @option{-fpie} are both disabled. -If @var{type} is @samp{dyn} code generation produces a shared +If @var{type} is @samp{dyn}, code generation produces a shared library. In this case @option{-fpic} or @option{-fPIC} is preserved, but not enabled automatically. This allows to build shared libraries -without position independent code on architectures where this is +without position-independent code on architectures where this is possible, i.e.@: on x86. -If @var{type} is @samp{pie} code generation produces an @option{-fpie} +If @var{type} is @samp{pie}, code generation produces an @option{-fpie} executable. This results in similar optimizations as @samp{exec} except that @option{-fpie} is not disabled if specified at compilation time. -If @var{type} is @samp{rel} the compiler assumes that incremental linking is +If @var{type} is @samp{rel}, the compiler assumes that incremental linking is done. The sections containing intermediate code for link-time optimization are merged, pre-optimized, and output to the resulting object file. In addition, if -@option{-ffat-lto-objects} is specified the binary code is produced for future -non-LTO linking. The object file produced by incremental linking will be smaller +@option{-ffat-lto-objects} is specified, binary code is produced for future +non-LTO linking. The object file produced by incremental linking is smaller than a static library produced from the same object files. At link time the -result of incremental linking will also load faster to compiler than a static +result of incremental linking also loads faster than a static library assuming that the majority of objects in the library are used. Finally @samp{nolto-rel} configures the compiler for incremental linking where -code generation is forced, a final binary is produced and the intermediate +code generation is forced, a final binary is produced, and the intermediate code for later link-time optimization is stripped. When multiple object files -are linked together the resulting code will be optimized better than with -link-time optimizations disabled (for example, cross-module inlining will -happen), most of benefits of whole program optimizations are however lost. +are linked together the resulting code is better optimized than with +link-time optimizations disabled (for example, cross-module inlining +happens), but most of benefits of whole program optimizations are lost.
Re: [PATCH 1/2] gdbhooks.py: use strip_typedefs to simplify matching type names
On Mon, 2019-07-01 at 12:50 +0300, Vladislav Ivanishin wrote: > Hi! > > GDB's Python API provides strip_typedefs method that can be > instrumental > for writing DRY code. Using it at least partially obviates the need > for > the add_printer_for_types method we have in gdbhooks.py (it takes a > list > of typenames to match and is usually used to deal with typedefs). > > I think, the code can be simplified further (there's still > ['tree_node *', 'const tree_node *'], which is a little awkward) if > we > deal with pointer types in a uniform fashion (I'll touch on this in > the > description of the second patch). But that can be improved > separately. > > The gimple_statement_cond, etc. part has been dysfunctional for a > while > (namely since gimple-classes-v2-option-3 branch was merged). I > updated > it to use the new classes' names. That seems to work (though it > doesn't > output much info anyway: pretty > vs. >(gphi *) 0x778c0200 > in the raw version). > > I changed the name passed to pp.add_printer_for_types() for > scalar_mode > and friends -- so they all share the same name now -- but I don't > believe the name is used in any context where it would matter. IIRC there's a gdb command to tell you what the registered pretty- printers are; I think the name is only ever used in that context (or maybe for fine-grained control of pretty-printing) - and I don't know of anyone who uses that. I don't think changing the name matters. > This is just a clean up of gdbhooks.py. OK to commit? The only area I wasn't sure about were the removal hunks relating to machine modes: is that all covered now by the looking-through-typedefs? Otherwise, looks good to me. I assume you've tested the patch by debugging with it. Thanks Dave
Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A57
Something somewhere cut off the subject line: it should say Cortex-A57. Sorry about that. Kyrill On 7/1/19 4:13 PM, Kyrill Tkachov wrote: Hi all, Some scheduling descriptions, like the Cortex-A57 one, are reused for multiple -mcpu options. Sometimes those other -mcpu cores support more architecture features than the Armv8-A Cortex-A57. For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as the Dot Product instructions. These Dot Product instructions have the neon_dot and neon_dot_q scheduling type, but that type is not handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to care about these instructions. But if we just ignore the neon_dot(_q) type at scheduling we get really terrible codegen when compiling for -mcpu=cortex-a76, for example, because the scheduler just pools all the UDOT instructions at the end of the basic block, since it doesn't assume anything about their behaviour. This patch ameliorates the situation somewhat by telling the Cortex-A57 scheduling model to treat any insn that doesn't get assigned a cortex_a57_neon_type but is actually a is_neon_type instruction as a simple neon_arith_basic instruction. This allows us to treat post-Armv8-A SIMD instructions more sanely without having to model each of them explicitly in cortex-a57.md. Bootstrapped and tested on arm-none-linux-gnueabihf and aarch64-none-linux-gnu. Ok for trunk from an aarch64 perspective? Thanks, Kyrill 2019-01-07 Kyrylo Tkachov * config/arm/cortex-a57.md (cortex_a57_neon_type): Use neon_arith_basic for is_neon_type instructions that have not already been categorized.
[PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5
Hi all, Some scheduling descriptions, like the Cortex-A57 one, are reused for multiple -mcpu options. Sometimes those other -mcpu cores support more architecture features than the Armv8-A Cortex-A57. For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as the Dot Product instructions. These Dot Product instructions have the neon_dot and neon_dot_q scheduling type, but that type is not handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to care about these instructions. But if we just ignore the neon_dot(_q) type at scheduling we get really terrible codegen when compiling for -mcpu=cortex-a76, for example, because the scheduler just pools all the UDOT instructions at the end of the basic block, since it doesn't assume anything about their behaviour. This patch ameliorates the situation somewhat by telling the Cortex-A57 scheduling model to treat any insn that doesn't get assigned a cortex_a57_neon_type but is actually a is_neon_type instruction as a simple neon_arith_basic instruction. This allows us to treat post-Armv8-A SIMD instructions more sanely without having to model each of them explicitly in cortex-a57.md. Bootstrapped and tested on arm-none-linux-gnueabihf and aarch64-none-linux-gnu. Ok for trunk from an aarch64 perspective? Thanks, Kyrill 2019-01-07 Kyrylo Tkachov * config/arm/cortex-a57.md (cortex_a57_neon_type): Use neon_arith_basic for is_neon_type instructions that have not already been categorized. diff --git a/gcc/config/arm/cortex-a57.md b/gcc/config/arm/cortex-a57.md index 6ba4f5711cf4b1127ff6cc2e59f1fa9dbd25a9b1..d1ea2aa1edbec1981c68d2e54c5ae6dfe0fec944 100644 --- a/gcc/config/arm/cortex-a57.md +++ b/gcc/config/arm/cortex-a57.md @@ -236,7 +236,12 @@ (define_attr "cortex_a57_neon_type" neon_store1_4reg, neon_store1_4reg_q,\ neon_store1_one_lane, neon_store1_one_lane_q,\ neon_store2_one_lane, neon_store2_one_lane_q") - (const_string "neon_store_complex")] + (const_string "neon_store_complex") +;; If it doesn't match any of the above that we want to treat specially but is +;; still a NEON type, treat it as a basic NEON type. This is better than +;; dropping it on the floor and making no assumptions about it whatsoever. + (eq_attr "is_neon_type" "yes") + (const_string "neon_arith_basic")] (const_string "unknown"))) ;; The Cortex-A57 core is modelled as a triple issue pipeline that has
Re: [Committed] S/390: Fix vector shift count operand
On 01.07.19 17:01, Marek Polacek wrote: > On Mon, Jul 01, 2019 at 04:58:09PM +0200, Andreas Krebbel wrote: >> We currently use subst definitions to handle the different variants of shift >> count operands. Unfortunately, in the vector shift pattern the shift count >> operand is used directly. Without it being adjusted for the 'subst' variants >> the >> displacement value is omitted resulting in a wrong shift count being applied. >> >> This patch needs to be applied to older branches as well. >> >> Bootstrapped and regression tested on s390x (IBM z14). >> Committed to mainline >> >> gcc/ChangeLog: >> >> 2019-07-01 Andreas Krebbel >> >> * config/s390/vector.md: > > This CL entry is blank. I've fixed this in the changelog file already. Andreas
Re: [Committed] S/390: Fix vector shift count operand
On Mon, Jul 01, 2019 at 04:58:09PM +0200, Andreas Krebbel wrote: > We currently use subst definitions to handle the different variants of shift > count operands. Unfortunately, in the vector shift pattern the shift count > operand is used directly. Without it being adjusted for the 'subst' variants > the > displacement value is omitted resulting in a wrong shift count being applied. > > This patch needs to be applied to older branches as well. > > Bootstrapped and regression tested on s390x (IBM z14). > Committed to mainline > > gcc/ChangeLog: > > 2019-07-01 Andreas Krebbel > > * config/s390/vector.md: This CL entry is blank. Marek
[Committed] S/390: Fix vector shift count operand
We currently use subst definitions to handle the different variants of shift count operands. Unfortunately, in the vector shift pattern the shift count operand is used directly. Without it being adjusted for the 'subst' variants the displacement value is omitted resulting in a wrong shift count being applied. This patch needs to be applied to older branches as well. Bootstrapped and regression tested on s390x (IBM z14). Committed to mainline gcc/ChangeLog: 2019-07-01 Andreas Krebbel * config/s390/vector.md: gcc/testsuite/ChangeLog: 2019-07-01 Andreas Krebbel * gcc.target/s390/vector/vec-shift-2.c: New test. --- gcc/config/s390/vector.md | 2 +- gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c | 24 ++ 2 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md index a2c1012..140ef47 100644 --- a/gcc/config/s390/vector.md +++ b/gcc/config/s390/vector.md @@ -981,7 +981,7 @@ (VEC_SHIFTS:VI (match_operand:VI 1 "register_operand" "v") (match_operand:SI 2 "nonmemory_operand" "an")))] "TARGET_VX" - "\t%v0,%v1,%Y2" + "\t%v0,%v1," [(set_attr "op_type" "VRS")]) ; Shift each element by corresponding vector element diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c new file mode 100644 index 000..c7a1d93 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */ + +/* { dg-final { scan-assembler-times "veslf" 1 } } */ + +typedef __attribute__((vector_size(16))) signed int v4si; + +v4si __attribute__((noinline,noclone)) +shift_left_by_scalar (v4si in, int shift_count) +{ + return in << (3 + shift_count); +} + +int +main () +{ + v4si a = { 1, 2, 3, 4 }; + v4si result = shift_left_by_scalar (a, 1); + + if (result[1] != 32) +__builtin_abort (); + + return 0; +} -- 2.7.4
[PATCH] rs6000: Improve indexed addressing
The function rs6000_force_indexed_or_indirect_mem makes a memory operand suitable for indexed (or indirect) addressing. If the memory address isn't yet valid, it loads the whole thing into a register to make it valid. That isn't optimal. This changes it to load an address that is the sum of two things into two registers instead. This results in lower latency code, and if inside loops, a constant term can be moved outside the loop. Tested on powerpc64-linux {-m32,-m64} and on powerpc64le-linux (a Power9). Committing. Segher 2019-07-01 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_force_indexed_or_indirect_mem): Load both operands of a PLUS into registers separately. --- gcc/config/rs6000/rs6000.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 5e80673..f59f3a9 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -32100,7 +32100,16 @@ rs6000_force_indexed_or_indirect_mem (rtx x) addr = reg; } - x = replace_equiv_address (x, force_reg (Pmode, addr)); + if (GET_CODE (addr) == PLUS) + { + rtx op0 = XEXP (addr, 0); + rtx op1 = XEXP (addr, 1); + op0 = force_reg (Pmode, op0); + op1 = force_reg (Pmode, op1); + x = replace_equiv_address (x, gen_rtx_PLUS (Pmode, op0, op1)); + } + else + x = replace_equiv_address (x, force_reg (Pmode, addr)); } return x; -- 1.8.3.1
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
[Adding gcc-patches] Richard, do you have any further comments or is the revised patch good to commit? Martin On 6/25/19 2:30 PM, Martin Sebor wrote: On 6/25/19 3:53 AM, Jonathan Wakely wrote: On 24/06/19 19:42 +0200, Richard Biener wrote: On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: On 6/24/19 6:11 AM, Richard Biener wrote: > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: >> >> On 6/21/19 6:06 AM, Richard Biener wrote: >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. >>> >>> Hum. I am worried about the difference of assignment vs. construction >>> in ::put() >>> >>> + bool ins = hash_entry::is_empty (*e); >>> + if (ins) >>> + { >>> + e->m_key = k; >>> + new ((void *) &e->m_value) Value (v); >>> + } >>> + else >>> + e->m_value = v; >>> >>> why not invoke the dtor on the old value and then the ctor again? >> >> It wouldn't work for self-assignment: >> >> Value &y = m.get_or_insert (key); >> m.put (key, y); >> >> The usual rule of thumb for writes into containers is to use >> construction when creating a new element and assignment when >> replacing the value of an existing element. >> >> Which reminds me that the hash containers, despite being copy- >> constructible (at least for POD types, they aren't for user- >> defined types), also aren't safe for assignment even for PODs. >> I opened bug 90959 for this. Until the assignment is fixed >> I made it inaccessibe in the patch (I have fixed the copy >> ctor to DTRT for non-PODs). >> >>> How is an empty hash_entry constructed? >> >> In hash_table::find_slot_with_hash simply by finding an empty >> slot and returning a pointer to it. The memory for the slot >> is marked "empty" by calling the Traits::mark_empty() function. >> >> The full type of hash_map is actually >> >> hash_map> Value, >> simple_hashmap_traits, >> Value> >> >> and simple_hashmap_traits delegates it to default_hash_traits >> whose mark_empty() just clears the void*, leaving the Value >> part uninitialized. That makes sense because we don't want >> to call ctors for empty entries. I think the questions one >> might ask if one were to extend the design are: a) what class >> should invoke the ctor/assignment and b) should it do it >> directly or via the traits? >> >>> ::remove() doesn't >>> seem to invoke the dtor either, instead it relies on the >>> traits::remove function? >> >> Yes. There is no Traits::construct or assign or copy. We >> could add them but I'm not sure I see to what end (there could >> be use cases, I just don't know enough about these classes to >> think of any). >> >> Attached is an updated patch with the additional minor fixes >> mentioned above. >> >> Martin >> >> PS I think we would get much better results by adopting >> the properly designed and tested standard library containers >> than by spending time trying to improve the design of these >> legacy classes. For simple uses that don't need to integrate >> with the GC machinery the standard containers should be fine >> (plus, it'd provide us with greater motivation to improve >> them and the code GCC emits for their uses). Unfortunately, >> to be able to use the hash-based containers we would need to >> upgrade to C++ 11. Isn't it time yet? > > I don't think so. I'm also not sure if C++11 on its own is desirable > or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. > SLES 11 already struggles currently (GCC 4.3) but I'd no longer > consider that important enough. > > Note any such change to libstdc++ containers should be complete > and come with both code-size and compile-time and memory-usage > measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + { + e->m_key = k; + new ((void *)&e->m_value) Value (); + } this now requires a default constructor and I always forget about diff
Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch
[Apologies if this is a duplicate, I'm unsure if my previous mail was delivered] On 01/07/19 12:38, Richard Biener wrote: On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke wrote: The heuristic introduced for PR71016 prevents recognizing a max / min combo like it is used for saturation when followed by a conversion. The attached patch refines the heuristic to allow this case. Regression tested on x86_64-pc-linux-gnu . Few style nits: ... also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt) otherwise you'd also allow MIN/MAX unrelated to the conversion detected. Like the attached patch? On x86 I see the MIN_EXPR is already detected by GENERIC folding, I wonder if that is required or if we can handle the case without in one phiopt pass invocation as well. tree_ssa_phiopt_worker is supposed to handle this by handling nested COND_EXPRs from the inside out: /* Search every basic block for COND_EXPR we may be able to optimize. We walk the blocks in order that guarantees that a block with a single predecessor is processed before the predecessor. This ensures that we collapse inner ifs before visiting the outer ones, and also that we do not try to visit a removed block. */ bb_order = single_pred_before_succ_order (); However, I have no idea how to construct a testcase for this with the gimple folding in place. #define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x)) void foo (unsigned char *p, int i, int m) { *p = (m > SAT (i) ? m : SAT (i)); } or the equivalent for MIN_EXPR, *.c.004.original already has only one conditional left. 2019-07-01 Joern Rennecke PR middle-end/66726 * tree-ssa-phiopt.c (factor_out_conditional_conversion): Tune heuristic from PR71016 to allow MIN / MAX. * testsuite/gcc.dg/tree-ssa/pr66726-4.c: New testcase. Index: tree-ssa-phiopt.c === --- tree-ssa-phiopt.c (revision 272846) +++ tree-ssa-phiopt.c (working copy) @@ -504,7 +504,25 @@ factor_out_conditional_conversion (edge gsi = gsi_for_stmt (arg0_def_stmt); gsi_prev_nondebug (&gsi); if (!gsi_end_p (gsi)) - return NULL; + { + if (gassign *assign + = dyn_cast (gsi_stmt (gsi))) + { + tree lhs = gimple_assign_lhs (assign); + enum tree_code ass_code + = gimple_assign_rhs_code (assign); + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR) + return NULL; + if (!operand_equal_p +(lhs, gimple_assign_rhs1 (arg0_def_stmt), 0)) + return NULL; + gsi_prev_nondebug (&gsi); + if (!gsi_end_p (gsi)) + return NULL; + } + else + return NULL; + } gsi = gsi_for_stmt (arg0_def_stmt); gsi_next_nondebug (&gsi); if (!gsi_end_p (gsi)) Index: testsuite/gcc.dg/tree-ssa/pr66726-4.c === --- testsuite/gcc.dg/tree-ssa/pr66726-4.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/pr66726-4.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-phiopt1-details" } */ + +#define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x)) + +void +foo (unsigned char *p, int i) +{ + *p = SAT (i); +} + +/* { dg-final { scan-tree-dump-times "COND_EXPR .*and PHI .*converted to straightline code" 1 "phiopt1" } } */
[Ada] Implement GNAT.Graphs
This patch introduces new unit GNAT.Graphs which currently provides a directed graph abstraction. -- Source -- -- operations.adb with Ada.Text_IO; use Ada.Text_IO; with GNAT;use GNAT; with GNAT.Graphs; use GNAT.Graphs; with GNAT.Sets; use GNAT.Sets; procedure Operations is type Vertex_Id is (No_V, VA, VB, VC, VD, VE, VF, VG, VH, VX, VY, VZ); No_Vertex_Id : constant Vertex_Id := No_V; function Hash_Vertex (V : Vertex_Id) return Bucket_Range_Type; type Edge_Id is (No_E, E1, E2, E3, E4, E5, E6, E7, E8, E9, E10, E97, E98, E99); No_Edge_Id : constant Edge_Id := No_E; function Hash_Edge (E : Edge_Id) return Bucket_Range_Type; package ES is new Membership_Set (Element_Type => Edge_Id, "=" => "=", Hash => Hash_Edge); package DG is new Directed_Graph (Vertex_Id => Vertex_Id, No_Vertex => No_Vertex_Id, Hash_Vertex => Hash_Vertex, Same_Vertex => "=", Edge_Id => Edge_Id, No_Edge => No_Edge_Id, Hash_Edge => Hash_Edge, Same_Edge => "="); use DG; package VS is new Membership_Set (Element_Type => Vertex_Id, "=" => "=", Hash => Hash_Vertex); --- -- Local subprograms -- --- procedure Check_Belongs_To_Component (R: String; G: Instance; V: Vertex_Id; Exp_Comp : Component_Id); -- Verify that vertex V of graph G belongs to component Exp_Comp. R is the -- calling routine. procedure Check_Belongs_To_Some_Component (R : String; G : Instance; V : Vertex_Id); -- Verify that vertex V of graph G belongs to some component. R is the -- calling routine. procedure Check_Destination_Vertex (R : String; G : Instance; E : Edge_Id; Exp_V : Vertex_Id); -- Vertify that the destination vertex of edge E of grah G is Exp_V. R is -- the calling routine. procedure Check_Distinct_Components (R : String; Comp_1 : Component_Id; Comp_2 : Component_Id); -- Verify that components Comp_1 and Comp_2 are distinct (not the same) procedure Check_Has_Component (R : String; G : Instance; G_Name : String; Comp : Component_Id); -- Verify that graph G with name G_Name contains component Comp. R is the -- calling routine. procedure Check_Has_Edge (R : String; G : Instance; E : Edge_Id); -- Verify that graph G contains edge E. R is the calling routine. procedure Check_Has_Vertex (R : String; G : Instance; V : Vertex_Id); -- Verify that graph G contains vertex V. R is the calling routine. procedure Check_No_Component (R : String; G : Instance; V : Vertex_Id); -- Verify that vertex V does not belong to some component. R is the calling -- routine. procedure Check_No_Component (R : String; G : Instance; G_Name : String; Comp : Component_Id); -- Verify that graph G with name G_Name does not contain component Comp. R -- is the calling routine. procedure Check_No_Edge (R : String; G : Instance; E : Edge_Id); -- Verify that graph G does not contain edge E. R is the calling routine. procedure Check_No_Vertex (R : String; G : Instance; V : Vertex_Id); -- Verify that graph G does not contain vertex V. R is the calling routine. procedure Check_Number_Of_Components (R : String; G : Instance; Exp_Num : Natural); -- Verify that graph G has exactly Exp_Num components. R is the calling -- routine. procedure Check_Number_Of_Edges (R : String; G : Instance; Exp_Num : Natural); -- Verify that graph G has exactly Exp_Num edges. R is the calling routine. procedure Check_Number_Of_Vertices (R : String; G : Instance; Exp_Num : Natural); -- Verify that graph G has exactly Exp_Num vertices. R is the calling -- routine. procedure Check_Outgoing_Edge_Iterator (R : String; G : Instance; V : Vertex_Id; Set : ES.Instance); -- Verify that all outgoing edges of vertex V of graph G can be iterated -- and appear in set Set. R is the calling routine. procedure Check_Source_Vertex (R : String; G : Instance; E : Edge_Id; Exp_V : Vertex_Id); -- Vertify that the source vertex of edge E of grah G is Exp_V. R is the -- calling routine. procedure Check_Vertex_Iterator (R: String; G: Instance; Comp : Component_Id; Set : VS.Instance); -- Verify that all vertices of component Comp of graph G can be iterated -- and appear in set Set. R is the calling routine. function Create_And_Populate return Instance; -- Create a
[Ada] Improve error message on mult/div between fixed-point and integer
Multiplication and division of a fixed-point type by an integer type is only defined by default for type Integer. Clarify the error message to explain that a conversion is needed in other cases. Also change an error message to start with lowercase as it should be. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Yannick Moy gcc/ada/ * sem_ch4.adb (Operator_Check): Refine error message.--- gcc/ada/sem_ch4.adb +++ gcc/ada/sem_ch4.adb @@ -7375,7 +7375,7 @@ package body Sem_Ch4 is Etype (Next_Formal (First_Formal (Op_Id then Error_Msg_N -("No legal interpretation for operator&", N); +("no legal interpretation for operator&", N); Error_Msg_NE ("\use clause on& would make operation legal", N, Scope (Op_Id)); @@ -7393,6 +7393,26 @@ package body Sem_Ch4 is Error_Msg_NE ("\left operand has}!", N, Etype (L)); Error_Msg_NE ("\right operand has}!", N, Etype (R)); + -- For multiplication and division operators with + -- a fixed-point operand and an integer operand, + -- indicate that the integer operand should be of + -- type Integer. + + if Nkind_In (N, N_Op_Multiply, N_Op_Divide) + and then Is_Fixed_Point_Type (Etype (L)) + and then Is_Integer_Type (Etype (R)) + then + Error_Msg_N ("\convert right operand to " + & "`Integer`", N); + + elsif Nkind (N) = N_Op_Multiply + and then Is_Fixed_Point_Type (Etype (R)) + and then Is_Integer_Type (Etype (L)) + then + Error_Msg_N ("\convert left operand to " + & "`Integer`", N); + end if; + -- For concatenation operators it is more difficult to -- determine which is the wrong operand. It is worth -- flagging explicitly an access type, for those who
[Ada] More permissive use of GNAT attribute Enum_Rep
This patch allows the prefix of the attribute Enum_Rep to be an attribute referece (such as Enum_Type'First). A recent patch had restricted the prefix to be an object of a discrete type, which is incompatible with orevious usage. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * sem_attr.adb (Analyze_Attribute, case Enum_Rep): Allow prefix of attribute to be an attribute reference of a discrete type. gcc/testsuite/ * gnat.dg/enum_rep.adb, gnat.dg/enum_rep.ads: New testcase.--- gcc/ada/sem_attr.adb +++ gcc/ada/sem_attr.adb @@ -3833,14 +3833,16 @@ package body Sem_Attr is Check_Discrete_Type; Resolve (E1, P_Base_Type); - -- X'Enum_Rep case. X must be an object or enumeration literal, and - -- it must be of a discrete type. + -- X'Enum_Rep case. X must be an object or enumeration literal + -- (including an attribute reference), and it must be of a + -- discrete type. elsif not ((Is_Object_Reference (P) or else (Is_Entity_Name (P) -and then Ekind (Entity (P)) = E_Enumeration_Literal)) +and then Ekind (Entity (P)) = E_Enumeration_Literal) + or else Nkind (P) = N_Attribute_Reference) and then Is_Discrete_Type (Etype (P))) then Error_Attr_P ("prefix of % attribute must be discrete object"); --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/enum_rep.adb @@ -0,0 +1,5 @@ +-- { dg-do compile } + +package body Enum_Rep is + procedure Foo is null; +end; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/enum_rep.ads @@ -0,0 +1,22 @@ +with Interfaces; + +package Enum_Rep is + + type My_Type is range 00 .. 100; + + subtype My_Subtype2 is Interfaces.Unsigned_32 +range My_Type'First'Enum_Rep .. My_Type'Last'Enum_Rep; + + My_Type_First : constant My_Type := My_Type'First; + My_Type_Last : constant My_Type := My_Type'Last; + + subtype My_Subtype is Interfaces.Unsigned_32 + range My_Type_First'Enum_Rep .. My_Type_Last'Enum_Rep; + + subtype My_Subtype1 is Interfaces.Unsigned_32 +range My_Type'Enum_Rep (My_Type'First) .. + My_Type'Enum_Rep (MY_Type'Last); + + procedure Foo; + +end;
[Ada] Spurious error on inst. of partially defaulted formal package
This patch removes a spurious error on an instantiation whose generic unit has a formal package where some formal parameters are box-initialiaed. Previously the code assumed that box-initialization for a formal package applied to all its formal parameters. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * sem_ch12.adb (Is_Defaulted): New predicate in Check_Formal_Package_Intance, to skip the conformance of checks on parameters of a formal package that are defaulted, gcc/testsuite/ * gnat.dg/generic_inst3.adb, gnat.dg/generic_inst3_kafka_lib-topic.ads, gnat.dg/generic_inst3_kafka_lib.ads, gnat.dg/generic_inst3_markets.ads, gnat.dg/generic_inst3_traits-encodables.ads, gnat.dg/generic_inst3_traits.ads: New testcase.--- gcc/ada/sem_ch12.adb +++ gcc/ada/sem_ch12.adb @@ -6195,6 +6195,12 @@ package body Sem_Ch12 is -- Common error routine for mismatch between the parameters of the -- actual instance and those of the formal package. + function Is_Defaulted (Param : Entity_Id) return Boolean; + -- If the formql package has partly box-initialized formals, skip + -- conformace check for these formals. Previously the code assumed + -- that boc initialization for a formal package applied to all + -- its formal parameters. + function Same_Instantiated_Constant (E1, E2 : Entity_Id) return Boolean; -- The formal may come from a nested formal package, and the actual may -- have been constant-folded. To determine whether the two denote the @@ -6245,6 +6251,32 @@ package body Sem_Ch12 is end if; end Check_Mismatch; + -- + -- Is_Defaulted -- + -- + + function Is_Defaulted (Param : Entity_Id) return Boolean is + Assoc : Node_Id; + begin + Assoc := First (Generic_Associations + (Parent (Associated_Formal_Package (Actual_Pack; + + while Present (Assoc) loop +if Nkind (Assoc) = N_Others_Choice then + return True; + +elsif Nkind (Assoc) = N_Generic_Association + and then Chars (Selector_Name (Assoc)) = Chars (Param) +then + return Box_Present (Assoc); +end if; + +Next (Assoc); + end loop; + + return False; + end Is_Defaulted; + -- Same_Instantiated_Constant -- @@ -6414,6 +6446,9 @@ package body Sem_Ch12 is then goto Next_E; + elsif Is_Defaulted (E1) then +goto Next_E; + elsif Is_Type (E1) then -- Subtypes must statically match. E1, E2 are the local entities --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst3.adb @@ -0,0 +1,20 @@ +-- { dg-do compile } + +with Generic_Inst3_Kafka_Lib.Topic; +with Generic_Inst3_Traits.Encodables; +with Generic_Inst3_Markets; + +procedure Generic_Inst3 is + generic + with package Values is new Generic_Inst3_Traits.Encodables (<>); + with package Topic is new Generic_Inst3_Kafka_Lib.Topic + (Values => Values, others => <>); + package Dummy is + end Dummy; + + package Inst is new Dummy + (Values => Generic_Inst3_Markets.Data_Encodables, + Topic => Generic_Inst3_Markets.Data_Topic); +begin + null; +end Generic_Inst3; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst3_kafka_lib-topic.ads @@ -0,0 +1,7 @@ +with Generic_Inst3_Traits.Encodables; +generic + Topic_Name : String; + with package Keys is new Generic_Inst3_Traits.Encodables (<>); + with package Values is new Generic_Inst3_Traits.Encodables (<>); +package Generic_Inst3_Kafka_Lib.Topic is +end Generic_Inst3_Kafka_Lib.Topic; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst3_kafka_lib.ads @@ -0,0 +1,2 @@ +package Generic_Inst3_Kafka_Lib is +end Generic_Inst3_Kafka_Lib; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst3_markets.ads @@ -0,0 +1,10 @@ +with Generic_Inst3_Kafka_Lib.Topic; +with Generic_Inst3_Traits.Encodables; +package Generic_Inst3_Markets is + type Data_Type is null record; + function Image (D : Data_Type) return String is (""); + package Data_Encodables is new Generic_Inst3_Traits.Encodables (Data_Type, Image); + package Data_Topic is new Generic_Inst3_Kafka_Lib.Topic + (Keys => Data_Encodables, Values => Data_Encodables, + Topic_Name => "bla"); +end Generic_Inst3_Markets; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/generic_inst3_traits-encodables.ads @@ -0,0 +1,8 @@ +with Ada.Streams; +generic + pragma Warnings (Off, "is not referenced"); + type T (<>) is private; + with function Image (Val : in T) return String; +package Generic_Inst3_Traits.Encodables is + pragma Pu
[Ada] Crash on improper pragma Weak_External
This patch adds a guard on the use of pragma Weak_External. This pragma affects link-time addresses of entities, and does not apply to types. Previous to this patch the compiler would abort on a misuse of the pragma. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * sem_prag.adb (Analyze_Pragma, case Weak_External): Pragma only applies to entities with run-time addresses, not to types. gcc/testsuite/ * gnat.dg/weak3.adb, gnat.dg/weak3.ads: New testcase.--- gcc/ada/sem_prag.adb +++ gcc/ada/sem_prag.adb @@ -25608,6 +25608,12 @@ package body Sem_Prag is Ent := Underlying_Type (Ent); end if; +-- The pragma applies to entities with addresses. + +if Is_Type (Ent) then + Error_Pragma ("pragma applies to objects and subprograms"); +end if; + -- The only processing required is to link this item on to the -- list of rep items for the given entity. This is accomplished -- by the call to Rep_Item_Too_Late (when no error is detected --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/weak3.adb @@ -0,0 +1,11 @@ +-- { dg-do compile } + +package body Weak3 is + + type T is new Integer; + pragma Weak_External (T); -- { dg-error "pragma applies to objects and subprograms" } + X : T; + + procedure Foo is null; + +end Weak3; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/weak3.ads @@ -0,0 +1,3 @@ +package Weak3 is + procedure Foo; +end Weak3;
[Ada] Wrong code with -gnatVa on lock-free protected objects
This patch fixes the handling of validity checks on protected objects that use the Lock-Free implementation when validity checks are enabled, previous to this patch the compiler would report improperly that a condition in a protected operation was always True (when comoipled with -gnatwa) and would generate incorrect code fhat operation. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * checks.adb (Insert_Valid_Check): Do not apply validity check to variable declared within a protected object that uses the Lock_Free implementation, to prevent unwarranted constant folding, because entities within such an object msut be treated as volatile. gcc/testsuite/ * gnat.dg/prot7.adb, gnat.dg/prot7.ads: New testcase.--- gcc/ada/checks.adb +++ gcc/ada/checks.adb @@ -7429,6 +7429,19 @@ package body Checks is return; end if; + -- Entities declared in Lock_free protected types must be treated + -- as volatile, and we must inhibit validity checks to prevent + -- improper constant folding. + + if Is_Entity_Name (Expr) +and then Is_Subprogram (Scope (Entity (Expr))) +and then Present (Protected_Subprogram (Scope (Entity (Expr +and then Uses_Lock_Free + (Scope (Protected_Subprogram (Scope (Entity (Expr) + then + return; + end if; + -- If we have a checked conversion, then validity check applies to -- the expression inside the conversion, not the result, since if -- the expression inside is valid, then so is the conversion result. --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/prot7.adb @@ -0,0 +1,22 @@ +-- { dg-do compile } +-- { dg-options "-gnatwa -gnatVa" } + +package body Prot7 is + protected body Default_Slice is + function Get return Instance_Pointer is + begin + return Default; + end Get; + + procedure Set ( +Discard : in out Boolean; +Slice : in Instance_Pointer + ) is + begin + Discard := Default /= null; + if not Discard then +Default := Slice; + end if; + end Set; + end Default_Slice; +end Prot7; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/prot7.ads @@ -0,0 +1,16 @@ +package Prot7 is + type Instance_Pointer is access Integer; + + protected Default_Slice +with Lock_Free + is + function Get return Instance_Pointer; + + procedure Set ( +Discard : in out Boolean; +Slice : in Instance_Pointer + ); + private + Default : Instance_Pointer; + end Default_Slice; +end Prot7;
[Ada] Spurious error private subtype derivation
This patch fixes a spurious error on a derived type declaration whose subtype indication is a subtype of a private type whose full view is a constrained discriminated type. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * sem_ch3.adb (Build_Derived_Record_Type): If the parent type is declared as a subtype of a private type with an inherited discriminant constraint, its generated full base appears as a record subtype, so we need to retrieve its oen base type so that the inherited constraint can be applied to it. gcc/testsuite/ * gnat.dg/derived_type6.adb, gnat.dg/derived_type6.ads: New testcase.--- gcc/ada/sem_ch3.adb +++ gcc/ada/sem_ch3.adb @@ -8582,6 +8582,16 @@ package body Sem_Ch3 is Parent_Base := Base_Type (Parent_Type); end if; + -- If the parent type is declared as a subtype of another private + -- type with inherited discriminants, its generated base type is + -- itself a record subtype. To further inherit the constraint we + -- need to use its own base to have an unconstrained type on which + -- to apply the inherited constraint. + + if Ekind (Parent_Base) = E_Record_Subtype then + Parent_Base := Base_Type (Parent_Base); + end if; + -- AI05-0115: if this is a derivation from a private type in some -- other scope that may lead to invisible components for the derived -- type, mark it accordingly. --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/derived_type6.adb @@ -0,0 +1,5 @@ +-- { dg-do compile } + +package body Derived_Type6 is + procedure Foo is null; +end Derived_Type6; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/derived_type6.ads @@ -0,0 +1,9 @@ +with Ada.Strings.Bounded; + +package Derived_Type6 is + package b is new Ada.Strings.Bounded.Generic_Bounded_Length(10); + subtype s1 is b.Bounded_String; + type s2 is new s1; + + procedure Foo; +end Derived_Type6;
[Ada] Remove a SPARK rule about implicit Global
A rule about implicit Global contract for functions whose names overload an abstract state was never implemented (and no user complained about this). It is now removed, so references to other rules need to be renumbered. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Piotr Trojanek gcc/ada/ * einfo.adb, sem_ch7.adb, sem_prag.adb, sem_util.adb: Update references to the SPARK RM after the removal of Rule 7.1.4(5).--- gcc/ada/einfo.adb +++ gcc/ada/einfo.adb @@ -8141,7 +8141,7 @@ package body Einfo is function Is_External_State (Id : E) return B is begin -- To qualify, the abstract state must appear with option "external" or - -- "synchronous" (SPARK RM 7.1.4(8) and (10)). + -- "synchronous" (SPARK RM 7.1.4(7) and (9)). return Ekind (Id) = E_Abstract_State @@ -8319,7 +8319,7 @@ package body Einfo is function Is_Synchronized_State (Id : E) return B is begin -- To qualify, the abstract state must appear with simple option - -- "synchronous" (SPARK RM 7.1.4(10)). + -- "synchronous" (SPARK RM 7.1.4(9)). return Ekind (Id) = E_Abstract_State --- gcc/ada/sem_ch7.adb +++ gcc/ada/sem_ch7.adb @@ -3253,7 +3253,7 @@ package body Sem_Ch7 is -- A [generic] package that defines at least one non-null abstract state -- requires a completion only when at least one other construct requires - -- a completion in a body (SPARK RM 7.1.4(4) and (6)). This check is not + -- a completion in a body (SPARK RM 7.1.4(4) and (5)). This check is not -- performed if the caller requests this behavior. if Do_Abstract_States --- gcc/ada/sem_prag.adb +++ gcc/ada/sem_prag.adb @@ -12219,7 +12219,7 @@ package body Sem_Prag is Check_Ghost_Synchronous; -- Option Part_Of without an encapsulating state is --- illegal (SPARK RM 7.1.4(9)). +-- illegal (SPARK RM 7.1.4(8)). elsif Chars (Opt) = Name_Part_Of then SPARK_Msg_N --- gcc/ada/sem_util.adb +++ gcc/ada/sem_util.adb @@ -10737,7 +10737,7 @@ package body Sem_Util is -- Asynch_Writers Effective_Writes -- -- Note that both forms of External have higher precedence than - -- Synchronous (SPARK RM 7.1.4(10)). + -- Synchronous (SPARK RM 7.1.4(9)). elsif Has_Synchronous then return Nam_In (Property, Name_Async_Readers, Name_Async_Writers);
[Ada] Cleanup references to LynuxWorks in docs and comments
Apparently the company behind LynxOS is now called Lynx Software Technologies (formerly LynuxWorks). Use the current name in user docs and the previous name in developer comment (to hopefully reflect the company name at the time when the patchset mentioned in the comment was released). Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Piotr Trojanek gcc/ada/ * sysdep.c: Cleanup references to LynuxWorks in docs and comments.--- gcc/ada/sysdep.c +++ gcc/ada/sysdep.c @@ -1066,4 +1066,3 @@ __gnat_name_case_equivalence () return 1; #endif } -
[Ada] Revert "Global => null" on calendar routines that use timezones
Some routines from the Ada.Calendar package, i.e. Year, Month, Day, Split and Time_Off, rely on OS-specific timezone databases that are kept in files (e.g. /etc/localtime on Linux). In SPARK we want to model this as a potential side-effect, so those routines can't have "Global => null". Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Piotr Trojanek gcc/ada/ * libgnat/a-calend.ads: Revert "Global => null" contracts on non-pure routines.--- gcc/ada/libgnat/a-calend.ads +++ gcc/ada/libgnat/a-calend.ads @@ -61,19 +61,20 @@ is -- the result will contain all elapsed leap seconds since the start of -- Ada time until now. - function Year(Date : Time) return Year_Number with Global => null; - function Month (Date : Time) return Month_Number with Global => null; - function Day (Date : Time) return Day_Number with Global => null; - function Seconds (Date : Time) return Day_Duration with Global => null; + function Year(Date : Time) return Year_Number; + function Month (Date : Time) return Month_Number; + function Day (Date : Time) return Day_Number; + function Seconds (Date : Time) return Day_Duration; + -- SPARK Note: These routines, just like Split and Time_Of below, might use + -- the OS-specific timezone database that is typically stored in a file. + -- This side effect needs to be modeled, so there is no Global => null. procedure Split (Date: Time; Year: out Year_Number; Month : out Month_Number; Day : out Day_Number; - Seconds : out Day_Duration) - with - Global => null; + Seconds : out Day_Duration); -- Break down a time value into its date components set in the current -- time zone. If Split is called on a time value created using Ada 2005 -- Time_Of in some arbitrary time zone, the input value will always be @@ -83,9 +84,7 @@ is (Year: Year_Number; Month : Month_Number; Day : Day_Number; - Seconds : Day_Duration := 0.0) return Time - with - Global => null; + Seconds : Day_Duration := 0.0) return Time; -- GNAT Note: Normally when procedure Split is called on a Time value -- result of a call to function Time_Of, the out parameters of procedure -- Split are identical to the in parameters of function Time_Of. However,
[Ada] gprbuild fails to find ghost ALI files
This patch fixes a bug where if a ghost unit is compiled with ignored-ghost mode in a library project, then gprbuild will fail to find the ALI file, because the compiler generates an empty object file, but no ALI file. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Bob Duff gcc/ada/ * gnat1drv.adb (gnat1drv): Call Write_ALI if the main unit is ignored-ghost.--- gcc/ada/gnat1drv.adb +++ gcc/ada/gnat1drv.adb @@ -1453,9 +1453,13 @@ begin -- Generate ALI file if specially requested, or for missing subunits, -- subunits or predefined generic. For ignored ghost code, the object - -- file IS generated, so Object should be True. + -- file IS generated, so Object should be True, and since the object + -- file is generated, we need to generate the ALI file. We never want + -- an object file without an ALI file. - if Opt.Force_ALI_Tree_File then + if Is_Ignored_Ghost_Unit (Main_Unit_Node) + or else Opt.Force_ALI_Tree_File + then Write_ALI (Object => Is_Ignored_Ghost_Unit (Main_Unit_Node)); end if;
[Ada] Compiler abort on use of Invalid_Value on numeric positive subtype
Invalid_Value in most cases uses a predefined numeric value from a built-in table, but if the type does not include zero in its range, the literal 0 is used instead. In that case the value (produced by a call to Get_Simple_Init_Val) must be resolved for proper type information. The following must compile quietly: gnatmake -q main with Problems; use Problems; with Text_IO; use Text_IO; procedure Main is begin Put_Line ("P1: " & P1'Image); Put_Line ("P2: " & P2'Image); Put_Line ("P3: " & P3'Image); Put_Line ("P4: " & P4'Image); end Main; -- package Problems is function P1 return Integer; function P2 return Long_Integer; -- Max. number of prime factors a number can have is log_2 N -- For N = 600851475143, this is ~ 40 -- type P3_Factors is array (1 .. 40) of Long_Integer; function P3 return Long_Integer; type P4_Palindrome is range 100*100 .. 999*999; function P4 return P4_Palindrome; end Problems; package body Problems is function P1 return Integer is separate; function P2 return Long_Integer is separate; function P3 return Long_Integer is separate; function P4 return P4_Palindrome is separate; end Problems; separate(Problems) function P1 return Integer is Sum : Integer range 0 .. 500_500 := 0; begin for I in Integer range 1 .. 1000 - 1 loop if I mod 3 = 0 or I mod 5 = 0 then Sum := Sum + I; end if; end loop; return Sum; end P1; -- separate(Problems) function P2 return Long_Integer is subtype Total is Long_Integer range 0 .. 8_02e6 ; subtype Elem is Totalrange 0 .. 4e7 ; Sum : Total := 0; a, b, c : Elem; begin a := 1; b := 2; loop if b mod 2 = 0 then Sum := Sum + b; end if; c := b; b := a + b; a := c; exit when b >= 4e6; end loop; return Sum; end P2; -- with Text_IO; use Text_IO; with Ada.Numerics.Elementary_Functions; use Ada.Numerics.Elementary_Functions; separate(Problems) function P3 return Long_Integer is -- Greatest prime factor GPF : Long_Integer := 1; Dividend : Long_Integer := 600851475143; Factor : Long_Integer := 2; Quotient : Long_Integer; begin while Dividend > 1 loop Quotient := Dividend / Factor; if Dividend mod Factor = 0 then GPF := Factor; Dividend := Quotient; else if Factor >= Quotient then GPF := Dividend; exit; end if; Factor := Factor + 1; end if; end loop; return GPF; end P3; with Text_IO; use Text_IO; separate(Problems) function P4 return P4_Palindrome is type TripleDigit is range 100 .. 999; a, b: TripleDigit := TripleDigit'First; c : P4_Palindrome; Max_Palindrome : P4_Palindrome := P4_Palindrome'Invalid_Value; function Is_Palindrome (X : in P4_Palindrome) return Boolean is type Int_Digit is range 0 .. 9; type Int_Digits is array (1 .. 6) of Int_Digit; type Digit_Extractor is range 0 .. P4_Palindrome'Last; Y : Digit_Extractor := Digit_Extractor (X); X_Digits : Int_Digits; begin for I in reverse X_Digits'Range loop X_Digits (I) := Int_Digit (Y mod 10); Y := Y / 10; end loop; return (X_Digits (1) = X_Digits (6) and X_Digits (2) = X_Digits (5) and X_Digits (3) = X_Digits (4)) or (X_Digits (2) = X_Digits (6) and X_Digits (3) = X_Digits (5) and X_Digits(1) = 0); end Is_Palindrome; begin for a in TripleDigit'Range loop for b in TripleDigit'Range loop c := P4_Palindrome (a * b); if Is_Palindrome (c) then if Max_Palindrome'Valid or else c > Max_Palindrome then Max_Palindrome := c; end if; end if; end loop; end loop; return Max_Palindrome; end; Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Ed Schonberg gcc/ada/ * exp_attr.adb (Expand_Attribute_Reference, case Invalid_Value): Resolve result of call to Get_Simple_Init_Val, which may be a conversion of a literal.--- gcc/ada/exp_attr.adb +++ gcc/ada/exp_attr.adb @@ -4242,6 +4242,11 @@ package body Exp_Attr is when Attribute_Invalid_Value => Rewrite (N, Get_Simple_Init_Val (Ptyp, N)); + -- The value produced may be a conversion of a literal, which + -- must be resolved to establish its proper type. + + Analyze_And_Resolve (N); + -- -- Last -- --
[Ada] Crash due to missing freeze nodes in transient scope
The following patch updates the freezing of expressions to insert the generated freeze nodes prior to the expression that produced them when the context is a transient scope within a type initialization procedure. This ensures that the nodes are properly interleaved with respect to the constructs that generated them. Tested on x86_64-pc-linux-gnu, committed on trunk 2019-07-01 Hristian Kirtchev gcc/ada/ * freeze.adb (Freeze_Expression): Remove the horrible useless name hiding of N. Insert the freeze nodes generated by the expression prior to the expression when the nearest enclosing scope is transient. gcc/testsuite/ * gnat.dg/freezing1.adb, gnat.dg/freezing1.ads, gnat.dg/freezing1_pack.adb, gnat.dg/freezing1_pack.ads: New testcase.--- gcc/ada/freeze.adb +++ gcc/ada/freeze.adb @@ -7665,9 +7665,8 @@ package body Freeze is or else Ekind (Current_Scope) = E_Void then declare -N: constant Node_Id := Current_Scope; -Freeze_Nodes : List_Id := No_List; -Pos : Int := Scope_Stack.Last; +Freeze_Nodes : List_Id := No_List; +Pos : Int := Scope_Stack.Last; begin if Present (Desig_Typ) then @@ -7700,7 +7699,19 @@ package body Freeze is end if; if Is_Non_Empty_List (Freeze_Nodes) then - if No (Scope_Stack.Table (Pos).Pending_Freeze_Actions) then + + -- When the current scope is transient, insert the freeze nodes + -- prior to the expression that produced them. Transient scopes + -- may create additional declarations when finalizing objects + -- or managing the secondary stack. Inserting the freeze nodes + -- of those constructs prior to the scope would result in a + -- freeze-before-declaration, therefore the freeze node must + -- remain interleaved with their constructs. + + if Scope_Is_Transient then + Insert_Actions (N, Freeze_Nodes); + + elsif No (Scope_Stack.Table (Pos).Pending_Freeze_Actions) then Scope_Stack.Table (Pos).Pending_Freeze_Actions := Freeze_Nodes; else --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/freezing1.adb @@ -0,0 +1,5 @@ +-- { dg-do compile } + +package body Freezing1 is + procedure Foo is null; +end Freezing1; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/freezing1.ads @@ -0,0 +1,10 @@ +with Freezing1_Pack; use Freezing1_Pack; + +package Freezing1 is + type T is abstract tagged record + Collection : access I_Interface_Collection'Class := +new I_Interface_Collection'Class'(Factory.Create_Collection); + end record; + + procedure Foo; +end Freezing1; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/freezing1_pack.adb @@ -0,0 +1,8 @@ +package body Freezing1_Pack is + function Create_Collection + (Factory : in T_Factory) return I_Interface_Collection'Class + is + begin + return Implem'(null record); + end Create_Collection; +end Freezing1_Pack; --- /dev/null new file mode 100644 +++ gcc/testsuite/gnat.dg/freezing1_pack.ads @@ -0,0 +1,16 @@ +package Freezing1_Pack is + type T_Factory is abstract tagged private; + type I_Interface_Collection is interface; + + Factory : constant T_Factory; + + function Create_Collection + (Factory : in T_Factory) return I_Interface_Collection'Class; + + type Implem is new I_Interface_Collection with null record; + +private + type T_Factory is tagged null record; + + Factory : constant T_Factory := T_Factory'(null record); +end Freezing1_Pack;
Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format
On 7/1/19 2:56 PM, Giuliano Belinassi wrote: > Hi, > > On 07/01, Martin Jambor wrote: >> On Sat, Jun 29 2019, Giuliano Belinassi wrote: >>> This patch makes lto-dump dump the symbol table callgraph in graphviz >>> format. >> -ENOPATCH > Sorry, > I forgot the most important. Here it is. Hello. I like the intention! Please try to use: $ git diff HEAD > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py /tmp/patch which will show you some coding style issues. > >> Martin >> >>> I've not added any test to this because I couldn't find a way to call >>> lto-dump from the testsuite. Also, any feedback with regard to how can >>> I improve this is welcome. >>> >>> gcc/ChangeLog >>> 2019-06-29 Giuliano Belinassi >>> >>> * cgraph.c (dump_graphviz): New function >>> * cgraph.h (dump_graphviz): New function >>> * symtab.c (dump_graphviz): New function >>> * varpool.c (dump_graphviz): New function >>> >>> gcc/lto/ChangeLog >>> 2019-06-29 Giuliano Belinassi >>> >>> * lang.opt (flag_dump_callgraph): New flag >>> * lto-dump.c (dump_symtab_graphviz): New function >>> * lto-dump.c (dump_tool_help): New option >>> * lto-dump.c (lto_main): New option > Giuliano > > === > > diff --git gcc/cgraph.c gcc/cgraph.c > index 28019aba434..55f4ee0bdaa 100644 > > --- gcc/cgraph.c > > +++ gcc/cgraph.c > > @@ -2204,6 +2204,22 @@ > > cgraph_node::dump (FILE *f) > > } > } > +/* Dump call graph node to file F in graphviz format. */ > + > +void > +cgraph_node::dump_graphviz (FILE *f) > +{ > + cgraph_edge *edge; > + > + for (edge = callees; edge; edge = edge->next_callee) > + { > + cgraph_node *callee = edge->callee; > + > + fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ()); > + } > +} > + > + > /* Dump call graph node NODE to stderr. */ > DEBUG_FUNCTION void > > === > > diff --git gcc/cgraph.h gcc/cgraph.h > index 18839a4a5ec..181c00a3bd8 100644 > > --- gcc/cgraph.h > > +++ gcc/cgraph.h > > @@ -135,6 +135,9 @@ > > public: > > /* Dump symtab node to F. */ > void dump (FILE *f); > + /* Dump symtab callgraph in graphviz format*/ > + void dump_graphviz (FILE *f); > + > /* Dump symtab node to stderr. */ > void DEBUG_FUNCTION debug (void); > > @@ -1106,6 +1109,9 @@ > > public: > > /* Dump call graph node to file F. */ > void dump (FILE *f); > + /* Dump call graph node to file F. */ > + void dump_graphviz (FILE *f); > + > /* Dump call graph node to stderr. */ > void DEBUG_FUNCTION debug (void); > > @@ -1861,6 +1867,9 @@ > > public: > > /* Dump given varpool node to F. */ > void dump (FILE *f); > + /* Dump given varpool node in graphviz format to F. */ > + void dump_graphviz (FILE *f); > + > /* Dump given varpool node to stderr. */ > void DEBUG_FUNCTION debug (void); > > @@ -2279,6 +2288,9 @@ > > public: > > /* Dump symbol table to F. */ > void dump (FILE *f); > + /* Dump symbol table to F in graphviz format. */ > + void dump_graphviz (FILE *f); > + I would not define it for varpool_node as you don't need it. > /* Dump symbol table to stderr. */ > void DEBUG_FUNCTION debug (void); > > === > > diff --git gcc/lto/lang.opt gcc/lto/lang.opt > index 5bacef349e3..c62dd5aac08 100644 > > --- gcc/lto/lang.opt > > +++ gcc/lto/lang.opt > > @@ -127,6 +127,9 @@ > > help > > LTODump Var(flag_lto_dump_tool_help) > Dump the dump tool command line options. > +callgraph > +LTODump Var(flag_dump_callgraph) > +Dump the symtab callgraph. > fresolution= > LTO Joined > > === > > diff --git gcc/lto/lto-dump.c gcc/lto/lto-dump.c > index 691d109ff34..bc20120f2d5 100644 > > --- gcc/lto/lto-dump.c > > +++ gcc/lto/lto-dump.c > > @@ -197,6 +197,12 @@ > > void dump_list_variables (void) > > e->dump (); > } > +/* Dump symbol table in graphviz format. */ > +void dump_symtab_graphviz (void) > +{ > + symtab->dump_graphviz (stdout); > +} > + > /* Dump symbol list. */ > void dump_list (void) > > @@ -251,26 +257,27 @@ > > void dump_body () > > /* List of command line options for dumping. */ > void dump_tool_help () > { > - printf ("Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n"); > - printf ("LTO dump tool command line options.\n\n"); > - printf (" -list [options] Dump the symbol list.\n"); > - printf (" -demangle Dump the demangled output.\n"); > - printf (" -defined-only Dump only the defined symbols.\n"); > - printf (" -print-value Dump initial values of the " > - "variables.\n"); > - printf (" -name-sort Sort the symbols alphabetically.\n"); > - printf (" -size-sort Sort the symbols according to size.\n"); > - printf (" -reverse-sort Dump the symbols in reverse order.\n"); > - printf (" -symbol= Dump the details of specific symbol.\n"); > - printf (" -objects Dump the details of LTO objects.\n"); > - printf (" -type-stats Dump statistics of tr
Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format
Hi, On 07/01, Martin Jambor wrote: > On Sat, Jun 29 2019, Giuliano Belinassi wrote: > > This patch makes lto-dump dump the symbol table callgraph in graphviz > > format. > > -ENOPATCH Sorry, I forgot the most important. Here it is. > > Martin > > > > > I've not added any test to this because I couldn't find a way to call > > lto-dump from the testsuite. Also, any feedback with regard to how can > > I improve this is welcome. > > > > gcc/ChangeLog > > 2019-06-29 Giuliano Belinassi > > > > * cgraph.c (dump_graphviz): New function > > * cgraph.h (dump_graphviz): New function > > * symtab.c (dump_graphviz): New function > > * varpool.c (dump_graphviz): New function > > > > gcc/lto/ChangeLog > > 2019-06-29 Giuliano Belinassi > > > > * lang.opt (flag_dump_callgraph): New flag > > * lto-dump.c (dump_symtab_graphviz): New function > > * lto-dump.c (dump_tool_help): New option > > * lto-dump.c (lto_main): New option Giuliano diff --git gcc/cgraph.c gcc/cgraph.c index 28019aba434..55f4ee0bdaa 100644 --- gcc/cgraph.c +++ gcc/cgraph.c @@ -2204,6 +2204,22 @@ cgraph_node::dump (FILE *f) } } +/* Dump call graph node to file F in graphviz format. */ + +void +cgraph_node::dump_graphviz (FILE *f) +{ + cgraph_edge *edge; + + for (edge = callees; edge; edge = edge->next_callee) +{ + cgraph_node *callee = edge->callee; + + fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ()); +} +} + + /* Dump call graph node NODE to stderr. */ DEBUG_FUNCTION void diff --git gcc/cgraph.h gcc/cgraph.h index 18839a4a5ec..181c00a3bd8 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -135,6 +135,9 @@ public: /* Dump symtab node to F. */ void dump (FILE *f); + /* Dump symtab callgraph in graphviz format*/ + void dump_graphviz (FILE *f); + /* Dump symtab node to stderr. */ void DEBUG_FUNCTION debug (void); @@ -1106,6 +1109,9 @@ public: /* Dump call graph node to file F. */ void dump (FILE *f); + /* Dump call graph node to file F. */ + void dump_graphviz (FILE *f); + /* Dump call graph node to stderr. */ void DEBUG_FUNCTION debug (void); @@ -1861,6 +1867,9 @@ public: /* Dump given varpool node to F. */ void dump (FILE *f); + /* Dump given varpool node in graphviz format to F. */ + void dump_graphviz (FILE *f); + /* Dump given varpool node to stderr. */ void DEBUG_FUNCTION debug (void); @@ -2279,6 +2288,9 @@ public: /* Dump symbol table to F. */ void dump (FILE *f); + /* Dump symbol table to F in graphviz format. */ + void dump_graphviz (FILE *f); + /* Dump symbol table to stderr. */ void DEBUG_FUNCTION debug (void); diff --git gcc/lto/lang.opt gcc/lto/lang.opt index 5bacef349e3..c62dd5aac08 100644 --- gcc/lto/lang.opt +++ gcc/lto/lang.opt @@ -127,6 +127,9 @@ help LTODump Var(flag_lto_dump_tool_help) Dump the dump tool command line options. +callgraph +LTODump Var(flag_dump_callgraph) +Dump the symtab callgraph. fresolution= LTO Joined diff --git gcc/lto/lto-dump.c gcc/lto/lto-dump.c index 691d109ff34..bc20120f2d5 100644 --- gcc/lto/lto-dump.c +++ gcc/lto/lto-dump.c @@ -197,6 +197,12 @@ void dump_list_variables (void) e->dump (); } +/* Dump symbol table in graphviz format. */ +void dump_symtab_graphviz (void) +{ + symtab->dump_graphviz (stdout); +} + /* Dump symbol list. */ void dump_list (void) @@ -251,26 +257,27 @@ void dump_body () /* List of command line options for dumping. */ void dump_tool_help () { - printf ("Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n"); - printf ("LTO dump tool command line options.\n\n"); - printf (" -list [options] Dump the symbol list.\n"); - printf ("-demangle Dump the demangled output.\n"); - printf ("-defined-only Dump only the defined symbols.\n"); - printf ("-print-valueDump initial values of the " - "variables.\n"); - printf ("-name-sort Sort the symbols alphabetically.\n"); - printf ("-size-sort Sort the symbols according to size.\n"); - printf ("-reverse-sort Dump the symbols in reverse order.\n"); - printf (" -symbol= Dump the details of specific symbol.\n"); - printf (" -objects Dump the details of LTO objects.\n"); - printf (" -type-stats Dump statistics of tree types.\n"); - printf (" -tree-stats Dump statistics of trees.\n"); - printf (" -gimple-stats Dump statistics of gimple " - "statements.\n"); - printf (" -dump-body= Dump the specific gimple body.\n"); - printf (" -dump-level= Deciding the optimization level " - "of body.\n"); - printf (" -help Display the dump tool help.\n"); + const char *msg = +"Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n" +"LTO dump
Re: Use ODR for canonical types construction in LTO
Hi, this patch fixes the ICE. Problem is that va_arg type is pre-streamed and thus at stream-in time we have main variant constructed by LTO FE which is !CXX_ODR_P while vairants are ones comming from C++ FE which are ODR. It is safe to drop the flags here since we only care about main variants anyway. I am testing the patch on aarch64 now. Honza Index: lto/lto-common.c === --- lto/lto-common.c(revision 272846) +++ lto/lto-common.c(working copy) @@ -568,8 +568,17 @@ lto_register_canonical_types_for_odr_typ /* Register all remaining types. */ FOR_EACH_VEC_ELT (*types_to_register, i, t) -if (!TYPE_CANONICAL (t)) - gimple_register_canonical_type (t); +{ + /* For pre-streamed types like va-arg it is possible that main variant +is !CXX_ODR_P while the variant (which is streamed) is. +Copy CXX_ODR_P to make type verifier happy. This is safe because +in canonical type calculation we only consider main variants. +However we can not change this flag before streaming is finished +to not affect tree merging. */ + TYPE_CXX_ODR_P (t) = TYPE_CXX_ODR_P (TYPE_MAIN_VARIANT (t)); + if (!TYPE_CANONICAL (t)) +gimple_register_canonical_type (t); +} }
Re: [ARM/FDPIC v5 00/21] FDPIC ABI for ARM
ping^4 ? https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00817.html On Mon, 17 Jun 2019 at 13:41, Christophe Lyon wrote: > > ping^3 ? > > On Thu, 6 Jun 2019 at 14:36, Christophe Lyon > wrote: > > > > Hi, > > > > If this makes review easier, here are the areas covered by the patches: > > > > - patches 1,3,4,7,8,9,10,12,21: target-specific > > - patch 2: configure > > - patch 5,6,11,13: generic parts, undef #if defined(__FDPIC__) > > - patches 14-20: testsuite > > > > Christophe > > > > On Tue, 4 Jun 2019 at 14:57, Christophe Lyon > > wrote: > > > > > > Ping? > > > > > > > > > On Thu, 23 May 2019 at 14:46, Christophe Lyon > > > wrote: > > > > > > > > Ping? > > > > > > > > Any feedback other than what I got on patch 03/21 ? > > > > > > > > Thanks, > > > > > > > > Christophe > > > > > > > > > > > > On 15/05/2019 14:39, Christophe Lyon wrote: > > > > > Hello, > > > > > > > > > > This patch series implements the GCC contribution of the FDPIC ABI for > > > > > ARM targets. > > > > > > > > > > This ABI enables to run Linux on ARM MMU-less cores and supports > > > > > shared libraries to reduce the memory footprint. > > > > > > > > > > Without MMU, text and data segments relative distances are different > > > > > from one process to another, hence the need for a dedicated FDPIC > > > > > register holding the start address of the data segment. One of the > > > > > side effects is that function pointers require two words to be > > > > > represented: the address of the code, and the data segment start > > > > > address. These two words are designated as "Function Descriptor", > > > > > hence the "FD PIC" name. > > > > > > > > > > On ARM, the FDPIC register is r9 [1], and the target name is > > > > > arm-uclinuxfdpiceabi. Note that arm-uclinux exists, but uses another > > > > > ABI and the BFLAT file format; it does not support code sharing. > > > > > The -mfdpic option is enabled by default, and -mno-fdpic should be > > > > > used to build the Linux kernel. > > > > > > > > > > This work was developed some time ago by STMicroelectronics, and was > > > > > presented during Linaro Connect SFO15 (September 2015). You can watch > > > > > the discussion and read the slides [2]. > > > > > This presentation was related to the toolchain published on github > > > > > [3], > > > > > which is based on binutils-2.22, gcc-4.7, uclibc-0.9.33.2, gdb-7.5.1 > > > > > and qemu-2.3.0, and for which pre-built binaries are available [3]. > > > > > > > > > > The ABI itself is described in details in [1]. > > > > > > > > > > Our Linux kernel patches have been updated and committed by Nicolas > > > > > Pitre (Linaro) in July 2017. They are required so that the loader is > > > > > able to handle this new file type. Indeed, the ELF files are tagged > > > > > with ELFOSABI_ARM_FDPIC. This new tag has been allocated by ARM, as > > > > > well as the new relocations involved. > > > > > > > > > > The binutils, QEMU and uclibc-ng patch series have been merged a few > > > > > months ago. [4][5][6] > > > > > > > > > > This series provides support for architectures that support ARM and/or > > > > > Thumb-2 and has been tested on arm-linux-gnueabi without regression, > > > > > as well as arm-uclinuxfdpiceabi, using QEMU. arm-uclinuxfdpiceabi has > > > > > a few more failures than arm-linux-gnueabi, but is quite functional. > > > > > > > > > > I have also booted an STM32 board (stm32f469) which uses a cortex-m4 > > > > > with linux-4.20.17 and ran successfully several tools. > > > > > > > > > > Are the GCC patches OK for inclusion in master? > > > > > > > > > > Changes between v4 and v5: > > > > > - rebased on top of recent gcc-10 master (April 26th, 2019) > > > > > - fixed handling of stack-protector combined patterns in FDPIC mode > > > > > > > > > > Changes between v3 and v4: > > > > > > > > > > - improved documentation (patch 1) > > > > > - emit an error message (sorry) if the target architecture does not > > > > >support arm nor thumb-2 modes (patch 4) > > > > > - handle Richard's comments on patch 4 (comments, unspec) > > > > > - added .align directive (patch 5) > > > > > - fixed use of kernel helpers (__kernel_cmpxchg, __kernel_dmb) (patch > > > > > 6) > > > > > - code factorization in patch 7 > > > > > - typos/internal function name in patch 8 > > > > > - improved patch 12 > > > > > - dropped patch 16 > > > > > - patch 20 introduces arm_arch*_thumb_ok effective targets to help > > > > >skip some tests > > > > > - I tested patch 2 on xtensa-buildroot-uclinux-uclibc, it adds many > > > > >new tests, but a few regressions > > > > >(https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00713.html) > > > > > - I compiled and executed several LTP tests to exercise pthreads and > > > > > signals > > > > > - I wrote and executed a simple testcase to change the interaction > > > > >with __kernel_cmpxchg (ie. call the kernel helper rather than use > > > > > an > > > > >implementation in libgcc as requested by Richard) > > > > > > >
Re: [PATCH][ARM] Add support for "noinit" attribute
ping? On Thu, 13 Jun 2019 at 17:13, Christophe Lyon wrote: > > Hi, > > Similar to what already exists for TI msp430 or in TI compilers for > arm, this patch adds support for "noinit" attribute for arm. It's very > similar to the corresponding code in GCC for msp430. > > It is useful for embedded targets where the user wants to keep the > value of some data when the program is restarted: such variables are > not zero-initialized.It is mostly a helper/shortcut to placing > variables in a dedicated section. > > It's probably desirable to add the following chunk to the GNU linker: > diff --git a/ld/emulparams/armelf.sh b/ld/emulparams/armelf.sh > index 272a8bc..9555cec 100644 > --- a/ld/emulparams/armelf.sh > +++ b/ld/emulparams/armelf.sh > @@ -10,7 +10,19 @@ OTHER_TEXT_SECTIONS='*(.glue_7t) *(.glue_7) > *(.vfp11_veneer) *(.v4_bx)' > OTHER_BSS_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__bss_start__ = > .${CREATE_SHLIB+)};" > OTHER_BSS_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}_bss_end__ = > .${CREATE_SHLIB+)}; ${CREATE_SHLIB+PROVIDE (}__bss_end__ = > .${CREATE_SHLIB+)};" > OTHER_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__end__ = .${CREATE_SHLIB+)};" > -OTHER_SECTIONS='.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }' > +OTHER_SECTIONS=' > +.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) } > + /* This section contains data that is not initialised during load > + *or* application reset. */ > + .noinit (NOLOAD) : > + { > + . = ALIGN(2); > + PROVIDE (__noinit_start = .); > + *(.noinit) > + . = ALIGN(2); > + PROVIDE (__noinit_end = .); > + } > +' > > so that the noinit section has the "NOLOAD" flag. > > I'll submit that part separately to the binutils project if OK. > > OK? > > Thanks, > > Christophe
Re: [PING][AArch64] Use scvtf fbits option where appropriate
Hi Joel, This looks good. One more thing, the patterns need to be conditional on check flag_trapping_math since the division can underflow and reassociating it would remove that. Other than that I think this is ready, but I can't approve. Wilco
Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default
On Mon, Jul 01, 2019 at 12:30:41PM +0200, Richard Biener wrote: > On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt wrote: > > > > On 6/27/19 6:45 AM, Segher Boessenkool wrote: > > > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote: > > >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt > > >> wrote: > > >>> We've done some experimenting and realized that the subject option > > >>> almost > > >>> always provide improved performance for Power when the loop unroller is > > >>> enabled. So this patch turns that flag on by default for us. > > >> I guess it creates more freedom for combine (more single-uses) and > > >> register > > >> allocation. I wonder in which cases this might pessimize things? I > > >> guess > > >> the pre-RA scheduler might make RAs life harder with creating overlapping > > >> life-ranges. > > >> > > >> I guess you didn't actually investigate the nature of the improvements > > >> you saw? > > > It breaks the length of dependency chains by a factor equal to the unroll > > > factor. I do not know why this doesn't help a lot everywhere. It of > > > course raises register pressure, maybe that is just it? > > > > Right, it's all about breaking dependencies to more efficiently exploit > > the microarchitecture. By default, variable expansion in GCC is quite > > conservative, creating only two reduction streams out of one, so it's > > pretty rare for it to cause spill. This can be adjusted upwards with > > --param max-variable-expansions-in-unroller=n. Our experiments show > > that raising n to 4 starts to cause some minor degradations, which are > > almost certainly due to pressure, so the default setting looks appropriate. > > But it's probably only an issue for targets which enable pre-RA scheduling > by default? It might also increase RA compile-time (more allocnos). It only helps super-scalar CPUs normally. It doesn't increase register use much at all, compared to only doing unrolling. It helps scheduling a lot, but I don't see where sched1 comes in here? To see if it helps your arch, just try it out? Segher
Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch
On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke wrote: > > The heuristic introduced for PR71016 prevents recognizing a max / min > combo like it is used for > saturation when followed by a conversion. > The attached patch refines the heuristic to allow this case. Regression > tested on x86_64-pc-linux-gnu . Few style nits: if (!gsi_end_p (gsi)) - return NULL; + { + gimple *assign = gsi_stmt (gsi); + if (gimple_code (assign) != GIMPLE_ASSIGN) + return NULL; if (gassign *assign = dyn_cast (gsi_stmt (gsi))) { + tree lhs = gimple_assign_lhs (assign); + enum tree_code ass_code = gimple_assign_rhs_code (assign); + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR) + return NULL; + gsi_prev_nondebug (&gsi); + if (!gsi_end_p (gsi)) + return NULL; } else return NULL; + } also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt) otherwise you'd also allow MIN/MAX unrelated to the conversion detected. On x86 I see the MIN_EXPR is already detected by GENERIC folding, I wonder if that is required or if we can handle the case without in one phiopt pass invocation as well. Otherwise looks good to me. Thanks, Richard.
Re: [patch][aarch64]: fix unrecognizable insn for ldr got in ilp32 tiny
Hi Silvia, > Combined them into one pattern. Updated the diff and the changelog is now: gcc/ChangeLog: 2019-06-18 Sylvia Taylor * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Change SYMBOL_TINY_GOT. * config/aarch64/aarch64.md (ldr_got_tiny_): New pattern. (ldr_got_tiny_sidi): New pattern. Thanks, this looks fine to me, but I can't approve. As an aside there is an inconsistency in the mode used for symbols throughout aarch64.md. Some have no mode, others use PTR, and some cases use DI mode (which means ILP32 may want SImode for some symbols and DImode in other cases...). Since various existing patterns have this issue it's not a problem with this patch. Wilco
Re: [PATCH 04/30] Changes to arc
* acsaw...@linux.ibm.com [2019-06-25 15:22:13 -0500]: > From: Aaron Sawdey > > * config/arc/arc-protos.h: Change movmem to cpymem. > * config/arc/arc.c (arc_expand_movmem): Change movmem to cpymem. > * config/arc/arc.h: Change movmem to cpymem in comment. > * config/arc/arc.md (movmemsi): Change movmem to cpymem. OK for ARC. Thanks, Andrew > --- > gcc/config/arc/arc-protos.h | 2 +- > gcc/config/arc/arc.c| 6 +++--- > gcc/config/arc/arc.h| 2 +- > gcc/config/arc/arc.md | 4 ++-- > 4 files changed, 7 insertions(+), 7 deletions(-) > > diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h > index f501bc3..74e5247 100644 > --- a/gcc/config/arc/arc-protos.h > +++ b/gcc/config/arc/arc-protos.h > @@ -35,7 +35,7 @@ extern void arc_final_prescan_insn (rtx_insn *, rtx *, int); > extern const char *arc_output_libcall (const char *); > extern int arc_output_addsi (rtx *operands, bool, bool); > extern int arc_output_commutative_cond_exec (rtx *operands, bool); > -extern bool arc_expand_movmem (rtx *operands); > +extern bool arc_expand_cpymem (rtx *operands); > extern bool prepare_move_operands (rtx *operands, machine_mode mode); > extern void emit_shift (enum rtx_code, rtx, rtx, rtx); > extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx); > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c > index 89f69c79..23171d2 100644 > --- a/gcc/config/arc/arc.c > +++ b/gcc/config/arc/arc.c > @@ -8883,7 +8883,7 @@ arc_output_commutative_cond_exec (rtx *operands, bool > output_p) >return 8; > } > > -/* Helper function of arc_expand_movmem. ADDR points to a chunk of memory. > +/* Helper function of arc_expand_cpymem. ADDR points to a chunk of memory. > Emit code and return an potentially modified address such that offsets > up to SIZE are can be added to yield a legitimate address. > if REUSE is set, ADDR is a register that may be modified. */ > @@ -8917,7 +8917,7 @@ force_offsettable (rtx addr, HOST_WIDE_INT size, bool > reuse) > offset ranges. Return true on success. */ > > bool > -arc_expand_movmem (rtx *operands) > +arc_expand_cpymem (rtx *operands) > { >rtx dst = operands[0]; >rtx src = operands[1]; > @@ -10427,7 +10427,7 @@ arc_use_by_pieces_infrastructure_p (unsigned > HOST_WIDE_INT size, > enum by_pieces_operation op, > bool speed_p) > { > - /* Let the movmem expander handle small block moves. */ > + /* Let the cpymem expander handle small block moves. */ >if (op == MOVE_BY_PIECES) > return false; > > diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h > index 80dead9..4a9dd07 100644 > --- a/gcc/config/arc/arc.h > +++ b/gcc/config/arc/arc.h > @@ -1423,7 +1423,7 @@ do { \ > in one reasonably fast instruction. */ > #define MOVE_MAX 4 > > -/* Undo the effects of the movmem pattern presence on STORE_BY_PIECES_P . */ > +/* Undo the effects of the cpymem pattern presence on STORE_BY_PIECES_P . */ > #define MOVE_RATIO(SPEED) ((SPEED) ? 15 : 3) > > /* Define this to be nonzero if shift instructions ignore all but the > diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md > index 528e344..ba595dd 100644 > --- a/gcc/config/arc/arc.md > +++ b/gcc/config/arc/arc.md > @@ -5122,13 +5122,13 @@ core_3, archs4x, archs4xd, archs4xd_slow" > (set_attr "type" "loop_end") > (set_attr "length" "4,20")]) > > -(define_expand "movmemsi" > +(define_expand "cpymemsi" >[(match_operand:BLK 0 "" "") > (match_operand:BLK 1 "" "") > (match_operand:SI 2 "nonmemory_operand" "") > (match_operand 3 "immediate_operand" "")] >"" > - "if (arc_expand_movmem (operands)) DONE; else FAIL;") > + "if (arc_expand_cpymem (operands)) DONE; else FAIL;") > > ;; Close http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35803 if this works > ;; to the point that we can generate cmove instructions. > -- > 2.7.4 >
RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch
The heuristic introduced for PR71016 prevents recognizing a max / min combo like it is used for saturation when followed by a conversion. The attached patch refines the heuristic to allow this case. Regression tested on x86_64-pc-linux-gnu . Index: tree-ssa-phiopt.c === --- tree-ssa-phiopt.c (revision 272846) +++ tree-ssa-phiopt.c (working copy) @@ -504,7 +504,18 @@ factor_out_conditional_conversion (edge gsi = gsi_for_stmt (arg0_def_stmt); gsi_prev_nondebug (&gsi); if (!gsi_end_p (gsi)) - return NULL; + { + gimple *assign = gsi_stmt (gsi); + if (gimple_code (assign) != GIMPLE_ASSIGN) + return NULL; + tree lhs = gimple_assign_lhs (assign); + enum tree_code ass_code = gimple_assign_rhs_code (assign); + if (ass_code != MAX_EXPR && ass_code != MIN_EXPR) + return NULL; + gsi_prev_nondebug (&gsi); + if (!gsi_end_p (gsi)) + return NULL; + } gsi = gsi_for_stmt (arg0_def_stmt); gsi_next_nondebug (&gsi); if (!gsi_end_p (gsi)) Index: testsuite/gcc.dg/tree-ssa/pr66726-4.c === --- testsuite/gcc.dg/tree-ssa/pr66726-4.c (nonexistent) +++ testsuite/gcc.dg/tree-ssa/pr66726-4.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-phiopt1-details" } */ + +#define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x)) + +void +foo (unsigned char *p, int i) +{ + *p = SAT (i); +} + +/* { dg-final { scan-tree-dump-times "COND_EXPR .*and PHI .*converted to straightline code" 1 "phiopt1" } } */
Re: [PATCH 2/2] Rename SINGE_VALUE to TOPN_VALUES counters.
@Honza: PING^1 On 6/20/19 4:46 PM, Martin Liška wrote: > And the second part is rename so that it reflect reality > that single value can actually track multiple values. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin >
Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization
@Honza: PING^1 On 6/20/19 4:45 PM, Martin Liška wrote: > Hi. > > So the first part is about support of N tracked values to be supported. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin >
Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block
On Fri, Jun 28, 2019 at 11:43 AM Alexandre Oliva wrote: > > On Jun 27, 2019, Richard Biener wrote: > > > On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva wrote: > >> > >> The only preexisting use of GIMPLE_EH_ELSE, for transactional memory > >> commits, did not allow exceptions to escape from the ELSE path. The > >> trick it uses to allow the ELSE path to see the propagating exception > >> does not work very well if the exception cleanup raises further > >> exceptions: the ELSE block is configured to handle exceptions in > >> itself. This confuses the heck out of CFG and EH cleanups. > >> > >> Basing the lowering context for the ELSE block on outer_state, rather > >> than this_state, gets us the expected enclosing handler. > >> > >> Regstrapped on x86_64-linux-gnu. Ok to install? > > > Testcase? > > None possible with the current codebase, I'm afraid. Nothing creates > EH_ELSE yet, and nothing creates GIMPLE_EH_ELSE with > exception-generating cleanups. Oh, I see. The GIMPLE frontend is also missing parsing support for the EH stmt kinds, it might have been possible to build a testcase with that I guess (yeah, only a very slight hint ... ;)). Can you share a .gimple dump that has the issue? So the testcase "survives" CFG construction but then crashes during the first CFG cleanup or only after some EH optimization? Thanks, Richard. > The following patch, an extract of a larger patch that is still under > internal review (and pending approval of the EH_ELSE-related changes > ;-), is minimized into pointlessness so as to just exercise the > GIMPLE_EH_ELSE lowering issue. > > IIRC the problem is NOT immediately triggered in a bootstrap, it > requires more elaborate EH scenarios to trigger it: without the > GIMPLE_EH_ELSE lowering patch, a few libadalang units failed to compile > within delete_unreachable_blocks, using a compiler built with the patch > below and the patch that introduced EH_ELSE that I posted yesterday. > > At first, I suspected GIMPLE_EH_ELSE had bit-rotted, as it doesn't seem > to get much use, but the problem turned out to be caused by the > nonsensical CFG resulting from the GIMPLE_EH_ELSE lowering, that breaks > EH CFG optimizations and end up corrupting the CFG. IIRC it was > unsplit_eh that mishandled self edges that arise after a bunch of other > valid transformations. I cannot observe the crash with trunk any more, > but the EH tree is visibly wrong in tree dumps with eh and blocks, > compiling such a simple testcase as this (with a patched compiler as > described above): > > -- compile with -Ofast -g -c > with GNAT.Semaphores; use GNAT.Semaphores; > package T is >subtype Mutual_Exclusion is Binary_Semaphore > (Initially_Available => True, > Ceiling => Default_Ceiling); >Lock : aliased Mutual_Exclusion; > end T; > > > Self edges end up arising because of the way GIMPLE_EH_ELSE was lowered: > the exceptional cleanup was lowered as if within the TRY_FINALLY_EXPR > TRY block, whose exceptions get handled by the exceptional cleanup, but > both cleanup paths should be lowered as if after the TRY_FINALLY_EXPR, > so that an enclosing EH region is used should they throw exceptions. > > The current lowering made sense for cleanups that couldn't throw: no EH > edge would come out of the lowered exceptional cleanup block. However, > EH_ELSE-using code below breaks that assumption. Fortunately, the > assumption was not essential for GIMPLE_EH_ELSE: the lowering code just > needed this small adjustment to make room for relaxing the requirement > that the cleanups couldn't throw and restoring CFG EH edges that match > what we get without the patch below. > > > --- gcc/ada/gcc-interface/trans.c > +++ gcc/ada/gcc-interface/trans.c > @@ -6249,7 +6249,7 @@ Exception_Handler_to_gnu_gcc (Node_Id gnat_node) >if (stmt_list_cannot_alter_control_flow_p (Statements (gnat_node))) > add_stmt_with_node (stmt, gnat_node); >else > -add_cleanup (stmt, gnat_node); > +add_cleanup (build2 (EH_ELSE, void_type_node, stmt, stmt), gnat_node); > >gnat_poplevel (); > > @@ -9066,7 +9081,29 @@ add_cleanup (tree gnu_cleanup, Node_Id g > { >if (Present (gnat_node)) > set_expr_location_from_node (gnu_cleanup, gnat_node, true); > - append_to_statement_list (gnu_cleanup, ¤t_stmt_group->cleanups); > + /* An EH_ELSE must be by itself, and that's all we need when we use > + it. The assert below makes sure that is so. Should we ever need > + more than that, we could combine EH_ELSEs, and copy non-EH_ELSE > + stmts into both cleanup paths of an EH_ELSE, being careful to > + make sure the exceptional path doesn't throw. */ > + if (TREE_CODE (gnu_cleanup) == EH_ELSE) > +{ > + gcc_assert (!current_stmt_group->cleanups); > + if (Present (gnat_node)) > + { > + set_expr_location_from_node (TREE_OPERAND (gnu_cleanup, 0), > + gnat_node, true); > + set_expr_location_from_
Re: [patch, c++ openmp] Improve diagnostics for unmappable types
On 28/06/2019 17:21, Jason Merrill wrote: + inform ((decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION), + "incomplete types are not mappable"); It's better to use input_location as a fallback; essentially no diagnostics should use UNKNOWN_LOCATION. And let's print the type with %qT. + if (notes) + result = false; + else + return false; Returning early when !notes doesn't seem worth the extra lines of code. How is this version? Andrew Improve OpenMP map diagnostics. 2019-07-01 Andrew Stubbs gcc/cp/ * cp-tree.h (cp_omp_emit_unmappable_type_notes): New prototype. * decl.c (cp_finish_decl): Call cp_omp_emit_unmappable_type_notes. * decl2.c (cp_omp_mappable_type): Move contents to ... (cp_omp_mappable_type_1): ... here and add note output. (cp_omp_emit_unmappable_type_notes): New function. * semantics.c (finish_omp_clauses): Call cp_omp_emit_unmappable_type_notes in four places. gcc/testsuite/ * g++.dg/gomp/unmappable-1.C: New file. diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index bf47f67721e..a7b2151e6dd 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6533,6 +6533,7 @@ extern int parm_index (tree); extern tree vtv_start_verification_constructor_init_function (void); extern tree vtv_finish_verification_constructor_init_function (tree); extern bool cp_omp_mappable_type (tree); +extern bool cp_omp_emit_unmappable_type_notes (tree); extern void cp_check_const_attributes (tree); /* in error.c */ diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 5d49535b0d9..74343bc1ec4 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -7433,8 +7433,11 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p, DECL_ATTRIBUTES (decl)); complete_type (TREE_TYPE (decl)); if (!cp_omp_mappable_type (TREE_TYPE (decl))) - error ("%q+D in declare target directive does not have mappable type", - decl); + { + error ("%q+D in declare target directive does not have mappable" + " type", decl); + cp_omp_emit_unmappable_type_notes (TREE_TYPE (decl)); + } else if (!lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)) && !lookup_attribute ("omp declare target link", diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index 206f04c6320..b415716c7dd 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -1406,32 +1406,68 @@ cp_check_const_attributes (tree attributes) } } -/* Return true if TYPE is an OpenMP mappable type. */ -bool -cp_omp_mappable_type (tree type) +/* Return true if TYPE is an OpenMP mappable type. + If NOTES is non-zero, emit a note message for each problem. */ +static bool +cp_omp_mappable_type_1 (tree type, bool notes) { + bool result = true; + /* Mappable type has to be complete. */ if (type == error_mark_node || !COMPLETE_TYPE_P (type)) -return false; +{ + if (notes) + { + tree decl = TYPE_MAIN_DECL (type); + inform ((decl ? DECL_SOURCE_LOCATION (decl) : input_location), + "incomplete type %qT is not mappable", type); + } + result = false; +} /* Arrays have mappable type if the elements have mappable type. */ while (TREE_CODE (type) == ARRAY_TYPE) type = TREE_TYPE (type); /* A mappable type cannot contain virtual members. */ if (CLASS_TYPE_P (type) && CLASSTYPE_VTABLES (type)) -return false; +{ + if (notes) + inform (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (type)), + "type %qT with virtual members is not mappable", type); + result = false; +} /* All data members must be non-static. */ if (CLASS_TYPE_P (type)) { tree field; for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field)) if (VAR_P (field)) - return false; + { + if (notes) + inform (DECL_SOURCE_LOCATION (field), + "static field %qD is not mappable", field); + result = false; + } /* All fields must have mappable types. */ else if (TREE_CODE (field) == FIELD_DECL - && !cp_omp_mappable_type (TREE_TYPE (field))) - return false; + && !cp_omp_mappable_type_1 (TREE_TYPE (field), notes)) + result = false; } - return true; + return result; +} + +/* Return true if TYPE is an OpenMP mappable type. */ +bool +cp_omp_mappable_type (tree type) +{ + return cp_omp_mappable_type_1 (type, false); +} + +/* Return true if TYPE is an OpenMP mappable type. + Emit an error messages if not. */ +bool +cp_omp_emit_unmappable_type_notes (tree type) +{ + return cp_omp_mappable_type_1 (type, true); } /* Return the last pushed declaration for the symbol DECL or NULL diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 92c48753d42..8f019580d0f 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -7090,6 +7090,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) "array section does not have mappable type " "in %qs clause", omp_clause_code_name[OMP_CLAUSE_CODE (c)]); + cp_omp_emit_u
Re: introduce EH_ELSE tree and gimplifier
On Fri, Jun 28, 2019 at 5:21 AM Alexandre Oliva wrote: > > On Jun 27, 2019, Richard Biener wrote: > > > On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva wrote: > > >> @@ -909,6 +909,13 @@ DEFTREECODE (TRY_CATCH_EXPR, "try_catch_expr", > >> tcc_statement, 2) > >> The second operand is a cleanup expression which is evaluated > >> on any exit (normal, exception, or jump out) from this expression. */ > >> DEFTREECODE (TRY_FINALLY_EXPR, "try_finally", tcc_statement, 2) > >> + > >> +/* Evaluate either the normal or the exceptional cleanup. This must > >> + only be present as the cleanup expression in a TRY_FINALLY_EXPR. > >> + If the TRY_FINALLY_EXPR completes normally, the first operand of > >> + EH_ELSE is used as a cleanup, otherwise the second operand is > >> + used. */ > >> +DEFTREECODE (EH_ELSE, "eh_else", tcc_statement, 2) > > > It's a bit weird that this is a tcc_statement as opposed to tcc_expression, > > Erhm... What's weird about it? It is not something that belongs in an > arbitrary expr, it's a lot more like a top-level statement, like > try_finally, having side effects but no useful value. It's weird because it appears in a TRY_FINALLY statement operand. But I guess the line between statements and expressions in GENERIC is muddy... > > also I'd have called it EH_ELSE_EXPR for clarity. > > Now *that* would be weird IMHO. No offense intended, but I'd have > dropped the _EXPR from TRY_FINALLY_EXPR to make it match the > "try_finally" string. OK, let me say for consistency then ... > What reasons are there for them to deserve the _EXPR suffix? History I guess? All 'old' tcc_statement are _EXPR. > If I rename it to EH_ELSE_EXPR, should GIMPLE_EH_ELSE also get a _EXPR > suffix? > > > > I also wonder why TRY_FINALLY_EXPR didn't just get a third optional > > operand? > > I considered that possibility, but I didn't expect EH_ELSE to be used > often enough to bloat every TRY_FINALLY_EXPR with another operand, plus > a flag or some other means to remove ambiguity between same-cleanup and > no-else-cleanup. Fair enough. Richard. > > -- > Alexandre Oliva, freedom fighter he/him https://FSFLA.org/blogs/lxo > Be the change, be Free! FSF Latin America board member > GNU Toolchain EngineerFree Software Evangelist > Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara
Re: [PATCH] constrain one character optimization to one character stores (PR 90989)
On Fri, Jun 28, 2019 at 2:16 AM Jeff Law wrote: > > On 6/27/19 10:12 AM, Jakub Jelinek wrote: > > On Thu, Jun 27, 2019 at 09:15:41AM -0600, Jeff Law wrote: > >> Actually it was trivial to create with store merging. > >> > >> char x[20]; > >> foo() > >> { > >> x[0] = 0x41; > >> x[1] = 0x42; > >> } > >> > >> MEM [(char *)&x] = 16961; > >> > >> So clearly we can get this in gimple. So we need a check of some kind, > >> either on the LHS or the RHS to ensure that we actually have a single > >> character store as opposed to something like the above. > > > > handle_char_store is only called if: > > tree type = TREE_TYPE (lhs); > > if (TREE_CODE (type) == ARRAY_TYPE) > > type = TREE_TYPE (type); > > if (TREE_CODE (type) == INTEGER_TYPE > > && TYPE_MODE (type) == TYPE_MODE (char_type_node) > > && TYPE_PRECISION (type) == TYPE_PRECISION (char_type_node)) > > { > > if (! handle_char_store (gsi)) > > return false; > > } > > so the above case will not trigger, the only way to call it for multiple > > character stores is if you have an array. And so far I've succeeded > > creating that just with strcpy builtin. > How I could have missed it so many times is a mystery to me. I > literally was looking at this on the phone with Martin today for the nth > time and my brain just wouldn't parse this code correctly. > > We can only get into handle_char_store for an LHS that is a character or > array of characters. So the only case we have to worry about would be > if we have a constant array on the RHS. Something like this from Jakub: > > MEM [(char *)&x] = MEM [(char *)"ab"]; > > But in this case the RHS is going to be an ARRAY_TYPE. And thus > Martin's new code would reject it because it would only do the > optmiization if the RHS has INTEGER_TYPE. > > So I think my concerns are ultimately a non-issue. > > FWIW if you're trying to get something like Jakub's assignment, this > will do it: > > char a[7]; > int f () > { > __builtin_strcpy (a, "654321"); > } > > Which results in this being passed to handle_char_store: > > MEM [(char * {ref-all})&a] = MEM > [(char * {ref-all})"654321"]; > > We can make it more problematical by first doing a strcpy into a so that > we create an SI structure. Something like this: > > char a[7]; > int f () > { > strcpy (a, "123"); > __builtin_strcpy (a, "654321"); > } > > Compiled with -O2 -fno-tree-dse (the -fno-tree-dse ensures the first > strcpy is not eliminated). > > That should get us into handle_char_store and into the new code from > Martin's patch that wants to test: > > > > + if (cmp > 0 > + && storing_nonzero_p > + && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE) > > But the RHS in this case has an ARRAY_TYPE and Martin's test would > reject it. > > > So at this point my concerns are alleviated. > > Jakub, Richi, do either of you have concerns? I've kindof owned the > review of this patch from Martin, but since y'all have chimed in, I'd > like to give you another chance to chime in before I ACK. > > jeff > > ps. FWIW, the gimple FE seems to really dislike constant strings in MEM > expressions like we've been working with. It also seems to go into an > infinite loop if you've got an extra closing paren on the RHS.. It was > still useful to poke at things a bit. One could argue that if we get > the FE to a suitable state that it could define what is and what is not > valid gimple. That would likely be incredibly help for this kind of stuff. ;) error handling is a bit weak indeed. I'm testing/committing the following to allow string literals in __MEM which needed string address literals. I guess for Fortran we need to extend this to cover CONST_DECLs which should eventually appear as _Literal (int *) &5 for example. 2019-07-01 Richard Biener c/ * gimple-parser.c (c_parser_gimple_postfix_expression): Handle _Literal (char *) &"foo" for address literals pointing to STRING_CSTs. * gcc.dg/gimplefe-42.c: New testcase. gimplefe-addr-taken-string-literal Description: Binary data
[PATCH 2/2] Add zstd support for LTO bytecode compression.
Hi. This is updated version of the zstd patch that should handle all what Joseph pointed out. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin >From 5d2006c9c4d481f4083d5a591327ee64847b0bf7 Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Thu, 20 Jun 2019 10:08:17 +0200 Subject: [PATCH 2/2] Add zstd support for LTO bytecode compression. gcc/ChangeLog: 2019-07-01 Martin Liska * Makefile.in: Define ZSTD_LIB. * common.opt: Adjust compression level to support also zstd levels. * config.in: Regenerate. * configure: Likewise. * configure.ac: Add --with-zstd and --with-zstd-include options and detect ZSTD. * doc/install.texi: Mention zstd dependency. * gcc.c: Print supported LTO compression algorithms. * lto-compress.c (lto_normalized_zstd_level): Likewise. (lto_compression_zstd): Likewise. (lto_uncompression_zstd): Likewise. (lto_end_compression): Dispatch in between zlib and zstd. (lto_compression_zlib): Mark with ATTRIBUTE_UNUSED. (lto_uncompression_zlib): Make it static. * lto-compress.h (lto_end_uncompression): Fix GNU coding style. * lto-section-in.c (lto_get_section_data): Pass info about used compression. * lto-streamer-out.c: By default use zstd when possible. * timevar.def (TV_IPA_LTO_DECOMPRESS): Rename to decompression (TV_IPA_LTO_COMPRESS): Likewise for compression. --- gcc/Makefile.in| 4 +- gcc/common.opt | 4 +- gcc/config.in | 6 ++ gcc/configure | 163 - gcc/configure.ac | 66 + gcc/doc/install.texi | 6 ++ gcc/gcc.c | 5 ++ gcc/lto-compress.c | 141 +-- gcc/lto-compress.h | 3 +- gcc/lto-section-in.c | 2 +- gcc/lto-streamer-out.c | 4 + gcc/timevar.def| 4 +- 12 files changed, 378 insertions(+), 30 deletions(-) diff --git a/gcc/Makefile.in b/gcc/Makefile.in index d9e0885b96b..597dc01328b 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1065,7 +1065,7 @@ BUILD_LIBDEPS= $(BUILD_LIBIBERTY) LIBS = @LIBS@ libcommon.a $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBBACKTRACE) \ $(LIBIBERTY) $(LIBDECNUMBER) $(HOST_LIBS) BACKENDLIBS = $(ISLLIBS) $(GMPLIBS) $(PLUGINLIBS) $(HOST_LIBS) \ - $(ZLIB) + $(ZLIB) $(ZSTD_LIB) # Any system libraries needed just for GNAT. SYSLIBS = @GNAT_LIBEXC@ @@ -1076,6 +1076,8 @@ GNATMAKE = @GNATMAKE@ # Libs needed (at present) just for jcf-dump. LDEXP_LIB = @LDEXP_LIB@ +ZSTD_LIB = @ZSTD_LIB@ + # Likewise, for use in the tools that must run on this machine # even if we are cross-building GCC. BUILD_LIBS = $(BUILD_LIBIBERTY) diff --git a/gcc/common.opt b/gcc/common.opt index a1544d06824..3b71a36552b 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1888,8 +1888,8 @@ Specify the algorithm to partition symbols and vars at linktime. ; The initial value of -1 comes from Z_DEFAULT_COMPRESSION in zlib.h. flto-compression-level= -Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 9) --flto-compression-level= Use zlib compression level for IL. +Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 19) +-flto-compression-level= Use zlib/zstd compression level for IL. flto-odr-type-merging Common Ignore diff --git a/gcc/config.in b/gcc/config.in index a718ceaf3da..13fd7959dd7 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -1926,6 +1926,12 @@ #endif +/* Define if you have a working header file. */ +#ifndef USED_FOR_TARGET +#undef HAVE_ZSTD_H +#endif + + /* Define if isl is in use. */ #ifndef USED_FOR_TARGET #undef HAVE_isl diff --git a/gcc/configure b/gcc/configure index 955e9ccc09b..8c9f7742ac7 100755 --- a/gcc/configure +++ b/gcc/configure @@ -782,6 +782,8 @@ manext LIBICONV_DEP LTLIBICONV LIBICONV +ZSTD_LIB +ZSTD_INCLUDE DL_LIB LDEXP_LIB EXTRA_GCC_LIBS @@ -959,6 +961,9 @@ with_pkgversion with_bugurl enable_languages with_multilib_list +with_zstd +with_zstd_include +with_zstd_lib enable_rpath with_libiconv_prefix enable_sjlj_exceptions @@ -1783,6 +1788,12 @@ Optional Packages: --with-pkgversion=PKG Use PKG in the version string in place of "GCC" --with-bugurl=URL Direct users to URL to report a bug --with-multilib-listselect multilibs (AArch64, SH and x86-64 only) + --with-zstd=PATHspecify prefix directory for installed zstd library. + Equivalent to --with-zstd-include=PATH/include plus + --with-zstd-lib=PATH/lib + --with-zstd-include=PATH + specify directory for installed zstd include files + --with-zstd-lib=PATHspecify directory for the installed zstd library --with-gnu-ld assume the C compiler uses GNU ld default=no --with-libiconv-prefix[=DIR] search for libiconv in DIR/include and DIR/lib --without-libiconv-prefix don't search for libiconv in includ
Re: [PATCH] Add .gnu.lto_.lto section.
Hi. Ok, so there's a version with added ChangeLog that survives regression tests. Ready to be installed? Thanks, Martin >From e6745583dc4b7f5543878c0a25498e818531f73e Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Fri, 21 Jun 2019 12:14:04 +0200 Subject: [PATCH 1/2] Add .gnu.lto_.lto section. gcc/ChangeLog: 2019-07-01 Martin Liska * lto-section-in.c (lto_get_section_data): Add "lto" section. * lto-section-out.c (lto_destroy_simple_output_block): Never compress LTO_section_lto section. * lto-streamer-out.c (produce_asm): Do not set major_version and minor_version. (lto_output_toplevel_asms): Likewise. (produce_lto_section): New function. (lto_output): Call produce_lto_section. (lto_write_mode_table): Do not set major_version and minor_version. (produce_asm_for_decls): Likewise. * lto-streamer.h (enum lto_section_type): Add LTO_section_lto type. (struct lto_header): Remove. (struct lto_section): New struct. (struct lto_simple_header): Do not inherit from lto_header. (struct lto_file_decl_data): Add lto_section_header field. gcc/lto/ChangeLog: 2019-07-01 Martin Liska * lto-common.c: Read LTO section and verify header. --- gcc/lto-section-in.c | 9 +++-- gcc/lto-section-out.c | 2 -- gcc/lto-streamer-out.c | 40 +--- gcc/lto-streamer.h | 25 + gcc/lto/lto-common.c | 15 +++ 5 files changed, 64 insertions(+), 27 deletions(-) diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c index 4cfc0cad4be..4e7d1181f23 100644 --- a/gcc/lto-section-in.c +++ b/gcc/lto-section-in.c @@ -52,10 +52,10 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] = "icf", "offload_table", "mode_table", - "hsa" + "hsa", + "lto" }; - /* Hooks so that the ipa passes can call into the lto front end to get sections. */ @@ -146,7 +146,7 @@ lto_get_section_data (struct lto_file_decl_data *file_data, /* WPA->ltrans streams are not compressed with exception of function bodies and variable initializers that has been verbatim copied from earlier compilations. */ - if (!flag_ltrans || decompress) + if ((!flag_ltrans || decompress) && section_type != LTO_section_lto) { /* Create a mapping header containing the underlying data and length, and prepend this to the uncompression buffer. The uncompressed data @@ -167,9 +167,6 @@ lto_get_section_data (struct lto_file_decl_data *file_data, data = buffer.data + header_length; } - lto_check_version (((const lto_header *)data)->major_version, - ((const lto_header *)data)->minor_version, - file_data->file_name); return data; } diff --git a/gcc/lto-section-out.c b/gcc/lto-section-out.c index c91e58f0465..7ae102164ef 100644 --- a/gcc/lto-section-out.c +++ b/gcc/lto-section-out.c @@ -285,8 +285,6 @@ lto_destroy_simple_output_block (struct lto_simple_output_block *ob) /* Write the header which says how to decode the pieces of the t. */ memset (&header, 0, sizeof (struct lto_simple_header)); - header.major_version = LTO_major_version; - header.minor_version = LTO_minor_version; header.main_size = ob->main_stream->total_size; lto_write_data (&header, sizeof header); diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index dc68429303c..7dee770aa11 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -1974,10 +1974,6 @@ produce_asm (struct output_block *ob, tree fn) /* The entire header is stream computed here. */ memset (&header, 0, sizeof (struct lto_function_header)); - /* Write the header. */ - header.major_version = LTO_major_version; - header.minor_version = LTO_minor_version; - if (section_type == LTO_section_function_body) header.cfg_size = ob->cfg_stream->total_size; header.main_size = ob->main_stream->total_size; @@ -2270,10 +2266,6 @@ lto_output_toplevel_asms (void) /* The entire header stream is computed here. */ memset (&header, 0, sizeof (header)); - /* Write the header. */ - header.major_version = LTO_major_version; - header.minor_version = LTO_minor_version; - header.main_size = ob->main_stream->total_size; header.string_size = ob->string_stream->total_size; lto_write_data (&header, sizeof header); @@ -2390,6 +2382,29 @@ prune_offload_funcs (void) DECL_PRESERVE_P (fn_decl) = 1; } +/* Produce LTO section that contains global information + about LTO bytecode. */ + +static void +produce_lto_section () +{ + /* Stream LTO meta section. */ + output_block *ob = create_output_block (LTO_section_lto); + + char * section_name = lto_get_section_name (LTO_section_lto, NULL, NULL); + lto_begin_section (section_name, false); + free (section_name); + + lto_compression compression = ZLIB; + + bool slim_object = flag_generate_lto && !flag_fat_lto_objects; + lto_section s += { LTO_major_version, LTO_minor_version, slim_object, compression, 0 }; + lto_write_data (&s, sizeof s); + lto_end_se
Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default
On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt wrote: > > On 6/27/19 6:45 AM, Segher Boessenkool wrote: > > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote: > >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt > >> wrote: > >>> We've done some experimenting and realized that the subject option almost > >>> always provide improved performance for Power when the loop unroller is > >>> enabled. So this patch turns that flag on by default for us. > >> I guess it creates more freedom for combine (more single-uses) and register > >> allocation. I wonder in which cases this might pessimize things? I guess > >> the pre-RA scheduler might make RAs life harder with creating overlapping > >> life-ranges. > >> > >> I guess you didn't actually investigate the nature of the improvements you > >> saw? > > It breaks the length of dependency chains by a factor equal to the unroll > > factor. I do not know why this doesn't help a lot everywhere. It of > > course raises register pressure, maybe that is just it? > > Right, it's all about breaking dependencies to more efficiently exploit > the microarchitecture. By default, variable expansion in GCC is quite > conservative, creating only two reduction streams out of one, so it's > pretty rare for it to cause spill. This can be adjusted upwards with > --param max-variable-expansions-in-unroller=n. Our experiments show > that raising n to 4 starts to cause some minor degradations, which are > almost certainly due to pressure, so the default setting looks appropriate. But it's probably only an issue for targets which enable pre-RA scheduling by default? It might also increase RA compile-time (more allocnos). Richard. > >> Do we want to adjust the flags documentation, saying whether this is > >> enabled > >> by default depends on the target (or even list them)? > > Good idea, thanks. > > OK, I'll update the docs and make the change that Segher requested. > Thanks for the reviews! > > Bill > > > > > > Segher >
Re: [PATCH] Remove another bunch of dead assignment.
On 6/27/19 7:24 PM, Richard Sandiford wrote: > Martin Liška writes: >> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c >> index d50b811d863..1bd251ea8e2 100644 >> --- a/gcc/config/i386/i386-expand.c >> +++ b/gcc/config/i386/i386-expand.c >> @@ -19780,8 +19780,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2) >>emit_insn (gen_vec_widen_umult_even_v4si (t5, >> gen_lowpart (V4SImode, op1), >> gen_lowpart (V4SImode, op2))); >> - op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT); >> - >> + expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT); >> } >>else >> { > > This means that nothing uses the expanded value. It looks like the > call was meant to be force_expand_binop instead, so that the expansion > always goes to op0. You are right. The same function is called in the else branch of the condition. I'm sending updated version of the patch. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin > > Richard > >From aecb58d06336baeaa86942424c3314c6020dd754 Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Thu, 27 Jun 2019 13:39:24 +0200 Subject: [PATCH] Remove another bunch of dead assignment. gcc/ChangeLog: 2019-06-27 Martin Liska * lra-eliminations.c (eliminate_regs_in_insn): Remove dead assignemts. * reg-stack.c (check_asm_stack_operands): Likewise. * tree-ssa-structalias.c (create_function_info_for): Likewise. * tree-vect-generic.c (expand_vector_operations_1): Likewise. * config/i386/i386-expand.c (ix86_expand_sse2_mulvxdi3): Use force_expand_binop. gcc/c-family/ChangeLog: 2019-06-27 Martin Liska * c-common.c (try_to_locate_new_include_insertion_point): Remove dead assignemts. gcc/cp/ChangeLog: 2019-06-27 Martin Liska * call.c (build_new_op_1): Remove dead assignemts. * typeck.c (cp_build_binary_op): Likewise. gcc/fortran/ChangeLog: 2019-06-27 Martin Liska * check.c (gfc_check_c_funloc): Remove dead assignemts. * decl.c (variable_decl): Likewise. * resolve.c (resolve_typebound_function): Likewise. * simplify.c (gfc_simplify_matmul): Likewise. (gfc_simplify_scan): Likewise. * trans-array.c (gfc_could_be_alias): Likewise. * trans-common.c (add_equivalences): Likewise. * trans-expr.c (trans_class_vptr_len_assignment): Likewise. (gfc_trans_array_constructor_copy): Likewise. (gfc_trans_assignment_1): Likewise. * trans-intrinsic.c (conv_intrinsic_atomic_op): Likewise. * trans-openmp.c (gfc_omp_finish_clause): Likewise. * trans-types.c (gfc_get_array_descriptor_base): Likewise. * trans.c (gfc_build_final_call): Likewise. libcpp/ChangeLog: 2019-06-27 Martin Liska * line-map.c (linemap_get_expansion_filename): Remove dead assignemts. * mkdeps.c (make_write): Likewise. --- gcc/c-family/c-common.c | 4 ++-- gcc/config/i386/i386-expand.c | 3 +-- gcc/cp/call.c | 2 +- gcc/cp/typeck.c | 1 - gcc/fortran/check.c | 18 +++--- gcc/fortran/decl.c| 1 - gcc/fortran/resolve.c | 1 - gcc/fortran/simplify.c| 27 --- gcc/fortran/trans-array.c | 2 -- gcc/fortran/trans-common.c| 6 ++ gcc/fortran/trans-expr.c | 6 -- gcc/fortran/trans-intrinsic.c | 1 - gcc/fortran/trans-openmp.c| 1 - gcc/fortran/trans-types.c | 10 +- gcc/fortran/trans.c | 3 --- gcc/lra-eliminations.c| 2 +- gcc/reg-stack.c | 1 - gcc/tree-ssa-structalias.c| 1 - gcc/tree-vect-generic.c | 2 -- libcpp/line-map.c | 3 +-- libcpp/mkdeps.c | 2 +- 21 files changed, 33 insertions(+), 64 deletions(-) diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index da4aadbc590..cb92710f2bc 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -8601,8 +8601,8 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc) /* Get ordinary map containing LOC (or its expansion). */ const line_map_ordinary *ord_map_for_loc = NULL; - loc = linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT, - &ord_map_for_loc); + linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT, + &ord_map_for_loc); gcc_assert (ord_map_for_loc); for (unsigned int i = 0; i < LINEMAPS_ORDINARY_USED (line_table); i++) diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 85d74a28636..5d3b74a159f 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -19779,8 +19779,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2) emit_insn (gen_vec_widen_umult_even_v4si (t5, gen_lowpart (V4SImode, op1), gen_lowpart (V4SImode, op2))); - op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT); - + forc
[range-ops] patch 05/04: bonus round!
This is completely unrelated to range-ops itself, but may yield better results in value_range intersections. It's just something I found while working on VRP, and have been dragging around on our branch. If we know the intersection of two ranges is the empty set, there is no need to conservatively add anything to the result. Tested on x86-64 Linux with --enable-languages=all. Aldy commit 4f9aa7bd1066267eee92f622ff29d78534158e20 Author: Aldy Hernandez Date: Fri Jun 28 11:34:19 2019 +0200 Do not try to further refine a VR_UNDEFINED result when intersecting value_ranges. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 01fb97cedb2..b0d78ee6871 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2019-07-01 Aldy Hernandez + + * tree-vrp.c (intersect_ranges): If we know the intersection is + empty, there is no need to conservatively add anything else to + the set. + 2019-06-26 Jeff Law PR tree-optimization/90883 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index dc7f825efc8..594ee9adc17 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -5977,6 +5977,11 @@ intersect_ranges (enum value_range_kind *vr0type, gcc_unreachable (); } + /* If we know the intersection is empty, there's no need to + conservatively add anything else to the set. */ + if (*vr0type == VR_UNDEFINED) +return; + /* As a fallback simply use { *VRTYPE, *VR0MIN, *VR0MAX } as result for the intersection. That's always a conservative correct estimate unless VR1 is a constant singleton range
[range-ops] patch 04/04: range-ops proper (PLACEHOLDER)
We still need to do some cosmetic changes to range-ops itself, but I realize it may be easier to review the first 3 patches, if one has an idea of how it all fits together. As I said before, the first 3 patches can go in as is; this is just for the curious or impatient. As such, I am not submitting this patch for contribution yet. I'll contribute later this week, after I cleanup some #if 0 code and reminders we had written throughout the code. For example, there are some optimizations that I'll hold off on contributing, because they yield different (read "better") results than mainline. Since we want to make sure there are absolutely no differences from extract_range_from* for now, I believe it is prudent to delay these optimizations until they can be contributed separately when the dust settles. Be that as it may, we can still discuss the general layout now. The main idea is that all calls to tree-vrp's extract_range_from_ binary/unary_expr in vr-values.s are replaced with range_fold_*expr. These new functions will then call both range-ops and the old extract_range_from_binary/unary, and assert that the results are the same. The code asserting and comparing the results for range-ops and VRP is all in assert_compare_value_ranges. I have taken care of starting from scratch (no exceptions) and adding only what is strictly necessary, where I have proven why two results mismatch but yet range-ops is correct. For example, EXACT, MIN, MAX_EXPR, etc on VRP give up a lot sooner than range-ops. One annoying set of differences is the order in which extract_range* extracts anti-ranges into pairs with ranges_from_anti_range, versus the way we pick at sub-ranges. The ordering matters, and VRP will frequently give up (into VARYING) because the intermediate representation may be unrepresentable. The range-ops iteration does slightly better, and there are sometimes differences when doing operations on pairs of anti-ranges. If VRP yields a VARYING result, but range-ops does better, we ignore the difference. There is a huge comment section where I explain this in depth, with examples. Other than the few accounted for exceptions in assert_compare_value_ranges, there should be no differences, and we abort() very loudly if there is. Now... The VRP entry points into the ranger are: range_ops_fold_unary_expr range_ops_fold_binary_expr These functions are organized into 3 distinctive sections: a) Perform any special behavior users of extract_range* expect. This is generally the special code at the top of extract_range_{binary,unary}*. b) Symbolic handling. Pleasantly we have found out that the only places where we actually care about symbolics in VRP are: 1. PLUS_EXPR, MINUS_EXPR 2. POINTER_PLUS_EXPR 3. NEGATE_EXPR 4. BIT_NOT_EXPR 5. CONVERT_EXPR_CODE_P Interestingly, NEGATE and BIT_NOT care about symbolics because they may degrade into PLUS/MINUS. Since POINTER_PLUS_EXPR is just a special purpose PLUS, the argument can be made that symbolics only matter for PLUS/MINUS and CONVERT_EXPR. c) Calling into range-ops fold. A few random notes on the rest: 1. range-op.cc is the main engine. 2. range.cc is our irange handling code. As discussed before, irange and value_range_base share the same API, and for trunk we propose a simple drop-in replacement of: #ifndef USE_IRANGE class value_range_storage; typedef value_range_base irange; typedef value_range_storage irange_storage; #endif I am attaching everything as it fits in our branch, but I believe what will ultimately happen is that we keep the #if USE_IRANGE bits in our branch, and everything else in trunk. NOTE: I would like to keep our nomenclature for irange even in trunk, as it makes it easier to keep track of things in branch vs. trunk. Besides, there should be no references to irange in tree-vrp, only in range-op.c and range.h/cc, and eventually gori/ranger/etc. This will allow us to experiment with either irange or value_range seamlessly. 3. I have provided a compile time flag for use during stage1 debugging. The flag -franges= chooses which range folding mechanism to use (VRP, range-ops, or both while checking that they agree). The default is -franges=checking. Again, this is entire patchset is a place holder, so everyone can see how it all fits together. I'll be cleaning this up, and only posting the #if !USE_RANGE bits. Be that as it may, this patch has been tested on x86-64 Linux with --enable-languages=all :). Aldy diff --git a/gcc/Makefile.in b/gcc/Makefile.in index d9e0885b96b..a3f3a201787 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1450,6 +1450,8 @@ OBJS = \ print-tree.o \ profile.o
[PATCH 2/2] gdbhooks.py: extend vec support in pretty printers
This change is threefold: - enable pretty printing of vec<>, not just vec<>* - generalize 'vec<(\S+), (\S+), (\S+)>' regex, which is limiting - extend to work for vl_ptr layout (only vl_embed was supported) The motivating example for all three is a vector of vectors in tree-ssa-uninit.c: typedef vec pred_chain; typedef vec pred_chain_union; (This predictably expands to vec, va_heap, vl_ptr> in gdb, if we use strip_typedefs() on it -- otherwise it's just pred_chain_union. The first patch of the series takes care of that.) This example features "direct" vecs themselves rather than pointers to vecs, which is not a rare case; in fact, a quick grep showed that the number of declarations of direct vecs is about the same as that of pointers to vec. The GDB's Python API seems to do dereferencing automatically (i.e. it detects whether to use '->', or '.' for field access), allowing to use the same code to access fields from a struct directly as well as through a pointer. This makes it possible to reuse VecPrinter (meant for vec<> *) for plain vec<>, but perhaps it would be beneficial to handle all pointers in a generic way e.g. print the address and then print the dereferenced object, letting the appropriate pretty printer to pick it up and do its thing, if one is provided for the type. The output I have for direct vecs now: (gdb) p *bb->succs $94 = @0x776569b0 = { 5)>, 3)>} Compare that to (gdb) p bb->succs $70 = 0x776569b0 = { 5)>, 3)>} Hope the choice of the '@' sign to represent the address of an object taken by the pretty printer is not confusing? (GDB itself uses this notation for references, like so: (edge_def *&) @0x77656c38.) OK? gcc/ * gdbhooks.py: Adjust toplevel comment. Add a new example. (VecPrinter.__init__): Distinguish between pointer and direct types. (VecPrinter.to_string): Print pointer value differently based on that distinction. (VecPrinter.children): Adjust accordingly, and account for vl_ptr layout, add a comment about that. (build_pretty_printer): Generalize regex for the regex based subprinter for vec<>*. Add a new regex subprinter for vec<>. --- gcc/gdbhooks.py | 47 +++ 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index b036704b86a..a7e665cadb9 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -128,12 +128,18 @@ and here's a length 1 vec: (gdb) p bb->succs $19 = 0x70428bb8 = { EXIT)>} -You cannot yet use array notation [] to access the elements within the -vector: attempting to do so instead gives you the vec itself (for vec[0]), -or a (probably) invalid cast to vec<> for the memory after the vec (for -vec[1] onwards). +Direct instances of vec<> are also supported (their addresses are +prefixed with the '@' sign). -Instead (for now) you must access m_vecdata: +In case you have a vec<> pointer, to use array notation [] to access the +elements within the vector you must first dereference the pointer, just +as you do in C: + (gdb) p (*bb->succs)[0] + $50 = (edge_def *&) @0x77656c38: 10)> + (gdb) p bb->succs[0][0] + $51 = (edge_def *&) @0x77656c38: 10)> + +Another option is to access m_vecdata: (gdb) p bb->preds->m_vecdata[0] $20 = 5)> (gdb) p bb->preds->m_vecdata[1] @@ -441,6 +447,10 @@ class VecPrinter: #-ex "up" -ex "p bb->preds" def __init__(self, gdbval): self.gdbval = gdbval +if self.gdbval.type.code == gdb.TYPE_CODE_PTR: +self.ptr = intptr(self.gdbval) +else: +self.ptr = None def display_hint (self): return 'array' @@ -448,14 +458,25 @@ class VecPrinter: def to_string (self): # A trivial implementation; prettyprinting the contents is done # by gdb calling the "children" method below. -return '0x%x' % intptr(self.gdbval) +if self.ptr: +return '0x%x' % self.ptr +else: +return '@0x%x' % intptr(self.gdbval.address) def children (self): -if intptr(self.gdbval) == 0: +if self.ptr == 0: return -m_vecpfx = self.gdbval['m_vecpfx'] +m_vec = self.gdbval +# There are 2 kinds of vec layouts: vl_embed, and vl_ptr. The latter +# is implemented with the former stored in the m_vec field. Sadly, +# template_argument(2) won't work: `gdb.error: No type named vl_embed`. +try: +m_vec = m_vec['m_vec'] +except: +pass +m_vecpfx = m_vec['m_vecpfx'] m_num = m_vecpfx['m_num'] -m_vecdata = self.gdbval['m_vecdata'] +m_vecdata = m_vec['m_vecdata'] for i in range(m_num): yield ('[%d]' % i, m_vecdata[i]) @@ -572,9 +593,11 @@ def build_pretty_printer(): pp.add_printer_for_types(['rtx_def *'], 'rtx_def', RtxPrinter) pp.add_printe
[PATCH 1/2] gdbhooks.py: use strip_typedefs to simplify matching type names
Hi! GDB's Python API provides strip_typedefs method that can be instrumental for writing DRY code. Using it at least partially obviates the need for the add_printer_for_types method we have in gdbhooks.py (it takes a list of typenames to match and is usually used to deal with typedefs). I think, the code can be simplified further (there's still ['tree_node *', 'const tree_node *'], which is a little awkward) if we deal with pointer types in a uniform fashion (I'll touch on this in the description of the second patch). But that can be improved separately. The gimple_statement_cond, etc. part has been dysfunctional for a while (namely since gimple-classes-v2-option-3 branch was merged). I updated it to use the new classes' names. That seems to work (though it doesn't output much info anyway: pretty vs. (gphi *) 0x778c0200 in the raw version). I changed the name passed to pp.add_printer_for_types() for scalar_mode and friends -- so they all share the same name now -- but I don't believe the name is used in any context where it would matter. This is just a clean up of gdbhooks.py. OK to commit? gcc/ * gdbhooks.py (GdbPrettyPrinters.__call__): canonicalize any variants of a type name using strip_typdefs. (build_pretty_printer): Use canonical names for types and otherwise simplify the code. Clean up gimple_sth, gimple_statement_sth cruft. --- gcc/gdbhooks.py | 48 +++- 1 file changed, 19 insertions(+), 29 deletions(-) diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py index 26a09749aa3..b036704b86a 100644 --- a/gcc/gdbhooks.py +++ b/gcc/gdbhooks.py @@ -528,7 +528,7 @@ class GdbPrettyPrinters(gdb.printing.PrettyPrinter): self.subprinters.append(GdbSubprinterRegex(regex, name, class_)) def __call__(self, gdbval): -type_ = gdbval.type.unqualified() +type_ = gdbval.type.unqualified().strip_typedefs() str_type = str(type_) for printer in self.subprinters: if printer.enabled and printer.handles_type(str_type): @@ -540,7 +540,7 @@ class GdbPrettyPrinters(gdb.printing.PrettyPrinter): def build_pretty_printer(): pp = GdbPrettyPrinters('gcc') -pp.add_printer_for_types(['tree', 'const_tree'], +pp.add_printer_for_types(['tree_node *', 'const tree_node *'], 'tree', TreePrinter) pp.add_printer_for_types(['cgraph_node *', 'varpool_node *', 'symtab_node *'], 'symtab_node', SymtabNodePrinter) @@ -548,32 +548,25 @@ def build_pretty_printer(): 'cgraph_edge', CgraphEdgePrinter) pp.add_printer_for_types(['ipa_ref *'], 'ipa_ref', IpaReferencePrinter) -pp.add_printer_for_types(['dw_die_ref'], +pp.add_printer_for_types(['die_struct *'], 'dw_die_ref', DWDieRefPrinter) pp.add_printer_for_types(['gimple', 'gimple *', # Keep this in the same order as gimple.def: - 'gimple_cond', 'const_gimple_cond', - 'gimple_statement_cond *', - 'gimple_debug', 'const_gimple_debug', - 'gimple_statement_debug *', - 'gimple_label', 'const_gimple_label', - 'gimple_statement_label *', - 'gimple_switch', 'const_gimple_switch', - 'gimple_statement_switch *', - 'gimple_assign', 'const_gimple_assign', - 'gimple_statement_assign *', - 'gimple_bind', 'const_gimple_bind', - 'gimple_statement_bind *', - 'gimple_phi', 'const_gimple_phi', - 'gimple_statement_phi *'], + 'gcond', 'gcond *', + 'gdebug', 'gdebug *', + 'glabel', 'glabel *', + 'gswitch', 'gswitch *', + 'gassign', 'gassign *', + 'gbind', 'gbind *', + 'gphi', 'gphi *'], 'gimple', GimplePrinter) -pp.add_printer_for_types(['basic_block', 'basic_block_def *'], +pp.add_printer_for_types(['basic_block_def *'], 'basic_block', BasicBlockPrinter) -pp.add_printer_for_types(['edge', 'edge_def *'], +pp.add_printer_for_types(['edge_def *'], 'edge', CfgEdgePrinter) pp.add_printer_for_types(['rtx_def *'], 'rtx_def', RtxPrinter) @@ -585,18 +578,15 @@ def build_pretty_printer(): pp.add_printer_for_regex(r'opt_mode<(\S+)>',
Re: [PATCH] Automatics in equivalence statements
On 25/06/2019 14:17, Mark Eggleston wrote: On 25/06/2019 00:17, Jeff Law wrote: On 6/24/19 2:19 AM, Bernhard Reutner-Fischer wrote: On Fri, 21 Jun 2019 07:10:11 -0700 Steve Kargl wrote: On Fri, Jun 21, 2019 at 02:31:51PM +0100, Mark Eggleston wrote: Currently variables with the AUTOMATIC attribute can not appear in an EQUIVALENCE statement. However its counterpart, STATIC, can be used in an EQUIVALENCE statement. Where there is a clear conflict in the attributes of variables in an EQUIVALENCE statement an error message will be issued as is currently the case. If there is no conflict e.g. a variable with a AUTOMATIC attribute and a variable(s) without attributes all variables in the EQUIVALENCE will become AUTOMATIC. Note: most of this patch was written by Jeff Law Please review. ChangeLogs: gcc/fortran Jeff Law Mark Eggleston * gfortran.h: Add check_conflict declaration. This is wrong. By convention a routine that is not static has the gfc_ prefix. Updated the code to use gfc_check_conflict instead. Furthermore doesn't this export indicate that you're committing a layering violation somehow? Possibly. I'm the original author, but my experience in our fortran front-end is minimal. I fully expected this patch to need some tweaking. We certainly don't want to recreate all the checking that's done in check_conflict. We just need to defer it to a later point -- find_equivalence seemed like a good point since we've got the full equivalence list handy and can accumulate the attributes across the entire list, then check for conflicts. If there's a concrete place where you think we should be doing this, I'm all ears. Any suggestions will be appreciate. * symbol.c (check_conflict): Remove automatic in equivalence conflict check. * symbol.c (save_symbol): Add check for in equivalence to stop the the save attribute being added. * trans-common.c (build_equiv_decl): Add is_auto parameter and add !is_auto to condition where TREE_STATIC (decl) is set. * trans-common.c (build_equiv_decl): Add local variable is_auto, set it true if an atomatic attribute is encountered in the variable atomatic? I read atomic but you mean automatic. Yes. list. Call build_equiv_decl with is_auto as an additional parameter. flag_dec_format_defaults is enabled. * trans-common.c (accumulate_equivalence_attributes) : New subroutine. * trans-common.c (find_equivalence) : New local variable dummy_symbol, accumulated equivalence attributes from each symbol then check for conflicts. I'm just curious why you don't gfc_copy_attr for the most part of accumulate_equivalence_attributes? thanks, Simply didn't know about it. It could probably significantly simplify the accumulation of attributes step. Using gfc_copy_attr causes a great many "Duplicate DIMENSION attribute specified at (1)" errors. This is because there is a great deal of checking done instead of simply keeping track of the attributes used which is all that is required for determining whether there is a conflict in the equivalence statement. Also, the final section of accumulate_equivalence_attributes involving SAVE, INTENT and ACCESS look suspect to me. I'll check and update the patch if necessary. No need to check intent as there is already a conflict with DUMMY and INTENT can only be present for dummy variables. Please find attached an updated patch. Change logs: gcc/fortran Jeff Law Mark Eggleston * gfortran.h: Add gfc_check_conflict declaration. * symbol.c (check_conflict): Rename cfg_check_conflict and remove static. * symbol.c (cfg_check_conflict): Remove automatic in equivalence conflict check. * symbol.c (save_symbol): Add check for in equivalence to stop the the save attribute being added. * trans-common.c (build_equiv_decl): Add is_auto parameter and add !is_auto to condition where TREE_STATIC (decl) is set. * trans-common.c (build_equiv_decl): Add local variable is_auto, set it true if an atomatic attribute is encountered in the variable list. Call build_equiv_decl with is_auto as an additional parameter. flag_dec_format_defaults is enabled. * trans-common.c (accumulate_equivalence_attributes) : New subroutine. * trans-common.c (find_equivalence) : New local variable dummy_symbol, accumulated equivalence attributes from each symbol then check for conflicts. gcc/testsuite Mark Eggleston * gfortran.dg/auto_in_equiv_1.f90: New test. * gfortran.dg/auto_in_equiv_2.f90: New test. * gfortran.dg/auto_in_equiv_3.f90: New test. If the updated patch is acceptable, please can someone with the privileges commit the patch. Mark Jeff -- https://www.codethink.co.uk/privacy.html >From 321c7c84f9578e99ac0a1fa5f3ed1fd78b328d1f Mon Sep 17 00:00:00 2001 From: Mark Eggleston Date: Tue, 11 Sep 2018 12:50:11 +0100 S
[PATCH] Fix use-after-scope in host-mingw32.c (PR target/88056).
Hi. The patch is about use-after-scope. However, I can't verify it survives bootstrap on the affected target. Ready for the trunk? Thanks, Martin gcc/ChangeLog: 2019-07-01 Martin Liska PR target/88056 * config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Define local_object_name in outer scope in order to handle use-after-scope issue. --- gcc/config/i386/host-mingw32.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/i386/host-mingw32.c b/gcc/config/i386/host-mingw32.c index f2b56d71c5b..3254d028313 100644 --- a/gcc/config/i386/host-mingw32.c +++ b/gcc/config/i386/host-mingw32.c @@ -157,10 +157,10 @@ mingw32_gt_pch_use_address (void *addr, size_t size, int fd, /* Determine the version of Windows we are running on and use a uniquely-named local object if running > 4. */ GetVersionEx (&version_info); + + char local_object_name[sizeof (OBJECT_NAME_FMT) + sizeof (DWORD) * 2]; if (version_info.dwMajorVersion > 4) { - char local_object_name [sizeof (OBJECT_NAME_FMT) - + sizeof (DWORD) * 2]; snprintf (local_object_name, sizeof (local_object_name), OBJECT_NAME_FMT "%lx", GetCurrentProcessId()); object_name = local_object_name;
Fix verifier ICE on CLOBBER of COMPONENT_REF
Hi, the testcase makes inliner to substitute RESULT_DECL by COMPONENT_REF inside CLOBBER statement. This is not allowed and leads to ICE in verifier (while I think we should fix code to support this). This patch simply makes inliner to watch for this case and do not remap the statement at all. The testcase works for 32bit only which is bit unfortunate. For 64bit build NRV does not happen, i did not look into why. OK? Honza * tree-inline.c (remap_gimple_stmt): Do not subtitute handled components to clobber of return value. * g++.dg/lto/pr90990_0.C: New testcase. Index: tree-inline.c === --- tree-inline.c (revision 272846) +++ tree-inline.c (working copy) @@ -1757,6 +1757,18 @@ remap_gimple_stmt (gimple *stmt, copy_bo return NULL; } } + + /* We do not allow CLOBBERs of handled components. In case +returned value is stored via such handled component, remove +the clobber so stmt verifier is happy. */ + if (gimple_clobber_p (stmt) + && TREE_CODE (gimple_assign_lhs (stmt)) == RESULT_DECL) + { + tree remapped = remap_decl (gimple_assign_lhs (stmt), id); + if (!DECL_P (remapped) + && TREE_CODE (remapped) != MEM_REF) + return NULL; + } if (gimple_debug_bind_p (stmt)) { Index: testsuite/g++.dg/lto/pr90990_0.C === --- testsuite/g++.dg/lto/pr90990_0.C(nonexistent) +++ testsuite/g++.dg/lto/pr90990_0.C(working copy) @@ -0,0 +1,31 @@ +// { dg-lto-do link } +/* { dg-extra-ld-options { -r -nostdlib } } */ +class A { +public: + float m_floats; + A() {} +}; +class B { +public: + A operator[](int); +}; +class C { + B m_basis; + +public: + A operator()(A) { +m_basis[1] = m_basis[2]; +A a; +return a; + } +}; +class D { +public: + C m_fn1(); +}; +class F { + A m_pivotInB; + F(D &, const A &); +}; +F::F(D &p1, const A &p2) : m_pivotInB(p1.m_fn1()(p2)) {} +
[range-ops] patch 03/04: abstract out a few things from extract_range_from*
I hate duplicating code, and the symbolic handling of pointer_plus_expr, plus_minus_expr, and the code dealing with overflows, is ripe for sharing with the ranger. I've abstracted these into their own functions. No changes in functionality is expected. You will notice that I moved value_range_kind into coretypes.h, as I'd like to use it for value_range, wide-int-range.h, and the ranger header files. For value_range, all options are valid. For wide-int-range.h (adjust_range_for_overflow), everything but VR_UNDEFINED is valid. For the ranger, all values are valid but only for the constructors as our internal implementation has no need for this field. The ranger accesses things with the higher-level num_pairs(), lower/upper_bound(), etc. Tested on x86-64 Linux with --enable-languages=all. Aldy commit e58c0f8b8a27fa91fe22f1f12d19f3d37cc729ae Author: Aldy Hernandez Date: Fri Jun 28 10:41:05 2019 +0200 Move set_value_range_with_overflow into wide-int-range.cc as adjust_range_for_overflow. Move the POINTER_PLUS_EXPR symbolics code into handle_symbolics_in_pointer_plus_expr. Move the PLUS/MINUS_EXPR handling code into extract_range_from_plus_expr. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 866200d9594..40a51fc67dc 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,18 @@ +2019-07-01 Aldy Hernandez + + * tree-vrp.h (value_range_kind): Move from here... + * coretypes.h (value_range_kind): ...to here. + * tree-vrp.c (extract_range_from_binary_expr): Abstract pointer + plus handling of symbolics to + handle_symbolics_in_pointer_plus_expr(). + Abstract plus/minus handling code to + extract_range_from_plus_expr. + Move set_value_range_with_overflow to wide-int-range.cc. + (handle_symbolics_in_pointer_plus_expr): New. + (extract_range_from_plus_expr): New. + * wide-int-range.cc (adjust_range_for_overflow): New. + * wide-int-range.h (adjust_range_for_overflow): New. + 2019-07-01 Aldy Hernandez * tree-vrp.c (value_range_base::set_and_canonicalize): Rename to diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 2f6b8599d7c..4bb1282b1b9 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -202,6 +202,24 @@ enum profile_update { PROFILE_UPDATE_PREFER_ATOMIC }; +/* Types of ranges. + + This is still prefixed with VR_*, even though it is more general + purpose, to avoid having to replace everything across the compiler. + Perhaps we should change it later. */ +enum value_range_kind { + /* Empty range. */ + VR_UNDEFINED, + /* Range spans the entire domain. */ + VR_VARYING, + /* Range is [MIN, MAX]. */ + VR_RANGE, + /* Range is ~[MIN, MAX]. */ + VR_ANTI_RANGE, + /* Range is a nice guy. */ + VR_LAST +}; + /* Types of unwind/exception handling info that can be generated. */ enum unwind_info_type diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index f78517b5b3b..efd51735022 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -1470,112 +1470,196 @@ combine_bound (enum tree_code code, wide_int &wi, wi::overflow_type &ovf, wi = wi::shwi (0, prec); } -/* Given a range in [WMIN, WMAX], adjust it for possible overflow and - put the result in VR. +/* Fold two value range's of a POINTER_PLUS_EXPR into VR. Return TRUE + if successful. */ - TYPE is the type of the range. +static bool +handle_symbolics_in_pointer_plus_expr (value_range_base *vr, + enum tree_code code, + tree expr_type, + const value_range_base *vr0, + const value_range_base *vr1) +{ + if (POINTER_TYPE_P (expr_type) && code == POINTER_PLUS_EXPR) +{ + /* For pointer types, we are really only interested in asserting + whether the expression evaluates to non-NULL. + With -fno-delete-null-pointer-checks we need to be more + conservative. As some object might reside at address 0, + then some offset could be added to it and the same offset + subtracted again and the result would be NULL. + E.g. + static int a[12]; where &a[0] is NULL and + ptr = &a[6]; + ptr -= 6; + ptr will be NULL here, even when there is POINTER_PLUS_EXPR + where the first range doesn't include zero and the second one + doesn't either. As the second operand is sizetype (unsigned), + consider all ranges where the MSB could be set as possible + subtractions where the result might be NULL. */ + if ((!range_includes_zero_p (vr0) + || !range_includes_zero_p (vr1)) + && !TYPE_OVERFLOW_WRAPS (expr_type) + && (flag_delete_null_pointer_checks + || (range_int_cst_p (vr1) + && !tree_int_cst_sign_bit (vr1->max () + vr->set_nonzero (expr_type); + else if (vr0->zero_p () && vr1->zero_p ()) + vr->set_zero (expr_type); + else + vr->set_varying (expr_type); + return true; +} + return false; +} - MIN_OVF and MAX_OVF indicate what type of overflow, if any, - occurred while originally calculating WMIN or WMAX. -1 indicates - underflow. +1 indicates overflow. 0 indicates neither. */ +/* Ex
Re: [PATCH] Fix libstdc++ install-pdf support.
On 01/07/19 01:21 -0700, Jim Wilson wrote: Try to run "make install-pdf" on a system with dblatex and pdflatex installed but not doxygen gives an error. run_doxygen error: Could not find Doxygen 1.7.0 in path. Looking at the build log, I see that this is also using xsltproc and xmllint. Installing doxygen and running again, I get lots of ignored errors for a missing dot program. Looking at the docs I see the collaboration diagrams are missing. Installing dot and rebuilding and now I have the collaboration diagrams. I don't see any evidence that the pdf docs are using the stylesheets. Otherwise, they need everything else that the xml docs need. Regenerating configure I got an unexpected change, but that is an issue with a patch a few days ago that added a new thread file, and regenerated the libgcc configure to use it, but failed to notice that the libstdc++ configure should have been regenerated too. Tested with x86_64-linux builds with various packages installed or removed, and looking at the final docs to make sure they look right. OK? OK, thanks.
Fix g++.dg/lto/alias-1 and 2 for non-plugin builds
Hi, alias-1 and alias-2 fails with -fno-use-inliner-plugin because the inlinig it relies on does not happen. This is because w/o resolution info we can not optimize away the offline copy and we know that main is executed once and optimize for size. Bootsrapped/regtested x86_64-linux, comitted. Honza Index: ChangeLog === --- ChangeLog (revision 272846) +++ ChangeLog (working copy) @@ -1,3 +1,11 @@ +2019-07-01 Jan Hubicka + + PR lto/91028 + PR lto/90720 + * g++.dg/lto/alias-1_0.C: Add loop to make inlining happen with + -fno-use-linker-plugin + * g++.dg/lto/alias-2_0.C: Likewise. + 2019-07-01 Dominique d'Humieres * g++.dg/cpp0x/gen-attrs-67.C: Add error for darwin. Index: g++.dg/lto/alias-1_0.C === --- g++.dg/lto/alias-1_0.C (revision 272846) +++ g++.dg/lto/alias-1_0.C (working copy) @@ -17,6 +17,7 @@ __attribute__ ((used)) struct b **bptr = (struct b**)&aptr; extern void init (); extern void inline_me_late (int); +int n=1; int @@ -24,7 +25,8 @@ main (int argc, char **argv) { init (); aptr = 0; - inline_me_late (argc); + for (int i=0; i
[range-ops] patch 02/04: enforce canonicalization in value_range
As discussed prior. This enforces canonicalization at creation time, which makes things a lot more consistent throughout. Since now [MIN, MAX] will be canonicalized into VR_VARYING, we can no longer depend on normalizing VARYING's into [MIN, MAX] for more general handling. I've adjusted the few places where we were doing this so that they work correctly. As a bonus, now that we're canonicalized, I've tweaked singleton to work with anti-ranges. Note: There is a small note in ranges_from_anti_range, which I will start making heavier use of in subsequent patches: + /* ?? This function is called multiple times from num_pairs, + lower_bound, and upper_bound. We should probably memoize this, or + rewrite the callers in such a way that we're not re-calculating + this constantly. */ I don't think this is needed now, but just a note of things to come. It may not be an issue, but just something to keep in mind. Tested on x86-64 Linux with --enable-languages=all. Aldy commit d220a3eeb77615b39260e532e34815a3810e8348 Author: Aldy Hernandez Date: Fri Jun 28 11:15:03 2019 +0200 Enforce canonicalization in value_range. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a5247735694..866200d9594 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,20 @@ +2019-07-01 Aldy Hernandez + + * tree-vrp.c (value_range_base::set_and_canonicalize): Rename to + set() and add normalization of [MIN, MAX] into varying. + (value_range_base::singleton_p): Make work with anti ranges. + (ranges_from_anti_range): Handle pointers. + (extract_range_into_wide_ints): Call normalize_symbolics. + (extract_range_from_binary_expr): Do not build a [MIN,MAX] range + because it will be canonicalized into varying. + Call set() instead of set_and_canonicalize(). + (extract_range_from_unary_expr): Call set() instead of + set_and_canonicalize(). + (intersect_helper): Do not call set_and_canonicalize. + * tree-vrp.h (value_range_base): Remove set_and_canonicalize. + (value_range): Same. + * vr-values.c (extract_range_from_var_from_comparison_expr): Same. + 2019-07-01 Aldy Hernandez * gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 97046c22ed1..f78517b5b3b 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -69,20 +69,15 @@ along with GCC; see the file COPYING3. If not see #include "builtins.h" #include "wide-int-range.h" +static bool +ranges_from_anti_range (const value_range_base *ar, + value_range_base *vr0, value_range_base *vr1, + bool handle_pointers = false); + /* Set of SSA names found live during the RPO traversal of the function for still active basic-blocks. */ static sbitmap *live; -void -value_range_base::set (enum value_range_kind kind, tree min, tree max) -{ - m_kind = kind; - m_min = min; - m_max = max; - if (flag_checking) -check (); -} - void value_range::set_equiv (bitmap equiv) { @@ -363,6 +358,24 @@ value_range::equiv_add (const_tree var, bool value_range_base::singleton_p (tree *result) const { + if (m_kind == VR_ANTI_RANGE) +{ + if (nonzero_p ()) + { + if (TYPE_PRECISION (type ()) == 1) + { + if (result) + *result = m_max; + return true; + } + return false; + } + + value_range_base vr0, vr1; + return (ranges_from_anti_range (this, &vr0, &vr1, true) + && vr1.undefined_p () + && vr0.singleton_p (result)); +} if (m_kind == VR_RANGE && vrp_operand_equal_p (min (), max ()) && is_gimple_min_invariant (min ())) @@ -690,8 +703,7 @@ intersect_range_with_nonzero_bits (enum value_range_kind vr_type, extract ranges from var + CST op limit. */ void -value_range_base::set_and_canonicalize (enum value_range_kind kind, - tree min, tree max) +value_range_base::set (enum value_range_kind kind, tree min, tree max) { if (kind == VR_UNDEFINED) { @@ -711,7 +723,9 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind, if (TREE_CODE (min) != INTEGER_CST || TREE_CODE (max) != INTEGER_CST) { - set (kind, min, max); + m_kind = kind; + m_min = min; + m_max = max; return; } @@ -747,12 +761,13 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind, kind = kind == VR_RANGE ? VR_ANTI_RANGE : VR_RANGE; } + tree type = TREE_TYPE (min); + /* Anti-ranges that can be represented as ranges should be so. */ if (kind == VR_ANTI_RANGE) { /* For -fstrict-enums we may receive out-of-range ranges so consider values < -INF and values > INF as -INF/INF as well. */ - tree type = TREE_TYPE (min); bool is_min = (INTEGRAL_TYPE_P (type) && tree_int_cst_compare (min, TYPE_MIN_VALUE (type)) <= 0); bool is_max = (INTEGRAL_TYPE_P (type) @@ -795,22 +810,37 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind, } } + /* Normalize [MIN, MAX] into VARYING and ~[
Re: [PATCH 5/5] Use ira_setup_alts for conflict detection
Vladimir Makarov writes: > On 2019-06-21 9:43 a.m., Richard Sandiford wrote: >> make_early_clobber_and_input_conflicts records allocno conflicts >> between inputs and earlyclobber outputs. It (rightly) avoids >> doing this for inputs that are explicitly allowed to match the >> output due to matching constraints. >> >> The problem is that whether this matching is allowed varies >> between alternatives. At the moment the code avoids adding >> a clobber if *any* enabled alternative allows the match, >> even if some other operand makes that alternative impossible. >> >> The specific instance of this for SVE is that some alternatives >> allow matched earlyclobbers when a third operand X is constant zero. >> We should avoid adding conflicts when X really is constant zero, >> but should ignore the match if X is nonzero or nonconstant. >> >> ira_setup_alts can already filter these alternatives out for us, >> so all we need to do is use it in process_bb_node_lives. The >> preferred_alternatives variable is only used for this earlyclobber >> detection, so no other check should be affected. >> >> With the previous patch to check the reject weight in ira_setup_alts, >> this has the effect of ignoring expensive alternatives if we have >> other valid alternatives with zero cost. It seems reasonable to base >> the heuristic on only the alternatives that we'd actually like to use, >> but if this ends up being too aggressive, we could instead make the new >> reject behaviour conditional and only use it for add_insn_allocno_copies. >> > This patch definitely improves the heuristics. > > The only missed part is a comment for preferred_alternatives > > /* The value of get_preferred_alternatives for the current instruction, > supplemental to recog_data. */ > static alternative_mask preferred_alternatives; > > The comment becomes misleading after the patch. Oops, thanks for noticing that. > With changing the comment, the patch is ok for me. Thanks, here's what I installed after testing on aarch64-linux-gnu and x86_64-linux-gnu. Richard 2019-07-01 Richard Sandiford gcc/ * ira-lives.c (process_bb_node_lives): Use ira_setup_alts. Index: gcc/ira-lives.c === --- gcc/ira-lives.c 2019-06-29 17:20:49.0 +0100 +++ gcc/ira-lives.c 2019-07-01 09:33:14.026477923 +0100 @@ -80,8 +80,9 @@ Software Foundation; either version 3, o /* The number of last call at which given allocno was saved. */ static int *allocno_saved_at_call; -/* The value of get_preferred_alternatives for the current instruction, - supplemental to recog_data. */ +/* The value returned by ira_setup_alts for the current instruction; + i.e. the set of alternatives that we should consider to be likely + candidates during reloading. */ static alternative_mask preferred_alternatives; /* If non-NULL, the source operand of a register to register copy for which @@ -1236,9 +1237,7 @@ process_bb_node_lives (ira_loop_tree_nod } } - extract_insn (insn); - preferred_alternatives = get_preferred_alternatives (insn); - preprocess_constraints (insn); + preferred_alternatives = ira_setup_alts (insn); process_single_reg_class_operands (false, freq); if (call_p)
[range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING
As discussed before, this enforces types on undefined and varying, which makes everything more regular, and removes some special casing throughout range handling. The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which will make it easy to get at the bounds of a range later on. Since pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst() and wide_int_to_tree(wi::max_value(precision)), for consistency. UNDEFINED is set similarly, though nobody should ever ask for anything except type() from it. That is, no one should be asking for the bounds. There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats, presumably to pass around state?? This causes value_range_base::type() and others to fail, even though I haven't actually seen a case of someone actually trying to set a VR_RANGE of a float. For now, I've built a NOP_EXPR wrapper so type() works correctly. The correct approach would probably be to avoid creating these varying/undefined ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so. Suggestions welcome, if you don't like special casing this for ipa-cp. Or perhaps as a follow up. In this patch I start introducing a couple small API changes that will be needed for range-ops. Since they're needed/useful for this patch, I started adding them on a need-to-use basis. They are: value_range_base (tree type) This is our constructor for building a varying: value_range_base foo (some_type); value_range_base::supports_type_p(tree type) We use this instead of having to test everywhere for INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout. I have not ported every use of the INTEGRAL || POINTER in the compiler to this function. But that could be a follow up cleanup if someone (else) is interested :). value_range_base_normalize_symbolics(): This normalizes symbolics into constants. In VRP we usually normalize necessary symbolics into [MIN, MAX]. This patch does slightly better. Now we transform: // [SYM, SYM] -> VARYING // [SYM, NUM] -> [-MIN, NUM] // [NUM, SYM] -> [NUM, +MAX] // ~[SYM, NUM] -> [NUM + 1, +MAX] // ~[NUM, SYM] -> [-MIN, NUM - 1] TBH, this bit and its use in *multiplicative_op probably fits better with the canonicalization patch, but as I said. They're a bit intertwined. Ughh. Finally, as you mentioned before, we need a way of caching varyings in the allocation pool. The type_range_cache class in the attached patch is Andrew's doing, but I'll be happy to take the blame and address anything that needs doing. Tested on x86-64 Linux with --enable-languages=all. Aldy commit 24c3a6a2cb7424a9c022930cada3a5f3c84a388a Author: Aldy Hernandez Date: Fri Jun 28 11:00:10 2019 +0200 VR_UNDEFINED and VR_VARYING now have a type. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 01fb97cedb2..a5247735694 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,72 @@ +2019-07-01 Aldy Hernandez + + * gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs + who's result are not valid for a value_range. + Set type for varying value_range. + * ipa-cp.c (ipcp_vr_lattice::init): Set type for varying/undef + value_range. + (ipcp_vr_lattice::set_to_bottom): Same. + (initialize_node_lattices): Same. + * tree-ssa-threadedge.c (record_temporary_equivalences_from_phis): + Same. + * tree-ssanames.c (get_range_info): Same. + * tree-vrp.c (value_range::set_equiv): Do not set equiv on + undef/varying. + (value_range_base::value_range_base): New constructor. + (value_range_base::check): Remove assert for empty min/max. + (value_range_base::equal_p): Allow comparison of typeless undefs. + (value_range_base::set_undefined): Add type. + (value_range::set_undefined): Same. + (value_range_base::set_varying): Same. + (value_range::set_varying): Same. + (value_range_base::type): Remove assert. + (value_range_base::dump): Display type for varying/undef. + (value_range_base::dump): Add argument-less overload. + (value_range::dump): Same. + (vrp_val_max): Add handle_pointers argument. + (vrp_val_min): Same. + (vrp_val_is_max): Same. + (vrp_val_is_min): Same. + (value_range_base::set_and_canonicalize): Adjust so type is + allowed for varying/undef. + (ranges_from_anti_range): Same. + (extract_range_from_muliplicative_op): Same. + (extract_range_from_binary_expr): Same. + (extract_range_from_unary_expr): Same. + (vrp_prop::vrp_initialize): Same. + (vrp_prop::visit_stmt): Same. + (value_range_base::union_helper): Same. + (value_range_base::normalize_symbolics): Same. + (determine_value_range_1): Same. + * tree-vrp.h (value_range_base): Add value_range_base(tree type) + constructor. + Add dump (), supports_type_p, + value_range_base_normalize_symbolics, set_varying, and + set_undefined. + (value_range): Add set_varying, set_undefined, and dump(). + (vrp_val_is_min): A
Re: [PATCH][gcc] libgccjit: check result_type in gcc_jit_context_new_binary_op
Andrea Corallo writes: > David Malcolm writes: > >> On Tue, 2019-06-25 at 08:11 +, Andrea Corallo wrote: >>> Hi, >>> third version for this patch with the simplified test. >>> >>> make check-jit pass clean >>> >>> Bests >>> Andrea >>> >>> 2019-06-09 Andrea Corallo andrea.cora...@arm.com >>> >>> * libgccjit.c (gcc_jit_context_new_binary_op): Check result_type to >>> be a >>> numeric type. >>> >>> >>> 2019-06-20 Andrea Corallo andrea.cora...@arm.com >>> >>> * jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c: >>> New testcase. >> >> Thanks for the updated patch. >> >> This is good for trunk. >> >> (Copying and pasting from my other review): are you working on getting >> SVN commit access, or do you want me to commit your two patches on your >> behalf? >> >> Thanks >> Dave > > Hi David, > I can work on to get the SVN commit access. > As a maintainer has to sponsor it would you mind being the one? > > Thanks > Andrea Hi David, kind ping on this :) Best regards Andrea
Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format
On Sat, Jun 29 2019, Giuliano Belinassi wrote: > This patch makes lto-dump dump the symbol table callgraph in graphviz > format. -ENOPATCH Martin > > I've not added any test to this because I couldn't find a way to call > lto-dump from the testsuite. Also, any feedback with regard to how can > I improve this is welcome. > > gcc/ChangeLog > 2019-06-29 Giuliano Belinassi > > * cgraph.c (dump_graphviz): New function > * cgraph.h (dump_graphviz): New function > * symtab.c (dump_graphviz): New function > * varpool.c (dump_graphviz): New function > > gcc/lto/ChangeLog > 2019-06-29 Giuliano Belinassi > > * lang.opt (flag_dump_callgraph): New flag > * lto-dump.c (dump_symtab_graphviz): New function > * lto-dump.c (dump_tool_help): New option > * lto-dump.c (lto_main): New option
[PATCH] Fix libstdc++ install-pdf support.
Try to run "make install-pdf" on a system with dblatex and pdflatex installed but not doxygen gives an error. run_doxygen error: Could not find Doxygen 1.7.0 in path. Looking at the build log, I see that this is also using xsltproc and xmllint. Installing doxygen and running again, I get lots of ignored errors for a missing dot program. Looking at the docs I see the collaboration diagrams are missing. Installing dot and rebuilding and now I have the collaboration diagrams. I don't see any evidence that the pdf docs are using the stylesheets. Otherwise, they need everything else that the xml docs need. Regenerating configure I got an unexpected change, but that is an issue with a patch a few days ago that added a new thread file, and regenerated the libgcc configure to use it, but failed to notice that the libstdc++ configure should have been regenerated too. Tested with x86_64-linux builds with various packages installed or removed, and looking at the final docs to make sure they look right. OK? Jim libstdc++-v3/ * configure.ac (BUILD_PDF): Also test for doxygen, dot, xsltproc, and xmllint. * configure: Regenerate. --- libstdc++-v3/configure| 21 + libstdc++-v3/configure.ac | 4 2 files changed, 17 insertions(+), 8 deletions(-) diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure index 62d9cb49acf..d06b0440cb5 100755 --- a/libstdc++-v3/configure +++ b/libstdc++-v3/configure @@ -15418,6 +15418,7 @@ $as_echo "$target_thread_file" >&6; } case $target_thread_file in aix) thread_header=config/rs6000/gthr-aix.h ;; dce) thread_header=config/pa/gthr-dce.h ;; +gcn) thread_header=config/gcn/gthr-gcn.h ;; lynx) thread_header=config/gthr-lynx.h ;; mipssde) thread_header=config/mips/gthr-mipssde.h ;; posix) thread_header=gthr-posix.h ;; @@ -15637,7 +15638,7 @@ $as_echo "$glibcxx_cv_atomic_long_long" >&6; } # Fake what AC_TRY_COMPILE does. cat > conftest.$ac_ext << EOF -#line 15640 "configure" +#line 15641 "configure" int main() { typedef bool atomic_type; @@ -15672,7 +15673,7 @@ $as_echo "$glibcxx_cv_atomic_bool" >&6; } rm -f conftest* cat > conftest.$ac_ext << EOF -#line 15675 "configure" +#line 15676 "configure" int main() { typedef short atomic_type; @@ -15707,7 +15708,7 @@ $as_echo "$glibcxx_cv_atomic_short" >&6; } rm -f conftest* cat > conftest.$ac_ext << EOF -#line 15710 "configure" +#line 15711 "configure" int main() { // NB: _Atomic_word not necessarily int. @@ -15743,7 +15744,7 @@ $as_echo "$glibcxx_cv_atomic_int" >&6; } rm -f conftest* cat > conftest.$ac_ext << EOF -#line 15746 "configure" +#line 15747 "configure" int main() { typedef long long atomic_type; @@ -15896,7 +15897,7 @@ $as_echo "mutex" >&6; } # unnecessary for this test. cat > conftest.$ac_ext << EOF -#line 15899 "configure" +#line 15900 "configure" int main() { _Decimal32 d1; @@ -15938,7 +15939,7 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu # unnecessary for this test. cat > conftest.$ac_ext << EOF -#line 15941 "configure" +#line 15942 "configure" template struct same { typedef T2 type; }; @@ -15972,7 +15973,7 @@ $as_echo "$enable_int128" >&6; } rm -f conftest* cat > conftest.$ac_ext << EOF -#line 15975 "configure" +#line 15976 "configure" template struct same { typedef T2 type; }; @@ -81880,7 +81881,11 @@ $as_echo "no" >&6; } fi - if test $ac_cv_prog_DBLATEX = "yes" && + if test $ac_cv_prog_DOXYGEN = "yes" && + test $ac_cv_prog_DOT = "yes" && + test $ac_cv_prog_XSLTPROC = "yes" && + test $ac_cv_prog_XMLLINT = "yes" && + test $ac_cv_prog_DBLATEX = "yes" && test $ac_cv_prog_PDFLATEX = "yes"; then BUILD_PDF_TRUE= BUILD_PDF_FALSE='#' diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac index 2e3a1a98f33..80d8202c337 100644 --- a/libstdc++-v3/configure.ac +++ b/libstdc++-v3/configure.ac @@ -483,6 +483,10 @@ AM_CONDITIONAL(BUILD_MAN, AC_CHECK_PROG([DBLATEX], dblatex, yes, no) AC_CHECK_PROG([PDFLATEX], pdflatex, yes, no) AM_CONDITIONAL(BUILD_PDF, + test $ac_cv_prog_DOXYGEN = "yes" && + test $ac_cv_prog_DOT = "yes" && + test $ac_cv_prog_XSLTPROC = "yes" && + test $ac_cv_prog_XMLLINT = "yes" && test $ac_cv_prog_DBLATEX = "yes" && test $ac_cv_prog_PDFLATEX = "yes") -- 2.17.1
Re: [PATCH] Add late non-iterating FRE with optimize > 1
On Thu, 27 Jun 2019, Richard Biener wrote: > > This fixes FREs handling of TARGET_MEM_REF (it didn't consider > &TARGET_MEM_REF) and adds a late FRE pass which has iteration > disabled and runs only at -O[2s]+ to limit the compile-time > impact. > > This helps cases where unrolling and vectorization exposes > "piecewise" redundancies DOM cannot handle. Thus > > (vector *)&a = { 1, 2, 3, 4 }; > .. = a[2]; > > there's still the opposite case not handled (PR83518) but > I will see whether I can make it work without too much cost: > > a[0] = 1; > a[1] = 2; > a[2] = 3; > a[3] = 4; > ... = (vector *)&a; > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > I'll commit the TARGET_MEM_REF fixing indepenently. > > Any comments? I'm not sure I like globbing the iteration > parameter and the optimize > 1 check; maybe I should simply > rename it to 'late' ... > > The compile-time impact might be non-trivial for those testcases > that run into a large overhead from the alias-stmt walking but > I didn't do any measurements yet. Testing went good with only one false-positive fallout of gcc.dg/tree-ssa/pr77445-2.c where new threading opportunities arise in thread3 due to late FRE but those opportunities cross loops and thus are not considered. But that breaks the testcases dump scanning for profile correctness. I've re-visited PR77445 and applied the obvious change. I have commited the FRE TARGET_MEM_REF fix separately and now the patch below. Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2019-07-01 Richard Biener * tree-ssa-sccvn.c (class pass_fre): Add may_iterate pass parameter. (pass_fre::execute): Honor it. * passes.def: Adjust pass_fre invocations to allow iterating, add non-iterating pass_fre before late threading/dom. * gcc.dg/tree-ssa/pr77445-2.c: Adjust. Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c(revision 272742) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -6872,14 +6853,24 @@ class pass_fre : public gimple_opt_pass { public: pass_fre (gcc::context *ctxt) -: gimple_opt_pass (pass_data_fre, ctxt) +: gimple_opt_pass (pass_data_fre, ctxt), may_iterate (true) {} /* opt_pass methods: */ opt_pass * clone () { return new pass_fre (m_ctxt); } - virtual bool gate (function *) { return flag_tree_fre != 0; } + void set_pass_param (unsigned int n, bool param) +{ + gcc_assert (n == 0); + may_iterate = param; +} + virtual bool gate (function *) +{ + return flag_tree_fre != 0 && (may_iterate || optimize > 1); +} virtual unsigned int execute (function *); +private: + bool may_iterate; }; // class pass_fre unsigned int @@ -6888,15 +6879,16 @@ pass_fre::execute (function *fun) unsigned todo = 0; /* At -O[1g] use the cheap non-iterating mode. */ + bool iterate_p = may_iterate && (optimize > 1); calculate_dominance_info (CDI_DOMINATORS); - if (optimize > 1) + if (iterate_p) loop_optimizer_init (AVOID_CFG_MODIFICATIONS); default_vn_walk_kind = VN_WALKREWRITE; - todo = do_rpo_vn (fun, NULL, NULL, optimize > 1, true); + todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true); free_rpo_vn (); - if (optimize > 1) + if (iterate_p) loop_optimizer_finalize (); return todo; Index: gcc/passes.def === --- gcc/passes.def (revision 272742) +++ gcc/passes.def (working copy) @@ -83,7 +83,7 @@ along with GCC; see the file COPYING3. /* pass_build_ealias is a dummy pass that ensures that we execute TODO_rebuild_alias at this point. */ NEXT_PASS (pass_build_ealias); - NEXT_PASS (pass_fre); + NEXT_PASS (pass_fre, true /* may_iterate */); NEXT_PASS (pass_early_vrp); NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_dse); @@ -117,7 +117,7 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_oacc_kernels); PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels) NEXT_PASS (pass_ch); - NEXT_PASS (pass_fre); + NEXT_PASS (pass_fre, true /* may_iterate */); /* We use pass_lim to rewrite in-memory iteration and reduction variable accesses in loops into local variables accesses. */ NEXT_PASS (pass_lim); @@ -199,7 +199,7 @@ along with GCC; see the file COPYING3. execute TODO_rebuild_alias at this point. */ NEXT_PASS (pass_build_alias); NEXT_PASS (pass_return_slot); - NEXT_PASS (pass_fre); + NEXT_PASS (pass_fre, true /* may_iterate */); NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_thread_jumps); NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */); @@ -312,6 +312,7 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_strength_reduction); NEXT_
[range-ops] patch 00/04: Summary
I'm happy to finally contribute the range-ops part of the ranger work. This is the infrastructure for folding unary and binary ranges of tree_codes into a resulting range. It is the ranger counterpart of extract_range_from_*_expr in tree-vrp.c (not the similarly called callers in vr-values.c). It is divided into four parts. The first 3 patches are supporting infrastructure for range-ops, but are technically independent and are useful on their own. They are: patch 01: VR_UNDEFINED and VR_VARYING now have a type. patch 02: Enforce canonicalization in value_range. patch 03: Abstract overflow handling, pointer_plus_expr symbolics, and plus/minus into their own functions. We've discussed and agreed on the general approach of the first two before. The final one is just shuffling things around so we can share value_range overflow handling, plus/minus, and pointer_plus symbolics with the ranger. There are no changes to functionality in patch 03. Note, that even though these 3 patches are technically independent, they keep touching the same bits over and over, so it's easier to scaffold them. That is, patch 2 depends on 1 and patch 3 depends on 1 and 2. It was a pain just to get them organized without just submitting a huge patch. Finally, there is patch 04 which is range-ops itself. I will discuss this in detail in the appropriate patch, but the general gist is that (if we agree), we will be running range-ops and extract_range_from_*expr during stage1, while comparing that the results agree. Once everyone is in agreement, we can rip out the extract_range_from_* bits from tree-vrp.c There will be a flag (-franges={vrp,range-ops,checking} and the default will be checking (run both and compare). Let's get this party started! Aldy
Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned
On Fri, 28 Jun 2019, Andrew Pinski wrote: > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He wrote: > > > > > > > > On 2019/6/27 11:48 PM, Jeff Law wrote: > > > On 6/27/19 12:11 AM, Li Jia He wrote: > > >> Hi, > > >> > > >> According to the optimizable case described by Qi Feng on > > >> issue 88784, we can combine the cases into the following: > > >> > > >> 1. x > y && x != XXX_MIN --> x > y > > >> 2. x > y && x == XXX_MIN --> false > > >> 3. x <= y && x == XXX_MIN --> x == XXX_MIN > > >> > > >> 4. x < y && x != XXX_MAX --> x < y > > >> 5. x < y && x == XXX_MAX --> false > > >> 6. x >= y && x == XXX_MAX --> x == XXX_MAX > > >> > > >> 7. x > y || x != XXX_MIN --> x != XXX_MIN > > >> 8. x <= y || x != XXX_MIN --> true > > >> 9. x <= y || x == XXX_MIN --> x <= y > > >> > > >> 10. x < y || x != XXX_MAX --> x != UXXX_MAX > > >> 11. x >= y || x != XXX_MAX --> true > > >> 12. x >= y || x == XXX_MAX --> x >= y > > >> > > >> Note: XXX_MIN represents the minimum value of type x. > > >>XXX_MAX represents the maximum value of type x. > > >> > > >> Here we don't need to care about whether the operation is > > >> signed or unsigned. For example, in the below equation: > > >> > > >> 'x > y && x != XXX_MIN --> x > y' > > >> > > >> If the x type is signed int and XXX_MIN is INT_MIN, we can > > >> optimize it to 'x > y'. However, if the type of x is unsigned > > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'. > > >> > > >> The regression testing for the patch was done on GCC mainline on > > >> > > >> powerpc64le-unknown-linux-gnu (Power 9 LE) > > >> > > >> with no regressions. Is it OK for trunk ? > > >> > > >> Thanks, > > >> Lijia He > > >> > > >> gcc/ChangeLog > > >> > > >> 2019-06-27 Li Jia He > > >> Qi Feng > > >> > > >> PR middle-end/88784 > > >> * gimple-fold.c (and_comparisons_contain_equal_operands): New > > >> function. > > >> (and_comparisons_1): Use and_comparisons_contain_equal_operands. > > >> (or_comparisons_contain_equal_operands): New function. > > >> (or_comparisons_1): Use or_comparisons_contain_equal_operands. > > > Would this be better done via match.pd? ISTM this transformation would > > > be well suited for that framework. > > > > Hi, Jeff > > > > I did this because of the following test case: > > ` > > _Bool comp(unsigned x, unsigned y) > > { > >return x > y && x != 0; > > } > > ` > > The gimple file dumped on the power platform is: > > ` > > comp (unsigned int x, unsigned int y) > > { > >_Bool D.2837; > >int iftmp.0; > > > >if (x > y) goto ; else goto ; > >: > >if (x != 0) goto ; else goto ; > >: > >iftmp.0 = 1; > >goto ; > >: > >iftmp.0 = 0; > >: > >D.2837 = (_Bool) iftmp.0; > >return D.2837; > > } > > ` > > However, the gimple file dumped on x86 is > > ` > > comp (unsigned int x, unsigned int y) > > { > >_Bool D.2837; > > > >_1 = x > y; > >_2 = x != 0; > >_3 = _1 & _2; > >_4 = (int) _3; > >D.2837 = (_Bool) _4; > >return D.2837; > > } > > ` > > > > The reason for the inconsistency between these two behaviors is param > > logical-op-non-short-circuit. If we add the pattern to the match.pd > > file, we can only optimize the situation in which the statement is in > > the same basic block (logical-op-non-short-circuit=1, x86). But for > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd > > can't handle this situation. > > > > Another reason is that I found out maybe_fold_and_comparisons and > > maybe_fold_or_comparisons are not only called by ifcombine pass but > > also by reassoc pass. Using this method can basically unify param > > logical-op-non-short-circuit=0 or 1. > > > As mentioned before ifcombine pass should be using gimple-match > instead of fold_build. Try converting ifcombine over to gimple-match > infrastructure and add these to match.pd. Yes, I mentioned that in the PR. The issue is that at the moment to combine x > y with x <= y you'd have to build GENERIC trees for both or temporary GIMPLE assign with a SSA def (and then feed that into the GENERIC or GIMPLE match.pd path). maybe_fold_and/or_comparisons handle two exploded binary expressions while the current match.pd entries handle at most one exploded one (the outermost then, either AND or OR). But it would be definitely doable to auto-generate maybe_fold_and/or_comparisons from match.pd patterns which is what I'd ultimatively suggest to do (in some more generalized form maybe). Either with a separate genmatch invocation or as part of the --gimple processing (not sure what is more feasible here). I told Li Jia He that I don't expect him to do this work. Note I didn't review the actual patch yet. Thanks, Richard.