Re: [PATCH] PR fortran/91513 -- Fix poorly worded error message
On Fri, Oct 11, 2019 at 09:56:15PM -0700, Steve Kargl wrote: > In PR fortran/91513, Damian Rouson points out that the Fortran > does contain the words "impure variable". Geez. That is a messed up sentence. In PR fortran/91513, Damian Rouson points out that the Fortran *standard* does *not* contain the words "impure variable. (It's late, and I'm tired!) > Damian and I spent > an afternoon together a few weeks ago where I gave Damian a > world wind view of how I see gcc/fortran and go about debugging > problems. Damian and I both have full schedules, but I was > hoping he would submit the patch to the list. My time of > fixing gfortran bugs is running out, so ... > > OK to commit? > > 2019-09-29 Damian Rouson > > PR fortran/91513 > * resolve.c (resolve_ordinary_assign): Improved error message. > > 2019-09-29 Damian Rouson > > PR fortran/91513 > * gfortran.dg/impure_assignment_2.f90: Update dg-error regex. > > -- > Steve > Index: gcc/fortran/resolve.c > === > --- gcc/fortran/resolve.c (revision 276269) > +++ gcc/fortran/resolve.c (working copy) > @@ -10774,9 +10774,12 @@ resolve_ordinary_assign (gfc_code *code, > gfc_namespace > "component in a PURE procedure", > &rhs->where); > else > - gfc_error ("The impure variable at %L is assigned to " > - "a derived type variable with a POINTER " > - "component in a PURE procedure (12.6)", > + /* F2008, C1283 (4). */ > + gfc_error ("In a pure subprogram an INTENT(IN) dummy argument " > + "shall not be used as the expr at %L of an intrinsic " > + "assignment statement in which the variable is of a " > + "derived type if the derived type has a pointer " > + "component at any level of component selection.", > &rhs->where); > return rval; > } > Index: gcc/testsuite/gfortran.dg/impure_assignment_2.f90 > === > --- gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (revision 276269) > +++ gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (working copy) > @@ -40,7 +40,7 @@ CONTAINS >PURE FUNCTION give_next3(node) > TYPE(node_type), intent(in) :: node > TYPE(node_type) :: give_next > - give_next = node ! { dg-error "impure variable" } > + give_next = node ! { dg-error "pure subprogram" } >END FUNCTION > END MODULE pr20863 > -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow
[PATCH] PR fortran/91513 -- Fix poorly worded error message
In PR fortran/91513, Damian Rouson points out that the Fortran does contain the words "impure variable". Damian and I spent an afternoon together a few weeks ago where I gave Damian a world wind view of how I see gcc/fortran and go about debugging problems. Damian and I both have full schedules, but I was hoping he would submit the patch to the list. My time of fixing gfortran bugs is running out, so ... OK to commit? 2019-09-29 Damian Rouson PR fortran/91513 * resolve.c (resolve_ordinary_assign): Improved error message. 2019-09-29 Damian Rouson PR fortran/91513 * gfortran.dg/impure_assignment_2.f90: Update dg-error regex. -- Steve Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 276269) +++ gcc/fortran/resolve.c (working copy) @@ -10774,9 +10774,12 @@ resolve_ordinary_assign (gfc_code *code, gfc_namespace "component in a PURE procedure", &rhs->where); else - gfc_error ("The impure variable at %L is assigned to " - "a derived type variable with a POINTER " - "component in a PURE procedure (12.6)", + /* F2008, C1283 (4). */ + gfc_error ("In a pure subprogram an INTENT(IN) dummy argument " + "shall not be used as the expr at %L of an intrinsic " + "assignment statement in which the variable is of a " + "derived type if the derived type has a pointer " + "component at any level of component selection.", &rhs->where); return rval; } Index: gcc/testsuite/gfortran.dg/impure_assignment_2.f90 === --- gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (revision 276269) +++ gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (working copy) @@ -40,7 +40,7 @@ CONTAINS PURE FUNCTION give_next3(node) TYPE(node_type), intent(in) :: node TYPE(node_type) :: give_next - give_next = node ! { dg-error "impure variable" } + give_next = node ! { dg-error "pure subprogram" } END FUNCTION END MODULE pr20863
[PATCH] rs6000: -flto forgets 'no-vsx' function attributes (PR target/70010)
Hi, As expected in the PR, when a function is marked with no-vsx, we would assume user has good reason to disable VSX code generation for the function. To avoid VSX code generation, this function should not be inlined into VSX function. In previous implementation, function with non-vsx is treated as with subset features of target with vsx, even programer set no-vsx attribute. Following patch prevents no-vsx function to be inlined into vsx function. Bootstrapped/regtested on ppc64le-linux, ok for trunk? Thanks, Jiufu Guo [gcc] 2019-10-12 Jiufu Guo PR target/70010 * config/rs6000/rs6000.c (rs6000_can_inline_p): Check 'vsx' feature for caller and callee [gcc/testsuite] 2019-10-12 Jiufu Guo PR target/70010 * gcc.target/powerpc/inline_error.c: New test --- gcc/config/rs6000/rs6000.c | 14 +- gcc/testsuite/gcc.target/powerpc/inline_error.c | 17 + 2 files changed, 26 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/inline_error.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d1434a9..c38dc87 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -23978,11 +23978,15 @@ rs6000_can_inline_p (tree caller, tree callee) struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); - /* Callee's options should a subset of the caller's, i.e. a vsx function -can inline an altivec function but a non-vsx function can't inline a -vsx function. */ - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) - == callee_opts->x_rs6000_isa_flags) + /* Callee's options should a subset of the caller's. In addition, +vsx function should not inline function with non-vsx by which +programmer may intend to disable VSX code generation. */ + if ((caller_opts->x_rs6000_isa_flags & OPTION_MASK_VSX) + && (callee_opts->x_rs6000_isa_flags & OPTION_MASK_VSX) == 0) + ret = false; + else if ((caller_opts->x_rs6000_isa_flags + & callee_opts->x_rs6000_isa_flags) + == callee_opts->x_rs6000_isa_flags) ret = true; } diff --git a/gcc/testsuite/gcc.target/powerpc/inline_error.c b/gcc/testsuite/gcc.target/powerpc/inline_error.c new file mode 100644 index 000..78870db --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/inline_error.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -flto -mvsx" } */ + +vector int c, a, b; + +static inline void __attribute__ ((__always_inline__, target ("no-vsx"))) +foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */ +{ + c = a + b; +} + +int +main () +{ + foo (); /* { dg-message "called from here" } */ + c = a + b; +} -- 2.7.4
Go patch commited: Mangle dots in pkgpath
The Go frontend needs to mangle dots to avoid problems with -fgo-pkgpath=a.0. That will confuse the name mangling, which assumes that names entering the mangling cannot contain arbitrary dot characters. We don't need to mangle other characters; go_encode_id will handle them. This patch fixes the problem. It fixes https://golang.org/issue/33871. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 276594) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -ddfb845fad1f2e8b84383f262ed5ea5be7b3e35a +f174fdad69cad42309984dfa108d80f2ae8a9f78 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/go-encode-id.cc === --- gcc/go/gofrontend/go-encode-id.cc (revision 276382) +++ gcc/go/gofrontend/go-encode-id.cc (working copy) @@ -253,3 +253,16 @@ go_mangle_struct_tag(const std::string& } return ret; } + +// Encode a package path. + +std::string +go_mangle_pkgpath(const std::string& pkgpath) +{ + std::string s = pkgpath; + for (size_t i = s.find('.'); + i != std::string::npos; + i = s.find('.', i + 1)) +s.replace(i, 1, ".x2e"); // 0x2e is the ASCII encoding for '.' + return s; +} Index: gcc/go/gofrontend/go-encode-id.h === --- gcc/go/gofrontend/go-encode-id.h(revision 276382) +++ gcc/go/gofrontend/go-encode-id.h(working copy) @@ -34,4 +34,12 @@ go_selectively_encode_id(const std::stri extern std::string go_mangle_struct_tag(const std::string& tag); +// Encode a package path. A package path can contain any arbitrary +// character, including '.'. go_encode_id expects that any '.' will +// be inserted by name mangling in a controlled manner. So first +// translate any '.' using the same .x encoding as used by +// go_mangle_struct_tag. +extern std::string +go_mangle_pkgpath(const std::string& pkgpath); + #endif // !defined(GO_ENCODE_ID_H) Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gofrontend/gogo.cc (revision 276579) +++ gcc/go/gofrontend/gogo.cc (working copy) @@ -298,7 +298,7 @@ void Gogo::set_pkgpath(const std::string& arg) { go_assert(!this->pkgpath_set_); - this->pkgpath_ = arg; + this->pkgpath_ = go_mangle_pkgpath(arg); this->pkgpath_set_ = true; this->pkgpath_from_option_ = true; } @@ -396,7 +396,8 @@ Gogo::set_package_name(const std::string { if (!this->prefix_from_option_) this->prefix_ = "go"; - this->pkgpath_ = this->prefix_ + '.' + package_name; + this->pkgpath_ = (go_mangle_pkgpath(this->prefix_) + '.' + + package_name); this->pkgpath_symbol_ = (Gogo::pkgpath_for_symbol(this->prefix_) + '.' + Gogo::pkgpath_for_symbol(package_name)); }
Re: [PATCH][ARM] Switch to default sched pressure algorithm
On Fri, Oct 11, 2019 at 10:42 PM Wilco Dijkstra wrote: > > Hi, > > > the defaults for v7-a are still to use the > > Cortex-A8 scheduler > > I missed that part, but that's a serious bug btw - Cortex-A8 is 15 years old > now so > way beyond obsolete. Even Cortex-A53 is ancient now, but it has an accurate > scheduler > that performs surprisingly well on both in-order and out-of-order > implementations. On armv8-a we do use cortex-a53 as the default scheduler in the AArch32 backend. regards Ramana > > Cheers, > Wilco
Re: [PATCH][ARM] Switch to default sched pressure algorithm
On Fri, Oct 11, 2019 at 6:19 PM Wilco Dijkstra wrote: > > Hi Ramana, > > > Can you see what happens with the Cortex-A8 or Cortex-A9 schedulers to > > spread the range across some v7-a CPUs as well ? While they aren't that > > popular today I > > would suggest you look at them because the defaults for v7-a are still to > > use the > > Cortex-A8 scheduler and the Cortex-A9 scheduler might well also get used in > > places given > > the availability of hardware. > > The results are practically identical to Cortex-A53 and A57 - there is a huge > codesize win > across the board on SPEC2006, there isn't a single benchmark that is larger > (ie. more > spilling). > > > I'd be happy to move this forward if you could show if there is no > > *increase* in spills > > for the same range of benchmarks that you are doing for the Cortex-A8 and > > Cortex-A9 > > schedulers. > > There certainly isn't. I don't think results like these could be any more > one-sided, it's a > significant win for every single benchmark, both for codesize and performance! > Ok go ahead - please be sensitive to testsuite regressions. Ramana > What isn't clear is whether something has gone horribly wrong in the > scheduler which > could be fixed/reverted, but as it is right now I can't see it being useful > at all. This means > we should also reevaluate whether pressure scheduling now hurts AArch64 too. > > Cheers, > Wilco
[PATCH] PR fortran/90297 -- Remove code with no functional effect
The patch is fairly self-explanatory. OK to commit? 2019-10-11 Steven G. Kargl PR fortran/90297 * resolve.c (resolve_typebound_function): Remove code with no functional effect. -- Steve Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 276899) +++ gcc/fortran/resolve.c (working copy) @@ -6548,21 +6548,6 @@ resolve_typebound_function (gfc_expr* e) overridable = !e->value.compcall.tbp->non_overridable; if (expr && expr->ts.type == BT_CLASS && e->value.compcall.name) { - /* If the base_object is not a variable, the corresponding actual - argument expression must be stored in e->base_expression so - that the corresponding tree temporary can be used as the base - object in gfc_conv_procedure_call. */ - if (expr->expr_type != EXPR_VARIABLE) - { - gfc_actual_arglist *args; - - for (args= e->value.function.actual; args; args = args->next) - { - if (expr == args->expr) - expr = args->expr; - } - } - /* Since the typebound operators are generic, we have to ensure that any delays in resolution are corrected and that the vtab is present. */
Re: [C++ Patch] Remove most uses of in_system_header_at
On Fri, Oct 11, 2019 at 11:48:27PM +0200, Paolo Carlini wrote: > --- decl.c(revision 276903) > +++ decl.c(working copy) > @@ -9328,7 +9328,6 @@ grokfndecl (tree ctype, > } > /* 17.6.3.3.5 */ > if (suffix[0] != '_' > - && !in_system_header_at (location) > && !current_function_decl && !(friendp && !funcdef_flag)) > warning_at (location, OPT_Wliteral_suffix, > "literal operator suffixes not preceded by %<_%>" This should make no functional difference. > @@ -10036,8 +10035,6 @@ compute_array_index_type_loc (location_t name_loc, > indicated by the state of complain), so that > another substitution can be found. */ > return error_mark_node; > - else if (in_system_header_at (input_location)) > - /* Allow them in system headers because glibc uses them. */; > else if (name) > pedwarn (loc, OPT_Wpedantic, >"ISO C++ forbids zero-size array %qD", name); But this one is unclear, in_system_header_at is testing a different location from what will be used by pedwarn, so if input_location is in system header and loc is not, it didn't pedwarn before and now it will. Similarly various other spots in the patch. I haven't tried to check in detail what exactly we want in each case, all I want to say is that some cases are obvious, other cases are not. Jakub
Support decimal floating-point constants in C2x
ISO C2x adds decimal floating point as an optional standard feature. This patch accordingly makes GCC accept DFP constants (DF, DD, DL, df, dd, dl suffixes) in strict C2X mode, with a pedwarn-if-pedantic for older standards and a warning with -Wc11-c2x-compat even in C2x mode (which in turn requires -Wc11-c2x-compat to be newly passed through to libcpp). Bootstrapped with no regressions on x86_64-pc-linux-gnu. Applied to mainline. gcc/c-family: 2019-10-11 Joseph Myers * c.opt (Wc11-c2x-compat): Add CPP(cpp_warn_c11_c2x_compat) CppReason(CPP_W_C11_C2X_COMPAT). gcc/testsuite: 2019-10-11 Joseph Myers * gcc.dg/dfp/c11-constants-1.c, gcc.dg/dfp/c11-constants-2.c, gcc.dg/dfp/c2x-constants-1.c, gcc.dg/dfp/c2x-constants-2.c: New tests. * gcc.dg/dfp/constants-pedantic.c: Use -std=gnu17 explicitly. Update expected diagnostics. libcpp: 2019-10-11 Joseph Myers * include/cpplib.h (struct cpp_options): Add dfp_constants and cpp_warn_c11_c2x_compat. (enum cpp_warning_reason): Add CPP_W_C11_C2X_COMPAT. * init.c (struct lang_flags): Add dfp_constants. (lang_defaults): Set dfp_constants to 1 for GNUC2X and STDC2X and 0 for other languages. (cpp_set_lang): Set dfp_constants from language. (cpp_create_reader): Set cpp_warn_c11_c2x_compat to -1. * expr.c (interpret_float_suffix): Mention DFP constants as C2X in comment. (cpp_classify_number): Do not diagnose DFP constants for languages setting dfp_constants, unless cpp_warn_c11_c2x_compat. Index: gcc/c-family/c.opt === --- gcc/c-family/c.opt (revision 276896) +++ gcc/c-family/c.opt (working copy) @@ -367,7 +367,7 @@ C ObjC C++ ObjC++ CPP(warn_builtin_macro_redefined Warn when a built-in preprocessor macro is undefined or redefined. Wc11-c2x-compat -C ObjC Var(warn_c11_c2x_compat) Init(-1) Warning +C ObjC CPP(cpp_warn_c11_c2x_compat) CppReason(CPP_W_C11_C2X_COMPAT) Var(warn_c11_c2x_compat) Init(-1) Warning Warn about features not present in ISO C11, but present in ISO C2X. Wc90-c99-compat Index: gcc/testsuite/gcc.dg/dfp/c11-constants-1.c === --- gcc/testsuite/gcc.dg/dfp/c11-constants-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c11-constants-1.c (working copy) @@ -0,0 +1,13 @@ +/* Test that DFP constants are diagnosed in C11 mode: -pedantic. */ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -pedantic" } */ + +int a = (int) 1.1DF; /* { dg-warning "C2X feature" } */ +int b = (int) 2.df; /* { dg-warning "C2X feature" } */ +int c = (int) .33DD; /* { dg-warning "C2X feature" } */ +int d = (int) 2e1dd; /* { dg-warning "C2X feature" } */ +int e = (int) .3e2DL; /* { dg-warning "C2X feature" } */ +int f = (int) 4.5e3dl; /* { dg-warning "C2X feature" } */ +int g = (int) 5.e0DF; /* { dg-warning "C2X feature" } */ +int h = (int) 1e+2df; /* { dg-warning "C2X feature" } */ +int i = (int) 1000e-3DL; /* { dg-warning "C2X feature" } */ Index: gcc/testsuite/gcc.dg/dfp/c11-constants-2.c === --- gcc/testsuite/gcc.dg/dfp/c11-constants-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c11-constants-2.c (working copy) @@ -0,0 +1,13 @@ +/* Test that DFP constants are diagnosed in C11 mode: -pedantic-errors. */ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -pedantic-errors" } */ + +int a = (int) 1.1DF; /* { dg-error "C2X feature" } */ +int b = (int) 2.df; /* { dg-error "C2X feature" } */ +int c = (int) .33DD; /* { dg-error "C2X feature" } */ +int d = (int) 2e1dd; /* { dg-error "C2X feature" } */ +int e = (int) .3e2DL; /* { dg-error "C2X feature" } */ +int f = (int) 4.5e3dl; /* { dg-error "C2X feature" } */ +int g = (int) 5.e0DF; /* { dg-error "C2X feature" } */ +int h = (int) 1e+2df; /* { dg-error "C2X feature" } */ +int i = (int) 1000e-3DL; /* { dg-error "C2X feature" } */ Index: gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c === --- gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c (working copy) @@ -0,0 +1,13 @@ +/* Test that DFP constants are accepted in C2X mode. */ +/* { dg-do compile } */ +/* { dg-options "-std=c2x -pedantic-errors" } */ + +int a = (int) 1.1DF; +int b = (int) 2.df; +int c = (int) .33DD; +int d = (int) 2e1dd; +int e = (int) .3e2DL; +int f = (int) 4.5e3dl; +int g = (int) 5.e0DF; +int h = (int) 1e+2df; +int i = (int) 1000e-3DL; Index: gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c === --- gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c (working copy) @@ -0,0 +1,13 @@ +/* Test that DFP constants are accepted in C2X mode: compat warnings. */ +/* { dg-do compile } */ +/*
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
On 10/11/19 6:58 AM, Paolo Carlini wrote: Hi, another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent with the one I used in build_anon_union_vars. Tested x86_64-linux. Thanks, Paolo. / OK. Jason
Re: [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses
Hi! On Wed, Oct 09, 2019 at 04:26:20PM -0400, Michael Meissner wrote: > --- gcc/config/rs6000/vsx.md (revision 276713) > +++ gcc/config/rs6000/vsx.md (working copy) > @@ -1149,10 +1149,14 @@ (define_insn "vsx_mov_64bit" > "vecstore, vecload, vecsimple, mffgpr,mftgpr,load, > store, load, store, *, vecsimple, > vecsimple, > vecsimple, *, *, vecstore, vecload") > - (set_attr "length" > + (set_attr "non_prefixed_length" > "*, *, *, 8, *, 8, > 8, 8, 8, 8, *, *, > *, 20,8, *, *") > + (set_attr "prefixed_length" > + "*, *, *, 8, *, 20, > +20,20,20,8, *, *, > +*, 20,8, *, *") Alternative 13 has non_prefixed_length 20, I wonder what insns that generates? Other than that, looks good afaics. Segher
[C++ Patch] Remove most uses of in_system_header_at
Hi, as promised, this removes most uses of the predicate. I left alone the two occurrences in decl.c which guard permerrors, not warnings, and two more in parser.c which should be a (minor) optimization (particularly that in cp_parser_cast_expression). Tested x86_64-linux. Thanks, Paolo. /// 2019-10-11 Paolo Carlini * decl.c (grokfndecl): Remove redundant use of in_system_header_at. (compute_array_index_type_loc): Likewise. (grokdeclarator): Likewise. * error.c (cp_printer): Likewise. * lambda.c (add_default_capture): Likewise. * parser.c (cp_parser_primary_expression): Likewise. (cp_parser_selection_statement): Likewise. (cp_parser_toplevel_declaration): Likewise. (cp_parser_enumerator_list): Likewise. (cp_parser_using_declaration): Likewise. (cp_parser_member_declaration): Likewise. (cp_parser_exception_specification_opt): Likewise. (cp_parser_std_attribute_spec): Likewise. * pt.c (do_decl_instantiation): Likewise. (do_type_instantiation): Likewise. * typeck.c (cp_build_unary_op): Likewise. Index: decl.c === --- decl.c (revision 276903) +++ decl.c (working copy) @@ -9328,7 +9328,6 @@ grokfndecl (tree ctype, } /* 17.6.3.3.5 */ if (suffix[0] != '_' - && !in_system_header_at (location) && !current_function_decl && !(friendp && !funcdef_flag)) warning_at (location, OPT_Wliteral_suffix, "literal operator suffixes not preceded by %<_%>" @@ -10036,8 +10035,6 @@ compute_array_index_type_loc (location_t name_loc, indicated by the state of complain), so that another substitution can be found. */ return error_mark_node; - else if (in_system_header_at (input_location)) - /* Allow them in system headers because glibc uses them. */; else if (name) pedwarn (loc, OPT_Wpedantic, "ISO C++ forbids zero-size array %qD", name); @@ -11037,7 +11034,7 @@ grokdeclarator (const cp_declarator *declarator, } /* Don't pedwarn if the alternate "__intN__" form has been used instead of "__intN". */ - else if (!int_n_alt && pedantic && ! in_system_header_at (input_location)) + else if (!int_n_alt && pedantic) pedwarn (declspecs->locations[ds_type_spec], OPT_Wpedantic, "ISO C++ does not support %<__int%d%> for %qs", int_n_data[declspecs->int_n_idx].bitsize, name); @@ -12695,10 +12692,7 @@ grokdeclarator (const cp_declarator *declarator, else { /* Array is a flexible member. */ - if (in_system_header_at (input_location)) - /* Do not warn on flexible array members in system -headers because glibc uses them. */; - else if (name) + if (name) pedwarn (id_loc, OPT_Wpedantic, "ISO C++ forbids flexible array member %qs", name); else Index: error.c === --- error.c (revision 276903) +++ error.c (working copy) @@ -4317,10 +4317,7 @@ cp_printer (pretty_printer *pp, text_info *text, c void maybe_warn_cpp0x (cpp0x_warn_str str) { - if ((cxx_dialect == cxx98) && !in_system_header_at (input_location)) -/* We really want to suppress this warning in system headers, - because libstdc++ uses variadic templates even when we aren't - in C++0x mode. */ + if (cxx_dialect == cxx98) switch (str) { case CPP0X_INITIALIZER_LISTS: Index: lambda.c === --- lambda.c(revision 276903) +++ lambda.c(working copy) @@ -697,8 +697,7 @@ add_default_capture (tree lambda_stack, tree id, t /* Warn about deprecated implicit capture of this via [=]. */ if (cxx_dialect >= cxx2a && this_capture_p - && LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda) == CPLD_COPY - && !in_system_header_at (LAMBDA_EXPR_LOCATION (lambda))) + && LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda) == CPLD_COPY) { if (warning_at (LAMBDA_EXPR_LOCATION (lambda), OPT_Wdeprecated, "implicit capture of %qE via %<[=]%> is deprecated " Index: parser.c === --- parser.c(revision 276903) +++ parser.c(working copy) @@ -5364,8 +5364,7 @@ cp_parser_primary_expression (cp_parser *parser, { expr = cp_parser_fold_expression (parser, expr); if (expr != error_mark_node - && cxx_dialect < cxx17 - && !in_system_header_at (input_location)) + &&
Re: [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames
On Wed, Oct 09, 2019 at 04:22:06PM -0400, Michael Meissner wrote: > This patch allows -fstack-protect to work with large stack frames. I don't understand; please explain. What I see is a workaround for not properly handling prefixed addresses in the stack protect code (by forcing the addresses to not be prefixed at expand time). > +rtx > +make_memory_non_prefixed (rtx mem) > +{ > + gcc_assert (MEM_P (mem)); > + > + rtx old_addr = XEXP (mem, 0); > + if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT)) > +{ > + rtx new_addr; > + > + if (GET_CODE (old_addr) == PLUS > + && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0))) How can you ever have a subreg in an address? One in Pmode even? > + && CONST_INT_P (XEXP (old_addr, 1))) > + { > + rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1)); > + new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg); > + } > + else > + new_addr = force_reg (Pmode, old_addr); > + > + mem = change_address (mem, VOIDmode, new_addr); replace_equiv_address ? > +(define_expand "stack_protect_setdi" > + [(parallel [(set (match_operand:DI 0 "memory_operand") > +(unspec:DI [(match_operand:DI 1 "memory_operand")] > +UNSPEC_SP_SET)) > + (set (match_scratch:DI 2) > +(const_int 0))])] > + "TARGET_64BIT" > +{ > + if (TARGET_PREFIXED_ADDR) > +{ > + operands[0] = make_memory_non_prefixed (operands[0]); > + operands[1] = make_memory_non_prefixed (operands[1]); > +} > +}) It shouldn't be terribly hard to make the define_insns just *work* with prefixed insns, instead? Is there any reason we are sure these memory references will not turn into something that needs prefixed insns, after expand? It seems fragile like this. > +@item em > +A memory operand that does not contain a prefixed address. "A memory operand that does not require a prefixed instruction"? Segher
Re: [PATCH][ARM] Switch to default sched pressure algorithm
Hi, > the defaults for v7-a are still to use the > Cortex-A8 scheduler I missed that part, but that's a serious bug btw - Cortex-A8 is 15 years old now so way beyond obsolete. Even Cortex-A53 is ancient now, but it has an accurate scheduler that performs surprisingly well on both in-order and out-of-order implementations. Cheers, Wilco
Re: [PATCH] V5, #4 of 15: Update predicates
On Wed, Oct 09, 2019 at 04:07:50PM -0400, Michael Meissner wrote: > It also adds two new predicates that disallow prefixed memory that will used > in > patches #5 and #7. Then you should have really introduced them in *those* patches. > 2019-10-08 Michael Meissner > > * config/rs6000/predicates.md (lwa_operand): Allow using PLWA to > generate sign extend with odd offsets. I don't understand what this means. "Odd offsets" isn't correct, in any case? > + /* The LWA instruction uses the DS-form format where the bottom two bits of > + the offset must be 0. The prefixed PLWA does not have this > + restriction. */ > + if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS)) > +return true; DImode? > +;; Return 1 if op is a memory operand that is not prefixed. > +(define_predicate "non_prefixed_memory" > + (match_code "mem") > +{ > + if (!memory_operand (op, mode)) > +return false; > + > + return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT); > +}) This one is fine. > +;; Return 1 if op is either a register operand or a memory operand that does > +;; not use a prefixed address. > +(define_predicate "reg_or_non_prefixed_memory" > + (match_code "reg,subreg,mem") > +{ > + return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode)); > +}) This never allows subreg. Segher
Re: [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost
Hi! On Wed, Oct 09, 2019 at 04:03:16PM -0400, Michael Meissner wrote: > The basic problem is if there is no explicit cost predicate, rs6000_insn_cost > uses the instruction size to figure out how many instructions are present, and > make the cost a fact on that. Since prefixed instructions are 12 bytes within > GCC (to deal with the implicit NOP), if we did not do this change, the > optimizers would try to save registers from prefixed loads because they > thought > the load was more expensive. Maybe we should just have an attribute that says how many insns this is? You can get rid of many prefixed_length and non_prefixed_length attributes that way, too. > + int cost; > + int length = get_attr_length (insn); > + int n = length / 4; > + > + /* How many real instructions are generated for this insn? This is > slightly What is a "real" instruction? Machine instruction? > + different from the length attribute, in that the length attribute counts > + the number of bytes. With prefixed instructions, we don't want to > count a > + prefixed instruction (length 12 bytes including possible NOP) as taking > 3 > + instructions, but just one. */ > + if (length >= 12 && get_attr_prefixed (insn) == PREFIXED_YES) > +{ > + /* Single prefixed instruction. */ > + if (length == 12) > + n = 1; > + > + /* A normal instruction and a prefixed instruction (16) or two back > + to back prefixed instructions (20). */ > + else if (length == 16 || length == 20) > + n = 2; > + > + /* Guess for larger instruction sizes. */ > + else > + n = 2 + (length - 20) / 4; That's a pretty bad estimate. Can you look at non_prefixed_size, will that help? Segher
C++ PATCH to add test for c++/92070
Also fixed by r276906. Tested on x86_64-linux, applying to trunk. 2019-10-11 Marek Polacek PR c++/92070 - bogus error with -fchecking=2. * g++.dg/expr/cond17.C: New test. diff --git gcc/testsuite/g++.dg/expr/cond17.C gcc/testsuite/g++.dg/expr/cond17.C new file mode 100644 index 000..1999c376dd1 --- /dev/null +++ gcc/testsuite/g++.dg/expr/cond17.C @@ -0,0 +1,11 @@ +// PR c++/92070 - bogus error with -fchecking=2. +// { dg-additional-options "-fchecking=2" } + +struct a; +struct b { + static a c(); +}; +struct a : b {}; +template struct d { + void e() { 0 ? b() : b::c(); } +};
C++ PATCH for c++/92062 - ODR-use ignored for static member of class template
has_value_dependent_address wasn't stripping location wrappers so it gave the wrong answer for "&x" in the static_assert. That led us to thinking that the expression isn't instantiation-dependent, and we skipped static initialization of A<0>::x. This patch adds stripping so that has_value_dependent_address gives the same answer as it used to before the location wrappers addition. Bootstrapped/regtested on x86_64-linux, ok for trunk and 9? 2019-10-11 Marek Polacek PR c++/92062 - ODR-use ignored for static member of class template. * pt.c (has_value_dependent_address): Strip location wrappers. * g++.dg/cpp0x/constexpr-odr1.C: New test. * g++.dg/cpp0x/constexpr-odr2.C: New test. diff --git gcc/cp/pt.c gcc/cp/pt.c index 84464436991..521d0c56002 100644 --- gcc/cp/pt.c +++ gcc/cp/pt.c @@ -6542,6 +6542,8 @@ check_valid_ptrmem_cst_expr (tree type, tree expr, static bool has_value_dependent_address (tree op) { + STRIP_ANY_LOCATION_WRAPPER (op); + /* We could use get_inner_reference here, but there's no need; this is only relevant for template non-type arguments, which can only be expressed as &id-expression. */ diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C new file mode 100644 index 000..cf3f95f0565 --- /dev/null +++ gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C @@ -0,0 +1,19 @@ +// PR c++/92062 - ODR-use ignored for static member of class template. +// { dg-do run { target c++11 } } + +template struct A { + static const bool x; + static_assert(&x, ""); // odr-uses A<...>::x +}; + +int g; + +template +const bool A::x = (g = 42, false); + +void f(A<0>) {}// A<0> must be complete, so is instantiated +int main() +{ + if (g != 42) +__builtin_abort (); +} diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C new file mode 100644 index 000..0927488e569 --- /dev/null +++ gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C @@ -0,0 +1,19 @@ +// PR c++/92062 - ODR-use ignored for static member of class template. +// { dg-do run { target c++11 } } + +template struct A { + static const bool x; + enum { force_instantiation =! &x}; // odr-uses A<...>::x +}; + +int g; + +template +const bool A::x = (g = 42, false); + +void f(A<0>) {}// A<0> must be complete, so is instantiated +int main() +{ + if (g != 42) +__builtin_abort (); +}
Meet Visitors of IOT Solutions World Congress 2019
Greeting, I was wondering if you be interested in acquiring the list of visitors who are going to attend the show IOT Solutions World Congress on BARCELONA - 29-31 OCTOBER 2019 Total visitors' counts: 16,000 Each record in the list contains: Company name, contact name, job titles, phone number, verified e-mail address, Industry, etc. Please let me know if you are interested in purchasing the list of 16,000 contacts so that I can send you pricing for review Awaiting your response Best regards Darby Howe To be removed from my emails, please reply STOP.
Re: C++ PATCH for c++/92049 - extra error with -fchecking=2
On 10/10/19 8:08 PM, Marek Polacek wrote: The concepts merge brought this bit @@ -26326,9 +26559,9 @@ build_non_dependent_expr (tree expr) unexpected recursive instantiations. */ && !parsing_nsdmi () /* Don't do this during concept expansion either and for - the same reason. */ - && !expanding_concept ()) -fold_non_dependent_expr (expr, tf_none); +the same reason. */ + && !parsing_constraint_expression_p ()) +fold_non_dependent_expr (expr); Weird. STRIP_ANY_LOCATION_WRAPPER (expr); (which I'm not finding in the ChangeLog). Dropping tf_none means that fold_non_dependent_expr will use tf_warning_or_error by default, and in this test that causes an error: template struct cond; template struct S { void f(int i) { cond<__builtin_constant_p(i)>(); } }; S<1> s; where it complains that cond is incomplete. Which it is, but we're not actually instantiating the function f, so we need not issue an error. It's broken a bunch of tests if --enable-checking=extra is in effect. This patch brings that tf_none back. We will still complain if we do instantiate f. Bootstrapped/regtested on x86_64-linux, ok for trunk? OK, thanks. Jason
Re: [PATCH] Fix constexpr-dtor3.C FAIL on arm
On 10/11/19 4:20 AM, Jakub Jelinek wrote: On Thu, Oct 10, 2019 at 05:38:21PM -0400, Jason Merrill wrote: FAIL: g++.dg/cpp2a/constexpr-dtor3.C -std=c++2a (test for excess errors) Excess errors: /gcc/testsuite/g++.dg/cpp2a/constexpr-dtor3.C:152:12: in 'constexpr' expansion of '(& w13)->W7::~W7()' This also seems unrelated, but I'll take a look. This is solely a location_t difference, comparing x86_64 and arm output constexpr-dtor3.C: At global scope: constexpr-dtor3.C:156:23: in ‘constexpr’ expansion of ‘f4()’ -constexpr-dtor3.C:156:24: in ‘constexpr’ expansion of ‘(& w13)->W7::~W7()’ +constexpr-dtor3.C:152:12: in ‘constexpr’ expansion of ‘(& w13)->W7::~W7()’ constexpr-dtor3.C:88:34: error: inline assembly is not a constant expression 88 | constexpr ~W7 () { if (w == 5) asm (""); w = 3; } // { dg-error "inline assembly is not a constant expression" } | ^~~ constexpr-dtor3.C:88:34: note: only unevaluated inline assembly is allowed in a ‘constexpr’ function in C++2a constexpr int f4 () { W7 w13 = 5; return 0; } constexpr int x4 = f4 (); // { dg-message "in 'constexpr' expansion of" } Line 152 is the W7 w13 = 5; line, line 156 the x4 = f4 () line. From warning POV, the arm locations are nicer, but see below. The difference is in cxx_maybe_build_cleanup: /* build_delete sets the location of the destructor call to the current location, even though the destructor is going to be called later, at the end of the current scope. This can lead to a "jumpy" behavior for users of debuggers when they step around the end of the block. So let's unset the location of the destructor call instead. */ protected_set_expr_location (cleanup, UNKNOWN_LOCATION); On x86_64 and most other targets, cleanup here (if non-NULL) is the CALL_EXPR, as destructor return type is void, but on arm, as the dtor return type is some pointer, the CALL_EXPR is wrapped into a NOP_EXPR to void. protected_set_expr_location then on x86_64 clears the CALL_EXPR location, but on arm only NOP_EXPR location. The following patch (totally untested) should fix that. For the warning location, perhaps we could special case destructor calls in push_cx_call_context (to offset the intentional clearing of location for debugging purposes), if they don't have location set, don't use input_location for them, but try to pick DECL_SOURCE_LOCATION for the variable being destructed? Expanding the CLEANUP_EXPR of a CLEANUP_STMT could use the EXPR_LOCATION of the CLEANUP_STMT. Or the EXPR_LOCATION of *jump_target, if suitable. Jason
Attendees of IOT Solutions World Congress 2019
Greeting, I was wondering if you be interested in acquiring the list of visitors who are going to attend the show IOT Solutions World Congress on BARCELONA - 29-31 OCTOBER 2019 Total visitors' counts: 16,000 Each record in the list contains: Company name, contact name, job titles, phone number, verified e-mail address, Industry, etc. Please let me know if you are interested in purchasing the list of 16,000 contacts so that I can send you pricing for review Awaiting your response Best regards Nadia Dach To be removed from my emails, please reply STOP.
Re: [PATCH][ARM] Tweak HONOR_REG_ALLOC_ORDER
On Fri, Oct 11, 2019 at 3:52 PM Wilco Dijkstra wrote: > > Hi Ramana, > > > My only question would be whether it's more suitable to use > > optimize_function_for_size_p(cfun) instead as IIRC that gives us a > > chance with lto rather than the global optimize_size. > > Yes that is even better and that defaults to optimize_size if cfun isn't > set. I've committed this: > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index > 8b67c9c3657b312be223ab60c01969958baa9ed3..5fad1e5bcc2bc448489fdc8239c676246bbc8879 > 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -1068,9 +1068,8 @@ extern int arm_regs_in_sequence[]; > /* Use different register alloc ordering for Thumb. */ > #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc () > > -/* Tell IRA to use the order we define rather than messing it up with its > - own cost calculations. */ > -#define HONOR_REG_ALLOC_ORDER 1 > +/* Tell IRA to use the order we define when optimizing for size. */ > +#define HONOR_REG_ALLOC_ORDER optimize_function_for_size_p (cfun) I'd be happy with a patch that goes and looks at more such uses in the backend in your copious free time. hint hint. R > > /* Interrupt functions can only use registers that have already been > saved by the prologue, even if they would normally be
Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
On Fri, Oct 11, 2019 at 5:17 PM Wilco Dijkstra wrote: > > Hi Ramana, > > >On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra wrote: > >> > >> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting > >> in > >> inefficient code. Since Thumb-2 supports similar address offsets use the > >> Arm > >> legitimization code for Thumb-2 to get significant codesize and performance > >> gains. SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves > >> 0.2%. > > > > What were the sort of code size gains ? It did end up piquing my > > curiosity as to how we missed something so basic. For instance ldr > > h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB > smaller (~0.1%). > > > r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if > > there was an illegitimate address given here, would we end up > > producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out. > > Scratching my head a bit , it's late at night. > > If the proposed offsets are not legal, GCC tries something different that is > legal, hence there is no requirement to only propose legal splits. In any > case we don't optimize the negative range even on Arm - the offsets are > always split into 4KB ranges (not 8KB as would be possible). > > The negative offset splitting doesn't appear to have changed on Thumb-2 > so there is apparently a different mechanism that deals with negative offsets. > So only positive offsets are improved with my patch: > > int f(int *p) { return p[1025] + p[1026]; } > > before: > movwr3, #4104 > movwr2, #4100 > ldr r2, [r0, r2] > ldr r0, [r0, r3] > add r0, r0, r2 > bx lr > > after: > add r3, r0, #4096 > ldrdr0, r3, [r3, #4] > add r0, r0, r3 > bx lr > > > Orthogonally it looks like you can clean up the MINUS handling here > > and in legitimate_address_p , I'm not sure what the status of LRA with > > MINUS is either and thus we should now look to clean this up or look > > to turn this on and see what happens. However that's a subject of a > > future patch. > > Well there are lots of cases that aren't handled correctly or optimally > yet - I'd copy the way I wrote aarch64_legitimize_address_displacement, > but that's for a future patch indeed. > > > For the record, bootstrap with Thumb2 presumably and the testruns were > > clean ? > > Yes, at the time I ran them. Yeah it dropped about 2660 bytes out of a Thumb2 bootstrap - so it seems worth while. Ok please apply but as usual watch out for any fallout from this. regards Ramana > > Cheers, > Wilco
[C++ PATCH] Preserve the location of explicitly defaulted functions.
While working on operator<=> I noticed that a diagnostic referring to an explicitly defaulted function was using the location of the class, rather than the location of the declaration. We change DECL_SOURCE_LOCATION of implicitly declared functions to indicate where they were first needed, but I don't want to lose the location of actual declarations. Tested x86_64-pc-linux-gnu, applying to trunk. * decl2.c (mark_used): Don't clobber DECL_SOURCE_LOCATION on explicitly defaulted functions. * method.c (synthesize_method): Likewise. --- gcc/cp/decl2.c | 3 ++- gcc/cp/method.c | 7 --- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index ee198cdf5ce..6d5e973b487 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -5588,7 +5588,8 @@ mark_used (tree decl, tsubst_flags_t complain) /* Remember the current location for a function we will end up synthesizing. Then we can inform the user where it was required in the case of error. */ - DECL_SOURCE_LOCATION (decl) = input_location; + if (DECL_ARTIFICIAL (decl)) + DECL_SOURCE_LOCATION (decl) = input_location; /* Synthesizing an implicitly defined member function will result in garbage collection. We must treat this situation as if we were diff --git a/gcc/cp/method.c b/gcc/cp/method.c index 01bf534aef2..73a01147ff9 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -892,7 +892,7 @@ synthesize_method (tree fndecl) /* Reset the source location, we might have been previously deferred, and thus have saved where we were first needed. */ - if (!DECL_INHERITED_CTOR (fndecl)) + if (DECL_ARTIFICIAL (fndecl) && !DECL_INHERITED_CTOR (fndecl)) DECL_SOURCE_LOCATION (fndecl) = DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT (fndecl))); @@ -953,8 +953,9 @@ synthesize_method (tree fndecl) pop_deferring_access_checks (); if (error_count != errorcount || warning_count != warningcount + werrorcount) -inform (input_location, "synthesized method %qD first required here", - fndecl); +if (DECL_ARTIFICIAL (fndecl)) + inform (input_location, "synthesized method %qD first required here", + fndecl); } /* Build a reference to type TYPE with cv-quals QUALS, which is an base-commit: 6dc6cd99d8bb4313585a1d9b93fc7a43fb2016a9 -- 2.21.0
Re: Type representation in CTF and DWARF
On 10/11/2019 04:23 AM, Richard Biener wrote: Thanks for your pointers. CTF does not encode location information. So, I used early exit in the add_src_coords_attributes to avoid generation of location info (file, line, column). To answer Richard's question, CTF does have type debug info for function declarations and the argument types. So I think with these changes, both CTF and DWARF generation will emit debug info for the same set of types and decl. Compile with -g -gdwarf-like-ctf and use dwz -o (using dwz compiled from the master branch) on the generated binaries: (coreutils-0.22) .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) ls 30616 |1136 |21098 | 26240 | 0.62 pwd 10734 |788|10433 | 13929 | 0.83 groups 10706 |811|10249 | 13378 | 0.80 (emacs-26.3) .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) emacs-26.3.1 674657 |6402 | 273963 | 273910 | 0.33 I chose to account for 50% of .debug_str because at this point, it will be unfair to not account for them. Actually, one could even argue that upto 70% of the .debug_str are names of entities. CTF section sizes do include the CTF string tables. Across coreutils, I see a geomean of 0.73 (ratio of .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger footprint than CTF (with 50% of .debug_str accounted for). I'm not convinced this "improvement" in size is worth maintainig another debug-info format much less since it lacks desirable features right now and thus evaluation is tricky. At least you can improve dwarf size considerably with a low amount of work. I suspect another factor where dwarf is bigger compared to CTF is that dwarf is recording typedef names as well as qualified type variants. But maybe CTF just has a more compact representation for the bits it actually implements. Richard. CTF represents typedefs and qualified type variants. They are included in the the .ctf section sizes above. Indu
Re: [PATCH] PR fortran/89943 -- Duplicate BIND(c) allowed (?)
PING. On Fri, Oct 04, 2019 at 03:26:53PM -0700, Steve Kargl wrote: > The attached patch allows the declaration of a BIND(C) > module function or module subroutine to appear in a > submodule (see testcases). Regression test was clean. > OK to commit? > > Before a rubber stamped 'OK'. I do NOT use submodules. > A submodule user needs to pipe up on the validity of > the patch. > > > 2019-10-04 Steven G. Kargl > > PR fortran/89943 > decl.c (gfc_match_function_decl): Ignore duplicate BIND(C) for function > declaration in submodule. > (gfc_match_entry): Use temporary for locus, which allows removal of > one gfc_error_now(). > (gfc_match_subroutine): Ignore duplicate BIND(C) for subroutine > declaration in submodule. > > 2019-10-04 Steven G. Kargl > > PR fortran/89943 > * gfortran.dg/pr89943_1.f90: New test. > * gfortran.dg/pr89943_2.f90: Ditto. > > -- > Steve > Index: gcc/fortran/decl.c > === > --- gcc/fortran/decl.c(revision 276601) > +++ gcc/fortran/decl.c(working copy) > @@ -7259,13 +7259,16 @@ gfc_match_function_decl (void) >if (sym->attr.is_bind_c == 1) > { >sym->attr.is_bind_c = 0; > - if (sym->old_symbol != NULL) > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", > - &(sym->old_symbol->declared_at)); > - else > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", &gfc_current_locus); > + > + if (gfc_state_stack->previous > + && gfc_state_stack->previous->state != COMP_SUBMODULE) > + { > + locus loc; > + loc = sym->old_symbol != NULL > + ? sym->old_symbol->declared_at : gfc_current_locus; > + gfc_error_now ("BIND(C) attribute at %L can only be used for " > + "variables or common blocks", &loc); > + } > } > >if (found_match != MATCH_YES) > @@ -7517,16 +7520,16 @@ gfc_match_entry (void) > not allowed for procedures. */ >if (entry->attr.is_bind_c == 1) > { > + locus loc; > + >entry->attr.is_bind_c = 0; > - if (entry->old_symbol != NULL) > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", > - &(entry->old_symbol->declared_at)); > - else > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", &gfc_current_locus); > -} > > + loc = entry->old_symbol != NULL > + ? entry->old_symbol->declared_at : gfc_current_locus; > + gfc_error_now ("BIND(C) attribute at %L can only be used for " > + "variables or common blocks", &loc); > + } > + >/* Check what next non-whitespace character is so we can tell if there > is the required parens if we have a BIND(C). */ >old_loc = gfc_current_locus; > @@ -7725,13 +7728,16 @@ gfc_match_subroutine (void) >if (sym->attr.is_bind_c == 1) > { >sym->attr.is_bind_c = 0; > - if (sym->old_symbol != NULL) > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", > - &(sym->old_symbol->declared_at)); > - else > -gfc_error_now ("BIND(C) attribute at %L can only be used for " > - "variables or common blocks", &gfc_current_locus); > + > + if (gfc_state_stack->previous > + && gfc_state_stack->previous->state != COMP_SUBMODULE) > + { > + locus loc; > + loc = sym->old_symbol != NULL > + ? sym->old_symbol->declared_at : gfc_current_locus; > + gfc_error_now ("BIND(C) attribute at %L can only be used for " > + "variables or common blocks", &loc); > + } > } > >/* C binding names are not allowed for internal procedures. */ > Index: gcc/testsuite/gfortran.dg/pr89943_1.f90 > === > --- gcc/testsuite/gfortran.dg/pr89943_1.f90 (nonexistent) > +++ gcc/testsuite/gfortran.dg/pr89943_1.f90 (working copy) > @@ -0,0 +1,31 @@ > +! { dg-do compile } > +! PR fortran/89943 > +! Code contributed by Alberto Luaces > +module Foo_mod > + > + implicit none > + > + interface > + module subroutine runFoo4C(ndim) bind(C, name="runFoo") > + use, intrinsic :: iso_c_binding > + implicit none > + integer(c_int32_t) , intent(in) :: ndim > + end subroutine runFoo4C > + end interface > + > + contains > + > +end module Foo_mod > + > +submodule(Foo_mod) Foo_smod > + > + contains > + > + module subroutine runFoo4C(ndim) bind(C, name="runFoo") > + use, intrinsic :: iso_c_binding > + impli
Re: [PATCH][AArch64] Fix symbol offset limit
Hi Richard, > If global_char really is a char then isn't that UB? No why? We can do all kinds of arithmetic based on pointers, either using pointer types or converted to uintptr_t. Note that the optimizer actually creates these expressions, for example arr[N-x] can be evaluated as (&arr[0] + N) - x. So this remains legal even if N is well outside the bounds of the array. > I guess we still have to compile it without error though... *THIS* a million times... Any non-trivial application relies on UB behaviour with 100% certainty. But we still have to compile it correctly! >> To avoid this, limit the offset to +/-1MB so that the symbol needs to be >> within a >> 3.9GB offset from its references. For the tiny code model use a 64KB >> offset, allowing >> most of the 1MB range for code/data between the symbol and its >> references. > > These new values seem a bit magical. Would it work for the original > testcase to use FORCE_TO_MEM if !offset_within_block_p (x, offset)? No - the testcases fail with that. It also reduces codequality by not allowing commonly used offsets as part of the symbol relocation. So how would you want to make the offsets less magical? There isn't a clear limitation like for MOVW/MOVT relocations (which are limited to +-32KB), so it just is an arbitrary decision which allows this optimization for all frequently used offsets but avoids relocation failures caused by large offsets. The values I used were chosen so that codequality isn't affected, but it is practically impossible to trigger relocation errors. And they can always be changed again if required - this is not meant to be the final AArch64 patch ever! Cheers, Wilco
Re: [Darwin, machopic 4/n, committed] Arrange to indirect IVARs when needed.
On Oct 10, 2019, at 12:25 PM, Iain Sandoe wrote: > > Objective C V2 (m64) IVAR offset refs from Apple GCC-4.x have an indirection > for m64 code on PPC (which is the only 64b user for Mach-O PIC). Apple GCC > 4.x places the indirections in the .data section, however this seems to have > been unintentional - and we are placing the indirections in the non-lazy > symbol pointers section as usual. > > If Mike can recall any reason that they should be in the .data section, we can > revise that - but testing hasn’t revealed any issues so far. Don't recall any reason.
Support _Decimal* keywords for C2x
ISO C2x adds decimal floating point as an optional standard feature. This patch accordingly makes GCC accept the _Decimal* keywords in strict C2x mode, using pedwarn_c11 to get a warning for -Wc11-c2x-compat. (Constants, where the pedwarn is in libcpp, will be dealt with separately.) The _Decimal* keywords are marked with D_EXT, meaning they are not considered keywords at all in strict conformance modes. This is contrary to the normal practice for most implementation-namespace keywords, which are accepted in all standards modes but with appropriate pedwarns for older standards. This patch removes D_EXT from those keywords so they are accepted as keywords for all standards, consequently removing the gcc.dg/dfp/keywords-ignored-c99.c test that is no longer valid. (A new D_C2X for keywords will still be needed if any new keywords get added that aren't in the implementation namespace for older standards; there are proposals for such keywords, albeit as predefined macros that might not actually need new keywords in the compiler in all cases. If the DFP keywords end up as decimal32 etc., of course appropriate compiler and testcase changes will be needed, and a version of keywords-ignored-c99.c would make sense again with such spellings.) Bootstrapped with no regressions on x86_64-pc-linux-gnu. Applied to mainline. gcc/c: 2019-10-11 Joseph Myers * c-decl.c (declspecs_add_type): Use pedwarn_c11 for DFP types. gcc/c-family: 2019-10-11 Joseph Myers * c-common.c (c_common_reswords): Do not use D_EXT for _Decimal32, _Decimal64 and _Decimal128. gcc/testsuite: 2019-10-11 Joseph Myers * gcc.dg/dfp/c11-keywords-1.c, gcc.dg/dfp/c11-keywords-2.c, gcc.dg/dfp/c2x-keywords-1.c, gcc.dg/dfp/c2x-keywords-2.c: New tests. * gcc.dg/dfp/keywords-ignored-c99.c: Remove test. * gcc.dg/dfp/constants-c99.c, gcc.dg/dfp/keywords-c89.c, gcc.dg/dfp/keywords-c99.c: Use -pedantic-errors. Index: gcc/c/c-decl.c === --- gcc/c/c-decl.c (revision 276895) +++ gcc/c/c-decl.c (working copy) @@ -10959,8 +10959,9 @@ declspecs_add_type (location_t loc, struct c_decls error_at (loc, ("decimal floating-point not supported " "for this target")); - pedwarn (loc, OPT_Wpedantic, - "ISO C does not support decimal floating-point"); + pedwarn_c11 (loc, OPT_Wpedantic, + "ISO C does not support decimal floating-point " + "before C2X"); return specs; case RID_FRACT: case RID_ACCUM: Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c (revision 276895) +++ gcc/c-family/c-common.c (working copy) @@ -351,9 +351,9 @@ const struct c_common_resword c_common_reswords[] { "_Float32x",RID_FLOAT32X, D_CONLY }, { "_Float64x",RID_FLOAT64X, D_CONLY }, { "_Float128x", RID_FLOAT128X, D_CONLY }, - { "_Decimal32", RID_DFLOAT32, D_CONLY | D_EXT }, - { "_Decimal64", RID_DFLOAT64, D_CONLY | D_EXT }, - { "_Decimal128", RID_DFLOAT128, D_CONLY | D_EXT }, + { "_Decimal32", RID_DFLOAT32, D_CONLY }, + { "_Decimal64", RID_DFLOAT64, D_CONLY }, + { "_Decimal128", RID_DFLOAT128, D_CONLY }, { "_Fract", RID_FRACT, D_CONLY | D_EXT }, { "_Accum", RID_ACCUM, D_CONLY | D_EXT }, { "_Sat", RID_SAT, D_CONLY | D_EXT }, Index: gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c === --- gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c (working copy) @@ -0,0 +1,7 @@ +/* Test that _Decimal* keywords diagnosed in C11 mode: -pedantic. */ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -pedantic" } */ + +_Decimal32 d32; /* { dg-warning "ISO C does not support" } */ +_Decimal64 d64; /* { dg-warning "ISO C does not support" } */ +_Decimal128 d128; /* { dg-warning "ISO C does not support" } */ Index: gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c === --- gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c (nonexistent) +++ gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c (working copy) @@ -0,0 +1,7 @@ +/* Test that _Decimal* keywords diagnosed in C11 mode: -pedantic-errors. */ +/* { dg-do compile } */ +/* { dg-options "-std=c11 -pedantic-errors" } */ + +_Decimal32 d32; /* { dg-error "ISO C does not support" } */ +_Decimal64 d64; /* { dg-error "ISO C does not support" } */ +_Decimal128 d128; /* { dg-error "ISO C does not support" } */ Index: gcc/testsuite/gcc.dg/dfp/c2x-keywords-1.c === --- gcc/testsuite/gcc.dg/dfp/c2x
Re: [PATCH][ARM] Switch to default sched pressure algorithm
Hi Ramana, > Can you see what happens with the Cortex-A8 or Cortex-A9 schedulers to > spread the range across some v7-a CPUs as well ? While they aren't that > popular today I > would suggest you look at them because the defaults for v7-a are still to use > the > Cortex-A8 scheduler and the Cortex-A9 scheduler might well also get used in > places given > the availability of hardware. The results are practically identical to Cortex-A53 and A57 - there is a huge codesize win across the board on SPEC2006, there isn't a single benchmark that is larger (ie. more spilling). > I'd be happy to move this forward if you could show if there is no *increase* > in spills > for the same range of benchmarks that you are doing for the Cortex-A8 and > Cortex-A9 > schedulers. There certainly isn't. I don't think results like these could be any more one-sided, it's a significant win for every single benchmark, both for codesize and performance! What isn't clear is whether something has gone horribly wrong in the scheduler which could be fixed/reverted, but as it is right now I can't see it being useful at all. This means we should also reevaluate whether pressure scheduling now hurts AArch64 too. Cheers, Wilco
New Finnish PO file for 'gcc' (version 9.1.0)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Finnish team of translators. The file is available at: https://translationproject.org/latest/gcc/fi.po (This file, 'gcc-9.1.0.fi.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PATCH] Fix PR c++/92024
On 10/11/19 6:31 PM, Jason Merrill wrote: > On 10/10/19 2:06 PM, Bernd Edlinger wrote: >> On 10/10/19 7:49 PM, Jason Merrill wrote: >>> On 10/10/19 10:42 AM, Bernd Edlinger wrote: Hi, this fixes a crash when -Wshadow=compatible-local is enabled in the testcase g++.dg/parse/crash68.C >>> >>> Why does that flag cause this crash? >>> >> >> gcc/cp/name-lookup.c: >> >> if (warn_shadow) >> warning_code = OPT_Wshadow; >> else if (warn_shadow_local) >> warning_code = OPT_Wshadow_local; >> else if (warn_shadow_compatible_local >> && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl)) >> || (!dependent_type_p (TREE_TYPE (decl)) >> && !dependent_type_p (TREE_TYPE (old)) >> /* If the new decl uses auto, we don't yet know >> its type (the old type cannot be using auto >> at this point, without also being >> dependent). This is an indication we're >> (now) doing the shadow checking too >> early. */ >> && !type_uses_auto (TREE_TYPE (decl)) >> && can_convert (TREE_TYPE (old), TREE_TYPE (decl), >> tf_none >> warning_code = OPT_Wshadow_compatible_local; >> >> if -Wshadow=compatible-local is used, the can_convert function crashes >> in instantiate_class_template_1. > > Right, checking can_convert is problematic here, as it can cause template > instantiations that change the semantics of the program. Or, in this case, > crash. > Ah, alright. I think shadowing one type with another type of the same name is always a no no. How about always warning (and not asking for can_convert) when either decl is a TYPE_DECL? like this: Index: gcc/cp/name-lookup.c === --- gcc/cp/name-lookup.c(Revision 276886) +++ gcc/cp/name-lookup.c(Arbeitskopie) @@ -2741,8 +2741,7 @@ return; } - /* If '-Wshadow=compatible-local' is specified without other --Wshadow= flags, we will warn only when the type of the + /* -Wshadow=compatible-local warns only when the type of the shadowing variable (DECL) can be converted to that of the shadowed parameter (OLD_LOCAL). The reason why we only check if DECL's type can be converted to OLD_LOCAL's type (but not the @@ -2750,29 +2749,29 @@ parameter, more than often they would use the variable thinking (mistakenly) it's still the parameter. It would be rare that users would use the variable in the place that -expects the parameter but thinking it's a new decl. */ +expects the parameter but thinking it's a new decl. +If either object is a TYPE_DECL -Wshadow=compatible-local +warns regardless of whether one of the types involved +is a subclass of the other, since that is never okay. */ enum opt_code warning_code; - if (warn_shadow) - warning_code = OPT_Wshadow; - else if (warn_shadow_local) - warning_code = OPT_Wshadow_local; - else if (warn_shadow_compatible_local - && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl)) - || (!dependent_type_p (TREE_TYPE (decl)) - && !dependent_type_p (TREE_TYPE (old)) - /* If the new decl uses auto, we don't yet know - its type (the old type cannot be using auto - at this point, without also being - dependent). This is an indication we're - (now) doing the shadow checking too - early. */ - && !type_uses_auto (TREE_TYPE (decl)) - && can_convert (TREE_TYPE (old), TREE_TYPE (decl), - tf_none + if (same_type_p (TREE_TYPE (old), TREE_TYPE (decl)) + || TREE_CODE (decl) == TYPE_DECL + || TREE_CODE (old) == TYPE_DECL + || (!dependent_type_p (TREE_TYPE (decl)) + && !dependent_type_p (TREE_TYPE (old)) + /* If the new decl uses auto, we don't yet know +its type (the old type cannot be using auto +at this point, without also being +dependent). This is an indication we're +(now) doing the shadow checking too +early. */ + && !type_uses_auto (TREE_TYPE (decl)) + && can_convert (TREE_TYPE (old), TREE_TYPE (decl), + tf_none))) warning_code = OPT_Wshadow_compatible_local; else - return; + warning_code = OPT_Wshadow_local; const char *msg; if (TREE_CODE (old) == PARM_DECL) Bernd.
Re: [C++ Patch] Remove RROTATE_EXPR and LROTATE_EXPR handling
Hi, On 11/10/19 18:25, Jason Merrill wrote: On 10/10/19 2:33 PM, Paolo Carlini wrote: Hi, On 10/10/19 19:57, Jason Merrill wrote: On 10/10/19 1:53 PM, Jakub Jelinek wrote: On Thu, Oct 10, 2019 at 06:12:02PM +0200, Paolo Carlini wrote: while working on cp_build_binary_op I noticed that the testsuite wasn't exercising the warnings in case RROTATE_EXPR / LROTATE_EXPR, even more the code handling those tree codes seemed completely unused. Turned out that the C front-end doesn't handle those tree codes at all: I'm coming to the conclusion that the C++ front-end bits too are now obsolete and may be removed, because only the middle-end generates those codes in order to implement optimizations. Anything I'm missing? Any additional testing? I guess it depends on where. fold_binary_loc certainly has code to create {L,R}ROTATE_EXPR, just look at unsigned foo (unsigned x) { return (x << 3) + (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - 3)); } unsigned bar (unsigned x, unsigned y) { return (x << y) | (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - y)); } and the *.original dump. The cp_build_binary_op case is unlikely to ever trigger, unless we'd rerun it on cp_folded trees. cxx_eval_constant_expression is unlikely, because recently we've switched to performing constexpr evaluation on pre-cp_folded bodies, not sure if we never encounter folded trees though. cp_fold itself depends on whether we ever reprocess the already folded trees, I'd be afraid we could. True, and the failure mode there is silent. Let's leave the codes in that switch. Ok, thanks Jason and Jakub for the additional information. While I give this more thought and maybe manage to come up with a testcase triggering the warning, shall we simply pass the location to those two warnings too - cannot hurt, AFAICS? I meant let's omit the changes to cp_fold, the rest of the patch is still OK. Nice, thanks, I'll do that. Paolo.
[PATCH] S/390: Run %a0:DI splitters only after reload
Bootstrapped and regtested on s390x-redhat-linux. gcc/ChangeLog: 2019-10-10 Ilya Leoshkevich * config/s390/s390.md: Run %a0:DI splitters only after reload. gcc/testsuite/ChangeLog: 2019-10-10 Ilya Leoshkevich * gcc.target/s390/load-thread-pointer-once.c: New test. --- gcc/config/s390/s390.md | 13 ++--- .../gcc.target/s390/load-thread-pointer-once.c | 12 2 files changed, 22 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 4f7bde6616b..1e6439d5fd6 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -1860,10 +1860,17 @@ *,*,yes") ]) +; Splitters for loading/storing TLS pointers from/to %a0:DI. +; Do this only during split2, which runs after reload. At the point when split1 +; runs, some of %a0:DI occurrences might be nested inside other rtxes and thus +; not matched. As a result, only some occurrences will be split, which will +; prevent CSE. At the point when split2 runs, reload will have ensured that no +; nested references exist. + (define_split [(set (match_operand:DI 0 "register_operand" "") (match_operand:DI 1 "register_operand" ""))] - "TARGET_ZARCH && ACCESS_REG_P (operands[1])" + "TARGET_ZARCH && ACCESS_REG_P (operands[1]) && reload_completed" [(set (match_dup 2) (match_dup 3)) (set (match_dup 0) (ashift:DI (match_dup 0) (const_int 32))) (set (strict_low_part (match_dup 2)) (match_dup 4))] @@ -1873,7 +1880,7 @@ (define_split [(set (match_operand:DI 0 "register_operand" "") (match_operand:DI 1 "register_operand" ""))] - "TARGET_ZARCH && ACCESS_REG_P (operands[0]) + "TARGET_ZARCH && ACCESS_REG_P (operands[0]) && reload_completed && dead_or_set_p (insn, operands[1])" [(set (match_dup 3) (match_dup 2)) (set (match_dup 1) (lshiftrt:DI (match_dup 1) (const_int 32))) @@ -1884,7 +1891,7 @@ (define_split [(set (match_operand:DI 0 "register_operand" "") (match_operand:DI 1 "register_operand" ""))] - "TARGET_ZARCH && ACCESS_REG_P (operands[0]) + "TARGET_ZARCH && ACCESS_REG_P (operands[0]) && reload_completed && !dead_or_set_p (insn, operands[1])" [(set (match_dup 3) (match_dup 2)) (set (match_dup 1) (rotate:DI (match_dup 1) (const_int 32))) diff --git a/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c b/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c new file mode 100644 index 000..21a5bfaa732 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +__thread void *foo; + +void *bar() +{ + return (foo = __builtin_thread_pointer()); +} + +/* { dg-final { scan-assembler-times {\n\tear\t} 2 { target { lp64 } } } } */ +/* { dg-final { scan-assembler-times {\n\tear\t} 1 { target { ! lp64 } } } } */ -- 2.23.0
Re: [PATCH] bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679)
On 9/10/19 4:35 PM, Jeff Law wrote: On 9/6/19 1:27 PM, Martin Sebor wrote: Recent enhancements to -Wstringop-overflow improved the warning to the point that it detects a superset of the problems -Warray- bounds is intended detect in character accesses. Because both warnings detect overlapping sets of problems, and because the IL they work with tends to change in subtle ways from target to targer, tests designed to verify one or the other sometimes fail with a target where the warning isn't robust enough to detect the problem given the IL representation. To reduce these test suite failures the attached patch extends -Warray-bounds to handle some of the same problems -Wstringop- overflow does, pecifically, out-of-bounds accesses to array members of structs, including zero-length arrays and flexible array members of defined objects. In the process of testing the enhancement I realized that the recently added component_size() function doesn't work as intended for non-character array members (see below). The patch corrects this by reverting back to the original implementation of the function until the better/simpler solution can be put in place as mentioned below. Tested on x86_64-linux. Martin [*] component_size() happens to work for char arrays because those are transformed to STRING_CSTs, but not for arrays that are not. E.g., in struct S { int64_t i; int16_t j; int16_t a[]; } s = { 0, 0, { 1, 0 } }; unless called with type set to int16_t[2], fold_ctor_reference will return s.a[0] rather than all of s.a. But set type to int16_t[2] we would need to know that s.a's initializer has two elements, and that's just what we're using fold_ctor_reference to find out. I think this could probably be made to work somehow by extending useless_type_conversion_p to handle this case as special somehow, but it doesn't seem worth the effort given that there should be an easier way to do it as you noted below. Given the above, the long term solution should be to rely on DECL_SIZE_UNIT(decl) - TYPE_SIZE_UNIT(decl_type) as Richard suggested in the review of its initial implementation. Unfortunately, because of bugs in both the C and C++ front ends (I just opened PR 65403 with the details) the simple formula doesn't give the right answers either. So until the bugs are fixed, the patch reverts back to the original loopy solution. It's no more costly than the current fold_ctor_reference approach. ... So no concerns with the patch itself, just the fallout you mentioned in a follow-up message. Ideally we'd have glibc and the kernel fixed before this goes in, but I'd settle for just getting glibc fixed since we have more influence there. Half of the issues there were due to a bug in the warning. The rest are caused by Glibc's use of interior zero-length arrays to access subsequent members. It works in simple cases but it's very brittle because GCC assumes that even such members don't alias. If it's meant to be a supported feature then aliasing would have to be changed to take it into account. But I'd rather encourage projects to move away from these dangerous hacks and towards cleaner, safer code. I've fixed the bug in the attached patch. The rest can be suppressed by replacing the zero-length arrays with flexible array members but that's just trading one misuse for another. If the code can't be changed to avoid this (likely not an option since the arrays are in visible in the public API) I think the best way to deal with them is to suppress them by #pragma GCC diagnostic ignored. I opened BZ 25097 in Glibc Bugzilla to track this. Out of curiosity are the kernel issues you raised due to flexible arrays or just cases where we're doing a better job on normal objects? I'd be a bit surprised to find flexible arrays in the kernel. I don't think I've come across any flexible arrays in the kernel. The patch triggers 94 instances of -Warray-bounds (60 of which are for distinct code) in 21 .c files. I haven't looked at all of them but some of the patterns I noticed are: 1) Intentionally using an interior zero-length array to access (e.g., memset) one or more subsequent members. E.g., _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and quite a few others. This is pretty pervasive but seems easily avoidable. 2) Overwriting a member array with more data (e.g., function cxio_rdev_open in drivers/infiniband/hw/cxgb3/cxio_hal.c or in function pk_probe in drivers/hid/hid-prodikeys.c). At first glance some of these look like bugs but with stuff obscured by macros and no comments it's hard to tell. 3) Uses of the container_of() macro to access one member given the address of another. This is undefined (and again breaks the aliasing rules) but the macro is used all over the place in the kernel. I count over 15,000 references to it. 4) Uses of one-element arrays as members of other one-element arrays (in include/scsi/fc/fc_ms.h). Was this ever meant to
Re: [PATCH] Fix PR c++/92024
On 10/10/19 2:06 PM, Bernd Edlinger wrote: On 10/10/19 7:49 PM, Jason Merrill wrote: On 10/10/19 10:42 AM, Bernd Edlinger wrote: Hi, this fixes a crash when -Wshadow=compatible-local is enabled in the testcase g++.dg/parse/crash68.C Why does that flag cause this crash? gcc/cp/name-lookup.c: if (warn_shadow) warning_code = OPT_Wshadow; else if (warn_shadow_local) warning_code = OPT_Wshadow_local; else if (warn_shadow_compatible_local && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl)) || (!dependent_type_p (TREE_TYPE (decl)) && !dependent_type_p (TREE_TYPE (old)) /* If the new decl uses auto, we don't yet know its type (the old type cannot be using auto at this point, without also being dependent). This is an indication we're (now) doing the shadow checking too early. */ && !type_uses_auto (TREE_TYPE (decl)) && can_convert (TREE_TYPE (old), TREE_TYPE (decl), tf_none warning_code = OPT_Wshadow_compatible_local; if -Wshadow=compatible-local is used, the can_convert function crashes in instantiate_class_template_1. Right, checking can_convert is problematic here, as it can cause template instantiations that change the semantics of the program. Or, in this case, crash. Jason
Re: [C++ Patch] Remove RROTATE_EXPR and LROTATE_EXPR handling
On 10/10/19 2:33 PM, Paolo Carlini wrote: Hi, On 10/10/19 19:57, Jason Merrill wrote: On 10/10/19 1:53 PM, Jakub Jelinek wrote: On Thu, Oct 10, 2019 at 06:12:02PM +0200, Paolo Carlini wrote: while working on cp_build_binary_op I noticed that the testsuite wasn't exercising the warnings in case RROTATE_EXPR / LROTATE_EXPR, even more the code handling those tree codes seemed completely unused. Turned out that the C front-end doesn't handle those tree codes at all: I'm coming to the conclusion that the C++ front-end bits too are now obsolete and may be removed, because only the middle-end generates those codes in order to implement optimizations. Anything I'm missing? Any additional testing? I guess it depends on where. fold_binary_loc certainly has code to create {L,R}ROTATE_EXPR, just look at unsigned foo (unsigned x) { return (x << 3) + (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - 3)); } unsigned bar (unsigned x, unsigned y) { return (x << y) | (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - y)); } and the *.original dump. The cp_build_binary_op case is unlikely to ever trigger, unless we'd rerun it on cp_folded trees. cxx_eval_constant_expression is unlikely, because recently we've switched to performing constexpr evaluation on pre-cp_folded bodies, not sure if we never encounter folded trees though. cp_fold itself depends on whether we ever reprocess the already folded trees, I'd be afraid we could. True, and the failure mode there is silent. Let's leave the codes in that switch. Ok, thanks Jason and Jakub for the additional information. While I give this more thought and maybe manage to come up with a testcase triggering the warning, shall we simply pass the location to those two warnings too - cannot hurt, AFAICS? I meant let's omit the changes to cp_fold, the rest of the patch is still OK. Jason
Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2
Hi Ramana, >On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra wrote: >> >> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in >> inefficient code. Since Thumb-2 supports similar address offsets use the Arm >> legitimization code for Thumb-2 to get significant codesize and performance >> gains. SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves >> 0.2%. > > What were the sort of code size gains ? It did end up piquing my > curiosity as to how we missed something so basic. For instance ldr h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB smaller (~0.1%). > r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if > there was an illegitimate address given here, would we end up > producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out. > Scratching my head a bit , it's late at night. If the proposed offsets are not legal, GCC tries something different that is legal, hence there is no requirement to only propose legal splits. In any case we don't optimize the negative range even on Arm - the offsets are always split into 4KB ranges (not 8KB as would be possible). The negative offset splitting doesn't appear to have changed on Thumb-2 so there is apparently a different mechanism that deals with negative offsets. So only positive offsets are improved with my patch: int f(int *p) { return p[1025] + p[1026]; } before: movwr3, #4104 movwr2, #4100 ldr r2, [r0, r2] ldr r0, [r0, r3] add r0, r0, r2 bx lr after: add r3, r0, #4096 ldrdr0, r3, [r3, #4] add r0, r0, r3 bx lr > Orthogonally it looks like you can clean up the MINUS handling here > and in legitimate_address_p , I'm not sure what the status of LRA with > MINUS is either and thus we should now look to clean this up or look > to turn this on and see what happens. However that's a subject of a > future patch. Well there are lots of cases that aren't handled correctly or optimally yet - I'd copy the way I wrote aarch64_legitimize_address_displacement, but that's for a future patch indeed. > For the record, bootstrap with Thumb2 presumably and the testruns were clean > ? Yes, at the time I ran them. Cheers, Wilco
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
On Fri, Oct 11, 2019 at 05:54:43PM +0200, Paolo Carlini wrote: > Hi, > > On 11/10/19 17:52, Jakub Jelinek wrote: > > On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote: > > > Hi again... > > > > > > On 11/10/19 17:30, Paolo Carlini wrote: > > > > Oh nice, I wasn't aware of that, to be honest, probably we should audit > > > > the front-end for more such redundant uses. > > > ... and I can confirm that we have *many*. If we agree that removing all > > > of > > > them is the way to go I can do that in a follow up patch. > > If they just guard a pedwarn/warning/warning_at etc., that's fine, if they > > guard also some code that needs to compute something for the diagnostics, > > it might not be redundant. Yeah, much like with warn_foo guards. > Indeed. We got many of the very straightforward ones ;) Sounds like a nice cleanup! -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
Hi, On 11/10/19 17:52, Jakub Jelinek wrote: On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote: Hi again... On 11/10/19 17:30, Paolo Carlini wrote: Oh nice, I wasn't aware of that, to be honest, probably we should audit the front-end for more such redundant uses. ... and I can confirm that we have *many*. If we agree that removing all of them is the way to go I can do that in a follow up patch. If they just guard a pedwarn/warning/warning_at etc., that's fine, if they guard also some code that needs to compute something for the diagnostics, it might not be redundant. Indeed. We got many of the very straightforward ones ;) Paolo.
[PATCH] Implement header for C++20
There are currently no tests for [concepts.compare], but they will be added ASAP. * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/precompiled/stdc++.h: Include . * include/std/concepts: New header for C++20. * include/std/version (__cpp_lib_concepts): Define. * scripts/create_testsuite_files: Look for test files in new std directory. * testsuite/libstdc++-dg/conformance.exp: Likewise. * testsuite/std/concepts/concepts.callable/invocable.cc: New test. * testsuite/std/concepts/concepts.callable/regular_invocable.cc: New test. * testsuite/std/concepts/concepts.callable/relation.cc: New test. * testsuite/std/concepts/concepts.callable/strictweakorder.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ floating_point.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ signed_integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ unsigned_integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.assignable/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.common/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.commonref/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.constructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.convertible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.copyconstructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.defaultconstructible/ 1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.derived/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.destructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.moveconstructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.same/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc: New test. * testsuite/std/concepts/concepts.lang/concept.swappable/swappable.cc: New test. * testsuite/std/concepts/concepts.lang/concept.swappable/ swappable_with.cc: New test. * testsuite/std/concepts/concepts.object/copyable.cc: New test. * testsuite/std/concepts/concepts.object/movable.cc: New test. * testsuite/std/concepts/concepts.object/regular.cc: New test. * testsuite/std/concepts/concepts.object/semiregular.cc: New test. Tested x86_64-linux, committed to trunk. commit a27d35a1cf5d2c8b0609006b67a561f54d1f0e73 Author: Jonathan Wakely Date: Fri Oct 11 16:52:30 2019 +0100 Implement header for C++20 There are currently no tests for [concepts.compare], but they will be added ASAP. * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/precompiled/stdc++.h: Include . * include/std/concepts: New header for C++20. * include/std/version (__cpp_lib_concepts): Define. * scripts/create_testsuite_files: Look for test files in new std directory. * testsuite/libstdc++-dg/conformance.exp: Likewise. * testsuite/std/concepts/concepts.callable/invocable.cc: New test. * testsuite/std/concepts/concepts.callable/regular_invocable.cc: New test. * testsuite/std/concepts/concepts.callable/relation.cc: New test. * testsuite/std/concepts/concepts.callable/strictweakorder.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ floating_point.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ signed_integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.arithmetic/ unsigned_integral.cc: New test. * testsuite/std/concepts/concepts.lang/concept.assignable/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.common/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.commonref/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.constructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.convertible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.copyconstructible/1.cc: New test. * testsuite/std/concepts/concepts.lang/concept.defaultconstructibl
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote: > Hi again... > > On 11/10/19 17:30, Paolo Carlini wrote: > > Oh nice, I wasn't aware of that, to be honest, probably we should audit > > the front-end for more such redundant uses. > > ... and I can confirm that we have *many*. If we agree that removing all of > them is the way to go I can do that in a follow up patch. If they just guard a pedwarn/warning/warning_at etc., that's fine, if they guard also some code that needs to compute something for the diagnostics, it might not be redundant. Jakub
[PATCH] Use __is_same_as for std::is_same and std::is_same_v
By using the built-in we don't need to match a partial specialization for std::is_same and don't need to instantiate std::is_same at all for uses of std::is_same_v. * include/std/type_traits (is_same): Replace partial specialization by using __is_same_as built-in in primary template. (is_same_v): Use __is_same_as built-in instead of instantiating the is_same trait. Tested x86_64-linux, committed to trunk. commit ee208e530893aeb12b6a9edb1563a4e8ecaea887 Author: Jonathan Wakely Date: Wed Sep 11 21:23:31 2019 +0100 Use __is_same_as for std::is_same and std::is_same_v By using the built-in we don't need to match a partial specialization for std::is_same and don't need to instantiate std::is_same at all for uses of std::is_same_v. * include/std/type_traits (is_same): Replace partial specialization by using __is_same_as built-in in primary template. (is_same_v): Use __is_same_as built-in instead of instantiating the is_same trait. diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index dc8a019324d..4de5daa9f06 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -1388,13 +1388,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Type relations. /// is_same - template + template struct is_same -: public false_type { }; - - template -struct is_same<_Tp, _Tp> -: public true_type { }; +: public integral_constant +{ }; /// is_base_of template @@ -3158,7 +3155,7 @@ template template inline constexpr size_t extent_v = extent<_Tp, _Idx>::value; template - inline constexpr bool is_same_v = is_same<_Tp, _Up>::value; + inline constexpr bool is_same_v = __is_same_as(_Tp, _Up); template inline constexpr bool is_base_of_v = is_base_of<_Base, _Derived>::value; template
Re: [PATCH] PR fortran/91714 -- Mangled type-spec
On Fri, Oct 11, 2019 at 05:57:04PM +0300, Janne Blomqvist wrote: > On Sat, Sep 28, 2019 at 10:16 PM Steve Kargl > wrote: > > > > The attached patch issues errors for a few mangled type-specs. > > It has been regression tested on x86_64-*-freebsd. OK to commit? > > > > 2019-09-28 Steven G. Kargl > > > > PR fortran/91714 > > * decl.c (gfc_match_decl_type_spec): Issue errors for a few > > mangled types. > > > > 2019-09-28 Steven G. Kargl > > > > PR fortran/91714 > > * gfortran.dg/dec_type_print_3.f90: Update dg-error regex. > > * gfortran.dg/pr91714.f90: New test. > > > > A daunting task awaits a brave soul as gfc_match_decl_type_spec > > is a minefield for bugs. This is dues to the fact the the functions > > has grown by adding kludge upon kludge upon kludge. The first > > thing to fix is the broken parsing that follows from the > > matching of 'type('. > > > > -- > > Steve > > Ok. > This was committed as r276270 | kargl | 2019-09-29 09:19:58 -0700 (Sun, 29 Sep 2019) | 12 lines Again, I was getting a backlog of patches, and committed this as somewhat obvious. I will repeat that gfc_match_decl_type_spec has become a difficult nightmare to work through. The complication comes from the new TYPE(INTRINSIC-TYPE-SPEC) feature of either F2008 or F2018. In developing this patch, I found that something like type(character(len=1) is accepted as valid, but it is missing the closing ')'. I cannot find why the closing ')' in character is used twice! A long range project would be to split the declaration matching into gfc_match_type_decl_stmt and gfc_match_intrinsic_type_decl. Then, we can refactor gfc_match_decl_type_spec as gfc_match_decl_type_spec gfc_match_type_decl_stmt gfc_match_intrinsic_type_decl There are complication because TYPE is overloaded via TYPE(*) TYPE(derive-type), and a nonstandard TYPE for IO. -- Steve
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
Hi again... On 11/10/19 17:30, Paolo Carlini wrote: Oh nice, I wasn't aware of that, to be honest, probably we should audit the front-end for more such redundant uses. ... and I can confirm that we have *many*. If we agree that removing all of them is the way to go I can do that in a follow up patch. Thanks, Paolo.
Re: [PATCH] PR fortran/91943 -- More BOZ fallout
On Fri, Oct 11, 2019 at 05:55:45PM +0300, Janne Blomqvist wrote: > On Tue, Oct 1, 2019 at 1:23 AM Steve Kargl > wrote: > > > > A BOZ cannot be an actual argument in a subroutine or function > > reference except those intrinsic function listed in the Fortran > > standard. The attach patch checks for invalid BOZ. Built and > > tested on x86_64-*-freebsd. OK to commit? > > > > 2019-09-30 Steven G. Kargl > > > > PR fortran/91943 > > * match.c (gfc_match_call): BOZ cannot be an actual argument in > > a subroutine reference. > > * resolve.c (resolve_function): BOZ cannot be an actual argument in > > a function reference. > > > > 2019-09-30 Steven G. Kargl > > > > PR fortran/91943 > > gfortran.dg/pr91943.f90 > > > > -- > > Steve > > Ok, though I believe your other BOZ patch that I just reviewed applies > on top of this one? > Yes. This patch was applied in r276471 | kargl | 2019-10-02 10:01:30 -0700 (Wed, 02 Oct 2019) | 13 lines I had a growing stack of patches interfering/interacting with each other. As I completely changed how BOZ are handled, I figure that this one could be committed without a review. -- Steve 20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4 20161221 https://www.youtube.com/watch?v=IbCHE-hONow
[SLP] SLP vectorization: vectorize vector constructors
Hi Richard, Thanks for your help, I've reworked my SLP RFC based on your feedback. > I think a better place for the loop searching for CONSTRUCTORs is > vect_slp_analyze_bb_1 where I'd put it before the check you remove, > and I'd simply append found CONSTRUCTORs to the grouped_stores > array I've moved this check into a separate function and called it from vect_slp_analyze_bb_1 > The fixup you do in vectorizable_operation doesn't > belong there either, I'd add a new field to the SLP instance > structure refering to the CONSTRUCTOR stmt and do the fixup > in vect_schedule_slp_instance instead where you can simply > replace the CONSTRUCTOR with the vectorized SSA name then. Done. > + /* Check that the constructor elements are unique. */ > + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val) > + { > + tree prev_val; > + int j; > + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), j, > prev_val) > + { > + if (val == prev_val && i!=j) > > why's that necessary? (it looks incomplete, also doesn't catch > [duplicate] constants) The thinking was that there was no benefit in vectorizing a constructor of duplicates, or a vector of constants, although now you mention it that thinking may be flawed. I've removed it > You miss to check that CONSTRUCTOR_NELTS == TYPE_VECTOR_SUBPARTS > (we can have omitted trailing zeros). ... > What happens if you have a vector constructor that is twice > as large as the machine supports? The vectorizer will happily > produce a two vector SSA name vectorized result but your > CONSTRUCTOR replacement doesn't work here. I think this should > be made work correctly (not give up on that case). I've reworked the patch to account for this, by checking that the vectorized version has one vectorized stmt at the root of the SLP tree. I'm not sure how to handle a vector constructor twice as large as the machine supports, as far as I can see, when only analyzing a within a basic block, the SSA name of the constructor has to be maintained. Currently SLP vectorization can build SLP trees starting from reductions or from group stores. This patch adds a third starting point: vector constructors. For the following test case (compiled with -O3): char g_d[1024], g_s1[1024], g_s2[1024]; void test_loop(void) { char d = g_d, s1 = g_s1, *s2 = g_s2; for ( int y = 0; y < 128; y++ ) { for ( int x = 0; x < 16; x++ ) d[x] = s1[x] + s2[x]; d += 16; } } before patch: test_loop: .LFB0: .cfi_startproc adrp x0, g_s1 adrp x2, g_s2 add x3, x0, :lo12:g_s1 add x4, x2, :lo12:g_s2 ldrb w7, [x2, #:lo12:g_s2] ldrb w1, [x0, #:lo12:g_s1] adrp x0, g_d ldrb w6, [x4, 1] add x0, x0, :lo12:g_d ldrb w5, [x3, 1] add w1, w1, w7 fmov s0, w1 ldrb w7, [x4, 2] add w5, w5, w6 ldrb w1, [x3, 2] ldrb w6, [x4, 3] add x2, x0, 2048 ins v0.b[1], w5 add w1, w1, w7 ldrb w7, [x3, 3] ldrb w5, [x4, 4] add w7, w7, w6 ldrb w6, [x3, 4] ins v0.b[2], w1 ldrb w8, [x4, 5] add w6, w6, w5 ldrb w5, [x3, 5] ldrb w9, [x4, 6] add w5, w5, w8 ldrb w1, [x3, 6] ins v0.b[3], w7 ldrb w8, [x4, 7] add w1, w1, w9 ldrb w11, [x3, 7] ldrb w7, [x4, 8] add w11, w11, w8 ldrb w10, [x3, 8] ins v0.b[4], w6 ldrb w8, [x4, 9] add w10, w10, w7 ldrb w9, [x3, 9] ldrb w7, [x4, 10] add w9, w9, w8 ldrb w8, [x3, 10] ins v0.b[5], w5 ldrb w6, [x4, 11] add w8, w8, w7 ldrb w7, [x3, 11] ldrb w5, [x4, 12] add w7, w7, w6 ldrb w6, [x3, 12] ins v0.b[6], w1 ldrb w12, [x4, 13] add w6, w6, w5 ldrb w5, [x3, 13] ldrb w1, [x3, 14] add w5, w5, w12 ldrb w13, [x4, 14] ins v0.b[7], w11 ldrb w12, [x4, 15] add w4, w1, w13 ldrb w1, [x3, 15] add w1, w1, w12 ins v0.b[8], w10 ins v0.b[9], w9 ins v0.b[10], w8 ins v0.b[11], w7 ins v0.b[12], w6 ins v0.b[13], w5 ins v0.b[14], w4 ins v0.b[15], w1 .p2align 3,,7 .L2: str q0, [x0], 16 cmp x2, x0 bne .L2 ret .cfi_endproc .LFE0: After patch: test_loop: .LFB0: .cfi_startproc adrp x3, g_s1 adrp x2, g_s2
[PATCH] PR libstdc++/92059 fix several bugs in tr2::dynamic_bitset
PR libstdc++/92059 * include/tr2/dynamic_bitset (__dynamic_bitset_base): Define all special member functions as defaulted. Add noexcept to most members. (__dynamic_bitset_base(size_t, unsigned long long, const _Alloc&)): Mask off unwanted bits in the __val parameter. Avoid undefined left shifts. (__dynamic_bitset_base::_M_assign): Remove. (__dynamic_bitset_base::_M_do_reset): Use std::fill. (__dynamic_bitset_base::_M_are_all_aux): Avoid integer promotion when block_type has lower rank than int. (dynamic_bitset): Add noexcept to most members. Use injected-class-name in return types and parameter types. (dynamic_bitset::_M_Nb): Add default member initializer. (dynamic_bitset(), dynamic_bitset(const dynamic_bitset&)): Define as defaulted. (dynamic_bitset(dynamic_bitset&&)): Clear source object after move. (dynamic_bitset::operator=(const dynamic_bitset&)): Define as defaulted. (dynamic_bitset::operator=(dynamic_bitset&&)): Add noexcept-specifier. Define without using swap, to propagate allocator correctly. (dynamic_bitset(const char*, const _Alloc&)): Use strlen. (dynamic_bitset::_M_do_sanitize, dynamic_bitset::_M_do_fill): Use casts to avoid unwanted integer promotions. (dynamic_bitset::_M_copy_from_ptr): Rearrange template parameters and add default template arguments and default argument to simplify usage. (dynamic_bitset::_M_copy_from_string): Adjust call to _M_copy_from_ptr. (operator==(const dynamic_bitset&, const dynamic_bitset&)) (operator<(const dynamic_bitset&, const dynamic_bitset&)): Use _M_Nb. * include/tr2/dynamic_bitset.tcc (dynamic_bitset::_M_copy_from_ptr): Adjust template parameters to match declaration. * testsuite/tr2/dynamic_bitset/cmp.cc: New test. * testsuite/tr2/dynamic_bitset/cons.cc: New test. * testsuite/tr2/dynamic_bitset/copy.cc: New test. * testsuite/tr2/dynamic_bitset/move.cc: New test. * testsuite/tr2/dynamic_bitset/pr92059.cc: New test. Tested x86_64-linux. Committed to trunk. This seems worth backporting too, as the type is now no longer totally broken (e.g. it is now copy constructible, and the move constructor no longer breaks the class invariant). commit 18ed132a6b119bc2c23078848a8981ba96067a15 Author: redi Date: Fri Oct 11 15:29:55 2019 + PR libstdc++/92059 fix several bugs in tr2::dynamic_bitset PR libstdc++/92059 * include/tr2/dynamic_bitset (__dynamic_bitset_base): Define all special member functions as defaulted. Add noexcept to most members. (__dynamic_bitset_base(size_t, unsigned long long, const _Alloc&)): Mask off unwanted bits in the __val parameter. Avoid undefined left shifts. (__dynamic_bitset_base::_M_assign): Remove. (__dynamic_bitset_base::_M_do_reset): Use std::fill. (__dynamic_bitset_base::_M_are_all_aux): Avoid integer promotion when block_type has lower rank than int. (dynamic_bitset): Add noexcept to most members. Use injected-class-name in return types and parameter types. (dynamic_bitset::_M_Nb): Add default member initializer. (dynamic_bitset(), dynamic_bitset(const dynamic_bitset&)): Define as defaulted. (dynamic_bitset(dynamic_bitset&&)): Clear source object after move. (dynamic_bitset::operator=(const dynamic_bitset&)): Define as defaulted. (dynamic_bitset::operator=(dynamic_bitset&&)): Add noexcept-specifier. Define without using swap, to propagate allocator correctly. (dynamic_bitset(const char*, const _Alloc&)): Use strlen. (dynamic_bitset::_M_do_sanitize, dynamic_bitset::_M_do_fill): Use casts to avoid unwanted integer promotions. (dynamic_bitset::_M_copy_from_ptr): Rearrange template parameters and add default template arguments and default argument to simplify usage. (dynamic_bitset::_M_copy_from_string): Adjust call to _M_copy_from_ptr. (operator==(const dynamic_bitset&, const dynamic_bitset&)) (operator<(const dynamic_bitset&, const dynamic_bitset&)): Use _M_Nb. * include/tr2/dynamic_bitset.tcc (dynamic_bitset::_M_copy_from_ptr): Adjust template parameters to match declaration. * testsuite/tr2/dynamic_bitset/cmp.cc: New test. * testsuite/tr2/dynamic_bitset/cons.cc: New test. * testsuite/tr2/dynamic_bitset/copy.cc: New test. * testsuite/tr2/dynamic_bitset/move.cc: New test. * testsuite/tr2/dynamic_bitset/pr92059.cc: New test. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276890 138bc75d-0d04-0410-961f-82ee72b054a
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
Hi, On 11/10/19 17:23, Marek Polacek wrote: I mean the latter; pedwarn will check for diagnostic_report_warnings_p so the pedwarn will not trigger in a system header unless -Wsystem-headers even without that check. Oh nice, I wasn't aware of that, to be honest, probably we should audit the front-end for more such redundant uses. Anyway, consider it dropped in this case, I'm re-running the testsuite to be safe. Thanks, Paolo.
[PATCH] Avoid warnings in
* include/bits/charconv.h (__to_chars_len): Avoid -Wsign-compare warnings. Tested x86_64-linux. Committed to trunk. commit 0839fa95cf5775f0022b35cda01bb0d6c27ca08a Author: Jonathan Wakely Date: Fri Oct 11 16:10:24 2019 +0100 Avoid warnings in * include/bits/charconv.h (__to_chars_len): Avoid -Wsign-compare warnings. diff --git a/libstdc++-v3/include/bits/charconv.h b/libstdc++-v3/include/bits/charconv.h index a5b6be567bc..7c0922fec21 100644 --- a/libstdc++-v3/include/bits/charconv.h +++ b/libstdc++-v3/include/bits/charconv.h @@ -50,16 +50,16 @@ namespace __detail static_assert(is_unsigned<_Tp>::value, "implementation bug"); unsigned __n = 1; - const int __b2 = __base * __base; - const int __b3 = __b2 * __base; - const int __b4 = __b3 * __base; + const unsigned __b2 = __base * __base; + const unsigned __b3 = __b2 * __base; + const unsigned long __b4 = __b3 * __base; for (;;) { - if (__value < __base) return __n; + if (__value < (unsigned)__base) return __n; if (__value < __b2) return __n + 1; if (__value < __b3) return __n + 2; if (__value < __b4) return __n + 3; - __value /= (unsigned)__b4; + __value /= __b4; __n += 4; } }
Re: [PATCH] add __has_builtin (PR 66970)
Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00062.html On 10/1/19 11:16 AM, Martin Sebor wrote: Attached is an implementation of the __has_builtin special preprocessor operator/macro analogous to __has_attribute and (hopefully) compatible with the synonymous Clang feature (I couldn't actually find tests for it in the Clang test suite but if someone points me at them I'll verify it). Tested on x86_64-linux. Martin PS I couldn't find an existing API to test whether a reserved symbol like __builtin_offsetof is a function-like built-in so I hardwired the tests for C and C++ into the new names_builtin_p functions. I don't like this very much because the next time such an operator is added there is nothing to remind us to update the functions. Adding a flag to the c_common_reswords array would solve the problem but at the expense of a linear search through it. Does anyone have a suggestion for how to do this better?
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
On Fri, Oct 11, 2019 at 05:14:33PM +0200, Paolo Carlini wrote: > Hi, > > On 11/10/19 15:37, Marek Polacek wrote: > > > > > Index: cp/decl.c > > > === > > > --- cp/decl.c (revision 276845) > > > +++ cp/decl.c (working copy) > > > @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs, > > > if (TREE_CODE (declared_type) != UNION_TYPE > > > && !in_system_header_at (input_location)) > > > - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous > > > structs"); > > > + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)), > > > + OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); > > The change looks fine but the in_system_header_at line can be dropped, > > right? > > What do you mean, exactly? Do you mean that, more correctly, we should use > DECL_SOURCE_LOCATION for it too or that in fact is and was already > completely redundant? I agree with the former, I didn't bother changing it > (likely in a couple of other places too) because in practice input_location > should work fine anyway as far as noticing that we are in system header is > concerned and is simpler. If you mean the latter, I'm not sure, I don't > really see why... I mean the latter; pedwarn will check for diagnostic_report_warnings_p so the pedwarn will not trigger in a system header unless -Wsystem-headers even without that check. I know you're just changing the location so you don't need to drop it. I wouldn't worry about the former, I guess it would only make a difference when the { comes from a system header and the } doesn't. -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Re: Fix unchecked use of tree_to_uhwi in tree-ssa-strlen.c
On 10/11/19 8:47 AM, Richard Sandiford wrote: r273783 introduced an unchecked use of tree_to_uhwi. This is tested by the SVE ACLE patches, but could potentially trigger in non-SVE cases too. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? I wasn't able to come up with a C test case that triggers the bug but the fix seems obviously safe and correct to me. Thanks for handling it! Martin Richard 2019-10-11 Richard Sandiford gcc/ * tree-ssa-strlen.c (count_nonzero_bytes): Check tree_fits_uhwi_p before using tree_to_uhwi. Index: gcc/tree-ssa-strlen.c === --- gcc/tree-ssa-strlen.c 2019-10-11 15:43:51.127514545 +0100 +++ gcc/tree-ssa-strlen.c 2019-10-11 15:46:11.718524445 +0100 @@ -4026,10 +4026,10 @@ count_nonzero_bytes (tree exp, unsigned /* The size of the MEM_REF access determines the number of bytes. */ tree type = TREE_TYPE (exp); - if (tree typesize = TYPE_SIZE_UNIT (type)) - nbytes = tree_to_uhwi (typesize); - else + tree typesize = TYPE_SIZE_UNIT (type); + if (!typesize || !tree_fits_uhwi_p (typesize)) return false; + nbytes = tree_to_uhwi (typesize); /* Handle MEM_REF = SSA_NAME types of assignments. */ return count_nonzero_bytes (arg, offset, nbytes, lenrange, nulterm,
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
Hi, On 11/10/19 15:37, Marek Polacek wrote: Index: cp/decl.c === --- cp/decl.c (revision 276845) +++ cp/decl.c (working copy) @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs, if (TREE_CODE (declared_type) != UNION_TYPE && !in_system_header_at (input_location)) - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)), +OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); The change looks fine but the in_system_header_at line can be dropped, right? What do you mean, exactly? Do you mean that, more correctly, we should use DECL_SOURCE_LOCATION for it too or that in fact is and was already completely redundant? I agree with the former, I didn't bother changing it (likely in a couple of other places too) because in practice input_location should work fine anyway as far as noticing that we are in system header is concerned and is simpler. If you mean the latter, I'm not sure, I don't really see why... Paolo.
Re: [PATCH] PR fortran/91714 -- Mangled type-spec
On Sat, Sep 28, 2019 at 10:16 PM Steve Kargl wrote: > > The attached patch issues errors for a few mangled type-specs. > It has been regression tested on x86_64-*-freebsd. OK to commit? > > 2019-09-28 Steven G. Kargl > > PR fortran/91714 > * decl.c (gfc_match_decl_type_spec): Issue errors for a few > mangled types. > > 2019-09-28 Steven G. Kargl > > PR fortran/91714 > * gfortran.dg/dec_type_print_3.f90: Update dg-error regex. > * gfortran.dg/pr91714.f90: New test. > > A daunting task awaits a brave soul as gfc_match_decl_type_spec > is a minefield for bugs. This is dues to the fact the the functions > has grown by adding kludge upon kludge upon kludge. The first > thing to fix is the broken parsing that follows from the > matching of 'type('. > > -- > Steve Ok. -- Janne Blomqvist
[PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)
Hi all, This is a patch for an issue where the compiler was generating a conditional branch in Thumb2, which was too far for b{cond} to handle. This was originally reported at binutils: https://sourceware.org/bugzilla/show_bug.cgi?id=24991 And then raised for GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816 As can be seen here: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihfddaf.html the range of a 32-bit Thumb B{cond} is +/-1MB. This is now checked for in arm.md and an unconditional branch is generated if the jump would be greater than 1MB. New test has been written that checks this for: beq (if (a)), bne (if (a==1)) Patch bootstrapped and regression tested on arm-none-linux-gnueabihf, however, on my native Aarch32 setup the test times out when run as part of a big "make check-gcc" regression, but not when run individually. Patch also regression tested on arm-none-eabi, arm-none-linux-gnueabi with no issues. Also, I don't have commit rights yet, so could someone commit it on my behalf? Thanks, Stam Markianos-Wright gcc/ChangeLog: 2019-10-11 Stamatis Markianos-Wright * config/arm/arm.md: Update b for Thumb2 range checks. * config/arm/arm.c: New function arm_gen_far_branch. * config/arm/arm-protos.h: New function arm_gen_far_branch prototype. gcc/testsuite/ChangeLog: 2019-10-11 Stamatis Markianos-Wright * testsuite/gcc.target/arm/pr91816.c: New test. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index f995974f9bb..1dce333d1c3 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *, void arm_initialize_isa (sbitmap, const enum isa_feature *); +const char * arm_gen_far_branch (rtx *, int,const char * , const char *); + + #endif /* ! GCC_ARM_PROTOS_H */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 39e1a1ef9a2..1a693d2ddca 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -32139,6 +32139,31 @@ arm_run_selftests (void) } } /* Namespace selftest. */ + +/* Generate code to enable conditional branches in functions over 1 MiB. */ +const char * +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest, + const char * branch_format) +{ + rtx_code_label * tmp_label = gen_label_rtx (); + char label_buf[256]; + char buffer[128]; + ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \ + CODE_LABEL_NUMBER (tmp_label)); + const char *label_ptr = arm_strip_name_encoding (label_buf); + rtx dest_label = operands[pos_label]; + operands[pos_label] = tmp_label; + + snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr); + output_asm_insn (buffer, operands); + + snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr); + operands[pos_label] = dest_label; + output_asm_insn (buffer, operands); + return ""; +} + + #undef TARGET_RUN_TARGET_SELFTESTS #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests #endif /* CHECKING_P */ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index f861c72ccfc..634fd0a59da 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -6686,9 +6686,16 @@ ;; And for backward branches we have ;; (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4). ;; +;; In 16-bit Thumb these ranges are: ;; For a 'b' pos_range = 2046, neg_range = -2048 giving (-2040->2048). ;; For a 'b' pos_range = 254, neg_range = -256 giving (-250 ->256). +;; In 32-bit Thumb these ranges are: +;; For a 'b' +/- 16MB is not checked for. +;; For a 'b' pos_range = 1048574, neg_range = -1048576 giving +;; (-1048568 -> 1048576). + + (define_expand "cbranchsi4" [(set (pc) (if_then_else (match_operator 0 "expandable_comparison_operator" @@ -6947,22 +6954,42 @@ (pc)))] "TARGET_32BIT" "* - if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2) -{ - arm_ccfsm_state += 2; - return \"\"; -} - return \"b%d1\\t%l0\"; + if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2) + { + arm_ccfsm_state += 2; + return \"\"; + } + switch (get_attr_length (insn)) + { + // Thumb2 16-bit b{cond} + case 2: + + // Thumb2 32-bit b{cond} + case 4: return \"b%d1\\t%l0\";break; + + // Thumb2 b{cond} out of range. Use unconditional branch. + case 8: return arm_gen_far_branch \ + (operands, 0, \"Lbcond\", \"b%D1\t\"); + break; + + // A32 b{cond} + default: return \"b%d1\\t%l0\"; + } " [(set_attr "conds" "use") (set_attr "type" "branch") (set (attr "length") - (if_then_else - (and (match_test "TARGET_THUMB2") - (and (ge (minus (match_dup 0) (pc)) (const_int -250)) - (le (minus (match_dup 0) (pc)) (const_int 256 - (const_int 2) - (const_int 4)))] + (if_then_else (match_test "TARGET_THUMB2") + (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250)) + (le (minus (match
Re: [PATCH] PR fortran/91943 -- More BOZ fallout
On Tue, Oct 1, 2019 at 1:23 AM Steve Kargl wrote: > > A BOZ cannot be an actual argument in a subroutine or function > reference except those intrinsic function listed in the Fortran > standard. The attach patch checks for invalid BOZ. Built and > tested on x86_64-*-freebsd. OK to commit? > > 2019-09-30 Steven G. Kargl > > PR fortran/91943 > * match.c (gfc_match_call): BOZ cannot be an actual argument in > a subroutine reference. > * resolve.c (resolve_function): BOZ cannot be an actual argument in > a function reference. > > 2019-09-30 Steven G. Kargl > > PR fortran/91943 > gfortran.dg/pr91943.f90 > > -- > Steve Ok, though I believe your other BOZ patch that I just reviewed applies on top of this one? -- Janne Blomqvist
Re: [PATCH] PR fortran/92018 -- BOZ the gift that keeps giving
On Wed, Oct 9, 2019 at 2:14 AM Steve Kargl wrote: > > Tested on x86_64-*-freebsd. OK to commit? > > A BOZ literal constant can be an actual argument in a > very limited number of intrinsic subprograms. For those > intrinsics subprograms, the BOZ literal constant is converted > either during checking (see check.c) or simplification > (see simplify.c). In resolve.c (resolve_function), I added > code that would walk the actual argument list to check for a > BOZ, but that code was restricted to functions with the EXTERNAL > attribute. > > The new testcase, pr92018.f90, demonstrates a situation > when neither the INTRINSIC and EXTERNAL attribute is set, > and the actual argument list contains BOZ. This led to > an ICE. The patch removes the previous restriction, and > so the actual arguments for all functions are checked. > This works except it pointed to a deficiency in the checking > routines. If something was rejected, (e.g., IAND(Z'12',Z34')), > the BOZ were passed onto resolve_function() and run-on errors > were reported. To avoid these additional error messages, I have > added the reset_boz() function, which converts a rejected > BOZ to a default integer kind 0. > > 2019-10-09 Steven G. Kargl > > PF fortran/92018 > * check.c (reset_boz): New function. > (illegal_boz_arg, boz_args_check, gfc_check_complex, gfc_check_float, > gfc_check_transfer): Use it. > (gfc_check_dshift): Use reset_boz, and re-arrange the checking to > help suppress possible run-on errors. > (gfc_check_and): Restore checks for valid argument types. Use > reset_boz, and re-arrange the checking to help suppress possible > run-on errors. > * resolve.c (resolve_function): Actual arguments cannot be BOZ in > a function reference. > > 2019-10-09 Steven G. Kargl > > PF fortran/92018 > * gfortran.dg/gnu_logical_2.f90: Update dg-error regex. > * gfortran.dg/pr81509_2.f90: Ditto. > * gfortran.dg/pr92018.f90: New test. > > -- > Steve Ok. -- Janne Blomqvist
Re: [PATCH][ARM] Tweak HONOR_REG_ALLOC_ORDER
Hi Ramana, > My only question would be whether it's more suitable to use > optimize_function_for_size_p(cfun) instead as IIRC that gives us a > chance with lto rather than the global optimize_size. Yes that is even better and that defaults to optimize_size if cfun isn't set. I've committed this: diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 8b67c9c3657b312be223ab60c01969958baa9ed3..5fad1e5bcc2bc448489fdc8239c676246bbc8879 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1068,9 +1068,8 @@ extern int arm_regs_in_sequence[]; /* Use different register alloc ordering for Thumb. */ #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc () -/* Tell IRA to use the order we define rather than messing it up with its - own cost calculations. */ -#define HONOR_REG_ALLOC_ORDER 1 +/* Tell IRA to use the order we define when optimizing for size. */ +#define HONOR_REG_ALLOC_ORDER optimize_function_for_size_p (cfun) /* Interrupt functions can only use registers that have already been saved by the prologue, even if they would normally be
Re: [PATCH] PR fortran/92019 -- BOZ cannot be an array index
On Wed, Oct 9, 2019 at 11:26 PM Steve Kargl wrote: > > The attach patch fixes an ICE caused by the use of a bOZ as > an array index. Tested on x86_64-*-freebsd. OK to commit? > > 2019-10-09 Steven G. Kargl > > PR fortran/92019 > * array.c (match_subscript): BOZ cannot be an array subscript. > > 2019-10-09 Steven G. Kargl > > PR fortran/92019 > * gfortran.dg/pr92019.f90: New test. > > -- > Steve Ok. -- Janne Blomqvist
Fix unchecked use of tree_to_uhwi in tree-ssa-strlen.c
r273783 introduced an unchecked use of tree_to_uhwi. This is tested by the SVE ACLE patches, but could potentially trigger in non-SVE cases too. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-10-11 Richard Sandiford gcc/ * tree-ssa-strlen.c (count_nonzero_bytes): Check tree_fits_uhwi_p before using tree_to_uhwi. Index: gcc/tree-ssa-strlen.c === --- gcc/tree-ssa-strlen.c 2019-10-11 15:43:51.127514545 +0100 +++ gcc/tree-ssa-strlen.c 2019-10-11 15:46:11.718524445 +0100 @@ -4026,10 +4026,10 @@ count_nonzero_bytes (tree exp, unsigned /* The size of the MEM_REF access determines the number of bytes. */ tree type = TREE_TYPE (exp); - if (tree typesize = TYPE_SIZE_UNIT (type)) - nbytes = tree_to_uhwi (typesize); - else + tree typesize = TYPE_SIZE_UNIT (type); + if (!typesize || !tree_fits_uhwi_p (typesize)) return false; + nbytes = tree_to_uhwi (typesize); /* Handle MEM_REF = SSA_NAME types of assignments. */ return count_nonzero_bytes (arg, offset, nbytes, lenrange, nulterm,
[committed] Relax store_bit_field call in store_expr
store_bit_field already takes a poly_uint64 size, so we can relax the INTVAL to rtx_to_poly_int64. This is tested by the SVE ACLE patches. Tested on aarch64-linux-gnu and x86_64-linux-gnu, applied as obvious. Richard 2019-10-11 Richard Sandiford gcc/ * expr.c (store_expr): Use rtx_to_poly_int64 rather than INTVAL when calling store_bit_field. Index: gcc/expr.c === --- gcc/expr.c 2019-10-06 16:37:30.845042970 +0100 +++ gcc/expr.c 2019-10-11 15:44:02.755432660 +0100 @@ -5790,7 +5790,8 @@ store_expr (tree exp, rtx target, int ca copy_blkmode_from_reg (target, temp, TREE_TYPE (exp)); else store_bit_field (target, -INTVAL (expr_size (exp)) * BITS_PER_UNIT, +rtx_to_poly_int64 (expr_size (exp)) +* BITS_PER_UNIT, 0, 0, 0, GET_MODE (temp), temp, reverse); } else
Add a constant_range_value_p function (PR 92033)
The range-tracking code has a pretty hard-coded assumption that is_gimple_min_invariant is equivalent to "INTEGER_CST or invariant ADDR_EXPR". It seems better to add a predicate specifically for that rather than contiually fight cases in which it can't handle other invariants. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-10-11 Richard Sandiford gcc/ PR tree-optimization/92033 * tree-vrp.h (constant_range_value_p): Declare. * tree-vrp.c (constant_range_value_p): New function. (value_range_base::symbolic_p, value_range_base::singleton_p) (get_single_symbol, compare_values_warnv, intersect_ranges) (value_range_base::normalize_symbolics): Use it instead of is_gimple_min_invariant. (simplify_stmt_for_jump_threading): Likewise. * vr-values.c (symbolic_range_based_on_p, valid_value_p): Likewise. (vr_values::op_with_constant_singleton_value_range): Likewise. (vr_values::extract_range_from_binary_expr): Likewise. (vr_values::extract_range_from_unary_expr): Likewise. (vr_values::extract_range_from_cond_expr): Likewise. (vr_values::extract_range_from_comparison): Likewise. (vr_values::extract_range_from_assignment): Likewise. (vr_values::adjust_range_with_scev, vrp_valueize): Likewise. (vr_values::vrp_visit_assignment_or_call): Likewise. (vr_values::vrp_evaluate_conditional): Likewise. (vr_values::simplify_bit_ops_using_ranges): Likewise. (test_for_singularity): Likewise. (vr_values::simplify_cond_using_ranges_1): Likewise. Index: gcc/tree-vrp.h === --- gcc/tree-vrp.h 2019-10-08 09:23:31.282533990 +0100 +++ gcc/tree-vrp.h 2019-10-11 15:41:20.380576059 +0100 @@ -284,6 +284,7 @@ value_range_base::supports_type_p (tree return false; } +extern bool constant_range_value_p (const_tree); extern void register_edge_assert_for (tree, edge, enum tree_code, tree, tree, vec &); extern bool stmt_interesting_for_vrp (gimple *); Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c 2019-10-08 09:23:31.282533990 +0100 +++ gcc/tree-vrp.c 2019-10-11 15:41:20.380576059 +0100 @@ -78,6 +78,18 @@ ranges_from_anti_range (const value_rang for still active basic-blocks. */ static sbitmap *live; +/* Return true if VALUE is considered constant for range tracking. + This is stricter than is_gimple_min_invariant and should be + used instead of it in range-related code. */ + +bool +constant_range_value_p (const_tree value) +{ + return (TREE_CODE (value) == INTEGER_CST + || (TREE_CODE (value) == ADDR_EXPR + && is_gimple_invariant_address (value))); +} + void value_range::set_equiv (bitmap equiv) { @@ -273,8 +285,8 @@ value_range_base::symbolic_p () const { return (!varying_p () && !undefined_p () - && (!is_gimple_min_invariant (m_min) - || !is_gimple_min_invariant (m_max))); + && (!constant_range_value_p (m_min) + || !constant_range_value_p (m_max))); } /* NOTE: This is not the inverse of symbolic_p because the range @@ -388,7 +400,7 @@ value_range_base::singleton_p (tree *res } if (m_kind == VR_RANGE && vrp_operand_equal_p (min (), max ()) - && is_gimple_min_invariant (min ())) + && constant_range_value_p (min ())) { if (result) *result = min (); @@ -953,13 +965,13 @@ get_single_symbol (tree t, bool *neg, tr || TREE_CODE (t) == POINTER_PLUS_EXPR || TREE_CODE (t) == MINUS_EXPR) { - if (is_gimple_min_invariant (TREE_OPERAND (t, 0))) + if (constant_range_value_p (TREE_OPERAND (t, 0))) { neg_ = (TREE_CODE (t) == MINUS_EXPR); inv_ = TREE_OPERAND (t, 0); t = TREE_OPERAND (t, 1); } - else if (is_gimple_min_invariant (TREE_OPERAND (t, 1))) + else if (constant_range_value_p (TREE_OPERAND (t, 1))) { neg_ = false; inv_ = TREE_OPERAND (t, 1); @@ -1106,8 +1118,8 @@ compare_values_warnv (tree val1, tree va TYPE_SIGN (TREE_TYPE (val1))); } - const bool cst1 = is_gimple_min_invariant (val1); - const bool cst2 = is_gimple_min_invariant (val2); + const bool cst1 = constant_range_value_p (val1); + const bool cst2 = constant_range_value_p (val2); /* If one is of the form '[-]NAME + CST' and the other is constant, then it might be possible to say something depending on the constants. */ @@ -5785,7 +5797,7 @@ intersect_ranges (enum value_range_kind correct estimate unless VR1 is a constant singleton range in which case we choose that. */ if (vr1type == VR_RANGE - && is_gimple_min_invariant (vr1min) + && constant_range_value_p (vr1min) && vrp_operand_equa
Add expr_callee_abi
This turned out to be useful for the SVE PCS support, and is a natural tree-level analogue of insn_callee_abi. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-10-11 Richard Sandiford gcc/ * function-abi.h (expr_callee_abi): Declare. * function-abi.cc (expr_callee_abi): New function. Index: gcc/function-abi.h === --- gcc/function-abi.h 2019-09-30 17:39:33.514597856 +0100 +++ gcc/function-abi.h 2019-10-11 15:38:54.141605718 +0100 @@ -315,5 +315,6 @@ call_clobbered_in_region_p (unsigned int extern const predefined_function_abi &fntype_abi (const_tree); extern function_abi fndecl_abi (const_tree); extern function_abi insn_callee_abi (const rtx_insn *); +extern function_abi expr_callee_abi (const_tree); #endif Index: gcc/function-abi.cc === --- gcc/function-abi.cc 2019-09-30 17:39:33.514597856 +0100 +++ gcc/function-abi.cc 2019-10-11 15:38:54.141605718 +0100 @@ -229,3 +229,32 @@ insn_callee_abi (const rtx_insn *insn) return default_function_abi; } + +/* Return the ABI of the function called by CALL_EXPR EXP. Return the + default ABI for erroneous calls. */ + +function_abi +expr_callee_abi (const_tree exp) +{ + gcc_assert (TREE_CODE (exp) == CALL_EXPR); + + if (tree fndecl = get_callee_fndecl (exp)) +return fndecl_abi (fndecl); + + tree callee = CALL_EXPR_FN (exp); + if (callee == error_mark_node) +return default_function_abi; + + tree type = TREE_TYPE (callee); + if (type == error_mark_node) +return default_function_abi; + + if (POINTER_TYPE_P (type)) +{ + type = TREE_TYPE (type); + if (type == error_mark_node) + return default_function_abi; +} + + return fntype_abi (type); +}
Re: Ping: [C][C++] Avoid exposing internal details in aka types
On Fri, Oct 11, 2019 at 03:14:09PM +0100, Richard Sandiford wrote: > Marek Polacek writes: > > On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote: > >> Ping > >> > >> Richard Sandiford writes: > >> > The current aka diagnostics can sometimes leak internal details that > >> > seem more likely to be distracting than useful. E.g. on aarch64: > >> > > >> > void f (va_list *va) { *va = 1; } > >> > > >> > gives: > >> > > >> > incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} > >> > from type ‘int’ > >> > > >> > where __va_list isn't something the user is expected to know about. > >> > A similar thing happens for C++ on the arm_neon.h-based: > >> > > >> > float x; > >> > int8x8_t y = x; > >> > > >> > which gives: > >> > > >> > cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in > >> > initialization > >> > > >> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS -- > >> > but it's not going to be meaningful to most users. > > > > Agreed. > > > >> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to > >> + the user in diagnostics, false if it would be better to use TYPE > >> itself. > >> + TYPE is known to satisfy typedef_variant_p. */ > >> + > >> +bool > >> +user_facing_original_type_p (const_tree type) > >> +{ > >> + gcc_assert (typedef_variant_p (type)); > >> + tree decl = TYPE_NAME (type); > >> + > >> + /* Look through any typedef in "user" code. */ > >> + if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl)) > >> +return true; > >> + > >> + /* If the original type is also named and is in the user namespace, > >> + assume it too is a user-facing type. */ > >> + tree orig_type = DECL_ORIGINAL_TYPE (decl); > >> + if (tree orig_id = TYPE_IDENTIFIER (orig_type)) > >> +{ > >> + const char *name = IDENTIFIER_POINTER (orig_id); > >> + if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1]))) > >> + return true; > > > > This looks like name_reserved_for_implementation_p. > > > > The rest looks fine to me! > > Ah, nice! I'd looked for a helper but missed that one. > > Here's just the C parts, with that change. Tested on aarch64-linux-gnu > and x86_64-linux-gnu. OK to install? Ok, thanks! -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Re: [RFH][libgcc] fp-bit bit ordering (PR 78804)
On Thu, 2019-10-03 at 19:34 -0600, Jeff Law wrote: > > So probably the most interesting target for this test is v850-elf as > it's got a reasonably well functioning simulator, hard and soft FP > targets, little endian, and I'm familiar with its current set of > failures. > > I can confirm that your patch makes no difference in the test results > (which includes execution results). > > In fact, there haven't been any problems on any target in my tester > that > I can tie back to this change. > > At this point I'd say let's go for it. > Thanks, Jeff. I'll commit it to trunk if there are no further objections some time next week. It's a latent issue, not a regression. OK for the open branches, too? Any opinions on that? Cheers, Oleg
Re: Ping: [C][C++] Avoid exposing internal details in aka types
Richard Sandiford writes: > Marek Polacek writes: >> On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote: >>> Ping >>> >>> Richard Sandiford writes: >>> > The current aka diagnostics can sometimes leak internal details that >>> > seem more likely to be distracting than useful. E.g. on aarch64: >>> > >>> > void f (va_list *va) { *va = 1; } >>> > >>> > gives: >>> > >>> > incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} >>> > from type ‘int’ >>> > >>> > where __va_list isn't something the user is expected to know about. >>> > A similar thing happens for C++ on the arm_neon.h-based: >>> > >>> > float x; >>> > int8x8_t y = x; >>> > >>> > which gives: >>> > >>> > cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in >>> > initialization >>> > >>> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS -- >>> > but it's not going to be meaningful to most users. >> >> Agreed. >> >>> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to >>> + the user in diagnostics, false if it would be better to use TYPE itself. >>> + TYPE is known to satisfy typedef_variant_p. */ >>> + >>> +bool >>> +user_facing_original_type_p (const_tree type) >>> +{ >>> + gcc_assert (typedef_variant_p (type)); >>> + tree decl = TYPE_NAME (type); >>> + >>> + /* Look through any typedef in "user" code. */ >>> + if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl)) >>> +return true; >>> + >>> + /* If the original type is also named and is in the user namespace, >>> + assume it too is a user-facing type. */ >>> + tree orig_type = DECL_ORIGINAL_TYPE (decl); >>> + if (tree orig_id = TYPE_IDENTIFIER (orig_type)) >>> +{ >>> + const char *name = IDENTIFIER_POINTER (orig_id); >>> + if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1]))) >>> + return true; >> >> This looks like name_reserved_for_implementation_p. >> >> The rest looks fine to me! > > Ah, nice! I'd looked for a helper but missed that one. > > Here's just the C parts, with that change. And here are the C++ parts, on top of that. I should probably have split them from the outset. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard [Comment from the original covering note, in case it helps: strip_typedefs had: /* Explicitly get the underlying type, as TYPE_MAIN_VARIANT doesn't strip typedefs with attributes. */ result = TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (TYPE_NAME (t))); result = strip_typedefs (result); Applying TYPE_MAIN_VARIANT predates the strip_typedefs call, with the comment originally contrasting with plain: result = TYPE_MAIN_VARIANT (t); But the recursive call to strip_typedefs will apply TYPE_MAIN_VARIANT, so it doesn't seem necessary to do it here too. I think there was also a missing "remove_attributes" argument, since wrapping something in a typedef shouldn't change which attributes get removed.] 2019-10-11 Richard Sandiford gcc/cp/ * cp-tree.h (STF_USER_VISIBLE): New constant. (strip_typedefs, strip_typedefs_expr): Take a flags argument. * tree.c (strip_typedefs, strip_typedefs_expr): Likewise, updating mutual calls accordingly. When STF_USER_VISIBLE is true, only look through typedefs if user_facing_original_type_p. * error.c (dump_template_bindings, type_to_string): Pass STF_USER_VISIBLE to strip_typedefs. (dump_type): Likewise, unless pp_c_flag_gnu_v3 is set. gcc/testsuite/ * g++.dg/diagnostic/aka5.h: New test. * g++.dg/diagnostic/aka5a.C: Likewise. * g++.dg/diagnostic/aka5b.C: Likewise. * g++.target/aarch64/diag_aka_1.C: Likewise. Index: gcc/cp/cp-tree.h === --- gcc/cp/cp-tree.h2019-10-08 09:23:43.0 +0100 +++ gcc/cp/cp-tree.h2019-10-11 15:12:12.544911643 +0100 @@ -5696,6 +5696,13 @@ #define TFF_NO_OMIT_DEFAULT_TEMPLATE_ARG #define TFF_NO_TEMPLATE_BINDINGS (1 << 13) #define TFF_POINTER(1 << 14) +/* These constants can be used as bit flags to control strip_typedefs. + + STF_USER_VISIBLE: use heuristics to try to avoid stripping user-facing + aliases of internal details. This is intended for diagnostics, + where it should (for example) give more useful "aka" types. */ +const unsigned int STF_USER_VISIBLE = 1U; + /* Returns the TEMPLATE_DECL associated to a TEMPLATE_TEMPLATE_PARM node. */ #define TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL(NODE) \ @@ -7254,8 +7261,10 @@ extern int zero_init_p (const_tree); extern bool check_abi_tag_redeclaration(const_tree, const_tree, const_tree); extern bool check_abi_tag_args (tree, tree); -extern tree strip_typedefs (tree, bool * = NULL); -
Re: [PATCH] Fix parser to recognize operator?:
On Fri, Oct 11, 2019 at 04:06:43PM +0200, Matthias Kretz wrote: > This is a minor bugfix for improved error reporting. Overloading ?: is just > as > disallowed as it is without this change. Thanks. Can you provide a testcase that shows why this change makes sense? That testcase then should be part of the patch submission. > 2019-10-11 Matthias Kretz > > * gcc/cp/parser.c (cp_parser_operator): Parse operator?: as an > attempt to overload the conditional operator. Then > grok_op_properties can print its useful "ISO C++ prohibits > overloading operator ?:" message instead of the cryptic error > message about a missing type-specifier before '?' token. > > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c > index 3ee8da7db94..73385cb3dcb 100644 > --- a/gcc/cp/parser.c > +++ b/gcc/cp/parser.c > @@ -15502,6 +15502,15 @@ cp_parser_operator (cp_parser* parser, location_t > start_loc) >op = COMPONENT_REF; >break; > > +case CPP_QUERY: > + op = COND_EXPR; > + /* Consume the `?'. */ > + cp_lexer_consume_token (parser->lexer); > + /* Look for the matching `:'. */ > + cp_parser_require (parser, CPP_COLON, RT_COLON); > + consumed = true; > + break; > + > case CPP_OPEN_PAREN: >{ > /* Consume the `('. */ -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Re: Ping: [C][C++] Avoid exposing internal details in aka types
Marek Polacek writes: > On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote: >> Ping >> >> Richard Sandiford writes: >> > The current aka diagnostics can sometimes leak internal details that >> > seem more likely to be distracting than useful. E.g. on aarch64: >> > >> > void f (va_list *va) { *va = 1; } >> > >> > gives: >> > >> > incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} >> > from type ‘int’ >> > >> > where __va_list isn't something the user is expected to know about. >> > A similar thing happens for C++ on the arm_neon.h-based: >> > >> > float x; >> > int8x8_t y = x; >> > >> > which gives: >> > >> > cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in initialization >> > >> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS -- >> > but it's not going to be meaningful to most users. > > Agreed. > >> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to >> + the user in diagnostics, false if it would be better to use TYPE itself. >> + TYPE is known to satisfy typedef_variant_p. */ >> + >> +bool >> +user_facing_original_type_p (const_tree type) >> +{ >> + gcc_assert (typedef_variant_p (type)); >> + tree decl = TYPE_NAME (type); >> + >> + /* Look through any typedef in "user" code. */ >> + if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl)) >> +return true; >> + >> + /* If the original type is also named and is in the user namespace, >> + assume it too is a user-facing type. */ >> + tree orig_type = DECL_ORIGINAL_TYPE (decl); >> + if (tree orig_id = TYPE_IDENTIFIER (orig_type)) >> +{ >> + const char *name = IDENTIFIER_POINTER (orig_id); >> + if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1]))) >> +return true; > > This looks like name_reserved_for_implementation_p. > > The rest looks fine to me! Ah, nice! I'd looked for a helper but missed that one. Here's just the C parts, with that change. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-10-11 Richard Sandiford gcc/c-family/ * c-common.h (user_facing_original_type_p): Declare. * c-common.c: Include c-spellcheck.h. (user_facing_original_type_p): New function. gcc/c/ * c-objc-common.c (useful_aka_type_p): Replace with... (get_aka_type): ...this new function. Given the original type, decide which aka type to print (if any). Only look through typedefs if user_facing_original_type_p. (print_type): Update accordingly. gcc/testsuite/ * gcc.dg/diag-aka-5.h: New test. * gcc.dg/diag-aka-5a.c: Likewise. * gcc.dg/diag-aka-5b.c: Likewise. * gcc.target/aarch64/diag_aka_1.c (f): Expect an aka to be printed for myvec. Index: gcc/c-family/c-common.h === --- gcc/c-family/c-common.h 2019-10-08 09:23:43.0 +0100 +++ gcc/c-family/c-common.h 2019-10-11 15:12:08.552939853 +0100 @@ -1063,6 +1063,7 @@ extern tree builtin_type_for_size (int, extern void c_common_mark_addressable_vec (tree); extern void set_underlying_type (tree); +extern bool user_facing_original_type_p (const_tree); extern void record_types_used_by_current_var_decl (tree); extern vec *make_tree_vector (void); extern void release_tree_vector (vec *); Index: gcc/c-family/c-common.c === --- gcc/c-family/c-common.c 2019-10-08 09:23:43.0 +0100 +++ gcc/c-family/c-common.c 2019-10-11 15:12:08.552939853 +0100 @@ -48,6 +48,7 @@ #define GCC_C_COMMON_C #include "gimplify.h" #include "substring-locations.h" #include "spellcheck.h" +#include "c-spellcheck.h" #include "selftest.h" cpp_reader *parse_in; /* Declared in c-pragma.h. */ @@ -7713,6 +7714,52 @@ set_underlying_type (tree x) } } +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to + the user in diagnostics, false if it would be better to use TYPE itself. + TYPE is known to satisfy typedef_variant_p. */ + +bool +user_facing_original_type_p (const_tree type) +{ + gcc_assert (typedef_variant_p (type)); + tree decl = TYPE_NAME (type); + + /* Look through any typedef in "user" code. */ + if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl)) +return true; + + /* If the original type is also named and is in the user namespace, + assume it too is a user-facing type. */ + tree orig_type = DECL_ORIGINAL_TYPE (decl); + if (tree orig_id = TYPE_IDENTIFIER (orig_type)) +if (!name_reserved_for_implementation_p (IDENTIFIER_POINTER (orig_id))) + return true; + + switch (TREE_CODE (orig_type)) +{ +/* Don't look through to an anonymous vector type, since the syntax + we use for them in diagnostics isn't real C or C++ syntax. + And if ORIG_TYPE is named but in the implementation namespace, + TYPE is likely to be mor
[PATCH] Fix parser to recognize operator?:
This is a minor bugfix for improved error reporting. Overloading ?: is just as disallowed as it is without this change. 2019-10-11 Matthias Kretz * gcc/cp/parser.c (cp_parser_operator): Parse operator?: as an attempt to overload the conditional operator. Then grok_op_properties can print its useful "ISO C++ prohibits overloading operator ?:" message instead of the cryptic error message about a missing type-specifier before '?' token. diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 3ee8da7db94..73385cb3dcb 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -15502,6 +15502,15 @@ cp_parser_operator (cp_parser* parser, location_t start_loc) op = COMPONENT_REF; break; +case CPP_QUERY: + op = COND_EXPR; + /* Consume the `?'. */ + cp_lexer_consume_token (parser->lexer); + /* Look for the matching `:'. */ + cp_parser_require (parser, CPP_COLON, RT_COLON); + consumed = true; + break; + case CPP_OPEN_PAREN: { /* Consume the `('. */ -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtzzentrum für Schwerionenforschung https://gsi.de SIMD easy and portable https://github.com/VcDevel/Vc ──
Re: [RFH][libgcc] fp-bit bit ordering (PR 78804)
Hi Bernd, On Sun, 2019-09-29 at 08:49 +, Bernd Edlinger wrote: > > But I cannot tell if the bitfield access generates > more efficient code or identical code than the > original variant when no ms bitfields are used. > That needs closer inspection of the generated > assembler code, a simple bootstrap / regtest will > probably not be sufficient. Thanks for your comment. The bitfields in this case are used for packing and unpacking the FP values. So I gave it a try on RX and SH as examples (with trunk compiler). I've extracted the unpack function for a "double" value and compared the generated assembly code. Please see the attached file. The bitfield code on RX is equivalent. On SH bitfields is slightly worse. It gets *a lot* worse when the struct is passed by reference/pointer, then things get completely out of control. Theoretically, if the bitfields and the shifts-and-ands extract/pack the same bits from the underlying base integer value, the resulting code should be the same in both cases (see RX). But it depends on what the backend does, or tries to do (see SH). > But my thought is if the -mms-bitfields option has such > an impact on this structure, then it would be good if there > was a built-in define that can be used to adjust to and/or > diagnose the problem at compile time. I think it'd be better to add static_assert on the expected sizes of the various float value structs. > > I think that is missing right now, but wouldn't it be nice to have > a define like __MS_BITFIELD_LAYOUT__ ? Honestly, I'm not a big fan of the whole MS bitfield thing. On RX it's been put there long time ago for compatibility with some other compiler. I'm not sure if it's relevant at all anymore. So I don't want to add any more to the pile ... Cheers, Oleg typedef unsigned long long fractype; struct unpacked_double { fractype fraction; int exp; int sign; }; struct __attribute__ ((packed)) bits_little_endian { fractype fraction:52 __attribute__ ((packed)); unsigned int exp:11 __attribute__ ((packed)); unsigned int sign:1 __attribute__ ((packed)); }; static_assert (sizeof (bits_little_endian) == 8, ""); static_assert (sizeof (long long) == 8, ""); // unpacked_double unpack_d_bitfields (bits_little_endian val) { fractype fraction = val.fraction; int exp = val.exp; int sign = val.sign; return { fraction, exp, sign }; } /* RX -O2 __Z18unpack_d_bitfieldsRK18bits_little_endian: mov.L r2, r4 ; 4 [c=4 l=2] *movsi_internal/4 shlr #20, r2, r3 ; 9 [c=8 l=3] lshrsi3/2 add #-16, r0 ; 58 [c=4 l=3] addsi3_internal/3 mov.L #0xf, r2 ; 57 [c=4 l=5] *movsi_internal/2 and r4, r2 ; 43 [c=4 l=2] andsi3/0 and #0x7ff, r3 ; 44 [c=4 l=4] andsi3/3 shlr #31, r4 ; 45 [c=8 l=2] lshrsi3/1 rtsd #16 ; 61 [c=4 l=2] deallocate_and_return SH -O2 __Z18unpack_d_bitfieldsRK18bits_little_endian: .LFB0: .cfi_startproc .cfi_startproc add #-8,r15 ! 85 [c=4 l=2] *addsi3/0 .cfi_def_cfa_offset 8 mov r2,r0 ! 2 [c=4 l=2] movsi_ie/1 mov.l .L3,r3 ! 32 [c=10 l=2] movsi_ie/0 mov #-21,r2 ! 82 [c=4 l=2] movsi_ie/2 mov r5,r1 ! 81 [c=4 l=2] movsi_ie/1 add r1,r1 ! 16 [c=4 l=2] ashlsi3_k/0 shld r2,r1 ! 71 [c=4 l=2] lshrsi3_d mov r5,r2 ! 83 [c=4 l=2] movsi_ie/1 shll r2 ! 72 [c=0 l=2] shll mov.l r4,@r0 ! 44 [c=4 l=2] movsi_ie/8 movt r2 ! 73 [c=4 l=2] movt mov.l r1,@(8,r0) ! 35 [c=4 l=2] movsi_ie/8 and r3,r5 ! 33 [c=4 l=2] *andsi_compact/3 mov.l r2,@(12,r0) ! 36 [c=4 l=2] movsi_ie/8 mov.l r5,@(4,r0) ! 45 [c=4 l=2] movsi_ie/8 rts ! 91 [c=0 l=2] *return_i add #8,r15 ! 90 [c=4 l=2] *addsi3/0 .cfi_endproc */ // unpacked_double unpack_d_no_bitfields (long long val) { fractype fraction = val & fractype)1) << 52) - 1); int exp = ((int)(val >> 52)) & ((1 << 11) - 1); int sign = ((int)(val >> (52 + 11))) & 1; return { fraction, exp, sign }; } /* RX -O2 mov.L r2, r4 ; 4 [c=4 l=2] *movsi_internal/4 shar #20, r2, r3 ; 13 [c=8 l=3] ashrsi3/2 add #-16, r0 ; 56 [c=4 l=3] addsi3_internal/3 mov.L #0xf, r2 ; 55 [c=4 l=5] *movsi_internal/2 and r4, r2 ; 42 [c=4 l=2] andsi3/0 and #0x7ff, r3 ; 43 [c=4 l=4] andsi3/3 shlr #31, r4 ; 44 [c=8 l=2] lshrsi3/1 rtsd #16 ; 59 [c=4 l=2] deallocate_and_return SH -O2 mov r2,r0 ! 2 [c=4 l=2] movsi_ie/1 mov.l .L6,r1 ! 10 [c=10 l=2] movsi_ie/0 mov.l r4,@r2 ! 9 [c=4 l=2] movsi_ie/8 and r5,r1 ! 11 [c=4 l=2] *andsi_compact/3 mov.l r1,@(4,r2) ! 12 [c=4 l=2] movsi_ie/8 mov #-21,r2 ! 69 [c=4 l=2] movsi_ie/2 mov r5,r1 ! 68 [c=4 l=2] movsi_ie/1 shll r5 ! 59 [c=0 l=2] shll add r1,r1 ! 18 [c=4 l=2] ashlsi3_k/0 shld r2,r1 ! 58 [c=4 l=2] lshrsi3_d movt r5 ! 60 [c=4 l=2] movt mov.l r1,@(8,r0) ! 20 [c=4 l=2] movsi_ie/8 rts ! 74 [c=0 l=2] *return_i mov.l r5,@(12,r0) ! 28 [c=4 l=2] movsi_ie/8 .L7: .align 2 .L6: .long 1048575 */
Re: [C++ Patch] One more DECL_SOURCE_LOCATION
On Fri, Oct 11, 2019 at 12:58:41PM +0200, Paolo Carlini wrote: > Hi, > > another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent with > the one I used in build_anon_union_vars. Tested x86_64-linux. > > Thanks, Paolo. > > / > /cp > 2019-10-11 Paolo Carlini > > * decl.c (check_tag_decl): Use DECL_SOURCE_LOCATION. > > /testsuite > 2019-10-11 Paolo Carlini > > * g++.dg/cpp0x/constexpr-union5.C: Test location(s) too. > * g++.dg/diagnostic/bitfld2.C: Likewise. > * g++.dg/ext/anon-struct1.C: Likewise. > * g++.dg/ext/anon-struct6.C: Likewise. > * g++.dg/ext/flexary19.C: Likewise. > * g++.dg/ext/flexary9.C: Likewise. > * g++.dg/template/error17.C: Likewise. > Index: cp/decl.c > === > --- cp/decl.c (revision 276845) > +++ cp/decl.c (working copy) > @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs, > >if (TREE_CODE (declared_type) != UNION_TYPE > && !in_system_header_at (input_location)) > - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous > structs"); > + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)), > + OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); The change looks fine but the in_system_header_at line can be dropped, right? Marek
Re: [patch,Fortran] PR 92050 - fix ICE with -fcheck=all
On Fri, Oct 11, 2019 at 12:17:49PM +0200, Tobias Burnus wrote: > Checking produces extra code (unsurprisingly); this code needs to end up > in the program… > > Bootstrapped on x86-64_gnu-linux. OK for the trunk? > OK. -- Steve
[PATCH] Re-instantiate more redundant store removal in FRE
I've spent some time robustifying the original idea (fixing some latent wrong-code on the way...). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2019-10-11 Richard Biener PR tree-optimization/90883 PR tree-optimization/91091 * tree-ssa-sccvn.c (vn_reference_lookup_3): Use correct alias-sets both for recording VN table entries and continuing walking after translating through copies. Handle same-sized reads from SSA names by returning the plain SSA name. (eliminate_dom_walker::eliminate_stmt): Properly handle non-size precision stores in redundant store elimination. * gcc.dg/torture/20191011-1.c: New testcase. * gcc.dg/tree-ssa/ssa-fre-82.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-83.c: Likewise. * gcc.dg/tree-ssa/redundant-assign-zero-1.c: Disable FRE. * gcc.dg/tree-ssa/redundant-assign-zero-2.c: Likewise. Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c(revision 276858) +++ gcc/tree-ssa-sccvn.c(working copy) @@ -1877,8 +1877,10 @@ if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Successfully combined %u partial definitions\n", ndefs); + /* ??? If we track partial defs alias-set we could use that if it + is the same for all. Use zero for now. */ return vn_reference_lookup_or_insert_for_pieces - (first_vuse, vr->set, vr->type, vr->operands, val); + (first_vuse, 0, vr->type, vr->operands, val); } else { @@ -2333,7 +2335,7 @@ /* If we are looking for redundant stores do not create new hashtable entries from aliasing defs with made up alias-sets. */ - if (*disambiguate_only > TR_TRANSLATE || !data->tbaa_p) + if (*disambiguate_only > TR_TRANSLATE) return (void *)-1; /* If we cannot constrain the size of the reference we cannot @@ -2449,7 +2451,7 @@ return (void *)-1; } return vn_reference_lookup_or_insert_for_pieces - (vuse, vr->set, vr->type, vr->operands, val); + (vuse, 0, vr->type, vr->operands, val); } /* For now handle clearing memory with partial defs. */ else if (known_eq (ref->size, maxsize) @@ -2499,7 +2501,7 @@ { tree val = build_zero_cst (vr->type); return vn_reference_lookup_or_insert_for_pieces - (vuse, vr->set, vr->type, vr->operands, val); + (vuse, get_alias_set (lhs), vr->type, vr->operands, val); } else if (known_eq (ref->size, maxsize) && maxsize.is_constant (&maxsizei) @@ -2614,7 +2616,7 @@ if (val) return vn_reference_lookup_or_insert_for_pieces - (vuse, vr->set, vr->type, vr->operands, val); + (vuse, get_alias_set (lhs), vr->type, vr->operands, val); } } else if (ranges_known_overlap_p (offseti, maxsizei, offset2i, size2i)) @@ -2672,23 +2674,26 @@ according to endianness. */ && (! INTEGRAL_TYPE_P (vr->type) || known_eq (ref->size, TYPE_PRECISION (vr->type))) - && multiple_p (ref->size, BITS_PER_UNIT) - && (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs)) - || type_has_mode_precision_p (TREE_TYPE (def_rhs + && multiple_p (ref->size, BITS_PER_UNIT)) { - gimple_match_op op (gimple_match_cond::UNCOND, - BIT_FIELD_REF, vr->type, - vn_valueize (def_rhs), - bitsize_int (ref->size), - bitsize_int (offset - offset2)); - tree val = vn_nary_build_or_lookup (&op); - if (val - && (TREE_CODE (val) != SSA_NAME - || ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val))) + if (known_eq (ref->size, size2)) + return vn_reference_lookup_or_insert_for_pieces + (vuse, get_alias_set (lhs), vr->type, vr->operands, +SSA_VAL (def_rhs)); + else if (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs)) + || type_has_mode_precision_p (TREE_TYPE (def_rhs))) { - vn_reference_t res = vn_reference_lookup_or_insert_for_pieces - (vuse, vr->set, vr->type, vr->operands, val); - return res; + gimple_match_op op (gimple_match_cond::UNCOND, + BIT_FIELD_REF, vr->type, + vn_valueize (def_rhs), + bit
Re: [PATCH] Cleanup parameter of vectorizable_live_operation
On Fri, 11 Oct 2019, Bernd Edlinger wrote: > Hi Richard, > > I became aware of this while looking at the -Wshadow=compatible-local > warnings. > The function vectorizable_live_operation uses a parameter called "vec_stmt" > that is shadowed by something also called "vec_stmt". But in this case, > the vec_stmt actually only used as a boolean, i.e. pointer is NULL or not. > > This changes the parameter vec_stmt to vec_stmt_p, and propagates that > change to can_vectorize_live_stmts. > > > Bootstrapped and reg-tested on x86_64-pc-linux-gnu. > Is it OK for trunk? OK. Some more refactoring is in order here but it's an improvement. Thanks, Richard.
Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
On Thu, 10 Oct 2019, Andre Vieira (lists) wrote: > Hi, > > After all the discussions and respins I now believe this patch is close to > what we envisioned. > > This patch achieves two things when vect-epilogues-nomask=1: > 1) It analyzes the original loop for each supported vector size and saves this > analysis per loop, as well as the vector sizes we know we can vectorize the > loop for. > 2) When loop versioning it uses the 'skip_vector' code path to vectorize the > epilogue, and uses the lowest versioning threshold between the main and > epilogue's. > > As side effects of this patch I also changed ensure_base_align to only update > the alignment if the new alignment is lower than the current one. This > function already did that if the object was a symbol, now it behaves this way > for any object. > > I bootstrapped this patch with both vect-epilogues-nomask turned on and off on > x86_64 (AVX512) and aarch64. Regression tests looked good. > > Is this OK for trunk? + + /* Keep track of vector sizes we know we can vectorize the epilogue with. */ + vector_sizes epilogue_vsizes; }; please don't enlarge struct loop, instead track this somewhere in the vectorizer (in loop_vinfo? I see you already have epilogue_vinfos there - so the loop_vinfo simply lacks convenient access to the vector_size?) I don't see any use that could be trivially adjusted to look at a loop_vinfo member instead. For the vect_update_inits_of_drs this means that we'd possibly do less CSE. Not sure if really an issue. You use LOOP_VINFO_EPILOGUE_P sometimes and sometimes LOOP_VINFO_ORIG_LOOP_INFO, please change predicates to LOOP_VINFO_EPILOGUE_P. @@ -2466,15 +2461,62 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, else niters_prolog = build_int_cst (type, 0); + loop_vec_info epilogue_vinfo = NULL; + if (vect_epilogues) +{ ... + vect_epilogues = false; +} + I don't understand what all this does - it clearly needs a comment. Maybe the overall comment of the function should be amended with an overview of how we handle [multiple] epilogue loop vectorization? + + if (epilogue_any_upper_bound && prolog_peeling >= 0) + { + epilog->any_upper_bound = true; + epilog->nb_iterations_upper_bound = eiters + 1; + } + comment missing. How can prolog_peeling be < 0? We likely didn't set the upper bound because we don't know it in the case we skipped the vector loop (skip_vector)? So make sure to not introduce wrong-code issues here - maybe do this optimization as followup? class loop * -vect_loop_versioning (loop_vec_info loop_vinfo, - unsigned int th, bool check_profitability, - poly_uint64 versioning_threshold) +vect_loop_versioning (loop_vec_info loop_vinfo) { class loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop; class loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo); @@ -2988,10 +3151,15 @@ vect_loop_versioning (loop_vec_info loop_vinfo, bool version_align = LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo); bool version_alias = LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo); bool version_niter = LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo); + poly_uint64 versioning_threshold += LOOP_VINFO_VERSIONING_THRESHOLD (loop_vinfo); tree version_simd_if_cond = LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND (loop_vinfo); + unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo); - if (check_profitability) + if (th >= vect_vf_for_cost (loop_vinfo) + && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) + && !ordered_p (th, versioning_threshold)) cond_expr = fold_build2 (GE_EXPR, boolean_type_node, scalar_loop_iters, build_int_cst (TREE_TYPE (scalar_loop_iters), th - 1)); split out this refactoring - preapproved. @@ -1726,7 +1729,13 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo) return 0; } - HOST_WIDE_INT estimated_niter = estimated_stmt_executions_int (loop); + HOST_WIDE_INT estimated_niter = -1; + + if (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) +estimated_niter + = vect_vf_for_cost (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) - 1; + if (estimated_niter == -1) +estimated_niter = estimated_stmt_executions_int (loop); if (estimated_niter == -1) estimated_niter = likely_max_stmt_executions_int (loop); if (estimated_niter != -1 it's clearer if the old code is completely in a else {} path even though vect_vf_for_cost - 1 should never be -1. +/* Decides whether we need to create an epilogue loop to handle + remaining scalar iterations and sets PEELING_FOR_NITERS accordingly. */ + +void +determine_peel_for_niter (loop_vec_info loop_vinfo) +{ + extra vertical space + unsigned HOST_WIDE_INT const_vf; + HOST_WIDE_INT max_niter if it's a 1:1 copy outlined then split it out - preapproved (so further rev
[PATCH] Cleanup parameter of vectorizable_live_operation
Hi Richard, I became aware of this while looking at the -Wshadow=compatible-local warnings. The function vectorizable_live_operation uses a parameter called "vec_stmt" that is shadowed by something also called "vec_stmt". But in this case, the vec_stmt actually only used as a boolean, i.e. pointer is NULL or not. This changes the parameter vec_stmt to vec_stmt_p, and propagates that change to can_vectorize_live_stmts. Bootstrapped and reg-tested on x86_64-pc-linux-gnu. Is it OK for trunk? Thanks Bernd. 2019-10-11 Bernd Edlinger * tree-vect-loop.c (vect_analyze_loop_operations): Adjust call to vectorizable_live_operation. (vectorizable_live_operation): Adjust parameters. * tree-vect-stmts.c (vect_init_vector, vect_gen_widened_results_half): Fix typo in function comment. (can_vectorize_live_stmts): Adjust function comment. Adjust parameters. Adjust call to vectorizable_live_operation. (vect_analyze_stmt): Adjust call to can_vectorize_live_stmts. (vect_transform_stmt): Adjust function comment. Adjust call to can_vectorize_live_stmts. * tree-vectorizer.h (vectorizable_live_operation): Adjust parameters. Index: gcc/tree-vect-loop.c === --- gcc/tree-vect-loop.c (revision 276843) +++ gcc/tree-vect-loop.c (working copy) @@ -1566,7 +1566,7 @@ vect_analyze_loop_operations (loop_vec_info loop_v && STMT_VINFO_LIVE_P (stmt_info) && !PURE_SLP_STMT (stmt_info)) ok = vectorizable_live_operation (stmt_info, NULL, NULL, NULL, - -1, NULL, &cost_vec); + -1, false, &cost_vec); if (!ok) return opt_result::failure_at (phi, @@ -7628,9 +7628,9 @@ vectorizable_induction (stmt_vec_info stmt_info, bool vectorizable_live_operation (stmt_vec_info stmt_info, - gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, + gimple_stmt_iterator *gsi, slp_tree slp_node, slp_instance slp_node_instance, - int slp_index, stmt_vec_info *vec_stmt, + int slp_index, bool vec_stmt_p, stmt_vector_for_cost *) { loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); @@ -7652,7 +7652,7 @@ vectorizable_live_operation (stmt_vec_info stmt_in validity so just trigger the transform here. */ if (STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info))) { - if (!vec_stmt) + if (!vec_stmt_p) return true; if (slp_node) { @@ -7721,7 +7721,7 @@ vectorizable_live_operation (stmt_vec_info stmt_in } } - if (!vec_stmt) + if (!vec_stmt_p) { /* No transformation required. */ if (LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo)) Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 276843) +++ gcc/tree-vect-stmts.c (working copy) @@ -1449,7 +1449,7 @@ vect_init_vector_1 (stmt_vec_info stmt_vinfo, gimp Insert a new stmt (INIT_STMT) that initializes a new variable of type TYPE with the value VAL. If TYPE is a vector type and VAL does not have vector type a vector with all elements equal to VAL is created first. - Place the initialization at BSI if it is not NULL. Otherwise, place the + Place the initialization at GSI if it is not NULL. Otherwise, place the initialization at the loop preheader. Return the DEF of INIT_STMT. It will be used in the vectorization of STMT_INFO. */ @@ -4484,7 +4484,7 @@ vectorizable_simd_clone_call (stmt_vec_info stmt_i Create a vector stmt whose code, type, number of arguments, and result variable are CODE, OP_TYPE, and VEC_DEST, and its arguments are - VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI. + VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at GSI. In the case that CODE is a CALL_EXPR, this means that a call to DECL needs to be created (DECL is a function-decl of a target-builtin). STMT_INFO is the original scalar stmt that we are vectorizing. */ @@ -10474,12 +10474,12 @@ vectorizable_comparison (stmt_vec_info stmt_info, /* If SLP_NODE is nonnull, return true if vectorizable_live_operation can handle all live statements in the node. Otherwise return true if STMT_INFO is not live or if vectorizable_live_operation can handle it. - GSI and VEC_STMT are as for vectorizable_live_operation. */ + GSI and VEC_STMT_P are as for vectorizable_live_operation. */ static bool can_vectorize_live_stmts (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, slp_tree slp_node, slp_instance slp_node_instance, - stmt_vec_info *vec_stmt, + bool vec_stmt_p, stmt_vector_for_cost *cost_vec) { if (slp_node) @@ -10491,7 +10491,7 @@ can_vectorize_live_stmts (stmt_vec_info stmt_info, if (STMT_VINFO_LIVE_P (slp_stmt_info) && !vectorizable_live_operation (slp_stmt_info, gsi, slp_node, slp_node_instance, i, - vec_stmt, cost_vec)) + vec_stmt_p, cost_vec))
Re: [PATCH 1/2][GCC][RFC][middle-end]: Add SLP pattern matcher.
On Tue, 8 Oct 2019, Tamar Christina wrote: > Hi Richi, > > Thanks for the review, I've added some comments inline. > > The 10/07/2019 12:15, Richard Biener wrote: > > On Tue, 1 Oct 2019, Tamar Christina wrote: > > > > > Hi All, > > > > > > This adds a framework to allow pattern matchers to be written at based on > > > the > > > SLP tree. The difference between this one and the one in > > > tree-vect-patterns is > > > that this matcher allows the matching of an arbitrary number of parallel > > > statements and replacing of an arbitrary number of children or statements. > > > > > > Any relationship created by the SLP pattern matcher will be undone if SLP > > > fails. > > > > > > The pattern matcher can also cancel all permutes depending on what the > > > pattern > > > requested it to do. As soon as one pattern requests the permutes to be > > > cancelled all permutes are cancelled. > > > > > > Compared to the previous pattern matcher this one will work for an > > > arbitrary > > > group size and will match at any arbitrary node in the SLP tree. The only > > > requirement is that the entire node is matched or rejected. > > > > > > vect_build_slp_tree_1 is a bit more lenient in what it accepts as > > > "compatible > > > operations" but the matcher cannot be because in cases where you match > > > the order > > > of the operands may be changed. So all operands must be changed or none. > > > > > > Furthermore the matcher relies on canonization of the operations inside > > > the > > > SLP tree and on the fact that floating math operations are not > > > commutative. > > > This means that matching a pattern does not need to try all alternatives > > > or > > > combinations and that arguments will always be in the same order if it's > > > the > > > same operation. > > > > > > The pattern matcher also ignored uninteresting nodes such as type casts, > > > loads > > > and stores. Doing so is essential to keep the runtime down. > > > > > > Each matcher is allowed a post condition that can be run to perform any > > > changes > > > to the SLP tree as needed before the patterns are created and may also > > > abort > > > the creation of the patterns. > > > > > > When a pattern is matched it is not immediately created but instead it is > > > deferred until all statements in the node have been analyzed. Only if > > > all post > > > conditions are true, and all statements will be replaced will the > > > patterns be > > > created in batch. This allows us to not have to undo any work if the > > > pattern > > > fails but also makes it so we traverse the tree only once. > > > > > > When a new pattern is created it is a marked as a pattern to the > > > statement it is > > > replacing and be marked as used in the current SLP scope. If SLP fails > > > then > > > relationship is undone and the relevancy restored. > > > > > > Each pattern matcher can detect any number of pattern it wants. The only > > > constraint is that the optabs they produce must all have the same arity. > > > > > > The pattern matcher supports instructions that have no scalar form as they > > > are added as pattern statements to the stmt. The BB is left untouched and > > > so the scalar loop is untouched. > > > > > > Bootstrapped on aarch64-none-linux-gnu and no issues. > > > No regression testing done yet. > > > > If you split out the introduction of SLP_TREE_REF_COUNT you can commit > > that right now (sorry for being too lazy there...). > > > > I'll split those off :) > > > One overall comment - you do pattern matching after SLP tree > > creation (good) but still do it before the whole SLP graph is > > created (bad). Would it be possible to instead do it as a separate > > phase in vect_analyze_slp, looping over all instances (the instances > > present entries into the single unified SLP graph now), avoiding > > to visit "duplicates"? > > > > It should be, the only issue I can see is that build SLP may fail because of > an unsupported permute, or because it can use load lanes. If I'm > understanding > it correctly you wouldn't get SLP vectorization in those cases so then the > matching > can't work? So it would limit it a it more. True. As said later ideally the pattern matching woudl happen before or rather as part of SLP building so that the operations then actually match (and we then don't need that plus/minus SLP_TREE_TWO_OPERATORS handling). At the moment it sits in a quite awkward place which may contribute to its complexity. > > In patch 1/2 I see references (mostly in variable names) to > > "complex" which is IMHO too specific. > > > > Sorry, missed those during my cleanup. > > > I'd also think that a useful first pattern to support would be > > what we do with SLP_TREE_TWO_OPERATORS, where code generation > > inserts extra blends. I have yet to dive into the complex pattern > > details to see if that's feasible or if you maybe re-use that... > > Oh, I hadn't thought of that. I'll take a look.
[PATCH] [MIPS] Replace insert with insve for floating-point values
Currently, when a function argument of type double gets loaded into a vector register on a 32-bit target, it is firstly reloaded into two general purpose registers, and then loaded into a vector register using two insert.w instructions. This patch swaps the two insert.w instructions with one insve.d instruction, which operates on 64-bit floating point registers, so the value can be reloaded into a FPR. This is done by adding another alternative of constraints for msa_insert_ pattern, which covers the case of a floating-point input value. gcc/ChangeLog: * config/mips/mips-msa.md (msa_insert_): Add an alternative which covers the floating-point input value. Also forbid the split of insert.d pattern for floating-point values. gcc/testsuite/ChangeLog: * gcc.target/mips/msa-insert-split.c: New test. --- gcc/config/mips/mips-msa.md | 11 --- gcc/testsuite/gcc.target/mips/msa-insert-split.c | 16 2 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/msa-insert-split.c diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md index 929646d..628423d 100644 --- a/gcc/config/mips/mips-msa.md +++ b/gcc/config/mips/mips-msa.md @@ -436,14 +436,17 @@ }) (define_insn "msa_insert_" - [(set (match_operand:MSA 0 "register_operand" "=f") + [(set (match_operand:MSA 0 "register_operand" "=f,f") (vec_merge:MSA (vec_duplicate:MSA - (match_operand: 1 "reg_or_0_operand" "dJ")) - (match_operand:MSA 2 "register_operand" "0") + (match_operand: 1 "reg_or_0_operand" "dJ,f")) + (match_operand:MSA 2 "register_operand" "0,0") (match_operand 3 "const__operand" "")))] "ISA_HAS_MSA" { + if (which_alternative == 1) +return "insve.\t%w0[%y3],%w1[0]"; + if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode)) return "#"; else @@ -462,6 +465,8 @@ "reload_completed && ISA_HAS_MSA && !TARGET_64BIT" [(const_int 0)] { + if (REG_P (operands[1]) && FP_REG_P (REGNO (operands[1]))) +FAIL; mips_split_msa_insert_d (operands[0], operands[2], operands[3], operands[1]); DONE; }) diff --git a/gcc/testsuite/gcc.target/mips/msa-insert-split.c b/gcc/testsuite/gcc.target/mips/msa-insert-split.c new file mode 100644 index 000..50f3b8a --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/msa-insert-split.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mfp64 -mhard-float -mmsa" } */ +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */ + +typedef double v2f64 __attribute__ ((vector_size (16))); + +void foo (double* arr, v2f64* vec) +{ + v2f64 v; + v[0] = arr[0]; + v[1] = arr[1]; + *vec = v; +} + +/* { dg-final { scan-assembler-not "insert.w" } } */ +/* { dg-final { scan-assembler-times "insve.d" 2 } } */ -- 2.7.4
Re: Avid ggc_alloc and push_cfun during LTO streaming
> On Fri, Oct 11, 2019 at 11:03 AM Jan Hubicka wrote: > > > > Hi, > > this patch prevents tree creation druing WPA stream out (to avoid > > touching pages and triggering COW). It fixes the following > > - gimple streamer produces MEM_REF wrappings for global decls. > >This is to preserve the type of access and is not necessary for > >WPA->LTRANS streaming when decls ar eno longer going to be merged. > > - we renumber stmt uids during streaming WPA summaries > > - loop optimizer is initialized in output_function. > > After testing the patch I noticed that output_function does one extra > > renumbering of stmts. This seems quite broken and I will fix it > > incrementally. > > > > Bootstrapped/regtested x86_64-linux, comitted. > > Huh. Why do we stream function bodies at WPA time at all? > We should already have input sections we can copy/remap? > > That is, why does gcc_assert (!flag_wpa) in output_function trip? Because we produce new bodies (such as merged ctors) and also bodies read by ipa-ICF to memory are then streamed form in-memory copies rather than the pickled sections. The overall idea is to keep things flexble enough so IPA passes can produce new functions and also load bodies to memory and modify them. The second is useful for development where one can first get pass working modifying at WPA time and then work on optimization summaries. I also think we may end up with passes that does more involved modifications on small portions of program (perhaps matric reorg could) I am also using this to play with re-running early opts at WPA time and seeing how many optimizations are missed by not early inlining/propagating dead code via nothrow etc etc. Honza
Re: Type representation in CTF and DWARF
On Fri, Oct 11, 2019 at 01:23:12PM +0200, Richard Biener wrote: > > (coreutils-0.22) > > .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf > > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) > > ls 30616 |1136 |21098 | 26240 > > | 0.62 > > pwd 10734 |788|10433 | 13929 > > | 0.83 > > groups 10706 |811|10249 | 13378 > > | 0.80 > > > > (emacs-26.3) > > .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf > > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) > > emacs-26.3.1 674657 |6402 | 273963 | 273910 > > | 0.33 > > > > I chose to account for 50% of .debug_str because at this point, it will be > > unfair to not account for them. Actually, one could even argue that upto 70% > > of the .debug_str are names of entities. CTF section sizes do include the > > CTF > > string tables. > > > > Across coreutils, I see a geomean of 0.73 (ratio of > > .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the > > "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger > > footprint than CTF (with 50% of .debug_str accounted for). > > I'm not convinced this "improvement" in size is worth maintainig another > debug-info format much less since it lacks desirable features right now > and thus evaluation is tricky. > > At least you can improve dwarf size considerably with a low amount of work. > > I suspect another factor where dwarf is bigger compared to CTF is that dwarf > is recording typedef names as well as qualified type variants. But maybe > CTF just has a more compact representation for the bits it actually > implements. Does CTF record automatic variables in functions, or just global variables? If only the latter, it would be fair to also disable addition of local variable DIEs, lexical blocks. Does CTF record inline functions? Again, if not, it would be fair to not emit that either in .debug_info. -gno-record-gcc-switches so that the compiler command line is not encoded in the debug info (unless it is in CTF). DWARF is highly extensible format, what exactly is and is not emitted is something that consumers can choose. Yes, DWARF can be large, but mainly because it provides a lot of information, the actual representation has been designed with size concerns in mind and newer versions of the standard keep improving that too. Jakub
Re: Avid ggc_alloc and push_cfun during LTO streaming
On Fri, Oct 11, 2019 at 11:03 AM Jan Hubicka wrote: > > Hi, > this patch prevents tree creation druing WPA stream out (to avoid > touching pages and triggering COW). It fixes the following > - gimple streamer produces MEM_REF wrappings for global decls. >This is to preserve the type of access and is not necessary for >WPA->LTRANS streaming when decls ar eno longer going to be merged. > - we renumber stmt uids during streaming WPA summaries > - loop optimizer is initialized in output_function. > After testing the patch I noticed that output_function does one extra > renumbering of stmts. This seems quite broken and I will fix it > incrementally. > > Bootstrapped/regtested x86_64-linux, comitted. Huh. Why do we stream function bodies at WPA time at all? We should already have input sections we can copy/remap? That is, why does gcc_assert (!flag_wpa) in output_function trip? Richard. > > * gimple-streamer-out.c (output_gimple_stmt): Add explicit function > parameter. > * lto-streamer-out.c: Include tree-dfa.h. > (output_cfg): Do not use cfun. > (lto_prepare_function_for_streaming): New. > (output_function): Do not push cfun; do not initialize loop optimizer. > * lto-streamer.h (lto_prepare_function_for_streaming): Declare. > * passes.c (ipa_write_summaries): Use it. > (ipa_write_optimization_summaries): Do not modify bodies. > * tree-dfa.c (renumber_gimple_stmt_uids): Add function parameter. > * tree.dfa.h (renumber_gimple_stmt_uids): Update prototype. > * tree-ssa-dse.c (pass_dse::execute): Update use of > renumber_gimple_stmt_uids. > * tree-ssa-math-opts.c (pass_optimize_widening_mul::execute): > Likewise. > > * lto.c (lto_wpa_write_files): Prepare all bodies for streaming. > Index: gimple-streamer-out.c > === > --- gimple-streamer-out.c (revision 276850) > +++ gimple-streamer-out.c (working copy) > @@ -57,7 +57,7 @@ output_phi (struct output_block *ob, gph > /* Emit statement STMT on the main stream of output block OB. */ > > static void > -output_gimple_stmt (struct output_block *ob, gimple *stmt) > +output_gimple_stmt (struct output_block *ob, struct function *fn, gimple > *stmt) > { >unsigned i; >enum gimple_code code; > @@ -80,7 +80,7 @@ output_gimple_stmt (struct output_block > as_a (stmt)), >1); >bp_pack_value (&bp, gimple_has_volatile_ops (stmt), 1); > - hist = gimple_histogram_value (cfun, stmt); > + hist = gimple_histogram_value (fn, stmt); >bp_pack_value (&bp, hist != NULL, 1); >bp_pack_var_len_unsigned (&bp, stmt->subcode); > > @@ -139,7 +139,7 @@ output_gimple_stmt (struct output_block > so that we do not have to deal with type mismatches on > merged symbols during IL read in. The first operand > of GIMPLE_DEBUG must be a decl, not MEM_REF, though. */ > - if (op && (i || !is_gimple_debug (stmt))) > + if (!flag_wpa && op && (i || !is_gimple_debug (stmt))) > { > basep = &op; > if (TREE_CODE (*basep) == ADDR_EXPR) > @@ -147,7 +147,7 @@ output_gimple_stmt (struct output_block > while (handled_component_p (*basep)) > basep = &TREE_OPERAND (*basep, 0); > if (VAR_P (*basep) > - && !auto_var_in_fn_p (*basep, current_function_decl) > + && !auto_var_in_fn_p (*basep, fn->decl) > && !DECL_REGISTER (*basep)) > { > bool volatilep = TREE_THIS_VOLATILE (*basep); > @@ -228,7 +228,7 @@ output_bb (struct output_block *ob, basi > print_gimple_stmt (streamer_dump_file, stmt, 0, TDF_SLIM); > } > > - output_gimple_stmt (ob, stmt); > + output_gimple_stmt (ob, fn, stmt); > > /* Emit the EH region holding STMT. */ > region = lookup_stmt_eh_lp_fn (fn, stmt); > Index: lto/lto.c > === > --- lto/lto.c (revision 276850) > +++ lto/lto.c (working copy) > @@ -304,6 +304,13 @@ lto_wpa_write_files (void) > >timevar_push (TV_WHOPR_WPA_IO); > > + cgraph_node *node; > + /* Do body modifications needed for streaming before we fork out > + worker processes. */ > + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > +if (gimple_has_body_p (node->decl)) > + lto_prepare_function_for_streaming (node); > + >/* Generate a prefix for the LTRANS unit files. */ >blen = strlen (ltrans_output_list); >temp_filename = (char *) xmalloc (blen + sizeof ("2147483648.o")); > Index: lto-streamer-out.c > === > --- lto-streamer-out.c (revision 276850) > +++ lto-streamer-out.c (working copy) > @@ -43,6 +43,7 @@ along with GCC; see the fil
Re: [PR47785] COLLECT_AS_OPTIONS
On Fri, Oct 11, 2019 at 6:15 AM Kugan Vivekanandarajah wrote: > > Hi Richard, > Thanks for the review. > > On Wed, 2 Oct 2019 at 20:41, Richard Biener > wrote: > > > > On Wed, Oct 2, 2019 at 10:39 AM Kugan Vivekanandarajah > > wrote: > > > > > > Hi, > > > > > > As mentioned in the PR, attached patch adds COLLECT_AS_OPTIONS for > > > passing assembler options specified with -Wa, to the link-time driver. > > > > > > The proposed solution only works for uniform -Wa options across all > > > TUs. As mentioned by Richard Biener, supporting non-uniform -Wa flags > > > would require either adjusting partitioning according to flags or > > > emitting multiple object files from a single LTRANS CU. We could > > > consider this as a follow up. > > > > > > Bootstrapped and regression tests on arm-linux-gcc. Is this OK for trunk? > > > > While it works for your simple cases it is unlikely to work in practice > > since > > your implementation needs the assembler options be present at the link > > command line. I agree that this might be the way for people to go when > > they face the issue but then it needs to be documented somewhere > > in the manual. > > > > That is, with COLLECT_AS_OPTION (why singular? I'd expected > > COLLECT_AS_OPTIONS) available to cc1 we could stream this string > > to lto_options and re-materialize it at link time (and diagnose mismatches > > even if we like). > OK. I will try to implement this. So the idea is if we provide > -Wa,options as part of the lto compile, this should be available > during link time. Like in: > > arm-linux-gnueabihf-gcc -march=armv7-a -mthumb -O2 -flto > -Wa,-mimplicit-it=always,-mthumb -c test.c > arm-linux-gnueabihf-gcc -flto test.o > > I am not sure where should we stream this. Currently, cl_optimization > has all the optimization flag provided for compiler and it is > autogenerated and all the flags are integer values. Do you have any > preference or example where this should be done. In lto_write_options, I'd simply append the contents of COLLECT_AS_OPTIONS (with -Wa, prepended to each of them), then recover them in lto-wrapper for each TU and pass them down to the LTRANS compiles (if they agree for all TUs, otherwise I'd warn and drop them). Richard. > Thanks, > Kugan > > > > > > > Richard. > > > > > Thanks, > > > Kugan > > > > > > > > > gcc/ChangeLog: > > > > > > 2019-10-02 kugan.vivekanandarajah > > > > > > PR lto/78353 > > > * gcc.c (putenv_COLLECT_AS_OPTION): New to set COLLECT_AS_OPTION in env. > > > (driver::main): Call putenv_COLLECT_AS_OPTION. > > > * lto-wrapper.c (run_gcc): use COLLECT_AS_OPTION from env. > > > > > > gcc/testsuite/ChangeLog: > > > > > > 2019-10-02 kugan.vivekanandarajah > > > > > > PR lto/78353 > > > * gcc.target/arm/pr78353-1.c: New test. > > > * gcc.target/arm/pr78353-2.c: New test.
Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
On Thu, Oct 10, 2019 at 8:06 PM Richard Sandiford wrote: > > Wilco Dijkstra writes: > > ping > > > > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing > > bitfields by their declared type, which results in better codegeneration on > > practically > > any target. > > The name is confusing, but the documentation looks accurate to me: > > Define this macro as a C expression which is nonzero if accessing less > than a word of memory (i.e.@: a @code{char} or a @code{short}) is no > faster than accessing a word of memory, i.e., if such access > require more than one instruction or if there is no difference in cost > between byte and (aligned) word loads. > > When this macro is not defined, the compiler will access a field by > finding the smallest containing object; when it is defined, a fullword > load will be used if alignment permits. Unless bytes accesses are > faster than word accesses, using word accesses is preferable since it > may eliminate subsequent memory access if subsequent accesses occur to > other fields in the same word of the structure, but to different bytes. > > > I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS > > from GCC as it's confusing and useless. > > I disagree. Some targets can optimise single-bit operations when the > container is a byte, for example. There's also less chance of store-to-load forwarding issues when _not_ forcing word accesses. Because I think that we do not use the same macro to change code generation for writes (where a larger access requires a read-modify-write cycle). But this also means that a decision with no information on context is going to be flawed. Still generally I'd do smaller reads if they are not significantly more expensive on the target but larger writes (with the same constraint) just for the STLF issue. Unfortunately the macro docs don't say if it is applicable to reads or writes or both... > > OK for commit until we get rid of it? > > > > ChangeLog: > > 2017-11-17 Wilco Dijkstra > > > > gcc/ > > * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1. > > -- > > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > > index > > 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078 > > 100644 > > --- a/gcc/config/aarch64/aarch64.h > > +++ b/gcc/config/aarch64/aarch64.h > > @@ -769,14 +769,9 @@ typedef struct > > if given data not on the nominal alignment. */ > > #define STRICT_ALIGNMENTTARGET_STRICT_ALIGN > > > > -/* Define this macro to be non-zero if accessing less than a word of > > - memory is no faster than accessing a word of memory, i.e., if such > > - accesses require more than one instruction or if there is no > > - difference in cost. > > - Although there's no difference in instruction count or cycles, > > - in AArch64 we don't want to expand to a sub-word to a 64-bit access > > - if we don't have to, for power-saving reasons. */ > > -#define SLOW_BYTE_ACCESS 0 > > +/* Contrary to all documentation, this enables wide bitfield accesses, > > + which results in better code when accessing multiple bitfields. */ > > +#define SLOW_BYTE_ACCESS 1 > > > > #define NO_FUNCTION_CSE 1 > > I agree this makes sense from a performance point of view, and I think > the existing comment is admitting that AArch64 has the properties that > would normally cause us to set SLOW_BYTE_ACCESS to 1. But the comment > is claiming that there's a power-saving benefit to leaving it off. > > It seems like a weak argument though. Bitfields are used when several > values are packed into the same integer, so there's a high likelihood > we'll need the whole integer anyway. Avoiding the redundancies described > in the documention should if anything help with power usage. > > Maybe the main concern was using a 64-bit access when a 32-bit one > would do, since 32-bit bitfield containers are the most common. But the: > > && GET_MODE_ALIGNMENT (mode) <= align > > condition in get_best_mode should avoid that unless the 64-bit > access is naturally aligned. (See the big comment above for the > pros and cons of this.) > > So I think we should change the macro value unless anyone can back up the > power-saving claim. Let's wait a week (more) to see if anyone objects. > > The comment change isn't OK though. Please keep the first paragraph > and just reword the second to say that's why we set the value to 1. > > Thanks, > Richard
Re: Type representation in CTF and DWARF
On Fri, Oct 11, 2019 at 1:06 AM Indu Bhagat wrote: > > > > On 10/09/2019 12:49 AM, Jakub Jelinek wrote: > > On Wed, Oct 09, 2019 at 09:41:09AM +0200, Richard Biener wrote: > >> There's a mechanism to get type (and decl - I suppose CTF also > >> contains debug info > >> for function declarations not only its type?) info as part of early > >> debug generation. > >> The attached "hack" simply mangles dwarf2out to output this early info as > >> the > >> only debug info (only verified on a small .c file). We still have things > >> like > >> file, line and column numbers for entities (not sure if CTF has those). > >> > >> It should be possible to "hide" the hack behind a -gdwarf-like-ctf or > >> similar. > >> I guess -g0.5 isn't desirable and we've taken both -g0 and -g1 already... > >> (and -g1 doesn't include types but just decls). > > Yeah. And if location info isn't in CTF, you can as well add an early > > return in add_src_coords_attributes, like it has one for UNKNOWN_LOCATION > > already. Or if it is there, but just file/line and not column, you can use > > -gno-column-info. As has been mentioned earlier, you can use dwz utility > > post-linking instead of -fdebug-types-section. > > > > Jakub > > Thanks for your pointers. > > CTF does not encode location information. So, I used early exit in the > add_src_coords_attributes to avoid generation of location info (file, line, > column). To answer Richard's question, CTF does have type debug info > for function declarations and the argument types. So I think with these > changes, both CTF and DWARF generation will emit debug info for the same set > of > types and decl. > > Compile with -g -gdwarf-like-ctf and use dwz -o (using > dwz compiled from the master branch) on the generated binaries: > > (coreutils-0.22) > .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) > ls 30616 |1136 |21098 | 26240 > | 0.62 > pwd 10734 |788|10433 | 13929 > | 0.83 > groups 10706 |811|10249 | 13378 > | 0.80 > > (emacs-26.3) > .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4)) > emacs-26.3.1 674657 |6402 | 273963 | 273910 > | 0.33 > > I chose to account for 50% of .debug_str because at this point, it will be > unfair to not account for them. Actually, one could even argue that upto 70% > of the .debug_str are names of entities. CTF section sizes do include the CTF > string tables. > > Across coreutils, I see a geomean of 0.73 (ratio of > .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the > "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger > footprint than CTF (with 50% of .debug_str accounted for). I'm not convinced this "improvement" in size is worth maintainig another debug-info format much less since it lacks desirable features right now and thus evaluation is tricky. At least you can improve dwarf size considerably with a low amount of work. I suspect another factor where dwarf is bigger compared to CTF is that dwarf is recording typedef names as well as qualified type variants. But maybe CTF just has a more compact representation for the bits it actually implements. Richard. > Indu >
Re: Remove splay tree form ipa-reference.c
On Thu, Oct 10, 2019 at 9:25 PM Jan Hubicka wrote: > > Hi, > ipa-reference uses splay tree to map DECL_UIDs to trees. On other > places we use hash-maps which are more sutiable. A simple hash-table/jhash-set does it as well if the UID is the same as DECL_UID. You can use the tree_decl_hash hash-traits for that. decl lookup can then be done like tree_decl_minimal d; d.uid = uid; find_slot_with_hash (&d, uid) or so. There must be existing users that got converted from old-style hashes to new templated ones but I can't find them right now :/ Richard. > Bootstrapped/regtested x86_64-linux, comitted. > > Honza > * ipa-reference.c: Do not include splay-tree.h > (reference_vars_to_consider): Turn to hash map. > (get_static_name, ipa_init, analyze_function, propagate, > stream_out_bitmap, ipa_reference_write_optimization_summary, > ipa_reference_write_optimization_summary): Update. > Index: ipa-reference.c > === > --- ipa-reference.c (revision 276849) > +++ ipa-reference.c (working copy) > @@ -46,7 +46,6 @@ along with GCC; see the file COPYING3. > #include "cgraph.h" > #include "data-streamer.h" > #include "calls.h" > -#include "splay-tree.h" > #include "ipa-utils.h" > #include "ipa-reference.h" > #include "symbol-summary.h" > @@ -92,9 +91,11 @@ struct ipa_reference_vars_info_d > > typedef struct ipa_reference_vars_info_d *ipa_reference_vars_info_t; > > -/* This splay tree contains all of the static variables that are > +/* This map contains all of the static variables that are > being considered by the compilation level alias analysis. */ > -static splay_tree reference_vars_to_consider; > +typedef hash_map, tree> > +reference_vars_to_consider_t; > +static reference_vars_to_consider_t *reference_vars_to_consider; > > /* Set of all interesting module statics. A bit is set for every module > static we are considering. This is added to the local info when asm > @@ -272,9 +273,7 @@ is_proper_for_analysis (tree t) > static const char * > get_static_name (int index) > { > - splay_tree_node stn = > -splay_tree_lookup (reference_vars_to_consider, index); > - return fndecl_name ((tree)(stn->value)); > + return fndecl_name (*reference_vars_to_consider->get (index)); > } > > /* Dump a set of static vars to FILE. */ > @@ -416,7 +415,7 @@ ipa_init (void) >ipa_init_p = true; > >if (dump_file) > -reference_vars_to_consider = splay_tree_new (splay_tree_compare_ints, 0, > 0); > +reference_vars_to_consider = new reference_vars_to_consider_t(251); > >bitmap_obstack_initialize (&local_info_obstack); >bitmap_obstack_initialize (&optimization_summary_obstack); > @@ -476,9 +475,8 @@ analyze_function (struct cgraph_node *fn > && bitmap_set_bit (all_module_statics, ipa_reference_var_uid (var))) > { > if (dump_file) > - splay_tree_insert (reference_vars_to_consider, > - ipa_reference_var_uid (var), > - (splay_tree_value)var); > + reference_vars_to_consider->put (ipa_reference_var_uid (var), > + var); > } >switch (ref->use) > { > @@ -898,7 +896,7 @@ propagate (void) > } > >if (dump_file) > -splay_tree_delete (reference_vars_to_consider); > +delete reference_vars_to_consider; >reference_vars_to_consider = NULL; >return remove_p ? TODO_remove_functions : 0; > } > @@ -968,8 +966,7 @@ stream_out_bitmap (struct lto_simple_out > return; >EXECUTE_IF_AND_IN_BITMAP (bits, ltrans_statics, 0, index, bi) > { > - tree decl = (tree)splay_tree_lookup (reference_vars_to_consider, > - index)->value; > + tree decl = *reference_vars_to_consider->get (index); >lto_output_var_decl_index (ob->decl_state, ob->main_stream, decl); > } > } > @@ -987,7 +984,7 @@ ipa_reference_write_optimization_summary >auto_bitmap ltrans_statics; >int i; > > - reference_vars_to_consider = splay_tree_new (splay_tree_compare_ints, 0, > 0); > + reference_vars_to_consider = new reference_vars_to_consider_t (251); > >/* See what variables we are interested in. */ >for (i = 0; i < lto_symtab_encoder_size (encoder); i++) > @@ -1001,9 +998,8 @@ ipa_reference_write_optimization_summary > { > tree decl = vnode->decl; > bitmap_set_bit (ltrans_statics, ipa_reference_var_uid (decl)); > - splay_tree_insert (reference_vars_to_consider, > -ipa_reference_var_uid (decl), > -(splay_tree_value)decl); > + reference_vars_to_consider->put > +(ipa_reference_var_uid (decl), decl); > ltrans_statics_bitcount ++; > } > } > @@ -1045,7 +1041,7 @@ ipa_reference_write_optimization_summary >
Re: [SH][committed] Fix PR 88630
On Fri, 2019-10-11 at 00:36 +0900, Oleg Endo wrote: > Hi, > > When we did the refactoring of SH's FPSCR handling back in GCC 5, we > missed one thing regarding ST-40, it seems. > > The attached patch fixes the issue as confirmed on the real > hardware. > Also tested on sh-sim with > make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,- > m2a/-mb,-m4/-ml,-m4/-mb}" > > > Committed to trunk, GCC 9, GCC 8 as r276809, r276825, r276837. I forgot about GCC 7. Committed there as well as r276877. Cheers, Oleg
Re: Correctly release ipa-reference summarries
On Thu, Oct 10, 2019 at 9:21 PM Jan Hubicka wrote: > > Hi, > this patch fixes code removing summaries in ipa-reference. As a memory > leak it may make sense to backport this to release branches. Please do so. Richard. > Honza > > * ipa-reference.c (propagate): Fix releasing of IPA summaries. > Index: ipa-reference.c > === > --- ipa-reference.c (revision 276707) > +++ ipa-reference.c (working copy) > @@ -891,15 +889,14 @@ propagate (void) > >bitmap_obstack_release (&local_info_obstack); > > - if (ipa_ref_var_info_summaries == NULL) > + if (ipa_ref_var_info_summaries != NULL) > { >delete ipa_ref_var_info_summaries; >ipa_ref_var_info_summaries = NULL; > } > > - ipa_ref_var_info_summaries = NULL; >if (dump_file) > splay_tree_delete (reference_vars_to_consider); >reference_vars_to_consider = NULL; >return remove_p ? TODO_remove_functions : 0; > }
[C++ Patch] One more DECL_SOURCE_LOCATION
Hi, another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent with the one I used in build_anon_union_vars. Tested x86_64-linux. Thanks, Paolo. / /cp 2019-10-11 Paolo Carlini * decl.c (check_tag_decl): Use DECL_SOURCE_LOCATION. /testsuite 2019-10-11 Paolo Carlini * g++.dg/cpp0x/constexpr-union5.C: Test location(s) too. * g++.dg/diagnostic/bitfld2.C: Likewise. * g++.dg/ext/anon-struct1.C: Likewise. * g++.dg/ext/anon-struct6.C: Likewise. * g++.dg/ext/flexary19.C: Likewise. * g++.dg/ext/flexary9.C: Likewise. * g++.dg/template/error17.C: Likewise. Index: cp/decl.c === --- cp/decl.c (revision 276845) +++ cp/decl.c (working copy) @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs, if (TREE_CODE (declared_type) != UNION_TYPE && !in_system_header_at (input_location)) - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)), +OPT_Wpedantic, "ISO C++ prohibits anonymous structs"); } else Index: testsuite/g++.dg/cpp0x/constexpr-union5.C === --- testsuite/g++.dg/cpp0x/constexpr-union5.C (revision 276845) +++ testsuite/g++.dg/cpp0x/constexpr-union5.C (working copy) @@ -23,16 +23,16 @@ SA((a.i == 42)); struct B { - struct { + struct { // { dg-warning "10:ISO C\\+\\+ prohibits anonymous struct" } int h; -struct { +struct { // { dg-warning "12:ISO C\\+\\+ prohibits anonymous struct" } union { unsigned char i; int j; }; int k; -}; // { dg-warning "anonymous struct" } - }; // { dg-warning "anonymous struct" } +}; + }; int l; constexpr B(): h(1), i(2), k(3), l(4) {} Index: testsuite/g++.dg/diagnostic/bitfld2.C === --- testsuite/g++.dg/diagnostic/bitfld2.C (revision 276845) +++ testsuite/g++.dg/diagnostic/bitfld2.C (working copy) @@ -6,4 +6,4 @@ template struct A struct {} : 2; // { dg-error "expected ';' after struct" "expected" } }; // { dg-error "ISO C.. forbids declaration" "declaration" { target *-*-* } 6 } -// { dg-error "ISO C.. prohibits anonymous" "anonymous" { target *-*-* } 6 } +// { dg-error "10:ISO C.. prohibits anonymous" "anonymous" { target *-*-* } 6 } Index: testsuite/g++.dg/ext/anon-struct1.C === --- testsuite/g++.dg/ext/anon-struct1.C (revision 276845) +++ testsuite/g++.dg/ext/anon-struct1.C (working copy) @@ -19,7 +19,7 @@ char testD[sizeof(C::D) == sizeof(A) ? 1 : -1]; /* GNU extension. */ struct E { - struct { char z; }; /* { dg-error "prohibits anonymous structs" } */ + struct { char z; }; /* { dg-error "10:ISO C\\+\\+ prohibits anonymous structs" } */ char e; }; @@ -45,6 +45,6 @@ char testH[sizeof(H) == 2 * sizeof(A) ? 1 : -1]; /* Make sure __extension__ gets turned back off. */ struct I { - struct { char z; }; /* { dg-error "prohibits anonymous structs" } */ + struct { char z; }; /* { dg-error "10:ISO C\\+\\+ prohibits anonymous structs" } */ char i; }; Index: testsuite/g++.dg/ext/anon-struct6.C === --- testsuite/g++.dg/ext/anon-struct6.C (revision 276845) +++ testsuite/g++.dg/ext/anon-struct6.C (working copy) @@ -3,8 +3,8 @@ struct A { struct - { + { // { dg-error "3:ISO C\\+\\+ prohibits anonymous structs" } struct { static int i; }; // { dg-error "prohibits anonymous structs|non-static data members|unnamed class" } void foo() { i; } // { dg-error "public non-static data" } - }; // { dg-error "prohibits anonymous structs" } + }; }; Index: testsuite/g++.dg/ext/flexary19.C === --- testsuite/g++.dg/ext/flexary19.C(revision 276845) +++ testsuite/g++.dg/ext/flexary19.C(working copy) @@ -146,13 +146,13 @@ struct S16 { int i; - struct { // { dg-warning "invalid use" } + struct { // { dg-warning "10:ISO C\\+\\+ prohibits anonymous struct|invalid use" } // A flexible array as a sole member of an anonymous struct is // rejected with an error in C mode but emits just a pedantic // warning in C++. Other than excessive pedantry there is no // reason to reject it. int a[]; - };// { dg-warning "anonymous struct" } + }; }; struct S17 @@ -177,9 +177,9 @@ struct S19 { int i; - struct { // { dg-warning "invalid use" } + struct { // { dg-warning "10:ISO C\\+\\+ prohibits anonymous struct|in
Re: [PATCH] Fix PR92046
On Fri, 11 Oct 2019, Rainer Orth wrote: > Hi Christophe, > > > On Thu, 10 Oct 2019 at 16:01, Richard Biener wrote: > > > >> > >> The following fixes a few param adjustments that are made based on > >> per-function adjustable flags by moving the adjustments to their > >> users. Semantics change in some minor ways but that's allowed > >> for --params. > >> > >> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. > >> > >> Hi, > > > > This generates several regressions. > > On aarch64: > > FAIL: gcc.target/aarch64/vect_fp16_1.c scan-assembler-times > > fadd\tv[0-9]+.8h 2 > > > > on arm-linux-gnueabihf: > > FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "vectorized 1 loops" 1 > > FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1 > > loops" 1 > > FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects > > scan-tree-dump-times vect "vectorized 1 loops" 1 > > FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1 > > loops" 1 > > > > on armeb-linux-gnueabihf, many (316) like: > > FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1 > > loops" 1 > > FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized > > 1 loops" 1 > > > > still on armeb-linux-gnueabihf: > > g++.dg/vect/pr33426-ivdep-2.cc -std=c++14 (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep-2.cc -std=c++17 (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep-2.cc -std=c++2a (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep-2.cc -std=c++98 (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line ) > > g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line ) > > g++.dg/vect/pr33426-ivdep.cc -std=c++14 (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep.cc -std=c++17 (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep.cc -std=c++2a (test for warnings, line ) > > g++.dg/vect/pr33426-ivdep.cc -std=c++98 (test for warnings, line ) > > > > gfortran.dg/vect/no-vfa-pr32377.f90 -O scan-tree-dump-times vect > > "vectorized 2 loops" 1 > > gfortran.dg/vect/pr19049.f90 -O scan-tree-dump-times vect > > "vectorized 1 loops" 1 > > gfortran.dg/vect/pr32377.f90 -O scan-tree-dump-times vect > > "vectorized 2 loops" 1 > > gfortran.dg/vect/vect-2.f90 -O scan-tree-dump-times vect > > "vectorized 3 loops" 1 > > gfortran.dg/vect/vect-3.f90 -O scan-tree-dump-times vect "Alignment > > of access forced using versioning" 3 > > gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect "accesses > > have the same alignment." 1 > > gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect > > "vectorized 1 loops" 1 > > gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect "Alignment > > of access forced using versioning." 2 > > gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect > > "vectorized 1 loops" 1 > > that's PR tree-optimization/92066, also seen on sparc, powerpc64, and > ia64. Hmm, OK. There's one obvious bug fixed below, other than that I have to investigate in more detail. Committed as obvious. Richard. 2019-10-11 Richard Biener PR tree-optimization/92066 PR tree-optimization/92046 * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Fix bogus cost model check. Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 276858) +++ gcc/tree-vect-data-refs.c (working copy) @@ -2179,7 +2179,7 @@ do_versioning = (optimize_loop_nest_for_speed_p (loop) && !loop->inner /* FORNOW */ - && flag_vect_cost_model > VECT_COST_MODEL_CHEAP); + && flag_vect_cost_model != VECT_COST_MODEL_CHEAP); if (do_versioning) {
Re: [PATCH] Fix PR92046
Hi Christophe, > On Thu, 10 Oct 2019 at 16:01, Richard Biener wrote: > >> >> The following fixes a few param adjustments that are made based on >> per-function adjustable flags by moving the adjustments to their >> users. Semantics change in some minor ways but that's allowed >> for --params. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. >> >> Hi, > > This generates several regressions. > On aarch64: > FAIL: gcc.target/aarch64/vect_fp16_1.c scan-assembler-times > fadd\tv[0-9]+.8h 2 > > on arm-linux-gnueabihf: > FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects > scan-tree-dump-times vect "vectorized 1 loops" 1 > FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1 > loops" 1 > FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects > scan-tree-dump-times vect "vectorized 1 loops" 1 > FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1 > loops" 1 > > on armeb-linux-gnueabihf, many (316) like: > FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1 > loops" 1 > FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized > 1 loops" 1 > > still on armeb-linux-gnueabihf: > g++.dg/vect/pr33426-ivdep-2.cc -std=c++14 (test for warnings, line ) > g++.dg/vect/pr33426-ivdep-2.cc -std=c++17 (test for warnings, line ) > g++.dg/vect/pr33426-ivdep-2.cc -std=c++2a (test for warnings, line ) > g++.dg/vect/pr33426-ivdep-2.cc -std=c++98 (test for warnings, line ) > g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line ) > g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line ) > g++.dg/vect/pr33426-ivdep.cc -std=c++14 (test for warnings, line ) > g++.dg/vect/pr33426-ivdep.cc -std=c++17 (test for warnings, line ) > g++.dg/vect/pr33426-ivdep.cc -std=c++2a (test for warnings, line ) > g++.dg/vect/pr33426-ivdep.cc -std=c++98 (test for warnings, line ) > > gfortran.dg/vect/no-vfa-pr32377.f90 -O scan-tree-dump-times vect > "vectorized 2 loops" 1 > gfortran.dg/vect/pr19049.f90 -O scan-tree-dump-times vect > "vectorized 1 loops" 1 > gfortran.dg/vect/pr32377.f90 -O scan-tree-dump-times vect > "vectorized 2 loops" 1 > gfortran.dg/vect/vect-2.f90 -O scan-tree-dump-times vect > "vectorized 3 loops" 1 > gfortran.dg/vect/vect-3.f90 -O scan-tree-dump-times vect "Alignment > of access forced using versioning" 3 > gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect "accesses > have the same alignment." 1 > gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect > "vectorized 1 loops" 1 > gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect "Alignment > of access forced using versioning." 2 > gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect > "vectorized 1 loops" 1 that's PR tree-optimization/92066, also seen on sparc, powerpc64, and ia64. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[patch,Fortran] PR 92050 - fix ICE with -fcheck=all
Checking produces extra code (unsurprisingly); this code needs to end up in the program… Bootstrapped on x86-64_gnu-linux. OK for the trunk? Tobias diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 965ab7786a1..65238ff623d 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -7031,8 +7031,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, gfc_allocate_lang_decl (result); GFC_DECL_SAVED_DESCRIPTOR (result) = parmse.expr; gfc_free_expr (class_expr); - gcc_assert (parmse.pre.head == NULL_TREE - && parmse.post.head == NULL_TREE); + /* -fcheck= can add diagnostic code, which has to be placed before + the call. */ + if (parmse.pre.head != NULL) + gfc_add_expr_to_block (&se->pre, parmse.pre.head); + gcc_assert (parmse.post.head == NULL_TREE); } /* Follow the function call with the argument post block. */ diff --git a/gcc/testsuite/gfortran.dg/pr92050.f90 b/gcc/testsuite/gfortran.dg/pr92050.f90 new file mode 100644 index 000..64193878d8f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr92050.f90 @@ -0,0 +1,53 @@ +! { dg-do run } +! { dg-options "-fcheck=all" } +! { dg-shouldfail "above upper bound" } +! +! PR fortran/92050 +! +! +module buggy + implicit none (type, external) + + type :: par + contains +procedure, public :: fun => fun_par + end type par + + type comp +class(par), allocatable :: p + end type comp + + type foo +type(comp), allocatable :: m(:) + end type foo + +contains + + function fun_par(this) +class(par) :: this +integer:: fun_par(1) +fun_par = 42 + end function fun_par + + subroutine update_foo(this) +class(foo) :: this +write(*,*) this%m(1)%p%fun() + end subroutine update_foo + + subroutine bad_update_foo(this) +class(foo) :: this +write(*,*) this%m(2)%p%fun() + end subroutine bad_update_foo +end module buggy + +program main + use buggy + implicit none (type, external) + type(foo) :: x + allocate(x%m(1)) + allocate(x%m(1)%p) + call update_foo(x) + call bad_update_foo(x) +end program main + +! { dg-output "At line 39 of file .*pr92050.f90.*Fortran runtime error: Index '2' of dimension 1 of array 'this%m' above upper bound of 1" }
Re: [PATCH] Fix PR92046
On Thu, 10 Oct 2019 at 16:01, Richard Biener wrote: > > The following fixes a few param adjustments that are made based on > per-function adjustable flags by moving the adjustments to their > users. Semantics change in some minor ways but that's allowed > for --params. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. > > Hi, This generates several regressions. On aarch64: FAIL: gcc.target/aarch64/vect_fp16_1.c scan-assembler-times fadd\tv[0-9]+.8h 2 on arm-linux-gnueabihf: FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1 loops" 1 on armeb-linux-gnueabihf, many (316) like: FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized 1 loops" 1 still on armeb-linux-gnueabihf: g++.dg/vect/pr33426-ivdep-2.cc -std=c++14 (test for warnings, line ) g++.dg/vect/pr33426-ivdep-2.cc -std=c++17 (test for warnings, line ) g++.dg/vect/pr33426-ivdep-2.cc -std=c++2a (test for warnings, line ) g++.dg/vect/pr33426-ivdep-2.cc -std=c++98 (test for warnings, line ) g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line ) g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line ) g++.dg/vect/pr33426-ivdep.cc -std=c++14 (test for warnings, line ) g++.dg/vect/pr33426-ivdep.cc -std=c++17 (test for warnings, line ) g++.dg/vect/pr33426-ivdep.cc -std=c++2a (test for warnings, line ) g++.dg/vect/pr33426-ivdep.cc -std=c++98 (test for warnings, line ) gfortran.dg/vect/no-vfa-pr32377.f90 -O scan-tree-dump-times vect "vectorized 2 loops" 1 gfortran.dg/vect/pr19049.f90 -O scan-tree-dump-times vect "vectorized 1 loops" 1 gfortran.dg/vect/pr32377.f90 -O scan-tree-dump-times vect "vectorized 2 loops" 1 gfortran.dg/vect/vect-2.f90 -O scan-tree-dump-times vect "vectorized 3 loops" 1 gfortran.dg/vect/vect-3.f90 -O scan-tree-dump-times vect "Alignment of access forced using versioning" 3 gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect "accesses have the same alignment." 1 gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times vect "vectorized 1 loops" 1 gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect "Alignment of access forced using versioning." 2 gfortran.dg/vect/vect-5.f90 -O scan-tree-dump-times vect "vectorized 1 loops" 1 Christophe Richard. > > 2019-10-10 Richard Biener > > PR middle-end/92046 > * opts.c (finish_options): Do not influence global --params > from options that are adjustable per function. > * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): > Apply --param adjustment based on active cost-model. > * tree-ssa-phiopt.c (cond_if_else_store_replacement): Disable > further store-sinking when vectorization or if-conversion > are not enabled. > > Index: gcc/opts.c > === > --- gcc/opts.c (revision 276795)+++ gcc/opts.c (working copy) > +++ gcc/opts.c (working copy) > @@ -1123,24 +1123,6 @@ finish_options (struct gcc_options *opts >&& !opts_set->x_flag_reorder_functions) > opts->x_flag_reorder_functions = 1; > > - /* Tune vectorization related parametees according to cost model. */ > - if (opts->x_flag_vect_cost_model == VECT_COST_MODEL_CHEAP) > -{ > - maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS, > -6, opts->x_param_values, opts_set->x_param_values); > - maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIGNMENT_CHECKS, > -0, opts->x_param_values, opts_set->x_param_values); > - maybe_set_param_value (PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT, > -0, opts->x_param_values, opts_set->x_param_values); > -} > - > - /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or > if-conversion > - is disabled. */ > - if ((!opts->x_flag_tree_loop_vectorize && > !opts->x_flag_tree_slp_vectorize) > - || !opts->x_flag_tree_loop_if_convert) > -maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0, > - opts->x_param_values, > opts_set->x_param_values); > - >/* The -gsplit-dwarf option requires -ggnu-pubnames. */ >if (opts->x_dwarf_split_debug_info) > opts->x_debug_generate_pub_sections = 2; > Index: gcc/tree-vect-data-refs.c > === > --- gcc/tree-vect-data-refs.c (revision 276795) > +++ gcc/tree-vect-data-refs.c (working copy) > @@ -2075,6 +2075,8 @@ vect_enhance_data_refs_alignment (loop_v > { >
[Ada] Assorted gigi tweaks
Tested on x86_64-suse-linux, applied on the mainline. 2019-10-11 Eric Botcazou * gcc-interface/decl.c (elaborate_reference_1): Specifically deal with pointer displacement. * gcc-interface/decl.c (components_to_record): Use proper name. * gcc-interface/trans.c (Sloc_to_locus): Use standard types. -- Eric BotcazouIndex: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 276873) +++ gcc-interface/decl.c (working copy) @@ -6782,6 +6782,15 @@ elaborate_reference_1 (tree ref, void *d elaborate_reference_1 (TREE_OPERAND (ref, 0), data), TREE_OPERAND (ref, 1), NULL_TREE); + /* If this is the displacement of a pointer, elaborate the pointer and then + displace the result. The actual purpose here is to drop the location on + the expression, which may be problematic if replicated on references. */ + if (TREE_CODE (ref) == POINTER_PLUS_EXPR + && TREE_CODE (TREE_OPERAND (ref, 1)) == INTEGER_CST) +return build2 (POINTER_PLUS_EXPR, TREE_TYPE (ref), + elaborate_reference_1 (TREE_OPERAND (ref, 0), data), + TREE_OPERAND (ref, 1)); + sprintf (suffix, "EXP%d", ++er->n); return elaborate_expression_1 (ref, er->entity, suffix, er->definition, false); @@ -7923,7 +7932,7 @@ components_to_record (Node_Id gnat_compo && !(Is_Packed (gnat_record_type) ? has_non_packed_fixed_size_field : Optimize_Alignment_Space (gnat_record_type)) - && !debug__debug_flag_dot_r); + && !Debug_Flag_Dot_R); const bool w_reorder = (Convention (gnat_record_type) == Convention_Ada && Warn_On_Questionable_Layout Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 276864) +++ gcc-interface/trans.c (working copy) @@ -10848,8 +10848,8 @@ Sloc_to_locus (Source_Ptr Sloc, location } Source_File_Index file = Get_Source_File_Index (Sloc); - Logical_Line_Number line = Get_Logical_Line_Number (Sloc); - Column_Number column = (clear_column ? 0 : Get_Column_Number (Sloc)); + Line_Number_Type line = Get_Logical_Line_Number (Sloc); + Column_Number_Type column = (clear_column ? 0 : Get_Column_Number (Sloc)); line_map_ordinary *map = LINEMAPS_ORDINARY_MAP_AT (line_table, file - 1); /* We can have zero if pragma Source_Reference is in effect. */
Re: Implement ggc_trim
On October 11, 2019 9:03:53 AM GMT+02:00, Jan Hubicka wrote: >Hi, >this patch adds ggc_trim that releases free pages used by GGC memory >to system. This is useful to reduce memory footprint of WPA streaming: >WPA streaming ought to not use any more GGC memory (patches in testing >for that) and trimming the memory makes it available to fork&malloc >used >by stream out machinery. > >I collected some stats for cc1 for both GGC and heap (using mallinfo). >Memory footprints are as follows: > >After streaming in global stream: 123MB GGC; 25MB of heap. >After streaming in callgraph: 228MB GGC; 45MB of heap. >After streaming in summaries: 373MB GGC; 126MB of heap. >After symbol merging : 348MB GGC; 130MB of heap. >After IPA-ICF : 501MB GGC; 160MB of heap. (this is all ICF) >After IPA-CP : 528MB GGC; 163MB of heap. >After IPA-SRA : 532MB GGC; 163MB of heap. >After Inline : 644MB GGC; 173MB of heap > This is after collecting of 118MB of > garbage and returning 740k to system > by madvise_dontneed >After ipa-reference: 644MB GGC; 370MB of heap > I checked this all goes into the > bitmaps; I have WIP patch for that >After releasing summariess : 431MB GGC; 383MB of heap > Trim releases 43MB by unmap > and 321MB by madvise_dontneed > >At least i learnt new fact about ipa-reference consuming 200MB of >memory which was not obvious from our detailed mem stats. > >I think the lowest hanging fruit after this patch is to add >malloc_madvise which further reduces footpring and fix ipa-reference. >Hopefully Martin will do a bit about ipa-icf. > >I will dig into what inliner does but it produces a lot of clones so I >think it is mostly clone and summary duplication. Perhaps we can avoid >copying some of summaries for inline clones. > >In TOP I see about 900MB instead of 1.4GB before WPA streaming starts >with both ggc_trim and madvise. > >Note that I also tried to hack ggc_free to recognize free pages but at >least in simple implementation it is a loss since it makes ggc_alloc >more expensive (it needs to bring pages back and add into freelists) >which hurts stream-in performance. > >I think sweeping once per WPA is no problem, it is definitly less than >1% of WPA time. > >Bootstrapped/regtested x86_64-linux, OK? Ok. Richard. > * ggc-page.c (release_pages): Output statistics when !quiet_flag. > (ggc_collect): Dump later to not interfere with release_page dump. > (ggc_trim): New function. > * ggc-none.c (ggc_trim): New. > > * lto.c (lto_wpa_write_files): Call ggc_trim. >Index: ggc-page.c >=== >--- ggc-page.c (revision 276707) >+++ ggc-page.c (working copy) >@@ -529,7 +529,6 @@ static void clear_page_group_in_use (pag > #endif > static struct page_entry * alloc_page (unsigned); > static void free_page (struct page_entry *); >-static void release_pages (void); > static void clear_marks (void); > static void sweep_pages (void); > static void ggc_recalculate_in_use_p (page_entry *); >@@ -1016,6 +1015,8 @@ free_page (page_entry *entry) > static void > release_pages (void) > { >+ size_t n1 = 0; >+ size_t n2 = 0; > #ifdef USING_MADVISE > page_entry *p, *start_p; > char *start; >@@ -1061,6 +1062,7 @@ release_pages (void) > else > G.free_pages = p; > G.bytes_mapped -= mapped_len; >+n1 += len; > continue; > } > prev = newprev; >@@ -1092,6 +1094,7 @@ release_pages (void) >/* Don't count those pages as mapped to not touch the garbage collector > unnecessarily. */ > G.bytes_mapped -= len; >+ n2 += len; > while (start_p != p) > { > start_p->discarded = true; >@@ -1124,6 +1127,7 @@ release_pages (void) > } > > munmap (start, len); >+ n1 += len; > G.bytes_mapped -= len; > } > >@@ -1152,10 +1156,20 @@ release_pages (void) > *gp = g->next; > G.bytes_mapped -= g->alloc_size; > free (g->allocation); >+ n1 += g->alloc_size; > } > else > gp = &g->next; > #endif >+ if (!quiet_flag && (n1 || n2)) >+{ >+ fprintf (stderr, " {GC"); >+ if (n1) >+ fprintf (stderr, " released %luk", (unsigned long)(n1 / 1024)); >+ if (n2) >+ fprintf (stderr, " madv_dontneed %luk", (unsigned long)(n2 / 1024)); >+ fprintf (stderr, "}"); >+} > } > > /* This table provides a fast way to determine ceil(log_2(size)) for >@@ -2178,19 +2192,22 @@ ggc_collect (void) > return; > > timevar_push (TV_GC); >- if (!quiet_flag) >-fprintf (stderr, " {GC %luk -> ", (unsigned long) G.allocated / >1024); > if (GGC_DEBUG_LEVEL >= 2) > fprintf (G.debug_fi
[Ada] Accept size for record with oversized component clause
This makes the compiler accept again an exact size clause for a record type that contains both a component subject to a representation clause that gives it a size greater than its nominal size and another component that is aliased. Tested on x86_64-suse-linux, applied on the mainline. 2019-10-11 Eric Botcazou * gcc-interface/decl.c (gnat_to_gnu_field): Adjust again the packing for a field without strict alignment and with an oversized clause. 2019-10-11 Eric Botcazou * gnat.dg/specs/size_clause4.ads: New test. -- Eric BotcazouIndex: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 276866) +++ gcc-interface/decl.c (working copy) @@ -7229,10 +7229,17 @@ gnat_to_gnu_field (Entity_Id gnat_field, && INTEGRAL_TYPE_P (TREE_TYPE (TYPE_FIELDS (gnu_field_type gnu_field_type = TREE_TYPE (TYPE_FIELDS (gnu_field_type)); + orig_field_type = gnu_field_type; gnu_field_type = make_type_from_size (gnu_field_type, gnu_size, Has_Biased_Representation (gnat_field)); + /* If the type has been extended, we may need to cap the alignment. */ + if (!needs_strict_alignment + && gnu_field_type != orig_field_type + && tree_int_cst_lt (TYPE_SIZE (orig_field_type), gnu_size)) + packed = adjust_packed (gnu_field_type, gnu_record_type, packed); + orig_field_type = gnu_field_type; gnu_field_type = maybe_pad_type (gnu_field_type, gnu_size, 0, gnat_field, false, false, definition, true); -- { dg-do compile } with System; package Size_Clause4 is type Rec is record A1 : System.Address; A2 : System.Address; I : aliased Integer; end record; for Rec use record A1 at 0 range 0 .. 63; A2 at 8 range 0 .. 63; I at 16 range 0 .. 31; end record; for Rec'Size use 160; end Size_Clause4;
Re: [PATCH v4 6/7] S/390: Use signaling FP comparison instructions
> Am 07.10.2019 um 18:19 schrieb Andreas Krebbel : > > On 01.10.19 15:27, Ilya Leoshkevich wrote: >> dg-torture.exp=inf-compare-1.c is failing, because (qNaN > +Inf) >> comparison is compiled to CDB instruction, which does not signal an >> invalid operation exception. KDB should have been used instead. >> >> This patch introduces a new CCmode and a new pattern in order to >> generate signaling instructions in this and similar cases. >> >> gcc/ChangeLog: >> >> 2019-08-09 Ilya Leoshkevich >> >> PR target/77918 >> * config/s390/2827.md: Add new opcodes. >> * config/s390/2964.md: Likewise. >> * config/s390/3906.md: Likewise. >> * config/s390/8561.md: Likewise. >> * config/s390/s390-builtins.def (s390_vfchesb): Use >> the new vec_cmpgev4sf_quiet_nocc. >> (s390_vfchedb): Use the new vec_cmpgev2df_quiet_nocc. >> (s390_vfchsb): Use the new vec_cmpgtv4sf_quiet_nocc. >> (s390_vfchdb): Use the new vec_cmpgtv2df_quiet_nocc. >> (vec_cmplev4sf): Use the new vec_cmplev4sf_quiet_nocc. >> (vec_cmplev2df): Use the new vec_cmplev2df_quiet_nocc. >> (vec_cmpltv4sf): Use the new vec_cmpltv4sf_quiet_nocc. >> (vec_cmpltv2df): Use the new vec_cmpltv2df_quiet_nocc. >> * config/s390/s390-modes.def (CCSFPS): New mode. >> * config/s390/s390.c (s390_match_ccmode_set): Support CCSFPS. >> (s390_select_ccmode): Return CCSFPS for LT, LE, GT, GE and LTGT. >> (s390_branch_condition_mask): Reuse CCS for CCSFPS. >> (s390_expand_vec_compare): Use non-signaling patterns where >> necessary. >> (s390_reverse_condition): Support CCSFPS. >> * config/s390/s390.md (*cmp_ccsfps): New pattern. >> * config/s390/vector.md: (VFCMP_HW_OP): Remove. >> (asm_fcmp_op): Likewise. >> (*smaxv2df3_vx): Use pattern for quiet comparison. >> (*sminv2df3_vx): Likewise. >> (*vec_cmp_nocc): Remove. >> (*vec_cmpeq_quiet_nocc): New pattern. >> (vec_cmpgt_quiet_nocc): Likewise. >> (vec_cmplt_quiet_nocc): New expander. >> (vec_cmpge_quiet_nocc): New pattern. >> (vec_cmple_quiet_nocc): New expander. >> (*vec_cmpeq_signaling_nocc): New pattern. >> (*vec_cmpgt_signaling_nocc): Likewise. >> (*vec_cmpgt_signaling_finite_nocc): Likewise. >> (*vec_cmpge_signaling_nocc): Likewise. >> (*vec_cmpge_signaling_finite_nocc): Likewise. >> (vec_cmpungt): New expander. >> (vec_cmpunge): Likewise. >> (vec_cmpuneq): Use quiet patterns. >> (vec_cmpltgt): Allow only on z14+. >> (vec_cmpordered): Use quiet patterns. >> (vec_cmpunordered): Likewise. >> (VEC_CMP_EXPAND): Add ungt and unge. >> >> gcc/testsuite/ChangeLog: >> >> 2019-08-09 Ilya Leoshkevich >> >> * gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust >> expectations. > > ... > >> diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def >> index 7b7c1141449..2d9cd9b5945 100644 >> --- a/gcc/config/s390/s390-modes.def >> +++ b/gcc/config/s390/s390-modes.def >> @@ -52,6 +52,8 @@ CCS: EQ LT GT UNORDERED >> (LTGFR, LTGR, LTR, ICM/Y, >>ADB/R, AEB/R, SDB/R, >> SEB/R, >>SRAG, SRA, SRDA) >> CCSR: EQ GT LT UNORDERED (CGF/R, CH/Y) >> +CCSFPS: EQGT LT UNORDERED (KEB/R, KDB/R, KXBR, >> KDTR, >> + KXTR, WFK) > > GT and LT need to be swapped. > > Ok with that change. Thanks! I've swapped GT and LT and committed this as https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=276871
Re: [PATCH v4 7/7] S/390: Test signaling FP comparison instructions
> Am 07.10.2019 um 18:27 schrieb Andreas Krebbel : > > On 01.10.19 15:27, wrote: >> gcc/testsuite/ChangeLog: >> >> 2019-08-09 Ilya Leoshkevich >> >> PR target/77918 >> * gcc.target/s390/s390.exp: Enable Fortran tests. >> * gcc.target/s390/zvector/autovec-double-quiet-eq.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-ge.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-gt.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-le.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-lt.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-ordered.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: New test. >> * gcc.target/s390/zvector/autovec-double-quiet-unordered.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c: New >> test. >> * gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-eq.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-ge-z13-finite.c: New >> test. >> * gcc.target/s390/zvector/autovec-double-signaling-ge-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-ge.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-gt-z13-finite.c: New >> test. >> * gcc.target/s390/zvector/autovec-double-signaling-gt-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-gt.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-le-z13-finite.c: New >> test. >> * gcc.target/s390/zvector/autovec-double-signaling-le-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-le.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-lt-z13-finite.c: New >> test. >> * gcc.target/s390/zvector/autovec-double-signaling-lt-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-lt.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13-finite.c: >> New test. >> * gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13.c: New test. >> * gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: New test. >> * gcc.target/s390/zvector/autovec-double-smax-z13.F90: New test. >> * gcc.target/s390/zvector/autovec-double-smax.F90: New test. >> * gcc.target/s390/zvector/autovec-double-smin-z13.F90: New test. >> * gcc.target/s390/zvector/autovec-double-smin.F90: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-eq.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-ge.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-gt.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-le.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-lt.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-ordered.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: New test. >> * gcc.target/s390/zvector/autovec-float-quiet-unordered.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-eq.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-ge.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-gt.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-le.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-lt.c: New test. >> * gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: New test. >> * gcc.target/s390/zvector/autovec-fortran.h: New test. >> * gcc.target/s390/zvector/autovec-long-double-signaling-ge.c: New test. >> * gcc.target/s390/zvector/autovec-long-double-signaling-gt.c: New test. >> * gcc.target/s390/zvector/autovec-long-double-signaling-le.c: New test. >> * gcc.target/s390/zvector/autovec-long-double-signaling-lt.c: New test. >> * gcc.target/s390/zvector/autovec.h: New test. > > Do these tests work on 32 bit? We need -mzarch to make the vector > instructions available there. > > Ok, if clean on 32 and 64 bit. I've added -mzarch, tweaked a few expectations, and committed this as https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=276872
Avid ggc_alloc and push_cfun during LTO streaming
Hi, this patch prevents tree creation druing WPA stream out (to avoid touching pages and triggering COW). It fixes the following - gimple streamer produces MEM_REF wrappings for global decls. This is to preserve the type of access and is not necessary for WPA->LTRANS streaming when decls ar eno longer going to be merged. - we renumber stmt uids during streaming WPA summaries - loop optimizer is initialized in output_function. After testing the patch I noticed that output_function does one extra renumbering of stmts. This seems quite broken and I will fix it incrementally. Bootstrapped/regtested x86_64-linux, comitted. * gimple-streamer-out.c (output_gimple_stmt): Add explicit function parameter. * lto-streamer-out.c: Include tree-dfa.h. (output_cfg): Do not use cfun. (lto_prepare_function_for_streaming): New. (output_function): Do not push cfun; do not initialize loop optimizer. * lto-streamer.h (lto_prepare_function_for_streaming): Declare. * passes.c (ipa_write_summaries): Use it. (ipa_write_optimization_summaries): Do not modify bodies. * tree-dfa.c (renumber_gimple_stmt_uids): Add function parameter. * tree.dfa.h (renumber_gimple_stmt_uids): Update prototype. * tree-ssa-dse.c (pass_dse::execute): Update use of renumber_gimple_stmt_uids. * tree-ssa-math-opts.c (pass_optimize_widening_mul::execute): Likewise. * lto.c (lto_wpa_write_files): Prepare all bodies for streaming. Index: gimple-streamer-out.c === --- gimple-streamer-out.c (revision 276850) +++ gimple-streamer-out.c (working copy) @@ -57,7 +57,7 @@ output_phi (struct output_block *ob, gph /* Emit statement STMT on the main stream of output block OB. */ static void -output_gimple_stmt (struct output_block *ob, gimple *stmt) +output_gimple_stmt (struct output_block *ob, struct function *fn, gimple *stmt) { unsigned i; enum gimple_code code; @@ -80,7 +80,7 @@ output_gimple_stmt (struct output_block as_a (stmt)), 1); bp_pack_value (&bp, gimple_has_volatile_ops (stmt), 1); - hist = gimple_histogram_value (cfun, stmt); + hist = gimple_histogram_value (fn, stmt); bp_pack_value (&bp, hist != NULL, 1); bp_pack_var_len_unsigned (&bp, stmt->subcode); @@ -139,7 +139,7 @@ output_gimple_stmt (struct output_block so that we do not have to deal with type mismatches on merged symbols during IL read in. The first operand of GIMPLE_DEBUG must be a decl, not MEM_REF, though. */ - if (op && (i || !is_gimple_debug (stmt))) + if (!flag_wpa && op && (i || !is_gimple_debug (stmt))) { basep = &op; if (TREE_CODE (*basep) == ADDR_EXPR) @@ -147,7 +147,7 @@ output_gimple_stmt (struct output_block while (handled_component_p (*basep)) basep = &TREE_OPERAND (*basep, 0); if (VAR_P (*basep) - && !auto_var_in_fn_p (*basep, current_function_decl) + && !auto_var_in_fn_p (*basep, fn->decl) && !DECL_REGISTER (*basep)) { bool volatilep = TREE_THIS_VOLATILE (*basep); @@ -228,7 +228,7 @@ output_bb (struct output_block *ob, basi print_gimple_stmt (streamer_dump_file, stmt, 0, TDF_SLIM); } - output_gimple_stmt (ob, stmt); + output_gimple_stmt (ob, fn, stmt); /* Emit the EH region holding STMT. */ region = lookup_stmt_eh_lp_fn (fn, stmt); Index: lto/lto.c === --- lto/lto.c (revision 276850) +++ lto/lto.c (working copy) @@ -304,6 +304,13 @@ lto_wpa_write_files (void) timevar_push (TV_WHOPR_WPA_IO); + cgraph_node *node; + /* Do body modifications needed for streaming before we fork out + worker processes. */ + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) +if (gimple_has_body_p (node->decl)) + lto_prepare_function_for_streaming (node); + /* Generate a prefix for the LTRANS unit files. */ blen = strlen (ltrans_output_list); temp_filename = (char *) xmalloc (blen + sizeof ("2147483648.o")); Index: lto-streamer-out.c === --- lto-streamer-out.c (revision 276850) +++ lto-streamer-out.c (working copy) @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. #include "debug.h" #include "omp-offload.h" #include "print-tree.h" +#include "tree-dfa.h" static void lto_write_tree (struct output_block*, tree, bool); @@ -1893,7 +1894,7 @@ output_cfg (struct output_block *ob, str streamer_write_hwi (ob, -1); - bb = ENTRY_BLOCK_PTR_FOR_FN (cfun); + bb = ENTRY_BLOCK_PTR_FOR_FN (fn); while (bb->next_bb) { streamer_write_hwi (ob, bb->next_bb->index); @@ -1902,9 +19