[Bug libstdc++/99117] [9/10/11 Regression] cannot accumulate std::valarray

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99117

Richard Biener  changed:

   What|Removed |Added

  Component|middle-end  |libstdc++

--- Comment #2 from Richard Biener  ---
FRE5 does

[local count: 2736164397]:
   # ivtmp.143_71 = PHI 
   _221 = (void *) ivtmp.143_71;
   _124 = MEM[(int * *)_221 + 8B];
-  _115 = MEM[(const int &)_101];
   _162 = MEM[(const int &)_124];
-  _140 = _115 + _162;
-  *_101 = _140;
+  *_101 = _162;
   __p_61 = _101 + 4;
   _127 = _124 + 4;
-  _131 = MEM[(const int &)__p_61];
   _132 = *_127;
-  _133 = _131 + _132;
-  *__p_61 = _133;
+  *__p_61 = _132;
   __p_136 = __p_61 + 4;
   D.69731 ={v} {CLOBBER};
   ivtmp.143_186 = ivtmp.143_71 + 16;
-  if (ivtmp.143_186 != _60)
+  if (_158 != ivtmp.143_186)
 goto ; [89.00%]

which looks wrong at first glance.  Ah, it's OK because we have

   [local count: 2736164397]:
  # ivtmp.143_71 = PHI 
  # PT = null { D.69770 } (escaped, escaped heap)
  _221 = (void *) ivtmp.143_71;
  # PT = nonlocal escaped null { D.69771 } (escaped, escaped heap)
  _124 = MEM[(int * *)_221 + 8B clique 11 base 0];
  _115 = MEM[(const int &)_101 clique 11 base 0];
  _162 = MEM[(const int &)_124 clique 11 base 0];
  _140 = _115 + _162;
  MEM[(int *)_101 clique 11 base 1] = _140;
  # PT = { D.69772 } (escaped, escaped heap)
  __p_61 = _101 + 4;
  # PT = nonlocal escaped null { D.69771 } (escaped, escaped heap)
  _127 = _124 + 4;
  _131 = MEM[(const int &)__p_61 clique 11 base 0];
  _132 = MEM[(const int &)_127 clique 11 base 0];
  _133 = _131 + _132;
  MEM[(int *)__p_61 clique 11 base 1] = _133;
  # PT = { D.69772 } (escaped, escaped heap)
  __p_136 = __p_61 + 4;
  D.69731 ={v} {CLOBBER};
  ivtmp.143_186 = ivtmp.143_71 + 16;
  if (ivtmp.143_186 != _60)
goto ; [89.00%]
  else
goto ; [11.00%]

so the load is from clique 11, base 0 while the store is to clique 11, base 1,
so they do not alias because of some __restrict marking.  So the issue is
likely an invalid testcase or a bogus annotation in libstdc++.
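
To illustrate the aliasing promise discussed above, here is a minimal sketch (not libstdc++ code; `add_into` is a hypothetical helper). It assumes the GCC/Clang `__restrict` extension. Two `__restrict` pointers tell the compiler that loads through one and stores through the other never touch the same memory, which is what the differing "base" annotations in the dump express.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only.
// __restrict promises that dst and src never overlap, so the compiler may
// treat loads through src and stores through dst as independent.
void add_into(int* __restrict dst, const int* __restrict src, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] += src[i];
}

// Obeys the promise: a and b are distinct objects.
int sum_after_add()
{
  int a[2] = {1, 1};
  int b[2] = {2, 2};
  add_into(a, b, 2);
  // add_into(a, a, 2) would break the promise (undefined behaviour); the
  // optimizer would then be allowed to drop or reorder the loads.
  return a[0] + a[1];
}
```

If the same memory is reached through both pointers, a transformation like the one FRE performs above is permitted, which is why the suspicion falls on the annotation or the testcase rather than on the pass.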

[Bug middle-end/99117] [9/10/11 Regression] cannot accumulate std::valarray

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99117

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
  Component|c++ |middle-end
   Last reconfirmed||2021-02-16
   Target Milestone|--- |9.4
 Ever confirmed|0   |1
  Known to fail||10.2.1, 11.0
 Status|UNCONFIRMED |NEW
  Known to work||7.5.0, 8.4.0
   Keywords||needs-reduction, wrong-code
Summary|cannot accumulate   |[9/10/11 Regression] cannot
   |std::valarray   |accumulate std::valarray

--- Comment #1 from Richard Biener  ---
Confirmed.  If I disable SRA it works but I haven't looked closer yet.

[Bug rtl-optimization/99114] [WORD_REGISTER_OPERATIONS] wrong code for (u16_var & 3) == (u32)1

2021-02-15 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99114

--- Comment #4 from Eric Botcazou  ---
> I'll try, but please consider investigating this without one. It happens
> after a very lengthy compilation process (compiling a buggy gcc with a buggy
> cross-compiler, then compiling JIT code with that, generating a buggy .so,
> then running _that_, to see incorrect behavior rather than a clear crash),
> in C++ code, with an out-of-tree backend.

Impressive indeed, but not without a precedent.  Can you upload the RTL dump
files of the pass preceding .combine and that of .combine itself?

> On the other hand, it's been investigated, and it's a clear bug with a
> one-line fix.

This needs to be double checked though, this code has been there for ages.

[Bug c++/99116] [11 Regression] ICE in set_identifier_type_value_with_scope, at cp/name-lookup.c:4764

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99116

Richard Biener  changed:

   What|Removed |Added

   Priority|P2  |P4

[Bug rtl-optimization/99114] [WORD_REGISTER_OPERATIONS] wrong code for (u16_var & 3) == (u32)1

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99114

--- Comment #3 from Richard Biener  ---
If

(gtu (subreg:SI (reg:HI 593)) (const_int 1))

is incorrect, why is it then recognized?  And why should
(subreg:SI (and:HI ..)) be OK?

The way I read WORD_REGISTER_OPERATIONS, it's a bad design to make the IL do
something that could better have been represented explicitly.  (If it is
an SImode op then please make it an SImode op!)

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

--- Comment #8 from Richard Biener  ---
Well, certainly expected as you changed the semantics of 'used' ...

[Bug fortran/99112] [11 Regression] ICE in gfc_conv_component_ref, at fortran/trans-expr.c:2646

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99112

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P4
   Target Milestone|--- |11.0

[Bug fortran/99111] [10/11 Regression] ICE in gfc_conv_expr_descriptor, at fortran/trans-array.c:7336

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99111

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |10.3

[Bug middle-end/99109] [9/10/11 Regression] ICE: Error reporting routines re-entered since r9-1948

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug c++/99108] ICE in ix86_get_function_versions_dispatcher, at config/i386/i386-features.c:2862

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99108

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code

--- Comment #2 from Richard Biener  ---
So the issue here is likely that f() have not been finalized because they are
declared in local scope and thus there's no cgraph node for them (yet).

[Bug sanitizer/99106] [9/10/11 Regression] ICE in tree_to_poly_int64, at tree.c:3091

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99106

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
Summary|ICE in tree_to_poly_int64,  |[9/10/11 Regression] ICE in
   |at tree.c:3091  |tree_to_poly_int64, at
   ||tree.c:3091
   Target Milestone|--- |9.4
   Keywords||ice-on-valid-code

[Bug c++/99117] New: cannot accumulate std::valarray

2021-02-15 Thread yasui at icepp dot s.u-tokyo.ac.jp via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99117

Bug ID: 99117
   Summary: cannot accumulate std::valarray
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yasui at icepp dot s.u-tokyo.ac.jp
  Target Milestone: ---

Created attachment 50193
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50193&action=edit
source file

Summing std::valarrays with std::accumulate gives an incorrect result when
compiling with -O2.
It works properly with -O1.

option:
g++ -std=c++2a -Wall -Wextra -O2 source.cpp

The source file calculates (1,1) + (2,2) = (3,3).
The expected result is no output.
When compiling with -O2, the following message is shown:
a.out: source.cpp:10: int main(): Assertion `sum[0]==3' failed.
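
The attachment itself is not reproduced in this digest. As a rough sketch of the kind of computation described, the following sums two `std::valarray<int>` values with `std::accumulate`. The function name `sum_valarrays` and the exact shape of the code are assumptions, not the reporter's source.

```cpp
#include <numeric>
#include <valarray>
#include <vector>

// Hypothetical reconstruction of the scenario: (1,1) + (2,2) should be (3,3).
std::valarray<int> sum_valarrays()
{
  const std::vector<std::valarray<int>> terms = {{1, 1}, {2, 2}};
  // The initial value is a valarray of two zero-initialised elements.
  return std::accumulate(terms.begin(), terms.end(), std::valarray<int>(2));
}
```

On an unaffected compiler both elements of the result are 3. According to the report, affected releases built with -O2 can produce a different value, which is what the failing assertion shows.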

[Bug c++/99116] [11 Regression] ICE in set_identifier_type_value_with_scope, at cp/name-lookup.c:4764

2021-02-15 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99116

Marek Polacek  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
 Status|UNCONFIRMED |NEW
 CC||mpolacek at gcc dot gnu.org,
   ||nathan at gcc dot gnu.org
   Last reconfirmed||2021-02-16
 Ever confirmed|0   |1
   Priority|P3  |P2

--- Comment #1 from Marek Polacek  ---
Started with r11-7228.

[Bug c++/99116] New: [11 Regression] ICE in set_identifier_type_value_with_scope, at cp/name-lookup.c:4764

2021-02-15 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99116

Bug ID: 99116
   Summary: [11 Regression] ICE in
set_identifier_type_value_with_scope, at
cp/name-lookup.c:4764
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

g++-11.0.0-alpha20210214 snapshot (g:9966699d7a9d8e35c0c4cf9a945bcf90ef874f2d)
ICEs when compiling the following testcase, reduced from
test/CXX/temp/temp.res/temp.local/p6.cpp from the clang 11.0.1 test suite:

template struct Z {
  template struct A {};

  friend struct T;
};

% g++-11.0.0 -c rzrmc2qs.c
rzrmc2qs.c:2:12: error: declaration of template parameter 'T' shadows template
parameter
2 |   template struct A {};
  |^~~~
rzrmc2qs.c:1:10: note: template parameter 'T' declared here
1 | template struct Z {
  |  ^~~
rzrmc2qs.c:4:17: error: declaration of 'struct T' shadows template parameter
4 |   friend struct T;
  | ^
rzrmc2qs.c:1:10: note: template parameter 'T' declared here
1 | template struct Z {
  |  ^~~
rzrmc2qs.c:4:17: internal compiler error: in
set_identifier_type_value_with_scope, at cp/name-lookup.c:4764
4 |   friend struct T;
  | ^
0x67ea81 set_identifier_type_value_with_scope
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:4764
0xa23af9 do_pushdecl
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:3817
0xa24561 do_pushdecl
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:4850
0xa24561 do_pushdecl_with_scope
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:4850
0xa24b15 do_pushtag
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:8282
0xa24b15 pushtag(tree_node*, tree_node*, TAG_how)
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/name-lookup.c:8342
0x9614d0 xref_tag(tag_types, tree_node*, TAG_how, bool)
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/decl.c:15323
0xa60c10 cp_parser_elaborated_type_specifier
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:19644
0xa455b5 cp_parser_type_specifier
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:18398
0xa46644 cp_parser_decl_specifier_seq
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:14994
0xa7304e cp_parser_member_declaration
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:25874
0xa43120 cp_parser_member_specification_opt
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:25731
0xa43120 cp_parser_class_specifier_1
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:24809
0xa4568b cp_parser_class_specifier
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:25125
0xa4568b cp_parser_type_specifier
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:18372
0xa46644 cp_parser_decl_specifier_seq
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:14994
0xa71a79 cp_parser_single_declaration
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:30343
0xa71e25 cp_parser_template_declaration_after_parameters
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:30006
0xa725f0 cp_parser_explicit_template_declaration
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:30272
0xa74e81 cp_parser_declaration
   
/var/tmp/portage/sys-devel/gcc-11.0.0_alpha20210214/work/gcc-11-20210214/gcc/cp/parser.c:14000

[Bug testsuite/98125] [11 Regression] New test case g++.dg/pr93195a.C in r11-5656 has excess errors

2021-02-15 Thread amodra at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98125

--- Comment #14 from Alan Modra  ---
-fpatchable-function-entry isn't used by the powerpc linux kernel.

[Bug rtl-optimization/99114] [WORD_REGISTER_OPERATIONS] wrong code for (u16_var & 3) == (u32)1

2021-02-15 Thread pipcet at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99114

--- Comment #2 from pipcet at gmail dot com ---
(In reply to Eric Botcazou from comment #1)
> Please provide a reproducer as documented in https://gcc.gnu.org/bugs

I'll try, but please consider investigating this without one. It happens after
a very lengthy compilation process (compiling a buggy gcc with a buggy
cross-compiler, then compiling JIT code with that, generating a buggy .so, then
running _that_, to see incorrect behavior rather than a clear crash), in C++
code, with an out-of-tree backend.

On the other hand, it's been investigated, and it's a clear bug with a one-line
fix.

> > The assumption here is that op0 will be an (and:HI) after the first
> > statement (and we assume (subreg:SI (and:HI ... (const_int 3))) is
> > defined because of WORD_REGISTER_OPERATIONS) but it's actually
> > simplified to be just the (reg:HI 593), and (subreg:SI (reg:HI 593))
> > is not defined.
> 
> Paradoxical registers are defined under specific circumstances though.

Thanks, I understand that. This isn't one of them.

> > I'm unsure whether this can cause wrong code for in-tree backends or
> > backends which don't define WORD_REGISTER_OPERATIONS.
> 
> Well, obviously not for the latter, see the comment just above the code.

As I said, I'm unsure. The buggy line of code is executed on other targets, and
the condition under which that happens is not !paradoxical_subreg_p. I think
it's equivalent, but I don't think that's obvious...

[Bug fortran/99111] [10/11 Regression] ICE in gfc_conv_expr_descriptor, at fortran/trans-array.c:7336

2021-02-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99111

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||anlauf at gcc dot gnu.org
   Priority|P3  |P4
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-02-15

--- Comment #1 from anlauf at gcc dot gnu.org ---
We properly reject this with -std=...

The following checks in io.c might need improvement:

  /* If rank is nonzero and type is not character, we allow it under
     GFC_STD_LEGACY.  It may be assigned an Hollerith constant.  */
  if (e->ts.type != BT_CHARACTER)
    {
      if (!gfc_notify_std (GFC_STD_LEGACY, "Non-character in FORMAT tag "
                           "at %L", &e->where))
        return false;


Since we have e->ts.type == BT_DERIVED in the present case, this could be
straightforward.

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

H.J. Lu  changed:

   What|Removed |Added

   Keywords||patch
   Target Milestone|--- |11.0

--- Comment #7 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565337.html

[Bug rtl-optimization/99114] [WORD_REGISTER_OPERATIONS] wrong code for (u16_var & 3) == (u32)1

2021-02-15 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99114

Eric Botcazou  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 CC||ebotcazou at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2021-02-15

--- Comment #1 from Eric Botcazou  ---
Please provide a reproducer as documented in https://gcc.gnu.org/bugs

> The assumption here is that op0 will be an (and:HI) after the first
> statement (and we assume (subreg:SI (and:HI ... (const_int 3))) is
> defined because of WORD_REGISTER_OPERATIONS) but it's actually
> simplified to be just the (reg:HI 593), and (subreg:SI (reg:HI 593))
> is not defined.

Paradoxical registers are defined under specific circumstances though.

> I'm unsure whether this can cause wrong code for in-tree backends or
> backends which don't define WORD_REGISTER_OPERATIONS.

Well, obviously not for the latter, see the comment just above the code.

[Bug c++/99115] ICE in extract_insn, at recog.c:2309 on alpha (error: unrecognizable insn) with -O2

2021-02-15 Thread kurt at intricatesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115

Kurt Miller  changed:

   What|Removed |Added

 Target||alpha-unknown-openbsd6.8
 CC||kurt at intricatesoftware dot com

--- Comment #2 from Kurt Miller  ---
This ICE is on alpha (not hppa).

[Bug c++/99115] ICE in extract_insn, at recog.c:2309 on hppa (error: unrecognizable insn) with -O2

2021-02-15 Thread kurt at intricatesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115

--- Comment #1 from Kurt Miller  ---
Created attachment 50192
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50192&action=edit
testcase preprocessed

[Bug c++/99115] New: ICE in extract_insn, at recog.c:2309 on hppa (error: unrecognizable insn) with -O2

2021-02-15 Thread kurt at intricatesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99115

Bug ID: 99115
   Summary: ICE in extract_insn, at recog.c:2309 on hppa (error:
unrecognizable insn) with -O2
   Product: gcc
   Version: 8.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kurt at intricatesoftware dot com
  Target Milestone: ---

Created attachment 50191
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50191&action=edit
testcase

Noticed while compiling src/deps_log.cc ninja 1.10.2 on hppa. I trimmed down
the file to test.cc.

g++ -O2 -c ./test.cc -o test.o
./test.cc: In member function 'LoadStatus DepsLog::Load(const string&, State*,
std::__cxx11::string*)':
./test.cc:176:1: error: unrecognizable insn:
 }
 ^
(insn 218 217 219 25 (set (reg:DI 217)
(plus:DI (reg/f:DI 65 virtual-stack-vars)
(const_int -524292 [0xfff7fffc]))) "./test.cc":122 -1
 (nil))
during RTL pass: vregs
./test.cc:176:1: internal compiler error: in extract_insn, at recog.c:2309

[Bug rtl-optimization/98722] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-6615-gcf2ac1c30af0fa783c8d72e527904dda5d8cc330

2021-02-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98722

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org

--- Comment #5 from Peter Bergner  ---
Another P1 that looks like it might be fixed.  Vlad, can we mark this as
fixed?

[Bug rtl-optimization/98777] [11 Regression] ICE in update_equiv at gcc/lra-constraints.c:504 since r11-6819-g4334b52427420312

2021-02-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98777

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org

--- Comment #4 from Peter Bergner  ---
Vlad, is this fixed now and we can close it?  It's marked as a P1, so would be
nice to close if fixed.

[Bug gcov-profile/99105] [11 regression] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #15 from Jan Hubicka  ---
GCC10 testing time is:
Testing Time: 656.80s
  Unsupported      :   104
  Passed           : 21273
  Expectedly Failed:    26

I will see tomorrow if I can get GCC11 testing to finish.

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

--- Comment #6 from H.J. Lu  ---
Created attachment 50190
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50190&action=edit
A kernel patch to pass -fno-gnu-retain

This patch makes the kernel boot.

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

H.J. Lu  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

--- Comment #5 from H.J. Lu  ---
Created attachment 50189
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50189&action=edit
A patch

I am testing this.

[Bug gcov-profile/99105] [11 regression] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

Jan Hubicka  changed:

   What|Removed |Added

Summary|profile streaming scales|[11 regression] profile
   |poorly to projects with |streaming scales poorly to
   |many source files   |projects with many source
   ||files

--- Comment #14 from Jan Hubicka  ---
With GCC10 the testsuite seems to run reasonably fast (in about 20 minutes
too).  Profile is
  21.38%  clang-11   [.] gcov_do_dump
   9.99%  clang-11   [.] gcov_read_words
   5.42%  [kernel]   [k] copy_user_enhanced_fast_string
   2.68%  FileCheck  [.] lstep
   2.63%  [kernel]   [k] clear_page_erms
   2.53%  clang-11   [.] __gcov_merge_add
   2.23%  clang-11   [.] gcov_write_words
   1.93%  clang-11   [.] __gcov_read_counter
   1.50%  clang-11   [.] __gcov_merge_topn
   1.29%  [kernel]   [k] native_queued_spin_lock_slowpath
   1.26%  [kernel]   [k] alloc_set_pte
   1.20%  [kernel]   [k] __x86_indirect_thunk_rax
   1.05%  FileCheck  [.] sstep
   0.79%  [kernel]   [k] _raw_spin_lock
   0.78%  [kernel]   [k] kmem_cache_free
   0.78%  [kernel]   [k] generic_file_buffered_read
   0.72%  [kernel]   [k] kmem_cache_alloc
   0.69%  [kernel]   [k] kfree
   0.64%  [kernel]   [k] page_remove_rmap
   0.63%  [kernel]   [k] filemap_map_pages
   0.60%  [kernel]   [k] __slab_free
   0.55%  [kernel]   [k] handle_mm_fault

So I am marking this a regression at least until we analyze it better.

[Bug fortran/98686] Namelist group objects shall be defined before appearing in namelist for -std=f2018

2021-02-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98686

--- Comment #5 from Jerry DeLisle  ---
The wording in the F2018 standard goes all the way back to F95. I do not plan
to put this behind any check for any particular standard.

[Bug rtl-optimization/99114] New: [WORD_REGISTER_OPERATIONS] wrong code for (u16_var & 3) == (u32)1

2021-02-15 Thread pipcet at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99114

Bug ID: 99114
   Summary: [WORD_REGISTER_OPERATIONS] wrong code for (u16_var &
3) == (u32)1
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pipcet at gmail dot com
  Target Milestone: ---

I was seeing a miscompilation issue with my wasm32 backend (which was
somewhat difficult to track down because it was actually the
cross-compiled native compiler miscompiling JIT code).

The symptom is that combine.c replaces the correct RTL

(gtu (and:SI (subreg:SI (reg:HI 593)) (const_int 3))
 (const_int 1))

with the incorrect RTL

(gtu (subreg:SI (reg:HI 593)) (const_int 1))

when reg:HI 593 is known to be <= 3.

The culprit is this code, in combine.c:

  op0 = simplify_gen_binary (AND, tmode,
 SUBREG_REG (XEXP (op0, 0)),
 gen_int_mode (c1, tmode));
  op0 = gen_lowpart (mode, op0);

The assumption here is that op0 will be an (and:HI) after the first
statement (and we assume (subreg:SI (and:HI ... (const_int 3))) is
defined because of WORD_REGISTER_OPERATIONS) but it's actually
simplified to be just the (reg:HI 593), and (subreg:SI (reg:HI 593))
is not defined.

I'm unsure whether this can cause wrong code for in-tree backends or backends
which don't define WORD_REGISTER_OPERATIONS.
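
The difference between the two RTL forms can be modelled in plain C++. This is only a sketch under the assumption that a 32-bit register holds a 16-bit value whose upper 16 bits are undefined; the helper names `make_reg`, `with_mask` and `without_mask` are hypothetical.

```cpp
#include <cstdint>

// Model of a 32-bit register holding a 16-bit value: only the low 16 bits
// are meaningful, the high 16 bits may contain anything.
constexpr std::uint32_t make_reg(std::uint16_t value, std::uint16_t garbage)
{
  return (static_cast<std::uint32_t>(garbage) << 16) | value;
}

// Models (gtu (and:SI (subreg:SI (reg:HI 593)) (const_int 3)) (const_int 1)):
// the SImode AND also clears the undefined upper bits.
constexpr bool with_mask(std::uint32_t reg) { return (reg & 3u) > 1u; }

// Models (gtu (subreg:SI (reg:HI 593)) (const_int 1)):
// the comparison now sees whatever is in the upper bits.
constexpr bool without_mask(std::uint32_t reg) { return reg > 1u; }
```

With a 16-bit value of 1 and non-zero upper bits, `with_mask` yields false while `without_mask` yields true. Knowing that the HImode value is <= 3 makes the AND redundant in HImode only, not after widening to SImode.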

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

--- Comment #4 from Sergei Trofimovich  ---
(In reply to H.J. Lu from comment #3)
> (In reply to Sergei Trofimovich from comment #2)
> > 3. I tried to add '.data.event*' (and similar) to linux ldscript and it was
> > not enough for me to built a kernel that does not crash. Which might hint at
> > binutils bug or linux loader bug (or at my bad attempt at adding sections).
> 
> There are many different section names.

Yeah, that makes sense. My attempt was:

--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -102,7 +102,7 @@
 #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
 #else
 #define TEXT_MAIN .text
-#define DATA_MAIN .data
+#define DATA_MAIN .data .data.event* .data.__syscall_meta__*
 #define SDATA_MAIN .sdata
 #define RODATA_MAIN .rodata
 #define BSS_MAIN .bss

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

--- Comment #3 from H.J. Lu  ---
(In reply to Sergei Trofimovich from comment #2)
> 3. I tried to add '.data.event*' (and similar) to linux ldscript and it was
> not enough for me to built a kernel that does not crash. Which might hint at
> binutils bug or linux loader bug (or at my bad attempt at adding sections).

There are many different section names.

[Bug c++/99103] Initializer-list constructors in CTAD for vector is still wrong

2021-02-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99103

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2021-02-15
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
Thanks for the bug report, confirmed.  We can trigger the bug without the
variadic template:

const std::vector v;
std::vector w{v};
static_assert(std::is_same_v>); // fails

Investigating.

[Bug middle-end/99109] [9/10/11 Regression] ICE: Error reporting routines re-entered since r9-1948

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

--- Comment #3 from Jakub Jelinek  ---
array_bounds_checker::check_mem_ref certainly shouldn't try to pretend there is
any ARRAY_TYPE that wasn't in the source if the type is overaligned (its size
is not a multiple of the alignment).
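
For readers unfamiliar with the term, a type whose size is not a multiple of its alignment can be produced with the GCC/Clang `aligned` attribute on a typedef. The following sketch only illustrates that property; the typedef name is hypothetical and it is not taken from the testcase in the bug.

```cpp
// GCC/Clang attribute syntax; raises the alignment without changing the size.
typedef int overaligned_int __attribute__((aligned(16)));

// False when sizeof(int) is smaller than 16, as on common targets.
constexpr bool size_is_multiple_of_alignment =
    sizeof(overaligned_int) % __alignof__(overaligned_int) == 0;

static_assert(sizeof(overaligned_int) == sizeof(int),
              "the attribute does not change the size");
static_assert(__alignof__(overaligned_int) == 16,
              "the attribute raises the alignment");
```

Elements of an array must be laid out back to back, so an array of such a type cannot keep every element aligned, which is why no ARRAY_TYPE should be invented for it.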

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

Sergei Trofimovich  changed:

   What|Removed |Added

 CC||slyfox at gcc dot gnu.org

--- Comment #2 from Sergei Trofimovich  ---
I have a few notes:

1. GCC should put SHF_GNU_RETAIN section into a different section from .data
anyway, right?

I think that means the Linux kernel would need to adapt in any case to merge the
new .data.${SHF_GNU_RETAIN_name} section into the appropriate place in its
linker script.

Proposal: rename the section with an easily distinguishable prefix, like:

NEW:.data.retain.event_initcall_finish
OLD:.data.event_initcall_finish

That would simplify tweaking the Linux ldscript. For the -fno-function-sections
case it could be just
.data.retain

Do we still have time for such a rename? Is it safe?

2. Linux kernel does support -ffunction-sections  with
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION. But it relies even more on naming
conventions:

  #ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
  #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
  #define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX*
  #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
  #define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]*
  #define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
  #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
  #else
  #define TEXT_MAIN .text
  #define DATA_MAIN .data
  #define SDATA_MAIN .sdata
  #define RODATA_MAIN .rodata
  #define BSS_MAIN .bss
  #define SBSS_MAIN .sbss
  #endif

  A variant of [1.] would allow for easier section gathering in the script.

3. I tried to add '.data.event*' (and similar) to linux ldscript and it was not
enough for me to build a kernel that does not crash. Which might hint at
binutils bug or linux loader bug (or at my bad attempt at adding sections).
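
The naming-convention point in the notes above can be checked with a small sketch. It assumes that POSIX `fnmatch` is a reasonable approximation of how ld matches input-section wildcards; the helper `matches_any` and the variable names are hypothetical, and the patterns are copied from the DATA_MAIN definitions quoted above.

```cpp
#include <fnmatch.h>  // POSIX glob matching

#include <string>
#include <vector>

// Returns true if `section` matches any of the glob `patterns`.
bool matches_any(const std::vector<std::string>& patterns,
                 const std::string& section)
{
  for (const std::string& pattern : patterns)
    if (fnmatch(pattern.c_str(), section.c_str(), 0) == 0)
      return true;
  return false;
}

// DATA_MAIN with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION enabled.
const std::vector<std::string> data_main_dce = {
    ".data", ".data.[0-9a-zA-Z_]*", ".data..LPBX*"};

// DATA_MAIN without it: only the plain .data section is collected.
const std::vector<std::string> data_main_plain = {".data"};
```

Under this model `.data.event_initcall_finish` is collected by the first pattern set but not by the second, so in the default configuration it ends up as an orphan section. A dedicated prefix such as `.data.retain*` would need only one extra pattern.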

[Bug middle-end/99109] [9/10/11 Regression] ICE: Error reporting routines re-entered since r9-1948

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[9/10/11 Regression] ICE:   |[9/10/11 Regression] ICE:
   |Error reporting routines|Error reporting routines
   |re-entered  |re-entered since r9-1948
 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Started with r9-1948-gd893b683f40884cd00b5beb392566ecc7b67f721

[Bug target/99113] SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2021-02-15
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from H.J. Lu  ---
Since the Linux kernel is compiled with -fno-function-sections
-fno-data-sections, we should use it to avoid putting SHF_GNU_RETAIN
sections in separate sections.

[Bug target/99113] New: SHF_GNU_RETAIN doesn't work with Linux kernel

2021-02-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99113

Bug ID: 99113
   Summary: SHF_GNU_RETAIN doesn't work with Linux kernel
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: jozef.l at somniumtech dot com
  Target Milestone: ---

When building the Linux kernel, ld in binutils 2.36 with GCC 11 generates
thousands of

ld: warning: orphan section `.data.event_initcall_finish' from `init/main.o' being placed in section `.data.event_initcall_finish'
ld: warning: orphan section `.data.event_initcall_start' from `init/main.o' being placed in section `.data.event_initcall_start'
ld: warning: orphan section `.data.event_initcall_level' from `init/main.o' being placed in section `.data.event_initcall_level'

Since these sections are marked with SHF_GNU_RETAIN, they are placed in
separate sections.  They become orphan sections since they aren't expected
in the Linux kernel linker script. But orphan sections normally don't work
well with the Linux kernel linker script and the resulting kernel crashed.

[Bug middle-end/99109] [9/10/11 Regression] ICE: Error reporting routines re-entered

2021-02-15 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

Marek Polacek  changed:

   What|Removed |Added

   Target Milestone|--- |9.4

[Bug middle-end/99109] [9/10/11 Regression] ICE: Error reporting routines re-entered

2021-02-15 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

Marek Polacek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-02-15
 Status|UNCONFIRMED |NEW

--- Comment #1 from Marek Polacek  ---
Confirmed.

[Bug fortran/99112] New: [11 Regression] ICE in gfc_conv_component_ref, at fortran/trans-expr.c:2646

2021-02-15 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99112

Bug ID: 99112
   Summary: [11 Regression] ICE in gfc_conv_component_ref, at
fortran/trans-expr.c:2646
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Changed between 20201115 and 20201122 :


$ cat z1.f90
module m
   type t
   end type
contains
   function f(x, y) result(z)
  class(t) :: x(:)
  class(t) :: y(size(x))
  type(t) :: z(size(x))
   end
   subroutine s
  class(t), allocatable :: a(:), b(:), c(:)
  c = f(a, b)
   end
end


$ gfortran-11-20201115 -c z1.f90 -fcheck=all
$ gfortran-11-20210214 -c z1.f90
$
$ gfortran-11-20210214 -c z1.f90 -fcheck=all
z1.f90:12:17:

   12 |   c = f(a, b)
  | 1
internal compiler error: Segmentation fault
0xc09bcf crash_signal
../../gcc/toplev.c:327
0x764181 gfc_conv_component_ref(gfc_se*, gfc_ref*)
../../gcc/fortran/trans-expr.c:2646
0x76b9c7 gfc_conv_variable
../../gcc/fortran/trans-expr.c:3019
0x767d4a gfc_conv_expr(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-expr.c:8887
0x76abb0 gfc_conv_expr_lhs(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-expr.c:8917
0x73dea8 gfc_conv_ss_descriptor
../../gcc/fortran/trans-array.c:3047
0x74894f gfc_conv_expr_descriptor(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-array.c:7364
0x77b014 gfc_conv_intrinsic_size
../../gcc/fortran/trans-intrinsic.c:8021
0x79248b gfc_conv_intrinsic_function(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-intrinsic.c:10690
0x767d2a gfc_conv_expr(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-expr.c:8879
0x76a1ca gfc_apply_interface_mapping(gfc_interface_mapping*, gfc_se*,
gfc_expr*)
../../gcc/fortran/trans-expr.c:4779
0x73cd27 gfc_set_loop_bounds_from_array_spec(gfc_interface_mapping*, gfc_se*,
gfc_array_spec*)
../../gcc/fortran/trans-array.c:940
0x774233 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*,
gfc_expr*, vec*)
../../gcc/fortran/trans-expr.c:6953
0x77584c gfc_trans_arrayfunc_assign
../../gcc/fortran/trans-expr.c:10350
0x7793a4 gfc_trans_assignment(gfc_expr*, gfc_expr*, bool, bool, bool, bool)
../../gcc/fortran/trans-expr.c:11519
0x739de7 trans_code
../../gcc/fortran/trans.c:1922
0x760554 gfc_generate_function_code(gfc_namespace*)
../../gcc/fortran/trans-decl.c:6880
0x73a659 gfc_generate_module_code(gfc_namespace*)
../../gcc/fortran/trans.c:2322
0x6e6981 translate_all_program_units
../../gcc/fortran/parse.c:6338
0x6e6981 gfc_parse_file()
../../gcc/fortran/parse.c:6620

[Bug fortran/99111] New: [10/11 Regression] ICE in gfc_conv_expr_descriptor, at fortran/trans-array.c:7336

2021-02-15 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99111

Bug ID: 99111
   Summary: [10/11 Regression] ICE in gfc_conv_expr_descriptor, at
fortran/trans-array.c:7336
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Changed between 20191215 and 20200105 :


$ cat z1.f90
program p
   type t
  integer :: a(1)
   end type
   type(t), parameter :: x(2) = t([1])
   print x
end


$ gfortran-10-20191215 -c z1.f90
z1.f90:5:31:

5 |type(t), parameter :: x(2) = t([1])
  |   1
Error: FORMAT tag at (1) must be of type default-kind CHARACTER or of INTEGER


$ gfortran-11-20210214 -c z1.f90
z1.f90:6:8:

6 |print x
  |1
Warning: Legacy Extension: Non-character in FORMAT tag at (1)
z1.f90:6:10:

6 |print x
  |  1
internal compiler error: in gfc_conv_expr_descriptor, at
fortran/trans-array.c:7336
0x74995a gfc_conv_expr_descriptor(gfc_se*, gfc_expr*)
../../gcc/fortran/trans-array.c:7336
0x74f343 gfc_conv_array_parameter(gfc_se*, gfc_expr*, bool, gfc_symbol const*,
char const*, tree_node**)
../../gcc/fortran/trans-array.c:8149
0x793f1c gfc_convert_array_to_string
../../gcc/fortran/trans-io.c:788
0x793f1c set_string
../../gcc/fortran/trans-io.c:848
0x795eae build_dt
../../gcc/fortran/trans-io.c:1941
0x73a077 trans_code
../../gcc/fortran/trans.c:2114
0x760554 gfc_generate_function_code(gfc_namespace*)
../../gcc/fortran/trans-decl.c:6880
0x6e6e86 translate_all_program_units
../../gcc/fortran/parse.c:6351
0x6e6e86 gfc_parse_file()
../../gcc/fortran/parse.c:6620
0x7330ff gfc_be_parse_file
../../gcc/fortran/f95-lang.c:212

[Bug c++/99110] New: ICE in function_and_variable_visibility, at ipa-visibility.c:716

2021-02-15 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99110

Bug ID: 99110
   Summary: ICE in function_and_variable_visibility, at
ipa-visibility.c:716
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Testsuite file gcov-14.c or following slightly reduced variant
affects g++ versions down to at least r5 with -fno-weak :


$ cat z1.cc
extern int __attribute__ ((weak)) Foo ();
extern __inline int Foo ()
{
  return 0;
}
int (* __attribute__ ((noinline)) Bar ()) ()
{
  return Foo;
}


$ g++-11-20210214 -c z1.cc -fno-weak
during IPA pass: visibility
z1.cc:9:1: internal compiler error: in function_and_variable_visibility, at
ipa-visibility.c:716
9 | }
  | ^
0x1604946 function_and_variable_visibility
../../gcc/ipa-visibility.c:712

[Bug c++/99108] ICE in ix86_get_function_versions_dispatcher, at config/i386/i386-features.c:2862

2021-02-15 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99108

Marek Polacek  changed:

   What|Removed |Added

   Last reconfirmed||2021-02-15
 Status|UNCONFIRMED |NEW
 CC||mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Marek Polacek  ---
Confirmed.

[Bug c++/99109] New: [9/10/11 Regression] ICE: Error reporting routines re-entered

2021-02-15 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99109

Bug ID: 99109
   Summary: [9/10/11 Regression] ICE: Error reporting routines
re-entered
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Started with r9 between 20180708 and 20180722 at -O2+ :


$ g++-11-20210214 -c pr88619.c -O2 -Wall -fsanitize=undefined
pr88619.c: In function 'int main()':
pr88619.c:13:9: warning: unused variable 'p2' [-Wunused-variable]
   13 |   void *p2 = __builtin_alloca (b);
  | ^~
pr88619.c:8:1: error: alignment of array elements is greater than element size
8 | main ()
  | ^~~~
'
Internal compiler error: Error reporting routines re-entered.
0xcd0f36 layout_type(tree_node*)
../../gcc/stor-layout.c:2599
0x801cfc build_cplus_array_type(tree_node*, tree_node*, int)
../../gcc/cp/tree.c:1110
0x80543c build_cplus_array_type(tree_node*, tree_node*, int)
../../gcc/cp/tree.c:1586
0x80543c strip_typedefs(tree_node*, bool*, unsigned int)
../../gcc/cp/tree.c:1586
0x6f599b type_to_string
../../gcc/cp/error.c:3300
0x6f65d5 cp_printer
../../gcc/cp/error.c:4380
0x16f133e pp_format(pretty_printer*, text_info*)
../../gcc/pretty-print.c:1475
0x16e52c1 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
../../gcc/diagnostic.c:1244
0x16e597e diagnostic_impl
../../gcc/diagnostic.c:1406
0x16e5ea2 warning_at(unsigned int, int, char const*, ...)
../../gcc/diagnostic.c:1543
0x1572bfd array_bounds_checker::check_mem_ref(unsigned int, tree_node*, bool)
../../gcc/gimple-array-bounds.cc:697
0x15735a9 array_bounds_checker::check_addr_expr(unsigned int, tree_node*)
../../gcc/gimple-array-bounds.cc:810
0x15736bf array_bounds_checker::check_array_bounds(tree_node**, int*, void*)
../../gcc/gimple-array-bounds.cc:913
0xf64ee5 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
../../gcc/tree.c:12099
0xa6956d walk_gimple_op(gimple*, tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
../../gcc/gimple-walk.c:253
0x156f821 check_array_bounds_dom_walker::before_dom_children(basic_block_def*)
../../gcc/gimple-array-bounds.cc:966
0x1560724 dom_walker::walk(basic_block_def*)
../../gcc/domwalk.c:309
0x1570f4a array_bounds_checker::check()
../../gcc/gimple-array-bounds.cc:980
0xf4b099 execute_vrp
../../gcc/tree-vrp.c:4517

[Bug c++/99108] New: ICE in ix86_get_function_versions_dispatcher, at config/i386/i386-features.c:2862

2021-02-15 Thread gscfq--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99108

Bug ID: 99108
   Summary: ICE in ix86_get_function_versions_dispatcher, at
config/i386/i386-features.c:2862
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gs...@t-online.de
  Target Milestone: ---

Affects versions down to at least r5 (with -std=c++17) :


$ cat z1.cc
struct A {
  void foo(auto);
};
void A::foo(auto)
{
  int f(void) __attribute__((target("default")));
  int f(void) __attribute__((target("arch=atom")));
  int b = f();
}
void bar(void)
{
  A c;
  c.foo(7);
}


$ g++-11-20210214 -c z1.cc -std=c++2a
z1.cc: In instantiation of 'void A::foo(auto:2) [with auto:1 = int]':
z1.cc:13:10:   required from here
z1.cc:9:1: internal compiler error: in ix86_get_function_versions_dispatcher,
at config/i386/i386-features.c:2862
9 | }
  | ^
0x1076533 ix86_get_function_versions_dispatcher(void*)
../../gcc/config/i386/i386-features.c:2862
0x667bdd get_function_version_dispatcher(tree_node*)
../../gcc/cp/call.c:8389
0x6b4fa5 cp_genericize_r
../../gcc/cp/cp-gimplify.c:1435
0xf64ee5 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
../../gcc/tree.c:12099
0x6b5795 cp_genericize_r
../../gcc/cp/cp-gimplify.c:1486
0xf64ee5 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
../../gcc/tree.c:12099
0x6b5a05 cp_genericize_r
../../gcc/cp/cp-gimplify.c:1178
0xf64ee5 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
../../gcc/tree.c:12099
0x6b45f8 cp_genericize_tree
../../gcc/cp/cp-gimplify.c:1604
0x6b4793 cp_genericize(tree_node*)
../../gcc/cp/cp-gimplify.c:1750
0x6dde47 finish_function(bool)
../../gcc/cp/decl.c:17470
0x7badde instantiate_body
../../gcc/cp/pt.c:25881
0x7bb9d0 instantiate_decl(tree_node*, bool, bool)
../../gcc/cp/pt.c:26151
0x7d5feb instantiate_pending_templates(int)
../../gcc/cp/pt.c:26230
0x6ecf72 c_parse_final_cleanups()
../../gcc/cp/decl2.c:4962

[Bug target/98872] ICE leads to SEGV on MMA test case

2021-02-15 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98872

Peter Bergner  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |11.0
 Resolution|--- |FIXED

--- Comment #4 from Peter Bergner  ---
Fixed.  No backport needed.

[Bug c++/99088] Failure to error on recursive template instantiation in a reasonable time

2021-02-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99088

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
GCC 11 appears to be much slower than GCC 10 here because snapshots of trunk
are by default built with extra checking enabled, and your testcase is
exercising a check in the type comparison code that seems to be quadratic in
the depth of the template instantiation.  Release builds (by default) don't
perform this particular check, which is why GCC 10 appears to be much faster. 
You can disable most of these sanity checks when using snapshots of trunk by
passing -fno-checking to the compile command line.

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #46 from Rich Felker  ---
It's a standard and completely reasonable assumption that, if you statically
linked libstdc++ into your shared library, the copy there is for *internal use
only* and cannot share objects of the standard library's types across
boundaries with other libraries or the main application. The problem only comes
when the library's implementation (via templates or inline code in headers)
imposes the same requirement on normal dynamic linking, where it's a
nonstandard and unreasonable one.

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #45 from Florian Weimer  ---
Statically linking libstdc++ into shared objects is also not too uncommon. 
With luck, the libstdc++ symbols are hidden, but operating on globally shared
objects across multiple libstdc++ copies exposes similar issues even without
inlining.

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #44 from Rich Felker  ---
Ugh. I don't know what kind of retroactive fix for that is possible, if any,
but going forward this kind of thing (assumptions that impose ABI boundaries)
should not be inlined by the template. It should just expand to an external
call so that the implementation details can be kept as implementation details
and changed as needed.
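
A minimal sketch of the shape suggested here, assuming a type-erased out-of-line entry point. This is not libstdc++'s implementation: the namespace `once_sketch`, the `once_flag` layout and `call_once_impl` are hypothetical, and the mutex-plus-flag logic is only a placeholder for whatever the library would really do. The point is that the header template contains nothing but a forwarding call.

```cpp
#include <mutex>
#include <type_traits>

namespace once_sketch {

// Placeholder state; a real library would keep this layout stable.
struct once_flag {
    std::mutex lock;
    bool done = false;
};

// Out-of-line entry point. In a real library this definition would live in
// the shared object, so its internals could change without recompiling callers.
void call_once_impl(once_flag& flag, void (*thunk)(void*), void* arg)
{
    std::lock_guard<std::mutex> guard(flag.lock);
    if (!flag.done) {
        thunk(arg);        // if this throws, done stays false
        flag.done = true;
    }
}

// Header-side template: only erases the callable's type and forwards.
// Works for function objects and lambdas, not for plain function references.
template <class Callable>
void call_once(once_flag& flag, Callable&& fn)
{
    using Fn = typename std::remove_reference<Callable>::type;
    call_once_impl(flag,
                   [](void* p) { (*static_cast<Fn*>(p))(); },
                   const_cast<void*>(static_cast<const void*>(&fn)));
}

} // namespace once_sketch
```

Calling `once_sketch::call_once(flag, f)` twice with the same flag runs `f` once. Because callers only ever emit a call to `call_once_impl`, the synchronisation strategy stays an implementation detail of whichever copy of the library provides that symbol.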

[Bug target/99104] [11 Regression] ICE: Segmentation fault (in bitmap_list_find_element)

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99104

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #50186|0   |1
is obsolete||

--- Comment #5 from Jakub Jelinek  ---
Created attachment 50188
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50188&action=edit
gcc11-pr99104.patch

Different (still untested) fix.

[Bug tree-optimization/86010] [8 Regression] redundant memset with smaller size not eliminated

2021-02-15 Thread law at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86010

--- Comment #14 from Jeffrey A. Law  ---
I believe it's still an issue for gcc-8

[Bug target/98872] ICE leads to SEGV on MMA test case

2021-02-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98872

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Peter Bergner :

https://gcc.gnu.org/g:a33927c9ab4af3f4595251ce0c8ba54db821b039

commit r11-7249-ga33927c9ab4af3f4595251ce0c8ba54db821b039
Author: Peter Bergner 
Date:   Mon Feb 15 10:38:33 2021 -0600

rtl-optimization: Fix uninitialized use of opaque mode variable ICE
[PR98872]

The initialize_uninitialized_regs function emits (set (reg:) (CONST0_RTX))
for all uninitialized pseudo uses.  However, some modes (eg, opaque modes)
may not have a CONST0_RTX defined, leading to an ICE when we try and create
the initialization insn.  The fix is to skip emitting the initialization
if there is no CONST0_RTX defined for the mode.

2021-02-15  Peter Bergner  

gcc/
PR rtl-optimization/98872
* init-regs.c (initialize_uninitialized_regs): Skip initialization
if CONST0_RTX is NULL.

gcc/testsuite/
PR rtl-optimization/98872
* gcc.target/powerpc/pr98872.c: New test.

[Bug target/99089] unnecessary zero extend before AND

2021-02-15 Thread wilson at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99089

--- Comment #2 from Jim Wilson  ---
I don't know if REE can do this optimization, but it is a good place to start
looking.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #13 from Jan Hubicka  ---
> And please remeasure with the AppArmor disabled.
> It may slow down each I/O-related syscall rapidly!

I tried disabling AppArmor and it does not make much difference.

[Bug rtl-optimization/99085] [10/11 Regression] ICE: verify_flow_info failed (error: multiple hot/cold transitions found)

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99085

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #2 from Jakub Jelinek  ---
Created attachment 50187
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50187&action=edit
gcc11-pr99085.patch

Untested fix.
The fixup_partitions code can move some bbs from hot to cold partitions (if
they are dominated by cold bbs, either before or after unreachable bb removal).
When not in cfglayout mode after reorder partitions, though, it isn't
sufficient to adjust the edges and corresponding jumps; we also need to move
the bb in the bb chain.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2021-02-15 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 98863, which changed state.

Bug 98863 Summary: [11 Regression] WRF with LTO consumes a lot of memory in REE, FWPROP and x86 specific passes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/98863] [11 Regression] WRF with LTO consumes a lot of memory in REE, FWPROP and x86 specific passes

2021-02-15 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #48 from rsandifo at gcc dot gnu.org ---
Fixed on master.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #12 from Jan Hubicka  ---
> Ah, yeah, that will make a big difference.
> So clang is using 'make check', running a test-suite for a PGO build, right?
It uses 
make check-llvm
make check-clang
and then it rebuilds whole llvm with the instrumented compiler.

Honza

[Bug target/99104] [11 Regression] ICE: Segmentation fault (in bitmap_list_find_element)

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99104

Jakub Jelinek  changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Note, try_split actually makes some attempts to maintain df:
it calls emit_insn_after_setloc and delete_insn; the former in the end calls
df_set_bb_dirty and df_insn_rescan, and the latter df_insn_delete.

So, one question is if ix86_ok_to_clobber_flags can safely use
  FOR_EACH_INSN_USE (use, insn)
    if (DF_REF_REG_USE_P (use) && DF_REF_REGNO (use) == FLAGS_REG)
      return false;

  if (insn_defines_reg (FLAGS_REG, INVALID_REGNUM, insn))
    return true;
and another question is if it can safely use
  live = df_get_live_out(bb);
  return !REGNO_REG_SET_P (live, FLAGS_REG);

And, yes, perhaps a way out of this is a target hook that would initialize df
for insn splitting passes and return TODO_* to be returned.
All the split conditions that can call ix86_ok_to_clobber_flags are guarded
with reload_completed, so we could df_analyze just for split{2,3,4} (split5
isn't even invoked on x86).

[Bug c++/96645] [9/10/11 Regression] std::variant default constructor

2021-02-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96645

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #7 from Patrick Palka  ---
(In reply to Jonathan Wakely from comment #4)
> It looks like the nested class DataWithStruct::A isn't considered complete
> until after DataWithStruct is complete, but I'm not sure why that is.

Maybe the note in [class.mem.general]/7 is relevant:

  A complete-class context of a nested class is also a complete-class context
of any enclosing class, if the nested class is defined within the
member-specification of the enclosing class.

We can't determine if A is constructible until we parse the initializer for
DataWithStruct::A::number.  And according to the above, we can't parse this
initializer until DataWithStruct is complete.

Looks like PR81359 is closely related.
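
A minimal workaround sketch under that reading, assuming the issue is only the nesting: if `A` is defined outside `DataWithStruct`, its default member initializer has already been parsed when the enclosing class asks about constructibility. The namespace `variant_sketch`, the member `a` and the helper `default_number` are hypothetical; only `DataWithStruct`, `A` and `number` come from the report.

```cpp
#include <type_traits>

namespace variant_sketch {

// A is defined at namespace scope, so it is complete and the initializer
// for A::number has been parsed before DataWithStruct is defined.
struct A {
    int number = 5;
};

struct DataWithStruct {
    // With A nested inside DataWithStruct, this query would be made while
    // the initializer for A::number is still unparsed.
    static_assert(std::is_default_constructible<A>::value,
                  "A must be default-constructible here");
    A a;
};

// Value-initializes the outer struct and reads the defaulted member.
int default_number()
{
    return DataWithStruct{}.a.number;
}

} // namespace variant_sketch
```

The same reordering is the usual way to sidestep the problem when a nested type has to be used as a `std::variant` alternative inside its enclosing class.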

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #11 from Martin Liška  ---
> Yep, this is usual profile I see.  Perhaps you want to try profile "make
> check"

Ah, yeah, that will make a big difference.
So clang is using 'make check', running a test-suite for a PGO build, right?

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #10 from Martin Liška  ---
> From the perf it seems that simply the syscall overhead plays important
> role (about 20% at kernel side, plus 9% on glibc side) followed by some
> stupidness of opensuse setup - apparmor and btrfs.

And please remeasure with the AppArmor disabled.
It may slow down each I/O-related syscall rapidly!

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #9 from Jan Hubicka  ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105
> 
> --- Comment #8 from Martin Liška  ---
> This is what I see for GCC PGO in train stage. It's from perf top:
> 
>    4.33%  cc1plus  [.] __gcov_indirect_call_profiler_v4
>    2.28%  cc1plus  [.] __gcov_topn_values_profiler
>    0.85%  cc1plus  [.] ggc_internal_alloc

Yep, this is the usual profile I see.  Perhaps you want to try profiling "make check".
> 
> In the case of GCC, we emit 500 .gcda files.
> 
> @Honza: Can you please test my patch that uses glibc buffered I/O if it helps?

I can give it a try later this week (I would like to collect some data on
performance first)

Honza

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #8 from Martin Liška  ---
This is what I see for GCC PGO in train stage. It's from perf top:

   4.33%  cc1plus       [.] __gcov_indirect_call_profiler_v4
   2.28%  cc1plus       [.] __gcov_topn_values_profiler
   0.85%  cc1plus       [.] ggc_internal_alloc
   0.83%  [kernel]      [k] perf_event_task_tick
   0.72%  libc-2.32.so  [.] _int_malloc
   0.71%  cc1plus       [.] ht_lookup_with_hash
   0.53%  cc1plus       [.] grokdeclarator
   0.48%  cc1plus       [.] df_note_compute
   0.47%  cc1plus       [.] get_ref_base_and_extent
   0.45%  [kernel]      [k] clear_page_rep
   0.44%  cc1plus       [.] _cpp_lex_direct
   0.41%  cc1plus       [.] walk_tree_1
   0.41%  cc1plus       [.] et_splay
   0.41%  cc1plus       [.] bitmap_set_bit
   0.40%  libc-2.32.so  [.] _int_free
   0.39%  cc1plus       [.] bitmap_list_find_element
   0.36%  libc-2.32.so  [.] malloc
   0.35%  cc1plus       [.] operand_compare::operand_equal_p
   0.35%  cc1plus       [.] hash_table::find_slot_with_ha

In the case of GCC, we emit 500 .gcda files.

@Honza: Can you please test my patch that uses glibc buffered I/O if it helps?

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #7 from Jan Hubicka  ---
> > Because user apps may do funny things with stdio such as they do with
> > malloc.  The less library stuff we rely on, the less likely we will hit
> > problems.  So I am not sure if simply fixing i/o isn't a better approach,
> > but I do not know.
> 
> Sure. With the patch, we don't rely on any glibc feature. We will just
> use a default read/write IO (which uses a buffering internally).

Well, buffered I/O is a library feature :)
> > 2727 gcda files, 44MB overall, 4MB xz compressed tar file.
> > I am actually surprised that the file count is quite small. Firefox has
> > more...
> 
> To be honest, it's very small file size. I would expect these files should
> definitely live in a page cache.

For Firefox it is 2500 gcda files, 75MB overall, 6MB compressed.
> 
> What type of disk do you use?

/dev/nvme0n1  7QG00HVT  Seagate FireCuda 520 SSD ZP1000GM30002  1  1.00 TB / 1.00 TB  512 B + 0 B  STNSC014

(it is our new zen3 machine)

Honza
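
A rough model of why the buffer size matters for the figures quoted above (2727 .gcda files, 44MB overall). It assumes every read call fills the buffer except possibly the last and ignores locking, seeks and writes; the 1k buffer size is taken from elsewhere in this thread, and `reads_needed` and `total_reads` are hypothetical helper names, not libgcov functions.

```cpp
#include <cstddef>

// Number of read calls a reader with a fixed-size buffer issues to consume
// a file of file_size bytes. buffer_size must be non-zero.
std::size_t reads_needed(std::size_t file_size, std::size_t buffer_size)
{
    return file_size / buffer_size + (file_size % buffer_size != 0 ? 1 : 0);
}

// Total read calls for file_count files that all have the same size.
std::size_t total_reads(std::size_t file_count,
                        std::size_t avg_file_size,
                        std::size_t buffer_size)
{
    return file_count * reads_needed(avg_file_size, buffer_size);
}
```

With an average file of roughly 16 KB, a 1 KiB buffer means on the order of tens of thousands of read calls per pass over the files, while a 64 KiB buffer would need roughly one call per file. That is consistent with the observation that syscall overhead shows up prominently in the profile.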

[Bug rtl-optimization/98863] [11 Regression] WRF with LTO consumes a lot of memory in REE, FWPROP and x86 specific passes

2021-02-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

--- Comment #47 from CVS Commits  ---
The master branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:abe07a74bb7a2692eff2af151ca54e749ed5eba6

commit r11-7246-gabe07a74bb7a2692eff2af151ca54e749ed5eba6
Author: Richard Sandiford 
Date:   Mon Feb 15 15:05:22 2021 +

rtl-ssa: Reduce the amount of temporary memory needed [PR98863]

The rtl-ssa code uses an on-the-side IL and needs to build that IL
for each block and RTL insn.  I'd originally not used the classical
dominance frontier method for placing phis on the basis that it seemed
like more work in this context: we're having to visit everything in
an RPO walk anyway, so for non-backedge cases we can tell immediately
whether a phi node is needed.  We then speculatively created phis for
registers that are live across backedges and simplified them later.
This avoided having to walk most of the IL twice (once to build the
initial IL, and once to link uses to phis).

However, as shown in PR98863, this leads to excessive temporary
memory in extreme cases, since we had to record the value of
every live register on exit from every block.  In that PR,
there were many registers that were live (but unused) across
a large region of code.

This patch does use the classical approach to placing phis, but tries
to use the existing DF defs information to avoid two walks of the IL.
We still use the previous approach for memory, since there is no
up-front information to indicate whether a block defines memory or not.
However, since memory is just treated as a single unified thing
(like for gimple vops), memory doesn't suffer from the same
scalability problems as registers.

With this change, fwprop no longer seems to be a memory-hog outlier
in the PR: the maximum RSS is similar with and without fwprop.

The PR also shows the problems inherent in using bitmap operations
involving the live-in and live-out sets, which in the testcase are
very large.  I've therefore tried to reduce those operations to the
bare minimum.

The patch also includes other compile-time optimisations motivated
by the PR; see the changelog for details.

I tried adding:

for (int i = 0; i < 200; ++i)
  {
crtl->ssa = new rtl_ssa::function_info (cfun);
delete crtl->ssa;
  }

to fwprop.c to stress the code.  fwprop then took 35% of the compile
time for the problematic partition in the PR (measured on a release
build).  fwprop takes less than .5% of the compile time when running
normally.

The command:

  git diff 0b76990a9d75d97b84014e37519086b81824c307~ gcc/fwprop.c | \
patch -p1 -R

still gives a working compiler that uses the old fwprop.c.  The compile
time with that version is very similar.

For a more reasonable testcase like optabs.ii at -O, I saw a 6.7%
compile time regression with the loop above added (i.e. creating
the info 201 times per pass instead of once per pass).  That goes
down to 4.8% with -O -g.  I can't measure a significant difference
with a normal compiler (no 200-iteration loop).

So I think that (as expected) the patch does make things a bit
slower in the normal case.  But like Richi says, peak memory usage
is harder for users to work around than slightly slower compile times.

gcc/
PR rtl-optimization/98863
* rtl-ssa/functions.h (function_info::bb_live_out_info): Delete.
(function_info::build_info): Turn into a declaration, moving the
definition to internals.h.
(function_info::bb_walker): Declare.
(function_info::create_reg_use): Likewise.
(function_info::calculate_potential_phi_regs): Take a build_info
parameter.
(function_info::place_phis, function_info::create_ebbs): Declare.
(function_info::calculate_ebb_live_in_for_debug): Likewise.
(function_info::populate_backedge_phis): Delete.
(function_info::start_block, function_info::end_block): Declare.
(function_info::populate_phi_inputs): Delete.
(function_info::m_potential_phi_regs): Move information to
build_info.
* rtl-ssa/internals.h: New file.
(function_info::bb_phi_info): New class.
(function_info::build_info): Moved from functions.h.
Add a constructor and destructor.
(function_info::build_info::ebb_use): Delete.
(function_info::build_info::ebb_def): Likewise.
(function_info::build_info::bb_live_out): Likewise.
(function_info::build_info::tmp_ebb_live_in_for_debug): New
variable.
(function_info::build_info::potential_phi_regs): Likewise.
(function_info::build_info::potential_phi_regs_for_debug):
Likewise.
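
For readers unfamiliar with the "classical approach to placing phis" mentioned in the commit message, the following is a minimal textbook sketch, not the rtl-ssa code. It assumes the dominance frontier of every block is already known and computes the iterated dominance frontier of the blocks that define one register; `phi_blocks_for` and the block numbering are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// df[b] lists the blocks in the dominance frontier of block b.
// def_blocks lists the blocks that define the register of interest.
// Returns the blocks that need a phi node for that register.
std::set<int> phi_blocks_for(const std::vector<std::vector<int>>& df,
                             const std::vector<int>& def_blocks)
{
    std::set<int> phis;
    std::vector<int> work(def_blocks);
    std::vector<bool> queued(df.size(), false);
    for (int b : def_blocks)
        queued[static_cast<std::size_t>(b)] = true;

    while (!work.empty()) {
        int b = work.back();
        work.pop_back();
        for (int f : df[static_cast<std::size_t>(b)]) {
            // A phi is itself a definition, so a newly added phi block
            // has to be processed as well.
            if (phis.insert(f).second && !queued[static_cast<std::size_t>(f)]) {
                queued[static_cast<std::size_t>(f)] = true;
                work.push_back(f);
            }
        }
    }
    return phis;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both join in 3), the frontiers are `{{}, {3}, {3}, {}}`, and a definition in block 1 yields a phi in block 3 only. The memory saving described in the commit comes from no longer recording every live register at every block exit, which this sketch does not model.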

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #43 from Jonathan Wakely  ---
(In reply to Rich Felker from comment #42)
> I'm confused why this is an ABI boundary at all. Was the old implementation
> of std::call_once being inlined into callers?

Yes, it's a function template:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/std/mutex;h=12b7e548d179c3a2cb0ed65b6e113031f11293f6;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l710

The call to __gthread_once (which is a weak alias for pthread_once) is on line
729.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #6 from Martin Liška  ---
(In reply to Jan Hubicka from comment #5)
> > > So it effectively replaces gcov's own buffered I/O by stdio.  First I am
> > > not sure how safe it is (as we had a lot of fun about using malloc)
> > 
> > Why is it not safe? We use filesystem locking for .gcda file.
> 
> Because user apps may do funny things with stdio such as they do with
> malloc.  The less library stuff we rely on, the less likely we will hit
> problems.  So I am not sure if simply fixing i/o isn't a better approach,
> but I do not know.

Sure. With the patch, we don't rely on any glibc feature. We will just
use default read/write I/O (which uses buffering internally).

> > 
> > > also it adds dependency on stdio that is not necessarily good idea for
> > > embedded targets. Not sure how often it is used there.
> > 
> > It was motivated by PR97834. Well, I think it's better to rely on a system C
> > library as it provides a faster implementation of buffered I/O.
> > 
> > For embedded targets, I plan to implement hooks that can be used instead of
> > I/O:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559342.html
> > 
> > > 
> > > But why glibc stdio is more effective? Is it because our buffer size of
> > > 1k is way too small (as it seems judging from the profile that is
> > > dominated by fread calls rather than open/lock/close)?
> > 
> > It behaved the same on my machine, but BSD impact was more significant.
> 
> Clang training seems to be a good extreme testcase and not that hard to
> set up. It is relatively large testsuite and streaming is clearly
> dominating over everything.

Sure, I'll set it up.

> 
> Profile also seems quite clear that read dominates other syscall
> overhead.
> > 
> > I'm planning to collect more detailed statistics about why is a lot of small
> > I/Os slower.
> 
> From the perf it seems that simply the syscall overhead plays important
> role (about 20% at kernel side, plus 9% on glibc side) followed by some
> stupidness of opensuse setup - apparmor and btrfs.

Yes, that's pretty obvious from the profile.

> > 
> > In the case of Clang, I would expect 100s (or even 1000s) of object files.
> > During profiling run (using all cores), I would expect each run takes 100ms
> > (or even seconds), so waiting for a file lock of an object file should not
> > block it much.
> 
> 2727 gcda files, 44MB overall, 4MB xz compressed tar file.
> I am actually surprised that the file count is quite small. Firefox has
> more...

To be honest, that's a very small total file size. I would expect these files
to live entirely in the page cache.

What type of disk do you use?

[Bug c++/99107] New: Ignored inconsistent parameter/arguments types in variadic templates

2021-02-15 Thread oleksandr.koval.dev at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99107

Bug ID: 99107
   Summary: Ignored inconsistent parameter/arguments types in
variadic templates
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: oleksandr.koval.dev at gmail dot com
  Target Milestone: ---

template
struct Types {};

struct X {};
struct Y {};
struct Z {};
struct W {
operator Y(){return Y{};}
operator Z(){return Z{};}
};

template
void ctor(Types, Ts...){}

This is an error:

ctor(Types{}, X{}, Y{}, W{});

because of mismatch during deduction: first pack is deduced to  but
second is , conversions are not allowed here.

Here we have the same mismatched types but, currently, it's OK:

ctor(Types{}, {}, {}, W{});

I've noticed this problem while playing with CTAD for aggregates:

template
struct F : Types, Ts...
{};

// error, as expected
F f3 = {Types{}, X{}, W{}};

// should also be an error
F f4 = {Types{}, {}, W{}};
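
The template argument lists in the snippets above appear to have been lost in
this archived copy. As a rough, self-contained sketch of the deduction rule
the report relies on, the following assumes the first argument was
Types<X, Y, Z> (an assumption, not the reporter's exact text) and uses a
hypothetical helper, can_call_ctor, to check at compile time whether a call
to ctor is well-formed. It only covers the cases with explicitly typed
arguments; the braced-argument case that the report is actually about cannot
be expressed through std::declval and is left out.

```cpp
#include <type_traits>
#include <utility>

template <typename... Ts>
struct Types {};

struct X {};
struct Y {};
struct Z {};
struct W {
    operator Y() { return Y{}; }
    operator Z() { return Z{}; }
};

// Ts is deduced twice: once from Types<Ts...> and once from the trailing
// function arguments. Both deductions have to agree.
template <typename... Ts>
void ctor(Types<Ts...>, Ts...) {}

// Hypothetical detection helper (not part of the bug report): true if
// ctor(args...) is a valid call for the given argument types.
template <typename Void, typename... Args>
struct can_call_ctor_impl : std::false_type {};

template <typename... Args>
struct can_call_ctor_impl<decltype(void(ctor(std::declval<Args>()...))), Args...>
    : std::true_type {};

template <typename... Args>
using can_call_ctor = can_call_ctor_impl<void, Args...>;

// Consistent deduction: both packs are X, Y, Z.
static_assert(can_call_ctor<Types<X, Y, Z>, X, Y, Z>::value,
              "matching packs should deduce");

// Inconsistent deduction: X, Y, Z versus X, Y, W. The conversion
// W -> Z is not considered during deduction, so the call is rejected.
static_assert(!can_call_ctor<Types<X, Y, Z>, X, Y, W>::value,
              "mismatched packs should fail deduction");
```

Both static_asserts only exercise the uncontroversial half of the report: an
explicitly typed W argument does not match a pack already deduced as Z.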

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #5 from Jan Hubicka  ---
> > So it effectively replaces gcov's own buffered I/O by stdio.  First I am
> > not sure how safe it is (as we had a lot of fun about using malloc)
> 
> Why is it not safe? We use filesystem locking for the .gcda file.

Because user apps may do funny things with stdio such as they do with
malloc.  Fewer library stuff we rely on, the less likely we will hit the
problems.  So I am not sure if simply fixing i/o isn't better approach,
but I do not know.
> 
> > also it adds dependency on stdio that is not necessarily good idea for
> > embedded targets. Not sure how often it is used there.
> 
> It was motivated by PR97834. Well, I think it's better to rely on a system C
> library as it provides a faster implementation of buffered I/O.
> 
> For embedded targets, I plan to implement hooks that can be used instead of
> I/O:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559342.html
> 
> > 
> > But why glibc stdio is more effective? Is it because our buffer size of
> > 1k is way too small (as it seems judging from the profile that is
> > dominated by fread calls rather than open/lock/close)?
> 
> It behaved the same on my machine, but BSD impact was more significant.

Clang training seems to be a good extreme testcase and not that hard to
set up. It is relatively large testsuite and streaming is clearly
dominating over everything.

Profile also seems quite clear that read dominates other syscall
overhead.
> 
> I'm planning to collect more detailed statistics about why is a lot of small
> I/Os slower.

From the perf it seems that simply the syscall overhead plays important
role (about 20% at kernel side, plus 9% on glibc side) followed by some
stupidness of opensuse setup - apparmor and btrfs.
> 
> In the case of Clang, I would expect 100s (or even 1000s) of object files.
> During profiling run (using all cores), I would expect each run takes 100ms
> (or even seconds), so waiting for a file lock of an object file should not
> block it much.

2727 gcda files, 44MB overall, 4MB xz compressed tar file.
I am actually surprised that the file count is quite small. Firefox has
more...

Honza
> 
> > To avoid waiting for lock one can simply allow multiple profile files to
> > be created and teach libgcov to acquire unlocked file in pseudorandom
> > order.
> > 
> > Honza

[Bug target/99104] [11 Regression] ICE: Segmentation fault (in bitmap_list_find_element)

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99104

--- Comment #3 from Richard Biener  ---
Well, at least if split() does not maintain DF then split patterns may not use
DF.  It's as simple as that ;)  Note you need DF analyze anyway since even
if present, the DF problem might be out-of-date.

Note when you add DF_LIVE you have to remove it manually again if optimize == 1
because there we do not maintain it (but the problem is still not optional).

I guess if the x86 backend wants to use DF but other backends do not and do
not want to pay the compile-time cost of a df_analyze we could put DF
setup into a target hook's hands ... (or a target hook returning the DF problems
it likes to have computed for split).

[Bug ipa/96252] [10/11 Regression] mis-optimization where identical functions have very different codegen since gcc 10

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96252

--- Comment #6 from Jan Hubicka  ---
Thinking of it, perhaps the inliner could also take a hint that it is
inlining a tail call and not produce an unnecessary copy of the
function parameter passed by value.

More generally, mod/ref has good chance to determine that parameter in
its original location is not modified by the call and we could avoid the
copy even for non-tailcalls?

Still would be interesting to know why copy propagation gives up.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #4 from Martin Liška  ---
(In reply to Jan Hubicka from comment #3)
> > A small improvement can be achieved by the removal of libgcov I/O buffering:
> > https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=5a17015c096012b9e43a8dd45768a8d5fb3a3aee
> 
> So it effectively replaces gcov's own buffered I/O by stdio.  First I am
> not sure how safe it is (as we had a lot of fun about using malloc)

Why is it not safe? We use filesystem locking for the .gcda file.

> also it adds dependency on stdio that is not necessarily good idea for
> embedded targets. Not sure how often it is used there.

It was motivated by PR97834. Well, I think it's better to rely on a system C
library as it provides a faster implementation of buffered I/O.

For embedded targets, I plan to implement hooks that can be used instead of
I/O:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559342.html

> 
> But why glibc stdio is more effective? Is it because our buffer size of
> 1k is way too small (as it seems judging from the profile that is
> dominated by fread calls rather than open/lock/close)?

It behaved the same on my machine, but BSD impact was more significant.

> > 
> > But the key thing is likely the ability to omit profile modifications
> > (read/modify/write) for parts of a binary that are not trained.
> Problem there are the per-program summaries that needs to be updated
> even for files never visited.
> 
> It seems that producing one file with tar-like format that can be
> expanded to gcda files by gcov-tool would be good idea. Even if we need
> to lock whole file it is probably faster than a lot of small I/Os.

I'm planning to collect more detailed statistics about why is a lot of small
I/Os slower.

In the case of Clang, I would expect 100s (or even 1000s) of object files.
During profiling run (using all cores), I would expect each run takes 100ms
(or even seconds), so waiting for a file lock of an object file should not
block it much.

> To avoid waiting for lock one can simply allow multiple profile files to
> be created and teach libgcov to acquire unlocked file in pseudorandom
> order.
> 
> Honza
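
The quoted idea of keeping several profile files and grabbing whichever one is
currently unlocked could look roughly like the sketch below. It is only an
illustration: the atomic flags stand in for file locks, and acquire_any_slot
is a made-up name, not anything in libgcov.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Each element models the lock of one profile file: true means "held".
// Tries the slots in a pseudorandom order derived from seed and returns the
// index of the first slot it managed to acquire, or -1 if all were busy.
int acquire_any_slot(std::vector<std::atomic<bool>>& busy, unsigned seed) {
    std::vector<std::size_t> order(busy.size());
    std::iota(order.begin(), order.end(), std::size_t{0});

    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);

    for (std::size_t i : order) {
        // exchange() returns the previous value, so "false" means the slot
        // was free and is now owned by this caller.
        if (!busy[i].exchange(true))
            return static_cast<int>(i);
    }
    return -1;
}
```

A real implementation would presumably fall back to a blocking lock on one of
the files instead of returning -1, and would need gcov-tool to merge the
resulting files afterwards.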

[Bug rtl-optimization/98791] [11 Regression] ICE in paradoxical_subreg_p (in ira) with SVE

2021-02-15 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98791

--- Comment #3 from Alex Coplan  ---
And here is a testcase (using SVE intrinsics) that ICEs without the param:

#include <arm_sve.h>
extern char a[11];
extern long b[];
void f() {
  for (int d; d < 10; d++) {
a[d] = svaddv(svptrue_b8(), svdup_u8(0));
b[d] = 0;
  }
}

i.e. with just -O -ftree-vectorize -march=armv8.2-a+sve.

[Bug sanitizer/99106] New: ICE in tree_to_poly_int64, at tree.c:3091

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99106

            Bug ID: 99106
           Summary: ICE in tree_to_poly_int64, at tree.c:3091
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marxin at gcc dot gnu.org
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org,
                    marxin at gcc dot gnu.org
  Target Milestone: ---

Started with the same revision as PR99033:

$ g++ /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/flexary38.C -c -fsanitize=undefined
during GIMPLE pass: ubsan
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/flexary38.C: In function ‘void foo()’:
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/ext/flexary38.C:18:1: internal compiler error: in tree_to_poly_int64, at tree.c:3091
   18 | }
      | ^
0x87f81e tree_to_poly_int64(tree_node const*)
        /home/marxin/Programming/gcc/gcc/tree.c:3091
0x87f81e tree_to_poly_int64(tree_node const*)
        /home/marxin/Programming/gcc/gcc/tree.c:3089
0x148775e component_ref_size(tree_node*, special_array_member*)
        /home/marxin/Programming/gcc/gcc/tree.c:13920
0x1232ece decl_init_size(tree_node*, bool)
        /home/marxin/Programming/gcc/gcc/tree-object-size.c:196
0x12336ca addr_object_size
        /home/marxin/Programming/gcc/gcc/tree-object-size.c:285
0x123556b compute_builtin_object_size(tree_node*, int, unsigned long*, tree_node**, tree_node**)
        /home/marxin/Programming/gcc/gcc/tree-object-size.c:570
0x11a6565 instrument_object_size
        /home/marxin/Programming/gcc/gcc/ubsan.c:2150
0x11ab575 execute
        /home/marxin/Programming/gcc/gcc/ubsan.c:2405
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #3 from Jan Hubicka  ---
> A small improvement can be achieved by the removal of libgcov I/O buffering:
> https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=5a17015c096012b9e43a8dd45768a8d5fb3a3aee

So it effectively replaces gcov's own buffered I/O by stdio.  First I am
not sure how safe it is (as we had a lot of fun about using malloc) and
also it adds dependency on stdio that is not necessarily good idea for
embedded targets. Not sure how often it is used there.

But why glibc stdio is more effective? Is it because our buffer size of
1k is way too small (as it seems judging from the profile that is
dominated by fread calls rather than open/lock/close)?
> 
> But the key thing is likely the ability to omit profile modifications
> (read/modify/write) for parts of a binary that are not trained.
Problem there are the per-program summaries that needs to be updated
even for files never visited.

It seems that producing one file with tar-like format that can be
expanded to gcda files by gcov-tool would be good idea. Even if we need
to lock whole file it is probably faster than a lot of small I/Os.
To avoid waiting for lock one can simply allow multiple profile files to
be created and teach libgcov to acquire unlocked file in pseudorandom
order.

Honza
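
To make the buffer-size question above a bit more concrete, here is a small
model (not libgcov code) that counts how many underlying read calls a
word-at-a-time reader issues for a given buffer size. CountingSource and
reads_needed are illustrative names, each refill is assumed to map to one
read syscall, and both sizes are assumed to be multiples of 4.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// In-memory stand-in for a .gcda file. Every call to read() models one
// read syscall, including the final call that reports end of file.
struct CountingSource {
    std::vector<unsigned char> data;
    std::size_t pos = 0;
    std::size_t calls = 0;

    std::size_t read(unsigned char* dst, std::size_t n) {
        ++calls;
        n = std::min(n, data.size() - pos);
        if (n != 0)
            std::memcpy(dst, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Consumes a file of file_size bytes as 4-byte words through a buffer of
// buf_size bytes and returns the number of underlying read calls.
std::size_t reads_needed(std::size_t file_size, std::size_t buf_size) {
    CountingSource src;
    src.data.assign(file_size, 0);
    std::vector<unsigned char> buf(buf_size);

    std::size_t avail = 0;  // bytes currently in the buffer
    std::size_t off = 0;    // bytes of the buffer already consumed
    for (;;) {
        if (off == avail) {  // buffer drained: refill with one read
            avail = src.read(buf.data(), buf.size());
            off = 0;
            if (avail == 0)
                break;       // end of file
        }
        off += 4;            // consume one word
    }
    return src.calls;
}
```

With the numbers from this thread (about 44MB over 2727 files, so roughly 16k
per file), a 1k buffer needs one call per kilobyte plus the end-of-file call,
while a buffer at least as large as the file needs only two calls in this
model.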

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #42 from Rich Felker  ---
I'm confused why this is an ABI boundary at all. Was the old implementation of
std::call_once being inlined into callers? Otherwise all code operating on the
same once object should be using a common implementation, either the old one or
the new one, from libstdc++.

[Bug c++/95615] [coroutines] Coroutine frame and promise is leaked if exception thrown from promise.initial_suspend()

2021-02-15 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95615

--- Comment #3 from Iain Sandoe  ---
Actually, I don't think the example goes far enough.

ISTM, [on exception in the places noted] that non-trivial DTORs for frame
copies of parms should be run after the GRO and promise DTOR but before the
frame is freed.

Do you agree? (if so I have a patch for this now)

So, I think the modified example below should print:

Y ()
Y (const Y&)
operator new()
Y (const Y&)
promise_type()
get_return_object()
task()
~task()
~promise_type()
~Y ()
operator delete
~Y ()
caught X
~Y ()

===
but *both* [master] GCC and clang print:

Y ()
Y (const Y&)
operator new()
Y (const Y&)
promise_type()
task()
~Y ()
~task()
~Y ()

===

#if __has_include(<coroutine>)
#include <coroutine>
#else
#include <experimental/coroutine>
namespace std {
using namespace std::experimental;
}
#endif
#include <cstdio>

struct X {};

int Y_live = 0;

struct Y
{
  Y () { std::puts("Y ()"); Y_live++; } 
  Y (const Y&) { std::puts("Y (const Y&)"); Y_live++; } 
  ~Y () { std::puts("~Y ()"); Y_live--; }
};

struct task {
    struct promise_type {
        void* operator new(size_t sz) {
            std::puts("operator new()");
            return ::operator new(sz);
        }

        void operator delete(void* p, size_t sz) {
            std::puts("operator delete");
            return ::operator delete(p, sz);
        }

        promise_type() {
            std::puts("promise_type()");
            // throw X{};
        }

        ~promise_type() {
            std::puts("~promise_type()");
        }

        struct awaiter {
            bool await_ready() {
                //throw X{};
                return false;
            }
            void await_suspend(std::coroutine_handle<>) {
                //throw X{};
            }
            void await_resume() {
                //throw X{};
            }
        };

        awaiter initial_suspend() {
            // throw X{};
            return {};
        }

        task get_return_object() {
            // throw X{};
            return task{};
        }

        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() noexcept {}
        void unhandled_exception() noexcept {
            std::puts("unhandled_exception()");
        }
    };

    task() { std::puts("task()"); }
    ~task() { std::puts("~task()"); }
    task(task&&) { std::puts("task(task&&)"); }
};

task f(Y Val) {
    co_return;
}

int main() {
    Y Val;
    try {
        auto a = f(Val);
    } catch (X) {
        std::puts("caught X");
    }
}



[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

Martin Liška  changed:

            What    |Removed                       |Added

              Status|UNCONFIRMED                   |ASSIGNED
    Last reconfirmed|                              |2021-02-15
      Ever confirmed|0                             |1
    Target Milestone|---                           |12.0
            Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #2 from Martin Liška  ---
Thank you for the bug report. It's really something we should improve for the
next GCC release.
A small improvement can be achieved by the removal of libgcov I/O buffering:
https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=5a17015c096012b9e43a8dd45768a8d5fb3a3aee

But the key thing is likely the ability to omit profile modifications
(read/modify/write) for parts of a binary that are not trained.

[Bug gcov-profile/99105] profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #1 from Jan Hubicka  ---
I use following:

cmake -G 'Unix Makefiles' /home/jan/llvm-project/llvm \
  -DCLANG_TABLEGEN=/home/jan/llmm-build2-lto-fdo/stage1/bin/clang-tblgen \
  -DCMAKE_AR=/home/jan/trunk-instal/bin/gcc-ar \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_COMPILER=/home/jan/trunk-install/bin/g++ \
  -DCMAKE_C_COMPILER=/home/jan/trunk-install/bin/gcc \
  '-DCMAKE_INSTALL_PREFIX=~/llvm11-install-gcc-instrument-lto' \
  -DCMAKE_RANLIB=/home/jan/trunk-install/bin/gcc-ranlib \
  -DLLVM_BINUTILS_INCDIR=/home/jan//binutils-gdb/include \
  -DLLVM_BUILD_RUNTIME=No \
  -DLLVM_TABLEGEN=/home/jan/llmm-build2-lto-fdo/stage1/bin/llvm-tblgen \
  -DCMAKE_AR=/home/jan/trunk-install/bin/gcc-ar \
  -DCMAKE_C_FLAGS="-O3 -flto -fprofile-generate -fno-semantic-interposition" \
  -DCMAKE_CXX_FLAGS="-O3 -flto -fprofile-generate -fno-semantic-interposition"
make -j16 clang lld LLVMgold
make -j16 check-llvm check-clang


which is extracted from llvm profile feedback building script.

[Bug gcov-profile/99105] New: profile streaming scales poorly to projects with many source files

2021-02-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

            Bug ID: 99105
           Summary: profile streaming scales poorly to projects with many
                    source files
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Compared to clang we need significantly longer time to train Firefox (25
minutes compared to 7) and to run clang's
make check-clang
which takes 12 hours compared to 27 minutes.

Most of the time is spent in the kernel doing I/O.  I suppose we really should
consider optionally producing per-binary rather than per-source-file profile
data dumps and omitting untrained parts of the program.
This is perf top of running the llvm testsuite (first with and second without
kernel symbols).  Merging of topn also seems to be high in the profile now.


Overhead  Shared ObjectSymbol   
   8.58%  libc-2.32.so [.] read 
   7.24%  [kernel] [k] __x86_indirect_thunk_rax 
   7.15%  [kernel] [k] entry_SYSCALL_64 
   6.43%  [kernel] [k] __x64_sys_read   
   5.70%  [kernel] [k] apparmor_file_permission 
   5.49%  [kernel] [k] generic_file_buffered_read   
   4.45%  [kernel] [k] btrfs_file_read_iter 
   4.10%  [kernel] [k] syscall_return_via_sysret
   3.31%  [kernel] [k] new_sync_read
   3.07%  libc-2.32.so [.] _IO_file_xsgetn  
   2.77%  [kernel] [k] find_get_entry   
   2.76%  libc-2.32.so [.] _IO_fread
   2.60%  [kernel] [k] current_time 
   2.33%  [kernel] [k] atime_needs_update   
   2.18%  [kernel] [k] vfs_read 
   2.11%  clang-11 [.] __gcov_merge_topn
   2.02%  [kernel] [k] pagecache_get_page   
   1.97%  [kernel] [k] entry_SYSCALL_64_after_hwframe   
   1.89%  clang-11 [.] gcov_read_words  
   1.76%  [kernel] [k] __fsnotify_parent
   1.67%  [kernel] [k] syscall_exit_to_user_mode
   1.60%  [kernel] [k] ksys_read
   1.40%  [kernel] [k] security_file_permission 
   1.30%  [kernel] [k] aa_file_perm 
   1.23%  [kernel] [k] syscall_enter_from_user_mode 
   1.11%  [kernel] [k] touch_atime  
   1.02%  [kernel] [k] exit_to_user_mode_prepare
   0.99%  [kernel] [k] xas_load 
   0.95%  [kernel] [k] xas_start
   0.74%  [kernel] [k] __fget_light 
   0.71%  [kernel] [k] __fdget_pos  
   0.69%  clang-11 [.] __gcov_read_counter  
   0.64%  [kernel] [k] do_syscall_64
   0.58%  [kernel] [k] ktime_get_coarse_real_ts64   
   0.55%  [kernel] [k] rw_verify_area   
   0.50%  libc-2.32.so [.] _IO_sgetn
   0.50%  [kernel] [k] PageHuge 
   0.45%  perf [.] rb_next  
   0.38%  [kernel] [k] iov_iter_init
For a higher level overview, try: perf top --sort comm,dso  



Overhead  Shared Object   Symbol
  43.43%  libc-2.32.so[.] read  
  12.00%  libc-2.32.so[.] _IO_file_xsgetn   
  11.80%  libc-2.32.so[.] _IO_fread 
   7.89%  clang-11[.] __gcov_merge_topn 
   7.28%  clang-11[.] gcov_read_words   
   2.32%  clang-11[.] __gcov_read_counter   
   2.28%  libc-2.32.so[.] _IO_sgetn 
   

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #10 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #7)
> There are a lot of targets that define REG_ALLOC_ORDER ^
> HONOR_REG_ALLOC_ORDER and thus are affected by this change...

The following patch should solve this issue:

--cut here--
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 91216593e75..2af4add0c05 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1047,7 +1047,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.
 If not, see
 #endif

 #ifndef HONOR_REG_ALLOC_ORDER
-#define HONOR_REG_ALLOC_ORDER 0
+# if defined REG_ALLOC_ORDER
+#  define HONOR_REG_ALLOC_ORDER 1
+# else
+#  define HONOR_REG_ALLOC_ORDER 0
+# endif
 #endif

 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
--cut here--

So, if REG_ALLOC_ORDER is defined, then IRA should obey the order by default.
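
The effect of that defaulting logic can be checked in isolation. The sketch
below copies the preprocessor block from the patch and puts a made-up
REG_ALLOC_ORDER definition in front of it, as a target header might; the
register list and the helper function name are illustrative only.

```cpp
// Hypothetical target header: defines an allocation order but does not
// mention HONOR_REG_ALLOC_ORDER at all.
#define REG_ALLOC_ORDER { 0, 1, 2, 3 }

// Defaulting logic as in the proposed defaults.h change.
#ifndef HONOR_REG_ALLOC_ORDER
# if defined REG_ALLOC_ORDER
#  define HONOR_REG_ALLOC_ORDER 1
# else
#  define HONOR_REG_ALLOC_ORDER 0
# endif
#endif

// Expose the resulting value so it can be inspected from ordinary code.
constexpr int honor_reg_alloc_order_default() { return HONOR_REG_ALLOC_ORDER; }

static_assert(honor_reg_alloc_order_default() == 1,
              "a target that defines REG_ALLOC_ORDER honors it by default");
```

A target that wants the old behaviour would still be able to define
HONOR_REG_ALLOC_ORDER to 0 explicitly, since the outer #ifndef leaves an
existing definition alone.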

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #9 from Martin Jambor  ---
I will benchmark the patch later this week, just so that we know, but I agree
that reverting the patch and applying it again at the beginning of stage1 is
probably the best.

[Bug tree-optimization/99101] optimization bug with -ffinite-loops

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99101

--- Comment #7 from Richard Biener  ---
So

@@ -1661,6 +1662,7 @@ perform_tree_ssa_dce (bool aggressive)
   if (aggressive)
 {
   /* Compute control dependence.  */
+  connect_infinite_loops_to_exit ();
   calculate_dominance_info (CDI_POST_DOMINATORS);
   cd = new control_dependences ();


"fixes" it (plus related changes), but

@@ -1661,6 +1662,7 @@ perform_tree_ssa_dce (bool aggressive)
   if (aggressive)
 {
   /* Compute control dependence.  */
+  add_noreturn_fake_exit_edges ();
+  connect_infinite_loops_to_exit ();
   calculate_dominance_info (CDI_POST_DOMINATORS);
   cd = new control_dependences ();


breaks it again.  Similar to perturbing block numbering in a way so that
connect_infinite_loops_to_exit would first visit the noreturn block.

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #8 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #7)
> Btw, for GCC 11 it might be tempting to simply revert the "no-op" change?

I agree, this is the safest way at this time. The situation now looks like
going down a rabbit hole.

> There are a lot of targets that define REG_ALLOC_ORDER ^
> HONOR_REG_ALLOC_ORDER and thus are affected by this change...

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2021-02-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

--- Comment #41 from Jonathan Wakely  ---
The new std::call_once using a futex is not backwards compatible, so I think it
needs to be reverted, or hidden behind an ABI-breaking flag.

The new std::once_flag::_M_activate() function sets _M_once=1 when an
initialization is in progress.

With glibc, if a call to the new std::call_once happens before a call to the
old version of std::call_once, then glibc's pthread_once will find no fork
generation value in _M_once and so will think it should run the init_function
itself. Both threads will run their init_function, instead of the second one
waiting for the first to finish.

With musl, if a call to the old std::call_once happens before a call to the new
std::call_once, then the second thread won't set _M_once=3 and so musl's
pthread_once won't wake the second thread when the first finishes. The second
thread will sleep forever (or until a spurious wake from the futex wait).
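
For reference, the guarantee both failure modes above violate is the one shown
in this small sketch: however many threads race on the same std::once_flag,
the callable runs exactly once and the other callers wait until it has
finished. The function name run_init_from_threads is made up for the example;
it says nothing about how either implementation encodes _M_once.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Starts n_threads threads that all call std::call_once on the same flag
// and returns how many times the initializer actually ran.
int run_init_from_threads(int n_threads) {
    std::once_flag flag;
    std::atomic<int> init_runs{0};

    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&] {
            std::call_once(flag, [&] { ++init_runs; });
        });
    }
    for (auto& t : threads)
        t.join();

    return init_runs.load();
}
```

In a conforming implementation the returned count is 1 regardless of
n_threads (for n_threads > 0). Mixing translation units built against the old
and new call_once is exactly what this single-binary test cannot exercise.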

[Bug target/99104] [11 Regression] ICE: Segmentation fault (in bitmap_list_find_element)

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99104

--- Comment #2 from Jakub Jelinek  ---
Created attachment 50186
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50186&action=edit
gcc11-pr99104.patch

What a mess!  Can we finally get rid of sel-sched?
The x86 backend uses DF inside of the splitter conditions which looks
problematic as the split passes don't really call df_analyze but so far we were
just lucky that the pass right before it maintained df.
Selective scheduling (unlike normal scheduling) can create new blocks with
insns that may need splitting and nothing computes the live or lr problem for
those.
So, either the backend needs to cope with df_get_live_out returning NULL, or,
because the pass_split_before_regstack pass is a single backend specific pass
(i386) we can call df_analyze there.

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #7 from Richard Biener  ---
Btw, for GCC 11 it might be tempting to simply revert the "no-op" change?

There are a lot of targets that define REG_ALLOC_ORDER ^ HONOR_REG_ALLOC_ORDER
and thus are affected by this change...

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #6 from Uroš Bizjak  ---
As a side note, it is strange that ADJUST_REG_ALLOC_ORDER somehow requires
REG_ALLOC_ORDER to be defined (c.f. Comment #3), while its documentation says:

 The macro body should not assume anything about the contents of
 'reg_alloc_order' before execution of the macro.

This mess begs for the redefinition of REG_ALLOC_ORDER/ADJUST_REG_ALLOC_ORDER
as a target hook.

[Bug fortran/98014] [Fortran][OpenACC][OpenMP] Empty '!$acc'/'!$omp' continuation line rejected

2021-02-15 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98014

Tobias Burnus  changed:

            What    |Removed                       |Added

             Summary|[Fortran OpenACC] Empty       |[Fortran][OpenACC][OpenMP]
                    |'!$acc' continuation line     |Empty '!$acc'/'!$omp'
                    |rejected                      |continuation line rejected
            Keywords|                              |openmp

--- Comment #2 from Tobias Burnus  ---
The following compiles here:

!$acc parallel &
!$acc   vector_length (1) &
!$acc& ! { dg-warning }
!$acc end parallel
end

If I remove the '&' after '!$acc', I get the error:

2 | !$acc   vector_length (1) &
  |   1
Error: Failed to match clause at (1)

Same with OpenMP.

 * * *

I think there is a bug in gfortran to not accept the first version – and not to
reject the second version which should be invalid for the reasons given below.

 * * *

In any case, I believe this is an omission in the OpenACC and OpenMP specs,
which do not say how to interpret that line. Namely, the following is
ambiguous:

!$omp parallel &
!$omp &
!$omp if(.false.)
vs.
!$omp parallel &
!$omp &
!$omp do

Is the 'omp &' line the last line of the directive? Or just in the middle of a
three-line directive?

For Fortran itself, a single '&' is invalid in free form for the same reasons.
→ "No line shall contain a single “&” as the only nonblank character or as the
only nonblank character before an “!” that initiates a comment." (F2018,
6.3.2.4 Free form statement continuation).

 * * *

On the specification side, I have opened a ticket
* for OpenACC, Issue 353
* for OpenMP,  Issue 2668.

[Bug target/99102] [11 Regression] SVE: Wrong code with -O2 -ftree-vectorize -march=armv8.2-a+sve -msve-vector-bits=256

2021-02-15 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99102

ktkachov at gcc dot gnu.org changed:

            What    |Removed                       |Added

                  CC|                              |ktkachov at gcc dot gnu.org
            Priority|P3                            |P1

[Bug target/99104] [11 Regression] ICE: Segmentation fault (in bitmap_list_find_element)

2021-02-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99104

Jakub Jelinek  changed:

            What    |Removed                       |Added

                  CC|                              |jakub at gcc dot gnu.org
              Status|UNCONFIRMED                   |ASSIGNED
      Ever confirmed|0                             |1
    Target Milestone|---                           |11.0
            Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
    Last reconfirmed|                              |2021-02-15
            Priority|P3                            |P1

--- Comment #1 from Jakub Jelinek  ---
Started with my r11-7235-g05402ca65a6696a8f20a3dbcb18f47ba3bdfa268
I'll have a look.

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #5 from Uroš Bizjak  ---
Martin, can you please benchmark the patch from Comment #4?

The patch is not totally trivial, because it introduces HONOR_REG_ALLOC_ORDER
to x86 and this define disables some other code in ira-color.c,
assign_hard_reg:

  if (!HONOR_REG_ALLOC_ORDER)
    {
      if ((saved_nregs = calculate_saved_nregs (hard_regno, mode)) != 0)
        /* We need to save/restore the hard register in
           epilogue/prologue.  Therefore we increase the cost.  */
        {
          rclass = REGNO_REG_CLASS (hard_regno);
          add_cost = ((ira_memory_move_cost[mode][rclass][0]
                       + ira_memory_move_cost[mode][rclass][1])
                      * saved_nregs / hard_regno_nregs (hard_regno,
                                                        mode) - 1);
          cost += add_cost;
          full_cost += add_cost;
        }
    }
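
As a quick illustration of what the add_cost expression above computes, the
same integer arithmetic can be written as a standalone function. The function
and parameter names are placeholders and the numbers in the usage note are
made up, not real x86 costs.

```cpp
// Mirrors the arithmetic of add_cost in the quoted assign_hard_reg code:
//   (cost0 + cost1) * saved_nregs / nregs - 1
// cost0 and cost1 stand for the two ira_memory_move_cost entries, and nregs
// for hard_regno_nregs (hard_regno, mode). All operations are integer
// operations, evaluated left to right as in the original expression.
int save_restore_add_cost(int cost0, int cost1, int saved_nregs, int nregs) {
    return (cost0 + cost1) * saved_nregs / nregs - 1;
}
```

For example, with made-up costs of 4 and 6, one saved register and a
single-register mode, the extra cost is 9; if the mode spans two hard
registers and only one of them needs saving, it drops to 4. With
HONOR_REG_ALLOC_ORDER set, none of this is added.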

[Bug target/99083] Big run-time regressions of 519.lbm_r with LTO

2021-02-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99083

--- Comment #4 from Uroš Bizjak  ---
Created attachment 50185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50185&action=edit
Proposed patch

Proposed patch that fixes ira-color.c and introduces HONOR_REG_ALLOC_ORDER.
