[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-28 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789

--- Comment #34 from Hongtao.liu  ---
(In reply to Kewen Lin from comment #29)
> (In reply to Hongtao.liu from comment #28)
> > > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
> > 
> > Yes, it's the place.
> > 
> > > is UB to UH conversion statement, further check if the def of the input UB
> > > is MEM.
> > 
> > Only if there's no multi-use for UB. More generally, it's quite difficult to
> > guess later optimizations for the purpose of more accurate vectorization
> > cost model, :(.
> 
> Yeah, it's hard sadly. The generic cost modeling is rough,
> ix86_add_stmt_cost is more fine-grain (at least than what we have on Power
> :)), if you want to check it more, it seems doable in target specific hook
> finish_cost where you can get the whole vinfo object, but it could end up
> with very heavy analysis and might not be worthy.
> 
> Do you mind to check if it can also fix this degradation on x86 to run FRE
> and DSE just after cunroll? I found it worked for Power, hoped it can help
> there too.

No, it's not working for CLX; the problem in the i386 backend is a bit different.

[Bug c/97206] [11 Regression] internal compiler error: in composite_type, at c/c-typeck.c:447 since r11-3303-g6450f07388f9fe57

2020-09-28 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97206

Martin Sebor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jsm28 at gcc dot gnu.org

--- Comment #5 from Martin Sebor  ---
Joseph, any idea what might be causing this?  The test case in comment #4 has
me stumped.

[Bug ipa/97235] New: ICE in duplicate, at ipa-prop.c:4251

2020-09-28 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97235

            Bug ID: 97235
           Summary: ICE in duplicate, at ipa-prop.c:4251
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: ice-on-invalid-code
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

gcc-11.0.0-alpha20200927 snapshot (g:e24817aa7a1c6d12039b486ab5ea9b5ee0a46cd4)
ICEs when compiling the following testcase w/ -O2 -fno-ipa-modref:

struct ie {
  int hk;
};

typedef int
gn (int, struct ie *);

int
de (int, struct ie *);

int
qc (int, struct ie *);

int bh;

int
uj (gn yz)
{
  struct ie *se = 0;

  if (se->hk == 1)
    return yz (bh, se);

  return 1;
}

int
n6 (void)
{
  (void) uj (qc);
  (void) uj (de);
}

% gcc-11.0.0 -O2 -fno-ipa-modref -c a7m1ghpt.c
during IPA pass: inline
a7m1ghpt.c: In function 'n6':
a7m1ghpt.c:31:10: internal compiler error: in duplicate, at ipa-prop.c:4251
   31 |   (void) uj (de);
      |          ^~~
0x6741de ipa_edge_args_sum_t::duplicate(cgraph_edge*, cgraph_edge*, ipa_edge_args*, ipa_edge_args*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/ipa-prop.c:4251
0x95912b symbol_table::call_edge_duplication_hooks(cgraph_edge*, cgraph_edge*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraph.c:451
0x96d12f cgraph_edge::clone(cgraph_node*, gcall*, unsigned int, profile_count, profile_count, bool)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphclones.c:149
0xe322d7 copy_bb
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:2266
0xe332fb copy_cfg_body
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:3042
0xe332fb copy_body
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:3290
0xe36714 expand_call_inline
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:5076
0xe38391 gimple_expand_calls_inline
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:5266
0xe38391 optimize_inline_calls(tree_node*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/tree-inline.c:5439
0xb92863 inline_transform(cgraph_node*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/ipa-inline-transform.c:741
0xccd67d execute_one_ipa_transform_pass
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/passes.c:2240
0xccd67d execute_all_ipa_transforms(bool)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/passes.c:2279
0x968683 cgraph_node::expand()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphunit.c:2302
0x969c1d expand_all_functions
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphunit.c:2480
0x969c1d symbol_table::compile()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphunit.c:2843
0x96c092 symbol_table::compile()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphunit.c:2756
0x96c092 symbol_table::finalize_compilation_unit()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/cgraphunit.c:3021

[Bug target/93176] PPC: inefficient 64-bit constant consecutive ones

2020-09-28 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93176

Peter Bergner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|bergner at gcc dot gnu.org  |amodra at gcc dot gnu.org

--- Comment #4 from Peter Bergner  ---
(In reply to Peter Bergner from comment #3)
> I submitted a patch that implements this idea.

Actually, Alan is taking this over, so reassigning to him.

[Bug c++/97222] GCC discards attributes aligned and may_alias for typedefs passed as template arguments

2020-09-28 Thread richard-gccbugzilla at metafoo dot co.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97222

Richard Smith changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |richard-gccbugzilla at metafoo dot co.uk

--- Comment #3 from Richard Smith  ---
> would be interesting to see how ICC mangles the aligned case

It doesn't; it just takes the properties from whichever instantiation happens
to be performed first. On ICC,

 std::cout << alignof(typename identity::type) << std::endl;
 std::cout << alignof(typename identity::type) << std::endl;

prints 4 4, and 

 std::cout << alignof(typename identity::type) << std::endl;
 std::cout << alignof(typename identity::type) << std::endl;

prints 16 16. The ICC behavior seems unsound, compared to the GCC / Clang
behavior of (effectively) stripping the attribute from template arguments.
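The stripping behavior described for GCC and Clang can be observed with a small sketch. The names `aligned_int` and `identity` below are hypothetical and are not taken from the original test case; the typedef uses GNU attribute syntax, so this assumes GCC or Clang (GCC is expected to emit an "ignoring attributes on template argument" warning).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical over-aligned typedef (GNU attribute syntax, GCC/Clang only).
typedef int aligned_int __attribute__((aligned(16)));

template <typename T>
struct identity {
    typedef T type;
};

// If the attribute is dropped from the template argument, identity<aligned_int>
// and identity<int> name the same specialization, so both alignments agree.
constexpr std::size_t via_plain = alignof(identity<int>::type);
constexpr std::size_t via_aligned = alignof(identity<aligned_int>::type);

static_assert(via_plain == via_aligned,
              "the attribute does not survive being a template argument");
```

The two values agree regardless of instantiation order only because the attribute is not part of the type; that is the property the comment contrasts with the ICC results.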

> There is NO way defined at this point to mangle for some attributes
> including but not limited to may_alias and alignment.

These can be mangled as <extended-qualifier>s:
http://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.qualified-type

Presumably the more problematic part from an ABI perspective is that this will
change a bunch of existing manglings. Similarly from a source compatibility
standpoint, making the alignment override part of the type would be a breaking
change. It'd probably be better to add a new syntax for the new functionality
(please, not based on an attributed typedef this time!) and deprecate the old
way.

[Bug c++/97234] New: Constexpr class-scope array initializer referencing previous elements

2020-09-28 Thread botond at mozilla dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97234

            Bug ID: 97234
           Summary: Constexpr class-scope array initializer referencing
                    previous elements
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: botond at mozilla dot com
  Target Milestone: ---

The following code:

struct S {
    static constexpr int rolling_sum[4]{
        0,
        rolling_sum[0] + 1,
        rolling_sum[1] + 2,
        rolling_sum[2] + 3
    };
};

produces errors when compiled with g++ 10:

test.cpp:4:9: error: ‘rolling_sum’ was not declared in this scope
    4 |         rolling_sum[0] + 1,
      |         ^~~
test.cpp:5:9: error: ‘rolling_sum’ was not declared in this scope
    5 |         rolling_sum[1] + 2,
      |         ^~~
test.cpp:6:9: error: ‘rolling_sum’ was not declared in this scope
    6 |         rolling_sum[2] + 3
      |         ^~~

The code is accepted by clang. It's also accepted by gcc if the array is
declared at namespace scope, leading me to believe that rejecting it at class
scope is likely a bug.
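Until the class-scope case is accepted, one possible workaround is to build the table in a constexpr function so that no initializer names the array being declared. This is only a sketch and is not part of the report: `rolling_table`, `make_rolling_sum` and `S2` are hypothetical names, and it assumes C++14 or later for the loop inside a constexpr function.

```cpp
#include <cassert>

// Hypothetical wrapper type so the table can be returned by value.
struct rolling_table {
    int v[4];
};

// Builds {0, 1, 3, 6}: each element is the previous one plus its index,
// matching the initializer in the report without any self-reference.
constexpr rolling_table make_rolling_sum() {
    rolling_table t{};
    for (int i = 1; i < 4; ++i)
        t.v[i] = t.v[i - 1] + i;
    return t;
}

struct S2 {
    static constexpr rolling_table rolling_sum = make_rolling_sum();
};

static_assert(S2::rolling_sum.v[3] == 6, "rolling sums are 0, 1, 3, 6");
```

The cost of this approach is that the member is no longer a plain `int[4]`, so code that relies on the array type would need adjusting.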

[Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs

2020-09-28 Thread lucier at math dot purdue.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #30 from lucier at math dot purdue.edu ---
I'm coming back to this project.

I naively thought "Well, I don't need arc profiling, I'll just set
-ftest-coverage without -fprofile-arcs", but it appears that I can't do that:
the gcda files are generated by -fprofile-arcs.

It seems to me that test coverage could be implemented simply by instrumenting
each basic block in an algorithm that's linear in the number of basic blocks. 
Is it possible to do this?

Brad

[Bug analyzer/94433] enhancement: 12 * constify some parameters

2020-09-28 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94433

David Malcolm changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING

--- Comment #4 from David Malcolm  ---
Thanks again for your help with getting cppcheck running.  I found some issues
with cppcheck and committed these fixes:
  g:20d16d61dd22a9bfb66d5c4a383d193037e8f16d (unused field)
  g:c0ed6afef7897f32dc199da9a5430664fcbb61bb (missing "final override" on some
vfuncs)

However I'm confused by the "can be declared with const [constParameter]"
warnings in comment #0 - they look const to me.  What are these messages trying
to tell me, and how would I fix them?  (could they be false positives?)

Any ideas?

[Bug analyzer/97233] [11 Regression] ICE in deref_rvalue, at analyzer/region-model.cc:1465

2020-09-28 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97233

David Malcolm changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from David Malcolm  ---
Thanks for filing this bug.  It should be fixed by the above commit.

[Bug analyzer/97233] [11 Regression] ICE in deref_rvalue, at analyzer/region-model.cc:1465

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97233

--- Comment #1 from CVS Commits  ---
The master branch has been updated by David Malcolm:

https://gcc.gnu.org/g:01eabbeadb645959d5dcb0f00f41c3565a8f54f1

commit r11-3512-g01eabbeadb645959d5dcb0f00f41c3565a8f54f1
Author: David Malcolm 
Date:   Mon Sep 28 15:42:31 2020 -0400

analyzer: fix ICE on non-pointer longjmp [PR97233]

gcc/analyzer/ChangeLog:
PR analyzer/97233
* analyzer.cc (is_longjmp_call_p): Require the initial argument
to be a pointer.
* engine.cc (exploded_node::on_longjmp): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/97233
* gcc.dg/analyzer/pr97233.c: New test.

[Bug libbacktrace/97227] dsymutil runs on ELF execs during libbacktrace testing

2020-09-28 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97227

Ian Lance Taylor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ian at airs dot com
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Ian Lance Taylor  ---
Thanks, should be fixed.

[Bug libbacktrace/97082] new test 'mtest' fails for Mach-O/Darwin.

2020-09-28 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97082

Ian Lance Taylor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ian at airs dot com

--- Comment #2 from Ian Lance Taylor  ---
Does btest pass?  It's hard to see why mtest would fail if btest passes.

[Bug libbacktrace/97082] new test 'mtest' fails for Mach-O/Darwin.

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97082

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Ian Lance Taylor:

https://gcc.gnu.org/g:5f394e2d4c66678411c88b297f0db9a828214aef

commit r11-3507-g5f394e2d4c66678411c88b297f0db9a828214aef
Author: Ian Lance Taylor 
Date:   Mon Sep 28 13:54:57 2020 -0700

libbacktrace: build mtest.dSYM if using dsymutil

libbacktrace/ChangeLog:
PR libbacktrace/97082
* Makefile.am (check_DATA): Add mtest.dSYM if USE_DSYMUTIL.
* Makefile.in: Regenerate.

[Bug libbacktrace/97227] dsymutil runs on ELF execs during libbacktrace testing

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97227

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Ian Lance Taylor:

https://gcc.gnu.org/g:7c363a4e044115ea811888fe0d32b04f23d79ae2

commit r11-3506-g7c363a4e044115ea811888fe0d32b04f23d79ae2
Author: Ian Lance Taylor 
Date:   Mon Sep 28 13:47:25 2020 -0700

libbacktrace: only run dsymutil with Mach-O

libbacktrace/ChangeLog:
PR libbacktrace/97227
* configure.ac (USE_DSYMUTIL): Define instead of HAVE_DSYMUTIL.
* Makefile.am: Change all uses of HAVE_DSYMUTIL to USE_DSYMUTIL.
* configure: Regenerate.
* Makefile.in: Regenerate.

[Bug analyzer/94433] enhancement: 12 * constify some parameters

2020-09-28 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94433

David Malcolm changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-09-28
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1

--- Comment #3 from David Malcolm  ---
Thanks - with that I can reproduce the warnings from comment #0.

[Bug analyzer/94433] enhancement: 12 * constify some parameters

2020-09-28 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94433

--- Comment #2 from David Binderman  ---

I use something like

cppcheck --enable=all --language=c++ trunk.git/gcc/analyzer/*.{h,cc}

This seems to work for me, although my copy of cppcheck is a hand-tweaked
version of their development code.

[Bug analyzer/94433] enhancement: 12 * constify some parameters

2020-09-28 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94433

--- Comment #1 from David Malcolm  ---
Thanks for filing this.  I've been attempting to reproduce this, but I'm not
getting any warnings out of cppcheck.

That said, looking at
  git show a96f1c38a787fbc847cb014d4b094e2787d539a7:gcc/analyzer/diagnostic-manager.cc
(to get the version the bug was filed against), the first few warnings in
comment #0 seem to be on class dedupe_hash_map_traits:

  static inline hashval_t hash (const key_type &v)
  {
    return v->hash ();
  }
  static inline bool equal_keys (const key_type &k1, const key_type &k2)
  {
    return *k1 == *k2;
  }

where key_type is:
  typedef const dedupe_key *key_type;

and I don't think the above code has changed since then.
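One way to check that such parameters are already const is to ask the compiler. The sketch below assumes the parameters are declared as `const key_type &` and uses an empty stand-in `dedupe_key` (hypothetical, not the real analyzer class); the names `pointer_is_const` and `pointee_is_const` are also made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct dedupe_key {};                // hypothetical stand-in for the analyzer's class
typedef const dedupe_key *key_type;  // same shape as the quoted typedef

// "const key_type &" makes the pointer itself const; the pointee is
// already const through the typedef.
constexpr bool pointer_is_const =
    std::is_same<const key_type &, const dedupe_key *const &>::value;
constexpr bool pointee_is_const =
    std::is_const<std::remove_pointer<key_type>::type>::value;

static_assert(pointer_is_const && pointee_is_const,
              "both levels are const, so there is nothing left to constify");
```

If both assertions hold, a "can be declared with const" warning on these parameters would look like a false positive, though that conclusion depends on what cppcheck actually analyzes.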

Is there a good way to invoke cppcheck on GCC?  I'm naively trying "cppcheck
gcc/analyzer" and passing in -I options, and am not getting any warnings.

[Bug c++/97217] C++ program compiled with GCC crashes

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97217

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |UNCONFIRMED
     Ever confirmed|1                           |0

--- Comment #5 from Jonathan Wakely  ---
Thanks!

[Bug c++/96229] Invalid specialization accepted when also constrained in base template template parameter

2020-09-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96229

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org

[Bug c++/97230] Invocation of non-static member function on a null instance in core constant expression should not be allowed

2020-09-28 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97230

Marek Polacek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org
                 CC|                            |mpolacek at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

[Bug analyzer/97233] New: [11 Regression] ICE in deref_rvalue, at analyzer/region-model.cc:1465

2020-09-28 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97233

            Bug ID: 97233
           Summary: [11 Regression] ICE in deref_rvalue, at
                    analyzer/region-model.cc:1465
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: analyzer
          Assignee: dmalcolm at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-11.0.0-alpha20200927 snapshot (g:e24817aa7a1c6d12039b486ab5ea9b5ee0a46cd4)
ICEs when compiling the following testcase w/ -fanalyzer:

void
longjmp (__SIZE_TYPE__, int);

void
e7 (__SIZE_TYPE__ gr)
{
  longjmp (gr, 1);
}

% gcc-11.0.0 -fanalyzer -c tdkztnou.c
during IPA pass: analyzer
tdkztnou.c: In function 'e7':
tdkztnou.c:7:3: internal compiler error: in deref_rvalue, at analyzer/region-model.cc:1465
    7 |   longjmp (gr, 1);
      |   ^~~
0x7345b8 ana::region_model::deref_rvalue(ana::svalue const*, tree_node*, ana::region_model_context*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/region-model.cc:1465
0x1128c06 ana::exploded_node::on_longjmp(ana::exploded_graph&, gcall const*, ana::program_state*, ana::region_model_context*) const
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:1283
0x1129383 ana::exploded_node::on_stmt(ana::exploded_graph&, ana::supernode const*, gimple const*, ana::program_state*) const
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:1133
0x112ae65 ana::exploded_graph::process_node(ana::exploded_node*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:2811
0x112b94a ana::exploded_graph::process_worklist()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:2446
0x112da6d ana::impl_run_checkers(ana::logger*)
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:4498
0x112e8bc ana::run_checkers()
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/engine.cc:4569
0x1121e38 execute
        /var/tmp/portage/sys-devel/gcc-11.0.0_alpha20200927/work/gcc-11-20200927/gcc/analyzer/analyzer-pass.cc:84

[Bug libstdc++/78830] std::prev accepts ForwardIterator-s

2020-09-28 Thread akrzemi1 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78830

--- Comment #16 from Andrzej Krzemienski  ---
Oh, I see. The above requirement applies only to the Algorithms library clause,
not to the Iterators library. Sorry.

[Bug c++/97217] C++ program compiled with GCC crashes

2020-09-28 Thread carsten.schmidt-achim at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97217

--- Comment #4 from Carsten Schmidt  ---
Sorry for the inconvenience!

o  the exact version of GCC

o  the system type

o  the options given when GCC was configured/built

Please refer to the file output.txt.

o  the complete command line that triggers the bug

The program needs to be invoked either

unittest_array.exe [direction]

OR

unittest_array.exe [inverse]

Both test cases crash the program; other test cases do not seem to be affected.

o  the compiler output (error messages, warnings, etc.)

Again, please refer to the file output.txt.

o  the preprocessed file (*.i*) that triggers the bug

Please see the attached archive ii_files.zip.

[Bug c++/97217] C++ program compiled with GCC crashes

2020-09-28 Thread carsten.schmidt-achim at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97217

--- Comment #3 from Carsten Schmidt  ---
Created attachment 49282
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49282&action=edit
Complete output of the compiler run; including version information.

[Bug c++/97217] C++ program compiled with GCC crashes

2020-09-28 Thread carsten.schmidt-achim at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97217

--- Comment #2 from Carsten Schmidt  ---
Created attachment 49281
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49281&action=edit
The *.ii files generated.

[Bug libstdc++/78830] std::prev accepts ForwardIterator-s

2020-09-28 Thread akrzemi1 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78830

--- Comment #15 from Andrzej Krzemienski  ---
How come? 
[algorithms.requirements], paragraph 4, bullet 5
(http://eel.is/c++draft/algorithms#requirements-4.5) says:

If an algorithm's template parameter is named BidirectionalIterator,
BidirectionalIterator1, or BidirectionalIterator2, the template argument shall
meet the Cpp17BidirectionalIterator requirements

[Bug libstdc++/78830] std::prev accepts ForwardIterator-s

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78830

--- Comment #14 from Jonathan Wakely  ---
There is an LWG issue requesting clarification in the standard:
https://wg21.link/lwg3197

Option B is consistent with the interpretation of libstdc++ (and recent
versions of libc++). If Option A or C is accepted, I will change libstdc++
accordingly. Until then, I maintain that we conform to the standard as
currently written.

[Bug middle-end/96390] [OpenMP] Link errors on the offload side for C++ code with templates

2020-09-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96390

Tobias Burnus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Tobias Burnus  ---
FIXED on GCC 11 (= mainline).

[Bug fortran/97123] double free detected in tcache 2 with recursive allocatable type

2020-09-28 Thread dlhough at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97123

Luke Hough changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dlhough at gmail dot com

--- Comment #2 from Luke Hough  ---
This is likely related to bug #96910.

The intrinsic assignment operator does not duplicate the allocated
subcomponents during assignment for recursive types. The `top_level =
init_recursive()` creates a temporary storage for the RHS before assigning to
the LHS. After assignment, the temporary RHS is deallocated. Since the
subcomponent is not properly duplicated during the assignment, this leads to a
double free.

[Bug libstdc++/97232] Iterator category of "std::prev" are not checked

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97232

--- Comment #2 from Jonathan Wakely  ---
If you compile with -D_GLIBCXX_ASSERTIONS then there is a runtime check:

/home/jwakely/gcc/11/include/c++/11.0.0/bits/stl_iterator_base_funcs.h:151:
constexpr void std::__advance(_InputIterator&, _Distance,
std::input_iterator_tag) [with _InputIterator = std::_Fwd_list_iterator;
_Distance = long int]: Assertion '__n >= 0' failed.
Aborted (core dumped)
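The assertion above fires at run time. For code that wants the mistake rejected at compile time, a guard can be sketched with std::iterator_traits. `is_bidirectional` and `checked_prev` are hypothetical names and are not part of libstdc++; this is an illustration, not a proposed library change.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>

// True when the iterator's category is bidirectional or stronger.
template <typename It>
struct is_bidirectional
    : std::is_base_of<std::bidirectional_iterator_tag,
                      typename std::iterator_traits<It>::iterator_category> {};

// Hypothetical wrapper around std::prev that refuses weaker iterators
// at compile time instead of asserting on a negative distance at run time.
template <typename It>
It checked_prev(It it,
                typename std::iterator_traits<It>::difference_type n = 1) {
    static_assert(is_bidirectional<It>::value,
                  "checked_prev requires a bidirectional iterator");
    return std::prev(it, n);
}
```

With this wrapper, `checked_prev(lst.end())` compiles for a `std::list`, while the same call on a `std::forward_list` iterator fails the static_assert.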

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #67 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:c0817ceebffa0be66b39c874a5da408404330b42

commit r8-10546-gc0817ceebffa0be66b39c874a5da408404330b42
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:29:17 2020 +0100

AArch64: Implement vstrq_p128 intrinsic

This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.

(cherry picked from commit d23ea1e865301cd45f14ccbdb0bca49251fde9e1)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #71 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:3489c06cccb60c1af4c66aff82b670fb39f36266

commit r8-10550-g3489c06cccb60c1af4c66aff82b670fb39f36266
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 17:37:58 2020 +0100

AArch64: Implement missing p128<->f64 reinterpret intrinsics

This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.

Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vreinterpretq_f64_p128,
vreinterpretq_p128_f64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(clean_results): Add float64x2_t cleanup.
(DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add
testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64.

(cherry picked from commit 65c9878641cbe0ed898aa7047b7b994e9d4a5bb1)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #70 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:852423cd68b403d09a14f6436080243c609a57a8

commit r8-10549-g852423cd68b403d09a14f6436080243c609a57a8
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 12:02:29 2020 +0100

AArch64: Implement missing vrndns_f32 intrinsic

This patch implements the missing vrndns_f32 intrinsic. This operates on a
scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.

This patch does that.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use
BUILTIN_VHSDF_HSDF for modes.  Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.

(cherry picked from commit 02b5377b3766804059b7824330d33d0e1cef2e5b)
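For readers without an AArch64 target at hand, the rounding that the FRINTN instruction performs (to nearest, ties to even) can be approximated in portable C++. This is a scalar sketch, not the arm_neon.h definition: `round_ties_even` is a hypothetical name, and the code assumes the floating-point environment is in its default FE_TONEAREST mode.

```cpp
#include <cassert>
#include <cmath>

// Rounds to the nearest integral value, with halfway cases going to the
// even neighbour. std::nearbyint follows the current rounding mode, and
// the default mode (FE_TONEAREST) rounds ties to even.
inline float round_ties_even(float x) {
    return std::nearbyint(x);
}
```

For example, `round_ties_even(2.5f)` gives 2.0f and `round_ties_even(3.5f)` gives 4.0f, unlike `std::round`, which rounds halfway cases away from zero.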

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #68 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:a45e419416c641b7be5d4f4eb877fa390349c004

commit r8-10547-ga45e419416c641b7be5d4f4eb877fa390349c004
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:32:42 2020 +0100

AArch64: Implement vldrq_p128 intrinsic

This patch implements the missing vldrq_p128 intrinsic that just loads from
the appropriate pointer.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vldrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vldrq_p128_1.c: New test.

(cherry picked from commit f2868e4bcff2c7b882d01231f039459c00e59d7b)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #66 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:fd250940d0e3dd17302eb5e2653255c9189bfd70

commit r8-10545-gfd250940d0e3dd17302eb5e2653255c9189bfd70
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:03:49 2020 +0100

AArch64: Implement missing vcls intrinsics on unsigned types

This patch implements some missing intrinsics that perform a CLS on
unsigned SIMD types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.

(cherry picked from commit 30957092db46d8798e632feefb5df634488dbb33)
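As a reminder of what CLS computes, here is a scalar sketch for a single 8-bit lane. It assumes the unsigned variants simply apply the signed operation to the same bit pattern, which is a reading of the commit message rather than something it states; `cls8` is a hypothetical name and the code needs C++14 for the loop in a constexpr function.

```cpp
#include <cassert>
#include <cstdint>

// Count leading sign bits: the number of consecutive bits directly below
// the most significant bit that have the same value as that bit.
constexpr int cls8(std::uint8_t x) {
    const int sign = (x >> 7) & 1;
    int count = 0;
    for (int bit = 6; bit >= 0 && ((x >> bit) & 1) == sign; --bit)
        ++count;
    return count;
}

static_assert(cls8(0x00) == 7, "all seven lower bits match a zero top bit");
static_assert(cls8(0xC0) == 1, "only bit 6 matches the set top bit");
```

The result for an 8-bit lane is therefore always in the range 0 to 7.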

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #72 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:768c95cc6c84d504cf95fe948d808376628d2fa8

commit r8-10551-g768c95cc6c84d504cf95fe948d808376628d2fa8
Author: Christophe Lyon 
Date:   Fri Sep 25 10:40:18 2020 +

testsuite: [aarch64] Fix aarch64/advsimd-intrinsics/v{trn,uzp,zip}_half.c

Since r11-3402 (g:65c9878641cbe0ed898aa7047b7b994e9d4a5bb1), the
vtrn_half, vuzp_half and vzip_half started failing with

vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no
linkage
vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no
linkage
vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no
linkage

This is because r11-3402 now always declares float64x2 variables for
aarch64, leading to a duplicate declaration in these testcases.

The fix is simply to remove these now useless declarations.

These tests are skipped on arm*, so there is no impact on that target.

2020-09-25  Christophe Lyon  

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Remove
declarations of vector, vector2, vector_res for float64x2 type.
* gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise.

(cherry picked from commit 8c775bf447e190024fa08c55e38db94dd013a393)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #69 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:99a8808add97c61b64a4cb979e4616731b86e58b

commit r8-10548-g99a8808add97c61b64a4cb979e4616731b86e58b
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 11:07:50 2020 +0100

AArch64: Implement missing _p64 intrinsics for vector permutes

This patch implements some missing vector permute intrinsics operating on
poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64,
vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/trn_zip_p64_1.c: New test.

(cherry picked from commit e8e818399d70c5a5a3d30a54d305c6e2b92e2c66)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #64 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:3c21a2f28014cd3bbfaee975a466dc3488052060

commit r8-10543-g3c21a2f28014cd3bbfaee975a466dc3488052060
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 11:58:36 2020 +0100

AArch64: Implement poly-type vadd intrinsics

This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the
relevant unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.

(cherry picked from commit fa9ad35dae03dcb20c4ccb50ba1b351a8ab77970)
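
The statement that polynomial addition aliases veor can be checked on a
scalar model: polynomial types hold coefficients over GF(2), so adding two
polynomials adds coefficients modulo 2, with no carry between bit positions.
This is a minimal sketch; poly8_model and poly_add are hypothetical names, not
the arm_neon.h intrinsics.

```cpp
#include <cstdint>

// Hypothetical scalar model of one poly8_t lane:
// bit i holds the coefficient of x^i over GF(2).
using poly8_model = std::uint8_t;

// Coefficient-wise addition modulo 2 is exactly bitwise XOR.
constexpr poly8_model poly_add(poly8_model a, poly8_model b)
{
    return static_cast<poly8_model>(a ^ b);
}

static_assert(poly_add(0x03, 0x01) == 0x02, "(x + 1) + 1 == x");
static_assert(poly_add(0x55, 0x55) == 0x00, "a + a == 0 over GF(2)");
```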

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #65 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:e9ed4afbb6778dedfb1efa0ba92429a51d4d049b

commit r8-10544-ge9ed4afbb6778dedfb1efa0ba92429a51d4d049b
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:00:38 2020 +0100

AArch64: Implement missing vceq*_p* intrinsics

This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the
unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64):
Define.

gcc/testsuite/

PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.

(cherry picked from commit d4703be185b422f637deebd3bb9222a41c8023d6)
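
For readers unfamiliar with CMEQ, a compare-equal produces a mask per lane:
all bits set when the operands are equal, all bits clear otherwise. The
following is a scalar sketch of one 64-bit lane; ceq_lane and ceqz_lane are
hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical model of one 64-bit lane of a compare-equal:
// all ones when the operands are equal, zero otherwise.
constexpr std::uint64_t ceq_lane(std::uint64_t a, std::uint64_t b)
{
    return a == b ? ~std::uint64_t{0} : std::uint64_t{0};
}

// Compare-against-zero form, mirroring the vceqz* naming.
constexpr std::uint64_t ceqz_lane(std::uint64_t a)
{
    return ceq_lane(a, 0);
}
```

The comparison only looks at the bit pattern, which is why the poly variants
can reuse the behaviour of the unsigned ones.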

[Bug libstdc++/97232] Iterator category of "std::prev" are not checked

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97232

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Jonathan Wakely  ---
Please see Bug 78830, specifically Bug 78830 comment 8.

*** This bug has been marked as a duplicate of bug 78830 ***

[Bug libstdc++/78830] std::prev accepts ForwardIterator-s

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78830

Jonathan Wakely  changed:

   What|Removed |Added

 CC||lesley at lesleylai dot info

--- Comment #13 from Jonathan Wakely  ---
*** Bug 97232 has been marked as a duplicate of this bug. ***

[Bug middle-end/96390] [OpenMP] Link errors on the offload side for C++ code with templates

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96390

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:2a10a2c0689db280ee3a94164504b7196b8370f4

commit r11-3505-g2a10a2c0689db280ee3a94164504b7196b8370f4
Author: Tobias Burnus 
Date:   Mon Sep 28 18:08:05 2020 +0200

OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390)

gcc/ChangeLog:

PR middle-end/96390
* omp-offload.c (omp_discover_declare_target_tgt_fn_r): Handle
alias nodes.

libgomp/ChangeLog:

PR middle-end/96390
* testsuite/libgomp.c++/pr96390.C: New test.
* testsuite/libgomp.c-c++-common/pr96390.c: New test.

[Bug target/97231] Missing FSF copyright notes for some x86 intrinsic headers

2020-09-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97231

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2020-09-28
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from H.J. Lu  ---
LGTM.  Please submit it to the GCC patches mailing list.

[Bug libstdc++/97232] New: Iterator category of "std::prev" are not checked

2020-09-28 Thread lesley at lesleylai dot info via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97232

Bug ID: 97232
   Summary: Iterator category of "std::prev" are not checked
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lesley at lesleylai dot info
  Target Milestone: ---

As Matt Godbolt tweeted
https://twitter.com/mattgodbolt/status/1310590905217409024?s=20, `std::prev`
silently accepts a forward iterator and crashes at runtime. Here is his sample
code: https://godbolt.org/z/9cKKsT

I found the implementation in `bits/stl_iterator_base_funcs.h` looks like this:

  template<typename _BidirectionalIterator>
    inline _GLIBCXX17_CONSTEXPR _BidirectionalIterator
    prev(_BidirectionalIterator __x, typename
         iterator_traits<_BidirectionalIterator>::difference_type __n = 1)
    {
      // concept requirements
      __glibcxx_function_requires(_BidirectionalIteratorConcept<
                                  _BidirectionalIterator>)
      std::advance(__x, -__n);
      return __x;
    }

The `__glibcxx_function_requires` macro is a no-op by default, and the whole
concept checking mechanism seems to be outdated. Instead, checking
`_BidirectionalIterator` with a `static_assert` should solve this problem:

  template<typename _BidirectionalIterator>
    inline _GLIBCXX17_CONSTEXPR _BidirectionalIterator
    prev(_BidirectionalIterator __x, typename
         iterator_traits<_BidirectionalIterator>::difference_type __n = 1)
    {
      // Accept any category derived from bidirectional_iterator_tag.
      static_assert(is_base_of<bidirectional_iterator_tag,
                    typename iterator_traits<_BidirectionalIterator>::
                      iterator_category>::value,
                    "_BidirectionalIterator must be a bidirectional iterator");

      std::advance(__x, -__n);
      return __x;
    }
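
A standalone version of this idea, outside libstdc++, might look like the
sketch below. It is only an illustration: checked_prev is a hypothetical
wrapper, and it inspects iterator_category (not value_type) and uses
is_base_of so that random-access iterators are still accepted.

```cpp
#include <iterator>
#include <type_traits>

// Hypothetical wrapper, not part of libstdc++: rejects iterators weaker
// than bidirectional at compile time instead of misbehaving at run time.
template<typename BidirIt>
BidirIt checked_prev(BidirIt it,
                     typename std::iterator_traits<BidirIt>::difference_type n = 1)
{
    using category = typename std::iterator_traits<BidirIt>::iterator_category;
    static_assert(std::is_base_of<std::bidirectional_iterator_tag, category>::value,
                  "checked_prev requires at least a bidirectional iterator");
    std::advance(it, -n);
    return it;
}
```

With this wrapper, passing a std::forward_list iterator fails to compile,
while list iterators and raw pointers keep working.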

[Bug tree-optimization/96344] 3rdd case of gnat.dg/opt86a.adb fails because of VRP

2020-09-28 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96344

Eric Botcazou  changed:

   What|Removed |Added

Summary|3rdd case of|3rdd case of
   |gnat.dg/opt86a.adb fails|gnat.dg/opt86a.adb fails
   ||because of VRP
  Component|ada |tree-optimization

--- Comment #3 from Eric Botcazou  ---
The difference is that, on x86-64, the third case is preoptimized (.dse2):

   [local count: 1072024872]:
  system.secondary_stack.ss_release ();
  _6 = s3_35 == 1;
  _7 = s3_35 == 4;
  _8 = _6 | _7;
  _57 = s3_35 == 8;
  _56 = _8 | _57;
  if (_56 != 0)
goto ; [0.08%]
  else
goto ; [99.92%]

whereas it is not on PowerPC64:

   [local count: 1072024872]:
  system.secondary_stack.ss_release ();
  if (_31 == 1)
goto ; [0.04%]
  else
goto ; [99.96%]

   [local count: 1071596063]:
  if (s3_32 == 4)
goto ; [0.04%]
  else
goto ; [99.96%]

   [local count: 1071167425]:
  if (s3_32 == 8)
goto ; [0.04%]
  else
goto ; [99.96%]

Note the _31 in the first arm instead of s3_32: VRP1 has propagated

s3_32 = (opt86_pkg__enum) _31;

into the first arm but not the others...  So -fno-tree-vrp is a workaround.
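
At source level the two dumps compute the same predicate in different shapes.
The sketch below models that difference only; both function names are
hypothetical, and it ignores the detail that the first PowerPC64 arm tests _31
rather than s3_32.

```cpp
// Branching shape, as in the PowerPC64 dump: one conditional jump per test.
inline bool matches_branchy(int s3)
{
    if (s3 == 1) return true;
    if (s3 == 4) return true;
    if (s3 == 8) return true;
    return false;
}

// Combined shape, as in the x86-64 dump: the three comparison results are
// OR-ed together and tested once.
inline bool matches_combined(int s3)
{
    const bool is1 = (s3 == 1);
    const bool is4 = (s3 == 4);
    const bool is8 = (s3 == 8);
    return (is1 | is4) | is8;
}
```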

[Bug c++/96229] Invalid specialization accepted when also constrained in base template template parameter

2020-09-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96229

Patrick Palka  changed:

   What|Removed |Added

   Last reconfirmed||2020-09-28
 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Patrick Palka  ---
Confirmed.

[Bug target/97231] Missing FSF copyright notes for some x86 intrinsic headers

2020-09-28 Thread wwwhhhyyy333 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97231

--- Comment #1 from Hongyu Wang  ---
Created attachment 49280
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49280&action=edit
A patch

[Bug c++/96840] [11 Regression] Recursive substitution in constrained commutative operator

2020-09-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96840

Patrick Palka  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
Started with r11-2774:

c++: Check satisfaction before non-dep convs. [CWG2369]

It's very hard to use concepts to protect a template from hard errors due to
unwanted instantiation if constraints aren't checked until after doing all
substitution and checking of non-dependent conversions.

We fall into a loop when checking the constraints of the second overload
(instantiated with Rep=int and T=Int).

It looks like we're correct to reject the testcase as of this DR?

[Bug target/97231] New: Missing FSF copyright notes for some x86 intrinsic headers

2020-09-28 Thread wwwhhhyyy333 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97231

Bug ID: 97231
   Summary: Missing FSF copyright notes for some x86 intrinsic
headers
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wwwhhhyyy333 at gmail dot com
  Target Milestone: ---

Many x86 intrinsic header files don't have an FSF copyright notice:

amxbf16intrin.h
amxint8intrin.h
amxtileintrin.h
avx512vp2intersectintrin.h
avx512vp2intersectvlintrin.h
pconfigintrin.h
tsxldtrkintrin.h
wbnoinvdintrin.h

[Bug c++/97230] Invocation of non-static member function on a null instance in core constant expression should not be allowed

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97230

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-09-28
 Ever confirmed|0   |1
   Keywords|compile-time-hog|accepts-invalid

--- Comment #2 from Jonathan Wakely  ---
GCC trunk does warn about exactly the issue:

ce.cc: In function 'int main()':
ce.cc:10:43: warning: 'this' pointer null [-Wnonnull]
   10 | constexpr auto x = ((X*)nullptr)->foo();
  |   ^
ce.cc:4:19: note: in a call to non-static member function 'constexpr int
X::foo() const'
4 | constexpr int foo(){
  |   ^~~


But that should be an error during constant evaluation.
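
For contrast, here is a minimal sketch of the well-formed counterpart, where
the member function is invoked on an actual object during constant
evaluation. The struct mirrors the X from the report; object and value are
hypothetical names.

```cpp
// Hypothetical type mirroring the X in the report.
struct X {
    constexpr int foo() const { return 1; }
};

// Well-formed: foo() is called on a real constexpr object, not a null pointer.
constexpr X object{};
constexpr int value = object.foo();
static_assert(value == 1, "evaluated during constant evaluation");
```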

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #63 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:3fa772a7acfea62a01fb36d1451c8be9c54ba7da

commit r9-8954-g3fa772a7acfea62a01fb36d1451c8be9c54ba7da
Author: Christophe Lyon 
Date:   Fri Sep 25 10:40:18 2020 +0000

testsuite: [aarch64] Fix aarch64/advsimd-intrinsics/v{trn,uzp,zip}_half.c

Since r11-3402 (g:65c9878641cbe0ed898aa7047b7b994e9d4a5bb1), the
vtrn_half, vuzp_half and vzip_half started failing with

vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no
linkage
vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no
linkage
vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no
linkage

This is because r11-3402 now always declares float64x2 variables for
aarch64, leading to a duplicate declaration in these testcases.

The fix is simply to remove these now useless declarations.

These tests are skipped on arm*, so there is no impact on that target.

2020-09-25  Christophe Lyon  

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Remove
declarations of vector, vector2, vector_res for float64x2 type.
* gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise.

(cherry picked from commit 8c775bf447e190024fa08c55e38db94dd013a393)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #62 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:803f597d3125bfd67d29a11c118e131353ee314e

commit r9-8953-g803f597d3125bfd67d29a11c118e131353ee314e
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 17:37:58 2020 +0100

AArch64: Implement missing p128<->f64 reinterpret intrinsics

This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.

Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vreinterpretq_f64_p128,
vreinterpretq_p128_f64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(clean_results): Add float64x2_t cleanup.
(DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add
testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64.

(cherry picked from commit 65c9878641cbe0ed898aa7047b7b994e9d4a5bb1)
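
A reinterpret in this sense keeps the 128 bits unchanged and only changes the
type used to view them. The portable sketch below models that with memcpy.
Bits128, Float64x2 and both functions are hypothetical stand-ins, and the
usage note assumes IEEE 754 binary64 doubles.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins: 128 raw bits and a pair of doubles, 16 bytes each.
struct Bits128 { std::uint64_t lo; std::uint64_t hi; };
struct Float64x2 { double lane[2]; };

static_assert(sizeof(Bits128) == sizeof(Float64x2),
              "both views must cover the same 16 bytes");

// Copies the bytes unchanged; no numeric conversion takes place.
inline Float64x2 reinterpret_as_f64x2(const Bits128& bits)
{
    Float64x2 out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// The inverse view, again a plain byte copy.
inline Bits128 reinterpret_as_bits128(const Float64x2& v)
{
    Bits128 out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}
```

Round-tripping through both functions returns the original bits, which is the
property a test of such reinterprets would typically check.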

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #58 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:9f7c4bb47c97aa6cd68bd48f6f2129e19f01c892

commit r9-8949-g9f7c4bb47c97aa6cd68bd48f6f2129e19f01c892
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:29:17 2020 +0100

AArch64: Implement vstrq_p128 intrinsic

This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.

(cherry picked from commit d23ea1e865301cd45f14ccbdb0bca49251fde9e1)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #61 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:61291c4b7d429ddd12536732759bd56708e78e14

commit r9-8952-g61291c4b7d429ddd12536732759bd56708e78e14
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 12:02:29 2020 +0100

AArch64: Implement missing vrndns_f32 intrinsic

This patch implements the missing vrndns_f32 intrinsic. This operates on a
scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.

This patch does that.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use
BUILTIN_VHSDF_HSDF
for modes.  Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.

(cherry picked from commit 02b5377b3766804059b7824330d33d0e1cef2e5b)
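
FRINTN rounds to the nearest integral value with ties going to the even
neighbour. A portable approximation of that behaviour, assuming the
floating-point environment is in its default round-to-nearest mode, is
sketched below. round_ties_even_model is a hypothetical name and not the
intrinsic itself.

```cpp
#include <cmath>

// Hypothetical scalar model of round-to-nearest, ties-to-even.
// Assumes the default FE_TONEAREST rounding mode is in effect.
inline float round_ties_even_model(float x)
{
    return std::nearbyint(x);
}
```

Note the difference from std::round, which rounds halfway cases away from
zero: 2.5f becomes 2.0f here but 3.0f with std::round.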

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #57 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:6f189fa29bc90c658ce1df33774a04d4956dcc27

commit r9-8948-g6f189fa29bc90c658ce1df33774a04d4956dcc27
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:03:49 2020 +0100

AArch64: Implement missing vcls intrinsics on unsigned types

This patch implements some missing intrinsics that perform a CLS on
unsigned SIMD types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.

(cherry picked from commit 30957092db46d8798e632feefb5df634488dbb33)
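
CLS counts how many consecutive bits below the top bit are equal to the top
bit. The sketch below models that for a single 8-bit lane given as a raw bit
pattern. cls8_model is a hypothetical helper, not the vcls_u8 intrinsic.

```cpp
#include <cstdint>

// Hypothetical scalar model of "count leading sign bits" on one 8-bit lane.
// The top bit is treated as the sign bit of the bit pattern; the result is
// the number of following bits that match it (0 to 7).
inline int cls8_model(std::uint8_t v)
{
    const unsigned sign = (v >> 7) & 1u;
    int count = 0;
    for (int bit = 6; bit >= 0; --bit) {
        if (((v >> bit) & 1u) != sign)
            break;
        ++count;
    }
    return count;
}
```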

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #59 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:0d27e8eb8dc8ed28fdf4d6876d7f6f0610273198

commit r9-8950-g0d27e8eb8dc8ed28fdf4d6876d7f6f0610273198
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:32:42 2020 +0100

AArch64: Implement vldrq_p128 intrinsic

This patch implements the missing vldrq_p128 intrinsic that just loads from
the appropriate pointer.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vldrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vldrq_p128_1.c: New test.

(cherry picked from commit f2868e4bcff2c7b882d01231f039459c00e59d7b)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #60 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:23b4d65ef54b9ad8eb5cca65b7412d46f35d913f

commit r9-8951-g23b4d65ef54b9ad8eb5cca65b7412d46f35d913f
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 11:07:50 2020 +0100

AArch64: Implement missing _p64 intrinsics for vector permutes

This patch implements some missing vector permute intrinsics operating on
poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64,
vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/trn_zip_p64_1.c: New test.

(cherry picked from commit e8e818399d70c5a5a3d30a54d305c6e2b92e2c66)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #55 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:48e274be62b924379541ae0321b82862f572b973

commit r9-8946-g48e274be62b924379541ae0321b82862f572b973
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 11:58:36 2020 +0100

AArch64: Implement poly-type vadd intrinsics

This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the
relevant unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.

(cherry picked from commit fa9ad35dae03dcb20c4ccb50ba1b351a8ab77970)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #56 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:11874a0d4033908e596181a17dab5444271f892b

commit r9-8947-g11874a0d4033908e596181a17dab5444271f892b
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:00:38 2020 +0100

AArch64: Implement missing vceq*_p* intrinsics

This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the
unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64):
Define.

gcc/testsuite/

PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.

(cherry picked from commit d4703be185b422f637deebd3bb9222a41c8023d6)

[Bug target/96470] [10/11 regression] gnat.dg/opt39.adb is not scalarized

2020-09-28 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96470

Eric Botcazou  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2020-09-28
  Component|testsuite   |target
 Status|UNCONFIRMED |NEW
Summary|[10 regression] |[10/11 regression]
   |gnat.dg/opt39.adb fails |gnat.dg/opt39.adb is not
   |since r10-917   |scalarized

--- Comment #2 from Eric Botcazou  ---
The difference is:

Rejected (3173): not aggregate: a
Rejected (3174): not aggregate: i
Rejected (3175): not aggregate: aL
Candidate (3180): tmp
Too big to totally scalarize: tmp (UID: 3180)

because max_scalarization_size == 128 on PowerPC64 (it's 1088 on x86-64) in the
analyze_all_variable_accesses function.  So I can force the test to pass on
PowerPC64 by means of --param=sra-max-scalarization-size-Ospeed=32 but the
default max_scalarization_size seems to be unduly low on PowerPC64 (just 2
words).

[Bug c++/97230] Invocation of non-static member function on a null instance in core constant expression should not be allowed

2020-09-28 Thread eligorkadaf at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97230

--- Comment #1 from Eligor Kadaf  ---
Basically, it is described in the 7th paragraph of [expr.const] in the latest C++
standard draft: https://eel.is/c++draft/expr.const#5

[Bug c++/97230] New: Invocation of non-static member function on a null instance in core constant expression should not be allowed

2020-09-28 Thread eligorkadaf at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97230

Bug ID: 97230
   Summary: Invocation of non-static member function on a null
instance in core constant expression should not be
allowed
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eligorkadaf at gmail dot com
  Target Milestone: ---

The code below should not compile according to the C++ standard.

The code:
class X{

public:
    constexpr int foo(){
        return 1;
    }
};

int main(){
    constexpr auto x = ((X*)nullptr)->foo();
}

Compile and run logs:

$ g++-10 -v
Using built-in specs.
COLLECT_GCC=g++-10
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
10-20200411-0ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-10
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none,amdgcn-amdhsa,hsa --without-cuda-driver
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.1 20200411 (experimental) [master revision
bb87d5cc77d:75961caccb7:f883c46b4877f637e0fa5025b4d6b5c9040ec566] (Ubuntu
10-20200411-0ubuntu1)

$ g++-10 -Wall -Wextra -std=c++20 bug.cpp
bug.cpp: In function ‘int main()’:
bug.cpp:10:20: warning: unused variable ‘x’ [-Wunused-variable]
   10 | constexpr auto x = ((X*)nullptr)->foo();
  |^

There are additional non-issue warnings when the code is compiled with GCC
trunk.

The godbolt link: https://godbolt.org/z/rx3axf

The problem is also reproducible with gcc 4.8.1, gcc 5.1 and gcc 8.1.

[Bug fortran/97224] [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc

2020-09-28 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224

--- Comment #6 from Steve Kargl  ---
On Mon, Sep 28, 2020 at 09:48:13AM +0000, tnfchris at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224
> 
> Bug ID: 97224
>Summary: [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to
> build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc
>Product: gcc
>Version: 11.0
> Status: UNCONFIRMED
>   Severity: normal
>   Priority: P3
>  Component: fortran
>   Assignee: unassigned at gcc dot gnu.org
>   Reporter: tnfchris at gcc dot gnu.org
> CC: kargl at gcc dot gnu.org, markeggleston at gcc dot gnu.org
>   Target Milestone: ---
> 
> The benchmark fails to build after
> r11-3487-ge5a76af3a2f3324efc60b4b2778ffb29d5c377bc with the following error:
> 
>   108 |   SUBROUTINE AIMPAC(ACO,AIC,EXPON,ICENT,ITYPE,OE,OCCNO,
>   |   1
> ..
>   920 |   COMMON /INTRFC/ FRIEND,AIMPAC,RPAC,PLTORB,MOLPLT
>   | 2  
> Error: Global entity 'aimpac' at (1) cannot appear in a COMMON block at (2)
> parley.fppized.f:1118:23:
> 

Without access to the source code, which was conveniently omitted,
it cannot be determined whether gfortran is correct.  The EBNF for COMMON
is

R873 common-stmt  is COMMON
[ / [ common-block-name ] / ] common-block-object-list
[ [ , ] / [ common-block-name ] /
common-block-object-list ] ...

R874 common-block-object  is variable-name [ ( array-spec ) ]


AIMPAC is the name of a subroutine.  It is not a variable-name.
Unfortunately, there are 812 lines of missing code, so it is
not possible to tell if AIMPAC is declared in local scope to
something else, which blocks the global entity name.

[Bug fortran/97070] Discrepancy in results between OpenMP/OpenACC

2020-09-28 Thread venetis at ceid dot upatras.gr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97070

--- Comment #3 from Ioannis E. Venetis  ---
This is weird. Just downloaded gcc from git and built version 11.0

$ /home/venetis/apps/gcc-20200928/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/venetis/apps/gcc-20200928/bin/gcc
COLLECT_LTO_WRAPPER=/home/venetis/apps/gcc-20200928/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-20200928/configure --enable-offload-targets=nvptx-none
--with-cuda-driver-include=/usr/local/cuda/include
--with-cuda-driver-lib=/usr/local/cuda/lib64 --disable-bootstrap
--disable-multilib --enable-languages=c,c++,fortran,lto
--prefix=/home/venetis/apps/gcc-20200928
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20200928 (experimental) (GCC)

I am still getting the wrong results with OpenACC. I can see three
possibilities.

1) I built gcc the wrong way. I have attached the script I am using to build
gcc. It is a slightly modified version of what I found here:
https://gist.github.com/matthiasdiener/e318e7ed8815872e9d29feb3b9c8413f

I manually created a tarball of the code downloaded from git so as to make
minimal changes to the script I had.

2) The wrong run-time libraries are used during execution of the example, since
gcc is installed in a non-default path. I have tried with and without setting:
LD_LIBRARY_PATH=/home/venetis/apps/gcc-20200928/lib:/home/venetis/apps/gcc-20200928/lib64

Unfortunately I get wrong results in both cases.

3) Wrong nvptx tools and libraries are used during compilation of the example,
as my system (Ubuntu 16.04.7 LTS) has also the corresponding packages for gcc
9.3.0 installed.

How can I make certain that my compilation and execution of the example are
using all tools and libraries from my custom build?

PS: As a side note, I tried OpenACC with nvfortran 20.7 from NVidia HPC SDK
20.7 and I get the correct results for the example.

[Bug fortran/97070] Discrepancy in results between OpenMP/OpenACC

2020-09-28 Thread venetis at ceid dot upatras.gr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97070

--- Comment #2 from Ioannis E. Venetis  ---
Created attachment 49279
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49279&action=edit
GCC building script

[Bug c++/97094] Compiling big std::unordered_map became slower

2020-09-28 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97094

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka  ---
With GCC 10/11, according to -ftime-report most of the time/memory usage is
from

 tree eh:   1.10 ( 40%)   0.18 ( 31%)   1.29 ( 38%)   642338 kB ( 84%)

[Bug sanitizer/97229] New: pointer-compare sanitizer is very slow due to __asan::IsAddressNearGlobal

2020-09-28 Thread mail at milianw dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97229

Bug ID: 97229
   Summary: pointer-compare sanitizer is very slow due to
__asan::IsAddressNearGlobal
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mail at milianw dot de
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

I am trying to use the pointer-compare sanitizer during product development. I
noticed that it is usually fine from a performance POV, but one specific code
path is getting extremely slow. Nearly 99% of the CPU samples point at this
backtrace:

```

__sanitizer_ptr_cmp
__asanCheckForInvalidPointerPair
__asanCheckForInvalidPointerPair
__asan::IsInvalidPointerPair
__asan::GetGlobalAddressInformation(unsigned long, unsigned long, ...)
__asan::GetGlobalsForAddress(unsigned long, __asan_global*, ...)
__asan::isAddressNearGlobal
```

I have tried to simulate what our code does in this simplistic example: It
copies one file to another in a stupid way via mmap. The pointer comparison is
within the copy() function below.

```
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#include <algorithm>

static
__attribute__((noinline))
void copy(const unsigned char *source, size_t source_size,
          unsigned char *target, size_t target_size)
{
    // This is the pointer comparison that pointer-compare instruments.
    if (target + source_size > target + target_size) {
        fprintf(stderr, "bad offsets: %zu %zu\n", target_size, source_size);
        return;
    }
    std::copy_n(source, source_size, target);
}

unsigned char* mapBuffer(const char *path, size_t size)
{
    auto fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd == -1) {
        perror("failed to open file");
        return nullptr;
    }

    if (posix_fallocate64(fd, 0, size) != 0) {
        perror("failed to resize file");
        close(fd);
        return nullptr;
    }

    auto buffer = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    // mmap reports failure with MAP_FAILED, not a null pointer.
    if (buffer == MAP_FAILED) {
        perror("failed to mmap file");
        return nullptr;
    }

    return reinterpret_cast<unsigned char*>(buffer);
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "USAGE: ./a.out BUFFER_SIZE COPY_SIZE\n");
        return 1;
    }

    const auto size_i = atoi(argv[1]);
    if (size_i < 0) {
        fprintf(stderr, "bad size: %d\n", size_i);
        return 1;
    }
    const auto size = static_cast<size_t>(size_i);

    const auto copySize_i = atoi(argv[2]);
    // Reject zero as well, so the modulo below cannot divide by zero.
    if (copySize_i <= 0 || copySize_i > size_i || (size_i % copySize_i) != 0) {
        fprintf(stderr, "bad copy size: %d %d\n", copySize_i, size_i);
        return 1;
    }
    const auto copySize = static_cast<size_t>(copySize_i);

    auto source = mapBuffer("/tmp/source.dat", size);
    if (!source) {
        return 1;
    }

    auto target = mapBuffer("/tmp/target.dat", size);
    if (!target) {
        return 1;
    }

    for (size_t i = 0; i < size; i += copySize) {
        copy(source + i, copySize, target + i, copySize);
    }

    munmap(source, size);
    munmap(target, size);
    return 0;
}
```

But that demo does not show the extreme slowdown. It actually behaves
quite well, with at most a 10% slowdown, when enabling pointer-compare with the
ASAN_OPTIONS env var.

In the real application, the slowdown is on the order of 100x or more.
That app links in a lot of other libraries and also runs code in multiple
threads, so I suspect that the issue I'm seeing is related to the amount of
globals and potentially libraries available in the application?  Any idea how I
could reproduce this to create a proper MWE?

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789

--- Comment #33 from Richard Biener  ---
(In reply to Kewen Lin from comment #32)
> (In reply to Richard Biener from comment #31)
> > (In reply to Kewen Lin from comment #29)
> > > (In reply to Hongtao.liu from comment #28)
> > > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the 
> > > > > statement
> > > > 
> > > > Yes, it's the place.
> > > > 
> > > > > is UB to UH conversion statement, further check if the def of the 
> > > > > input UB
> > > > > is MEM.
> > > > 
> > > > Only if there's no multi-use for UB. More generally, it's quite 
> > > > difficult to
> > > > guess later optimizations for the purpose of more accurate vectorization
> > > > cost model, :(.
> > > 
> > > Yeah, it's hard sadly. The generic cost modeling is rough,
> > > ix86_add_stmt_cost is more fine-grain (at least than what we have on Power
> > > :)), if you want to check it more, it seems doable in target specific hook
> > > finish_cost where you can get the whole vinfo object, but it could end up
> > > with very heavy analysis and might not be worthy.
> > > 
> > > Do you mind to check if it can also fix this degradation on x86 to run FRE
> > > and DSE just after cunroll? I found it worked for Power, hoped it can help
> > > there too.
> > 
> > Btw, we could try sth like adding a TODO_force_next_scalar_cleanup to be
> > returned from passes that see cleanup opportunities and have the pass
> > manager queue that up, looking for a special marked pass and enabling
> > that so we could have
> > 
> >   NEXT_PASS (pass_predcom);
> >   NEXT_PASS (pass_complete_unroll);
> >   NEXT_PASS (pass_scalar_cleanup);
> >   PUSH_INSERT_PASSES_WITHIN (pass_scalar_cleanup);
> > NEXT_PASS (pass_fre, false /* may_iterate */);
> > NEXT_PASS (pass_dse);
> >   POP_INSERT_PASSES ();
> > 
> > with pass_scalar_cleanup gate() returning false otherwise.  Eventually
> > pass properties would match this better, or sth else.
> > 
> 
> Thanks for the suggestion! Before cooking the patch, I have one question:
> it looks like updating a function property alone would be enough, e.g. some
> pass sets property PROP_ok_for_cleanup and later pass_scalar_cleanup only runs
> for functions with this property (checked in gate). I'm not quite sure of the
> reason for the TODO flag TODO_force_next_scalar_cleanup.

properties are not an easy fit since they are static in the pass
description while we want to trigger the cleanup only if we unrolled
an outermost loop for example.  Returning TODO_force_next_scalar_cleanup
from cunroll is the natural way to signal this.

The hackish way then would be to "queue" this TODO_force_next_scalar_cleanup
inside the pass manager itself until it runs into a
"scalar cleanup pass" (either somehow marked or just hard-matched), forcing
its gate() to evaluate to true (just skipping its evaluation).

As said, there's no "nice" way for the information flow at the moment.

One could expose the "pending TODO" (TODO not handled by the pass manager
itself) in a global variable (like current_pass) so the cleanup pass
gate() could check

  gate () { return pending_todo & TODO_force_next_scalar_cleanup; }

and in its execute clear this bit from pending_todo.
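
The gate/execute handshake described above could be modeled roughly as
follows. This is a stand-alone sketch, not GCC code: the flag's bit value, the
`pending_todo` global, and all four helper functions are hypothetical stand-ins
for whatever the pass manager would actually provide.

```cpp
#include <cstdint>

// Hypothetical bit value; a real TODO_* flag would be allocated inside GCC.
constexpr std::uint32_t TODO_force_next_scalar_cleanup = 1u << 0;

// Assumed global holding TODOs the pass manager did not handle itself.
static std::uint32_t pending_todo = 0;

// Stand-in for cunroll's execute: request the cleanup only when an
// outermost loop was actually unrolled.
std::uint32_t cunroll_execute(bool unrolled_outermost_loop) {
    return unrolled_outermost_loop ? TODO_force_next_scalar_cleanup : 0u;
}

// Stand-in for the pass manager queuing the unhandled TODO bit.
void pass_manager_record_todo(std::uint32_t todo) {
    pending_todo |= todo & TODO_force_next_scalar_cleanup;
}

// Gate of the cleanup pass: run only if a previous pass asked for it.
bool scalar_cleanup_gate() {
    return (pending_todo & TODO_force_next_scalar_cleanup) != 0;
}

// Execute of the cleanup pass: consume the request by clearing the bit.
void scalar_cleanup_execute() {
    pending_todo &= ~TODO_force_next_scalar_cleanup;
}
```

With this shape the gate stays false for functions where cunroll found
nothing to do, which is the dynamic behavior a static pass property cannot
express.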

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #14 from Alexander Monakov  ---
I see, there are more weaknesses than I thought. For CSE (or rather fwprop?) I
was thinking about a simpler case where the extracted-from value is loaded from
memory, but even in trivial cases RTL optimizers cannot clean it up today (so
it wouldn't get any better with separate temporaries):

#define N 16
typedef int T;
typedef T V __attribute__((vector_size(N)));
T f(V *px, long i)
{
V x = *px;
return x[i];
}

f:
movdqa  (%rdi), %xmm0
movaps  %xmm0, -24(%rsp)
movl    -24(%rsp,%rsi,4), %eax
ret

[Bug gcov-profile/96919] [GCOV] uncovered line of stack allocation while using virutal destructor

2020-09-28 Thread filip.pudak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96919

Filip Puđak  changed:

   What|Removed |Added

 CC||filip.pudak at gmail dot com

--- Comment #3 from Filip Puđak  ---
(In reply to Martin Liška from comment #2)
> Using latest GCC release you can see what happens:
> 
> $ g++ pr96919.cc --coverage && ./a.out && gcov a-pr96919.cc -t
> hello
> libgcov profiling
> error:/home/marxin/Programming/testcases/a-pr96919.gcda:overwriting an
> existing profile data with a different timestamp
> -:0:Source:pr96919.cc
> -:0:Graph:a-pr96919.gcno
> -:0:Data:a-pr96919.gcda
> -:0:Runs:1
> -:1:class Base {
> -:2:public:
> -:3:  Base() = default;
>1*:4:  virtual ~Base() = default;
> --
> _ZN4BaseD0Ev:
> #:4:  virtual ~Base() = default;
> --
> _ZN4BaseD2Ev:
> 1:4:  virtual ~Base() = default;
> --
> -:5:  virtual void foo() = 0;
> -:6:};
> -:7:class Hello : public Base {
> -:8:public:
> -:9:  Hello() = default;
>1*:   10:  ~Hello() = default;
> --
> _ZN5HelloD0Ev:
> #:   10:  ~Hello() = default;
> --
> _ZN5HelloD2Ev:
> 1:   10:  ~Hello() = default;
> --
> -:   11:  void foo() override;
> -:   12:};
> -:   13:
> -:   14:#include 
> -:   15:
> -:   16:using namespace std;
> -:   17:
> 1:   18:void Hello::foo() {
> 1:   19:  cout << "hello" << endl;
> 1:   20:}
> -:   21:
> 1:   22:int main(int argc, char* argv[]) {
> #:   23:  Hello hello;
> 1:   24:  hello.foo();
> 1:   25:  return 0;
> -:   26:}
> 
> So yes, it's a virtual destructor _ZN4BaseD0Ev that is not called.
> And the not executed line:
> #:4:  Hello hello;
> 
> corresponds to a basic block 
> 
>:
> :
>   Hello::~Hello ();
>   resx 2
> 
> which would be executed when the Hello constructor fails.

Hi Martin,

So according to your analysis would you classify this issue as a bug? Or is it
an NBC change that was introduced?

/Filip

[Bug tree-optimization/97228] [11 regression] New ICEs on arm since r11-3426 g:10843f8303509fcba880c6c05c08e4b4ccd24f36

2020-09-28 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97228

--- Comment #1 from Christophe Lyon  ---
It causes regressions in fortran too:
gfortran.dg/assumed_rank_bounds_3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
gfortran.dg/assumed_rank_bounds_3.f90   -O3 -g  (internal compiler error)

[Bug tree-optimization/97228] New: [11 regression] New ICEs on arm since r11-3426 g:10843f8303509fcba880c6c05c08e4b4ccd24f36

2020-09-28 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97228

Bug ID: 97228
   Summary: [11 regression] New ICEs on arm since r11-3426
g:10843f8303509fcba880c6c05c08e4b4ccd24f36
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since r11-3426 g:10843f8303509fcba880c6c05c08e4b4ccd24f36, I have noticed
several regressions on arm, for instance:
--target arm-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a9
--with-fpu neon-fp16

FAIL: gcc.dg/tree-ssa/ifc-cd.c (internal compiler error)
FAIL: gcc.dg/vect/pr59591-1.c (internal compiler error)
FAIL: gcc.dg/vect/pr59591-1.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/slp-cond-5.c (internal compiler error)
FAIL: gcc.dg/vect/slp-cond-5.c -flto -ffat-lto-objects (internal compiler
error)
FAIL: gcc.dg/vect/vect-23.c (internal compiler error)
FAIL: gcc.dg/vect/vect-23.c -flto -ffat-lto-objects (internal compiler error)
FAIL: gcc.dg/vect/vect-cond-reduc-6.c (internal compiler error)
FAIL: gcc.dg/vect/vect-cond-reduc-6.c -flto -ffat-lto-objects (internal
compiler error)

FAIL: gcc.dg/vect/pr59591-1.c (internal compiler error)
FAIL: gcc.dg/vect/pr59591-1.c (test for excess errors)
Excess errors:
during RTL pass: expand
/gcc/testsuite/gcc.dg/vect/pr59591-1.c:23:1: internal compiler error: in
do_store_flag, at expr.c:12388
0x913726 do_store_flag
/gcc/expr.c:12388
0x914e01 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/gcc/expr.c:9623
0x91d8d0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/gcc/expr.c:10165
0xa37e6d expand_normal
/gcc/expr.h:288
0xa37e6d expand_vect_cond_optab_fn
/gcc/internal-fn.c:2602
0x7d28ef expand_call_stmt
/gcc/cfgexpand.c:2612
0x7d28ef expand_gimple_stmt_1
/gcc/cfgexpand.c:3686
0x7d28ef expand_gimple_stmt
/gcc/cfgexpand.c:3851
0x7d43bd expand_gimple_basic_block
/gcc/cfgexpand.c:5892
0x7d6530 execute
/gcc/cfgexpand.c:6576

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #51 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:5b9f76b95528775b3f09d151c56ff80747109498

commit r10-8813-g5b9f76b95528775b3f09d151c56ff80747109498
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 11:07:50 2020 +0100

AArch64: Implement missing _p64 intrinsics for vector permutes

This patch implements some missing vector permute intrinsics operating on
poly64x2_t types.
They are implemented identically to their uint64x2_t brethren.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vtrn1q_p64, vtrn2q_p64, vuzp1q_p64,
vuzp2q_p64, vzip1q_p64, vzip2q_p64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/trn_zip_p64_1.c: New test.

(cherry picked from commit e8e818399d70c5a5a3d30a54d305c6e2b92e2c66)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #53 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:677f34508f1c8fba6a52c77f83eca08b6762ed28

commit r10-8815-g677f34508f1c8fba6a52c77f83eca08b6762ed28
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 17:37:58 2020 +0100

AArch64: Implement missing p128<->f64 reinterpret intrinsics

This patch implements the missing reinterprets to and from poly128_t and
float64x2_t.
I've plugged in the appropriate testing in the advsimd-intrinsics.exp
too.

Bootstrapped and tested on aarch64-none-linux-gnu.
Tested advsimd-intrinsics.exp on arm-none-eabi too to make sure arm
testing isn't affected.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vreinterpretq_f64_p128,
vreinterpretq_p128_f64): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(clean_results): Add float64x2_t cleanup.
(DECL_VARIABLE_128BITS_VARIANTS): Add float64x2_t variable.
* gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c: Add
testing of vreinterpretq_f64_p128, vreinterpretq_p128_f64.

(cherry picked from commit 65c9878641cbe0ed898aa7047b7b994e9d4a5bb1)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #54 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:a6c47f4ce26639bfbc72821ae629b9af7744a9d7

commit r10-8816-ga6c47f4ce26639bfbc72821ae629b9af7744a9d7
Author: Christophe Lyon 
Date:   Fri Sep 25 10:40:18 2020 +

testsuite: [aarch64] Fix aarch64/advsimd-intrinsics/v{trn,uzp,zip}_half.c

Since r11-3402 (g:65c9878641cbe0ed898aa7047b7b994e9d4a5bb1), the
vtrn_half, vuzp_half and vzip_half started failing with

vtrn_half.c:76:17: error: redeclaration of 'vector_float64x2' with no
linkage
vtrn_half.c:77:17: error: redeclaration of 'vector2_float64x2' with no
linkage
vtrn_half.c:80:17: error: redeclaration of 'vector_res_float64x2' with no
linkage

This is because r11-3402 now always declares float64x2 variables for
aarch64, leading to a duplicate declaration in these testcases.

The fix is simply to remove these now useless declarations.

These tests are skipped on arm*, so there is no impact on that target.

2020-09-25  Christophe Lyon  

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/advsimd-intrinsics/vtrn_half.c: Remove
declarations of vector, vector2, vector_res for float64x2 type.
* gcc.target/aarch64/advsimd-intrinsics/vuzp_half.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vzip_half.c: Likewise.

(cherry picked from commit 8c775bf447e190024fa08c55e38db94dd013a393)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #48 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:34db2d23439b39d2c7e5760f0f7de41f98b08c80

commit r10-8810-g34db2d23439b39d2c7e5760f0f7de41f98b08c80
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:03:49 2020 +0100

AArch64: Implement missing vcls intrinsics on unsigned types

This patch implements some missing intrinsics that perform a CLS on
unsigned SIMD types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.

(cherry picked from commit 30957092db46d8798e632feefb5df634488dbb33)
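
For readers unfamiliar with CLS (count leading sign bits), a scalar model of
one 8-bit lane may help. This is only an illustration of the bit-level
operation, assuming the unsigned variants count bits exactly as the signed
ones do; `cls8` is a hypothetical helper name, not an intrinsic.

```cpp
#include <cstdint>

// Scalar model of CLS on one 8-bit lane: count how many consecutive bits
// directly below the top bit have the same value as the top bit.
int cls8(std::uint8_t x) {
    const unsigned top = (x >> 7) & 1u;
    int count = 0;
    for (int bit = 6; bit >= 0; --bit) {
        if (((x >> bit) & 1u) != top)
            break;  // first bit that differs from the top bit ends the run
        ++count;
    }
    return count;
}
```

Under this model both 0x00 and 0xFF give 7, while 0x80 gives 0 because the bit
right below the top bit already differs.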

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #50 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:1c0679d6b5d12fb9d708442a8af725136a17f9cf

commit r10-8812-g1c0679d6b5d12fb9d708442a8af725136a17f9cf
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:32:42 2020 +0100

AArch64: Implement vldrq_p128 intrinsic

This patch implements the missing vldrq_p128 intrinsic that just loads from
the appropriate pointer.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vldrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vldrq_p128_1.c: New test.

(cherry picked from commit f2868e4bcff2c7b882d01231f039459c00e59d7b)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #52 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:858cfd55807883dfe1e051ad7d67d9c0449728f9

commit r10-8814-g858cfd55807883dfe1e051ad7d67d9c0449728f9
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 12:02:29 2020 +0100

AArch64: Implement missing vrndns_f32 intrinsic

This patch implements the missing vrndns_f32 intrinsic. This operates on a
scalar float32_t value.
It can be mapped down to a __builtin_aarch64_frintnsf builtin.

This patch does that.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/aarch64-simd-builtins.def (frintn): Use
BUILTIN_VHSDF_HSDF
for modes.  Remove explicit hf instantiation.
* config/aarch64/arm_neon.h (vrndns_f32): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vrndns_f32_1.c: New test.

(cherry picked from commit 02b5377b3766804059b7824330d33d0e1cef2e5b)
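
FRINTN rounds to the nearest integral value with ties going to even. A
portable scalar model of that rounding, useful for checking expected values on
a host without AArch64, might look like the sketch below. `rndn_model` is a
hypothetical name, and the sketch relies on std::nearbyint honoring the
current rounding mode.

```cpp
#include <cfenv>
#include <cmath>

// Scalar model of round-to-nearest, ties-to-even on a float.
// The rounding mode is set explicitly so the result does not depend on
// whatever mode the caller left active, and restored afterwards.
float rndn_model(float x) {
    const int saved = std::fegetround();
    std::fesetround(FE_TONEAREST);
    const float r = std::nearbyint(x);
    std::fesetround(saved);
    return r;
}
```

Ties are the interesting inputs: 2.5 rounds down to 2 and 3.5 rounds up to 4,
unlike std::round, which rounds halfway cases away from zero.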

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #49 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:bc04ceb7b94d9144ac923210f1affaa3dfed0725

commit r10-8811-gbc04ceb7b94d9144ac923210f1affaa3dfed0725
Author: Kyrylo Tkachov 
Date:   Wed Sep 23 10:29:17 2020 +0100

AArch64: Implement vstrq_p128 intrinsic

This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.

(cherry picked from commit d23ea1e865301cd45f14ccbdb0bca49251fde9e1)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #47 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:b8442a7c4c09375b76fa174313205fa7fcbfb016

commit r10-8809-gb8442a7c4c09375b76fa174313205fa7fcbfb016
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 12:00:38 2020 +0100

AArch64: Implement missing vceq*_p* intrinsics

This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the
unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64):
Define.

gcc/testsuite/

PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.

(cherry picked from commit d4703be185b422f637deebd3bb9222a41c8023d6)

[Bug target/71233] [ARM, AArch64] missing AdvSIMD intrinsics

2020-09-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233

--- Comment #46 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:117b23e43f765cadbf3ca4b80602dc158789675b

commit r10-8808-g117b23e43f765cadbf3ca4b80602dc158789675b
Author: Kyrylo Tkachov 
Date:   Tue Sep 22 11:58:36 2020 +0100

AArch64: Implement poly-type vadd intrinsics

This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the
relevant unsigned types.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.

gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.

(cherry picked from commit fa9ad35dae03dcb20c4ccb50ba1b351a8ab77970)
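
The reason poly-type addition can alias an exclusive-or is that the lanes
represent polynomials over GF(2): coefficients are added modulo 2, so no carry
ever propagates between bit positions. A scalar sketch of one 8-bit lane
(`poly8_add` is a hypothetical helper, not the intrinsic itself):

```cpp
#include <cstdint>

// One lane of a poly8 addition: coefficient-wise addition modulo 2,
// which is exactly a bitwise XOR of the two operands.
std::uint8_t poly8_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a ^ b);
}
```

A consequence of this arithmetic is that every value is its own additive
inverse, so adding a lane to itself yields zero.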

[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

2020-09-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789

--- Comment #32 from Kewen Lin  ---
(In reply to Richard Biener from comment #31)
> (In reply to Kewen Lin from comment #29)
> > (In reply to Hongtao.liu from comment #28)
> > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the 
> > > > statement
> > > 
> > > Yes, it's the place.
> > > 
> > > > is UB to UH conversion statement, further check if the def of the input 
> > > > UB
> > > > is MEM.
> > > 
> > > Only if there's no multi-use for UB. More generally, it's quite difficult 
> > > to
> > > guess later optimizations for the purpose of more accurate vectorization
> > > cost model, :(.
> > 
> > Yeah, it's hard sadly. The generic cost modeling is rough,
> > ix86_add_stmt_cost is more fine-grain (at least than what we have on Power
> > :)), if you want to check it more, it seems doable in target specific hook
> > finish_cost where you can get the whole vinfo object, but it could end up
> > with very heavy analysis and might not be worthy.
> > 
> > Do you mind to check if it can also fix this degradation on x86 to run FRE
> > and DSE just after cunroll? I found it worked for Power, hoped it can help
> > there too.
> 
> Btw, we could try sth like adding a TODO_force_next_scalar_cleanup to be
> returned from passes that see cleanup opportunities and have the pass
> manager queue that up, looking for a special marked pass and enabling
> that so we could have
> 
>   NEXT_PASS (pass_predcom);
>   NEXT_PASS (pass_complete_unroll);
>   NEXT_PASS (pass_scalar_cleanup);
>   PUSH_INSERT_PASSES_WITHIN (pass_scalar_cleanup);
> NEXT_PASS (pass_fre, false /* may_iterate */);
> NEXT_PASS (pass_dse);
>   POP_INSERT_PASSES ();
> 
> with pass_scalar_cleanup gate() returning false otherwise.  Eventually
> pass properties would match this better, or sth else.
> 

Thanks for the suggestion! Before cooking the patch, I have one question: it
looks like updating a function property alone would be enough, e.g. some pass
sets property PROP_ok_for_cleanup and later pass_scalar_cleanup only runs for
functions with this property (checked in gate). I'm not quite sure of the
reason for the TODO flag TODO_force_next_scalar_cleanup.

[Bug libbacktrace/97227] New: dsymutil runs on ELF execs during libbacktrace testing

2020-09-28 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97227

Bug ID: 97227
   Summary: dsymutil runs on ELF execs during libbacktrace testing
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: trivial
  Priority: P3
 Component: libbacktrace
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
CC: ian at gcc dot gnu.org
  Target Milestone: ---

When running libbacktrace check, I run into:
...
make[3]: Entering directory
'/dev/shm/tdevries/data/master/2020-09-24T22-37-10-02-00-942ab9e9d4f/build/libbacktrace'
dsymutil btest
Stack dump:
0.  Program arguments: dsymutil btest
#0 0x7fc89542fd7d llvm::sys::PrintStackTrace(llvm::raw_ostream&)
(/usr/bin/../lib64/libLLVM.so.10+0x9a9d7d)
#1 0x7fc89542d6a0 llvm::sys::RunSignalHandlers()
(/usr/bin/../lib64/libLLVM.so.10+0x9a76a0)
#2 0x7fc895430412 (/usr/bin/../lib64/libLLVM.so.10+0x9aa412)
#3 0x7fc8943295a0 __restore_rt (/lib64/libc.so.6+0x395a0)
#4 0x00410085 _init (/usr/bin/dsymutil-10.0.0+0x410085)
#5 0x7fc89431434a __libc_start_main (/lib64/libc.so.6+0x2434a)
#6 0x0040d69a _init (/usr/bin/dsymutil-10.0.0+0x40d69a)
make[3]: *** [Makefile:2396: btest.dSYM] Segmentation fault (core dumped)
...

The installed dsymutil is from llvm10, which has a bug which is fixed on llvm
trunk by this commit:
...
commit ef87f69ec538ccfe7d68b6d03125e7636e859ace (HEAD)
Author: Greg Clayton 
Date:   Fri Mar 6 14:59:41 2020 -0800

Fix a copy and paste error that would cause a crash.
...

Anyway, after applying that commit we're likely to get:
...
dsymutil btest
error: cannot parse the debug map for 'btest': The file was not recognized as a
valid object file
...
because this tool is intended for mach-o, not for elf.

So, probably we should only do dsymutil btest on a mach-o platform.
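
To illustrate the distinction the tool trips over, the two formats can be told
apart from the first four bytes of the file. The sketch below is not a
proposed fix (that would presumably live in libbacktrace's build machinery);
`ObjectFormat` and `classify_magic` are hypothetical names, and only thin
Mach-O headers are recognized, not fat/universal binaries.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class ObjectFormat { Elf, MachO, Unknown };

// Classify a file by its leading magic bytes.
ObjectFormat classify_magic(const unsigned char *bytes, std::size_t len) {
    if (len < 4)
        return ObjectFormat::Unknown;

    // ELF files start with 0x7f 'E' 'L' 'F'.
    static const unsigned char elf_magic[4] = {0x7f, 'E', 'L', 'F'};
    if (std::memcmp(bytes, elf_magic, 4) == 0)
        return ObjectFormat::Elf;

    // Assemble the first four bytes into one value, most significant first.
    const std::uint32_t magic = (std::uint32_t(bytes[0]) << 24) |
                                (std::uint32_t(bytes[1]) << 16) |
                                (std::uint32_t(bytes[2]) << 8) |
                                std::uint32_t(bytes[3]);
    switch (magic) {
    case 0xfeedface:  // MH_MAGIC (32-bit)
    case 0xcefaedfe:  // MH_CIGAM (32-bit, byte-swapped)
    case 0xfeedfacf:  // MH_MAGIC_64
    case 0xcffaedfe:  // MH_CIGAM_64 (byte-swapped)
        return ObjectFormat::MachO;
    default:
        return ObjectFormat::Unknown;
    }
}
```

A build rule guarded by a check of this kind would skip the dsymutil step for
an ELF btest rather than hand it to a Mach-O-only tool.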

[Bug pch/97226] New: ICE in gt_pch_note_object at ggc-common.c:276

2020-09-28 Thread jonneransijn1998 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97226

Bug ID: 97226
   Summary: ICE in gt_pch_note_object at ggc-common.c:276
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: pch
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jonneransijn1998 at gmail dot com
  Target Milestone: ---

$ cat bug.min.h
extern unsigned int __builtin_ia32_crc32qi(unsigned int, unsigned char);
extern unsigned int __builtin_ia32_crc32hi(unsigned int, unsigned short);
extern unsigned int __builtin_ia32_crc32si(unsigned int, unsigned int);
#pragma GCC push_options
#pragma GCC target("sse4.2")
#pragma GCC pop_options
class ClassName {};

$ x86_64-w64-mingw32-g++-win32 -o bug.gch -c bug.h
bug.h:3:19: internal compiler error: in gt_pch_note_object, at ggc-common.c:276
 class ClassName {};
   ^
0x7f1c02bb409a __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

$ x86_64-w64-mingw32-g++-win32 -v
Using built-in specs.
COLLECT_GCC=x86_64-w64-mingw32-g++-win32
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-w64-mingw32/8.3-win32/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: ../../src/configure --build=x86_64-linux-gnu --prefix=/usr
--includedir='/usr/include' --mandir='/usr/share/man'
--infodir='/usr/share/info' --sysconfdir=/etc --localstatedir=/var
--disable-silent-rules --libdir='/usr/lib/x86_64-linux-gnu'
--libexecdir='/usr/lib/x86_64-linux-gnu' --disable-maintainer-mode
--disable-dependency-tracking --prefix=/usr --enable-shared --enable-static
--disable-multilib --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --libdir=/usr/lib --enable-libstdcxx-time=yes
--with-tune=generic --with-headers=/usr/x86_64-w64-mingw32/include
--enable-version-specific-runtime-libs --enable-fully-dynamic-string
--enable-libgomp --enable-languages=c,c++,fortran,objc,obj-c++,ada --enable-lto
--enable-threads=win32 --program-suffix=-win32
--program-prefix=x86_64-w64-mingw32- --target=x86_64-w64-mingw32
--with-as=/usr/bin/x86_64-w64-mingw32-as
--with-ld=/usr/bin/x86_64-w64-mingw32-ld --enable-libatomic
--enable-libstdcxx-filesystem-ts=yes
Thread model: win32
gcc version 8.3-win32 20190406 (GCC)

$ uname -a
Linux yyny 4.4.0-18362-Microsoft #1049-Microsoft Thu Aug 14 12:01:00 PST 2020
x86_64 GNU/Linux

This is the default MinGW GCC Cross Compiler for the Windows Subsystem for
Linux.

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #13 from Richard Biener  ---
(In reply to Richard Biener from comment #12)
> (In reply to Alexander Monakov from comment #11)
> > Yeah, for inserts such tactic would be inappropriate due to bad store
> > forwarding stalls anyway. As you've shown in earlier comments, inserts have
> > a very nice generic way to expand them (that does not touch stack).
> 
> Unfortunately it doesn't work (the CSE).  Patch:
> 
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 1eaa1da11b9..f7b1a92dd95 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6102,7 +6102,11 @@ discover_nonconstant_array_refs_r (tree * tp, int
> *walk_subtrees,
>  || CONVERT_EXPR_P (t))
> t = TREE_OPERAND (t, 0);
>  
> -  if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
> +  if ((TREE_CODE (t) == ARRAY_REF
> +  && !(TREE_CODE (TREE_OPERAND (t, 0)) == VIEW_CONVERT_EXPR
> +   && DECL_P (TREE_OPERAND (TREE_OPERAND (t, 0), 0)))
> +   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (TREE_OPERAND (t,
> 0), 0
> +  || TREE_CODE (t) == ARRAY_RANGE_REF)
> {
>   t = get_base_address (t);
>   if (t && DECL_P (t)
> 
> 
> and for
> 
> typedef int v4si __attribute__((vector_size(16)));
> 
> int foo (v4si v, int i)
> {
>   v = v + v;
>   return v[i] + v[2*i];
> }
> 
> at -O2 we get
> 
> foo:
> .LFB0:
> .cfi_startproc
> leal    (%rdi,%rdi), %edx
> paddd   %xmm0, %xmm0
> movslq  %edi, %rdi
> movslq  %edx, %rdx
> movaps  %xmm0, -24(%rsp)
> movaps  %xmm0, -40(%rsp)
> movl    -40(%rsp,%rdi,4), %eax
> addl    -24(%rsp,%rdx,4), %eax
> ret

and unpatched

foo:
.LFB0:
.cfi_startproc
leal    (%rdi,%rdi), %edx
paddd   %xmm0, %xmm0
movslq  %edi, %rdi
movslq  %edx, %rdx
movaps  %xmm0, -24(%rsp)
movl    -24(%rsp,%rdi,4), %eax
addl    -24(%rsp,%rdx,4), %eax
ret

so we're able to elide the stack slot usage for the add and retain a single
slot.

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #12 from Richard Biener  ---
(In reply to Alexander Monakov from comment #11)
> Yeah, for inserts such tactic would be inappropriate due to bad store
> forwarding stalls anyway. As you've shown in earlier comments, inserts have
> a very nice generic way to expand them (that does not touch stack).

Unfortunately it doesn't work (the CSE).  Patch:

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 1eaa1da11b9..f7b1a92dd95 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -6102,7 +6102,11 @@ discover_nonconstant_array_refs_r (tree * tp, int
*walk_subtrees,
 || CONVERT_EXPR_P (t))
t = TREE_OPERAND (t, 0);

-  if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
+  if ((TREE_CODE (t) == ARRAY_REF
+  && !(TREE_CODE (TREE_OPERAND (t, 0)) == VIEW_CONVERT_EXPR
+   && DECL_P (TREE_OPERAND (TREE_OPERAND (t, 0), 0)))
+   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (TREE_OPERAND (t, 0),
0
+  || TREE_CODE (t) == ARRAY_RANGE_REF)
{
  t = get_base_address (t);
  if (t && DECL_P (t)


and for

typedef int v4si __attribute__((vector_size(16)));

int foo (v4si v, int i)
{
  v = v + v;
  return v[i] + v[2*i];
}

at -O2 we get

foo:
.LFB0:
.cfi_startproc
leal    (%rdi,%rdi), %edx
paddd   %xmm0, %xmm0
movslq  %edi, %rdi
movslq  %edx, %rdx
movaps  %xmm0, -24(%rsp)
movaps  %xmm0, -40(%rsp)
movl    -40(%rsp,%rdi,4), %eax
addl    -24(%rsp,%rdx,4), %eax
ret

we likely also do not get rid of the stack allocation.  Maybe it's due to the
way expand does the temporary spill, not ending its lifetime, not sure.
We're definitely not "remembering" the spill slot used for 'v' and do
not re-use it, there's no mechanism for that IIRC.

At least we don't ICE for the specific case of vectors.  We're running into

/* If we have either an offset, a BLKmode result, or a reference
   outside the underlying object, we must force it to memory.
   Such a case can occur in Ada if we have unchecked conversion
   of an expression from a scalar type to an aggregate type or
   for an ARRAY_RANGE_REF whose type is BLKmode, or if we were
   passed a partially uninitialized object or a view-conversion
   to a larger size.  */
must_force_mem = (offset
  || mode1 == BLKmode
  || (mode == BLKmode
  && !int_mode_for_size (bitsize, 1).exists ())
  || maybe_gt (bitpos + bitsize,
   GET_MODE_BITSIZE (mode2)));

where 'offset' is MULT_EXPR and we've so far expanded 'v' to
op0 = (reg/v:V4SI 88 [ v ])
and then

/* Otherwise, if this is a constant or the object is not in memory
   and need be, put it there.  */
else if (CONSTANT_P (op0) || (!MEM_P (op0) && must_force_mem))
  {
memloc = assign_temp (TREE_TYPE (tem), 1, 1);
emit_move_insn (memloc, op0);
op0 = memloc;
clear_mem_expr = true;
  }

[Bug fortran/97224] [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc

2020-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |8.5
   Last reconfirmed||2020-09-28
 Resolution|MOVED   |---
 Status|RESOLVED|NEW
   Priority|P3  |P1
 Ever confirmed|0   |1

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #11 from Alexander Monakov  ---
Yeah, for inserts such tactic would be inappropriate due to bad store
forwarding stalls anyway. As you've shown in earlier comments, inserts have a
very nice generic way to expand them (that does not touch stack).

[Bug tree-optimization/97225] Failure to optimize out conditional addition of zero

2020-09-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97225

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2020-09-28
  Component|c   |tree-optimization
   Keywords||missed-optimization
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
I think we've seen a duplicate report for this.

Confirmed.  late phiopt sees (we need hoisting to get rid of the loads
in the two if arms):


  <bb 2> [local count: 1073741824]:
  _1 = vec_6(D)->size;
  pretmp_9 = vec_6(D)->data;
  if (_1 == 0)
    goto <bb 4>; [34.00%]
  else
    goto <bb 3>; [66.00%]

  <bb 3> [local count: 708669601]:
  _3 = _1 * 4;
  _7 = pretmp_9 + _3;

  <bb 4> [local count: 1073741824]:
  # _4 = PHI <_7(3), pretmp_9(2)>
  return _4;

where it misses the value-conversion, possibly either based on lack
of handling the mult+add or because of consideration of making the
not infrequent path more expensive.  Later if-converting this anyway
on RTL shows a disconnect in cost then.
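
At the source level, the missed transformation amounts to noticing that the
zero-size arm adds nothing. The report's test case is not shown here, so the
sketch below is a reconstruction from the GIMPLE above; `IntVec` and both
function names are hypothetical.

```cpp
#include <cstddef>

// Assumed layout, inferred from the dump: a size field and a pointer to
// 4-byte elements.
struct IntVec {
    int *data;
    std::size_t size;
};

// Shape of the code phiopt sees: a branch that skips the addition when
// size is zero.
int *end_branchy(const IntVec *vec) {
    if (vec->size == 0)
        return vec->data;
    return vec->data + vec->size;
}

// Equivalent branch-free form: adding a zero offset yields data unchanged,
// so the condition is redundant.
int *end_branchless(const IntVec *vec) {
    return vec->data + vec->size;
}
```

Both functions return the same pointer for every input, which is why the
conditional can in principle be removed.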

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #10 from rguenther at suse dot de  ---
On Mon, 28 Sep 2020, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
> 
> --- Comment #9 from Alexander Monakov  ---
> (In reply to Richard Biener from comment #8)
> > Note that currently RTL expansion forces a local vector typed variable
> > to the stack (instead of allocating a pseudo) when there are
> > variable-index accesses to it.  That might be a reason to also handle
> > slightly "expensive" extract cases.  But I guess later falling back
> > to a stack slot via a splitter or LRA will lead to worse code.
> 
> Indeed, but I struggle to see a good reason to bind the entire lifetime of a
> variable to memory just because one operation requires that. Cannot GCC instead
> create a fresh temporary early at RTL-expand (not split) time for each extract
> operation, letting the original variable live in a pseudo, and binding only
> that short-lived temporary to memory?
> 
> It can result in extra copies if the temporary needs to be loaded from memory
> anyway, but I think passes like RTL CSE should be able to propagate them.

Sure, that would be possible.  We do this kind of thing for
CONCAT (complex numbers), handling some select cases and falling
back to spilling.  But we've backed off from more general handling
because it ICEd in too many cases.

For inserts we don't want to do this since I'm quite positive that
RTL wouldn't be able to merge two spill slots when doing two
consecutive inserts.

[Bug analyzer/93388] ensure -fanalyzer works with our C code

2020-09-28 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93388

--- Comment #22 from Arseny Solokha  ---
(In reply to David Binderman from comment #21)
> So given that the analyzer doesn't crash on the current Linux kernel code,
> does it do anything useful like find any bugs, perhaps ?

The snapshot I mentioned is the first one ever that can build a whole
allyesconfig Linux w/ -fanalyzer w/o any ICE. I haven't checked its reports on
Linux until now. The analyzer still has some issues, like the one I reported in
PR93695, which accounts for a heavy dose of false positives in my testing.

[Bug fortran/97224] [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc

2020-09-28 Thread mark.eggleston at codethink dot co.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224

--- Comment #5 from mark.eggleston at codethink dot co.uk ---
in progress...

On 28/09/2020 11:43, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224
>
> Jakub Jelinek  changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
>
> --- Comment #4 from Jakub Jelinek  ---
> You've reverted it only on the trunk and not on the release branches though.
>

[Bug target/97194] optimize vector element set/extract at variable position

2020-09-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #9 from Alexander Monakov  ---
(In reply to Richard Biener from comment #8)
> Note that currently RTL expansion forces a local vector typed variable
> to the stack (instead of allocating a pseudo) when there are
> variable-index accesses to it.  That might be a reason to also handle
> slightly "expensive" extract cases.  But I guess later falling back
> to a stack slot via a splitter or LRA will lead to worse code.

Indeed, but I struggle to see a good reason to bind the entire lifetime of a
variable to memory just because one operation requires that. Cannot GCC instead
create a fresh temporary early at RTL-expand (not split) time for each extract
operation, letting the original variable live in a pseudo, and binding only
that short-lived temporary to memory?

It can result in extra copies if the temporary needs to be loaded from memory
anyway, but I think passes like RTL CSE should be able to propagate them.
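
As a rough source-level analogy of the proposal above (not what RTL expansion
actually emits), the difference is between indexing the vector variable
directly and copying it into a short-lived array that alone needs to live in
memory. The sketch uses the GCC/Clang generic vector extension; extract_direct
and extract_via_temp are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang generic vector extension: four 32-bit ints in one vector. */
typedef int v4si __attribute__((vector_size(16)));

/* Variable-index extract written directly on the vector variable.
   Per comment #8, this kind of access is what currently makes RTL
   expansion place the whole variable on the stack. */
int extract_direct(v4si v, unsigned i)
{
    return v[i & 3];
}

/* The same extract through a fresh temporary: only tmp has to be
   addressable, so v itself is free to stay in a register. */
int extract_via_temp(v4si v, unsigned i)
{
    int tmp[4];
    memcpy(tmp, &v, sizeof tmp);   /* one store of the whole vector */
    return tmp[i & 3];             /* one indexed scalar load */
}

/* Both forms must agree for every lane. */
void extract_check(void)
{
    v4si v = {5, 6, 7, 8};
    for (unsigned i = 0; i < 4; i++)
        assert(extract_direct(v, i) == extract_via_temp(v, i));
}
```

The index is masked with 3 in both helpers only so that the sketch stays
well-defined for any argument; the bug report itself does not discuss
out-of-range indices.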

[Bug fortran/97224] [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc

2020-09-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
You've reverted it only on the trunk and not on the release branches though.

[Bug analyzer/93388] ensure -fanalyzer works with our C code

2020-09-28 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93388

--- Comment #21 from David Binderman  ---
So given that the analyzer doesn't crash on the current Linux kernel code,
does it do anything useful like find any bugs, perhaps ?

Maybe gcc compiling itself with the analyzer might find some bugs, too.

[Bug c/97225] New: Failure to optimize out conditional addition of zero

2020-09-28 Thread osandov at osandov dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97225

Bug ID: 97225
   Summary: Failure to optimize out conditional addition of zero
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: osandov at osandov dot com
  Target Milestone: ---

For the following code:

#include <stddef.h>

struct vector {
        int *data;
        size_t size;
};

int *vector_end(struct vector *vec)
{
        return vec->data + vec->size;
}

GCC 10.2.0 on x86-64 generates the following code (same on -O2, -O3, and -Os):

vector_end:
        movq    8(%rdi), %rdx
        movq    (%rdi), %rax
        leaq    (%rax,%rdx,4), %rax
        ret

However, vector_end() needs to handle empty vectors represented as { NULL, 0 }.
Pointer arithmetic on a null pointer is undefined behavior (even NULL + 0, as
far as I can tell from the C standard), so the correct code is:

int *vector_end(struct vector *vec)
{
        if (vec->size == 0)
                return vec->data;
        return vec->data + vec->size;
}

I'd expect this to generate the same code, but GCC 10.2.0 generates a
conditional move with -O2 and -O3:

vector_end:
        movq    8(%rdi), %rdx
        movq    (%rdi), %rax
        testq   %rdx, %rdx
        leaq    (%rax,%rdx,4), %rcx
        cmovne  %rcx, %rax
        ret

And a branch with -Os:

vector_end:
        movq    8(%rdi), %rdx
        movq    (%rdi), %rax
        testq   %rdx, %rdx
        je      .L1
        leaq    (%rax,%rdx,4), %rax
.L1:
        ret

Clang 10.0.1, on the other hand, generates the same code with and without the
size check (oddly enough, it also falls back to a conditional move if the size
member is an int or unsigned int instead of size_t/unsigned long):

vector_end:                             # @vector_end
        movq    8(%rdi), %rax
        shlq    $2, %rax
        addq    (%rdi), %rax
        retq

Can GCC avoid the conditional move/branch here?
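
To make the expectation concrete, here is a small sketch showing why the two
versions could compile to the same code: whenever the unchecked form is
well-defined, both return the same pointer, and the checked form only adds
defined behavior for the { NULL, 0 } case. The _checked and _unchecked
suffixes and the vector_end_check helper are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct vector {
        int *data;
        size_t size;
};

/* First version from the report: undefined for { NULL, 0 }. */
int *vector_end_unchecked(struct vector *vec)
{
        return vec->data + vec->size;
}

/* Second version from the report: skips the addition when size is 0. */
int *vector_end_checked(struct vector *vec)
{
        if (vec->size == 0)
                return vec->data;
        return vec->data + vec->size;
}

/* Self-check: the results agree whenever data is non-null, and the
   checked form handles the empty { NULL, 0 } vector. */
void vector_end_check(void)
{
        int buf[3] = {1, 2, 3};
        struct vector full = {buf, 3};
        struct vector empty = {buf, 0};
        struct vector null_empty = {NULL, 0};

        assert(vector_end_checked(&full) == vector_end_unchecked(&full));
        assert(vector_end_checked(&empty) == vector_end_unchecked(&empty));
        assert(vector_end_checked(&null_empty) == NULL);
}
```

On a flat-address target such as x86-64, adding a zero offset leaves the
pointer unchanged, so the branch-free lea computes what the checked version
must return for every input, including { NULL, 0 }.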

[Bug analyzer/93388] ensure -fanalyzer works with our C code

2020-09-28 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93388

--- Comment #20 from Arseny Solokha  ---
Using gcc-11.0.0-alpha20200927 snapshot I've just managed to build current
Linux master, configured w/ allyesconfig for x86_64, without ICEs, compile-time
hogs, or memory hogs in the analyzer for the first time. That's quite a
milestone.

[Bug c++/97195] construct_at on a union member is not a constant expression

2020-09-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97195

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-09-28
                 CC|                            |jakub at gcc dot gnu.org

[Bug fortran/97224] [8/9/10/11 Regression] SPECCPU 2006 Gamess fails to build after g:e5a76af3a2f3324efc60b4b2778ffb29d5c377bc

2020-09-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97224

--- Comment #3 from Tamar Christina  ---
Cheers, thanks Mark!

[Bug fortran/95614] ICE in build_field, at fortran/trans-common.c:301

2020-09-28 Thread markeggleston at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95614

markeggleston at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #10 from markeggleston at gcc dot gnu.org ---
See 97224
