[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-04-15
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed. This comes down to having a scheduler that reduces live ranges much
more aggressively.

Adding -fschedule-insns helps slightly but not enough in this case.

[Bug c++/108602] FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)

2024-04-15 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108602

Alexandre Oliva  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Alexandre Oliva  ---
I've bisected the fail, that also occurred on arm-eabi, and that led me to the
second patch for bug 112580.  I'm thus marking this as a dupe.

*** This bug has been marked as a duplicate of bug 112580 ***

[Bug c++/112580] [14 Regression]: g++.dg/modules/xtreme-header-4_b.C et al; ICE tree check: expected class 'type', have 'declaration'

2024-04-15 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112580

Alexandre Oliva  changed:

   What|Removed |Added

 CC||danglin at gcc dot gnu.org

--- Comment #12 from Alexandre Oliva  ---
*** Bug 108602 has been marked as a duplicate of this bug. ***

[Bug target/82731] _mm256_set_epi8(array[offset[0]], array[offset[1]], ...) byte gather makes slow code, trying to zero-extend all the uint16_t offsets first and spilling them.

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82731

--- Comment #2 from Andrew Pinski  ---
Note you can reproduce the same issue with SSE2 (and not just AVX):
```

#define vect16 __attribute__((vector_size(16)))

vect16 char gather(char *array, unsigned short *offset) {
  return (vect16 char){
    array[offset[0]],  array[offset[1]],  array[offset[2]],  array[offset[3]],
    array[offset[4]],  array[offset[5]],  array[offset[6]],  array[offset[7]],
    array[offset[8]],  array[offset[9]],  array[offset[10]], array[offset[11]],
    array[offset[12]], array[offset[13]], array[offset[14]]};
}
```
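When comparing code generation for the testcase above, a plain scalar reference can be useful for checking results. The following is only a sketch and is not part of the bug report: `gather_ref` is a made-up name, and it fills all 16 lanes through a temporary buffer rather than a vector initializer.

```c
#include <string.h>

#define vect16 __attribute__((vector_size(16)))

/* Hypothetical reference implementation (not from the report): gather
   16 bytes one at a time into a buffer, then copy the buffer into a
   vector.  */
vect16 char gather_ref(const char *array, const unsigned short *offset)
{
    char bytes[16];
    for (int i = 0; i < 16; i++)
        bytes[i] = array[offset[i]];

    vect16 char result;
    memcpy(&result, bytes, sizeof result);
    return result;
}
```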

[Bug target/85223] [nvptx] nvptx_single needs rewrite

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85223

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||internal-improvement
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-15

--- Comment #2 from Andrew Pinski  ---
This looks like it is still true.

[Bug c/92880] Documentation for Built-in Vector-Extensions should mention C99 Fixed-width ints as base types

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92880

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #4 from Andrew Pinski  ---
I am going to make a slight change to the documentation.
Right now it reads:
```
All the basic integer types can be used as base types, both as signed and as
unsigned: char, short, int, long, long long. In addition, float and double can
be used to build floating-point vector types.

```

But I am going to add a mention of typedefs.
Something like:
```
All the basic integer types can be used as base types (and typedefs of them),
both as signed and as unsigned: char, short, int, long, long long. In addition,
float and double can be used to build floating-point vector types.

```

Which then will fix the documentation here.

There is still more to be done in vector documentation but that is PR 107796 .
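As an illustration of what the reworded paragraph is meant to cover, here is a minimal sketch that uses a C99 fixed-width typedef as the base type. The names `v4si32` and `add4` are invented for this example.

```c
#include <stdint.h>

/* int32_t is a typedef of a basic integer type, so it is accepted as a
   vector base type.  v4si32 is a name made up for this example.  */
typedef int32_t v4si32 __attribute__((vector_size(16)));

v4si32 add4(v4si32 a, v4si32 b)
{
    return a + b;   /* element-wise addition */
}
```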

[Bug middle-end/114700] middle-end optimization generates causes -fsanitize=undefined not to happen in some cases

2024-04-15 Thread lin1.hu at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114700

--- Comment #19 from Hu Lin  ---
(In reply to Jakub Jelinek from comment #18)
> (In reply to Hu Lin from comment #17)
> > (In reply to Jakub Jelinek from comment #16)
> > > 
> > > No, -ftrapv isn't a debugging tool.  There is no overflow in the
> > > expression that GCC actually evaluates (into which the expression has
> > > been optimized).
> > > If you have overflow in an expression that is never used, GCC with
> > > -ftrapv will also eliminate it as unused and won't diagnose the trap.
> > > -fsanitize=undefined behaves in that case actually the same with -O1 and
> > > higher (intentionally, to decrease the cost of the sanitization).  So, one
> > > needs to use -O0 -fsanitize=undefined to get as many cases of UB in the
> > > program diagnosed as possible.
> > 
> > OK, that look like GCC's -ftrapv is not the same as clang's. Then my added
> > condition should be (optimize || !TYPE_OVERFLOW_SANITIZED (type)). 
> 
> Why?  Just !TYPE_OVERFLOW_SANITIZED (type).
> 

OK, so that part is one of your suggestions on how to test for UB in a program.
I have another question: -fsanitize=undefined disables this optimization, but
you said -ftrapv won't diagnose the trap. Why is the logic different for
these two options?

> 
> TYPE_OVERFLOW_SANITIZED is
> #define TYPE_OVERFLOW_SANITIZED(TYPE)   \
>   (INTEGRAL_TYPE_P (TYPE)   \
>    && !TYPE_OVERFLOW_WRAPS (TYPE)   \
>    && (flag_sanitize & SANITIZE_SI_OVERFLOW))
> so, it isn't true for non-integral types, nor for TYPE_OVERFLOW_WRAPS types.
> So, if you want to avoid the (view_convert (negate @1)), just add (if
> !TYPE_OVERFLOW_SANITIZED (type)) above the (view_convert (negate @1)).  But
> in each case, you want to be careful which exact type you want to check,
> type is the type of
> the outermost expression, otherwise TREE_TYPE (@0) etc.

Thanks for your advice.
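For context on why a `negate` matters to the sanitizer: for a signed type, the one input whose negation overflows is the minimum value. The sketch below is independent of GCC's internals and only demonstrates that case with the `__builtin_sub_overflow` builtin; `checked_negate` is a hypothetical name.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: negate x, reporting overflow instead of invoking
   undefined behavior.  Returns true when -x does not fit in int.  */
bool checked_negate(int x, int *result)
{
    return __builtin_sub_overflow(0, x, result);
}
```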

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #9 from kugan at gcc dot gnu.org ---
Looking at the options, it looks to me that making loop->safelen a poly_int is
the way to go.

(In reply to Jakub Jelinek from comment #4)
> The OpenMP safelen clause argument is a scalar integer, so using poly_int
> for something that must be an int doesn't make sense.
> Though, the above testcase actually doesn't use safelen clause, so safelen
> is there effectively infinity.
Thanks. I was looking at this to see if there is a way to handle this
differently. Looks to me that making loop->safelen a poly_int is the way to
handle at least the case when omp safelen clause is not provided. I am
interested in looking into this. Any suggestions? Here is a completely untested
diff that makes loop->safelen a poly_int.

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #10 from kugan at gcc dot gnu.org ---
Created attachment 57946
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57946&action=edit
patch

patch to make loop->safelen a poly_int

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #11 from Jakub Jelinek  ---
(In reply to kugan from comment #9)
> Looking at the options, looks to me that making loop->safelen a poly_int is
> the way to go. (In reply to Jakub Jelinek from comment #4)
> > The OpenMP safelen clause argument is a scalar integer, so using poly_int
> > for something that must be an int doesn't make sense.
> > Though, the above testcase actually doesn't use safelen clause, so safelen
> > is there effectively infinity.
> Thanks. I was looking at this to see if there is a way to handle this
> differently. Looks to me that making loop->safelen a poly_int is the way to
> handle at least the case when omp safelen clause is not provided.

Why?
Then it just is INT_MAX value, which is a magic value that says that it is
infinity.
No need to say it is a poly_int infinity.

[Bug ipa/114703] Missed devirtualization in rather simple case

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114703

Richard Biener  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Yep, 'new' memory escapes.

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #12 from kugan at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #11)
> (In reply to kugan from comment #9)
> > Looking at the options, looks to me that making loop->safelen a poly_int is
> > the way to go. (In reply to Jakub Jelinek from comment #4)
> > > The OpenMP safelen clause argument is a scalar integer, so using poly_int
> > > for something that must be an int doesn't make sense.
> > > Though, the above testcase actually doesn't use safelen clause, so safelen
> > > is there effectively infinity.
> > Thanks. I was looking at this to see if there is a way to handle this
> > differently. Looks to me that making loop->safelen a poly_int is the way to
> > handle at least the case when omp safelen clause is not provided.
> 
> Why?
> Then it just is INT_MAX value, which is a magic value that says that it is
> infinity.
> No need to say it is a poly_int infinity.

For this test case, omp_max_vf gets [16, 16] from the backend. This then
becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf, min_vf)
after applying safelen?

[Bug tree-optimization/114704] Missed optimization : eliminate store if the value is known in all predecessors

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114704

--- Comment #5 from Richard Biener  ---
We're not handling "phi translation" in the lookup phase when determining if
there's a redundant store (PHI translation for the virtual operand).  In
particular value-numbering never considers whether an expression
in multiple paths into a CFG merge value-numbers the same.  This is only
done as part of PRE which figures some extra fully redundant expressions.
But the redundant store removal is something done after-the-fact using
just the VN machinery.

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #13 from Jakub Jelinek  ---
(In reply to kugan from comment #12)
> > Why?
> > Then it just is INT_MAX value, which is a magic value that says that it is
> > infinity.
> > No need to say it is a poly_int infinity.
> 
> For this test case, omp_max_vf gets [16, 16] from the backend. This then
> becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf,
> min_vf)) after applying safelen?

No.  You should just special case loop->safelen == INT_MAX to mean infinity in
the comparisons where it currently doesn't work as infinity.
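The special-casing suggested here could look roughly like the sketch below. It is not GCC code: `safelen_permits_vf` is a hypothetical helper, and treating a safelen of 0 as "no guarantee" is an assumption of this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: may a loop with the given safelen be
   vectorized with vectorization factor vf?  INT_MAX is the magic value
   meaning "no limit".  Treating 0 as "no guarantee" is an assumption.  */
bool safelen_permits_vf(int safelen, unsigned long vf)
{
    if (safelen == INT_MAX)
        return true;                    /* infinity: any vf is fine */
    return safelen > 0 && vf <= (unsigned long) safelen;
}
```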

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #14 from Richard Biener  ---
I think

  if (safelen)
    {
      poly_uint64 val;
      safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
      if (!poly_int_tree_p (safelen, &val))
        safelen_int = 0;
      else
        safelen_int = MIN (constant_lower_bound (val), INT_MAX);

should simply become

safelen_int = constant_upper_bound_with_limit (val, INT_MAX);

?  Usually targets do have a limit on the actual length but I see
constant_upper_bound_with_limit doesn't query such.  But it would
be a more appropriate way to say there might be an actual target limit here?
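The difference between the two helpers can be modeled roughly as follows. This is a simplified sketch under the assumption that a poly_int value has the form c0 + c1 * x for a non-negative runtime quantity x; the struct and function names are made up, and the real definitions in poly-int.h should be consulted for the exact behavior.

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified model of a two-coefficient poly_int: the value is
   c0 + c1 * x for a runtime quantity x >= 0.  Illustration only.  */
struct poly_model {
    unsigned long c0;
    unsigned long c1;
};

/* Smallest value the model can take, reached at x == 0.  */
unsigned long model_lower_bound(struct poly_model p)
{
    return p.c0;
}

/* c0 when the value is a compile-time constant no larger than limit,
   otherwise limit itself.  */
unsigned long model_upper_bound_with_limit(struct poly_model p,
                                           unsigned long limit)
{
    bool is_constant = (p.c1 == 0);
    if (is_constant && p.c0 <= limit)
        return p.c0;
    return limit;
}
```

In this model a length-agnostic value such as 16 + 16x has lower bound 16 but an upper bound equal to the limit, which is the distinction the question is about.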

[Bug target/114717] New: '-fcf-protection' vs. offloading compilation

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

Bug ID: 114717
   Summary: '-fcf-protection' vs. offloading compilation
   Product: gcc
   Version: 14.0
   URL: https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: GCN, nvptx

If '-fcf-protection' is in effect (as, for example, enabled by default in
certain distributions), that option gets forwarded to the offloading compilers,
but for both GCN and nvptx:

lto1: error: ‘-fcf-protection=full’ is not supported for this target

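Originally reported by Oscar Barenys in
<https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051>.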

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #15 from rguenther at suse dot de  ---
On Mon, 15 Apr 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
> 
> --- Comment #13 from Jakub Jelinek  ---
> (In reply to kugan from comment #12)
> > > Why?
> > > Then it just is INT_MAX value, which is a magic value that says that it is
> > > infinity.
> > > No need to say it is a poly_int infinity.
> > 
> > For this test case, omp_max_vf gets [16, 16] from the backend. This then
> > becomes 16. If we keep it as poly_int, it would pass maybe_lt (max_vf,
> > min_vf)) after applying safelen?
> 
> No.  You should just special case loop->safelen == INT_MAX to mean infinity in
> the comparisons where it currently doesn't work as infinity.

But then an actual safelen(INT_MAX) would need to be adjusted.

Maybe using a poly-int safelen internally is cleaner.

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #16 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #14)
> I think
> 
>   if (safelen)
> {
>   poly_uint64 val;
>   safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
>   if (!poly_int_tree_p (safelen, &val))
> safelen_int = 0;
>   else
> safelen_int = MIN (constant_lower_bound (val), INT_MAX);
> 
> should simply become
> 
> safelen_int = constant_upper_bound_with_limit (val, INT_MAX);
> 
> ?  Usually targets do have a limit on the actual length but I see
> constant_upper_bound_with_limit doesn't query such.  But it would
> be a more appropriate way to say there might be an actual target limit here?

OMP_CLAUSE_SAFELEN_EXPR is always an INTEGER_CST, the FEs verify that and error
if it is not.  So, I must say I don't really understand parts of the
r8-5649-g9d2f08ab97be
changes.  I can understand the intent to make max_vf a poly_int, but don't
understand why safelen should be, what would it mean and when it would be set
that way?

[Bug tree-optimization/114711] Missed optimization: fold load of global constant array if there is obivous pattern

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114711

--- Comment #4 from Richard Biener  ---
This one requires "symbolicizing" an initializer.  That might for example also
help implementing a non-constant initializer with a loop, reducing .data and
possibly relocations.  It might also help reducing compile-time memory usage
when we can have a more compact representation of a CONSTRUCTOR and its
elements.

It feels somewhat similar to what we do in switch-conversion, so maybe some
common infrastructure could be built.

[Bug target/114718] New: GCN's '-march'es vs. default multilib

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114718

Bug ID: 114718
   Summary: GCN's '-march'es vs. default multilib
   Product: gcc
   Version: 14.0
   URL: https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

When a specific multilib build for GCN's '-march'es is not available (has not
been 'configure'd/packaged), GCC resorts to the default multilib build -- which
in the case of GCN won't even link: 'ld: error: incompatible mach'.  Rather than
attempting the latter (default multilib build), should this case be diagnosed
properly?

(Independent of the vague idea that multilib builds for GCN be made "more
permeable".)

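Reported, for example, by Oscar Barenys in
<https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051>.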

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #17 from rguenther at suse dot de  ---
On Mon, 15 Apr 2024, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
> 
> --- Comment #16 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #14)
> > I think
> > 
> >   if (safelen)
> > {
> >   poly_uint64 val;
> >   safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
> >   if (!poly_int_tree_p (safelen, &val))
> > safelen_int = 0;
> >   else
> > safelen_int = MIN (constant_lower_bound (val), INT_MAX);
> > 
> > should simply become
> > 
> > safelen_int = constant_upper_bound_with_limit (val, INT_MAX);
> > 
> > ?  Usually targets do have a limit on the actual length but I see
> > constant_upper_bound_with_limit doesn't query such.  But it would
> > be a more appropriate way to say there might be an actual target limit here?
> 
> OMP_CLAUSE_SAFELEN_EXPR is always an INTEGER_CST, the FEs verify that and 
> error
> if it is not.  So, I must say I don't really understand parts of the
> r8-5649-g9d2f08ab97be
> changes.  I can understand the intent to make max_vf a poly_int, but don't
> understand why safelen should be, what would it mean and when it would be set
> that way?

It would be only to "better" encode "infinity".  But I see loop->safelen
is 'int' but only [0, INT_MAX] is specified so we'd conveniently have
say -1 to encode "infinity".  We'd have to special case that value
anyway?

[Bug target/114718] GCN's '-march'es vs. default multilib

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114718

Andrew Pinski  changed:

   What|Removed |Added

URL|https://github.com/gcc-mirr |
   |or/gcc/commit/1bf18629c54ad |
   |f4893c8db5227a36e1952ee69a3 |
   |#commitcomment-140648051|

--- Comment #1 from Andrew Pinski  ---
https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051

[Bug driver/114718] GCN's '-march'es vs. default multilib

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114718

Andrew Pinski  changed:

   What|Removed |Added

  Component|target  |driver
   Severity|normal  |enhancement
   Keywords||diagnostic

[Bug lto/114713] incorrect TBAA for struct with flexible array member or GNU zero size

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713

Richard Biener  changed:

   What|Removed |Added

Version|unknown |14.0
 CC||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
[] and [0] are not the same.  Note the C and C++ frontends encode [0] (or did
encode) differently.

See tree.cc:gimple_canonical_types_compatible_p

I agree that from a QOI perspective [] and [0] should be compatible.

Note [0] can have many "equivalent" domains, [3:2], [0:-1], [1:0], etc.
but the current code has

  /* Array types are the same if the element types are the same and
 the number of elements are the same.  */
...
  /* The minimum/maximum values have to be the same.  */
  if ((min1 == min2
   || (min1 && min2

which is somewhat contradicting comments.

See also layout_type which says, when computing TYPE_SIZE of an array:

/* ??? We have no way to distinguish a null-sized array from an
   array spanning the whole sizetype range, so we arbitrarily
   decide that [0, -1] is the only valid representation.  */
if (integer_zerop (length)
&& TREE_OVERFLOW (length)
&& integer_zerop (lb))
  length = size_zero_node;

revealing that the constraints on TYPE_DOMAIN are not very well specified.
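The ambiguity mentioned in the layout_type comment can be seen with plain wrapping arithmetic. This is an illustrative sketch only: `domain_length` is a hypothetical helper, and modeling sizetype as `uint64_t` is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: number of elements in an array domain [lb:ub],
   computed as ub - lb + 1 with wrapping unsigned arithmetic.  Modeling
   sizetype as uint64_t is an assumption of this sketch.  */
uint64_t domain_length(uint64_t lb, uint64_t ub)
{
    return ub - lb + 1;
}
```

With this model, [3:2], [1:0] and [0:-1] all yield a length of 0, and [0:-1] is also what a domain spanning the whole unsigned range looks like.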

[Bug c++/114600] [14 Regression] [modules] redefinition errors when using certain std headers in GMF

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114600

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Nathaniel Shead :

https://gcc.gnu.org/g:3878e9aeb30cb192f769997c52743daf8190744c

commit r14-9961-g3878e9aeb30cb192f769997c52743daf8190744c
Author: Nathaniel Shead 
Date:   Mon Apr 8 23:34:42 2024 +1000

c++: Only emit exported GMF usings [PR114600]

A typo in r14-6978 made us emit too many things. This ensures that we
don't emit using-declarations from the GMF that we don't need to.

PR c++/114600

gcc/cp/ChangeLog:

* module.cc (depset::hash::add_binding_entity): Require both
WMB_Using and WMB_Export for GMF entities.

gcc/testsuite/ChangeLog:

* g++.dg/modules/using-14.C: New test.

Signed-off-by: Nathaniel Shead 
Co-authored-by: Patrick Palka 

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

Andrew Pinski  changed:

   What|Removed |Added

URL|https://github.com/gcc-mirr |
   |or/gcc/commit/1bf18629c54ad |
   |f4893c8db5227a36e1952ee69a3 |
   |#commitcomment-140648051|
  Component|target  |driver

--- Comment #1 from Andrew Pinski  ---
https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051

This option should be masked off when calling the offload-lto ...

[Bug driver/114718] GCN's '-march'es vs. default multilib

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114718

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-04-15
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
This is a standard ABI-changing option issue. I doubt there is anything that can
be done, as you don't know whether the library was installed separately and might
just happen to work with some extra -L options.

Maybe describe what kind of error message you are expecting instead of the
current one which is coming from the linker when doing the final link?

[Bug lto/114713] incorrect TBAA for struct with flexible array member or GNU zero size

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114713

--- Comment #2 from Richard Biener  ---
Also note that people might find it reasonable to access

struct { int n; int a[4]; } a = { 4, };

via

struct X { int n; int a[]; } *p;

The Fortran frontend goes to some lengths to make this work for array
descriptors statically allocated by accessing the storage always
via the type with the flexible array member.  But that also relies
on GCC middle-end semantics, accessing an automatic variable with
a declared type via a not compatible type, thus changing its effective type.

The only valid C way is to resort to dynamic (stack) allocation.
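A minimal sketch of dynamic allocation for a struct with a flexible array member follows. It uses heap allocation with `malloc` rather than the stack allocation mentioned above, purely to keep the example simple; `make_x` is a hypothetical helper.

```c
#include <stdlib.h>

struct X { int n; int a[]; };

/* Hypothetical helper: allocate a struct X with room for n elements.
   Returns NULL on allocation failure; the caller frees the result.  */
struct X *make_x(int n)
{
    if (n < 0)
        return NULL;
    struct X *p = malloc(sizeof *p + (size_t) n * sizeof p->a[0]);
    if (p == NULL)
        return NULL;
    p->n = n;
    for (int i = 0; i < n; i++)
        p->a[i] = 0;
    return p;
}
```

Because the storage has no declared type, accessing it through `struct X` gives it that effective type, which avoids the mismatch with a separately declared `int a[4]` variant.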

[Bug tree-optimization/114719] New: Missed optimization: conditional in loop is an invariant (a%2)

2024-04-15 Thread 652023330028 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114719

Bug ID: 114719
   Summary: Missed optimization: conditional in loop is an
invariant (a%2)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 652023330028 at smail dot nju.edu.cn
  Target Milestone: ---

Hello, we noticed that maybe there is a missed optimization for Loop Unswitch.

In the following reduced code, the value of 'a%2' is the same each time through
the loop.

https://godbolt.org/z/49cGaanr8

unsigned a, b;
int m,n;
void func() {
  for (unsigned i=0; i < 1000; i++) {
    if(a%2)
      b++;
    a += 2;
  }
}

But GCC -O3 -fwrapv:
   [local count: 1063004408]:
  # b_lsm.7_10 = PHI 
  # b_lsm_flag.8_11 = PHI <0(2), b_lsm_flag.8_35(5)>
  # ivtmp.13_24 = PHI 
  # DEBUG i => NULL
  # DEBUG BEGIN_STMT
  _32 = ivtmp.13_24 & 1;
  if (_32 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 531502204]:
  # DEBUG BEGIN_STMT
  _33 = b_lsm.7_10 + 1;

   [local count: 1063004408]:
  # b_lsm.7_34 = PHI 
  # b_lsm_flag.8_35 = PHI 
  # DEBUG BEGIN_STMT
  # DEBUG BEGIN_STMT
  # DEBUG i => NULL
  # DEBUG BEGIN_STMT
  ivtmp.13_28 = ivtmp.13_24 + 2;
  if (ivtmp.13_28 != _37)
goto ; [98.99%]
  else
goto ; [1.01%]

Thank you very much for your time and effort! We look forward to hearing from
you.
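For reference, the result one might hope for can be written by hand. This is only a sketch of the expected transformation, not compiler output; the functions take pointers instead of using the globals so they can be compared side by side, and both names are invented here.

```c
/* Original loop from the report, on explicit operands.  */
void func_loop(unsigned *a, unsigned *b)
{
    for (unsigned i = 0; i < 1000; i++) {
        if (*a % 2)
            (*b)++;
        *a += 2;
    }
}

/* Hand-unswitched form: adding 2 never changes the parity of an unsigned
   value (wraparound is modulo a power of two), so the test is hoisted.  */
void func_unswitched(unsigned *a, unsigned *b)
{
    if (*a % 2)
        *b += 1000;
    *a += 2000;
}
```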

[Bug gcov-profile/114720] New: gcc.misc-tests/gcov-22.c loops

2024-04-15 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114720

Bug ID: 114720
   Summary: gcc.misc-tests/gcov-22.c loops
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
  Target Milestone: ---
Target: sparc*-sun-solaris2.11, sparc64-unknown-linux-gnu

The new gcc.misc-tests/gcov-22.c test loops on Solaris/SPARC (32 and 64-bit)
and Linux/sparc64:

WARNING: gcc.misc-tests/gcov-22.c execution test program timed out.
+FAIL: gcc.misc-tests/gcov-22.c execution test
+FAIL: gcc.misc-tests/gcov-22.c gcov: 0 failures in line counts, 0 in branch
percentages, 32 in condition/decision, 0 in return percentages, 0 in
intermediate format
[...]

truss -u reveals the test loops in longjmp like this:

/1@1:   -> libc:longjmp(0x25450, 0x1, 0xffbfe9e8, 0x12b04)
/1@1:   <- libc:longjmp() = 1
/1@1:   -> libc:longjmp(0x25450, 0x1, 0xffbfe9e8, 0x12b04)
/1@1:   <- libc:longjmp() = 1
[...]

I'm astonished the test works anywhere, actually.  AFAICS, the issue is this:

  setdest
  -> setjmp returns 0
 setdest returns 2
  -> jump
  -> longjmp makes setdest return 1
  -> jump
  -> longjmp ...

continuing ad infinitum.
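For contrast, here is a minimal setjmp/longjmp pattern that terminates. It is not taken from gcov-22.c and the names are hypothetical. The sketch assumes the relevant difference is that setjmp is called in a frame that is still active when longjmp runs, which C requires, and that the state deciding whether to jump again survives the jump.

```c
#include <setjmp.h>

static jmp_buf env;

/* Jump back to the most recent setjmp(env).  */
static void jump(void)
{
    longjmp(env, 1);
}

/* Hypothetical example: setjmp is called in a frame that is still active
   when longjmp runs, and the counter is volatile so that its value is
   well defined after the jump.  Returns the number of jumps taken.  */
int count_jumps(int limit)
{
    volatile int jumps = 0;

    if (setjmp(env) != 0)
        jumps = jumps + 1;      /* we arrived here via longjmp */

    if (jumps < limit)
        jump();

    return jumps;
}
```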

[Bug gcov-profile/114715] Gcov allocates branches to wrong row for nested switches

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114715

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-04-15
  Known to fail||13.2.0
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Confirmed.  The branches emitted by pass_lower_switch_O0 lack location info,
so does the switch itself, already since at least gimplification.

The GENERIC SWITCH_EXPR has

{file = 0x4926cc0 "/tmp/t.c", line = 12, column = 9, data = 0x0, 
  sysp = false}

we set a location on the BIND we wrap the switch with though.  The outer
switch is not wrapped in a BIND.  That one gets the location from
annotate_all_with_location_after but this function doesn't recurse into
structured gimple.

It's easy to manually set the switch stmt location, annotating random
nested GIMPLE with a location might be odd.  Maybe instead of

  gimple_set_location (bind, EXPR_LOCATION (switch_expr));

we should also set all of the BIND_BODY locations as well?

Jakub, any preference?

[Bug gcov-profile/114715] Gcov allocates branches to wrong row for nested switches

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114715

--- Comment #2 from Richard Biener  ---
Besides OMP the switch gimplification code is the only one building a new BIND.

[Bug gcov-profile/114715] Gcov allocates branches to wrong row for nested switches

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114715

--- Comment #3 from Richard Biener  ---
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 3df58b962f3..26e96ada4c7 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3017,6 +3017,7 @@ gimplify_switch_expr (tree *expr_p, gimple_seq *pre_p)

   switch_stmt = gimple_build_switch (SWITCH_COND (switch_expr),
 default_case, labels);
+  gimple_set_location (switch_stmt, EXPR_LOCATION (switch_expr));
   /* For the benefit of -Wimplicit-fallthrough, if switch_body_seq
 ends with a GIMPLE_LABEL holding SWITCH_BREAK_LABEL_P LABEL_DECL,
 wrap the GIMPLE_SWITCH up to that GIMPLE_LABEL into a GIMPLE_BIND,

fixes the testcase

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-15 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635

--- Comment #18 from kugan at gcc dot gnu.org ---
Also, can we set INT_MAX when there is no explicit safelen specified in OMP?
Something like:

--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -6975,14 +6975,11 @@ lower_rec_input_clauses (tree clauses, gimple_seq
*ilist, gimple_seq *dlist,
 {
   tree c = omp_find_clause (gimple_omp_for_clauses (ctx->stmt),
OMP_CLAUSE_SAFELEN);
-  poly_uint64 safe_len;
-  if (c == NULL_TREE
- || (poly_int_tree_p (OMP_CLAUSE_SAFELEN_EXPR (c), &safe_len)
- && maybe_gt (safe_len, sctx.max_vf)))
+  if (c == NULL_TREE)
{
  c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_SAFELEN);
  OMP_CLAUSE_SAFELEN_EXPR (c) = build_int_cst (integer_type_node,
-  sctx.max_vf);
+  INT_MAX);
  OMP_CLAUSE_CHAIN (c) = gimple_omp_for_clauses (ctx->stmt);
  gimple_omp_for_set_clauses (ctx->stmt, c);
}

[Bug gcov-profile/114715] Gcov allocates branches to wrong row for nested switches

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114715

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug ipa/114703] Missed devirtualization in rather simple case

2024-04-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114703

--- Comment #3 from Jan Hubicka  ---
> Yep, 'new' memory escapes.
Yep, this is blocking a lot of propagation in common C++ code.
Here it may help to do speculative devirtualization during the IPA stage
that will let the late optimization get rid of the speculation (since
after inlining we will know that the virtual call does not overwrite
the virtual table pointer).  This is technically not too hard to add.  We
can optimistically rule out (some) may aliases while walking the alias
oracle.   I will take a look next stage1.

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

--- Comment #2 from Li Pan  ---
The vzext.vf2 has an earlyclobber destination operand, so the destination cannot
be allocated to the same register as the source operand, as in vzext.vf2 v0, v0.
Thus we fail in check_rtl.

(define_insn "@pred__vf2"
  [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr,
vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr")
(if_then_else:VWEXTI
  (unspec:
[(match_operand: 1 "vector_mask_operand"   " vm,Wc1,
vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1")
 (match_operand 4 "vector_length_operand"  " rK, rK,
rK, rK, rK, rK, rK, rK, rK, rK, rK, rK,   rK,   rK")
 (match_operand 5 "const_int_operand"  "i,  i,  i, 
i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
 (match_operand 6 "const_int_operand"  "i,  i,  i, 
i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
 (match_operand 7 "const_int_operand"  "i,  i,  i, 
i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (any_extend:VWEXTI
(match_operand: 3 "register_operand"  
"W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,   vr,   vr"))
  (match_operand:VWEXTI 2 "vector_merge_operand"   " vu, vu, 
0,  0, vu, vu,  0,  0, vu, vu,  0,  0,   vu,0")))]
  "TARGET_VECTOR"
  "vext.vf2\t%0,%3%p1"
  [(set_attr "type" "vext")
   (set_attr "mode" "")
   (set_attr "group_overlap"
"W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")])



insn 1205 1214 5405 70 (set (reg:RVVM1SI 97 v1 [orig:687 _1177 ] [687])
(if_then_else:RVVM1SI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(reg:DI 25 s9 [orig:539 _889 ] [539])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(zero_extend:RVVM1SI (reg:RVVMF2HI 97 v1 [orig:654 _1100 ] [654]))
(unspec:RVVM1SI [
(reg:DI 0 zero)
] UNSPEC_VUNDEF))) "../hwy/ops/rvv-inl.h":1964:386 discrim 1
8452 {pred_zero_extendrvvm1si_vf2}
 (nil))
during RTL pass: reload

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403

--- Comment #28 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:85002f8085c25bb3e74ab013581a74e7c7ae006b

commit r14-9969-g85002f8085c25bb3e74ab013581a74e7c7ae006b
Author: Tamar Christina 
Date:   Mon Apr 15 12:06:21 2024 +0100

middle-end: adjust loop upper bounds when peeling for gaps and early break
[PR114403].

This fixes a bug with the interaction between peeling for gaps and early
break.

Before I go further, I'll first explain how I understand this to work for
loops with a single exit.

When peeling for gaps we peel N < VF iterations to scalar.
This happens by removing N iterations from the calculation of niters such
that vect_iters * VF == niters is always false.

In other words, when we exit the vector loop we always fall to the scalar
loop.  The loop bounds adjustment guarantees this.  Because of this we
potentially execute a vector loop iteration less.  That is, if you're at the
boundary condition where niters % VF == 0, by peeling one or more scalar
iterations the vector loop executes one less.

This is accounted for by the adjustments in vect_transform_loops.  This
adjustment happens differently based on whether the vector loop can be
partial or not:

Peeling for gaps sets the bias to 0 and then:

when not partial:  we take the floor of (scalar_upper_bound / VF) - 1 to get
                   the vector latch iteration count.

when loop is partial:  For a single exit this means the loop is masked, we
                       take the ceil to account for the fact that the loop
                       can handle the final partial iteration using masking.

Note that there's no difference between ceil and floor on the boundary
condition.  There is a difference however when you're slightly above it,
i.e. if scalar iterates 14 times and VF = 4 and we peel 1 iteration for gaps.

The partial loop does ((13 + 0) / 4) - 1 == 2 vector iterations, and in
effect the partial iteration is ignored and it's done as scalar.

This is fine because the niters modification has capped the vector iteration
at 2, so that when we reduce the induction values you end up entering the
scalar code with ind_var.2 = ind_var.1 + 2 * VF.
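
The floor/ceil arithmetic above can be checked with a small standalone sketch.
This is not the vectorizer's code: latch_bound is a hypothetical helper that
assumes a bias of 0, vf > 0 and scalar_iters >= peeled, and it only reproduces
the numbers used in the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bound on the vector latch iteration count once
// `peeled` scalar iterations have been removed for gaps.
constexpr std::uint64_t latch_bound(std::uint64_t scalar_iters,
                                    std::uint64_t peeled,
                                    std::uint64_t vf, bool partial)
{
  const std::uint64_t n = scalar_iters - peeled;
  // floor for a non-partial loop, ceil for a partial (masked) loop.
  const std::uint64_t q = partial ? (n + vf - 1) / vf : n / vf;
  return q == 0 ? 0 : q - 1;
}

inline void check_latch_bound()
{
  // 14 scalar iterations, VF = 4, 1 peeled: floor(13 / 4) - 1 == 2.
  assert(latch_bound(14, 1, 4, false) == 2);
  // Taking the ceil keeps the final partial iteration: ceil(13 / 4) - 1 == 3.
  assert(latch_bound(14, 1, 4, true) == 3);
  // On the boundary (12 remaining iterations) floor and ceil agree.
  assert(latch_bound(13, 1, 4, false) == latch_bound(13, 1, 4, true));
}
```

With the floor, the scalar loop is entered after 2 * VF elements, matching
ind_var.2 = ind_var.1 + 2 * VF.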

Now let's look at early breaks.  To make it easier I'll focus on the specific
testcase:

char buffer[64];

__attribute__ ((noipa))
buff_t *copy (buff_t *first, buff_t *last)
{
  char *buffer_ptr = buffer;
  char *const buffer_end = &buffer[SZ-1];
  int store_size = sizeof(first->Val);
  while (first != last && (buffer_ptr + store_size) <= buffer_end)
    {
      const char *value_data = (const char *)(&first->Val);
      __builtin_memcpy(buffer_ptr, value_data, store_size);
      buffer_ptr += store_size;
      ++first;
    }

  if (first == last)
    return 0;

  return first;
}

Here the first, early exit is on the condition:

  (buffer_ptr + store_size) <= buffer_end

and the main exit is on condition:

  first != last

This is important, as this bug only manifests itself when the first exit has a
known constant iteration count that's lower than the latch exit count.

Because buffer holds 64 bytes, and VF = 4, unroll = 2, we end up processing 16
bytes per iteration.  So the exit has a known bounds of 8 + 1.

The vectorizer correctly analyzes this:

Statement (exit)if (ivtmp_21 != 0)
 is executed at most 8 (bounded by 8) + 1 times in loop 1.

and as a consequence the IV is bound by 9:

  # vect_vec_iv_.14_117 = PHI <_118(9), { 9, 8, 7, 6 }(20)>
  ...
  vect_ivtmp_21.16_124 = vect_vec_iv_.14_117 + { 18446744073709551615,
    18446744073709551615, 18446744073709551615, 18446744073709551615 };
  mask_patt_22.17_126 = vect_ivtmp_21.16_124 != { 0, 0, 0, 0 };
  if (mask_patt_22.17_126 == { -1, -1, -1, -1 })
goto ; [88.89%]
  else
goto ; [11.11%]

The important bits are these:

In this example the value of last - first = 416.

the calculated vector iteration count, is:

x = (((ptr2 - ptr1) - 16) / 16) + 1 = 27

the bounds generated, adjusting for gaps:

   x == (((x - 1) >> 2) << 2)

which means we'll always fall through to the scalar code, as intended.
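
A quick way to see why that comparison can never hold is to evaluate it
directly.  The sketch below is only an illustration: round_down_after_peel and
skips_scalar_loop are hypothetical names, and the shift by 2 corresponds to
VF = 4 as in the text.

```cpp
#include <cassert>
#include <cstdint>

// ((x - 1) >> 2) << 2: drop one iteration, then round down to a multiple of 4.
constexpr std::uint64_t round_down_after_peel(std::uint64_t x)
{
  return ((x - 1) >> 2) << 2;
}

// The bound check from the text; true would mean the scalar loop is skipped.
constexpr bool skips_scalar_loop(std::uint64_t x)
{
  return x == round_down_after_peel(x);
}

inline void check_gap_bound()
{
  // For x = 27 the rounded value is 24, so the scalar loop is entered.
  assert(round_down_after_peel(27) == 24);
  assert(!skips_scalar_loop(27));
  // The rounded value is at most x - 1, so the check fails for every x >= 1.
  for (std::uint64_t x = 1; x <= 1000; ++x)
    assert(!skips_scalar_loop(x));
}
```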

Here are two key things to note:

1. In this loop, the early exit will always be the one taken.  When it's taken
   we enter the scalar loop with the correct induction value to apply the gap
   peeling.

2. If the main exit is taken, the induction values assume you've finished all
   vector iterations, i.e. it assumes you have completed 24 iterations, as we
   treat the main exit the same for normal loop vect and early break when not
   PEELED.
   This means the induction value is adjusted to ind_var.2 = ind_var.1

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403

Tamar Christina  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #29 from Tamar Christina  ---
Fixed, thanks for the report!

[Bug tree-optimization/113552] [11/12/13 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #13 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:1e08e39c743692afdd5d3546b2223474beac1dbc

commit r13-8604-g1e08e39c743692afdd5d3546b2223474beac1dbc
Author: Tamar Christina 
Date:   Mon Apr 15 12:11:48 2024 +0100

AArch64: Do not allow SIMD clones with simdlen 1 [PR113552]

This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07.

The AArch64 vector PCS does not allow simd calls with simdlen 1,
however due to a bug we currently do allow it for num == 0.

This causes us to emit a symbol that doesn't exist and we fail to link.

gcc/ChangeLog:

PR tree-optimization/113552
* config/aarch64/aarch64.cc
(aarch64_simd_clone_compute_vecsize_and_simdlen): Block simdlen 1.

gcc/testsuite/ChangeLog:

PR tree-optimization/113552
* gcc.target/aarch64/pr113552.c: New test.
* gcc.target/aarch64/simd_pcs_attribute-3.c: Remove bogus check.

[Bug target/114686] Feature request: Dynamic LMUL should be the default for the RISC-V Vector extension

2024-04-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114686

--- Comment #3 from Robin Dapp  ---
I think we have always maintained that this can definitely be a per-uarch
default but shouldn't be a generic default.

> I don't see any reason why this wouldn't be the case for the vast majority of
> implementations, especially high performance ones would benefit from having
> more work to saturate the execution units with, since a larger LMUL works
> quite similar to loop unrolling.

One argument is reduced freedom for renaming and the out of order machinery. 
It's much easier to shuffle individual registers around than large blocks. 
Also lower-latency insns are easier to schedule than longer-latency ones and
faults, rejects, aborts etc. get proportionally more expensive.
I was under the impression that unrolling doesn't help a whole lot (sometimes
even slows things down a bit) on modern cores and certainly is not
unconditionally helpful.  Granted, I haven't seen a lot of data on it recently.
An exception is of course breaking dependency chains.

In general nothing stands in the way of having a particular tune target use
dynamic LMUL by default even now but nobody went ahead and posted a patch for
theirs.  One could maybe argue that it should be the default for in-order
uarchs?

Should it become obvious in the future that LMUL > 1 is indeed,
unconditionally, a "better unrolling" because of its favorable icache footprint
and other properties (which I doubt - happy to be proved wrong) then we will
surely re-evaluate the decision or rather have a different consensus.

The data we publicly have so far is all in-order cores and my expectation is
that the picture will change once out-of-order cores hit the scene.

[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696

--- Comment #4 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:a3281dd0f4b46c16ec1192ad411c0a96e6d086eb

commit r14-9970-ga3281dd0f4b46c16ec1192ad411c0a96e6d086eb
Author: H.J. Lu 
Date:   Fri Apr 12 15:42:12 2024 -0700

x86: Allow TImode offsettable memory only with 8-bit constant

The x86 instruction size limit is 15 bytes.  If a NDD instruction has
a segment prefix byte, a 4-byte opcode prefix, a MODRM byte, a SIB byte,
a 4-byte displacement and a 4-byte immediate, adding an address size
prefix will exceed the size limit.  Change TImode ADD, AND, OR and XOR
to allow offsettable memory only with 8-bit signed integer constant,
which is encoded with a 1-byte immediate, if the address size prefix
is used.
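
The byte budget described above can be tallied with a small sketch.  It counts
only the components the commit message lists, with the sizes it gives;
InsnBytes, total_bytes and fits are hypothetical names, not anything from the
x86 backend.

```cpp
#include <cassert>

// Component sizes in bytes, defaulting to the case in the commit message:
// segment prefix, 4-byte opcode prefix, MODRM, SIB, 4-byte displacement and
// 4-byte immediate, with no address size prefix.
struct InsnBytes {
  int segment_prefix = 1;
  int opcode_prefix = 4;
  int modrm = 1;
  int sib = 1;
  int displacement = 4;
  int immediate = 4;
  int address_size_prefix = 0;
};

constexpr int kMaxInsnBytes = 15;  // x86 instruction size limit

constexpr int total_bytes(const InsnBytes &b)
{
  return b.segment_prefix + b.opcode_prefix + b.modrm + b.sib
         + b.displacement + b.immediate + b.address_size_prefix;
}

constexpr bool fits(const InsnBytes &b)
{
  return total_bytes(b) <= kMaxInsnBytes;
}

inline void check_insn_budget()
{
  InsnBytes insn;
  assert(total_bytes(insn) == 15 && fits(insn));   // exactly at the limit
  insn.address_size_prefix = 1;
  assert(total_bytes(insn) == 16 && !fits(insn));  // one byte too long
  insn.immediate = 1;                              // 8-bit signed constant
  assert(total_bytes(insn) == 13 && fits(insn));   // back under the limit
}
```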

gcc/

PR target/114696
* config/i386/i386.md (isa): Add apx_ndd_64.
(enabled): Likewise.
(*add3_doubleword): Change rjO to r,ro,jO with 8-bit
signed integer constant and enable jO only for apx_ndd_64.
(*add3_doubleword_cc_overflow_1): Likewise.
(*and3_doubleword): Likewise.
(*3_doubleword): Likewise.

gcc/testsuite/

PR target/114696
* gcc.target/i386/apx-ndd-x32-2a.c: New test.
* gcc.target/i386/apx-ndd-x32-2b.c: Likewise.
* gcc.target/i386/apx-ndd-x32-2c.c: Likewise.
* gcc.target/i386/apx-ndd-x32-2d.c: Likewise.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #18 from GCC Commits  ---
The releases/gcc-13 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:abe3a80aa2d6d53cc9b8c9f7c531e065451d5b6e

commit r13-8606-gabe3a80aa2d6d53cc9b8c9f7c531e065451d5b6e
Author: H.J. Lu 
Date:   Sun Apr 14 12:57:39 2024 -0700

tree-profile: Disable indirect call profiling for IFUNC resolvers

We can't profile indirect calls to IFUNC resolvers nor their callees as
it requires TLS which hasn't been set up yet when the dynamic linker is
resolving IFUNC symbols.

Add an IFUNC resolver caller marker to cgraph_node and set it if the
function is called by an IFUNC resolver.  Disable indirect call profiling
for IFUNC resolvers and their callees.
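
The marking step can be pictured with a toy call graph.  This is a sketch
under stated assumptions, not the cgraph implementation: Node,
mark_ifunc_callees and profile_indirect_calls are hypothetical names, and the
sketch assumes the marker propagates transitively to everything reachable
from a resolver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a call graph node; callees are indices into the graph.
struct Node {
  bool is_ifunc_resolver = false;
  bool called_by_ifunc_resolver = false;
  std::vector<std::size_t> callees;
};

// Mark every function reachable from an IFUNC resolver (assumed transitive).
inline void mark_ifunc_callees(std::vector<Node> &graph)
{
  std::vector<bool> visited(graph.size(), false);
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < graph.size(); ++i)
    if (graph[i].is_ifunc_resolver) {
      visited[i] = true;
      worklist.push_back(i);
    }
  while (!worklist.empty()) {
    const std::size_t n = worklist.back();
    worklist.pop_back();
    for (std::size_t callee : graph[n].callees) {
      graph[callee].called_by_ifunc_resolver = true;
      if (!visited[callee]) {
        visited[callee] = true;
        worklist.push_back(callee);
      }
    }
  }
}

// Indirect call profiling is skipped for resolvers and their callees.
inline bool profile_indirect_calls(const Node &n)
{
  return !n.is_ifunc_resolver && !n.called_by_ifunc_resolver;
}

inline void check_ifunc_marking()
{
  std::vector<Node> g(4);
  g[0].is_ifunc_resolver = true;  // resolver calls g[1], which calls g[2]
  g[0].callees.push_back(1);
  g[1].callees.push_back(2);
  mark_ifunc_callees(g);
  assert(!profile_indirect_calls(g[0]));
  assert(!profile_indirect_calls(g[1]));
  assert(!profile_indirect_calls(g[2]));
  assert(profile_indirect_calls(g[3]));  // unrelated function
}
```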

Tested with profiledbootstrap on Fedora 39/x86-64.

gcc/ChangeLog:

PR tree-optimization/114115
* cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
(cgraph_node): Add called_by_ifunc_resolver.
* cgraphunit.cc (symbol_table::compile): Call
symtab_node::check_ifunc_callee_symtab_nodes.
* symtab.cc (check_ifunc_resolver): New.
(ifunc_ref_map): Likewise.
(is_caller_ifunc_resolver): Likewise.
(symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
* tree-profile.cc (gimple_gen_ic_func_profiler): Disable indirect
call profiling for IFUNC resolvers and their callees.

gcc/testsuite/ChangeLog:

PR tree-optimization/114115
* gcc.dg/pr114115.c: New test.

(cherry picked from commit cab32bacaea268ec062b1fb4fc662d90c9d1cfce)

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #19 from GCC Commits  ---
The releases/gcc-12 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:23049e851ebf840dffdd3f062dba0e795be347f8

commit r12-10331-g23049e851ebf840dffdd3f062dba0e795be347f8
Author: H.J. Lu 
Date:   Mon Feb 26 08:38:58 2024 -0800

tree-profile: Disable indirect call profiling for IFUNC resolvers

We can't profile indirect calls to IFUNC resolvers nor their callees as
it requires TLS which hasn't been set up yet when the dynamic linker is
resolving IFUNC symbols.

Add an IFUNC resolver caller marker to cgraph_node and set it if the
function is called by an IFUNC resolver.  Disable indirect call profiling
for IFUNC resolvers and their callees.

Tested with profiledbootstrap on Fedora 39/x86-64.

gcc/ChangeLog:

PR tree-optimization/114115
* cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
(cgraph_node): Add called_by_ifunc_resolver.
* cgraphunit.cc (symbol_table::compile): Call
symtab_node::check_ifunc_callee_symtab_nodes.
* symtab.cc (check_ifunc_resolver): New.
(ifunc_ref_map): Likewise.
(is_caller_ifunc_resolver): Likewise.
(symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
* tree-profile.cc (gimple_gen_ic_func_profiler): Disable indirect
call profiling for IFUNC resolvers and their callees.

gcc/testsuite/ChangeLog:

PR tree-optimization/114115
* gcc.dg/pr114115.c: New test.

(cherry picked from commit cab32bacaea268ec062b1fb4fc662d90c9d1cfce)

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #20 from GCC Commits  ---
The releases/gcc-11 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:574d52a9b6e40a466b90f4810e72d3dd072d5160

commit r11-11321-g574d52a9b6e40a466b90f4810e72d3dd072d5160
Author: H.J. Lu 
Date:   Mon Feb 26 08:38:58 2024 -0800

tree-profile: Disable indirect call profiling for IFUNC resolvers

We can't profile indirect calls to IFUNC resolvers nor their callees as
it requires TLS which hasn't been set up yet when the dynamic linker is
resolving IFUNC symbols.

Add an IFUNC resolver caller marker to cgraph_node and set it if the
function is called by an IFUNC resolver.  Disable indirect call profiling
for IFUNC resolvers and their callees.

Tested with profiledbootstrap on Fedora 39/x86-64.

gcc/ChangeLog:

PR tree-optimization/114115
* cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
(cgraph_node): Add called_by_ifunc_resolver.
* cgraphunit.c (symbol_table::compile): Call
symtab_node::check_ifunc_callee_symtab_nodes.
* symtab.c (check_ifunc_resolver): New.
(ifunc_ref_map): Likewise.
(is_caller_ifunc_resolver): Likewise.
(symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
* tree-profile.c (gimple_gen_ic_func_profiler): Disable indirect
call profiling for IFUNC resolvers and their callees.

gcc/testsuite/ChangeLog:

PR tree-optimization/114115
* gcc.dg/pr114115.c: New test.

(cherry picked from commit cab32bacaea268ec062b1fb4fc662d90c9d1cfce)

[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32

2024-04-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from H.J. Lu  ---
Fixed.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #21 from H.J. Lu  ---
Fixed for GCC 14 and GCC 11/12/13 release branches.

[Bug tree-optimization/113552] [11/12/13 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

--- Comment #14 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:0c2fcf3ddfe93d1f403962c4bacbb5d55ab7d19d

commit r11-11323-g0c2fcf3ddfe93d1f403962c4bacbb5d55ab7d19d
Author: Tamar Christina 
Date:   Mon Apr 15 12:32:24 2024 +0100

[AArch64]: Do not allow SIMD clones with simdlen 1 [PR113552]

This is a backport of g:306713c953d509720dc394c43c0890548bb0ae07.

The AArch64 vector PCS does not allow simd calls with simdlen 1,
however due to a bug we currently do allow it for num == 0.

This causes us to emit a symbol that doesn't exist and we fail to link.

gcc/ChangeLog:

PR tree-optimization/113552
* config/aarch64/aarch64.c
(aarch64_simd_clone_compute_vecsize_and_simdlen): Block simdlen 1.

gcc/testsuite/ChangeLog:

PR tree-optimization/113552
* gcc.target/aarch64/pr113552.c: New test.
* gcc.target/aarch64/simd_pcs_attribute-3.c: Remove bogus check.

[Bug tree-optimization/113552] [11/12/13 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552

Tamar Christina  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Tamar Christina  ---
Fixed on trunk and all open branches

[Bug gcov-profile/114715] Gcov allocates branches to wrong row for nested switches

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114715

Richard Biener  changed:

   What|Removed |Added

  Known to work||14.0

--- Comment #4 from Richard Biener  ---
commit 9d573f71e80e9f6f4aac912fc8fc128aa2697e3a (origin/master, origin/HEAD)
Author: Richard Biener 
Date:   Mon Apr 15 11:09:17 2024 +0200

gcov-profile/114715 - missing coverage for switch

The following avoids missing coverage for the line of a switch statement
which happens when gimplification emits a BIND_EXPR wrapping the switch
as that prevents us from setting locations on the containing statements
via annotate_all_with_location.  Instead set the location of the GIMPLE
switch directly.

PR gcov-profile/114715
* gimplify.cc (gimplify_switch_expr): Set the location of the
GIMPLE switch.

* gcc.misc-tests/gcov-24.c: New testcase.

[Bug tree-optimization/114719] Missed optimization: conditional in loop is an invariant (a%2)

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114719

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Keywords||missed-optimization
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-15
 Blocks||85316

--- Comment #1 from Richard Biener  ---
Confirmed.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
[Bug 85316] [meta-bug] VRP range propagation missed cases

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

Richard Biener  changed:

   What|Removed |Added

   Keywords||lto
   Last reconfirmed||2024-04-15
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Richard Biener  ---
Hmm, but if offload targets were to support it, not forwarding it would be
wrong.  That said, the way we communicate this is a bit odd.

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

--- Comment #3 from Andrew Stubbs  ---
Can this be filtered (safely) in mkoffload? That tool is
offload-target-specific, so no problem with "if offload target were to support
it".

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

--- Comment #4 from Richard Biener  ---
(In reply to Andrew Stubbs from comment #3)
> Can this be filtered (safely) in mkoffload? That tool is
> offload-target-specific, so no problem with "if offload target were to
> support it".

Yes, I think so.

[Bug target/114668] [14] RISC-V rv64gcv: miscompile at -O3

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114668

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:02cc8f3e68f9af96d484d9946ceaa9e3eed38151

commit r14-9972-g02cc8f3e68f9af96d484d9946ceaa9e3eed38151
Author: Robin Dapp 
Date:   Mon Apr 15 12:44:56 2024 +0200

RISC-V: Add VLS to mask vec_extract [PR114668].

This adds the missing VLS modes to the mask extract expanders.

gcc/ChangeLog:

PR target/114668

* config/riscv/autovec.md: Add VLS.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr114668.c: New test.

[Bug target/114668] [14] RISC-V rv64gcv: miscompile at -O3

2024-04-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114668

Robin Dapp  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Robin Dapp  ---
I didn't have the time to fully investigate but the default path without vec
extract is definitely broken for masks.  I'd probably sleep better if we fixed
that at some point but for now the obvious fix is to add the missing expanders.

Patrick, I'm still unable to reproduce PR114665 (maybe also a qemu
difference?).  Could you re-check with this fix?  Thanks.

[Bug tree-optimization/113479] Two equivalent programs have inconsistent output results at the same optimization level

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113479

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Richard Biener  ---
static int func_108(char *p_112) {
  unsigned short l_191 = 1;
  int l_190 = l_191;
  g_122 = &p_112;

this leaks the address of a stack local.

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

--- Comment #5 from Thomas Schwinge  ---
Distributions injecting some '-fcf-protection' by default could also inject
'-foffload-options=amdgcn-amdhsa=-fno-cf-protection' (or similar) to keep the
default case of offloading compilation working, but then with explicit
user-specified '-fcf-protection', the user would still get an error for
offloading compilation -- which may actually be desirable (for some)?

Alternatively: yes, the 'mkoffload's could filter that out -- but there is a
policy question, whether 'mkoffload's are permitted to silently drop
user-requested '-f[...]' flags?  Probably that's OK if the '-fcf-protection'
documentation is updated accordingly?

I guess I don't have any strong preference.  ;-)

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #6 from Richard Biener  ---
I think this has been fixed now?

[Bug middle-end/108410] x264 averaging loop not optimized well for avx512

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

Richard Biener  changed:

   What|Removed |Added

   Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Richard Biener  ---
I think "fixed" as far as we can get, esp. w/o considering all possible vector
sizes.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 108410, which changed state.

Bug 108410 Summary: x264 averaging loop not optimized well for avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 108410, which changed state.

Bug 108410 Summary: x264 averaging loop not optimized well for avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/110214] x86 backend lacks support for vec_pack_ssat_m and vec_pack_usat_m

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110214
Bug 110214 depends on bug 108410, which changed state.

Bug 108410 Summary: x264 averaging loop not optimized well for avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/110935] Missed BB reduction vectorization because of missed eliding of a permute

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110935

--- Comment #5 from Richard Biener  ---
So ideally we could special-case the "output" of the SLP instance root.  It
might be possible to insert the node just into the digraph.

[Bug gcov-profile/114720] gcc.misc-tests/gcov-22.c loops

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114720

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Jørgen Kvalsvik :

https://gcc.gnu.org/g:18e881ebd9f4b9429c652a81b8ceee84275bdade

commit r14-9973-g18e881ebd9f4b9429c652a81b8ceee84275bdade
Author: Jørgen Kvalsvik 
Date:   Mon Apr 15 14:14:26 2024 +0200

Guard longjmp in test to not inf loop [PR114720]

Guard the longjmp to not infinitely loop. The longjmp (jump) function is
called unconditionally to make test flow simpler, but the jump
destination would return to a point in main that would call longjmp
again. The longjmp is really there to exercise the then-branch of
setjmp, to verify coverage is accurately counted in the presence of
complex edges.
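
The pattern being described can be sketched as follows.  This is not the
contents of gcov-22.c: env, jumps, jump and run_once are hypothetical names,
and the guard is simply a counter that lets the longjmp fire once.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf env;
static int jumps = 0;

// Called unconditionally, but only the first call actually jumps.
static void jump()
{
  if (jumps++ == 0)
    std::longjmp(env, 1);
}

// Returns how many times the then-branch of setjmp was executed.
static int run_once()
{
  volatile int then_branch = 0;
  jumps = 0;
  if (setjmp(env))
    then_branch = then_branch + 1;  // reached only after the longjmp
  // Without the guard in jump(), control would return to the setjmp above
  // and reach this call again, looping forever.
  jump();
  return then_branch;
}

inline void check_guarded_longjmp()
{
  assert(run_once() == 1);
}
```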

PR gcov-profile/114720

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-22.c: Guard longjmp to not loop.

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #7 from Matthias Kretz (Vir)  ---
looks good to me

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Richard Biener  ---
Fixed.

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

Kito Cheng  changed:

   What|Removed |Added

 CC||kito at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-15
 Status|UNCONFIRMED |NEW

--- Comment #3 from Kito Cheng  ---
Reduced case, not the final result, but it has already run for 8+ hours...
```
typedef int a;
typedef short b;
typedef unsigned c;
template < typename > using e = unsigned;
template < typename > void ab();
#pragma riscv intrinsic "vector"
template < typename f, int, int ac > struct g {
  using i = f;
  template < typename m > using j = g< m, 0, ac >;
  using k = g< i, 1, ac - 1 >;
  using ad = g< i, 1, ac + 1 >;
};
namespace ae {
struct af {
  using h = g< short, 6, 0 < 3 >;
};
struct ag {
  using h = af::h;
};
} template < typename, int > using ah = ae::ag::h;
template < class ai > using aj = typename ai::i;
template < class i, class ai > using j = typename ai::j< i >;
template < class ai > using ak = j< e< ai >, ai >;
template < class ai > using k = typename ai::k;
template < class ai > using ad = typename ai::ad;
template < a ap > vuint16m1_t ar(g< b, ap, 0 >, b);
template < a ap > vuint16m2_t ar(g< b, ap, 1 >, b);
template < a ap > vuint32m2_t ar(g< c, ap, 1 >, c);
template < a ap > vuint32m4_t ar(g< c, ap, 2 >, c);
template < class ai > using as = decltype(ar(ai(), aj< ai >()));
template < class ai > as< ai > at(ai);
namespace ae {
template < int ap > vuint32m4_t au(g< c, ap, 1 + 1 >, vuint32m2_t l) {
  return __riscv_vlmul_ext_v_u32m2_u32m4(l);
}
} template < int ap > vuint32m2_t aw(g< c, ap, 1 >, vuint16m1_t l) {
  return __riscv_vzext_vf2_u32m2(l, 0);
}
namespace ae {
vuint32m4_t ax(vuint32m4_t, vuint32m4_t, a);
}
template < class ay, class an > as< ay > az(ay ba, an bc) {
  an bb;
  return ae::ax(ae::au(ba, bc), ae::au(ba, bb), 2);
}
template < class bd > as< bd > be(bd, as< ad< bd > >);
namespace ae {
template < class bh, class bi > void bj(bh bk, bi bl) {
  ad< decltype(bk) > bn;
  az(bn, bl);
}
} template < int ap, int ac, class bp, class bq >
void br(g< c, ap, ac > bk, bp, bq bl) {
  ae::bj(bk, bl);
}
template < class ai > using bs = decltype(at(ai()));
struct bt;
template < int ac = 1 > class bu {
public:
  template < typename i > void operator()(i) {
ah< i, ac > d;
bt()(i(), d);
  }
};
struct bt {
  template < typename bv, class bf > void operator()(bv, bf bw) {
using bx = bv;
ak< bf > by;
k< bf > bz;
using bq = bs< decltype(by) >;
using bp = bs< decltype(bw) >;
bp cb;
ab< bx >();
for (;;) {
  bp cc;
  bq bl = aw(by, be(bz, cc));
  br(by, cb, bl);
}
  }
};
void d() { bu()(b()); }

```

[Bug libstdc++/114721] New: libstdc++-v3/include/ext/codecvt_specializations.h: 2 * small performance tweeks

2024-04-15 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114721

Bug ID: 114721
   Summary: libstdc++-v3/include/ext/codecvt_specializations.h: 2
* small performance tweeks
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Static analyser cppcheck says:

1.

libstdc++-v3/include/ext/codecvt_specializations.h:142:5: performance: Function
'external_encoding()' should return member '_M_ext_enc' by const reference.
[returnByReference]

Source code is

const std::string
external_encoding() const
{ return _M_ext_enc; }

2.

libstdc++-v3/include/ext/codecvt_specializations.h:134:5: performance: Function
'internal_encoding()' should return member '_M_int_enc' by const reference.
[returnByReference]

Duplicate.
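
For illustration, the shape cppcheck is asking for looks roughly like the
sketch below.  The class encoding_names_sketch and its members are
hypothetical stand-ins, not the actual code in codecvt_specializations.h, and
whether such a signature change is acceptable for the real header is a
separate question.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a class holding two encoding names.
class encoding_names_sketch {
public:
  encoding_names_sketch(std::string internal, std::string external)
    : int_enc_(std::move(internal)), ext_enc_(std::move(external)) {}

  // Returning by const reference avoids copying the string on every call.
  const std::string& internal_encoding() const { return int_enc_; }
  const std::string& external_encoding() const { return ext_enc_; }

private:
  std::string int_enc_;
  std::string ext_enc_;
};

inline void check_return_by_reference()
{
  encoding_names_sketch names("UCS-4", "UTF-8");
  assert(names.internal_encoding() == "UCS-4");
  assert(names.external_encoding() == "UTF-8");
  // Both calls refer to the same member object; no copy is made.
  assert(&names.external_encoding() == &names.external_encoding());
}
```

The returned reference is only valid while the object it came from is alive,
so callers that need the value for longer have to copy it.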

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

--- Comment #4 from Li Pan  ---
(In reply to Kito Cheng from comment #3)
> Reduced case, not the final result, but it already run 8+ hours...
> ```
> typedef int a;
> typedef short b;
> typedef unsigned c;
> template < typename > using e = unsigned;
> template < typename > void ab();
> #pragma riscv intrinsic "vector"
> template < typename f, int, int ac > struct g {
>   using i = f;
>   template < typename m > using j = g< m, 0, ac >;
>   using k = g< i, 1, ac - 1 >;
>   using ad = g< i, 1, ac + 1 >;
> };
> namespace ae {
> struct af {
>   using h = g< short, 6, 0 < 3 >;
> };
> struct ag {
>   using h = af::h;
> };
> } template < typename, int > using ah = ae::ag::h;
> template < class ai > using aj = typename ai::i;
> template < class i, class ai > using j = typename ai::j< i >;
> template < class ai > using ak = j< e< ai >, ai >;
> template < class ai > using k = typename ai::k;
> template < class ai > using ad = typename ai::ad;
> template < a ap > vuint16m1_t ar(g< b, ap, 0 >, b);
> template < a ap > vuint16m2_t ar(g< b, ap, 1 >, b);
> template < a ap > vuint32m2_t ar(g< c, ap, 1 >, c);
> template < a ap > vuint32m4_t ar(g< c, ap, 2 >, c);
> template < class ai > using as = decltype(ar(ai(), aj< ai >()));
> template < class ai > as< ai > at(ai);
> namespace ae {
> template < int ap > vuint32m4_t au(g< c, ap, 1 + 1 >, vuint32m2_t l) {
>   return __riscv_vlmul_ext_v_u32m2_u32m4(l);
> }
> } template < int ap > vuint32m2_t aw(g< c, ap, 1 >, vuint16m1_t l) {
>   return __riscv_vzext_vf2_u32m2(l, 0);
> }
> namespace ae {
> vuint32m4_t ax(vuint32m4_t, vuint32m4_t, a);
> }
> template < class ay, class an > as< ay > az(ay ba, an bc) {
>   an bb;
>   return ae::ax(ae::au(ba, bc), ae::au(ba, bb), 2);
> }
> template < class bd > as< bd > be(bd, as< ad< bd > >);
> namespace ae {
> template < class bh, class bi > void bj(bh bk, bi bl) {
>   ad< decltype(bk) > bn;
>   az(bn, bl);
> }
> } template < int ap, int ac, class bp, class bq >
> void br(g< c, ap, ac > bk, bp, bq bl) {
>   ae::bj(bk, bl);
> }
> template < class ai > using bs = decltype(at(ai()));
> struct bt;
> template < int ac = 1 > class bu {
> public:
>   template < typename i > void operator()(i) {
> ah< i, ac > d;
> bt()(i(), d);
>   }
> };
> struct bt {
>   template < typename bv, class bf > void operator()(bv, bf bw) {
> using bx = bv;
> ak< bf > by;
> k< bf > bz;
> using bq = bs< decltype(by) >;
> using bp = bs< decltype(bw) >;
> bp cb;
> ab< bx >();
> for (;;) {
>   bp cc;
>   bq bl = aw(by, be(bz, cc));
>   br(by, cb, bl);
> }
>   }
> };
> void d() { bu()(b()); }
> 
> ```

Thanks Kito, you really saved my day!

[Bug tree-optimization/114722] New: Missed optimization: !e*d*e=>0, affected by useless instructions

2024-04-15 Thread 652023330028 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114722

Bug ID: 114722
   Summary: Missed optimization: !e*d*e=>0, affected by useless
instructions
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 652023330028 at smail dot nju.edu.cn
  Target Milestone: ---

Hello, we noticed that the code below can be optimized as stated in the title
(!e*d*e=>0), but gcc -O3 -fwrapv missed it. It looks like it's affected by some
useless instructions.

https://godbolt.org/z/5vPqcbGWr

extern int x;
extern int y;
void func1(int a, int b, int c, int d, int e) {
a = c;
a = b;
a = d;
x = c;
y = !e * d * e;
}

GCC -O3 -fwrapv:
func1(int, int, int, int, int):
xor eax, eax
test    r8d, r8d
mov DWORD PTR x[rip], edx
cmovne  ecx, eax
imul    ecx, r8d
mov DWORD PTR y[rip], ecx
ret


Expected code (when we remove 'a=c; a=b; a=d;'):
func1(int, int, int, int, int):
mov DWORD PTR x[rip], edx
mov DWORD PTR y[rip], 0
ret

By the way, when the exact same func1 and func2 appear together, func2 gets the
expected result:

extern int x;
extern int y;
void func1(int a, int b, int c, int d, int e) {
a = c;
a = b;
a = d;
x = c;
y = !e * d * e;
}

void func2(int a, int b, int c, int d, int e) {
a = c;
a = b;
a = d;
x = c;
y = !e * d * e;
}

GCC -O3 -fwrapv:
func1(int, int, int, int, int):
xor eax, eax
test    r8d, r8d
mov DWORD PTR x[rip], edx
cmovne  ecx, eax
imul    ecx, r8d
mov DWORD PTR y[rip], ecx
ret
func2(int, int, int, int, int):
mov DWORD PTR x[rip], edx
mov DWORD PTR y[rip], 0
ret

Thank you very much for your time and effort! We look forward to hearing from
you.

[Bug target/111231] [12/13/14 regression] armhf: Miscompilation with -O2/-fno-exceptions level (-fno-tree-vectorize is working)

2024-04-15 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111231

--- Comment #31 from Richard Earnshaw  ---
While that does seem to fix the bug, it's at the cost of 6 additional stores in
the problematic test that are redundant other than changing the alias set view.

[Bug libstdc++/114721] libstdc++-v3/include/ext/codecvt_specializations.h: 2 * small performance tweeks

2024-04-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114721

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2024-04-15
 Ever confirmed|0   |1
   Severity|normal  |trivial
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jonathan Wakely  ---
I don't think anybody uses this extension either, but the change seems worth
doing.

[Bug c/114723] New: ICE when checking for type compatibility with structure that contains flexible array member

2024-04-15 Thread luigighiron at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114723

Bug ID: 114723
   Summary: ICE when checking for type compatibility with
structure that contains flexible array member
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: luigighiron at gmail dot com
  Target Milestone: ---

The following code causes an internal compiler error on GCC 14:

#include <stdio.h>
struct S{int x,y[1];}*a;
int main(void){
struct S{int x,y[];};
puts(_Generic(
a,
struct S*:"compatible",
default:"incompatible"
));
}

If I understand the type compatibility rules, these types should be compatible?
The types "int[1]" and "int[]" are compatible and everything else seems to
match exactly. Interestingly, it seems to also crash when checking if the type
of a is compatible with struct S (instead of struct S*).

[Bug c++/110006] [11/12 Regression] friend function template with constraint doesn't match existing declaration

2024-04-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110006

Patrick Palka  changed:

   What|Removed |Added

Summary|[11/12/13 Regression]   |[11/12 Regression] friend
   |friend function template|function template with
   |with constraint doesn't |constraint doesn't match
   |match existing declaration  |existing declaration

--- Comment #6 from Patrick Palka  ---
Now also backported for GCC 13.3 as r13-8608-g38c2679ff9330d and
r13-8609-g265f207a46bc38.

[Bug c++/112769] [11/12 Regression] ICE on valid code related to requires-expression

2024-04-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112769

Patrick Palka  changed:

   What|Removed |Added

Summary|[11/12/13 Regression] ICE   |[11/12 Regression] ICE on
   |on valid code related to|valid code related to
   |requires-expression |requires-expression

--- Comment #6 from Patrick Palka  ---
Now also backported for GCC 13.3 as r13-8608-g38c2679ff9330d and
r13-8609-g265f207a46bc38.

[Bug c++/107168] [11/12/13 Regression] Wrong errors for concepts with default lambda not-type argument since r11-3714-gc1c62aec6751678e

2024-04-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107168

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|11.5|14.0

--- Comment #4 from Patrick Palka  ---
Backporting is unlikely, the patch is kind of risky and this worked earlier
only by accident.

[Bug c++/114706] ICE - std::bit_cast in consteval function involving array of union

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114706

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Jakub Jelinek  ---
Created attachment 57947
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57947&action=edit
gcc14-pr114706.patch

Untested obvious patch.

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208

--- Comment #25 from Jan Hubicka  ---
So we have comdat groups that diverge between t1.o and t2.o.  In one object the
group has an alias in it, while in the other object it does not.

Merging nodes for _ZN6vectorI12QualityValueEC2ERKS1_. Candidates:
_ZN6vectorI12QualityValueEC2ERKS1_/1 (__ct_base )
  Type: function definition analyzed
  Visibility: externally_visible semantic_interposition prevailing_def_ironly
public weak comdat comdat_group:_ZN6vectorI12QualityValueEC2ERKS1_ one_only
  next sharing asm name: 19
  References: 
  Referring:  
  Read from file: t1.o
  Unit id: 1
  Function flags: count:1073741824 (estimated locally)
  Called by: _Z1n1k/6 (1073741824 (estimated locally),1.00 per call) (can throw
external)
  Calls: _ZN12_Vector_baseI12QualityValueEC2Eii/10 (1073741824 (estimated
locally),1.00 per call) (can throw external)
_ZNK12_Vector_baseI12QualityValueE1gEv/9 (1073741824 (estimated locally),1.00
per call) (can throw external)
_ZN6vectorI12QualityValueEC2ERKS1_/19 (__ct_base )
  Type: function definition analyzed
  Visibility: externally_visible semantic_interposition preempted_ir public
weak comdat comdat_group:_ZN6vectorI12QualityValueEC5ERKS1_ one_only
  Same comdat group as: _ZN6vectorI12QualityValueEC1ERKS1_/20
  previous sharing asm name: 1
  References: 
  Referring: _ZN6vectorI12QualityValueEC1ERKS1_/20 (alias)
  Read from file: t2.o
  Unit id: 2
  Function flags: count:1073741824 (estimated locally)
  Called by:
  Calls: _ZN12_Vector_baseI12QualityValueEC2Eii/23 (1073741824 (estimated
locally),1.00 per call) (can throw external)
_ZNK12_Vector_baseI12QualityValueE1gEv/24 (1073741824 (estimated locally),1.00
per call) (can throw external)
After resolution:
_ZN6vectorI12QualityValueEC2ERKS1_/1 (__ct_base )
  Type: function definition analyzed
  Visibility: externally_visible semantic_interposition prevailing_def_ironly
public weak comdat comdat_group:_ZN6vectorI12QualityValueEC2ERKS1_ one_only
  next sharing asm name: 19
  References: 
  Referring: 
  Read from file: t1.o
  Unit id: 1
  Function flags: count:1073741824 (estimated locally)
  Called by: _Z1n1k/6 (1073741824 (estimated locally),1.00 per call) (can throw
external)
  Calls: _ZN12_Vector_baseI12QualityValueEC2Eii/10 (1073741824 (estimated
locally),1.00 per call) (can throw external)
_ZNK12_Vector_baseI12QualityValueE1gEv/9 (1073741824 (estimated locally),1.00
per call) (can throw external)

We opt for the version without the alias and later ICE in the sanity check
verifying that aliases have the same comdat group as their targets.

I wonder how this is ice-on-valid code, since with normal linking the aliased
symbol may or may not appear in the winning comdat group, so using the alias has
to break.

If constexpr changes how the constructor is generated, isn't this violation of
ODR?

We can probably go and reset every node in the losing comdat group to silence the
ICE and get an undefined symbol instead.

[Bug libgcc/114689] [14 Regression] libgcc/config/m68k/fpgnulib.c:305: Suspicious coding ?

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114689

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f8409c3109d2970a1fd63ac1a61601138b7ae46f

commit r14-9976-gf8409c3109d2970a1fd63ac1a61601138b7ae46f
Author: Jakub Jelinek 
Date:   Mon Apr 15 17:46:03 2024 +0200

m68k: Quiet up cppcheck warning [PR114689]

cppcheck apparently warns on the | !!sticky part of the expression and
using | (!!sticky) quiets it up (it is correct as is).
The following patch adds the ()s, and also adds them around mant >> 1 just
in case it makes it clearer to all readers that the expression is parsed
that way already.

2024-04-15  Jakub Jelinek  

PR libgcc/114689
* config/m68k/fpgnulib.c (__truncdfsf2): Add parentheses around
!!sticky bitwise or operand to quiet up cppcheck.  Add parentheses
around mant >> 1 bitwise or operand.
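
The claim that the parentheses do not change the parse can be illustrated with a small sketch. The function names and the `unsigned long` operands below are hypothetical stand-ins, not the actual code of `__truncdfsf2`; the point is only that unary `!` and `>>` both bind tighter than `|`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the operands; this is not the real
// __truncdfsf2 code, only the shape of the expression.
inline unsigned long without_parens(unsigned long mant, unsigned long sticky) {
  return mant >> 1 | !!sticky;
}

inline unsigned long with_parens(unsigned long mant, unsigned long sticky) {
  return (mant >> 1) | (!!sticky);
}

// Small self-check over a few operand values, including a sticky value
// with more than the low bit set.
inline void check_same_parse() {
  const unsigned long samples[] = {0ul, 1ul, 2ul, 5ul, 0x80ul, 0xfffful};
  for (unsigned long mant : samples)
    for (unsigned long sticky : samples)
      assert(without_parens(mant, sticky) == with_parens(mant, sticky));
}
```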

[Bug libgcc/114689] [14 Regression] libgcc/config/m68k/fpgnulib.c:305: Suspicious coding ?

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114689

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #8 from Jakub Jelinek  ---
Should be silenced now (am not saying fixed because there was no bug).

[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses

2024-04-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
Bug 89863 depends on bug 114689, which changed state.

Bug 114689 Summary: [14 Regression] libgcc/config/m68k/fpgnulib.c:305: 
Suspicious coding ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114689

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208

--- Comment #26 from Andrew Pinski  ---
(In reply to Jan Hubicka from comment #25)
> 
> If constexpr changes how the constructor is generated, isn't this violation
> of ODR?

Note the original code didn't have the constexpr change. And IIRC constexpr in
libstdc++ might be different depending on the language, C++11 vs C++17 in many
cases too.

[Bug testsuite/114720] gcc.misc-tests/gcov-22.c loops

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114720

Andrew Pinski  changed:

   What|Removed |Added

  Component|gcov-profile|testsuite
   Keywords||testsuite-fail
 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #2 from Andrew Pinski  ---
Fixed.

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208

--- Comment #27 from Jan Hubicka  ---
OK, but the problem is the same. Having comdats with the same key defining
different sets of public symbols is IMO not a good situation for either non-LTO
or LTO builds.
Unless the additional alias is never used by valid code (which would make it
useless, and we probably should not generate it), it should be possible to
produce a scenario where the linker picks the wrong version of the comdat and we
get an undefined symbol in non-LTO builds...

[Bug tree-optimization/114722] Missed optimization: !e*d*e=>0, affected by useless instructions

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114722

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-04-15
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
   Keywords||missed-optimization, TREE

--- Comment #1 from Andrew Pinski  ---

Confirmed.

Working case:
  x = c_6(D);
  _1 = e_8(D) == 0;
  _11 = e_8(D) * d_9(D);
  _4 = _1 ? _11 : 0;

vs not working case:

  x = c_5(D);
  _1 = e_10(D) == 0;
  _3 = _1 ? d_7(D) : 0;
  _4 = _3 * e_10(D);


In the first case, the RTL optimizers can fold the result back to 0, while in
the second case they can't.

Not a regression and has only been improved lately.
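
The two GIMPLE shapes in this comment can be written out in source form to see that both always produce 0. This is a sketch under assumptions: the function names are hypothetical, and unsigned multiplication is used to model the wrapping behaviour of -fwrapv.

```cpp
#include <cassert>

// "Working case": the product sits inside the select.
//   _11 = e * d;  _4 = (e == 0) ? _11 : 0;
inline int select_of_product(int d, int e) {
  // Unsigned arithmetic models -fwrapv wrapping without signed overflow.
  const unsigned prod = static_cast<unsigned>(e) * static_cast<unsigned>(d);
  return (e == 0) ? static_cast<int>(prod) : 0;
}

// "Not working case": select first, multiply afterwards.
//   _3 = (e == 0) ? d : 0;  _4 = _3 * e;
inline int product_of_select(int d, int e) {
  const unsigned sel = (e == 0) ? static_cast<unsigned>(d) : 0u;
  return static_cast<int>(sel * static_cast<unsigned>(e));
}

// Either e is 0 (so the product is 0) or the select yields 0, so both
// shapes are 0 for every input; check a few sample values.
inline void check_both_shapes_are_zero() {
  const int samples[] = {-7, -1, 0, 1, 2, 1000, 2147483647};
  for (int d : samples)
    for (int e : samples) {
      assert(select_of_product(d, e) == 0);
      assert(product_of_select(d, e) == 0);
    }
}
```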

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208

--- Comment #28 from Jan Hubicka  ---
So the main problem is that in t2 we have

_ZN6vectorI12QualityValueEC1ERKS1_/7 (vector<_Tp>::vector(const vector<_Tp>&)
[with _Tp = QualityValue])
  Type: function definition analyzed alias cpp_implicit_alias
  Visibility: semantic_interposition public weak comdat
comdat_group:_ZN6vectorI12QualityValueEC5ERKS1_ one_only
  Same comdat group as: _ZN6vectorI12QualityValueEC2ERKS1_/6
  References: _ZN6vectorI12QualityValueEC2ERKS1_/6 (alias) 
  Referring: 
  Function flags:
  Called by: _Z41__static_initialization_and_destruction_0v/8 (can throw
external)
  Calls: 

and in t1 we have

_ZN6vectorI12QualityValueEC1ERKS1_/2 (constexpr vector<_Tp>::vector(const
vector<_Tp>&) [with _Tp = QualityValue])
  Type: function definition
  Visibility: semantic_interposition external public weak comdat
comdat_group:_ZN6vectorI12QualityValueEC1ERKS1_ one_only
  References: 
  Referring:
  Function flags:
  Called by: 
  Calls: 

This is the same symbol name but in two different comdat groups (C1 compared to
C5).  With -O0 both seem to get the C5 group.

I can silence the ICE by making aliases undefined during symbol merging (which
is kind of a hack but should make the sanity checks happy), but I am still lost
as to how this is supposed to work in valid code.

[Bug libstdc++/114724] New: [Regression] libstdc++prettyprinters/debug.[cc|cxx11.cc] failing to build

2024-04-15 Thread carlos.seo at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114724

Bug ID: 114724
   Summary: [Regression]
libstdc++prettyprinters/debug.[cc|cxx11.cc] failing to
build
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: carlos.seo at linaro dot org
  Target Milestone: ---

The following tests are failing to build:

=== libstdc++ tests ===

Running libstdc++:libstdc++-prettyprinters/prettyprinters.exp ...
UNRESOLVED: libstdc++-prettyprinters/debug.cc compilation failed to produce
executable
UNRESOLVED: libstdc++-prettyprinters/debug_cxx11.cc compilation failed to
produce executable

Possible culprit, according to the post-commit CI is:

https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=c0419c024bf922128131671e40de0aed736e38ed

CI logs:

https://git-us.linaro.org/toolchain/ci/interesting-commits.git/plain/binutils/sha1/c0419c024bf922128131671e40de0aed736e38ed/tcwg_gnu_native_check_gcc/master-aarch64/details.txt

https://ci.linaro.org/job/tcwg_gnu_native_check_gcc--master-aarch64-build/1047/artifact/artifacts

[Bug debug/78322] Debug info still present for fully optimized away functions

2024-04-15 Thread dblaikie at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78322

--- Comment #5 from David Blaikie  ---
(In reply to Andrew Pinski from comment #4)
> (In reply to David Blaikie from comment #2)
> > (In reply to Richard Biener from comment #1)
> > > We produce an abstract copy for use by repeated inline copies.
> > 
> > Yep! Is it still reasonable to consider it a bug (or at least a feature
> > request) that this is still produced even when no inline copies are emitted?
> 
> Not really. 
> 
> Sounds like what you are aiming for is the nodebug attribute that you can
> use with always_inline. Basically in dwarf inline functions are still
> represented as functions (calls) and most folks want that for their
> debugability of their program but in this case you specific inlined
> functions not to have debug info which is exactly what nodebug would do ...

Not sure I follow. I'm not suggesting this function should be `nodebug`.

Specifically: If an abstract origin is unreferenced, it seems like it
should/could be omitted, for brevity.

If the abstract origin is referenced - if there was some remnant of the inlined
code that then caused an inlined_subroutine to be emitted, that would need to
reference the abstract origin and so the latter should be emitted.

This is what clang does, at least - thought it might be nice for gcc to do that
too, to have more compact DWARF output.

https://godbolt.org/z/3doWWK4G4

(though, interestingly, since this bug was filed - in GCC 9, GCC started
putting NOPs in for the inlined code, which is a nice touch - so at -O0 you can
still step into/out of a no-op (or presumably otherwise optimized away? if you
had some optimizations forced on at -O0 somehow) inlined function - but with
optimizations enabled you still see the behavior of an abstract origin emitted
without any uses/references to it)

[Bug target/114665] [14] RISC-V rv64gcv: miscompile at -O3

2024-04-15 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114665

--- Comment #4 from Patrick O'Neill  ---
Reran as requested in pr114668. Still present with that fix.

I'll triage some other testcases and file those as well. Hopefully one of them
is a duplicate to this one that is easily reproducible.

GCC: r14-9976-gf8409c3109d
QEMU: v8.2.1 (f48c205fb42be48e2e47b7e1cd9a2802e5ca17b0)

Commands:
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc 
> -march=rv64gcv -O3 red.c -o red.out
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/qemu-riscv64 red.out
35
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc 
> red.c -o red.out
> /scratch/tc-testing/tc-apr-15/build-rv64gcv/bin/qemu-riscv64 red.out
FFB5

[Bug tree-optimization/114725] New: Missed optimization: more precise range for and

2024-04-15 Thread xxs_chy at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114725

Bug ID: 114725
   Summary: Missed optimization: more precise range for and
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xxs_chy at outlook dot com
  Target Milestone: ---

Godbolt link: https://godbolt.org/z/vTb1Y5b39

```
bool src(int offset) {
if(offset > 128) {
return 0;
} else {
dummy();
return (offset & -9) == 258;
}
}

```
can be folded to:
```
bool tgt(int offset) {
if(offset > 128) {
return 0;
} else {
dummy();
return 0;
}
}
```
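
Why the comparison can never be true in the else branch can be checked directly. The mask -9 clears only bit 3, so `(offset & -9) == 258` holds only for offset 258 or 266, both greater than 128; for negative offsets the sign bit survives the mask, so the result stays negative. The sketch below brute-forces a window of the range; the helper names and the lower bound of the window are arbitrary choices for illustration.

```cpp
#include <cassert>

// The comparison from the report.
inline bool masked_equals_258(int offset) { return (offset & -9) == 258; }

// Count matches over a window of the else-branch range (offset <= 128).
// The lower bound is an arbitrary choice for this sketch.
inline int count_hits_at_or_below_128() {
  int hits = 0;
  for (int offset = -1000000; offset <= 128; ++offset)
    if (masked_equals_258(offset)) ++hits;
  return hits;
}

// Small self-check: no match in the window, and the only two matching
// inputs lie above 128.
inline void check_range_argument() {
  assert(count_hits_at_or_below_128() == 0);
  assert(masked_equals_258(258));
  assert(masked_equals_258(266));
}
```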

[Bug libstdc++/114724] [Regression] libstdc++prettyprinters/debug.[cc|cxx11.cc] failing to build

2024-04-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114724

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-04-15
 Status|UNCONFIRMED |WAITING

--- Comment #1 from Andrew Pinski  ---
If a binutils change causes link errors (assuming no LTO, which should be the
case here), then it should be reported there.

[Bug libstdc++/114724] [Regression] libstdc++prettyprinters/debug.[cc|cxx11.cc] failing to build

2024-04-15 Thread carlos.seo at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114724

--- Comment #2 from Carlos Eduardo Seo  ---
OK, I'll reopen it there.

[Bug libstdc++/114724] [Regression] libstdc++prettyprinters/debug.[cc|cxx11.cc] failing to build

2024-04-15 Thread carlos.seo at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114724

Carlos Eduardo Seo  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

[Bug c++/114625] requires { T{}; } wrongly returns false when T{} is ill-formed while in concept

2024-04-15 Thread ted at lyncon dot se via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114625

--- Comment #5 from Ted Lyngmo  ---
@Andrew, the title change seems wrong. It wrongly returns true when T{} is
ill-formed.

[Bug libstdc++/113386] [C++23] std::pair comparison operators should be transparent, but are not in libstdc++

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:2a0c083558b4ac6609692294df7a388cf4468711

commit r14-9979-g2a0c083558b4ac6609692294df7a388cf4468711
Author: Jonathan Wakely 
Date:   Mon Jan 15 14:47:52 2024 +

libstdc++: Heterogeneous std::pair comparisons [PR113386]

I'm only treating this as a DR for C++20 for now, because it's less work
and only requires changes to operator== and operator<=>. To do this for
older standards would require changes to the six relational operators
used pre-C++20.

libstdc++-v3/ChangeLog:

PR libstdc++/113386
* include/bits/stl_pair.h (operator==, operator<=>): Support
heterogeneous comparisons, as per LWG 3865.
* testsuite/20_util/pair/comparison_operators/lwg3865.cc: New
test.
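
What a heterogeneous comparison means here can be sketched with a free function. This is not the libstdc++ implementation: `pair_equal_sketch` is a hypothetical name, and the sketch assumes only that the two pairs may have different element types as long as the elements themselves are comparable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: compares pairs whose element types differ, as long
// as first == first and second == second are themselves valid expressions.
template <class T1, class T2, class U1, class U2>
constexpr bool pair_equal_sketch(const std::pair<T1, T2>& x,
                                 const std::pair<U1, U2>& y) {
  return x.first == y.first && x.second == y.second;
}

// Small self-check with pair<int, long> against pair<long, int>.
inline void check_heterogeneous_equal() {
  const std::pair<int, long> a(1, 2L);
  const std::pair<long, int> b(1L, 2);
  const std::pair<long, int> c(1L, 3);
  assert(pair_equal_sketch(a, b));
  assert(!pair_equal_sketch(a, c));
}
```

A homogeneous signature taking two `const std::pair<T1, T2>&` arguments would fail template deduction for `a` and `b` above, which is the gap the commit describes closing for `operator==` and `operator<=>`.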

[Bug libstdc++/93672] [11/12/13/14 Regression] std::basic_istream::ignore hangs if delim MSB is set

2024-04-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93672

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:2d694414ada8e3b58f504c1b175d31088529632e

commit r14-9978-g2d694414ada8e3b58f504c1b175d31088529632e
Author: Jonathan Wakely 
Date:   Thu Apr 4 10:33:33 2024 +0100

libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672]

A negative delim value passed to std::istream::ignore can never match
any character in the stream, because the comparison is done using
traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns
negative values (except at EOF). The optimized version of ignore for the
std::istream specialization uses traits_type::find to locate the delim
character in the streambuf, which _can_ match a negative delim on
platforms where char is signed, but then we do another comparison using
eq_int_type which fails. The code then keeps looping forever, with
traits_type::find locating the character and traits_type::eq_int_type
saying it's not a match, so traits_type::find is used again and finds
the same character again.

A possible fix would be to check with eq_int_type after a successful
find, to see whether we really have a match. However, that would be
suboptimal since we know that a negative delimiter will never match
using eq_int_type. So a better fix is to adjust the check at the top of
the function that handles delim==eof(), so that we treat all negative
delim values as equivalent to EOF. That way we don't bother using find
to search for something that will never match with eq_int_type.

The version of ignore in the primary template doesn't need a change,
because it doesn't use traits_type::find, instead characters are
extracted one-by-one and always matched using eq_int_type. That avoids
the inconsistency between find and eq_int_type. The specialization for
std::wistream does use traits_type::find, but traits_type::to_int_type
is equivalent to an implicit conversion from wchar_t to wint_t, so
passing a wchar_t directly to ignore without using to_int_type works.

libstdc++-v3/ChangeLog:

PR libstdc++/93672
* src/c++98/istream.cc (istream::ignore(streamsize, int_type)):
Treat all negative delimiter values as eof().
* testsuite/27_io/basic_istream/ignore/char/93672.cc: New test.
* testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New
test.
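
The commit message turns on the difference between a raw `char` delimiter and its `int_type` value. The sketch below shows the conversion that avoids a negative delimiter: `std::char_traits<char>::to_int_type` maps the character through `unsigned char`, so a byte with the MSB set becomes 128 rather than a negative number. The helper name `rest_after_delim` and the limit of 1000 characters are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: discard input up to and including `delim`, then
// return the next whitespace-delimited token.
inline std::string rest_after_delim(const std::string& input, char delim) {
  using traits = std::char_traits<char>;
  std::istringstream in(input);
  // to_int_type goes through unsigned char, so the delimiter passed to
  // ignore() is non-negative even where plain char is signed.
  in.ignore(1000, traits::to_int_type(delim));
  std::string rest;
  in >> rest;
  return rest;
}

// Small self-check with a delimiter byte whose MSB is set. The literal is
// split so that "cd" is not parsed as part of the hex escape.
inline void check_msb_delimiter() {
  assert(std::char_traits<char>::to_int_type('\x80') == 128);
  assert(rest_after_delim(std::string("ab\x80" "cd"), '\x80') == "cd");
}
```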

[Bug libstdc++/93672] [11/12/13 Regression] std::basic_istream::ignore hangs if delim MSB is set

2024-04-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93672

Jonathan Wakely  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression]
   |std::basic_istream::ignore  |std::basic_istream::ignore
   |hangs if delim MSB is set   |hangs if delim MSB is set

--- Comment #6 from Jonathan Wakely  ---
Fixed on trunk so far.

[Bug libstdc++/113386] [C++23] std::pair comparison operators should be transparent, but are not in libstdc++

2024-04-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #11 from Jonathan Wakely  ---
Done for gcc 14.

[Bug libstdc++/106749] Implement C++23 library features

2024-04-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106749
Bug 106749 depends on bug 113386, which changed state.

Bug 113386 Summary: [C++23] std::pair comparison operators should be 
transparent, but are not in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113386

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
