[Bug c++/111028] Incorrect optimization with -o1,-o2

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

--- Comment #5 from zhaiqiming at baidu dot com ---
(In reply to Patrick Palka from comment #4)

Thanks for your plan. Because my project needs -O2 optimization, I use "#pragma
GCC optimize("-O0")" to temporarily work around the problem in this function. I
hope this case helps improve GCC's optimization in a later version.
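
For reference, a per-function workaround of this kind might look like the sketch below. The function `checksum` and its body are hypothetical stand-ins for the affected function, and the sketch uses the "O0" spelling together with push_options/pop_options so that only this one function is built without optimization.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the miscompiled function. Only the code between
// push_options and pop_options is built at -O0; the rest keeps -O2.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize("O0")
#endif

unsigned checksum(const unsigned char *data, std::size_t len)
{
    unsigned sum = 0;
    for (std::size_t i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

// Small usage check.
inline void check_checksum()
{
    const unsigned char bytes[] = {1, 2, 3};
    assert(checksum(bytes, 3) == 6);
}
```

The guard keeps the pragmas away from compilers that do not implement them; the behaviour of the function itself is unchanged.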

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #4 from Jerry DeLisle  ---
The relevant text in the standard is:

13.7.2.1 General rules
--- snip ---
(6) On output, with I, B, O, Z, D, E, EN, ES, EX, F, and G editing, the
specified value of the field width w may be zero. In such cases, the processor
selects the smallest positive actual field width that does not result in a
field filled with asterisks. The specified value of w shall not be zero on
input.
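
As a rough illustration of rule (6), here is a sketch in C++ of how a zero field width behaves for the simplest case, integer (I) editing. The function name `edit_integer` is hypothetical, and the sketch assumes w >= 0; it does not model the E, ES or EN forms that this bug is about.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch of Fortran-style Iw output editing for an integer (assumes w >= 0).
// w == 0 means "use the smallest width that fits"; a positive w that is too
// small yields a field filled with asterisks.
std::string edit_integer(long value, int w)
{
    const std::string digits = std::to_string(value);  // includes '-' if negative
    const std::size_t needed = digits.size();
    const std::size_t width = static_cast<std::size_t>(w);
    if (w == 0)
        return digits;                          // minimal width, no padding
    if (needed > width)
        return std::string(width, '*');         // overflow: field of asterisks
    return std::string(width - needed, ' ') + digits;  // right-justified
}

// Small usage check.
inline void check_edit_integer()
{
    assert(edit_integer(12345, 0) == "12345");
    assert(edit_integer(12345, 3) == "***");
}
```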

[Bug c++/111034] New: Precompiled headers still non-deterministic

2023-08-15 Thread skunk at iskunk dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111034

Bug ID: 111034
   Summary: Precompiled headers still non-deterministic
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: skunk at iskunk dot org
  Target Milestone: ---

This is a follow-on of bug #92717.

In that bug, it was noted that .gch files are basically GCC memory dumps, and
that because ASLR is typically enabled on modern Linux systems, this results in
non-deterministic output every time such a file is generated.

The solution given was to disable ASLR, e.g. by using "setarch -R". And I
confirmed that if I generate the same .gch file multiple times in a tight loop
with ASLR disabled, every file comes out identical.

That was only a test, however. My production fix was to disable ASLR for the
entire build process, with "setarch -R make bootstrap".

And that yielded much spottier results. After multiple attempts, maybe 10% of
the pairs of bootstraps that I ran came out with identical .gch files. And that
was running on the same container host---if I tried the build in the same
container architecture/environment but a different host, the files would end up
different as a matter of course.

I think, in the interests of reproducibility, the way .gch files are generated
needs to be reworked in a way that does not depend on the runtime environment.
The current approach may be fine for PCH purposes, but the security
implications of any opaque non-determinism will only get worse with time.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

Jerry DeLisle  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jvdelisle at gcc dot gnu.org

--- Comment #3 from Jerry DeLisle  ---
Yes, I will take it. I did not notice this change.

[Bug c++/109021] accept size parameter in extern C

2023-08-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109021

--- Comment #5 from Andrew Pinski  ---
(In reply to Martin Uecker from comment #4)
> Sorry, how can an enhancement request that addresses a real C/C++
> compatibility problem be marked "resolved invalid" ?

Because GCC does not like to add huge extensions to the C++ language these
days. Maybe you could bring this up to WG21 (the C++ standards committee) as a
paper which is fully thought out (with an implementation in either GCC or clang
as an open source example). GCC might take that implementation and have it as
an extension
...

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-15 Thread john.harper at vuw dot ac.nz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #2 from john.harper at vuw dot ac.nz ---
Further information on this bug: it affects all four real kinds with all 
three of E0.0E0, ES0.0E0 and EN0.0E0 formats. My 15-line test program for 
that is attached. I hope it helps.


-- John Harper, School of Mathematics and Statistics
Victoria Univ. of Wellington, PO Box 600, Wellington 6140, New Zealand.
e-mail john.har...@vuw.ac.nz

[Bug tree-optimization/111032] using small types inside loops sometimes confuses the vectorizer

2023-08-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111032

--- Comment #1 from Andrew Pinski  ---
One way of fixing this is to optimize the following for the scalar side:
```
  _15 = _4 != 0;
  _16 = (short unsigned int) _15;
  _17 = _16 << 3;
  _6 = (int) _17;
```
into:
```
  _t = (int) _15;
  _6 = _t << 3;
```
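
The two scalar forms above can be compared with a small C++ sketch. The function names `via_short` and `via_int` are made up for illustration; the sketch only shows that both forms compute the same value for a 0/1 input, which is what makes the rewrite safe.

```cpp
#include <cassert>
#include <initializer_list>

// Original scalar form: convert the comparison result to unsigned short,
// shift there, then extend to int.
int via_short(int x)
{
    bool b = (x != 0);
    unsigned short s = static_cast<unsigned short>(b);
    unsigned short shifted = static_cast<unsigned short>(s << 3);
    return static_cast<int>(shifted);
}

// Proposed form: convert straight to int and shift in int.
int via_int(int x)
{
    bool b = (x != 0);
    int t = static_cast<int>(b);
    return t << 3;
}

// Both forms agree because the shifted value (0 or 8) fits in either type.
inline void check_equivalence()
{
    for (int x : {0, 1, -1, 42})
        assert(via_short(x) == via_int(x));
}
```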

Note this has the same issue too:
```
void __attribute__ ((noipa))
f0_1 (int *__restrict r,
  int *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    short p = pred[i]?-1:0;
    r[i] = p ;
  }
}
```

[Bug c++/111033] New: libcody build does not use AR_FLAGS

2023-08-15 Thread skunk at iskunk dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111033

Bug ID: 111033
   Summary: libcody build does not use AR_FLAGS
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: skunk at iskunk dot org
  Target Milestone: ---

I am working on getting a reproducible GCC build. This includes passing the "D"
option to ar(1) to avoid non-deterministic static library outputs.

In bootstrapping 13.2.0, I noticed that libcody.a was not being generated
consistently. The libcody makefile has the following:

AR := @AR@

...

$(AR) -cr $@ $^

For comparison, in the rest of the GCC tree, you typically see

AR = @AR@
AR_FLAGS = rc

...

$(AR) $(AR_FLAGS) libfoo.a ...

and AR_FLAGS is the natural place to add additional flags. An AR_FLAGS variable
should thus likewise be added to the libcody makefile.

(ARFLAGS is also used; I am not entirely sure which form is favored.)

Beyond addressing this small inconsistency, it would be good to see the GCC
build itself check whether ar(1) supports the "D" flag, and add it
automatically if so.

[Bug tree-optimization/111032] New: using small types inside loops sometimes confuses the vectorizer

2023-08-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111032

Bug ID: 111032
   Summary: using small types inside loops sometimes confuses the
vectorizer
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64-linux-gnu x86_64-linux-gnu

Take:
```
void __attribute__ ((noipa))
f0 (int *__restrict r,
   int *__restrict a,
   int *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    unsigned short p = pred[i]?3:0;
    r[i] = p ;
  }
}

void __attribute__ ((noipa))
f1 (int *__restrict r,
   int *__restrict a,
   int *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    int p = pred[i]?1<<3:0;
    r[i] = p ;
  }
}
```

These 2 functions should produce the same code, selecting between 8 and 0 but
instead in f0, we have a truncation and then an extension.

This happens on x86_64 at -O3 and aarch64 at -O3.

Though aarch64 with `-O3 -march=armv8.5-a+sve2` will be fixed with the patch to
PR 111006 (which I will be submitting later today) because SVE uses conversions
rather than VEC_PACK_TRUNC_EXPR/vec_unpack_hi_expr/vec_unpack_lo_expr here.

[Bug tree-optimization/111015] [11/12/13/14 Regression] __int128 bitfields optimized incorrectly to the 64 bit operations

2023-08-15 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111015

Mikael Pettersson  changed:

   What|Removed |Added

 CC||mikpelinux at gmail dot com

--- Comment #3 from Mikael Pettersson  ---
10.5.0 is good, 11.4.0 and above are affected, started with (or was exposed
by):

commit ed01d707f8594827de95304371d5b62752410842
Author: Eric Botcazou 
Date:   Mon May 25 22:13:11 2020 +0200

Fix internal error on store to FP component at -O2

This is about a GIMPLE verification failure at -O2 or above because
the GIMPLE store merging pass generates a NOP_EXPR between a FP type
and an integral type.  This happens when the bit-field insertion path
is taken for a FP field, which can happen in Ada for bit-packed record
types.

It is fixed by generating an intermediate VIEW_CONVERT_EXPR.  The patch
also tames a little the bit-field insertion path because, for bit-packed
record  types in Ada, you can end up with large bit-field regions, which
results in a lot of mask-and-shifts instructions.

gcc/ChangeLog
* gimple-ssa-store-merging.c
(merged_store_group::can_be_merged_into): Only turn MEM_REFs into
bit-field stores for small bit-field regions
(imm_store_chain_info::output_merged_store): Be prepared for sources
with non-integral type in the bit-field insertion case.
(pass_store_merging::process_store): Use MAX_BITSIZE_MODE_ANY_INT as
the largest size for the bit-field case.

gcc/testsuite/ChangeLog
* gnat.dg/opt84.adb: New test.

 gcc/ChangeLog   |  9 +
 gcc/gimple-ssa-store-merging.c  | 20 ---
 gcc/testsuite/ChangeLog |  4 +++
 gcc/testsuite/gnat.dg/opt84.adb | 74 +
 4 files changed, 103 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/opt84.adb

[Bug tree-optimization/106238] [12/13/14 regression] Inline optimization causes dangling pointer warning on "include/c++/12.1.0/bits/stl_tree.h"

2023-08-15 Thread romain.geissler at amadeus dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106238

--- Comment #10 from Romain Geissler  ---
Hi,

It seems the reproducers from comment #1 and #5 don't happen anymore with gcc
13 or trunk. So it seems fixed.

Cheers,
Romain

[Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison

2023-08-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

--- Comment #3 from Andrew Pinski  ---
Created attachment 55740
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55740&action=edit
The patch which fixes the SVE part

It took me longer to come up with this because of failures: I didn't know about
expand_vec_cond_expr_p. I had assumed is_truth_type_for would work here, but it
crashes for types with BLKmode.

[Bug fortran/87326] [F18] Support the NEW_INDEX= specifier in the FORM TEAM statement

2023-08-15 Thread weeks at iastate dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87326

--- Comment #9 from Nathan Weeks  ---
(In reply to anlauf from comment #8)
> (In reply to Nathan Weeks from comment #7)
> > (In reply to anlauf from comment #6)
> > > (In reply to Nathan Weeks from comment #5)
> > > > (In reply to Brad Richardson from comment #3)
> > > > > Was there any more progress on this? I've just run into it.
> > > > > 
> > > > > FYI I'm trying implement a polymorphic send/receive:
> > > > > https://gitlab.com/everythingfunctional/communicator
> > > > 
> > > > The FSF copyright assignment ended up being an unexpectedly-difficult
> > > > hurdle at the time. I could try again if there is interest, though it
> > > > would also require some effort to rework the original patch for GCC 14.
> > > > If another contributor were willing to submit a clean-room
> > > > implementation, that may be more expedient.
> > > 
> > > Besides the copyright assignment there is also the possibility to use the
> > > Developer's Certificate of Origin sign-off:
> > > 
> > > https://gcc.gnu.org/dco.html
> > 
> > Is that an option in this case? I was originally advised to pursue a
> > copyright assignment
> > 
> > https://gcc.gnu.org/pipermail/fortran/2019-January/051674.html
> 
> That requirement was changed in 2021:
> 
> https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html
> 
> See also:
> 
> https://gcc.gnu.org/contribute.html

That's great news! I'll carve out some time to try to adapt the patch for GCC
14.

[Bug libstdc++/110801] std::format code runs slower than equivalent {fmt} code

2023-08-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110801

--- Comment #1 from Jonathan Wakely  ---
Created attachment 55739
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55739&action=edit
Add special case for format("{}", integer)

With this patch std::format is much closer to fmt::format:

Benchmark              Time           CPU   Iterations
------------------------------------------------------
sprintf           554621 ns     553889 ns         1241
ostringstream     932465 ns     931258 ns          746
to_string         122602 ns     122425 ns         5424
format            241978 ns     241656 ns         2939
format_to         109541 ns     109391 ns         6282
std_format        282151 ns     281787 ns         2490
std_format_to     225596 ns     225301 ns         3080

std::format_to could still be faster. It should be potentially as fast as
to_string.

The patch is just a prototype that only optimizes for integers, but the idea
could be extended to other types too. For integers, floats, strings, and
pointers we can skip the formatter::parse and formatter::format calls for a
"{}" format string and just do the basic output form, with none of the
additional code for alternative presentation forms, alignment, width, precision
etc.

As well as short-circuiting most of the formatting logic, the other part of the
optimization is writing directly to the output buffer if it is a contiguous
iterator (or a sink iterator that writes to a contiguous container). This is
more efficient than writing to a buffer and then copying to the output.  This
could be extended to work with a back_insert_iterator or
back_insert_iterator, as we could extract the container, resize it, and
then write directly into it.
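
The "resize the container, then write directly into it" idea can be sketched with std::to_chars and a std::string. This is not the code from the attached patch; `append_int` is a hypothetical helper, and the sketch assumes a 64-bit long long.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>

// Append the decimal form of `value` to `out` by growing the string first
// and letting std::to_chars write straight into its storage, with no
// intermediate buffer and no copy.
void append_int(std::string &out, long long value)
{
    constexpr std::size_t max_chars = 20;  // 19 digits plus sign for 64 bits
    const std::size_t old_size = out.size();
    out.resize(old_size + max_chars);
    char *first = out.data() + old_size;
    char *last = out.data() + out.size();
    const std::to_chars_result res = std::to_chars(first, last, value);
    // The buffer is large enough, so res.ptr marks the end of the output;
    // shrink the string back to the characters actually written.
    out.resize(static_cast<std::size_t>(res.ptr - out.data()));
}

// Small usage check.
inline void check_append_int()
{
    std::string s = "x=";
    append_int(s, 123);
    assert(s == "x=123");
}
```

The same pattern would apply to a std::vector of char; the cost is one over-allocation of up to 20 bytes per call, which is traded for avoiding the copy.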

[Bug target/109815] AIX: Combining -static-libstdc++ and -pthread causes a TLS-related link error

2023-08-15 Thread John.Parke at alebra dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109815

--- Comment #4 from John Parke  ---
Thanks for taking a look at this.

My AIX system is 7.1 TL 04 SP 05 (2017/20).

We just dropped support for 6, but still have customers on 7.1.

I'll have to see about updating this system to see if that works.

As far as I am concerned you have addressed this issue.

Best regards,

John

[Bug analyzer/109570] detect fclose on unopened or NULL files

2023-08-15 Thread glebfm at altlinux dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570

Gleb Fotengauer-Malinovskiy  changed:

   What|Removed |Added

 CC||glebfm at altlinux dot org

--- Comment #7 from Gleb Fotengauer-Malinovskiy  ---
FYI, the change in the glibc also has an effect on autoconf-based projects,
when the -fanalyzer flag is used in combination with the -Werror flag, see:

https://git.savannah.gnu.org/cgit/autoconf.git/commit/?id=ea3e0cec2e66132e34228546256a1657c7b9b2e9

[Bug fortran/78054] gfortran.dg/pr70673.f90 FAILs at -O0

2023-08-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78054

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|WAITING |RESOLVED

--- Comment #13 from anlauf at gcc dot gnu.org ---
Closing.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||jvdelisle at gcc dot gnu.org
  Component|fortran |libfortran
   Keywords||wrong-code
   Last reconfirmed||2023-08-15

--- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

@Jerry: can you have a look?  F2008 did not specify w=0 for the ES format;
this was added in F2018.

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-08-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

--- Comment #41 from anlauf at gcc dot gnu.org ---
(In reply to Mikael Morin from comment #40)
> Harald, I have just closed the followup PR110419.
> I think this PR can be closed as well, or is there something left to be done?

It is pretty much done.

There is a minor memleak for the bind(c) case left that can be seen for
testcase gfortran.dg/bind_c_usage_13.f03 or the reduced version:

program p
  interface
 subroutine val_c (c) bind(c)
   use iso_c_binding, only: c_char
   character(len=1,kind=c_char), value :: c
 end subroutine val_c
  end interface
  call val_c ("A")
end

The leak is plugged by the first part of the patch attached to comment#37:

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 52cd88f5b00..ee3cd47cf91 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -4044,8 +4044,9 @@ conv_scalar_char_value (gfc_symbol *sym, gfc_se *se,
gfc_expr **expr)
   gfc_typespec ts;
   gfc_clear_ts ();

-  *expr = gfc_get_int_expr (gfc_default_character_kind, NULL,
-   (*expr)->value.character.string[0]);
+  gfc_expr *tmp = gfc_get_int_expr (gfc_default_character_kind, NULL,
+   (*expr)->value.character.string[0]);
+  gfc_replace_expr (*expr, tmp);
 }
   else if (se != NULL && (*expr)->expr_type == EXPR_VARIABLE)
 {

Shall we commit this one?

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-15 Thread broonie at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #13 from Mark Brown  ---
The kernel hasn't got any problem with BTI as far as I am aware - when built
with clang we run the kernel with BTI enabled since clang does just insert a
BTI C at the start of every function, and GCC works fine so long as we don't
get any out of range jumps being generated. The issue is that we don't have
anything to insert veneers in the case where section placement puts static
functions into a distant enough part of memory to need an indirect jump but GCC
has decided to omit the landing pad.

[Bug target/111029] bpf: GCC generates invalid instructions wN = (s8) rM

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111029

--- Comment #1 from CVS Commits  ---
The master branch has been updated by David Faust :

https://gcc.gnu.org/g:489e1adf7792985b21195c740da7370f96b19640

commit r14-3227-g489e1adf7792985b21195c740da7370f96b19640
Author: David Faust 
Date:   Tue Aug 15 10:54:17 2023 -0700

bpf: fix pseudoc w regs for small modes [PR111029]

In the BPF pseudo-c assembly dialect, registers treated as 32-bits
rather than the full 64 in various instructions ought to be printed as
"wN" rather than "rN".  But bpf_print_register () was only doing this
for specifically SImode registers, meaning smaller modes were printed
incorrectly.

This caused assembler errors like:

  Error: unrecognized instruction `w2 =(s8)r1'

for a 32-bit sign-extending register move instruction, where the source
register is used in QImode.

Fix bpf_print_register () to print the "w" version of register when
specified by the template for any mode 32-bits or smaller.

PR target/111029

gcc/
* config/bpf/bpf.cc (bpf_print_register): Print 'w' registers
for any mode 32-bits or smaller, not just SImode.

gcc/testsuite/

* gcc.target/bpf/smov-2.c: New test.
* gcc.target/bpf/smov-pseudoc-2.c: New test.
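
The rule described in the commit message can be sketched as follows. This is a hypothetical illustration, not the code in config/bpf/bpf.cc: `pseudoc_reg_name`, `mode_bits` and `want_w` are invented names, with `want_w` standing for "the instruction template asked for the 32-bit form".

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: choose the pseudo-C register name from the size of
// the operand's mode in bits. Before the fix, only exactly 32-bit (SImode)
// operands were printed as "wN"; the fix covers any mode of 32 bits or fewer.
std::string pseudoc_reg_name(int regno, int mode_bits, bool want_w)
{
    const bool use_w = want_w && mode_bits <= 32;
    return (use_w ? "w" : "r") + std::to_string(regno);
}

// An 8-bit (QImode) source now prints as "w1" rather than "r1".
inline void check_pseudoc_reg_name()
{
    assert(pseudoc_reg_name(1, 8, true) == "w1");
    assert(pseudoc_reg_name(1, 64, true) == "r1");
}
```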

[Bug c++/99232] Exported variable in module gives error: 'lambda' was not declared in this scope

2023-08-15 Thread yagreg7 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99232

Gregory Dushkin  changed:

   What|Removed |Added

 CC||yagreg7 at gmail dot com

--- Comment #1 from Gregory Dushkin  ---
The issue is still present in GCC 13.2.1 and appears to only affect the actual
variables but not references to them.
E.g. the code:
```
// test.cc
export module test;
export constexpr int a = 42;
// main.cc
import test;
int main() { return a; }
```
would give:
```
$ g++ test.cc -std=c++20 -fmodules-ts -c
$ g++ main.cc -std=c++20 -fmodules-ts -c
main.cc: In function ‘int main()’:
main.cc:4:12: error: ‘a’ was not declared in this scope
4 | return a;
  |
```

but

```
// test.cc
export module test;
export constexpr int a_ = 42;
export constexpr const int& a = a_;
// main.cc
import test;
int main() { return a; }
```
compiles successfully using the same commands.

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-08-15 Thread gcc at pauldreik dot se via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #19 from Paul Dreik  ---
Thanks Jonathan!
I am happy to count myself as a gcc contributor now :-D

Never mind the tiny git mistake, that will be forgotten once gcc 14 is out!

[Bug middle-end/111009] [12/13/14 regression] -fno-strict-overflow erroneously elides null pointer checks and causes SIGSEGV on perf from linux-6.4.10

2023-08-15 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111009

--- Comment #4 from Andrew Macleod  ---
(In reply to Richard Biener from comment #3)
> bool
> operator_addr_expr::fold_range (irange , tree type,
> const irange ,
> const irange ,
> relation_trio) const
> { 
>   if (empty_range_varying (r, type, lh, rh))
> return true;
>   
>   // Return a non-null pointer of the LHS type (passed in op2).
>   if (lh.zero_p ())
> r = range_zero (type); 
> 
> not sure how this is called, but we can only derive this if the offset
> is zero as well, definitely if targetm.addr_space.zero_address_valid,
> but I think this is true in general.
> 
>   else if (!contains_zero_p (lh))
> r = range_nonzero (type);
> 
> and this is only true for TYPE_OVERFLOW_UNDEFINED (type), with
> -fwrapv-pointer we could wrap to zero.
> 
> That is, it's _not_ GIMPLE undefined behavior to compute &0->bar.


> It looks like without -fwrapv-pointer we elide the if (!a) check,
> dereferencing it when dso && dso != curr.  I suppose that looks reasonable
> with a = >maj, when dso != 0 then a != 0 unless ->maj wraps.

Range-ops won't see anything like >maj; it sees ranges and nothing else.
It just gets the result of that expression as determined by someone else, so if
it sees [0,0] for the range, that means >maj has been determined to be 0.

When folding, addressof has some funky mechanics, and it symbolically processes
the trees in gimple-range-fold.cc  in fold_using_range::range_of_address

I think it takes care of all the funky things you mention.

I also notice, in the earlier comment where we set _13 to 0, the code you
quoted where _13 was recomputed by ranger. It ends with
   GORI  TRUE : (797) recomputation (_13) [irange] _Bool [1, 1]
The result was [1,1] as far as ranger was concerned on the edge from 3->16,
so that probably isn't how gimple fold determined it was zero.

Is there still an issue here?
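
The two conditions raised in the quoted comment can be illustrated with plain unsigned arithmetic. This is only a model, under the assumption that a member address is base plus offset computed with wrapping; `member_address` is a hypothetical helper and not anything in GCC.

```cpp
#include <cassert>
#include <cstdint>

// Model "address of a member" as base + offset in unsigned arithmetic,
// which wraps modulo 2^N (the behaviour -fwrapv-pointer permits).
std::uintptr_t member_address(std::uintptr_t base, std::uintptr_t offset)
{
    return base + offset;
}

inline void check_member_address()
{
    // A zero base only gives a zero result when the offset is zero too.
    assert(member_address(0, 0) == 0);
    assert(member_address(0, 8) != 0);
    // A non-zero base can still give zero if the addition wraps.
    assert(member_address(UINTPTR_MAX - 7, 8) == 0);
}
```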

[Bug c++/105644] ICE in a fold expression with a requires expression: in iterative_hash_template_arg, at cp/pt.cc:1805

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105644

Patrick Palka  changed:

   What|Removed |Added

   Keywords|needs-reduction |

--- Comment #3 from Patrick Palka  ---
Thanks for the reduction. Fixing this GCC bug(s) proved to be tricky, but one
should be able to reliably work around it by avoiding using a requires-expr
directly inside a pack expansion or fold expression and use it indirectly
instead, e.g.

template
concept foo = requires(T t) { { (I, t) } -> C; };

template
constexpr bool check() {
return (foo && ...);
}

[Bug c++/111031] ICE: internal compiler error: in iterative_hash_template_arg, at cp/pt.cc:1747

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111031

Patrick Palka  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-15
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Patrick Palka  ---
This seems to be a more interesting case of PR105644 where the requires-expr
inside the pack expansion also uses the pack within a nested pack expansion.

A workaround is to factor out the requires-expr from the pack expansion, e.g.

template
concept single_concatable = requires(const iterator_t in) {
  { *in } -> convertible_to>;
};

template
concept concatable = (single_concatable && ...);

[Bug target/109279] RISC-V: complex constants synthesized should be improved

2023-08-15 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #17 from Vineet Gupta  ---
(In reply to Vineet Gupta from comment #16)
> > Which is what this produces:
> > ```
> > long long f(void)
> > {
> >   unsigned t = 16843009;
> >   long long t1 = t;
> >   long long t2 = ((unsigned long long )t) << 32;
> >   asm("":"+r"(t1));
> >   return t1 | t2;
> > }
> > ```

> 
>   li      a0,16842752
>   addi    a0,a0,257
>   li      a5,16842752
>   slli    a0,a0,32
>   addi    a5,a5,257
>   or      a0,a5,a0
>   ret

This is again IRA inflicted pain (similar to [PR110748]). 
IRA seems to be undoing split1 since we have 2 insn sequences to synthesize the
constant pieces. This explains why the problem got exacerbated with commit
0530254413f8 ("riscv: relax splitter restrictions for creating pseudos") since
now different regs are used to create parts of const, vs 1 reg being repeatedly
used for assembling a const (fooling IRA's equivalent replacement logic).
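
For readers following the arithmetic, the constant being synthesized can be written out in C++. The helper name `replicate32` is invented for this sketch; it only restates the quoted test case and the li/addi pair.

```cpp
#include <cassert>
#include <cstdint>

// Build the 64-bit constant from one 32-bit piece, as in the quoted test
// case: the same value forms the low half and, shifted by 32, the high half.
constexpr std::uint64_t replicate32(std::uint32_t piece)
{
    return (static_cast<std::uint64_t>(piece) << 32) | piece;
}

inline void check_replicate32()
{
    // li 16842752 followed by addi 257 produces the 32-bit piece.
    assert(16842752 + 257 == 16843009);
    // 16843009 is 0x01010101, so the full constant is a byte of 0x01 repeated.
    assert(replicate32(16843009u) == 0x0101010101010101ull);
}
```

Both halves are the same 32-bit value, which is why a single li/addi pair followed by a shift and an or would be enough.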

[Bug target/110748] RISC-V: optimize store of DF 0.0

2023-08-15 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110748

--- Comment #16 from Vineet Gupta  ---
(In reply to Vineet Gupta from comment #15)

> On the branch devel/vineetg/optim-double-const-m0 I have double -0.0 working.
> 
> znd:
>     li      a5,-1
>     slli    a5,a5,63
>     sd      a5,0(a0)
>     ret
> 
> There's currently an ICE for zbs
> 
> IRA is undoing the split so the insn with const_int 0x8000_
> doesn't exist for final pass.
> 
> expand
> --
> (insn 6 3 0 2 (set (mem:DF (reg:DI 135)
> (const_double:DF -0.0 [-0x0.0p+0])) {*movdf_hardfloat_rv64}
> 
> split1
> -
> (insn 10 3 11 2 (set (reg:DI 136)
> (const_int [0x8000]))
> 
> (insn 11 10 0 2 (set (mem:DF (reg:DI 135)
> (subreg:DF (reg:DI 136) 0))
> 
> ira
> 
> (insn 11 9 12 2 (set (mem:DF (reg:DI 135)
> (const_double:DF -0.0 [-0x0.0p+0])) {*movdf_hardfloat_rv64}

So IRA is doing the equivalent replacement for a register which is referenced
exactly twice: set once and used once, w/o any reg pressure considerations [1].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627212.html

There seems to be no easy way around it.
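
As a side note on why the li/slli pair is enough for -0.0: the sketch below checks that the bit pattern of double -0.0 is just the sign bit. It assumes a 64-bit IEEE 754 double; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the raw bit pattern of a double (assumes a 64-bit binary64 double).
std::uint64_t double_bits(double d)
{
    std::uint64_t bits;
    static_assert(sizeof bits == sizeof d, "expects a 64-bit double");
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

// What "li a5,-1; slli a5,a5,63" leaves in a 64-bit register:
// all ones shifted left by 63, i.e. only the top bit set.
std::uint64_t li_minus1_slli_63()
{
    return ~std::uint64_t{0} << 63;
}

inline void check_negative_zero()
{
    assert(double_bits(-0.0) == li_minus1_slli_63());
}
```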

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-08-15 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628

Hans-Peter Nilsson  changed:

   What|Removed |Added

 CC||hp at gcc dot gnu.org

--- Comment #5 from Hans-Peter Nilsson  ---
A quick survey of recent test results shows that this regression happens for
many targets.

Besides the reported:
powerpc64le-unknown-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793193.html

Also fails for:
pru-unknown-elf
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792102.html

s390x-ibm-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792106.html

m68k-unknown-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793249.html

cris-axis-elf
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793462.html

But, curiously not for:
x86_64-pc-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792111.html

i686-pc-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792104.html

aarch64-unknown-linux-gnu
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793190.html

arm-unknown-linux-gnueabihf
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793191.html

Not seeing any action for this regression for three weeks, for tracking
purposes I'm considering xfailing this test-case for cris-* after another week.

[Bug c++/111019] [12/13/14 Regression] Optimizer incorrectly assumes variable is not changed while change happens through another pointer

2023-08-15 Thread boskidialer at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019

--- Comment #4 from Sławomir Fraś  ---
Not sure if this hints at the possible cause, but when I extracted `this` out
of the loop into a variable, like this (https://godbolt.org/z/8ebb5E1qG):

Target* first = this;
while (first->next)
{
  Base* n = first->next;

Then the code behaves the same as the faulty code, but when I changed it into
this (https://godbolt.org/z/7o58Y4fEq):

Base* first = this; // only change: Target -> Base
while (first->next)
{
  Base* n = first->next;

Then the code works fine even with `always_inline` on the destructor. In this
case it looks like a write to the `Base::next` member is not considered to
invalidate the value of `Target::next` held in a register, even though it
should, since Target derives from the Base class.

[Bug c++/111031] New: ICE: internal compiler error: in iterative_hash_template_arg, at cp/pt.cc:1747

2023-08-15 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111031

Bug ID: 111031
   Summary: ICE: internal compiler error: in
iterative_hash_template_arg, at cp/pt.cc:1747
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

I will reduce it if I have the bandwidth.

#include 

namespace std::ranges {

template
using concat_reference_t = common_reference_t...>;

template
concept concatable = (requires(const iterator_t in) {
  { *in } -> convertible_to>;
} && ...);

template
  requires concatable
class concat_view { };

template
concat_view(Rs&&...) -> concat_view...>;

}  // namespace std::ranges

int main() {
  int x[] = {0, 1, 2};
  std::ranges::concat_view r(x);
}

https://godbolt.org/z/jEh9YzafT

[Bug c/111025] attribute((malloc)) and posix_memalign() (and other functions that return newly allocated object address into an output parameter)

2023-08-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111025

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60086

--- Comment #3 from Andrew Pinski  ---
It was referenced in pr 60086

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2023-08-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227
Bug 103227 depends on bug 92497, which changed state.

Bug 92497 Summary: Aggregate IPA-CP and inlining do not play well together, 
transformation is lost
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92497

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug ipa/92497] Aggregate IPA-CP and inlining do not play well together, transformation is lost

2023-08-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92497

Martin Jambor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Martin Jambor  ---
Fixed.

[Bug ipa/68930] Aggregate replacements not applied to inline function bodies.

2023-08-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930

Martin Jambor  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Martin Jambor  ---
Finally fixed.

[Bug ipa/92497] Aggregate IPA-CP and inlining do not play well together, transformation is lost

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92497

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Martin Jambor :

https://gcc.gnu.org/g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a

commit r14-3226-gd073e2d75d9ed492de9a8dc6970e5b69fae20e5a
Author: Martin Jambor 
Date:   Tue Aug 15 17:26:13 2023 +0200

Feed results of IPA-CP into tree value numbering

PRs 68930 and 92497 show that when IPA-CP figures out constants in
aggregate parameters or when passed by reference but the loads happen
in an inlined function the information is lost.  This happens even
when the inlined function itself was known to have - or even cloned to
have - such constants in incoming parameters because the transform
phase of IPA passes is not run on them.  See discussion in the bugs
for reasons why.

Honza suggested that we can plug the results of IPA-CP analysis into
value numbering, so that FRE can figure out that some loads fetch
known constants.  This is what this patch attempts to do.  The patch
does not attempt to populate partial_defs with information from
IPA-CP, this can be hopefully added as a follow-up.

gcc/ChangeLog:

2023-08-11  Martin Jambor  

PR ipa/68930
PR ipa/92497
* ipa-prop.h (ipcp_get_aggregate_const): Declare.
* ipa-prop.cc (ipcp_get_aggregate_const): New function.
(ipcp_transform_function): Do not deallocate transformation info.
* tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
ipa-prop.h.
(vn_reference_lookup_2): When hitting default-def vuse, query
IPA-CP transformation info for any known constants.

gcc/testsuite/ChangeLog:

2023-06-07  Martin Jambor  

PR ipa/68930
PR ipa/92497
* gcc.dg/ipa/pr92497-1.c: New test.
* gcc.dg/ipa/pr92497-2.c: Likewise.
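
For readers unfamiliar with the pattern, here is a hypothetical illustration of
the kind of code the commit message describes. It is not the committed test
case (pr92497-1.c); the names `params`, `apply`, `compute` and `run` are made up
for this sketch. A constant aggregate is passed by reference to an out-of-line
function, and the loads from it happen in a helper that gets inlined there.
Whether a given GCC version actually folds the loads depends on the version and
the flags used.

```cpp
#include <cassert>

// Hypothetical aggregate whose fields are constant at the only call site.
struct params {
    int scale;
    int bias;
};

// Small helper that is a candidate for inlining; the loads of
// p->scale and p->bias happen inside this function body.
static inline int apply(const params *p, int x) {
    assert(p != nullptr);
    return x * p->scale + p->bias;
}

// Out-of-line function. IPA-CP can learn what callers store in *p,
// but the loads it would like to replace sit in the inlined apply().
__attribute__((noinline))
static int compute(const params *p, int x) {
    return apply(p, x);
}

// The only caller passes an aggregate with known constant contents.
int run(int x) {
    params p = {2, 1};
    return compute(&p, x);
}
```

The result is the same with or without the optimization; the point of the
patch is only that value numbering may now see `p->scale` and `p->bias` as
known constants inside `compute`.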

[Bug ipa/68930] Aggregate replacements not applied to inline function bodies.

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68930

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Martin Jambor :

https://gcc.gnu.org/g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a

commit r14-3226-gd073e2d75d9ed492de9a8dc6970e5b69fae20e5a
Author: Martin Jambor 
Date:   Tue Aug 15 17:26:13 2023 +0200

Feed results of IPA-CP into tree value numbering

PRs 68930 and 92497 show that when IPA-CP figures out constants in
aggregate parameters or when passed by reference but the loads happen
in an inlined function the information is lost.  This happens even
when the inlined function itself was known to have - or even cloned to
have - such constants in incoming parameters because the transform
phase of IPA passes is not run on them.  See discussion in the bugs
for reasons why.

Honza suggested that we can plug the results of IPA-CP analysis into
value numbering, so that FRE can figure out that some loads fetch
known constants.  This is what this patch attempts to do.  The patch
does not attempt to populate partial_defs with information from
IPA-CP, this can be hopefully added as a follow-up.

gcc/ChangeLog:

2023-08-11  Martin Jambor  

PR ipa/68930
PR ipa/92497
* ipa-prop.h (ipcp_get_aggregate_const): Declare.
* ipa-prop.cc (ipcp_get_aggregate_const): New function.
(ipcp_transform_function): Do not deallocate transformation info.
* tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
ipa-prop.h.
(vn_reference_lookup_2): When hitting default-def vuse, query
IPA-CP transformation info for any known constants.

gcc/testsuite/ChangeLog:

2023-06-07  Martin Jambor  

PR ipa/68930
PR ipa/92497
* gcc.dg/ipa/pr92497-1.c: New test.
* gcc.dg/ipa/pr92497-2.c: Likewise.

[Bug target/111023] missing extendv4siv4hi (and friends)

2023-08-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #1 from Uroš Bizjak  ---
(In reply to Richard Biener from comment #0)
> We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
> extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
> vec_unpacks_lo via
> 
> pcmpgtw %xmm0, %xmm1
> movdqa  %xmm0, %xmm2
> punpcklwd   %xmm1, %xmm2
> 
> and that would trivially extend to the required pattern - just the
> input is v4hi instead of v8hi.
> 
> Other related patterns are probably missing as well, where we can do
> vec_unpack[s]_lo we should be able to implement [zero_]extend.

We have:

(define_expand "v4hiv4si2"
  [(set (match_operand:V4SI 0 "register_operand")
(any_extend:V4SI
  (match_operand:V4HI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"

in sse.md, so the testcase should be vectorized using -msse4.1. Is there any
other pattern missing for efficient vectorization?
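
As a source-level reference, a minimal sketch of the operation under
discussion (sign-extending four 16-bit lanes to four 32-bit lanes), written
with the GNU vector extensions. The typedef names simply mirror the machine
modes, and `sign_extend_v4hi` and `example_usage` are names invented for this
sketch. Which instructions are emitted depends on the target flags, so nothing
here claims a particular code sequence.

```cpp
#include <cassert>

// Vector types named after the corresponding machine modes.
typedef short v4hi __attribute__((vector_size(8)));   // 4 x 16-bit lanes
typedef int   v4si __attribute__((vector_size(16)));  // 4 x 32-bit lanes

// Sign-extend each 16-bit lane to 32 bits.
static inline v4si sign_extend_v4hi(v4hi in) {
    return __builtin_convertvector(in, v4si);
}

// Small self-check: negative lanes must stay negative after widening.
static inline void example_usage() {
    v4hi in = {-1, 2, -32768, 32767};
    v4si out = sign_extend_v4hi(in);
    assert(out[0] == -1);
    assert(out[2] == -32768);
}
```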

[Bug fortran/110677] UBSAN error: load of value 1818451807, which is not a valid value for type 'expr_t' when compiling pr49213.f90

2023-08-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110677

Martin Jambor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Martin Jambor  ---
Should be fixed now.

[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined

2023-08-15 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 110677, which changed state.

Bug 110677 Summary: UBSAN error: load of value 1818451807, which is not a valid 
value for type 'expr_t' when compiling pr49213.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110677

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug d/110959] gdc: internal compiler error: in layout_aggregate_type in recursive templated class

2023-08-15 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110959

ibuclaw at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
 CC||ibuclaw at gcc dot gnu.org

--- Comment #3 from ibuclaw at gcc dot gnu.org ---
Fix committed to releases/gcc-12, and test case added to mainline.

[Bug d/110959] gdc: internal compiler error: in layout_aggregate_type in recursive templated class

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110959

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Iain Buclaw :

https://gcc.gnu.org/g:4acce4c4e53ae93ab8e7dad2ca9099e45559a541

commit r14-3225-g4acce4c4e53ae93ab8e7dad2ca9099e45559a541
Author: Iain Buclaw 
Date:   Tue Aug 15 17:10:45 2023 +0200

d: Add test case for PR110959.

This ICE is specific to the D front-end language version in GDC 12,
however a test has been added to mainline to catch the unlikely event of
a regression.

PR d/110959

gcc/testsuite/ChangeLog:

* gdc.dg/pr110959.d: New test.

[Bug fortran/110677] UBSAN error: load of value 1818451807, which is not a valid value for type 'expr_t' when compiling pr49213.f90

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110677

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Martin Jambor :

https://gcc.gnu.org/g:84e122c34834d9dea189c10fe0bf60c4d1a99fae

commit r14-3224-g84e122c34834d9dea189c10fe0bf60c4d1a99fae
Author: Martin Jambor 
Date:   Tue Aug 15 17:13:44 2023 +0200

Fortran: Avoid accessing gfc_charlen when not looking at BT_CHARACTER (PR
110677)

This patch addresses an issue uncovered by the undefined behavior
sanitizer.  In function resolve_structure_cons in resolve.cc there is
a test starting with:

  if (cons->expr->ts.type == BT_CHARACTER && comp->ts.u.cl
  && comp->ts.u.cl->length
  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT

and UBSAN complained of loads from comp->ts.u.cl->length->expr_type of
integer value 1818451807, which is outside the value range of the expr_t
enum.  If I understand the code correctly, the entire load was
unwanted because comp->ts.type in those cases is BT_CLASS and not
BT_CHARACTER.  This patch simply adds a check to make sure it is only
accessed when comp->ts.type is BT_CHARACTER.

During review, Harald Anlauf noticed that length types also need to be
checked, so I also added the checks he suggested to the condition.

Co-authored-by: Harald Anlauf 

gcc/fortran/ChangeLog:

2023-08-14  Martin Jambor  

PR fortran/110677
* resolve.cc (resolve_structure_cons): Check comp->ts is character
type before accessing stuff through comp->ts.u.cl.
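
The general shape of the fix can be sketched as follows. This is a simplified
stand-in, not gfortran's data structures: `typespec`, `charlen` and
`derived_info` are hypothetical types for this sketch, and only the
`BT_CHARACTER` and `BT_CLASS` names are borrowed from the message above. The
idea is to test the type tag before reading the union member that is only
meaningful for that tag.

```cpp
#include <cassert>

// Simplified stand-ins for the type tag and the per-type payloads.
enum basic_type { BT_INTEGER, BT_CHARACTER, BT_CLASS };

struct charlen      { int length; };
struct derived_info { int id; };

struct typespec {
    basic_type type;
    union {
        charlen      *cl;       // meaningful only when type == BT_CHARACTER
        derived_info *derived;  // meaningful only when type == BT_CLASS
    } u;
};

// Read u.cl only after confirming that the tag says it is the active member.
static inline bool has_known_char_length(const typespec &ts) {
    return ts.type == BT_CHARACTER
        && ts.u.cl != nullptr
        && ts.u.cl->length > 0;
}

// Small self-check: a BT_CLASS typespec must never be read through u.cl.
static inline void example_usage() {
    derived_info d = {7};
    typespec cls = {};
    cls.type = BT_CLASS;
    cls.u.derived = &d;
    assert(!has_known_char_length(cls));
}
```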

[Bug d/110959] gdc: internal compiler error: in layout_aggregate_type in recursive templated class

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110959

--- Comment #1 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Iain Buclaw:

https://gcc.gnu.org/g:3cf5a511e253876279462b1de08cfd8f5f804242

commit r12-9817-g3cf5a511e253876279462b1de08cfd8f5f804242
Author: Iain Buclaw 
Date:   Tue Aug 15 16:56:42 2023 +0200

d: Fix internal compiler error: in layout_aggregate_type, at d/types.cc:574

This ICE is specific to the D front-end language version in GDC 12.

PR d/110959

gcc/d/ChangeLog:

* dmd/canthrow.d (Dsymbol_canThrow): Use foreachVar.
* dmd/declaration.d (TupleDeclaration::needThis): Likewise.
(TupleDeclaration::foreachVar): New function.
(VarDeclaration::setFieldOffset): Use foreachVar.
* dmd/dinterpret.d (Interpreter::visit (DeclarationExp)): Likewise.
* dmd/dsymbolsem.d (DsymbolSemanticVisitor::visit
(VarDeclaration)):
Don't push tuple field members to the scope symbol table.
(determineFields): Handle pushing tuple field members here instead.
* dmd/dtoh.d (ToCppBuffer::visit (VarDeclaration)): Visit all tuple
fields.
(ToCppBuffer::visit (TupleDeclaration)): New function.
* dmd/expression.d (expandAliasThisTuples): Use foreachVar.
* dmd/foreachvar.d (VarWalker::visit (DeclarationExp)): Likewise.
* dmd/ob.d (genKill): Likewise.
(checkObErrors): Likewise.
* dmd/semantic2.d (Semantic2Visitor::visit (TupleDeclaration)):
Visit
all tuple fields.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110959.d: New test.
* gdc.test/runnable/test23010.d: New test.

[Bug c++/111028] Incorrect optimization with -o1,-o2

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

--- Comment #4 from Patrick Palka  ---
Or simply just:

diff --git a/111028.C b/111028.C
index ed4106e..85f1c2a 100644
--- a/111028.C
+++ b/111028.C
@@ -15,7 +15,7 @@ class list_t
public:
list_t() { _head.next = _head.prev = &_head; }
bool is_empty() const { return _head.next == &_head; }
-   T* entry(list_node_t ) const { return  == &_head ?
NULL : (T*)((char*) - (char*)_node_offset); }
+   T* entry(list_node_t ) const { return  == &_head ?
NULL : (T*)((char*) - (size_t)_node_offset); }

void add(T )
{

[Bug c++/111028] Incorrect optimization with -o1,-o2

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

--- Comment #3 from Patrick Palka  ---
Converting _node_offset from a pointer into an integer offset seems to fix the
testcase:

diff --git a/111028.C b/111028.C
index ed4106e..ef2b1be 100644
--- a/111028.C
+++ b/111028.C
@@ -15,7 +15,7 @@ class list_t
public:
list_t() { _head.next = _head.prev = &_head; }
bool is_empty() const { return _head.next == &_head; }
-   T* entry(list_node_t ) const { return  == &_head ?
NULL : (T*)((char*) - (char*)_node_offset); }
+   T* entry(list_node_t ) const { return  == &_head ?
NULL : (T*)((char*) - _node_offset); }

void add(T )
{
@@ -37,11 +37,11 @@ class list_t
}

protected:
-   static list_node_t const * const _node_offset;
+   static size_t const _node_offset;
list_node_t _head;
 };
 template 
-list_node_t const * const list_t::_node_offset = &(((T
*)0)->*inner_list_node);
+size_t const list_t::_node_offset = size_t(&(((T
*)0)->*inner_list_node));

 template 
 class safe_list_t: public list_t

[Bug c++/111028] Incorrect optimization with -o1,-o2

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #2 from Patrick Palka  ---
Seems to have started with r10-515-g810c42c38d3731 / r271414.

The line

T* entry(list_node_t ) const { return  == &_head ?
NULL : (T*)((char*) - (char*)_node_offset); }

seems suspiciously similar to PR96993 which was deemed invalid due to treating
the difference of two pointers as a valid pointer.
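
The distinction can be sketched as follows. In the quoted line, subtracting one
`char*` from another yields an integer (a `ptrdiff_t`), which is then cast back
to `T*`. Subtracting an integer byte offset from a `char*` instead yields a
pointer directly. This is a minimal sketch under simplifying assumptions, not
the reporter's code: `item_t` is a hypothetical payload type, and `offsetof` on
a fixed member replaces the pointer-to-member template parameter of the
original.

```cpp
#include <cassert>
#include <cstddef>

struct list_node_t {
    list_node_t *next;
    list_node_t *prev;
};

// Hypothetical payload type with an embedded list node.
struct item_t {
    int value;
    list_node_t node;
};

// Recover the enclosing item from a pointer to its embedded node by
// subtracting an integer byte offset from a char* address. The result of
// the subtraction is itself a pointer, not a pointer difference.
static inline item_t *entry_from_node(list_node_t *n) {
    assert(n != nullptr);
    return reinterpret_cast<item_t *>(
        reinterpret_cast<char *>(n) - offsetof(item_t, node));
}
```

Given `item_t it`, the call `entry_from_node(&it.node)` returns `&it`.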

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-15 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #6 from Jorn Wolfgang Rennecke  ---
(In reply to H. Peter Anvin from comment #5)

> 2. It seems like it almost would require an implementation-specific
> performance model. Now, one can validly argue that by setting the cost of
> unimplemented instructions to a (near-)infinite value such instructions
> should never be generated even if they are "enabled". That might also be a
> possible avenue for achieving this.

Yes, that makes it possible to implement the interface without actually having
a dedicated mask table.  However, you still have the headache of how to get
code generation to use this effectively.  A lot of code generation strategies
are basically canned solutions that a skilled assembler programmer has devised;
you can theoretically use the superoptimizer to find linear sequences for
arbitrary instruction sets, but the compilation time cost and the limit to
linear sequences make this impractical.
Therefore, as you want to co-develop architecture and software, you likely also
have to hack the compiler to make effective use of your architecture.
FWIW, 'infinite' cost seems unnecessarily high, considering you could make your
assembler replace missing instructions with function calls, and these functions
can get linked from a library.  So you have a finite cost per-call for the call
site size (static instruction count) & time (dynamic instruction count), and a
one-time size cost per-object for each function used.  Such a library and
assembler modification could be prepared for specific extensions that you want
to deconstruct, and then used flexibly.

[Bug target/109815] AIX: Combining -static-libstdc++ and -pthread causes a TLS-related link error

2023-08-15 Thread cameron.heide at betasystems dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109815

--- Comment #3 from C. Heide  ---
I finally got access to an AIX 7.2 system and the problem does not occur there
with GCC 12.3, so it seems to be something specific to DWARF only being
partially supported in AIX 7.1.

The installation instructions do note that AIX 7.1 TL03 SP7 minimum is needed.
My 7.1 system is TL05, but maybe a certain minimum 'SPx' version is needed
there too?

(Probably not a big deal now that 7.1 is out of support and 7.2+ work
anyway...)

[Bug tree-optimization/111030] New: tree-object-size: incorrect sub-object size for VLA

2023-08-15 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111030

Bug ID: 111030
   Summary: tree-object-size: incorrect sub-object size for VLA
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: qinzhao at gcc dot gnu.org
  Target Milestone: ---

Currently, __builtin_dynamic_object_size does not compute the sub-object size
of a VLA correctly.  Please see the following test case:

#include 
#include 

#define expect(p, _v) do { \
size_t v = _v; \
if (p == v) \
__builtin_printf ("ok:  %s == %zd\n", #p, p); \
else \
{  \
  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
} \
} while (0);

#define noinline __attribute__((__noinline__))

static void noinline bar (int index)
{
  struct annotated {
long foo;
char b;
char array[index];
long c;
  } q, *p;

  p = &q;

  expect (__builtin_dynamic_object_size(p->array, 0), 
  sizeof (struct annotated) - offsetof (struct annotated, array[0]));
  expect (__builtin_dynamic_object_size(p->array, 1), 
  offsetof (struct annotated, array[index]) - offsetof (struct
annotated, array[0]));
  return;
}

int main ()
{
  bar (10);
  return 0;
}

when compiled with the latest gcc and run:
/home/opc/Install/latest-d/bin/gcc -O t.c
ok:  __builtin_dynamic_object_size(p->array, 0) == 23
WAT: __builtin_dynamic_object_size(p->array, 1) == 23 (expected 10)

[Bug middle-end/110989] RISC-V vector ICE due to invalid tree code in GIMPLE vect pass

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110989

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:0618adfa80fcd2fd7ae03b30553c60a6b1abf573

commit r14-3222-g0618adfa80fcd2fd7ae03b30553c60a6b1abf573
Author: Juzhe-Zhong 
Date:   Sat Aug 12 22:15:15 2023 +0800

RISC-V: Fix autovec_length_operand predicate[PR110989]

Currently, the autovec_length_operand predicate is configured incorrectly,
as discovered in PR110989 in the following situation:

vect__6.24_107 = .MASK_LEN_LOAD (vectp.22_105, 32B, mask__49.21_99,
POLY_INT_CST [2, 2], 0); ---> dummy length = VF.

The current autovec_length_operand predicate fails to recognize the VF dummy
length.

-march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=scalable -Ofast
-fno-schedule-insns -fno-schedule-insns2:

Before this patch:

srli    a4,s0,2
addi    a4,a4,-3
srli    s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v   v1
vmul.vx v1,v1,a4
addi    a4,s0,-2
vadd.vx v1,v1,a4
addi    a4,s0,-1
vslide1up.vx    v2,v1,a4
vmv.v.x v1,a4
vand.vv v1,v2,v1
vl1re64.v   v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv        v0,v2,v1
vsetvli zero,s0,e32,mf2,ta,ma   ---> s0 = POLY (2,2)
vle32.v v3,0(t3),v0.t
vsetvli a5,zero,e64,m1,ta,ma
vmfne.vv        v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v    v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm  v1,v1,v2,v0
vslidedown.vx   v1,v1,a4
vfmv.f.s        fa5,v1
j   .L6
After this patch:

srli    a4,s0,2
addi    a4,a4,-3
srli    s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v   v1
vmul.vx v1,v1,a4
addi    a4,s0,-2
vadd.vx v1,v1,a4
addi    s0,s0,-1
vslide1up.vx    v2,v1,s0
vmv.v.x v1,s0
vand.vv v1,v2,v1
vl1re64.v   v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv        v0,v2,v1
vle32.v v3,0(t3),v0.t
vmfne.vv        v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v    v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm  v1,v1,v2,v0
vslidedown.vx   v1,v1,s0
vfmv.f.s        fa5,v1
j   .L6

Two vsetvli insns are eliminated.

gcc/ChangeLog:

PR target/110989
* config/riscv/predicates.md: Fix predicate.

gcc/testsuite/ChangeLog:

PR target/110989
* gcc.target/riscv/rvv/autovec/pr110989.c: Add vsetvli assembly
check.

[Bug c/111025] attribute((malloc)) and posix_memalign() (and other functions that return newly allocated object address into an output parameter)

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111025

--- Comment #2 from Richard Biener  ---
Note GCC already handles this internally (posix_memalign, that is) as if a
malloc attribute was possible and present.

[Bug c/111025] attribute((malloc)) and posix_memalign() (and other functions that return newly allocated object address into an output parameter)

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111025

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-15

--- Comment #1 from Richard Biener  ---
Confirmed.  We probably have a duplicate bugreport.

[Bug c++/111028] Incorrect optimization with -o1,-o2

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
  Known to work||9.5.0
  Known to fail||10.5.0, 12.3.0, 13.2.0

--- Comment #1 from Richard Biener  ---
I can confirm the finding, not sure if the program is valid though.  Neither
-fno-strict-aliasing nor -fno-lifetime-dse has any effect.

[Bug target/111029] New: bpf: GCC generates invalid instructions wN = (s8) rM

2023-08-15 Thread jemarch at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111029

Bug ID: 111029
   Summary: bpf: GCC generates invalid instructions wN = (s8) rM
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jemarch at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55738
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55738&action=edit
preprocessed source of kernel bpf selftest

When compiling the attached pre-processed program (which is a kernel bpf
selftest) with gcc-bpf we get invalid pseudo-c movs32 instructions:

$ bpf-unknown-none-gcc  -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
-O2 -gbtf -masm=pseudoc -mco-re -Wno-unknown-pragmas -Wno-unused-variable
-Wno-error=attributes -Wno-error=address-of-packed-member -c test_sysctl_prog.i
/tmp/ccgPxp1s.s: Assembler messages:
/tmp/ccgPxp1s.s:63: Error: unrecognized instruction `w2 =(s8)r1'
/tmp/ccgPxp1s.s:63: Error: expected register name, got 'r1'
/tmp/ccgPxp1s.s:71: Error: unrecognized instruction `w4 =(s8)r3'
/tmp/ccgPxp1s.s:71: Error: expected register name, got 'r3'
/tmp/ccgPxp1s.s:79: Error: unrecognized instruction `w9 =(s8)r5'
/tmp/ccgPxp1s.s:79: Error: expected register name, got 'r5'
/tmp/ccgPxp1s.s:88: Error: unrecognized instruction `w1 =(s8)r0'
/tmp/ccgPxp1s.s:88: Error: expected register name, got 'r0'
/tmp/ccgPxp1s.s:96: Error: unrecognized instruction `w3 =(s8)r2'
/tmp/ccgPxp1s.s:96: Error: expected register name, got 'r2'
/tmp/ccgPxp1s.s:104: Error: unrecognized instruction `w5 =(s8)r4'
/tmp/ccgPxp1s.s:104: Error: expected register name, got 'r4'
/tmp/ccgPxp1s.s:112: Error: unrecognized instruction `w0 =(s8)r9'
/tmp/ccgPxp1s.s:112: Error: expected register name, got 'r9'
/tmp/ccgPxp1s.s:120: Error: unrecognized instruction `w2 =(s8)r1'
/tmp/ccgPxp1s.s:120: Error: expected register name, got 'r1'
/tmp/ccgPxp1s.s:128: Error: unrecognized instruction `w4 =(s8)r3'
/tmp/ccgPxp1s.s:128: Error: expected register name, got 'r3'
/tmp/ccgPxp1s.s:136: Error: unrecognized instruction `w9 =(s8)r5'
/tmp/ccgPxp1s.s:136: Error: expected register name, got 'r5'
/tmp/ccgPxp1s.s:144: Error: unrecognized instruction `w1 =(s8)r0'
/tmp/ccgPxp1s.s:144: Error: expected register name, got 'r0'
/tmp/ccgPxp1s.s:152: Error: unrecognized instruction `w3 =(s8)r2'
/tmp/ccgPxp1s.s:152: Error: expected register name, got 'r2'
/tmp/ccgPxp1s.s:160: Error: unrecognized instruction `w5 =(s8)r4'
/tmp/ccgPxp1s.s:160: Error: expected register name, got 'r4'
/tmp/ccgPxp1s.s:168: Error: unrecognized instruction `w0 =(s8)r9'
/tmp/ccgPxp1s.s:168: Error: expected register name, got 'r9'
/tmp/ccgPxp1s.s:176: Error: unrecognized instruction `w2 =(s8)r1'
/tmp/ccgPxp1s.s:176: Error: expected register name, got 'r1'
/tmp/ccgPxp1s.s:184: Error: unrecognized instruction `w4 =(s8)r3'
/tmp/ccgPxp1s.s:184: Error: expected register name, got 'r3'
/tmp/ccgPxp1s.s:192: Error: unrecognized instruction `w9 =(s8)r5'
/tmp/ccgPxp1s.s:192: Error: expected register name, got 'r5'

[Bug other/111027] Install error "tmp-header-vars: Permission denied", build on NFS, improvement possible

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111027

Richard Biener  changed:

   What|Removed |Added

   Keywords||build

--- Comment #1 from Richard Biener  ---
I think the bug is that make install writes to the build directory at all (OTOH
libtool tends to re-link, but this instance seems to be different).

[Bug c++/111019] [12/13/14 Regression] Optimizer incorrectly assumes variable is not changed while change happens through another pointer

2023-08-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019

--- Comment #3 from rguenther at suse dot de  ---
On Tue, 15 Aug 2023, ppalka at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019
> 
> Patrick Palka  changed:
> 
>What|Removed |Added
> 
>  CC||ppalka at gcc dot gnu.org
>Keywords|needs-bisection |
> 
> --- Comment #2 from Patrick Palka  ---
> Bisection points to r12-4319-g09a0affdb0598a

Huh, that should make us optimize less ... (it was also backported
to GCC 11).  Having a good/bad rev should make it easier to analyze
though - thanks.

[Bug tree-optimization/110923] [14 Regression] with-build-config=bootstrap-lto-lean and `make profile-bootstrap` ICEs during build during lsplit pass

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110923

Jan Hubicka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
 CC||hubicka at gcc dot gnu.org

--- Comment #3 from Jan Hubicka  ---
Fixed.

[Bug tree-optimization/110940] [14 Regression] ICE at -O3 on x86_64-linux-gnu: in apply_scale, at profile-count.h:1180

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110940

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Jan Hubicka  ---
Fixed by g:882af290c137dfab5d99b88e6dbecc5e75d85a0b

[Bug c++/106604] Fully-specified deduction guide in anonymous namespace warns as-if a function? Unsuppressably?

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106604

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Patrick Palka  ---
Fixed for GCC 14.

[Bug c++/109021] accept size parameter in extern C

2023-08-15 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109021

--- Comment #4 from Martin Uecker  ---

Sorry, how can an enhancement request that addresses a real C/C++ compatibility
problem be marked "resolved invalid" ?

[Bug tree-optimization/110971] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in operator/, at sreal.cc:261

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110971

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Jan Hubicka  ---
Fixed by g:39204ae9ddbfca710880d7f5fda48234a1e85e4e

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 110988, which changed state.

Bug 110988 Summary: [14 regression] ICE when building 523.xalancbmk_r with pgo 
and lto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/110988] [14 regression] ICE when building 523.xalancbmk_r with pgo and lto

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988

Jan Hubicka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Jan Hubicka  ---
Fixed.

[Bug c++/111019] [12/13/14 Regression] Optimizer incorrectly assumes variable is not changed while change happens through another pointer

2023-08-15 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #2 from Patrick Palka  ---
Bisection points to r12-4319-g09a0affdb0598a

[Bug target/109068] bpf: "error: too many function arguments for eBPF" for always_inline function

2023-08-15 Thread jemarch at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109068

Jose E. Marchesi  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Jose E. Marchesi  ---
Fixed by:

commit 6103df1e4fae5192c507484b1d32f00c42c70b54
Author: Jose E. Marchesi 
Date:   Thu Aug 10 10:53:16 2023 +0200

bpf: allow exceeding max num of args in BPF when always_inline

BPF currently limits the number of registers used to pass arguments to
functions to five registers.  There is a check for this at function
expansion time.  However, if a function is guaranteed to be always
inlined (and its body never generated) by virtue of the always_inline
attribute, it can "receive" any number of arguments.
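
A minimal sketch of the situation the commit message describes: a helper that
takes more than five arguments but is marked always_inline, so no out-of-line
body with that signature needs to exist. The names `sum6` and `caller` are
hypothetical, and the block compiles on any GCC target; it only illustrates the
source pattern, not the BPF back end's behaviour.

```cpp
#include <cassert>

// Hypothetical helper taking six arguments, one more than the five
// argument registers mentioned above. always_inline asks the compiler
// to expand it at every call site.
static inline __attribute__((always_inline))
int sum6(int a, int b, int c, int d, int e, int f) {
    return a + b + c + d + e + f;
}

// The caller itself takes a single argument; after inlining, the
// six-argument call is replaced by the body of sum6.
int caller(int x) {
    int r = sum6(x, 1, 2, 3, 4, 5);
    assert(r == x + 15);
    return r;
}
```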

[Bug c++/111028] New: Incorrect optimization with -o1,-o2

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111028

Bug ID: 111028
   Summary: Incorrect optimization with -o1,-o2
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhaiqiming at baidu dot com
  Target Milestone: ---

==

[summary]

When I upgraded from GCC 8.2 to GCC 12, I found that list_t::del no longer
behaved as expected.
Looking at the .s file, I noticed that the assembly corresponding to the
following two lines had disappeared.

(node.*inner_list_node).next->prev = (node.*inner_list_node).prev;
(node.*inner_list_node).prev->next = (node.*inner_list_node).next;


==

[environment-gcc12]
Reading specs from
/home/opt/compiler/gcc-12/bin/../lib/gcc/x86_64-pc-linux-gnu/12.1.0/specs
COLLECT_GCC=/home/opt/compiler/gcc-12/bin/gcc
COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-12/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/compiler/gcc-12
--with-local-prefix=/opt/compiler/gcc-12
--with-native-system-header-dir=/opt/compiler/gcc-12/include
--enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
--with-default-libstdcxx-abi=gcc4-compatible --disable-bootstrap
--with-sysroot=/ --build=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--enable-gold --enable-lto
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.1.0 (GCC) 

[environment-gcc82]
Using built-in specs.
COLLECT_GCC=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/gcc
COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/../libexec/gcc/x86_64-pc-linux-gnu/8.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/compiler/gcc-8.2
--with-local-prefix=/opt/compiler/gcc-8.2
--with-native-system-header-dir=/opt/compiler/gcc-8.2/include
--enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
--disable-bootstrap --with-default-libstdcxx-abi=gcc4-compatible
Thread model: posix
gcc version 8.2.0 (GCC) 

==

[source code]

#include 
#include 
#include 
#include 

struct list_node_t
{
list_node_t *next;
list_node_t *prev;
};

template <typename T, list_node_t T::*inner_list_node>
class list_t
{
public:
list_t() { _head.next = _head.prev = &_head; }
bool is_empty() const { return _head.next == &_head; }
T* entry(list_node_t &node) const { return &node == &_head ?
NULL : (T*)((char*)&node - (char*)_node_offset); }

void add(T &node)
{
_head.next->prev = &(node.*inner_list_node);
(node.*inner_list_node).next = _head.next;
(node.*inner_list_node).prev = &_head;
_head.next = &(node.*inner_list_node);
}


static void del(T &node)
{
int slot_1 = 1;
printf("slot_1 %d\n", slot_1);
(node.*inner_list_node).next->prev =
(node.*inner_list_node).prev;
(node.*inner_list_node).prev->next =
(node.*inner_list_node).next;
int slot_2 = 2;
printf("slot_2 %d\n", slot_2);
}

protected:
static list_node_t const * const _node_offset;
list_node_t _head;
};
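template <typename T, list_node_t T::*inner_list_node>
list_node_t const * const list_t<T, inner_list_node>::_node_offset = &(((T
*)0)->*inner_list_node);

template <typename T, list_node_t T::*inner_list_node>
class safe_list_t: public list_t<T, inner_list_node>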
{
public:
safe_list_t(): _alive(1),_num(0)
{
pthread_mutex_init(&_mutex, NULL);
pthread_cond_init(&_cond, NULL);
}

~safe_list_t()
{
pthread_cond_destroy(&_cond);
pthread_mutex_destroy(&_mutex);
}

int len()
{
return _num;
}

void put(T &node)
{
pthread_mutex_lock(&_mutex);
if (_alive)
{
this->add(node);
++_num;
}
pthread_mutex_unlock(&_mutex);
pthread_cond_signal(&_cond);
}

T* get()
{
T *ret;
pthread_mutex_lock(&_mutex);
while (_alive && list_t<T, inner_list_node>::is_empty())
pthread_cond_wait(&_cond, &_mutex);
if (_alive)
{
ret = this->entry(*list_t<T, inner_list_node>::_head.prev);
this->del(*ret);
--_num;
}
else
{

[Bug c++/111026] Incorrect optimization with -o1,-o2

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

zhaiqiming at baidu dot com changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|WAITING |RESOLVED

[Bug c++/111026] Incorrect optimization with -o1,-o2

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

--- Comment #3 from zhaiqiming at baidu dot com ---

[head file]


#include 
#include 
#include 
#include 

[Bug c++/111026] Incorrect optimization with -o1,-o2

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

--- Comment #2 from zhaiqiming at baidu dot com ---
(In reply to zhaiqiming from comment #1)
> ==
> 
> [summary]
> 
> When I upgraded from gcc82 to gcc12, I found that linked_list_t::del
> execution did not meet expectations.
> Looking at the .s file, I noticed that the assembly statements corresponding to the
> following two lines have disappeared.
> 
> (node.*list_node).next->prev = (node.*list_node).prev;
> (node.*list_node).prev->next = (node.*list_node).next;
> 
> 
> ==
> 
> [environment-gcc12]
> Reading specs from
> /home/opt/compiler/gcc-12/bin/../lib/gcc/x86_64-pc-linux-gnu/12.1.0/specs
> COLLECT_GCC=/home/opt/compiler/gcc-12/bin/gcc
> COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-12/bin/../libexec/gcc/x86_64-pc-
> linux-gnu/12.1.0/lto-wrapper
> Target: x86_64-pc-linux-gnu
> Configured with: ../configure --prefix=/opt/compiler/gcc-12
> --with-local-prefix=/opt/compiler/gcc-12
> --with-native-system-header-dir=/opt/compiler/gcc-12/include
> --enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
> --with-default-libstdcxx-abi=gcc4-compatible --disable-bootstrap
> --with-sysroot=/ --build=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
> --enable-gold --enable-lto
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 12.1.0 (GCC) 
> 
> [environment-gcc82]
> Using built-in specs.
> COLLECT_GCC=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/gcc
> COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/../libexec/gcc/
> x86_64-pc-linux-gnu/8.2.0/lto-wrapper
> Target: x86_64-pc-linux-gnu
> Configured with: ../configure --prefix=/opt/compiler/gcc-8.2
> --with-local-prefix=/opt/compiler/gcc-8.2
> --with-native-system-header-dir=/opt/compiler/gcc-8.2/include
> --enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
> --disable-bootstrap --with-default-libstdcxx-abi=gcc4-compatible
> Thread model: posix
> gcc version 8.2.0 (GCC) 
> 
> ==
> 
> [source code]
> 
> 
> struct list_node_t
> {
>   list_node_t *next;
>   list_node_t *prev;
> };
> 
> template <typename T, list_node_t T::*inner_list_node>
> class list_t
> {
>   public:
>   list_t() { _head.next = _head.prev = &_head; }
>   bool is_empty() const { return _head.next == &_head; }
>   T* entry(list_node_t &node) const { return &node == &_head ?
> NULL :
> (T*)((char*)&node - (char*)_node_offset); }
> 
>   void add(T &node)
>   {
>   _head.next->prev = &(node.*inner_list_node);
>   (node.*inner_list_node).next = _head.next;
>   (node.*inner_list_node).prev = &_head;
>   _head.next = &(node.*inner_list_node);
>   }
> 
> 
>   static void del(T &node)
>   {
> int slot_1 = 1;
> printf("slot_1 %d\n", slot_1);
>   (node.*inner_list_node).next->prev = 
> (node.*inner_list_node).prev;
>   (node.*inner_list_node).prev->next = 
> (node.*inner_list_node).next;
>   int slot_2 = 2;
> printf("slot_2 %d\n", slot_2);
>   }
> 
>   protected:
>   static list_node_t const * const _node_offset;
>   list_node_t _head;
> };
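> template <typename T, list_node_t T::*inner_list_node>
> list_node_t const * const list_t<T, inner_list_node>::_node_offset = &(((T
> *)0)->*inner_list_node);
> 
> template <typename T, list_node_t T::*inner_list_node>
> class safe_list_t: public list_t<T, inner_list_node>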
> {
>   public:
>   safe_list_t(): _alive(1),_num(0)
>   {
>   pthread_mutex_init(&_mutex, NULL);
>   pthread_cond_init(&_cond, NULL);
>   }
> 
>   ~safe_list_t()
>   {
>   pthread_cond_destroy(&_cond);
>   pthread_mutex_destroy(&_mutex);
>   }
> 
>   int len()
>   {
>   return _num;
>   }
> 
>   void put(T &node)
>   {
>   pthread_mutex_lock(&_mutex);
>   if (_alive)
>   {
> this->add(node);
>   ++_num;
>   }
>   pthread_mutex_unlock(&_mutex);
>   pthread_cond_signal(&_cond);
>   }
> 
>   T* get()
>   {
>   T *ret;
>   pthread_mutex_lock(&_mutex);
>   while (_alive && list_t<T, inner_list_node>::is_empty())
>   pthread_cond_wait(&_cond, &_mutex);
>   if (_alive)
>   {
> ret = this->entry(*list_t<T, inner_list_node>::_head.prev);
> this->del(*ret);
>   --_num;
>   }
>   else
>   {
>   ret = NULL;
>   }
>   pthread_mutex_unlock(&_mutex);
>   return ret;
>   }
> 
> 

[Bug c++/111026] Incorrect optimization with -o2,-o

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

--- Comment #1 from zhaiqiming at baidu dot com ---
==

[summary]

When I upgraded from gcc82 to gcc12, I found that linked_list_t::del execution
did not meet expectations.
Looking at the .s file, I noticed that the assembly statements corresponding to the
following two lines have disappeared.

(node.*list_node).next->prev = (node.*list_node).prev;
(node.*list_node).prev->next = (node.*list_node).next;


==

[environment-gcc12]
Reading specs from
/home/opt/compiler/gcc-12/bin/../lib/gcc/x86_64-pc-linux-gnu/12.1.0/specs
COLLECT_GCC=/home/opt/compiler/gcc-12/bin/gcc
COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-12/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/compiler/gcc-12
--with-local-prefix=/opt/compiler/gcc-12
--with-native-system-header-dir=/opt/compiler/gcc-12/include
--enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
--with-default-libstdcxx-abi=gcc4-compatible --disable-bootstrap
--with-sysroot=/ --build=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--enable-gold --enable-lto
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.1.0 (GCC) 

[environment-gcc82]
Using built-in specs.
COLLECT_GCC=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/gcc
COLLECT_LTO_WRAPPER=/home/opt/compiler/gcc-8.2/gcc-8.2/bin/../libexec/gcc/x86_64-pc-linux-gnu/8.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/compiler/gcc-8.2
--with-local-prefix=/opt/compiler/gcc-8.2
--with-native-system-header-dir=/opt/compiler/gcc-8.2/include
--enable-languages=c,c++ --disable-libstdcxx-pch --disable-multilib
--disable-bootstrap --with-default-libstdcxx-abi=gcc4-compatible
Thread model: posix
gcc version 8.2.0 (GCC) 

==

[source code]


struct list_node_t
{
list_node_t *next;
list_node_t *prev;
};

template <typename T, list_node_t T::*inner_list_node>
class list_t
{
public:
list_t() { _head.next = _head.prev = &_head; }
bool is_empty() const { return _head.next == &_head; }
T* entry(list_node_t &node) const { return &node == &_head ?
NULL : (T*)((char*)&node - (char*)_node_offset); }

void add(T &node)
{
_head.next->prev = &(node.*inner_list_node);
(node.*inner_list_node).next = _head.next;
(node.*inner_list_node).prev = &_head;
_head.next = &(node.*inner_list_node);
}


static void del(T &node)
{
int slot_1 = 1;
printf("slot_1 %d\n", slot_1);
(node.*inner_list_node).next->prev =
(node.*inner_list_node).prev;
(node.*inner_list_node).prev->next =
(node.*inner_list_node).next;
int slot_2 = 2;
printf("slot_2 %d\n", slot_2);
}

protected:
static list_node_t const * const _node_offset;
list_node_t _head;
};
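template <typename T, list_node_t T::*inner_list_node>
list_node_t const * const list_t<T, inner_list_node>::_node_offset = &(((T
*)0)->*inner_list_node);

template <typename T, list_node_t T::*inner_list_node>
class safe_list_t: public list_t<T, inner_list_node>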
{
public:
safe_list_t(): _alive(1),_num(0)
{
pthread_mutex_init(&_mutex, NULL);
pthread_cond_init(&_cond, NULL);
}

~safe_list_t()
{
pthread_cond_destroy(&_cond);
pthread_mutex_destroy(&_mutex);
}

int len()
{
return _num;
}

void put(T &node)
{
pthread_mutex_lock(&_mutex);
if (_alive)
{
this->add(node);
++_num;
}
pthread_mutex_unlock(&_mutex);
pthread_cond_signal(&_cond);
}

T* get()
{
T *ret;
pthread_mutex_lock(&_mutex);
while (_alive && list_t<T, inner_list_node>::is_empty())
pthread_cond_wait(&_cond, &_mutex);
if (_alive)
{
ret = this->entry(*list_t<T, inner_list_node>::_head.prev);
this->del(*ret);
--_num;
}
else
{
ret = NULL;
}
pthread_mutex_unlock(&_mutex);
return ret;
}

protected:
pthread_mutex_t _mutex;
pthread_cond_t _cond;
int _alive;
int _num;
};

template 
class worker_t
{
public:
T *t;
list_node_t 

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #17 from Hans-Peter Nilsson  ---
(In reply to Richard Biener from comment #12)
> I think a "too broad" dependence isn't bad.  The cris specific solution also
> looks manageable, though I wonder what's special about x-protos.h, knowing
> very little of that area.

Not very special, it's just that there's now a dependence on tree.h and its
generated contents where for tm-protos.h there has been none before.  The
reasons seem avoidable at this time, but I think it's nominally valid to have
that dependency.  On the one hand, introducing a new dependency where there has
been none before and dealing with the fallout we now see is awkward, but on the
other hand pragmatically the dependence on tree.h for $(TM_P) is trivial enough
that a parallel build "raced" to fulfill it. IOW, shouldn't affect build time
by any measure.

(Not closing this PR yet in case maintainers go forward for the other
tm-protos.h-affected targets)

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #16 from CVS Commits  ---
The master branch has been updated by Hans-Peter Nilsson :

https://gcc.gnu.org/g:eef192b181b8777d708671e2541896e7e31293aa

commit r14-3218-geef192b181b8777d708671e2541896e7e31293aa
Author: Hans-Peter Nilsson 
Date:   Tue Aug 15 06:35:43 2023 +0200

CRIS: Don't include tree.h in cris-protos.h, PR bootstrap/111021

While there's another patch that fixes the immediate error
in the PR by other means, the include of tree.h here is
something I prefer to avoid.

PR bootstrap/111021
* config/cris/cris-protos.h: Revert recent change.
* config/cris/cris.cc (cris_legitimate_address_p): Remove
code_helper unused parameter.
(cris_legitimate_address_p_hook): New wrapper function.
(TARGET_LEGITIMATE_ADDRESS_P): Change to
cris_legitimate_address_p_hook.

[Bug other/111027] New: Install error "tmp-header-vars: Permission denied", build on NFS, improvement possible

2023-08-15 Thread etienne_lorrain at yahoo dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111027

Bug ID: 111027
   Summary: Install error "tmp-header-vars: Permission denied",
build on NFS, improvement possible
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: etienne_lorrain at yahoo dot fr
  Target Milestone: ---

On an ARM32 Linux system just installed, i.e. a Parallella board just flashed
with an SDCard image at
https://github.com/parallella/parabuntu/releases/download/parabuntu-2019.1-beta1/parabuntu-2019.1-beta1-headless-z7010.img.gz
, full compilation works fine with:

mkdir vega
# limited SDCard storage so:
sudo mount 192.168.1.84:/home/etienne/parallella /home/parallella/vega
cd vega
wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.xz
tar xf gcc-13.2.0.tar.xz
cd gcc-13.2.0/
./contrib/download_prerequisites
cd ..
mkdir gcc_build
cd gcc_build
../gcc-13.2.0/configure --enable-languages=c,c++,fortran
time make -j 3 # SDCard real: 2033m27.533s / NFS real: 2065m36.178s

But install fails with:
$ echo parallella | sudo -S make install
...
/usr/bin/install -c -m 644 ../../gcc-13.2.0/gcc/cp/operators.def
/usr/local/lib/gcc/armv7l-unknown-linux-gnueabihf/13.2.0/plugin/include/cp/operators.def
/usr/bin/install -c -m 644 ../../gcc-13.2.0/gcc/cp/cp-trait.def
/usr/local/lib/gcc/armv7l-unknown-linux-gnueabihf/13.2.0/plugin/include/cp/cp-trait.def
/usr/bin/install -c -m 644 ../../gcc-13.2.0/gcc/cp/contracts.h
/usr/local/lib/gcc/armv7l-unknown-linux-gnueabihf/13.2.0/plugin/include/cp/contracts.h
rm -f tmp-header-vars
echo USER_H= ... some filenames ... >> tmp-header-vars; echo
HASHTAB_H=hashtab.h >> tmp-header-vars; echo OBSTACK_H=obstack.h >>
tmp-header-vars; echo SPLAY_TREE_H=splay-tree.h >> tmp-header-vars; ...plenty
more .. ; echo GTFILES_LANG_H=gtype-ada.h gtype-c.h gtype-cp.h gtype-d.h
gtype-fortran.h gtype-go.h gtype-jit.h gtype-lto.h gtype-m2.h gtype-objc.h
gtype-objcp.h gtype-rust.h >> tmp-header-vars;
/bin/sh: tmp-header-vars: Permission denied
/bin/sh: tmp-header-vars: Permission denied
/bin/sh: tmp-header-vars: Permission denied
... lots of identical messages ...
Makefile:3736: recipe for target 's-header-vars' failed
make[2]: *** [s-header-vars] Error 1
make[2]: Leaving directory '/home/parallella/vega/gcc_build/gcc'
Makefile:5361: recipe for target 'install-gcc' failed
make[1]: *** [install-gcc] Error 2
make[1]: Leaving directory '/home/parallella/vega/gcc_build'
Makefile:2632: recipe for target 'install' failed
make: *** [install] Error 2

The problem is simple, I am just creating this bug report so that people can
google it.

This is the first time "root" (due to sudo) tries to create a file in the
build directory, and that build directory is mounted on an NFS filesystem.
If the NFS server does not export it with "no_root_squash", a standard user
can create the file "tmp-header-vars" there, but root cannot!

Maybe a simple test (and a better error message) could be created when root
cannot create a file in the build directory? Obviously root will create files
without problems on /usr/local/{bin,lib,}.
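The kind of test suggested above could look roughly like the following. This
is only a sketch of the probe itself, written in C++ for illustration; an
actual check would presumably live in the Makefile or configure machinery.
The helper name can_create_file_in and the probe file name are hypothetical.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns true if the calling user can create (and then
// remove) a scratch file in `dir`.  On an NFS export without no_root_squash
// this is expected to fail for root even though an ordinary user succeeds.
bool can_create_file_in(const std::string &dir)
{
    const std::string probe =
        dir + "/.write-probe." + std::to_string(static_cast<long>(getpid()));
    int fd = open(probe.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        // Report the directory and the reason, not just the file name.
        std::fprintf(stderr, "cannot create files in %s: %s\n",
                     dir.c_str(), std::strerror(errno));
        return false;
    }
    close(fd);
    unlink(probe.c_str());
    return true;
}
```

Calling can_create_file_in(".") from the build directory before the first
write would turn the wall of "Permission denied" lines into one message that
names the directory.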

Feel free to classify as "NOTABUG", just keep it accessible in search engines!

Best Regards, Etienne.

[Bug c++/111026] Incorrect optimization with -o2,-o

2023-08-15 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-08-15
 Status|UNCONFIRMED |WAITING

[Bug c++/111026] New: Incorrect optimization with -o2,-o

2023-08-15 Thread zhaiqiming at baidu dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111026

Bug ID: 111026
   Summary: Incorrect optimization with -o2,-o
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhaiqiming at baidu dot com
  Target Milestone: ---

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-15 Thread nsz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #12 from nsz at gcc dot gnu.org ---
(In reply to Jiangning Liu from comment #11)
> Hi Wilco,
> 
> > "it means we will need a linker optimization to remove those redundant BTIs 
> > (eg. by changing them into NOPs)"
> 
> It will be only for performance optimization, right? If we don't care about
> performance, the linker doesn't need to optimize it to be NOP, right? It
> could still be useful if we only do this operation for a specific module.

no, this is a security feature, we want as few BTI c in an executable
segment as possible.

[Bug c/111025] New: attribute((malloc)) and posix_memalign() (and other functions that return newly allocated object address into an output parameter)

2023-08-15 Thread yann at droneaud dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111025

Bug ID: 111025
   Summary: attribute((malloc)) and posix_memalign() (and other
functions that return newly allocated object address
into an output parameter)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yann at droneaud dot fr
  Target Milestone: ---

Functions such as posix_memalign() don't return the pointer to the newly
allocated memory as their return value, thus attribute((malloc)) cannot be used
with them.

It would be useful to have some form of attribute((malloc)) that could apply to
functions such as posix_memalign().

This new attribute((malloc)) form could also be used on asprintf() for example.

With support for the attribute((malloc))'s deallocator specification, it could
improve warnings at compile time and prevent developer mistakes.
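
To illustrate the limitation (a minimal sketch, not part of the original
report): because attribute((malloc)) describes a function's return value, the
usual workaround today is a wrapper that returns the pointer which
posix_memalign() stores through its output parameter. The wrapper name
xmemalign is hypothetical.

```cpp
#include <stddef.h>
#include <stdlib.h>   // posix_memalign (POSIX), free

// Hypothetical wrapper.  The attribute can be applied here because the new
// block is the return value; it cannot be expressed on posix_memalign()
// itself, which hands the block back through `memptr`.
__attribute__((malloc))
void *xmemalign(size_t alignment, size_t size)
{
    void *p = nullptr;
    // posix_memalign() returns 0 on success and an error number otherwise.
    if (posix_memalign(&p, alignment, size) != 0)
        return nullptr;
    return p;
}
```

A caller would use it like malloc(), for example
void *buf = xmemalign(64, 256); followed by free(buf). The request in this
report is for a way to state the same thing directly on the output parameter.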

[Bug tree-optimization/110963] [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Richard Biener  ---
Fixed nevertheless.

[Bug tree-optimization/110963] [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:4d6132e59327e809a4d4e39fb9465dbd43775b7c

commit r14-3217-g4d6132e59327e809a4d4e39fb9465dbd43775b7c
Author: Richard Biener 
Date:   Thu Aug 10 13:55:36 2023 +0200

tree-optimization/110963 - more PRE when optimizing for size

The following adjusts the heuristic when we perform PHI insertion
during GIMPLE PRE from requiring at least one edge that is supposed
to be optimized for speed to also doing insertion when the expression
is available on all edges (but possibly with different value) and
we'd at most have one copy from a constant.  The first ensures
we optimize two computations on all paths to one plus a possible
copy due to the PHI, the second makes sure we do not need to insert
many possibly large copies from constants, disregarding the
cumulative size cost of the register copies when they are not
coalesced.

The case in the testcase is

  
  _14 = h;
  if (_14 == 0B)
goto ;
  else
goto ;

  
  h = 0B;

  
  h.6_12 = h;

and we want to optimize that to

  
  # h.6_12 = PHI <_14(5), 0B(6)>

If we want to consider the cost of the register copies I think the
only simplistic enough way would be to restrict the special-case to
two incoming edges - we'd assume one register copy is coalesced
leaving one copy from a register or from a constant.

As with every optimization the downstream effects are probably
bigger than what we can locally estimate.

PR tree-optimization/110963
* tree-ssa-pre.cc (do_pre_regular_insertion): Also insert
a PHI node when the expression is available on all edges
and we insert at most one copy from a constant.

* gcc.dg/tree-ssa/ssa-pre-34.c: New testcase.

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-08-15 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

--- Comment #40 from Mikael Morin  ---
Harald, I have just closed the followup PR110419.
I think this PR can be closed as well, or is there something left to be done?

[Bug testsuite/110419] [14 regression] new test case gfortran.dg/value_9.f90 in r14-2050-gd130ae8499e0c6 fails

2023-08-15 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110419

Mikael Morin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #21 from Mikael Morin  ---
The value_9.f90 FAIL is gone after the commit:
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793383.html
whereas it was present before:
https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/793371.html

Closing as FIXED.
Thanks to all who contributed to the resolution.

[Bug tree-optimization/110991] [14 Regression] Dead Code Elimination Regression at -O2 since r14-1135-gc53f51005de

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110991

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=91975
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Richard Biener  ---
Fixed.

[Bug tree-optimization/110991] [14 Regression] Dead Code Elimination Regression at -O2 since r14-1135-gc53f51005de

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110991

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:bcdbedb3e6083ad01d844ed97cf19645c1ef6568

commit r14-3216-gbcdbedb3e6083ad01d844ed97cf19645c1ef6568
Author: Richard Biener 
Date:   Mon Aug 14 09:31:18 2023 +0200

tree-optimization/110991 - unroll size estimate after vectorization

The following testcase shows that we are bad at identifying inductions
that will be optimized away after vectorizing them because SCEV doesn't
handle vectorized defs.  The following rolls a simpler identification
of SSA cycles covering a PHI and an assignment with a binary operator
with a constant second operand.

PR tree-optimization/110991
* tree-ssa-loop-ivcanon.cc (constant_after_peeling): Handle
VIEW_CONVERT_EXPR , handle more simple IV-like SSA cycles
that will end up constant.

* gcc.dg/tree-ssa/cunroll-16.c: New testcase.

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

Richard Biener  changed:

   What|Removed |Added

   Keywords||build
   Target Milestone|--- |14.0

[Bug c++/111019] [12/13/14 Regression] Optimizer incorrectly assumes variable is not changed while change happens through another pointer

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.4
 Ever confirmed|0   |1
Summary|Optimizer incorrectly   |[12/13/14 Regression]
   |assumes variable is not |Optimizer incorrectly
   |changed while change|assumes variable is not
   |happens through another |changed while change
   |pointer |happens through another
   ||pointer
   Keywords||alias, needs-bisection,
   ||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-15

--- Comment #1 from Richard Biener  ---
I can confirm this with -O3 (but not -O2), even when adding
-fno-tree-loop-optimize.  -fno-strict-aliasing avoids the issue (but I can't
see anything obviously wrong in the sources).  The requirement for the bug to
show up is inlining Target::~Target; marking it always_inline makes the bug
appear at -O2 or -O1 -fstrict-aliasing as well.  The bug is that we loop
endlessly in the Target::~Target loop.

Needs further analysis, bisection might help.

[Bug libgomp/111024] New: libgomp: FAILs with oldish libnuma/libmemkind

2023-08-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111024

Bug ID: 111024
   Summary: libgomp: FAILs with oldish libnuma/libmemkind
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

Re commit r14-2462-g450b05ce54d3f08c583c3b5341233ce0df99725b "libgomp: Use
libnuma for OpenMP's partition=nearest allocation trait" (plus commit
r14-2514-g407d68daed00e040a7d9545b2a18aa27bf93a106 "libgomp: Fix allocator
handling for Linux when libnuma is not available"), on our amdfury2, nvidia-4a
systems (not necessarily an exhaustive list), I see:

+PASS: libgomp.c/../libgomp.c-c++-common/alloc-11.c (test for excess
errors)
+FAIL: libgomp.c/../libgomp.c-c++-common/alloc-11.c execution test

alloc-11.exe: src/memkind_interleave.c:54: memkind_interleave_init_once:
Assertion `err == 0' failed.

+PASS: libgomp.c/../libgomp.c-c++-common/alloc-12.c (test for excess
errors)
+FAIL: libgomp.c/../libgomp.c-c++-common/alloc-12.c execution test

alloc-12.exe: src/memkind_interleave.c:54: memkind_interleave_init_once:
Assertion `err == 0' failed.

Same for C++.

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x775b2921 in __GI_abort () at abort.c:79
#2  0x775a248a in __assert_fail_base (fmt=0x77729750
"%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x77169682 "err == 0",
file=file@entry=0x7716986a "src/memkind_interleave.c", line=line@entry=54,
function=function@entry=0x77169890 "memkind_interleave_init_once") at
assert.c:92
#3  0x775a2502 in __GI___assert_fail (assertion=0x77169682 "err
== 0", file=0x7716986a "src/memkind_interleave.c", line=54,
function=0x77169890 "memkind_interleave_init_once") at assert.c:101
#4  0x771691c4 in memkind_interleave_init_once () from
/usr/lib/x86_64-linux-gnu/libmemkind.so.0
#5  0x77972907 in __pthread_once_slow (once_control=0x7736d72c,
init_routine=0x77169180 ) at
pthread_once.c:116
#6  0x77166d95 in memkind_malloc () from
/usr/lib/x86_64-linux-gnu/libmemkind.so.0
#7  0x77bbed04 in omp_aligned_alloc (alignment=1, size=4,
allocator=6309568) at [...]/libgomp/config/linux/../../allocator.c:521
#8  0x00400fd9 in main () at
source-gcc/libgomp/testsuite/libgomp.c-c++-common/alloc-11.c:182

Note that the failure is in libmemkind; the commits that trigger it are for
libnuma support.

Both these systems have Ubuntu bionic libnuma1 2.0.11-2.1ubuntu0.1, libmemkind0
1.1.0-0ubuntu1.  Assuming this is similar to the latter's upstream v1.1.0,
that's
,
where, per assembly-level debugging, we've got 'err = MEMKIND_ERROR_MALLCTL'
from '‎memkind_arena_create_map‎';
.

Might be a bug in libnuma/libmemkind, fixed in later versions (we're not seeing
these FAILs on other systems), or is the bug on our side, due to calling into
libnuma/libmemkind with some unexpected parameters, for example?

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:ecb95399f43873e1f34119ac260bbea2ef358e53

commit r14-3215-gecb95399f43873e1f34119ac260bbea2ef358e53
Author: Kewen Lin 
Date:   Tue Aug 15 03:01:20 2023 -0500

Makefile.in: Make recog.h depend on $(TREE_H) [PR111021]

Commit r14-3093 introduced a random build failure on
build/gencondmd.cc building.  Since r14-3093 make recog.h
include tree.h, which further includes (depends on) some
files that are generated during the building, such as:
all-tree.def, tree-check.h etc, when building file
build/gencondmd.cc, the build can fail if these dependences
are not ready.  So this patch is to teach this dependence.

Thank Jan-Benedict Glaw for testing this!

PR bootstrap/111021

gcc/ChangeLog:

* Makefile.in (RECOG_H): Add $(TREE_H) as dependence.

[Bug target/111023] New: missing extendv4siv4hi (and friends)

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

Bug ID: 111023
   Summary: missing extendv4siv4hi (and friends)
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
vec_unpacks_lo via

pcmpgtw %xmm0, %xmm1
movdqa  %xmm0, %xmm2
punpcklwd   %xmm1, %xmm2

and that would trivially extend to the required pattern - just the
input is v4hi instead of v8hi.

Other related patterns are probably missing as well, where we can do
vec_unpack[s]_lo we should be able to implement [zero_]extend.
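
As a rough scalar model of why the quoted sequence already gives a sign
extension (a sketch only, assuming %xmm1 holds zero before the pcmpgtw, which
the snippet above does not show): the compare yields an all-ones mask for
negative lanes, and the interleave places that mask in the upper half of each
32-bit result. The function name extend_v4hi_to_v4si is made up for this
illustration and is not a GCC pattern or intrinsic.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of pcmpgtw + punpcklwd restricted to four 16-bit lanes.
std::array<std::int32_t, 4>
extend_v4hi_to_v4si(const std::array<std::int16_t, 4> &x)
{
    std::array<std::int32_t, 4> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        // pcmpgtw: mask lane is all ones where 0 > x[i], else all zeros.
        std::uint16_t mask = (0 > x[i]) ? 0xFFFFu : 0x0000u;
        std::uint16_t lo = static_cast<std::uint16_t>(x[i]);
        // punpcklwd: interleave value word (low) with mask word (high).
        std::uint32_t bits = (static_cast<std::uint32_t>(mask) << 16) | lo;
        out[i] = static_cast<std::int32_t>(bits);
    }
    return out;
}
```

The only difference from vec_unpacks_lo in this model is that the input has
four lanes rather than eight, which matches the remark that just the input
mode changes.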

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #14 from Kewen Lin  ---
(In reply to Kewen Lin from comment #13)
> (In reply to Richard Biener from comment #12)
> > I think a "too broad" dependence isn't bad.  The cris specific solution also
> > looks manageable, though I wonder what's special about x-protos.h, knowing
> > very little of that area.
> 
> Thanks for the comments, I'll follow up with the former if no objections. :)
> 
> About {port}-protos.h, the documentation says "The ‘machine.h’ header is
> included very early in GCC’s standard sequence of header files, while
> ‘machine-protos.h’ is included late in the sequence. Thus ‘machine-protos.h’
> can include declarations referencing types that are not defined when
> ‘machine.h’ is included, specifically including those from ‘rtl.h’ and
> ‘tree.h’."

insn-automata.cc has both tm.h ($port.h) and tm_p.h (${port}-protos.h), the
order looks like:

#define IN_TARGET_CODE 1
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "alias.h"
#include "tree.h"
#include "varasm.h"
#include "stor-layout.h"
#include "calls.h"
#include "rtl.h"
#include "memmodel.h"
#include "tm_p.h"
#include "insn-config.h"
#include "recog.h"
#include "regs.h"
#include "output.h"
#include "insn-attr.h"
#include "diagnostic-core.h"
#include "flags.h"
#include "function.h"
#include "emit-rtl.h"

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #13 from Kewen Lin  ---
(In reply to Richard Biener from comment #12)
> I think a "too broad" dependence isn't bad.  The cris specific solution also
> looks manageable, though I wonder what's special about x-protos.h, knowing
> very little of that area.

Thanks for the comments, I'll follow up with the former if no objections. :)

About {port}-protos.h, the documentation says "The ‘machine.h’ header is
included very early in GCC’s standard sequence of header files, while
‘machine-protos.h’ is included late in the sequence. Thus ‘machine-protos.h’
can include declarations referencing types that are not defined when
‘machine.h’ is included, specifically including those from ‘rtl.h’ and
‘tree.h’."

One special thing I noticed is that functions used in md files but defined in
${port}.cc or elsewhere need to be declared in {port}-protos.h, since the
generated insn-*.cc files include tm_p.h, which includes {port}-protos.h (and
some others in tm_p_file_list, like the tm-preds.h target-specific predicates).

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #12 from Richard Biener  ---
I think a "too broad" dependence isn't bad.  The cris specific solution also
looks manageable, though I wonder what's special about x-protos.h, knowing very
little of that area.