[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #11 from Hans-Peter Nilsson  ---
(In reply to Hans-Peter Nilsson from comment #8)
> I might have misunderstood things of course, 
...like still having to include tree.h to get the code_helper class definition.
Doh!

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #10 from Kewen Lin  ---
(In reply to Hans-Peter Nilsson from comment #7)
> Exactly; I'm happy that we seem to be on the same page here.
> 
> I'm testing a patch for CRIS (making the hook function just a wrapper,
> reverting the cris-protos.h change) that may be re-usable for the other
> targets similarly affected.

> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627444.html

Thanks for the suggestion and doing the patch! If Richi & Richard prefer this,
I'll follow up with a patch for the other affected ports.

> I believe you could add a code_helper constructor for another type, maybe
> even "int"?  Then the "default" value for the code_helper argument for the
> legitimate address hook function would just look like "code_helper = 0".
> 
> Maybe that's too error-prone and a pointer-type or some entirely different
> type would be better. I might have misunderstood things of course, but
> that's what I mean by "or something".

OK, thanks for clarifying further. I agree it's doable; I didn't consider it
before because it's a "code" helper: as its comments say, it's for tree codes
and builtin function codes, so it shouldn't support things that aren't "code".
Even for an int that can be equivalent to a tree_code (with an appropriate
assertion check for an unexpected range), it's still easy to question why not
just use tree_code directly (which reads well and needs no range checking
etc.).

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #9 from Hans-Peter Nilsson  ---
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627444.html

[Bug fortran/101602] [F2018] local and local_init are not supported in DO CONCURRENT

2023-08-14 Thread michael at dontknow dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101602

--- Comment #3 from Michael Klemm  ---
The locality specifiers cannot directly map to the OpenMP data-sharing clauses.
While SHARED and LOCAL can be mapped, LOCAL_INIT cannot.  The latter needs to
initialize the variable anew for each iteration of the DO CONCURRENT loop,
while FIRSTPRIVATE will initialize the variable only once per thread that
executes chunks of said loop.  So, the code transformation for that case will
have to be more involved.

There have been discussions in the OpenMP language committee about whether a
LINEAR(x:0) clause can substitute for LOCAL_INIT(x).  That might be one way to
reduce the implementation burden and map everything to OpenMP constructs.
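
The difference between the two initialization rules can be sketched in plain
C++. This is only a sequential model under stated assumptions: the function
names are hypothetical, a "thread" is modelled as one contiguous chunk of
iterations, and no actual OpenMP or Fortran semantics are exercised.

```cpp
#include <vector>

// LOCAL_INIT-like behaviour: the private copy is re-initialized from the
// outer value at the start of every iteration.
std::vector<int> run_local_init_like(int outer_x, int iterations) {
  std::vector<int> seen;
  for (int i = 0; i < iterations; ++i) {
    int x = outer_x;     // fresh copy for this iteration
    seen.push_back(x);   // value observed at iteration start
    x += 1;              // change is discarded when the iteration ends
  }
  return seen;
}

// FIRSTPRIVATE-like behaviour: the private copy is initialized once per
// modelled thread (one chunk), so changes carry over within the chunk.
std::vector<int> run_firstprivate_like(int outer_x, int iterations, int chunk) {
  std::vector<int> seen;
  for (int start = 0; start < iterations; start += chunk) {
    int x = outer_x;     // initialized once for the whole chunk
    for (int i = start; i < start + chunk && i < iterations; ++i) {
      seen.push_back(x);
      x += 1;            // visible to the next iteration of the same chunk
    }
  }
  return seen;
}
```

With an outer value of 10 and four iterations, the first function observes 10
in every iteration, while the second (with a chunk size of 2) observes
10, 11, 10, 11, which is why a plain FIRSTPRIVATE mapping is not enough.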

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #8 from Hans-Peter Nilsson  ---
(In reply to Kewen Lin from comment #6)
> 
> btw, thanks for the suggestion on "defaulting to NULL or something" in ML,
> but I guess a default value like zero rather than ERROR_MARK has to base on
> the assertion that the default value is equal to ERROR_MARK, IMHO it isn't
> quite maintainable.

I believe you could add a code_helper constructor for another type, maybe even
"int"?  Then the "default" value for the code_helper argument for the
legitimate address hook function would just look like "code_helper = 0".

Maybe that's too error-prone and a pointer-type or some entirely different type
would be better. I might have misunderstood things of course, but that's what I
mean by "or something".
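
A minimal sketch of that idea, assuming heavily simplified stand-ins: the
names tree_code_sketch, code_helper_sketch and hook_sees_error_mark are
hypothetical and are not the real GCC definitions. The static_assert spells
out the assumption that 0 is the same value as the error-mark code.

```cpp
// Simplified stand-in for the tree code enumeration (hypothetical).
enum tree_code_sketch { ERROR_MARK_SKETCH = 0, PLUS_EXPR_SKETCH, MINUS_EXPR_SKETCH };

// A default of 0 only means "no code" while this holds.
static_assert(static_cast<int>(ERROR_MARK_SKETCH) == 0,
              "default value 0 must equal the error-mark code");

// Simplified stand-in for a code_helper-like class (hypothetical).
class code_helper_sketch {
public:
  code_helper_sketch(tree_code_sketch code) : rep_(static_cast<int>(code)) {}
  // The extra constructor under discussion: accept a plain int.
  code_helper_sketch(int raw) : rep_(raw) {}
  bool is_error_mark() const { return rep_ == static_cast<int>(ERROR_MARK_SKETCH); }

private:
  int rep_;
};

// Hypothetical hook-like function whose default argument no longer names
// the enumerator itself.
inline bool hook_sees_error_mark(code_helper_sketch ch = 0) {
  return ch.is_error_mark();
}
```

Calling hook_sees_error_mark() with no argument behaves like passing the
error-mark code, but only because of the asserted equality, and the header
still needs the class definition to be visible.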

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #7 from Hans-Peter Nilsson  ---
(In reply to Kewen Lin from comment #6)
> (In reply to Hans-Peter Nilsson from comment #5)
> [...]
> but the concern is that one day some file just includes tm_p.h but
> not recog.h, the issue will show up again. I'm thinking how to deal with the
> problems on ${port}-protos.h, adding $(TREE_H) in TM_P is an alternative,
> but not sure if people would think it's an overkill since the current
> affected ports are:
> [...]

Exactly; I'm happy that we seem to be on the same page here.

I'm testing a patch for CRIS (making the hook function just a wrapper,
reverting the cris-protos.h change) that may be re-usable for the other targets
similarly affected.

[Bug fortran/111022] New: ES0.0E0 format gave ES0.dE0 output with d too high.

2023-08-14 Thread john.harper at vuw dot ac.nz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

Bug ID: 111022
   Summary: ES0.0E0 format gave ES0.dE0 output with d too high.
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: john.harper at vuw dot ac.nz
  Target Milestone: ---

This 3-line Fortran program:

  print "(ES0.0E0)", -666e0
  print "(ES0.0E0)", -666d0
  end program

printed

-6.66000E+2
-6.66000E+2

but I think it should have printed 

-7.E+2
-7.E+2

which is what ifort printed with the same program.
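
As a cross-check of the rounding only (not of Fortran's ES editing rules), the
same value can be formatted in C++ with zero fractional digits. The helper
name below is hypothetical; the "#" flag keeps the decimal point, and C always
prints at least two exponent digits, so the exponent field differs from the
Fortran output.

```cpp
#include <cstdio>
#include <string>

// Format a double in scientific notation with zero digits after the
// decimal point, keeping the decimal point itself.
std::string format_zero_fraction_digits(double value) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "%#.0e", value);
  return std::string(buf);
}
```

format_zero_fraction_digits(-666.0) yields "-7.e+02": the significand rounds
to 7, matching the significand in the output I expected above.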

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

Kewen Lin  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
(In reply to Hans-Peter Nilsson from comment #5)
> (In reply to Hans-Peter Nilsson from comment #4)
> > (In reply to Andrew Pinski from comment #1)
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627435.html
> > 
> > Nope, that just fixes recog.h.  See the quoted error; it's the patched
> > *-protos.h.

Thanks for clarifying, yeah, some ${port}-protos.h includes tree.h in that
commit.

> 
> ...but I'm "guessing" that it will appear to work even for the
> tm-protos.h-on-tree.h dependency, just incidentally.

It can work for this build failure on build/gencondmd.o since the required
dependency on tree.h is covered by the dependency on recog.h, but the concern
is that one day some file includes just tm_p.h but not recog.h, and the
issue will show up again. I'm thinking about how to deal with the problems in
${port}-protos.h; adding $(TREE_H) to TM_P is an alternative, but I'm not sure
if people would think it's overkill since the currently affected ports are:

gcc/config/arm/arm-protos.h:#include "tree.h" /* For ERROR_MARK.  */
gcc/config/cris/cris-protos.h:#include "tree.h" /* For ERROR_MARK.  */
gcc/config/microblaze/microblaze-protos.h:#include "tree.h"  /* For ERROR_MARK.  */
gcc/config/rl78/rl78-protos.h:#include "tree.h"  /* For ERROR_MARK.  */
gcc/config/stormy16/stormy16-protos.h:#include "tree.h"  /* For ERROR_MARK.  */

I checked the existing tm_p_file_list and tm_p_include_list, which are
port-specific, but tm_p_include_list is generated from tm_p_file_list, so it
doesn't seem to be meant for this purpose.

Hi Richi & Richard, what do you think of this?

btw, thanks for the suggestion on "defaulting to NULL or something" in ML, but
I guess a default value like zero rather than ERROR_MARK has to rely on the
assertion that the default value is equal to ERROR_MARK; IMHO it isn't quite
maintainable.

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #5 from Hans-Peter Nilsson  ---
(In reply to Hans-Peter Nilsson from comment #4)
> (In reply to Andrew Pinski from comment #1)
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627435.html
> 
> Nope, that just fixes recog.h.  See the quoted error; it's the patched
> *-protos.h.

...but I'm "guessing" that it will appear to work even for the
tm-protos.h-on-tree.h dependency, just incidentally.

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #4 from Hans-Peter Nilsson  ---
(In reply to Andrew Pinski from comment #1)
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627435.html

Nope, that just fixes recog.h.  See the quoted error; it's the patched
*-protos.h.

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #3 from Hans-Peter Nilsson  ---
(In reply to Kewen Lin from comment #2)
> Thanks for reporting, I think the culprit is r14-3093 instead of r14-3092? 

Not for this PR!

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627435.html
 Ever confirmed|0   |1
   Last reconfirmed||2023-08-15
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #2 from Kewen Lin  ---
Thanks for reporting, I think the culprit is r14-3093 instead of r14-3092? 

I think the other build/gen*.cc builds don't have this issue, since none of
them include recog.h themselves; they emit source files which include
recog.h (those insn-*.cc; I double-checked, they are fine).
build/gencondmd.cc is special: unlike the other build/gen*.cc, it's generated
by genconditions and it includes "recog.h". Only build/gencondmd.o depends on
RECOG_H.

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread hpa at zytor dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

--- Comment #5 from H. Peter Anvin  ---
I don't think source code modifications are a huge problem, but at this point
they require tracking down each individual bit.

As far as trapping implementations are concerned:

1. In deeply embedded implementations, it is entirely possible that
firmware/microcode might be *more* expensive than logic. Although memory arrays
are, of course, very dense, they are still extremely general and RISC-V isn't a
very sparse instruction set.

2. It seems like it almost would require an implementation-specific performance
model. Now, one can validly argue that by setting the cost of unimplemented
instructions to a (near-)infinite value such instructions should never be
generated even if they are "enabled". That might also be a possible avenue for
achieving this.
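
The cost idea in point 2 can be illustrated with a generic selection sketch.
Everything here is an assumption made for illustration: Candidate,
kUnimplementedCost and pick_cheapest are hypothetical names, and this is not
GCC's actual cost interface.

```cpp
#include <limits>
#include <string>
#include <vector>

// One way of implementing an operation, with an estimated cost.
struct Candidate {
  std::string name;
  int cost;
};

// Stand-in for the "(near-)infinite" cost given to unimplemented instructions.
constexpr int kUnimplementedCost = std::numeric_limits<int>::max();

// Return the name of the cheapest candidate that is actually usable,
// or an empty string when every candidate is unimplemented.
std::string pick_cheapest(const std::vector<Candidate> &candidates) {
  std::string best;
  int best_cost = kUnimplementedCost;
  for (const Candidate &c : candidates) {
    if (c.cost < best_cost) {
      best = c.name;
      best_cost = c.cost;
    }
  }
  return best;
}
```

If a single-instruction candidate carries kUnimplementedCost and a longer
software sequence carries a finite cost, the sequence is always chosen, even
though the instruction is nominally "enabled".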

As far as an explosion of subsets, yes, this is really what this means.
Bloating a tiny on-chip control processor both in area and timing to implement
instructions that never actually appear in the code is at best painful.

That being said, I do intend to submit a proposal to the RISC-V ISA folks to
subset the Zbb subset. It is worth noting that there are overlaps between the
Zb* and Zbk* subsets, but the individual intersection sets do not have their
own names.

The Zbb instruction set is particularly noxious (and this is indeed an ISA
definition problem), because it implements multiple things that are, from an
implementation point of view, completely separate and require separate code
paths in the ALU:

§ 1.2.1 Logical with negate
- minimal cost; in fact in some implementations it might have zero or
even negative cost due to decoder simplification.
- Extremely common in embedded operations.

§ 1.2.2 Count leading/trailing zero bits
- Requires dedicated logic.
- ctz and clz have very different uses.
- Typically clz and ctz will not be able to share logic, either,
requiring *two* dedicated units.

§ 1.2.3 Count population
- Requires dedicated logic.
- May be useless depending on what the processor needs.

§ 1.2.4 Integer minimum/maximum
- May be cheap or expensive, depending on if an existing comparator can
be leveraged.
- Quite possibly free or almost free if the AMO instruction set is
already supported in its entirety, as that requires max/min already.

§ 1.2.5 Sign- and zero-extension
§ 1.2.6 Bitwise rotation
- May be very cheap or quite expensive, depending on the implementation
of the shift instructions.

§ 1.2.7 OR combine
- Requires dedicated logic.
- Virtually useless in control processors that do not process text.

§ 1.2.8 Byte-reverse
- Requires dedicated logic.
- These, and some other instructions, are special cases of a bit swap
extension that was proposed in the original bitmanip proposal but was not
included even as a separate set.
- Virtually useless in control processors that do not need to
interface with cross-endian data.


These 8 groups really ought to be given separate names.

Is this going to happen again? Quite likely.

It seems, as you say, that chopping the public ISA to pieces to support every
single use case would seem unlikely.

It really comes down to: out of multiple suboptimal cases (forced hardware
bloat, custom subsets, extremely fine grained public subsets, vendor-hacked
trees that lag behind and/or diverge from upstream), what option is the least
amount of badness?

[Bug bootstrap/111021] [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

--- Comment #1 from Andrew Pinski  ---
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627435.html

[Bug libfortran/105456] Child I/O does not propage iostat

2023-08-14 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105456

Jerry DeLisle  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Assignee|unassigned at gcc dot gnu.org  |jvdelisle at gcc dot gnu.org
 CC||jvdelisle at gcc dot gnu.org
   Last reconfirmed||2023-08-15

--- Comment #3 from Jerry DeLisle  ---
On my list.

[Bug bootstrap/111021] New: [14 Regression] Serial build broken for CRIS, ARM, and others

2023-08-14 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111021

Bug ID: 111021
   Summary: [14 Regression] Serial build broken for CRIS, ARM, and
others
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
CC: linkw at gcc dot gnu.org
  Target Milestone: ---

Since r14-3092-g165b1f6ad1d396 "targhooks: Extend legitimate_address_p with
code_helper [PR110248]", the build has been broken (at least with "GNU Make
4.3") for *serial* builds (no "-j" option) for those targets where that commit
added an include of tree.h, exposed by building for example cris-elf and
arm-eabi (wrapped lines copy-pasted):

g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W
-Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -fno-common
-DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I/srctop/gcc
-I/srctop/gcc/build -I/srctop/gcc/../include
-I/srctop/gcc/../libcpp/include \ -o build/gencondmd.o
build/gencondmd.cc
In file included from /srctop/gcc/tree.h:23,
 from /srctop/gcc/config/arm/arm-protos.h:26,
 from ./tm_p.h:5,
 from build/gencondmd.cc:29:
/srctop/gcc/tree-core.h:145:10: fatal error: all-tree.def: No such file or
directory
  145 | #include "all-tree.def"
  |  ^~
compilation terminated.
make[2]: *** [Makefile:2918: build/gencondmd.o] Error 1

At a glance, (at least) build/gencondmd.o lacks a dependency on the generated
all-tree.def.  But adding only that dependency leads to a similar error for a
missing tree-check.h; looks like TREE_H contains the right bits.

Maybe it's as "simple" as including $(TREE_H) in TM_P (or a new intermediary
macro) or add $(TREE_H) to all build/gen*.o rules where the gen*.cc includes
"tm_p.h" (but within the filter-out for build/gencondmd.o).

I *did* try the latter; simply adding $(TREE_H) (within the filter-out call)
for build/gencondmd.o, and that seemed to work, but it doesn't seem *right* and
may work just by happenstance, as other build/gen*.o seem like they could fail
given unfortunate timing or a Makefile tweak.  It seems proper to leave that
headache to the author of the offending commit.  (I'm going to bail out
for CRIS with a patch to cris-protos.h, wrapping the function that needs to be
prototyped so it doesn't need to refer to ERROR_MARK.)
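
A rough sketch of that wrapping approach, assuming simplified types: the
function names, the integer "address" and the alignment check are all
hypothetical placeholders, not the real CRIS code. The point is only that the
prototype shared through the header never mentions a tree code.

```cpp
// --- What would sit in a "-protos.h"-style header: no tree codes needed. ---
bool port_legitimate_address_p(int mode, long addr, bool strict);

// --- What would sit in the port's .cc file, where tree codes are visible. ---
enum tree_code_sketch { ERROR_MARK_SKETCH = 0, MEM_REF_SKETCH };  // hypothetical

// The original worker keeps its old three-argument signature.
bool port_legitimate_address_p(int mode, long addr, bool strict) {
  (void) mode;                       // unused in this placeholder check
  return !strict || addr % 4 == 0;   // placeholder rule: aligned when strict
}

// The hook with the extended signature is just a wrapper that drops the
// extra code argument and forwards to the worker.
bool port_legitimate_address_hook(int mode, long addr, bool strict,
                                  tree_code_sketch /* unused */) {
  return port_legitimate_address_p(mode, addr, strict);
}
```

Other files that include only the header keep calling the three-argument
function, so they need neither the enumerator nor the generated tree headers.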

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

palmer at gcc dot gnu.org changed:

   What|Removed |Added

 CC||palmer at gcc dot gnu.org

--- Comment #4 from palmer at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #3)
> (In reply to H. Peter Anvin from comment #2)
> > Named subsets are, inherently, designed to make sense toward mass-produced
> > products where the hardware and software are designed (mostly)
> > independently. However, what I mean with "very deep embedded use" is
> > hardware and software being co-designed.
> > 
> > The RISC-V ISA policy is that those are considered vendor-specific subsets
> > and are to be given an X* name; however, gcc obviously needs to be able to
> > understand the meaning of this X* name. At this point there is no way to do
> > without changing the source code in nontrivial ways.
> > 
> > Regardless of if it is done in source code or at runtime, by implementing a
> > fine-grained, preferably table-driven, approach to subsets in gcc then it
> > would be very simple for a hardware implementor to define their custom
> > X-subsets without a lot of surgery to the code, *and* it makes it possible
> > to take it one step further and allowing custom (or newly defined! - there
> > have been multiple instances already of new subsets of existing instructions
> > defined a posteriori) instruction subsets to be defined in a configuration
> > file.
> 
> I 100% disagree here. Because if you do this there would be a huge
> explosion of what is and is not considered a subset. THIS is why it should
> be defined at the ISA level instead. Why just CTZ for ZBB? What next, just
> bseti or bexti of ZBS?
> 
> defining the specific set during your development is different from a
> production compiler really. GCC should aim for production compiler quality
> even for highly embedded targets.

IMO adding some config file for custom subsets is going to make more headaches
than it fixes.  For a while we had args like "-mno-div", but that's kind of
hacky and we eventually ended up with Zmmul to handle it -- having an external
config file controlling this would expose a lot of interface surface we don't
have a sane way to test.

If vendors want a custom subset then they can make one, it'll just be called
"X${vendor}${subset}".  We've already got a few forks/subsets floating around,
look at the T-Head and Ventana stuff.  For a few instructions it's pretty
mechanical, aside from fixing whatever fallout comes from splitting off the
subset.

We do currently require (IIRC we still didn't write this down) some amount of
public commitment to hardware availability to take that code, but if that's the
problem we should try and figure something out.  It's certainly a pain for
vendors to keep in-development trees around, but we're trading that off with
upstream pain -- I've found these sorts of subsets drift around until the HW
actually ships, so we don't want to end up stuck keeping around subsets that
didn't ship.

Vendors also have the option of just implementing all the instructions (via
some trap or microcode or whatever), thus turning this into a performance
problem.  That sort of just trades one problem for another, but we've got some
examples of this as well (SiFive traps on a bunch of stuff, for example).

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

--- Comment #3 from Andrew Pinski  ---
(In reply to H. Peter Anvin from comment #2)
> Named subsets are, inherently, designed to make sense toward mass-produced
> products where the hardware and software are designed (mostly)
> independently. However, what I mean with "very deep embedded use" is
> hardware and software being co-designed.
> 
> The RISC-V ISA policy is that those are considered vendor-specific subsets
> and are to be given an X* name; however, gcc obviously needs to be able to
> understand the meaning of this X* name. At this point there is no way to do
> without changing the source code in nontrivial ways.
> 
> Regardless of if it is done in source code or at runtime, by implementing a
> fine-grained, preferably table-driven, approach to subsets in gcc then it
> would be very simple for a hardware implementor to define their custom
> X-subsets without a lot of surgery to the code, *and* it makes it possible
> to take it one step further and allowing custom (or newly defined! - there
> have been multiple instances already of new subsets of existing instructions
> defined a posteriori) instruction subsets to be defined in a configuration
> file.

I 100% disagree here. Because if you do this there would be a huge explosion
of what is and is not considered a subset. THIS is why it should be defined at
the ISA level instead. Why just CTZ for ZBB? What next, just bseti or bexti of
ZBS?

defining the specific set during your development is different from a
production compiler really. GCC should aim for production compiler quality even
for highly embedded targets.

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread hpa at zytor dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

--- Comment #2 from H. Peter Anvin  ---
Named subsets are, inherently, designed to make sense toward mass-produced
products where the hardware and software are designed (mostly) independently.
However, what I mean with "very deep embedded use" is hardware and software
being co-designed.

The RISC-V ISA policy is that those are considered vendor-specific subsets and
are to be given an X* name; however, gcc obviously needs to be able to
understand the meaning of this X* name. At this point there is no way to do
without changing the source code in nontrivial ways.

Regardless of if it is done in source code or at runtime, by implementing a
fine-grained, preferably table-driven, approach to subsets in gcc then it would
be very simple for a hardware implementor to define their custom X-subsets
without a lot of surgery to the code, *and* it makes it possible to take it one
step further and allow custom (or newly defined! - there have been multiple
instances already of new subsets of existing instructions defined a posteriori)
instruction subsets to be defined in a configuration file.

[Bug fortran/110996] RISC-V vector Fortran: SEGV ICE during parsing

2023-08-14 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110996

--- Comment #4 from JuzheZhong  ---
(In reply to Jeremy Bennett from comment #3)
> @JuzheZhong I believe this is in someway related to RVV.  If I remove `v'
> from the march:
> 
> riscv64-unknown-linux-gnu-gfortran -march=rv64gc -mabi=lp64d -c -Ofast
> testcase.f90
> 
> The output I get is correct:
> 
> testcase.f90:6:20:
> 
> 6 |SUBROUTINE c(d) e
>   |1
> Error: Syntax error in SUBROUTINE statement at (1)
> 
> Why does adding `v' to the -march string cause a SEGV?

I couldn't reproduce the issue as described. What I see is that GCC hits an ICE
even without 'v' in -march, and I have no idea why.

[Bug target/110985] RISC-V: Incorrect code gen for RVV VLS

2023-08-14 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110985

Li Pan  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Li Pan  ---
Closing this bug as the fix has already been committed.

[Bug c/96952] __builtin_thread_pointer support cannot be probed

2023-08-14 Thread hpa at zytor dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96952

H. Peter Anvin  changed:

   What|Removed |Added

 CC||hpa at zytor dot com

--- Comment #10 from H. Peter Anvin  ---
Is this bug still relevant? RISC-V doesn't even seem to support disabling tls
support, and __builtin_thread_pointer() appears to be properly supported. So it
would presumably be up to any remaining target that doesn't have
__builtin_thread_pointer() (or not in all configurations) to verify that
__has_builtin(__builtin_thread_pointer) evaluates to false?
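
A probe along those lines could look like the sketch below. It assumes a
compiler that provides __has_builtin (with a fallback macro for those that do
not); HAVE_THREAD_POINTER_BUILTIN and thread_pointer_or_null are hypothetical
names. Whether the probe answers correctly on every target is exactly what
this bug is about.

```cpp
// Fallback for compilers that lack __has_builtin entirely.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_thread_pointer)
#define HAVE_THREAD_POINTER_BUILTIN 1
// Use the builtin when the compiler reports it as available.
inline void *thread_pointer_or_null() { return __builtin_thread_pointer(); }
#else
#define HAVE_THREAD_POINTER_BUILTIN 0
// Otherwise report "not available" instead of failing to compile.
inline void *thread_pointer_or_null() { return nullptr; }
#endif
```

On a target where the builtin is missing but the probe still evaluates to
true, this would fail to compile, which is the case the remaining targets
would need to verify.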

[Bug target/111020] RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

--- Comment #1 from Andrew Pinski  ---
This sounds more like something which should be designed at the ISA level, and
since RISC-V is an open source ISA, it should be discussed at that level ...

There are already extensions which are designed this way too. E.g. Zmmul which
is a subset of the M extension.

[Bug target/111020] New: RFE: RISC-V: ability to cherry-pick additional instructions

2023-08-14 Thread hpa at zytor dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111020

Bug ID: 111020
   Summary: RFE: RISC-V: ability to cherry-pick additional
instructions
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hpa at zytor dot com
  Target Milestone: ---

For very deeply embedded use, it is sometimes highly desirable to control the
instruction set on a very fine grained basis. For example, the Zbb extension
contains a mixture of things that most likely require separate functional
units. However, as an example, the ctz instruction is highly useful to speed up
interrupt latency in designs that do not have vectorized interrupt handling
(which is, in its most basic form, a dedicated ctz unit.) It would be massive
hardware bloat to require the full Zbb set to add this one instruction.

Once the instruction is added, though, one would like to be able to use it as
fully as possible.

This, obviously, creates binaries that are specifically tuned toward a single
processor implementation, but that is pretty much the essence of deeply
embedded, where in the normal case the entire software stack from the OS to
application is linked together in a single binary, or at the very least
compiled together, often from a single source tree.

As far as object code compatibility is concerned, this is very much a
"programmer beware" situation. There is no need for heroics in terms of tagging
objects with the exact instruction set, for example.

[Bug middle-end/110986] [14 Regression] aarch64 has support for conditional not (and vectorized conditional not ) after r14-3110-g7fb65f10285

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110986

--- Comment #17 from Andrew Pinski  ---
For the scalar version we could match:
(set (reg:SI 106)
(xor:SI (neg:SI (ne:SI (reg:CC 66 cc)
(const_int 0 [0])))
(reg:SI 107 [ MEM[(int *)a_14(D) + ivtmp.14_9 * 1] ])))

In the backend to do the csinv but it might not catch all of them because
combine does not match everything.

[Bug tree-optimization/103035] [meta-bug] YARPGen bugs

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103035
Bug 103035 depends on bug 110954, which changed state.

Bug 110954 Summary: [14 Regression] Wrong code with -O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110954

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/110954] [14 Regression] Wrong code with -O0

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110954

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Andrew Pinski  ---
Fixed.

[Bug c++/111019] New: Optimizer incorrectly assumes variable is not changed while change happens through another pointer

2023-08-14 Thread boskidialer at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111019

Bug ID: 111019
   Summary: Optimizer incorrectly assumes variable is not changed
while change happens through another pointer
   Product: gcc
   Version: 12.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: boskidialer at gmail dot com
  Target Milestone: ---

Created attachment 55737
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55737&action=edit
Smallest reproduction i managed to create

Hello,

I was investigating one of the test failures in the product, a failure that
only happens when compiling with -O3 or -O2, but not with -O1 or without any
optimization.

GCC Version:

dashboard@dashboard-desktop:~$ /usr/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/usr/bin/g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12.3.0-1ubuntu1~23.04' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~23.04)

Reproduction:

/usr/bin/g++ gcc-err.cpp -O3 -o gcc-err.out && ./gcc-err.out

(gcc-err.cpp is provided as the attachment to the bug report).

The issue is that the generated executable freezes when compiled with -O3 or
-O2, but not when compiled with -O1 or without any optimization.

To verify that the issue is not on my end, I pasted the reproduction code and
the required compiler flags into godbolt:
https://godbolt.org/z/Ez7vrz77W - and it shows that the compiled program times
out, which confirms that the generated executable is stuck. After changing the
compiler options on the right side of the godbolt site to -O1, the code still
compiles, but the executable now finishes within the time limit and outputs a
single line, "test".

Based on the debugging I did on this code, it looks to be related to the
Target::~Target code, which contains the `while (this->next)` loop. I suspect
the compiler or optimizer incorrectly assumes that the value of `this->next` is
unchanged between iterations. That is not true: `n` is set to `this->next`,
which points to the second item in the doubly linked list, so
`n->previous == this`, and the `n->previous->next = ...` line therefore changes
the value of `this->next` indirectly.
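
For readers without the attachment, the aliasing described above can be
sketched roughly as follows. This is not the attached reproducer: the `Node`
type and the `unlink_followers` function are hypothetical stand-ins for the
Target class and its destructor loop, and the sketch is not claimed to trigger
the miscompilation. It only shows why `head->next` has to be re-read on every
iteration.

```cpp
#include <cassert>

// Hypothetical node layout; the real Target class from the attachment is not
// reproduced here.
struct Node {
    Node* previous = nullptr;
    Node* next = nullptr;
};

// Detach every node that follows `head`, one node per iteration.
void unlink_followers(Node* head) {
    assert(head != nullptr);
    while (head->next) {                  // must be re-read on every iteration
        Node* n = head->next;
        if (n->previous)
            n->previous->next = n->next;  // n->previous == head, so this writes head->next
        if (n->next)
            n->next->previous = n->previous;
        n->previous = nullptr;
        n->next = nullptr;
    }
}
```

With a three-node list a <-> b <-> c, calling `unlink_followers(&a)` should
terminate after two iterations, because each iteration moves `a.next` one node
further along through the `n->previous->next` store.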

When generating the assembly from the given reproduction using `/usr/bin/g++
-masm=intel gcc-err.cpp -O3 -S -o gcc-err.S`, the instructions produced seem to
be incorrect: the loop never re-checks whether the value of `this->next`
changed before the next iteration:

.L21:
        mov     rcx, QWORD PTR [rax]
        mov     rdx, QWORD PTR 8[rax]
        test    rcx, rcx
        je      .L19                       // if (n->previous)
        mov     QWORD PTR 8[rcx], rdx
        mov     rdx, QWORD PTR 8[rax]      //   n->previous->next = n->next;
.L19:
        test    rdx, rdx
        je      .L20                       // if (n->next)
        mov     QWORD PTR [rdx], rcx       //   n->next->previous = n->previous;
.L20:
        xor     edx, edx
        movups  XMMWORD PTR [rax], xmm0
        mov     QWORD PTR 16[rax], rdx
        jmp     .L21

When any external function call, barrier (like 'asm
volatile("":::"memory")') or more complex code is added, the compiler generates
correct code for the loop:

.L18:
        mov     rax, QWORD PTR 8[rbx]
        test    rax, rax
        je      .L74                       // quits the loop if `this->next == nullptr`
        mov     rcx, QWORD PTR [rax]
        mov     rdx, QWORD PTR 8[rax]
        test    rcx, rcx
   

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-08-14 Thread tmgross at umich dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Trevor Gross  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tmgross at umich dot edu

--- Comment #95 from Trevor Gross  ---
Just as a heads up: there is an ongoing conversation at the x86 psABI about
adjusting `_BitInt(128)` to have the same alignment as `__int128`, which would
help address some of the issues mentioned here. Please join in the discussion
if you have any comments: https://groups.google.com/g/x86-64-abi/c/-JeR9HgUU20

[Bug c++/111018] lexical interpretation of friendship rules depends on whether the friend function has a definition

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111018

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58993

--- Comment #4 from Andrew Pinski  ---
I am going to leave this one open only because of what is mentioned in the
commit that changed behavior here:
r13-465-g4df735e01e3199978

[Bug c++/111018] lexical interpretation of friendship rules depends on whether the friend function has a definition

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111018

--- Comment #3 from Andrew Pinski  ---
https://wg21.link/cwg1699

[Bug c++/111018] lexical interpretation of friendship rules depends on whether the friend function has a definition

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111018

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|rejects-valid               |

--- Comment #2 from Andrew Pinski  ---
MSVC rejects both cases 

[Bug c++/111018] lexical interpretation of friendship rules depends on whether the friend function has a definition

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111018

--- Comment #1 from Andrew Pinski  ---
So if we make a definition outside of the class:
```
  template 
  auto bar(Self s) -> decltype(::T::foo(s))
  {

  }
```
Then clang rejects the code too ...
So 

[Bug target/106671] aarch64: BTI instruction are not inserted for cross-section direct calls

2023-08-14 Thread jiangning.liu at amperecomputing dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671

--- Comment #11 from Jiangning Liu ---
Hi Wilco,

> "it means we will need a linker optimization to remove those redundant BTIs 
> (eg. by changing them into NOPs)"

That would only be a performance optimization, right? If we don't care about
performance, the linker doesn't need to turn them into NOPs, right? It could
still be useful if we only do this operation for a specific module.

Thanks,
-Jiangning

[Bug c++/111018] New: lexical interpretation of friendship rules depends on whether the friend function has a definition

2023-08-14 Thread eric.niebler at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111018

            Bug ID: 111018
           Summary: lexical interpretation of friendship rules depends on
                    whether the friend function has a definition
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eric.niebler at gmail dot com
  Target Milestone: ---

Starting in gcc 12, transitive friendship is extended to friend functions that
are defined lexically within the body of the friend class. E.g.:


struct S;

struct T {
  friend struct S;
private:
  template 
  static void foo(T) {}
};

struct S {
  template 
  friend auto bar(Self s) -> decltype(::T::foo(s)) { // (1)
    return ::T::foo(s);
  }
};

Prior to gcc-12, the commented line would have been rejected, but now it is
accepted. Great, it brings gcc in line with clang and is arguably more
sensible.

HOWEVER, it does NOT work if the friend function is merely declared but not
defined. For instance, this is still an error:

struct S;

struct T {
  friend struct S;
private:
  template 
  static void foo(T) {}
};

struct S {
  template 
  friend auto bar(Self s) -> decltype(::T::foo(s)); // NO FN DEFINITION
};

int main() {
  S s;
  using T = decltype(bar(s)); // ERROR: T::foo is private
}


This is very confusing and inconsistent behavior.

See: https://godbolt.org/z/WT9P37Wba

[Bug fortran/101602] [F2018] local and local_init are not supported in DO CONCURRENT

2023-08-14 Thread marshall.ward at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101602

Marshall Ward  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marshall.ward at gmail dot com

--- Comment #2 from Marshall Ward  ---
I've tested this in 13.0.0 and it appears that `local`, `local_init`, and
`shared` are still not supported in `do concurrent` and produce syntax errors.

Has there been any activity on this issue recently?  If not, could anyone
comment on the proposal by Jeff Hammond about utilizing the analogous OpenMP
constructs?  Would this be a feasible option?

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

--- Comment #39 from CVS Commits  ---
The master branch has been updated by Mikael Morin:

https://gcc.gnu.org/g:564b637f4a32883cbf3c3019d3cfcf0b0aec9b82

commit r14-3207-g564b637f4a32883cbf3c3019d3cfcf0b0aec9b82
Author: Mikael Morin 
Date:   Mon Aug 14 21:51:54 2023 +0200

fortran: Fix length one character dummy arg type [PR110419]

Revision r14-2171-g8736d6b14a4dfdfb58c80ccd398981b0fb5d00aa
changed the argument passing convention for length 1 value dummy
arguments to pass just the single character by value.  However, the
procedure declarations weren't updated to reflect the change in the
argument types.
This change does the missing argument type update.

The change of argument types generated an internal error in
gfc_conv_string_parameter with value_9.f90.  Indeed, that function is
not prepared for bare character type, so it is updated as well.

The condition guarding the single character argument passing code
is loosened to not exclude non-interoperable kind (this fixes
a regression with c_char_tests_2.f03).

Finally, the constant string argument passing code is updated as well
to extract the single char and pass it instead of passing it as
a length one string.  As the code taking care of non-constant arguments
was already doing this, the condition guarding it is just removed.

With these changes, value_9.f90 passes on 32 bits big-endian powerpc.

PR fortran/110360
PR fortran/110419

gcc/fortran/ChangeLog:

* trans-types.cc (gfc_sym_type): Use a bare character type for length
one value character dummy arguments.
* trans-expr.cc (gfc_conv_string_parameter): Handle single character
case.
(gfc_conv_procedure_call): Don't exclude interoperable kinds
from single character handling.  For single character dummy arguments,
extend the existing handling of non-constant expressions to constant
expressions.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_usage_13.f03: Update tree dump patterns.

[Bug testsuite/110419] [14 regression] new test case gfortran.dg/value_9.f90 in r14-2050-gd130ae8499e0c6 fails

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110419

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Mikael Morin:

https://gcc.gnu.org/g:564b637f4a32883cbf3c3019d3cfcf0b0aec9b82

commit r14-3207-g564b637f4a32883cbf3c3019d3cfcf0b0aec9b82
Author: Mikael Morin 
Date:   Mon Aug 14 21:51:54 2023 +0200

fortran: Fix length one character dummy arg type [PR110419]

Revision r14-2171-g8736d6b14a4dfdfb58c80ccd398981b0fb5d00aa
changed the argument passing convention for length 1 value dummy
arguments to pass just the single character by value.  However, the
procedure declarations weren't updated to reflect the change in the
argument types.
This change does the missing argument type update.

The change of argument types generated an internal error in
gfc_conv_string_parameter with value_9.f90.  Indeed, that function is
not prepared for bare character type, so it is updated as well.

The condition guarding the single character argument passing code
is loosened to not exclude non-interoperable kind (this fixes
a regression with c_char_tests_2.f03).

Finally, the constant string argument passing code is updated as well
to extract the single char and pass it instead of passing it as
a length one string.  As the code taking care of non-constant arguments
was already doing this, the condition guarding it is just removed.

With these changes, value_9.f90 passes on 32 bits big-endian powerpc.

PR fortran/110360
PR fortran/110419

gcc/fortran/ChangeLog:

* trans-types.cc (gfc_sym_type): Use a bare character type for length
one value character dummy arguments.
* trans-expr.cc (gfc_conv_string_parameter): Handle single character
case.
(gfc_conv_procedure_call): Don't exclude interoperable kinds
from single character handling.  For single character dummy arguments,
extend the existing handling of non-constant expressions to constant
expressions.

gcc/testsuite/ChangeLog:

* gfortran.dg/bind_c_usage_13.f03: Update tree dump patterns.

[Bug fortran/110995] segfault for function in declaration of module function

2023-08-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110995

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-14
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

A reduced testcase giving the same failure may give some insight:

module strings
  implicit none
contains
  pure function charArrToString(data)
    character, intent(in) :: data(:)
!   character(size(data))   :: charArrToString ! OK
    character(ubound(data,1))   :: charArrToString ! ICE if used in main
    charArrToString = ""
  end function charArrToString
end module strings

program test_str
  use strings
  implicit none
  character :: c(2) = (/ 'T', 'T' /)
  write(*,*) "write module-", charArrToString(c),  "-" ! ICE
  write(*,*) "write module-", charArrToString2(c), "-" ! OK
contains
  pure function charArrToString2(data)
    character, intent(in) :: data(:)
!   character(size(data))   :: charArrToString2 ! OK
    character(ubound(data,1))   :: charArrToString2 ! OK
    charArrToString2 = ""
  end function charArrToString2
end program test_str

So (apparently):
- if the function definition is in the same compilation unit as where it is
  used, there is no problem.
- if the function interface is passed via a module file, some declarations
  work (e.g. using the SIZE intrinsic), but some don't (e.g. using a pure
  user-defined function, or some other intrinsics).

[Bug fortran/87326] [F18] Support the NEW_INDEX= specifier in the FORM TEAM statement

2023-08-14 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87326

--- Comment #8 from anlauf at gcc dot gnu.org ---
(In reply to Nathan Weeks from comment #7)
> (In reply to anlauf from comment #6)
> > (In reply to Nathan Weeks from comment #5)
> > > (In reply to Brad Richardson from comment #3)
> > > > Was there any more progress on this? I've just run into it.
> > > > 
> > > > FYI I'm trying implement a polymorphic send/receive:
> > > > https://gitlab.com/everythingfunctional/communicator
> > > 
> > > The FSF copyright assignment ended up being an unexpectedly-difficult
> > > hurdle at the time. I could try again if there is interest, though it
> > > would also require some effort to rework the original patch for GCC 14.
> > > If another contributor were willing to submit a clean-room
> > > implementation, that may be more expedient.
> > 
> > Besides the copyright assignment there is also the possibility to use the
> > Developer's Certificate of Origin sign-off:
> > 
> > https://gcc.gnu.org/dco.html
> 
> Is that an option in this case? I was originally advised to pursue a
> copyright assignment:
> 
> https://gcc.gnu.org/pipermail/fortran/2019-January/051674.html

That requirement was changed in 2021:

https://gcc.gnu.org/pipermail/gcc/2021-June/236182.html

See also:

https://gcc.gnu.org/contribute.html

[Bug c++/110513] Invalid use of incomplete type std::bool_constant inside requires expression

2023-08-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110513

Patrick Palka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ed at catmur dot uk

--- Comment #4 from Patrick Palka  ---
*** Bug 111016 has been marked as a duplicate of this bug. ***

[Bug c++/111016] Confusing "used in its own initializer" for non-dependent ad-hoc constraint

2023-08-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111016

Patrick Palka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
dup of PR110513 I think

*** This bug has been marked as a duplicate of bug 110513 ***

[Bug analyzer/110543] RFE: Add optional trim of the analyzer diagnostics through system headers.

2023-08-14 Thread vultkayn at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110543

Benjamin Priour  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Benjamin Priour  ---
Fixed by the above patch

[Bug c++/110127] -fimplicit-constexpr leads to extremely slow and memory intensive compilation

2023-08-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110127

Patrick Palka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |13.3
             Status|NEW                         |RESOLVED

--- Comment #3 from Patrick Palka  ---
Fixed for GCC 13.3 after the backport r13-7713-gd3088f0ed25fe7.

[Bug c++/55004] [meta-bug] constexpr issues

2023-08-14 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
Bug 55004 depends on bug 110127, which changed state.

Bug 110127 Summary: -fimplicit-constexpr leads to extremely slow and memory 
intensive compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110127

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

[Bug c++/111008] '>' in a lambda as a template argument causes a syntax error

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111008

--- Comment #6 from Andrew Pinski  ---
Here is another valid C++20 example that also causes issues, this time with
`>>` rather than `>`:
```
typedef int b;
templatestruct F1{};
templatestruct F2{};
F10)}> a;
constexpr int t = 3;
F2>0}>{}> a0;
```

[Bug analyzer/110543] RFE: Add optional trim of the analyzer diagnostics through system headers.

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110543

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Benjamin Priour:

https://gcc.gnu.org/g:ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83

commit r14-3205-gce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83
Author: benjamin priour 
Date:   Mon Aug 14 17:36:21 2023 +0200

analyzer: New option fanalyzer-show-events-in-system-headers [PR110543]

This patch introduces -fanalyzer-show-events-in-system-headers,
disabled by default.

This option reduces the noise of the analyzer emitted diagnostics
when dealing with system headers.
The new option only affects the display of the diagnostics,
but doesn't hinder the actual analysis.

Given a diagnostics path diving into a system header in the form
[
  prefix events...,
  system header call,
    system header entry,
    events within system headers...,
  system header return,
  suffix events...
]
then disabling the option (either by default or explicitly)
will shorten the path into:
[
  prefix events...,
  system header call,
  system header return,
  suffix events...
]
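
The shortening described above can be sketched as a simple filter. This is a
rough model only, not the code in diagnostic-manager.cc: the `Event` struct and
`prune_system_header_events` are hypothetical names, and the sketch assumes the
call and return events are already flagged as lying outside the system header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event record; the analyzer's real event classes carry far more
// state than this.
struct Event {
    std::string text;
    bool in_system_header;  // assumption: call/return events are flagged false
};

// Keep only the events that are not located inside a system header.
std::vector<Event> prune_system_header_events(const std::vector<Event>& path) {
    std::vector<Event> kept;
    for (const Event& e : path)
        if (!e.in_system_header)
            kept.push_back(e);
    assert(kept.size() <= path.size());  // pruning never adds events
    return kept;
}
```

Applied to the six-entry path shown above, this leaves the four entries of the
shortened path in their original order.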

Signed-off-by: benjamin priour 

gcc/analyzer/ChangeLog:

PR analyzer/110543
* analyzer.opt: Add new option.
* diagnostic-manager.cc
(diagnostic_manager::prune_path): Call prune_system_headers.
(prune_frame): New function that deletes all events in a frame.
(diagnostic_manager::prune_system_headers): New function.
* diagnostic-manager.h: Add prune_system_headers declaration.

gcc/ChangeLog:

PR analyzer/110543
* doc/invoke.texi: Add documentation of
fanalyzer-show-events-in-system-headers

gcc/testsuite/ChangeLog:

PR analyzer/110543
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers-default.C:
New test.
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers-no.C:
New test.
* g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C:
New test.

[Bug libstdc++/110990] `format_to_n` returns wrong value

2023-08-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110990

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Jonathan Wakely  ---
Fixed for 13.3, thanks for the report.

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-08-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #18 from Jonathan Wakely  ---
Should be fixed for real now. Thanks for the report and the patch.

Looks like I managed to mess up the authorship of the gcc-13 backport though
(because I squashed together my incorrect commit and your correct one for the
backport). I'll fix it in the ChangeLog file after that is regenerated
overnight, but it will stay wrong in Git. Sorry.

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #17 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:559341b5b5a30448362f3f205cd1bf043a919945

commit r13-7722-g559341b5b5a30448362f3f205cd1bf043a919945
Author: Jonathan Wakely 
Date:   Fri Aug 11 18:10:29 2023 +0100

libstdc++: Avoid problematic use of log10 in std::format [PR110860]

If abs(__v) is smaller than one, the result will be of the
form 0.x. It is only if the magnitude is large that more digits
are needed before the decimal dot.

This uses frexp instead of log10 which should be less expensive
and have sufficient precision for the desired purpose.

It removes the problematic cases where log10 will be negative or not
fit in an int.
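
As a rough illustration of the idea (not the libstdc++ code itself), the binary
exponent from frexp can bound the number of digits before the decimal point
without calling log10. The function name `integral_digits_bound` and the
rational 4004/13301, which is slightly above log10(2), are choices made for
this sketch.

```cpp
#include <cassert>
#include <cmath>

// Upper bound on the number of decimal digits before the decimal point of a
// finite value. 4004/13301 approximates log10(2) from above; the constant is
// this sketch's assumption, not something stated in the commit message.
int integral_digits_bound(double v) {
    assert(std::isfinite(v));
    int exp = 0;
    std::frexp(v, &exp);   // |v| = m * 2^exp with m in [0.5, 1); exp == 0 for v == 0
    if (exp <= 0)
        return 1;          // |v| < 1 prints as "0.x"
    return 1 + exp * 4004 / 13301;
}
```

Because the exponent is always a small integer for finite inputs, the
arithmetic stays inside `int`, and values below one never produce a negative
digit count.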

Signed-off-by: Paul Dreik 

libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Use frexp instead
of log10.

(cherry picked from commit 2d2b05f0691799f03062bf5c436462f14cad3e7c)

[Bug libstdc++/110990] `format_to_n` returns wrong value

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110990

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:8f82863df33c63693a355d9244c315ef5cb2158e

commit r13-7721-g8f82863df33c63693a355d9244c315ef5cb2158e
Author: Jonathan Wakely 
Date:   Fri Aug 11 23:02:44 2023 +0100

libstdc++: Fix std::format_to_n return value [PR110990]

When writing to a contiguous iterator, std::format_to_n(out, n, ...)
always returns out + n, even if it wrote fewer than n characters to the
iterator.

The problem is in the _M_finish() member function of the _Iter_sink
specialization for contiguous iterators. _M_finish() calls _M_overflow()
to update its count of characters written, so it can return the count of
characters that would be written if there was room. But _M_overflow()
assumes it's only called when the buffer is full, and so switches to the
internal buffer. _M_finish() then thinks that if the internal buffer is
in use, we already wrote at least n characters and so returns out+n as
the output position.

We can fix the problem by adding a check in _M_overflow() so that we
don't update the count and switch to the internal buffer unless we've
run out of room, i.e. _M_unused().size() is zero. The caller then needs
to be prepared for _M_count not being the final total, and so add
_M_used.size() to it.

However, there's not actually any need for _M_finish() to call
_M_overflow() to get the count. We now need to use _M_count and
_M_used.size() to get the total anyway so _M_overflow() doesn't help
with that. And we don't need to use _M_overflow() to flush unwritten
characters to the output, because the specialization for contiguous
iterators always writes directly to the output without buffering (except
when we've exceeded the maximum number of characters, in which case we
want to discard the buffered characters anyway). So _M_finish() can be
simplified and can avoid calling _M_overflow().

This change also fixes some member functions of other sink classes to
only call _M_overflow() when there are characters in the buffer, which
is needed to meet _M_overflow's precondition that _M_used().size()!=0.
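
The intended contract can be modelled in a few lines. This is a simplified
stand-in, not the _Iter_sink implementation: `bounded_result` and
`bounded_copy` are hypothetical names, and the sketch copies plain text instead
of formatting arguments. It shows the two values the caller expects back: the
position just past the characters actually written, and the total number of
characters that would have been written with unlimited room.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical analogue of std::format_to_n_result for a char* output.
struct bounded_result {
    char* out;         // one past the last character actually written
    std::size_t size;  // characters that would be written with unlimited room
};

// Write at most n characters of text to out and report both positions.
bounded_result bounded_copy(char* out, std::size_t n, std::string_view text) {
    assert(out != nullptr || n == 0);
    std::size_t written = 0;
    for (char c : text)
        if (written < n)
            out[written++] = c;  // excess characters are counted but discarded
    return {out + written, text.size()};
}
```

When the text is shorter than n, the returned position is `out + text.size()`
rather than `out + n`, which is the case the bug report is about.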

libstdc++-v3/ChangeLog:

PR libstdc++/110990
* include/std/format (_Seq_sink::get): Only call _M_overflow if
its precondition is met.
(_Iter_sink::_M_finish): Likewise.
(_Iter_sink::_M_overflow): Only switch to the
internal buffer after running out of space.
(_Iter_sink::_M_finish): Do not use _M_overflow.
(_Counting_sink::count): Likewise.
* testsuite/std/format/functions/format_to_n.cc: Check cases
where the output fits into the buffer.

(cherry picked from commit 003016a40844701c48851020df672b70f3446bdb)

[Bug fortran/110677] UBSAN error: load of value 1818451807, which is not a valid value for type 'expr_t' when compiling pr49213.f90

2023-08-14 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110677

--- Comment #2 from Martin Jambor  ---
I have proposed a fix on the mailing list:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627379.html

...and also posted it to the Fortran mailing list:
  https://gcc.gnu.org/pipermail/fortran/2023-August/059687.html

[Bug c++/110216] tuple_size requirements for structured binding has not been updated after DR 2386

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110216

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill:

https://gcc.gnu.org/g:1a43af04dd62b80f45700f94ed241347263ed773

commit r14-3204-g1a43af04dd62b80f45700f94ed241347263ed773
Author: gnaggnoyil 
Date:   Sat Aug 12 16:16:52 2023 +0800

c++: follow DR 2386 and update implementation of get_tuple_size [PR110216]

DR 2386 updated the tuple_size requirements for structured binding and
it now requires tuple_size to be considered only if
std::tuple_size names a complete class type with member value. GCC
before this patch does not follow the updated requirements, and this
patch is intended to implement it.
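
The condition can be approximated at library level with a detection trait.
This is only an illustration of the rule, not what get_tuple_size does inside
the compiler; `has_tuple_size_value` and `Plain` are hypothetical names
introduced for the sketch.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// True only when std::tuple_size<T> is a complete type with a member named
// value.
template <class T, class = void>
struct has_tuple_size_value : std::false_type {};

template <class T>
struct has_tuple_size_value<T, std::void_t<decltype(std::tuple_size<T>::value)>>
    : std::true_type {};

// Hypothetical type whose tuple_size specialization is complete but has no
// value member.
struct Plain { int x; int y; };
namespace std {
template <> struct tuple_size<Plain> {};
}

static_assert(has_tuple_size_value<std::pair<int, int>>::value, "tuple-like");
static_assert(!has_tuple_size_value<Plain>::value, "complete, but no value member");
static_assert(!has_tuple_size_value<int>::value, "tuple_size<int> is incomplete");
```

Under the updated wording, a type like `Plain` is not treated as tuple-like
even though its tuple_size specialization is complete, so a structured binding
is expected to bind its data members instead.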

(jason) Accepting pseudonym sign-off because a change this small is not
legally significant for copyright.

DR 2386
PR c++/110216

gcc/cp/ChangeLog:

* decl.cc (get_tuple_size): Update implementation for DR 2386.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/decomp10.C: Update expected error for DR 2386.
* g++.dg/cpp1z/pr110216.C: New test.

Signed-off-by: gnaggnoyil 
Reviewed-by: Jason Merrill 

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #16 from CVS Commits  ---
The master branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:2d2b05f0691799f03062bf5c436462f14cad3e7c

commit r14-3202-g2d2b05f0691799f03062bf5c436462f14cad3e7c
Author: Paul Dreik 
Date:   Mon Aug 14 15:42:33 2023 +0100

libstdc++: Avoid problematic use of log10 in std::format [PR110860]

If abs(__v) is smaller than one, the result will be of the
form 0.x. It is only if the magnitude is large that more digits
are needed before the decimal dot.

This uses frexp instead of log10 which should be less expensive
and have sufficient precision for the desired purpose.

It removes the problematic cases where log10 will be negative or not
fit in an int.

Signed-off-by: Paul Dreik 

libstdc++-v3/ChangeLog:

PR libstdc++/110860
* include/std/format (__formatter_fp::format): Use frexp instead
of log10.

[Bug middle-end/111017] New: [OpenMP] Wrong code with non-rectangular loop nest

2023-08-14 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111017

            Bug ID: 111017
           Summary: [OpenMP] Wrong code with non-rectangular loop nest
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

At a glance, the following looks fine to me - but it fails with UBSAN as
follows:


test.c:10:15: runtime error: index -42 out of bounds for type 'int [1024]'
test.c:10:8: runtime error: index -42 out of bounds for type 'int [1024]'
=
==619433==ERROR: AddressSanitizer: stack-buffer-overflow on address
0x7f2942e01060 at pc 0x00401856 bp 0x7f29415fecb0 sp 0x7f29415feca8
READ of size 4 at 0x7f2942e01060 thread T1
test.c:10:22: runtime error: index -4210096 out of bounds for type 'int [1024]'


#define DIM 32
#define N (DIM*DIM)

int
main() {
  int a[N] = {}, b[N] = {}, c[N];
#pragma omp parallel for collapse(2)
  for (int i = 0; i < DIM; i++) {
    for (int j = (i*DIM); j < (i*DIM + DIM); j++) {
      c[j] = a[j] + b[j];
    }
  }
}


Longer version using 'target teams loop', which segfaults here (w/o offloading
configured + w/o UBSAN):

https://github.com/SOLLVE/sollve_vv/blob/master/tests/5.0/teams_loop/test_target_teams_loop_collapse.c

[Bug c++/111016] New: Confusing "used in its own initializer" for non-dependent ad-hoc constraint

2023-08-14 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111016

            Bug ID: 111016
           Summary: Confusing "used in its own initializer" for
                    non-dependent ad-hoc constraint
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ed at catmur dot uk
  Target Milestone: ---

#include <concepts>
struct S { int i; };
static_assert(requires(S s) { requires std::destructible; });

In file included from :1:
include/c++/14.0.0/concepts:148:13:   required for the satisfaction of
'destructible'
include/c++/14.0.0/concepts:148:38: error: the value of
'std::__detail::__destructible' is not usable in a constant
expression
  148 | concept destructible = __detail::__destructible<_Tp>;
  |~~^~~
include/c++/14.0.0/concepts:127:22: note:
'std::__detail::__destructible' used in its own initializer
  127 |   constexpr bool __destructible = __destructible_impl<_Tp>;
  |  ^~

Obviously this is IFNDR, but it would be nice to emit a diagnostic which gives
some better clue to what is going on (e.g. "warning: constraint is
non-dependent"). Unfortunately both clang and MSVC accept with no diagnostic,
making this look like a gcc bug.

[Bug tree-optimization/110988] [14 regression] ICE when building 523.xalancbmk_r with pgo and lto

2023-08-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:882af290c137dfab5d99b88e6dbecc5e75d85a0b

commit r14-3201-g882af290c137dfab5d99b88e6dbecc5e75d85a0b
Author: Jan Hubicka 
Date:   Mon Aug 14 17:55:33 2023 +0200

Avoid division by zero in fold_loop_internal_call

My patch to fix the profile after folding an internal call is missing a check
for the case where the profile was already zero before if-conversion.

gcc/ChangeLog:

PR gcov-profile/110988
* tree-cfg.cc (fold_loop_internal_call): Avoid division by zero.

[Bug tree-optimization/111012] [14 Regression] Dead Code Elimination Regression at -O3 since r14-573-g69f1a8af45d

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111012

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-14
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski  ---
So the difference here is basically doing the following manually:
```
#if 0
static char i(int k) {
  if (k)
    bar173_();
  c <= 0 >= b;
  if (k)
    return c;
  return c;
}
#else
static char i(int k) {
  if (k)
    bar173_();
  c <= 0 >= b;
  int t;
  if (k)
    t = c;
  else
    t = c;
  return t;
}
#endif
```
and we used to do some jump threading previously, such that the load of `c`
would be in the `if (k)` branch after the call to `bar173_()`. The only pass
that moves it like that is PRE, which happens later, probably too late for jump
threading to enable the optimization.

I have not looked into why this makes a difference, either. Maybe because there
is another jump threading opportunity where we know c is 0 ...

Confirmed.

[Bug tree-optimization/111013] [14 Regression] Dead Code Elimination Regression at -O2 since r14-338-g1dd154f6407

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111013

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-08-14
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/111013] [14 Regression] Dead Code Elimination Regression at -O2 since r14-338-g1dd154f6407

2023-08-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111013

--- Comment #1 from Andrew Pinski  ---
The only difference phiopt makes is:
GCC 13:
```
  _19 = d.8_18 & 1;
  _6 = (bool) _19;
  _21 = (int) _6;
```

vs trunk:
```
  _19 = d.8_18 & 1;
  _33 = _19 != 0;
  _21 = (int) _33;
```
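
For readers less used to GIMPLE dumps, here is a rough C analogue of the two forms. It is only a sketch: the function names are made up, and a C `(bool)` cast is not exactly a GIMPLE conversion. It shows that both compute the same value once the input is masked to a single bit.

```c
#include <stdbool.h>

/* GCC 13 form: convert the masked bit to bool, then to int.  */
int
via_bool_cast (unsigned d)
{
  unsigned masked = d & 1;
  bool b = (bool) masked;
  return (int) b;
}

/* Trunk form: compare the masked bit against zero, then convert to int.  */
int
via_compare (unsigned d)
{
  unsigned masked = d & 1;
  bool b = masked != 0;
  return (int) b;
}
```

The results are identical, so the regression has to come from how later passes treat the comparison form, not from a change in value.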

[Bug fortran/110996] RISC-V vector Fortran: SEGV ICE during parsing

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110996

--- Comment #3 from Jeremy Bennett  ---
@JuzheZhong I believe this is in some way related to RVV.  If I remove `v' from
the march:

riscv64-unknown-linux-gnu-gfortran -march=rv64gc -mabi=lp64d -c -Ofast
testcase.f90

The output I get is correct:

testcase.f90:6:20:

6 |SUBROUTINE c(d) e
  |1
Error: Syntax error in SUBROUTINE statement at (1)

Why does adding `v' to the -march string cause a SEGV?

[Bug target/111010] [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

--- Comment #4 from Rainer Orth  ---
Created attachment 55736
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55736&action=edit
original testcase

I just noticed that the original testcase still fails on trunk.

[Bug middle-end/110994] RISC-V Fortran: Illegal instruction ICE with scalable autovec

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110994

Jeremy Bennett  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Jeremy Bennett  ---
Confirmed the patch resolves this issue, and the code correctly produces a
syntax error.

[Bug middle-end/110989] RISC-V vector ICE due to invalid tree code in GIMPLE vect pass

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110989

Jeremy Bennett  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Jeremy Bennett  ---
Confirmed the patch resolves this issue.

[Bug target/110964] RISC-V vector ICE in expand_cond_len_ternop

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110964

Jeremy Bennett  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Jeremy Bennett  ---
Confirmed the patch resolves this issue.

[Bug middle-end/110962] RISC-V vector Fortran ICE in expand_expr_real_2

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110962

Jeremy Bennett  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Jeremy Bennett  ---
Confirmed patch resolves the issue.

[Bug target/110950] RISC-V vector ICE in expand_const_vector

2023-08-14 Thread jeremy.bennett at embecosm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110950

Jeremy Bennett  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jeremy Bennett  ---
Patch resolves the issue.

[Bug target/111010] [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

--- Comment #3 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
I've just completed a reghunt which identified

commit 4e0b504f26f78ff02e80ad98ebbf8ded3aa6ffa1
Author: Richard Biener 
Date:   Tue Jan 10 13:48:51 2023 +0100

tree-optimization/106293 - missed DSE with virtual LC PHI

as the culprit.

[Bug tree-optimization/111013] [14 Regression] Dead Code Elimination Regression at -O2 since r14-338-g1dd154f6407

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111013

Richard Biener  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org
   Target Milestone|--- |14.0

[Bug tree-optimization/111015] [11/12/13/14 Regression] __int128 bitfields optimized incorrectly to the 64 bit operations

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111015

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
Summary|__int128 bitfields  |[11/12/13/14 Regression]
   |optimized incorrectly to|__int128 bitfields
   |the 64 bit operations   |optimized incorrectly to
   ||the 64 bit operations
  Known to work||7.5.0
   Last reconfirmed||2023-08-14
   Target Milestone|--- |11.5
   Priority|P3  |P2
  Component|rtl-optimization|tree-optimization
 Ever confirmed|0   |1
   Keywords||needs-bisection
  Known to fail||11.4.0, 13.2.0

--- Comment #2 from Richard Biener  ---
Confirmed.

[Bug rtl-optimization/111015] __int128 bitfields optimized incorrectly to the 64 bit operations

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111015

--- Comment #1 from Richard Biener  ---
Created attachment 55735
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55735&action=edit
testcase from godbolt

[Bug rtl-optimization/111015] New: __int128 bitfields optimized incorrectly to the 64 bit operations

2023-08-14 Thread pshevchuk at pshevchuk dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111015

Bug ID: 111015
   Summary: __int128 bitfields optimized incorrectly to  the 64
bit operations
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pshevchuk at pshevchuk dot com
  Target Milestone: ---

godbolt: https://godbolt.org/z/r5d6ToY1z

Basically, a store of one half of a 70-bit bitfield gets completely optimized
away, i.e. for

struct Entry {
    unsigned left : 4;
    unsigned right : 4;
    uint128 key : KEY_BITS;
} data;

the code:

data.left = left;
data.right = right;
data.key = key & KEY_BITS_MASK;

produces the following (amd64):
        andl    $15, %ecx
        salq    $4, %rcx
        andl    $15, %edx
        orq     %rdx, %rcx
        movq    %rdi, %rax
        salq    $8, %rax
        orq     %rax, %rcx
        movq    %rcx, data(%rip)
        andw    $-16384, data+8(%rip)

Critically, at no point is there any attempt to actually initialize data+8.

The problem does not disappear if the bitfields get moved around; it is,
however, very finicky with respect to the size of the bitfields.

-O1 -fstore-merging appears to be close to the smallest set of compilation
options at which it fails.

If you replace -O1 with the list of -O1 optimizations from here:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html , it will start
working correctly, so we probably have a documentation issue.
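
To spell out what a correct store sequence has to write, here is a portable sketch that packs the three fields into two 64-bit words by hand. It assumes KEY_BITS is 70 (per the "70-bit bitfield" above) and the LSB-first layout the amd64 assembly suggests. `packed_entry` and `pack_entry` are hypothetical names, and the key is passed as two halves to avoid relying on `__int128`.

```c
#include <stdint.h>

/* Two 64-bit words standing in for the 16-byte `data` object.  */
struct packed_entry
{
  uint64_t lo; /* bits 0..63: left, right, key bits 0..55 */
  uint64_t hi; /* bits 64..77: key bits 56..69 */
};

/* Pack LEFT (4 bits), RIGHT (4 bits) and a 70-bit key given as
   KEY_LO (key bits 0..63) and KEY_HI (key bits 64..69).  */
struct packed_entry
pack_entry (unsigned left, unsigned right, uint64_t key_lo, uint64_t key_hi)
{
  struct packed_entry e;
  e.lo = (uint64_t) (left & 15)
         | ((uint64_t) (right & 15) << 4)
         | (key_lo << 8);
  /* The 14 key bits that do not fit in the low word.  */
  e.hi = ((key_lo >> 56) | (key_hi << 8)) & 0x3fff;
  return e;
}
```

Under these assumptions the `hi` word corresponds to the 14 bits at data+8. The generated code clears them with `andw $-16384` but never ORs the upper key bits back in.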

[Bug driver/111014] New: do_spec_1 terminates arguments too eagerly when processing spec function

2023-08-14 Thread pexu--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111014

Bug ID: 111014
   Summary: do_spec_1 terminates arguments too eagerly when
processing spec function
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: p...@gcc-bugzilla.mail.kapsi.fi
  Target Milestone: ---

Hi.

do_spec_1 always terminates the current argument (via end_going_arg) if
processing a spec function.

6881  if (processing_spec_function)
6882    end_going_arg ();

If a spec function returns a non-null string it is expanded using do_spec_1.  

7043  funcval = eval_spec_function (func, args, soft_matched_part);
7044  if (funcval != NULL && do_spec_1 (funcval, 0, NULL) < 0)

However, because processing_spec_function is in this case non-zero, there is
always a termination after each nested %-sequence has been processed.

Thus, e.g. "%:version-compare(>= 1.0 -version-compare= %%(do_spec))" and
"%(do_spec)" behave differently if the expanded %-sequences have any trailing
parts:

# cat version_compare_string.spec
*self_spec:
+ --version-compare=1.0 %<-version-compare=*

*do_spec:
-W%{!Werror:no-}error

*cc1_options:
+ %:version-compare(>= 1.0 -version-compare= %%(do_spec)) %(do_spec)

# gcc -c -xc -specs=version_compare_string.spec - < /dev/null
cc1: error: unrecognized command-line option ‘-Wno-’
[...]

In the first case, do_spec is not expanded as "-Wno-error" but as "-Wno-" and
"error".  I wonder if this termination is really necessary; the same effect can
be achieved by simply having a whitespace character follow the spec
function, if concatenation is not desired.
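
The effect can be reproduced with a toy model. This is only a sketch and not code from gcc.cc: `struct piece` and `build_args` are invented for illustration. It treats the spec `-W%{!Werror:no-}error` as three pieces and optionally ends the current argument after each nested %-sequence.

```c
#include <stddef.h>
#include <string.h>

/* One piece of spec text; NESTED marks the result of a nested %-sequence.  */
struct piece
{
  const char *text;
  int nested;
};

/* Join PIECES into OUT, with arguments separated by single spaces.
   If TERMINATE_AFTER_NESTED is nonzero, the current argument ends right
   after every nested piece, mimicking the behaviour described above.  */
void
build_args (const struct piece *pieces, int n, int terminate_after_nested,
            char *out, size_t out_size)
{
  out[0] = '\0';
  for (int i = 0; i < n; i++)
    {
      strncat (out, pieces[i].text, out_size - strlen (out) - 1);
      if (terminate_after_nested && pieces[i].nested && i + 1 < n)
        strncat (out, " ", out_size - strlen (out) - 1);
    }
}
```

With the pieces `"-W"`, `"no-"` (nested) and `"error"`, the model yields `-Wno-error` without the early termination and `-Wno- error` with it, matching the two behaviours in the report.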

[Bug tree-optimization/111013] New: [14 Regression] Dead Code Elimination Regression at -O2 since r14-338-g1dd154f6407

2023-08-14 Thread scherrer.sv at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111013

Bug ID: 111013
   Summary: [14 Regression] Dead Code Elimination Regression at
-O2 since r14-338-g1dd154f6407
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: scherrer.sv at gmail dot com
  Target Milestone: ---

static int b, c, e;
static unsigned d;
static short f;
void foo(void);
void(a)();
static char g(int h) {
  int i = 2749857453;
j:
  a();
k:
  if (b)
    goto j;
  f = 1;
  if ((unsigned)(7 ^ e | (h & d && f)) >= 2)
    i = 0;
  e = 0;
  if (c)
    goto k;
  if (!(i <= 6))
    foo();
  return h;
}
int main() {
  d--;
  b = c = e && g(1);
}

gcc-9ec5d6de735 (trunk) -O2 cannot eliminate the call to foo but
gcc-releases/gcc-13.1.0 -O2 can.
---
gcc-9ec5d6de7355c15b3811150d1581dab5bd489966 -O2 case.c -S -o case.s
- OUTPUT -
main:
.LFB1:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    e(%rip), %eax
        subl    $1, d(%rip)
        testl   %eax, %eax
        jne     .L3
.L2:
        movl    %eax, c(%rip)
        movl    %eax, b(%rip)
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
        .p2align 4,,10
        .p2align 3
.L3:
        .cfi_restore_state
        xorl    %eax, %eax
        call    a
        movl    b(%rip), %edx
        testl   %edx, %edx
        jne     .L3
        movl    d(%rip), %ecx
        movl    e(%rip), %eax
        movl    $-1545109843, %edx
        xorl    %esi, %esi
        movl    c(%rip), %edi
        andl    $1, %ecx
        jmp     .L5
        .p2align 4,,10
        .p2align 3
.L8:
        xorl    %eax, %eax
.L5:
        xorl    $7, %eax
        orl     %ecx, %eax
        cmpl    $2, %eax
        cmovnb  %esi, %edx
        testl   %edi, %edi
        jne     .L8
        xorl    %eax, %eax
        movl    %eax, e(%rip)
        cmpl    $6, %edx
        jg      .L15
.L6:
        movl    $1, %eax
        jmp     .L2
.L15:
        call    foo
        jmp     .L6
-- END OUTPUT -

---
gcc-2b98cc24d6af0432a74f6dad1c722ce21c1f7458 -O2 case.c -S -o case.s
- OUTPUT -
main:
.LFB1:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    e(%rip), %eax
        subl    $1, d(%rip)
        testl   %eax, %eax
        jne     .L3
.L2:
        movl    %eax, c(%rip)
        movl    %eax, b(%rip)
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
        .p2align 4,,10
        .p2align 3
.L3:
        .cfi_restore_state
        xorl    %eax, %eax
        call    a
        movl    b(%rip), %edx
        movl    c(%rip), %eax
        testl   %edx, %edx
        jne     .L3
        testl   %eax, %eax
        je      .L4
.L5:
        jmp     .L5
.L4:
        xorl    %eax, %eax
        movl    %eax, e(%rip)
        movl    $1, %eax
        jmp     .L2
-- END OUTPUT -

---
Bisects to r14-338-g1dd154f6407

[Bug tree-optimization/111012] [14 Regression] Dead Code Elimination Regression at -O3 since r14-573-g69f1a8af45d

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111012

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||pinskia at gcc dot gnu.org

[Bug tree-optimization/111012] New: [14 Regression] Dead Code Elimination Regression at -O3 since r14-573-g69f1a8af45d

2023-08-14 Thread scherrer.sv at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111012

Bug ID: 111012
   Summary: [14 Regression] Dead Code Elimination Regression at
-O3 since r14-573-g69f1a8af45d
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: scherrer.sv at gmail dot com
  Target Milestone: ---

static int b, c;
static char d;
static short e = -1L;
static int *j = 
void foo(void);
void bar150_(void);
void bar173_(void);
static char(a)(char k, char l) { return k + l; }
static void g(unsigned k, int l) {
  if (l)
    if (!k)
      foo();
  if (k)
    bar150_();
}
static const unsigned char h();
static char i(int k) {
  if (k)
    bar173_();
  c <= 0 >= b;
  if (k)
    return c;
  return c;
}
static void f(char k, unsigned) {
  char m = h(8 != c);
  g(m && 8, k);
}
static const unsigned char h(int k) {
  d = i(c);
  *j = a(e, d < k < k && c) ^ k;
  b = 0;
  return c;
}
int main() { f(c, b); }

gcc-9ec5d6de735 (trunk) -O3 cannot eliminate the call to foo but
gcc-releases/gcc-13.1.0 -O3 can.
---
gcc-9ec5d6de7355c15b3811150d1581dab5bd489966 -O3 case.c -S -o case.s
- OUTPUT -
main:
.LFB5:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        pushq   %rbx
        .cfi_def_cfa_offset 24
        .cfi_offset 3, -24
        subq    $8, %rsp
        .cfi_def_cfa_offset 32
        movl    c(%rip), %ebx
        cmpl    $8, %ebx
        setne   %bpl
        testl   %ebx, %ebx
        jne     .L9
        movl    $-2, c(%rip)
        xorl    %eax, %eax
        movl    %eax, b(%rip)
.L4:
        call    bar150_
.L6:
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 24
        xorl    %eax, %eax
        popq    %rbx
        .cfi_def_cfa_offset 16
        popq    %rbp
        .cfi_def_cfa_offset 8
        ret
        .p2align 4,,10
        .p2align 3
.L9:
        .cfi_restore_state
        call    bar173_
        movl    c(%rip), %edx
        movzbl  %bpl, %ebp
        movl    $0, b(%rip)
        movsbl  %dl, %eax
        cmpl    %eax, %ebp
        setg    %al
        movzbl  %al, %eax
        cmpl    %eax, %ebp
        setg    %al
        testl   %edx, %edx
        setne   %dl
        andl    %edx, %eax
        subl    $1, %eax
        movsbl  %al, %eax
        xorl    %ebp, %eax
        movl    %eax, c(%rip)
        testb   %bl, %bl
        je      .L3
        testb   %al, %al
        jne     .L4
        call    foo
        jmp     .L6
.L3:
        testb   %al, %al
        je      .L6
        jmp     .L4
-- END OUTPUT -

---
gcc-2b98cc24d6af0432a74f6dad1c722ce21c1f7458 -O3 case.c -S -o case.s
- OUTPUT -
main:
.LFB5:
        .cfi_startproc
        movl    c(%rip), %eax
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        xorl    %ebx, %ebx
        cmpl    $8, %eax
        setne   %bl
        testl   %eax, %eax
        jne     .L10
.L2:
        cmpl    %eax, %ebx
        setg    %al
        movzbl  %al, %eax
        cmpl    %eax, %ebx
        jg      .L3
        notl    %ebx
.L4:
        movl    %ebx, c(%rip)
        movl    $0, b(%rip)
        call    bar150_
        xorl    %eax, %eax
        popq    %rbx
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
.L3:
        .cfi_restore_state
        cmpl    $1, c(%rip)
        sbbb    %bl, %bl
        xorl    $1, %ebx
        movsbl  %bl, %ebx
        jmp     .L4
.L10:
        call    bar173_
        movsbl  c(%rip), %eax
        jmp     .L2
-- END OUTPUT -

---
Bisects to r14-573-g69f1a8af45d

[Bug c++/108080] ICE: in core_vals, at cp/module.cc:6262 with -fmodule-header

2023-08-14 Thread yagreg7 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108080

--- Comment #7 from Gregory Dushkin  ---
Sorry for splitting the comments, but the issue indeed appears to be tied to
optimization options. When manually specifying -Og, -O1, -O2, -O3, -Os the
compilation is successful. I only observe the ICE with -O0 or no optimization
options.

[Bug middle-end/111009] [12/13/14 regression] -fno-strict-overflow erroneously elides null pointer checks and causes SIGSEGV on perf from linux-6.4.10

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111009

--- Comment #3 from Richard Biener  ---
bool
operator_addr_expr::fold_range (irange &r, tree type,
                                const irange &lh,
                                const irange &rh,
                                relation_trio) const
{
  if (empty_range_varying (r, type, lh, rh))
    return true;

  // Return a non-null pointer of the LHS type (passed in op2).
  if (lh.zero_p ())
    r = range_zero (type);

not sure how this is called, but we can only derive this if the offset
is zero as well, definitely if targetm.addr_space.zero_address_valid,
but I think this is true in general.

  else if (!contains_zero_p (lh))
    r = range_nonzero (type);

and this is only true for TYPE_OVERFLOW_UNDEFINED (type), with
-fwrapv-pointer we could wrap to zero.

That is, it's _not_ GIMPLE undefined behavior to compute &0->bar.

It looks like without -fwrapv-pointer we elide the if (!a) check,
dereferencing it when dso && dso != curr.  I suppose that looks reasonable
with a = &dso->maj: when dso != 0 then a != 0 unless ->maj wraps.
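
To make the address-arithmetic point concrete, here is a small sketch using integer arithmetic in place of real pointers, so the example itself has no undefined behavior. `struct dso_like` is a hypothetical stand-in for the perf structure; the only property relied on is that `maj` is not its first member.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf structure; only the fact that
   `maj` is not the first member matters here.  */
struct dso_like
{
  int pad;
  int maj;
};

/* Address of the `maj` member for an object at address BASE, computed
   with wrapping unsigned arithmetic (as under -fwrapv-pointer).  */
uintptr_t
maj_address (uintptr_t base)
{
  return base + offsetof (struct dso_like, maj);
}
```

A zero base yields a nonzero member address, so a nonzero member address alone says nothing about the base. With wrapping, a base of `-offsetof (struct dso_like, maj)` yields a zero member address, so a nonzero base does not guarantee a nonzero member address either.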

[Bug middle-end/111009] [12/13/14 regression] -fno-strict-overflow erroneously elides null pointer checks and causes SIGSEGV on perf from linux-6.4.10

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111009

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 CC||amacleod at redhat dot com,
   ||rguenth at gcc dot gnu.org
   Priority|P3  |P2
   Target Milestone|--- |12.4
   Last reconfirmed||2023-08-14
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
There's nothing really wrong with lifting the &dso->maj computation; on GIMPLE
&dso->maj is just address arithmetic.

Interestingly we unswitch the loop but only with -fwrapv-pointer.

OK, so the bug looks like we have

 if (&dso->maj != 0)
   for (;;)
     {
       if (!dso) return 1;
       if (dso == curr) return 1;
...
     }

and the if (!dso) test is optimized away since &dso->maj != 0.

That's done by DOM3 here:

Optimizing statement _21 = dso_8(D) == _11;
LKUP STMT _21 = dso_8(D) eq_expr _11
2>>> STMT _21 = dso_8(D) eq_expr _11
Optimizing statement _22 = _21 | _13;
  Replaced '_13' with constant '0'
Applying pattern match.pd:201, gimple-match-10.cc:6318
gimple_simplified to _22 = _21;
  Folded to: _22 = _21;

I don't see where _13 = 0 is entered, this is possibly ranger related:

_13 : CACHE: BB 9 DOM query for _13, found [irange] _Bool VARYING at BB3
797 GORI  recomputation attempt on edge 3->16 for _13 = dso_8(D) == 0B;
798 GORIoutgoing_edge for dso_8(D) on edge 3->16
799 GORI  compute op 1 (a_9) at if (a_9 == 0B)
GORILHS =[irange] _Bool [1, 1]
GORIComputes a_9 = [irange] int * [0, 0] intersect Known range
: [irange] int * VARYING
GORI  TRUE : (799) produces  (a_9) [irange] int * [0, 0]
800 GORI  compute op 1 (dso_8(D)) at a_9 = &dso_8(D)->maj;
GORILHS =[irange] int * [0, 0]
GORIComputes dso_8(D) = [irange] struct dso * [0, 0] intersect
Known range : [irange] struct dso * VARYING
GORI  TRUE : (800) produces  (dso_8(D)) [irange] struct dso * [0,
0]
GORITRUE : (798) outgoing_edge (dso_8(D)) [irange] struct dso * [0,
0]
GORI  TRUE : (797) recomputation (_13) [irange] _Bool [1, 1]

I don't think we can do this.  Andrew?

[Bug c++/108080] ICE: in core_vals, at cp/module.cc:6262 with -fmodule-header

2023-08-14 Thread yagreg7 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108080

--- Comment #6 from Gregory Dushkin  ---
Created attachment 55734
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55734&action=edit
Preprocessed source file /tmp/ccoSmHEM.out (gzip compressed)

[Bug c++/108080] ICE: in core_vals, at cp/module.cc:6262 with -fmodule-header

2023-08-14 Thread yagreg7 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108080

Gregory Dushkin  changed:

   What|Removed |Added

 CC||yagreg7 at gmail dot com

--- Comment #5 from Gregory Dushkin  ---
I have a similar issue with GCC 13.2.1. The weird part about it is that it
seems to depend on the exact value of some compiler flags rather than the
actual source file content. I'm trying to use g++ to compile {fmt} as a module
and this is what I get:

When building {fmt} itself with CMake (go to the repository and run `mkdir build; cd
build; cmake .. -DFMT_MODULE=ON && make`), the command
```
/usr/bin/c++ 
-I/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/include -O3
-DNDEBUG -std=gnu++20 -fvisibility=hidden -fvisibility-inlines-hidden
-fmodules-ts -MD -MT CMakeFiles/fmt.dir/src/fmt.cc.o -MF
CMakeFiles/fmt.dir/src/fmt.cc.o.d -o CMakeFiles/fmt.dir/src/fmt.cc.o -c
/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/src/fmt.cc

```
executes successfully and the compilation is OK.

However, when compiling {fmt} as a CMake subproject, the command is a little
different:
```
cd /home/greg/projects/cpp/4seudo/build/_deps/fmt-build && /usr/bin/c++ 
-I/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/include
-std=c++20 -fmodules-ts -MD -MT _deps/fmt-build/CMakeFiles/fmt.dir/src/fmt.cc.o
-MF CMakeFiles/fmt.dir/src/fmt.cc.o.d -o CMakeFiles/fmt.dir/src/fmt.cc.o -c
/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/src/fmt.cc
```

And _this_ command fails with an ICE:
```
$ cd /home/greg/projects/cpp/4seudo/build/_deps/fmt-build && /usr/bin/c++ 
-I/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/include
-std=c++20 -fmodules-ts -MD -MT _deps/fmt-build/CMakeFiles/fmt.dir/src/fmt.cc.o
-MF CMakeFiles/fmt.dir/src/fmt.cc.o.d -o CMakeFiles/fmt.dir/src/fmt.cc.o -c
/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/src/fmt.cc
-freport-bug

/home/greg/.cpm/fmt/c85658eda638008e7e9290fcd31836ed9f7be1a4/src/fmt.cc:73:8:
internal compiler error: in core_vals, at cp/module.cc:6262
   73 | export module fmt;
  |^~
0x1ad33c8 internal_error(char const*, ...)
???:0
0x6b7b63 fancy_abort(char const*, int, char const*)
???:0
0x7c98b7 trees_out::tree_value(tree_node*)
???:0
0x7c763d trees_out::tree_node(tree_node*)
???:0
0x7c8806 trees_out::core_vals(tree_node*)
???:0
0x7c8d64 trees_out::tree_node_vals(tree_node*)
???:0
0x7c616b trees_out::decl_value(tree_node*, depset*)
???:0
0x7cb47f depset::hash::find_dependencies(module_state*)
???:0
0x7cc422 module_state::write_begin(elf_out*, cpp_reader*, module_state_config&,
unsigned int&)
???:0
0x7dc105 finish_module_processing(cpp_reader*)
???:0
0x7714fd c_parse_final_cleanups()
???:0
0x9444b4 c_common_parse_file()
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
Preprocessed source stored into /tmp/ccoSmHEM.out file, please attach this to
your bugreport.
```

Maybe there is an error in argument parsing?

I will be attaching ccoSmHEM.out to this issue now.

[Bug target/111010] [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-14
   Priority|P3  |P2
 Ever confirmed|0   |1
 Target|i386-pc-solaris2.11 |i386-pc-solaris2.11
   ||i?86-linux-gnu
   Keywords||needs-bisection

--- Comment #2 from Richard Biener  ---
I can confirm on x86_64-unknown-linux-gnu if you add -fno-omit-frame-pointer:

> ../../gcc13-g/gcc/cc1 -quiet t.c -m32 -O3 -fPIC -march=pentium4 
> -mtune=generic -fpreprocessed -w -fno-omit-frame-pointer
t.c: In function 'c':
t.c:21:1: error: unable to find a register to spill
   21 | }
  | ^
t.c:21:1: error: this is the insn:
(insn 109 283 243 9 (set (reg:DI 295)
(ior:DI (ashift:DI (zero_extend:DI (mem:SI (plus:SI (mult:SI (reg:SI
329 [orig:229 _118 ] [229])
(const_int 4 [0x4]))
(reg/f:SI 328 [orig:83 a.0_1 ] [83])) [3
MEM[(unsigned int *)_11]+0 S4 A32]))
(const_int 32 [0x20]))
(zero_extend:DI (mem:SI (plus:SI (mult:SI (reg:SI 294 [233])
(const_int 4 [0x4]))
(reg/f:SI 328 [orig:83 a.0_1 ] [83])) [3 MEM[(unsigned
int *)_15]+0 S4 A32] "t.c":15:7 681 {*concatsidi3_3}
 (expr_list:REG_DEAD (reg/f:SI 328 [orig:83 a.0_1 ] [83])
(expr_list:REG_DEAD (reg:SI 329 [orig:229 _118 ] [229])
(expr_list:REG_DEAD (reg:SI 294 [233])
(nil)
during RTL pass: reload
t.c:21:1: internal compiler error: in lra_split_hard_reg_for, at
lra-assigns.cc:1871
0x15066db _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/space/rguenther/src/gcc-13-branch/gcc/rtl-error.cc:108
0x131bb9a lra_split_hard_reg_for()
/space/rguenther/src/gcc-13-branch/gcc/lra-assigns.cc:1871
0x1314d74 lra(_IO_FILE*)
/space/rguenther/src/gcc-13-branch/gcc/lra.cc:2451
0x12bdb69 do_reload
/space/rguenther/src/gcc-13-branch/gcc/ira.cc:5963
0x12be060 execute
/space/rguenther/src/gcc-13-branch/gcc/ira.cc:6149
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug rtl-optimization/111011] gcc-13 incorrectly decrements by 2. It's twice as fast as gcc-12 and clang!

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111011

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Richard Biener  ---
There's nothing wrong, we unroll the loop.

> ./cc1 -quiet t.c -O3 -fopt-info
t.c:5:15: optimized: loop unrolled 1 times

adding "# foo" to the asm text you'll see

.L2:
#APP
# 6 "t.c" 1
# foo
# 0 "" 2
# 6 "t.c" 1
# foo
# 0 "" 2
#NO_APP
        subq    $2, %rax
        jne     .L2

There's no data dependence on 'count' for the asm.  You can instead use

#include <stdint.h>

int main() {
  int64_t count=27000000000;
  while (count>0) {
    __asm__ __volatile__("" : "=g" (count) : "0" (count) : "memory");
    --count;
  }
  return 0;
}

to get the desired effect.

[Bug target/88160] Error: register save offset not a multiple of 4 only with optimize

2023-08-14 Thread admin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88160

Thorsten Otto  changed:

   What|Removed |Added

 CC||ad...@tho-otto.de

--- Comment #5 from Thorsten Otto  ---
Another possible patch would be:

--- a/gcc/config/m68k/m68k.cc 2023-07-27 10:13:04.0 +0200
+++ b/gcc/config/m68k/m68k.cc 2023-08-13 08:59:00.959510772 +0200
@@ -712,6 +712,14 @@ m68k_option_override (void)
   else
m68k_sched_mac = MAC_NO;
 }
+
+  /*
+   * disable -fcombine-stack-adjustments for coldfire/mshort combination,
+   * which generates wrong CFI offsets.
+   * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88160
+   */
+   if (TARGET_COLDFIRE && TARGET_SHORT && (write_symbols & DWARF2_DEBUG))
+flag_combine_stack_adjustments = 0;
 }

This is only a workaround, but should prevent the bug.

[Bug rtl-optimization/111011] New: gcc-13 incorrectly decrements by 2. It's twice as fast as gcc-12 and clang!

2023-08-14 Thread adam.warner.nz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111011

Bug ID: 111011
   Summary: gcc-13 incorrectly decrements by 2. It's twice as fast
as gcc-12 and clang!
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: adam.warner.nz at gmail dot com
  Target Milestone: ---

(Please fix my guess at the correct component for this bug report)

I'm amused by a ghost in the GCC virtual machine. I'm running this code on a
Debian Linux x86-64 desktop with these software versions:

gcc-12 (Debian 12.3.0-7) 12.3.0
gcc-13 (Debian 13.2.0-2) 13.2.0
gcc (Debian 20230718-1) 14.0.0 20230718 (experimental) [master
r14-2597-g6bab2772dbc]
Debian clang version 17.0.0 (++20230128060150+75153adeda1a-1~exp1)

My CPU is locked at 2.7GHz. It should take a nice round 10 seconds to decrement
2.7x10^10 to zero if each decrement takes one clock cycle.

And indeed it used to:

$ cat countdown.c
#include <stdint.h>

int main() {
  int64_t count=27000000000;
  while (count>0) {
    __asm__ __volatile__("" : : : "memory");
    --count;
  }
  return 0;
}
$ gcc-12 -O3 countdown.c && time ./a.out 

real    0m10.029s
user    0m10.024s
sys     0m0.004s
$ clang-17 -O3 countdown.c && time ./a.out 

real    0m10.032s
user    0m10.030s
sys     0m0.000s


But now it only takes 5 seconds:
$ gcc-13 -O3 countdown.c && time ./a.out 

real    0m5.022s
user    0m5.021s
sys     0m0.001s
$ gcc-snapshot.sh -O3 countdown.c && time ./a.out 

real    0m5.023s
user    0m5.022s
sys     0m0.000s

By disassembling the machine code we can clearly see why:
$ gcc-13 -O3 countdown.c && objdump -d -m i386:x86-64:intel a.out
...
1040 :
1040:   48 b8 00 4e 53 49 06    movabs rax,0x649534e00
1047:   00 00 00 
104a:   66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]
1050:   48 83 e8 02             sub    rax,0x2
1054:   75 fa                   jne    1050 
1056:   31 c0                   xor    eax,eax
1058:   c3                      ret
1059:   0f 1f 80 00 00 00 00    nop    DWORD PTR [rax+0x0]
...
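
A quick sanity check of the numbers above, as a sketch: the one-iteration-per-cycle model is the report's own assumption, and the helper names are made up. The constant 0x649534e00 is 27000000000, and a step of 2 halves the iteration count.

```c
#include <stdint.h>

/* Loop iterations needed to count COUNT down to zero in steps of STEP.  */
uint64_t
iterations (uint64_t count, uint64_t step)
{
  return count / step;
}

/* Expected run time in seconds at HZ cycles per second, assuming one
   loop iteration per clock cycle.  */
double
expected_seconds (uint64_t count, uint64_t step, double hz)
{
  return (double) iterations (count, step) / hz;
}
```

At 2.7 GHz this gives 10 seconds for a step of 1 and 5 seconds for a step of 2, in line with the measured times.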

[Bug target/111010] [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

Rainer Orth  changed:

   What|Removed |Added

  Known to work||12.3.1, 14.0
  Known to fail||13.2.1
   Target Milestone|--- |13.3

[Bug target/111010] [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

--- Comment #1 from Rainer Orth  ---
Created attachment 55733
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55733&action=edit
reduced testcase

[Bug target/111010] New: [13 regression] error: unable to find a register to spill compiling GCDAProfiling.c

2023-08-14 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111010

Bug ID: 111010
   Summary: [13 regression] error: unable to find a register to
spill compiling GCDAProfiling.c
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
  Target Milestone: ---
  Host: i386-pc-solaris2.11
Target: i386-pc-solaris2.11
 Build: i386-pc-solaris2.11

Building current LLVM main with GCC 13.1.0 fails on Solaris/amd64, compiling
compiler-rt/lib/profile/GCDAProfiling.c.  I could reduce that file to the
attached testcase:

$ gcc -O3 -m32 -fPIC -c GCDAProfiling.i -w
GCDAProfiling.i: In function ‘c’:
GCDAProfiling.i:21:1: error: unable to find a register to spill
   21 | }
  | ^
GCDAProfiling.i:21:1: error: this is the insn:
(insn 109 333 249 9 (set (reg:DI 300)
(ior:DI (ashift:DI (zero_extend:DI (mem:SI (plus:SI (mult:SI (reg:SI
377 [orig:229 _118 ] [229])
(const_int 4 [0x4]))
(reg/f:SI 338 [orig:83 a.0_1 ] [83])) [3
MEM[(unsigned int *)_11]+0 S4 A32]))
(const_int 32 [0x20]))
(zero_extend:DI (mem:SI (plus:SI (mult:SI (reg:SI 299 [233])
(const_int 4 [0x4]))
(reg/f:SI 338 [orig:83 a.0_1 ] [83])) [3 MEM[(unsigned
int *)_15]+0 S4 A32] "GCDAProfiling.i":15:7 680 {*concatsidi3_3}
 (expr_list:REG_DEAD (reg:SI 377 [orig:229 _118 ] [229])
(expr_list:REG_DEAD (reg/f:SI 338 [orig:83 a.0_1 ] [83])
(expr_list:REG_DEAD (reg:SI 299 [233])
(nil)

The corresponding cc1 invocation is

cc1 -fpreprocessed GCDAProfiling.i -quiet -m32 -mtune=generic -march=pentium4
-O3 -w -fPIC

The testcase compiles on the gcc-12 branch and trunk, so this is a gcc-13 branch
regression only.

However, it does compile with a Linux/x86_64 gcc 13.1.0.

[Bug middle-end/111009] [12/13/14 regression] -fno-strict-overflow erroneously elides null pointer checks and causes SIGSEGV on perf from linux-6.4.10

2023-08-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111009

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov  ---
Triggered by GIMPLE loop invariant motion lifting

  a_9 = &dso_8(D)->maj;

across a (dso != NULL) test.

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-08-14 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #15 from Jonathan Wakely  ---
Hi Paul, could you please submit the patch to the gcc-patches mailing list for
formal review, CCing the libstd...@gcc.gnu.org list?

Please either complete a copyright assignment to the FSF or use DCO sign-off as
per https://gcc.gnu.org/dco.html so that we can use your patch in GCC.

Thanks!

[Bug tree-optimization/111000] [14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r14-2944-g3d48c11ad08

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111000

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
We are computing 647 >> t even when that would invoke undefined behavior (but
LIM does this already).  Disabling either loop splitting or final value
replacement avoids the miscompile.

The only "interesting" replacement is the PHI defining the guard of the
second loop, thus 'd' after the first loop:

final value replacement:
  _54 = PHI <_4(10)>
 with expr: (d.6_22 + 1) + (d.6_22 <= _53 ? (int) ((unsigned int) _53 -
(unsigned int) d.6_22) : 0)
 final stmt:
  _54 = _67 + _73;

d.6_22 is 'd' at the start of the program, _53 is the loop split adjusted
exit test value, MIN (647 >> t, 7 - (d + 1)).

When disabling vectorization we instead get

0
Aborted (core dumped)

likewise when disabling unswitching.  When disabling early invariant motion,
splitting is disabled as well.  Changing the test to the following, avoiding
the undefined behavior, resolves the miscompile.

int printf(const char *, ...);
volatile int a = 31;
int b, d, e;
int main()
{
  int t = a;
  for (; d <= 6; d++) {
    for (b = 0; b <= 6; b++) {
      if (t >= 30)
        e = d;
      else if (d > (647 >> t))
        e = d;
      else
        e = 0;
    }
  }
  printf("%d\n", e);
  if (e != 6)
    __builtin_abort();
}

so in the end I think we see the effect of hoisting the shift and then
involving it in a complex computation + comparison, eventually enabling
folding that's "invalid" (taking advantage of the undefined behavior).

I will need to look into avoiding the hoisting of shifts (we already rewrite
undefined signed overflow to unsigned arithmetic).  Unfortunately there's no "safe"
way to write shifts without adding computation (but maybe we should do
that, rewrite it to 647 >> min (t, 31)).
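
For illustration, the clamping idea could look like this at source level. This is a minimal sketch, not GCC code: shift_right_clamped is a hypothetical helper, and the extra guard against negative counts goes beyond the min (t, 31) rewrite mentioned above.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: right-shift a non-negative VALUE by COUNT, clamping
   COUNT into the range the C standard defines for int.  */
static int
shift_right_clamped (int value, int count)
{
  const int max_count = (int) (sizeof (int) * CHAR_BIT) - 1;

  assert (value >= 0);    /* keep the sketch to non-negative operands */
  if (count < 0)
    count = 0;            /* assumption: treat negative counts as zero */
  if (count > max_count)
    count = max_count;    /* the min (t, 31) part for a 32-bit int */
  return value >> count;
}
```

The clamp costs a compare and select per shift, which is the added computation the comment refers to.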

[Bug middle-end/111009] New: [12/13/14 regression] -fno-strict-overflow erroneously elides null pointer checks and causes SIGSEGV on perf from linux-6.4.10

2023-08-14 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111009

Bug ID: 111009
   Summary: [12/13/14 regression] -fno-strict-overflow erroneously
elides null pointer checks and causes SIGSEGV on perf
from linux-6.4.10
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

Initially observed the failure as a perf SIGSEGV when running against
r14-3191-g614052dd4ea083.

I hope I did not break it too much when minimizing. Self-contained example:

// $ cat bug.c.c
struct dso {
 struct dso * next;
 int maj;
};

__attribute__((noipa)) static void __dso_id__cmp_(void) {}

__attribute__((noipa))
static int bug(struct dso * d, struct dso *dso)
{
 struct dso **p = &d;
 struct dso *curr = 0;

 while (*p) {
  curr = *p;
  // prevent null deref below
  if (!dso) return 1;
  if (dso == curr) return 1;

  int *a = &dso->maj;
  // null deref
  if (!(a && *a)) __dso_id__cmp_();

  p = &curr->next;
 }
 return 0;
}

__attribute__((noipa))
int main(void) {
 struct dso d = { 0, 0, };
 bug(&d, 0);
}

Triggering the bug:

$ gcc -fno-strict-overflow -O3 bug.c.c -o bug3 && ./bug3
Segmentation fault (core dumped)

$ gcc -fno-strict-overflow -O2 bug.c.c -o bug2 && ./bug2


$ gcc -v
Using built-in specs.
COLLECT_GCC=/<>/gcc-14.0.0/bin/gcc
COLLECT_LTO_WRAPPER=/<>/gcc-14.0.0/libexec/gcc/x86_64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../source/configure --prefix=/<>/gcc-14.0.0
--with-gmp-include=/<>/gmp-6.3.0-dev/include
--with-gmp-lib=/<>/gmp-6.3.0/lib
--with-mpfr-include=/<>/mpfr-4.2.0-12-dev/include
--with-mpfr-lib=/<>/mpfr-4.2.0-12/lib --with-mpc=/<>/libmpc-1.3.1
--with-native-system-header-dir=/<>/glibc-2.38-dev/include
--with-build-sysroot=/ --program-prefix= --enable-lto --disable-libstdcxx-pch
--without-included-gettext --with-system-zlib --enable-checking=release
--enable-static --enable-languages=c,c++ --disable-multilib --enable-plugin
--disable-libcc1 --with-isl=/<>/isl-0.20 --disable-bootstrap
--build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu
--target=x86_64-unknown-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0  (experimental) (GCC)

[Bug tree-optimization/110248] ivopts could under-cost for some addressing modes on len_{load,store}

2023-08-14 Thread jbglaw--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110248

Jan-Benedict Glaw  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jbg...@lug-owl.de

--- Comment #12 from Jan-Benedict Glaw  ---
The second patch (that pulls in tree.h into recog.h) breaks the build for me on
an ordinary amd64 Linux system:

configure   '--with-pkgversion=basepoints/gcc-14-3093-g4a8e6fa8016, built at 1691996332' \
    --prefix=/var/lib/laminar/run/gcc-local/82/toolchain-install \
    --enable-werror-always \
    --enable-languages=all \
    --disable-multilib
make V=1 all-gcc

echo timestamp > s-preds-h
TARGET_CPU_DEFAULT="" \
HEADERS="config/i386/i386-d.h" DEFINES="" \
/bin/bash ../../gcc/gcc/mkconfig.sh tm_d.h
/var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ -std=c++11 -c
  -g -O2   -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H  -DGENERATOR_FILE -I. -Ibuild -I../../gcc/gcc
-I../../gcc/gcc/build -I../../gcc/gcc/../include 
-I../../gcc/gcc/../libcpp/include  \
 -o build/genflags.o ../../gcc/gcc/genflags.cc
/var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ -std=c++11  
-g -O2   -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W
-Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H  -DGENERATOR_FILE -static-libstdc++ -static-libgcc  -o
build/genflags \
build/genflags.o build/rtl.o build/read-rtl.o build/ggc-none.o build/vec.o
build/min-insn-modes.o build/gensupport.o build/print-rtl.o build/hash-table.o
build/sort.o build/read-md.o build/errors.o
../build-x86_64-pc-linux-gnu/libiberty/libiberty.a
/var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ -std=c++11 -c
  -g -O2   -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H  -DGENERATOR_FILE -I. -Ibuild -I../../gcc/gcc
-I../../gcc/gcc/build -I../../gcc/gcc/../include 
-I../../gcc/gcc/../libcpp/include  \
 -o build/genconditions.o ../../gcc/gcc/genconditions.cc
/var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ -std=c++11  
-g -O2   -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W
-Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H  -DGENERATOR_FILE -static-libstdc++ -static-libgcc  -o
build/genconditions \   
build/genconditions.o build/rtl.o build/read-rtl.o build/ggc-none.o
build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o
build/hash-table.o build/sort.o build/read-md.o build/errors.o
../build-x86_64-pc-linux-gnu/libiberty/libiberty.a  
build/genconditions ../../gcc/gcc/common.md ../../gcc/gcc/config/i386/i386.md >
tmp-condmd.cc
/bin/bash ../../gcc/gcc/../move-if-change tmp-condmd.cc build/gencondmd.cc
echo timestamp > s-conditions
build/genpreds -c ../../gcc/gcc/common.md ../../gcc/gcc/config/i386/i386.md >
tmp-constrs.h
/bin/bash ../../gcc/gcc/../move-if-change tmp-constrs.h tm-constrs.h
echo timestamp > s-constrs-h
/var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ -std=c++11 -c
  -g -O2   -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
-DHAVE_CONFIG_H  -DGENERATOR_FILE -I. -Ibuild -I../../gcc/gcc
-I../../gcc/gcc/build -I../../gcc/gcc/../include 
-I../../gcc/gcc/../libcpp/include  \
 -o build/gencondmd.o build/gencondmd.cc
In file included from ../../gcc/gcc/tree.h:23,
 from ../../gcc/gcc/recog.h:24,
 from build/gencondmd.cc:40:
../../gcc/gcc/tree-core.h:145:10: fatal error: all-tree.def: No such file or directory
  145 | #include "all-tree.def"
  |  ^~
compilation 
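
The failing compile above is a generator program, built with -DGENERATOR_FILE, at a point where all-tree.def is not available. One conventional way for a shared header to stay usable in that setting is to keep tree-dependent parts behind a GENERATOR_FILE check. The stand-alone sketch below only illustrates that idea; HAVE_TREE_DEFS and header_uses_tree_defs are invented names, and this is not necessarily the fix that was applied.

```c
/* Stand-in for a shared header: the tree-dependent part is compiled only
   when we are not building a generator program.  */
#ifndef GENERATOR_FILE
# define HAVE_TREE_DEFS 1   /* here a real header would pull in tree.h */
#else
# define HAVE_TREE_DEFS 0   /* generators never see tree-core.h */
#endif

/* Hypothetical probe so the effect of the guard can be observed.  */
static int
header_uses_tree_defs (void)
{
  return HAVE_TREE_DEFS;
}
```

Compiled without -DGENERATOR_FILE the probe returns 1; with it, the tree-dependent part is skipped entirely.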

[Bug tree-optimization/111003] [14 Regression] Dead Code Elimination Regression at -O3 since r14-2161-g237e83e2158

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111003

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|rguenther at suse dot de    |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
So the only additional hoisting is in LIM4

Moving statement
_60 = _13 ^ iftmp.9_18;
(cost 1) out of loop 1.

Moving statement
_62 = _60 == 10;
(cost 21) out of loop 1.


@@ -239,6 +279,8 @@
   goto ; [100.00%]

[local count: 477815112]:
+  _60 = _13 ^ iftmp.9_18;
+  _62 = _60 == 10;

[local count: 4343774241]:
   _9 = a ();
@@ -296,8 +338,6 @@
   goto ; [100.00%]

[local count: 14585209535]:
-  _60 = _13 ^ iftmp.9_18;
-  _62 = _60 == 10;

[local count: 35145083376]:
   # prephitmp_63 = PHI <0(73), _62(14), 0(74)>

and -fdisable-tree-lim4 restores the missed optimization.  The difference
is then in CCP4, which is able to remove the call to foo() when not hoisting
the compare.  When not hoisting, the global range of _60 is stricter:

-  # RANGE [irange] unsigned int [0, 511] MASK 0x1c7 VALUE 0x0
-  _60 = _13 ^ iftmp.9_18;
+  # RANGE [irange] unsigned int [0, 16383] MASK 0x3fff VALUE 0x0
+  _60 = _13 ^ iftmp.9_18;
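
One way to read the two range annotations: assuming MASK marks the bits whose value is unknown and VALUE gives the known bits, MASK 0x1c7 leaves bit 3 known to be zero, so _60 can never equal 10 (binary 1010), whereas MASK 0x3fff leaves that bit open. The sketch below checks exactly that; may_equal is a hypothetical helper, not a GCC function, and the MASK/VALUE interpretation is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* MASK has a 1 for every bit that may vary; VALUE holds the bits that are
   known.  Returns false when constant C contradicts a known bit.  */
static bool
may_equal (uint32_t mask, uint32_t value, uint32_t c)
{
  return (c & ~mask) == (value & ~mask);
}
```

With the stricter mask a comparison such as _60 == 10 can be decided at compile time; with the wider one it cannot.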

[Bug tree-optimization/110991] [14 Regression] Dead Code Elimination Regression at -O2 since r14-1135-gc53f51005de

2023-08-14 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110991

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
             Target|                            |x86_64-*-*
   Target Milestone|---                         |14.0
           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
So the difference is that GCC 14 vectorizes the loop and that vectorized loop
is not completely unrolled because

Loop 1 likely iterates at most 2 times.
Estimating sizes for loop 1
 BB: 3, after_exit: 0
  size:   1 _34 = vect_vec_iv_.15_33 + { 252, 252, 252, 252 };
  size:   0 vect_a.16_35 = VIEW_CONVERT_EXPR(vect_vec_iv_.15_33);
  size:   1 vect_iftmp.17_36 = vect_a.16_35 << 3;
  size:   1 mask__23.18_38 = vect_a.16_35 < { 0, 0, 0, 0 };
  size:   1 vect_iftmp.19_40 = VEC_COND_EXPR ;
  size:   1 ivtmp_44 = ivtmp_43 + 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_44 < 3)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
 BB: 9, after_exit: 1
size: 7-3, last_iteration: 7-3
  Loop size: 7
  Estimated size after unrolling: 8
Not unrolling loop 1: size would grow.

When we still have a loop, there is nothing that can fully elide things.
Without vectorization we have

Loop 2 likely iterates at most 11 times.
Estimating sizes for loop 2
 BB: 10, after_exit: 0
  size:   0 a.2_13 = (signed char) a.6_22;
   Induction variable computation will be folded away.
  size:   2 if (a.2_13 < 0)
   Constant conditional.
 BB: 13, after_exit: 1
 BB: 12, after_exit: 0
  size:   1 _26 = a.6_22 + 255;
   Induction variable computation will be folded away.
  size:   1 ivtmp_27 = ivtmp_4 - 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_27 != 0)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional. 
 BB: 11, after_exit: 0
  size:   1 iftmp.0_12 = a.2_13 << 3;
   Induction variable computation will be folded away.
size: 7-7, last_iteration: 7-7
  Loop size: 7
  Estimated size after unrolling: 1

Unrolling relies on constant_after_peeling, which relies on SCEV, which
doesn't handle vector IVs.

I have a patch improving it to

size: 7-4, last_iteration: 7-4
  Loop size: 7
  Estimated size after unrolling: 6

IIRC I also had a patch more appropriately "propagating" constness at some
point.
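
The three estimates quoted in this comment (8, 1 and 6) can be reproduced with a small calculation. The sketch below is written from memory of how the unroller's size estimate works, so treat the formula, including the 2/3 scaling, as an assumption. The struct and function names are made up, "size: A-B" is read as overall size minus the part eliminated by peeling, and the unroll counts 2 and 11 are taken from the "likely iterates at most" lines.

```c
/* Sizes as printed in the dump: "size: A-B, last_iteration: C-D".  */
struct loop_size_sketch {
  long overall;                               /* A */
  long eliminated_by_peeling;                 /* B */
  long last_iteration;                        /* C */
  long last_iteration_eliminated_by_peeling;  /* D */
};

/* Estimated instruction count after unrolling NUNROLL times.  */
static long
estimated_unrolled_size_sketch (const struct loop_size_sketch *size,
                                long nunroll)
{
  long insns = nunroll * (size->overall - size->eliminated_by_peeling);

  insns += size->last_iteration - size->last_iteration_eliminated_by_peeling;
  insns = insns * 2 / 3;  /* assumed allowance for later cleanups */
  return insns <= 0 ? 1 : insns;
}
```

For the vectorized loop this gives (2 * 4 + 4) * 2 / 3 = 8, which exceeds the loop size of 7, matching "size would grow"; with 7-4 it gives (2 * 3 + 3) * 2 / 3 = 6.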