[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #9 from Xi Ruoyao  ---
And in fact the optimal code for

int t(int x, _Bool y)
{
return x * y;
}

should be

maskeqz $r4,$r4,$r5
jr  $r1 

like

int t(int x, _Bool y)
{
return y ? x : 0;
}
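
As a quick check of the equivalence claimed above, the three spellings below return the same value for any x when y is 0 or 1. This is only an illustrative sketch: the function names are made up for this note, and whether a given GCC version emits maskeqz for each of them is not verified here.

```c
#include <stdbool.h>

/* Multiplication by a _Bool: the form discussed in this PR.  */
int t_mul(int x, bool y) { return x * y; }

/* Conditional form, which selects x or 0.  */
int t_sel(int x, bool y) { return y ? x : 0; }

/* Mask form: -(int) y is all-ones when y is 1 and zero when y is 0.  */
int t_mask(int x, bool y) { return x & -(int) y; }
```

Comparing the assembly for the three functions at -O2 is one way to see which form the backend handles best.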

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #8 from Xi Ruoyao  ---
(In reply to Andrew Pinski from comment #7)
> (In reply to Xi Ruoyao from comment #5)
> > 
> > so we still slightly penalty multiplication.  To me we should code
> > COSTS_N_INSNS (1) + 1 into loongarch_rtx_cost_optimize_size instead of
> > special casing it in loongarch_rtx_costs.
> 
> Oh yes slightly penalty is definitely not going make a huge difference if
> the cost of an mult instruction is worse than an and and an neg.
> 
> > 
> > For the default value (used when -O2) I'll do some micro-benchmark...

I've changed it to

/* Default RTX cost initializer.  */
loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  : fp_add (COSTS_N_INSNS (5)),
fp_mult_sf (COSTS_N_INSNS (5)),
fp_mult_df (COSTS_N_INSNS (5)),
fp_div_sf (COSTS_N_INSNS (8)),
fp_div_df (COSTS_N_INSNS (8)),
int_mult_si (COSTS_N_INSNS (4)),
int_mult_di (COSTS_N_INSNS (4)),
int_div_si (COSTS_N_INSNS (5)),
int_div_di (COSTS_N_INSNS (5)),
branch_cost (6),
memory_latency (4) {}

based on micro-benchmark results.  This fixes the int * _Bool case and int * 17
case.  But for the original test case I still get a multiplication instruction.

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #7 from Andrew Pinski  ---
(In reply to Xi Ruoyao from comment #5)
> 
> so we still slightly penalty multiplication.  To me we should code
> COSTS_N_INSNS (1) + 1 into loongarch_rtx_cost_optimize_size instead of
> special casing it in loongarch_rtx_costs.

Oh yes, a slight penalty is definitely not going to make a huge difference if
the cost of a mult instruction is worse than an and plus a neg.

> 
> For the default value (used when -O2) I'll do some micro-benchmark...

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #6 from Xi Ruoyao  ---
On an LA664 it seems a mul.w instruction costs 4 times as much as a "simple"
instruction like add.w/sub.w/and, and div.w costs 5 times as much.

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #5 from Xi Ruoyao  ---
(In reply to Andrew Pinski from comment #4)
> /* Default RTX cost initializer.  */
> ...
> int_mult_si (COSTS_N_INSNS (1)),
> int_mult_di (COSTS_N_INSNS (1)),
> 
> 
> That seems wrong.
> I suspect you will get other improvements when you touch this.
> 
> E.g.
> ```
> int f(int t)
> {
> return t * 17;
> }
> ```
> Should really be:
> shift followed by an add.
> But currently is just a mult.
> 
> What is interesting is I think -Os cost is the opposite from the -O2 cost ...
> 
> That is -Os produces the better code generation due to the cost for mult
> being set to 4:
> /* RTX costs to use when optimizing for size.  */
> ...
> .int_mult_si_ (4)
> .int_mult_di_ (4)

4 is just COSTS_N_INSNS(1), so in -Os we are making all instructions cost the
same.  This should be correct because in -Os we should minimize the number of
the instructions.  In loongarch_rtx_costs though we have:

case MULT:
  if (float_mode_p)
    *total = loongarch_fp_mult_cost (mode);
  else if (mode == DImode && !TARGET_64BIT)
    *total = (speed
              ? loongarch_cost->int_mult_si * 3 + 6
              : COSTS_N_INSNS (7));
  else if (!speed)
    *total = COSTS_N_INSNS (1) + 1;
  else if (mode == DImode)
    *total = loongarch_cost->int_mult_di;
  else
    *total = loongarch_cost->int_mult_si;
  return false;

so we still slightly penalize multiplication.  To me we should code
COSTS_N_INSNS (1) + 1 into loongarch_rtx_cost_optimize_size instead of special
casing it in loongarch_rtx_costs.

For the default value (used when -O2) I'll do some micro-benchmark...

[Bug middle-end/112917] Most strub execution tests FAIL on SPARC

2023-12-08 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112917

Alexandre Oliva  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |aoliva at gcc dot gnu.org
   Last reconfirmed||2023-12-09

--- Comment #1 from Alexandre Oliva  ---
Hello, Rainer,

The bulk of the documentation about strub is not at the options, which are
indeed mostly developer-oriented implementation details, but at the attribute,
referenced from the first of the strub options.

It's about stack scrubbing, and the execution tests all follow roughly the same
pattern: there's a body that gets a certain string onto the stack, a check
before scrubbing that the string is there, and a check after scrubbing that the
string is no longer there.

I'm afraid I haven't had access to sparc hardware in a very long time; the
sparc machines in the compile farm that I used before have long been down, and
the solaris ones there aren't letting me in for some reason.

I've looked at the asm output for the tests, and I see nothing particularly
wrong.  What's probably happening is that the test_string, stored in the s
buffer within leak_string(), is getting into the stack range that, when the
deferred_at_calls calls strub_leave, is used by strub_leave itself, so it
doesn't get cleared.  sparc is quite stack hungry in this regard, and ISTM
that, if the register window doesn't get flushed, that range won't be
overwritten at all, and the copy of test_string will remain there.

If this theory is correct, this is a severe vulnerability in the stack
scrubbing implementation on sparc.  I'd envisioned overwriting some fixed stack
range after an out-of-line strub_update (hand-coded assembly tail-called from
strub_update could accomplish this), to catch just this kind of situation, but
strub_update has been so lean in stack use that this didn't seem necessary. 
Now, for sparc, this seems to be essential.

Could you please help me confirm this theory?

Since it is likely that GDB would cause register window flushes that wouldn't
occur in normal execution, inspecting the stack range would be little use, but
checking the addresses would likely confirm it.

Here's a debug session (on x86_64) that I'd appreciate if you could mirror on
sparc.  

gdb strub-defer-O1
b 32
run
p /x [7]
b strub.c:103
continue
p /x base
p /x end

If the theory is wrong, [7] will be between base and end, but if it's
correct, it will be above end, and increasing PAD enough in the testcase would
likely be enough to move the string out of the register window saving part of
__strub_leave's frame, and make this (and other tests that define PAD) pass,
confirming what we need for a proper fix.

Thanks,

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #4 from Andrew Pinski  ---
/* Default RTX cost initializer.  */
...
int_mult_si (COSTS_N_INSNS (1)),
int_mult_di (COSTS_N_INSNS (1)),


That seems wrong.
I suspect you will get other improvements when you touch this.

E.g.
```
int f(int t)
{
return t * 17;
}
```
Should really be:
shift followed by an add.
But currently is just a mult.

What is interesting is I think -Os cost is the opposite from the -O2 cost ...

That is -Os produces the better code generation due to the cost for mult being
set to 4:
/* RTX costs to use when optimizing for size.  */
...
.int_mult_si_ (4)
.int_mult_di_ (4)
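
For reference, the "shift followed by an add" form of t * 17 can be written out by hand as below. This is a minimal sketch with made-up function names; it uses unsigned arithmetic so that the shift is well defined for every input, which is an assumption added here and not part of the testcase above.

```c
/* Multiply by 17 with a shift and an add: t * 17 == (t << 4) + t.
   Unsigned arithmetic wraps modulo UINT_MAX + 1, exactly like the
   multiplication it replaces.  */
unsigned int mul17_shift_add(unsigned int t) { return (t << 4) + t; }

/* Reference version that leaves the choice to the compiler.  */
unsigned int mul17_mult(unsigned int t) { return t * 17u; }
```

With a sensible multiplication cost, both functions would be expected to compile to the same shift and add sequence.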

[Bug target/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|needs-bisection |missed-optimization
 Target|loongarch64-*-* |loongson
  Component|tree-optimization   |target

--- Comment #3 from Andrew Pinski  ---
Either r14-1655-g52c92fb3f40050 or r14-1654-g7ceed7e3e29c33 causes the
difference in the gimple level.

BUT
`a * boolean` is Canonical form and expr.cc expands it as `a & -boolean`:
```
  /* Expand X*Y as X&-Y when Y must be zero or one.  */
...
  bool speed = optimize_insn_for_speed_p ();
  int cost = add_cost (speed, mode) + neg_cost (speed, mode);
  struct algorithm algorithm;
  enum mult_variant variant;
  if (CONST_INT_P (op1)
      ? !choose_mult_variant (mode, INTVAL (op1),
                              &algorithm, &variant, cost)
      : cost < mul_cost (speed, mode))
{
  target = bit0_p ? expand_and (mode, negate_rtx (mode, op0),
op1, target)
  : expand_and (mode, op0,
negate_rtx (mode, op1),
target);
  return REDUCE_BIT_FIELD (target);
}
```

Which means the cost model for loongson is wrong here.
Note that it compares the cost of doing + and - to a * here, though it should
be & and -; but on most targets + and & have a similar cost.
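
To make the effect of the cost model concrete, here is a small sketch of the non-constant branch of the condition quoted above. The helper prefer_and_neg is hypothetical, and the add and neg costs are assumed to be one instruction each; only the COSTS_N_INSNS scaling (one instruction costs 4) and the multiplication costs mentioned elsewhere in this thread are taken from the discussion.

```c
#include <stdbool.h>

/* Same scaling as GCC's COSTS_N_INSNS: one simple insn costs 4.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical helper mirroring "cost < mul_cost (speed, mode)":
   X & -Y is used only when add + neg is cheaper than mul.
   The add and neg costs are assumed to be one insn each.  */
bool prefer_and_neg(int mul_cost)
{
  int cost = COSTS_N_INSNS(1) + COSTS_N_INSNS(1); /* add_cost + neg_cost */
  return cost < mul_cost;
}
```

Under these assumptions a multiplication cost of COSTS_N_INSNS (1) gives 8 < 4, so the mult is kept, while COSTS_N_INSNS (4) gives 8 < 16 and the and/neg expansion is chosen.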

[Bug tree-optimization/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug tree-optimization/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

Xi Ruoyao  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-12-09
 Ever confirmed|0   |1

--- Comment #2 from Xi Ruoyao  ---
Self-confirming as this report is actually from Jia Jie.

[Bug tree-optimization/112935] [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

--- Comment #1 from Xi Ruoyao  ---
Note that for 

int t(int x, _Bool y)
{
return x * y;
}

even GCC 13 is generating the sub-optimal mul.w instruction.  So perhaps this
is just a target issue after all...

[Bug tree-optimization/112935] New: [14 Regression] Performance regression in Coremarks crcu8 function

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935

Bug ID: 112935
   Summary: [14 Regression] Performance regression in Coremarks
crcu8 function
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xry111 at gcc dot gnu.org
  Target Milestone: ---

typedef __UINT8_TYPE__ ee_u8;
typedef __UINT16_TYPE__ ee_u16;

ee_u16 crcu8(ee_u8 data, ee_u16 crc) {
  ee_u8 i = 0, x16 = 0, carry = 0;

  for (i = 0; i < 8; i++) {
    x16 = (ee_u8)((data & 1) ^ ((ee_u8)crc & 1));
    data >>= 1;

    if (x16 == 1) {
      crc ^= 0x4002;
      carry = 1;
    } else
      carry = 0;
    crc >>= 1;
    if (carry)
      crc |= 0x8000;
    else
      crc &= 0x7fff;
  }
  return crc;
}

With GCC 13.2.0 -O2, on LoongArch we get:

.L2:
    xor         $r12,$r4,$r14
    andi        $r12,$r12,1
    sub.w       $r12,$r0,$r12
    srli.w      $r4,$r4,1
    and         $r12,$r12,$r15
    addi.w      $r13,$r13,-1
    xor         $r12,$r12,$r4
    bstrpick.w  $r13,$r13,7,0
    srli.d      $r14,$r14,1
    bstrpick.w  $r4,$r12,15,0
    bnez        $r13,.L2

With GCC 14.0.0 -O2:

.L2:
    xor         $r12,$r4,$r14
    andi        $r12,$r12,1
    mul.w       $r12,$r12,$r15
    srli.w      $r4,$r4,1
    addi.w      $r13,$r13,-1
    bstrpick.w  $r13,$r13,7,0
    srli.d      $r14,$r14,1
    xor         $r12,$r12,$r4
    bstrpick.w  $r4,$r12,15,0
    bnez        $r13,.L2

mul.w is slower than sub.w + and.

I'm now setting components to tree-optimization because the difference already
exists in 254t.optimized vs 263t.optimized.  But maybe the tree optimizer is
doing things correctly and we should just add a target-specific optimization.
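
For readers comparing the two loops, the source-level difference can be sketched as follows. When x16 is 1 the loop body computes ((crc ^ 0x4002) >> 1) | 0x8000, which equals (crc >> 1) ^ 0xA001; when x16 is 0 it computes crc >> 1. The constant 0xA001 is derived here from the testcase and is not quoted from the compiler output, and the names crcu8_ref and crcu8_masked are made up for this note.

```c
#include <stdint.h>

/* Condensed copy of the loop from the report, kept as a reference.  */
uint16_t crcu8_ref(uint8_t data, uint16_t crc)
{
  for (int i = 0; i < 8; i++) {
    uint8_t x16 = (uint8_t) ((data & 1) ^ ((uint8_t) crc & 1));
    data >>= 1;
    if (x16 == 1) {
      crc ^= 0x4002;
      crc = (uint16_t) ((crc >> 1) | 0x8000);
    } else {
      crc = (uint16_t) ((crc >> 1) & 0x7fff);
    }
  }
  return crc;
}

/* Branch-free rewrite: the mask 0u - x16 is all-ones or zero, which
   corresponds to the sub.w + and pair; replacing the mask with a
   multiplication by x16 corresponds to the mul.w version.  */
uint16_t crcu8_masked(uint8_t data, uint16_t crc)
{
  for (int i = 0; i < 8; i++) {
    unsigned x16 = (data ^ crc) & 1u;
    data >>= 1;
    crc = (uint16_t) ((crc >> 1) ^ (0xA001u & (0u - x16)));
  }
  return crc;
}
```

Both functions should agree for every (data, crc) pair, so the choice between the mask and the multiplication is purely a question of instruction cost.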

[Bug libstdc++/112934] New: excessive code for std::map::erase(key)

2023-12-08 Thread pobrn at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112934

Bug ID: 112934
   Summary: excessive code for std::map::erase(key)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pobrn at protonmail dot com
  Target Milestone: ---

It is probably expected that calling erase(key) is equivalent to or better than

  auto it = m.find(k);
  if (it != m.end())
m.erase(it);

However, currently that is not the case: https://gcc.godbolt.org/z/f1Mh3bodf

This is because std::map::erase(key) calls erase(key) on the underlying tree,
which then uses equal_range() and tries to delete a range of iterators. This is
unnecessary since a map enforces unique keys.

libc++ has an __erase_unique() method on the underlying tree type to handle
this, which does essentially what - I assume - most people expect erase(key) to
do:
https://github.com/llvm/llvm-project/blob/b88b480640f173582ffbfd2faae690f2bc895d14/libcxx/include/__tree#L2453

I believe the same applies to std::set.

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

--- Comment #5 from Andrew Pinski  ---
Could this be a linker relaxation issue? Does -Wl,--no-relax solve the issue?

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

--- Comment #4 from Andrew Pinski  ---
(In reply to Patrick O'Neill from comment #3)
> A slightly more reduced testcase without the extra printf:
> https://godbolt.org/z/1xjPzs9v5

Note add_em_up should technically have:
  __builtin_va_end(ap);

At the end of the function (though, the code generation for riscv is not
different there with or without it).
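
The reduced testcase is only linked above, so the body below is an assumption; only the name add_em_up and the need for a va_end call come from this thread. It sketches the usual shape of such a function using the <stdarg.h> macros, where va_end corresponds to the __builtin_va_end call mentioned above.

```c
#include <stdarg.h>

/* Sums `count` int arguments.  Every va_start is paired with a va_end
   before the function returns.  */
int add_em_up(int count, ...)
{
  va_list ap;
  int sum = 0;

  va_start(ap, count);
  for (int i = 0; i < count; i++)
    sum += va_arg(ap, int);
  va_end(ap);   /* the call noted above as technically required */

  return sum;
}
```

For example, add_em_up(3, 1, 2, 3) adds the three trailing arguments.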

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

--- Comment #3 from Patrick O'Neill  ---
A slightly more reduced testcase without the extra printf:
https://godbolt.org/z/1xjPzs9v5

[Bug target/112778] ICE in ppc64-linux-gnu crosscompiler in store_by_pieces since r14-5946-g1ff6d9f7428b06

2023-12-08 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112778

Alexandre Oliva  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |aoliva at gcc dot gnu.org

--- Comment #2 from Alexandre Oliva  ---
Mine.

Patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639987.html

[Bug target/112804] ICE in aarch64 crosscompiler in plus_constant, at explow.cc:102 with -mabi=ilp32 and -finline-stringops

2023-12-08 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112804

Alexandre Oliva  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-12-09

--- Comment #2 from Alexandre Oliva  ---
Mine.

Patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639986.html

[Bug middle-end/112784] ICE in smallest_mode_for_size, at stor-layout.cc:356 | -finline-stringops -mavx512cd

2023-12-08 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112784

Alexandre Oliva  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |aoliva at gcc dot gnu.org
   Last reconfirmed||2023-12-09

--- Comment #1 from Alexandre Oliva  ---
Mine.

Patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639985.html

[Bug target/112933] New: gcc.target/aarch64/sme2/acle-asm/read_za16_vg1x2.c fails on aarch64_be

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112933

Bug ID: 112933
   Summary: gcc.target/aarch64/sme2/acle-asm/read_za16_vg1x2.c
fails on aarch64_be
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization, testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64_be

From the log
```
FAIL: gcc.target/aarch64/sme2/acle-asm/read_za16_vg1x2.c  -std=c23 -O2
-fno-schedule-insns -fno-schedule-insns2 -DCHECK_ASM --save-temps -DTEST_FULL 
check-function-bodies read_0_z0
body: \tmov (w8|w9|w10|w11), w0
\tmova  {z0\.d - z1\.d}, za\.d\[\1, 0, vgx2\]
\tret

against:        addvl   sp, sp, #-2
        mov     w9, w0
        mova    {z30.d - z31.d}, za.d[w9, 0, vgx2]
        ptrue   p7.b, all
        st1d    z30.d, p7, [sp]
        st1d    z31.d, p7, [sp, #1, mul vl]
        ld1h    z0.h, p7/z, [sp]
        ld1h    z1.h, p7/z, [sp, #1, mul vl]
        addvl   sp, sp, #2
        ret
```

Seems like all of the SME check-function-bodies testcases fail this way for
big-endian, in that they have extra stores/loads there ...

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

--- Comment #2 from Patrick O'Neill  ---
I tried messing around with it - turns out passing the 'b' variable isn't
required:

https://godbolt.org/z/EKa15xqYP

Using a variadic function reproduces the problem:

https://godbolt.org/z/n95sxY1Y8

After recompiling with the new source the behavior persists with any variadic
function:
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gc.out
m: 5
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gcv.out
m: 0

[Bug target/112932] New: [14] RISC-V rv64gcv_zvl256b vector: Incorrect behavior with nested loop array writing

2023-12-08 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112932

Bug ID: 112932
   Summary: [14] RISC-V rv64gcv_zvl256b vector: Incorrect behavior
with nested loop array writing
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
int printf(char *, ...);
int a, j, n, b, c, o, d, g, h;
int e[8];
long f[8][6];
void l() {
  o = -27;
  for (; o; o++) {
*e = 1;
if (a >= n) {
  d = 0;
  for (; d <= 7; d++)
e[d] = c;
}
  }
  j = 0;
  for (; j < 8; j++) {
g = 0;
for (; g < 2; g++) {
  h = 1;
  for (; h < 3; h++)
f[j][g * 2 + h] = 1;
}
  }
  unsigned long *m = &f[1][1];
  *m = 0;
}
int main() {
  l();
  b = f[0][1];
  printf("b: %d\n", b);
}

Commands:
rv64gc:
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64gc -mabi=lp64d -O3 red.c -o rv64gc.out
> QEMU_CPU=rv64,vlen=256,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gc.out
f[0][1]: 1
f[0][2]: 1
f[0][3]: 1
f[0][4]: 1
f[1][1]: 2
f[1][2]: 1
f[1][3]: 1
f[1][4]: 1
f[2][1]: 1
f[2][2]: 1
f[2][3]: 1
f[2][4]: 1
f[3][1]: 1
f[3][2]: 1
f[3][3]: 1
f[3][4]: 1
f[4][1]: 1
f[4][2]: 1
f[4][3]: 1
f[4][4]: 1
f[5][1]: 1
f[5][2]: 1
f[5][3]: 1
f[5][4]: 1
f[6][1]: 1
f[6][2]: 1
f[6][3]: 1
f[6][4]: 1
f[7][1]: 1
f[7][2]: 1
f[7][3]: 1
f[7][4]: 1

rv64gcv_zvl256b:
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64gcv_zvl256b -mabi=lp64d -O3 red.c -o rv64gcv.out
> QEMU_CPU=rv64,vlen=256,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gcv.out
f[0][1]: 0
f[0][2]: 0
f[0][3]: 0
f[0][4]: 0
f[1][1]: 2
f[1][2]: 1
f[1][3]: 1
f[1][4]: 1
f[2][1]: 0
f[2][2]: 0
f[2][3]: 0
f[2][4]: 0
f[3][1]: 0
f[3][2]: 0
f[3][3]: 0
f[3][4]: 0
f[4][1]: 0
f[4][2]: 0
f[4][3]: 0
f[4][4]: 0
f[5][1]: 0
f[5][2]: 0
f[5][3]: 0
f[5][4]: 0
f[6][1]: 0
f[6][2]: 0
f[6][3]: 0
f[6][4]: 0
f[7][1]: 0
f[7][2]: 0
f[7][3]: 0
f[7][4]: 0

This issue does not occur when compiled with vlenb=128 (rv64gcv).

Basic analysis:
The print loop is copied from the second set of loops in l().

All of these elements should equal 1 except for f[1][1] since it's set
to 2 via *m.

rv64gcv incorrectly reports that some of these elements are set to 0.
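
The listings only show columns 1 to 4 of each row because those are the only columns the store f[j][g * 2 + h] can reach. The helper below is a hypothetical sketch (its name is made up) that replays just the index arithmetic of the two inner loops of l() and records the columns written.

```c
/* Replays the g/h loops from l() and returns a bitmask with bit c set
   when column c of a row is stored to.  */
unsigned written_columns(void)
{
  unsigned mask = 0;
  for (int g = 0; g < 2; g++)
    for (int h = 1; h < 3; h++)
      mask |= 1u << (g * 2 + h);
  return mask;
}
```

The result has bits 1 through 4 set, so columns 0 and 5 of each row are never touched by the loop nest.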

[Bug target/112931] gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c ICEs on aarch64_be

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112931

--- Comment #1 from Andrew Pinski  ---
There are many instances of this same ICE across a few testcases.

[Bug target/112931] New: gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c ICEs on aarch64_be

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112931

Bug ID: 112931
   Summary: gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c
ICEs on aarch64_be
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64_be-linux-gnu

spawn -ignore SIGHUP
/bajas/pinskia/src/upstream-full-cross/gcc/objdir-stage2/gcc/xgcc
-B/bajas/pinskia/src/upstream-full-cross/gcc/objdir-stage2/gcc/
-fdiagnostics-plain-output -std=c90 -O0 -g -DTEST_FULL -march=armv9-a+sme2
-mtune=generic -moverride=tune=none -fno-ipa-icf -c -o write_za16_vg1x2.o
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c^M
during RTL pass: expand^M
In file included from
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/../../sme/acle-asm/test_sme_acle.h:14,^M
 from
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/test_sme2_acle.h:4,^M
 from
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c:3:^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c:
In function 'write_0_z0':^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c:12:13:
internal compiler error: Segmentation fault^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/../../sme/acle-asm/../../sve/acle/asm/test_sve_acle.h:9:30:
note: in definition of macro 'INVOKE'^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/write_za16_vg1x2.c:11:1:
note: in expansion of macro 'TEST_ZA_XN'^M
0x103b6bf crash_signal^M
../../gcc/toplev.cc:316^M
0x7fd396654def ???^M
   
/usr/src/debug/glibc-2.34-60.el9.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0^M
0x14e6130 aarch64_sve::function_expander::add_input_operand(insn_code,
rtx_def*)^M
../../gcc/config/aarch64/aarch64-sve-builtins.cc:3816^M
0x14e62a7 aarch64_sve::function_expander::use_exact_insn(insn_code)^M
../../gcc/config/aarch64/aarch64-sve-builtins.cc:4020^M
0x14e52a2 aarch64_sve::expand_builtin(unsigned int, tree_node*, rtx_def*)^M
../../gcc/config/aarch64/aarch64-sve-builtins.cc:4689^M
0xc24865 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)^M
../../gcc/expr.cc:12305^M
0xaf50aa expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)^M
../../gcc/expr.h:313^M
0xaf50aa expand_call_stmt^M
../../gcc/cfgexpand.cc:2832^M
0xaf50aa expand_gimple_stmt_1^M
../../gcc/cfgexpand.cc:3894^M
0xaf50aa expand_gimple_stmt^M
../../gcc/cfgexpand.cc:4058^M
0xaf9ce0 expand_gimple_basic_block^M
../../gcc/cfgexpand.cc:6114^M
0xafbd86 execute^M
../../gcc/cfgexpand.cc:6849^M
Please submit a full bug report, with preprocessed source (by using
-freport-bug).^M
Please include the complete backtrace with any bug report.^M
See  for instructions.^M
compiler exited with status 1

[Bug target/112930] gcc.target/aarch64/sme/call_sm_switch_7.c ICEs on aarch64_be

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112930

--- Comment #1 from Andrew Pinski  ---
gcc.target/aarch64/sme/locally_streaming_3.c fails the same way.

[Bug target/112930] New: gcc.target/aarch64/sme/call_sm_switch_7.c ICEs on aarch64_be

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112930

Bug ID: 112930
   Summary: gcc.target/aarch64/sme/call_sm_switch_7.c ICEs on
aarch64_be
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64_be-linux-gnu

spawn -ignore SIGHUP
/bajas/pinskia/src/upstream-full-cross/gcc/objdir-stage2/gcc/xgcc
-B/bajas/pinskia/src/upstream-full-cross/gcc/objdir-stage2/gcc/
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c
-fdiagnostics-plain-output -O -fomit-frame-pointer -fno-optimize-sibling-calls
-march=armv9-a+sme -mtune=generic -moverride=tune=none -S -o
call_sm_switch_7.s^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c:
In function 'test_mixed':^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c:475:1:
error: unrecognizable insn:^M
(insn 64 63 65 (set (mem:VNx4SF (plus:DI (reg/f:DI 31 sp)^M
(const_poly_int:DI [16, 16])) [0  S[16, 16] A8])^M
(reg:VNx4SF 35 v3))
"/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c":472:3
-1^M
 (nil))^M
during RTL pass: shorten^M
/home/apinski/src/upstream-full-cross/gcc/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c:475:1:
internal compiler error: in insn_min_length, at
config/aarch64/aarch64.md:8172^M
0x80401a _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)^M
../../gcc/rtl-error.cc:108^M
0x804036 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)^M
../../gcc/rtl-error.cc:116^M
0x8ed1d4 insn_min_length(rtx_insn*)^M
../../gcc/config/aarch64/aarch64.md:8172^M
0xc3e033 shorten_branches(rtx_insn*)^M
../../gcc/final.cc:1089^M
0xc3e11f rest_of_handle_shorten_branches^M
../../gcc/final.cc:4338^M
0xc3e11f execute^M
../../gcc/final.cc:4367^M
Please submit a full bug report, with preprocessed source (by using
-freport-bug).^M
Please include the complete backtrace with any bug report.^M
See  for instructions.^M
compiler exited with status 1

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

--- Comment #1 from Andrew Pinski  ---
I am not seeing anything wrong with the difference even.

What if you change printf to a different function which still takes variable
arguments but does nothing (in a different TU)? Does it still fail?

I can only think printf is miscompiled somehow ...

[Bug target/112929] New: [14] RISC-V vector: Variable clobbered at runtime

2023-12-08 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929

Bug ID: 112929
   Summary: [14] RISC-V vector: Variable clobbered at runtime
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
int printf(char *, ...);
int a, b, l, i, p, q, t, n, o;
int *volatile c;
static int j;
static struct pack_1_struct d;
long e;
char m = 5;
short s;
#pragma pack(1)
struct pack_1_struct {
  long c;
  int d;
  int e;
  int f;
  int g;
  int h;
  int i;
} h, r = {1}, *f = , *volatile g;
int main() {
  int u;
  j = 0;
  for (; j < 9; ++j) {
u = ++t ? a : 0;
if (u) {
  int *v = 
  *v = g || e;
  *c = 0;
  *f = h;
}
s = l && c;
o = i;
d.f || (p = 0);
q |= n;
  }
  r = *f;
  printf("b: %d\n", b);
  printf("m: %d\n", m);
}

Commands:
rv64gc:
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64gc -mabi=lp64d -O3 red.c -o rv64gc.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gc.out
b: 0
m: 5

rv64gcv:
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64gcv -mabi=lp64d -O3 red.c -o rv64gcv.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0 
> /scratch/tc-testing/tc-dec-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gcv.out
b: 0
m: 0

Nothing touches the m variable so at the end it should equal 5.

Commenting out the preceding printf("b: %d\n", b); statement causes the
testcase to pass successfully (and doesn't cause much change to the assembly):
https://godbolt.org/z/Erzzqxo8q

[Bug fortran/112873] F2023 degree trig functions

2023-12-08 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112873

--- Comment #9 from kargl at gcc dot gnu.org ---
I have updated the diff.  I don't know how ChangeLog works under git.  Here's
what I have written.

* gcc/fortran/gfortran.texi: Remove the "Extended math intrinsics" node.
  It documented only the degree trigonometric functions.  That is, it
  was a very limited list of the additional nonstandard intrinsic subprograms
  offered by gfortran.  Most of these functions are now part of Fortran 2023.

* gcc/fortran/intrinsic.cc(add_functions):  Degree trigonometric functions
  [A]COSD, [A]SIND, [A]TAND, and [A]TAN2D have been added to the Fortran
  2023 standard.  These were originally added for compatibility with DEC
  Fortran under -fdec-math.  Change these from GFC_STD_GNU to
  GFC_STD_F2023.
  (gfc_check_intrinsic_standard): Add a check for Fortran 2023.

* gcc/fortran/intrinsic.texi:  Update documentation for [A]COSD, [A]SIND,
  [A]TAND, and [A]TAN2D.

[Bug fortran/112873] F2023 degree trig functions

2023-12-08 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112873

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Attachment #56810 is obsolete|0   |1

--- Comment #8 from kargl at gcc dot gnu.org ---
Created attachment 56838
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56838&action=edit
updated diff

This is an updated diff.  It removes making some functions generic when they
are only specific intrinsic routines (e.g., DSIND).  It also updates the
documentation.

[Bug testsuite/112786] [14 Regression] gcc.dg/tree-ssa/scev-3.c scev-4.c and scev-5.c XPASSing on most ilp32 targets

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112786

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Hans-Peter Nilsson :

https://gcc.gnu.org/g:0f3bac474e8f6563a59f814ccf7609ced48b1157

commit r14-6353-g0f3bac474e8f6563a59f814ccf7609ced48b1157
Author: Hans-Peter Nilsson 
Date:   Thu Dec 7 17:23:30 2023 +0100

testsuite: Remove gcc.dg/tree-ssa/scev-3.c -4.c and 5.c

These tests were recently xfailed on ilp32 targets though
passing on almost all ilp32 targets (known exceptions: ia32
and some arm subtargets).  They've been changed around too
much to remain useful.

PR testsuite/112786
* gcc.dg/tree-ssa/scev-3.c, gcc.dg/tree-ssa/scev-4.c,
gcc.dg/tree-ssa/scev-5.c: Remove.

[Bug middle-end/44300] Spurious array subscript warning, [0] == [1] is not folded

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44300

Andrew Pinski  changed:

   What|Removed |Added

 CC||goon.pri.low at gmail dot com

--- Comment #13 from Andrew Pinski  ---
*** Bug 112928 has been marked as a duplicate of this bug. ***

[Bug middle-end/112928] missed-optimization: automatic storage address comparisons

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112928

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
See PR 44300 which has already been closed as won't fix ...

*** This bug has been marked as a duplicate of bug 44300 ***

[Bug tree-optimization/112928] missed-optimization: automatic storage address comparisons

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112928

--- Comment #3 from Andrew Pinski  ---
Does this show up in real code? If so, the code is undefined and should be
fixed.

Note we could even replace the comparison directly with `__builtin_unreachable
()` and it would be a valid transformation, because a relational (non-equality)
comparison of pointers into two different "arrays" is undefined.
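
As a side note, the sketch below shows what can be relied on when the addresses of two unrelated automatic objects are compared.  It is written in C++ rather than C, which changes the rules: in C the relational comparison is undefined, as stated above, while in C++ its result is unspecified.  The function names are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Equality and inequality of the addresses of two distinct live objects
// are well defined: the addresses always differ.
bool addresses_differ() {
  int a = 0, b = 0;
  return &a != &b;
}

// std::less on pointers yields an implementation-defined strict total
// order, so exactly one of the two directions holds for distinct objects.
// Which one holds is not something portable code may assume.
bool total_order_picks_one_side() {
  int a = 0, b = 0;
  std::less<const int*> before;
  return before(&a, &b) != before(&b, &a);
}
```

Both functions return true, but neither says which object comes first, which is consistent with the point that no particular constant result can be expected.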

[Bug tree-optimization/112928] missed-optimization: automatic storage address comparisons

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112928

--- Comment #2 from Andrew Pinski  ---
No compiler I tested changes this to a constant ...

I almost want to say this should be closed as won't fix ...

[Bug tree-optimization/112928] missed-optimization: automatic storage address comparisons

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112928

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #1 from Andrew Pinski  ---
Note the code is undefined so it could be true or false ...

[Bug tree-optimization/112928] New: missed-optimization: automatic storage address comparisons

2023-12-08 Thread goon.pri.low at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112928

Bug ID: 112928
   Summary: missed-optimization: automatic storage address
comparisons
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goon.pri.low at gmail dot com
  Target Milestone: ---

This code:

int unopt() {
    int a, b;
    return &a < &b;
}

unopt:
        lea     rax, [rsp-4]
        lea     rdx, [rsp-8]
        cmp     rdx, rax
        setb    al
        movzx   eax, al
        ret

Could really just be optimized to

opt:
        mov     eax, 1
        ret

[Bug analyzer/112927] New: -Wanalyzer-tainted-size false positive seen in Linux kernel's drivers/char/ipmi/ipmi_devintf.c

2023-12-08 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112927

Bug ID: 112927
   Summary: -Wanalyzer-tainted-size false positive seen in Linux
kernel's drivers/char/ipmi/ipmi_devintf.c
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dmalcolm at gcc dot gnu.org
Blocks: 106358
  Target Milestone: ---

Created attachment 56837
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56837&action=edit
Reduced reproducer

With the kernel plugin, this test erroneously reports:

In function 'call_copy_from_user',
inlined from 'handle_send_req' at
gcc.dg/plugin/taint-drivers-char-ipmi-ipmi_devintf.c:35:7:
gcc.dg/plugin/taint-drivers-char-ipmi-ipmi_devintf.c:19:7: warning: use of
attacker-controlled value as size without upper-bounds checking [CWE-129]
[-Wanalyzer-tainted-size]
   19 |   n = copy_from_user(to, from, n); /* { dg-bogus "use of
attacker-controlled value as size without upper-bounds checking" } */
  |   ^~~
  'ipmi_ioctl': events 1-4
|
|   41 | ipmi_ioctl(void* arg)
|  | ^~
|  | |
|  | (1) entry to 'ipmi_ioctl'
|..
|   44 |   if (call_copy_from_user(&msg, arg, sizeof(msg))) {
|  |  ~
|  |  |
|  |  (2) following 'false' branch (when 'n == 0')...
|..
|   48 |   return handle_send_req(&msg);
|  |  ~
|  |  |
|  |  (3) ...to here
|  |  (4) calling 'handle_send_req' from 'ipmi_ioctl'
|
+--> 'handle_send_req': events 5-8
   |
   |   29 | handle_send_req(struct ipmi_msg* msg)
   |  | ^~~
   |  | |
   |  | (5) entry to 'handle_send_req'
   |..
   |   32 |   if (msg->data_len > 272) {
   |  |  ~
   |  |  |
   |  |  (6) following 'false' branch...
   |..
   |   35 |   if (call_copy_from_user(buf, msg->data, msg->data_len)) {
   |  |   ~~
   |  |   |
   |  |   (7) ...to here
   |  |   (8) inlined call to 'call_copy_from_user' from
'handle_send_req'
   |
   +--> 'call_copy_from_user': event 9
  |
  |   19 |   n = copy_from_user(to, from, n); /* { dg-bogus
"use of attacker-controlled value as size without upper-bounds checking" } */
  |  |   ^~~
  |  |   |
  |  |   (9) use of attacker-controlled value as size
without upper-bounds checking
  |


despite the value being sanitized at event (6).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106358
[Bug 106358] [meta-bug] tracker bug for building the Linux kernel with
-fanalyzer

[Bug c++/54367] [meta-bug] lambda expressions

2023-12-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54367
Bug 54367 depends on bug 83167, which changed state.

Bug 83167 Summary: decltype((x)) inside lambda is considered odr-use if x is 
not a reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83167

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug c++/83167] decltype((x)) inside lambda is considered odr-use if x is not a reference

2023-12-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83167

Patrick Palka  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot 
gnu.org

--- Comment #4 from Patrick Palka  ---
Narrowly fixed for GCC 14, thanks for the bug report.

[Bug c++/83167] decltype((x)) inside lambda is considered odr-use if x is not a reference

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83167

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:d9965fef40794d548021d2e34844e5fafeca4ce5

commit r14-6350-gd9965fef40794d548021d2e34844e5fafeca4ce5
Author: Patrick Palka 
Date:   Fri Dec 8 16:57:13 2023 -0500

c++: decltype of (non-captured variable) [PR83167]

For decltype((x)) within a lambda where x is not captured, we dubiously
require that the lambda has a capture default, unlike for decltype(x).
But according to [expr.prim.id.unqual]/3 we should just ignore the lambda
in this case.  This patch narrowly fixes this issue by disabling the
capture_decltype handling and falling back to the ordinary handling when
the innermost lambda has no capture-default.  In fact, we can restrict
the special handling to only by-copy lambdas since that's what
[expr.prim.id.unqual]/3 is concerned with; for by-ref implicit captures
both code paths should give the same result anyway.

During review some other issues were discovered which are documented in
a new FIXME.

PR c++/83167

gcc/cp/ChangeLog:

* semantics.cc (capture_decltype): Inline into its only caller ...
(finish_decltype_type): ... here.  Update nearby comment to refer
to recent standard.  Add FIXME.  Restrict uncaptured variable type
transformation to happen only for lambdas with a by-copy
capture-default.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-decltype4.C: New test.
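
To make the distinction in the commit message concrete, here is a small sketch of how decltype(x) and decltype((x)) behave inside lambdas that do have a capture-default.  The function name is hypothetical.  The case this PR is about, a lambda with no capture-default, is deliberately left out, since whether it compiles depends on the compiler containing this fix.

```cpp
#include <cassert>
#include <type_traits>

// Returns true; the interesting checks happen at compile time.
bool decltype_in_lambdas() {
  int x = 0;
  (void)x;

  // By-copy capture-default: decltype(x) names the declared type, while
  // decltype((x)) is the type of the expression as if x had been captured,
  // which is const-qualified because operator() is const.
  auto by_copy = [=] {
    static_assert(std::is_same<decltype(x), int>::value, "declared type");
    static_assert(std::is_same<decltype((x)), const int&>::value,
                  "const member of the closure");
  };

  // A mutable by-copy lambda drops the const.
  auto by_copy_mutable = [=]() mutable {
    static_assert(std::is_same<decltype((x)), int&>::value, "mutable copy");
  };

  // By-reference capture-default refers to the original variable.
  auto by_ref = [&] {
    static_assert(std::is_same<decltype((x)), int&>::value, "reference");
  };

  (void)by_copy;
  (void)by_copy_mutable;
  (void)by_ref;
  return true;
}
```

Since x only appears in unevaluated operands, none of these lambdas actually captures it; the static_asserts only inspect the types.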

[Bug libstdc++/111052] std::format_to(std::back_inserter(str), "") should write directly to the string

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111052

--- Comment #2 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #1)
> We should also make ranges::copy implement the std::copy optimization for
> copying to ostreambuf_iterator, which is an important performance
> enhancement. That will benefit:
> 
> std::format_to(std::ostreambuf_iterator(std::cout), "");

The following does that, but only gives about 10% improvement:

--- a/libstdc++-v3/include/bits/ranges_algobase.h
+++ b/libstdc++-v3/include/bits/ranges_algobase.h
@@ -201,6 +201,22 @@ namespace ranges
  copy_backward_result<_Iter, _Out>>
 __copy_or_move_backward(_Iter __first, _Sent __last, _Out __result);

+namespace __detail
+{
+  // True if _Tp is std::istreambuf_iterator.
+  template constexpr bool __is_istreambuf_iterator = false;
+  template
+constexpr bool
+__is_istreambuf_iterator>>
+  = true;
+  // True if _Tp is std::ostreambuf_iterator.
+  template constexpr bool __is_ostreambuf_iterator = false;
+  template
+constexpr bool
+__is_ostreambuf_iterator>>
+  = true;
+} // namespace __detail
+
   template _Sent,
   weakly_incrementable _Out>
@@ -217,6 +233,8 @@ namespace ranges
   using __detail::__is_move_iterator;
   using __detail::__is_reverse_iterator;
   using __detail::__is_normal_iterator;
+  using __detail::__is_istreambuf_iterator;
+  using __detail::__is_ostreambuf_iterator;
   if constexpr (__is_move_iterator<_Iter> && same_as<_Iter, _Sent>)
{
  auto [__in, __out]
@@ -248,6 +266,34 @@ namespace ranges
= ranges::__copy_or_move<_IsMove>(std::move(__first), __last,
__result.base());
  return {std::move(__in), decltype(__result){__out}};
}
+  else if constexpr (is_pointer_v<_Iter> && is_pointer_v<_Sent>
+  && __is_ostreambuf_iterator<_Out>
+  && requires {
+requires is_same_v, typename
_Out::char_type>;
+requires is_same_v,
iter_value_t<_Sent>>;
+  })
+   {
+ // copy([const] C*, [const] C*, ostreambuf_iterator)
+ return {__first, std::__copy_move_a2<_IsMove>(__first, __last,
__result)};
+   }
+  else if constexpr (__is_istreambuf_iterator<_Out>
+  && is_same_v<_Out, iter_value_t<_Iter>*>
+  && is_same_v<_Iter, _Sent>)
+   {
+ // copy(istreambuf_iterator, istreambuf_iterator, C*)
+ return {__first, std::__copy_move_a2<_IsMove>(__first, __last,
__result)};
+   }
+  else if constexpr (__is_istreambuf_iterator<_Iter> && is_same_v<_Iter,
_Sent>
+  && !_IsMove && __is_ostreambuf_iterator<_Out>
+  && requires {
+requires is_same_v;
+  })
+   {
+ // copy(istreambuf_iterator, istreambuf_iterator,
+ //  ostreambuf_iterator)
+ return {__first, std::copy(__first, __last, __result)};
+   }
   else if constexpr (sized_sentinel_for<_Sent, _Iter>)
{
  if (!std::__is_constant_evaluated())


> The _Iter_sink<_CharT, _Iter> could recognise an ostreambuf_iterator and use
> std::copy, but we should really just make ranges::copy do those
> optimizations.

Better would be for _Iter_sink<_CharT, ostreambuf_iterator<_CharT>> to allow
writing directly to the streambuf's put area. That should be faster than
writing to the sink's buffer and then copying it to the streambuf.
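
For context, the call shapes mentioned in the patch comments can be exercised with plain std::copy, as in the sketch below.  The helper names are hypothetical, and the sketch only shows the calls themselves, not the ranges::copy dispatch that the patch adds.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// copy([const] C*, [const] C*, ostreambuf_iterator): write a character
// range straight into a stream buffer.
std::string copy_chars_to_stream(const std::string& text) {
  std::ostringstream out;
  const char* first = text.data();
  const char* last = first + text.size();
  std::copy(first, last, std::ostreambuf_iterator<char>(out));
  return out.str();
}

// copy(istreambuf_iterator, istreambuf_iterator, ostreambuf_iterator):
// drain one stream buffer into another.
std::string copy_stream_to_stream(const std::string& text) {
  std::istringstream in(text);
  std::ostringstream out;
  std::copy(std::istreambuf_iterator<char>(in),
            std::istreambuf_iterator<char>(),
            std::ostreambuf_iterator<char>(out));
  return out.str();
}
```

Both helpers return the input unchanged, e.g. copy_chars_to_stream("hello") gives "hello"; they differ only in which iterator pair feeds the output.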

[Bug rtl-optimization/112875] [14 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.cc:670 with -Oz -frounding-math -fno-dce -fno-trapping-math -fno-tree-dce -fno-tree-dse -g

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112875

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Vladimir Makarov :

https://gcc.gnu.org/g:48cb51827c9eb991b92014a3f59d31eb237ce03f

commit r14-6347-g48cb51827c9eb991b92014a3f59d31eb237ce03f
Author: Vladimir N. Makarov 
Date:   Fri Dec 8 15:37:42 2023 -0500

[PR112875][LRA]: Fix an assert in lra elimination code

PR112875 test ran into a wrong assert (gcc_unreachable) in elimination
in a debug insn.  The insn seems ok.  So I change the assertion.
To be more accurate I made it the same as analogous reload pass code.

gcc/ChangeLog:

PR rtl-optimization/112875
* lra-eliminations.cc (lra_eliminate_regs_1): Change an assert.
Add ASM_OPERANDS case.

gcc/testsuite/ChangeLog:

PR rtl-optimization/112875
* gcc.target/i386/pr112875.c: New test.

[Bug c++/112926] issues with nested lambdas and decltype of uncaptured local variable

2023-12-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112926

Patrick Palka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=83167,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=86697
   Keywords||accepts-invalid,
   ||rejects-valid

--- Comment #1 from Patrick Palka  ---
Complete rejects-valid testcase:

  int main() {
int x;
[] {
  [=] {
using ty1 = decltype((x)); // refers to local variable despite
   // innermost by-copy capture-default
using ty1 = int&;
  };
};
[=] {
  [] {
using ty1 = decltype((x)); // same
using ty1 = int&;
  };
};
[=] {
  [&] {
using ty1 = decltype((x)); // refers to hypothetical capture proxy
using ty1 = const int&;
  };
};
[&] {
  [=] {
using ty1 = decltype((x)); // same
using ty1 = const int&;
  };
};
[x] {
   [x] {
 using ty1 = decltype((x)); // refers to actual capture proxy,
// found by HIDDEN_LAMBDA name lookup
 using ty1 = const int&;
   };
};
[x] {
   [] {
 using ty1 = decltype((x)); // refers to local variable,
// HIDDEN_LAMBDA name lookup not performed
 using ty1 = int&;
   };
};
}

Discussion: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638976.html

[Bug c++/112926] New: issues with nested lambdas and decltype of uncaptured local variable

2023-12-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112926

Bug ID: 112926
   Summary: issues with nested lambdas and decltype of uncaptured
local variable
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ppalka at gcc dot gnu.org
  Target Milestone: ---

Here since the outer lambda has no capture-default, using x outside of an
unevaluated operand wouldn't capture it (despite the by-copy capture-default of
the inner lambda), so the special case in
https://eel.is/c++draft/expr.prim.id.unqual#3 doesn't apply and decltype((x))
should be int&, not const int&.

int main() {
  int x;

  [] {
[=] {
  using type = decltype((x)); // should be int& not const int&
  using type = int&;
};
  };
}

[Bug c++/63378] decltype and access control issues

2023-12-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63378

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot 
gnu.org
 CC||ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka  ---
This seems to be fixed for GCC 12+ by r12-4453-g79802c5dcc043a.  Before closing
the PR we should add this testcase to the testsuite.

[Bug sanitizer/112727] [11/12/13 Regression] UBSAN creates GIMPLE path with uninitialized variable

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112727

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13 Regression] UBSAN
   |UBSAN creates GIMPLE path   |creates GIMPLE path with
   |with uninitialized variable |uninitialized variable

--- Comment #9 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug sanitizer/112727] [11/12/13/14 Regression] UBSAN creates GIMPLE path with uninitialized variable

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112727

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:6ddaf06e375e1c15dcda338697ab6ea457e6f497

commit r14-6345-g6ddaf06e375e1c15dcda338697ab6ea457e6f497
Author: Jakub Jelinek 
Date:   Fri Dec 8 20:56:48 2023 +0100

c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]

The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation.  Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times.  And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).

Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR , SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR , ...);
and caller adds
SAVE_EXPR  << SAVE_EXPR 
after it in a COMPOUND_EXPR.  So far so good.

If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR 
where ptr->something[i] is unshared.

In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR (x)->s[j] ?
1 : 0>, SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR (x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR (x)->s[j] ? 1 : 0> <<
SAVE_EXPR 
So far good as well.  But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR  > const)
 __ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR(x)->s[j] ? 0 : 1) << SAVE_EXPR

with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition).

Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with s[.UBSAN_BOUNDS (0B, SAVE_EXPR, 8),
SAVE_EXPR]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR is evaluated inside
of the if body and when it is used again after the if, it uses a potentially
uninitialized value of j.1 (always uninitialized if the shift count isn't
out of bounds).

The following patch fixes that by unshare_expr unsharing the folded argument
of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and it is
used more than once.

2023-12-08  Jakub Jelinek  

PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.

* c-c++-common/ubsan/pr112727.c: New test.
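
As a source-level analogy for what the SAVE_EXPR wrapping is meant to guarantee, the sketch below evaluates a side-effecting operand exactly once and reuses the saved value for both the check and the shift.  All names are hypothetical and this is not the testcase from the PR; it only illustrates the "evaluate once, use several times" property that the shared trees ended up breaking.

```cpp
#include <cassert>
#include <climits>

static int evaluations = 0;     // counts how often the operand is evaluated
static int last_bad_count = -1; // records the last rejected shift count

// An operand with a side effect, standing in for a function call.
unsigned next_value() {
  ++evaluations;
  return 3u;
}

// Stand-in for the runtime handler that reports an invalid shift.
void report_bad_shift(int count) { last_bad_count = count; }

// Evaluate the operand once, check the shift count, then shift the saved
// value.  The local 'saved' plays the role of the SAVE_EXPR.
unsigned checked_shift(unsigned (*operand)(), int count) {
  const unsigned saved = operand();
  const int width = static_cast<int>(sizeof(unsigned) * CHAR_BIT);
  if (count < 0 || count >= width) {
    report_bad_shift(count);
    return 0u;
  }
  return saved << count;
}
```

Calling checked_shift(next_value, 2) yields 12 and bumps evaluations by exactly one, however many times the saved value is read afterwards.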

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #7 from Andreas Schwab  ---
spawn -ignore SIGHUP /daten/aranym/gcc/gcc-20231208/Build/gcc/xgcc
-B/daten/aranym/gcc/gcc-20231208/Build/gcc/
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-v16qi.c
-fdiagnostics-plain-output -O2 -lm -o ./vshuf-v16qi.exe
during RTL pass: reload
In file included from
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-v16qi.c:11:
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-main.inc: In
function 'test_3':
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-main.inc:27:1:
internal compiler error: maximum number of generated reload insns per insn
achieved (90)
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-16.inc:6:1:
note: in expansion of macro 'T'
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.dg/torture/vshuf-main.inc:28:1:
note: in expansion of macro 'TESTS'
0xbe89f0 lra_constraints(bool)
../../gcc/lra-constraints.cc:5429
0xbcffba lra(_IO_FILE*, int)
../../gcc/lra.cc:2440
0xb7def7 do_reload
../../gcc/ira.cc:5973
0xb7def7 execute
../../gcc/ira.cc:6161

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #6 from Jakub Jelinek  ---
I must say I have no idea what WORD_REGISTER_OPERATION says about the upper
bits of a paradoxical SUBREG if it is a MEM and load_extend_op (inner_mode) is
ZERO_EXTEND (zeros then?  Then this optimization is ok), or something else? 
And what it says on REGs.

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #6 from Andreas Schwab  ---
spawn -ignore SIGHUP /daten/aranym/gcc/gcc-20231208/Build/gcc/xgcc
-B/daten/aranym/gcc/gcc-20231208/Build/gcc/ -fdiagnostics-plain-output
-mcpu=5235 -Os -c -o pr64461.o
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.target/m68k/pr64461.c
during RTL pass: reload
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.target/m68k/pr64461.c: In
function 'rtems_rfs_block_map_indirect_alloc':
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.target/m68k/pr64461.c:16:1:
internal compiler error: in lra_set_insn_recog_data, at lra.cc:1034
0xbccb1f lra_set_insn_recog_data(rtx_insn*)
../../gcc/lra.cc:1032
0xbcd496 lra_get_insn_recog_data(rtx_insn*)
../../gcc/lra-int.h:503
0xbcd496 setup_sp_offset
../../gcc/lra.cc:1875
0xbcee55 lra_process_new_insns(rtx_insn*, rtx_insn*, rtx_insn*, char const*)
../../gcc/lra.cc:1923
0xbe647e curr_insn_transform
../../gcc/lra-constraints.cc:4893
0xbe7f0e lra_constraints(bool)
../../gcc/lra-constraints.cc:5511
0xbcffba lra(_IO_FILE*, int)
../../gcc/lra.cc:2440
0xb7def7 do_reload
../../gcc/ira.cc:5973
0xb7def7 execute
../../gcc/ira.cc:6161

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #5 from Jakub Jelinek  ---
Oh, and the reason why given the above 
(and:DI (subreg:DI (mem/c:SI (lo_sum:DI (reg/f:DI 144)
(symbol_ref:DI ("globalVar") [flags 0x86] )) [1 globalVar+0 S4 A32]) 0)
(const_int -280375465082881 [0xffff00ffffffffff]))
is optimized into the zero extension is the following in combine.cc:
  /* If the one operand is a paradoxical subreg of a register or memory and
 the constant (limited to the smaller mode) has only zero bits where
 the sub expression has known zero bits, this can be expressed as
 a zero_extend.  */
  else if (GET_CODE (XEXP (x, 0)) == SUBREG)
{
  rtx sub;

  sub = XEXP (XEXP (x, 0), 0);
  machine_mode sub_mode = GET_MODE (sub);
  int sub_width;
  if ((REG_P (sub) || MEM_P (sub))
  && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width)
  && sub_width < mode_width)
{
  unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
  unsigned HOST_WIDE_INT mask;

  /* original AND constant with all the known zero bits set */
  mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode));
  if ((mask & mode_mask) == mode_mask)
{
  new_rtx = make_compound_operation (sub, next_code);
  new_rtx = make_extraction (mode, new_rtx, 0, 0, sub_width,
 true, false, in_code == COMPARE);
}
}
}
clearly, if the sign_bit_copies stuff is right for wordmode paradoxical SUBREGs
of smaller MEMs with load_extend_op (MEM_mode) == SIGN_EXTEND, then this
optimization needs to punt if those conditions are met and sub is a MEM.

Will defer this to people actually using WORD_REGISTER_OPERATIONS arches,
fortunately none of the ones I'm involved with on a daily basis is.
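
To see why this matters, the arithmetic can be replayed outside the compiler.  The sketch below plugs in the AND constant from the RTL above and assumes that nonzero_bits reports no known-zero bits for the SImode value; the function names are hypothetical.  The mask test from combine.cc then passes, yet an AND applied to a sign-extended load differs from a plain zero extension.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kAndConstant = 0xffff00ffffffffffULL;  // -280375465082881
constexpr std::uint64_t kSImodeMask = 0xffffffffULL;           // GET_MODE_MASK (SImode)

// Mirrors the quoted test: (mask & mode_mask) == mode_mask, where
// mask = constant | ~nonzero_bits.
bool and_looks_like_zero_extend(std::uint64_t nonzero_bits) {
  const std::uint64_t mask = kAndConstant | ~nonzero_bits;
  return (mask & kSImodeMask) == kSImodeMask;
}

// What the original AND computes if the upper bits of the paradoxical
// SUBREG hold a sign extension of the 32-bit memory value.
std::uint64_t and_after_sign_extending_load(std::int32_t value) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(value)) &
         kAndConstant;
}

// What the rewritten expression computes.
std::uint64_t zero_extend(std::int32_t value) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(value));
}
```

For a non-negative value such as 5 the two results agree, but for -1 the AND gives 0xffff00ffffffffff while the zero extension gives 0x00000000ffffffff, so the rewrite is only safe when the upper bits are known to be zero.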

[Bug rtl-optimization/112875] [14 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.cc:670 with -Oz -frounding-math -fno-dce -fno-trapping-math -fno-tree-dce -fno-tree-dse -g

2023-12-08 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112875

--- Comment #3 from Vladimir Makarov  ---
(In reply to Jakub Jelinek from comment #2)
> Started with r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a

I reproduced it and hope to fix it today.

[Bug target/112922] [14 Regression] 465.tonto from SPECFP 2006 fails train run on Aarch64-linux with -O2 and -flto

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112922

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
Summary|465.tonto from SPECFP 2006  |[14 Regression] 465.tonto
   |fails train run on  |from SPECFP 2006 fails
   |Aarch64-linux with -O2 and  |train run on Aarch64-linux
   |-flto   |with -O2 and -flto
 CC||pinskia at gcc dot gnu.org
   Target Milestone|--- |14.0

[Bug tree-optimization/112924] [14 regression] ICE when building util-linux (error: gimple_bb (stmt) is set to a wrong basic block)

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112924

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Keywords||needs-bisection
   Last reconfirmed||2023-12-08
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
I cannot reproduce this.

I suspect this was fixed with r14-6132-g50f2a3370d177f .

[Bug tree-optimization/112924] [14 regression] ICE when building util-linux (error: gimple_bb (stmt) is set to a wrong basic block)

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112924

Andrew Pinski  changed:

   What|Removed |Added

  Component|c   |tree-optimization
   Keywords||ice-on-valid-code
   Target Milestone|--- |14.0

[Bug c++/88848] member ambiguous in multiple inheritance lattice

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88848

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug c++/88848] member ambiguous in multiple inheritance lattice

2023-12-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88848

Marek Polacek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Marek Polacek  ---
Fixed in GCC 12.

[Bug c++/88848] member ambiguous in multiple inheritance lattice

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88848

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:2a5a5d5e7d32b21205562a35b307ff69e389b996

commit r14-6344-g2a5a5d5e7d32b21205562a35b307ff69e389b996
Author: Marek Polacek 
Date:   Fri Dec 8 13:44:10 2023 -0500

c++: Add fixed test [PR88848]

This one was fixed by r12-7714-g47da5198766256.

PR c++/88848

gcc/testsuite/ChangeLog:

* g++.dg/inherit/multiple2.C: New test.

[Bug c++/88848] member ambiguous in multiple inheritance lattice

2023-12-08 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88848

Marek Polacek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot 
gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Marek Polacek  ---
Wow, this works now.  It was fixed by r12-7714-g47da5198766256.

I will add the test since the commit above was fixing different errors.

[Bug c++/112658] [12/13 Regression] ICE: finish_expr_stmt with casting an temp array to pointer and constructor call

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112658

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:5764825aed613f201a8bc47e5b239027a39691f0

commit r14-6342-g5764825aed613f201a8bc47e5b239027a39691f0
Author: Patrick Palka 
Date:   Fri Dec 8 13:33:55 2023 -0500

c++: undiagnosed error_mark_node from cp_build_c_cast [PR112658]

When cp_build_c_cast commits to an erroneous const_cast, we neglect to
replay errors from build_const_cast_1 which can lead to us incorrectly
accepting (and "miscompiling") the cast, or triggering the assert in
finish_expr_stmt.

This patch fixes this oversight.  This was the original fix for the ICE
in PR112658 before r14-5941-g305a2686c99bf9 made us accept the testcase
there after all.  I wasn't able to come up with an alternate testcase for
which this fix has an effect anymore, but below is a reduced version of
the PR112658 testcase (accepted ever since r14-5941) for good measure.

PR c++/112658
PR c++/94264

gcc/cp/ChangeLog:

* typeck.cc (cp_build_c_cast): If we're committed to a const_cast
and the result is erroneous, call build_const_cast_1 a second
time to issue errors.  Use complain=tf_none instead of =false.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array20.C: New test.

[Bug c++/112658] [12/13 Regression] ICE: finish_expr_stmt with casting an temp array to pointer and constructor call

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112658

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:0c018a74eb1affe2a1fa385cdddaa93979683420

commit r14-6343-g0c018a74eb1affe2a1fa385cdddaa93979683420
Author: Patrick Palka 
Date:   Fri Dec 8 13:34:04 2023 -0500

c++: guard more against undiagnosed error_mark_node [PR112658]

This adds a sanity check to cp_parser_expression_statement similar to
the one in finish_expr_stmt added by r6-6795-g0fd9d4921f7ba2, which
effectively downgrades accepts-invalid/wrong-code bugs like this one
into ice-on-invalid/ice-on-valid ones.

PR c++/112658

gcc/cp/ChangeLog:

* parser.cc (cp_parser_expression_statement): If the statement
is error_mark_node, make sure we've seen_error().

[Bug c++/94264] Array-to-pointer conversion not performed on array prvalues

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94264

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:5764825aed613f201a8bc47e5b239027a39691f0

commit r14-6342-g5764825aed613f201a8bc47e5b239027a39691f0
Author: Patrick Palka 
Date:   Fri Dec 8 13:33:55 2023 -0500

c++: undiagnosed error_mark_node from cp_build_c_cast [PR112658]

When cp_build_c_cast commits to an erroneous const_cast, we neglect to
replay errors from build_const_cast_1 which can lead to us incorrectly
accepting (and "miscompiling") the cast, or triggering the assert in
finish_expr_stmt.

This patch fixes this oversight.  This was the original fix for the ICE
in PR112658 before r14-5941-g305a2686c99bf9 made us accept the testcase
there after all.  I wasn't able to come up with an alternate testcase for
which this fix has an effect anymore, but below is a reduced version of
the PR112658 testcase (accepted ever since r14-5941) for good measure.

PR c++/112658
PR c++/94264

gcc/cp/ChangeLog:

* typeck.cc (cp_build_c_cast): If we're committed to a const_cast
and the result is erroneous, call build_const_cast_1 a second
time to issue errors.  Use complain=tf_none instead of =false.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array20.C: New test.

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #4 from Segher Boessenkool  ---
WORD_REGISTER_OPERATIONS is extremely ill-defined.  Or, it is used for other
things than what it stands for, whichever way you want to look at it.

A backend that defines the macro to non-zero promises that for *any* operation
on any values in a smaller than full-register mode, the compiler can instead
do the operation in that full-register mode, and all the resulting bits will
be well-defined.  This is not true for most real non-trivial backends.

There is word_register_operation_p to filter out the most obvious and egregious
cases where WORD_REGISTER_OPERATIONS is just a foolish thing, but this function
isn't used nearly enough, and it doesn't filter out enough either.

[Bug c/112488] [14 Regression] ICE in make_ssa_name_fn with VLA inside type and inlining since r14-1142

2023-12-08 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112488

--- Comment #10 from Martin Uecker  ---
PATCH: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639961.html

[Bug c++/112869] [14 Regression] ICE at gimplify_expr, at gimplify.cc:17531 on libopenmpt-0.7.3

2023-12-08 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112869

David Binderman changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dcb314 at hotmail dot com

--- Comment #5 from David Binderman  ---
I see this one also, in a build of tlx-0.6.1 package.

[Bug target/112109] Missing riscv vectorized strcmp (and other) expanders

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112109

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:d468718c9a097aeb8794fb1a2df6db2c1064d7f7

commit r14-6341-gd468718c9a097aeb8794fb1a2df6db2c1064d7f7
Author: Robin Dapp 
Date:   Fri Dec 1 10:07:23 2023 +0100

RISC-V: Add vectorized strcmp and strncmp.

This patch adds vectorized strcmp and strncmp implementations and
tests.  Similar to strlen, expansion is still guarded by
-minline-str(n)cmp.

gcc/ChangeLog:

PR target/112109

* config/riscv/riscv-protos.h (expand_strcmp): Declare.
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Add
strategy handling and delegation to scalar and vector expanders.
(expand_strcmp): Vectorized implementation.
* config/riscv/riscv.md: Add TARGET_VECTOR to strcmp and strncmp
expander.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp.c: New test.

[Bug target/112109] Missing riscv vectorized strcmp (and other) expanders

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112109

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:2664964b2f695e947faea4c29dbddd3615cc4b0b

commit r14-6340-g2664964b2f695e947faea4c29dbddd3615cc4b0b
Author: Robin Dapp 
Date:   Fri Dec 1 09:57:15 2023 +0100

RISC-V: Add vectorized strlen.

This patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation.  Rawmemchr returns the address
of the needle while strlen returns the difference between needle address
and start address.

As before, strlen expansion is guarded by -minline-strlen.
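
The relationship between the two routines can be sketched in scalar C++.
This is only an illustration of the idea, not the RVV expander code; the
names rawmemchr_sketch and strlen_via_rawmemchr are made up for the example.

```cpp
#include <cstddef>

// Scalar stand-in for rawmemchr: scan forward from `s` until a byte equal
// to `needle` is found and return its address.  Like rawmemchr, it assumes
// the needle is present, so there is no length bound.
constexpr const char *rawmemchr_sketch(const char *s, char needle) {
  while (*s != needle)
    ++s;
  return s;
}

// strlen expressed through the search: the needle is the terminating NUL,
// and the length is the needle address minus the start address.
constexpr std::size_t strlen_via_rawmemchr(const char *s) {
  return static_cast<std::size_t>(rawmemchr_sketch(s, '\0') - s);
}
```

The only extra work strlen needs on top of the search is the final
pointer subtraction.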

While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp).  This needs to be fixed separately and I figured I'd post
this patch as-is.

gcc/ChangeLog:

PR target/112109

* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
parameter.
* config/riscv/riscv-string.cc (riscv_expand_strlen): Call
rawmemchr.
(expand_rawmemchr): Add strlen handling.
* config/riscv/riscv.md: Add TARGET_VECTOR to strlen expander.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.

[Bug libstdc++/112925] Optimize std::string construction from std::istreambuf_iterator

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112925

--- Comment #1 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #0)
> +#if __cplusplus >= 201103L && 0

Oops, without the && 0 obviously. I was testing performance with and without
it.

[Bug libstdc++/112925] New: Optimize std::string construction from std::istreambuf_iterator

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112925

Bug ID: 112925
   Summary: Optimize std::string construction from
std::istreambuf_iterator
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Something like this would improve performance considerably. Instead of reading
a character at a time we can use streambuf::sgetn to copy directly into the
string's unused capacity:

--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -205,6 +205,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_data(__another);
_M_capacity(__capacity);
  }
+
+#if __cplusplus >= 201103L && 0
+   using _BufIter = istreambuf_iterator<_CharT, char_traits<_CharT>>;
+   if _GLIBCXX17_CONSTEXPR (is_same<_InIterator, _BufIter>::value)
+   {
+ if (!std::__is_constant_evaluated())
+   {
+ auto __r = std::__copy_n_a(__beg, capacity() - size(),
+data() + size(), false);
+ __len = __r - data();
+ _M_set_length(__len);
+ continue;
+   }
+   }
+#endif
traits_type::assign(_M_data()[__len++], *__beg);
++__beg;
  }

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |law at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Seems expand_compound_operation is called on
(sign_extend:DI (mem/c:SI (lo_sum:DI (reg/f:DI 144)
(symbol_ref:DI ("globalVar") [flags 0x86] )) [1 globalVar+0 S4 A32]))
and takes the
      tem = gen_lowpart (mode, XEXP (x, 0));
      if (!tem || GET_CODE (tem) == CLOBBER)
        return x;
      tem = simplify_shift_const (NULL_RTX, ASHIFT, mode,
                                  tem, modewidth - pos - len);
      tem = simplify_shift_const (NULL_RTX, unsignedp ? LSHIFTRT : ASHIFTRT,
                                  mode, tem, modewidth - len);
path on it, mode being DImode, modewidth 64, pos 0, len 32.
The second simplify_shift_const is then called with
(ashift:DI (subreg:DI (mem/c:SI (lo_sum:DI (reg/f:DI 144)
(symbol_ref:DI ("globalVar") [flags 0x86] )) [1 globalVar+0 S4 A32]) 0)
(const_int 32 [0x20]))
and triggers the
          /* If this was (ashiftrt (ashift foo C1) C2) and FOO has more
             than C1 high-order bits equal to the sign bit, we can convert
             this to either an ASHIFT or an ASHIFTRT depending on the
             two counts.

             We cannot do this if VAROP's mode is not SHIFT_UNIT_MODE.  */

          if (code == ASHIFTRT && first_code == ASHIFT
              && int_varop_mode == shift_unit_mode
              && (num_sign_bit_copies (XEXP (varop, 0), shift_unit_mode)
                  > first_count))
            {
              varop = XEXP (varop, 0);
              count -= first_count;
              if (count < 0)
                {
                  count = -count;
                  code = ASHIFT;
                }

              continue;
            }
optimization in there.
As RISC-V is a WORD_REGISTER_OPERATIONS target with load_extend_op (E_SImode) ==
SIGN_EXTEND, it triggers the:
      /* For paradoxical SUBREGs on machines where all register operations
         affect the entire register, just look inside.  Note that we are
         passing MODE to the recursive call, so the number of sign bit
         copies will remain relative to that mode, not the inner mode.

         This works only if loads sign extend.  Otherwise, if we get a
         reload for the inner part, it may be loaded from the stack, and
         then we lose all sign bit copies that existed before the store
         to the stack.  */
      if (WORD_REGISTER_OPERATIONS
          && load_extend_op (inner_mode) == SIGN_EXTEND
          && paradoxical_subreg_p (x)
          && MEM_P (SUBREG_REG (x)))
        return cached_num_sign_bit_copies (SUBREG_REG (x), mode,
                                           known_x, known_mode, known_ret);
path and so the sign-extension in the end folds into just a paradoxical subreg
of the MEM.  But probably something in the combiner then just sees a
paradoxical SUBREG and thinks that all the bits above the SUBREG_REG are
undefined and picks ZERO_EXTEND.
I'm afraid I don't know enough about WORD_REGISTER_OPERATIONS to know what is
right and what is not.
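
To make the shift pair concrete: with modewidth 64, pos 0 and len 32, the two
simplify_shift_const calls amount to a left shift by 32 followed by an
arithmetic right shift by 32.  The C++ sketch below (function names are
illustrative, this is not GCC code) shows that this pair sign-extends a
32-bit value, and that a zero extension gives a different 64-bit result
whenever bit 31 is set.

```cpp
#include <cstdint>

// (ashiftrt:DI (ashift:DI x 32) 32): move the 32-bit value into the upper
// half, then shift it back arithmetically so bit 31 is copied upwards.
// The unsigned-to-signed conversion and the right shift of a negative
// value are fully defined from C++20 on; earlier standards leave them
// implementation-defined, and GCC implements them as assumed here.
constexpr std::int64_t sign_extend_via_shifts(std::uint32_t x) {
  const std::uint64_t shifted = static_cast<std::uint64_t>(x) << 32;
  return static_cast<std::int64_t>(shifted) >> 32;
}

// (zero_extend:DI x): the upper 32 bits are always cleared.
constexpr std::int64_t zero_extend(std::uint32_t x) {
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(x));
}
```

For inputs below 0x80000000 the two functions agree, so a wrong choice of
extension only shows up for values with the 32-bit sign bit set.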

[Bug rtl-optimization/112875] [14 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.cc:670 with -Oz -frounding-math -fno-dce -fno-trapping-math -fno-tree-dce -fno-tree-dse -g

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112875

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |jakub at gcc dot gnu.org
           Priority|P3                          |P1

--- Comment #2 from Jakub Jelinek  ---
Started with r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a

[Bug c/112924] [14 regression] ICE when building util-linux (error: gimple_bb (stmt) is set to a wrong basic block)

2023-12-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112924

--- Comment #1 from Andrew Pinski  ---
This might already be fixed.

[Bug c/112924] New: [14 regression] ICE when building util-linux (error: gimple_bb (stmt) is set to a wrong basic block)

2023-12-08 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112924

Bug ID: 112924
   Summary: [14 regression] ICE when building util-linux (error:
gimple_bb (stmt) is set to a wrong basic block)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: csfore at posteo dot net
  Target Milestone: ---

Created attachment 56836
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56836&action=edit
trimmed file with cvise

This originally failed to compile with LTO, but I've reduced it to the
following command: `gcc -m32 -O2 reduced.i`. It compiles successfully with
GCC 13. The util-linux version this was tested on is 2.39.3.

`
reduced.i: In function ‘probe_btrfs’:
reduced.i:16:6: error: gimple_bb (stmt) is set to a wrong basic block
   16 | void probe_btrfs() {
  |  ^~~
# .MEM_16 = VDEF <.MEM_2>
__builtin_memcpy (, p_9, 16);
during GIMPLE pass: sccp
reduced.i:16:6: internal compiler error: verify_gimple failed
`


gcc -v:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/14/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-14.0.0_pre20231203/work/gcc-14-20231203/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/14
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/14/python
--enable-languages=c,c++,fortran,rust --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=yes,extra
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo
14.0.0_pre20231203 p9' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-multilib
--with-multilib-list=m32,m64 --disable-fixed-point --enable-targets=all
--enable-libgomp --disable-libssp --disable-libada --disable-cet
--disable-systemtap --disable-valgrind-annotations --disable-vtable-verify
--disable-libvtv --with-zstd --without-isl --enable-default-pie
--enable-host-pie --enable-host-bind-now --enable-default-ssp
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231203 (experimental) (Gentoo 14.0.0_pre20231203 p9)

[Bug tree-optimization/112887] during GIMPLE pass: phiopt ICE: Floating point exception (SIGFPE) at tree-ssa-phiopt.c:2224 with --param=l1-cache-line-size=0x20000000

2023-12-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112887

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 56835
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56835&action=edit
gcc14-pr112887.patch

The function checks tree_fits_uhwi_p and then just blindly assigns the
tree_to_uhwi results to int variables.
I think we should just use unsigned HOST_WIDE_INT types everywhere; that fixes
the ICE too.
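
The hazard can be shown in isolation.  The sketch below is not the phiopt
code: the helper names are hypothetical, and the factor of 8 (bytes to bits)
is an assumption made purely for illustration, since the comment does not
spell out the exact computation.  It only demonstrates that a value can pass
a 64-bit "fits" check and still become 0 once squeezed into 32 bits, at
which point a later division or modulo by it would trap.

```cpp
#include <cstdint>
#include <limits>

// True when `v` can be stored in an int without changing its value.
constexpr bool fits_in_int(std::uint64_t v) {
  return v <= static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// What an unchecked narrowing to a 32-bit type keeps: the low 32 bits.
constexpr std::uint32_t low_32_bits(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// Illustrative value only: 0x20000000 * 8 = 0x100000000.
constexpr std::uint64_t kExample = std::uint64_t{0x20000000} * 8;
```

Keeping such values in a 64-bit unsigned type throughout, as suggested
above, avoids the silent truncation.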

[Bug fortran/105170] Invalid finalization in intrinsic assignment

2023-12-08 Thread baradi09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105170

--- Comment #2 from Bálint Aradi  ---
Thanks, with 13.2.0, it seems to behave correctly.

[Bug modula2/112923] New: gm2 runs out of memory

2023-12-08 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112923

Bug ID: 112923
   Summary: gm2 runs out of memory
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: gaius at gcc dot gnu.org
  Target Milestone: ---

Forwarded from the gm2 mailing list:

When building:
https://github.com/k-john-gough/gpmclr/tree/master/GPMCLR/source/gpmake

Gm2 runs out of memory with one particular module, namely TermSymbolsIO:

$ gm2 -c -fiso  -I../libgm2gpm -I. TermSymbolsIO.mod

cc1gm2: out of memory allocating 2097152 bytes after a total of
33873920 bytes

[Bug target/112922] New: 465.tonto from SPECFP 2006 fails train run on Aarch64-linux with -O2 and -flto

2023-12-08 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112922

Bug ID: 112922
   Summary: 465.tonto from SPECFP 2006 fails train run on
Aarch64-linux with -O2 and -flto
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: aarch64-linux
Target: aarch64-linux

At some point between r14-6211-g056cce412862f8 and
r14-6290-g9f0f7d802482a8 465.tonto started to fail its train run when
compiled with -O2 -g -flto and PGO on Aarch64-linux.

My Aarch64 machine is a bit slow so I'd like to ask someone with a
spare modern one to bisect this to an exact revision.  I do not
observe the failure on x86_64 (and so for now I have guessed its
component is "target").

The miscomparison is:

0056:  SCF Energy  =  -407.833257
   SCF Energy  =   -878703.963298
^
0057:  Kinetic Energy  =   406.333744
   Kinetic Energy  =   426.858584
^
0087:  Chi^2 in F  = 4.143448
   Chi^2 in F  =19.253240
^
0088:  Goodness of fit in F= 2.035546
   Goodness of fit in F= 4.387851
^
0089:  R factor in F   = 0.046861
   R factor in F   = 0.096290
^
0090:  Weighted R factor in F  = 0.054923
   Weighted R factor in F  = 0.118393
^
0098:  Secondary extinction factor =-0.10
   Secondary extinction factor =-0.75
^
0100:  Scale factor= 1.010260
   Scale factor= 1.023437
^
0105:  00.144.143448 -407.833257 *Damping on
   00.14   19.253240  -878703.963298 *Damping on
   ^
0108:  10.144.844793 -406.881314
   10.14  105.264541 -1961055.460671


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug modula2/112921] New: module shortreal is missing

2023-12-08 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112921

Bug ID: 112921
   Summary: module shortreal is missing
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: gaius at gcc dot gnu.org
  Target Milestone: ---

Forwarded from the gm2 mailing list:

For consistency, there should be a ShortReal.{def,mod} module in libgm2.

[Bug libstdc++/111826] __cpp_lib_format should be 202110, not 202106

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111826

Jonathan Wakely changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
   Target Milestone|---                           |13.3
           Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org

[Bug modula2/112893] gm2 fails to detect procedure address actual parameter is incompatible with cardinal formal parameter

2023-12-08 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112893

Gaius Mulley changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #7 from Gaius Mulley  ---
Re-opening as this should issue an error for the PIM dialects.

[Bug libstdc++/112876] ranges:to: c.end() is unnecessarily assigned by the return value of c.emplace()

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112876

--- Comment #4 from Jonathan Wakely  ---
D'oh, I didn't even reuse the returned iterator, as the 'auto end = c.end();'
statement is inside the loop, so it's completely pointless.

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #5 from Andreas Schwab  ---
spawn -ignore SIGHUP /daten/aranym/gcc/gcc-20231208/Build/gcc/xgcc
-B/daten/aranym/gcc/gcc-20231208/Build/gcc/ -fdiagnostics-plain-output -O1 -w
-fpermissive -c -o pr82052.o
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.c-torture/compile/pr82052.c
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.c-torture/compile/pr82052.c:
In function 'main':
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.c-torture/compile/pr82052.c:394:1:
error: insn does not satisfy its constraints:
(insn 1377 3034 3035 128 (set (reg:SI 9 %a1 [2134])
(plus:SI (sign_extend:SI (reg:HI 9 %a1 [orig:2132 t83 ] [2132]))
(reg:SI 2 %d2 [2940])))
"/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.c-torture/compile/pr82052.c":294:45
407 {*lea}
 (nil))
during RTL pass: reload
/daten/aranym/gcc/gcc-20231208/gcc/testsuite/gcc.c-torture/compile/pr82052.c:394:1:
internal compiler error: in extract_constrain_insn, at recog.cc:2713
0x6466d0 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
../../gcc/rtl-error.cc:108
0x6466f9 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../../gcc/rtl-error.cc:118
0xd41b7d extract_constrain_insn(rtx_insn*)
../../gcc/recog.cc:2713
0xbcad47 check_rtl
../../gcc/lra.cc:2187
0xbd07c2 lra(_IO_FILE*, int)
../../gcc/lra.cc:2608
0xb7def7 do_reload
../../gcc/ira.cc:5973
0xb7def7 execute
../../gcc/ira.cc:6161

[Bug modula2/112920] gm2 hangs in the type resolver

2023-12-08 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112920

Gaius Mulley changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-12-08
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1

--- Comment #1 from Gaius Mulley  ---
Confirmed - in directory 191:

$ gm2 -fiso -fsoft-check-all squash1.mod Ctv2.o unix.o -o squash1 -lc -lcrypt
-lm -wrapper gdb,--args
(gdb) run
^C
(gdb) where
#0  __udivmodti4 (rp=0x0, d=, n=) at ../../../libgcc/libgcc2.c:1203
#1  __divti3 (u=704, v=) at ../../../libgcc/libgcc2.c:1225
#2  0x00ac3683 in findPos (pb=0xa1ba320, i=719) at ../../gcc/m2/gm2-compiler/Sets.mod:145
#3  0x00ac2e25 in Sets_IsElementInSet (s=0x3362d30, i=719) at ../../gcc/m2/gm2-compiler/Sets.mod:254
#4  0x00a2fc3b in TraverseDependantsInner (sym=719) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2087
#5  0x00a3bd99 in WalkArrayDependants (sym=7430, p=0x7fffd8e0) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:5981
#6  0x00a2fb07 in WalkDependants (sym=7430, p=0x7fffd920) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2055
#7  0x00a2fca4 in TraverseDependantsInner (sym=7430) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2095
#8  0x00a2f300 in WalkConst (sym=14448, p=0x7fffd960) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:1690
#9  0x00a2fbcc in WalkDependants (sym=14448, p=0x7fffd990) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2070
#10 0x00a2fca4 in TraverseDependantsInner (sym=14448) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2095
#11 0x00a2fcef in TraverseDependants (sym=14448) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2111
#12 0x00a2bcf5 in M2GCCDeclare_TryDeclareConstant (tokenno=8668, sym=14448) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:1738
#13 0x00a42961 in FoldBecomes (tokenno=8668, p=0x7fffda30, quad=1125, op1=7458, op3=14448) at ../../gcc/m2/gm2-compiler/M2GenGCC.mod:2667
#14 0x00a3cb7c in M2GenGCC_ResolveConstantExpressions (p=0x7fffda88, start=700, end=1129) at ../../gcc/m2/gm2-compiler/M2GenGCC.mod:600
#15 0x00a30a17 in DeclareTypesConstantsProceduresInRange (scope=178, start=700, end=1129) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2573
#16 0x00a9f2ba in M2Scope_ForeachScopeBlockDo (sb=0xa1b9df0, p=0x7fffdb10) at ../../gcc/m2/gm2-compiler/M2Scope.mod:431
#17 0x00a30bb9 in DeclareTypesConstantsProcedures (scope=178) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2649
#18 0x00a30df3 in StartDeclareModuleScopeSeparate (scope=178) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2734
#19 0x00a30fcc in StartDeclareModuleScope (scope=178) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2806
#20 0x00a2baaa in M2GCCDeclare_StartDeclareScope (scope=178) at ../../gcc/m2/gm2-compiler/M2GCCDeclare.mod:2863



The type resolver is spinning in an infinite loop.

[Bug modula2/112920] New: gm2 hangs in the type resolver

2023-12-08 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112920

Bug ID: 112920
   Summary: gm2 hangs in the type resolver
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: gaius at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56834
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56834&action=edit
test case showing bug

A large program with many data types causes gm2 to hang in the type resolver.

Forwarded from the gm2 mailing list to track bug progress etc:

"""
I could generate dynamic
or static binaries but now with version 191 gm2 seems to go into an
endless loop. Perhaps you can see what's going wrong?

BTW, to start a compilation one may simply enter

./mach_squash1

for a dynamic binary or

./mach_squash1_static

for a static binary in the relevant directory.
"""

It hangs in the dist_gm2_191 directory when invoked with:

$ gm2 -fiso -fsoft-check-all squash1.mod Ctv2.o unix.o -o squash1 -lc -lcrypt
-lm

[Bug libstdc++/112876] ranges:to: c.end() is unnecessarily assigned by the return value of c.emplace()

2023-12-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112876

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-12-08
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Jonathan Wakely  ---
Yes, I think I should fix this. But I'm checking with LWG whether we want to
change the issue's proposed resolution to use the iterator this way.

My feeling is that we should not do this, and I should just fix the libstdc++
code.

Thanks for pointing it out!

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #4 from Andreas Schwab  ---
It seems to be related to -fPIC.

cc1plus -fpreprocessed floating_from_chars.ii -quiet -mcpu=68020 -O2
-std=gnu++17 -fimplicit-templates -fPIC -o floating_from_chars.s

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #3 from Andreas Schwab  ---
Created attachment 56833
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56833&action=edit
floating_from_chars.ii

[Bug jit/112910] Getting the size of the type size_t returns the wrong value on some platforms

2023-12-08 Thread bouanto at zoho dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112910

--- Comment #3 from Antoni  ---
Yes, but it isn't available in recording.
Perhaps I could use it with another solution that is in the works, though.

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #2 from Richard Biener  ---
Also, if a C-only compiler builds OK, fallout in the testsuite might be easier
to analyze.

[Bug target/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

--- Comment #1 from Richard Biener  ---
Can you attach preprocessed source and the cc1plus command line to reproduce
this with a simple all-gcc cross?

[Bug target/112919] LoongArch: Alignments in tune parameters are not precise and they regress performance

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

Xi Ruoyao changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://github.com/loongson-community/discussions/issues/23
                 CC|                            |chenglulu at loongson dot cn,
                   |                            |xen0n at gentoo dot org
             Target|                            |loongarch64-*-*

--- Comment #1 from Xi Ruoyao  ---
Jia Jie reported a huge performance regression running Coremarks from GCC 13 to
14, and I can confirm it on LA664.

It seems a part of the regression is caused by over-aligning the labels.  On a
LA664 with different configurations I get Coremarks Iterations/Sec values (the
larger the better):

21120 with GCC 13.2.0
18320 with GCC 14.0.0 (with the default: -falign-labels=16 -falign-functions=32)
19972 with GCC 14.0.0 + -falign-loops=32 -falign-labels=4 -falign-jumps=4 -falign-functions=32 (the best I've got)
19938 with GCC 14.0.0 + -falign-loops=32 -falign-labels=4 -falign-jumps=4 -falign-functions=16
19964 with GCC 14.0.0 + -falign-loops=32 -falign-labels=4 -falign-jumps=4 -falign-functions=64
19276 with GCC 14.0.0 + -falign-loops=32 -falign-labels=8 -falign-jumps=4 -falign-functions=32
19674 with GCC 14.0.0 + -falign-loops=32 -falign-labels=4 -falign-jumps=8 -falign-functions=32
19752 with GCC 14.0.0 + -falign-loops=16 -falign-labels=4 -falign-jumps=4 -falign-functions=32
19922 with GCC 14.0.0 + -falign-loops=64 -falign-labels=4 -falign-jumps=4 -falign-functions=32

Lulu: can you help by running some other benchmarks like SPEC (I don't have
access to it) and updating these values for LA464 and LA664?

[Bug target/112919] New: LoongArch: Alignments in tune parameters are not precise and they regress performance

2023-12-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112919

Bug ID: 112919
   Summary: LoongArch: Alignments in tune parameters are not
precise and they regress performance
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xry111 at gcc dot gnu.org
  Target Milestone: ---

In r14-1839 we added -falign-functions=32 -falign-labels=16 for LA464.  But
these values are no longer precise because in r14-4674 we removed
ASM_OUTPUT_ALIGN_WITH_NOP and altered the semantics of the -falign-* switches.
We also have LA664 now, which may benefit from different values.

[Bug target/30271] -mstrict-align can an store extra for struct agrument passing

2023-12-08 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271

Jiu Fu Guo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guojiufu at gcc dot gnu.org

--- Comment #13 from Jiu Fu Guo  ---
(In reply to Andrew Pinski from comment #10)
> (In reply to comment #9)
> > Andrew, 
> > 
> > What is your point here?
> 
> My point here is that currently we do:
>   gi->frame_related =
> (base == frame_pointer_rtx) || (base == hard_frame_pointer_rtx);
> 
> But if we change it to be:
>   gi->frame_related =
> (base == frame_pointer_rtx) || (base == hard_frame_pointer_rtx)
> || (base == arg_pointer_rtx && fixed_regs[ARG_POINTER_REGNUM]);
> 
> It would delete the store (at least in a 4.3 based compiler). 
> arg_pointer_rtx is the incoming argument space so if it is a fixed register
> it will be also frame related and we can safely delete the stores to this
> space.

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639550.html is using
this idea too.  And the 'std' in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271#c2 disappeared.

[Bug target/112918] New: [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

Bug ID: 112918
   Summary: [m68k] [LRA] ICE: maximum number of generated reload
insns per insn achieved (90)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sch...@linux-m68k.org
  Target Milestone: ---
Target: m68k-*-*

When enabling LRA on m68k the compiler ICEs when compiling libstdc++:

$ ../../../../gcc/xgcc -shared-libgcc -B../../../../gcc -nostdinc++ -L..
-L../.libs -L../../libsupc++/.libs -I../../../../../libstdc++-v3/../libgcc
-I../../include/m68k-linux -I../../include
-I../../../../../libstdc++-v3/libsupc++ -std=gnu++17 -nostdinc++
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings
-Wcast-qual -Wabi=2 -fdiagnostics-show-location=once -ffunction-sections
-fdata-sections -frandom-seed=floating_from_chars.lo -fimplicit-templates -g
-O2 -D_GNU_SOURCE -c
../../../../../libstdc++-v3/src/c++17/floating_from_chars.cc  -fPIC -DPIC
-D_GLIBCXX_SHARED -o floating_from_chars.o
during RTL pass: reload
../../../../../libstdc++-v3/src/c++17/floating_from_chars.cc: In function
‘std::from_chars_result std::__from_chars_float16_t(const char*, const char*,
float&, chars_format)’:
../../../../../libstdc++-v3/src/c++17/floating_from_chars.cc:1289:1: internal
compiler error: maximum number of generated reload insns per insn achieved (90)
 1289 | }
  | ^
0xed33c0 lra_constraints(bool)
../../gcc/lra-constraints.cc:5429
0xeba98a lra(_IO_FILE*, int)
../../gcc/lra.cc:2440
0xe688c7 do_reload
../../gcc/ira.cc:5973
0xe688c7 execute
../../gcc/ira.cc:6161
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug middle-end/112917] Most strub execution tests FAIL on SPARC

2023-12-08 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112917

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0

[Bug middle-end/112909] [14 Regression] glibc -Wuninitialized build failure for i686-gnu

2023-12-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112909

Richard Biener changed:

      What|Removed |Added
-------------------------------
Resolution|---     |FIXED
    Status|ASSIGNED|RESOLVED

--- Comment #4 from Richard Biener ---
Should be fixed now.

[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues

2023-12-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
Bug 24639 depends on bug 112909, which changed state.

Bug 112909 Summary: [14 Regression] glibc -Wuninitialized build failure for i686-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112909

      What|Removed |Added
-------------------------------
    Status|ASSIGNED|RESOLVED
Resolution|---     |FIXED

[Bug middle-end/112909] [14 Regression] glibc -Wuninitialized build failure for i686-gnu

2023-12-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112909

--- Comment #3 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:5e25baa7e577f9b73f746005efb5ccd4e000e51e

commit r14-6319-g5e25baa7e577f9b73f746005efb5ccd4e000e51e
Author: Richard Biener 
Date:   Fri Dec 8 09:14:43 2023 +0100

tree-optimization/112909 - uninit diagnostic with abnormal copy

The following avoids spurious uninit diagnostics for SSA name
copies which mostly appear when the source is marked as abnormal
which prevents copy propagation.

To prevent regressions I remove the bail out for anonymous SSA
names in the PHI arg place from warn_uninitialized_phi leaving
that to warn_uninit where I handle SSA copies from a SSA name
which isn't anonymous.  In theory this might cause more
valid and false positive diagnostics to pop up.

PR tree-optimization/112909
* tree-ssa-uninit.cc (find_uninit_use): Look through a
single level of SSA name copies with single use.

* gcc.dg/uninit-pr112909.c: New testcase.
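
For readers unfamiliar with the idea of "looking through a single level of SSA name copies with single use", here is a toy sketch of that lookup. It is not GCC code: `ToySsaName` and `name_to_warn_about` are hypothetical names invented for illustration, and the real logic in tree-ssa-uninit.cc works on GCC's own IR and may differ in detail.

```cpp
#include <cassert>

// Toy stand-in for an SSA name (hypothetical; not GCC's `tree` type).
struct ToySsaName {
  const char *user_var;       // nullptr models an anonymous SSA name
  const ToySsaName *copy_of;  // non-null when defined by a plain copy
  int num_uses;               // how many statements use this name
  bool has_definition;        // false models an uninitialized value
};

// Return the name a diagnostic should mention, or nullptr if there is
// nothing to report in this toy model.
const ToySsaName *name_to_warn_about(const ToySsaName *use) {
  assert(use != nullptr);
  const ToySsaName *n = use;
  // Look through at most ONE copy, and only when that copy has a single use.
  if (n->copy_of != nullptr && n->num_uses == 1)
    n = n->copy_of;
  // Only an undefined, non-anonymous name is worth reporting.
  if (n->has_definition || n->user_var == nullptr)
    return nullptr;
  return n;
}
```

In this model an anonymous single-use copy of an uninitialized user variable `x` resolves to `x`, while a copy with several uses, or a chain of two copies, is left alone.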

[Bug fortran/111503] Issues with POINTER, OPTIONAL, CONTIGUOUS dummy arguments

2023-12-08 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111503

anlauf at gcc dot gnu.org changed:

            What|Removed                      |Added
-----------------------------------------------------------------------
        Assignee|unassigned at gcc dot gnu.org|anlauf at gcc dot gnu.org
Last reconfirmed|                             |2023-12-08
  Ever confirmed|0                            |1
          Status|UNCONFIRMED                  |ASSIGNED

--- Comment #2 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2023-December/060003.html
