[Bug tree-optimization/111221] New: Floating point handling a*1.0 vs. a+0.0

2023-08-28 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111221

Bug ID: 111221
   Summary: Floating point handling a*1.0 vs. a+0.0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

I just noticed that gcc will optimize away multiplying a floating
point number by 1.0, but will not do so for an addition of 0.0.
Example, with -O3,

double add0 (double a)
{
  return a + 0.0;
}

double mul1 (double a)
{
  return a * 1.0;
}

yields

add0:
.LFB0:
.cfi_startproc
pxor    %xmm1, %xmm1
addsd   %xmm1, %xmm0
ret

vs.

mul1:
.LFB1:
.cfi_startproc
ret

which seems inconsistent.  If this is the result of a deliberate design
decision, feel free to close as WONTFIX.
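
As background (not the maintainers' answer): under IEEE 754 with the default round-to-nearest mode, `-0.0 + 0.0` evaluates to `+0.0`, so `a + 0.0` is not an identity when signed zeros are honored, whereas `a * 1.0` keeps the sign of zero. The sketch below checks this at run time; the helper names are hypothetical, and it assumes the default rounding mode and no `-ffast-math`.

```c
#include <math.h>

/* Hypothetical helpers. The volatile objects keep the arithmetic from
   being constant-folded, so the run-time result is what gets inspected. */

/* Nonzero if (x + 0.0) has the same sign bit as x. */
int add0_keeps_sign(double x)
{
    volatile double a = x;
    volatile double zero = 0.0;
    double r = a + zero;
    return (signbit(r) != 0) == (signbit(x) != 0);
}

/* Nonzero if (x * 1.0) has the same sign bit as x. */
int mul1_keeps_sign(double x)
{
    volatile double a = x;
    volatile double one = 1.0;
    double r = a * one;
    return (signbit(r) != 0) == (signbit(x) != 0);
}
```

`add0_keeps_sign(-0.0)` is the one case that returns 0. GCC's `-fno-signed-zeros` (implied by `-ffast-math`) allows the compiler to ignore that case and simplify `a + 0.0`.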

[Bug c++/109859] [12/13/14 Regression] ICE on concept mis-typed as template type parameter

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109859

Andrew Pinski  changed:

   What|Removed |Added

 CC||stevenxia990430 at gmail dot com

--- Comment #3 from Andrew Pinski  ---
*** Bug 111220 has been marked as a duplicate of this bug. ***

[Bug c++/111220] ICE with std::integral in template

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111220

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 109859.

*** This bug has been marked as a duplicate of bug 109859 ***

[Bug c++/111220] New: ICE with std::integral in template

2023-08-28 Thread stevenxia990430 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111220

Bug ID: 111220
   Summary: ICE with std::integral in template
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stevenxia990430 at gmail dot com
  Target Milestone: ---

The following invalid program triggers an internal compiler error on
gcc trunk.

To quickly reproduce: https://gcc.godbolt.org/z/TsKjKbfrd
```
#include 
template
```

Note that it requires --std=c++20 or c++23; with the default standard it
reports errors without crashing.

[Bug tree-optimization/110111] bool patterns that should produce a?b:c

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110111

--- Comment #3 from Andrew Pinski  ---
f1:
  _6 = b_2(D) ^ c_3(D);
  _7 = a_1(D) & _6;
  _4 = c_3(D) ^ _7;

Which was done due to:
/* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
(simplify
 (bit_ior:c (bit_and:cs @0 (bit_not @2)) (bit_and:cs @1 @2))
 (bit_xor (bit_and (bit_xor @0 @1) @2) @0))

Note that if we move this over to bitwise_inverted_equal_p (which we should),
we will also lose:
```
bool f(int a, int b, int t)
{
  bool x = a == 0;
  bool y = b == 1;
  bool m = t == 2;
  bool mp = !m;
  return (x & mp) | (y & m);
}
```
Which is currently handled.
We should check for `element_precision (type) == 1` too.

So something like:

(simplify
 (bit_ior (bit_and:c@and1 @0 @3) (bit_and:c@and2 @1 @2))
 (with { bool wascmp; }
  (if (bitwise_inverted_equal_p (@0, @2, wascmp))
   (switch
    /* For 1bit, wascmp can be true and we can just convert it into
       `m ? y : x`.  */
    (if (INTEGRAL_TYPE_P (type) && element_precision (type) == 1)
     (cond @3 @0 @1))
    (if (!wascmp && element_precision (type) != 1
         && single_use (@and1) && single_use (@and2))
     (bit_xor (bit_and (bit_xor @0 @1) @2) @0))
   )
  )
 )
)

/* 1bit `((x ^ y) & m) ^ x` should just be converted into `m ? y : x` early */
(simplify
 (bit_xor:c (bit_and:c (bit_xor:c @0 @1) @2) @0)
 (if (INTEGRAL_TYPE_P (type) && element_precision (type) == 1)
  (cond @2 @1 @0)))
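
The identity behind these patterns can be checked in plain C. This is a minimal sketch with hypothetical function names; it only illustrates the algebra and says nothing about how match.pd itself is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* (x & ~m) | (y & m): bits of y where m is set, bits of x elsewhere. */
uint32_t select_ior(uint32_t x, uint32_t y, uint32_t m)
{
    return (x & ~m) | (y & m);
}

/* ((x ^ y) & m) ^ x: the xor form of the same bitwise selection. */
uint32_t select_xor(uint32_t x, uint32_t y, uint32_t m)
{
    return ((x ^ y) & m) ^ x;
}

/* For 1-bit values the selection is simply m ? y : x. */
bool select_bool(bool x, bool y, bool m)
{
    return m ? y : x;
}
```

For example, `select_ior(0xAAAAAAAA, 0x55555555, 0xFFFF0000)` and the xor form both give `0x5555AAAA`.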

[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page

2023-08-28 Thread sray at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983

--- Comment #5 from Mao  ---
(In reply to Andrew Pinski from comment #4)
> `make html` is the way to build the HTML web pages ...

Thanks for the help. Yes, I have confirmed with the generated HTML as well. My
patch can fix it.

[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983

--- Comment #4 from Andrew Pinski  ---
(In reply to Mao from comment #3)
> Created attachment 55810 [details]
> invoke-doc-patch
> 
> I think this can help fix the issue.
> I am not sure how to build the HTML web pages. But I also checked the man
> page. The fpatchable-function-entry is also missing in the manpage in the
> option summary section. And my fix can solve this issue.

`make html` is the way to build the HTML web pages ...

[Bug tree-optimization/111149] bool0 != bool1 should be expanded as bool0 ^ bool1

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111149

Andrew Pinski  changed:

   What|Removed |Added

  Component|middle-end  |tree-optimization

--- Comment #2 from Andrew Pinski  ---
I need to look into this again for the gimple level because I have noticed VRP
changes bool != bool into bool ^ bool but we should be able to do it without
VRP.
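
For reference, the equivalence in question holds for any pair of 0/1 values. A minimal sketch with hypothetical function names:

```c
#include <stdbool.h>

/* bool0 != bool1 */
bool ne_bool(bool a, bool b)
{
    return a != b;
}

/* bool0 ^ bool1: same truth table as != when both operands are 0 or 1. */
bool xor_bool(bool a, bool b)
{
    return a ^ b;
}
```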

[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page

2023-08-28 Thread sray at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983

Mao  changed:

   What|Removed |Added

 CC||sray at live dot com

--- Comment #3 from Mao  ---
Created attachment 55810
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55810&action=edit
invoke-doc-patch

I think this can help fix the issue.
I am not sure how to build the HTML web pages. But I also checked the man page.
The fpatchable-function-entry is also missing in the manpage in the option
summary section. And my fix can solve this issue.

I still need more time to fight with the git-send-mail and my email provider
before I can send this patch file to the gcc-patches mail list...

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

--- Comment #14 from Andrew Pinski  ---
I have a patch which is able to optimize this to:
  t1_3 = b_1(D) >= a_2(D);
  _6 = b_1(D) > a_2(D);
  _4 = t1_3 ^ _6;


But then we need to handle some simplifications for ^.

I will handle that next week or so ...
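
The remaining step is presumably to recognize that `(b >= a) ^ (b > a)` is just `a == b`. A minimal sketch of that equivalence for integers, with hypothetical function names:

```c
#include <stdbool.h>

/* The GIMPLE above written back as C: (b >= a) ^ (b > a). */
bool ge_xor_gt(int a, int b)
{
    bool t1 = b >= a;
    bool t2 = b > a;
    return t1 ^ t2;
}

/* The form the bug title asks for. */
bool plain_eq(int a, int b)
{
    return a == b;
}
```

The two comparisons differ only when `a == b`, which is why the xor collapses to the equality test.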

[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185

--- Comment #8 from Andrew Pinski  ---
I have a patch which converts this into:
  _1 = x_4(D) < 0;
  _2 = y_5(D) <= 0;
  _3 = _1 ^ _2;

[Bug target/110943] RISC-V: vmv.v.x and vmv.s.x pattern combine error

2023-08-28 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110943

Lehua Ding  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Lehua Ding  ---
Fixed.

[Bug target/110943] RISC-V: vmv.v.x and vmv.s.x pattern combine error

2023-08-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110943

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:973eb0deb467c79cc21f265a710a81054cfd3e8c

commit r14-3535-g973eb0deb467c79cc21f265a710a81054cfd3e8c
Author: Lehua Ding 
Date:   Tue Aug 29 09:54:22 2023 +0800

RISC-V: Fix error combine of pred_mov pattern

This patch fixes PR110943, which produces wrong code. The cause is an
incorrect combine of some pred_mov patterns. Consider this code:

```

void foo9 (void *base, void *out, size_t vl)
{
    int64_t scalar = *(int64_t*)(base + 100);
    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
    *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089
{pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1
MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
(reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717
{*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t
*)out_4(D)]+0 S[32, 32] A128])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089
{pred_movrvvm2di})
```

This combine changes the semantics of insn 14. I split the @pred_mov pattern
and restrict the condition of @pred_mov.

PR target/110943

gcc/ChangeLog:

* config/riscv/predicates.md
(vector_const_int_or_double_0_operand):
New predicate.
* config/riscv/riscv-vector-builtins.cc
(function_expander::function_expander):
force_reg mem target operand.
* config/riscv/vector.md (@pred_mov): Wrapper.
(*pred_mov): Remove imm -> reg pattern.
(*pred_broadcast_imm): Add imm -> reg pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Adjust.
* gcc.target/riscv/rvv/base/pr110943.c: New test.

[Bug fortran/111218] Conflict in BIND(C) INTERFACEs in two Modules leads to ICE.

2023-08-28 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111218

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---

> (tested with gcc version 14.0.0 20230828 (experimental) [master
> r14-3528-gc3669bb677b] (GCC)

No ICE with a 14.0.0 20230824 gfortran

[Bug tree-optimization/107880] bool tautology missed optimisation

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #4 from Andrew Pinski  ---
With a patch I have for PR 95185

we get:
```
  _1 = b_2(D) == a_3(D);
  _10 = b_2(D) ^ a_3(D);
  _5 = _1 ^ _10;
```

Which is better than before.

One more improvement would be:
```
bool a(bool x, bool y)
{
bool t = x == y;
return t ^ x;
}
```

Into:
```
bool a0(bool x, bool y)
{
bool t = (x ^ y);
return t ^ x ^1; // ~y
}
```

So the 2 which are needed still:

/* (a == b) ^ a -> b^1 */
(simplify
 (bit_xor:c (eq:c zero_one_valued_p@0 zero_one_valued_p@1) @0)
 (bit_xor @1 { build_one_cst (type); }))

/* (a == b) ^ (a^b) -> b^(b^1) or (b^b)^1 or rather 1 */
(simplify
 (bit_xor:c (eq:c zero_one_valued_p@0 zero_one_valued_p@1) (bit_xor:c @0 @1))
 { build_one_cst (type); })

So mine.
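
Both proposed simplifications can be checked over the four 0/1 input combinations. A minimal sketch with hypothetical function names:

```c
#include <stdbool.h>

/* (a == b) ^ a, which should equal b ^ 1 for 0/1 values. */
bool eq_xor_a(bool a, bool b)
{
    return (a == b) ^ a;
}

/* (a == b) ^ (a ^ b), which should always be 1 for 0/1 values. */
bool eq_xor_xor(bool a, bool b)
{
    return (a == b) ^ (a ^ b);
}
```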

[Bug c/111219] -Wformat-truncation false negative with %p modifier

2023-08-28 Thread ndesaulniers at google dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219

--- Comment #2 from Nick Desaulniers  ---
Ah ok that makes sense.

Would it be possible to get that behavior documented on this page?

https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wformat-truncation

We can probably modify clang to match this behavior then.

It's good to know that this was intentional, but too bad that Martin did the
work to change this, but the kernel commit still disabled the diagnostic.

Martin's GCC patch is dated:
Date: Tue Nov 29 21:08:02 2016

Linus' kernel patch is dated:
Date:   Wed Jul 12 19:25:47 2017 -0700

(So this was changed in GCC BEFORE the kernel commit; perhaps Linus was using
an older release at the time. Or perhaps there was something else Linus was
witnessing).

[Bug c/111219] -Wformat-truncation false negative with %p modifier

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78512

--- Comment #1 from Andrew Pinski  ---
From GCC itself:
case 'p':
  /* The %p output is implementation-defined.  It's possible
 to determine this format but due to extensions (especially
 those of the Linux kernel -- see bug 78512) the first %p
 in the format string disables any further processing.  */
  return false;

[Bug c/111219] New: -Wformat-truncation false negative with %p modifier

2023-08-28 Thread ndesaulniers at google dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219

Bug ID: 111219
   Summary: -Wformat-truncation false negative with %p modifier
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ndesaulniers at google dot com
  Target Milestone: ---

I noticed that -Wformat-truncation was disabled in the linux kernel.

commit bd664f6b3e37 ("disable new gcc-7.1.1 warnings for now")

I was curious since I was unfamiliar with that flag.  I filed a bug against
clang to look into implementing something similar.

https://github.com/llvm/llvm-project/issues/64871

They extended their existing -Wfortify-source flag instead (*sigh*), but we
noticed now in the Linux kernel that `-Wfortify-source` is flagging a few cases
where kernel devs have added custom format flags for pretty printing oft-used
data structures, which is tripping up this warning, since these format
specifiers are not part of the language standard.

A recent kernel patch looks to re-enable -Wformat-truncation for W=1 kernel
builds.  Nathan noticed that GCC is not warning for the %p related flags,
whereas clang is (with -Wfortify-source).

I don't think GCC's current behavior is intentional?

For example, consider the following code:
```
void foo (void *x) {
char dst [1];
__builtin_snprintf(dst, sizeof(dst), "%p", x);
}
```
Clang-18 (trunk, not yet released, after
https://github.com/llvm/llvm-project/commit/0c9c9dd9a24f9d715d950fef0ac7aae01437af96)
with -Wfortify-source will warn:

```
tmp.c:3:5: warning: 'snprintf' will always be truncated; specified size is 1,
but format string expands to at least 4 [-Wfortify-source]
3 | __builtin_snprintf(dst, sizeof(dst), "%p", x);
  | ^
```

GCC with -Wformat-truncation does not warn, but I think it should.
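
Independent of compile-time diagnostics, truncation can be detected at run time from the return value of `snprintf`, which reports the length the full output would have had. A minimal sketch; the helper name and the 64-byte scratch buffer are assumptions made for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if formatting p with "%p" into a buffer
   of `size` bytes would be truncated, 0 if it fits, -1 on error. */
int ptr_format_truncates(void *p, size_t size)
{
    char buf[64];
    if (size == 0 || size > sizeof buf)
        return -1;
    /* snprintf returns the number of characters the complete output
       needs, not counting the terminating NUL. */
    int n = snprintf(buf, size, "%p", p);
    if (n < 0)
        return -1;
    return (size_t)n >= size;
}
```

With `size` equal to 1, as in the example above, only the terminating NUL fits, so any non-empty `%p` output is reported as truncated.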

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||95185

--- Comment #13 from Andrew Pinski  ---
But this depends on PR 95185 still.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185
[Bug 95185] Failure to optimize specific kind of sign comparison check

[Bug tree-optimization/107880] bool tautology missed optimisation

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880
Bug 107880 depends on bug 107881, which changed state.

Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|DUPLICATE   |---

[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887
Bug 107887 depends on bug 107881, which changed state.

Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|DUPLICATE   |---

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|DUPLICATE   |---

--- Comment #12 from Andrew Pinski  ---
Actually, reopening since it is not an exact dup.

But still mine.

[Bug testsuite/111216] [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716

2023-08-28 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216

Peter Bergner  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org

--- Comment #2 from Peter Bergner  ---
The code change that led to this looks correct to me.  Are we possibly just
folding more than we used to (a good thing), and that is changing our numbers? 
What are the actual and expected counts?

I'm sorry for repeating myself, but I really really dislike counting xxlor
insns, since they're mostly used for register copies and the number of those
can easily change with the phase of the moon, day of the week, etc. etc.

[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> 
> Something like:
> Prefer ^ over ==
> ```
> (for cmp
> (for cmpN
> (for neeq
>  (simplify
>   (neeq:c (cmp @0 @1) @3
>   (if (cmpN == inverseof(cmp, TREE_TYPE (type))
>(bit_xor (cmpN @0 @1) @3)
>   )
>  )
> )))
> ```

Actually we can just do:
```
/* For CMP == b, prefer CMP` ^ b. */
(for neeq (ne eq)
 (for cmp (tcc_comparison)
  (simplify
   (neeq:c (cmp@0 @1 @2) @3)
   (bit_xor (bit_not! @0) @3)
  )
 )
)
```
Since we already have folding of (bit_not cmp) in another place.
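
For integer comparisons, the `eq` case of this rewrite can be illustrated in C. This is a minimal sketch with hypothetical function names; it uses `<` and its inverse `>=` as one concrete instance of `cmp`.

```c
#include <stdbool.h>

/* (x < y) == b: the original "CMP == b" form. */
bool cmp_eq(int x, int y, bool b)
{
    return (x < y) == b;
}

/* (x >= y) ^ b: the inverted comparison combined with xor. */
bool invcmp_xor(int x, int y, bool b)
{
    return (x >= y) ^ b;
}
```

The inversion of `<` to `>=` is only valid as written for integers; with floating-point operands a NaN makes both `x < y` and `x >= y` false.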

[Bug bootstrap/111141] Compiling gcc-13.2.0 on Ubuntu 22.04.3 LTS, problem asm-generic/errno.h

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111141

--- Comment #4 from Andrew Pinski  ---
(In reply to etienne_lorrain from comment #3)
> Unlike for ARM64 host compiling a native compiler, you need to say such
> --disable-multilib for amd64 compiling a native compiler.

Well, aarch64 (arm64 [which is technically not a thing]) defaults to having only
one multi-lib (LP64), while x86_64 (amd64 is the non-canonical name for
x86_64) defaults to having both 64-bit and 32-bit multi-libs.

[Bug bootstrap/111141] Compiling gcc-13.2.0 on Ubuntu 22.04.3 LTS, problem asm-generic/errno.h

2023-08-28 Thread etienne_lorrain at yahoo dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111141

--- Comment #3 from etienne_lorrain at yahoo dot fr ---
Just reporting that the problem does not appear when --disable-multilib is
passed at the configure stage.
Unlike on an ARM64 host building a native compiler, you need to pass
--disable-multilib when building a native compiler on amd64.

[Bug tree-optimization/107880] bool tautology missed optimisation

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880
Bug 107880 depends on bug 107881, which changed state.

Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887
Bug 107887 depends on bug 107881, which changed state.

Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Andrew Pinski  ---
Basically a dup of bug 95185.

*** This bug has been marked as a duplicate of bug 95185 ***

[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org

--- Comment #6 from Andrew Pinski  ---
*** Bug 107881 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #10 from Andrew Pinski  ---
Mine. There was another bug where we had `cmp == b`, and I mentioned that the
way to improve that is to prefer ^ and `~cmp`.

[Bug tree-optimization/101676] ^ not changed to | if the non-zero don't overlap

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101676

--- Comment #2 from Andrew Pinski  ---
(In reply to Richard Biener from comment #1)
> why is | better than ^?

Just to reply to this. The reasoning from simplify-rtx.cc:
  /* If we are XORing two things that have no bits in common,
 convert them into an IOR.  This helps to detect rotation encoded
 using those methods and possibly other simplifications.  */

Which was added with r0-24478-g79e8185c9ccfcb .
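
The rotation case mentioned in that comment can be illustrated as follows. This is a minimal sketch with hypothetical function names: the two shifted halves of a rotate never share a set bit, so `^` and `|` produce the same value.

```c
#include <stdint.h>

/* Rotate left written with ^. Requires 0 < n < 32. */
uint32_t rotl_xor(uint32_t x, unsigned n)
{
    return (x << n) ^ (x >> (32 - n));
}

/* The same rotate written with |, the more commonly recognized form. */
uint32_t rotl_or(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}
```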

[Bug tree-optimization/111147] bitwise_inverted_equal_p can be used in the `(x | y) & (~x ^ y)` pattern to catch more

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111147

Andrew Pinski  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628600.html
   Keywords||patch

--- Comment #1 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628600.html

[Bug c++/111160] [11/12/13/14 Regression] ICE on assigning volatile through ternary operator

2023-08-28 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111160

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #2 from Marek Polacek  ---
Started with r6-4886:

commit cda0a029f45d20f4535dcacf6c3194352c31e736
Author: Jason Merrill 
Date:   Fri Nov 13 19:08:05 2015 -0500

Merge C++ delayed folding branch.

[Bug testsuite/111216] [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-28
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
test6_nor in fold-vec-logical-ors-char.c

Trying 10, 9 -> 11:
   10: r127:V16QI=const_vector // (-1)
9: r125:V16QI=r126:V16QI-r124:V16QI
  REG_DEAD r126:V16QI
  REG_DEAD r124:V16QI
   11: r121:V16QI=r125:V16QI+r127:V16QI
  REG_DEAD r127:V16QI
  REG_DEAD r125:V16QI
  REG_EQUAL r125:V16QI+const_vector
Failed to match this instruction:
(set (reg:V16QI 121 [  ])
(plus:V16QI (not:V16QI (reg:V16QI 124))
(reg:V16QI 126 [ *foo_4(D) ])))
Successfully matched this instruction:
(set (reg:V16QI 127)
(not:V16QI (reg:V16QI 124)))
Successfully matched this instruction:
(set (reg:V16QI 121 [  ])
(plus:V16QI (reg:V16QI 127)
(reg:V16QI 126 [ *foo_4(D) ])))
allowing combination of insns 9, 10 and 11
original costs 4 + 20 + 4 = 28
replacement costs 4 + 4 = 8
deferring deletion of insn with uid = 9.
modifying insn i2    10: r127:V16QI=~r124:V16QI
  REG_DEAD r124:V16QI
deferring rescan insn with uid = 10.
modifying insn i3    11: r121:V16QI=r127:V16QI+r126:V16QI
  REG_DEAD r126:V16QI
  REG_DEAD r127:V16QI
deferring rescan insn with uid = 11.
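
The combination above relies on the two's-complement identity `a - b - 1 == a + ~b`. A minimal scalar sketch for a single 8-bit lane, with hypothetical function names (the real instructions operate on V16QI vectors):

```c
#include <stdint.h>

/* Insns 9 and 11 before combine: (a - b) + (-1), wrapped to 8 bits. */
uint8_t sub_then_dec(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b - 1);
}

/* After combine: a + ~b, wrapped to 8 bits. */
uint8_t add_not(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + (uint8_t)~b);
}
```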

[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #4 from Andrew Pinski  ---
Fixed. Filed PR 111217 as mentioned.

[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails

2023-08-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215

--- Comment #3 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a

commit r14-3529-gb7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a
Author: Andrew Pinski 
Date:   Mon Aug 28 19:27:41 2023 +

Fix cond-bool-2.c on powerpc and other targets

This adds `--param logical-op-non-short-circuit=1` to the testcase
so it becomes a target independent testcase now.
I filed PR 111217 as the variant of the testcase which fails independently
of the param.

Committed as obvious after testing to make sure it passes on powerpc now.

gcc/testsuite/ChangeLog:

PR testsuite/111215
* gcc.dg/tree-ssa/cond-bool-2.c: Add
`--param logical-op-non-short-circuit=1` to the options.

[Bug tree-optimization/111217] variant of cond-bool-2.c fails

2023-08-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111217

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a

commit r14-3529-gb7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a
Author: Andrew Pinski 
Date:   Mon Aug 28 19:27:41 2023 +

Fix cond-bool-2.c on powerpc and other targets

This adds `--param logical-op-non-short-circuit=1` to the testcase
so it becomes a target independent testcase now.
I filed PR 111217 as the variant of the testcase which fails independently
of the param.

Committed as obvious after testing to make sure it passes on powerpc now.

gcc/testsuite/ChangeLog:

PR testsuite/111215
* gcc.dg/tree-ssa/cond-bool-2.c: Add
`--param logical-op-non-short-circuit=1` to the options.

[Bug fortran/111218] New: Conflict in BIND(C) INTERFACEs in two Modules leads to ICE.

2023-08-28 Thread toon at moene dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111218

Bug ID: 111218
   Summary: Conflict in BIND(C) INTERFACEs in two Modules leads to
ICE.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: toon at moene dot org
  Target Milestone: ---

The following program:

MODULE FIELD_2RM_UTIL_MODULE

INTERFACE

SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER()
BIND(C,name="set_abor1_exception_handler")
END SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER

END INTERFACE

END MODULE
MODULE FIELD_3RM_UTIL_MODULE

INTERFACE

SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER()
BIND(C,name="set_abor1_exception_handler")
END SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER

END INTERFACE

END MODULE
MODULE FIELD_UTIL_MODULE


USE FIELD_2RM_UTIL_MODULE
USE FIELD_3RM_UTIL_MODULE

IMPLICIT NONE

END MODULE

leads to the following internal compiler error:

/home/toon/compilers/install/gcc/bin/gfortran -c -g a.f90

in gfc_format_decoder, at fortran/error.cc:1078
0x75917d gfc_format_decoder
/home/toon/compilers/gcc/gcc/fortran/error.cc:1078
0x2153e1f pp_format(pretty_printer*, text_info*)
/home/toon/compilers/gcc/gcc/pretty-print.cc:1475
0x21315be diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*)
/home/toon/compilers/gcc/gcc/diagnostic.cc:1606
0x9b628e gfc_report_diagnostic
/home/toon/compilers/gcc/gcc/fortran/error.cc:890
0x9b628e gfc_error_opt
/home/toon/compilers/gcc/gcc/fortran/error.cc:1460
0x9b7470 gfc_error(char const*, ...)
/home/toon/compilers/gcc/gcc/fortran/error.cc:1489
0xa6205b ambiguous_symbol
/home/toon/compilers/gcc/gcc/fortran/symbol.cc:3167
0xa6ce9e gfc_find_sym_tree(char const*, gfc_namespace*, int, gfc_symtree**)
/home/toon/compilers/gcc/gcc/fortran/symbol.cc:3240
0xa6cec1 gfc_find_symbol(char const*, gfc_namespace*, int, gfc_symbol**)
/home/toon/compilers/gcc/gcc/fortran/symbol.cc:3291
0xb27a05 check_against_globals
/home/toon/compilers/gcc/gcc/fortran/frontend-passes.cc:5842
0xa630e2 do_traverse_symtree
/home/toon/compilers/gcc/gcc/fortran/symbol.cc:4190
0xb30231 gfc_check_externals(gfc_namespace*)
/home/toon/compilers/gcc/gcc/fortran/frontend-passes.cc:5888
0xa293d8 gfc_parse_file()
/home/toon/compilers/gcc/gcc/fortran/parse.cc:7195
0xa7aecf gfc_be_parse_file
/home/toon/compilers/gcc/gcc/fortran/f95-lang.cc:229
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

(tested with gcc version 14.0.0 20230828 (experimental) [master
r14-3528-gc3669bb677b] (GCC)

[Bug tree-optimization/111217] New: variant of cond-bool-2.c fails

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111217

Bug ID: 111217
   Summary: variant of cond-bool-2.c fails
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
static inline _Bool nand(_Bool a, _Bool b)
{
  _Bool t = 0;
  if (a) { if (b) t = 1; }
  return !t;
  //  return !(a && b);
}

_Bool f(int a, int b)
{
return nand(nand(b, nand(a, a)), nand(a, nand(b, b)));
}
```

we get at ifcombine:
   [local count: 1073741824]:
  if (a_3(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  if (b_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  if (b_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  ...

   [local count: 536870912]:
  # iftmp.0_21 = PHI <1(3), 0(4)>

So we could swap these ifs around slightly

 if (b_2(D) != 0) goto L1; else goto L2;
L1:
 if (a_3(D) != 0) goto L3; else goto L4;
L3: goto L4;
L4:
 iftmp.0_21 = PHI <1(3), 0(4)>
L1: goto bb5;

And then it will be optimized.
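
For context, working through the NAND network by hand shows that `f` computes the xor of the truth values of `a` and `b`. The sketch below repeats the testcase next to that reduced form; `f_expected` is a hypothetical name, and this is a reading of the source-level semantics rather than a claim about what any pass currently emits.

```c
#include <stdbool.h>

static inline bool nand(bool a, bool b)
{
    bool t = 0;
    if (a) { if (b) t = 1; }
    return !t;
}

/* The testcase: the classic four-gate NAND construction of xor. */
bool f(int a, int b)
{
    return nand(nand(b, nand(a, a)), nand(a, nand(b, b)));
}

/* What the network reduces to: xor of the two truth values. */
bool f_expected(int a, int b)
{
    return (a != 0) ^ (b != 0);
}
```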

[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215

--- Comment #2 from Andrew Pinski  ---
So there might be two ways of fixing this:
   [local count: 1073741824]:
  if (a_3(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  if (b_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  if (b_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870912]:
  ...

   [local count: 536870912]:
  # iftmp.0_21 = PHI <1(3), 0(4)>

So we could swap this if around slightly

 if (b_2(D) != 0) goto L1; else goto L2;
L1:
 if (a_3(D) != 0) goto L3; else goto L4;
L3: goto L4;
L4:
 iftmp.0_21 = PHI <1(3), 0(4)>
L1: goto bb5;

Implementing that will take some time.

But --param=logical-op-non-short-circuit=1 is enough to fix the testcase so
that is what I am going to use here.

Will file another bug about the above case.

[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes

2023-08-28 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209

--- Comment #5 from cqwrteur  ---
(In reply to Jakub Jelinek from comment #4)
> (In reply to cqwrteur from comment #3)
> > (In reply to Jakub Jelinek from comment #1)
> > > Just use __int128 addition if all you want is double-word addition (or 
> > > long
> > > long for 32-bit arches)?
> > 
> > Well, I've presented this merely as an illustrative example. The length can
> > actually be arbitrary.
> 
> No, it was working with all the other lengths.

This might come across as unusual, but I frequently manipulate the carry flag
directly, as in this implementation of 128-bit division (for 32-bit machines
and the Microsoft compiler):


auto shift = static_cast(::std::countl_zero(divisorhigh) -
::std::countl_zero(dividendhigh));

divisorhigh =
::fast_io::intrinsics::shiftleft(divisorlow,divisorhigh,shift);
divisorlow <<= shift;

quotientlow = 0;
bool carry;
do
{
carry=0;
   
dividendlow=intrinsics::subc(dividendlow,divisorlow,carry,carry);
   
dividendhigh=intrinsics::subc(dividendhigh,divisorhigh,carry,carry);
constexpr T zero{};
T mask{zero-carry};
T templow{divisorlow},temphigh{divisorhigh};
carry=!carry;
   
quotientlow=intrinsics::addc(quotientlow,quotientlow,carry,carry);
carry=0;
dividendlow=intrinsics::addc(dividendlow,templow,carry,carry);
   
dividendhigh=intrinsics::addc(dividendhigh,temphigh,carry,carry);
divisorlow = intrinsics::shiftright(divisorlow,divisorhigh,1u);
divisorhigh >>= 1u;
}
while(shift--);
return {quotientlow,0,dividendlow,dividendhigh};

[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes

2023-08-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209

--- Comment #4 from Jakub Jelinek  ---
(In reply to cqwrteur from comment #3)
> (In reply to Jakub Jelinek from comment #1)
> > Just use __int128 addition if all you want is double-word addition (or long
> > long for 32-bit arches)?
> 
> Well, I've presented this merely as an illustrative example. The length can
> actually be arbitrary.

No, it was working with all the other lengths.

[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes

2023-08-28 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209

--- Comment #3 from cqwrteur  ---
(In reply to Jakub Jelinek from comment #1)
> Just use __int128 addition if all you want is double-word addition (or long
> long for 32-bit arches)?

Well, I've presented this merely as an illustrative example. The length can
actually be arbitrary. I've directly taken the code from the GCC documentation,
but it doesn't appear to perform as the document asserts.

"
Built-in Function: unsigned int __builtin_addc (unsigned int a, unsigned int b,
unsigned int carry_in, unsigned int *carry_out)
Built-in Function: unsigned long int __builtin_addcl (unsigned long int a,
unsigned long int b, unsigned int carry_in, unsigned long int *carry_out)
Built-in Function: unsigned long long int __builtin_addcll (unsigned long long
int a, unsigned long long int b, unsigned long long int carry_in, unsigned long
long int *carry_out)
These built-in functions are equivalent to:

  ({ __typeof__ (a) s; \
  __typeof__ (a) c1 = __builtin_add_overflow (a, b, &s); \
  __typeof__ (a) c2 = __builtin_add_overflow (s, carry_in, &s); \
  *(carry_out) = c1 | c2; \
  s; })
i.e. they add 3 unsigned values, set what the last argument points to to 1 if
any of the two additions overflowed (otherwise 0) and return the sum of those 3
unsigned values. Note, while all the first 3 arguments can have arbitrary
values, better code will be emitted if one of them (preferably the third one)
has only values 0 or 1 (i.e. carry-in).
"
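To illustrate the "arbitrary length" use mentioned above, here is a minimal sketch of a multi-limb addition built on the expansion quoted from the documentation. It uses __builtin_add_overflow rather than __builtin_addcl so that it does not depend on GCC 14; the helper names addc_ul and add_limbs are hypothetical, and whether the compiler turns the loop into an adc chain is exactly what this bug is about.

```c
#include <stddef.h>

/* Hypothetical helper: one add-with-carry step, written as the
   expansion the documentation gives for the __builtin_addc family. */
static unsigned long addc_ul(unsigned long a, unsigned long b,
                             unsigned long carry_in,
                             unsigned long *carry_out)
{
  unsigned long s;
  unsigned long c1 = __builtin_add_overflow(a, b, &s);
  unsigned long c2 = __builtin_add_overflow(s, carry_in, &s);
  *carry_out = c1 | c2;
  return s;
}

/* Hypothetical helper: r = x + y over n limbs, least significant limb
   first.  Returns the carry out of the most significant limb. */
static unsigned long add_limbs(unsigned long *r, const unsigned long *x,
                               const unsigned long *y, size_t n)
{
  unsigned long carry = 0;
  for (size_t i = 0; i < n; i++)
    r[i] = addc_ul(x[i], y[i], carry, &carry);
  return carry;
}
```

With two limbs, adding 1 to a number whose low limb is all ones should clear the low limb and carry into the high limb.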

Additionally, it is advisable to avoid __uint128_t in certain situations: the
type is not available with the Microsoft compiler or on 32-bit targets.
Moreover, the compiler does not optimize the associated code effectively.

[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes

2023-08-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209

Jakub Jelinek  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-28
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 55809
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55809&action=edit
gcc14-pr111209.patch

Anyway, here is a patch that makes it match, but it is getting ugly to avoid
making it match prematurely and break other matching.

[Bug target/111209] GCC fails to understand adc pattern what its document describes

2023-08-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209

--- Comment #1 from Jakub Jelinek  ---
Just use __int128 addition if all you want is double-word addition (or long
long for 32-bit arches)?

[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org
   Keywords||testsuite-fail
  Component|other   |testsuite
   Last reconfirmed||2023-08-28
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
I think I know what the issue is with the testcase.

[Bug testsuite/111216] New: [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716

2023-08-28 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216

Bug ID: 111216
   Summary: [14 regression] instructions counts for vector tests
change after r14-3258-ge7a36e4715c716
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:e7a36e4715c7162ccfd7cd32da985d629bbd9c61, r14-3258-ge7a36e4715c716

FAIL: gcc.target/powerpc/fold-vec-logical-ors-char.c scan-assembler-times
\\mxxlnor\\M 1
FAIL: gcc.target/powerpc/fold-vec-logical-ors-char.c scan-assembler-times
\\mxxlor\\M 7
FAIL: gcc.target/powerpc/fold-vec-logical-ors-int.c scan-assembler-times
\\mxxlnor\\M 1
FAIL: gcc.target/powerpc/fold-vec-logical-ors-int.c scan-assembler-times
\\mxxlor\\M 7
FAIL: gcc.target/powerpc/fold-vec-logical-ors-longlong.c scan-assembler-times
\\mxxlnor\\M 3
FAIL: gcc.target/powerpc/fold-vec-logical-ors-longlong.c scan-assembler-times
\\mxxlor\\M 9
FAIL: gcc.target/powerpc/fold-vec-logical-ors-short.c scan-assembler-times
\\mxxlnor\\M 1
FAIL: gcc.target/powerpc/fold-vec-logical-ors-short.c scan-assembler-times
\\mxxlor\\M 7
FAIL: gcc.target/powerpc/fold-vec-logical-other-char.c scan-assembler-times
\\mxxlnand\\M 3
FAIL: gcc.target/powerpc/fold-vec-logical-other-int.c scan-assembler-times
\\mxxlnand\\M 3
FAIL: gcc.target/powerpc/fold-vec-logical-other-longlong.c scan-assembler-times
\\mxxlnand\\M 3
FAIL: gcc.target/powerpc/fold-vec-logical-other-short.c scan-assembler-times
\\mxxlnand\\M 3

These are all just instruction count tests so the changes may not matter.

commit e7a36e4715c7162ccfd7cd32da985d629bbd9c61 (HEAD)
Author: Yanzhang Wang 
Date:   Wed Aug 16 22:28:50 2023 -0600

[PATCH] RISC-V: Support simplify (-1-x) for vector.

[Bug target/111107] i686-w64-mingw32 does not realign stack when __attribute__((aligned)) or __attribute__((vector_size)) are used

2023-08-28 Thread gabrielopcode at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111107

Gabriel Ivăncescu  changed:

   What|Removed |Added

 CC||gabrielopcode at gmail dot com

--- Comment #7 from Gabriel Ivăncescu  ---
So, to reiterate, a summary of the problem:

1) The i686 Win32 ABI has a de-facto stack alignment of 4 bytes *only*. GCC may
have set it to 16 bytes on Linux because it compiled the whole userland, but
that's not the case on Windows; the caller can be MSVC compiled code (very
likely on Windows) and MSVC only uses 4-byte alignment.

2) SSE is *not* the only thing that requires stack realignment. Sure, it does
require it, but that's more a side effect of requiring larger-than-4 alignment
in the first place.

A variable (or its type) declared with __attribute__((aligned(...))) **should**
also let GCC re-align the stack upon entry, if it's > 4 bytes and if it's
actually used on the stack and spilled (or has its address taken).

There's no reason to special-case SSE at all. It's just the alignment of the
variable or spilled vector that should matter, and GCC must know that the
incoming stack is aligned only to 4 bytes on this platform.

i686 PE targets should simply default to -mincoming-stack-boundary=2
-mpreferred-stack-boundary=2 (the latter to minimize realignments unless
necessary), as that's basically MSVC's behavior, and as such the de-facto
standard on this platform.
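A small runtime check can make the problem visible. The following is a minimal sketch, assuming GCC attribute syntax; the function name aligned_local_ok is hypothetical. On a target where the stack is realigned as needed it returns 1; on an affected i686-w64-mingw32 build entered with only 4-byte stack alignment it could return 0.

```c
#include <stdint.h>

/* Hypothetical check: returns 1 if a local declared with 16-byte
   alignment really is 16-byte aligned at run time. */
static int aligned_local_ok(void)
{
  __attribute__((aligned(16))) volatile unsigned char buf[16];
  /* Read the address back through a volatile object so the compiler
     cannot fold the comparison from the declared alignment. */
  volatile uintptr_t addr = (uintptr_t)buf;
  buf[0] = 0;
  return (addr % 16) == 0;
}
```

To reproduce the reported situation, the function would have to be called from code compiled with 4-byte stack alignment, such as MSVC-built code.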

[Bug other/111215] New: New test case gcc.dg/tree-ssa/cond-bool-2.c fails

2023-08-28 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215

Bug ID: 111215
   Summary: New test case gcc.dg/tree-ssa/cond-bool-2.c fails
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:ddd64a6ec3b38e18aefb9fcba50c0d9297e5e711, r14-3432-gddd64a6ec3b38e
make  -k check-gcc RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/cond-bool-2.c"
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized "ne_expr, "
2
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-not optimized "gimple_cond "
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-not optimized "gimple_phi "
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized
"bit_xor_expr, " 1
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized
"gimple_assign " 3
# of expected passes            3
# of unexpected failures        5

commit ddd64a6ec3b38e18aefb9fcba50c0d9297e5e711 (HEAD)
Author: Andrew Pinski 
Date:   Tue Aug 22 18:41:56 2023 -0700

MATCH: remove negate for 1bit types

* gcc.dg/tree-ssa/cond-bool-2.c: New test.

[Bug fortran/102417] Wrong error message about character length with -std=f2018

2023-08-28 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102417

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Keywords|diagnostic  |rejects-valid, wrong-code
 CC||anlauf at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=107721
 Ever confirmed|0   |1
   Last reconfirmed||2023-08-28

--- Comment #2 from anlauf at gcc dot gnu.org ---
It appears that we lose the typespec for nested ctors, so I guess this PR
is related to pr107721.

Slight variation of testcase:

program p
  character:: x = 'a'
  character(4) :: y(2)
  y = [ character(4) :: x, 'b' ]
  y = [[character(4) :: x, 'b']]
  print *, y
  print *, len ([ character(4) :: x, 'b' ])
  print *, len ([[character(4) :: x, 'b']])
end

Compiling with -fdump-fortran-original, I see:

  code:
  ASSIGN p:y(FULL) (/ p:x , 'b   ' /)
  ASSIGN p:y(FULL) (/ p:x , 'b' /)
  WRITE UNIT=6 FMT=-1
  TRANSFER p:y(FULL)
  DT_END
  WRITE UNIT=6 FMT=-1
  TRANSFER 4
  DT_END
  WRITE UNIT=6 FMT=-1
  TRANSFER 1
  DT_END

Clearly, the code for the lines with nested ctors is wrong.

[Bug target/111171] [14 Regression] ICE: in decompose, at rtl.h:2297 at -O1 on riscv64-unknown-linux-gnu

2023-08-28 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171

--- Comment #2 from Zdenek Sojka  ---
(In reply to Xi Ruoyao from comment #1)
> Can you try
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627024.html?

The patch
* combine.cc (simplify_compare_const): Properly handle unsigned
constants while narrowing comparison of memory and constants.
fixes this ICE on several testcases

[Bug libgomp/111214] New: omp_get_num_procs: Improve documentation - especially for devices

2023-08-28 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111214

Bug ID: 111214
   Summary: omp_get_num_procs: Improve documentation - especially
for devices
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: documentation
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Current wording:
https://gcc.gnu.org/onlinedocs/libgomp/omp_005fget_005fnum_005fprocs.html

"Returns the number of processors online on that device."

(A) For the host, I wonder whether it should mention the affinity bits, which
we have in Linux:

  if (gomp_places_list == NULL)
...  && pthread_getaffinity_np (pthread_self (), gomp_get_cpuset_size,
 gomp_cpusetp) == 0)
...


(B) We are completely silent for devices.

The sentiment during today's OpenMP accel talk seemed to be that it should be
the number of independent hardware threads, i.e. #warps (nvptx) and #wavefronts
(amdgcn) in hardware (possibly minus those removed via explicit num_threads
settings).

We currently have for accelerators:
  return gomp_icv (false)->nthreads_var;
where gomp_icv (false) is:
  struct gomp_task *task = gomp_thread ()->task;
  if (task)
    return &task->icv; /*... */
  else
    return &gomp_global_icv;

On GCN, we set gomp_global_icv.nthreads_var = 16, and for nvptx
gomp_global_icv.nthreads_var = ntids.

Possibly, it should be omp_get_num_teams() * nthreads_var, or not.

In any case, it needs to be documented.

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #18 from Thorsten Glaser  ---
I cannot, unfortunately. But I have found _another_ “mitigation”:

varsub() is static and has only one caller:
https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=alioth/mksh.git;a=blob;f=eval.c;h=cb959b1d1104229ead20a698ff2dc974b8da3b10;hb=35563a7897b98de2743233c5f3340a14bea6ebf2#l400

By making varsub…
https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=alioth/mksh.git;a=blob;f=eval.c;h=cb959b1d1104229ead20a698ff2dc974b8da3b10;hb=35563a7897b98de2743233c5f3340a14bea6ebf2#l1238
… not static, the bug *also* goes away. (Probably because varsub is not
inlined.)

Now we see that…
 399 sp = cstrchr(sp, '\0') + 1;
 400 type = varsub(, varname, sp, ,
);
… the varsub call is *directly* below the strchr/strlen line, *and* it gets
passed the sp variable. (Inside varsub, the variable is also modified.)

My suspicion here is that, somehow only triggerable on x32+dietlibc, something
about the multiple modifications of sp (just before and within varsub) confuses
GCC?

And indeed. Adding -O2, -O1, -O0 to the GCC command line doesn’t help, but
-fno-inline again does.

As does adding an attribute to the function prototype:
static int varsub(Expand *, const char *, const char *, unsigned int *, int *)
__attribute__((noinline));

Could we somehow debug there further? I really don’t see a way to reproduce
this on x32/glibc or amd64…

[Bug target/111171] [14 Regression] ICE: in decompose, at rtl.h:2297 at -O1 on riscv64-unknown-linux-gnu

2023-08-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #1 from Xi Ruoyao  ---
Can you try https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627024.html?

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #17 from Thorsten Glaser  ---
Hm, okay, I’ll try to find if I can trigger it in glibc/x32 then…

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #16 from Thorsten Glaser  ---
If I add -maddress-mode=long to the build of the expr.c file, then link it with
the rest, it still fails.

I’m not sure about reducing, and not sure about the cross-anything, but I *did*
get it to fail on amd64 now! (Just differently.) HOWEVER, I’m not sure whether
this is from x32/amd64 mismatch or from the bug, as the resulting pattern
differs.

The code flow is roughly: eval.c from line 1608 onwards opens a temporary file,
dups it to stdout, calls funsub() from line 2147, and on return rewinds that
file and restores stdout. This all is called from line 352 (where the jump to
the subroutine is), but the strlen in question is on line 399 in a different
codepath (where the stuff immediately following '${' is parsed). They only have
the use of the variable 'sp' and the jumping past the first NUL in it in common
(the funsub caller has 'sp = strnul(sp) + 1;' instead, but that’s just
'sp+strlen(sp)', and changing the 'sp = cstrchr(sp, '\0') + 1;' to that (which
I did in upstream CVS HEAD now anyway) doesn’t “fix” the issue.

In a Debian sid/amd64 chroot, with GCC 13.2.0-1 (as packaged in Debian), I did:

gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing
-fstack-protector-strong -malign-data=abi -fwrapv  -I. -D_FORTIFY_SOURCE=2
-DMKSH_BUILDMEAT -DMKSH_BUILDSH=1 -D_GNU_SOURCE -DSETUID_CAN_FAIL_WITH_EAGAIN=1
-DHAVE_STRING_POOLING=2 -DHAVE_ATTRIBUTE_BOUNDED=0 -DHAVE_ATTRIBUTE_FORMAT=1
-DHAVE_ATTRIBUTE_NORETURN=1 -DHAVE_ATTRIBUTE_UNUSED=1 -DHAVE_ATTRIBUTE_USED=1
-DHAVE_SYS_TIME_H=1 -DHAVE_TIME_H=1 -DHAVE_BOTH_TIME_H=1 -DHAVE_SYS_SELECT_H=1
-DHAVE_SELECT_TIME_H=1 -DHAVE_SYS_BSDTYPES_H=0 -DHAVE_SYS_FILE_H=1
-DHAVE_SYS_MKDEV_H=0 -DHAVE_SYS_MMAN_H=1 -DHAVE_SYS_PARAM_H=1
-DHAVE_SYS_PTEM_H=0 -DHAVE_SYS_RESOURCE_H=1 -DHAVE_SYS_SYSMACROS_H=1
-DHAVE_BSTRING_H=0 -DHAVE_GRP_H=1 -DHAVE_IO_H=0 -DHAVE_LIBGEN_H=1
-DHAVE_LIBUTIL_H=0 -DHAVE_PATHS_H=1 -DHAVE_STDINT_H=1 -DHAVE_STRINGS_H=1
-DHAVE_TERMIOS_H=1 -DHAVE_ULIMIT_H=1 -DHAVE_VALUES_H=1 -DHAVE_CAN_INTTYPES=1
-DHAVE_SIG_T=1 -DHAVE_STRERRORDESC_NP=1 -DHAVE_SYS_ERRLIST=1
-DHAVE_SIGABBREV_NP=1 -DHAVE_SYS_SIGNAME=0 -DHAVE_SIGDESCR_NP=1
-DHAVE_SYS_SIGLIST=1 -DHAVE_FLOCK=1 -DHAVE_LOCK_FCNTL=1 -DHAVE_RLIMIT=1
-DHAVE_RLIM_T=1 -DHAVE_GET_CURRENT_DIR_NAME=1 -DHAVE_GETRANDOM=0
-DHAVE_GETRUSAGE=1 -DHAVE_GETSID=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_KILLPG=1
-DHAVE_MEMMOVE=1 -DHAVE_MKNOD=0 -DHAVE_MMAP=1 -DHAVE_FTRUNCATE=1 -DHAVE_NICE=1
-DHAVE_RENAME=1 -DHAVE_REVOKE=0 -DHAVE_POSIX_UTF8_LOCALE=0 -DHAVE_SELECT=1
-DHAVE_SETRESUGID=1 -DHAVE_SETGROUPS=1 -DHAVE_SIGACTION=1 -DHAVE_STRERROR=0
-DHAVE_STRSIGNAL=0 -DHAVE_STRLCPY=0 -DHAVE_STRSTR=1 -DHAVE_FLOCK_DECL=1
-DHAVE_REVOKE_DECL=1 -DHAVE_SYS_ERRLIST_DECL=1 -DHAVE_SYS_SIGLIST_DECL=1
-DHAVE_ST_MTIMENSEC=0 -DHAVE_INTCONSTEXPR_RSIZE_MAX=0
-DHAVE_PERSISTENT_HISTORY=1 -DMKSH_BUILD_R=599 -c lalloc.c edit.c eval.c exec.c
expr.c funcs.c histrap.c jobs.c lex.c main.c misc.c shf.c syn.c tree.c var.c
ulimit.c strlcpy.c

gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing
-fstack-protector-strong -malign-data=abi -fwrapv   -fno-lto -o mksh lalloc.o
edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o lex.o main.o misc.o shf.o
syn.o tree.o var.o ulimit.o strlcpy.o

./mksh -c 'x=q; x=${ echo a; typeset e=2; return 3; echo x$e;}; echo .$x.'

gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing
-fstack-protector-strong -malign-data=abi -fwrapv  -I. -D_FORTIFY_SOURCE=2
-DMKSH_BUILDMEAT -DMKSH_BUILDSH=1 -D_GNU_SOURCE -DSETUID_CAN_FAIL_WITH_EAGAIN=1
-DHAVE_STRING_POOLING=2 -DHAVE_ATTRIBUTE_BOUNDED=0 -DHAVE_ATTRIBUTE_FORMAT=1
-DHAVE_ATTRIBUTE_NORETURN=1 -DHAVE_ATTRIBUTE_UNUSED=1 -DHAVE_ATTRIBUTE_USED=1
-DHAVE_SYS_TIME_H=1 -DHAVE_TIME_H=1 -DHAVE_BOTH_TIME_H=1 -DHAVE_SYS_SELECT_H=1
-DHAVE_SELECT_TIME_H=1 -DHAVE_SYS_BSDTYPES_H=0 -DHAVE_SYS_FILE_H=1
-DHAVE_SYS_MKDEV_H=0 -DHAVE_SYS_MMAN_H=1 -DHAVE_SYS_PARAM_H=1
-DHAVE_SYS_PTEM_H=0 -DHAVE_SYS_RESOURCE_H=1 -DHAVE_SYS_SYSMACROS_H=1
-DHAVE_BSTRING_H=0 -DHAVE_GRP_H=1 -DHAVE_IO_H=0 -DHAVE_LIBGEN_H=1
-DHAVE_LIBUTIL_H=0 -DHAVE_PATHS_H=1 -DHAVE_STDINT_H=1 -DHAVE_STRINGS_H=1
-DHAVE_TERMIOS_H=1 -DHAVE_ULIMIT_H=1 -DHAVE_VALUES_H=1 -DHAVE_CAN_INTTYPES=1
-DHAVE_SIG_T=1 -DHAVE_STRERRORDESC_NP=1 -DHAVE_SYS_ERRLIST=1
-DHAVE_SIGABBREV_NP=1 -DHAVE_SYS_SIGNAME=0 -DHAVE_SIGDESCR_NP=1
-DHAVE_SYS_SIGLIST=1 -DHAVE_FLOCK=1 -DHAVE_LOCK_FCNTL=1 -DHAVE_RLIMIT=1
-DHAVE_RLIM_T=1 -DHAVE_GET_CURRENT_DIR_NAME=1 -DHAVE_GETRANDOM=0
-DHAVE_GETRUSAGE=1 -DHAVE_GETSID=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_KILLPG=1
-DHAVE_MEMMOVE=1 -DHAVE_MKNOD=0 -DHAVE_MMAP=1 -DHAVE_FTRUNCATE=1 -DHAVE_NICE=1
-DHAVE_RENAME=1 -DHAVE_REVOKE=0 -DHAVE_POSIX_UTF8_LOCALE=0 -DHAVE_SELECT=1
-DHAVE_SETRESUGID=1 -DHAVE_SETGROUPS=1 -DHAVE_SIGACTION=1 -DHAVE_STRERROR=0
-DHAVE_STRSIGNAL=0 -DHAVE_STRLCPY=0 -DHAVE_STRSTR=1 -DHAVE_FLOCK_DECL=1
-DHAVE_REVOKE_DECL=1 -DHAVE_SYS_ERRLIST_DECL=1 -DHAVE_SYS_SIGLIST_DECL=1
-DHAVE_ST_MTIMENSEC=0 

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #15 from H.J. Lu  ---
We need a testcase which can be reproduced with glibc since the bug may be in
other parts of dietlibc.

[Bug c++/111173] G++ allows constinit functions

2023-08-28 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111173

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot 
gnu.org

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

Uroš Bizjak  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #14 from Uroš Bizjak  ---
(In reply to Thorsten Glaser from comment #13)
> The interesting part is around the occurrence of…
> 
> # eval.c:399:   sp = cstrchr(sp, '\0') + 1;
> 
> … in the .s files (it occurs thrice, the first is the beginning of the setup
> part, the second and third surround the strlen call, so they’re all within a
> bunch of lines).

Unfortunately, a runtime bug requires a test that fails at runtime; the
attached dumps are not that usable. The fact that the compiler fails for a not
so common target makes things even harder.

I think that the best way forward is to create a minimized standalone testcase
(from comment #11 it looks like the issue is independent of dietlibc) that can
be compiled with -mx32 in a kind of cross-compiler fashion. You can use
-maddress-mode=long with -mx32 to create a .s assembly file that is compatible
with x86_64, as far as stack handling is concerned.

The resulting .s assembly can then be compiled and linked with a C wrapper, so
a testcase that eventually fails on x86_64 can be produced.

IOW, does the testcase fail when -maddress-mode=long is used?

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #13 from Thorsten Glaser  ---
The interesting part is around the occurrence of…

# eval.c:399:   sp = cstrchr(sp, '\0') + 1;

… in the .s files (it occurs thrice, the first is the beginning of the setup
part, the second and third surround the strlen call, so they’re all within a
bunch of lines).

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #12 from Thorsten Glaser  ---
Created attachment 55808
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55808&action=edit
tarball (.xz) with preprocessed and assembly output

I’ve verified (back to unmodified source) that it is indeed only the file
eval.c that’s at fault.

I’ve compiled mksh with gcc-12, then built that one file with gcc-13, linked
with gcc-12, and it failed.

I’m attaching an xz-compressed tarball with preprocessed and assembly (both
AT and Intel, both -fverbose-asm and not) of the file with exactly identical
options between GCC 12 and 13, in the hope of that being helpful to hunt this
down.

[Bug target/111212] [13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791

2023-08-28 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212

--- Comment #2 from Mathieu Malaterre  ---
reduced:

% g++  -maltivec -mcpu=power8 -O2 -c testcase.i
testcase.i:15:30: warning: '{anonymous}::m {anonymous}::n(a) [with f =
short int]' used but never defined
   15 | template  m n(a);
  |  ^
testcase.i: In function 'void f::o::b()':
testcase.i:66:25: error: unrecognizable insn:
   66 | void b() { bo(bj()); }
  | ^
(insn 14 10 15 2 (set (reg:DI 127)
(ashift:DI (reg:DI 126)
(const_int 56 [0x38]))) "testcase.i":61:8 -1
 (nil))
during RTL pass: vregs
testcase.i:66:25: internal compiler error: in extract_insn, at recog.cc:2791
0x10813967 internal_error(char const*, ...)
???:0
0x10813a77 fancy_abort(char const*, int, char const*)
???:0
0x1041cb67 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
???:0
0x1041cba3 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
???:0
0x10bc12b7 extract_insn(rtx_insn*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.


with:

% cat testcase.i
typedef int a;
typedef short b;
namespace c {
template  struct g { using h = e &; };
template  using i = typename g::h;
template  struct k { using h = i; };
template  class ad;
template  class ad {
public:
  typename k::h operator[](a);
};
} // namespace c
namespace {
template  using m = c::ad;
template  m n(a);
} // namespace
#pragma GCC target "cpu=power10"
namespace f {
namespace o {
template  struct p { using f = q; };
namespace detail {
template  struct aq { using h = p; };
template  struct at {
  static constexpr a au = 0;
  using h = typename aq::h;
};
} // namespace detail
template 
using ax = typename detail::at::h;
template  using az = typename ay::f;
namespace detail {
template  struct be {
  static void bf(a, a) {
ax d;
bd()(f(), d);
  }
};
} // namespace detail
template  class bg {
public:
  template  void operator()(f) {
a bh;
constexpr a bi = as;
constexpr a bc{};
detail::be::bf(1, bh);
  }
};
template  class bj {
public:
  template  void operator()(f r) { bg()(r); }
};
template  void bl(bk bm) { bm(b()); }
template  void bn(bk bm) { bl(bm); }
template  void bo(bk bm) { bn(bm); }
struct s {
  template  void operator()(f, ay) {
ay bp;
using bq = az;
a br;
auto bs = n(br);
bq bt[]{8, 5, 4, 4, 5, 4, 9, 8, 5};
for (a j;;)
  bs[j] = bt[j];
  }
};
void b() { bo(bj()); }
} // namespace o
} // namespace f

[Bug c/111059] [11/12/13/14 Regression] ICE: in gimplify_expr, at gimplify.cc:17253

2023-08-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111059

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
void f() {
  (_Bool) (0 / 0);
}
ICEs too, so I think the problem is elsewhere.

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #11 from Thorsten Glaser  ---
OK, to summarise:

When using the original code but providing a wrapper function (in a separate
CU) for strchr, it works.

When replacing the strchr with strlen (which GCC also does), it fails even
without the presence of dietlibc’s strlen. (And yes, disassembly of main.o
(where I added it) shows no call to dietlibc from xstrlen.)

This doesn’t seem to be coupled to the name of the function (the wrapper
functions are called cstrchr and xstrlen, so the compiler cannot make any
assumptions about them).

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #10 from Thorsten Glaser  ---
oh no, wait, that was for strchr… the strlen one… but, yeah, that too:

extern size_t xstrlen(const char *s);

and changing the line again to…

sp += xstrlen(sp) + 1;

… and adding in another .c file:

size_t xstrlen(const char *s) {
register const char *cp = s;

while (*cp++ != '\0')
;

return (--cp - s);
}

And it still fails.

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread tg at mirbsd dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

--- Comment #9 from Thorsten Glaser  ---
> Does providing your own (trivially correct) strlen implementation in a 
> separate CU also fix the issue?

Even providing one that just calls dietlibc’s (in a separate CU) fixes the
issue, so I’m very sure it’s not that, but probably some codegen surrounding
the call.

[Bug tree-optimization/111211] No warning for iterator going out of scope for writing to array of inline-asm

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||inline-asm
Summary|No warning for iterator |No warning for iterator
   |going out of scope  |going out of scope for
   ||writing to array of
   ||inline-asm

--- Comment #5 from Andrew Pinski  ---
Note this is really only an issue with inline-asm, and only if the asm writes
directly into the array.

If we change the code slightly:
```
#include <stdint.h>
int foo2 (uint64_t ddr0_addr_access)
{
uint64_t check[1] = {0};

for (int k = 0; k < 8; k += 1)
{
int t;
asm volatile ("nop" : "=r"(t) : "r"(ddr0_addr_access));
check[k] = t;
}
return 0;
}
```

GCC does warn (though slightly differently):
```
:11:18: warning: iteration 1 invokes undefined behavior
[-Waggressive-loop-optimizations]
   11 | check[k] = t;
  | ~^~~
:7:23: note: within this loop
7 | for (int k = 0; k < 8; k += 1)
  | ~~^~~
```

[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.

2023-08-28 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167

--- Comment #10 from Christophe Lyon  ---
(In reply to Jonathan Wakely from comment #9)
> (In reply to Christophe Lyon from comment #8)
> > On arm-eabi targets (thus, using newlib), we've noticed new errors:
> 
> New since when? These files haven't changed in the last two weeks.

The bisection pointed to the patch in comment #6.

[Bug tree-optimization/111146] Some patterns in match.pd are no longer needed

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111146

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/111146] Some patterns in match.pd are no longer needed

2023-08-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111146

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:cbde03abe5dbba13b992a3b610efe43aefc0e234

commit r14-3527-gcbde03abe5dbba13b992a3b610efe43aefc0e234
Author: Andrew Pinski 
Date:   Sun Aug 27 17:04:04 2023 -0700

MATCH: Remove redundant pattern for `(x | y) & ~x`

After r14-2885-gb9237226fdc938, this pattern becomes
redundant as we match it using bitwise_inverted_equal_p.

There is already a testcase (gcc.dg/nand.c) for this pattern
and it still passes after the removal.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/111146
* match.pd (`(x | y) & ~x`, `(x & y) | ~x`): Remove
redundant pattern.
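
The two identities behind the removed patterns can be checked exhaustively on small operands. The following is only an illustrative sketch in C, not GCC's match.pd code; the helper names are made up for this example, and it assumes the simplified forms are `y & ~x` and `y | ~x`.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of GCC: original and simplified forms. */
static uint8_t or_and_not(uint8_t x, uint8_t y) { return (uint8_t)((x | y) & ~x); }
static uint8_t and_not(uint8_t x, uint8_t y)    { return (uint8_t)(y & ~x); }
static uint8_t and_or_not(uint8_t x, uint8_t y) { return (uint8_t)((x & y) | ~x); }
static uint8_t or_not(uint8_t x, uint8_t y)     { return (uint8_t)(y | ~x); }

/* Returns 1 if both identities hold for every pair of 8-bit operands,
   0 as soon as a counterexample is found. */
static int identities_hold(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            uint8_t a = (uint8_t)x, b = (uint8_t)y;
            if (or_and_not(a, b) != and_not(a, b))
                return 0;
            if (and_or_not(a, b) != or_not(a, b))
                return 0;
        }
    }
    return 1;
}
```

Because the operations are bitwise, checking 8-bit operands covers every per-bit combination, so the result carries over to wider integer types.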

[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.

2023-08-28 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167

--- Comment #9 from Jonathan Wakely  ---
(In reply to Christophe Lyon from comment #8)
> On arm-eabi targets (thus, using newlib), we've noticed new errors:

New since when? These files haven't changed in the last two weeks.

[Bug c/111211] No warning for iterator going out of scope

2023-08-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

--- Comment #4 from Andrew Pinski  ---
(In reply to Lehua Ding from comment #3)
> (In reply to Richard Biener from comment #2)
> > We diagnose this after unrolling, so the difference is whether we unroll or
> > not.
> 
> But based on the assembly code it looks like both are unrolled.
> 
> foo:
> nop
> nop
> nop
> nop
> nop
> nop
> nop
> xor eax, eax
> ret
> foo2:
> nop
> nop
> nop
> nop
> nop
> nop
> nop
> nop
> xor eax, eax
> ret

At different times in the pipeline, and the warning happens before the second
unrolling.

[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7

2023-08-28 Thread shaohua.li at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210

--- Comment #5 from Shaohua Li  ---
Thanks for all your comments!

[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7

2023-08-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210

--- Comment #4 from Alexander Monakov  ---
The testcase is small enough to notice the issue by inspection.

Note that you get the "expected" answer with -fno-strict-aliasing, and as
explained in https://gcc.gnu.org/bugs/ it is one of the things you should check
when submitting a bugreport:

Before reporting that GCC compiles your code incorrectly, compile it with gcc
-Wall -Wextra and see whether this shows anything wrong with your code.
Similarly, if compiling with -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations makes a difference, or if compiling with
-fsanitize=undefined produces any run-time errors, then your code is probably
not correct.

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

Richard Biener  changed:

   What|Removed |Added

 CC||guojiufu at gcc dot gnu.org,
   ||sayle at gcc dot gnu.org

--- Comment #6 from Richard Biener  ---
Roger was working on TImode incoming(?) argument code generation, this is
TImode outgoing argument code generation where we produce for 32bit parts

7: NOTE_INSN_BASIC_BLOCK 2
2: r84:SI=di:SI
3: r85:SI=si:SI
4: r86:SI=dx:SI
5: r87:SI=cx:SI
6: NOTE_INSN_FUNCTION_BEG
9: r88:DI=zero_extend(r84:SI)
   10: r89:DI=r82:TI#0
   11: r91:DI=0x
   12: {r90:DI=r89:DI:DI;clobber flags:CC;}
   13: {r92:DI=r90:DI|r88:DI;clobber flags:CC;}
   14: r82:TI=r82:TI&<0x,0>|zero_extend(r92:DI)
   15: r93:DI=zero_extend(r85:SI)
   16: {r94:DI=r93:DI<<0x20;clobber flags:CC;}
   17: r95:DI=r82:TI#0
   18: r96:DI=zero_extend(r95:DI#0)
   19: {r97:DI=r96:DI|r94:DI;clobber flags:CC;}
   20: r82:TI=r82:TI&<0x,0>|zero_extend(r97:DI)
   21: r98:DI=zero_extend(r86:SI)
   22: r99:DI=r82:TI#8
   23: r101:DI=0x
   24: {r100:DI=r99:DI:DI;clobber flags:CC;}
   25: {r102:DI=r100:DI|r98:DI;clobber flags:CC;}
   26: r82:TI=r82:TI&<0,0x>|zero_extend(r102:DI)<<0x40
   27: r103:DI=zero_extend(r87:SI)
   28: {r104:DI=r103:DI<<0x20;clobber flags:CC;}
   29: r105:DI=r82:TI#8
   30: r106:DI=zero_extend(r105:DI#0)
   31: {r107:DI=r106:DI|r104:DI;clobber flags:CC;}
   32: r82:TI=r82:TI&<0,0x>|zero_extend(r107:DI)<<0x40
   33: r108:DI=r82:TI#0
   34: r109:DI=r82:TI#8
   35: di:DI=r108:DI
   36: si:DI=r109:DI
   37: ax:DI=call [`do_smth_with_4_u32'] argc:0

and we fail to dissect "backwards" from the

   33: r108:DI=r82:TI#0
   34: r109:DI=r82:TI#8

subregs.  Possibly one issue is that we re-use r82.  The dual-use of r82
at the end also poses issues as combine tries to match things like

(parallel [ 
(set (reg:DI 108 [ D.2865 ])
(subreg:DI (reg:TI 82 [ D.2865 ]) 0))
(set (reg:TI 82 [ D.2865 ])
(ior:TI (and:TI (reg:TI 82 [ D.2865 ])
(const_wide_int 0x0))
(ashift:TI (zero_extend:TI (reg:DI 107))
(const_int 64 [0x40]
])  

but fails to "rename" r82 to split the parallel.

At RTL expansion time we store to D.2865, whose DECL_RTL is r82:TI, so
we can hardly fix it there.  Only a later pass could figure out that each
of the insns fully defines the reg.

Jiufu Guo is working to improve what we choose for DECL_RTL, but for
incoming params / outgoing return.  This is a case where we could,
with -fno-tree-vectorize, improve DECL_RTL for an automatic var and
choose not TImode but something like a (concat:TI reg:DI reg:DI).

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread gnu_bugzilla_gcc at catelyn dot tech via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

--- Comment #5 from gnu_bugzilla_gcc at catelyn dot tech ---
(In reply to Richard Biener from comment #4)
> note the situation is difficult to rectify - ideally the vectorizer
> would see that we require two 64bit register pieces but it doesn't - it sees
> we store into memory.

right, I figured that might have been what was going on, given some of the
related issues, the vectorizer incorrectly calculating the cost beforehand

> I'll note the non-vectorized code is also far from optimal.  clang
> produces the following which is faster by more of the delta that
> the vectorized version is slower compared to the scalar GCC variant.

I did notice that the GCC -Os and clang -O3 versions were different, didn't
realize that it was by that much, interesting

[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7

2023-08-28 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #3 from Xi Ruoyao  ---
(In reply to Shaohua Li from comment #2)
> (In reply to Alexander Monakov from comment #1)
> > 'c' is called with 'd' pointing to 'long e[2]', so
> > 
> >   return *(int *)(d + 1);
> > 
> > is an aliasing violation (dereferencing a pointer to an incompatible type).
> 
> Thanks for the quick diagnosis. I tried to enable -Wall -Wextra -pedantic
> but got no warning about the test case. Could you share how you diagnose
> this issue?

The red banner in the bug creation page says clearly:

"Similarly, if compiling with -fno-strict-aliasing -fwrapv makes a difference,
your code probably is not correct."

[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.

2023-08-28 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #8 from Christophe Lyon  ---
On arm-eabi targets (thus, using newlib), we've noticed new errors:

FAIL: std/time/clock/gps/io.cc (test for excess errors)
UNRESOLVED: std/time/clock/gps/io.cc compilation failed to produce executable
FAIL: std/time/clock/tai/io.cc (test for excess errors)
UNRESOLVED: std/time/clock/tai/io.cc compilation failed to produce executable

The logs say:
FAIL: std/time/clock/gps/io.cc (test for excess errors)
Excess errors:
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`std::filesystem::current_path(std::filesystem::__cxx11::path const&,
std::error_code&)':
/libstdc++-v3/src/c++17/fs_ops.cc:806:(.text._ZNSt10filesystem12current_pathERKNS_7__cxx114pathE+0x10):
undefined reference to `chdir'
ld:
/libstdc++-v3/src/c++17/fs_ops.cc:806:(.text._ZNSt10filesystem12current_pathERKNS_7__cxx114pathERSt10error_code+0x6):
undefined reference to `chdir'
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`(anonymous namespace)::create_dir(std::filesystem::__cxx11::path const&,
std::filesystem::perms, std::error_code&)':
/libstdc++-v3/src/c++17/fs_ops.cc:583:(.text._ZN12_GLOBAL__N_110create_dirERKNSt10filesystem7__cxx114pathENS0_5permsERSt10error_code+0xa):
undefined reference to `mkdir'
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`std::filesystem::create_directory(std::filesystem::__cxx11::path const&,
std::error_code&)':
/libstdc++-v3/src/c++17/fs_ops.cc:583:(.text._ZNSt10filesystem16create_directoryERKNS_7__cxx114pathERSt10error_code+0xe):
undefined reference to `mkdir'
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`std::filesystem::permissions(std::filesystem::__cxx11::path const&,
std::filesystem::perms, std::filesystem::perm_options, std::error_code&)':
/libstdc++-v3/src/c++17/fs_ops.cc:1134:(.text._ZNSt10filesystem11permissionsERKNS_7__cxx114pathENS_5permsENS_12perm_optionsERSt10error_code+0x7c):
undefined reference to `chmod'
ld:
/libstdc++-v3/src/c++17/fs_ops.cc:1134:(.text._ZNSt10filesystem11permissionsERKNS_7__cxx114pathENS_5permsENS_12perm_optionsERSt10error_code+0x9c):
undefined reference to `chmod'
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`std::filesystem::current_path[abi:cxx11](std::error_code&)':
/libstdc++-v3/src/c++17/fs_ops.cc:750:(.text._ZNSt10filesystem12current_pathB5cxx11ERSt10error_code+0x22):
undefined reference to `pathconf'
ld:
/libstdc++-v3/src/c++17/fs_ops.cc:769:(.text._ZNSt10filesystem12current_pathB5cxx11ERSt10error_code+0x54):
undefined reference to `getcwd'
ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function
`std::filesystem::do_copy_file(char const*, char const*,
std::filesystem::copy_options_existing_file, stat*, stat*, std::error_code&)':
/libstdc++-v3/src/c++17/../filesystem/ops-common.h:553:(.text._ZNSt10filesystem12do_copy_fileEPKcS1_NS_26copy_options_existing_fileEP4statS4_RSt10error_code+0x114):
undefined reference to `chmod'
collect2: error: ld returned 1 exit status


BTW I noticed the same error messages for other tests (eg.
std/time/clock/gps/1.cc)

[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7

2023-08-28 Thread shaohua.li at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210

--- Comment #2 from Shaohua Li  ---
(In reply to Alexander Monakov from comment #1)
> 'c' is called with 'd' pointing to 'long e[2]', so
> 
>   return *(int *)(d + 1);
> 
> is an aliasing violation (dereferencing a pointer to an incompatible type).

Thanks for the quick diagnosis. I tried to enable -Wall -Wextra -pedantic but
got no warning about the test case. Could you share how you diagnose this
issue?

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

Richard Biener  changed:

   What|Removed |Added

 Depends on||101926

--- Comment #4 from Richard Biener  ---
Your benchmark confirms the vectorized variant is slower, on a 7900X it's
both the memory roundtrip and the gpr->xmm move causing it.  perf shows

   |turn_into_struct():
 1 |  movd   %edi,%xmm1
 3 |  movd   %esi,%xmm4
 4 |  movd   %edx,%xmm0
95 |  movd   %ecx,%xmm3
 6 |  punpckldq  %xmm4,%xmm1
 2 |  punpckldq  %xmm3,%xmm0
 1 |  movdqa %xmm1,%xmm2
   |  punpcklqdq %xmm0,%xmm2
 5 |  movaps %xmm2,-0x18(%rsp)
63 |  mov    -0x18(%rsp),%rdi
70 |  mov    -0x10(%rsp),%rsi
47 |  jmp    400630

note the situation is difficult to rectify - ideally the vectorizer
would see that we require two 64bit register pieces but it doesn't - it sees
we store into memory.

I'll note the non-vectorized code is also far from optimal.  clang
produces the following, which is faster by more than the delta by which
the vectorized version is slower compared to the scalar GCC variant.

turn_into_struct:   # @turn_into_struct
.cfi_startproc
# %bb.0:
# kill: def $ecx killed $ecx def $rcx
# kill: def $esi killed $esi def $rsi
shlq    $32, %rsi
movl    %edi, %edi
orq     %rsi, %rdi
shlq    $32, %rcx
movl    %edx, %esi
orq     %rcx, %rsi
jmp     do_smth_with_4_u32  # TAILCALL
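
For readers following the assembly, the shift-and-or sequence clang emits corresponds roughly to the C below. This is a sketch under assumptions: the struct layout and the helper names `pack_pair` and `pack_quad` are hypothetical, since the report's source file is not reproduced in this thread.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Assumed layout; the actual definition from the report is not shown here. */
struct quad_u32 { u32 a, b, c, d; };

/* Combine two 32-bit values into one 64-bit register-sized piece:
   'lo' occupies bits 0..31 and 'hi' occupies bits 32..63. */
static uint64_t pack_pair(u32 lo, u32 hi)
{
    return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Split a 16-byte struct into the two 64-bit pieces in which the
   x86_64 ABI would pass it (little-endian field order assumed). */
static void pack_quad(struct quad_u32 q, uint64_t out[2])
{
    out[0] = pack_pair(q.a, q.b);
    out[1] = pack_pair(q.c, q.d);
}
```

Each piece needs one shift, one zero-extending move and one or, which matches the six arithmetic instructions in the clang output.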


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
[Bug 101926] [meta-bug] struct/complex/other argument passing and return should
be improved

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-28 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

--- Comment #7 from Martin Jambor  ---
(In reply to Jan Hubicka from comment #4)
> So here ipa-modref declares the field dead, while ipa-prop determines its
> value even if it is unused and makes it used later?

This is what I wanted to ask about.  Looking at the dumps, ipa-modref
knows it is "killed."  Is that enough or does it need to be also not
read to be known to be useless?

> 
> I think dead argument is probably better than optimizing out one store, so I
> think ipa-prop, however question is how to detect this reliably.
> 
> ipa-modref has update_signature which updates summaries after ipa-sra work,
> so it is also place to erase the info about parameter being dead from the
> summary.

This is what I have been looking at last week and where I'd like to
plug such mechanism in so that it is not even streamed from WPA.

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-28 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157

--- Comment #6 from Martin Jambor  ---
(In reply to Richard Biener from comment #5)
> I think if IPA modref declares the argument dead at the call site then IPA
> CP/SRA cannot declare it known constant.

It is declared "killed" by the function.  I still need to figure out
whether that is all I need or whether the fact that it is not read
either is the combination I am after.  But I agree that IPA-CP should
refrain from propagating clearly unneeded info in that case.

> 
> Now, I wonder why IPA CP/SRA does not replace the known constant parameter
> with an automatic var like
> 
> point.constprop.isra (double ISRA.1740, int & restrict ipoint, double &
> restrict x, double & restrict y, double & restrict z, int & restrict istat)
> {
> ...
>   const int istat.local = 0;
>   istat = 
> 
> ?  So if not all uses of 'istat' get resolved we avoid generating wrong
> code.  The expense is a constant pool entry (if not all uses are removed),
> but I think that's OK.  It would also work for aggregates.  It would also
> relieve IPA-CP modification phase from doing anything but trival value
> replacement (in case the arg isn't apointer).

I'm afraid I don't understand.  Even in this particular case, istat is
checked by the caller and the callee can assign to it also other
values, not just the one which happens to be what it is initialized to
by the caller - and in the original code it does when there is an
error - those writes cannot be redirected to a local variable.
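
The objection can be illustrated with a small C analogue. This is a hypothetical stand-in for the Fortran routine, not the 416.gamess code: the names `point` and `call_point` and the error condition are invented for illustration.

```c
/* Hypothetical stand-in for the routine discussed above. The caller
   passes a status variable by reference and inspects it afterwards. */
static double point(double x, int *restrict istat)
{
    if (x < 0.0) {
        *istat = 1;   /* error path: the caller must observe this write */
        return 0.0;
    }
    return x * 2.0;   /* success path: *istat keeps the caller's value */
}

/* The caller initializes the status to 0, calls the routine and then
   checks the status, as described for the original code. */
static int call_point(double x)
{
    int istat = 0;
    (void)point(x, &istat);
    return istat;
}
```

If the write on the error path were redirected to a callee-local copy, `call_point` would return 0 even when the error occurred.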

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread gnu_bugzilla_gcc at catelyn dot tech via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

--- Comment #3 from gnu_bugzilla_gcc at catelyn dot tech ---
(In reply to Richard Biener from comment #1)
> Unless you can come up with an actual benchmark showing the vector code is
> slower I'd say it's not.  Given it's smaller it should win on the icache
> side if not executed frequently as well.

I'm not an expert in benchmarking C, so my benchmark may be incorrect, but I
compiled the same (attached preprocessed) file with -O2, -O3, and -Os into an
object file, and then compiled a benchmarking file into an object as well (to
avoid variance caused by the benchmarking file being compiled with different
optimization levels), I added a very simple implementation for
`do_smth_with_4_u32`, and ran the `turn_into_struct` function in a hot loop,
with varying (pre-generated) input data and storing the result in an array, I
timed this hot loop using `(float)clock()/CLOCKS_PER_SEC;` at the start and
end, then added up the calculated results to ensure all three programs get the
same result

on my machine (Ryzen 9 5900X) the -Os version takes ~.36s, while the -O2 and
-O3 versions take ~.43 and ~.42 seconds

I tried both -O2 and -O3 to get a slightly better view of the typical variance
between program runs, and their times are very similar, but the -Os version is
a decent amount faster (around 16%, which I'd assume is significant)

I've added the preprocessed benchmark file as well, which I then compiled with
-mtune=generic and -march=x86-64 to match the system-under-test
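
The timing approach described above can be sketched as follows. This is not the attached benchmark: `work` is a hypothetical stand-in for `turn_into_struct`, and the input generation is made up for illustration. Only `clock()` and `CLOCKS_PER_SEC` come from the description.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical workload standing in for the function under test. */
static uint64_t work(uint32_t a, uint32_t b)
{
    return (uint64_t)a | ((uint64_t)b << 32);
}

/* Run the workload in a hot loop, accumulate a checksum so the calls
   cannot be discarded, and return the elapsed CPU time in seconds. */
static double time_hot_loop(uint32_t iterations, uint64_t *checksum)
{
    uint64_t sum = 0;
    clock_t start = clock();
    for (uint32_t i = 0; i < iterations; i++)
        sum += work(i, i ^ 0x5a5a5a5au);
    clock_t end = clock();
    *checksum = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing the checksum across builds mirrors the check in the report that all three programs compute the same result.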

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread gnu_bugzilla_gcc at catelyn dot tech via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

--- Comment #2 from gnu_bugzilla_gcc at catelyn dot tech ---
Created attachment 55807
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55807&action=edit
preprocessed file containing the benchmark code I used

I compiled this code (although using includes for clock, CLOCKS_PER_SEC,
time_t, printf, and ) to an object and linked it with the
bug-triggering file (compiled with -Os, -O2 and -O3 to test all those options),
to measure the speed of the generated implementations of the bug-triggering
file

[Bug analyzer/111213] New: -Wanalyzer-out-of-bounds false negative with `return arr[9];`

2023-08-28 Thread dale.mengli.ming at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111213

Bug ID: 111213
   Summary: -Wanalyzer-out-of-bounds false negative with `return
arr[9];`
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dale.mengli.ming at proton dot me
  Target Milestone: ---

Hi, this case (https://godbolt.org/z/98PMz1KKz) contains an out-of-bound error
(stmt: `return arr[9];`). At -O0, the analyzer can report this warning.
However, at -O1, -O2, and -O3, the analyzer doesn't report that.

After removing the `static` keyword (https://godbolt.org/z/qKohK3eeY), the
analyzer can report this warning at -O1, -O2, and -O3.

[Bug bootstrap/100932] autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT

2023-08-28 Thread nicolas at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932

Nicolas Boulenguez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Nicolas Boulenguez  ---
Quite ironically, given the only answer so far, somebody has investigated the
same issue, duplicated the effort, and applied almost the same fix.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=25861cf3a88a07c8dca3fb32d098c0ad756bbe38

[Bug c/111211] No warning for iterator going out of scope

2023-08-28 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

--- Comment #3 from Lehua Ding  ---
(In reply to Richard Biener from comment #2)
> We diagnose this after unrolling, so the difference is whether we unroll or
> not.

But based on the assembly code it looks like both are unrolled.

foo:
nop
nop
nop
nop
nop
nop
nop
xor eax, eax
ret
foo2:
nop
nop
nop
nop
nop
nop
nop
nop
xor eax, eax
ret

[Bug c/111211] No warning for iterator going out of scope

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

--- Comment #2 from Richard Biener  ---
We diagnose this after unrolling, so the difference is whether we unroll or
not.

[Bug target/111212] [13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3
Summary|internal compiler error: in |[13/14 Regression] internal
   |extract_insn, at|compiler error: in
   |recog.cc:2791   |extract_insn, at
   ||recog.cc:2791

[Bug target/111212] internal compiler error: in extract_insn, at recog.cc:2791

2023-08-28 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212

--- Comment #1 from Mathieu Malaterre  ---
Compilation line:

 % /usr/bin/c++ -freport-bug -DHWY_STATIC_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H
-I/home/malat/highway -maltivec -mcpu=power8 -O2 -g -DNDEBUG -fPIE
-fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined
-D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\"
-fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla
-Wnon-virtual-dtor -fmath-errno -fno-exceptions -DHWY_IS_TEST=1
-DGTEST_HAS_PTHREAD=1 -MD -MT
CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o -MF
CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o.d -o
CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o -c
/home/malat/highway/hwy/tests/table_test.cc

[Bug target/111212] New: internal compiler error: in extract_insn, at recog.cc:2791

2023-08-28 Thread malat at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212

Bug ID: 111212
   Summary: internal compiler error: in extract_insn, at
recog.cc:2791
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: malat at debian dot org
  Target Milestone: ---

Created attachment 55806
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55806&action=edit
Preprocessed source

I cannot compile highway on powerpc (Debian/ppc32). It fails with:

/home/malat/highway/hwy/tests/table_test.cc:189:3: error: unrecognizable insn:
  189 |   }
  |   ^
(insn 231 227 232 8 (set (reg:DI 192)
(ashift:DI (reg:DI 191)
(const_int 56 [0x38])))
"/home/malat/highway/hwy/tests/table_test.cc":165:22 -1
 (nil))
during RTL pass: vregs
/home/malat/highway/hwy/tests/table_test.cc:189:3: internal compiler error: in
extract_insn, at recog.cc:2791
0x10813967 internal_error(char const*, ...)
???:0
0x10813a77 fancy_abort(char const*, int, char const*)
???:0
0x1041cb67 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
???:0
0x1041cba3 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
???:0
0x10bc12b7 extract_insn(rtx_insn*)
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
Preprocessed source stored into /tmp/cciw5XZG.out file, please attach this to
your bugreport.

[Bug c/111211] No warning for iterator going out of scope

2023-08-28 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

--- Comment #1 from Lehua Ding  ---
Reproduce:

Compile Command: gcc -O3 -Wall -Wextra

C Code:

```
#include <stdint.h>

int foo (uint64_t ddr0_addr_access)
{
uint64_t check[1] = {0};

for (int k = 0; k < 7; k += 1)
{
asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
}
return 0;
}

int foo2 (uint64_t ddr0_addr_access)
{
uint64_t check[1] = {0};

for (int k = 0; k < 8; k += 1)
{
asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
}
return 0;
}
```

Output:

: In function 'foo':
:9:41: warning: array subscript 1 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
:9:41: warning: array subscript 2 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
:9:41: warning: array subscript 3 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
:9:41: warning: array subscript 4 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
:9:41: warning: array subscript 5 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
:9:41: warning: array subscript 6 is above array bounds of
'uint64_t[1]' {aka 'long unsigned int[1]'} [-Warray-bounds=]
9 | asm volatile ("nop" : "=r"(check[k]) : "r"(ddr0_addr_access));
  |~^~~
:5:14: note: while referencing 'check'
5 | uint64_t check[1] = {0};
  |  ^
Compiler returned: 0

[Bug c/111211] New: No warning for iterator going out of scope

2023-08-28 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211

Bug ID: 111211
   Summary: No warning for iterator going out of scope
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lehua.ding at rivai dot ai
  Target Milestone: ---

Compiler Explorer: https://godbolt.org/z/o4qPbovaz

There is a warning reported when the number of iterations is less than or equal
to 7, and no warning reported when it is greater than 7. I feel this behavior
is a bit strange.

[Bug target/111171] [14 Regression] ICE: in decompose, at rtl.h:2297 at -O1 on riscv64-unknown-linux-gnu

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111166

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-08-28
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  We vectorize the scalar code:

> ./cc1 -quiet t.i -O2 -fopt-info-vec -fdump-tree-slp-details
weird_gcc_behaviour.c:15:41: optimized: basic block part vectorized using 16
byte vectors

generating

uint64_t turn_into_struct (u32 a, u32 b, u32 c, u32 d)
{
  u32 * vectp.4;
  vector(4) unsigned int * vectp.3;
  struct quad_u32 D.2865;
  uint64_t _11;
  vector(4) unsigned int _13;

   [local count: 1073741824]:
  _13 = {a_2(D), b_4(D), c_6(D), d_8(D)};
  MEM  [(unsigned int *)] = _13;
  _11 = do_smth_with_4_u32 (D.2865);
  D.2865 ={v} {CLOBBER(eol)};
  return _11;

and

weird_gcc_behaviour.c:15:41: note: Cost model analysis:
a_2(D) 1 times scalar_store costs 12 in body
b_4(D) 1 times scalar_store costs 12 in body
c_6(D) 1 times scalar_store costs 12 in body
d_8(D) 1 times scalar_store costs 12 in body
a_2(D) 1 times vector_store costs 12 in body
node 0x5d35f70 1 times vec_construct costs 36 in prologue
weird_gcc_behaviour.c:15:41: note: Cost model analysis for part in loop 0:
  Vector cost: 48
  Scalar cost: 48
weird_gcc_behaviour.c:15:41: note: Basic block will be vectorized using SLP

we are choosing the vector side at same cost because we assume it would win
on code size.  Practically a vector store instead of a scalar store is
also good for store forwarding.

We get

turn_into_struct:
.LFB0:
.cfi_startproc
movd    %edi, %xmm1
movd    %esi, %xmm4
movd    %edx, %xmm0
movd    %ecx, %xmm3
punpckldq   %xmm4, %xmm1
punpckldq   %xmm3, %xmm0
movdqa  %xmm1, %xmm2
punpcklqdq  %xmm0, %xmm2
movaps  %xmm2, -24(%rsp)
movq    -24(%rsp), %rdi
movq    -16(%rsp), %rsi
jmp     do_smth_with_4_u32

instead of (-fno-tree-vectorize)

turn_into_struct:
.LFB0:
.cfi_startproc
xorl    %eax, %eax
movl    %ecx, %r8d
movl    %edi, %ecx
movl    %edx, %r9d
movabsq $-4294967296, %r10
movq    %rax, %rdi
xorl    %edx, %edx
salq    $32, %r8
andq    %r10, %rdi
orq     %rcx, %rdi
movq    %rsi, %rcx
salq    $32, %rcx
movl    %edi, %esi
orq     %rcx, %rsi
movq    %rdx, %rcx
andq    %r10, %rcx
movq    %rsi, %rdi
orq     %r9, %rcx
movl    %ecx, %ecx
orq     %r8, %rcx
movq    %rcx, %rsi
jmp     do_smth_with_4_u32

and our guess for code-size is correct (47 bytes for vector, 67 for scalar).
The latency for the scalar code is also quite a bit bigger.  The spilling
should be OK, the store should forward nicely.

Unless you can come up with an actual benchmark showing the vector code is
slower I'd say it's not.  Given it's smaller it should win on the icache
side if not executed frequently as well.

So - not a bug?

The spilling could be avoided by using movq, movhlps + movq, but it's
call handling so possibly difficult to achieve.

[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7

2023-08-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210

Alexander Monakov  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED
 CC||amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov  ---
'c' is called with 'd' pointing to 'long e[2]', so

  return *(int *)(d + 1);

is an aliasing violation (dereferencing a pointer to an incompatible type).
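
For comparison, one commonly used way to read the bytes of an object as a different type without violating the aliasing rules is to copy them with `memcpy`. The sketch below is only an illustration: the testcase itself is not reproduced in this thread, and the helper name `read_int_from_long` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: read an int out of the storage of d[1] without
   dereferencing an int pointer that points into a long object. */
static int read_int_from_long(const long *d)
{
    int value;
    /* Copies the first sizeof(int) bytes of d[1]; which part of the
       long that is depends on the target's byte order. */
    memcpy(&value, d + 1, sizeof value);
    return value;
}
```

Compilers typically turn such a fixed-size `memcpy` into a plain load, so this form usually costs nothing at run time.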

[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

--- Comment #8 from Richard Biener  ---
Does providing your own (trivially correct) strlen implementation in a separate
CU also fix the issue?  That would point towards dietlibc.

[Bug target/111161] [13 Regression] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4394 during build

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111161

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug c++/111159] [13/14 Regression] False positive -Wdangling-reference

2023-08-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111159

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic
Summary|[13 Regression] False   |[13/14 Regression] False
   |positive|positive
   |-Wdangling-reference|-Wdangling-reference
   Target Milestone|--- |13.3
