[Bug tree-optimization/106912] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032 since r13-1575-gcf3a120084e94614

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106912

--- Comment #3 from Richard Biener  ---
OK, it's a late IPA pass doing the clones it seems.  The scalar node got the
'const' stripped btw, but the call fntype still has it via the attributes.

It loses 'const' by

Old value = 252968993
New value = 251920417
set_const_flag_1 (node=, set_const=false,
looping=false, changed=0x7fffda5f) at
/home/rguenther/src/trunk/gcc/cgraph.cc:2696
2696  DECL_LOOPING_CONST_OR_PURE_P (node->decl) = false;
(gdb) bt
#0  set_const_flag_1 (node=, 
set_const=false, looping=false, changed=0x7fffda5f)
at /home/rguenther/src/trunk/gcc/cgraph.cc:2696
#1  0x00da633f in cgraph_node::set_const_flag (
this=, set_const=false, 
looping=false) at /home/rguenther/src/trunk/gcc/cgraph.cc:2789
#2  0x015e2910 in tree_profiling ()
at /home/rguenther/src/trunk/gcc/tree-profile.cc:818
#3  0x015e2b9f in (anonymous namespace)::pass_ipa_tree_profile::execute
(this=0x42a0c70) at /home/rguenther/src/trunk/gcc/tree-profile.cc:888

but the IL happily continues to treat the calls as 'const' because
flags_from_decl_or_type on the call fntype has

  else if (TYPE_P (exp))
    {
      if (TYPE_READONLY (exp))
        flags |= ECF_CONST;

Note that __attribute__((pure)) is not duplicated on the type, so for 'pure'
the IPA profile effect will change the IL in fixup_cfg (), rewriting virtual
operands there and making things consistent.

[Bug target/106910] roundss not vectorized

2022-09-13 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106910

--- Comment #3 from Hongtao.liu  ---

> The backend should modernize itself, get rid of the
> ix86_builtin_vectorized_function parts for those functions and instead rely
> on define_expands with vector modes.

Indeed, let me do it.

[Bug c/106920] -Warray-bound false positive regression with -O2 or -Os and constant address

2022-09-13 Thread npfhrotynz-ptnqh.myvf at noclue dot notk.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

--- Comment #2 from Dominique Martinet  ---
Thanks for the very fast reply!

Since you mentioned null pointers, I now see this warning doesn't happen if I
try with a larger constant; I just had bad luck that imx-atf uses an address <
4k...?


I checked the first dozen issues from the meta-bug (from the start of the open
bugs list to 86613 included), but there are just too many and I didn't see a
workaround in the ones I did open.

I can see catching bad casts to be useful, but for low level hardware code
accessing register addresses directly is the norm -- I'm not too worried now
I've noticed the <4k "rule" but there really can't be any assumption made with
hardware, as seen here...
(And NXP isn't exactly great at working with external entities, I tried
reaching out for another compile fix with little success... but that's
offtopic.)

Well, good to understand the reason behind that warning at least.

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902

Richard Biener  changed:

   What|Removed |Added

Summary|[11/12 Regression] Program  |[11/12/13 Regression]
   |compiled with -O3 -mfma |Program compiled with -O3
   |produces different result   |-mfma produces different
   ||result
 Blocks||53947

--- Comment #5 from Richard Biener  ---
(In reply to Martin Liška from comment #4)
> Fixed on master with r13-1450-gd2a89809452e.
> Started with r11-4637-gf5e18dd9c7dacc96.

I believe both are unrelated.  The fix possibly caused a missed optimization
while the cause exposed some opportunity.

More analysis is needed here.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/106909] [13 Regression] error: control flow in the middle of basic block since r13-2541-g78ef801b7263606d

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106909

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #5 from Richard Biener  ---
 [local count: 1073741824]:
_80 = SR.96_116(D);
# DEBUG this => SR.96_116(D)
# DEBUG firstElement => ptrCopy_79(D)
# DEBUG elementCount => sizeCopy_83(D)
# DEBUG capacity => sizeCopy_83(D)
# DEBUG INLINE_ENTRY dispose
# DEBUG firstElement => ptrCopy_79(D)
# DEBUG elementCount => sizeCopy_83(D)
# DEBUG capacity => sizeCopy_83(D)
# DEBUG disposer => SR.96_116(D)
# DEBUG INLINE_ENTRY dispose
_81 = MEM[(const struct ArrayDisposer *)SR.96_116(D)]._vptr.ArrayDisposer;
_82 = *_81;
__builtin_unreachable ();
# DEBUG firstElement => NULL
# DEBUG elementCount => NULL
# DEBUG capacity => NULL
# DEBUG disposer => NULL
# DEBUG this => NULL
# DEBUG firstElement => NULL
# DEBUG elementCount => NULL
# DEBUG capacity => NULL

after some folding.  I fear this is the general gimple_build_builtin_unreachable,
which is now generally used, but esp. folding should _not_ mark the call
as control altering but leave that to CFG fixup (CFG cleanup doesn't catch
this since it only looks at the last stmt of BBs).

I'm fixing up in the use.

[Bug c++/106921] New: [11/12.1] -O1 and -fipa-icf -fpartial-inlining causes wrong code

2022-09-13 Thread lutztonineubert at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106921

Bug ID: 106921
   Summary: [11/12.1] -O1 and -fipa-icf  -fpartial-inlining causes
wrong code
   Product: gcc
   Version: 11.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lutztonineubert at gmail dot com
  Target Milestone: ---

Short summary:

The following code returns 1 if compiled with -O2 (which is wrong) and returns
0 if compiled without optimization.

```
// Header names and template argument lists were lost in the archive;
// the ones below are inferred from how the code uses them.
#include <array>
#include <cstddef>
#include <exception>

#define GCC_VERSION (__GNUC__ * 10000 \
                     + __GNUC_MINOR__ * 100 \
                     + __GNUC_PATCHLEVEL__)
static_assert(GCC_VERSION == 110300);

template <size_t Bits>
class bitset {
 private:
  using word_t = size_t;
  static constexpr size_t bits_per_word = sizeof(word_t) * 8;
  static constexpr size_t number_of_words =
      (Bits / bits_per_word) + (((Bits % bits_per_word) == 0) ? 0 : 1);

 public:
  bool all_first(size_t n) const {
    {
      if (n > Bits) {
#ifdef RETURN_INSTEAD_TERMINATE
        return false;
#else
        std::terminate();
#endif
      }
      size_t i = 0;
      for (; n > bits_per_word; n -= bits_per_word, i++) {
        if (words_[i] != ~word_t{0}) {
          return false;
        }
      }
      word_t last_word = words_[i];
      for (; n != 0; n--) {
        if ((last_word & 1) != 1) {
          return false;
        }
        last_word >>= 1;
      }
      return true;
    }
  }

  void fill() noexcept {
    for (auto& word : words_) {
      word = ~word_t{0};
    }
  }

 private:
  std::array<word_t, number_of_words> words_{};
};

volatile int X = 0;

int main() {
  if (X == 1) {
    bitset<123> bitset;
    static_cast<void>(bitset.all_first(123));
  } else {
    bitset<256> bitset;
    bitset.fill();
    if (!bitset.all_first(255)) {
      return 1;
    }
  }
  return 0;
}

```

See: https://gcc.godbolt.org/z/bEexjrKP4

This issue does not exist in GCC 10 or GCC > 12.1. I couldn't test if it does
work in GCC 11.3.1 (or the trunk of it).

Additional:
* I could also trigger the issue with -O1 -fipa-icf  -fpartial-inlining 
* If we do a return false instead of a std::terminate, no wrong code is
generated.

I am sorry, but I couldn't reduce the code any further - it already took so
much time to figure out that it is a compiler bug.

[Bug c/106920] -Warray-bound false positive regression with -O2 or -Os

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-09-13
 Blocks||56456
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed, that was an intended change to catch errors with accessing a
subobject of an object at nullptr.  There's some related duplicate where we
discuss workarounds.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds

[Bug target/106919] [13 Regression] RTL check: expected code 'set' or 'clobber', have 'if_then_else' in s390_rtx_costs, at config/s390/s390.cc:3672 on s390x-linux-gnu

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106919

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug tree-optimization/106914] [13 Regression] ICE in operator[], at vec.h:889 since r13-2288-g61c4c989034548f4

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106914

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0
   Priority|P3  |P1

[Bug rtl-optimization/106913] [13 Regression] ICE in dump_bb_info, at cfg.cc:796 since r13-2263-gf71abacfed170852

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106913

Richard Biener  changed:

   What|Removed |Added

   Assignee|marxin at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |13.0

--- Comment #2 from Richard Biener  ---
That looks like we fail to clear an auto_bb_flag but verification should also
catch that earlier ... huh.  Maybe we fail to verify that for ENTRY/EXIT.

I have a patch.

[Bug c/106920] New: -Warray-bound false positive regression with -O2 or -Os

2022-09-13 Thread npfhrotynz-ptnqh.myvf at noclue dot notk.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

Bug ID: 106920
   Summary: -Warray-bound false positive regression with -O2 or
-Os
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: npfhrotynz-ptnqh.myvf at noclue dot notk.org
  Target Milestone: ---

Hello,

I think I've run into a false positive on this file:
https://source.codeaurora.org/external/imx/imx-atf/tree/plat/imx/imx8m/hab.c?h=lf_v2.6

I could trim it down to this

#include 

typedef void hab_rvt_entry_t(void);

int main() {
hab_rvt_entry_t *a;
a = ((hab_rvt_entry_t *)(*(unsigned long *)(0x908)));
a();
return 0;
}

$ gcc -O2 -Warray-bounds -c t.c
t.c: In function ‘main’:
t.c:7:34: warning: array subscript 0 is outside array bounds of ‘long unsigned int[0]’ [-Warray-bounds]
7 | a = ((hab_rvt_entry_t *)(*(unsigned long *)(0x908)));
  | ~^~


According to godbolt this passed on 11.3 and starts emitting the warning on
12.1 (it doesn't have 12.0) and still emits it on trunk.

Note the warning requires -O2, -O3 or -Os to be emitted.


The problem seems to be that it considers an arbitrary address cast to u64*
to be a u64[0]?

If so that might be a problem for quite a few embedded products as that is
quite common when dealing with hardware registers.
(and who doesn't love products that compile with -Werror for release builds...)

Thanks!

[Bug tree-optimization/106912] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032 since r13-1575-gcf3a120084e94614

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106912

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Priority|P3  |P1
   Target Milestone|--- |13.0
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Confirmed, we have

# .MEM = VDEF <.MEM>
vect__5.57_58 = foo.simdclone.0 (vect__4.56_57);

here.  IIRC I filed a bugreport about simdclones not being const when the
scalar version is, in this case it's possibly IPA pure const not updating
the clones before materializing them!?

That said, the not vectorized variant is just

  _5 = foo (_4);

and without -fprofile-generate the vectorized variant also keeps 'const'.

I will look at this again after Cauldron.  Have to dig to where the
simdclone is actually generated.

[Bug target/106910] roundss not vectorized

2022-09-13 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106910

Richard Biener  changed:

   What|Removed |Added

 Blocks||53947
 CC||crazylht at gmail dot com
 Target|x86_64  |x86_64-*-*

--- Comment #2 from Richard Biener  ---
Probably missing patterns for V2SFmode here.  Hmm, we don't seem to have any
vector mode patterns here but possibly rely on ix86_builtin_vectorized_function
which indeed doesn't have any V2SFmode support.

The vectorizer would go the direct internal fn way for those, querying the
floor optab but the x86 backend only has scalar modes supported for the
rounding optabs.

The backend should modernize itself, get rid of the
ix86_builtin_vectorized_function parts for those functions and instead rely
on define_expands with vector modes.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9

2022-09-13 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #14 from Kewen Lin  ---
Should be fixed everywhere.

[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9

2022-09-13 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

--- Comment #13 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:12d28957b613d8c9b74e7841d73945025a7f0ccb

commit r10-10982-g12d28957b613d8c9b74e7841d73945025a7f0ccb
Author: Kewen Lin 
Date:   Tue Sep 6 20:37:57 2022 -0500

rs6000/test: Fix empty TU in some cases of effective targets [PR106345]

As the failure of test case gcc.target/powerpc/pr92398.p9-.c in
PR106345 shows, some test sources for some powerpc effective
targets use empty translation unit wrongly.  The test sources
could go with options like "-ansi -pedantic-errors", then those
effective target checkings will fail unexpectedly with the
error messages like:

  error: ISO C forbids an empty translation unit [-Wpedantic]

This patch is to fix empty TUs with one dummy function definition
accordingly.

PR testsuite/106345

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_has_arch_pwr5): Add
a function definition to avoid pedwarn about empty translation unit.
(check_effective_target_has_arch_pwr6): Likewise.
(check_effective_target_has_arch_pwr7): Likewise.
(check_effective_target_has_arch_pwr8): Likewise.
(check_effective_target_has_arch_pwr9): Likewise.
(check_effective_target_has_arch_ppc64): Likewise.
(check_effective_target_ppc_float128): Likewise.
(check_effective_target_ppc_float128_insns): Likewise.
(check_effective_target_powerpc_vsx): Likewise.

(cherry picked from commit 7a43e52a48b6403a99d3e8ab3105869b4b3c081e)
