[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

Richard Biener  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
I think if you can reproduce the issue with -fsignaling-nans (which defaults to
off) then I would consider this to be an implementation issue.  The GCC
middle-end shouldn't cling to contradicting wording in the C standard here.

In particular we say

@opindex fsignaling-nans
@item -fsignaling-nans
Compile code assuming that IEEE signaling NaNs may generate user-visible
traps during floating-point operations.  Setting this option disables
optimizations that may change the number of exceptions visible with
signaling NaNs.  This option implies @option{-ftrapping-math}.

Changing tem = x * 1.0 to tem = x would possibly change the number of observed
traps if tem is used more than once.

OTOH below we also say

This option is experimental and does not currently guarantee to
disable all GCC optimizations that affect signaling NaN behavior.

so bugs in this area are expected (but they are still bugs IMHO).

CCing Joseph for clarification.

Note a quick check with

double foo (double x)
{
  return x * 1.0;
}

and -O2 -fsignaling-nans shows the multiplication is preserved (on x86_64),
so is your example in godbolt where you fail to specify -fsignaling-nans.

I agree the documentation is maybe not entirely clear about the effect.

[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #3 from JuzheZhong  ---
Adding a cond_len pattern for VLS modes can work around this bug.
Even though COND_LEN_xxx is not eventually

Testing a patch to fix it.

[Bug middle-end/112444] [14 regression] ICE when building libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #7 from Richard Biener  ---
I will have a look.

[Bug modula2/110779] SysClock can not read the clock

2023-11-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #18 from Thomas Schwinge  ---
Noticed by chance:

(In reply to CVS Commits from comment #15)
> commit r13-7716-ga11ca333df2b6abb4187b39f32bb35a195d8fb33
> Author: Gaius Mulley 
> Date:   Sat Aug 12 20:20:45 2023 +0100
> 
> PR modula2/110779 SysClock can not read the clock (Darwin fixes)
> 
> This patch adds corrections to defensively check against glibc
> functions, structures and contains fallbacks.  These fixes were
> required under Darwin.

> libgm2/ChangeLog:
> 
> PR modula2/110779

> * configure.ac: Provide special case test for Darwin cross
> configuration.

That's 'GLIBCXX_IS_NATIVE' -- but that's then not actually used anywhere?

> (GLIBCXX_CONFIGURE): New statement.
> (GLIBCXX_CHECK_GETTIMEOFDAY): New statement.
> (GLIBCXX_ENABLE_LIBSTDCXX_TIME): New statement.

But without actual definitions; spotted during libgm2 'configure':

[...]
checking target system type... powerpc64le-unknown-linux-gnu
[...]/source-gcc/libgm2/configure: line 4013: GLIBCXX_CONFIGURE: command not found
[...]/source-gcc/libgm2/configure: line 4016: GLIBCXX_CHECK_GETTIMEOFDAY: command not found
[...]/source-gcc/libgm2/configure: line 4019: GLIBCXX_ENABLE_LIBSTDCXX_TIME: command not found
checking for a BSD-compatible install... /usr/bin/install -c
[...]

[Bug target/112443] [12/13/14 Regression] Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl

2023-11-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Priority|P3  |P2
 Target||x86_64-*-*
   Last reconfirmed||2023-11-09
 Ever confirmed|0   |1

[Bug modula2/111956] Many powerpc platforms do _not_ have support for IEEE754 long double

2023-11-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111956

--- Comment #10 from Thomas Schwinge  ---
In addition to what Maciej said (..., and similarly, I don't have any proper
knowledge about PowerPC details):

(In reply to Gaius Mulley from comment #6)
> Created attachment 56522 [details]
> Proposed fix v5

Thanks for looking into this!

> Here is the latest patch which [...] 96
> failures on gcc135.  [...]

With no special 'configure' flags, I'm seeing (presumably) those, too.

I noticed in 'build-gcc/gcc/m2/config-make' (generated from
'gcc/m2/config-make.in'):

# Does the target have -mabi=ieeelongdouble support in libm?  (yes/no).
HAVE_TARGET_LONG_DOUBLE_IEEE = @have_target_long_double_ieee@

..., so missing 'AC_SUBST' or similar -- but is that actually unused?

I further noticed the following delta when regenerating 'libgm2/configure':

 case "$target" in
powerpc*-*-linux*)
- LONG_DOUBLE_COMPAT_FLAGS="$LONG_DOUBLE_COMPAT_FLAGS -mno-gnu-attribute"
  # Check for IEEE128 support in libm:


(In reply to Gaius Mulley from comment #8)
> Here is the same patch as v5 but generated using git diff -w.

Please don't include unrelated changes (here: whitespace cleanup); handle that
separately (if you must).  ;-)

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #7 from post+gcc at ralfj dot de ---
I guess the idea is that by passing a signaling NaN to a float operation, I am
already entering unspecified behavior, so it's okay for that float operation to
violate its usual contract and return a signaling NaN?

[Bug target/112454] New: csinc (csel is though) is not being used when there is matches twice

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112454

Bug ID: 112454
   Summary: csinc (csel is though) is not being used when there is
matches twice
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
int f(int a, int b, int c, int d)
{
  return (a == 2 ? 1 : b) + (c == 3 ? 1 : d);
}
```

GCC produces:
```
cmp w0, 2
mov w4, 1
csel w0, w1, w4, ne
cmp w2, 3
csel w3, w3, w4, ne
add w0, w0, w3
```


But that `mov w4, 1` is useless if we use csinc .
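
A small sketch of why csinc fits here, assuming the usual A64 semantics of CSINC (Rd = cond ? Rn : Rm + 1). The helper `csinc` and the name `f_csinc` are made up for illustration; with a zero second operand (wzr in the real instruction) the constant 1 no longer needs its own register:

```cpp
// Model of the AArch64 CSINC instruction: Rd = cond ? Rn : Rm + 1.
// This is an illustrative helper, not a compiler intrinsic.
constexpr int csinc(bool cond, int rn, int rm) {
    return cond ? rn : rm + 1;
}

// The function from the report.
constexpr int f(int a, int b, int c, int d) {
    return (a == 2 ? 1 : b) + (c == 3 ? 1 : d);
}

// The same computation expressed with csinc and a zero operand,
// so no register has to hold the constant 1.
constexpr int f_csinc(int a, int b, int c, int d) {
    return csinc(a != 2, b, 0) + csinc(c != 3, d, 0);
}

// Compile-time spot check that both forms agree when both conditions match.
static_assert(f(2, 7, 3, 9) == f_csinc(2, 7, 3, 9), "forms must agree");
```

This only models the arithmetic; whether the backend can actually pick csinc twice here is what the report is about.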

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #6 from post+gcc at ralfj dot de ---
Hm, OTOH the C standard says

> The expressions 1×x, x/1, and x are equivalent (on IEC 60559 machines, among
> others).

So, it seems like when they say "The + ,- , * , and / operators provide the IEC
60559 add, subtract, multiply, and divide operations.", they don't quite mean
that.

This seems internally inconsistent in the C standard, since C also permits
`pow(1, sNaN)` to behave differently from `pow(1, qNaN)` -- and in fact they do
behave differently in GNU's libm. So on the one hand `pow(1, x * y)` must always
be `1` but on the other hand it can return a NaN when `x` is an sNaN and `y` is
`1`?

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #5 from post+gcc at ralfj dot de ---
> See 
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fsignaling-nans

That's unrelated. That's about whether operation on signaling NaNs can trap. I
am asking when operations can output a signaling NaN.

So, for code like

float x = y <op> z;
return is_signaling_nan(x);

when can that code return `true`? Normal IEEE semantics would say "never". And
yet if "z" is the constant 1, `<op>` is `*`, and "y" is a signaling NaN, then
this evidently can output a signaling NaN.

I would hope the answer is "this can output a signaling NaN only if one of the
inputs is a signaling NaN", but is that documented anywhere?

> Note mips and sh and a few other targets have the quiet bit meaning the 
> opposite.

I know. LLVM is currently buggy on those targets.

> GCC does document some of this on 
> https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Floating-point-implementation.html
>  but not the signaling nan part.

This seems to list a bunch of implementation-defined aspects of C? To my
knowledge, my question is not implementation defined. C (with the annex for
floating-point arithmetic) requires the above operations to always return
"false". GCC violates the C spec here (since it defines __STDC_IEC_559__,
declaring support for the annex), and it'd be good to know how far it is going
in that violation.
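
For anyone who wants to try this at run time, here is a minimal sketch of an `is_signaling_nan` check. It is not a GCC or libm API. It inspects the raw bits of a `double` (the snippet above uses `float`) and assumes the IEEE 754-2008 convention that a set quiet bit means a quiet NaN, which is the opposite on the targets mentioned above. `times_one` is a hypothetical stand-in for the operation under discussion, and its result depends on the compiler and flags used:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 expected");

// Reinterpret a double as its raw 64-bit pattern (memcpy avoids aliasing UB).
inline std::uint64_t bits_of(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// Classify a raw binary64 pattern. Assumption: quiet bit set means quiet NaN
// (IEEE 754-2008 convention; legacy MIPS and some other targets invert it).
inline bool is_signaling_nan_bits(std::uint64_t u) {
    const std::uint64_t exp_mask  = 0x7FF0000000000000ULL;
    const std::uint64_t frac_mask = 0x000FFFFFFFFFFFFFULL;
    const std::uint64_t quiet_bit = 0x0008000000000000ULL;
    const bool is_nan = (u & exp_mask) == exp_mask && (u & frac_mask) != 0;
    return is_nan && (u & quiet_bit) == 0;
}

inline bool is_signaling_nan(double d) {
    return is_signaling_nan_bits(bits_of(d));
}

// The operation in question. Whether a signaling NaN input comes back
// quieted depends on whether the compiler keeps the multiplication.
double times_one(double y) {
    return y * 1.0;
}
```

Calling `is_signaling_nan(times_one(__builtin_nans("")))` with and without -fsignaling-nans is one way to observe the difference; I am deliberately not stating an expected result, since that is exactly what is in question.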

[Bug c/112432] Internal-fn: The [i|l|ll]rint family don't support FLOATN

2023-11-08 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112432

--- Comment #5 from Li Pan  ---
(In reply to Li Pan from comment #4)
> (In reply to Richard Biener from comment #3)
> > Ah, yes, for lrint we have the builtins - I just looked for lceil here.  So
> > yeah, where there are DEF_EXT_LIB_FLOATN_NX_BUILTINS we should have
> > DEF_INTERNAL_FLT_FLOATN_FN.
> 
> Thanks Richard, I will have a try for this change.

After double-checking, the related definitions are listed below:

          glibc  GCC-FLOATN_NX_BUILTINS
iceil     N      N
ifloor    N      N
irint     N      N
iround    N      N

lceil     N      N
lfloor    N      N
lrint     Y      Y
lround    Y      Y

llceil    N      N
llfloor   N      N
llrint    Y      Y
llround   Y      Y

We only need to support lrint/lround/llrint/llround for FLOATN for now.

[Bug c++/52339] using delete ptr1->ptr2 where ptr2 destructor deletes a const ptr1 fails

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52339

--- Comment #10 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #7)
> Created attachment 54994 [details]
> gcc14-pr52339.patch
> 
> Untested fix.

I think this might fix PR 108789 too ...

[Bug middle-end/108789] __builtin_(add|mul)_overflow methods generate duplicate operations if both operands are const which in turn causes wrong code due to overlapping arguments

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108789

Andrew Pinski  changed:

   What|Removed |Added

Summary|__builtin_(add|mul)_overflo |__builtin_(add|mul)_overflo
   |w methods generate  |w methods generate
   |duplicate operations if |duplicate operations if
   |both operands are const |both operands are const
   ||which in turn causes wrong
   ||code due to overlapping
   ||arguments
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-09

--- Comment #2 from Andrew Pinski  ---
Confirmed.

The obvious workaround is to use temporary variables for the arguments of
__builtin_add_overflow.
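
A minimal sketch of that workaround, assuming the problematic pattern is a result pointer that overlaps one of the operands (this is not the report's exact test case). `add_into`, `doubled` and `doubling_overflows` are hypothetical names; `__builtin_add_overflow` is the GCC/Clang builtin:

```cpp
// Adds *rhs into *acc and reports signed overflow.
// Both operands are copied into temporaries first, so the result
// pointer never overlaps an operand that is still to be read.
bool add_into(int* acc, const int* rhs) {
    const int lhs_tmp = *acc;
    const int rhs_tmp = *rhs;
    return __builtin_add_overflow(lhs_tmp, rhs_tmp, acc);
}

// Hypothetical helpers exercising the overlapping case (acc == rhs).
int doubled(int v) {
    add_into(&v, &v);
    return v;
}

bool doubling_overflows(int v) {
    return add_into(&v, &v);
}
```

The copies are cheap and are normally optimized away; the point is only that the builtin never sees an operand that shares storage with its result.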

[Bug tree-optimization/109906] a rrotate (32-b) -> a lrotate b

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109906

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
   Last reconfirmed||2023-11-09
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Mine.

[Bug target/97503] Suboptimal use of cntlzw and cntlzd

2023-11-08 Thread lh_mouse at 126 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97503

--- Comment #5 from LIU Hao  ---
(In reply to LIU Hao from comment #4)
> lzcnt   rax, rdx
> test    rdx, rdx
> mov     edx, 64
> cmove   rax, rdx

There is actually another missed optimization here. LZCNT sets CF if the source
operand is zero, so the TEST instruction is totally unnecessary. We can do
this:

```
  ...
  xor eax, eax
  lzcnt rax, rdx
  mov edx, 64   # or something else, whatever
  cmovb eax, edx
```

[Bug target/112445] [14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1861 unable to find a register to spill: {*umulditi3_1} with -O -march=cascadelake -fwrapv

2023-11-08 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

--- Comment #2 from Zdenek Sojka  ---
Created attachment 56545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56545&action=edit
testcase failing just at -O1

$ x86_64-pc-linux-gnu-gcc -O testcase2.c
testcase2.c: In function 'foo':
testcase2.c:19:1: error: unable to find a register to spill
   19 | }
  | ^
testcase2.c:19:1: error: this is the insn:
(insn 36 155 216 2 (parallel [
(set (reg:TI 280 [orig:142 _66 ] [142])
(mult:TI (zero_extend:TI (reg:DI 171 [ cu8_0 ]))
(zero_extend:TI (subreg:DI (reg:TI 104 [ _10 ]) 0
(clobber (reg:CC 17 flags))
]) "testcase2.c":12:9 513 {*umulditi3_1}
 (expr_list:REG_DEAD (reg:DI 171 [ cu8_0 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil
during RTL pass: reload
testcase2.c:19:1: internal compiler error: in lra_split_hard_reg_for, at
lra-assigns.cc:1861
0x7f4bef _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/repo/gcc-trunk/gcc/rtl-error.cc:108
0x12bd93d lra_split_hard_reg_for()
/repo/gcc-trunk/gcc/lra-assigns.cc:1861
0x12b6e38 lra(_IO_FILE*)
/repo/gcc-trunk/gcc/lra.cc:2495
0x1265569 do_reload
/repo/gcc-trunk/gcc/ira.cc:5973
0x1265569 execute
/repo/gcc-trunk/gcc/ira.cc:6161
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/97503] Suboptimal use of cntlzw and cntlzd

2023-11-08 Thread lh_mouse at 126 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97503

LIU Hao  changed:

   What|Removed |Added

 CC||lh_mouse at 126 dot com

--- Comment #4 from LIU Hao  ---
Are there any reasons why this was not done for 64?
(https://gcc.godbolt.org/z/7vddPdxaP)


```
using int32_t = int;
using int64_t = long long;
using uint32_t = unsigned int;
using uint64_t = unsigned long long;

void
xlzcnt32(int32_t& val)
  {
val = val ? (__builtin_clz(val) & 31) : 32;
  }

void
xlzcnt64(int64_t& val)
  {
val = val ? (__builtin_clzll(val) & 63) : 64;
  }
```

results in
```
xlzcnt32(int&):
xor eax, eax
lzcnt   eax, DWORD PTR [rdi]
mov DWORD PTR [rdi], eax
ret
xlzcnt64(long long&):
mov rdx, QWORD PTR [rdi]
xor eax, eax
lzcnt   rax, rdx
test    rdx, rdx
mov edx, 64
cmove   rax, rdx
mov QWORD PTR [rdi], rax
ret
```
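
To make the expected equivalence concrete, here is a small sketch comparing the source expression with a reference model of LZCNT, which as far as I know returns the operand width (64) for a zero input. `lzcnt64_model` and `clz_or_64` are illustrative names, and I use an unsigned argument passed by value instead of the `int64_t&` above:

```cpp
#include <cstdint>

// Reference model of a 64-bit LZCNT: counts leading zero bits and
// returns 64 when the input is zero.
inline unsigned lzcnt64_model(std::uint64_t v) {
    unsigned n = 0;
    for (std::uint64_t bit = 1ULL << 63; bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// The expression from xlzcnt64, as a value-returning function.
// __builtin_clzll is undefined for 0, hence the explicit zero case.
inline unsigned clz_or_64(std::uint64_t v) {
    return v ? (static_cast<unsigned>(__builtin_clzll(v)) & 63u) : 64u;
}
```

If the two always agree, a single lzcnt should be enough for the 64-bit case, just as it is for the 32-bit one.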

[Bug tree-optimization/109843] signbit comparisons -> copysign optimization

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109843

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-11-09
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Mine.

[Bug c++/108911] 0xe+100 gives talks about an impossible literal operator in error message

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108911

--- Comment #3 from Andrew Pinski  ---
Note clang has a decent error message for `0xe+100` but has just as bad a one for
`123_to.`

Full testcase for a few:
```
  int a = 0xe+100;
  int b = 123_to.;
  int c = 0xe_100e+10;
```

[Bug c++/108911] 0xe+100 gives talks about an impossible literal operator in error message

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108911

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-09
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed via the dup.

[Bug c++/112423] A weird reported diagnosis about user-defined-literal

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112423

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
A dup of bug 108911.

*** This bug has been marked as a duplicate of bug 108911 ***

[Bug c++/108911] 0xe+100 gives talks about an impossible literal operator in error message

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108911

Andrew Pinski  changed:

   What|Removed |Added

 CC||xmh970252187 at gmail dot com

--- Comment #1 from Andrew Pinski  ---
*** Bug 112423 has been marked as a duplicate of this bug. ***

[Bug c++/111918] #pragma GCC diagnostic pop does not restore error status of -Wnarrowing

2023-11-08 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918

Lewis Hyatt  changed:

   What|Removed |Added

 CC||lhyatt at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-09
 Ever confirmed|0   |1

--- Comment #3 from Lewis Hyatt  ---
The below patch should fix it for all such options, I am testing it now.

The old_kind shouldn't demand that the diagnostic must be a warning or an error
specifically, it just needs to note that the diagnostic is enabled or disabled,
and then let the frontend determine what type it is like it normally does.

diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index addd6606eaa..99921a10b7b 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1126,8 +1126,7 @@ classify_diagnostic (const diagnostic_context *context,
  old_kind = !context->m_option_enabled (option_index,
 context->m_lang_mask,
 context->m_option_state)
-   ? DK_IGNORED : (context->warning_as_error_requested_p ()
-   ? DK_ERROR : DK_WARNING);
+   ? DK_IGNORED : DK_ANY;
  m_classify_diagnostic[option_index] = old_kind;
}

@@ -1469,7 +1468,15 @@ diagnostic_context::diagnostic_enabled (diagnostic_info *diagnostic)
  option.  */
   if (diag_class == DK_UNSPECIFIED
   && !option_unspecified_p (diagnostic->option_index))
-diagnostic->kind = m_option_classifier.get_current_override (diagnostic->option_index);
+{
+  const diagnostic_t new_kind
+   = m_option_classifier.get_current_override (diagnostic->option_index);
+  if (new_kind != DK_ANY)
+   /* DK_ANY means the diagnostic is not to be ignored, but we don't want
+  to change it specifically to DK_ERROR or DK_WARNING; we want to
+  preserve whatever the caller has specified.  */
+   diagnostic->kind = new_kind;
+}

   /* This allows for future extensions, like temporarily disabling
  warnings for ranges of source code.  */
diff --git a/gcc/diagnostic.def b/gcc/diagnostic.def
index 813b8daa4cc..e889eca7757 100644
--- a/gcc/diagnostic.def
+++ b/gcc/diagnostic.def
@@ -53,3 +53,8 @@ DEFINE_DIAGNOSTIC_KIND (DK_WERROR, "error: ", NULL)
 /* This is like DK_ICE, but backtrace is not printed.  Used in the driver
when reporting fatal signal in the compiler.  */
 DEFINE_DIAGNOSTIC_KIND (DK_ICE_NOBT, "internal compiler error: ", "error")
+
+/* This is used internally to indicate that a diagnostic is not to be ignored,
+   without mandating it be a specific type, so that it can be an error or
+   warning or otherwise, as the current context requires.  */
+DEFINE_DIAGNOSTIC_KIND (DK_ANY, "", NULL)
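
The idea behind DK_ANY can be shown with a toy model. Everything below is a hypothetical stand-in (the `Kind` enum and both functions are made up here, not GCC's diagnostic_t or its classifier API); it only illustrates "remember enabled or disabled, and let the caller's kind survive":

```cpp
// Toy stand-ins for diagnostic kinds; not GCC's real diagnostic_t.
enum class Kind { Ignored, Warning, Error, Any };

// What gets remembered for an option at the time of the pragma: only
// whether it was enabled, not whether it was a warning or an error.
inline Kind saved_kind(bool option_enabled) {
    return option_enabled ? Kind::Any : Kind::Ignored;
}

// Apply the remembered override to a diagnostic whose kind the caller
// has already chosen. Any leaves the caller's choice untouched.
inline Kind apply_override(Kind caller_kind, Kind override_kind) {
    return override_kind == Kind::Any ? caller_kind : override_kind;
}
```

With the old behaviour the saved kind would have been Warning or Error, which is how a diagnostic that the frontend issues as an error could come back as a warning after the pop.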

[Bug target/108678] Windows on ARM64 platform target aarch64-w64-mingw32

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108678

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-11-09
 Status|UNCONFIRMED |NEW

--- Comment #9 from Andrew Pinski  ---
.

[Bug c++/108026] Confusing pedwarn with template lambda with -std=c++11

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108026

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |14.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed as mentioned.

[Bug tree-optimization/111893] range-op misses {L,R}ROTATE_EXPR handling

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111893

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-09
   Severity|normal  |enhancement
 Ever confirmed|0   |1

--- Comment #3 from Andrew Pinski  ---
.

[Bug libstdc++/110807] [13 Regression] Copy list initialisation of a vector raises a warning with -O2

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110807

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:e39b3e02c27bd771a07e385f9672ecf1a45ced77

commit r14-5260-ge39b3e02c27bd771a07e385f9672ecf1a45ced77
Author: Alexandre Oliva 
Date:   Thu Nov 9 00:01:37 2023 -0300

libstdc++: optimize bit iterators assuming normalization [PR110807]

The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.

Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable.  Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.

However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.

Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.

Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing.  Using attribute
assume (_M_offset <= ...) didn't work, because gimple lowered that to
something that vrp could only use to ensure 'this' was non-NULL.
Exposing _M_offset as an automatic variable/gimple register outside
the unevaluated assume operand enabled the optimizer to do its job.

Rather than placing such load-then-assume constructs all over, I
introduced an always-inline member function in bit iterators that does
the job of conveying to the compiler the information that the
assumption is supposed to hold, and various calls throughout functions
pertaining to bit iterators that might not otherwise know that the
offsets have to be in range, so that the compiler no longer needs to
make conservative assumptions that prevent optimizations.

With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.


for  libstdc++-v3/ChangeLog

PR libstdc++/110807
* include/bits/stl_bvector.h (_Bit_iterator_base): Add
_M_assume_normalized member function.  Call it in _M_bump_up,
_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
operator-.
(_Bit_iterator): Also call it in operator*.
(_Bit_const_iterator): Likewise.
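
A rough sketch of the load-then-assume pattern described above, under stated assumptions: `BitCursor`, `assume_normalized` and `position_after` are hypothetical names rather than the libstdc++ code, and the hint is expressed with `__builtin_unreachable` instead of the assume attribute the commit message mentions:

```cpp
#include <cassert>
#include <climits>

// Hypothetical miniature of a bit iterator: a word pointer plus a bit offset.
struct BitCursor {
    static constexpr unsigned word_bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned long* word;
    unsigned offset;  // invariant: offset < word_bits

    // Load the offset into a local, then tell the optimizer that the
    // invariant holds. No runtime check is emitted for this.
    void assume_normalized() const {
        const unsigned ofst = offset;
        if (ofst >= word_bits)
            __builtin_unreachable();
    }

    // Advance by one bit, carrying into the next word when needed.
    void bump_up() {
        assume_normalized();
        if (++offset == word_bits) {
            offset = 0;
            ++word;
        }
    }
};

// Advance a cursor `steps` times from bit `start` of the first word and
// return the resulting linear bit position.
unsigned long position_after(unsigned start, unsigned steps) {
    unsigned long storage[4] = {};
    assert(start < BitCursor::word_bits);
    assert(start + steps < 4 * BitCursor::word_bits);
    BitCursor c{storage, start};
    for (unsigned i = 0; i < steps; ++i)
        c.bump_up();
    return static_cast<unsigned long>(c.word - storage) * BitCursor::word_bits
           + c.offset;
}
```

Breaking the invariant is undefined behaviour in this sketch, which is the trade-off of any such hint: the optimizer is allowed to rely on it.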

[Bug middle-end/112383] `a&=CST; (a) != a` and `((~b) & a) & CST != 0`

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112383

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=112382

--- Comment #1 from Andrew Pinski  ---
Note the difference from PR 112382 is that PR 112382 is about a power of 2
while this is not any mask. I almost missed that even though I filed both.

[Bug target/112445] [14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1861 unable to find a register to spill: {*umulditi3_1} with -O -march=cascadelake -fwrapv

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-11-09
   Keywords||needs-bisection

--- Comment #1 from Andrew Pinski  ---
Confirmed.

The code is very sensitive to changes even.

[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #2 from JuzheZhong  ---
  if (loop_vinfo
  && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
  && mask_out_inactive)
{
  if (cond_len_fn != IFN_LAST
  && direct_internal_fn_supported_p (cond_len_fn, vectype,
 OPTIMIZE_FOR_SPEED))
vect_record_loop_len (loop_vinfo, lens, ncopies * vec_num, vectype,
  1);
  else if (cond_fn != IFN_LAST
   && direct_internal_fn_supported_p (cond_fn, vectype,
  OPTIMIZE_FOR_SPEED))
vect_record_loop_mask (loop_vinfo, masks, ncopies * vec_num,
   vectype, NULL);
  else
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "can't use a fully-masked loop because no"
 " conditional operation is available.\n");
  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
}
}

We go through the second condition here, the one that calls
vect_record_loop_mask.

It seems that we can't differentiate RVV VLS modes once cond_xxx is available.

RVV VLS modes just want COND_XXX in order to support loops like

for (int i = 0; i < N; i++)
  cond[i] ? a[i] + b[i] : c[i]

where N is a known iteration count.
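
Spelled out as compilable code, the loop shape in question looks roughly like this. The trip count of 16 and the names `cond_add` and `demo_sum` are only illustrative assumptions, and this is not the test case from the report:

```cpp
// Compile-time trip count, so no partial (masked or length-controlled)
// vectors should be needed.
constexpr int N = 16;

// out[i] = cond[i] ? a[i] + b[i] : c[i], for a fixed number of elements.
void cond_add(const bool* cond, const int* a, const int* b, const int* c,
              int* out) {
    for (int i = 0; i < N; ++i)
        out[i] = cond[i] ? a[i] + b[i] : c[i];
}

// Small driver: even lanes take a[i] + b[i], odd lanes take c[i].
int demo_sum() {
    bool cond[N];
    int a[N], b[N], c[N], out[N];
    for (int i = 0; i < N; ++i) {
        cond[i] = (i % 2 == 0);
        a[i] = i;
        b[i] = 10;
        c[i] = -1;
    }
    cond_add(cond, a, b, c, out);
    int sum = 0;
    for (int i = 0; i < N; ++i)
        sum += out[i];
    return sum;
}
```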

[Bug target/112445] [14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1861 unable to find a register to spill: {*umulditi3_1} with -O -march=cascadelake -fwrapv

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug libstdc++/112453] New: <ranges>: __take_of_repeat_view/__drop_of_repeat_view should forwards __r._M_value

2023-11-08 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112453

Bug ID: 112453
   Summary: <ranges>: __take_of_repeat_view/__drop_of_repeat_view
should forwards __r._M_value
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

The current wording clearly indicates *E.value_ instead of *r.value_.

https://godbolt.org/z/GcnbesbEE

#include <memory>
#include <ranges>

int main() {
  auto t = std::views::repeat(std::make_unique<int>(5), 4) |
           std::views::take(2);
  auto d = std::views::repeat(std::make_unique<int>(5), 4) |
           std::views::drop(2);
}

[Bug tree-optimization/112450] RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

--- Comment #1 from JuzheZhong  ---
Oh, I see: we have cond_xxx patterns for VLS modes,

like V64HImode. But we don't support partial vectorization for VLS modes.

VLS modes are supposed to be used as SIMD GNU vectorization.

As long as COND_XXX is enabled, the loop vectorizer considers the target to
support partial vectorization with masks, and since there is no while_ult, it
then goes through AVX512 partial vectorization.

It seems that for conditional operations, I should use a backend RTL pass to
work around that.

[Bug target/112443] [12/13/14 Regression] Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl

2023-11-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443

--- Comment #1 from Hongtao.liu  ---
The patch below can fix that; there's a typo in 2 splitters.

@@ -17082,7 +17082,7 @@ (define_insn_and_split "*avx2_pcmp3_4"
  (match_dup 4))]
  UNSPEC_BLENDV))]
 {
-  if (INTVAL (operands[5]) == 1)
+  if (INTVAL (operands[5]) == 5)
 std::swap (operands[1], operands[2]);
   operands[3] = gen_lowpart (mode, operands[3]);
 })
@@ -17112,7 +17112,7 @@ (define_insn_and_split "*avx2_pcmp3_5"
  (match_dup 4))]
  UNSPEC_BLENDV))]
 {
-  if (INTVAL (operands[5]) == 1)
+  if (INTVAL (operands[5]) == 5)
 std::swap (operands[1], operands[2]);
 })

[Bug libstdc++/112452] New: <ranges>: operator|(_Range&& __r, _Self&& __self) should return decltype(auto)

2023-11-08 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112452

Bug ID: 112452
   Summary: <ranges>: operator|(_Range&& __r, _Self&& __self)
should return decltype(auto)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

This is the same issue as https://github.com/microsoft/STL/issues/4153.

[Bug modula2/111956] Many powerpc platforms do _not_ have support for IEEE754 long double

2023-11-08 Thread macro at orcam dot me.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111956

--- Comment #9 from Maciej W. Rozycki  ---
Hmm, the host check for `__frexpieee128' in gcc/ will surely not do
what's intended: even if the host is `powerpc*-*-linux*', the target
will often be something else and vice versa (libgm2's host is GCC's
target).

I think there is no way to verify target C library features in gcc/
at `configure' time, because at this point we may not yet have a target
compiler.  I haven't dealt with such a situation before, but AFAICS
people have used GCC_GLIBC_VERSION_GTE_IFELSE to explicitly check for
the glibc version required instead.  There's a relevant case for
TARGET_DEFAULT_LONG_DOUBLE_128 you can use as an example.

I'm not sure if such a check is needed though, unless perhaps for
sanity, as you only define TARGET_LIBM_PROVIDES_LONG_DOUBLE_IEEE128 if
--with-long-double-format=ieee has been explicitly given.

Also ISTM you can omit the target check for `powerpc64le-*-linux*' here
keeping $with_long_double_format=ieee check only and get support for the
relevant `powerpc64-*-linux*' targets too, as --with-long-double-format=
will already have verified correct usage.

Finally you may or may not have to check for $gcc_cv_target_ldbl128
equal to "yes" too, in case someone has used --without-long-double-128
(I'm not sure what the consequences would be, but it has caught my
attention, so please double-check).

Overall please refer to someone more familiar with POWER GCC targets as
I can only comment based on what I can see in the scripts and my general
experience.  The best course of action might be submitting the patch to
gcc-patches for review, cc-ing RS6000 port maintainers, as the most
relevant people may not be reading this bug report.

[Bug other/112451] gcc_build was not updated to checkout via git

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112451

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-11-09
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Andrew Pinski  ---
Mine, filed it so I can fix this in the next few days.

[Bug other/112451] New: gcc_build was not updated to checkout via git

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112451

Bug ID: 112451
   Summary: gcc_build was not updated to checkout via git
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Keywords: internal-improvement
  Severity: normal
  Priority: P3
 Component: other
  Assignee: pinskia at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

gcc_build was not updated when gcc moved over to git. Even though `gcc_build
update` works since that uses gcc_update, `gcc_build checkout` does not work.

Right now checkout_gcc uses svn.

[Bug c++/110936] if constexpr: member function pointers cannot be checked with ubsan

2023-11-08 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110936

--- Comment #4 from Johel Ernesto Guerrero Peña  ---
They talk about `-fno-delete-null-pointer-checks` in BUG71962.

[Bug c++/110936] if constexpr: member function pointers cannot be checked with ubsan

2023-11-08 Thread nathanieloshead at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110936

Nathaniel Shead  changed:

   What|Removed |Added

 CC||nathanieloshead at gmail dot com

--- Comment #3 from Nathaniel Shead  ---
More generally this appears to be caused by '-fno-delete-null-pointer-checks'
causing constant folding not to occur. Minimised example:

  struct foo { void bar() { } };
  constexpr bool b = &foo::bar;

https://gcc.godbolt.org/z/x7csTzjfa

[Bug c/112450] New: RVV vectorization ICE in vect_get_loop_mask, at tree-vect-loop.cc:11037

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112450

Bug ID: 112450
   Summary: RVV vectorization ICE in vect_get_loop_mask, at
tree-vect-loop.cc:11037
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

int a, b, d, e;
short c;
void f() {
  for (; e; e++) {
    int g = 6;
    for (; g > 2; g--) {
      int i = -8;
      while (i < 20) {
        i += 5;
        a += b;
      }
      c *= d;
    }
    b--;
  }
}

-O2 --param=riscv-autovec-lmul=m8 -fno-vect-cost-model

during GIMPLE pass: vect
: In function 'f':
:3:6: internal compiler error: in vect_get_loop_mask, at
tree-vect-loop.cc:11037
3 | void f() {
  |  ^
0x7fa31fe47082 __libc_start_main
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
Compiler returned: 1

https://gcc.godbolt.org/z/sda87oaW5

The ICE looks pretty odd. For partial vectorization with length, we should never
reach 'vect_get_loop_mask', which is supposed to be used by partial vectorization
with mask.

It also reaches the condition:

LOOP_VINFO_PARTIAL_VECTORS_STYLE (loop_vinfo) == vect_partial_vectors_avx512

which should be unlikely for RVV.

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

Sam James  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org
Summary|[14 regression] ICE when|[14 regression] ICE when
   |buliding libqmi with -O3|buliding libqmi with -O3
   |-ftrivial-auto-var-init=zer |-ftrivial-auto-var-init=zer
   |o (internal compiler error: |o (internal compiler error:
   |tree check: expected class  |tree check: expected class
   |‘type’, have ‘exceptional’  |‘type’, have ‘exceptional’
   |(error_mark) in |(error_mark) in
   |useless_type_conversion_p,  |useless_type_conversion_p,
   |at gimple-expr.cc:85)   |at gimple-expr.cc:85) since
   ||r14-4405-gb583a2940af90d

--- Comment #6 from Sam James  ---
b583a2940af90d03f535648fef111cb158933f7d is the first bad commit
commit b583a2940af90d03f535648fef111cb158933f7d
Author: Richard Biener 
Date:   Wed Oct 4 15:25:33 2023 +0200

Avoid left around copies when value-numbering BBs

The following makes sure to treat values whose definition we didn't
visit as available since those by definition must dominate the entry
of the region.  That avoids unpropagated copies after if-conversion
and resulting SLP discovery fails (which doesn't handle plain copies).

* tree-ssa-sccvn.cc (rpo_elim::eliminate_avail): Not
visited value numbers are available itself.

 gcc/tree-ssa-sccvn.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
bisect found first bad commit

i.e. r14-4405-gb583a2940af90d

[Bug target/112374] [14 Regression] `--with-arch=skylake-avx512 --with-cpu=skylake-avx512` causes a comparison failure

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #8 from Andrew Pinski  ---
From the dup:
The only difference between stage2-gcc/i386-expand.o and
stage3-gcc/i386-expand.o is 
```
< ./stage2-gcc/i386-expand.o: file format elf64-x86-64
---
> ./stage3-gcc/i386-expand.o: file format elf64-x86-64

11207c11207
< b48c: 39 ca   cmp    %ecx,%edx
---
> b48c: 39 d1   cmp    %edx,%ecx

```

This gives off vibes of unstable sort somewhere or a difference due to debug
info.
And maybe just exposed by r14-5076-g01c18f58d37865d5f3bbe93e666183b54ec ...

I am going to test a few things.

[Bug target/112443] [12/13/14 Regression] Misoptimization of _mm256_blendv_epi8 intrinsic on avx512bw+avx512vl

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112443

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.4

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

--- Comment #5 from Andrew Pinski  ---
Note the 2 tmp variables need to be named the same. Otherwise fre won't merge
the 2 ".DEFERRED_INIT (1, 2, &"tmp"[0]);" .

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Keywords||ice-on-valid-code
 Status|UNCONFIRMED |NEW
   Target Milestone|--- |14.0
   Last reconfirmed||2023-11-09

--- Comment #4 from Andrew Pinski  ---
```
   [local count: 57140633]:
  # ivtmp_62 = PHI <44(2), ivtmp_72(8)>
  _7 = .DEFERRED_INIT (1, 2, &"tmp"[0]);
...
   [local count: 57140633]:
  # _59 = PHI <_7(3), _54(4)>

   [local count: 877722739]:
  # ivtmp_8 = PHI <444(5), ivtmp_63(7)>
  tmp = _59;
```

defining statement of _54 is nowhere to be found.

Note this ICEs still even with -fno-checking

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

Andrew Pinski  changed:

   What|Removed |Added

  Attachment #56543|0   |1
is obsolete||

--- Comment #3 from Andrew Pinski  ---
Created attachment 56544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56544&action=edit
with no warnings

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

--- Comment #2 from Andrew Pinski  ---
Created attachment 56543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56543&action=edit
little more reduced

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #28 from dave.anglin at bell dot net ---
On 2023-11-08 7:07 p.m., law at gcc dot gnu.org wrote:
> Do we already have a dump for the key function?  Presumably f-m-o doesn't
> trigger *that* much.  And if this is triggering w/o LTO we can probably move to
> cross debugging and analysis of those dump files and assembly code with and
> without f-m-o enabled, narrowing our focus on the key function.
I tried looking at the difference with and without f-m-o and it was quite
large.  The difference with and without strict aliasing is much smaller.  The
main differences that I saw relate to the inlining of compiler_visit_expr and
compiler_visit_expr1.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #27 from dave.anglin at bell dot net ---
On 2023-11-08 7:00 p.m., John David Anglin wrote:
> On 2023-11-08 6:51 p.m., sjames at gcc dot gnu.org wrote:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>>
>> --- Comment #23 from Sam James  ---
>> (In reply to Andrew Pinski from comment #21)
>>> The other option to try is -fstack-reuse=none. There is definitely known
>>> issues with the code that coalesces stack variables together too (see PR
>>> 111843 for examples).
>> I had a good feeling about this but no, didn't help when applied to compile.o.
> At this point, I don't know whether this is a python or gcc bug. I scanned for
> unions in compile.i that might be problematic but I didn't find anything obvious.
Note -fno-strict-aliasing affects the inlining of compiler_visit_expr.  It is
not inlined with -fno-strict-aliasing.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #26 from Jeffrey A. Law  ---
As a compiler junkie, I tend to think compiler first until I can prove it
otherwise.  I wouldn't get too hung up on aliasing issues and such at this
point.

Do we already have a dump for the key function?  Presumably f-m-o doesn't
trigger *that* much.  And if this is triggering w/o LTO we can probably move to
cross debugging and analysis of those dump files and assembly code with and
without f-m-o enabled, narrowing our focus on the key function.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #25 from Sam James  ---
I am having the same thoughts. It would not be the first time Python had
something dubious, like...
* https://wiki.gentoo.org/wiki/Project:Python/Strict_aliasing ->
https://www.python.org/dev/peps/pep-3123/
* https://github.com/python/cpython/issues/78

So far, I did not see this failure on any other target (-> makes me think it's
a gcc bug). But also, I didn't yet see any other software break on hppa (->
makes me think it might be a Python bug).

I tried ubsan on amd64 with Python 3.12 at least and got a lot of different
errors, although ubsan does not diagnose aliasing issues...

I am undecided myself still.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #24 from dave.anglin at bell dot net ---
On 2023-11-08 6:51 p.m., sjames at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #23 from Sam James  ---
> (In reply to Andrew Pinski from comment #21)
>> The other option to try is -fstack-reuse=none. There is definitely known
>> issues with the code that coalesces stack variables together too (see PR
>> 111843 for examples).
> I had a good feeling about this but no, didn't help when applied to compile.o.
At this point, I don't know whether this is a python or gcc bug.  I scanned for
unions in compile.i that might be problematic but I didn't find anything obvious.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #23 from Sam James  ---
(In reply to Andrew Pinski from comment #21)
> The other option to try is -fstack-reuse=none. There is definitely known
> issues with the code that coalesces stack variables together too (see PR
> 111843 for examples).

I had a good feeling about this but no, didn't help when applied to compile.o.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #22 from John David Anglin  ---
Created attachment 56542
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56542&action=edit
Preprocessed source and assembly files for Python/compile.c

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #21 from Andrew Pinski  ---
(In reply to dave.anglin from comment #20)
> Both -fno-strict-aliasing and -fno-schedule-insns2 applied to
> compiler_visit_expr()
> work around issue.

The other option to try is -fstack-reuse=none. There is definitely known issues
with the code that coalesces stack variables together too (see PR 111843 for
examples).

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #10 from JuzheZhong  ---
(In reply to Vineet Gupta from comment #9)
> (In reply to JuzheZhong from comment #7)
> > Oh. I missed it:
> > 
> >   vmv.v.x   v2,s0
> > vse8.v  v2,0(a5)
> > 
> > Leave it to me today. It should be simple fix.
> > 
> > Thanks for report it.
> 
> Can I request you to let me continue to debug and fix it. I want to
> familiarize myself with the vsetv pass and this seems like a good
> opportunity to do so considering you think the fix is not hard.

OK. Thanks.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

Vineet Gupta  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-11-08
 Status|UNCONFIRMED |ASSIGNED

--- Comment #9 from Vineet Gupta  ---
(In reply to JuzheZhong from comment #7)
> Oh. I missed it:
> 
>   vmv.v.x v2,s0
>   vse8.v  v2,0(a5)
> 
> Leave it to me today. It should be simple fix.
> 
> Thanks for report it.

Can I request you to let me continue to debug and fix it. I want to familiarize
myself with the vsetv pass and this seems like a good opportunity to do so
considering you think the fix is not hard.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #8 from JuzheZhong  ---
Could you continue to debug more cases?


FAIL: gcc.c-torture/execute/pr89369.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/pr89369.c   -O3 -g  execution test
FAIL: gcc.dg/pr96239.c execution test
FAIL: gcc.dg/vshift-5.c execution test
FAIL: gcc.dg/torture/pr61346.c   -O2  execution test
FAIL: gcc.dg/torture/pr61346.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.dg/torture/pr61346.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gcc.dg/torture/pr61346.c   -O3 -g  execution test

These are on an RV32 system. I will fix the memset issue soon today.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #7 from JuzheZhong  ---
Oh. I missed it:

vmv.v.x v2,s0
vse8.v  v2,0(a5)

Leave it to me today. It should be simple fix.

Thanks for report it.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #6 from Vineet Gupta  ---
I have debugged this by single-stepping in qemu

when the test fails (first loop for offset 0, iteration 8)

The last VSETVLI is this one, 

   0x10a3e   0d107057  vsetvli  zero,zero,e32,m2,ta,ma
   0x10a42   j  0x10666

We eventually hit a vmv.v.x, which creates an invalid pattern due to e32.

   (gdb) info reg vtype
   vtype  0xd1  209 # SEW = 010’b / e32, LMUL = 001’b / m2
   (gdb) info reg vl
   vl 0x8   8
   (gdb) info reg a0
   a0 0x41  65

   vmv.v.x  v2,a0

  (gdb) info reg v2
  v2 {q = {0x41004100410041} 
  (gdb) info reg v3
  v2 {q = {0x41004100410041}
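
The `vtype` value in the gdb output above can be decoded by hand. The following is a minimal sketch, assuming the RVV 1.0 `vtype` layout (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7); the struct and function names are hypothetical and not taken from GCC or gdb.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a vtype CSR value (hypothetical helper type). */
struct vtype_fields {
    unsigned sew;   /* element width in bits: 8, 16, 32 or 64 */
    int lmul_log2;  /* log2(LMUL): 1 means m2, -1 means mf2 */
    int vta;        /* 1 = tail agnostic */
    int vma;        /* 1 = mask agnostic */
};

/* Split vtype into its fields, assuming the RVV 1.0 bit layout. */
static struct vtype_fields decode_vtype(uint64_t vtype)
{
    struct vtype_fields f;
    unsigned vlmul = (unsigned)(vtype & 0x7);
    unsigned vsew = (unsigned)((vtype >> 3) & 0x7);

    assert(vsew <= 3);   /* larger vsew encodings are reserved */
    assert(vlmul != 4);  /* vlmul = 100b is reserved */

    f.sew = 8u << vsew;
    /* vlmul is a 3-bit signed exponent: 101b..111b are mf8, mf4, mf2. */
    f.lmul_log2 = vlmul < 4 ? (int)vlmul : (int)vlmul - 8;
    f.vta = (int)((vtype >> 6) & 1);
    f.vma = (int)((vtype >> 7) & 1);
    return f;
}
```

Under that assumed layout, `decode_vtype(0xd1)` yields SEW = 32, LMUL = m2, ta, ma, which agrees with the `e32,m2,ta,ma` operands of the last `vsetvli` shown above.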

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #5 from JuzheZhong  ---
I don't think VSETVL is wrong.


vsetivli zero,8,e8,mf2,ta,ma
sd  ra,120(sp)
vmv.x.s a1,v1
...
.L36:
vse8.v
...
vsetivli zero,8,e32,m2,ta,ma
j   .L36

Both e8mf2 and e32m2 are valid for vse8.v since they have the same SEW/LMUL ratio of 16.
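
The ratio arithmetic can be checked with a small helper. This is only an illustrative sketch: `struct lmul` and `sew_lmul_ratio` are hypothetical names, and LMUL is written as a fraction so that mf2 is 1/2 and m2 is 2/1.

```c
#include <assert.h>

/* LMUL as a fraction num/den: mf2 is {1, 2}, m2 is {2, 1}. */
struct lmul {
    unsigned num;
    unsigned den;
};

/* SEW/LMUL ratio, i.e. sew / (num/den) = sew * den / num. */
static unsigned sew_lmul_ratio(unsigned sew, struct lmul l)
{
    assert(l.num > 0 && l.den > 0);
    return sew * l.den / l.num;
}
```

With these definitions, e8/mf2 gives 8 * 2 / 1 = 16 and e32/m2 gives 32 * 1 / 2 = 16, so both settings yield the same VLMAX for a given VLEN. An equal ratio says nothing about the element width used by other instructions executed under the same vtype.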

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #4 from Vineet Gupta  ---
Created attachment 56541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56541&action=edit
asm output nok

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #3 from Vineet Gupta  ---
Created attachment 56540
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56540&action=edit
asm output ok

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #2 from Vineet Gupta  ---
Created attachment 56539
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56539&action=edit
manually reduced src

[Bug c/112339] ICE with clang::no_sanitize and -fsanitize=

2023-11-08 Thread s_gccbugzilla at nedprod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112339

--- Comment #3 from Niall Douglas  ---
Thanks for the patch. I've sent it on to the originator of the bug; if they
confirm to me that it fixes their issue, I'll let you know.

[Bug target/112447] risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

--- Comment #1 from JuzheZhong  ---
Could you share more assembly?

[Bug c++/104255] parsing function signature fails when it uses a function parameter outside of an unevaluated context

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

Patrick Palka  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-11-08

[Bug c++/104255] parsing function signature fails when it uses a function parameter outside of an unevaluated context

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

Patrick Palka  changed:

   What|Removed |Added

 CC||fchelnokov at gmail dot com

--- Comment #7 from Patrick Palka  ---
*** Bug 112448 has been marked as a duplicate of this bug. ***

[Bug c++/112448] Constraint expression b rejected

2023-11-08 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Patrick Palka  ---
dup

*** This bug has been marked as a duplicate of bug 104255 ***

[Bug c++/112448] Constraint expression b rejected

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

--- Comment #1 from Andrew Pinski  ---
Note the error message on the trunk changed to:
:4:40: error: template argument 1 is invalid
4 | constexpr bool f(auto x) requires b { return true; }
  |^

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #4 from Andrew Pinski  ---
And documented other parts here:
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/cpp/Common-Predefined-Macros.html


specifically:
It does not indicate whether optimizations respect signaling NaN semantics (the
macro for that is __SUPPORT_SNAN__).

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #3 from Andrew Pinski  ---
GCC does document some of this on
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Floating-point-implementation.html
but not the signaling nan part.

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #2 from Andrew Pinski  ---
Note mips and sh and a few other targets have the quiet bit meaning the
opposite.

[Bug c/112449] Arithmetic operations can produce signaling NaNs

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #1 from Andrew Pinski  ---
See
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fsignaling-nans

[Bug c/112449] New: Arithmetic operations can produce signaling NaNs

2023-11-08 Thread post+gcc at ralfj dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

Bug ID: 112449
   Summary: Arithmetic operations can produce signaling NaNs
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: post+gcc at ralfj dot de
  Target Milestone: ---

According to the IEEE 754 specification, the output of an arithmetic operation
can never be a signaling NaN. However, GCC performs optimizations that turn `x
* 1.0` into just `x`, and if `x` is a signaling NaN, that means that the
multiplication will now (seem to) return a signaling NaN. (proof of GCC
performing that transformation: https://godbolt.org/z/scPhn1d8s)

It is very common for C compilers to violate this IEEE 754 requirement, but it
does open the door to a great many questions. Since GCC evidently does not seem
to implement the original IEEE 754 semantics, it would be great to have some
documentation on what exactly GCC *does* implement, and in particular under
which conditions operations are allowed to return a signaling NaN.

So currently, GCC is either buggy because it violates the IEEE 754 spec, or
there's a documentation bug in that the actual floating point spec GCC intends
to implement is not documented. At least, all I was able to find is
https://gcc.gnu.org/wiki/FloatingPointMath, which just says "does not care
about signalling NaNs". (I hope this does not mean that any arithmetic
operation may arbitrarily produce signaling NaNs. That would be an issue for
operations which are sensitive to the difference between quiet NaN and
signaling NaN, such as `pow`.)

As a point of comparison, LLVM recently added this to their documentation to
answer these kinds of questions:
https://llvm.org/docs/LangRef.html#behavior-of-floating-point-nan-values. (That
PR was authored by me but received input from a lot of people.) LLVM goes
further than to just document signaling vs quiet NaN there, since in practice
there's some critical code that would break if arithmetic operations returned
NaNs with arbitrary bits in their payload (specifically, that would break NaN
boxing as performed by some JavaScript engines, or at least make it a lot less
efficient since engines would have to re-normalize NaNs after every single
operation -- which to my knowledge, they don't actually do in practice).
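
To make the `x * 1.0` point above concrete, here is a minimal sketch for inspecting NaN bit patterns. It assumes IEEE 754 binary64 with the convention that a set most significant fraction bit means "quiet" (x86, Arm, RISC-V; comment #2 in this thread notes that some targets use the opposite meaning). All helper names are hypothetical. `times_one` uses a `volatile` operand so the multiplication is not folded away; whether the result is quieted then depends on the hardware, so nothing is asserted about it here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXP_MASK  UINT64_C(0x7FF0000000000000)
#define FRAC_MASK UINT64_C(0x000FFFFFFFFFFFFF)
#define QUIET_BIT UINT64_C(0x0008000000000000)

/* Reinterpret a double as its 64-bit pattern without aliasing issues. */
static uint64_t double_bits(double d)
{
    uint64_t u;
    assert(sizeof d == sizeof u);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Build a double from a raw 64-bit pattern. */
static double from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* NaN (all-ones exponent, non-zero fraction) with the quiet bit clear. */
static int is_signaling_nan_bits(uint64_t u)
{
    int is_nan = (u & EXP_MASK) == EXP_MASK && (u & FRAC_MASK) != 0;
    return is_nan && (u & QUIET_BIT) == 0;
}

/* Multiply by 1.0 through a volatile so the operation is really executed. */
static double times_one(double x)
{
    volatile double one = 1.0;
    return x * one;
}
```

One possible experiment is to compare `is_signaling_nan_bits(double_bits(times_one(x)))` with the same check on a plain `x * 1.0`, where `x` comes from `from_bits(0x7FF0000000000001)`, compiled with and without -fsignaling-nans.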

[Bug target/82524] [7/8 Regression] expensive-optimizations produces wrong results

2023-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82524

--- Comment #22 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:dced5ae64703507a7159972316a1dde48e5f7470

commit r14-5254-gdced5ae64703507a7159972316a1dde48e5f7470
Author: Uros Bizjak 
Date:   Wed Nov 8 21:46:26 2023 +0100

i386: Apply LRA reload workaround to insns with high registers [PR82524]

LRA is not able to reload zero_extracted in-out operand with matched input
operand in the same way as strict_low_part in-out operand.  The patch
applies the strict_low_part workaround, where we allow LRA to generate
an instruction with non-matched input operand, which is split post reload
to the instruction that inserts non-matched input operand to an in-out
operand and the instruction that uses matched operand, also to
zero_extracted in-out operand case.

The generated code from the pr82524.c testcase improves from:

movl    %esi, %ecx
movl    %edi, %eax
movsbl  %ch, %esi
addl    %esi, %edx
movb    %dl, %ah

to:
movl    %edi, %eax
movl    %esi, %ecx
movb    %ch, %ah
addb    %dl, %ah

The compiler is now also able to handle non-commutative operations:

movl    %edi, %eax
movl    %esi, %ecx
movb    %ch, %ah
subb    %dl, %ah

and unary operations:

movl    %edi, %eax
movl    %esi, %edx
movb    %dh, %ah
negb    %ah

The patch also robustifies split condition of the splitters to ensure that
only alternatives with unmatched operands are split.

PR target/82524

gcc/ChangeLog:

* config/i386/i386.md (*add_1_slp):
Split insn only for unmatched operand 0.
(*sub_1_slp): Ditto.
(*_1_slp): Merge pattern from
"*and_1_slp"
and "*_1_slp" using any_logic code iterator.
Split insn only for unmatched operand 0.
(*neg1_slp): Split insn only for unmatched operand 0.
(*one_cmpl_1_slp): Ditto.
(*ashl3_1_slp): Ditto.
(*_1_slp): Ditto.
(*_1_slp): Ditto.
(*addqi_ext_1): Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_2): Merge pattern from
"*addqi_ext_2" and "*subqi_ext_2" using plusminus code
iterator. Redefine as define_insn_and_split.  Add alternative 1
and split insn after reload for unmatched operand 0.
(*subqi_ext_1): Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_0): Merge pattern from
"*andqi_ext_0" and and "*qi_ext_0"
using
any_logic code iterator.
(*qi_ext_1): Merge pattern from
"*andqi_ext_1" and "*qi_ext_1" using
any_logic code iterator. Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_1_cc): Merge pattern from
"*andqi_ext_1_cc" and "*xorqi_ext_1_cc" using any_logic
code iterator. Redefine as define_insn_and_split.  Add alternative
1
and split insn after reload for unmatched operand 0.
(*qi_ext_2): Merge pattern from
"*andqi_ext_2" and "*qi_ext_2" using
any_logic code iterator. Redefine as define_insn_and_split.  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*qi_ext_3): Redefine as
define_insn_and_split.
Add alternative 1 and split insn after reload for unmatched operand
0.
(*negqi_ext_1): Rename from "*negqi_ext_2".  Add
alternative 1 and split insn after reload for unmatched operand 0.
(*one_cmplqi_ext_1): Ditto.
(*ashlqi_ext_1): Ditto.
(*qi_ext_1): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr78904-1.c (test_sub): New test.
* gcc.target/i386/pr78904-1a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-1b.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2a.c (test_sub): Ditto.
* gcc.target/i386/pr78904-2b.c (test_sub): Ditto.
* gcc.target/i386/pr78952-4.c (test_sub): Ditto.
* gcc.target/i386/pr82524.c: New test.
* gcc.target/i386/pr82524-1.c: New test.
* gcc.target/i386/pr82524-2.c: New test.
* gcc.target/i386/pr82524-3.c: New test.

[Bug c++/112448] New: Constraint expression b rejected

2023-11-08 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112448

Bug ID: 112448
   Summary: Constraint expression b rejected
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

This program

struct s { static constexpr bool v = true; };
template inline constexpr bool b = true;
constexpr bool f(auto x) requires b { return true; }
static_assert(f(s{})); // clang ok, gcc nope, msvc ok

is valid per the explanation here https://stackoverflow.com/a/77439003/7325599


but GCC rejects it with the error:

missing template arguments before '<' token
3 | constexpr bool f(auto x) requires b { return true; }
  |^
:3:36: error: expected initializer before '<' token
:4:15: error: 'f' was not declared in this scope
4 | static_assert(f(s{}));

Online demo: https://godbolt.org/z/1Gejvr4cn

[Bug target/111311] RISC-V regression testsuite errors with --param=riscv-autovec-preference=scalable

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311

--- Comment #16 from Vineet Gupta  ---
(In reply to Patrick O'Neill from comment #8)
> Updated regression list using r14-5070-g4ea36076d66 on rv64gcv:
> 
> === gcc: Unexpected fails for rv64gcv lp64d medlow ===
> FAIL: gcc.c-torture/execute/memset-3.c   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL: gcc.c-torture/execute/memset-3.c   -O3 -g  execution test

memset-3 failure tracked separately:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

[Bug target/112447] New: risc-v regression: FAIL: gcc.c-torture/execute/memset-3.c -O3

2023-11-08 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112447

Bug ID: 112447
   Summary: risc-v regression: FAIL:
gcc.c-torture/execute/memset-3.c   -O3
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: vineetg at gcc dot gnu.org
  Reporter: vineetg at gcc dot gnu.org
CC: jeffreyalaw at gmail dot com, juzhe.zhong at rivai dot ai,
lehua.ding at rivai dot ai, rdapp at gcc dot gnu.org
  Target Milestone: ---

As reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311#c8

we have the following execution failures on trunk.

=== gcc: Unexpected fails for rv64gcv lp64d medlow ===
FAIL: gcc.c-torture/execute/memset-3.c   -O3 -g  execution test

The issue is an extraneous VSETVLI instruction (with the wrong SEW) being
generated, which creates the wrong fill pattern for memset.

```
main:

[...]

.L36:  ; 2. loop start for @off 0 
vse8.v  v1,0(t3)
vse8.v  v1,0(t6)
vse8.v  v1,0(s1)
vse8.v  v3,0(a5)
...
; loop epilogue
li  a7,15
beq a4,a7,.L171
vsetvli zero,zero,e32,m2,ta,ma   <--- wrong
j   .L36
```

vsetvli pass dumps:

```
Phase 3: Reduce global vsetvl infos. 

  Compute LCM insert and delete data:

  Expr[2]: VALID (insn 2847, bb 3)
Demand fields: demand_sew_lmul demand_avl
SEW=8, VLMUL=mf2, RATIO=16, MAX_SEW=64
TAIL_POLICY=agnostic, MASK_POLICY=agnostic
AVL=(const_int 8 [0x8])
VL=(nil)

VSETVL infos after phase 3

  bb 3:
probability: always (guessed)
Header vsetvl info:VALID (insn 2847, bb 3) (deleted)  <---
  Demand fields: demand_sew_lmul demand_avl
  SEW=8, VLMUL=mf2, RATIO=16, MAX_SEW=64
  TAIL_POLICY=agnostic, MASK_POLICY=agnostic
  AVL=(const_int 8 [0x8])
  VL=(nil)
```

So it seems LCM is deleting the valid VSETVLI insn which later causes Phase 4
to insert a different/incorrect one.

Reverting the following commit makes the issue go away.

 2023-10-18 f0e28d8c1371 RISC-V: Fix failed hoist in LICM of vmv.v.x
instruction  

This at least tells us the cause of the issue; the next step is to fix it.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #20 from dave.anglin at bell dot net ---
On 2023-11-08 2:07 p.m., pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #18 from Andrew Pinski  ---
> I wonder if -fno-strict-aliasing works around the issue too?
> I get the feeling that `fold mem offset pass` allows the aliasing code to have
> a better time with the offset and that might expose more aliasing issues.
>
> The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
> instead of `-fno-strict-aliasing` as the scheduler is normally where the
> aliasing issues are exposed on the RTL level ...
Both -fno-strict-aliasing and -fno-schedule-insns2, applied to
compiler_visit_expr(), work around the issue.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #19 from Jeffrey A. Law  ---
f-m-o runs post-allocation, so the scope of where its behavior can change
things is narrower.  So testing with -fno-schedule-insns isn't going to be
useful, but -fno-schedule-insns2 might.

I'm a bit concerned that we can't turn off f-m-o with an attribute.  That would
indicate something isn't wired up right in the options handling.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #18 from Andrew Pinski  ---
I wonder if -fno-strict-aliasing works around the issue too?
I get the feeling that `fold mem offset pass` allows the aliasing code to have
a better time with the offset and that might expose more aliasing issues.

The other thing to try is add `-fno-schedule-insns2 -fno-schedule-insns`
instead of `-fno-strict-aliasing` as the scheduler is normally where the
aliasing issues are exposed on the RTL level ...

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

--- Comment #24 from Jonathan Wakely  ---
Oh, but this would be an ABI break. When using the explicit instantiation
definitions in libstdc++.so allocations and deallocations will match because
both will come from the library. But if anything is inlined, code compiled
against older gcc headers might allocate N bytes from _Raw_bytes_alloc, and
other code might deallocate N2 bytes from _Rep_alloc, where N != N2.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #17 from dave.anglin at bell dot net ---
On 2023-11-08 9:42 a.m., jeffreyalaw at gmail dot com wrote:
> I'd probably continue with the process of narrowing down what code is
> affected using the attributes.  We already know the file, narrowing it
> down to a function might help considerably with the evaluation effort.
The problem seems to be in compiler_visit_expr().

-static int compiler_visit_expr(struct compiler *, expr_ty);
+static int compiler_visit_expr(struct compiler *, expr_ty)
__attribute__((optimize("no-inline-small-functions")));

Python builds okay if this function is not inlined, if it is compiled at -O1,
or if -fno-inline-small-functions is specified as above.  Can't specify
-fno-fold-mem-offsets as a function attribute.
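
For anyone repeating this narrowing on another function, here is a minimal
sketch of the per-function attribute approach.  The attribute and its option
string are taken from the diff above; NARROW_NO_INLINE_SMALL and suspect() are
hypothetical names, and the preprocessor guard assumes the attribute is only
honoured by GCC.

```cpp
#include <cassert>

// GCC's optimize attribute is meant for debugging; other compilers may
// reject or ignore it, so it is guarded here (assumption: GCC-only).
#if defined(__GNUC__) && !defined(__clang__)
#  define NARROW_NO_INLINE_SMALL \
     __attribute__((optimize("no-inline-small-functions")))
#else
#  define NARROW_NO_INLINE_SMALL
#endif

// Hypothetical stand-in for the function suspected of being miscompiled.
NARROW_NO_INLINE_SMALL
int suspect(int x)
{
  return x * 2 + 1;
}
```

Moving the macro from one function to the next, and rebuilding each time, is
one way to bisect which function is sensitive to the option.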

[Bug ada/112446] New: Switch -gnatyz included in -gnatyg

2023-11-08 Thread simon at pushface dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112446

Bug ID: 112446
   Summary: Switch -gnatyz included in -gnatyg
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: simon at pushface dot org
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56538
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56538&action=edit
Demonstrator

"gnatmake --help" states that -gnatyg is equivalent to -gnatydISux, but 
in fact the new switch -gnatyz (check parentheses not required by operator 
precedence rules) is included.

If this is deliberate, the help information should say so.

(Personally, I think that clarifying parens are a valuable help to the 
reader! Are the GNAT Style Rules published?)

Given this (see the attachment),

   procedure P (P1, P2 : Boolean) is
      Dummy : Boolean;
   begin
      Dummy := (P1) or P2;
   end P;

this happens:

   $ /opt/gcc-14.0.0-20231105/bin/gnatmake -gnatyg p.adb
   gcc -c -gnatyg p.adb
   p.adb:4:13: (style) redundant parentheses [-gnatyz]

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread adam.andersson at elisapolystar dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #9 from Adam Andersson  ---
I was sure I had tried -fno-strict-aliasing without any difference, but I
guess I messed up somehow. Sorry about that.

Still, is it not strange that -Wall doesn't generate a warning about this then?

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|giovannibajo at gmail dot com  |redi at gcc dot gnu.org

--- Comment #23 from Jonathan Wakely  ---
Created attachment 56537
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56537&action=edit
Fix alignment of COW std::string storage

This fixes both problems, so re-assigning to myself.

[Bug libstdc++/8670] Alignment problem in std::basic_string

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8670

--- Comment #22 from Jonathan Wakely  ---
This bug is still present in the COW std::string, which is still supported even
though it's not the default.

There are two problems. The first is the one reported by James Kanze, that the
string contents need to be aligned to alignof(_CharT) but are currently aligned
to 1. The second, stated by Nathan in comment 13, is that the _Rep object needs
to be aligned to alignof(_Rep), not 1. The reference count in the _Rep ends up
misaligned, and the atomic operations on it are undefined.

Here's a C++17 test case that shows both problems:

#define _GLIBCXX_USE_CXX11_ABI 0

#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>

template<typename T>
struct alloc
{
  using value_type = T;

  alloc() = default;

  template<typename U>
    alloc(const alloc<U>&) { }

  T* allocate(std::size_t n)
  {
    // Offset raw byte allocations by one so the storage is misaligned.
    if constexpr (std::is_same_v<T, char>)
      return next.allocate(n + 1) + 1;
    return next.allocate(n);
  }

  void deallocate(T* p, std::size_t n)
  {
    if constexpr (std::is_same_v<T, char>)
      return next.deallocate(p - 1, n + 1);
    return next.deallocate(p, n);
  }

  [[no_unique_address]] std::allocator<T> next;

  bool operator==(const alloc&) const { return true; }
};

template<typename C>
  using String = std::basic_string<C, std::char_traits<C>, alloc<C>>;

int main()
{
  String<long double> sd(2, 0.0L);
  return sd[1];
}



This results in loads of UBsan errors like:

/usr/include/c++/13/bits/cow_string.h:3604:24: runtime error: member access
within misaligned address 0x006f22b1 for type 'struct _Rep', which requires
8 byte alignment
0x006f22b1: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/bits/char_traits.h:307:15: runtime error: store to
misaligned address 0x006f22c9 for type 'char_type', which requires 16 byte
alignment
0x006f22c9: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/bits/cow_string.h:252:46: runtime error: reference binding
to misaligned address 0x006f22e9 for type 'char_type', which requires 16
byte alignment
0x006f22e9: note: pointer points here
 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
...
/usr/include/c++/13/ext/atomicity.h:84:18: runtime error: load of misaligned
address 0x006f22c1 for type '_Atomic_word', which requires 4 byte alignment
0x006f22c1: note: pointer points here
 00 00 00  00 ff ff ff ff 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00
  ^ 
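
None of the addresses in the reports above is a multiple of the alignment that
UBsan asks for.  A quick way to check that by hand, as a sketch only
(is_aligned is a hypothetical helper, not part of libstdc++ or UBsan):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `addr` is a multiple of `alignment`.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment)
{
  return addr % alignment == 0;
}
```

For example, is_aligned(0x006f22b1, 8) is false for the _Rep address in the
first report, and is_aligned(0x006f22c1, 4) is false for the _Atomic_word load
in the last one.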


It seems to me that we can just do:

  struct __attribute__((__aligned__(__alignof__(_CharT)))) _Rep_base
  {
    size_type       _M_length;
    size_type       _M_capacity;
    _Atomic_word    _M_refcount;
  };

And then stop allocating raw bytes (with alignment 1) to place the _Rep into,
and use this allocator type instead:

typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
  rebind<_Rep>::other _Rep_alloc;

Then of course we need to adjust the size that we (de)allocate, to be in units
of sizeof(_Rep) not bytes.
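
The size adjustment amounts to a round-up division.  A minimal sketch, assuming
a stand-in Rep struct and a hypothetical helper rep_units (neither is the
actual libstdc++ code or the attached patch):

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the COW string header; not the real libstdc++ _Rep.
struct Rep
{
  std::size_t length;
  std::size_t capacity;
  int         refcount;
};

// Hypothetical helper: number of Rep-sized units needed to hold one header
// plus `chars` characters of `char_size` bytes each.  The caller is assumed
// to have counted the terminator in `chars` already.
constexpr std::size_t rep_units(std::size_t chars, std::size_t char_size)
{
  const std::size_t bytes = sizeof(Rep) + chars * char_size;
  return (bytes + sizeof(Rep) - 1) / sizeof(Rep); // round up to whole units
}
```

The same function has to be used for both allocate and deallocate, otherwise
the two sizes can disagree.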

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #8 from Jonathan Wakely  ---
The aliasing doesn't happen when writing to the array, it's when reading a
char* value from an object of type unsigned char*.

If you just passed the unsigned char* to memcpy instead of *(char**) it
would be OK.

memcpy(*, ...) would also be OK.
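
As a sketch of the first suggestion: the reporter's code is not shown in this
thread, so copy_text and its parameters are hypothetical.  The point is only
that the buffer is passed as unsigned char* directly, so nothing ever reads a
char* out of an object declared as unsigned char*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy a C string into a buffer that is held as
// unsigned char*.  memcpy takes void*, so no pointer cast is involved.
inline void copy_text(unsigned char* dst, std::size_t cap, const char* text)
{
  const std::size_t n = std::strlen(text) + 1; // include the terminator
  assert(n <= cap);
  std::memcpy(dst, text, n);
}
```

A call would look like copy_text(buf, sizeof buf, "test!") with buf declared
as an unsigned char array.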

[Bug target/112445] New: [14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1861 unable to find a register to spill: {*umulditi3_1} with -O -march=cascadelake -fwrapv

2023-11-08 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

Bug ID: 112445
   Summary: [14 Regression] ICE: in lra_split_hard_reg_for, at
lra-assigns.cc:1861 unable to find a register to
spill: {*umulditi3_1} with -O -march=cascadelake
-fwrapv
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56536
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56536&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -march=cascadelake -fwrapv testcase.c 
testcase.c: In function 'foo':
testcase.c:19:1: error: unable to find a register to spill
   19 | }
  | ^
testcase.c:19:1: error: this is the insn:
(insn 37 142 199 2 (parallel [
(set (reg:TI 302 [orig:150 _73 ] [150])
(mult:TI (zero_extend:TI (reg:DI 184 [ cu8_0 ]))
(zero_extend:TI (reg:DI 181 [ foo0_s64_0 ]
(clobber (reg:CC 17 flags))
]) "testcase.c":10:9 510 {*umulditi3_1}
 (expr_list:REG_DEAD (reg:DI 184 [ cu8_0 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil
during RTL pass: reload
testcase.c:19:1: internal compiler error: in lra_split_hard_reg_for, at
lra-assigns.cc:1861
0x7f3bef _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/repo/gcc-trunk/gcc/rtl-error.cc:108
0x12b9d6d lra_split_hard_reg_for()
/repo/gcc-trunk/gcc/lra-assigns.cc:1861
0x12b3268 lra(_IO_FILE*)
/repo/gcc-trunk/gcc/lra.cc:2495
0x1261999 do_reload
/repo/gcc-trunk/gcc/ira.cc:5973
0x1261999 execute
/repo/gcc-trunk/gcc/ira.cc:6161
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-5250-20231108213319-g8cf7b936d44-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-5250-20231108213319-g8cf7b936d44-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231108 (experimental) (GCC)

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #11 from Robin Dapp  ---
Thanks, this is helpful.

I have a patch that I just bootstrapped and ran the testsuite with on aarch64.
I'm going to post it soon; maybe Richi still has a better idea of how to work
around this.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406

--- Comment #10 from Tamar Christina  ---
Just finished second bisect and reduce.  Came out to this commit as well.

---

  module brute_force
integer, parameter :: r=9
 integer sudoku1(1, r)
contains
  subroutine brute
  integer l(r), u(r)
 where(sudoku1(1, :) /= 1)
  l = 1
u = 1
 end where
  do i1 = 1, u(1)
 do
end do
 end do
  end
  end

---

gfortran -w -c exchange2.f90 -fprofile-generate -march=armv8-a+sve -Ofast -o
exchange2.f90.o

gives:

during GIMPLE pass: vect
exchange2.fppized2.f90:5:18:

5 |   subroutine brute
  |  ^
internal compiler error: in vect_get_vec_defs_for_operand, at
tree-vect-stmts.cc:1257

which is probably related to your last message.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #7 from Xi Ruoyao  ---
Note that in the "new bug" page, there is a red banner saying:

Before reporting that GCC compiles your code incorrectly, compile it with gcc
-Wall -Wextra and see whether this shows anything wrong with your code.
Similarly, if compiling with -fno-strict-aliasing -fwrapv makes a difference,
your code probably is not correct.

In this case -fno-strict-aliasing makes a difference.  And the code is indeed
incorrect.

[Bug fortran/112407] [13/14 Regression] Fix for PR37336 triggers an ICE in gfc_format_decoder while constructing a vtab

2023-11-08 Thread trnka at scm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112407

--- Comment #5 from Tomáš Trnka  ---
(In reply to Paul Thomas from comment #4)
> Created attachment 56531 [details]
> Fix for this PR
> 
> The bug comes about because the vtable is being declared in one of the
> specific procedures typebound to the derived type, thereby making the
> procedure implicitly recursive. The attached fix gives this specific
> procedure the recursive attribute.

This fix seems to work great, all of our stuff builds and passes tests without
any new trouble (without -frecursive). Your previous patch in comment 2 also
seems to work (our code builds fine, but I haven't tested that variant
thoroughly).

I'm looking forward to any more information on the root cause.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

Xi Ruoyao  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED
 CC||xry111 at gcc dot gnu.org

--- Comment #6 from Xi Ruoyao  ---
It's definitely an aliasing rule violation.  And it's still wrong even if you
use a void pointer.  The void pointer "workaround" just happens to work by
luck.

[Bug middle-end/112444] [14 regression] ICE when buliding libqmi with -O3 -ftrivial-auto-var-init=zero (internal compiler error: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in u

2023-11-08 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112444

--- Comment #1 from Sam James  ---
Created attachment 56535
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56535&action=edit
reduced.i

cvise popped this out; I haven't tried to prettify it by hand at all, as I'm
heading out now.

[Bug c/112442] Segfault from casting a ptr when using -O2

2023-11-08 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112442

--- Comment #5 from Andreas Schwab  ---
warning: dereferencing type-punned pointer will break strict-aliasing rules
[-Wstrict-aliasing]
   15 | test((char **), "test!");

[Bug testsuite/111298] time-profiler-2.c flaky on glibc RISC-V

2023-11-08 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111298

Jorn Wolfgang Rennecke  changed:

   What|Removed |Added

 CC||amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke  ---
(In reply to Patrick O'Neill from comment #0)
> I'm guessing that this is likely due to some conflict between
> time-profiler-1.c and time-profiler-2.c and filing this under testsuite
> framework issue, but feel free to move it if it's likely caused by a
> specific component.

My guess is that the atomic fetch-and-update emitted by
gimple_gen_time_profiler is not actually atomic (at least under RISC-V Qemu).
Note that in time-profiler-2.c, there is a parent and a child process that
access the same gcov data.

[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94

2023-11-08 Thread jeffreyalaw at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #16 from Jeffrey A. Law  ---
On 11/8/23 03:09, manolis.tsamis at vrull dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
> 
> --- Comment #15 from Manolis Tsamis  ---
> (In reply to Sam James from comment #13)
>> Created attachment 56527 [details]
>> compile.c.323r.fold_mem_offsets.bad.xz
>>
>> Output from
>> ```
>> hppa2.0-unknown-linux-gnu-gcc -c  -DNDEBUG -g -fwrapv -O3 -Wall -O2
>> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden
>> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
>> -I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
>> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
>> ```
>>
>> If I instrument certain functions in compile.c with no optimisation
>> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
>> reasonably sure this is the relevant object.
> 
> Thanks for the dump file! There are 66 folded/eliminated instructions in this
> object file; I did look at each case and there doesn't seem to be anything
> strange. In fact most of the transformations are straightforward:
> 
>   - All except a couple of cases don't involve any arithmetic, so it's just
> moving a constant around.
>   - The majority of the transformations are 'trivial' and consist of a single
> add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
> is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist
> and are not optimized elsewhere.
>   - There are some cases with negative offsets, but the calculations look
> correct.
>   - There are a few more complicated cases, but I've done these on paper and
> they also look correct.
The PA port is "weird".  Its addressing modes aren't a good match for 
GCC (they're not symmetrical across loads vs stores and across fp vs 
integer) and they have the implicit space register problem.  But I don't 
immediately recall needing to avoid propagation of constants into memory 
references or anything like that.

I'd probably continue with the process of narrowing down what code is 
affected using the attributes.  We already know the file, narrowing it 
down to a function might help considerably with the evaluation effort.

Note that QEMU has a functional PA port.  So you might be able to just 
take a root filesystem, add the tarball referenced earlier and play 
around to narrow things down further.

I haven't done work on the PA in about 20 years at this point, but I can 
probably still grok its code.  Between David and myself I'm sure we can 
help interpret what's going on.


Jeff
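
For readers not familiar with the pass, the 'trivial' fold quoted above
(X = Y + Const, R = MEM[X + 0] becoming X = Y, R = MEM[X + Const]) can be
pictured at source level.  This is only an analogy in C++, not the RTL the
pass works on, and load_unfolded and load_folded are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Before folding: X = Y + Const; R = MEM[X + 0]
inline int load_unfolded(const int* y, std::ptrdiff_t c)
{
  const int* x = y + c;
  return *(x + 0);
}

// After folding: X = Y; R = MEM[X + Const]
inline int load_folded(const int* y, std::ptrdiff_t c)
{
  const int* x = y;
  return *(x + c);
}
```

Both functions read the same element for any in-bounds offset, including
negative ones, which is why the fold should be safe as long as nothing else
depends on the intermediate value of X.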
