[Bug target/108208] Bad assembly? on large LLVM source files on powerpc-unknown-linux-gnu (Error: operand out of range)

2024-01-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108208

--- Comment #7 from Segher Boessenkool  ---
This PR is for the sysv ABI, while most discussion was about the "ELFv1" ABI.

Only the 64-bit ABIs have the code model ABI, for the powerpc*-*-*
configurations.
Some other architectures have it for more things, and some for fewer, or even
none.

If you get an error at line 577996 of a source file, chances are your code is
just completely unreasonably large, esp. on a smaller target like this :-)

[Bug target/108208] Bad assembly? on large LLVM source files on powerpc-unknown-linux-gnu (Error: operand out of range)

2024-01-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108208

--- Comment #4 from Segher Boessenkool  ---
See my previous comment?

You can either write better code, or use -mcmodel=large or similar, accepting
the not-so-stellar generated code you get then.

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #12 from Segher Boessenkool  ---
(In reply to Eric Botcazou from comment #11)
> > It says those upper bits are well-defined, i.e. whatever MD pattern is used
> > for it eventually will emit machine code that has the exact same result for
> > those upper bits.
> 
> No, that's not true, the set of "register operations" is restricted.

Who what where?  That is not how it is documented.  There is
word_register_operation_p as a bandaid to make it *somewhat* work, added
decades later, but it still won't fly :-(

Different parts of the compiler think it has much more stringent semantics btw.

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #10 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #6)
> I must say I have no idea what WORD_REGISTER_OPERATION says about the upper
> bits of a paradoxical SUBREG if it is a MEM and load_extend_op (inner_mode)
> is ZERO_EXTEND (zeros then?  Then this optimization is ok), or something
> else?  And what it says on REGs.

It says those upper bits are well-defined, i.e. whatever MD pattern is used for
it eventually will emit machine code that has the exact same result for those
upper bits.  This is almost impossible to prove for any non-trivial target, and
certainly extremely fragile.

[Bug rtl-optimization/112758] [13/14 Regression] Inconsistent Bitwise AND Operation Result between int and long long int

2023-12-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112758

--- Comment #4 from Segher Boessenkool  ---
WORD_REGISTER_OPERATIONS is extremely ill-defined.  Or, it is used for other
things than what it stands for, whichever way you want to look at it.

A backend that defines the macro to non-zero promises that for *any* operation
on any values in a smaller than full-register mode, the compiler can instead
do the operation in that full-register mode, and all the resulting bits will
be well-defined.  This is not true for most real non-trivial backends.

There is word_register_operation_p to filter out the most obvious and egregious
cases where WORD_REGISTER_OPERATIONS is just a foolish thing, but this function
isn't used nearly enough, and it doesn't filter out enough either.
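
To make the "all the resulting bits" concern concrete, here is a toy model in
plain C.  It is not GCC internals, and the helper names low8, word_add and
word_lshr are made up for illustration.  It treats a uint32_t as a full
register holding an 8-bit value in its low bits.  A word-mode add gives the
same low byte whatever the upper bits hold, while a word-mode logical shift
right lets those upper bits leak into the low byte:

```c
#include <stdint.h>

/* Toy model: a 32-bit "register" holding an 8-bit value in bits 0..7.
   Bits 8..31 are the upper bits the narrow mode says nothing about.  */

/* Extract the narrow (8-bit) value from the full register.  */
uint32_t low8(uint32_t reg)
{
    return reg & 0xffu;
}

/* Add done in the full register: the low byte of the result depends
   only on the low bytes of the inputs.  */
uint32_t word_add(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Logical shift right done in the full register: whatever sat in the
   upper bits of the input moves down into the low byte.  */
uint32_t word_lshr(uint32_t a, unsigned n)
{
    return a >> n;   /* n must be less than 32 */
}
```

With two registers that agree in the low byte but not above it (say 0x00000080
and 0xabcdef80), word_add agrees in the low byte and word_lshr by 4 does not.
Whether the wide operation is a valid substitute therefore depends on the
operation and on what the upper bits hold.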

[Bug target/110606] ICE output_operand: '%&' used without any local dynamic TLS references on powerpc64le-linux-gnu

2023-12-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110606

--- Comment #8 from Segher Boessenkool  ---
What does "dead at sched2" mean?  Are they dead when sched2 starts, or made
dead by it?  Well it must be the former; what pass does make it dead, then?
split3 apparently?  Why is this not done in split2 already, any good reason?

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-12-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

--- Comment #13 from Segher Boessenkool  ---
(In reply to Peter Bergner from comment #12)
> I'll note that you don't always
> get an assembler error, since gcc still passes -many to the assembler for
> non --enable-checking gcc builds, which causes it to accept the fctid insn.

Hrm.  Was that an oversight?  Should we always do that now?  Can you prepare a
patch (and test on some common configs) please?

[Bug target/110606] ICE output_operand: '%&' used without any local dynamic TLS references on powerpc64le-linux-gnu

2023-11-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110606

--- Comment #5 from Segher Boessenkool  ---
The insn that it fails on is the result from a split using *tls_ld .

[Bug target/110606] ICE output_operand: '%&' used without any local dynamic TLS references on powerpc64le-linux-gnu

2023-11-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110606

--- Comment #4 from Segher Boessenkool  ---
It needs   -O2 -fPIC -fno-exceptions   to fail.

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #8 from Segher Boessenkool  ---
Yeah, it tested for ISA 2.04 before.  That was an attempt at including 476
probably?

We really should have a TARGET_FCTID, on for TARGET_POWERPC64 or for cpu 476
(so NOT user-selectable separately, of course!); not try to use pre-existing
flags for this, which might work but will forever stay confusing.

So either a separate OPTION_FCTID for use in rs6000-cpus.def, or TARGET_FCTID.
Either works for me.

(Background: in ISA 1.xx it was for 64-bit implementations only.  But it does
not need 64-bit registers or a 64-bit integer pipeline at all, it is an FP
instruction that works on FP registers, which always are 64-bit.  The
instruction was implemented on the 476).

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2023-11-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

--- Comment #2 from Segher Boessenkool  ---
In all those cases the code is perfectly fine, but also in all of those cases
the code is still suboptimal: the rldicl is just as superfluous as the second
rlwinm was!  :-)
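
A small C model can show why the trailing rldicl does nothing.  This is only a
sketch: the function names are made up, and rlwinm is modelled for the
non-wrapping mask case (mb <= me) only, which covers the 16..31 and 24..31
masks in the diff.  In that case rlwinm already leaves the upper 32 bits of
the 64-bit register zero, so clearing them again with "rldicl 3,3,0,32"
changes nothing:

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n (0..31).  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Model of "rlwinm ra,rs,sh,mb,me" on a 64-bit register, for the
   non-wrapping case mb <= me only (an assumption of this sketch).
   Bits are numbered IBM style: bit 0 is the most significant bit of
   the 32-bit word.  The result has zeros in the upper 32 bits.  */
uint64_t rlwinm(uint64_t rs, unsigned sh, unsigned mb, unsigned me)
{
    uint32_t mask = (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
    return (uint64_t)(rotl32((uint32_t)rs, sh) & mask);
}

/* Model of "rldicl ra,rs,0,mb": no rotate, clear the mb leftmost bits
   of the 64-bit register (mb in 0..63).  */
uint64_t rldicl_sh0(uint64_t rs, unsigned mb)
{
    return rs & (~UINT64_C(0) >> mb);
}
```

For any input x, rldicl_sh0(rlwinm(x, 1, 16, 31), 32) equals
rlwinm(x, 1, 16, 31) in this model.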

[Bug target/112103] [14 regression] gcc.target/powerpc/rlwinm-0.c fails after r14-4941-gd1bb9569d70304

2023-11-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112103

--- Comment #1 from Segher Boessenkool  ---
Those are:

$ diff -up rlwinm-0.s{.12,}
--- rlwinm-0.s.12   2023-11-09 18:28:49.362639203 +
+++ rlwinm-0.s  2023-11-09 18:30:46.422896735 +
@@ -6747,7 +6747,7 @@ f_1_16_31:
 .LFB345:
.cfi_startproc
rlwinm 3,3,1,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -7645,7 +7645,7 @@ f_1_24_31:
 .LFB390:
.cfi_startproc
rlwinm 3,3,1,24,31
-   rlwinm 3,3,0,0xff
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -11235,7 +11235,7 @@ f_2_16_31:
 .LFB570:
.cfi_startproc
rlwinm 3,3,2,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -12133,7 +12133,7 @@ f_2_24_31:
 .LFB615:
.cfi_startproc
rlwinm 3,3,2,24,31
-   rlwinm 3,3,0,0xff
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -15722,7 +15722,7 @@ f_7_16_31:
 .LFB795:
.cfi_startproc
rlwinm 3,3,7,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -16620,7 +16620,7 @@ f_7_24_31:
 .LFB840:
.cfi_startproc
rlwinm 3,3,7,24,31
-   rlwinm 3,3,0,0xff
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -20207,7 +20207,7 @@ f_8_16_31:
 .LFB1020:
.cfi_startproc
rlwinm 3,3,8,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -24691,7 +24691,7 @@ f_9_16_31:
 .LFB1245:
.cfi_startproc
rlwinm 3,3,9,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -29174,7 +29174,7 @@ f_15_16_31:
 .LFB1470:
.cfi_startproc
rlwinm 3,3,15,16,31
-   rlwinm 3,3,0,0x
+   rldicl 3,3,0,32
blr
.long 0
.byte 0,0,0,0,0,0,0,0
@@ -67092,4 +67092,4 @@ f_31_31_31:
.cfi_endproc
 .LFE3375:
.size   f_31_31_31,.-.L.f_31_31_31
-   .ident  "GCC: (GNU) 12.0.1 20220406 (experimental)"
+   .ident  "GCC: (GNU) 14.0.0 20231103 (experimental)"

[Bug rtl-optimization/106594] [13/14 Regression] sign-extensions no longer merged into addressing mode

2023-10-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594

--- Comment #27 from Segher Boessenkool  ---
(In reply to Roger Sayle from comment #21)
> Segher has proposed that object code size correlates with the quality of

It isn't a proposition, it is a simple and obvious fact.  But, this isn't
exactly what I say :-)

Code size strongly correlates with number of instructions, almost 1-1 on most
targets.  Number of instructions is exactly what combine tries to reduce.

Whether that makes the code actually better is something completely separate as
well.  If your instruction cost function (and please use insn_cost, it is much
easier to use, and thus gives better results than rtx_costs) is good, this of
course should work fine.  And there is a hook (TARGET_LEGITIMATE_COMBINED_INSN)
for the very nasty cases.

But the whole "fewer insns that do the same thing, is better" thing is not
actually true on some targets.  Such targets are incredibly hard to optimise
for.  There is no way combine can do a good job for such targets.  It is
incredibly hard for human programmers to write good machine code for such
systems by hand as well.

I do use object code size **of a huge sample** as a quick and dirty sniff test
to see if a change to combine is good or bad.  After that I always look at the
actual changes as well.  I do realise all pitfalls associated with this :-)

[Bug target/111367] Error: operand out of range (0x1391c is not between 0xffffffffffff8000 and 0x7fff)

2023-09-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111367

--- Comment #11 from Segher Boessenkool  ---
> > There really should be a comment why one alternative needs the %U{n} and the
> > other can
> > ignore it, btw.  Nothing new there, but a head-scratcher :-)
> 
> OK, something like: "prefixed load/store insns only have D-form but no
> update and X-form"?

Exactly.  Something short is plenty, but if there is nothing there it is
surprising.  Surprising is bad :-)

[Bug target/111367] Error: operand out of range (0x1391c is not between 0xffffffffffff8000 and 0x7fff)

2023-09-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111367

--- Comment #9 from Segher Boessenkool  ---
I don't like that "wzd" attribute at all.  Please just put an "if" for the mode
around this -- everywhere else (including in a large part of this patch!) we
deal with SImode and DImode separately already.  Or perhaps you can use the
"ptrload" attribute, which includes the "l"?

There really should be a comment why one alternative needs the %U{n} and the
other can ignore it, btw.  Nothing new there, but a head-scratcher :-)

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2023-09-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #60 from Segher Boessenkool  ---
(In reply to Roman Krotov from comment #59)
> All, what I'm asking for, is to make something like -Wno-void-unused, which
> would suppress the warnings only for the (void) casted calls.

So you want to not warn for some (just *some*) explicitly unused cases, and do
warn for other explicitly unused cases, and all implicitly unused cases?  While
the author of the code explicitly asked for a warning message to be emitted in
all such cases: "The 'warn_unused_result' attribute causes a warning to be
emitted if a caller of the function with this attribute does not use its return
value."

> This is desperately needed by the projects like systemd (see the first link
> in my first comment) as a less severe variant than -Wno-unused-result, so
> that they won't get punished with less diagnostics.

They (like EVERYONE ELSE IN THE WORLD) should not use -Werror, if they do not
like punishment.  Warnings are warnings.  The author of your code (the header
files for the library code) wanted everyone to be warned about not using the
return value from a certain function.  He/she was almost certainly right about
that.  And it is easy to suppress the warning in the few cases where you really
want to.
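
A minimal sketch of what suppressing the warning at a single call site can
look like.  parse_digit and digit_or_zero are made-up names for illustration;
storing the result in a variable and then discarding that variable is a
commonly used idiom, not something this comment prescribes:

```c
/* Hypothetical library function: the author wants every caller to be
   warned if the return value is not used.  Returns 0 on success and
   -1 if c is not a decimal digit.  */
__attribute__((warn_unused_result))
int parse_digit(char c, int *out)
{
    if (c < '0' || c > '9')
        return -1;
    *out = c - '0';
    return 0;
}

/* A caller that has decided, for this one call, that the result really
   does not matter.  With GCC, "(void) parse_digit(c, &d);" still warns,
   which is what this PR is about; assigning the value first does not.  */
int digit_or_zero(char c)
{
    int d = 0;
    int status = parse_digit(c, &d);
    (void) status;   /* the result is used (assigned), then discarded */
    return d;
}
```

This keeps the warning active for every other call to the function, which is
the behaviour the author of the attribute asked for.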

> I don't see any reason not to implement -Wno-void-unused with the similar
> description (stating that it's not recommended, if you want) to help the
> projects like systemd.

Define what it would do *exactly*, make a patch for it (including for the
documentation, amending all existing documentation as well), and do that in
such a way that it a) is correct, and b) makes any sense.  Then send the
patch to gcc-patches@.  If you do not want to do all that work (including the
very much non-trivial amount of follow-up work that will cause), then please
go away?  Don't tell us to do insane things that are an incredible amount of
work just because you had a bad idea and now want it to become reality.

> It won't change the meaning of the wur attribute, bacause it will be a
> non-default switch.

This makes no sense at all.

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #13 from Segher Boessenkool  ---
So.  Before expand we have

  _6 = (__int128) x_3(D);
  x.0_1 = _6 << 59;
  _2 = x.0_1 >> 59;
  _4 = (__int128 unsigned) _2;
  return _4;

That should have been optimised better :-(

The RTL code it expands to sets the same pseudo multiple times.  Bad bad bad.
This hampers many optimisations.  Like:
(insn 6 3 7 2 (set (reg:DI 124)
(lshiftrt:DI (reg:DI 129 [ x+8 ])
(const_int 5 [0x5]))) "110717.c":6:11 299 {lshrdi3}
 (nil))
(insn 7 6 8 2 (set (reg:DI 132)
(ashift:DI (reg:DI 128 [ x ])
(const_int 59 [0x3b]))) "110717.c":6:11 289 {ashldi3}
 (nil))
(insn 8 7 9 2 (set (reg:DI 132)
(ior:DI (reg:DI 124)
(reg:DI 132))) "110717.c":6:11 233 {*booldi3}
 (nil))
(They are subregs right after expand, totally unreadable; this is after
subreg1, slightly more readable, but essentially the same code still).

The web pass eventually gets rid of the double set in this case.

Because the shift-left-then-right survives all the way to combine, it (being
the greedy bastard that it is) will use the combiner patterns rs6000 has for
multi-precision shifts, before it would notice the two (multiprecision!)
shifts together are largely a no-op, so you get stuck at a local optimum.
Par for the course for combine :-/
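
To illustrate "largely a no-op", here is a sketch in C of the two shifts next
to a per-word equivalent.  The source shape is an assumption reconstructed
from the GIMPLE quoted above, the function names are made up, and it needs a
compiler and target with __int128 (for example 64-bit GCC):

```c
#include <stdint.h>

typedef unsigned __int128 u128;
typedef __int128 s128;

/* Assumed source shape: shift left by 59, then arithmetic shift right
   by 59.  The left shift is done on the unsigned type to avoid signed
   overflow; the conversion to signed and the arithmetic right shift
   rely on GCC's documented implementation-defined behaviour.  */
u128 sext_via_shifts(u128 x)
{
    s128 t = (s128)(x << 59);
    return (u128)(t >> 59);
}

/* The same result computed per 64-bit word: the low word is untouched,
   and only the high word is sign-extended from its low 5 bits.  */
u128 sext_wordwise(u128 x)
{
    uint64_t lo = (uint64_t)x;
    uint64_t hi = (uint64_t)(x >> 64);
    int64_t hi_ext = (int64_t)(hi << 59) >> 59;
    return ((u128)(uint64_t)hi_ext << 64) | lo;
}
```

The second form touches only one of the two words, which is the better result
that the multi-precision shift patterns hide from combine.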

[Bug rtl-optimization/110717] Double-word sign-extension missed-optimization

2023-07-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #12 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #9)
> Wonder how many important targets provide double-word shift patterns vs.
> ones which expand it through generic code.

Very long ago rs6000 had special code for this.  That was sub-optimal in
other ways, and the generic code generated almost ideal code (sometimes an
extra data movement insn).

> powerpc probably could be improved:
> foo:
> srwi 9,4,5
> mr 10,9
> rlwimi 4,9,5,0,31-5
> rlwimi 10,3,27,0,31-27
> srawi 3,10,27
> blr

This is hugely worse than what we used to do, it seems?

GCC 8 did

srdi 9,4,5
rldimi 9,3,59,0
rldimi 4,9,5,0
sradi 3,9,59
blr

GCC 9 started with the unnecessary move.

But we should get only one insert insn in any case!

[Bug target/110762] inappropriate use of SSE (or AVX) insns for v2sf mode operations

2023-07-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #5 from Segher Boessenkool  ---
(In reply to Richard Biener from comment #2)
> The
> 
>(insn 13 4 14 2 (set (reg:V2SF 20 xmm0 [orig:91 x2 ] [91])
> (vec_select:V2SF (reg:V4SF 20 xmm0 [94])
> (parallel [
> (const_int 0 [0])
> (const_int 1 [0x1])
> ]))) "t.c":10:12 4394 {sse_storelps}
>  (nil))
> 
> insns are gone in split after reload.

Insns 13 and 14 are deleted by split2, yes.  Although the very next insn
(15) obviously uses the regs (20 and 21) those insns set?!

[Bug target/106895] powerpc64 unable to specify even/odd register pairs in extended inline asm

2023-07-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106895

--- Comment #12 from Segher Boessenkool  ---
> I guess that would be annoying if you couldn't have modifiers on constraints

There is no such thing as "operand modifiers".  There are *output* modifiers:
they change how an operand is *printed*, they do not change the operand in any
way, shape, or form.

> or a bad algorithm for working them out. Fair enough.

No idea what you mean here?

> > > or why TI doesn't work but PTI apparently would,
> > 
> > Because this is exactly what PTImode is *for*!
> 
> Right I accept it is, I meant I just would not have been able to work it out
> (assuming if PTI was documented it would be "Partial Tetra Integer" and be
> no more useful than the other P?I type documentation.

For the rs6000 port, multi-register operands are not restricted to aligned
register numbers ("even/odd pairs").  (Some other ports do have this).  We use
the existing PTI mode for that (it also can be allocated in GPRs only, never in
VSRs, unlike TImode).

"Partial" does not have much meaning here.  A minority of ports use partial
integer words for what they were introduced for originally: modes that are
smaller than a full register, say, a 24-bit mode when registers are 32 bits.

We use it as another integer mode that is the same size.  It is unfortunate
that we still have to resort to such tricks.

[Bug target/106895] powerpc64 unable to specify even/odd register pairs in extended inline asm

2023-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106895

--- Comment #10 from Segher Boessenkool  ---
(In reply to Nicholas Piggin from comment #9)
> I don't know why constraint is wrong and mode is right

Simple: you would need O(2**T*N) constraints for our existing N register
constraints, together with T features like this.  But only O(2**T) modes at
most.

> or why TI doesn't work but PTI apparently would,

Because this is exactly what PTImode is *for*!

> but I'll take anything that works. Could we
> get PTI implemented? Does it need a new issue opened?

It was implemented in 2013.  The restriction to only even pairs was a bugfix,
also from 2013.

If you have code like

  typedef __int128 __attribute__((mode(PTI))) even;

you get an error like

  error: no data type for mode 'PTI'

This needs fixing.  You can keep it in this PR?

[Bug target/106895] powerpc64 unable to specify even/odd register pairs in extended inline asm

2023-07-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106895

--- Comment #8 from Segher Boessenkool  ---
(In reply to Peter Bergner from comment #6)
> (In reply to Segher Boessenkool from comment #5)
> > Constraints are completely the wrong tool for this.  Just use modes, which
> > *are* the right tool?
> 
> Well you cannot specify modes in the asm, so I think you're saying we need
> use the correct type that maps to a internal to GCC mode that has the
> even/odd register behavior, so something like:
> 
>   unsigned int foo __attribute__ ((mode (XX)));
> 
> ...where XXmode is the new integer mode that gives us even/odd register
> pairs?  Of course we have to be careful about how this all works wrt -m32
> versus -m64.

No, the type there is "unsigned int".  I meant to say exactly what I did say:
just use modes.  Which you indeed do in user code by the mode attribute, yes.

And you do not need a new mode: PTImode should just work.  But the user
specifying that is currently broken it seems?

Without -mpowerpc64 you cannot *have* 128-bit integers in registers.  That
should be fixed, but you cannot have it in just *two* registers, which is what
is required here.  For most targets that then means -m64 is required.
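
For reference, this is what selecting a mode from user code looks like with a
mode that GCC does accept.  The sketch uses DImode purely as an illustration,
and the typedef and function names are made up; per the comments in this PR,
writing mode(PTI) the same way is what currently does not work:

```c
/* The declared type is still "unsigned int"; the mode attribute only
   changes the machine mode (and so the width) used for it.  DI is the
   64-bit integer mode on the usual targets.  */
typedef unsigned int di_uint __attribute__((mode(DI)));
typedef int di_int __attribute__((mode(DI)));

/* Trivial use, to show that the typedef behaves as a 64-bit integer.  */
di_uint high_bit(void)
{
    return (di_uint)1 << 63;
}
```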

[Bug target/106895] powerpc64 unable to specify even/odd register pairs in extended inline asm

2023-07-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106895

--- Comment #5 from Segher Boessenkool  ---
Constraints are completely the wrong tool for this.  Just use modes, which
*are* the right tool?

[Bug target/78904] zero-extracts are not effective

2023-06-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #17 from Segher Boessenkool  ---
(In reply to Roger Sayle from comment #16)
> Just to warn people in advance, the test case pr78904-1b.c is expected to
> start FAILing with the commit of
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622079.html and is
> scheduled to be resolved 24-48 hours later (over the weekend) by
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622078.html
> As explained in
> https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622083.html this is to
> investigate additional tweaks and whether alternate fixes are more
> appropriate.

Thanks for the warning Roger!  Much appreciated.

That fix is for x86 only though?  Is that really the only target affected?

[Bug target/54089] [SH] Refactor shift patterns

2023-06-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #94 from Segher Boessenkool  ---
(In reply to Alexander Klepikov from comment #92)
> I remembered why I used two different insns - first to eliminate infinite
> loop with help of marking insn with attribute, and second because I could
> not set attribute when emitting insn from C code. Whe have 'get_attr_*'
> functions but we have not 'set_attr_*'.

An attribute is part of the instruction *definition*, the define_insn; it isn't
a property you put on one instance of it.

[Bug testsuite/101002] Some powerpc tests fail with -mlong-double-64

2023-06-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101002

--- Comment #9 from Segher Boessenkool  ---
(In reply to Peter Bergner from comment #4)
> These die because the struct we're using to check the alignment of uses long
> double as the "big" aligned type.  We could either disable the tests using a
> "dg-require-effective-target longdouble128" or we could use a different more
> aligned type in the struct.  Maybe _Float128 or _Decimal128 or use an
> attribute aligned?   Thoughts?

Maybe just some vector type?  Those have 128-bit alignment even with
-mno-altivec, right?

> gcc.target/powerpc/pr85657-3.c
> gcc.target/powerpc/signbit-1.c
> pr85657-3.c:38:20: error: unknown type name ‘__ibm128’; did you mean
> ‘__int128’?
> 
> These die because we don't create the type __ibm128 when using
> -mlong-double-64, which seems strange since we do create the __float128 type
> used in the test cases.
> 
> Mike, I assume the __ibm128 type should always be created?

It always should, yes.  Always.  Unconditionally.

[Bug target/54089] [SH] Refactor shift patterns

2023-06-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #88 from Segher Boessenkool  ---
(In reply to Oleg Endo from comment #85)
> > +/* { dg-final { scan-assembler 
> > "_f_loop1_rshift:.*mov\.l\\t(\.L\[0-9\]+),(r\[0-9\]+).*sts.l\\tpr,@-r15.*(\.L\[0-9\]+):.*jsr\\t@\\2.*bf\.s\\t\\3.*\\1:\\n\\t\.long\\t___ashiftrt_r4_6.*_f_loop2_rshift:"
> >  { target { ! has_dyn_shift } } } }  */
> 
> Can you try to somehow write this in a simpler way?  Maybe omit some of the
> register number matches, as they don't matter etc.

Do not use double-quoted strings unless you need interpolation?  If you use {}
around the string you do not need to backslash-quote (and double-quote) so much
at all.

[0-9] is \d

whitespace is \s

See the Tcl re_syntax manual page :-)

[Bug target/54089] [SH] Refactor shift patterns

2023-06-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #87 from Segher Boessenkool  ---
(In reply to Oleg Endo from comment #53)
> (In reply to Segher Boessenkool from comment #52)
> > There is TARGET_LEGITIMATE_COMBINED_INSN though, which is a workaround for 
> > if
> > you really do not want the instruction combiner to create particular
> > instruction patterns (but it does nothing to prevent other parts of the
> > compiler from doing the same!)
> 
> Thanks for pointing it out.  I knew I missed something recent ;)

g:78e4f1ad4e48, from 2012?  Well, fairly recent, okay :-)

[Bug rtl-optimization/110254] improve_allocation() routine does not update allocated_hardreg_p[] array

2023-06-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110254

--- Comment #1 from Segher Boessenkool  ---
Off topic / pet peeve: it's not an array of functions, so it should not be
called something_p .

[Bug driver/71850] @file should be used to cc1/cc1plus when @file is used

2023-06-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71850

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Segher Boessenkool  ---
Costas says this is fixed by g:180ebb8a24d2 .  Marking as such.  Thanks :-)

[Bug target/54089] [SH] Refactor shift patterns

2023-06-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #52 from Segher Boessenkool  ---
(In reply to Alexander Klepikov from comment #50)
> But maybe there is a way to exclude particular insn from combine pass? (I
> guess not).

In general, it is best to let combine just work on everything.  It will not
replace instructions if the replacement is more expensive, and it will only
ever create instruction sequences with the same semantics as what it started
with.

There is TARGET_LEGITIMATE_COMBINED_INSN though, which is a workaround for if
you really do not want the instruction combiner to create particular
instruction patterns (but it does nothing to prevent other parts of the
compiler from doing the same!)

[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails

2023-05-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949

--- Comment #6 from Segher Boessenkool  ---
(In reply to Matthias Kretz (Vir) from comment #4)
> With -mcpu=power10 I see the issue. The problem has been there all the time
> and only surfaced with this test. (It should also have shown on `make
> check-simd` in libstdc++.)

Yup, you should never use -mpower9-vector and friends.  Such options are handy
*during development* but are heavily problematic later; they should never have
existed in mainline.

What is the actual problem here?  Or do you want to build up the suspense and
only show it in the patch you will send :-)

[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails

2023-05-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949

--- Comment #5 from Segher Boessenkool  ---
(In reply to Matthias Kretz (Vir) from comment #2)
> Yes, I stopped my backporting efforts when I became aware that it's failing
> on ARM. I'll get to PPC ASAP and then continue with the backports.

You should backport to N-1 first, only then to N-2, etc.  Sanity is nice :-)

Next time :-)

[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power

2023-05-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858

--- Comment #10 from Segher Boessenkool  ---
(In reply to Hongtao.liu from comment #8)
> (In reply to Segher Boessenkool from comment #7)
> > > The patch will still use GENERAL_REGS when hard_regno_mode_ok for mode and
> > > GENERAL_REGS(which is the case in PR109610), hope it can also fix this
> > > regression.
> > 
> > That sounds more reasonable.  But, why use any heuristics like this?  Can't
> > you
> > just look at the actual costs of using mem and regs?
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109610#c2

That is not an answer to my question at all?

[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power

2023-05-16 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858

--- Comment #7 from Segher Boessenkool  ---
> The patch will still use GENERAL_REGS when hard_regno_mode_ok for mode and
> GENERAL_REGS(which is the case in PR109610), hope it can also fix this
> regression.

That sounds more reasonable.  But, why use any heuristics like this?  Can't you
just look at the actual costs of using mem and regs?

[Bug target/109610] [14 regression] gcc.target/powerpc/dform-3.c fails after r14-172-g0368d169492017

2023-05-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109610

--- Comment #11 from Segher Boessenkool  ---
(In reply to Hongtao.liu from comment #5)
> One solution is add an peephole for handle such redudancy.

Not okay.

> If powerpc maintainer doesn't like this way, another alternative is add a
> target hook in RA to still use GENEREAL_REGS for other targets, but use
> NO_REGS only for x86.

Also not okay.  Please solve the fundamental problem in cost estimation you
created, don't let all targets try to fix it in different ways :-(

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2023-05-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Segher Boessenkool  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-03

--- Comment #2 from Segher Boessenkool  ---
vect_long_mult has
 || (([istarget powerpc*-*-*]
  && ![istarget powerpc-*-linux*paired*])
  && [check_effective_target_ilp32])
which does not know that p10 has 64x64->64 mult in vectors (and has weird
parens as well :-P )  The linux*paired* case can be removed of course.

Confirmed.  Should we open a separate bug for this Power problem, or handle it
here?

[Bug target/109566] [12/13/14 Regression] powerpc: unrecognizable insn for -mcpu=e6500, -mcpu=power3, ..., -mcpu=power10

2023-04-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109566

--- Comment #17 from Segher Boessenkool  ---
So, apparently powerpc-rtems uses -mpowerpc64 by default?!  That is problematic,
it changes the ABI, might not actually work at all (it requires your
setjmp/longjmp and getcontext/setcontext to restore the full 64-bit registers),
and is often bigger and slower code (but not always).

I suppose powerpc-rtems gets this from a default CPU choice?

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2023-04-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #51 from Segher Boessenkool  ---
(In reply to rusty from comment #47)
> Civility please.

Thank you.

> As Andrew Pinski says "people are mis-using this attribute", and Jakub
> Jelinek makes a similar point.  The use of _wur has changed from "ignoring
> the result is criminally wrong" to "possibly wrong".

And that is the core of why this issue reinflames once in a while: some people
abuse the attribute, and the compiler cannot read minds.


The documentation of this attribute states
'warn_unused_result'
 The 'warn_unused_result' attribute causes a warning to be emitted
 if a caller of the function with this attribute does not use its
 return value.  This is useful for functions where not checking the
 result is either a security problem or always a bug, such as
 'realloc'.

The "non-bugs" section of the manual ("Certain Changes We Don't Want to Make")
says
   * Warning when a non-void function value is ignored.

 C contains many standard functions that return a value that most
 programs choose to ignore.  One obvious example is 'printf'.
 Warning about this practice only leads the defensive programmer to
 clutter programs with dozens of casts to 'void'.  Such casts are
 required so frequently that they become visual noise.  Writing
 those casts becomes so automatic that they no longer convey useful
 information about the intentions of the programmer.  For functions
 where the return value should never be ignored, use the
 'warn_unused_result' function attribute (*note Function
 Attributes::).

Completely useless casts to void cluttered programs decades ago already,
we do not fear cargo cult, instead we observed it already existed.

And finally there is
'-Wno-unused-result'
 Do not warn if a caller of a function marked with attribute
 'warn_unused_result' (*note Function Attributes::) does not use its
 return value.  The default is '-Wunused-result'.

A caller that casts a return value to void *explicitly* does not use that
return value.


> I still put a comment complaining about this every time I hit it, which is
> about once or twice a year.  But I have little more to say; it's been almost
> 20 years after all :)

Changing the behaviour of this attribute after all that time will not make
things better.  But perhaps we can say a bit more in the documentation,
maybe at one of the three very concise quotes above?  Say half a line worth?
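
To make the behaviour discussed above concrete, here is a minimal C sketch.
The helper grow_buffer and the function demo_checked_call are hypothetical
names invented for illustration; only the attribute and the -Wunused-result
warning come from the quoted documentation.  With GCC, the commented-out
(void) cast would still draw the warning, while actually using the result
does not:

```c
#include <stdlib.h>

/* Hypothetical helper (not from the thread): resizes *buf, returns 0 on
   success and -1 on failure.  Ignoring the result would hide the failure. */
__attribute__((warn_unused_result))
static int grow_buffer(char **buf, size_t new_size)
{
    char *p = realloc(*buf, new_size);
    if (p == NULL)
        return -1;              /* *buf is untouched, still owned by caller */
    *buf = p;
    return 0;
}

/* Returns 0 if the buffer could be grown, -1 otherwise. */
int demo_checked_call(void)
{
    char *buf = NULL;

    /* (void) grow_buffer(&buf, 64);    GCC still warns here: the cast
                                        does not count as a use.  */

    if (grow_buffer(&buf, 64) != 0)     /* result is used: no warning */
        return -1;

    free(buf);
    return 0;
}
```

Where ignoring the result really is intended, the direct route named in the
documentation quoted above is -Wno-unused-result.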

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2023-04-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #43 from Segher Boessenkool  ---
(In reply to Andrew Church from comment #40)
> My rationale for changing the default behavior is that the wider community
> consensus, as evidenced by things like the C++ (and C2x) [[nodiscard]]
> specification, the behavior of Clang, and the balance of comments on this
> bug, seems to be that casting a discarded return value to void should
> suppress any warning about the discarded value; and under the principle of
> least surprise, GCC should follow that consensus by default even if it also
> provides alternative behaviors.

That is not the consensus, no.  "Consensus" does not mean doing what the
unthinking masses shout.

There already are easy ways to suppress the warning, very direct, and
very descriptive ways.  A cast to void is round-about, cryptic, and already
is cargo-cult, before this attribute existed even!  So allowing casts to void
to suppress this warning means the warning becomes less useful, and people
will write worse code.  That is not something GCC should encourage IMO.

[Bug target/109501] rs6000: Add suggested defines for vec_test_data_class

2023-04-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109501

Segher Boessenkool  changed:

               What|Removed                             |Added

   Last reconfirmed|                                    |2023-04-13
     Ever confirmed|0                                   |1
            Summary|vec_test_data_class defines missing |rs6000: Add suggested defines for vec_test_data_class
           Priority|P3                                  |P4
             Status|UNCONFIRMED                         |NEW
           Severity|normal                              |enhancement

--- Comment #9 from Segher Boessenkool  ---
I marked this as enhancement, and changed the summary.  Thanks!

[Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression

2023-04-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #12 from Segher Boessenkool  ---
With the modified compiler?  Does it ICE with an unmodified compiler as well?

[Bug target/109501] vec_test_data_class defines missing

2023-04-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109501

--- Comment #7 from Segher Boessenkool  ---
"For clarity of code, the following named constants are suggested. Preferably,
compilers will provide these constants in a header file, but this is not
required for compliance."

[Bug target/109501] vec_test_data_class defines missing

2023-04-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109501

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #6 from Segher Boessenkool  ---
None of those are required.  All are optional.  No portable code should use
them.

[Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression

2023-04-12 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #9 from Segher Boessenkool  ---
That patch looks fine :-)

[Bug rtl-optimization/109476] Missing optimization for 8bit/8bit multiplication / regression

2023-04-12 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #5 from Segher Boessenkool  ---
Correct, this certainly cannot be done by combine, it sees two independent
pseudos here.  For hard registers it *can* do many tricks, but not for
pseudos like this.

[Bug bootstrap/109460] Build gcc for win32 failed in gcc13 master branch

2023-04-12 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109460

Segher Boessenkool  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #33 from Segher Boessenkool  ---
Fixed (says Costas :-) )

[Bug bootstrap/109460] Build gcc for win32 failed in gcc13 master branch

2023-04-12 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109460

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #32 from Segher Boessenkool  ---
> If you are wondering why this PR wasn't automatically updated by the commit,
> I am wondering the same thing.

You need to note "PR109460" or preferably "PR bootstrap/109460" in the
changelog.

You can paste the commit message here yourself, but let me do that for you :-)


author     Costas Argyris
           Wed, 12 Apr 2023 07:48:18 +0000 (08:48 +0100)
committer  Jonathan Yong <10wa...@gmail.com>
           Wed, 12 Apr 2023 14:35:33 +0000 (14:35 +0000)
commit     3beeebd6934654f3453209730b98c7a1fd0305b6
tree       ee01b276eba9f13284d1880794371e7ba6cca8c1
parent     56529056cb42baa382c40de7d239d02dbf72c94f
mingw: Support building with older gcc versions

The $@ argument to the compiler is causing
only a warning in some gcc versions but an
error in others. In any case, $@ was never
necessary so remove it completely, just like
the rules in x-mingw32 where the object file
gets named after the source file.

This fixes both warnings and errors about
sym-mingw32.o appearing in the command line
unnecessarily.

The -nostdlib flag is required along with -r
for older gcc versions that don't apply it
automatically with -r, resulting in main
functions erroneously entering a partial link.

Signed-off-by: Jonathan Yong <10wa...@gmail.com>
gcc/ChangeLog:

* config/i386/x-mingw32-utf8: Remove extraneous $@

[Bug libffi/109447] test case libffi.closures/cls_align_longdouble_split.c fails

2023-04-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109447

Segher Boessenkool  changed:

   What|Removed |Added

 CC||green at gcc dot gnu.org

--- Comment #5 from Segher Boessenkool  ---
That
()
looks like an obvious fix.  Can we get it applied to GCC please?

(cc:ing the libffi author)

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Segher Boessenkool  changed:

   What|Removed |Added

 Status|WAITING |NEW
   Priority|P3  |P1

--- Comment #6 from Segher Boessenkool  ---
We should not use any VMX insn unless explicitly asked for it, since those
do not work as expected if VSCR[NJ]=1, which unfortunately is the default on
Linux (but not on powerpc64le-linux; that is a separate (kernel) bug).

Rounding mode does not matter too much, if we have some subset of fast-math
anyway (the only rounding mode in VMX is round-to-nearest-ties-to-even, which
is the default for most everything else).

But NJ=1 makes arithmetic behave completely unexpectedly, and it isn't
actually faster than NJ=0 on modern hardware anyway.  We cannot change the
default for setting NJ because some code might rely on it, unfortunately.
Luckily disabling generating all VMX insns automatically (i.e. without it
being explicitly asked for) isn't all that expensive, just ends up as a few
more move instructions here and there.

This isn't a regression, but we should have this in GCC 13.
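
As an illustration of why NJ=1 surprises people: in non-Java mode, VMX
treats denormal (subnormal) inputs as zero and flushes denormal results to
zero.  The sketch below only imitates that effect in portable C; it does not
touch VSCR, and flush_subnormal is a hypothetical helper name, not something
from GCC or from this PR.

```c
#include <float.h>
#include <math.h>

/* Software model of flush-to-zero: a subnormal value becomes a zero of the
   same sign, everything else passes through unchanged. */
float flush_subnormal(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return signbit(x) ? -0.0f : 0.0f;
    return x;
}
```

Under IEEE arithmetic FLT_MIN / 2 is a small nonzero number; after
flush_subnormal it compares equal to zero, so code that divides by it or
tests it against zero takes a different path.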

[Bug bootstrap/101834] make distclean forgets ./c++tools/

2023-03-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101834

--- Comment #9 from Segher Boessenkool  ---
(In reply to Segher Boessenkool from comment #8)
> (In reply to Jonathan Wakely from comment #6)
> > Also, after 'make clean' you can no longer do 'make all'
> 
> Of course you cannot.  Where do you see this?

Erm, scratch that, confusing clean and distclean myself now, heh.

[Bug bootstrap/101834] make distclean forgets ./c++tools/

2023-03-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101834

--- Comment #8 from Segher Boessenkool  ---
(In reply to Jonathan Wakely from comment #6)
> Also, after 'make clean' you can no longer do 'make all'

Of course you cannot.  Where do you see this?

[Bug bootstrap/101834] make distclean forgets ./c++tools/

2023-03-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101834

--- Comment #7 from Segher Boessenkool  ---
Thank you for looking at this!

(In reply to Jonathan Wakely from comment #5)
> c++tools/Makefile.in has:
> 
> mostlyclean::
>   rm -f $(MAPPER.O)
> 
> clean::
>   rm -f g++-mapper-server$(exeext)
> 
> distclean::
>   rm -f config.log config.status config.h
> 
> 
> Should distclean have clean as a prerequisite and clean have mostlyclean as
> a prerequisite?

That is what all other Makefiles do, and that makes sense yes.  Is it that
simple?  I'll test with that now.

> That would still leave config.cache and Makefile and the .d fragments though.

Yup, but those I know how to handle :-)

[Bug bootstrap/101834] make distclean forgets ./c++tools/

2023-03-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101834

Segher Boessenkool  changed:

   What|Removed |Added

   Last reconfirmed||2023-03-30
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #4 from Segher Boessenkool  ---
This still happens, and still is incredibly annoying.  Left over after
distclean are

$ ll c++tools/
total 3620
-rw-rw-r-- 1 segher segher    4781 Mar 29 10:39 Makefile
-rw-rw-r-- 1 segher segher    4972 Mar 29 10:39 config.cache
-rwxrwxr-x 1 segher segher 2578928 Mar 29 10:39 g++-mapper-server
-rw-rw-r-- 1 segher segher    1319 Mar 29 10:39 resolver.d
-rw-rw-r-- 1 segher segher  593464 Mar 29 10:39 resolver.o
-rw-rw-r-- 1 segher segher    2020 Mar 29 10:39 server.d
-rw-rw-r-- 1 segher segher  518552 Mar 29 10:39 server.o

[Bug target/109329] rs6000: New testcases {mul,div}ic3* should run on systems without QP

2023-03-29 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109329

Segher Boessenkool  changed:

   What|Removed |Added

   Priority|P3  |P4
 Target||powerpc*-*-*

[Bug target/109329] New: rs6000: New testcases {mul,div}ic3* should run on systems without QP

2023-03-29 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109329

Bug ID: 109329
   Summary: rs6000: New testcases {mul,div}ic3* should run on
systems without QP
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: segher at gcc dot gnu.org
  Target Milestone: ---

The testcases use ICmode, which exists fine on almost all systems.  But we
get the error
  cc1: error: '-mabi=ieeelongdouble' requires full ISA 2.06 support

We should not use that flag, or make the testcase not fail some other way.

[Bug target/103628] ICE: Segmentation fault (in gfc_conv_tree_to_mpfr)

2023-03-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103628

--- Comment #7 from Segher Boessenkool  ---
(In reply to HaoChen Gui from comment #5)
> The memory representation of IBM long double is not unique. It's actually
> the sum of two 64-bit doubles. 

Yes, and the first of those two DP numbers is required to be the full
number rounded to double precision (with round-to-nearest).

What happened here?  I cannot make much sense of those numbers, but it
seems to contain something with uppercase ASCII overwritten?
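
For readers unfamiliar with the format, the "sum of two doubles" rule can be
sketched in plain C.  This is not GCC's or libgcc's implementation; the
struct double_double and the function dd_from_sum are hypothetical names, and
the splitting step is Knuth's TwoSum, which assumes round-to-nearest and no
-ffast-math.  It shows the invariant mentioned above: the first (high) double
is the full value rounded to double precision, and the second holds what that
rounding dropped.

```c
/* A double-double pair: value = hi + lo, with hi the value rounded to double. */
struct double_double {
    double hi;
    double lo;
};

/* Split the exact sum a + b into a rounded head and an exact error term. */
struct double_double dd_from_sum(double a, double b)
{
    struct double_double r;
    double bv;

    r.hi = a + b;                         /* the full sum, rounded to double */
    bv   = r.hi - a;                      /* the part of b that made it into hi */
    r.lo = (a - (r.hi - bv)) + (b - bv);  /* what the rounding dropped */
    return r;
}
```

For example, dd_from_sum(1.0, 0x1p-60) keeps 1.0 in hi and 0x1p-60 in lo,
a value that a single double cannot represent.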

[Bug target/109067] Powerpc GCC does not support __ibm128 complex multiply/divide if long double is IEEE 128-bit.

2023-03-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109067

--- Comment #1 from Segher Boessenkool  ---
Do you have a testcase please?

[Bug target/109007] building for POWER8 leaks into POWER9 ISA with g++ 11.3 (cross-compiler on x86_64 host)

2023-03-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109007

--- Comment #17 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #16)
> Or just make sure the libraries are still built with -mcpu=power8 even when
> the compiler defaults to something else.

That doesn't scale.

> That said, neither multilibs nor just making sure libraries are built with
> -mcpu=power8 can help with libraries outside of gcc (unless they are built
> as multilibs or with the extra flags).

Sure.  But this bug is about the fact that a compiler built --with-cpu=X cannot
build anything that can run on any older CPU (or, more generally, on any CPU
that lacks some feature that X has).  This is a problem that we already do have
a solution for, but something we have disabled in the powerpc64le-linux
subtarget unfortunately.

> or build gcc and all the needed libraries yourself.

Yup.  But it should not be necessary to build a different GCC.

[Bug target/103784] suboptimal code for returning bool value on target ppc

2023-03-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784

--- Comment #12 from Segher Boessenkool  ---
What David says :-)

We really could use something good for this, it has been a problem for all
GCC targets since forever; it hurts rs6000 more than most though.

Before RA this is a diamond, one side does the 0/1, the other the always 0.
After the join it gets an AND with 1 (not an extend; the effect is similar
of course).  Shrink-wrapping gets rid of the join (duplicates the tail code
to both branches) but does not optimise the result of that, which gives us
the silly li 3,0;clrldi 3,3,63 (and the other side does not need the shift
either, but doesn't look quite as silly :-) ).

It is not unlikely this would work better if we had no QImode thing for the
bool; even SImode might work better already, but DImode would be best (in
the ABI everything is passed in full registers always, so something has to
set the upper bits somewhere).
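
A function of the shape described above might look like the following.  This
is a hypothetical example, not the testcase from the PR, and whether it
produces exactly the li/clrldi sequence depends on the GCC version and flags;
it is only meant to show the "diamond" with a 0/1 side and an always-0 side.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function: one path computes a 0/1 comparison result,
   the other always returns 0 (false). */
bool entry_matches(const int *entry, int key)
{
    if (entry == NULL)
        return false;           /* the "always 0" side */
    return *entry == key;       /* the "0/1" side */
}
```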

[Bug target/109007] building for POWER8 leaks into POWER9 ISA with g++ 11.3 (cross-compiler on x86_64 host)

2023-03-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109007

--- Comment #15 from Segher Boessenkool  ---
(In reply to bugreporter66 from comment #14)
> I should be able to workaround that by emulating all LE targets on POWER9,
> with a comment that building for POWER8 natively on target should work too.

If you want to default to Power9 but still support Power8 (on builds that use
-mcpu=power8), you need to set up appropriate multilibs.  If you want that,
please do a feature request for that?  As a new PR, not hidden inside this
one please :-)

[Bug rtl-optimization/106594] [13 Regression] sign-extensions no longer merged into addressing mode

2023-03-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594

--- Comment #16 from Segher Boessenkool  ---
(In reply to Roger Sayle from comment #14)
> This really is a regression, that points to a previously latent/unnoticed
> bug in combine.
> 
> In GCC 12, combine would take the input RTL and based on target costs
> transform it into the better of implementation A or B.

And it still does exactly that.  And your patch would *not*!  It does not
compare the cost of two alternative pieces of code; it just says "if the cost
of calculating some RTL expression as separate code is 4 or less, do not do
any other optimisation here".  It is completely ad-hoc, not an improvement,
will only cause more problems in the future.

> Now in GCC 13, the
> tree-optimizers are able to perform this same optimization earlier and so
> combine is now given optimal implementation A,

So it is not a regression in combine.

> where a latent bug always
> transforms this to B without ever checking target costs.

Latent.  It is not a regression.

> The consensus is that performing (more) optimizations at the tree-level is a
> good thing,

No.  The situation is different, not so simplistic.

At tree level the representation is higher level.  At RTL level it is more
nitty-gritty, very *low* level, very close to machine code.  It is very
important to do optimisations at that level as well.  It is pretty much
impossible to do a good job of low-level optimisations (like instruction
selection) at a high level, i.e. at great distance.  It is a bad idea to try
to do such things at a high level.  What we can (and should, and *do*) do is
to in higher level optimisations keep code flexible, only have abstractions
that are easy to work with, etc.; and to have earlier passes output code
that the later passes can work with well.

> so reverting changes to the tree optimizers (that now produce
> better code) is a workaround to a glitch where combine is transforming RTL
> into more expensive forms.

But no one is talking about reverting anything?

> There's already a code path in combine that checks/compares costs, it just
> isn't being reached any more.

Now you completely lost me.

> p.s. this has nothing to do with sign_extract/zero_extract, for which
> hardware support would be hypothetical, this patch only affects
> sign_extend/zero_extend, such as aarch64's sxtw or x86_64's movsx or
> powerpc's extsw.  If a target has this functionality, it's unlikely that a
> sequence of shifts or bit-wise operations would be better.

*_extract is just an example (see various open PRs) of the problems with
compound_operation stuff.  I would just rip out all that stuff, similar to
your patch but a bit more drastic, except that causes regressions on other
targets.

Which I have tested and analysed, btw.  As I said in the patch review, this
really needs to be looked at on more targets.  And it should be done in
stage 1, not in stage 4.

(I am running such a test with your patch on all Linux targets, fwiw).

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-03-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009

--- Comment #4 from Segher Boessenkool  ---
Alternatively (or in addition), you can look how to make the shrink-wrap pass
transform the code with some simple added move instructions, maybe even
involving an extra pseudo, so that it can shrink-wrap more.  A very simple
(and because of that, not very effective) version of that is done in
prepare_shrink_wrapping already.

The problem with this is that such transformations are not free: the extra
insns can often be optimised away (just by register allocation), and even if
not, if it causes more / better shrink-wrapping it is a win anyway.  But it
has to be done only if it improves shrink-wrapping (or it likely improves it),
or it isn't a net win.

[Bug rtl-optimization/106594] [13 Regression] sign-extensions no longer merged into addressing mode

2023-03-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594

--- Comment #13 from Segher Boessenkool  ---
Hi!

Either this should not be P1, or the proposed patch is taking completely the
wrong direction.  P1 means there is a regression.  There is no regression in
combine, in fact the proposed patch would *cause* regressions on many targets!

I certainly welcome making the compound_operation stuff behave better, but the
key point there is *better*, random changes that have not been tested on most
archs (and will likely regress on many) are not okay.  This is stage 1 material
no matter what.

Maybe we need some new target macros saying to not use two shifts, and/or
zero_extract, or sign_extract, etc.  No machine newer than VAX (or was it PDP?)
has real hardware support for that, we are much better off expressing things
with machine instructions that *do* exist only.
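
For reference, the alternative spellings of a zero_extract mentioned above
can be written out in C.  This is only a sketch of the arithmetic, not of
what combine does internally; the function names are hypothetical.  Both
functions return the field of len bits starting at bit pos (bit 0 is the
least significant), and callers must ensure len >= 1 and pos + len <= 32.

```c
#include <stdint.h>

/* Form 1: shift right, then mask (a shift plus an AND with a constant). */
uint32_t extract_shift_mask(uint32_t x, unsigned pos, unsigned len)
{
    uint32_t mask = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    return (x >> pos) & mask;
}

/* Form 2: two shifts, left to drop the high bits, right to drop the low. */
uint32_t extract_two_shifts(uint32_t x, unsigned pos, unsigned len)
{
    return (x << (32 - pos - len)) >> (32 - len);
}
```

Which of the two forms is cheaper is a target question, which is the reason
a cost-based or target-controlled choice comes up in this discussion.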

[Bug target/109007] building for POWER8 leaks into POWER9 ISA with g++ 11.3 (cross-compiler on x86_64 host)

2023-03-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109007

--- Comment #13 from Segher Boessenkool  ---
(In reply to bugreporter66 from comment #10)
> Yes, it seems so. They've switched to POWER9 by default in Ubuntu 22.04, so
> it means that gcc itself (along with standard libraries) was compiled for
> POWER9 as well. It used to be POWER8 in the previous releases.

Power9 *only*, actually.

[Bug target/108315] -mcpu=power10 changes ABI

2023-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

--- Comment #17 from Segher Boessenkool  ---
What makes you think we need to tell the user to do something?  There is
nothing that needs to be done as far as I can see?  /confused

[Bug target/108315] -mcpu=power10 changes ABI

2023-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

--- Comment #15 from Segher Boessenkool  ---
(In reply to Alexander Monakov from comment #14)
> Are you guys really sure you want to blame the user here,

I apologise if this hasn't been a nice experience for you.

I'm not blaming anyone, least of all the user.  That is not what bugzilla is
for anyway.  The goal here is to work together on improving the compiler.

I marked the bug as RESOLVED INVALID because a) there is nothing left to be
done to resolve this PR, and b) that is because there never was anything to
be done (in GCC!) in the first place.

If this is not correct, please add some info clarifying that, and reopen the
PR?

> considering that
> all linkers, including the BFD linker, initially misinterpreted the ABI the
> same way?

It wasn't implemented correctly there either, yes.  That does not necessarily
mean the ABI was misinterpreted, but sure, that could be.  In either case
that has nothing to do with GCC.

[Bug target/108315] -mcpu=power10 changes ABI

2023-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

--- Comment #13 from Segher Boessenkool  ---
(In reply to Alexander Monakov from comment #10)
> (In reply to Rui Ueyama from comment #9)
> > I'm the maintainer of the mold linker. I didn't implement that POWER10 ABI
> > because I didn't have an access to a POWER10 machine and therefore couldn't
> > verify the correctness of my implementation.
> 
> Thanks for the info. It might be better to explicitly diagnose the
> '.localentry 1' case and error out instead of producing an executable that
> will continue to run with wrong r2 after the mislinked call returns.

Yes, exactly.  Silently giving any handling to reserved values will never end
well; just a warning would help a lot already (making it non-silent :-) )

FWIW, calling this "POWER10 ABI" is very inexact; it is just part of the ELFv2
ABI (a reserved value in older ABI versions, and with a specific meaning in
newer versions), not only for Power10.  Of course it was created to make code
that uses pcrel addressing faster, but :-)

[Bug rtl-optimization/109009] Shrink Wrap missed opportunity

2023-03-03 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109009

--- Comment #1 from Segher Boessenkool  ---
This is very target-specific.  Could you please attach a test case (with any
significant compiler flags as well, and specific target mentioned, etc.) that
shows the problem?

[Bug target/106770] powerpc64le: Unnecessary xxpermdi before mfvsrd

2023-03-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770

--- Comment #11 from Segher Boessenkool  ---
(In reply to Jens Seifert from comment #6)
> The left part of VSX registers overlaps with floating point registers, that
> is why no register xxpermdi is required and mfvsrd can access all (left)
> parts of VSX registers directly.

The mfvsrd instruction was invented before ELFv2 (at the same time as mfvsrwz).
Everything in common use was big-endian then.  The insns to move GPR->VSR that
initially existed were mtvsrd and mtvsrw[az], all of which write to dword 0 of
the target VSR.

Dword 0 of vector regs is where 64-bit entities in vector regs are stored in
the ABIs, sure, and that corresponds to the FPRs in the ISA.  mtvsrdd and
mtvsrws were added in ISA 3.0 (p9), together with mfvsrld, to make
little-endian work better with little-endian ELFv2.

> The xxpermdi x,y,y,3 indicates to me that gcc prefers right part of register
> which might also cause the xxpermdi at the beginning.

And with -mbig you get ,2 here.  It is accidental.

> At the end the mystery
> is why gcc adds 3 xxpermdi to the code.

As I said, this is constructed during expand, to make correct code.  That is all
that expand should do: make correct (and well-optimisable, "open structured",
easy to transform) code.  We should be able to optimise this to something
better in later passes that *are* supposed to make faster code.  Like the p8
swaps pass, which mostly zaps unnecessary pairs of swaps, or the swiss army
bazooka combine, or even many earlier passes if such an xxpermdi insn is truly
superfluous.  It usually is not, we are dealing with the full 128-bit VSRs
there, there is no way of saying we do not care about part of the register
contents.  Making infra for that is big work.

We can make things easier by expressing things as 64 bit earlier.  We can (and
should) also investigate why the mfvsrd is not combined (as in, what the
instruction combiner pass does) with the xxpermdi.  There are many things not
quite perfect here.

[Bug target/106770] PPCLE: Unnecessary xxpermdi before mfvsrd

2023-02-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Target|powerpc |powerpc64le
   Last reconfirmed||2023-02-28
 Ever confirmed|0   |1

--- Comment #7 from Segher Boessenkool  ---
The mystery is not where the permutations came from: they were added during
expand to make correct code, just like many unnecessary register moves are
added at that time.  This is normal, and even good in many ways.

The question is why they weren't optimised better.  This is either due to
some bug, or this is an enhancement request.

[Bug target/108315] -mcpu=power10 changes ABI

2023-02-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

--- Comment #8 from Segher Boessenkool  ---
To expand a bit: st_other with value 1 was reserved before, and now it
isn't anymore.  Any tool that silently ignores the "special case" of
reserved values will not work correctly (it might sometimes do what is
wanted though).  This is true everywhere always.
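
A sketch of what "not silently ignoring" could look like in a tool that reads
symbol tables.  The two STO_PPC64_* macros follow the definitions in
binutils' include/elf/ppc64.h as far as I know; treat them, and the meaning
attached to each field value, as assumptions to be checked against the ELFv2
ABI document.  The enum, struct and function names are hypothetical.

```c
/* Local-entry field: the three most significant bits of st_other. */
#define STO_PPC64_LOCAL_BIT   5
#define STO_PPC64_LOCAL_MASK  (7 << STO_PPC64_LOCAL_BIT)

/* Hypothetical classification used by this sketch only. */
enum local_entry_kind {
    LE_SAME_ENTRY,        /* field 0: local and global entry coincide */
    LE_SAME_ENTRY_NO_R2,  /* field 1: formerly reserved, now has a meaning */
    LE_OFFSET,            /* field 2..6: local entry is `offset` bytes in */
    LE_RESERVED           /* field 7: still reserved, must be diagnosed */
};

struct local_entry {
    enum local_entry_kind kind;
    unsigned offset;      /* only meaningful for LE_OFFSET */
};

struct local_entry classify_st_other(unsigned char st_other)
{
    unsigned field = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT;
    struct local_entry r = { LE_RESERVED, 0 };

    if (field == 0)
        r.kind = LE_SAME_ENTRY;
    else if (field == 1)
        r.kind = LE_SAME_ENTRY_NO_R2;   /* must not be handled like field 0 */
    else if (field <= 6) {
        r.kind = LE_OFFSET;
        r.offset = ((1u << field) >> 2) << 2;   /* 4, 8, 16, 32, 64 */
    }
    return r;                           /* field 7 stays LE_RESERVED */
}
```

A tool written before field value 1 had a meaning would want to report both
1 and 7 as errors, or at least warn, rather than fall back to the field 0
behaviour.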

[Bug target/108315] -mcpu=power10 changes ABI

2023-02-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

Segher Boessenkool  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #7 from Segher Boessenkool  ---
This is not a special case at all.

[Bug target/106770] PPCLE: Unnecessary xxpermdi before mfvsrd

2023-02-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770

--- Comment #5 from Segher Boessenkool  ---
(In reply to Jens Seifert from comment #4)
> PPCLE with no special option means -mcpu=power8 -maltivec  (altivecle to be
> more precise).

What?  No.

$ sh config.sub ppcle  
powerpcle-unknown-none

This is typically the old 32-bit PowerPC ELF format.  powerpcle-elf (which
non-canonically can be called ppcle-elf) for example, or ppcle-linux, but not
ppcle-aix and the like (that one doesn't even exist; at least one COFF
format has existed in the past though).

This may not matter for you, but it is awfully confusing for others.


powerpc64le-linux (and I believe all other existing ELFv2 ports) require
a p8 or later CPU, sure; but it is perfectly valid to have no AltiVec even
then, or for a port to default to some other CPU.  Currently we have no such
thing, and all default defaults are like you say, but that might change.

> vec_promote(, 1) should be a noop on ppcle.

It never is, not on powerpc64le either.  It always duplicates the selected
elt to all lanes.

> But value gets
> splatted to both left and right part of vector register. => 2 unnecessary
> xxpermdi

So why are those not optimised away?  *That* is the question!

> The rest of the operations are done on left and right part.
> 
> vec_extract(, 1) should be noop on ppcle. But value gets
> taken from right part of register which requires a xxpermdi
> 
> Overall 3 unnecessary xxpermdi. Don't know why the right part of register
> gets "preferred".

I don't know what you mean there?  The ABIs say where parameters and return
values are stored, but you mean something else?

[Bug target/106770] PPCLE: Unnecessary xxpermdi before mfvsrd

2023-02-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106770

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #3 from Segher Boessenkool  ---
You get xxpermdi ,3 only with -mlittle.  Maybe you mean this is for target
powerpc64le-linux?  That is very different from powerpcle-linux fwiw, don't
confuse those :-)

What -mcpu= did you use?

xxpermdi doesn't extract any element of course, it is mfvsr[l]d that does
that.  But duplicating the element you extract isn't needed, sure.

[Bug target/108862] [13 Regression] CryptX miscompilation on power9 since r13-2107

2023-02-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108862

--- Comment #3 from Segher Boessenkool  ---
With -fno-unroll-loops added we get

foo:
.LFB0:
        .cfi_startproc
        cmpwi 0,3,0
        ble 0,.L4
        mtctr 3
        addi 10,4,-8
        addi 5,5,8
        li 3,0
        .p2align 4,,15
.L3:
        ldu 4,8(10)
        ldu 9,-8(5)
        maddld 3,4,9,3
        maddhdu 4,4,9,3
        bdnz .L3
        blr
        .p2align 4,,15
.L4:
        li 3,0
        li 4,0
        blr

(which still fails, just is easier to read).

The destinations of the four inner loop insns were 131, 132, 135, 136,
and IRA decided for those
Disposition:
   18:r131 l0 4
   17:r132 l0 9
    1:r135 l0 3
    0:r136 l0 4

and that cannot fly.

[Bug rtl-optimization/107949] PPC: Unnecessary rlwinm after lbzx

2023-02-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107949

--- Comment #6 from Segher Boessenkool  ---
We generate loads into QImode regs, so we need to explicitly convert it to
whatever larger mode is wanted later.  We also have define_insns to do a
zero-extended load directly into a bigger pseudo, but that isn't used
apparently.

This is one instance of a much more generic problem; on rs6000 this is
usually observed as SImode being extended to DImode more often than
needed / wanted.

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2023-02-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688

--- Comment #33 from Segher Boessenkool  ---
Yes, exactly.  It was the X server I think?  I try to forget such horrors :-)

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2023-02-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688

--- Comment #31 from Segher Boessenkool  ---
Yes, there was a user who incorrectly used memcpy on non-memory memory.

This is not valid, and never has been.

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2023-02-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688

--- Comment #29 from Segher Boessenkool  ---
(In reply to Florian Weimer from comment #28)
> Maybe this belongs in the ABI manual? For example, the POWER ABI says that
> memcpy needs to work on device memory.

Huh?!

Where do you see this?  The way you state it it is trivially impossible to
implement, so if we really say that it needs fixing asap.

[Bug target/108787] [13 Regression] libsodium miscompilation on power9 starting with r13-2107

2023-02-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108787

--- Comment #9 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #7)
> Created attachment 54460 [details]
> gcc13-pr108787.patch

That patch is preapproved, but please add a comment (before umaddditi4)
saying we do not want maddditi4 as well (and why; just something short,
maybe reference this PR :-) )  Thanks!

> Patch that kills maddditi4 in addition to fixing umaddditi4.  As mentioned
> above, in the umaddditi4 case if we later on e.g. during combine find out
> that the high half of the last operand is zero, it will be nicely optimized
> to the optimal sequence.  Unfortunately, with maddditi4 it is really hard to
> find out at expansion time if the last operand is sign extended from DImode
> or narrower, there is no SSA_NAME on the pseudo to check say for value
> ranges, and looking at earlier already emitted instructions checking for one
> subreg of it set to something and the other to copies of its sign bit would
> be a total mess.
> And at combine time I'm afraid we'd need 5 instruction combination.
> So if we want to be able to optimize qux above, I'm afraid we'd need to add
> a new optab.

It is easy to optimise if operands[3] is a non-negative 64-bit thing.  I
expect combine can do it in that case :-)

[Bug target/108787] [13 Regression] libsodium miscompilation on power9 starting with r13-2107

2023-02-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108787

--- Comment #8 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #6)
> we used to emit in GCC 12 4/4/4/5 instructions:
> mulld 9,3,4
> mulhdu 4,3,4
> addc 3,9,5
> adde 4,4,6
> and
> mulld 9,3,4
> mulhd 4,3,4
> addc 3,9,5
> adde 4,4,6
> and
> mulld 9,3,4
> mulhdu 4,3,4
> addc 3,9,5
> addze 4,4
> and
> mulld 9,3,4
> mulhd 4,3,4
> sradi 10,5,63
> addc 3,9,5
> adde 4,4,10

And it was 2/2/2/2 insns deep.

> Now, with the patch we get 3/5/3/6 instructions:
> maddhdu 9,3,4,5
> maddld 3,3,4,5
> add 4,9,6
> and
> maddhd 9,3,4,5
> srdi 10,5,63
> maddld 3,3,4,5
> add 9,9,10
> add 4,9,6
> and
> mr 9,3
> maddld 3,3,4,5
> maddhdu 4,9,4,5
> and
> maddhd 9,3,4,5
> srdi 8,5,63
> sradi 10,5,63
> maddld 3,3,4,5
> add 9,9,8
> add 4,9,10

And this is 2/3/2/3 so the signed are worse than what we had.

> So, unless we can somehow check for the sign extended operands[3], we
> shouldn't define maddditi3 or FAIL in it or expand it to equivalent of what
> we used to emit before.

I wouldn't define the signed version at all, we have no good way of
generating it.  We still can generate maddhd (and maddld of course), but
we shouldn't do this for maddditi4, just if e.g. combine comes up with the
correct RTL for it (there is no machine insn for it, it would require four
registers in, that is very expensive to do).

[Bug target/108787] [13 Regression] libsodium miscompilation on power9 starting with r13-2107

2023-02-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108787

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-02-14
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #5 from Segher Boessenkool  ---
The maddhd insn does a sign-extend of the addend as well, so simply adding
the high part of it is not enough.

I don't see how to solve this with any machine code using the new madd* insns
that is at least as good code as the mulld;mulhd;addc;adde we would otherwise
generate.

We should still have machine patterns for the insn we have (it can be used
if operands[3] here is only one machine word for example), but we shouldn't
have a define_expand for maddditi4?  (For umaddditi4 we can of course, and
that is even useful if it results in better generated code).
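
The sign-extension problem can be illustrated with a small C model.  This is
only a sketch: it assumes maddhd returns the high doubleword of a*b plus the
sign-extended 64-bit addend, and it relies on the GCC __int128 extension and
on GCC's modulo behaviour when converting uint64_t to int64_t.  All function
names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Assumed model of maddhd: high doubleword of a*b + c, where the 64-bit
   addend c is sign-extended to 128 bits before the addition.  */
uint64_t model_maddhd(int64_t a, int64_t b, int64_t c)
{
    __int128 sum = (__int128)a * b + c;
    return (uint64_t)((u128)sum >> 64);
}

/* Reference: high doubleword of a*b + (c_hi:c_lo), computed in 128 bits
   with wrapping unsigned arithmetic.  */
uint64_t reference_hi(int64_t a, int64_t b, uint64_t c_hi, uint64_t c_lo)
{
    u128 prod = (u128)((__int128)a * b);
    u128 addend = ((u128)c_hi << 64) | c_lo;
    return (uint64_t)((prod + addend) >> 64);
}

/* Naive expansion: maddhd on the low half, then add the high half.  */
uint64_t naive_hi(int64_t a, int64_t b, uint64_t c_hi, uint64_t c_lo)
{
    return model_maddhd(a, b, (int64_t)c_lo) + c_hi;
}

/* Compensated expansion: add back the top bit of c_lo to undo the
   unwanted sign extension (this costs an extra shift and add).  */
uint64_t compensated_hi(int64_t a, int64_t b, uint64_t c_hi, uint64_t c_lo)
{
    return model_maddhd(a, b, (int64_t)c_lo) + (c_lo >> 63) + c_hi;
}

/* Self-check: an addend whose low half has its top bit set.  */
void check_maddhd_model(void)
{
    uint64_t top = UINT64_C(1) << 63;
    assert(reference_hi(1, 1, 0, top) == 0);
    assert(naive_hi(1, 1, 0, top) != 0);        /* wrong high half */
    assert(compensated_hi(1, 1, 0, top) == 0);  /* matches reference */
}
```

The compensated form needs an extra shift and an extra add, which is why it
is hard to beat the plain mulld;mulhd;addc;adde sequence for the signed case.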

[Bug target/108699] gcc.c-torture/execute/builtin-bitops-1.c fails on power 9 BE

2023-02-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108699

--- Comment #2 from Segher Boessenkool  ---
Right, it needs a vpopcntb or similar first.

[Bug tree-optimization/108757] We do not simplify (a - (N*M)) / N + M -> a / N

2023-02-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108757

--- Comment #8 from Segher Boessenkool  ---
No, addition and subtraction are well defined for all inputs, for unsigned
integers.

[Bug tree-optimization/108757] We do not simplify (a - (N*M)) / N + M -> a / N

2023-02-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108757

--- Comment #6 from Segher Boessenkool  ---
No?  Take a=59 as counterexample:

(a - (N*M)) / N + M = (59 - 2*30)/30 + 2 = ~0UL/30 + 2

but

a / N = 59/30 = 1

Integer division in C is division towards zero; almost no normal algebraic
simplifications apply there.
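
The counterexample above can be checked mechanically.  A minimal sketch,
assuming unsigned long operands with N = 30 and M = 2 as in the arithmetic
above; the function names are made up for the illustration.

```c
#include <assert.h>
#include <limits.h>

/* Left-hand side of the proposed simplification, in unsigned arithmetic.
   The subtraction wraps around when a < N*M.  */
unsigned long lhs(unsigned long a, unsigned long N, unsigned long M)
{
    return (a - N * M) / N + M;
}

/* Right-hand side: the supposedly equivalent simplified form.  */
unsigned long rhs(unsigned long a, unsigned long N)
{
    return a / N;
}

/* Self-check with a = 59, N = 30, M = 2.  */
void check_counterexample(void)
{
    assert(lhs(59, 30, 2) == ULONG_MAX / 30 + 2);  /* ~0UL/30 + 2 */
    assert(rhs(59, 30) == 1);
    assert(lhs(59, 30, 2) != rhs(59, 30));
}
```

For a = 60 both sides agree (both give 2); the two forms only diverge once the
subtraction wraps.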

[Bug tree-optimization/108757] We do not simplify (a - (N*M)) / N + M -> a / N

2023-02-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108757

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #4 from Segher Boessenkool  ---
If N is a power of two optimising this to a/N is valid, but for other values
of N it is not (division is not the inverse of multiplication in C).  It also
only works for unsigned of course.

[Bug middle-end/17308] nonnull attribute not as useful as it could be

2023-02-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17308

--- Comment #24 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #23)
> I also suspect many of these new warnings we are doing in recent years
> really should not be part of -Wall because of how many false positives we
> have. GCC has been getting attention recently because of the false positive
> warnings too.

Current documentation says

'-Wall'
 This enables all the warnings about constructions that some users
 consider questionable, and that are easy to avoid (or modify to
 prevent the warning), even in conjunction with macros.

and

'-Wextra'
 This enables some extra warning flags that are not enabled by
 '-Wall'.

We don't document at all what options should be enabled by -Wall, what
options should be enabled by -W, and which should be done by neither.  The
current documentation for -Wall is very noncommittal.

It all is a tradeoff of course.  IMO our documentation should make that clear
as well.

In my view, -Wall should enable all warnings that have few false positives
(less than 5% or 10%, say), and when they do this is easy to avoid, or perhaps
the warning points out very important (security) problems.

-W is the same but with higher tolerance for false positives.  And the
warnings that are not so useful, or are hard to avoid, and have a high false
positive rate, should not be enabled by either.

(Oh, and please note that -Werror is not part of these considerations at all.
When -Werror makes things "break" this is just a learning opportunity for
whoever asked for it.  Maybe we should document -Werror as an alias of
-Wself-flagellation).

[Bug middle-end/108623] We need to grow the precision field in tree_type_common for PowerPC

2023-02-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623

--- Comment #5 from Segher Boessenkool  ---
The failure itself was not detected; only things further down the road broke.
Can we add something for that?  Just a strategically placed assert should do
fine.

Less important if we grow the field all the way to 16 bits, but :-)

[Bug middle-end/84514] powerpc sub optimal condition register reuse with extended inline asm

2023-01-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84514

--- Comment #2 from Segher Boessenkool  ---
Still happens with current trunk.

[Bug debug/106746] [13 Regression] '-fcompare-debug' failure (length) with -O2 -fsched2-use-superblocks since r13-2041-g6624ad73064de241

2023-01-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106746

--- Comment #24 from Segher Boessenkool  ---
So this PR can be marked resolved now?

[Bug target/108491] cross compiler does not work: cc1: error: ‘-msecure-plt’ not supported by your assembler

2023-01-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108491

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |INVALID

--- Comment #7 from Segher Boessenkool  ---
It is not a bug even.

However, our documentation could be clearer that you need a working assembler
and linker etc., and the hint that putting binutils in the same prefix (before
doing GCC) makes everything automatically work could be helpful?

[Bug analyzer/108432] Analyzer fails to detect out-of-bounds issues within loops

2023-01-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108432

--- Comment #3 from Segher Boessenkool  ---
(In reply to David Malcolm from comment #2)
> Unfortunately, some analyzer warnings work better with optimization
> *disabled*.  -fanalyzer runs much later than most other static analyzers.

Understood.  But some work better with it enabled, right?

> For example, -Wanalyzer-deref-before-check doesn't work well with
> optimization, as the dereference means that the optimizer can remove the
> checks before the analyzer "sees" them.

Yes.

> I think there's a natural tension between optimization and detecting
> undefined behavior, in that -fanalyzer wants to report on possible undefined
> behavior, whereas optimization wants to take advantage of undefined behavior.

"Take advantage of"...  A program that contains UB is erroneous, has no
defined semantics *at all*, so what the compiler is really doing is assuming
the program is a correct program, and generating more optimal target code
based on that not unreasonable assumption.

This sounds a bit better, right?  It still is true that the compiler cannot
detect all UB during compilation (it needs to know the program's input as
well for that, and even then it isn't realistic).  Is it even possible to
detect *all* UB at runtime?
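
The dereference-before-check case quoted above is a small concrete instance
of this.  A minimal sketch with made-up function names; whether a particular
GCC version actually removes the check depends on the options in effect
(-fdelete-null-pointer-checks is, as far as I know, the relevant one).

```c
#include <assert.h>
#include <stddef.h>

/* Dereference first, check later.  After the load the compiler may assume
   p is non-null (a null p would already be UB), so the check below can be
   deleted as dead code before an analysis pass ever sees it.  */
int read_then_check(const int *p)
{
    int v = *p;
    if (p == NULL)
        return -1;
    return v;
}

/* Check first, dereference later: well defined for every input.  */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}

/* Self-check: for correct programs (valid pointers) both behave the same.  */
void check_valid_pointer(void)
{
    int x = 7;
    assert(read_then_check(&x) == 7);
    assert(check_then_read(&x) == 7);
    assert(check_then_read(NULL) == -1);
}
```

Only check_then_read may be called with a null pointer; calling
read_then_check(NULL) is exactly the erroneous program discussed above.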

[Bug target/108491] cross compiler does not work: cc1: error: ‘-msecure-plt’ not supported by your assembler

2023-01-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108491

--- Comment #1 from Segher Boessenkool  ---
This error is from sysv4.h SUBTARGET_OVERRIDE_OPTIONS.  -msecure-plt is
unconditionally required.

It looks like an oversight that it is not required in the assembler you
used (which is that?)

[Bug analyzer/108432] Analyzer fails to detect out-of-bounds issues within loops

2023-01-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108432

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #1 from Segher Boessenkool  ---
Many warning messages are also dependent on optimisation level.  And the
actual generated code is as well ;-)

-O0 means do the least possible work to generate correct code.  There is
friction between that and having -fanalyzer do deep inspection of the code.
I think we should document -fanalyzer needs some optimisation enabled (does
it need -O2 in some cases, or just -O1 always, btw?)

The suggestion to at least check the last loop iteration is good of course.

[Bug c/108483] gcc warns about suspicious constructs for unevaluted ?: operand

2023-01-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483

--- Comment #4 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #1)
> I doubt this will be changed anytime soon, see PR 4210 for the history on
> why.

That PR is about a UB case though.  In this case the code is perfectly well
defined (just IB).

The warning code here sees a sizeof divided by a sizeof and wants to warn for
it as being maybe wrong, although it would be pretty easy to see it isn't.

[Bug c/108483] gcc warns about suspicious constructs for unevaluted ?: operand

2023-01-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483

--- Comment #2 from Segher Boessenkool  ---
The testcase needs a NULL defined as  (void *)0  .

[Bug debug/106746] [13 Regression] '-fcompare-debug' failure (length) with -O2 -fsched2-use-superblocks since r13-2041-g6624ad73064de241

2023-01-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106746

--- Comment #21 from Segher Boessenkool  ---
As far as we (me, you; everybody) can tell it is fixed now.  If one day we get
a testcase showing that it has in fact not been fixed and the problem is still
there, we can reopen, or link the testcases, etc.?

[Bug target/108240] [13 Regression] Error message missing since r13-4894-gacc727cf02a144 (then make concealed ICE exposed)

2023-01-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108240

--- Comment #12 from Segher Boessenkool  ---
We really really REALLY should neuter -mmodulo.  It is counter-productive
to have command-line flags for separate instructions at all (as opposed to
facilities), and it is downright destructive to have sneaky ways to enable
most (but not all!) of what -mcpu= does via other options.
