[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:e322dff09d011f65f5cae4e95c3a24ccfae7b1e1

commit r16-984-ge322dff09d011f65f5cae4e95c3a24ccfae7b1e1
Author: Richard Sandiford 
Date:   Fri May 30 09:36:35 2025 +0100

rtl-ssa: Reject non-address uses of autoinc regs [PR120347]

As the rtl.texi documentation of RTX_AUTOINC expressions says:

  If a register used as the operand of these expressions is used in
  another address in an insn, the original value of the register is
  used.  Uses of the register outside of an address are not permitted
  within the same insn as a use in an embedded side effect expression
  because such insns behave differently on different machines and hence
  must be treated as ambiguous and disallowed.

late-combine was failing to follow this rule.  One option would have
been to enforce it during the substitution phase, like combine does.
This could either be a dedicated condition in the substitution code
or, more generally, an extra condition in can_merge_accesses.
(The latter would include extending is_pre_post_modify to uses.)

However, since the restriction applies to patterns rather than to
actions on patterns, the more robust fix seemed to be test and reject
this case in (a subroutine of) rtl_ssa::recog.  We already do something
similar for hard-coded register clobbers.

Using vec_rtx_properties isn't the lightest-weight operation
out there.  I did wonder about relying on the is_pre_post_modify
flag of the definitions in the new_defs array, but that would
require callers that create new autoincs to set the flag before
calling recog.  Normally these flags are instead updated
automatically based on the final pattern.

Besides, recog itself has had to traverse the whole pattern,
and it is even less light-weight than vec_rtx_properties.
At least the pattern should be in cache.

The rtl-ssa fix showed up a mistake (of mine) in the rtl_properties
walker: try_to_add_src would drop all flags except IN_NOTE before
recursing into RTX_AUTOINC addresses.

RTX_AUTOINCs only occur in addresses, and so for them, the flags coming
into try_to_add_src are set by:

  unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS;
  ...
  if (MEM_P (x))
{
  ...

  unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE;
  if (flags & rtx_obj_flags::IS_READ)
addr_flags |= rtx_obj_flags::IN_MEM_LOAD;
  try_to_add_src (XEXP (x, 0), addr_flags);
  return;
}

This means that the only flags that can be set are:

- IN_NOTE (the sole member of STICKY_FLAGS)
- IN_MEM_STORE
- IN_MEM_LOAD

Thus dropping all flags except IN_NOTE had the effect of dropping
IN_MEM_STORE and IN_MEM_LOAD, and nothing else.  But those flags
are the ones that mark something as being part of a mem address.
The exclusion was therefore exactly wrong.

gcc/
PR rtl-optimization/120347
* rtlanal.cc (rtx_properties::try_to_add_src): Don't drop the
IN_MEM_LOAD and IN_MEM_STORE flags for autoinc registers.
* rtl-ssa/changes.cc (recog_level2): Check whether an
RTX_AUTOINCed register also appears outside of an address.

gcc/testsuite/
PR rtl-optimization/120347
* gcc.dg/torture/pr120347.c: New test.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread arnd at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #8 from Arnd Bergmann  ---
My mistake, I accidentally reintroduced the kernel bug from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54983 while debugging another
issue, so the code in comment #4 does not actually appear in the Linux kernel.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #7 from Richard Earnshaw  ---
The address in offsettable_address is always validated as a 'byte' sized
address (it's generic code).  So it's not generally useful to use this on Arm
for validating anything that's more restricted than ldrb/strb.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #6 from Andrew Pinski  ---
(In reply to Arnd Bergmann from comment #4)
> I have reduced another build failure (in some configurations but many files)
> and found that also to be caused by -flate-combine-instructions
> 
> https://godbolt.org/z/7o1KE4jd3
> void c(short *d) { asm("strh %1, %0" : : "Qo"(*(d + 1000)), "r"(0)); }
> 
> arm-linux-gnueabihf-gcc -O2 -march=armv7-a -marm test.c
> /tmp/ccw7KGmj.s:77: Error: bad immediate value for 8-bit offset (2000)
> 
> I've added -fno-late-combine-instructions to the kernel cflags on my test
> box now, which seems to address all of these. Let me know if I should open a
> separate report for that.

That is a different issue overall and I think it is invalid inline-asm:
(define_memory_constraint "Q"
 "@internal
  An address that is a single base register."
 (and (match_code "mem")
  (match_test "REG_P (XEXP (op, 0))")))

'o' definition:
A memory operand is allowed, but only if the address is offsettable. This means
that adding a small integer (actually, the width in bytes of the operand, as
determined by its machine mode) may be added to the address and the result is
also a valid memory address.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #5 from Richard Earnshaw  ---
(In reply to Arnd Bergmann from comment #4)
> I have reduced another build failure (in some configurations but many files)
> and found that also to be caused by -flate-combine-instructions
> 
> https://godbolt.org/z/7o1KE4jd3
> void c(short *d) { asm("strh %1, %0" : : "Qo"(*(d + 1000)), "r"(0)); }
> 
> arm-linux-gnueabihf-gcc -O2 -march=armv7-a -marm test.c
> /tmp/ccw7KGmj.s:77: Error: bad immediate value for 8-bit offset (2000)
> 
> I've added -fno-late-combine-instructions to the kernel cflags on my test
> box now, which seems to address all of these. Let me know if I should open a
> separate report for that.

The compiler has no idea what the valid range of supported offsets are in an
asm statement.  IIRC It just assumes the basic addressing modes (ldr/str).

But why aren't you using "m" here?  That does the right thing.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread arnd at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

--- Comment #4 from Arnd Bergmann  ---
I have reduced another build failure (in some configurations but many files)
and found that also to be caused by -flate-combine-instructions

https://godbolt.org/z/7o1KE4jd3
void c(short *d) { asm("strh %1, %0" : : "Qo"(*(d + 1000)), "r"(0)); }

arm-linux-gnueabihf-gcc -O2 -march=armv7-a -marm test.c
/tmp/ccw7KGmj.s:77: Error: bad immediate value for 8-bit offset (2000)

I've added -fno-late-combine-instructions to the kernel cflags on my test box
now, which seems to address all of these. Let me know if I should open a
separate report for that.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

Richard Sandiford  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot 
gnu.org
 CC||rsandifo at gcc dot gnu.org

--- Comment #3 from Richard Sandiford  ---
Mine then.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

Sam James  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||sjames at gcc dot gnu.org
   Last reconfirmed||2025-05-19

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

Sam James  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=120351

--- Comment #2 from Sam James  ---
-fno-late-combine-instructions works, just like in PR120351

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

Richard Earnshaw  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=20972
 CC|richard.earnshaw at arm dot com|rearnsha at gcc dot 
gnu.org

--- Comment #1 from Richard Earnshaw  ---
This is somewhat of a heisenbug.  The instruction is unpredictable because the
increment may take place before the value to be stored is read.

Unfortunately, there's currently no way in the machine description (that I'm
aware of) to describe this restriction - I don't think you can early-clobber a
mem.

[Bug target/120347] [15/16 regression] invalid arm32/thumb assembly output

2025-05-19 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120347

Sam James  changed:

   What|Removed |Added

Summary|invalid arm32/thumb |[15/16 regression] invalid
   |assembly output |arm32/thumb assembly output
   Keywords||assemble-failure,
   ||wrong-code
   Target Milestone|--- |15.2