Keith Packard <[email protected]> writes:
>> insn_propagation would previously only replace (reg:M H) with X
>> for some hard register H if the uses of H were also in mode M.
>> This patch extends it to handle simple mode punning too.
>
> (this is from GCC commit 9d20529d94b23275885f380d155fe8671ab5353a)
>
> Looks like this breaks m68k floating point conversion handling. Recall
> that m68k has only 80-bit float hardware with regular numbered registers
> (unlike 8087). Operations are done in 80-bit format, the FPU provides
> fmove.d and fmove.s operations to convert between 80-bit and
> 32-bit/64-bit values in memory.
>
> float naf(float, float);
> int prf(char *, ...);
> float sqf(float);
> int func(void);
>
> int func(void)
> {
> float x = naf(4.0f, 5.0f);
> prf("testing sqrtf for value %a\n", (double) x);
> float y = sqf(x);
> prf("sqrtf is %a\n", (double) y);
> return 0;
> }
>
> In this example, the return value from naf is passed as a double to prf
> and also passed as a float to sqf. The value is saved on the stack
> across the prf call.
>
> In the old code, it was restored to %fp0 and then pushed as a float on
> the stack to sqf:
>
> lea prf,%a2
> fmove.x %fp0,24(%sp)
> jsr (%a2)
> fmove.x 24(%sp),%fp0
> fmove.s %fp0,-(%sp)
> jsr sqf
>
> With this patch, the low 32 bits of the value (which is garbage) gets
> pushed on the stack instead:
>
> lea prf,%a2
> fmove.x %fp0,24(%sp)
> jsr (%a2)
> move.l 32(%sp),-(%sp)
> jsr sqf
>
> Here's the broken code:
>
> rtx newval = to;
> if (GET_MODE (x) != GET_MODE (from))
> {
> gcc_assert (REG_P (x) && HARD_REGISTER_P (x));
> if (REG_NREGS (x) != REG_NREGS (from)
> || !REG_CAN_CHANGE_MODE_P (REGNO (x), GET_MODE (from),
> GET_MODE (x)))
> return false;
> newval = simplify_subreg (GET_MODE (x), to, GET_MODE (from),
> subreg_lowpart_offset (GET_MODE (x),
> GET_MODE (from)));
> if (!newval)
> return false;
> }
>
> The trouble is that REG_CAN_CHANGE_MODE_P(%fp0, E_XFmode, E_SFmode)
> returns true, and then simplify_subreg goes and extracts the low 32-bits
> of the 80-bit value directly, skipping the conversion. This is the same
> as i386, so let's adapt the i386 version of can_change_mode_class. See
> attached patch, which seems to fix the bug for me.
>
> From 862467c578545a4a7ed83767df2a181e7d845bc6 Mon Sep 17 00:00:00 2001
> From: Keith Packard <[email protected]>
> Date: Sat, 28 Dec 2024 23:38:31 -0800
> Subject: [PATCH] m68k: TARGET_CAN_CHANGE_MODE_CLASS is false for FP regs
>
> Just like i386, you cannot extract float or double values by using
> the low bits of m68k float registers. Override this hook and check
> that case.
>
> Signed-off-by: Keith Packard <[email protected]>
Thanks, this looks like the right fix to me. I'm not sure off-hand
whether it's worth continuing to allow subregs between (say) SF and CSF.
If so...
> ---
> gcc/config/m68k/m68k.cc | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/gcc/config/m68k/m68k.cc b/gcc/config/m68k/m68k.cc
> index 050ac096e55..54bcfaffa72 100644
> --- a/gcc/config/m68k/m68k.cc
> +++ b/gcc/config/m68k/m68k.cc
> @@ -200,6 +200,7 @@ static void m68k_asm_final_postscan_insn (FILE *,
> rtx_insn *insn, rtx [], int);
> static HARD_REG_SET m68k_zero_call_used_regs (HARD_REG_SET);
> static machine_mode m68k_c_mode_for_floating_type (enum tree_index);
> static bool m68k_use_lra_p (void);
> +static bool m68k_can_change_mode_class (machine_mode from, machine_mode to,
> reg_class_t rclass);
>
> /* Initialize the GCC target structure. */
>
> @@ -370,6 +371,9 @@ static bool m68k_use_lra_p (void);
> #undef TARGET_C_MODE_FOR_FLOATING_TYPE
> #define TARGET_C_MODE_FOR_FLOATING_TYPE m68k_c_mode_for_floating_type
>
> +#undef TARGET_CAN_CHANGE_MODE_CLASS
> +#define TARGET_CAN_CHANGE_MODE_CLASS m68k_can_change_mode_class
> +
> TARGET_GNU_ATTRIBUTES (m68k_attribute_table,
> {
> /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
> @@ -7269,4 +7273,22 @@ m68k_use_lra_p ()
> return m68k_lra_p;
> }
>
> +/* Implement TARGET_CAN_CHANGE_MODE_CLASS. */
> +
> +static bool
> +m68k_can_change_mode_class (machine_mode from,
> + machine_mode to,
> + reg_class_t rclass)
> +{
> + if (from == to)
> + return true;
> +
> + /* 68881 registers can't do subreg at all, as all values are reformatted
> + to extended precision. */
> + if (reg_classes_intersect_p(rclass, FP_REGS))
> + return false;
...that could be handled by adding:
&& GET_MODE_INNER (from) != GET_MODE_INNER (to)
to the return false condition.
Andreas (cc:ed) would know better tham me though.
Given the restriction you describe, I was surprised to see that m68k
defines MODES_TIEABLE_P as:
static bool
m68k_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
return (!TARGET_HARD_FLOAT
|| ((GET_MODE_CLASS (mode1) == MODE_FLOAT
|| GET_MODE_CLASS (mode1) == MODE_COMPLEX_FLOAT)
== (GET_MODE_CLASS (mode2) == MODE_FLOAT
|| GET_MODE_CLASS (mode2) == MODE_COMPLEX_FLOAT)));
}
Wouldn't it be better (for optimisation) to similarly require the
GET_MODE_INNERs to match if either mode1 or mode2 is a scalar or complex
float? I suppose that would rarely make a difference though, and in any
case it shouldn't hold up the patch.
Richard
> +
> + return true;
> +}
> +
> #include "gt-m68k.h"
> --
> 2.45.2
>
>
> Here's how to reproduce the bug:
>
> GCC configure flags:
>
> ../configure \
> --mandir=/usr/share/man \
> --enable-languages=c,c++,lto \
> --enable-multilib \
> --enable-multilib-space \
> --disable-gcov \
> --disable-decimal-float \
> --disable-libffi \
> --disable-libgomp \
> --disable-libmudflap \
> --disable-libquadmath \
> --disable-libssp \
> --disable-libstdcxx-pch \
> --disable-nls \
> --disable-shared \
> --disable-threads \
> --enable-tls \
> --target=m68k-unknown-elf \
> --with-system-zlib \
> --with-gnu-as \
> --with-gnu-ld \
> --without-included-gettext \
> --prefix=/opt/m68k \
> --disable-libstdcxx \
> --with-headers=no \
> --without-newlib \
> AR_FOR_TARGET=m68k-unknown-elf-ar \
> AS_FOR_TARGET=m68k-unknown-elf-as \
> LD_FOR_TARGET=m68k-unknown-elf-ld \
> NM_FOR_TARGET=m68k-unknown-elf-nm \
> OBJDUMP_FOR_TARGET=m68k-unknown-elf-objdump \
> RANLIB_FOR_TARGET=m68k-unknown-elf-ranlib \
> READELF_FOR_TARGET=m68k-unknown-elf-readelf \
> STRIP_FOR_TARGET=m68k-unknown-elf-strip \
> SED=/bin/sed \
> SHELL=/bin/sh \
> BASH=/bin/bash \
> CONFIG_SHELL=/bin/bash
>
> Compile flags used to build the test:
>
> m68k-unknown-elf-gcc -O3 -S test.c