On Thu, Nov 9, 2023 at 6:43 AM HAO CHEN GUI <guih...@linux.ibm.com> wrote:
>
> Hi,
>   This patch modifies expand_builtin_return and makes it call
> expand_misaligned_mem_ref to load unaligned memory.  The memory reference
> pointed to by the void* pointer might be unaligned, so expanding it with
> unaligned move optabs is safe.
>
>   The new test case illustrates the problem.  rs6000 doesn't have an
> unaligned vector load instruction with VSX disabled.  When calling
> builtin_return, it shouldn't load the memory into a vector register with
> an unaligned load instruction directly.  It should store it to an
> on-stack variable via extract_bit_field, then load it into the return
> register from the stack with an aligned load instruction.
>
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions.  Is this OK for trunk?
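[As background for the alignment hazard described above: at the source level, the analogous problem is reading a wide value through a pointer that may be under-aligned for its mode. The byte-wise copy below is roughly what expanding via unaligned move optabs achieves at the RTL level. A minimal illustration, not part of the patch:]

```c
#include <string.h>
#include <stdint.h>

/* Read a 64-bit value from a possibly unaligned address.  A direct
   cast-and-dereference would be undefined behavior on strict-alignment
   targets; the memcpy form lets the compiler emit an unaligned load
   where the target supports one, or a byte-wise sequence otherwise.  */
uint64_t
read_u64_unaligned (const void *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```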
I'm not sure the testcase is valid, though?

@defbuiltin{{void} __builtin_return (void *@var{result})}
This built-in function returns the value described by @var{result} from
the containing function.  You should specify, for @var{result}, a value
returned by @code{__builtin_apply}.
@enddefbuiltin

I don't see __builtin_apply being used here?

Richard.

> Thanks
> Gui Haochen
>
> ChangeLog
> expand: Call misaligned memory reference in expand_builtin_return
>
> expand_builtin_return loads memory into return registers.  The memory
> might be unaligned relative to the mode of the registers, so it should
> be expanded with unaligned move optabs if the memory reference is
> unaligned.
>
> gcc/
>         PR target/112417
>         * builtins.cc (expand_builtin_return): Call
>         expand_misaligned_mem_ref for loading unaligned memory reference.
>         * builtins.h (expand_misaligned_mem_ref): Declare.
>         * expr.cc (expand_misaligned_mem_ref): No longer static.
>
> gcc/testsuite/
>         PR target/112417
>         * gcc.target/powerpc/pr112417.c: New.
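[For reference, the documented pattern pairs __builtin_return with a value produced by __builtin_apply, roughly as in the sketch below. The 128-byte argument-block size is just a conservative guess for this example, not a value from the manual:]

```c
/* Target function whose call is forwarded.  */
static int
add (int a, int b)
{
  return a + b;
}

/* Forward this function's arguments to add() and return whatever
   add() returns, using the documented apply/return pair.  */
int
wrapper (int a, int b)
{
  /* Capture this function's incoming arguments.  */
  void *args = __builtin_apply_args ();
  /* Call add() with those arguments; 128 bytes is an assumed upper
     bound on the size of the pushed argument data.  */
  void *ret = __builtin_apply ((void (*) ()) add, args, 128);
  /* Return the value described by ret from wrapper() itself.  */
  __builtin_return (ret);
}
```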
>
> patch.diff
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index cb90bd03b3e..b879eb88b7c 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1816,7 +1816,12 @@ expand_builtin_return (rtx result)
>        if (size % align != 0)
>          size = CEIL (size, align) * align;
>        reg = gen_rtx_REG (mode, INCOMING_REGNO (regno));
> -      emit_move_insn (reg, adjust_address (result, mode, size));
> +      rtx tmp = adjust_address (result, mode, size);
> +      unsigned int align = MEM_ALIGN (tmp);
> +      if (align < GET_MODE_ALIGNMENT (mode))
> +       tmp = expand_misaligned_mem_ref (tmp, mode, 1, align,
> +                                        NULL, NULL);
> +      emit_move_insn (reg, tmp);
>
>        push_to_sequence (call_fusage);
>        emit_use (reg);
> diff --git a/gcc/builtins.h b/gcc/builtins.h
> index 88a26d70cd5..a3d7954ee6e 100644
> --- a/gcc/builtins.h
> +++ b/gcc/builtins.h
> @@ -157,5 +157,7 @@ extern internal_fn replacement_internal_fn (gcall *);
>
>  extern bool builtin_with_linkage_p (tree);
>  extern int type_to_class (tree);
> +extern rtx expand_misaligned_mem_ref (rtx, machine_mode, int, unsigned int,
> +                                     rtx, rtx *);
>
>  #endif /* GCC_BUILTINS_H */
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index ed4dbb13d89..b0adb35a095 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -9156,7 +9156,7 @@ expand_cond_expr_using_cmove (tree treeop0 ATTRIBUTE_UNUSED,
>     If the result can be stored at TARGET, and ALT_RTL is non-NULL,
>     then *ALT_RTL is set to TARGET (before legitimziation).
>    */
>
> -static rtx
> +rtx
>  expand_misaligned_mem_ref (rtx temp, machine_mode mode, int unsignedp,
>                            unsigned int align, rtx target, rtx *alt_rtl)
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112417.c b/gcc/testsuite/gcc.target/powerpc/pr112417.c
> new file mode 100644
> index 00000000000..ef82fc82033
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112417.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target { has_arch_pwr7 } } } */
> +/* { dg-options "-mno-vsx -maltivec -O2" } */
> +
> +void * foo (void * p)
> +{
> +  if (p)
> +    __builtin_return (p);
> +}
> +
> +/* Ensure that unaligned load is generated via stack load/store.  */
> +/* { dg-final { scan-assembler {\mstw\M} { target { ! has_arch_ppc64 } } } } */
> +/* { dg-final { scan-assembler {\mstd\M} { target has_arch_ppc64 } } } */