> -----Original Message-----
> From: Bin.Cheng [mailto:amker.ch...@gmail.com]
> Sent: Wednesday, July 02, 2014 1:46 PM
> To: Ramana Radhakrishnan
> Cc: Bin Cheng; Richard Earnshaw; gcc-patches
> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> output_move_neon explicitly
>
> On Wed, Jul 2, 2014 at 7:48 AM, Ramana Radhakrishnan
> <ramana....@googlemail.com> wrote:
> > On Mon, May 5, 2014 at 8:21 AM, bin.cheng <bin.ch...@arm.com> wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Richard Earnshaw
> >>> Sent: Thursday, May 01, 2014 10:03 PM
> >>> To: Bin Cheng
> >>> Cc: gcc-patches@gcc.gnu.org
> >>> Subject: Re: [PATCH ARM]Handle REG addressing mode in
> >>> output_move_neon explicitly
> >>>
> >>> On 29/04/14 04:02, bin.cheng wrote:
> >>> > Hi,
> >>> > Function output_move_neon now generates vld1.64 for memory ref
> >>> > like "dx <- [r1:SI]", this is bogus because it requires at least
> >>> > 64-bit alignment for 32-bit aligned memory ref. It works now
> >>> > because GCC doesn't generate such insns in the first place, but
> >>> > things are going to change if memset/memcpy calls are inlined by
using
> neon instructions.
> >>> >
> >>>
> >>> V[LD/ST]1.64 only need to be 64-bit aligned if strict alignment is
> >> enabled. We
> >>> normally assume that not to be the case. The exception to this is
> >>> when an
> >> theoretically, this doesn't make the problem go away, right?
> >>
> >>> explicit alignment check is used in the address expression (the :64
> >> suffix),
> >>> which causes the address to be checked for strict alignment at all
times.
> >>>
> >>> Do you have a testcase?
> >> I can't provide a test case without the memset inlining patch.
> >>
> > Are the tests in the memset inlining patch now sufficient to expose
> > the problem or do we need another test ?
>
> Yes, it's covered by the 4th/7th test cases in memset inlining patch.
>
This is the rebased patch, though the original one doesn't conflict with
latest trunk. Is it OK?
Thanks,
bin
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 212295)
+++ gcc/config/arm/arm.c (working copy)
@@ -18454,6 +18498,20 @@ output_move_neon (rtx *operands)
/* FIXME: Not currently enabled in neon_vector_mem_operand. */
gcc_unreachable ();
+ case REG:
+ /* We have to use vldm / vstm for too-large modes. */
+ if (nregs > 1)
+ {
+ if (nregs > 4)
+ templ = "v%smia%%?\t%%m0, %%h1";
+ else
+ templ = "v%s1.64\t%%h1, %%A0";
+
+ ops[0] = mem;
+ ops[1] = reg;
+ break;
+ }
+ /* Fall through. */
case LABEL_REF:
case PLUS:
{
@@ -18487,14 +18545,7 @@ output_move_neon (rtx *operands)
}
default:
- /* We have to use vldm / vstm for too-large modes. */
- if (nregs > 4)
- templ = "v%smia%%?\t%%m0, %%h1";
- else
- templ = "v%s1.64\t%%h1, %%A0";
-
- ops[0] = mem;
- ops[1] = reg;
+ gcc_unreachable ();
}
sprintf (buff, templ, load ? "ld" : "st");