On Mon, 27 Sep 2021, Jirui Wu wrote:

> Hi all,
> 
> I now use the type based on the specification of the intrinsic
> instead of the type based on the formal argument.
> 
> I use signed int vector types because the outputs of the neon builtins
> that I am lowering are always signed. In addition, fcode and stmt
> do not have information on whether the result is signed.
> 
> Because I am replacing the stmt with new_stmt,
> a VIEW_CONVERT_EXPR cast is already in the code if needed.
> As a result, the resulting assembly code is correct.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master? If OK, can it be committed for me? I have no commit rights.

+           tree temp_lhs = gimple_call_lhs (stmt);
+           aarch64_simd_type_info simd_type
+             = aarch64_simd_types[mem_type];
+           tree elt_ptr_type = build_pointer_type (simd_type.eltype);
+           tree zero = build_zero_cst (elt_ptr_type);
+           gimple_seq stmts = NULL;
+           tree base = gimple_convert (&stmts, elt_ptr_type,
+                                       args[0]);
+           new_stmt = gimple_build_assign (temp_lhs,
+                                    fold_build2 (MEM_REF,
+                                    TREE_TYPE (temp_lhs),
+                                    base,
+                                    zero));

this now uses the alignment info of the LHS of the call, by using
TREE_TYPE (temp_lhs) as the type of the MEM_REF.  So for example

 typedef int foo __attribute__((vector_size(N),aligned(256)));

 foo tem = ld1 (ptr);

will now access *ptr as if it were aligned to 256 bytes.  But I'm sure
the ld1 intrinsic documents the required alignment (either it's the
natural alignment of the vector type loaded or element alignment?).

For element alignment you'd do something like

  tree access_type
    = build_aligned_type (vector_type, TYPE_ALIGN (TREE_TYPE (vector_type)));

for example.
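
Concretely, wired into your hunk above, that might look something like the
following (just a sketch, reusing temp_lhs, simd_type, args and the
surrounding stmts handling from your patch, and assuming element alignment
is indeed what the intrinsic guarantees):

  tree temp_lhs = gimple_call_lhs (stmt);
  aarch64_simd_type_info simd_type
    = aarch64_simd_types[mem_type];
  /* As in your patch: the zero offset has pointer-to-element type, so the
     TBAA info of the access comes from the intrinsic specification, not
     from the formal argument.  */
  tree elt_ptr_type = build_pointer_type (simd_type.eltype);
  tree zero = build_zero_cst (elt_ptr_type);
  gimple_seq stmts = NULL;
  tree base = gimple_convert (&stmts, elt_ptr_type, args[0]);
  /* Vector type of the result, but with only element alignment, so an
     over-aligned typedef on the LHS does not leak into the MEM_REF.  */
  tree vec_type = TREE_TYPE (temp_lhs);
  tree access_type
    = build_aligned_type (vec_type, TYPE_ALIGN (TREE_TYPE (vec_type)));
  new_stmt
    = gimple_build_assign (temp_lhs,
                           fold_build2 (MEM_REF, access_type, base, zero));

The access is still done through the element pointer type (the zero offset
carries the TBAA info); only the alignment of the MEM_REF type changes.  If
you prefer not to depend on the LHS type at all you could instead build the
aligned variant from the intrinsic's own vector type.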

Richard.


> Thanks,
> Jirui
> 
> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Thursday, September 16, 2021 2:59 PM
> > To: Jirui Wu <jirui...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com; i...@airs.com; Richard
> > Sandiford <richard.sandif...@arm.com>
> > Subject: Re: [Patch][GCC][middle-end] - Lower store and load neon builtins
> > to gimple
> > 
> > On Thu, 16 Sep 2021, Jirui Wu wrote:
> > 
> > > Hi all,
> > >
> > > This patch lowers the vld1 and vst1 variants of the store and load
> > > neon builtin functions to gimple.
> > >
> > > The changes in this patch covers:
> > > * Replaces calls to the vld1 and vst1 variants of the builtins
> > > * Uses MEM_REF gimple assignments to generate better code
> > > * Updates test cases to prevent over optimization
> > >
> > > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> > >
> > > Ok for master? If OK, can it be committed for me? I have no commit rights.
> > 
> > +           new_stmt = gimple_build_assign (gimple_call_lhs (stmt),
> > +                          fold_build2 (MEM_REF,
> > +                                       TREE_TYPE (gimple_call_lhs (stmt)),
> > +                                       args[0],
> > +                                       build_int_cst (TREE_TYPE (args[0]), 0)));
> > 
> > you are using TBAA info based on the formal argument type that might have
> > pointer conversions stripped.  Instead you should use a type based on the
> > specification of the intrinsics (or the builtins).
> > 
> > Likewise for the type of the access (mind alignment info there!).
> > 
> > Richard.
> > 
> > > Thanks,
> > > Jirui
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/aarch64/aarch64-builtins.c
> > >         (aarch64_general_gimple_fold_builtin): Lower vld1 and vst1
> > >         variants of the neon builtins.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/aarch64/fmla_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/fmls_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/fmul_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/mla_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/mls_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/mul_intrinsic_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/simd/vmul_elem_1.c: Prevent over optimization.
> > >         * gcc.target/aarch64/vclz.c: Replace macro with function to
> > >         prevent over optimization.
> > >         * gcc.target/aarch64/vneg_s.c: Replace macro with function to
> > >         prevent over optimization.
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
