Re: [PATCH, rs6000 8/9] enable gimple folding for vec_xl, vec_xst

Richard Biener Fri, 01 Jun 2018 08:37:35 -0700

On June 1, 2018 5:15:58 PM GMT+02:00, Bill Schmidt <wschm...@linux.ibm.com> 
wrote:
>On Jun 1, 2018, at 10:11 AM, Will Schmidt <will_schm...@vnet.ibm.com>
>wrote:
>> 
>> On Fri, 2018-06-01 at 08:53 +0200, Richard Biener wrote:
>>> On Thu, May 31, 2018 at 9:59 PM Will Schmidt
><will_schm...@vnet.ibm.com> wrote:
>>>> 
>>>> Hi,
>>>>  Add support for gimple folding for unaligned vector loads and
>stores.
>>>> testcases posted separately in this thread.
>>>> 
>>>> Regtest completed across variety of systems, P6,P7,P8,P9.
>>>> 
>>>> OK for trunk?
>>>> Thanks,
>>>> -Will
>>>> 
>>>> [gcc]
>>>> 
>>>> 2018-05-31 Will Schmidt <will_schm...@vnet.ibm.com>
>>>> 
>>>>        * config/rs6000/rs6000.c: (rs6000_builtin_valid_without_lhs)
>Add vec_xst
>>>>        variants to the list.  (rs6000_gimple_fold_builtin) Add
>support for
>>>>        folding unaligned vector loads and stores.
>>>> 
>>>> diff --git a/gcc/config/rs6000/rs6000.c
>b/gcc/config/rs6000/rs6000.c
>>>> index d62abdf..54b7de2 100644
>>>> --- a/gcc/config/rs6000/rs6000.c
>>>> +++ b/gcc/config/rs6000/rs6000.c
>>>> @@ -15360,10 +15360,16 @@ rs6000_builtin_valid_without_lhs (enum
>rs6000_builtins fn_code)
>>>>     case ALTIVEC_BUILTIN_STVX_V8HI:
>>>>     case ALTIVEC_BUILTIN_STVX_V4SI:
>>>>     case ALTIVEC_BUILTIN_STVX_V4SF:
>>>>     case ALTIVEC_BUILTIN_STVX_V2DI:
>>>>     case ALTIVEC_BUILTIN_STVX_V2DF:
>>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>>>       return true;
>>>>     default:
>>>>       return false;
>>>>     }
>>>> }
>>>> @@ -15869,10 +15875,77 @@ rs6000_gimple_fold_builtin
>(gimple_stmt_iterator *gsi)
>>>>        gimple_set_location (g, loc);
>>>>        gsi_replace (gsi, g, true);
>>>>        return true;
>>>>       }
>>>> 
>>>> +    /* unaligned Vector loads.  */
>>>> +    case VSX_BUILTIN_LXVW4X_V16QI:
>>>> +    case VSX_BUILTIN_LXVW4X_V8HI:
>>>> +    case VSX_BUILTIN_LXVW4X_V4SF:
>>>> +    case VSX_BUILTIN_LXVW4X_V4SI:
>>>> +    case VSX_BUILTIN_LXVD2X_V2DF:
>>>> +    case VSX_BUILTIN_LXVD2X_V2DI:
>>>> +      {
>>>> +        arg0 = gimple_call_arg (stmt, 0);  // offset
>>>> +        arg1 = gimple_call_arg (stmt, 1);  // address
>>>> +        lhs = gimple_call_lhs (stmt);
>>>> +        location_t loc = gimple_location (stmt);
>>>> +        /* Since arg1 may be cast to a different type, just use
>ptr_type_node
>>>> +           here instead of trying to enforce TBAA on pointer
>types.  */
>>>> +        tree arg1_type = ptr_type_node;
>>>> +        tree lhs_type = TREE_TYPE (lhs);
>>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type
>'sizetype'.  Create
>>>> +           the tree using the value from arg0.  The resulting type
>will match
>>>> +           the type of arg1.  */
>>>> +        gimple_seq stmts = NULL;
>>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype,
>arg0);
>>>> +        tree temp_addr = gimple_build (&stmts, loc,
>POINTER_PLUS_EXPR,
>>>> +                                      arg1_type, arg1,
>temp_offset);
>>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>>> +        /* Use the build2 helper to set up the mem_ref.  The
>MEM_REF could also
>>>> +           take an offset, but since we've already incorporated
>the offset
>>>> +           above, here we just pass in a zero.  */
>>>> +        gimple *g;
>>>> +        g = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type,
>temp_addr,
>>>> +                                               build_int_cst
>(arg1_type, 0)));
>>> 
>>> So in GIMPLE the type of the MEM_REF specifies the alignment so my
>question
>>> is what type does the lhs usually have here?  I'd simply guess V4SF,
>etc.?  In
>> 
>> yes.  (double-checking).  my reference for the intrinsic signatures
>> shows the lhs is a vector of type.  The rhs can be either *type or
>> *vector of type. 
>> 
>> vector double vec_vsx_ld (int, const vector double *);
>> vector double vec_vsx_ld (int, const double *);
>> With similar/same for the assorted other types.
>> 
>> These are also on my list as 'unaligned' vector loads.  I'm not
>certain
>> if that adds a twist to how I should answer the below.. 
>> 
>> Bill?
>
>'unaligned' means not necessarily aligned on a vector boundary.
>They are guaranteed to be aligned on an element boundary.
>> 
>>> this case you are missing a
>>>  tree ltype = build_aligned_type (lhs_type, desired-alignment);
>>> 
>>> and use that ltype for building the MEM_REF.  I suppose in this case
>the known
>>> alignment is either BITS_PER_UNIT or element alignment (thus
>>> TYPE_ALIGN (TREE_TYPE (lhs_type)))?
>> 
>> I'd think element alignment.  but no longer certain.  :-)
>
>Yep, element alignment.


Note the x86 unaligned intrinsics support arbitrary unaligned loads. So that's 
not available for Power? Does the HW implementation require element alignment? 

Richard. 
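
For illustration only, here is a source-level sketch of what "a vector type
with element alignment" means. It is not the code from the patch. It assumes
GNU C vector extensions (GCC or Clang), and the names v4sf_elt and
load_elt_aligned are made up for this example. Inside the folder, the
analogous step is the build_aligned_type (lhs_type, TYPE_ALIGN (TREE_TYPE
(lhs_type))) call suggested in the quoted review, with the result used as
the type of the MEM_REF.

```c
#include <stddef.h>

/* GNU C vector extension: four floats, naturally aligned to 16 bytes.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Hypothetical name: the same vector, but promising only the alignment
   of a single element.  may_alias keeps the access safe when the
   underlying object was declared with a different type.  */
typedef float v4sf_elt
  __attribute__ ((vector_size (16), aligned (__alignof__ (float)), may_alias));

/* Load four floats starting OFF bytes past P.  The address only has to
   be aligned for a float, not for a whole vector.  */
static v4sf
load_elt_aligned (size_t off, const float *p)
{
  const char *addr = (const char *) p + off;
  return *(const v4sf_elt *) addr;
}
```

Dereferencing through plain v4sf instead would let the compiler assume a
16-byte aligned address, which is the assumption the reduced-alignment type
is meant to avoid.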

>Thanks,
>Bill
>> 
>>> Or is the type of the load the element types?
>> 
>> 
>> So, In any case..  I'll build up / modify some tests to look at data
>> being loaded, and see if I can see alignment issues here.
>> 
>> Thanks,
>> -Will 
>> 
>> 
>> 
>>> Richard.
>>> 
>>>> +        gimple_set_location (g, loc);
>>>> +        gsi_replace (gsi, g, true);
>>>> +        return true;
>>>> +      }
>>>> +
>>>> +    /* unaligned Vector stores.  */
>>>> +    case VSX_BUILTIN_STXVW4X_V16QI:
>>>> +    case VSX_BUILTIN_STXVW4X_V8HI:
>>>> +    case VSX_BUILTIN_STXVW4X_V4SF:
>>>> +    case VSX_BUILTIN_STXVW4X_V4SI:
>>>> +    case VSX_BUILTIN_STXVD2X_V2DF:
>>>> +    case VSX_BUILTIN_STXVD2X_V2DI:
>>>> +      {
>>>> +        arg0 = gimple_call_arg (stmt, 0); /* Value to be stored. 
>*/
>>>> +        arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
>>>> +        tree arg2 = gimple_call_arg (stmt, 2); /* Store-to
>address.  */
>>>> +        location_t loc = gimple_location (stmt);
>>>> +        tree arg0_type = TREE_TYPE (arg0);
>>>> +        /* Use ptr_type_node (no TBAA) for the arg2_type.  */
>>>> +        tree arg2_type = ptr_type_node;
>>>> +        /* POINTER_PLUS_EXPR wants the offset to be of type
>'sizetype'.  Create
>>>> +           the tree using the value from arg1.  The resulting type
>will match
>>>> +           the type of arg2.  */
>>>> +        gimple_seq stmts = NULL;
>>>> +        tree temp_offset = gimple_convert (&stmts, loc, sizetype,
>arg1);
>>>> +        tree temp_addr = gimple_build (&stmts, loc,
>POINTER_PLUS_EXPR,
>>>> +                                      arg2_type, arg2,
>temp_offset);
>>>> +        /* Mask off any lower bits from the address.  */
>>>> +        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>>> +        gimple *g;
>>>> +        g = gimple_build_assign (build2 (MEM_REF, arg0_type,
>temp_addr,
>>>> +                                          build_int_cst
>(arg2_type, 0)), arg0);
>>>> +        gimple_set_location (g, loc);
>>>> +        gsi_replace (gsi, g, true);
>>>> +        return true;
>>>> +      }
>>>> +
>>>>     /* Vector Fused multiply-add (fma).  */
>>>>     case ALTIVEC_BUILTIN_VMADDFP:
>>>>     case VSX_BUILTIN_XVMADDDP:
>>>>     case ALTIVEC_BUILTIN_VMLADDUHM:
>>>>       {
