Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

Peter Bergner Fri, 22 Mar 2024 21:38:50 -0700

On 3/22/24 5:15 AM, Ajit Agarwal wrote:
> When using FlexiBLAS with OpenBLAS we noticed corruption of
> the parameters passed to OpenBLAS functions. FlexiBLAS
> basically provides a BLAS interface where each function
> is a stub that forwards the arguments to a real BLAS lib,
> like OpenBLAS.
> 
> Fixes the corruption of caller frame checking number of
> arguments is less than equal to GP_ARG_NUM_REG (8)
> excluding hidden unused DECLS.


I think the git log entry commentary could be a little more descriptive
of what the problem is. How about something like the following?

  When using FlexiBLAS with OpenBLAS, we noticed corruption of the caller
  stack frame when calling OpenBLAS functions.  This was caused by the
  FlexiBLAS C/C++ caller and OpenBLAS Fortran callee disagreeing on the
  number of function parameters in the callee due to hidden Fortran
  parameters. This can cause problems when the callee believes the caller
  has allocated a parameter save area when the caller has not done so.
  That means any writes by the callee into the non-existent parameter save
  area will corrupt the caller stack frame.

  The workaround implemented here, is for the callee to determine whether
  the caller has allocated a parameter save area or not, by ignoring any
  unused hidden parameters when counting the number of parameters.



>       PR rtk-optimization/100799

s/rtk/rtl/



>       * config/rs6000/rs6000-calls.cc (rs6000_function_arg): Don't
>       generate parameter save area if number of arguments passed
>       less than equal to GP_ARG_NUM_REG (8) excluding hidden
>       parameter.

The callee doesn't generate or allocate the parameter save area, the
caller does.  The code here is for the callee trying to determine
whether the caller has done so.  How about saying the following instead?

  Don't assume a parameter save area has been allocated if the number of
  formal parameters, excluding unused hidden parameters, is less than or
  equal to GP_ARG_NUM_REG (8).




>       (init_cumulative_args): Check for hidden parameter in fortran
>       routine and set the flag hidden_string_length and actual
>       parameter passed excluding hidden unused DECLS.

Check for unused hidden Fortran parameters and set hidden_string_length
and actual_parm_length.


> +  /* When the buggy C/C++ wrappers call the function with fewer arguments
> +     than it actually has and doesn't expect the parameter save area on the
> +     caller side because of that while the callee expects it and the callee
> +     actually stores something in the parameter save area, it corrupts
> +     whatever is in the caller stack frame at that location.  */

The wrapper/caller is the one that allocates the parameter save area, so
saying "...doesn't expect the parameter save area on the caller side..."
doesn't make sense, since it knows whether it allocated it or not.
How about saying something like the following instead?

  Check whether this function contains any unused hidden parameters and
  record how many there are for use in rs6000_function_arg() to determine
  whether its callers have allocated a parameter save area or not.
  See PR100799 for details.



> +  unsigned int num_args = 0;
> +  unsigned int hidden_length = 0;
> +
> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +       arg; arg = DECL_CHAIN (arg))
> +    {
> +      num_args++;
> +      if (DECL_HIDDEN_STRING_LENGTH (arg))
> +     {
> +       tree parmdef = ssa_default_def (cfun, arg);
> +       if (parmdef == NULL || has_zero_uses (parmdef))
> +         {
> +           cum->hidden_string_length = 1;
> +           hidden_length++;
> +         }
> +     }
> +   }
> +
> +  cum->actual_parm_length = num_args - hidden_length;

This code looks fine, but do we really need two new fields in rs6000_args?
Can't we just get along with only cum->actual_parm_length by modifying
the rs6000_function_arg() change from:

> +      else if (align_words < GP_ARG_NUM_REG
> +            || (cum->hidden_string_length
> +            && cum->actual_parm_length <= GP_ARG_NUM_REG))

to:

+      else if (align_words < GP_ARG_NUM_REG
+              || cum->actual_parm_length <= GP_ARG_NUM_REG)

???

That said, I have a further comment below on what happens here when 
align_words >= GP_ARG_NUM_REG and cum->actual_parm_length <= GP_ARG_NUM_REG.




> +     /* When the buggy C/C++ wrappers call the function with fewer arguments
> +     than it actually has and doesn't expect the parameter save area on the
> +     caller side because of that while the callee expects it and the callee
> +     actually stores something in the parameter save area, it corrupts
> +     whatever is in the caller stack frame at that location.  */

Same comment as before, so same problem with the comment, but the following
change...

> -      else if (align_words < GP_ARG_NUM_REG)
> +      else if (align_words < GP_ARG_NUM_REG
> +            || (cum->hidden_string_length
> +            && cum->actual_parm_length <= GP_ARG_NUM_REG))
        {
          if (TARGET_32BIT && TARGET_POWERPC64)
            return rs6000_mixed_function_arg (mode, type, align_words);

          return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
        }
      else
        return NULL_RTX;

The old code for the unused hidden parameter (which was the 9th param) would
fall thru to the "return NULL_RTX;" which would make the callee assume there
was a parameter save area allocated.  Now instead, we'll return a reg rtx,
probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
see a copy of r11 into a pseudo like we do for the other param regs.
Is that a problem? Given it's an unused parameter, it'll probably get deleted
as dead code, but could it cause any issues?  What if we have more than one
unused hidden parameter and we return r12 and r13 which have specific uses
in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
Have you verified what the callee RTL looks like after expand for these
unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
which triggers the assumption of a parameter save area, but isn't a reg rtx
which might lead to some rtl being generated?  Would a (const_int 0) or
something else work?



> +  /* Actual parameter length ignoring hidden parameter.
> +     This is done to C++ wrapper calling fortran procedures
> +     which has hidden parameter that are not used.  */

I think a simpler comment will suffice:

  /* Actual parameter count ignoring unused hidden parameters.  */


Peter

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

Reply via email to