Richard,

Thanks a lot for your comments.

> On Oct 20, 2020, at 1:12 PM, Richard Sandiford <richard.sandif...@arm.com> 
> wrote:
> 
>> 
>> +
>> +  if ((strcmp (TREE_STRING_POINTER (id), "skip") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "used-gpr-arg") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "used-arg") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "all-arg") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "used-gpr") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "all-gpr") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "used") != 0)
>> +      && (strcmp (TREE_STRING_POINTER (id), "all") != 0))
> 
> Any reason we don't support all-gpr-arg?  Seems to be the only
> “missing” combination.
Will add this one.

> 
> Would be good to have a single piece of code that parses these
> arguments into a set of flags, rather than have one list here
> and one get_call_used_regs_seq.
> 
> Maybe we could do something similar to sanitizer_opts, but that
> might not be necessary.

Okay, will do that.
> 
>> +    {
>> +      error ("attribute %qE argument must be one of %qs, %qs, %qs, %qs,"
>> +         "%qs, %qs, %qs, or %qs",
>> +         name, "skip", "used-gpr-arg", "used-arg", "all-arg",
>> +         "used-gpr", "all-gpr", "used", "all");
>> +      *no_add_attris = true;
>> +      return NULL_TREE;
>> +    }
>> +
>> +  return NULL_TREE;
>> +}
>> +
>> /* Handle a "returns_nonnull" attribute; arguments as in
>>   struct attribute_spec.handler.  */
>> 
>> diff --git a/gcc/coretypes.h b/gcc/coretypes.h
>> index 6b6cfcd..0ce5eb4 100644
>> --- a/gcc/coretypes.h
>> +++ b/gcc/coretypes.h
>> @@ -418,6 +418,19 @@ enum symbol_visibility
>>  VISIBILITY_INTERNAL
>> };
>> 
>> +/* Zero call-used registers type.  */
>> +enum zero_call_used_regs {
>> +  zero_call_used_regs_unset = 0,
>> +  zero_call_used_regs_skip,
>> +  zero_call_used_regs_used_gpr_arg,
>> +  zero_call_used_regs_used_arg,
>> +  zero_call_used_regs_all_arg,
>> +  zero_call_used_regs_used_gpr,
>> +  zero_call_used_regs_all_gpr,
>> +  zero_call_used_regs_used,
>> +  zero_call_used_regs_all
>> +};
> 
> I think a bitmask would be easier to use:
> 
>  SKIP
>  ONLY_USED
>  ONLY_GPR
>  ONLY_ARG
> 
> Should probably be a class enum given that we're C++11.

Good suggestion.

> 
>> +pass parameters. @samp{used-arg} zeros used call-used registers that
>> +pass parameters. @samp{arg} zeros all call-used registers that pass
>> +parameters.  These 3 choices are used for ROP mitigation.
>> +
>> +@samp{used-gpr} zeros call-used general purpose registers
>> +which are used in function.  @samp{all-gpr} zeros all
>> +call-used registers.  @samp{used} zeros call-used registers which
>> +are used in function.  @samp{all} zeros all call-used registers.
>> +These 4 choices are used for preventing information leak through
>> +registers.
> 
> The description for all-gpr doesn't look right.
Oops. Will fix it.

>  I think it would
> be easier to describe (and hopefully to follow) if we start with
> the three basic choices: “skip”, “used” and “all”.  Then describe
> how “used” and “all” can be modified by adding “-gpr” to limit the
> clearing to general-purpose registers and “-arg” to limit the
> clearing to argument registers.
> 
> We need to say what “call-used” and “used” mean in this context.
> In particular, “call-used” is also known as “call-clobbered”,
> “caller-saved“ and “volatile”, so it would be good to list those
> as alternatives.  We need to say what “used” registers are.

Okay.

>> 
>> +@item -fzero-call-used-regs=@var{choice}
>> +@opindex fzero-call-used-regs
>> +Zero call-used registers at function return to increase the program
>> +security by either mitigating Return-Oriented Programming (ROP) or
>> +preventing information leak through registers.
>> +
>> +@samp{skip}, which is the default, doesn't zero call-used registers.
>> +
>> +@samp{used-gpr-arg} zeros used call-used general purpose registers that
>> +pass parameters. @samp{used-arg} zeros used call-used registers that
>> +pass parameters. @samp{all-arg} zeros all call-used registers that pass
>> +parameters.  These 3 choices are used for ROP mitigation.
>> +
>> +@samp{used-gpr} zeros call-used general purpose registers
>> +which are used in function.  @samp{all-gpr} zeros all
>> +call-used registers.  @samp{used} zeros call-used registers which
>> +are used in function.  @samp{all} zeros all call-used registers.
>> +These 4 choices are used for preventing information leak through
>> +registers.
> 
> Same comment here.

Okay.

> 
>> @@ -310,6 +310,9 @@ struct GTY(()) rtl_data {
>>     sets them.  */
>>  HARD_REG_SET asm_clobbers;
>> 
>> +  /* All hard registers that are zeroed at the return of the routine.  */
>> +  HARD_REG_SET zeroed_reg_set;
> 
> How about “must_be_zero_on_return“?  “zeroed_reg_set” isn't very
> specific about where the zeroing happens or is needed.  E.g. we also
> zero uninitialised registers.
okay.

> 
>> +{
>> +  basic_block bb = BLOCK_FOR_INSN (ret);
>> +  auto_bitmap live_out;
>> +  bitmap_copy (live_out, df_get_live_out (bb));
> 
> Sorry, forgot that here we should do:
> 
>  df_simulate_initialize_backwards (bb, live_out);
> 
> But we should calculate this set once per return instruction rather
> than repeat the calculation for every register.

Okay.
> 
>> 
>> +  /* If gpr_only is true, only zero call-used-registers that are
>> +     general-purpose registers; if used_only is true, only zero
>> +     call-used-registers that are used in the current function.  */
>> +
>> +  switch (zero_regs_type)
>> +    {
>> +      case zero_call_used_regs_used_arg:
>> +    gpr_only = false;
>> +    break;
>> +      case zero_call_used_regs_all_arg:
>> +    gpr_only = false;
>> +    used_only = false;
>> +    break;
>> +      case zero_call_used_regs_used_gpr:
>> +    arg_only = false;
>> +    break;
>> +      case zero_call_used_regs_all_gpr:
>> +    used_only = false;
>> +    arg_only = false;
>> +    break;
>> +      case zero_call_used_regs_used:
>> +    gpr_only = false;
>> +    arg_only = false;
>> +    break;
>> +      case zero_call_used_regs_all:
>> +    gpr_only = false;
>> +    used_only = false;
>> +    arg_only = false;
>> +    break;
>> +      default:
>> +    break;
>> +    }
> 
> Using a bitmask would also simplify this.
agreed.

> 
>> +
>> +  /* For each of the hard registers, check to see whether we should zero it 
>> if:
>> +     1. it is a call-used-registers;
>> + and 2. it is not a fixed-registers;
>> + and 3. it is not live at the return of the routine;
>> + and 4. it is general registor if gpr_only is true;
>> + and 5. it is used in the routine if used_only is true;
>> + and 6. it is a register that passes parameter if arg_only is true;
>> +   */
>> +
>> +  HARD_REG_SET need_zeroed_hardregs;
>> +  CLEAR_HARD_REG_SET (need_zeroed_hardregs);
>> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> +    {
>> +      if (!this_target_hard_regs->x_call_used_regs[regno])
>> +    continue;
> 
> This should use crtl->abi instead.  The set of call-used registers
> can vary from function to function.

You mean to use:

If (!crtl->abi->clobbers_full_reg_p(regno))

?


> 
>> +static unsigned int
>> +rest_of_zero_call_used_regs (void)
>> +{
>> +  basic_block bb;
>> +  rtx_insn *insn;
>> +
>> +  /* This pass needs data flow information.  */
>> +  df_analyze ();
>> +
>> +  /* Search all the "return"s in the routine, and insert instruction 
>> sequence to
>> +     zero the call used registers.  */
>> +  FOR_EACH_BB_REVERSE_FN (bb, cfun)
>> +    if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
>> +    || (single_succ_p (bb)
>> +        && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)))
>> +      FOR_BB_INSNS_REVERSE (bb, insn)
>> +    if (JUMP_P (insn) && ANY_RETURN_P (JUMP_LABEL (insn)))
>> +      {
>> +        /* Now we can insert the instruction sequence to zero the call used
>> +           registers before this insn.  */
>> +        gen_call_used_regs_seq (insn);
>> +        break;
>> +      }
> 
> The exit block never has instructions, so it's only necessary to process
> predecessors.  A simpler way to do that is to iterate over the edges in:
> 
>  EXIT_BLOCK_PTR_FOR_FN (cfun)->preds
> 
> You shouldn't need to use FOR_BB_INSNS_REVERSE: it should be enough
> to check only BB_END (bb), since returns always end blocks.

Something like the following?

  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
    {
     insn = BB_END (e->src);
      If (JUMP_P (insn) && ANY_RETURN_P (JUMP_LABEL (insn)))
        {
          /* Now we can insert the instruction sequence to zero the call used
               registers before this insn.  */
            gen_call_used_regs_seq (insn);
            break;       
        }
      }
> 
>> +
>> +  return 0;
>> +}
>> +
>> +namespace {
>> +
>> +const pass_data pass_data_zero_call_used_regs =
>> +{
>> +  RTL_PASS, /* type */
>> +  "zero_call_used_regs", /* name */
>> +  OPTGROUP_NONE, /* optinfo_flags */
>> +  TV_NONE, /* tv_id */
>> +  0, /* properties_required */
>> +  0, /* properties_provided */
>> +  0, /* properties_destroyed */
>> +  0, /* todo_flags_start */
>> +  0, /* todo_flags_finish */
>> +};
>> +
>> +class pass_zero_call_used_regs: public rtl_opt_pass
>> +{
>> +public:
>> +  pass_zero_call_used_regs (gcc::context *ctxt)
>> +    : rtl_opt_pass (pass_data_zero_call_used_regs, ctxt)
>> +  {}
>> +
>> +  /* opt_pass methods: */
>> +  virtual bool gate (function *)
>> +    {
>> +      return flag_zero_call_used_regs > zero_call_used_regs_unset;
> 
> I think this also needs to check the function attributes.

Okay.
> 
>> +    }
>> +  virtual unsigned int execute (function *)
>> +    {
>> +      return rest_of_zero_call_used_regs ();
>> +    }
>> +
>> +}; // class pass_zero_call_used_regs
>> +
>> +} // anon namespace
>> +
>> +rtl_opt_pass *
>> +make_pass_zero_call_used_regs (gcc::context *ctxt)
>> +{
>> +  return new pass_zero_call_used_regs (ctxt);
>> +}
>> 
>> /* If CONSTRAINT is a matching constraint, then return its number.
>>   Otherwise, return -1.  */
>> diff --git a/gcc/optabs.c b/gcc/optabs.c
>> index 8ad7f4b..57e5c5d 100644
>> --- a/gcc/optabs.c
>> +++ b/gcc/optabs.c
>> @@ -6484,6 +6484,49 @@ expand_memory_blockage (void)
>>    expand_asm_memory_blockage ();
>> }
>> 
>> +/* Generate asm volatile("" : : : "memory") as a memory blockage, at the
>> +   same time clobbering the register set specified by ZEROED_REGS.  */
>> +
>> +void
>> +expand_asm_reg_clobber_mem_blockage (HARD_REG_SET zeroed_regs)
> 
> Just “regs”: the interface is more general than registers that are being
> zeroed.
Okay.

> 
>> +{
>> +  rtx asm_op, clob_mem, clob_reg;
>> +
>> +  unsigned int num_of_regs = 0;
>> +  for (unsigned int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>> +    if (TEST_HARD_REG_BIT (zeroed_regs, i))
>> +      num_of_regs++;
>> +
>> +  if (num_of_regs == 0)
>> +    return;
> 
> For this interface, I think we should continue and just include
> a memory clobber.

Right.

> 
>> +    RTVEC_ELT (v,j) = clob_reg;
> 
> …IMO it would be more readable as just:
> 
>       RTVEC_ELT (v, j) = gen_rtx_CLOBBER (VOIDmode, regno_reg_rtx[i]);
Okay.

> 
>> diff --git a/gcc/recog.c b/gcc/recog.c
>> index ce83b7f..472c2dc 100644
>> --- a/gcc/recog.c
>> +++ b/gcc/recog.c
>> @@ -923,6 +923,21 @@ validate_simplify_insn (rtx_insn *insn)
>>  return ((num_changes_pending () > 0) && (apply_change_group () > 0));
>> }
>> 
>> 
>> +
>> +bool
>> +valid_insn_p (rtx_insn *insn)
> 
> This should have a function comment.  E.g.:
> 
> /* Check whether INSN matches a specific alternative of an .md pattern.  */
Okay.

> 
>> 
>> +/* Generate instruction sequence to zero call used registers.  */
>> +DEFHOOK
>> +(zero_call_used_regs,
>> + "This target hook emits instructions to zero registers specified\n\
>> +by @var{need_zeroed_hardregs} at function return, at the same time\n\
>> +return the hard register set that are actually zeroed by the hook\n\
>> +Define this hook if the target has more effecient instructions to\n\
>> +zero call-used registers, or if the target only tries to zero a subset\n\
>> +of @var{need_zeroed_hardregs}.\n\
>> +If the hook is not defined, the default_zero_call_used_reg will be used.",
>> + HARD_REG_SET, (HARD_REG_SET need_zeroed_hardregs),
> 
> I'd suggest:
> 
> "Emit instructions to zero the subset of @var{selected_regs} that\n\
> could conceivably contain values that are useful to an attacker.\n\
> Return the set of registers that were actually cleared.\n\
> \n\
> The default implementation uses normal move instructions to zero\n\
> all the registers in @var{selected_regs}.  Define this hook if the\n\
> target has more efficient ways of zeroing certain registers,\n\
> or if you believe that certain registers would never contain\n\
> values that are useful to an attacker."
> 
> with the parameter called “selected_regs” instead of
> “need_zeroed_hardregs”.  (“need” suggests that the target
> doesn't have the option of not zeroing.)

Okay.

> 
>> 
>> +/* The default hook for TARGET_ZERO_CALL_USED_REGS.  */
>> +
>> +HARD_REG_SET
>> +default_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>> +{
>> +  HARD_REG_SET zeroed_hardregs;
>> +  gcc_assert (!hard_reg_set_empty_p (need_zeroed_hardregs));
>> +
>> +  CLEAR_HARD_REG_SET (zeroed_hardregs);
>> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> +    if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
>> +      {
>> +    rtx_insn *last_insn = get_last_insn ();
>> +    machine_mode mode = GET_MODE (regno_reg_rtx[regno]);
>> +    rtx zero = CONST0_RTX (mode);
>> +    rtx_insn *insn = emit_move_insn (regno_reg_rtx[regno], zero);
>> +    if (!valid_insn_p (insn))
>> +      {
>> +        static bool issued_error;
>> +        if (!issued_error)
>> +          {
>> +            issued_error = true;
>> +            sorry ("-fzero-call-used-regs not supported on this target");
> 
> Should be "%qs not supported on this target", with the option name
> as a second argument.

Okay.

> 
>> +          }
>> +        delete_insns_since (last_insn);
>> +      }
>> +    else
>> +      SET_HARD_REG_BIT (zeroed_hardregs, regno);
>> +      }
>> +  return zeroed_hardregs;
> 
> I don't think it's worth building up a different return set.  The only
> time it's different from need_zeroed_hardregs is if we emit the sorry,
> which will cause compilation to fail anyway.

Okay.

> 
> Sorry, I ran out of time to review the tests, but the code part
> otherwise looks good.

thanks.

Qing
> 
> Thanks,
> Richard

Reply via email to