Re: [PATCH][PR target/83994] Fix stack-clash-protection code generation on x86

2018-01-24 Thread Jeff Law
On 01/24/2018 12:11 AM, Uros Bizjak wrote:
> On Wed, Jan 24, 2018 at 12:15 AM, Jeff Law  wrote:
>>
>> pr83994 is a code generation bug in the stack-clash support that affects
>> openssl (we've turned on stack-clash-protection by default for the F28
>> builds).
>>
>> The core problem is stack-clash (like stack-check) will emit a probing
>> loop if the prologue allocates enough stack space.  When emitting a loop
>> both implementations will need a scratch register.
>>
>> They use get_scratch_register_at_entry to find a suitable scratch
>> register.  This routine assumes that callee registers saves are
>> completed at the point where the scratch register is going to be used.
>>
>> In this particular testcase we select %ebx because ax,cx,dx are used for
>> parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!
>>
>> -fstack-check has a bit of code in the frame setup/layout code which
>> forces the prologue to use pushes rather than reg->mem moves for saving
>> registers.  There's a gcc_assert in the prologue expander to catch any
>> case where the registers aren't saved.
>>
>> -fstack-clash-protection doesn't have that same bit of magic in the
>> frame setup/layout code and it bypasses the assertion due to a change I
>> made back in Nov 2017 due to not being aware of this particular issue.
>>
>> This patch reverts the assertion bypass I added back in Nov 2017 and
>> adds clarifying comments.  The patch also forces use of push to save
>> integer registers for a stack-clash protected prologue if probes are
>> going to be needed.
>>
>> Bootstrapped and regression tested on x86_64.
>>
>> While the bug is not marked as a regression, ISTM this needs to be fixed
>> for gcc-8.
>>
>> OK for the trunk?
>>
>> Jeff
>>
>> * i386.c (get_probe_interval): Move to earlier point.
>> (ix86_compute_frame_layout): If -fstack-clash-protection and
>> the frame is larger than the probe interval, then use pushes
>> to save registers rather than reg->mem moves.
>> (ix86_expand_prologue): Remove conditional for int_registers_saved
>> assertion.
>>
>> * gcc.target/i386/pr83994.c: New test.
> 
> OK with the fixed testcase (see below).
> 
> Thanks,
> Uros.

[ ... snip ... ]

>> diff --git a/gcc/testsuite/gcc.target/i386/pr83994.c 
>> b/gcc/testsuite/gcc.target/i386/pr83994.c
>> new file mode 100644
>> index 000..b57b04b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/pr83994.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -m32 -march=i686 -fpic -fstack-clash-protection" } */
> 
> Please use
> 
> /* { dg-require-effective-target ia32 } */
> 
> and remove "-m32" from dg-options.
Done.  Thanks for the quick turnaround.

jeff


Re: [PATCH][PR target/83994] Fix stack-clash-protection code generation on x86

2018-01-23 Thread Uros Bizjak
On Wed, Jan 24, 2018 at 12:15 AM, Jeff Law  wrote:
>
> pr83994 is a code generation bug in the stack-clash support that affects
> openssl (we've turned on stack-clash-protection by default for the F28
> builds).
>
> The core problem is stack-clash (like stack-check) will emit a probing
> loop if the prologue allocates enough stack space.  When emitting a loop
> both implementations will need a scratch register.
>
> They use get_scratch_register_at_entry to find a suitable scratch
> register.  This routine assumes that callee registers saves are
> completed at the point where the scratch register is going to be used.
>
> In this particular testcase we select %ebx because ax,cx,dx are used for
> parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!
>
> -fstack-check has a bit of code in the frame setup/layout code which
> forces the prologue to use pushes rather than reg->mem moves for saving
> registers.  There's a gcc_assert in the prologue expander to catch any
> case where the registers aren't saved.
>
> -fstack-clash-protection doesn't have that same bit of magic in the
> frame setup/layout code and it bypasses the assertion due to a change I
> made back in Nov 2017 due to not being aware of this particular issue.
>
> This patch reverts the assertion bypass I added back in Nov 2017 and
> adds clarifying comments.  The patch also forces use of push to save
> integer registers for a stack-clash protected prologue if probes are
> going to be needed.
>
> Bootstrapped and regression tested on x86_64.
>
> While the bug is not marked as a regression, ISTM this needs to be fixed
> for gcc-8.
>
> OK for the trunk?
>
> Jeff
>
> * i386.c (get_probe_interval): Move to earlier point.
> (ix86_compute_frame_layout): If -fstack-clash-protection and
> the frame is larger than the probe interval, then use pushes
> to save registers rather than reg->mem moves.
> (ix86_expand_prologue): Remove conditional for int_registers_saved
> assertion.
>
> * gcc.target/i386/pr83994.c: New test.

OK with the fixed testcase (see below).

Thanks,
Uros.

>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 72d25ae..4cb55a8 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -11497,6 +11497,18 @@ static void warn_once_call_ms2sysv_xlogues (const 
> char *feature)
>  }
>  }
>
> +/* Return the probing interval for -fstack-clash-protection.  */
> +
> +static HOST_WIDE_INT
> +get_probe_interval (void)
> +{
> +  if (flag_stack_clash_protection)
> +return (HOST_WIDE_INT_1U
> +   << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
> +  else
> +return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
> +}
> +
>  /* When using -fsplit-stack, the allocation routines set a field in
> the TCB to the bottom of the stack plus this much space, measured
> in bytes.  */
> @@ -11773,7 +11785,14 @@ ix86_compute_frame_layout (void)
>to_allocate = offset - frame->sse_reg_save_offset;
>
>if ((!to_allocate && frame->nregs <= 1)
> -  || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x8000)))
> +  || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x8000))
> +  /* If stack clash probing needs a loop, then it needs a
> +scratch register.  But the returned register is only guaranteed
> +to be safe to use after register saves are complete.  So if
> +stack clash protections are enabled and the allocated frame is
> +larger than the probe interval, then use pushes to save
> +callee saved registers.  */
> +  || (flag_stack_clash_protection && to_allocate > get_probe_interval 
> ()))
>  frame->save_regs_using_mov = false;
>
>if (ix86_using_red_zone ()
> @@ -12567,18 +12586,6 @@ release_scratch_register_on_entry (struct 
> scratch_reg *sr)
>  }
>  }
>
> -/* Return the probing interval for -fstack-clash-protection.  */
> -
> -static HOST_WIDE_INT
> -get_probe_interval (void)
> -{
> -  if (flag_stack_clash_protection)
> -return (HOST_WIDE_INT_1U
> -   << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
> -  else
> -return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
> -}
> -
>  /* Emit code to adjust the stack pointer by SIZE bytes while probing it.
>
> This differs from the next routine in that it tries hard to prevent
> @@ -13727,12 +13734,11 @@ ix86_expand_prologue (void)
>&& (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
>   || flag_stack_clash_protection))
>  {
> -  /* This assert wants to verify that integer registers were saved
> -prior to probing.  This is necessary when probing may be implemented
> -as a function call (Windows).  It is not necessary for stack clash
> -protection probing.  */
> -  if (!flag_stack_clash_protection)
> -   gcc_assert (int_registers_saved);
> +  /* We expect the 

[PATCH][PR target/83994] Fix stack-clash-protection code generation on x86

2018-01-23 Thread Jeff Law

pr83994 is a code generation bug in the stack-clash support that affects
openssl (we've turned on stack-clash-protection by default for the F28
builds).

The core problem is stack-clash (like stack-check) will emit a probing
loop if the prologue allocates enough stack space.  When emitting a loop
both implementations will need a scratch register.

They use get_scratch_register_at_entry to find a suitable scratch
register.  This routine assumes that callee registers saves are
completed at the point where the scratch register is going to be used.

In this particular testcase we select %ebx because ax,cx,dx are used for
parameter passing.  That's fine.  The problem is %ebx hasn't been saved yet!

-fstack-check has a bit of code in the frame setup/layout code which
forces the prologue to use pushes rather than reg->mem moves for saving
registers.  There's a gcc_assert in the prologue expander to catch any
case where the registers aren't saved.

-fstack-clash-protection doesn't have that same bit of magic in the
frame setup/layout code and it bypasses the assertion due to a change I
made back in Nov 2017 due to not being aware of this particular issue.

This patch reverts the assertion bypass I added back in Nov 2017 and
adds clarifying comments.  The patch also forces use of push to save
integer registers for a stack-clash protected prologue if probes are
going to be needed.

Bootstrapped and regression tested on x86_64.

While the bug is not marked as a regression, ISTM this needs to be fixed
for gcc-8.

OK for the trunk?

Jeff
* i386.c (get_probe_interval): Move to earlier point.
(ix86_compute_frame_layout): If -fstack-clash-protection and
the frame is larger than the probe interval, then use pushes
to save registers rather than reg->mem moves.
(ix86_expand_prologue): Remove conditional for int_registers_saved
assertion.

* gcc.target/i386/pr83994.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 72d25ae..4cb55a8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11497,6 +11497,18 @@ static void warn_once_call_ms2sysv_xlogues (const char 
*feature)
 }
 }
 
+/* Return the probing interval for -fstack-clash-protection.  */
+
+static HOST_WIDE_INT
+get_probe_interval (void)
+{
+  if (flag_stack_clash_protection)
+return (HOST_WIDE_INT_1U
+   << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
+  else
+return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
+}
+
 /* When using -fsplit-stack, the allocation routines set a field in
the TCB to the bottom of the stack plus this much space, measured
in bytes.  */
@@ -11773,7 +11785,14 @@ ix86_compute_frame_layout (void)
   to_allocate = offset - frame->sse_reg_save_offset;
 
   if ((!to_allocate && frame->nregs <= 1)
-  || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x8000)))
+  || (TARGET_64BIT && to_allocate >= HOST_WIDE_INT_C (0x8000))
+  /* If stack clash probing needs a loop, then it needs a
+scratch register.  But the returned register is only guaranteed
+to be safe to use after register saves are complete.  So if
+stack clash protections are enabled and the allocated frame is
+larger than the probe interval, then use pushes to save
+callee saved registers.  */
+  || (flag_stack_clash_protection && to_allocate > get_probe_interval ()))
 frame->save_regs_using_mov = false;
 
   if (ix86_using_red_zone ()
@@ -12567,18 +12586,6 @@ release_scratch_register_on_entry (struct scratch_reg 
*sr)
 }
 }
 
-/* Return the probing interval for -fstack-clash-protection.  */
-
-static HOST_WIDE_INT
-get_probe_interval (void)
-{
-  if (flag_stack_clash_protection)
-return (HOST_WIDE_INT_1U
-   << PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL));
-  else
-return (HOST_WIDE_INT_1U << STACK_CHECK_PROBE_INTERVAL_EXP);
-}
-
 /* Emit code to adjust the stack pointer by SIZE bytes while probing it.
 
This differs from the next routine in that it tries hard to prevent
@@ -13727,12 +13734,11 @@ ix86_expand_prologue (void)
   && (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
  || flag_stack_clash_protection))
 {
-  /* This assert wants to verify that integer registers were saved
-prior to probing.  This is necessary when probing may be implemented
-as a function call (Windows).  It is not necessary for stack clash
-protection probing.  */
-  if (!flag_stack_clash_protection)
-   gcc_assert (int_registers_saved);
+  /* We expect the GP registers to be saved when probes are used
+as the probing sequences might need a scratch register and
+the routine to allocate one assumes the integer registers
+have already been saved.  */
+  gcc_assert (int_registers_saved);
 
   if (flag_stack_clash_protection)
{
diff --git