Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: Eric BotcazouDate: Mon, 12 Jun 2017 11:27:10 +0200 >> I do not see a direct gen_return happening in function.c in the gcc-7 >> branch. >> >> Is it somewhere else? > > There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS. > > So the code generated for your testcase is less optimized with GCC 7 and > later > than with GCC 6 and earlier? Ok, I see. That invocation from cfgrtl.c is not triggered for my testcase in gcc-7 and later. I'll update the return pattern in the gcc-7 branch and mainline in order to cover all possible cases.
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
> I do not see a direct gen_return happening in function.c in the gcc-7 > branch. > > Is it somewhere else? There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS. So the code generated for your testcase is less optimized with GCC 7 and later than with GCC 6 and earlier? -- Eric Botcazou
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: Eric BotcazouDate: Fri, 09 Jun 2017 22:13:20 +0200 >> Eric, after some more testing it turns out that we need something >> more for gcc-5 and gcc-6 to cover all cases. > > Hmm, yes, I should have thought of that... > >> The problem is that before gcc-7, the compiler can emit return >> instructions directly without going through the epilogue expander. > > That should also be true with GCC 7 and later, no? I do not see a direct gen_return happening in function.c in the gcc-7 branch. Is it somewhere else?
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
> Eric, after some more testing it turns out that we need something > more for gcc-5 and gcc-6 to cover all cases. Hmm, yes, I should have thought of that... > The problem is that before gcc-7, the compiler can emit return > instructions directly without going through the epilogue expander. That should also be true with GCC 7 and later, no? -- Eric Botcazou
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: David MillerDate: Tue, 06 Jun 2017 15:02:55 -0400 (EDT) > From: David Miller > Date: Mon, 05 Jun 2017 20:54:46 -0400 (EDT) > >> From: Eric Botcazou >> Date: Tue, 06 Jun 2017 00:02:06 +0200 >> That seems to work as well, following is going through a testsuite run right now: [PATCH] sparc: Fix stack references in return delay slot. gcc/ PR target/80968 * config/sparc/sparc.c (sparc_expand_prologue): Emit frame blockage if function uses alloca. >>> >>> Probably worth applying on all active branches I'd think. >> >> I agree, this is a really nasty bug. >> >> I'll do that after some more testing. > > This is now done, thanks again. Eric, after some more testing it turns out that we need something more for gcc-5 and gcc-6 to cover all cases. The problem is that before gcc-7, the compiler can emit return instructions directly without going through the epilogue expander. So I have to add the frame barrier emission to our return expander as well, which I will work on right now. Just FYI...
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: David MillerDate: Mon, 05 Jun 2017 20:54:46 -0400 (EDT) > From: Eric Botcazou > Date: Tue, 06 Jun 2017 00:02:06 +0200 > >>> That seems to work as well, following is going through a testsuite >>> run right now: >>> >>> >>> [PATCH] sparc: Fix stack references in return delay slot. >>> >>> gcc/ >>> >>> PR target/80968 >>> * config/sparc/sparc.c (sparc_expand_prologue): Emit frame >>> blockage if function uses alloca. >> >> Probably worth applying on all active branches I'd think. > > I agree, this is a really nasty bug. > > I'll do that after some more testing. This is now done, thanks again.
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: Eric BotcazouDate: Tue, 06 Jun 2017 00:02:06 +0200 >> That seems to work as well, following is going through a testsuite >> run right now: >> >> >> [PATCH] sparc: Fix stack references in return delay slot. >> >> gcc/ >> >> PR target/80968 >> * config/sparc/sparc.c (sparc_expand_prologue): Emit frame >> blockage if function uses alloca. > > Probably worth applying on all active branches I'd think. I agree, this is a really nasty bug. I'll do that after some more testing. Eric, thanks for reviewing and your advice to use frame blockage!
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
> That seems to work as well, following is going through a testsuite > run right now: > > > [PATCH] sparc: Fix stack references in return delay slot. > > gcc/ > > PR target/80968 > * config/sparc/sparc.c (sparc_expand_prologue): Emit frame > blockage if function uses alloca. Probably worth applying on all active branches I'd think. -- Eric Botcazou
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
From: Eric BotcazouDate: Sun, 04 Jun 2017 10:32:47 +0200 >> This is an attempt to fix PR target/80968. This bug has existed >> basically forever. >> >> The stack_tie sequence seems to be how other targets deal with this >> issue. I only emit this when alloca is used. If there are other >> conditions that potentially would necessitate such a barrier, just let >> me know. > > See my comment in the audit trail about the stack tie approach, let's just > emit a frame_blockage instead. That seems to work as well, following is going through a testsuite run right now: [PATCH] sparc: Fix stack references in return delay slot. gcc/ PR target/80968 * config/sparc/sparc.c (sparc_expand_prologue): Emit frame blockage if function uses alloca. gcc/testsuite/ * gcc.target/sparc/sparc-ret-3.c: New test. --- gcc/config/sparc/sparc.c | 3 ++ gcc/testsuite/gcc.target/sparc/sparc-ret-3.c | 53 2 files changed, 56 insertions(+) create mode 100644 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index 6dfb269..95a64a4 100644 --- a/gcc/config/sparc/sparc.c +++ b/gcc/config/sparc/sparc.c @@ -5792,6 +5792,9 @@ sparc_expand_epilogue (bool for_eh) { HOST_WIDE_INT size = sparc_frame_size; + if (cfun->calls_alloca) +emit_insn (gen_frame_blockage ()); + if (sparc_n_global_fp_regs > 0) emit_save_or_restore_global_fp_regs (sparc_frame_base_reg, sparc_frame_base_offset diff --git a/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c new file mode 100644 index 000..7a151f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c @@ -0,0 +1,53 @@ +/* PR target/80968 */ +/* { dg-do compile } */ +/* { dg-skip-if "no register windows" { *-*-* } { "-mflat" } { "" } } */ +/* { dg-require-effective-target ilp32 } */ +/* { dg-options "-mcpu=ultrasparc -O" } */ + +/* Make sure references to the stack frame do not slip into the delay slot + of a return instruction. */ + +struct crypto_shash { + unsigned int descsize; +}; +struct crypto_shash *tfm; + +struct shash_desc { + struct crypto_shash *tfm; + unsigned int flags; + + void *__ctx[] __attribute__((aligned(8))); +}; + +static inline unsigned int crypto_shash_descsize(struct crypto_shash *tfm) +{ + return tfm->descsize; +} + +static inline void *shash_desc_ctx(struct shash_desc *desc) +{ + return desc->__ctx; +} + +#define SHASH_DESC_ON_STACK(shash, ctx) \ + char __##shash##_desc[sizeof(struct shash_desc) + \ + crypto_shash_descsize(ctx)] __attribute__((aligned(8))); \ + struct shash_desc *shash = (struct shash_desc *)__##shash##_desc + +extern int crypto_shash_update(struct shash_desc *, const void *, unsigned int); + +unsigned int bug(unsigned int crc, const void *address, unsigned int length) +{ + SHASH_DESC_ON_STACK(shash, tfm); + unsigned int *ctx = (unsigned int *)shash_desc_ctx(shash); + int err; + + shash->tfm = tfm; + shash->flags = 0; + *ctx = crc; + + err = crypto_shash_update(shash, address, length); + + return *ctx; +} +/* { dg-final { scan-assembler "ld\[ \t\]*\\\[%i5\\+8\\\], %i0\n\[^\n\]*return\[ \t\]*%i7\\+8" } } */ -- 2.1.2.532.g19b5d50
Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
> This is an attempt to fix PR target/80968. This bug has existed > basically forever. > > The stack_tie sequence seems to be how other targets deal with this > issue. I only emit this when alloca is used. If there are other > conditions that potentially would necessitate such a barrier, just let > me know. See my comment in the audit trail about the stack tie approach, let's just emit a frame_blockage instead. -- Eric Botcazou
[PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.
This is an attempt to fix PR target/80968. This bug has existed basically forever. The stack_tie sequence seems to be how other targets deal with this issue. I only emit this when alloca is used. If there are other conditions that potentially would necessitate such a barrier, just let me know. sparc: Fix stack references in return delay slot. gcc/ * config/sparc/sparc.md (UNSPEC_TIE): New unspec. (stack_tie): New pattern. * config/sparc/sparc.c (sparc_emit_stack_tie): New function. (sparc_expand_prologue): Call it if function uses alloca. gcc/testsuite/ * gcc.target/sparc/sparc-ret-3.c: New test. --- gcc/config/sparc/sparc.c | 15 gcc/config/sparc/sparc.md| 11 ++ gcc/testsuite/gcc.target/sparc/sparc-ret-3.c | 53 3 files changed, 79 insertions(+) create mode 100644 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index 6dfb269..345bc7f 100644 --- a/gcc/config/sparc/sparc.c +++ b/gcc/config/sparc/sparc.c @@ -5784,6 +5784,18 @@ sparc_asm_function_prologue (FILE *file, HOST_WIDE_INT size ATTRIBUTE_UNUSED) sparc_output_scratch_registers (file); } +/* This ties together stack memory (MEM with an alias set of frame_alias_set) + and the change to the stack pointer. */ + +static void +sparc_emit_stack_tie (void) +{ + rtx mem = gen_frame_mem (BLKmode, + gen_rtx_REG (Pmode, STACK_POINTER_REGNUM)); + + emit_insn (gen_stack_tie (mem)); +} + /* Expand the function epilogue, either normal or part of a sibcall. We emit all the instructions except the return or the call. */ @@ -5792,6 +5804,9 @@ sparc_expand_epilogue (bool for_eh) { HOST_WIDE_INT size = sparc_frame_size; + if (cfun->calls_alloca) +sparc_emit_stack_tie (); + if (sparc_n_global_fp_regs > 0) emit_save_or_restore_global_fp_regs (sparc_frame_base_reg, sparc_frame_base_offset diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md index 737bdb3..ef00fc5 100644 --- a/gcc/config/sparc/sparc.md +++ b/gcc/config/sparc/sparc.md @@ -94,6 +94,8 @@ UNSPEC_ADDV UNSPEC_SUBV UNSPEC_NEGV + + UNSPEC_TIE ]) (define_c_enum "unspecv" [ @@ -8473,6 +8475,15 @@ [(set_attr "type" "multi") (set_attr "length" "4")]) +;; This is used in sparc_expand_epilogue in order to prevent insns +;; referencing the stack from being placed after the deallocation of +;; the stack frame. +(define_insn "stack_tie" + [(set (match_operand:BLK 0 "memory_operand" "+m") +(unspec:BLK [(match_dup 0)] UNSPEC_TIE))] + "" + "" + [(set_attr "length" "0")]) ;; Vector instructions. diff --git a/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c new file mode 100644 index 000..7a151f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c @@ -0,0 +1,53 @@ +/* PR target/80968 */ +/* { dg-do compile } */ +/* { dg-skip-if "no register windows" { *-*-* } { "-mflat" } { "" } } */ +/* { dg-require-effective-target ilp32 } */ +/* { dg-options "-mcpu=ultrasparc -O" } */ + +/* Make sure references to the stack frame do not slip into the delay slot + of a return instruction. */ + +struct crypto_shash { + unsigned int descsize; +}; +struct crypto_shash *tfm; + +struct shash_desc { + struct crypto_shash *tfm; + unsigned int flags; + + void *__ctx[] __attribute__((aligned(8))); +}; + +static inline unsigned int crypto_shash_descsize(struct crypto_shash *tfm) +{ + return tfm->descsize; +} + +static inline void *shash_desc_ctx(struct shash_desc *desc) +{ + return desc->__ctx; +} + +#define SHASH_DESC_ON_STACK(shash, ctx) \ + char __##shash##_desc[sizeof(struct shash_desc) + \ + crypto_shash_descsize(ctx)] __attribute__((aligned(8))); \ + struct shash_desc *shash = (struct shash_desc *)__##shash##_desc + +extern int crypto_shash_update(struct shash_desc *, const void *, unsigned int); + +unsigned int bug(unsigned int crc, const void *address, unsigned int length) +{ + SHASH_DESC_ON_STACK(shash, tfm); + unsigned int *ctx = (unsigned int *)shash_desc_ctx(shash); + int err; + + shash->tfm = tfm; + shash->flags = 0; + *ctx = crc; + + err = crypto_shash_update(shash, address, length); + + return *ctx; +} +/* { dg-final { scan-assembler "ld\[ \t\]*\\\[%i5\\+8\\\], %i0\n\[^\n\]*return\[ \t\]*%i7\\+8" } } */ -- 2.1.2.532.g19b5d50