On Thu, Jan 16, 2020 at 5:51 PM Andrew Pinski <pins...@gmail.com> wrote: > > On Thu, Jan 16, 2020 at 5:14 AM Richard Sandiford > <richard.sandif...@arm.com> wrote: > > > > Wilco Dijkstra <wilco.dijks...@arm.com> writes: > > > The separate shrinkwrapping pass may insert stores in the middle > > > of atomics loops which can cause issues on some implementations. > > > Avoid this by delaying splitting of atomic patterns until after > > > prolog/epilog generation. > > > > > > Bootstrap completed, no test regressions on AArch64. > > > > > > Andrew, can you verify this fixes the failure you were getting? > > > > > > ChangeLog: > > > 2020-01-16 Wilco Dijkstra <wdijk...@arm.com> > > > > > > PR target/92692 > > > * config/aarch64/aarch64.c (aarch64_split_compare_and_swap) > > > Add assert to ensure prolog has been emitted. > > > (aarch64_split_atomic_op): Likewise. > > > * config/aarch64/atomics.md (aarch64_compare_and_swap<mode>) > > > Use epilogue_completed rather than reload_completed. > > > (aarch64_atomic_exchange<mode>): Likewise. > > > (aarch64_atomic_<atomic_optab><mode>): Likewise. > > > (atomic_nand<mode>): Likewise. > > > (aarch64_atomic_fetch_<atomic_optab><mode>): Likewise. > > > (atomic_fetch_nand<mode>): Likewise. > > > (aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise. > > > (atomic_nand_fetch<mode>): Likewise. > > > > OK if Andrew confirms it works, thanks. > > Yes this fixes the issue for me. Here is the new assembly showing it worked:
d390: f9000bf3 str x19, [sp, #16] d394: 885ffdc8 ldaxr w8, [x14] d398: 6b01011f cmp w8, w1 d39c: 54000061 b.ne d3a8 <pthread_rwlock_clockwrlock+0x168> // b.any d3a0: 88137dc5 stxr w19, w5, [x14] Thanks, Andrew Pinski > > Thanks, > Andrew > > > > > Richard > > > > > --- > > > > > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > > > index > > > ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309 > > > 100644 > > > --- a/gcc/config/aarch64/aarch64.c > > > +++ b/gcc/config/aarch64/aarch64.c > > > @@ -18375,6 +18375,9 @@ aarch64_emit_post_barrier (enum memmodel model) > > > void > > > aarch64_split_compare_and_swap (rtx operands[]) > > > { > > > + /* Split after prolog/epilog to avoid interactions with > > > shrinkwrapping. */ > > > + gcc_assert (epilogue_completed); > > > + > > > rtx rval, mem, oldval, newval, scratch, x, model_rtx; > > > machine_mode mode; > > > bool is_weak; > > > @@ -18469,6 +18472,9 @@ void > > > aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, > > > rtx mem, > > > rtx value, rtx model_rtx, rtx cond) > > > { > > > + /* Split after prolog/epilog to avoid interactions with > > > shrinkwrapping. */ > > > + gcc_assert (epilogue_completed); > > > + > > > machine_mode mode = GET_MODE (mem); > > > machine_mode wmode = (mode == DImode ? DImode : SImode); > > > const enum memmodel model = memmodel_from_int (INTVAL (model_rtx)); > > > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md > > > index > > > c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c > > > 100644 > > > --- a/gcc/config/aarch64/atomics.md > > > +++ b/gcc/config/aarch64/atomics.md > > > @@ -56,7 +56,7 @@ (define_insn_and_split "@aarch64_compare_and_swap<mode>" > > > (clobber (match_scratch:SI 7 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_compare_and_swap (operands); > > > @@ -80,7 +80,7 @@ (define_insn_and_split "@aarch64_compare_and_swap<mode>" > > > (clobber (match_scratch:SI 7 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_compare_and_swap (operands); > > > @@ -104,7 +104,7 @@ (define_insn_and_split > > > "@aarch64_compare_and_swap<mode>" > > > (clobber (match_scratch:SI 7 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_compare_and_swap (operands); > > > @@ -223,7 +223,7 @@ (define_insn_and_split "aarch64_atomic_exchange<mode>" > > > (clobber (match_scratch:SI 4 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (SET, operands[0], NULL, operands[1], > > > @@ -344,7 +344,7 @@ (define_insn_and_split > > > "aarch64_atomic_<atomic_optab><mode>" > > > (clobber (match_scratch:SI 4 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (<CODE>, NULL, operands[3], operands[0], > > > @@ -400,7 +400,7 @@ (define_insn_and_split "atomic_nand<mode>" > > > (clobber (match_scratch:SI 4 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0], > > > @@ -504,7 +504,7 @@ (define_insn_and_split > > > "aarch64_atomic_fetch_<atomic_optab><mode>" > > > (clobber (match_scratch:SI 5 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (<CODE>, operands[0], operands[4], > > > operands[1], > > > @@ -551,7 +551,7 @@ (define_insn_and_split "atomic_fetch_nand<mode>" > > > (clobber (match_scratch:SI 5 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (NOT, operands[0], operands[4], operands[1], > > > @@ -604,7 +604,7 @@ (define_insn_and_split > > > "aarch64_atomic_<atomic_optab>_fetch<mode>" > > > (clobber (match_scratch:SI 4 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (<CODE>, NULL, operands[0], operands[1], > > > @@ -628,7 +628,7 @@ (define_insn_and_split "atomic_nand_fetch<mode>" > > > (clobber (match_scratch:SI 4 "=&r"))] > > > "" > > > "#" > > > - "&& reload_completed" > > > + "&& epilogue_completed" > > > [(const_int 0)] > > > { > > > aarch64_split_atomic_op (NOT, NULL, operands[0], operands[1],