RE: [PATCH] i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}
> On Sun, Nov 6, 2022 at 2:00 PM Kong, Lingling via Gcc-patches patc...@gcc.gnu.org> wrote:
> >
> > Hi
> >
> > The patch is to add flag -mprefer-remote-atomic to control whether to
> > generate raoint insn for atomic operations.
> > Ok for trunk?
>
> Please note TARGET_AVOID_MFENCE tuning flag, introduced a while ago due to
> the fact that several targets perform LOCK OR faster than MFENCE.
>
> It was determined that MFENCE/SFENCE/LFENCE are much more complex
> instructions compared to LOCK OR, since they have to handle cases that C
> memory model never describes (some MMIO, or such). Considering that
> ordinary LOCKed operations adequately cover C memory model, and are
> probably faster than new instructions that have to cover all special cases, I
> wonder if there is really benefit to emit these insns instead of existing
> LOCKed operations. These should IMO be used only via relevant builtins.
>
> Uros.

OK, I will revert this patch on trunk, wait until optimization results from actual hardware are available, and then consider pushing the optimization patch.

> >
> > BRs,
> > Lingling
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.opt:Add -mprefer-remote-atomic.
> > * config/i386/sync.md (atomic_):
> > New define_expand.
> > (atomic_add): Rename to below one.
> > (atomic_add_1): To this.
> > (atomic_): Ditto.
> > (atomic__1): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/raoint-atomic-fetch.c: New test.
> > ---
> > gcc/config/i386/i386.opt | 4 +++
> > gcc/config/i386/sync.md | 29 ---
> > .../gcc.target/i386/raoint-atomic-fetch.c | 29 +++
> > 3 files changed, 58 insertions(+), 4 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> >
> > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > index 415c52e1bb4..abb1e5ecbdc 100644
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -1246,3 +1246,7 @@ Support PREFETCHI built-in functions and code generation.
> > mraoint
> > Target Mask(ISA2_RAOINT) Var(ix86_isa_flags2) Save Support RAOINT built-in functions and code generation.
> > +
> > +mprefer-remote-atomic
> > +Target Var(flag_prefer_remote_atomic) Init(0) Prefer use remote
> > +atomic insn for atomic operations.
> > diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
> > index e6543a5efb0..08e944fc9b7 100644
> > --- a/gcc/config/i386/sync.md
> > +++ b/gcc/config/i386/sync.md
> > @@ -37,7 +37,7 @@
> >    UNSPECV_CMPXCHG
> >    UNSPECV_XCHG
> >    UNSPECV_LOCK
> > -
> > +
> >    ;; For CMPccXADD support
> >    UNSPECV_CMPCCXADD
> >
> > @@ -791,7 +791,28 @@
> >  (define_code_iterator any_plus_logic [and ior xor plus])
> >  (define_code_attr plus_logic [(and "and") (ior "or") (xor "xor") (plus "add")])
> >
> > -(define_insn "rao_a"
> > +(define_expand "atomic_"
> > +  [(match_operand:SWI 0 "memory_operand")
> > +   (any_plus_logic:SWI (match_dup 0)
> > +                       (match_operand:SWI 1 "nonmemory_operand"))
> > +   (match_operand:SI 2 "const_int_operand")]
> > +  ""
> > +{
> > +  if (flag_prefer_remote_atomic
> > +      && TARGET_RAOINT && operands[2] == const0_rtx
> > +      && (mode == SImode || mode == DImode))
> > +    {
> > +      if (CONST_INT_P (operands[1]))
> > +        operands[1] = force_reg (mode, operands[1]);
> > +      emit_insn (maybe_gen_rao_a (, mode, operands[0],
> > +                                  operands[1]));
> > +    }
> > +  else
> > +    emit_insn (gen_atomic__1 (operands[0], operands[1],
> > +                              operands[2]));
> > +  DONE;
> > +})
> > +
> > +(define_insn "@rao_a"
> >    [(set (match_operand:SWI48 0 "memory_operand" "+m")
> >      (unspec_volatile:SWI48
> >        [(any_plus_logic:SWI48 (match_dup 0)
> > @@ -801,7 +822,7 @@
> >    "TARGET_RAOINT"
> >    "a\t{%1, %0|%0, %1}")
> >
> > -(define_insn "atomic_add"
> > +(define_insn "atomic_add_1"
> >    [(set (match_operand:SWI 0 "memory_operand" "+m")
> >      (unspec_volatile:SWI
> >        [(plus:SWI (match_dup 0)
> > @@ -855,7 +876,7 @@
> >    return "lock{%;} %K2sub{}\t{%1, %0|%0, %1}";
> >  })
> >
> > -(define_insn "atomic_"
> > +(define_insn "atomic__1"
> >    [(set (match_operand:SWI 0 "memory_operand" "+m")
> >      (unspec_volatile:SWI
> >        [(any_logic:SWI (match_dup 0)
> > diff --git a/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> > new file mode 100644
> > index 000..ac4099d888e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> > @@ -0,0 +1,29 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-mraoint -O2 -mprefer-remote-atomic" } */
> > +/* { dg-final { scan-assembler-times "aadd" 2 { target {! ia32 } } } } */
> > +/* { dg-final { scan-assembler-times "aand" 2 { target {! ia32 } } } } */
> > +/*
Re: [PATCH] i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}
On Sun, Nov 6, 2022 at 2:00 PM Kong, Lingling via Gcc-patches wrote:
>
> Hi
>
> The patch is to add flag -mprefer-remote-atomic to control whether to
> generate raoint insn for atomic operations.
> Ok for trunk?

Please note the TARGET_AVOID_MFENCE tuning flag, introduced a while ago because several targets perform LOCK OR faster than MFENCE.

It was determined that MFENCE/SFENCE/LFENCE are much more complex instructions than LOCK OR, since they have to handle cases that the C memory model never describes (some MMIO, or such). Considering that ordinary LOCKed operations adequately cover the C memory model, and are probably faster than new instructions that have to cover all special cases, I wonder whether there is really any benefit in emitting these insns instead of the existing LOCKed operations. These should IMO be used only via the relevant builtins.

Uros.

>
> BRs,
> Lingling
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt:Add -mprefer-remote-atomic.
> * config/i386/sync.md (atomic_):
> New define_expand.
> (atomic_add): Rename to below one.
> (atomic_add_1): To this.
> (atomic_): Ditto.
> (atomic__1): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/raoint-atomic-fetch.c: New test.
> ---
> gcc/config/i386/i386.opt | 4 +++
> gcc/config/i386/sync.md | 29 ---
> .../gcc.target/i386/raoint-atomic-fetch.c | 29 +++
> 3 files changed, 58 insertions(+), 4 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
>
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index 415c52e1bb4..abb1e5ecbdc 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1246,3 +1246,7 @@ Support PREFETCHI built-in functions and code generation.
> mraoint
> Target Mask(ISA2_RAOINT) Var(ix86_isa_flags2) Save Support RAOINT built-in functions and code generation.
> +
> +mprefer-remote-atomic
> +Target Var(flag_prefer_remote_atomic) Init(0) Prefer use remote atomic
> +insn for atomic operations.
> diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
> index e6543a5efb0..08e944fc9b7 100644
> --- a/gcc/config/i386/sync.md
> +++ b/gcc/config/i386/sync.md
> @@ -37,7 +37,7 @@
>    UNSPECV_CMPXCHG
>    UNSPECV_XCHG
>    UNSPECV_LOCK
> -
> +
>    ;; For CMPccXADD support
>    UNSPECV_CMPCCXADD
>
> @@ -791,7 +791,28 @@
>  (define_code_iterator any_plus_logic [and ior xor plus])
>  (define_code_attr plus_logic [(and "and") (ior "or") (xor "xor") (plus "add")])
>
> -(define_insn "rao_a"
> +(define_expand "atomic_"
> +  [(match_operand:SWI 0 "memory_operand")
> +   (any_plus_logic:SWI (match_dup 0)
> +                       (match_operand:SWI 1 "nonmemory_operand"))
> +   (match_operand:SI 2 "const_int_operand")]
> +  ""
> +{
> +  if (flag_prefer_remote_atomic
> +      && TARGET_RAOINT && operands[2] == const0_rtx
> +      && (mode == SImode || mode == DImode))
> +    {
> +      if (CONST_INT_P (operands[1]))
> +        operands[1] = force_reg (mode, operands[1]);
> +      emit_insn (maybe_gen_rao_a (, mode, operands[0],
> +                                  operands[1]));
> +    }
> +  else
> +    emit_insn (gen_atomic__1 (operands[0], operands[1],
> +                              operands[2]));
> +  DONE;
> +})
> +
> +(define_insn "@rao_a"
>    [(set (match_operand:SWI48 0 "memory_operand" "+m")
>      (unspec_volatile:SWI48
>        [(any_plus_logic:SWI48 (match_dup 0)
> @@ -801,7 +822,7 @@
>    "TARGET_RAOINT"
>    "a\t{%1, %0|%0, %1}")
>
> -(define_insn "atomic_add"
> +(define_insn "atomic_add_1"
>    [(set (match_operand:SWI 0 "memory_operand" "+m")
>      (unspec_volatile:SWI
>        [(plus:SWI (match_dup 0)
> @@ -855,7 +876,7 @@
>    return "lock{%;} %K2sub{}\t{%1, %0|%0, %1}";
>  })
>
> -(define_insn "atomic_"
> +(define_insn "atomic__1"
>    [(set (match_operand:SWI 0 "memory_operand" "+m")
>      (unspec_volatile:SWI
>        [(any_logic:SWI (match_dup 0)
> diff --git a/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> new file mode 100644
> index 000..ac4099d888e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mraoint -O2 -mprefer-remote-atomic" } */
> +/* { dg-final { scan-assembler-times "aadd" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aand" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aor" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "axor" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aadd" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "aand" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "aor" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "axor" 1 { target ia32 } } } */
> +vo
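
For readers unfamiliar with the TARGET_AVOID_MFENCE point raised in the reply above, here is a minimal sketch of the kind of code it affects. The names `flag_a`, `flag_b` and `publish_then_check` are hypothetical and not part of the patch. The assumption, to be checked against the generated assembly for a given -mtune setting, is that GCC may implement the sequentially consistent fence below either as MFENCE or as a LOCK OR on a stack slot, depending on that tuning flag.

```c
/* Hypothetical flags for illustration only; not taken from the patch.  */
static int flag_a;
static int flag_b;

/* Store, full fence, then load: the classic store-buffering shape.
   Only the SEQ_CST fence needs a real barrier instruction on x86;
   which instruction is chosen is a tuning decision made by the compiler.  */
int
publish_then_check (void)
{
  __atomic_store_n (&flag_a, 1, __ATOMIC_RELAXED);
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
  return __atomic_load_n (&flag_b, __ATOMIC_RELAXED);
}
```

Compiling this with `-O2 -S` under different `-mtune=` values is one way to see which form a given GCC build picks.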
Re: [PATCH] i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}
On Sun, Nov 6, 2022 at 9:00 PM Kong, Lingling via Gcc-patches wrote:
>
> Hi
>
> The patch is to add flag -mprefer-remote-atomic to control whether to
> generate raoint insn for atomic operations.
> Ok for trunk?

OK with the two small adjustments below.

>
> BRs,
> Lingling
>
> gcc/ChangeLog:
>
> * config/i386/i386.opt:Add -mprefer-remote-atomic.

Please also update *x86 options* in gcc/doc/invoke.texi.

> * config/i386/sync.md (atomic_):
> New define_expand.
> (atomic_add): Rename to below one.
> (atomic_add_1): To this.
> (atomic_): Ditto.
> (atomic__1): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/raoint-atomic-fetch.c: New test.
> ---
> gcc/config/i386/i386.opt | 4 +++
> gcc/config/i386/sync.md | 29 ---
> .../gcc.target/i386/raoint-atomic-fetch.c | 29 +++
> 3 files changed, 58 insertions(+), 4 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
>
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index 415c52e1bb4..abb1e5ecbdc 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1246,3 +1246,7 @@ Support PREFETCHI built-in functions and code generation.
> mraoint
> Target Mask(ISA2_RAOINT) Var(ix86_isa_flags2) Save Support RAOINT built-in functions and code generation.
> +
> +mprefer-remote-atomic
> +Target Var(flag_prefer_remote_atomic) Init(0) Prefer use remote atomic
> +insn for atomic operations.
> diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
> index e6543a5efb0..08e944fc9b7 100644
> --- a/gcc/config/i386/sync.md
> +++ b/gcc/config/i386/sync.md
> @@ -37,7 +37,7 @@
>    UNSPECV_CMPXCHG
>    UNSPECV_XCHG
>    UNSPECV_LOCK
> -
> +

Please remove this change.

>    ;; For CMPccXADD support
>    UNSPECV_CMPCCXADD
>
> @@ -791,7 +791,28 @@
>  (define_code_iterator any_plus_logic [and ior xor plus])
>  (define_code_attr plus_logic [(and "and") (ior "or") (xor "xor") (plus "add")])
>
> -(define_insn "rao_a"
> +(define_expand "atomic_"
> +  [(match_operand:SWI 0 "memory_operand")
> +   (any_plus_logic:SWI (match_dup 0)
> +                       (match_operand:SWI 1 "nonmemory_operand"))
> +   (match_operand:SI 2 "const_int_operand")]
> +  ""
> +{
> +  if (flag_prefer_remote_atomic
> +      && TARGET_RAOINT && operands[2] == const0_rtx
> +      && (mode == SImode || mode == DImode))
> +    {
> +      if (CONST_INT_P (operands[1]))
> +        operands[1] = force_reg (mode, operands[1]);
> +      emit_insn (maybe_gen_rao_a (, mode, operands[0],
> +                                  operands[1]));
> +    }
> +  else
> +    emit_insn (gen_atomic__1 (operands[0], operands[1],
> +                              operands[2]));
> +  DONE;
> +})
> +
> +(define_insn "@rao_a"
>    [(set (match_operand:SWI48 0 "memory_operand" "+m")
>      (unspec_volatile:SWI48
>        [(any_plus_logic:SWI48 (match_dup 0)
> @@ -801,7 +822,7 @@
>    "TARGET_RAOINT"
>    "a\t{%1, %0|%0, %1}")
>
> -(define_insn "atomic_add"
> +(define_insn "atomic_add_1"
>    [(set (match_operand:SWI 0 "memory_operand" "+m")
>      (unspec_volatile:SWI
>        [(plus:SWI (match_dup 0)
> @@ -855,7 +876,7 @@
>    return "lock{%;} %K2sub{}\t{%1, %0|%0, %1}";
>  })
>
> -(define_insn "atomic_"
> +(define_insn "atomic__1"
>    [(set (match_operand:SWI 0 "memory_operand" "+m")
>      (unspec_volatile:SWI
>        [(any_logic:SWI (match_dup 0)
> diff --git a/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> new file mode 100644
> index 000..ac4099d888e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/raoint-atomic-fetch.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mraoint -O2 -mprefer-remote-atomic" } */
> +/* { dg-final { scan-assembler-times "aadd" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aand" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aor" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "axor" 2 { target {! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "aadd" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "aand" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "aor" 1 { target ia32 } } } */
> +/* { dg-final { scan-assembler-times "axor" 1 { target ia32 } } } */
> +volatile int x; volatile long long y; int *a; long long *b;
> +
> +void extern
> +rao_int_test (void)
> +{
> +  __atomic_add_fetch (a, x, __ATOMIC_RELAXED);
> +  __atomic_and_fetch (a, x, __ATOMIC_RELAXED);
> +  __atomic_or_fetch (a, x, __ATOMIC_RELAXED);
> +  __atomic_xor_fetch (a, x, __ATOMIC_RELAXED);
> +#ifdef __x86_64__
> +  __atomic_add_fetch (b, y, __ATOMIC_RELAXED);
> +  __atomic_and_fetch (b, y, __ATOMIC_RELAXED);
> +  __atomic_or_fetch (b, y, __ATOMIC_RELAXED);
> +  __atomic_xor_fetch (b, y, __ATOMIC_RELAXED);
> +#endif
> +}
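
As a companion to the compile-only test above, the following is a small runnable sketch of the source pattern the new expander looks at: atomic read-modify-write builtins with relaxed ordering, on 32-bit or 64-bit objects, whose results are discarded. The names `counter32`, `counter64`, `update_counters` and the two reader functions are hypothetical and only serve the illustration. Per the posted condition (`operands[2] == const0_rtx`, SImode or DImode), such calls would be candidates for the RAO-INT forms when both -mraoint and -mprefer-remote-atomic are given; without those flags GCC falls back to the existing LOCK-prefixed patterns, which is what this sketch exercises on ordinary hardware.

```c
/* Hypothetical counters for illustration only; not part of the patch.  */
static int counter32;
static long long counter64;

/* Relaxed read-modify-write operations whose return values are unused.
   This is the shape the posted define_expand checks for.  */
void
update_counters (int v32, long long v64)
{
  __atomic_add_fetch (&counter32, v32, __ATOMIC_RELAXED);
  __atomic_or_fetch (&counter32, 0x100, __ATOMIC_RELAXED);
  __atomic_add_fetch (&counter64, v64, __ATOMIC_RELAXED);
  __atomic_xor_fetch (&counter64, 1, __ATOMIC_RELAXED);
}

/* Plain relaxed loads so a caller can observe the results.  */
int
read_counter32 (void)
{
  return __atomic_load_n (&counter32, __ATOMIC_RELAXED);
}

long long
read_counter64 (void)
{
  return __atomic_load_n (&counter64, __ATOMIC_RELAXED);
}
```

Calling `update_counters (5, 10)` once on freshly zeroed counters leaves `counter32` at `5 | 0x100` and `counter64` at `10 ^ 1`, regardless of which instruction form the compiler selects.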