Re: [PATCH v3] tree-optimization/95821 - Convert strlen + strchr to memchr
On Sat, Jul 9, 2022 at 8:59 AM Jeff Law via Gcc-patches wrote: > > > > On 6/21/2022 12:12 PM, Noah Goldstein via Gcc-patches wrote: > > This patch allows for strchr(x, c) to the replace with memchr(x, c, > > strlen(x) + 1) if strlen(x) has already been computed earlier in the > > tree. > > > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > > > Since memchr doesn't need to re-find the null terminator it is faster > > than strchr. > > > > bootstrapped and tested on x86_64-linux. > > > > PR tree-optimization/95821 > > > > gcc/ > > > > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > > memchr instead of strchr if strlen already computed. > > > > gcc/testsuite/ > > > > * c-c++-common/pr95821-1.c: New test. > > * c-c++-common/pr95821-2.c: New test. > > * c-c++-common/pr95821-3.c: New test. > > * c-c++-common/pr95821-4.c: New test. > > * c-c++-common/pr95821-5.c: New test. > > * c-c++-common/pr95821-6.c: New test. > > * c-c++-common/pr95821-7.c: New test. > > * c-c++-common/pr95821-8.c: New test. > Given Jakub's involvement to-date and the fact this touches > tree-ssa-strlen.cc I think Jakub should have final ACK/NAK on this. > > jeff > Ping.
Re: [PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr
On Tue, Jun 21, 2022 at 11:13 AM Noah Goldstein wrote: > > On Tue, Jun 21, 2022 at 5:01 AM Jakub Jelinek wrote: > > > > On Mon, Jun 20, 2022 at 02:42:20PM -0700, Noah Goldstein wrote: > > > This patch allows for strchr(x, c) to the replace with memchr(x, c, > > > strlen(x) + 1) if strlen(x) has already been computed earlier in the > > > tree. > > > > > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > > > > > Since memchr doesn't need to re-find the null terminator it is faster > > > than strchr. > > > > > > bootstrapped and tested on x86_64-linux. > > > > > > PR tree-optimization/95821 > > > > This should be indented by a single tab, not two. > > Fixed in V3 > > > > > > gcc/ > > > > > > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > > > memchr instead of strchr if strlen already computed. > > > > > > gcc/testsuite/ > > > > > > * c-c++-common/pr95821-1.c: New test. > > > * c-c++-common/pr95821-2.c: New test. > > > * c-c++-common/pr95821-3.c: New test. > > > * c-c++-common/pr95821-4.c: New test. > > > * c-c++-common/pr95821-5.c: New test. > > > * c-c++-common/pr95821-6.c: New test. > > > * c-c++-common/pr95821-7.c: New test. > > > * c-c++-common/pr95821-8.c: New test. > > > --- a/gcc/tree-ssa-strlen.cc > > > +++ b/gcc/tree-ssa-strlen.cc > > > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () > > > } > > > } > > > > > > -/* Handle a strchr call. If strlen of the first argument is known, > > > replace > > > - the strchr (x, 0) call with the endptr or x + strlen, otherwise > > > remember > > > - that lhs of the call is endptr and strlen of the argument is endptr - > > > x. */ > > > +/* Handle a strchr call. If strlen of the first argument is known, > > > + replace the strchr (x, 0) call with the endptr or x + strlen, > > > + otherwise remember that lhs of the call is endptr and strlen of the > > > + argument is endptr - x. If strlen of x is not know but has been > > > + computed earlier in the tree then replace strchr(x, c) to > > > > Still missing space before ( above. > > Sorry, fixed that in V3. > > > > > + memchr (x, c, strlen + 1). */ > > > > > > void > > > strlen_pass::handle_builtin_strchr () > > > @@ -2418,8 +2421,12 @@ strlen_pass::handle_builtin_strchr () > > >if (lhs == NULL_TREE) > > > return; > > > > > > - if (!integer_zerop (gimple_call_arg (stmt, 1))) > > > -return; > > > + tree chr = gimple_call_arg (stmt, 1); > > > + /* strchr only uses the lower char of input so to check if its > > > + strchr (s, zerop) only take into account the lower char. */ > > > + bool is_strchr_zerop > > > + = (TREE_CODE (chr) == INTEGER_CST > > > + && integer_zerop (fold_convert (char_type_node, chr))); > > > > The indentation rule is that = should be 2 columns to the right from bool, > > so > > > > Fixed in V3. > > bool is_strchr_zerop > > = (TREE_CODE (chr) == INTEGER_CST > >&& integer_zerop (fold_convert (char_type_node, chr))); > > > > > + /* If its not strchr (s, zerop) then try and convert to > > > + memchr since strlen has already been computed. */ > > > > This comment still has the second line weirdly indented. > > Sorry, have emacs with 4-space tabs so things that look right arent > as they seem :/ > > Fixed in V3 I believe. > > > > > + tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR); > > > + > > > + /* Only need to check length strlen (s) + 1 if chr may be > > > zero. > > > + Otherwise the last chr (which is known to be zero) can never > > > + be a match. NB: We don't need to test if chr is a non-zero > > > + integer const with zero char bits because that is taken into > > > + account with is_strchr_zerop. */ > > > + if (!tree_expr_nonzero_p (chr)) > > > > The above is unsafe though. tree_expr_nonzero_p (chr) will return true > > if say VRP can prove it is not zero, but because of the implicit > > (char) chr cast done by the function we need something different. > > Say if VRP determines that chr is in [1, INT_MAX] or even just [255, 257] > > it doesn't mean (char) chr won't be 0. > > So, as I've tried to explain in the previous mail, it can be done e.g. with > > Added your code in V3. Thanks for the help. > > bool chr_nonzero = false; > > if (TREE_CODE (chr) == INTEGER_CST > > && integer_nonzerop (fold_convert (char_type_node, chr))) > > chr_nonzero = true; > > else if (TREE_CODE (chr) == SSA_NAME > >&& CHAR_TYPE_SIZE < INT_TYPE_SIZE) > > { > > value_range r; > > /* Try to determine using ranges if (char) chr must > > be always 0. That is true e.g. if all the subranges > > have the INT_TYPE_SIZE -
Re: [PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr
On Tue, Jun 21, 2022 at 5:01 AM Jakub Jelinek wrote: > > On Mon, Jun 20, 2022 at 02:42:20PM -0700, Noah Goldstein wrote: > > This patch allows for strchr(x, c) to the replace with memchr(x, c, > > strlen(x) + 1) if strlen(x) has already been computed earlier in the > > tree. > > > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > > > Since memchr doesn't need to re-find the null terminator it is faster > > than strchr. > > > > bootstrapped and tested on x86_64-linux. > > > > PR tree-optimization/95821 > > This should be indented by a single tab, not two. Fixed in V3 > > > > gcc/ > > > > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > > memchr instead of strchr if strlen already computed. > > > > gcc/testsuite/ > > > > * c-c++-common/pr95821-1.c: New test. > > * c-c++-common/pr95821-2.c: New test. > > * c-c++-common/pr95821-3.c: New test. > > * c-c++-common/pr95821-4.c: New test. > > * c-c++-common/pr95821-5.c: New test. > > * c-c++-common/pr95821-6.c: New test. > > * c-c++-common/pr95821-7.c: New test. > > * c-c++-common/pr95821-8.c: New test. > > --- a/gcc/tree-ssa-strlen.cc > > +++ b/gcc/tree-ssa-strlen.cc > > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () > > } > > } > > > > -/* Handle a strchr call. If strlen of the first argument is known, replace > > - the strchr (x, 0) call with the endptr or x + strlen, otherwise remember > > - that lhs of the call is endptr and strlen of the argument is endptr - > > x. */ > > +/* Handle a strchr call. If strlen of the first argument is known, > > + replace the strchr (x, 0) call with the endptr or x + strlen, > > + otherwise remember that lhs of the call is endptr and strlen of the > > + argument is endptr - x. If strlen of x is not know but has been > > + computed earlier in the tree then replace strchr(x, c) to > > Still missing space before ( above. Sorry, fixed that in V3. > > > + memchr (x, c, strlen + 1). */ > > > > void > > strlen_pass::handle_builtin_strchr () > > @@ -2418,8 +2421,12 @@ strlen_pass::handle_builtin_strchr () > >if (lhs == NULL_TREE) > > return; > > > > - if (!integer_zerop (gimple_call_arg (stmt, 1))) > > -return; > > + tree chr = gimple_call_arg (stmt, 1); > > + /* strchr only uses the lower char of input so to check if its > > + strchr (s, zerop) only take into account the lower char. */ > > + bool is_strchr_zerop > > + = (TREE_CODE (chr) == INTEGER_CST > > + && integer_zerop (fold_convert (char_type_node, chr))); > > The indentation rule is that = should be 2 columns to the right from bool, > so > Fixed in V3. > bool is_strchr_zerop > = (TREE_CODE (chr) == INTEGER_CST >&& integer_zerop (fold_convert (char_type_node, chr))); > > > + /* If its not strchr (s, zerop) then try and convert to > > + memchr since strlen has already been computed. */ > > This comment still has the second line weirdly indented. Sorry, have emacs with 4-space tabs so things that look right arent as they seem :/ Fixed in V3 I believe. > > > + tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR); > > + > > + /* Only need to check length strlen (s) + 1 if chr may be zero. > > + Otherwise the last chr (which is known to be zero) can never > > + be a match. NB: We don't need to test if chr is a non-zero > > + integer const with zero char bits because that is taken into > > + account with is_strchr_zerop. */ > > + if (!tree_expr_nonzero_p (chr)) > > The above is unsafe though. tree_expr_nonzero_p (chr) will return true > if say VRP can prove it is not zero, but because of the implicit > (char) chr cast done by the function we need something different. > Say if VRP determines that chr is in [1, INT_MAX] or even just [255, 257] > it doesn't mean (char) chr won't be 0. > So, as I've tried to explain in the previous mail, it can be done e.g. with Added your code in V3. Thanks for the help. > bool chr_nonzero = false; > if (TREE_CODE (chr) == INTEGER_CST > && integer_nonzerop (fold_convert (char_type_node, chr))) > chr_nonzero = true; > else if (TREE_CODE (chr) == SSA_NAME >&& CHAR_TYPE_SIZE < INT_TYPE_SIZE) > { > value_range r; > /* Try to determine using ranges if (char) chr must > be always 0. That is true e.g. if all the subranges > have the INT_TYPE_SIZE - CHAR_TYPE_SIZE bits > the same on lower and upper bounds. */ > if (get_range_query (cfun)->range_of_expr (r, chr, stmt) > && r.kind () == VR_RANGE) > { > chr_nonzero = true; > wide_int mask =
[PATCH v3] tree-optimization/95821 - Convert strlen + strchr to memchr
This patch allows for strchr(x, c) to the replace with memchr(x, c, strlen(x) + 1) if strlen(x) has already been computed earlier in the tree. Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 Since memchr doesn't need to re-find the null terminator it is faster than strchr. bootstrapped and tested on x86_64-linux. PR tree-optimization/95821 gcc/ * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit memchr instead of strchr if strlen already computed. gcc/testsuite/ * c-c++-common/pr95821-1.c: New test. * c-c++-common/pr95821-2.c: New test. * c-c++-common/pr95821-3.c: New test. * c-c++-common/pr95821-4.c: New test. * c-c++-common/pr95821-5.c: New test. * c-c++-common/pr95821-6.c: New test. * c-c++-common/pr95821-7.c: New test. * c-c++-common/pr95821-8.c: New test. --- gcc/testsuite/c-c++-common/pr95821-1.c | 15 gcc/testsuite/c-c++-common/pr95821-2.c | 17 gcc/testsuite/c-c++-common/pr95821-3.c | 17 gcc/testsuite/c-c++-common/pr95821-4.c | 16 gcc/testsuite/c-c++-common/pr95821-5.c | 19 + gcc/testsuite/c-c++-common/pr95821-6.c | 18 gcc/testsuite/c-c++-common/pr95821-7.c | 18 gcc/testsuite/c-c++-common/pr95821-8.c | 19 + gcc/tree-ssa-strlen.cc | 113 - 9 files changed, 233 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/pr95821-1.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-2.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-3.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-4.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-5.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-6.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-7.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-8.c diff --git a/gcc/testsuite/c-c++-common/pr95821-1.c b/gcc/testsuite/c-c++-common/pr95821-1.c new file mode 100644 index 000..e0beb609ea2 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-1.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char *s, char c) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-2.c b/gcc/testsuite/c-c++-common/pr95821-2.c new file mode 100644 index 000..5429f0586be --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-2.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-3.c b/gcc/testsuite/c-c++-common/pr95821-3.c new file mode 100644 index 000..bc929c6044b --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-3.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char * __restrict s, char c, char * __restrict other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-4.c b/gcc/testsuite/c-c++-common/pr95821-4.c new file mode 100644 index 000..684b41d5b70 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-4.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + return strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-5.c b/gcc/testsuite/c-c++-common/pr95821-5.c new file mode 100644 index 000..00c1d93b614 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-5.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return strchr(s, c); +} +int main() {} diff --git a/gcc/testsuite/c-c++-common/pr95821-6.c b/gcc/testsuite/c-c++-common/pr95821-6.c new file mode 100644 index 000..dec839de5ea --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-6.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char * __restrict s, char c,
[PATCH v2] tree-optimization/95821 - Convert strlen + strchr to memchr
This patch allows for strchr(x, c) to the replace with memchr(x, c, strlen(x) + 1) if strlen(x) has already been computed earlier in the tree. Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 Since memchr doesn't need to re-find the null terminator it is faster than strchr. bootstrapped and tested on x86_64-linux. PR tree-optimization/95821 gcc/ * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit memchr instead of strchr if strlen already computed. gcc/testsuite/ * c-c++-common/pr95821-1.c: New test. * c-c++-common/pr95821-2.c: New test. * c-c++-common/pr95821-3.c: New test. * c-c++-common/pr95821-4.c: New test. * c-c++-common/pr95821-5.c: New test. * c-c++-common/pr95821-6.c: New test. * c-c++-common/pr95821-7.c: New test. * c-c++-common/pr95821-8.c: New test. --- gcc/testsuite/c-c++-common/pr95821-1.c | 15 + gcc/testsuite/c-c++-common/pr95821-2.c | 17 + gcc/testsuite/c-c++-common/pr95821-3.c | 17 + gcc/testsuite/c-c++-common/pr95821-4.c | 16 + gcc/testsuite/c-c++-common/pr95821-5.c | 19 ++ gcc/testsuite/c-c++-common/pr95821-6.c | 18 ++ gcc/testsuite/c-c++-common/pr95821-7.c | 18 ++ gcc/testsuite/c-c++-common/pr95821-8.c | 19 ++ gcc/tree-ssa-strlen.cc | 89 -- 9 files changed, 209 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/pr95821-1.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-2.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-3.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-4.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-5.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-6.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-7.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-8.c diff --git a/gcc/testsuite/c-c++-common/pr95821-1.c b/gcc/testsuite/c-c++-common/pr95821-1.c new file mode 100644 index 000..e0beb609ea2 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-1.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char *s, char c) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-2.c b/gcc/testsuite/c-c++-common/pr95821-2.c new file mode 100644 index 000..5429f0586be --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-2.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-3.c b/gcc/testsuite/c-c++-common/pr95821-3.c new file mode 100644 index 000..bc929c6044b --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-3.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char * __restrict s, char c, char * __restrict other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-4.c b/gcc/testsuite/c-c++-common/pr95821-4.c new file mode 100644 index 000..684b41d5b70 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-4.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + return strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-5.c b/gcc/testsuite/c-c++-common/pr95821-5.c new file mode 100644 index 000..00c1d93b614 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-5.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return strchr(s, c); +} +int main() {} diff --git a/gcc/testsuite/c-c++-common/pr95821-6.c b/gcc/testsuite/c-c++-common/pr95821-6.c new file mode 100644 index 000..dec839de5ea --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-6.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char * __restrict
Re: [PATCH v1] tree-optimization/95821 - Convert strlen + strchr to memchr
On Mon, Jun 20, 2022 at 10:29 AM Jakub Jelinek wrote: > > On Mon, Jun 20, 2022 at 09:35:36AM -0700, Noah Goldstein via Gcc-patches > wrote: > > This patch allows for strchr(x, c) to the replace with memchr(x, c, > > strlen(x) + 1) if strlen(x) has already been computed earlier in the > > tree. > > > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > > > Since memchr doesn't need to re-find the null terminator it is faster > > than strchr. > > Do you have a GCC Copyright assignment on file, or do you want to submit > this under DCO ( https://gcc.gnu.org/dco.html )? If the latter, there > should be a Signed-off-by: line, both in the mail and later commit. > > > > bootstrapped and tested on x86_64-linux. > > > > gcc/ > > > > As it fixes a GCC bugzilla bug, the ChangeLog entry should start with > PR tree-optimization/95821 > line. > > * tree-ssa-strlen.cc: Emit memchr instead of strchr if strlen > > already computed. > > All the indented lines in ChangeLog should be indented by tab. > You are modifying strlen_pass::handle_builtin_strchr function, so after > tree-ssa-strlen.cc there should be that function name in parens: > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > memchr ... Fixed in v2. > > > > > gcc/testsuite/ > > > > * c-c++-common/pr95821-1.c > > * c-c++-common/pr95821-2.c > > * c-c++-common/pr95821-3.c > > * c-c++-common/pr95821-4.c > > * c-c++-common/pr95821-5.c > > * c-c++-common/pr95821-6.c > > All the above lines should end with ": New test." after .c Fixed in V2. > > > --- a/gcc/tree-ssa-strlen.cc > > +++ b/gcc/tree-ssa-strlen.cc > > How does the patch relate to the one that H.J. attached in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821#c4 ? > > > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () > > } > > } > > > > -/* Handle a strchr call. If strlen of the first argument is known, replace > > - the strchr (x, 0) call with the endptr or x + strlen, otherwise remember > > - that lhs of the call is endptr and strlen of the argument is endptr - > > x. */ > > +/* Handle a strchr call. If strlen of the first argument is known, > > + replace the strchr (x, 0) call with the endptr or x + strlen, > > + otherwise remember that lhs of the call is endptr and strlen of the > > + argument is endptr - x. If strlen of x is not know but has been > > + computed earlier in the tree then replace strchr(x, c) to > > + memchr(x, c, strlen + 1). */ > > Space before ( even in comments. Fixed in V2. > > > > > void > > strlen_pass::handle_builtin_strchr () > > @@ -2418,8 +2421,8 @@ strlen_pass::handle_builtin_strchr () > >if (lhs == NULL_TREE) > > return; > > > > - if (!integer_zerop (gimple_call_arg (stmt, 1))) > > -return; > > + tree chr = gimple_call_arg (stmt, 1); > > + bool is_strchr_zerop = integer_zerop (chr); > > > >tree src = gimple_call_arg (stmt, 0); > > > > @@ -2452,32 +2455,56 @@ strlen_pass::handle_builtin_strchr () > > fprintf (dump_file, "Optimizing: "); > > print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > > } > > - if (si != NULL && si->endptr != NULL_TREE) > > + if (!is_strchr_zerop) > > { > > - rhs = unshare_expr (si->endptr); > > - if (!useless_type_conversion_p (TREE_TYPE (lhs), > > - TREE_TYPE (rhs))) > > - rhs = fold_convert_loc (loc, TREE_TYPE (lhs), rhs); > > + /* If its not strchr(s, zerop) then try and convert to > > +memchr if strlen has already been computed. */ > > Again, space before (. The second line is weirdly formatted, should > be indented below If. Fixed in V2. > > > + tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR); > > + tree one = build_int_cst (TREE_TYPE (rhs), 1); > > + rhs = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (rhs), > > + unshare_expr (rhs), one); > > + tree size = make_ssa_name (TREE_TYPE (rhs)); > > + gassign *size_stmt = gimple_build_assign (size, rhs); > > + gsi_insert_before (_gsi, size_stmt, GSI_SAME_STMT); > > + rhs = size; > > + if (!update_gimple_call (_gsi, fn, 3, src, chr, rhs)) > > + return; > > I think we shou
Re: [PATCH v1] tree-optimization/95821 - Convert strlen + strchr to memchr
On Mon, Jun 20, 2022 at 12:04 PM Jakub Jelinek wrote: > > On Mon, Jun 20, 2022 at 11:48:24AM -0700, Noah Goldstein wrote: > > > I think we should differentiate more. If integer_nonzerop (chr) > > > or perhaps better tree_expr_nonzero_p (chr), then it is better > > > to optimize t = strlen (x); ... p = strchr (x, c); to > > > t = strlen (x); ... p = memchr (x, c, t); > > What do you mean by differentiate more? More comments? Or > > seperate the logic more? > > Different code, don't add the 1 to the strlen value whenever you know > that chr can't be possibly 0 (either it is a non-zero constant, > or the compiler can prove it won't be zero at runtime otherwise). > Because if c is not 0, then memchr (x, c, strlen (x)) == memchr (x, c, strlen > (x) + 1), > either c is among the first strlen (x) chars, or it will return NULL > because x[strlen (x)] == 0. > > It actually is slightly more complicated, strchr second argument is int, > but we just care about the low 8 bits. > For TREE_CODE (chr) == INTEGER_CST, it is still trivial, > say integer_nonzerop (fold_convert (char_type_node, chr)) > or equivalent using wide-int.h APIs. > For SSA_NAMEs, we'd need get_zero_bits API, but we only have > get_nonzero_bits, but we could say at least handle the case where > get_ssa_name_range_info gives a VR_RANGE or set of them where none of > the ranges include integral multiplies of 256. > But for start perhaps just handling INTEGER_CST chr would be good enough. Got it. Will have that in V2. We could also make the initial: bool is_strchr_zerop = integer_zerop (chr); Only check the lower 8 bits. > > Jakub >
Re: [PATCH v1] tree-optimization/95821 - Convert strlen + strchr to memchr
On Mon, Jun 20, 2022 at 10:29 AM Jakub Jelinek wrote: > > On Mon, Jun 20, 2022 at 09:35:36AM -0700, Noah Goldstein via Gcc-patches > wrote: > > This patch allows for strchr(x, c) to the replace with memchr(x, c, > > strlen(x) + 1) if strlen(x) has already been computed earlier in the > > tree. > > > > Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 > > > > Since memchr doesn't need to re-find the null terminator it is faster > > than strchr. > > Do you have a GCC Copyright assignment on file, or do you want to submit > this under DCO ( https://gcc.gnu.org/dco.html )? If the latter, there > should be a Signed-off-by: line, both in the mail and later commit. > > > > bootstrapped and tested on x86_64-linux. > > > > gcc/ > > > > As it fixes a GCC bugzilla bug, the ChangeLog entry should start with > PR tree-optimization/95821 > line. > > * tree-ssa-strlen.cc: Emit memchr instead of strchr if strlen > > already computed. > > All the indented lines in ChangeLog should be indented by tab. > You are modifying strlen_pass::handle_builtin_strchr function, so after > tree-ssa-strlen.cc there should be that function name in parens: > * tree-ssa-strlen.cc (strlen_pass::handle_builtin_strchr): Emit > memchr ... > > > > > gcc/testsuite/ > > > > * c-c++-common/pr95821-1.c > > * c-c++-common/pr95821-2.c > > * c-c++-common/pr95821-3.c > > * c-c++-common/pr95821-4.c > > * c-c++-common/pr95821-5.c > > * c-c++-common/pr95821-6.c > > All the above lines should end with ": New test." after .c > > > --- a/gcc/tree-ssa-strlen.cc > > +++ b/gcc/tree-ssa-strlen.cc > > How does the patch relate to the one that H.J. attached in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821#c4 ? > > > @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () > > } > > } > > > > -/* Handle a strchr call. If strlen of the first argument is known, replace > > - the strchr (x, 0) call with the endptr or x + strlen, otherwise remember > > - that lhs of the call is endptr and strlen of the argument is endptr - > > x. */ > > +/* Handle a strchr call. If strlen of the first argument is known, > > + replace the strchr (x, 0) call with the endptr or x + strlen, > > + otherwise remember that lhs of the call is endptr and strlen of the > > + argument is endptr - x. If strlen of x is not know but has been > > + computed earlier in the tree then replace strchr(x, c) to > > + memchr(x, c, strlen + 1). */ > > Space before ( even in comments. > > > > > void > > strlen_pass::handle_builtin_strchr () > > @@ -2418,8 +2421,8 @@ strlen_pass::handle_builtin_strchr () > >if (lhs == NULL_TREE) > > return; > > > > - if (!integer_zerop (gimple_call_arg (stmt, 1))) > > -return; > > + tree chr = gimple_call_arg (stmt, 1); > > + bool is_strchr_zerop = integer_zerop (chr); > > > >tree src = gimple_call_arg (stmt, 0); > > > > @@ -2452,32 +2455,56 @@ strlen_pass::handle_builtin_strchr () > > fprintf (dump_file, "Optimizing: "); > > print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > > } > > - if (si != NULL && si->endptr != NULL_TREE) > > + if (!is_strchr_zerop) > > { > > - rhs = unshare_expr (si->endptr); > > - if (!useless_type_conversion_p (TREE_TYPE (lhs), > > - TREE_TYPE (rhs))) > > - rhs = fold_convert_loc (loc, TREE_TYPE (lhs), rhs); > > + /* If its not strchr(s, zerop) then try and convert to > > +memchr if strlen has already been computed. */ > > Again, space before (. The second line is weirdly formatted, should > be indented below If. > > > + tree fn = builtin_decl_explicit (BUILT_IN_MEMCHR); > > + tree one = build_int_cst (TREE_TYPE (rhs), 1); > > + rhs = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (rhs), > > + unshare_expr (rhs), one); > > + tree size = make_ssa_name (TREE_TYPE (rhs)); > > + gassign *size_stmt = gimple_build_assign (size, rhs); > > + gsi_insert_before (_gsi, size_stmt, GSI_SAME_STMT); > > + rhs = size; > > + if (!update_gimple_call (_gsi, fn, 3, src, chr, rhs)) > > + return; > > I think we should differentiate more. If integer_nonzerop (chr) > or perhaps better tree_expr_nonzero_p (chr), then it is better > to optimize t = strlen (x); ... p = strchr (x, c); to > t = strlen (x); ... p = memchr (x, c, t); What do you mean by differentiate more? More comments? Or seperate the logic more? > the t + 1 is only needed if c might be zero. > > > + /* Don't update strlen of lhs if search-char was non-zero. */ > > Wasn't known to be zero is the right thing. > > Jakub >
[PATCH v1] tree-optimization/95821 - Convert strlen + strchr to memchr
This patch allows for strchr(x, c) to the replace with memchr(x, c, strlen(x) + 1) if strlen(x) has already been computed earlier in the tree. Handles PR95821: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95821 Since memchr doesn't need to re-find the null terminator it is faster than strchr. bootstrapped and tested on x86_64-linux. gcc/ * tree-ssa-strlen.cc: Emit memchr instead of strchr if strlen already computed. gcc/testsuite/ * c-c++-common/pr95821-1.c * c-c++-common/pr95821-2.c * c-c++-common/pr95821-3.c * c-c++-common/pr95821-4.c * c-c++-common/pr95821-5.c * c-c++-common/pr95821-6.c --- gcc/testsuite/c-c++-common/pr95821-1.c | 15 ++ gcc/testsuite/c-c++-common/pr95821-2.c | 17 +++ gcc/testsuite/c-c++-common/pr95821-3.c | 17 +++ gcc/testsuite/c-c++-common/pr95821-4.c | 16 ++ gcc/testsuite/c-c++-common/pr95821-5.c | 19 +++ gcc/testsuite/c-c++-common/pr95821-6.c | 18 +++ gcc/tree-ssa-strlen.cc | 69 +++--- 7 files changed, 152 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/pr95821-1.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-2.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-3.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-4.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-5.c create mode 100644 gcc/testsuite/c-c++-common/pr95821-6.c diff --git a/gcc/testsuite/c-c++-common/pr95821-1.c b/gcc/testsuite/c-c++-common/pr95821-1.c new file mode 100644 index 000..e0beb609ea2 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-1.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char *s, char c) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-2.c b/gcc/testsuite/c-c++-common/pr95821-2.c new file mode 100644 index 000..5429f0586be --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-2.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-3.c b/gcc/testsuite/c-c++-common/pr95821-3.c new file mode 100644 index 000..bc929c6044b --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-3.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include + +char * +foo (char * __restrict s, char c, char * __restrict other) +{ + size_t slen = __builtin_strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return __builtin_strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-4.c b/gcc/testsuite/c-c++-common/pr95821-4.c new file mode 100644 index 000..684b41d5b70 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-4.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + return strchr(s, c); +} diff --git a/gcc/testsuite/c-c++-common/pr95821-5.c b/gcc/testsuite/c-c++-common/pr95821-5.c new file mode 100644 index 000..00c1d93b614 --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-5.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "memchr" } } */ + +#include +#include + +char * +foo (char *s, char c, char * other) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return strchr(s, c); +} +int main() {} diff --git a/gcc/testsuite/c-c++-common/pr95821-6.c b/gcc/testsuite/c-c++-common/pr95821-6.c new file mode 100644 index 000..dec839de5ea --- /dev/null +++ b/gcc/testsuite/c-c++-common/pr95821-6.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler "memchr" } } */ + +#include +#include + +char * +foo (char * __restrict s, char c, char * __restrict other) +{ + size_t slen = strlen(s); + if(slen < 1000) + return NULL; + + *other = 0; + + return strchr(s, c); +} diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc index 1d4c0f78fbf..d959a530ea0 100644 --- a/gcc/tree-ssa-strlen.cc +++ b/gcc/tree-ssa-strlen.cc @@ -2405,9 +2405,12 @@ strlen_pass::handle_builtin_strlen () } } -/* Handle a strchr call. If strlen of the first argument is known,
Re: [PATCH] c++: implement C++17 hardware interference size
On Fri, Jul 16, 2021 at 3:37 PM Matthias Kretz wrote: > On Friday, 16 July 2021 19:20:29 CEST Noah Goldstein wrote: > > On Fri, Jul 16, 2021 at 11:12 AM Matthias Kretz wrote: > > > I don't understand how this feature would lead to false sharing. But > maybe > > > I > > > misunderstand the spatial prefetcher. The first access to one of the > two > > > cache > > > lines pairs would bring both cache lines to LLC (and possibly L2). If a > > > core > > > with a different L2 reads the other cache line the cache line would be > > > duplicated; if it writes to it, it would be exclusive to the other > core's > > > L2. > > > The cache line pairs do not affect each other anymore. Maybe there's a > > > minor > > > inefficiency on initial transfer from memory, but isn't that all? > > > > If two cores that do not share an L2 cache need exclusive access to > > a cache-line, the L2 spatial prefetcher could cause pingponging if those > > two cache-lines were adjacent and shared the same 128 byte alignment. > > Say core A requests line x1 in exclusive, it also get line x2 (not sure > > if x2 would be in shared or exclusive), core B then requests x2 in > > exclusive, > > it also gets x1. Irrelevant of the state x1 comes into core B's private > L2 > > cache > > it invalidates the exclusive state on cache-line x1 in core A's private > L2 > > cache. If this was done in a loop (say a simple `lock add` loop) it would > > cause > > pingponging on cache-lines x1/x2 between core A and B's private L2 > caches. > > Quoting the latest ORM: "The following two hardware prefetchers fetched > data > from memory to the L2 cache and last level cache: > Spatial Prefetcher: This prefetcher strives to complete every cache line > fetched to the L2 cache with the pair line that completes it to a 128-byte > aligned chunk." > > 1. If the requested cache line is already present on some other core, the > spatial prefetcher should not get used ("fetched data from memory"). > I think this is correct and I'm incorrect that a request from LLC to L2 will invoke the spatial prefetcher. So not issues with 64 bytes. Sorry for the added confusion! > > 2. The section is about data prefetching. It is unclear whether the > spatial > prefetcher applies at all for normal cache line fetches. > > 3. The ORM uses past tense ("The following two hardware prefetchers > fetched > data"), which indicates to me that Intel isn't doing this for newer > generations anymore. > 4. If I'm wrong on points 1 & 2 consider this: Core 1 requests a read of > cache > line A and the adjacent cache line B thus is also loaded to LLC. Core 2 > request a read of line B and thus loads line A into LLC. Now both cores > have > both cache lines in LLC. Core 1 writes to line A, which invalidates line A > in > LLC of Core 2 but does not affect line B. Core 2 writes to line B, > invalidating line A for Core 1. => no false sharing. Where did I get my > mental > cache protocol wrong? > -- > ── > Dr. Matthias Kretz https://mattkretz.github.io > GSI Helmholtz Centre for Heavy Ion Research https://gsi.de > std::experimental::simd https://github.com/VcDevel/std-simd > ── > > > >
Re: [PATCH] c++: implement C++17 hardware interference size
On Fri, Jul 16, 2021 at 11:12 AM Matthias Kretz wrote: > On Friday, 16 July 2021 04:41:17 CEST Jason Merrill via Gcc-patches wrote: > > > Currently the patch does not adjust the values based on -march, as in > JF's > > > proposal. I'll need more guidance from the ARM/AArch64 maintainers > about > > > how to go about that. --param l1-cache-line-size is set based on > -mtune, > > > but I don't think we want -mtune to change these ABI-affecting values. > > > Are > > > there -march values for which a smaller range than 64-256 makes sense? > > As a user who cares about ABI but also cares about maximizing performance > of > builds for a specific HPC setup I'd expect the hardware interference size > values to be allowed to break ABIs. The point of these values is to give > me > better performance portability (but not necessarily binary portability) > than > my usual "pick 64 as a good average". > Wrt, -march / -mtune setting hardware interference size: IMO -mtune=X > should > be interpreted as "my binary is supposed to be optimized for X, I accept > inefficiencies on everything that's not X". > > On Friday, 16 July 2021 04:48:52 CEST Noah Goldstein wrote: > > On intel x86 systems with a private L2 cache the spatial prefetcher > > can cause destructive interference along 128 byte aligned boundaries. > > > https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-3 > > 2-architectures-optimization-manual.pdf#page=60 > > I don't understand how this feature would lead to false sharing. But maybe > I > misunderstand the spatial prefetcher. The first access to one of the two > cache > lines pairs would bring both cache lines to LLC (and possibly L2). If a > core > with a different L2 reads the other cache line the cache line would be > duplicated; if it writes to it, it would be exclusive to the other core's > L2. > The cache line pairs do not affect each other anymore. Maybe there's a > minor > inefficiency on initial transfer from memory, but isn't that all? > If two cores that do not share an L2 cache need exclusive access to a cache-line, the L2 spatial prefetcher could cause pingponging if those two cache-lines were adjacent and shared the same 128 byte alignment. Say core A requests line x1 in exclusive, it also get line x2 (not sure if x2 would be in shared or exclusive), core B then requests x2 in exclusive, it also gets x1. Irrelevant of the state x1 comes into core B's private L2 cache it invalidates the exclusive state on cache-line x1 in core A's private L2 cache. If this was done in a loop (say a simple `lock add` loop) it would cause pingponging on cache-lines x1/x2 between core A and B's private L2 caches. > > That said. Intel documents the spatial prefetcher exclusively for Sandy > Bridge. So if you still believe 128 is necessary, set the destructive > hardware > interference size to 64 for all of x86 except -mtune=sandybridge. > AFAIK the spatial prefetcher exists on newer x86_64 machines as well. > > -- > ── > Dr. Matthias Kretz https://mattkretz.github.io > GSI Helmholtz Centre for Heavy Ion Research https://gsi.de > std::experimental::simd https://github.com/VcDevel/std-simd > ── >
Re: [PATCH] c++: implement C++17 hardware interference size
On Thu, Jul 15, 2021 at 10:41 PM Jason Merrill via Gcc-patches < gcc-patches@gcc.gnu.org> wrote: > Adding CCs that got lost in the initial mail. > > On Thu, Jul 15, 2021 at 10:36 PM Jason Merrill wrote: > > > The last missing piece of the C++17 standard library is the hardware > > intereference size constants. Much of the delay in implementing these > has > > been due to uncertainty about what the right values are, and even whether > > there is a single constant value that is suitable; the destructive > > interference size is intended to be used in structure layout, so program > > ABIs will depend on it. > > > > In principle, both of these values should be the same as the target's L1 > > cache line size. When compiling for a generic target that is intended to > > support a range of target CPUs with different cache line sizes, the > > constructive size should probably be the minimum size, and the > destructive > > size the maximum, unless you are constrained by ABI compatibility with > > previous code. > > > > JF Bastien's implementation proposal is summarized at > > https://github.com/itanium-cxx-abi/cxx-abi/issues/74 > > > > I implement this by adding new --params for the two sizes. Targets need > to > > override these values in targetm.target_option.override() to support the > > feature. > > > > 64 bytes still seems correct for the x86 family. > > > > I'm not sure why he said 64/64 for 32-bit ARM, since the Cortex A9 has a > > 32-byte cache line, and that seems to be the only ARM_PREFETCH_BENEFICIAL > > target, so I'd think 32/64 would make more sense. > > > > He proposed 64/128 for AArch64, but since the A64FX now has a 256B cache > > line, I've changed that to 64/256. Does that seem right? > > > > Currently the patch does not adjust the values based on -march, as in > JF's > > proposal. I'll need more guidance from the ARM/AArch64 maintainers about > > how to go about that. --param l1-cache-line-size is set based on -mtune, > > but I don't think we want -mtune to change these ABI-affecting values. > Are > > there -march values for which a smaller range than 64-256 makes sense? > > > > gcc/ChangeLog: > > > > * params.opt: Add destructive-interference-size and > > constructive-interference-size. > > * doc/invoke.texi: Document them. > > * config/aarch64/aarch64.c (aarch64_override_options_internal): > > Set them. > > * config/arm/arm.c (arm_option_override): Set them. > > * config/i386/i386-options.c (ix86_option_override_internal): > > Set them. > > > > gcc/c-family/ChangeLog: > > > > * c.opt: Add -Winterference-size. > > * c-cppbuiltin.c (cpp_atomic_builtins): Add > __GCC_DESTRUCTIVE_SIZE > > and __GCC_CONSTRUCTIVE_SIZE. > > > > gcc/cp/ChangeLog: > > > > * decl.c (cxx_init_decl_processing): Check > > --param *-interference-size values. > > > > libstdc++-v3/ChangeLog: > > > > * include/std/version: Define > __cpp_lib_hardware_interference_size. > > * libsupc++/new: Define hardware interference size variables. > > > > gcc/testsuite/ChangeLog: > > > > * g++.target/aarch64/interference.C: New test. > > * g++.target/arm/interference.C: New test. > > * g++.target/i386/interference.C: New test. > > --- > > gcc/doc/invoke.texi | 22 ++ > > gcc/c-family/c.opt| 5 > > gcc/params.opt| 15 > > gcc/c-family/c-cppbuiltin.c | 12 ++ > > gcc/config/aarch64/aarch64.c | 9 > > gcc/config/arm/arm.c | 6 + > > gcc/config/i386/i386-options.c| 6 + > > gcc/cp/decl.c | 23 +++ > > .../g++.target/aarch64/interference.C | 9 > > gcc/testsuite/g++.target/arm/interference.C | 9 > > gcc/testsuite/g++.target/i386/interference.C | 8 +++ > > libstdc++-v3/include/std/version | 3 +++ > > libstdc++-v3/libsupc++/new| 10 ++-- > > 13 files changed, 135 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/g++.target/aarch64/interference.C > > create mode 100644 gcc/testsuite/g++.target/arm/interference.C > > create mode 100644 gcc/testsuite/g++.target/i386/interference.C > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index ea8812425e9..f93cb7a20f7 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -13857,6 +13857,28 @@ prefetch hints can be issued for any constant > > stride. > > > > This setting is only useful for strides that are known and constant. > > > > +@item destructive_interference_size > > +@item constructive_interference_size > > +The values for the C++17 variables > > +@code{std::hardware_destructive_interference_size} and > >