Re: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223
On 10/29/20 1:45 PM, Eugene Rozenfeld via Gcc-patches wrote:
> Thank you for the review Richard!
>
> I re-worked the patch based on your suggestions. I combined the two patterns.
> Neither one requires a signedness check as long as the type of the 'add' has
> overflow wrap semantics.
>
> I had to modify the regular expression in no-strict-overflow-4.c test. In
> that test the following function is compiled with -fno-strict-overflow:
>
> int
> foo (int i)
> {
>   return i + 1 > i;
> }
>
> We now optimize this function so that the tree-optimized dump has
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1931, cgraph_uid=1,
> symbol_order=0)
>
> foo (int i)
> {
>   _Bool _1;
>   int _3;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = i_2(D) != 2147483647;
>   _3 = (int) _1;
>   return _3;
> }
>
> This is a correct optimization since -fno-strict-overflow implies -fwrapv.
>
> Eugene
>
>> 0001-Add-a-tree-optimization-described-in-PR97223.patch
>>
>> From 973942122522bbf2e9de54cff17de59de5955547 Mon Sep 17 00:00:00 2001
>> From: Eugene Rozenfeld
>> Date: Fri, 23 Oct 2020 16:47:01 -0700
>> Subject: [PATCH] Add a tree optimization described in PR97223.
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> Convert
>>   x < (short) ((unsigned short)x + const)
>> to
>>   x <= SHORT_MAX - const
>> (and similarly for other integral types) if const is not 0.
>>
>> For example, without this patch the x86_64-pc-linux code generated for this
>> function
>>
>> bool f(char x)
>> {
>>     return x < (char)(x + 12);
>> }
>>
>> is
>>
>> lea    eax,[rdi+0xc]
>> cmp    al,dil
>> setg   al
>> ret
>>
>> With the patch the code is
>>
>> cmp    dil,0x73
>> setle  al
>> ret
>> ---
>>  gcc/match.pd                                | 16 ++--
>>  gcc/testsuite/gcc.dg/no-strict-overflow-4.c |  5 +++--
>>  2 files changed, 13 insertions(+), 8 deletions(-)

Committed to the trunk.  Thanks.

jeff

Re: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223
On Thu, Oct 29, 2020 at 8:45 PM Eugene Rozenfeld wrote:
>
> Thank you for the review Richard!
>
> I re-worked the patch based on your suggestions. I combined the two patterns.
> Neither one requires a signedness check as long as the type of the 'add' has
> overflow wrap semantics.
>
> I had to modify the regular expression in no-strict-overflow-4.c test. In
> that test the following function is compiled with -fno-strict-overflow:
>
> int
> foo (int i)
> {
>   return i + 1 > i;
> }
>
> We now optimize this function so that the tree-optimized dump has
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1931, cgraph_uid=1,
> symbol_order=0)
>
> foo (int i)
> {
>   _Bool _1;
>   int _3;
>
>   <bb 2> [local count: 1073741824]:
>   _1 = i_2(D) != 2147483647;
>   _3 = (int) _1;
>   return _3;
> }
>
> This is a correct optimization since -fno-strict-overflow implies -fwrapv.

OK.

Thanks,
Richard.

RE: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223
Thank you for the review Richard!

I re-worked the patch based on your suggestions. I combined the two patterns.
Neither one requires a signedness check as long as the type of the 'add' has
overflow wrap semantics.

I had to modify the regular expression in no-strict-overflow-4.c test. In
that test the following function is compiled with -fno-strict-overflow:

int
foo (int i)
{
  return i + 1 > i;
}

We now optimize this function so that the tree-optimized dump has

;; Function foo (foo, funcdef_no=0, decl_uid=1931, cgraph_uid=1, symbol_order=0)

foo (int i)
{
  _Bool _1;
  int _3;

  <bb 2> [local count: 1073741824]:
  _1 = i_2(D) != 2147483647;
  _3 = (int) _1;
  return _3;
}

This is a correct optimization since -fno-strict-overflow implies -fwrapv.

Eugene

-----Original Message-----
From: Richard Biener
Sent: Tuesday, October 27, 2020 2:23 AM
To: Eugene Rozenfeld
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223

On Sat, Oct 24, 2020 at 2:20 AM Eugene Rozenfeld via Gcc-patches wrote:
>
> This patch adds a pattern for folding
>   x < (short) ((unsigned short)x + const)
> to
>   x <= SHORT_MAX - const
> (and similarly for other integral types) if const is not 0,
> as described in PR97223.
>
> For example, without this patch the x86_64-pc-linux code generated for
> this function
>
> bool f(char x)
> {
>     return x < (char)(x + 12);
> }
>
> is
>
> lea    eax,[rdi+0xc]
> cmp    al,dil
> setg   al
> ret
>
> With the patch the code is
>
> cmp    dil,0x73
> setle  al
> ret
>
> Tested on x86_64-pc-linux.

+/* Similar to the previous pattern but with additional casts.  */
+(for cmp (lt le ge gt)
+     out (gt gt le le)
+ (simplify
+  (cmp:c (convert@3 (plus@2 (convert@4 @0) INTEGER_CST@1)) @0)
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@0))
+       && types_match (TREE_TYPE (@0), TREE_TYPE (@3))
+       && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE (@0)))
+       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@4))
+       && wi::to_wide (@1) != 0
+       && single_use (@2))
+   (with { unsigned int prec = TYPE_PRECISION (TREE_TYPE (@0)); }
+    (out @0 { wide_int_to_tree (TREE_TYPE (@0),
+                                wi::max_value (prec, SIGNED)
+                                - wi::to_wide (@1)); })

I think it's reasonable but the comment can be made more precise.
In particular I wonder why we require a signed comparison here while the
previous pattern requires an unsigned comparison.  It might be an artifact
and the restriction instead only applies to the plus?

Note that

+       && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE (@0)))

unsigned_type_for should be avoided since it's quite expensive.  May I suggest

       && TYPE_UNSIGNED (TREE_TYPE (@4))
       && tree_nop_conversion_p (TREE_TYPE (@4), TREE_TYPE (@0))

instead?

I originally wondered if "but with additional casts" could be done in a
single pattern via (convert? ...) uses but then I noticed the strange
difference in the comparison signedness requirement ...

Richard.

> Eugene
>

0001-Add-a-tree-optimization-described-in-PR97223.patch
Description: 0001-Add-a-tree-optimization-described-in-PR97223.patch

Re: [PATCH] [tree-optimization] Fix for PR97223
On Sat, Oct 24, 2020 at 2:20 AM Eugene Rozenfeld via Gcc-patches wrote:
>
> This patch adds a pattern for folding
>   x < (short) ((unsigned short)x + const)
> to
>   x <= SHORT_MAX - const
> (and similarly for other integral types) if const is not 0,
> as described in PR97223.
>
> For example, without this patch the x86_64-pc-linux code generated for this
> function
>
> bool f(char x)
> {
>     return x < (char)(x + 12);
> }
>
> is
>
> lea    eax,[rdi+0xc]
> cmp    al,dil
> setg   al
> ret
>
> With the patch the code is
>
> cmp    dil,0x73
> setle  al
> ret
>
> Tested on x86_64-pc-linux.

+/* Similar to the previous pattern but with additional casts.  */
+(for cmp (lt le ge gt)
+     out (gt gt le le)
+ (simplify
+  (cmp:c (convert@3 (plus@2 (convert@4 @0) INTEGER_CST@1)) @0)
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@0))
+       && types_match (TREE_TYPE (@0), TREE_TYPE (@3))
+       && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE (@0)))
+       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@4))
+       && wi::to_wide (@1) != 0
+       && single_use (@2))
+   (with { unsigned int prec = TYPE_PRECISION (TREE_TYPE (@0)); }
+    (out @0 { wide_int_to_tree (TREE_TYPE (@0),
+                                wi::max_value (prec, SIGNED)
+                                - wi::to_wide (@1)); })

I think it's reasonable but the comment can be made more precise.
In particular I wonder why we require a signed comparison here while the
previous pattern requires an unsigned comparison.  It might be an artifact
and the restriction instead only applies to the plus?

Note that

+       && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE (@0)))

unsigned_type_for should be avoided since it's quite expensive.  May I suggest

       && TYPE_UNSIGNED (TREE_TYPE (@4))
       && tree_nop_conversion_p (TREE_TYPE (@4), TREE_TYPE (@0))

instead?

I originally wondered if "but with additional casts" could be done in a
single pattern via (convert? ...) uses but then I noticed the strange
difference in the comparison signedness requirement ...

Richard.

> Eugene
>

[PATCH] [tree-optimization] Fix for PR97223
This patch adds a pattern for folding
  x < (short) ((unsigned short)x + const)
to
  x <= SHORT_MAX - const
(and similarly for other integral types) if const is not 0,
as described in PR97223.

For example, without this patch the x86_64-pc-linux code generated for this
function

bool f(char x)
{
    return x < (char)(x + 12);
}

is

lea    eax,[rdi+0xc]
cmp    al,dil
setg   al
ret

With the patch the code is

cmp    dil,0x73
setle  al
ret

Tested on x86_64-pc-linux.

Eugene

0001-Add-a-tree-optimization-described-in-PR97223.patch
Description: 0001-Add-a-tree-optimization-described-in-PR97223.patch